Functionality for computing metrics, with a particular focus on group metrics.
For our purposes, a metric is a function with signature f(y_true, y_pred, ...) where y_true is the set of true values and y_pred is the set of values predicted by a machine learning algorithm. Other arguments may be present (most often sample weights), which affect how the metric is calculated.
The group metrics in this module have signatures g(y_true, y_pred, group_membership, ...) where group_membership is an array of values indicating the group to which each pair of true and predicted values belongs. The metric is evaluated for the entire set of data, and also for each subgroup identified in group_membership.
fairlearn.metrics
demographic_parity_difference
Calculate the demographic parity difference.
y_true (1D-array) – Ground truth (correct) labels.
y_pred (1D-array) – Predicted labels \(h(X)\) returned by the classifier.
sensitive_features (1D-array) – Sensitive features.
sample_weight (1D-array) – Sample weights.
The difference between the largest and the smallest group-level selection rate, \(E[h(X) | A=a]\), across all values \(a\) of the sensitive feature. The demographic parity difference of 0 means that all groups have the same selection rate.
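As an illustration (a hand computation, not fairlearn's implementation), the per-group selection rates and their gap can be worked out directly for a small dataset; all variable names below are hypothetical:

```python
# Hand-computed demographic parity difference: the gap between the
# largest and smallest per-group selection rates E[h(X) | A=a].
y_pred = [1, 1, 0, 1, 0, 0, 1, 0]
sensitive_features = ["a", "a", "a", "a", "b", "b", "b", "b"]

groups = set(sensitive_features)
selection_rates = {
    g: sum(p for p, s in zip(y_pred, sensitive_features) if s == g)
       / sensitive_features.count(g)
    for g in groups
}
dp_difference = max(selection_rates.values()) - min(selection_rates.values())
print(dp_difference)  # group "a" selects 3/4, group "b" selects 1/4 -> 0.5
```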
demographic_parity_ratio
Calculate the demographic parity ratio.
The ratio between the smallest and the largest group-level selection rate, \(E[h(X) | A=a]\), across all values \(a\) of the sensitive feature. The demographic parity ratio of 1 means that all groups have the same selection rate.
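The ratio variant can be sketched the same way by hand (hypothetical names, not fairlearn code):

```python
# Hand-computed demographic parity ratio: smallest per-group selection
# rate divided by the largest.
y_pred = [1, 0, 1, 1, 0, 1, 0, 0]
sensitive_features = [0, 0, 0, 0, 1, 1, 1, 1]

rates = []
for g in set(sensitive_features):
    members = [p for p, s in zip(y_pred, sensitive_features) if s == g]
    rates.append(sum(members) / len(members))
dp_ratio = min(rates) / max(rates)
print(dp_ratio)  # 0.25 / 0.75, i.e. one third
```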
difference_from_summary
Calculate the difference between the maximum and minimum metric value across groups.
summary – A group metric summary
The difference between the maximum and the minimum group-level metrics described in summary.
Return type: float
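The transformation itself is simple enough to sketch by hand; here a plain dict stands in for the summary object (an assumption for illustration only):

```python
# A plain dict standing in for a group metric summary with "overall"
# and "by_group" fields; by_group maps each sensitive-feature value
# to its metric value.
summary = {"overall": 0.80, "by_group": {"a": 0.90, "b": 0.70}}

group_values = summary["by_group"].values()
difference = max(group_values) - min(group_values)
print(difference)  # 0.90 - 0.70 -> 0.2 (up to float rounding)
```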
equalized_odds_difference
Calculate the equalized odds difference.
y_true (1D-array) – Ground truth (correct) labels \(Y\).
The greater of two metrics: true_positive_rate_difference and false_positive_rate_difference. The former is the difference between the largest and smallest of \(P[h(X)=1 | A=a, Y=1]\), across all values \(a\) of the sensitive feature. The latter is defined similarly, but for \(P[h(X)=1 | A=a, Y=0]\). The equalized odds difference of 0 means that all groups have the same true positive, true negative, false positive, and false negative rates.
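A hand computation of this quantity for a tiny dataset (hypothetical names, not fairlearn's implementation):

```python
# Per-group true positive rates P[h(X)=1 | A=a, Y=1] and false positive
# rates P[h(X)=1 | A=a, Y=0], then the larger of the two max-min gaps.
y_true = [1, 1, 0, 0, 1, 1, 0, 0]
y_pred = [1, 0, 0, 0, 1, 1, 1, 0]
sf     = ["a", "a", "a", "a", "b", "b", "b", "b"]

def rate(group, y_value):
    # Fraction of positive predictions among members of `group`
    # whose true label equals `y_value`.
    idx = [i for i, s in enumerate(sf) if s == group and y_true[i] == y_value]
    return sum(y_pred[i] for i in idx) / len(idx)

tpr = {g: rate(g, 1) for g in set(sf)}
fpr = {g: rate(g, 0) for g in set(sf)}
eo_difference = max(
    max(tpr.values()) - min(tpr.values()),
    max(fpr.values()) - min(fpr.values()),
)
print(eo_difference)  # TPR gap 0.5, FPR gap 0.5 -> 0.5
```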
equalized_odds_ratio
Calculate the equalized odds ratio.
The smaller of two metrics: true_positive_rate_ratio and false_positive_rate_ratio. The former is the ratio between the smallest and largest of \(P[h(X)=1 | A=a, Y=1]\), across all values \(a\) of the sensitive feature. The latter is defined similarly, but for \(P[h(X)=1 | A=a, Y=0]\). The equalized odds ratio of 1 means that all groups have the same true positive, true negative, false positive, and false negative rates.
false_negative_rate
Calculate the false negative rate (also called miss rate).
false_positive_rate
Calculate the false positive rate (also called fall-out).
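Both rates follow directly from the confusion-matrix counts; a minimal sketch (not the library's code):

```python
# False negative rate (miss rate) and false positive rate (fall-out)
# from raw counts: FNR = FN / (FN + TP), FPR = FP / (FP + TN).
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 1, 0, 0, 0]

fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)

fnr = fn / (fn + tp)  # 2 misses among 4 positives -> 0.5
fpr = fp / (fp + tn)  # 1 false alarm among 4 negatives -> 0.25
```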
group_max_from_summary
Retrieve the maximum group-level metric value from group summary.
The maximum group-level metric value across all groups in summary.
group_min_from_summary
Retrieve the minimum group-level metric value from group summary.
The minimum group-level metric value across all groups in summary.
group_summary
Apply a metric to each subgroup of a set of data.
metric_function – Function with signature metric_function(y_true, y_pred, **metric_params)
y_true – Array of ground-truth values
y_pred – Array of predicted values
sensitive_features – Array indicating the group to which each input value belongs
indexed_params – Names of metric_function parameters that should be split according to sensitive_features in addition to y_true and y_pred. Defaults to None, which corresponds to {"sample_weight"}.
**metric_params – Optional arguments to be passed to the metric_function
Object containing the result of applying metric_function to the entire dataset and to each group identified in sensitive_features
sklearn.utils.Bunch with the fields overall and by_group
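A minimal sketch of what group_summary computes, with a plain dict in place of the sklearn.utils.Bunch that fairlearn actually returns (the toy_ names are hypothetical):

```python
# Toy version of group_summary: apply the metric to the whole dataset
# and to each subgroup identified by sensitive_features.
def toy_group_summary(metric_function, y_true, y_pred, sensitive_features):
    by_group = {}
    for g in set(sensitive_features):
        idx = [i for i, s in enumerate(sensitive_features) if s == g]
        by_group[g] = metric_function(
            [y_true[i] for i in idx], [y_pred[i] for i in idx]
        )
    return {"overall": metric_function(y_true, y_pred), "by_group": by_group}

def accuracy(y_true, y_pred):
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

summary = toy_group_summary(
    accuracy,
    y_true=[0, 1, 1, 0, 1, 0],
    y_pred=[0, 1, 0, 0, 1, 1],
    sensitive_features=["a", "a", "a", "b", "b", "b"],
)
print(summary)  # overall 4/6; groups "a" and "b" each 2/3
```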
make_derived_metric
Make a callable that calculates a derived metric from the group summary.
transformation_function (func) – A transformation function with the signature transformation_function(summary)
summary_function (func) – A metric group summary function with the signature summary_function(y_true, y_pred, *, sensitive_features, **metric_params)
A callable object with the signature derived_metric(y_true, y_pred, *, sensitive_features, **metric_params)
Return type: func
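The composition this factory performs can be sketched as follows; the toy_ helpers are hypothetical stand-ins, not fairlearn code:

```python
# Toy version of make_derived_metric: the returned callable runs the
# summary function, then applies the transformation to its result.
def toy_make_derived_metric(transformation_function, summary_function):
    def derived_metric(y_true, y_pred, *, sensitive_features, **metric_params):
        summary = summary_function(
            y_true, y_pred, sensitive_features=sensitive_features, **metric_params
        )
        return transformation_function(summary)
    return derived_metric

def toy_summary(y_true, y_pred, *, sensitive_features):
    # Per-group accuracy, as a stand-in summary function.
    by_group = {}
    for g in set(sensitive_features):
        pairs = [(t, p) for t, p, s in zip(y_true, y_pred, sensitive_features) if s == g]
        by_group[g] = sum(t == p for t, p in pairs) / len(pairs)
    return {"by_group": by_group}

def difference(summary):
    values = summary["by_group"].values()
    return max(values) - min(values)

accuracy_difference = toy_make_derived_metric(difference, toy_summary)
result = accuracy_difference(
    [1, 1, 0, 0], [1, 0, 0, 1], sensitive_features=["a", "a", "b", "b"]
)
print(result)  # both groups are 1/2 correct -> 0.0
```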
make_metric_group_summary
Make a callable that calculates the group summary of a metric.
metric_function (func) – A metric function with the signature metric_function(y_true, y_pred, **metric_params)
indexed_params – The names of parameters of metric_function that should be split according to sensitive_features in addition to y_true and y_pred. Defaults to None, which corresponds to ['sample_weight'].
A callable object with the signature metric_group_summary(y_true, y_pred, *, sensitive_features, **metric_params)
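A sketch of the factory under the same assumptions as above (toy_ names are hypothetical): it binds a plain metric into a callable with the group-summary signature.

```python
# Toy version of make_metric_group_summary.
def toy_make_metric_group_summary(metric_function):
    def metric_group_summary(y_true, y_pred, *, sensitive_features, **metric_params):
        by_group = {}
        for g in set(sensitive_features):
            idx = [i for i, s in enumerate(sensitive_features) if s == g]
            by_group[g] = metric_function(
                [y_true[i] for i in idx], [y_pred[i] for i in idx], **metric_params
            )
        return {
            "overall": metric_function(y_true, y_pred, **metric_params),
            "by_group": by_group,
        }
    return metric_group_summary

# Recall (TP / positives) as the wrapped metric.
recall_summary = toy_make_metric_group_summary(
    lambda yt, yp: sum(t and p for t, p in zip(yt, yp)) / sum(yt)
)
result = recall_summary(
    [1, 1, 1, 1], [1, 0, 1, 1], sensitive_features=["a", "a", "b", "b"]
)
print(result)  # overall 0.75; group "a" 0.5, group "b" 1.0
```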
mean_prediction
Calculate the (weighted) mean prediction.
The true values are ignored, but are required as an argument in order to maintain a consistent interface.
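A minimal sketch of such a weighted mean (the toy_ name is hypothetical, not the library's code):

```python
# Weighted mean prediction; y_true is accepted only for interface
# consistency and never used.
def toy_mean_prediction(y_true, y_pred, sample_weight=None):
    if sample_weight is None:
        sample_weight = [1] * len(y_pred)
    return sum(w * p for w, p in zip(sample_weight, y_pred)) / sum(sample_weight)

result = toy_mean_prediction([0, 0, 0], [0.2, 0.4, 0.9], sample_weight=[1, 1, 2])
print(result)  # (0.2 + 0.4 + 2 * 0.9) / 4 -> 0.6, up to float rounding
```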
ratio_from_summary
Calculate the ratio between the maximum and minimum metric value across groups.
The ratio between the maximum and the minimum group-level metrics described in summary.
selection_rate
Calculate the fraction of predicted labels matching the ‘good’ outcome.
The argument pos_label specifies the ‘good’ outcome.
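A minimal sketch of the computation (the toy_ name is hypothetical):

```python
# Selection rate: fraction of predictions equal to pos_label.
def toy_selection_rate(y_true, y_pred, pos_label=1):
    return sum(1 for p in y_pred if p == pos_label) / len(y_pred)

rate = toy_selection_rate([0, 1, 1, 0], [1, 1, 0, 0], pos_label=1)
print(rate)  # 2 of 4 predictions are the 'good' outcome -> 0.5
```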
true_negative_rate
Calculate the true negative rate (also called specificity or selectivity).
true_positive_rate
Calculate the true positive rate (also called sensitivity, recall, or hit rate).