Functionality for computing metrics, with a particular focus on group metrics.
For our purposes, a metric is a function with signature f(y_true, y_pred, ...) where y_true is the set of true values and y_pred is the set of values predicted by a machine learning algorithm. Other arguments may be present (most often sample weights), which affect how the metric is calculated.
The group metrics in this module have signatures g(y_true, y_pred, group_membership, ...) where group_membership is an array of values indicating the group to which each pair of true and predicted values belongs. The metric is evaluated for the entire set of data, and also for each subgroup identified in group_membership.
fairlearn.metrics
demographic_parity_difference
Calculate the demographic parity difference.
y_true (1D-array) – Ground truth (correct) labels.
y_pred (1D-array) – Predicted labels \(h(X)\) returned by the classifier.
sensitive_features (1D-array) – Sensitive features.
sample_weight (1D-array) – Sample weights.
The difference between the largest and the smallest group-level selection rate, \(E[h(X) | A=a]\), across all values \(a\) of the sensitive feature. The demographic parity difference of 0 means that all groups have the same selection rate.
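As an illustration (a hand computation, not fairlearn's implementation), the per-group selection rates and their gap can be worked out directly for a small dataset; all variable names below are hypothetical:

```python
# Hand-computed demographic parity difference: the gap between the
# largest and smallest per-group selection rates E[h(X) | A=a].
y_pred = [1, 1, 0, 1, 0, 0, 1, 0]
sensitive_features = ["a", "a", "a", "a", "b", "b", "b", "b"]

groups = set(sensitive_features)
selection_rates = {
    g: sum(p for p, s in zip(y_pred, sensitive_features) if s == g)
       / sensitive_features.count(g)
    for g in groups
}
dp_difference = max(selection_rates.values()) - min(selection_rates.values())
print(dp_difference)  # group "a" selects 3/4, group "b" selects 1/4 -> 0.5
```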
demographic_parity_ratio
Calculate the demographic parity ratio.
The ratio between the smallest and the largest group-level selection rate, \(E[h(X) | A=a]\), across all values \(a\) of the sensitive feature. The demographic parity ratio of 1 means that all groups have the same selection rate.
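The ratio variant can be sketched the same way by hand (hypothetical names, not fairlearn code):

```python
# Hand-computed demographic parity ratio: smallest per-group selection
# rate divided by the largest.
y_pred = [1, 0, 1, 1, 0, 1, 0, 0]
sensitive_features = [0, 0, 0, 0, 1, 1, 1, 1]

rates = []
for g in set(sensitive_features):
    members = [p for p, s in zip(y_pred, sensitive_features) if s == g]
    rates.append(sum(members) / len(members))
dp_ratio = min(rates) / max(rates)
print(dp_ratio)  # 0.25 / 0.75, i.e. one third
```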
difference_from_summary
Calculate the difference between the maximum and minimum metric value across groups.
summary – A group metric summary
The difference between the maximum and the minimum group-level metrics described in summary.
Return type: float
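The transformation itself is simple enough to sketch by hand; here a plain dict stands in for the summary object (an assumption for illustration only):

```python
# A plain dict standing in for a group metric summary with "overall"
# and "by_group" fields; by_group maps each sensitive-feature value
# to its metric value.
summary = {"overall": 0.80, "by_group": {"a": 0.90, "b": 0.70}}

group_values = summary["by_group"].values()
difference = max(group_values) - min(group_values)
print(difference)  # 0.90 - 0.70 -> 0.2 (up to float rounding)
```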
equalized_odds_difference
Calculate the equalized odds difference.
y_true (1D-array) – Ground truth (correct) labels \(Y\).
The greater of two metrics: true_positive_rate_difference and false_positive_rate_difference. The former is the difference between the largest and smallest of \(P[h(X)=1 | A=a, Y=1]\), across all values \(a\) of the sensitive feature. The latter is defined similarly, but for \(P[h(X)=1 | A=a, Y=0]\). The equalized odds difference of 0 means that all groups have the same true positive, true negative, false positive, and false negative rates.
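A hand computation of this quantity for a tiny dataset (hypothetical names, not fairlearn's implementation):

```python
# Per-group true positive rates P[h(X)=1 | A=a, Y=1] and false positive
# rates P[h(X)=1 | A=a, Y=0], then the larger of the two max-min gaps.
y_true = [1, 1, 0, 0, 1, 1, 0, 0]
y_pred = [1, 0, 0, 0, 1, 1, 1, 0]
sf     = ["a", "a", "a", "a", "b", "b", "b", "b"]

def rate(group, y_value):
    # Fraction of positive predictions among members of `group`
    # whose true label equals `y_value`.
    idx = [i for i, s in enumerate(sf) if s == group and y_true[i] == y_value]
    return sum(y_pred[i] for i in idx) / len(idx)

tpr = {g: rate(g, 1) for g in set(sf)}
fpr = {g: rate(g, 0) for g in set(sf)}
eo_difference = max(
    max(tpr.values()) - min(tpr.values()),
    max(fpr.values()) - min(fpr.values()),
)
print(eo_difference)  # TPR gap 0.5, FPR gap 0.5 -> 0.5
```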
equalized_odds_ratio
Calculate the equalized odds ratio.
The smaller of two metrics: true_positive_rate_ratio and false_positive_rate_ratio. The former is the ratio between the smallest and largest of \(P[h(X)=1 | A=a, Y=1]\), across all values \(a\) of the sensitive feature. The latter is defined similarly, but for \(P[h(X)=1 | A=a, Y=0]\). The equalized odds ratio of 1 means that all groups have the same true positive, true negative, false positive, and false negative rates.
false_negative_rate
Calculate the false negative rate (also called miss rate).
false_positive_rate
Calculate the false positive rate (also called fall-out).
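Both rates follow directly from the confusion-matrix counts; a minimal sketch (not the library's code):

```python
# False negative rate (miss rate) and false positive rate (fall-out)
# from raw counts: FNR = FN / (FN + TP), FPR = FP / (FP + TN).
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 1, 0, 0, 0]

fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)

fnr = fn / (fn + tp)  # 2 misses among 4 positives -> 0.5
fpr = fp / (fp + tn)  # 1 false alarm among 4 negatives -> 0.25
```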
group_max_from_summary
Retrieve the maximum group-level metric value from group summary.
The maximum group-level metric value across all groups in summary.
group_min_from_summary
Retrieve the minimum group-level metric value from group summary.
The minimum group-level metric value across all groups in summary.
group_summary
Apply a metric to each subgroup of a set of data.
metric_function – Function with signature metric_function(y_true, y_pred, **metric_params)
y_true – Array of ground-truth values
y_pred – Array of predicted values
sensitive_features – Array indicating the group to which each input value belongs
indexed_params – Names of metric_function parameters that should be split according to sensitive_features in addition to y_true and y_pred. Defaults to None, which corresponds to {"sample_weight"}.
**metric_params – Optional arguments to be passed to the metric_function
Object containing the result of applying metric_function to the entire dataset and to each group identified in sensitive_features
sklearn.utils.Bunch with the fields overall and by_group
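A minimal sketch of what group_summary computes, with a plain dict in place of the sklearn.utils.Bunch that fairlearn actually returns (the toy_ names are hypothetical):

```python
# Toy version of group_summary: apply the metric to the whole dataset
# and to each subgroup identified by sensitive_features.
def toy_group_summary(metric_function, y_true, y_pred, sensitive_features):
    by_group = {}
    for g in set(sensitive_features):
        idx = [i for i, s in enumerate(sensitive_features) if s == g]
        by_group[g] = metric_function(
            [y_true[i] for i in idx], [y_pred[i] for i in idx]
        )
    return {"overall": metric_function(y_true, y_pred), "by_group": by_group}

def accuracy(y_true, y_pred):
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

summary = toy_group_summary(
    accuracy,
    y_true=[0, 1, 1, 0, 1, 0],
    y_pred=[0, 1, 0, 0, 1, 1],
    sensitive_features=["a", "a", "a", "b", "b", "b"],
)
print(summary)  # overall 4/6; groups "a" and "b" each 2/3
```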
make_derived_metric
Make a callable that calculates a derived metric from the group summary.
transformation_function (func) – A transformation function with the signature transformation_function(summary)
summary_function (func) – A metric group summary function with the signature summary_function(y_true, y_pred, *, sensitive_features, **metric_params)
A callable object with the signature derived_metric(y_true, y_pred, *, sensitive_features, **metric_params)
Return type: func
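The composition this factory performs can be sketched as follows; the toy_ helpers are hypothetical stand-ins, not fairlearn code:

```python
# Toy version of make_derived_metric: the returned callable runs the
# summary function, then applies the transformation to its result.
def toy_make_derived_metric(transformation_function, summary_function):
    def derived_metric(y_true, y_pred, *, sensitive_features, **metric_params):
        summary = summary_function(
            y_true, y_pred, sensitive_features=sensitive_features, **metric_params
        )
        return transformation_function(summary)
    return derived_metric

def toy_summary(y_true, y_pred, *, sensitive_features):
    # Per-group accuracy, as a stand-in summary function.
    by_group = {}
    for g in set(sensitive_features):
        pairs = [(t, p) for t, p, s in zip(y_true, y_pred, sensitive_features) if s == g]
        by_group[g] = sum(t == p for t, p in pairs) / len(pairs)
    return {"by_group": by_group}

def difference(summary):
    values = summary["by_group"].values()
    return max(values) - min(values)

accuracy_difference = toy_make_derived_metric(difference, toy_summary)
result = accuracy_difference(
    [1, 1, 0, 0], [1, 0, 0, 1], sensitive_features=["a", "a", "b", "b"]
)
print(result)  # both groups are 1/2 correct -> 0.0
```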
make_metric_group_summary
Make a callable that calculates the group summary of a metric.
metric_function (func) – A metric function with the signature metric_function(y_true, y_pred, **metric_params)
indexed_params – The names of parameters of metric_function that should be split according to sensitive_features in addition to y_true and y_pred. Defaults to None, which corresponds to ['sample_weight'].
A callable object with the signature metric_group_summary(y_true, y_pred, *, sensitive_features, **metric_params)
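A sketch of the factory under the same assumptions as above (toy_ names are hypothetical): it binds a plain metric into a callable with the group-summary signature.

```python
# Toy version of make_metric_group_summary.
def toy_make_metric_group_summary(metric_function):
    def metric_group_summary(y_true, y_pred, *, sensitive_features, **metric_params):
        by_group = {}
        for g in set(sensitive_features):
            idx = [i for i, s in enumerate(sensitive_features) if s == g]
            by_group[g] = metric_function(
                [y_true[i] for i in idx], [y_pred[i] for i in idx], **metric_params
            )
        return {
            "overall": metric_function(y_true, y_pred, **metric_params),
            "by_group": by_group,
        }
    return metric_group_summary

# Recall (TP / positives) as the wrapped metric.
recall_summary = toy_make_metric_group_summary(
    lambda yt, yp: sum(t and p for t, p in zip(yt, yp)) / sum(yt)
)
result = recall_summary(
    [1, 1, 1, 1], [1, 0, 1, 1], sensitive_features=["a", "a", "b", "b"]
)
print(result)  # overall 0.75; group "a" 0.5, group "b" 1.0
```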
mean_prediction
Calculate the (weighted) mean prediction.
The true values are ignored, but are required as an argument in order to maintain a consistent interface.
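A minimal sketch of such a weighted mean (the toy_ name is hypothetical, not the library's code):

```python
# Weighted mean prediction; y_true is accepted only for interface
# consistency and never used.
def toy_mean_prediction(y_true, y_pred, sample_weight=None):
    if sample_weight is None:
        sample_weight = [1] * len(y_pred)
    return sum(w * p for w, p in zip(sample_weight, y_pred)) / sum(sample_weight)

result = toy_mean_prediction([0, 0, 0], [0.2, 0.4, 0.9], sample_weight=[1, 1, 2])
print(result)  # (0.2 + 0.4 + 2 * 0.9) / 4 -> 0.6, up to float rounding
```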
ratio_from_summary
Calculate the ratio between the maximum and minimum metric value across groups.
The ratio between the maximum and the minimum group-level metrics described in summary.
selection_rate
Calculate the fraction of predicted labels matching the ‘good’ outcome.
The argument pos_label specifies the ‘good’ outcome.
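A minimal sketch of the computation (the toy_ name is hypothetical):

```python
# Selection rate: fraction of predictions equal to pos_label.
def toy_selection_rate(y_true, y_pred, pos_label=1):
    return sum(1 for p in y_pred if p == pos_label) / len(y_pred)

rate = toy_selection_rate([0, 1, 1, 0], [1, 1, 0, 0], pos_label=1)
print(rate)  # 2 of 4 predictions are the 'good' outcome -> 0.5
```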
true_negative_rate
Calculate the true negative rate (also called specificity or selectivity).
true_positive_rate
Calculate the true positive rate (also called sensitivity, recall, or hit rate).