Defining custom fairness metrics
Higher-level machine learning algorithms (such as hyperparameter tuners) often
use metric functions to guide their optimisations.
Such algorithms generally work with scalar results, so if we want the tuning
to be based on our fairness metrics, we need to aggregate
over the MetricFrame.
We provide a convenience function, fairlearn.metrics.make_derived_metric(),
to generate scalar-producing metric functions based on the aggregation methods
mentioned above (MetricFrame.group_min(), MetricFrame.group_max(),
MetricFrame.difference(), and MetricFrame.ratio()).
This function takes an underlying metric function, the name of the desired transformation, and
optionally a list of parameter names which should be treated as sample-aligned parameters
(such as sample_weight).
The result is a function which builds the MetricFrame internally and performs
the requested aggregation. For example:
>>> from fairlearn.metrics import make_derived_metric, MetricFrame
>>> from sklearn.metrics import recall_score
>>> y_true = [0, 1, 1, 1, 1, 0, 1, 0, 1, 0, 0, 0, 1, 1, 1, 1, 1, 1]
>>> y_pred = [0, 0, 1, 0, 1, 1, 1, 0, 0, 1, 1, 1, 1, 0, 0, 1, 1, 0]
>>> sf_data = ['b', 'b', 'a', 'b', 'b', 'c', 'c', 'c', 'a',
...            'a', 'c', 'a', 'b', 'c', 'c', 'b', 'c', 'c']
>>> recall_difference = make_derived_metric(metric=recall_score,
...                                         transform='difference')
>>> recall_difference(y_true, y_pred,
...                   sensitive_features=sf_data).item()
0.19999...
>>> MetricFrame(metrics=recall_score,
...             y_true=y_true,
...             y_pred=y_pred,
...             sensitive_features=sf_data).difference().item()
0.19999...
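To make the aggregation step concrete, the following is a minimal pure-Python sketch of what the four transformations compute for the data above. It is an illustration only, not Fairlearn's implementation: the helper `recall_by_group` is a hypothetical name introduced here, and the sketch assumes the default behaviour in which the difference is the largest group value minus the smallest and the ratio is the smallest divided by the largest.

```python
from collections import defaultdict

y_true = [0, 1, 1, 1, 1, 0, 1, 0, 1, 0, 0, 0, 1, 1, 1, 1, 1, 1]
y_pred = [0, 0, 1, 0, 1, 1, 1, 0, 0, 1, 1, 1, 1, 0, 0, 1, 1, 0]
sf_data = ['b', 'b', 'a', 'b', 'b', 'c', 'c', 'c', 'a',
           'a', 'c', 'a', 'b', 'c', 'c', 'b', 'c', 'c']


def recall_by_group(y_true, y_pred, groups):
    """Hypothetical helper: recall (true positives / actual positives) per group."""
    true_pos = defaultdict(int)
    actual_pos = defaultdict(int)
    for yt, yp, g in zip(y_true, y_pred, groups):
        if yt == 1:
            actual_pos[g] += 1
            if yp == 1:
                true_pos[g] += 1
    return {g: true_pos[g] / actual_pos[g] for g in actual_pos}


# One recall value per sensitive-feature group: a -> 0.5, b -> 0.6, c -> 0.4
per_group = recall_by_group(y_true, y_pred, sf_data)

# The four scalar aggregations (assumed between-group definitions)
group_min = min(per_group.values())
group_max = max(per_group.values())
difference = group_max - group_min   # about 0.2, as in the example above
ratio = group_min / group_max

print(per_group, group_min, group_max, difference, ratio)
```

Each of the four results is a single number, which is what a tuner or other optimiser needs from a metric function.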
We use fairlearn.metrics.make_derived_metric() to generate a number
of commonly used functions of this kind.
The table below displays the aggregations that we have created for each
base metric:
| Base metric | group_min | group_max | difference | ratio |
|---|---|---|---|---|
|  | . | . | Y | Y |
|  | . | . | Y | Y |
|  | . | . | Y | Y |
|  | . | . | Y | Y |
|  | . | . | Y | Y |
|  | Y | . | Y | Y |
|  | Y | . | . | . |
|  | Y | . | . | . |
|  | . | Y | . | . |
|  | . | Y | . | . |
|  | . | Y | . | . |
|  | Y | . | . | . |
|  | Y | . | . | . |
|  | Y | . | . | . |
|  | Y | . | . | . |
|  | . | Y | Y | Y |
The names of the generated functions are of the form
fairlearn.metrics.<base_metric>_<transformation>.
Examples include fairlearn.metrics.accuracy_score_difference and
fairlearn.metrics.precision_score_group_min.