Assessment
In this section, we will describe the steps involved in performing a fairness
assessment, and introduce some widely (if occasionally incautiously) used
fairness metrics, such as demographic parity and equalized odds.
We will show how MetricFrame can be used to evaluate the metrics identified during the course of a fairness assessment.
In the mathematical definitions below, \(X\) denotes a feature vector used for predictions, \(A\) denotes a single sensitive feature (such as age or race), and \(Y\) denotes the true label. Fairness metrics are phrased in terms of expectations with respect to the distribution over \((X,A,Y)\). Note that \(X\) and \(A\) may or may not share columns, depending on whether the model is allowed to ‘see’ the sensitive features. When we need to refer to particular values, we will use lowercase letters; since we will be comparing groups identified by the sensitive feature, \(\forall a \in A\) appears regularly to indicate that a property holds for every identified group.
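Under this notation, for example, a classifier \(h\) satisfies demographic parity if \(\P[h(X) = 1 \;|\; A = a] = \P[h(X) = 1] \quad \forall a\), and equalized odds if \(\P[h(X) = 1 \;|\; A = a, Y = y] = \P[h(X) = 1 \;|\; Y = y] \quad \forall a, y\).
As a preview of the evaluation step, here is a minimal sketch of MetricFrame disaggregating metrics by group. The data are tiny synthetic values invented purely for illustration:

```python
import pandas as pd
from sklearn.metrics import accuracy_score
from fairlearn.metrics import MetricFrame, selection_rate

# Synthetic labels, predictions, and a single sensitive feature
# (values invented purely for illustration).
y_true = pd.Series([0, 1, 1, 0, 1, 0, 1, 1])
y_pred = pd.Series([0, 1, 0, 0, 1, 1, 1, 0])
sensitive = pd.Series(["a", "a", "a", "b", "b", "b", "b", "a"])

# Evaluate each metric overall and disaggregated by the groups
# identified by the sensitive feature.
mf = MetricFrame(
    metrics={"accuracy": accuracy_score, "selection_rate": selection_rate},
    y_true=y_true,
    y_pred=y_pred,
    sensitive_features=sensitive,
)

print(mf.overall)       # metrics evaluated on the whole dataset
print(mf.by_group)      # one row per group, one column per metric
print(mf.difference())  # largest between-group gap for each metric
```

The by_group and difference() views turn the abstract \(\forall a \in A\) conditions above into concrete per-group numbers that can be inspected directly.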
Fairlearn dashboard
The Fairlearn dashboard was a Jupyter notebook widget for assessing how a model’s predictions impact different groups (e.g., different ethnicities), and for comparing multiple models along different fairness and performance metrics.
Note
The FairlearnDashboard is no longer being developed as part of Fairlearn. For more information on how to use it, refer to microsoft/responsible-ai-widgets.
Fairlearn provides some of the existing functionality through matplotlib-based visualizations. Refer to the Plotting section.
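Since MetricFrame.by_group is a pandas object, the standard matplotlib-backed pandas plotting methods already yield a quick per-group chart. The following is a minimal sketch along those lines (again with invented synthetic data); the Plotting section covers the visualizations Fairlearn itself provides:

```python
import matplotlib.pyplot as plt
import pandas as pd
from fairlearn.metrics import MetricFrame, selection_rate

# Synthetic data, invented purely for illustration.
y_true = pd.Series([0, 1, 1, 0, 1, 0, 1, 1])
y_pred = pd.Series([0, 1, 0, 0, 1, 1, 1, 0])
sensitive = pd.Series(["a", "a", "a", "b", "b", "b", "b", "a"])

mf = MetricFrame(
    metrics={"selection_rate": selection_rate},
    y_true=y_true,
    y_pred=y_pred,
    sensitive_features=sensitive,
)

# by_group is a pandas DataFrame, so its matplotlib-backed
# plotting methods apply directly.
mf.by_group.plot.bar(title="Selection rate by group", legend=False)
plt.tight_layout()
plt.show()
```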