fairlearn.reductions package¶
This module contains algorithms implementing the reductions approach to disparity mitigation.
In this approach, disparity constraints are cast as Lagrange multipliers, which cause the reweighting and relabelling of the input data. This reduces the problem back to standard machine learning training.
- class fairlearn.reductions.AbsoluteLoss(min_val, max_val)[source]¶
Bases:
object
Class to evaluate absolute loss.
- class fairlearn.reductions.ClassificationMoment[source]¶
Bases:
fairlearn.reductions.Moment
Moment that can be expressed as weighted classification error.
- class fairlearn.reductions.ConditionalSelectionRate(ratio=1.0)[source]¶
Bases:
fairlearn.reductions.ClassificationMoment
Generic fairness moment for selection rates.
This serves as the base class for both DemographicParity and EqualizedOdds. The two are distinguished by the events they define, which in turn affect the index field created by load_data().
The index field is a pandas.MultiIndex corresponding to the rows of the DataFrames either required as arguments or returned by several of the methods of the ConditionalSelectionRate class. It is the Cartesian product of:
The unique events defined for the particular object
The unique values for the sensitive feature
The characters + and -, corresponding to the Lagrange multipliers for positive and negative violations of the constraint
The ratio specifies the multiple at which error(A = a) should be compared with total_error and vice versa. The value of ratio must be in the range (0, 1], with smaller values corresponding to a weaker constraint. A ratio of 1 corresponds to the constraint error(A = a) = total_error.
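The Cartesian-product structure of the index described above can be sketched with the standard library alone. This is an illustration, not the library's code: the helper name `constraint_index`, the event labels, and the order of the levels are assumptions made for the example.

```python
from itertools import product


def constraint_index(events, groups):
    """Return one (event, group, sign) tuple per Lagrange multiplier.

    Hypothetical helper: the real index is a pandas.MultiIndex, and its
    level order may differ from the order used here.
    """
    unique_events = sorted(set(events))
    unique_groups = sorted(set(groups))
    return list(product(unique_events, unique_groups, ["+", "-"]))


# A single event ("all") with three sensitive-feature values.
single_event_index = constraint_index(["all"], ["a", "b", "c"])

# Two events (hypothetical labels) with the same three values.
two_event_index = constraint_index(["label=0", "label=1"], ["a", "b", "c"])

print(len(single_event_index))  # 1 event * 3 groups * 2 signs = 6
print(len(two_event_index))     # 2 events * 3 groups * 2 signs = 12
```

The entry count is always the product of the three level sizes, which is why adding events multiplies the number of multipliers.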
- gamma(predictor)[source]¶
Calculate the degree to which constraints are currently violated by the predictor.
- load_data(X, y, event=None, utilities=None, **kwargs)[source]¶
Load the specified data into this object.
This adds a column event to the tags field.
utilities is a 2-d array corresponding to g(X,A,Y,h(X)) as described in Agarwal et al. (2018) <https://arxiv.org/abs/1803.02453>. It defaults to h(X), i.e. [0, 1] for each X_i. The first column is G^0 and the second is G^1. Assumes binary classification with labels 0/1.
\[utilities = [g(X,A,Y,h(X)=0), g(X,A,Y,h(X)=1)]\]
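A minimal sketch of the two-column layout, using plain lists instead of a NumPy array. The function names are hypothetical. The second helper uses g = abs(h(x) - y), the utility that this page gives for ErrorRateRatio.

```python
def default_utilities(n_samples):
    """Default case g = h(X): column 0 is g at h(X)=0, column 1 is g at h(X)=1."""
    return [[0.0, 1.0] for _ in range(n_samples)]


def absolute_error_utilities(y):
    """Case g = abs(h(X) - Y) for binary labels y in {0, 1}."""
    return [[float(abs(0 - y_i)), float(abs(1 - y_i))] for y_i in y]


y = [0, 1, 1]
print(default_utilities(len(y)))    # [[0.0, 1.0], [0.0, 1.0], [0.0, 1.0]]
print(absolute_error_utilities(y))  # [[0.0, 1.0], [1.0, 0.0], [1.0, 0.0]]
```

Each row answers the question "what would g be for this sample if the classifier predicted 0, and if it predicted 1?".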
- project_lambda(lambda_vec)[source]¶
Return the projected lambda values.
That is, return a lambda that is guaranteed to lead to the same or a higher value of the Lagrangian compared with lambda_vec, for all possible choices of the classifier h.
- signed_weights(lambda_vec)[source]¶
Compute the signed weights.
Uses the equations for \(C_i^0\) and \(C_i^1\) as defined in Section 3.2 of Agarwal et al. (2018) in the ‘best response of the Q-player’ subsection to compute the signed weights to be applied to the data by the next call to the underlying estimator.
- Parameters
lambda_vec (pandas.Series) – The vector of Lagrange multipliers indexed by index
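The way signed weights turn a constrained problem back into ordinary weighted classification can be sketched as follows. This is an assumption-laden illustration of the reweighting and relabelling idea: the sign of each weight is taken as the new label and its magnitude as the sample weight. The function name is hypothetical and the library's internals may differ.

```python
def reduce_to_weighted_classification(signed_weights):
    """Split signed weights into (labels, sample_weight).

    A positive weight means predicting 1 is the cheaper choice for that
    sample, so the sample is relabelled 1; otherwise it is relabelled 0.
    The absolute value says how much the choice matters.
    """
    labels = [1 if w > 0 else 0 for w in signed_weights]
    sample_weight = [abs(w) for w in signed_weights]
    return labels, sample_weight


labels, sample_weight = reduce_to_weighted_classification([0.5, -2.0, 0.0])
print(labels)         # [1, 0, 0]
print(sample_weight)  # [0.5, 2.0, 0.0]
```

The resulting pair can be passed to any estimator whose fit method accepts a sample_weight argument.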
- class fairlearn.reductions.DemographicParity(ratio=1.0)[source]¶
Bases:
fairlearn.reductions.ConditionalSelectionRate
Implementation of Demographic Parity as a moment.
A classifier \(h(X)\) satisfies DemographicParity if
\[P[h(X) = y' | A = a] = P[h(X) = y'] \; \forall a, y'\]
This implementation of ConditionalSelectionRate defines a single event, all. Consequently, the prob_event pandas.Series will only have a single entry, which will be equal to 1. Similarly, the index property will have twice as many entries (corresponding to the Lagrange multipliers for positive and negative constraints) as there are unique values for the sensitive feature. The signed_weights() method will compute the costs according to Example 3 of Agarwal et al. (2018).
- short_name = 'DemographicParity'¶
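The demographic parity condition compares each group's selection rate with the overall selection rate. The following sketch measures that gap on a toy data set. The data and the helper name `selection_rates` are made up for illustration and are not part of the library.

```python
from collections import defaultdict


def selection_rates(predictions, sensitive):
    """Return (overall rate, per-group rates) of positive predictions."""
    totals = defaultdict(int)
    positives = defaultdict(int)
    for pred, group in zip(predictions, sensitive):
        totals[group] += 1
        positives[group] += pred
    overall = sum(predictions) / len(predictions)
    by_group = {group: positives[group] / totals[group] for group in totals}
    return overall, by_group


predictions = [1, 0, 1, 1, 0, 0, 1, 0]
sensitive = ["a", "a", "a", "a", "b", "b", "b", "b"]

overall, by_group = selection_rates(predictions, sensitive)
# Demographic parity holds exactly when every gap is zero.
gaps = {group: rate - overall for group, rate in by_group.items()}
print(overall, by_group, gaps)
```

Here group a is selected at rate 0.75 and group b at 0.25 against an overall rate of 0.5, so both groups violate the constraint, in opposite directions.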
- class fairlearn.reductions.EqualizedOdds(ratio=1.0)[source]¶
Bases:
fairlearn.reductions.ConditionalSelectionRate
Implementation of Equalized Odds as a moment.
Adds conditioning on the label compared to demographic parity, i.e.
\[P[h(X) = y' | A = a, Y = y] = P[h(X) = y' | Y = y] \; \forall a, y, y'\]
This implementation of ConditionalSelectionRate defines events corresponding to the unique values of the Y array.
The prob_event pandas.Series will record the fraction of the samples corresponding to each unique value in the Y array.
The index MultiIndex will have a number of entries equal to the number of unique values for the sensitive feature, multiplied by the number of unique values of the Y array, multiplied by two (for the Lagrange multipliers for positive and negative constraints).
With these definitions, the signed_weights() method will calculate the costs according to Example 4 of Agarwal et al. (2018).
- short_name = 'EqualizedOdds'¶
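To make the event bookkeeping concrete, the sketch below computes the fraction of samples per label (the role played by prob_event), the number of index entries, and the per-label, per-group selection-rate gaps on toy data. All names and data are illustrative assumptions, and the helper expects every (label, group) combination to be present.

```python
from collections import Counter


def prob_event(y_true):
    """Fraction of samples falling under each label-defined event."""
    counts = Counter(y_true)
    return {label: counts[label] / len(y_true) for label in sorted(counts)}


def equalized_odds_gaps(y_true, y_pred, sensitive):
    """Selection-rate gap for every (label, group) pair."""
    gaps = {}
    for label in sorted(set(y_true)):
        rows = [i for i, y in enumerate(y_true) if y == label]
        overall = sum(y_pred[i] for i in rows) / len(rows)
        for group in sorted(set(sensitive)):
            members = [i for i in rows if sensitive[i] == group]
            rate = sum(y_pred[i] for i in members) / len(members)
            gaps[(label, group)] = rate - overall
    return gaps


y_true = [0, 0, 1, 1, 0, 0, 1, 1]
y_pred = [0, 1, 1, 1, 0, 0, 1, 0]
sensitive = ["a", "a", "a", "a", "b", "b", "b", "b"]

event_fractions = prob_event(y_true)
n_index_entries = len(set(sensitive)) * len(set(y_true)) * 2
eo_gaps = equalized_odds_gaps(y_true, y_pred, sensitive)
print(event_fractions, n_index_entries, eo_gaps)
```

With two groups and two labels there are eight multipliers, one positive and one negative for each of the four gaps.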
- class fairlearn.reductions.ErrorRate[source]¶
Bases:
fairlearn.reductions.ClassificationMoment
Misclassification error.
- short_name = 'Err'¶
- class fairlearn.reductions.ErrorRateRatio(ratio=1.0)[source]¶
Bases:
fairlearn.reductions.ConditionalSelectionRate
Implementation of Error Rate Ratio as a moment.
Measures the ratio of the error rate for each value of the sensitive attribute to the overall error rate. The two-sided version of the error ratio can be written as ratio <= error(A=a) / total_error <= 1/ratio:
\[ratio \le E[\lvert h(x) - y \rvert | A = a] / E[\lvert h(x) - y \rvert] \le 1/ratio \; \forall a\]
This implementation of ConditionalSelectionRate defines a single event, all. Consequently, the prob_event pandas.Series will only have a single entry, which will be equal to 1.
The index property will have twice as many entries (corresponding to the Lagrange multipliers for positive and negative constraints) as there are unique values for the sensitive feature.
The signed_weights() method will compute the costs according to Example 3 of Agarwal et al. (2018). However, in this scenario, g = abs(h(x)-y), rather than g = h(x).
- short_name = 'ErrorRateRatio'¶
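The two-sided ratio constraint can be checked directly on a set of predictions. The sketch below is illustrative only: the helper name and data are invented, and it uses the cross-multiplied form of the inequality so that a zero error rate does not cause a division by zero.

```python
def error_rate_ratio_satisfied(y_true, y_pred, sensitive, ratio):
    """Check ratio <= error(A=a) / total_error <= 1/ratio for each group."""
    errors = [abs(p - y) for p, y in zip(y_pred, y_true)]
    total_error = sum(errors) / len(errors)
    satisfied = {}
    for group in sorted(set(sensitive)):
        group_errors = [e for e, a in zip(errors, sensitive) if a == group]
        group_error = sum(group_errors) / len(group_errors)
        lower_ok = ratio * total_error <= group_error
        upper_ok = ratio * group_error <= total_error
        satisfied[group] = lower_ok and upper_ok
    return satisfied


y_true = [0, 1, 0, 1, 0, 1, 0, 1]
y_pred = [1, 1, 0, 1, 1, 0, 0, 0]
sensitive = ["a", "a", "a", "a", "b", "b", "b", "b"]

loose = error_rate_ratio_satisfied(y_true, y_pred, sensitive, ratio=0.5)
tight = error_rate_ratio_satisfied(y_true, y_pred, sensitive, ratio=0.8)
print(loose)  # {'a': True, 'b': True}
print(tight)  # {'a': False, 'b': False}
```

Group error rates of 0.25 and 0.75 against a total of 0.5 pass at ratio 0.5 but fail at ratio 0.8, which matches the statement that smaller ratios give a weaker constraint.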
- class fairlearn.reductions.ExponentiatedGradient(estimator, constraints, eps=0.01, T=50, nu=None, eta_mul=2.0)[source]¶
Bases:
sklearn.base.BaseEstimator, sklearn.base.MetaEstimatorMixin
An Estimator which implements the exponentiated gradient approach to reductions.
The exponentiated gradient algorithm is described in detail by Agarwal et al. (2018).
- Parameters
estimator (estimator) – An estimator implementing methods fit(X, y, sample_weight) and predict(X), where X is the matrix of features, y is the vector of labels, and sample_weight is a vector of weights; labels y and predictions returned by predict(X) are either 0 or 1.
constraints (fairlearn.reductions.Moment) – The disparity constraints expressed as moments
eps (float) – Allowed fairness constraint violation; the solution is guaranteed to have the error within 2*best_gap of the best error under constraint eps; the constraint violation is at most 2*(eps+best_gap)
T (int) – Maximum number of iterations
nu (float) – Convergence threshold for the duality gap; the default (None) corresponds to a conservative automatic setting based on the statistical uncertainty in measuring classification error
eta_mul (float) – Initial setting of the learning rate
- fit(X, y, **kwargs)[source]¶
Return a fair classifier under specified fairness constraints.
- Parameters
X (numpy.ndarray or pandas.DataFrame) – The feature matrix
y (numpy.ndarray, pandas.DataFrame, pandas.Series, or list) – The label vector
- predict(X)[source]¶
Provide a prediction for the given input data.
Note that this is non-deterministic, due to the nature of the exponentiated gradient algorithm.
- Parameters
X (numpy.ndarray or pandas.DataFrame) – Feature data
- Returns
The prediction. If X represents the data for a single example, the result will be a scalar. Otherwise, the result will be a vector.
- Return type
Scalar or vector
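One plausible reading of the non-determinism noted above is that the fitted model is a randomized classifier: a weighted mixture of base classifiers from which each prediction is drawn. The sketch below illustrates that idea only. The function name, the mixture representation, and the sampling scheme are assumptions and are not taken from the library.

```python
import random


def randomized_predict(classifier_predictions, weights, rng):
    """Draw 0/1 predictions from a weighted mixture of classifiers.

    classifier_predictions[k][i] is the 0/1 prediction of classifier k on
    sample i; weights[k] is the mixture weight of classifier k (summing to 1).
    """
    n_samples = len(classifier_predictions[0])
    positive_prob = [
        sum(w * preds[i] for w, preds in zip(weights, classifier_predictions))
        for i in range(n_samples)
    ]
    # Predict 1 with probability positive_prob[i], independently per sample.
    return [1 if rng.random() < p else 0 for p in positive_prob]


base_predictions = [
    [1, 0, 1, 0],  # hypothetical classifier 0
    [0, 0, 1, 1],  # hypothetical classifier 1
]
rng = random.Random(0)
mixed = randomized_predict(base_predictions, [0.5, 0.5], rng)
degenerate = randomized_predict(base_predictions, [1.0, 0.0], rng)
print(mixed, degenerate)
```

When the classifiers agree on a sample, or when all weight sits on one classifier, the prediction is deterministic. Otherwise repeated calls can return different labels for the same input.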
- class fairlearn.reductions.GridSearch(estimator, constraints, selection_rule='tradeoff_optimization', constraint_weight=0.5, grid_size=10, grid_limit=2.0, grid_offset=None, grid=None)[source]¶
Bases:
sklearn.base.BaseEstimator, sklearn.base.MetaEstimatorMixin
Estimator to perform a grid search given a blackbox estimator algorithm.
The approach used is taken from section 3.4 of Agarwal et al. (2018).
- Parameters
estimator (estimator) – An estimator implementing methods fit(X, y, sample_weight) and predict(X), where X is the matrix of features, y is the vector of labels, and sample_weight is a vector of weights; labels y and predictions returned by predict(X) are either 0 or 1.
constraints (fairlearn.reductions.Moment) – The disparity constraints expressed as moments.
selection_rule (str) – Specifies the procedure for selecting the best model found by the grid search. At the present time, the only valid value is “tradeoff_optimization”, which minimises a weighted sum of the error rate and constraint violation.
constraint_weight (float) – When the selection_rule is “tradeoff_optimization”, this specifies the relative weight put on the constraint violation when selecting the best model. The weight placed on the error rate will be 1-constraint_weight
grid_size (int) – The number of Lagrange multipliers to generate in the grid
grid_limit (float) – The largest Lagrange multiplier to generate. By default, the grid will contain values distributed between -grid_limit and grid_limit
grid_offset (pandas.DataFrame) – Shifts the grid of Lagrange multipliers by that value. It is ‘0’ by default
grid – Instead of supplying a size and limit for the grid, users may specify the exact set of Lagrange multipliers they desire using this argument.
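For intuition about grid_size, grid_limit and grid_offset, the sketch below builds a grid for the simplest case of a single Lagrange multiplier. Even spacing and a scalar offset are assumptions made for the example: the description above says only that the values are distributed between -grid_limit and grid_limit, and the library's own construction may differ, especially with more than one multiplier.

```python
def symmetric_grid(grid_size, grid_limit, grid_offset=0.0):
    """Return grid_size evenly spaced values centred on grid_offset.

    Hypothetical one-dimensional helper, not the library's grid generator.
    """
    if grid_size == 1:
        return [grid_offset]
    step = 2 * grid_limit / (grid_size - 1)
    return [grid_offset - grid_limit + i * step for i in range(grid_size)]


print(symmetric_grid(5, 2.0))                   # [-2.0, -1.0, 0.0, 1.0, 2.0]
print(symmetric_grid(5, 2.0, grid_offset=1.0))  # [-1.0, 0.0, 1.0, 2.0, 3.0]
```

Each grid value corresponds to one candidate model, so grid_size also bounds how many times the underlying estimator is fitted.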
- fit(X, y, **kwargs)[source]¶
Run the grid search.
This will result in multiple copies of the estimator being made, and the fit(X) method of each one being called.
- Parameters
X (numpy.ndarray or pandas.DataFrame) – The feature matrix
y (numpy.ndarray, pandas.DataFrame, pandas.Series, or list) – The label vector
sensitive_features (numpy.ndarray, pandas.DataFrame, pandas.Series, or list (for now)) – A (currently) required keyword argument listing the feature used by the constraints object
- predict(X)[source]¶
Provide a prediction using the best model found by the grid search.
This dispatches X to the predict(X) method of the selected estimator, and hence the return type is dependent on that method.
- Parameters
X (numpy.ndarray or pandas.DataFrame) – Feature data
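The best model referred to here is chosen by the “tradeoff_optimization” rule described under the constructor parameters: a weighted sum of error rate and constraint violation. A minimal sketch of that rule follows. The candidate records and the helper name are hypothetical, and how the library measures error and violation is not shown.

```python
def select_best(candidates, constraint_weight=0.5):
    """Pick the candidate minimising the weighted error/violation sum."""
    def score(candidate):
        return ((1 - constraint_weight) * candidate["error"]
                + constraint_weight * candidate["violation"])
    return min(candidates, key=score)


candidates = [
    {"name": "m0", "error": 0.10, "violation": 0.30},
    {"name": "m1", "error": 0.15, "violation": 0.10},
    {"name": "m2", "error": 0.30, "violation": 0.02},
]

print(select_best(candidates, 0.5)["name"])  # m1: balances both terms
print(select_best(candidates, 0.0)["name"])  # m0: error rate only
print(select_best(candidates, 1.0)["name"])  # m2: violation only
```

Moving constraint_weight toward 1 favours models with smaller constraint violation at the cost of accuracy, and moving it toward 0 does the opposite.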
- predict_proba(X)[source]¶
Provide the result of predict_proba from the best model found by the grid search.
The underlying estimator must support predict_proba(X) for this to work. The return type is determined by this method.
- Parameters
X (numpy.ndarray or pandas.DataFrame) – Feature data
- class fairlearn.reductions.GroupLossMoment(loss)[source]¶
Bases:
fairlearn.reductions.ConditionalLossMoment
Moment for Group Loss.
- class fairlearn.reductions.LossMoment(loss)[source]¶
Bases:
fairlearn.reductions.Moment
Moment that can be expressed as weighted loss.
- class fairlearn.reductions.Moment[source]¶
Bases:
object
Generic moment.
Our implementations of the reductions approach to fairness described in Agarwal et al. (2018) make use of Moment objects to describe the disparity constraints imposed on the solution. This is an abstract class for all such objects.
- gamma(predictor)[source]¶
Calculate the degree to which constraints are currently violated by the predictor.
- load_data(X, y, **kwargs)[source]¶
Load a set of data for use by this object.
The keyword arguments can contain a sensitive_features array.
- Parameters
X (array) – The feature data
y (array) – The true label data
- property total_samples¶
Return the number of samples in the data.
- class fairlearn.reductions.SquareLoss(min_val, max_val)[source]¶
Bases:
object
Class to evaluate the square loss.
- class fairlearn.reductions.TruePositiveRateDifference(ratio=1.0)[source]¶
Bases:
fairlearn.reductions.ConditionalSelectionRate
Implementation of True Positive Rate Difference (Equal Opportunity Difference) as a moment.
Adds conditioning on the label y=1 compared to demographic parity, i.e.
\[P[h(X) = 1 | A = a, Y = 1] = P[h(X) = 1 | Y = 1] \; \forall a\]
This implementation of ConditionalSelectionRate defines the event corresponding to y=1.
The prob_event pandas.DataFrame will record the fraction of the samples corresponding to y = 1 in the Y array.
The index MultiIndex will have a number of entries equal to the number of unique values of the sensitive feature, multiplied by the number of unique non-NaN values of the constructed event array, whose entries are either NaN or label=1 (so only one unique non-NaN value), multiplied by two (for the Lagrange multipliers for positive and negative constraints).
With these definitions, the signed_weights() method will calculate the costs for y=1 as they are calculated in Example 4 of Agarwal et al. (2018) <https://arxiv.org/abs/1803.02453>, but will use weights equal to zero for y=0.
- short_name = 'TruePositiveRateDifference'¶
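The single-event construction can be illustrated on toy data: samples with y=0 get no event (None stands in for NaN here), and only the y=1 samples enter the rate comparison. The helper name, the use of None, and the data are assumptions for the example.

```python
def true_positive_rate_gaps(y_true, y_pred, sensitive):
    """Gap between each group's true positive rate and the overall one."""
    positives = [i for i, y in enumerate(y_true) if y == 1]
    overall = sum(y_pred[i] for i in positives) / len(positives)
    gaps = {}
    for group in sorted(set(sensitive)):
        members = [i for i in positives if sensitive[i] == group]
        rate = sum(y_pred[i] for i in members) / len(members)
        gaps[group] = rate - overall
    return gaps


y_true = [0, 0, 1, 1, 0, 0, 1, 1]
y_pred = [0, 1, 1, 1, 0, 0, 1, 0]
sensitive = ["a", "a", "a", "a", "b", "b", "b", "b"]

# Event array: "label=1" where y is 1, otherwise no event.
events = ["label=1" if y == 1 else None for y in y_true]
event_fraction = sum(e is not None for e in events) / len(events)
n_tpr_index_entries = len(set(sensitive)) * 1 * 2
tpr_gaps = true_positive_rate_gaps(y_true, y_pred, sensitive)
print(event_fraction, n_tpr_index_entries, tpr_gaps)
```

Predictions on the y=0 samples never affect the gaps, which mirrors the zero weights used for y=0.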
- class fairlearn.reductions.ZeroOneLoss[source]¶
Bases:
fairlearn.reductions.AbsoluteLoss
Class to evaluate a zero-one loss.