This module contains algorithms implementing the reductions approach to disparity mitigation.
In this approach, disparity constraints are cast as Lagrange multipliers, which cause the reweighting and relabelling of the input data. This reduces the problem back to standard machine learning training.
fairlearn.reductions.
AbsoluteLoss
Bases: object
Class to evaluate absolute loss.
eval
Evaluate the absolute loss for the given set of true and predicted values.
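As a rough illustration of what evaluating an absolute loss involves, the following sketch computes the per-sample value |y_true - y_pred| in plain Python. The function name absolute_loss is hypothetical and is not the library's eval method; the real class may apply additional processing that this sketch omits.

```python
def absolute_loss(y_true, y_pred):
    """Return the per-sample absolute loss |y_true - y_pred| (illustrative only)."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    return [abs(t - p) for t, p in zip(y_true, y_pred)]


# Three samples with continuous targets and predictions.
losses = absolute_loss([1.0, 0.0, 0.5], [0.75, 0.0, 1.0])
mean_loss = sum(losses) / len(losses)
print(losses, mean_loss)
```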
BoundedGroupLoss
Bases: fairlearn.reductions.ConditionalLossMoment
Moment for constraining the worst-case loss by a group.
For more information refer to the user guide.
ClassificationMoment
Bases: fairlearn.reductions.Moment
Moment that can be expressed as weighted classification error.
DemographicParity
Bases: fairlearn.reductions.UtilityParity
Implementation of demographic parity as a moment.
A classifier \(h(X)\) satisfies demographic parity if \(P[h(X) = 1 \mid A = a] = P[h(X) = 1]\) for all values \(a\) of the sensitive feature \(A\).
This implementation of UtilityParity defines a single event, all. Consequently, the prob_event pandas.Series will only have a single entry, which will be equal to 1. Similarly, the index property will have twice as many entries (corresponding to the Lagrange multipliers for positive and negative constraints) as there are unique values for the sensitive feature. The signed_weights() method will compute the costs according to Example 3 of Agarwal et al. (2018).
This Moment also supports control features, which can be used to stratify the data, with the Demographic Parity constraint applied within each stratum, but not between strata. If the control feature groups are \(c \in \mathcal{C}\), then the above equation becomes \(P[h(X) = 1 \mid A = a, C = c] = P[h(X) = 1 \mid C = c]\) for all \(a\) and \(c\).
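To make the demographic parity condition concrete, here is a minimal sketch that compares each group's selection rate with the overall selection rate for 0/1 predictions. The helper selection_rates and the toy data are assumptions made for illustration; they are not part of the library. The last line mirrors the statement that the index has two entries per unique sensitive feature value.

```python
from collections import defaultdict


def selection_rates(predictions, sensitive):
    """Return (overall_rate, {group: rate}) for 0/1 predictions (illustrative)."""
    totals = defaultdict(int)
    positives = defaultdict(int)
    for pred, group in zip(predictions, sensitive):
        totals[group] += 1
        positives[group] += pred
    overall = sum(predictions) / len(predictions)
    by_group = {g: positives[g] / totals[g] for g in totals}
    return overall, by_group


preds = [1, 0, 1, 1, 0, 0, 1, 0]
groups = ["a", "a", "a", "a", "b", "b", "b", "b"]
overall, by_group = selection_rates(preds, groups)

# Signed gap between each group's rate and the overall rate.
gaps = {g: rate - overall for g, rate in by_group.items()}

# One positive and one negative constraint per sensitive feature value.
n_constraints = 2 * len(by_group)
print(overall, by_group, gaps, n_constraints)
```

Demographic parity holds exactly when every gap is zero; a difference bound relaxes this to a small tolerance.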
load_data
Load the specified data into the object.
short_name
EqualizedOdds
Implementation of equalized odds as a moment.
Adds conditioning on the label compared to demographic parity, i.e., \(P[h(X) = 1 \mid A = a, Y = y] = P[h(X) = 1 \mid Y = y]\) for all \(a\) and \(y\).
This implementation of UtilityParity defines events corresponding to the unique values of the Y array.
The prob_event pandas.Series will record the fraction of the samples corresponding to each unique value in the Y array.
The index MultiIndex will have a number of entries equal to the number of unique values for the sensitive feature, multiplied by the number of unique values of the Y array, multiplied by two (for the Lagrange multipliers for positive and negative constraints).
With these definitions, the signed_weights() method will calculate the costs according to Example 4 of Agarwal et al. (2018).
This Moment also supports control features, which can be used to stratify the data, with the constraint applied within each stratum, but not between strata.
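The following sketch illustrates the quantities described above for equalized odds: the fraction of samples per event (one event per unique label) and the gap between each group's positive prediction rate and the event-level rate. The function names and data are hypothetical, chosen only for this example.

```python
from collections import Counter


def event_probabilities(y_true):
    """Fraction of samples for each unique label (one event per label)."""
    counts = Counter(y_true)
    return {label: count / len(y_true) for label, count in counts.items()}


def equalized_odds_gaps(y_true, y_pred, sensitive):
    """Return {(label, group): P[h=1 | A=group, Y=label] - P[h=1 | Y=label]}."""
    n_event = Counter(y_true)
    pos_event = Counter()
    n_cell = Counter()
    pos_cell = Counter()
    for y, p, a in zip(y_true, y_pred, sensitive):
        pos_event[y] += p
        n_cell[(y, a)] += 1
        pos_cell[(y, a)] += p
    return {
        cell: pos_cell[cell] / n_cell[cell] - pos_event[cell[0]] / n_event[cell[0]]
        for cell in n_cell
    }


y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0]
groups = ["a", "a", "b", "b", "a", "a", "b", "b"]

prob_event = event_probabilities(y_true)
gaps = equalized_odds_gaps(y_true, y_pred, groups)

# Groups x labels x two signs, matching the index size described above.
n_constraints = len(set(groups)) * len(set(y_true)) * 2
print(prob_event, gaps, n_constraints)
```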
ErrorRate
Bases: fairlearn.reductions.ClassificationMoment
Misclassification error.
gamma
Return the gamma values for the given predictor.
project_lambda
Return the lambda values.
signed_weights
Return the signed weights.
ErrorRateParity
Implementation of error rate parity as a moment.
A classifier \(h(X)\) satisfies error rate parity if \(P[h(X) \ne Y \mid A = a] = P[h(X) \ne Y]\) for all values \(a\) of the sensitive feature \(A\).
This implementation of UtilityParity defines a single event, all. Consequently, the prob_event pandas.Series will only have a single entry, which will be equal to 1.
The index property will have twice as many entries (corresponding to the Lagrange multipliers for positive and negative constraints) as there are unique values for the sensitive feature.
The signed_weights() method will compute the costs according to Example 3 of Agarwal et al. (2018). However, in this scenario, g = abs(h(x)-y) rather than g = h(x).
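A small sketch can show how error rate parity differs from demographic parity: the utility averaged per group is g = abs(h(x) - y) rather than h(x). The helper error_rate_gaps and the data below are illustrative assumptions, not library code.

```python
def error_rate_gaps(y_true, y_pred, sensitive):
    """Return (overall_error, {group: group_error - overall_error})."""
    # For 0/1 labels, abs(h(x) - y) is 1 on a mistake and 0 otherwise.
    errors = [abs(p - y) for p, y in zip(y_pred, y_true)]
    overall = sum(errors) / len(errors)
    gaps = {}
    for group in sorted(set(sensitive)):
        group_errors = [e for e, a in zip(errors, sensitive) if a == group]
        gaps[group] = sum(group_errors) / len(group_errors) - overall
    return overall, gaps


y_true = [1, 0, 1, 0, 1, 0, 1, 0]
y_pred = [1, 0, 0, 0, 0, 1, 0, 0]
groups = ["a", "a", "a", "a", "b", "b", "b", "b"]

overall_error, gaps = error_rate_gaps(y_true, y_pred, groups)
print(overall_error, gaps)
```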
ExponentiatedGradient
Bases: sklearn.base.BaseEstimator, sklearn.base.MetaEstimatorMixin
An Estimator which implements the exponentiated gradient approach to reductions.
The exponentiated gradient algorithm is described in detail by Agarwal et al. (2018).
estimator (estimator) – An estimator implementing methods fit(X, y, sample_weight) and predict(X), where X is the matrix of features, y is the vector of labels (binary classification) or continuous values (regression), and sample_weight is a vector of weights. In binary classification labels y and predictions returned by predict(X) are either 0 or 1. In regression values y and predictions are continuous.
constraints (fairlearn.reductions.Moment) – The disparity constraints expressed as moments
eps (float) – Allowed fairness constraint violation; the solution is guaranteed to have the error within 2*best_gap of the best error under constraint eps; the constraint violation is at most 2*(eps+best_gap)
max_iter (int) – Maximum number of iterations
nu (float) – Convergence threshold for the duality gap, corresponding to a conservative automatic setting based on the statistical uncertainty in measuring classification error
eta_0 (float) – Initial setting of the learning rate
run_linprog_step (bool) – If True, each step of exponentiated gradient is followed by the saddle point optimization over the convex hull of classifiers returned so far. Default True.
sample_weight_name (str) – Name of the argument to estimator.fit() which supplies the sample weights (defaults to sample_weight)
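The estimator parameter above only requires fit(X, y, sample_weight) and predict(X). The sketch below shows a toy object with that interface; the class WeightedMajorityClassifier is hypothetical and deliberately trivial. A real scikit-learn estimator would typically also derive from sklearn.base.BaseEstimator so that it can be cloned, which this standard-library sketch omits.

```python
class WeightedMajorityClassifier:
    """Toy estimator: always predicts the label with the larger total weight."""

    def fit(self, X, y, sample_weight=None):
        if sample_weight is None:
            sample_weight = [1.0] * len(y)
        weight_for_one = sum(w for label, w in zip(y, sample_weight) if label == 1)
        weight_for_zero = sum(w for label, w in zip(y, sample_weight) if label == 0)
        # Ties fall back to label 0.
        self.label_ = 1 if weight_for_one > weight_for_zero else 0
        return self

    def predict(self, X):
        return [self.label_ for _ in X]


X = [[0.0], [1.0], [2.0]]
y = [0, 0, 1]

unweighted = WeightedMajorityClassifier().fit(X, y).predict(X)
# Putting more weight on the single positive example flips the prediction.
reweighted = WeightedMajorityClassifier().fit(X, y, sample_weight=[1, 1, 5]).predict(X)
print(unweighted, reweighted)
```

The second call shows why sample_weight matters: the reduction changes the weights (and labels) passed to fit, and the estimator's output changes in response.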
fit
Return a fair classifier under specified fairness constraints.
X (numpy.ndarray or pandas.DataFrame) – Feature data
y (numpy.ndarray, pandas.DataFrame, pandas.Series, or list) – Label vector
predict
Provide predictions for the given input data.
Predictions are randomized, i.e., repeatedly calling predict with the same feature data may yield different output. This non-deterministic behavior is intended and stems from the nature of the exponentiated gradient algorithm.
Notes
A fitted ExponentiatedGradient has an attribute predictors_, an array of predictors, and an attribute weights_, an array of non-negative floats of the same length. The prediction on each data point in X is obtained by first picking a random predictor according to the probabilities in weights_ and then applying it. Different predictors can be chosen on different data points.
random_state (int or RandomState instance, default=None) – Controls random numbers used for randomized predictions. Pass an int for reproducible output across multiple function calls.
The prediction. If X represents the data for a single example, the result will be a scalar. Otherwise, the result will be a vector.
Scalar or vector
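The randomized prediction described in the Notes can be sketched as follows: for each data point, one predictor is drawn according to the weights and then applied. This is a simplified illustration using Python's random module; the function randomized_predict and the three toy predictors are assumptions, and the sketch always returns a list rather than a scalar for single examples.

```python
import random


def randomized_predict(predictors, weights, X, random_state=None):
    """Pick one predictor per data point with probability given by weights."""
    rng = random.Random(random_state)
    results = []
    for x in X:
        chosen = rng.choices(predictors, weights=weights, k=1)[0]
        results.append(chosen(x))
    return results


def always_zero(x):
    return 0


def always_one(x):
    return 1


def threshold(x):
    return int(x > 0.5)


predictors = [always_zero, always_one, threshold]
X = [0.1, 0.4, 0.6, 0.9]

# With a fixed seed, repeated calls give the same output.
first = randomized_predict(predictors, [0.2, 0.3, 0.5], X, random_state=42)
second = randomized_predict(predictors, [0.2, 0.3, 0.5], X, random_state=42)

# With all weight on one predictor, the output is deterministic.
only_ones = randomized_predict(predictors, [0.0, 1.0, 0.0], X, random_state=0)
print(first, second, only_ones)
```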
FalsePositiveRateParity
Implementation of false positive rate parity as a moment.
Adds conditioning on label Y=0 compared to demographic parity, i.e., \(P[h(X) = 1 \mid A = a, Y = 0] = P[h(X) = 1 \mid Y = 0]\) for all \(a\).
This implementation of UtilityParity defines the event corresponding to Y=0.
The prob_event pandas.DataFrame will record the fraction of the samples corresponding to Y = 0 in the Y array.
The index MultiIndex will have a number of entries equal to the number of unique values of the sensitive feature, multiplied by the number of unique non-NaN values of the constructed event array, whose entries are either NaN or label=0 (so only one unique non-NaN value), multiplied by two (for the Lagrange multipliers for positive and negative constraints).
With these definitions, the signed_weights() method will calculate the costs for Y=0 as in Example 4 of Agarwal et al. (2018) <https://arxiv.org/abs/1803.02453>, but will use weights equal to zero for Y=1.
GridSearch
Estimator to perform a grid search given a blackbox estimator algorithm.
The approach used is taken from section 3.4 of Agarwal et al. (2018).
selection_rule (str) – Specifies the procedure for selecting the best model found by the grid search. At the present time, the only valid value is “tradeoff_optimization” which minimizes a weighted sum of the error rate and constraint violation.
constraint_weight (float) – When the selection_rule is “tradeoff_optimization” this specifies the relative weight put on the constraint violation when selecting the best model. The weight placed on the error rate will be 1-constraint_weight
grid_size (int) – The number of Lagrange multipliers to generate in the grid
grid_limit (float) – The largest Lagrange multiplier to generate. The grid will contain values distributed between -grid_limit and grid_limit by default
grid_offset (pandas.DataFrame) – Shifts the grid of Lagrange multipliers by this value. Defaults to 0.
grid – Instead of supplying a size and limit for the grid, users may specify the exact set of Lagrange multipliers they desire using this argument.
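The parameters above can be illustrated with a simplified sketch: an evenly spaced one-dimensional grid between -grid_limit and grid_limit, followed by the "tradeoff_optimization" rule that minimizes a weighted sum of error rate and constraint violation. Both helpers and the candidate numbers are hypothetical; the library's actual grid construction is not described in this section and may differ, especially with more than one multiplier.

```python
def evenly_spaced_grid(grid_size, grid_limit):
    """Return grid_size values evenly spaced in [-grid_limit, grid_limit]."""
    if grid_size < 2:
        raise ValueError("grid_size must be at least 2 for this sketch")
    step = 2 * grid_limit / (grid_size - 1)
    return [-grid_limit + i * step for i in range(grid_size)]


def select_by_tradeoff(candidates, constraint_weight):
    """Pick the (name, error_rate, violation) tuple with the lowest weighted sum."""
    def score(candidate):
        _, error_rate, violation = candidate
        return (1 - constraint_weight) * error_rate + constraint_weight * violation

    return min(candidates, key=score)


grid = evenly_spaced_grid(grid_size=5, grid_limit=2.0)

# Made-up results for three fitted models: (name, error rate, violation).
candidates = [("m0", 0.10, 0.30), ("m1", 0.15, 0.10), ("m2", 0.25, 0.02)]
balanced = select_by_tradeoff(candidates, constraint_weight=0.5)
error_only = select_by_tradeoff(candidates, constraint_weight=0.0)
violation_only = select_by_tradeoff(candidates, constraint_weight=1.0)
print(grid, balanced[0], error_only[0], violation_only[0])
```

Raising constraint_weight shifts the selection toward models with smaller constraint violation at the cost of a higher error rate.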
Run the grid search.
This makes multiple copies of the estimator and calls the fit(X) method of each one.
X (numpy.ndarray or pandas.DataFrame) – The feature matrix
y (numpy.ndarray, pandas.DataFrame, pandas.Series, or list) – The label vector
sensitive_features (numpy.ndarray, pandas.DataFrame, pandas.Series, or list (for now)) – A (currently) required keyword argument listing the feature used by the constraints object
Provide a prediction using the best model found by the grid search.
This dispatches X to the predict(X) method of the selected estimator, and hence the return type is dependent on that method.
predict_proba
Provide the result of predict_proba from the best model found by the grid search.
The underlying estimator must support predict_proba(X) for this to work. The return type is determined by this method.
LossMoment
Moment that can be expressed as weighted loss.
Generic moment.
Our implementations of the reductions approach to fairness described in Agarwal et al. (2018) make use of Moment objects to describe the disparity constraints imposed on the solution. This is an abstract class for all such objects.
bound
Return a vector of fairness constraint bounds with the same length as gamma.
Calculate the degree to which constraints are currently violated by the predictor.
Load a set of data for use by this object.
X (array) – The feature array
y (pandas.Series) – The label vector
sensitive_features (pandas.Series) – The sensitive feature vector (default None)
Return the projected lambda values.
total_samples
Return the number of samples in the data.
SquareLoss
Class to evaluate the square loss.
Evaluate the square loss for the given set of true and predicted values.
TruePositiveRateParity
Implementation of true positive rate parity as a moment.
Adds conditioning on label Y=1 compared to demographic parity, i.e., \(P[h(X) = 1 \mid A = a, Y = 1] = P[h(X) = 1 \mid Y = 1]\) for all \(a\).
This implementation of UtilityParity defines the event corresponding to Y=1.
The prob_event pandas.DataFrame will record the fraction of the samples corresponding to Y = 1 in the Y array.
The index MultiIndex will have a number of entries equal to the number of unique values of the sensitive feature, multiplied by the number of unique non-NaN values of the constructed event array, whose entries are either NaN or label=1 (so only one unique non-NaN value), multiplied by two (for the Lagrange multipliers for positive and negative constraints).
With these definitions, the signed_weights() method will calculate the costs for Y=1 as in Example 4 of Agarwal et al. (2018) <https://arxiv.org/abs/1803.02453>, but will use weights equal to zero for Y=0.
A generic moment for parity in utilities (or costs) under classification.
This serves as the base class for DemographicParity, EqualizedOdds, and others. All subclasses can be used as difference-based constraints or ratio-based constraints. Refer to the user guide for more information and example usage.
Constraints compare the group-level mean utility for each group with the overall mean utility (unless further events are specified, e.g., in equalized odds). A difference-based constraint is violated when the difference between a group's mean utility and the overall mean utility exceeds difference_bound. For ratio-based constraints, the ratio between the group-level and overall mean utility needs to be bounded between ratio_bound and its inverse (plus an additional additive ratio_bound_slack).
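The two kinds of bounds can be sketched as simple checks on a group's mean utility against the overall mean. This is one possible reading of the description above, not the library's code: the ratio check is written in a multiplied-through form (an assumption made here to avoid dividing by a zero mean), with ratio_bound_slack added as an additive tolerance. Both function names are hypothetical.

```python
def within_difference_bound(group_mean, overall_mean, difference_bound):
    """True if the group mean is within difference_bound of the overall mean."""
    return abs(group_mean - overall_mean) <= difference_bound


def within_ratio_bound(group_mean, overall_mean, ratio_bound, ratio_bound_slack=0.0):
    """True if group/overall lies between ratio_bound and 1/ratio_bound, up to slack."""
    if not 0 < ratio_bound <= 1:
        raise ValueError("ratio_bound must be in (0, 1]")
    # With zero slack: group/overall <= 1/ratio_bound.
    upper_ok = ratio_bound * group_mean - overall_mean <= ratio_bound_slack
    # With zero slack: group/overall >= ratio_bound.
    lower_ok = ratio_bound * overall_mean - group_mean <= ratio_bound_slack
    return upper_ok and lower_ok


diff_violated = within_difference_bound(0.75, 0.5, difference_bound=0.01)
diff_satisfied = within_difference_bound(0.505, 0.5, difference_bound=0.01)

ratio_satisfied = within_ratio_bound(0.45, 0.5, ratio_bound=0.8)
ratio_violated = within_ratio_bound(0.3, 0.5, ratio_bound=0.8)
ratio_with_slack = within_ratio_bound(0.3, 0.5, ratio_bound=0.8, ratio_bound_slack=0.2)
print(diff_violated, diff_satisfied, ratio_satisfied, ratio_violated, ratio_with_slack)
```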
The index field is a pandas.MultiIndex corresponding to the constraint IDs. It is an index of various DataFrame and Series objects that are either required as arguments or returned by several of the methods of the UtilityParity class. It is the Cartesian product of:
The unique events defining the particular moment object
The unique values of the sensitive feature
The characters + and -, corresponding to the Lagrange multipliers for positive and negative violations of the constraint
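The Cartesian product listed above can be enumerated with itertools.product. The event and group names below are made up for illustration, and the ordering of the three levels follows the list above rather than any confirmed ordering inside the library.

```python
from itertools import product

# Hypothetical values: two events (as in equalized odds) and three groups.
events = ["label=0", "label=1"]
group_values = ["a", "b", "c"]
signs = ["+", "-"]

# One constraint ID per (event, group, sign) combination.
constraint_ids = list(product(events, group_values, signs))
print(len(constraint_ids))
print(constraint_ids[:4])
```

With a single event, as in demographic parity, the same product yields two entries per sensitive feature value.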
difference_bound (float) – The constraints’ difference bound for constraints that are expressed as differences, also referred to as \(\epsilon\) in documentation. If ratio_bound is used then difference_bound needs to be None. If neither ratio_bound nor difference_bound is set then a default difference bound of 0.01 is used for backwards compatibility. Default None.
ratio_bound (float) – The constraints’ ratio bound for constraints that are expressed as ratios. The specified value needs to be in (0,1]. If difference_bound is used then ratio_bound needs to be None. Default None.
ratio_bound_slack (float) – The constraints’ ratio bound slack for constraints that are expressed as ratios, also referred to as \(\epsilon\) in documentation. ratio_bound_slack is ignored if ratio_bound is not specified. Default 0.0.
Return bound vector.
a vector of bound values corresponding to all constraints
default_objective
Return the default objective for moments of this kind.
Load the specified data into this object.
This adds a column event to the tags field.
The utilities are a 2-d array corresponding to g(X,A,Y,h(X)) as described in Agarwal et al. (2018) <https://arxiv.org/abs/1803.02453>. The utilities default to h(X), i.e., [0, 1] for each X_i. The first column is G^0 and the second is G^1. This assumes binary classification with labels 0/1.
The projected lambda is guaranteed to lead to the same or higher value of the Lagrangian compared with lambda_vec for all possible choices of the classifier, h.
Compute the signed weights.
Uses the equations for \(C_i^0\) and \(C_i^1\) as defined in Section 3.2 of Agarwal et al. (2018) in the ‘best response of the Q-player’ subsection to compute the signed weights to be applied to the data by the next call to the underlying estimator.
lambda_vec (pandas.Series) – The vector of Lagrange multipliers indexed by index
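To connect signed weights with the reweighting and relabelling mentioned at the top of this module, the sketch below assumes the usual cost-sensitive reduction: the sign of each signed weight gives the new label and its magnitude gives the sample weight. The function name is hypothetical, the input values are made up, and the library may additionally rescale the weights.

```python
def reduce_to_weighted_classification(signed_weights):
    """Split signed weights into 0/1 labels and non-negative sample weights."""
    # Assumption: a positive signed weight favours predicting 1.
    labels = [1 if w > 0 else 0 for w in signed_weights]
    sample_weights = [abs(w) for w in signed_weights]
    return labels, sample_weights


labels, sample_weights = reduce_to_weighted_classification([0.5, -1.5, 0.0, 2.0])
print(labels, sample_weights)
```

The resulting labels and weights are what would be passed to the underlying estimator's fit method in place of the original labels.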
ZeroOneLoss
Bases: fairlearn.reductions.AbsoluteLoss
Class to evaluate a zero-one loss.
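One way to see why a zero-one loss can build on an absolute loss: for labels and predictions restricted to 0 and 1, the absolute difference equals the misclassification indicator. The two functions below are illustrative stand-ins, not the library's classes.

```python
def absolute_loss(y_true, y_pred):
    """Per-sample |y_true - y_pred| (illustrative stand-in)."""
    return [abs(t - p) for t, p in zip(y_true, y_pred)]


def zero_one_loss(y_true, y_pred):
    """Per-sample misclassification indicator: 1 on a mistake, 0 otherwise."""
    return [0 if t == p else 1 for t, p in zip(y_true, y_pred)]


y_true = [0, 1, 1, 0, 1]
y_pred = [0, 0, 1, 1, 1]

# On 0/1 values the two losses coincide sample by sample.
abs_losses = absolute_loss(y_true, y_pred)
zo_losses = zero_one_loss(y_true, y_pred)
print(abs_losses, zo_losses)
```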