fairlearn.reductions package

This module contains algorithms implementing the reductions approach to disparity mitigation.

In this approach, disparity constraints are cast as Lagrange multipliers, which cause the reweighting and relabelling of the input data. This reduces the problem back to standard machine learning training.

class fairlearn.reductions.AbsoluteLoss(min_val, max_val)[source]

Bases: object

Class to evaluate absolute loss.

eval(y_true, y_pred)[source]

Evaluate the absolute loss for the given set of true and predicted values.
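As a rough illustration of what such an evaluation might look like, the sketch below computes a per-sample absolute loss in plain Python. The clipping of both inputs to [min_val, max_val] is an assumption about how the two constructor arguments are used, and the function name absolute_loss is hypothetical; this is not the library's implementation.

```python
def absolute_loss(y_true, y_pred, min_val, max_val):
    """Per-sample absolute loss (illustrative sketch only).

    Assumption: values are clipped to [min_val, max_val] before the
    difference is taken.
    """
    def clip(value):
        return max(min_val, min(max_val, value))

    return [abs(clip(t) - clip(p)) for t, p in zip(y_true, y_pred)]


losses = absolute_loss([0.0, 1.0, 2.0], [0.5, 1.0, -1.0], min_val=0.0, max_val=1.0)
```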

class fairlearn.reductions.ClassificationMoment[source]

Bases: fairlearn.reductions.Moment

Moment that can be expressed as weighted classification error.

class fairlearn.reductions.ConditionalSelectionRate(ratio=1.0)[source]

Bases: fairlearn.reductions.ClassificationMoment

Generic fairness moment for selection rates.

This serves as the base class for both DemographicParity and EqualizedOdds. The two are distinguished by the events they define, which in turn affect the index field created by load_data().

The index field is a pandas.MultiIndex corresponding to the rows of the DataFrames either required as arguments or returned by several of the methods of the ConditionalSelectionRate class. It is the cartesian product of:

  • The unique events defined for the particular object

  • The unique values for the sensitive feature

  • The characters + and -, corresponding to the Lagrange multipliers for positive and negative violations of the constraint
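The cartesian product described above can be sketched with the standard library. The group values and the ordering of the three levels are assumptions made for illustration; the actual index is a pandas.MultiIndex and its level order may differ.

```python
from itertools import product

events = ["all"]            # a single event, as for DemographicParity
groups = ["a", "b", "c"]    # hypothetical sensitive-feature values
signs = ["+", "-"]          # positive and negative constraint violations

# One entry per (event, group, sign) combination.
index = list(product(events, groups, signs))
```

With one event and three groups this yields 1 * 3 * 2 = 6 entries, one Lagrange multiplier each.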

The ratio specifies the multiple at which error(A = a) should be compared with total_error and vice versa. The value of ratio must lie in the range (0, 1], with smaller values corresponding to a weaker constraint. A ratio of 1 corresponds to the constraint error(A = a) = total_error.

default_objective()[source]

Return the default objective for moments of this kind.

gamma(predictor)[source]

Calculate the degree to which constraints are currently violated by the predictor.

load_data(X, y, event=None, utilities=None, **kwargs)[source]

Load the specified data into this object.

This adds a column event to the tags field.

utilities is a 2-d array corresponding to g(X,A,Y,h(X)) as described in Agarwal et al. (2018) <https://arxiv.org/abs/1803.02453>. It defaults to h(X), i.e. [0, 1] for each X_i. The first column is G^0 and the second is G^1. Binary classification with labels 0/1 is assumed.

\[utilities = [g(X,A,Y,h(X)=0), g(X,A,Y,h(X)=1)]\]
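To make the two-column layout concrete, here is a minimal sketch in plain Python (nested lists rather than an array). The helper names default_utilities and utility_of_prediction are hypothetical. Each row holds the utility for predicting 0 and for predicting 1, so with the default rows [0, 1] the utility of a prediction equals the prediction itself.

```python
def default_utilities(n_samples):
    """One row [g(h(X)=0), g(h(X)=1)] per sample; the default is [0, 1]."""
    return [[0.0, 1.0] for _ in range(n_samples)]


def utility_of_prediction(utilities, predictions):
    """Pick column 0 or 1 of each row according to the 0/1 prediction."""
    return [row[p] for row, p in zip(utilities, predictions)]


utilities = default_utilities(3)
realized = utility_of_prediction(utilities, [1, 0, 1])
```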

project_lambda(lambda_vec)[source]

Return the projected lambda values.

That is, return a lambda that is guaranteed to yield the same or a higher value of the Lagrangian than lambda_vec for every possible choice of the classifier h.

signed_weights(lambda_vec)[source]

Compute the signed weights.

Uses the equations for \(C_i^0\) and \(C_i^1\) as defined in Section 3.2 of Agarwal et al. (2018) in the ‘best response of the Q-player’ subsection to compute the signed weights to be applied to the data by the next call to the underlying estimator.

Parameters

lambda_vec (pandas.Series) – The vector of Lagrange multipliers indexed by index
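One common way to hand signed weights to a standard estimator is to relabel each sample by the sign of its weight and reweight it by the magnitude. The sketch below illustrates that idea under this assumption; the function name is hypothetical and the treatment of zero weights (label 0, weight 0) is an arbitrary choice.

```python
def to_weighted_classification(signed_weights):
    """Split signed weights into 0/1 labels and non-negative sample weights."""
    labels = [1 if w > 0 else 0 for w in signed_weights]
    sample_weights = [abs(w) for w in signed_weights]
    return labels, sample_weights


labels, sample_weights = to_weighted_classification([0.7, -0.2, 0.0, -1.5])
```

The resulting pair could then be passed as y and sample_weight to an estimator's fit method.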

class fairlearn.reductions.DemographicParity(ratio=1.0)[source]

Bases: fairlearn.reductions.ConditionalSelectionRate

Implementation of Demographic Parity as a moment.

A classifier \(h(X)\) satisfies DemographicParity if

\[P[h(X) = y' | A = a] = P[h(X) = y'] \; \forall a, y'\]

This implementation of ConditionalSelectionRate defines a single event, all. Consequently, the prob_event pandas.Series will only have a single entry, which will be equal to 1. Similarly, the index property will have twice as many entries (corresponding to the Lagrange multipliers for positive and negative constraints) as there are unique values for the sensitive feature. The signed_weights() method will compute the costs according to Example 3 of Agarwal et al. (2018).
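The parity condition can be checked directly on a set of predictions. The following sketch, with a hypothetical helper name and made-up data, compares each group's selection rate with the overall selection rate; it does not use the DemographicParity class itself.

```python
from collections import defaultdict


def selection_rates(y_pred, sensitive):
    """Return the overall selection rate and the rate within each group."""
    totals = defaultdict(int)
    selected = defaultdict(int)
    for pred, group in zip(y_pred, sensitive):
        totals[group] += 1
        selected[group] += pred
    overall = sum(y_pred) / len(y_pred)
    by_group = {g: selected[g] / totals[g] for g in totals}
    return overall, by_group


y_pred = [1, 0, 1, 1, 0, 0, 1, 0]
sensitive = ["a", "a", "a", "a", "b", "b", "b", "b"]
overall, by_group = selection_rates(y_pred, sensitive)

# Signed gap per group; demographic parity holds when every gap is zero.
gaps = {g: rate - overall for g, rate in by_group.items()}
```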

load_data(X, y, **kwargs)[source]

Load the specified data into the object.

short_name = 'DemographicParity'

class fairlearn.reductions.EqualizedOdds(ratio=1.0)[source]

Bases: fairlearn.reductions.ConditionalSelectionRate

Implementation of Equalized Odds as a moment.

Adds conditioning on label compared to Demographic parity, i.e.

\[P[h(X) = y' | A = a, Y = y] = P[h(X) = y' | Y = y] \; \forall a, y, y'\]

This implementation of ConditionalSelectionRate defines events corresponding to the unique values of the Y array.

The prob_event pandas.Series will record the fraction of the samples corresponding to each unique value in the Y array.

The index MultiIndex will have a number of entries equal to the number of unique values for the sensitive feature, multiplied by the number of unique values of the Y array, multiplied by two (for the Lagrange multipliers for positive and negative constraints).

With these definitions, the signed_weights() method will calculate the costs according to Example 4 of Agarwal et al. (2018).
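A small worked example may help with the index-size arithmetic and with the conditioning on the label. The helper name and the data are hypothetical; the sketch computes, for each label value, the selection rate within each group alongside the overall rate for that label.

```python
def conditional_selection_rates(y_true, y_pred, sensitive):
    """Map (label, group) to (group selection rate, overall rate for that label)."""
    rates = {}
    for label in sorted(set(y_true)):
        rows = [(p, a) for t, p, a in zip(y_true, y_pred, sensitive) if t == label]
        overall = sum(p for p, _ in rows) / len(rows)
        for group in sorted({a for _, a in rows}):
            preds = [p for p, a in rows if a == group]
            rates[(label, group)] = (sum(preds) / len(preds), overall)
    return rates


y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 1, 0, 0]
sensitive = ["a", "b", "a", "b", "a", "b", "a", "b"]
rates = conditional_selection_rates(y_true, y_pred, sensitive)

# Number of index entries: groups * label values * 2 signs.
n_index = len(set(sensitive)) * len(set(y_true)) * 2
```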

load_data(X, y, **kwargs)[source]

Load the specified data into the object.

short_name = 'EqualizedOdds'

class fairlearn.reductions.ErrorRate[source]

Bases: fairlearn.reductions.ClassificationMoment

Misclassification error.

gamma(predictor)[source]

Return the gamma values for the given predictor.

load_data(X, y, **kwargs)[source]

Load the specified data into the object.

project_lambda(lambda_vec)[source]

Return the lambda values.

signed_weights(lambda_vec=None)[source]

Return the signed weights.

short_name = 'Err'

class fairlearn.reductions.ErrorRateRatio(ratio=1.0)[source]

Bases: fairlearn.reductions.ConditionalSelectionRate

Implementation of Error Rate Ratio as a moment.

Measures the ratio of the error rate within each group to the overall error rate. The two-sided version of the error ratio can be written as ratio <= error(A=a) / total_error <= 1/ratio, that is

\[ratio \leq E[abs(h(x) - y) | A = a] / E[abs(h(x) - y)] \leq 1/ratio \; \forall a\]

This implementation of ConditionalSelectionRate defines a single event, all. Consequently, the prob_event pandas.Series will only have a single entry, which will be equal to 1.

The index property will have twice as many entries (corresponding to the Lagrange multipliers for positive and negative constraints) as there are unique values for the sensitive feature.

The signed_weights() method will compute the costs according to Example 3 of Agarwal et al. (2018). However, in this scenario, g = abs(h(x)-y) rather than g = h(x).
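The two-sided bound can be verified on concrete predictions. This sketch uses a hypothetical helper and made-up data; it multiplies both inequalities through by the overall error so that no division is needed, and it is not the ErrorRateRatio implementation.

```python
def error_rate_ratio_ok(y_true, y_pred, sensitive, ratio):
    """For each group, check ratio <= error(A=a) / total_error <= 1 / ratio."""
    errors = [abs(p - t) for t, p in zip(y_true, y_pred)]
    total_error = sum(errors) / len(errors)
    result = {}
    for group in sorted(set(sensitive)):
        group_errors = [e for e, a in zip(errors, sensitive) if a == group]
        group_error = sum(group_errors) / len(group_errors)
        # Equivalent to the ratio form whenever total_error > 0.
        lower_ok = ratio * total_error <= group_error
        upper_ok = ratio * group_error <= total_error
        result[group] = lower_ok and upper_ok
    return result


y_true = [1, 1, 0, 0, 1, 1, 0, 0]
y_pred = [1, 0, 0, 0, 0, 0, 1, 0]
sensitive = ["a", "a", "a", "a", "b", "b", "b", "b"]

loose = error_rate_ratio_ok(y_true, y_pred, sensitive, ratio=0.5)
tight = error_rate_ratio_ok(y_true, y_pred, sensitive, ratio=0.8)
```

Here the group error rates are 0.25 and 0.75 against an overall error of 0.5, so the bound holds for ratio=0.5 but fails for the stricter ratio=0.8.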

load_data(X, y, **kwargs)[source]

Load the specified data into the object.

short_name = 'ErrorRateRatio'

class fairlearn.reductions.ExponentiatedGradient(estimator, constraints, eps=0.01, T=50, nu=None, eta_mul=2.0)[source]

Bases: sklearn.base.BaseEstimator, sklearn.base.MetaEstimatorMixin

An Estimator which implements the exponentiated gradient approach to reductions.

The exponentiated gradient algorithm is described in detail by Agarwal et al. (2018).

Parameters
  • estimator (estimator) – An estimator implementing methods fit(X, y, sample_weight) and predict(X), where X is the matrix of features, y is the vector of labels, and sample_weight is a vector of weights; labels y and predictions returned by predict(X) are either 0 or 1.

  • constraints (fairlearn.reductions.Moment) – The disparity constraints expressed as moments

  • eps (float) – Allowed fairness constraint violation; the solution is guaranteed to have the error within 2*best_gap of the best error under constraint eps; the constraint violation is at most 2*(eps+best_gap)

  • T (int) – Maximum number of iterations

  • nu (float) – Convergence threshold for the duality gap, corresponding to a conservative automatic setting based on the statistical uncertainty in measuring classification error

  • eta_mul (float) – Initial setting of the learning rate

fit(X, y, **kwargs)[source]

Return a fair classifier under specified fairness constraints.

Parameters
predict(X)[source]

Provide a prediction for the given input data.

Note that this is non-deterministic, due to the nature of the exponentiated gradient algorithm.

Parameters

X (numpy.ndarray or pandas.DataFrame) – Feature data

Returns

The prediction. If X represents the data for a single example, the result will be a scalar; otherwise, the result will be a vector.

Return type

Scalar or vector
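The non-determinism can be pictured as sampling from a weighted collection of classifiers. The sketch below assumes that the fitted model amounts to a probability distribution over base classifiers and that one classifier is drawn per example; both the function name and this per-example sampling scheme are illustrative assumptions rather than a description of the library's internals.

```python
import random


def randomized_predict(classifiers, weights, X, rng):
    """Draw one classifier per example according to weights and apply it."""
    predictions = []
    for x in X:
        chosen = rng.choices(classifiers, weights=weights, k=1)[0]
        predictions.append(chosen(x))
    return predictions


def always_zero(x):
    return 0


def threshold(x):
    return 1 if x >= 0.5 else 0


rng = random.Random(0)  # seeded only to make the sketch repeatable
preds = randomized_predict([always_zero, threshold], [0.3, 0.7], [0.1, 0.9, 0.6], rng)
```

Repeated calls with an unseeded generator may return different predictions for inputs on which the base classifiers disagree.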

class fairlearn.reductions.GridSearch(estimator, constraints, selection_rule='tradeoff_optimization', constraint_weight=0.5, grid_size=10, grid_limit=2.0, grid_offset=None, grid=None)[source]

Bases: sklearn.base.BaseEstimator, sklearn.base.MetaEstimatorMixin

Estimator to perform a grid search given a blackbox estimator algorithm.

The approach used is taken from section 3.4 of Agarwal et al. (2018).

Parameters
  • estimator (estimator) – An estimator implementing methods fit(X, y, sample_weight) and predict(X), where X is the matrix of features, y is the vector of labels, and sample_weight is a vector of weights; labels y and predictions returned by predict(X) are either 0 or 1.

  • constraints (fairlearn.reductions.Moment) – The disparity constraints expressed as moments.

  • selection_rule (str) – Specifies the procedure for selecting the best model found by the grid search. At the present time, the only valid value is “tradeoff_optimization” which minimises a weighted sum of the error rate and constraint violation.

  • constraint_weight (float) – When the selection_rule is “tradeoff_optimization”, this specifies the relative weight put on the constraint violation when selecting the best model. The weight placed on the error rate will be 1-constraint_weight.

  • grid_size (int) – The number of Lagrange multipliers to generate in the grid

  • grid_limit (float) – The largest Lagrange multiplier to generate. The grid will contain values distributed between -grid_limit and grid_limit by default

  • grid_offset (pandas.DataFrame) – Shifts the grid of Lagrange multipliers by the given value. It is 0 by default.

  • grid – Instead of supplying a size and limit for the grid, users may specify the exact set of Lagrange multipliers they desire using this argument.
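The interplay of grid_size, grid_limit and constraint_weight can be sketched as follows. The evenly spaced one-dimensional grid is a simplifying assumption (the library may place grid points differently, especially with several multipliers), and the candidate error and violation figures are invented for illustration.

```python
def make_grid(grid_size, grid_limit):
    """Evenly spaced multipliers between -grid_limit and grid_limit."""
    if grid_size == 1:
        return [0.0]
    step = 2 * grid_limit / (grid_size - 1)
    return [-grid_limit + i * step for i in range(grid_size)]


def select_best(candidates, constraint_weight):
    """Minimise the weighted sum of constraint violation and error rate."""
    def score(candidate):
        return (constraint_weight * candidate["violation"]
                + (1 - constraint_weight) * candidate["error"])
    return min(candidates, key=score)


grid = make_grid(grid_size=5, grid_limit=2.0)

# Hypothetical results of fitting one model per multiplier.
candidates = [
    {"multiplier": -1.0, "error": 0.10, "violation": 0.30},
    {"multiplier": 0.0, "error": 0.15, "violation": 0.10},
    {"multiplier": 1.0, "error": 0.30, "violation": 0.02},
]
best = select_best(candidates, constraint_weight=0.5)
```

Setting constraint_weight to 0 would select purely on error rate, and setting it to 1 purely on constraint violation.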

fit(X, y, **kwargs)[source]

Run the grid search.

This will result in multiple copies of the estimator being made, and the fit(X) method of each one being called.

Parameters
predict(X)[source]

Provide a prediction using the best model found by the grid search.

This dispatches X to the predict(X) method of the selected estimator, and hence the return type is dependent on that method.

Parameters

X (numpy.ndarray or pandas.DataFrame) – Feature data

predict_proba(X)[source]

Provide the result of predict_proba from the best model found by the grid search.

The underlying estimator must support predict_proba(X) for this to work. The return type is determined by this method.

Parameters

X (numpy.ndarray or pandas.DataFrame) – Feature data

class fairlearn.reductions.GroupLossMoment(loss)[source]

Bases: fairlearn.reductions.ConditionalLossMoment

Moment for Group Loss.

class fairlearn.reductions.LossMoment(loss)[source]

Bases: fairlearn.reductions.Moment

Moment that can be expressed as weighted loss.

class fairlearn.reductions.Moment[source]

Bases: object

Generic moment.

Our implementations of the reductions approach to fairness described in Agarwal et al. (2018) make use of Moment objects to describe the disparity constraints imposed on the solution. This is an abstract class for all such objects.

gamma(predictor)[source]

Calculate the degree to which constraints are currently violated by the predictor.

load_data(X, y, **kwargs)[source]

Load a set of data for use by this object.

The keyword arguments can contain a sensitive_features array.

Parameters
  • X (array) – The feature data

  • y (array) – The true label data

project_lambda(lambda_vec)[source]

Return the projected lambda values.

signed_weights(lambda_vec)[source]

Return the signed weights.

property total_samples

Return the number of samples in the data.

class fairlearn.reductions.SquareLoss(min_val, max_val)[source]

Bases: object

Class to evaluate the square loss.

eval(y_true, y_pred)[source]

Evaluate the square loss for the given set of true and predicted values.

class fairlearn.reductions.TruePositiveRateDifference(ratio=1.0)[source]

Bases: fairlearn.reductions.ConditionalSelectionRate

Implementation of True Positive Rate Difference (Equal Opportunity Difference) as a moment.

Adds conditioning on label y=1 compared to Demographic parity, i.e.

\[P[h(X) = 1 | A = a, Y = 1] = P[h(X) = 1 | Y = 1] \; \forall a\]

This implementation of ConditionalSelectionRate defines the event corresponding to y=1.

The prob_event pandas.Series will record the fraction of the samples corresponding to y = 1 in the Y array.

The index MultiIndex will have a number of entries equal to the number of unique values of the sensitive feature, multiplied by the number of unique non-NaN values of the constructed event array, whose entries are either NaN or label=1 (so only one unique non-NaN value), multiplied by two (for the Lagrange multipliers for positive and negative constraints).

With these definitions, the signed_weights() method will calculate the costs for y=1 as they are calculated in Example 4 of Agarwal et al. (2018) <https://arxiv.org/abs/1803.02453>, but will use the weights equal to zero for y=0.
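Because only the y=1 rows matter for this moment, the quantity being constrained is the true positive rate per group. A minimal sketch with a hypothetical helper and made-up data:

```python
def true_positive_rates(y_true, y_pred, sensitive):
    """Overall and per-group selection rate among samples with y_true == 1."""
    positives = [(p, a) for t, p, a in zip(y_true, y_pred, sensitive) if t == 1]
    overall = sum(p for p, _ in positives) / len(positives)
    by_group = {}
    for group in sorted({a for _, a in positives}):
        preds = [p for p, a in positives if a == group]
        by_group[group] = sum(preds) / len(preds)
    return overall, by_group


y_true = [1, 1, 1, 1, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0]
sensitive = ["a", "a", "b", "b", "a", "b"]
overall_tpr, tpr_by_group = true_positive_rates(y_true, y_pred, sensitive)
```

The two samples with y_true == 0 do not affect the result, mirroring the zero weights described above.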

load_data(X, y, **kwargs)[source]

Load the specified data into the object.

short_name = 'TruePositiveRateDifference'

class fairlearn.reductions.ZeroOneLoss[source]

Bases: fairlearn.reductions.AbsoluteLoss

Class to evaluate a zero-one loss.