fairlearn.reductions.UtilityParity

class fairlearn.reductions.UtilityParity(*, difference_bound=None, ratio_bound=None, ratio_bound_slack=0.0)

A generic moment for parity in utilities (or costs) under classification.

This serves as the base class for DemographicParity, EqualizedOdds, and others. All subclasses can be used as difference-based constraints or ratio-based constraints. Refer to the user guide for more information and example usage.

Constraints compare the mean utility of each group with the overall mean utility (unless further events are specified, e.g., in equalized odds). A difference-based constraint is violated when the difference between the mean utility of a group and that of the overall population exceeds difference_bound. For ratio-based constraints, the ratio between the group-level and overall mean utility must lie between ratio_bound and its inverse (plus an additional additive ratio_bound_slack).
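
The two kinds of constraint can be illustrated with a small pure-Python sketch. It does not use fairlearn; the helper names are hypothetical, and the ratio check is written as two inequalities (ratio_bound * group - overall <= slack and ratio_bound * overall - group <= slack), which is one reading of the description above rather than a statement of the library's internals. With zero slack and a positive overall mean, those inequalities are equivalent to keeping the ratio between ratio_bound and its inverse.

```python
from statistics import mean


def group_means(utilities, groups):
    """Return (overall mean, {group: mean}) for two parallel sequences."""
    by_group = {}
    for u, g in zip(utilities, groups):
        by_group.setdefault(g, []).append(u)
    return mean(utilities), {g: mean(v) for g, v in by_group.items()}


def difference_violations(utilities, groups, difference_bound):
    """Flag groups whose mean utility differs from the overall mean by more than the bound."""
    overall, per_group = group_means(utilities, groups)
    return {g: abs(m - overall) > difference_bound for g, m in per_group.items()}


def ratio_violations(utilities, groups, ratio_bound, ratio_bound_slack=0.0):
    """Flag groups that break either ratio inequality (assumed form, see lead-in)."""
    overall, per_group = group_means(utilities, groups)
    return {
        g: (ratio_bound * m - overall > ratio_bound_slack
            or ratio_bound * overall - m > ratio_bound_slack)
        for g, m in per_group.items()
    }


# Illustrative data: the utility is the 0/1 prediction, and there are two groups.
utilities = [1.0, 1.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0]
groups = ["a", "a", "a", "a", "b", "b", "b", "b"]

diff_tight = difference_violations(utilities, groups, 0.1)
diff_loose = difference_violations(utilities, groups, 0.2)
ratio_tight = ratio_violations(utilities, groups, 0.8)
ratio_loose = ratio_violations(utilities, groups, 0.6)
```

Here group a has mean utility 0.5, group b has 0.25, and the overall mean is 0.375, so a bound of 0.1 flags both groups while a bound of 0.2 flags neither.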

The index field is a pandas.MultiIndex corresponding to the constraint IDs. It is an index of various DataFrame and Series objects that are either required as arguments or returned by several of the methods of the UtilityParity class. It is the Cartesian product of:

  • The unique events defining the particular moment object

  • The unique values of the sensitive feature

  • The characters + and -, corresponding to the Lagrange multipliers for positive and negative violations of the constraint
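
As a rough illustration of the Cartesian product described above, the following sketch enumerates constraint IDs as plain tuples. The event and group names are made up for the example, and the order of the levels in the real pandas.MultiIndex may differ from the order used here.

```python
from itertools import product

# Hypothetical values: one event (as for a moment with no extra events)
# and two sensitive feature values.
events = ["all"]
groups = ["a", "b"]
signs = ["+", "-"]

# One constraint ID per (event, group, sign) combination.
constraint_ids = list(product(events, groups, signs))

# With two events (for example, one per true label) the count doubles.
two_event_ids = list(product(["label=0", "label=1"], groups, signs))
```

Each sensitive feature value therefore contributes two constraints per event, one for each direction of violation.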

Read more in the User Guide.

Parameters:
  • difference_bound (float) – The difference bound for constraints that are expressed as differences, also referred to as \(\epsilon\) in the documentation. If ratio_bound is used, difference_bound must be None. If neither ratio_bound nor difference_bound is set, a default difference bound of 0.01 is used for backwards compatibility. Default None.

  • ratio_bound (float) – The constraints’ ratio bound for constraints that are expressed as ratios. The specified value needs to be in (0,1]. If difference_bound is used then ratio_bound needs to be None. Default None.

  • ratio_bound_slack (float) – The ratio bound slack for constraints that are expressed as ratios, also referred to as \(\epsilon\) in the documentation. ratio_bound_slack is ignored if ratio_bound is not specified. Default 0.0.
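
The rules for combining these parameters can be summarised in a short sketch. resolve_bounds is a hypothetical helper written only for illustration; it is not part of fairlearn, and the library's own validation and error messages may differ.

```python
def resolve_bounds(difference_bound=None, ratio_bound=None, ratio_bound_slack=0.0):
    """Hypothetical helper mirroring the parameter rules described above."""
    if difference_bound is not None and ratio_bound is not None:
        # The two kinds of bound are mutually exclusive.
        raise ValueError("Set only one of difference_bound and ratio_bound.")
    if ratio_bound is None:
        # Difference-based: fall back to 0.01 when nothing is specified.
        eps = 0.01 if difference_bound is None else difference_bound
        return {"kind": "difference", "eps": eps}
    if not 0.0 < ratio_bound <= 1.0:
        raise ValueError("ratio_bound must be in (0, 1].")
    # Ratio-based: the slack only matters in this branch.
    return {"kind": "ratio", "ratio": ratio_bound, "eps": ratio_bound_slack}


default_cfg = resolve_bounds()
ratio_cfg = resolve_bounds(ratio_bound=0.8, ratio_bound_slack=0.05)
```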

Attributes:
total_samples

Return the number of samples in the data.

Methods

bound()

Return bound vector.

default_objective()

Return the default objective for moments of this kind.

gamma(predictor)

Calculate the degree to which constraints are currently violated by the predictor.

load_data(X, y, *, sensitive_features[, ...])

Load the specified data into this object.

project_lambda(lambda_vec)

Return the projected lambda values.

signed_weights(lambda_vec)

Compute the signed weights.

bound()

Return bound vector.

Returns:

a vector of bound values corresponding to all constraints

Return type:

pandas.Series

default_objective()

Return the default objective for moments of this kind.

gamma(predictor)

Calculate the degree to which constraints are currently violated by the predictor.

load_data(X, y, *, sensitive_features, event=None, utilities=None)

Load the specified data into this object.

This adds a column event to the tags field.

utilities is a 2-d array that corresponds to g(X,A,Y,h(X)) from Agarwal et al.[1]. It defaults to h(X), i.e. [0, 1] for each X_i. The first column is G^0 and the second is G^1. Binary classification with labels 0/1 is assumed.

\[utilities = [g(X,A,Y,h(X)=0), g(X,A,Y,h(X)=1)]\]
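
A minimal sketch of the default utilities, using plain lists in place of a 2-d array. Both function names are hypothetical and only illustrate how the two columns relate to a prediction: column 0 holds the utility when h(X)=0 and column 1 the utility when h(X)=1.

```python
def default_utilities(n_samples):
    """One [G^0, G^1] row per sample; the default makes the utility equal h(X)."""
    return [[0.0, 1.0] for _ in range(n_samples)]


def realized_utility(utilities, predictions):
    """Pick, for each sample, the column that matches its 0/1 prediction."""
    return [row[pred] for row, pred in zip(utilities, predictions)]


predictions = [1, 0, 1]
utilities = default_utilities(len(predictions))
realized = realized_utility(utilities, predictions)
```

With the default rows, the realized utility of each sample is simply its prediction, which is why the default is described as h(X).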

project_lambda(lambda_vec)

Return the projected lambda values.

That is, it returns a lambda that is guaranteed to lead to the same or higher value of the Lagrangian compared with lambda_vec for all possible choices of the classifier, h.
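
One way such a projection can work for a single difference-based constraint is sketched below. It assumes the constraint contributes lambda_pos * (gamma - eps) + lambda_neg * (-gamma - eps) to the Lagrangian, where gamma is the signed violation and eps the bound. Both functions are hypothetical, are not taken from fairlearn, and do not cover the ratio-based case.

```python
def project_pair(lambda_pos, lambda_neg):
    """Subtract the shared part of the two multipliers so at most one stays non-zero."""
    common = min(lambda_pos, lambda_neg)
    return lambda_pos - common, lambda_neg - common


def lagrangian_term(lambda_pos, lambda_neg, gamma, eps):
    """Assumed contribution of one +/- constraint pair to the Lagrangian."""
    return lambda_pos * (gamma - eps) + lambda_neg * (-gamma - eps)


projected = project_pair(3.0, 1.0)

# The net multiplier (lambda_pos - lambda_neg) is unchanged, so the gamma part
# of the term is the same, while the eps penalty shrinks for every gamma.
never_lower = all(
    lagrangian_term(*projected, gamma, 0.01) >= lagrangian_term(3.0, 1.0, gamma, 0.01)
    for gamma in (-0.5, -0.1, 0.0, 0.1, 0.5)
)
```

Because the comparison holds for any gamma, it holds for any classifier h, which matches the guarantee stated above.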

signed_weights(lambda_vec)

Compute the signed weights.

Uses the equations for \(C_i^0\) and \(C_i^1\) as defined in Section 3.2 of Agarwal et al.[1] in the ‘best response of the Q-player’ subsection to compute the signed weights to be applied to the data by the next call to the underlying estimator.

Parameters:

lambda_vec (pandas.Series) – The vector of Lagrange multipliers indexed by index

property total_samples

Return the number of samples in the data.