fairlearn.reductions.UtilityParity
- class fairlearn.reductions.UtilityParity(*, difference_bound=None, ratio_bound=None, ratio_bound_slack=0.0)
A generic moment for parity in utilities (or costs) under classification.
This serves as the base class for DemographicParity, EqualizedOdds, and others. All subclasses can be used as difference-based constraints or ratio-based constraints. Refer to the user guide for more information and example usage.
Constraints compare the group-level mean utility for each group with the overall mean utility (unless further events are specified, e.g., in equalized odds). A difference-based constraint is violated when the difference between a group's mean utility and the overall mean utility exceeds difference_bound. For ratio-based constraints, the ratio between the group-level and overall mean utility must lie between ratio_bound and its inverse (plus an additional additive ratio_bound_slack).
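The two kinds of constraint described above can be illustrated with a small plain-Python sketch. It does not call fairlearn, and the function names are hypothetical. The ratio check is written in a division-free form (ratio_bound times one mean, minus the other, compared against the slack); that form is an assumption about how the slack enters, not a statement of the library's implementation.

```python
from collections import defaultdict


def group_means(utilities, groups):
    """Return (overall_mean, {group: mean}) for per-sample utilities."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for u, g in zip(utilities, groups):
        sums[g] += u
        counts[g] += 1
    overall = sum(utilities) / len(utilities)
    return overall, {g: sums[g] / counts[g] for g in sums}


def difference_violations(utilities, groups, difference_bound):
    """Flag groups whose mean utility differs from the overall mean by more than the bound."""
    overall, per_group = group_means(utilities, groups)
    return {g: abs(m - overall) > difference_bound for g, m in per_group.items()}


def ratio_violations(utilities, groups, ratio_bound, ratio_bound_slack=0.0):
    """Flag groups whose mean falls outside [ratio_bound, 1/ratio_bound] times the overall mean.

    Assumed division-free form: r * group - overall <= slack and
    r * overall - group <= slack.
    """
    overall, per_group = group_means(utilities, groups)
    flags = {}
    for g, m in per_group.items():
        too_high = ratio_bound * m - overall > ratio_bound_slack
        too_low = ratio_bound * overall - m > ratio_bound_slack
        flags[g] = too_high or too_low
    return flags


# Utilities here are simply 0/1 predictions; group "a" has mean 0.75, group "b" 0.25.
preds = [1, 1, 1, 0, 1, 0, 0, 0]
groups = ["a", "a", "a", "a", "b", "b", "b", "b"]

diff_flags = difference_violations(preds, groups, difference_bound=0.1)
ratio_flags = ratio_violations(preds, groups, ratio_bound=0.8)
```

With an overall mean of 0.5, both groups are 0.25 away from it, so a difference bound of 0.1 flags both, while a bound of 0.3 flags neither.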
The index field is a pandas.MultiIndex corresponding to the constraint IDs. It is an index of various DataFrame and Series objects that are either required as arguments or returned by several of the methods of the UtilityParity class. It is the Cartesian product of:
- The unique events defining the particular moment object
- The unique values of the sensitive feature
- The characters + and -, corresponding to the Lagrange multipliers for positive and negative violations of the constraint
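As a rough illustration of the three factors listed above, the sketch below enumerates constraint IDs with the standard library. The event and group labels are hypothetical, and the order of the tuple entries follows the list above rather than any claim about the level order in the library's index.

```python
from itertools import product


def constraint_ids(events, groups):
    """Enumerate (event, group, sign) tuples, one per constraint."""
    signs = ["+", "-"]
    return list(product(events, groups, signs))


# One event (as in a demographic-parity-style moment) and two groups.
single_event = constraint_ids(["all"], ["a", "b"])

# Two events (as in an equalized-odds-style moment) double the count.
two_events = constraint_ids(["label=0", "label=1"], ["a", "b"])
```

A list of tuples like this can be turned into an actual index with pandas.MultiIndex.from_tuples, or built directly with pandas.MultiIndex.from_product.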
Read more in the User Guide.
- Parameters:
- difference_bound : float
The constraints’ difference bound for constraints that are expressed as differences, also referred to as \(\epsilon\) in documentation. If ratio_bound is used then difference_bound needs to be None. If neither ratio_bound nor difference_bound is set then a default difference bound of 0.01 is used for backwards compatibility. Default None.
- ratio_bound : float
The constraints’ ratio bound for constraints that are expressed as ratios. The specified value needs to be in (0,1]. If difference_bound is used then ratio_bound needs to be None. Default None.
- ratio_bound_slack : float
The constraints’ ratio bound slack for constraints that are expressed as ratios, also referred to as \(\epsilon\) in documentation. ratio_bound_slack is ignored if ratio_bound is not specified. Default 0.0.
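The interaction between the three parameters can be summarized in a short sketch. The helper resolve_bounds is hypothetical and only mirrors the rules stated above; the exception type and the returned dictionary are assumptions made for illustration, not the library's behavior.

```python
def resolve_bounds(difference_bound=None, ratio_bound=None, ratio_bound_slack=0.0):
    """Hypothetical helper mirroring the documented parameter rules."""
    if difference_bound is not None and ratio_bound is not None:
        # The two bounds are mutually exclusive.
        raise ValueError("Set only one of difference_bound and ratio_bound.")
    if ratio_bound is None:
        # Documented fallback: 0.01 when neither bound is given.
        # ratio_bound_slack is ignored in this branch.
        eps = 0.01 if difference_bound is None else difference_bound
        return {"kind": "difference", "bound": eps}
    if not 0.0 < ratio_bound <= 1.0:
        raise ValueError("ratio_bound must lie in (0, 1].")
    return {"kind": "ratio", "bound": ratio_bound, "slack": ratio_bound_slack}


default_case = resolve_bounds()
ratio_case = resolve_bounds(ratio_bound=0.8, ratio_bound_slack=0.02)
```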
- bound()
Return bound vector.
- Returns:
- pandas.Series
A vector of bound values, one for each constraint.
- gamma(predictor)
Calculate the degree to which constraints are currently violated by the predictor.
- load_data(X, y, *, sensitive_features, event=None, utilities=None)
Load the specified data into this object.
This adds a column event to the tags field.
The utilities argument is a 2-d array that corresponds to g(X,A,Y,h(X)) from Agarwal et al.[1]. It defaults to h(X), i.e., [0, 1] for each X_i. The first column is G^0 and the second is G^1. Assumes binary classification with labels 0/1.
\[utilities = [g(X,A,Y,h(X)=0), g(X,A,Y,h(X)=1)]\]
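To make the two-column layout concrete, here is a minimal plain-Python sketch. The helper names are hypothetical, and the linear interpolation between the two columns for a prediction h in [0, 1] is an assumption about how the columns are combined, chosen so that h = 0 selects the first column and h = 1 the second.

```python
def default_utilities(n_samples):
    """One row [g(h=0), g(h=1)] = [0, 1] per sample, so the utility equals h(X)."""
    return [[0.0, 1.0] for _ in range(n_samples)]


def realized_utility(utilities, predictions):
    """Pick (or interpolate between) the two columns according to each prediction."""
    return [g0 + (g1 - g0) * h for (g0, g1), h in zip(utilities, predictions)]


# With the default utilities, the realized utility is just the prediction.
default_result = realized_utility(default_utilities(3), [0, 1, 1])

# Hypothetical custom utilities: 1 when the prediction matches the label.
# Sample 1 has label 0, sample 2 has label 1; both are predicted 0.
custom_result = realized_utility([[1.0, 0.0], [0.0, 1.0]], [0, 0])
```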
- project_lambda(lambda_vec)
Return the projected lambda values.
That is, it returns a lambda that is guaranteed to yield the same or a higher value of the Lagrangian than lambda_vec for every possible choice of the classifier h.
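One projection with this property can be sketched for a single difference-based constraint pair. The sketch assumes the pair contributes lam_pos * (gamma - eps) + lam_neg * (-gamma - eps) to the Lagrangian, with eps >= 0. Subtracting the smaller multiplier from both leaves the gamma term unchanged and shrinks the eps penalty, so the value cannot decrease. The function names are hypothetical, and this is not claimed to be the library's implementation.

```python
def project_pair(lam_pos, lam_neg):
    """Shift both multipliers down so that at least one of them is zero."""
    shift = min(lam_pos, lam_neg)
    return lam_pos - shift, lam_neg - shift


def lagrangian_term(lam_pos, lam_neg, gamma, eps):
    """Contribution of one +/- constraint pair under the assumed Lagrangian form."""
    return lam_pos * (gamma - eps) + lam_neg * (-gamma - eps)


projected = project_pair(0.75, 0.25)

# Compare the two contributions for a few hypothetical violation levels.
comparison = [
    (lagrangian_term(0.75, 0.25, g, 0.01), lagrangian_term(*projected, g, 0.01))
    for g in (-0.2, 0.0, 0.1)
]
```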
- signed_weights(lambda_vec)
Compute the signed weights.
Uses the equations for \(C_i^0\) and \(C_i^1\) as defined in Section 3.2 of Agarwal et al.[1] in the ‘best response of the Q-player’ subsection to compute the signed weights to be applied to the data by the next call to the underlying estimator.
- Parameters:
- lambda_vec : pandas.Series
The vector of Lagrange multipliers indexed by index.
- property total_samples
Return the number of samples in the data.
Gallery examples
Passing pipelines to mitigation techniques