fairlearn.postprocessing package
This module contains methods which operate on a predictor, rather than an estimator.
The predictor’s output is adjusted to fulfill specified parity constraints. The postprocessors learn how to adjust the predictor’s output from the training data.
- class fairlearn.postprocessing.ThresholdOptimizer(*, estimator=None, constraints='demographic_parity', objective='accuracy_score', grid_size=1000, flip=False, prefit=False)
Bases: sklearn.base.BaseEstimator, sklearn.base.MetaEstimatorMixin
A classifier based on the threshold optimization approach.
The classifier is obtained by applying group-specific thresholds to the provided estimator. The thresholds are chosen to optimize the provided performance objective subject to the provided fairness constraints.
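To make the idea of group-specific thresholds concrete, here is a minimal NumPy sketch. It is not fairlearn's implementation: the helper names `group_thresholds_for_rate` and `apply_group_thresholds` are hypothetical, the scores are synthetic, and the sketch only picks a per-group score quantile so that every group ends up with the same selection rate. ThresholdOptimizer additionally optimizes the chosen objective and may randomize its predictions.

```python
import numpy as np


def group_thresholds_for_rate(scores, groups, target_rate):
    """Return one score cutoff per group so that about `target_rate` of each group is selected."""
    return {
        g: np.quantile(scores[groups == g], 1.0 - target_rate)
        for g in np.unique(groups)
    }


def apply_group_thresholds(scores, groups, thresholds):
    """Predict 1 where a sample's score exceeds the cutoff of its own group."""
    cutoffs = np.array([thresholds[g] for g in groups])
    return (scores > cutoffs).astype(int)


# Synthetic scores: group "b" systematically receives higher scores than group "a".
rng = np.random.RandomState(0)
groups = np.repeat(["a", "b"], 500)
scores = np.concatenate([rng.uniform(0.0, 0.8, 500), rng.uniform(0.2, 1.0, 500)])

thresholds = group_thresholds_for_rate(scores, groups, target_rate=0.3)
y_pred = apply_group_thresholds(scores, groups, thresholds)

# Selection rate per group after thresholding.
rates = {g: y_pred[groups == g].mean() for g in ("a", "b")}
print(thresholds, rates)
```

A single shared cutoff would select far more members of group "b" here; the per-group cutoffs trade that away to equalize the selection rates.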
- Parameters
estimator (estimator object implementing 'predict' and possibly 'fit') – An estimator whose output is postprocessed.
constraints (str, default='demographic_parity') –
Fairness constraints under which threshold optimization is performed. Possible inputs are:
- ‘demographic_parity’, ‘selection_rate_parity’ (synonymous)
match the selection rate across groups
- ‘{false,true}_{positive,negative}_rate_parity’
match the named metric across groups
- ‘equalized_odds’
match true positive and false positive rates across groups
objective (str, default='accuracy_score') –
Performance objective under which threshold optimization is performed. Not all objectives are allowed for all types of constraints. Possible inputs are:
- ‘accuracy_score’, ‘balanced_accuracy_score’
allowed for all constraint types
- ‘selection_rate’, ‘true_positive_rate’, ‘true_negative_rate’
allowed for all constraint types except ‘equalized_odds’
grid_size (int, default=1000) – The values of the constraint metric are discretized according to the grid of the specified size over the interval [0,1] and the optimization is performed with respect to the constraints achieving those values. In case of ‘equalized_odds’ the constraint metric is the false positive rate.
flip (bool, default=False) – If True, then allow flipping the decision if it improves the resulting
prefit (bool, default=False) – If True, avoid refitting the given estimator. Note that when used with sklearn.model_selection.cross_val_score() or sklearn.model_selection.GridSearchCV, this will result in an error. In that case, please use prefit=False.
Notes
The procedure is based on the algorithm of Hardt et al. (2016).
Methods
fit(X, y, *, sensitive_features, **kwargs) – Fit the model.
get_params([deep]) – Get parameters for this estimator.
predict(X, *, sensitive_features[, random_state]) – Predict label for each sample in X while taking into account sensitive features.
set_params(**params) – Set the parameters of this estimator.
- fit(X, y, *, sensitive_features, **kwargs)
Fit the model.
The fit is based on training features and labels, sensitive features, and the fairness-unaware predictor or estimator. If an estimator was passed in the constructor and prefit is False, this fit method will call fit(X, y, **kwargs) on said estimator.
- Parameters
X (numpy.ndarray or pandas.DataFrame) – The feature matrix
y (numpy.ndarray, pandas.DataFrame, pandas.Series, or list) – The label vector
sensitive_features (numpy.ndarray, list, pandas.DataFrame, or pandas.Series) – sensitive features to identify groups by
- predict(X, *, sensitive_features, random_state=None)
Predict label for each sample in X while taking into account sensitive features.
- Parameters
X (numpy.ndarray or pandas.DataFrame) – feature matrix
sensitive_features (numpy.ndarray, list, pandas.DataFrame, pandas.Series) – sensitive features to identify groups by
random_state (int or RandomState instance, default=None) – Controls random numbers used for randomized predictions. Pass an int for reproducible output across multiple function calls.
- Returns
The prediction. If X represents the data for a single example, the result will be a scalar. Otherwise, the result will be a vector.
- Return type
Scalar or vector as numpy.ndarray
- fairlearn.postprocessing.plot_threshold_optimizer(threshold_optimizer, ax=None, show_plot=True)
Plot the chosen solution of the threshold optimizer.
For fairlearn.postprocessing.ThresholdOptimizer objects that have their constraint set to ‘demographic_parity’ this will result in a selection/error curve plot. For fairlearn.postprocessing.ThresholdOptimizer objects that have their constraint set to ‘equalized_odds’ this will result in a ROC curve plot.
- Parameters
threshold_optimizer (fairlearn.postprocessing.ThresholdOptimizer) – the ThresholdOptimizer instance for which the results should be illustrated.
ax (matplotlib.axes.Axes) – a custom matplotlib.axes.Axes object to use for the plots, default None
show_plot (bool) – whether or not the generated plot should be shown, default True