fairlearn.postprocessing package

This module contains methods which operate on a predictor, rather than an estimator.

The predictor’s output is adjusted to fulfill specified parity constraints. The postprocessors learn how to adjust the predictor’s output from the training data.

class fairlearn.postprocessing.ThresholdOptimizer(*, estimator=None, constraints='demographic_parity', grid_size=1000, flip=True, prefit=False)

Bases: sklearn.base.BaseEstimator, sklearn.base.ClassifierMixin, sklearn.base.MetaEstimatorMixin

An Estimator based on the threshold optimization approach.

The procedure followed is described in detail in Hardt et al. (2016).
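To make the idea of threshold-based postprocessing concrete, here is a deliberately simplified sketch in plain Python. It is not fairlearn's implementation and not the full procedure from Hardt et al. (2016), which may also randomize between thresholds. The helper names `group_thresholds` and `postprocess` are hypothetical, and the toy scores are assumed to contain no ties within a group.

```python
# Simplified illustration only: deterministic per-group thresholds chosen so
# that each group ends up with (approximately) the same selection rate.
def group_thresholds(scores, groups, target_rate):
    """Return one score threshold per group so that roughly `target_rate`
    of each group is predicted positive."""
    thresholds = {}
    for g in set(groups):
        group_scores = sorted(
            (s for s, gg in zip(scores, groups) if gg == g), reverse=True
        )
        # Number of samples in this group that should be selected.
        k = round(target_rate * len(group_scores))
        # Threshold at the k-th highest score; select nobody if k == 0.
        thresholds[g] = group_scores[k - 1] if k > 0 else float("inf")
    return thresholds


def postprocess(scores, groups, thresholds):
    """Turn scores into 0/1 labels using the threshold of each sample's group."""
    return [int(s >= thresholds[g]) for s, g in zip(scores, groups)]


# Toy predictor output: group "a" systematically receives higher scores.
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2]
groups = ["a", "a", "a", "a", "b", "b", "b", "b"]

thresholds = group_thresholds(scores, groups, target_rate=0.5)
labels = postprocess(scores, groups, thresholds)
print(thresholds, labels)
```

With a single global threshold of 0.5, all of group "a" but only one member of group "b" would be selected; the per-group thresholds select half of each group instead.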

Parameters

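estimator (estimator object) – An estimator whose output will be postprocessed
constraints (str, default='demographic_parity') – The parity constraint to fulfill, for example 'demographic_parity' or 'equalized_odds'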
grid_size (int) – The number of ticks on the grid over which we evaluate the curves. A larger grid_size approximates the actual curve more closely, which increases the chance of finding a solution very close to the actual best solution.
flip (bool) – Allow flipping to negative weights if it improves accuracy.
prefit (bool, default=False) – If True, avoid re-fitting the given estimator if it’s already trained. Note that when used with cross_val_score, GridSearchCV and similar utilities that clone the estimator, the effective behavior is prefit=False.

Methods

`fit`(X, y, *, sensitive_features, **kwargs)	Fit the model.
`get_params`([deep])	Get parameters for this estimator.
`predict`(X, *, sensitive_features[, random_state])	Predict label for each sample in X while taking into account sensitive features.
`score`(X, y[, sample_weight])	Return the mean accuracy on the given test data and labels.
`set_params`(**params)	Set the parameters of this estimator.

fit(X, y, *, sensitive_features, **kwargs)

Fit the model.

The fit is based on training features and labels, sensitive features, as well as the fairness-unaware predictor or estimator. If an estimator was passed to the constructor, this method calls fit(X, y, **kwargs) on that estimator.

Parameters

X (numpy.ndarray or pandas.DataFrame) – The feature matrix
y (numpy.ndarray, pandas.DataFrame, pandas.Series, or list) – The label vector
sensitive_features (1D array as numpy.ndarray, list, pandas.DataFrame, or pandas.Series) – Sensitive features used to identify groups; currently only a single column is allowed

predict(X, *, sensitive_features, random_state=None)

Predict label for each sample in X while taking into account sensitive features.

Parameters

X (numpy.ndarray or pandas.DataFrame) – feature matrix
sensitive_features (1D array as numpy.ndarray, list, pandas.DataFrame, or pandas.Series) – Sensitive features used to identify groups; currently only a single column is allowed
random_state (int) – set to a constant for reproducibility

Returns

The predictions as a numpy.ndarray

fairlearn.postprocessing.plot_threshold_optimizer(threshold_optimizer, ax=None, show_plot=True)

Plot the chosen solution of the threshold optimizer.

For fairlearn.postprocessing.ThresholdOptimizer objects that have their constraint set to ‘demographic_parity’ this will result in a selection/error curve plot. For fairlearn.postprocessing.ThresholdOptimizer objects that have their constraint set to ‘equalized_odds’ this will result in a ROC curve plot.

Parameters

threshold_optimizer (fairlearn.postprocessing.ThresholdOptimizer) – the ThresholdOptimizer instance for which the results should be illustrated.
ax (matplotlib.axes.Axes) – a custom matplotlib.axes.Axes object to use for the plots, default None
show_plot (bool) – whether or not the generated plot should be shown, default True
