fairlearn.postprocessing package
This module contains methods which operate on a predictor, rather than an estimator.
The predictor’s output is adjusted to fulfill specified parity constraints. The postprocessors learn how to adjust the predictor’s output from the training data.
- class fairlearn.postprocessing.ThresholdOptimizer(*, estimator=None, constraints='demographic_parity', grid_size=1000, flip=True, prefit=False)
Bases: sklearn.base.BaseEstimator, sklearn.base.ClassifierMixin, sklearn.base.MetaEstimatorMixin
An Estimator based on the threshold optimization approach.
The procedure followed is described in detail in Hardt et al. (2016).
- Parameters
estimator (estimator object) – An estimator whose output will be postprocessed.
constraints (str, default='demographic_parity') – The parity constraint to enforce, for example 'demographic_parity' or 'equalized_odds'.
grid_size (int) – The number of ticks on the grid over which the curves are evaluated. A larger grid_size approximates the actual curve more closely, which increases the chance of landing very close to the actual best solution.
flip (bool) – Allow flipping to negative weights if it improves accuracy.
prefit (bool, default=False) – If True, avoid re-fitting the given estimator if it’s already trained. Note that when used with cross_val_score, GridSearchCV, and similar utilities that clone the estimator, the effective behavior is prefit=False.
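The note on prefit follows from how scikit-learn's clone works: it builds a new, unfitted copy from the constructor parameters and drops any learned attributes. A minimal sketch using only scikit-learn is shown below; the toy data and variable names are illustrative assumptions, not part of fairlearn.

```python
import numpy as np
from sklearn.base import clone
from sklearn.linear_model import LogisticRegression

# Toy data, made up for illustration only.
X = np.array([[0.0], [1.0], [2.0], [3.0]])
y = np.array([0, 0, 1, 1])

fitted = LogisticRegression().fit(X, y)

# clone() copies constructor parameters but not learned state, so an
# estimator that was fitted beforehand comes back unfitted.
cloned = clone(fitted)

fitted_has_coef = hasattr(fitted, "coef_")
cloned_has_coef = hasattr(cloned, "coef_")
print(fitted_has_coef, cloned_has_coef)
```

Because clone also clones estimators nested inside a meta-estimator, a pre-trained estimator passed with prefit=True would have to be fitted again inside utilities that clone.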
- fit(X, y, *, sensitive_features, **kwargs)
Fit the model.
The fit is based on training features and labels, sensitive features, and the fairness-unaware predictor or estimator. If an estimator was passed in the constructor, this method calls fit(X, y, **kwargs) on that estimator.
- Parameters
X (numpy.ndarray or pandas.DataFrame) – The feature matrix
y (numpy.ndarray, pandas.DataFrame, pandas.Series, or list) – The label vector
sensitive_features (currently 1D array as numpy.ndarray, list, pandas.DataFrame, or pandas.Series) – sensitive features to identify groups by, currently allows only a single column
- predict(X, *, sensitive_features, random_state=None)
Predict label for each sample in X while taking into account sensitive features.
- Parameters
X (numpy.ndarray or pandas.DataFrame) – feature matrix
sensitive_features (currently 1D array as numpy.ndarray, list, pandas.DataFrame, or pandas.Series) – sensitive features to identify groups by, currently allows only a single column
random_state (int) – set to a constant for reproducibility
- Returns
predictions as a numpy.ndarray
- fairlearn.postprocessing.plot_threshold_optimizer(threshold_optimizer, ax=None, show_plot=True)
Plot the chosen solution of the threshold optimizer.
For fairlearn.postprocessing.ThresholdOptimizer objects whose constraints parameter is set to 'demographic_parity', this produces a selection/error curve plot. For objects whose constraints parameter is set to 'equalized_odds', this produces a ROC curve plot.
- Parameters
threshold_optimizer (fairlearn.postprocessing.ThresholdOptimizer) – the ThresholdOptimizer instance for which the results should be illustrated.
ax (matplotlib.axes.Axes) – a custom matplotlib.axes.Axes object to use for the plots, default None
show_plot (bool) – whether or not the generated plot should be shown, default True