fairlearn.preprocessing.CorrelationRemover
- class fairlearn.preprocessing.CorrelationRemover(*, sensitive_feature_ids, alpha=1)
A component that filters out sensitive correlations in a dataset.
CorrelationRemover applies a linear transformation to the non-sensitive feature columns in order to remove their correlation with the sensitive feature columns while retaining as much information as possible (as measured by the least-squares error).
Read more in the User Guide.
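- Parameters
sensitive_feature_ids (list) – Column names or column indices of the sensitive features to filter out.
alpha (float, default=1) – Amount of filtering to apply: alpha=1 removes all linear correlation with the sensitive features, while alpha=0 leaves the non-sensitive features unchanged.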
Notes
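This method changes the original dataset by removing all correlation with the sensitive values. To describe this mathematically, assume the original dataset $X$ contains a set of sensitive attributes $S$ and a set of non-sensitive attributes $Z$. Writing $\mathbf{s}_i$ and $\mathbf{z}_i$ for the sensitive and non-sensitive parts of row $i$, $\bar{\mathbf{s}}$ for the mean of the sensitive rows, and $\tilde{\mathbf{z}}_i$ for the transformed non-sensitive row, the method solves the following problem:
$$\min_{\tilde{\mathbf{z}}_1, \ldots, \tilde{\mathbf{z}}_n} \sum_{i=1}^{n} \left\| \tilde{\mathbf{z}}_i - \mathbf{z}_i \right\|^2 \quad \text{subject to} \quad \frac{1}{n} \sum_{i=1}^{n} \tilde{\mathbf{z}}_i \left( \mathbf{s}_i - \bar{\mathbf{s}} \right)^T = \mathbf{0}$$
The solution is found by centering the sensitive features, fitting a linear regression model that predicts the non-sensitive features from them, and reporting the residual.
The columns in $S$ are dropped, but the hyperparameter $\alpha$ lets you tune the amount of filtering that is applied:
$$X_{\text{tfm}} = \alpha X_{\text{filtered}} + (1 - \alpha) X_{\text{orig}}$$
Note that the lack of correlation does not imply anything about statistical dependence. Therefore, we expect this to be most appropriate as a preprocessing step for (generalized) linear models.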
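The procedure described in these notes (center the sensitive features, regress the non-sensitive features on them, keep the residual) can be sketched in plain NumPy. This is an illustrative sketch only, not fairlearn's implementation: the helper name `remove_correlation` is hypothetical, and the blending rule `alpha * filtered + (1 - alpha) * original` is an assumption about how the hyperparameter is applied.

```python
import numpy as np


def remove_correlation(X, sensitive_idx, alpha=1.0):
    """Hypothetical helper: linearly filter the sensitive columns out of X."""
    X = np.asarray(X, dtype=float)
    other_idx = [j for j in range(X.shape[1]) if j not in sensitive_idx]
    S = X[:, sensitive_idx]   # sensitive columns
    Z = X[:, other_idx]       # non-sensitive columns

    # Center the sensitive features, then regress Z on them by least squares.
    S_centered = S - S.mean(axis=0)
    beta, *_ = np.linalg.lstsq(S_centered, Z, rcond=None)

    # The residual is the part of Z that is uncorrelated with S.
    Z_filtered = Z - S_centered @ beta

    # Assumed blending rule: alpha=1 is fully filtered, alpha=0 is unchanged.
    return alpha * Z_filtered + (1.0 - alpha) * Z


# Synthetic data: column 0 is sensitive and leaks into columns 1 and 2.
rng = np.random.default_rng(0)
s = rng.integers(0, 2, size=200).astype(float)
noise = rng.normal(size=(200, 2))
X = np.column_stack([s, 2.0 * s + noise[:, 0], noise[:, 1] - s])

Z_full = remove_correlation(X, [0], alpha=1.0)
Z_none = remove_correlation(X, [0], alpha=0.0)
corr_after = np.corrcoef(Z_full[:, 0], s)[0, 1]
print(f"correlation after filtering: {corr_after:.2e}")
```

With `alpha=1.0` the sample correlation between each returned column and the sensitive column is zero up to floating-point error; with `alpha=0.0` the non-sensitive columns come back unchanged.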
Methods
fit(X[, y]) – Learn the projection required to make the dataset uncorrelated with sensitive columns.
fit_transform(X[, y]) – Fit to data, then transform it.
get_metadata_routing() – Get metadata routing of this object.
get_params([deep]) – Get parameters for this estimator.
set_output(*[, transform]) – Set output container.
set_params(**params) – Set the parameters of this estimator.
transform(X) – Transform X by applying the correlation remover.
- fit(X, y=None)
Learn the projection required to make the dataset uncorrelated with sensitive columns.
- fit_transform(X, y=None, **fit_params)
Fit to data, then transform it.
Fits transformer to X and y with optional parameters fit_params and returns a transformed version of X.
- Parameters
X (array-like of shape (n_samples, n_features)) – Input samples.
y (array-like of shape (n_samples,) or (n_samples, n_outputs), default=None) – Target values (None for unsupervised transformations).
**fit_params (dict) – Additional fit parameters.
- Returns
X_new – Transformed array.
- Return type
ndarray of shape (n_samples, n_features_new)
- get_metadata_routing()
Get metadata routing of this object.
See the User Guide for how the routing mechanism works.
- Returns
routing – A MetadataRequest encapsulating routing information.
- Return type
MetadataRequest
- set_output(*, transform=None)
Set output container.
See Introducing the set_output API for an example on how to use the API.
- Parameters
transform ({"default", "pandas"}, default=None) – Configure output of transform and fit_transform.
"default": Default output format of a transformer
"pandas": DataFrame output
None: Transform configuration is unchanged
- Returns
self – Estimator instance.
- Return type
estimator instance
- set_params(**params)
Set the parameters of this estimator.
The method works on simple estimators as well as on nested objects (such as Pipeline). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.
- Parameters
**params (dict) – Estimator parameters.
- Returns
self – Estimator instance.
- Return type
estimator instance