fairlearn.datasets.fetch_credit_card#
- fairlearn.datasets.fetch_credit_card(*, cache=True, data_home=None, as_frame=True, return_X_y=False)[source]#
Load the ‘Default of Credit Card clients’ dataset (binary classification).
Samples total
30000
Dimensionality
23
Features
real
Classes
2
Source: https://archive.ics.uci.edu/ml/datasets/default+of+credit+card+clients I-Cheng Yeh and Che-hui Lien, “The comparisons of data mining techniques for the predictive accuracy of probability of default of credit card clients”, Expert Systems with Applications, 36(2), 2473-2480, 2009
- Parameters:
- cacheboolean, default=True
Whether to cache downloaded datasets using joblib
- data_homeoptional, default: None
Specifiy another download and cache folder for the datasets. By default, all scikit-learn data is stored in ‘~/.fairlearn-data’ subfolders.
- as_frameboolean, default=True
- If True,
Returns the data as Pandas DataFrame, and the target is returned as a Pandas Series.
- If False,
Returns a scikit-learn Bunch object with
frame
attribute containing the data and the target.
Changed in version 0.9.0: Default value changed to True.
- return_X_yboolean, default=False.
- If True,
returns
(data.data, data.target)
- Else,
return Sci-kit Learn Bunch object
- Returns:
- datasetclass:~sklearn.utils.Bunch
Dictionary-like object, with the following attributes.
- dataNumPy Array or Pandas DataFrame, Shape (30000, 23)
Each row corresponds to the 23 feature values in order. If
as_frame
is True,data
is a Pandas DataFrame- targetNumPy Array or Pandas Series, Shape (30000,)
Each value represents whether an applicant defaulted on credit loan. If
as_frame
is True,target
is a Pandas Series.- feature_namesList of Strings, Length 23
Array of ordered feature names used in the dataset.
- DESCRstring
Description of the UCI Default of Credit Card
- categoriesdict or None
Maps each categorical feature name to a list of values, such that the value encoded as i is ith in the list. If
as_frame
is True, this is None.- framepandas DataFrame
Only present when
as_frame
is True. DataFrame withdata
andtarget
.
- (data, target)tuple if
return_X_y
is True
Notes
Our API largely follows the API of
sklearn.datasets.fetch_openml()
.