fairlearn.datasets.fetch_bank_marketing
- fairlearn.datasets.fetch_bank_marketing(*, cache=True, data_home=None, as_frame=True, return_X_y=False)
Load the UCI bank marketing dataset (binary classification).
Download it if necessary.
Samples total: 45211
Dimensionality: 16
Features: numeric, categorical
Classes: 2
Source: UCI Repository [3]
Paper: Moro et al., 2014 [4]
The data relates to direct marketing campaigns of a Portuguese banking institution. The marketing campaigns were based on phone calls. Often, more than one contact with the same client was required in order to assess whether the product (a bank term deposit) would be subscribed to.
The classification goal is to predict whether the client will subscribe to a term deposit (variable y).
New in version 0.5.0.
- Parameters
cache (bool, default=True) – Whether to cache downloaded datasets using joblib.
data_home (str, default=None) – Specify another download and cache folder for the datasets. By default, all fairlearn data is stored in ‘~/.fairlearn-data’ subfolders.
as_frame (bool, default=True) – If True, the data is a pandas DataFrame including columns with appropriate dtypes (numeric, string or categorical). The target is a pandas DataFrame or Series depending on the number of target_columns. The Bunch will contain a frame attribute with the target and the data. If return_X_y is True, then (data, target) will be pandas DataFrames or Series as described above.
Changed in version 0.9.0: Default value changed to True.
return_X_y (bool, default=False) – If True, returns (data.data, data.target) instead of a Bunch object.
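As a minimal sketch of how the parameters above fit together, the wrapper below requests the `(data, target)` tuple as pandas objects. The helper name `load_bank_marketing` is hypothetical and not part of fairlearn; it only forwards the documented keyword arguments. The import is deferred into the function body so that defining the helper requires neither fairlearn nor network access.

```python
def load_bank_marketing(data_home=None):
    """Return (X, y) for the bank marketing dataset as pandas objects.

    Hypothetical convenience wrapper, not part of fairlearn.
    """
    # Deferred import: fairlearn is only needed when the helper is called.
    from fairlearn.datasets import fetch_bank_marketing

    # return_X_y=True yields a (data, target) tuple instead of a Bunch;
    # as_frame=True (the default) makes both elements pandas objects.
    X, y = fetch_bank_marketing(
        cache=True,           # reuse a previously downloaded copy if present
        data_home=data_home,  # None falls back to the default data folder
        as_frame=True,
        return_X_y=True,
    )
    return X, y
```

Calling `X, y = load_bank_marketing()` downloads the dataset on first use. Based on the figures documented on this page, `X` would be expected to have 45211 rows and 16 columns.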
- Returns
dataset (Bunch) – Dictionary-like object, with the following attributes.
- data: ndarray, shape (45211, 16)
Each row corresponds to the 16 feature values in order. If as_frame is True, data is a pandas object.
- target: numpy array of shape (45211,)
Each value represents whether the client subscribed to a term deposit: ‘yes’ if the client subscribed and ‘no’ otherwise. If as_frame is True, target is a pandas object.
- feature_names: list of length 16
Ordered list of the feature names used in the dataset.
- DESCR: string
Description of the UCI bank marketing dataset.
- categories: dict or None
Maps each categorical feature name to a list of values, such that the value encoded as i is the ith in the list. If as_frame is True, this is None.
- frame: pandas DataFrame
Only present when as_frame is True. DataFrame with data and target.
(data, target) (tuple if return_X_y is True)
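The `categories` mapping described above ("the value encoded as i is ith in the list") can be illustrated with a small, self-contained sketch. The feature name `marital`, its value list, and the encoded codes below are hypothetical stand-ins chosen for illustration, not values read from the actual dataset; the sketch also assumes the codes contain no missing values.

```python
def decode_column(codes, category_values):
    """Map integer codes back to labels: code i -> category_values[i]."""
    # int() also accepts codes stored as floats (e.g. 1.0) in a numeric array.
    return [category_values[int(code)] for code in codes]


# Hypothetical mapping in the shape the `categories` attribute is described as having.
categories = {"marital": ["divorced", "married", "single"]}

# Hypothetical encoded column, as it might appear when as_frame is False.
encoded_marital = [1, 2, 1, 0]

decoded = decode_column(encoded_marital, categories["marital"])
print(decoded)  # ['married', 'single', 'married', 'divorced']
```

When as_frame is True, `categories` is None, so a lookup like this would only apply to the array-based return format.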
Notes
Our API largely follows the API of sklearn.datasets.fetch_openml().
References
[3] S. Moro, P. Cortez, and P. Rita, UCI Machine Learning Repository: Bank Marketing Data Set, 14-Feb-2014. [Online]. Available: https://archive.ics.uci.edu/ml/datasets/Bank+Marketing.
[4] S. Moro, P. Cortez, and P. Rita, A Data-Driven Approach to Predict the Success of Bank Telemarketing. Decision Support Systems, Elsevier, 62:22-31, June 2014.