fetch_credit_fraud

skrub.datasets.fetch_credit_fraud(data_home=None)

Fetch the credit fraud dataset (classification), available at skrub-data/skrub-data-files.

This is an imbalanced binary classification use case. The dataset consists of two tables:

  • baskets, containing the binary fraud target label

  • products

Each basket contains at least one product, so the products table must be aggregated per basket and then joined to the baskets table to build a design matrix.
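The aggregate-then-join step can be sketched with pandas on toy data. This is only an illustration: the tiny tables, the column names (`ID`, `fraud_flag`, `basket_ID`, `cash_price`) and the chosen aggregates are assumptions made for the example, not a description of the actual dataset schema.

```python
import pandas as pd

# Toy stand-ins for the two tables; column names and values are illustrative assumptions.
baskets = pd.DataFrame({"ID": [1, 2, 3], "fraud_flag": [0, 1, 0]})
products = pd.DataFrame(
    {
        "basket_ID": [1, 1, 2, 3, 3, 3],
        "cash_price": [10.0, 5.0, 200.0, 3.0, 4.0, 8.0],
    }
)

# Step 1: aggregate products so that there is exactly one row per basket.
per_basket = (
    products.groupby("basket_ID")
    .agg(n_products=("cash_price", "size"), total_price=("cash_price", "sum"))
    .reset_index()
)

# Step 2: join the aggregated features onto the baskets table.
design = baskets.merge(
    per_basket, left_on="ID", right_on="basket_ID", how="left"
).drop(columns="basket_ID")

# Split into a feature matrix and the target.
X = design.drop(columns=["ID", "fraud_flag"])
y = design["fraud_flag"]
```

Aggregating first keeps one row per basket, so the join cannot duplicate target labels.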

Parameters:
data_home: str or path, default=None

The directory in which to download and unzip the files.

Returns:
bunch : sklearn.utils.Bunch

A dictionary-like object with the following keys:

  • baskets : pd.DataFrame, table containing the basket IDs and the target

  • products : pd.DataFrame, table containing features about the products contained in the baskets

  • metadata : a dictionary containing the name, description, source and target
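A minimal usage sketch follows. The helper name `load_credit_fraud_tables` is hypothetical, and reading the products table through a `products` attribute is an assumption based on the table name given above. Because a `Bunch` is dictionary-like, `bunch.keys()` shows which keys are actually present.

```python
def load_credit_fraud_tables(data_home=None):
    """Return (baskets, products, metadata) from the fetched Bunch.

    Hypothetical convenience wrapper, not part of skrub.
    """
    # Imported lazily so that defining this helper does not require skrub.
    from skrub.datasets import fetch_credit_fraud

    # Downloads and unzips the files into data_home.
    bunch = fetch_credit_fraud(data_home=data_home)

    # The "products" key name is an assumption; check bunch.keys() if it fails.
    return bunch.baskets, bunch.products, bunch.metadata
```

Calling `load_credit_fraud_tables()` needs network access to download the files; the two returned DataFrames can then be aggregated and joined to build a design matrix.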