fetch_credit_fraud

skrub.datasets.fetch_credit_fraud(data_home=None)

Fetch the credit fraud dataset (classification), available at skrub-data/skrub-data-files.

This is an imbalanced binary classification use case. The dataset consists of two tables:

baskets, containing the binary fraud target label
products

Each basket contains at least one product, so the products table must first be aggregated per basket and then joined to the baskets table to build a design matrix.
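The aggregate-then-join step can be sketched with plain pandas. This is a minimal illustration on toy data, not the dataset's actual schema: the column names `ID`, `basket_ID`, `cash_price` and `fraud_flag` are assumptions chosen for the example.

```python
import pandas as pd

# Toy stand-ins for the two tables. Column names are hypothetical.
baskets = pd.DataFrame({"ID": [1, 2, 3], "fraud_flag": [0, 1, 0]})
products = pd.DataFrame(
    {
        "basket_ID": [1, 1, 2, 3, 3, 3],
        "cash_price": [10.0, 5.0, 200.0, 1.0, 2.0, 3.0],
    }
)

# Step 1: aggregate the products table to one row per basket.
agg = (
    products.groupby("basket_ID")
    .agg(
        n_products=("cash_price", "count"),
        total_price=("cash_price", "sum"),
    )
    .reset_index()
)

# Step 2: join the aggregated features onto the baskets table.
design = baskets.merge(
    agg, left_on="ID", right_on="basket_ID", how="left"
).drop(columns="basket_ID")

# Split into a design matrix and the binary target.
X = design.drop(columns=["ID", "fraud_flag"])
y = design["fraud_flag"]
```

A left join keeps every basket even if its aggregated features were missing. The gallery example listed below covers the same pattern with skrub's AggJoiner.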

Parameters:

data_home : str or path, default=None

The directory in which to download and unzip the files.

Returns:

bunch : sklearn.utils.Bunch

A dictionary-like object with the following keys:

baskets : pd.DataFrame, table containing the basket IDs and the target
products : pd.DataFrame, table containing features about the products contained in the baskets
metadata : a dictionary containing the name, description, source and target
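As a usage sketch, the snippet below fetches the dataset and reports the class balance of the target, which is the first thing to check in an imbalanced problem. The helper names `target_balance` and `load_and_inspect` are hypothetical, and the code assumes that `metadata["target"]` holds the name of the target column in `baskets`. Calling `load_and_inspect` requires skrub and downloads the files on first use.

```python
import pandas as pd


def target_balance(baskets: pd.DataFrame, target: str) -> pd.Series:
    """Return the share of each class in the target column."""
    return baskets[target].value_counts(normalize=True)


def load_and_inspect(data_home=None) -> pd.Series:
    """Fetch the dataset and return the class shares of its target."""
    # Imported lazily so the helper above stays usable without skrub.
    from skrub.datasets import fetch_credit_fraud

    bunch = fetch_credit_fraud(data_home=data_home)
    # Assumption: metadata["target"] names the target column of baskets.
    return target_balance(bunch.baskets, bunch.metadata["target"])
```

`target_balance` can also be applied to any DataFrame with a label column, which is convenient for checking the balance again after a train/test split.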

Gallery examples

AggJoiner on a credit fraud dataset