fetch_credit_fraud

skrub.datasets.fetch_credit_fraud(data_home=None, split='train')

Fetch the credit fraud dataset (classification), available at skrub-data/skrub-data-files.

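This is an imbalanced binary classification use case. The dataset consists of two tables: a baskets table, holding basket IDs and the target, and a product table, holding features of the products contained in each basket.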

Each basket contains at least one product, so the product table must be aggregated per basket and then joined to the baskets table to build a design matrix.
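The aggregate-then-join step can be sketched with pandas on small synthetic tables. This is only an illustration: the column names used here (`ID`, `fraud_flag`, `basket_ID`, `cash_price`) and the chosen aggregates are assumptions for the example, not a statement of the dataset's actual schema or of a recommended feature set.

```python
import pandas as pd

# Synthetic stand-ins for the two tables; the column names are
# illustrative assumptions, not the dataset's documented schema.
baskets = pd.DataFrame({"ID": [1, 2, 3], "fraud_flag": [0, 1, 0]})
products = pd.DataFrame(
    {
        "basket_ID": [1, 1, 2, 3, 3, 3],
        "cash_price": [10.0, 5.0, 200.0, 1.0, 2.0, 3.0],
    }
)

# Step 1: aggregate the product rows so that there is one row per basket.
per_basket = (
    products.groupby("basket_ID")
    .agg(n_products=("cash_price", "size"), total_price=("cash_price", "sum"))
    .reset_index()
)

# Step 2: join the aggregated features onto the baskets table.
design = baskets.merge(per_basket, left_on="ID", right_on="basket_ID", how="left")
design = design.drop(columns="basket_ID")

# Split into a design matrix and a target vector.
X = design.drop(columns=["ID", "fraud_flag"])
y = design["fraud_flag"]
print(X)
```

Aggregating before the join keeps one row per basket, so the target column is never duplicated across product rows.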

Parameters:

data_home : str or path, default=None: The directory in which to download and unzip the files.
split : str, default='train': The split to load. Can be 'train', 'test', or 'all'.

Returns:

bunch : sklearn.utils.Bunch

A dictionary-like object with the following keys:

baskets : pd.DataFrame, table containing basket IDs and the target
product : pd.DataFrame, table containing features about products contained in baskets
metadata : a dictionary containing the name, description, source and target
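A minimal usage sketch, assuming skrub is installed and the files can be downloaded. The helper names `summarize_tables` and `load_credit_fraud_summary` are hypothetical and exist only for this example. The helper iterates over the returned keys instead of hard-coding them, so it reports whatever DataFrame tables the bunch actually contains.

```python
import pandas as pd


def summarize_tables(bunch):
    """Return {key: (n_rows, n_cols)} for every DataFrame value in a mapping."""
    return {
        key: value.shape
        for key, value in bunch.items()
        if isinstance(value, pd.DataFrame)
    }


def load_credit_fraud_summary(split="train"):
    """Fetch one split and summarize its tables (may download data on first use)."""
    # Imported lazily so that summarize_tables stays usable without skrub.
    from skrub.datasets import fetch_credit_fraud

    bunch = fetch_credit_fraud(split=split)
    return summarize_tables(bunch)
```

Calling `load_credit_fraud_summary("test")` would fetch the test split; non-DataFrame entries such as `metadata` are skipped by the summary.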
