fetch_videogame_sales

skrub.datasets.fetch_videogame_sales(data_home=None)

Fetch the videogame sales dataset (regression) available at https://github.com/skrub-data/skrub-data-files

This is a regression use case: a single table contains information about videogames, such as the publisher and platform, and the goal is to predict the number of worldwide sales. Size on disk: 1.8 MB.

Warning

The original dataset is ordered by decreasing number of sales, so contiguous (unshuffled) cross-validation folds would each cover a different range of the target. Take this into account when cross-validating: depending on the desired setting, consider shuffling the rows, or ordering by publication year and splitting by year.
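The effect described in this warning can be illustrated with a minimal sketch. It does not download the real dataset: it builds a synthetic stand-in sorted by decreasing sales, and the column names `Year` and `Global_Sales`, the number of rows, and the cutoff year are all assumptions made for illustration.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import KFold

# Synthetic stand-in (NOT the real data): rows sorted by decreasing sales.
# "Year" and "Global_Sales" are assumed column names for this sketch.
rng = np.random.default_rng(0)
n = 1000
df = pd.DataFrame(
    {
        "Year": rng.integers(1990, 2017, size=n),
        "Global_Sales": np.sort(rng.exponential(scale=1.0, size=n))[::-1],
    }
)


def fold_means(cv):
    """Mean target value of the test rows in each fold."""
    return [
        df["Global_Sales"].iloc[test_idx].mean() for _, test_idx in cv.split(df)
    ]


# Contiguous folds: the first fold contains only the best sellers.
unshuffled = fold_means(KFold(n_splits=5))
# Shuffled folds: each fold covers the whole range of the target.
shuffled = fold_means(KFold(n_splits=5, shuffle=True, random_state=0))

spread_unshuffled = max(unshuffled) - min(unshuffled)
spread_shuffled = max(shuffled) - min(shuffled)

# Alternative: split by publication year (train on older games, test on newer).
cutoff = 2010  # arbitrary cutoff chosen for the sketch
train_df = df[df["Year"] < cutoff]
test_df = df[df["Year"] >= cutoff]
```

Which option fits depends on the setting: shuffling answers "how well does the model predict a random game?", while the year-based split is closer to forecasting sales for games released later than those seen in training.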

Parameters:
data_home : str or path-like, default=None

The directory in which to download and unzip the files.

Returns:
bunch : Bunch

A dictionary-like object with the following keys:

videogame_sales : DataFrame of shape (16572, 11)

The dataframe.

X : DataFrame of shape (16572, 5)

Features, i.e. the dataframe without the target labels.

y : DataFrame of shape (16572, 1)

Target labels.

metadata : dict

A dictionary containing the name, source and target.

path : str

The path to the videogame sales CSV file.
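A possible way to use the returned object is sketched below. The helper names `summarize` and `main` are hypothetical and not part of skrub; the keys are accessed with `bunch["..."]` because the object is documented as dictionary-like. Calling `main()` is assumed to require skrub to be installed and, at least the first time, network access.

```python
def summarize(bunch):
    """Return the shapes of the three dataframes stored in a fetched bunch."""
    return {
        "videogame_sales": bunch["videogame_sales"].shape,
        "X": bunch["X"].shape,
        "y": bunch["y"].shape,
    }


def main(data_home=None):
    # Imported here so that `summarize` stays usable without skrub installed.
    from skrub.datasets import fetch_videogame_sales

    # data_home=None uses the default download directory.
    bunch = fetch_videogame_sales(data_home=data_home)

    print(summarize(bunch))   # shapes of the full table, the features, the target
    print(bunch["metadata"])  # name, source and target
    print(bunch["path"])      # location of the CSV file on disk
```

Running `main()` in an interactive session prints the shapes listed under Returns, followed by the metadata dictionary and the CSV path.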