fetch_videogame_sales

skrub.datasets.fetch_videogame_sales(data_home=None)

Fetch the videogame sales dataset (regression), available at skrub-data/skrub-data-files.

This is a regression use-case, where the single table contains information about videogames such as the publisher and platform, and the goal is to predict the number of sales worldwide.

Warning

The original dataset is ordered by decreasing number of sales. This should be taken into account for cross-validation. Depending on the desired setting, one might consider shuffling the rows or ordering by publication year and splitting by year.
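The two options mentioned in this warning can be sketched as follows. This is a minimal illustration on a toy dataframe, not on the real dataset: the column names `Year` and `Sales` and the helper `split_by_year` are hypothetical and only stand in for whatever the actual columns are.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import KFold

# Toy frame sorted by decreasing sales, mimicking the ordering described above.
# The column names "Year" and "Sales" are hypothetical.
rng = np.random.default_rng(0)
df = pd.DataFrame(
    {
        "Year": rng.permutation(np.repeat(np.arange(2000, 2006), 10)),
        "Sales": np.sort(rng.gamma(2.0, 1.0, size=60))[::-1],
    }
)
X, y = df[["Year"]], df["Sales"]

# Option 1: shuffle the rows so that folds are not contiguous blocks of
# high-selling or low-selling games.
shuffled_cv = KFold(n_splits=5, shuffle=True, random_state=0)
shuffled_folds = list(shuffled_cv.split(X, y))


# Option 2: split by year, training on all earlier years and testing on one year.
def split_by_year(years, min_train_years=2):
    """Yield (train_idx, test_idx) pairs of positional indices."""
    years = np.asarray(years)
    unique_years = np.unique(years)  # sorted in ascending order
    for test_year in unique_years[min_train_years:]:
        train_idx = np.flatnonzero(years < test_year)
        test_idx = np.flatnonzero(years == test_year)
        yield train_idx, test_idx


year_folds = list(split_by_year(df["Year"]))
```

Either list of folds can be passed as the `cv` argument of scikit-learn utilities such as `cross_validate`, which accept an iterable of (train, test) index arrays. Which option is appropriate depends on whether the model is meant to predict sales for future releases or for games drawn from the same period.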

Parameters:
data_home: str or path, default=None

The directory in which to download and unzip the files.

Returns:
bunch: sklearn.utils.Bunch

A dictionary-like object with the following keys:

  • videogame_sales : pd.DataFrame, the full dataframe

  • X : pd.DataFrame, features, i.e. the dataframe without the target labels

  • y : pd.DataFrame, target labels

  • metadata : a dictionary containing the name, source and target
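As a rough sketch of how the returned object might be inspected, the snippet below defines a small hypothetical helper, `summarize_bunch`, and runs it on a toy stand-in that carries the keys listed above. The toy data and its column names are made up for illustration; the real call is shown only in a comment because it needs skrub installed and network access.

```python
import pandas as pd
from sklearn.utils import Bunch


def summarize_bunch(bunch):
    """Return basic facts about a bunch that has the keys documented above."""
    return {
        "n_rows": len(bunch.videogame_sales),
        "n_features": bunch.X.shape[1],
        "n_targets": len(bunch.y),
        "metadata_keys": sorted(bunch.metadata),
    }


# Toy stand-in with the documented keys; the column names are hypothetical.
full = pd.DataFrame(
    {"Publisher": ["A", "B"], "Platform": ["P1", "P2"], "Sales": [1.5, 0.3]}
)
toy = Bunch(
    videogame_sales=full,
    X=full.drop(columns="Sales"),
    y=full[["Sales"]],
    metadata={"name": "toy", "source": "n/a", "target": "Sales"},
)
summary = summarize_bunch(toy)

# With skrub installed and network access, the real call would be:
# from skrub.datasets import fetch_videogame_sales
# summary = summarize_bunch(fetch_videogame_sales())
```

A `Bunch` supports both attribute access (`bunch.X`) and dictionary-style access (`bunch["X"]`), so either form works in the helper.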