fetch_movielens

skrub.datasets.fetch_movielens(data_home=None)

Fetch the MovieLens dataset (regression), available at https://github.com/skrub-data/skrub-data-files

This is a regression use case, where the goal is to predict movie ratings. More details are provided in the output's metadata['description']. Size on disk: 3.6 MB.

Parameters:
data_home : str or path-like, default=None

The directory in which to download and unzip the files.

Returns:
bunch : Bunch

A dictionary-like object with the following keys:

movies : DataFrame of shape (9742, 3)

Dataframe with movie titles and genres.

ratings : DataFrame of shape (100836, 4)

Dataframe with ratings of movies.

metadata : dict

A dictionary containing the name, source and description.

movies_path : str

The path to the movies CSV file.

ratings_path : str

The path to the ratings CSV file.
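
Examples:

The following is a minimal sketch of how the returned bunch might be used, not an official example. It assumes the two dataframes are pandas DataFrames with the usual MovieLens column names (movieId, title, rating), which this page does not state; check the columns of your own download before relying on them. The helpers load_movielens and mean_rating_per_movie are hypothetical names introduced here for illustration.

```python
import pandas as pd


def load_movielens(data_home=None):
    """Fetch the dataset and return the (movies, ratings) dataframes.

    skrub is imported lazily so the helper below can be reused without it.
    The first call downloads the files (about 3.6 MB) into `data_home`.
    """
    from skrub.datasets import fetch_movielens

    bunch = fetch_movielens(data_home=data_home)
    # The bunch is dictionary-like, so its documented keys can be indexed.
    return bunch["movies"], bunch["ratings"]


def mean_rating_per_movie(movies: pd.DataFrame, ratings: pd.DataFrame) -> pd.DataFrame:
    """Attach the mean rating and number of ratings to each rated movie.

    ASSUMPTION: the columns "movieId" and "rating" exist, as in the
    standard MovieLens files; this is not confirmed by the documentation.
    """
    stats = (
        ratings.groupby("movieId")["rating"]
        .agg(mean_rating="mean", n_ratings="count")
        .reset_index()
    )
    # Inner join: movies without any rating are dropped.
    return movies.merge(stats, on="movieId", how="inner")
```

Calling mean_rating_per_movie(*load_movielens()) would then give one row per rated movie. The inner join is a deliberate choice for this sketch; use how="left" in the merge to keep unrated movies with missing statistics instead.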