Self-aggregation on MovieLens#
MovieLens is a well-known movie dataset used for both explicit
and implicit recommender systems. It provides a main table,
“ratings”, that can be viewed as logs or transactions, comprising
only 4 columns: userId, movieId, rating and timestamp.
MovieLens also gives a contextual table “movies”, including
movieId, title and types, to enable content-based feature extraction.
From the perspective of machine-learning pipelines, one challenge is to transform the transaction log into features that can be fed to supervised learning.
In this notebook, we only deal with the main table “ratings”.
Our objective is not to achieve state-of-the-art performance on
the explicit regression task, but rather to illustrate how to perform
feature engineering in a simple way using AggJoiner and AggTarget.
Note that our performance is higher than the baseline of using the mean
rating per movie.
The benefit of using AggJoiner and AggTarget is that they readily
provide a full pipeline, from the original tables to the prediction, that can
be cross-validated or applied to new data to serve prediction. At the end of
this example, we showcase hyper-parameter optimization on the whole pipeline.
The data#
We begin with loading the ratings table from MovieLens. Note that we use the light version (100k rows).
import pandas as pd
from skrub.datasets import fetch_movielens
ratings = fetch_movielens(dataset_id="ratings")
ratings = ratings.X.sort_values("timestamp").reset_index(drop=True)
ratings["timestamp"] = pd.to_datetime(ratings["timestamp"], unit="s")
X = ratings[["userId", "movieId", "timestamp"]]
y = ratings["rating"]
X.shape, y.shape
((100836, 3), (100836,))
X.head()
Encoding the timestamp with a TableVectorizer#
Our first step is to extract features from the timestamp, using the
TableVectorizer. Natively, it uses the DatetimeEncoder on datetime
columns, and doesn’t interact with numerical columns.
from skrub import DatetimeEncoder, TableVectorizer
table_vectorizer = TableVectorizer(datetime=DatetimeEncoder(add_weekday=True))
X_date_encoded = table_vectorizer.fit_transform(X)
X_date_encoded.head()
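To make the encoding concrete, here is a rough pandas-only sketch of the kind of calendar features that can be derived from a timestamp column. It is not the DatetimeEncoder implementation and does not claim to reproduce its output; the toy timestamps and the `timestamp_*` column names are assumptions chosen for illustration.

```python
import pandas as pd

# Toy timestamps (assumed values), given as seconds since the Unix epoch.
toy = pd.DataFrame({"timestamp": [0, 86_400, 90_000]})
toy["timestamp"] = pd.to_datetime(toy["timestamp"], unit="s")

# Hand-rolled calendar features. pandas numbers weekdays from 0 (Monday)
# to 6 (Sunday); an encoder may use a different convention.
features = pd.DataFrame(
    {
        "timestamp_year": toy["timestamp"].dt.year,
        "timestamp_month": toy["timestamp"].dt.month,
        "timestamp_day": toy["timestamp"].dt.day,
        "timestamp_hour": toy["timestamp"].dt.hour,
        "timestamp_weekday": toy["timestamp"].dt.weekday,
    }
)
print(features)
```

Each derived column is a plain number, which is what lets a downstream regressor consume the timestamp.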
We can now make a couple of plots to gain some insight into our dataset.
import seaborn as sns
from matplotlib import pyplot as plt
sns.set_style("darkgrid")
def make_barplot(x, y, title):
    fig, ax = plt.subplots(layout="constrained")
    norm = plt.Normalize(y.min(), y.max())
    cmap = plt.get_cmap("magma")
    # Color each bar by its height. seaborn 0.13+ expects `hue` to be assigned
    # when a palette is given, and the palette to be passed as a list.
    sns.barplot(
        x=x, y=y, hue=x, palette=list(cmap(norm(y))), legend=False, ax=ax
    )
    ax.set_title(title)
    ax.set_xticks(ax.get_xticks(), labels=ax.get_xticklabels(), rotation=30)
    ax.set_ylabel(None)

# 0 is Monday, 6 is Sunday
daily_volume = X_date_encoded["timestamp_weekday"].value_counts().sort_index()
make_barplot(
    x=daily_volume.index,
    y=daily_volume.values,
    title="Daily volume of ratings",
)
![Daily volume of ratings](../_images/sphx_glr_08_join_aggregation_001.png)
We also display the distribution of our target y.
rating_count = y.value_counts().sort_index()
make_barplot(
    x=rating_count.index,
    y=rating_count.values,
    title="Distribution of ratings given to movies",
)
![Distribution of ratings given to movies](../_images/sphx_glr_08_join_aggregation_002.png)
AggTarget: aggregate y, then join#
We have just extracted datetime features from timestamps.
Let’s now perform an expansion for the target y, by aggregating it before
joining it back on the main table. The biggest risk of doing target expansion
with multiple dataframe operations yourself is to end up leaking the target.
To solve this, the AggTarget transformer allows you to
aggregate the target y before joining it on the main table, without
risk of leaking. Note that to perform aggregation then joining on the features
X, you need to use AggJoiner instead.
You can also think of it as a generalization of the TargetEncoder, which
encodes categorical features based on the target.
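The leakage concern can be illustrated with a hand-written version of the idea, assuming a toy table and a simple per-user mean. This is not how AggTarget is implemented; it only sketches the principle that the aggregate is learned from training rows and then joined onto unseen rows. The names `X_toy`, `y_toy` and `rating_mean_user` are hypothetical.

```python
import pandas as pd

# Toy data (assumed values): the last two rows play the role of unseen data.
X_toy = pd.DataFrame({"userId": [1, 1, 2, 2, 1, 3]})
y_toy = pd.Series([4.0, 2.0, 5.0, 3.0, 1.0, 5.0], name="rating")

X_train, y_train = X_toy.iloc[:4], y_toy.iloc[:4]
X_test = X_toy.iloc[4:]

# "fit": aggregate the target per user on the training rows only.
user_mean = (
    y_train.groupby(X_train["userId"])
    .mean()
    .rename("rating_mean_user")
    .reset_index()
)

# "transform": left-join the learned aggregate; test targets are never read.
X_test_aug = X_test.merge(user_mean, on="userId", how="left")
print(X_test_aug)
```

In this sketch, user 3 never appears in the training rows and therefore gets a missing value after the join. A model that handles missing values natively, such as HistGradientBoostingRegressor, can consume such a column directly.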
We only focus on aggregating the target by users, but later we will also consider aggregating by movies. Here, we compute the histogram of the target with 3 bins, before joining it back on the initial table.
This feature answers questions like “How many times has this user given a bad, medium or good rating to movies?”.
from skrub import AggTarget
agg_target_user = AggTarget(
    main_key="userId",
    suffix="_user",
    operation="hist(3)",
)
X_transformed = agg_target_user.fit_transform(X, y)
X_transformed.shape
(100836, 7)
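As a rough illustration of what a 3-bin histogram of the target per user looks like, the following pandas sketch bins toy ratings into three equal-width intervals, counts them per user, and joins the counts back onto the log. The bin edges, the `bad`/`medium`/`good` labels, the column names and the data are all assumptions made for this sketch; the binning and naming that AggTarget actually uses may differ.

```python
import pandas as pd

toy = pd.DataFrame(
    {
        "userId": [1, 1, 1, 2, 2],
        "rating": [0.5, 3.0, 5.0, 4.5, 5.0],
    }
)

# Three equal-width bins over the 0.5 to 5.0 rating scale (assumed edges).
edges = [0.5, 2.0, 3.5, 5.0]
labels = ["bad", "medium", "good"]
toy["bucket"] = pd.cut(
    toy["rating"], bins=edges, labels=labels, include_lowest=True
).astype(str)

# Count ratings per (user, bucket), then pivot the buckets into columns.
hist = (
    toy.groupby(["userId", "bucket"])
    .size()
    .unstack(fill_value=0)
    .reindex(columns=labels, fill_value=0)
    .add_suffix("_user")
    .reset_index()
)

# Join the per-user histogram back onto every row of the log.
joined = toy[["userId"]].merge(hist, on="userId", how="left")
print(joined)
```

Every row of a given user receives the same three counts, so the feature describes the user rather than the individual rating.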
Similarly, we join on movieId instead of userId.
This feature answers questions like “How many times has this movie received a bad, medium or good rating from users?”.
agg_target_movie = AggTarget(
    main_key="movieId",
    suffix="_movie",
    operation="hist(3)",
)
X_transformed = agg_target_movie.fit_transform(X, y)
X_transformed.shape
(100836, 7)
Chaining everything together in a pipeline#
To perform cross-validation and enable hyper-parameter tuning, we gather
all elements into a scikit-learn Pipeline by using make_pipeline,
and define a scikit-learn HistGradientBoostingRegressor.
from sklearn.ensemble import HistGradientBoostingRegressor
from sklearn.pipeline import make_pipeline
pipeline = make_pipeline(
    table_vectorizer,
    agg_target_user,
    agg_target_movie,
    HistGradientBoostingRegressor(learning_rate=0.1, max_depth=4, max_iter=40),
)
pipeline
Hyper-parameters tuning and cross validation#
We can finally create our hyper-parameter search space, and use a
GridSearchCV. We select the cross validation splitter to be
the TimeSeriesSplit to prevent leakage, since our data are timestamped
logs.
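To see why this splitter suits timestamped logs, the short sketch below prints the folds it produces on ten toy rows that are assumed to be sorted by time. The row count and the number of splits are illustrative choices, not the values used in this example.

```python
import numpy as np
from sklearn.model_selection import TimeSeriesSplit

# Ten toy rows, assumed to be already sorted by timestamp.
rows = np.arange(10)

splits = list(TimeSeriesSplit(n_splits=3).split(rows))
for train_idx, test_idx in splits:
    # In every fold, all training indices precede all test indices.
    print("train:", train_idx, "test:", test_idx)
```

Because each fold trains only on rows that come before its test rows, aggregates of the target computed during fit never see ratings from the period being predicted.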
Note that you need the names of the pipeline steps to assign hyper-parameters to them in the search space.
You can look up these names with:
list(pipeline.named_steps)
['tablevectorizer', 'aggtarget-1', 'aggtarget-2', 'histgradientboostingregressor']
Alternatively, you can use scikit-learn Pipeline to name your transformers:
Pipeline([("agg_target_user", agg_target_user), ...])
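The naming rules can be checked on a small stand-in pipeline. The sketch below uses StandardScaler and Ridge as placeholders instead of the transformers from this example, and the explicit step names are hypothetical.

```python
from sklearn.linear_model import Ridge
from sklearn.pipeline import Pipeline, make_pipeline
from sklearn.preprocessing import StandardScaler

# make_pipeline derives step names from lowercased class names and
# numbers the steps that share a class.
auto_named = make_pipeline(StandardScaler(), StandardScaler(), Ridge())
print(list(auto_named.named_steps))

# Explicit (hypothetical) names make the search-space keys easier to read.
named = Pipeline(
    [
        ("scale_first", StandardScaler()),
        ("scale_second", StandardScaler()),
        ("regressor", Ridge()),
    ]
)

# Parameters are addressed as "<step name>__<parameter name>".
param_grid = {"regressor__alpha": [0.1, 1.0]}
print(set(param_grid) <= set(named.get_params()))
```

The same double-underscore convention is what allows a key such as aggtarget-2__operation to target a single step of the pipeline.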
We now perform the grid search over the AggTarget transformers to find the
operation maximizing our validation score.
from sklearn.model_selection import GridSearchCV, TimeSeriesSplit
operations = ["mean", "hist(3)", "hist(5)", "hist(7)", "value_counts"]
param_grid = [
    {
        "aggtarget-2__operation": [op],
    }
    for op in operations
]
cv = GridSearchCV(pipeline, param_grid, cv=TimeSeriesSplit(n_splits=10))
cv.fit(X, y)
results = pd.DataFrame(cv.cv_results_)
cols = [f"split{idx}_test_score" for idx in range(10)]
results = results.set_index("param_aggtarget-2__operation")[cols].T
results
/home/circleci/project/.pixi/envs/doc/lib/python3.12/site-packages/sklearn/model_selection/_validation.py:982: UserWarning: Scoring failed. The score on this train-test partition for these parameters will be set to nan. Details:
Traceback (most recent call last):
File "/home/circleci/project/.pixi/envs/doc/lib/python3.12/site-packages/sklearn/model_selection/_validation.py", line 971, in _score
scores = scorer(estimator, X_test, y_test, **score_params)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/circleci/project/.pixi/envs/doc/lib/python3.12/site-packages/sklearn/metrics/_scorer.py", line 455, in __call__
return estimator.score(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/circleci/project/.pixi/envs/doc/lib/python3.12/site-packages/sklearn/pipeline.py", line 1007, in score
return self.steps[-1][1].score(Xt, y, **score_params)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/circleci/project/.pixi/envs/doc/lib/python3.12/site-packages/sklearn/base.py", line 848, in score
y_pred = self.predict(X)
^^^^^^^^^^^^^^^
File "/home/circleci/project/.pixi/envs/doc/lib/python3.12/site-packages/sklearn/ensemble/_hist_gradient_boosting/gradient_boosting.py", line 1769, in predict
return self._loss.link.inverse(self._raw_predict(X).ravel())
^^^^^^^^^^^^^^^^^^^^
File "/home/circleci/project/.pixi/envs/doc/lib/python3.12/site-packages/sklearn/ensemble/_hist_gradient_boosting/gradient_boosting.py", line 1278, in _raw_predict
X = self._preprocess_X(X, reset=False)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/circleci/project/.pixi/envs/doc/lib/python3.12/site-packages/sklearn/ensemble/_hist_gradient_boosting/gradient_boosting.py", line 268, in _preprocess_X
return self._validate_data(X, reset=False, **check_X_kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/circleci/project/.pixi/envs/doc/lib/python3.12/site-packages/sklearn/base.py", line 608, in _validate_data
self._check_feature_names(X, reset=reset)
File "/home/circleci/project/.pixi/envs/doc/lib/python3.12/site-packages/sklearn/base.py", line 535, in _check_feature_names
raise ValueError(message)
ValueError: The feature names should match those that were passed during fit.
Feature names unseen at fit time:
- index__skrub_810cdf0b__
Feature names seen at fit time, yet now missing:
- index__skrub_e167678c__
warnings.warn(
/home/circleci/project/.pixi/envs/doc/lib/python3.12/site-packages/sklearn/model_selection/_validation.py:982: UserWarning: Scoring failed. The score on this train-test partition for these parameters will be set to nan. Details:
Traceback (most recent call last):
File "/home/circleci/project/.pixi/envs/doc/lib/python3.12/site-packages/sklearn/model_selection/_validation.py", line 971, in _score
scores = scorer(estimator, X_test, y_test, **score_params)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/circleci/project/.pixi/envs/doc/lib/python3.12/site-packages/sklearn/metrics/_scorer.py", line 455, in __call__
return estimator.score(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/circleci/project/.pixi/envs/doc/lib/python3.12/site-packages/sklearn/pipeline.py", line 1007, in score
return self.steps[-1][1].score(Xt, y, **score_params)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/circleci/project/.pixi/envs/doc/lib/python3.12/site-packages/sklearn/base.py", line 848, in score
y_pred = self.predict(X)
^^^^^^^^^^^^^^^
File "/home/circleci/project/.pixi/envs/doc/lib/python3.12/site-packages/sklearn/ensemble/_hist_gradient_boosting/gradient_boosting.py", line 1769, in predict
return self._loss.link.inverse(self._raw_predict(X).ravel())
^^^^^^^^^^^^^^^^^^^^
File "/home/circleci/project/.pixi/envs/doc/lib/python3.12/site-packages/sklearn/ensemble/_hist_gradient_boosting/gradient_boosting.py", line 1278, in _raw_predict
X = self._preprocess_X(X, reset=False)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/circleci/project/.pixi/envs/doc/lib/python3.12/site-packages/sklearn/ensemble/_hist_gradient_boosting/gradient_boosting.py", line 268, in _preprocess_X
return self._validate_data(X, reset=False, **check_X_kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/circleci/project/.pixi/envs/doc/lib/python3.12/site-packages/sklearn/base.py", line 608, in _validate_data
self._check_feature_names(X, reset=reset)
File "/home/circleci/project/.pixi/envs/doc/lib/python3.12/site-packages/sklearn/base.py", line 535, in _check_feature_names
raise ValueError(message)
ValueError: The feature names should match those that were passed during fit.
Feature names unseen at fit time:
- index__skrub_9f9696d7__
Feature names seen at fit time, yet now missing:
- index__skrub_4407c69f__
warnings.warn(
/home/circleci/project/.pixi/envs/doc/lib/python3.12/site-packages/sklearn/model_selection/_validation.py:982: UserWarning: Scoring failed. The score on this train-test partition for these parameters will be set to nan. Details:
Traceback (most recent call last):
File "/home/circleci/project/.pixi/envs/doc/lib/python3.12/site-packages/sklearn/model_selection/_validation.py", line 971, in _score
scores = scorer(estimator, X_test, y_test, **score_params)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/circleci/project/.pixi/envs/doc/lib/python3.12/site-packages/sklearn/metrics/_scorer.py", line 455, in __call__
return estimator.score(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/circleci/project/.pixi/envs/doc/lib/python3.12/site-packages/sklearn/pipeline.py", line 1007, in score
return self.steps[-1][1].score(Xt, y, **score_params)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/circleci/project/.pixi/envs/doc/lib/python3.12/site-packages/sklearn/base.py", line 848, in score
y_pred = self.predict(X)
^^^^^^^^^^^^^^^
File "/home/circleci/project/.pixi/envs/doc/lib/python3.12/site-packages/sklearn/ensemble/_hist_gradient_boosting/gradient_boosting.py", line 1769, in predict
return self._loss.link.inverse(self._raw_predict(X).ravel())
^^^^^^^^^^^^^^^^^^^^
File "/home/circleci/project/.pixi/envs/doc/lib/python3.12/site-packages/sklearn/ensemble/_hist_gradient_boosting/gradient_boosting.py", line 1278, in _raw_predict
X = self._preprocess_X(X, reset=False)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/circleci/project/.pixi/envs/doc/lib/python3.12/site-packages/sklearn/ensemble/_hist_gradient_boosting/gradient_boosting.py", line 268, in _preprocess_X
return self._validate_data(X, reset=False, **check_X_kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/circleci/project/.pixi/envs/doc/lib/python3.12/site-packages/sklearn/base.py", line 608, in _validate_data
self._check_feature_names(X, reset=reset)
File "/home/circleci/project/.pixi/envs/doc/lib/python3.12/site-packages/sklearn/base.py", line 535, in _check_feature_names
raise ValueError(message)
ValueError: The feature names should match those that were passed during fit.
Feature names unseen at fit time:
- index__skrub_a2515f49__
Feature names seen at fit time, yet now missing:
- index__skrub_59337b8a__
warnings.warn(
/home/circleci/project/.pixi/envs/doc/lib/python3.12/site-packages/sklearn/model_selection/_validation.py:982: UserWarning: Scoring failed. The score on this train-test partition for these parameters will be set to nan. Details:
Traceback (most recent call last):
File "/home/circleci/project/.pixi/envs/doc/lib/python3.12/site-packages/sklearn/model_selection/_validation.py", line 971, in _score
scores = scorer(estimator, X_test, y_test, **score_params)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/circleci/project/.pixi/envs/doc/lib/python3.12/site-packages/sklearn/metrics/_scorer.py", line 455, in __call__
return estimator.score(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/circleci/project/.pixi/envs/doc/lib/python3.12/site-packages/sklearn/pipeline.py", line 1007, in score
return self.steps[-1][1].score(Xt, y, **score_params)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/circleci/project/.pixi/envs/doc/lib/python3.12/site-packages/sklearn/base.py", line 848, in score
y_pred = self.predict(X)
^^^^^^^^^^^^^^^
File "/home/circleci/project/.pixi/envs/doc/lib/python3.12/site-packages/sklearn/ensemble/_hist_gradient_boosting/gradient_boosting.py", line 1769, in predict
return self._loss.link.inverse(self._raw_predict(X).ravel())
^^^^^^^^^^^^^^^^^^^^
File "/home/circleci/project/.pixi/envs/doc/lib/python3.12/site-packages/sklearn/ensemble/_hist_gradient_boosting/gradient_boosting.py", line 1278, in _raw_predict
X = self._preprocess_X(X, reset=False)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/circleci/project/.pixi/envs/doc/lib/python3.12/site-packages/sklearn/ensemble/_hist_gradient_boosting/gradient_boosting.py", line 268, in _preprocess_X
return self._validate_data(X, reset=False, **check_X_kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/circleci/project/.pixi/envs/doc/lib/python3.12/site-packages/sklearn/base.py", line 608, in _validate_data
self._check_feature_names(X, reset=reset)
File "/home/circleci/project/.pixi/envs/doc/lib/python3.12/site-packages/sklearn/base.py", line 535, in _check_feature_names
raise ValueError(message)
ValueError: The feature names should match those that were passed during fit.
Feature names unseen at fit time:
- index__skrub_ef51ebab__
Feature names seen at fit time, yet now missing:
- index__skrub_f0df999b__
warnings.warn(
/home/circleci/project/.pixi/envs/doc/lib/python3.12/site-packages/sklearn/model_selection/_validation.py:982: UserWarning: Scoring failed. The score on this train-test partition for these parameters will be set to nan. Details:
Traceback (most recent call last):
File "/home/circleci/project/.pixi/envs/doc/lib/python3.12/site-packages/sklearn/model_selection/_validation.py", line 971, in _score
scores = scorer(estimator, X_test, y_test, **score_params)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/circleci/project/.pixi/envs/doc/lib/python3.12/site-packages/sklearn/metrics/_scorer.py", line 455, in __call__
return estimator.score(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/circleci/project/.pixi/envs/doc/lib/python3.12/site-packages/sklearn/pipeline.py", line 1007, in score
return self.steps[-1][1].score(Xt, y, **score_params)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/circleci/project/.pixi/envs/doc/lib/python3.12/site-packages/sklearn/base.py", line 848, in score
y_pred = self.predict(X)
^^^^^^^^^^^^^^^
File "/home/circleci/project/.pixi/envs/doc/lib/python3.12/site-packages/sklearn/ensemble/_hist_gradient_boosting/gradient_boosting.py", line 1769, in predict
return self._loss.link.inverse(self._raw_predict(X).ravel())
^^^^^^^^^^^^^^^^^^^^
File "/home/circleci/project/.pixi/envs/doc/lib/python3.12/site-packages/sklearn/ensemble/_hist_gradient_boosting/gradient_boosting.py", line 1278, in _raw_predict
X = self._preprocess_X(X, reset=False)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/circleci/project/.pixi/envs/doc/lib/python3.12/site-packages/sklearn/ensemble/_hist_gradient_boosting/gradient_boosting.py", line 268, in _preprocess_X
return self._validate_data(X, reset=False, **check_X_kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/circleci/project/.pixi/envs/doc/lib/python3.12/site-packages/sklearn/base.py", line 608, in _validate_data
self._check_feature_names(X, reset=reset)
File "/home/circleci/project/.pixi/envs/doc/lib/python3.12/site-packages/sklearn/base.py", line 535, in _check_feature_names
raise ValueError(message)
ValueError: The feature names should match those that were passed during fit.
Feature names unseen at fit time:
- index__skrub_b10f184b__
Feature names seen at fit time, yet now missing:
- index__skrub_6d0ed8a6__
warnings.warn(
/home/circleci/project/.pixi/envs/doc/lib/python3.12/site-packages/sklearn/model_selection/_validation.py:982: UserWarning: Scoring failed. The score on this train-test partition for these parameters will be set to nan. Details:
Traceback (most recent call last):
File "/home/circleci/project/.pixi/envs/doc/lib/python3.12/site-packages/sklearn/model_selection/_validation.py", line 971, in _score
scores = scorer(estimator, X_test, y_test, **score_params)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/circleci/project/.pixi/envs/doc/lib/python3.12/site-packages/sklearn/metrics/_scorer.py", line 455, in __call__
return estimator.score(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/circleci/project/.pixi/envs/doc/lib/python3.12/site-packages/sklearn/pipeline.py", line 1007, in score
return self.steps[-1][1].score(Xt, y, **score_params)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/circleci/project/.pixi/envs/doc/lib/python3.12/site-packages/sklearn/base.py", line 848, in score
y_pred = self.predict(X)
^^^^^^^^^^^^^^^
File "/home/circleci/project/.pixi/envs/doc/lib/python3.12/site-packages/sklearn/ensemble/_hist_gradient_boosting/gradient_boosting.py", line 1769, in predict
return self._loss.link.inverse(self._raw_predict(X).ravel())
^^^^^^^^^^^^^^^^^^^^
File "/home/circleci/project/.pixi/envs/doc/lib/python3.12/site-packages/sklearn/ensemble/_hist_gradient_boosting/gradient_boosting.py", line 1278, in _raw_predict
X = self._preprocess_X(X, reset=False)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/circleci/project/.pixi/envs/doc/lib/python3.12/site-packages/sklearn/ensemble/_hist_gradient_boosting/gradient_boosting.py", line 268, in _preprocess_X
return self._validate_data(X, reset=False, **check_X_kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/circleci/project/.pixi/envs/doc/lib/python3.12/site-packages/sklearn/base.py", line 608, in _validate_data
self._check_feature_names(X, reset=reset)
File "/home/circleci/project/.pixi/envs/doc/lib/python3.12/site-packages/sklearn/base.py", line 535, in _check_feature_names
raise ValueError(message)
ValueError: The feature names should match those that were passed during fit.
Feature names unseen at fit time:
- index__skrub_604fdf98__
Feature names seen at fit time, yet now missing:
- index__skrub_111f8501__
warnings.warn(
/home/circleci/project/.pixi/envs/doc/lib/python3.12/site-packages/sklearn/model_selection/_validation.py:982: UserWarning: Scoring failed. The score on this train-test partition for these parameters will be set to nan. Details:
Traceback (most recent call last):
File "/home/circleci/project/.pixi/envs/doc/lib/python3.12/site-packages/sklearn/model_selection/_validation.py", line 971, in _score
scores = scorer(estimator, X_test, y_test, **score_params)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/circleci/project/.pixi/envs/doc/lib/python3.12/site-packages/sklearn/metrics/_scorer.py", line 455, in __call__
return estimator.score(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/circleci/project/.pixi/envs/doc/lib/python3.12/site-packages/sklearn/pipeline.py", line 1007, in score
return self.steps[-1][1].score(Xt, y, **score_params)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/circleci/project/.pixi/envs/doc/lib/python3.12/site-packages/sklearn/base.py", line 848, in score
y_pred = self.predict(X)
^^^^^^^^^^^^^^^
File "/home/circleci/project/.pixi/envs/doc/lib/python3.12/site-packages/sklearn/ensemble/_hist_gradient_boosting/gradient_boosting.py", line 1769, in predict
return self._loss.link.inverse(self._raw_predict(X).ravel())
^^^^^^^^^^^^^^^^^^^^
File "/home/circleci/project/.pixi/envs/doc/lib/python3.12/site-packages/sklearn/ensemble/_hist_gradient_boosting/gradient_boosting.py", line 1278, in _raw_predict
X = self._preprocess_X(X, reset=False)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/circleci/project/.pixi/envs/doc/lib/python3.12/site-packages/sklearn/ensemble/_hist_gradient_boosting/gradient_boosting.py", line 268, in _preprocess_X
return self._validate_data(X, reset=False, **check_X_kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/circleci/project/.pixi/envs/doc/lib/python3.12/site-packages/sklearn/base.py", line 608, in _validate_data
self._check_feature_names(X, reset=reset)
File "/home/circleci/project/.pixi/envs/doc/lib/python3.12/site-packages/sklearn/base.py", line 535, in _check_feature_names
raise ValueError(message)
ValueError: The feature names should match those that were passed during fit.
Feature names unseen at fit time:
- index__skrub_41ebad1d__
Feature names seen at fit time, yet now missing:
- index__skrub_64a1f497__
warnings.warn(
/home/circleci/project/.pixi/envs/doc/lib/python3.12/site-packages/sklearn/model_selection/_validation.py:982: UserWarning: Scoring failed. The score on this train-test partition for these parameters will be set to nan. Details:
Traceback (most recent call last):
File "/home/circleci/project/.pixi/envs/doc/lib/python3.12/site-packages/sklearn/model_selection/_validation.py", line 971, in _score
scores = scorer(estimator, X_test, y_test, **score_params)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/circleci/project/.pixi/envs/doc/lib/python3.12/site-packages/sklearn/metrics/_scorer.py", line 455, in __call__
return estimator.score(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/circleci/project/.pixi/envs/doc/lib/python3.12/site-packages/sklearn/pipeline.py", line 1007, in score
return self.steps[-1][1].score(Xt, y, **score_params)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/circleci/project/.pixi/envs/doc/lib/python3.12/site-packages/sklearn/base.py", line 848, in score
y_pred = self.predict(X)
^^^^^^^^^^^^^^^
File "/home/circleci/project/.pixi/envs/doc/lib/python3.12/site-packages/sklearn/ensemble/_hist_gradient_boosting/gradient_boosting.py", line 1769, in predict
return self._loss.link.inverse(self._raw_predict(X).ravel())
^^^^^^^^^^^^^^^^^^^^
File "/home/circleci/project/.pixi/envs/doc/lib/python3.12/site-packages/sklearn/ensemble/_hist_gradient_boosting/gradient_boosting.py", line 1278, in _raw_predict
X = self._preprocess_X(X, reset=False)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/circleci/project/.pixi/envs/doc/lib/python3.12/site-packages/sklearn/ensemble/_hist_gradient_boosting/gradient_boosting.py", line 268, in _preprocess_X
return self._validate_data(X, reset=False, **check_X_kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/circleci/project/.pixi/envs/doc/lib/python3.12/site-packages/sklearn/base.py", line 608, in _validate_data
self._check_feature_names(X, reset=reset)
File "/home/circleci/project/.pixi/envs/doc/lib/python3.12/site-packages/sklearn/base.py", line 535, in _check_feature_names
raise ValueError(message)
ValueError: The feature names should match those that were passed during fit.
Feature names unseen at fit time:
- index__skrub_5cc4f8c1__
Feature names seen at fit time, yet now missing:
- index__skrub_e8900710__
warnings.warn(
/home/circleci/project/.pixi/envs/doc/lib/python3.12/site-packages/sklearn/model_selection/_validation.py:982: UserWarning: Scoring failed. The score on this train-test partition for these parameters will be set to nan. Details:
Traceback (most recent call last):
File "/home/circleci/project/.pixi/envs/doc/lib/python3.12/site-packages/sklearn/model_selection/_validation.py", line 971, in _score
scores = scorer(estimator, X_test, y_test, **score_params)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/circleci/project/.pixi/envs/doc/lib/python3.12/site-packages/sklearn/metrics/_scorer.py", line 455, in __call__
return estimator.score(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/circleci/project/.pixi/envs/doc/lib/python3.12/site-packages/sklearn/pipeline.py", line 1007, in score
return self.steps[-1][1].score(Xt, y, **score_params)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/circleci/project/.pixi/envs/doc/lib/python3.12/site-packages/sklearn/base.py", line 848, in score
y_pred = self.predict(X)
^^^^^^^^^^^^^^^
File "/home/circleci/project/.pixi/envs/doc/lib/python3.12/site-packages/sklearn/ensemble/_hist_gradient_boosting/gradient_boosting.py", line 1769, in predict
return self._loss.link.inverse(self._raw_predict(X).ravel())
^^^^^^^^^^^^^^^^^^^^
File "/home/circleci/project/.pixi/envs/doc/lib/python3.12/site-packages/sklearn/ensemble/_hist_gradient_boosting/gradient_boosting.py", line 1278, in _raw_predict
X = self._preprocess_X(X, reset=False)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/circleci/project/.pixi/envs/doc/lib/python3.12/site-packages/sklearn/ensemble/_hist_gradient_boosting/gradient_boosting.py", line 268, in _preprocess_X
return self._validate_data(X, reset=False, **check_X_kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/circleci/project/.pixi/envs/doc/lib/python3.12/site-packages/sklearn/base.py", line 608, in _validate_data
self._check_feature_names(X, reset=reset)
File "/home/circleci/project/.pixi/envs/doc/lib/python3.12/site-packages/sklearn/base.py", line 535, in _check_feature_names
raise ValueError(message)
ValueError: The feature names should match those that were passed during fit.
Feature names unseen at fit time:
- index__skrub_865a1293__
Feature names seen at fit time, yet now missing:
- index__skrub_76873db3__
warnings.warn(
/home/circleci/project/.pixi/envs/doc/lib/python3.12/site-packages/sklearn/model_selection/_validation.py:982: UserWarning: Scoring failed. The score on this train-test partition for these parameters will be set to nan. Details:
Traceback (most recent call last):
File "/home/circleci/project/.pixi/envs/doc/lib/python3.12/site-packages/sklearn/model_selection/_validation.py", line 971, in _score
scores = scorer(estimator, X_test, y_test, **score_params)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/circleci/project/.pixi/envs/doc/lib/python3.12/site-packages/sklearn/metrics/_scorer.py", line 455, in __call__
return estimator.score(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/circleci/project/.pixi/envs/doc/lib/python3.12/site-packages/sklearn/pipeline.py", line 1007, in score
return self.steps[-1][1].score(Xt, y, **score_params)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/circleci/project/.pixi/envs/doc/lib/python3.12/site-packages/sklearn/base.py", line 848, in score
y_pred = self.predict(X)
^^^^^^^^^^^^^^^
File "/home/circleci/project/.pixi/envs/doc/lib/python3.12/site-packages/sklearn/ensemble/_hist_gradient_boosting/gradient_boosting.py", line 1769, in predict
return self._loss.link.inverse(self._raw_predict(X).ravel())
^^^^^^^^^^^^^^^^^^^^
File "/home/circleci/project/.pixi/envs/doc/lib/python3.12/site-packages/sklearn/ensemble/_hist_gradient_boosting/gradient_boosting.py", line 1278, in _raw_predict
X = self._preprocess_X(X, reset=False)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/circleci/project/.pixi/envs/doc/lib/python3.12/site-packages/sklearn/ensemble/_hist_gradient_boosting/gradient_boosting.py", line 268, in _preprocess_X
return self._validate_data(X, reset=False, **check_X_kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/circleci/project/.pixi/envs/doc/lib/python3.12/site-packages/sklearn/base.py", line 608, in _validate_data
self._check_feature_names(X, reset=reset)
File "/home/circleci/project/.pixi/envs/doc/lib/python3.12/site-packages/sklearn/base.py", line 535, in _check_feature_names
raise ValueError(message)
ValueError: The feature names should match those that were passed during fit.
Feature names unseen at fit time:
- index__skrub_7366f221__
Feature names seen at fit time, yet now missing:
- index__skrub_f59e789e__
warnings.warn(
/home/circleci/project/.pixi/envs/doc/lib/python3.12/site-packages/sklearn/model_selection/_validation.py:982: UserWarning: Scoring failed. The score on this train-test partition for these parameters will be set to nan. Details:
Traceback (most recent call last):
File "/home/circleci/project/.pixi/envs/doc/lib/python3.12/site-packages/sklearn/model_selection/_validation.py", line 971, in _score
scores = scorer(estimator, X_test, y_test, **score_params)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/circleci/project/.pixi/envs/doc/lib/python3.12/site-packages/sklearn/metrics/_scorer.py", line 455, in __call__
return estimator.score(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/circleci/project/.pixi/envs/doc/lib/python3.12/site-packages/sklearn/pipeline.py", line 1007, in score
return self.steps[-1][1].score(Xt, y, **score_params)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/circleci/project/.pixi/envs/doc/lib/python3.12/site-packages/sklearn/base.py", line 848, in score
y_pred = self.predict(X)
^^^^^^^^^^^^^^^
File "/home/circleci/project/.pixi/envs/doc/lib/python3.12/site-packages/sklearn/ensemble/_hist_gradient_boosting/gradient_boosting.py", line 1769, in predict
return self._loss.link.inverse(self._raw_predict(X).ravel())
^^^^^^^^^^^^^^^^^^^^
File "/home/circleci/project/.pixi/envs/doc/lib/python3.12/site-packages/sklearn/ensemble/_hist_gradient_boosting/gradient_boosting.py", line 1278, in _raw_predict
X = self._preprocess_X(X, reset=False)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/circleci/project/.pixi/envs/doc/lib/python3.12/site-packages/sklearn/ensemble/_hist_gradient_boosting/gradient_boosting.py", line 268, in _preprocess_X
return self._validate_data(X, reset=False, **check_X_kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/circleci/project/.pixi/envs/doc/lib/python3.12/site-packages/sklearn/base.py", line 608, in _validate_data
self._check_feature_names(X, reset=reset)
File "/home/circleci/project/.pixi/envs/doc/lib/python3.12/site-packages/sklearn/base.py", line 535, in _check_feature_names
raise ValueError(message)
ValueError: The feature names should match those that were passed during fit.
Feature names unseen at fit time:
- index__skrub_6d430c5b__
Feature names seen at fit time, yet now missing:
- index__skrub_c8369667__
warnings.warn(
/home/circleci/project/.pixi/envs/doc/lib/python3.12/site-packages/sklearn/model_selection/_validation.py:982: UserWarning: Scoring failed. The score on this train-test partition for these parameters will be set to nan. Details:
Traceback (most recent call last):
File "/home/circleci/project/.pixi/envs/doc/lib/python3.12/site-packages/sklearn/model_selection/_validation.py", line 971, in _score
scores = scorer(estimator, X_test, y_test, **score_params)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/circleci/project/.pixi/envs/doc/lib/python3.12/site-packages/sklearn/metrics/_scorer.py", line 455, in __call__
return estimator.score(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/circleci/project/.pixi/envs/doc/lib/python3.12/site-packages/sklearn/pipeline.py", line 1007, in score
return self.steps[-1][1].score(Xt, y, **score_params)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/circleci/project/.pixi/envs/doc/lib/python3.12/site-packages/sklearn/base.py", line 848, in score
y_pred = self.predict(X)
^^^^^^^^^^^^^^^
File "/home/circleci/project/.pixi/envs/doc/lib/python3.12/site-packages/sklearn/ensemble/_hist_gradient_boosting/gradient_boosting.py", line 1769, in predict
return self._loss.link.inverse(self._raw_predict(X).ravel())
^^^^^^^^^^^^^^^^^^^^
File "/home/circleci/project/.pixi/envs/doc/lib/python3.12/site-packages/sklearn/ensemble/_hist_gradient_boosting/gradient_boosting.py", line 1278, in _raw_predict
X = self._preprocess_X(X, reset=False)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/circleci/project/.pixi/envs/doc/lib/python3.12/site-packages/sklearn/ensemble/_hist_gradient_boosting/gradient_boosting.py", line 268, in _preprocess_X
return self._validate_data(X, reset=False, **check_X_kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/circleci/project/.pixi/envs/doc/lib/python3.12/site-packages/sklearn/base.py", line 608, in _validate_data
self._check_feature_names(X, reset=reset)
File "/home/circleci/project/.pixi/envs/doc/lib/python3.12/site-packages/sklearn/base.py", line 535, in _check_feature_names
raise ValueError(message)
ValueError: The feature names should match those that were passed during fit.
Feature names unseen at fit time:
- index__skrub_a00b5363__
Feature names seen at fit time, yet now missing:
- index__skrub_6ecda470__
warnings.warn(
/home/circleci/project/.pixi/envs/doc/lib/python3.12/site-packages/sklearn/model_selection/_search.py:1052: UserWarning: One or more of the test scores are non-finite: [0.08466469 nan nan nan nan]
warnings.warn(
The score used in this regression task is the R2. Remember that the R2
measures performance relative to the naive baseline that always predicts
the mean value of y_test.
Therefore, the R2 is 0 when y_pred = y_true.mean() and is upper bounded
by 1, which is reached when y_pred = y_true.
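As a quick sanity check of these two reference points, here is a minimal sketch on a handful of made-up ratings (the values are hypothetical and not taken from MovieLens). It only relies on `r2_score` from scikit-learn, and also shows that the R2 can become negative when a prediction is worse than the mean baseline.

```python
import numpy as np
from sklearn.metrics import r2_score

# Hypothetical ratings, for illustration only.
y_true = np.array([3.0, 4.5, 2.0, 5.0, 0.5])

# Constant prediction equal to the mean of the true values: R2 is 0.
r2_mean = r2_score(y_true, np.full_like(y_true, y_true.mean()))

# Perfect prediction: R2 reaches its upper bound of 1.
r2_perfect = r2_score(y_true, y_true)

# Constant prediction far from the mean: worse than the baseline, so R2 < 0.
r2_bad = r2_score(y_true, np.zeros_like(y_true))

print(r2_mean, r2_perfect, r2_bad)
```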
To get a better sense of the learning performance of our simple pipeline, we also compute the average rating of each movie in the training set, and use this average to predict the ratings in the test set.
from sklearn.metrics import r2_score


def baseline_r2(X, y, train_idx, test_idx):
    """Compute the average rating for all movies in the train set,
    and map these averages to the test set as a prediction.

    If a movie in the test set is not present in the training set,
    we simply predict the global average rating of the training set.
    """
    X_train, y_train = X.iloc[train_idx].copy(), y.iloc[train_idx]
    X_test, y_test = X.iloc[test_idx], y.iloc[test_idx]
    X_train["y"] = y_train

    # Average rating per movie, computed on the training rows only.
    movie_avg_rating = X_train.groupby("movieId")["y"].mean().to_frame().reset_index()
    y_pred = X_test.merge(movie_avg_rating, on="movieId", how="left")["y"]
    # Movies unseen during training fall back to the global training average.
    y_pred = y_pred.fillna(y_train.mean())

    return r2_score(y_true=y_test, y_pred=y_pred)


all_baseline_r2 = []
for train_idx, test_idx in TimeSeriesSplit(n_splits=10).split(X, y):
    all_baseline_r2.append(baseline_r2(X, y, train_idx, test_idx))

results.insert(0, "naive mean estimator", all_baseline_r2)
# we only keep the last 5 of the 10 results
# because the initial size of the train set is rather small
# (layout="constrained" already handles spacing, so no tight_layout call)
fig, ax = plt.subplots(layout="constrained")
sns.boxplot(results.tail(5), palette="magma", ax=ax)
ax.set_ylabel("R2 score")
ax.set_title("Hyper parameters grid-search results")
![Hyper parameters grid-search results](../_images/sphx_glr_08_join_aggregation_003.png)
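To see why the first splits are discarded, the sketch below prints the train and test sizes produced by `TimeSeriesSplit(n_splits=10)`. It uses a toy array of 110 rows (an arbitrary size chosen for readability, not the actual ratings table): the training set grows with each split, so the earliest folds are fit on only a small fraction of the data.

```python
import numpy as np
from sklearn.model_selection import TimeSeriesSplit

# Toy data: 110 time-ordered rows (hypothetical size, for illustration).
X_toy = np.arange(110).reshape(-1, 1)

train_sizes, test_sizes = [], []
for train_idx, test_idx in TimeSeriesSplit(n_splits=10).split(X_toy):
    # Each split trains on everything that precedes its test block.
    train_sizes.append(len(train_idx))
    test_sizes.append(len(test_idx))

print(train_sizes)
print(test_sizes)
```

With this toy size, the first split trains on a tenth of what the last split sees, which is the motivation for keeping only the later folds.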
The naive estimator performs worse than our pipeline, which means that our extracted features bring some predictive power.
It seems that using "value_counts" as an aggregation operator for AggTarget yields better performance than using the mean (which is equivalent to using the TargetEncoder).
Here, performance increases with the number of bins used to encode the target: computing the mean yields a single statistic, whereas histograms
yield a density over a reduced set of bins, and "value_counts" yields an
exhaustive histogram over all the possible values of ratings
(here 10 different values, from 0.5 to 5).
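The difference between the two operators can be illustrated with plain pandas. This is only a sketch of the idea on a made-up ratings log, not the internals of AggTarget: the mean gives one number per movie, while counting values gives one column per distinct rating.

```python
import pandas as pd

# Toy ratings log (hypothetical values, not taken from MovieLens).
toy = pd.DataFrame(
    {
        "movieId": [1, 1, 1, 2, 2, 3],
        "rating": [4.0, 4.0, 5.0, 2.5, 3.0, 0.5],
    }
)

# "mean": a single statistic per movie.
mean_per_movie = toy.groupby("movieId")["rating"].mean()

# Counting values: one column per distinct rating,
# i.e. an exhaustive histogram of the ratings of each movie.
counts_per_movie = pd.crosstab(toy["movieId"], toy["rating"])

print(mean_per_movie)
print(counts_per_movie)
```

The count table keeps the shape of each movie's rating distribution (for instance, whether ratings are polarized), which the mean alone discards.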
Total running time of the script: (0 minutes 20.361 seconds)