skrub.Expr.skb.cross_validate

Expr.skb.cross_validate(environment=None, **kwargs)

Cross-validate the expression.

This generates the pipeline with default hyperparameters and runs scikit-learn cross-validation.

Parameters:
environment : dict or None

Bindings for variables contained in the expression. If not provided, the values passed when initializing var() are used.

kwargs : dict

All other named arguments are forwarded to sklearn.model_selection.cross_validate, except that scikit-learn’s return_estimator parameter is named return_pipeline here.

Returns:
dict

Cross-validation results.

Examples

>>> from sklearn.datasets import make_classification
>>> from sklearn.linear_model import LogisticRegression
>>> import skrub
>>> X_a, y_a = make_classification(random_state=0)
>>> X, y = skrub.X(X_a), skrub.y(y_a)
>>> pred = X.skb.apply(LogisticRegression(), y=y)
>>> pred.skb.cross_validate(cv=2)['test_score']
0    0.84
1    0.78
Name: test_score, dtype: float64

Passing data explicitly as the environment, keyed by variable name:

>>> data = {'X': X_a, 'y': y_a}
>>> pred.skb.cross_validate(data)['test_score']
0    0.75
1    0.90
2    0.85
3    0.65
4    0.90
Name: test_score, dtype: float64
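
Forwarded keyword arguments:

The named arguments described above are forwarded to scikit-learn, so it can help to see them on the scikit-learn side. The following is a minimal sketch that calls sklearn.model_selection.cross_validate directly rather than going through skrub. The values cv=3, scoring="accuracy" and return_train_score=True are assumptions chosen for illustration. With skrub, the same arguments would be passed to .skb.cross_validate, using return_pipeline in place of return_estimator.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_validate

# Same toy dataset as in the examples above.
X_a, y_a = make_classification(random_state=0)

# Plain scikit-learn call. In skrub, return_estimator is named
# return_pipeline; the other arguments keep their scikit-learn names.
results = cross_validate(
    LogisticRegression(),
    X_a,
    y_a,
    cv=3,                     # illustrative choice: 3 folds
    scoring="accuracy",       # illustrative choice of metric
    return_train_score=True,  # also report scores on the training folds
    return_estimator=True,    # keep the estimator fitted on each fold
)

# One entry per fold for every key in the results.
print(sorted(results))
print(len(results["estimator"]))
```

Each key in the results holds one entry per fold, so asking for three folds gives three test scores, three train scores and three fitted estimators.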