cross_validate
- skrub.cross_validate(pipeline, environment, **kwargs)
Cross-validate a pipeline built from an expression.
This runs cross-validation from a pipeline that was built from a skrub expression with .skb.get_pipeline(), .skb.get_grid_search() or .skb.get_randomized_search().
It is useful to run nested cross-validation of a grid search or randomized search.
- Parameters:
- pipeline : skrub pipeline
A pipeline generated from a skrub expression.
- environment : dict
Bindings for variables contained in the expression.
- kwargs : dict
All other named arguments are forwarded to sklearn.model_selection.cross_validate(), except that scikit-learn’s return_estimator parameter is named return_pipeline here.
- Returns:
dict
Cross-validation results.
See also
sklearn.model_selection.cross_validate()
Evaluate metric(s) by cross-validation and also record fit/score times.
skrub.Expr.skb.get_pipeline()
Get a skrub pipeline for this expression.
skrub.Expr.skb.get_grid_search()
Find the best parameters with grid search.
skrub.Expr.skb.get_randomized_search()
Find the best parameters with randomized search.
Examples
>>> from sklearn.datasets import make_classification
>>> from sklearn.linear_model import LogisticRegression
>>> import skrub

>>> X_a, y_a = make_classification(random_state=0)
>>> X, y = skrub.X(X_a), skrub.y(y_a)
>>> log_reg = LogisticRegression(
...     **skrub.choose_float(0.01, 1.0, log=True, name="C")
... )
>>> pred = X.skb.apply(log_reg, y=y)
>>> search = pred.skb.get_randomized_search(random_state=0)
>>> skrub.cross_validate(search, pred.skb.get_data())['test_score']
0    0.75
1    0.90
2    0.95
3    0.75
4    0.85
Name: test_score, dtype: float64
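For comparison, the nested cross-validation described above can be sketched with scikit-learn alone. This sketch does not use skrub: a RandomizedSearchCV serves as the inner loop and sklearn.model_selection.cross_validate() as the outer loop. The search range for C, n_iter and the fold counts are illustrative assumptions, not values taken from skrub. It passes return_estimator=True, which is the scikit-learn name of the option that skrub.cross_validate exposes as return_pipeline.

```python
from scipy.stats import loguniform
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import RandomizedSearchCV, cross_validate

X, y = make_classification(random_state=0)

# Inner loop: randomized search over the regularization strength C.
# The range, n_iter and cv values are assumptions chosen for illustration.
search = RandomizedSearchCV(
    LogisticRegression(),
    param_distributions={"C": loguniform(0.01, 1.0)},
    n_iter=5,
    cv=3,
    random_state=0,
)

# Outer loop: each outer fold refits the whole search on its training part
# and scores the selected model on its held-out part.
results = cross_validate(search, X, y, cv=5, return_estimator=True)

# One fitted search object is kept per outer fold.
for fold, fitted_search in enumerate(results["estimator"]):
    print(fold, fitted_search.best_params_, results["test_score"][fold])
```

Each entry of results["estimator"] is a search object refitted on one outer training fold, so the selected value of C can differ from fold to fold.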