tabular_learner
- skrub.tabular_learner(estimator, *, n_jobs=None)
Get a simple machine-learning pipeline for tabular data.
Deprecated since version 0.6.0: The functionality provided by this function is now implemented in tabular_pipeline().
Given a scikit-learn estimator or one of the strings 'regressor', 'regression', 'classifier', 'classification', this function creates a scikit-learn pipeline that extracts numeric features, imputes missing values and scales the data if necessary, then applies the estimator.
Note
The heuristics used by tabular_pipeline to define an appropriate preprocessing based on the estimator may change in future releases.
Changed in version 0.6.0: The high cardinality encoder has been changed from MinHashEncoder to StringEncoder.
- Parameters:
- estimator {“regressor”, “regression”, “classifier”, “classification”} or sklearn.base.BaseEstimator
The estimator to use as the final step in the pipeline. Based on the type of estimator, the previous preprocessing steps and their respective parameters are chosen. The possible values are:
- 'regressor' or 'regression': a HistGradientBoostingRegressor is used as the final step;
- 'classifier' or 'classification': a HistGradientBoostingClassifier is used as the final step;
- a scikit-learn estimator: the provided estimator is used as the final step.
- n_jobs int, default=None
Number of jobs to run in parallel in the TableVectorizer step. None means 1 unless in a joblib parallel_backend context. -1 means using all processors.
- Returns: