Wrangling data with good defaults
This section covers how to build a predictive pipeline starting from a dataframe. The skrub objects described in this section can be used as strong defaults for building baseline pipelines, and can be customized for specific use cases.
- Cleaner: sanitizing a dataframe
- Transforming a table into an array of numeric features: TableVectorizer
- Building robust ML baselines with tabular_pipeline()
  - The logic used by the tabular pipeline is quite simple
  - Extending the pipeline with the .steps attribute
  - Using a pipeline as the estimator
- Transforming selected columns with ApplyToCols