Examples

Getting Started

Encoding: from a dataframe to a numerical matrix for machine learning

Various string encoders: a sentiment analysis example

Handling datetime features with the DatetimeEncoder

Fuzzy joining dirty tables with the Joiner

Deduplicating misspelled categories

Wikipedia embeddings to enrich the data

Spatial join for flight data: Joining across multiple columns

AggJoiner on a credit fraud dataset

Interpolation join: infer missing rows when joining two tables

Hands-On with Column Selection and Transformers

SquashingScaler: Robust numerical preprocessing for neural networks

Skrub DataOps

Introduction to machine-learning pipelines with skrub DataOps

Multiple tables: building machine learning pipelines with DataOps

Hyperparameter tuning with DataOps

Subsampling for faster development

Use case: developing locally and avoiding repeated code in production
