Skrub expressions

  • Building complex tabular pipelines
  • Tuning pipelines
  • Subsampling for faster development

© Copyright 2018-2023, the dirty_cat developers, 2023-2025, the skrub developers.
