filter
- skrub.selectors.filter(predicate, *args, **kwargs)
Select columns for which predicate returns True.

For each column col in the dataframe, predicate is called as predicate(col, *args, **kwargs), and the column is kept if it returns True. To filter columns based only on their name, see also filter_names.

args and kwargs are extra parameters for the predicate. Storing parameters this way rather than in a closure makes it possible to use an importable function as the predicate instead of a local one, which is necessary to pickle the selector. (An alternative is to use functools.partial.)

Examples
>>> from skrub import selectors as s
>>> import pandas as pd
>>> df = pd.DataFrame(
...     {
...         "height_mm": [297.0, 420.0],
...         "width_mm": [210.0, 297.0],
...         "kind": ["A4", "A3"],
...         "ID": [4, 3],
...     }
... )
>>> df
   height_mm  width_mm kind  ID
0      297.0     210.0   A4   4
1      420.0     297.0   A3   3

>>> selector = s.filter(lambda col: 'A4' in col.values)
>>> s.select(df, selector)
  kind
0   A4
1   A3

>>> def contains(col, value):
...     return value in col.values

>>> selector = s.filter(contains, 3)
>>> selector
filter(contains, 3)
>>> s.select(df, selector)
   ID
0   4
1   3
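
The remark about pickling can be illustrated without skrub. The sketch below is not skrub code: a plain dict of lists stands in for the dataframe, and keep_columns is a hypothetical helper that only mimics the documented call predicate(col, *args, **kwargs). It suggests why an importable predicate with separately stored parameters (or a functools.partial of one) can be pickled, while a lambda that captures the same value cannot.

```python
import functools
import pickle


def contains(col, value):
    """Return True if `value` is one of the entries of `col`."""
    return value in col


def keep_columns(table, predicate, *args, **kwargs):
    """Hypothetical stand-in for the selection step (not part of skrub).

    Keeps every column for which predicate(col, *args, **kwargs) is true.
    """
    return {
        name: col
        for name, col in table.items()
        if predicate(col, *args, **kwargs)
    }


# Assumption: a dict of lists plays the role of the dataframe.
table = {
    "height_mm": [297.0, 420.0],
    "width_mm": [210.0, 297.0],
    "kind": ["A4", "A3"],
    "ID": [4, 3],
}

# Option 1: pass the extra parameter separately from the predicate.
kept_with_args = keep_columns(table, contains, 3)

# Option 2: bind the parameter with functools.partial.
bound = functools.partial(contains, value=3)
kept_with_partial = keep_columns(table, bound)

# An importable function plus its bound parameters survives a pickle
# round trip, because the function is pickled by reference to its name.
restored = pickle.loads(pickle.dumps(bound))
kept_after_pickle = keep_columns(table, restored)

# A lambda has no importable name, so pickling it fails.
try:
    pickle.dumps(lambda col: 3 in col)
    lambda_pickled = True
except (pickle.PicklingError, AttributeError):
    lambda_pickled = False
```

In this sketch both options keep only the ID column. The same reasoning is why the contains example above passes 3 as an extra argument instead of capturing it in a lambda.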