has_nulls

skrub.selectors.has_nulls(proportion=0.0)

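Select columns that contain null values. With the default proportion=0.0, a column is selected if it contains at least one null value; with a larger proportion, a column is selected only if its fraction of null values is above that value.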

Examples

>>> from skrub import selectors as s
>>> import pandas as pd
>>> df = pd.DataFrame(dict(a=[0, 1, 2], b=[0, None, 20], c=['a', 'b', None]))
>>> s.select(df, s.has_nulls())
      b     c
0   0.0     a
1   ...     b
2  20.0  ...

Use the proportion parameter to select only columns whose fraction of null values (a number between 0 and 1, not a percentage) exceeds a threshold:

>>> df2 = pd.DataFrame(dict(
...     few_nulls=[1, 2, 3, None],
...     many_nulls=[1, None, None, None],
...     no_nulls=[1, 2, 3, 4]))
>>> s.select(df2, s.has_nulls(proportion=0.20))
   few_nulls  many_nulls
0        1.0         1.0
1        2.0         ...
2        3.0         ...
3        ...         ...
>>> s.select(df2, s.has_nulls(proportion=0.5))
   many_nulls
0         1.0
1         ...
2         ...
3         ...
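
The thresholds in the examples above can be checked by hand with plain pandas. The following is a minimal sketch, not skrub's implementation: the helper name null_fraction_columns is hypothetical, and the strict "greater than" comparison is an assumption inferred from the examples (a column with no nulls has a null fraction of 0.0 and is not selected at the default proportion=0.0).

```python
import pandas as pd


def null_fraction_columns(df, proportion=0.0):
    """Return the names of columns whose null fraction is above `proportion`.

    Hypothetical helper for illustration only; it is not part of skrub.
    The strict comparison is an assumption inferred from the examples.
    """
    # isna() marks null cells; the column-wise mean is the null fraction.
    null_fraction = df.isna().mean()
    return list(null_fraction[null_fraction > proportion].index)


df2 = pd.DataFrame(dict(
    few_nulls=[1, 2, 3, None],
    many_nulls=[1, None, None, None],
    no_nulls=[1, 2, 3, 4]))

# Null fraction per column: 1 of 4, 3 of 4, and 0 of 4 values are null.
fractions = df2.isna().mean().to_dict()

selected_default = null_fraction_columns(df2)
selected_20 = null_fraction_columns(df2, proportion=0.20)
selected_50 = null_fraction_columns(df2, proportion=0.5)
```

Here few_nulls has a null fraction of 0.25 and many_nulls has 0.75, which is why both pass the 0.20 threshold and only many_nulls passes 0.5.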