has_nulls
- skrub.selectors.has_nulls(proportion=0.0)
Select columns that contain at least one null value.
Examples
>>> from skrub import selectors as s
>>> import pandas as pd
>>> df = pd.DataFrame(dict(a=[0, 1, 2], b=[0, None, 20], c=['a', 'b', None]))
>>> s.select(df, s.has_nulls())
      b     c
0   0.0     a
1   ...     b
2  20.0   ...
Use the proportion parameter to select only columns whose fraction of null values exceeds the given threshold:
>>> df2 = pd.DataFrame(dict(
...     few_nulls=[1, 2, 3, None],
...     many_nulls=[1, None, None, None],
...     no_nulls=[1, 2, 3, 4]))
>>> s.select(df2, s.has_nulls(proportion=0.20))
   few_nulls  many_nulls
0        1.0         1.0
1        2.0         ...
2        3.0         ...
3        ...         ...
>>> s.select(df2, s.has_nulls(proportion=0.5))
   many_nulls
0         1.0
1         ...
2         ...
3         ...
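To make the thresholds in the examples above easier to follow, here is a rough pandas-only sketch of the same selection. It assumes the selector compares each column's fraction of null values against proportion with a strict "greater than", which is consistent with the default of 0.0 meaning "at least one null" and with the outputs shown. The helper name columns_with_nulls is hypothetical and is not part of skrub; this is an illustration, not skrub's implementation.

```python
import pandas as pd


def columns_with_nulls(df, proportion=0.0):
    """Hypothetical helper (not skrub API): keep columns whose fraction
    of null values is strictly greater than `proportion`."""
    # isna() marks nulls as True; the column-wise mean is the null fraction.
    null_fraction = df.isna().mean()
    # Assumption: strict comparison, so proportion=0.0 keeps any column
    # with at least one null and drops columns with none.
    return df.loc[:, null_fraction > proportion]


df2 = pd.DataFrame(dict(
    few_nulls=[1, 2, 3, None],        # 1 of 4 values is null -> 0.25
    many_nulls=[1, None, None, None],  # 3 of 4 values are null -> 0.75
    no_nulls=[1, 2, 3, 4],             # no nulls -> 0.0
))

print(df2.isna().mean().to_dict())
print(list(columns_with_nulls(df2, proportion=0.20).columns))
print(list(columns_with_nulls(df2, proportion=0.5).columns))
```

With these null fractions, a threshold of 0.20 keeps few_nulls and many_nulls, while 0.5 keeps only many_nulls, matching the selector output shown above.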