has_dtype#
- skrub.selectors.has_dtype(*dtypes)[source]#
Select columns whose dtype is equal to one of the provided dtypes.
This selector takes a hands-off approach: skrub does not normalize or infer dtypes across dataframe libraries. A column is selected if
sbd.dtype(column) == dtypefor at least one of the provideddtypes.The most reliable way to use this selector is to get the dtype from an existing column and pass that dtype object directly.
Examples
>>> from skrub import selectors as s >>> import pandas as pd >>> df = pd.DataFrame( ... { ... "items": [["A4", "A3"], ["A5"]], ... "count": [2, 1], ... } ... ) >>> items_dtype = df["items"].dtype >>> s.select(df, s.has_dtype(items_dtype)) items 0 [A4, A3] 1 [A5]
Several dtypes can be accepted:
>>> count_dtype = df["count"].dtype >>> s.select(df, s.has_dtype(items_dtype, count_dtype)) items count 0 [A4, A3] 2 1 [A5] 1
Some dataframe libraries may also accept shorthand values that compare equal to a dtype object. However, this is backend-specific, so the most robust approach is still to pass the dtype obtained from the dataframe library itself.