SingleColumnTransformer

class skrub.core.SingleColumnTransformer

Base class for single-column transformers.

Such transformers are applied independently to each column by ApplyToCols; see the docstring of ApplyToCols for more information.

Single-column transformers do not need to inherit from this class to work with ApplyToCols, but doing so avoids some boilerplate:

  • The required __single_column_transformer__ attribute is set.

  • fit is defined (calls fit_transform and discards the result).

  • fit, transform and fit_transform are wrapped to check that the input is a single column and raise a ValueError with a helpful message when it is not.

  • A note about single-column transformers (vs dataframe transformers) is added after the summary line of the docstring.

Subclasses must define fit_transform and transform (or inherit them from another superclass).
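The marker attribute and the input check listed above can be pictured with a small stand-in. This is a minimal sketch, not skrub's implementation: SketchBase, _check_single_column, Doubler and FakeFrame are hypothetical names, plain lists stand in for pandas or polars Series, and the rule that anything exposing a columns attribute counts as a dataframe is an assumption made for illustration.

```python
import functools


def _check_single_column(method):
    """Wrap ``method`` so it rejects dataframe-like inputs (sketch only)."""

    @functools.wraps(method)
    def wrapper(self, column, *args, **kwargs):
        # Assumption: anything exposing ``columns`` is treated as a dataframe.
        if hasattr(column, "columns"):
            raise ValueError(
                f"{type(self).__name__}.{method.__name__} expects a single "
                "column, but received a dataframe-like object."
            )
        return method(self, column, *args, **kwargs)

    return wrapper


class SketchBase:
    """Hypothetical stand-in for a single-column base class."""

    # Marker attribute named in the list above.
    __single_column_transformer__ = True

    def __init_subclass__(cls, **kwargs):
        super().__init_subclass__(**kwargs)
        # Wrap the methods that the subclass itself defines.
        for name in ("fit", "fit_transform", "transform"):
            if name in cls.__dict__:
                setattr(cls, name, _check_single_column(cls.__dict__[name]))


class Doubler(SketchBase):
    def fit_transform(self, column, y=None):
        return self.transform(column)

    def transform(self, column):
        return [2 * value for value in column]


class FakeFrame:
    """Hypothetical dataframe-like object used to trigger the check."""

    columns = ["a", "b"]
```

With this stand-in, Doubler().transform([1, 2]) goes through the check and runs normally, while Doubler().transform(FakeFrame()) raises a ValueError before the subclass code is reached.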

Methods

  • fit(column[, y]): Fit the transformer.

  • get_feature_names_out([input_features]): Get the output feature names.

  • get_params([deep]): Get parameters for this estimator.

  • set_output(*[, transform]): Default no-op implementation for set_output.

  • set_params(**params): Set the parameters of this estimator.

fit(column, y=None, **kwargs)

Fit the transformer.

This default implementation simply calls fit_transform() and returns self.

Subclasses should implement fit_transform and transform.

Parameters:
column : a pandas or polars Series

Unlike most scikit-learn transformers, single-column transformers transform a single column, not a whole dataframe.

y : column or dataframe

Prediction targets.

**kwargs

Extra named arguments are passed to self.fit_transform().

Returns:
self

The fitted transformer.
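The default behaviour described above, where fit delegates to fit_transform, forwards extra keyword arguments and returns self, can be sketched as follows. SketchBase and MeanCenterer are hypothetical names, round_to is a made-up extra argument, and plain lists stand in for pandas or polars Series; this is an illustration under those assumptions, not skrub's code.

```python
class SketchBase:
    """Hypothetical stand-in showing only the default ``fit``."""

    def fit(self, column, y=None, **kwargs):
        # Delegate to fit_transform, discard its output, and return self.
        self.fit_transform(column, y=y, **kwargs)
        return self


class MeanCenterer(SketchBase):
    def fit_transform(self, column, y=None, round_to=None):
        # ``round_to`` is a made-up extra argument to show kwargs forwarding.
        mean = sum(column) / len(column)
        self.mean_ = mean if round_to is None else round(mean, round_to)
        return self.transform(column)

    def transform(self, column):
        return [value - self.mean_ for value in column]


# fit() returns the fitted transformer, so calls can be chained.
centerer = MeanCenterer().fit([1.0, 2.0, 6.0], round_to=0)
```

Because fit only delegates, a subclass that implements fit_transform and transform gets a working fit without writing one.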

get_feature_names_out(input_features=None)

Get the output feature names.

Parameters:
input_features : array_like of str, default=None

Input feature names. Ignored.

Returns:
list of str

The names of the output features.

get_params(deep=True)

Get parameters for this estimator.

Parameters:
deep : bool, default=True

If True, will return the parameters for this estimator and contained subobjects that are estimators.

Returns:
params : dict

Parameter names mapped to their values.
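The effect of deep is easiest to see on a nested estimator. The sketch below uses scikit-learn's Pipeline and StandardScaler rather than a SingleColumnTransformer subclass, on the assumption that get_params here follows the standard scikit-learn estimator behaviour; the step name "scale" is chosen only for illustration.

```python
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

pipe = Pipeline([("scale", StandardScaler())])

# deep=False: only the pipeline's own constructor parameters.
shallow = pipe.get_params(deep=False)

# deep=True: also the parameters of contained estimators,
# keyed as "<component>__<parameter>".
deep = pipe.get_params(deep=True)

print("scale__with_mean" in shallow)
print("scale__with_mean" in deep)
```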

set_output(*, transform=None)

Default no-op implementation for set_output.

Skrub transformers already output dataframes of the correct type by default so there is usually no need for set_output to do anything.

Subclasses are of course free to redefine set_output (e.g. by inheriting from TransformerMixin before SingleColumnTransformer).

Parameters:
transform : str or None, default=None

Ignored.

Returns:
SingleColumnTransformer

Returns self.

set_params(**params)

Set the parameters of this estimator.

The method works on simple estimators as well as on nested objects (such as Pipeline). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.

Parameters:
**params : dict

Estimator parameters.

Returns:
self : estimator instance

Estimator instance.
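The <component>__<parameter> naming can be illustrated on a scikit-learn Pipeline. This is a minimal sketch assuming that set_params here behaves like the standard scikit-learn estimator method; the step name "scale" is an arbitrary choice for the example.

```python
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

pipe = Pipeline([("scale", StandardScaler())])

# "<component>__<parameter>": update with_mean on the step named "scale".
returned = pipe.set_params(scale__with_mean=False)

# set_params returns the estimator itself, so calls can be chained.
print(returned is pipe)
print(pipe.named_steps["scale"].with_mean)
```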