optional
- skrub.optional(value, *, name=None, default=value)
A choice between value and None.
Typically, value is an estimator and this choice is passed to DataOp.skb.apply(). optional builds a branch in the search space where value is a component of the pipeline (e.g., a dimensionality reduction step or a feature selection step) that may or may not be present. When the component is not present, it is represented by None. This is equivalent to skrub.choose_from([value, None], name=name).
When a learner is fitted without hyperparameter tuning, the outcome of this choice is value. Pass default=None to make None the default outcome.
- Parameters:
- value : object
The outcome (when None is not chosen).
- name : str, optional (default=None)
If not None, name is used when displaying search results and can also be used to override the choice's value by setting it in the environment containing a learner's inputs.
- default : NoneType, optional
An optional is a choice between the provided value and None. Normally, the default outcome when a learner is used without hyperparameter tuning is the provided value. Pass default=None to make the alternative outcome, None, the default. None is the only allowed value for this parameter.
- Returns:
- Choice
An object representing this choice, which can be used in a skrub learner.
Examples
optional is useful for optional steps in a DataOps plan or a learner. If we want to try a learner with or without dimensionality reduction, we can add a step such as:
>>> from sklearn.decomposition import PCA
>>> from skrub import optional
>>> optional(PCA(), name='use dim reduction')
optional(PCA(), name='use dim reduction')
The constructed parameter grid will include one branch of the plan with the PCA and one without:
>>> print(
...     optional(PCA(), name='dim reduction').as_data_op().skb.describe_param_grid()
... )
- dim reduction: [PCA(), None]
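To see why two branches appear, it can help to picture how a grid of named choices expands into candidate configurations. The following is a minimal pure-Python sketch, not skrub's implementation: the expand_grid helper and the second choice, "scaling", are hypothetical, and strings stand in for estimators. It assumes that a grid search simply enumerates the Cartesian product of each choice's outcomes.

```python
from itertools import product


def expand_grid(choices):
    """Return every combination of outcomes for a dict of named choices.

    Hypothetical helper for illustration; skrub builds its grids internally.
    """
    names = list(choices)
    outcome_lists = [choices[name] for name in names]
    return [dict(zip(names, combo)) for combo in product(*outcome_lists)]


# Strings stand in for estimators; None means "step absent".
grid = {
    "dim reduction": ["PCA()", None],
    "scaling": ["StandardScaler()", None],  # assumed second optional step
}

candidates = expand_grid(grid)
for candidate in candidates:
    print(candidate)
```

Under this assumption, each optional step doubles the number of candidates, so two optional steps yield four configurations.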
When a learner that contains an optional step is used without hyperparameter tuning, the default outcome is the provided value.
>>> print(optional(PCA()).default())
PCA()
This can be overridden by passing default=None:
>>> print(optional(PCA(), default=None).default())
None
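The default rule can be summarised with a toy model. This is a sketch of the behaviour described above, not skrub's code: ToyOptional and the _USE_VALUE sentinel are hypothetical names, chosen only to show how default=None can be told apart from "no default given".

```python
_USE_VALUE = object()  # hypothetical sentinel meaning "default not specified"


class ToyOptional:
    """Toy stand-in for a choice between a value and None."""

    def __init__(self, value, *, name=None, default=_USE_VALUE):
        # Mirror the documented rule: None is the only explicit default allowed.
        if default is not _USE_VALUE and default is not None:
            raise ValueError("default must be None or left unspecified")
        self.outcomes = [value, None]
        self.name = name
        self._default = value if default is _USE_VALUE else None

    def default(self):
        """Outcome used when no hyperparameter tuning takes place."""
        return self._default


with_step = ToyOptional("PCA()")
without_step = ToyOptional("PCA()", default=None)
print(with_step.default())
print(without_step.default())
```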
In practice, optional is used with DataOp.skb.apply() to make the application of a transformer optional. For example, if we want to make the application of PCA optional, we can do:
>>> import skrub
>>> from skrub.datasets import toy_products
>>> from sklearn.decomposition import PCA
>>> products = skrub.var("products", toy_products())
>>> vectorized = products.skb.apply(skrub.TableVectorizer())
>>> reduced = vectorized.skb.apply(skrub.optional(PCA(n_components=2), name="pca"))
>>> print(reduced.skb.describe_param_grid())
- pca: [PCA(n_components=2), None]
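Conceptually, applying an optional step amounts to skipping it when the chosen outcome is None. The sketch below illustrates that idea in plain Python. The apply_step function and the Halve transformer are hypothetical, and the sketch assumes that a None outcome means the data passes through unchanged; it does not reproduce how skrub handles this internally.

```python
class Halve:
    """Hypothetical transformer that halves every number in a list."""

    def fit_transform(self, values):
        return [v / 2 for v in values]


def apply_step(values, step):
    """Apply step to values, or pass them through when step is None."""
    if step is None:
        return values
    return step.fit_transform(values)


data = [2.0, 4.0, 6.0]
transformed = apply_step(data, Halve())  # branch with the step
untouched = apply_step(data, None)       # branch without the step
print(transformed)
print(untouched)
```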
If we perform a hyperparameter search (for example with DataOp.skb.make_grid_search()), both pipelines (with and without a PCA) will be considered and the one giving the best predictions will be selected.
See also
- choose_bool :
Construct a choice between False and True.
- choose_float :
Construct a choice of floating-point numbers from a numeric range.
- choose_from :
Construct a choice among several possible outcomes.
- choose_int :
Construct a choice of integers from a numeric range.