var
- skrub.var(name, value=None)
Create a skrub variable.
Variables represent inputs to a DataOps plan and to the learner built from it. They can be combined with other variables, constants, operators, function calls, etc. to build up complex DataOps, which implicitly define the plan.
See the example gallery for more information about skrub DataOps.
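For instance, here is a minimal sketch (the 'orders' variable, its pandas DataFrame value, and the 'amount' column are made-up names for illustration): item access and method calls on a variable are recorded as steps of the plan and evaluated on the preview value:

>>> import skrub
>>> import pandas as pd
>>> orders = skrub.var('orders', pd.DataFrame({'amount': [10, 20, 30]}))
>>> total = orders['amount'].sum()
>>> print(total.skb.eval())
60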
- Parameters:
- name : str
  The name for this input. It corresponds to a key in the dictionary that is passed to the learner's fit() method (see Examples below). Names must be unique within a learner and must not start with "_skrub_".
- value : object, optional
  Optionally, an initial value can be given to the variable. When it is available, it is used to provide a preview of the learner's results, to detect errors in the learner early, and to provide better help and tab-completion in interactive Python shells.
- Returns:
- A skrub variable
- Raises:
- TypeError
If the provided value is a skrub DataOp or a skrub choose_* function.
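A small sketch of this constraint (the exact error message may differ):

>>> import skrub
>>> a = skrub.var('a', 1)
>>> skrub.var('b', a + 1)
Traceback (most recent call last):
    ...
TypeError: ...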
Examples
Variables without a value:
>>> import skrub
>>> a = skrub.var('a')
>>> a
<Var 'a'>
>>> b = skrub.var('b')
>>> c = a + b
>>> c
<BinOp: add>
>>> print(c.skb.describe_steps())
Var 'a'
Var 'b'
BinOp: add
The names of the variables correspond to keys in the dictionary of inputs:
>>> c.skb.eval({'a': 10, 'b': 6})
16
They also correspond to the keys of the inputs passed to the learner built from the DataOps plan:
>>> learner = c.skb.make_learner()
>>> learner.fit_transform({'a': 5, 'b': 4})
9
When a value is provided, we get a preview of what the learner produces for those values:
>>> a = skrub.var('a', 2)
>>> b = skrub.var('b', 3)
>>> b
<Var 'b'>
Result:
―――――――
3
>>> c = a + b
>>> c
<BinOp: add>
Result:
―――――――
5
The values are also used as defaults for eval():

>>> c.skb.eval()
5
But we can still override them, and inputs must be provided explicitly when using the learner returned by .skb.make_learner():

>>> c.skb.eval({'a': 10, 'b': 6})
16
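Continuing the example above, a learner built with .skb.make_learner() does not fall back to the preview values, so the inputs are passed explicitly:

>>> learner = c.skb.make_learner()
>>> learner.fit_transform({'a': 10, 'b': 6})
16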
Much more information about skrub variables is provided in the examples gallery.
Gallery examples
Introduction to machine-learning pipelines with skrub DataOps
Multiple tables: building machine learning pipelines with DataOps