skrub.DataOp.skb.get_vars
- DataOp.skb.get_vars(all_named_ops=False)
Get all the variables used in the DataOp.
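- Parameters:
all_named_ops : bool, default=False
If False, only variables are returned. If True, all nodes that have a name (and can therefore be passed in the environment) are returned as well.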
- Returns:
dict
Keys are names, and values are the corresponding DataOp objects.
Examples
>>> import skrub
>>> a = skrub.var("a")
>>> b = skrub.var("b")
>>> c = (a + b).skb.set_name("c")
>>> d = c + c
>>> d
<BinOp: add>
Our DataOp, d, contains 2 variables: “a” and “b”:
>>> d.skb.get_vars()
{'a': <Var 'a'>, 'b': <Var 'b'>}
Those are the keys for which we need to provide values in the environment when evaluating d:
>>> d.skb.eval({"a": 10, "b": 3})  # (10 + 3) + (10 + 3) = 26
26
In addition, we set a name on the internal node c. It is not a variable, and normally it is computed as (a + b). But as it has a name, we can override its output by passing a value for “c” in the environment. When we do, the computation of c never happens (nor of a or b, here, because they are only used to compute c) – it is bypassed and the provided value is used instead.
>>> d.skb.eval({"c": 7})  # 7 + 7 = 14
14
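The bypass described above can be pictured with a toy expression graph. This is only a sketch of the lookup order (environment first, computation second); the `Node` class and `add` helper are hypothetical and are not how skrub implements DataOps.

```python
# Toy model only: NOT skrub's implementation, just the lookup order described above.
class Node:
    def __init__(self, name=None, inputs=(), func=None):
        self.name = name      # optional name, usable as an environment key
        self.inputs = inputs  # nodes this node is computed from
        self.func = func      # None for a variable, which has no computation

    def eval(self, env):
        # A named node whose name is in the environment is never computed,
        # so its inputs are never visited either.
        if self.name is not None and self.name in env:
            return env[self.name]
        if self.func is None:
            raise KeyError(f"no value provided for variable {self.name!r}")
        return self.func(*(node.eval(env) for node in self.inputs))


def add(x, y):
    return x + y


a = Node("a")
b = Node("b")
c = Node("c", inputs=(a, b), func=add)  # named internal node
d = Node(inputs=(c, c), func=add)       # unnamed node: c + c

full = d.eval({"a": 10, "b": 3})  # computes c from a and b
overridden = d.eval({"c": 7})     # a and b are never looked up
print(full, overridden)
```

In this toy model, evaluating with only "c" succeeds even though no values are given for "a" or "b", because the lookup stops at the named node.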
If we want get_vars to also list nodes like our example c, which have a name and can be passed in the environment, we pass all_named_ops=True:
>>> d.skb.get_vars(all_named_ops=True)
{'a': <Var 'a'>, 'b': <Var 'b'>, 'c': <c | BinOp: add>}
Note
get_vars can be particularly useful when we have a learner (e.g. loaded from a pickle file) and we want to check what inputs we should pass to its methods such as fit and transform:
>>> learner = d.skb.make_learner()
>>> list(learner.data_op.skb.get_vars().keys())
['a', 'b']
The output above tells us what keys the dict we pass to learner.fit() should contain:
>>> learner.fit({'a': 2, 'b': 3})
SkrubLearner(data_op=<BinOp: add>)
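As a possible follow-up to the Note, the keys returned by get_vars can be compared with an environment before calling the learner. The helper below, `missing_inputs`, is a hypothetical name and not part of skrub. A plain dict stands in for the result of `learner.data_op.skb.get_vars()` so that the sketch runs without skrub installed.

```python
def missing_inputs(required, env):
    """Return the sorted names in `required` that have no value in `env`.

    `required` is any mapping keyed by variable name, such as the dict
    returned by get_vars(); only its keys are used.
    """
    return sorted(set(required) - set(env))


# Stand-in for learner.data_op.skb.get_vars(); the values are irrelevant here.
required = {"a": None, "b": None}

complete = missing_inputs(required, {"a": 2, "b": 3})
incomplete = missing_inputs(required, {"a": 2})
print(complete, incomplete)
```

A non-empty result is a hint rather than a definite error: as shown earlier with "c", passing a value for a named node can make some variables unnecessary.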