DAG Syntax Operation Hooks
DAG syntax operation hooks (short: operation hooks) help to make the specification of data transformations more concise and powerful.
A hook consists of a callable that is attached to a certain operation name, e.g. expression, and is invoked after the DAG syntax parser has extracted all arguments.
The hook can manipulate the given arguments, such as kwargs, before the operation node is created.
For the integration into the transformation framework, see here.
The following hooks are available by default:
    In : ", ".join(dantro.data_ops.DAG_PARSER_OPERATION_HOOKS)
    Out: 'expression'
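The general mechanism, a callable registered under an operation name and invoked on the parsed arguments, can be sketched in plain Python. This is only an illustration under stated assumptions: the names `HOOKS`, `register_hook`, `apply_hook`, and the hook signature are hypothetical and are not taken from dantro.

```python
from typing import Callable, Dict, Tuple

# Hypothetical hook signature (an assumption, not dantro's API): a hook gets
# the parsed operation name, positional args, and keyword args, and returns
# possibly modified versions of all three.
Hook = Callable[[str, list, dict], Tuple[str, list, dict]]

# Hypothetical registry that maps operation names to hooks
HOOKS: Dict[str, Hook] = {}


def register_hook(operation: str):
    """Decorator that attaches a callable to an operation name."""
    def decorator(func: Hook) -> Hook:
        HOOKS[operation] = func
        return func
    return decorator


@register_hook("expression")
def expression_hook(operation, args, kwargs):
    # Illustrative behaviour only: make sure a ``symbols`` entry exists
    # without overwriting anything the user specified explicitly.
    kwargs = dict(kwargs)
    kwargs.setdefault("symbols", {})
    return operation, list(args), kwargs


def apply_hook(operation, args, kwargs):
    """Invoke the hook for ``operation`` if one is registered."""
    hook = HOOKS.get(operation)
    if hook is None:
        # No hook registered: pass the arguments through unchanged
        return operation, list(args), dict(kwargs)
    return hook(operation, args, kwargs)
```

Operations without a registered hook pass through unchanged, so adding a hook for one operation does not affect any other.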
The section titles below use the operation name of the hooks they are triggered by.
The expression hook tries to extract the free symbols from the expression string and turns them into DAGTag objects of the same name.
For example, with the tags a, b, and c defined as follows:

    transform:
      - define: 2
        tag: a
      - define: 3
        tag: b
      - define: 1.5
        tag: c
      - expression: a**b / (c - 1.)
        tag: result
The parser and the hook transform the expression operation node into:

    operation: expression
    args: ["a**b / (c - 1.)"]
    kwargs:
      symbols:
        a: !dag_tag a
        b: !dag_tag b
        c: !dag_tag c
This removes the need to specify the kwargs.symbols argument manually, thus saving a lot of typing.
The define operation in the above example is just a trivial example of an operation; instead of defining extra DAG nodes, it would be much easier to simply add the parameters to the expression directly.
In practice, a symbol such as c would be the result of some prior, more complicated operation, e.g. using any of the other available operations.
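The symbol extraction step can be illustrated with a minimal sketch. It uses the standard library `ast` module as a stand-in for whatever parser the hook actually uses, and represents each tag reference as a plain `("dag_tag", name)` tuple; both choices, as well as the function names, are assumptions made for illustration.

```python
import ast


def extract_symbol_names(expr: str) -> list:
    """Return the sorted names that appear as free symbols in ``expr``.

    Assumption: the expression is also valid Python syntax. Note that
    function names such as ``sin`` would be picked up as well.
    """
    tree = ast.parse(expr, mode="eval")
    names = {node.id for node in ast.walk(tree) if isinstance(node, ast.Name)}
    return sorted(names)


def build_symbols(expr: str) -> dict:
    """Map each symbol to a placeholder for a tag reference of the same name."""
    # ("dag_tag", name) is a hypothetical stand-in for a DAGTag object
    return {name: ("dag_tag", name) for name in extract_symbol_names(expr)}


symbols = build_symbols("a**b / (c - 1.)")
print(symbols)
```

The resulting mapping has one entry per symbol (a, b, and c), which mirrors the structure of the kwargs.symbols entry in the transformed node.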
Furthermore, if any of the symbols are called prev or previous_result, they are turned into DAGNode references to the previous node, similar to the !dag_prev YAML tag:
    transform:
      - define: 3
        tag: x
      - define: 2.   # ... or something more complicated
      - expression: 1 - prev/(1 + x)   # Reusing ``prev`` here.
        tag: result
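A rough sketch of this special-casing, again using the standard library `ast` module and plain tuples as hypothetical stand-ins for DAGTag and DAGNode references, assuming that prev and previous_result are the reserved names. The last lines check the arithmetic of the example with the values 2.0 for the previous node and 3 for x.

```python
import ast

# Assumption: these symbol names refer to the previous node's result
PREV_NAMES = ("prev", "previous_result")


def resolve_symbols(expr: str) -> dict:
    """Map symbols to hypothetical placeholders for DAG references."""
    tree = ast.parse(expr, mode="eval")
    names = sorted({n.id for n in ast.walk(tree) if isinstance(n, ast.Name)})
    return {
        # ("dag_prev",) stands in for a reference to the previous node,
        # ("dag_tag", name) for a reference to the node tagged ``name``
        name: ("dag_prev",) if name in PREV_NAMES else ("dag_tag", name)
        for name in names
    }


expr = "1 - prev/(1 + x)"
resolved = resolve_symbols(expr)
print(resolved)

# Numeric check of the example: prev = 2.0 (previous node), x = 3
value = eval(expr, {"__builtins__": {}}, {"prev": 2.0, "x": 3})
print(value)  # 0.5
```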
The hook also makes the expression operation more robust in cases where with_previous_result is set.
As the previous result is inserted as the first positional argument, this would normally produce invalid syntax for the expression operation.
By default, the expression() operation will attempt to cast the result to a floating point value, which is more compatible with other operations.
However, this default prevents working with the symbolic math features of sympy.
If you would like to keep symbolic expressions, specify the astype argument accordingly.
    transform:
      - np.array: [[1, 2, 3]]
        tag: arr
      - .mean: !dag_tag arr
        tag: a
      - .sum: !dag_tag arr
        tag: b
      - define: 10
        tag: c
      - operation: expression
        args: [a**2]
        kwargs:
          astype: ~
        tag: equation_one
      - operation: expression
        args: [b**2]
        kwargs:
          astype: ~
        tag: equation_two
      - expression: (equation_one + equation_two) * c**2
        tag: result
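As a cross-check of the numbers in this example, the same arithmetic can be written in plain Python. This sketch only reproduces the final numeric value; it does not model the intermediate symbolic objects that astype: ~ preserves, and it uses plain lists instead of numpy to stay self-contained.

```python
# Plain-Python stand-ins for the tagged nodes of the example
arr = [[1, 2, 3]]
flat = [v for row in arr for v in row]

a = sum(flat) / len(flat)   # mean of the array: 2.0
b = sum(flat)               # sum of the array: 6
c = 10

equation_one = a**2         # 4.0
equation_two = b**2         # 36

result = (equation_one + equation_two) * c**2
print(result)  # 4000.0
```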
In the DAG visualization, the YAML example above would look like this: