DAG Syntax Operation Hooks
DAG syntax operation hooks (short: operation hooks) help to make the specification of data transformations more concise and powerful.
A hook consists of a callable that is attached to a certain operation name, e.g. expression, and is invoked after the DAG syntax parser has extracted all arguments.
The hook can manipulate the given operation, args and kwargs arguments prior to the creation of the Transformation object.
For the integration into the transformation framework, see here.
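To illustrate the mechanism described above, here is a minimal sketch of how a hook could be attached to an operation name and invoked on the parsed arguments. The registry HOOKS, the helper apply_hook and the example hook double_first_arg_hook are hypothetical names chosen for illustration; they are not dantro's actual implementation, and the hook signature shown is an assumption.

```python
from typing import Any, Callable, Dict, Tuple

# A hook returns the (possibly modified) operation name, args, and kwargs
HookResult = Tuple[str, tuple, dict]

# Hypothetical registry mapping operation names to hook callables;
# this mirrors the idea described in the text, not dantro's internals.
HOOKS: Dict[str, Callable[..., HookResult]] = {}


def double_first_arg_hook(operation: str, *args: Any, **kwargs: Any) -> HookResult:
    """Illustrative hook: doubles the first positional argument, if any."""
    new_args = (args[0] * 2,) + args[1:] if args else args
    return operation, new_args, kwargs


# Attach the hook to a (made-up) operation name
HOOKS["my_operation"] = double_first_arg_hook


def apply_hook(operation: str, *args: Any, **kwargs: Any) -> HookResult:
    """Invoke the hook registered for ``operation``; pass through otherwise."""
    hook = HOOKS.get(operation)
    if hook is None:
        return operation, args, kwargs
    return hook(operation, *args, **kwargs)
```

In this sketch, operations without a registered hook pass through unchanged, so the subsequent object creation step always receives a consistent (operation, args, kwargs) triple.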
Available Hooks
The following hooks are available by default:
In [1]: ", ".join(dantro.data_ops.DAG_PARSER_OPERATION_HOOKS)
Out[1]: 'expression'
The section titles below use the operation name of the hooks they are triggered by.
expression
The op_hook_expression() hook prepares arguments for the expression() operation, making it more convenient to perform symbolic math operations with entities defined in the DAG.
It tries to extract the free symbols from the expression string and turns them into DAGTag objects of the same name.
For example, with the tags a, b, and c:
transform:
  - define: 2
    tag: a
  - define: 3
    tag: b
  - define: 1.5
    tag: c
  - expression: a**b / (c - 1.)
    tag: result
The parser and the hook transform the expression operation node into:
operation: expression
args: ["a**b / (c - 1.)"]
kwargs:
  symbols:
    a: !dag_tag a
    b: !dag_tag b
    c: !dag_tag c
This removes the need to specify the kwargs.symbols argument manually, which saves a lot of typing.
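The symbol extraction step can be sketched as follows. This is only an approximation under stated assumptions: it uses Python's ast module as a stand-in for sympy's parsing (so it only works for expressions that are also valid Python, and it would treat function names as symbols), the Tag class is a hypothetical placeholder for a DAG tag reference, and letting explicitly passed symbols take precedence is an assumption rather than documented behaviour.

```python
import ast
from dataclasses import dataclass


@dataclass(frozen=True)
class Tag:
    """Hypothetical stand-in for a DAG tag reference (not dantro's DAGTag)."""
    name: str


def find_free_symbols(expr: str) -> list:
    """Return the sorted names appearing in a Python-parsable expression."""
    tree = ast.parse(expr, mode="eval")
    return sorted({n.id for n in ast.walk(tree) if isinstance(n, ast.Name)})


def expression_hook_sketch(operation: str, *args, **kwargs):
    """Add a ``symbols`` kwarg that maps each free symbol to a tag reference."""
    expr = args[0]
    symbols = {name: Tag(name) for name in find_free_symbols(expr)}
    # Assumption: explicitly given symbols override the auto-detected ones
    symbols.update(kwargs.get("symbols", {}))
    return operation, (expr,), {**kwargs, "symbols": symbols}
```

Applied to the expression from the example, find_free_symbols("a**b / (c - 1.)") yields the names a, b and c. With the defined values a = 2, b = 3 and c = 1.5, the expression itself evaluates to 2**3 / 0.5 = 16.0.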
Note
The define operation in the above example is just a trivial example of an operation; instead of defining extra DAG nodes, it would be much easier to simply add the parameters to the expression directly.
Typically, nodes a, b, c would be the result of some prior, more complicated expression, e.g. using any of the other available operations.
Warning
If using the expression operation as part of a meta-operation, make sure to refer to these tags inside the meta-operation in some way.
See the Remarks & Caveats there for more information.
Furthermore, if any of the symbols are called prev or previous_result, they are turned into DAGNode references to the previous node, similar to the !dag_prev YAML tag:
transform:
  - define: 3
    tag: x
  - define: 2.  # ... or something more complicated
  - expression: 1 - prev/(1 + x)  # Reusing ``prev`` here. :tada:
    tag: result
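The distinction between regular symbols and the special prev and previous_result names can be sketched like this. As before, Python's ast module stands in for sympy's parsing, and the helper names classify_symbols and evaluate are hypothetical; the string labels merely indicate which kind of reference each symbol would be turned into.

```python
import ast

# Symbol names that refer to the previous node, as named in the text above
PREV_NAMES = ("prev", "previous_result")


def classify_symbols(expr: str) -> dict:
    """Map each free symbol to the kind of reference it would become."""
    tree = ast.parse(expr, mode="eval")
    names = {n.id for n in ast.walk(tree) if isinstance(n, ast.Name)}
    return {
        name: "previous node" if name in PREV_NAMES else "tag"
        for name in sorted(names)
    }


def evaluate(expr: str, **values) -> float:
    """Evaluate the expression with the given symbol values (no builtins)."""
    return float(eval(expr, {"__builtins__": {}}, values))


kinds = classify_symbols("1 - prev/(1 + x)")
result = evaluate("1 - prev/(1 + x)", prev=2.0, x=3)
```

For the transformation above, x = 3 and the previous node holds 2., so the expression works out to 1 - 2/(1 + 3) = 0.5.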
Hint
The hook also makes the expression operation more robust in cases where with_previous_result is set.
As the previous result is inserted as the first positional argument, this would normally produce invalid syntax for the expression() operation.
By default, the expression() operation will attempt to cast the result to a floating point value, which is more compatible with other operations.
However, this default prevents working with the symbolic math features of sympy.
If you would like to keep symbolic expressions, specify the astype argument accordingly:
transform:
  - np.array: [[1, 2, 3]]
    tag: arr
  - .mean: !dag_tag arr
    tag: a
  - .sum: !dag_tag arr
    tag: b
  - define: 10
    tag: c
  - operation: expression
    args: [a**2]
    kwargs:
      astype: ~
    tag: equation_one
  - operation: expression
    args: [b**2]
    kwargs:
      astype: ~
    tag: equation_two
  - expression: (equation_one + equation_two) * c**2
    tag: result
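As a plain arithmetic cross-check of the numbers in this example, the following sketch reproduces the computation with ordinary Python values. It deliberately sidesteps sympy and the DAG framework entirely, so the intermediate results here are plain numbers rather than symbolic expressions; the final float() call only imitates the default cast to a floating point value described above.

```python
# Values corresponding to the tagged nodes in the example
values = [1, 2, 3]             # contents of ``arr``
a = sum(values) / len(values)  # mean of ``arr``
b = sum(values)                # sum of ``arr``
c = 10

# Intermediate expressions (numeric here, symbolic in the DAG example)
equation_one = a**2
equation_two = b**2

# Final expression, cast to float like the default behaviour
result = float((equation_one + equation_two) * c**2)
```

With a = 2.0 and b = 6, the two intermediate values are 4.0 and 36, giving (4 + 36) * 10**2 = 4000.0 as the final result.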
In the DAG visualization this would look like this: