DAG Syntax Operation Hooks

DAG syntax operation hooks (short: operation hooks) help to make the specification of data transformations more concise and powerful.

A hook consists of a callable that is attached to a certain operation name, e.g. expression, and is invoked after the DAG syntax parser has extracted all arguments. The hook can manipulate the given operation, args, and kwargs arguments before the Transformation object is created.

For the integration into the transformation framework, see here.
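To make the idea of a hook more concrete, here is a minimal, self-contained sketch of a callable that is looked up by operation name and receives the parsed arguments. The registry `EXAMPLE_HOOKS`, the helper `apply_hook`, and the exact signature are assumptions made for illustration; they do not reproduce dantro's actual implementation.

```python
from typing import Callable, Dict, Tuple

# Return type assumed for this sketch: the (possibly modified) operation
# name, positional arguments, and keyword arguments.
HookResult = Tuple[str, tuple, dict]

# Hypothetical registry mapping operation names to hook callables.
EXAMPLE_HOOKS: Dict[str, Callable[..., HookResult]] = {}


def example_hook(operation: str, *args, **kwargs) -> HookResult:
    """Illustrative hook: makes sure a ``symbols`` entry exists in kwargs."""
    kwargs.setdefault("symbols", {})
    return operation, args, kwargs


EXAMPLE_HOOKS["expression"] = example_hook


def apply_hook(operation: str, args: tuple, kwargs: dict) -> HookResult:
    """Invoke the hook registered for ``operation``, if there is one."""
    hook = EXAMPLE_HOOKS.get(operation)
    if hook is None:
        # No hook registered: arguments pass through unchanged
        return operation, args, kwargs
    return hook(operation, *args, **kwargs)
```

In this sketch, `apply_hook("expression", ("a + b",), {})` returns the arguments with an added empty `symbols` entry, while an operation without a registered hook is passed through untouched.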


Available Hooks

The following hooks are available by default:

In [1]: ", ".join(dantro.data_ops.DAG_PARSER_OPERATION_HOOKS)
Out[1]: 'expression'

The section titles below use the operation name of the hooks they are triggered by.

expression

The op_hook_expression() hook prepares arguments for the expression() operation, making it more convenient to perform symbolic math operations with entities defined in the DAG.

It tries to extract the free symbols from the expression string and turns them into DAGTag objects of the same name. For example, with the tags a, b, and c:

transform:
  - define: 2
    tag: a
  - define: 3
    tag: b
  - define: 1.5
    tag: c
  - expression: a**b / (c - 1.)
    tag: result

The parser and the hook transform the expression operation node into:

operation: expression
args: ["a**b / (c - 1.)"]
kwargs:
  symbols:
    a: !dag_tag a
    b: !dag_tag b
    c: !dag_tag c

This removes the need to specify the kwargs.symbols argument manually, which saves a lot of typing.
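The symbol extraction described above can be sketched in a few lines. This sketch uses Python's standard `ast` module as a stand-in for whatever parsing the actual hook performs, and it represents tag references as plain strings that mirror the YAML above; the function names `free_names` and `build_symbols` are hypothetical.

```python
import ast


def free_names(expr: str) -> list:
    """Return the sorted, unique variable names appearing in ``expr``."""
    tree = ast.parse(expr.strip(), mode="eval")
    return sorted(
        {node.id for node in ast.walk(tree) if isinstance(node, ast.Name)}
    )


def build_symbols(expr: str) -> dict:
    """Map each name to a placeholder string standing in for a tag reference."""
    return {name: f"!dag_tag {name}" for name in free_names(expr)}


# Assemble a dict with the same shape as the transformed node shown above
node = {
    "operation": "expression",
    "args": ["a**b / (c - 1.)"],
    "kwargs": {"symbols": build_symbols("a**b / (c - 1.)")},
}
```

Note that this stand-in also reports function names such as `sin` in `sin(x)` as names, which may differ from how a symbolic math parser identifies free symbols.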

Note

The define operation in the above example is just a trivial example of an operation; instead of defining extra DAG nodes, it would be much easier to simply add the parameters to the expression directly.

Typically, nodes a, b, c would be the result of some prior, more complicated expression, e.g. using any of the other available operations.

Warning

If using the expression operation as part of a meta-operation, make sure to refer to these tags inside the meta-operation in some way. See the Remarks & Caveats there for more information.

Furthermore, if any of the symbols are called prev or previous_result, they are turned into DAGNode references to the previous node, similar to the !dag_prev YAML tag:

transform:
  - define: 3
    tag: x
  - define: 2.                       # ... or something more complicated
  - expression: 1 - prev/(1 + x)     # Reusing ``prev`` here. :tada:
    tag: result
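The special treatment of prev and previous_result can be illustrated with a small sketch. The classes `TagRef` and `PrevRef` are hypothetical stand-ins for tag and previous-node references, and name extraction again relies on Python's `ast` module rather than on the library's own parsing.

```python
import ast
from dataclasses import dataclass


@dataclass(frozen=True)
class TagRef:
    """Hypothetical stand-in for a reference to a tagged node."""
    name: str


@dataclass(frozen=True)
class PrevRef:
    """Hypothetical stand-in for a reference to the previous node."""


# Symbol names that refer to the previous node instead of a tag
PREV_NAMES = ("prev", "previous_result")


def resolve_symbols(expr: str) -> dict:
    """Map names in ``expr`` to tag references or previous-node references."""
    tree = ast.parse(expr.strip(), mode="eval")
    names = {n.id for n in ast.walk(tree) if isinstance(n, ast.Name)}
    return {
        name: PrevRef() if name in PREV_NAMES else TagRef(name)
        for name in sorted(names)
    }


symbols = resolve_symbols("1 - prev/(1 + x)")
```

Here, `symbols` maps `prev` to a `PrevRef` and `x` to a `TagRef`. With x = 3 and a previous result of 2.0, as in the YAML above, the expression evaluates to 1 - 2/4 = 0.5.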

Hint

The hook also makes the expression operation more robust in cases where with_previous_result is set. As the previous result is inserted as the first positional argument, this would normally produce invalid syntax for the expression() operation.

By default, the expression() operation will attempt to cast the result to a floating point value, which is more compatible with other operations. However, this default prevents you from working with the symbolic math features of sympy. If you would like to keep symbolic expressions, specify the astype argument accordingly.

transform:
  - np.array: [[1, 2, 3]]
    tag: arr
  - .mean: !dag_tag arr
    tag: a
  - .sum: !dag_tag arr
    tag: b
  - define: 10
    tag: c

  - operation: expression
    args: [a**2]
    kwargs:
      astype: ~
    tag: equation_one
  - operation: expression
    args: [b**2]
    kwargs:
      astype: ~
    tag: equation_two

  - expression: (equation_one + equation_two) * c**2
    tag: result

In the DAG visualization, this looks as follows:

[Figure: DAG visualization]
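As a sanity check on the numbers in the last transformation, the same arithmetic can be reproduced in plain Python. This sketch bypasses the DAG framework and sympy entirely, so the intermediate values here are ordinary numbers rather than symbolic objects; it only illustrates which value the result tag should correspond to.

```python
from statistics import mean

arr = [1, 2, 3]
a = mean(arr)   # corresponds to the ``.mean`` node
b = sum(arr)    # corresponds to the ``.sum`` node
c = 10          # corresponds to the ``define`` node

# The two intermediate expressions, then the final combined one
equation_one = a**2
equation_two = b**2
result = (equation_one + equation_two) * c**2
```

With these inputs, `result` is (4 + 36) * 100 = 4000.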