Plot Data Selection#
This page describes how the plot creators can make data for plotting available in a programmatic fashion.
Each plot creator is associated with a DataManager
instance, which holds all the data that is currently available.
This data is usually made available to you so that you can select what you need and pass it on to whatever you use for plotting.
While manual selection directly from the data manager suffices for specific cases, automation is often desired.
The uniform interface of the DataManager
paired with the TransformationDAG
framework makes automated data selection and transformation for plotting possible while providing all the benefits of the Data Transformation Framework:
Generic application of transformations on data
Fully configuration-based interface
Caching of computationally expensive results
This functionality is embedded at the level of the BasePlotCreator, making it available for all plot creators and allowing subclasses to tailor it to their needs.
Additionally, result placeholders can be specified inside the plot configuration, so that transformation results can be used not only for data selection but also for programmatically determining other configuration parameters.
General remarks#
This section holds information that is valid for all plot creators.
Enabling DAG usage#
If using the recommended plot function signature, the use_dag key can be specified directly in the is_plot_func decorator, which enables the data transformation framework.
This declares that the plot function expects data selection to occur via the transformation framework.
After computation, the results are made available to the selected Python plot function via the data keyword argument, which is a dictionary that maps each computed tag to its result.
Setting use_dag in the plot configuration#
Alternatively, DAG usage can be controlled via the use_dag argument in the plot configuration:
# Some plot configuration file
---
my_plot:
  use_dag: true
  # ... more arguments
Hint
If setting use_dag in the plot configuration, take care not to create conflicts with the chosen plot function signature.
Arguments to control DAG behaviour#
You then have the following arguments available to control its behaviour:
select and transform: select data and perform transformations on it, see add_nodes().
compute_only: controls which tags are to be computed, see compute().
dag_options: passed to TransformationDAG initialization, e.g. to control file_cache_defaults, verbosity, or adding transformations via the define interface, see The define interface.
dag_visualization: controls visualization of the DAG, which can be very helpful for debugging, see below. These arguments are passed to _generate_DAG_vis().
The creation of the DAG and its computation is controlled by the chosen plot creator and can be specialized to suit that plot creator’s needs.
Note
If DAG usage is enabled, these arguments will be used exclusively for the DAG, i.e.: they are not available downstream in the plot creator or the plot function.
Hint
To use meta-operations for plot data selection, define them under the dag_options.meta_operations key of a plot configuration.
The same applies to adding nodes via the define interface (see The define interface), which is also only available via dag_options.define.
Also check out the dantro base plot configs, from which some pre-defined meta-operations can be included using based_on.
Hint
Specialized plot creators, like those based on paramspace operations, may implement an expanded syntax.
Example#
An example plot configuration that selects some containers from the data manager, performs simple transformations on them, and computes a result tag:
# Some plot configuration file
---
my_plot:
  creator: my_creator
  # ... some plot arguments here ...
  # Data selection via DAG framework
  use_dag: true
  select:
    foo: some/path/foo
    bar:
      path: some/path/bar
      transform:
        - mean: [!dag_prev ]
        - increment: [!dag_prev ]
  transform:
    - add: [!dag_tag foo, !dag_tag bar]
      tag: result
  compute_only: [result]
  dag_options:
    define:
      foo: bar
    verbosity: 3  # to show more profiling statistics (default: 1)
    file_cache_defaults:
      write: true
      read: true
    # ... other parameters here are passed on to TransformationDAG.__init__
DAG object caching#
For very complex data transformation sequences, DAGs can have many hundreds of thousands of nodes. In those cases, parsing the DAG configuration and creating the corresponding objects can be time-consuming and begin to noticeably prolong the plotting procedure.
To remedy this, the plotting framework implements memory-caching of TransformationDAG objects such that they can be re-used across multiple plots or repeated invocations of the same plot.
The cache is used if the DAG-related configuration parameters (transform, select, …) are equal, i.e. have equal results when serialized using repr.
In other words: if plots use the same data selection arguments, thus creating identical DAGs, the cache can be used.
Multiple aspects of caching can be controlled using the dag_object_cache parameter, passed via dag_options (see below):
read: whether to read from the cache (default: false)
write: whether to write to the cache (default: false)
use_copy: whether to read and write a deep copy of the TransformationDAG object to the cache (default: true).
clear: if set, will remove all objects from the cache (after reading from it) and trigger garbage collection (default: false)
collect_garbage: can be used to separately control garbage collection, e.g. to suppress it despite clear having been passed.
Warning
Only use use_copy: false if you can be certain that plot functions do not change the object; this would create side effects that may be very hard to track down.
Note
The clear option will also invoke general garbage collection (if not explicitly disabled).
This will free up memory … but it may also take some time.
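The caching behaviour described above can be sketched in a few lines of plain Python. This is a minimal illustration and not dantro's actual implementation: the module-level `_DAG_CACHE`, the helper names `cache_key` and `get_or_create_dag`, and the `create` factory are all hypothetical. Only the use of `repr` as cache key and the meaning of `read`, `write`, and `use_copy` are taken from the description above.

```python
import copy

# Hypothetical in-memory cache; NOT dantro's actual implementation
_DAG_CACHE: dict = {}


def cache_key(dag_params: dict) -> str:
    """Serialize the DAG-related parameters with repr() to form a key."""
    return repr(dag_params)


def get_or_create_dag(dag_params, create, *, read=False, write=False,
                      use_copy=True):
    """Return a DAG-like object, using the cache where allowed.

    ``create`` is a hypothetical factory that builds the object from the
    given parameters.
    """
    key = cache_key(dag_params)

    if read and key in _DAG_CACHE:
        cached = _DAG_CACHE[key]
        # Hand out a deep copy so that callers cannot mutate the cache
        return copy.deepcopy(cached) if use_copy else cached

    dag = create(dag_params)
    if write:
        _DAG_CACHE[key] = copy.deepcopy(dag) if use_copy else dag
    return dag


# Usage: the second call with identical parameters is served from the cache
calls = []


def build(dag_params):
    """Stand-in for the expensive DAG construction."""
    calls.append(dag_params)
    return {"nodes": ["foo", "bar", "result"]}


params = {"select": {"foo": "some/path/foo"}, "transform": []}
first = get_or_create_dag(params, build, read=True, write=True)
second = get_or_create_dag(params, build, read=True, write=True)
```

In this sketch, `build` is only called once. Note that `repr` of a dict depends on key order, so two configurations with the same entries in a different order would not share a cache entry here.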
Example#
# Some plot configuration file
---
my_plot:
  # ... some plot arguments here ...
  # Data selection via DAG framework
  use_dag: true
  select:
    foo: some/path/foo
    bar:
      path: some/path/bar
      transform:
        - mean: [!dag_prev ]
        - increment: [!dag_prev ]
  transform:
    - add: [!dag_tag foo, !dag_tag bar]
      tag: result
  compute_only: [result]
  # Enable DAG object caching
  dag_options:
    dag_object_cache:
      read: true
      write: true
      # Other parameters (and their default values)
      # use_copy: true         # true: cache a deep copy of the object
      # clear: false           # true: clears the object cache and invokes
      #                                garbage collection
      # collect_garbage: ~     # true: invokes garbage collection
      #                          false: suppresses garbage collection even
      #                                 if `clear` was set
my_other_plot_using_the_cache:
  based_on: my_plot  # --> identical DAG arguments (if not overwritten below)
  # ... some plot arguments ...
Defining a generic plot function#
Ideally, a plot function can focus on providing a bridge from data to a visual representation.
Using the PyPlotCreator, this becomes feasible:
from dantro.plot import is_plot_func, PlotHelper

@is_plot_func(use_dag=True)
def my_plot_func(*, data: dict, hlpr: PlotHelper, **further_kwargs):
    """This is my custom plot function with preprocessed DAG data"""
    # ...
    pass
The only required arguments here are data and hlpr.
The former contains all results from the DAG computation; the latter is the plot helper, which effectively is the interface to the visualization of the data.
Importantly, this makes the plot function independent of the specific choice of a creator: the plot function can be used with the PyPlotCreator and with its specializations, UniversePlotCreator and MultiversePlotCreator.
In such cases, the creator should not be specified in the decorator, but it should be given in the plot configuration.
Special case: UniversePlotCreator#
For the UniversePlotCreator, data selection and transformation have to occur based on data from the currently selected universe.
This is taken care of automatically by this creator: it dynamically sets the select_base() property to the current universe, not requiring any further user action.
In effect, the select argument acts as if selections were to happen directly from the universe.
Except for the select_base and base_transform arguments, the full DAG interface is available via the UniversePlotCreator.
Hint
To restore parts of the functionality of the already-in-use select_base and base_transform arguments, the select_path_prefix argument of TransformationDAG can be used.
It can be specified as part of dag_options and is prepended to all path arguments specified within select.
Example#
The following suffices to define a UniversePlotCreator-based plot function:
from dantro.plot import is_plot_func, PlotHelper, UniversePlotCreator

@is_plot_func(creator_type=UniversePlotCreator, use_dag=True)
def my_universe_plot(*, data: dict, hlpr: PlotHelper, **kwargs):
    """This is my custom universe plot function with DAG usage"""
    # ...
    pass
Hint
To not restrict the plot function to a specific creator, using the creator-averse plot function definition is recommended, which omits the creator_type in the decorator and instead specifies it in the plot configuration.
The DAG can be configured in the same way as in the general case.
Special case: MultiversePlotCreator#
The MultiversePlotCreator has a harder job: it has to select data from the whole multiverse subspace, apply transformations to it, and finally combine it, with optional further transformations following.
It does so fully within the DAG framework by building a separate DAG branch for each universe and bundling all of them into a transformation that combines the data.
This happens via the select_and_combine argument.
Important: The select_and_combine argument behaves differently from the select argument of the DAG interface!
This is because it has to accommodate various further configuration parameters that control the selection of universes and the multidimensional combination of the selected data.
The select_and_combine argument expects the following keys:
fields: all keys given here will appear as tags in the results dictionary. The values of these keys are dictionaries that contain the same parameters that can also be given to the select argument of the DAG interface. In other words: paths you would like to select from within each universe should be specified at select_and_combine.fields.<result_tag>.path rather than at select.<result_tag>.path.
base_path (optional): if given, this path is prepended to all paths given under fields
combination_method (optional, default: concat): how to combine the selected and transformed data from the various universes. Available parameters:
concat: attempts to preserve data types but is only possible if the universes fill a hypercube without holes
merge: always possible, but leads to the data type falling back to float. Missing data will be np.nan in the results.
The combination method can also be specified for each tag under select_and_combine.fields.<result_tag>.combination_method.
subspace (optional): which multiverse subspace to work on. This is evaluated fully by the paramspace.ParamSpace.activate_subspace method. The subspace can also be specified for each tag under select_and_combine.fields.<result_tag>.subspace.
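To make the difference between the two combination methods more tangible, here is a minimal sketch in plain Python. It does not use xarray and is not how dantro combines data; the `combine` function and the dictionary-based representation of universes are assumptions made purely for illustration. It only mirrors the behaviour described above: concat requires a hypercube without holes and keeps the data type, while merge always works but falls back to float and fills holes with NaN.

```python
import itertools
import math


def combine(universes: dict, coords: dict, method: str = "concat") -> dict:
    """Arrange per-universe values on the grid spanned by ``coords``.

    ``universes`` maps coordinate tuples to a value; ``coords`` maps
    dimension names to their coordinate values. Purely illustrative.
    """
    grid = list(itertools.product(*coords.values()))
    missing = [point for point in grid if point not in universes]

    if method == "concat":
        # Only possible if the hypercube is completely filled
        if missing:
            raise ValueError(f"cannot concat, missing universes: {missing}")
        return {point: universes[point] for point in grid}

    if method == "merge":
        # Always possible; values become floats, holes become NaN
        return {
            point: float(universes[point]) if point in universes else math.nan
            for point in grid
        }

    raise ValueError(f"unknown combination method: {method}")


coords = {"seed": [0, 1], "my_param": [10, 20]}
full = {(0, 10): 1, (0, 20): 2, (1, 10): 3, (1, 20): 4}
partial = {(0, 10): 1, (1, 20): 4}  # two universes are missing

concat_result = combine(full, coords, method="concat")    # stays integer
merge_result = combine(partial, coords, method="merge")   # float with NaN
```

Calling `combine(partial, coords, method="concat")` raises an error in this sketch, which corresponds to the situation where merge is the only viable option.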
Remarks#
The select operations on each universe set the omit_tag flag in order not to create a flood of only-internally-used tags. Setting tags manually here does not make sense, as the tag names would collide with tags from other universe branches.
File caching is hard-coded to be disabled for the initial select operation and for the operation that attaches the parameter space coordinates to it. This behavior cannot be influenced.
The best place to cache is the result of the combination method.
The regular select argument is still available, but it is applied only after the select_and_combine-defined nodes were added and it acts only globally, i.e. not on each universe.
The select_path_prefix argument to TransformationDAG is not allowed for the MultiversePlotCreator. Use the select_and_combine.base_path argument instead.
Example#
A MultiversePlotCreator-based plot function can be implemented like this:
from dantro.plot import is_plot_func, PlotHelper, MultiversePlotCreator

@is_plot_func(creator_type=MultiversePlotCreator, use_dag=True)
def my_multiverse_plot(*, data: dict, hlpr: PlotHelper, **kwargs):
    """This is my custom multiverse plot function with DAG usage"""
    # ...
    pass
Hint
To not restrict the plot function to a specific creator, using the creator-averse plot function definition is recommended, which omits the creator_type in the decorator and instead specifies it in the plot configuration.
An associated plot configuration might look like this:
---
my_plot:
  # ... some plot arguments here ...
  # Data selection via DAG framework
  select_and_combine:
    fields:
      foo: some/path/foo
      bar:
        path: some/path/bar
        transform:
          - mean: [!dag_prev ]
          - increment: [!dag_prev ]
    combination_method: concat  # can be ``concat`` (default) or ``merge``
    subspace: ~                 # some subspace selection
  transform:
    - add: [!dag_tag foo, !dag_tag bar]
      tag: result
Handling missing data#
In some cases, the ParamSpaceGroup associated with the MultiversePlotCreator might miss some states.
This can happen, for instance, if the to-be-plotted data is the result of a simulation for each point in parameter space and the simulation was stopped before visiting all these points.
In such a case, select_and_combine will typically fail.
Another reason for errors during this operation may be that the data structures between the different points in parameter space are different, such that a valid path within one ParamSpaceStateGroup (or: “universe”) is not a valid path in another.
To be able to plot the partial data in both of these cases, this plot creator makes use of the error handling feature in the data transformation framework.
It’s as simple as adding the allow_missing_or_failing key to select_and_combine:
# Select the creator and use the generic facet grid plotting function
based_on:
  - .creator.multiverse
  - .plot.facet_grid
# Select data, allowing for missing universes or failing .mean operation
select_and_combine:
  allow_missing_or_failing: true
  combination_method: merge  # needed with allow_missing_or_failing
  fields:
    data:
      path: labelled/randints
      transform:
        - .mean: [!dag_prev , [x]]
This option kicks in when any of the following scenarios occurs:
A universe from the selected subspace is missing altogether
The getitem operation for the given path within a universe fails
Any operation within transform fails
In any of these cases, the data for the whole universe is discarded.
Instead, an empty xr.Dataset with the coordinates of that universe is used as fallback, with the following effect:
The corresponding coordinates will be present in the final xr.Dataset, but they contain no data (or NaNs).
The latter is also the reason why the merge combination method is required here.
Note
The rationale behind this behavior is that coordinate information is valuable, as it shows which data would have been available.
If desired, null-like data can be dropped afterwards using the .dropna operation.
In case of missing data, the error message will come from the dantro.expand_dims operation and contain information on the failure.
Warning
If all data is missing, select_and_combine will not be able to succeed, because there will be nothing to combine and insufficient information to create a null-like output instead.
This feature is explicitly meant for data that is partially missing.
The expected error message for such a case will be coming from dantro.merge:
The Dataset resulting from the xr.merge operation can only be reduced
to a DataArray, if one and only one data variable is present in the
Dataset! However, the merged Dataset contains 0 data variables.
Hint
The allow_missing_or_failing argument accepts the same values as the allow_failure argument of the error handling framework; in fact, it sets exactly that argument internally.
Thus, the messaging behavior can be influenced as follows:
select_and_combine:
  allow_missing_or_failing: silent  # other options: warn, log
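The effect of these messaging modes can be illustrated with a small stand-alone sketch. It is not the error handling code of the data transformation framework: the function `compute_with_fallback` is hypothetical, and treating `true` like `log` is an assumption made only for this example.

```python
import logging
import math
import warnings

log = logging.getLogger(__name__)


def compute_with_fallback(func, *, allow_failure=False, fallback=math.nan):
    """Call ``func``; on failure, optionally return ``fallback`` instead.

    ``allow_failure`` may be False, True, "silent", "warn", or "log".
    Treating True like "log" is an assumption made for this sketch.
    """
    try:
        return func()
    except Exception as exc:
        if not allow_failure:
            raise
        mode = "log" if allow_failure is True else allow_failure
        msg = f"Operation failed ({type(exc).__name__}: {exc}); using fallback."
        if mode == "warn":
            warnings.warn(msg)
        elif mode == "log":
            log.info(msg)
        elif mode != "silent":
            raise ValueError(f"invalid allow_failure value: {allow_failure!r}")
        return fallback


def failing_lookup():
    """Stand-in for a path lookup that fails within one universe."""
    raise KeyError("labelled/randints")


# The failure is swallowed silently and the fallback value is used instead
result = compute_with_fallback(failing_lookup, allow_failure="silent")
```

With `allow_failure="warn"`, the same call would additionally emit a warning; with `allow_failure=False`, the original `KeyError` propagates.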
Hint
As with combination_method and subspace, the allow_missing_or_failing argument can also be specified separately for each field, overwriting the default value from the select_and_combine root level:
select_and_combine:
  allow_missing_or_failing: silent
  fields:
    some_data:
      allow_missing_or_failing: warn  # overwrites default from above
      path: path/to/some/data
Applying transformations after combination of data#
In some cases, it can be useful to define postprocessing transformations on the combined data.
For that purpose, there is the transform_after_combine option, which can be added for each individual field or as a default on the select_and_combine level.
While this postprocessing can of course also be done alongside transform, it is often easier to define this alongside the field.
Some example use cases:
Perform some postprocessing on all fields, without having to repeat the definitions.
Use print to see the result of the combination directly, without having to touch the transform definition.
Call .squeeze to reduce the one-sized dimensions of a combination, which can simplify some plotting calls.
Custom combination method#
Apart from the merge and concat combination methods, a custom combination method can also be used by specifying the name of an operation that is capable of combining the data in a desired way:
select_and_combine:
  # further kwargs are passed on to the chosen custom operation
  fields:
    some_data:
      path: path/to/some_data
      combination_method:
        operation: my_combination_operation
        pass_pspace: false  # default: false. If true, will pass additional
                            # keyword argument ``pspace``.
        # further kwargs passed to combination operation
      combination_kwargs:
Such a combination operation needs to have the following signature:
import xarray as xr
def my_combination_function(objs: list, **kwargs) -> xr.DataArray:
    # ...
    pass
Here, objs is a list of the data from each individual parameter space state (“universe”), with coordinates already attached.
Note
While the given objs already have coordinates assigned, you might be interested in some macroscopic information about the shape of the target data.
To that end, an additional argument can be passed to the combination function by setting combination_method.pass_pspace: true.
The pspace argument is then a ParamSpace object (from the paramspace package), which contains information about the dimensionality of the data and the names and coordinates of the dimensions.
The data in objs is ordered in the same way as the iteration over pspace.
Full DAG configuration interface for multiverse selection#
An example of all options available in the MultiversePlotCreator.
# Full DAG specification for multiverse selection
---
my_plot:
  # ... some plot arguments here ...
  # DAG parameters
  # Selection from multiple universes with subsequent combination
  select_and_combine:
    fields:
      # Define a tag 'foo' that will use the defaults defined directly on
      # the ``select_and_combine`` level, see below
      foo: foo                       # ``base_path`` will be prepended here
                                     # resulting in: some_path/foo
      # Define a tag 'bar' that overwrites some of the defaults
      bar:
        path: bar
        subspace:                    # only use universes from a subspace
          seed: [0, 10]
          my_param: [-42., 42.]
        combination_method: merge    # overwriting default specified below
        combination_kwargs:          # passed to Transformation.__init__
                                     # of the *tagged* output node
          file_cache:
            read: true
            write:
              enabled: true
              # Configure the file cache to only be written if this
              # operation took a large amount of time.
              min_cumulative_compute_time: 20.
        allow_missing_or_failing: silent  # transformations or path lookup
                                          # is allowed to fail
        transform:
          - mean: !dag_prev
          - increment: [!dag_prev ]
          - some_op_with_kwargs:
              data: !dag_prev
              foo: bar
              spam: 42
          - operation: my_operation
            args: [!dag_prev ]
            file_cache: {}           # can configure file cache here
        transform_after_combine:     # applied after combination
          - increment
          - print
    base_path: some_path  # if given, prepended to ``path`` in ``fields``
    # Default arguments, can be overwritten in each ``fields`` entry
    combination_method: concat   # can be ``concat`` (default), ``merge``.
                                 # If a dict, may contain the key
                                 # ``operation`` which will then be used as
                                 # the operation to use for combination; any
                                 # further arguments are passed on to that
                                 # operation call.
    subspace: ~                  # some subspace selection
    allow_missing_or_failing: ~  # whether to allow missing universes or
                                 # failing transformations; can be: boolean,
                                 # ``log``, ``warn``, ``silent``
    transform_after_combine: ~
  # Additional selections, now based on ``dm`` tag
  select: {}
  # Additional transformations; all tags from above available here
  transform: []
  # Other DAG-related parameters: ``compute_only``, ``dag_options``
  # ...
Note
This does not include all possible options for DAG configuration but focusses on those options added by MultiversePlotCreator to work with multiverse data, e.g. subspace, combination_kwargs.
For other arguments, see Full syntax specification of a single transformation node.
Using data transformation results in the plot configuration#
The data transformation framework can not only be used for the selection of plot data: using so-called “result placeholders”, data transformation results can be used as part of the plot configuration.
One use case is to include a computation result, e.g. some mean value, in the title of the plot via the plot helper. In general, this feature makes it possible to automate further parts of the plot configuration by giving access to the capabilities of the transformation framework.
Let’s look at an example plot configuration:
# Select the creator and use the generic errorbar plotting function
based_on:
  - .creator.universe
  - .plot.facet_grid.errorbars
select:
  # 3D data with random integers
  some_data: randints
transform:
  # Compute the mean and standard deviation
  - .mean: [!dag_tag some_data, [x, z]]
    tag: mean
  - .std: [!dag_tag some_data, [x, z]]
    tag: stddev
  # Assemble them into a Dataset for the errorbars plot
  - xr.Dataset:
      - mean: !dag_tag mean
        stddev: !dag_tag stddev
    tag: data
  # Additional transformations for ResultPlaceholders
  - .mean: [!dag_tag mean]
  - .item  # ... otherwise it's still an xr.DataArray
  - .format: ["Some Data (total mean: {:.3g})", !dag_prev ]
    tag: title_str
# Specify which data variable to plot as line and which as errorbands
y: mean
yerr: stddev
use_bands: true
# Now, use the place holder in the helper configuration
helpers:
  set_title:
    title: !dag_result title_str
As can be seen here, there are additional operations defined within transform, which lead to the title_str tag.
In the helper configuration, that tag is referred to via the !dag_result YAML tag, thus creating a placeholder at the helpers.set_title.title key.
This illustrates the basic idea. Of course, multiple placeholders can be used and they can be used almost everywhere inside the plot configuration; however, make sure to have a look at the caveats to learn about current limitations.
Hint
When adding placeholders, you will notice additional log messages which inform about the placeholder names and their computation profile.
Caveats#
Where in the plot configuration can placeholders be used?#
Placeholders can be used in large parts of the plot configuration, but not everywhere.
If you encounter errors that refer to an unexpected ResultPlaceholder object, this is probably because they were defined in a part of the plot configuration where they cannot be resolved.
Where can (✅) placeholders always be used? Where can they never (❌) be used?
✅ They can be used in all configuration entries that are passed through to the selected plot function of the PyPlotCreator and derived plot creators.
✅ They can be used within the helpers argument that controls the PlotHelper.
❌ They cannot be used for entries related to data transformation (select, transform, dag_options, …) because these need to be evaluated in order to set up the TransformationDAG.
❌ They cannot be used for entries evaluated by the PlotManager (out_path, etc.) or the plot creator prior to data selection (animation, style, module, etc.).
Why is my placeholder not resolved?#
The identification and replacement of placeholders happens by recursively iterating through list-like and dict-like objects in the plot configuration dict.
Typically, this reaches all places where these placeholders could be defined.
The only exception is a placeholder located in some part of an object that does not behave like a list or a dict.
Implementation details#
Under the hood, the !dag_result YAML tag is read as a ResultPlaceholder object, which simply stores the name of the tag that should come in its place.
After the plot data has been computed, the BasePlotCreator inspects the plot configuration and recursively collects all these placeholder objects.
The compute() method is then invoked to retrieve the specified results.
Subsequently, the placeholder entries in the plot configuration are replaced with the result from the computation.
For the above operations, functions from the paramspace package are used, specifically: paramspace.tools.recursive_collect and paramspace.tools.recursive_replace.
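The collect, compute, and replace steps described above can be sketched with plain Python. This is a simplified stand-in and not the actual implementation: the `ResultPlaceholder` class below only mimics the real one, and `collect_placeholders` and `replace_placeholders` are hypothetical helpers that take the role of the paramspace functions. The computation itself is faked with a fixed value.

```python
class ResultPlaceholder:
    """Simplified stand-in that stores the name of a result tag."""

    def __init__(self, result_name: str):
        self.result_name = result_name


def collect_placeholders(obj) -> list:
    """Recursively collect placeholders from dicts and lists."""
    if isinstance(obj, ResultPlaceholder):
        return [obj]
    if isinstance(obj, dict):
        children = list(obj.values())
    elif isinstance(obj, list):
        children = obj
    else:
        return []  # other kinds of objects are not searched
    found = []
    for child in children:
        found.extend(collect_placeholders(child))
    return found


def replace_placeholders(obj, results: dict):
    """Return a copy of ``obj`` with placeholders replaced by results."""
    if isinstance(obj, ResultPlaceholder):
        return results[obj.result_name]
    if isinstance(obj, dict):
        return {k: replace_placeholders(v, results) for k, v in obj.items()}
    if isinstance(obj, list):
        return [replace_placeholders(v, results) for v in obj]
    return obj


plot_cfg = {
    "y": "mean",
    "helpers": {"set_title": {"title": ResultPlaceholder("title_str")}},
}

# 1. Collect the names of all requested results
names = [ph.result_name for ph in collect_placeholders(plot_cfg)]

# 2. Compute them (faked here; the DAG computation would be used instead)
mean_value = 4.56789
results = {"title_str": "Some Data (total mean: {:.3g})".format(mean_value)}

# 3. Replace the placeholders with the computed results
resolved = replace_placeholders(plot_cfg, results)
```

The sketch also shows why a placeholder hidden inside an object that is neither a list nor a dict remains unresolved: the recursion simply never reaches it.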
DAG Visualization#
The DAG used for plot data selection and transformation can also be visualized. This can be helpful to understand what kind of operations are carried out on which kind of data, which can be of great help during debugging.
By default, DAG visualization is enabled and will generate output if there was an error during the computation of data transformation results.
There are many ways to further control when a visualization is created and what it looks like; see below.
All parameters for controlling DAG visualization can be passed via the dag_visualization entry of a plot configuration.
Such a plot may look like these:
[Figures: example DAG visualizations]
DAG generation#
The way the DAG is generated is controlled by the generation arguments, which are evaluated by generate_nx_graph().
Also see Graph representation and visualization for more information.
Controlling when to generate a DAG plot#
For instance, if we’d like to always generate a DAG plot upon a computation, we can pass the following parameters:
my_dag_plot:
  # ...
  dag_visualization:
    when:
      only_once: true            # only generate a single DAG plot
      on_compute_error: true     # ... either upon failing computation
      on_compute_success: true   # ... or upon a successful one.
Hint
To only plot if the creator runs in debug mode (i.e., with raise_exc set), set the scenario to debug instead of a boolean.
my_dag_plot:
  # ...
  dag_visualization:
    when:
      on_compute_error: debug
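How such `when` settings could be evaluated is sketched below. The function `should_visualize` is hypothetical and not part of dantro; in particular, how `always` and `only_once` interact is an assumption of this sketch. Only the meaning of the scenario values (false, true, debug) follows the description above.

```python
def should_visualize(scenario, when, *, debug_mode=False,
                     already_generated=False):
    """Decide whether to generate a DAG visualization for a scenario.

    How ``always`` and ``only_once`` interact is an assumption here.
    """
    if when.get("only_once", False) and already_generated:
        return False
    if when.get("always", False):
        return True

    setting = when.get(f"on_{scenario}", False)
    if setting == "debug":
        # Only generate output if the creator itself is in debug mode
        return debug_mode
    return bool(setting)


when = {
    "only_once": True,
    "on_compute_error": "debug",
    "on_compute_success": False,
}

in_debug = should_visualize("compute_error", when, debug_mode=True)
not_in_debug = should_visualize("compute_error", when, debug_mode=False)
```

Here, `in_debug` is true while `not_in_debug` is false, because the `debug` value ties the scenario to the debug mode of the creator.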
In the on_compute_error scenario, it is advisable to activate the show_node_status option for visualization, which will indicate at which node an error occurred:
[Figure: DAG visualization with node status colors]
The colors indicate the following node status, as detailed in the legend:
green: computation succeeded
yellow: computation failed but a fallback value was used
red: computation failed in this node
dark red: computation failed in a node that this node depends on
Hint
To adjust the status colors, set the node_status_colors argument; see visualize() docstring for more info.
Changing plot content#
What is shown in the plot depends mostly on the label attribute of the nodes.
By default, that content is generated via the get_description() operation function, which takes into account the name of the tag, the operation, and potential results.
To show something else, we need to tell the visualize() method to use a different attribute for the label.
By default, the description attribute is shown.
In the following example, we will instead simply show the operation attribute by setting the drawing.labels.from_attr entry of the configuration:
dag_visualization:
  drawing:
    labels:
      from_attr: operation
      # available attributes: tag, description, operation
Using the manipulate_attributes() function, we can also generate custom attributes.
In the following example, the name of that attribute is my_custom_attr, which is then also set as the label.
dag_visualization:
  generation:
    include_results: true
    manipulate_attrs:
      map_node_attrs:
        my_custom_attr:
          # Invoke *some* function; as an example, use a lambda to copy
          # over some node attribute data into `my_custom_attr`
          call_lambda: "lambda *, attrs: attrs.get('result', '(no result)')"
  drawing:
    labels:
      # Use the custom attribute as a label
      from_attr: my_custom_attr
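To illustrate what the lambda in this configuration does with the node attributes, here is a small stand-alone sketch. It does not use networkx or dantro: the `map_node_attrs` helper and the dictionary-based node representation are hypothetical and only mimic the idea of computing a new attribute from the existing ones.

```python
def map_node_attrs(nodes: dict, attr_name: str, lambda_str: str) -> None:
    """Set ``attr_name`` on every node by calling the given lambda.

    ``nodes`` maps node names to attribute dicts. Evaluating a string
    with eval() is only acceptable for trusted configuration.
    """
    func = eval(lambda_str)
    for attrs in nodes.values():
        attrs[attr_name] = func(attrs=attrs)


nodes = {
    "a": {"operation": "mean", "result": 1.5},
    "b": {"operation": "add"},  # no result available for this node
}

map_node_attrs(
    nodes,
    "my_custom_attr",
    "lambda *, attrs: attrs.get('result', '(no result)')",
)

# The new attribute can now serve as the node label
labels = {name: attrs["my_custom_attr"] for name, attrs in nodes.items()}
```

Nodes without a `result` attribute receive the fallback string, which is why `include_results: true` is set in the configuration above.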
Note
If not setting drawing.labels.from_attr explicitly, it will always use the description attribute as the label.
Setting plot aesthetics#
The looks of the DAG plot are set via the drawing keyword, whose entries end up in the visualize() method:
my_dag_plot:
  # ...
  dag_visualization:
    drawing:
      # Whether to include default values for nodes, edges, and labels.
      # If true, will recursively update these defaults with the values
      # given below.
      # Set to false to use the networkx defaults instead.
      use_defaults: true
      # Arguments to networkx.draw_networkx_*
      nodes:
        node_color: blue
        # ...
      edges:
        width: 2.5
        # ...
      labels:
        font_size: 10
        # ...
Note
With networkx using matplotlib as drawing backend, there are a number of limitations: For instance, it is not possible to let edges terminate exactly at the edge of the label’s box.
If this is desired, you may want to have a look at Exporting a DAG representation.
Exporting a DAG representation#
For more control over the looks of the DAG, you can use the export keyword and use whatever other program you like to look at the plot output.
This will invoke export_graph().
In that case you may want to set plot_enabled: false as well:
my_dag_plot:
  # ...
  dag_visualization:
    plot_enabled: false
    export_enabled: true
    # ...
    export:
      manipulate_attrs:
        # Use the description as label and keep only that attribute
        map_node_attrs:
          label:
            attr_mapper.copy_from_attr: description
        keep_node_attrs:
          - label
      # Export formats
      graphml: true
      dot: true
      # ...
Remarks#
For more information on possible arguments, see _generate_DAG_vis().
For a background on DAG representation as a networkx.DiGraph, see Graph representation and visualization.
Note
The layouting algorithm cannot be changed yet.
If GraphViz and pygraphviz are installed, graphviz_layout() is used with the dot algorithm.
If those are not installed, a multipartite_layout() is carried out.
Full Interface#
The following documents the full interface and the corresponding default values:
# DAG Visualization interface; values given here are default values
dag_visualization:
  enabled: true           # Main toggle
  plot_enabled: true      # Whether to generate a plot
  export_enabled: true    # Whether to export the graph
  # Whether to raise an exception if graph generation, plotting or
  # exporting failed. If None, will use the creator's setting.
  raise_exc: ~
  # Whether to *additionally* export the graph
  export:
    # Manipulate node or edge attributes (for export only)
    manipulate_attrs:
      map_node_attrs: {}
      map_edge_attrs: {}
      keep_node_attrs: True
      keep_edge_attrs: True
    # Export formats
    # ... need to be specified here.
    # Examples:
    gml: True
    graphml:                # arguments passed on to writer
      infer_numeric_types: True
    # dot: True             # Needs pygraphviz
    # ... more formats here ...
  # Output arguments
  output:
    plot_dir: ~  # None: Output will be aside the just-generated plot
    # A format string that is used to create the actual output path.
    # The `plot_dir` key is the one evaluated from the above argument.
    path_fstr: "{plot_dir:}/{name:}_dag_{scenario:}.pdf"
  # When to generate the visualization
  when:
    # General toggles
    always: false           # If true: always generate a DAG plot
    only_once: false        # If true: only generate one DAG plot
    # Scenarios: After which events to generate a DAG plot
    # Values can be: false, true, debug.
    # In case of 'debug', output is only generated if the creator was in
    # debug mode itself.
    on_compute_error: true
    on_compute_success: false
    on_plot_error: false
    on_plot_success: false
  # Generation kwargs
  generation:
    tags_to_include: all
    include_results: false
    lookup_tags: true
    manipulate_attrs:
      map_node_attrs:
        # Default operations: these are set by default
        operation: attr_mapper.dag.get_operation
        layer: attr_mapper.dag.get_layer
        description: attr_mapper.dag.get_description
        # Other available operations:
        # meta_operation: attr_mapper.dag.get_meta_operation
        # arguments: attr_mapper.dag.format_arguments
        # some_attr: attr_mapper.copy_from_attrs
        # another_attr: attr_mapper.set_value
        # ... or any other registered data operation:
        # my_attr:
        #   call_lambda: "lambda *, attrs: attrs.get('foo')"
  # Whether to base layouting and visualization on optimized default
  # values or not. For illustration, the actual default values are used
  # below; they do NOT have to be set explicitly as done here!
  use_defaults: true
  # Whether to show the node status and which colors to use for it
  show_node_status: true
  node_status_color:
    initialized: lightskyblue
    queued: cornflowerblue
    computed: limegreen
    looked_up: forestgreen
    failed_here: red
    failed_in_dependency: firebrick
    used_fallback: gold
    no_status: silver
  # Layouting algorithm (and fallback)
  layout:
    model: graphviz_dot     # requires graphviz and pygraphviz
    # In case the above model fails, silently switch to another one
    fallback: multipartite
    silent_fallback: true
    # Arguments for the respective layouting models
    model_kwargs:
      graphviz_dot: {}
      multipartite:
        align: horizontal
        subset_key: layer
        scale: -1
        # Whether to wiggle layouted positions to reduce edge overlap.
        # This is recommended for the multipartite layout, because it
        # does not handle edges going over multiple layers very well,
        # producing confusing edge overlaps ...
        wiggle:
          x: 0.005
          y: ~
          seed: 123         # set to None to always get new wiggles
  # Drawing, using networkx.draw_networkx_<...>
  drawing:
    nodes:
      alpha: 0.
      node_size: &node_size 600
    edges:
      arrows: true
      arrowsize: 12
      min_target_margin: 20
      min_source_margin: 20
      node_size: *node_size
    labels:
      # Which attribute to use as node label
      from_attr: description
      # Aesthetics; see matplotlib.patches.FancyBboxPatch
      font_size: 7
      bbox:
        fc: "#fffa"
        ec: "#666"
        linewidth: 0.5
        boxstyle: round
  # Figure creation via matplotlib.pyplot.figure
  figure_kwargs:
    figsize: [9, 7]
  # Scale figure size with "width" and "height" of the resulting graph
  # to avoid node overlapping; using these scaling factors.
  # Set to False to disable.
  scale_figsize: [0.25, 0.22]
  # Figure-level plot annotations: suptitle, figure legend for node color
  annotate_kwargs:
    # Title
    title: my custom DAG visualization
    title_kwargs: {}
    # Legend
    add_legend: true
    legend_kwargs: {}
    handle_kwargs: {}
  # Saving via matplotlib.pyplot.savefig
  save_kwargs:
    bbox_inches: tight
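As a small illustration of the output.path_fstr entry, the following sketch shows how such a format string could be evaluated with Python's `str.format`. The helper `dag_vis_path` is hypothetical, and deriving `name` from the file name of the plot output is an assumption made for this example; only the format string itself and the meaning of plot_dir are taken from the interface above.

```python
import os

path_fstr = "{plot_dir:}/{name:}_dag_{scenario:}.pdf"


def dag_vis_path(plot_out_path: str, scenario: str, plot_dir=None) -> str:
    """Build the output path of a DAG visualization (illustrative only)."""
    if plot_dir is None:
        # Place the output beside the just-generated plot
        plot_dir = os.path.dirname(plot_out_path)

    # Assumption: the name is the plot's file name without its extension
    name = os.path.splitext(os.path.basename(plot_out_path))[0]
    return path_fstr.format(plot_dir=plot_dir, name=name, scenario=scenario)


example = dag_vis_path("out/my_plot.pdf", "compute_error")
```

With these inputs, `example` evaluates to `out/my_plot_dag_compute_error.pdf`, such that the scenario that triggered the visualization is visible from the file name.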