Plot Functions

This page gives an overview of the plot functions that are implemented within dantro for use with the PyPlotCreator and derived plot creators. These plot functions are meant to be as generic as possible so that they work with a wide variety of data. They make use of the Data Transformation Framework for Plot Data Selection.

To use these plot functions, the following information needs to be specified in the plot configuration:

my_plot:
  creator: pyplot        # or: universe, multiverse, ...
  module: .generic       # absolute: dantro.plot.funcs.generic
  plot_func: facet_grid  # or some other plot function name

  # ...

facet_grid(): A Declarative Generic Plot Function

Handling, transforming, and plotting high-dimensional data is difficult and often requires specialization to particular use cases. dantro provides the generic facet_grid() plot function which, together with the other dantro features, allows for a declarative way of creating plots from high-dimensional data.

The idea is that high-dimensional raw data is first transformed using the Data Transformation Framework. The facet_grid() function then receives the ready-to-plot data as input and visualizes it in a declarative way, through the specification of layout keywords such as col, row, or hue. If the kind of plot is not explicitly given, an appropriate one is chosen automatically where possible. This approach is called faceting; dantro makes use of the excellent plotting functionality of xarray for this feature.

The facet_grid() plot function further extends the xarray plotting functionality by adding the possibility to create animations, simply by using the frames argument to specify the data dimension to represent as individual frames of an animation.

The PlotHelper interface then copes with the plot style and further layout. All steps are fully configurable and optimized for the YAML-based plotting interface. Thus, generating a plot of multidimensional data does not require touching any actual code but just specifying the desired representation in the plot configuration. 🎉

For more information, have a look at the facet_grid() docstring.
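Because facet_grid() builds on xarray's plotting, the faceting idea can be tried out with xarray alone. The following is a minimal sketch under that assumption: it does not go through dantro at all, and the data as well as the dimension names (time, seed, param) are made up for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, no window needed

import numpy as np
import xarray as xr

# Made-up 3D data: 50 time steps, 4 seeds, 3 parameter values
rng = np.random.default_rng(42)
da = xr.DataArray(
    rng.normal(size=(50, 4, 3)).cumsum(axis=0),
    dims=("time", "seed", "param"),
    coords=dict(
        time=np.arange(50),
        seed=np.arange(4),
        param=[0.1, 0.2, 0.3],
    ),
    name="walk",
)

# Declarative layout: one panel per `param`, one line per `seed`.
# With `col` given, xarray returns a FacetGrid instead of a single axis.
grid = da.plot.line(x="time", hue="seed", col="param")
```

Only the mapping from dimension names to layout keywords is specified here; the iteration over panels is left to xarray.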

Automatically selecting plot kind

The kind keyword of the facet grid plot is quite important. It determines most of the aesthetics and the possible dimensionality that the to-be-visualized data may have.

However, in some scenarios, one would like an appropriate plot kind to be chosen automatically. While kind: None outsources the choice of plot kind to xarray, this frequently leads to a kind: hist plot being created, depending on which layout specifiers were given.

The determine_plot_kind() function used in facet_grid() uses the plot data's dimensionality to select a plotting kind. By default, the following mapping of data-dimensionality to plot kind is used:

1:               "line",
2:               "pcolormesh",
3:               "pcolormesh",
4:               "pcolormesh",
5:               "pcolormesh",
"with_hue":      "line",         # used when `hue` is explicitly set
"with_x_and_y":  "pcolormesh",   # used when _both_ `x` and `y` were set
"dataset":       "scatter",      # used for xr.Dataset-like data
"fallback":      "hist",         # used when none of the above matches

Aside from the dimensionality as key, there are a few special cases that handle already-fixed layout encoding (hue / x and y); the case of xr.Dataset-like data; and a fallback option for all other dimensionalities or cases. For details, see the docstring of determine_plot_kind().
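To make the lookup logic more concrete, here is a simplified sketch of how such a mapping could be evaluated. The helper pick_kind() and its arguments are hypothetical, and the order in which the special cases are checked is an assumption; the actual behaviour is defined by determine_plot_kind().

```python
# Default mapping as listed above
KIND_MAP = {
    1: "line",
    2: "pcolormesh",
    3: "pcolormesh",
    4: "pcolormesh",
    5: "pcolormesh",
    "with_hue": "line",
    "with_x_and_y": "pcolormesh",
    "dataset": "scatter",
    "fallback": "hist",
}


def pick_kind(ndim, *, is_dataset=False, hue=None, x=None, y=None,
              kind_map=KIND_MAP):
    """Hypothetical helper: choose a plot kind from the data dimensionality.

    The precedence of the special cases (dataset, hue, x and y) is an
    assumption made for this sketch.
    """
    if is_dataset:
        return kind_map["dataset"]
    if hue is not None:
        return kind_map["with_hue"]
    if x is not None and y is not None:
        return kind_map["with_x_and_y"]
    # Plain dimensionality lookup, with a fallback for unlisted cases
    return kind_map.get(ndim, kind_map["fallback"])
```

For example, pick_kind(1) selects a line plot, while a dimensionality that is not listed, such as pick_kind(7), ends up at the fallback.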

Setting kind: auto becomes especially powerful in conjunction with Auto-encoding of plot layout.

Auto-encoding of plot layout

dantro also adds the auto_encoding feature to the facet grid plot, which automatically associates data dimensions with certain layout encoding specifiers (x, y, col, and others). With this functionality, the facet grid plot can be used to visualize high-dimensional data regardless of the dimension names; the only relevant information is the dimensionality of the data.

The available encodings for the facet_grid() plot are:

In [1]: print(available_facet_grid_kinds)
        scatter : ('hue', 'col', 'row', 'frames')
           line : ('x', 'hue', 'col', 'row', 'frames')
           step : ('x', 'col', 'row', 'frames')
       contourf : ('x', 'y', 'col', 'row', 'frames')
        contour : ('x', 'y', 'col', 'row', 'frames')
         imshow : ('x', 'y', 'col', 'row', 'frames')
     pcolormesh : ('x', 'y', 'col', 'row', 'frames')
           hist : ('frames',)
      errorbars : ('x', 'hue', 'col', 'row', 'frames')
      scatter3d : ('hue', 'markersize', 'col', 'row', 'frames')

In combination with Automatically selecting plot kind, this further reduces the plot configuration arguments required to generate facet grid plots.

For further details, see determine_encoding().
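The basic idea of auto-encoding can be sketched as follows. This is an illustration only: assign_encodings() is a hypothetical helper that fills the free encoding specifiers in the order listed above, and it does not reproduce how determine_encoding() actually orders or selects dimensions.

```python
# A subset of the available encodings shown above
ENCODINGS = {
    "line": ("x", "hue", "col", "row", "frames"),
    "pcolormesh": ("x", "y", "col", "row", "frames"),
    "hist": ("frames",),
}


def assign_encodings(dims, kind, **fixed):
    """Hypothetical helper: associate data dimensions with layout encodings.

    ``fixed`` holds encodings that were set explicitly, e.g. col="time".
    The remaining dimensions are distributed over the remaining specifiers
    in the order in which they are listed for the given kind.
    """
    free_specs = [s for s in ENCODINGS[kind] if s not in fixed]
    free_dims = [d for d in dims if d not in fixed.values()]

    encoding = dict(fixed)
    # zip stops at the shorter sequence, so surplus specifiers stay unset
    encoding.update(zip(free_specs, free_dims))
    return encoding


auto = assign_encodings(("time", "seed", "param"), "line")
partly_fixed = assign_encodings(("time", "seed", "param"), "line", col="time")
```

Note that the dimension names never enter the decision; only their number and position do, which matches the statement that the dimensionality is the only relevant information.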

Add custom plot kinds that support faceting

While the already-available plot kinds of the facet grid cover many use cases, there is still room for extension. As part of the generic plot functions module, dantro provides the make_facet_grid_plot decorator that wraps the decorated function in such a way that it becomes facetable.

That means that after decoration:

  • The function will support faceting in col, row and frames in addition to those dimensions handled within the decorated function.

  • It will be registered with the generic facet_grid() function, such that it is available as kind.

  • It will be integrated in such a way that it supports auto encoding.

The make_facet_grid_plot decorator wraps the functionality of xarray.plot.FacetGrid and makes it easy to add faceting support to plot functions. It can be used if the decorated plot function fulfills the following requirements:

  • It works with a single xr.Dataset or xr.DataArray object as input.

  • It only plots to the current axis and does not create a figure.

  • The same kind of plot is to be repeated over multiple axes, with the plots differing only in the slice of data passed to them.

As an example, have a look at the implementation of the errorbars() plot function.
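The principle behind such a decorator, repeating one single-axis function over slices of the data, can be illustrated with a toy version in plain Python. This sketch uses neither xarray.plot.FacetGrid nor make_facet_grid_plot; the decorator facetable, the stand-in function mean_value, and the record-based data format are all hypothetical.

```python
from functools import wraps
from itertools import product


def facetable(func):
    """Toy decorator: call ``func`` once per (row, col) combination."""

    @wraps(func)
    def wrapper(records, *, col=None, row=None, **kwargs):
        col_vals = sorted({r[col] for r in records}) if col else [None]
        row_vals = sorted({r[row] for r in records}) if row else [None]

        results = {}
        for rv, cv in product(row_vals, col_vals):
            # Select the slice of data belonging to this facet
            subset = [
                r for r in records
                if (col is None or r[col] == cv)
                and (row is None or r[row] == rv)
            ]
            # The wrapped function only ever sees its own slice
            results[(rv, cv)] = func(subset, **kwargs)
        return results

    return wrapper


@facetable
def mean_value(records, *, key):
    """Stand-in for a plot function that works on a single axis."""
    return sum(r[key] for r in records) / len(records)


data = [
    {"param": "a", "value": 1.0},
    {"param": "a", "value": 3.0},
    {"param": "b", "value": 10.0},
]
panels = mean_value(data, col="param", key="value")
```

The wrapped function stays unaware of the faceting, which is why the requirements listed above ask that it handles a single data object and plots only to the current axis.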

ColorManager integration

All facet grid plots in dantro integrate the ColorManager to parse the colormap-related plotting arguments cmap and norm.

This makes it possible to specify colormap properties right from the plot configuration. For instance, it can be used to select a colormap by name, set its over, under, and bad color values, and additionally specify a normalization:

cmap:
  name: Greys
  over:
    color: black
    alpha: 0.5
  under:
    color: black
    alpha: 0.5
  bad: red
vmin: 0
vmax: 1
norm:
  name: PowerNorm
  gamma: 0.5

For more examples, see the ColorManager docstring.
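For orientation, the configuration above roughly corresponds to the following plain-matplotlib objects. This is a sketch of the assumed equivalent, not of how the ColorManager constructs them internally.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend

import matplotlib.pyplot as plt
from matplotlib.colors import PowerNorm, to_rgba

# over / under: black with 50% opacity; bad (e.g. NaN) values: red
half_black = to_rgba("black", alpha=0.5)
cmap = plt.get_cmap("Greys").with_extremes(
    over=half_black, under=half_black, bad="red"
)

# Power-law normalization on the interval [vmin, vmax] = [0, 1]
norm = PowerNorm(gamma=0.5, vmin=0, vmax=1)
```

Values outside [0, 1] are then drawn in half-transparent black, and the normalization maps a value of 0.25 to the middle of the colormap.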

Hint

YAML tags can be used to generate colormaps or norms in places where the color manager is not integrated but a corresponding matplotlib.colors.Colormap object or norm object is accepted / required:

cmap: !cmap         # will create a matplotlib.colors.Colormap object
  continuous: true
  from_values:
    0: "#EC7070"
    0.5: "#EC9F7E"
    1: black
  bad: white
norm: !cmap_norm    # will create a matplotlib.colors.BoundaryNorm object
  name: BoundaryNorm
  ncolors: 256
  boundaries: [0, 0.2, 0.7, 1]
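In plain matplotlib, objects similar to those created by the tags above could be constructed as follows. That continuous: true with from_values corresponds to LinearSegmentedColormap.from_list is an assumption of this sketch; the colormap name "custom" is made up.

```python
from matplotlib.colors import BoundaryNorm, LinearSegmentedColormap

# Continuous colormap interpolating between the given (value, color) pairs;
# the mapping to `from_values` is an assumption, see above
cmap = LinearSegmentedColormap.from_list(
    "custom",
    [(0.0, "#EC7070"), (0.5, "#EC9F7E"), (1.0, "black")],
).with_extremes(bad="white")

# Three bins, spread over the 256 colors of the colormap
norm = BoundaryNorm(boundaries=[0, 0.2, 0.7, 1], ncolors=256)
```

With these boundaries, all values in [0, 0.2) share one color index, as do those in [0.2, 0.7) and [0.7, 1).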

multiplot(): Plot multiple functions on one axis

The multiplot() plotting function enables the consecutive application of multiple plot functions on the current axis generated and provided through the PlotHelper.

Plot functions can be specified in three ways:

  • as a string that is used to map to the corresponding function

  • by importing a callable on the fly

  • or by directly passing a callable function

For plot function lookup by string, the following seaborn plot functions and some matplotlib functions are available:

# Seaborn - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
# https://seaborn.pydata.org/api.html

# Relational plots
"sns.scatterplot":      _sns.scatterplot,
"sns.lineplot":         _sns.lineplot,

# Distribution plots
"sns.histplot":         _sns.histplot,
"sns.kdeplot":          _sns.kdeplot,
"sns.ecdfplot":         _sns.ecdfplot,
"sns.rugplot":          _sns.rugplot,

# Categorical plots
"sns.stripplot":        _sns.stripplot,
"sns.swarmplot":        _sns.swarmplot,
"sns.boxplot":          _sns.boxplot,
"sns.violinplot":       _sns.violinplot,
"sns.boxenplot":        _sns.boxenplot,
"sns.pointplot":        _sns.pointplot,
"sns.barplot":          _sns.barplot,
"sns.countplot":        _sns.countplot,

# Regression plots
"sns.regplot":          _sns.regplot,
"sns.residplot":        _sns.residplot,

# Matrix plots
"sns.heatmap":          _sns.heatmap,

# Utility functions
"sns.despine":          _sns.despine,

# Matplotlib - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
# https://matplotlib.org/tutorials/introductory/sample_plots.html

# Relational plots
"plt.fill":             _plt.fill,
"plt.scatter":          _plt.scatter,
"plt.plot":             _plt.plot,
"plt.polar":            _plt.polar,
"plt.loglog":           _plt.loglog,
"plt.semilogx":         _plt.semilogx,
"plt.semilogy":         _plt.semilogy,

# Distribution plots
"plt.hist":             _plt.hist,
"plt.hist2d":           _plt.hist2d,

# Categorical plots
"plt.bar":              _plt.bar,
"plt.barh":             _plt.barh,
"plt.pie":              _plt.pie,
"plt.table":            _plt.table,

# Matrix plots
"plt.imshow":           _plt.imshow,
"plt.pcolormesh":       _plt.pcolormesh,

# Vector plots
"plt.contour":          _plt.contour,
"plt.quiver":           _plt.quiver,
"plt.streamplot":       _plt.streamplot,

To import a callable, specify a (module, name) tuple; this will use import_module_or_object() to carry out the import and traverse any modules.

You can also invoke any other function that operates on an Axes object by importing or constructing a callable via the data transformation framework.
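The three ways of specifying a function can be sketched with a small resolver. This is an illustration under simplifying assumptions: resolve() and LOOKUP are hypothetical, standard-library functions serve as stand-ins for plot functions, and the tuple branch only traverses attributes, whereas import_module_or_object() may behave differently in detail.

```python
import importlib
import math
import os.path
from functools import reduce

# Stand-in lookup table; the real one maps names like "sns.lineplot"
LOOKUP = {"math.sqrt": math.sqrt}


def resolve(spec):
    """Hypothetical helper: turn a function specification into a callable."""
    # 3) A callable is used as it is
    if callable(spec):
        return spec

    # 1) A string is looked up in the table of known functions
    if isinstance(spec, str):
        return LOOKUP[spec]

    # 2) A (module, name) pair is imported on the fly; a dotted name
    #    is resolved by walking along the attributes
    module_name, attr_path = spec
    module = importlib.import_module(module_name)
    return reduce(getattr, attr_path.split("."), module)


by_name = resolve("math.sqrt")
by_import = resolve(("os", "path.join"))
by_callable = resolve(len)
```

The pair ("os", "path.join") plays the same role here as [matplotlib, pyplot.plot] does in a plot configuration.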

Let us look at some example configurations to illustrate the above features:

# Minimal example
sns_lineplot_example:
  plot_func: multiplot
  to_plot:
    # Plot a seaborn.lineplot
    # As data use the previously DAG-tagged 'seaborn_data'.
    # Note that it is important to specify the data to use
    # otherwise sns.lineplot plots and shows nothing!
    - function: sns.lineplot
      data: !dag_result seaborn_data
      # Add further sns.lineplot-specific kwargs below...
      markers: true

    # Can add more function specifications here to plot on the same axes

# An advanced example
sns_lineplot_and_more:
  plot_func: multiplot

  # Define some custom callable
  dag_options:
    define:
      my_custom_callable:
        - lambda: "lambda *, ratio, ax: ax.set_aspect(ratio)"

  to_plot:
    # Look up the callable from a dict
    - function: sns.lineplot
      data: !dag_result seaborn_data
      # Add further sns.lineplot-specific kwargs below...
      markers: true

    # Import a callable on the fly
    - function: [matplotlib, pyplot.plot]
      # plt.plot requires the x and y values to be passed as positional
      # arguments.
      args:
        - !dag_result plot_x
        - !dag_result plot_y
      # Add further plot-specific kwargs below...

    # Call the constructed plot function, passing the axis object along
    - function: !dag_result my_custom_callable
      pass_axis_object_as: ax
      ratio: 0.625

    # Can add more functions here, if desired

Hint

As can be seen in the above example, it is possible to pass an axis object to the function, if needed. To do so, use the pass_axis_object_as argument to specify the name of the keyword argument the axis object should be passed on as.
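What pass_axis_object_as amounts to can be sketched in a few lines. The helper call_with_axis() and the FakeAxes class are hypothetical stand-ins so that the example runs without matplotlib; the callable itself is the one from the configuration above.

```python
class FakeAxes:
    """Minimal stand-in for a matplotlib Axes object."""

    def __init__(self):
        self.aspect = None

    def set_aspect(self, ratio):
        self.aspect = ratio


def call_with_axis(func, ax, *, args=(), pass_axis_object_as=None, **kwargs):
    """Hypothetical helper: call ``func``, optionally handing over the axis."""
    if pass_axis_object_as is not None:
        # Inject the axis under the requested keyword argument name
        kwargs[pass_axis_object_as] = ax
    return func(*args, **kwargs)


# Same callable as defined via dag_options in the example above
my_custom_callable = lambda *, ratio, ax: ax.set_aspect(ratio)

ax = FakeAxes()
call_with_axis(my_custom_callable, ax, pass_axis_object_as="ax", ratio=0.625)
```

Because the keyword name is configurable, functions that expect the axis under a different name, for instance axes or target, can be used without a wrapper.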

Hint

The actual implementation is part of the PlotHelper interface, which also gives access to arbitrary function invocations on the current axis. The corresponding helper function is named call (_hlpr_call()).

Use multiplot with multiple subplots

Generating plots with multiple subplots is also possible via the multiplot() function. This is a two-step process:

  • In the PlotHelper configuration, specify the desired subplots of the figure using setup_figure.

  • In the multiplot() configuration, address each axis separately and specify which function calls should be made on it.

Example:

based_on:
  - .creator.pyplot
  - .plot.multiplot

  # use helpers for styling
  - .hlpr.limits.x.min_max
  - .hlpr.lines.h_zero

# Select some example data
select:
  some_data: labelled/time_series

transform:
  # Compute mean and std. deviation over the space dimension
  - .mean: [!dag_tag some_data, [space]]
    tag: mean

  - .std: [!dag_tag some_data, [space]]
    tag: std

  # Explicitly extract coordinates, needed by plt.plot
  - .coords: [!dag_tag mean, time]
    tag: time_coords

# Use PlotHelper to configure figure to have two subplots
helpers:
  setup_figure:
    ncols: 1
    nrows: 2
    sharex: true

  set_suptitle:
    title: Some Time Series (mean and std)

  set_labels:
    x: Time
    only_label_outer: true

# Specify the multiplot calls on the upper and lower subplots
to_plot:
  [0, 0]:
    - function: plt.plot
      args:
        - !dag_result time_coords
        - !dag_result mean
    - function: [matplotlib.pyplot, ylabel]
      args: [mean]
  [0, 1]:
    - function: plt.plot
      args:
        - !dag_result time_coords
        - !dag_result std
    - function: [matplotlib.pyplot, ylabel]
      args: [std]

The resulting plot looks like this:

Multiplot plot example with subplots and artificial time series data
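For comparison, a figure of this kind could be produced in plain matplotlib roughly as follows. This is only a sketch with made-up random data: it skips the data selection and transformation steps and does not reproduce the styling helpers pulled in via based_on.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend

import matplotlib.pyplot as plt
import numpy as np

# Made-up time series data with dimensions (time, space)
rng = np.random.default_rng(0)
time = np.arange(100)
data = rng.normal(size=(100, 20)).cumsum(axis=0)

# Counterpart of the mean and std transformations over the space dimension
mean = data.mean(axis=1)
std = data.std(axis=1)

# Counterpart of setup_figure: two rows, one column, shared x-axis
fig, (ax_top, ax_bottom) = plt.subplots(nrows=2, ncols=1, sharex=True)
fig.suptitle("Some Time Series (mean and std)")

# Upper subplot, addressed as [0, 0] in the configuration
ax_top.plot(time, mean)
ax_top.set_ylabel("mean")

# Lower subplot, addressed as [0, 1] in the configuration
ax_bottom.plot(time, std)
ax_bottom.set_ylabel("std")

# Only the outer (bottom) axis gets an x-label
ax_bottom.set_xlabel("Time")
```

In the configuration-based version, none of these calls have to be written by hand; they follow from the helpers and to_plot entries.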