dantro.plot.funcs package


Plotting functions that can be used by the PyPlotCreator and derived plot creators.

Submodules#

dantro.plot.funcs._multiplot module#

Implements config-configurable function invocation that can be used for applying function calls to a plot. This is used in the multiplot() plot function and the _hlpr_call() helper function.

MULTIPLOT_FUNC_KINDS = {'plt.bar': <function bar>, 'plt.barh': <function barh>, 'plt.contour': <function contour>, 'plt.errorbar': <function errorbar>, 'plt.fill': <function fill>, 'plt.hist': <function hist>, 'plt.hist2d': <function hist2d>, 'plt.imshow': <function imshow>, 'plt.loglog': <function loglog>, 'plt.pcolormesh': <function pcolormesh>, 'plt.pie': <function pie>, 'plt.plot': <function plot>, 'plt.polar': <function polar>, 'plt.quiver': <function quiver>, 'plt.scatter': <function scatter>, 'plt.semilogx': <function semilogx>, 'plt.semilogy': <function semilogy>, 'plt.streamplot': <function streamplot>, 'plt.table': <function table>, 'sns.barplot': <function barplot>, 'sns.boxenplot': <function boxenplot>, 'sns.boxplot': <function boxplot>, 'sns.countplot': <function countplot>, 'sns.despine': <function despine>, 'sns.ecdfplot': <function ecdfplot>, 'sns.heatmap': <function heatmap>, 'sns.histplot': <function histplot>, 'sns.kdeplot': <function kdeplot>, 'sns.lineplot': <function lineplot>, 'sns.pointplot': <function pointplot>, 'sns.regplot': <function regplot>, 'sns.residplot': <function residplot>, 'sns.rugplot': <function rugplot>, 'sns.scatterplot': <function scatterplot>, 'sns.stripplot': <function stripplot>, 'sns.swarmplot': <function swarmplot>, 'sns.violinplot': <function violinplot>}#

The default-available plot kinds for the multiplot() function.

Details of the seaborn-related plots can be found here in the seaborn docs.

MULTIPLOT_CAUTION_FUNC_NAMES = ('sns.scatterplot', 'sns.lineplot', 'sns.histplot', 'sns.kdeplot', 'sns.ecdfplot', 'sns.rugplot', 'sns.stripplot', 'sns.swarmplot', 'sns.boxplot', 'sns.violinplot', 'sns.boxenplot', 'sns.pointplot', 'sns.barplot', 'sns.countplot', 'sns.regplot', 'sns.residplot', 'sns.heatmap', 'plt.fill', 'plt.scatter', 'plt.plot', 'plt.polar', 'plt.loglog', 'plt.semilogx', 'plt.semilogy', 'plt.errorbar', 'plt.hist', 'plt.hist2d', 'plt.bar', 'plt.barh', 'plt.pie', 'plt.table', 'plt.imshow', 'plt.pcolormesh', 'plt.contour', 'plt.quiver', 'plt.streamplot')#: The multiplot functions that emit a warning if they do not get any arguments when called. This is helpful for functions that e.g. require a data argument but do not fail or warn if no such argument is passed on to them.

parse_function_specs(*, _hlpr: PlotHelper, _funcs: Dict[str, Callable] = None, _shared_kwargs: dict = {}, function: str | Callable | Tuple[str, str], args: list = None, pass_axis_object_as: str = None, pass_helper: bool = False, pass_helper_attrs: List[str] = None, **func_kwargs) → Tuple[str, Callable, list, dict][source]#

Parses a function specification used in the invoke_function helper. If function is a string it is looked up from the _funcs dict.

See parse_and_invoke_function() and multiplot().

Parameters:

_hlpr (PlotHelper) – The currently used PlotHelper instance
_funcs (Dict[str, Callable]) – The lookup dictionary for callables
_shared_kwargs (dict, optional) – Shared kwargs that are passed on to all multiplot functions. They are recursively updated with the individual plot functions’ func_kwargs.
function (Union[str, Callable, Tuple[str, str]]) – The callable function object or the name of the plot function to look up. If given as 2-tuple (module, name), will attempt an import of that module.
args (list, optional) – The positional arguments for the plot function
pass_axis_object_as (str, optional) – If given, will add a keyword argument with this name to pass the current axis object to the to-be-invoked function.
pass_helper (bool, optional) – If true, passes the helper instance to the function call as keyword argument hlpr.
pass_helper_attrs (List[str], optional) – If given, names of the helper’s (figure-level) attrs that are to be passed on to the function call; if an attribute is missing, the corresponding keyword argument will have None as value.
**func_kwargs (dict) – The function kwargs to be passed on to the function object.

Returns:

A tuple of function name, callable, positional arguments, and keyword arguments.

Return type:

Tuple[str, Callable, list, dict]
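
The lookup and merging steps described above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not dantro's implementation: resolve_function_spec and recursive_update are hypothetical names, and the helper-related arguments (_hlpr, pass_axis_object_as, pass_helper, pass_helper_attrs) are left out entirely.

```python
import importlib
from typing import Callable, Dict, Tuple, Union


def recursive_update(base: dict, update: dict) -> dict:
    """Returns a copy of ``base`` that is recursively updated with ``update``."""
    result = dict(base)
    for key, value in update.items():
        if isinstance(value, dict) and isinstance(result.get(key), dict):
            result[key] = recursive_update(result[key], value)
        else:
            result[key] = value
    return result


def resolve_function_spec(
    function: Union[str, Callable, Tuple[str, str]],
    funcs: Dict[str, Callable],
    shared_kwargs: dict = None,
    args: list = None,
    **func_kwargs,
) -> Tuple[str, Callable, list, dict]:
    """Hypothetical, simplified stand-in for the parsing step."""
    if callable(function):
        name, func = function.__name__, function
    elif isinstance(function, str):
        # Names are looked up in the given dictionary
        name, func = function, funcs[function]
    else:
        # A (module, name) pair triggers an import of that module
        module_name, attr_name = function
        func = getattr(importlib.import_module(module_name), attr_name)
        name = f"{module_name}.{attr_name}"

    # Shared kwargs form the basis; individual kwargs take precedence
    kwargs = recursive_update(shared_kwargs or {}, func_kwargs)
    return name, func, list(args or []), kwargs


name, func, args, kwargs = resolve_function_spec(
    ("math", "hypot"), funcs={}, args=[3, 4]
)
print(name, func(*args, **kwargs))
```

The 2-tuple form is resolved here with importlib.import_module from the standard library; how the resulting name is formatted is an assumption of this sketch.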

parse_and_invoke_function(*, hlpr: PlotHelper, shared_kwargs: dict, func_kwargs: dict, show_hints: bool, call_num: int, funcs: Dict[str, Callable] = None, caution_func_names: List[str] = None) → Any[source]#

Parses function arguments and then calls multiplot().

Parameters:

hlpr (PlotHelper) – The currently used PlotHelper instance
funcs (Dict[str, Callable], optional) – The lookup dictionary for the plot functions. If not given, will use a default lookup dictionary with a set of seaborn and matplotlib functions.
shared_kwargs (dict) – Arguments shared between function calls
func_kwargs (dict) – Arguments for this function in particular
show_hints (bool) – Whether to show hints
call_num (int) – The number of this plot, for easier identification
caution_func_names (List[str], optional) – a list of function names that will trigger a log message if no function kwargs were given. If not explicitly given, will use some defaults.

Returns:

return value of plot function call

Return type:

Any

dantro.plot.funcs._utils module#

A module that implements a bunch of plot utilities used in the plotting functions. These can be shared tools between the plotting functions.

determine_ideal_col_wrap(N: int, *, fill_last_row: bool = True, fill_ratio_thrs: float = 0.75) → int | None[source]#

Given a number of subplots to place in a grid, determines the ideal number of columns to wrap after such that:

The resulting grid is most “square”

If fill_last_row is set, we compromise on squared-ness in order to have a last row that is more filled (avoiding lonely plots)

To get to the square-like configuration, uses:

col_wrap = math.ceil(math.sqrt(N))

With fill_last_row, will improve the fill ratio

Parameters:

N (int) – Number of elements to place in the grid. If this is below 4, will return None.
fill_last_row (bool, optional) – Whether to not only optimize for a square-like grid, but to also reduce lonely plots in the last row
fill_ratio_thrs (float, optional) – If the fill ratio of the last row is greater or equal this number already without optimization, will not begin optimization.

Returns:

The determined column wrapping number. Will be None for N < 4.

Return type:

Optional[int]
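
The square-like starting point and the fill ratio of the last row can be illustrated as follows. This is only a sketch: square_col_wrap and last_row_fill_ratio are hypothetical helper names, and the optimization that fill_last_row triggers is deliberately not reproduced here.

```python
import math
from typing import Optional


def square_col_wrap(N: int) -> Optional[int]:
    """Square-like column wrapping; None for fewer than four subplots."""
    if N < 4:
        return None
    return math.ceil(math.sqrt(N))


def last_row_fill_ratio(N: int, col_wrap: int) -> float:
    """Fraction of the last grid row that is occupied by subplots."""
    in_last_row = N % col_wrap or col_wrap  # a remainder of 0 means "full"
    return in_last_row / col_wrap


for N in (9, 10, 13):
    cw = square_col_wrap(N)
    print(N, cw, last_row_fill_ratio(N, cw))
```

For N = 10, the square-like configuration wraps after four columns and leaves a half-filled last row; with the default fill_ratio_thrs of 0.75, such a case would be a candidate for the fill optimization.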

plot_errorbar(*, ax, x: ndarray, y: ndarray, yerr: ndarray, fill_between: bool = False, fill_between_kwargs: dict | None = None, **errorbar_kwargs)[source]#

Given the data and (optionally) the y-error data, plots a single errorbar line. With fill_between=True, a shaded area is plotted instead of the errorbar markers.

The following fill_between_kwargs defaults are assumed:

color = line_color

alpha = 0.2 * line_alpha

lw = 0.

Parameters:

ax – The axis to plot on
x (ndarray) – The x data to use
y (ndarray) – The y-data to use for ax.errorbar. Needs to be 1D and have coordinates associated which will be used for the x-values.
yerr (ndarray) – The y-error data
fill_between (bool, optional) – Whether to use plt.fill_between or plt.errorbar to plot y-errors
fill_between_kwargs (dict, optional) – Passed on to plt.fill_between
**errorbar_kwargs – Passed on to plt.errorbar

Raises:

ValueError – On non-1D data

Returns:

The matplotlib legend handle of the errorbar line or of the errorbands
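
How the fill_between_kwargs defaults listed above combine with user-specified values can be sketched without matplotlib. The helper name resolve_fill_between_kwargs is hypothetical, and treating a missing line alpha as fully opaque is an assumption of this sketch.

```python
from typing import Optional


def resolve_fill_between_kwargs(
    line_color: str,
    line_alpha: Optional[float],
    fill_between_kwargs: Optional[dict] = None,
) -> dict:
    """Hypothetical helper: documented defaults, overridden by user values."""
    # Assumption: a line without an explicit alpha is treated as opaque
    alpha = 1.0 if line_alpha is None else line_alpha
    defaults = dict(color=line_color, alpha=0.2 * alpha, lw=0.0)
    return {**defaults, **(fill_between_kwargs or {})}


print(resolve_fill_between_kwargs("C0", None))
print(resolve_fill_between_kwargs("C0", 0.5, {"alpha": 0.4}))
```

The resulting dictionary would then presumably be passed on to a call like ax.fill_between(x, y - yerr, y + yerr, **kwargs).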

dantro.plot.funcs.basic module#

Holds basic plot functions for use with PyPlotCreator

lineplot(dm: DataManager, *, out_path: str, y: str, x: str | None = None, fmt: str | None = None, save_kwargs: dict | None = None, **plot_kwargs)[source]#

Performs a simple lineplot using matplotlib.pyplot.plot().

Parameters:

dm (DataManager) – The data manager from which to retrieve the data
out_path (str) – Where to store the plot to
y (str) – The path to get to the y-data from the data tree
x (str, optional) – The path to get to the x-data from the data tree
save_kwargs (dict, optional) – Keyword arguments for matplotlib.pyplot.savefig()
**plot_kwargs – Passed on to matplotlib.pyplot.plot().

dantro.plot.funcs.generic module#

Generic, DAG-based plot functions for the PyPlotCreator and derived plot creators.

ENSURE_UNIQUE_DIMS: Dict[Tuple[str, bool], bool | str] = {('raise_auto', False): True, ('raise_auto', True): False, ('warn_auto', False): 'warn', ('warn_auto', True): False}#: For auto mode, maps (ensure_unique_dims, data_vars is not None) tuples to the appropriate evaluated parameter.
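
A sketch of how such a lookup can be evaluated is shown below. The function name evaluate_ensure_unique_dims is hypothetical, and passing non-auto values through unchanged is an assumption rather than documented behaviour.

```python
from typing import List, Optional, Union

ENSURE_UNIQUE_DIMS = {
    ("raise_auto", False): True,
    ("raise_auto", True): False,
    ("warn_auto", False): "warn",
    ("warn_auto", True): False,
}


def evaluate_ensure_unique_dims(
    ensure_unique_dims: Union[bool, str],
    data_vars: Optional[List[str]] = None,
) -> Union[bool, str]:
    """Hypothetical helper resolving the ``*_auto`` modes."""
    key = (ensure_unique_dims, data_vars is not None)
    # Assumption: values other than the auto modes are passed through as-is
    return ENSURE_UNIQUE_DIMS.get(key, ensure_unique_dims)


print(evaluate_ensure_unique_dims("raise_auto"))
print(evaluate_ensure_unique_dims("warn_auto", data_vars=["temperature"]))
```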

_XR_PLOT_KINDS = {'contour': ('x', 'y', 'col', 'row'), 'contourf': ('x', 'y', 'col', 'row'), 'hist': ('free',), 'imshow': ('x', 'y', 'col', 'row'), 'line': ('x', 'hue', 'col', 'row'), 'pcolormesh': ('x', 'y', 'col', 'row'), 'scatter': ('free', 'hue', 'col', 'row'), 'step': ('x', 'col', 'row')}#: The available plot kinds for the xarray plotting interface, together with the supported layout specifier keywords.

_FACET_GRID_KINDS = {'contour': ('x', 'y', 'col', 'row', ('files', Ellipsis), 'frames'), 'contourf': ('x', 'y', 'col', 'row', ('files', Ellipsis), 'frames'), 'errorbars': ('x', 'hue', 'col', 'row', ('files', Ellipsis), 'frames'), 'hist': ('free', ('files', Ellipsis), 'frames'), 'imshow': ('x', 'y', 'col', 'row', ('files', Ellipsis), 'frames'), 'line': ('x', 'hue', 'col', 'row', ('files', Ellipsis), 'frames'), 'pcolormesh': ('x', 'y', 'col', 'row', ('files', Ellipsis), 'frames'), 'scatter': ('free', 'hue', 'col', 'row', ('files', Ellipsis), 'frames'), 'scatter3d': ('hue', 'markersize', 'col', 'row', ('files', Ellipsis), 'frames'), 'step': ('x', 'col', 'row', ('files', Ellipsis), 'frames')}#: The available plot kinds for the dantro plotting interface, together with the supported layout specifiers, which include the frames option.

_AUTO_PLOT_KINDS = {'dataset': 'scatter', 'fallback': 'hist', 'with_hue': 'line', 'with_x_and_y': 'pcolormesh', 1: 'line', 2: 'line', 3: 'line', 4: 'pcolormesh', 5: 'pcolormesh'}#: A mapping from data dimensionality to preferred plot kind, used in automatic plot kind selection. This assumes the specifiers of _FACET_GRID_KINDS

_FACET_GRID_FUNCS: Dict[str, Callable] = {'errorbars': <function make_facet_grid_plot.__call__.<locals>.fgplot>, 'scatter3d': <function make_facet_grid_plot.__call__.<locals>.fgplot>}#: A dict mapping additional facet grid kinds to callables. This is populated by the make_facet_grid_plot decorator.

_fmt_spec(spec: str | Tuple[str, int]) → str[source]#: Formats a single encoding specification, taking care of the various kinds of possible encodings, e.g. potential Ellipsis sizes.

_fmt_specs(specs: list) → str[source]#: Formats an encoding specifications list, typically a list of strings or tuples.

_fmt_encoding(enc: dict, fstr='{s}: {d}') → str[source]#: Formats an encoding dictionary into a single-line, comma-separated string, taking care of multi-dimensional encoding specifiers.

determine_plot_kind(d: DataArray | Dataset, *, kind: str | dict, default_kind_map: dict = {'dataset': 'scatter', 'fallback': 'hist', 'with_hue': 'line', 'with_x_and_y': 'pcolormesh', 1: 'line', 2: 'line', 3: 'line', 4: 'pcolormesh', 5: 'pcolormesh'}, **plot_kwargs) → str[source]#

Determines the plot kind to use for the given data. If kind: auto, this will determine the plot kind depending on the dimensionality of the data and other (potentially fixed) encoding specifiers. Otherwise, it will simply return kind.

What if layout encodings were partly fixed? There are two special cases where this is of relevance, and both these cases are covered explicitly:

If both x and y are given, line- or hist-like plot kinds are no longer possible; hence, a pcolormesh-like kind has to be chosen.

In turn, if hue was given, pcolormesh-like plot kinds are no longer applicable, thus a line-like kind needs to be chosen.

These two special cases are specified via the extra keys with_x_and_y and with_hue in the kind mapping.

A kind mapping may look like this:

1:               "line",
2:               "line",
3:               "line",
4:               "pcolormesh",
5:               "pcolormesh",
"with_hue":      "line",         # used when `hue` is explicitly set
"with_x_and_y":  "pcolormesh",   # used when _both_ `x` and `y` were set
"dataset":       "scatter",      # used for xr.Dataset-like data
"fallback":      "hist",         # used when none of the above matches

Parameters:

d (Union[DataArray, Dataset]) – The data for which to determine the plot kind.
kind (Union[str, dict]) – The given kind argument. If it is auto, the default_kind_map is used to determine the kind from the dimensionality of d. If it is a dict, auto is implied and the dict is assumed to be a (ndim -> kind) mapping, updating the default_kind_map.
default_kind_map (dict, optional) – The default mapping to use for kind: auto, with keys being d’s dimensionality and values being the plot kind to use. There are two special keys, fallback and dataset. The value belonging to dataset is used for data that is dataset-like, i.e. does not have an ndim attribute. The value of fallback specifies the plot kind for data dimensionalities that match no other key.
**plot_kwargs – All remaining plot function arguments, including any layout encoding arguments that aim to fix a dimension; these are used to determine the with_hue and with_x_and_y special cases. Everything else is ignored.

Returns:

The selected plot kind. This is equal to the given kind if it was None or a string unequal to auto.

Return type:

str
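
The selection logic described above can be sketched as follows. This is a simplified illustration, not the actual implementation: pick_plot_kind is a hypothetical name, the order in which the special cases are checked is an assumption, and SimpleNamespace objects stand in for xarray data.

```python
from types import SimpleNamespace
from typing import Any, Union

DEFAULT_KIND_MAP = {
    1: "line", 2: "line", 3: "line", 4: "pcolormesh", 5: "pcolormesh",
    "with_hue": "line",
    "with_x_and_y": "pcolormesh",
    "dataset": "scatter",
    "fallback": "hist",
}


def pick_plot_kind(d: Any, *, kind: Union[str, dict, None], **plot_kwargs):
    """Simplified, hypothetical version of the automatic kind selection."""
    if kind is None or (isinstance(kind, str) and kind != "auto"):
        return kind  # explicitly given kinds are returned unchanged

    kind_map = dict(DEFAULT_KIND_MAP)
    if isinstance(kind, dict):
        kind_map.update(kind)  # a dict implies auto mode

    # Assumed precedence: dataset check first, then the two special cases
    if not hasattr(d, "ndim"):
        return kind_map["dataset"]
    if plot_kwargs.get("x") is not None and plot_kwargs.get("y") is not None:
        return kind_map["with_x_and_y"]
    if plot_kwargs.get("hue") is not None:
        return kind_map["with_hue"]
    return kind_map.get(d.ndim, kind_map["fallback"])


arr2 = SimpleNamespace(ndim=2)  # stand-in for a 2D DataArray
print(pick_plot_kind(arr2, kind="auto"))
print(pick_plot_kind(arr2, kind="auto", x="time", y="space"))
```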

parse_encoding_spec(s: str | Tuple[str, int]) → Tuple[str, int][source]#: Brings an encoding specification into a uniform 2-tuple shape, where the first is the name of the encoding and the second is how many dimensions it may absorb. The second value can also be Ellipsis to denote that all remaining dimensions are to be absorbed.
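
A minimal sketch of such a normalization is given below. The name normalize_encoding_spec is hypothetical, and accepting the '...' string as an alias for Ellipsis at this point is an assumption.

```python
def normalize_encoding_spec(spec) -> tuple:
    """Hypothetical sketch: returns a ``(name, num_dims)`` 2-tuple."""
    if isinstance(spec, str):
        return (spec, 1)  # plain names absorb a single dimension
    name, num_dims = spec
    if num_dims == "...":
        num_dims = Ellipsis  # string form of the Ellipsis
    return (name, num_dims)


print(normalize_encoding_spec("x"))
print(normalize_encoding_spec(("files", ...)))
```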

map_dims_to_encoding(all_specs: List[str | Tuple[str, int]], all_dims: List[str], *, encoding: Dict[str, str | Tuple[str, ...]] | None = None, drop_missing_dims: bool = False, data_vars: List[str] | None = None, ignore_encodings: List[str] | None = None, ensure_unique_dims: bool | Literal['warn'] = False) → Tuple[Dict[str, str | Tuple[str, ...]], List[Tuple[str, int]], List[str]][source]#

Maps encoding specifiers to one or multiple dimension names.

Encoding specifiers are given as a list of encoding names that are filled, one by one, with a free dimension. The encoding specifier can be given as a string, assuming that it can absorb a single dimension; alternatively, if a 2-tuple (name, num_dims) is given, the second entry denotes the number of dimensions that specifier can absorb.

Specifiers are assigned in the given order. They may also appear multiple times, in which case they are handled as multi-dimensional encodings.

specs = [x, hue, col, row]           # -> 1 absorbing dim each
specs = [(x, 1), (hue, 1), (col, 1), (row, 1)]   # same as above
specs = [x, col, row, (files, 5)]    # -> files absorbs 5 dims
specs = [x, files, hue, (files, 4)]  # -> files absorbs 5 dims, but hue
                                     #    will be assigned in between
specs = [x, (files, ...), hue]       # -> files absorbs all free dims

Parameters:

all_specs (List[Union[str, Tuple[str, int]]]) – All available encoding specifiers as a list of string or list of (name, num_dims) tuples. num_dims can also be ... (an Ellipsis) to denote that this dimension will absorb remaining free dimensions.
all_dims (List[str]) – List of all available dimension names; these will be the values of the returned mapping.
encoding (Dict[str, Union[str, Tuple[str, ...]]], optional) – If given, denotes which encodings and dimensions are already in use.
drop_missing_dims (bool, optional) – If True, will drop those entries in encoding that use a dimension name that is not part of all_dims.
ignore_encodings (List[str], optional) – Names of encoding specifiers that should be ignored, i.e. which remain in all_specs but are not automatically assigned dimensions; note that they remain in encoding and retain the value they have received manually.
ensure_unique_dims (Union[bool, Literal["warn"]], optional) – If True, will make sure that the user-specified encoding does not cause dimensions to be assigned more than once. This should be set if the plot function does not support duplicate encodings, e.g. because it involves a sequential dimensionality reduction. It should not be set for plot functions that allow parallel encodings, e.g. scatter plots with hue and size encodings shared within a subplot. If set to warn, will warn instead of raise.

Returns:

A 3-tuple (mapping, free_specs, free_dims) containing the desired mapping dictionary and information about possibly free encoding specifiers or dimensions.

Return type:

Tuple[Dict[str, Union[str, Tuple[str, …]]], List[Tuple[str, int]], List[str]]
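
The assignment of free dimensions to encoding specifiers can be sketched as follows. This is a simplified illustration: assign_dims is a hypothetical name, pre-set encodings and the other options are not handled, and deferring Ellipsis specifiers until all other specifiers are filled is an assumption of this sketch.

```python
from typing import Dict, List


def assign_dims(specs: list, dims: List[str]):
    """Hypothetical sketch: fills encoding specifiers with free dimensions.

    Returns the mapping and the list of dimensions that remained free.
    """
    free = list(dims)
    collected: Dict[str, List[str]] = {}
    multi_dim = set()  # names of encodings that yield tuples
    deferred = []  # Ellipsis specifiers, handled last (an assumption)

    for spec in specs:
        name, num = (spec, 1) if isinstance(spec, str) else spec
        if num is Ellipsis or num > 1 or name in collected:
            multi_dim.add(name)
        collected.setdefault(name, [])
        if num is Ellipsis:
            deferred.append(name)
            continue
        # Take up to ``num`` dimensions from the front of the free list
        collected[name] += free[:num]
        free = free[num:]

    for name in deferred:
        collected[name] += free
        free = []

    mapping = {}
    for name, assigned in collected.items():
        if not assigned:
            continue  # this specifier stays free
        mapping[name] = tuple(assigned) if name in multi_dim else assigned[0]
    return mapping, free


dims = ["d0", "d1", "d2", "d3", "d4"]
print(assign_dims(["x", "hue", ("files", ...)], dims))
```

Dimensions are consumed in the order in which they are given, so a caller that wants larger dimensions to be assigned first would need to sort them beforehand.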

determine_encoding(dims: List[str] | Dict[str, int], *, kind: str, auto_encoding: bool | dict, default_encodings: dict, plot_kwargs: dict, data_vars: List[str] | None = None, allow_y_for_x: List[str] = ('line',), drop_missing_dims: bool = False, ignore_encodings: List[str] | None = None, ensure_unique_dims: bool | str = False, return_encoding_info: bool = False) → dict[source]#

Determines the layout encoding for the given plot kind and the available data dimensions (as specified by the dims argument).

If auto_encoding does not evaluate to true or kind is None, this function does nothing and simply returns all given plotting arguments.

Otherwise, it uses the chosen plot kind to associate layout specifiers with the dimension names given via dims. The available layout encoding specifiers (x, y, col etc.) can be specified in two ways:

By default, default_encodings is used as a map from plot kind to a sequence of available layout encodings.

If auto_encoding is a dictionary, the default map will be updated with that dictionary.

The association is done in the following way:

Inspecting plot_kwargs, all layout encoding specifiers are extracted, regardless of their value.

The encodings mapping is determined (see above).

The available dimension names are determined from dims.

Depending on kind and the already fixed specifiers, the free encoding specifiers and dimension names are extracted.

These free specifiers are associated with free dimension names, in order of descending dimension size. Encoding specifiers that have previously been set will keep that value, even if it was None.

Example: Assume, the available specifiers are ('x', 'y', 'col') and the data has dimensions dim0, dim1 and dim2. Let’s further say that y was already fixed to dim2, leaving x and col as available encodings and dim0 and dim1 as free dimensions. With x being specified before col in the list of available encodings, x would be associated to the remaining dimension with the larger size and col to the remaining one.
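
The example above can be reproduced with a small sketch. The function auto_encode is a hypothetical stand-in that only handles one-dimensional specifiers, and the dimension sizes are made up for illustration.

```python
from typing import Dict, Optional, Sequence


def auto_encode(
    specs: Sequence[str],
    sizes: Dict[str, int],
    fixed: Dict[str, Optional[str]],
) -> Dict[str, Optional[str]]:
    """Hypothetical sketch of the association step for 1D specifiers."""
    encoding = dict(fixed)  # previously set values are kept, even None
    free_specs = [s for s in specs if s not in fixed]
    used = set(fixed.values())
    # Larger dimensions are assigned first
    free_dims = sorted(
        (d for d in sizes if d not in used),
        key=lambda d: sizes[d],
        reverse=True,
    )
    encoding.update(zip(free_specs, free_dims))
    return encoding


# Assumed sizes: dim1 is the larger of the two free dimensions
sizes = {"dim0": 3, "dim1": 10, "dim2": 5}
print(auto_encode(("x", "y", "col"), sizes, fixed={"y": "dim2"}))
```

With these sizes, x is associated with dim1 and col with dim0, while y keeps its fixed value dim2.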

An encodings mapping may look like this:

"scatter":      ("free", "hue", "col", "row"),
"line":         ("x", "hue", "col", "row"),
"step":         ("x", "col", "row"),
"contourf":     ("x", "y", "col", "row"),
"contour":      ("x", "y", "col", "row"),
"imshow":       ("x", "y", "col", "row"),
"pcolormesh":   ("x", "y", "col", "row"),
"hist":         ("free",),  # can also set ("free", ...) to absorb all

Here, string-like specifiers denote encodings that can represent only a single data dimension. The (name, ndim) syntax can be used to let an encoding absorb ndim dimensions. Setting ndim to an Ellipsis (..., Ellipsis or the '...' string) specifies that encoding to take up all data dimensions that are not taken-up by other encodings. Encodings with ndim > 1 are always multi-dimensional, regardless of how many dimensions will be associated with it.

Example: Let’s assume the available encoding is x, hue, files… and there are five free dimensions to assign. In this case, the largest will go to x, the next-largest to hue and the remaining three to the multi-dimensional files encoding.

The drop_missing_dims option will unset a previously set encoding if that dimension does not exist in the data; a log message will inform about this case. Setting this can be useful to make a plot definition more flexible.

The ignore_encodings option allows to not automatically assign certain encodings, e.g. if it is desired that an encoding is typically kept unassigned. Effectively, it is never regarded as an available encoding, regardless of its value. This can be useful to set if it is undesired to change the auto_encoding dict.

When working with xarray.Dataset objects, its data variables may play a role in the encoding as some specifiers (like hue in a scatter plot) accept both dimension names and data variables, behaving differently depending on which one was passed. By passing on the data variables via the data_vars argument, the encoding algorithm can take into account that a specified encoding does perhaps not refer to a data dimension but to a data variable.

This function also implements automatic column wrapping, aiming to use the figure area efficiently. The prerequisites are the following:

The dims argument is a dict, containing size information

The col_wrap argument is given and set to "auto"

The col specifier is in use

The row specifier is not used, i.e. wrapping is possible

There are more than three columns

To determine the column wrapping number, a little optimization routine tries to reduce the number of empty spots in the last row while trying to get a square-like grid. To skip the optimization, potentially leading to last rows that have only one or few subplots, set col_wrap to "square", in which case wrapping will happen after ceil(sqrt(num_cols)) columns; see determine_ideal_col_wrap() for more information and implementation.

Parameters:

dims (Union[List[str], Dict[str, int]]) – The dimension names (and, if given as dict: their sizes) that are to be encoded. If no sizes are provided, the assignment order will be the same as in the given sequence of dimension names. If sizes are given, these will be used to sort the dimension names in descending order of their sizes. For xarray objects, da.sizes or ds.sizes should be used.
kind (str) – The chosen plot kind. If this was None, will directly return, because auto-encoding information is missing.
auto_encoding (Union[bool, dict]) – Whether to perform auto-encoding. If a dict, will regard it as a mapping of available encodings and update default_encodings.
default_encodings (dict) – A map from plot kinds to available layout specifiers, e.g. {"line": ("x", "hue", "col", "row")}.
allow_y_for_x (List[str], optional) – A list of plot kinds for which the following replacement will be allowed: if a y specifier is given but no x specifier, the "x" in the list of available encodings will be replaced by a "y". This is to support plots that allow either an x or a y specifier, like the line kind.
plot_kwargs (dict) – The actual plot function arguments, including any layout encoding arguments that aim to fix a dimension. Everything else is ignored.
drop_missing_dims (bool, optional) – If set, will drop pre-specified encodings from plot_kwargs if they refer to a dimension that is not available in dims. The encoding can then be filled with another dimension.
data_vars (List[str], optional) – If given, names of data variables that may (in addition to the dims) be used for encoding; this is relevant when determining whether an encoding includes a missing dimension, as some encodings may also refer not to dimensions but to data variables.
ignore_encodings (List[str], optional) – If given, will ignore these encodings when automatically assigning.
ensure_unique_dims (Union[bool, str], optional) – If True, will make sure that the user-specified encoding does not cause dimensions to be assigned more than once. This should be set if the plot function does not support duplicate encodings, e.g. because it involves a sequential dimensionality reduction. It should not be set for plot functions that allow parallel encodings, e.g. scatter plots with hue and size encodings shared within a subplot. If set to warn, will warn instead of raise. If set to warn_auto or raise_auto, will warn or raise only if data_vars is None; in such a case, encoding is typically used for dimensionality reduction, which can only be done once…
return_encoding_info (bool, optional) – If set, will return a 2-tuple of the updated plots config and the encoding information as a 3-tuple (encoding, free_specs, free_dims).

build_pspace_selector(d: DataArray | Dataset, dims: List[str], **sel) → Dict[str, ParamDim | Any][source]#

Builds a selector for sel() operations that uses ParamDim as values.

This method also combines the parameter space selector with an existing selector dict, sel, and throws an error if there is an overlap between keys in dims and sel.

class make_facet_grid_plot(*, map_as: str, encodings: Tuple[str], supported_hue_styles: Tuple[str] | None = None, register_as_kind: bool | str = True, overwrite_existing: bool = False, drop_kwargs: Tuple[str] = ('_fg', 'meta_data', 'hue_style', 'add_guide'), parse_cmap_and_norm_kwargs: bool = True, **default_kwargs)[source]#

Bases: object

This is a decorator class that transforms a plot function that works on a single axis into one that supports faceting via xarray.plot.FacetGrid.

Additionally, it allows to register the plotting function with the generic facet_grid() plot by adding the callable to _FACET_GRID_FUNCS.

MAP_FUNCS = {'dataarray': <function make_facet_grid_plot.<lambda>>, 'dataarray_line': <function make_facet_grid_plot.<lambda>>, 'dataset': <function make_facet_grid_plot.<lambda>>}#: The available mapping functions in xarray.plot.FacetGrid

DEFAULT_ENCODINGS = ('col', 'row', ('files', Ellipsis), 'frames')#: The default encodings the facet grid supplies; these are those supported by the generic facet grid function, irrespective of chosen kind

DEFAULT_DROP_KWARGS = ('_fg', 'meta_data', 'hue_style', 'add_guide')#: The default kwargs that are to be dropped rather than passed on to the wrapped plotting function. Can be customized via drop_kwargs argument.

__init__(*, map_as: str, encodings: Tuple[str], supported_hue_styles: Tuple[str] | None = None, register_as_kind: bool | str = True, overwrite_existing: bool = False, drop_kwargs: Tuple[str] = ('_fg', 'meta_data', 'hue_style', 'add_guide'), parse_cmap_and_norm_kwargs: bool = True, **default_kwargs)[source]#

Initialize the decorator, making the decorated function capable of performing a facet grid plot.

Parameters:

map_as (str) – Which mapping to use. Available: dataset, dataarray and dataarray_line.
encodings (Tuple[str]) – The encodings supported by the wrapped plot function, e.g. ("x", "hue"). Note that these need to be dimensionality-reducing encodings that have a qualitatively similar effect as col & row in that they consume a data dimension. This is in contrast to plots that may represent multiple data variables, e.g. if the data comes from an xarray.Dataset; those should not be specified here.
supported_hue_styles (Tuple[str]) – Which hue styles are supported by the wrapped plot function. It is suggested to set this value if mapping via dataset or dataarray_line in order to disallow configurations that will not work with the wrapped plot function. If set to None, no check will be done.
register_as_kind (Union[bool, str], optional) – If boolean, controls whether to register the wrapped function with the generic facet grid plot, using its own name. If a string, uses that name for registration.
overwrite_existing (bool, optional) – Whether to overwrite an existing registration in _FACET_GRID_FUNCS. If False, an existing entry of the same register_as_kind value will lead to an error.
drop_kwargs (Tuple[str], optional) – Which keyword arguments to drop before invocation of the wrapped function; this can be useful to trim down the signature of the wrapped function.
parse_cmap_and_norm_kwargs (bool, optional) – Whether to parse colormap-related plot function arguments using the parse_cmap_and_norm_kwargs() function. Should be set to false if the decorated plot function takes care of these arguments itself.
**default_kwargs – Additional arguments that are passed to the single-axis plotting function. These are used both when calling it via the selected mapping function and when invoking it without a facet grid. These are recursively updated with those given upon plot function invocation.

parse_wpf_kwargs(data, **kwargs) → dict[source]#: Parses the keyword arguments in preparation for invoking the wrapped plot function. This can happen both in context of a facet grid mapping and a single invocation.

__call__(plot_single_axis: Callable) → Callable[source]#: Generates a standalone DAG-based plotting function that supports faceting. Additionally, integrates it as kind for the general facet grid plotting function by adding it to the global _FACET_GRID_FUNCS dictionary.

facet_grid(*, data: dict, hlpr: PlotHelper, kind: dict | str | None = None, auto_encoding: bool | dict = False, auto_encoding_options: dict | None = None, title_kwargs: dict | None = None, suptitle_kwargs: dict | None = None, squeeze: bool = True, drop_nonindexed_coords: bool = False, sel: dict | None = None, show_data: bool = False, **plot_kwargs)[source]#

A generic facet grid plot function for high-dimensional data.

This function calls the data['data'].plot function if no plot kind is given, otherwise data['data'].plot.<kind>. It is designed for plotting with xarray objects, i.e. xarray.DataArray and xarray.Dataset. Specifying the kind of plot requires the data to be of one of those types and have a dimensionality that can be represented in these plots. See the corresponding API documentation for more information.

In most cases, this function creates a so-called xarray.plot.FacetGrid object that automatically layouts and chooses a visual representation that fits the dimensionality of the data. To specify which data dimension should be represented in which way, it supports a declarative syntax: via the optional keyword arguments x, y, row, col, and/or hue (available options are listed in the corresponding plot function documentation), the representation of the data dimensions can be selected. In dantro, this is referred to as “layout encoding”.

dantro not only wraps this interface, but adds the following functionality:

the frames layout encoding argument, which behaves in the same way as the other encodings, but leads to an animation being generated, thus opening up one further dimension of representation;

the files encoding, which triggers plot config updating and thereby allows to represent data of arbitrary dimensionality; this is achieved by performing a parameter sweep plot where each point corresponds to a single plot file of a subspace of the data;

the auto_encoding feature, which allows to assign layout encodings automatically, depending on dimensions and dimension sizes of the data;

the kind: 'auto' option, which can be used in conjunction with auto_encoding to choose the plot kind automatically as well;

the col_wrap: 'auto' option, which selects the value such that the figure becomes more square-like (requires auto_encoding);

and allowing to register additional plot kind values that create plots with a custom single-axis plotting function, using the make_facet_grid_plot decorator.

For details about auto-encoding and how the plot kind is chosen, see determine_encoding() and determine_plot_kind().

Note

The way the plot data is labelled for the facet grid plot is very important to understand how this plot function behaves.

Background: One can distinguish different categories of xarray data dimensions, most relevant for association of encodings: those with and those without coordinate labels. If coordinates are available, the corresponding dimension is called indexed. Otherwise, it is a non-indexed dimension: no coordinate labels exist and hence only trivial indexing is possible.

xarray objects may also contain additional (scalar) coordinate metadata which has no relation to the data dimensions and is ignored here.

Furthermore, there can be additional non-scalar coordinates that are associated with existing data dimensions, but are not acting as their index; these run “in parallel” to the existing coordinates along that dimension.
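The categories described in this note can be made concrete with a small toy model. The sketch below does not use xarray itself; it only mimics the convention that an index coordinate carries the same name as its dimension. All names (time, x, y, x_label, seed) are hypothetical.

```python
from typing import Dict, List, Tuple

# Toy description of a labelled array: its dimension names and, for each
# coordinate, the dimensions that this coordinate spans.
dims: Tuple[str, ...] = ("time", "x", "y")
coords: Dict[str, Tuple[str, ...]] = {
    "time": ("time",),   # index coordinate of the "time" dimension
    "x": ("x",),         # index coordinate of the "x" dimension
    "x_label": ("x",),   # runs "in parallel" along x, but is not its index
    "seed": (),          # scalar coordinate metadata
}


def classify(dims, coords) -> Dict[str, List[str]]:
    """Sort dimensions and coordinates into the categories from the note."""
    indexed = [d for d in dims if coords.get(d) == (d,)]
    non_indexed = [d for d in dims if d not in indexed]
    scalar = [c for c, span in coords.items() if span == ()]
    parallel = [c for c, span in coords.items() if span and c not in dims]
    return {
        "indexed": indexed,
        "non_indexed": non_indexed,
        "scalar_coords": scalar,
        "non_index_coords": parallel,
    }


categories = classify(dims, coords)
print(categories)
```

Here y is the only non-indexed dimension, seed is scalar metadata, and x_label is a non-index coordinate along x.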

Note

When specifying frames, the animation arguments also need to be specified. See here for more information on the expected animation parameters.

The value of the animation.enabled key is not relevant for this function; it will automatically enter or exit animation mode, depending on whether the frames argument is given or not. This uses the animation mode switching feature.

Note

Internally, this function by default calls .squeeze on the selected data (controlled by the squeeze argument), which makes it more tolerant of data that has size-1 dimension coordinates. To suppress this behaviour, set the squeeze argument accordingly.

Warning

Depending on kind and the dimensionality of the data, some plot functions might create their own figure, disregarding any previously set up figure. This includes the figure from the plot helper.

To control figure aesthetics, you can either specify matplotlib RC style parameters (via the style argument), or you can use the plot_kwargs to pass arguments to the respective plot functions. For the latter, refer to the respective documentation to find out about available arguments.

Parameters:

data (dict) – The data selected by the data transformation framework, expecting the data key.
hlpr (PlotHelper) – The plot helper
kind (str, optional) – The kind of plot to use. Options are: contourf, contour, imshow, line, pcolormesh, step, hist, scatter, errorbars and any plot kinds that were additionally registered via the make_facet_grid_plot decorator. With auto, dantro chooses an appropriate kind by itself; this setting is useful when also using the auto_encoding feature; see Automatically selecting plot kind for more information. If None is given, xarray automatically determines it using the dimensionality of the data, frequently falling back to hist for higher-dimensional data or lacking specifiers.
frames (str, optional) – Data dimension from which to create animation frames. If given, this results in the creation of an animation. If not given, a single plot is generated. Note that this requires animation options as part of the plot configuration.
auto_encoding (Union[bool, dict], optional) – Whether to choose the layout encoding options automatically. For further options, can pass a dict. See Auto-encoding of plot layout for more info.
auto_encoding_options (dict, optional) – Additional arguments for determine_encoding().
title_kwargs (dict, optional) – Keyword arguments passed on to xarray.plot.FacetGrid.set_titles() to set the template (allowing {coord} and {value} placeholders), maxchar and other properties of the title strings. Invoked only if a FacetGrid object is produced, i.e. if col and/or row encodings are used. If not given, FacetGrid still invokes the same method, but then uses default arguments.
suptitle_kwargs (dict, optional) – Keyword arguments passed on to the PlotHelper’s set_suptitle helper function. Only used if animations are enabled. The title entry can be a format string with the following keys, which are updated for each frame of the animation: dim, value. Default: {dim:} = {value:.3g}.
squeeze (bool, optional) – whether to squeeze the data before plotting, such that size-1 dimensions do not take up encoding dimensions.
drop_nonindexed_coords (bool, optional) – If true, non-indexed coordinates will be dropped.
sel (dict, optional) – A selector dict that is applied to the data to use only a subset of it for the plot; passed to xarray.Dataset.sel() or xarray.DataArray.sel(). Note that this requires the data to have indexed dimensions.
show_data (bool, optional) – If true, shows the head of the data that will be used for plotting.
**plot_kwargs – Passed on to <data>.plot or <data>.plot.<kind>. These should include the layout encoding specifiers (x, y, hue, col, row, frames, files, …).

Raises:

AttributeError – Upon unsupported kind value
ValueError – Upon any upstream error in invocation of the xarray plotting capabilities. This wraps the given error message and provides additional information that helps to track down why the plotting failed.
UpdatePlotConfig – To rewrite the plot configuration and restart this plot with a new configuration.
EnterAnimationMode – To enter animation mode if not already in it.
ExitAnimationMode – To exit animation mode if unnecessarily in it.
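To see what the default suptitle template from the suptitle_kwargs description produces, it can be evaluated with plain Python string formatting. The dimension name time and the frame values below are made up for illustration.

```python
# Default template quoted in the suptitle_kwargs description
template = "{dim:} = {value:.3g}"

# Hypothetical coordinate values along an animation dimension called "time"
frame_values = [0.0, 0.123456, 1234.5]

titles = [template.format(dim="time", value=v) for v in frame_values]
for title in titles:
    print(title)
```

The .3g format keeps three significant digits and switches to exponent notation for large values, so the title stays short for every frame.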

errorbars(*, data: dict, hlpr: PlotHelper, **kwargs)[source]#

scatter3d(*, data: dict, hlpr: PlotHelper, **kwargs)[source]#

dantro.plot.funcs.graph module#

Plot functions to draw networkx.Graph objects.

Todo

Should really integrate utopya GraphPlot here!

_wiggle_pos(pos: dict, *, x: float | None = None, y: float | None = None, seed: int | None = None) → dict[source]#

Wiggles positions by absolute random amplitudes in x and y direction

Parameters:

pos (dict) – Positions dict with values being x and y positions
x (float, optional) – Absolute wiggle amplitude
y (float, optional) – Absolute wiggle amplitude
seed (int, optional) – Seed for the numpy.random.RandomState that is used for drawing random numbers. Set to a fixed value to always get the same positions.
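A minimal sketch of the wiggling idea, assuming that offsets are drawn uniformly from [-amplitude, +amplitude]. That distribution is an assumption, and random.Random stands in for the numpy.random.RandomState mentioned above so that the example has no dependencies.

```python
import random
from typing import Dict, Optional, Tuple

Pos = Dict[str, Tuple[float, float]]


def wiggle_positions(
    pos: Pos,
    *,
    x: Optional[float] = None,
    y: Optional[float] = None,
    seed: Optional[int] = None,
) -> Pos:
    """Shift each position by a random offset of at most x (or y).

    Assumption: offsets are uniform in [-amplitude, +amplitude].
    """
    rng = random.Random(seed)  # fixed seed -> reproducible offsets
    out = {}
    for node, (px, py) in pos.items():
        dx = rng.uniform(-x, x) if x else 0.0
        dy = rng.uniform(-y, y) if y else 0.0
        out[node] = (px + dx, py + dy)
    return out


pos = {"a": (0.0, 0.0), "b": (1.0, 1.0)}
wiggled = wiggle_positions(pos, x=0.1, seed=42)
print(wiggled)
```

Because only x is given, the y coordinates stay untouched; calling the function again with the same seed yields identical positions.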

_get_positions(g: Graph, *, model: str | Callable, wiggle: dict = None, **kwargs) → dict[source]#

Returns the positions dict for the given graph, created from a networkx layouting algorithm of a certain name or an arbitrary callable.

Parameters:

g (Graph) – The graph object for which to create the layout
model (Union[str, Callable]) – Name of the layouting model or the layouting function itself. If starting with graphviz_<prog>, will invoke networkx.drawing.nx_agraph.graphviz_layout() with the given value for prog. Note that these only take a single keyword argument, args. If it is a string, it’s looked up from the networkx namespace. If it is a callable, it is invoked with g as only positional argument and **kwargs as keyword arguments.
wiggle (dict, optional) – If given, will postprocess the positions dict by randomly wiggling x and y coordinates according to the absolute amplitudes given as values.
**kwargs – Passed on to the layouting algorithm.

get_positions(g: Graph, *, model: str | Callable = 'spring', model_kwargs: dict = {}, fallback: str | dict = None, silent_fallback: bool = False, **kwargs) → dict[source]#

Returns the positions dict for the given graph, created from a networkx layouting algorithm of a certain name or an arbitrary callable.

This is a wrapper around _get_positions() which allows specifying a fallback layouting model to use in case the first one fails for whatever reason.

Parameters:

g (Graph) – The graph object for which to create the layout
model (Union[str, Callable], optional) – Name of the layouting model or the layouting function itself. If starting with graphviz_<prog>, will invoke networkx.drawing.nx_agraph.graphviz_layout() with the given value for prog. Note that these only take a single keyword argument, args. If it is a string, it’s looked up from the networkx namespace. If it is a callable, it is invoked with g as only positional argument and **kwargs as keyword arguments.
model_kwargs (dict, optional) – A dict where keys correspond to names of layouting models and values are parameters that are to be passed to the layouting function. This dict may contain more arguments than required, only the model key is looked up here. This can be useful for providing a wider set of defaults. These defaults are not considered when model is a callable.
fallback (Union[str, dict], optional) – The fallback model name (if a string) or a dict containing the key model and further kwargs.
silent_fallback (bool, optional) – Whether to log a visible message about the fallback or a more discreet one.
**kwargs – Passed on to the layouting algorithm in addition to the selected entry from model_kwargs. Keys given here update those from model_kwargs. Also, these are not passed on to the fallback invocation.
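The interplay of model, model_kwargs, fallback and **kwargs can be sketched without networkx. In the following toy version, the layouts registry, both layout functions and the list used as a "graph" are hypothetical stand-ins. Whether the model_kwargs defaults also apply to the fallback model is an assumption of this sketch.

```python
from typing import Callable, Dict, Optional, Union


def layout_with_fallback(
    g,
    *,
    model: Union[str, Callable],
    layouts: Dict[str, Callable],
    model_kwargs: Optional[dict] = None,
    fallback: Union[str, dict, None] = None,
    **kwargs,
) -> dict:
    """Try the primary layout; on any error, retry with the fallback."""
    model_kwargs = model_kwargs or {}

    def run(m, extra: dict) -> dict:
        if callable(m):
            # defaults from model_kwargs are not considered for callables
            return m(g, **extra)
        # defaults for this model name, updated by explicitly given kwargs
        return layouts[m](g, **{**model_kwargs.get(m, {}), **extra})

    try:
        return run(model, kwargs)
    except Exception:
        if fallback is None:
            raise
        fb = {"model": fallback} if isinstance(fallback, str) else dict(fallback)
        # the original **kwargs are deliberately not forwarded to the fallback
        return run(fb.pop("model"), fb)


def failing_layout(g, **kwargs):
    raise RuntimeError("layout failed")


def line_layout(g, *, spacing: float = 1.0):
    return {node: (i * spacing, 0.0) for i, node in enumerate(g)}


layouts = {"failing": failing_layout, "line": line_layout}
positions = layout_with_fallback(
    ["a", "b", "c"],
    model="failing",
    layouts=layouts,
    model_kwargs={"line": {"spacing": 2.0}},
    fallback="line",
    some_arg=1,  # only reaches the primary model
)
print(positions)
```

Keeping the explicit kwargs away from the fallback avoids passing arguments that only the primary model understands.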

_draw_graph(g: Graph, *, ax: Axes = None, drawing: dict = {}, layout: dict = {}) → list[source]#

Draws a graph using networkx.drawing.nx_pylab.draw_networkx_nodes(), networkx.drawing.nx_pylab.draw_networkx_edges(), and networkx.drawing.nx_pylab.draw_networkx_labels().

Warning

This function is not yet completed and may change anytime.

Parameters:

g (Graph) – The graph to draw
ax (Axes, optional) – The axes to draw the graph on
drawing (dict, optional) – Drawing arguments, containing the nodes, edges and labels keys. The labels key can contain the from_attr key which will read the attribute specified there and use it for the label.
layout (dict, optional) – Used to generate node positions via the get_positions() function.

dantro.plot.funcs.multiplot module#

Generic, DAG-based multiplot function for the PyPlotCreator and derived plot creators.

multiplot(*, hlpr: PlotHelper, to_plot: List[dict] | Dict[Tuple[int, int], List[dict]], data: dict, funcs: Dict[str, Callable] = None, show_hints: bool = True, **shared_kwargs) → None[source]#

Consecutively call multiple plot functions on one or multiple axes.

to_plot contains all relevant information for the functions to plot. If to_plot is list-like, the plot functions are applied to the current axes created through the hlpr. If to_plot is dict-like, the keys specify the coordinate pair selecting an ax to plot on, e.g. (0,0), while the values specify a list of plot function configurations to apply consecutively. Each list entry specifies one function plot and is parsed via the parse_function_specs() function.

The multiplot works with any plot function that either operates on the current axis and does not create a new figure or does not require an axis at all.

Note

While most functions will automatically operate on the current axis, some function calls may require an axis object. If so, use the pass_axis_object_as argument to specify the name of the keyword argument as which the current axis is to be passed to the function call.

Look at the multiplot documentation for further information.
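The two accepted shapes of to_plot can be summarised in a short sketch. The helper below is hypothetical and only normalises the specification into (axis coordinates, function specs) pairs; the actual axis selection and function invocation are left out.

```python
from typing import Iterator, List, Optional, Tuple, Union


def iter_axis_specs(
    to_plot: Union[list, dict],
) -> Iterator[Tuple[Optional[Tuple[int, int]], List[dict]]]:
    """Yield (axis_coords, function_specs); None stands for the current axis."""
    if isinstance(to_plot, (list, tuple)):
        yield None, list(to_plot)
    elif isinstance(to_plot, dict):
        for coords, specs in to_plot.items():
            yield tuple(coords), list(specs)
    else:
        raise TypeError(
            "to_plot must be list-like or dict-like, got "
            f"{type(to_plot).__name__}"
        )


single_axis = [{"function": "sns.lineplot"}, {"function": "sns.despine"}]
two_axes = {
    (0, 0): [{"function": "sns.lineplot"}],
    (1, 0): [{"function": "sns.scatterplot"}],
}

for coords, specs in iter_axis_specs(two_axes):
    print(coords, [spec["function"] for spec in specs])
```

Anything that is neither list-like nor dict-like is rejected with a TypeError, mirroring the behaviour documented for multiplot.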

Example

A simple to_plot specification for a single axis may look like this:

to_plot:
  - function: sns.lineplot
    data: !dag_result data
    # Note that especially seaborn plot functions require a
    # `data` input argument that can conveniently be
    # provided via the `!dag_result` YAML-tag.
    # If not provided, nothing is plotted without emitting
    # a warning.
  - function: sns.despine

A to_plot specification for a two-column subplot could look like this:

to_plot:
  [0,0]:
    - function: sns.lineplot
      data: !dag_result data
    - # ... more here ...
  [1,0]:
    - function: sns.scatterplot
      data: !dag_result data

If function is a string it is looked up from the following dictionary:

# Seaborn - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
# https://seaborn.pydata.org/api.html

# Relational plots
"sns.scatterplot":      _sns.scatterplot,
"sns.lineplot":         _sns.lineplot,

# Distribution plots
"sns.histplot":         _sns.histplot,
"sns.kdeplot":          _sns.kdeplot,
"sns.ecdfplot":         _sns.ecdfplot,
"sns.rugplot":          _sns.rugplot,

# Categorical plots
"sns.stripplot":        _sns.stripplot,
"sns.swarmplot":        _sns.swarmplot,
"sns.boxplot":          _sns.boxplot,
"sns.violinplot":       _sns.violinplot,
"sns.boxenplot":        _sns.boxenplot,
"sns.pointplot":        _sns.pointplot,
"sns.barplot":          _sns.barplot,
"sns.countplot":        _sns.countplot,

# Regression plots
"sns.regplot":          _sns.regplot,
"sns.residplot":        _sns.residplot,

# Matrix plots
"sns.heatmap":          _sns.heatmap,

# Utility functions
"sns.despine":          _sns.despine,

# Matplotlib - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
# https://matplotlib.org/tutorials/introductory/sample_plots.html

# Relational plots
"plt.fill":             _plt.fill,
"plt.scatter":          _plt.scatter,
"plt.plot":             _plt.plot,
"plt.polar":            _plt.polar,
"plt.loglog":           _plt.loglog,
"plt.semilogx":         _plt.fill,
"plt.semilogy":         _plt.semilogy,
"plt.errorbar":         _plt.errorbar,

# Distribution plots
"plt.hist":             _plt.hist,
"plt.hist2d":           _plt.hist2d,

# Categorical plots
"plt.bar":              _plt.bar,
"plt.barh":             _plt.barh,
"plt.pie":              _plt.pie,
"plt.table":            _plt.table,

# Matrix plots
"plt.imshow":           _plt.imshow,
"plt.pcolormesh":       _plt.pcolormesh,

# Vector plots
"plt.contour":          _plt.contour,
"plt.quiver":           _plt.quiver,
"plt.streamplot":       _plt.streamplot,

It is also possible to import callables on the fly. To do so, pass a 2-tuple of (module, name) to function, which will then be loaded using import_module_or_object().
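The two lookup paths, by name from a dictionary or by a (module, name) pair, might be sketched as follows. Here importlib is used as a stand-in for dantro's own import helper, and resolve_function is a hypothetical name; the real helper may handle more cases.

```python
import importlib
from typing import Callable, Dict, Sequence, Union


def resolve_function(
    func: Union[str, Sequence[str]], funcs: Dict[str, Callable]
) -> Callable:
    """Resolve a function spec (name or (module, name) pair) to a callable."""
    if isinstance(func, str):
        # plain string: look it up in the dictionary of known functions
        return funcs[func]
    module_name, attr_path = func
    obj = importlib.import_module(module_name)
    for part in attr_path.split("."):  # supports dotted names like "path.join"
        obj = getattr(obj, part)
    return obj


known = {"builtins.len": len}

by_name = resolve_function("builtins.len", known)
sqrt = resolve_function(("math", "sqrt"), known)
basename = resolve_function(("os.path", "basename"), known)

print(by_name([1, 2, 3]), sqrt(9.0), basename("some/dir/file.txt"))
```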

Parameters:

hlpr (PlotHelper) – The PlotHelper instance for this plot, carrying the to-be-plotted-on figure object.
to_plot (Union[list, dict]) – The plot specifications. If list-like, assumes that there is only a single axis and applies all functions to that axis. If dict-like, expects 2-tuples for keys and selects the axis before commencing to plot. Beforehand, the figure needs to have been set up accordingly via the setup_figure helper.
data (dict) – Data from TransformationDAG selection. These results are ignored; data needs to be passed via the result placeholders! See above.
funcs (Dict[str, Callable], optional) – If given, use this dictionary to look up functions by name. If not given, will use a default dict with a set of matplotlib and seaborn functions.
show_hints (bool) – Whether to show hints in the case of not passing any arguments to a plot function.
**shared_kwargs (dict) – Shared kwargs for all plot functions. They are recursively updated, if to_plot specifies the same kwargs.
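What "recursively updated" means for shared_kwargs can be shown with a small stand-alone helper. The function and the keyword names below are illustrative only; dantro ships its own recursive update utility.

```python
import copy


def recursive_update(base: dict, update: dict) -> dict:
    """Return a copy of base, updated recursively with the entries of update."""
    out = copy.deepcopy(base)
    for key, val in update.items():
        if isinstance(val, dict) and isinstance(out.get(key), dict):
            # merge nested dicts instead of replacing them wholesale
            out[key] = recursive_update(out[key], val)
        else:
            out[key] = copy.deepcopy(val)
    return out


# Hypothetical shared kwargs and the kwargs of one to_plot entry
shared_kwargs = {"alpha": 0.5, "err_kws": {"capsize": 2, "lw": 1}}
spec_kwargs = {"err_kws": {"lw": 3}, "color": "k"}

merged = recursive_update(shared_kwargs, spec_kwargs)
print(merged)
```

The entry-specific lw overrides the shared one, while capsize from the shared defaults survives because the nested dict is merged rather than replaced.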

Warning

Note that especially seaborn plot functions require a data argument that needs to be passed via a !dag_result key, see Using data transformation results in the plot configuration. The multiplot function neither expects nor automatically passes a data DAG-node to the individual functions.

Note

If a plot fails and the helper is configured to not raise on a failing invocation, the logger will inform about the error. This allows other functions to still be applied to the same axis.

Raises:

TypeError – On a non-list-like or non-dict-like to_plot argument.

dantro.plot.funcs.snsplot module#

Implements seaborn-based plotting functions

normalize_df_names(df: DataFrame) → DataFrame[source]#: Normalizes index and column names in place by prefixing index_ or col_ if they are not named.

apply_selection(df: DataFrame, **sel) → DataFrame[source]#: Apply a selection, defined by key-value pairs, on a DataFrame.

build_df_selector(df: DataFrame, vars: List[str], **sel) → Dict[str, ParamDim | Any][source]#

Builds a selector dict for DataFrame selections, using ParamDim as values.

This method also combines the parameter space selector with an existing selector dict, sel, and throws an error if there is an overlap between keys in vars and sel.
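The overlap check and the resulting parameter sweep can be sketched in plain Python. Lists stand in for ParamDim objects here, the function name is hypothetical, and raising a ValueError is an assumption, since the text above only states that an error is thrown.

```python
import itertools
from typing import Any, Dict, List


def combine_selectors(sweep: Dict[str, List[Any]], **sel) -> Dict[str, Any]:
    """Merge sweep dimensions with fixed selections, refusing overlapping keys."""
    overlap = sorted(set(sweep) & set(sel))
    if overlap:
        raise ValueError(f"Keys appear in both vars and sel: {overlap}")
    return {**sel, **sweep}


# Hypothetical sweep over two columns plus one fixed selection
sweep = {"seed": [1, 2], "rate": [0.1, 0.5]}
selector = combine_selectors(sweep, model="A")

# Every combination of sweep values corresponds to one plot file
points = [
    {"model": "A", **dict(zip(sweep, combo))}
    for combo in itertools.product(*sweep.values())
]
for point in points:
    print(point)
```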

convert_to_df(df: Dataset | DataArray | DataFrame, to_dataframe_kwargs: dict | None = None) → DataFrame[source]#: Converts an xarray DataArray or Dataset to a pandas DataFrame

_log_df_summary(df: DataFrame)[source]#: Logs a summary of the data frame properties

sample_df(df: DataFrame, sample: int, sample_kwargs: dict) → DataFrame[source]#

snsplot(*, data: dict, hlpr: PlotHelper, sns_kind: str, free_indices: Tuple[str, ...] | None = None, optional_free_indices: Tuple[str, ...] = (), auto_encoding: bool | dict | None = None, auto_encoding_options: dict | None = None, reset_index: bool | List[str] = False, to_dataframe_kwargs: dict | None = None, normalize_names: bool = False, dropna: bool = False, dropna_kwargs: dict | None = None, sample: bool | int = False, sample_kwargs: dict | None = None, _sel: dict | None = None, **plot_kwargs) → None[source]#

An experimental interface to seaborn’s figure-level plot functions.

Seaborn plot functions are selected via the sns_kind argument:

relplot: seaborn.relplot()
displot: seaborn.displot()
catplot: seaborn.catplot()
lmplot: seaborn.lmplot()
clustermap: seaborn.clustermap() (not faceting)
pairplot: seaborn.pairplot() (not faceting)
jointplot: seaborn.jointplot() (not faceting)

This plot function also supports the files encoding, which triggers plot config updating and thereby allows representing data of arbitrary dimensionality; this is achieved by performing a parameter sweep plot where each point corresponds to a single plot file of a subspace of the data.

Warning

This plot function is still being experimented with and surely will show some quirks. Please report any errors or unexpected behavior and note that the interface may still change in future versions.

Parameters:

data (dict) – The data transformation framework results, expecting a single entry data which can be a pandas.DataFrame or an xarray.DataArray or xarray.Dataset.
hlpr (PlotHelper) – The plot helper instance
sns_kind (str) – Which seaborn plot to use, see list above.
free_indices (Tuple[str, ...], optional) – Which index names not to associate with a layout encoding; seaborn uses these to calculate the distribution statistics.
optional_free_indices (Tuple[str, ...], optional) – These indices will be added to the free indices if they are part of the data frame. Otherwise, they are silently ignored.
auto_encoding (Union[bool, dict], optional) – Whether to use auto-encoding to map encodings to data variables or dimensions; see determine_encoding().
auto_encoding_options (dict, optional) – Additional arguments for determine_encoding().
reset_index (Union[bool, List[str]], optional) – If a boolean, controls whether to reset indices such that only the free_indices remain as indices and all others are converted into columns. Otherwise, assumes it’s a sequence of index names to reset.
to_dataframe_kwargs (dict, optional) – For xarray data types, this is used to convert the given data into a pandas.DataFrame.
normalize_names (bool, optional) – If True, unnamed columns and indices will get names assigned. This makes handling of various data frames easier.
dropna (bool, optional) – If True, will invoke .dropna on the data.
dropna_kwargs (dict, optional) – Additional arguments to the .dropna call on the data.
sample (Union[bool, int], optional) – If True, will sample a subset from the final dataframe, controlled by sample_kwargs. If an integer, will use that as the absolute number of samples to draw. If a float in the unit interval, will use it as the fraction of samples to draw.
sample_kwargs (dict, optional) – Passed to pandas.DataFrame.sample(). May contain n for absolute or frac for relative number of samples to keep.
_sel (dict, optional) – Select a subset of the dataframe. (For internal use only!)
**plot_kwargs – Passed on to the selected plotting function, containing the respective encoding variables, e.g. x, y, hue, col, row, files, …
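How the different value types of the sample argument could translate into arguments for pandas.DataFrame.sample() is sketched below. The helper is hypothetical and only builds the keyword arguments (n and frac are real parameters of DataFrame.sample); the actual implementation may differ in its details.

```python
from typing import Optional, Union


def resolve_sample_kwargs(
    sample: Union[bool, int, float, None],
    sample_kwargs: Optional[dict] = None,
) -> Optional[dict]:
    """Return kwargs for DataFrame.sample, or None if no sampling is requested."""
    kwargs = dict(sample_kwargs or {})
    if sample is None or sample is False:
        return None
    if sample is True:
        # sampling is fully controlled by sample_kwargs
        return kwargs
    # bool was handled above; this matters because bool is a subclass of int
    if isinstance(sample, int):
        kwargs["n"] = sample
    elif isinstance(sample, float) and 0.0 <= sample <= 1.0:
        kwargs["frac"] = sample
    else:
        raise ValueError(f"Invalid sample argument: {sample!r}")
    return kwargs


print(resolve_sample_kwargs(True, {"frac": 0.1}))
print(resolve_sample_kwargs(100))
print(resolve_sample_kwargs(0.25, {"random_state": 42}))
```

Checking for True and False before the int branch is the important detail, since isinstance(True, int) holds in Python.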
