dantro.utils.coords module

This module provides coordinate parsing abilities.

dantro.utils.coords.extract_dim_names(attrs: dict, *, ndim: int, attr_name: str, attr_prefix: str) → Tuple[str][source]

Extract dimension names from the given attributes.

This can be done in two ways:
  1. A list of dimension names was specified in an attribute with the name specified by the attr_name argument.

  2. One by one via attributes that start with the string prefix defined in attr_prefix. This can be used if not all dimension names are available. Note that this option is _not_ used if option 1 is used!

Parameters
  • attrs (dict) – The dict-like object to read attributes from

  • obj_logstr (str) – A string that is given as context in log and error messages, ideally describing the object these attributes belong to

  • ndim (int) – The expected rank, i.e. the number of dimension names

  • attr_name (str) – The key to look for in attrs that would give a sequence of the dimension names.

  • attr_prefix (str) – The prefix to look for in the keys of the attrs that would specify the name of a single dimension.

Returns

The dimension names or None as placeholder

Return type

Tuple[Union[str, None]]

Raises
  • TypeError – Attribute found at attr_name was a string, was not iterable or was not a sequence of strings

  • ValueError – Length mismatch of attribute found at attr_name and the data.
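The two lookup strategies can be illustrated with a small stand-alone sketch. This is not dantro's implementation: the helper name sketch_dim_names, the attribute names dims and dim_name__, and the convention that the prefix is followed by the dimension index are all assumptions made for illustration.

```python
from typing import Optional, Tuple


def sketch_dim_names(attrs: dict, *, ndim: int, attr_name: str,
                     attr_prefix: str) -> Tuple[Optional[str], ...]:
    """Hypothetical stand-in for the lookup described above."""
    # Option 1: a full sequence of names stored under ``attr_name``.
    if attr_name in attrs:
        names = attrs[attr_name]
        if isinstance(names, str):
            raise TypeError(f"'{attr_name}' must be a sequence, not a string")
        names = tuple(names)  # raises TypeError if not iterable
        if not all(isinstance(n, str) for n in names):
            raise TypeError(f"'{attr_name}' must contain only strings")
        if len(names) != ndim:
            raise ValueError(f"Expected {ndim} names, got {len(names)}")
        return names

    # Option 2: individual names; missing ones stay None.
    # ASSUMPTION: the key is ``attr_prefix`` followed by the dimension index.
    names = [None] * ndim
    for key, value in attrs.items():
        if not key.startswith(attr_prefix):
            continue
        idx = int(key[len(attr_prefix):])
        if not 0 <= idx < ndim:
            raise ValueError(f"Dimension index {idx} out of range")
        names[idx] = value
    return tuple(names)


full = sketch_dim_names({"dims": ["time", "x"]}, ndim=2,
                        attr_name="dims", attr_prefix="dim_name__")
partial = sketch_dim_names({"dim_name__1": "x"}, ndim=2,
                           attr_name="dims", attr_prefix="dim_name__")
print(full)     # ('time', 'x')
print(partial)  # (None, 'x')
```

In the second call only one dimension is named, so the other entry is the None placeholder mentioned under Returns.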

dantro.utils.coords._coords_start_and_step(cargs, *, data_shape: tuple, dim_num: int, **__) → Iterable[int][source]

Interprets cargs as the integer start and step of a range expression and uses the length of the data dimension as the number of steps.

dantro.utils.coords._coords_trivial(_, *, data_shape: tuple, dim_num: int, **__) → Iterable[int][source]

Returns trivial coordinates for the given dimension by creating a range iterator from the selected data shape.
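As a rough illustration of the two helpers above, the following sketch builds the same kind of range objects with plain Python. The function names are hypothetical, and computing the stop value as start + step * length is an assumption derived from the description, not taken from dantro's source.

```python
def sketch_start_and_step(cargs, *, data_shape: tuple, dim_num: int) -> range:
    """Hypothetical equivalent of the start-and-step interpretation."""
    start, step = cargs
    length = data_shape[dim_num]
    # ASSUMPTION: the stop value is chosen such that exactly ``length``
    # coordinates are produced.
    return range(start, start + step * length, step)


def sketch_trivial(*, data_shape: tuple, dim_num: int) -> range:
    """Hypothetical equivalent of the trivial coordinates: 0, 1, 2, ..."""
    return range(data_shape[dim_num])


shape = (4, 3)
stepped = list(sketch_start_and_step((10, 5), data_shape=shape, dim_num=0))
trivial = list(sketch_trivial(data_shape=shape, dim_num=1))
print(stepped)  # [10, 15, 20, 25]
print(trivial)  # [0, 1, 2]
```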

dantro.utils.coords._coords_scalar(coord, **__) → List[TCoord][source]

Returns a single, scalar coordinate, i.e. a list of length 1.

dantro.utils.coords._coords_linked(cargs, *, link_anchor_obj, **__) → dantro.utils.link.Link[source]

Creates a Link object which is to be used for coordinates

dantro.utils.coords.extract_coords_from_attrs(obj: Union[dantro.abc.AbstractDataContainer, numpy.ndarray], *, dims: Tuple[Optional[str]], strict: bool, coords_attr_prefix: str, default_mode: str, mode_attr_prefix: str = None, attrs: dict = None) → Dict[str, Sequence[TCoord]][source]

Extract coordinates from the given object’s attributes.

This is done by iterating over the given dims and then looking for attributes that are prefixed with coords_attr_prefix and ending in the name of the dimension, e.g. attributes like coords__time.

The value of that attribute is then evaluated according to a so-called attribute mode. By default, the mode set by default_mode is used, but it can be set explicitly for each dimension by the mode_attr_prefix parameter.

The resulting number of coordinates for a dimension always needs to match the length of that dimension. However, the corresponding error can only be raised once this information is applied.

Parameters
  • obj (Union[AbstractDataContainer, np.ndarray]) – The object to retrieve the attributes from (via the attrs attribute). If the attrs argument is given, will use those instead. It is furthermore expected that this object specifies the shape of the numerical data the coordinates are to be generated for by providing a shape property. This is possible with NumpyDataContainer and derived classes.

  • dims (Tuple[Union[str, None]]) – Sequence of dimension names; this may also contain None’s, which are ignored for coordinates.

  • strict (bool) – Whether to use strict checking, where no additional coordinate-specifying attributes are allowed.

  • coords_attr_prefix (str) – The attribute name prefix for coordinate specifications

  • default_mode (str) –

    The default coordinate extraction mode. Available modes:

    • values: the explicit values (iterable) to use for coordinates

    • range: range arguments

    • arange: np.arange arguments

    • linspace: np.linspace arguments

    • logspace: np.logspace arguments

    • trivial: The trivial indices. This does not require a value for the coordinate argument.

    • scalar: makes sure only a single coordinate is provided

    • start_and_step: the start and step values of an integer range expression; the stop value is deduced by looking at the length of the corresponding dimension. This is then passed to the python range function as (start, stop, step)

    • linked: Load the coordinates from a linked object within the tree; this works only if link_anchor_obj is part of a data tree at the point of coordinate resolution!

  • mode_attr_prefix (str, optional) – The attribute name prefix that can be used to specify a non-default extraction mode. If not given, the default mode will be used.

  • attrs (dict, optional) – If given, these attributes will be used instead of attempting to extract attributes from obj.

Returns

The (dim_name -> coords) mapping

Return type

TCoordsDict

Raises

ValueError – On invalid coordinates mode or (with strict attribute checking) on superfluous coordinate-setting attributes.
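The interplay of coordinate attributes and per-dimension modes might look roughly like the following stand-alone sketch. It covers only the values, range, trivial and start_and_step modes and works on a plain dict and shape tuple rather than a container. The function name, the coords_mode__ prefix, and the choice to skip dimensions that have no coordinate attribute are assumptions for illustration, not dantro's confirmed behaviour.

```python
from typing import Dict, Optional, Tuple

SKETCH_MODES = ("values", "range", "trivial", "start_and_step")


def sketch_coords_from_attrs(attrs: dict, *, shape: Tuple[int, ...],
                             dims: Tuple[Optional[str], ...],
                             coords_attr_prefix: str, default_mode: str,
                             mode_attr_prefix: Optional[str] = None
                             ) -> Dict[str, list]:
    """Hypothetical, reduced version of the attribute-based extraction."""
    coords = {}
    for dim_num, dim_name in enumerate(dims):
        if dim_name is None:
            continue  # unnamed dimensions carry no coordinates

        # Per-dimension mode, falling back to the default mode
        mode = default_mode
        if mode_attr_prefix is not None:
            mode = attrs.get(mode_attr_prefix + dim_name, default_mode)
        if mode not in SKETCH_MODES:
            raise ValueError(f"Invalid coordinate mode '{mode}'")

        cargs = attrs.get(coords_attr_prefix + dim_name)
        length = shape[dim_num]

        if mode == "trivial":
            coords[dim_name] = list(range(length))
        elif cargs is None:
            continue  # ASSUMPTION: nothing specified, so skip this dimension
        elif mode == "values":
            coords[dim_name] = list(cargs)
        elif mode == "range":
            coords[dim_name] = list(range(*cargs))
        else:  # start_and_step
            start, step = cargs
            coords[dim_name] = list(range(start, start + step * length, step))
    return coords


attrs = {
    "coords__time": [0, 10],
    "coords_mode__time": "start_and_step",  # hypothetical mode attribute
    "coords__x": [1.0, 2.0, 3.0],
}
result = sketch_coords_from_attrs(
    attrs, shape=(4, 3), dims=("time", "x"),
    coords_attr_prefix="coords__", default_mode="values",
    mode_attr_prefix="coords_mode__",
)
print(result)  # {'time': [0, 10, 20, 30], 'x': [1.0, 2.0, 3.0]}
```

Here the time dimension overrides the default mode, while x falls back to values.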

dantro.utils.coords.extract_coords_from_name(obj: dantro.abc.AbstractDataContainer, *, dims: Tuple[str], separator: str, attempt_conversion: bool = True) → Dict[str, Sequence[TCoord]][source]

Given a container or group, extract the coordinates from its name.

The name of the object may be a separator-separated string, where each segment contains the coordinate value for one dimension.

This function assumes that the coordinates for each dimension are scalar. Thus, the values of the returned dict are sequences of length 1.

Parameters
  • obj (AbstractDataContainer) – The object to get the coordinates of by inspecting its name.

  • dims (TDims) – The dimension names corresponding to the coordinates that are expected to be found in the object’s name.

  • separator (str) – The separator to apply on the name.

  • attempt_conversion (bool, optional) – Whether to attempt conversion of the string value to a numerical type.

Returns

The coordinate dict, i.e. a mapping from the external dimension names to the coordinate values. In this case, there can only be a single value for each dimension!

Return type

TCoordsDict

Raises

ValueError – Raised upon failure to extract external coordinates: if ext_dims evaluates to False, if coordinates were missing for any of the external dimensions, if the number of coordinates extracted from the name did not match the number of external dimensions, or if any of the strings extracted from the object’s name were empty.
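The name-based extraction can be pictured with the following minimal sketch, which operates on a plain string instead of a container. The function name is hypothetical, and the conversion order (int first, then float, else the unchanged string) is an assumption, not necessarily what dantro does.

```python
from typing import Dict, Tuple


def sketch_coords_from_name(name: str, *, dims: Tuple[str, ...],
                            separator: str,
                            attempt_conversion: bool = True
                            ) -> Dict[str, list]:
    """Hypothetical stand-in: one scalar coordinate per name segment."""
    if not dims:
        raise ValueError("No dimension names given")

    segments = name.split(separator)
    if len(segments) != len(dims):
        raise ValueError(
            f"Got {len(segments)} segments for {len(dims)} dimensions"
        )
    if any(seg == "" for seg in segments):
        raise ValueError("Empty coordinate string in name")

    def convert(seg: str):
        # ASSUMPTION: try int, then float, else keep the string
        for type_ in (int, float):
            try:
                return type_(seg)
            except ValueError:
                pass
        return seg

    return {
        dim: [convert(seg) if attempt_conversion else seg]
        for dim, seg in zip(dims, segments)
    }


converted = sketch_coords_from_name(
    "42;0.5;foo", dims=("seed", "beta", "label"), separator=";"
)
raw = sketch_coords_from_name(
    "42;0.5;foo", dims=("seed", "beta", "label"), separator=";",
    attempt_conversion=False,
)
print(converted)  # {'seed': [42], 'beta': [0.5], 'label': ['foo']}
print(raw)        # {'seed': ['42'], 'beta': ['0.5'], 'label': ['foo']}
```

Every value is a list of length 1, matching the assumption that the coordinates are scalar.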

dantro.utils.coords.extract_coords_from_data(obj: dantro.abc.AbstractDataContainer, *, dims: Tuple[str]) → Dict[str, Sequence[TCoord]][source]

Tries to extract the coordinates from the data of the given container or group. For that purpose, the obj needs to support the coords property.

Parameters
  • obj (AbstractDataContainer) – The object that holds the data from which the coordinates are to be extracted.

  • dims (TDims) – The sequence of dimension names for which the coordinates are to be extracted.

dantro.utils.coords.extract_coords(obj: dantro.abc.AbstractDataContainer, *, mode: str, dims: Tuple[str], use_cache: bool = False, cache_prefix: str = '__coords_cache_', **kwargs) → Dict[str, Sequence[TCoord]][source]

Wrapper around the more specific coordinate extraction functions.

Note

This function does not support the extraction of non-dimension coordinates.

Parameters
  • obj (AbstractDataContainer) – The object from which to extract the coordinates.

  • mode (str) –

    Which mode to use for extraction. Can be:

    • name: Use the name of the object

    • attrs: Use the attributes of the object

    • data: Use the data of the object

  • dims (TDims) – The dimensions for which the coordinates are to be extracted. All dimension names given here are expected to be found.

  • use_cache (bool, optional) – Whether to use the object’s attributes to write an extracted value to the cache and read it, if available.

  • cache_prefix (str, optional) – The prefix to use for writing the cache entries to the object attributes. Will suffix this with dims and coords and store the respective data there.

  • **kwargs – Passed on to the actual coordinates extraction method.
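To make the dispatch and caching idea more concrete, here is a hedged sketch that works on a plain attribute dict. The function name, the extractors mapping, and the rule for when a cache entry counts as valid are assumptions. Only the cache key names (prefix plus dims and coords) follow the parameter description above.

```python
from typing import Callable, Dict, Tuple


def sketch_extract_coords(obj_attrs: dict, *, mode: str,
                          dims: Tuple[str, ...],
                          extractors: Dict[str, Callable],
                          use_cache: bool = False,
                          cache_prefix: str = "__coords_cache_") -> dict:
    """Hypothetical wrapper: pick an extractor by mode, optionally cache."""
    dims_key = cache_prefix + "dims"
    coords_key = cache_prefix + "coords"

    # ASSUMPTION: a cache entry is valid if it was written for the same dims
    if (use_cache and coords_key in obj_attrs
            and obj_attrs.get(dims_key) == list(dims)):
        return obj_attrs[coords_key]

    if mode not in extractors:
        raise ValueError(f"Invalid extraction mode '{mode}'")
    coords = extractors[mode](dims)

    if use_cache:
        obj_attrs[dims_key] = list(dims)
        obj_attrs[coords_key] = coords
    return coords


calls = []


def by_name(dims):
    """Dummy extractor that records how often it was invoked."""
    calls.append(dims)
    return {dim: [0] for dim in dims}


attrs = {}
first = sketch_extract_coords(attrs, mode="name", dims=("x",),
                              extractors={"name": by_name}, use_cache=True)
second = sketch_extract_coords(attrs, mode="name", dims=("x",),
                               extractors={"name": by_name}, use_cache=True)
print(first == second, len(calls))  # True 1
```

The second call is served from the attribute cache, so the dummy extractor runs only once.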