dantro.groups package#

The groups sub-package implements BaseDataGroup specializations.


Submodules#

dantro.groups.graph module#

This module implements the GraphGroup, which provides an interface between hierarchically stored data and the creation of graph objects using the networkx package and its networkx.Graph classes.

See The GraphGroup for more information.

class GraphGroup(*args, **kwargs)[source]#

Bases: dantro.base.BaseDataGroup

The GraphGroup class manages groups of graph data containers and provides the possibility to create networkx graph objects using the data inside this group.

See The GraphGroup for more information.

_ALLOWED_CONT_TYPES = (<class 'dantro.containers.xr.XrDataContainer'>, <class 'dantro.groups.labelled.LabelledDataGroup'>)#

The types that are allowed to be stored in this group. If None, the dantro base classes are allowed

_GG_node_container = 'nodes'#
_GG_edge_container = 'edges'#
_GG_attr_directed = 'directed'#
_GG_attr_parallel = 'parallel'#
_GG_attr_edge_container_is_transposed = 'edge_container_is_transposed'#
_GG_attr_keep_dim = 'keep_dim'#
_GG_WARN_UPON_BAD_ALIGN = True#
__init__(*args, **kwargs)[source]#

Initialize a GraphGroup.

Parameters
property property_maps: dict#

The property maps associated with this group, keyed by name.

property node_container#

Returns the associated node container of this graph group

property edge_container#

Returns the associated edge container of this graph group

property default_keep_dim#

The default dimensions not to be squeezed during data selection as specified in the respective group attribute.

_get_item_or_pmap(key: Union[str, List[str]])[source]#

Returns the object accessible via key. In addition to retrieving objects in this group, the method also allows accessing data stored in property maps.

Parameters

key (Union[str, List[str]]) – The object to retrieve. If this is a path, the lookup recurses down until reaching the end of the path.

Returns

The object at key

Raises

KeyError – If no such key can be found

_get_data_at(*, data: Union[XrDataContainer, LabelledDataGroup], sel: dict = None, isel: dict = None, at_time: int = None, at_time_idx: int = None, keep_dim=None) Union[DataArray, XrDataContainer][source]#

Returns an xarray.DataArray containing the data specified via the selectors sel and isel. Any dimension of size 1 is removed from the selected data.

Warning

Any invalid key in sel and isel is ignored silently.

Parameters
  • data (Union[XrDataContainer, LabelledDataGroup]) – Data to select from.

  • sel (dict, optional) – Dict of coordinate values keyed by dimensions, passed to data.sel. Used to select data via index label. May be given together with isel if no key exists in both.

  • isel (dict, optional) – Dict of indexes keyed by dimensions, passed to data.isel. Used to select data via index. May be given together with sel if no key exists in both.

  • at_time (int, optional) – Select along time dimension via index label. Translated to sel = dict(time=at_time), potentially overwriting an existing time entry.

  • at_time_idx (int, optional) – Select along time dimension via index. Translated to isel = dict(time=at_time_idx), potentially overwriting an existing time entry.

  • keep_dim (optional) – Iterable containing names of the dimensions that must not be squeezed.

Returns

The selected data

Return type

DataArray

Raises

ValueError – On keys that exist in both sel and isel
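The interplay of the four selector arguments can be sketched in plain Python. This is a minimal illustration, not dantro's implementation: `merge_selectors` is a hypothetical helper, and it assumes that the overlap check runs after the `at_time` and `at_time_idx` shortcuts have been applied.

```python
from typing import Optional, Tuple


def merge_selectors(
    sel: Optional[dict] = None,
    isel: Optional[dict] = None,
    at_time: Optional[int] = None,
    at_time_idx: Optional[int] = None,
) -> Tuple[dict, dict]:
    """Hypothetical helper: combine the selector arguments into (sel, isel)."""
    # Copy, so that the caller's dicts are not modified
    sel = dict(sel) if sel else {}
    isel = dict(isel) if isel else {}

    # The shortcuts overwrite an existing ``time`` entry, as documented
    if at_time is not None:
        sel["time"] = at_time
    if at_time_idx is not None:
        isel["time"] = at_time_idx

    # A dimension may be selected by label or by index, but not by both
    # (assumption: this check happens after the shortcuts were applied)
    overlap = sorted(set(sel) & set(isel))
    if overlap:
        raise ValueError(f"Keys given in both sel and isel: {overlap}")

    return sel, isel
```

With this sketch, `merge_selectors(sel={"time": 3}, at_time=5)` yields a `sel` dict in which `time` is 5, mirroring the documented overwrite.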

_prepare_edge_data(*, edges, max_tuple_size: int)[source]#

Prepares the edge data. Depending on the _GG_attr_edge_container_is_transposed class attribute, the edge data is transposed or not. If the attribute does not exist, the data is transposed only if the correct shape could unambiguously be deduced.

Parameters
  • edges – The edge data stored in a 2-dimensional container

  • max_tuple_size (int) – The maximum allowed edge tuple size (4 for networkx.MultiGraph, else 3). Used when attempting to deduce the correct shape automatically.

Returns

The edge data, possibly transposed

Raises

TypeError – Edge data is not 2-dimensional
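One way to picture the automatic shape deduction is the following sketch. It is an assumption-laden illustration rather than the actual dantro logic: `prepare_edges` and its `is_transposed` argument are hypothetical, the input is taken to be a rectangular nested sequence, and an edge tuple is assumed to hold between 2 and `max_tuple_size` entries.

```python
from typing import List, Optional, Sequence


def prepare_edges(
    edges: Sequence[Sequence],
    *,
    max_tuple_size: int,
    is_transposed: Optional[bool] = None,
) -> List[tuple]:
    """Hypothetical helper returning the edges as a list of edge tuples."""
    rows = [tuple(row) for row in edges]
    n_rows = len(rows)
    n_cols = len(rows[0]) if rows else 0

    if is_transposed is None:
        # Assumption: a valid edge tuple has 2 to max_tuple_size entries
        valid = range(2, max_tuple_size + 1)
        # Transpose only in the unambiguous case: the current layout cannot
        # be a list of edge tuples, but the transposed layout can
        is_transposed = n_cols not in valid and n_rows in valid

    return [tuple(col) for col in zip(*rows)] if is_transposed else rows
```

A container of shape (2, 5) is transposed into five 2-tuples here, whereas a (2, 2) container is ambiguous and is returned unchanged unless `is_transposed` is given explicitly.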

_prepare_property_data(name: str, data)[source]#

Prepares external property data.

Parameters
  • name (str) – The property’s name

  • data – The property data

Returns

The data, potentially converted to an XrDataContainer

Raises

TypeError – On invalid type of data

_check_alignment(*, ent, prop)[source]#

Checks the alignment of property data and entity (node or edge) data. If self._GG_WARN_UPON_BAD_ALIGN is True, warn on possible pitfalls.

Parameters
  • ent – The entity (node or edge) data

  • prop – The property data

register_property_map(key: str, data)[source]#

Registers a new property map. It allows for the given data to be accessed internally by the specified key.

Parameters
  • key (str) – The key via which the registered data will be available

  • data – The data to be mapped. If the given data is not an allowed container type, an attempt is made to construct an XrDataContainer with the data. Only if this operation fails, will property map registration fail.

Raises

ValueError – On invalid key

create_graph(*, directed: bool = None, parallel_edges: bool = None, node_props: list = None, edge_props: list = None, sel: dict = None, isel: dict = None, at_time: int = None, at_time_idx: int = None, align: bool = False, keep_dim=None, **graph_kwargs) Graph[source]#

Create a networkx.Graph (or a more specialized graph type) object from the node and edge data associated with this graph group. Optionally, node and edge properties can be added from data stored or registered in the graph group. The coordinates for the selected or squeezed dimensions of the node, edge, and property data are stored as graph attributes in g.graph.

Note

Any pre-selection specified by sel, isel, at_time, or at_time_idx will be applied to the node data, edge data, as well as any given property data.

Warning

Any invalid key in sel and isel is ignored silently (see _get_data_at()).

Parameters
  • directed (bool, optional) – If true, the graph will be directed. If not given, the value given by the group attribute with name _GG_attr_directed is used instead.

  • parallel_edges (bool, optional) – If true, the graph will allow parallel edges. If not given, an attempt is made to read the value from the group attribute with name _GG_attr_parallel.

  • node_props (list, optional) – List of names specifying the containers that contain the node property data.

  • edge_props (list, optional) – List of names specifying the containers that contain the edge property data.

  • sel (dict, optional) – Dict of coordinate values keyed by dimensions, passed to _get_data_at(). Used to select data via index label.

  • isel (dict, optional) – Dict of indexes keyed by dimensions, passed to _get_data_at(). Used to select data via index.

  • at_time (int, optional) – Select along time dimension via index label. Translated to sel = dict(time=at_time).

  • at_time_idx (int, optional) – Select along time dimension via index. Translated to isel = dict(time=at_time_idx).

  • align (bool, optional) – If True, the property data is aligned with the node/edge data using xarray.align (default: False). The indexes of the <node/edge>_container are used for each dimension. If the class variable _GG_WARN_UPON_BAD_ALIGN is True, warn upon missing values or if no re-ordering was done. Any dimension of size 1 is squeezed and thus alignment (via align=True) will have no effect on such dimensions.

  • keep_dim (optional) – Iterable containing names of the dimensions that must not be squeezed. Passed on to _get_data_at().

  • **graph_kwargs – Passed to the constructor of the respective networkx graph object.

Returns

The networkx graph object. Depending on the provided information, one of the following graph objects is created: networkx.Graph, networkx.DiGraph, networkx.MultiGraph, networkx.MultiDiGraph.
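The choice among the four graph types follows from the two flags. The sketch below shows that mapping using class names only, so it runs without networkx installed; `graph_class_name` is a hypothetical helper, not part of dantro.

```python
def graph_class_name(directed: bool, parallel_edges: bool) -> str:
    """Hypothetical helper: name of the networkx class matching the flags."""
    if parallel_edges:
        # Multi-graph types allow several edges between the same node pair
        return "MultiDiGraph" if directed else "MultiGraph"
    return "DiGraph" if directed else "Graph"
```

With networkx available, `getattr(networkx, graph_class_name(True, False))` would resolve to `networkx.DiGraph`.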

set_node_property(*, g: Graph, name: str, data=None, align: bool = False, keep_dim=None, **selector)[source]#

Sets a property to every node in Graph g that is also in the node_container of the graph group. The coordinates for the selected or squeezed dimensions of the property data are stored as Graph attributes (in g.graph).

Parameters
  • g (Graph) – The networkx graph object

  • name (str) – If data is None, name must specify the container within the graph group that contains the property values, or be a valid key in property_maps. name is used as the name for the property in the graph object, potentially overwriting an existing property.

  • data (None, optional) – If given, load node properties directly from data. If the given data is not an allowed container type, an attempt is made to construct an XrDataContainer with the data. Only if this operation fails, the node property setting will fail.

  • align (bool, optional) – If True, the property data is aligned with the node data using xarray.align. The indexes of the node_container are used for each dimension. If the class variable _GG_WARN_UPON_BAD_ALIGN is True, warn upon missing values or if no re-ordering was done. Any dimension of size 1 is squeezed and thus alignment (via align=True) will have no effect on such dimensions.

  • keep_dim (optional) – Iterable containing names of the dimensions that must not be squeezed. Passed on to _get_data_at().

  • **selector – Specifies the selection applied to both node data and property data. Passed on to _get_data_at(). Use the sel (isel) dict to select data via coordinate value (index).

Raises

ValueError – Length mismatch of the selected property and node data

set_edge_property(*, g: Graph, name: str, data=None, align: bool = False, keep_dim=None, **selector)[source]#

Sets a property to every edge in Graph g that is also in the edge_container of the graph group. The coordinates for the selected or squeezed dimensions of the property data are stored as Graph attributes (in g.graph).

Parameters
  • g (Graph) – The networkx graph object

  • name (str) – If data is None, name must specify the container within the graph group that contains the property values, or be a valid key in property_maps. name is used as the name for the property in the graph object, potentially overwriting an existing property.

  • data (None, optional) – If given, load edge properties directly from data. If the given data is not an allowed container type, an attempt is made to construct an XrDataContainer with the data. Only if this operation fails, the edge property setting will fail.

  • align (bool, optional) – If True, the property data is aligned with the edge data using xarray.align. The indexes of the edge_container are used for each dimension. If the class variable _GG_WARN_UPON_BAD_ALIGN is True, warn upon missing values or if no re-ordering was done. Any dimension of size 1 is squeezed and thus alignment (via align=True) will have no effect on such dimensions.

  • keep_dim (optional) – Iterable containing names of the dimensions that must not be squeezed. Passed on to _get_data_at().

  • **selector – Specifies the selection applied to both edge data and property data. Passed on to _get_data_at(). Use the sel (isel) dict to select data via coordinate value (index).

Raises

ValueError – Length mismatch of the selected property and edge data

_ATTRS_CLS#

alias of dantro.base.BaseDataAttrs

_COND_TREE_CONDENSE_THRESH = 10#

Condensed tree representation threshold parameter

_COND_TREE_MAX_LEVEL = 10#

Condensed tree representation maximum level

_NEW_CONTAINER_CLS: type = None#

Which class to use for creating a new container via call to the new_container() method. If None, the type needs to be specified explicitly in the method call.

_NEW_GROUP_CLS: type = None#

Which class to use when creating a new group via new_group(). If None, the type of the current instance is used for the new group.

_STORAGE_CLS#

alias of dict

__contains__(cont: Union[str, AbstractDataContainer]) bool#

Whether the given container is in this group or not.

If this is a data tree object, it will be checked whether this specific instance is part of the group, using is-comparison.

Otherwise, assumes that cont is a valid argument to the __getitem__() method (a key or key sequence) and tries to access the item at that path, returning True if this succeeds and False if not.

Lookup complexity is that of item lookup (scalar) for both name and object lookup.

Parameters

cont (Union[str, AbstractDataContainer]) – The name of the container, a path, or an object to check via identity comparison.

Returns

Whether the given container object is part of this group or whether the given path is accessible from this group.

Return type

bool

__delitem__(key: str) None#

Deletes an item from the group

__eq__(other) bool#

Evaluates equality by making the following comparisons: identity, strict type equality, and finally: equality of the _data and _attrs attributes, i.e. the private attributes. This ensures that comparison does not trigger any downstream effects like resolution of proxies.

If types do not match exactly, NotImplemented is returned, thus referring the comparison to the other side of the ==.
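The three-step comparison can be illustrated with a small stand-in class. `Container` and `SubContainer` are hypothetical names used only for this sketch; the attribute names `_data` and `_attrs` are taken from the description above.

```python
class Container:
    """Hypothetical stand-in for a data tree object."""

    def __init__(self, data, attrs=None):
        self._data = data
        self._attrs = dict(attrs or {})

    def __eq__(self, other):
        # 1. Identity: an object always equals itself
        if other is self:
            return True
        # 2. Strict type equality: a subclass instance is not accepted;
        #    returning NotImplemented lets Python ask the other operand
        if type(other) is not type(self):
            return NotImplemented
        # 3. Compare the private attributes directly
        return self._data == other._data and self._attrs == other._attrs


class SubContainer(Container):
    """Hypothetical subclass, used to show the strict type check."""
```

Since both operands return NotImplemented for mismatching types, Python falls back to an identity comparison, so `Container([1]) == SubContainer([1])` evaluates to False.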

__format__(spec_str: str) str#

Creates a formatted string from the given specification.

Invokes further methods which are prefixed by _format_.

__getitem__(key: Union[str, List[str]]) AbstractDataContainer#

Looks up the given key and returns the corresponding item.

This supports recursive relative lookups in two ways:

  • By supplying a path as a string that includes the path separator. For example, foo/bar/spam walks down the tree along the given path segments.

  • By directly supplying a key sequence, i.e. a list or tuple of key strings.

With the last path segment, it is possible to access an element that is no longer part of the data tree; successive lookups thus need to use the interface of the corresponding leaf object of the data tree.

Absolute lookups, i.e. from path /foo/bar, are not possible!

Lookup complexity is that of the underlying data structure: for groups based on dict-like storage containers, lookups happen in constant time.

Note

This method aims to replicate the behavior of POSIX paths.

Thus, it can also be used to access the element itself or the parent element: Use . to refer to this object and .. to access this object’s parent.

Parameters

key (Union[str, List[str]]) – The name of the object to retrieve or a path via which it can be found in the data tree.

Returns

The object at key, which conforms to the dantro tree interface.

Return type

AbstractDataContainer

Raises

ItemAccessError – If no object could be found at the given key or if an absolute lookup, starting with /, was attempted.
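The relative lookup rules described above can be sketched with a minimal tree class. This is an illustration under assumptions: `Group` here is a hypothetical stand-in, the path separator is taken to be `/` as in the example path, and KeyError stands in for dantro's ItemAccessError.

```python
from typing import List, Optional, Union


class Group:
    """Hypothetical minimal group: a named node with children and a parent."""

    def __init__(self, name: str, parent: Optional["Group"] = None):
        self.name = name
        self.parent = parent
        self.children = {}

    def add(self, name: str) -> "Group":
        child = Group(name, parent=self)
        self.children[name] = child
        return child

    def __getitem__(self, key: Union[str, List[str]]) -> "Group":
        if isinstance(key, str):
            if key.startswith("/"):
                # Absolute lookups are rejected
                raise KeyError(f"Absolute lookup not possible: {key}")
            key = key.split("/")

        obj = self
        for segment in key:
            if segment in ("", "."):
                continue  # "." refers to the current object
            if segment == "..":
                if obj.parent is None:
                    raise KeyError("No parent available")
                obj = obj.parent
            else:
                obj = obj.children[segment]  # raises KeyError if missing
        return obj
```

A string path such as `"foo/bar"` and the key sequence `["foo", "bar"]` resolve to the same object in this sketch.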

__iter__()#

Returns an iterator over the OrderedDict

__len__() int#

The number of members in this group.

__repr__() str#

Same as __str__

__setitem__(key: Union[str, List[str]], val: BaseDataContainer) None#

This method is used to allow access to the content of containers of this group. For adding an element to this group, use the add method!

Parameters
  • key (Union[str, List[str]]) – The key to which to set the value. If this is a path, will recurse down to the lowest level. Note that all intermediate keys need to be present.

  • val (BaseDataContainer) – The value to set

Returns

None

Raises

ValueError – If trying to add an element to this group, which should be done via the add method.

__sizeof__() int#

Returns the size of the data (in bytes) stored in this container’s data and its attributes.

Note that this value is approximate. It is computed by calling the sys.getsizeof() function on the data, the attributes, the name and some caching attributes that each dantro data tree class contains. Importantly, this is not a recursive algorithm.

Also, derived classes might implement further attributes that are not taken into account either. To be more precise in a subclass, create a specific __sizeof__ method and invoke this parent method additionally.
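What "not recursive" means for sys.getsizeof() can be seen with plain lists. The snippet is a small demonstration under CPython, independent of dantro; the variable names are arbitrary.

```python
import sys

small = [[]]                    # outer list holding an empty list
large = [list(range(10_000))]   # outer list holding a large list

# sys.getsizeof counts only the outer list object itself (its header and
# one reference), not the object that the reference points to
outer_small = sys.getsizeof(small)
outer_large = sys.getsizeof(large)
inner_large = sys.getsizeof(large[0])
```

Both outer lists report the same size although one of them refers to far more data, which is why the value returned by __sizeof__ should be read as an approximation.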

__str__() str#

An info string, that describes the object. This invokes the formatting helpers to show the log string (type and name) as well as the info string of this object.

_abc_impl = <_abc_data object>#
_add_container(cont, *, overwrite: bool)#

Private helper method to add a container to this group.

_add_container_callback(cont) None#

Called after a container was added.

_add_container_to_data(cont: AbstractDataContainer) None#

Performs the operation of adding the container to the _data. This can be used by subclasses to make more elaborate things while adding data, e.g. specify ordering …

NOTE This method should NEVER be called on its own, but only via the _add_container method, which takes care of properly linking the container that is to be added.

NOTE After adding, the container needs to be reachable under its .name!

Parameters

cont – The container to add

_attrs = None#

The class attribute that the attributes will be stored to

_check_cont(cont) None#

Can be used by a subclass to check a container before adding it to this group. Is called by _add_container before checking whether the object exists or not.

This is not expected to return, but can raise errors, if something did not work out as expected.

Parameters

cont – The container to check

_check_data(data: Any) None#

This method can be used to check the data provided to this container

It is called before the data is stored in the __init__ method and should raise an exception or create a warning if the data is not as desired.

This method can be subclassed to implement more specific behaviour. To propagate the parent classes’ behaviour the subclassed method should always call its parent method using super().

Note

The CheckDataMixin provides a generalised implementation of this method to perform some type checks and react to unexpected types.

Parameters

data (Any) – The data to check

_check_name(new_name: str) None#

Called from name.setter and can be used to check the name that the container is supposed to have. On invalid name, this should raise.

This method can be subclassed to implement more specific behaviour. To propagate the parent classes’ behaviour the subclassed method should always call its parent method using super().

Parameters

new_name (str) – The new name, which is to be checked.

_direct_insertion_mode(*, enabled: bool = True)#

A context manager that brings the class this mixin is used in into direct insertion mode. While in that mode, the with_direct_insertion() property will return true.

This context manager additionally invokes two callback functions, which can be specialized to perform certain operations when entering or exiting direct insertion mode: Before entering, _enter_direct_insertion_mode() is called. After exiting, _exit_direct_insertion_mode() is called.

Parameters

enabled (bool, optional) – whether to actually use direct insertion mode. If False, will yield directly without setting the toggle. This is equivalent to a null-context.
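A context manager of this kind could look roughly like the following sketch. `DirectInsertion` is a hypothetical stand-in class, and the exact ordering of callbacks relative to the toggle is an assumption based on the description above (callback before entering, callback after exiting); the `calls` list exists only to make the order visible.

```python
import contextlib


class DirectInsertion:
    """Hypothetical stand-in for the mixin described above."""

    def __init__(self):
        self._in_direct_insertion_mode = False
        self.calls = []  # records callback invocations, for illustration

    @property
    def with_direct_insertion(self) -> bool:
        return self._in_direct_insertion_mode

    def _enter_direct_insertion_mode(self):
        self.calls.append("enter")

    def _exit_direct_insertion_mode(self):
        self.calls.append("exit")

    @contextlib.contextmanager
    def _direct_insertion_mode(self, *, enabled: bool = True):
        if not enabled:
            # Behaves like a null context: no toggle, no callbacks
            yield
            return

        self._enter_direct_insertion_mode()
        self._in_direct_insertion_mode = True
        try:
            yield
        finally:
            # Reset the toggle even if the body raised an exception
            self._in_direct_insertion_mode = False
            self._exit_direct_insertion_mode()
```

Inside `with obj._direct_insertion_mode():` the `with_direct_insertion` property is True; afterwards it is False again.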

_enter_direct_insertion_mode()#

Called after entering direct insertion mode; can be overwritten to attach additional behaviour.

_exit_direct_insertion_mode()#

Called before exiting direct insertion mode; can be overwritten to attach additional behaviour.

_format_cls_name() str#

A __format__ helper function: returns the class name

_format_info() str#

A __format__ helper function: returns an info string that is used to characterize this object. Does NOT include name and classname!

_format_logstr() str#

A __format__ helper function: returns the log string, a combination of class name and name

_format_name() str#

A __format__ helper function: returns the name

_format_path() str#

A __format__ helper function: returns the path to this container

_format_tree() str#

Returns the default tree representation of this group by invoking the .tree property

_format_tree_condensed() str#

Returns the condensed tree representation of this group by invoking the .tree_condensed property

_ipython_key_completions_() List[str]#

For ipython integration, return a list of available keys

Links the new_child to this class, unlinking the old one.

This method should be called from any method that changes which items are associated with this group.

_lock_hook()#

Invoked upon locking.

_tree_repr(*, level: int = 0, max_level: Optional[int] = None, info_fstr='<{:cls_name,info}>', info_ratio: float = 0.6, condense_thresh: Optional[Union[int, Callable[[int, int], int]]] = None, total_item_count: int = 0) Union[str, List[str]]#

Recursively creates a multi-line string tree representation of this group. This is used by, e.g., the _format_tree method.

Parameters
  • level (int, optional) – The depth within the tree

  • max_level (int, optional) – The maximum depth within the tree; recursion is not continued beyond this level.

  • info_fstr (str, optional) – The format string for the info string

  • info_ratio (float, optional) – The width ratio of the whole line width that the info string takes

  • condense_thresh (Union[int, Callable[[int, int], int]], optional) – If given, this specifies the threshold beyond which the tree view for the current element becomes condensed by hiding the output for some elements. The minimum value is 3, meaning that at most 3 lines are generated from this level (excluding the lines coming from recursion): two elements and one line indicating how many values are hidden. A smaller value is silently raised to 3. Half of the elements are taken from the beginning of the item iteration, the other half from the end. If given as an integer, that number is used. If a callable is given, it is invoked with the current level, the number of elements to be added at this level, and the current total item count along this recursion branch; it should return the number of lines to be shown for the current element.

  • total_item_count (int, optional) – The total number of items already created in this recursive tree representation call. Passed on between recursive calls.

Returns

The (multi-line) tree representation of this group. If this method was invoked with level == 0, a string will be returned; otherwise, a list of strings will be returned.

Return type

Union[str, List[str]]
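The condensing rule for a single tree level can be sketched as follows. `condense` is a hypothetical helper, the marker text is made up, and two details are assumptions: levels with at most `condense_thresh` entries are shown in full, and an odd number of visible entries favours the beginning.

```python
from typing import List


def condense(lines: List[str], condense_thresh: int) -> List[str]:
    """Hypothetical helper: limit one tree level to condense_thresh lines."""
    # Values below 3 are silently raised to 3
    thresh = max(3, condense_thresh)
    if len(lines) <= thresh:
        return list(lines)

    # One line is reserved for the marker; the remaining lines are split
    # between the beginning and the end of the iteration
    num_shown = thresh - 1
    num_head = (num_shown + 1) // 2  # assumption: odd remainder goes first
    num_tail = num_shown - num_head
    num_hidden = len(lines) - num_shown

    marker = f"... ({num_hidden} more) ..."
    return lines[:num_head] + [marker] + lines[-num_tail:]
```

For ten entries and a threshold of 3, the result holds the first entry, the marker line, and the last entry.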

Unlink a child from this class.

This method should be called from any method that removes an item from this group, be it through deletion or through

_unlock_hook()#

Invoked upon unlocking.

add(*conts, overwrite: bool = False)#

Add the given containers to this group.

property attrs#

The container attributes.

property classname: str#

Returns the name of this DataContainer-derived class

clear()#

Clears all containers from this group.

This is done by unlinking all children and then overwriting _data with an empty _STORAGE_CLS object.

property data#

The stored data.

get(key, default=None)#

Return the container at key, or default if no container with name key is available.

items()#

Returns an iterator over the (name, data container) tuples of this group.

keys()#

Returns an iterator over the container names in this group.

lock()#

Locks the data of this object

property locked: bool#

Whether this object is locked

property logstr: str#

Returns the classname and name of this object

property name: str#

The name of this DataContainer-derived object.

new_container(path: Union[str, List[str]], *, Cls: Optional[type] = None, **kwargs)#

Creates a new container of type Cls and adds it at the given path relative to this group.

If needed, intermediate groups are automatically created.

Parameters
  • path (Union[str, List[str]]) – Where to add the container.

  • Cls (type, optional) – The class of the container to add. If None, the _NEW_CONTAINER_CLS class variable’s value is used.

  • **kwargs – passed on to Cls.__init__

Returns

The created container of type Cls

Raises
  • ValueError – If neither the Cls argument nor the class variable _NEW_CONTAINER_CLS were set or if path was empty.

  • TypeError – When Cls is not compatible with the data tree

new_group(path: Union[str, list], *, Cls: Optional[type] = None, **kwargs)#

Creates a new group at the given path.

Parameters
  • path (Union[str, list]) – The path to create the group at. Note that the whole intermediate path needs to already exist.

  • Cls (type, optional) – If given, use this type to create the group. If not given, uses the class specified in the _NEW_GROUP_CLS class variable or, as last resort, the type of this instance.

  • **kwargs – Passed on to Cls.__init__

Returns

The created group of type Cls

Raises

TypeError – For the given class not being derived from BaseDataGroup

property parent#

The associated parent of this container or group

property path: str#

The path to get to this container or group from some root path

pop(k[, d]) → v, remove specified key and return the corresponding value.#

If key is not found, d is returned if given, otherwise KeyError is raised.

popitem() → (k, v), remove and return some (key, value) pair as a 2-tuple; but raise KeyError if D is empty.#

raise_if_locked(*, prefix: Optional[str] = None)#

Raises an exception if this object is locked; does nothing otherwise

recursive_update(other, *, overwrite: bool = True)#

Recursively updates the contents of this data group with the entries of the given data group

Note

This will create shallow copies of those elements in other that are added to this object.

Parameters
  • other (BaseDataGroup) – The group to update with

  • overwrite (bool, optional) – Whether to overwrite already existing objects. If False, a conflict will lead to an error being raised and the update being stopped.

Raises

TypeError – If other was of invalid type
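The update semantics can be approximated with nested dicts standing in for groups. This is a sketch only: the function below is a hypothetical free function rather than the method itself, and raising ValueError on a conflict is an assumption, since the error type is not stated above.

```python
import copy


def recursive_update(target: dict, other: dict, *, overwrite: bool = True) -> dict:
    """Recursively update target in place with the entries of other."""
    if not isinstance(other, dict):
        raise TypeError(f"Expected a dict, got {type(other).__name__}")

    for key, value in other.items():
        if isinstance(value, dict) and isinstance(target.get(key), dict):
            # Both sides are groups: descend instead of replacing
            recursive_update(target[key], value, overwrite=overwrite)
        elif key in target and not overwrite:
            # Entries handled before the conflict remain updated
            raise ValueError(f"Entry '{key}' already exists")
        else:
            # Add a shallow copy, leaving the object in other untouched
            target[key] = copy.copy(value)
    return target
```

Because only shallow copies are made, nested mutable objects inside a copied entry are still shared between the two trees.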

setdefault(key, default=None)#

This method is not supported for a data group

property tree: str#

Returns the default (full) tree representation of this group

property tree_condensed: str#

Returns the condensed tree representation of this group. Uses the _COND_TREE_* prefixed class attributes as parameters.

unlock()#

Unlocks the data of this object

update([E, ]**F) → None. Update D from mapping/iterable E and F.#

If E is present and has a .keys() method, does: for k in E: D[k] = E[k]

If E is present and lacks a .keys() method, does: for (k, v) in E: D[k] = v

In either case, this is followed by: for k, v in F.items(): D[k] = v

values()#

Returns an iterator over the containers in this group.

property with_direct_insertion: bool#

Whether the class this mixin is mixed into is currently in direct insertion mode.

__locked#

Whether the data is regarded as locked. Note name-mangling here.

__in_direct_insertion_mode#

A name-mangled state flag that determines the state of the object.

dantro.groups.labelled module#

Implements the LabelledDataGroup, which allows handling groups and containers that can be associated with further coordinates.

This imitates the xarray selection interface and provides a uniform interface to select data from these groups. Most importantly, it allows combining all the data of one group, which makes it convenient to work with heterogeneously stored data.

class LabelledDataGroup(*args, dims: Optional[Tuple[str]] = None, mode: Optional[str] = None, allow_deep_selection: Optional[bool] = None, **kwargs)[source]#

Bases: dantro.groups.ordered.OrderedDataGroup

A group that assumes that the members it contains can be labelled with dimension names and coordinates.

Such a group has the benefit of providing a selection interface that works fully on the dimension labels and coordinates and can cooperate with the xarray selection interface, i.e. the sel and isel methods.

_NEW_CONTAINER_CLS#

alias of dantro.containers.xr.XrDataContainer

LDG_ALLOW_DEEP_SELECTION = True#
LDG_DIMS = ()#
LDG_EXTRACT_COORDS_FROM = 'data'#
LDG_COORDS_ATTR_PREFIX = 'ext_coords__'#
LDG_COORDS_MODE_ATTR_PREFIX = 'ext_coords_mode__'#
LDG_COORDS_MODE_DEFAULT = 'scalar'#
LDG_STRICT_ATTR_CHECKING = False#
LDG_COORDS_SEPARATOR_IN_NAME = ';'#
_COLLECTIVE_SELECT_THRESHOLD = 1.8#
__init__(*args, dims: Optional[Tuple[str]] = None, mode: Optional[str] = None, allow_deep_selection: Optional[bool] = None, **kwargs)[source]#

Initialize a LabelledDataGroup

Parameters
  • *args – Passed on to OrderedDataGroup

  • dims (TDims, optional) – The dimensions associated with this group. If not given, will use those defined in the LDG_DIMS class variable. These can not be changed afterwards!

  • mode (str, optional) – By which coordinate extraction mode to get the coordinates from the group members. Can be attrs, name, data or anything else specified in extract_coords().

  • allow_deep_selection (bool, optional) – Whether to allow deep selection. If not given, will use the LDG_ALLOW_DEEP_SELECTION class variable’s value. Behaviour can be changed via the property of the same name.

  • **kwargs – Passed on to OrderedDataGroup

property dims: Tuple[str]#

The names of the group-level dimensions this group manages.

It _may_ contain dimensions that overlap with dimension names from the members; this is intentional.

property ndim: int#

The rank of the space covered by the group-level dimensions.

property coords: Dict[str, List[dantro.utils.coords.TCoord]]#

Returns a dict-like container of group-level coordinate values keyed by dimension.

property shape: Tuple[int]#

Return the shape of the space covered by the group-level dimensions.

property allow_deep_selection: bool#

Whether deep selection is allowed.

property member_map: DataArray#

Returns an array that represents the space that the members of this group span, where each value (i.e. a specific coordinate combination) is the name of the corresponding member of this group.

Upon first call, this is computed here. If members are added, an attempt is made to accommodate them in the existing map; if that is not possible, the cache is invalidated.

The member map _may_ include empty strings, i.e. coordinate combinations that are not covered by any member. Also, they can contain duplicate names, as one member can cover multiple coordinates.

Note

The member map is invalidated when new members are added that cannot be accommodated in it. It will be recalculated when needed.
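The idea of a member map can be illustrated without xarray by using a dict keyed by coordinate tuples in place of the labelled array. `build_member_map` and the shape of its arguments are hypothetical; in dantro the map is an xarray.DataArray.

```python
from itertools import product
from typing import Dict, Tuple


def build_member_map(
    coords: Dict[str, list],
    members: Dict[str, Dict[str, list]],
) -> Dict[Tuple, str]:
    """Map each coordinate combination to the name of the covering member."""
    dims = list(coords)
    # Start with empty strings: combinations not covered by any member
    member_map = {combo: "" for combo in product(*(coords[d] for d in dims))}

    for name, member_coords in members.items():
        # A member may cover several coordinate values per dimension, in
        # which case its name appears more than once in the map
        for combo in product(*(member_coords[d] for d in dims)):
            member_map[combo] = name
    return member_map
```

In this sketch, a member covering two values along one dimension appears twice in the map, and an uncovered combination keeps the empty string.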

property member_map_available: bool#

Whether the member map is available yet.

isel(indexers: dict = None, *, drop: bool = False, combination_method: str = 'auto', deep: bool = None, **indexers_kwargs) DataArray[source]#

Return a new labelled xarray.DataArray with an index-selected subset of members of this group.

If deep selection is activated, those indexers that are not available in the group-managed dimensions are looked up in the members of this group.

Note

For data combination (via any combination_method) dimensions that differ in size across group members have to be labelled, such that arrays can be aligned using xarray’s xarray.align() function and the respective coordinates. See the xarray documentation for more information about coordinates.

Parameters
  • indexers (dict, optional) – A dict with keys matching dimensions and values given by scalars, slices or arrays of tick indices. As xarray.DataArray.isel(), uses pandas-like indexing, i.e.: slices do not include the terminal value.

  • drop (bool, optional) – Whether to drop coordinate variables instead of making them scalar.

  • combination_method (str, optional) –

    How to combine group-level data with member-level data. Ignored if data from a single group member is selected, i.e. no data has to be combined. Can be:

    • concat: Concatenate. This can preserve the dtype, but requires that no data is missing.

    • merge: Merge, using xarray.merge(). This leads to a type conversion to float64, but allows members being missing or coordinates not fully filling the available space.

    • try_concat: Try concatenation, fall back to merging if that was unsuccessful.

    • auto: Automatically deduce a suitable combination method. Use merge if the data is of non-integer type and try_concat otherwise.

    Note

    Selecting all data (by not passing any indexers) can be significantly faster using the merge combination method than using the concat method.

  • deep (bool, optional) – Whether to allow deep indexing, i.e. whether indexers may contain dimensions that don’t refer to group-level dimensions but to dimensions that are only available among the member data. If None, will use the value returned by the allow_deep_selection property.

  • **indexers_kwargs – Additional indexers

Returns

The selected data, potentially a combination of data on group level and member-level data.

Return type

DataArray
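The interplay of the combination_method options can be sketched in plain Python. This is a simplified illustration, not dantro's code: resolve_method, combine, the dtype_kind argument and the toy combiners are hypothetical stand-ins, and the exception type that triggers the fallback is an assumption.

```python
def resolve_method(method, dtype_kind):
    """Resolve 'auto': try_concat for integer data ('i'), merge otherwise."""
    if method != "auto":
        return method
    return "try_concat" if dtype_kind == "i" else "merge"


def combine(parts, method, *, concat, merge):
    """Dispatch to the given combiners; try_concat falls back to merge."""
    if method == "concat":
        return concat(parts)
    if method == "merge":
        return merge(parts)
    if method == "try_concat":
        try:
            return concat(parts)
        except ValueError:  # assumed failure signal of the concat step
            return merge(parts)
    raise ValueError(f"Invalid combination method: {method!r}")


def toy_concat(parts):
    # Preserves the element type, but cannot deal with missing members
    if any(p is None for p in parts):
        raise ValueError("concat requires that no data is missing")
    return [x for p in parts for x in p]


def toy_merge(parts):
    # Tolerates missing members by filling with NaN; converts to float
    out = []
    for p in parts:
        out.extend([float("nan")] if p is None else [float(x) for x in p])
    return out


complete = combine([[1], [2]], "try_concat", concat=toy_concat, merge=toy_merge)
with_gap = combine([[1], None], "try_concat", concat=toy_concat, merge=toy_merge)
```

With complete data the integers survive; with a missing member the fallback produces floats and a NaN placeholder, mirroring the type conversion described for merge.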

sel(indexers: dict = None, *, method: str = None, tolerance: float = None, drop: bool = False, combination_method: str = 'auto', deep: bool = None, **indexers_kwargs) DataArray[source]#

Return a new labelled xarray.DataArray with a coordinate-selected subset of members of this group.

If deep selection is activated, those indexers that are not available in the group-managed dimensions are looked up in the members of this group.

Note

For data combination (via any combination_method) dimensions that differ in size across group members have to be labelled, such that arrays can be aligned using xarray’s xarray.align() function and the respective coordinates. See the xarray documentation for more information about coordinates.

Parameters
  • indexers (dict, optional) – A dict with keys matching dimensions and values given by scalars, slices or arrays of tick labels. Like xarray.DataArray.sel(), this uses pandas-like indexing, i.e. slices include the terminal value.

  • method (str, optional) – Method to use for inexact matches

  • tolerance (float, optional) – Maximum (absolute) distance between original and given label for inexact matches.

  • drop (bool, optional) – Whether to drop coordinate variables instead of making them scalar.

  • combination_method (str, optional) –

    How to combine group-level data with member-level data. Ignored if data from a single group member is selected, i.e. no data has to be combined. Can be:

    • concat: Concatenate. This can preserve the dtype, but requires that no data is missing.

    • merge: Merge, using xarray.merge(). This leads to a type conversion to float64, but allows members being missing or coordinates not fully filling the available space.

    • try_concat: Try concatenation, fall back to merging if that was unsuccessful.

    • auto: Automatically deduce a suitable combination method. Use merge if the data is of non-integer type and try_concat otherwise.

    Note

    Selecting all data (by not passing any indexers) can be significantly faster using the merge combination method than using the concat method.

  • deep (bool, optional) – Whether to allow deep indexing, i.e. whether indexers may contain dimensions that don’t refer to group-level dimensions but to dimensions that are only available among the member data. If None, will use the value returned by the allow_deep_selection property.

  • **indexers_kwargs – Additional indexers

Returns

The selected data, potentially a combination of data on group level and member-level data.

Return type

DataArray
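The difference in slice semantics between index-based selection (isel) and label-based selection (sel) can be illustrated without xarray. The two helper functions and the sample data are hypothetical and only mimic the slicing rule for a sorted coordinate.

```python
from bisect import bisect_left, bisect_right

labels = [10, 20, 30, 40]      # sorted coordinate labels
values = ["a", "b", "c", "d"]  # data along that coordinate


def select_by_index(start, stop):
    """Index-based slice: the terminal index is excluded."""
    return values[start:stop]


def select_by_label(start, stop):
    """Label-based slice: the terminal label is included."""
    lo = bisect_left(labels, start)
    hi = bisect_right(labels, stop)
    return values[lo:hi]


by_index = select_by_index(0, 2)    # positions 0 and 1
by_label = select_by_label(10, 30)  # labels 10, 20 and 30
```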

_get_coords_of(obj: AbstractDataContainer) Dict[str, Sequence[dantro.utils.coords.TCoord]][source]#

Extract the coordinates for the given object using the extract_coords() function.

Parameters

obj (AbstractDataContainer) – The object to get the coordinates of.

Returns

The extracted coordinates

Return type

TCoordsDict

_add_container_callback(cont: AbstractDataContainer) None[source]#

Called by the base class after adding a container, this method checks whether the member map needs to be invalidated or whether the new container can be accommodated in it.

If it can be accommodated, the member map will be adjusted such that for all coordinates associated with the given cont, the member map points to the newly added container.

Parameters

cont (AbstractDataContainer) – The newly added container

_parse_indexers(indexers: dict, *, allow_deep: bool, **indexers_kwargs) Tuple[dict, dict][source]#

Parses the given indexer arguments and splits them into indexers for the selection of group members and indexers for deep selection.

Parameters
  • indexers (dict) – The indexers dict, may be empty

  • allow_deep (bool) – Whether to allow deep selection

  • **indexers_kwargs – Additional indexers

Returns

(shallow indexers, deep indexers)

Return type

Tuple[dict, dict]

Raises

ValueError – If deep indexers were given but deep selection was not enabled
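A minimal sketch of such a split, assuming that an indexer counts as shallow exactly when its key is one of the group-level dimensions. The group_dims argument is a hypothetical addition for illustration; the actual method reads the dimensions from the group itself.

```python
def parse_indexers(indexers, *, group_dims, allow_deep, **indexers_kwargs):
    """Split indexers into (shallow, deep) by group-level dimension names."""
    combined = dict(indexers or {}, **indexers_kwargs)
    shallow = {k: v for k, v in combined.items() if k in group_dims}
    deep = {k: v for k, v in combined.items() if k not in group_dims}
    if deep and not allow_deep:
        raise ValueError(
            f"Deep indexers {sorted(deep)} given, but deep selection is "
            "not enabled"
        )
    return shallow, deep


shallow, deep = parse_indexers(
    {"time": 0}, group_dims=("time",), allow_deep=True, x=slice(0, 2)
)
```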

_get_cont(name: str, *, combination_method: str) Optional[XrDataContainer][source]#

Retrieve the container from the group. If no container could be found, returns None, which denotes that further processing should be skipped.

Parameters
  • name (str) – Name of the container to be extracted

  • combination_method (str) – How the container data will be combined

Returns

The extracted container

Return type

Union[XrDataContainer, None]

Raises

ItemAccessError – If combination_method == "concat", on invalid container name.

_process_cont(cont, *, coords, shallow_indexers: dict, deep_indexers: dict, by_index: bool, drop: bool, **sel_kwargs) DataArray[source]#

Process the given container and coordinates into a data array; this applies selection along container dimensions that overlap with the group dimensions as well as deep selection.

Parameters
  • cont – The container to be processed

  • coords – The DataArrayCoordinates of the given container in the preselected member map.

  • shallow_indexers (dict) – Indexers that were used to preselect the member map.

  • deep_indexers (dict) – Indexers to be applied to the container

  • by_index (bool) – Whether to select by index

  • drop (bool) – Whether to drop coordinate variables instead of making them scalar.

  • **sel_kwargs – Passed to sel().

Returns

The processed container data

Return type

DataArray

Raises

ValueError – In name mode, on conflicting non-dimension container coordinates.

_select(*, combination_method: str, shallow_indexers: dict, deep_indexers: dict, by_index: bool, drop: bool, **sel_kwargs) DataArray[source]#

Preselect the member map (if needed) and designate a suitable method for further processing and selection based on the given combination method and indexers.

If possible, take shortcuts when selecting all data or when selecting data from a single group member.

Parameters
  • combination_method (str) – How to combine the member data.

  • shallow_indexers (dict) – Indexers to be applied on the group-level.

  • deep_indexers (dict) – Indexers to be applied on the member-level only.

  • by_index (bool) – Whether to select by index.

  • drop (bool) – Whether to drop coordinate variables instead of making them scalar.

  • **sel_kwargs – Passed to sel().

Returns

The selected data.

Return type

DataArray

Raises

ValueError – On invalid combination_method.

_select_single(cont_names: DataArray, shallow_indexers: dict, deep_indexers: dict, by_index: bool, drop: bool, **sel_kwargs) DataArray[source]#

Select data from a single group member. Expects the preselected member map to contain only a single valid container name.

_select_all_merge() DataArray[source]#

Select all group data by directly merging all containers. This circumvents building the member map. This might fail, e.g. if there are conflicting or duplicate coordinates.

_ALLOWED_CONT_TYPES = None#

The types that are allowed to be stored in this group. If None, the dantro base classes are allowed

_ATTRS_CLS#

alias of dantro.base.BaseDataAttrs

_COND_TREE_CONDENSE_THRESH = 10#

Condensed tree representation threshold parameter

_COND_TREE_MAX_LEVEL = 10#

Condensed tree representation maximum level

_NEW_GROUP_CLS: type = None#

Which class to use when creating a new group via new_group(). If None, the type of the current instance is used for the new group.

_STORAGE_CLS#

alias of collections.OrderedDict

__contains__(cont: Union[str, AbstractDataContainer]) bool#

Whether the given container is in this group or not.

If this is a data tree object, it will be checked whether this specific instance is part of the group, using is-comparison.

Otherwise, assumes that cont is a valid argument to the __getitem__() method (a key or key sequence) and tries to access the item at that path, returning True if this succeeds and False if not.

Lookup complexity is that of item lookup (scalar) for both name and object lookup.

Parameters

cont (Union[str, AbstractDataContainer]) – The name of the container, a path, or an object to check via identity comparison.

Returns

Whether the given container object is part of this group or whether the given path is accessible from this group.

Return type

bool
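The two lookup modes can be sketched as follows. Member and GroupSketch are hypothetical stand-ins, and path handling is left out; the point is only that object lookup compares identity rather than equality.

```python
class Member:
    def __init__(self, name):
        self.name = name


class GroupSketch:
    def __init__(self):
        self._data = {}

    def add(self, cont):
        self._data[cont.name] = cont

    def __contains__(self, cont):
        if isinstance(cont, Member):
            # Object lookup: fetch by name, then compare identity
            return self._data.get(cont.name) is cont
        # Otherwise, treat the argument as a key
        return cont in self._data


group = GroupSketch()
original = Member("a")
twin = Member("a")  # same name, different instance
group.add(original)
```

With this setup, `original in group` and `"a" in group` are true, while `twin in group` is false because the stored instance is a different object.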

__delitem__(key: str) None#

Deletes an item from the group

__eq__(other) bool#

Evaluates equality by making the following comparisons: identity, strict type equality, and finally: equality of the _data and _attrs attributes, i.e. the private attributes. This ensures that comparison does not trigger any downstream effects like resolution of proxies.

If types do not match exactly, NotImplemented is returned, thus referring the comparison to the other side of the ==.
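The comparison order described above can be sketched with a small stand-alone class. Node and SubNode are hypothetical; only the pattern (identity, strict type check with NotImplemented, then private attributes) follows the description.

```python
class Node:
    def __init__(self, data, attrs=None):
        self._data = data
        self._attrs = attrs or {}

    def __eq__(self, other):
        if other is self:
            return True
        if type(other) is not type(self):
            # Defer to the other operand's __eq__
            return NotImplemented
        return self._data == other._data and self._attrs == other._attrs


class SubNode(Node):
    pass


same = Node({"a": 1}) == Node({"a": 1})
mixed = Node({"a": 1}) == SubNode({"a": 1})
deferred = Node({"a": 1}).__eq__(SubNode({"a": 1}))
```

Because both sides return NotImplemented for mismatching types, Python falls back to an identity comparison and the mixed comparison evaluates to False.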

__format__(spec_str: str) str#

Creates a formatted string from the given specification.

Invokes further methods which are prefixed by _format_.

__getitem__(key: Union[str, List[str]]) AbstractDataContainer#

Looks up the given key and returns the corresponding item.

This supports recursive relative lookups in two ways:

  • By supplying a path as a string that includes the path separator. For example, foo/bar/spam walks down the tree along the given path segments.

  • By directly supplying a key sequence, i.e. a list or tuple of key strings.

With the last path segment, it is possible to access an element that is no longer part of the data tree; successive lookups thus need to use the interface of the corresponding leaf object of the data tree.

Absolute lookups, i.e. from path /foo/bar, are not possible!

Lookup complexity is that of the underlying data structure: for groups based on dict-like storage containers, lookups happen in constant time.

Note

This method aims to replicate the behavior of POSIX paths.

Thus, it can also be used to access the element itself or the parent element: Use . to refer to this object and .. to access this object’s parent.

Parameters

key (Union[str, List[str]]) – The name of the object to retrieve or a path via which it can be found in the data tree.

Returns

The object at key, which conforms to the dantro tree interface.

Return type

AbstractDataContainer

Raises

ItemAccessError – If no object could be found at the given key or if an absolute lookup, starting with /, was attempted.
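A minimal sketch of such a relative, POSIX-like lookup. The Group class here is a simplified stand-in rather than the dantro class, and the local ItemAccessError only mimics the role of dantro's exception.

```python
class ItemAccessError(KeyError):
    """Hypothetical stand-in for dantro's ItemAccessError."""


class Group:
    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self._data = {}

    def add(self, child):
        child.parent = self
        self._data[child.name] = child
        return child

    def __getitem__(self, key):
        if isinstance(key, str):
            if key.startswith("/"):
                raise ItemAccessError("absolute lookups are not possible")
            key = key.split("/")
        obj = self
        for segment in key:
            if segment == ".":       # this object
                continue
            if segment == "..":      # this object's parent
                obj = obj.parent
                continue
            try:
                obj = obj._data[segment]
            except KeyError as err:
                raise ItemAccessError(segment) from err
        return obj


root = Group("root")
foo = root.add(Group("foo"))
bar = foo.add(Group("bar"))
```

Both `root["foo/bar"]` and `root[["foo", "bar"]]` return the same object, and `bar["../.."]` walks back up to the root.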

__iter__()#

Returns an iterator over the OrderedDict

__len__() int#

The number of members in this group.

__repr__() str#

Same as __str__

__setitem__(key: Union[str, List[str]], val: BaseDataContainer) None#

This method is used to allow access to the content of containers of this group. For adding an element to this group, use the add method!

Parameters
  • key (Union[str, List[str]]) – The key to which to set the value. If this is a path, will recurse down to the lowest level. Note that all intermediate keys need to be present.

  • val (BaseDataContainer) – The value to set

Returns

None

Raises

ValueError – If trying to add an element to this group, which should be done via the add method.

__sizeof__() int#

Returns the size of the data (in bytes) stored in this container’s data and its attributes.

Note that this value is approximate. It is computed by calling the sys.getsizeof() function on the data, the attributes, the name and some caching attributes that each dantro data tree class contains. Importantly, this is not a recursive algorithm.

Also, derived classes might implement further attributes that are not taken into account either. To be more precise in a subclass, create a specific __sizeof__ method and invoke this parent method additionally.
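The following sketch shows what a shallow size computation and a subclass extension could look like. The class names and the _cache attribute are hypothetical; which attributes dantro actually includes is described only loosely above.

```python
import sys


class Base:
    def __init__(self, name, data):
        self._name = name
        self._data = data
        self._attrs = {}

    def __sizeof__(self):
        # Shallow: only the directly referenced objects, no recursion
        return sum(
            sys.getsizeof(obj)
            for obj in (self._data, self._attrs, self._name)
        )


class WithCache(Base):
    def __init__(self, name, data):
        super().__init__(name, data)
        self._cache = [0] * 100

    def __sizeof__(self):
        # Invoke the parent method and add the subclass-specific attribute
        return super().__sizeof__() + sys.getsizeof(self._cache)


# The content of a nested list does not contribute to the result
nested = Base("x", [[0] * 1000]).__sizeof__()
flat = Base("x", [[]]).__sizeof__()
```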

__str__() str#

An info string, that describes the object. This invokes the formatting helpers to show the log string (type and name) as well as the info string of this object.

_abc_impl = <_abc_data object>#
_add_container(cont, *, overwrite: bool)#

Private helper method to add a container to this group.

_add_container_to_data(cont: AbstractDataContainer) None#

Performs the operation of adding the container to the _data. This can be used by subclasses to do more elaborate things while adding data, e.g. specify ordering …

NOTE This method should NEVER be called on its own, but only via the _add_container method, which takes care of properly linking the container that is to be added.

NOTE After adding, the container needs to be reachable under its .name!

Parameters

cont – The container to add

_attrs = None#

The class attribute that the attributes will be stored to

_check_cont(cont) None#

Can be used by a subclass to check a container before adding it to this group. Is called by _add_container before checking whether the object exists or not.

This is not expected to return, but can raise errors, if something did not work out as expected.

Parameters

cont – The container to check

_check_data(data: Any) None#

This method can be used to check the data provided to this container

It is called before the data is stored in the __init__ method and should raise an exception or create a warning if the data is not as desired.

This method can be subclassed to implement more specific behaviour. To propagate the parent classes’ behaviour the subclassed method should always call its parent method using super().

Note

The CheckDataMixin provides a generalised implementation of this method to perform some type checks and react to unexpected types.

Parameters

data (Any) – The data to check

_check_name(new_name: str) None#

Called from name.setter and can be used to check the name that the container is supposed to have. On invalid name, this should raise.

This method can be subclassed to implement more specific behaviour. To propagate the parent classes’ behaviour the subclassed method should always call its parent method using super().

Parameters

new_name (str) – The new name, which is to be checked.

_direct_insertion_mode(*, enabled: bool = True)#

A context manager that brings the class this mixin is used in into direct insertion mode. While in that mode, the with_direct_insertion() property will return true.

This context manager additionally invokes two callback functions, which can be specialized to perform certain operations when entering or exiting direct insertion mode: Before entering, _enter_direct_insertion_mode() is called. After exiting, _exit_direct_insertion_mode() is called.

Parameters

enabled (bool, optional) – whether to actually use direct insertion mode. If False, will yield directly without setting the toggle. This is equivalent to a null-context.
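A context manager of this kind can be sketched with contextlib. The class below is hypothetical, uses a plain (not name-mangled) flag, and the exact order of the callbacks relative to the toggle is an assumption.

```python
from contextlib import contextmanager


class DirectInsertionSketch:
    def __init__(self):
        self._in_direct_insertion_mode = False
        self.events = []

    @property
    def with_direct_insertion(self):
        return self._in_direct_insertion_mode

    def _enter_direct_insertion_mode(self):
        self.events.append("enter")

    def _exit_direct_insertion_mode(self):
        self.events.append("exit")

    @contextmanager
    def _direct_insertion_mode(self, *, enabled=True):
        if not enabled:
            yield  # behaves like a null context
            return
        self._in_direct_insertion_mode = True
        self._enter_direct_insertion_mode()
        try:
            yield
        finally:
            # Runs even if the body of the with-block raises
            self._exit_direct_insertion_mode()
            self._in_direct_insertion_mode = False


obj = DirectInsertionSketch()
with obj._direct_insertion_mode():
    inside = obj.with_direct_insertion
after = obj.with_direct_insertion
with obj._direct_insertion_mode(enabled=False):
    inside_disabled = obj.with_direct_insertion
```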

_enter_direct_insertion_mode()#

Called after entering direct insertion mode; can be overwritten to attach additional behaviour.

_exit_direct_insertion_mode()#

Called before exiting direct insertion mode; can be overwritten to attach additional behaviour.

_format_cls_name() str#

A __format__ helper function: returns the class name

_format_info() str#

A __format__ helper function: returns an info string that is used to characterize this object. Does NOT include name and classname!

_format_logstr() str#

A __format__ helper function: returns the log string, a combination of class name and name

_format_name() str#

A __format__ helper function: returns the name

_format_path() str#

A __format__ helper function: returns the path to this container

_format_tree() str#

Returns the default tree representation of this group by invoking the .tree property

_format_tree_condensed() str#

Returns the condensed tree representation of this group by invoking the .tree_condensed property

_ipython_key_completions_() List[str]#

For ipython integration, return a list of available keys

Links the new_child to this class, unlinking the old one.

This method should be called from any method that changes which items are associated with this group.

_lock_hook()#

Invoked upon locking.

_select_generic(cont_names: DataArray, *, combination_method: str, shallow_indexers: dict, deep_indexers: dict, by_index: bool, drop: bool, **sel_kwargs) DataArray[source]#

Select data from group members using the given indexers and combine it via the specified method. If deep indexers are given, apply the deep indexing on each of the members.

This method receives a labelled array of container names, on which the selection already took place. The aim is now to align the objects these names refer to, including their coordinates, and thereby construct an array that contains both the dimensions given by the cont_names array and each member’s data dimensions.

Available combination methods are based either on xarray.merge() operations or xarray.concat() along each dimension. For both these combination methods, the members of this group need to be prepared such that the operation can be applied, i.e.: they need to already be in an array capable of that operation and they need to directly or indirectly preserve coordinate information.

For that purpose, an object-array is constructed holding the processed member data. As the xarray.Dataset and xarray.DataArray types have issues with handling array-like objects in object arrays, this is done via a numpy.ndarray.

Parameters
  • cont_names (DataArray) – The pre-selected member map object, i.e. a labelled array containing names of the desired members that are to be combined.

  • combination_method (str) – How to combine them: concat, try_concat, or merge. Concatenation will allow preserving the dtype of the underlying data.

  • shallow_indexers (dict) – Indexer arguments that were used for the group member selection.

  • deep_indexers (dict) – Indexer arguments for deep selection to be done before combination.

  • by_index (bool) – Whether the deep indexing should take place by index; if False, will use label-based selection.

  • **sel_kwargs – Passed on to sel().

Returns

The selected data of the members from cont_names, combined using the given combination method.

Return type

Dataset

Raises
  • ValueError – On conflicting coordinate information on group-level and member-level.

  • KeyError – In concat mode, upon missing members.

_tree_repr(*, level: int = 0, max_level: Optional[int] = None, info_fstr='<{:cls_name,info}>', info_ratio: float = 0.6, condense_thresh: Optional[Union[int, Callable[[int, int], int]]] = None, total_item_count: int = 0) Union[str, List[str]]#

Recursively creates a multi-line string tree representation of this group. This is used by, e.g., the _format_tree method.

Parameters
  • level (int, optional) – The depth within the tree

  • max_level (int, optional) – The maximum depth within the tree; recursion is not continued beyond this level.

  • info_fstr (str, optional) – The format string for the info string

  • info_ratio (float, optional) – The width ratio of the whole line width that the info string takes

  • condense_thresh (Union[int, Callable[[int, int], int]], optional) – If given, this specifies the threshold beyond which the tree view for the current element becomes condensed by hiding the output for some elements. The minimum value for this is 3, indicating that at most 3 lines should be generated from this level (excluding the lines coming from recursion), i.e.: two elements and one line for indicating how many values are hidden. If a smaller value is given, this is silently brought up to 3. Half of the elements are taken from the beginning of the item iteration, the other half from the end. If given as integer, that number is used. If a callable is given, the callable will be invoked with the current level, number of elements to be added at this level, and the current total item count along this recursion branch. The callable should then return the number of lines to be shown for the current element.

  • total_item_count (int, optional) – The total number of items already created in this recursive tree representation call. Passed on between recursive calls.

Returns

The (multi-line) tree representation of this group. If this method was invoked with level == 0, a string will be returned; otherwise, a list of strings will be returned.

Return type

Union[str, List[str]]
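The condensing rule for an integer condense_thresh can be sketched as below. The condense helper is hypothetical; how an odd number of visible elements is split between head and tail, and the wording of the placeholder line, are assumptions.

```python
def condense(items, thresh):
    """Reduce ``items`` to at most ``thresh`` lines (minimum: 3)."""
    thresh = max(3, thresh)      # smaller values are silently raised to 3
    if len(items) <= thresh:
        return list(items)
    num_shown = thresh - 1       # one line is reserved for the hidden note
    head = (num_shown + 1) // 2  # assumption: extra element goes to the head
    tail = num_shown - head
    hidden = len(items) - num_shown
    return (
        list(items[:head])
        + [f"... ({hidden} more) ..."]
        + list(items[len(items) - tail:])
    )


items = [f"item{i}" for i in range(10)]
minimal = condense(items, 3)
wider = condense(items, 5)
```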

Unlink a child from this class.

This method should be called from any method that removes an item from this group, be it through deletion or through

_unlock_hook()#

Invoked upon unlocking.

add(*conts, overwrite: bool = False)#

Add the given containers to this group.

property attrs#

The container attributes.

property classname: str#

Returns the name of this DataContainer-derived class

clear()#

Clears all containers from this group.

This is done by unlinking all children and then overwriting _data with an empty _STORAGE_CLS object.

property data#

The stored data.

get(key, default=None)#

Return the container at key, or default if container with name key is not available.

items()#

Returns an iterator over the (name, data container) tuples of this group.

keys()#

Returns an iterator over the container names in this group.

lock()#

Locks the data of this object

property locked: bool#

Whether this object is locked

property logstr: str#

Returns the classname and name of this object

property name: str#

The name of this DataContainer-derived object.

new_container(path: Union[str, List[str]], *, Cls: Optional[type] = None, **kwargs)#

Creates a new container of type Cls and adds it at the given path relative to this group.

If needed, intermediate groups are automatically created.

Parameters
  • path (Union[str, List[str]]) – Where to add the container.

  • Cls (type, optional) – The class of the container to add. If None, the _NEW_CONTAINER_CLS class variable’s value is used.

  • **kwargs – passed on to Cls.__init__

Returns

The created container of type Cls

Raises
  • ValueError – If neither the Cls argument nor the class variable _NEW_CONTAINER_CLS were set or if path was empty.

  • TypeError – When Cls is not compatible with the data tree

new_group(path: Union[str, list], *, Cls: Optional[type] = None, **kwargs)#

Creates a new group at the given path.

Parameters
  • path (Union[str, list]) – The path to create the group at. Note that the whole intermediate path needs to already exist.

  • Cls (type, optional) – If given, use this type to create the group. If not given, uses the class specified in the _NEW_GROUP_CLS class variable or, as last resort, the type of this instance.

  • **kwargs – Passed on to Cls.__init__

Returns

The created group of type Cls

Raises

TypeError – For the given class not being derived from BaseDataGroup

property parent#

The associated parent of this container or group

property path: str#

The path to get to this container or group from some root path

pop(k[, d]) v, remove specified key and return the corresponding value.#

If key is not found, d is returned if given, otherwise KeyError is raised.

popitem() (k, v), remove and return some (key, value) pair#

as a 2-tuple; but raise KeyError if D is empty.

raise_if_locked(*, prefix: Optional[str] = None)#

Raises an exception if this object is locked; does nothing otherwise

recursive_update(other, *, overwrite: bool = True)#

Recursively updates the contents of this data group with the entries of the given data group

Note

This will create shallow copies of those elements in other that are added to this object.

Parameters
  • other (BaseDataGroup) – The group to update with

  • overwrite (bool, optional) – Whether to overwrite already existing objects. If False, a conflict will lead to an error being raised and the update being stopped.

Raises

TypeError – If other was of invalid type
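The update logic can be illustrated with nested dicts standing in for groups. This is a sketch under that assumption; in particular, the ValueError raised on a conflict is a placeholder, as the error type is not specified above.

```python
import copy


def recursive_update(target, other, *, overwrite=True):
    """Recursively update ``target`` (a nested dict) with ``other``."""
    if not isinstance(other, dict):
        raise TypeError(f"Expected a dict, got {type(other).__name__}")
    for key, val in other.items():
        if isinstance(val, dict) and isinstance(target.get(key), dict):
            # Both sides are group-like: descend instead of replacing
            recursive_update(target[key], val, overwrite=overwrite)
        elif key in target and not overwrite:
            raise ValueError(f"'{key}' already exists")  # assumed error type
        else:
            target[key] = copy.copy(val)  # shallow copy of the added element
    return target


a = {"g": {"x": 1}, "y": 2}
b = {"g": {"z": [3]}, "y": 5}
recursive_update(a, b)
```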

setdefault(key, default=None)#

This method is not supported for a data group

property tree: str#

Returns the default (full) tree representation of this group

property tree_condensed: str#

Returns the condensed tree representation of this group. Uses the _COND_TREE_* prefixed class attributes as parameters.

unlock()#

Unlocks the data of this object

update([E, ]**F) None.  Update D from mapping/iterable E and F.#

If E present and has a .keys() method, does: for k in E: D[k] = E[k] If E present and lacks .keys() method, does: for (k, v) in E: D[k] = v In either case, this is followed by: for k, v in F.items(): D[k] = v

values()#

Returns an iterator over the containers in this group.

property with_direct_insertion: bool#

Whether the class this mixin is mixed into is currently in direct insertion mode.

__locked#

Whether the data is regarded as locked. Note name-mangling here.

__in_direct_insertion_mode#

A name-mangled state flag that determines the state of the object.
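Name mangling means that an attribute written as __locked inside a class body is stored under a name prefixed with the class name. A short illustration with a hypothetical LockSketch class:

```python
class LockSketch:
    def __init__(self):
        self.__locked = False  # stored as _LockSketch__locked

    def lock(self):
        self.__locked = True

    @property
    def locked(self):
        return self.__locked


obj = LockSketch()
obj.lock()
mangled_names = [name for name in vars(obj) if name.endswith("__locked")]
```

The instance dictionary then contains the single entry _LockSketch__locked, which subclasses do not accidentally override with their own __locked attribute.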

classmethod _combine_by_merge(dsets: ndarray) Dataset[source]#

Combine the given datasets by merging using xarray’s xarray.merge().

Parameters

dsets (ndarray) – The object-dtype array of xarray.Dataset objects that are to be combined.

Returns

All datasets, aligned and combined via xarray.merge()

Return type

Dataset

classmethod _combine_by_concatenation(dsets: ndarray, *, dims: Tuple[str]) Dataset[source]#

Combine the given datasets by concatenation, applying xarray’s xarray.concat() successively along all dimensions specified in dims.

Parameters
  • dsets (ndarray) – The object-dtype array of xarray.Dataset objects that are to be combined by concatenation.

  • dims (TDims) – The dimension names corresponding to all the dimensions of the dsets array.

Returns

The dataset resulting from the concatenation

Return type

Dataset

dantro.groups.ordered module#

In this module, the BaseDataGroup is specialized for holding members in a specific order.

class OrderedDataGroup(*, name: str, containers: Optional[list] = None, attrs=None)[source]#

Bases: dantro.base.BaseDataGroup, collections.abc.MutableMapping

The OrderedDataGroup class manages groups of data containers, preserving the order in which they were added to this group.

It uses a collections.OrderedDict to associate containers with this group.
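The ordering guarantee comes from the storage class itself, which can be seen with the standard library alone. The member names below are arbitrary placeholders.

```python
from collections import OrderedDict

storage = OrderedDict()
for name in ("zeta", "alpha", "mid"):
    storage[name] = object()  # placeholder for a container

# Iteration follows insertion order, not alphabetical order
names = list(storage)
```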

_STORAGE_CLS#

alias of collections.OrderedDict

_ALLOWED_CONT_TYPES = None#

The types that are allowed to be stored in this group. If None, the dantro base classes are allowed

_ATTRS_CLS#

alias of dantro.base.BaseDataAttrs

_COND_TREE_CONDENSE_THRESH = 10#

Condensed tree representation threshold parameter

_COND_TREE_MAX_LEVEL = 10#

Condensed tree representation maximum level

_NEW_CONTAINER_CLS: type = None#

Which class to use for creating a new container via call to the new_container() method. If None, the type needs to be specified explicitly in the method call.

_NEW_GROUP_CLS: type = None#

Which class to use when creating a new group via new_group(). If None, the type of the current instance is used for the new group.

__contains__(cont: Union[str, AbstractDataContainer]) bool#

Whether the given container is in this group or not.

If this is a data tree object, it will be checked whether this specific instance is part of the group, using is-comparison.

Otherwise, assumes that cont is a valid argument to the __getitem__() method (a key or key sequence) and tries to access the item at that path, returning True if this succeeds and False if not.

Lookup complexity is that of item lookup (scalar) for both name and object lookup.

Parameters

cont (Union[str, AbstractDataContainer]) – The name of the container, a path, or an object to check via identity comparison.

Returns

Whether the given container object is part of this group or whether the given path is accessible from this group.

Return type

bool

__delitem__(key: str) None#

Deletes an item from the group

__eq__(other) bool#

Evaluates equality by making the following comparisons: identity, strict type equality, and finally: equality of the _data and _attrs attributes, i.e. the private attributes. This ensures that comparison does not trigger any downstream effects like resolution of proxies.

If types do not match exactly, NotImplemented is returned, thus referring the comparison to the other side of the ==.

__format__(spec_str: str) str#

Creates a formatted string from the given specification.

Invokes further methods which are prefixed by _format_.

__getitem__(key: Union[str, List[str]]) AbstractDataContainer#

Looks up the given key and returns the corresponding item.

This supports recursive relative lookups in two ways:

  • By supplying a path as a string that includes the path separator. For example, foo/bar/spam walks down the tree along the given path segments.

  • By directly supplying a key sequence, i.e. a list or tuple of key strings.

With the last path segment, it is possible to access an element that is no longer part of the data tree; successive lookups thus need to use the interface of the corresponding leaf object of the data tree.

Absolute lookups, i.e. from path /foo/bar, are not possible!

Lookup complexity is that of the underlying data structure: for groups based on dict-like storage containers, lookups happen in constant time.

Note

This method aims to replicate the behavior of POSIX paths.

Thus, it can also be used to access the element itself or the parent element: Use . to refer to this object and .. to access this object’s parent.

Parameters

key (Union[str, List[str]]) – The name of the object to retrieve or a path via which it can be found in the data tree.

Returns

The object at key, which conforms to the dantro tree interface.

Return type

AbstractDataContainer

Raises

ItemAccessError – If no object could be found at the given key or if an absolute lookup, starting with /, was attempted.

__init__(*, name: str, containers: Optional[list] = None, attrs=None)#

Initialize a BaseDataGroup, which can store other containers and attributes.

Parameters
  • name (str) – The name of this data container

  • containers (list, optional) – The containers that are to be stored as members of this group. If given, these are added one by one using the .add method.

  • attrs (None, optional) – A mapping that is stored as attributes

__iter__()#

Returns an iterator over the OrderedDict

__len__() int#

The number of members in this group.

__repr__() str#

Same as __str__

__setitem__(key: Union[str, List[str]], val: BaseDataContainer) None#

This method is used to allow access to the content of containers of this group. For adding an element to this group, use the add method!

Parameters
  • key (Union[str, List[str]]) – The key to which to set the value. If this is a path, will recurse down to the lowest level. Note that all intermediate keys need to be present.

  • val (BaseDataContainer) – The value to set

Returns

None

Raises

ValueError – If trying to add an element to this group, which should be done via the add method.

__sizeof__() int#

Returns the size of the data (in bytes) stored in this container’s data and its attributes.

Note that this value is approximate. It is computed by calling the sys.getsizeof() function on the data, the attributes, the name and some caching attributes that each dantro data tree class contains. Importantly, this is not a recursive algorithm.

Also, derived classes might implement further attributes that are not taken into account either. To be more precise in a subclass, create a specific __sizeof__ method and invoke this parent method additionally.

__str__() str#

An info string, that describes the object. This invokes the formatting helpers to show the log string (type and name) as well as the info string of this object.

_abc_impl = <_abc_data object>#
_add_container(cont, *, overwrite: bool)#

Private helper method to add a container to this group.

_add_container_callback(cont) None#

Called after a container was added.

_add_container_to_data(cont: AbstractDataContainer) None#

Performs the operation of adding the container to the _data. This can be used by subclasses to do more elaborate things while adding data, e.g. specify ordering …

NOTE This method should NEVER be called on its own, but only via the _add_container method, which takes care of properly linking the container that is to be added.

NOTE After adding, the container needs to be reachable under its .name!

Parameters

cont – The container to add

_attrs = None#

The class attribute that the attributes will be stored to

_check_cont(cont) None#

Can be used by a subclass to check a container before adding it to this group. Is called by _add_container before checking whether the object exists or not.

This is not expected to return, but can raise errors, if something did not work out as expected.

Parameters

cont – The container to check

_check_data(data: Any) None#

This method can be used to check the data provided to this container

It is called before the data is stored in the __init__ method and should raise an exception or create a warning if the data is not as desired.

This method can be subclassed to implement more specific behaviour. To propagate the parent classes’ behaviour the subclassed method should always call its parent method using super().

Note

The CheckDataMixin provides a generalised implementation of this method to perform some type checks and react to unexpected types.

Parameters

data (Any) – The data to check

_check_name(new_name: str) None#

Called from name.setter and can be used to check the name that the container is supposed to have. On invalid name, this should raise.

This method can be subclassed to implement more specific behaviour. To propagate the parent classes’ behaviour the subclassed method should always call its parent method using super().

Parameters

new_name (str) – The new name, which is to be checked.

_direct_insertion_mode(*, enabled: bool = True)#

A context manager that brings the class this mixin is used in into direct insertion mode. While in that mode, the with_direct_insertion() property will return true.

This context manager additionally invokes two callback functions, which can be specialized to perform certain operations when entering or exiting direct insertion mode: Before entering, _enter_direct_insertion_mode() is called. After exiting, _exit_direct_insertion_mode() is called.

Parameters

enabled (bool, optional) – whether to actually use direct insertion mode. If False, will yield directly without setting the toggle. This is equivalent to a null-context.
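A context manager of this kind could look roughly like the following sketch. It is not dantro's implementation: the class name `DirectInsertionSketch`, the `events` list and the exact ordering of the callbacks relative to the toggle are assumptions made for illustration.

```python
import contextlib


class DirectInsertionSketch:
    """Hypothetical mixin-like class; method names mirror the documented ones."""

    def __init__(self):
        self._in_direct_insertion_mode = False
        self.events = []  # records callback invocations, for illustration only

    @property
    def with_direct_insertion(self) -> bool:
        return self._in_direct_insertion_mode

    @contextlib.contextmanager
    def _direct_insertion_mode(self, *, enabled: bool = True):
        if not enabled:
            # Behave like a null-context: no toggle, no callbacks.
            yield
            return

        self._in_direct_insertion_mode = True
        try:
            self._enter_direct_insertion_mode()
            yield
        finally:
            # Assumption: the exit callback runs while the toggle is still set.
            try:
                self._exit_direct_insertion_mode()
            finally:
                self._in_direct_insertion_mode = False

    def _enter_direct_insertion_mode(self):
        self.events.append("enter")

    def _exit_direct_insertion_mode(self):
        self.events.append("exit")


obj = DirectInsertionSketch()
with obj._direct_insertion_mode():
    inside = obj.with_direct_insertion

with obj._direct_insertion_mode(enabled=False):
    inside_disabled = obj.with_direct_insertion
```

The nested `try`/`finally` ensures that the toggle is reset even if the exit callback raises.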

_enter_direct_insertion_mode()#

Called after entering direct insertion mode; can be overwritten to attach additional behaviour.

_exit_direct_insertion_mode()#

Called before exiting direct insertion mode; can be overwritten to attach additional behaviour.

_format_cls_name() str#

A __format__ helper function: returns the class name

_format_info() str#

A __format__ helper function: returns an info string that is used to characterize this object. Does NOT include name and classname!

_format_logstr() str#

A __format__ helper function: returns the log string, a combination of class name and name

_format_name() str#

A __format__ helper function: returns the name

_format_path() str#

A __format__ helper function: returns the path to this container

_format_tree() str#

Returns the default tree representation of this group by invoking the .tree property

_format_tree_condensed() str#

Returns the condensed tree representation of this group by invoking the .tree_condensed property

_ipython_key_completions_() List[str]#

For ipython integration, return a list of available keys

Links the new_child to this class, unlinking the old one.

This method should be called from any method that changes which items are associated with this group.

_lock_hook()#

Invoked upon locking.

_tree_repr(*, level: int = 0, max_level: Optional[int] = None, info_fstr='<{:cls_name,info}>', info_ratio: float = 0.6, condense_thresh: Optional[Union[int, Callable[[int, int], int]]] = None, total_item_count: int = 0) Union[str, List[str]]#

Recursively creates a multi-line string tree representation of this group. This is used by, e.g., the _format_tree method.

Parameters
  • level (int, optional) – The depth within the tree

  • max_level (int, optional) – The maximum depth within the tree; recursion is not continued beyond this level.

  • info_fstr (str, optional) – The format string for the info string

  • info_ratio (float, optional) – The width ratio of the whole line width that the info string takes

  • condense_thresh (Union[int, Callable[[int, int], int]], optional) – If given, this specifies the threshold beyond which the tree view for the current element becomes condensed by hiding the output for some elements. The minimum value for this is 3, indicating that at most 3 lines should be generated from this level (excluding the lines coming from recursion), i.e.: two elements and one line for indicating how many values are hidden. If a smaller value is given, this is silently brought up to 3. Half of the elements are taken from the beginning of the item iteration, the other half from the end. If given as integer, that number is used. If a callable is given, the callable will be invoked with the current level, number of elements to be added at this level, and the current total item count along this recursion branch. The callable should then return the number of lines to be shown for the current element.

  • total_item_count (int, optional) – The total number of items already created in this recursive tree representation call. Passed on between recursive calls.

Returns

The (multi-line) tree representation of this group. If this method was invoked with level == 0, a string will be returned; otherwise, a list of strings will be returned.

Return type

Union[str, List[str]]
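The condensing rule described for condense_thresh can be sketched for a single level as follows. This is a simplified illustration, not the actual _tree_repr logic: the function name `condense_lines`, the wording of the placeholder line and the handling of an odd number of shown elements are assumptions.

```python
from typing import Callable, List, Optional, Sequence, Union


def condense_lines(
    lines: Sequence[str],
    condense_thresh: Optional[Union[int, Callable[..., int]]] = None,
    *,
    level: int = 0,
    total_item_count: int = 0,
) -> List[str]:
    """Condense the lines of one tree level (hypothetical helper)."""
    lines = list(lines)
    if condense_thresh is None:
        return lines

    if callable(condense_thresh):
        # Invoked with the level, the number of elements at this level and
        # the total item count along this recursion branch.
        condense_thresh = condense_thresh(level, len(lines), total_item_count)

    thresh = max(3, condense_thresh)  # smaller values are brought up to 3
    if len(lines) <= thresh:
        return lines

    num_shown = thresh - 1  # one line is reserved for the placeholder
    num_tail = num_shown // 2
    num_head = num_shown - num_tail  # assumption: an odd remainder goes to the head
    hidden = len(lines) - num_shown
    return [*lines[:num_head], f"... ({hidden} more) ...", *lines[-num_tail:]]


items = [str(i) for i in range(10)]
condensed = condense_lines(items, 3)
# condensed == ['0', '... (8 more) ...', '9']
```

A callable threshold allows the view to become more aggressive at deeper levels, e.g. `lambda level, n, total: 10 - 2 * level`.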

Unlink a child from this class.

This method should be called from any method that removes an item from this group, be it through deletion or through

_unlock_hook()#

Invoked upon unlocking.

add(*conts, overwrite: bool = False)#

Add the given containers to this group.

property attrs#

The container attributes.

property classname: str#

Returns the name of this DataContainer-derived class

clear()#

Clears all containers from this group.

This is done by unlinking all children and then overwriting _data with an empty _STORAGE_CLS object.

property data#

The stored data.

get(key, default=None)#

Return the container at key, or default if container with name key is not available.

items()#

Returns an iterator over the (name, data container) tuple of this group.

keys()#

Returns an iterator over the container names in this group.

lock()#

Locks the data of this object

property locked: bool#

Whether this object is locked

property logstr: str#

Returns the classname and name of this object

property name: str#

The name of this DataContainer-derived object.

new_container(path: Union[str, List[str]], *, Cls: Optional[type] = None, **kwargs)#

Creates a new container of type Cls and adds it at the given path relative to this group.

If needed, intermediate groups are automatically created.

Parameters
  • path (Union[str, List[str]]) – Where to add the container.

  • Cls (type, optional) – The class of the container to add. If None, the _NEW_CONTAINER_CLS class variable’s value is used.

  • **kwargs – passed on to Cls.__init__

Returns

The created container of type Cls

Raises
  • ValueError – If neither the Cls argument nor the class variable _NEW_CONTAINER_CLS were set or if path was empty.

  • TypeError – When Cls is not compatible to the data tree

new_group(path: Union[str, list], *, Cls: Optional[type] = None, **kwargs)#

Creates a new group at the given path.

Parameters
  • path (Union[str, list]) – The path to create the group at. Note that the whole intermediate path needs to already exist.

  • Cls (type, optional) – If given, use this type to create the group. If not given, uses the class specified in the _NEW_GROUP_CLS class variable or, as last resort, the type of this instance.

  • **kwargs – Passed on to Cls.__init__

Returns

The created group of type Cls

Raises

TypeError – For the given class not being derived from BaseDataGroup

property parent#

The associated parent of this container or group

property path: str#

The path to get to this container or group from some root path

pop(k[, d]) v#

Removes the specified key and returns the corresponding value. If the key is not found, d is returned if given; otherwise, a KeyError is raised.

popitem() (k, v)#

Removes and returns some (key, value) pair as a 2-tuple; raises a KeyError if D is empty.

raise_if_locked(*, prefix: Optional[str] = None)#

Raises an exception if this object is locked; does nothing otherwise
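The interplay of lock(), unlock(), the locked property, the hooks and raise_if_locked() might be sketched as below. This is a hedged illustration only: the class name `LockableSketch`, the `hook_calls` list, the message text and the choice of `RuntimeError` are assumptions, as the exception type is not stated here.

```python
from typing import Optional


class LockableSketch:
    """Hypothetical stand-in that mimics the documented locking interface."""

    def __init__(self):
        self.__locked = False  # name-mangled to _LockableSketch__locked
        self.hook_calls = []   # records hook invocations, for illustration only

    @property
    def locked(self) -> bool:
        return self.__locked

    def lock(self):
        self.__locked = True
        self._lock_hook()

    def unlock(self):
        self.__locked = False
        self._unlock_hook()

    def raise_if_locked(self, *, prefix: Optional[str] = None):
        if self.locked:
            msg = "Object is locked and cannot be modified."
            # Assumption: RuntimeError; the actual exception type may differ.
            raise RuntimeError(f"{prefix} {msg}" if prefix else msg)

    def _lock_hook(self):
        self.hook_calls.append("lock")

    def _unlock_hook(self):
        self.hook_calls.append("unlock")
```

Mutating methods would call `raise_if_locked(prefix=...)` first, so that the error message states which operation was refused.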

recursive_update(other, *, overwrite: bool = True)#

Recursively updates the contents of this data group with the entries of the given data group

Note

This will create shallow copies of those elements in other that are added to this object.

Parameters
  • other (BaseDataGroup) – The group to update with

  • overwrite (bool, optional) – Whether to overwrite already existing object. If False, a conflict will lead to an error being raised and the update being stopped.

Raises

TypeError – If other was of invalid type
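The recursion and the shallow-copy behaviour can be illustrated with plain dictionaries standing in for groups. This sketch does not use dantro classes; the function below is a hypothetical analogue, and raising `ValueError` on a conflict is an assumption, since the kind of error is not specified above.

```python
import copy


def recursive_update(target: dict, other: dict, *, overwrite: bool = True) -> dict:
    """Recursively update ``target`` with the entries of ``other`` (sketch)."""
    if not isinstance(other, dict):
        raise TypeError(f"Can only update from a dict, got {type(other).__name__}")

    for name, obj in other.items():
        if isinstance(obj, dict) and isinstance(target.get(name), dict):
            # Both sides are "groups": continue recursively.
            recursive_update(target[name], obj, overwrite=overwrite)
        elif name in target and not overwrite:
            # Assumption: a conflict raises ValueError and stops the update.
            raise ValueError(f"An entry named '{name}' already exists")
        else:
            # Entries taken from ``other`` are added as shallow copies.
            target[name] = copy.copy(obj)
    return target


a = {"x": {"p": 1}, "y": [1]}
b = {"x": {"q": 2}, "z": [3]}
recursive_update(a, b)
# a == {'x': {'p': 1, 'q': 2}, 'y': [1], 'z': [3]}
```

Note that `a["z"]` is a new list object, while its elements are shared with `b["z"]`.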

setdefault(key, default=None)#

This method is not supported for a data group

property tree: str#

Returns the default (full) tree representation of this group

property tree_condensed: str#

Returns the condensed tree representation of this group. Uses the _COND_TREE_* prefixed class attributes as parameters.

unlock()#

Unlocks the data of this object

update([E, ]**F) None#

Update D from mapping/iterable E and F, where D is this object.

  • If E is present and has a .keys() method, does: for k in E: D[k] = E[k]

  • If E is present and lacks a .keys() method, does: for (k, v) in E: D[k] = v

In either case, this is followed by: for k, v in F.items(): D[k] = v

values()#

Returns an iterator over the containers in this group.

property with_direct_insertion: bool#

Whether the class this mixin is mixed into is currently in direct insertion mode.

__locked#

Whether the data is regarded as locked. Note name-mangling here.

__in_direct_insertion_mode#

A name-mangled state flag that determines the state of the object.

class IndexedDataGroup(*, name: str, containers: Optional[list] = None, attrs=None)[source]#

Bases: dantro.mixins.indexing.IntegerItemAccessMixin, dantro.groups.ordered.OrderedDataGroup

The IndexedDataGroup class holds members that are of the same type and have names that can directly be interpreted as positive integers.

Especially, this group maintains the correct order of members according to integer ordering.

To speed up element insertion, this group keeps track of recently added container names, which are then used as hints for subsequent insertions.

Note

Although the members of this group are ordered, item access still refers to the names of the members, not their index within the sequence!

Warning

With the underlying ordering mechanism of KeyOrderedDict, the performance of this data structure is sensitive to the insertion order of elements.

It is fastest for in-order insertions, where the complexity per insertion is constant (regardless of whether insertion order is ascending or descending). For out-of-order insertions, the whole key map may have to be searched, in which case the complexity scales with the number of elements in this group.

Hint

If experiencing trouble with the performance of this data structure, sort elements before adding them to this group.
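The difference between integer ordering and plain string ordering of member names, and the pre-sorting suggested in the hint, can be shown with the standard library alone. The names used below are arbitrary examples.

```python
names = ["10", "2", "33", "1"]

# Plain string ordering compares character by character ...
lexicographic = sorted(names)
# lexicographic == ['1', '10', '2', '33']

# ... whereas integer ordering interprets each name as a number.
by_integer = sorted(names, key=int)
# by_integer == ['1', '2', '10', '33']

# Following the hint: containers would be added in this pre-sorted order,
# such that each insertion is an in-order insertion.
insertion_order = by_integer
```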

__last_keys = None#
_STORAGE_CLS#

alias of dantro.utils.ordereddict.IntOrderedDict

_NEW_GROUP_CLS#

alias of dantro.groups.ordered.OrderedDataGroup

key_at_idx(idx: int) str[source]#

Get a key by its index within the container. Can be negative.

Parameters

idx (int) – The index within the member sequence

Returns

The desired key

Return type

str

Raises

IndexError – Index out of range
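The distinction between a member's name and its index within the sequence can be sketched with an ordinary dictionary. The helper below is a hypothetical, simplified analogue of key_at_idx and not its actual implementation.

```python
from typing import Iterable


def key_at_idx(keys: Iterable[str], idx: int) -> str:
    """Return the key at position ``idx`` of an ordered key sequence (sketch)."""
    keys = list(keys)
    if not -len(keys) <= idx < len(keys):
        raise IndexError(f"Index {idx} out of range for {len(keys)} keys")
    return keys[idx]


# Member names are integer-like strings, ordered by their integer value
members = {"3": "c", "7": "g", "12": "l"}

first = key_at_idx(members, 0)    # '3'
last = key_at_idx(members, -1)    # '12'
second_member = members[key_at_idx(members, 1)]  # access is by name: 'g'
```

Here, `"0"` is not a valid name even though index 0 exists, which is why the index first has to be translated into a key.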

keys_as_int() Generator[int, None, None][source]#

Returns an iterator over keys as integer values

_add_container_to_data(cont) None[source]#

Adds a container to the underlying integer-ordered dictionary.

Unlike the parent method, this uses insert() in order to provide hints regarding the insertion position. It is optimised for insertion in ascending order.

_ipython_key_completions_() List[int][source]#

For ipython integration, return a list of available keys.

Unlike the BaseDataGroup method, which returns a list of strings, this returns a list of integers.

_ALLOWED_CONT_TYPES = None#

The types that are allowed to be stored in this group. If None, the dantro base classes are allowed

_ATTRS_CLS#

alias of dantro.base.BaseDataAttrs

_COND_TREE_CONDENSE_THRESH = 10#

Condensed tree representation threshold parameter

_COND_TREE_MAX_LEVEL = 10#

Condensed tree representation maximum level

_NEW_CONTAINER_CLS: type = None#

Which class to use for creating a new container via call to the new_container() method. If None, the type needs to be specified explicitly in the method call.

__contains__(key: Union[str, int]) bool#

Adjusts the parent method to allow checking for integers

__delitem__(key: Union[str, int])#

Adjusts the parent method to allow item deletion by integer key

__eq__(other) bool#

Evaluates equality by making the following comparisons: identity, strict type equality, and finally: equality of the _data and _attrs attributes, i.e. the private attributes. This ensures that comparison does not trigger any downstream effects like resolution of proxies.

If types do not match exactly, NotImplemented is returned, thus referring the comparison to the other side of the ==.
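The comparison sequence described above can be sketched as follows. `EqSketch` and `EqSketchChild` are hypothetical classes used for illustration; only the order of checks follows the description.

```python
class EqSketch:
    """Hypothetical class that compares via its private attributes."""

    def __init__(self, data, attrs=None):
        self._data = data
        self._attrs = dict(attrs or {})

    def __eq__(self, other):
        # 1) identity
        if other is self:
            return True
        # 2) strict type equality; otherwise defer to the other operand
        if type(other) is not type(self):
            return NotImplemented
        # 3) equality of the private attributes
        return self._data == other._data and self._attrs == other._attrs


class EqSketchChild(EqSketch):
    pass


a = EqSketch([1, 2])
same = a == EqSketch([1, 2])              # True
different = a == EqSketch([3])            # False
cross_type = a == EqSketchChild([1, 2])   # False
```

In the cross-type case both operands return NotImplemented, upon which Python falls back to an identity comparison and thus evaluates the expression to False.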

__format__(spec_str: str) str#

Creates a formatted string from the given specification.

Invokes further methods which are prefixed by _format_.
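A dispatch to `_format_`-prefixed helpers might look like the sketch below. The comma-separated specification is inferred from the default info_fstr value `'<{:cls_name,info}>'` of _tree_repr; the class name `FormatSketch`, the join string and the error handling are assumptions.

```python
class FormatSketch:
    """Hypothetical class that dispatches format specs to helper methods."""

    def __init__(self, name: str):
        self.name = name

    def __format__(self, spec_str: str) -> str:
        if not spec_str:
            return str(self)

        parts = []
        for spec in spec_str.split(","):
            spec = spec.strip()
            try:
                helper = getattr(self, f"_format_{spec}")
            except AttributeError as err:
                raise ValueError(f"Unknown format specification '{spec}'") from err
            parts.append(helper())
        # Assumption: results of several specs are joined by ", "
        return ", ".join(parts)

    def _format_cls_name(self) -> str:
        return type(self).__name__

    def _format_name(self) -> str:
        return self.name

    def _format_info(self) -> str:
        return "0 members"


grp = FormatSketch("grp")
info = "<{:cls_name,info}>".format(grp)
# info == '<FormatSketch, 0 members>'
```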

__getitem__(key: Union[str, int])#

Adjusts the parent method to allow integer key item access

__init__(*, name: str, containers: Optional[list] = None, attrs=None)#

Initialize a BaseDataGroup, which can store other containers and attributes.

Parameters
  • name (str) – The name of this data container

  • containers (list, optional) – The containers that are to be stored as members of this group. If given, these are added one by one using the .add method.

  • attrs (None, optional) – A mapping that is stored as attributes

__iter__()#

Returns an iterator over the OrderedDict

__len__() int#

The number of members in this group.

__repr__() str#

Same as __str__

__setitem__(key: Union[str, int])#

Adjusts the parent method to allow item setting by integer key

__sizeof__() int#

Returns the size of the data (in bytes) stored in this container’s data and its attributes.

Note that this value is approximate. It is computed by calling the sys.getsizeof() function on the data, the attributes, the name and some caching attributes that each dantro data tree class contains. Importantly, this is not a recursive algorithm.

Also, derived classes might implement further attributes that are not taken into account either. To be more precise in a subclass, create a specific __sizeof__ method and invoke this parent method additionally.

__str__() str#

An info string, that describes the object. This invokes the formatting helpers to show the log string (type and name) as well as the info string of this object.

_abc_impl = <_abc_data object>#
_add_container(cont, *, overwrite: bool)#

Private helper method to add a container to this group.

_add_container_callback(cont) None#

Called after a container was added.

_attrs = None#

The class attribute that the attributes will be stored to

_check_cont(cont) None#

Can be used by a subclass to check a container before adding it to this group. Is called by _add_container before checking whether the object exists or not.

This is not expected to return, but can raise errors, if something did not work out as expected.

Parameters

cont – The container to check

_check_data(data: Any) None#

This method can be used to check the data provided to this container

It is called before the data is stored in the __init__ method and should raise an exception or create a warning if the data is not as desired.

This method can be subclassed to implement more specific behaviour. To propagate the parent classes’ behaviour the subclassed method should always call its parent method using super().

Note

The CheckDataMixin provides a generalised implementation of this method to perform some type checks and react to unexpected types.

Parameters

data (Any) – The data to check

_check_name(new_name: str) None#

Called from name.setter and can be used to check the name that the container is supposed to have. On invalid name, this should raise.

This method can be subclassed to implement more specific behaviour. To propagate the parent classes’ behaviour the subclassed method should always call its parent method using super().

Parameters

new_name (str) – The new name, which is to be checked.

_direct_insertion_mode(*, enabled: bool = True)#

A context manager that brings the class this mixin is used in into direct insertion mode. While in that mode, the with_direct_insertion() property will return true.

This context manager additionally invokes two callback functions, which can be specialized to perform certain operations when entering or exiting direct insertion mode: Before entering, _enter_direct_insertion_mode() is called. After exiting, _exit_direct_insertion_mode() is called.

Parameters

enabled (bool, optional) – whether to actually use direct insertion mode. If False, will yield directly without setting the toggle. This is equivalent to a null-context.

_enter_direct_insertion_mode()#

Called after entering direct insertion mode; can be overwritten to attach additional behaviour.

_exit_direct_insertion_mode()#

Called before exiting direct insertion mode; can be overwritten to attach additional behaviour.

_format_cls_name() str#

A __format__ helper function: returns the class name

_format_info() str#

A __format__ helper function: returns an info string that is used to characterize this object. Does NOT include name and classname!

_format_logstr() str#

A __format__ helper function: returns the log string, a combination of class name and name

_format_name() str#

A __format__ helper function: returns the name

_format_path() str#

A __format__ helper function: returns the path to this container

_format_tree() str#

Returns the default tree representation of this group by invoking the .tree property

_format_tree_condensed() str#

Returns the condensed tree representation of this group by invoking the .tree_condensed property

Links the new_child to this class, unlinking the old one.

This method should be called from any method that changes which items are associated with this group.

_lock_hook()#

Invoked upon locking.

_parse_key(key: Union[str, int]) str#

Makes sure a key is a string
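How such a key-parsing step can enable integer item access is sketched below. `IntKeyAccessSketch` is a hypothetical class that uses a plain dictionary as storage; it only illustrates the idea of converting integer keys to strings before delegating.

```python
from typing import Union


class IntKeyAccessSketch:
    """Hypothetical class allowing both str and int keys for item access."""

    def __init__(self):
        self._data = {}

    def _parse_key(self, key: Union[str, int]) -> str:
        # Integer keys are converted to their string representation
        return str(key) if isinstance(key, int) else key

    def __getitem__(self, key: Union[str, int]):
        return self._data[self._parse_key(key)]

    def __setitem__(self, key: Union[str, int], val):
        self._data[self._parse_key(key)] = val

    def __delitem__(self, key: Union[str, int]):
        del self._data[self._parse_key(key)]

    def __contains__(self, key: Union[str, int]) -> bool:
        return self._parse_key(key) in self._data


grp = IntKeyAccessSketch()
grp[42] = "answer"
by_str = grp["42"]  # the same member, accessed via its actual (string) name
```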

_tree_repr(*, level: int = 0, max_level: Optional[int] = None, info_fstr='<{:cls_name,info}>', info_ratio: float = 0.6, condense_thresh: Optional[Union[int, Callable[[int, int], int]]] = None, total_item_count: int = 0) Union[str, List[str]]#

Recursively creates a multi-line string tree representation of this group. This is used by, e.g., the _format_tree method.

Parameters
  • level (int, optional) – The depth within the tree

  • max_level (int, optional) – The maximum depth within the tree; recursion is not continued beyond this level.

  • info_fstr (str, optional) – The format string for the info string

  • info_ratio (float, optional) – The width ratio of the whole line width that the info string takes

  • condense_thresh (Union[int, Callable[[int, int], int]], optional) – If given, this specifies the threshold beyond which the tree view for the current element becomes condensed by hiding the output for some elements. The minimum value for this is 3, indicating that at most 3 lines should be generated from this level (excluding the lines coming from recursion), i.e.: two elements and one line for indicating how many values are hidden. If a smaller value is given, this is silently brought up to 3. Half of the elements are taken from the beginning of the item iteration, the other half from the end. If given as integer, that number is used. If a callable is given, the callable will be invoked with the current level, number of elements to be added at this level, and the current total item count along this recursion branch. The callable should then return the number of lines to be shown for the current element.

  • total_item_count (int, optional) – The total number of items already created in this recursive tree representation call. Passed on between recursive calls.

Returns

The (multi-line) tree representation of this group. If this method was invoked with level == 0, a string will be returned; otherwise, a list of strings will be returned.

Return type

Union[str, List[str]]

Unlink a child from this class.

This method should be called from any method that removes an item from this group, be it through deletion or through

_unlock_hook()#

Invoked upon unlocking.

add(*conts, overwrite: bool = False)#

Add the given containers to this group.

property attrs#

The container attributes.

property classname: str#

Returns the name of this DataContainer-derived class

clear()#

Clears all containers from this group.

This is done by unlinking all children and then overwriting _data with an empty _STORAGE_CLS object.

property data#

The stored data.

get(key, default=None)#

Return the container at key, or default if container with name key is not available.

items()#

Returns an iterator over the (name, data container) tuple of this group.

keys()#

Returns an iterator over the container names in this group.

lock()#

Locks the data of this object

property locked: bool#

Whether this object is locked

property logstr: str#

Returns the classname and name of this object

property name: str#

The name of this DataContainer-derived object.

new_container(path: Union[str, List[str]], *, Cls: Optional[type] = None, **kwargs)#

Creates a new container of type Cls and adds it at the given path relative to this group.

If needed, intermediate groups are automatically created.

Parameters
  • path (Union[str, List[str]]) – Where to add the container.

  • Cls (type, optional) – The class of the container to add. If None, the _NEW_CONTAINER_CLS class variable’s value is used.

  • **kwargs – passed on to Cls.__init__

Returns

The created container of type Cls

Raises
  • ValueError – If neither the Cls argument nor the class variable _NEW_CONTAINER_CLS were set or if path was empty.

  • TypeError – When Cls is not compatible to the data tree

new_group(path: Union[str, list], *, Cls: Optional[type] = None, **kwargs)#

Creates a new group at the given path.

Parameters
  • path (Union[str, list]) – The path to create the group at. Note that the whole intermediate path needs to already exist.

  • Cls (type, optional) – If given, use this type to create the group. If not given, uses the class specified in the _NEW_GROUP_CLS class variable or, as last resort, the type of this instance.

  • **kwargs – Passed on to Cls.__init__

Returns

The created group of type Cls

Raises

TypeError – For the given class not being derived from BaseDataGroup

property parent#

The associated parent of this container or group

property path: str#

The path to get to this container or group from some root path

pop(k[, d]) v#

Removes the specified key and returns the corresponding value. If the key is not found, d is returned if given; otherwise, a KeyError is raised.

popitem() (k, v)#

Removes and returns some (key, value) pair as a 2-tuple; raises a KeyError if D is empty.

raise_if_locked(*, prefix: Optional[str] = None)#

Raises an exception if this object is locked; does nothing otherwise

recursive_update(other, *, overwrite: bool = True)#

Recursively updates the contents of this data group with the entries of the given data group

Note

This will create shallow copies of those elements in other that are added to this object.

Parameters
  • other (BaseDataGroup) – The group to update with

  • overwrite (bool, optional) – Whether to overwrite already existing object. If False, a conflict will lead to an error being raised and the update being stopped.

Raises

TypeError – If other was of invalid type

setdefault(key, default=None)#

This method is not supported for a data group

property tree: str#

Returns the default (full) tree representation of this group

property tree_condensed: str#

Returns the condensed tree representation of this group. Uses the _COND_TREE_* prefixed class attributes as parameters.

unlock()#

Unlocks the data of this object

update([E, ]**F) None#

Update D from mapping/iterable E and F, where D is this object.

  • If E is present and has a .keys() method, does: for k in E: D[k] = E[k]

  • If E is present and lacks a .keys() method, does: for (k, v) in E: D[k] = v

In either case, this is followed by: for k, v in F.items(): D[k] = v

values()#

Returns an iterator over the containers in this group.

property with_direct_insertion: bool#

Whether the class this mixin is mixed into is currently in direct insertion mode.

__locked#

Whether the data is regarded as locked. Note name-mangling here.

__in_direct_insertion_mode#

A name-mangled state flag that determines the state of the object.

dantro.groups.psp module#

This module implements BaseDataContainer specializations that make use of features from the paramspace package, in particular the ParamSpace class.

class ParamSpaceStateGroup(*, name: str, containers: Optional[list] = None, attrs=None)[source]#

Bases: dantro.groups.ordered.OrderedDataGroup

A ParamSpaceStateGroup is meant to be used as a member group of the ParamSpaceGroup.

While its own name needs to be interpretable as a positive integer (enforced in the enclosing ParamSpaceGroup but also here), it can hold members with any name.

_NEW_GROUP_CLS#

alias of dantro.groups.ordered.OrderedDataGroup

_check_name(name: str) None[source]#

Called by __init__ and overwritten here to check the name.

property coords: dict#

Retrieves the coordinates of this group within the parameter space described by the ParamSpaceGroup this group is enclosed in.

Returns

The coordinates of this group, keys being dimension names and values being the coordinate values for this group.

Return type

dict

_ALLOWED_CONT_TYPES = None#

The types that are allowed to be stored in this group. If None, the dantro base classes are allowed

_ATTRS_CLS#

alias of dantro.base.BaseDataAttrs

_COND_TREE_CONDENSE_THRESH = 10#

Condensed tree representation threshold parameter

_COND_TREE_MAX_LEVEL = 10#

Condensed tree representation maximum level

_NEW_CONTAINER_CLS: type = None#

Which class to use for creating a new container via call to the new_container() method. If None, the type needs to be specified explicitly in the method call.

_STORAGE_CLS#

alias of collections.OrderedDict

__contains__(cont: Union[str, AbstractDataContainer]) bool#

Whether the given container is in this group or not.

If this is a data tree object, it will be checked whether this specific instance is part of the group, using is-comparison.

Otherwise, assumes that cont is a valid argument to the __getitem__() method (a key or key sequence) and tries to access the item at that path, returning True if this succeeds and False if not.

Lookup complexity is that of item lookup (scalar) for both name and object lookup.

Parameters

cont (Union[str, AbstractDataContainer]) – The name of the container, a path, or an object to check via identity comparison.

Returns

Whether the given container object is part of this group or whether the given path is accessible from this group.

Return type

bool

__delitem__(key: str) None#

Deletes an item from the group

__eq__(other) bool#

Evaluates equality by making the following comparisons: identity, strict type equality, and finally: equality of the _data and _attrs attributes, i.e. the private attribute. This ensures that comparison does not trigger any downstream effects like resolution of proxies.

If types do not match exactly, NotImplemented is returned, thus referring the comparison to the other side of the ==.

__format__(spec_str: str) str#

Creates a formatted string from the given specification.

Invokes further methods which are prefixed by _format_.

__getitem__(key: Union[str, List[str]]) AbstractDataContainer#

Looks up the given key and returns the corresponding item.

This supports recursive relative lookups in two ways:

  • By supplying a path as a string that includes the path separator. For example, foo/bar/spam walks down the tree along the given path segments.

  • By directly supplying a key sequence, i.e. a list or tuple of key strings.

With the last path segment, it is possible to access an element that is no longer part of the data tree; successive lookups thus need to use the interface of the corresponding leaf object of the data tree.

Absolute lookups, i.e. from path /foo/bar, are not possible!

Lookup complexity is that of the underlying data structure: for groups based on dict-like storage containers, lookups happen in constant time.

Note

This method aims to replicate the behavior of POSIX paths.

Thus, it can also be used to access the element itself or the parent element: Use . to refer to this object and .. to access this object’s parent.

Parameters

key (Union[str, List[str]]) – The name of the object to retrieve or a path via which it can be found in the data tree.

Returns

The object at key, which conforms to the dantro tree interface.

Return type

AbstractDataContainer

Raises

ItemAccessError – If no object could be found at the given key or if an absolute lookup, starting with /, was attempted.
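The relative, POSIX-like lookup described above can be sketched with a minimal tree class. This is not dantro's implementation: `NodeSketch` is a hypothetical class, `KeyError` stands in for ItemAccessError, and skipping empty path segments is an assumption.

```python
from typing import List, Optional, Union

PATH_SEP = "/"


class NodeSketch:
    """Hypothetical tree node supporting relative path lookups."""

    def __init__(self, name: str, parent: Optional["NodeSketch"] = None):
        self.name = name
        self.parent = parent
        self.children = {}

    def add(self, name: str) -> "NodeSketch":
        child = NodeSketch(name, parent=self)
        self.children[name] = child
        return child

    def __getitem__(self, key: Union[str, List[str]]) -> "NodeSketch":
        if isinstance(key, str):
            if key.startswith(PATH_SEP):
                # KeyError is used here in place of ItemAccessError
                raise KeyError(f"Absolute lookups are not possible: {key}")
            key = key.split(PATH_SEP)

        node = self
        for segment in key:
            if segment in ("", "."):
                continue  # refers to the current object
            if segment == "..":
                if node.parent is None:
                    raise KeyError("No parent available")
                node = node.parent
            else:
                node = node.children[segment]
        return node


root = NodeSketch("root")
root.add("foo").add("bar").add("spam")

via_string = root["foo/bar/spam"]        # path string with separators
via_sequence = root[["foo", "bar"]]      # key sequence
via_parent = root["foo/bar/.."]          # walks up again, ends at foo
```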

__init__(*, name: str, containers: Optional[list] = None, attrs=None)#

Initialize a BaseDataGroup, which can store other containers and attributes.

Parameters
  • name (str) – The name of this data container

  • containers (list, optional) – The containers that are to be stored as members of this group. If given, these are added one by one using the .add method.

  • attrs (None, optional) – A mapping that is stored as attributes

__iter__()#

Returns an iterator over the OrderedDict

__len__() int#

The number of members in this group.

__repr__() str#

Same as __str__

__setitem__(key: Union[str, List[str]], val: BaseDataContainer) None#

This method is used to allow access to the content of containers of this group. For adding an element to this group, use the add method!

Parameters
  • key (Union[str, List[str]]) – The key to which to set the value. If this is a path, will recurse down to the lowest level. Note that all intermediate keys need to be present.

  • val (BaseDataContainer) – The value to set

Returns

None

Raises

ValueError – If trying to add an element to this group, which should be done via the add method.

__sizeof__() int#

Returns the size of the data (in bytes) stored in this container’s data and its attributes.

Note that this value is approximate. It is computed by calling the sys.getsizeof() function on the data, the attributes, the name and some caching attributes that each dantro data tree class contains. Importantly, this is not a recursive algorithm.

Also, derived classes might implement further attributes that are not taken into account either. To be more precise in a subclass, create a specific __sizeof__ method and invoke this parent method additionally.

__str__() str#

An info string, that describes the object. This invokes the formatting helpers to show the log string (type and name) as well as the info string of this object.

_abc_impl = <_abc_data object>#
_add_container(cont, *, overwrite: bool)#

Private helper method to add a container to this group.

_add_container_callback(cont) None#

Called after a container was added.

_add_container_to_data(cont: AbstractDataContainer) None#

Performs the operation of adding the container to the _data. This can be used by subclasses to do more elaborate things while adding data, e.g. specify ordering …

NOTE This method should NEVER be called on its own, but only via the _add_container method, which takes care of properly linking the container that is to be added.

NOTE After adding, the container needs to be reachable under its .name!

Parameters

cont – The container to add

_attrs = None#

The class attribute that the attributes will be stored to

_check_cont(cont) None#

Can be used by a subclass to check a container before adding it to this group. Is called by _add_container before checking whether the object exists or not.

This is not expected to return, but can raise errors, if something did not work out as expected.

Parameters

cont – The container to check

_check_data(data: Any) None#

This method can be used to check the data provided to this container

It is called before the data is stored in the __init__ method and should raise an exception or create a warning if the data is not as desired.

This method can be subclassed to implement more specific behaviour. To propagate the parent classes’ behaviour the subclassed method should always call its parent method using super().

Note

The CheckDataMixin provides a generalised implementation of this method to perform some type checks and react to unexpected types.

Parameters

data (Any) – The data to check

_direct_insertion_mode(*, enabled: bool = True)#

A context manager that brings the class this mixin is used in into direct insertion mode. While in that mode, the with_direct_insertion() property will return true.

This context manager additionally invokes two callback functions, which can be specialized to perform certain operations when entering or exiting direct insertion mode: Before entering, _enter_direct_insertion_mode() is called. After exiting, _exit_direct_insertion_mode() is called.

Parameters

enabled (bool, optional) – whether to actually use direct insertion mode. If False, will yield directly without setting the toggle. This is equivalent to a null-context.
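The behaviour described above can be sketched with contextlib. The class DirectInsertionSketch and its events list are hypothetical; the ordering of the callbacks relative to the toggle follows the description above (enter callback before the toggle is set, exit callback after it is reset) and is an assumption about the details.

```python
from contextlib import contextmanager


class DirectInsertionSketch:
    """Hypothetical stand-in for a class using a direct insertion mixin."""

    def __init__(self):
        self._in_direct_insertion_mode = False
        self.events = []

    @property
    def with_direct_insertion(self) -> bool:
        return self._in_direct_insertion_mode

    def _enter_direct_insertion_mode(self):
        self.events.append("enter")

    def _exit_direct_insertion_mode(self):
        self.events.append("exit")

    @contextmanager
    def _direct_insertion_mode(self, *, enabled: bool = True):
        if not enabled:
            # Behaves like a null-context: no toggle, no callbacks
            yield
            return

        self._enter_direct_insertion_mode()
        self._in_direct_insertion_mode = True
        try:
            yield
        finally:
            # Reset the toggle even if the body raised
            self._in_direct_insertion_mode = False
            self._exit_direct_insertion_mode()


obj = DirectInsertionSketch()
with obj._direct_insertion_mode():
    inside = obj.with_direct_insertion
with obj._direct_insertion_mode(enabled=False):
    inside_disabled = obj.with_direct_insertion
```

Using try/finally ensures that the object never stays in direct insertion mode after an exception inside the with block.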

_enter_direct_insertion_mode()#

Called after entering direct insertion mode; can be overwritten to attach additional behaviour.

_exit_direct_insertion_mode()#

Called before exiting direct insertion mode; can be overwritten to attach additional behaviour.

_format_cls_name() str#

A __format__ helper function: returns the class name

_format_info() str#

A __format__ helper function: returns an info string that is used to characterize this object. Does NOT include name and classname!

_format_logstr() str#

A __format__ helper function: returns the log string, a combination of class name and name

_format_name() str#

A __format__ helper function: returns the name

_format_path() str#

A __format__ helper function: returns the path to this container

_format_tree() str#

Returns the default tree representation of this group by invoking the .tree property

_format_tree_condensed() str#

Returns the condensed tree representation of this group by invoking the .tree_condensed property

_ipython_key_completions_() List[str]#

For ipython integration, return a list of available keys

Links the new_child to this class, unlinking the old one.

This method should be called from any method that changes which items are associated with this group.

_lock_hook()#

Invoked upon locking.

_tree_repr(*, level: int = 0, max_level: Optional[int] = None, info_fstr='<{:cls_name,info}>', info_ratio: float = 0.6, condense_thresh: Optional[Union[int, Callable[[int, int], int]]] = None, total_item_count: int = 0) Union[str, List[str]]#

Recursively creates a multi-line string tree representation of this group. This is used by, e.g., the _format_tree method.

Parameters
  • level (int, optional) – The depth within the tree

  • max_level (int, optional) – The maximum depth within the tree; recursion is not continued beyond this level.

  • info_fstr (str, optional) – The format string for the info string

  • info_ratio (float, optional) – The width ratio of the whole line width that the info string takes

  • condense_thresh (Union[int, Callable[[int, int], int]], optional) – If given, this specifies the threshold beyond which the tree view for the current element becomes condensed by hiding the output for some elements. The minimum value for this is 3, indicating that at most 3 lines should be generated from this level (excluding the lines coming from recursion), i.e.: two elements and one line indicating how many values are hidden. If a smaller value is given, this is silently brought up to 3. Half of the elements are taken from the beginning of the item iteration, the other half from the end. If given as an integer, that number is used. If a callable is given, it will be invoked with the current level, the number of elements to be added at this level, and the current total item count along this recursion branch. The callable should then return the number of lines to be shown for the current element.

  • total_item_count (int, optional) – The total number of items already created in this recursive tree representation call. Passed on between recursive calls.

Returns

The (multi-line) tree representation of this group. If this method was invoked with level == 0, a string will be returned; otherwise, a list of strings will be returned.

Return type

Union[str, List[str]]
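The condensing rule for condense_thresh can be sketched for a single tree level as follows. The helper condense, the text of the marker line, and the rounding used to split the shown elements between head and tail are assumptions of this sketch; the callable is invoked with three arguments, following the parameter description.

```python
from typing import Callable, List, Union


def condense(
    lines: List[str],
    thresh: Union[int, Callable[[int, int, int], int]],
    *,
    level: int = 0,
    total_item_count: int = 0,
) -> List[str]:
    """Hypothetical helper: keep at most ``thresh`` lines of one tree level."""
    if callable(thresh):
        thresh = thresh(level, len(lines), total_item_count)
    thresh = max(3, thresh)  # smaller values are silently raised to 3

    if len(lines) <= thresh:
        return list(lines)

    # One line is reserved for the "hidden" marker; the rest is split between
    # the beginning and the end of the iteration (assumed rounding).
    n_show = thresh - 1
    n_tail = n_show // 2
    n_head = n_show - n_tail
    n_hidden = len(lines) - n_show
    return lines[:n_head] + [f"... ({n_hidden} more) ..."] + lines[-n_tail:]


members = [f"item{i}" for i in range(10)]
condensed = condense(members, 1)
```

With a threshold below 3, the result consists of the first element, one marker line, and the last element.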

Unlink a child from this class.

This method should be called from any method that removes an item from this group, be it through deletion or through

_unlock_hook()#

Invoked upon unlocking.

add(*conts, overwrite: bool = False)#

Add the given containers to this group.

property attrs#

The container attributes.

property classname: str#

Returns the name of this DataContainer-derived class

clear()#

Clears all containers from this group.

This is done by unlinking all children and then overwriting _data with an empty _STORAGE_CLS object.

property data#

The stored data.

get(key, default=None)#

Return the container at key, or default if no container with name key is available.

items()#

Returns an iterator over the (name, data container) tuples of this group.

keys()#

Returns an iterator over the container names in this group.

lock()#

Locks the data of this object

property locked: bool#

Whether this object is locked

property logstr: str#

Returns the classname and name of this object

property name: str#

The name of this DataContainer-derived object.

new_container(path: Union[str, List[str]], *, Cls: Optional[type] = None, **kwargs)#

Creates a new container of type Cls and adds it at the given path relative to this group.

If needed, intermediate groups are automatically created.

Parameters
  • path (Union[str, List[str]]) – Where to add the container.

  • Cls (type, optional) – The class of the container to add. If None, the _NEW_CONTAINER_CLS class variable’s value is used.

  • **kwargs – passed on to Cls.__init__

Returns

The created container of type Cls

Raises
  • ValueError – If neither the Cls argument nor the class variable _NEW_CONTAINER_CLS were set or if path was empty.

  • TypeError – When Cls is not compatible to the data tree

new_group(path: Union[str, list], *, Cls: Optional[type] = None, **kwargs)#

Creates a new group at the given path.

Parameters
  • path (Union[str, list]) – The path to create the group at. Note that the whole intermediate path needs to already exist.

  • Cls (type, optional) – If given, use this type to create the group. If not given, uses the class specified in the _NEW_GROUP_CLS class variable or, as last resort, the type of this instance.

  • **kwargs – Passed on to Cls.__init__

Returns

The created group of type Cls

Raises

TypeError – For the given class not being derived from BaseDataGroup

property parent#

The associated parent of this container or group

property path: str#

The path to get to this container or group from some root path

pop(k[, d])#

Removes the specified key and returns the corresponding value. If the key is not found, d is returned if given; otherwise, KeyError is raised.

popitem()#

Removes and returns some (key, value) pair as a 2-tuple; raises KeyError if the group is empty.

raise_if_locked(*, prefix: Optional[str] = None)#

Raises an exception if this object is locked; does nothing otherwise

recursive_update(other, *, overwrite: bool = True)#

Recursively updates the contents of this data group with the entries of the given data group

Note

This will create shallow copies of those elements in other that are added to this object.

Parameters
  • other (BaseDataGroup) – The group to update with

  • overwrite (bool, optional) – Whether to overwrite already existing objects. If False, a conflict will lead to an error being raised and the update being stopped.

Raises

TypeError – If other was of invalid type
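The update logic can be illustrated with nested dictionaries standing in for groups. This is a sketch under stated assumptions: the function below is hypothetical, and raising a ValueError on conflicts is a choice made here, since the description does not name the error type.

```python
import copy


def recursive_update(target: dict, other: dict, *, overwrite: bool = True) -> dict:
    """Hypothetical dict-based analogue of a recursive group update."""
    if not isinstance(other, dict):
        raise TypeError(f"Can only update from a dict, got {type(other).__name__}")

    for key, val in other.items():
        if isinstance(val, dict) and isinstance(target.get(key), dict):
            # Both sides are "groups": descend instead of replacing
            recursive_update(target[key], val, overwrite=overwrite)
        elif key in target and not overwrite:
            raise ValueError(f"Entry '{key}' already exists and overwrite=False")
        else:
            # Entries taken from `other` are added as shallow copies
            target[key] = copy.copy(val)
    return target


base = {"grp": {"a": [1], "b": [2]}}
new = {"grp": {"b": [20], "c": [30]}}
recursive_update(base, new)
```

Because of the shallow copy, the added entries are new objects, while anything nested inside them is still shared with other.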

setdefault(key, default=None)#

This method is not supported for a data group

property tree: str#

Returns the default (full) tree representation of this group

property tree_condensed: str#

Returns the condensed tree representation of this group. Uses the _COND_TREE_* prefixed class attributes as parameters.

unlock()#

Unlocks the data of this object

update([E, ]**F) None#

Updates D (this group) from the mapping or iterable E and from F. If E is present and has a .keys() method, does: for k in E: D[k] = E[k]. If E is present and lacks a .keys() method, does: for (k, v) in E: D[k] = v. In either case, this is followed by: for k, v in F.items(): D[k] = v

values()#

Returns an iterator over the containers in this group.

property with_direct_insertion: bool#

Whether the class this mixin is mixed into is currently in direct insertion mode.

__locked#

Whether the data is regarded as locked. Note name-mangling here.

__in_direct_insertion_mode#

A name-mangled state flag that determines the state of the object.
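Name mangling, as mentioned for the two flags above, means that Python stores a double-underscore attribute under a class-specific name. The classes Lockable and Child in this sketch are hypothetical and only demonstrate the language mechanism.

```python
class Lockable:
    """Hypothetical class with a name-mangled lock flag."""

    def __init__(self):
        self.__locked = False  # stored as _Lockable__locked

    def lock(self):
        self.__locked = True

    @property
    def locked(self) -> bool:
        return self.__locked


class Child(Lockable):
    def __init__(self):
        super().__init__()
        self.__locked = "unrelated"  # stored as _Child__locked, no clash


obj = Child()
obj.lock()
```

The subclass attribute does not interfere with the parent's flag, which is what makes mangled names useful for state that mixins keep to themselves.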

class ParamSpaceGroup(*, name: str, pspace: Optional[ParamSpace] = None, containers: Optional[list] = None, **kwargs)[source]#

Bases: dantro.mixins.indexing.PaddedIntegerItemAccessMixin, dantro.groups.ordered.IndexedDataGroup

The ParamSpaceGroup is associated with a paramspace.paramspace.ParamSpace object and the loaded results of an iteration over this parameter space.

Thus, the groups stored in the ParamSpaceGroup all need to relate to a state of the parameter space, identified by a zero-padded string name. In fact, this group allows no other kinds of groups to be stored inside.

To make access to a specific state easier, it allows accessing a state by its state number as integer.

_PSPGRP_PSPACE_ATTR_NAME = 'pspace'#
_PSPGRP_TRANSFORMATOR = None#
_NEW_GROUP_CLS#

alias of dantro.groups.psp.ParamSpaceStateGroup

_ALLOWED_CONT_TYPES = (<class 'dantro.groups.psp.ParamSpaceStateGroup'>,)#

The types that are allowed to be stored in this group. If None, the dantro base classes are allowed

__init__(*, name: str, pspace: Optional[ParamSpace] = None, containers: Optional[list] = None, **kwargs)[source]#

Initialize an OrderedDataGroup from the list of given containers.

Parameters
  • name (str) – The name of this group.

  • pspace (ParamSpace, optional) – Can already pass a ParamSpace object here.

  • containers (list, optional) – A list of containers to add, which need to be ParamSpaceStateGroup objects.

  • **kwargs – Further initialisation kwargs, e.g. attrs

property pspace: Optional[ParamSpace]#

Reads the entry named _PSPGRP_PSPACE_ATTR_NAME in .attrs and returns a ParamSpace object, if available there.

Returns

The associated parameter space, or None, if there is none associated yet.

Return type

Union[ParamSpace, None]

_ATTRS_CLS#

alias of dantro.base.BaseDataAttrs

_COND_TREE_CONDENSE_THRESH = 10#

Condensed tree representation threshold parameter

_COND_TREE_MAX_LEVEL = 10#

Condensed tree representation maximum level

_NEW_CONTAINER_CLS: type = None#

Which class to use for creating a new container via call to the new_container() method. If None, the type needs to be specified explicitly in the method call.

_PADDED_INT_FSTR: str = None#

The format string to generate a padded integer; deduced upon first call

_PADDED_INT_KEY_WIDTH: int = None#

The number of digits of the padded string representing the integer

_PADDED_INT_MAX_VAL: int = None#

The allowed maximum value of an integer key; checked only in strict mode

_PADDED_INT_STRICT_CHECKING: bool = True#

Whether to use strict checking when parsing keys, i.e. check that the range of keys is valid and an error is thrown when an integer key was given that cannot be represented consistently by a padded string of the determined key width.

_STORAGE_CLS#

alias of dantro.utils.ordereddict.IntOrderedDict

__contains__(key: Union[str, int]) bool#

Adjusts the parent method to allow checking for integers

__delitem__(key: Union[str, int])#

Adjusts the parent method to allow item deletion by integer key

__eq__(other) bool#

Evaluates equality by making the following comparisons: identity, strict type equality, and finally: equality of the _data and _attrs attributes, i.e. the private attribute. This ensures that comparison does not trigger any downstream effects like resolution of proxies.

If types do not match exactly, NotImplemented is returned, thus referring the comparison to the other side of the ==.
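The comparison sequence described above (identity, strict type equality, then the private attributes) can be sketched as follows. Node and SpecialNode are hypothetical classes, not dantro types.

```python
class Node:
    """Hypothetical container comparing only private storage attributes."""

    def __init__(self, data, attrs=None):
        self._data = data
        self._attrs = attrs if attrs is not None else {}

    def __eq__(self, other):
        if other is self:
            return True  # identity
        if type(other) is not type(self):
            return NotImplemented  # defer to the other operand
        return self._data == other._data and self._attrs == other._attrs


class SpecialNode(Node):
    pass


a = Node([1, 2], {"unit": "s"})
b = Node([1, 2], {"unit": "s"})
c = SpecialNode([1, 2], {"unit": "s"})
```

When both operands return NotImplemented, Python falls back to an identity comparison, so a == c evaluates to False even though the stored data is equal.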

__format__(spec_str: str) str#

Creates a formatted string from the given specification.

Invokes further methods which are prefixed by _format_.
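A format specification such as the one in the default info_fstr, '<{:cls_name,info}>', can be dispatched to _format_-prefixed helpers roughly as shown below. The class Formattable is hypothetical; splitting the specification at commas and joining the parts with ", " are assumptions of this sketch.

```python
class Formattable:
    """Hypothetical object whose format spec selects ``_format_*`` helpers."""

    def __init__(self, name: str, size: int):
        self.name = name
        self.size = size

    def _format_cls_name(self) -> str:
        return type(self).__name__

    def _format_name(self) -> str:
        return self.name

    def _format_info(self) -> str:
        return f"{self.size} members"

    def _format_logstr(self) -> str:
        return f"{self._format_cls_name()} '{self._format_name()}'"

    def __format__(self, spec_str: str) -> str:
        if not spec_str:
            return str(self)
        parts = []
        for spec in spec_str.split(","):
            helper = getattr(self, f"_format_{spec.strip()}", None)
            if helper is None:
                raise ValueError(f"Invalid format spec '{spec}'")
            parts.append(helper())
        # Joining with ", " is an assumption of this sketch
        return ", ".join(parts)


grp = Formattable("results", 3)
formatted = "<{:cls_name,info}>".format(grp)
```

The same mechanism works in f-strings, e.g. f"{grp:logstr}".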

__getitem__(key: Union[str, int])#

Adjusts the parent method to allow integer key item access

__iter__()#

Returns an iterator over the OrderedDict

__len__() int#

The number of members in this group.

__repr__() str#

Same as __str__

__setitem__(key: Union[str, int])#

Adjusts the parent method to allow item setting by integer key

__sizeof__() int#

Returns the size of the data (in bytes) stored in this container’s data and its attributes.

Note that this value is approximate. It is computed by calling the sys.getsizeof() function on the data, the attributes, the name and some caching attributes that each dantro data tree class contains. Importantly, this is not a recursive algorithm.

Also, derived classes might implement further attributes that are not taken into account either. To be more precise in a subclass, create a specific __sizeof__ method and invoke this parent method additionally.

__str__() str#

An info string, that describes the object. This invokes the formatting helpers to show the log string (type and name) as well as the info string of this object.

_abc_impl = <_abc_data object>#
_add_container(cont, *, overwrite: bool)#

Private helper method to add a container to this group.

_add_container_callback(cont) None#

Called after a container was added.

_add_container_to_data(cont) None#

Adds a container to the underlying integer-ordered dictionary.

Unlike the parent method, this uses insert() in order to provide hints regarding the insertion position. It is optimised for insertion in ascending order.
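Why ascending insertion can be cheap is illustrated below with a sorted list of integer keys. This is not the IntOrderedDict implementation; insert_sorted is a hypothetical helper that only demonstrates the idea of checking the end of the sequence first.

```python
import bisect
from typing import List


def insert_sorted(keys: List[int], key: int) -> int:
    """Insert ``key`` into the sorted list ``keys``; return its position.

    Hypothetical helper; checks the end of the list first, so inserting
    keys in ascending order reduces to an append.
    """
    if not keys or key > keys[-1]:
        keys.append(key)
        return len(keys) - 1

    # Fallback for out-of-order insertion: binary search for the position
    pos = bisect.bisect_left(keys, key)
    keys.insert(pos, key)
    return pos


ordered: List[int] = []
positions = [insert_sorted(ordered, k) for k in (1, 2, 5, 3)]
```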

_attrs = None#

The class attribute that the attributes will be stored to

_check_cont(cont: AbstractDataContainer) None#

This method is invoked when adding a member to a group and makes sure the name of the added group is correctly zero-padded.

Also, upon first call, communicates the zero padded integer key width, i.e.: the length of the container name, to the PaddedIntegerItemAccessMixin.

Parameters

cont – The member container to add

Returns

None: No return value needed

_check_data(data: Any) None#

This method can be used to check the data provided to this container

It is called before the data is stored in the __init__ method and should raise an exception or create a warning if the data is not as desired.

This method can be subclassed to implement more specific behaviour. To propagate the parent classes’ behaviour the subclassed method should always call its parent method using super().

Note

The CheckDataMixin provides a generalised implementation of this method to perform some type checks and react to unexpected types.

Parameters

data (Any) – The data to check

_check_name(new_name: str) None#

Called from name.setter and can be used to check the name that the container is supposed to have. On invalid name, this should raise.

This method can be subclassed to implement more specific behaviour. To propagate the parent classes’ behaviour the subclassed method should always call its parent method using super().

Parameters

new_name (str) – The new name, which is to be checked.

_direct_insertion_mode(*, enabled: bool = True)#

A context manager that brings the class this mixin is used in into direct insertion mode. While in that mode, the with_direct_insertion() property will return true.

This context manager additionally invokes two callback functions, which can be specialized to perform certain operations when entering or exiting direct insertion mode: Before entering, _enter_direct_insertion_mode() is called. After exiting, _exit_direct_insertion_mode() is called.

Parameters

enabled (bool, optional) – whether to actually use direct insertion mode. If False, will yield directly without setting the toggle. This is equivalent to a null-context.

_enter_direct_insertion_mode()#

Called after entering direct insertion mode; can be overwritten to attach additional behaviour.

_exit_direct_insertion_mode()#

Called before exiting direct insertion mode; can be overwritten to attach additional behaviour.

_format_cls_name() str#

A __format__ helper function: returns the class name

_format_info() str#

A __format__ helper function: returns an info string that is used to characterize this object. Does NOT include name and classname!

_format_logstr() str#

A __format__ helper function: returns the log string, a combination of class name and name

_format_name() str#

A __format__ helper function: returns the name

_format_path() str#

A __format__ helper function: returns the path to this container

_format_tree() str#

Returns the default tree representation of this group by invoking the .tree property

_format_tree_condensed() str#

Returns the condensed tree representation of this group by invoking the .tree_condensed property

_ipython_key_completions_() List[int]#

For ipython integration, return a list of available keys.

Unlike the BaseDataGroup method, which returns a list of strings, this returns a list of integers.

Links the new_child to this class, unlinking the old one.

This method should be called from any method that changes which items are associated with this group.

_lock_hook()#

Invoked upon locking.

_parse_key(key: Union[str, int]) str#

Parse a potentially integer key to a zero-padded string
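Parsing an integer key into a zero-padded string, including the strict range check described for _PADDED_INT_STRICT_CHECKING, could look roughly like this. The class PaddedKeys and the exception types it raises are assumptions made for illustration.

```python
from typing import Optional, Union


class PaddedKeys:
    """Hypothetical key parser for zero-padded integer names."""

    def __init__(self, key_width: Optional[int] = None, *, strict: bool = True):
        self.key_width = key_width
        self.strict = strict

    def parse_key(self, key: Union[str, int]) -> str:
        if isinstance(key, str):
            return key  # string keys are passed through unchanged
        if self.key_width is None:
            raise ValueError("Key width not known yet; cannot pad integer key.")

        max_val = 10**self.key_width - 1
        if self.strict and not (0 <= key <= max_val):
            raise IndexError(
                f"Integer key {key} cannot be represented with "
                f"{self.key_width} digits (allowed: 0 to {max_val})."
            )
        return f"{key:0{self.key_width}d}"


keys = PaddedKeys(key_width=4)
padded = keys.parse_key(42)
```

In this sketch, the key width plays the role of _PADDED_INT_KEY_WIDTH and max_val that of _PADDED_INT_MAX_VAL.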

_tree_repr(*, level: int = 0, max_level: Optional[int] = None, info_fstr='<{:cls_name,info}>', info_ratio: float = 0.6, condense_thresh: Optional[Union[int, Callable[[int, int], int]]] = None, total_item_count: int = 0) Union[str, List[str]]#

Recursively creates a multi-line string tree representation of this group. This is used by, e.g., the _format_tree method.

Parameters
  • level (int, optional) – The depth within the tree

  • max_level (int, optional) – The maximum depth within the tree; recursion is not continued beyond this level.

  • info_fstr (str, optional) – The format string for the info string

  • info_ratio (float, optional) – The width ratio of the whole line width that the info string takes

  • condense_thresh (Union[int, Callable[[int, int], int]], optional) – If given, this specifies the threshold beyond which the tree view for the current element becomes condensed by hiding the output for some elements. The minimum value for this is 3, indicating that at most 3 lines should be generated from this level (excluding the lines coming from recursion), i.e.: two elements and one line indicating how many values are hidden. If a smaller value is given, this is silently brought up to 3. Half of the elements are taken from the beginning of the item iteration, the other half from the end. If given as an integer, that number is used. If a callable is given, it will be invoked with the current level, the number of elements to be added at this level, and the current total item count along this recursion branch. The callable should then return the number of lines to be shown for the current element.

  • total_item_count (int, optional) – The total number of items already created in this recursive tree representation call. Passed on between recursive calls.

Returns

The (multi-line) tree representation of this group. If this method was invoked with level == 0, a string will be returned; otherwise, a list of strings will be returned.

Return type

Union[str, List[str]]

Unlink a child from this class.

This method should be called from any method that removes an item from this group, be it through deletion or through

_unlock_hook()#

Invoked upon unlocking.

add(*conts, overwrite: bool = False)#

Add the given containers to this group.

property attrs#

The container attributes.

property classname: str#

Returns the name of this DataContainer-derived class

clear()#

Clears all containers from this group.

This is done by unlinking all children and then overwriting _data with an empty _STORAGE_CLS object.

property data#

The stored data.

get(key, default=None)#

Return the container at key, or default if no container with name key is available.

items()#

Returns an iterator over the (name, data container) tuples of this group.

key_at_idx(idx: int) str#

Get a key by its index within the container. Can be negative.

Parameters

idx (int) – The index within the member sequence

Returns

The desired key

Return type

str

Raises

IndexError – Index out of range
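Index-based key lookup in an ordered mapping, including negative indices, can be sketched as follows. A plain dict stands in for the group's storage; the function is hypothetical and not the dantro implementation.

```python
from itertools import islice


def key_at_idx(members: dict, idx: int) -> str:
    """Return the key at position ``idx`` of an ordered mapping (sketch)."""
    n = len(members)
    if not -n <= idx < n:
        raise IndexError(f"Index {idx} out of range for {n} members")
    if idx < 0:
        idx += n  # negative indices count from the end
    # Advance the key iterator to the requested position
    return next(islice(members.keys(), idx, None))


members = {"000": "first", "001": "second", "002": "third"}
last_key = key_at_idx(members, -1)
```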

keys()#

Returns an iterator over the container names in this group.

keys_as_int() Generator[int, None, None]#

Returns an iterator over keys as integer values

lock()#

Locks the data of this object

property locked: bool#

Whether this object is locked

property logstr: str#

Returns the classname and name of this object

property name: str#

The name of this DataContainer-derived object.

new_container(path: Union[str, List[str]], *, Cls: Optional[type] = None, **kwargs)#

Creates a new container of type Cls and adds it at the given path relative to this group.

If needed, intermediate groups are automatically created.

Parameters
  • path (Union[str, List[str]]) – Where to add the container.

  • Cls (type, optional) – The class of the container to add. If None, the _NEW_CONTAINER_CLS class variable’s value is used.

  • **kwargs – passed on to Cls.__init__

Returns

The created container of type Cls

Raises
  • ValueError – If neither the Cls argument nor the class variable _NEW_CONTAINER_CLS were set or if path was empty.

  • TypeError – When Cls is not compatible to the data tree

new_group(path: Union[str, list], *, Cls: Optional[type] = None, **kwargs)#

Creates a new group at the given path.

Parameters
  • path (Union[str, list]) – The path to create the group at. Note that the whole intermediate path needs to already exist.

  • Cls (type, optional) – If given, use this type to create the group. If not given, uses the class specified in the _NEW_GROUP_CLS class variable or, as last resort, the type of this instance.

  • **kwargs – Passed on to Cls.__init__

Returns

The created group of type Cls

Raises

TypeError – For the given class not being derived from BaseDataGroup

property only_default_data_present: bool#

Returns true if only data for the default point in parameter space is available in this group.

property padded_int_key_width: Optional[int]#

Returns the width of the zero-padded integer key or None, if it is not already specified.

property parent#

The associated parent of this container or group

property path: str#

The path to get to this container or group from some root path

pop(k[, d])#

Removes the specified key and returns the corresponding value. If the key is not found, d is returned if given; otherwise, KeyError is raised.

popitem()#

Removes and returns some (key, value) pair as a 2-tuple; raises KeyError if the group is empty.

raise_if_locked(*, prefix: Optional[str] = None)#

Raises an exception if this object is locked; does nothing otherwise

recursive_update(other, *, overwrite: bool = True)#

Recursively updates the contents of this data group with the entries of the given data group

Note

This will create shallow copies of those elements in other that are added to this object.

Parameters
  • other (BaseDataGroup) – The group to update with

  • overwrite (bool, optional) – Whether to overwrite already existing objects. If False, a conflict will lead to an error being raised and the update being stopped.

Raises

TypeError – If other was of invalid type

setdefault(key, default=None)#

This method is not supported for a data group

property tree: str#

Returns the default (full) tree representation of this group

property tree_condensed: str#

Returns the condensed tree representation of this group. Uses the _COND_TREE_* prefixed class attributes as parameters.

unlock()#

Unlocks the data of this object

update([E, ]**F) None#

Updates D (this group) from the mapping or iterable E and from F. If E is present and has a .keys() method, does: for k in E: D[k] = E[k]. If E is present and lacks a .keys() method, does: for (k, v) in E: D[k] = v. In either case, this is followed by: for k, v in F.items(): D[k] = v

values()#

Returns an iterator over the containers in this group.

property with_direct_insertion: bool#

Whether the class this mixin is mixed into is currently in direct insertion mode.

__locked#

Whether the data is regarded as locked. Note name-mangling here.

__in_direct_insertion_mode#

A name-mangled state flag that determines the state of the object.

select(*, field: Union[str, List[str]] = None, fields: Dict[str, List[str]] = None, subspace: dict = None, method: str = 'concat', idx_as_label: bool = False, base_path: str = None, **kwargs) Dataset[source]#

Selects a multi-dimensional slab of this ParamSpaceGroup and the specified fields and returns them bundled into an xarray.Dataset with labelled dimensions and coordinates.

Parameters
  • field (Union[str, List[str]], optional) – The field of data to select. Should be a path or a list of strings that points to an entry in the data tree. To select multiple fields, do not pass this argument but use the fields argument.

  • fields (Dict[str, List[str]], optional) – A dict specifying the fields that are to be loaded into the dataset. Keys will be the names of the resulting variables, while values should specify the path to the field in the data tree. Thus, they can be strings, lists of strings or dicts with the path key present. In the latter case, a dtype can be specified via the dtype key in the dict.

  • subspace (dict, optional) – Selector for a subspace of the parameter space. Adheres to the parameter space’s activate_subspace() signature.

  • method (str, optional) –

    How to combine the selected datasets.

    • concat: concatenate sequentially along all parameter space dimensions. This can preserve the data type but it does not work if one data point is missing.

    • merge: merge always works, even if data points are missing, but will convert all dtypes to float.

  • idx_as_label (bool, optional) – If true, adds the trivial indices as labels for those dimensions where coordinate labels were not extractable from the loaded field. This allows merging for data with different extents in an unlabelled dimension.

  • base_path (str, optional) – If given, path specifications for each field can be seen as relative to this path

  • **kwargs – Passed along either to xr.concat or xr.merge, depending on the method argument.

Raises
  • KeyError – On invalid state key.

  • ValueError – Raised in multiple scenarios: If no ParamSpace was associated with this group, for wrong argument values, if the data to select cannot be extracted with the given argument values, exceptions passed on from xarray.

Returns

The selected hyperslab of the parameter space, holding the desired fields.

Return type

Dataset
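The three accepted forms of an entry in fields (string, list of strings, or dict with a path key and an optional dtype key) can be brought into a common shape before the data is looked up. The helper normalize_field is hypothetical and not part of dantro; using "/" as the path separator and prepending base_path segment-wise are assumptions of this sketch.

```python
from typing import List, Optional, Tuple, Union

FieldSpec = Union[str, List[str], dict]


def normalize_field(
    spec: FieldSpec, base_path: Optional[str] = None
) -> Tuple[List[str], Optional[str]]:
    """Return ``(path segments, dtype)`` for one entry of ``fields``."""
    dtype = None
    if isinstance(spec, dict):
        dtype = spec.get("dtype")
        spec = spec["path"]  # the path key is required in the dict form

    segments = spec.split("/") if isinstance(spec, str) else list(spec)
    if base_path:
        # Field paths are interpreted relative to the base path
        segments = base_path.split("/") + segments
    return segments, dtype


fields = {
    "state": "data/state",
    "energy": ["data", "energy"],
    "count": {"path": "data/count", "dtype": "int32"},
}
normalized = {
    name: normalize_field(spec, base_path="sim") for name, spec in fields.items()
}
```

The keys of the fields dict (state, energy, count in this made-up example) would become the variable names of the resulting dataset.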

dantro.groups.time_series module#

Implements LabelledDataGroup specializations for time series data.

class TimeSeriesGroup(*args, dims: Optional[Tuple[str]] = None, mode: Optional[str] = None, allow_deep_selection: Optional[bool] = None, **kwargs)[source]#

Bases: dantro.groups.labelled.LabelledDataGroup, dantro.groups.ordered.IndexedDataGroup

A time-series group assumes that each stored member refers to one point in time, where the name is to be interpreted as the time coordinate.

For more information on selection methods, see:

LDG_DIMS = ('time',)#

Expected dimension names. There is only one dimension in a TimeSeriesGroup: time

LDG_EXTRACT_COORDS_FROM = 'name'#

Where to extract time coordinates from. Here, the container name is expected to be the time coordinate.

LDG_ALLOW_DEEP_SELECTION = True#
LDG_COORDS_ATTR_PREFIX = 'ext_coords__'#
LDG_COORDS_MODE_ATTR_PREFIX = 'ext_coords_mode__'#
LDG_COORDS_MODE_DEFAULT = 'scalar'#
LDG_COORDS_SEPARATOR_IN_NAME = ';'#
LDG_STRICT_ATTR_CHECKING = False#
_ALLOWED_CONT_TYPES = None#

The types that are allowed to be stored in this group. If None, the dantro base classes are allowed

_ATTRS_CLS#

alias of dantro.base.BaseDataAttrs

_COLLECTIVE_SELECT_THRESHOLD = 1.8#
_COND_TREE_CONDENSE_THRESH = 10#

Condensed tree representation threshold parameter

_COND_TREE_MAX_LEVEL = 10#

Condensed tree representation maximum level

_NEW_CONTAINER_CLS#

alias of dantro.containers.xr.XrDataContainer

_NEW_GROUP_CLS#

alias of dantro.groups.ordered.OrderedDataGroup

_STORAGE_CLS#

alias of dantro.utils.ordereddict.IntOrderedDict

__contains__(key: Union[str, int]) bool#

Adjusts the parent method to allow checking for integers

__delitem__(key: Union[str, int])#

Adjusts the parent method to allow item deletion by integer key

__eq__(other) bool#

Evaluates equality by making the following comparisons: identity, strict type equality, and finally: equality of the _data and _attrs attributes, i.e. the private attribute. This ensures that comparison does not trigger any downstream effects like resolution of proxies.

If types do not match exactly, NotImplemented is returned, thus referring the comparison to the other side of the ==.

__format__(spec_str: str) str#

Creates a formatted string from the given specification.

Invokes further methods which are prefixed by _format_.

__getitem__(key: Union[str, int])#

Adjusts the parent method to allow integer key item access

__init__(*args, dims: Optional[Tuple[str]] = None, mode: Optional[str] = None, allow_deep_selection: Optional[bool] = None, **kwargs)#

Initialize a LabelledDataGroup

Parameters
  • *args – Passed on to OrderedDataGroup

  • dims (TDims, optional) – The dimensions associated with this group. If not given, will use those defined in the LDG_DIMS class variable. These can not be changed afterwards!

  • mode (str, optional) – By which coordinate extraction mode to get the coordinates from the group members. Can be attrs, name, data or anything else specified in extract_coords().

  • allow_deep_selection (bool, optional) – Whether to allow deep selection. If not given, will use the LDG_ALLOW_DEEP_SELECTION class variable’s value. Behaviour can be changed via the property of the same name.

  • **kwargs – Passed on to OrderedDataGroup

__iter__()#

Returns an iterator over the OrderedDict

__len__() int#

The number of members in this group.

__repr__() str#

Same as __str__

__setitem__(key: Union[str, int])#

Adjusts the parent method to allow item setting by integer key

__sizeof__() int#

Returns the size of the data (in bytes) stored in this container’s data and its attributes.

Note that this value is approximate. It is computed by calling the sys.getsizeof() function on the data, the attributes, the name and some caching attributes that each dantro data tree class contains. Importantly, this is not a recursive algorithm.

Also, derived classes might implement further attributes that are not taken into account either. To be more precise in a subclass, create a specific __sizeof__ method and invoke this parent method additionally.
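The following sketch illustrates what a non-recursive size estimate means and how a subclass might extend it. It is not dantro's implementation: the class names and the `_cache` attribute are hypothetical, and only `sys.getsizeof()` is taken from the description.

```python
import sys


class Container:
    """Hypothetical container with a shallow size estimate."""

    def __init__(self, name, data, attrs=None):
        self._name = name
        self._data = data
        self._attrs = attrs if attrs is not None else {}

    def __sizeof__(self):
        # Shallow sizes only: objects nested inside _data are not followed
        return sum(
            sys.getsizeof(obj)
            for obj in (self._data, self._attrs, self._name)
        )


class CachingContainer(Container):
    """Hypothetical subclass carrying an additional attribute."""

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self._cache = {}

    def __sizeof__(self):
        # Add the subclass-specific attribute to the parent's estimate
        return super().__sizeof__() + sys.getsizeof(self._cache)


small = Container("a", [[0] * 10])
large = Container("a", [[0] * 10000])
```

Because the estimate is not recursive, `small` and `large` report the same size here: both hold a one-element outer list, and the inner lists are not inspected.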

__str__() str#

An info string, that describes the object. This invokes the formatting helpers to show the log string (type and name) as well as the info string of this object.

_abc_impl = <_abc_data object>#
_add_container(cont, *, overwrite: bool)#

Private helper method to add a container to this group.

_add_container_callback(cont: AbstractDataContainer) None#

Called by the base class after adding a container, this method checks whether the member map needs to be invalidated or whether the new container can be accommodated in it.

If it can be accommodated, the member map will be adjusted such that for all coordinates associated with the given cont, the member map points to the newly added container.

Parameters

cont (AbstractDataContainer) – The newly added container

_add_container_to_data(cont) None#

Adds a container to the underlying integer-ordered dictionary.

Unlike the parent method, this uses insert() in order to provide hints regarding the insertion position. It is optimised for insertion in ascending order.

_attrs = None#

The class attribute that the attributes will be stored to

_check_cont(cont) None#

Can be used by a subclass to check a container before adding it to this group. Is called by _add_container before checking whether the object exists or not.

This is not expected to return, but can raise errors, if something did not work out as expected.

Parameters

cont – The container to check

_check_data(data: Any) None#

This method can be used to check the data provided to this container

It is called before the data is stored in the __init__ method and should raise an exception or create a warning if the data is not as desired.

This method can be subclassed to implement more specific behaviour. To propagate the parent classes’ behaviour the subclassed method should always call its parent method using super().

Note

The CheckDataMixin provides a generalised implementation of this method to perform some type checks and react to unexpected types.

Parameters

data (Any) – The data to check

_check_name(new_name: str) None#

Called from name.setter and can be used to check the name that the container is supposed to have. On invalid name, this should raise.

This method can be subclassed to implement more specific behaviour. To propagate the parent classes’ behaviour the subclassed method should always call its parent method using super().

Parameters

new_name (str) – The new name, which is to be checked.

classmethod _combine_by_concatenation(dsets: ndarray, *, dims: Tuple[str]) Dataset#

Combine the given datasets by concatenation, applying xarray.concat() successively along all dimensions specified in dims.

Parameters
  • dsets (ndarray) – The object-dtype array of xarray.Dataset objects that are to be combined by concatenation.

  • dims (TDims) – The dimension names corresponding to all the dimensions of the dsets array.

Returns

The dataset resulting from the concatenation

Return type

Dataset

classmethod _combine_by_merge(dsets: ndarray) Dataset#

Combine the given datasets by merging using xarray’s xarray.merge().

Parameters

dsets (ndarray) – The object-dtype array of xarray.Dataset objects that are to be combined.

Returns

All datasets, aligned and combined via xarray.merge()

Return type

Dataset

_direct_insertion_mode(*, enabled: bool = True)#

A context manager that brings the class this mixin is used in into direct insertion mode. While in that mode, the with_direct_insertion() property will return true.

This context manager additionally invokes two callback functions, which can be specialized to perform certain operations when entering or exiting direct insertion mode: Before entering, _enter_direct_insertion_mode() is called. After exiting, _exit_direct_insertion_mode() is called.

Parameters

enabled (bool, optional) – whether to actually use direct insertion mode. If False, will yield directly without setting the toggle. This is equivalent to a null-context.
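A context manager of this kind could look roughly as follows. This is a sketch under assumptions, not the mixin's actual code: the class name, the `log` list and the plain `_in_direct_insertion_mode` flag are hypothetical, and the callback order follows the description of the context manager above (enter callback first, exit callback last).

```python
import contextlib


class DirectInsertionSketch:
    """Hypothetical mixin-like class with a direct insertion toggle."""

    def __init__(self):
        self._in_direct_insertion_mode = False
        self.log = []  # records callback invocations, for illustration

    @property
    def with_direct_insertion(self):
        return self._in_direct_insertion_mode

    @contextlib.contextmanager
    def _direct_insertion_mode(self, *, enabled=True):
        if not enabled:
            # Behave like a null-context: no toggle, no callbacks
            yield
            return
        self._enter_direct_insertion_mode()
        self._in_direct_insertion_mode = True
        try:
            yield
        finally:
            # Always reset the flag, even if the body raised
            self._in_direct_insertion_mode = False
            self._exit_direct_insertion_mode()

    def _enter_direct_insertion_mode(self):
        self.log.append("enter")

    def _exit_direct_insertion_mode(self):
        self.log.append("exit")


obj = DirectInsertionSketch()
with obj._direct_insertion_mode():
    inside = obj.with_direct_insertion
with obj._direct_insertion_mode(enabled=False):
    inside_disabled = obj.with_direct_insertion
```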

_enter_direct_insertion_mode()#

Called after entering direct insertion mode; can be overwritten to attach additional behaviour.

_exit_direct_insertion_mode()#

Called before exiting direct insertion mode; can be overwritten to attach additional behaviour.

_format_cls_name() str#

A __format__ helper function: returns the class name

_format_info() str#

A __format__ helper function: returns an info string that is used to characterize this object. Does NOT include name and classname!

_format_logstr() str#

A __format__ helper function: returns the log string, a combination of class name and name

_format_name() str#

A __format__ helper function: returns the name

_format_path() str#

A __format__ helper function: returns the path to this container

_format_tree() str#

Returns the default tree representation of this group by invoking the .tree property

_format_tree_condensed() str#

Returns the condensed tree representation of this group by invoking the .tree_condensed property

_get_cont(name: str, *, combination_method: str) Optional[XrDataContainer]#

Retrieve the container from the group. If no container could be found, returns None, which denotes that further processing should be skipped.

Parameters
  • name (str) – Name of the container to be extracted

  • combination_method (str) – How the container data will be combined

Returns

The extracted container

Return type

Union[XrDataContainer, None]

Raises

ItemAccessError – If combination_method == "concat", on invalid container name.

_get_coords_of(obj: AbstractDataContainer) Dict[str, Sequence[dantro.utils.coords.TCoord]]#

Extract the coordinates for the given object using the extract_coords() function.

Parameters

obj (AbstractDataContainer) – The object to get the coordinates of.

Returns

The extracted coordinates

Return type

TCoordsDict

_ipython_key_completions_() List[int]#

For ipython integration, return a list of available keys.

Unlike the BaseDataGroup method, which returns a list of strings, this returns a list of integers.

Links the new_child to this class, unlinking the old one.

This method should be called from any method that changes which items are associated with this group.

_lock_hook()#

Invoked upon locking.

_parse_indexers(indexers: dict, *, allow_deep: bool, **indexers_kwargs) Tuple[dict, dict]#

Parses the given indexer arguments and splits them into indexers for the selection of group members (shallow indexers) and indexers for deep selection.

Parameters
  • indexers (dict) – The indexers dict, may be empty

  • allow_deep (bool) – Whether to allow deep selection

  • **indexers_kwargs – Additional indexers

Returns

(shallow indexers, deep indexers)

Return type

Tuple[dict, dict]

Raises

ValueError – If deep indexers were given but deep selection was not enabled
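The split into shallow and deep indexers can be pictured with the following standalone function. It is only a sketch: the free function `parse_indexers` and its `dims` parameter are hypothetical, introduced here because the sketch has no group instance from which to read the group-level dimensions.

```python
def parse_indexers(indexers, *, dims, allow_deep, **indexers_kwargs):
    """Split indexers into (shallow, deep) by group-level dimension names."""
    all_indexers = {**(indexers or {}), **indexers_kwargs}

    # Shallow indexers refer to dimensions managed by the group itself
    shallow = {k: v for k, v in all_indexers.items() if k in dims}
    # Everything else has to be looked up in the members (deep selection)
    deep = {k: v for k, v in all_indexers.items() if k not in dims}

    if deep and not allow_deep:
        raise ValueError(
            f"Deep indexers {sorted(deep)} given, but deep selection "
            "is not enabled"
        )
    return shallow, deep


shallow, deep = parse_indexers(
    {"time": 3}, dims=("time",), allow_deep=True, x=slice(0, 2)
)
```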

_parse_key(key: Union[str, int]) str#

Makes sure a key is a string

_process_cont(cont, *, coords, shallow_indexers: dict, deep_indexers: dict, by_index: bool, drop: bool, **sel_kwargs) DataArray#

Process the given container and coordinates into a data array; this applies selection along container dimensions that overlap with the group dimensions as well as deep selection.

Parameters
  • cont – The container to be processed

  • coords – The DataArrayCoordinates of the given container in the preselected member map.

  • shallow_indexers (dict) – Indexers that were used to preselect the member map.

  • deep_indexers (dict) – Indexers to be applied to the container

  • by_index (bool) – Whether to select by index

  • drop (bool) – Whether to drop coordinate variables instead of making them scalar.

  • **sel_kwargs – Passed to sel().

Returns

The processed container data

Return type

DataArray

Raises

ValueError – In name mode, on conflicting non-dimension container coordinates.

_select(*, combination_method: str, shallow_indexers: dict, deep_indexers: dict, by_index: bool, drop: bool, **sel_kwargs) DataArray#

Preselect the member map (if needed) and designate a suitable method for further processing and selection based on the given combination method and indexers.

If possible, take shortcuts when selecting all data or when selecting data from a single group member.

Parameters
  • combination_method (str) – How to combine the member data.

  • shallow_indexers (dict) – Indexers to be applied on the group-level.

  • deep_indexers (dict) – Indexers to be applied on the member-level only.

  • by_index (bool) – Whether to select by index.

  • drop (bool) – Whether to drop coordinate variables instead of making them scalar.

  • **sel_kwargs – Passed to sel().

Returns

The selected data.

Return type

DataArray

Raises

ValueError – On invalid combination_method.

_select_all_merge() DataArray#

Select all group data by directly merging all containers. This circumvents building the member map. This might fail, e.g. if there are conflicting or duplicate coordinates.

_select_generic(cont_names: DataArray, *, combination_method: str, shallow_indexers: dict, deep_indexers: dict, by_index: bool, drop: bool, **sel_kwargs) DataArray#

Select data from group members using the given indexers and combine it via the specified method. If deep indexers are given, apply the deep indexing on each of the members.

This method receives a labelled array of container names, on which the selection already took place. The aim is now to align the objects these names refer to, including their coordinates, and thereby construct an array that contains both the dimensions given by the cont_names array and each members’ data dimensions.

Available combination methods are based either on xarray.merge() operations or xarray.concat() along each dimension. For both these combination methods, the members of this group need to be prepared such that the operation can be applied, i.e.: they need to already be in an array capable of that operation and they need to directly or indirectly preserve coordinate information.

For that purpose, an object-array is constructed holding the processed member data. As the xarray.Dataset and xarray.DataArray types have issues with handling array-like objects in object arrays, this is done via a numpy.ndarray.

Parameters
  • cont_names (DataArray) – The pre-selected member map object, i.e. a labelled array containing names of the desired members that are to be combined.

  • combination_method (str) – How to combine them: concat, try_concat, or merge. Concatenation will allow preserving the dtype of the underlying data.

  • shallow_indexers (dict) – Indexer arguments that were used for the group member selection.

  • deep_indexers (dict) – Indexer arguments for deep selection to be done before combination.

  • by_index (bool) – Whether the deep indexing should take place by index; if False, will use label-based selection.

  • **sel_kwargs – Passed on to sel().

Returns

The selected data of the members from cont_names, combined using the given combination method.

Return type

Dataset

Raises
  • ValueError – On conflicting coordinate information on group-level and member-level.

  • KeyError – In concat mode, upon missing members.

_select_single(cont_names: DataArray, shallow_indexers: dict, deep_indexers: dict, by_index: bool, drop: bool, **sel_kwargs) DataArray#

Select data from a single group member. Expects the preselected member map to contain only a single valid container name.

_tree_repr(*, level: int = 0, max_level: Optional[int] = None, info_fstr='<{:cls_name,info}>', info_ratio: float = 0.6, condense_thresh: Optional[Union[int, Callable[[int, int], int]]] = None, total_item_count: int = 0) Union[str, List[str]]#

Recursively creates a multi-line string tree representation of this group. This is used by, e.g., the _format_tree method.

Parameters
  • level (int, optional) – The depth within the tree

  • max_level (int, optional) – The maximum depth within the tree; recursion is not continued beyond this level.

  • info_fstr (str, optional) – The format string for the info string

  • info_ratio (float, optional) – The width ratio of the whole line width that the info string takes

  • condense_thresh (Union[int, Callable[[int, int], int]], optional) – If given, this specifies the threshold beyond which the tree view for the current element becomes condensed by hiding the output for some elements. The minimum value for this is 3, indicating that at most 3 lines should be generated from this level (excluding the lines coming from recursion), i.e.: two elements and one line for indicating how many values are hidden. If a smaller value is given, this is silently brought up to 3. Half of the elements are taken from the beginning of the item iteration, the other half from the end. If given as integer, that number is used. If a callable is given, the callable will be invoked with the current level, number of elements to be added at this level, and the current total item count along this recursion branch. The callable should then return the number of lines to be shown for the current element.

  • total_item_count (int, optional) – The total number of items already created in this recursive tree representation call. Passed on between recursive calls.

Returns

The (multi-line) tree representation of this group. If this method was invoked with level == 0, a string will be returned; otherwise, a list of strings will be returned.

Return type

Union[str, List[str]]
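The condensing rule described for condense_thresh can be sketched for a single level as follows. This is an illustration only, covering the integer form of the threshold: the `condense` function, the text of the placeholder line and the handling of odd counts (the extra element goes to the head) are assumptions.

```python
def condense(lines, thresh):
    """Reduce a list of lines to at most `thresh` entries (minimum 3)."""
    thresh = max(thresh, 3)  # smaller values are silently raised to 3
    if len(lines) <= thresh:
        return list(lines)

    num_shown = thresh - 1            # one line is reserved for the hint
    num_head = (num_shown + 1) // 2   # assumption: odd remainder goes first
    num_tail = num_shown - num_head   # always >= 1, because thresh >= 3
    hidden = len(lines) - num_shown

    return (
        lines[:num_head]
        + [f"... ({hidden} more) ..."]
        + lines[-num_tail:]
    )


items = list("abcdefghij")
condensed = condense(items, 5)
```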

Unlink a child from this class.

This method should be called from any method that removes an item from this group, be it through deletion or through

_unlock_hook()#

Invoked upon unlocking.

add(*conts, overwrite: bool = False)#

Add the given containers to this group.

property allow_deep_selection: bool#

Whether deep selection is allowed.

property attrs#

The container attributes.

property classname: str#

Returns the name of this DataContainer-derived class

clear()#

Clears all containers from this group.

This is done by unlinking all children and then overwriting _data with an empty _STORAGE_CLS object.

property coords: Dict[str, List[dantro.utils.coords.TCoord]]#

Returns a dict-like container of group-level coordinate values keyed by dimension.

property data#

The stored data.

property dims: Tuple[str]#

The names of the group-level dimensions this group manages.

It _may_ contain dimensions that overlap with dimension names from the members; this is intentional.

get(key, default=None)#

Return the container at key, or default if no container with name key is available.

isel(indexers: dict = None, *, drop: bool = False, combination_method: str = 'auto', deep: bool = None, **indexers_kwargs) DataArray#

Return a new labelled xarray.DataArray with an index-selected subset of members of this group.

If deep selection is activated, those indexers that are not available in the group-managed dimensions are looked up in the members of this group.

Note

For data combination (via any combination_method) dimensions that differ in size across group members have to be labelled, such that arrays can be aligned using xarray’s xarray.align() function and the respective coordinates. See the xarray documentation for more information about coordinates.

Parameters
  • indexers (dict, optional) – A dict with keys matching dimensions and values given by scalars, slices or arrays of tick indices. As xarray.DataArray.isel(), uses pandas-like indexing, i.e.: slices do not include the terminal value.

  • drop (bool, optional) – Whether to drop coordinate variables instead of making them scalar.

  • combination_method (str, optional) –

    How to combine group-level data with member-level data. Ignored if data from a single group member is selected, i.e. no data has to be combined. Can be:

    • concat: Concatenate. This can preserve the dtype, but requires that no data is missing.

    • merge: Merge, using xarray.merge(). This leads to a type conversion to float64, but allows members being missing or coordinates not fully filling the available space.

    • try_concat: Try concatenation, fall back to merging if that was unsuccessful.

    • auto: Automatically deduce a suitable combination method. Use merge if data is non-integer type and try_concat otherwise.

    Note

    Selecting all data (by not passing any indexers) can be significantly faster using the merge combination method than using the concat method.

  • deep (bool, optional) – Whether to allow deep indexing, i.e.: that indexers may contain dimensions that don’t refer to group-level dimensions but to dimensions that are only available among the member data. If None, will use the value returned by the allow_deep_selection property.

  • **indexers_kwargs – Additional indexers

Returns

The selected data, potentially a combination of data on group level and member-level data.

Return type

DataArray
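The difference between index-based slicing, as used by isel(), and label-based slicing, as used by sel(), can be illustrated without xarray. The sketch below is a simplified model under assumptions: the `coords` and `members` lists and both helper functions are hypothetical, and sorted tick labels are assumed.

```python
import bisect

# Hypothetical sorted tick labels and the members they refer to
coords = [10, 20, 30, 40]
members = ["m10", "m20", "m30", "m40"]


def isel_slice(start, stop):
    """Index-based selection: the terminal index is excluded."""
    return members[start:stop]


def sel_slice(start_label, stop_label):
    """Label-based selection: the terminal label is included."""
    lo = bisect.bisect_left(coords, start_label)
    hi = bisect.bisect_right(coords, stop_label)
    return members[lo:hi]


by_index = isel_slice(0, 2)
by_label = sel_slice(10, 30)
```

Here, `by_index` contains the first two members, whereas `by_label` also contains the member at the terminal label 30.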

items()#

Returns an iterator over the (name, data container) tuples of this group.

key_at_idx(idx: int) str#

Get a key by its index within the container. Can be negative.

Parameters

idx (int) – The index within the member sequence

Returns

The desired key

Return type

str

Raises

IndexError – Index out of range
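A lookup of this kind might be sketched as below. This is not the group's implementation: the standalone `key_at_idx` function takes the keys as an explicit argument, and the integer-ordered string keys in the usage are an assumption based on the integer-ordered storage mentioned for this group.

```python
from itertools import islice


def key_at_idx(keys, idx):
    """Return the key at position idx; negative indices count from the end."""
    keys = list(keys)
    n = len(keys)
    if not -n <= idx < n:
        raise IndexError(f"Index {idx} out of range for {n} member(s)")
    if idx < 0:
        idx += n  # map negative indices to their positive counterpart
    # Advance an iterator to the requested position
    return next(islice(iter(keys), idx, None))


# Assumption: keys are strings, ordered by their integer value
ordered_keys = sorted(["10", "9", "100"], key=int)
first = key_at_idx(ordered_keys, 0)
last = key_at_idx(ordered_keys, -1)
```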

keys()#

Returns an iterator over the container names in this group.

keys_as_int() Generator[int, None, None]#

Returns an iterator over keys as integer values

lock()#

Locks the data of this object

property locked: bool#

Whether this object is locked

property logstr: str#

Returns the classname and name of this object

property member_map: DataArray#

Returns an array that represents the space that the members of this group span, where each value (i.e. a specific coordinate combination) is the name of the corresponding member of this group.

Upon first call, this is computed here. If members are added, an attempt is made to accommodate them in the existing map; if that is not possible, the cache will be invalidated.

The member map _may_ include empty strings, i.e. coordinate combinations that are not covered by any member. Also, it can contain duplicate names, as one member can cover multiple coordinates.

Note

The member map is invalidated when new members are added that cannot be accommodated in it. It will be recalculated when needed.
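The idea of a member map can be pictured with a plain dictionary keyed by coordinate combinations. This sketch does not use xarray and is not the actual data structure: the `build_member_map` function, the dimension names and the example members are hypothetical.

```python
from itertools import product


def build_member_map(member_coords, dims):
    """Map every coordinate combination to a member name ('' if uncovered)."""
    # Collect the sorted unique coordinate values for each dimension
    coords = {
        dim: sorted({c for mc in member_coords.values() for c in mc[dim]})
        for dim in dims
    }
    # Start with an empty map spanning the full coordinate space
    member_map = {combo: "" for combo in product(*(coords[d] for d in dims))}
    # Let each member claim all coordinate combinations it covers
    for name, mc in member_coords.items():
        for combo in product(*(mc[d] for d in dims)):
            member_map[combo] = name
    return member_map


member_map = build_member_map(
    {
        "a": {"time": [0, 1], "kind": ["x"]},
        "b": {"time": [2], "kind": ["y"]},
    },
    dims=("time", "kind"),
)
```

In this example, member `a` appears twice because it covers two time coordinates, and combinations such as `(2, "x")` map to the empty string because no member covers them.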

property member_map_available: bool#

Whether the member map is available yet.

property name: str#

The name of this DataContainer-derived object.

property ndim: int#

The rank of the space covered by the group-level dimensions.

new_container(path: Union[str, List[str]], *, Cls: Optional[type] = None, **kwargs)#

Creates a new container of type Cls and adds it at the given path relative to this group.

If needed, intermediate groups are automatically created.

Parameters
  • path (Union[str, List[str]]) – Where to add the container.

  • Cls (type, optional) – The class of the container to add. If None, the _NEW_CONTAINER_CLS class variable’s value is used.

  • **kwargs – passed on to Cls.__init__

Returns

The created container of type Cls

Raises
  • ValueError – If neither the Cls argument nor the class variable _NEW_CONTAINER_CLS were set or if path was empty.

  • TypeError – When Cls is not compatible with the data tree

new_group(path: Union[str, list], *, Cls: Optional[type] = None, **kwargs)#

Creates a new group at the given path.

Parameters
  • path (Union[str, list]) – The path to create the group at. Note that the whole intermediate path needs to already exist.

  • Cls (type, optional) – If given, use this type to create the group. If not given, uses the class specified in the _NEW_GROUP_CLS class variable or, as last resort, the type of this instance.

  • **kwargs – Passed on to Cls.__init__

Returns

The created group of type Cls

Raises

TypeError – For the given class not being derived from BaseDataGroup

property parent#

The associated parent of this container or group

property path: str#

The path to get to this container or group from some root path

pop(k[, d]) v, remove specified key and return the corresponding value.#

If key is not found, d is returned if given, otherwise KeyError is raised.

popitem() (k, v), remove and return some (key, value) pair#

as a 2-tuple; but raise KeyError if D is empty.

raise_if_locked(*, prefix: Optional[str] = None)#

Raises an exception if this object is locked; does nothing otherwise

recursive_update(other, *, overwrite: bool = True)#

Recursively updates the contents of this data group with the entries of the given data group

Note

This will create shallow copies of those elements in other that are added to this object.

Parameters
  • other (BaseDataGroup) – The group to update with

  • overwrite (bool, optional) – Whether to overwrite an already existing object. If False, a conflict will lead to an error being raised and the update being stopped.

Raises

TypeError – If other was of invalid type
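The update semantics can be sketched with nested dictionaries standing in for groups. This is an analogy, not dantro code: the free function, the use of `dict` for groups and the choice of `ValueError` for conflicts are assumptions (the description above only states that an error is raised).

```python
import copy


def recursive_update(target, other, *, overwrite=True):
    """Recursively update nested dict `target` with entries of `other`."""
    if not isinstance(other, dict):
        raise TypeError(f"Can only update with a dict, got {type(other)}")

    for key, obj in other.items():
        if isinstance(obj, dict) and isinstance(target.get(key), dict):
            # Both sides are group-like: descend instead of replacing
            recursive_update(target[key], obj, overwrite=overwrite)
        elif key in target and not overwrite:
            # Assumption: conflicts are signalled with a ValueError
            raise ValueError(f"An object named '{key}' already exists")
        else:
            # Shallow copy, so `other` keeps its own top-level object
            target[key] = copy.copy(obj)


tree = {"grp": {"a": [1], "b": [2]}, "c": [3]}
update = {"grp": {"b": [20], "d": [4]}}
recursive_update(tree, update)
```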

sel(indexers: dict = None, *, method: str = None, tolerance: float = None, drop: bool = False, combination_method: str = 'auto', deep: bool = None, **indexers_kwargs) DataArray#

Return a new labelled xarray.DataArray with a coordinate-selected subset of members of this group.

If deep selection is activated, those indexers that are not available in the group-managed dimensions are looked up in the members of this group.

Note

For data combination (via any combination_method) dimensions that differ in size across group members have to be labelled, such that arrays can be aligned using xarray’s xarray.align() function and the respective coordinates. See the xarray documentation for more information about coordinates.

Parameters
  • indexers (dict, optional) – A dict with keys matching dimensions and values given by scalars, slices or arrays of tick labels. As xarray.DataArray.sel(), uses pandas-like indexing, i.e.: slices include the terminal value.

  • method (str, optional) – Method to use for inexact matches

  • tolerance (float, optional) – Maximum (absolute) distance between original and given label for inexact matches.

  • drop (bool, optional) – Whether to drop coordinate variables instead of making them scalar.

  • combination_method (str, optional) –

    How to combine group-level data with member-level data. Ignored if data from a single group member is selected, i.e. no data has to be combined. Can be:

    • concat: Concatenate. This can preserve the dtype, but requires that no data is missing.

    • merge: Merge, using xarray.merge(). This leads to a type conversion to float64, but allows members being missing or coordinates not fully filling the available space.

    • try_concat: Try concatenation, fall back to merging if that was unsuccessful.

    • auto: Automatically deduce a suitable combination method. Use merge if data is non-integer type and try_concat otherwise.

    Note

    Selecting all data (by not passing any indexers) can be significantly faster using the merge combination method than using the concat method.

  • deep (bool, optional) – Whether to allow deep indexing, i.e.: that indexers may contain dimensions that don’t refer to group-level dimensions but to dimensions that are only available among the member data. If None, will use the value returned by the allow_deep_selection property.

  • **indexers_kwargs – Additional indexers

Returns

The selected data, potentially a combination of data on group level and member-level data.

Return type

DataArray

setdefault(key, default=None)#

This method is not supported for a data group

property shape: Tuple[int]#

Return the shape of the space covered by the group-level dimensions.

property tree: str#

Returns the default (full) tree representation of this group

property tree_condensed: str#

Returns the condensed tree representation of this group. Uses the _COND_TREE_* prefixed class attributes as parameters.

unlock()#

Unlocks the data of this object

update([E, ]**F) None.  Update D from mapping/iterable E and F.#

If E present and has a .keys() method, does: for k in E: D[k] = E[k] If E present and lacks .keys() method, does: for (k, v) in E: D[k] = v In either case, this is followed by: for k, v in F.items(): D[k] = v

values()#

Returns an iterator over the containers in this group.

property with_direct_insertion: bool#

Whether the class this mixin is mixed into is currently in direct insertion mode.

__locked#

Whether the data is regarded as locked. Note name-mangling here.

__in_direct_insertion_mode#

A name-mangled state flag that determines the state of the object.

class HeterogeneousTimeSeriesGroup(*args, dims: Optional[Tuple[str]] = None, mode: Optional[str] = None, allow_deep_selection: Optional[bool] = None, **kwargs)[source]#

Bases: dantro.groups.time_series.TimeSeriesGroup

This extends the TimeSeriesGroup by configuring it such that it retrieves its coordinates not from the name of the members contained in it but from _their_ data.

It still manages only the time dimension, which is now overlapping with the time dimension in the members of this group. However, the LabelledDataGroup can handle this overlap and provides a uniform selection interface that allows combining this heterogeneously stored data.

This becomes especially useful in cases where the members of this group store data with the following properties:

  • Potentially different coordinates than the coordinates of other members of the group.

  • Containing time information for more than a single time coordinate

  • No guarantee for overlaps between time dimension or any other dimension.

As such it is suitable to work with data that represents ensembles that frequently change not only their size but also their identifying labels. Additionally, it supports them not being stored in regular intervals but only upon a change in coordinates.
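The kind of data this group targets can be pictured with the following simplified model. It does not use dantro or xarray: the `members` layout, the `id` dimension and the `select` helper are hypothetical, and missing entries are filled with NaN, in analogy to the merge combination method.

```python
import math

# Hypothetical members: each stores its own time and id coordinates
members = {
    "0": {"time": [0, 1], "id": ["a", "b"], "values": [[1, 2], [3, 4]]},
    "2": {"time": [2], "id": ["b", "c"], "values": [[5, 6]]},
}

# Time coordinates are read from the members' data, not from their names
time_to_member = {
    t: name for name, member in members.items() for t in member["time"]
}
# The union of all identifying labels seen across members
all_ids = sorted({i for member in members.values() for i in member["id"]})


def select(time):
    """Return the values at `time`, aligned to all_ids (NaN if missing)."""
    member = members[time_to_member[time]]
    row = member["values"][member["time"].index(time)]
    by_id = dict(zip(member["id"], row))
    return [by_id.get(i, math.nan) for i in all_ids]


at_1 = select(1)
at_2 = select(2)
```

Member "0" covers two time coordinates and stores nothing for id "c", whereas member "2" was only written once the set of ids changed.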

LDG_ALLOW_DEEP_SELECTION = True#
LDG_COORDS_ATTR_PREFIX = 'ext_coords__'#
LDG_COORDS_MODE_ATTR_PREFIX = 'ext_coords_mode__'#
LDG_COORDS_MODE_DEFAULT = 'scalar'#
LDG_COORDS_SEPARATOR_IN_NAME = ';'#
LDG_DIMS = ('time',)#

Expected dimension names. There is only one dimension in a TimeSeriesGroup: time

LDG_STRICT_ATTR_CHECKING = False#
_ALLOWED_CONT_TYPES = None#

The types that are allowed to be stored in this group. If None, the dantro base classes are allowed

_ATTRS_CLS#

alias of dantro.base.BaseDataAttrs

_COLLECTIVE_SELECT_THRESHOLD = 1.8#
_COND_TREE_CONDENSE_THRESH = 10#

Condensed tree representation threshold parameter

_COND_TREE_MAX_LEVEL = 10#

Condensed tree representation maximum level

_NEW_CONTAINER_CLS#

alias of dantro.containers.xr.XrDataContainer

_NEW_GROUP_CLS#

alias of dantro.groups.ordered.OrderedDataGroup

_STORAGE_CLS#

alias of dantro.utils.ordereddict.IntOrderedDict

__contains__(key: Union[str, int]) bool#

Adjusts the parent method to allow checking for integers

__delitem__(key: Union[str, int])#

Adjusts the parent method to allow item deletion by integer key

__eq__(other) bool#

Evaluates equality by making the following comparisons: identity, strict type equality, and finally: equality of the _data and _attrs attributes, i.e. the private attribute. This ensures that comparison does not trigger any downstream effects like resolution of proxies.

If types do not match exactly, NotImplemented is returned, thus referring the comparison to the other side of the ==.

__format__(spec_str: str) str#

Creates a formatted string from the given specification.

Invokes further methods which are prefixed by _format_.

__getitem__(key: Union[str, int])#

Adjusts the parent method to allow integer key item access

__init__(*args, dims: Optional[Tuple[str]] = None, mode: Optional[str] = None, allow_deep_selection: Optional[bool] = None, **kwargs)#

Initialize a LabelledDataGroup

Parameters
  • *args – Passed on to OrderedDataGroup

  • dims (TDims, optional) – The dimensions associated with this group. If not given, will use those defined in the LDG_DIMS class variable. These can not be changed afterwards!

  • mode (str, optional) – By which coordinate extraction mode to get the coordinates from the group members. Can be attrs, name, data or anything else specified in extract_coords().

  • allow_deep_selection (bool, optional) – Whether to allow deep selection. If not given, will use the LDG_ALLOW_DEEP_SELECTION class variable’s value. Behaviour can be changed via the property of the same name.

  • **kwargs – Passed on to OrderedDataGroup

__iter__()#

Returns an iterator over the OrderedDict

__len__() int#

The number of members in this group.

__repr__() str#

Same as __str__

__setitem__(key: Union[str, int])#

Adjusts the parent method to allow item setting by integer key

__sizeof__() int#

Returns the size of the data (in bytes) stored in this container’s data and its attributes.

Note that this value is approximate. It is computed by calling the sys.getsizeof() function on the data, the attributes, the name and some caching attributes that each dantro data tree class contains. Importantly, this is not a recursive algorithm.

Also, derived classes might implement further attributes that are not taken into account either. To be more precise in a subclass, create a specific __sizeof__ method and invoke this parent method additionally.

__str__() str#

An info string, that describes the object. This invokes the formatting helpers to show the log string (type and name) as well as the info string of this object.

_abc_impl = <_abc_data object>#
_add_container(cont, *, overwrite: bool)#

Private helper method to add a container to this group.

_add_container_callback(cont: AbstractDataContainer) None#

Called by the base class after adding a container, this method checks whether the member map needs to be invalidated or whether the new container can be accommodated in it.

If it can be accommodated, the member map will be adjusted such that for all coordinates associated with the given cont, the member map points to the newly added container.

Parameters

cont (AbstractDataContainer) – The newly added container

_add_container_to_data(cont) None#

Adds a container to the underlying integer-ordered dictionary.

Unlike the parent method, this uses insert() in order to provide hints regarding the insertion position. It is optimised for insertion in ascending order.

_attrs = None#

The class attribute that the attributes will be stored to

_check_cont(cont) None#

Can be used by a subclass to check a container before adding it to this group. Is called by _add_container before checking whether the object exists or not.

This is not expected to return, but can raise errors, if something did not work out as expected.

Parameters

cont – The container to check

_check_data(data: Any) None#

This method can be used to check the data provided to this container

It is called before the data is stored in the __init__ method and should raise an exception or create a warning if the data is not as desired.

This method can be subclassed to implement more specific behaviour. To propagate the parent classes’ behaviour the subclassed method should always call its parent method using super().

Note

The CheckDataMixin provides a generalised implementation of this method to perform some type checks and react to unexpected types.

Parameters

data (Any) – The data to check

_check_name(new_name: str) None#

Called from name.setter and can be used to check the name that the container is supposed to have. On invalid name, this should raise.

This method can be subclassed to implement more specific behaviour. To propagate the parent classes’ behaviour the subclassed method should always call its parent method using super().

Parameters

new_name (str) – The new name, which is to be checked.
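The subclassing pattern described for _check_name (and likewise for _check_data) might look roughly as follows. This is a sketch under assumptions: BaseThing and IntNamedThing are hypothetical classes that only mimic the hook, and the concrete exception types are chosen for illustration.

```python
class BaseThing:
    """Hypothetical base class; only mimics the hook described above."""

    def __init__(self, name: str):
        self._name = None
        self.name = name  # goes through the setter and thus the check

    @property
    def name(self) -> str:
        return self._name

    @name.setter
    def name(self, new_name: str) -> None:
        self._check_name(new_name)  # raises on invalid names
        self._name = new_name

    def _check_name(self, new_name: str) -> None:
        if not isinstance(new_name, str):
            raise TypeError(f"Name needs to be a string, got {type(new_name)}")


class IntNamedThing(BaseThing):
    """Additionally requires that the name is convertible to an integer."""

    def _check_name(self, new_name: str) -> None:
        super()._check_name(new_name)  # propagate the parent's checks first
        try:
            int(new_name)
        except ValueError as err:
            raise ValueError(f"Name '{new_name}' is not integer-like") from err
```

Calling super() first means the more specific check can rely on the parent's guarantees, here that new_name is a string.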

classmethod _combine_by_concatenation(dsets: ndarray, *, dims: Tuple[str]) Dataset#

Combine the given datasets by concatenation, applying xarray’s xarray.concat() successively along all dimensions specified in dims.

Parameters
  • dsets (ndarray) – The object-dtype array of xarray.Dataset objects that are to be combined by concatenation.

  • dims (TDims) – The dimension names corresponding to all the dimensions of the dsets array.

Returns

The dataset resulting from the concatenation

Return type

Dataset

classmethod _combine_by_merge(dsets: ndarray) Dataset#

Combine the given datasets by merging using xarray’s xarray.merge().

Parameters

dsets (ndarray) – The object-dtype array of xarray.Dataset objects that are to be combined.

Returns

All datasets, aligned and combined via xarray.merge()

Return type

Dataset

_direct_insertion_mode(*, enabled: bool = True)#

A context manager that brings the class this mixin is used in into direct insertion mode. While in that mode, the with_direct_insertion property will return True.

This context manager additionally invokes two callback functions, which can be specialized to perform certain operations when entering or exiting direct insertion mode: Before entering, _enter_direct_insertion_mode() is called. After exiting, _exit_direct_insertion_mode() is called.

Parameters

enabled (bool, optional) – whether to actually use direct insertion mode. If False, will yield directly without setting the toggle. This is equivalent to a null-context.
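A context manager with this behaviour could be sketched as below. This is not dantro's implementation: DirectInsertionSketch, the private flag name and the events list are hypothetical, and the callback order follows the description of _direct_insertion_mode given above.

```python
import contextlib


class DirectInsertionSketch:
    """Hypothetical mixin-like class; not dantro's implementation."""

    def __init__(self):
        self._in_direct_insertion_mode = False
        self.events = []  # records the callback order, for illustration only

    @property
    def with_direct_insertion(self) -> bool:
        return self._in_direct_insertion_mode

    @contextlib.contextmanager
    def _direct_insertion_mode(self, *, enabled: bool = True):
        if not enabled:
            yield  # behaves like a null-context, toggle stays untouched
            return
        self._enter_direct_insertion_mode()
        self._in_direct_insertion_mode = True
        try:
            yield
        finally:
            # Reset the toggle even if the with-block raised
            self._in_direct_insertion_mode = False
            self._exit_direct_insertion_mode()

    def _enter_direct_insertion_mode(self):
        self.events.append("enter")

    def _exit_direct_insertion_mode(self):
        self.events.append("exit")


obj = DirectInsertionSketch()
with obj._direct_insertion_mode():
    inside = obj.with_direct_insertion
with obj._direct_insertion_mode(enabled=False):
    inside_disabled = obj.with_direct_insertion
```

The try/finally block is a design choice of this sketch: it guarantees that the toggle is reset and the exit callback runs even when an exception is raised inside the with-block.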

_enter_direct_insertion_mode()#

Called after entering direct insertion mode; can be overwritten to attach additional behaviour.

_exit_direct_insertion_mode()#

Called before exiting direct insertion mode; can be overwritten to attach additional behaviour.

_format_cls_name() str#

A __format__ helper function: returns the class name

_format_info() str#

A __format__ helper function: returns an info string that is used to characterize this object. Does NOT include name and classname!

_format_logstr() str#

A __format__ helper function: returns the log string, a combination of class name and name

_format_name() str#

A __format__ helper function: returns the name

_format_path() str#

A __format__ helper function: returns the path to this container

_format_tree() str#

Returns the default tree representation of this group by invoking the .tree property

_format_tree_condensed() str#

Returns the condensed tree representation of this group by invoking the .tree_condensed property

_get_cont(name: str, *, combination_method: str) Optional[XrDataContainer]#

Retrieve the container from the group. If no container could be found, returns None, which denotes that further processing should be skipped.

Parameters
  • name (str) – Name of the container to be extracted

  • combination_method (str) – How the container data will be combined

Returns

The extracted container

Return type

Union[XrDataContainer, None]

Raises

ItemAccessError – If combination_method == "concat", on invalid container name.

_get_coords_of(obj: AbstractDataContainer) Dict[str, Sequence[dantro.utils.coords.TCoord]]#

Extract the coordinates for the given object using the extract_coords() function.

Parameters

obj (AbstractDataContainer) – The object to get the coordinates of.

Returns

The extracted coordinates

Return type

TCoordsDict

_ipython_key_completions_() List[int]#

For ipython integration, return a list of available keys.

Unlike the BaseDataGroup method, which returns a list of strings, this returns a list of integers.

Links the new_child to this class, unlinking the old one.

This method should be called from any method that changes which items are associated with this group.

_lock_hook()#

Invoked upon locking.

_parse_indexers(indexers: dict, *, allow_deep: bool, **indexers_kwargs) Tuple[dict, dict]#

Parses the given indexer arguments and splits them into indexers for the selection of group members and indexers for deep selection.

Parameters
  • indexers (dict) – The indexers dict, may be empty

  • allow_deep (bool) – Whether to allow deep selection

  • **indexers_kwargs – Additional indexers

Returns

(shallow indexers, deep indexers)

Return type

Tuple[dict, dict]

Raises

ValueError – If deep indexers were given but deep selection was not enabled
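The split into shallow and deep indexers can be sketched as follows. The function split_indexers and its group_dims argument are hypothetical; the sketch assumes that an indexer counts as shallow exactly when its key is one of the group-level dimensions.

```python
from typing import Sequence, Tuple


def split_indexers(
    indexers: dict,
    *,
    group_dims: Sequence[str],
    allow_deep: bool,
    **indexers_kwargs,
) -> Tuple[dict, dict]:
    """Hypothetical helper: split indexers into (shallow, deep) dicts."""
    all_indexers = dict(indexers or {}, **indexers_kwargs)

    # Indexers on group-level dimensions select group members ...
    shallow = {k: v for k, v in all_indexers.items() if k in group_dims}
    # ... all others can only refer to dimensions of the member data
    deep = {k: v for k, v in all_indexers.items() if k not in group_dims}

    if deep and not allow_deep:
        raise ValueError(
            f"Got deep indexers {list(deep)} but deep selection is disabled!"
        )
    return shallow, deep


shallow, deep = split_indexers(
    {"time": 0}, group_dims=("time",), allow_deep=True, x=slice(0, 2)
)
print(shallow, deep)
```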

_parse_key(key: Union[str, int]) str#

Makes sure a key is a string

_process_cont(cont, *, coords, shallow_indexers: dict, deep_indexers: dict, by_index: bool, drop: bool, **sel_kwargs) DataArray#

Process the given container and coordinates into a data array; this applies selection along container dimensions that overlap with the group dimensions as well as deep selection.

Parameters
  • cont – The container to be processed

  • coords – The DataArrayCoordinates of the given container in the preselected member map.

  • shallow_indexers (dict) – Indexers that were used to preselect the member map.

  • deep_indexers (dict) – Indexers to be applied to the container

  • by_index (bool) – Whether to select by index

  • drop (bool) – Whether to drop coordinate variables instead of making them scalar.

  • **sel_kwargs – Passed to sel().

Returns

The processed container data

Return type

DataArray

Raises

ValueError – In name mode, on conflicting non-dimension container coordinates.

_select(*, combination_method: str, shallow_indexers: dict, deep_indexers: dict, by_index: bool, drop: bool, **sel_kwargs) DataArray#

Preselect the member map (if needed) and designate a suitable method for further processing and selection based on the given combination method and indexers.

If possible, take shortcuts when selecting all data or when selecting data from a single group member.

Parameters
  • combination_method (str) – How to combine the member data.

  • shallow_indexers (dict) – Indexers to be applied on the group-level.

  • deep_indexers (dict) – Indexers to be applied on the member-level only.

  • by_index (bool) – Whether to select by index.

  • drop (bool) – Whether to drop coordinate variables instead of making them scalar.

  • **sel_kwargs – Passed to sel().

Returns

The selected data.

Return type

DataArray

Raises

ValueError – On invalid combination_method.

_select_all_merge() DataArray#

Select all group data by directly merging all containers. This circumvents building the member map. This might fail, e.g. if there are conflicting or duplicate coordinates.

_select_generic(cont_names: DataArray, *, combination_method: str, shallow_indexers: dict, deep_indexers: dict, by_index: bool, drop: bool, **sel_kwargs) DataArray#

Select data from group members using the given indexers and combine it via the specified method. If deep indexers are given, apply the deep indexing on each of the members.

This method receives a labelled array of container names, on which the selection already took place. The aim is now to align the objects these names refer to, including their coordinates, and thereby construct an array that contains both the dimensions given by the cont_names array and each member’s data dimensions.

Available combination methods are based either on xarray.merge() operations or xarray.concat() along each dimension. For both these combination methods, the members of this group need to be prepared such that the operation can be applied, i.e.: they need to already be in an array capable of that operation and they need to directly or indirectly preserve coordinate information.

For that purpose, an object-array is constructed holding the processed member data. As the xarray.Dataset and xarray.DataArray types have issues with handling array-like objects in object arrays, this is done via a numpy.ndarray.

Parameters
  • cont_names (DataArray) – The pre-selected member map object, i.e. a labelled array containing names of the desired members that are to be combined.

  • combination_method (str) – How to combine them: concat, try_concat, or merge. Concatenation will allow preserving the dtype of the underlying data.

  • shallow_indexers (dict) – Indexer arguments that were used for the group member selection.

  • deep_indexers (dict) – Indexer arguments for deep selection to be done before combination.

  • by_index (bool) – Whether the deep indexing should take place by index; if False, will use label-based selection.

  • **sel_kwargs – Passed on to sel().

Returns

The selected data of the members from cont_names, combined using the given combination method.

Return type

Dataset

Raises
  • ValueError – On conflicting coordinate information on group-level and member-level.

  • KeyError – In concat mode, upon missing members.

_select_single(cont_names: DataArray, shallow_indexers: dict, deep_indexers: dict, by_index: bool, drop: bool, **sel_kwargs) DataArray#

Select data from a single group member. Expects the preselected member map to contain only a single valid container name.

_tree_repr(*, level: int = 0, max_level: Optional[int] = None, info_fstr='<{:cls_name,info}>', info_ratio: float = 0.6, condense_thresh: Optional[Union[int, Callable[[int, int], int]]] = None, total_item_count: int = 0) Union[str, List[str]]#

Recursively creates a multi-line string tree representation of this group. This is used by, e.g., the _format_tree method.

Parameters
  • level (int, optional) – The depth within the tree

  • max_level (int, optional) – The maximum depth within the tree; recursion is not continued beyond this level.

  • info_fstr (str, optional) – The format string for the info string

  • info_ratio (float, optional) – The width ratio of the whole line width that the info string takes

  • condense_thresh (Union[int, Callable[[int, int], int]], optional) – If given, this specifies the threshold beyond which the tree view for the current element becomes condensed by hiding the output for some elements. The minimum value for this is 3, indicating that at most 3 lines should be generated from this level (excluding the lines coming from recursion), i.e.: two elements and one line indicating how many values are hidden. If a smaller value is given, it is silently raised to 3. Half of the elements are taken from the beginning of the item iteration, the other half from the end. If given as an integer, that number is used. If a callable is given, it will be invoked with the current level, the number of elements to be added at this level, and the current total item count along this recursion branch. The callable should then return the number of lines to be shown for the current element.

  • total_item_count (int, optional) – The total number of items already created in this recursive tree representation call. Passed on between recursive calls.

Returns

The (multi-line) tree representation of this group. If this method was invoked with level == 0, a string will be returned; otherwise, a list of strings will be returned.

Return type

Union[str, List[str]]
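The condensing rule described for condense_thresh can be sketched for a single tree level. The helper condense is hypothetical and makes two assumptions: the callable receives three arguments, as the description states, and for an odd number of shown elements the extra element is taken from the beginning.

```python
from typing import Callable, List, Union


def condense(
    items: List[str],
    condense_thresh: Union[int, Callable[[int, int, int], int]],
    *,
    level: int = 0,
    total_item_count: int = 0,
) -> List[str]:
    """Hypothetical helper returning the lines shown for one tree level."""
    if callable(condense_thresh):
        max_lines = condense_thresh(level, len(items), total_item_count)
    else:
        max_lines = condense_thresh
    max_lines = max(3, max_lines)  # smaller values are silently raised to 3

    if len(items) <= max_lines:
        return list(items)  # nothing to hide

    num_shown = max_lines - 1  # one line is reserved for the summary
    num_head = num_shown - num_shown // 2
    num_tail = num_shown // 2
    hidden = len(items) - num_shown
    summary = f"... ({hidden} more) ..."
    return items[:num_head] + [summary] + items[len(items) - num_tail:]


items = [str(i) for i in range(10)]
print(condense(items, 3))
print(condense(items, lambda level, num_items, total: 5))
```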

Unlink a child from this class.

This method should be called from any method that removes an item from this group, be it through deletion or through

_unlock_hook()#

Invoked upon unlocking.

add(*conts, overwrite: bool = False)#

Add the given containers to this group.

property allow_deep_selection: bool#

Whether deep selection is allowed.

property attrs#

The container attributes.

property classname: str#

Returns the name of this DataContainer-derived class

clear()#

Clears all containers from this group.

This is done by unlinking all children and then overwriting _data with an empty _STORAGE_CLS object.

property coords: Dict[str, List[dantro.utils.coords.TCoord]]#

Returns a dict-like container of group-level coordinate values keyed by dimension.

property data#

The stored data.

property dims: Tuple[str]#

The names of the group-level dimensions this group manages.

It _may_ contain dimensions that overlap with dimension names from the members; this is intentional.

get(key, default=None)#

Return the container at key, or default if no container with the name key is available.

isel(indexers: dict = None, *, drop: bool = False, combination_method: str = 'auto', deep: bool = None, **indexers_kwargs) DataArray#

Return a new labelled xarray.DataArray with an index-selected subset of members of this group.

If deep selection is activated, those indexers that are not available in the group-managed dimensions are looked up in the members of this group.

Note

For data combination (via any combination_method) dimensions that differ in size across group members have to be labelled, such that arrays can be aligned using xarray’s xarray.align() function and the respective coordinates. See the xarray documentation for more information about coordinates.

Parameters
  • indexers (dict, optional) – A dict with keys matching dimensions and values given by scalars, slices or arrays of tick indices. Like xarray.DataArray.isel(), this uses pandas-like indexing, i.e.: slices do not include the terminal value.

  • drop (bool, optional) – Whether to drop coordinate variables instead of making them scalar.

  • combination_method (str, optional) –

    How to combine group-level data with member-level data. Ignored if data from a single group member is selected, i.e. no data has to be combined. Can be:

    • concat: Concatenate. This can preserve the dtype, but requires that no data is missing.

    • merge: Merge, using xarray.merge(). This leads to a type conversion to float64, but allows members being missing or coordinates not fully filling the available space.

    • try_concat: Try concatenation, fall back to merging if that was unsuccessful.

    • auto: Automatically deduce a suitable combination method. Use merge if data is non-integer type and try_concat otherwise.

    Note

    Selecting all data (by not passing any indexers) can be significantly faster using the merge combination method than using the concat method.

  • deep (bool, optional) – Whether to allow deep indexing, i.e.: that indexers may contain dimensions that don’t refer to group-level dimensions but to dimensions that are only available among the member data. If None, will use the value returned by the allow_deep_selection property.

  • **indexers_kwargs – Additional indexers

Returns

The selected data, potentially a combination of data on group level and member-level data.

Return type

DataArray
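How the combination_method values relate to each other can be sketched without xarray. Both functions below are hypothetical helpers: dtype_kind is assumed to follow NumPy's one-letter dtype kinds ('i' and 'u' for integers), and catching any Exception in try_concat mode is an assumption of this sketch, not a statement about which errors dantro handles.

```python
def resolve_combination_method(method: str, *, dtype_kind: str) -> str:
    """Hypothetical helper: map 'auto' to a concrete combination method."""
    if method == "auto":
        # Integer data: try to preserve the dtype; everything else: merge
        return "try_concat" if dtype_kind in ("i", "u") else "merge"
    if method in ("concat", "merge", "try_concat"):
        return method
    raise ValueError(f"Invalid combination_method '{method}'!")


def combine(method: str, *, concat, merge):
    """Apply the chosen strategy; concat and merge are zero-argument callables."""
    if method == "concat":
        return concat()
    if method == "merge":
        return merge()
    if method == "try_concat":
        try:
            return concat()
        except Exception:
            # Assumption: any failure of concatenation triggers the fallback
            return merge()
    raise ValueError(f"Invalid combination_method '{method}'!")


print(resolve_combination_method("auto", dtype_kind="i"))
print(resolve_combination_method("auto", dtype_kind="f"))
```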

items()#

Returns an iterator over the (name, data container) tuples of this group.

key_at_idx(idx: int) str#

Get a key by its index within the container. Can be negative.

Parameters

idx (int) – The index within the member sequence

Returns

The desired key

Return type

str

Raises

IndexError – Index out of range
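The index semantics, including negative indices, can be sketched with a plain list of keys. The free function below is a hypothetical analogue and not the group method itself.

```python
from typing import Iterable


def key_at_idx(keys: Iterable[str], idx: int) -> str:
    """Hypothetical analogue: return the key at position idx (may be negative)."""
    keys = list(keys)
    if not -len(keys) <= idx < len(keys):
        raise IndexError(
            f"Index {idx} out of range for a sequence of {len(keys)} keys!"
        )
    return keys[idx]


keys = ["0", "10", "20"]
print(key_at_idx(keys, 0), key_at_idx(keys, -1))
```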

keys()#

Returns an iterator over the container names in this group.

keys_as_int() Generator[int, None, None]#

Returns an iterator over keys as integer values

lock()#

Locks the data of this object

property locked: bool#

Whether this object is locked

property logstr: str#

Returns the classname and name of this object

property member_map: DataArray#

Returns an array that represents the space that the members of this group span, where each value (i.e. a specific coordinate combination) is the name of the corresponding member of this group.

Upon first call, this is computed here. If members are added, an attempt is made to accommodate them in the existing map; if that is not possible, the cache will be invalidated.

The member map _may_ include empty strings, i.e. coordinate combinations that are not covered by any member. Also, they can contain duplicate names, as one member can cover multiple coordinates.

Note

The member map is invalidated when new members are added that cannot be accommodated in it. It will be recalculated when needed.
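The structure of a member map can be illustrated with plain dictionaries instead of a labelled array. This is a sketch under assumptions: build_member_map and the layout of member_coords (member name to per-dimension coordinate lists) are hypothetical.

```python
import itertools
from typing import Dict, List, Sequence, Tuple


def build_member_map(
    dims: Sequence[str], member_coords: Dict[str, Dict[str, List[int]]]
) -> Tuple[Dict[str, List[int]], Dict[tuple, str]]:
    """Hypothetical helper: map each coordinate combination to a member name."""
    # Group-level coordinates: the union of all member coordinates per dimension
    coords = {
        dim: sorted({c for mc in member_coords.values() for c in mc[dim]})
        for dim in dims
    }
    # Start with empty strings, i.e. "not covered by any member"
    member_map = {
        point: "" for point in itertools.product(*(coords[d] for d in dims))
    }
    # One member may cover several coordinate combinations
    for name, mc in member_coords.items():
        for point in itertools.product(*(mc[d] for d in dims)):
            member_map[point] = name
    return coords, member_map


coords, member_map = build_member_map(
    ("time", "run"),
    {
        "a": {"time": [0], "run": [0, 1]},  # covers two combinations
        "b": {"time": [1], "run": [0]},
    },
)
print(member_map)
```

In this example, member a appears twice and the combination (1, 1) keeps the empty string, matching the two cases mentioned above.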

property member_map_available: bool#

Whether the member map is available yet.

property name: str#

The name of this DataContainer-derived object.

property ndim: int#

The rank of the space covered by the group-level dimensions.

new_container(path: Union[str, List[str]], *, Cls: Optional[type] = None, **kwargs)#

Creates a new container of type Cls and adds it at the given path relative to this group.

If needed, intermediate groups are automatically created.

Parameters
  • path (Union[str, List[str]]) – Where to add the container.

  • Cls (type, optional) – The class of the container to add. If None, the _NEW_CONTAINER_CLS class variable’s value is used.

  • **kwargs – passed on to Cls.__init__

Returns

The created container of type Cls

Raises
  • ValueError – If neither the Cls argument nor the class variable _NEW_CONTAINER_CLS were set or if path was empty.

  • TypeError – When Cls is not compatible to the data tree

new_group(path: Union[str, list], *, Cls: Optional[type] = None, **kwargs)#

Creates a new group at the given path.

Parameters
  • path (Union[str, list]) – The path to create the group at. Note that the whole intermediate path needs to already exist.

  • Cls (type, optional) – If given, use this type to create the group. If not given, uses the class specified in the _NEW_GROUP_CLS class variable or, as last resort, the type of this instance.

  • **kwargs – Passed on to Cls.__init__

Returns

The created group of type Cls

Raises

TypeError – For the given class not being derived from BaseDataGroup

property parent#

The associated parent of this container or group

property path: str#

The path to get to this container or group from some root path

pop(k[, d]) v, remove specified key and return the corresponding value.#

If key is not found, d is returned if given, otherwise KeyError is raised.

popitem() (k, v), remove and return some (key, value) pair#

as a 2-tuple; but raise KeyError if D is empty.

raise_if_locked(*, prefix: Optional[str] = None)#

Raises an exception if this object is locked; does nothing otherwise

recursive_update(other, *, overwrite: bool = True)#

Recursively updates the contents of this data group with the entries of the given data group

Note

This will create shallow copies of those elements in other that are added to this object.

Parameters
  • other (BaseDataGroup) – The group to update with

  • overwrite (bool, optional) – Whether to overwrite already existing objects. If False, a conflict will lead to an error being raised and the update being stopped.

Raises

TypeError – If other was of invalid type
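The recursion and the role of overwrite can be sketched with nested dictionaries standing in for groups. The function below is a hypothetical dict-based analogue; in particular, raising ValueError on a conflict is an assumption of this sketch, as the error type is not stated above.

```python
import copy


def recursive_update(target: dict, other: dict, *, overwrite: bool = True) -> dict:
    """Hypothetical dict-based analogue of a recursive group update."""
    if not isinstance(other, dict):
        raise TypeError(f"Can only update with a dict, got {type(other)}")

    for key, value in other.items():
        if isinstance(value, dict) and isinstance(target.get(key), dict):
            # Both sides are group-like: recurse instead of replacing
            recursive_update(target[key], value, overwrite=overwrite)
        elif key in target and not overwrite:
            # Assumption: the concrete error type is chosen for illustration
            raise ValueError(f"Entry '{key}' already exists!")
        else:
            target[key] = copy.copy(value)  # shallow copy of the added element
    return target


tree = {"a": {"x": 1}, "b": 2}
recursive_update(tree, {"a": {"y": 3}, "b": 4})
print(tree)
```

As in the description, an update with overwrite=False stops at the first conflict, so entries processed before the conflict remain updated.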

sel(indexers: dict = None, *, method: str = None, tolerance: float = None, drop: bool = False, combination_method: str = 'auto', deep: bool = None, **indexers_kwargs) DataArray#

Return a new labelled xarray.DataArray with a coordinate-selected subset of members of this group.

If deep selection is activated, those indexers that are not available in the group-managed dimensions are looked up in the members of this group.

Note

For data combination (via any combination_method) dimensions that differ in size across group members have to be labelled, such that arrays can be aligned using xarray’s xarray.align() function and the respective coordinates. See the xarray documentation for more information about coordinates.

Parameters
  • indexers (dict, optional) – A dict with keys matching dimensions and values given by scalars, slices or arrays of tick labels. Like xarray.DataArray.sel(), this uses pandas-like indexing, i.e.: slices include the terminal value.

  • method (str, optional) – Method to use for inexact matches

  • tolerance (float, optional) – Maximum (absolute) distance between original and given label for inexact matches.

  • drop (bool, optional) – Whether to drop coordinate variables instead of making them scalar.

  • combination_method (str, optional) –

    How to combine group-level data with member-level data. Ignored if data from a single group member is selected, i.e. no data has to be combined. Can be:

    • concat: Concatenate. This can preserve the dtype, but requires that no data is missing.

    • merge: Merge, using xarray.merge(). This leads to a type conversion to float64, but allows members being missing or coordinates not fully filling the available space.

    • try_concat: Try concatenation, fall back to merging if that was unsuccessful.

    • auto: Automatically deduce a suitable combination method. Use merge if data is non-integer type and try_concat otherwise.

    Note

    Selecting all data (by not passing any indexers) can be significantly faster using the merge combination method than using the concat method.

  • deep (bool, optional) – Whether to allow deep indexing, i.e.: that indexers may contain dimensions that don’t refer to group-level dimensions but to dimensions that are only available among the member data. If None, will use the value returned by the allow_deep_selection property.

  • **indexers_kwargs – Additional indexers

Returns

The selected data, potentially a combination of data on group level and member-level data.

Return type

DataArray

setdefault(key, default=None)#

This method is not supported for a data group

property shape: Tuple[int]#

Return the shape of the space covered by the group-level dimensions.

property tree: str#

Returns the default (full) tree representation of this group

property tree_condensed: str#

Returns the condensed tree representation of this group. Uses the _COND_TREE_* prefixed class attributes as parameters.

unlock()#

Unlocks the data of this object

update([E, ]**F) None.  Update D from mapping/iterable E and F.#

If E present and has a .keys() method, does: for k in E: D[k] = E[k]

If E present and lacks .keys() method, does: for (k, v) in E: D[k] = v

In either case, this is followed by: for k, v in F.items(): D[k] = v

values()#

Returns an iterator over the containers in this group.

property with_direct_insertion: