dantro.groups.graph module

This module implements the GraphGroup, which provides an interface between hierarchically stored data and the creation of graph objects.

See The GraphGroup for more information.

class dantro.groups.graph.GraphGroup(*args, **kwargs)[source]

Bases: dantro.base.BaseDataGroup

The GraphGroup class manages groups of graph data containers and provides the possibility to create networkx graph objects using the data inside this group.

See The GraphGroup for more information.

_ALLOWED_CONT_TYPES = (<class 'dantro.containers.xrdatactr.XrDataContainer'>, <class 'dantro.groups.labelled.LabelledDataGroup'>)
_GG_node_container = 'nodes'
_GG_edge_container = 'edges'
_GG_attr_directed = 'directed'
_GG_attr_parallel = 'parallel'
_GG_attr_edge_container_is_transposed = 'edge_container_is_transposed'
_GG_attr_keep_dim = 'keep_dim'
_GG_WARN_UPON_BAD_ALIGN = True
__init__(*args, **kwargs)[source]

Initialize a GraphGroup.

Parameters
property property_maps

The property maps associated with this group, keyed by name.

property node_container

Returns the associated node container of this graph group

property edge_container

Returns the associated edge container of this graph group

property default_keep_dim

The default dimensions not to be squeezed during data selection as specified in the respective group attribute.

_get_item_or_pmap(key: Union[str, List[str]])[source]

Returns the object accessible via key. In addition to retrieving objects in this group, this method allows access to data stored in property maps.

Parameters

key (Union[str, List[str]]) – The object to retrieve. If this is a path, will recurse down until at the end.

Returns

The object at key

Raises

KeyError – If no such key can be found

_get_data_at(*, data: Union[dantro.containers.xrdatactr.XrDataContainer, dantro.groups.labelled.LabelledDataGroup], sel: dict = None, isel: dict = None, at_time: int = None, at_time_idx: int = None, keep_dim=None) → Union[xarray.core.dataarray.DataArray, dantro.containers.xrdatactr.XrDataContainer][source]

Returns an xarray.DataArray containing the data specified via the selectors sel and isel. Any dimension of size 1 is removed from the selected data.

Warning

Any invalid key in sel and isel is ignored silently.

Parameters
  • data (Union[XrDataContainer, LabelledDataGroup]) – Data to select from.

  • sel (dict, optional) – Dict of coordinate values keyed by dimensions, passed to data.sel. Used to select data via index label. May be given together with isel if no key exists in both.

  • isel (dict, optional) – Dict of indexes keyed by dimensions, passed to data.isel. Used to select data via index. May be given together with sel if no key exists in both.

  • at_time (int, optional) – Select along time dimension via index label. Translated to sel = dict(time=at_time), potentially overwriting an existing time entry.

  • at_time_idx (int, optional) – Select along time dimension via index. Translated to isel = dict(time=at_time_idx), potentially overwriting an existing time entry.

  • keep_dim (optional) – Iterable containing the names of the dimensions that must not be squeezed.

Returns

The selected data

Return type

xr.DataArray

Raises

ValueError – On keys that exist in both sel and isel
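The interplay of the four selector arguments can be illustrated with a minimal sketch. The helper name merge_selectors is hypothetical and this is not dantro's implementation; in particular, it assumes that the at_time and at_time_idx shortcuts are applied before the check for keys shared between sel and isel.

```python
from typing import Optional, Tuple


def merge_selectors(
    sel: Optional[dict] = None,
    isel: Optional[dict] = None,
    at_time: Optional[int] = None,
    at_time_idx: Optional[int] = None,
) -> Tuple[dict, dict]:
    """Hypothetical helper: combine the four selector arguments."""
    sel = dict(sel) if sel else {}
    isel = dict(isel) if isel else {}

    # The shortcuts overwrite an existing ``time`` entry
    if at_time is not None:
        sel["time"] = at_time
    if at_time_idx is not None:
        isel["time"] = at_time_idx

    # A dimension may be selected by label or by index, but not both
    # (assumption: this check runs after the shortcuts were applied)
    shared = sorted(set(sel) & set(isel))
    if shared:
        raise ValueError(f"Keys given in both sel and isel: {shared}")
    return sel, isel
```

In this sketch, merge_selectors(sel={"time": 3}, at_time=5) yields a label-based selection with time=5, mirroring the "potentially overwriting" remark in the parameter descriptions.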

_prepare_edge_data(*, edges, max_tuple_size: int)[source]

Prepares the edge data. Depending on the _GG_attr_edge_container_is_transposed class attribute, the edge data is transposed or not. If the attribute does not exist, the data is transposed only if the correct shape could unambiguously be deduced.

Parameters
  • edges – The edge data stored in a 2-dimensional container

  • max_tuple_size (int) – The maximum allowed edge tuple size (4 for nx.MultiGraph, else 3). Used when the correct shape has to be deduced automatically.

Returns

The edge data, possibly transposed

Raises

TypeError – Edge data is not 2-dimensional
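One way to picture the shape deduction is the following sketch, which works on plain nested sequences. The helper prepare_edges and its is_transposed argument are hypothetical, and the rule used here (transpose only if the row count is a plausible edge tuple size while the column count is not) is an assumption about what "unambiguously" means, not a statement about dantro's actual logic.

```python
from typing import List, Optional, Sequence


def prepare_edges(
    edges: Sequence[Sequence],
    max_tuple_size: int,
    is_transposed: Optional[bool] = None,
) -> List[tuple]:
    """Hypothetical helper returning one tuple per edge."""
    rows = [tuple(row) for row in edges]
    if not rows or len({len(r) for r in rows}) != 1:
        raise TypeError("Edge data must be a non-empty, rectangular 2D array")
    n_rows, n_cols = len(rows), len(rows[0])

    def is_tuple_size(n: int) -> bool:
        # An edge tuple has at least a source and a target
        return 2 <= n <= max_tuple_size

    if is_transposed is None:
        # Assumed rule: transpose only if the rows, and not the columns,
        # have a length that looks like an edge tuple size
        is_transposed = is_tuple_size(n_rows) and not is_tuple_size(n_cols)

    return list(zip(*rows)) if is_transposed else rows
```

For data of shape (2, 5) and max_tuple_size=3, the sketch transposes the data into five edge tuples; for the ambiguous shape (2, 2) it leaves the data unchanged unless is_transposed is given explicitly.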

_prepare_property_data(name: str, data)[source]

Prepares external property data.

Parameters
  • name (str) – The property’s name

  • data – The property data

Returns

The data, potentially converted to a dantro.containers.xrdatactr.XrDataContainer

Raises

TypeError – On invalid type of data

_check_alignment(*, ent, prop)[source]

Checks the alignment of property data and entity (node or edge) data. If self._GG_WARN_UPON_BAD_ALIGN is True, warn on possible pitfalls.

Parameters
  • ent – The entity (node or edge) data

  • prop – The property data

register_property_map(key: str, data)[source]

Registers a new property map. It allows for the given data to be accessed internally by the specified key.

Parameters
  • key (str) – The key via which the registered data will be available

  • data – The data to be mapped. If the given data is not an allowed container type, an attempt is made to construct an XrDataContainer with the data. Property map registration fails only if this operation fails.

Raises

ValueError – On invalid key

create_graph(*, directed: bool = None, parallel_edges: bool = None, node_props: list = None, edge_props: list = None, sel: dict = None, isel: dict = None, at_time: int = None, at_time_idx: int = None, align: bool = False, keep_dim=None, **graph_kwargs) → networkx.classes.graph.Graph[source]

Create a networkx graph object from the node and edge data associated with the graph group. Optionally, node and edge properties can be added from data stored or registered in the graph group. The coordinates for the selected or squeezed dimensions of the node, edge, and property data are stored as Graph attributes (in g.graph).

Note

Any pre-selection specified by sel, isel, at_time, or at_time_idx will be applied to the node data, edge data, as well as any given property data.

Warning

Any invalid key in sel and isel is ignored silently (see _get_data_at()).

Parameters
  • directed (bool, optional) – If true, the graph will be directed. If not given, the value given by the group attribute with name _GG_attr_directed is used instead.

  • parallel_edges (bool, optional) – If true, the graph will allow parallel edges. If not given, an attempt is made to read the value from the group attribute with name _GG_attr_parallel.

  • node_props (list, optional) – List of names specifying the containers that contain the node property data.

  • edge_props (list, optional) – List of names specifying the containers that contain the edge property data.

  • sel (dict, optional) – Dict of coordinate values keyed by dimensions, passed to _get_data_at(). Used to select data via index label.

  • isel (dict, optional) – Dict of indexes keyed by dimensions, passed to _get_data_at(). Used to select data via index.

  • at_time (int, optional) – Select along time dimension via index label. Translated to sel = dict(time=at_time).

  • at_time_idx (int, optional) – Select along time dimension via index. Translated to isel = dict(time=at_time_idx).

  • align (bool, optional) – If True, the property data is aligned with the node/edge data using xarray.align (default: False). The indexes of the <node/edge>_container are used for each dimension. If the class variable _GG_WARN_UPON_BAD_ALIGN is True, warn upon missing values or if no re-ordering was done. Any dimension of size 1 is squeezed and thus alignment (via align=True) will have no effect on such dimensions.

  • keep_dim (optional) – Iterable containing the names of the dimensions that must not be squeezed. Passed on to _get_data_at().

  • **graph_kwargs – Passed to the constructor of the respective networkx graph object.

Returns

The networkx graph object. Depending on the provided information, one of the following graph objects is created: nx.Graph, nx.DiGraph, nx.MultiGraph, nx.MultiDiGraph.
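How the two flags map onto the four graph types can be sketched as follows. Both helper names are hypothetical; the fallback to a group attribute is modelled with a plain dict, and what happens when the attribute is missing is not specified above, so the sketch simply lets the KeyError propagate.

```python
def resolve_flag(value, attrs: dict, attr_name: str) -> bool:
    """Prefer the explicit argument, else fall back to the group attribute."""
    if value is None:
        # Assumption: a missing attribute surfaces as a KeyError here
        value = attrs[attr_name]
    return bool(value)


def graph_class_name(directed: bool, parallel_edges: bool) -> str:
    """Return the name of the networkx class matching the two flags."""
    names = {
        (False, False): "Graph",
        (True, False): "DiGraph",
        (False, True): "MultiGraph",
        (True, True): "MultiDiGraph",
    }
    return names[(bool(directed), bool(parallel_edges))]


# Stand-in for the group attributes named by _GG_attr_directed / _GG_attr_parallel
attrs = {"directed": True, "parallel": False}
chosen = graph_class_name(
    resolve_flag(None, attrs, "directed"),
    resolve_flag(None, attrs, "parallel"),
)
```

With networkx installed, getattr(networkx, chosen)(**graph_kwargs) would then construct an empty graph of the selected type.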

set_node_property(*, g, name: str, data=None, align: bool = False, keep_dim=None, **selector)[source]

Sets a property to every node in Graph g that is also in the node_container of the graph group. The coordinates for the selected or squeezed dimensions of the property data are stored as Graph attributes (in g.graph).

Parameters
  • g – The networkx graph object

  • name (str) – If data is None, name must specify the container within the graph group that contains the property values, or be a valid key in property_maps. name is used as the name for the property in the graph object, potentially overwriting an existing property.

  • data (None, optional) – If given, load node properties directly from data. If the given data is not an allowed container type, an attempt is made to construct an XrDataContainer with the data. Setting the node property fails only if this operation fails.

  • align (bool, optional) – If True, the property data is aligned with the node data using xarray.align. The indexes of the node_container are used for each dimension. If the class variable _GG_WARN_UPON_BAD_ALIGN is True, warn upon missing values or if no re-ordering was done. Any dimension of size 1 is squeezed and thus alignment (via align=True) will have no effect on such dimensions.

  • keep_dim (optional) – Iterable containing the names of the dimensions that must not be squeezed. Passed on to _get_data_at().

  • **selector – Specifies the selection applied to both node data and property data. Passed on to _get_data_at(). Use the sel (isel) dict to select data via coordinate value (index).

Raises

ValueError – Length mismatch of the selected property and node data
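The core of this behaviour, pairing node IDs with property values and touching only nodes that exist in the graph, can be sketched without networkx. The function below is hypothetical: it uses a plain dict of attribute dicts as a stand-in for the graph and does not cover selection, alignment, or the coordinates stored in g.graph.

```python
from typing import Dict, Hashable, Sequence


def set_node_property_sketch(
    graph: Dict[Hashable, dict],
    node_ids: Sequence[Hashable],
    name: str,
    values: Sequence,
) -> None:
    """Attach ``values`` to those nodes in ``node_ids`` that exist in ``graph``."""
    if len(node_ids) != len(values):
        raise ValueError(
            f"Length mismatch: {len(node_ids)} nodes, {len(values)} values"
        )
    for node, value in zip(node_ids, values):
        if node in graph:  # skip nodes that are not part of the graph
            graph[node][name] = value  # overwrites an existing property


# Stand-in graph: node 2 is in the node data but not in the graph
graph = {0: {}, 1: {"opinion": 0.0}}
set_node_property_sketch(graph, [0, 1, 2], "opinion", [0.1, 0.5, 0.9])
```

The same pattern applies to edge properties, with edge tuples taking the place of node IDs.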

set_edge_property(*, g, name: str, data=None, align: bool = False, keep_dim=None, **selector)[source]

Sets a property to every edge in Graph g that is also in the edge_container of the graph group. The coordinates for the selected or squeezed dimensions of the property data are stored as Graph attributes (in g.graph).

Parameters
  • g – The networkx graph object

  • name (str) – If data is None, name must specify the container within the graph group that contains the property values, or be a valid key in property_maps. name is used as the name for the property in the graph object, potentially overwriting an existing property.

  • data (None, optional) – If given, load edge properties directly from data. If the given data is not an allowed container type, an attempt is made to construct an XrDataContainer with the data. Setting the edge property fails only if this operation fails.

  • align (bool, optional) – If True, the property data is aligned with the edge data using xarray.align. The indexes of the edge_container are used for each dimension. If the class variable _GG_WARN_UPON_BAD_ALIGN is True, warn upon missing values or if no re-ordering was done. Any dimension of size 1 is squeezed and thus alignment (via align=True) will have no effect on such dimensions.

  • keep_dim (optional) – Iterable containing the names of the dimensions that must not be squeezed. Passed on to _get_data_at().

  • **selector – Specifies the selection applied to both edge data and property data. Passed on to _get_data_at(). Use the sel (isel) dict to select data via coordinate value (index).

Raises

ValueError – Length mismatch of the selected property and edge data

_ATTRS_CLS

alias of dantro.base.BaseDataAttrs

_COND_TREE_CONDENSE_THRESH = 10
_COND_TREE_MAX_LEVEL = 10
_DirectInsertionModeMixin__in_direct_insertion_mode = False
_LockDataMixin__locked = False
_MutableMapping__marker = <object object>
_NEW_CONTAINER_CLS = None
_NEW_GROUP_CLS = None
_STORAGE_CLS

alias of builtins.dict

__contains__(cont: Union[str, dantro.abc.AbstractDataContainer]) → bool

Whether the given container is in this group or not.

If this is a data tree object, it will be checked whether this specific instance is part of the group, using is-comparison.

Otherwise, assumes that cont is a valid argument to the __getitem__() method (a key or key sequence) and tries to access the item at that path, returning True if this succeeds and False if not.

Lookup complexity is that of item lookup (scalar) for both name and object lookup.

Parameters

cont (Union[str, AbstractDataContainer]) – The name of the container, a path, or an object to check via identity comparison.

Returns

Whether the given container object is part of this group or whether the given path is accessible from this group.

Return type

bool

__delitem__(key: str) → None

Deletes an item from the group

__eq__(other) → bool

Evaluates equality by making the following comparisons: identity, strict type equality, and finally equality of the _data and _attrs attributes, i.e. the private attributes. This ensures that comparison does not trigger any downstream effects like resolution of proxies.

If types do not match exactly, NotImplemented is returned, thus referring the comparison to the other side of the ==.
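The comparison order described above follows a common Python pattern, sketched here with a hypothetical Container class that is not part of dantro.

```python
class Container:
    """Hypothetical stand-in with private ``_data`` and ``_attrs``."""

    def __init__(self, data, attrs=None):
        self._data = data
        self._attrs = dict(attrs or {})

    def __eq__(self, other):
        if other is self:                    # 1. identity
            return True
        if type(other) is not type(self):    # 2. strict type equality
            return NotImplemented            #    defer to the other operand
        # 3. compare the private attributes directly
        return self._data == other._data and self._attrs == other._attrs


class SubContainer(Container):
    """A subclass; never equal to a plain Container in this sketch."""
```

Because both operands return NotImplemented for mismatched types, Python falls back to an identity comparison, so Container([1]) == SubContainer([1]) evaluates to False.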

__format__(spec_str: str) → str

Creates a formatted string from the given specification.

Invokes further methods which are prefixed by _format_.

__getitem__(key: Union[str, List[str]]) → dantro.abc.AbstractDataContainer

Looks up the given key and returns the corresponding item.

This supports recursive relative lookups in two ways:

  • By supplying a path as a string that includes the path separator. For example, foo/bar/spam walks down the tree along the given path segments.

  • By directly supplying a key sequence, i.e. a list or tuple of key strings.

With the last path segment, it is possible to access an element that is no longer part of the data tree; successive lookups thus need to use the interface of the corresponding leaf object of the data tree.

Absolute lookups, i.e. from path /foo/bar, are not possible!

Lookup complexity is that of the underlying data structure: for groups based on dict-like storage containers, lookups happen in constant time.

Note

This method aims to replicate the behavior of POSIX paths.

Thus, it can also be used to access the element itself or the parent element: Use . to refer to this object and .. to access this object’s parent.

Parameters

key (Union[str, List[str]]) – The name of the object to retrieve or a path via which it can be found in the data tree.

Returns

The object at key, which conforms to the dantro tree interface.

Return type

AbstractDataContainer

Raises

ItemAccessError – If no object could be found at the given key or if an absolute lookup, starting with /, was attempted.
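The lookup rules described for this method (string paths, key sequences, the . and .. segments, and the rejection of absolute paths) can be sketched with a small tree class. The Node class is hypothetical, and KeyError stands in for ItemAccessError.

```python
from typing import List, Optional, Union


class Node:
    """Hypothetical minimal tree node with a parent link."""

    def __init__(self, name: str, parent: Optional["Node"] = None):
        self.name = name
        self.parent = parent
        self.children = {}
        if parent is not None:
            parent.children[name] = self

    def __getitem__(self, key: Union[str, List[str]]) -> "Node":
        if isinstance(key, str):
            if key.startswith("/"):
                raise KeyError(f"Absolute lookups are not possible: {key!r}")
            key = key.split("/")

        node = self
        for segment in key:
            if segment in ("", "."):
                continue  # stay at the current node
            if segment == "..":
                if node.parent is None:
                    raise KeyError("No parent available")
                node = node.parent
            elif segment in node.children:
                node = node.children[segment]
            else:
                raise KeyError(f"No item {segment!r} in {node.name!r}")
        return node


root = Node("root")
foo = Node("foo", parent=root)
bar = Node("bar", parent=foo)
```

Here root["foo/bar"] and root[["foo", "bar"]] both return bar, while bar[".."] returns foo.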

__iter__()

Returns an iterator over the OrderedDict

__len__() → int

The number of members in this group.

__repr__() → str

Same as __str__

__setitem__(key: Union[str, List[str]], val: dantro.base.BaseDataContainer) → None

This method is used to allow access to the content of containers of this group. For adding an element to this group, use the add method!

Parameters
  • key (Union[str, List[str]]) – The key to which to set the value. If this is a path, will recurse down to the lowest level. Note that all intermediate keys need to be present.

  • val (BaseDataContainer) – The value to set

Returns

None

Raises

ValueError – If trying to add an element to this group, which should be done via the add method.

__sizeof__() → int

Returns the size of the data (in bytes) stored in this container’s data and its attributes.

Note that this value is approximate. It is computed by calling the sys.getsizeof function on the data, the attributes, the name and some caching attributes that each dantro data tree class contains. Importantly, this is not a recursive algorithm.

Also, derived classes might implement further attributes that are not taken into account either. To be more precise in a subclass, create a specific __sizeof__ method and invoke this parent method additionally.
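The non-recursive nature of sys.getsizeof can be seen in a small stdlib-only illustration, assuming CPython; the variable names are arbitrary and exact byte counts vary between Python versions.

```python
import sys

small = ["a"]
large = ["a" * 1_000_000]

# Both lists hold a single reference, so their shallow sizes are identical ...
shallow_small = sys.getsizeof(small)
shallow_large = sys.getsizeof(large)

# ... whereas adding the sizes of the elements reveals the difference.
deep_small = shallow_small + sum(sys.getsizeof(x) for x in small)
deep_large = shallow_large + sum(sys.getsizeof(x) for x in large)
```

A subclass that wants a more precise value would add such element sizes in its own __sizeof__ on top of the parent's result.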

For more information, see the documentation of sys.getsizeof.

__str__() → str

An info string, that describes the object. This invokes the formatting helpers to show the log string (type and name) as well as the info string of this object.

_abc_impl = <_abc_data object>
_add_container(cont, *, overwrite: bool)

Private helper method to add a container to this group.

_add_container_callback(cont) → None

Called after a container was added.

_add_container_to_data(cont: dantro.abc.AbstractDataContainer) → None

Performs the operation of adding the container to the _data. This can be used by subclasses to make more elaborate things while adding data, e.g. specify ordering …

NOTE This method should NEVER be called on its own, but only via the _add_container method, which takes care of properly linking the container that is to be added.

NOTE After adding, the container needs to be reachable under its .name!

Parameters

cont – The container to add

_attrs = None
_check_cont(cont) → None

Can be used by a subclass to check a container before adding it to this group. Is called by _add_container before checking whether the object exists or not.

This is not expected to return, but can raise errors, if something did not work out as expected.

Parameters

cont – The container to check

_check_data(data: Any) → None

This method can be used to check the data provided to this container

It is called before the data is stored in the __init__ method and should raise an exception or create a warning if the data is not as desired.

This method can be subclassed to implement more specific behaviour. To propagate the parent classes’ behaviour the subclassed method should always call its parent method using super().

Note

The CheckDataMixin provides a generalised implementation of this method to perform some type checks and react to unexpected types.

Parameters

data (Any) – The data to check

_check_name(new_name: str) → None

Called from name.setter and can be used to check the name that the container is supposed to have. On invalid name, this should raise.

This method can be subclassed to implement more specific behaviour. To propagate the parent classes’ behaviour the subclassed method should always call its parent method using super().

Parameters

new_name (str) – The new name, which is to be checked.

_direct_insertion_mode(*, enabled: bool = True)

A context manager that brings the class this mixin is used in into direct insertion mode. While in that mode, the with_direct_insertion() property will return true.

This context manager additionally invokes two callback functions, which can be specialized to perform certain operations when entering or exiting direct insertion mode: Before entering, _enter_direct_insertion_mode() is called. After exiting, _exit_direct_insertion_mode() is called.

Parameters

enabled (bool, optional) – whether to actually use direct insertion mode. If False, will yield directly without setting the toggle. This is equivalent to a null-context.
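A context manager of this kind could look roughly like the sketch below. The class DirectInsertionSketch is hypothetical, and the exact ordering of the flag toggle relative to the two callbacks is an assumption rather than a description of dantro's code.

```python
import contextlib


class DirectInsertionSketch:
    """Hypothetical stand-in for a class using such a mixin."""

    def __init__(self):
        self._in_mode = False
        self.events = []

    @property
    def with_direct_insertion(self) -> bool:
        return self._in_mode

    @contextlib.contextmanager
    def _direct_insertion_mode(self, *, enabled: bool = True):
        if not enabled:
            yield  # behaves like a null-context
            return

        self._in_mode = True
        try:
            self._enter_direct_insertion_mode()
            yield
        finally:
            # Runs even if the with-block raised
            self._exit_direct_insertion_mode()
            self._in_mode = False

    def _enter_direct_insertion_mode(self):
        self.events.append("enter")

    def _exit_direct_insertion_mode(self):
        self.events.append("exit")
```

Inside a with obj._direct_insertion_mode(): block, obj.with_direct_insertion is True; with enabled=False, neither the flag nor the callbacks are touched.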

_enter_direct_insertion_mode()

Called after entering direct insertion mode; can be overwritten to attach additional behaviour.

_exit_direct_insertion_mode()

Called before exiting direct insertion mode; can be overwritten to attach additional behaviour.

_format_cls_name() → str

A __format__ helper function: returns the class name

_format_info() → str

A __format__ helper function: returns an info string that is used to characterize this object. Does NOT include name and classname!

_format_logstr() → str

A __format__ helper function: returns the log string, a combination of class name and name

_format_name() → str

A __format__ helper function: returns the name

_format_path() → str

A __format__ helper function: returns the path to this container

_format_tree() → str

Returns the default tree representation of this group by invoking the .tree property

_format_tree_condensed() → str

Returns the condensed tree representation of this group by invoking the .tree_condensed property

_ipython_key_completions_() → List[str]

For ipython integration, return a list of available keys

Links the new_child to this class, unlinking the old one.

This method should be called from any method that changes which items are associated with this group.

_lock_hook()

Invoked upon locking.

_tree_repr(*, level: int = 0, max_level: int = None, info_fstr='<{:cls_name,info}>', info_ratio: float = 0.6, condense_thresh: Union[int, Callable[[int, int], int]] = None, total_item_count: int = 0) → Union[str, List[str]]

Recursively creates a multi-line string tree representation of this group. This is used by, e.g., the _format_tree method.

Parameters
  • level (int, optional) – The depth within the tree

  • max_level (int, optional) – The maximum depth within the tree; recursion is not continued beyond this level.

  • info_fstr (str, optional) – The format string for the info string

  • info_ratio (float, optional) – The width ratio of the whole line width that the info string takes

  • condense_thresh (Union[int, Callable[[int, int], int]], optional) – If given, this specifies the threshold beyond which the tree view for the current element becomes condensed by hiding the output for some elements. The minimum value is 3, meaning that at most 3 lines are generated from this level (excluding lines coming from recursion): two elements and one line indicating how many values are hidden. A smaller value is silently raised to 3. Half of the elements are taken from the beginning of the item iteration, the other half from the end. If given as an integer, that number is used. If a callable is given, it is invoked with the current level, the number of elements to be added at this level, and the current total item count along this recursion branch; it should return the number of lines to be shown for the current element.

  • total_item_count (int, optional) – The total number of items already created in this recursive tree representation call. Passed on between recursive calls.

Returns

The (multi-line) tree representation of this group. If this method was invoked with level == 0, a string will be returned; otherwise, a list of strings will be returned.

Return type

Union[str, List[str]]
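The condensing rule described for condense_thresh can be sketched on a flat list of lines. The helper condense and the wording of the marker line are hypothetical, and giving the extra line to the head when the number of shown elements is odd is an assumption.

```python
from typing import List


def condense(lines: List[str], thresh: int) -> List[str]:
    """Hypothetical helper: keep at most ``thresh`` lines of ``lines``."""
    thresh = max(thresh, 3)  # smaller values are brought up to 3
    if len(lines) <= thresh:
        return list(lines)

    n_shown = thresh - 1         # one line is reserved for the marker
    n_head = (n_shown + 1) // 2  # assumption: an odd remainder goes to the head
    n_tail = n_shown - n_head
    n_hidden = len(lines) - n_shown

    marker = f"... ({n_hidden} more) ..."
    return lines[:n_head] + [marker] + lines[len(lines) - n_tail:]
```

With a threshold of 3, eight entries are reduced to the first entry, a marker line reporting six hidden entries, and the last entry.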

Unlink a child from this class.

This method should be called from any method that removes an item from this group, be it through deletion or through

_unlock_hook()

Invoked upon unlocking.

add(*conts, overwrite: bool = False)

Add the given containers to this group.

property attrs

The container attributes.

property classname

Returns the name of this DataContainer-derived class

clear()

Clears all containers from this group.

This is done by unlinking all children and then overwriting _data with an empty _STORAGE_CLS object.

property data

The stored data.

get(key, default=None)

Return the container at key, or default if no container named key is available.

items()

Returns an iterator over the (name, data container) tuples of this group.

keys()

Returns an iterator over the container names in this group.

lock()

Locks the data of this object

property locked

Whether this object is locked

property logstr

Returns the classname and name of this object

property name

The name of this DataContainer-derived object.

new_container(path: Union[str, List[str]], *, Cls: type = None, **kwargs)

Creates a new container of class Cls and adds it at the given path relative to this group.

If needed, intermediate groups are automatically created.

Parameters
  • path (Union[str, List[str]]) – Where to add the container.

  • Cls (type, optional) – The class of the container to add. If None, the _NEW_CONTAINER_CLS class variable’s value is used.

  • **kwargs – passed on to Cls.__init__

Returns

the created container

Return type

Cls

Raises
  • ValueError – If neither the Cls argument nor the class variable _NEW_CONTAINER_CLS were set or if path was empty.

  • TypeError – When Cls is not compatible with the data tree

new_group(path: Union[str, list], *, Cls: type = None, **kwargs)

Creates a new group at the given path.

Parameters
  • path (Union[str, list]) – The path to create the group at. Note that the whole intermediate path needs to already exist.

  • Cls (type, optional) – If given, use this type to create the group. If not given, uses the class specified in the _NEW_GROUP_CLS class variable or, as last resort, the type of this instance.

  • **kwargs – Passed on to Cls.__init__

Returns

the created group

Return type

Cls

Raises

TypeError – If the given class is not derived from BaseDataGroup

property parent

The associated parent of this container or group

property path

The path to get to this container or group from some root path

pop(k[, d]) → v, remove specified key and return the corresponding value.

If key is not found, d is returned if given, otherwise KeyError is raised.

popitem() → (k, v), remove and return some (key, value) pair as a 2-tuple; but raise KeyError if D is empty.

raise_if_locked(*, prefix: str = None)

Raises an exception if this object is locked; does nothing otherwise

recursive_update(other, *, overwrite: bool = True)

Recursively updates the contents of this data group with the entries of the given data group

Note

This will create shallow copies of those elements in other that are added to this object.

Parameters
  • other (BaseDataGroup) – The group to update with

  • overwrite (bool, optional) – Whether to overwrite already existing objects. If False, a conflict will lead to an error being raised and the update being stopped.

Raises

TypeError – If other was of invalid type
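The recursion and the role of overwrite can be sketched with nested dicts standing in for data groups. This is a hypothetical illustration, not dantro's implementation; the error type raised on a conflict is not stated above, so ValueError is an assumption.

```python
import copy


def recursive_update(target: dict, other: dict, *, overwrite: bool = True) -> dict:
    """Recursively update ``target`` with the entries of ``other``."""
    if not isinstance(other, dict):
        raise TypeError(f"Expected a dict, got {type(other).__name__}")

    for key, value in other.items():
        if isinstance(value, dict) and isinstance(target.get(key), dict):
            # Both sides are groups: descend instead of replacing
            recursive_update(target[key], value, overwrite=overwrite)
        elif key in target and not overwrite:
            # Assumed error type; the update stops at the first conflict
            raise ValueError(f"Conflict at {key!r} and overwrite is False")
        else:
            target[key] = copy.copy(value)  # shallow copy of the added element
    return target


tree = {"a": {"x": 1}, "b": 2}
recursive_update(tree, {"a": {"y": 3}, "b": 4})
```

After the call, the nested group "a" contains both "x" and "y", and "b" has been overwritten.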

setdefault(key, default=None)

This method is not supported for a data group

property tree

Returns the default (full) tree representation of this group

property tree_condensed

Returns the condensed tree representation of this group. Uses the _COND_TREE_* prefixed class attributes as parameters.

unlock()

Unlocks the data of this object

update([E, ]**F) → None. Update D from mapping/iterable E and F.

If E is present and has a .keys() method, does: for k in E: D[k] = E[k]. If E is present and lacks a .keys() method, does: for (k, v) in E: D[k] = v. In either case, this is followed by: for k, v in F.items(): D[k] = v.

values()

Returns an iterator over the containers in this group.

property with_direct_insertion

Whether the class this mixin is mixed into is currently in direct insertion mode.