dantro.proxy package#

This package implements data proxies as specializations of dantro.base.BaseDataProxy.

Submodules#

dantro.proxy.hdf5 module#

This module implements a dantro.base.BaseDataProxy specialization for HDF5 data.

class Hdf5DataProxy(obj: Dataset, *, resolve_as_dask: bool = False)[source]#

Bases: dantro.base.BaseDataProxy

The Hdf5DataProxy is a placeholder for an h5py.Dataset.

It saves the filename and dataset name needed to load the dataset later. Additionally, it caches values that describe the shape and dtype of the dataset, so the load can be delayed until the actual data is required.

Depending on the type that this proxy is resolved as via the resolve() method, the corresponding h5py.File object needs to stay open and in memory; it is closed upon garbage-collection of this object.
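The placeholder idea described above can be illustrated with a minimal, standard-library-only sketch. It is not dantro's implementation: the class LazyArrayProxy, the helper write_doubles, and the tiny binary file layout are all hypothetical stand-ins for an HDF5 file and its dataset. The point is only that cheap metadata is cached at construction time, while the payload is read when resolve() is called.

```python
import os
import struct
import tempfile


def write_doubles(path, values):
    """Hypothetical file format: an 8-byte count header, then the doubles."""
    with open(path, "wb") as f:
        f.write(struct.pack("<q", len(values)))
        f.write(struct.pack(f"<{len(values)}d", *values))


class LazyArrayProxy:
    """Illustrative stand-in for a data proxy (assumption, not dantro code)."""

    def __init__(self, path):
        # Remember where the data lives ...
        self._path = path
        # ... and cache the metadata by reading only the header.
        with open(path, "rb") as f:
            (self._size,) = struct.unpack("<q", f.read(8))
        self.n_loads = 0  # counts how often the payload was actually read

    @property
    def size(self):
        """Cached size, accessible without resolving."""
        return self._size

    @property
    def shape(self):
        """Cached shape, accessible without resolving."""
        return (self._size,)

    def resolve(self, *, astype=None):
        """Read the payload now; optionally convert it via ``astype``."""
        with open(self._path, "rb") as f:
            (n,) = struct.unpack("<q", f.read(8))
            values = list(struct.unpack(f"<{n}d", f.read(8 * n)))
        self.n_loads += 1
        return values if astype is None else astype(values)


tmpdir = tempfile.mkdtemp()
path = os.path.join(tmpdir, "data.bin")
write_doubles(path, [1.0, 2.0, 3.0])

proxy = LazyArrayProxy(path)
shape_before_load = proxy.shape    # metadata only, payload not read yet
loads_before = proxy.n_loads

data = proxy.resolve(astype=tuple)  # payload is read here
loads_after = proxy.n_loads
```

In this sketch, querying shape or size never increments n_loads; only resolve() does, which mirrors the delayed loading described for Hdf5DataProxy.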

__init__(obj: Dataset, *, resolve_as_dask: bool = False)[source]#

Initializes a proxy object for an h5py.Dataset object.

Parameters
  • obj (Dataset) – The dataset object to be a proxy for

  • resolve_as_dask (bool, optional) – Whether to resolve the dataset object as a delayed dask.array.Array object, using an h5py.Dataset to initialize it and passing on the chunk information.

resolve(*, astype: Optional[type] = None)[source]#

Resolves the data of this proxy by opening the HDF5 file and loading the dataset into a numpy.ndarray or into the type specified by the astype argument.

Parameters

astype (type, optional) – The type as which to return the data of the dataset this object is a proxy for. If None, the data is returned as numpy.ndarray. When resolving as h5py.Dataset, the h5py.File object stays in memory until the proxy is deleted. Note that if resolve_as_dask was set during proxy initialization, the data is loaded as dask.array.Array only if astype is not specified in this call.

Returns

The resolved data.

Return type

type specified by astype
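The precedence between astype and resolve_as_dask stated above can be summarized in a small decision function. This is a sketch of the documented rule only: the function resolution_target is hypothetical, it returns type names as strings rather than loading anything, and it makes no claim about how dantro implements the check internally.

```python
from typing import Optional


def resolution_target(*, resolve_as_dask: bool,
                      astype: Optional[type] = None) -> str:
    """Name the type the data would be resolved as (illustrative only)."""
    # An explicit astype in the resolve call always takes precedence.
    if astype is not None:
        return astype.__name__
    # Without astype, the flag given at proxy initialization decides.
    if resolve_as_dask:
        return "dask.array.Array"
    # Default: load into a numpy array.
    return "numpy.ndarray"


default_case = resolution_target(resolve_as_dask=False)
dask_case = resolution_target(resolve_as_dask=True)
override_case = resolution_target(resolve_as_dask=True, astype=list)
```

The last call shows the case the note warns about: even with resolve_as_dask set, passing astype means the data is not loaded as a dask array.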

_open_h5file() → File[source]#

Opens the associated HDF5 file and stores it in _h5files in order to keep it in scope. These file objects are closed only upon deletion of this proxy object.

Returns

The newly opened HDF5 file

Return type

File

__del__()[source]#

Make sure all potentially still open h5py.File objects are closed
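The file-lifetime pattern described for _open_h5file() and __del__() can be sketched with plain Python file objects. The class FileKeeper and its attribute names are hypothetical and only mirror the described behavior: every opened file is stored on the object to keep it in scope, and all stored files are closed when the object is deleted. The example relies on CPython's reference counting for prompt finalization; other interpreters may run __del__ later.

```python
import gc
import os
import tempfile


class FileKeeper:
    """Hypothetical helper mirroring the described pattern (not dantro code)."""

    def __init__(self, path):
        self._path = path
        self._files = []  # keeps opened file objects in scope

    def _open_file(self):
        """Open the file and remember the handle so it stays open."""
        f = open(self._path, "rb")
        self._files.append(f)
        return f

    def __del__(self):
        # getattr guards against a partially initialized object.
        for f in getattr(self, "_files", []):
            f.close()


fd, path = tempfile.mkstemp()
os.close(fd)

keeper = FileKeeper(path)
handle = keeper._open_file()
was_open = not handle.closed   # handle is usable while keeper exists

del keeper                     # triggers __del__ under CPython
gc.collect()                   # extra nudge for other collectors
is_closed = handle.closed

os.remove(path)
```

A consequence of this design is that any object handed out by the keeper, such as handle here, becomes unusable once the keeper is garbage-collected, which matches the requirement that the proxy must stay alive while a resolved h5py.Dataset is in use.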

property shape#

The cached shape of the dataset, accessible without resolving

property dtype#

The cached dtype of the dataset, accessible without resolving

property ndim#

The cached ndim of the dataset, accessible without resolving

property size#

The cached size of the dataset, accessible without resolving

property chunks#

The cached chunks of the dataset, accessible without resolving

_abc_impl = <_abc._abc_data object>#
_tags: tuple = ()#

Associated tags.

These are empty by default and may also be overwritten in the object.

property classname: str#

Returns this proxy’s class name

property tags: Tuple[str]#

The tags describing this proxy object