dantro.proxy.hdf5 module

This module implements a dantro.base.BaseDataProxy specialization for HDF5 data.

class dantro.proxy.hdf5.Hdf5DataProxy(obj: h5py._hl.dataset.Dataset, *, resolve_as_dask: bool = False)[source]

Bases: dantro.base.BaseDataProxy

The Hdf5DataProxy is a placeholder for HDF5 datasets.

It saves the filename and dataset name needed to load the dataset later. Additionally, it caches some values that describe the shape and dtype of the dataset.

Depending on the type that this proxy is resolved as via the resolve() method, the corresponding h5py.File object may need to stay open and in memory; in that case, it is closed when this proxy object is garbage-collected.
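The placeholder idea described above can be illustrated without HDF5 at all. The following is a minimal, dependency-free sketch and not dantro's implementation: the class name SketchProxy, the JSON file standing in for the HDF5 file, and all attribute names are assumptions made purely for illustration.

```python
import json
import os
import tempfile


class SketchProxy:
    """Hypothetical stand-in for a data proxy; NOT dantro's Hdf5DataProxy."""

    def __init__(self, fname: str, name: str):
        # Remember where the data lives instead of holding the data itself
        self._fname = fname
        self._name = name

        # Read once to cache cheap metadata, then let the data go out of scope
        with open(fname) as f:
            data = json.load(f)[name]
        self._shape = (len(data),)

    @property
    def shape(self) -> tuple:
        """Cached shape, available without loading the data again"""
        return self._shape

    def resolve(self) -> list:
        """Load the actual data from the file only when asked to"""
        with open(self._fname) as f:
            return json.load(f)[self._name]


# Write a small file that plays the role of the HDF5 file (assumption)
tmpdir = tempfile.mkdtemp()
path = os.path.join(tmpdir, "data.json")
with open(path, "w") as f:
    json.dump({"temperature": [1.5, 2.5, 3.5]}, f)

proxy = SketchProxy(path, "temperature")
print(proxy.shape)      # metadata only, no data held by the proxy
print(proxy.resolve())  # data is read from the file at this point
```

The design choice worth noting is that the proxy object stays small: it holds a file name, a dataset name, and a few cached values, while the data itself is only read when resolve() is called.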

__init__(obj: h5py._hl.dataset.Dataset, *, resolve_as_dask: bool = False)[source]

Initializes a proxy object for HDF5 datasets.

Parameters
  • obj (h5py.Dataset) – The dataset object that this object is a proxy for

  • resolve_as_dask (bool, optional) – Whether to resolve the dataset object as a delayed dask.array.core.Array object, using an h5py.Dataset to initialize it and passing over chunk information.

resolve(*, astype: type = None)[source]

Resolves the data of this proxy by opening the HDF5 file and loading the dataset into a NumPy array or into the type specified by astype.

Parameters

astype (type, optional) – The type to return the data as, loaded from the dataset this object is a proxy for. If None, the data is returned as a NumPy array. For h5py.Dataset, the h5py.File object stays in memory until the proxy is deleted. Note that if resolve_as_dask was specified, the data is loaded as dask.array.core.Array(h5py.Dataset) only if astype is not specified in this call!

Returns

The resolved data.

Return type

type specified by astype
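The precedence between astype and resolve_as_dask described for resolve() can be summarized in a few lines. This is a hedged sketch of the documented rule only: the helper describe_resolution is hypothetical, does not exist in dantro, and returns a descriptive string instead of loading any data.

```python
def describe_resolution(astype: type = None, *, resolve_as_dask: bool = False) -> str:
    """Hypothetical helper that names the documented outcome of resolve().

    It does not touch any file; it only mirrors the documented precedence.
    """
    if astype is not None:
        # An explicitly given astype takes precedence over resolve_as_dask
        return f"data returned as {astype.__name__}"

    if resolve_as_dask:
        # Only reached if astype was not specified in the call
        return "data returned as delayed dask array"

    # Default case: neither astype nor resolve_as_dask was given
    return "data returned as numpy array"


print(describe_resolution())
print(describe_resolution(resolve_as_dask=True))
print(describe_resolution(list, resolve_as_dask=True))
```

In other words, a proxy created with resolve_as_dask=True still yields the explicitly requested type whenever astype is passed to resolve().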

_open_h5file() → h5py._hl.files.File[source]

Opens the associated HDF5 file and stores it in _h5files in order to keep it in scope. These file objects are only closed upon deletion of this proxy object!

Returns

The newly opened HDF5 file

Return type

h5py.File

__del__()[source]

Makes sure all potentially still open h5py.File objects are closed.
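The pattern of keeping opened files referenced and closing them on deletion can be sketched with ordinary file objects. This is an illustration under stated assumptions, not dantro's code: the class HandleKeeper and its _files list are hypothetical, with _files playing the role of the _h5files attribute mentioned above, and the immediate cleanup after del relies on CPython's reference counting.

```python
import gc
import os
import tempfile


class HandleKeeper:
    """Hypothetical sketch of the keep-open-then-close pattern."""

    def __init__(self, fname: str):
        self._fname = fname
        self._files = []  # stand-in for the _h5files attribute

    def _open_file(self):
        # Open the file and keep a reference so that it stays in scope
        f = open(self._fname, "rb")
        self._files.append(f)
        return f

    def __del__(self):
        # Close every file that may still be open
        for f in self._files:
            f.close()


# Create a small temporary file to open
fd, path = tempfile.mkstemp()
os.write(fd, b"payload")
os.close(fd)

keeper = HandleKeeper(path)
handle = keeper._open_file()
still_open = not handle.closed  # the keeper holds the file open

del keeper    # drops the last reference to the keeper
gc.collect()  # make sure finalizers have run
closed_after_delete = handle.closed

os.remove(path)
```

A consequence of this pattern is that objects obtained from such a file handle are only usable for as long as the owning object is alive.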

property shape

The cached shape of the dataset, accessible without resolving

property dtype

The cached dtype of the dataset, accessible without resolving

property ndim

The cached ndim of the dataset, accessible without resolving

property size

The cached size of the dataset, accessible without resolving

property chunks

The cached chunks of the dataset, accessible without resolving

_abc_impl = <_abc_data object>

_tags = ()

property classname

Returns this proxy’s class name

property tags

The tags describing this proxy object