dantro.proxy.hdf5 module
This module implements a BaseDataProxy specialization for Hdf5 data.
-
class dantro.proxy.hdf5.Hdf5DataProxy(obj: h5py._hl.dataset.Dataset, *, resolve_as_dask: bool = False)
Bases: dantro.base.BaseDataProxy
The Hdf5DataProxy is a placeholder for Hdf5 datasets.
It saves the filename and dataset name needed to later load the dataset. Additionally, it caches some values that give information on the shape and dtype of the dataset.
Depending on the type that this proxy is resolved as via the resolve() method, the corresponding h5py.File object needs to stay open and in memory; it is closed upon garbage collection of this object.
-
__init__(obj: h5py._hl.dataset.Dataset, *, resolve_as_dask: bool = False)
Initializes a proxy object for Hdf5 datasets.
- Parameters
obj (h5py.Dataset) – The dataset object that this proxy stands in for
resolve_as_dask (bool, optional) – Whether to resolve the dataset object as a delayed dask.array.core.Array object, using an h5py.Dataset to initialize it and passing on the chunk information.
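The placeholder idea behind this class can be illustrated with a minimal sketch that uses only the standard library. The LazyJsonProxy class, its JSON backing file, and the key name are hypothetical stand-ins chosen so that the example runs without h5py or dantro; they are not part of dantro's API. The sketch remembers only where the data lives, caches one piece of metadata, and reads the data when resolve() is called.

```python
import json
import os
import tempfile


class LazyJsonProxy:
    """Hypothetical stand-in for a data proxy; not part of dantro."""

    def __init__(self, path, key):
        # Remember where the data lives instead of holding the data itself
        self._path = path
        self._key = key
        # JSON has no separate metadata, so the file is parsed once here;
        # only the element count is kept, the data itself is discarded
        with open(path) as f:
            self._size = len(json.load(f)[key])

    @property
    def size(self):
        """Cached number of elements, available without resolving."""
        return self._size

    def resolve(self):
        """Reopen the file and load the actual data."""
        with open(self._path) as f:
            return json.load(f)[self._key]


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "data.json")
    with open(path, "w") as f:
        json.dump({"temperature": [20.5, 21.0, 19.8]}, f)

    proxy = LazyJsonProxy(path, "temperature")
    cached_size = proxy.size   # answered from the cache
    values = proxy.resolve()   # the data is read only at this point
```

With an actual HDF5 file, the dataset metadata can be read without loading the array, so the extra parse in the constructor of this sketch would not be needed.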
-
resolve(*, astype: type = None)
Resolve the data of this proxy by opening the hdf5 file and loading the dataset into a numpy array or a type specified by astype.
- Parameters
astype (type, optional) – The type to return the data as. If None, the data is returned as np.array. For h5py.Dataset, the h5py.File object stays in memory until the proxy is deleted. Note that if resolve_as_dask was specified, the data will be loaded as dask.array.core.Array(h5py.Dataset) only if astype is not specified in this call!
- Returns
the resolved data.
- Return type
type specified by astype
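The precedence between astype and resolve_as_dask described above can be summarized in a small decision function. This is only a sketch of the documented rule, not dantro's code: the function name pick_resolution_type is hypothetical, and it returns descriptive labels rather than loading any data.

```python
def pick_resolution_type(astype=None, resolve_as_dask=False):
    """Return a label for the type the data would be resolved as."""
    if astype is not None:
        # An explicit astype takes precedence, even if resolve_as_dask was set
        return astype.__name__
    if resolve_as_dask:
        # No astype given: the dask setting from __init__ applies
        return "dask.array.core.Array"
    # Neither was given: fall back to the documented default
    return "np.array"


default_label = pick_resolution_type()
dask_label = pick_resolution_type(resolve_as_dask=True)
override_label = pick_resolution_type(astype=list, resolve_as_dask=True)
```

The third call shows the case the note warns about: passing astype in the call overrides the dask setting chosen at initialization.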
-
_open_h5file() → h5py._hl.files.File
Opens the associated HDF5 file and stores it in _h5files in order to keep it in scope. These file objects are only closed upon deletion of this proxy object!
- Returns
The newly opened HDF5 file
- Return type
h5py.File
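The lifetime behaviour described here, where opened files are kept in a container and closed only when the proxy is deleted, can be sketched with plain Python file objects. The class FileKeepingProxy and its attribute names are hypothetical and only mimic the role of _h5files; the sketch also assumes CPython, where dropping the last reference to an object runs its finalizer right away.

```python
import os
import tempfile


class FileKeepingProxy:
    """Hypothetical stand-in; not dantro's implementation."""

    def __init__(self, path):
        self._path = path
        self._files = []  # plays the role of _h5files

    def _open_file(self):
        # Open the file and store the handle so it stays in scope
        f = open(self._path, "rb")
        self._files.append(f)
        return f

    def __del__(self):
        # Close every file that was opened through this proxy
        for f in self._files:
            f.close()


fd, path = tempfile.mkstemp()
os.close(fd)

proxy = FileKeepingProxy(path)
handle = proxy._open_file()
open_before = not handle.closed  # the handle is usable while the proxy lives

del proxy                        # last reference gone, __del__ runs (CPython)
closed_after = handle.closed

os.remove(path)
```

One consequence of this design is that data resolved as a file-backed object remains valid only as long as the proxy itself is kept alive.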
-
property shape
The cached shape of the dataset, accessible without resolving
-
property dtype
The cached dtype of the dataset, accessible without resolving
-
property ndim
The cached ndim of the dataset, accessible without resolving
-
property size
The cached size of the dataset, accessible without resolving
-
property chunks
The cached chunks of the dataset, accessible without resolving
-
_abc_impl = <_abc_data object>
-
property classname
Returns this proxy’s class name
-
property tags
The tags describing this proxy object
-