dantro.proxy package
This module implements data proxies as specializations of
dantro.base.BaseDataProxy.
Submodules
dantro.proxy.hdf5 module
This module implements a dantro.base.BaseDataProxy
specialization for HDF5 data.
- class Hdf5DataProxy(obj: Dataset, *, resolve_as_dask: bool = False)
Bases: BaseDataProxy
The Hdf5DataProxy is a placeholder for a h5py.Dataset. It saves the filename and dataset name needed to later load the dataset. Additionally, it caches some values that describe the shape and dtype of the dataset, which delays the load until the actual data is required.
Depending on the type that this proxy is resolved as via the resolve() method, the corresponding h5py.File object needs to stay open and in memory; it is closed upon garbage collection of this object.
- __init__(obj: Dataset, *, resolve_as_dask: bool = False)
Initializes a proxy object for a h5py.Dataset object.
- Parameters:
obj (Dataset) – The dataset object to be a proxy for
resolve_as_dask (bool, optional) – Whether to resolve the dataset object as a delayed dask.array.Array object, using an h5py.Dataset to initialize it and passing over chunk information.
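The placeholder idea described above can be illustrated with a minimal sketch that uses only the standard library. The class LazyFloatProxy, its n_loads counter, and the raw binary file are hypothetical stand-ins chosen for illustration; this is not how Hdf5DataProxy is implemented, and no dantro or h5py calls are used.

```python
import os
import struct
import tempfile


class LazyFloatProxy:
    """Hypothetical stand-in for a data proxy; NOT dantro's implementation.

    Remembers where the data lives and caches cheap metadata, but reads
    the values only when resolve() is called.
    """

    def __init__(self, path: str):
        self._path = path
        # Cache metadata up front: each float64 value occupies 8 bytes
        self._size = os.path.getsize(path) // 8
        self.n_loads = 0  # counts how often the data was actually read

    @property
    def shape(self) -> tuple:
        """Cached shape, available without reading the values"""
        return (self._size,)

    @property
    def size(self) -> int:
        """Cached number of elements"""
        return self._size

    def resolve(self) -> list:
        """Read the values from disk; this is the expensive step"""
        self.n_loads += 1
        with open(self._path, "rb") as f:
            return list(struct.unpack(f"<{self._size}d", f.read()))


# Write three float64 values to a temporary file to act as the "dataset"
with tempfile.NamedTemporaryFile(suffix=".bin", delete=False) as tmp:
    tmp.write(struct.pack("<3d", 1.0, 2.5, 4.0))

proxy = LazyFloatProxy(tmp.name)
shape_before_load = proxy.shape  # metadata only, no data read yet
loads_before = proxy.n_loads
data = proxy.resolve()           # now the values are loaded
os.remove(tmp.name)
```

The point of the sketch is the ordering: the shape is known as soon as the proxy exists, while the values are read only on the explicit resolve() call.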
- resolve(*, astype: type | None = None)
Resolve the data of this proxy by opening the HDF5 file and loading the dataset into a numpy.ndarray or a type specified by the astype argument.
- Parameters:
astype (type, optional) – The type as which to return the data from the dataset this object is a proxy for. If None, the data is returned as numpy.ndarray. For h5py.Dataset, the h5py.File object stays in memory until the proxy is deleted. Note that if resolve_as_dask was specified during proxy initialization, the data will be loaded as dask.array.Array only if astype is not specified in this call!
- Returns:
the resolved data.
- Return type:
type specified by astype
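The precedence between astype and resolve_as_dask can be sketched as follows. Everything here is a hypothetical stand-in: SketchProxy and LazyView are illustrative names, a plain list plays the role of numpy.ndarray, and converting via astype(data) is an assumption about how a type conversion might look, not a statement about dantro's code.

```python
from typing import Callable, Optional


class LazyView:
    """Hypothetical stand-in for a delayed array such as dask.array.Array"""

    def __init__(self, loader: Callable[[], list]):
        self._loader = loader

    def compute(self) -> list:
        """Trigger the actual load"""
        return self._loader()


class SketchProxy:
    """Illustrative only; mirrors the documented precedence, not dantro's code"""

    def __init__(self, loader: Callable[[], list], *, resolve_as_lazy: bool = False):
        self._loader = loader
        self._resolve_as_lazy = resolve_as_lazy

    def resolve(self, *, astype: Optional[type] = None):
        if astype is not None:
            # An explicit astype takes precedence over the lazy setting
            return astype(self._loader())
        if self._resolve_as_lazy:
            # No astype given: hand back the delayed object
            return LazyView(self._loader)
        # Default: eager load (a list stands in for numpy.ndarray)
        return self._loader()


proxy = SketchProxy(lambda: [1, 2, 3], resolve_as_lazy=True)
lazy = proxy.resolve()               # LazyView, nothing loaded yet
eager = proxy.resolve(astype=tuple)  # astype wins: data loaded as a tuple
```

Under these assumptions, a proxy created with the lazy flag still returns eagerly loaded data whenever the caller passes astype.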
- _open_h5file() → File
Opens the associated HDF5 file and stores it in _h5files in order to keep it in scope. These file objects are only closed upon deletion of this proxy object!
- Returns:
The newly opened HDF5 file
- Return type:
h5py.File
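The pattern of keeping opened files in scope and closing them when the owning object is deleted can be sketched with ordinary file handles. HandleKeeper, its _files list, and the use of __del__ are hypothetical choices for illustration; the source only states that the files are stored in _h5files and closed upon deletion of the proxy, not how that is done.

```python
import gc
import os
import tempfile


class HandleKeeper:
    """Hypothetical sketch of keeping opened files in scope; not dantro's code"""

    def __init__(self, path: str):
        self._path = path
        self._files = []  # assumed container, playing the role of _h5files

    def _open_file(self):
        """Open the file and store the handle so it stays in scope"""
        f = open(self._path, "rb")
        self._files.append(f)
        return f

    def __del__(self):
        # Close every handle this object opened
        for f in self._files:
            f.close()


with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"payload")

keeper = HandleKeeper(tmp.name)
handle = keeper._open_file()
open_while_alive = not handle.closed  # handle stays open while keeper exists

del keeper    # dropping the last reference triggers __del__ in CPython
gc.collect()  # extra nudge for interpreters without reference counting
closed_after_delete = handle.closed
os.remove(tmp.name)
```

A consequence of this design is that data objects which depend on the open file remain usable only as long as the owning object is alive.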
- property shape
The cached shape of the dataset, accessible without resolving
- property dtype
The cached dtype of the dataset, accessible without resolving
- property ndim
The cached ndim of the dataset, accessible without resolving
- property size
The cached size of the dataset, accessible without resolving
- property chunks
The cached chunks of the dataset, accessible without resolving
- _abc_impl = <_abc._abc_data object>