dantro.data_loaders package
This module implements loader mixin classes for use with the
DataManager.
All these mixin classes should follow this pattern:
class LoadernameLoaderMixin:

    @add_loader(TargetCls=TheTargetContainerClass)
    def _load_loadername(filepath: str, *, TargetCls: type):
        # ...
        return TargetCls(...)
As ensured by the add_loader() decorator
(implemented in the dantro.data_loaders._tools module), each
_load_loadername method gets supplied with the path to a file and the
TargetCls argument, which can be called to create an object of the correct
type and name.
By default, and to decouple the loader from the container, it should be
considered to be a static method; in other words: the first positional argument
should ideally not be self!
If self is required for some reason, set the omit_self option of the
decorator to False, making it a regular (instead of a static) method.
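The pattern above can be tried out without dantro installed by using a small stand-in decorator. The following is only a sketch: add_loader_sketch, SimpleContainer, DemoTextLoaderMixin and DemoManager are hypothetical names made up for this illustration, and the way the container name is derived from the file name is an assumption, not dantro's actual behaviour.

```python
import os
import tempfile
from functools import partial, wraps


def add_loader_sketch(*, TargetCls, omit_self=True):
    """Illustrative stand-in for a loader decorator (NOT dantro's add_loader)."""
    def decorator(func):
        @wraps(func)
        def wrapper(self, filepath, **kwargs):
            # Assumption: the container is named after the file (without extension)
            name = os.path.splitext(os.path.basename(filepath))[0]
            make_target = partial(TargetCls, name=name)
            if omit_self:
                # Static-style loader: self is not passed on
                return func(filepath, TargetCls=make_target, **kwargs)
            # Regular method: self is passed as the first positional argument
            return func(self, filepath, TargetCls=make_target, **kwargs)
        return wrapper
    return decorator


class SimpleContainer:
    """Hypothetical minimal container holding a name and some data."""
    def __init__(self, *, name, data):
        self.name = name
        self.data = data


class DemoTextLoaderMixin:
    @add_loader_sketch(TargetCls=SimpleContainer)
    def _load_demotext(filepath: str, *, TargetCls: type):
        # No self here: the loader only needs the path and the target class
        with open(filepath, encoding="utf-8") as f:
            return TargetCls(data=f.read())


class DemoManager(DemoTextLoaderMixin):
    """Hypothetical manager class that gains the loader via the mixin."""


with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "notes.txt")
    with open(path, "w", encoding="utf-8") as f:
        f.write("hello")
    container = DemoManager()._load_demotext(path)

print(container.name, container.data)
```

Because the loader body never touches self, it stays decoupled from whichever manager class it is mixed into, which is the motivation given above for omitting self by default.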
-
class dantro.data_loaders.AllAvailableLoadersMixin
Bases: dantro.data_loaders.load_text.TextLoaderMixin, dantro.data_loaders.load_yaml.YamlLoaderMixin, dantro.data_loaders.load_pkl.PickleLoaderMixin, dantro.data_loaders.load_hdf5.Hdf5LoaderMixin, dantro.data_loaders.load_xarray.XarrayLoaderMixin, dantro.data_loaders.load_numpy.NumpyLoaderMixin
A mixin bundling all data loaders that are available in dantro.
This is useful for a more convenient import in a downstream
DataManager.
-
_HDF5_DECODE_ATTR_BYTESTRINGS = True
-
_HDF5_DSET_DEFAULT_CLS
-
_HDF5_DSET_MAP = None
-
_HDF5_GROUP_MAP = None
-
_HDF5_MAP_FROM_ATTR = None
-
_PICKLE_LOAD_FUNC()
Read and return an object from the pickle data stored in a file.
This is equivalent to Unpickler(file).load(), but may be more efficient.
The protocol version of the pickle is detected automatically, so no protocol argument is needed. Bytes past the pickled object’s representation are ignored.
The argument file must have two methods, a read() method that takes an integer argument, and a readline() method that requires no arguments. Both methods should return bytes. Thus file can be a binary file object opened for reading, an io.BytesIO object, or any other custom object that meets this interface.
Optional keyword arguments are fix_imports, encoding and errors, which are used to control compatibility support for pickle stream generated by Python 2. If fix_imports is True, pickle will try to map the old Python 2 names to the new names used in Python 3. The encoding and errors tell pickle how to decode 8-bit string instances pickled by Python 2; these default to ‘ASCII’ and ‘strict’, respectively. The encoding can be ‘bytes’ to read these 8-bit string instances as bytes objects.
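As a small illustration of the file interface described above, the standard library's pickle.load can read from an io.BytesIO object, and bytes following the pickled object are ignored. The variable names are chosen for this example only.

```python
import io
import pickle

payload = {"step": 3, "values": [1.0, 2.5]}

# Build an in-memory binary stream: the pickled object followed by extra bytes
stream = io.BytesIO(pickle.dumps(payload) + b"trailing bytes")

# pickle.load detects the protocol itself and stops at the end of the object
restored = pickle.load(stream)

print(restored == payload)
```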
-
_load_hdf5(*args, **kwargs)
Loads the specified hdf5 file into DataGroup- and DataContainer-like objects; this completely recreates the hierarchic structure of the hdf5 file. The data can be loaded into memory completely, or be loaded as a proxy object.
The h5py File and Group objects will be converted to the specified DataGroup-derived objects; the Dataset objects to the specified DataContainer-derived object.
All HDF5 group or dataset attributes are carried over and are accessible under the attrs attribute of the respective dantro objects in the tree.
- Parameters
filepath (str) – The path to the HDF5 file that is to be loaded
TargetCls (type) – The group type this is loaded into
load_as_proxy (bool, optional) – If True, the leaf datasets are loaded as dantro.proxy.hdf5.Hdf5DataProxy objects. That way, the data is only loaded into memory when their .data property is accessed the first time, either directly or indirectly.
proxy_kwargs (dict, optional) – When loading as proxy, these parameters are unpacked in the __init__ call. For available arguments, see dantro.proxy.hdf5.Hdf5DataProxy.
lower_case_keys (bool, optional) – Whether to use only lower-case versions of the paths encountered in the HDF5 file.
enable_mapping (bool, optional) – If True, will use the class variables _HDF5_GROUP_MAP and _HDF5_DSET_MAP to map groups or datasets to a custom container class during loading. Which attribute to read is determined by the map_from_attr argument.
map_from_attr (str, optional) – From which attribute to read the key that is used in the mapping. If nothing is given, the class variable _HDF5_MAP_FROM_ATTR is used.
print_params (dict, optional) – Parameters for the status report. Available keys:
- level (int): how verbose to print loading info; possible values are: 0: None, 1: on file level, 2: on dataset level
- fstr1: format string level 1, receives keys name and file, which is the file path.
- fstr2: format string level 2, receives keys name, file and obj, which is an h5py.Dataset.
- Returns
- The populated root-level group, corresponding to
the base group of the file
- Return type
- Raises
ValueError – If enable_mapping, but no map attribute can be determined from the given argument or the class variable _HDF5_MAP_FROM_ATTR
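The interplay of enable_mapping, map_from_attr and the class-level maps can be pictured roughly as follows. This is a minimal sketch and not dantro's implementation: resolve_target_cls, the two demo classes, and the fallback to a default class for unknown or missing keys are assumptions made for illustration. Only the ValueError case mirrors the behaviour documented above.

```python
from typing import Mapping, Optional


class DefaultContainer:
    """Hypothetical fallback container class."""


class TimeSeriesContainer:
    """Hypothetical specialised container class."""


def resolve_target_cls(
    attrs: Mapping[str, str],
    *,
    type_map: Mapping[str, type],
    default_cls: type,
    map_from_attr: Optional[str] = None,
    class_default_attr: Optional[str] = None,
) -> type:
    """Pick a container class based on one attribute of a group or dataset."""
    # The explicit argument takes precedence over the class-level default
    attr_name = map_from_attr if map_from_attr is not None else class_default_attr
    if attr_name is None:
        raise ValueError("Mapping is enabled, but no attribute name was given.")
    # Assumption: unknown or missing keys fall back to the default class
    return type_map.get(attrs.get(attr_name), default_cls)


dset_map = {"time_series": TimeSeriesContainer}

mapped = resolve_target_cls(
    {"content": "time_series"},
    type_map=dset_map,
    default_cls=DefaultContainer,
    map_from_attr="content",
)
unmapped = resolve_target_cls(
    {"content": "something_else"},
    type_map=dset_map,
    default_cls=DefaultContainer,
    class_default_attr="content",
)
```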
-
_load_hdf5_as_dask(*args, **kwargs)
This is a shorthand for _load_hdf5() with the load_as_proxy flag set and resolve_as_dask passed as additional arguments to the proxy via proxy_kwargs.
-
_load_hdf5_proxy(*args, **kwargs)
This is a shorthand for _load_hdf5() with the load_as_proxy flag set.
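The idea behind loading as a proxy, namely deferring the actual read until the data is first accessed, can be sketched in plain Python. LazyProxySketch and expensive_load are hypothetical names for this illustration. How dantro's Hdf5DataProxy and the containers holding it actually resolve and cache data is not shown here.

```python
from typing import Any, Callable, List


class LazyProxySketch:
    """Hypothetical placeholder that loads its data on first access only."""

    def __init__(self, loader: Callable[[], Any]):
        self._loader = loader
        self._data = None
        self._resolved = False

    @property
    def data(self) -> Any:
        # Trigger the (potentially expensive) load only once
        if not self._resolved:
            self._data = self._loader()
            self._resolved = True
        return self._data


calls: List[str] = []


def expensive_load() -> List[int]:
    # Stands in for reading a dataset from disk
    calls.append("load")
    return [1, 2, 3]


proxy = LazyProxySketch(expensive_load)
calls_before_access = len(calls)  # nothing has been loaded yet

first = proxy.data   # triggers the load
second = proxy.data  # served from the cached result
```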
-
_load_numpy(*args, **kwargs)
Loads the output of numpy.save back into a NumpyDataContainer.
- Parameters
filepath (str) – Where the *.npy file is located
TargetCls (type) – The class constructor
**load_kwargs – Passed on to numpy.load, see there for kwargs
- Returns
The reconstructed NumpyDataContainer
- Return type
-
_load_numpy_binary(*args, **kwargs)
Loads the output of numpy.save back into a NumpyDataContainer.
- Parameters
filepath (str) – Where the *.npy file is located
TargetCls (type) – The class constructor
**load_kwargs – Passed on to numpy.load, see there for kwargs
- Returns
The reconstructed NumpyDataContainer
- Return type
-
_load_pickle(*args, **kwargs)
Load a pickled object.
This uses the load function defined under the _PICKLE_LOAD_FUNC class variable, which defaults to the pickle.load function.
- Parameters
filepath (str) – Where the pickle-dumped file is located
TargetCls (type) – The class constructor
**pkl_kwargs – Passed on to the load function
- Returns
The unpickled file, stored in a dantro container
- Return type
-
_load_pkl(*args, **kwargs)
Load a pickled object.
This uses the load function defined under the _PICKLE_LOAD_FUNC class variable, which defaults to the pickle.load function.
- Parameters
filepath (str) – Where the pickle-dumped file is located
TargetCls (type) – The class constructor
**pkl_kwargs – Passed on to the load function
- Returns
The unpickled file, stored in a dantro container
- Return type
-
_load_plain_text(*args, **kwargs)
Loads the content of a plain text file back into a StringContainer.
- Parameters
filepath (str) – Where the plain text file is located
TargetCls (type) – The class constructor
**load_kwargs – Passed on to open, see there for possible kwargs
- Returns
The reconstructed StringContainer
- Return type
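Since the additional keyword arguments are forwarded to the built-in open, options such as encoding can be used to control how the file is read. The helper below is a sketch of that forwarding only. read_text_sketch is a hypothetical function, not part of dantro, and it returns a plain string instead of a StringContainer.

```python
import os
import tempfile


def read_text_sketch(filepath: str, **load_kwargs) -> str:
    """Read a text file, forwarding all extra keyword arguments to open()."""
    with open(filepath, **load_kwargs) as f:
        return f.read()


with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "note.txt")
    # Write the file in Latin-1, so that reading requires a matching encoding
    with open(path, "wb") as f:
        f.write("café".encode("latin-1"))
    text = read_text_sketch(path, encoding="latin-1")
```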
-
_load_text(*args, **kwargs)
Loads the content of a plain text file back into a StringContainer.
- Parameters
filepath (str) – Where the plain text file is located
TargetCls (type) – The class constructor
**load_kwargs – Passed on to open, see there for possible kwargs
- Returns
The reconstructed StringContainer
- Return type
-
_load_xr_dataarray(*args, **kwargs)
Loads an xr.DataArray from a netcdf file into an XrDataContainer.
- Parameters
filepath (str) – Where the xarray-dumped netcdf file is located
TargetCls (type) – The class constructor
load_completely (bool, optional) – If true, will call .load() on the loaded DataArray to load it completely into memory
**load_kwargs – Passed on to xr.load_dataarray, see there for kwargs
- Returns
The reconstructed XrDataContainer
- Return type
-
_load_xr_dataset(*args, **kwargs)
Loads an xr.Dataset from a netcdf file into a PassthroughContainer.
Note
As there is no proper equivalent of a dataset in dantro (yet), and unpacking the dataset into a dantro group would reduce functionality, the PassthroughContainer is used here. It should behave almost the same as an xr.Dataset.
- Parameters
filepath (str) – Where the xarray-dumped netcdf file is located
TargetCls (type) – The class constructor
load_completely (bool, optional) – If true, will call .load() on the loaded xr.Dataset to load it completely into memory.
**load_kwargs – Passed on to xr.load_dataarray, see there for kwargs
- Returns
- The reconstructed xr.Dataset, stored in a
passthrough container.
- Return type
-
_load_yaml(*args, **kwargs)
Loads a YAML file from the given path and creates a container to store that data in.
- Parameters
filepath (str) – Where to load the yaml file from
TargetCls (type) – The class constructor
- Returns
The loaded yaml file as a container
- Return type
-
_load_yaml_to_object(*args, **kwargs)
Loads a YAML file from the given path and creates a container to store that data in.
- Parameters
filepath (str) – Where to load the yaml file from
TargetCls (type) – The class constructor
- Returns
The loaded yaml file as a container
- Return type
-
_load_yml(*args, **kwargs)
Loads a YAML file from the given path and creates a container to store that data in.
- Parameters
filepath (str) – Where to load the yaml file from
TargetCls (type) – The class constructor
- Returns
The loaded yaml file as a container
- Return type
-
_load_yml_to_object(*args, **kwargs)
Loads a YAML file from the given path and creates a container to store that data in.
- Parameters
filepath (str) – Where to load the yaml file from
TargetCls (type) – The class constructor
- Returns
The loaded yaml file as a container
- Return type
-