dantro.utils.data_ops module
This module implements data processing operations for dantro objects.
-
dantro.utils.data_ops.print_data(data: Any) → Any
Prints and passes on the data.
The print operation distinguishes between dantro types (in which case some more information is shown) and non-dantro types.
-
dantro.utils.data_ops.import_module_or_object(module: str = None, name: str = None)
Imports a module or an object using the specified module string and the object name.
- Parameters
module (str, optional) – A module string, e.g. numpy.random. If this is not given, it will import from the builtins module. Also, relative module strings are resolved from dantro.
name (str, optional) – The name of the object to retrieve from the chosen module and return. This may also be a dot-separated sequence of attribute names which can be used to traverse along attributes.
- Returns
The chosen module or object, i.e. the object found at <module>.<name>
- Raises
AttributeError – In cases where part of the name argument could not be resolved due to a bad attribute name.
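The lookup described above can be sketched with the standard library alone. This is a minimal illustration, not dantro's implementation: the helper name `import_by_name` is hypothetical, and the resolution of relative module strings is left out.

```python
import importlib
from functools import reduce


def import_by_name(module: str = None, name: str = None):
    """Hypothetical stand-in that mimics the documented behaviour."""
    # Fall back to the builtins module if no module string is given
    mod = importlib.import_module(module if module else "builtins")
    if not name:
        return mod
    # Traverse the dot-separated attribute names one by one;
    # a bad segment raises AttributeError
    return reduce(getattr, name.split("."), mod)


sqrt = import_by_name("math", "sqrt")     # object from a module
join = import_by_name("os", "path.join")  # attribute traversal
to_int = import_by_name(name="int")       # taken from builtins
```

Because each name segment is resolved with `getattr`, a misspelled segment surfaces as an `AttributeError`, matching the Raises entry above.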
-
dantro.utils.data_ops.create_mask(data: xarray.core.dataarray.DataArray, operator_name: str, rhs_value: float) → xarray.core.dataarray.DataArray
Given the data, returns a binary mask by applying the following comparison: data <operator> rhs value.
- Parameters
data (xr.DataArray) – The data to apply the comparison to. This is the lhs of the comparison.
operator_name (str) – The name of the binary operator function as registered in the BOOLEAN_OPERATORS constant.
rhs_value (float) – The right-hand-side value
- Raises
KeyError – On invalid operator name
- Returns
Boolean mask
- Return type
xr.DataArray
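As a rough sketch of the comparison, the snippet below looks up an operator by name and applies it to a DataArray. The `BOOLEAN_OPS` mapping and the `make_mask` helper are assumptions made for illustration; the names actually registered in BOOLEAN_OPERATORS may differ.

```python
import operator

import xarray as xr

# Assumed subset of operator names; not the actual BOOLEAN_OPERATORS content
BOOLEAN_OPS = {
    "==": operator.eq,
    "!=": operator.ne,
    "<": operator.lt,
    "<=": operator.le,
    ">": operator.gt,
    ">=": operator.ge,
}


def make_mask(data: xr.DataArray, operator_name: str, rhs_value: float):
    """Hypothetical helper: evaluates `data <operator> rhs_value`."""
    comparison = BOOLEAN_OPS[operator_name]  # KeyError on an unknown name
    return comparison(data, rhs_value)


data = xr.DataArray([1.0, 2.0, 3.0], dims=("x",))
mask = make_mask(data, ">", 1.5)  # boolean DataArray with the same shape
```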
-
dantro.utils.data_ops.where(data: xarray.core.dataarray.DataArray, operator_name: str, rhs_value: float) → xarray.core.dataarray.DataArray
Filter elements from the given data according to a condition. Only those elements where the condition is fulfilled are not masked.
NOTE This leads to a dtype change to float.
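The dtype change can be seen with plain xarray. The sketch below uses `xr.DataArray.where` directly and assumes the masked entries are filled with NaN, which is what forces the conversion to float; it does not call the dantro operation itself.

```python
import xarray as xr

data = xr.DataArray([1, 2, 3], dims=("x",))  # integer data

# Keep only the elements for which the condition holds; all other
# elements are masked with NaN, which requires a float dtype
filtered = data.where(data > 1)

print(data.dtype, "->", filtered.dtype)
```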
-
dantro.utils.data_ops.count_unique(data) → xarray.core.dataarray.DataArray
Applies np.unique to the given data and constructs an xr.DataArray for the results.
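One way to picture the result is sketched below. The dimension name `unique` and the array name `counts` are assumptions chosen for this example and need not match what the operation actually returns.

```python
import numpy as np
import xarray as xr

data = np.array([3, 1, 3, 2, 3, 1])

# np.unique returns the sorted unique values and, on request, their counts
unique, counts = np.unique(data, return_counts=True)

# Label the counts by the unique values (dimension name is an assumption)
result = xr.DataArray(
    counts, dims=("unique",), coords={"unique": unique}, name="counts"
)
```

Labelling the counts with the unique values allows selecting by value, e.g. `result.sel(unique=3)`.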
-
dantro.utils.data_ops.populate_ndarray(*objs, shape: tuple, dtype: str = 'float', order: str = 'C') → numpy.ndarray
Populates an empty np.ndarray of the given dtype with the objects.
- Parameters
*objs – The objects to add to the array
shape (tuple) – The shape of the new array
dtype (str, optional) – Data type of the new array
order (str, optional) – Order of the new array
- Returns
The newly created and populated array
- Return type
np.ndarray
- Raises
ValueError – If the number of given objects did not match the array size
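A minimal sketch of such a population step follows. The helper `fill_array` is hypothetical, and filling the array in C-style index order is an assumption of this example, not a documented property of the operation.

```python
import numpy as np


def fill_array(*objs, shape: tuple, dtype: str = "float", order: str = "C"):
    """Hypothetical helper: creates an empty array and fills it one by one."""
    out = np.empty(shape, dtype=dtype, order=order)
    if len(objs) != out.size:
        raise ValueError(
            f"Got {len(objs)} objects for an array of size {out.size}!"
        )
    # Assign element by element so that arbitrary objects (dtype=object)
    # are stored as they are instead of being unpacked by numpy
    for idx, obj in zip(np.ndindex(*shape), objs):
        out[idx] = obj
    return out


arr = fill_array(1, 2, 3, 4, 5, 6, shape=(2, 3))
objs = fill_array([1, 2], "foo", shape=(2,), dtype="object")
```

Element-wise assignment matters mostly for `dtype="object"`, where passing nested sequences to `np.array` would otherwise lead numpy to interpret them as additional array dimensions.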
-
dantro.utils.data_ops.multi_concat(arrs: numpy.ndarray, *, dims: Sequence[str]) → xarray.core.dataarray.DataArray
Concatenates xr.Dataset or xr.DataArray objects using xr.concat. This function expects the xarray objects to be pre-aligned inside the numpy object array arrs, with the number of dimensions matching the number of concatenation operations desired. The position inside the array carries information on where the objects that are to be concatenated are placed inside the higher dimensional coordinate system.
Through multiple concatenation, the dimensionality of the contained objects is increased by dims, while their dtype can be maintained.
For the sequential application of xr.concat along the outer dimensions, the custom dantro.tools.apply_along_axis() is used.
- Parameters
arrs (np.ndarray) – The array containing xarray objects which are to be concatenated. Each array dimension should correspond to one of the given dims. For each of the dimensions, the xr.concat operation is applied along the axis, effectively reducing the dimensionality of arrs to a scalar and increasing the dimensionality of the contained xarray objects until they additionally contain the dimensions specified in dims.
dims (Sequence[str]) – A sequence of dimension names that is assumed to match the dimension names of the array. During each concatenation operation, the name is passed along to xr.concat where it is used to select the dimension of the content of arrs along which concatenation should occur.
- Raises
ValueError – If the number of dimension names does not match the number of data dimensions.
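The idea of repeated concatenation can be sketched recursively. Note that this illustration does not use dantro.tools.apply_along_axis(); the helper `concat_nested` is hypothetical and only mirrors the described outcome for a small object array.

```python
import numpy as np
import xarray as xr


def concat_nested(arrs: np.ndarray, dims):
    """Hypothetical helper: concatenates along one name per array axis."""
    if arrs.ndim != len(dims):
        raise ValueError(
            f"Got {len(dims)} dimension names for a {arrs.ndim}-dim. array!"
        )

    def _concat(sub, remaining):
        dim, *rest = remaining
        # Resolve the inner axes first, then concatenate along this one
        items = [_concat(s, rest) for s in sub] if rest else list(sub)
        return xr.concat(items, dim=dim)

    return _concat(arrs, list(dims))


# Build a 2x3 object array of one-dimensional DataArrays
arrs = np.empty((2, 3), dtype=object)
for i in range(2):
    for j in range(3):
        arrs[i, j] = xr.DataArray([10 * i + j, 10 * i + j + 0.5], dims=("t",))

combined = concat_nested(arrs, dims=("a", "b"))
```

Here, the position `(i, j)` of an object inside `arrs` determines where its data ends up along the new `a` and `b` dimensions of the combined array.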
-
dantro.utils.data_ops.merge(arrs: Union[Sequence[Union[xarray.core.dataarray.DataArray, xarray.core.dataset.Dataset]], numpy.ndarray], *, reduce_to_array: bool = False, **merge_kwargs) → Union[xarray.core.dataset.Dataset, xarray.core.dataarray.DataArray]
Merges the given sequence of xarray objects into an xr.Dataset.
As a convenience, this also allows passing a numpy object array containing the xarray objects. Furthermore, if the resulting Dataset contains only a single data variable, that variable can be extracted as a DataArray which is then the return value of this operation.
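The merge-then-reduce behaviour can be approximated with xr.merge directly. This is a sketch under the assumption that the merge keyword arguments are forwarded to xr.merge; the `compat` and `join` values are set explicitly here for the sake of the example.

```python
import xarray as xr

# Two pieces of the same variable, defined on disjoint coordinates
a = xr.DataArray([1.0, 2.0], dims=("x",), coords={"x": [0, 1]}, name="val")
b = xr.DataArray([3.0], dims=("x",), coords={"x": [2]}, name="val")

dset = xr.merge([a, b], compat="no_conflicts", join="outer")

# With only a single data variable, extract it as a DataArray
if len(dset.data_vars) == 1:
    result = dset[list(dset.data_vars)[0]]
else:
    result = dset
```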
-
dantro.utils.data_ops.expand_dims(d: Any, *, dim: dict = None, **kwargs) → xarray.core.dataarray.DataArray
Expands the dimensions of the given object.
If the object does not support the expand_dims method, an attempt is made to convert it to an xr.DataArray first.
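The fallback conversion might look roughly as follows. The helper name `expand` is hypothetical, and checking for the method via `hasattr` is an assumption about how support is detected.

```python
import xarray as xr


def expand(d, *, dim: dict = None, **kwargs):
    """Hypothetical helper mirroring the described fallback."""
    if not hasattr(d, "expand_dims"):
        # E.g. scalars, lists, or numpy arrays: wrap them first
        d = xr.DataArray(d)
    return d.expand_dims(dim, **kwargs)


# A plain float gains a new dimension `t` with the coordinate value 0
out = expand(3.0, dim={"t": [0]})
```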
-
dantro.utils.data_ops.register_operation(*, name: str, func: Callable, skip_existing: bool = False, overwrite_existing: bool = False) → None
Adds an entry to the shared OPERATIONS registry.
- Parameters
name (str) – The name of the operation
func (Callable) – The callable
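skip_existing (bool, optional) – Whether to skip registration if an operation of that name already exists
overwrite_existing (bool, optional) – Whether to overwrite an already existing operation of that name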
- Raises
TypeError – On invalid name or non-callable for the func argument
ValueError – On already existing operation name and no skipping or overwriting enabled.
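The registration rules listed above can be sketched with a plain dictionary. The `OPS` dict and the `register` function are hypothetical stand-ins, and giving `skip_existing` precedence over `overwrite_existing` is an assumption of this example.

```python
from typing import Callable

OPS = {}  # hypothetical stand-in for the shared OPERATIONS registry


def register(*, name: str, func: Callable,
             skip_existing: bool = False,
             overwrite_existing: bool = False) -> None:
    if not isinstance(name, str):
        raise TypeError(f"Operation name needs to be a string, got {name!r}")
    if not callable(func):
        raise TypeError(f"Operation '{name}' needs a callable, got {func!r}")

    if name in OPS:
        if skip_existing:
            return  # assumption: skipping takes precedence over overwriting
        if not overwrite_existing:
            raise ValueError(f"Operation name '{name}' already exists!")
    OPS[name] = func


register(name="double", func=lambda x: 2 * x)
register(name="double", func=lambda x: 3 * x, skip_existing=True)  # no-op
```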
-
dantro.utils.data_ops.apply_operation(op_name: str, *op_args, _log_level: int = 5, **op_kwargs) → Any
Applies an operation with the given arguments and returns its result.
- Parameters
op_name (str) – The name of the operation to carry out; needs to be part of the OPERATIONS database.
*op_args – The positional arguments to the operation
_log_level (int, optional) – Log level of the log messages created by this function.
**op_kwargs – The keyword arguments to the operation
- Returns
The result of the operation
- Return type
Any
- Raises
KeyError – On invalid operation name. This also suggests possible other names that might match.
Exception – On failure to apply the operation, preserving the original exception.
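A minimal sketch of the lookup and error handling follows. The `OPS` dict and `apply_op` are hypothetical; chaining a new exception via `raise ... from` is only one possible way to preserve the original exception, and logging is omitted.

```python
import difflib

# Hypothetical stand-in for the OPERATIONS database
OPS = {"add": lambda a, b: a + b, "negate": lambda a: -a}


def apply_op(op_name: str, *op_args, **op_kwargs):
    try:
        op = OPS[op_name]
    except KeyError as err:
        # Suggest similarly spelled operation names
        close = difflib.get_close_matches(op_name, list(OPS), n=5)
        raise KeyError(
            f"No operation '{op_name}' registered! "
            f"Did you mean: {', '.join(close)}?"
        ) from err

    try:
        return op(*op_args, **op_kwargs)
    except Exception as exc:
        # The original exception stays available as __cause__
        raise RuntimeError(f"Operation '{op_name}' failed: {exc}") from exc


total = apply_op("add", 1, 2)
```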
-
dantro.utils.data_ops.available_operations(*, match: str = None, n: int = 5) → Sequence[str]
Returns all available operation names or a fuzzy-matched subset of them.
- Parameters
match (str, optional) – If given, fuzzy-matches the names and only returns close matches to this name.
n (int, optional) – Number of close matches to return. Passed on to difflib.get_close_matches
- Returns
All available operation names or the matched subset. The sequence is sorted alphabetically.
- Return type
Sequence[str]
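The fuzzy matching can be tried out with difflib directly. In this sketch, the `OPS` names and the `available` helper are hypothetical, and sorting the matched subset alphabetically is one reading of the description above.

```python
import difflib

# Hypothetical operation names standing in for the OPERATIONS database
OPS = {"increment": None, "decrement": None, "print": None, "import": None}


def available(*, match: str = None, n: int = 5):
    if match is None:
        return sorted(OPS)
    # get_close_matches returns at most n names that are similar enough
    return sorted(difflib.get_close_matches(match, list(OPS), n=n))


all_names = available()
close_names = available(match="incremnt")  # tolerates the typo
```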