Welcome to dantro’s documentation!

dantro – from data and dentro (Greek for tree) – is a Python package that provides a uniform interface for hierarchically structured and semantically heterogeneous data. It is built around three main features:

  • data handling: loading heterogeneous data into a tree-like data structure, providing a uniform interface to it

  • data transformation: performing arbitrary operations on the data, if necessary using lazy evaluation

  • data visualization: creating a visual representation of the processed data

Together, these stages constitute a data processing pipeline: an automated sequence of predefined, configurable operations. Akin to a Continuous Integration pipeline, a data processing pipeline provides a uniform, consistent, and easily extensible infrastructure that contributes to more efficient and reproducible workflows. This is especially beneficial in a scientific context, for instance when handling data that was generated by computer simulations.
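The three stages can be illustrated with a small, library-independent sketch. It does not use dantro's API: the names `get_path`, `LazyTransform`, and `render_bar` are hypothetical and only show the idea of nested data addressed by path, a transformation deferred until its result is requested, and a simple textual visualization of that result.

```python
from functools import reduce

# Stage 1 (data handling): heterogeneous data in one tree, addressed by path.
tree = {
    "config": {"seed": 42},
    "runs": {
        "run0": {"population": [10, 12, 15]},
        "run1": {"population": [10, 9, 7]},
    },
}


def get_path(data, path):
    """Return the entry at a '/'-separated path (hypothetical helper)."""
    return reduce(lambda node, key: node[key], path.split("/"), data)


# Stage 2 (data transformation): defer the computation until it is requested.
class LazyTransform:
    def __init__(self, func, *paths):
        self.func = func
        self.paths = paths
        self._cache = None
        self._done = False

    def compute(self, data):
        # Evaluate only on first request, then reuse the cached result.
        if not self._done:
            self._cache = self.func(*(get_path(data, p) for p in self.paths))
            self._done = True
        return self._cache


mean_final = LazyTransform(
    lambda a, b: (a[-1] + b[-1]) / 2,
    "runs/run0/population",
    "runs/run1/population",
)


# Stage 3 (data visualization): a minimal text "plot" of the result.
def render_bar(label, value, scale=1):
    return f"{label}: {'#' * int(value * scale)} ({value})"


result = mean_final.compute(tree)
line = render_bar("mean final population", result)
print(line)
```

In this sketch nothing is computed when `mean_final` is defined; the work happens only when `compute` is called, which is the essence of lazy evaluation.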

dantro is meant to be integrated into a project and used to set up such a data processing pipeline. It is designed to be easily customizable to the requirements of that project, even if the involved data is hierarchically structured or semantically heterogeneous. Furthermore, all operations can be specified via YAML configuration files; the resulting pipeline can then be controlled entirely through these files, without requiring code changes.
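The idea of a configuration-driven pipeline can be sketched as follows. This is an illustration only, not dantro's configuration schema: the keys `operation` and `args` and the `OPERATIONS` registry are hypothetical. The dictionary stands in for the content of a parsed YAML file; changing it changes what the pipeline does, while the Python code stays the same.

```python
# Hypothetical registry mapping operation names to plain Python callables.
OPERATIONS = {
    "scale": lambda values, factor: [v * factor for v in values],
    "offset": lambda values, amount: [v + amount for v in values],
    "total": lambda values: sum(values),
}


def run_pipeline(data, steps):
    """Apply each configured step in order, passing the result along."""
    for step in steps:
        func = OPERATIONS[step["operation"]]
        data = func(data, **step.get("args", {}))
    return data


# Stand-in for a parsed YAML configuration file (keys are assumptions).
config = {
    "steps": [
        {"operation": "scale", "args": {"factor": 2}},
        {"operation": "offset", "args": {"amount": 1}},
        {"operation": "total"},
    ]
}

output = run_pipeline([1, 2, 3], config["steps"])
print(output)
```

Adding, removing, or reordering entries under `steps` alters the result without touching `run_pipeline`, which is the property the paragraph above describes.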

The dantro package is open source software released under the LGPLv3+ license. It was developed alongside the Utopia project (a modelling framework for complex and adaptive systems), but is an independent package.

Hint

A paper describing the motivation and scope of dantro can be found in the Journal of Open Source Software.

For a real-world example of how dantro is used, make sure to check out the Utopia project, where the dantro-based data processing pipeline is fed with the output of complex systems models.

Note

If you find any errors in this documentation or would like to contribute to the project, please visit the project page.