I'd like to share DSO, a command-line helper for building reproducible data science projects with ease.
It is an opinionated way to organize data science projects, built around data version control (DVC).
https://github.com/Boehringer-Ingelheim/dso/
Comments
- git, for code versioning
- dvc, for data versioning and tracking inputs and outputs
- jinja2, for templates
- uv, for Python dependency management
- quarto, for authoring reports
- hiyapyco, for hierarchical YAML config
- pre-commit, for linting
But I think this is useful for any kind of data analysis project.
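For anyone wondering what "hierarchical YAML config" means in practice, here is a rough sketch of the general idea in plain Python: settings defined at a higher level are inherited, and a lower level overrides only the keys it redefines. This is not DSO's or hiyapyco's actual implementation. The function `merge_configs` and the parameter names (`seed`, `plot`, `dpi`, `theme`) are made up for illustration, and plain dicts stand in for parsed YAML files.

```python
from copy import deepcopy


def merge_configs(parent: dict, child: dict) -> dict:
    """Return a new dict in which `child` values override `parent` values.

    Nested dicts are merged recursively; any other value type is replaced.
    Hypothetical helper, not part of DSO or hiyapyco.
    """
    merged = deepcopy(parent)
    for key, value in child.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Both sides are mappings: descend and merge key by key.
            merged[key] = merge_configs(merged[key], value)
        else:
            # Scalars, lists, or mismatched types: the child wins.
            merged[key] = deepcopy(value)
    return merged


# Illustrative stand-ins for a project-level and a stage-level config file.
project_params = {"seed": 42, "plot": {"dpi": 150, "theme": "light"}}
stage_params = {"plot": {"dpi": 300}}

effective = merge_configs(project_params, stage_params)
print(effective)  # {'seed': 42, 'plot': {'dpi': 300, 'theme': 'light'}}
```

The stage only redefines `plot.dpi`, so `seed` and `plot.theme` carry over from the project level, and the inputs are left unmodified.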