If you're tired of setting up the same directory and file structure for your new Python projects again and again, then this might be for you ;-)
This repository provides a "template" of a directory structure for small to medium-sized (scientific) projects, making use of CookieCutter, a templating engine for project structures. Check out the links at the bottom of the page to create your own CookieCutter or use this one to start your project. Also, feel free to fork the repository and adjust it to your own needs.
- Cookiecutter as your productivity booster
- Usage
- Features
- Sources of inspiration
- Maintainer & Contribution
Running cookiecutter with this repository creates a new directory with a pre-defined structure and some default files, so you are all set to start a new Python project. No need to manually create the same files and directory structure over and over again. This includes:
- code that is importable from every place in your environment
- automatically resolved paths to the project's root and the directories for data, plots, logs, etc.
- commands to run automated unit tests, create documentation of your code, etc.
- creating a nice HTML representation of your project's documentation, including Jupyter notebooks, docstrings, etc.
- and so on...
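The automatic path resolution mentioned in the list above could work roughly as follows. This is a minimal sketch, not the template's actual implementation: the helper names `find_project_root` and `project_dirs` are hypothetical, and using `pyproject.toml` as the root marker is an assumption.

```python
from pathlib import Path


def find_project_root(start: Path, marker: str = "pyproject.toml") -> Path:
    """Walk upwards from `start` until a directory containing `marker` is found."""
    start = start.resolve()
    for candidate in (start, *start.parents):
        if (candidate / marker).is_file():
            return candidate
    raise FileNotFoundError(f"No {marker} found at or above {start}")


def project_dirs(root: Path) -> dict[str, Path]:
    """Derive a few standard sub-directories from the project root (assumed names)."""
    return {name: root / name for name in ("data", "logs", "reports")}
```

Calling `find_project_root(Path.cwd())` from anywhere inside a project would then give the same root, so scripts and notebooks need no hard-coded relative paths.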
- For first-time use: if you are using Git for the first time on your machine, make sure to set your global configuration:

      $ git config --global user.name "John Doe"
      $ git config --global user.email johndoe@example.com

- just (optional but recommended)
- GitHub account (optional)
The easiest way to get started is using uv.
Make sure you have uv installed, and then run the following command to create a new project from this template:
$ uvx cookiecutter gh:markusritschel/cookiecutter-pyproject
Alternatively, without uv
Install CookieCutter via pip or conda, and then run the following command to create a new project from this template:
$ cookiecutter gh:markusritschel/cookiecutter-pyproject
Once you have answered the questions, your directory structure will be created and you're set, ready to start working on your new project.
Tip
For further information, see also the README of your new project.
You may also want to check out the justfile targets (simply type just in your terminal to get a command overview).
Also see the next section.
This is a boilerplate for Python projects, both for package development and (scientific) data projects. It comes with a set of tools supporting your development workflow. It also provides an optional structure for research projects (see the corresponding section below and the documentation for details).
| Purpose | Tool | Comment |
|---|---|---|
| Dependency management | uv | A modern and blazingly fast dependency manager for Python |
| Version control | Git | A popular version control system (VCS), automatically initialized |
| Documentation | Sphinx | A popular and versatile docs generator for Python |
| Code quality | Ruff | A fast linter and code formatter for Python |
| Testing | Pytest | A powerful testing framework for Python |
| Git hooks | pre-commit | Automatically enforces checks (e.g. lockfile sync) before commits |
| Task automation | Just | A modern taskrunner, simplifying your workflow |
| Test GitHub Actions | Act | A tool to run GitHub Actions locally |
just is a modern taskrunner alternative to Make. It can help you keep your workflow clean, simple, memorable, and reproducible. You can add complex commands such as

    python scripts/raw_data_processing.py -i data/input_data.csv --clean-data --pre-process-data -o data/output.csv

to your justfile

    # Process raw data
    process-raw-data:
        python scripts/raw_data_processing.py -i data/input_data.csv --clean-data --pre-process-data -o data/output.csv

and then simply run just process-raw-data to execute the command.
For more available commands, simply execute just in your terminal in your newly created project.
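To make the recipe above concrete, here is a hedged sketch of how a script like `scripts/raw_data_processing.py` might parse the flags shown in the command. The script's real contents are not part of this README; the long option names `--input` and `--output` and the help texts are assumptions made for illustration.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser accepting the flags used in the example command."""
    parser = argparse.ArgumentParser(description="Process raw data (illustrative sketch)")
    # The long names --input/--output are assumed; the README only shows -i and -o.
    parser.add_argument("-i", "--input", required=True, help="path to the input CSV file")
    parser.add_argument("-o", "--output", required=True, help="path to the output CSV file")
    parser.add_argument("--clean-data", action="store_true", help="enable the cleaning step")
    parser.add_argument("--pre-process-data", action="store_true", help="enable pre-processing")
    return parser


# Parse the same arguments that the justfile recipe passes on the command line.
args = build_parser().parse_args(
    ["-i", "data/input_data.csv", "--clean-data", "--pre-process-data", "-o", "data/output.csv"]
)
print(args.input, args.output, args.clean_data, args.pre_process_data)
```

Wrapping such a long invocation in a named recipe means the exact flag combination lives in one place and does not have to be remembered.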
Check out the corresponding page in the documentation.
The template ships with a Sphinx-powered documentation setup. Write your documentation in Markdown while keeping the flexibility and customizability of Sphinx. For example, reference literature, parse your docstrings, and cross-link your own and third-party APIs, even from within your docstrings.
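As an illustration of cross-linking from within a docstring, the sketch below uses standard Sphinx Python-domain roles. It assumes that docstrings are parsed as reStructuredText by `sphinx.ext.autodoc` and that `sphinx.ext.intersphinx` maps the Python standard library; this README does not confirm either setting, and the function `count_lines` is hypothetical.

```python
from pathlib import Path


def count_lines(path: Path) -> int:
    """Count the lines of a text file.

    :param path: Location of the file, given as a :class:`pathlib.Path`.
    :returns: Number of lines in the file.

    .. seealso:: :func:`open` in the Python standard library.
    """
    # Iterate lazily so that large files are not loaded into memory at once.
    with path.open(encoding="utf-8") as handle:
        return sum(1 for _ in handle)
```

Under those assumptions, the `:class:` and `:func:` roles would render as links to the referenced objects in the generated HTML.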
Check out the corresponding page in the documentation.
The project follows a src layout, which means that the package's source code resides in a subdirectory of src. This follows the Good Integration Practices from pytest.org and is a common and recommended layout for Python projects; it helps avoid issues with imports and ensures that the installed version of the package is always used during development and testing.
Check out the corresponding page in the documentation.
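One way to see the effect of the src layout is to check which file a package would be imported from. The following is a small sketch using only the standard library; the helper name `module_origin` is hypothetical and not part of the template.

```python
import importlib.util
from pathlib import Path
from typing import Optional


def module_origin(name: str) -> Optional[Path]:
    """Return the file a top-level module would be loaded from, or None if unknown."""
    spec = importlib.util.find_spec(name)
    # Built-in and frozen modules have no file on disk that could be reported.
    if spec is None or spec.origin is None or spec.origin in ("built-in", "frozen"):
        return None
    return Path(spec.origin)


print(module_origin("json"))
```

For a package installed from a src layout, the reported path should point into the environment's site-packages, or into `src/` for an editable install, rather than to a same-named directory in the repository root.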
directory structure
├── assets             <- A place for assets like shapefiles or config files
│
├── data               <- Contains all data used for the analyses in this project.
│   │                     The sub-directories can be links to the actual location of your data.
│   │                     However, they should never be under version control! (-> .gitignore)
│   ├── interim        <- Intermediate data that have been transformed from the raw data
│   ├── processed      <- The final, processed data used for the actual analyses
│   └── raw            <- The original, immutable(!) data
│
├── docs               <- The technical documentation (default engine: Sphinx; but feel free to
│                         use MkDocs, Jupyter-Book or anything similar).
│                         This should contain only documentation of the code and the assets.
│                         A report of the actual project should be placed in `reports/book`.
│
├── logs               <- Storage location for the log files being generated by scripts
│
├── notebooks          <- Jupyter Notebooks. Follow a naming convention, such as a number (for ordering),
│   │                     and a short `-` or `_` delimited description, e.g. `01-initial-analyses`
│   ├── _paired        <- Optional location for your paired Jupyter Notebook files
│   ├── exploratory    <- Notebooks for exploratory tasks
│   └── reports        <- Notebooks generating reports and figures
│
├── references         <- Data descriptions, manuals, and all other explanatory materials
│
├── reports            <- Generated reports (e.g. HTML, PDF, LaTeX, etc.)
│   ├── figures        <- Generated graphics and figures to be used in reporting
│   └── README.md      <- More information about Jupyter-Book and MyST-MD
│
├── scripts            <- High-level scripts that use (low-level) source code from `src/`
├── src                <- Source code (and only source code!) for use in this project
│   ├── core           <- Provides some core functionalities
│   ├── tests          <- Contains tests for the code in `src/`
│   └── __init__.py    <- Makes src a Python module and provides some standard variables
│
├── .env               <- In this file, specify all your custom environment variables
│                         Keep this out of version control! (i.e. have it in your .gitignore)
├── .gitignore         <- Here, list all the files and folders (patterns allowed) that you want to
│                         keep out of git version control.
├── CHANGELOG.md       <- All major changes should go in there
├── CITATION.cff       <- The citation information for this project (update your ORCID ID!)
├── environment.yml    <- The conda environment file for reproducing the environment
├── LICENSE            <- The license used for this project
├── Makefile           <- A self-documenting Makefile for standard CLI tasks
├── pyproject.toml     <- Configuration file for the project
├── README.md          <- The top-level README of this project
└── requirements.txt   <- The requirements file for reproducing the analysis environment, e.g.
                          generated with `pip freeze > requirements.txt`
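The notebook naming convention from the tree above (a number for ordering, followed by a short `-` or `_` delimited description) can be checked mechanically. The sketch below assumes one possible reading of that convention, namely a two-digit prefix and lowercase words; the template itself does not prescribe this exact pattern, and `follows_convention` is a hypothetical helper.

```python
import re

# Assumed pattern: two digits, then one or more lowercase words joined by "-" or "_".
NOTEBOOK_NAME = re.compile(r"\d{2}(?:[-_][a-z0-9]+)+")


def follows_convention(stem: str) -> bool:
    """Check a notebook file name (without its extension) against the assumed pattern."""
    return NOTEBOOK_NAME.fullmatch(stem) is not None


print(follows_convention("01-initial-analyses"))
```

A check like this could be run over `notebooks/` before committing, so that the ordering prefix stays consistent as the project grows.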
Further reading:
- https://packaging.python.org/en/latest/discussions/src-layout-vs-flat-layout/
- https://blog.ionelmc.ro/2014/05/25/python-packaging/#the-structure
Some great sources of inspiration and orientation when I created this template:
- A great article on how to structure your scientific data projects: https://drivendata.github.io/cookiecutter-data-science
- https://coderefinery.github.io/reproducible-research/
- https://github.com/drivendata/cookiecutter-data-science
- https://github.com/audreyfeldroy/cookiecutter-pypackage
- https://github.com/hackalog/easydata
- https://github.com/aubricus/cookiecutter-python-package
- Martin, R. C. (Ed.). (2009). Clean code: A handbook of agile software craftsmanship. Prentice Hall.
- Croucher, M., Graham, L., James, T., Krystalli, A., & Michonneau, F. (2019). Reproducible Code (Guides to Better Science). British Ecological Society. https://www.britishecologicalsociety.org/publications/guides-to/
markusritschel maintains this project.
Issues & pull-requests accepted.
© Markus Ritschel 2021–2026