turba-data

Dataset package for site-specific NPK fertilizer recommendations in Morocco.

License: MIT

turba-data is the data package of the turba ecosystem. It lets you list the datasets currently available and load them directly into Python. Planned work includes expanding the catalog with datasets covering more crops, sources, and regions; adding notebooks for analysis and comparison; and introducing hosted remote downloads as the collection grows.

Install

pip install turba-data

This also installs the following dependencies:

  • pandas
  • pyarrow

Quick start

import turba_data as td

# Show the datasets currently available
print(td.list_datasets())

# Load one dataset by its id and preview the first rows
df = td.load_dataset("esa_worldcereal_morocco_cereals_medium")
print(df.head())

Datasets ship as Parquet files. To get a CSV, load the dataset and write it out with pandas:

import turba_data as td

df = td.load_dataset("esa_worldcereal_morocco_cereals_medium")
df.to_csv("esa_worldcereal_morocco_cereals_medium.csv", index=False)

First release dataset

Current snapshot for the first packaged dataset:

  • dataset id: esa_worldcereal_morocco_cereals_medium
  • file: esa_worldcereal_morocco_cereals_medium.parquet
  • rows: 132,017
  • columns: 22
  • unique sites: 44,096
  • regions: 10
  • provinces: 66
  • communes: 1,149
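
The row and column counts above can double as a quick sanity check after loading. The following is a minimal sketch that uses only those two figures; `check_snapshot` and `EXPECTED_SHAPE` are hypothetical names for illustration and are not part of turba-data.

```python
# Hypothetical helper, not part of turba-data: compare a loaded table's
# shape against the row and column counts documented in this README.
EXPECTED_SHAPE = {"rows": 132_017, "columns": 22}


def check_snapshot(shape, expected=EXPECTED_SHAPE):
    """Return {field: (actual, expected)} for every count that differs."""
    rows, columns = shape
    actual = {"rows": rows, "columns": columns}
    return {
        name: (actual[name], want)
        for name, want in expected.items()
        if actual[name] != want
    }


# Usage, assuming load_dataset returns a pandas DataFrame:
#   import turba_data as td
#   df = td.load_dataset("esa_worldcereal_morocco_cereals_medium")
#   print(check_snapshot(df.shape) or "snapshot matches")
```

An empty result means the loaded data matches the documented snapshot. A non-empty result may simply mean a newer release has updated the dataset.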

Repository structure

turba-data/
├── src/
│   └── turba_data/
│       ├── __init__.py
│       ├── core.py
│       ├── registry.json
│       └── datasets/
│           └── esa_worldcereal_morocco_cereals_medium.parquet
├── tests/
│   └── test_api.py
├── LICENSE
├── pyproject.toml
└── README.md
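
The layout suggests that registry.json maps dataset ids to files under datasets/. The registry's schema is not documented here, so the following is only a sketch of how such a lookup could work. The `{"<id>": {"file": ...}}` schema and the `resolve_dataset_path` name are assumptions for illustration, not turba-data's actual internals.

```python
import json
from pathlib import Path


def resolve_dataset_path(package_dir, dataset_id):
    """Map a dataset id to its file path using a registry.json lookup.

    Assumed (hypothetical) registry schema:
        {"<dataset id>": {"file": "<file name under datasets/>"}}
    """
    package_dir = Path(package_dir)
    registry_text = (package_dir / "registry.json").read_text(encoding="utf-8")
    registry = json.loads(registry_text)
    if dataset_id not in registry:
        # List the known ids so a typo is easy to spot
        known = ", ".join(sorted(registry)) or "none"
        raise KeyError(f"unknown dataset {dataset_id!r}; available: {known}")
    return package_dir / "datasets" / registry[dataset_id]["file"]
```

Keeping the id-to-file mapping in a registry, rather than deriving file names from ids, would let a package add remote or renamed files later without changing the ids users pass in.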