# CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

## Overview

The Clade Ontology is an ontology of exemplar phyloreferences curated from peer-reviewed publications. It stores phyloreferences (computable clade definitions) as PHYX files that are converted to OWL/JSON-LD for reasoning with an OWL reasoner (JPhyloRef + JFact++).

## Commands

```bash
npm test                # Lint + run all Mocha tests (requires Node.js)
npm run lint            # ESLint on test/, phyx2ontology/, and regnum2phyx/
npm run mocha           # Run tests without linting
npm run build-ontology  # Convert all phyx/ files into CLADO.json
```

**Run a single test file:**
```bash
npx mocha test/test_phyx.js
npx mocha test/regnum2phyx/exec.js
```

**Enable slow tests (requires Java + JPhyloRef):**
```bash
RUN_SLOW_TESTS=1 npm test
# Optional env vars: JVM_ARGS, JPHYLOREF_ARGS, MAX_INTERNAL_SPECIFIERS, MAX_EXTERNAL_SPECIFIERS
```

**Download test dependencies (JPhyloRef JAR and Phyx JSON schema):**
```bash
cd test && bash download.sh
```

**Build the Clade Ontology:**
```bash
node phyx2ontology/phyx2ontology.js phyx/ > CLADO.json
node phyx2ontology/phyx2ontology.js phyx/ --no-phylogenies > CLADO.json
```

**Convert a PhyloRegnum database dump to Phyx files:**
```bash
node regnum2phyx/regnum2phyx.js dump.json -o output_dir/
node regnum2phyx/regnum2phyx.js dump.json -o output_dir/ --filenames regnum-id
```

## Architecture

### Data Pipeline

```
PhyloRegnum DB dump (JSON)
  → regnum2phyx.js → PHYX files (.json in phyx/)
  → phyx2ontology.js → CLADO.json (OWL/JSON-LD)
  → JPhyloRef (Java) → reasoning/test results
```

### Key Directories

- **`phyx/`**: Curated PHYX files organized by source:
  - `from_papers/`: Phyloreferences from peer-reviewed papers (e.g., `Brochu 2003/`)
  - `phylonym/`: Files from the PhyloNym database
  - `encrypted/`: Git-crypt encrypted files (skipped during processing)
- **`phyx2ontology/phyx2ontology.js`**: Converts PHYX files to a single Clade Ontology JSON-LD. Reads PHYX files, wraps them via `@phyloref/phyx`, and emits JSON-LD to STDOUT.
- **`regnum2phyx/regnum2phyx.js`**: Converts PhyloRegnum database dumps (JSON arrays) into individual PHYX files. Handles specifiers, citations (BibJSON format), and author formatting.
- **`test/`**: Mocha test suite:
  - `test_phyx.js`: Validates all PHYX files in `phyx/` (JSON schema + JSON-LD conversion). Skips git-crypt encrypted files.
  - `test_phyx2ontology.js`: Smoke-tests `phyx2ontology.js` execution on all Phyx files.
  - `regnum2phyx/exec.js`: Tests `regnum2phyx.js` against example dumps in `test/regnum2phyx/examples/` and compares output against `test/regnum2phyx/expected/`.
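
Both the converter and the tests operate on every PHYX file under `phyx/`, which is nested by source. The sketch below shows one way such files could be collected, assuming a recursive directory walk that selects files by their `.json` extension. `findPhyxFiles` is a hypothetical helper for illustration, not a function from this repository, and it does not handle the git-crypt check.

```javascript
const fs = require('fs');
const path = require('path');

// Hypothetical helper: recursively collect every .json file under `dir`.
// Assumes PHYX files are identified purely by the .json extension.
function findPhyxFiles(dir) {
  return fs.readdirSync(dir, { withFileTypes: true })
    // Sort entries by name so the result order is predictable.
    .sort((a, b) => a.name.localeCompare(b.name))
    .flatMap((entry) => {
      const fullPath = path.join(dir, entry.name);
      // Descend into subdirectories such as from_papers/.
      if (entry.isDirectory()) return findPhyxFiles(fullPath);
      return entry.name.toLowerCase().endsWith('.json') ? [fullPath] : [];
    });
}
```

Calling `findPhyxFiles('phyx/')` from the repository root would return a flat array of paths that a script could then read one by one.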

### PHYX Format

PHYX files are JSON with:
- `@context`: points to `http://www.phyloref.org/phyx.js/context/v0.2.0/phyx.json`
- `phylorefs`: array of phyloreference objects with `internalSpecifiers`, `externalSpecifiers`, `cladeDefinition`, etc.
- `phylogenies`: array of phylogeny objects with Newick strings
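
To make the structure concrete, here is a hedged sketch of a minimal PHYX-shaped object and a helper that summarizes it. The top-level keys come from the list above. The `newick` property name, the empty specifier objects, the placeholder definition text, and the `summarizePhyx` helper are illustrative assumptions. Real files must satisfy the Phyx JSON schema, which this sketch does not check.

```javascript
// Illustrative only: a PHYX-shaped object using the top-level keys described above.
const examplePhyx = {
  '@context': 'http://www.phyloref.org/phyx.js/context/v0.2.0/phyx.json',
  phylorefs: [
    {
      cladeDefinition: 'Placeholder clade definition text.',
      internalSpecifiers: [{}, {}], // specifier contents omitted in this sketch
      externalSpecifiers: [{}],
    },
  ],
  phylogenies: [
    { newick: '((A, B), C);' }, // assumption: the Newick string is stored under `newick`
  ],
};

// Hypothetical helper: count phyloreferences, specifiers, and phylogenies in a document.
function summarizePhyx(doc) {
  const phylorefs = doc.phylorefs || [];
  // Sum the length of the array stored under `key` across all phyloreferences.
  const count = (key) => phylorefs.reduce((n, p) => n + (p[key] || []).length, 0);
  return {
    phylorefs: phylorefs.length,
    internalSpecifiers: count('internalSpecifiers'),
    externalSpecifiers: count('externalSpecifiers'),
    phylogenies: (doc.phylogenies || []).length,
  };
}

console.log(summarizePhyx(examplePhyx));
```

A summary like this is only a quick sanity check on a file's shape; schema validation and JSON-LD conversion remain the job of `test_phyx.js`.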

### Key Library

`@phyloref/phyx` (npm) provides `PhylorefWrapper`, `PhylogenyWrapper`, and `PhyxWrapper` classes that convert PHYX objects to JSON-LD/OWL class restrictions.

### Linting

ESLint uses the `airbnb-base` config plus the `mocha` plugin. Trailing commas are required on multiline arrays and objects, but not on functions. Code is written in ES6 syntax.
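
As a rough illustration of the style rules above, an ESLint configuration along these lines would enforce them. This is a sketch of an equivalent `.eslintrc.js` written for this guide, not a copy of the repository's actual configuration, whose file format, `env` settings, and exact rule options may differ.

```javascript
// Sketch of an .eslintrc.js matching the conventions described above (assumed, not copied).
const eslintConfig = {
  extends: 'airbnb-base',
  plugins: ['mocha'],
  env: { node: true, mocha: true, es6: true },
  rules: {
    // Require trailing commas on multiline arrays/objects, but never on functions.
    'comma-dangle': ['error', {
      arrays: 'always-multiline',
      objects: 'always-multiline',
      functions: 'never',
    }],
  },
};

module.exports = eslintConfig;
```

Under these settings, a multiline object literal must end its last property with a comma, while a multiline function call must not end its last argument with one.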

### Git-Crypt

Some PHYX files in `phyx/encrypted/` are git-crypt encrypted. Both `phyx2ontology.js` and `test_phyx.js` detect these by checking for the `\x00GITCRYPT` magic bytes and skip them gracefully.
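
The magic-byte check described above can be sketched as follows. This is a minimal illustration, assuming the check compares the first bytes of the raw file against `\x00GITCRYPT`. The names `isGitCryptEncrypted` and `readPhyxOrSkip` are hypothetical and are not taken from `phyx2ontology.js` or `test_phyx.js`.

```javascript
const fs = require('fs');

// Magic bytes named above: a NUL byte followed by the ASCII text "GITCRYPT".
const GIT_CRYPT_MAGIC = Buffer.from('\x00GITCRYPT', 'latin1');

// Hypothetical helper: true if the buffer starts with the git-crypt magic bytes.
function isGitCryptEncrypted(buffer) {
  return buffer.length >= GIT_CRYPT_MAGIC.length
    && buffer.subarray(0, GIT_CRYPT_MAGIC.length).equals(GIT_CRYPT_MAGIC);
}

// Hypothetical helper: parse a PHYX file, or return null if it is still encrypted.
function readPhyxOrSkip(filePath) {
  // Read raw bytes first: an encrypted file is binary and would not parse as JSON.
  const contents = fs.readFileSync(filePath);
  if (isGitCryptEncrypted(contents)) return null;
  return JSON.parse(contents.toString('utf8'));
}
```

Checking the bytes before parsing means a locked checkout produces a clean skip instead of a JSON parse error.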