Fix regnum2phyx.js #104

Draft
gaurav wants to merge 24 commits into master from fix-regnum2phyx

gaurav (Member) commented Dec 10, 2025

WIP: this PR will eventually update regnum2phyx.js to support the current version of the Phyx format.

gaurav changed the base branch from master to replace-eslint-with-biome on March 4, 2026 09:46
gaurav and others added 3 commits March 4, 2026 04:53
- Replace bare per-line stderr writes with a per-entry results array so
  every warning/skip is attributed to its phyloreference label and regnum ID
- Add --report <path> option that writes a rectangular CSV with columns for
  regnum_id, label, status, output_file, specifier counts, per-specifier labels,
  and a semicolon-joined issues field
- Post-loop summary now breaks down counts: written successfully, skipped, written
  with issues (exit 1 if any errors, same as before)
- Remove unused `keys` import from lodash

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
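
The per-entry results array and the rectangular CSV report described in this commit could look roughly like the sketch below. It is an illustration only, not the code in this PR: the column names loosely follow the commit message, the per-specifier label columns are left out for brevity, and `csvEscape`, `buildReport`, and the sample entries are hypothetical.

```javascript
// Hypothetical column names, loosely following the commit message.
const COLUMNS = [
  'regnum_id', 'label', 'status', 'output_file',
  'internal_specifier_count', 'external_specifier_count', 'issues',
];

// Quote a CSV field only when it contains a comma, a quote, or a newline.
function csvEscape(value) {
  const text = value === undefined || value === null ? '' : String(value);
  return /[",\r\n]/.test(text) ? `"${text.replace(/"/g, '""')}"` : text;
}

// Turn per-entry results into a rectangular CSV: every row has every column,
// and missing values become empty fields.
function buildReport(results) {
  const rows = results.map((entry) => COLUMNS.map((column) => {
    const value = column === 'issues' ? entry.issues.join(';') : entry[column];
    return csvEscape(value);
  }).join(','));
  return `${[COLUMNS.join(','), ...rows].join('\n')}\n`;
}

// Illustrative entries only; not taken from a real Regnum dump.
const results = [
  {
    regnum_id: 1,
    label: 'Example clade A',
    status: 'written',
    output_file: 'a.json',
    internal_specifier_count: 2,
    external_specifier_count: 0,
    issues: [],
  },
  {
    regnum_id: 2,
    label: 'Example clade B',
    status: 'skipped',
    issues: ['no specifiers', 'missing citation'],
  },
];

const report = buildReport(results);
console.log(report);
```

Attributing every issue to an entry, rather than writing it straight to stderr, is what makes the report rectangular: a skipped entry still gets a full row, with empty fields where no output was produced.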
Base automatically changed from replace-eslint-with-biome to master March 17, 2026 23:48
gaurav and others added 15 commits March 17, 2026 19:50
yargs 18 exports a factory function rather than a Yargs instance, so
`yargs(process.argv.slice(2))` is required to obtain an instance before
chaining `.usage()` and other methods.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Replace hand-built JSON objects with library classes:
- TaxonConceptWrapper.wrapTaxonName() for specifier construction
- TaxonNameWrapper.TYPE_TAXON_NAME / TaxonConceptWrapper.TYPE_TAXON_CONCEPT
  instead of hardcoded IRI strings
- CitationWrapper.normalize() instead of lodash.pickBy() for citations
- PhylorefWrapper to manage the phyloref object and its specifiers
- owlterms.PHYX_CONTEXT_JSON for the context URL (fixes v1.1.0 → v0.2.0
  regression that was causing the test to fail)

Import each class directly from its source file rather than the package
index, as PhyxWrapper pulls in jsonld which is incompatible with Node.js v25.

Also fix test/regnum2phyx/exec.js: use a non-existing subdirectory of the
tmp dir so regnum2phyx can create it (it now throws if the dir already exists).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
merge-phylonym.js merges new Regnum dumps with existing curated Phylonym
PHYX files, preserving manually added newick phylogeny strings via a
three-tier citation matching cascade (DOI, title+year, position fallback).
README documents both scripts, the merge workflow, and known limitations.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
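
A three-tier matching cascade of the kind described in this commit (DOI, then title plus year, then position) might be sketched as follows. This is not the implementation in merge-phylonym.js: the field names (`doi`, `title`, `year`, `newick`), the `normaliseTitle` and `matchCitation` helpers, and the sample data are assumptions for illustration.

```javascript
// Normalise a title for comparison: lower-case and collapse punctuation.
function normaliseTitle(title) {
  return (title || '').toLowerCase().replace(/[^a-z0-9]+/g, ' ').trim();
}

// Find the previously curated phylogeny that corresponds to a new citation.
// Returns the tier that matched so a report can show how reliable it was.
function matchCitation(newCitation, newIndex, oldPhylogenies) {
  // Tier 1: DOI, compared case-insensitively.
  if (newCitation.doi) {
    const wanted = newCitation.doi.toLowerCase();
    const byDoi = oldPhylogenies.find(
      (old) => old.doi && old.doi.toLowerCase() === wanted,
    );
    if (byDoi) return { tier: 'doi', match: byDoi };
  }

  // Tier 2: normalised title plus year.
  const title = normaliseTitle(newCitation.title);
  if (title && newCitation.year) {
    const byTitle = oldPhylogenies.find(
      (old) => normaliseTitle(old.title) === title
        && String(old.year) === String(newCitation.year),
    );
    if (byTitle) return { tier: 'title+year', match: byTitle };
  }

  // Tier 3: fall back to the entry at the same position, if there is one.
  if (oldPhylogenies[newIndex]) {
    return { tier: 'position', match: oldPhylogenies[newIndex] };
  }
  return { tier: 'none', match: null };
}

// Illustrative data only.
const oldPhylogenies = [
  { doi: '10.1000/example.1', title: 'A phylogeny of examples', year: 2001, newick: '((A,B),C);' },
  { title: 'Another Tree: Revisited', year: 2010, newick: '(D,(E,F));' },
  { title: 'Untitled', year: 1999, newick: '(G,H);' },
];

const byDoi = matchCitation({ doi: '10.1000/EXAMPLE.1' }, 2, oldPhylogenies);
const byTitle = matchCitation({ title: 'another tree revisited', year: '2010' }, 0, oldPhylogenies);
const byPosition = matchCitation({ title: 'Something new', year: 2020 }, 2, oldPhylogenies);
const unmatched = matchCitation({ title: 'Something new', year: 2020 }, 5, oldPhylogenies);
console.log(byDoi.tier, byTitle.tier, byPosition.tier, unmatched.tier);
```

Because the position fallback is the least reliable tier, recording which tier produced each match lets a reviewer check those cases by hand.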
update-phylonym.js combines merge-phylonym.js, newick verification, and the
directory swap into a single command. Running without --accept stages the
output for review; --accept performs the replacement. Aborts if any newicks
are lost. Added as the npm run update-phylonym script.

Also updates CLAUDE.md and regnum2phyx/README.md to lead with the new
orchestrator and retain the manual steps as a fallback reference.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
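
The stage-or-accept behaviour described in this commit can be summarised as a small decision function. The sketch below is a hypothetical simplification, not the logic in update-phylonym.js: `planUpdate`, its inputs (lists of identifiers for entries that carry a newick string before and after the merge), and the returned action names are all assumptions.

```javascript
// Decide what an update run should do.
// - Abort if any entry that had a newick before the merge lacks one afterwards.
// - Otherwise stage the output for review, or replace the directory on --accept.
function planUpdate({ accept, newicksBefore, newicksAfter }) {
  const after = new Set(newicksAfter);
  const lost = newicksBefore.filter((id) => !after.has(id));
  if (lost.length > 0) return { action: 'abort', lost };
  return { action: accept ? 'replace' : 'stage', lost };
}

// Illustrative identifiers only.
const staged = planUpdate({ accept: false, newicksBefore: ['a', 'b'], newicksAfter: ['a', 'b', 'c'] });
const replaced = planUpdate({ accept: true, newicksBefore: ['a', 'b'], newicksAfter: ['a', 'b'] });
const aborted = planUpdate({ accept: true, newicksBefore: ['a', 'b'], newicksAfter: ['a'] });
console.log(staged.action, replaced.action, aborted.action, aborted.lost);
```

Checking for lost newicks before looking at `accept` means the replacement can never go ahead on a merge that dropped curated data.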
- Remove dead newicksLost state (always 0, never incremented) from
  mergePhyxFile stats, merge loop, CSV report, and summary output
- Remove unreachable `if (!match) continue` in scanDirectory (filter
  already guarantees the regex matches)
- Fix redundant Map .data lookups in NEW_ONLY/OLD_ONLY branches by
  hoisting newData/oldData outside the dryRun block
- Read work file existence before old file in verifyNewicks to skip
  unnecessary reads for orphaned entries

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
jsonld v5 (a transitive dependency of @phyloref/phyx@1.2.1) bundles
the esm package, which is broken on Node v25 and causes a
"Function.prototype.apply was called on undefined" crash when any
phyx.js wrapper is loaded.

Force jsonld@^9.0.0 via npm overrides, which drops the esm dependency
and resolves the crash.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Two fixes for phyx2ontology.js broken by the dependency upgrades:

- Update yargs invocation from yargs.usage() to yargs(argv).usage(),
  matching the yargs 18 API (same fix already applied to regnum2phyx.js).

- Wrap phylorefWrapper.asJSONLD() in a try/catch and skip phylorefs that
  throw, with a stderr warning. phyx 1.2.1 throws "Cannot determine class
  expression for a single specifier" for phylorefs with only one specifier,
  which 0.2.1 silently accepted; skipping rather than crashing lets the
  rest of the ontology build successfully.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
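
The skip-on-throw pattern from the second fix could be sketched as below. The wrapper here is a stand-in that only mimics the single-specifier error quoted in the commit message; it is not the `@phyloref/phyx` class, and `makeStubWrapper`, `convertAll`, and the sample phylorefs are hypothetical.

```javascript
// Stand-in for a phyloref wrapper. It only imitates the behaviour described
// in the commit message: asJSONLD() throws for single-specifier phylorefs.
function makeStubWrapper(phyloref) {
  return {
    asJSONLD() {
      const count = (phyloref.internalSpecifiers || []).length
        + (phyloref.externalSpecifiers || []).length;
      if (count < 2) {
        throw new Error('Cannot determine class expression for a single specifier');
      }
      return { label: phyloref.label, specifierCount: count };
    },
  };
}

// Convert every phyloref, skipping (with a warning) any that throw,
// so one bad entry does not stop the rest of the build.
function convertAll(phylorefs, wrap, warn = (msg) => process.stderr.write(`${msg}\n`)) {
  const converted = [];
  for (const phyloref of phylorefs) {
    try {
      converted.push(wrap(phyloref).asJSONLD());
    } catch (err) {
      warn(`Skipping phyloref '${phyloref.label}': ${err.message}`);
    }
  }
  return converted;
}

// Illustrative phylorefs only.
const phylorefs = [
  { label: 'Two-specifier example', internalSpecifiers: [{}, {}] },
  { label: 'Single-specifier example', internalSpecifiers: [{}] },
];

const warnings = [];
const converted = convertAll(phylorefs, makeStubWrapper, (msg) => warnings.push(msg));
console.log(converted, warnings);
```

Passing the warning function as a parameter keeps the default stderr behaviour while letting a test capture the messages.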
Three fixes for test_phyx.js broken by the phyx 0.2.1 → 1.2.1 upgrade:

- Import PhyxWrapper and PhylorefWrapper directly from their source files
  rather than via the package index, avoiding the jsonld/esm crash path
  (consistent with the same approach in regnum2phyx.js).

- Wrap the asJSONLD() call at describe-time in try/catch so a thrown
  exception is captured and surfaced as a failing test rather than
  crashing the entire mocha suite.

- Skip the asJSONLD conversion test for phylorefs with fewer than 2
  specifiers (consistent with the existing skips for too-many-specifiers).
  phyx 1.2.1 correctly throws for single-specifier phylorefs; the skip
  makes this visible without marking the whole file as broken.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
If regnum2phyx.js exits non-zero (or crashes) the output directory is
never created, and the bare readdirSync at describe-time throws — which
crashes the entire mocha suite before any test result is recorded.

Guard with fs.existsSync so a subprocess failure is surfaced as proper
test failures (the 'could be executed' and 'should produce the expected
files' assertions) rather than an unhandled ENOENT crash.

Also print child.stderr before asserting on exit status, so any error
output from the subprocess is visible in CI logs.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
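
The `fs.existsSync` guard described in this commit can be illustrated with a small, self-contained sketch. The `listOutputFiles` helper and the temporary directory layout are assumptions for illustration; the actual test file may be organised differently.

```javascript
const fs = require('node:fs');
const os = require('node:os');
const path = require('node:path');

// List the files in an output directory, returning an empty list when the
// directory was never created (for example, because the subprocess failed).
// A bare fs.readdirSync() would throw ENOENT in that case.
function listOutputFiles(outputDir) {
  return fs.existsSync(outputDir) ? fs.readdirSync(outputDir).sort() : [];
}

const tmpDir = fs.mkdtempSync(path.join(os.tmpdir(), 'regnum2phyx-sketch-'));
const outputDir = path.join(tmpDir, 'output');

// Case 1: the directory does not exist yet, as after a failed run.
const whenMissing = listOutputFiles(outputDir);

// Case 2: the directory exists and contains one file.
fs.mkdirSync(outputDir);
fs.writeFileSync(path.join(outputDir, 'example.json'), '{}');
const whenPresent = listOutputFiles(outputDir);

console.log(whenMissing, whenPresent);

// Clean up the temporary directory.
fs.rmSync(tmpDir, { recursive: true, force: true });
```

With an empty list instead of an exception, the assertions that compare the produced files against the expected ones fail in the normal way and the rest of the suite still runs.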