build test suite / evaluation benchmark with ideal metadata transformations

Add or adjust the structures in https://github.com/SciCodes/software-metadata-extraction-benchmark to set up a ground-truth dataset for use in test suites and evaluation benchmarks.

Initial supported formats:

1. CodeMeta (versioned)
2. DataCite (versioned)
3. Citation File Format
4. any others?

Initial work will focus on purely deterministic transformation: use codemeticulous to convert from format A -> B for all items in the dataset. This should be 100% non-lossy, with deterministic output.
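A minimal sketch of what such a check could look like, assuming each ground-truth case is a `(name, source, expected)` triple of already-parsed documents. The converter is passed in as a plain callable so nothing here depends on the actual codemeticulous API; `toy_converter` is a hypothetical stand-in (not the real CodeMeta-to-CFF crosswalk), and `evaluate` and `canonical` are illustrative names.

```python
import json
from typing import Callable, Iterable

# Assumption: a converter takes one parsed document and returns another.
Converter = Callable[[dict], dict]
Case = tuple[str, dict, dict]  # (case name, source document, expected output)


def canonical(doc: dict) -> str:
    """Serialize with sorted keys so comparison ignores key order."""
    return json.dumps(doc, sort_keys=True, ensure_ascii=False)


def evaluate(converter: Converter, cases: Iterable[Case]) -> dict:
    """Run every case twice and compare against the ground truth."""
    total = 0
    failures = []
    for name, source, expected in cases:
        total += 1
        first = canonical(converter(source))
        second = canonical(converter(source))
        if first != second:
            # Same input produced different output across runs.
            failures.append((name, "non-deterministic"))
        elif first != canonical(expected):
            # Output differs from the curated ground truth.
            failures.append((name, "mismatch"))
    return {"total": total, "passed": total - len(failures), "failures": failures}


def toy_converter(doc: dict) -> dict:
    """Hypothetical stand-in: renames two fields, drops everything else."""
    mapping = {"name": "title", "version": "version"}
    return {dst: doc[src] for src, dst in mapping.items() if src in doc}


cases = [
    ("ok", {"name": "x", "version": "1"}, {"title": "x", "version": "1"}),
    ("bad", {"name": "y"}, {"title": "z"}),
]
result = evaluate(toy_converter, cases)
```

In a real suite the stand-in would be replaced by a thin wrapper around codemeticulous, and the target would be `passed == total` for every A -> B pair.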

Later work may include LLM augmentation, where an LLM-assisted transformation adds metadata that wasn't included in the original manually curated transformations. This starts to bleed into responsibilities and functionality that should exist in somef-core, though.
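One possible way to evaluate augmentation against the curated ground truth is to require that the augmented output only adds fields and never alters or drops curated ones. The sketch below is an assumption about how that could be checked, not an agreed design; `diff_augmented` is a hypothetical helper, and lists are compared as whole values.

```python
def diff_augmented(
    curated: dict, augmented: dict, prefix: str = ""
) -> tuple[list[str], list[str]]:
    """Return (added, changed) dotted paths between curated and augmented.

    added:   paths present only in the augmented document
    changed: curated paths whose value differs or is missing after augmentation
    """
    added: list[str] = []
    changed: list[str] = []
    for key, value in augmented.items():
        path = f"{prefix}{key}"
        if key not in curated:
            added.append(path)
        elif isinstance(value, dict) and isinstance(curated[key], dict):
            # Recurse into nested objects such as an author record.
            sub_added, sub_changed = diff_augmented(curated[key], value, path + ".")
            added += sub_added
            changed += sub_changed
        elif value != curated[key]:
            changed.append(path)
    # A curated field that disappeared counts as changed, not as added.
    for key in curated:
        if key not in augmented:
            changed.append(f"{prefix}{key}")
    return added, changed


curated = {"title": "x", "author": {"name": "A"}}
augmented = {"title": "x", "author": {"name": "A", "orcid": "0"}, "license": "MIT"}
added, changed = diff_augmented(curated, augmented)
```

An empty `changed` list means the augmentation left the curated metadata intact; `added` lists what the LLM-assisted step contributed and would still need separate review.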

@SciCodes/2025-workshop-organizers 



build test suite / evaluation benchmark with ideal metadata transformations #5
