harmonize formats for metadata schema and dataset creation #4451

@pameyer

User stories:

  • As a repository administrator / curator / metadata person, I would like to work with hierarchical metadata schemas in a way that is less awkward than a spreadsheet.
  • As a curator / metadata person, I would like to be able to configure which metadata fields are included in particular export formats without requiring additional development.
  • As a data depositor, I would like a clearer relationship between the definition of the metadata schema for my dataset and the API calls to create that dataset.
  • As an external developer (i.e., someone developing software that interoperates with Dataverse) / data depositor, I would like to be able to verify that my attempt to create or edit a dataset via the API matches what Dataverse expects before making the API call, rather than finding out when it fails.
  • As a metadata person / Dataverse developer, I would like to have the metadata crosswalk handled programmatically (rather than requiring a person to keep TSV files synchronized with Google spreadsheets).
  • As a researcher, I would like to reduce the development time required for Dataverse to provide metadata to newly developed/discovered systems, and to adapt to updated versions of existing metadata schemas.

A first step towards addressing this could be replacing the TSV files used for metadatablocks / DatasetFieldTypes with a file format that supports hierarchical structures (JSON, YAML, XML), and updating the Dataverse APIs that read these files accordingly (without adding any new information at this stage). The chosen format should either be directly usable for validating dataset creation/edit API input files, or be easy to transform into a form that is.
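To make the first step more concrete, here is a minimal sketch of what it might look like: flat rows with a `parent` column (roughly the shape of the current TSV approach) are nested into a hierarchy, and the same hierarchy is then used to check an input document before any API call is made. The column names are simplified, and the input layout (a plain dict keyed by field name) is an assumption made for illustration, not the actual Dataverse dataset JSON. `to_hierarchy` and `validate` are hypothetical helpers.

```python
# Flat rows, simplified from what a TSV metadata block might contain.
# Column names and values here are illustrative assumptions.
FLAT_ROWS = [
    {"name": "title", "fieldType": "text", "required": True,
     "allowmultiples": False, "parent": ""},
    {"name": "author", "fieldType": "none", "required": True,
     "allowmultiples": True, "parent": ""},
    {"name": "authorName", "fieldType": "text", "required": True,
     "allowmultiples": False, "parent": "author"},
    {"name": "authorAffiliation", "fieldType": "text", "required": False,
     "allowmultiples": False, "parent": "author"},
]


def to_hierarchy(rows):
    """Nest flat rows under their parents; returns the top-level fields."""
    nodes = {}
    for row in rows:
        node = {k: v for k, v in row.items() if k != "parent"}
        node["children"] = []
        nodes[row["name"]] = node
    roots = []
    for row in rows:
        node = nodes[row["name"]]
        if row["parent"]:
            nodes[row["parent"]]["children"].append(node)
        else:
            roots.append(node)
    return roots


def validate(fields, values, path=""):
    """Return a list of error strings; an empty list means the input is valid."""
    errors = []
    known = {f["name"] for f in fields}
    for key in values:
        if key not in known:
            errors.append(f"{path}{key}: unknown field")
    for f in fields:
        here = f"{path}{f['name']}"
        if f["name"] not in values:
            if f["required"]:
                errors.append(f"{here}: required field missing")
            continue
        value = values[f["name"]]
        if f["allowmultiples"]:
            if not isinstance(value, list):
                errors.append(f"{here}: expected a list")
                continue
            items = value
        else:
            items = [value]
        for item in items:
            if f["children"]:
                # Compound field: recurse into the child definitions.
                if isinstance(item, dict):
                    errors.extend(validate(f["children"], item, here + "."))
                else:
                    errors.append(f"{here}: expected an object")
            elif not isinstance(item, str):
                errors.append(f"{here}: expected text")
    return errors


schema = to_hierarchy(FLAT_ROWS)
good = {"title": "My dataset", "author": [{"authorName": "A. Person"}]}
bad = {"author": [{"authorAffiliation": "Somewhere"}], "foo": "x"}
good_errors = validate(schema, good)
bad_errors = validate(schema, bad)
```

The point of the sketch is that one hierarchical definition can serve both as the schema description and as the basis for client-side validation. An established schema language such as JSON Schema could play the same role instead of hand-written checks.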

  • For a second step, define fields in the metadata schema format that indicate whether a metadata element should be exported (or potentially give a more detailed indication of which types of exporters the element should be sent to), and which element of other metadata schemas it corresponds to.
  • For a third step, the places where metadata is currently exported (HTML meta tags, schema.org, DataCite/EZID DOI registration / Handle registration?, OAI-PMH feeds, others?) would be modified to use the provided information (i.e., remove the hard-coding).
  • For a fourth step, new (or updated) dataset creation/edit/export APIs should be made compatible with the schema from the first step.
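The second and third steps could be sketched roughly as follows: each field definition carries a list of export targets and a per-target crosswalk entry, and a single generic exporter reads those instead of hard-coding the mapping. The keys `exportTo` and `mapsTo`, the function `export_record`, and the specific target names and mappings are all hypothetical and only illustrate the idea.

```python
# Field definitions annotated with export information.
# "exportTo" and "mapsTo" are hypothetical keys, not an existing Dataverse format.
FIELDS = [
    {"name": "title",
     "exportTo": ["schema.org", "datacite"],
     "mapsTo": {"schema.org": "name", "datacite": "title"}},
    {"name": "dsDescription",
     "exportTo": ["schema.org"],
     "mapsTo": {"schema.org": "description"}},
    {"name": "depositor", "exportTo": [], "mapsTo": {}},
]


def export_record(fields, values, target):
    """Build the export for one target using only the schema annotations."""
    out = {}
    for f in fields:
        if target in f["exportTo"] and f["name"] in values:
            # Fall back to the field's own name when no crosswalk entry exists.
            out[f["mapsTo"].get(target, f["name"])] = values[f["name"]]
    return out


values = {"title": "My dataset", "dsDescription": "About it", "depositor": "X"}
schema_org = export_record(FIELDS, values, "schema.org")
datacite = export_record(FIELDS, values, "datacite")
```

With this shape, adding a field to an export or supporting a new target would be a change to the schema file rather than to exporter code, which is the intent of the curator and researcher user stories above.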
