Mark Wilkinson edited this page Apr 3, 2020 · 30 revisions

FAIR Data

As an initial remark about data sharing: SARS-CoV-2 genomes are sequenced by a variety of institutions, which submit their results to GISAID.org. From there, the data are accessible only after creating a user account and then clicking through the UI to reach the record you want. Simply fetching all genomes (there are only a few hundred, at about 30k bases each, so it is not a huge set) is currently not possible at all, let alone via an API.

Communication

Coming.

Participants

  • Mark Wilkinson (coordinator)
  • Michel Dumontier
  • Stian Soiland-Reyes
  • Philippe Rocca-Serra
  • Evangelos Pafilis
  • Susanna-Assunta Sansone
  • Lynn Schriml

Ideas

Repackage SARS-CoV-2 sequences

FAIRify (add metadata, identifiers, etc.) reproducible research

For instance, describe/package it as an RO-Crate. (MDW: note that I have spoken with the RO-Crate team, and they think using LDP as the container system for Crates would be a good idea. That's what I plan to do...)

Workflow Hub - registering COVID-19 workflows as FAIR

Working with the ELIXIR effort, this project proposes to set up an early pre-production instance of the EOSC-Life Workflow Hub, covid19.workflowhub.eu, as a registry that gathers COVID-19 workflows and their metadata. Part of the task is also to curate the existing workflows and help make them interoperable, reusable and reproducible.

The curated metadata will be in a FAIR format based on RO-Crate and Bioschemas annotations and, where possible, contributed back to each workflow's originating GitHub repository.
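To make the idea of RO-Crate-based workflow metadata more concrete, here is a minimal sketch of what such a description might look like. It is an illustration only, not the Workflow Hub's actual format: it assumes the RO-Crate 1.1 layout (an `ro-crate-metadata.json` descriptor plus a root dataset) and the Bioschemas-style `ComputationalWorkflow` type. The function name `build_workflow_crate`, the file name `workflow.cwl`, and the name, description, date and license values are all hypothetical placeholders.

```python
import json


def build_workflow_crate(workflow_path, name, description, date_published, license_url):
    """Return a minimal RO-Crate-style metadata document (a dict) for one workflow file.

    Assumes the RO-Crate 1.1 structure; all argument values are supplied by the caller.
    """
    return {
        "@context": "https://w3id.org/ro/crate/1.1/context",
        "@graph": [
            {
                # Metadata file descriptor: points at the root dataset.
                "@id": "ro-crate-metadata.json",
                "@type": "CreativeWork",
                "conformsTo": {"@id": "https://w3id.org/ro/crate/1.1"},
                "about": {"@id": "./"},
            },
            {
                # Root dataset: the crate itself, with the workflow as its main entity.
                "@id": "./",
                "@type": "Dataset",
                "name": name,
                "description": description,
                "datePublished": date_published,
                "license": {"@id": license_url},
                "hasPart": [{"@id": workflow_path}],
                "mainEntity": {"@id": workflow_path},
            },
            {
                # The workflow file, typed as a computational workflow.
                "@id": workflow_path,
                "@type": ["File", "SoftwareSourceCode", "ComputationalWorkflow"],
                "name": name,
            },
        ],
    }


# Hypothetical example values, for illustration only.
crate = build_workflow_crate(
    workflow_path="workflow.cwl",
    name="Example COVID-19 analysis workflow",
    description="Placeholder description of a curated workflow.",
    date_published="2020-04-03",
    license_url="https://spdx.org/licenses/Apache-2.0",
)

print(json.dumps(crate, indent=2))
```

Writing the printed JSON to a file named `ro-crate-metadata.json` next to the workflow would, under these assumptions, turn the containing directory into a basic crate that could be contributed back to the workflow's repository.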

For details, tasks and participants, see sub-topic Workflow Hub.
