FairData
As an initial remark about data sharing: SARS-CoV-2 genomes are sequenced by a variety of institutions, which submit their results to GISAID.org. From there, the data are accessible only after creating a user account and then clicking through the UI to reach the record you want. Simply fetching all genomes is currently not possible at all, let alone via an API, even though the set is not large (only a few hundred genomes of about 30k bases each).
Coming.
- Mark Wilkinson (coordinator)
- Michel Dumontier
- Stian Soiland-Reyes
- Philippe Rocca-Serra
- Evangelos Pafilis
- Susanna-Assunta Sansone
- Lynn Schriml
For instance, describe/package as an RO-Crate: (MDW: note that I have spoken with the RO-Crate team, and they think using LDP as the container system for Crates would be a good idea. That is what I plan to do.)
- https://doi.org/10.26434/chemrxiv.11871402.v4
- https://github.com/galaxyproject/SARS-CoV-2
I have created a Linked Data Platform endpoint on my institutional server in Madrid for us to use for back-end storage. It uses Virtuoso's LDP implementation (so we get SPARQL over the Linked Data submitted to that server):
- https://w3id.org/FAIR_Training_LDP/DAV/coronavirus/ (note the trailing slash!)
- For GET operations, you need no un/pw
- For POST and PUT operations your username is hackathon and pw is b**hac****on
- The endpoint for PUT/POST operations is: http://ldp.cbgp.upm.es:8890/DAV/coronavirus/
There is NOTHING on that server that is in any way valuable. It is used entirely for FAIR training, so we can make as many mistakes as we need to, and I can wipe the DB and start again if necessary. Alternatively, you can download the image linked above and run it on localhost for your tests.
Please be "good citizens" and start by creating a sub-container inside of the /coronavirus/ container where you can store your information. Please remember that LDP Containers have a trailing slash! I believe that the Virtuoso implementation of LDP can ingest both Turtle and JSON-LD for the purposes of SPARQL, but I have only ever tried Turtle so I cannot promise the latter. The SPARQL endpoint is: https://w3id.org/FAIR_Training_LDP/sparql
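A quick way to check what has landed in the triple store is to query the SPARQL endpoint directly. The following is a minimal sketch using only the Python standard library; the helper names (`build_sparql_request`, `run_query`) are made up for illustration, and it assumes the endpoint accepts SPARQL 1.1 Protocol GET requests and can return JSON results, which has not been verified here.

```python
import json
import urllib.parse
import urllib.request

# SPARQL endpoint given in the text above.
SPARQL_ENDPOINT = "https://w3id.org/FAIR_Training_LDP/sparql"


def build_sparql_request(query, endpoint=SPARQL_ENDPOINT):
    """Build (but do not send) a SPARQL GET request for the given query."""
    url = endpoint + "?" + urllib.parse.urlencode({"query": query})
    # Assumption: the server honours this Accept header and returns JSON results.
    return urllib.request.Request(
        url, headers={"Accept": "application/sparql-results+json"}
    )


def run_query(query, endpoint=SPARQL_ENDPOINT):
    """Send the query and return the parsed JSON result document."""
    request = build_sparql_request(query, endpoint)
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)


# Example usage (performs a network call, so it is left commented out):
# results = run_query("SELECT ?s ?p ?o WHERE { ?s ?p ?o } LIMIT 5")
# for row in results["results"]["bindings"]:
#     print(row["s"]["value"], row["p"]["value"], row["o"]["value"])
```

No credentials are passed, in line with the note above that read access needs none.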
Typical Interaction:
To create your "home" Container:
Create a file "container.ttl" that contains a small piece of turtle:
@prefix ldp: <http://www.w3.org/ns/ldp#> .
<> a ldp:Container .
To upload this to the server:
curl -v -H "Accept: text/turtle" -H "Content-type: text/turtle" -u hackathon:********** --data-binary @container.ttl -H "Slug: myProjectName" http://ldp.cbgp.upm.es:8890/DAV/coronavirus/
(note that the trailing slash is required for containers! If you miss it, you will get a 301 redirect)
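For those who prefer scripting the upload, the same POST can be sketched in Python with the standard library. This is an illustrative equivalent of the curl command, not a tested client: `build_create_request` is a hypothetical helper, and it assumes the server accepts HTTP Basic authentication (which is what `curl -u` sends by default). The password is passed in as an argument and is not filled in here.

```python
import base64
import urllib.request

# Write endpoint given in the text above (note the trailing slash).
LDP_WRITE_ENDPOINT = "http://ldp.cbgp.upm.es:8890/DAV/coronavirus/"

# Same content as container.ttl.
CONTAINER_TURTLE = """@prefix ldp: <http://www.w3.org/ns/ldp#> .
<> a ldp:Container .
"""


def build_create_request(slug, turtle, username, password,
                         parent=LDP_WRITE_ENDPOINT):
    """Build (but do not send) a POST that creates `slug` inside `parent`."""
    if not parent.endswith("/"):
        # Container URLs need the trailing slash, otherwise the server redirects.
        raise ValueError("container URL must end with a trailing slash")
    credentials = f"{username}:{password}".encode("utf-8")
    token = base64.b64encode(credentials).decode("ascii")
    return urllib.request.Request(
        parent,
        data=turtle.encode("utf-8"),
        method="POST",
        headers={
            "Accept": "text/turtle",
            "Content-Type": "text/turtle",
            "Slug": slug,  # suggested name for the new container
            "Authorization": f"Basic {token}",  # assumption: Basic auth
        },
    )


# Example usage (performs a network call, so it is left commented out):
# request = build_create_request("myProjectName", CONTAINER_TURTLE,
#                                "hackathon", "<password>")
# with urllib.request.urlopen(request) as response:
#     print(response.status, response.headers.get("Location"))
```

Checking the trailing slash before sending avoids the redirect described above, which matters because a redirected POST is not reliably replayed by HTTP clients.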
To create an ldp:Resource, the RDF you upload should have the rdf:type ldp:Resource.
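As an illustration of what such a resource description might look like, the sketch below generates a small Turtle document typed as ldp:Resource. The `dcterms:title` property and the helper names are assumptions added for the example; the server does not require them, and any other RDF can be used in the body.

```python
LDP_PREFIX = "@prefix ldp: <http://www.w3.org/ns/ldp#> .\n"
DCT_PREFIX = "@prefix dcterms: <http://purl.org/dc/terms/> .\n"


def escape_literal(text):
    """Escape backslashes, double quotes and line breaks for a Turtle string."""
    return (
        text.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )


def resource_turtle(title):
    """Return Turtle describing an ldp:Resource with an illustrative title."""
    return (
        LDP_PREFIX
        + DCT_PREFIX
        + "<> a ldp:Resource ;\n"
        + f'    dcterms:title "{escape_literal(title)}" .\n'
    )


# Print an example document that could be saved as resource.ttl and uploaded.
print(resource_turtle("Example record for the coronavirus container"))
```

The output can be saved to a file and uploaded in the same way as container.ttl, with the Slug header naming the new resource.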
For more complex interactions, see the options in the HTTP headers.
Set up a Wiki page where people can deposit any properties/classes that are currently missing from existing ontologies. We could then contact appropriate vocabulary providers and try to have these added.
- Sub-topic: Workflow Hub
Working with the ELIXIR effort, this project proposes to set up an early pre-production instance of the EOSC-Life Workflow Hub, covid19.workflowhub.eu, as a registry that gathers the COVID-19 workflows and their metadata. Part of the task here is also to curate the existing workflows and help make them interoperable, reusable and reproducible.
The curated metadata will be in a FAIR format based on RO-Crate and BioSchemas annotations and where possible contributed back to the workflow's origin GitHub repositories.
For details, tasks and participants, see sub-topic Workflow Hub.