serratus
Artem Babaian edited this page Mar 26, 2020 · 6 revisions

COVID-19 seemingly came out of nowhere. We will search all sequence data in the Sequence Read Archive (SRA) to identify new members of the family Coronaviridae, in order to trace the lineage of SARS-CoV-2.
1. Create a phylogenetic tree for Coronaviridae with all available sequences.
2. Identify libraries with novel coronaviruses by searching all public data on SRA (~100 PB).
3. Assemble putative coronavirus genomes and return to step 1.
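
The steps above form a loop: every newly assembled genome widens the reference set used for the next search. Below is a minimal toy sketch of that idea, not the project's actual pipeline. It assumes exact shared k-mers as a stand-in for real alignment, and short strings as a stand-in for assembled genomes; the names `kmers`, `shares_kmer`, and `iterative_discovery` are hypothetical.

```python
def kmers(seq, k):
    """Return the set of all length-k substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def shares_kmer(seq, references, k):
    """Toy similarity test (assumption): True if seq shares at least
    one exact k-mer with any reference sequence."""
    seq_kmers = kmers(seq, k)
    return any(seq_kmers & kmers(ref, k) for ref in references)


def iterative_discovery(seed_references, candidate_genomes, k=4, max_rounds=10):
    """Grow the reference set until a search round finds nothing new.

    seed_references:   known sequences (step 1)
    candidate_genomes: stand-in for genomes assembled from public libraries
    """
    references = set(seed_references)
    for _ in range(max_rounds):
        # Step 2: search the candidates against the current references.
        new_hits = {
            g for g in candidate_genomes
            if g not in references and shares_kmer(g, references, k)
        }
        if not new_hits:
            break  # nothing new was found, so the loop has converged
        # Step 3: add the new genomes and return to step 1.
        references |= new_hits
    return references


if __name__ == "__main__":
    seeds = {"AAAACCCC"}
    candidates = ["CCCCGGGG", "GGGGTTTT", "ACGTACGT"]
    print(sorted(iterative_discovery(seeds, candidates)))
```

In this toy input, `GGGGTTTT` shares no k-mer with the seed and is only reached in the second round, through `CCCCGGGG`. That is the reason to repeat the search rather than run it once.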
We're currently building the framework for highly cost-efficient skimming and alignment of data from SRA. SRA has been mirrored to AWS S3 since February, so we can access all of the data at almost no cost using AWS services.
Email me at ababaian {at} bccrc {dot} ca if you'd like to contribute. We will set up a channel in the BH20 Slack closer to the hackathon date.