serratus
Artem Babaian edited this page Mar 26, 2020 · 6 revisions

COVID-19 came out of seemingly nowhere. We will search all SRA sequence data to identify new members of the family Coronaviridae to trace the lineage of SARS-CoV-2.
1. Create a phylogenetic tree for Coronaviridae with all available sequences.
2. Identify libraries with novel coronaviruses by searching all public data on SRA (~100 PB).
3. Assemble putative coronavirus genomes and return to step 1.
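The search loop in the list above feeds its own output back into its input, which may be easier to see as code. The following is only a minimal sketch of that control flow under toy assumptions, not the project's pipeline: `iterative_discovery`, `shares_kmer`, the in-memory `libraries` dictionary, and the shared k-mer test standing in for alignment are all hypothetical names and simplifications introduced for illustration.

```python
from typing import Callable, Dict, Set


def iterative_discovery(
    seed_genomes: Set[str],
    libraries: Dict[str, str],
    matches: Callable[[str, Set[str]], bool],
    max_rounds: int = 10,
) -> Set[str]:
    """Grow a reference set by repeatedly searching libraries against it."""
    references = set(seed_genomes)
    for _ in range(max_rounds):
        # Step 2: find libraries whose content matches the current references.
        hits = {name for name, seq in libraries.items() if matches(seq, references)}
        # Step 3: "assemble" one genome per hit (here, simply the library sequence).
        new_genomes = {libraries[name] for name in hits} - references
        if not new_genomes:
            break  # nothing new was found, so the reference set has converged
        # Back to step 1: extend the references (a real run would rebuild the tree here).
        references |= new_genomes
    return references


def shares_kmer(seq: str, references: Set[str], k: int = 4) -> bool:
    """Toy stand-in for alignment: True if seq shares any k-mer with a reference."""
    kmers = {seq[i:i + k] for i in range(len(seq) - k + 1)}
    return any(
        ref[i:i + k] in kmers
        for ref in references
        for i in range(len(ref) - k + 1)
    )


if __name__ == "__main__":
    toy_libraries = {
        "lib1": "CCCCGGGG",  # overlaps the seed
        "lib2": "GGGGTTTT",  # overlaps lib1 only
        "lib3": "ACGTACGA",  # unrelated
    }
    found = iterative_discovery({"AAAACCCC"}, toy_libraries, shares_kmer)
    print(sorted(found))
```

In this toy run, `lib2` has nothing in common with the seed and is only picked up in the second round, after `lib1` has been added to the references. That is the reason for returning to step 1: each newly assembled genome widens what the next search can detect.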
We're currently building the framework for highly cost-efficient skimming/alignment of data from SRA. Since February, SRA has been mirrored to AWS S3; as such, we can access all the data for almost no cost using AWS services.