Ben Busby edited this page Apr 5, 2020 · 44 revisions

Pangenome and variation graph

We will use tools supporting the variation graph data model, as described at the pangenome tools and workflows page, to build and distribute pangenome data structures from SARS-CoV-2 genomes. These models are useful for diagnostic and resequencing applications. They can also help us generate assemblies from raw sequencing data.
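For readers new to the data model, here is a minimal sketch of what a variation graph stores: nodes that carry sequence, edges between them, and one path per genome. The class name, method names, and the toy sequences are illustrative assumptions, not the API of any particular pangenome tool and not real SARS-CoV-2 data.

```python
from dataclasses import dataclass, field


@dataclass
class VariationGraph:
    """Toy variation graph: nodes carry sequence, paths are walks over nodes."""
    nodes: dict = field(default_factory=dict)  # node id -> sequence
    edges: set = field(default_factory=set)    # (from id, to id)
    paths: dict = field(default_factory=dict)  # genome name -> list of node ids

    def add_node(self, node_id, sequence):
        self.nodes[node_id] = sequence

    def add_path(self, name, node_ids):
        # Record the walk and add an edge for every consecutive pair of nodes.
        for a, b in zip(node_ids, node_ids[1:]):
            self.edges.add((a, b))
        self.paths[name] = list(node_ids)

    def path_sequence(self, name):
        # Spell out a genome by concatenating the sequences along its path.
        return "".join(self.nodes[n] for n in self.paths[name])


def build_toy_graph():
    # Hypothetical sequences: two genomes that differ by one substitution.
    g = VariationGraph()
    g.add_node(1, "ATG")
    g.add_node(2, "A")    # first allele
    g.add_node(3, "G")    # second allele
    g.add_node(4, "TTC")
    g.add_path("genome_a", [1, 2, 4])
    g.add_path("genome_b", [1, 3, 4])
    return g


if __name__ == "__main__":
    graph = build_toy_graph()
    for name in graph.paths:
        print(name, graph.path_sequence(name))
```

The shared nodes 1 and 4 are stored once, while the variable site appears as two alternative nodes; this is what lets one structure represent many genomes at the same time.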

Communication

Josiah: After some debate, I decided to create a second project for a Pangenome Browser. This depends on Variation graph construction, but it's certainly a different set of tasks that can be carried out independently. In order for a browser to be effective, we must have annotations aggregated/curated as a third task. I would still like us to coordinate closely.

Specific use cases

In my personal opinion (Ben Busby), it may be particularly productive to look at the less conserved satellite genes near the S locus.

We may also want to look at amino acid 614 of the spike protein. This is in an unstructured loop, presumably between putative transmembrane helices. This may be involved in immune evasion. We should look at the correspondence between SARS-1 and SARS-2 at this position.

Being able to see these loci in the context of each other, as well as of the SARS-CoV-2 genomes in general, may be very beneficial for subclassifying the virus with respect to the human response.
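One way to check the correspondence at a given residue between two spike proteins, such as position 614 mentioned above, is to walk the columns of a pairwise alignment and count ungapped residues in each sequence. The sketch below assumes the two proteins have already been aligned by some external tool; the function name and the toy aligned strings are hypothetical and are not real spike sequences.

```python
def map_position(aligned_a, aligned_b, pos_a, gap="-"):
    """Map a 1-based residue position in sequence A to sequence B.

    Both inputs are rows of the same pairwise alignment (equal length,
    gaps marked with `gap`). Returns (position in B, residue in B), or
    None when the residue in A is aligned to a gap in B.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    count_a = 0
    count_b = 0
    for res_a, res_b in zip(aligned_a, aligned_b):
        if res_a != gap:
            count_a += 1
        if res_b != gap:
            count_b += 1
        if res_a != gap and count_a == pos_a:
            return (count_b, res_b) if res_b != gap else None
    raise IndexError("pos_a is beyond the end of sequence A")


if __name__ == "__main__":
    # Toy alignment rows, for illustration only.
    row_a = "MK-TLD"
    row_b = "MKQT-D"
    for pos in (3, 4, 5):
        print(pos, map_position(row_a, row_b, pos))
```

With real data, the two rows would come from an alignment of the SARS-1 and SARS-2 spike proteins, and `pos_a` would be 614 in whichever sequence that numbering refers to.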

Participants
