Merging of ssDNA and dsDNA genotypes. #481

He Yu (a colleague) has developed a small tool to "smush" together genotypes from different libraries of the same individual. Some version of this could be implemented in eager to solve the issues with ssDNA vs dsDNA genotyping with pileupCaller. It would allow users to keep genotypes called with `--singleStrandMode`, while filling in any missing positions with genotypes from an accompanying dsDNA library of the same individual.

The (soon to be updated) implementation of pileupCaller will create two genotype datasets, one for dsDNA and one for ssDNA libraries. Adding a genotype "smushing" option would allow us to provide a single dataset with a single entry for each "duplicated" individual, combining the best of the data from both library types.

The current implementation of He's tool randomly picks one of the two genotypes at positions where both versions of an individual are genotyped. It might be good to add an option to override this behaviour (so that dsDNA or ssDNA is preferred if the user has a preference) before it is implemented in eager.
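To make the proposed behaviour concrete, here is a minimal sketch of the merging logic as I understand it. This is not He's actual code: the function name `merge_genotypes`, the `prefer` option, and the use of EIGENSTRAT-style genotype strings (with `9` as the missing code) are all assumptions for illustration.

```python
import random

# Assumption: genotypes are EIGENSTRAT-style strings, one character per SNP,
# with "9" marking a missing call.
MISSING = "9"


def merge_genotypes(ss_calls, ds_calls, prefer=None, rng=None):
    """Merge ssDNA and dsDNA genotype strings of one individual.

    prefer: None (random pick where both are called), "ssDNA" or "dsDNA".
    This is a hypothetical option, not part of the existing tool.
    """
    if len(ss_calls) != len(ds_calls):
        raise ValueError("genotype strings must cover the same SNPs")
    if prefer not in (None, "ssDNA", "dsDNA"):
        raise ValueError("prefer must be None, 'ssDNA' or 'dsDNA'")
    rng = rng or random.Random()

    merged = []
    for ss, ds in zip(ss_calls, ds_calls):
        if ss == MISSING:
            # Fill in from the dsDNA library (stays missing if both are).
            merged.append(ds)
        elif ds == MISSING:
            merged.append(ss)
        elif prefer == "ssDNA":
            merged.append(ss)
        elif prefer == "dsDNA":
            merged.append(ds)
        else:
            # Current behaviour as described: random pick between the two.
            merged.append(rng.choice((ss, ds)))
    return "".join(merged)


if __name__ == "__main__":
    print(merge_genotypes("2909", "0290", prefer="ssDNA"))
    print(merge_genotypes("2909", "0290", prefer="dsDNA"))
```

With `prefer=None` the sketch falls back to the random pick, so the existing behaviour would stay the default and the preference would be opt-in.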

I will look into packaging the tool with `conda` and implementing it in eager after I discuss further with He.
