Merging of ssDNA and dsDNA genotypes. #481

@TCLamnidis

He Yu (a colleague) has developed a small tool to "smush" together genotypes from different libraries of the same individual. Some version of this could be integrated into eager to solve the issues with ssDNA vs dsDNA genotyping in pileupCaller. It would let users keep the genotypes called with --singleStrandMode while filling in any missing positions with genotypes from an accompanying dsDNA library of the same individual.

The (soon to be updated) implementation of pileupCaller will create two genotype datasets, one for dsDNA libraries and one for ssDNA libraries. Adding a genotype "smushing" option would let us provide a single dataset with one entry for each "duplicated" individual, combining the best data from both library types.

The current implementation of He's tool randomly picks one of the two genotypes at positions where both versions of an individual are genotyped. It might be good to add an option to override this behaviour (so that dsDNA or ssDNA is preferred if the user has a preference) before it is implemented in eager.
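To make the proposed behaviour concrete, here is a minimal Python sketch of the merge logic described above. It is not He's tool: the function name `merge_genotypes`, the `prefer` parameter, and the choice to represent each individual as a string of EIGENSTRAT-style genotype codes (`0`, `1`, `2`, with `9` for missing) are all assumptions made for illustration.

```python
import random

# Assumption: genotypes use EIGENSTRAT-style codes, where "9" marks a missing call.
MISSING = "9"


def merge_genotypes(ss, ds, prefer=None, rng=None):
    """Merge per-SNP genotype strings from an ssDNA and a dsDNA library.

    ss, ds  -- equal-length strings, one genotype code per SNP (hypothetical layout)
    prefer  -- None (pick at random), "ss" or "ds" (hypothetical option)
    rng     -- optional random.Random instance, for reproducible random picks
    """
    if len(ss) != len(ds):
        raise ValueError("genotype strings must cover the same SNPs")
    if prefer not in (None, "ss", "ds"):
        raise ValueError("prefer must be None, 'ss' or 'ds'")
    rng = rng or random.Random()

    merged = []
    for s, d in zip(ss, ds):
        if s == MISSING:
            # Fill ssDNA gaps from the dsDNA library (stays missing if both are).
            merged.append(d)
        elif d == MISSING:
            merged.append(s)
        elif prefer == "ss":
            merged.append(s)
        elif prefer == "ds":
            merged.append(d)
        else:
            # Both libraries are genotyped and no preference is set: pick one at random.
            merged.append(rng.choice((s, d)))
    return "".join(merged)


if __name__ == "__main__":
    print(merge_genotypes("2909", "0299", prefer="ss"))  # 2209
    print(merge_genotypes("2909", "0299", prefer="ds"))  # 0209
```

With `prefer=None` the sketch falls back to the random pick described above, so only positions genotyped in both libraries can differ between runs. Passing a seeded `random.Random` makes that pick reproducible.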

I will look into packaging the tool for conda and implementing it in eager after I discuss this further with He.

Labels: enhancement (New feature or request), feature, pending (Addressed on branch waiting for related PR)
