Retain read group information in bam merging steps #808

@TCLamnidis

Is your feature request related to a problem? Please describe

Currently, every BAM merging step in nf-core/eager overwrites the read groups in the BAM file, discarding information that would otherwise let users trace specific reads back to their library or sequencing run. Some form of this information may still exist among the intermediate files, but it should not be discarded without cause.

This information can also be important for calling genotype likelihoods (which eager does not currently do, but which might be a good future addition).

Describe the solution you'd like

Each BAM merging step should return the union of the input read groups instead of overwriting that information.
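
To make "union of read groups" concrete, here is a minimal sketch in plain Python that collects the `@RG` header lines from several SAM headers and keeps one entry per read group ID. This is not how eager currently works, and the function name `union_read_groups` and the example read group IDs are hypothetical. How to handle two inputs that define the same ID differently is also an assumption here: the sketch simply raises an error.

```python
def union_read_groups(headers):
    """Return the union of @RG lines across several SAM header strings.

    Hypothetical helper for illustration; not part of nf-core/eager.
    """
    merged = {}  # read group ID -> full @RG line, in first-seen order
    for header in headers:
        for line in header.splitlines():
            if not line.startswith("@RG\t"):
                continue  # only read group lines are of interest here
            # Split "TAG:value" fields; values may themselves contain ':'
            fields = dict(f.split(":", 1) for f in line.split("\t")[1:])
            rg_id = fields["ID"]
            if rg_id in merged and merged[rg_id] != line:
                # Assumption: conflicting definitions are treated as an error
                raise ValueError(f"conflicting definitions for read group {rg_id!r}")
            merged.setdefault(rg_id, line)
    return list(merged.values())


# Two per-library headers for the same sample (made-up example values)
lib1 = "@HD\tVN:1.6\tSO:coordinate\n@RG\tID:lib1_run1\tSM:sampleA\tLB:lib1\n"
lib2 = "@HD\tVN:1.6\tSO:coordinate\n@RG\tID:lib2_run1\tSM:sampleA\tLB:lib2\n"
read_groups = union_read_groups([lib1, lib2])
```

With both `@RG` lines kept in the merged header, and each read keeping its own `RG` tag, a read can still be traced back to its library after merging.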

Additional context

The current behaviour is (I think) a leftover from EAGER that had to do with how pathogen screening works and how GATK UG prefers its input BAMs.

I think tweaking the read groups produced during mapping could kill two birds with one stone. I am investigating this further.

Labels: bug, enhancement
