Report normalized read counts #17

@epruesse

We have tried normalizing the read counts previously but saw no big impact. Presumably this is because the read count distribution is exponential, whereas sequencing depth typically does not vary by orders of magnitude across our samples, so correcting for depth changes little.

The reads fractionate as:

  • regular human mapped reads (hisat2)
  • additional human mapped reads (bowtie2)
  • reads mapped to human contigs (blast vs human)
  • reads mapped to classifiable contigs
  • reads mapped to unclassified contigs
  • reads unmapped

The unmapped reads are presumably dominated by poor quality reads, as are the reads mapped to unclassified contigs. A good definition of "total reads" would be the sum of all fractions except for that one. Alternatively, the typically dominant human mapped reads could be used exclusively as the normalization target. The question here is whether normalization then helps at all: DESeq & friends prefer raw read counts, do their own normalization, and would process a virus just as they would process any other expressed gene.
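To make the two candidate denominators concrete, here is a minimal Python sketch. The fraction names, the choice to exclude only the unmapped reads from "total reads", the choice to count only the hisat2 and bowtie2 fractions as "human mapped", and the per-million scaling are all assumptions made for illustration; none of them is settled in this issue.

```python
# Hypothetical per-sample read counts, keyed by fraction.
# The key names are illustrative, not identifiers from the pipeline.
sample = {
    "human_hisat2": 800_000,
    "human_bowtie2": 100_000,
    "human_contigs": 20_000,
    "classified_contigs": 30_000,
    "unclassified_contigs": 10_000,
    "unmapped": 40_000,
}


def total_reads(fractions, exclude=("unmapped",)):
    """Sum all fractions except the excluded ones.

    Which fractions to exclude is an open question; the default here
    (only the unmapped reads) is an assumption.
    """
    return sum(n for name, n in fractions.items() if name not in exclude)


def human_mapped_reads(fractions):
    """Sum the reads mapped directly to human (hisat2 + bowtie2).

    Leaving out reads mapped to human contigs is an assumption.
    """
    return fractions["human_hisat2"] + fractions["human_bowtie2"]


def normalize(count, denominator, scale=1_000_000):
    """Scale a raw read count to reads per `scale` denominator reads."""
    if denominator <= 0:
        raise ValueError("denominator must be positive")
    return count * scale / denominator


virus_reads = 480  # hypothetical raw count for one virus
per_million_total = normalize(virus_reads, total_reads(sample))
per_million_human = normalize(virus_reads, human_mapped_reads(sample))
```

Either value could be reported alongside the raw counts, so that tools expecting raw counts still receive unmodified input.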
