nf-core
diff --git a/assets/multiqc_config.yaml b/assets/multiqc_config.yaml
Lines changed: 8 additions & 8 deletions
diff --git a/conf/base.config b/conf/base.config
Lines changed: 4 additions & 0 deletions
diff --git a/docs/output.md b/docs/output.md
Lines changed: 17 additions & 1 deletion
@@ -88,10 +88,10 @@ top_modules:
             - '*_postfilterflagstat.stats'
     - 'dedup'
     - 'picard'
+    - 'preseq'
     - 'damageprofiler'
-    - 'qualimap'
     - 'mtnucratio'
-    - 'preseq'
+    - 'qualimap'
     - 'sexdeterrmine'
     - 'gatk'
     - 'multivcfanalyzer':
@@ -151,13 +151,13 @@ table_columns_visible:
         3 Prime2: False
         mean_readlength: True
         median: True
+    mtnucratio: 
+        mt_nuc_ratio: True
     QualiMap:
         mean_coverage: True
         1_x_pc: True
         5_x_pc: True
         percentage_aligned: False
-    mtnucratio: 
-        mt_nuc_ratio: True
     MultiVCFAnalyzer:
         Heterozygous SNP alleles (percent): True
 
@@ -205,6 +205,10 @@ table_columns_placement:
         3 Prime2: 730
         mean_readlength: 740
         median: 750
+    mtnucratio:
+        mtreads: 760
+        mt_cov_avg: 770
+        mt_nuc_ratio: 780
     QualiMap:
         mean_coverage: 800
         median_coverage: 810
@@ -214,10 +218,6 @@ table_columns_placement:
         4_x_pc: 850
         5_x_pc: 860
         avg_gc: 870
-    mtnucratio:
-        mtreads: 900
-        mt_cov_avg: 910
-        mt_nuc_ratio: 920
     sexdeterrmine:
         RateX: 100
         RateY: 1010
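The `table_columns_placement` hunks above move the `mtnucratio` columns from the 900s to the 760-780 range. A minimal Python sketch of the effect, assuming MultiQC orders General Stats columns by ascending placement value (the helper `column_order` and the tuple keys are hypothetical, used only for illustration; the numbers are taken from the diff):

```python
# Placement values for the QualiMap columns visible in the diff (unchanged by this PR).
QUALIMAP = {
    ("QualiMap", "mean_coverage"): 800,
    ("QualiMap", "median_coverage"): 810,
    ("QualiMap", "4_x_pc"): 850,
    ("QualiMap", "5_x_pc"): 860,
    ("QualiMap", "avg_gc"): 870,
}

# mtnucratio placements before and after this change.
MTNUC_OLD = {
    ("mtnucratio", "mtreads"): 900,
    ("mtnucratio", "mt_cov_avg"): 910,
    ("mtnucratio", "mt_nuc_ratio"): 920,
}
MTNUC_NEW = {
    ("mtnucratio", "mtreads"): 760,
    ("mtnucratio", "mt_cov_avg"): 770,
    ("mtnucratio", "mt_nuc_ratio"): 780,
}


def column_order(placement):
    """Return (module, column) keys sorted by ascending placement value.

    Assumption: lower placement values are shown further left in the table.
    """
    return sorted(placement, key=placement.get)


old_order = column_order({**QUALIMAP, **MTNUC_OLD})
new_order = column_order({**QUALIMAP, **MTNUC_NEW})

# Module of the left-most column under each configuration.
print("before:", old_order[0][0])
print("after: ", new_order[0][0])
```

Under that assumption, the `mtnucratio` columns now sit to the left of the QualiMap columns, which matches the reordering of the two modules in `table_columns_visible`.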
 
@@ -76,6 +76,10 @@ process {
     errorStrategy = 'ignore'
   }
 
+  withName:damageprofiler {
+    errorStrategy = { task.exitStatus in [1,143,137,104,134,139] ? 'retry' : 'finish' }
+  }
+
   // Add 141 ignore due to unclean pipe closing by pmdtools https://github.com/pontussk/PMDtools/issues/7
   withName: pmdtools {
     errorStrategy = { task.exitStatus in [141] ? 'ignore' : 'retry' }
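The two `errorStrategy` closures in this hunk are simple lookups on the task exit status. As a rough illustration only, here is the same decision logic in Python (the function names are hypothetical; Nextflow itself evaluates the Groovy closures shown in the diff):

```python
# Exit codes that trigger a retry for damageprofiler, copied from the diff.
DAMAGEPROFILER_RETRY_CODES = {1, 143, 137, 104, 134, 139}

# Exit code ignored for pmdtools (unclean pipe closing), copied from the diff.
PMDTOOLS_IGNORE_CODES = {141}


def damageprofiler_error_strategy(exit_status: int) -> str:
    """Mirror of the closure added for damageprofiler: retry on listed codes, otherwise finish."""
    return "retry" if exit_status in DAMAGEPROFILER_RETRY_CODES else "finish"


def pmdtools_error_strategy(exit_status: int) -> str:
    """Mirror of the existing pmdtools closure: ignore 141, otherwise retry."""
    return "ignore" if exit_status in PMDTOOLS_IGNORE_CODES else "retry"


if __name__ == "__main__":
    for code in (1, 2, 137, 141):
        print(code, damageprofiler_error_strategy(code), pmdtools_error_strategy(code))
```

Note the asymmetry: for damageprofiler an unlisted exit code falls through to `finish`, whereas for pmdtools an unlisted exit code falls through to `retry`.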
 
@@ -108,6 +108,8 @@ For other non-default columns, hover over the column name for further descriptio
 
 [FastQC](http://www.bioinformatics.babraham.ac.uk/projects/fastqc/) gives general quality metrics about your raw reads. It provides information about the quality score distribution across your reads, the per base sequence content (%T/A/G/C) as sequenced. You also get information about adapter contamination and other overrepresented sequences.
 
+You will receive output for each supplied FASTQ file.
+
 When dealing with ancient DNA data the MultiQC plots for FastQC will often show lots of 'warning' or 'failed' samples. You generally can discard this sort of information as we are dealing with very degraded and metagenomic samples which have artefacts that violate the FastQC 'quality definitions', while still being valid data for aDNA researchers. Instead you will _normally_ be looking for 'global' patterns across all samples of a sequencing run to check for library construction or sequencing failures. The decision on whether an individual sample has 'failed' or not should be made by the user after checking all the plots themselves (e.g. if the sample is consistently an outlier to all others in the run).
 
 For further reading and documentation see the [FastQC help](http://www.bioinformatics.babraham.ac.uk/projects/fastqc/Help/).
@@ -243,6 +245,8 @@ In the case of dual-indexed paired-end sequencing, it is likely poly-G tails are
 
 While the MultiQC report has multiple plots for FastP, we will only look at GC content as that's the functionality we use currently.
 
+You will receive output for each supplied FASTQ file.
+
 #### GC Content
 
 This line plot shows the average GC content (Y axis) across each nucleotide of the reads (X-axis). There are two buttons per read (i.e. 2 for single-end, and 4 for paired-end) representing before and after the poly-G tail trimming.
@@ -274,6 +278,8 @@ Quality trimming (or 'truncating') involves looking at ends of reads for low-con
 
 Length filtering involves removing any read that does not reach the number of bases specified by a particular value.
 
+You will receive output for each FASTQ file supplied for single-end data, or for each pair of merged FASTQ files for paired-end data.
+
 #### Retained and Discarded Reads Plot
 
 These stacked bar plots are unfortunately a little confusing when displayed in MultiQC. However, they are relatively straightforward once you understand each category. They can be displayed as counts of reads per AdapterRemoval read-category, or as percentages of the same values. Each forward(/reverse) file combination is displayed once.
@@ -317,6 +323,8 @@ With paired-end ancient DNA sequencing runs You expect to see a slight increase
 
 This module provides numbers in raw counts of the mapping of your DNA reads to your reference genome.
 
+You will receive output for each _library_. This means that if you use TSV input and have one library sequenced over multiple lanes, the lanes are merged and you will get the mapping statistics of all lanes in one value.
+
 #### Flagstat Plot
 
 This dot plot shows different statistics, and the number of reads (typically as a multiple, e.g. millions or thousands) is represented by dots on the X axis.
@@ -335,6 +343,8 @@ The remaining rows will be 0 when running `bwa aln` as these characteristucs of
 
 ### DeDup
 
+You will receive output for each _library_. This means that if you use TSV input and have one library sequenced over multiple lanes, the lanes are merged and you will get the mapping statistics of all lanes of the library in one value.
+
 #### Background
 
 DeDup is a duplicate removal tool which searches for PCR duplicates and removes them from your BAM file. We remove these duplicates because otherwise you would be artificially increasing your coverage and subsequently confidence in genotyping, by considering these lab artefacts which are not biologically meaningful. DeDup looks for reads with the same start and end coordinates, and whether they have exactly the same sequence. The main difference of DeDup versus e.g. `samtools markduplicates` is that DeDup considers _both_ ends of a read, not just the start position, so it is more precise in removing actual duplicates without penalising often already low aDNA data.
@@ -364,6 +374,8 @@ Things to look out for:
 
 ### Preseq
 
+You will receive output for each deduplicated _library_. This means that if you use TSV input and have one library sequenced over multiple lanes, the lanes are merged and you will get the mapping statistics of all lanes of the library in one value.
+
 #### Background
 
 Preseq is a collection of tools that allow assessment of the complexity of the library, where complexity means the number of unique molecules in your library (i.e. not molecules with the exact same length and sequence).
@@ -390,6 +402,8 @@ Plateauing can be caused by a number of reasons:
 
 ### DamageProfiler
 
+You will receive output for each deduplicated _library_. This means that if you use TSV input and have one library sequenced over multiple lanes, the lanes are merged and you will get the mapping statistics of all lanes of the library in one value.
+
 #### Background
 
 DamageProfiler is a tool which calculates a variety of standard 'aDNA' metrics from a BAM file. The primary plots here are the misincorporation and length distribution plots. Ancient DNA undergoes depurination and hydrolysis, causing fragmentation of molecules into gradually shorter fragments, and cytosine to thymine deamination damage, that occur on the subsequent single-stranded overhangs at the ends of molecules.
@@ -431,14 +445,16 @@ When looking at the length distribution plots, keep in mind the following:
 
 ### QualiMap
 
-#### QualiMap
+#### Background
 
 QualiMap is a tool which provides statistics on the quality of the mapping of your reads to your reference genome. It allows you to assess how well covered your reference genome is by your data, both in 'fold' depth (average number of times a given base on the reference is covered by a read) and 'percentage' (the percentage of all bases on the reference genome that is covered at a given fold depth). These outputs allow you to decide whether you have enough quality data for downstream applications like genotyping, and how to adjust the parameters for those tools accordingly.
 
 > NB: Neither fold coverage nor percent coverage on their own is sufficient to assess whether you have a high quality mapping. Abnormally high fold coverages of a smaller region such as highly conserved genes or un-removed-adapter-containing reference genomes can artificially inflate the mean coverage, yet a high percent coverage is not useful if all bases of the genome are covered at just 1x coverage.
 
 Note that many of the statistics from this module are displayed in the General Stats table (see above), as they represent single values that are not plottable.
 
+You will receive output for each _sample_. This means you will receive statistics of deduplicated values of all types of libraries combined in a single value (i.e. non-UDG treated, full-UDG, paired-end, single-end all together).
+
 #### Coverage Histogram
 
 This plot shows on the X axis the range of fold coverages that the bases of the reference genome are possibly covered by. The Y axis shows the number of bases that were covered at the given fold coverage depth as indicated on the X axis.
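
The notes added to `docs/output.md` in this diff distinguish three reporting levels: per FASTQ file (FastQC, FastP), per library (mapping statistics, DeDup, Preseq, DamageProfiler) and per sample (QualiMap). A minimal Python sketch of how one input sheet collapses at each level; the rows, file names and the keys `sample`, `library`, `lane` and `fastq` are hypothetical and are not the pipeline's actual TSV columns:

```python
from collections import defaultdict

# Hypothetical input: one sample, two libraries, one of them sequenced over two lanes.
rows = [
    {"sample": "S1", "library": "Lib1", "lane": 1, "fastq": "S1_Lib1_L1.fq.gz"},
    {"sample": "S1", "library": "Lib1", "lane": 2, "fastq": "S1_Lib1_L2.fq.gz"},
    {"sample": "S1", "library": "Lib2", "lane": 1, "fastq": "S1_Lib2_L1.fq.gz"},
]


def group_by(rows, key):
    """Collect FASTQ file names under each distinct value of `key`."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[key]].append(row["fastq"])
    return dict(groups)


per_fastq = [row["fastq"] for row in rows]  # one report entry per FASTQ file
per_library = group_by(rows, "library")     # lanes of a library merged into one entry
per_sample = group_by(rows, "sample")       # all libraries of a sample in one entry

print(len(per_fastq), "per-FASTQ entries")
print(len(per_library), "per-library entries")
print(len(per_sample), "per-sample entries")
```

In this toy example the same input yields three per-FASTQ entries, two per-library entries and a single per-sample entry, which is why the number of rows can differ between modules in the same MultiQC report.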