       nextflow run ${GITHUB_WORKSPACE} -profile test_tsv,docker --run_bam_filtering --bam_mapping_quality_threshold 37 --bam_unmapped_type 'fastq'
+  - name: BAM_FILTERING Run basic mapping pipeline with post-mapping length filtering
+    run: |
+      nextflow run ${GITHUB_WORKSPACE} -profile test_tsv,docker --clip_readlength 0 --run_bam_filtering --bam_filter_minreadlength 50
   - name: DEDUPLICATION Test with dedup
     run: |
       nextflow run ${GITHUB_WORKSPACE} -profile test_tsv,docker --dedupper 'dedup' --dedup_all_merged
@@ -160,17 +163,17 @@ jobs:
       for i in index0.idx ref.db ref.idx ref.inf table0.db table0.idx taxonomy.idx taxonomy.map taxonomy.tre; do wget https://github.com/nf-core/test-datasets/raw/eager/databases/malt/"$i" -P databases/malt/; done
   - name: METAGENOMIC Run the basic pipeline but with unmapped reads going into MALT
CHANGELOG.md
Lines changed: 4 additions & 3 deletions
@@ -24,11 +24,11 @@ and this project adheres to [Semantic Versioning](http://semver.org/spec/v2.0.0.
 * General documentation additions and cleaning, updated figures with CC-BY license
 * Added large 'fullsize' dataset test-profiles for ancient fish, human, and a draft pathogen contexts.
 * [#257](https://github.com/nf-core/eager/issues/257) Added the bowtie2 aligner as option for mapping, following Poullet and Orlando 2020 doi: [10.3389/fevo.2020.00105](https://doi.org/10.3389/fevo.2020.00105)
-* [#451] Adds ANGSD genotype likelihood calculations as alternative to typical 'genotypers'
-* [#504] Removed sexdeterrmine-snps plot from MultiQC report.
+* [#451](https://github.com/nf-core/eager/issues/451) Adds ANGSD genotype likelihood calculations as alternative to typical 'genotypers'
 * Nuclear contamination results are now shown in the MultiQC report.
 * Tutorial on how to use profiles for reproducible science (i.e. parameter sharing between different groups)
-* Added flexible trimming of bams by library type. 'half' and 'none' UDG libraries can now be trimmed differentially within a single eager run.
+* [#522](https://github.com/nf-core/eager/issues/522) Added post-mapping length filter to assist in more realistic endogenous DNA calculations
+* [#512](https://github.com/nf-core/eager/issues/512) Added flexible trimming of bams by library type. 'half' and 'none' UDG libraries can now be trimmed differentially within a single eager run.

 ### `Fixed`

@@ -48,6 +48,7 @@ and this project adheres to [Semantic Versioning](http://semver.org/spec/v2.0.0.
 * [#508](https://github.com/nf-core/eager/issues/508) - Made Markduplicates default dedupper due to narrower context specificity of dedup
 * [#516](https://github.com/nf-core/eager/issues/516) - Made bedtools not report out of memory exit code when warning of inconsistent FASTA/Bed entry names
+* [#504](https://github.com/nf-core/eager/issues/504) - Removed uninformative sexdeterrmine-snps plot from MultiQC report.
 * Nuclear contamination is now reported with the correct library names.
docs/usage.md
Lines changed: 8 additions & 0 deletions
@@ -549,6 +549,8 @@ Defines the adapter sequence to be used for the reverse read in paired end seque
 Defines the minimum read length that is required for reads after merging to be considered for downstream analysis after read merging. Default is `30`.

+Note that performing read length filtering at this step is not reliable for correct endogenous DNA calculation when you have a large percentage of very short reads in your library, such as those retrieved in single-stranded library protocols. When very few reads pass this length filter, the resulting very small denominator artificially inflates your endogenous DNA percentage. In these cases it is recommended to set this to 0 and to use `--bam_filter_minreadlength` instead, to filter out 'unusable' short reads after mapping.
+
 #### `--clip_min_read_quality`

 Defines the minimum read quality per base that is required for a base to be kept. Individual bases at the ends of reads falling below this threshold will be clipped off. Default is set to `20`.
@@ -705,6 +707,12 @@ Note that in all cases, if `--bam_mapping_quality_threshold` is also supplied, m
 Specify a mapping quality threshold for mapped reads to be kept for downstream analysis. By default keeps all reads and is therefore set to `0` (basically doesn't filter anything).

+#### `--bam_filter_minreadlength`
+
+Specify minimum length of mapped reads. This filtering is applied at the same time as mapping quality filtering.
+
+If used _instead_ of minimum read length filtering at AdapterRemoval, this can be useful to get more realistic endogenous DNA percentages when most of your reads are very short (e.g. in single-stranded libraries) and would otherwise be discarded by AdapterRemoval (thus making an artificially small denominator for a typical endogenous DNA calculation). Note that in this context you should neither perform mapping quality filtering nor discard unmapped reads, to ensure a correct 'denominator' of 'all reads' for the endogenous DNA calculation.
+
 ### Read DeDuplication Parameters

 If using TSV input, deduplication is performed per library, i.e. after lane merging.
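The denominator problem described in the `docs/usage.md` changes can be illustrated with a small worked example. This is a minimal sketch and not part of the pull request: the read counts are invented for illustration, `endogenous_percent` is a hypothetical helper, and it assumes endogenous DNA is calculated as mapped reads divided by the reads that went into mapping.

```python
def endogenous_percent(mapped_reads: int, reads_into_mapping: int) -> float:
    """Percentage of the reads sent to mapping that aligned to the reference."""
    if reads_into_mapping <= 0:
        raise ValueError("reads_into_mapping must be positive")
    return 100.0 * mapped_reads / reads_into_mapping


# Hypothetical single-stranded library (all counts are made up):
reads_after_merging = 1_000_000  # everything AdapterRemoval produced
reads_at_least_30bp = 100_000    # reads surviving a pre-mapping length filter of 30
mapped_at_least_30bp = 5_000     # mapped reads that are 30 bp or longer

# Length filtering before mapping: the short reads never reach the
# denominator, so the percentage looks ten times higher than it is.
inflated = endogenous_percent(mapped_at_least_30bp, reads_at_least_30bp)

# Pre-mapping length filter set to 0, length filtering applied after
# mapping: the denominator is still "all reads".
realistic = endogenous_percent(mapped_at_least_30bp, reads_after_merging)

print(f"filter before mapping: {inflated:.1f}%")
print(f"filter after mapping:  {realistic:.1f}%")
```

The numerator is the same in both cases; only the denominator changes, which is why the documentation suggests moving the length filter to after mapping for libraries dominated by very short reads.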
main.nf
Lines changed: 84 additions & 29 deletions
@@ -105,7 +105,8 @@ def helpMessage() {
 --run_bam_filtering Turn on samtools filter for mapping quality or unmapped reads of BAM files.
 --bam_mapping_quality_threshold Minimum mapping quality for reads filter. Default: ${params.bam_mapping_quality_threshold}
 --bam_unmapped_type Defines whether to discard all unmapped reads, keep both mapped and unmapped together, or save as bam and/or only fastq format. Options: 'discard', 'bam', 'keep', 'fastq', 'both'. Default: ${params.bam_unmapped_type}
-
+--bam_filter_minreadlength Specify minimum read length to be kept after mapping.
+
 DeDuplication
 --dedupper Deduplication method to use. Options: 'dedup', 'markduplicates'. Default: '${params.dedupper}'
 --dedup_all_merged Turn on treating all reads as merged reads.
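As a rough illustration of what the new `--bam_filter_minreadlength` help entry describes, the sketch below applies a mapping quality threshold and a minimum read length in a single pass. It is not the code in `main.nf` (the changed process is not visible in this part of the diff); `Read` and `filter_mapped_reads` are hypothetical names, and the reads are invented.

```python
from typing import Iterable, List, NamedTuple


class Read(NamedTuple):
    """Hypothetical stand-in for one mapped read in a BAM file."""
    name: str
    mapq: int    # mapping quality
    length: int  # read length in bp


def filter_mapped_reads(
    reads: Iterable[Read], min_mapq: int = 0, min_length: int = 0
) -> List[Read]:
    """Keep reads that pass both thresholds; 0 disables a threshold."""
    return [r for r in reads if r.mapq >= min_mapq and r.length >= min_length]


reads = [
    Read("read_a", mapq=37, length=60),
    Read("read_b", mapq=37, length=25),  # too short
    Read("read_c", mapq=10, length=80),  # low mapping quality
]

# Length filter only, mirroring the new CI test (minimum length 50).
length_only = filter_mapped_reads(reads, min_length=50)

# Both filters together.
both = filter_mapped_reads(reads, min_mapq=37, min_length=50)

print([r.name for r in length_only])
print([r.name for r in both])
```

Leaving `min_mapq` at 0 while setting only `min_length` corresponds to the usage recommended in the documentation when the goal is a realistic endogenous DNA percentage.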