Commit 634ae16

Merge branch 'dev' into kraken2-emptyfastq-fix
2 parents 99053b5 + 0cca15a commit 634ae16

6 files changed

Lines changed: 11 additions & 11 deletions

CHANGELOG.md

Lines changed: 4 additions & 0 deletions
@@ -7,11 +7,15 @@ and this project adheres to [Semantic Versioning](http://semver.org/spec/v2.0.0.

  ### `Added`

+
  ### `Fixed`

  - [#882](https://github.com/nf-core/eager/pull/882) Define DSL1 execution explicitly, as new versions Nextflow made DSL2 default (♥ to & fix from @Lehmann-Fabian)
  - [#879](https://github.com/nf-core/eager/issues/879) Add missing threads parameter for pre-clipping FastQC for single end data that caused insufficient memory in some cases (♥ to @marcel-keller for reporting)
  - [#885](https://github.com/nf-core/eager/issues/885) Specify task memory for all tools in get_software_versions to account for incompatibilty of java with some SGE clusters causing hanging of the process (♥ to @maxibor for reporting)
+ - [#887](https://github.com/nf-core/eager/issues/887) Clarify what is considered 'ultra-short' reads in the help text of clip_readlength, for when you may wish to turn of length filtering during AdapterRemoval (♥ to @TCLamnidis for reporting)
+ - [#889](https://github.com/nf-core/eager/issues/889) Remove/updated parameters from benchmarking test profiles (♥ to @TCLamnidis for reporting)
+ - [#895](https://github.com/nf-core/eager/issues/895) Output documentation typo fix and added location of output docs in pipeline summary (♥ to @RodrigoBarquera for reporting)
  - [#897](https://github.com/nf-core/eager/issues/897) Fix pipeline crashing if no Kraken2 results generated (♥ to @alexandregilardet for reporting)

  ### `Dependencies`

conf/benchmarking_human.config

Lines changed: 3 additions & 7 deletions
@@ -12,26 +12,24 @@ params {
  config_profile_description = "A 'fullsized' benchmarking profile for deepish Human sequencing aDNA data"

  //Input data
- input = 'https://raw.githubusercontent.com/jfy133/test-datasets/eager/testdata/Benchmarking/benchmarking_human.tsv'
+ input = 'https://raw.githubusercontent.com/nf-core/test-datasets/eager/testdata/Benchmarking/benchmarking_human.tsv'
  // Genome reference
  fasta = 'https://hgdownload.soe.ucsc.edu/goldenPath/hg19/bigZips/hg19.fa.gz'

  run_bam_filtering = true
- bam_discard_unmapped = true
  bam_unmapped_type = 'discard'
  bam_mapping_quality_threshold = 30

  dedupper = 'markduplicates'

  run_trim_bam = true
- bamutils_clip_left = 1
- bamutils_clip_right = 1
+ bamutils_clip_double_stranded_none_udg_left = 1
+ bamutils_clip_double_stranded_none_udg_right = 1

  // JAR will need to be downloaded first!
  run_genotyping = true
  genotyping_tool = 'ug'
  genotyping_source = 'trimmed'
- gatk_ug_jar = 'GenomeAnalysisTK.jar'
  gatk_call_conf = 20

  run_sexdeterrmine = true
@@ -41,8 +39,6 @@ params {
  contamination_chrom_name = 'chrX'

  run_mtnucratio = true
-
-
  }

  process {

conf/benchmarking_vikingfish.config

Lines changed: 0 additions & 1 deletion
@@ -20,7 +20,6 @@ params {
  bwaalnl = 1024

  run_bam_filtering = true
- bam_discard_unmapped = true
  bam_unmapped_type = 'discard'
  bam_mapping_quality_threshold = 25

docs/output.md

Lines changed: 1 addition & 1 deletion
@@ -679,7 +679,7 @@ Each module has it's own output directory which sit alongside the `MultiQC/` dir
  * `fastqc/`: this contains the original per-FASTQ FastQC reports that are summarised with MultiQC. These occur in both `html` (the report) and `.zip` format (raw data). The `after_clipping` folder contains the same but for after AdapterRemoval.
  * `adapterremoval/`: this contains the log files (ending with `.settings`) with raw trimming (and merging) statistics after AdapterRemoval. In the `output` sub-directory, are the output trimmed (and merged) `fastq` files. These you can use for downstream applications such as taxonomic binning for metagenomic studies.
  * `post_ar_fastq_trimmed`: this contains `fastq` files that have been additionally trimmed after AdapterRemoval (if turned on). These reads are usually that had internal barcodes, or damage that needed to be removed before mapping.
- * `mapping/`: this contains a sub-directory corresponding to the mapping tool you used, inside of which will be the initial BAM files containing the reads that mapped to your reference genome with no modification (see below). You will also find a corresponding BAM index file (ending in `.csi` or `.bam`), and if running the `bowtie2` mapper: a log ending in `_bt2.log`. You can use these for downstream applications e.g. if you wish to use a different de-duplication tool not included in nf-core/eager (although please feel free to add a new module request on the Github repository's [issue page](https://github.com/nf-core/eager/issues)!).
+ * `mapping/`: this contains a sub-directory corresponding to the mapping tool you used, inside of which will be the initial BAM files containing the reads that mapped to your reference genome with no modification (see below). You will also find a corresponding BAM index file (ending in `.csi` or `.bai`), and if running the `bowtie2` mapper: a log ending in `_bt2.log`. You can use these for downstream applications e.g. if you wish to use a different de-duplication tool not included in nf-core/eager (although please feel free to add a new module request on the Github repository's [issue page](https://github.com/nf-core/eager/issues)!).
  * `samtools/`: this contains two sub-directories. `stats/` contain the raw mapping statistics files (ending in `.stats`) from directly after mapping. `filter/` contains BAM files that have had a mapping quality filter applied (set by the `--bam_mapping_quality_threshold` flag) and a corresponding index file. Furthermore, if you selected `--bam_discard_unmapped`, you will find your separate file with only unmapped reads in the format you selected. Note unmapped read BAM files will _not_ have an index file.
  * `deduplication/`: this contains a sub-directory called `dedup/`, inside here are sample specific directories. Each directory contains a BAM file containing mapped reads but with PCR duplicates removed, a corresponding index file and two stats file. `.hist.` contains raw data for a deduplication histogram used for tools like preseq (see below), and the `.log` contains overall summary deduplication statistics.
  * `endorSpy/`: this contains all JSON files exported from the endorSpy endogenous DNA calculation tool. The JSON files are generated specifically for display in the MultiQC general statistics table and is otherwise very likely not useful for you.

main.nf

Lines changed: 1 addition & 0 deletions
@@ -3320,6 +3320,7 @@ workflow.onComplete {
  if (workflow.success) {
  log.info "-${c_purple}[nf-core/eager]${c_green} Pipeline completed successfully${c_reset}-"
  log.info "-${c_purple}[nf-core/eager]${c_green} MultiQC run report can be found in ${params.outdir}/multiqc ${c_reset}-"
+ log.info "-${c_purple}[nf-core/eager]${c_green} Further output documentation can be seen at https://nf-core/eager/output ${c_reset}-"
  } else {
  checkHostname()
  log.info "-${c_purple}[nf-core/eager]${c_red} Pipeline completed with errors${c_reset}-"

nextflow_schema.json

Lines changed: 2 additions & 2 deletions
@@ -475,7 +475,7 @@
  "default": 30,
  "description": "Specify read minimum length to be kept for downstream analysis.",
  "fa_icon": "fas fa-ruler",
- "help_text": "Defines the minimum read length that is required for reads after merging to be considered for downstream analysis after read merging. Default is `30`.\n\nNote that performing read length filtering at this step is not reliable for correct endogenous DNA calculation, when you have a large percentage of very short reads in your library - such as retrieved in single-stranded library protocols. When you have very few reads passing this length filter, it will artificially inflate your endogenous DNA by creating a very small denominator. In these cases it is recommended to set this to 0, and use `--bam_filter_minreadlength` instead, to filter out 'un-usable' short reads after mapping.\n\n> Modifies AdapterRemoval parameter: `--minlength`\n"
+ "help_text": "Defines the minimum read length that is required for reads after merging to be considered for downstream analysis after read merging. Default is `30`.\n\nNote that when you have a large percentage of very short reads in your library (< 20 bp) - such as retrieved in single-stranded library protocols - that performing read length filtering at this step is not _always_ reliable for correct endogenous DNA calculation. When you have very few reads passing this length filter, it will artificially inflate your 'endogenous DNA' value by creating a very small denominator. \n\nIf you notice you have ultra short reads (< 20 bp), it is recommended to set this parameter to 0, and use `--bam_filter_minreadlength` instead, to filter out 'un-usable' short reads after mapping. A caveat, however, is that this will cause a very large increase in computational run time, due to all reads in the library will be being mapped.\n\n> Modifies AdapterRemoval parameter: `--minlength`\n"
  },
  "clip_min_read_quality": {
  "type": "integer",
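
The revised `help_text` warns that length filtering before mapping can inflate the 'endogenous DNA' value by shrinking the denominator. A minimal sketch of that arithmetic follows, assuming the value is computed as mapped reads divided by reads entering mapping, times 100. The function name and all read counts are hypothetical and chosen only to make the effect visible.

```python
def endogenous_pct(mapped_reads: int, reads_into_mapping: int) -> float:
    """Percentage of the reads entering mapping that aligned to the reference."""
    if reads_into_mapping == 0:
        raise ValueError("no reads entered mapping")
    return 100.0 * mapped_reads / reads_into_mapping


# Hypothetical library dominated by ultra-short reads.
total_reads = 1_000_000    # reads after merging
reads_ge_30bp = 50_000     # reads surviving a 30 bp filter during clipping
mapped = 5_000             # assume only the longer reads map

# Filtering before mapping: the short reads vanish from the denominator.
filtered_first = endogenous_pct(mapped, reads_ge_30bp)

# Length filter set to 0 at clipping, short reads filtered after mapping.
filtered_after = endogenous_pct(mapped, total_reads)

print(f"filter before mapping: {filtered_first:.1f}%")
print(f"filter after mapping:  {filtered_after:.1f}%")
```

With these made-up counts the same 5,000 mapped reads give 10.0% in the first case and 0.5% in the second, which is the inflation the help text describes.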
@@ -1683,7 +1683,7 @@
  "maltextract_percentidentity": {
  "type": "number",
  "description": "Minimum percent identity alignments are required to have to be reported. Recommended to set same as MALT parameter.",
- "default": 85.0,
+ "default": 85,
  "fa_icon": "fas fa-id-card",
  "help_text": "Minimum percent identity alignments are required to have to be reported. Higher values allows fewer mismatches between read and reference sequence, but therefore will provide greater confidence in the hit. Lower values allow more mismatches, which can account for damage and divergence of a related strain/species to the reference. Recommended to set same as MALT parameter or higher. Default: `85.0`.\n\nOnly when `--metagenomic_tool malt` is also supplied.\n\n> Modifies MaltExtract parameter: `--minPI`"
  },
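
The `maltextract_percentidentity` default changes from `85.0` to `85` while its declared type stays `"number"`. As a rough illustration, assuming a consumer that reads the schema with a standard JSON parser (how Nextflow itself handles the value is not shown in this diff), the two spellings parse to different Python types but compare equal:

```python
import json

old_default = json.loads("85.0")  # parsed as a float
new_default = json.loads("85")    # parsed as an int

# Numerically the two defaults are the same value.
same_value = old_default == new_default

print(type(old_default).__name__, type(new_default).__name__, same_value)
```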
