**CHANGELOG.md**: 6 additions & 2 deletions
```diff
@@ -7,13 +7,17 @@ and this project adheres to [Semantic Versioning](http://semver.org/spec/v2.0.0.
 ### `Added`
 
+- [#640](https://github.com/nf-core/eager/issues/640) - Added pre-metagenomic-screening filtering of low-sequence-complexity reads with `bbduk`
+- [#583](https://github.com/nf-core/eager/issues/583) - Added `mapDamage2` rescaling of BAM files to remove damage
 - Updated usage (merging files) and workflow images reflecting new functionality.
 
 ### `Fixed`
 
 - Removed leftover old DockerHub push CI commands.
-- [#627](https://github.com/nf-core/eager/issues/627) Added de Barros Damgaard citation to README
-- [#630](https://github.com/nf-core/eager/pull/630) Better handling of Qualimap memory requirements and error strategy.
+- [#627](https://github.com/nf-core/eager/issues/627) - Added de Barros Damgaard citation to README
+- [#630](https://github.com/nf-core/eager/pull/630) - Better handling of Qualimap memory requirements and error strategy.
+- Fixed some incomplete schema options to ensure users supply valid input values
+- [#638](https://github.com/nf-core/eager/issues/638#issuecomment-748877567) Fixed inverted circularfilter filtering (previously filtering would happen by default, not when requested by the user as originally recorded in the documentation)
```
**README.md**

2. Install any of [`Docker`](https://docs.docker.com/engine/installation/), [`Singularity`](https://www.sylabs.io/guides/3.0/user-guide/) or [`Podman`](https://podman.io/) for full pipeline reproducibility _(please only use [`Conda`](https://conda.io/miniconda.html) as a last resort; see [docs](https://nf-co.re/usage/configuration#basic-configuration-profiles))_

3. Download the pipeline and test it on a minimal dataset with a single command:

    ```bash
    nextflow run nf-core/eager -profile test,<docker/singularity/podman/conda/institute>
    ```

    > Please check [nf-core/configs](https://github.com/nf-core/configs#documentation) to see if a custom config file to run nf-core pipelines already exists for your Institute. If so, you can simply use `-profile <institute>` in your command. This will enable either `docker` or `singularity` and set the appropriate execution settings for your local compute environment.

4. Start running your own analysis!

    ```bash
    nextflow run nf-core/eager -profile <docker/singularity/conda> --input '*_R{1,2}.fastq.gz' --fasta '<your_reference>.fasta'
    ```

5. Once your run has completed successfully, clean up the intermediate files.

    ```bash
    nextflow clean -f -k
    ```

See [usage docs](https://nf-co.re/eager/docs/usage.md) for all of the available options when running the pipeline.

**N.B.** You can see an overview of the run in the MultiQC report located at `./results/MultiQC/multiqc_report.html`
Modifications to the default pipeline are easily made using various options as described in the documentation.
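Such modifications also extend to per-process resources: a custom configuration file passed with `-c` can override them (relevant, for example, to the Qualimap memory handling mentioned in the changelog). A minimal sketch using standard Nextflow process-selector syntax; the selector name `qualimap` is an assumption here, so check the pipeline source for the exact process name:

```
// custom.config -- pass with: nextflow run nf-core/eager -c custom.config ...
process {
    withName: qualimap {
        memory = '16 GB'
    }
}
```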
## Pipeline Summary

### Default Steps
Additional functionality contained by the pipeline currently includes:

#### Metagenomic Screening
* Low-sequence-complexity filtering (`BBduk`)
* Taxonomic binner with alignment (`MALT`)
* Taxonomic binner without alignment (`Kraken2`)
* aDNA characteristic screening of taxonomically binned data from MALT (`MaltExtract`)
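The low-complexity filtering above relies on BBduk's entropy filter. As a rough standalone sketch (file names hypothetical), discarding reads whose per-read Shannon entropy falls below 0.3 looks like:

```bash
# Remove low-complexity reads: anything with per-read entropy < 0.3 is dropped.
bbduk.sh in=sample.fastq.gz out=sample_lowcomplexityremoved.fq.gz entropy=0.3
```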
A graphical overview of suggested routes through the pipeline depending on context:

<img src="docs/images/usage/eager2_metromap_complex.png" alt="nf-core/eager metro map" width="70%">
## Documentation

The nf-core/eager pipeline comes with documentation about the pipeline: [usage](https://nf-co.re/eager/usage) and [output](https://nf-co.re/eager/output).
In addition, references of tools and data used in this pipeline are as follows:

* **Bowtie2** Langmead, B. and Salzberg, S. L. 2012. Fast gapped-read alignment with Bowtie 2. Nature Methods, 9(4), pp. 357–359. doi: [10.1038/nmeth.1923](https://dx.doi.org/10.1038/nmeth.1923).
* **sequenceTools** Stephan Schiffels (Unpublished). Download: [https://github.com/stschiff/sequenceTools](https://github.com/stschiff/sequenceTools)
* **EigenstratDatabaseTools** Thiseas C. Lamnidis (Unpublished). Download: [https://github.com/TCLamnidis/EigenStratDatabaseTools.git](https://github.com/TCLamnidis/EigenStratDatabaseTools.git)
* **mapDamage2** Jónsson, H., et al. 2013. mapDamage2.0: fast approximate Bayesian estimates of ancient DNA damage parameters. Bioinformatics, 29(13), pp. 1682–1684. [https://doi.org/10.1093/bioinformatics/btt193](https://doi.org/10.1093/bioinformatics/btt193)
* **BBduk** Brian Bushnell (Unpublished). Download: [https://sourceforge.net/projects/bbmap/](https://sourceforge.net/projects/bbmap/)
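The `mapDamage2` rescaling referenced above can also be run outside the pipeline. A minimal sketch (file names hypothetical):

```bash
# Fit the damage model and write a rescaled BAM in which base qualities at
# likely deaminated positions are probabilistically downgraded.
mapDamage -i sample.bam -r reference.fasta --rescale
```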
**docs/output.md**: 2 additions
Each module has its own output directory, which sits alongside the `MultiQC/` directory.

- `damageprofiler/` - this contains sample-specific directories containing raw statistics and damage plots from DamageProfiler. The `.pdf` files can be used to visualise C-to-T miscoding lesions or read length distributions of your mapped reads. All raw statistics used for the PDF plots are contained in the `.txt` files.
- `pmdtools/` - this contains raw output statistics of pmdtools (estimates of frequencies of substitutions), and BAM files which have been filtered to remove reads that do not reach the post-mortem damage (PMD) score set with `--pmdtools_threshold`.
- `trimmed_bam/` - this contains the BAM files with X number of bases trimmed off, as defined with the `--bamutils_clip_half_udg_left`, `--bamutils_clip_half_udg_right`, `--bamutils_clip_none_udg_left`, and `--bamutils_clip_none_udg_right` flags, plus the corresponding index files. You can use these BAM files for downstream analysis, such as re-mapping the data with more stringent parameters (if you set trimming to remove the positions in the read most likely to contain damage).
- `damage_rescaling/` - this contains rescaled BAM files from mapDamage2. These BAM files have damage probabilistically removed via a Bayesian model and can be used for downstream genotyping.
- `genotyping/` - this contains all the (gzipped) genotyping files produced by your genotyping module. The file suffix will include the genotyping tool name. You will have files corresponding to each of your deduplicated BAM files (except for pileupcaller), or to any enabled downstream processes that create BAMs (e.g. trimmed BAMs or pmdtools). If `--gatk_ug_keep_realign_bam` is supplied, this may also contain BAM files from InDel realignment when using GATK 3 and UnifiedGenotyper for variant calling. When pileupcaller is used to create eigenstrat genotypes, this directory also contains eigenstrat SNP coverage statistics.
- `multivcfanalyzer/` - this contains all output from MultiVCFAnalyzer, including SNP calling statistics, various SNP table(s), and FASTA alignment files.
- `sex_determination/` - this contains the output for the sex determination run. This is a single `.tsv` file that includes a table with the sample name, the number of autosomal SNPs, the number of SNPs on the X/Y chromosome, the number of reads mapping to the autosomes, the number of reads mapping to the X/Y chromosome, the relative coverage on the X/Y chromosomes, and the standard error associated with the relative coverages. These measures are provided for each BAM file, one row per file. If the `sexdeterrmine_bedfile` option has not been provided, the error bars cannot be trusted, and runtime will be considerably longer.
- `nuclear_contamination/` - this contains the output of the nuclear contamination processes. The directory contains one `*.X.contamination.out` file per individual, as well as `nuclear_contamination.txt`, which is a summary table of the results for all individuals. `nuclear_contamination.txt` contains a header, followed by one line per individual, comprising the Method of Moments (MOM) and Maximum Likelihood (ML) contamination estimates (with their respective standard errors) for both Method1 and Method2.
- `bedtools/` - this contains two files as the output from bedtools coverage. One file contains the 'breadth' coverage (`*.breadth.gz`). This file will have the contents of your annotation file (e.g. BED/GFF), followed by these columns: no. of reads on the feature, no. of bases at depth, length of the feature, and % of the feature covered. The second file (`*.depth.gz`) contains the contents of your annotation file (e.g. BED/GFF) plus an additional column giving the mean depth of coverage (i.e. the average number of reads covering each position).
- `metagenomic_complexity_filter` - this contains the output from the filtering of low-sequence-complexity reads from the input to metagenomic classification, as performed by `bbduk`. This includes the filtered FASTQ files (`*_lowcomplexityremoved.fq.gz`) and the run-time log (`_bbduk.stats`) for each sample. **Note:** there are no sections in the MultiQC report for this module, so you must check the `_bbduk.stats` files to get summary statistics of the filtering.
- `metagenomic_classification/` - this contains the output for a given metagenomic classifier.
  - Running MALT will produce RMA6 files that can be loaded into MEGAN6 or MaltExtract for phylogenetic visualisation of read taxonomic assignments and aDNA characteristics, respectively. Additionally, a `malt.log` file is provided, which gives information such as run-time, memory usage, and per-sample statistics on the number of alignments with taxonomic assignment, etc. This will also include gzipped SAM files, if requested.
  - Running Kraken2 will produce the Kraken output and report files, as well as a merged taxon count table.
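The mean-depth column of the `*.depth.gz` files described under `bedtools/` above is easy to screen with standard shell tools. A minimal sketch, assuming a hypothetical four-column annotation plus the appended mean-depth column (shown whitespace-separated and uncompressed for readability; the real files are gzipped, so pipe through `zcat`):

```shell
# Hypothetical decompressed *.depth content: annotation columns + mean depth.
cat > features.depth <<'EOF'
chr1 100 200 geneA 12.5
chr1 500 900 geneB 3.2
EOF

# List features whose mean depth falls below 5x.
awk '$NF < 5 { print $4 }' features.depth
# prints: geneB
```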