**`CHANGELOG.md`** (19 additions, 1 deletion)

All notable changes to this project will be documented in this file.
The format is based on [Keep a Changelog](http://keepachangelog.com/en/1.0.0/)
and this project adheres to [Semantic Versioning](http://semver.org/spec/v2.0.0.html).

## [Unpublished]

### `Added`

* [#80](https://github.com/nf-core/eager/pull/80) - BWA Index file handling
* [#77](https://github.com/nf-core/eager/pull/77) - Lots of documentation updates by [@jfy133](https://github.com/jfy133)

## [2.0.3] - 2018-12-09

### `Added`

* [#80](https://github.com/nf-core/eager/pull/80) - BWA Index file handling
* [#77](https://github.com/nf-core/eager/pull/77) - Lots of documentation updates by [@jfy133](https://github.com/jfy133)
* [#81](https://github.com/nf-core/eager/pull/81) - Renaming of certain BAM options
* [#92](https://github.com/nf-core/eager/issues/92) - Complete restructure of BAM options

### `Fixed`

* [#85](https://github.com/nf-core/eager/pull/85) - Fix for [Samtools index issues](https://github.com/nf-core/eager/issues/84)
* [#96](https://github.com/nf-core/eager/issues/96) - Fix for [MarkDuplicates issues](https://github.com/nf-core/eager/issues/96) found by [@nilesh-tawari](https://github.com/nilesh-tawari)

**`README.md`**

[](https://nf-core-invite.herokuapp.com)[](http://bioconda.github.io/)

**nf-core/eager** is a bioinformatics best-practice analysis pipeline for NGS sequencing based ancient DNA (aDNA) data analysis.

The pipeline uses [Nextflow](https://www.nextflow.io), a bioinformatics workflow tool. It pre-processes raw data from FASTQ inputs, aligns the reads and performs extensive general NGS and aDNA-specific quality control on the results. It comes with Docker, Singularity and Conda support, making installation trivial and results highly reproducible.

## Pipeline steps

By default the pipeline currently performs the following:

* Create reference genome indices for mapping (`bwa`, `samtools`, and `picard`)
* Sequencing quality control (`FastQC`)
* Sequencing adapter removal and, for paired-end data, merging (`AdapterRemoval`)
* Read mapping to reference genome (`bwa aln`, `bwa mem` or `CircularMapper`)
* Post-mapping processing, statistics and conversion to BAM (`samtools`)
* Ancient DNA C-to-T damage pattern visualisation (`DamageProfiler`)
* PCR duplicate removal (`DeDup` or `MarkDuplicates`)
* Post-mapping statistics and BAM quality control (`Qualimap`)
* Automatic conversion of unmapped reads to FASTQ (`samtools`)
* Damage removal/clipping for UDG+/UDG-half treatment protocols (`BamUtil`)
* Damaged reads extraction and assessment (`PMDTools`)

## Quick Start

1. Install [`nextflow`](docs/installation.md)
2. Install one of [`docker`](https://docs.docker.com/engine/installation/), [`singularity`](https://www.sylabs.io/guides/3.0/user-guide/) or [`conda`](https://conda.io/miniconda.html)
3. Download the EAGER pipeline

```bash
nextflow pull nf-core/eager
```

4. Set up your job with default parameters

```bash
nextflow run nf-core/eager -profile <docker/singularity/conda> --reads '*_R{1,2}.fastq.gz' --fasta '<REFERENCE>.fasta'
```

5. See the overview of the run under `<OUTPUT_DIR>/MultiQC/multiqc_report.html`

Modifications to the default pipeline are easily made using various options as described in the documentation.
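
A note on the `--reads` pattern: the `{1,2}` glob is what pairs forward and reverse files per sample. Nextflow expands the pattern itself, but bash brace expansion behaves analogously, so you can preview the matches (sample names below are made up for illustration):

```bash
# Create dummy paired-end files and preview what the glob matches
mkdir -p reads_demo
touch reads_demo/sampleA_R1.fastq.gz reads_demo/sampleA_R2.fastq.gz
touch reads_demo/sampleB_R1.fastq.gz reads_demo/sampleB_R2.fastq.gz
ls reads_demo/*_R{1,2}.fastq.gz
```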

## Documentation

The nf-core/eager pipeline comes with documentation about the pipeline, found in the `docs/` directory:

1. [Installation](docs/installation.md)
4. [Output and how to interpret the results](docs/output.md)
5. [Troubleshooting](docs/troubleshooting.md)

## Credits

This pipeline was written by Alexander Peltzer ([apeltzer](https://github.com/apeltzer)), with major contributions from Stephen Clayton, ideas and documentation from James Fellows Yates, Raphael Eisenhofer and Judith Neukamm. If you want to contribute, please open an issue and ask to be added to the project - happy to do so and everyone is welcome to contribute here!

## Tool References

* **EAGER v1**, **CircularMapper**, **DeDup**: Peltzer, A., Jäger, G., Herbig, A., Seitz, A., Kniep, C., Krause, J., & Nieselt, K. (2016). EAGER: efficient ancient genome reconstruction. Genome Biology, 17(1), 1–14. [https://doi.org/10.1186/s13059-016-0918-z](https://doi.org/10.1186/s13059-016-0918-z) Download: [https://github.com/apeltzer/EAGER-GUI](https://github.com/apeltzer/EAGER-GUI) and [https://github.com/apeltzer/EAGER-CLI](https://github.com/apeltzer/EAGER-CLI)
* **AdapterRemoval v2**: Schubert, M., Lindgreen, S., & Orlando, L. (2016). AdapterRemoval v2: rapid adapter trimming, identification, and read merging. BMC Research Notes, 9, 88. [https://doi.org/10.1186/s13104-016-1900-2](https://doi.org/10.1186/s13104-016-1900-2) Download: [https://github.com/MikkelSchubert/adapterremoval](https://github.com/MikkelSchubert/adapterremoval)
* **bwa**: Li, H., & Durbin, R. (2009). Fast and accurate short read alignment with Burrows-Wheeler transform. Bioinformatics, 25(14), 1754–1760. [https://doi.org/10.1093/bioinformatics/btp324](https://doi.org/10.1093/bioinformatics/btp324) Download: [http://bio-bwa.sourceforge.net/bwa.shtml](http://bio-bwa.sourceforge.net/bwa.shtml)
* **SAMtools**: Li, H., Handsaker, B., Wysoker, A., Fennell, T., Ruan, J., Homer, N., … 1000 Genome Project Data Processing Subgroup. (2009). The Sequence Alignment/Map format and SAMtools. Bioinformatics, 25(16), 2078–2079. [https://doi.org/10.1093/bioinformatics/btp352](https://doi.org/10.1093/bioinformatics/btp352) Download: [http://www.htslib.org/](http://www.htslib.org/)
* **DamageProfiler**: Judith Neukamm (Unpublished)
* **QualiMap**: Okonechnikov, K., Conesa, A., & García-Alcalde, F. (2016). Qualimap 2: advanced multi-sample quality control for high-throughput sequencing data. Bioinformatics, 32(2), 292–294. [https://doi.org/10.1093/bioinformatics/btv566](https://doi.org/10.1093/bioinformatics/btv566) Download: [http://qualimap.bioinfo.cipf.es/](http://qualimap.bioinfo.cipf.es/)
* **preseq**: Daley, T., & Smith, A. D. (2013). Predicting the molecular complexity of sequencing libraries. Nature Methods, 10(4), 325–327. [https://doi.org/10.1038/nmeth.2375](https://doi.org/10.1038/nmeth.2375) Download: [http://smithlabresearch.org/software/preseq/](http://smithlabresearch.org/software/preseq/)
* **PMDTools**: Skoglund, P., Northoff, B. H., Shunkov, M. V., Derevianko, A. P., Pääbo, S., Krause, J., & Jakobsson, M. (2014). Separating endogenous ancient DNA from modern day contamination in a Siberian Neandertal. Proceedings of the National Academy of Sciences of the United States of America, 111(6), 2229–2234. [https://doi.org/10.1073/pnas.1318934111](https://doi.org/10.1073/pnas.1318934111) Download: [https://github.com/pontussk/PMDtools](https://github.com/pontussk/PMDtools)
* **MultiQC**: Ewels, P., Magnusson, M., Lundin, S., & Käller, M. (2016). MultiQC: summarize analysis results for multiple tools and samples in a single report. Bioinformatics, 32(19), 3047–3048. [https://doi.org/10.1093/bioinformatics/btw354](https://doi.org/10.1093/bioinformatics/btw354) Download: [https://multiqc.info/](https://multiqc.info/)
* **BamUtil**: Jun, G., Wing, M. K., Abecasis, G. R., & Kang, H. M. (2015). An efficient and scalable analysis framework for variant extraction and refinement from population-scale DNA sequence data. Genome Research, 25(6), 918–925. [https://doi.org/10.1101/gr.176552.114](https://doi.org/10.1101/gr.176552.114) Download: [https://genome.sph.umich.edu/wiki/BamUtil](https://genome.sph.umich.edu/wiki/BamUtil)

**`docs/configuration/adding_your_own.md`** (16 additions, 3 deletions)
## Software Requirements

To run the pipeline, several software packages are required. How you satisfy these requirements is essentially up to you and depends on your system. If possible, we _highly_ recommend using either Docker or Singularity.

Please see the [`installation documentation`](../installation.md) for how to run using the below as a one-off. These instructions are about configuring a config file for repeated use.

Note that the dockerhub organisation name annoyingly can't have a hyphen, so is

### Singularity image

Many HPC environments are not able to run Docker due to security issues. [Singularity](http://singularity.lbl.gov/) is a tool designed to run on such HPC systems which is very similar to Docker.

To specify singularity usage in your pipeline config file, add the following:
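
The snippet itself is not shown in this excerpt; a minimal sketch of the standard Nextflow form is below (check the original file for the exact block):

```nextflow
singularity {
  enabled = true
}
```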

To use conda in your own config file, add the following:

```nextflow
process.conda = "$baseDir/environment.yml"
```

## Job Resources

### Automatic resubmission

Each step in the pipeline has a default set of requirements for number of CPUs, memory and time. For most of the steps in the pipeline, if the job exits with an error code of `143` (exceeded requested resources) it will automatically resubmit with higher requests (2 x original, then 3 x original). If it still fails after three times then the pipeline is stopped.
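
In Nextflow terms, this behaviour can be sketched with an error strategy and attempt-scaled resource requests. The values below are illustrative, not the pipeline's actual defaults:

```nextflow
process {
  // exit status 143 means the job was killed for exceeding its requested resources
  errorStrategy = { task.exitStatus == 143 ? 'retry' : 'finish' }
  maxRetries = 2  // i.e. up to three attempts in total
  // each retry scales the original request: 1x, 2x, 3x
  memory = { 8.GB * task.attempt }
  time = { 4.h * task.attempt }
}
```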

### Custom resource requests

Wherever process-specific requirements are set in the pipeline, the default value can be changed by creating a custom config file. See the files in [`conf`](../conf) for examples.
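
For example, to raise the resources for a single process in your own config file (the process name `markduplicates` is illustrative; see the pipeline source for the real process names):

```nextflow
// Custom config: override defaults for one named process
process {
  withName: markduplicates {
    memory = 16.GB
    cpus = 4
  }
}
```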

### AWS Batch specific parameters

Running the pipeline on AWS Batch requires a couple of specific parameters to be set according to your AWS Batch configuration. Please use the `-awsbatch` profile and then specify all of the following parameters.

#### `--awsqueue`

The JobQueue that you intend to use on AWS Batch.

#### `--awsregion`

The AWS region to run your job in. Default is set to `eu-west-1` but can be adjusted to your needs.
Please make sure to also set the `-w/--work-dir` and `--outdir` parameters to an S3 storage bucket of your choice - you'll get an error message notifying you if you didn't.
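
Putting the AWS Batch options together, an invocation might look like the following (queue, region and bucket names are placeholders):

```bash
nextflow run nf-core/eager -profile awsbatch \
    --awsqueue my-batch-queue \
    --awsregion eu-west-1 \
    -w s3://my-bucket/work \
    --outdir s3://my-bucket/results \
    --reads 's3://my-bucket/reads/*_R{1,2}.fastq.gz' \
    --fasta 's3://my-bucket/reference/genome.fasta'
```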