Commit 5a6ed1a

Merge pull request #105 from nf-core/dev
RELEASE 2.0.3 (Minor bugfixes)
2 parents cd04c50 + b9d3c51 commit 5a6ed1a

16 files changed

Lines changed: 364 additions & 174 deletions

.travis.yml

Lines changed: 8 additions & 8 deletions
Original file line number | Diff line number | Diff line change
@@ -11,9 +11,9 @@ before_install:
1111
# PRs to master are only ok if coming from dev branch
1212
- '[ $TRAVIS_PULL_REQUEST = "false" ] || [ $TRAVIS_BRANCH != "master" ] || ([ $TRAVIS_PULL_REQUEST_SLUG = $TRAVIS_REPO_SLUG ] && [ $TRAVIS_PULL_REQUEST_BRANCH = "dev" ])'
1313
# Pull the docker image first so the test doesn't wait for this
14-
- docker pull nfcore/eager
14+
- docker pull nfcore/eager:dev
1515
# Fake the tag locally so that the pipeline runs properly
16-
- docker tag nfcore/eager nfcore/eager:2.0.2
16+
- docker tag nfcore/eager:dev nfcore/eager:2.0.3
1717

1818
install:
1919
# Install Nextflow
@@ -37,16 +37,16 @@ script:
3737
# Lint the pipeline code
3838
- nf-core lint ${TRAVIS_BUILD_DIR}
3939
# Run the basic pipeline with the test profile
40-
- nextflow run ${TRAVIS_BUILD_DIR} -profile test,docker --pairedEnd
40+
- nextflow run ${TRAVIS_BUILD_DIR} -profile test,docker --pairedEnd --saveReference
4141
# Run the basic pipeline with single end data (pretending its single end actually)
42-
- nextflow run ${TRAVIS_BUILD_DIR} -profile test,docker --singleEnd
42+
- nextflow run ${TRAVIS_BUILD_DIR} -profile test,docker --singleEnd --bwa_index results/reference_genome/bwa_index/
4343
# Run the same pipeline testing optional step: fastp, complexity
44-
- nextflow run ${TRAVIS_BUILD_DIR} -profile test,docker --pairedEnd --complexity_filter
44+
- nextflow run ${TRAVIS_BUILD_DIR} -profile test,docker --pairedEnd --complexity_filter --bwa_index results/reference_genome/bwa_index/
4545
# Test BAM Trimming
46-
- nextflow run ${TRAVIS_BUILD_DIR} -profile test,docker --pairedEnd --trim_bam
46+
- nextflow run ${TRAVIS_BUILD_DIR} -profile test,docker --pairedEnd --trim_bam --bwa_index results/reference_genome/bwa_index/
4747
# Test running with CircularMapper
4848
- nextflow run ${TRAVIS_BUILD_DIR} -profile test,docker --pairedEnd --circularmapper --circulartarget 'NC_007596.2'
4949
# Test running with BWA Mem
50-
- nextflow run ${TRAVIS_BUILD_DIR} -profile test,docker --pairedEnd --bwamem
50+
- nextflow run ${TRAVIS_BUILD_DIR} -profile test,docker --pairedEnd --bwamem --bwa_index results/reference_genome/bwa_index/
5151
# Test basic pipeline with Conda too
52-
- nextflow run ${TRAVIS_BUILD_DIR} -profile test,conda --pairedEnd
52+
- travis_wait 25 nextflow run ${TRAVIS_BUILD_DIR} -profile test,conda --pairedEnd --bwa_index results/reference_genome/bwa_index/
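
The updated test script follows a build-once, reuse-later pattern: the first run passes `--saveReference`, and most later runs point `--bwa_index` at the saved index directory rather than rebuilding it. The following is a minimal dry-run sketch of that pattern, not the CI configuration itself. It only prints the commands, and the `PIPELINE_DIR` variable and `print_test_commands` helper are hypothetical names introduced for illustration.

```shell
#!/usr/bin/env bash
# Dry-run sketch: prints the commands instead of executing Nextflow.
# PIPELINE_DIR and print_test_commands are illustrative names, not part of the pipeline.
PIPELINE_DIR="${PIPELINE_DIR:-.}"
# Index location copied from the --bwa_index arguments in the Travis script.
BWA_INDEX="results/reference_genome/bwa_index/"

print_test_commands() {
  local extra
  # The first run builds the reference indices and keeps them on disk.
  echo "nextflow run ${PIPELINE_DIR} -profile test,docker --pairedEnd --saveReference"
  # Later runs reuse the saved index instead of rebuilding it.
  for extra in "--singleEnd" "--pairedEnd --complexity_filter" "--pairedEnd --trim_bam" "--pairedEnd --bwamem"; do
    echo "nextflow run ${PIPELINE_DIR} -profile test,docker ${extra} --bwa_index ${BWA_INDEX}"
  done
}

print_test_commands
```

The order matters: the reuse runs can only succeed once the first run has written the index under `results/`.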

CHANGELOG.md

Lines changed: 19 additions & 1 deletion
@@ -4,8 +4,26 @@ All notable changes to this project will be documented in this file.
44
The format is based on [Keep a Changelog](http://keepachangelog.com/en/1.0.0/)
55
and this project adheres to [Semantic Versioning](http://semver.org/spec/v2.0.0.html).
66

7-
## unpublished
7+
## [Unpublished]
88

9+
### `Added`
10+
* [#80](https://github.com/nf-core/eager/pull/80) - BWA Index file handling
11+
* [#77](https://github.com/nf-core/eager/pull/77) - Lots of documentation updates by [@jfy133](https://github.com/jfy133)
12+
13+
## [2.0.3] - 2018-12-09
14+
15+
### `Added`
16+
* [#80](https://github.com/nf-core/eager/pull/80) - BWA Index file handling
17+
* [#77](https://github.com/nf-core/eager/pull/77) - Lots of documentation updates by [@jfy133](https://github.com/jfy133)
18+
* [#81](https://github.com/nf-core/eager/pull/81) - Renaming of certain BAM options
19+
* [#92](https://github.com/nf-core/eager/issues/92) - Complete restructure of BAM options
20+
21+
### `Fixed`
22+
* [#84](https://github.com/nf-core/eager/pull/85) - Fix for [Samtools index issues](https://github.com/nf-core/eager/issues/84)
23+
* [#96](https://github.com/nf-core/eager/issues/96) - Fix for [MarkDuplicates issues](https://github.com/nf-core/eager/issues/96) found by [@nilesh-tawari](https://github.com/nilesh-tawari)
24+
25+
### Other
26+
* Added Slack button to repository readme
927

1028
## [2.0.2] - 2018-11-03
1129

Dockerfile

Lines changed: 1 addition & 1 deletion
@@ -3,4 +3,4 @@ FROM nfcore/base
33
LABEL description="Docker image containing all requirements for nf-core/eager pipeline"
44
COPY environment.yml /
55
RUN conda env create -f /environment.yml && conda clean -a
6-
ENV PATH /opt/conda/envs/nf-core-eager-2.0.2/bin:$PATH
6+
ENV PATH /opt/conda/envs/nf-core-eager-2.0.3/bin:$PATH

README.md

Lines changed: 75 additions & 22 deletions
@@ -2,8 +2,7 @@
22

33
[![Build Status](https://travis-ci.org/nf-core/eager.svg?branch=master)](https://travis-ci.org/nf-core/eager)
44
[![Nextflow](https://img.shields.io/badge/nextflow-%E2%89%A50.32.0-brightgreen.svg)](https://www.nextflow.io/)
5-
[![Gitter](https://img.shields.io/badge/gitter-%20join%20chat%20%E2%86%92-4fb99a.svg)](https://gitter.im/nf-core/eager)
6-
[![install with bioconda](https://img.shields.io/badge/install%20with-bioconda-brightgreen.svg)](http://bioconda.github.io/)
5+
[![Slack Status](https://nf-core-invite.herokuapp.com/badge.svg)](https://nf-core-invite.herokuapp.com)[![install with bioconda](https://img.shields.io/badge/install%20with-bioconda-brightgreen.svg)](http://bioconda.github.io/)
76
[![Docker Container available](https://img.shields.io/docker/automated/nfcore/eager.svg)](https://hub.docker.com/r/nfcore/eager/)
87
![Singularity Container available](https://img.shields.io/badge/singularity-available-7E4C74.svg)
98
[![DOI](https://zenodo.org/badge/135918251.svg)](https://zenodo.org/badge/latestdoi/135918251)
@@ -12,28 +11,60 @@
1211

1312
## Introduction
1413

15-
**nf-core/eager** is a bioinformatics best-practice analysis pipeline for ancient DNA data analysis.
14+
**nf-core/eager** is a bioinformatics best-practice analysis pipeline for NGS
15+
sequencing based ancient DNA (aDNA) data analysis.
1616

17-
The pipeline uses [Nextflow](https://www.nextflow.io), a bioinformatics workflow tool. It pre-processes raw data from FastQ inputs, aligns the reads and performs extensive quality-control on the results. It comes with docker / singularity containers making installation trivial and results highly reproducible.
17+
The pipeline uses [Nextflow](https://www.nextflow.io), a bioinformatics
18+
workflow tool. It pre-processes raw data from FASTQ inputs, aligns the reads
19+
and performs extensive general NGS and aDNA specific quality-control on the
20+
results. It comes with docker, singularity or conda containers making
21+
installation trivial and results highly reproducible.
1822

19-
### Pipeline steps
23+
## Pipeline steps
2024

21-
* Create reference genome indices (optional)
22-
* BWA
23-
* Samtools Index
24-
* Sequence Dictionary
25-
* QC with FastQC
26-
* AdapterRemoval for read clipping and merging
27-
* Read mapping with BWA, BWA Mem or CircularMapper
28-
* Samtools sort, index, stats & conversion to BAM
29-
* DeDup or MarkDuplicates read deduplication
30-
* QualiMap BAM QC Checking
31-
* Preseq Library Complexity Estimation
32-
* DamageProfiler damage profiling
33-
* BAM Clipping for UDG+/UDGhalf protocols
34-
* PMDTools damage filtering / assessment
25+
By default the pipeline currently performs the following:
26+
27+
* Create reference genome indices for mapping (`bwa`, `samtools`, and `picard`)
28+
* Sequencing quality control (`FastQC`)
29+
* Sequencing adapter removal and for paired end data merging (`AdapterRemoval`)
30+
* Read mapping to reference using (`bwa aln`, `bwa mem` or `CircularMapper`)
31+
* Post-mapping processing, statistics and conversion to bam (`samtools`)
32+
* Ancient DNA C-to-T damage pattern visualisation (`DamageProfiler`)
33+
* PCR duplicate removal (`DeDup` or `MarkDuplicates`)
34+
* Post-mapping statistics and BAM quality control (`Qualimap`)
35+
* Library Complexity Estimation (`preseq`)
36+
* Overall pipeline statistics summaries (`MultiQC`)
37+
38+
Additional functionality contained by the pipeline currently includes:
39+
40+
* Illumina two-coloured sequencer poly-G tail removal (`fastp`)
41+
* Automatic conversion of unmapped reads to FASTQ (`samtools`)
42+
* Damage removal/clipping for UDG+/UDG-half treatment protocols (`BamUtil`)
43+
* Damage reads extraction and assessment (`PMDTools`)
44+
45+
## Quick Start
46+
47+
1. Install [`nextflow`](docs/installation.md)
48+
2. Install one of [`docker`](https://docs.docker.com/engine/installation/), [`singularity`](https://www.sylabs.io/guides/3.0/user-guide/) or [`conda`](https://conda.io/miniconda.html)
49+
3. Download the EAGER pipeline
50+
51+
```bash
52+
nextflow pull nf-core/eager
53+
```
54+
55+
4. Set up your job with default parameters
56+
57+
```bash
58+
nextflow run nf-core -profile <docker/singularity/conda> --reads'*_R{1,2}.fastq.gz' --fasta '<REFERENCE.fasta'
59+
```
60+
61+
5. See the overview of the run with under `<OUTPUT_DIR>/MultiQC/multiqc_report.html`
62+
63+
Modifications to the default pipeline are easily made using various options
64+
as described in the documentation.
65+
66+
## Documentation
3567

36-
### Documentation
3768
The nf-core/eager pipeline comes with documentation about the pipeline, found in the `docs/` directory:
3869

3970
1. [Installation](docs/installation.md)
@@ -44,5 +75,27 @@ The nf-core/eager pipeline comes with documentation about the pipeline, found in
4475
4. [Output and how to interpret the results](docs/output.md)
4576
5. [Troubleshooting](docs/troubleshooting.md)
4677

47-
### Credits
48-
This pipeline was written by Alexander Peltzer ([apeltzer](https://github.com/apeltzer)), with major contributions from Stephen Clayton, ideas and documentation from James Fellows-Yates, Raphael Eisenhofer and Judith Neukamm. If you want to contribute, please open an issue and ask to be added to the project - happy to do so and everyone is welcome to contribute here!
78+
79+
## Credits
80+
81+
This pipeline was written by Alexander Peltzer ([apeltzer](https://github.com/apeltzer)),
82+
with major contributions from Stephen Clayton, ideas and documentation from
83+
James Fellows Yates, Raphael Eisenhofer and Judith Neukamm. If you want to
84+
contribute, please open an issue and ask to be added to the project - happy to
85+
do so and everyone is welcome to contribute here!
86+
87+
## Tool References
88+
89+
* **EAGER v1**, CircularMapper, DeDup* Peltzer, A., Jäger, G., Herbig, A., Seitz, A., Kniep, C., Krause, J., & Nieselt, K. (2016). EAGER: efficient ancient genome reconstruction. Genome Biology, 17(1), 1–14. [https://doi.org/10.1186/s13059-016-0918-z](https://doi.org/10.1186/s13059-016-0918-z) Download: [https://github.com/apeltzer/EAGER-GUI](https://github.com/apeltzer/EAGER-GUI) and [https://github.com/apeltzer/EAGER-CLI](https://github.com/apeltzer/EAGER-CLI)
90+
* **FastQC** download: [https://www.bioinformatics.babraham.ac.uk/projects/fastqc/](https://www.bioinformatics.babraham.ac.uk/projects/fastqc/)
91+
* **AdapterRemoval v2** Schubert, M., Lindgreen, S., & Orlando, L. (2016). AdapterRemoval v2: rapid adapter trimming, identification, and read merging. BMC Research Notes, 9, 88. [https://doi.org/10.1186/s13104-016-1900-2](https://doi.org/10.1186/s13104-016-1900-2) Download: [https://github.com/MikkelSchubert/adapterremoval](https://github.com/MikkelSchubert/adapterremoval)
92+
* **bwa** Li, H., & Durbin, R. (2009). Fast and accurate short read alignment with Burrows-Wheeler transform. Bioinformatics , 25(14), 1754–1760. [https://doi.org/10.1093/bioinformatics/btp324](https://doi.org/10.1093/bioinformatics/btp324) Download: [http://bio-bwa.sourceforge.net/bwa.shtml](http://bio-bwa.sourceforge.net/bwa.shtml)
93+
* **SAMtools** Li, H., Handsaker, B., Wysoker, A., Fennell, T., Ruan, J., Homer, N., … 1000 Genome Project Data Processing Subgroup. (2009). The Sequence Alignment/Map format and SAMtools. Bioinformatics , 25(16), 2078–2079. [https://doi.org/10.1093/bioinformatics/btp352](https://doi.org/10.1093/bioinformatics/btp352) Download: [http://www.htslib.org/](http://www.htslib.org/)
94+
* **DamageProfiler** Judith Neukamm (Unpublished)
95+
* **QualiMap** Okonechnikov, K., Conesa, A., & García-Alcalde, F. (2016). Qualimap 2: advanced multi-sample quality control for high-throughput sequencing data. Bioinformatics , 32(2), 292–294. [https://doi.org/10.1093/bioinformatics/btv566](https://doi.org/10.1093/bioinformatics/btv566) Download: [http://qualimap.bioinfo.cipf.es/](http://qualimap.bioinfo.cipf.es/)
96+
* **preseq** Daley, T., & Smith, A. D. (2013). Predicting the molecular complexity of sequencing libraries. Nature Methods, 10(4), 325–327. [https://doi.org/10.1038/nmeth.2375](https://doi.org/10.1038/nmeth.2375). Download: [http://smithlabresearch.org/software/preseq/](http://smithlabresearch.org/software/preseq/)
97+
* **PMDTools** Skoglund, P., Northoff, B. H., Shunkov, M. V., Derevianko, A. P., Pääbo, S., Krause, J., & Jakobsson, M. (2014). Separating endogenous ancient DNA from modern day contamination in a Siberian Neandertal. Proceedings of the National Academy of Sciences of the United States of America, 111(6), 2229–2234. [https://doi.org/10.1073/pnas.1318934111](https://doi.org/10.1073/pnas.1318934111) Download: [https://github.com/pontussk/PMDtools](https://github.com/pontussk/PMDtools)
98+
* **MultiQC** Ewels, P., Magnusson, M., Lundin, S., & Käller, M. (2016). MultiQC: summarize analysis results for multiple tools and samples in a single report. Bioinformatics , 32(19), 3047–3048. [https://doi.org/10.1093/bioinformatics/btw354](https://doi.org/10.1093/bioinformatics/btw354) Download: [https://multiqc.info/](https://multiqc.info/)
99+
* **BamUtils** Jun, G., Wing, M. K., Abecasis, G. R., & Kang, H. M. (2015). An efficient and scalable analysis framework for variant extraction and refinement from population-scale DNA sequence data. Genome Research, 25(6), 918–925. [https://doi.org/10.1101/gr.176552.114](https://doi.org/10.1101/gr.176552.114) Download: [https://genome.sph.umich.edu/wiki/BamUtil](https://genome.sph.umich.edu/wiki/BamUtil)
100+
* **FastP** Chen, S., Zhou, Y., Chen, Y., & Gu, J. (2018). fastp: an ultra-fast all-in-one FASTQ preprocessor. Bioinformatics , 34(17), i884–i890. [https://doi.org/10.1093/bioinformatics/bty560](https://doi.org/10.1093/bioinformatics/bty560) Download: [https://github.com/OpenGene/fastp](https://github.com/OpenGene/fastp)
101+

Singularity

Lines changed: 2 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -4,10 +4,10 @@ Bootstrap:docker
44
%labels
55
MAINTAINER Alexander Peltzer <alexander.peltzer@qbic.uni-tuebingen.de>
66
DESCRIPTION Container image containing all requirements for the nf-core/eager pipeline
7-
VERSION 2.0.2
7+
VERSION 2.0.3
88

99
%environment
10-
PATH=/opt/conda/envs/nf-core-eager-2.0.2/bin:$PATH
10+
PATH=/opt/conda/envs/nf-core-eager-2.0.3/bin:$PATH
1111
export PATH
1212

1313
%files

conf/acad-pheonix.config

Lines changed: 22 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,22 @@
1+
/*
2+
* ----------------------------------------------------------------------------
3+
* Nextflow config file for use with Singularity on Phoenix Cluster Adelaide
4+
* ----------------------------------------------------------------------------
5+
* Defines basic usage limits and singularity image id.
6+
*/
7+
8+
singularity {
9+
enabled = true
10+
autoMounts = true
11+
}
12+
13+
process {
14+
beforeScript = 'module load Singularity/2.5.2-GCC-5.4.0-2.26'
15+
executor = 'slurm'
16+
}
17+
18+
params {
19+
max_memory = 128.GB
20+
max_cpus = 32
21+
max_time = 48.h
22+
}

conf/binac.config

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -10,7 +10,7 @@ singularity {
1010
}
1111

1212
process {
13-
beforeScript = 'module load devel/singularity/2.4.1'
13+
beforeScript = 'module load devel/singularity/2.6.0'
1414
executor = 'pbs'
1515
queue = 'short'
1616
}

conf/multiqc_config.yaml

Lines changed: 12 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -5,3 +5,15 @@ report_comment: >
55
report_section_order:
66
nf-core/eager-software-versions:
77
order: -1000
8+
fastqc:
9+
after: 'nf-core/eager-software-versions'
10+
adapterRemoval:
11+
after: 'fastqc'
12+
Samtools:
13+
after: 'adapterRemoval'
14+
dedup:
15+
after: 'Samtools'
16+
qualimap:
17+
after: 'dedup'
18+
preseq:
19+
after: 'qualimap'

conf/shh.config

Lines changed: 36 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,36 @@
1+
/*
2+
* -------------------------------------------------------------
3+
* Nextflow config file for use with Singularity at SHH Clusters
4+
* -------------------------------------------------------------
5+
* Defines basic usage limits and singularity image id.
6+
*/
7+
8+
singularity {
9+
enabled = true
10+
}
11+
12+
/*
13+
* To be improved by process specific resource requests
14+
* By default, take the medium queue, smaller processes might just go to short (e.g. multiqc or similar things)
15+
*/
16+
17+
process {
18+
executor = 'slurm'
19+
queue = 'medium'
20+
21+
22+
withName:makeFastaIndex {
23+
queue = 'short'
24+
time = 2.h
25+
}
26+
withName:makeSeqDict {
27+
queue = 'short'
28+
time = 2.h
29+
}
30+
}
31+
32+
params {
33+
max_memory = 734.GB
34+
max_cpus = 64
35+
max_time = 48.h
36+
}

docs/configuration/adding_your_own.md

Lines changed: 16 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -28,7 +28,6 @@ process {
2828
}
2929
```
3030

31-
3231
## Software Requirements
3332
To run the pipeline, several software packages are required. How you satisfy these requirements is essentially up to you and depends on your system. If possible, we _highly_ recommend using either Docker or Singularity.
3433
Please see the [`installation documentation`](../installation.md) for how to run using the below as a one-off. These instructions are about configuring a config file for repeated use.
@@ -51,7 +50,6 @@ Note that the dockerhub organisation name annoyingly can't have a hyphen, so is
5150
### Singularity image
5251
Many HPC environments are not able to run Docker due to security issues.
5352
[Singularity](http://singularity.lbl.gov/) is a tool designed to run on such HPC systems which is very similar to Docker.
54-
>>>>>>> TEMPLATE
5553

5654
To specify singularity usage in your pipeline config file, add the following:
5755

@@ -81,5 +79,20 @@ To use conda in your own config file, add the following:
8179

8280
```nextflow
8381
process.conda = "$baseDir/environment.yml"
84-
>>>>>>> TEMPLATE
8582
```
83+
84+
## Job Resources
85+
#### Automatic resubmission
86+
Each step in the pipeline has a default set of requirements for number of CPUs, memory and time. For most of the steps in the pipeline, if the job exits with an error code of `143` (exceeded requested resources) it will automatically resubmit with higher requests (2 x original, then 3 x original). If it still fails after three times then the pipeline is stopped.
87+
88+
#### Custom resource requests
89+
Wherever process-specific requirements are set in the pipeline, the default value can be changed by creating a custom config file. See the files in [`conf`](../conf) for examples.
90+
91+
### AWS Batch specific parameters
92+
Running the pipeline on AWS Batch requires a couple of specific parameters to be set according to your AWS Batch configuration. Please use the `-awsbatch` profile and then specify all of the following parameters.
93+
#### `--awsqueue`
94+
The JobQueue that you intend to use on AWS Batch.
95+
#### `--awsregion`
96+
The AWS region to run your job in. Default is set to `eu-west-1` but can be adjusted to your needs.
97+
98+
Please make sure to also set the `-w/--work-dir` and `--outdir` parameters to a S3 storage bucket of your choice - you'll get an error message notifying you if you didn't.
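
The automatic resubmission rule in the added Job Resources section (retry on exit code `143` with 2 x, then 3 x the original request, and stop after three failures) can be illustrated with a small shell sketch. This shows the policy only and is not how the pipeline implements it, since the pipeline delegates retries to Nextflow. The `run_with_retries` function, the `BASE_MEM_GB` value, and the `MEM_GB` variable handed to the job are all assumptions made for this example.

```shell
#!/usr/bin/env bash
# Illustration of the retry policy only; all names below are hypothetical.
BASE_MEM_GB=8     # assumed original memory request
MAX_ATTEMPTS=3    # the docs describe stopping after three failed attempts

run_with_retries() {
  local attempt status mem
  for attempt in $(seq 1 "${MAX_ATTEMPTS}"); do
    # Scale the request with the attempt number: 1 x, 2 x, 3 x the original.
    mem=$(( BASE_MEM_GB * attempt ))
    echo "attempt ${attempt}: requesting ${mem} GB"
    if MEM_GB="${mem}" "$@"; then
      return 0
    else
      status=$?
    fi
    # Only exit code 143 (exceeded requested resources) triggers a retry.
    if [ "${status}" -ne 143 ]; then
      return "${status}"
    fi
  done
  echo "giving up after ${MAX_ATTEMPTS} attempts" >&2
  return 143
}
```

A job would be wrapped as `run_with_retries some_command arg1 arg2`, with the job reading `MEM_GB` to size its request. Any exit code other than `143` is returned immediately without retrying.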
