Commit 1f32913

Merge pull request #476 from nf-core/bowtie2
Adds Bowtie2 to address
2 parents b89163e + 8e65bb2 commit 1f32913

10 files changed

Lines changed: 395 additions & 334 deletions

.github/workflows/ci.yml

Lines changed: 3 additions & 0 deletions
@@ -98,6 +98,9 @@ jobs:
 - name: MAPPER_BWAMEM Test running with BWA Mem
   run: |
     nextflow run ${GITHUB_WORKSPACE} -profile test_tsv,docker --mapper 'bwamem'
+- name: MAPPER_BT2 Test running with BowTie2
+  run: |
+    nextflow run ${GITHUB_WORKSPACE} -profile test_tsv,docker --mapper 'bowtie2' --bt2_alignmode 'local' --bt2_sensitivity 'sensitive' --bt2n 1 --bt2l 16 --bt2_trim5 1 --bt2_trim3 1
 - name: STRIP_FASTQ Run the basic pipeline with output unmapped reads as fastq
   run: |
     nextflow run ${GITHUB_WORKSPACE} -profile test_tsv,docker --strip_input_fastq

.gitignore

Lines changed: 2 additions & 0 deletions
@@ -7,3 +7,5 @@ tests/
 testing/
 *.pyc
 main_playground.nf
+.vscode
+*.code-workspace

CHANGELOG.md

Lines changed: 1 addition & 0 deletions
@@ -23,6 +23,7 @@ and this project adheres to [Semantic Versioning](http://semver.org/spec/v2.0.0.
 * Added ability for automated emails using `mailutils` to also send MultiQC reports
 * General documentation additions and cleaning, updated figures with CC-BY license
 * Added large 'fullsize' dataset test-profiles for ancient fish, human, and a draft pathogen contexts.
+* [#257](https://github.com/nf-core/eager/issues/257) Added the bowtie2 aligner as option for mapping, following Poullet and Orlando 2020 doi: [10.3389/fevo.2020.00105](https://doi.org/10.3389/fevo.2020.00105)

 ### `Fixed`

README.md

Lines changed: 5 additions & 3 deletions
@@ -33,7 +33,7 @@ By default the pipeline currently performs the following:
 * Create reference genome indices for mapping (`bwa`, `samtools`, and `picard`)
 * Sequencing quality control (`FastQC`)
 * Sequencing adapter removal and for paired end data merging (`AdapterRemoval`)
-* Read mapping to reference using (`bwa aln`, `bwa mem` or `CircularMapper`)
+* Read mapping to reference using (`bwa aln`, `bwa mem`, `CircularMapper`, or `bowtie2`)
 * Post-mapping processing, statistics and conversion to bam (`samtools`)
 * Ancient DNA C-to-T damage pattern visualisation (`DamageProfiler`)
 * PCR duplicate removal (`DeDup` or `MarkDuplicates`)
@@ -66,7 +66,7 @@ Additional functionality contained by the pipeline currently includes:
 #### Biological Information

 * Mitochondrial to Nuclear read ratio calculation (`MtNucRatioCalculator`)
-* Statistical sex determination of human individuals (`SexDetErrmine`)
+* Statistical sex determination of human individuals (`Sex.DetERRmine`)

 #### Metagenomic Screening

@@ -178,8 +178,10 @@ If you've contributed and you're missing in here, please let us know and we will
 * Vågene, Å.J. et al., 2018. Salmonella enterica genomes from victims of a major sixteenth-century epidemic in Mexico. Nature ecology & evolution, 2(3), pp.520–528. Available at: [http://dx.doi.org/10.1038/s41559-017-0446-6](http://dx.doi.org/10.1038/s41559-017-0446-6).
 * Herbig, A. et al., 2016. MALT: Fast alignment and analysis of metagenomic DNA sequence data applied to the Tyrolean Iceman. bioRxiv, p.050559. Available at: [http://biorxiv.org/content/early/2016/04/27/050559](http://biorxiv.org/content/early/2016/04/27/050559).
 * **MaltExtract** Huebler, R. et al., 2019. HOPS: Automated detection and authentication of pathogen DNA in archaeological remains. bioRxiv, p.534198. Available at: [https://www.biorxiv.org/content/10.1101/534198v1?rss=1](https://www.biorxiv.org/content/10.1101/534198v1?rss=1). Download: [https://github.com/rhuebler/MaltExtract](https://github.com/rhuebler/MaltExtract)
-* **Kraken2** Wood, D et al., 2019. Improved metagenomic analysis with Kraken 2. Genome Biology volume 20, Article number: 257. Available at: [https://doi.org/10.1186/s13059-019-1891-0](https://doi.org/10.1186/s13059-019-1891-0). Download: [https://ccb.jhu.edu/software/kraken2/](https://ccb.jhu.edu/software/kraken2/)
+* **Kraken2** Wood, D et al., 2019. Improved metagenomic analysis with Kraken 2. Genome Biology volume 20, Article number: 257. Available at: [https://doi.org/10.1186/s13059-019-1891-0](https://doi.org/10.1186/s13059-019-1891-0). Download: [https://ccb.jhu.edu/software/kraken2/](https://ccb.jhu.edu/software/kraken2/)
 * **endorS.py** Aida Andrades Valtueña (Unpublished). Download: [https://github.com/aidaanva/endorS.py](https://github.com/aidaanva/endorS.py)
+* **Bowtie2** Langmead, B. and Salzberg, S. L. 2012 Fast gapped-read alignment with Bowtie 2. Nature methods, 9(4), p. 357–359. doi: [10.1038/nmeth.1923](https:/dx.doi.org/10.1038/nmeth.1923).
+* **sequenceTools** Stephan Schiffels (Unpublished). Download: [https://github.com/stschiff/sequenceTools](https://github.com/stschiff/sequenceTools)

 ## Data References

assets/multiqc_config.yaml

Lines changed: 16 additions & 7 deletions
@@ -9,6 +9,7 @@ report_comment: >

 run_modules:
   - adapterRemoval
+  - bowtie2
   - custom_content
   - damageprofiler
   - dedup
@@ -56,6 +57,7 @@ extra_fn_clean_exts:
   - '.unifiedgenotyper'
   - '.trimmed_stats'
   - '_libmerged'
+  - '_bt2'


 top_modules:
@@ -68,8 +70,11 @@ top_modules:
   - 'fastqc':
       name: 'FastQC (post-AdapterRemoval)'
       path_filters:
-        - '*.truncated_fastqc.zip'
-        - '*.combined*_fastqc.zip'
+        - '*.truncated_fastqc.zip'
+        - '*.combined*_fastqc.zip'
+  - 'bowtie2':
+      path_filters:
+        - '*_bt2.log'
   - 'malt'
   - 'hops'
   - 'kraken'
@@ -117,6 +122,8 @@ table_columns_visible:
     percent_duplicates: False
     total_sequences: True
     percent_gc: True
+  bowtie2:
+    overall_alignment_rate: True
   MALT:
     Taxonomic assignment success: False
     Assig. Taxonomy: False
@@ -167,6 +174,8 @@ table_columns_placement:
     total_sequences: 400
     avg_sequence_length: 410
     percent_gc: 420
+  Bowtie 2 / HiSAT2:
+    overall_alignment_rate: 450
   MALT:
     Num. of queries: 430
     Total reads: 440
@@ -181,14 +190,14 @@ table_columns_placement:
   Samtools Flagstat (post-samtools filter):
     flagstat_total: 553
     mapped_passed: 554
+  custom_content:
+    endogenous_dna: 600
+    endogenous_dna_post: 610
   DeDup:
-    mapped_after_dedup: 600
-    clusterfactor: 610
+    mapped_after_dedup: 620
+    clusterfactor: 630
   Picard:
     PERCENT_DUPLICATION: 650
-  custom_content:
-    endogenous_dna: 680
-    endogenous_dna_post: 690
   DamageProfiler:
     5 Prime1: 700
     5 Prime2: 710
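
To make two of the less obvious changes in this file concrete, here is a minimal Python sketch. It assumes that MultiQC treats `path_filters` entries as shell-style globs and orders general-statistics columns by ascending `table_columns_placement` value. The file names and the tuple-keyed `placement` dictionary are illustrative only and do not reflect MultiQC's internals.

```python
from fnmatch import fnmatch

# Glob taken from the new 'bowtie2' entry under top_modules.
BT2_FILTER = "*_bt2.log"

# Hypothetical file names, for illustration only.
candidates = ["sampleA_bt2.log", "sampleA.truncated_fastqc.zip", "sampleB_bt2.log"]
bt2_logs = [name for name in candidates if fnmatch(name, BT2_FILTER)]

# Placement values after this commit (a subset of table_columns_placement).
placement = {
    ("Samtools Flagstat (post-samtools filter)", "mapped_passed"): 554,
    ("custom_content", "endogenous_dna"): 600,
    ("custom_content", "endogenous_dna_post"): 610,
    ("DeDup", "mapped_after_dedup"): 620,
    ("DeDup", "clusterfactor"): 630,
    ("Picard", "PERCENT_DUPLICATION"): 650,
}

# Assumption: a lower placement value puts the column further left.
column_order = [
    column
    for (_module, column), _position in sorted(placement.items(), key=lambda item: item[1])
]

print(bt2_logs)
print(column_order)
```

Under these assumptions, the endogenous DNA columns now sit between the post-filter flagstat columns and the DeDup columns, rather than after Picard as they did with the old values of 680 and 690.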

bin/scrape_software_versions.py

Lines changed: 5 additions & 3 deletions
@@ -8,11 +8,12 @@
 'Nextflow': ['v_nextflow.txt', r"(\S+)"],
 'FastQC': ['v_fastqc.txt', r"FastQC v(\S+)"],
 'AdapterRemoval':['v_adapterremoval.txt', r"AdapterRemoval ver. (\S+)"],
-'Picard MarkDuplicates': ['v_markduplicates.txt', r"([\d\.]+)-SNAPSHOT"],
+'Picard MarkDuplicates': ['v_markduplicates.txt', r"(\S+)"],
 'Samtools': ['v_samtools.txt', r"samtools (\S+)"],
 'Preseq': ['v_preseq.txt', r"Version: (\S+)"],
 'MultiQC': ['v_multiqc.txt', r"multiqc, version (\S+)"],
-'BWA': ['v_bwa.txt', r"Version: (\S+)"],
+'BWA': ['v_bwa.txt', r"Version: (\S+)"],
+'Bowtie2': ['v_bowtie2.txt', r"bowtie2-([0-9]+\.[0-9]+\.[0-9]+) -fdebug"],
 'Qualimap': ['v_qualimap.txt', r"QualiMap v.(\S+)"],
 'GATK HaplotypeCaller': ['v_gatk.txt', r" v(\S+)"],
 #'GATK UnifiedGenotyper': ['v_gatk3_5.txt', r"version (\S+)"],
@@ -24,7 +25,7 @@
 'circulargenerator':['v_circulargenerator.txt',r"CircularGeneratorv(\S+)"],
 'DeDup':['v_dedup.txt',r"DeDup v(\S+)"],
 'freebayes':['v_freebayes.txt',r"v([0-9]\S+)"],
-'Sequence Tools':['v_sequencetools.txt',r"v([0-9]\S+)"],
+'sequenceTools':['v_sequencetools.txt',r"(\S+)"],
 'maltextract':['v_maltextract.txt', r"version(\S+)"],
 'malt':['v_malt.txt',r"version (\S+)"],
 'multivcfanalyzer':['v_multivcfanalyzer.txt', r"MultiVCFAnalyzer - (\S+)"],
@@ -43,6 +44,7 @@
 results['AdapterRemoval'] = '<span style="color:#999999;\">N/A</span>'
 results['fastP'] = '<span style="color:#999999;\">N/A</span>'
 results['BWA'] = '<span style="color:#999999;\">N/A</span>'
+results['Bowtie2'] = '<span style="color:#999999;\">N/A</span>'
 results['circulargenerator'] = '<span style="color:#999999;\">N/A</span>'
 results['Samtools'] = '<span style="color:#999999;\">N/A</span>'
 results['endorS.py'] = '<span style="color:#999999;\">N/A</span>'
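
The entries in this file pair each tool with a version file and a regular expression that captures the version string. As a rough sketch of how such a pattern can be applied, the snippet below runs two patterns copied from the diff against plain strings. The `extract_version` helper and the sample inputs are hypothetical; they are not taken from the script itself or from real tool output.

```python
import re

# Two patterns copied from the diff. The version file names are left out
# because this sketch works on in-memory strings instead of files.
PATTERNS = {
    "Samtools": r"samtools (\S+)",
    "Bowtie2": r"bowtie2-([0-9]+\.[0-9]+\.[0-9]+) -fdebug",
}


def extract_version(tool, text):
    """Return the first captured group for the tool's pattern, or None if nothing matches."""
    match = re.search(PATTERNS[tool], text)
    return match.group(1) if match else None


# Made-up inputs, for illustration only.
print(extract_version("Samtools", "samtools 1.9"))
print(extract_version("Bowtie2", "work/bowtie2-2.3.5 -fdebug-prefix-map=..."))
print(extract_version("Bowtie2", "bowtie2 version unknown"))
```

The Bowtie2 pattern only matches when a three-part version number is followed directly by ` -fdebug`, so text without that suffix yields `None` here. Assuming the script keeps its defaults when no match is found, the `N/A` placeholder added for `results['Bowtie2']` is what a report would show in that case.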
