Commit 2a7e70e

Merge pull request #111 from apeltzer/zip_fasta
Enable gzipped FastA input as reference genome
2 parents 1f38063 + 16d4412 commit 2a7e70e

5 files changed

Lines changed: 35 additions & 4 deletions

.travis.yml

Lines changed: 2 additions & 0 deletions
@@ -48,5 +48,7 @@ script:
   - nextflow run ${TRAVIS_BUILD_DIR} -profile test,docker --pairedEnd --circularmapper --circulartarget 'NC_007596.2'
   # Test running with BWA Mem
   - nextflow run ${TRAVIS_BUILD_DIR} -profile test,docker --pairedEnd --bwamem --bwa_index results/reference_genome/bwa_index/
+  # Test with zipped reference input
+  - nextflow run ${TRAVIS_BUILD_DIR} -profile test,docker --pairedEnd --fasta 'https://raw.githubusercontent.com/nf-core/test-datasets/eager2/reference/Test.fasta.gz'
   # Test basic pipeline with Conda too
   - travis_wait 25 nextflow run ${TRAVIS_BUILD_DIR} -profile test,conda --pairedEnd --bwa_index results/reference_genome/bwa_index/

CHANGELOG.md

Lines changed: 3 additions & 1 deletion
@@ -6,9 +6,11 @@ and this project adheres to [Semantic Versioning](http://semver.org/spec/v2.0.0.
 
 ## [Unpublished / Dev Branch]
 
-### `Added`
+### `Added`
+* [#111](https://github.com/nf-core/eager/pull/110) - Allow [Zipped FastA reference input](https://github.com/nf-core/eager/issues/91)
 * [#113](https://github.com/nf-core/eager/pull/113) - All files are now staged via channels, which is considered best practice by Nextflow.
 
+
 ### `Fixed`
 * [#110](https://github.com/nf-core/eager/pull/110) - Fix for [MultiQC Missing Second FastQC report](https://github.com/nf-core/eager/issues/107)

docs/configuration/reference_genomes.md

Lines changed: 0 additions & 1 deletion
@@ -10,7 +10,6 @@ Read [Adding your own system](adding_your_own.md) to find out how to set up cust
 ## Adding paths to a config file
 Specifying long paths every time you run the pipeline is a pain.
 To make this easier, the pipeline comes configured to understand reference genome keywords which correspond to preconfigured paths, meaning that you can just specify `--genome ID` when running the pipeline.
->>>>>>> TEMPLATE
 
 Note that this genome key can also be specified in a config file if you always use the same genome.

docs/usage.md

Lines changed: 1 addition & 1 deletion
@@ -136,7 +136,7 @@ If you prefer, you can specify the full path to your reference genome when you r
 ```bash
 --fasta '[path to Fasta reference]'
 ```
-> If you don't specify appropriate `--bwa_index`, `--fasta_index` parameters, the pipeline will create these indices for you automatically. Note, that saving these for later has to be turned on using `--saveReference`.
+> If you don't specify appropriate `--bwa_index`, `--fasta_index` parameters, the pipeline will create these indices for you automatically. Note, that saving these for later has to be turned on using `--saveReference`. You may also specify the path to a gzipped (`*.gz` file extension) FastA as reference genome - this will be uncompressed by the pipeline automatically for you.
 
 ### `--genome` (using iGenomes)

main.nf

Lines changed: 29 additions & 1 deletion
@@ -217,9 +217,37 @@ Channel.fromPath("$baseDir/assets/where_are_my_files.txt")
     .into{ ch_where_for_bwa_index; ch_where_for_fasta_index; ch_where_for_seqdict}
 
 // Validate inputs
-Channel.fromPath("${params.fasta}")
+if("${params.fasta}".endsWith(".gz")){
+    //Put the zip into a channel, then unzip it and forward to downstream processes. DONT unzip in all steps, this is inefficient as NXF links the files anyways from work to work dir
+    Channel.fromPath("${params.fasta}")
+        .ifEmpty { exit 1, "No genome specified! Please specify one with --fasta"}
+        .set {ch_unzip_fasta}
+
+    process unzip_reference{
+        tag "$zipfasta"
+
+        input:
+        file zipfasta from ch_unzip_fasta
+
+        output:
+        file "*.fasta" into (ch_fasta_for_bwa_indexing, ch_fasta_for_faidx_indexing, ch_fasta_for_dict_indexing, ch_fasta_for_bwa_mapping, ch_fasta_for_damageprofiler, ch_fasta_for_qualimap, ch_fasta_for_pmdtools, ch_fasta_for_circularmapper, ch_fasta_for_circularmapper_index,ch_fasta_for_bwamem_mapping)
+
+        script:
+        """
+        pigz -f -d -p ${task.cpus} $zipfasta
+        """
+    }
+} else {
+    Channel.fromPath("${params.fasta}")
         .ifEmpty { exit 1, "No genome specified! Please specify one with --fasta"}
         .into {ch_fasta_for_bwa_indexing;ch_fasta_for_faidx_indexing;ch_fasta_for_dict_indexing; ch_fasta_for_bwa_mapping; ch_fasta_for_damageprofiler; ch_fasta_for_qualimap; ch_fasta_for_pmdtools; ch_fasta_for_circularmapper; ch_fasta_for_circularmapper_index;ch_fasta_for_bwamem_mapping}
+}
+
+
+
+
+
+
 
 //Index files provided? Then check whether they are correct and complete
 if (params.aligner != 'bwa' && !params.circularmapper && !params.bwamem){

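The branching in `main.nf` (decompress once when `--fasta` ends in `.gz`, otherwise pass the file through unchanged) can be illustrated outside Nextflow with a small shell sketch. This is only an illustration under stated assumptions: `prepare_fasta` is a hypothetical helper that does not exist in the pipeline, and plain `gzip` stands in for the `pigz` call used in the `unzip_reference` process.

```shell
#!/bin/sh
# Hypothetical helper (not part of nf-core/eager): print the path of an
# uncompressed FastA, decompressing first when the input ends in .gz.
prepare_fasta() {
    fasta="$1"
    case "$fasta" in
        *.gz)
            # Assumption: gzip stands in for the pipeline's
            # `pigz -f -d -p <cpus>` call; -f overwrites an existing output.
            gzip -d -f "$fasta" || return 1
            # Strip the .gz suffix to get the decompressed file name.
            printf '%s\n' "${fasta%.gz}"
            ;;
        *)
            # Already uncompressed: hand the path back unchanged.
            printf '%s\n' "$fasta"
            ;;
    esac
}
```

A call such as `ref=$(prepare_fasta Test.fasta.gz)` would leave `Test.fasta` on disk and store its path in `ref`. One point worth keeping in mind when reading the diff: the process output glob `*.fasta` only matches a decompressed file whose name ends in `.fasta`, so an input named, say, `genome.fa.gz` would decompress to `genome.fa` and not be captured by that pattern as written.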