We currently stage the index files referenced in the standard igenomes.config in different ways across pipelines. It would be nice to unify this and use the same logic in the pipeline code. The inconsistency is often a source of confusion, and I think it's about time we came up with a robust solution.
One possible solution is to add the index prefix in igenomes.config for all types of indices (this is only done for BWA at the moment; see line 15 of tools/nf_core/pipeline-template/{{cookiecutter.name_noslash}}/conf/igenomes.config at d435406):

    bwa = "${params.igenomes_base}/Homo_sapiens/Ensembl/GRCh37/Sequence/BWAIndex/genome.fa"

and then to split the path into a directory and a prefix in the pipeline code so that the directory can be staged. This accommodates custom user-provided paths and cases where the index is named differently from the genome fasta, and it works on AWS, where globs may not resolve all of the other files in the index (e.g. genome.fa*; @maxulysse ?).
https://github.com/nf-core/chipseq/blob/21be3149542cdc84431e12d1e092359058aed32a/main.nf#L168-L183
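
To make the proposal concrete, here is a minimal Groovy sketch of the directory/prefix split. It is not the code from the chipseq link above; `splitIndexPath` is a hypothetical helper name, and the example paths are made up for illustration. It uses plain string operations rather than `java.io.File`, on the assumption that index paths may also be remote URIs such as `s3://...`.

```groovy
// Hypothetical helper (not part of any nf-core pipeline): split an index path
// such as ".../BWAIndex/genome.fa" into the directory to stage and the prefix
// to pass to the aligner.
def splitIndexPath(String indexPath) {
    int lastSlash = indexPath.lastIndexOf('/')
    // Everything before the last '/' is the directory; fall back to '.' if there is none
    String dir = lastSlash >= 0 ? indexPath.substring(0, lastSlash) : '.'
    // Everything after the last '/' is the index prefix
    String prefix = indexPath.substring(lastSlash + 1)
    return [dir: dir, prefix: prefix]
}

// Illustrative path only; the base directory is an assumption
def parts = splitIndexPath('/data/igenomes/Homo_sapiens/Ensembl/GRCh37/Sequence/BWAIndex/genome.fa')
assert parts.dir == '/data/igenomes/Homo_sapiens/Ensembl/GRCh37/Sequence/BWAIndex'
assert parts.prefix == 'genome.fa'
```

In a pipeline, the directory part would then be the thing that gets staged (for example via `Channel.fromPath(parts.dir, checkIfExists: true)`), while the prefix is passed to the aligner command, so no glob needs to be resolved.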
It would be nice if we could update and use a single igenomes.config across all pipelines, i.e. update the template version and have it rolled out to pipelines via the automated sync. It may be worth adding the Bowtie 1 index paths used in smrnaseq (ping @lpantano). Also, ping @maxibor, who is using Bowtie2 in coproid.