Unify iGenomes index usage #522

@drpatelh

It appears that we are staging the index files from the standard igenomes.config in different ways in different pipelines. It would be nice to unify this and use the same logic in the pipeline code. The inconsistency is often a source of confusion, and I think it's about time we came up with a robust solution.

One possible solution is to add the index prefix in igenomes.config for all types of indices (this is only done for BWA at the moment):

bwa = "${params.igenomes_base}/Homo_sapiens/Ensembl/GRCh37/Sequence/BWAIndex/genome.fa"

and then to split the path into a directory and a prefix in the pipeline code so that the directory can be staged. This approach accommodates custom user-provided paths and cases where the index is named differently from the genome FASTA, and it works on AWS, where globs may not resolve all of the other files in the index (e.g. genome.fa*; @maxulysse ?).
https://github.com/nf-core/chipseq/blob/21be3149542cdc84431e12d1e092359058aed32a/main.nf#L168-L183
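
To make the idea concrete, here is a rough sketch of the directory/prefix split in plain Groovy. It is not the code at the link above; `splitIndexPath` is a hypothetical helper name, and the `/igenomes` base directory and the `s3://my-bucket` path are made-up examples. It uses plain string handling rather than `java.io.File` so that a scheme such as `s3://` is left untouched.

```groovy
// Hypothetical helper (not part of any nf-core pipeline): split an index
// path of the form <directory>/<prefix> into its two parts.
def splitIndexPath(String indexPath) {
    int cut = indexPath.lastIndexOf('/')
    if (cut < 0) {
        // No directory component: assume the index sits in the current directory
        return [dir: '.', prefix: indexPath]
    }
    // Everything before the last '/' is the directory to stage;
    // everything after it is the prefix to hand to the aligner.
    return [dir: indexPath.substring(0, cut), prefix: indexPath.substring(cut + 1)]
}

// BWA-style entry, as in the igenomes.config line above (base directory is assumed)
def bwa = splitIndexPath('/igenomes/Homo_sapiens/Ensembl/GRCh37/Sequence/BWAIndex/genome.fa')
assert bwa.dir == '/igenomes/Homo_sapiens/Ensembl/GRCh37/Sequence/BWAIndex'
assert bwa.prefix == 'genome.fa'

// Made-up remote path where the prefix differs from the FASTA name
def bt2 = splitIndexPath('s3://my-bucket/custom/Bowtie2Index/my_genome')
assert bt2.dir == 's3://my-bucket/custom/Bowtie2Index'
assert bt2.prefix == 'my_genome'
```

In a pipeline, the `dir` part would be staged as the process input and the `prefix` part passed to the aligner on the command line, so no glob over the individual index files is needed.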

It would be nice if we could update and use a single igenomes.config across all pipelines, i.e. update the template version and have it rolled out to pipelines via the automated sync. It may also be worth adding the Bowtie 1 index paths used in smrnaseq (ping @lpantano). Also, ping @maxibor, who is using Bowtie2 in coproid.

Labels: question (Further information is requested), template (nf-core pipeline/component template)
