Add cram support + read splitting with seqkit for speedup #388
Merged
73 commits
6ce647d Add cram support, read splitting (FriederikeHanssen)
21a695a Add estimate library complexity if spark is used (FriederikeHanssen)
a634a79 Fixes resume problem, but is losing the file name... (FriederikeHanssen)
1632d4b Add tmp dir to gatk processes so tmp files are written to the proper … (FriederikeHanssen)
ab0c1c8 Fix filename display + resume for TABIX (FriederikeHanssen)
e8290a1 Try to get spark to work (FriederikeHanssen)
809d321 Add MDSpark back in (FriederikeHanssen)
6ba8720 try with runoptions (FriederikeHanssen)
46dcdc2 The newest gatk container is not working for me with spark, 4.1.9.0 is (FriederikeHanssen)
41649cc Add docker.userEmulation back in (FriederikeHanssen)
b4dd4ca Add more spark things (FriederikeHanssen)
147d6b4 Fix module params (FriederikeHanssen)
53d69fa Publish ref (FriederikeHanssen)
5b4fc53 try with path instead of file (FriederikeHanssen)
1158010 try with path instead of file (FriederikeHanssen)
9b1768f try with fromFile instead (FriederikeHanssen)
6323fbe file (FriederikeHanssen)
838932f whole fixownership seems to work (FriederikeHanssen)
dd4f783 Merge remote-tracking branch 'upstream/dsl2' into dsl2 (FriederikeHanssen)
39403ec Add numLanes to meta sheet to deal with blocked mapping output channels (FriederikeHanssen)
a62acb1 Add channel dumping to check for missing id (FriederikeHanssen)
cd51cbc use groupKey instead (FriederikeHanssen)
21c168e not sure if this works, but run a bigger test for this (FriederikeHanssen)
d4354d6 Simplify mapping expression (FriederikeHanssen)
26448a6 remove unused sw & add ref to samtools stats (FriederikeHanssen)
87b6831 Add skip_fastqc in again (FriederikeHanssen)
5184462 try with exporting ref path and cache (FriederikeHanssen)
46effa4 Remove quotes (FriederikeHanssen)
70d28c0 add mutect2 somatic module (FriederikeHanssen)
a920909 Try to circumvent stats issues with view (FriederikeHanssen)
2047dc8 Add ref to cram merge (FriederikeHanssen)
112aaf0 Add ref to cram merge (FriederikeHanssen)
52632e6 Use double quotes for output (FriederikeHanssen)
144fef7 Add ref to stats (FriederikeHanssen)
44c5bdf fix logic for bam to cram conversion (FriederikeHanssen)
1d5b395 Simplify if (FriederikeHanssen)
6a77b3c Add memory overhead for gatk based tools (FriederikeHanssen)
5ab6078 Add mutect2 somatic (FriederikeHanssen)
e1e0d34 add conf (FriederikeHanssen)
dbe4bdb remove failing dump statement (FriederikeHanssen)
951b78d select spark tools (FriederikeHanssen)
f08901f change use_gatk_spark in bwamem2 (FriederikeHanssen)
d44e522 change use_gatk_spark in bwamem2 (FriederikeHanssen)
dda8b13 add dump tag to figure out why bqsr is not working (FriederikeHanssen)
db3d98a try without clone to get to work on aws (FriederikeHanssen)
287146f remove meta.id (FriederikeHanssen)
2732291 Use channels for known sites (FriederikeHanssen)
c398e01 Try to fix known_sites channel (FriederikeHanssen)
a665e66 add groupTuple back in (FriederikeHanssen)
bccdb7d change dbsnp/knownindels channel (FriederikeHanssen)
118f9c1 fix multiple knownindels input (FriederikeHanssen)
c2e73c6 add dump statements, why are the intervals not working for humans (FriederikeHanssen)
d5cec16 sth of when providing multiple indices (FriederikeHanssen)
777f7a2 add tbi back in (FriederikeHanssen)
286bbe3 concat seems to fix this channel madness (FriederikeHanssen)
3fd8078 hardcode number of intervals for tests (FriederikeHanssen)
493a147 fix docker image tag, can't find singularity one (FriederikeHanssen)
dd88aa3 add haplotypecaller back in (FriederikeHanssen)
ec7a2d5 count num intervals with map operator (FriederikeHanssen)
1be7416 collect dbsnp tbi to avoid consumption of channel (FriederikeHanssen)
206db89 add gvcf back in (FriederikeHanssen)
9ce70a0 Add bamqc after bqsr with crams (FriederikeHanssen)
047e5cf Use docker image for htslib + singularity (FriederikeHanssen)
19c911a add dbsnp back in (FriederikeHanssen)
2342971 add dbsnp back in (FriederikeHanssen)
5e8f2cc Resolve merge conflicts (FriederikeHanssen)
4f974f6 add step/tools to indices wf (FriederikeHanssen)
d766d44 Resolve remaining merge conflicts/fix problems (FriederikeHanssen)
3a37778 add try/catch to figure out why module conf is not loaded (FriederikeHanssen)
22d55ab Split by num reads instead of parts to generate similar sized files (FriederikeHanssen)
3f26535 Code clean up (FriederikeHanssen)
069a4f1 Fix merge conflicts (FriederikeHanssen)
1866022 apply suggestions from code review (FriederikeHanssen)
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,33 @@ | ||
| // Import generic module functions | ||
| include { initOptions; saveFiles; getSoftwareName } from './functions' | ||
| | ||
| params.options = [:] | ||
| options = initOptions(params.options) | ||
| | ||
| process INDEX_TARGET_BED { | ||
| tag "$target_bed" | ||
| label 'process_medium' | ||
| publishDir "${params.outdir}", | ||
| mode: params.publish_dir_mode, | ||
| saveAs: { filename -> saveFiles(filename:filename, options:params.options, publish_dir:getSoftwareName(task.process), meta:[:], publish_by_meta:[]) } | ||
| | ||
| conda (params.enable_conda ? "bioconda::htslib=1.12" : null) | ||
| if (workflow.containerEngine == 'singularity' && !params.singularity_pull_docker_container) { | ||
| // TODO: No Singularity image is available yet, so use the Docker container for now | ||
| container "quay.io/biocontainers/htslib:1.12--h9093b5e_1" | ||
| } else { | ||
| container "quay.io/biocontainers/htslib:1.12--hd3b49d5_0" | ||
| } | ||
| | ||
| input: | ||
| path target_bed | ||
| | ||
| output: | ||
| tuple path("${target_bed}.gz"), path("${target_bed}.gz.tbi") | ||
| | ||
| script: | ||
| """ | ||
| bgzip --threads ${task.cpus} -c ${target_bed} > ${target_bed}.gz | ||
| tabix ${target_bed}.gz | ||
| """ | ||
| } | ||
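
A note on where these files would be published: the `publishDir` closure passes `getSoftwareName(task.process)` as `publish_dir`, and that helper (defined in the `functions` file of this PR) keeps only the part of the process name before the first underscore. The minimal Groovy sketch below copies the helper and applies it to a hypothetical fully qualified process name; the `NFCORE_SAREK:SAREK:` prefix is an assumption for illustration only. Under that assumption, the output of `INDEX_TARGET_BED` would land under `index/` rather than `index_target_bed/`.

```groovy
// Copied from the functions file in this PR: derive a tool name from task.process
def getSoftwareName(task_process) {
    return task_process.tokenize(':')[-1].tokenize('_')[0].toLowerCase()
}

// Hypothetical workflow-qualified process name; the prefix is an assumption
def qualified = 'NFCORE_SAREK:SAREK:INDEX_TARGET_BED'

println getSoftwareName(qualified)           // prints "index"
println getSoftwareName('INDEX_TARGET_BED')  // prints "index"
println getSoftwareName('FREEBAYES')         // prints "freebayes"
```

If a more specific directory name is wanted, the `publish_dir` entry in the module options takes precedence over this derived value in `saveFiles`.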
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,38 @@ | ||
| // Import generic module functions | ||
| include { initOptions; saveFiles; getSoftwareName } from './functions' | ||
| | ||
| params.options = [:] | ||
| options = initOptions(params.options) | ||
| | ||
| process FREEBAYES { | ||
| tag "$meta.id" | ||
| label 'process_low' | ||
| publishDir "${params.outdir}", | ||
| mode: params.publish_dir_mode, | ||
| saveAs: { filename -> saveFiles(filename:filename, options:params.options, publish_dir:getSoftwareName(task.process), meta:meta, publish_by_meta:['id']) } | ||
| | ||
| conda (params.enable_conda ? "bioconda::freebayes=1.3.5" : null) | ||
| if (workflow.containerEngine == 'singularity' && !params.singularity_pull_docker_container) { | ||
| container "https://depot.galaxyproject.org/singularity/freebayes:1.3.5--py38ha193a2f_3" | ||
| } else { | ||
| container "quay.io/biocontainers/freebayes:1.3.5--py38ha193a2f_3" | ||
| } | ||
| | ||
| input: | ||
| tuple val(meta), path(cram), path(crai) | ||
| | ||
| output: | ||
| // TODO nf-core: Named file extensions MUST be emitted for ALL output channels | ||
| tuple val(meta), path("*.bam"), emit: bam | ||
| // TODO nf-core: List additional required output channels/values here | ||
| path "*.version.txt" , emit: version | ||
| | ||
| script: | ||
| def software = getSoftwareName(task.process) | ||
| def prefix = options.suffix ? "${meta.id}${options.suffix}" : "${meta.id}" | ||
| """ | ||
| # TODO: the freebayes call itself is still missing; only the version file is written | ||
| | ||
| echo \$(freebayes --version 2>&1) | sed 's/version:\\s*v//g' > ${software}.version.txt | ||
| """ | ||
| } | ||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,68 @@ | ||
| // | ||
| // Utility functions used in nf-core DSL2 module files | ||
| // | ||
| | ||
| // | ||
| // Extract name of software tool from process name using $task.process | ||
| // | ||
| def getSoftwareName(task_process) { | ||
| return task_process.tokenize(':')[-1].tokenize('_')[0].toLowerCase() | ||
| } | ||
| | ||
| // | ||
| // Function to initialise default values and to generate a Groovy Map of available options for nf-core modules | ||
| // | ||
| def initOptions(Map args) { | ||
| def Map options = [:] | ||
| options.args = args.args ?: '' | ||
| options.args2 = args.args2 ?: '' | ||
| options.args3 = args.args3 ?: '' | ||
| options.publish_by_meta = args.publish_by_meta ?: [] | ||
| options.publish_dir = args.publish_dir ?: '' | ||
| options.publish_files = args.publish_files | ||
| options.suffix = args.suffix ?: '' | ||
| return options | ||
| } | ||
| | ||
| // | ||
| // Tidy up and join elements of a list to return a path string | ||
| // | ||
| def getPathFromList(path_list) { | ||
| def paths = path_list.findAll { item -> !item?.trim().isEmpty() } // Remove empty entries | ||
| paths = paths.collect { it.trim().replaceAll("^[/]+|[/]+\$", "") } // Trim whitespace and leading/trailing slashes | ||
| return paths.join('/') | ||
| } | ||
| | ||
| // | ||
| // Function to save/publish module results | ||
| // | ||
| def saveFiles(Map args) { | ||
| if (!args.filename.endsWith('.version.txt')) { | ||
| def ioptions = initOptions(args.options) | ||
| def path_list = [ ioptions.publish_dir ?: args.publish_dir ] | ||
| if (ioptions.publish_by_meta) { | ||
| def key_list = ioptions.publish_by_meta instanceof List ? ioptions.publish_by_meta : args.publish_by_meta | ||
| for (key in key_list) { | ||
| if (args.meta && key instanceof String) { | ||
| def path = key | ||
| if (args.meta.containsKey(key)) { | ||
| path = args.meta[key] instanceof Boolean ? "${key}_${args.meta[key]}".toString() : args.meta[key] | ||
| } | ||
| path = path instanceof String ? path : '' | ||
| path_list.add(path) | ||
| } | ||
| } | ||
| } | ||
| if (ioptions.publish_files instanceof Map) { | ||
| for (ext in ioptions.publish_files) { | ||
| if (args.filename.endsWith(ext.key)) { | ||
| def ext_list = path_list.collect() | ||
| ext_list.add(ext.value) | ||
| return "${getPathFromList(ext_list)}/$args.filename" | ||
| } | ||
| } | ||
| } else if (ioptions.publish_files == null) { | ||
| return "${getPathFromList(path_list)}/$args.filename" | ||
| } | ||
| } | ||
| } |
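
To make the publishing logic above easier to follow, here is a minimal standalone Groovy sketch that copies `initOptions`, `getPathFromList` and `saveFiles` from the diff (with `def Map options` shortened to `Map options`) and calls them with made-up inputs. The file name `sample1.vcf`, the directory names `freebayes` and `variants`, and the sample id are assumptions chosen for illustration; they do not come from the PR.

```groovy
// Helpers adapted from the functions file above so the sketch runs standalone
def initOptions(Map args) {
    Map options = [:]
    options.args            = args.args ?: ''
    options.args2           = args.args2 ?: ''
    options.args3           = args.args3 ?: ''
    options.publish_by_meta = args.publish_by_meta ?: []
    options.publish_dir     = args.publish_dir ?: ''
    options.publish_files   = args.publish_files
    options.suffix          = args.suffix ?: ''
    return options
}

def getPathFromList(path_list) {
    def paths = path_list.findAll { item -> !item?.trim().isEmpty() }   // Remove empty entries
    paths = paths.collect { it.trim().replaceAll("^[/]+|[/]+\$", "") }  // Trim whitespace and leading/trailing slashes
    return paths.join('/')
}

def saveFiles(Map args) {
    if (!args.filename.endsWith('.version.txt')) {
        def ioptions  = initOptions(args.options)
        def path_list = [ ioptions.publish_dir ?: args.publish_dir ]
        if (ioptions.publish_by_meta) {
            def key_list = ioptions.publish_by_meta instanceof List ? ioptions.publish_by_meta : args.publish_by_meta
            for (key in key_list) {
                if (args.meta && key instanceof String) {
                    def path = key
                    if (args.meta.containsKey(key)) {
                        path = args.meta[key] instanceof Boolean ? "${key}_${args.meta[key]}".toString() : args.meta[key]
                    }
                    path = path instanceof String ? path : ''
                    path_list.add(path)
                }
            }
        }
        if (ioptions.publish_files instanceof Map) {
            for (ext in ioptions.publish_files) {
                if (args.filename.endsWith(ext.key)) {
                    def ext_list = path_list.collect()
                    ext_list.add(ext.value)
                    return "${getPathFromList(ext_list)}/$args.filename"
                }
            }
        } else if (ioptions.publish_files == null) {
            return "${getPathFromList(path_list)}/$args.filename"
        }
    }
}

// Illustrative inputs only (hypothetical file, directory and sample names)
def base = [filename: 'sample1.vcf', publish_dir: 'freebayes', meta: [id: 'sample1'], publish_by_meta: ['id']]

println getPathFromList(['/results/', ' freebayes ', '', 'sample1'])  // results/freebayes/sample1
println saveFiles(base + [options: [:]])                              // freebayes/sample1.vcf
println saveFiles(base + [options: [publish_by_meta: true]])          // freebayes/sample1/sample1.vcf
println saveFiles(base + [options: [publish_files: [vcf: 'variants']]]) // freebayes/variants/sample1.vcf
println saveFiles(base + [filename: 'freebayes.version.txt', options: [:]]) // null, so not published
```

One detail the sketch brings out: with empty options the `publish_by_meta` list passed by the module is ignored, because `initOptions` defaults it to an empty list and the `if (ioptions.publish_by_meta)` check then fails. The per-sample subdirectory only appears when the module options set `publish_by_meta` explicitly.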
I added that in nf-core/modules