Commit 631e592

Merge pull request #701 from nf-core/sex-det-collision
Fix file collision names on sex determination of same library two strand types
2 parents 5a4621b + d840a0b commit 631e592

4 files changed

Lines changed: 29 additions & 6 deletions

CHANGELOG.md

Lines changed: 3 additions & 2 deletions
@@ -22,11 +22,12 @@ and this project adheres to [Semantic Versioning](http://semver.org/spec/v2.0.0.
  - [#688](https://github.com/nf-core/eager/issues/668) - Allow pipeline to complete, even if Qualimap crashes due to an empty or corrupt BAM file for one sample/library
  - [#683](https://github.com/nf-core/eager/pull/683) - Sets `--igenomes_ignore` to true by default, as rarely used by users currently and makes resolving configs less complex
  - Added exit code `140` to re-tryable exit code list to account for certain scheduler wall-time limit fails
- - [672](https://github.com/nf-core/eager/issues/672) - Removed java parameter from picard tools which could cause memory issues
- - [679](https://github.com/nf-core/eager/issues/679) - Refactor within-process bash conditions to groovy/nextflow, due to incompatibility with some servers environments
+ - [#672](https://github.com/nf-core/eager/issues/672) - Removed java parameter from picard tools which could cause memory issues
+ - [#679](https://github.com/nf-core/eager/issues/679) - Refactor within-process bash conditions to groovy/nextflow, due to incompatibility with some servers environments
  - [#690](https://github.com/nf-core/eager/pull/690) - Fixed ANGSD output mode for beagle by setting `-doMajorMinor 1` as default in that case
  - [#693](https://github.com/nf-core/eager/issues/693) - Fixed broken TSV input validation for the Colour Chemistry column
  - [#695](https://github.com/nf-core/eager/issues/695) - Fixed incorrect `-profile` order in tutorials (originally written reversed due to [nextflow bug](https://github.com/nextflow-io/nextflow/issues/1792))
+ - [#653](https://github.com/nf-core/eager/issues/653) - Fixed file collision errors with sexdeterrmine for two same-named libraries with different strandedness

  ### `Dependencies`

docs/output.md

Lines changed: 2 additions & 0 deletions
@@ -600,6 +600,8 @@ Sex.DetERRmine calculates the coverage of your mapped reads on the X and Y chrom
  When a bedfile of specific sites is provided, Sex.DetERRmine additionally calculates error bars around each relative coverage estimate. For this estimate to be trustworthy, the sites included in the bedfile should be spaced apart enough that a single sequencing read cannot overlap multiple sites. Hence, when a bedfile has not been provided, this error should be ignored. When a suitable bedfile is provided, each observation of a covered site is independent, and the error around the coverage is equal to the binomial error estimate. This error is then propagated during the calculation of relative coverage for the X and Y chromosomes.

+ > Note that in nf-core/eager this will be run on single- and double-stranded variants of the same library _separately_. This can also help assess for differential contamination between libraries.
+
  #### Relative Coverage

  Theoretically, males are expected to cluster around (0.5, 0.5) in the produced scatter plot, while females are expected to cluster around (1.0, 0.0). In practice, when analysing ancient DNA, these relative coverage on both axes is slightly lower than expected, and individuals can cluster around (0.45, 0.45) and (0.85, 0.05). As the number of covered sites for an individual gets smaller, the confidence on the estimate becomes lower, because it is increasingly more likely to be affected by randomness in the preservation and sequencing of ancient DNA.

main.nf

Lines changed: 23 additions & 3 deletions
@@ -2226,7 +2226,7 @@ process additional_library_merge {
  ch_trimmed_formerge.skip_merging
    .mix(ch_output_from_trimmerge)
-   .into{ ch_output_from_bamutils; ch_addlibmerge_for_qualimap; ch_for_sexdeterrmine }
+   .into{ ch_output_from_bamutils; ch_addlibmerge_for_qualimap; ch_for_sexdeterrmine_prep }

  // General mapping quality statistics for whole reference sequence - e.g. X and % coverage

@@ -2633,13 +2633,33 @@ process multivcfanalyzer {
  // Human biological sex estimation

+ // rename to prevent single/double stranded library sample name-based file conflict
+ process sexdeterrmine_prep {
+     label 'sc_small'
+
+     input:
+     tuple samplename, libraryid, lane, seqtype, organism, strandedness, udg, path(bam), path(bai) from ch_for_sexdeterrmine_prep
+
+     output:
+     file "*_{single,double}strand.bam" into ch_prepped_for_sexdeterrmine
+
+     when:
+     params.run_sexdeterrmine
+
+     script:
+     """
+     mv ${bam} ${bam.baseName}_${strandedness}strand.bam
+     """
+
+ }
+
  // As we collect all files for a single sex_deterrmine run, we DO NOT use the normal input/output tuple
- process sex_deterrmine {
+ process sexdeterrmine {
      label 'sc_small'
      publishDir "${params.outdir}/sex_determination", mode: params.publish_dir_mode

      input:
-     path bam from ch_for_sexdeterrmine.map { it[7] }.collect()
+     path bam from ch_prepped_for_sexdeterrmine.collect()
      path(bed) from ch_bed_for_sexdeterrmine

      output:

nextflow_schema.json

Lines changed: 1 addition & 1 deletion
@@ -1294,7 +1294,7 @@
  "properties": {
      "run_sexdeterrmine": {
          "type": "boolean",
-         "description": "Turn on sex determination for human reference genomes.",
+         "description": "Turn on sex determination for human reference genomes. This will run on single- and double-stranded variants of a library separately.",
          "fa_icon": "fas fa-transgender-alt",
          "help_text": "Specify to run the optional process of sex determination.\n"
      },
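
As a rough illustration of why the `sexdeterrmine_prep` rename in main.nf avoids the collision, the sketch below models the staging of collected BAM files into one directory in plain Python. It is not part of the pipeline: the file paths and the helper names (`staged_names`, `collisions`, `with_strandedness`) are hypothetical. It only assumes that collected inputs are staged side by side under their base file names, and that the `mv` command produces `<baseName>_<strandedness>strand.bam`, as in the diff.

```python
from collections import Counter
from pathlib import PurePosixPath


def staged_names(paths):
    """File names the inputs would take when staged together in one directory."""
    return [PurePosixPath(p).name for p in paths]


def collisions(names):
    """Names that occur more than once, i.e. would collide when staged."""
    return sorted(name for name, count in Counter(names).items() if count > 1)


def with_strandedness(path, strandedness):
    """Mimic `mv ${bam} ${bam.baseName}_${strandedness}strand.bam` from the diff."""
    p = PurePosixPath(path)
    return f"{p.stem}_{strandedness}strand{p.suffix}"


# Hypothetical example: the same library name sequenced with two strand types.
inputs = [
    ("work/aa/LIB1.bam", "single"),
    ("work/bb/LIB1.bam", "double"),
]

before = staged_names(path for path, _ in inputs)
after = [with_strandedness(path, strand) for path, strand in inputs]

print(collisions(before))  # the shared name collides
print(collisions(after))   # no duplicates once strandedness is in the name
```

The suffix is built from the `strandedness` field already carried in the input tuple, so two libraries that differ only in strand type end up with distinct names before `.collect()` gathers them.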
