Merged
Changes from 8 commits
3 changes: 3 additions & 0 deletions CHANGELOG.md
@@ -10,6 +10,9 @@ and this project adheres to [Semantic Versioning](http://semver.org/spec/v2.0.0.
* [#80](https://github.com/nf-core/eager/pull/80) - BWA Index file handling
* [#77](https://github.com/nf-core/eager/pull/77) - Lots of documentation updates by [@jfy133](https://github.com/jfy133)

### `Changed`
* [#81](https://github.com/nf-core/eager/pull/81) - Renaming of certain BAM options

## [2.0.2] - 2018-11-03

### `Changed`
2 changes: 1 addition & 1 deletion conf/binac.config
@@ -10,7 +10,7 @@ singularity {
}

process {
beforeScript = 'module load devel/singularity/2.4.1'
beforeScript = 'module load devel/singularity/2.6.0'
executor = 'pbs'
queue = 'short'
}
14 changes: 9 additions & 5 deletions docs/usage.md
@@ -383,17 +383,21 @@ Turn this on to utilize BWA Mem instead of `bwa aln` for alignment. Can be quite

Users can configure to keep/discard/extract certain groups of reads efficiently in the nf-core/eager pipeline.

### `--bam_keep_mapped_only`
### `--bam_analyse_mapped_only`

This can be used to only keep mapped reads for downstream analysis. By default turned off, all reads are kept in the BAM file. Unmapped reads are stored both in BAM and FastQ format e.g. for different downstream processing.
This can be used to only keep mapped reads in the BAM file for downstream analysis. By default turned off, where all reads are kept in the bam file. Unmapped reads are stored both in BAM and FastQ format e.g. for different downstream processing.
Member

Possibly rephrase again for clarity (my bad):
Use only mapped reads in the BAM file for downstream analysis. Unmapped reads are stored separately in both BAM and FASTQ format, e.g. for different downstream processing. Turned off by default, in which case both mapped and unmapped reads are kept in the output BAM file.

Member

I wonder if it makes sense to instead have an independent flag to save unmapped reads in FASTQ rather than BAM format. As it stands, storing both creates redundancy.


### `--bam_discard_unmapped_entirely`

This discards all unmapped and extracted reads entirely. By default, this is turned off.

### `--bam_keep_all`
Member

What is the difference between this flag and `--bam_retain_all_reads`?

Member Author

`--bam_discard_unmapped_entirely`

This removes only the unmapped-reads files: the BAM file then contains only mapped reads, and the unmapped reads are discarded entirely (no FASTQ or BAM output for them at all).

I guess I'll have to give this some more thought. Right now `--bam_retain_all_reads` is simply the switch that turns on BAM filtering in general.


Turned on by default, keeps all reads that were mapped in the dataset.
Turned on by default, keeps all reads that were mapped in the dataset.

### `--bam_filter_reads`
### `--bam_retain_all_reads`

Specify this, if you want to filter reads for downstream analysis.
Specify this, if you want to filter reads for downstream analysis. This keeps all mapped and unmapped reads in the output, but allows for quality threshold filtering using `--bam_mapping_quality_threshold`.

### `--bam_mapping_quality_threshold`

14 changes: 8 additions & 6 deletions main.nf
@@ -76,8 +76,8 @@ def helpMessage() {
--bwamem Turn on BWA Mem instead of CM/BWA aln for mapping

BAM Filtering
--bam_keep_mapped_only Only consider mapped reads for downstream analysis. Unmapped reads are extracted to separate output.
--bam_filter_reads Keep all reads in BAM file for downstream analysis
--bam_analyse_mapped_only Only consider mapped reads for downstream analysis. Unmapped reads are extracted to separate output.
--bam_retain_all_reads Keep all reads in BAM file for downstream analysis
--bam_mapping_quality_threshold Minimum mapping quality for reads filter

DeDuplication
@@ -173,9 +173,9 @@ params.circularfilter = false
params.bwamem = false

//BAM Filtering steps (default = keep mapped and unmapped in BAM file)
params.bam_keep_mapped_only = false
params.bam_analyse_mapped_only = false
params.bam_keep_all = true
params.bam_filter_reads = false
params.bam_retain_all_reads = false
params.bam_mapping_quality_threshold = 0

//DamageProfiler settings
@@ -715,16 +715,18 @@ process samtools_filter {
file "*.unmapped.bam" optional true
file "*.bai"

when: "${params.bam_filter_reads}"
when: "${params.bam_retain_all_reads}"

script:
prefix="$bam" - ~/(\.bam)?/
rm_unmapped = "${params.bam_discard_unmapped_entirely}" ? 'rm *.unmapped.*' : ''

if("${params.bam_keep_mapped_only}"){
if("${params.bam_analyse_mapped_only}"){
"""
samtools view -h $bam | tee >(samtools view - -@ ${task.cpus} -f4 -q ${params.bam_mapping_quality_threshold} -o ${prefix}.unmapped.bam) >(samtools view - -@ ${task.cpus} -F4 -q ${params.bam_mapping_quality_threshold} -o ${prefix}.filtered.bam)
samtools fastq -tn "${prefix}.unmapped.bam" | gzip > "${prefix}.unmapped.fq.gz"
samtools index -@ ${task.cpus} ${prefix}.filtered.bam
${rm_unmapped}
"""
} else {
"""
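
The `samtools view` calls in the hunk above split reads on the SAM FLAG and MAPQ columns: `-f4` keeps reads with the unmapped bit (0x4) set, `-F4` drops them, and `-q` drops reads whose MAPQ is below the threshold. As a minimal sketch of that selection logic only, the following mimics it with `awk` on a made-up three-read table. The read names, values, and function names are hypothetical and not part of the pipeline, which uses samtools on real BAM files.

```shell
#!/bin/sh
# Hypothetical, truncated SAM-like records (tab-separated):
# QNAME, FLAG, RNAME, POS, MAPQ. Invented for illustration only.
sam_records() {
    printf 'read1\t0\tchr1\t100\t37\n'
    printf 'read2\t4\t*\t0\t0\n'
    printf 'read3\t16\tchr1\t250\t10\n'
}

# Mimics `samtools view -F4 -q MIN`: skip reads with the 0x4 (unmapped)
# bit set, then skip reads whose MAPQ is below MIN.
mapped_reads() {
    sam_records | awk -F '\t' -v min="$1" \
        'int($2 / 4) % 2 == 0 && $5 >= min { print $1 }'
}

# Mimics `samtools view -f4 -q MIN`: keep only reads with the 0x4 bit set,
# with the same MAPQ cutoff applied.
unmapped_reads() {
    sam_records | awk -F '\t' -v min="$1" \
        'int($2 / 4) % 2 == 1 && $5 >= min { print $1 }'
}
```

With a threshold of 0 (the default of `params.bam_mapping_quality_threshold` in this diff), `mapped_reads 0` lists `read1` and `read3`, and `unmapped_reads 0` lists `read2`. Because the hunk passes `-q` to both branches, a threshold of 20 leaves the unmapped selection empty in this toy data, since the unmapped read here has MAPQ 0.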
1 change: 1 addition & 0 deletions nextflow.config
@@ -27,6 +27,7 @@ params {
complexity_filter = false
complexity_filter_poly_g_min = 10
trim_bam = false
bam_discard_unmapped_entirely = false

// AWS Batch
awsqueue = false