Commit 6f41b7c

Merge pull request #660 from FriederikeHanssen/release_review
Review Comments for release
2 parents ce77595 + 6a893cc commit 6f41b7c

43 files changed

Lines changed: 997 additions & 671 deletions

.github/workflows/local_modules.yml

Lines changed: 0 additions & 99 deletions
This file was deleted.

conf/test.config

Lines changed: 1 addition & 1 deletion
@@ -177,7 +177,7 @@ profiles {
 }
 umi {
 params.input = "${projectDir}/tests/csv/3.0/fastq_umi.csv"
-params.umi_read_structure = '7M1S+T'
+params.umi_read_structure = '+T 7M1S+T'
 }
 use_gatk_spark {
 params.use_gatk_spark = 'baserecalibrator,markduplicates'

conf/test_full_somatic.config

Lines changed: 0 additions & 1 deletion
@@ -19,7 +19,6 @@ params {
 
 // Other params
 tools = 'strelka,mutect2,freebayes,ascat,manta,cnvkit,tiddit,controlfreec,vep'
-
 split_fastq = 20000000
 intervals = 's3://nf-core-awsmegatests/sarek/input/S07604624_Padded_Agilent_SureSelectXT_allexons_V6_UTR.bed'
 wes = true

docs/usage.md

Lines changed: 4 additions & 3 deletions
@@ -26,7 +26,7 @@ Note that the pipeline will create the following files in your working directory
 ```console
 work # Directory containing the nextflow working files
 results # Finished results (configurable, see below)
-.nextflow_log # Log file from Nextflow
+.nextflow.log # Log file from Nextflow
 # Other nextflow hidden files, eg. history of pipeline runs and old logs.
 ```
 
@@ -58,7 +58,7 @@ Multiple CSV files can be specified if the path is enclosed in quotes.
 | `sex` | **Sex chromosomes of the patient**; i.e. XX, XY..., only used for Copy-Number Variation analysis in a tumor/pair<br /> _Optional, Default: `NA`_ |
 | `status` | **Normal/tumor status of sample**; can be `0` (normal) or `1` (tumor).<br /> _Optional, Default: `0`_ |
 | `sample` | **Custom sample ID** for each tumor and normal sample; more than one tumor sample for each subject is possible, i.e. a tumor and a relapse; samples can have multiple lanes for which the _same_ ID must be used to merge them later (see also `lane`). Sample IDs must be unique for unique biological samples <br /> _Required_ |
-| `lane` | Lane ID, used when the `sample` is multiplexed on several lanes. Must be unique for each lane in the same sample (but does not need to be the original lane name), and must contain at least one character <br /> _Required for `--step_mapping`_ |
+| `lane` | Lane ID, used when the `sample` is multiplexed on several lanes. Must be unique for each lane in the same sample (but does not need to be the original lane name), and must contain at least one character <br /> _Required for `--step mapping`_ |
 | `fastq_1` | Full path to FastQ file for Illumina short reads 1. File has to be gzipped and have the extension `.fastq.gz` or `.fq.gz`. |
 | `fastq_2` | Full path to FastQ file for Illumina short reads 2. File has to be gzipped and have the extension `.fastq.gz` or `.fq.gz`. |
 | `bam` | Full path to (u)BAM file |
@@ -672,7 +672,8 @@ This will enable pre-processing of the reads and UMI consensus reads calling, wh
 ### UMI Read Structure
 
 This parameter is a string, which follows a [convention](https://github.com/fulcrumgenomics/fgbio/wiki/Read-Structures) to describe the structure of the umi.
-If your reads contain a UMI only on one end, the string should only represent one structure (i.e. "2M11S+T"); should your reads contain a UMI on both ends, the string will contain two structures separated by a blank space (i.e. "2M11S+T 2M11S+T").
+
+As an example: if your reads contain a UMI only on the forward read, the string can only represent one structure (i.e. "2M11S+T"); should your reads contain a UMI on both reas, the string will contain two structures separated by a blank space (i.e. "2M11S+T 2M11S+T"); should your reads contain a UMI only on the reverse read, your structure must represent the template only for the forward read and template plus UMI for the reverse read (i.e. +T 12M11S+T). Please do refer to FGBIO documentation for more details, as providing the correct structure is essential and specific to the UMI kit used.
 
 ### Limitations and future updates
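
To make the three cases described in the updated docs/usage.md text concrete, here is a minimal sketch of how they might be written in a custom Nextflow config. The profile names (`umi_forward_only`, `umi_both_reads`, `umi_reverse_only`) are hypothetical and not defined by the pipeline; the read structure values are the ones quoted in the text. In the fgbio convention, each segment is a length followed by an operator (`M` for UMI bases, `S` for skipped bases, `T` for template), and `+` in place of a length means "all remaining bases".

```groovy
// Hypothetical profiles, for illustration only; these names are not part of the pipeline.
profiles {
    umi_forward_only {
        // UMI on the forward read only: 2 UMI bases, 11 skipped bases, the rest is template
        params.umi_read_structure = '2M11S+T'
    }
    umi_both_reads {
        // UMI on both reads: one structure per read, separated by a blank space
        params.umi_read_structure = '2M11S+T 2M11S+T'
    }
    umi_reverse_only {
        // UMI on the reverse read only: forward read is template only
        params.umi_read_structure = '+T 12M11S+T'
    }
}
```

The correct values depend on the UMI kit used, so these strings should be treated as placeholders rather than recommendations.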

nextflow.config

Lines changed: 46 additions & 46 deletions
@@ -10,43 +10,43 @@ params {
 // Workflow flags:
 
 // Mandatory arguments
-input = null // No default input
-step = 'mapping' // Starts with mapping
+input = null // No default input
+step = 'mapping' // Starts with mapping
 
 // Genome and references options
-genome = 'GATK.GRCh38'
-igenomes_base = 's3://ngi-igenomes/igenomes/'
+genome = 'GATK.GRCh38'
+igenomes_base = 's3://ngi-igenomes/igenomes/'
 igenomes_ignore = false
-save_reference = false // Built references not saved
+save_reference = false // Built references not saved
 
 // Main options
-no_intervals = false // Intervals will be built from the fasta file
-nucleotides_per_second = 1000 // Default interval size
-tools = null // No default Variant_Calling or Annotation tools
-skip_tools = null // All tools (markduplicates + baserecalibrator + QC) are used by default
+no_intervals = false // Intervals will be built from the fasta file
+nucleotides_per_second = 1000 // Default interval size
+tools = null // No default Variant_Calling or Annotation tools
+skip_tools = null // All tools (markduplicates + baserecalibrator + QC) are used by default
+split_fastq = 0 // FASTQ files will not be split by default by FASTP
 
-// Modify fastqs (trim/split)
-trim_fastq = false // No trimming
-clip_r1 = 0
-clip_r2 = 0
+// Modify fastqs (trim/split) with FASTP
+trim_fastq = false // No trimming
+clip_r1 = 0
+clip_r2 = 0
 three_prime_clip_r1 = 0
 three_prime_clip_r2 = 0
-trim_nextseq = 0
-save_trimmed = false
-split_fastq = 0 // FASTQ files will not be split by default
-save_split_fastqs = false
+trim_nextseq = 0
+save_trimmed = false
+save_split_fastqs = false
 
 // UMI tagged reads
-umi_read_structure = null // no UMI
-group_by_umi_strategy = 'Adjacency' // default strategy when UMI
+umi_read_structure = null // no UMI
+group_by_umi_strategy = 'Adjacency' // default strategy when running with UMI for GROUPREADSBYUMI
 
 // Preprocessing
-aligner = 'bwa-mem' // Default is bwa-mem, bwa-mem2 and dragmap can be used too
-use_gatk_spark = null // GATK Spark implementation of their tools in local mode not used by default
-save_bam_mapped = false // Mapped BAMs not saved
-save_output_as_bam = false //Output files from preprocessing are saved as bam and not as cram files
-seq_center = null // No sequencing center to be written in read group CN field by aligner
-seq_platform = 'ILLUMINA' // Default platform written in read group PL field by aligner
+aligner = 'bwa-mem' // Default is bwa-mem, bwa-mem2 and dragmap can be used too
+use_gatk_spark = null // GATK Spark implementation of their tools in local mode not used by default
+save_bam_mapped = false // Mapped BAMs not saved
+save_output_as_bam = false //Output files from preprocessing are saved as bam and not as cram files
+seq_center = null // No sequencing center to be written in read group CN field by aligner
+seq_platform = 'ILLUMINA' // Default platform written in read group PL field by aligner
 
 // Variant Calling
 only_paired_variant_calling = false //if true, skips germline variant calling for normal-paired samples
@@ -62,31 +62,31 @@ params {
 cf_mincov = 0 // ControlFreec default values
 cf_minqual = 0 // ControlFreec default values
 cf_window = null // by default we are not using this in Control-FREEC
-ignore_soft_clipped_bases = false // no --dont-use-soft-clipped-bases for GATK Mutect2
-wes = false // Set to true, if data is exome/targeted sequencing data. Used to use correct models in various variant callers
+ignore_soft_clipped_bases = false // no --dont-use-soft-clipped-bases for GATK Mutect2
+wes = false // Set to true, if data is exome/targeted sequencing data. Used to use correct models in various variant callers
 
 // Annotation
-vep_out_format = 'vcf'
-vep_dbnsfp = null // dbnsfp plugin disabled within VEP
-dbnsfp = null // No dbnsfp processed file
-dbnsfp_tbi = null // No dbnsfp processed file index
-dbnsfp_consequence = null // No default consequence for dbnsfp plugin
-dbnsfp_fields = "rs_dbSNP,HGVSc_VEP,HGVSp_VEP,1000Gp3_EAS_AF,1000Gp3_AMR_AF,LRT_score,GERP++_RS,gnomAD_exomes_AF" // Default fields for dbnsfp plugin
-vep_loftee = null // loftee plugin disabled within VEP
-vep_spliceai = null // spliceai plugin disabled within VEP
-spliceai_snv = null // No spliceai_snv file
-spliceai_snv_tbi = null // No spliceai_snv file index
-spliceai_indel = null // No spliceai_indel file
-spliceai_indel_tbi = null // No spliceai_indel file index
-vep_spliceregion = null // spliceregion plugin disabled within VEP
-snpeff_cache = null // No directory for snpEff cache
-vep_cache = null // No directory for VEP cache
-vep_include_fasta = false // Don't use fasta file for annotation with VEP
+vep_out_format = 'vcf'
+vep_dbnsfp = null // dbnsfp plugin disabled within VEP
+dbnsfp = null // No dbnsfp processed file
+dbnsfp_tbi = null // No dbnsfp processed file index
+dbnsfp_consequence = null // No default consequence for dbnsfp plugin
+dbnsfp_fields = "rs_dbSNP,HGVSc_VEP,HGVSp_VEP,1000Gp3_EAS_AF,1000Gp3_AMR_AF,LRT_score,GERP++_RS,gnomAD_exomes_AF" // Default fields for dbnsfp plugin
+vep_loftee = null // loftee plugin disabled within VEP
+vep_spliceai = null // spliceai plugin disabled within VEP
+spliceai_snv = null // No spliceai_snv file
+spliceai_snv_tbi = null // No spliceai_snv file index
+spliceai_indel = null // No spliceai_indel file
+spliceai_indel_tbi = null // No spliceai_indel file index
+vep_spliceregion = null // spliceregion plugin disabled within VEP
+snpeff_cache = null // No directory for snpEff cache
+vep_cache = null // No directory for VEP cache
+vep_include_fasta = false // Don't use fasta file for annotation with VEP
 
 // MultiQC options
-multiqc_config = null
-multiqc_title = null
-max_multiqc_email_size = '25.MB'
+multiqc_config = null
+multiqc_title = null
+max_multiqc_email_size = '25.MB'
 
 // Boilerplate options
 outdir = 'results'
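
The defaults in this file can be overridden without editing nextflow.config itself. A minimal sketch, assuming a user-supplied file (hypothetically named `custom.config`) that is passed to Nextflow with its `-c` option; the parameter names come from the diff above, while the values are illustrative only (the split size mirrors conf/test_full_somatic.config):

```groovy
// custom.config (hypothetical file name, not part of this commit)
params {
    split_fastq       = 20000000   // enable FASTQ splitting; value copied from conf/test_full_somatic.config
    save_split_fastqs = true       // keep the split FASTQ files
    aligner           = 'bwa-mem2' // one of the alternatives named in the default's comment
}
```

Any parameter not listed in such a file keeps the default shown in nextflow.config.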

nextflow_schema.json

Lines changed: 1 addition & 1 deletion
@@ -166,7 +166,7 @@
 "fa_icon": "fas fa-tape",
 "description": "Specify UMI read structure",
 "hidden": true,
-"help_text": "One structure if UMI is present on one end (i.e. '2M11S+T'), or two structures separated by a blank space if UMIs a present on both ends (i.e. '2M11S+T 2M11S+T'); please note, this does not handle duplex-UMIs.\n\nIt is recommended to skip duplicate marking and base quality score recalibration. See `--skip_tools`."
+"help_text": "One structure if UMI is present on one end (i.e. '+T 2M11S+T'), or two structures separated by a blank space if UMIs a present on both ends (i.e. '2M11S+T 2M11S+T'); please note, this does not handle duplex-UMIs.\n\nFor more info on UMI usage in the pipeline, also check docs [here](./docs/usage.md/#how-to-handle-umis)."
 },
 "group_by_umi_strategy": {
 "type": "string",
