Make ControlFREEC work for WES and WGS data in both paired and tumor-only modes#302
Make ControlFREEC work for WES and WGS data in both paired and tumor-only modes#302maxulysse merged 28 commits intonf-core:devfrom
Conversation
… the ControFREEC process
|
@jfnavarro with the help of @ewels we finally fixed #305 issue |
Yes, of course. |
…ES in Control-FREEC
| intervals = "${params.igenomes_base}/Mus_musculus/Ensembl/GRCm38/Annotation/intervals/GRCm38_calling_list.bed" | ||
| known_indels = "${params.igenomes_base}/Mus_musculus/Ensembl/GRCm38/MouseGenomeProject/mgp.v5.merged.indels.dbSNP142.normed.vcf.gz" | ||
| known_indels_index = "${params.igenomes_base}/Mus_musculus/Ensembl/GRCm38/MouseGenomeProject/mgp.v5.merged.indels.dbSNP142.normed.vcf.gz.tbi" | ||
| mappability = "${params.igenomes_base}/Mus_musculus/Ensembl/GRCm38/Annotation/Control-FREEC/GRCm38_68_mm10.gem" |
There was a problem hiding this comment.
This line should not be deleted
| Available: conda, docker, singularity, test, awsbatch, <institute> and more | ||
| --step [list] Specify starting step (only one) | ||
| Available: mapping, prepare_recalibration, recalibrate, variant_calling, annotate, Control-FREEC | ||
| Available: mapping, prepare_recalibration, recalibrate, variant_calling, annotate, ControlFREEC |
There was a problem hiding this comment.
| Available: mapping, prepare_recalibration, recalibrate, variant_calling, annotate, ControlFREEC | |
| Available: mapping, prepare_recalibration, recalibrate, variant_calling, annotate, Control-FREEC |
Actually as for tools, the string is stripped out of _ and - and lowercased, so Control-FREEC should actually work.
I would prefer that spelling as it's the one from the tool itself
maxulysse
left a comment
There was a problem hiding this comment.
Looks good to me.
@szilvajuhos any comments?
szilvajuhos
left a comment
There was a problem hiding this comment.
Looks fine, though we have changed the coefficientOfVariation for a reason I guess. It needs testing anyway.
| ascat_ploidy = null // Use default value | ||
| ascat_purity = null // Use default value | ||
| cf_coeff = "0.015" // default value for Control-FREEC | ||
| cf_coeff = "0.05" // default value for Control-FREEC |
There was a problem hiding this comment.
Dunno, what is the default value, and why exactly are we changing it?
There was a problem hiding this comment.
We are changing it to the default value as recommend in their manual (http://boevalab.inf.ethz.ch/FREEC/tutorial.html). Was there any specific reason to use a value different than the default one?
There was a problem hiding this comment.
Well, our colleagues played with that, and were happier with 0.05. Will ask why.
There was a problem hiding this comment.
I see :) I personally believe that is nice to have as default the same as in Control-FREEC specially given that this parameter can be changed by the user. The optimal value is probably different for every case but of course I may be wrong here.
There was a problem hiding this comment.
As I check the Control-FREEC source, this value is used only at one place: https://github.com/BoevaLab/FREEC/blob/master/src/GenomeCopyNumber.cpp#L60 and the sole purpose is to calculate the windowSize, if it is not provided in the config. For the clarity, it is actually genome_size/(read_number*cf_coeff^2) and my guess the readlength should be considered somewhere too, but whatever. Smaller windowSize should mean higher resolution but more noise. We can use the default, sure, and change the value in our own config. I can run some tests on our data, to be safe before release.
There was a problem hiding this comment.
Interesting, the recommended setting for WES is to use cv_window = 0. I added an entry about that in the docs. It may be a good idea to mention this for the cv_coef parameter.
|
@szilvajuhos @jfnavarro any need for more modifications? or are we good to go for a merge? |
|
Hej, I just do not know, for T/N samples it is not working for me with 11.6. Whatever, can we change https://github.com/jfnavarro/sarek/blob/dev/main.nf#L3421 and https://github.com/jfnavarro/sarek/blob/dev/main.nf#L3454 to something like so it should work for any fileformat hopefully. |
…e. Relax cpu requirements in ControlFREEC
nf-core/sarek pull request
Control-FREEC has been upgraded and the configuration of its process has been optimized and updated to work with WES data as well as WGS data. Few bugs have been fixed and improvements have been made in the code and docs. Tumor-only mode for Control-FREEC has been added too.
PR checklist
nextflow run . -profile test,docker).nf-core lint .).docsis updatedCHANGELOG.mdis updatedREADME.mdis updatedLearn more about contributing: CONTRIBUTING.md