Skip to content

Make ControlFREEC work for WES and WGS data in both paired and tumor-only modes#302

Merged
maxulysse merged 28 commits intonf-core:devfrom
jfnavarro:dev
Dec 18, 2020
Merged

Make ControlFREEC work for WES and WGS data in both paired and tumor-only modes#302
maxulysse merged 28 commits intonf-core:devfrom
jfnavarro:dev

Conversation

@jfnavarro
Copy link
Copy Markdown
Contributor

@jfnavarro jfnavarro commented Nov 7, 2020

nf-core/sarek pull request

Control-FREEC has been upgraded and the configuration of its process has been optimized and updated to work with WES data as well as WGS data. Few bugs have been fixed and improvements have been made in the code and docs. Tumor-only mode for Control-FREEC has been added too.

PR checklist

  • This comment contains a description of changes (with reason)
  • If you've fixed a bug or added code that should be tested, add tests!
  • If necessary, also make a PR on the nf-core/sarek branch on the nf-core/test-datasets repo
  • Ensure the test suite passes (nextflow run . -profile test,docker).
  • Make sure your code lints (nf-core lint .).
  • Documentation in docs is updated
  • CHANGELOG.md is updated
  • README.md is updated

Learn more about contributing: CONTRIBUTING.md

@jfnavarro jfnavarro requested a review from maxulysse as a code owner November 7, 2020 15:00
@maxulysse maxulysse mentioned this pull request Nov 17, 2020
8 tasks
@maxulysse
Copy link
Copy Markdown
Member

@jfnavarro with the help of @ewels we finally fixed #305 issue
So instead of removing the mappability option from Control-FREEC, would it be possible to make it as an optional parameter when the file is not provided?

@jfnavarro
Copy link
Copy Markdown
Contributor Author

@jfnavarro with the help of @ewels we finally fixed #305 issue
So instead of removing the mappability option from Control-FREEC, would it be possible to make it as an optional parameter when the file is not provided?

Yes, of course.

Comment thread conf/igenomes.config
intervals = "${params.igenomes_base}/Mus_musculus/Ensembl/GRCm38/Annotation/intervals/GRCm38_calling_list.bed"
known_indels = "${params.igenomes_base}/Mus_musculus/Ensembl/GRCm38/MouseGenomeProject/mgp.v5.merged.indels.dbSNP142.normed.vcf.gz"
known_indels_index = "${params.igenomes_base}/Mus_musculus/Ensembl/GRCm38/MouseGenomeProject/mgp.v5.merged.indels.dbSNP142.normed.vcf.gz.tbi"
mappability = "${params.igenomes_base}/Mus_musculus/Ensembl/GRCm38/Annotation/Control-FREEC/GRCm38_68_mm10.gem"
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This line should not be deleted

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Uppps :) Fixed!

Comment thread main.nf
Available: conda, docker, singularity, test, awsbatch, <institute> and more
--step [list] Specify starting step (only one)
Available: mapping, prepare_recalibration, recalibrate, variant_calling, annotate, Control-FREEC
Available: mapping, prepare_recalibration, recalibrate, variant_calling, annotate, ControlFREEC
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
Available: mapping, prepare_recalibration, recalibrate, variant_calling, annotate, ControlFREEC
Available: mapping, prepare_recalibration, recalibrate, variant_calling, annotate, Control-FREEC

Actually as for tools, the string is stripped out of _ and - and lowercased, so Control-FREEC should actually work.
I would prefer that spelling as it's the one from the tool itself

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed!

Copy link
Copy Markdown
Member

@maxulysse maxulysse left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me.
@szilvajuhos any comments?

Copy link
Copy Markdown
Contributor

@szilvajuhos szilvajuhos left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks fine, though we have changed the coefficientOfVariation for a reason I guess. It needs testing anyway.

Comment thread nextflow.config
ascat_ploidy = null // Use default value
ascat_purity = null // Use default value
cf_coeff = "0.015" // default value for Control-FREEC
cf_coeff = "0.05" // default value for Control-FREEC
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Dunno, what is the default value, and why exactly are we changing it?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We are changing it to the default value as recommend in their manual (http://boevalab.inf.ethz.ch/FREEC/tutorial.html). Was there any specific reason to use a value different than the default one?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, our colleagues played with that, and were happier with 0.05. Will ask why.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see :) I personally believe that is nice to have as default the same as in Control-FREEC specially given that this parameter can be changed by the user. The optimal value is probably different for every case but of course I may be wrong here.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As I check the Control-FREEC source, this value is used only at one place: https://github.com/BoevaLab/FREEC/blob/master/src/GenomeCopyNumber.cpp#L60 and the sole purpose is to calculate the windowSize, if it is not provided in the config. For the clarity, it is actually genome_size/(read_number*cf_coeff^2) and my guess the readlength should be considered somewhere too, but whatever. Smaller windowSize should mean higher resolution but more noise. We can use the default, sure, and change the value in our own config. I can run some tests on our data, to be safe before release.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interesting, the recommended setting for WES is to use cv_window = 0. I added an entry about that in the docs. It may be a good idea to mention this for the cv_coef parameter.

@jfnavarro jfnavarro changed the title Remove mappability option from ControlFREEC Make ControlFREEC work for WES and WGS data in both paired and tumor-only modes Dec 10, 2020
Comment thread main.nf Outdated
@maxulysse
Copy link
Copy Markdown
Member

@szilvajuhos @jfnavarro any need for more modifications? or are we good to go for a merge?

@szilvajuhos
Copy link
Copy Markdown
Contributor

Hej, I just do not know, for T/N samples it is not working for me with 11.6. Whatever, can we change https://github.com/jfnavarro/sarek/blob/dev/main.nf#L3421 and https://github.com/jfnavarro/sarek/blob/dev/main.nf#L3454 to something like

LINEWIDTH=`head -1 ${cnvTumor}| wc -w`; awk 'NF=='$LINEWIDTH'{print}'  ${cnvTumor} > TUMOR.CNVs

so it should work for any fileformat hopefully.

Copy link
Copy Markdown
Contributor

@szilvajuhos szilvajuhos left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

espléndido

@maxulysse maxulysse merged commit e94d595 into nf-core:dev Dec 18, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants