Incomplete gVCF files with --target_bed #344

@jtangrot

Check Documentation

I have checked the following places for your error:

Description of the bug

I used Sarek to generate gVCF files with HaplotypeCaller, planning to do joint genotyping myself on all samples together. As this is exome sequencing, I first used the option --target_bed, but realised that this results in many missing genotypes. The reason is that "bcftools isec" apparently is run on the gVCF files, which removes every non-variant block whose start position does not fall within a region listed in the BED file. As a result, many reference-allele blocks are removed from the file even when part of the block is covered by the BED file (bcftools does not look at the END tag). The VCF files generated for each sample are fine, though.
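To make the difference concrete, here is a minimal sketch that contrasts start-only filtering (the behaviour described above) with filtering on the full block span from start to END. It is only an illustration, not the code that bcftools or Sarek runs: the function names and toy coordinates are hypothetical, and all intervals are simplified to 1-based inclusive positions on a single chromosome (real BED files are 0-based, half-open).

```python
from typing import List, Tuple

# (start, end), 1-based inclusive; for a gVCF block, end is the END tag value
Interval = Tuple[int, int]

def keep_by_start(blocks: List[Interval], targets: List[Interval]) -> List[Interval]:
    """Keep a block only if its start position lies inside some target."""
    return [b for b in blocks
            if any(t[0] <= b[0] <= t[1] for t in targets)]

def keep_by_overlap(blocks: List[Interval], targets: List[Interval]) -> List[Interval]:
    """Keep a block if any part of [start, END] overlaps some target."""
    return [b for b in blocks
            if any(b[0] <= t[1] and t[0] <= b[1] for t in targets)]

# Toy data: one target region and four non-variant blocks
targets = [(1000, 1200)]
blocks = [(900, 1100), (1101, 1150), (1151, 1300), (1400, 1500)]

print(keep_by_start(blocks, targets))
# [(1101, 1150), (1151, 1300)]
print(keep_by_overlap(blocks, targets))
# [(900, 1100), (1101, 1150), (1151, 1300)]
```

The block (900, 1100) covers target positions 1000 to 1100, but start-only filtering drops it because it begins before the target, so those positions end up with no genotype.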

Steps to reproduce

Command line:
nextflow run ~/sarek/main.nf -profile uppmax,singularity -with-singularity /sw/data/ToolBox/nf-core/nfcore-sarek-2.6.1.img --containerPath ~/sarek/containers --custom_config_base ~/configs-master/ --genome_base /sw/data/ToolBox/hg38bundle/ --project XXX --genome GRCh38 --step prepare_recalibration --target_bed Twist_Exome_RefSeq_targets_hg38.bed --input mapped_bam_files.tsv

Expected behaviour

Even though it may be better to do the joint genotyping on the full file anyway, I would expect the gVCF files generated with --target_bed to include (at least) the regions in the given BED file. Alternatively, a note or warning about this could be added to the description of --target_bed.
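One way to check this expectation on a given file is to compute which target positions are not covered by any block left in the gVCF. The sketch below assumes the block and target intervals have already been parsed into (start, end) tuples for a single chromosome, using 1-based inclusive positions. The function name `uncovered` is hypothetical and is not part of Sarek or bcftools.

```python
from typing import List, Tuple

Interval = Tuple[int, int]  # (start, end), 1-based inclusive

def uncovered(targets: List[Interval], blocks: List[Interval]) -> List[Interval]:
    """Return the parts of each target that no block covers."""
    gaps = []
    ordered = sorted(blocks)
    for t_start, t_end in targets:
        pos = t_start  # first target position not yet known to be covered
        for b_start, b_end in ordered:
            if b_end < pos:
                continue            # block ends before the position of interest
            if b_start > t_end:
                break               # block starts after the target ends
            if b_start > pos:
                gaps.append((pos, b_start - 1))  # hole before this block
            pos = max(pos, b_end + 1)
            if pos > t_end:
                break
        if pos <= t_end:
            gaps.append((pos, t_end))  # uncovered tail of the target
    return gaps

targets = [(1000, 1200)]
# Blocks remaining after start-only filtering: the block starting at 900 is gone
print(uncovered(targets, [(1101, 1150), (1151, 1300)]))
# [(1000, 1100)]
# With the overlapping block retained, the target is fully covered
print(uncovered(targets, [(900, 1100), (1101, 1150), (1151, 1300)]))
# []
```

An empty result means every target position is represented in the gVCF. Any interval returned is a stretch of the target that would be missing at joint genotyping.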

Log files

Have you provided the following extra information/files:

  • The command used to run the pipeline
  • The .nextflow.log file

System

  • Hardware: HPC
  • Executor: slurm
  • Sarek version: 2.6.1

Nextflow Installation

  • Version: 20.10.0.5431

Container engine

  • Engine: Singularity

Labels

bug: Something isn't working
