Benchmark

This repository contains a benchmark for germline variant calling using Illumina short-read sequencing (~30X coverage).
The goal is to evaluate the performance of different variant calling pipelines using a well-characterized reference dataset.

The benchmarking focuses on SNPs and small indels, which are typical outputs of short-read germline variant calling pipelines.

Dataset

HG002 (Genome in a Bottle)

The benchmark uses the HG002 sample from the Genome in a Bottle (GIAB) consortium.

HG002 is widely used as a gold standard dataset for benchmarking germline variant calling pipelines because it provides:

High-confidence truth variants
High-confidence regions
Well-curated reference datasets for SNPs and indels

Dataset characteristics:

Sample: HG002 (NA24385)
Sequencing: Illumina short reads
Coverage: ~30X
Reference genome: GRCh38
Truth set: GIAB high-confidence variant calls

The truth VCF and confident region BED files are used to compare predicted variants against the gold standard.

Pipelines

The following pipelines are evaluated in this benchmark.

Sarek

Sarek is a widely used Nextflow pipeline for germline and somatic variant calling.

Key characteristics:

Developed by the nf-core community
Supports multiple variant callers
Designed for reproducible genomics workflows
Containerized with Docker / Singularity

Typical tools used:

Alignment: BWA
Variant callers:
- GATK HaplotypeCaller
- Strelka2
- DeepVariant (optional)

nf-germline-short-read-variant-calling

nf-germline-short-read-variant-calling is a Nextflow pipeline designed for standardized germline variant calling with Illumina short reads.

This pipeline aims to provide:

A clear and reproducible workflow
Standardized best-practice variant calling
Modular processes for benchmarking and extension

Typical workflow:

Read quality control
Read alignment
BAM processing
Variant calling
Variant filtering
Benchmarking against truth sets

Benchmarking Method

Variant calls generated by each pipeline are compared against the GIAB truth set, using hap.py for SNPs and small indels and Truvari for structural variants.

SNP/INDEL Benchmarking

Metrics evaluated include:

Precision
Recall
F1 score
SNP performance
Indel performance

Benchmarking is performed within the high-confidence regions defined by GIAB.
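Restricting a comparison to the high-confidence regions comes down to an interval lookup with a coordinate conversion: BED intervals are 0-based and half-open, while VCF positions are 1-based. The sketch below illustrates only that convention; the benchmarking tools apply the restriction themselves. The function names are hypothetical, the intervals are assumed to be non-overlapping, and a variant is represented by its start position only.

```python
from bisect import bisect_right


def load_bed(lines):
    """Parse BED lines into {chrom: sorted list of (start, end)}.

    Intervals stay in BED convention: 0-based start, exclusive end.
    """
    regions = {}
    for line in lines:
        if not line.strip() or line.startswith(("#", "track", "browser")):
            continue
        chrom, start, end = line.split()[:3]
        regions.setdefault(chrom, []).append((int(start), int(end)))
    for intervals in regions.values():
        intervals.sort()
    return regions


def in_confident_region(regions, chrom, pos):
    """Return True if a 1-based VCF position falls inside a BED interval.

    Assumes the intervals for each chromosome do not overlap.
    """
    intervals = regions.get(chrom, [])
    zero_based = pos - 1  # convert VCF (1-based) to BED (0-based)
    # Index of the last interval whose start is <= zero_based.
    idx = bisect_right(intervals, (zero_based, float("inf"))) - 1
    return idx >= 0 and intervals[idx][0] <= zero_based < intervals[idx][1]


# Hypothetical two-interval BED used only for illustration.
regions = load_bed(["chr1\t100\t200", "chr1\t300\t400"])
print(in_confident_region(regions, "chr1", 101))  # first base of chr1:100-200
print(in_confident_region(regions, "chr1", 201))  # one base past the interval
```

The BED interval `chr1 100 200` covers VCF positions 101 through 200, which is why position 100 is outside and position 200 is inside.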

To benchmark small variant calls from nf-core/sarek and from this workflow against the HG002 truth set:

cd benchmark/small/benchmark
pixi run --environment snpindelbench bash benchmark_and_summary.sh

Structural Variant (SV) Benchmarking

Structural variants are benchmarked separately using Truvari, a specialized tool for SV comparison.

SV benchmarking workflow:

VCF Normalization: Multi-allelic records are split and indels are left-aligned using bcftools norm
Coordinate Conversion: Query VCFs are converted from hg38 (pipeline output) to hg19 (truth set) using CrossMap with a chain file
Chromosome Naming: The chr prefix is removed from the query VCF to match the truth set naming convention
VCF Sorting & Indexing: Normalized VCFs are sorted and indexed for Truvari benchmarking
Truvari Benchmark: Variants are compared using Truvari with default parameters
Metrics Extraction: Summary statistics are extracted from the Truvari JSON output
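The chromosome naming step can be illustrated with a small text filter. This is a minimal sketch and not the method used by the benchmark script; `strip_chr_prefix` is a hypothetical helper. It rewrites only the `##contig` header lines and the CHROM column, so it does not touch contig names embedded in breakend (BND) ALT strings and does not map `chrM` to `MT`.

```python
import re

# Matches the start of a contig header line whose ID begins with "chr".
_CONTIG_RE = re.compile(r"^(##contig=<ID=)chr")


def strip_chr_prefix(vcf_lines):
    """Yield VCF lines with a leading 'chr' removed from contig names."""
    for line in vcf_lines:
        if line.startswith("##contig="):
            # Header: ##contig=<ID=chr1,...> becomes ##contig=<ID=1,...>
            yield _CONTIG_RE.sub(r"\1", line)
        elif line.startswith("#"):
            # Other header lines pass through unchanged.
            yield line
        elif line.startswith("chr"):
            # Data line: CHROM is the first column, so drop three characters.
            yield line[3:]
        else:
            yield line


# Hypothetical VCF fragment used only for illustration.
example = [
    "##fileformat=VCFv4.2",
    "##contig=<ID=chr1,length=248956422>",
    "#CHROM\tPOS\tID\tREF\tALT",
    "chr1\t100\t.\tA\tT",
]
for out_line in strip_chr_prefix(example):
    print(out_line)
```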

Running SV Benchmarking (Manta)

To benchmark Manta structural variant calls from Sarek against the HG002 SV truth set:

cd benchmark
pixi run --environment svbench bash benchmark_sv_sarek.sh

This script will:

Normalize the Manta VCF output
Convert from hg38 to hg19 coordinates
Run Truvari comparison
Generate summary metrics file (HG002_manta.summary.txt)
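For orientation, the steps above could map to command lines roughly as follows. This is a sketch under stated assumptions and is not taken from benchmark_sv_sarek.sh: every file name, the `chr_map` rename table, and the intermediate outputs are hypothetical, and the function only builds the commands without running them. The `CrossMap` executable name assumes a recent CrossMap release.

```python
import shlex


def build_sv_benchmark_commands(query_vcf, truth_vcf, chain, hg38_fasta,
                                hg19_fasta, chr_map, out_dir):
    """Return one argv list per step; nothing is executed here.

    All intermediate file names below are hypothetical.
    """
    norm = f"{out_dir}/query.norm.vcf.gz"
    lifted = f"{out_dir}/query.hg19.vcf"
    renamed = f"{out_dir}/query.hg19.nochr.vcf.gz"
    sorted_vcf = f"{out_dir}/query.hg19.sorted.vcf.gz"
    return [
        # Split multi-allelic records and left-align indels.
        ["bcftools", "norm", "-m", "-any", "-f", hg38_fasta,
         "-Oz", "-o", norm, query_vcf],
        # Lift over hg38 -> hg19; the FASTA is the target assembly.
        ["CrossMap", "vcf", chain, norm, hg19_fasta, lifted],
        # Rename contigs using a two-column "old new" table (chr1 -> 1, ...).
        ["bcftools", "annotate", "--rename-chrs", chr_map,
         "-Oz", "-o", renamed, lifted],
        # Sort and index so Truvari can read the file.
        ["bcftools", "sort", "-Oz", "-o", sorted_vcf, renamed],
        ["bcftools", "index", "-t", sorted_vcf],
        # Compare against the truth set with default parameters.
        ["truvari", "bench", "-b", truth_vcf, "-c", sorted_vcf,
         "-o", f"{out_dir}/truvari"],
    ]


# Print the commands for inspection (hypothetical inputs).
for cmd in build_sv_benchmark_commands(
        "manta.vcf.gz", "truth.vcf.gz", "hg38ToHg19.over.chain.gz",
        "hg38.fa", "hg19.fa", "chr_map.txt", "out"):
    print(shlex.join(cmd))
```

Building the commands as lists keeps the sketch inspectable; each list could be passed to `subprocess.run(cmd, check=True)` once the paths are real.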

Expected Output

The benchmark generates:

Normalized and converted VCF files
Truvari output directory with TP/FP/FN classifications
Summary metrics in text format with:
- True Positives (TP)
- False Positives (FP)
- False Negatives (FN)
- Precision, Recall, F1 score
- Genotype concordance

Example output:

True Positives: 1082
False Positives: 1858
False Negatives: 12650
Sensitivity (Recall): 0.0788 (7.88%)
Precision: 0.3680 (36.8%)
F1 Score: 0.1298
Genotype Concordance: 0.9039 (90.39%)
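The precision, recall, and F1 values in this example follow directly from the TP/FP/FN counts. The sketch below recomputes them using the standard definitions; `summarize` is a hypothetical helper, and it assumes a single TP count is used for both precision and recall.

```python
def summarize(tp, fp, fn):
    """Return (precision, recall, f1) from TP/FP/FN counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return precision, recall, f1


# Counts taken from the example output above.
precision, recall, f1 = summarize(tp=1082, fp=1858, fn=12650)
print(f"Sensitivity (Recall): {recall:.4f}")  # 0.0788
print(f"Precision: {precision:.4f}")          # 0.3680
print(f"F1 Score: {f1:.4f}")                  # 0.1298
```

Genotype concordance is not derivable from these three counts; it depends on how many matched calls also agree in genotype.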

Tools Used

hap.py

hap.py is a widely used tool for benchmarking germline variant calls.

It performs:

Variant comparison between predicted VCF and truth VCF
Stratified benchmarking for SNPs and indels
Standard benchmarking metrics (precision, recall, F1)

Repository:

https://github.com/qbic-projects/QSARK/tree/main

Truvari

Truvari is a specialized benchmarking tool for structural variants (SVs).

It performs:

SV comparison and matching between query and truth VCFs
Genotype concordance calculation
Classification of true positives, false positives, and false negatives
Generation of stratified comparison reports

Key features:

Handles SV size and type variations
Provides detailed TP/FP/FN VCF outputs
Generates JSON summary statistics
Supports multiple matching algorithms
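Extracting headline metrics from the JSON summary can be sketched as below. The key names (`TP-base`, `TP-comp`, `gt_concordance`, and so on) are assumptions based on recent Truvari releases and may differ in other versions, so check them against your own summary file. `extract_truvari_metrics` is a hypothetical helper, and the sample JSON reuses the numbers from the example output in this document.

```python
import json

# Keys assumed to be present in a Truvari bench summary; verify for your version.
SUMMARY_KEYS = ["TP-base", "TP-comp", "FP", "FN",
                "precision", "recall", "f1", "gt_concordance"]


def extract_truvari_metrics(summary_text):
    """Parse summary JSON text and return the selected metrics as a dict.

    Missing keys map to None rather than raising, since key names
    can vary between Truvari versions.
    """
    data = json.loads(summary_text)
    return {key: data.get(key) for key in SUMMARY_KEYS}


# Illustrative JSON built from the example output in this README.
sample = json.dumps({
    "TP-base": 1082, "TP-comp": 1082, "FP": 1858, "FN": 12650,
    "precision": 0.3680, "recall": 0.0788, "f1": 0.1298,
    "gt_concordance": 0.9039,
})
for name, value in extract_truvari_metrics(sample).items():
    print(f"{name}: {value}")
```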

CrossMap

CrossMap is a utility for converting genome coordinates and annotation files between different genome assemblies.

Used for:

Converting VCF coordinates from hg38 (pipeline output) to hg19 (truth set)
Handling chromosome naming differences (chr prefix)
Liftover operations between reference genomes

bcftools

bcftools provides utilities for variant calling and for manipulating VCF/BCF files.

Used for:

VCF normalization (splitting multi-allelic records, left-aligning indels)
VCF sorting and indexing
Variant filtering and annotation

References

hap.py tool for benchmarking germline short-read variants
https://github.com/qbic-projects/QSARK/tree/main
Genome in a Bottle (GIAB) program for gold standard datasets
https://www.nist.gov/programs-projects/genome-bottle
Truvari: SV benchmarking tool
https://github.com/ACEnglish/truvari
CrossMap: Genome coordinate conversion tool
http://crossmap.sourceforge.net/
bcftools: VCF manipulation utilities
http://samtools.github.io/bcftools/
HG002 Structural Variant Truth Set (v0.6)
NIST Genome in a Bottle SVs for Tier 1 regions
SV Benchmarking Guide
See benchmark_sv_sarek.sh for implementation details
