Consistent quality and adapter trimming for next-generation sequencing data, with special handling for RRBS libraries.
Note
Trim Galore v2.0 is a faithful Rust rewrite — a single binary with zero external dependencies, designed as a drop-in replacement for v0.6.x scripts and pipelines. Same CLI, same output filenames, same report format. Adds poly-G auto-detection and trimming for 2-colour instruments, a generic poly-A trimmer, per-pair adapter auto-detection, and cleaner multi-adapter invocation (repeatable `-a`/`-a2` instead of Perl's embedded-string syntax) — among other extensions. For details on what changed, benchmarks, and migration notes, see the v2.0 migration notes.
- Adapter auto-detection — automatically identifies Illumina, Nextera, Small RNA, and BGI/DNBSEQ adapters from the first 1M reads. Stranded Illumina remains explicit (`--stranded_illumina`) because its sequence is ambiguous with Nextera
- Multi-adapter support — specify multiple adapters by repeating `-a`/`-a2` or via `-a "file:adapters.fa"`, with optional multi-round trimming (`-n`)
- Quality trimming — Phred-based trimming from the 3' end (BWA algorithm)
- Paired-end — single-pass processing of both reads with automatic pair validation
- RRBS — MspI end-repair artifact removal, directional and non-directional libraries
- Poly-G trimming — sequence-based removal of no-signal G-runs at the 3' end of Read 1 (and poly-C at the 5' end of Read 2) from 2-colour instruments (NovaSeq, NextSeq, NovaSeq X). Auto-detected from the data; opt out with `--no_poly_g`
- NextSeq / 2-colour quality trim — `--nextseq N`/`--2colour N` applies 2-colour-aware quality trimming (opt-in; replaces `-q`)
- Poly-A trimming — built-in removal of poly-A tails without external tools; recommended for mRNA-seq / poly-A-selected RNA-seq libraries
- Parallel processing — `--cores N` runs trimming and gzip compression in worker threads under an N+4 thread model (N workers + 2 decompressors + 1 batcher + 1 writer); near-linear speedup up to `--cores 8` for paired-end runs, after which gzip-output I/O typically becomes the bottleneck
- Clumpify compression (v2.2+) — opt-in `--clumpify` reorders reads by canonical 16-mer minimizer so similar reads share gzip dictionary windows; combined with `--compression 1–9` it shrinks output 15–55% on fragment-clustered data (ATAC, Ribo, RRBS, RNA-seq, MiSeq amplicons). Coverage-diverse data (WGBS PE, scRNA-seq R2) regress — see Clumpify compression for per-data-type guidance
- FastQC integration — optional post-trimming quality reports built in via the bundled fastqc-rust library; produces FastQC 0.12.1-compatible HTML + ZIP outputs without requiring Java or an external `fastqc` on `$PATH`
- MultiQC compatible — trimming reports parse cleanly in MultiQC dashboards (text + JSON)
- Demultiplexing — 3' inline barcode demultiplexing
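Of the steps above, the 3' quality trimming is worth unpacking: the BWA algorithm scans in from the 3' end, accumulating (cutoff − quality) per base, and cuts at the position where that running sum peaks. A minimal Python sketch of the idea (not Trim Galore's actual Rust implementation; the function name, default cutoff, and example read are illustrative):

```python
def bwa_quality_trim(quals, cutoff=20):
    """Return the number of leading bases to keep after BWA-style 3' trimming.

    quals: Phred quality scores, one per base. Scanning from the 3' end,
    accumulate (cutoff - q); the cut point is where that running sum is
    maximal, so isolated good bases inside a bad tail are still removed.
    """
    running, best, cut = 0, 0, len(quals)
    for i in range(len(quals) - 1, -1, -1):
        running += cutoff - quals[i]
        if running > best:
            best, cut = running, i
    return cut

# A read whose tail degrades below Q20 loses its last three bases:
quals = [38, 37, 36, 35, 30, 25, 12, 8, 2]
print(bwa_quality_trim(quals))  # 6 -> keep the first six bases
```

A uniformly high-quality read is left untouched, since the running sum never goes positive.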
Installing with Cargo requires the Rust toolchain (1.88+):

```
cargo install trim-galore
```

Alternatively, install from Bioconda:

```
conda install -c bioconda trim-galore
```

Or build from source:

```
git clone https://github.com/FelixKrueger/TrimGalore.git
cd TrimGalore
cargo build --release
# Binary is at target/release/trim_galore
```

To install the latest unreleased changes directly from the development branch:

```
cargo install --git https://github.com/FelixKrueger/TrimGalore --branch dev trim-galore --force
```

The `--force` flag overwrites any existing `trim_galore` binary (e.g. a v2.0.0 install from crates.io).
Multi-arch images (amd64 + arm64) are available from GitHub Container Registry:
```
docker run --rm -v "$PWD":/data -w /data ghcr.io/felixkrueger/trimgalore:latest trim_galore input.fastq.gz
```

FastQC is built into the binary itself via the bundled fastqc-rust library — no external `fastqc` or Java runtime is needed in the image. Tags published: `:latest` (latest stable, currently v2.2.0), `:v2.2.0` (pinned to a specific release), `:beta` (latest prerelease — only set during an active beta cycle), and `:dev` (every push to the dev branch). See the docs site install page for the full table.
Prebuilt binaries for Linux (x86_64, aarch64) and macOS (Apple Silicon) are available on the Releases page. Intel Mac users: install via cargo install trim-galore (local build) or use the Docker amd64 image.
```
# Single-end
trim_galore input.fastq.gz

# Paired-end
trim_galore --paired file_R1.fastq.gz file_R2.fastq.gz

# Parallel processing (recommended for large files)
# Near-linear speedup up to ~8 cores on v2.2.0; beyond that the
# gzip-output I/O on the storage layer typically becomes the bottleneck.
trim_galore --cores 8 --paired file_R1.fastq.gz file_R2.fastq.gz

# RRBS mode
trim_galore --rrbs --paired file_R1.fastq.gz file_R2.fastq.gz

# Run FastQC on trimmed output
trim_galore --fastqc input.fastq.gz
```

For the complete list of options:

```
trim_galore --help
```

| Mode | Trimmed output | Reports |
|---|---|---|
| Single-end | `*_trimmed.fq.gz` | `*_trimming_report.txt` + `*_trimming_report.json` |
| Paired-end | `*_val_1.fq.gz` / `*_val_2.fq.gz` | per-read text + JSON reports |
| Unpaired (with `--retain_unpaired`) | `*_unpaired_1.fq.gz` / `*_unpaired_2.fq.gz` |  |
Output compression mirrors the input: gzipped input (`*.fastq.gz`) produces gzipped output (`*.fq.gz`); plain input (`*.fastq`) produces plain output (`*.fq`). Pass `--dont_gzip` to force plain output regardless. By default, gzip output is written at compression level 1 (fastest); pass `--compression <N>` (1–9) to override — decompressed content is byte-identical regardless of level, but level-1 `.fq.gz` files are roughly 75% larger than level-9 in exchange for substantially faster trimming.
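The byte-identical guarantee is a property of gzip itself, not of Trim Galore, and is easy to check directly. A small Python illustration using the standard-library `zlib` (a toy payload standing in for real FASTQ data):

```python
import zlib

# Repetitive FASTQ-like payload, so the compression levels have work to do.
payload = b"@read1\nACGTACGTACGTACGT\n+\nIIIIIIIIIIIIIIII\n" * 2000

fast = zlib.compress(payload, 1)   # level 1: fastest to write
small = zlib.compress(payload, 9)  # level 9: smallest on disk

# Decompressed content is byte-identical regardless of the level chosen;
# only the file size and the time spent compressing differ.
assert zlib.decompress(fast) == zlib.decompress(small) == payload
print(len(fast), len(small))
```

The same logic is why downstream tools never need to know which `--compression` level produced a file.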
Pass `--clumpify` to reorder reads inside each gzip member by canonical 16-mer minimizer so reads sharing similar sequence land adjacent on disk, letting gzip's 32 KB dictionary find longer back-references. The right configuration depends on what the trimmed FASTQ is used for:
```
# Pipeline intermediates (deleted after the run): the reorder is essentially
# free and shrinks the file 15–35% on most data — a net I/O win for the next step.
trim_galore --clumpify <input>

# Long-term storage / disk-constrained workdirs: add gzip level 6 for a
# 15–50% saving at 4–6× the wall time of a plain run.
trim_galore --clumpify --compression 6 <input>

# Archival use, max compression
trim_galore --clumpify --compression 9 <input>

# Smaller output without the reorder cost (e.g. for 10x scRNA-seq, see docs)
trim_galore --compression 6 <input>
```

No information loss — only the on-disk order of records changes. The records themselves are byte-identical to the unsorted output, and trimming reports are unaffected. `--clumpify` requires `--cores >= 2`.
Memory budget is controlled by the global `--memory` flag (default 1G); bigger budgets give bigger per-gzip-member sort runs and better compression up to roughly the uncompressed input size, with sharply diminishing returns above ~2 GB. With enough memory, `--clumpify --compression 9` gets within 1–2 percentage points of bbmap clumpify and stevekm/squish on the same data.

Intended for short reads (Illumina, AVITI). Long-read inputs (Oxford Nanopore, PacBio) and 10x scRNA-seq typically see no benefit or a small negative result; see Clumpify compression for per-data-type recommendations.
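The sort key behind the reorder is conceptually simple: take every 16-mer in a read, canonicalise each against its reverse complement, and use the lexicographically smallest as the key, so reads that share sequence (on either strand) cluster together on disk. A Python sketch of that key (illustrative only; the real Rust implementation's encoding and tie-breaking are not specified here):

```python
COMP = str.maketrans("ACGT", "TGCA")

def canonical_minimizer(seq, k=16):
    """Smallest canonical k-mer of a read: min over all positions of
    min(kmer, reverse_complement(kmer)). Reads sharing a minimizer very
    likely share sequence, so sorting by it places similar reads within
    gzip's 32 KB dictionary window of each other."""
    best = None
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        canon = min(kmer, kmer.translate(COMP)[::-1])
        if best is None or canon < best:
            best = canon
    return best

reads = [
    "ACGTACGTACGTACGTAAAA",
    "TTTTACGTACGTACGTACGT",  # reverse complement of the read above
    "GGGGGGGGCCCCCCCCGGGG",
]
# The two strand-mates get the same key and end up adjacent after sorting.
reads.sort(key=canonical_minimizer)
```

Canonicalisation is what makes the reorder strand-agnostic: a fragment and its reverse complement map to the same key.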
The JSON report contains the same statistics as the text report in a structured format (schema v1), designed for native parsing by MultiQC.
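Because the JSON report is structured data, downstream scripts can consume it without scraping the text report. A minimal sketch in Python (the field names below are stand-ins, not the actual schema-v1 keys; inspect a real `*_trimming_report.json` for the authoritative structure):

```python
import json

# Stand-in payload with hypothetical field names.
example = '{"schema_version": 1, "reads_processed": 1000, "reads_with_adapters": 437}'
report = json.loads(example)

# Structured access replaces regex-scraping the text report.
fraction = report["reads_with_adapters"] / report["reads_processed"]
print(f"{fraction:.1%} of reads contained adapter sequence")  # 43.7% ...
```

In practice you would `json.load()` the report file written next to the trimmed FASTQ, or let MultiQC aggregate the reports for you.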
Full documentation is published at https://www.trimgalore.com/
- User Guide — full reference for all options and modes
- RRBS Guide — bisulfite sequencing and RRBS-specific guidance
- v2.0 migration notes — what changed in the Rust rewrite
- Benchmarks
Trim Galore was developed at The Babraham Institute by @FelixKrueger, now part of Altos Labs.