FelixKrueger/TrimGalore

Trim Galore

Consistent quality and adapter trimming for next-generation sequencing data, with special handling for RRBS libraries.

Note

Trim Galore v2.0 is a faithful Rust rewrite — a single binary with zero external dependencies, designed as a drop-in replacement for v0.6.x scripts and pipelines. Same CLI, same output filenames, same report format. Adds poly-G auto-detection and trimming for 2-colour instruments, a generic poly-A trimmer, per-pair adapter auto-detection, and cleaner multi-adapter invocation (repeatable -a/-a2 instead of Perl's embedded-string syntax) — among other extensions. For details on what changed, benchmarks, and migration guidance, see the v2.0 migration notes.

Features

  • Adapter auto-detection — automatically identifies Illumina, Nextera, Small RNA, and BGI/DNBSEQ adapters from the first 1M reads. Stranded Illumina remains explicit (--stranded_illumina) because its sequence is ambiguous with Nextera.
  • Multi-adapter support — specify multiple adapters by repeating -a/-a2 or via -a "file:adapters.fa", with optional multi-round trimming (-n)
  • Quality trimming — Phred-based trimming from the 3' end (BWA algorithm)
  • Paired-end — single-pass processing of both reads with automatic pair validation
  • RRBS — MspI end-repair artifact removal, directional and non-directional libraries
  • Poly-G trimming — sequence-based removal of no-signal G-runs at the 3' end of Read 1 (and poly-C at the 5' end of Read 2) from 2-colour instruments (NovaSeq, NextSeq, NovaSeq X). Auto-detected from the data; opt-out with --no_poly_g
  • NextSeq / 2-colour quality trim — --nextseq N / --2colour N applies 2-colour-aware quality trimming (opt-in; replaces -q)
  • Poly-A trimming — built-in removal of poly-A tails without external tools; recommended for mRNA-seq / poly-A-selected RNA-seq libraries
  • Parallel processing — --cores N runs trimming and gzip compression in worker threads under an N+4 thread model (N workers + 2 decompressors + 1 batcher + 1 writer); near-linear speedup up to --cores 8 for paired-end runs, after which gzip-output I/O typically becomes the bottleneck
  • Clumpify compression (v2.2+) — opt-in --clumpify reorders reads by canonical 16-mer minimizer so similar reads share gzip dictionary windows; combined with --compression 1–9 it shrinks output 15–55% on fragment-clustered data (ATAC, Ribo, RRBS, RNA-seq, MiSeq amplicons). Coverage-diverse data (WGBS PE, scRNA-seq R2) regress — see Clumpy compression for the per-data-type guidance
  • FastQC integration — optional post-trimming quality reports built in via the bundled fastqc-rust library; produces FastQC 0.12.1-compatible HTML + ZIP outputs without requiring Java or an external fastqc on $PATH
  • MultiQC compatible — trimming reports parse cleanly in MultiQC dashboards (text + JSON)
  • Demultiplexing — 3' inline barcode demultiplexing
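The repeatable -a flags and the -a "file:..." syntax from the feature list can be sketched as follows. The adapter sequences and filenames here are illustrative placeholders, not values the tool would auto-detect:

```shell
# Write a small adapter FASTA (sequences are illustrative placeholders)
cat > adapters.fa <<'EOF'
>illumina
AGATCGGAAGAGC
>small_rna
TGGAATTCTCGG
EOF

# Equivalent invocations, shown as comments:
#   trim_galore -a AGATCGGAAGAGC -a TGGAATTCTCGG input.fastq.gz   # one -a per adapter
#   trim_galore -a "file:adapters.fa" -n 2 input.fastq.gz         # all adapters from FASTA, 2 rounds
echo "adapters.fa contains $(grep -c '^>' adapters.fa) adapters"
```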

Installation

From crates.io

Requires the Rust toolchain (1.88+):

cargo install trim-galore

From bioconda

conda install -c bioconda trim-galore

Build from source

git clone https://github.com/FelixKrueger/TrimGalore.git
cd TrimGalore
cargo build --release
# Binary is at target/release/trim_galore

Latest development version

To install the latest unreleased changes directly from the development branch:

cargo install --git https://github.com/FelixKrueger/TrimGalore --branch dev trim-galore --force

The --force flag overwrites any existing trim_galore binary (e.g. a v2.0.0 install from crates.io).

Docker

Multi-arch images (amd64 + arm64) are available from GitHub Container Registry:

docker run --rm -v "$PWD":/data -w /data ghcr.io/felixkrueger/trimgalore:latest trim_galore input.fastq.gz

FastQC is built into the binary itself via the bundled fastqc-rust library — no external fastqc or Java runtime needed in the image. Tags published: :latest (latest stable, currently v2.2.0), :v2.2.0 (pinned to a specific release), :beta (latest prerelease — only set during an active beta cycle), and :dev (every push to the dev branch). See the docs site install page for the full table.

Prebuilt binaries

Prebuilt binaries for Linux (x86_64, aarch64) and macOS (Apple Silicon) are available on the Releases page. Intel Mac users: install via cargo install trim-galore (local build) or use the Docker amd64 image.

Usage

# Single-end
trim_galore input.fastq.gz

# Paired-end
trim_galore --paired file_R1.fastq.gz file_R2.fastq.gz

# Parallel processing (recommended for large files)
# Near-linear speedup up to ~8 cores on v2.2.0; beyond that,
# gzip-output I/O on the storage layer typically becomes the bottleneck.
trim_galore --cores 8 --paired file_R1.fastq.gz file_R2.fastq.gz

# RRBS mode
trim_galore --rrbs --paired file_R1.fastq.gz file_R2.fastq.gz

# Run FastQC on trimmed output
trim_galore --fastqc input.fastq.gz

For the complete list of options:

trim_galore --help

Output files

Mode                               | Trimmed output                          | Reports
Single-end                         | *_trimmed.fq.gz                         | *_trimming_report.txt + *_trimming_report.json
Paired-end                         | *_val_1.fq.gz / *_val_2.fq.gz           | per-read text + JSON reports
Unpaired (with --retain_unpaired)  | *_unpaired_1.fq.gz / *_unpaired_2.fq.gz |

Output compression mirrors the input: gzipped input (*.fastq.gz) produces gzipped output (*.fq.gz); plain input (*.fastq) produces plain output (*.fq). Pass --dont_gzip to force plain output regardless. By default, gzip output is written at compression level 1 (fastest); pass --compression <N> (1–9) to override. Decompressed content is byte-identical regardless of level, but level-1 .fq.gz files are roughly 75% larger than level-9, in exchange for substantially faster trimming.
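The level-vs-size tradeoff above can be sanity-checked with plain gzip, which uses the same 1–9 level semantics; a toy record stands in for real output (the ~75% size gap only shows up on realistic inputs):

```shell
# Toy FASTQ record standing in for trimmed output
printf '@r1\nACGTACGTACGT\n+\nIIIIIIIIIIII\n' > reads.fq
gzip -1 -c reads.fq > level1.fq.gz
gzip -9 -c reads.fq > level9.fq.gz
# Decompressed content is byte-identical; only compressed size and speed differ
diff <(gzip -dc level1.fq.gz) <(gzip -dc level9.fq.gz) && echo "identical"
```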

Pass --clumpify to reorder reads inside each gzip member by canonical 16-mer minimizer so reads sharing similar sequence land adjacent on disk, letting gzip's 32 KB dictionary find longer back-references. The right configuration depends on what the trimmed FASTQ is used for:

# Pipeline intermediates (deleted after the run): reorder is essentially free
# and shrinks the file 15–35% on most data — net I/O win for the next step.
trim_galore --clumpify <input>

# Long-term storage / disk-constrained workdirs: add gzip level 6 for a 15–50%
# saving at roughly 4–6× the wall-clock time of the default level-1 output.
trim_galore --clumpify --compression 6 <input>

# Archival use, max compression
trim_galore --clumpify --compression 9 <input>

# Smaller output without the reorder cost (e.g. for 10x scRNA-seq, see docs)
trim_galore --compression 6 <input>

No information loss — only the on-disk order of records changes. Output records are byte-identical to the unsorted output and trimming reports are unaffected. --clumpify requires --cores >= 2.
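The no-information-loss claim can be checked by comparing record sets independently of on-disk order; a sketch using synthetic data and a shuffle to stand in for real trim_galore output (filenames and reads are hypothetical):

```shell
# Synthetic reads standing in for trimmed output
printf '@r1\nACGT\n+\nIIII\n@r2\nTTTT\n+\nIIII\n@r3\nGGCC\n+\nIIII\n' > plain.fq
# Simulate a clumpified copy: shuffle whole 4-line records, content unchanged
paste - - - - < plain.fq | shuf | tr '\t' '\n' > clumped.fq
gzip -kf plain.fq clumped.fq
# Flatten each record to one line and sort: identical sets => reorder lost nothing
norm() { gzip -dc "$1" | paste - - - - | sort; }
diff <(norm plain.fq.gz) <(norm clumped.fq.gz) && echo "records identical"
```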

Memory budget is controlled by the global --memory flag (default 1G); bigger budgets give bigger per-gzip-member sort runs and better compression up to roughly the uncompressed input size, with sharply diminishing returns above ~2 GB. With enough memory, --clumpify --compression 9 gets you within 1–2 percentage points of bbmap clumpify and stevekm/squish on the same data.

Intended for short reads (Illumina, AVITI). Long-read inputs (Oxford Nanopore, PacBio) and 10x scRNA-seq typically see no benefit or a small negative result; see Clumpy compression for per-data-type recommendations.

The JSON report contains the same statistics as the text report in a structured format (schema v1), designed for native parsing by MultiQC.

Documentation

Full documentation is published at https://www.trimgalore.com/

Credits

Trim Galore was developed at The Babraham Institute by @FelixKrueger, now part of Altos Labs.

License

GPL-3.0
