Docling: Unsafe Archive Extraction and XML Parsing in METS-GBS Backend

Impact

The METS-GBS backend's XML parsing and the input document format detection lacked security controls, enabling:

XML External Entity (XXE) attacks to read local files or cause denial of service
Decompression bombs (zip bombs) to exhaust memory and disk space
Unbounded archive extraction consuming system resources

An attacker could craft a malicious METS-GBS archive that, when processed, reads sensitive local files, exhausts system resources, or crashes the application.

Patches

Fixed in version 2.91.0. The fix implements:

Secure XML parsing with resolve_entities=False, load_dtd=False, and no_network=True
Configurable limits: 300 MB total extraction size, 10 MB per file, and a maximum of 1000 members
Cumulative size tracking across all extractions
Early termination when limits are exceeded
Secure format detection of METS-GBS tar archives in the _detect_mets_gbs() method: a maximum file size (10 MB per file), a maximum member count (1000 members), and exception handling so that detection fails gracefully when a limit is exceeded
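
The parser flags listed above are keyword arguments of lxml's XMLParser. The following is a minimal sketch of a parser configured that way, not the code shipped in the fix. It assumes lxml is installed; the helper names make_hardened_parser and parse_untrusted_xml are hypothetical.

```python
from lxml import etree


def make_hardened_parser() -> etree.XMLParser:
    """Build an lxml parser using the three flags named in the advisory."""
    return etree.XMLParser(
        resolve_entities=False,  # leave entity references unexpanded
        load_dtd=False,          # do not load a DTD while parsing
        no_network=True,         # forbid network access for related files
    )


def parse_untrusted_xml(data: bytes) -> etree._Element:
    """Parse XML bytes from an untrusted source with the hardened parser."""
    return etree.fromstring(data, parser=make_hardened_parser())
```

With this configuration, a document that declares an external entity pointing at a local file should still parse, but the file's contents are not substituted into the resulting tree.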

Workarounds

Avoid processing METS-GBS archives from untrusted sources. If necessary, pre-validate archives in an isolated environment with resource limits.
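
One way to pre-validate an archive is to walk its tar headers without extracting anything and reject it as soon as a limit is crossed. The sketch below uses only the Python standard library and borrows the limit values from the Patches section as defaults. The names prevalidate_tar and ArchiveLimitExceeded are hypothetical and are not part of Docling. It covers only the archive check, so process isolation and operating-system resource limits still need to be set up separately.

```python
import io
import tarfile

# Defaults mirror the limits stated in the advisory.
MAX_MEMBERS = 1000
MAX_FILE_BYTES = 10 * 1024 * 1024
MAX_TOTAL_BYTES = 300 * 1024 * 1024


class ArchiveLimitExceeded(Exception):
    """Raised as soon as an archive crosses one of the configured limits."""


def prevalidate_tar(
    fileobj,
    *,
    max_members: int = MAX_MEMBERS,
    max_file_bytes: int = MAX_FILE_BYTES,
    max_total_bytes: int = MAX_TOTAL_BYTES,
) -> int:
    """Walk the tar headers without extracting; return the total declared size."""
    total = 0
    count = 0
    with tarfile.open(fileobj=fileobj, mode="r:*") as archive:
        for member in archive:
            count += 1
            if count > max_members:
                raise ArchiveLimitExceeded(f"more than {max_members} members")
            if member.size > max_file_bytes:
                raise ArchiveLimitExceeded(f"{member.name}: file too large")
            total += member.size  # cumulative size across all members
            if total > max_total_bytes:
                raise ArchiveLimitExceeded("total size limit exceeded")
    return total
```

A caller would open the untrusted file in binary mode, pass it to prevalidate_tar, and hand the archive to the converter only if no exception is raised.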

References

Fix release: v2.91.0

dolfim-ibm published to docling-project/docling Jun 2, 2026

Published to the GitHub Advisory Database Jun 3, 2026

Reviewed Jun 3, 2026

Last updated Jun 3, 2026

Weaknesses

Improper Handling of Highly Compressed Data (Data Amplification)

Improper Restriction of XML External Entity Reference

Improper Restriction of Recursive Entity References in DTDs ('XML Entity Expansion')
