0x0OZ/CVE-2026-7482-PoC

CVE-2026-7482: Ollama Heap Out-of-Bounds Read (1-Day PoC)

This repository contains a 1-day Proof of Concept (PoC) exploitation chain for CVE-2026-7482, an unauthenticated Out-of-Bounds (OOB) Read vulnerability in Ollama's GGUF model loader (versions prior to 0.17.1).

Note: This is a 1-day research reproduction. I did not discover the original CVE. This PoC was engineered based on the public advisory details to demonstrate the mechanics of the vulnerability for educational and defensive research purposes.

Vulnerability Overview

By supplying a maliciously crafted, truncated GGUF file to the /api/create endpoint, an attacker can force the quantization parser in fs/ggml/gguf.go and server/quantization.go to read past the allocated heap buffer. The leaked memory is then exfiltrated by pushing the resulting model artifact to an attacker-controlled Docker registry via the /api/push endpoint.

Technical Details: The Exploit Primitive

Reproducing the crash was trivial. Achieving stable exfiltration, without crashing the server or tripping API validation, required shaping the payload in four specific ways:

  1. Frontend Validation Bypass: The payload must be tagged as F16 (general.file_type = 1) to satisfy the Ollama API's strict pre-flight checks.
  2. Quantizer Coercion: We request a Q4_K_M down-quantization. Because the payload is seen as F16, the C++ ggml backend is forced to process the payload rather than performing a safe, 1:1 memory copy.
  3. Perfect Block Alignment: The target tensor (token_embd.weight) must be shaped as a 2D matrix where the innermost dimension is exactly 256 (e.g., [num_rows, 256]). This strictly aligns with Q4_K_M block requirements, preventing the backend from skipping the layer.
  4. Physical Truncation: The file on disk is truncated to 32 bytes while its header still declares the full tensor. When the quantization loop runs, it reaches the end of the real data and over-reads directly into the adjacent heap space.
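From a defender's point of view, the four conditions above come down to one missing bounds check: the tensor size declared in the header is never compared against the bytes actually present in the file. The sketch below shows that arithmetic only. The function names and the `data_offset` parameter are hypothetical, and it is not taken from Ollama's code or from the actual fix.

```python
from math import prod

F16_BYTES_PER_ELEMENT = 2  # float16 is two bytes per element


def declared_f16_bytes(shape):
    """Bytes an F16 tensor of this shape should occupy on disk."""
    return prod(shape) * F16_BYTES_PER_ELEMENT


def tensor_fits_in_file(shape, data_offset, file_size):
    """True only if the declared tensor data lies entirely inside the file.

    data_offset is a hypothetical name for where tensor data begins.
    """
    return data_offset + declared_f16_bytes(shape) <= file_size


# A [4, 256] F16 tensor declares 4 * 256 * 2 = 2048 bytes of data.
shape = [4, 256]
print(declared_f16_bytes(shape))
# A 32-byte file cannot hold it, so a loader should reject the file up front.
print(tensor_fits_in_file(shape, data_offset=0, file_size=32))
```

A loader that runs a check of this kind before handing the tensor to the quantizer would refuse the truncated file instead of reading past its buffer.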

Prerequisites

pip install requests numpy gguf

You also need a publicly reachable HTTP endpoint (for example, an ngrok tunnel) to receive the exfiltrated Docker layer pushes.

Usage

1. Start the Rogue Registry: Start the listener to catch the leaked memory blobs.

sudo python3 registry.py

2. Forge the Malicious Payload: Generate the truncated GGUF file. You can adjust TARGET_LEAK_SIZE_MB inside the script to control how much heap memory is read per request. The recommended range is 0.5 MB to 2.0 MB; larger values risk a segfault on unmapped pages.

python3 forge.py

3. Fire the Exploit: Edit exploit.py to include your target IP and your rogue registry URL, then execute:

python3 exploit.py

4. Analyze the Artifact: The registry writes the leaked heap dumps to the exfils/ directory.

Note on Data Integrity (The Quantization Trap): Although the exploit captures and exfiltrates up to several megabytes of server heap memory, that data passes through Ollama's Q4_K_M down-quantization during the OOB read. The backend casts the raw memory bytes to float16 and applies a lossy 4-bit block compression scheme, so the leaked bytes are not preserved verbatim. Standard ASCII extraction tools will yield binary garbage, which makes plaintext credential recovery impractical via this specific coercion path.

Disclaimer

This project is for educational and authorized vulnerability research purposes only. Do not use this tool against systems you do not own or have explicit permission to test.
