feat(encoded-exfil): test, harden, and document encoded exfiltration detection plugin #3906
crivetimihai merged 12 commits into main
Thanks @gandhipratik203. Will review the detection improvements and Rust parity implementation.
Hi @lucarlig, thanks for the detailed review. I really appreciate you testing with a concrete payload and catching the double-counting issue.
Addressed in this push:
- Double-counting: Great catch. When a string is valid JSON, we now scan only the parsed structure and skip the raw text scan entirely, so a single secret produces exactly one finding. Added a regression test for this.
- JSON bypasses oversized-input guard: Good point. Added a size check (`len <= max_scan_string_length`) and a quick heuristic (string must start with `{` or `[`) before attempting JSON parse. Applied to both Python and Rust.
- Rust path test coverage: Fair feedback. Added 8 new parametrized tests covering per-encoding thresholds, JSON parsing, heuristic skip, and malformed JSON on both Python and Rust paths.
- README inconsistencies: Fixed. Corrected the wheel name to `mcpgateway-encoded-exfil-detection`, removed JSON-within-strings and per-encoding thresholds from Known Limitations (since they're now implemented), and added a Performance section with benchmark results.
Noted for follow-up:
- Double decode in nested scan: Agree this is an optimization opportunity. In practice the cost is microseconds per candidate (Rust path: ~0.007ms per finding) so it doesn't show up in benchmarks today. Happy to optimize if profiling surfaces it as a bottleneck.
- JSON FFI allocation churn: The serde_json → Python object conversion is inherent to the recursive approach. It's now bounded by both `max_recursion_depth` and the new `max_scan_string_length` guard on JSON parsing. Happy to revisit if large embedded JSON becomes a concern in production.
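The size check and prefix heuristic described in this comment can be sketched roughly as follows. This is an illustration only, not the plugin's code: the helper name `try_parse_json_container` and the default limit of 100,000 characters are assumptions made for the example.

```python
import json
from typing import Any, Optional


def try_parse_json_container(text: str, max_scan_string_length: int = 100_000) -> Optional[Any]:
    """Return the parsed dict/list if `text` looks like JSON and parses, else None.

    Hypothetical helper: the name and the default limit are illustrative assumptions.
    """
    # Guard 1: never hand oversized strings to the JSON parser.
    if len(text) > max_scan_string_length:
        return None
    # Guard 2: cheap prefix heuristic, so clean non-JSON strings skip the parse.
    if not text.startswith(("{", "[")):
        return None
    try:
        parsed = json.loads(text)
    except ValueError:  # json.JSONDecodeError is a subclass of ValueError
        return None
    # Only containers are worth recursing into.
    return parsed if isinstance(parsed, (dict, list)) else None


print(try_parse_json_container('{"a": 1}'))
print(try_parse_json_container("plain text"))
```

Ordering the checks from cheapest to most expensive means most clean strings are rejected by a length comparison or a one-character prefix test before any parsing happens.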
lucarlig left a comment
- JSON-string parsing introduces a real Rust-path bypass for secrets embedded in JSON object keys. Once the string is parsed, the Rust scanner only traverses values and never scans keys, so a payload like `{"<base64-secret>":"x"}` is missed.
- The Rust JSON-string path can change a clean string payload into a structured object even with no findings. `scan_container()` parses the string, converts the JSON tree back into Python objects with `json_value_to_py()`, and returns that structure instead of the original string. In hook flows, that can surface as a payload mutation purely because JSON parsing happened.
Hi Dima, thanks for the thorough review and the positive feedback on the test coverage and documentation.
Already addressed (before your review):
Addressed now:
Deferring:
Hi @lucarlig, thanks for the follow-up, good catches both. Fixed:
Commit Review: afab86f
Commit: afab86f24bd580045f3596860cbbb03c28bc20bb
Author: Pratik Gandhi
Date: March 30, 2026
Title: fix(encoded-exfil): scan dict keys for encoded secrets, prevent JSON type mutation
Summary
This commit addresses two important issues in the encoded exfiltration detection plugin:
- Dict Key Scanning: Adds detection of encoded secrets used as JSON object keys
- JSON Type Mutation Prevention: Fixes a bug where JSON-within-strings scanning could mutate the return type from string to dict/list
Files Changed: 4
Lines Changed: +91, -30
Verdict: ✅ Approve
Well-structured fix with comprehensive tests. Addresses real security gaps and type safety issues.
Changes Reviewed
1. Python Implementation (encoded_exfil_detector.py)
Change A: JSON Type Mutation Prevention (Lines 420-447)
Before:
```python
if isinstance(container, str):
    # Try parsing string as JSON first; scan parsed structure only
    if cfg.parse_json_strings and ...:
        parsed = json.loads(container)
        if isinstance(parsed, (dict, list)):
            return _scan_container(parsed, cfg, path=json_path, ...)  # ❌ Returns dict/list
    redacted, findings = _scan_text(container, cfg, path=path)
    return len(findings), redacted, findings
```
After:
```python
if isinstance(container, str):
    # Scan as raw text first; always returns the original type (string)
    redacted, findings = _scan_text(container, cfg, path=path)
    # Try parsing string as JSON for additional findings (metadata only, no type mutation)
    if cfg.parse_json_strings and ...:
        try:
            parsed = json.loads(container)
            if isinstance(parsed, (dict, list)):
                _, _, json_findings = _scan_container(parsed, cfg, path=json_path, ...)
                # Deduplicate: only add JSON findings whose encoded match isn't already found
                raw_matches = {f.get("match") for f in findings}
                for jf in json_findings:
                    if jf.get("match") not in raw_matches:
                        findings.append(jf)
        except (json.JSONDecodeError, ValueError):
            pass
    return len(findings), redacted, findings  # ✅ Always returns string
```
Assessment: ✅ Correct
- Type Safety: String input now always returns string output (critical for plugin contract)
- Additive Scanning: JSON parsing is now findings-only, not transformative
- Deduplication: Match preview comparison prevents double-counting
- Performance: JSON heuristic check (`startswith('{') or startswith('[')`) already exists from previous commit
Change B: Dict Key Scanning (Lines 451-457)
Added:
```python
for key, value in container.items():
    child_path = f"{path}.{key}" if path else str(key)
    # Scan keys that are long enough to contain encoded content
    if isinstance(key, str) and len(key) >= cfg.min_encoded_length:
        key_path = f"{child_path}(key)"
        _, key_findings = _scan_text(key, cfg, path=key_path)
        findings.extend(key_findings)
        total += len(key_findings)
    count, new_value, child_findings = _scan_container(value, cfg, path=child_path, ...)
```
Assessment: ✅ Correct
- Security: Detects encoded secrets used as JSON keys (e.g., `{"cGFzc3dvcmQ9c2VjcmV0": "value"}`)
- Performance: Only scans keys meeting `min_encoded_length` threshold
- Path Annotation: `(key)` suffix clearly identifies key-based findings
- Symmetry: Matches value scanning logic
Security Impact: 🔴 High - Closes evasion vector where attackers hide encoded secrets in JSON keys
2. Rust Implementation (lib.rs)
Change: Mirrors Python Logic
The Rust implementation correctly mirrors the Python changes:
```rust
// Scan as raw text first
let (redacted_text, findings) = scan_text(&text, path, cfg, 0);
let findings_list = PyList::empty(py);
for finding in &findings {
    findings_list.append(finding_to_dict(py, finding)?)?;
}
// Try parsing string as JSON for additional findings
if cfg.parse_json_strings && ... {
    let (_, _, json_findings) = scan_container(py, &py_parsed, &json_path, cfg, depth + 1)?;
    // Deduplicate using HashSet for O(1) lookup
    let raw_matches: std::collections::HashSet<String> =
        findings.iter().map(|f| f.matched_preview.clone()).collect();
    for item in json_findings.iter() {
        if let Ok(dict) = item.cast::<PyDict>() {
            let preview = dict.get_item("match")...;
            if !raw_matches.contains(&preview) {
                findings_list.append(item)?;
            }
        }
    }
}
// Dict key scanning
for (key, value) in dict.iter() {
    let key_str = key.str()?.to_string_lossy().into_owned();
    // Scan keys that are long enough to contain encoded content
    if key_str.len() >= cfg.min_encoded_length {
        let key_path = format!("{}(key)", child_path);
        let (_, key_findings) = scan_text(&key_str, &key_path, cfg, 0);
        // ... add findings ...
    }
}
```
Assessment: ✅ Correct
- Uses `HashSet<String>` for efficient O(1) deduplication lookup
- Properly handles PyO3 type conversions
- Returns `PyString` to preserve original type
- Dict key scanning logic matches Python implementation
3. Test Coverage (test_encoded_exfil_detector.py)
New Test 1: Type Preservation
```python
def test_json_string_returns_string_not_dict(self, use_rust: bool):
    """JSON-parsed strings must return the original string type, not a parsed dict."""
    json_str = json.dumps({"key": "clean value"})
    cfg = EncodedExfilDetectorConfig(parse_json_strings=True)
    payload = {"data": json_str}
    _, result, _ = _scan_container(payload, cfg, use_rust=use_rust)
    # The "data" value must still be a string, not a parsed dict
    assert isinstance(result["data"], str), f"Expected str but got {type(result['data'])}"
```
Assessment: ✅ Correct - Validates type preservation contract
New Test 2: Dict Key Detection
```python
def test_encoded_secret_in_dict_key_detected(self, use_rust: bool):
    """Encoded secrets used as dict keys should be detected."""
    encoded_key = base64.b64encode(b"password=super-secret-credential-value").decode()
    cfg = EncodedExfilDetectorConfig(min_suspicion_score=1)
    payload = {encoded_key: "some value"}
    count, _, findings = _scan_container(payload, cfg, use_rust=use_rust)
    assert count >= 1, "Encoded secret in dict key should be detected"
    assert any("key" in f.get("path", "") for f in findings), "Finding path should contain 'key'"
```
Assessment: ✅ Correct - Validates key scanning functionality
Updated Test: JSON Heuristic
```python
def test_json_within_string_detection(self, use_rust: bool):
    """JSON embedded in string should be scanned with precise paths."""
    json_str = json.dumps({"password": "secret-value"})
    cfg = EncodedExfilDetectorConfig(min_suspicion_score=1, parse_json_strings=True)
    payload = {"data": json_str}
    count, result, findings = _scan_container(payload, cfg, use_rust=use_rust)
    assert count == 1
    # Return type must be string (no type mutation)
    assert isinstance(result["data"], str), f"Expected str but got {type(result['data'])}"
```
Assessment: ✅ Correct - Updated assertion validates type preservation
Findings
✅ Strengths
- Type Safety: Fixes critical type mutation bug that could break downstream plugins
- Security: Closes evasion vector via encoded JSON keys
- Deduplication: Smart match-preview comparison prevents double-counting
- Test Coverage: Both changes have dedicated tests with Python/Rust parity
- Performance: Key scanning gated by `min_encoded_length` threshold
- Symmetry: Python and Rust implementations are consistent
✅ No Issues Found
- Logic is sound
- Edge cases handled (non-string keys, JSON parse failures)
- Type annotations correct
- Test coverage comprehensive
Security Impact Assessment
Before This Commit
| Attack Vector | Detected? |
|---|---|
| Encoded secret in JSON value | ✅ Yes |
| Encoded secret in JSON key | ❌ No (evasion possible) |
| JSON string mutated to dict |
After This Commit
| Attack Vector | Detected? |
|---|---|
| Encoded secret in JSON value | ✅ Yes |
| Encoded secret in JSON key | ✅ Yes (fixed) |
| JSON string mutated to dict | ✅ Fixed (type preserved) |
Recommended Actions
None Required
This commit is ready to merge as-is. The changes are:
- ✅ Well-tested (Python + Rust parity)
- ✅ Security-enhancing (closes evasion vector)
- ✅ Bug-fixing (type mutation prevention)
- ✅ Performance-safe (threshold-gated scanning)
Conclusion
✅ Approve
This is a high-quality fix that addresses both a security gap (encoded keys) and a type safety bug (JSON mutation). The implementation is clean, well-tested, and maintains Python/Rust parity.
Key Improvements:
- Encoded secrets in JSON keys are now detected
- String inputs always return string outputs (type contract preserved)
- Deduplication prevents double-counting across raw/JSON scans
Review generated by automated code review process.
…with full Rust parity
- Add allowlist_patterns config with regex validation to suppress false positives
- Add extra_sensitive_keywords and extra_egress_hints for configurable detection tuning
- Add nested encoding detection with max_decode_depth (peels base64-of-hex etc.)
- Add per_encoding_score for per-encoding suspicion thresholds
- Add parse_json_strings to detect encoded payloads inside JSON string values
- Add resource_post_fetch hook to scan fetched resources for exfiltration
- Add container recursion depth limiting (max_recursion_depth)
- Add detection logging (log_detections flag, no sensitive content in logs)
- Port all features to Rust with full parity via persistent ExfilDetectorEngine class
- Add compare_performance.py: Python vs Rust benchmarks (4.3x-12.1x speedup)
- Add 112 TDD tests: config validation, bypass resistance, parity, integration, nested encoding, JSON parsing, and xfail-documented limitations
- Full README rewrite with config reference, tuning guide, and worked examples
Closes #3807
Signed-off-by: Pratik Gandhi <gandhipratik203@gmail.com>
Signed-off-by: Pratik Gandhi <gandhipratik203@gmail.com>
Signed-off-by: Pratik Gandhi <gandhipratik203@gmail.com>
- Replace object.__setattr__() with setattr() to fix ruff PLC2801
- Rename unused __context to _context to fix vulture
- Run cargo fmt on Rust test code
Signed-off-by: Pratik Gandhi <gandhipratik203@gmail.com>
Signed-off-by: Pratik Gandhi <gandhipratik203@gmail.com>
- Collapse nested if statements in Rust to satisfy clippy collapsible_if
- Add pylint disable for model_post_init arguments-differ
Signed-off-by: Pratik Gandhi <gandhipratik203@gmail.com>
…is enabled
When a string is valid JSON, scan only the parsed structure; do not also scan the raw text. This prevents the same encoded value from being counted twice (once in the raw string, once in the parsed JSON value), which could incorrectly trip min_findings_to_block.
Add regression test verifying a single secret in a JSON string produces exactly 1 finding, not 2.
Signed-off-by: Pratik Gandhi <gandhipratik203@gmail.com>
…s, README fixes
- Add JSON parse heuristic: only attempt json.loads if string starts with
{ or [ and is within max_scan_string_length (Python + Rust)
- Add Rust-path test coverage for per-encoding thresholds, JSON parsing,
heuristic skip, and malformed JSON (8 new parametrized tests)
- Fix README: correct wheel name to mcpgateway-encoded-exfil-detection,
remove implemented features from Known Limitations, add Performance section
- Include pattern index in allowlist validation error messages
- Regenerate secrets baseline
Signed-off-by: Pratik Gandhi <gandhipratik203@gmail.com>
…ist partial match test
- Hoist core_chars.replace('=', "") to a single let above both boundary
checks in has_valid_boundaries() to avoid redundant string allocation
- Add test verifying allowlist partial match suppresses detection
Signed-off-by: Pratik Gandhi <gandhipratik203@gmail.com>
…type mutation
- Scan dict keys as strings when len >= min_encoded_length to detect secrets used as JSON object keys (both Python and Rust)
- Prevent JSON-within-strings from mutating return type: scan raw text first (preserves original string), then parse JSON for additional findings with deduplication by match preview
- Add tests for key scanning and type preservation
- Update JSON test assertions to verify string return type
Signed-off-by: Pratik Gandhi <gandhipratik203@gmail.com>
…ve matching
- Fix parity bug: extra_sensitive_keywords and extra_egress_hints were not lowercased in Python, causing case-insensitive matching to fail when users configured mixed-case keywords. Rust already lowercases both (kw.to_lowercase() / h.to_lowercase()).
- Update module docstring to list all 3 hooks (was missing resource_post_fetch).
- Add pragma: no cover to Rust-only code paths (import, scan, engine init).
- Add regression tests for mixed-case extra keywords and egress hints.
- Add test for max_recursion_depth container depth guard.
- Add test for JSON dedup path (Unicode-escaped base64 only found via JSON parse).
- Achieve 100% differential test coverage on Python plugin.
- Update .secrets.baseline for new test file entries (false positives).
Signed-off-by: Mihai Criveti <crivetimihai@gmail.com>
Review: rebased, reviewed, and fixed
Rebased onto
Bugs fixed (commit 9cbbcad)
Test coverage → 100%
Review notes (no issues)
crivetimihai left a comment
LGTM — rebased, fixed two parity bugs (case-insensitive matching for extra keywords/hints), achieved 100% differential test coverage. Ready to merge.
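The case-insensitivity parity bug mentioned here can be illustrated with a small sketch. It assumes plain string keywords and substring matching against lowercased text; the helper names `normalize_terms` and `contains_any` are hypothetical and do not come from the plugin.

```python
from typing import Iterable, Tuple


def normalize_terms(extra_terms: Iterable[str]) -> Tuple[str, ...]:
    """Lowercase operator-supplied terms once, at config time (hypothetical helper)."""
    return tuple(term.lower() for term in extra_terms)


def contains_any(decoded_text: str, terms: Iterable[str]) -> bool:
    """Substring match against lowercased text; only works if `terms` are lowercase too."""
    haystack = decoded_text.lower()
    return any(term in haystack for term in terms)


configured = ["WatsonX_API"]  # mixed-case keyword as an operator might write it
text = "export watsonx_api=abc123"

# Without normalization the mixed-case keyword never matches the lowercased text.
print(contains_any(text, configured))
# With normalization the same keyword matches regardless of case.
print(contains_any(text, normalize_terms(configured)))
```

Normalizing once when the config is built keeps the per-scan cost to a single lowercase of the decoded text.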
Summary
Hardens the encoded exfiltration detection plugin (`plugins/encoded_exfil_detection/`) to production quality for WXO integration, following a test-driven development approach. All new features are implemented in both Python and Rust with full parity.
The plugin detects suspicious encoded payloads (base64, base64url, hex, percent-encoding, hex escapes) in prompt arguments, tool outputs, and resource content, then blocks or redacts based on a multi-factor suspicion scoring system. It uses Rust acceleration automatically when the `encoded_exfil_detection_rust` wheel is installed (4.3x–12.1x speedup), otherwise falling back to a pure Python implementation with identical behavior.
This PR closes 10 gaps identified during a thorough gap analysis against the PII filter and secrets detection sibling plugins. It adds allowlisting, configurable keyword/egress lists, nested multi-layer encoding detection, per-encoding thresholds, JSON-within-strings parsing, a `resource_post_fetch` hook, container recursion depth limiting, and detection logging, all in both Python and Rust. It also adds 112 TDD tests, a full README rewrite, and a Python-vs-Rust performance comparison script.
Gaps closed
Python plugin
Gap 1 (HIGH): No allowlisting. Legitimate base64 in production (JWTs, image data URIs, git SHAs) would cause constant false positives, leading ops teams to disable the plugin. Fixed by adding `allowlist_patterns: list[str]` to config: regex patterns validated via Pydantic `field_validator`, compiled once at config creation via `model_post_init`, and checked before scoring in `_scan_text`. The PII filter sibling plugin uses the same pattern (`whitelist_patterns`).
Gap 2 (HIGH): No per-encoding thresholds. Hex encodings have significantly higher false-positive rates than percent-encoding or escaped hex, but a single `min_suspicion_score` applied to all encoding types. Fixed by adding `per_encoding_score: Dict[str, int]` to config. The threshold lookup in `_evaluate_candidate` now uses `cfg.per_encoding_score.get(encoding, cfg.min_suspicion_score)`, falling back to the global threshold for unconfigured encodings. Operators can set e.g. `{"hex": 5, "base64": 2}` to apply stricter filtering to hex while keeping base64 sensitive.
Gap 3 (MEDIUM): No `resource_post_fetch` hook. The secrets detection sibling plugin scans resources, but encoded exfil only covered prompts and tool outputs. An attacker could exfiltrate via encoded content in a resource response. Fixed by adding `async def resource_post_fetch()` following the same pattern as `tool_post_invoke`. Updated `plugin-manifest.yaml` and `plugins/config.yaml` to declare the new hook.
Gap 4 (MEDIUM): Hardcoded keyword/egress lists. `_SENSITIVE_KEYWORDS` (14 entries) and `_EGRESS_HINTS` (12 entries) were baked into the code with no way to extend them for deployment-specific secrets (e.g., `watsonx_api`, `ibm_cloud_key`). Fixed by adding `extra_sensitive_keywords: list[str]` and `extra_egress_hints: list[str]` to config. Extras are pre-encoded/lowercased at config creation via `model_post_init` and merged with built-in defaults during scoring.
Gap 5 (MEDIUM): No nested encoding detection. An attacker could double-encode a payload (`base64(hex(secret))`) and the scanner would only see the outer layer, which decodes to another encoded string with no sensitive keywords. Fixed by adding `max_decode_depth: int` (default 2, range 1-5). After evaluating a candidate, the scanner decodes it and re-scans the decoded text for additional encoded segments. If a nested finding scores higher than the outer finding, the nested finding replaces it. This catches multi-layer obfuscation while respecting the depth limit.
Gap 6 (MEDIUM): No JSON-within-strings parsing. A string value containing JSON (e.g., `'{"secret": "base64..."}'`) was scanned as flat text. The scanner could find encoded segments in the raw text, but would not recurse into the JSON structure to find segments at deeper nesting levels. Fixed by adding `parse_json_strings: bool` (default `true`). After scanning a string, if it parses as a JSON dict or list, the plugin recurses into the parsed structure. Respects `max_recursion_depth` to prevent abuse.
Gap 7 (LOW): No container recursion depth limit. Deeply nested dicts/lists could cause unbounded recursion. Fixed by adding `max_recursion_depth: int` (default 32, range 1-1000) to `_scan_container`. Recursion returns early when the depth limit is exceeded.
Gap 8 (LOW): No detection logging. When findings were detected, nothing was logged. The PII filter sibling plugin has `log_detections`. Fixed by adding `log_detections: bool` (default `true`) and a `_log_detection()` method that emits `logger.warning` with hook name, count, encoding types, and request ID, never decoded payload content.
Rust implementation
Gap 9 (HIGH): Rust lacked all new features. Allowlisting, configurable keywords/egress, nested detection, per-encoding thresholds, JSON-within-strings parsing, and recursion depth were only in Python. Any deployment using Rust acceleration (the default when the wheel is installed) would silently lack these features. Fixed by porting all features to `plugins_rust/encoded_exfil_detection/src/lib.rs`:
- `allowlist_patterns: Vec<Regex>` compiled from Python config strings with explicit `PyValueError` on invalid regex (matching Python validation behavior)
- `extra_sensitive_keywords: Vec<Vec<u8>>` and `extra_egress_hints: Vec<String>` merged with built-in constants
- `scan_text` with `decode_depth` parameter, same best-score-wins logic as Python
- `per_encoding_score: HashMap<String, u32>` with same threshold lookup as Python
- `serde_json::from_str` + `json_value_to_py` converter for recursive scanning
- `max_recursion_depth` counter in `scan_container`
- `test_nested_base64_detection`, `test_allowlist_skips_matching_candidate`, `test_extra_sensitive_keywords`
Gap 10 (HIGH): No Python-vs-Rust performance comparison script. All four other Rust plugins (`rate_limiter`, `pii_filter`, `secrets_detection`, `url_reputation`) have `compare_performance.py` but encoded exfil did not. Fixed by creating `plugins_rust/encoded_exfil_detection/compare_performance.py` with latency mode (per-call timing, 7 scenarios), throughput mode (async concurrency at 1/4/16/64 tasks), and parity smoke tests before each benchmark.
Gap 11 (MEDIUM): Per-call config parsing overhead in Rust. The Rust entry point was a bare function (`py_scan_container`) that re-parsed the Python config object via 15 `.getattr()` calls on every request, including recompiling allowlist regexes, re-encoding keywords to bytes, and re-lowercasing egress hints. The rate limiter Rust plugin avoids this with a persistent `RateLimiterEngine` class that parses config once at `__new__()`. Fixed by creating `#[pyclass] ExfilDetectorEngine` with `__new__()` (parses config once) and `scan()` (reuses pre-parsed config). The Python plugin creates the engine at `__init__` and calls `self._rust_engine.scan(container)` on each hook. The bare `py_scan_container()` function is kept as a backward-compatible wrapper. Result: clean payload speedup improved from 2.4x to 4.3x, small payload from 3.7x to 4.7x.
Additional hardening
- `model_post_init` (Python) and `__new__()` (Rust), not on every scan call. Previous implementation re-compiled on every direct `_scan_container` call.
- `{**nf, "start": ..., "end": ...}`) moved inside the conditional, only allocating when the finding will actually be used.
- `PyValueError` for invalid allowlist regex patterns instead of silently dropping them via `filter_map(|p| Regex::new(&p).ok())`. Both implementations now reject bad patterns at config time.
- `PromptPrehookPayload`, `PluginMode`, `ResourcePostFetchResult`) removed from test files.
- `class Config: underscore_attrs_are_private = True` with `model_config = {"ignored_types": (re.Pattern,)}` to silence Pydantic v2 deprecation warning.
- `0.1.0` → `0.2.0` in both `plugin-manifest.yaml` and `plugins/config.yaml`.
- `interrogate` coverage on the plugin directory (21/21).
Architecture
The plugin uses optional Rust acceleration with automatic Python fallback:
- `prompt_pre_fetch`, `tool_post_invoke`, `resource_post_fetch`), and result construction (`PluginViolation`, `PluginResult`)
Plugin architecture: hook integration, Rust/Python dispatch, and result handling
Detection pipeline: what happens inside scan_text for each string
Scoring worked example: base64 credential near egress context
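The optional Rust acceleration with automatic Python fallback described in this section could look roughly like the sketch below. It is a hedged illustration, not the plugin's code: using `encoded_exfil_detection_rust` as the import name, the `ExfilDetectorEngine(config)` constructor call, and the `Detector`, `load_rust_engine`, and `python_scan` names are all assumptions made for the example.

```python
import importlib
from typing import Any, Callable, Optional


def load_rust_engine(config: Any, module_name: str = "encoded_exfil_detection_rust") -> Optional[Any]:
    """Try to build a persistent Rust engine; return None if the wheel is absent.

    Assumption: the wheel exposes an `ExfilDetectorEngine` class taking the config.
    """
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return None
    engine_cls = getattr(module, "ExfilDetectorEngine", None)
    return engine_cls(config) if engine_cls is not None else None


class Detector:
    """Hypothetical dispatcher: Rust engine when available, pure Python otherwise."""

    def __init__(self, config: Any, python_scan: Callable[[Any, Any], Any],
                 module_name: str = "encoded_exfil_detection_rust") -> None:
        self._config = config
        self._python_scan = python_scan
        # Created once, so config parsing is not repeated on every scan.
        self._rust_engine = load_rust_engine(config, module_name)

    def scan(self, container: Any) -> Any:
        if self._rust_engine is not None:
            return self._rust_engine.scan(container)
        return self._python_scan(container, self._config)
```

Building the engine once in the constructor mirrors the persistent-engine idea from Gap 11: config is parsed a single time, and each hook call only pays for the scan itself.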
Test results
Test results summary
`cargo test`
1. Unit test breakdown
Full test group results (12 groups, 112 tests)
- `min_entropy < 0`, `ratio > 1.0`, etc.)
- `resource_post_fetch` hook
- `block_on_detection=false` metadata, `None` args, `min_findings_to_block`
- `max_scan_string_length` boundary, all encodings disabled, non-container types
- `None` input safe, logs don't leak secrets
2. Rust unit tests
Full Rust test results (7 tests)
3. Python vs Rust performance comparison
Full benchmark results (Apple Silicon, release build, 200 iterations)
Parity smoke tests pass for all 7 scenarios before benchmarking.
Rust speedup scales with payload size: 4.3x on clean payloads (improved from 2.4x after the persistent engine refactor eliminated per-call config parsing), up to 12.1x on large text where pre-compiled static regexes, fixed-size entropy arrays, and zero-copy string processing provide the largest advantage.
Limitations
Cross-request correlation is not tracked
The plugin is stateless. An attacker splitting a credential across multiple requests (slow exfil) will not be detected. This is a separate system concern — it requires stateful storage across requests (Redis, decay windows, reassembly logic) that is fundamentally different from a plugin's scan-per-request model. Documented as an xfail test.
Custom encoding patterns are not supported
Only the 5 built-in encoding types (base64, base64url, hex, percent-encoding, escaped hex) are available. User-defined regex patterns are not accepted because user-supplied regexes can cause ReDoS (catastrophic backtracking). The 5 built-in types cover the vast majority of real-world encoding-based exfiltration attempts. Documented as an xfail test.
JSON-within-strings adds overhead
Every string value gets a `json.loads()` / `serde_json::from_str()` attempt. On clean payloads (majority of traffic), this is a failed parse on every string (~1-3 microseconds). The `parse_json_strings: false` flag allows operators to disable this if the overhead is unacceptable.
Nested detection adds scan overhead
With `max_decode_depth=2` (default), every candidate that decodes successfully but scores below threshold triggers a re-scan of the decoded text. For payloads with many decodable-but-innocent segments, this increases scan time. Setting `max_decode_depth=1` disables nested detection.
Allowlist patterns match the encoded form, not decoded content
The allowlist regex is tested against each raw encoded candidate string before decoding. This means patterns must match the encoded form (e.g., the base64 prefix of a JWT), not the decoded content. This is intentional — matching decoded content would require decoding first, defeating the purpose of skipping known-good patterns early.
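A small sketch of what matching the encoded form means in practice, using a JWT header as the known-good value. The `is_allowlisted` helper and the use of `re.search` are assumptions for illustration; only the idea of testing the raw encoded candidate comes from the description above.

```python
import base64
import re
from typing import Iterable, Pattern

# A JWT header is base64url-encoded JSON, so its encoded form starts with "eyJ".
jwt_header = base64.urlsafe_b64encode(b'{"alg":"HS256","typ":"JWT"}').decode().rstrip("=")


def is_allowlisted(candidate: str, patterns: Iterable[Pattern[str]]) -> bool:
    """Hypothetical check: test the raw encoded candidate, before any decoding."""
    return any(p.search(candidate) for p in patterns)


encoded_form_pattern = [re.compile(r"^eyJ[A-Za-z0-9_-]+$")]  # targets the encoded text
decoded_form_pattern = [re.compile(r'"alg"')]                # targets decoded JSON

print(jwt_header)
print(is_allowlisted(jwt_header, encoded_form_pattern))  # matches the encoded prefix
print(is_allowlisted(jwt_header, decoded_form_pattern))  # cannot match: input is still encoded
```

A pattern written against decoded content never fires, because the check sees only the encoded candidate. Allowlist entries therefore need to describe prefixes or shapes of the encoded text.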
Not validated on OpenShift
All testing was performed locally and in CI. The plugin has not been tested on an OpenShift cluster, which is the target production environment.
Closes #3807.