fix(pii_filter): add comprehensive Rust implementation hardening and regression tests by lucarlig · Pull Request #3840 · IBM/mcp-context-forge

lucarlig · 2026-03-24T14:49:12Z

📌 Summary

This PR hardens the Rust PII filter implementation with comprehensive validation, error handling, detection improvements, and regression test coverage to ensure robust security behavior. It also tightens loopback passthrough header filtering so internal loopback requests do not forward hop-by-hop, routing, or MCP session headers from the inbound client request.

🔁 Reproduction Steps

Issues were identified through internal security review and testing that revealed gaps in validation, detection patterns, error handling, and loopback header filtering.

🐞 Root Cause

The Rust PII filter implementation needed hardening in several areas:

Mask strategy handling: Strategies and nested keys required proper preservation
Detection patterns: Patterns needed expansion for better coverage
Input validation: Missing validation and error handling in core logic
Configuration limits: Resource limits needed proper bounds and validation
Test coverage: Missing regression tests for edge cases
Loopback passthrough filtering: Internal loopback requests still allowed transport and routing headers that should be regenerated or blocked by the gateway

💡 Fix Description

Implemented comprehensive improvements across the branch:

Mask strategy preservation and nested key support (16a9930c0)
- Added proper handling of mask strategies across detection types
- Implemented nested JSON key support
- Added tests for strategy preservation
Comprehensive detection patterns and protection (b6860120d)
- Expanded PII detection patterns for better coverage
- Added pattern complexity validation
- Implemented protection against pathological regex behavior
- Added 400+ lines of detection logic
Input validation and error handling (56e85f82e)
- Added validation for masking inputs
- Implemented proper error messages and handling
- Added 90+ lines of validation logic
Resource limits and config validation (e886de2a6)
- Added configuration validation with proper bounds
- Implemented resource limit enforcement
- Added config validation tests
Comprehensive test coverage (415ba377f)
- Added 90+ lines of test coverage for edge cases
- Implemented regression tests for detection gaps
- Added error path testing
Documentation (e77106295)
- Documented protection limits and constraints
- Added upper bounds to resource limit fields
- Updated README with security considerations
Loopback passthrough header hardening
- Expanded the loopback skip list to drop hop-by-hop and routing headers such as Connection, Transfer-Encoding, TE, Trailer, Upgrade, Host, and Content-Length
- Prevented inbound clients from influencing internal loopback request framing, routing, or MCP session propagation
- Added deny-path coverage for the filtered header set

🧪 Verification

Check	Command	Status
Lint suite	`make lint`	✅
Unit tests	`make test`	✅
Coverage ≥ 80 %	`make coverage`	✅
Rust tests	`cargo test`	✅
Manual regression no longer fails	Verified all edge cases	✅

📐 MCP Compliance (if relevant)

Matches current MCP spec
No breaking change to MCP clients

✅ Checklist

Code formatted (make black isort pre-commit)
No secrets/credentials committed
Tests added for all changes
Documentation updated

sco3 · 2026-03-25T14:05:47Z

I approve. I'm recording some details here for the record; the findings are low priority.

Branch Review Findings

Branch: fix/pii-filter-regression-tests
Compared to: main
Review Date: March 25, 2026

Executive Summary

Metric	Value
Files Changed	13
Insertions	1,696 lines
Deletions	129 lines
Commits	10
Critical Issues	0
Medium Issues	1
Low Issues	4
Positive Findings	6

Verdict: APPROVE with minor fixes

Branch Overview

This branch focuses on three main areas:

Loopback Passthrough Header Hardening (security deny-path)
Rust PII Filter Improvements (SSN validation, resource limits, ReDoS protection)
Test Coverage for edge cases and regression testing

Files Changed

File	Changes	Purpose
`mcpgateway/utils/passthrough_headers.py`	+9	Block hop-by-hop headers
`plugins/pii_filter/pii_filter.py`	+5	Add resource limit config
`plugins/pii_filter/pii_filter_rust.py`	-19	Simplify import logic
`plugins/pii_filter/README.md`	+26	Document SSN validation
`plugins_rust/pii_filter/README.md`	+149	Document detection coverage
`plugins_rust/pii_filter/src/config.rs`	+104	Enforce resource limits
`plugins_rust/pii_filter/src/detector.rs`	+900	SSN validation, error handling
`plugins_rust/pii_filter/src/masking.rs`	+132	Range validation, UTF-8 safety
`plugins_rust/pii_filter/src/patterns.rs`	+226	ReDoS protection, contextual matching
`plugins_rust/pii_filter/benches/pii_filter.rs`	+1	Fix benchmark config
`plugins_rust/pii_filter/python/pii_filter_rust/__init__.pyi`	-1	Cleanup
`tests/unit/mcpgateway/plugins/plugins/pii_filter/test_pii_filter.py`	+226	Edge case coverage
`tests/unit/mcpgateway/test_loopback_passthrough_headers.py`	+17	Deny-path tests

Issues by Severity

🔴 CRITICAL: None

No show-stopping bugs or security vulnerabilities found.

🟡 MEDIUM: 1 Issue

1. Custom Pattern ReDoS Validation Incomplete

File: plugins_rust/pii_filter/src/patterns.rs (lines 324-362)
Function: validate_custom_pattern()

Problem: The validation counts total quantifiers but doesn't detect nested quantifiers, which are the primary cause of Regular Expression Denial of Service (ReDoS) attacks.

Current Validation:

let quantifiers = pattern.chars()
    .filter(|ch| matches!(ch, '*' | '+' | '?'))
    .count()
    + pattern.matches('{').count();
if quantifiers > MAX_QUANTIFIERS {  // 24
    return Err("too many quantifiers");
}

Dangerous Patterns That Pass Current Validation:

Pattern	Quantifiers	Nesting	Risk
`(a+)+`	2	2	🔴 ReDoS
`((a+)+)+`	3	3	🔴 ReDoS
`(\w+\s?)+`	3	2	🔴 ReDoS
`([A-Za-z0-9]+[-_]?)+`	3	2	🔴 ReDoS
`a+b+c+d+`	4	0	✅ Safe
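
To make the table concrete, here is a minimal Python sketch that mirrors the counting rule from the quoted Rust snippet (count every `*`, `+`, `?` and `{`). The function name `count_quantifiers` is hypothetical and this is not the plugin's code; it only shows that every pattern in the table stays far below the limit of 24, so a pure count cannot tell nested quantifiers from flat ones.

```python
MAX_QUANTIFIERS = 24  # limit quoted in the review snippet


def count_quantifiers(pattern: str) -> int:
    """Count `*`, `+`, `?` and `{` characters, as the quoted Rust check does."""
    return sum(pattern.count(symbol) for symbol in "*+?{")


PATTERNS = [
    r"(a+)+",
    r"((a+)+)+",
    r"(\w+\s?)+",
    r"([A-Za-z0-9]+[-_]?)+",
    r"a+b+c+d+",
]

for pattern in PATTERNS:
    total = count_quantifiers(pattern)
    verdict = "accepted" if total <= MAX_QUANTIFIERS else "rejected"
    print(f"{pattern}: {total} quantifiers, {verdict}")
```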

Impact: A malicious or erroneous custom pattern can cause catastrophic backtracking, consuming CPU for seconds or minutes on small inputs.

Example Attack:

# This pattern passes current validation
pattern = r"(a+)+"

# Input: 30 'a' characters + '!'
text = "aaaaaaaaaaaaaaaaaaaaaaaaaaaaaa!"

# Processing time: ~1.2 seconds (exponential growth)
# Each additional character doubles the time

Fix Option A: Add Nested Quantifier Validation (Recommended):

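/// Computes the maximum nesting depth of quantifiers:
/// `a+` has depth 1, `(a+)+` has depth 2, `((a+)+)+` has depth 3.
fn calculate_quantifier_nesting_depth(pattern: &str) -> usize {
    // One entry per open group (index 0 is the top level), holding the
    // deepest quantifier nesting seen so far inside that group.
    let mut depths: Vec<usize> = vec![0];
    // Nesting inside the group that closed immediately before this char.
    let mut just_closed: Option<usize> = None;
    let mut prev = '\0';
    let mut in_char_class = false;
    let mut escaped = false;

    for ch in pattern.chars() {
        let closed = just_closed.take();
        let before = prev;
        prev = if escaped { '\0' } else { ch };

        if escaped {
            escaped = false;
            continue;
        }
        if ch == '\\' {
            escaped = true;
            continue;
        }
        if in_char_class {
            in_char_class = ch != ']';
            continue;
        }

        match ch {
            '[' => in_char_class = true,
            '(' => depths.push(0),
            ')' if depths.len() > 1 => {
                let inner = depths.pop().unwrap_or(0);
                if let Some(outer) = depths.last_mut() {
                    *outer = (*outer).max(inner);
                }
                just_closed = Some(inner);
            }
            // `(?` introduces group flags such as `(?:`, not a quantifier.
            '?' if before == '(' => {}
            '*' | '+' | '?' | '{' => {
                // A quantifier applied to a group adds one level to whatever
                // nesting that group already contains.
                let depth = closed.unwrap_or(0) + 1;
                if let Some(current) = depths.last_mut() {
                    *current = (*current).max(depth);
                }
            }
            _ => {}
        }
    }

    depths.into_iter().max().unwrap_or(0)
}

// Add to validate_custom_pattern:
const MAX_NESTING_DEPTH: usize = 1;

let nesting = calculate_quantifier_nesting_depth(pattern);
if nesting > MAX_NESTING_DEPTH {
    return Err(format!(
        "Pattern has nested quantifiers (depth: {}, max: {}) - potential ReDoS",
        nesting, MAX_NESTING_DEPTH
    ));
}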

Fix Option B: Document Trusted-Input Assumption (Alternative):

If custom patterns are only added by trusted admins (not end users), document this assumption:

// SECURITY NOTE: Custom patterns are trusted input (added by admins only).
// This validation provides basic safeguards against typos and obvious errors.
// For untrusted pattern sources, use regex-automata DFA engine instead.

Tests to Add:

#[test]
fn test_rejects_nested_quantifiers() {
    assert!(validate_custom_pattern("(a+)+").is_err());
    assert!(validate_custom_pattern("((a+)+)+").is_err());
    assert!(validate_custom_pattern(r"(\w+\s?)+").is_err());
}

#[test]
fn test_accepts_flat_quantifiers() {
    assert!(validate_custom_pattern("a+b+c+").is_ok());
    assert!(validate_custom_pattern("EMP-[0-9]{6}").is_ok());
}

Effort: ~60 lines of code
Priority: Fix before merge OR document trusted-input assumption

🟢 LOW: 4 Issues

1. UTF-8 Boundary Error Message Lacks Diagnostic Info

File: plugins_rust/pii_filter/src/masking.rs (lines 58-87)
Function: validate_detection_ranges()

Current Code:

if !text.is_char_boundary(detection.start) || !text.is_char_boundary(detection.end) {
    return Err("Invalid detection range: offsets must align to UTF-8 boundaries".to_string());
}

Problem: Error message doesn't include which detection failed or what the actual byte offsets were, making debugging difficult.

Fix:

if !text.is_char_boundary(detection.start) || !text.is_char_boundary(detection.end) {
    return Err(format!(
        "Invalid detection range: offsets {}..{} must align to UTF-8 boundaries (text len: {})",
        detection.start, detection.end, text.len()
    ));
}

Effort: ~5 lines
Priority: Recommended

2. Hash Output Length Change Undocumented

File: plugins_rust/pii_filter/src/masking.rs (line 218)

Change:

// Before: format!("[HASH:{}]", &format!("{:x}", result)[..8])
// After:  format!("[HASH:{}]", &format!("{:x}", result)[..16])

Impact:

Before: [HASH:abcd1234] (15 characters)
After: [HASH:abcd1234ef567890] (23 characters)

Downstream systems parsing masked output may break if they expect fixed-width fields.
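
As an illustration of the migration concern, a downstream consumer could accept both token widths while it transitions. This is a minimal sketch of a hypothetical parser (the names `HASH_TOKEN` and `extract_hash_digests` are made up for this example), assuming the digest is lowercase hex as produced by Rust's `{:x}` formatting:

```python
import re

# Hypothetical downstream parser: accepts the old 8-character and the new
# 16-character lowercase hex digests inside a [HASH:...] token.
HASH_TOKEN = re.compile(r"\[HASH:([0-9a-f]{8}|[0-9a-f]{16})\]")


def extract_hash_digests(masked_text: str) -> list[str]:
    """Return every digest found in HASH mask tokens, in order of appearance."""
    return HASH_TOKEN.findall(masked_text)


sample = "old [HASH:abcd1234] and new [HASH:abcd1234ef567890]"
print(extract_hash_digests(sample))
```

A parser written against a fixed width of 8 would silently miss the second token, which is the breakage the finding warns about.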

Fix: Add to plugins_rust/pii_filter/README.md or CHANGELOG.md:

## Changelog

### [Unreleased]

#### Breaking Changes
- **Hash mask output length increased**: Hash strategy now produces 16-character
  hex output instead of 8 characters for improved collision resistance.
  - Before: `[HASH:abcd1234]`
  - After: `[HASH:abcd1234ef567890]`
  - Migration: Update any regex parsers or fixed-width field extractors

Effort: ~10 lines documentation
Priority: Document before merge

3. Performance Test Generates Invalid SSNs

File: tests/unit/mcpgateway/plugins/plugins/pii_filter/test_pii_filter.py (lines 914-918)

Current Code:

for i in range(10000):
    area = (i % 799) + 100
    if area == 666:  # Only skips exactly 666
        area = 667
    lines.append(f"User {i}: SSN {area:03d}-45-6789, Email user{i}@example.com")

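Problem: Only 666 is handled explicitly. The other invalid SSA area codes are avoided only implicitly, because (i % 799) + 100 happens to stay within 100-898:

000: invalid (never generated, but not checked)
666: invalid (the only value remapped)
900-999: invalid (never generated, but not checked)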

Fix:

def is_valid_ssn_area(area: int) -> bool:
    """Check if SSN area code is structurally valid per SSA rules."""
    return area != 0 and area != 666 and area < 900

lines = []
i = 0
while len(lines) < 10000:
    area = (i % 800) + 100  # Range 100-899
    if is_valid_ssn_area(area):
        lines.append(f"User {len(lines)}: SSN {area:03d}-45-6789, Email user{len(lines)}@example.com")
    i += 1

Effort: ~15 lines
Priority: Recommended for test accuracy

4. Cumulative Text Size Not Tracked in Nested Structures (Optional)

File: plugins_rust/pii_filter/src/detector.rs (lines 216-350)
Function: process_nested_internal()

Current Behavior: Each string is validated individually against max_text_bytes, but cumulative size across many strings is not tracked.

Attack Scenario:

# Each string is 1KB (passes individual check)
data = {"field_" + str(i): "x" * 1024 for i in range(2000)}
# Total: 2MB (may exceed intended memory budget)

Fix (Optional, More Invasive):

fn process_nested_internal(
    &self,
    py: Python,
    data: &Bound<'_, PyAny>,
    path: &str,
    depth: usize,
    cumulative_size: &mut usize,  // NEW parameter
) -> PyResult<(bool, Py<PyAny>, Py<PyAny>)> {
    // ...

    if let Ok(text) = data.extract::<String>() {
        *cumulative_size += text.len();
        if *cumulative_size > self.config.max_text_bytes {
            return Err(PyErr::new::<pyo3::exceptions::PyValueError, _>(
                "Cumulative text size exceeds maximum limit"
            ));
        }
        // ...
    }

    // Recursive calls pass cumulative_size
    let (val_modified, new_value, val_detections) =
        self.process_nested_internal(py, &value, &new_path, depth + 1, cumulative_size)?;
}

Why Optional: This is defense-in-depth. The existing per-string check catches most attacks. Only needed if memory exhaustion via many small strings is a documented threat model.
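
The cumulative-budget idea can be sketched independently of PyO3. The following Python walk is an assumption-laden illustration, not the plugin's implementation: `total_text_bytes` is a hypothetical helper, it counts only string values (not keys), and it measures size in UTF-8 bytes.

```python
def total_text_bytes(data: object, budget: int, used: int = 0) -> int:
    """Sum the UTF-8 size of all strings in `data`; raise once `budget` is exceeded."""
    if isinstance(data, str):
        used += len(data.encode("utf-8"))
        if used > budget:
            raise ValueError("Cumulative text size exceeds maximum limit")
    elif isinstance(data, dict):
        for value in data.values():
            used = total_text_bytes(value, budget, used)
    elif isinstance(data, (list, tuple)):
        for item in data:
            used = total_text_bytes(item, budget, used)
    return used


# The payload from the attack scenario: 2000 strings of 1 KiB each.
data = {"field_" + str(i): "x" * 1024 for i in range(2000)}
print(total_text_bytes(data, budget=4 * 1024 * 1024))
```

With a 1 MiB budget the same call raises partway through the dictionary, even though every individual string is small.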

Effort: ~50 lines
Priority: Optional hardening

Positive Findings

✅ 1. Excellent SSN Validation

File: plugins_rust/pii_filter/src/detector.rs (lines 1133-1146)

The Rust detector correctly implements SSA Publication No. 05-10033 rules:

fn is_valid_ssn(value: &str) -> bool {
    let digits: String = value.chars().filter(|c| c.is_ascii_digit()).collect();
    if digits.len() != 9 {
        return false;
    }

    let area = &digits[0..3];
    let group = &digits[3..5];
    let serial = &digits[5..9];

    area != "000" && area != "666" && area < "900" && group != "00" && serial != "0000"
}

Validates:

✅ Area code cannot be 000
✅ Area code cannot be 666
✅ Area code cannot be 900-999
✅ Group number cannot be 00
✅ Serial number cannot be 0000

Impact: Reduces false positives on random 9-digit numbers.
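
For readers who want to try the rules interactively, here is a direct Python transliteration of the Rust function quoted above. It is a sketch for illustration only and is not taken from the Python plugin:

```python
def is_valid_ssn(value: str) -> bool:
    """Apply the structural checks from the quoted Rust `is_valid_ssn`."""
    # Keep ASCII digits only, so separators such as '-' or ' ' are ignored.
    digits = "".join(ch for ch in value if ch in "0123456789")
    if len(digits) != 9:
        return False

    area, group, serial = digits[:3], digits[3:5], digits[5:]
    # Comparing fixed-width digit strings orders them like numbers.
    return (
        area not in ("000", "666")
        and area < "900"
        and group != "00"
        and serial != "0000"
    )


for candidate in ("123-45-6789", "666-45-6789", "900-45-6789", "123-00-6789"):
    print(candidate, is_valid_ssn(candidate))
```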

✅ 2. Strong Credit Card Validation

File: plugins_rust/pii_filter/src/detector.rs (lines 1148-1190)

Implements proper Luhn algorithm + card prefix validation:

fn passes_luhn(value: &str) -> bool {
    // ✓ Luhn checksum validation
    // ✓ Length check (13-19 digits)
    // ✓ Card prefix validation (Visa, MC, Amex, etc.)
}

Validates:

✅ Luhn checksum
✅ Card length (13-19 digits)
✅ Known card prefixes (Visa 4, MC 51-55, Amex 34/37, etc.)

Impact: Prevents false positives on random 16-digit numbers.
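
The quoted Rust body is elided, so the following is only a sketch of the standard Luhn checksum with the 13-19 digit length check, written in Python. It deliberately omits the card prefix validation mentioned above, and it should not be read as the detector's actual code:

```python
def passes_luhn(value: str) -> bool:
    """Return True if the digits in `value` satisfy the Luhn checksum."""
    digits = [int(ch) for ch in value if ch in "0123456789"]
    if not 13 <= len(digits) <= 19:
        return False

    total = 0
    # Walk from the rightmost digit, doubling every second digit.
    for index, digit in enumerate(reversed(digits)):
        if index % 2 == 1:
            digit *= 2
            if digit > 9:
                digit -= 9  # same as summing the two digits of the product
        total += digit
    return total % 10 == 0


print(passes_luhn("4111 1111 1111 1111"))  # widely used Visa test number
print(passes_luhn("4111 1111 1111 1112"))  # last digit changed
```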

✅ 3. Contextual PII Detection

File: plugins_rust/pii_filter/src/patterns.rs (lines 36-175)

Built-in patterns require explicit context labels for ambiguous identifiers:

PII Type	Required Context	Example Match	Example Non-Match
SSN	"SSN", "Social Security"	`SSN: 123-45-6789`	`Order: 123-45-6789`
BSN	"BSN", "Citizen Service Number"	`My BSN is 123456789`	`Invoice: 123456789`
Passport	"Passport", "Passport No"	`Passport: AB123456`	`ID: AB123456`
Bank Account	"Account", "Bank Account"	`Account: 123456789`	`Reference: 123456789`

Impact: Significantly reduces false positives on generic identifiers.

✅ 4. Comprehensive Loopback Header Filtering

File: mcpgateway/utils/passthrough_headers.py (lines 552-568)

Blocks HTTP/1.1 hop-by-hop, framing, and routing headers on loopback requests:

_LOOPBACK_SKIP_HEADERS: frozenset[str] = frozenset({
    "authorization",
    "connection",
    "content-type",
    "content-length",
    "host",
    "keep-alive",
    "mcp-session-id",
    "proxy-connection",
    "te",
    "trailer",
    "transfer-encoding",
    "upgrade",
    "x-mcp-session-id",
    "x-forwarded-internally",
})

Blocks:

✅ Authentication headers (Authorization)
✅ Hop-by-hop headers (Connection, Keep-Alive, Proxy-Connection, TE, Trailer, Transfer-Encoding, Upgrade)
✅ Routing and framing headers (Host, Content-Length, Content-Type)
✅ MCP-specific headers (MCP-Session-ID, X-MCP-Session-ID, X-Forwarded-Internally)

Impact: Prevents header injection attacks in loopback scenarios.
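
A minimal sketch of how such a skip list is typically applied follows. The set is copied from the snippet above; the helper `filter_loopback_headers` is a hypothetical name and the gateway's real function may be structured differently. The key detail is the case-insensitive comparison, since HTTP header names are case-insensitive:

```python
_LOOPBACK_SKIP_HEADERS: frozenset[str] = frozenset({
    "authorization", "connection", "content-type", "content-length", "host",
    "keep-alive", "mcp-session-id", "proxy-connection", "te", "trailer",
    "transfer-encoding", "upgrade", "x-mcp-session-id", "x-forwarded-internally",
})


def filter_loopback_headers(headers: dict[str, str]) -> dict[str, str]:
    """Drop any inbound header whose lowercased name is on the skip list."""
    return {
        name: value
        for name, value in headers.items()
        if name.lower() not in _LOOPBACK_SKIP_HEADERS
    }


inbound = {
    "Host": "attacker.example",
    "Transfer-Encoding": "chunked",
    "MCP-Session-ID": "abc",
    "X-Tenant-Id": "42",
}
print(filter_loopback_headers(inbound))
```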

✅ 5. Resource Limit Enforcement

File: plugins_rust/pii_filter/src/config.rs (lines 220-252)

Validates and enforces safe resource limits:

pub max_text_bytes: usize,      // Default: 10MB, Max: 100MB
pub max_nested_depth: usize,    // Default: 32, Max: 1000
pub max_collection_items: usize // Default: 4096, Max: 1,000,000

Validates:

✅ Limits are within safe bounds
✅ Rejects zero or negative limits
✅ Enforced at every entry point (detect, mask, process_nested)

Impact: Prevents DoS via oversized inputs or deeply nested structures.
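
The bounds check can be sketched in a few lines of Python. The defaults and maxima are the ones quoted above; the class name `ResourceLimits`, the `LIMIT_MAXIMA` table, and the reading of "MB" as MiB are assumptions made for this illustration, not details confirmed by `config.rs`:

```python
from dataclasses import dataclass

MIB = 1024 * 1024  # assumption: "MB" in the review means MiB

# Upper bounds quoted in the review; the table itself is hypothetical.
LIMIT_MAXIMA = {
    "max_text_bytes": 100 * MIB,
    "max_nested_depth": 1000,
    "max_collection_items": 1_000_000,
}


@dataclass
class ResourceLimits:
    max_text_bytes: int = 10 * MIB
    max_nested_depth: int = 32
    max_collection_items: int = 4096

    def validate(self) -> None:
        """Raise ValueError if any limit is zero, negative, or above its maximum."""
        for name, upper in LIMIT_MAXIMA.items():
            value = getattr(self, name)
            if not 1 <= value <= upper:
                raise ValueError(f"{name} must be between 1 and {upper}, got {value}")


ResourceLimits().validate()  # the defaults are within bounds
```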

✅ 6. Detection Range Validation

File: plugins_rust/pii_filter/src/masking.rs (lines 58-87)

Comprehensive validation of detection ranges before masking:

fn validate_detection_ranges(text: &str, detections: &[Detection]) -> Result<(), String> {
    for detection in detections {
        // ✓ start <= end
        // ✓ end <= text length
        // ✓ UTF-8 character boundaries
        // ✓ Overlapping range detection
    }
}

Impact: Prevents panics and memory safety issues during masking.
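
Since the Rust body above is shown only as a checklist, here is a hedged Python sketch of the same four checks on byte offsets. Detections are modelled as plain `(start, end)` tuples, and treating overlapping ranges as an error is an assumption of this sketch; the real implementation may handle them differently.

```python
def is_utf8_boundary(data: bytes, offset: int) -> bool:
    """True if `offset` does not fall inside a multi-byte UTF-8 sequence."""
    if offset == len(data):
        return True
    # Continuation bytes have the bit pattern 10xxxxxx.
    return (data[offset] & 0xC0) != 0x80


def validate_detection_ranges(text: str, ranges: list[tuple[int, int]]) -> None:
    """Raise ValueError for the first byte range that is unsafe to mask."""
    data = text.encode("utf-8")
    previous_end = 0
    for start, end in sorted(ranges):
        if start < 0 or start > end:
            raise ValueError(f"invalid range {start}..{end}: start must be <= end")
        if end > len(data):
            raise ValueError(f"range {start}..{end} exceeds text length {len(data)}")
        if not (is_utf8_boundary(data, start) and is_utf8_boundary(data, end)):
            raise ValueError(f"range {start}..{end} must align to UTF-8 boundaries")
        if start < previous_end:
            raise ValueError(f"range {start}..{end} overlaps an earlier detection")
        previous_end = end


# "é" occupies bytes 1..3, so 1..3 is valid while 2..3 splits the character.
validate_detection_ranges("héllo", [(0, 1), (1, 3)])
```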

Recommended Priority

Required Before Merge

[MEDIUM] Add nested quantifier check in patterns.rs
OR document that custom patterns are trusted-input only

Recommended Before Merge

[LOW] Document hash length change in changelog

Nice to Have

[LOW] Improve UTF-8 error message in masking.rs
[LOW] Fix test SSN generation to skip invalid ranges
[LOW] Add cumulative text size tracking (optional)

Testing Summary

Test Coverage Added

Test File	New Tests	Coverage
`test_pii_filter.py`	+226 lines	SSN edge cases, mask strategies, performance
`test_loopback_passthrough_headers.py`	+17 lines	Deny-path regression tests

Key Test Scenarios Covered

✅ Structurally impossible SSNs (000, 666, 900-999 area codes)
✅ BSN vs other 9-digit numbers
✅ Mask strategy regression (partial vs redact vs hash)
✅ AWS key detection edge cases
✅ Nested structure processing
✅ Large batch detection performance
✅ Loopback header filtering deny-paths

Security Assessment

Attack Vectors Addressed

Vector	Status	Mitigation
Header injection (loopback)	✅ Mitigated	Comprehensive header filtering
ReDoS (custom patterns)	⚠️ Partial	Length/quantifier limits, missing nesting check
DoS (oversized inputs)	✅ Mitigated	Text size, depth, collection limits
DoS (deep nesting)	✅ Mitigated	Max depth validation
False positives (SSN)	✅ Mitigated	SSA structural validation
False positives (generic IDs)	✅ Mitigated	Contextual detection

Remaining Concerns

Custom Pattern ReDoS (MEDIUM): Nested quantifiers not detected
- Mitigation: Add nesting depth validation OR document trusted-input assumption

Conclusion

This branch demonstrates strong security engineering with:

✅ Defense-in-depth header filtering
✅ Robust input validation
✅ Comprehensive test coverage
✅ Proper error handling
✅ SSA-compliant SSN validation
✅ Contextual PII detection to reduce false positives

Overall Verdict: APPROVE with minor fixes

Required Action: Address the nested quantifier validation gap (Issue #1) before merging, or explicitly document that custom patterns are trusted-input only.

dima-zakharov

Looks good

lucarlig · 2026-03-25T14:37:42Z

Addressed the review follow-ups on the branch.

What changed:

Documented the custom-pattern trust boundary in plugins_rust/pii_filter/src/patterns.rs and both PII filter READMEs instead of adding a nested-quantifier rejector. The branch uses Rust regex, so matching stays linear-time, and the remaining limits are now described as guardrails for trusted admin-authored patterns.
Improved the UTF-8 boundary error in plugins_rust/pii_filter/src/masking.rs to include the failing byte offsets and text length, and added a Rust unit test for that path.
Documented the 16-hex-character HASH mask output in the Rust and plugin READMEs so downstream parsers have an explicit migration note.
Fixed the large-batch Python performance test so the generated SSNs skip invalid SSA area codes instead of only patching 666.

Verification:

cargo test masking::tests --manifest-path plugins_rust/pii_filter/Cargo.toml
cargo test patterns::tests --manifest-path plugins_rust/pii_filter/Cargo.toml
cargo fmt --manifest-path plugins_rust/pii_filter/Cargo.toml --check
uv run maturin develop --release --manifest-path plugins_rust/pii_filter/Cargo.toml
uv run pytest tests/unit/mcpgateway/plugins/plugins/pii_filter/test_pii_filter.py::TestRustPIIDetectorSpecific::test_large_batch_detection -q

I did not add the optional cumulative nested-text budget tracking in this PR. That one is more invasive, and the existing per-string size/depth/collection limits remain unchanged.

source:
https://docs.rs/regex/latest/regex/#untrusted-input

dima-zakharov · 2026-03-25T15:08:24Z

I gave the page content to Cline with the IBM Sonnet 4.5 model; this is the resulting analysis:

CRITICAL CORRECTION: ReDoS Analysis for Rust Regex Crate

Summary: Traditional ReDoS Does NOT Apply to Rust's regex Crate

After reviewing the official Rust regex documentation, the ReDoS concern about nested quantifiers (Issue #1 in my review) is based on a MISUNDERSTANDING of how the Rust regex engine works.

Key Facts from Rust regex Documentation:

"This crate is meant to be able to run regex searches on untrusted haystacks without fear of ReDoS."
"This crate differs from most (but not all) other regex engines in that it doesn't use unbounded backtracking to run a regex search."
Guaranteed worst case O(m * n) time complexity where:
- m = size of regex (after expansion)
- n = length of haystack
Uses finite automata engines:
- Thompson NFA
- Lazy DFA
- One-pass DFA
- Bounded backtracker (not unbounded!)
- PikeVM
No catastrophic backtracking: The regex crate explicitly does NOT suffer from catastrophic backtracking that causes traditional ReDoS attacks.

What This Means:

❌ BUSTED: Traditional ReDoS Concerns

Patterns like (a+)+ or (a*)* that cause catastrophic backtracking in PCRE, JavaScript, Python, etc. DO NOT cause ReDoS in Rust's regex crate.

These patterns will:

Compile successfully ✓
Run in guaranteed O(m * n) time ✓
NOT hang or take exponential time ✓

✅ VALID: Size Limit Concerns

The RegexBuilder::size_limit and custom pattern validation in validate_custom_pattern() serve a DIFFERENT purpose:

Prevent exponential memory usage during compilation
- Pattern a{5}{5}{5}{5}{5}{5} expands to a{15625} which is huge
Keep m reasonable in O(m * n)
- A very large m (even with guaranteed O(m * n)) can still be slow
- But it's NOT exponential/catastrophic
Control compile times
- Regex compilation is O(m) but large m means longer compile time
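
The expansion figure above is plain multiplication: stacked counted repetitions multiply, so six levels of `{5}` give 5^6 copies. A tiny Python sketch of that arithmetic (the helper name `expanded_repetitions` is made up for this note):

```python
from math import prod


def expanded_repetitions(counts: list[int]) -> int:
    """Stacked counted repetitions such as a{5}{5} multiply: 5 * 5 copies of `a`."""
    return prod(counts)


# a{5}{5}{5}{5}{5}{5} is equivalent to a{15625}
print(expanded_repetitions([5] * 6))
```

This is why a short pattern can still produce a large m, and why a compile-time size limit remains useful even when matching time is linear in the haystack.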

dima-zakharov

Fixes Applied
- Documented custom pattern trust boundary in plugins_rust/pii_filter/src/patterns.rs and READMEs
- Improved UTF-8 boundary error message in plugins_rust/pii_filter/src/masking.rs
- Documented hash mask output length change in READMEs
- Fixed SSN generation in performance tests to skip invalid SSA area codes
- Correction: Rust regex engine does not suffer from traditional ReDoS attacks

Recommendation
Ready to merge

dawid-nowak

As a suggestion... From looking at the code, it seems that all pattern matching is executed one by one. It might be good to try to parallelize the execution with Rayon/parallel streams.

Assuming that Rayon is going to work with Python, that is.

lucarlig · 2026-03-27T10:32:08Z

> As a suggestion... From looking at the code, it seems that all pattern matching is executed one by one. It might be good to try to parallelize the execution with Rayon/parallel streams.
>
> Assuming that Rayon is going to work with Python, that is.

Given that the plugin currently runs in-process (and that's not changing for now), adding parallelism would compete directly with mcp-gateway for resources and would hurt its performance. I think we should consider both internal plugin parallelism and plugin-level parallelism (running multiple plugins in parallel), but I'd consider that out of scope for this PR.

ja8zyjits · 2026-03-27T13:06:47Z

The Python part looks good to me. We need to make sure we have some E2E tests so that other plugins won't fail.

brian-hussey · 2026-03-27T14:06:43Z

Using admin override to merge. PR approved by @dima-zakharov.

lucarlig requested review from araujof and terylt as code owners March 24, 2026 14:49

lucarlig added the bug (Something isn't working) label Mar 24, 2026

lucarlig requested a review from jonpspri as a code owner March 24, 2026 14:49

lucarlig added the security (Improves security) and rust (Rust programming) labels Mar 24, 2026

lucarlig requested a review from dima-zakharov as a code owner March 24, 2026 14:49

lucarlig added the plugins label Mar 24, 2026

lucarlig requested a review from crivetimihai as a code owner March 24, 2026 14:49

lucarlig force-pushed the fix/pii-filter-regression-tests branch 3 times, most recently from 3af63ae to b4b958b Compare March 24, 2026 15:34

lucarlig requested review from kevalmahajan and madhav165 as code owners March 24, 2026 15:58

dawid-nowak reviewed Mar 24, 2026

Comment thread plugins_rust/pii_filter/src/config.rs Outdated

dima-zakharov previously approved these changes Mar 25, 2026

lucarlig dismissed dima-zakharov’s stale review via 3ced268 March 25, 2026 14:23

lucarlig requested review from dawid-nowak and dima-zakharov March 26, 2026 08:30

lucarlig added the wxo (wxo integration) and release-fix (Critical bugfix required for the release) labels Mar 26, 2026

dima-zakharov previously approved these changes Mar 26, 2026

crivetimihai assigned dima-zakharov Mar 26, 2026

lucarlig added the release-fix (Critical bugfix required for the release) label and removed the release-fix (Critical bugfix required for the release) label Mar 26, 2026

lucarlig mentioned this pull request Mar 26, 2026

fix: tighten secrets detection coverage and add focused benchmarking #3764

Merged

lucarlig force-pushed the fix/pii-filter-regression-tests branch from 3ced268 to 6905c12 Compare March 26, 2026 17:44

lucarlig force-pushed the fix/pii-filter-regression-tests branch from 6905c12 to a7b9c7e Compare March 27, 2026 09:13

dawid-nowak reviewed Mar 27, 2026

lucarlig force-pushed the fix/pii-filter-regression-tests branch from a7b9c7e to 46786b2 Compare March 27, 2026 10:57

lucarlig added 16 commits March 27, 2026 11:27

fix(pii_filter): add mask strategy preservation and nested key support in Rust implementation (ffd00d0)
fix(pii_filter): add comprehensive detection patterns and ReDoS protection to Rust implementation (7b5f9fc)
fix(pii_filter): add input validation and error handling to Rust masking logic (5c34afc)
fix(pii_filter): add resource limits and config validation to Rust detector (865cee6)
fix(pii_filter): add comprehensive test coverage for Rust detection edge cases (655f690)
docs(pii_filter): document ReDoS protection limits and add resource limit upper bounds (b5ad872)
fix(pii_filter): resolve clippy warnings in Rust implementation (c31f18d)
fix(pii_filter): tighten rust ssn and limit validation (c0c7b3f)
fix(pii_filter): restore benchmark config build (40c43e1)
fix: raise default rust pii text limit (b8a3ff3)
docs: clarify rust pii filter coverage (9819d08)
fix: restore rust pii coverage edge cases (bb73fa3)
fix: harden loopback passthrough and ssn validation (38eddbc)
fix(pii_filter): address rust review follow-ups (afe0a57)
fix: address pii filter review follow-ups (deaa150)
fix: audit detect-secrets baseline for pii filter tests (cd6ed0f)

Each commit carries Signed-off-by: lucarlig <luca.carlig@ibm.com>

lucarlig dismissed dima-zakharov’s stale review via cd6ed0f March 27, 2026 11:31

lucarlig force-pushed the fix/pii-filter-regression-tests branch from 46786b2 to cd6ed0f Compare March 27, 2026 11:31

Merge branch 'main' into fix/pii-filter-regression-tests

89bcfc3

lucarlig requested review from dawid-nowak and dima-zakharov March 27, 2026 11:57

dima-zakharov approved these changes Mar 27, 2026

brian-hussey merged commit 8071c6d into main Mar 27, 2026
35 checks passed

brian-hussey deleted the fix/pii-filter-regression-tests branch March 27, 2026 14:06

brian-hussey merged 17 commits into main from fix/pii-filter-regression-tests

6 participants
