fix: tighten secrets detection coverage and add focused benchmarking#3764
fix: tighten secrets detection coverage and add focused benchmarking#3764
Conversation
d632d0e to
a3cf96f
Compare
|
Thanks @lucarlig. Good separation of concerns — moving secret patterns out of PII filter into the dedicated secrets detection plugin. Benchmark results show solid Rust improvement (~7% latency, ~29% at p99). |
|
we will be using rust plugin for 1.0 |
5078f2f to
e00c6e1
Compare
dcc2d48 to
eceae89
Compare
eceae89 to
ae63d74
Compare
f610059 to
38ccb4f
Compare
dima-zakharov
left a comment
There was a problem hiding this comment.
all blocking issues resolved. Broad heuristics like generic_api_key_assignment are properly disabled by default with clear warnings when enabled.
Recommendation
Ready to merge
38ccb4f to
1e7fc30
Compare
2cfeca1 to
cd6bde9
Compare
jonpspri
left a comment
There was a problem hiding this comment.
Review Summary
Clean separation of concerns — secret-style credential detection (AWS keys, API keys) is properly moved from the PII filter into the dedicated secrets_detection plugin. New provider-specific patterns (GitHub tokens, Stripe secret keys, generic API key assignment) are well-tested with both positive and negative cases across Python and Rust.
Key Positives
- Secure defaults:
is_enabled()now defaults tofalse(wastrue), and all broad heuristic patterns (jwt_like,hex_secret_32,base64_24,generic_api_key_assignment) default to disabled in code — matching the documented opt-in behavior. Productionconfig.yamlexplicitly enables the ones needed. - Config merging: The
_merge_enabled_patternsfield validator prevents partial YAML configs from silently enabling broad heuristics. - Python/Rust parity: All 11 patterns and their defaults are in sync across both implementations.
- Test coverage: 221 Python tests and 59 Rust secrets_detection tests pass, covering new patterns, config edge cases, Rust log bridge, fallback exception logging, and broad-pattern warnings.
- Rust logging:
pyo3-logbridge gives operators visibility into Rust-side scan behavior without separate log infrastructure.
Minor Notes (non-blocking)
- The e2e test changes (
test_mcp_session_isolation.py,test_mcp_access_matrix.py,mcp_test_helpers.py,fuzz/conftest.py) are tangential to secrets detection — consider splitting in future PRs for cleaner history. - Linters (
ruff,bandit,cargo test,bash -n) all clean on changed files.
cd6bde9 to
301cd05
Compare
Signed-off-by: lucarlig <luca.carlig@ibm.com>
Signed-off-by: lucarlig <luca.carlig@ibm.com>
Signed-off-by: lucarlig <luca.carlig@ibm.com>
Signed-off-by: lucarlig <luca.carlig@ibm.com>
Signed-off-by: lucarlig <luca.carlig@ibm.com>
Signed-off-by: lucarlig <luca.carlig@ibm.com>
Signed-off-by: lucarlig <luca.carlig@ibm.com>
The detection and redaction paths in the Python secrets plugin used different defaults for unknown patterns (False vs True), which could cause redaction of patterns that were never detected. Extract an is_enabled() method on both the Python and Rust config classes so the safe default (disabled) is defined in exactly one place. Also commits config.yaml changes that align with the code: - remove stale detect_aws_keys / detect_api_keys from PII filter config - set SecretsDetection mode to disabled (opt-in, not enforced by default) - default unknown patterns to disabled in Rust scanner Signed-off-by: Jonathan Springer <jps@s390x.com>
CLAUDE.md and DEVELOPING.md only listed Python linters in their "before committing" sections. Add make rust-check so developers and AI agents run clippy -D warnings before pushing Rust changes. Signed-off-by: Jonathan Springer <jps@s390x.com>
301cd05 to
8ad9c32
Compare
Summary
This updates secret handling so secret-like values are owned by the dedicated secrets detection plugin instead of the PII filter, strengthens provider-specific secret coverage, improves Rust plugin logging, adds a focused Rust-vs-Python benchmarking workflow for the secrets detector, and documents the broader intrinsic-shape heuristics that can still catch longer secret-like values.
What changed
SecretsDetectioninplugins/config.yamlbase64_24can still catch assignment-style values when the value itself looks secret-likeenabled:map so omitted heuristics do not silently turn back onpyo3_logtests/loadtest/locustfile_secret_detection.pymake load-test-secret-detection-compareandtests/loadtest/run_secret_detection_compare.shto compare the Rust-backed and forced Python fallback paths on the same focused workloadgitleaks:allowannotations to intentional secret-shaped test fixtures in the Python and Rust secrets detection tests and the focused secret-detection Locust file so secret scanning stays clean without changing detector behaviorValidation
uv run pytest tests/unit/plugins/test_secrets_detection.pycargo testinplugins_rust/secrets_detectioncargo testinplugins_rust/pii_filtermake rust-checkgitleaks detect --no-gitscans fortests/unit/plugins/test_secrets_detection.py,plugins_rust/secrets_detection/src, andtests/loadtest/locustfile_secret_detection.pymake test/rpcverification against the compose UI stack with the Rust plugin enabled for blocking and redaction scenariosEnd-to-End Benchmark
Focused secret-detection benchmark using the new Locust scenario against the full compose stack:
252.32 RPSvs Python236.70 RPS(+6.2%)333.33 msvs Python358.22 ms(-7.5%)520 msvs Python580 ms(-11.5%)590 msvs Python760 ms(-28.8%)Per-request-type improvements from the same focused run:
125.44 RPSvs Python118.04 RPS126.88 RPSvs Python118.65 RPSCloses #3741