[INS-468] Add improved lob detector to defaults.go #4971

Open
mustansir14 wants to merge 19 commits into main from ins-468-add-lob-detector-to-defaults-list

Conversation

@mustansir14 mustansir14 commented May 18, 2026

Contributor

Summary

The Lob detector existed in the codebase but was never registered in the default detector list in defaults.go. This PR adds it to the defaults and, after discovering via corpora testing that the original regex was too loose and produced significant noise, refactors the detector to be more precise and follow current practices.

Regex tightened to reduce noise (the core fix):

The original regex relied on a loose proximity-based prefix match against the word "lob" and matched any 40-character alphanumeric string:

PrefixRegex([]string{"lob"}) + `\b([a-zA-Z0-9_]{40})\b`

Corpora testing showed this was extremely noisy. Lob API keys have a well-defined format — they always begin with live_ or test_ — so the new regex anchors on that structure:

`\b((live|test)_[a-zA-Z0-9_]{35})\b`
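To illustrate the effect of the anchored pattern, here is a minimal, standalone sketch that applies the new regex quoted above to a few inputs. The names `keyPat`, `findKeys`, and `sampleKey` are hypothetical and are not taken from the detector source; the sample key is a made-up placeholder, not a real credential.

```go
package main

import (
	"fmt"
	"regexp"
)

// sampleKey is a fabricated placeholder: "test_" followed by 35 characters.
const sampleKey = "test_abcdefghijklmnopqrstuvwxyz012345678"

// keyPat is the tightened pattern from the PR description: a live_ or test_
// prefix followed by exactly 35 word characters, with word boundaries on
// both sides.
var keyPat = regexp.MustCompile(`\b((live|test)_[a-zA-Z0-9_]{35})\b`)

// findKeys returns every candidate key found in data, in order of appearance.
func findKeys(data string) []string {
	var keys []string
	for _, m := range keyPat.FindAllStringSubmatch(data, -1) {
		keys = append(keys, m[1]) // m[1] is the full key, m[2] is "live" or "test"
	}
	return keys
}

func main() {
	inputs := []string{
		"LOB_API_KEY=" + sampleKey,                             // prefixed key
		"lob token: abcdefghijklmnopqrstuvwxyz01234567890123", // bare 40-char string near "lob"
		"LOB_API_KEY=" + sampleKey + "x",                       // one character too long
	}
	for _, in := range inputs {
		fmt.Printf("%d match(es) in %q\n", len(findKeys(in)), in)
	}
}
```

Only the first input has the `live_`/`test_` structure at the right length; the bare 40-character string that the old proximity-based pattern would have accepted is not matched here.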

Keywords updated to match key prefix:

  • Before: ["lob"]
  • After: ["live_", "test_"]

This makes pre-filtering align with the actual key format rather than relying on a nearby context word.
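The difference between the two keyword sets can be sketched as follows. This assumes the pre-filter behaves like a case-insensitive substring check, which is a simplification standing in for the engine's actual keyword matcher; `hasKeyword` is a hypothetical helper, not a function from the codebase.

```go
package main

import (
	"fmt"
	"strings"
)

// hasKeyword reports whether chunk contains any of the keywords.
// Assumption: a lowercase substring check approximates the real pre-filter.
func hasKeyword(chunk string, keywords []string) bool {
	lower := strings.ToLower(chunk)
	for _, kw := range keywords {
		if strings.Contains(lower, kw) {
			return true
		}
	}
	return false
}

func main() {
	before := []string{"lob"}
	after := []string{"live_", "test_"}

	chunks := []string{
		"var global = 1",          // contains "lob" inside "global", no key
		"API_KEY=test_placeholder", // key-shaped prefix, no mention of "lob"
	}
	for _, c := range chunks {
		fmt.Printf("%q before=%v after=%v\n", c, hasKeyword(c, before), hasKeyword(c, after))
	}
}
```

Under this simplified model, the old keyword lets through unrelated text such as `global` and skips a chunk whose only signal is the key prefix, while the new keywords do the opposite.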

Additional improvements (following current detector practices):

  • Scanner struct now accepts an injectable *http.Client (via getClient() helper) to support test mocking without a global variable.
  • Package-level client renamed to defaultClient to avoid shadowing.
  • Verification logic extracted into a dedicated verify() method.
  • Verification endpoint changed from GET /v1/addresses to POST /v1/us_verifications. The old endpoint returns 401 Unauthorized both for invalid keys and for active keys with no billing method on file, making it impossible to distinguish between the two cases. The new endpoint returns 403 Forbidden for active keys with no billing method, allowing a correct verification signal. Status code handling:
    • 403 Forbidden → verified (active key, no billing method on file)
    • 422 Unprocessable Entity → verified (active key, request body is invalid — expected for an empty POST)
    • 401 Unauthorized → not verified
    • anything else → verification error
  • Duplicate matches are now deduplicated before result construction.
  • ExtraData field added to expose the key environment (live or test).
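The status code handling listed above can be sketched roughly as follows. This is an illustration under stated assumptions, not the PR's actual code: `classify`, the `verify` signature, and the `baseURL` parameter are hypothetical, the key is assumed to be sent as the HTTP basic-auth username with an empty password, and a local `httptest` server stands in for Lob's API so that no real request is made.

```go
package main

import (
	"context"
	"fmt"
	"net/http"
	"net/http/httptest"
)

// classify maps a response status to (verified, error) following the
// handling described in the PR: 403 and 422 mean the key is active,
// 401 means it is not, and anything else is a verification error.
func classify(status int) (bool, error) {
	switch status {
	case http.StatusForbidden, http.StatusUnprocessableEntity:
		return true, nil
	case http.StatusUnauthorized:
		return false, nil
	default:
		return false, fmt.Errorf("unexpected HTTP response status %d", status)
	}
}

// verify sends an empty POST to /v1/us_verifications using the supplied
// client, which is what makes the call mockable in tests.
// Assumption: the key goes in the basic-auth username field.
func verify(ctx context.Context, client *http.Client, baseURL, key string) (bool, error) {
	req, err := http.NewRequestWithContext(ctx, http.MethodPost, baseURL+"/v1/us_verifications", http.NoBody)
	if err != nil {
		return false, err
	}
	req.SetBasicAuth(key, "")
	resp, err := client.Do(req)
	if err != nil {
		return false, err
	}
	defer resp.Body.Close()
	return classify(resp.StatusCode)
}

func main() {
	// Stub server: one placeholder key is treated as active, all others as invalid.
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		user, _, _ := r.BasicAuth()
		if user == "test_active" {
			w.WriteHeader(http.StatusUnprocessableEntity) // empty body on an active key
			return
		}
		w.WriteHeader(http.StatusUnauthorized)
	}))
	defer srv.Close()

	for _, key := range []string{"test_active", "test_revoked"} {
		ok, err := verify(context.Background(), srv.Client(), srv.URL, key)
		fmt.Println(key, ok, err)
	}
}
```

Passing the client and base URL as parameters mirrors the intent of the injectable `*http.Client` bullet: the same `verify` logic can run against a stub in unit tests and against the real endpoint in production.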

Gating behind feature flag

Since this is considered a new detector addition, it is gated behind a feature flag. This is why the PR is based on #4969, which contains some required plumbing for this.

Checklist:

  • Tests passing (make test-community)?
  • Lint passing (make lint; this requires golangci-lint)?

Note

Medium Risk
Enabling a previously unlisted default detector changes scan coverage when the flag is on, and verification performs live HTTP calls to Lob’s API with revised status semantics.

Overview
Registers the Lob detector in the default engine list and gates it behind LobDetectorEnabled, with OSS startup enabling that flag alongside other new detectors.

The Lob scanner is refactored for lower noise and clearer verification: the key regex now requires live_ / test_ prefixes instead of a loose "lob" proximity match, keywords and unit tests follow that format, matches are deduplicated, and ExtraData.environment is set. Verification moves to POST /v1/us_verifications with explicit status handling (403/422 → verified, 401 → not), plus an injectable HTTP client and a dedicated verify() helper.

Reviewed by Cursor Bugbot for commit 0c29d48. Bugbot is set up for automated code reviews on this repo.

github-actions Bot commented May 18, 2026


Corpora Test Results

Scans a corpus of real-world public code against only the detectors changed in this PR, then compares unique match counts between the PR build and the main baseline to catch regex regressions. Verification is disabled — each detector's regex is measured independently.

1 new · 0 clean  |  Scoped to: lob

Status Detector Unique matches (main) Unique matches (PR) New Removed
🆕 lob 0
  • 🔴 regression: >5 new, >20% increase over main, or any removed
  • ⚠️ warning: 1–5 new and ≤20% increase over main
  • ✅ clean
  • 🆕 new detector (no baseline)

@mustansir14 mustansir14 marked this pull request as ready for review May 19, 2026 08:32
@mustansir14 mustansir14 requested review from a team May 19, 2026 08:32
@mustansir14 mustansir14 changed the title [INS-468] Add lob detector to defaults.go [INS-468] Add improved lob detector to defaults.go May 19, 2026
Comment thread pkg/detectors/lob/lob.go
Comment thread pkg/detectors/lob/lob.go
@mustansir14 mustansir14 force-pushed the ins-468-add-lob-detector-to-defaults-list branch from d67397f to 5d7fa71 Compare May 21, 2026 10:04
@mustansir14 mustansir14 changed the base branch from main to ins-465-add-datadogapikey-detector-to-defaults May 21, 2026 10:05
@mustansir14 mustansir14 added the review/product-eng Team integrations reviewed, awaiting product-eng review label May 22, 2026
Base automatically changed from ins-465-add-datadogapikey-detector-to-defaults to main June 9, 2026 05:49
@mustansir14 mustansir14 requested review from a team as code owners June 9, 2026 05:49
@mustansir14 mustansir14 requested a review from a team June 9, 2026 05:49
Labels

review/product-eng Team integrations reviewed, awaiting product-eng review

4 participants