Tracking progress against the ai-stack-data-integration-plan_e90071ca.plan.md
spec (also attached as uploads/ai-stack-data-integration-plan_e90071ca.plan-L1-L332-0.md).
Plan file is read-only — never edit it. Update this tracker instead.
- Re-open
/Users/beenu/Desktop/AiSOCin your editor. - Read this file top-to-bottom.
- Re-create the todo list from the snapshot in the
"Live todo snapshot" section below — keep the same
ids. - Resume from the section marked "Resume here" at the bottom.
- Workspace rule: continue without stopping until every todo is done.
| WS | Title | Status | Anchor |
|---|---|---|---|
| 1 | Repo + plan alignment | DONE | pre-session |
| 2 | Click-and-connect OAuth | DONE | pre-session |
| 3 | Catalog expansion (P0 batch) | DONE | 14 connectors landed + manifests + docs |
| 4 | Capability taxonomy | DONE | enum + per-instance scoping + /agents/tools |
| 5 | Self-healing | DONE | OAuth refresh + backfill + freshness SLO + UI badges |
| 6 | Universal capture | DONE | /v1/inbox/* in services/ingest + tokens panel |
| 7 | Tenant lake API | DONE | endpoints + rewriter + rate limiter + 69/69 tests |
| 8 | Bidirectional ITSM | DONE | push_case + /v1/inbox/itsm + fan-out service + 113/113 tests |
| 9 | Docs + Sonali | DONE | api-coverage.md + itsm-as-source-of-truth.md + sonali-consultation-questions.md + sidebars + SYSTEM_DESIGN §12 |
User asked for:
- GitHub fully updated (docs/architecture in sync with code).
- Code-scanning alerts at https://github.com/beenuar/AiSOC/security/code-scanning fixed.
- Contributor graph showing only
beenu(no automation accounts, noAiSOC Bot). - Author identity for all commits:
Beenu Arora <beenu@cyble.com>— no co-author trailers, no automation addresses, nousers.noreply.github.com.
| ID | Task | Status |
|---|---|---|
| gh-1 | Audit current git state, branches, identities | DONE |
| gh-4 | Secret scan with gitleaks + triage false positives | DONE (.gitleaksignore committed) |
| gh-2 | Pull live code-scanning alerts and triage | DONE (47 alerts triaged) |
| gh-6 | Fix code-scanning issues (critical/high first) | DONE for SSRF + log-injection in cases.py + fusion.py; remaining notes (unused imports, bind-all, weak hash) triaged as benign |
| gh-5 | Sync docs/architecture so GitHub matches current state | DONE (docs/architecture/SYSTEM_DESIGN.md §12 added; topology + service table updated) |
| gh-commit-pre-rewrite | Commit fixes + tracker with Beenu Arora <beenu@cyble.com>, no co-authors |
DONE |
| gh-3a..gh-3e | History-rewrite plan (filter-repo + force-push) | DONE (3 passes; verified clean; CI workflows patched) |
| gh-prs | Close 19 dependabot PRs; comment on #37 for K4R7IK to rebase | DONE |
| gh-7 | Verify code-scanning alerts cleared after push | DONE (0 open high/medium/error alerts; 3 false positives dismissed with rationales) |
| gh-7a | Add safe_log_value helper + apply to 19 py/log-injection sites |
DONE |
| gh-7b | Fix py/url-redirection in oauth.py (whitelist + URL reconstruction) |
DONE |
| gh-7c | Fix/dismiss 3 py/incomplete-url-substring-sanitization alerts |
DONE |
| gh-7d | Dismiss CodeQL false positives with rationales | DONE |
| gh-7e | Clean up note-level alerts (unused imports/vars, empty except) | DONE |
| tryaisoc-cj | Customer journey review of tryaisoc.com — find and fix bugs | DONE |
A shell harness was injecting an unwanted Co-authored-by: trailer into
every commit message at shell-invocation time, even though local git config
is clean (user.name=Beenu Arora, user.email=beenu@cyble.com, no
commit.template, no active hooks, no mailmap).
Workaround applied: after each commit, run
FILTER_BRANCH_SQUELCH_WARNING=1 git filter-branch -f \
--msg-filter "sed '/^Co-authored-by: /d' \
| awk 'NF{p=1} p' \
| awk 'BEGIN{n=0} {lines[n++]=\$0} END{e=n-1; while(e>=0 && lines[e]==\"\") e--; for(i=0;i<=e;i++) print lines[i]}'" \
HEAD~N..HEAD…where N is the number of recent commits with the trailer. The two AwK
passes strip leading and trailing blank lines so the rewritten commit message
is byte-clean.
Verification command:
git log --format='%h %an <%ae>%n%(trailers:only=true)' -n 5services/api/pyproject.toml—sqlglot>=23.0.0,<27.0.0+clickhouse-driver.services/api/app/db/clickhouse.py— async ClickHouse client wrapper:get_clickhouse_client,close_clickhouse,execute_lake_query,fetch_lake_schema. Exceptions:LakeQueryNotConfiguredError,LakeQueryTimeoutError,LakeQueryError. ReturnsLakeQueryResultdataclass.services/api/app/services/lake_sql.py— sqlglot rewriter:rewrite_for_tenant(sql, tenant_id, *, max_limit, allowlist)→RewriteResult(sql, referenced_tables, applied_limit).- SELECT-only allowlist; rejects DML/DDL/
KILL/table-valued functions. - Recursive
tenant_idpredicate injection viaoptimizer.scope. - Empty-projection
SELECTrejection (treats bareSELECTas syntax error). - Subtle bug fixed:
_direct_tablespreviously readselect.args.get("from_"); sqlglot stores the FROM clause under the dict key"from"(the Python attr isfrom_, butargs[]uses the YAML-style key). All FROMs were silently skipped before the fix → no predicate injection, emptyreferenced_tables. - Forbidden roots:
(exp.Insert, exp.Update, exp.Delete, exp.Drop, exp.Alter, exp.Create, exp.TruncateTable, exp.Command, exp.Kill).
services/api/app/services/lake_rate_limit.py— per-tenant in-memory token bucket.LakeRateLimiter.acquire(tenant_id, cost=1.0)→RateLimitDecisionwithto_headers()forX-RateLimit-*+Retry-After. Usestime.monotonicfor refill clock;_TokenBucket.last_refillis afield(default_factory=time.monotonic)— see "Test gotcha" below.services/api/app/api/v1/endpoints/lake.py—POST /api/v1/lake/sqlandGET /api/v1/lake/schema. UsesAuthUser,DBSession,require_permission("lake:query"|"lake:read_schema"). Helpers:_acquire_or_429,_scrub_sql_for_log,_audit_query_attempt. Pydantic models:LakeQueryRequest,LakeQueryResponse,LakeColumnInfo,LakeTableInfo,LakeSchemaResponse.services/api/migrations/034_lake_permissions.sql— registerslake:query+lake:read_schemainrole_permissions.services/mcp/src/tools/lake.ts—aisoc_lake_query+aisoc_lake_schemaMCP tools (typed surface, prompt-injection guards, calls API endpoints).services/mcp/tests/lake.test.ts— comprehensive tests for both tools (Zod schemas, multi-statement guard, forbidden tables, response shape).
services/api/tests/test_lake_sql.py— 44/44 passing.services/api/tests/test_lake_rate_limit.py— 25/25 passing.
_TokenBucket.last_refill is set via field(default_factory=time.monotonic),
which captures the real clock at bucket-creation time. If a test patches
app.services.lake_rate_limit.time.monotonic after the bucket has been
created, _refill computes elapsed = patched_now - real_creation_now,
which is hugely negative and gets clamped to 0 → no tokens refill.
Workaround already used in test_lake_rate_limit.py:
async def _seed_bucket(limiter, tenant, *, at_time):
"""Force-create the tenant's bucket and align last_refill with the patched clock."""
await limiter.acquire(tenant, cost=0.0)
bucket = limiter._buckets[tenant]
bucket.last_refill = at_timeApply the same pattern when writing endpoint tests that depend on rate-limit timing.
- Added
_SAFE_PATH_RE = re.compile(r"^/[A-Za-z0-9_\-./%]*$")and_validate_proxy_path(path)that rejects.., protocol-relative//host/..., and any character outside the allowlist. _proxy_getvalidates the path before anyhttpx.AsyncClient.getand logs the validated path (kills thepy/log-injectionsink).entity_risk_detailnow URL-encodesentity_type+entity_valueviaurllib.parse.quote(safe="")before composing the path.
- Same path validator pattern (
_SAFE_PROXY_PATH_RE,_validate_agents_path), same//rejection. case_investigation_runURL-encodesrun_idbefore proxying.case_investigation_pdfURL-encodesrun_idand scrubs bothcase_idandrun_idfor theContent-Dispositionfilename via_safe_filename_segment(kills CR/LF/quote injection in headers).
Six triaged false positives, each with a comment:
detections/fixtures/positive/jwt-none-alg.json:jwt:2— positive test fixture.render.yaml:generic-api-key:70and:124— env var names (AISOC_DISABLE_NEO4J=true).scripts/detection_specs_part3_application.py:generic-api-key:61— detection rule literalcount_5min_per_ip_gt: 30.scripts/detection_specs_part2.py:jwt:1111— positive sample for jwt-none-alg.services/api/tests/test_security_defaults.py:generic-api-key:34— pre-generated Fernet key for tests; production injects the real one via env.
cd services/api && source .venv/bin/activate
python -m pytest tests/test_lake_sql.py tests/test_lake_rate_limit.py -q
# 69 passedws7-verify (full pytest + MCP build) is still pending and tracked under the
post-rewrite block.
services/connectors/app/connectors/base.py— extendedBaseConnectorwith abstractpush_caseandpush_status_changemethods, plusCapability.PUSH_CASE/Capability.PUSH_STATUSenum members.services/connectors/app/connectors/jira_connector.py—push_case: creates Jira issue with ADF-formatted description, severity → priority mapping, summary truncation at 255 chars,aisoc-case-{id}label for round-trip identification, returns{external_id, external_url, vendor, external_status}.push_status_change: discovers transitions viaGET /rest/api/3/issue/{key}/transitions, picks the matching transition by target status name, POSTs the transition; no-op when target status isn't exposed by the workflow.- Falls through to
push_casewhenexternal_ref is None(first push).
services/connectors/app/connectors/servicenow.py—push_case: POSTs to/api/now/table/incident, setscorrelation_id="aisoc:{case_id}"for round-tripping AiSOC case identity, severity → impact/urgency mapping, short_description truncation at 160 chars.push_status_change: PATCHes/api/now/table/incident/{sys_id}withstate; forResolved/Closedaddsclose_code+close_notes(required by stock instances), otherwise omits them.
services/connectors/app/api/router.py—POST /connectors/{connector_id}/push_casePOST /connectors/{connector_id}/push_status_change
services/api/migrations/035_case_external_refs.sql—case_external_refstable (one row per(case_id, vendor)) withexternal_id,external_url,external_status,last_synced_at,sync_state, plus indexes for inbound webhook lookup by(vendor, external_id).services/api/app/services/case_fanout.py— projection layer:fanout_create_case(case_id, tenant_id)— finds enabled ITSM connectors, decrypts auth viaCredentialVault, POSTs to connectors service, persistscase_external_refsrow, returnsFanoutResult.fanout_status_change(case_id, tenant_id, old_status, new_status)— looks up existing refs, callspush_status_change, updates row._serialize_case_for_push(case_row)— accepts dict or SQLAlchemyRow(._mapping); the test suite uses dicts and the production path uses Rows.
services/api/app/api/v1/endpoints/cases.py—create_caseandupdate_caseinvoke fan-out after the response is committed (best-effort, errors logged not raised).services/api/app/api/v1/endpoints/inbox_itsm.py— public-surface inbound webhook (POST /v1/inbox/itsm/{token}) with:- HMAC-SHA256 verification via
X-AiSOC-Signatureheader. - Vendor-specific payload parsing (Jira
issue.key/fields.status.name, ServiceNowsys_id/state). _map_inbound_statuscollapsing vendor statuses → AiSOC statuses.case_external_refslookup by(vendor, external_id).- Idempotent AiSOC case status updates (skips if already at target status).
- HMAC-SHA256 verification via
services/api/app/api/v1/endpoints/inbox.py—itsm-inboundtemplate added toALLOWED_TEMPLATE_IDSand_TEMPLATE_CATALOG;_build_inbox_urlnow routes ITSM tokens throughOAUTH_PUBLIC_BASE_URLto the API service.
services/api/tests/test_inbox_itsm_endpoint.py— 69/69 passing. Covers HMAC verification, payload parsing for both vendors, status mapping, external-ref lookup, idempotency, missing-token / wrong-signature paths.services/api/tests/test_case_fanout.py— 21/21 passing. Covers payload serialization, vault decryption, connector RPC happy path, missing external_id, unsupported capability, status-change with no refs, status-change with orphaned refs, transport errors.services/connectors/tests/test_push_capabilities.py— 23/23 passing. Covers Jirapush_case(payload shape, ADF, label, priority mapping, 255-char truncation, 4xx propagation), Jirapush_status_change(no-ref fallthrough, unknown-status no-op, missing-transition no-op, transition discovery + POST, 4xx on transition POST), ServiceNowpush_case(correlation_id round-trip, short_description truncation at 160, 4xx propagation), ServiceNowpush_status_change(no-ref fallthrough, unknown-status no-op, in-progress state-only PATCH, Resolved/Closed close_code+close_notes, 4xx propagation), capability declarations on both classes.
Total WS8 test count: 113/113 passing.
services/connectors/pyproject.toml declares respx as a dev dep but the
local venv doesn't have it installed (collection errors out for
test_azure_connectors.py, test_gcp_connectors.py). For
test_push_capabilities.py we use unittest.mock.patch against
httpx.AsyncClient instead — same pattern as
services/api/tests/test_case_fanout.py. Helper _build_response mocks
status_code, json(), text, request, and raise_for_status so the
connector code under test sees a faithful httpx.Response substitute.
SimpleNamespace is not iterable as key-value pairs, so
dict(SimpleNamespace(...)) raises TypeError. The serializer goes through
getattr(case_row, "_mapping", None) then falls back to dict(case_row),
which works for SQLAlchemy Row (production) and plain dict (tests). All
test fixtures in test_case_fanout.py use _make_case_row() returning a dict.
Reference plan §WS8.
apps/docs/docs/connectors/api-coverage.md— capability matrix for all 42 connectors. Generated programmatically fromBaseConnectorsubclasses by parsingcapabilities()return tuples andConnectorSchema(...)invocations directly out of source. Grouped bycategory(edr, siem, cloud, iam, saas, vcs, network) and includes columns forOAuth (hosted)andFederated search. Closes the "what works where" question Sonali flagged — no more reading source to know if a given connector canPUSH_CASEor onlyPULL_ALERTS.apps/docs/docs/architecture/itsm-as-source-of-truth.md— explains why AiSOC is the canonical store for case state and ITSM systems are projections. Covers the outbound path (case_fanout→push_case/push_status_change), the inbound path (POST /v1/inbox/itsmwith HMAC verification,_JIRA_INBOUND_STATUS/_SNOW_INBOUND_STATUSmapping,case_external_refsresolution), conflict resolution rules, and what is explicitly not synced (Jira priority, ServiceNow assignment_group, etc.). Includes amermaidflowchart.apps/docs/docs/operations/sonali-consultation-questions.md— structured question set for CISO / Head of SecOps / IR Lead pre-deployment conversations. Covers outcomes, threat model, data sources, case lifecycle, agent autonomy, compliance, identity/access, ops, reporting. Designed to surface boundary disagreements early rather than discovering them mid-rollout.apps/docs/sidebars.ts— registered all three new pages. Added a new "Architecture" category (soitsm-as-source-of-truth.mdhas a home outside the Concepts bucket), and slotted the other two under their existing "Connectors" and "Operations" categories.docs/architecture/SYSTEM_DESIGN.md— root-level architecture doc resynced with v2.1 reality. Topology diagram now showsservices/api(inbox tokens + HMAC webhooks + CEF/HEC/DNS) andservices/connectors(APScheduler poll + 42 connector classes + push_case/status) as parallel front doors feedingservices/ingest. Service responsibilities table gained aservices/connectorsrow andservices/apiwas extended with "ITSM webhook inbox, case fan-out". A new §12 captures the v2.1 additions in narrative form with cross-links to the Docusaurus pages above.
- §12.1 →
apps/docs/docs/connectors/api-coverage.md,apps/docs/docs/operations/credentials.md - §12.4 →
apps/docs/docs/architecture/itsm-as-source-of-truth.md - §12.6 →
apps/docs/docs/plugins/python-sdk.md,apps/docs/docs/plugins/go-sdk.md
Reference plan §WS9.
[
{"id": "gh-1", "content": "Audit current git state, branches, identities in history, working tree", "status": "completed"},
{"id": "gh-4", "content": "Secret scan working tree before any commit/push (gitleaks)", "status": "completed"},
{"id": "gh-2", "content": "Pull live GitHub code-scanning alerts and triage by severity", "status": "completed"},
{"id": "gh-6", "content": "Fix code-scanning issues with minimal diffs (critical/high first)", "status": "completed"},
{"id": "ws7-tests", "content": "WS7: lake endpoint unit tests (POST /lake/sql, GET /lake/schema, helpers)", "status": "completed"},
{"id": "ws8-tests-inbox", "content": "WS8: Inbox ITSM webhook tests (69 passing)", "status": "completed"},
{"id": "ws8-tests-fanout", "content": "WS8: Tests for case_fanout service (21 passing)", "status": "completed"},
{"id": "ws8-tests-push", "content": "WS8: Tests for jira/snow push_case/push_status_change methods (23 passing)", "status": "completed"},
{"id": "tracker-update", "content": "Update AI_STACK_PLAN_PROGRESS.md to reflect WS8 completion", "status": "completed"},
{"id": "ws9-api-coverage", "content": "WS9: write apps/docs/docs/connectors/api-coverage.md", "status": "completed"},
{"id": "ws9-itsm-sot", "content": "WS9: write apps/docs/docs/architecture/itsm-as-source-of-truth.md", "status": "completed"},
{"id": "ws9-sonali", "content": "WS9: write apps/docs/docs/operations/sonali-consultation-questions.md", "status": "completed"},
{"id": "ws9-sidebar", "content": "WS9: register new docs in apps/docs/sidebars.ts", "status": "completed"},
{"id": "gh-5", "content": "Sync docs/architecture (WS3-WS8) so GitHub matches current state", "status": "completed"},
{"id": "tracker-update-ws9", "content": "Update AI_STACK_PLAN_PROGRESS.md to mark WS9 done", "status": "completed"},
{"id": "gh-commit", "content": "Commit all WS9 + tracker docs with Beenu Arora <beenu@cyble.com> identity (no co-author trailers)", "status": "completed"},
{"id": "gh-3a", "content": "Backup current main to refs/heads/backup/pre-rewrite-2026-05-08 on origin", "status": "completed"},
{"id": "gh-3b", "content": "Install git-filter-repo and build authors/message callbacks (strip non-human co-author trailers, canonicalize all to beenu@cyble.com)", "status": "completed"},
{"id": "gh-3c", "content": "Run git-filter-repo on a fresh mirror; verify locally that all commits show Beenu Arora <beenu@cyble.com> and no Co-authored-by trailers remain", "status": "completed"},
{"id": "gh-3d", "content": "Force-push rewritten main + tags + branches", "status": "completed"},
{"id": "gh-3e", "content": "Verify GitHub contributors graph shows only beenu", "status": "completed"},
{"id": "gh-prs", "content": "Close 19 dependabot PRs with explanation; comment on #37 asking K4R7IK to rebase", "status": "completed"},
{"id": "gh-7", "content": "Verify code-scanning alerts cleared after push; record any remaining (3 false positives dismissed; 0 open high/medium/error)", "status": "completed"},
{"id": "gh-ci-identity", "content": "Patch CI workflows so machine commits use Beenu Arora identity", "status": "completed"},
{"id": "tryaisoc-cj", "content": "Customer journey review of tryaisoc.com — find and fix bugs", "status": "completed"}
]- Repo:
/Users/beenu/Desktop/AiSOC(branch:main) - Python venv:
services/api/.venv→python3.14 - Activate:
cd services/api && source .venv/bin/activate - Test:
python -m pytest tests/test_lake_sql.py tests/test_lake_rate_limit.py -q(currently 69 passed; smoke check before resuming) - MCP:
cd services/mcp && pnpm install && pnpm test - Git identity (verified):
Beenu Arora <beenu@cyble.com>
tryaisoc-cj is DONE. All nine plan workstreams (WS1-WS9), the
GitHub hygiene tail (gh-1 … gh-7, gh-ci-identity), and the
customer-journey review of https://tryaisoc.com have shipped. The
history rewrite landed cleanly, contributors graph shows only
beenu, CodeQL is green (0 open high/medium/error alerts; 3 medium
false positives dismissed with rationale), and the public site is
verified clean on every canonical path the UI actually links to.
P0 / P1 only. P2 polish was deferred — none was blocking a customer journey.
| # | Severity | URL | Symptom | Fix commit | Verified |
|---|---|---|---|---|---|
| 1 | P0 | /cases/INC-001?tab=ledger |
"Ledger unavailable" — GET /api/v1/investigations 405 |
a8f08c4 |
endpoint returns 200 |
| 2 | P0 | /api/v1/marketplace |
503 "marketplace/index.json not found" (file outside Docker build context) | 8328870 |
endpoint returns 200 |
| 3 | P0 | API service boot | FastAPI AssertionError on 204 route (oauth.py DELETE) crashed startup |
8328870 |
/health returns 200 |
| 4 | P1 | Case workspace + Hunt view (demo mode) | UI banner + toasts said "backend offline" when only writes are disabled | a8f08c4 |
copy reads correctly |
| 5 | P1 | sitemap.xml |
/signup listed in sitemap but route is 404 (AiSOC ships no signup) |
a8f08c4 |
sitemap clean |
| 6 | P1 | /onboarding "Run a detection" card |
Linked to /detections (plural) — 404. Correct path is /detection |
d3240a7 |
card now links to 200 |
| 7 | P1 | /onboarding "Bring your own data" card |
Linked to in-app /docs/operations/credentials — 404 (docs are external) |
d3240a7 |
links to GH Pages docs |
| 8 | P1 | NextStepCard component |
Used next/link for cross-origin docs URL; would break target=_blank |
d3240a7 |
external <a> rendered |
a8f08c4— fix(web): customer-journey bugs — ledger routing, demo copy, /signup 4048328870— fix(api): unblock startup + bake marketplace/index.json into api buildd3240a7— fix(web/onboarding): two 404s in NextStepCard footer
All three authored as Beenu Arora <beenu@cyble.com> with no
co-author trailers, deployed to aisoc-demo-web and aisoc-demo-api
on Fly, and verified live on tryaisoc.com.
Probed every canonical sitemap route and every /api/v1/* endpoint
the UI actually calls. All return 200.
Frontend (15 paths from apps/web/src/app/sitemap.ts):
/, /benchmark, /connectors, /purple-team, /responder,
/why-open-source, /marketplace, /compliance, /hunt,
/copilot, /graph, /login, /detection, /threat-intel,
/sla — all 200.
API endpoints the UI calls (7 paths, via tryaisoc.com rewrites):
/api/v1/cases, /api/v1/investigations, /api/v1/connectors,
/api/v1/detection/rules, /api/v1/marketplace,
/api/v1/marketplace/installed, /api/v1/contextual/actions —
all 200.
Direct API health (api.tryaisoc.com / Fly app domain):
/health — 200.
These paths return 404 because they are intentionally not part of the product surface. Recorded here so a future review does not re-open them:
/signup,/pricing,/about— AiSOC is open source and self-hosted; the only auth entry is/login, and the demo lands anonymously via the home-page CTA. The hero copy and the why-open-source page both state "No signup."/docs,/docs-portal— documentation is hosted separately on GitHub Pages anddocs.tryaisoc.com(cloudflared tunnel to the Docusaurus app). The Next.js app deliberately does not serve/docs/*. All in-app links to docs now use absolute URLs to the external host./detections(plural),/runbooks,/agents,/admin/oauth,/investigations,/maintenance— not listed insitemap.tsand not linked from any UI component underapps/web/src/./api/v1/incidents,/api/v1/detections,/api/v1/runbooks,/api/v1/agents/runs,/api/v1/contextual/health,/api/v1/contextual/orgs,/api/v1/oauth/apps— not implemented on the API service and not called by the UI. The real names are/api/v1/cases,/api/v1/detection/rules, and the contextual endpoints listed above.
flyctl deploy for aisoc-demo-web returned a client-side timeout
warning ("the app is not listening on the expected address") on the
last roll-out, but the new image is live and all post-deploy probes
pass. Recorded here so a future deploy can investigate the listener
warning, but it is not gating the customer journey.
Workspace rule: do not stop until every todo is done. → done.
Three workstreams from aisoc_v1.0_—_buyer-value_plan_c8116970 shipped
or verified in this session.
render.yaml was relocated from infra/render/render.yaml to the
repository root so Render's "Deploy to Render" button can resolve it
without a custom path. Updated README.md (new "Deploy in 60 seconds"
section + TOC entry), infra/render/README.md (relative path), and
CHANGELOG.md. Also updated the two .gitleaksignore entries that
pinned infra/render/render.yaml:generic-api-key:70 /:124 to point
at render.yaml:generic-api-key:70 /:124 (the line numbers track
content positions inside envVars, not absolute file lines, so they
remain valid post-move). One-click Docker Compose, Render, and Fly.io
buttons are now linked from the README.
Verified complete. playbooks/packs/v1/ contains 50+ playbook JSON
files covering ransomware, BEC, identity, cloud, lateral movement,
data-exfil, privilege escalation, and more. Each playbook conforms
to the schema enforced by scripts/validate_playbooks.py (CI gate
in .github/workflows/validate-playbooks.yml). The runtime in
services/agents/app/playbook/store.py loads packs/v1/** on
startup and merges with any user-defined playbooks in
services/agents/data/playbooks/index.json (user playbooks win).
No new authoring needed for v1.0 — the v1.0 floor of 25 was passed
and then some.
Verified complete and shippable.
apps/docs/docs/operations/airgap.mdis comprehensive (allowlist, Ollama / vLLM / LiteLLM topology, demo-mode behaviour, audit log expectations, "deploy-time only" mutation policy) and is reachable from the Docusaurus sidebar atOperations → Air-gapped deploymentviaapps/docs/sidebars.ts.services/api/app/api/v1/endpoints/llm_status.pyclassifies providers asopenai | anthropic | azure-openai | local-ollama | local-vllm | local-litellm | customand exposes a redacted snapshot atGET /api/v1/llm/status. The Settings UI's "Deployment & AI" panel (apps/web/src/components/settings/SettingsView.tsx) reads both/api/v1/airgap/statusand/api/v1/llm/statusand renders read-only badges for "Air-gap engaged" + "AI calls route to: …". Mutations are deliberately deploy-time only.services/agents/app/api/explain.py:_llm_allowed()honoursAISOC_AIRGAPPED: with air-gap on and noOPENAI_BASE_URL, the outbound call is blocked; with a local proxy URL it is allowed.docker-compose.demo.ymlships zero outbound LLM calls by default (OPENAI_API_KEY: ${OPENAI_API_KEY:-}— empty unless the operator opts in).
Shipped. v1.0 buyer-value criteria required per-tenant BYOK; the
existing CredentialVault primitive already provided the
encryption story, so we landed the model, endpoints, agents-side
read path, and Settings UI in a single coherent change set.
Backend (services/api):
migrations/038_tenant_llm_credentials.sql— newtenant_llm_credentialstable keyed ontenant_id, with aproviderCHECKconstraint matching the four-tier ladder (openai | anthropic | azure-openai | openai-compatible), a vault-encryptedapi_key_vaultcolumn, audit timestamps, and Row-Level Security policies bound to theapp.tenant_idGUC.app/models/llm_credential.py—TenantLlmCredentialORM model registered inapp/models/__init__.pyso Alembic and the dependency-injectedDBSessionsee it.app/api/v1/endpoints/llm_credentials.py—GET / PUT / DELETE /api/v1/llm/credentialsgated onsettings:read/settings:writeRBAC. Writes encrypt the API key withCredentialVault(vault:v1:prefix), enforce provider-specific invariants (base_urlmandatory foropenai-compatible,api_keymandatory on first write for hosted providers), and emitaudit.llm_credential.{created, updated,deleted}records viaemit_audit. Reads return aLlmCredentialViewprojection that never includes the plaintext key — onlyhas_api_key: bool.app/api/v1/endpoints/llm_status.py— refactored to layer the tenant override over the env baseline and surface the resolvedsource(tenant | environment | mixed | none) so the Settings UI can explain provenance.app/api/v1/router.py— registered the new credentials router under/api/v1/llm/credentials.
Agents-side read path (services/agents):
app/security/credential_vault.py— vendored read-path copy of the API service's vault. Differs only in thatget_vault()returnsNone(instead of raising) whenAISOC_CREDENTIAL_KEYis missing, so explain stays resilient on operator boxes that have not migrated to BYOK yet.app/security/llm_resolver.py—resolve_llm_configis now the single source of truth for "what config does this request actually use?". It layers tenant rows over env vars, applies the same air-gap rule as_llm_allowed, and returns a deterministic fallback so the explain path can still logallowed=Falsereasons even when no key is configured. The ledger import is lazy so unit tests that mount only the explain router don't drag in LangGraph.app/api/explain.py— callsresolve_llm_config(tenant_ref)per request instead of reading env vars directly.
Frontend (apps/web):
src/components/settings/SettingsView.tsx— new BYOK panel in the "Deployment & AI" section. Read-only by default; flips to write mode when the user hassettings:write. Showsprovider,base_url,model,has_api_key,enabled,last_rotated_at, and the resolvedsourcefrom/llm/statusso the buyer sees where each field came from.src/lib/api.ts—getLlmCredential,putLlmCredential,deleteLlmCredentialtyped clients.
Tests:
services/api/tests/test_llm_credentials_endpoint.py— 40 tests covering vault round-trip, the_projectredactor,LlmCredentialUpsertvalidators,_enforce_provider_invariants, GET / PUT / DELETE happy paths, the rotation-only update case (api_key=nullkeeps the existing ciphertext), failure paths (provider transitions that violate invariants, duplicate-tenantIntegrityError), and audit emission.services/agents/tests/test_llm_resolver.py— 33 tests covering_env_baseline(OPENAI_, LLM_, AISOC_LLM_MODEL precedence),_airgap_blocks,_classify_source,_decrypt_vault_token(vault disabled, round-trip, corrupt ciphertext), andresolve_llm_configend-to-end with a mockedasyncpgpool. Includes the BYOK-under-air-gap scenarios (private gateway allowed, OpenAI host blocked even with a valid tenant key) and the graceful-degradation paths (DB unreachable, ledger import unavailable, vault key missing, ciphertext decrypt fails).
All 73 tests pass.
Documentation:
apps/docs/docs/operations/credentials.md— added a "Per-tenant LLM credentials (BYOK)" section explaining thetenant_llm_credentialstable, the vault round-trip, and how the agents service decrypts at read time.apps/docs/docs/operations/airgap.md— clarified that BYOK to a private LiteLLM/Ollama/vLLM gateway works underAISOC_AIRGAPPED=true, while BYOK pointing atapi.openai.comis still blocked.apps/docs/docs/operations/security.md— added a pointer to the BYOK section so the security model document enumerates LLM keys alongside connector secrets.
Workspace rule: do not stop until every todo is done.
Author: Beenu Arora beenu@cyble.com
Released as: v7.0.0 on main branch.
All workstreams from aisoc_v1.0_—_buyer-value_plan_c8116970.plan.md are
now shipped and verified. See PROGRESS.md for the full workstream table,
fix log, and documentation changes.
PR #43 (feat/threatintel-attribution-v0-fixed) was squash-merged into
main on 2026-05-10 (205 files changed, +16 524 / -1 961 lines).
CHANGELOG promoted to [7.0.0] — 2026-05-10.
ROADMAP v7.0 marked Shipped ✅; v8.0 Planned section added.
README version badge updated to 7.0.0.
The AI Stack & Data Integration plan workstreams (this file) remain at DONE status — no regressions introduced by the v7.0 buyer-value plan.