All notable changes to AiSOC will be documented in this file.
The format is based on Keep a Changelog, and this project adheres to Semantic Versioning.
Security-hardening and platform release. Folds in the May 27–29 hardening wave,
multi-agent routing, and multi-cloud infrastructure skeletons that landed on
main since v7.3.1.
- Security hardening. Prompt-injection sanitizer wired into the classification agents (PR #219); cross-tenant isolation enforced on the detection-loop suggestion lookups (PR #221) and on the compliance, phishing, and knowledge-base endpoints (PR #236); nightly cross-tenant RBAC regression gate (PR #197); cryptography CVEs cleared and unfixable advisories time-boxed (PR #229); CodeQL quality notes resolved (PR #224).
- Multi-agent routing. DetectAgent.process wired to the FusionEngine over cross-service HTTP (PR #198); /investigate swapped to the RouterOrchestrator behind the ROUTER_INVESTIGATE flag (PR #196); Redis-backed scheduler singleton guard for in-process workers (PR #218).
- Multi-cloud infrastructure. Serverless-container Terraform skeletons for GCP (Cloud Run + Cloud SQL + Memorystore) and Azure (Container Apps + PostgreSQL Flexible Server + Cache for Redis), mirroring the AWS/EKS reference file-for-file (PR #240).
- Live dashboard & landing. Real /metrics data restored on tryaisoc.com/dashboard (PR #192); API/agents machines kept warm so the dashboard no longer 500s (PR #234); seed timestamps re-anchored so the live dashboard never goes empty (PR #235); landing CTAs pointed at the live dashboard (PR #233).
- Dependency & CI maintenance. ~40 Dependabot upgrades across the Python, JS, and Go services plus CI stabilization (Ruff cleanup, OpenAPI export permissions, pnpm-lock dedupe).
Dev-only dependency upgrade (PR #178).
@vitejs/plugin-react@6 is built against vite@8, while vitest@4 (landed in
PR #179) still ships its own internal vite@7. pnpm resolves both side-by-side
without conflict: vitest@4 uses vite@7 for the test runtime, and react() is
loaded from the vite@8-flavoured build of the plugin. Vitest is tolerant of
the plugin API surface across vite 5/6/7/8, so apps/web/vitest.config.ts
needed no further changes after the cast we already removed in #179.
No production code touched. Locally verified: web 349/349 tests pass, lint
remains at 0 errors / 76 warnings (unchanged baseline), tsc --noEmit clean,
production build succeeds.
Dev-only dependency upgrade (PR #179)
across apps/web, packages/sdk-ts, and services/mcp. Vitest v3 and v4
introduced two breaking changes that surfaced in our suite:
- vitest/config no longer exports UserConfig. apps/web/vitest.config.ts used import('vitest/config').UserConfig['plugins'] to bridge the vitest@2 (vite@5 types) ↔ @vitejs/plugin-react@4 (vite@7 types) version mismatch. In vitest@4 both packages target vite@7, so the bridging cast is gone and react() is consumed directly.
- global is no longer in the default DOM lib in @vitest/runner's typing. packages/sdk-ts/src/client.test.ts referenced the Node global namespace via (global.fetch as ...); it now uses globalThis.fetch, which is the cross-runtime idiom and was already what every other test in the SDK suite used. No runtime behaviour change: global === globalThis in Node.
Verified locally: SDK 9/9 tests pass, web 349/349 tests pass, web lint stays at
0 errors (warning count unchanged from PR #193's baseline). No production code
touched, no behavioural change to the published @aisoc/sdk package or to the
shipped web bundle.
Closes #190.
Closes the missing edge in the four-agent façade: DetectAgent previously
self-described as the public detection surface but had no synchronous entry
point into the fusion pipeline — callers either had to enqueue onto Kafka and
wait, or reach into services/fusion internals directly. This change adds the
last mile so a raw alert from any caller (LLM tool calls, ad-hoc CLI, the API
gateway) runs through the same FusionEngine instance that backs the Kafka
consumer path — dedup, correlation, ML scoring, confidence labelling, and RBA
all apply identically regardless of how the alert arrived.
Three additive pieces, no behavioural changes to existing paths:
- POST /process on the fusion service (services/fusion/app/api/router.py). Accepts a RawAlert, returns a FusedAlert, and is wired to the already-running FusionWorker's engine instance via the module-level _worker_ref the worker registers on startup. Returns 503 when the worker hasn't finished booting (Kafka consumer not yet attached) so callers fail loudly instead of getting a half-initialised pipeline. Lives at the root path: the router is mounted with no prefix in services/fusion/app/main.py.
- services/agents/app/tools/fusion.py: thin async HTTP client used by the agents service. Posts to {FUSION_SERVICE_URL}/process (defaults to http://fusion:8003/process inside the docker-compose network), forwards an optional bearer token, and raises on any non-2xx or transport error. This is a deliberate contrast with app.tools.graph, which degrades gracefully for investigation queries: fusion is the primary detection plane, so a silent fallback here would lose alerts.
- DetectAgent.process(raw_alert, *, api_token=None) (services/agents/app/agents/__init__.py). Classmethod delegate over the HTTP client; keeps DetectAgent import-light (no engine instantiation in the agents process) and preserves the existing back-compat aliases.
Tests lock the contract on both sides. services/fusion/tests/test_process_endpoint.py
exercises the endpoint against an ASGITransport + AsyncClient: novel
alerts return a NEW_INCIDENT envelope, replays return DUPLICATE, an
unwired worker yields 503, a worker without an engine yields 503,
malformed and bad-severity payloads return 422, and a regression guard
asserts the endpoint and worker share the same FusionEngine instance.
services/agents/tests/test_fusion_client.py uses respx to lock the
client wiring: it must post to /process (not /api/fusion/process — that
mismatch was caught and fixed during initial wiring), the Authorization
header is set if and only if a token is supplied, httpx.HTTPStatusError
propagates on 503/422, and httpx.HTTPError propagates on transport
failures. A final trio of tests pins DetectAgent.process as a faithful
delegate to the client (args pass through unchanged, errors propagate, no
swallowed exceptions).
No feature flag and no env gate: the wiring is purely additive — no existing caller of the fusion service or the agents service changes shape, and the new endpoint/method only fire when something explicitly invokes them.
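The 503 guard and the shared-engine contract described above can be sketched without a web framework. This is a minimal illustration that assumes a module-level _worker_ref as named in this entry; ToyEngine, ToyWorker, ServiceUnavailable, register_worker, and the dedup-by-id rule are hypothetical stand-ins, not the real FusionEngine or router code.

```python
from dataclasses import dataclass, field
from typing import Optional


class ServiceUnavailable(Exception):
    """Stands in for an HTTP 503 response in this sketch."""
    status_code = 503


@dataclass
class ToyEngine:
    """Hypothetical engine: dedups purely on the alert id."""
    seen: set = field(default_factory=set)

    def process(self, raw_alert: dict) -> dict:
        alert_id = raw_alert["id"]
        decision = "DUPLICATE" if alert_id in self.seen else "NEW_INCIDENT"
        self.seen.add(alert_id)
        return {"alert_id": alert_id, "decision": decision}


@dataclass
class ToyWorker:
    engine: Optional[ToyEngine] = None


# The worker registers itself here on startup; None means "not booted yet".
_worker_ref: Optional[ToyWorker] = None


def register_worker(worker: Optional[ToyWorker]) -> None:
    global _worker_ref
    _worker_ref = worker


def process(raw_alert: dict) -> dict:
    """Run an alert through the worker's own engine, or fail loudly."""
    if _worker_ref is None or _worker_ref.engine is None:
        raise ServiceUnavailable("fusion worker not ready")
    return _worker_ref.engine.process(raw_alert)
```

Because process() reaches the engine only through the registered worker, the synchronous path and the consumer path cannot drift onto separate engine instances in this sketch.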
Closes #159.
Pure-unit isolation suites that exercise the tenant boundary at the endpoint-function level (no live DB, no FastAPI request cycle) so the contract is testable in milliseconds and survives ORM churn:
- services/api/tests/test_threat_intel_tenant_isolation.py: IOC, actor, and feed list/get/create/delete are scoped by tenant_id, cross-tenant lookups resolve to 404, and writes attach current_user.tenant_id even when the payload smuggles a different one.
- services/api/tests/test_alerts_tenant_isolation.py: every read/write/queue/claim path on /alerts binds tenant_id into the compiled SQL or forwards it to the service layer (build_queue / claim_alert).
- services/api/tests/test_llm_credentials_tenant_isolation.py: BYOK credential GET/PUT/DELETE scope by tenant_id, new rows bind the caller's tenant, and emit_audit is invoked with the caller's tenant + actor (CredentialVault is stubbed so the assertions are on the persistence boundary, not crypto).
Assertions read the compiled SQL bind parameters rather than the
shape of any single query so they don't break on benign rewrites. All
three suites were mutation-tested by temporarily dropping the
tenant_id predicate in the corresponding endpoint — every dropped
predicate produced at least one failing test, confirming the suites
are wired to the right surface.
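The idea of asserting on bind parameters rather than query shape, and of mutation-testing that assertion, can be shown with a toy query builder. Everything here is hypothetical (build_get_ioc, its mutant, is_tenant_bound, the iocs table); the real suites inspect the ORM's compiled statements.

```python
from typing import Any, Dict, Tuple

Query = Tuple[str, Dict[str, Any]]


def build_get_ioc(tenant_id: str, ioc_id: str) -> Query:
    """Hypothetical endpoint helper: always scopes the lookup by tenant."""
    sql = "SELECT * FROM iocs WHERE id = :ioc_id AND tenant_id = :tenant_id"
    return sql, {"ioc_id": ioc_id, "tenant_id": tenant_id}


def build_get_ioc_mutant(tenant_id: str, ioc_id: str) -> Query:
    """Mutation used to prove the check bites: tenant predicate dropped."""
    return "SELECT * FROM iocs WHERE id = :ioc_id", {"ioc_id": ioc_id}


def is_tenant_bound(query: Query, tenant_id: str) -> bool:
    """Check the bind parameters, not the SQL text, so rewrites stay green."""
    _sql, params = query
    return params.get("tenant_id") == tenant_id
```

A benign rewrite of the SQL string leaves is_tenant_bound unchanged, while the mutant fails it, which is the property the mutation pass is meant to confirm.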
.github/workflows/cross-tenant-rbac.yml runs the three suites
nightly on main (06:30 UTC, ahead of compose-smoke-nightly so a
tenant boundary regression shows up as the first nightly signal) and
on-demand via workflow_dispatch. On failure it uploads a JUnit
report and opens a security-labelled tracking issue.
/cases/{id} now ships an Attack Chain tab that visualises the ranked
timeline returned by /v1/cases/{id}/attack-chain (shipped earlier under
8df637b9). The new AttackChainPanel in
apps/web/src/components/cases/CaseWorkspace.tsx:
- Window selector with the same vocabulary as the backend WindowLiteral (1h, 6h, 24h, 72h, 7d, 30d); selection is deep-linkable via ?window=… and survives reload.
- One card per ChainLink with the alert title, severity chip (driven by the canonical 5-tier ladder info | low | medium | high | critical), confidence percent, MITRE technique IDs, and the deterministic narrative reason emitted by services/api/app/services/attack_chain.py.
- Entity-graph summary panel: node count grouped by kind (user, asset, process, ip, domain, alert), top edges, and a per-node severity chip when present in _entity_graph_payload.
- SWR-keyed on (case_id, window) with skeleton, error, and empty states that match the rest of the case workspace.
- New casesApi.getAttackChain method + AttackChainTimeline, AttackChainWindow, AttackChainLink, AttackChainEntityNode, AttackChainEntityEdge, BackendAttackChainResponse types in apps/web/src/lib/api.ts. The wire format matches the backend to_dict shape exactly (node kind rather than type; optional severity and event_time from _entity_graph_payload).
- Coverage in apps/web/src/components/cases/CaseWorkspace.test.tsx: empty-state, error-state, and three data-rendering assertions (alert titles, confidence percent, MITRE techniques). The SWR mock is now key-aware so attack-chain and attack-path fetches stay isolated, and useSearchParams is stateful so window-selection deep-links round-trip cleanly under test. The WindowSelector is a labelled role="group" of buttons with aria-pressed, so deep-link assertions resolve the active option via the single pressed button inside the group rather than a non-existent <select> value.
Closes the UI side of T3.3 in AISOC_V8_PROGRESS.md. Pre-existing
non-blocking lint warnings in CaseWorkspace.tsx are unchanged by this
diff.
Closes T2.3 by adding the missing bypass-prevention layer on top of the
existing fail-closed validator (services/agents/app/llm/contract.py). Two
new test files in services/agents/tests/:
- test_llm_contract_extra.py (10 cases): fills the coverage gaps in the shipped contract: safe_astream validates messages exactly once and refuses to yield any chunk on violation; make_safe_chat_model proxies non-LLM attributes through but routes ainvoke / astream through validation; classify_message rejects api_key = '...' assignments and PEM private-key headers; set_contract_enforcement(False) lets raw OCSF through in soft mode and re-arms cleanly when flipped back to True.
- test_llm_contract_no_bypass.py (3 cases): AST-based static gate that walks every *.py file under services/agents/app/ and fails CI on any direct .ainvoke(...) / .astream(...) call whose receiver is not on an explicit allowlist (_graph, investigation_graph, graph; all LangGraph control-flow handles, not LLMs) or whose file is not the contract module itself. Ships with self-tests proving (a) a synthetic llm.ainvoke(...) bypass trips the detector and (b) allowlisted receivers do not. Adding a new agent that calls a chat model directly now fails the build until it routes through safe_ainvoke / safe_astream / make_safe_chat_model.
The survey behind this gate confirmed every existing direct chat-model call
under services/agents/app/ already goes through the safe wrapper — the
remaining .ainvoke / .astream call sites are LangGraph control-flow on
compiled graphs, which is why those receivers are explicitly allowlisted
rather than silently ignored.
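A static gate of this kind can be approximated with the standard-library ast module. The sketch below is not the shipped test: find_bypasses and receiver_name are hypothetical names, it scans a single source string rather than a directory tree, and it omits the exemption for the contract module itself. The allowlist is the one quoted in this entry.

```python
import ast
from typing import List, Tuple

GUARDED_METHODS = {"ainvoke", "astream"}
# Receivers named in this entry as LangGraph control-flow handles.
ALLOWED_RECEIVERS = {"_graph", "investigation_graph", "graph"}


def receiver_name(node: ast.expr) -> str:
    """Best-effort name of the object a method is called on."""
    if isinstance(node, ast.Name):
        return node.id      # e.g. llm.ainvoke(...)
    if isinstance(node, ast.Attribute):
        return node.attr    # e.g. self._graph.ainvoke(...)
    return "<expr>"         # calls on arbitrary expressions are flagged


def find_bypasses(source: str) -> List[Tuple[int, str]]:
    """Return (line, receiver) for each guarded call off the allowlist."""
    hits = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Call):
            continue
        func = node.func
        if isinstance(func, ast.Attribute) and func.attr in GUARDED_METHODS:
            name = receiver_name(func.value)
            if name not in ALLOWED_RECEIVERS:
                hits.append((node.lineno, name))
    return sorted(hits)
```

Matching on the receiver's final name keeps the check cheap, at the cost of trusting that nothing else in the codebase is called graph; an explicit allowlist makes that trade-off visible in review.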
services/agents/tests/test_llm_contract.py exercises classify_message /
LLMInputContract.validate / validate_messages: raw OCSF-shaped JSON in a
user message fails closed when AISOC_AGENTS_LLM_CONTRACT_ENFORCED=1
(default), and prose plus summarize_structure_for_llm output passes. Tests
use {"role", "content"} dict messages so they run without importing
langchain_core (the contract already coerces LangChain BaseMessage and
dicts the same way).
Closes the v8.0 loop between the ingest-side graph writer (T1.1) and the
operator console. services/realtime now exposes a graph WebSocket
channel reachable at /ws/graph (or piggy-backed on /ws/all) and runs a
dedicated aisoc-realtime-graph Kafka consumer group against the
security.graph_updates topic that the Go ingest writer publishes to
(services/ingest/internal/graph/writer.go). Each GraphUpdate envelope
(entity_id, change_type, ts, label, rel_type, from, to,
properties, schema_version) is fanned out to clients scoped by
tenant_id, with default as the single-tenant fallback so self-hosted
deploys without explicit tenant tagging still light up live. The new
consumer is wired alongside the existing fused-alerts consumer in
non-blocking mode: a missing or unreachable graph topic logs at warn and
never blocks the higher-priority alerts/cases/agents/insights fan-out. The
topic name honours both AISOC_GRAPH_UPDATES_TOPIC and
KAFKA_TOPIC_GRAPH_UPDATES envs (defaults to security.graph_updates so
it matches the Go writer's default in
services/ingest/internal/config/config.go without manual plumbing), and
setting it to the empty string disables the consumer entirely for tests
that don't spin up Kafka graph traffic. The Investigation Rail and Attack
Chain views (T3.3 UI, in flight) can subscribe today and pick up node /
edge mutations within ~1s of the upstream event reaching ingest.
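The topic-resolution rule (two environment variables, a default, and the empty string as an off switch) might look like the following. The function name resolve_graph_topic is hypothetical, and the precedence between the two variables is an assumption, since this entry does not state which one wins when both are set.

```python
from typing import Mapping, Optional

DEFAULT_TOPIC = "security.graph_updates"
# Assumed precedence: the AISOC_-prefixed name wins when both are set.
TOPIC_ENV_VARS = ("AISOC_GRAPH_UPDATES_TOPIC", "KAFKA_TOPIC_GRAPH_UPDATES")


def resolve_graph_topic(env: Mapping[str, str]) -> Optional[str]:
    """Return the topic to consume, or None when the consumer is disabled."""
    for name in TOPIC_ENV_VARS:
        if name in env:
            value = env[name].strip()
            return value or None  # an explicit empty string disables it
    return DEFAULT_TOPIC
```

Passing os.environ as env gives the production behaviour, while tests can pass a plain dict such as {"AISOC_GRAPH_UPDATES_TOPIC": ""} to switch the consumer off.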
Public, append-only weekly scoreboard now lives at
/docs/benchmark-scoreboard.
One row per published eval run — date, agent version, commit SHA, MITRE
accuracy, MTC p50/p95, total USD, total tokens — sourced from a
checked-in JSON file at apps/docs/static/data/scoreboard.json and
validated against scoreboard.schema.json on every docs build via the new
pnpm --filter @aisoc/docs scoreboard:check script. Substrate rows
(deterministic CI gate, no LLM) are visually separated from wet-eval rows
(real LangGraph agent, real LLM, real cost), so substrate numbers can
never be quoted as live agent performance. Includes an inline SSR-rendered
SVG sparkline of MITRE accuracy over time, no Recharts/client JS bundle
hit. The marketing /benchmark page now cross-links to the scoreboard for
the full weekly history. Wet-eval rows arrive automatically once the T5.5
weekly CI workflow lands.
New first-class endpoint connector for Wazuh deployments. AiSOC now polls the
Wazuh Indexer API directly (no agent rewrite required) and normalizes alerts
into the platform's OCSF-aligned schema, collapsing Wazuh's native severity
ladder into the four-tier info | low | medium | high set used everywhere
else.
- services/connectors/app/connectors/wazuh.py: WazuhConnector subclasses BaseConnector, polls wazuh-alerts-* indices over HTTPX with basic-auth, paginates time-windowed queries, retries on 5xx with capped backoff, and emits one normalized event per alert hit. Cursor is the highest @timestamp seen so reruns are idempotent.
- services/connectors/app/connectors/__init__.py: registered in _CONNECTOR_CLASSES; the registry now declares 52 first-party connectors.
- plugins/wazuh/plugin.yaml + pnpm marketplace:sync: connector ships as a marketplace entry under category siem, mirrored into apps/web/public/marketplace/index.json.
- apps/docs/docs/connectors/wazuh.md + sidebar entry: operator setup walkthrough (API user + role, time-window semantics, severity collapse table, troubleshooting matrix).
- services/connectors/tests/test_wazuh_connector.py: 24 unit tests cover schema, auth headers, time-window query shape, retry policy, every documented severity bucket, and the empty/error paths.
Replaces the old hard-coded plugin scaffold with a real templated generator
keyed on plugin kind (enricher | connector | responder | detection | widget).
Templates ship inside the aisoc-cli wheel via importlib.resources so the
CLI works unchanged after pip install aisoc-cli.
- packages/aisoc-cli/src/aisoc_cli/main.py: aisoc plugin new <NAME> --type <kind> loads the template tree from src/aisoc_cli/templates/<kind>/, runs string.Template substitution for ${slug}, ${name}, ${author}, and writes a project that already validates against the manifest schema. aisoc plugin scaffold is preserved as an alias for backwards compatibility.
- pyproject.toml: force-include ships the templates tree in the wheel.
- Tests parameterize across all five plugin types and assert the manifest validates and no ${...} placeholders leak through.
- plugins/templates/README.md is now a pointer to the canonical templates inside the CLI package.
- apps/docs/docs/plugins/cli.md: documents the new CLI surface and is added to the Plugin SDK sidebar.
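The substitution step and the placeholder-leak check can be sketched with string.Template from the standard library. The manifest text, the render helper, and the choice of safe_substitute followed by an explicit leak check are assumptions made for illustration; the entry only states that string.Template is used for ${slug}, ${name}, and ${author}.

```python
import re
from string import Template

PLACEHOLDER = re.compile(r"\$\{[^}]*\}")

# Hypothetical manifest template; the real ones live in the CLI package.
MANIFEST_TEMPLATE = (
    "id: ${slug}\n"
    "name: ${name}\n"
    "author: ${author}\n"
)


def render(template_text: str, *, slug: str, name: str, author: str) -> str:
    """Substitute the three known placeholders and refuse to leak any other."""
    rendered = Template(template_text).safe_substitute(
        slug=slug, name=name, author=author
    )
    leaked = PLACEHOLDER.findall(rendered)
    if leaked:
        raise ValueError(f"unresolved placeholders: {leaked}")
    return rendered
```

Using safe_substitute plus a regex check reports every leftover placeholder at once, whereas Template.substitute would stop at the first missing key with a KeyError.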
Adds a serverless-first BYOC equivalent of the existing AWS module so AiSOC
can be stood up on Google Cloud with one terraform apply. Stage 2 #15.
- infra/terraform/gcp/: Cloud Run for api / web / ingest, Cloud SQL Postgres 16 + Memorystore Redis 7.2 on private IPs through a dedicated VPC and Serverless VPC Access connector, Secret Manager for every credential (auto-generated postgres_password, secret_key, credential_key, redis_auth, optional openai_api_key), and Artifact Registry for images. One service account per Cloud Run service with least-privilege secretAccessor bindings. The skeleton points at the public GHCR demo images so a fresh apply works zero-config; operators override via api_image / web_image / ingest_image.
- apps/docs/docs/deployment/gcp.md + sidebar entry (between kubernetes and env-vars): quickstart, state-backend guidance, Cloud SQL Auth Proxy notes, cost envelope, and the long-running-services follow-up plan (GKE Autopilot for agents, realtime, connectors, alert-fusion, threatintel, fusion).
- infra/terraform/gcp/README.md mirrors the deploy doc for module-local consumption.
Adds a vendor-pluggable response-action surface so plugins can register
executors against the existing capability taxonomy without forking the
in-tree executor list. The dispatcher always returns a typed
LiveActionResult; unknown (vendor_id, capability) pairs return FAILED
with error="executor_not_found" so the agent degrades gracefully instead
of seeing a 500.
- services/actions/app/live_actions/models.py: LiveActionRequest / Result / Descriptor Pydantic models (UTC-aware).
- services/actions/app/live_actions/registry.py: LiveActionExecutor ABC + module-level LiveActionRegistry.
- services/actions/app/live_actions/dispatcher.py: structured logging, error translation, dry-run + missing-credential semantics (SIMULATED, never PARTIAL).
- Adapters wrap every existing in-tree executor (CrowdStrike, Okta, AWS SG, Splunk) so they now show up as builtin descriptors.
- services/api/app/api/v1/endpoints/live_actions.py: discover, dispatch, dry-run REST routes; built-ins are registered at app startup.
- 45 new tests across models / registry / dispatcher / router / builtins (full actions suite: 99 passed).
- apps/docs/docs/concepts/live-actions.md + sidebar slot.
- Drive-by: fixed two pre-existing broken doc links flagged by the Docusaurus build (osctrl → aisoc-direct stub, air-gapped → env-vars).
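The dispatcher contract (always a typed result, FAILED with error="executor_not_found" on an unknown pair, SIMULATED on dry-run) can be sketched as follows. This is a simplified stand-in using dataclasses instead of the Pydantic models; the SUCCEEDED status name, the register and dispatch signatures, and the plain-callable executor type are assumptions not taken from the source.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable, Dict, Optional, Tuple


class Status(str, Enum):
    SUCCEEDED = "SUCCEEDED"  # assumed name; not stated in this entry
    SIMULATED = "SIMULATED"
    FAILED = "FAILED"


@dataclass
class LiveActionResult:
    status: Status
    error: Optional[str] = None


Executor = Callable[[dict], None]
_registry: Dict[Tuple[str, str], Executor] = {}


def register(vendor_id: str, capability: str, executor: Executor) -> None:
    _registry[(vendor_id, capability)] = executor


def dispatch(vendor_id: str, capability: str, params: dict,
             *, dry_run: bool = False) -> LiveActionResult:
    """Always return a typed result; never let a lookup miss escape."""
    executor = _registry.get((vendor_id, capability))
    if executor is None:
        return LiveActionResult(Status.FAILED, error="executor_not_found")
    if dry_run:
        return LiveActionResult(Status.SIMULATED)
    try:
        executor(params)
    except Exception as exc:  # translate executor errors into a result
        return LiveActionResult(Status.FAILED, error=str(exc))
    return LiveActionResult(Status.SUCCEEDED)
```

Returning a result object for every outcome lets a calling agent branch on status instead of wrapping each dispatch in its own exception handling.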
Replaces the template fallback in
services/api/app/api/v1/endpoints/nl_query.py with a real, offline-friendly,
deterministic IR + renderer that emits ES|QL, KQL, and SPL and runs every
output through a lightweight grammar validator before returning. An optional
LLM enhancement path (gpt-4o-mini) is exposed via enhance_with_llm for
callers with credentials; failures fall back to the deterministic path so the
air-gapped story keeps working and the eval harness stays reproducible.
- services/agents/app/nl_query/: IR, grammar, translator, renderers.
- All # TODO: translate comments removed from nl_query.py.
- services/agents/tests/eval_data/nl_query_eval.json: 50-pair gold NL→ES|QL eval set.
- services/agents/tests/test_nl_query_eval.py: 100% syntactic validity, 100% semantic match (50/50 perfect) against gold intents.
- Pre-existing services/agents tests still green (162 passed) when ignoring the asyncpg-dependent suites that fail on a fresh checkout.
Replaces the host-agent dependency for Linux endpoint visibility with a
file-tail connector that consumes audit.log directly, plus an opinionated
auditctl ruleset whose -k keys map 1:1 to detection rules.
- services/connectors/app/connectors/auditd.py: AuditdConnector tails /var/log/audit/audit.log, reassembles multi-record events by msg id, decodes hex proctitle / argv blobs, and normalizes via _severity_from_event using aisoc_* keys baked into the audit rules profile. Cursor is (inode, byte_offset) so log rotation is handled.
- profiles/auditd/aisoc.rules + profiles/auditd/README.md: ships an opinionated auditctl ruleset and documents install + reload.
- detections/: 4 new detection rules pivot off auditd_key for sudoers / SSH config tampering, kernel module load, and systemd persistence. No host-agent dependency.
- plugins/auditd/plugin.yaml + pnpm marketplace:sync: registers the connector in the public marketplace.
- apps/docs/docs/connectors/auditd.md + sidebar entry: setup doc.
- services/connectors/tests/test_auditd_connector.py: covers schema, hex decode, argv reassembly, multi-record merge, severity heuristic, and file tailing (full connectors suite: 444 passed, excluding the apscheduler dev-dep test_scheduler.py).
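Two of the parsing steps named above, hex proctitle decoding and grouping records by msg id, can be sketched with the standard library. The helper names decode_proctitle and group_by_event are hypothetical and this is not the connector's code; it relies only on the audit.log conventions that encoded proctitle values are hex strings with NUL-separated arguments and that records of one event share the same msg=audit(timestamp:serial) id.

```python
import re
from collections import defaultdict
from typing import Dict, List

MSG_ID = re.compile(r"msg=audit\((\d+\.\d+):(\d+)\)")


def decode_proctitle(value: str) -> List[str]:
    """Decode a hex-encoded proctitle into argv; pass quoted text through."""
    try:
        raw = bytes.fromhex(value)
    except ValueError:
        return [value.strip('"')]  # unencoded values appear as quoted strings
    return [p.decode("utf-8", "replace") for p in raw.split(b"\x00") if p]


def group_by_event(lines: List[str]) -> Dict[str, List[str]]:
    """Group audit records that share the same timestamp:serial msg id."""
    events: Dict[str, List[str]] = defaultdict(list)
    for line in lines:
        match = MSG_ID.search(line)
        if match:
            events[f"{match.group(1)}:{match.group(2)}"].append(line)
    return dict(events)
```

For example, the value 636174002F6574632F706173737764 decodes to the two arguments cat and /etc/passwd.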
Two new operator-facing docs pages, both registered in the Docusaurus sidebar:
- apps/docs/docs/operations/notifications.md: complete inventory of every notification surface in AiSOC: Web Push to the responder PWA (VAPID, Redis, topic routing), Slack ChatOps via /aisoc, Slack/Teams ChatOps verification, one-shot notify_slack from playbooks, create_ticket simulation + recommended plugin path, honeytoken first-touch webhooks, connector freshness alerts, on-call gating, suppression / quiet-hours, and a per-mechanism testing recipe.
- apps/docs/docs/plugins/lifecycle.md: operator's view of plugin states (Discovered → Loaded → Enabled/Disabled, plus signature_status), trust modes (strict | warn | disabled), filesystem + OCI discovery, the full operator REST API with required permissions, configuration reference, upgrade and rollback semantics, and the structlog events worth alerting on.
Both pages cross-link the existing concepts/live-actions, plugins/overview,
plugins/publishing, and plugins/cli pages so they sit in the right place
in the information architecture.
Mirrors the existing case auto-summary pipeline to produce a deterministic, blameless retrospective for any case.
- services/api/app/services/case_postmortem.py: pure builder + async DB orchestrator (build_case_postmortem). Reuses SummaryCaseRow / SummaryCommentRow / SummaryTaskRow fetchers from case_summary so the post-mortem and the live summary draw from the same source of truth. Output is a Pydantic CasePostmortem covering incident overview, contributing factors, detection timing/gaps, response phases (detect → contain → eradicate → recover), blast radius, what went well / what fell short, and concrete action items.
- services/api/app/services/case_postmortem_html.py: pure HTML renderer matching the summary renderer (inline CSS, print-friendly, defensive escaping, no external assets).
- services/api/app/api/v1/endpoints/cases.py: GET /api/v1/cases/{case_id}/postmortem with ?format=json|html.
- services/api/tests/test_case_postmortem.py: pure-builder + HTML tests including XSS escaping, deterministic ordering, and explicit blamelessness assertions (analyst handles must not surface in the narrative; the assignee header line is explicitly allow-listed).
- apps/docs/docs/operations/case-reports.md + sidebar: operator page covering both /summary and /postmortem with audience, output, automation, and runbook archive guidance. Cases summary breadcrumb now points operators at both endpoints.
The threat-intel pipeline already pulled events from MISP (read-only). This
closes the loop with a write path: every STIX 2.1 indicator or bundle
published through /api/v1/threatintel/stix/... can be mirrored into the
configured MISP instance as a native event with one or more attributes.
- services/api/app/services/misp_push.py
  - Pure mappers: parse_stix_pattern, stix_indicator_to_misp_attribute, stix_bundle_to_misp_event, confidence_to_threat_level. Covers ipv4 / ipv6, domain-name, url, email-addr, file:hashes (MD5/SHA-1/SHA-256/SHA-512) and file:name. Untranslatable patterns are counted in skipped_attributes, never silently dropped.
  - MispPushClient: async httpx wrapper for /users/view/me (health), /events/add (push), /events/view/{id} (read-back). Every call runs through the air-gap gate (enforce_airgap_for_url) first.
- services/api/app/api/v1/endpoints/stix_taxii.py
  - POST /stix/indicators?push_to_misp=true: response now includes a misp block (pushed, misp_event_id, misp_event_uuid, url, pushed_attributes, skipped_attributes, error).
  - POST /stix/bundles?push_to_misp=true: same, but the whole bundle becomes one MISP event.
  - GET /stix/misp/health: calls MISP /users/view/me, never echoes the API key back.
  - POST /stix/misp/dry-run: returns the exact MISP event payload AiSOC would send, plus an airgap_blocked flag for air-gapped audits.
  - Push failures are intentionally non-fatal: the AiSOC store is the source of truth, the MISP mirror is best-effort and surfaces the structured error on the same response.
- services/api/app/core/config.py: new MISP push settings: MISP_VERIFY_SSL, MISP_PUSH_AUTO, MISP_PUSH_DEFAULT_DISTRIBUTION, MISP_PUSH_DEFAULT_THREAT_LEVEL, MISP_PUSH_DEFAULT_ANALYSIS, MISP_PUSH_TIMEOUT_SECONDS. Existing MISP_URL / MISP_API_KEY are reused from the read path.
- services/api/tests/test_misp_push.py: 76 tests covering pure mappers, air-gap gating, MISP HTTP failures (401 / 5xx / timeout), the publish endpoints with and without push, the health probe, and the dry-run endpoint.
- apps/docs/docs/integrations/misp-push.md + sidebar entry: operator doc with config, endpoints, the STIX→MISP type table, failure modes, and the dry-run-as-air-gap-proof workflow.
- apps/docs/docs/operations/airgap.md: clarifies that the existing MISP_URL / MISP_API_KEY envs cover both pull and push, with a pointer to the new integration page.
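The pure-mapper idea (translate simple STIX patterns, count what cannot be translated) can be illustrated as below. This is a sketch under stated assumptions, not the shipped parse_stix_pattern: it handles only single equality comparisons, the helper names are hypothetical, and the STIX-to-MISP type table is a plausible guess (for example ip-dst rather than ip-src) that may differ from the table in the integration doc.

```python
import re
from typing import List, Optional, Tuple

# Matches single comparisons such as [ipv4-addr:value = '198.51.100.7'].
SIMPLE_PATTERN = re.compile(
    r"^\[\s*([a-z0-9-]+):([A-Za-z0-9_.'-]+)\s*=\s*'([^']*)'\s*\]$"
)

# Assumed STIX path -> MISP attribute type table; the shipped one may differ.
STIX_TO_MISP = {
    ("ipv4-addr", "value"): "ip-dst",
    ("ipv6-addr", "value"): "ip-dst",
    ("domain-name", "value"): "domain",
    ("url", "value"): "url",
    ("email-addr", "value"): "email-src",
    ("file", "hashes.MD5"): "md5",
    ("file", "hashes.'SHA-1'"): "sha1",
    ("file", "hashes.'SHA-256'"): "sha256",
    ("file", "hashes.'SHA-512'"): "sha512",
    ("file", "name"): "filename",
}


def to_misp_attribute(pattern: str) -> Optional[Tuple[str, str]]:
    """Return (misp_type, value), or None when the pattern is untranslatable."""
    match = SIMPLE_PATTERN.match(pattern.strip())
    if not match:
        return None
    misp_type = STIX_TO_MISP.get((match.group(1), match.group(2)))
    return (misp_type, match.group(3)) if misp_type else None


def translate(patterns: List[str]) -> Tuple[List[Tuple[str, str]], int]:
    """Split patterns into pushed attributes and a skipped count."""
    attributes = [a for a in map(to_misp_attribute, patterns) if a is not None]
    return attributes, len(patterns) - len(attributes)
```

Returning the skipped count alongside the attributes mirrors the skipped_attributes field in the response, so an unsupported pattern is visible to the caller rather than silently dropped.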
The /v1/threat-intel/* endpoints (IOCs, threat actors, intel feeds) were
previously gated only by get_current_user, meaning any authenticated
role, including viewer and soc_analyst, could POST an IOC, DELETE
a feed, or create a new ThreatActor profile. In a managed-SOC / MSSP
deployment that is a privilege-escalation vector: a compromised analyst
seat can poison detections across the whole tenant by injecting false IOCs
or deleting the feed that hydrates them.
- services/api/app/api/v1/endpoints/threat_intel.py: every route now declares the explicit permission it needs via Depends(require_permission("threat_intel:read" | "threat_intel:write")). Read routes (GET /iocs, /iocs/{id}, /actors, /feeds) require threat_intel:read; write routes (POST /iocs, DELETE /iocs/{id}, POST /actors, POST /feeds, DELETE /feeds/{id}) require threat_intel:write. The legacy User-typed dependency was replaced with the platform-standard AuthUser so JWT and API-key callers are gated by the same code path.
- services/api/app/core/security.py: ROLE_PERMISSIONS now grants threat_intel:write to tenant_admin and soc_lead in addition to the existing admin / platform_admin / threat_hunter set. Without this the endpoint hardening would have locked out the two roles that legitimately need to manage tenant intel during an investigation.
- services/api/tests/test_threat_intel_rbac.py: 38 new regression tests pin the role/permission map (write-roles must hold :write, read-only roles must not), assert that CurrentUser.require_permission raises HTTP 403 for under-privileged roles and 200 for privileged ones, cover the API-key code path including scope wildcards, and grep the endpoint module to ensure every route still uses require_permission(...) (so a refactor that silently downgrades a route fails CI).
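The role/permission pinning can be sketched with a plain dictionary and a guard function. The five write roles are the ones listed in this entry; the assumption that viewer and soc_analyst hold threat_intel:read, the Forbidden exception, and this simplified require_permission signature are illustrative and do not reproduce the FastAPI dependency.

```python
from typing import Dict, FrozenSet

READ, WRITE = "threat_intel:read", "threat_intel:write"

WRITE_ROLES = ("admin", "platform_admin", "threat_hunter", "tenant_admin", "soc_lead")
READ_ONLY_ROLES = ("viewer", "soc_analyst")  # assumed to hold :read only

ROLE_PERMISSIONS: Dict[str, FrozenSet[str]] = {
    **{role: frozenset({READ, WRITE}) for role in WRITE_ROLES},
    **{role: frozenset({READ}) for role in READ_ONLY_ROLES},
}


class Forbidden(Exception):
    """Stands in for an HTTP 403 response in this sketch."""
    status_code = 403


def require_permission(role: str, permission: str) -> None:
    """Raise Forbidden unless the role explicitly holds the permission."""
    if permission not in ROLE_PERMISSIONS.get(role, frozenset()):
        raise Forbidden(f"{role} lacks {permission}")
```

Unknown roles fall through to an empty permission set, so the guard fails closed rather than granting access by default.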
Tracked as F013 in docs/community-feedback/2026-05-12/.
scripts/validate_detections.py already replays each native rule against
its own positive + negative fixture (TP / TN gates), but that test cannot
catch the failure mode operators feel hardest in production: rule R
firing on an event that was meant for rule O. A single overly-broad
rule that matches every ConsoleLogin or every rundll32.exe execution
silently drives alert volume up and precision down across the whole pack
without tripping the per-rule TP/TN replay.
- services/agents/tests/test_detection_fp_rate.py: new pytest suite that replays every native rule's match_when against every other rule's positive fixture and grades the per-rule cross-fire FPR. Fails CI if any rule exceeds MAX_PER_RULE_FPR (default 5%) or regresses on its own positive/negative fixture. Failure output groups the worst 10 offenders with their cross-fire targets so the operator can narrow the rule (or allowlist a deliberate broad-vs-narrow overlap via EXPECTED_CROSS_FIRES) without re-running a full eval sweep. Current corpus: 816 native rules evaluated, mean FPR 0.0, worst FPR 0.49%, well under the 5% ceiling.
- scripts/run_evals.py: wires the new gate into the unified eval runner as suites.detection_fp_rate, reporting worst_per_rule_fp_rate (lower-is-better) alongside the existing alert-reduction / investigation-completeness / response-quality gates so dashboards and CI consume it through the same JSON shape.
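One way to compute a per-rule cross-fire rate is shown below. The definition used here (fraction of other rules' positive fixtures a rule fires on) is an assumption about how the suite grades FPR, rules are modelled as plain predicates rather than match_when expressions, and the allowlist via EXPECTED_CROSS_FIRES is left out.

```python
from typing import Callable, Dict, List

Event = Dict[str, str]
Rule = Callable[[Event], bool]


def cross_fire_fpr(rules: Dict[str, Rule],
                   positives: Dict[str, Event]) -> Dict[str, float]:
    """Fraction of other rules' positive fixtures each rule fires on."""
    rates = {}
    for name, rule in rules.items():
        others = [evt for owner, evt in positives.items() if owner != name]
        fired = sum(1 for evt in others if rule(evt))
        rates[name] = fired / len(others) if others else 0.0
    return rates


def offenders(rates: Dict[str, float], max_fpr: float = 0.05) -> List[str]:
    """Rules above the ceiling, worst first."""
    over = [name for name, rate in rates.items() if rate > max_fpr]
    return sorted(over, key=lambda name: -rates[name])
```

In a three-rule pack where a broad ConsoleLogin rule also matches the fixture of a narrower root-login rule, the broad rule scores 0.5 and is the only offender, while the narrow rule scores 0.0.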
Tracked as F005 in docs/community-feedback/2026-05-12/.
Documentation-only refresh that aligns every install / architecture page with the actual shipped state of the repo. No service code, schema, or API surface changed.
- One-click install pipeline is now a first-class doc surface.
  - New Docusaurus page apps/docs/docs/installation.md (sidebar position 2) walks through install.sh / install.ps1 end-to-end: supported package managers, what gets installed, idempotency, the uninstall.sh / uninstall.ps1 graduated cleanup flags, and the security model.
  - apps/docs/docs/quickstart.md adds it as Path 0 ("zero-prerequisite bootstrap") and renumbers the demo / dev paths.
  - apps/docs/docs/deployment/docker.md opens with a callout to the installer, refreshes every host/container port mapping against docker-compose.yml, splits profile-gated services (connectors, osquery-tls, slack-bot) out of the default stack, and updates the GHCR image list to the full 16-image set.
  - apps/docs/docs/intro.md adds the installer to Get started and corrects the connector-count copy.
  - Root README.md already had Path 0; verified and synced with the architecture refresh below.
- v2.2 architecture surfaces are now reflected everywhere.
  - apps/docs/docs/architecture.md data-flow diagram, monorepo layout, and Service Responsibilities table now include services/osquery-tls, services/osquery-extensions, and services/slack-bot. Connector count corrected to 50 (was 26 / 42 in stale paragraphs).
  - docs/architecture/SYSTEM_DESIGN.md connector count corrected to 50, Service Responsibilities table extended with the v2.2 services, and a new §13 (v2.2 Additions) appended that documents endpoint telemetry (osquery TLS server + extensions), ChatOps (slack-bot), Responder PWA, MCP server, Investigation Ledger / Ambient Copilot, and the one-click install pipeline. v2 / v2.1 narrative preserved.
  - Root README.md mermaid diagram + service-map table extended with osquery-tls, slack-bot, mcp and the corrected Realtime / Web Console descriptions.
- Connector count corrected to 50 across the repo.
  - apps/docs/docs/connectors/index.md: catalog count updated and the 23 missing connectors added across the existing categories (cloud / CNAPP / vuln-mgmt, SIEM, EDR/XDR, SaaS, ITSM, network, endpoint fleet, container orchestration).
  - apps/docs/docs/connectors/api-coverage.md: coverage-table heading updated.
  - apps/web/src/components/onboarding/StartHero.tsx: in-product copy on the onboarding tile updated.
  - apps/docs/docs/intro.md: two stale paragraphs updated.
  - Source of truth: services/connectors/app/connectors/__init__.py (_CONNECTOR_CLASSES).
Old historical entries in AI_STACK_PLAN_PROGRESS.md reference 42
connectors and are intentionally left as a snapshot of the v2.1 increment
they describe.
Track 1 + Track 2 of the docker-compose hardening work that began in
7.1.1. 7.1.1 fixed the boot-path bugs that surfaced on
a clean clone; this release attacks the time dimension. The previous
behaviour — docker compose up -d on a fresh checkout building all 15
services from source — took 10–20 minutes on a typical laptop and was the
single largest source of "I tried AiSOC and gave up" reports. With this
release, the same command pulls 12 prebuilt images from GHCR and is
healthy in roughly 90 seconds.
No service code, no API surface, no database schema changed. Every change in this release is in the boot path, the image-publish path, or the CI gate that proves both still work.
- `docker-compose.yml`: Every service that previously had a `build:` directive now also has an `image:` and `pull_policy: missing`. Compose reuses a local copy of `ghcr.io/aisoc-platform/aisoc-<svc>` if one exists and otherwise pulls the prebuilt image from the registry; only if the pull fails does it fall back to building from source. The 12 backend services that publish images (api, agents, realtime, web, ingest, enrichment, fusion, actions, connectors, threatintel, ueba, slack-bot) are tagged via the `${AISOC_VERSION:-latest}` interpolation so the same compose file works for `latest`, `main`, a release tag (`v7.2.0`), or a local override. The three deferred services (osquery-tls, honeytokens, purple-team) are marked with a `# TODO(publish)` comment and continue to build locally.
- `.env.example`: Added a new top-of-file `AISOC_VERSION=latest` block that documents how to pin the entire backend to a release tag for reproducible deploys (`AISOC_VERSION=v7.2.0`), or track the bleeding edge (`AISOC_VERSION=main`).
- `.github/workflows/publish-images.yml`: Extended the build matrix from 4 services to 12 by adding ingest, enrichment, fusion, actions, connectors, threatintel, ueba, and slack-bot. These are the backend services that every full-stack `docker compose up -d` boots; without them in the publish matrix, `pull_policy: missing` would resolve to "build from source" for two-thirds of the stack and the change would be cosmetic.
- `.github/workflows/release.yml`: Mirrored the same 12-service matrix on tagged-release builds so that `AISOC_VERSION=v7.2.0` resolves to a real published image for every service in the compose file, not just the demo subset.
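To make the tag interpolation concrete, here is a minimal Python sketch of how `${AISOC_VERSION:-latest}` resolves to an image reference. The helper name `resolve_image` is hypothetical and exists only for illustration; the registry path comes from the entry above, and the only behaviour modelled is the shell-style `:-` default, which applies when the variable is unset or empty.

```python
from typing import Mapping


def resolve_image(service: str, env: Mapping[str, str]) -> str:
    """Mimic `${AISOC_VERSION:-latest}` for one service (illustrative helper)."""
    # `:-` falls back to the default when the variable is unset OR empty,
    # which is exactly what `or` does for a missing/empty string here.
    tag = env.get("AISOC_VERSION") or "latest"
    return f"ghcr.io/aisoc-platform/aisoc-{service}:{tag}"


if __name__ == "__main__":
    for env in ({}, {"AISOC_VERSION": ""}, {"AISOC_VERSION": "v7.2.0"}):
        print(resolve_image("api", env))
```

An unset or empty `AISOC_VERSION` therefore tracks `latest`, while any non-empty value (a release tag or `main`) is used verbatim.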
The pull-by-default path only matters if the underlying images actually build. Track 2 attacks the two largest historical sources of build-path flakes — Poetry resolution failures during image build, and Dockerfile regressions that nobody catches until release day.
- All seven Python service Dockerfiles (`services/{api,fusion,threatintel,slack-bot,actions,connectors,osquery-tls}/Dockerfile`): Added a `poetry install` → `pip install` fallback. The previous pattern failed the build on any transient PyPI hiccup, lock-file drift, or proxy timeout during `poetry install`. The new pattern wraps the install in `set -eux; if poetry install ...; then ...; else pip install <pinned list>; fi`, logs which path was taken, and pins every runtime dependency explicitly in the fallback list. The pinned list is documented as needing to track `pyproject.toml` and is exercised by the new nightly cold-cache CI run.
- `.github/workflows/compose-smoke.yml` (new): On every PR that touches `docker-compose.yml`, `docker-compose.demo.yml`, any service Dockerfile, `.env.example`, or the workflow itself, GitHub Actions now boots the full stack from a clean checkout and asserts that `aisoc-postgres` is healthy, `api` returns 200 on `/health`, and `web` returns 200 on `/`, all within a 10-minute budget. Pull-by-default by design (so the CI run mirrors what the user sees), with automatic detection of Dockerfile changes that flips the workflow into rebuild-from-source mode so we don't smoke-test against a stale published image. Captures `docker compose ps`, `docker compose logs`, disk, and memory on failure.
- `.github/workflows/compose-smoke-nightly.yml` (new): At 09:00 UTC every day, GitHub Actions does a full cold-cache rebuild of every service (`docker compose build --no-cache --pull`) and re-runs the same smoke gates with a wider 20-minute budget. This is the gate that catches the regressions PR smoke physically cannot: upstream `python:3.11-slim` breakage, transitive dependency drift, and `pyproject.toml` ↔ pip-fallback drift in the seven Python services. Failures upload a forensics artifact and open a `ci`-labelled tracking issue automatically so a nightly break is visible by standup.
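The smoke gates above amount to "poll a health probe until it passes or the time budget runs out". The following is a rough Python sketch of that pattern, not the workflow's actual implementation (which is not shown here). The names `wait_until_healthy` and `http_ok` are hypothetical, and the clock and sleep functions are injectable only so the logic can be exercised without waiting.

```python
import time
import urllib.request
from typing import Callable


def http_ok(url: str, timeout_s: float = 3.0) -> bool:
    """Return True if `url` answers with HTTP 200 (illustrative probe)."""
    try:
        with urllib.request.urlopen(url, timeout=timeout_s) as resp:
            return resp.status == 200
    except OSError:  # URLError, HTTPError, timeouts, and refused connections
        return False


def wait_until_healthy(
    probe: Callable[[], bool],
    budget_s: float,
    interval_s: float = 5.0,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll `probe` until it returns True or the time budget is spent."""
    deadline = clock() + budget_s
    while True:
        if probe():
            return True
        # Stop if another sleep would take us past the deadline.
        if clock() + interval_s > deadline:
            return False
        sleep(interval_s)
```

A PR-style gate would call something like `wait_until_healthy(lambda: http_ok(health_url), budget_s=600)`, and the nightly run would pass `budget_s=1200`; the concrete host and port behind `health_url` depend on the compose file and are not assumed here.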
apps/web/package.json: Bumped to7.2.0.
None for users on 7.1.1. The compose file is backwards-compatible —
pull_policy: missing only changes behaviour the first time you boot
(it tries the registry before building); existing local images are
honoured. If you want the new fast path explicitly, run docker compose pull once after upgrading. To pin a deploy to this release rather than
tracking latest, set AISOC_VERSION=v7.2.0 in .env.
If you skipped 7.1.1, also read its migration note
about the osquery-tls host-port change (8007 → 8091).
Hotfix in response to user-reported docker compose up -d failures on a clean
clone. None of these are functional changes to the running services — every
fix is in the boot path, the boot documentation, or the pre-flight check.
- `docker-compose.yml`: Removed the obsolete `version: '3.8'` declaration, which Docker Compose v2 ignores and warns about on every invocation (`level=warning msg="...the attribute version is obsolete..."`). The warning is harmless but is the very first line of output a new user sees, which signals "this project is broken" before the build even starts.
- `docker-compose.yml`: Added `mem_limit` + `mem_reservation` to the four data-tier containers most likely to OOM-kill on an under-provisioned Docker Desktop:
  - `kafka`: 1.5 GB limit / 1 GB reservation
  - `clickhouse`: 1 GB limit / 768 MB reservation
  - `opensearch`: 1 GB limit / 768 MB reservation
  - `neo4j`: 1 GB limit / 768 MB reservation

  Without these caps, a 4 GB Docker Desktop allocation (the default on macOS) would silently OOM-kill OpenSearch or Neo4j during JVM warmup, leaving the rest of the stack running but the alert/case feeds permanently empty.
- `docker-compose.yml` (`osquery-tls` service): Fixed `AISOC_INGEST_BASE_URL` pointing at the non-existent `ingest:8080` (the actual service is named `ingest-worker`). Also remapped the host port from `8007` to `8091` to resolve a host-port collision with the `ueba` service. Both bugs only surfaced if the user actually queried the osquery TLS server, which is why they survived the previous release; running `docker compose up -d` would succeed but `osquery-tls` would log connection-refused errors on every agent check-in.
- `README.md` (Quick start): Restructured so `pnpm aisoc:demo` is the canonical first-touch path (4 prebuilt images, ~90s to a working SOC console) and `docker compose up -d` is explicitly labelled the "developer-build path" (22 services, 10–20 min cold build, requires Docker with at least 6 GB RAM allocated). The previous structure presented both paths as equally valid, which led users with stock Docker Desktop settings straight into a stack that physically cannot fit in the daemon's memory.
- `README.md` (Service map): Updated `osquery-tls` from `:8090` to `:8091` and added a `Kafka UI` row at `:8090`, matching the compose hygiene fix above.
- `README.md` (Boot section): Added explicit timing expectations ("~5 GB of base image pulls + 10–20 min of build on a typical laptop"), a recommendation to run `pnpm aisoc:doctor` before kicking off the build, and a troubleshooting note pointing under-provisioned Docker Desktop installs at Settings → Resources.
The pre-flight check that the user is now told to run before
docker compose up -d was previously useless to first-time users — its
container check used docker compose ps (which is project-scoped and
therefore couldn't see containers launched by a sibling compose file), and
it had no opinion on whether Docker itself was provisioned to actually run
the stack. This release fixes both:
- Docker Compose plugin enforcement: New check that fails with an actionable error if the user only has Compose v1 (`docker-compose` Python binary) on PATH, which is now end-of-life and lacks healthcheck semantics the stack depends on.
- Docker daemon RAM check: Reads `docker info --format json` and asserts at least 6 GB allocated for the full stack (4 GB for the demo stack). Anything less hard-fails with a pointer to Docker Desktop → Settings → Resources. This single check would have prevented every variant of "the build succeeds but `docker compose ps` shows half my containers in a restart loop" reported to date.
- Cross-compose-project container discovery: Replaced `docker compose ps` with `docker ps -a --format json --filter name=aisoc-`. The doctor now detects whether the user is on the demo stack (`aisoc-demo-*` containers) or full stack (`aisoc-*` containers) and accepts either as a valid boot, so demo users no longer see false `FAIL` rows for services the demo intentionally omits (kafka-ui, neo4j, etc.).
- Exit-code aware container reporting: When a container exists but is not running, the doctor now emits the exact `Exited (255)` status from `docker ps` and tells the user to run `docker logs <container>`. The previous output ("not running") gave the user no signal about whether the container had crashed, never started, or been manually stopped.
- Stack flavor summary: A new `stack flavor` row reports `demo`, `full`, or `mixed`, plus a running/total container count (`(4/8 container(s) running)`) so the user can see at a glance whether they're looking at a half-broken stack or a fully-broken stack.
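The stack-flavor logic described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the doctor's actual code: it assumes `docker ps -a --format json` prints one JSON object per line with `Names` and `State` keys, the function name `summarize_stack` is hypothetical, and returning `"none"` when no AiSOC containers exist is an added convention the changelog does not mention.

```python
import json


def summarize_stack(ps_output: str) -> tuple:
    """Return (flavor, running, total) from `docker ps -a --format json` text."""
    rows = [json.loads(line) for line in ps_output.splitlines() if line.strip()]
    # `--filter name=aisoc-` is a substring match, so re-check the prefix here.
    rows = [row for row in rows if row["Names"].startswith("aisoc-")]

    demo = any(row["Names"].startswith("aisoc-demo-") for row in rows)
    full = any(not row["Names"].startswith("aisoc-demo-") for row in rows)

    if demo and full:
        flavor = "mixed"
    elif demo:
        flavor = "demo"
    elif full:
        flavor = "full"
    else:
        flavor = "none"  # assumption: nothing booted yet

    running = sum(1 for row in rows if row.get("State") == "running")
    return flavor, running, len(rows)
```

Feeding the result into a format string such as `f"({running}/{total} container(s) running)"` reproduces the count shown in the summary row.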
apps/web/package.json: Bumped to7.1.1.
None. This is a docker-compose hygiene release — no service code,
no database schema, no API surface area changed. Pull, re-run
pnpm aisoc:doctor, and re-run docker compose up -d (the
osquery-tls port change means existing deployments need to update any
osquery-agent tls_hostname:tls_port config from localhost:8007 to
localhost:8091, but no one was using that interface yet).
Six new connectors, three documentation backfills, and one new ingest template. Closes the biggest cloud-security gap in the connector catalogue: every Tier-1 cloud workload protection platform (Wiz, Prisma Cloud, Orca, Lacework, AWS Security Hub) now has a first-class integration, AWS gets three native data sources (GuardDuty, CloudTrail, VPC Flow Logs), and Kubernetes audit logs land through a dual-mode connector that works on both managed and air-gapped clusters.
- `apps/docs/docs/connectors/wiz.md`: Documented the Wiz GraphQL connector end-to-end: service-account creation, scope (`read:issues`, `read:vulnerabilities`), token rotation, normalised severity mapping, and a worked example of a Wiz `Issue` collapsing to `category=cloud_alert` in the inbox.
- `apps/docs/docs/connectors/aws-security-hub.md`: Documented IAM role vs. static-key auth, the `securityhub:GetFindings` permission model, and the `BLOCK_IP` / `ALLOW_IP` capabilities backed by `services/actions/app/clients/aws_security_groups.py` (i.e. how a SOC analyst can quarantine an attacker IP from the Security Hub finding without leaving the case workspace).
- `apps/docs/docs/connectors/lacework.md`: Documented the Lacework API token flow, `api_url` regional variants, and the alert→event severity map.
- `apps/docs/sidebars.ts`: Registered all three new docs pages under the `Connectors` category, plus the six new connector pages from Tracks B–D (`prisma-cloud`, `orca`, `aws-guardduty`, `aws-cloudtrail`, `aws-vpc-flow`, `kubernetes-audit`).
services/connectors/app/connectors/prisma_cloud.py—PrismaCloudConnectorwith full Prisma Cloud (CSPM/CWPP) coverage. JWT auth viaPOST /login, paginatedGET /alert/v1/alertwithtime.from/time.towindowing, severity collapse (critical/high → high,medium → medium,low/informational → low), and acompute_urloverride for self-hosted Compute Edition. Capability:PULL_ALERTS. Manifest:plugins/prisma-cloud/plugin.yaml, docs atapps/docs/docs/connectors/prisma-cloud.md, tests inservices/connectors/tests/test_prisma_cloud.py.services/connectors/app/connectors/orca.py—OrcaConnectorhittinghttps://api.orcasecurity.io/api/alertswith anapi_tokenfield, severity collapse (critical/high/hazardous → high,medium → medium,informational/low → low). Manifest, docs, and tests follow the same pattern. Capability:PULL_ALERTS.
services/connectors/app/connectors/aws_guardduty.py—AWSGuardDutyConnectormirroringAWSSecurityHubConnector's shape: boto3-based, supports IAM-role or static-key auth, callsguardduty.list_findings+get_findingsper detector. Normalises GuardDuty's continuous numeric severity scale (0.1–10.0) into AiSOC's four-tierinfo|low|medium|highladder (>= 7.0 → high,>= 4.0 → medium,>= 1.0 → low, elseinfo). Capability:PULL_ALERTS.services/connectors/app/connectors/aws_cloudtrail.py—AWSCloudTrailConnectorusingcloudtrail.lookup_events. Ships with a curated default allow-list of 21 high-signal event names covering identity abuse (ConsoleLogin,AssumeRoleWithSAML,GetSessionToken,GetFederationToken,CreateAccessKey,CreateLoginProfile,CreateUser), persistence (AttachUserPolicy,PutUserPolicy,CreateRole,AttachRolePolicy), data-plane abuse (PutBucketPolicy,PutBucketAcl,DeleteBucketPolicy,PutObjectAcl), network exposure (AuthorizeSecurityGroupIngress,RevokeSecurityGroupIngress,ModifyDBInstance), and trail tampering (DeleteTrail,StopLogging,UpdateTrail). Allow-list is overridable via theevent_namesconfig field. Pagination handled viaNextTokenwith a hard cap to keep poll latency bounded. Capability:PULL_LOGS.services/connectors/app/connectors/aws_vpc_flow.py—AWSVPCFlowLogsConnectorusingcloudwatch_logs.filter_log_events. Parses both v2 (default 14-field) and v5 (header-defined) flow-log formats. Defaultfilter_patternis?REJECTto surface dropped traffic only — keeps volume manageable while flagging external-facing security groups that are getting scanned. Public-IP heuristic (_is_public_ip) is RFC-5735-aware, treating RFC1918/loopback/link-local/multicast/CGNAT/TEST-NET as private. Severity heuristic: public-IP REJECTs →medium, internal REJECTs →low, ACCEPT-only flows →info. Capability:PULL_LOGS.
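The GuardDuty severity ladder above is a plain threshold cascade. A minimal sketch, using the thresholds stated in the entry and a hypothetical function name (`guardduty_severity`), might look like this:

```python
def guardduty_severity(score: float) -> str:
    """Collapse GuardDuty's numeric severity (0.1 to 10.0) into four tiers."""
    if score >= 7.0:
        return "high"
    if score >= 4.0:
        return "medium"
    if score >= 1.0:
        return "low"
    return "info"
```

Because the checks run from the highest threshold down, each boundary value (7.0, 4.0, 1.0) lands in the higher of its two neighbouring tiers.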
services/connectors/app/connectors/kubernetes_audit.py—KubernetesAuditConnectorshipping with two delivery modes selected via themodeconfig field:webhook(recommended) — Kubernetes API server pushes audit events to AiSOC's new dedicatedPOST /v1/ingest/k8s-audit/{tenant_id}route, authenticated with a shared secret in theX-AiSOC-K8s-Tokenheader (compared in constant time so partial-prefix matches still fail). The legacy/v1/inbox/{token}path with thek8s-audittemplate is kept around as a fallback for control planes that cannot inject custom headers into the audit-webhook kubeconfig.file_tail— AiSOC's connector pod tails a localaudit.logfile using a byte-position cursor (atomically written to a.aisoc-cursorsidecar), with rotation/truncation detection and a hard per-poll byte cap so a backlog can't blow up a single poll cycle.
- `services/ingest/internal/handler/k8s_audit.go`: New Go handler for the dedicated webhook route. Caps body size via `K8S_AUDIT_MAX_BODY_BYTES` (default 16 MiB), rejects oversized batches with `413` so the apiserver shrinks `--audit-webhook-batch-max-size` and retries, and publishes each `EventList.items[]` entry through the existing normalizer + Kafka publisher using `connector_type: kubernetes_audit`. The route is disabled (returns `503`) until an operator sets `K8S_AUDIT_SHARED_SECRET`, so a fresh install never accidentally accepts unauthenticated audit traffic.
- `services/ingest/internal/normalizer/normalizer.go`: Added the `kubernetes_audit` connector profile. Maps `auditID` to `external_id`, `verb` to `activity_name`, `user.username` to `actor.user.name`, `objectRef.{namespace,resource,name}` to a composite `target.resource.name`, and translates the connector's string severity (`critical|high|medium|low|info`) into OCSF integer severities (5/4/3/2/1).
- `services/ingest/internal/normalizer/templates/k8s-audit.yaml`: New inbox template (legacy path) that maps Kubernetes apiserver `Event` payloads (`apiVersion: audit.k8s.io/v1`) onto AiSOC's normalised event shape:
  - `external_id ← auditID`
  - `vendor ← "Kubernetes"`, `product ← "apiserver-audit"`, `category ← "k8s_audit"`
  - `actor ← user.username` (plus `user.groups` carried through metadata)
  - `target ← objectRef.namespace + "/" + objectRef.resource + "/" + objectRef.name`
  - `severity` is derived in the connector's `_classify_severity` heuristic, not in the template, so the same logic applies to both delivery modes.
- Severity heuristic (
_classify_severityinkubernetes_audit.py):high—exec/attach/portforwardon a Pod,createonClusterRoleBinding,impersonateverb,updateonserviceaccounts/token, anyRequestResponseevent whereresponseStatus.code >= 500on a sensitive verb.medium—create/patch/deleteonSecret/ConfigMap/ClusterRole/Role,escalateverb, failed authentication (responseStatus.code == 401|403) on a write verb.low— successful reads on sensitive resources (getonSecret), successful writes on routine resources.info— everything else (health probes, list/watch on benign resources, successful low-impact reads).
- `plugins/kubernetes-audit/plugin.yaml`: Manifest with a 5-field config schema (`mode`, `cluster_name`, `inbox_token`, `audit_log_path`, `cursor_path`), `category: cloud`, capabilities `pull_audit` + `pull_alerts`.
- `apps/docs/docs/connectors/kubernetes-audit.md`: Includes a complete sample `AuditPolicy` (omitStages on RequestReceived for verbosity control; Metadata level for routine reads, RequestResponse for writes on Secret / ConfigMap / ClusterRoleBinding) and a sample `AuditSink` pointing at AiSOC's inbox URL.
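To illustrate how the severity heuristic and the OCSF integer mapping described above fit together, here is a simplified Python sketch. It is not the code in `kubernetes_audit.py`: the precedence order (high, then medium, then low, then info), the contents of `WRITE_VERBS`, `SENSITIVE_VERBS`, and `SENSITIVE_RESOURCES`, and the treatment of `exec` / `attach` / `portforward` as Pod subresources are all assumptions made for the example.

```python
OCSF_SEVERITY = {"critical": 5, "high": 4, "medium": 3, "low": 2, "info": 1}

# Assumptions for this sketch; the connector's real sets are not documented here.
WRITE_VERBS = {"create", "update", "patch", "delete"}
SENSITIVE_VERBS = WRITE_VERBS | {"impersonate", "escalate"}
SENSITIVE_RESOURCES = {"secrets", "configmaps", "clusterroles", "roles"}


def classify_severity(event: dict) -> str:
    """Map one audit.k8s.io/v1 Event (as a dict) to a severity string."""
    verb = event.get("verb", "")
    ref = event.get("objectRef") or {}
    resource = ref.get("resource", "")
    subresource = ref.get("subresource", "")
    code = (event.get("responseStatus") or {}).get("code", 0)

    # high tier
    if resource == "pods" and subresource in {"exec", "attach", "portforward"}:
        return "high"
    if verb == "create" and resource == "clusterrolebindings":
        return "high"
    if verb == "impersonate":
        return "high"
    if verb == "update" and resource == "serviceaccounts" and subresource == "token":
        return "high"
    if event.get("level") == "RequestResponse" and code >= 500 and verb in SENSITIVE_VERBS:
        return "high"

    # medium tier
    if verb in {"create", "patch", "delete"} and resource in SENSITIVE_RESOURCES:
        return "medium"
    if verb == "escalate":
        return "medium"
    if code in (401, 403) and verb in WRITE_VERBS:
        return "medium"

    # low tier
    succeeded = 200 <= code < 300
    if succeeded and verb == "get" and resource in SENSITIVE_RESOURCES:
        return "low"
    if succeeded and verb in WRITE_VERBS:
        return "low"

    return "info"
```

The normalizer-side translation is then a single lookup, for example `OCSF_SEVERITY[classify_severity(event)]`.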
marketplace/index.json+apps/web/public/marketplace/index.json— Rebuilt viapnpm marketplace:sync. Plugin count rose from 43 → 49 (+6 cloud connectors). Total marketplace entries:total=7104 detections=6993 playbooks=62 plugins=49 mitre_techniques=493.apps/web/package.json— Version bumped from7.0.3to7.1.0; the sidebar and landing-page footer both surface the new version automatically.
- 43 unit tests for `KubernetesAuditConnector` covering both delivery modes, cursor persistence, rotation/truncation, byte-cap drain semantics, and the full severity-heuristic decision table.
- 27 unit tests for `AWSVPCFlowLogsConnector` covering v2/v5 parsing, public-IP classification edge cases (RFC1918, CGNAT, TEST-NET-1/2/3), and the default REJECT filter pattern.
- Mirroring tests for `PrismaCloudConnector`, `OrcaConnector`, `AWSGuardDutyConnector`, `AWSCloudTrailConnector` covering schema, normalise, pagination, and auth-error paths.
- Full `services/connectors` suite passes at 364 tests; schema-introspection tests in `services/api` also pass with the six new connectors added to `_CONNECTOR_CLASSES`.
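The public-IP edge cases exercised by the VPC Flow Logs tests can be sketched with Python's standard `ipaddress` module. This is an illustrative reimplementation, not the connector's `_is_public_ip`. The explicit network list mirrors the ranges named in this release (RFC1918, loopback, link-local, multicast, CGNAT, TEST-NET), while treating IPv6 and unparseable values as non-public, and checking both flow endpoints in `flow_severity`, are assumptions of the sketch.

```python
import ipaddress

_NON_PUBLIC_NETS = tuple(
    ipaddress.ip_network(cidr)
    for cidr in (
        "10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16",       # RFC 1918
        "127.0.0.0/8",                                         # loopback
        "169.254.0.0/16",                                      # link-local
        "224.0.0.0/4",                                         # multicast
        "100.64.0.0/10",                                       # CGNAT
        "192.0.2.0/24", "198.51.100.0/24", "203.0.113.0/24",   # TEST-NET-1/2/3
    )
)


def is_public_ip(addr: str) -> bool:
    """True only for IPv4 addresses outside every non-public range above."""
    try:
        ip = ipaddress.ip_address(addr)
    except ValueError:
        return False  # flow logs use "-" for missing fields
    if ip.version != 4:
        return False  # assumption: IPv6 is out of scope for this sketch
    return not any(ip in net for net in _NON_PUBLIC_NETS)


def flow_severity(action: str, srcaddr: str, dstaddr: str) -> str:
    """Public-IP REJECT -> medium, internal REJECT -> low, otherwise info."""
    if action != "REJECT":
        return "info"
    if is_public_ip(srcaddr) or is_public_ip(dstaddr):
        return "medium"
    return "low"
```

Listing the networks explicitly keeps the classification independent of how a given Python version defines `is_private` for ranges such as CGNAT.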
src/components/layout/AppShell.tsx: Wrapped<DemoBanner />in a new<ClientOnly>boundary so the banner (which readsNEXT_PUBLIC_DEMO_MODE) is never server-rendered. This eliminates React hydration error #418 caused by stale env-var inlining producing a structural tree mismatch (server saw<button>from Sidebar, client expected<div>from DemoBanner).src/app/layout.tsx: Addedpreload: falseto theJetBrains_Mononext/font/googleconfig. The monospace font is only used in code blocks and is not needed on the initial paint of most pages, causing Chrome to log "preloaded but not used within a few seconds" warnings. Lazy-loading the font eliminates these warnings without any visible FOUT.
apps/web/package.json: Bumpedversionto7.0.2; sidebar now showsv7.0.2dynamically.apps/web/src/components/landing/Footer.tsx: Replaced hard-codedv6.1.0string with a dynamic import ofpackage.jsonso the landing page footer always reflects the current package version.README.md: Updated version badge to7.0.1; addedosquery-tls(port 8090) andosquery-extensionsentries to the services table, the Swagger-UI URL table, and the directory tree; added osquery TLS server URL to the dev surface table.
- Python: Resolved `py/unused-global-variable` in `credential_vault.py`, `pack_loader.py`, `executive_digest.py`, `case_summary.py`, `cost_dashboard.py`, and `actions/executors/base.py` by refactoring mutable state into dictionaries and exposing identifiers via `__all__`.
- Python: Resolved `py/cyclic-import` between `osquery-tls` modules by extracting `generate_node_key` into a new `app/core/crypto.py` module.
- Python: Resolved `py/empty-except` in `api/main.py` and `api/services/github.py` by replacing bare `pass` blocks with `logger.debug` calls.
- Python: Resolved `py/log-injection` in `github.py`, `detection_proposals.py`, and `llm_credentials.py` by switching log format specifiers to `%r`.
- Python: Resolved `py/clear-text-logging-sensitive-data` in `workers/oauth_refresh.py` by redacting `tenant_id` and sanitising reason strings.
- Python: Resolved `py/incomplete-url-substring-sanitization` in `llm_resolver.py` by using `urllib.parse.urlparse` for hostname extraction.
- Python: Resolved `py/stack-trace-exposure` in `agents/api/explain.py` by returning a generic error string from the exception handler.
- Python: Resolved `py/call/wrong-arguments` in `agents/tests/smoke_explain.py` by importing and passing a `LlmConfig` instance to `_stream_explanation`.
- Python: Resolved `py/unused-import` in `osquery-tls/db/env.py`; fixed `E402` (import ordering) in the same file.
- JavaScript: Resolved `js/unused-local-variable` in `AlertsView.tsx` (removed unused `toast` import) and `SettingsView.byok.test.tsx` (removed unused `within` import).
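The `py/incomplete-url-substring-sanitization` fix is easiest to see side by side with the pattern it replaces. The sketch below is illustrative only: the function names and the example hostnames are hypothetical and are not taken from `llm_resolver.py`. It shows why comparing the parsed hostname is safer than a substring test on the raw URL.

```python
from urllib.parse import urlparse


def naive_is_allowed(url: str, trusted_host: str) -> bool:
    """The flagged pattern: a substring test on the whole URL."""
    return trusted_host in url


def host_is_allowed(url: str, allowed_hosts: set) -> bool:
    """Compare the parsed hostname exactly against an allow-list."""
    # `hostname` is lower-cased and has userinfo and port stripped;
    # it is None when the URL has no network location at all.
    host = urlparse(url).hostname
    return host is not None and host in allowed_hosts
```

For example, with `llm.example.com` as the trusted host, `https://llm.example.com.attacker.test/v1` passes the substring test but is rejected once the hostname is parsed.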
next.config.js: Removed deprecatedeslint.ignoreDuringBuildskey that Next.js 16 no longer accepts in the config file; addedturbopack.rootso Turbopack resolves workspace packages correctly.src/app/layout.tsx: AddedsuppressHydrationWarningto the<html>element so that the render-blockingthemeBootstrapScriptcan freely writedata-theme,data-theme-preference, andstyle.colorSchemeon the client without React reporting a hydration mismatch on every page load.
⚠️ Reconciliation notice (2026-05-12): The work described in this section was developed on branchfeat/pr6-osquery-extensions(commitse0d70fa1→3ab5aa81) but the branch was not merged intomainbefore this changelog entry was written. The files referenced below — includingservices/osquery-tls/,services/connectors/app/connectors/aisoc_direct.py,services/agents/app/playbook/steps/osquery_live_query.py, and the osquery-extensions Go module — exist on that branch and can be reviewed there, but are not present onmainas of v7.1.0 planning. Treat this section as a record of in-flight work pending PR merge, not as shipped functionality. The community-feedback-driven roadmap (docs/community-feedback/2026-05-12/) builds the genericlive_actioninterface (Issue #8) onmaindirectly rather than assuming this section's primitives are in place.
Added — osctrl, FleetDM, aisoc-osquery-tls, aisoc-direct, native osquery detections, live-query playbook step, FIM, custom virtual tables
Six-PR wave that closes #44 ("osctrl connector for fleet-wide osquery telemetry") and significantly extends osquery coverage end to end. Shipped in the v7.0 release window between the v7.0.0 baseline and the v7.0.1 hardening patch.
services/connectors/app/connectors/osctrl.py,fleetdm.py— Two newBaseConnectorsubclasses with fullschema(),validate(),fetch_events(), andnormalize()implementations. Schema-driven setup runs a liveTest connectionround-trip before save; secrets encrypted with the application-layerCredentialVault(Fernet AES-128-CBC + HMAC-SHA256); polling on per-instance schedule viaConnectorScheduler.plugins/osctrl/plugin.yaml,plugins/fleetdm/plugin.yaml— Marketplace manifests mirroring the connector schemas.marketplace/index.jsonregenerated viapnpm marketplace:sync.services/connectors/tests/test_osquery_connectors.py— Schema contract + severity heuristics tests.
detections/endpoint/osquery-*.yaml— 16 osquery detection rules migrated from_quarantine/to the native schema, IDsdet-endpoint-281throughdet-endpoint-296. Coverage spans credential access, persistence, lateral movement, defense evasion, and discovery on macOS, Linux (auditd), and Windows.detections/fixtures/osquery_*.json— Positive / negative test fixtures for every migrated rule, gated by the Detection Validation workflow in CI.
-
services/actions/app/clients/osctrl_client.py,fleetdm_client.py,aisoc_direct_client.py— Production-grade HTTP clients with per-vendor auth, retries, and structured error handling. -
services/actions/app/clients/osquery_allowlist.py— Strict allowlist enforcing only safe SELECT-only queries against approved tables (noATTACH, noINSERT, nopragma_*introspection of secrets). -
services/agents/app/playbook/engine.py::_handle_osquery_live_query— Newosquery_live_querystep type, registered inservices/agents/app/playbook/models.pyasStepType.OSQUERY_LIVE_QUERYand dispatched from theSTEP_HANDLERStable at the bottom ofengine.py. Pushes allowlisted distributed queries to a single host or fleet-wide via osctrl / FleetDM / aisoc-direct with HMAC-signed ChatOps approval before execution. Tests live inservices/agents/tests/test_osquery_live_query_step.py.v7.0.x reconciliation: Earlier drafts of this CHANGELOG referenced a separate module at
services/agents/app/playbook/steps/osquery_live_query.py. That module never landed onmain— the handler is inlined inengine.pyto keep the playbook engine's dispatch table in one place. The behaviour, tests, and CLI surface are identical to the originally documented design.
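As a rough illustration of the SELECT-only allowlist idea, the sketch below leans on SQLite's authorizer hook from Python's standard `sqlite3` module rather than hand-written parsing (osquery's SQL dialect is SQLite-based). This is not the logic in `osquery_allowlist.py`, which is not shown in this changelog; the table names, column lists, and the function name `is_allowed_query` are hypothetical.

```python
import sqlite3

# Hypothetical allowlist: table name -> columns exposed to live queries.
APPROVED_TABLES = {
    "processes": ("pid", "name", "path"),
    "listening_ports": ("pid", "port", "address"),
}


def is_allowed_query(sql: str) -> bool:
    """True only for a single SELECT that reads approved tables."""
    if not sql.lstrip().lower().startswith("select"):
        return False  # cheap pre-filter: ATTACH, INSERT, PRAGMA never reach SQLite
    conn = sqlite3.connect(":memory:")
    try:
        # Empty stub tables: anything not approved fails with "no such table".
        for table, cols in APPROVED_TABLES.items():
            conn.execute(f"CREATE TABLE {table} ({', '.join(cols)})")
        seen = set()

        def authorizer(action, arg1, arg2, dbname, source):
            if action in (sqlite3.SQLITE_SELECT, sqlite3.SQLITE_FUNCTION):
                return sqlite3.SQLITE_OK
            if action == sqlite3.SQLITE_READ and arg1 in APPROVED_TABLES:
                seen.add(arg1)
                return sqlite3.SQLITE_OK
            return sqlite3.SQLITE_DENY  # deny by default (pragma_*, writes, etc.)

        conn.set_authorizer(authorizer)
        conn.execute(sql)  # runs against empty stubs, so it returns no data
        return bool(seen)  # require at least one approved table to be read
    except (sqlite3.Error, sqlite3.Warning):
        return False  # syntax errors, denials, multiple statements
    finally:
        conn.close()
```

Letting SQLite resolve table references avoids the usual gaps in regex-based checks (comma joins, subqueries, quoting), at the cost of keeping stub schemas in sync with the real tables.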
-
services/osquery-tls/— New first-party FastAPI service exposing/api/v1/enroll,/api/v1/config,/api/v1/log,/api/v1/distributed/read,/api/v1/distributed/write, plus/api/v1/fimfor file-integrity events. Self-hosted osquery TLS plugin endpoints are FleetDM-compatible so any off-the-shelf osquery agent can enroll without a third-party SaaS hop. Uses dedicated SQLite + Alembic migrations underservices/osquery-tls/db/. -
services/osquery-tls/app/api/v1/endpoints/log.py+ matchingplugins/aisoc-direct/plugin.yamlandservices/actions/app/clients/aisoc_direct_client.py— Direct-from-agent ingest path that consumes the osquery-tls log stream and normalises into the standard alert schema; bypasses third-party SaaS entirely. Theaisoc-directconnector is implemented as a virtual connector: agents push events directly into/api/v1/logon the osquery-tls service, which fans them out to the same ingest pipeline the polled connectors use. The marketplace manifest lives atplugins/aisoc-direct/plugin.yaml; the outbound client (used by playbooks to drive distributed queries) lives atservices/actions/app/clients/aisoc_direct_client.py.v7.0.x reconciliation: Earlier drafts of this CHANGELOG referenced a polled connector module at
services/connectors/app/connectors/aisoc_direct.py. That module never landed onmain. The connector is implemented as a push-based virtual connector (theosquery-tlsservice is itself the ingest endpoint), so there is nothing to register inservices/connectors/app/connectors/__init__.py. Functionally the data path is identical to the originally documented design.
services/osquery-tls/app/osquery_packs/— Bundled IR / OSquery-ATT&CK / FIM packs distributed to every enrolled agent on enrollment. Pack loader preserves hand-crafted playbooks underpack root(do notrmtree).services/osquery-tls/app/api/v1/endpoints/fim.py— File-integrity monitoring endpoint. Ingestsfile_eventsand synthesises alerts on writes to/etc/passwd,/etc/shadow, sshd configs, sudoers, and Windows registry hives. FIM-specific detection IDsdet-endpoint-297..300(renumbered from 281–284 to avoid collision with osquery-macos rules).apps/web/src/components/dashboard/FimDashboard.tsx— New dashboard panel grouping FIM events by host, file, and severity.
services/osquery-extensions/tables/— 5 custom Go-based virtual tables shipping with the agent for richer endpoint visibility plus a bidirectional response channel:aisoc_browser_extensions— installed browser extensions across Chrome, Firefox, Edge, Safari profiles.aisoc_kernel_modules— currently loaded kernel modules with signing / tainting state.aisoc_attck_persistence— MITRE ATT&CK persistence locations (LaunchAgents, scheduled tasks, systemd units, Run keys).aisoc_pending_actions— pending response actions queued for the agent; enables host → server → host bidirectional flow.aisoc_alert_cache— local cache of alerts the agent has emitted, for deduplication and replay.
services/osquery-extensions/tables/pending_actions_test.go— Unit tests for the bidirectional action queue.docs/openapi.yamlregenerated to include the extensions API endpoints.
- CI: Detection Validation workflow now covers the 16 migrated osquery rules; Python Tests, Web Build, and the osquery-tls service build are all green.
- Lint:
ruff formatandruff check --fixapplied across the newosquery-tlsservice; F401 / UP017 / UP037 / I001 / W291 cleared. - Marketplace:
apps/web/public/marketplace/curated.jsonre-synced frommarketplace/after the new connector / plugin manifests landed.
This release ships the complete v1.0 buyer-value plan across 16 workstreams. All items were designed, implemented, tested, and reviewed by Beenu Arora <beenu@cyble.com>.
services/slack-bot/— New standalone FastAPI service usingslack-boltasync adapter. Ships/aisoc triage <case_id>,/aisoc approve <action_id>,/aisoc status <case_id>, and/aisoc summary <case_id>slash commands. Interactive approval buttons route back through the API approval endpoint so human-in-the-loop gates work from Slack without opening the console.- 61 pytest cases cover the slash-command handlers, interactive payloads, API client calls, and error paths (bad token, non-200 API response, missing case).
services/api/app/services/digest_pdf.py— Generates a branded A4 PDF forExecutiveDigestobjects using ReportLab. Includes cover page, KPI tiles, alert-volume chart, top-rule table, top-actor table, and remediation summary.services/api/app/workers/weekly_digest_task.py— APScheduler task that runs every Monday at 06:00 UTC, builds a digest for every active tenant, and delivers it viaPOST /api/v1/reports/digest/emailor writes it to blob storage. Controlled byDIGEST_SCHEDULE_ENABLEDenv flag.services/api/app/services/digest_html.py— HTML mirror of the PDF for in-browser preview.services/api/tests/test_digest_pdf.py— 12 pytest cases covering PDF generation, chart rendering, and weekly scheduler triggering.
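The "every Monday at 06:00 UTC" schedule can be checked with a few lines of standard-library date arithmetic. The sketch below is only an illustration of the trigger time; it does not use or reproduce the APScheduler configuration in `weekly_digest_task.py`, and the function name `next_weekly_run` is hypothetical.

```python
from datetime import datetime, timedelta, timezone


def next_weekly_run(now: datetime) -> datetime:
    """Next Monday 06:00 UTC strictly after `now` (must be timezone-aware)."""
    now = now.astimezone(timezone.utc)
    # Start from 06:00 today, then step back to this week's Monday.
    candidate = now.replace(hour=6, minute=0, second=0, microsecond=0)
    candidate -= timedelta(days=now.weekday())  # Monday is weekday 0
    if candidate <= now:
        candidate += timedelta(days=7)
    return candidate
```

Normalising to UTC first means a caller in any local timezone gets the same trigger instant.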
apps/web/src/components/playbooks/PlaybooksGallery.tsx— Tabbed gallery with 12 curated packs (Phishing, Ransomware, BEC, IAM Key Compromise, …). Each card shows TTP coverage badges, author, version, and a one-click Import button that callsPOST /api/v1/playbooks/import.services/api/migrations/039_detection_proposal_github_pr.sql— Addsgithub_pr_url TEXTandgithub_pr_number INTtodetection_proposals.services/api/app/services/github.py—GitHubServicecreates draft PRs against the tenant's detection repo when a detection proposal is promoted. Supports GHES and github.com viaGITHUB_API_URLenv var.- 25 playbook YAML templates added under
detections/playbooks/and 12 pre-built playbook packs underplaybooks/packs/v1/.
apps/web/src/components/settings/SettingsView.tsx— New "AI / LLM" settings panel: provider picker (OpenAI, Azure OpenAI, Anthropic, Ollama), API-key input, model selector, temperature slider, and connection test button.apps/web/src/components/settings/SettingsView.byok.test.tsx— 12 Vitest tests covering form rendering, provider switching, key masking, connection test success/error paths, and save confirmation.
apps/web/src/components/copilot/InvestigationTimeline.tsx— 684-line React component that renders the investigation ledger as a playable timeline. Each step shows the agent name, tool call, rationale, duration, and status badge. A scrubber lets analysts replay from any step.
services/api/app/services/case_summary.py— LLM-powered case summariser (structured output via function-calling). ProducesCaseSummaryResultwithheadline,severity_rationale,recommended_action, andevidence_links.services/api/app/services/case_summary_html.py— HTML renderer for the summary, used by the PDF exporter and the in-browser case card.
apps/web/src/components/theme/ThemeProvider.tsx— Theme preference (light|dark|system) stored inlocalStorageand synced toPATCH /api/v1/users/me/preferences. Survives logout and device switch.
apps/web/src/test/a11y.test.tsx— 55-line axe-core test suite. RendersAlertsView,CasesView,PlaybooksView,DashboardView, and 3 modal components; fails the build if any WCAG 2.1 AA violation is found.- Sidebar landmark roles, ARIA labels, focus trapping in modals, skip-navigation link, and colour-contrast fixes applied across the entire component tree.
apps/web/src/components/dashboard/DashboardView.tsx— Dashboard is now fully composable: widgets can be dragged, dropped, resized, pinned, and removed. Layout serialised toPOST /api/v1/saved-views.services/api/app/api/v1/endpoints/saved_views.py— CRUD for per-user saved views (dashboard layout, column configs, active filters).
services/threatintel/app/actors/attribution.py— NewThreatActorAttributionEnginescores observed IOCs, MITRE ATT&CK techniques, tools, and target sectors against an in-memory catalog of three seed actor profiles (APT28, APT29, Lazarus). Scoring is the weighted sum of TTP (0.4) / Tool (0.3) / Target (0.2) / IOC (0.1) components, multiplied by the actor profile's baseline confidence, then thresholded.services/threatintel/app/api/actor_attribution.py— New router mounted at/api/v1/actorswithPOST /attribute,GET /profiles, andGET /profiles/{actor_id}. Constructs the engine once via FastAPI lifespan and passes it throughDepends(get_attribution_engine).services/agents/app/agents/investigation_agent.py— Investigation agent now callsPOST /actors/attributeand surfaces attribution results in the investigation ledger.docker-compose.airgap.yml— Compose override for fully disconnected deployments: disables all external feed pullers, enables Ollama sidecar, and setsAIRGAP_MODE=trueso the API switches to local-only LLM routing.apps/docs/docs/operations/air-gapped.md— Step-by-step air-gap deployment guide: image pre-pulling, Ollama model loading, threat-feed pre-seeding, and smoke-test checklist.
services/api/app/api/v1/endpoints/mssp.py— NewGET /mssp/tenantsaggregation endpoint: per-child tenant alert counts, open case counts, SLA breach rate, and last-seen connector heartbeat.services/api/app/models/tenant.py— Addedparent_tenant_idandmssp_rolecolumns supporting the parent-child tenant hierarchy.
- `services/api/app/api/v1/endpoints/llm_credentials.py`: CRUD for per-tenant LLM credential records. Secrets encrypted at rest via `CredentialVault`.
- LLM routing layer (`services/api/app/core/config.py`) reads per-tenant credentials before falling back to the platform-wide key.
apps/web/src/components/analytics/TeamAnalyticsView.tsx— Analyst leaderboard with MTTR per analyst, alert disposition accuracy, cases closed per shift, and false-positive rate trend over the selected window.
services/api/app/api/v1/endpoints/llm_status.py— Reports whether the deployment is running in air-gap mode and which local models are available via the Ollama sidecar. Used by the settings UI to auto-populate the model picker.
- Ruff `E501`/`W291`/`W293`/`B007`/`B017`/`F821`/`I001` violations in `services/api`.
- `mypy` errors across all 16 plan-modified files: `RowMapping` import, `Optional` list `len()`, `current_user.user_id` rename, `fetchone()` None checks, `sort_key` return type, `PYTHONPATH` subprocess handling.
- Converted structlog-style `logger.info(key=value)` calls to stdlib formatting in `rule_engine.py`, `neo4j.py`, and `digest_pdf.py`.
- SQLAlchemy relationship `name-defined` mypy errors suppressed with `# type: ignore[name-defined]` in `tenant.py` and `connector.py`.
The /api/v1/actors/* endpoints are reachable on the threatintel
service without RBAC enforcement in v0 — they assume cluster-internal
network reachability only. Do not expose them through public
ingress until a Depends(require_permission(...)) guard is added.
Tracked as a known limitation in the docs.
- `services/threatintel/app/actors/attribution.py`: New `ThreatActorAttributionEngine` scores observed IOCs, MITRE ATT&CK techniques, tools, and target sectors against an in-memory catalog of three seed actor profiles (APT28, APT29, Lazarus). Scoring is the weighted sum of TTP (0.4) / Tool (0.3) / Target (0.2) / IOC (0.1) components, multiplied by the actor profile's baseline confidence, then thresholded.
- `services/threatintel/app/api/actor_attribution.py`: New router mounted at `/api/v1/actors` with `POST /attribute`, `GET /profiles`, and `GET /profiles/{actor_id}`. Constructs the engine once via FastAPI lifespan and passes it through `Depends(get_attribution_engine)`.
- `services/agents/app/agents/investigation_agent.py`: Investigation agent now calls the attribution API after triage/enrichment and records the result on `state.threat_intel["attribution"]`. Failure is soft and surfaces a `[medium]` finding rather than aborting the investigation.
- `docs/threat-actor-attribution.md`: Full operator-facing docs, including scoring model, API surface, observability, env vars, v0 caveats, and instructions for adding custom profiles.
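
The scoring model described above can be sketched roughly as follows. This is an illustration, not the engine's code: the function names, the assumption that each component is already normalised to `[0, 1]`, and the `>=` comparison against the threshold are hypothetical. Only the weights, the baseline-confidence multiplier, the `0.30` default, and the `"unknown"` fallback come from this changelog.

```python
# Component weights as listed in the changelog entry.
WEIGHTS = {"ttp": 0.4, "tool": 0.3, "target": 0.2, "ioc": 0.1}
DEFAULT_THRESHOLD = 0.30


def attribution_score(components: dict, baseline_confidence: float) -> float:
    """Weighted sum of component scores, scaled by the profile's baseline confidence.

    Assumption: each component value is a match ratio in [0, 1].
    """
    weighted = sum(WEIGHTS[name] * components.get(name, 0.0) for name in WEIGHTS)
    return weighted * baseline_confidence


def attribute(actor_id: str, components: dict, baseline_confidence: float,
              threshold: float = DEFAULT_THRESHOLD) -> str:
    """Return the actor ID if its score clears the threshold, else "unknown"."""
    score = attribution_score(components, baseline_confidence)
    return actor_id if score >= threshold else "unknown"


# Hypothetical observation: full TTP overlap, half the tools, nothing else.
observed = {"ttp": 1.0, "tool": 0.5, "target": 0.0, "ioc": 0.0}
print(attribute("APT29", observed, baseline_confidence=0.8))
```

Because TTP overlap carries the largest weight, a profile with strong technique overlap can clear the threshold even with no IOC hits, while IOC matches alone (weight 0.1) cannot.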
- `AISOC_ATTRIBUTION_THRESHOLD`: Override the default confidence threshold (`0.30`). Clamped to `[0.0, 1.0]`; invalid values fall back to the default and emit a warning.
- `AISOC_THREATINTEL_URL`: Base URL the agent uses to reach the `threatintel` service. Default: `http://threatintel:8083`.
- `AISOC_ATTRIBUTION_TIMEOUT_SECONDS`: HTTP timeout the agent uses for attribution calls. Default: `10`.
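
One way the threshold override could be parsed is sketched below. The helper name `load_threshold`, the logger name, and the choice to treat `NaN` as invalid are assumptions made for illustration; the clamp range, the default, and the warn-and-fall-back behaviour are taken from the description above.

```python
import logging
import math
import os

logger = logging.getLogger("attribution.config")  # hypothetical logger name
DEFAULT_THRESHOLD = 0.30


def load_threshold(env=None) -> float:
    """Read AISOC_ATTRIBUTION_THRESHOLD, clamp it to [0.0, 1.0], or fall back."""
    env = os.environ if env is None else env
    raw = env.get("AISOC_ATTRIBUTION_THRESHOLD")
    if raw is None or not raw.strip():
        return DEFAULT_THRESHOLD
    try:
        value = float(raw)
    except ValueError:
        value = math.nan
    if math.isnan(value):
        # Invalid input: warn and use the default rather than failing startup.
        logger.warning("invalid AISOC_ATTRIBUTION_THRESHOLD %r, using %.2f",
                       raw, DEFAULT_THRESHOLD)
        return DEFAULT_THRESHOLD
    # Out-of-range but numeric values are clamped, not rejected.
    return min(1.0, max(0.0, value))
```

Passing a plain dict as `env` keeps the helper easy to exercise in tests without touching the process environment.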
- New Prometheus series exported by `threatintel`:
  - `threatintel_attribution_requests_total{result="matched|unknown|error"}`
  - `threatintel_attribution_score{actor_id}` (histogram)
- Tool matching uses an alphanumeric-only boundary regex (`(?<![a-zA-Z0-9])tool(?![a-zA-Z0-9])`) instead of Python's `\b`. Python's `\b` treats `_` as a word character, which broke common malware-filename patterns like `miniduke_v3.dll`. The new boundary treats `_`, `-`, `.`, and `/` as delimiters while still rejecting alphanumeric neighbours (so `x-agent` does not match `x-agentic`).
- Tool matching now also scans the IOC's `description` and `tags` fields, not just `value`.
- IOC lookups go through a new public method, `OpenSearchStore.match_ioc_values()`, rather than reaching into `os_store._os.search()` directly.
- The attribution engine accepts a `catalog` constructor argument so tests and downstream services can inject custom profiles without monkey-patching module-level state.
- An empty catalog now resolves to `actor_id="unknown"` with explicit reasoning ("Actor catalog is empty"), instead of confusingly falling through to the no-match-above-threshold branch.
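
The difference between `\b` and the alphanumeric-only boundary can be reproduced with the standard `re` module. The helper name `tool_pattern` is hypothetical, and case handling is left out because the changelog does not specify it; the lookbehind/lookahead pair is the one quoted above.

```python
import re


def tool_pattern(tool: str) -> re.Pattern:
    """Match `tool` only when it is not flanked by ASCII letters or digits."""
    return re.compile(rf"(?<![a-zA-Z0-9]){re.escape(tool)}(?![a-zA-Z0-9])")


# `\b` fails here: "_" counts as a word character, so there is no boundary
# between "miniduke" and "_v3".
assert re.search(r"\bminiduke\b", "miniduke_v3.dll") is None

# The alphanumeric-only boundary treats "_" as a delimiter and matches.
assert tool_pattern("miniduke").search("miniduke_v3.dll") is not None

# Alphanumeric neighbours are still rejected.
assert tool_pattern("x-agent").search("x-agentic") is None
assert tool_pattern("x-agent").search("dropper/x-agent.exe") is not None
```

`re.escape` keeps tool names containing regex metacharacters (such as `-` or `.`) from changing the pattern's meaning.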
The /api/v1/actors/* endpoints are reachable on the threatintel
service without RBAC enforcement in v0 — they assume cluster-internal
network reachability only. Do not expose them through public
ingress until a Depends(require_permission(...)) guard is added.
Tracked as a known limitation in the docs.
A review of G2, Gartner Peer Insights, and customer feedback on AI SOC / SIEM / SOAR platforms drove this release. Five new agents, eight new console pages, four new API surfaces, and ten new connectors landed at once. Connector catalog goes from 16 → 26.
auto_triage_agent.py— Master triage agent classifies each incoming alert astrue_positive/false_positive/benignwith a confidence score. Low-confidence noise auto-closes; everything else escalates with rationale.phishing_agent.py— Specialised phishing triage: header analysis, URL reputation, attachment sandboxing summary, sender-domain trust.identity_agent.py— Identity-centric reasoning: impossible travel, privilege escalation, MFA bypass, and session-token anomaly classification.cloud_agent.py— Cloud posture / threat reasoning across AWS, Azure, GCP, and Kubernetes signals.insider_threat_agent.py— Behavioural deviation, peer-group scoring, exfiltration intent classification.- All five are exposed via
POST /api/v1/agents/triage.
/investigate— Conversational, multi-turn copilot anchored on a case; reads its evidence, ledger, and entity graph for grounded follow-up Q&A. Component:copilot/InvestigationChat.tsx./coverage-advisor— Ranks MITRE ATT&CK technique gaps by adversary prevalence and recommends rules to close them. Component:coverage/CoverageAdvisorView.tsx./shifts— Outgoing/incoming analyst handoff dashboard: active cases, in-flight investigations, queued approvals on one screen. Component:shifts/ShiftsView.tsx./easm— External Attack Surface Management: discovers public assets, exposed services, and certificate-expiry risks. Component:easm/EASMView.tsx./mssp— MSSP executive dashboard: KPIs, cross-tenant alert volume, and per-customer SLA posture. Component:mssp/MSSPDashboardView.tsx./noise-tuning— Per-rule false-positive rate, suppression candidates, one-click tuning. Component:noise/NoiseTuningView.tsx./analytics/team— Analyst leaderboard, MTTR per analyst, dispositions accuracy, and shift workload balance. Component:analytics/TeamAnalyticsView.tsx.
shifts.py— Shift-handoff CRUD: list active shifts, post handoff notes, view queued approvals scoped to a shift window.stix_taxii.py— STIX 2.1 / TAXII 2.1 publishing; pushes the tenant's IOCs and threat-actor profiles to upstream / community feeds.compliance.py— Automated compliance evidence collection for SOC 2, ISO 27001, NIST CSF, PCI-DSS, HIPAA, and DORA. One-click evidence pull.deployment.py— Deployment / air-gap toggles; tenants that disallow external feeds can flip air-gap mode here.
EDR / XDR: sentinelone.py, cortex_xdr.py. Cloud security: wiz.py,
snyk.py. Network: zscaler.py. SaaS / email: proofpoint.py,
servicenow.py, jira.py. Identity: 1password.py, duo_security.py.
All ten registered in services/connectors/app/connectors/__init__.py,
all ship a marketplace manifest under plugins/<id>/plugin.yaml, all
collapse vendor severity to the standard four-tier ladder.
- AI-generated incident reports — Every case now has a one-click "Export Report" button that generates a PDF incident report from the Investigation Ledger.
- Air-gap deployment configuration — Per-tenant toggles disable external feeds (threat intel, marketplace sync, push notifications) for fully air-gapped deployments.
- Connector catalog count 16 → 26. Landing page hero stat, layout SEO metadata, and `apps/docs/docs/connectors/index.md` updated to reflect the new count.
- `apps/docs/docs/architecture.md` adds a v1.5 section and updates the service-responsibilities table to include the new API surfaces and autonomous agents.
- `apps/docs/docs/intro.md` updated to mention the new connector count and v1.5 features.
- Footer release link now points at `v6.1.0`.
- Log-injection mitigation (`services/api/app/api/v1/endpoints/connectors.py`): `connector_type` originates from user-supplied query parameters and was previously logged verbatim, leaving an injection path for newlines/control characters into structured log records. A character-allowlist reconstructor (`_safe_connector_type`) now strips every character outside `[a-zA-Z0-9_\-]` before the value reaches any log call, breaking CodeQL's taint trace (alert `py/log-injection`).
- Remove dead rate-limiter code (`services/realtime/src/index.ts`): The hand-rolled `makeRateLimiter` function was superseded by `express-rate-limit` in the previous release but not removed, leaving dead code that masked the effective rate-limiting path. The function is now deleted; `express-rate-limit` is the sole limiter in production (resolves CodeQL alert `js/unused-local-variable`).
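
A character-allowlist reconstructor of the kind described in the log-injection item might look like the sketch below. The function name `safe_connector_type` is a stand-in for the real `_safe_connector_type`, whose implementation is not shown in this changelog; only the allowed character set is taken from the entry.

```python
import string

# Allowed set from the changelog entry: [a-zA-Z0-9_\-]
_ALLOWED = frozenset(string.ascii_letters + string.digits + "_-")


def safe_connector_type(value: str) -> str:
    """Rebuild the value from allowlisted characters only.

    Building a new string character by character (rather than passing the
    original through) means newlines and control characters never reach
    the log call.
    """
    return "".join(ch for ch in value if ch in _ALLOWED)


# A forged second log line collapses into a single harmless token.
print(safe_connector_type("crowdstrike\nINFO admin login ok"))
```

An allowlist is preferable to a denylist here because any character not explicitly permitted is dropped, including ones nobody thought to block.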
-
MSSP / parent-tenant console (
services/api/migrations/012_mssp_console.sql,services/api/app/models/mssp.py,services/api/app/api/v1/endpoints/mssp.py) — Parent tenants can onboard child tenants, manage cross-tenant delegations, add per-tenant notes, and view an aggregated metrics rollup in a single pane. -
Asset inventory + vuln-to-alert correlation (
services/api/migrations/013_asset_inventory.sql,services/api/app/models/asset.py,services/api/app/api/v1/endpoints/assets.py) — CRUD for discovered assets with vulnerability findings auto-correlated to alerts. Surfaces asset blast radius and enables asset-context enrichment during triage. -
Insider threat module (
services/api/migrations/014_insider_threat.sql,services/api/app/models/insider_threat.py,services/api/app/api/v1/endpoints/insider_threat.py) — User risk profiles, behavioural indicators, peer-group deviation scoring, and watchlist management. Risk scores update incrementally as new indicators arrive. -
L0–L4 auto-remediation maturity tiers (
services/api/migrations/015_remediation_maturity.sql,services/api/app/models/remediation.py,services/api/app/api/v1/endpoints/remediation.py,services/actions/app/services/maturity.py) — Per-tenant configuration of remediation autonomy from L0 (manual only) through L4 (fully autonomous). Gate log records every approve/block decision. Per-action whitelist pre-approves low-risk actions regardless of tier.
-
Internal threat intelligence (
services/api/migrations/016_threat_intel.sql,services/api/app/models/threat_intel.py,services/api/app/api/v1/endpoints/threat_intel.py) — IOC harvesting from alert history, threat actor and campaign profiles, and STIX/TAXII feed subscription management, all queryable via the REST API. -
Cloud security posture management (CSPM/KSPM) (
services/api/migrations/017_cspm.sql,services/api/app/models/posture.py,services/api/app/api/v1/endpoints/posture.py) — Ingests posture findings from cloud providers, tracks drift between scan runs, and surfaces a per-provider posture summary with suppress/resolve workflows. -
Identity-centric correlation graph (
services/api/migrations/018_identity_graph.sql,services/api/app/models/identity_graph.py,services/api/app/api/v1/endpoints/identity_graph.py) — Graph of users, devices, service accounts, and roles with typed relationship edges. Alerts link to identity nodes, enabling blast-radius queries and attack-path reconstruction. -
Auto-generated board reports (
services/api/migrations/019_board_reports.sql,services/api/app/models/report.py,services/api/app/api/v1/endpoints/reports.py) — Report templates and scheduled generation of PDF/HTML executive summaries. Artefacts are stored, versioned, and deliverable via email or webhook.
-
Dashboard metrics API (
services/api/app/api/v1/endpoints/metrics.py) —/api/v1/metrics/dashboardaggregates alert KPIs, case counts, connector source stats, top MITRE tactics, 24-hour alert trend, and threats-by-source for the frontend dashboard tiles./api/v1/metrics/alerts/trendsupports1h / 24h / 7d / 30dperiod buckets. -
Tailscale connector (
services/connectors/app/connectors/tailscale.py) — Pulls audit logs and policy-file change events from the Tailscale API with OAuth client-credential and API-key auth, cursor-based pagination, and four-tier severity mapping. -
AWS GuardDuty credential-exfiltration detection (
detections/cloud/aws-guardduty-instance-credential-exfiltration.yaml) — Sigma rule covering EC2 instance credential exfiltration viaUnauthorizedAccess:IAMUser/InstanceCredentialExfiltration.
This pass turns connectors from a hardcoded, code-edit-only feature into a runtime, schema-driven, click-and-connect surface — and lights up nine new cloud / SaaS / VCS sources (Microsoft Entra, Azure Activity, Defender XDR, GCP Cloud Audit, GCP SCC, Microsoft 365 audit, Google Workspace, Cloudflare, GitHub) on top of the original CrowdStrike / Splunk / AWS Security Hub / Okta / Microsoft Sentinel set.
CredentialVault(services/api/app/security/credential_vault.py,services/connectors/app/security/credential_vault.py) — Fernet (AES-128-CBC + HMAC-SHA256) wrapper forauth_configJSON, keyed off the newAISOC_CREDENTIAL_KEYenv var. SupportsMultiFernetrotation viaAISOC_CREDENTIAL_KEY_ROTATION_FROM. Theservices/connectorsread-path mirror decrypts only; writes always go through the API service. Documented in docs/operations/credentials.- Self-describing connector schemas (
services/connectors/app/connectors/base.py) —BaseConnectorgained aField/OAuthHints/ConnectorSchematrio and an abstractschema()classmethod. Each connector class is now the source of truth for its ownname,connector_category, fields (text / secret / select / textarea / oauth), default poll interval, and hosted-OAuth roadmap hints. The hardcoded dict inservices/connectors/app/api/router.pyis gone — schema responses come from the registry built inservices/connectors/app/connectors/__init__.py. /api/v1/connectorsCRUD endpoints (services/api/app/api/v1/endpoints/connectors.py,services/api/app/schemas/connector.py) —GET /catalog,POST /test,GET / POST / PATCH / DELETE /instances,POST /instances/{id}/test. Tenant-scoped via the existing auth dependency, secrets encrypted on write through the vault, and proxied to the connectors microservice for schema lookups and liveTest connectioncalls.ConnectorScheduler(services/connectors/app/scheduler.py) — APScheduler in-process insideservices/connectors, started in the FastAPI lifespan. One job per enabled instance, pollsfetch_alerts(since_seconds=300)every 5 min by default (connector_config.poll_interval_secondsoverrides per instance), decrypts via the read-path vault, normalizes events through the connector'snormalize()method, and pushes the batch toservices/ingest/v1/ingest/batchvia the newIngestClient. SetAISOC_CONNECTORS_DISABLE_SCHEDULER=1to skip wiring the scheduler in tests.- Nine new connectors in
services/connectors/app/connectors/:azure_entra(Microsoft Graph audit logs),azure_activity(ARM Activity Log via Resource Graph + blast-radius_HIGH_BLAST_RADIUS_VERBSlist),azure_defender(Microsoft Graph Security alerts),gcp_cloud_audit(Cloud Logging API with hand-rolled RS256 JWT signing for service-account auth),gcp_scc(Security Command Center findings, same JWT signer),m365_audit(Office 365 Management Activity API, sharing the Azure AD app fromazure_entra),google_workspace(Reports API with domain-wide delegation),cloudflare(Audit Logs), andgithub(Org Audit Log + Code Scanning alerts). Every connector ships unit tests covering schema contract, normalization, andtest_connection()happy/sad paths (services/connectors/tests/test_*_connectors.py,test_schemas.py,test_scheduler.py). - Frontend click-and-connect wizard
(
apps/web/src/components/connectors/AddConnectorModal.tsx,ConnectorInstanceList.tsx, rewiredConnectorsView.tsx, typed client inapps/web/src/lib/api.ts) — two-step modal: (1) catalog grid grouped by category, (2) schema-driven form withtext/secret/select/textareafields, an inlineTest connectionbutton, and aSave & enableaction.framer-motionfor transitions,react-hot-toastfor feedback. Existing connector cards now render from the live API via SWR. - Marketplace + plugin manifests —
plugins/{azure-entra, azure-activity, azure-defender, gcp-cloud-audit, gcp-scc, m365-audit, google-workspace, cloudflare, github}/plugin.yamlcarry the newschema()shape soscripts/build_marketplace.pycan surface them in the in-app Marketplace, andapps/web/public/marketplace/index.jsonis regenerated viapnpm marketplace:sync. - Documentation —
apps/docs/docs/connectors/index.md(catalog landing with a connector walkthrough and category taxonomy), nine per-connector setup walkthroughs (prereqs, scopes, screenshots),apps/docs/docs/operations/credentials.md(vault threat model, key rotation procedure, hosted-OAuth roadmap), and a newConnectorssection inapps/docs/sidebars.ts.
- `services/api/app/core/config.py`: added `AISOC_CREDENTIAL_KEY`, `AISOC_CREDENTIAL_KEY_ROTATION_FROM`, `CONNECTORS_SERVICE_URL`, `CONNECTORS_SERVICE_TIMEOUT_SECONDS`. Documented in `.env.example`.
- `services/api/app/main.py`: the new `/api/v1/connectors` router is mounted alongside the existing v1 router set.
- `services/connectors/app/api/router.py`: schema responses look up the registry instead of returning a hardcoded dict; new `POST /connectors/{connector_id}/test` endpoint runs an unauthenticated dry-run `test_connection()` for the wizard's pre-save Test step.
- `services/connectors/app/main.py`: the FastAPI lifespan now wires the scheduler, with `AISOC_CONNECTORS_DISABLE_SCHEDULER` honored for tests and CI.
Before this pass: adding a connector meant editing Python in three places,
shipping a release, and reading docs to discover the auth fields. Secrets
sat in plain JSON in Postgres. After this pass: connectors are runtime
data; secrets are encrypted with a key the operator controls; rotation
is a documented procedure; the wizard's Test connection round-trip
catches bad credentials before they're saved; and the per-connector docs
each give an analyst a 5-minute path from "I have a tenant" to "alerts
are flowing into the console."
This pass addresses two questions raised on the public launch thread about the v5.2 eval harness:
- "Any interest in shipping synthetic telemetry (M365 audit, CloudTrail, Sysmon) backing each incident?" Yes. A companion `synthetic_telemetry.jsonl` corpus is now generated alongside `synthetic_incidents.json` and gives connector and Sigma PRs a concrete contract to wire against without provisioning a real tenant.
- "INC-EVAL-044, 099, and 154 are the same template with `{user}`/`{host}` swapped: what does the multiplier buy vs. the dilution in regression signal?" The multiplier still buys breadth for connector regressions, but the eval suites now report a per-template macro alongside the per-case mean so a single broken template (~4 cases) moves the regression signal by ~1.8% rather than ~0.5%, and the failing template IDs are surfaced inline.
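
The distinction between the two aggregates can be shown on toy data. The sketch below is not the harness code: the function names, the template IDs, and the ten-case dataset are made up for illustration. It only demonstrates that a per-template macro weights every template equally, so a broken template backing few cases is not averaged away by a large one.

```python
from collections import defaultdict
from statistics import mean


def per_case_mean(results):
    """Plain mean over every (template_id, score) pair."""
    return mean(score for _, score in results)


def per_template_macro(results):
    """Mean of per-template means: each template counts once."""
    by_template = defaultdict(list)
    for template_id, score in results:
        by_template[template_id].append(score)
    return mean(mean(scores) for scores in by_template.values())


def failing_templates(results, floor=1.0):
    """Template IDs whose own mean falls below `floor` (hypothetical cut-off)."""
    by_template = defaultdict(list)
    for template_id, score in results:
        by_template[template_id].append(score)
    return sorted(t for t, s in by_template.items() if mean(s) < floor)


# Toy data: a large healthy template and a small broken one.
results = [("template_a", 1.0)] * 8 + [("template_b", 0.0)] * 2

print(per_case_mean(results))       # 0.8
print(per_template_macro(results))  # 0.5
print(failing_templates(results))   # ['template_b']
```

On this toy set the per-case mean still looks respectable while the macro drops sharply and names the broken template, which is the behaviour the macro is meant to provide.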
- Synthetic telemetry corpus (`services/agents/tests/eval_data/synthetic_telemetry.jsonl`, `scripts/generate_eval_incidents.py`): 361 backing events spanning 14 log sources (Sysmon, Windows Security, M365 audit, Azure sign-in, CloudTrail, Linux auditd, journald, EDR, DNS, web access, Kubernetes audit, GitHub audit, VPN, DB audit), wired to all 200 incidents. Each event is a templated dictionary with `{user}`/`{host}`/`{ip}`/`{campaign}` placeholders resolved against the incident it backs, and carries the fields a real connector pivots on (process tree, principal, source IP, log source, event ID).
- Telemetry event factories + recursive resolver (`scripts/generate_eval_incidents.py`): `_sysmon`, `_winsec`, `_m365`, `_azure_signin`, `_cloudtrail`, `_auditd`, `_journald`, `_edr`, `_dns`, `_web`, `_k8s`, `_github`, `_vpn`, `_db` produce base event shapes; a recursive resolver walks nested dicts and substitutes incident context. The 55 templates in `_TEMPLATES` each now carry a `template_id`, a `template_index`, and a tuple of telemetry events.
- Schema + coverage gate (`services/agents/tests/test_synthetic_telemetry.py`): five new assertions: every incident has ≥ 1 backing event, every expected source is present, every event carries the source-specific pivot fields a real connector needs, all placeholders resolve, and no single template dominates the source distribution.
- Per-template macros on every scoring suite (`services/agents/tests/test_mitre_accuracy.py`, `test_investigation_completeness.py`, `test_response_quality.py`, `scripts/run_evals.py`): each result now carries a `per_template_summary()` (mean, median, min, max, count, failing IDs) alongside the per-case mean, plus a new test gating macro accuracy ≥ 0.80 for MITRE / completeness and ≥ 0.75 for response-plan quality. A template-distribution-balance test asserts no single template accounts for > 5% of incidents (currently 0.5–2.0% each).
- `run_evals.py` output expansion: each suite headline now prints the per-case mean and the per-template macro with the failing template IDs inline; the human-readable summary appends a synthetic-telemetry footer (event count, source count, incident coverage, file path); `--json` output adds `per_template` and `telemetry` blocks.
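
A recursive placeholder resolver along the lines described for the telemetry factories could be written as below. This is a sketch under assumptions: the function name `resolve`, the event field names, and the sample context values are hypothetical and not taken from `generate_eval_incidents.py`. Only the placeholder names (`{user}`, `{host}`, `{ip}`, `{campaign}`) come from the changelog.

```python
from typing import Any


def resolve(node: Any, context: dict) -> Any:
    """Walk nested dicts/lists/tuples and substitute {placeholder} tokens in strings."""
    if isinstance(node, dict):
        return {key: resolve(value, context) for key, value in node.items()}
    if isinstance(node, (list, tuple)):
        # Preserve the container type (list stays list, tuple stays tuple).
        return type(node)(resolve(item, context) for item in node)
    if isinstance(node, str):
        # Plain replace instead of str.format, so stray braces in log
        # payloads do not raise KeyError or IndexError.
        for name, value in context.items():
            node = node.replace("{" + name + "}", value)
        return node
    return node  # ints, bools, None pass through unchanged


# Hypothetical Sysmon-style event template and incident context.
template = {
    "source": "sysmon",
    "event_id": 1,
    "process": {"user": "{user}", "command_line": "whoami /all on {host}"},
    "related_ips": ["{ip}"],
}
context = {"user": "j.doe", "host": "WS-042", "ip": "203.0.113.7"}
event = resolve(template, context)
```

Returning new containers instead of mutating the template in place lets one template back several incidents with different contexts.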
- Incident schema —
synthetic_incidents.jsonentries now includetemplate_id(e.g.m365_admin_impersonation) andtemplate_indexfields. Existing fields are unchanged. Regenerated deterministically from the seeded RNG. apps/docs/docs/benchmark.md— added a "What's new (v1.4)" section, a "Per-case vs. per-template metrics" section explaining the ~0.5% vs ~1.8% sensitivity argument with worked examples, and a new "Synthetic telemetry corpus" section documenting the 14 sources, the pivot fields, the placeholder resolver, and the five schema/coverage checks. The "Help us harden the harness" call-outs now include adding a connector + Sigma rule against the corpus and adding a new template with backing telemetry. The "What this is not" section is updated to call out that the corpus is hand-shaped (not captured from a live tenant) and that the per-template macro is the non-tautological signal on top of the otherwise self-consistent gates.README.md— capability bullet rewritten to call out five suites (was four), 55 distinct templates, per-case + per-template macros, and the synthetic-telemetry coverage gate. The comparison table flags the eval harness as having a synthetic-telemetry corpus + per-template macros. Step 5b (Run the public eval harness) documents the newpython scripts/generate_eval_incidents.pyworkflow for regenerating the dataset and the corpus together.- Eval signature on completeness + response-quality runs — calls
from `run_evals.py` now use `keep_per_incident=True` so the per-template summary is computable. Default behaviour is unchanged for existing direct callers.
The v5.2 harness gave deterministic numbers but two real concerns existed: duplicates could mask a broken template behind 199 working duplicates, and there was no concrete telemetry shape for connector contributors to wire against. v1.4 closes both: the per-template macro is the dilution-resistant regression signal that surfaces template-class breaks, and the synthetic telemetry corpus is the connector-development contract.
This is a "fix the foundations" pass: tighten security defaults, drop
overclaims, harden CI, fix DX rough edges, scale detection content from
~200 to 6,913 rules with explicit tiering, and ship a public demo
hosted on tryaisoc.com via Cloudflare Tunnel.
- GraphQL tenant scoping (`services/api/app/graphql/`): every resolver is wrapped with a `tenant_scope` helper, GraphiQL is forced off in production, and a tenant-isolation regression test asserts cross-tenant reads return 0 rows.
- Plugin signature gate (`services/api/app/services/plugin_manager.py`, `packages/plugin-sdk-py/src/aisoc_plugin_sdk/loader.py`, `packages/plugin-sdk-go/aisoc/loader.go`): Ed25519 signature verification is required before loading any plugin. `PLUGIN_TRUST_MODE` controls policy: `strict` (default, signed only), `permissive` (warn + load), `dev` (skip). Publisher signing flow is documented in `packages/plugin-sdk-py/README.md` and `packages/plugin-sdk-go/README.md`.
- `/metrics` and compose hardening (`docker-compose.yml`, `docker-compose.demo.yml`, `services/api/app/main.py`, `services/api/app/core/security.py`): service ports bind to `127.0.0.1` by default, the API logs a loud warning if `SECRET_KEY` is unset or default, the `admin` role permissions are corrected to match the documented matrix, and `/metrics` is gated behind `METRICS_TOKEN`.
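
The three `PLUGIN_TRUST_MODE` policies from the plugin signature gate can be summarised as a small decision function. This is a sketch only: the names `TrustMode` and `should_load` are hypothetical, and the Ed25519 check itself is reduced to a boolean input because the loader's verification code is not shown in this changelog.

```python
import logging
from enum import Enum

logger = logging.getLogger("plugin_loader")  # hypothetical logger name


class TrustMode(str, Enum):
    STRICT = "strict"          # default: signed plugins only
    PERMISSIVE = "permissive"  # warn, then load anyway
    DEV = "dev"                # skip the signature check


def should_load(mode: TrustMode, signature_valid: bool, plugin_id: str) -> bool:
    """Decide whether a plugin may load, given the result of signature verification."""
    if mode is TrustMode.DEV:
        return True
    if signature_valid:
        return True
    if mode is TrustMode.PERMISSIVE:
        logger.warning("loading plugin %s without a valid signature", plugin_id)
        return True
    return False  # strict mode rejects unsigned or badly signed plugins


# Hypothetical plugin ID for illustration.
print(should_load(TrustMode("strict"), signature_valid=False, plugin_id="example-plugin"))
```

Keeping the policy decision separate from the cryptographic check makes each mode easy to test without generating real keys.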
- Fusion pipeline framing (
services/agents/app/fusion/,apps/docs/docs/architecture.md) — replaced "real fusion pipeline" with the actual scope (rule-based + ML scoring fan-in, no reinforcement learning). - CI cadence wording (
README.md,CONTRIBUTING.md) — "every commit" → "every push and PR tomain". - Eval harness honesty (
scripts/eval/,apps/docs/docs/) — removed "Macro F1" references, reframed the 200-incident synthetic dataset as substrate self-consistency, dropped the hardcodedSUITESconstant, fixed the broken--reportflag, and aligned Prophet usage in code and docs.
- No more
|| true(.github/workflows/ci.yml) — removed every silent failure suppression. - Web Vitest smoke —
apps/webships a Vitest suite covering marketplace filters, detection coverage view, and core layouts. - SDK + service jobs — added Python pytest + Vitest jobs for
packages/sdk-{py,ts,go}andpackages/plugin-sdk-{py,go}, plus pytest jobs forservices/{api,agents,actions,connectors}. - Detection + playbook validation in CI
(
.github/workflows/validate-detections.yml,.github/workflows/check-openapi.yml) —validate_detections.pyruns against all 6,913 rules and the OpenAPI spec is regenerated and compared on every PR.
aisoc-doctorprobes fixed (tools/aisoc-doctor/) — checks match the actual ports, env var names, and service URLs.- CLI consistency (
packages/cli/,README.md,apps/docs/docs/) —npx aisocandaisocresolve identically; package names, missing pnpm scripts, and themcpservice reference are corrected; branching/tooling and env var names match across docs. - Infra READMEs —
infra/k8s/,infra/helm/,infra/terraform/,infra/render/,infra/fly/,infra/railway/,infra/coolify/each have aREADME.mddocumenting prerequisites, secrets, and invocation.
- 800 native rules — added 600 new Sigma-shaped detections across
five new spec modules (
scripts/detection_specs_part3_cloud.py,_identity.py,_endpoint.py,_network.py,_application.py), each withmatch_when, MITRE tagging, and auto-generated positive/negative fixtures viascripts/detection_specs_part3_helpers.py. Native total: 200 → 800. - 6,113 imported rules with provenance — wired importers under
tools/detection_import/{sigma,splunk,chronicle,car}_importer.pyfor SigmaHQ, Splunk Security Content, Chronicle, and MITRE CAR. Each imported rule is tagged with its source, license, and original ID; rules whose mappings cannot be replayed against AiSOC fixtures are quarantined underdetections/<source>-imports/quarantine/(~5,937 quarantined, ~6,113 active). - Title → name migration — imported YAMLs now use the canonical
name:field instead oftitle:, matchingvalidate_detections.py's required schema.tools/detection_import/common.pywas updated and 6,113 existing files were migrated in place. - Marketplace tier UX (
apps/web/src/components/marketplace/MarketplaceView.tsx,MarketplaceView.test.tsx,marketplace/index.json,apps/web/public/marketplace/index.json,scripts/build_marketplace.py) — items now expose atierfield (stable/beta/imported/community), the marketplace UI defaults tostableand shows per-tier counts on filter chips, andbuild_marketplace.pyinfers tiers fromplugin.yamland source paths. - MITRE ATT&CK coverage view (
apps/web/src/app/(app)/detection/coverage/,apps/web/src/lib/mitreTactics.ts) — new in-app dashboard rendering the coverage matrix from the marketplace index. - Documentation refresh — updated
README.md,apps/docs/docs/intro.md,apps/docs/docs/quickstart.md,apps/docs/docs/concepts/detections.md,apps/docs/docs/contributing/dev-setup.md,detections/README.md, and.github/workflows/validate-detections.ymlto reflect 800 native + ~6,000 imported (filterable by tier) and drop stale "200+ rules" claims.
- Cloudflare Tunnel infra (
infra/cloudflare/) —config.yml.example,tunnel.sh, and a README explaining how to run the demo profile behindtryaisoc.comviacloudflared. Tunnel script readsDOMAIN,TUNNEL_NAME,SUBDOMAINS,SKIP_DNS,SKIP_RUNenv vars; defaults publish apex +api.,ws.,docs.subdomains. pnpm demo:publicscript (scripts/demo-public.sh) — bootsdocker-compose.demo.yml(read-only demo profile with seeded incidents) viapnpm aisoc:demo --no-open, then brings up the Cloudflare Tunnel that mapstryaisoc.com→ web (:3000),api.tryaisoc.com→ api (:8000),ws.tryaisoc.com→ realtime (:4000), anddocs.tryaisoc.com→ Docusaurus (:3001). Companion scripts:pnpm demo:public:tunnel-only(skip stack bring-up, just run the tunnel) andpnpm demo:public:setup(provision tunnel + DNS without running cloudflared, forcloudflared service installflows).- Public-host-agnostic web bundle (
apps/web/next.config.js,apps/web/src/lib/api.ts) — the Next.js client now emits same-origin relative paths (/api/v1/...,/ws/...) instead oflocalhost:8000-baked URLs, with server-side rewrites proxying to api/agents/realtime by Docker DNS name. The same image works onlocalhost:3000, behind Cloudflare Tunnel ontryaisoc.com, or behind any reverse proxy without a rebuild. - README "Try it live" — top-of-README link to the public demo with a one-liner for hosting your own on a Cloudflare-managed domain.
5.2.0 — 2026-05-04
This release groups four areas of work: an append-only investigation ledger, a public eval harness, a mobile responder PWA, and a hosted demo profile. Details below.
- Investigation Ledger (
services/api/migrations/008_investigation_ledger.sql,services/api/app/models/investigation.py,services/agents/app/investigator/ledger.py) — every prompt the agent emits, every tool call, every retrieved evidence shard, and every rationale is persisted as an append-onlyinvestigation_steprow, scoped to a tenant + case. - Investigation Ledger UI (
apps/web/src/components/cases/InvestigationLedger.tsx) — replayable step-by-step view in the case workspace with prompt, response, and tool-call diffs. GET /api/v1/investigations/*endpoints (services/api/app/api/v1/endpoints/investigations.py) for listing, retrieving, and replaying ledger entries by case.- Investigator graph upgrades
(
services/agents/app/investigator/{orchestrator,recon_agent,forensic_agent,responder_agent,report_writer_agent,state}.py) — every node now writes a ledger entry on entry and exit, including the structured input it received and the structured output it produced.
- 200-incident synthetic dataset (`services/agents/tests/eval_data/synthetic_incidents.json`) — 200 deterministic, regenerable cases covering all 14 MITRE ATT&CK enterprise tactics across roughly the top 50 techniques. Generated by `scripts/generate_eval_incidents.py`.
- Four eval gates under `services/agents/tests/`:
  - `test_alert_reduction.py` — real measurement: 1,000 noisy alerts → ~250 incidents via 3-tier fusion, with explicit storm and near-duplicate handling
  - `test_mitre_accuracy.py` — substrate self-consistency gate: tactic-level accuracy / precision / recall / F1 between the hand-curated extractor and the dataset that was written to feed it
  - `test_investigation_completeness.py` — substrate self-consistency gate: evidence-keyword coverage on a templated report
  - `test_response_quality.py` — substrate self-consistency gate: 5-criterion offline rubric on a templated response plan (action class, severity awareness, MITRE alignment, evidence grounding, actionability)
- `scripts/run_evals.py` — one-shot harness with `--json` and `--ci` output modes. Total runtime ~25 ms on a laptop. CI-gated on every commit via `.github/workflows/ci.yml`. Runs deterministic substrate code against synthetic incidents; it does not call the live LLM agent.
- Public eval harness page (`apps/docs/docs/benchmark.md`, `apps/web/src/app/benchmark/page.tsx`, `apps/web/src/components/benchmark/`) — published numbers, full method, comparison to other AI SOC offerings, and explicit framing of which suites measure substrate self-consistency vs real behaviour. Linked from the README and the docs landing page.
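For readers unfamiliar with the tactic-level metrics named above, the following sketch shows one common way to compute them over multi-label predictions. It is an illustration under assumptions, not the code in `test_mitre_accuracy.py`: the function name is hypothetical, precision/recall/F1 are micro-averaged, and accuracy is taken as exact-set match per incident.

```python
def tactic_scores(expected: list[set[str]], predicted: list[set[str]]) -> dict[str, float]:
    """Micro-averaged precision/recall/F1 plus exact-match accuracy."""
    tp = fp = fn = exact = 0
    for exp, pred in zip(expected, predicted):
        tp += len(exp & pred)    # tactics correctly predicted
        fp += len(pred - exp)    # tactics predicted but not expected
        fn += len(exp - pred)    # tactics expected but missed
        exact += int(exp == pred)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    accuracy = exact / len(expected) if expected else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}
```

With two incidents where the second prediction swaps one tactic for a wrong one, the sketch yields precision, recall, and F1 of 2/3 and an accuracy of 1/2.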
- Responder PWA (`apps/web/src/app/(responder)/`, `apps/web/src/components/responder/`, `apps/web/src/components/pwa/`) — installable, offline-aware, push-enabled responder console for on-call analysts. Service worker at `apps/web/public/sw.js`, manifest at `apps/web/public/manifest.json`, offline shell at `apps/web/public/offline.html`.
- Passkey authentication (`services/api/app/models/responder.py`, `services/api/app/api/v1/endpoints/passkeys.py`, `apps/web/src/lib/responder/`) — WebAuthn registration and login for the Responder surface; FIDO2 platform authenticators only, no SMS fallback.
- On-call schedule + handoff (`services/api/app/models/responder.py`, `services/api/app/api/v1/endpoints/oncall.py`) — current responder per tenant, surfaced in the Responder home page and in alert pages on the desktop console.
- Approvals workflow (`services/api/app/api/v1/endpoints/approvals.py`) — long-lived approval requests for blast-radius-gated SOAR actions, approvable from the Responder PWA with a hardware-attested passkey.
- Web Push delivery (`services/realtime/src/push.ts`, `services/api/app/api/v1/endpoints/push.py`) — VAPID-signed push notifications wired into the realtime gateway. Subscriptions persist per-device and follow the on-call rotation.
- Migration — `services/api/migrations/009_responder_pwa.sql`.
- Contextual actions (`services/agents/app/api/contextual.py`, `apps/web/src/components/alerts/AlertDetailView.tsx`, `apps/web/src/components/cases/CaseWorkspace.tsx`, `apps/web/src/components/detections/RuleEditor.tsx`, `apps/web/src/components/playbooks/PlaybookEditor.tsx`) — the AI Copilot now reads the surface the analyst is standing on (alert / case / rule / playbook) and proposes the next two or three concrete actions with the correct payloads pre-filled. One click invokes the agent with the right tool.
- Investigator graph awareness — every contextual action is grounded in the same Investigation Ledger so the analyst sees, before clicking, which prompts and tool calls will be issued.
- `@aisoc/mcp` (`services/mcp/`) — Model Context Protocol server exposing 11 AiSOC tools to Claude Desktop, Cursor, Cody, and Continue.
- Discovery tools — `aisoc_list_alerts`, `aisoc_list_cases`, `aisoc_query_detections`.
- Deep-dive tools — `aisoc_get_case`, `aisoc_get_investigation`, `aisoc_get_alert`.
- Action / replay tools — `aisoc_run_investigation`, `aisoc_replay_decision`, `aisoc_explain_step`, `aisoc_create_case`, `aisoc_assign_alert`. The replay set walks the Investigation Ledger step-by-step inside the IDE / chat.
- Install command — `npx -y @aisoc/mcp install --host claude --aisoc-url … --api-key …`.
- Documentation — `apps/docs/docs/integrations/mcp.md`, `services/mcp/README.md`.
- Slim demo profile (`docker-compose.demo.yml`) — postgres + redis + kafka + api + agents + realtime + web. ClickHouse, OpenSearch, Neo4j, and Qdrant are gated behind compose profiles for production.
- Prebuilt images — `ghcr.io/beenuar/aisoc-{api,agents,realtime,web,…}` built and published by `.github/workflows/publish-images.yml` on every release tag.
- One-shot orchestrator (`scripts/aisoc-demo.ts`) — pulls images, brings up the stack, waits on healthchecks, seeds canonical demo data, kicks off an agent investigation against a seeded case, and opens the browser at `/cases/<uuid>` with the live ledger view selected.
- Demo mode middleware (`services/api/app/middleware/demo_mode.py`) — gates write operations, resets state every UTC midnight, and watermarks the UI as read-only. Tests at `services/api/tests/test_demo_mode.py`.
- Target time-to-first-investigation: roughly 3–5 minutes on a warm Docker daemon, depending on image cache state.
- Cleanup — `pnpm aisoc:demo:down` removes the volumes; logs at `pnpm aisoc:demo:logs`.
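The demo-mode behaviour above (gate writes, reset at UTC midnight, mark responses as read-only) can be sketched framework-free. Everything named here is an assumption for illustration: the `demo_gate` and `next_reset` helpers, the 403 status, and the `X-Demo-Mode` header are hypothetical and are not taken from `demo_mode.py`.

```python
from datetime import datetime, timedelta, timezone

# HTTP methods that do not modify state and stay allowed in demo mode.
SAFE_METHODS = frozenset({"GET", "HEAD", "OPTIONS"})


def demo_gate(method: str, demo_mode: bool) -> tuple[int, dict]:
    """Return (status, extra headers) for a request under the demo policy."""
    if not demo_mode:
        return 200, {}
    headers = {"X-Demo-Mode": "read-only"}   # hypothetical watermark header
    if method.upper() not in SAFE_METHODS:
        return 403, headers                  # write attempt is rejected
    return 200, headers


def next_reset(now: datetime) -> datetime:
    """Next UTC midnight strictly after `now` (which must be timezone-aware)."""
    now_utc = now.astimezone(timezone.utc)
    midnight = now_utc.replace(hour=0, minute=0, second=0, microsecond=0)
    return midnight + timedelta(days=1)
```

Keeping the policy in pure functions like these makes it easy to unit-test separately from whatever middleware wraps it.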
- Fly.io (`infra/fly/`) — first-class config for `api`, `agents`, `realtime`, `web`. Deploys via `infra/fly/fly-demo-deploy.sh`, ~$14/mo for the whole stack.
- Render (`render.yaml`) — managed, sleep-on-idle config suitable for hobbyists and design partners.
- Railway (`infra/railway/railway.toml`) — pay-as-you-go PaaS.
- Coolify (`infra/coolify/README.md`) — self-hosted on your own VPS, reuses the existing `docker-compose.yml`.
- ~200 detection rules in `detections/` covering MITRE ATT&CK Enterprise (cloud, identity, endpoint, network, application). Sigma format, with MITRE technique IDs in `tags`, fixtures under `detections/fixtures/`, and `detections/README.md` documenting the schema.
- 50+ response playbooks in `playbooks/packs/v1/` — IAM, EDR, network, application, generic. JSON DSL with explicit decision trees, human-approval gates, and rollback steps. Schema in `playbooks/README.md`.
- 15 plugins in `plugins/` — both Go and Python implementations for CrowdStrike, Splunk, Sentinel, AWS Security Hub, Okta, Cloudflare WAF, Defender, GuardDuty, PagerDuty, Slack, Teams, Jira, ServiceNow, VirusTotal, AbuseIPDB. Each ships with manifests, tests, and SDK helpers.
- Marketplace index (`marketplace/index.json`, `apps/web/public/marketplace/index.json`) — auto-generated by `scripts/build_marketplace.py` from the on-disk content tree.
- Validation tooling —
  - `scripts/validate_detections.py` (Sigma + MITRE ID schema)
  - `scripts/validate_playbooks.py` and `scripts/lint_playbooks.py` (DSL well-formedness + safety)
  - `.github/workflows/{validate-detections,validate-playbooks,sync-marketplace}.yml` enforce the gates on every PR.
- In-app marketplace (`apps/web/src/app/(app)/marketplace/page.tsx`, `apps/web/src/components/marketplace/MarketplaceView.tsx`) — filterable by category, ratings, verified vs community badge.
- `packages/plugin-sdk-go` — Go plugin SDK (module github.com/beenuar/aisoc/plugin-sdk-go) with action, connector, enricher, registry, widget, and loader primitives. Examples under `packages/plugin-sdk-go/examples/`.
- `packages/plugin-sdk-py` — Python plugin SDK with the matching primitives, decorators, and a registry. Tests under `packages/plugin-sdk-py/tests/`.
- `packages/sdk-py` (PyPI: `aisoc-sdk`) — async Python client SDK for the AiSOC API.
- `packages/sdk-ts` (npm: `@aisoc/sdk`) — TypeScript client SDK with auto-generated types.
- `packages/sdk-go` — Go client SDK with OpenAPI-generated models.
- `/why-open-source` page (`apps/web/src/app/why-open-source/page.tsx`) — long-form description of the project's open-source posture and trade-offs.
- Updated landing (`apps/web/src/components/landing/{Hero,LandingNav,Footer,OpenSource}.tsx`) — the "live demo" button lands directly on a seeded investigation; comparison rows reference specific behaviours rather than generic claims.
- Docusaurus refresh — new MCP integration page, benchmark page, Investigation Ledger references, Responder PWA mentions in concepts and quickstart.
- Repository home — all `cyble-inc/AiSOC` and `aisoc-os/aisoc` URLs updated to `beenuar/AiSOC` across docs, README, SDKs, and benchmark badges.
- `packages/sdk-go` module path is now `github.com/beenuar/aisoc/sdk-go` for the API client SDK; the plugin SDK is at `github.com/beenuar/aisoc/plugin-sdk-go`.
- `alerts` API (`services/api/app/api/v1/endpoints/alerts.py`, `services/api/app/models/alert.py`) — surfaces copilot context (suggested next actions) inline on the alert detail response.
- API router (`services/api/app/api/v1/router.py`) — wires up `approvals`, `investigations`, `marketplace`, `oncall`, `passkeys`, `push`.
- CI Docker build contexts — `.github/workflows/{ci,release,publish-images}.yml` now set explicit `context` and `file` parameters per service; multi-service builds no longer race on a stale build root.
- Docker Compose obsolete `version` warning — removed `version: '3.8'` from `docker-compose.demo.yml`.
- Repository hygiene — added `.gocache/`, `*.tsbuildinfo`, `apps/docs/.docusaurus/`, `apps/docs/build/`, `plugins/**/*-build-test`, `plugins/**/*-build`, `eval_report.json`, and `eval_mitre_accuracy_report.json` to `.gitignore`. Removed previously tracked Docusaurus cache and local IDE hook state files from the index.
5.1.0 — 2026-05-03
- UEBA service (`services/ueba`) — User & Entity Behavior Analytics
  - Welford online algorithm for incremental baseline computation
  - Z-score anomaly scoring with configurable sensitivity
  - Peer-group analysis (same role / department / location clustering)
  - Kafka consumer (`security.events`) → producer (`security.anomalies`) integration with the `fusion` service
  - Alembic migrations, Dockerfile, Helm deployment template
- Honeytokens service (`services/honeytokens`) — deceptive credential & file traps
  - HMAC-SHA256 signed token generator (URL, file, AWS key, email flavors)
  - Webhook handler for first-touch alerting (HTTP signed callbacks)
  - Token lifecycle management: active / triggered / expired states
  - React UI: create tokens, view trigger log, copy lure URLs
  - Alembic migrations, Dockerfile, Helm deployment template
- Purple Team service (`services/purple-team`) — adversary emulation & tabletop
  - Atomic Red Team YAML parser (any `atomics/` directory)
  - Caldera REST integration for remote execution
  - ATT&CK coverage heatmap (tactic × technique matrix)
  - Test execution tracking with detection reporting (true positive / false negative)
  - Tabletop exercise session manager with finding capture
  - React UI: Coverage tab, Executions tab, Tabletop tab
  - Alembic migrations, Dockerfile, Helm deployment template
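The UEBA entry pairs Welford's online algorithm with z-score scoring. The sketch below shows how the two fit together; it is a generic illustration, not the service's code. The class and function names are hypothetical, the sample (n - 1) variance is an assumed choice, and the default threshold of 3.0 stands in for the "configurable sensitivity".

```python
import math


class RunningBaseline:
    """Welford's algorithm: mean and variance updated one value at a time."""

    def __init__(self) -> None:
        self.count = 0
        self.mean = 0.0
        self._m2 = 0.0   # running sum of squared deviations from the mean

    def update(self, x: float) -> None:
        self.count += 1
        delta = x - self.mean
        self.mean += delta / self.count
        self._m2 += delta * (x - self.mean)   # uses the already-updated mean

    @property
    def std(self) -> float:
        if self.count < 2:
            return 0.0
        return math.sqrt(self._m2 / (self.count - 1))

    def z_score(self, x: float) -> float:
        s = self.std
        return 0.0 if s == 0.0 else (x - self.mean) / s


def is_anomalous(baseline: RunningBaseline, x: float, threshold: float = 3.0) -> bool:
    """Flag values whose distance from the baseline meets the threshold."""
    return abs(baseline.z_score(x)) >= threshold
```

Because each update touches only three numbers, the baseline can be maintained per user or entity without storing the event history.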
5.0.0 — 2026-05-03
- SAML 2.0 + OIDC authentication (`services/api/app/auth/`)
  - IdP-initiated and SP-initiated SAML 2.0 flows (python3-saml)
  - OIDC authorization-code + PKCE flow with `authlib`
  - JWT issuance on successful SSO login
- Multi-tenant Row-Level Security (Postgres RLS)
  - `tenant_id` column on all data tables
  - RLS policies enforced at the database level
  - SQLAlchemy `set_tenant()` middleware in FastAPI deps
- Granular RBAC (`services/api/app/api/v1/endpoints/rbac.py`)
  - `roles`, `role_permissions`, `user_roles` tables
  - `require_permission("resource:action")` FastAPI dependency
  - Admin UI at `/settings/rbac`
- Immutable Audit Log
  - Append-only `audit_log` table with a before-UPDATE trigger
  - FastAPI middleware auto-logs every mutating request
  - `GET /api/v1/audit` paginated endpoint with tenant filter
  - Audit log viewer UI at `/audit`
- Compliance dashboards
  - SOC 2 evidence auto-collection + PDF export (`/compliance/soc2`)
  - ISO 27001, NIST CSF, PCI-DSS, HIPAA, DORA framework heatmaps
  - `GET /api/v1/compliance/{framework}` endpoint with control mapping
- SLA tracking — MTTD / MTTR / MTTC
  - `tenant_sla_config` + `alert_sla_events` tables
  - `GET /api/v1/sla/metrics` + `GET /api/v1/sla/breaches`
  - SLA dashboard widget at `/sla`
- HA Helm chart — HPA, PDB, Ingress per service
- Backup & restore scripts (`scripts/backup.sh`, `scripts/restore.sh`) for Postgres + ClickHouse + plugins → S3/R2
- Operational runbook generator (`scripts/generate_runbook.py`) from live OTel trace data
- Multi-region deployment guide (`docs/operations/multi-region.md`)
- OpenTelemetry instrumentation across API, UEBA, Honeytokens, and Purple Team services
4.1.0 — 2026-05-03
- AiSOC CLI (`packages/aisoc-cli`) — `scaffold`, `validate`, `publish` commands for plugins and detections
  - `aisoc scaffold plugin <name>` — generate plugin skeleton
  - `aisoc validate detection <file>` — Sigma/YAML schema validation
  - `aisoc publish plugin <path>` — submit to community registry with Ed25519 signing
- Plugin publishing flow
  - `community_plugins` table with signature, author, review state
  - `POST /api/v1/plugins/publish` — signed submission
  - `POST /api/v1/plugins/{id}/approve` / `reject` — curator review endpoints
  - Ed25519 signature verification on every submission
- Marketplace v2 — ratings, install counts, verified badges, category filter, sort options
  - `plugin_ratings` table + `POST /api/v1/plugins/{id}/rate`
  - `GET /api/v1/marketplace?category=&sort=` with pagination
- Detection catalog (`/detection/catalog`) — paginated Sigma rule browser
  - Install-to-tenant action from catalog
  - `GET /api/v1/detections/catalog` endpoint
- Playbook community submissions
  - `community_playbooks` table + submit / curate API
  - Community tab in PlaybooksView UI
- Docusaurus documentation site (`apps/docs`) — full API, architecture, deployment, plugin SDK, quickstart
3.0.0 — 2026-05-02
- Threat Intelligence Enrichment (13 providers)
  - Open-source/freemium: VirusTotal, AbuseIPDB, GreyNoise, Shodan, URLScan.io, IPinfo
  - Commercial: Cyble Vision, Recorded Future, Mandiant, CrowdStrike Intel, Anomali, IBM X-Force, Flashpoint, Intel 471, DomainTools, RiskIQ
  - New enrichment types: `DarkWebContext`, `VulnerabilityRef`, `BrandRisk`
  - Concurrent fan-out enrichment engine in Go
- Go module path migration — all services updated from `github.com/cyble/aisoc` to `github.com/beenuar/aisoc`
- SECURITY.md — vulnerability disclosure policy and security contacts
- `services/enrichment/README.md` — full enrichment service documentation
- All GitHub repository references updated to `https://github.com/beenuar/AiSOC`
- Helm chart container images updated from `ghcr.io/cyble/aisoc-*` to `ghcr.io/beenuar/aisoc-*`
- `.env.example` expanded with API keys for all commercial TI providers
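The "concurrent fan-out" idea behind the enrichment engine is to query every provider at once and collect whatever comes back in time. The real engine is written in Go; the sketch below shows the same pattern in Python with `asyncio`, purely as an illustration. The `enrich` function, the per-provider timeout, the error shape, and the three fake providers are all hypothetical.

```python
import asyncio
from typing import Awaitable, Callable

Lookup = Callable[[str], Awaitable[dict]]


async def enrich(indicator: str, providers: dict[str, Lookup],
                 timeout: float = 2.0) -> dict[str, dict]:
    """Query all providers concurrently; one slow or failing provider
    never blocks or breaks the others."""

    async def call(name: str, lookup: Lookup) -> tuple[str, dict]:
        try:
            return name, await asyncio.wait_for(lookup(indicator), timeout)
        except Exception as exc:   # timeouts and provider errors alike
            return name, {"error": type(exc).__name__}

    pairs = await asyncio.gather(*(call(n, f) for n, f in providers.items()))
    return dict(pairs)


# Stand-in providers used only to exercise the sketch.
async def fake_fast(indicator: str) -> dict:
    return {"indicator": indicator, "score": 1}


async def fake_slow(indicator: str) -> dict:
    await asyncio.sleep(1.0)
    return {"indicator": indicator, "score": 0}


async def fake_broken(indicator: str) -> dict:
    raise ValueError("provider unavailable")
```

Calling `asyncio.run(enrich("1.2.3.4", {"fast": fake_fast, "slow": fake_slow}, timeout=0.05))` returns the fast provider's result alongside an error entry for the slow one, so total latency is bounded by the timeout and not by the slowest provider.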
2.0.0 — 2026-05-01
- Knowledge Graph — Neo4j-backed entity relationship visualization (`services/api/app/services/graph_service.py`)
- ML Fusion Engine — multi-model alert scoring and deduplication (`services/fusion/app/services/`)
- Rule Engine — YAML-based detection rules with MITRE ATT&CK mapping (`services/api/app/services/rule_engine.py`)
- Attack Graph viz with D3.js force layout (`apps/web/src/components/graph/`)
- MITRE ATT&CK Heatmap on dashboard
- AI Copilot dock — streaming LLM assistant integrated into case and alert views
- Threat Hunt page — query builder with saved hunts and timeline scrubbing
- Case Workspace — full case lifecycle: evidence, timeline, collaborators, MITRE tagging
- Detection Rule Builder — visual rule editor with backtesting
- Settings page — RBAC, notifications, API key management, threat intel feed config
- Live Dashboard — WebSocket-powered real-time alert/event feed
- Command Palette (cmd-K) — fuzzy search for navigation and actions
- Marketing Landing Page — hero, feature highlights, open-source section, footer
- Design Token System — Tailwind + CSS vars, Framer Motion animations, responsive layouts
- Demo Producer — synthetic event generator for local development
- `scripts/seed_demo.py` — database seeding for demos
- Web app migrated to Next.js App Router
- All API routes versioned under `/api/v1`
1.0.0 — 2026-04-30
- Initial release of AiSOC — AI Security Operations Center
- FastAPI backend (`services/api`) with alert ingestion, case management, detection rules
- Next.js 14 frontend (`apps/web`) with dashboard, alerts, cases, connectors, threat-intel pages
- Real-time service (`services/realtime`) using WebSockets
- Ingest service (`services/ingest`) in Go for high-throughput event ingestion
- Enrichment service (`services/enrichment`) in Go
- Docker Compose stack for local development
- Helm chart for Kubernetes deployment (`infra/helm/aisoc/`)
- MIT License