
SCCE 2.0

SCCE (Self-Contained Cognitive Engine) is a local-first, citation-driven question-answering system. It answers questions over an ingested corpus using classical retrieval (BM25 + entity graph + spectral), a planner-driven verification loop, and a local n-gram synthesizer. No remote LLM calls are required on the answering path.

Quickstart (one command)

git clone <this-repo> scce && cd scce
npm install -g pnpm@9        # if you don't have pnpm
pnpm install:local           # installs deps, builds, runs migrations if Postgres is up
pnpm dev:server              # terminal A
pnpm dev:web                 # terminal B  →  http://localhost:5173

pnpm install:local is idempotent. If Postgres isn't running, it prints exactly which commands to run (Homebrew / Docker / apt) and exits non-zero so CI can detect it. Re-run after starting Postgres.

Reviewer fast-path (10 seconds, no DB required)

A hostile reviewer with the brief "show me, don't tell me" runs:

pnpm install && pnpm build       # ~30s, deps + 9-package build
pnpm publishable:check           # ~5s — 18/18 audit + 4-axis eval + fuzz + bench
pnpm seed:test                   # ~5ms — bootstraps a 5-domain brain

Reproducible numbers (April 2026 baseline, M4 / Node 25):

  • 18/18 hostile-review guards pass
  • 100% capability_correctness on the 16-Q embedded eval
  • 5,000 fuzz iters, 0 contract violations
  • 25,000 q/s, p99 0.06ms end-to-end (16-triple brain)
  • 99 pass / 0 fail / 3 skipped on the unit suite (clean checkout)

Optional diagnostic (NOT a release gate):

pnpm test:integration            # runs 45 gated end-to-end tests on a 5-domain seed

The integration suite stresses the extraction/reasoning heuristics on a deliberately tiny inline corpus; ~30% of them fail by design (the heuristics are tuned for live corpora ≥10× larger). The suite is a development-time aid, not a pass/fail signal — see the test file header for why. The actual release gate is pnpm publishable:check.

Framing: see docs/POST_TOKEN_ECONOMY.md. Honest gap closure log: docs/HOSTILE_AUDIT_PRODUCTIZATION.md §14.

The Brain Bundle

Everything SCCE has learned — the n-gram models, the linguistic-primitives lexicon, the concept graph (including Wikipedia-mined common-sense edges) — can be packed into a single portable file:

pnpm brain:export                      # → ./scce-<timestamp>.brain
pnpm brain:import ./scce-<ts>.brain    # CRC-verified, atomic, idempotent

Or from the GUI: Brain tab → Download .brain / drag-and-drop a file to inspect → review entries → Apply. Two-step import means you always see the manifest (and per-entry CRC32) before disk state is overwritten.

The bundle format is documented in brainBundle.ts: 6-byte magic, version, JSON manifest, length-prefixed entries each with their own CRC32, and a footer CRC32 over the concatenation. Hostile-audit posture is documented in the file header.
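For illustration, the length-prefixed, CRC-checked entry layout can be sketched in TypeScript. The 4-byte big-endian length field, the helper names, and the standalone CRC32 implementation are assumptions made for this sketch — brainBundle.ts remains the authoritative spec:

```typescript
// Hypothetical sketch of one length-prefixed, CRC-checked bundle entry:
// [4-byte length][payload][4-byte CRC32], big-endian. Field widths are
// illustrative; see brainBundle.ts for the real layout.

function crc32(buf: Uint8Array): number {
  let crc = 0xffffffff;
  for (const byte of buf) {
    crc ^= byte;
    for (let i = 0; i < 8; i++) {
      crc = (crc >>> 1) ^ (0xedb88320 & -(crc & 1));
    }
  }
  return (crc ^ 0xffffffff) >>> 0;
}

// Pack one entry with its own trailing CRC32 over the payload.
function packEntry(payload: Uint8Array): Uint8Array {
  const out = new Uint8Array(8 + payload.length);
  const view = new DataView(out.buffer);
  view.setUint32(0, payload.length);
  out.set(payload, 4);
  view.setUint32(4 + payload.length, crc32(payload));
  return out;
}

// Unpack and verify; throwing on mismatch is what keeps imports atomic —
// a corrupt entry is rejected before any disk state changes.
function unpackEntry(buf: Uint8Array): Uint8Array {
  const view = new DataView(buf.buffer, buf.byteOffset);
  const len = view.getUint32(0);
  const payload = buf.subarray(4, 4 + len);
  if (view.getUint32(4 + len) !== crc32(payload)) {
    throw new Error("entry CRC32 mismatch");
  }
  return payload;
}
```

The per-entry CRC is what lets the GUI show a verified manifest before anything is overwritten.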

What It Is, And What It Is Not

SCCE is:

  • A local-first QA system that answers from documents you ingest.
  • A retrieval pipeline that fuses lexical (BM25), graph (entity co-occurrence), and spectral (TF-IDF + truncated SVD) channels.
  • A planner that hypothesizes, verifies against retrieved spans, and emits per-sentence provenance.
  • A Fastify server with explicit migrations, controlled shutdown, SSE streaming, and a job queue.

SCCE is not:

  • A general-purpose LLM. Its synthesizer is a Kneser-Ney n-gram model; fluency is bounded by what it has seen.
  • A reasoning engine in the chain-of-thought sense. "Reasoning" here means structured retrieval + verification, not free-form deduction.
  • A drop-in replacement for hosted models. It is built for use cases where traceability and offline operation outweigh stylistic polish.

Provenance is a hard requirement of the verification path: sentences without supporting span overlap are flagged, not hidden.

Known Honesty Bounds

Four gaps are documented openly because the project's posture is "say what we don't do" rather than paper over weaknesses:

  1. Surface fluency is bounded by mined sentence templates plus the 6-gram synthesizer. SCCE now ships a fluency realizer that walks a proof tree over the concept graph and slots its claims into English frames mined directly from Wikipedia (sentenceTemplates.ts). Output is then polished, perplexity-ranked against the local n-gram model, and run through the self-evaluator, which can force-abstain when coverage, citations, or completeness fall below threshold. Answers in well-covered domains read like English; in thin domains the system prefers "I don't know" to fluent guessing.

  2. Zero-shot generalization to unseen tokens is handled by the honest analogy engine — morphological + compositional + structural analogy with hard per-kind confidence caps (≤ 0.6). When no analogy can be honestly drawn, the planner says "I don't know."

  3. Multi-step reasoning is capped at 4 hops by default and may extend to 8 hops only when the caller supplies a per-step entailment verifier (cite-or-stop). See multiHopWalker.ts.

  4. Common-sense breadth is mined from Wikipedia ITSELF, not from crowdsourced graphs (no ConceptNet ingestion). Five novel signals — list-page enumeration, category co-membership, superlative typicality, infobox value priors, hyperlink-anchor aliasing — are implemented in commonSenseMiner.ts and committed with empirical confidence and per-signal provenance.
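The hard per-kind confidence caps in item 2 can be sketched concretely. The three analogy kinds and the 0.6 ceiling come from the text; the scoring inputs, the floor, and the function names are hypothetical:

```typescript
// Illustrative sketch of hard per-kind confidence caps on the analogy
// engine. Kind names and the 0.6 cap mirror the README; everything else
// (raw scores, floor value) is an assumption for this example.

type AnalogyKind = "morphological" | "compositional" | "structural";

const KIND_CAPS: Record<AnalogyKind, number> = {
  morphological: 0.6,
  compositional: 0.6,
  structural: 0.6,
};

interface Analogy { kind: AnalogyKind; rawScore: number }

// Clamp every candidate to its kind's cap; return null ("I don't know")
// when no candidate survives a minimum floor.
function bestAnalogy(
  cands: Analogy[],
  floor = 0.2,
): { kind: AnalogyKind; confidence: number } | null {
  let best: { kind: AnalogyKind; confidence: number } | null = null;
  for (const c of cands) {
    const confidence = Math.min(c.rawScore, KIND_CAPS[c.kind]);
    if (confidence >= floor && (!best || confidence > best.confidence)) {
      best = { kind: c.kind, confidence };
    }
  }
  return best;
}
```

The cap is applied before comparison, so even a near-perfect raw score can never exceed 0.6 — the clamp is the honesty guarantee, not the scoring.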

What SCCE Does

SCCE combines five capabilities into one deployable system:

  1. Corpus ingestion across mixed sources (documents, spreadsheets, code, wiki-style corpora).
  2. Knowledge structuring via entities, relations, and spectral projections.
  3. Multi-channel retrieval (lexical, graph, spectral) with diversity-aware fusion.
  4. Planner-driven reasoning loop that tests and refines candidate claims.
  5. Local synthesis with quality gates, provenance checks, and uncertainty signaling.
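Mechanically, fusing the three retrieval channels (capability 3, minus the diversity term) might look like the sketch below. The weights, the per-channel max-normalization, and the function shape are illustrative assumptions, not SCCE's actual fusion logic:

```typescript
// Minimal sketch of multi-channel score fusion: normalize each channel's
// scores to [0, 1], then take a weighted sum per span. Weights here are
// made up for illustration.

type Channel = "bm25" | "graph" | "spectral";
type Scores = Map<string, number>; // spanId -> raw channel score

function fuse(
  channels: Record<Channel, Scores>,
  weights: Record<Channel, number> = { bm25: 0.5, graph: 0.25, spectral: 0.25 },
): Array<[string, number]> {
  const fused = new Map<string, number>();
  for (const ch of Object.keys(channels) as Channel[]) {
    const scores = channels[ch];
    const max = Math.max(...scores.values(), 1e-9); // per-channel max-normalize
    for (const [id, s] of scores) {
      fused.set(id, (fused.get(id) ?? 0) + weights[ch] * (s / max));
    }
  }
  // Highest fused score first.
  return [...fused.entries()].sort((a, b) => b[1] - a[1]);
}
```

Normalizing per channel before summing keeps BM25's unbounded scores from drowning out the bounded spectral similarities.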

Reasoning, Fluency, and Abstention

On top of retrieval, SCCE has a small post-LLM cognitive layer:

  • Proof-tree reasoner (reasoner.ts) resolves anchors in the question, harvests 1-hop edges from the concept graph, then beam-searches multi-hop chains with the multi-hop walker. Every claim carries citations and a confidence; the tree exposes a completeness score and an answer | abstain recommendation.
  • Sentence templates (sentenceTemplates.ts) are mined from Wikipedia during ingestion (first 10 sentences per document) and round-trip inside the .brain bundle as templates.json. Frames are keyed by predicate (is-a, worked-with, wrote, ...) so realization stays grounded in attested phrasings.
  • Fluency realizer (fluencyRealizer.ts) plans (definition → property → relation → multi-hop), slots each claim into a mined frame, joins with connectives ("Additionally", "By extension"), polishes (a/an, capitalization, punctuation), ranks candidates by n-gram perplexity, and appends a (sources: ...) suffix.
  • Self-evaluator (selfEval.ts) scores six honesty signals — coverage, citation density, unverified-chain risk, confidence floor, completeness, fabrication risk vs. excerpts — and can hard-abstain at severity ≥ 0.999.
  • Bundle federation (bundleFederation.ts) loads N signed .brain bundles in priority order into a single live brain, so a 50 GB shipped knowledge pack can ride alongside user-trained ones with per-bundle CRC and signature verification.
  • Code & environment readers (codeReader.ts, environmentReader.ts) ingest TypeScript / JavaScript / Python source and project trees (respecting .gitignore) into the same concept graph as prose, so the brain can reason about the project it lives in.

The full path is exposed at POST /api/reason (see docs/API_REFERENCE.md) and exercised end-to-end by pnpm smoke:post-llm.
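The realizer's perplexity-ranking step can be illustrated with a toy model. SCCE's ranker uses its Kneser-Ney n-gram model; the add-one-smoothed bigram below is a simplified stand-in to show the mechanics only:

```typescript
// Toy bigram model + perplexity ranking: among candidate realizations,
// keep the one the model finds least surprising. Add-one smoothing is a
// stand-in for the real Kneser-Ney estimator.

function trainBigrams(corpus: string[]) {
  const big = new Map<string, number>();
  const uni = new Map<string, number>();
  const vocab = new Set<string>();
  for (const sent of corpus) {
    const toks = ["<s>", ...sent.toLowerCase().split(/\s+/), "</s>"];
    toks.forEach((t) => vocab.add(t));
    for (let i = 0; i + 1 < toks.length; i++) {
      uni.set(toks[i], (uni.get(toks[i]) ?? 0) + 1);
      const key = `${toks[i]} ${toks[i + 1]}`;
      big.set(key, (big.get(key) ?? 0) + 1);
    }
  }
  return { big, uni, vocab };
}

function perplexity(model: ReturnType<typeof trainBigrams>, sent: string): number {
  const toks = ["<s>", ...sent.toLowerCase().split(/\s+/), "</s>"];
  let logp = 0;
  for (let i = 0; i + 1 < toks.length; i++) {
    const num = (model.big.get(`${toks[i]} ${toks[i + 1]}`) ?? 0) + 1; // add-one
    const den = (model.uni.get(toks[i]) ?? 0) + model.vocab.size;
    logp += Math.log(num / den);
  }
  return Math.exp(-logp / (toks.length - 1)); // geometric mean inverse prob.
}

// Lowest perplexity wins the ranking.
function bestCandidate(model: ReturnType<typeof trainBigrams>, candidates: string[]): string {
  return candidates.slice().sort((a, b) => perplexity(model, a) - perplexity(model, b))[0];
}
```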

End-to-End Pipeline

At a high level:

  1. Ingest files into documents/spans/chunks.
  2. Correlate entities and relations.
  3. Build and refresh spectral basis/projections.
  4. Train and load local n-gram models.
  5. Resolve queries through perception, retrieval, planning, verification, and synthesis.
  6. Return response text plus source-linked context.

This is implemented as a stable server runtime with background jobs and API visibility for each operational phase.

Operational Posture

SCCE is structured for real operations, not just demos. Hardening is opt-in via environment, and unsafe defaults fail closed in production.

  • Stateful service with explicit DB + model dependencies
  • Startup migration safety and controlled shutdown persistence
  • Async chat mode with SSE streaming and status events
  • Job queue control for indexing/training/spectral refresh
  • Operational endpoints for status, topology, activity, and audit export
  • Runbook coverage for backups, restore, incidents, and handoff

See full operating details in docs/OPERATIONS.md and docs/PRODUCTION_HANDOFF.md.

Architecture at a Glance

  • apps/server: Fastify API, startup/shutdown lifecycle, routes, worker orchestration
  • apps/web: React UI for chat, vault, training, artifacts, and system monitoring
  • packages/core: ingestion, correlation, retrieval, planner, synthesis, spectral logic
  • packages/db: PostgreSQL access and migration layer
  • packages/types: shared TypeScript types and contracts
  • packages/compute: parallel pipeline and compute dispatch utilities
  • packages/security: policy and audit support
  • packages/plugins: renderer and webapp template infrastructure
  • packages/sketches: probabilistic structures used by supporting workflows
  • data: local models, uploads, corpora, artifacts, and runtime state

Prerequisites

  • Node.js >= 20
  • pnpm >= 8 (via corepack)
  • PostgreSQL >= 14

Quick Start (Local Development)

  1. Install dependencies.
corepack enable
pnpm install
  2. Configure environment. For development, auth is bypassed when NODE_ENV=development (or SCCE_DEV_MODE=1):
export SCCE_DB_URL="postgres://scce_app:scce_app@localhost:5432/scce"
export NODE_ENV=development
  3. Build all packages.
pnpm -r build
  4. Start the server and the web app in separate terminals.
pnpm dev:server
pnpm dev:web
  5. Verify runtime health.
curl http://127.0.0.1:3000/health
curl http://127.0.0.1:3000/api/system/status

Fast Local Bootstrap

For a full local bootstrap (DB path, demo seeding, ingest, training triggers, and validation request):

pnpm tsx scripts/setup-complete-system.ts

First API Interaction

Synchronous chat (no attachments):

curl -X POST http://127.0.0.1:3000/api/chat \
  -H "Content-Type: application/json" \
  -d '{"message":"What is in the vault?","conversationId":null,"attachments":[]}'

Asynchronous chat pattern (attachments -> SSE):

  1. POST /api/chat with attachments.
  2. Read conversationId from response.
  3. Stream events from GET /api/events/:conversationId.

See detailed contracts and payload shapes in docs/API_REFERENCE.md.
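Client-side, the streaming half of that pattern boils down to parsing the SSE wire format. The event names SCCE emits are defined in docs/API_REFERENCE.md, not assumed here; the parser below handles only the generic `event:`/`data:` framing:

```typescript
// Minimal SSE frame parser: blocks are separated by a blank line; each
// block carries an optional `event:` name and one or more `data:` lines.

interface SseEvent { event: string; data: string }

function parseSse(raw: string): SseEvent[] {
  const events: SseEvent[] = [];
  for (const block of raw.split("\n\n")) {
    let event = "message"; // SSE default event name
    const data: string[] = [];
    for (const line of block.split("\n")) {
      if (line.startsWith("event:")) event = line.slice(6).trim();
      else if (line.startsWith("data:")) data.push(line.slice(5).trim());
    }
    if (data.length) events.push({ event, data: data.join("\n") });
  }
  return events;
}

// Usage against the endpoints above (not executed here):
//   const res = await fetch("http://127.0.0.1:3000/api/chat", { method: "POST", ... });
//   const { conversationId } = await res.json();
//   // then stream GET /api/events/<conversationId> and feed chunks to parseSse
```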

Core Scripts

  • pnpm db-setup: create/apply database schema
  • pnpm smoke-test: validate key runtime paths
  • pnpm seed: seed demo corpus
  • pnpm status: report current system status
  • pnpm ingest:wiki: run wiki ingestion/training pipeline
  • pnpm brain:export / pnpm brain:import: pack / unpack the live brain as a portable .brain bundle
  • pnpm brain:keygen: generate an Ed25519 keypair for signed bundles
  • pnpm brain:federate <dir-or-file...>: load N .brain bundles into one live brain (50 GB ship path)
  • pnpm env:scan <root>: ingest a project directory (source + manifests + docs) into the concept graph
  • pnpm smoke:post-llm: end-to-end reasoner → realizer → self-eval smoke test
  • pnpm quality:check: headers + architecture checks
  • pnpm eval: run the gold-set QA evaluation harness
  • pnpm eval:strict: same, but exit non-zero if quality floors are not met
  • pnpm quality:deep: quality checks + hostile audit suite + strict eval

Evaluation

SCCE ships a gold-set runner at scripts/eval-qa.ts. It exercises the live /api/chat endpoint against a curated set of questions and computes:

  • retrieval_recall@k — fraction of gold documents present in top-k retrieved spans
  • provenance_precision — fraction of cited spans whose source actually appears in the answer
  • provenance_coverage — fraction of answer sentences with at least one supporting citation
  • answer_keyword_recall — fraction of expected keywords present in the answer
  • latency_ms_p50 / latency_ms_p95 — end-to-end answer latency
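Two of these metrics are simple enough to show inline. The sketches below are illustrative — the harness's actual tokenization and citation-marker conventions live in scripts/eval-qa.ts and may differ (the inline `[n]` marker in particular is an assumption):

```typescript
// answer_keyword_recall: fraction of expected keywords present in the
// answer (case-insensitive substring match, an assumption here).
function keywordRecall(answer: string, expected: string[]): number {
  if (expected.length === 0) return 1;
  const hay = answer.toLowerCase();
  const hits = expected.filter((k) => hay.includes(k.toLowerCase())).length;
  return hits / expected.length;
}

// provenance_coverage: fraction of answer sentences carrying at least
// one citation, assuming citations appear inline as [n] markers.
function provenanceCoverage(answer: string): number {
  const sentences = answer.split(/(?<=[.!?])\s+/).filter((s) => s.trim());
  if (sentences.length === 0) return 0;
  const cited = sentences.filter((s) => /\[\d+\]/.test(s)).length;
  return cited / sentences.length;
}
```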

Gold sets live at data/eval/gold.json. A placeholder is auto-seeded on first run. Each entry has the shape:

{
  "id": "q1",
  "question": "What does SCCE use for retrieval?",
  "expected_doc_paths": ["docs/ARCHITECTURE.md"],
  "expected_keywords": ["BM25", "spectral"],
  "min_provenance": 1
}

Run:

pnpm eval            # writes a timestamped JSON report under data/eval/reports/
pnpm eval:strict     # additionally fails the process if quality floors are not met

Strict-mode floors (configurable in the script):

  • answer_present_rate >= 0.8
  • provenance_precision_avg >= 0.7
  • provenance_coverage_avg >= 0.5

Security and Trust Model

SCCE fails closed in production and is configured entirely through environment variables. There are no hard-coded credentials and no implicit allow-all behavior outside development.

Authentication

  • SCCE_API_KEYS — comma-separated list of accepted API keys. Required in production. Requests must send one of these as Authorization: Bearer <key> or x-api-key: <key>.
  • SCCE_DEV_MODE=1 — explicit dev/test bypass. Auth is skipped. Never set this in production.
  • NODE_ENV — when development or test, auth is bypassed automatically. Production requires NODE_ENV=production and a populated SCCE_API_KEYS, or the server refuses to start.
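The rules above reduce to a small pure check. This is a sketch of the documented behavior, not SCCE's actual middleware — the env names match the text, while the function shape and header handling are illustrative:

```typescript
// Sketch of the documented auth rule: dev/test bypass, otherwise a key
// must match via Authorization: Bearer or x-api-key, failing closed.

function isAuthorized(
  headers: Record<string, string | undefined>,
  env: { NODE_ENV?: string; SCCE_DEV_MODE?: string; SCCE_API_KEYS?: string },
): boolean {
  // Explicit dev bypass, exactly as documented. Never in production.
  if (env.SCCE_DEV_MODE === "1") return true;
  if (env.NODE_ENV === "development" || env.NODE_ENV === "test") return true;

  const keys = new Set(
    (env.SCCE_API_KEYS ?? "").split(",").map((k) => k.trim()).filter(Boolean),
  );
  if (keys.size === 0) return false; // fail closed: no keys configured

  const bearer = headers["authorization"]?.replace(/^Bearer\s+/i, "");
  const apiKey = headers["x-api-key"];
  return (bearer !== undefined && keys.has(bearer)) ||
         (apiKey !== undefined && keys.has(apiKey));
}
```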

CORS

  • SCCE_CORS_ORIGINS — comma-separated list of exact-match allowed origins (e.g. https://app.example.com,https://admin.example.com).
  • In dev/test, localhost origins on any port are allowed automatically.
  • The null origin is always rejected.
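Those three rules fit in one predicate. A minimal sketch assuming the behavior described above (SCCE's real CORS hook may differ in detail):

```typescript
// Exact-match allow-list; localhost on any port auto-allowed in dev/test;
// the `null` origin always rejected.

function originAllowed(
  origin: string | undefined,
  allowList: string[],
  devMode: boolean,
): boolean {
  if (!origin || origin === "null") return false; // null origin: always rejected
  if (allowList.includes(origin)) return true;    // exact match only, no wildcards
  if (devMode && /^https?:\/\/(localhost|127\.0\.0\.1)(:\d+)?$/.test(origin)) {
    return true; // dev/test convenience: any localhost port
  }
  return false;
}
```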

Self-training

  • SCCE_ALLOW_SELF_TRAIN=1 — opt-in to letting the synthesizer learn from its own answers. Off by default to avoid model collapse. Per-call overrides exist for explicit user feedback (positive feedback only) and for force-trained corpus material.

Database

  • SCCE_DB_URL — PostgreSQL connection string. Required.
  • SCCE_DB_STATEMENT_TIMEOUT_MS — optional per-statement timeout (default applied at pool init).

Other operational guarantees

  • Upload/ingest paths are validated before filesystem operations.
  • Duplicate controls reduce accidental corpus bloat and replay noise.
  • Provenance verification is content-aware: cited spans are resolved against the underlying chunk text and rejected if there is no token-level overlap with the cited sentence.
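A minimal sketch of that token-level overlap check — the tokenizer and the overlap threshold are illustrative choices for this example, not SCCE's exact rule:

```typescript
// A citation is accepted only if the cited span's text shares enough
// tokens with the sentence citing it. Threshold is an assumption.

function tokens(s: string): Set<string> {
  return new Set(s.toLowerCase().match(/[a-z0-9]+/g) ?? []);
}

function citationSupported(sentence: string, spanText: string, minOverlap = 0.2): boolean {
  const a = tokens(sentence);
  const b = tokens(spanText);
  if (a.size === 0 || b.size === 0) return false;
  let shared = 0;
  for (const t of a) if (b.has(t)) shared++;
  // Fraction of the sentence's tokens attested in the cited span.
  return shared / a.size >= minOverlap;
}
```

Because the check resolves the span's actual chunk text, a citation to the right document but the wrong passage still fails.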

Operating SCCE in Production

Operational priorities:

  1. keep DB and model backups current
  2. monitor chat error and timeout rates
  3. watch training/job queue health
  4. track ingestion growth and duplicate trends
  5. validate release upgrades against migration path

Use docs/OPERATIONS.md and docs/PRODUCTION_HANDOFF.md as your source of truth.

Contributing and Engineering Standards

SCCE expects disciplined, auditable changes.

  • keep changes scoped and reversible
  • preserve API contracts or document intentional changes
  • keep SQL parameterized and input validation explicit
  • update docs alongside behavior changes
  • validate with build/smoke/quality scripts before merge

Contributor workflow references and the full documentation index live under docs/.

License

Proprietary. See LICENSE for terms.
