Corely Database Persistence Strategy

Version: 2.1 - Implemented
Date: 2026-02-01
Status: Active

Intent

This document defines Corely's persistence strategy to support 1000+ modules without exploding PostgreSQL catalog size or making migrations/ORM tooling unusable. It separates "module packaging" (feature delivery) from "DB object ownership" (persistence design).

Problem Statement

Schema-per-module does not scale:

PostgreSQL tables/indexes consume catalog space and slow down schema introspection
Prisma schema generation/migration becomes unwieldy with 500+ models
Developer tooling (autocomplete, type generation) degrades
Migration coordination becomes a bottleneck

The solution: Use a 3-tier persistence model that separates logical modules from physical schemas.

Three-Tier Persistence Model

Tier 1 — Domain Bucket Schemas (Core Relational Data)

Target: 10–50 PostgreSQL schemas representing bounded contexts.

Purpose: Core transactional data with rich relational integrity.

Schema List:

identity — Tenants, users, roles, permissions, API keys
crm — Parties, contacts, addresses, deals, activities
billing — Invoices, payments, allocations, refunds
accounting — Ledger accounts, journal entries, journal lines
inventory — Stock locations, moves, reservations, reorder policies
sales — Quotes, sales orders
purchasing — Purchase orders, vendor bills
workflow — Workflow definitions, instances, tasks, events
platform — Apps, templates, packs, installations, menu overrides
reporting — Read models, snapshots, materialized views

Naming Convention:

Tables: <domain>.<table_name> (e.g., crm.party, accounting.journal_entry)
No prefixes inside schemas; schema qualifies the namespace
Clean SQL: SELECT * FROM crm.party WHERE tenantId = $1

When to use:

Medium to large modules with complex relational needs
Cross-module foreign keys (within reason)
Multi-table transactions
Rich indexing and query patterns

Tier 2 — Extension Storage (Shared Primitives for Small Modules)

Target: ext schema with 3 universal storage primitives.

Purpose: Allow small modules to persist data without creating new schemas/tables.

Tables

ext.kv — Key-Value Storage

CREATE TABLE ext.kv (
  id TEXT PRIMARY KEY DEFAULT gen_random_uuid(),
  tenant_id TEXT NOT NULL,
  module_id TEXT NOT NULL,
  scope TEXT NOT NULL, -- e.g. "workspace", "user", "entity"
  key TEXT NOT NULL,
  value JSONB NOT NULL,
  created_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
  updated_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
  UNIQUE (tenant_id, module_id, scope, key)
);

CREATE INDEX idx_kv_tenant_module ON ext.kv (tenant_id, module_id);
CREATE INDEX idx_kv_value_gin ON ext.kv USING GIN (value jsonb_path_ops);

Use cases:

Module settings
Small datasets (tags, categories, flags)
Cached computed values

ext.entity_attr — Entity Attributes

CREATE TABLE ext.entity_attr (
  id TEXT PRIMARY KEY DEFAULT gen_random_uuid(),
  tenant_id TEXT NOT NULL,
  module_id TEXT NOT NULL,
  entity_type TEXT NOT NULL, -- e.g. "Party", "Invoice"
  entity_id TEXT NOT NULL,
  attr_key TEXT NOT NULL,
  attr_value JSONB NOT NULL,
  created_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
  updated_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
  UNIQUE (tenant_id, module_id, entity_type, entity_id, attr_key)
);

CREATE INDEX idx_entity_attr_entity ON ext.entity_attr (tenant_id, entity_type, entity_id);
CREATE INDEX idx_entity_attr_module ON ext.entity_attr (tenant_id, module_id);
CREATE INDEX idx_entity_attr_value_gin ON ext.entity_attr USING GIN (attr_value jsonb_path_ops);

Use cases:

Custom fields attached to core entities
Module-specific metadata (e.g., "loyalty points" on Party)
Extensibility without schema changes

ext.entity_link — Entity Relationships

CREATE TABLE ext.entity_link (
  id TEXT PRIMARY KEY DEFAULT gen_random_uuid(),
  tenant_id TEXT NOT NULL,
  module_id TEXT NOT NULL,
  from_entity_type TEXT NOT NULL,
  from_entity_id TEXT NOT NULL,
  to_entity_type TEXT NOT NULL,
  to_entity_id TEXT NOT NULL,
  link_type TEXT NOT NULL,
  metadata JSONB,
  created_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
  UNIQUE (tenant_id, module_id, from_entity_type, from_entity_id, to_entity_type, to_entity_id, link_type)
);

CREATE INDEX idx_entity_link_from ON ext.entity_link (tenant_id, from_entity_type, from_entity_id);
CREATE INDEX idx_entity_link_to ON ext.entity_link (tenant_id, to_entity_type, to_entity_id);
CREATE INDEX idx_entity_link_module_type ON ext.entity_link (tenant_id, module_id, link_type);

Use cases:

"Related items" lists
Cross-entity associations (e.g., Invoice → Project)
Graph-like relationships

Guardrails

Tenant scoping: All queries MUST include tenant_id
Module isolation: Modules cannot access other modules' keys
Size limits: JSONB values should remain under 1MB (configurable)
Indexing: Use GIN indexes for JSONB query acceleration

Tier 3 — Heavy Module Escape Hatch

When a module truly needs relational tables but doesn't fit a bucket:

Bucket schema with prefixed tables (preferred):
- Place tables in related bucket schema (e.g., crm, billing)
- Name tables: <moduleSlug>__<tableName> (double underscore)
- Example: crm.loyalty__tier, crm.loyalty__points
Dedicated schema (rare, requires approval):
- Justification: 20+ tables, complex domain, future extraction
- Must document integration contracts
- Approval from architecture team
Separate database (extreme):
- Heavy workloads (analytics, timeseries, geospatial)
- Integrate via API/events only

Module Persistence Decision Tree

Does the module need persistence?
│
├─ No → Domain logic only (stateless module)
│
├─ Yes → How much data?
   │
   ├─ Minimal (settings, flags, small lists)
   │  └─ ✅ Use ext.kv
   │
   ├─ Attributes on core entities (custom fields)
   │  └─ ✅ Use ext.entity_attr
   │
   ├─ Relationships between entities
   │  └─ ✅ Use ext.entity_link
   │
   ├─ 1-5 relational tables
   │  └─ ✅ Use bucket schema with prefixed tables
   │
   ├─ 5-20 tables, fits a domain bucket
   │  └─ ✅ Use bucket schema (crm, billing, etc.)
   │
   └─ 20+ tables OR requires dedicated schema
      └─ ⚠️ Requires architecture approval

ORM Configuration (Prisma)

Multi-Schema Support

File: packages/data/prisma/schema/schema.prisma

datasource db {
  provider = "postgresql"
  schemas  = ["public", "identity", "crm", "billing", "accounting", "inventory", "sales", "purchasing", "workflow", "platform", "reporting", "ext"]
}

Schema File Organization

packages/data/prisma/schema/
├── schema.prisma                 # Main config
├── 00_ext.prisma                 # Extension tables (ext schema)
├── 10_identity.prisma            # @@schema("identity")
├── 20_workspaces.prisma          # @@schema("platform")
├── 45_party_crm.prisma           # @@schema("crm")
├── 58_accounting.prisma          # @@schema("accounting")
├── 60_billing.prisma             # @@schema("billing")
└── ...

Example Model Annotation

model Party {
  id          String   @id @default(cuid())
  tenantId    String
  displayName String
  // ...

  @@schema("crm")
  @@unique([tenantId, id])
  @@index([tenantId])
}

Migration Strategy

Current → Target Migration Path

Phase 1: Introduce ext schema (non-breaking)

Create migration: CREATE SCHEMA ext;
Add extension tables
Deploy extension storage service
No changes to existing modules

Phase 2: Migrate core tables to bucket schemas (expand/contract)

Create new schemas (identity, crm, etc.)
Create tables in new schemas
Migrate data
Update Prisma schema with @@schema annotations
Drop old public tables

Phase 3: Migrate small modules to ext storage

Identify candidates (1-2 table modules)
Implement ext storage adapters
Migrate data to ext.* tables
Remove old tables

Principle: Always expand-contract. Never breaking change in production.

Boundary Rules

Hard rules to prevent coupling:

No cross-module DB writes: Modules write ONLY to:
- Their own bucket schema tables
- ext.* tables scoped to their module_id
- Never another module's tables
Integration via contracts:
- Ports (interfaces)
- Domain events (outbox)
- Shared contracts (@corely/contracts)
Tenant scoping is mandatory:
- Every query includes WHERE tenantId = $1
- Unique constraints include tenantId
- Indexes include tenantId as first column
Extension storage isolation:
- module_id acts as namespace
- Collisions prevented by unique constraints
- No cross-module queries on ext tables

Developer Guidance

Creating a New Small Module

Use ext.kv for settings:

import { ExtKvService } from "@corely/data";

class MyModuleSettings {
  constructor(private kv: ExtKvService) {}

  async getConfig(tenantId: string): Promise<Config> {
    return this.kv.get({
      tenantId,
      moduleId: "my-module",
      scope: "workspace",
      key: "config",
    });
  }
}

Use ext.entity_attr for custom fields:

import { ExtEntityAttrService } from "@corely/data";

class LoyaltyPoints {
  constructor(private attr: ExtEntityAttrService) {}

  async getPoints(tenantId: string, partyId: string): Promise<number> {
    const attr = await this.attr.get({
      tenantId,
      moduleId: "loyalty",
      entityType: "Party",
      entityId: partyId,
      attrKey: "points",
    });
    return attr?.attrValue?.points || 0;
  }
}

Creating a Medium Module

Use bucket schema with prefixed tables:

// File: 77_loyalty.prisma

model LoyaltyTier {
  id          String @id @default(cuid())
  tenantId    String
  name        String
  minPoints   Int
  // ...

  @@schema("crm")
  @@unique([tenantId, id])
  @@index([tenantId])
}

Performance Considerations

Schema Catalog Size

Target: < 10,000 tables total
With 50 schemas, 200 tables each: 10,000 tables (threshold)
With ext tables: Most modules avoid dedicated tables

Query Performance

All indexes include tenantId for partition pruning (future)
GIN indexes on JSONB for ext.* tables
Monitor slow queries on ext tables; migrate to relational if needed

JSONB Best Practices

Keep payloads small (< 100KB ideal, < 1MB limit)
Use typed contracts for JSONB structure
Index frequently queried paths: CREATE INDEX ... USING GIN ((value -> 'status'))

Monitoring & Observability

Metrics to track:

Schema count
Table count per schema
Index count
ext.* table row counts by module_id
JSONB payload size distribution
Query performance on ext tables

Alerts:

Schema count > 60
Table count per schema > 300
JSONB payload > 1MB
Slow queries on ext.* (> 100ms)

Appendix: Schema Bucket Assignments

Bucket Schema	Modules Assigned
`identity`	identity, API keys, sessions
`platform`	workspaces, apps, templates, packs, menu overrides
`crm`	party, contacts, deals, activities
`billing`	invoices, payments, allocations
`accounting`	ledger accounts, journal entries
`inventory`	stock, locations, moves, reservations
`sales`	quotes, sales orders
`purchasing`	purchase orders, vendor bills
`workflow`	workflow definitions, instances, tasks
`platform`	documents, files, links
`reporting`	read models, snapshots, analytics aggregates
`ext`	Extension storage (kv, entity_attr, entity_link)

Next Steps

✅ Introduce ext schema and tables (Phase 1)
✅ Build extension storage service layer
✅ Migrate core tables to bucket schemas (Phase 2)
⏳ Identify small module candidates for ext migration
⏳ Update developer docs and module templates

This strategy ensures Corely can scale to 1000+ modules while keeping the database maintainable, performant, and developer-friendly.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Corely Database Persistence Strategy

Intent

Problem Statement

Three-Tier Persistence Model

Tier 1 — Domain Bucket Schemas (Core Relational Data)

Tier 2 — Extension Storage (Shared Primitives for Small Modules)

Tables

Guardrails

Tier 3 — Heavy Module Escape Hatch

Module Persistence Decision Tree

ORM Configuration (Prisma)

Multi-Schema Support

Schema File Organization

Example Model Annotation

Migration Strategy

Current → Target Migration Path

Boundary Rules

Developer Guidance

Creating a New Small Module

Creating a Medium Module

Performance Considerations

Schema Catalog Size

Query Performance

JSONB Best Practices

Monitoring & Observability

Appendix: Schema Bucket Assignments

Next Steps

FilesExpand file tree

DATABASE_PERSISTENCE_STRATEGY.md

Latest commit

History

DATABASE_PERSISTENCE_STRATEGY.md

File metadata and controls

Corely Database Persistence Strategy

Intent

Problem Statement

Three-Tier Persistence Model

Tier 1 — Domain Bucket Schemas (Core Relational Data)

Tier 2 — Extension Storage (Shared Primitives for Small Modules)

Tables

Guardrails

Tier 3 — Heavy Module Escape Hatch

Module Persistence Decision Tree

ORM Configuration (Prisma)

Multi-Schema Support

Schema File Organization

Example Model Annotation

Migration Strategy

Current → Target Migration Path

Boundary Rules

Developer Guidance

Creating a New Small Module

Creating a Medium Module

Performance Considerations

Schema Catalog Size

Query Performance

JSONB Best Practices

Monitoring & Observability

Appendix: Schema Bucket Assignments

Next Steps