Skip to content

[EPIC][CHORE]: Alembic migration hygiene โ€” reject placeholder hashes, verify chain linearity, rehash 36 affected migrationsย #4613

@jonpspri

Description

@jonpspri

๐Ÿ”ง Epic Summary

Alembic migrations have been quietly accumulating placeholder-style revision IDs (a1b2c3d4e5f6, w7x8y9z0a1b2, aa1b2c3d4e5f, โ€ฆ) instead of alembic revision's actual random-hex output (bb43712cae28, 43c07ed25a24, 191a2def08d7, โ€ฆ). The repo has paid for this with repeated rebase/merge conflicts (most recently in #3202, also in #4429 / #4548 / #4555 / #2154 / #4138). We have an existing pre-commit hook and CI harness that checks format but neither rejects placeholder content nor verifies the full chain is linear.

This epic groups the existing-but-unmerged hygiene work under one umbrella, adds the missing guardrails, and re-hashes the 36 affected revisions in a single sweep so this stops happening.

This is an epic / tracking issue. Each bullet in "Children" gets its own PR.


๐Ÿงฑ Area Affected

  • GitHub Actions / CI Pipelines (.github/workflows/alembic-upgrade-validation.yml, .github/workflows/pre-commit.yml)
  • Pre-commit hooks / linters (.pre-commit-config.yaml, .pre-commit-lite.yaml, scripts/pre-commit/check_migration_patterns.py)
  • Database / migrations (mcpgateway/alembic/versions/*)
  • Build system (Makefile db-*, migration-test-*, upgrade-validate targets)
  • Docs (tests/migration/README.md, ADR for migration hash policy)

โš™๏ธ Context / Why now

Three concrete symptoms motivate this epic:

1. 36 of 96 migrations on main use placeholder-style revision IDs. The pattern is "letter + digit alternating" ([a-z]\d[a-z]\dโ€ฆ) โ€” 12 alphanumeric chars that pass the existing pre-commit regex but are clearly hand-picked, not alembic revision-generated.

2. Stacks of pre-allocated prefixes. Two long manual chains exist:

When two PRs each pre-allocate the next letter and merge close in time, both pick the same letter and we get multiple-head merge migrations (bb43712cae28_merge_token_blocklist_with_audit_identity_heads.py is the most recent evidence).

3. Header drift. Six migrations have Revises: docstrings that disagree with the actual down_revision value in code:

File Docstring claims Code says
20a0e0538ac5_add_composite_indexes_for_metrics_time_.py 64acf94cb7f2 abf8ac3b6008
356a2d4eed6f_uuid_change_for_prompt_and_resources.py z1a2b3c4d5e6 9e028ecf59c4
add_toolops_test_cases_table.py add_oauth_tokens_table z1a2b3c4d5e6
ba202ac1665f_migrate_user_roles_to_configurable_.py a31c6ffc2239 f9a8b7c6d5e4
bb43712cae28_merge_token_blocklist_with_audit_identity_heads.py cae28b15a507, b2c3d4e5f6g7 only b2c3d4e5f6g7
r2b3c4d5e6f7_add_prompt_namespacing_fields.py k5e6f7g8h9i0, 4f07c116f917, z1a2b3c4d5e6 only 4f07c116f917

These are silent bugs โ€” alembic uses the code value, but anyone reading the docstring (humans, AI, future maintainers debugging chain shape) gets misled.

4. The CI gap that keeps letting this in. Per audit of current state:

  • scripts/pre-commit/check_migration_patterns.py validates filename hash format and revision/filename match, plus duplicate revision IDs. It does not reject placeholder-style content.
  • scripts/ci/run_upgrade_validation.sh enforces "exactly one alembic head". It does not verify the chain is linear (i.e., every revision has exactly one parent except the merge points, and merges are intentional).
  • .pre-commit-lite.yaml โ€” the config used by make pre-commit in CI โ€” does not include the check-migration-patterns hook at all. So the local hook runs only for developers with pre-commit install'd, not in CI.

๐Ÿ“ Proposed Architecture

Guardrail layer (do this first; prevents new placeholder hashes from landing)

  1. Extend check_migration_patterns.py to reject the placeholder pattern. Concrete regex (rejects "alternating letter+digit, โ‰ฅ11 chars"):

    import re
    _PLACEHOLDER_RE = re.compile(r"^(?:[a-z]\d){5,}[a-z]?$")
    if _PLACEHOLDER_RE.match(revision_id):
        raise SystemExit(f"{path}: revision '{revision_id}' looks placeholder-allocated. Run `alembic revision -m '...'` to get a real hash.")

    Also reject revision IDs that contain 0a1b, 1b2c, 2c3d, โ€ฆ, 8h9i, 9i0j substrings (the diagonal chain marker).

  2. Add a down_revision <-> Revises: docstring consistency check to the same hook. Catches the 6 mismatches above and prevents new ones.

  3. Move check-migration-patterns into .pre-commit-lite.yaml so CI's pre-commit job actually runs it. Currently the hook only protects developers running pre-commit install.

Chain-shape verification (do this in CI, not pre-commit; needs the full graph)

  1. Add --check-linear to scripts/ci/run_upgrade_validation.sh (or a sibling script). Walks the alembic graph, asserts:

    • Exactly one head (already covered).
    • Every non-merge revision has exactly one down_revision.
    • Every merge revision is named *_merge_* AND is documented in a CHANGELOG/ADR.
    • No "orphan" revisions (must be reachable from base to head).
  2. Hook the new check into alembic-upgrade-validation.yml so PRs get blocked on broken chains, not just multi-head.

Cleanup (do last; needs the guardrails first or it'll regress)

  1. Re-hash the 36 placeholder revisions in a single PR per logical batch. Use python -c "import secrets; print(secrets.token_hex(6))" to generate replacements. For each replaced revision, update:

    • The filename
    • The revision = "..." line
    • The Revises: docstring of the next migration
    • The down_revision = "..." of the next migration
    • Any merge migrations that reference the old hash
  2. Fix the 6 docstring/code mismatches during the rehash pass.

  3. Add an ADR (docs/docs/architecture/alembic-migration-hash-policy.md) recording the policy: "All revision IDs must come from alembic revision -m '...'. Manual or pre-allocated IDs are not permitted. Merge migrations must be named *_merge_* and explained in commit/ADR."


๐Ÿง‘๐Ÿปโ€๐Ÿ’ป User Story 1 โ€” Developer

As a: developer adding a new migration
I want: pre-commit to fail loudly if I hand-edit a revision ID into something cute
So that: I never get into a "two PRs picked m7g8โ€ฆ simultaneously" merge conflict again

Scenario: Placeholder pattern is rejected at commit time
  Given a new file `mcpgateway/alembic/versions/m7g8h9i0j1k2_add_my_thing.py`
  When I run `pre-commit run check-migration-patterns --files <file>`
  Then the hook fails with "revision 'm7g8h9i0j1k2' looks placeholder-allocated"

Scenario: Real alembic-generated hashes pass
  Given a new file with revision = "bb43712cae28"
  When I run the hook
  Then it passes

๐Ÿง‘๐Ÿปโ€๐Ÿ’ป User Story 2 โ€” Reviewer / CI

As a: PR reviewer / CI
I want: the upgrade-validation workflow to refuse a chain that's secretly branched
So that: "exactly one head" is no longer the only graph property we enforce

Scenario: Linear-chain verification catches a stealth branch
  Given a migration tree where rev_X and rev_Y both list rev_Z as down_revision but no merge migration unifies them
  When alembic-upgrade-validation.yml runs
  Then it fails with "non-merge revisions sharing parent rev_Z: rev_X, rev_Y"

๐Ÿง‘๐Ÿปโ€๐Ÿ’ป User Story 3 โ€” Cleanup

As a: maintainer
I want: the existing 36 placeholder revisions re-hashed and the 6 docstring mismatches fixed
So that: the codebase stops carrying historical hash debt forward

Scenario: Placeholder rehash PR
  Given the cleanup PR replaces every placeholder revision ID with a real random hex hash
  And updates every downstream `down_revision` and `Revises:` reference
  When `make migration-test-all` runs
  Then `alembic upgrade head` succeeds on a fresh DB
  And `alembic downgrade base` succeeds from head
  And `alembic heads` returns exactly one head
  And the new placeholder-pattern check passes

๐Ÿ“‹ Children (track via this epic)

Already-existing related issues that fit under this umbrella:

Already-closed prior art (cite for context, do not reopen):

New work items to file as children of this epic (after this epic is accepted):

  • A11. Extend check_migration_patterns.py with placeholder-pattern rejection + docstring-vs-code consistency check
  • A12. Add check-migration-patterns to .pre-commit-lite.yaml so CI runs it
  • A13. Add chain-linearity validation to scripts/ci/run_upgrade_validation.sh (or a sibling script) and wire into .github/workflows/alembic-upgrade-validation.yml
  • A14. Re-hash the 36 placeholder revisions in a sequence of small batched PRs (one per logical chain to keep diffs reviewable)
  • A15. Fix the 6 Revises: docstring mismatches during the rehash
  • A16. ADR: document the alembic hash policy in docs/docs/architecture/

๐Ÿ“ฆ Affected migrations (reference data for cleanup PRs)

36 placeholder-style revisions to be re-hashed:

a1b2c3d4e5f6  a4f1c7d8e9b0  a7f3c9e1b2d4  a8f3b2c1d4e5
aa1b2c3d4e5f  b1b2b3b4b5b6  b2c3d4e5f6g7  b2d9c6e4f1a7
c1c2c3c4c5c6  d3e4f5a6b7c8  d9e0f1a2b3c4  e1f2a3b4c5d6
f1a2b3c4d5e6  f8c9d3e2a1b4  f9a8b7c6d5e4  g1a2b3c4d5e6
h2b3c4d5e6f7  i3c4d5e6f7g8  j4d5e6f7g8h9  k5e6f7g8h9i0
l6f7g8h9i0j1  m7g8h9i0j1k2  n8h9i0j1k2l3  o9i0j1k2l3m4
p0a1b2c3d4e5  q1b2c3d4e5f6  r2b3c4d5e6f7  s3c4d5e6f7g8
t4d5e6f7g8h9  u5f6g7h8i9j0  v1a2b3c4d5e6  w6g7h8i9j0k1
w7x8y9z0a1b2  x7h8i9j0k1l2  y8i9j0k1l2m3  z1a2b3c4d5e6

6 Revises: docstring mismatches to fix:

  • 20a0e0538ac5_add_composite_indexes_for_metrics_time_.py
  • 356a2d4eed6f_uuid_change_for_prompt_and_resources.py
  • add_toolops_test_cases_table.py
  • ba202ac1665f_migrate_user_roles_to_configurable_.py
  • bb43712cae28_merge_token_blocklist_with_audit_identity_heads.py
  • r2b3c4d5e6f7_add_prompt_namespacing_fields.py

1 header anomaly to investigate:

  • b77ca9d2de7e_uuid_pk_and_slug_refactor.py has no Revises: line; uses a Create Date: header instead.

๐Ÿ“‹ Acceptance criteria (epic-level)

  • Pre-commit check-migration-patterns rejects placeholder-style revision IDs (regex test + docstring consistency)
  • .pre-commit-lite.yaml includes the migration hook so CI enforces it
  • CI verifies chain linearity, not just single-head
  • All 36 placeholder revisions are replaced with real hex hashes
  • All 6 docstring mismatches are fixed
  • b77ca9d2de7e header is normalized
  • ADR documenting the policy is merged
  • All listed children issues are either closed or explicitly de-scoped from this epic with reasoning

๐Ÿ“ฆ Related Make Targets

  • make pre-commit โ€” currently uses .pre-commit-lite.yaml, needs to include the migration hook
  • make upgrade-validate โ€” runs scripts/ci/run_upgrade_validation.sh, target for linearity check
  • make migration-test-all โ€” sanity gate after rehash PRs land
  • make db-heads โ€” already shows count, will be the cleanest single-head reporter post-cleanup

๐Ÿ““ Notes

  • This epic is not a security issue; it's hygiene that prevents future merge-train collisions and silent bugs.
  • The rehash work is inherently rebase-y. Recommend doing it during a quiet window and pinning a specific main commit so contributors with in-flight migrations can reparent cleanly.
  • Both feat(transport): add gRPC methods as MCP toolsย #3202 (just-rebased gRPC PR) and any unmerged PR that adds a migration WILL conflict with the rehash sweep. Plan to coordinate.
  • Audit data above is reproducible from the mcpgateway/alembic/versions/ tree at the time of filing; happy to drop the exact python -c invocations into a comment if reviewers want to verify.

Metadata

Metadata

Assignees

No one assigned

    Labels

    SHOULDP2: Important but not vital; high-value items that are not crucial for the immediate releasechoreLinting, formatting, dependency hygiene, or project maintenance choresdatabaseepicLarge feature spanning multiple issues

    Type

    No fields configured for Task.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions

      โšก