Skip to content

[CHORE]: Tool table needs subtyping by provider โ€” pick a pattern before next provider FK landsย #4612

@jonpspri

Description

@jonpspri

๐Ÿ”ง Chore Summary

The tools table on main has accumulated 45 columns (per mcpgateway/db.py:3166+), three distinct families of provider-specific config baked in (REST/MCP/A2A), one gateway_id foreign key for federated MCP, and is about to gain grpc_service_id via #3202. Several other columns are stored but never queried.

Before kicking off the broader MCP services refactor, we should agree on a single typing strategy for tool variants โ€” applied consistently in both the Python ORM and the Rust runtime โ€” so the next round of work doesn't bake in another N provider-specific FKs and dispatch branches.

This is a design/scoping issue, not an implementation issue. Goal: choose one of the patterns below, document it, then file follow-up issues for the migration.


๐Ÿงฑ Area Affected

  • Database schema / Alembic migrations
  • Python ORM (mcpgateway/db.py, services)
  • Rust MCP runtime (crates/mcp_runtime/) โ€” cross-language parity
  • Pydantic schemas (mcpgateway/schemas.py)
  • Type-checking (mypy/pyrefly) โ€” STI vs JTI vs JSONB drives different ORM type stories

โš™๏ธ Context / Why now

ContextForge has been growing one provider at a time and the tools table has paid the price each round:

De-facto subtyping that already exists in code (per audit on origin/main):

Provider Writer integration_type Provider-specific columns / annotations
Manual REST tool_service.register_tool() (tool_service.py:1796, DbTool(...) at :1919) REST url, request_type, headers, auth_type, auth_value, base_url, path_template, query_mapping, header_mapping, timeout_ms, expose_passthrough, allowlist, plugin_chain_pre, plugin_chain_post, jsonpath_filter
Federated MCP gateway_service._create_db_tool() (gateway_service.py:4562) MCP gateway_id (FK), gateway-derived auth_type, auth_value, headers, request_type โˆˆ {SSE, STDIO, STREAMABLEHTTP}
A2A tool_service.create_tool_from_a2a_agent() (tool_service.py:6683) A2A request_type=POST, annotations.a2a_agent_id, annotations.a2a_agent_type, auth_type, auth_value (the agent reference is currently smuggled through the annotations JSON column)
gRPC (in-flight, #3202) grpc_service._sync_tools_from_reflection() gRPC new grpc_service_id FK, input_schema.x-grpc-input-type, input_schema.x-grpc-output-type, input_schema.x-grpc-client-streaming, input_schema.x-grpc-server-streaming (currently smuggled through input_schema)
GraphQL (proposed, #2984) not yet implemented TBD will need yet another schema/method shape
OpenAPI/Swagger import (proposed, #3500) also REST today REST will likely need spec/operation refs

invoke_tool() dispatch tree (tool_service.py:4296+) already branches into:

  • REST (:4804) โ€” reads url, request_type, headers, auth_*, query_mapping, header_mapping, output_schema, jsonpath_filter
  • MCP (:5077) โ€” reads request_type as transport, gateway-derived auth/cert/passthrough
  • A2A (:5708) โ€” reads a2a_agent_* from tool_annotations
  • gRPC (:5811, after feat(transport): add gRPC methods as MCP toolsย #3202) โ€” reads grpc_service_id

Symptoms of the smell:

  1. Provider-specific columns leak in via JSON keys because authors don't want to add yet another column. annotations.a2a_agent_id and input_schema.x-grpc-input-type are both load-bearing fields hidden inside JSON blobs the schema doesn't declare.
  2. Many columns are stored but never queried in mcpgateway/services/: created_by, created_from_ip, created_user_agent, import_batch_id, federation_source, _computed_name (per Tool.<col> / DbTool.<col> audit). Strong candidates to collapse into a JSON config blob.
  3. No constraint that REST-only fields stay null on non-REST rows. expose_passthrough, allowlist, plugin_chain_pre, plugin_chain_post, jsonpath_filter, base_url, path_template, query_mapping, header_mapping, timeout_ms are all nullable on non-REST rows by accident, not by design.
  4. integration_type is a string column with no enum. Dispatch is if tool_integration_type == "REST" style across multiple services โ€” easy to typo, easy to miss a branch when adding a provider.
  5. Add-a-provider currently means: new column on tools, new mapped_column on Tool, new *_id FK to a new provider table, new elif in invoke_tool, new write path in the provider service, new fields on ToolCreate / ToolUpdate / ToolRead, new test fixtures, new mock attribute (tool.grpc_service_id=None) on every SimpleNamespace/MagicMock(spec=DbTool) in the test suite. PR feat(transport): add gRPC methods as MCP toolsย #3202 hit every one of these.

Cross-language: crates/mcp_runtime/src/lib.rs currently models tools as flat rows (McpToolDefinition :655-666, ResolvedMcpToolCallPlan :423-462, raw SQL at :5081-5142) over tokio-postgres โ€” no discriminated union, no trait hierarchy, no serde tag. So whatever Python picks, Rust either has to mirror it or the two implementations diverge structurally.

Why now: the tracking issue mentions "major refactoring of the MCP services" is imminent. Locking in a typing decision now prevents a third round of provider-specific FKs.


๐Ÿงญ Design options

Each pattern has a working code skeleton in SQLAlchemy 2.0 syntax, plus Mapped[] subtype fields where relevant.

Option A โ€” Single Table Inheritance (STI)

class Tool(Base):
    __tablename__ = "tools"
    id: Mapped[str] = mapped_column(String(36), primary_key=True)
    integration_type: Mapped[str] = mapped_column(String(32), nullable=False)
    name: Mapped[str]
    # generic columns
    __mapper_args__ = {"polymorphic_on": integration_type, "polymorphic_identity": "base"}

class RestTool(Tool):
    base_url: Mapped[str | None]
    request_type: Mapped[str | None]
    query_mapping: Mapped[dict | None] = mapped_column(JSON)
    # ... REST-only fields
    __mapper_args__ = {"polymorphic_identity": "REST"}

class McpTool(Tool):
    gateway_id: Mapped[str | None] = mapped_column(ForeignKey("gateways.id"))
    __mapper_args__ = {"polymorphic_identity": "MCP"}

class GrpcTool(Tool):
    grpc_service_id: Mapped[str | None] = mapped_column(ForeignKey("grpc_services.id"))
    __mapper_args__ = {"polymorphic_identity": "gRPC"}
  • Pros: 1 table, simple polymorphic queries, zero migration disruption (current schema already is the STI base table โ€” we'd add polymorphic_on to the existing column).
  • Cons: still wide & null-heavy; weak DB-level integrity for subtype fields; subtype-specific indexes are messy.

Authoritative ref: lib/sqlalchemy/orm/mapper.py:212-220, official docs https://docs.sqlalchemy.org/en/20/orm/inheritance.html

Option B โ€” Joined Table Inheritance (JTI)

class Tool(Base):
    __tablename__ = "tools"
    id: Mapped[str] = mapped_column(String(36), primary_key=True)
    integration_type: Mapped[str]
    name: Mapped[str]
    __mapper_args__ = {"polymorphic_on": integration_type, "polymorphic_identity": "base", "polymorphic_load": "selectin"}

class RestTool(Tool):
    __tablename__ = "tools_rest"
    id: Mapped[str] = mapped_column(ForeignKey("tools.id"), primary_key=True)
    base_url: Mapped[str | None]
    request_type: Mapped[str | None]
    # ... REST-only
    __mapper_args__ = {"polymorphic_identity": "REST"}

class McpTool(Tool):
    __tablename__ = "tools_mcp"
    id: Mapped[str] = mapped_column(ForeignKey("tools.id"), primary_key=True)
    gateway_id: Mapped[str] = mapped_column(ForeignKey("gateways.id"))
    __mapper_args__ = {"polymorphic_identity": "MCP"}
  • Pros: real DB integrity (NOT NULL where it matters); narrow tools core; new providers add a new tools_<kind> table without touching the others; polymorphic_load="selectin" prevents N+1.
  • Cons: every read joins; more migration work; crates/mcp_runtime raw SQL has to learn LEFT JOINs per kind.

Authoritative ref: SQLAlchemy selectin_polymorphic at lib/sqlalchemy/orm/mapper.py:212-220; in-the-wild example: rivenmedia/riven/src/program/media/item.py:761-770.

Option C โ€” Concrete Table Inheritance

class RestTool(Base):
    __tablename__ = "tools_rest"
    id, name, ... + REST fields
    __mapper_args__ = {"polymorphic_identity": "REST", "concrete": True}

class McpTool(Base):
    __tablename__ = "tools_mcp"
    id, name, ... + MCP fields
    __mapper_args__ = {"polymorphic_identity": "MCP", "concrete": True}
  • Pros: zero nulls, total provider isolation.
  • Cons: polymorphic queries become UNION-heavy; relationships (servers, metrics) are awkward; current foreign keys (server_tool_association.tool_id) need rework.

Authoritative ref: examples/inheritance/concrete.py:37-79 in SQLAlchemy.

Option D โ€” Flat core + provider_config JSONB blob (hybrid)

class Tool(Base):
    __tablename__ = "tools"
    id: Mapped[str]
    integration_type: Mapped[str]
    # generic + governance columns only
    provider_config: Mapped[dict[str, Any]] = mapped_column(JSONB, nullable=False, default=dict)

with Pydantic discriminated unions on the Python edge:

class RestConfig(BaseModel):
    kind: Literal["REST"]
    base_url: str
    request_type: str
    # ...

class McpConfig(BaseModel):
    kind: Literal["MCP"]
    gateway_id: str

class GrpcConfig(BaseModel):
    kind: Literal["gRPC"]
    grpc_service_id: str
    method_input_type: str
    method_output_type: str

ProviderConfig = Annotated[
    RestConfig | McpConfig | A2AConfig | GrpcConfig,
    Field(discriminator="kind"),
]
  • Pros: fastest schema evolution; new providers add zero DDL; matches what most adjacent OSS projects already do; eliminates the annotations.a2a_agent_id / input_schema.x-grpc-* smuggling.
  • Cons: weaker DB constraints; can't easily index inside JSONB without GIN; FK relationships (gateway_id, grpc_service_id) become "soft" string IDs unless we keep them as proper columns.

Adjacent OSS evidence:

  • arc53/DocsGPT/application/storage/db/models.py:70-80 โ€” tool config / requirements / actions all in JSONB
  • Haohao-end/openagent/api/internal/model/app.py:195-206 โ€” tools, graph, retrieval_config, long_term_memory all JSONB
  • langgenius/dify/api/core/provider_manager.py:59-69 โ€” typed ProviderConfigurations registry, no ORM polymorphism
  • BerriAI/litellm/litellm/utils.py:8015-8022 โ€” ProviderConfigManager keyed by LlmProviders enum

Option E โ€” Full normalization

class Tool(Base):  # generic only
    id, integration_type, name, ...

class RestToolDetails(Base):
    tool_id: FK โ†’ tools.id (PK)
    base_url, request_type, ...

class McpToolDetails(Base):
    tool_id: FK โ†’ tools.id (PK)
    gateway_id: FK โ†’ gateways.id

class GrpcToolDetails(Base):
    tool_id: FK โ†’ tools.id (PK)
    grpc_service_id: FK โ†’ grpc_services.id
  • Pros: strongest integrity, best for large/complex provider payloads, no NULL columns anywhere.
  • Cons: most joins, most tables, biggest migration; we'd be adding ~3-4 new tables on day one.

๐Ÿ“Š Tradeoffs matrix

Pattern Query perf Type safety Schema flex Rust parity cost Migration cost Best fit if
A. STI high medium low low (still 1 row) very low provider-specific cols are few/small
B. JTI medium high medium medium (JOINs) medium shared core + real subtype data
C. Concrete low high low high (UNION-heavy) high hard isolation, rare polymorphic reads
D. Hybrid JSONB high medium (Pydantic-validated) very high low (mirrors serde(tag) cleanly) low provider config is volatile / many providers
E. Full normalize high highest high highest (per-table queries) highest long-lived, high-fidelity provider data

๐Ÿฆ€ Cross-language parity contract

The Rust runtime today:

  • crates/mcp_runtime/src/lib.rs:655-666 โ€” McpToolDefinition flat struct
  • crates/mcp_runtime/src/lib.rs:5081-5142 โ€” raw SELECT โ€ฆ FROM tools with no kind discrimination
  • No #[serde(tag = "...")] on tool types; the only tagged enum in the workspace is unrelated cache invalidation (crates/a2a_runtime/src/cache.rs:302-312)

If Python picks Option A or D, Rust can keep its flat row model with minor changes (add integration_type discriminator field, optional provider_config: serde_json::Value).
If Python picks Option B/C/E, Rust needs explicit per-kind structs and either UNION queries or JOINs.

The decision must be made jointly โ€” the contract surface is the wire/DB schema both runtimes hit.


โœ… Recommended path (for discussion)

Option D (Hybrid JSONB) for provider-specific config + Option B (JTI) only for the foreign-keyed providers.

Rationale:

  • Most provider-specific fields are stored, never queried โ€” perfect JSONB candidates (audit shows headers, auth_value, base_url, path_template, query_mapping, header_mapping, timeout_ms, expose_passthrough, allowlist, plugin_chain_*, jsonpath_filter are never used in WHERE/JOIN clauses in services/).
  • Real foreign keys (gateway_id, grpc_service_id, future A2A agent ref) belong as proper columns / JTI side tables for ON DELETE CASCADE and indexable joins.
  • Pydantic discriminated unions provide compile-time-ish safety on the API surface.
  • Rust mirrors cleanly: enum ProviderConfig { Rest{...}, Mcp{...}, ... } with #[serde(tag = "kind")] over a provider_config: serde_json::Value column.
  • Migration is incremental: ship the new provider_config JSONB column, double-write for one release, switch readers, drop legacy columns in a follow-up.

Open for debate. The point of this issue is to pick one before #3202 lands a fourth provider FK.


๐Ÿ““ Migration recipe (for the chosen option)

  1. Add the discriminator/JSONB column nullable, ship it.
  2. Backfill via Alembic data migration โ€” read from existing columns, write into the new shape, leave old columns in place.
  3. Switch all readers to the new shape under a feature flag.
  4. Once stable, drop the old columns.
  5. Add a CI test that parses every integration_type enum case (catches missing dispatch branches).

use_existing_column=True is the SQLAlchemy escape hatch when multiple STI subclasses want the same column during the transition: lib/sqlalchemy/orm/mapper.py (test_inheritance.py:825-838 in SQLAlchemy repo).


๐Ÿ”— Related / prior art

This RFC overlaps with active work and should be coordinated, not duplicated:


๐Ÿ“‹ Acceptance criteria

  • One option (Aโ€“E or a documented hybrid) is selected via this issue's discussion thread
  • An ADR / docs/docs/architecture/ doc is added recording the decision and rationale
  • Cross-language parity contract is written down (which fields live where, which side owns validation)
  • Follow-up issues are filed for: (a) Alembic migration recipe, (b) mcpgateway/db.py model changes, (c) crates/mcp_runtime mirror changes, (d) Pydantic schema changes, (e) test-fixture refactor (kills the mock.grpc_service_id=None paper-cut), (f) deprecation policy for legacy columns
  • PR feat(transport): add gRPC methods as MCP toolsย #3202 is updated to use the chosen pattern (or merges with the current shape and is followed by a refactor PR โ€” this issue resolves whether to block feat(transport): add gRPC methods as MCP toolsย #3202 on the redesign)

๐Ÿ“ Notes

  • This is intentionally a "design meeting in an issue" rather than a PR proposal. No code yet.
  • Audit data and adjacent-project survey are reproducible; happy to drop the exact gh search / grep invocations into a comment if reviewers want to verify.

Metadata

Metadata

Assignees

No one assigned

    Labels

    SHOULDP2: Important but not vital; high-value items that are not crucial for the immediate releasechoreLinting, formatting, dependency hygiene, or project maintenance choresdatabase

    Type

    No fields configured for Task.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions

      โšก