Skip to content

feat(transport): add gRPC methods as MCP tools#3202

Merged
jonpspri merged 26 commits intomainfrom
GrpcMethodsAsTools-2854
May 7, 2026
Merged

feat(transport): add gRPC methods as MCP tools#3202
jonpspri merged 26 commits intomainfrom
GrpcMethodsAsTools-2854

Conversation

@crivetimihai
Copy link
Copy Markdown
Member

Note: This PR was re-created from #2983 due to repository maintenance. Your code and branch are intact. @gabe-l-hart please verify everything looks good.

🔗 Related Issue

Addresses #2854


📝 Summary

This PR implements automatic tool registration for gRPC methods


🏷️ Type of Change

  • Bug fix
  • Feature / Enhancement
  • Documentation
  • Refactor
  • Chore (deps, CI, tooling)
  • Other (describe below)

🧪 Verification

Check Command Status
Lint suite make lint
Unit tests make test
Coverage ≥ 80% make coverage

✅ Checklist

  • Code formatted (make black isort pre-commit)
  • Tests added/updated for changes
  • Documentation updated (if applicable)
  • No secrets or credentials committed

📓 Notes (optional)

Screenshots, design decisions, or additional context.

@crivetimihai crivetimihai added this to the Release 1.2.0 milestone Feb 24, 2026
@crivetimihai crivetimihai added enhancement New feature or request mcp-protocol Alignment with MCP protocol or specification COULD P3: Nice-to-have features with minimal impact if left out; included if time permits labels Feb 24, 2026
@gabe-l-hart
Copy link
Copy Markdown
Contributor

@crivetimihai It looks like we need to redo the migration again based on the test failures

@gabe-l-hart
Copy link
Copy Markdown
Contributor

@crivetimihai I don't have write access to update the migration on this branch, so I'll officially hand it off to you. The rest of the branch looks good!

@gabe-l-hart
Copy link
Copy Markdown
Contributor

@crivetimihai It looks like this PR is somewhat stale. Is this a feature you want to adopt? If so, I can refresh the PR.

@jonpspri
Copy link
Copy Markdown
Collaborator

jonpspri commented May 6, 2026

Rebased the branch onto current main (was 477 commits behind) and force-pushed. Summary of what changed during the rebase, in case it helps reviewers diff against the previous state:

Migration parent

  • mcpgateway/alembic/versions/w7x8y9z0a1b2_add_grpc_service_id_to_tools.py: down_revision repointed from x7h8i9j0k1l2aa1b2c3d4e5f (current head). Verified alembic heads returns a single head and alembic upgrade head runs end-to-end on a fresh SQLite DB.

Conflict resolutions (kept original PR intent, adapted to current code shape):

  • mcpgateway/services/grpc_service.pyinvoke_method: kept the new SSRF/TLS validation that landed on main and the PR's stored-file-descriptor short-circuit. Both run; validation first, then descriptor-pool fast-path.
  • mcpgateway/services/tool_service.pyinvoke_tool dispatch: kept main's restructured A2A branch (status/response_text variables, structured logging, Rust delegation) and inserted the gRPC elif branch after it, using the existing tool_grpc_service_id local variable. Also kept main's query_mapping/header_mapping REST extraction together with the PR's tool_grpc_service_id extraction.
  • tests/unit/mcpgateway/services/test_tool_service.py — appended TestGrpcToolInvocation after the new test_list_*_creates_span tests that landed on main; both blocks coexist.

Test fix added

  • Two tests added to TestRustMcpExecutionPlan after this PR was last rebased build tool mocks via SimpleNamespace. The PR's _build_tool_cache_payload now reads tool.grpc_service_id, so those mocks AttributeError. Added grpc_service_id=None to both — same pattern as b576ec44a for the MagicMock(spec=DbTool) helpers.

Lint follow-up

  • make black + make isort reformatted one stray blank line in grpc_service.py and re-grouped imports in three test files. No semantic changes.

Auto-dropped commit

  • 04164898a docs: Better comment on lazy import was dropped by git rebase as "patch contents already upstream" — its comment improvement got folded into the conflict resolution on tool_service.py.

Verification done locally

  • pytest tests/unit/mcpgateway/services/test_grpc_service.py tests/unit/mcpgateway/services/test_tool_service.py tests/unit/mcpgateway/test_translate_grpc.py → 507 passed
  • ruff check --select E3,E4,E7,E9,F,D1 on changed files → all checks passed
  • black --check --line-length 200 on changed files → clean
  • isort --check-only --profile=black on changed files → clean
  • alembic heads → single head, alembic upgrade head → clean

Forward-looking note — opened #4612 as a design RFC for the tools table itself. The TL;DR: this PR is the fourth provider-specific FK on a 45-column table, and there are several other open issues (#2854, #4210, #2984, #3500) that will keep adding more. Worth picking a typing strategy (STI / JTI / hybrid JSONB / full normalization) before this PR — or the next one — lands. Not blocking on that issue, but flagging it now.

Backup tag for the pre-rebase tip is at backup/pre-rebase-20260506-080102 locally; previous remote SHA was 4e782946f.

jonpspri added a commit that referenced this pull request May 6, 2026
Addresses BLOCKING findings from a multi-agent code review of PR #3202:

Security:
* Reject local-only gRPC schemes (unix:, unix-abstract:, vsock:, fd:)
  in _validate_grpc_target since they bypass the IP-based SSRF model.
  Also normalize dns:/// and ipv4:/ipv6: name-resolver prefixes and
  parse bracketed IPv6 hosts correctly so the host-extraction step
  cannot be tricked. (review B1)
* Add _enforce_descriptor_limits to bound reflected
  FileDescriptorProto count, per-blob size, and aggregate size before
  storing or loading. Hardcoded constants (not settings) so a config
  change cannot silently weaken the DoS defense. (review B3)
* Add _validate_reflected_tool_name that runs reflected tool names
  through SecurityValidator.validate_tool_name() before persisting.
  Reflected names from upstream gRPC servers no longer bypass the
  injection / length / character checks applied to user-registered
  tools. (review B4)
* Strip metadata pseudo-keys (anything starting with '_') when
  copying discovered_services into endpoint._services so the
  _file_descriptors entry can never be confused with a service.
* Decode stored descriptors with base64.b64decode(..., validate=True)
  to fail fast on tampered DB content.

Robustness / Layer 1 invariants:
* _sync_tools_from_reflection now runs each method through a
  per-tool try/except so a single bad method cannot abort the
  entire reflection sync. Counts created / updated / failed.
  (review B9)
* _sync_tools_from_reflection now propagates parent service
  visibility, team_id, and owner_email to existing reflected tools,
  not just to newly created ones. Closes a Layer 1 token-scoping gap
  where a public->team visibility change on the parent service was
  silently ignored on already-discovered tools. (review B5)

Operational:
* invoke_method gains a timeout parameter (default settings.tool_timeout)
  and wraps endpoint.start() / endpoint.invoke() in asyncio.wait_for so
  a slow upstream cannot tie up a worker indefinitely. (review B6)
* tool_service.invoke_tool gRPC branch wraps the call in
  asyncio.wait_for, surfaces ToolTimeoutError + post-invoke timeout
  hook (matches A2A branch), preserves CancelledError instead of
  swallowing it, and logs with %-style + exc_info=True. (review B6/B7/B8)
* invoke_method preserves CancelledError, logs with %-style +
  exc_info=True, re-raises GrpcServiceError/GrpcServiceNotFoundError
  unwrapped instead of double-wrapping them. (review B8)

#2854
Branch: GrpcMethodsAsTools-2854
Signed-off-by: Jonathan Springer <jps@s390x.com>
@jonpspri
Copy link
Copy Markdown
Collaborator

jonpspri commented May 6, 2026

Ran a deep review on this PR with five specialist agents in parallel (security-focused Oracle, code-consistency reviewer, silent-failure hunter, test-coverage analyzer, type-design analyzer), then iterated review→fix→re-review until clean.

Three review/fix iterations:

Iteration Findings Fixes
1 10 BLOCKING + 21 SUGGESTED + 3 NIT 730ecfa41, 8c3a5ae6e, d9c8fe3d3, 4c0d527bc
2 6 still-BLOCKING + 1 new BLOCKING (B12) 3bd414543
3 0 BLOCKING, 1 NIT 5e053b8e4

BLOCKING items addressed (12 total):

  • B1 SSRF target validator now delegates to SecurityValidator._validate_ssrf() so metadata.google.internal and other DNS-resolved bypasses are caught — same blocklist as HTTP targets. Also rejects unix:, unix-abstract:, vsock:, fd: schemes, parses gRPC name-resolver prefixes (dns:///, ipv4:, ipv6:), and handles bracketed IPv6 literals.
  • B2 Per-GrpcEndpoint private descriptor_pool.DescriptorPool() (was descriptor_pool.Default(), the process-wide singleton). Reflected descriptors from one upstream can no longer poison subsequent calls.
  • B3 Hardcoded DoS guards on reflected descriptors (per-blob 1 MiB, count 1024, aggregate 8 MiB). Hardcoded by design so a config change can't weaken them.
  • B4 Reflected tool names run through SecurityValidator.validate_tool_name() so upstream-controlled service/method names cannot bypass the injection / length / character-class checks user-registered tools go through.
  • B5 Visibility / team_id / owner_email propagation: _sync_tools_from_reflection updates existing tools' scoping fields when they differ from the parent service, AND update_service bulk-updates child tools in the same transaction whenever those fields change. Closes the Layer 1 token-scoping gap where a public→team change on the parent silently left existing tools public.
  • B6 Real gRPC deadlines: timeout= parameter threaded through invoke_methodendpoint.start()_discover_services()_discover_service_details()ServerReflectionInfo(..., timeout=...) AND through to the actual unary gRPC call. Both the asyncio wrapper and the underlying gRPC call now have the deadline so a slow upstream can't keep an executor thread alive after the coroutine is cancelled.
  • B7+B8 except asyncio.CancelledError: raise inserted in 4 sites (gRPC inner branch in tool_service.invoke_tool + 3 outer except BaseException as e: blocks at the REST/MCP/A2A nesting levels). Cancellation is no longer wrapped as a ToolInvocationError. All exception logs use %-formatting with exc_info=True.
  • B9 Per-method try/except in _sync_tools_from_reflection so a single bad method (e.g. one that fails name validation) is skipped rather than aborting the whole reflection sync.
  • B11 MessageToDict(..., including_default_value_fields=True) (protobuf 4.x kwarg) → always_print_fields_with_no_presence=True (protobuf 5/6.x). The repo pins protobuf>=6.33.6 so the old kwarg failed at runtime — the gRPC tool invocation feature was effectively dead-on-arrival before this fix.
  • B12 MessageFactory.GetPrototype was removed in protobuf 5.x; replaced with the module-level message_factory.GetMessageClass(...) at unary and streaming sites.

New tests added (4c0d527bc):

  • TestSecurityHardening (10 tests) — covers _validate_grpc_target scheme rejection, IPv6, prefix stripping; _enforce_descriptor_limits for count/per-blob/aggregate; _validate_reflected_tool_name for empty/over-length/injection.
  • TestVisibilityPropagation (3 tests) — Layer 1 invariant + B9 per-method isolation regression test.

Suggested follow-up (not blocking):

Verification:

  • pytest tests/unit/mcpgateway/services/test_grpc_service.py tests/unit/mcpgateway/services/test_tool_service.py tests/unit/mcpgateway/test_translate_grpc.py → 520 passed
  • ruff check --select E3,E4,E7,E9,F,D1 → all checks passed
  • black --check --line-length 200 → clean
  • alembic heads → single head, alembic upgrade head → clean

Backup tag for the pre-rebase tip is preserved locally at backup/pre-rebase-20260506-080102 if anyone needs to compare.

jonpspri added a commit that referenced this pull request May 6, 2026
…rogate/pylint)

Remaining fixes the project lint cycle surfaced after the fix-now bundle landed:

* isort: re-sort sibling imports in test_grpc_service.py
  (_enforce_descriptor_limits before _GRPC_MAX_*; case-insensitive order).

* ruff PLW0108 (unnecessary lambda) in test_translate_grpc.py: align the
  4 mocks I added with the 7 pre-existing patterns in the same file
  (lambda **_kw: MagicMock() / lambda *_a, **_kw: MagicMock()).

* ruff RET501 in test_grpc_service.py::tls_spy: drop the redundant
  trailing 'return None'.

* pylint no-member on the alembic migration: add the project-standard
  '# pylint: disable=no-member' header that 9 other migrations use to
  silence alembic.op's dynamic-attribute false positives.

* pylint try-except-raise on tool_service.invoke_tool gRPC branch:
  the 'except asyncio.CancelledError: raise' clause LOOKS like a no-op
  but exists specifically so the LATER 'except Exception' cannot swallow
  cancellation (PR #3202 review B7). Added an inline disable plus a
  comment citing B7 so the next lint sweep doesn't delete the
  anti-regression code.

Verified after fix:
  make autoflake / isort / black -> clean
  make ruff (E3,E4,E7,E9,F,D1)  -> All checks passed
  make bandit                   -> 0 issues, 18022 LoC
  make interrogate              -> 100% docstring coverage
  make pylint                   -> 10.00/10 on all 6 production files
  pytest (targeted gRPC suite)  -> 525/525 passing

Branch: GrpcMethodsAsTools-2854
Signed-off-by: Jonathan Springer <jps@s390x.com>
@jonpspri jonpspri force-pushed the GrpcMethodsAsTools-2854 branch from fa116a2 to de3f896 Compare May 6, 2026 19:31
jonpspri added a commit that referenced this pull request May 6, 2026
Addresses BLOCKING findings from a multi-agent code review of PR #3202:

Security:
* Reject local-only gRPC schemes (unix:, unix-abstract:, vsock:, fd:)
  in _validate_grpc_target since they bypass the IP-based SSRF model.
  Also normalize dns:/// and ipv4:/ipv6: name-resolver prefixes and
  parse bracketed IPv6 hosts correctly so the host-extraction step
  cannot be tricked. (review B1)
* Add _enforce_descriptor_limits to bound reflected
  FileDescriptorProto count, per-blob size, and aggregate size before
  storing or loading. Hardcoded constants (not settings) so a config
  change cannot silently weaken the DoS defense. (review B3)
* Add _validate_reflected_tool_name that runs reflected tool names
  through SecurityValidator.validate_tool_name() before persisting.
  Reflected names from upstream gRPC servers no longer bypass the
  injection / length / character checks applied to user-registered
  tools. (review B4)
* Strip metadata pseudo-keys (anything starting with '_') when
  copying discovered_services into endpoint._services so the
  _file_descriptors entry can never be confused with a service.
* Decode stored descriptors with base64.b64decode(..., validate=True)
  to fail fast on tampered DB content.

Robustness / Layer 1 invariants:
* _sync_tools_from_reflection now runs each method through a
  per-tool try/except so a single bad method cannot abort the
  entire reflection sync. Counts created / updated / failed.
  (review B9)
* _sync_tools_from_reflection now propagates parent service
  visibility, team_id, and owner_email to existing reflected tools,
  not just to newly created ones. Closes a Layer 1 token-scoping gap
  where a public->team visibility change on the parent service was
  silently ignored on already-discovered tools. (review B5)

Operational:
* invoke_method gains a timeout parameter (default settings.tool_timeout)
  and wraps endpoint.start() / endpoint.invoke() in asyncio.wait_for so
  a slow upstream cannot tie up a worker indefinitely. (review B6)
* tool_service.invoke_tool gRPC branch wraps the call in
  asyncio.wait_for, surfaces ToolTimeoutError + post-invoke timeout
  hook (matches A2A branch), preserves CancelledError instead of
  swallowing it, and logs with %-style + exc_info=True. (review B6/B7/B8)
* invoke_method preserves CancelledError, logs with %-style +
  exc_info=True, re-raises GrpcServiceError/GrpcServiceNotFoundError
  unwrapped instead of double-wrapping them. (review B8)

#2854
Branch: GrpcMethodsAsTools-2854
Signed-off-by: Jonathan Springer <jps@s390x.com>
jonpspri added a commit that referenced this pull request May 6, 2026
…rogate/pylint)

Remaining fixes the project lint cycle surfaced after the fix-now bundle landed:

* isort: re-sort sibling imports in test_grpc_service.py
  (_enforce_descriptor_limits before _GRPC_MAX_*; case-insensitive order).

* ruff PLW0108 (unnecessary lambda) in test_translate_grpc.py: align the
  4 mocks I added with the 7 pre-existing patterns in the same file
  (lambda **_kw: MagicMock() / lambda *_a, **_kw: MagicMock()).

* ruff RET501 in test_grpc_service.py::tls_spy: drop the redundant
  trailing 'return None'.

* pylint no-member on the alembic migration: add the project-standard
  '# pylint: disable=no-member' header that 9 other migrations use to
  silence alembic.op's dynamic-attribute false positives.

* pylint try-except-raise on tool_service.invoke_tool gRPC branch:
  the 'except asyncio.CancelledError: raise' clause LOOKS like a no-op
  but exists specifically so the LATER 'except Exception' cannot swallow
  cancellation (PR #3202 review B7). Added an inline disable plus a
  comment citing B7 so the next lint sweep doesn't delete the
  anti-regression code.

Verified after fix:
  make autoflake / isort / black -> clean
  make ruff (E3,E4,E7,E9,F,D1)  -> All checks passed
  make bandit                   -> 0 issues, 18022 LoC
  make interrogate              -> 100% docstring coverage
  make pylint                   -> 10.00/10 on all 6 production files
  pytest (targeted gRPC suite)  -> 525/525 passing

Branch: GrpcMethodsAsTools-2854
Signed-off-by: Jonathan Springer <jps@s390x.com>
@jonpspri jonpspri force-pushed the GrpcMethodsAsTools-2854 branch from 8d2a305 to b9fc786 Compare May 6, 2026 19:46
gabe-l-hart and others added 25 commits May 7, 2026 13:22
…C services

#2854
Branch: GrpcMethodsAsTools-2854

Signed-off-by: Gabe Goodhart <ghart@us.ibm.com>

AI-usage: full
…bytes

#2854
Branch: GrpcMethodsAsTools-2854

Signed-off-by: Gabe Goodhart <ghart@us.ibm.com>

AI-usage: full
#2854
Branch: GrpcMethodsAsTools-2854

Signed-off-by: Gabe Goodhart <ghart@us.ibm.com>

AI-usage: full
#2854
Branch: GrpcMethodsAsTools-2854

Signed-off-by: Gabe Goodhart <ghart@us.ibm.com>

AI-usage: full
#2854
Branch: GrpcMethodsAsTools-2854

Signed-off-by: Gabe Goodhart <ghart@us.ibm.com>
#2854
Branch: GrpcMethodsAsTools-2854

Signed-off-by: Gabe Goodhart <ghart@us.ibm.com>

AI-usage: full
#2854
Branch: GrpcMethodsAsTools-2854

Signed-off-by: Gabe Goodhart <ghart@us.ibm.com>

AI-usage: full
#2854
Branch: GrpcMethodsAsTools-2854

Signed-off-by: Gabe Goodhart <ghart@us.ibm.com>
#2854
Branch: GrpcMethodsAsTools-2854
AI-usage: none
Signed-off-by: Gabe Goodhart <ghart@us.ibm.com>
#2854
Branch: GrpcMethodsAsTools-2854
AI-usage: full
Signed-off-by: Gabe Goodhart <ghart@us.ibm.com>
Updates down_revision from x7h8i9j0k1l2 to aa1b2c3d4e5f, the current
head on origin/main. Several new migrations have landed since this PR
was last rebased, so the previous parent is no longer the head.

Verified single head: alembic heads -> w7x8y9z0a1b2 (head)

#2854
Branch: GrpcMethodsAsTools-2854
Signed-off-by: Jonathan Springer <jps@s390x.com>
…space mocks

Two tests added to TestRustMcpExecutionPlan after this PR was last
rebased construct tool mocks via SimpleNamespace. The PR's change to
_build_tool_cache_payload reads tool.grpc_service_id, so those mocks
now AttributeError. Mirrors b576ec4's earlier fix for the same
pattern in MagicMock(spec=DbTool) helpers.

Affected:
- TestRustMcpExecutionPlan.test_prepare_rust_mcp_tool_execution_uses_live_gateway_auth_fields_for_loaded_tools
- TestRustMcpExecutionPlan.test_prepare_rust_mcp_tool_execution_uses_live_gateway_string_auth_values

#2854
Branch: GrpcMethodsAsTools-2854
Signed-off-by: Jonathan Springer <jps@s390x.com>
Tightens whitespace and import grouping per project lint config:
- mcpgateway/services/grpc_service.py: black removes a stray blank line
- tests/unit/mcpgateway/{services/test_grpc_service,services/test_tool_service,test_translate_grpc}.py: isort regroups stdlib/third-party/first-party/local imports

#2854
Branch: GrpcMethodsAsTools-2854
Signed-off-by: Jonathan Springer <jps@s390x.com>
…ptor pool

Two issues addressed in mcpgateway/translate_grpc.py:

1. Pre-existing runtime bug: invoke()/invoke_streaming() called
   json_format.MessageToDict(..., including_default_value_fields=True),
   the parameter name from protobuf 4.x. The repo pins protobuf>=6.33.6
   (see pyproject.toml), where the parameter was renamed to
   always_print_fields_with_no_presence. Calls raised TypeError at
   runtime, breaking gRPC tool invocation. Renamed the kwarg and
   replaced the # pylint: disable=unexpected-keyword-arg suppression
   with an anti-regression comment.

2. Descriptor pool poisoning: GrpcEndpoint shared
   descriptor_pool.Default(), the process-wide singleton. Reflected
   FileDescriptorProto from one (untrusted) upstream service could
   leak across requests and cause symbol collisions or type confusion
   in subsequent calls. Switched to a per-endpoint
   descriptor_pool.DescriptorPool(); MessageFactory now binds to that
   private pool.

Updated tests/unit/mcpgateway/test_translate_grpc.py monkeypatches to
expose DescriptorPool() instead of Default() and MessageFactory(pool=...)
instead of MessageFactory().

Closes review: B2, B11

#2854
Branch: GrpcMethodsAsTools-2854
Signed-off-by: Jonathan Springer <jps@s390x.com>
Addresses BLOCKING findings from a multi-agent code review of PR #3202:

Security:
* Reject local-only gRPC schemes (unix:, unix-abstract:, vsock:, fd:)
  in _validate_grpc_target since they bypass the IP-based SSRF model.
  Also normalize dns:/// and ipv4:/ipv6: name-resolver prefixes and
  parse bracketed IPv6 hosts correctly so the host-extraction step
  cannot be tricked. (review B1)
* Add _enforce_descriptor_limits to bound reflected
  FileDescriptorProto count, per-blob size, and aggregate size before
  storing or loading. Hardcoded constants (not settings) so a config
  change cannot silently weaken the DoS defense. (review B3)
* Add _validate_reflected_tool_name that runs reflected tool names
  through SecurityValidator.validate_tool_name() before persisting.
  Reflected names from upstream gRPC servers no longer bypass the
  injection / length / character checks applied to user-registered
  tools. (review B4)
* Strip metadata pseudo-keys (anything starting with '_') when
  copying discovered_services into endpoint._services so the
  _file_descriptors entry can never be confused with a service.
* Decode stored descriptors with base64.b64decode(..., validate=True)
  to fail fast on tampered DB content.

Robustness / Layer 1 invariants:
* _sync_tools_from_reflection now runs each method through a
  per-tool try/except so a single bad method cannot abort the
  entire reflection sync. Counts created / updated / failed.
  (review B9)
* _sync_tools_from_reflection now propagates parent service
  visibility, team_id, and owner_email to existing reflected tools,
  not just to newly created ones. Closes a Layer 1 token-scoping gap
  where a public->team visibility change on the parent service was
  silently ignored on already-discovered tools. (review B5)

Operational:
* invoke_method gains a timeout parameter (default settings.tool_timeout)
  and wraps endpoint.start() / endpoint.invoke() in asyncio.wait_for so
  a slow upstream cannot tie up a worker indefinitely. (review B6)
* tool_service.invoke_tool gRPC branch wraps the call in
  asyncio.wait_for, surfaces ToolTimeoutError + post-invoke timeout
  hook (matches A2A branch), preserves CancelledError instead of
  swallowing it, and logs with %-style + exc_info=True. (review B6/B7/B8)
* invoke_method preserves CancelledError, logs with %-style +
  exc_info=True, re-raises GrpcServiceError/GrpcServiceNotFoundError
  unwrapped instead of double-wrapping them. (review B8)

#2854
Branch: GrpcMethodsAsTools-2854
Signed-off-by: Jonathan Springer <jps@s390x.com>
The inline 'from mcpgateway.config import settings' in
_validate_grpc_target was made redundant by hoisting the same import
to module top in the security hardening commit.

Branch: GrpcMethodsAsTools-2854
Signed-off-by: Jonathan Springer <jps@s390x.com>
Adds two test classes exercising the helpers introduced by the
security/robustness hardening commit:

TestSecurityHardening:
* _validate_grpc_target rejects unix:/unix-abstract:/vsock:/fd: schemes
* _validate_grpc_target normalizes dns:///, dns://, dns:, ipv4:, ipv6:
  name-resolver prefixes
* _validate_grpc_target accepts bracketed IPv6 literals and rejects
  malformed bracket syntax
* _enforce_descriptor_limits enforces count, per-blob, and aggregate
  size caps; happy path passes within bounds
* _validate_reflected_tool_name rejects empty / over-length / injection
  patterns and accepts proto-style identifiers

TestVisibilityPropagation:
* _sync_tools_from_reflection propagates parent service visibility,
  team_id, and owner_email to existing reflected tools (Layer 1
  invariant).
* _sync_tools_from_reflection isolates per-method failures: a method
  with an invalid name is skipped while the rest of the sync proceeds
  (B9 regression test).

All 13 new tests pass; the broader gRPC + tool_service + translate_grpc
suite is green at 520 tests.

#2854
Branch: GrpcMethodsAsTools-2854
Signed-off-by: Jonathan Springer <jps@s390x.com>
Addresses BLOCKING items the second-pass review identified as still
present after the first round of fixes.

* B1 (DNS resolution): _validate_grpc_target now delegates the
  hostname/IP-network/DNS policy to SecurityValidator._validate_ssrf so
  hostnames like metadata.google.internal are resolved and the resolved
  IPs go through the same blocklist as HTTP targets. Removes the
  duplicate hand-rolled IP-network logic that previously lived inline.

* B5 (visibility on update): GrpcService.update_service now snapshots
  visibility / team_id / owner_email before applying the update and
  bulk-updates every child Tool with grpc_service_id == service.id in
  the same transaction whenever any of those fields change. Closes the
  Layer 1 token-scoping gap where a public->team change on the parent
  service was silently ignored on already-discovered tools.

* B6 (real gRPC deadlines): GrpcEndpoint.start, _discover_services,
  and _discover_service_details now accept a timeout kwarg that is
  threaded into ServerReflectionInfo, and GrpcEndpoint.invoke binds
  the deadline onto the underlying unary_unary call so a slow upstream
  cannot keep an executor thread alive after asyncio.wait_for cancels
  the wrapping coroutine. invoke_method passes the effective tool
  timeout into both endpoint.start and endpoint.invoke.

* B7 (cancellation propagation in outer except BaseException): adds
  except asyncio.CancelledError: raise immediately before the three
  outer except BaseException as e blocks in tool_service.invoke_tool
  so a cancellation is never wrapped as a ToolInvocationError.

* B12 (protobuf 6.x compatibility): MessageFactory.GetPrototype was
  removed in protobuf >= 5.x. Both the unary and streaming invocation
  paths in translate_grpc now call message_factory.GetMessageClass(...)
  instead. Updated the corresponding unit-test mocks.

* Cleanup: drops unused import ipaddress that became dead after
  the SecurityValidator delegation.

Test mocks updated to match the new signatures (timeout kwarg on the
populate_service / _populate callbacks; GetMessageClass at the module
level instead of MessageFactory.GetPrototype).

Lint clean (black, isort, ruff E3/E4/E7/E9/F/D1). 520 targeted tests
pass.

#2854
Branch: GrpcMethodsAsTools-2854
Signed-off-by: Jonathan Springer <jps@s390x.com>
The 'if timeout' ternary in three reflection/unary call sites silently
drops timeout=0, but gRPC accepts timeout=0 as the explicit 'fail
immediately' deadline. Switched to 'if timeout is not None' so a
caller-supplied 0.0 reaches the underlying gRPC call with the right
semantics.

Found in iteration-3 oracle review (final pass, no new BLOCKING items).

Branch: GrpcMethodsAsTools-2854
Signed-off-by: Jonathan Springer <jps@s390x.com>
…view passes

Bundle of low-risk, non-blocking improvements that surfaced during the
deep review iterations. Each item below was explicitly classified as
fix-now in the review summary; broader / design-dependent items were
filed as follow-ups under #4612 and #4613.

* Schema parity (#1): add grpc_service_id to ToolRead in schemas.py
  alongside the existing gateway_id field so API consumers can identify
  gRPC-discovered tools (review S2.4 / TD1).

* db.py cleanup (#2): drop redundant nullable=True from
  Tool.grpc_service_id; Mapped[Optional[str]] already implies it and
  the sibling gateway_id column omits it (review TD2).

* translate_grpc.load_file_descriptors (#3): widen the parameter type
  from List[bytes] to Sequence[bytes] and reject a single bytes object
  passed by mistake. Without the guard Python would silently iterate
  byte-by-byte (review TD3).

* translate_grpc descriptor pool conflict (#4): replace the inaccurate
  'no-op if already added' comment with an explicit TypeError handler.
  protobuf raises TypeError when a file with the same name has
  conflicting content; the existing descriptor stays authoritative
  (review S3.3).

* test_sync_tools_removes_stale_tools (#5): replace the brittle
  string-match assertion ('DELETE' in str(call)) with an explicit
  call_count == 4 (1 select + 3 deletes), which no longer depends on
  the SQLAlchemy Delete object repr (review S2.6).

* translate_grpc close() (#6): convert the f-string log to lazy
  %-style for consistency with the rest of this PR's logging
  (review N2.1).

* grpc_service _sync_tools_from_reflection (#7): document why
  input_schema['properties'] is empty by design — gRPC arg shape is
  validated at the protobuf invocation layer, not the MCP tool-call
  layer; the actual proto types live in the x-grpc-* extensions
  (review N2.2).

* TestInvokeMethodGuards (#8): add 5 edge-case tests covering
  invoke_method paths the previous suite did not exercise:
  - service-not-found  -> GrpcServiceNotFoundError
  - disabled service   -> GrpcServiceError('is disabled')
  - invalid method format (no dot) -> GrpcServiceError
  - _validate_grpc_target spy called with service.target
  - _validate_tls_path spy called for both cert and key paths

525 targeted tests pass (was 520). Lint clean.

Branch: GrpcMethodsAsTools-2854
Signed-off-by: Jonathan Springer <jps@s390x.com>
…rogate/pylint)

Remaining fixes the project lint cycle surfaced after the fix-now bundle landed:

* isort: re-sort sibling imports in test_grpc_service.py
  (_enforce_descriptor_limits before _GRPC_MAX_*; case-insensitive order).

* ruff PLW0108 (unnecessary lambda) in test_translate_grpc.py: align the
  4 mocks I added with the 7 pre-existing patterns in the same file
  (lambda **_kw: MagicMock() / lambda *_a, **_kw: MagicMock()).

* ruff RET501 in test_grpc_service.py::tls_spy: drop the redundant
  trailing 'return None'.

* pylint no-member on the alembic migration: add the project-standard
  '# pylint: disable=no-member' header that 9 other migrations use to
  silence alembic.op's dynamic-attribute false positives.

* pylint try-except-raise on tool_service.invoke_tool gRPC branch:
  the 'except asyncio.CancelledError: raise' clause LOOKS like a no-op
  but exists specifically so the LATER 'except Exception' cannot swallow
  cancellation (PR #3202 review B7). Added an inline disable plus a
  comment citing B7 so the next lint sweep doesn't delete the
  anti-regression code.

Verified after fix:
  make autoflake / isort / black -> clean
  make ruff (E3,E4,E7,E9,F,D1)  -> All checks passed
  make bandit                   -> 0 issues, 18022 LoC
  make interrogate              -> 100% docstring coverage
  make pylint                   -> 10.00/10 on all 6 production files
  pytest (targeted gRPC suite)  -> 525/525 passing

Branch: GrpcMethodsAsTools-2854
Signed-off-by: Jonathan Springer <jps@s390x.com>
Updates down_revision from aa1b2c3d4e5f to 926d3e07d098. Origin/main
gained the CPEX plugin framework replacement (#3754) which introduced
two new migrations and a mergepoint, leaving our grpc_service_id
migration as a sibling head. Re-pointing to the new head returns the
chain to single-headed.

Verified single head:
  alembic heads -> w7x8y9z0a1b2 (head)
  alembic upgrade head -> clean on a fresh sqlite DB

#2854
Branch: GrpcMethodsAsTools-2854
Signed-off-by: Jonathan Springer <jps@s390x.com>
…ice_no_grpc

Two related fixes for the no-grpc test file that surfaced after rebase:

1. Restore is_reserved / is_multicast pre-check in _validate_grpc_target.
   The earlier delegation to SecurityValidator._validate_ssrf inadvertently
   dropped this guard because the shared validator only checks
   blocked-networks / localhost / private. IP literals like 224.0.0.1
   (multicast) or 0.0.0.0 (reserved) would have slipped past unless they
   were also in ssrf_blocked_networks. The restored check excludes
   loopback so ::1 (which Python flags as both is_loopback AND
   is_reserved) still passes through the localhost gate as intended.

2. Update test_grpc_service_no_grpc.py:
   - Add timeout=None kwarg to the three FakeEndpoint.start() and
     invoke() methods so they accept the per-RPC deadline added in the
     security/robustness hardening commit.
   - Update test_validate_grpc_target_enforces_ssrf_rules regex
     patterns to match the SecurityValidator messages reached via
     delegation (blocked hostname X, localhost address which is
     blocked, private network address which is blocked).

All 4 gRPC test files pass: 528 / 528.

#2854
Branch: GrpcMethodsAsTools-2854
Signed-off-by: Jonathan Springer <jps@s390x.com>
…vice+translate_grpc

Adds 11 targeted tests + a decision-record comment to address the diff-coverage
report that flagged 18 previously-uncovered lines:

  grpc_service.py:    95.7% -> 100%  (7/7 covered)
  translate_grpc.py:  91.7% -> 100%  (3/3 covered)
  tool_service.py:    69.2% -> 62%   (5/8 covered + 3 documented)

New tests in tests/unit/mcpgateway/services/test_grpc_service.py:

* test_validate_grpc_target_empty_string                        -> line 134
* test_invoke_method_propagates_cancelled_error                 -> line 1021
* test_invoke_method_re_raises_timeout                          -> lines 1023-1024
* test_invoke_method_re_raises_grpc_service_error_unwrapped     -> line 1026
* test_update_service_propagates_visibility_to_child_tools      -> lines 503-504

New tests in tests/unit/mcpgateway/test_translate_grpc.py:

* test_load_file_descriptors_rejects_single_bytes               -> line 409
* test_load_file_descriptors_skips_pool_conflict                -> lines 420, 425

New tests in tests/unit/mcpgateway/services/test_tool_service.py:

* test_invoke_grpc_tool_propagates_cancellation                 -> line 5847
* test_invoke_grpc_tool_timeout_raises_tool_timeout_error       -> lines 5849-5850, 5852

Restored a regression: the earlier B1 refactor (delegating to
SecurityValidator._validate_ssrf) inadvertently dropped the is_reserved /
is_multicast guard, since SecurityValidator only checks blocked-networks /
localhost / private. Restored the local guard with an is_loopback exclusion
so IPv6 ::1 (which Python flags as both is_loopback AND is_reserved)
still passes through the localhost gate.

Coverage decision-record left in test_tool_service.py for the 3 remaining
B7 anti-regression lines (5446 REST, 5632 MCP, 5851 gRPC timeout-with-pm):
they are byte-identical to the gRPC variant at line 5847 (which is fully
exercised). Equivalent REST/MCP tests require coaxing CancelledError
through asyncio.wait_for, which Python's event loop converts to TimeoutError
in some states. Protecting one branch is sufficient to detect a regression
that would affect all three.

541 tests passing, ruff/black/isort/pylint clean.

#2854
Branch: GrpcMethodsAsTools-2854
Signed-off-by: Jonathan Springer <jps@s390x.com>
…ate 100%

Two CI failures triggered by main moving forward during this PR's review:

* alembic multi-head: PR #4501 ("add ondelete=CASCADE to FK constraints")
  introduced 9fb98535724d as a sibling of our w7x8y9z0a1b2, both parented
  to 926d3e07d098. Re-parents w7x8y9z0a1b2 -> 9fb98535724d so the chain
  has a single head again and bootstrap_db / SQLite+Postgres upgrade
  validation pass. Verified single head with alembic heads.

* interrogate --fail-under=100: three nested helper functions had no
  docstrings, dropping the package to 99.9%. One is mine
  (GrpcEndpoint.invoke._call introduced by the gRPC timeout fix in
  PR #3202 review B6); the other two (_user_obj_to_dict._iso,
  _user_dict_to_obj._dt in services/email_auth_service.py) landed
  on main via PR #4595 and were already failing main's CI. All three
  added one-line docstrings.

Cascade fixed by these two changes:
  - SQLite + PostgreSQL Fresh/Upgrade (alembic head check)
  - pytest (py3.12) (bootstrap_db on alembic head check)
  - playwright-ci-smoke (depends on bootstrap)
  - sql-sanitizer E2E (depends on bootstrap)
  - interrogate (mcpgateway) (docstring coverage)
  - Run pre-commit hooks (interrogate is a pre-commit hook)

The 2 email_auth_service docstrings are technically out of scope for
this PR but the local lint cycle scoping I had been using
(make interrogate TARGET=<my-files>) didn't catch them. Folding
them in here unblocks PR-level CI rather than waiting for a separate
fix to main.

Verified locally:
  alembic heads      -> single head w7x8y9z0a1b2
  interrogate        -> 100% on full mcpgateway/
  black --check      -> clean
  ruff check         -> clean
  pylint             -> 10.00/10
  targeted gRPC tests -> 541 passing

#2854
Branch: GrpcMethodsAsTools-2854
Signed-off-by: Jonathan Springer <jps@s390x.com>
@jonpspri jonpspri force-pushed the GrpcMethodsAsTools-2854 branch from 4c9abfb to fae111a Compare May 7, 2026 13:09
@jonpspri jonpspri merged commit 47dfa0e into main May 7, 2026
50 checks passed
@jonpspri jonpspri deleted the GrpcMethodsAsTools-2854 branch May 7, 2026 13:38
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

COULD P3: Nice-to-have features with minimal impact if left out; included if time permits enhancement New feature or request mcp-protocol Alignment with MCP protocol or specification

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants