Give the chat router fast-path tools for browsing and inventory by dannon · Pull Request #22610 · galaxyproject/galaxy

dannon · 2026-04-30T21:21:13Z

The router was handing off every query to a specialist, even for trivial read-only operations. "What histories do I have?" round-tripped through HistoryAgent for a result that's just list_user_histories(). This adds a small set of @agent.tool functions on the router itself for browsing/inventory queries, so the easy cases skip the handoff round-trip while everything else still gets specialist routing.

Curation principle: stateless + read-only + single AgentOperationsManager call = router tool. Multi-step reasoning, structured output, or domain-specific prompting = specialist handoff.

The seven tools:

Tool	Used for
`list_histories(limit=10)`	"What histories do I have?"
`get_history_summary(history_id)`	"Tell me about history abc123" (metadata only)
`list_workflows(filter="")`	"What workflows do I have?"
`search_workflows(query, limit=10)`	"Do I have an RNA-seq workflow?"
`search_tools(query, limit=10)`	"Is FastQC installed?"
`get_user_info()`	"Who am I?"
`get_server_info()`	"What version of Galaxy is this?"

The router's prompt has a new "Fast-path tools" section listing each tool and a one-line cue for when to use it vs. hand off. Recommendation queries ("what tool should I use for my analysis?") still route to tool_recommendation -- the new tools are for availability, not ranking. IWC catalog search is deliberately out of scope; that's a different data source / ranking story and deserves its own follow-up.

Two safety details: search_tools clamps its limit argument so zero/negative values fall back to the default of 10 and oversized values cap at 50 (the underlying ops.search_tools doesn't take a limit parameter, so the wrapper slices). get_history_summary catches MalformedId and returns a structured error pointing the model back at list_histories rather than crashing the run.

Hard cap of 6-10 tools to keep routing quality from drifting. We're at 7; if more fast-paths are wanted later (workflow-run browsing -- get_invocations / get_invocation_details / get_workflow_details -- is the obvious Tier 2), the right move is probably a "browsing skill" rather than growing the router further.

Test plan

pytest test/unit/app/test_agents.py::TestAgentUnitMocked -- 35 passed (12 new: registration with cap guardrail, per-tool happy paths, MalformedId path, empty-filter path, oversized-result truncation, parametrized clamp tests for 0/-1/-5 limits, oversized-limit clamp at 50)
Manual smoke: ask the chat router "what histories do I have?" and confirm it answers via the tool rather than handing off

Right now the router hands every browsing query off to a specialist, even trivial ones -- "what histories do I have?" round-trips through the history agent for what should be a single AgentOperationsManager call. This wires up four @agent.tool fast-path tools on QueryRouterAgent so the router can answer those directly: list_histories, get_history_summary, list_workflows, and get_user_info. Each one is a thin async wrapper around AgentOperationsManager (via anyio.to_thread.run_sync), so the existing service-layer permission and encoding logic still runs. The curation principle (also documented in router.md): router tools are for stateless, low-cost, read-only browsing that maps to a single ops call. Anything that needs domain prompting, multi-step context building, or structured output -- summarizing a history, error analysis, tool generation -- still hands off to the specialist. The original target list included list_recent_jobs, but AgentOperationsManager doesn't expose a list-jobs method (only get_job_status for a single id), so that one's dropped here. We can add it later if/when the ops layer grows that capability. Tests cover registration (the four tools show up in agent.toolsets) plus per- tool unit tests that mock AgentOperationsManager and assert the right method gets called with the right args, including a MalformedId path on get_history_summary and the empty-filter -> search=None translation on list_workflows.

Server metadata (version, brand, URL, capability flags) is a stateless single-call lookup that doesn't need a specialist. Adds the tool to the router, lists it in the prompt's fast-path section, and adds focused unit-test coverage.

Inventory/availability questions like "is FastQC installed?" or "do I have an RNA-seq workflow?" don't need specialist routing -- they're single-call lookups against the toolbox or workflow index. The router handling them directly avoids a handoff round-trip. Boundary stays tight: these are availability/inventory only. "What tool should I use for my analysis?" still routes to the tool_recommendation specialist, and IWC catalog search is deliberately out of scope (different data source, different ranking semantics) -- left for a follow-up. ops.search_tools doesn't accept a limit, so the router wrapper slices oversized result lists and flags the truncation back to the model.

Negative limits silently sliced from the tail (tools[:-5]) and very large limits would have happily echoed thousands of results. Treat anything <= 0 as "use the default of 10," and cap at 50 to bound the per-turn payload the router ships back to the model.

CLAUDE.md is explicit that test files shouldn't use inline imports. Move the lone galaxy.exceptions.MalformedId import to the top of the test file alongside the other galaxy.* imports.

jmchilton · 2026-04-30T22:29:39Z

        self.deps.get_agent.assert_called_once_with("history", self.deps)
        assert "History summary" in response

+    def test_router_registers_fast_path_tools(self):


I really don't think these tests are useful - this first one seems to just be testing the literal implementation list of strings. All the rest of them are just sort of seem to be testing the mock data being patched?

I would prefer if you drop them. If you're unwilling to drop them would you trade for me adapting the test fixture static response for router testing - not work I want to do per se but I would prefer it to these unit tests.

yeah, 100% fair -- they were mostly testing the mocks back at themselves. even the "real logic" ones tested 2-4 lines of arithmetic that don't prove the router actually picks the right tool. all gone in 9f5666d

Mypy on CI didn't like the bare ``router.agent.toolsets[0].tools[name]`` expression repeated across the per-tool tests: ``toolsets[0]`` is typed as ``AbstractToolset``, and ``tools`` is only an instance attribute on the concrete ``FunctionToolset`` subclass. Extract a single ``_fast_path_tool(router, name)`` helper using ``getattr`` (matching the pattern the registration test already uses) so the cast lives in one place and returns Any -- the per-tool tests then look up tools without tripping mypy.

John flagged in galaxyproject#22610 that the per-tool unit tests don't earn their keep -- the registration test just enumerates strings, and the per-tool tests largely confirm "the mock returned what I patched it with". A few of them tested small bits of wrapper logic (limit clamping, MalformedId handling, empty-filter mapping) but even those were 2-4 lines of arithmetic each, with no coverage of the thing that actually matters: whether the router decides to call these tools at the right times. Drop the whole TestAgentUnitMocked fast-path block plus the ``_fast_path_tool`` helper that just got added for mypy. John offered to extend the existing static-response test fixture for router testing as a follow-up, which gives us real router-decision coverage instead of mock-passthrough coverage.

mvdbeek · 2026-05-01T14:18:42Z

 - Scientific analysis best practices
 - Galaxy features and capabilities

+## Fast-path tools


The section seems relatively verbose. Maybe we should have some post-processing for markdown consumed by LLMs that compacts these (caveman style) ?

That's a good idea, let's follow up after I get the evals framework in and will have benchmarks to iterate on

github-actions · 2026-05-01T14:20:55Z

This PR was merged without a "kind/" label, please correct.

dannon added 5 commits April 29, 2026 23:05

Expose get_server_info as a router fast-path tool

d0705af

Server metadata (version, brand, URL, capability flags) is a stateless single-call lookup that doesn't need a specialist. Adds the tool to the router, lists it in the prompt's fast-path section, and adds focused unit-test coverage.

Hoist MalformedId import out of fast-path test body

66f6751

CLAUDE.md is explicit that test files shouldn't use inline imports. Move the lone galaxy.exceptions.MalformedId import to the top of the test file alongside the other galaxy.* imports.

github-project-automation Bot added this to Galaxy Dev - weeklies Apr 30, 2026

github-project-automation Bot moved this to Needs Review in Galaxy Dev - weeklies Apr 30, 2026

github-actions Bot added the area/testing label Apr 30, 2026

github-actions Bot added this to the 26.1 milestone Apr 30, 2026

jmchilton reviewed Apr 30, 2026

View reviewed changes

jmchilton approved these changes Apr 30, 2026

View reviewed changes

dannon added 2 commits April 30, 2026 19:48

mvdbeek reviewed May 1, 2026

View reviewed changes

mvdbeek approved these changes May 1, 2026

View reviewed changes

mvdbeek merged commit dc39768 into galaxyproject:dev May 1, 2026
52 of 57 checks passed

github-project-automation Bot moved this from Needs Review to Done in Galaxy Dev - weeklies May 1, 2026

nsoranzo added the kind/enhancement label May 1, 2026

itisAliRH deleted the agent-router-fast-path-tools branch May 4, 2026 08:38

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Give the chat router fast-path tools for browsing and inventory#22610

Give the chat router fast-path tools for browsing and inventory#22610
mvdbeek merged 7 commits intogalaxyproject:devfrom
dannon:agent-router-fast-path-tools

dannon commented Apr 30, 2026

Uh oh!

jmchilton Apr 30, 2026

Uh oh!

dannon May 1, 2026

Uh oh!

mvdbeek May 1, 2026

Uh oh!

dannon May 1, 2026

Uh oh!

Uh oh!

github-actions Bot commented May 1, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

Conversation

dannon commented Apr 30, 2026

Test plan

Uh oh!

jmchilton Apr 30, 2026

Choose a reason for hiding this comment

Uh oh!

dannon May 1, 2026

Choose a reason for hiding this comment

Uh oh!

mvdbeek May 1, 2026

Choose a reason for hiding this comment

Uh oh!

dannon May 1, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

github-actions Bot commented May 1, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants