Commit 8a7afa7
Add producer/critic reflection loops to CustomToolAgent
Builds on #22615's integrated UserToolSource validation. The producer
agent runs with output_retries=0 so the reflection loop owns the retry
and can pass a structured error list back to the prompt rather than
pydantic-ai's generic 'try again' message.
Two configurable loops on top of pydantic validation:
- Validator-driven retry (default on): if the producer's output fails
validation, _produce_tool returns the formatted error list, and
process() re-calls the producer once with those errors prefixed to
the prompt. Cap of one retry; if it still fails, the agent returns
a low-confidence validation_failed response.
- Quality critic + refine (default off): an LLM critic agent reviews
the validated tool for clarity (description, labels, help text) and
idiomaticity (defaults, exposed options, container choice) -- the
fuzzy dimensions pydantic can't see. If the critic flags significant
issues (should_refine=true), the producer is re-rolled once with
the critique. Cap of one refine; if refinement breaks validation,
the original tool is kept rather than shipping something worse.
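A minimal sketch of the control flow described in the two bullets above. Every name here (run_reflection, produce, critic, Critique, the prompt wording, the response dict shape) is a hypothetical stand-in for illustration and is not taken from the actual CustomToolAgent code; the producer is modelled as a callable that returns either a validated tool or a list of error strings.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple

# Hypothetical shape: (validated tool or None, formatted validation errors).
ProduceResult = Tuple[Optional[dict], List[str]]


@dataclass
class Critique:
    should_refine: bool
    notes: str = ""


def run_reflection(
    prompt: str,
    produce: Callable[[str], ProduceResult],
    critic: Optional[Callable[[dict], Critique]] = None,
    validator_retry_enabled: bool = True,
) -> dict:
    tool, errors = produce(prompt)

    # Validator-driven retry: at most one re-call, errors prefixed to the prompt.
    if tool is None and validator_retry_enabled:
        error_block = "\n".join(f"- {e}" for e in errors)
        retry_prompt = f"Fix these validation errors:\n{error_block}\n\n{prompt}"
        tool, errors = produce(retry_prompt)

    if tool is None:
        # Still invalid after the single retry: low-confidence failure response.
        return {"status": "validation_failed", "confidence": "low", "errors": errors}

    # Quality critic + refine: at most one refine pass.
    if critic is not None:
        critique = critic(tool)
        if critique.should_refine:
            refined, _ = produce(f"Address this critique:\n{critique.notes}\n\n{prompt}")
            # If refinement breaks validation, keep the original validated tool.
            if refined is not None:
                tool = refined

    return {"status": "ok", "tool": tool}
```

Passing critic=None in this sketch corresponds to leaving the quality critic disabled; both loops are capped at a single extra producer call.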
Both gates live under inference_services.custom_tool:
validator_retry_enabled (default true) and quality_critic_enabled
(default false). Defaulting the critic off keeps the cost neutral for
deployments that don't opt in -- the critic doubles tool-creation
latency / spend when enabled.
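As an illustration of the defaults above, a small sketch that resolves the two gates from a nested config mapping. The helper name custom_tool_gates and the plain-dict config shape are assumptions made for this example; how Galaxy actually loads inference_services.custom_tool is not shown in this commit message.

```python
from typing import Any, Mapping

# Defaults as stated in the commit message.
DEFAULTS = {"validator_retry_enabled": True, "quality_critic_enabled": False}


def custom_tool_gates(config: Mapping[str, Any]) -> dict:
    """Return both gate values, falling back to the documented defaults."""
    # Tolerate a missing or empty inference_services / custom_tool section.
    section = (config.get("inference_services") or {}).get("custom_tool") or {}
    return {key: bool(section.get(key, default)) for key, default in DEFAULTS.items()}
```

With an empty config the critic stays off, so a deployment that does not opt in makes no extra critic calls.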
The 170-line process() method got factored into named helpers
(capability check, structured-output extraction, validation-failed
response, success response, model-error handlers) to keep the
reflection control flow readable.
1 parent b34b937 commit 8a7afa7
4 files changed
Lines changed: 607 additions & 131 deletions
File tree
- lib/galaxy
- agents
- prompts
- tool_util_models
- test/unit/app