Q&A: Phase 10.3 PlanExecutor — concurrency tuning, retry configuration, failure modes, and Grafana monitoring #328
web3guru888 asked this question in Q&A
Q1: How do I tune `max_concurrency`?

A: Start with CPU count × 2 for I/O-bound tasks, then adjust based on the `asi_plan_executor_active_tasks` gauge.

Monitoring: If `asi_plan_executor_active_tasks` consistently hits `max_concurrency`, increase the limit. If wave duration p95 is high but active tasks are low, the bottleneck is the dispatcher, not concurrency.

Q2: What happens if a sub-task fails and `skip_on_failure=False`?

A: The task is marked `FAILED` after exhausting `max_retries`. All downstream dependents (tasks that transitively depend on the failed task) are marked `SKIPPED`. `PlanResult.failed > 0` triggers `GoalRegistry.update_status(goal_id, GoalStatus.FAILED)`.

To allow partial completion despite failures, set `skip_on_failure=True`: failed tasks are treated as `SKIPPED` and do not block dependents.

Q3: How does retry backoff work?
A: `PlanExecutor` uses exponential backoff with jitter, with a default of `retry_backoff_ms=500`.

Each retry is counted as `asi_plan_executor_subtasks_total{state="retried"}`. A successful retry emits `asi_plan_executor_retry_attempts_total{result="success"}`.

Q4: Can I implement a custom `TaskDispatcher`?

A: Yes. Implement the `TaskDispatcher` Protocol and register it by passing it to the factory. The factory accepts an optional `dispatcher` argument; if omitted, it constructs a `RouterTaskDispatcher` using the default `FederatedTaskRouter`.

Q5: How does `PlanExecutor` integrate with `GoalRegistry`?

A: `CognitiveCycle.run_once()` follows a 4-step integration flow. This closes the Phase 10 autonomous loop: goals drive decomposition, decomposition drives execution, execution drives goal status updates.
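
To make that loop concrete, here is a minimal sketch of the goal → decomposition → execution → status-update cycle. It is not the project's actual `CognitiveCycle.run_once()`: the classes below are hypothetical in-memory stand-ins, and the step boundaries, `next_active()`, `add()`, `status()`, the `decompose`/`execute` callables, and `GoalStatus.COMPLETED` are all assumptions for illustration. Only `GoalRegistry.update_status(goal_id, GoalStatus.FAILED)` and the `PlanResult.failed` / `PlanResult.skipped` counters come from this thread.

```python
from dataclasses import dataclass
from enum import Enum


class GoalStatus(Enum):
    ACTIVE = "active"        # assumed state name
    COMPLETED = "completed"  # assumed state name
    FAILED = "failed"        # named in this thread


@dataclass
class PlanResult:
    succeeded: int = 0  # assumed field
    failed: int = 0     # named in this thread
    skipped: int = 0    # named in this thread


class GoalRegistry:
    """Hypothetical in-memory stand-in for the real registry."""

    def __init__(self):
        self._status = {}

    def add(self, goal_id):
        self._status[goal_id] = GoalStatus.ACTIVE

    def next_active(self):
        # Return the first goal that is still active, or None.
        for goal_id, status in self._status.items():
            if status is GoalStatus.ACTIVE:
                return goal_id
        return None

    def update_status(self, goal_id, status):
        self._status[goal_id] = status

    def status(self, goal_id):
        return self._status[goal_id]


def run_once(registry, decompose, execute):
    """One pass of the loop; `decompose` and `execute` are injected callables."""
    # Step 1: pick a goal to work on.
    goal_id = registry.next_active()
    if goal_id is None:
        return None
    # Step 2: goals drive decomposition.
    plan = decompose(goal_id)
    # Step 3: decomposition drives execution.
    result = execute(plan)
    # Step 4: execution drives the goal status update.
    new_status = GoalStatus.FAILED if result.failed > 0 else GoalStatus.COMPLETED
    registry.update_status(goal_id, new_status)
    return result


registry = GoalRegistry()
registry.add("goal-1")
run_once(
    registry,
    decompose=lambda goal_id: ["fetch", "summarize"],
    execute=lambda plan: PlanResult(succeeded=1, failed=1),
)
print(registry.status("goal-1"))  # GoalStatus.FAILED
```

Injecting `decompose` and `execute` keeps the sketch independent of the real decomposer and executor; in the actual system those roles would be played by the Phase 10 components.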
Q6: What's the difference between `FAILED` and `SKIPPED` states?

A:

| State | Cause | Reported in |
| --- | --- | --- |
| `FAILED` | Exhausted `max_retries`, never succeeded | `PlanResult.failed`, `asi_plan_executor_subtasks_total{state="failed"}` |
| `SKIPPED` | Never ran because an upstream task failed (see Q2) | `PlanResult.skipped`, `asi_plan_executor_subtasks_total{state="skipped"}` |

Key implication: `SKIPPED` subtasks do not consume retry budget and do not trigger `RETRY_ATTEMPTS` metrics. This makes it easy to distinguish "my task broke" from "my task never ran."

Q7: Grafana dashboard setup for PlanExecutor monitoring
A: Three recommended panels + one alert rule:
Panel 1 — Wave Duration Heatmap
Panel 2 — SubTask State Breakdown (Pie Chart)
Panel 3 — Retry Success Rate (Gauge)
Alert Rule — Plan Failure Rate > 5%
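
As a starting point for Panels 2 and 3, the sketch below keeps candidate PromQL expressions as Python constants, for example to feed a dashboard-generation script. It uses only the metric names mentioned in this thread. The `1h` and `5m` windows are assumptions, and the success-rate query assumes `asi_plan_executor_retry_attempts_total` also records non-success outcomes under other `result` values. The wave-duration heatmap and the plan-failure alert are left out because the metrics they would need are not named here. `prometheus_target` is a hypothetical helper that wraps an expression in the `expr`/`refId` shape Grafana uses for Prometheus query targets.

```python
# Candidate PromQL expressions keyed by panel title (windows are assumptions).
PANEL_QUERIES = {
    # Panel 2: subtask counts per state over the last hour.
    "SubTask State Breakdown": (
        "sum by (state) (increase(asi_plan_executor_subtasks_total[1h]))"
    ),
    # Panel 3: share of retry attempts that ended in success.
    "Retry Success Rate": (
        'sum(rate(asi_plan_executor_retry_attempts_total{result="success"}[5m]))'
        " / "
        "sum(rate(asi_plan_executor_retry_attempts_total[5m]))"
    ),
}


def prometheus_target(expr, ref_id="A"):
    """Wrap a PromQL expression as a Grafana Prometheus query target."""
    return {"expr": expr, "refId": ref_id}


# Build one query target per panel.
targets = {title: prometheus_target(expr) for title, expr in PANEL_QUERIES.items()}
```

Check the expressions against your own Prometheus instance before wiring them into a dashboard, since label names and scrape intervals may differ.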