[Feat] Add a configurable callback for listening to multi-agent communication events #103

@patricka3125

Overview

Add support for emitting CloudEvents for CAO (CLI Agent Orchestrator) session lifecycle events and multi-agent communication events. The server accepts a pre-configured callback URL at startup and, on any qualifying event, POSTs a well-formed CloudEvent payload to that URL. This gives operators a standards-based, backend-agnostic hook for routing events to the monitoring, alerting, or observability system of their choice, including messaging apps, dashboards, and workflow automation tools.

User Stories

  • As a user, I want to monitor workflow health and directly observe agent communication in an external application or tool (such as a messaging app), so that I have real-time visibility into multi-agent sessions without needing to watch tmux windows.
    • Example: Slack webhook integration. An entire CAO session can be mirrored into a Slack group channel that preserves all of the multi-agent communication, so users can follow active CAO sessions from their preferred application or tool.
  • As a developer, I want the callback URL to be optional and configurable via an environment variable or config file, so that the feature is entirely opt-in and the server behaves normally when no URL is set.
  • As a developer, I want the messages to be in CloudEvent format for broad out-of-the-box compatibility with cloud providers' event-based services, message queues, and similar systems. Those who want to bridge to external messaging applications or services will still need to develop a middleware service.
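
The Slack story above implies a small bridge between CAO and Slack, because Slack incoming webhooks expect their own JSON shape (a "text" field) rather than a CloudEvent. A minimal sketch of that translation step follows. The "cao." type prefix and the data fields sender, receiver, and message are assumptions made for illustration; this issue does not define them.

```python
def cloudevent_to_slack(event: dict) -> dict:
    """Translate a CAO CloudEvent (already parsed into a dict) into a
    Slack incoming-webhook payload.

    The data field names used here (sender, receiver, message) are
    hypothetical; the real schema is not specified in this issue.
    """
    data = event.get("data") or {}
    event_type = event.get("type", "unknown")

    if event_type.endswith("message_sent"):
        # Render agent-to-agent traffic as a chat-style line.
        text = (
            f"*{data.get('sender', '?')}* -> "
            f"*{data.get('receiver', '?')}*: {data.get('message', '')}"
        )
    else:
        # Lifecycle events become a short status line.
        text = f"[{event_type}] {event.get('subject', '')}".rstrip()

    return {"text": text}
```

A bridge service would receive the CloudEvent from CAO, call a function like this, and POST the result to the Slack webhook URL.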

Acceptance Criteria

  • The server reads a pre-configured callback URL from a configurable source (env var, config file, etc.)
  • The server emits a CloudEvent via HTTP POST to the configured callback URL for the following event types:
    • Session lifecycle: session_created, session_killed
    • Terminal lifecycle: terminal_created, terminal_killed
    • Multi-agent communication: message_sent (emitted for send_message, handoff, and assign operations; orchestration methods like assign may emit more than one event across their lifecycle)
  • Each CloudEvent includes appropriate context (event type, terminal/session IDs, agent names, message content where applicable)
  • If no callback URL is configured, the feature is a no-op (no crash, no warning spam — one INFO log at startup is sufficient)
  • If the POST to the callback URL fails (timeout, connection error, non-2xx), the failure is logged as a warning but does not affect the original API response
  • The callback POST is fire-and-forget (async, non-blocking) so it never adds latency to the API response path
  • (Optional) Support declaring a sidecar/plugin in the CAO configuration to supply custom middleware or to customize the structure of the emitted message.
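
To make the criteria above concrete, here is a sketch of what a structured-mode CloudEvent for one of these events might look like, built by hand with the standard library. The attribute names follow CloudEvents 1.0 (specversion, id, source, and type are required). The "cao." type prefix, the "/cao/server" source, and the fields inside data are assumptions, not decisions made in this issue.

```python
import json
import uuid
from datetime import datetime, timezone


def build_cloudevent(event_type: str, data: dict, source: str = "/cao/server") -> dict:
    """Build a CloudEvents 1.0 envelope as a plain dict.

    The "cao." prefix and the default source are hypothetical naming choices.
    """
    return {
        "specversion": "1.0",                 # required
        "id": str(uuid.uuid4()),              # required, unique per event
        "source": source,                     # required, a URI-reference
        "type": f"cao.{event_type}",          # required
        "time": datetime.now(timezone.utc).isoformat(),  # optional, RFC 3339
        "datacontenttype": "application/json",
        "data": data,
    }


# Example: a message_sent event; the data fields are illustrative only.
event = build_cloudevent(
    "message_sent",
    {"session_id": "cao-demo", "sender": "supervisor",
     "receiver": "developer", "message": "run tests"},
)
body = json.dumps(event)  # POST with Content-Type: application/cloudevents+json
```

Using the CloudEvents Python SDK instead would remove the need to assemble these attributes manually.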

Proposed Solution

  • New module src/cli_agent_orchestrator/utils/event_emitter.py — defines a structured set of CAO event types and builds and POSTs CloudEvent payloads. Uses httpx (already a common FastAPI ecosystem dependency) with a short timeout for fire-and-forget POSTs.
  • Modify service layer (services/session.py, services/terminal.py, services/inbox.py) — inject event_emitter.emit() calls at key lifecycle points (creation, termination, message dispatch).
  • Modify src/cli_agent_orchestrator/constants.py — add EVENT_CALLBACK_URL = os.getenv("CAO_EVENT_CALLBACK_URL", None).

Additional Context

There is currently no push-based mechanism to observe what is happening across a multi-agent CAO session. Operators cannot monitor workflow progress or inter-agent communication without manually watching tmux windows or tailing log files, making it difficult to integrate CAO-driven workflows into broader tooling such as alerting systems or chat apps (e.g. Slack, Discord).

Using CloudEvents means the callback URL can point to virtually any modern event ingestion system (AWS EventBridge, GCP Eventarc, a custom receiver, a Slack webhook behind a small bridging service, etc.) without any changes to CAO itself. The CloudEvents Python SDK (published on PyPI as cloudevents) can be used to construct compliant envelopes and avoid hand-rolling the spec attributes.
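
On the consuming side, a custom receiver only needs to understand the standard envelope. A minimal sketch of the validation such a receiver might perform on a structured-mode request body is shown here; the function name is hypothetical, and the checks cover only the four attributes CloudEvents 1.0 marks as required.

```python
import json

# Context attributes that CloudEvents 1.0 marks as required.
REQUIRED_ATTRIBUTES = ("id", "source", "specversion", "type")


def parse_structured_cloudevent(body: bytes, content_type: str) -> dict:
    """Parse and minimally validate a structured-mode JSON CloudEvent."""
    if not content_type.lower().startswith("application/cloudevents+json"):
        raise ValueError(f"unexpected content type: {content_type!r}")

    event = json.loads(body)
    if not isinstance(event, dict):
        raise ValueError("CloudEvent body must be a JSON object")

    missing = [name for name in REQUIRED_ATTRIBUTES if not event.get(name)]
    if missing:
        raise ValueError(f"missing required attributes: {missing}")
    if event["specversion"] != "1.0":
        raise ValueError(f"unsupported specversion: {event['specversion']!r}")
    return event
```

Because nothing in this check is CAO-specific, the same receiver code works for any CloudEvents producer.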

Alternatives considered

  • Custom JSON webhook (non-CloudEvents): Simpler to implement but produces a proprietary payload shape, requiring consumers to write CAO-specific parsers. CloudEvents gives an industry-standard envelope for free.
  • Polling the REST API: Consumers could poll /sessions and /terminals endpoints, but this is inefficient and doesn't capture transient communication events like message_sent.
  • Web UI for monitoring workflow progress: The CAO web UI provides some visibility into session state, but there remains a gap for users who prefer a different UX or want to build their own monitoring tools that consume events directly.
  • Scope down and instead add generic hook support for events: This may be the better option, as it doesn't introduce a hard dependency on the CloudEvents structure. It would also keep the solution simpler and leaner, and give developers more room for customization.
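
For comparison, the generic-hook alternative could be as small as an in-process registry of callables, with a CloudEvents emitter being just one possible hook. This is only a sketch of the idea; register_hook and fire are hypothetical names, and the context dict has no defined schema.

```python
import logging
from typing import Callable

logger = logging.getLogger(__name__)

# A hook receives the event type and a free-form context dict.
Hook = Callable[[str, dict], None]

_hooks: dict = {}  # event type -> list of registered hooks


def register_hook(event_type: str, hook: Hook) -> None:
    """Register a callable to run whenever event_type fires."""
    _hooks.setdefault(event_type, []).append(hook)


def fire(event_type: str, context: dict) -> None:
    """Invoke every hook for event_type; one failing hook never stops the rest."""
    for hook in list(_hooks.get(event_type, ())):
        try:
            hook(event_type, context)
        except Exception as exc:
            logger.warning("Hook %r failed for %s: %s", hook, event_type, exc)
```

The trade-off is that each consumer then defines its own payload shape, which is exactly what the CloudEvents envelope is meant to avoid.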

Metadata

Labels: enhancement (New feature or request)
