[PRD] Operations HITL Spike #18494

@joebudi

Operations HITL Spike — PRD

Goal
Validate that we can support human-in-the-loop (HITL) execution using existing infrastructure (Bull) while preserving the request-driven model.


Problem
We need to confirm whether Bull can support:

  • pausing execution
  • escalating to a human
  • resuming execution

…without turning the system into queue-driven automations or losing clear request state and lifecycle.


Solution
Build a minimal end-to-end flow:

  • Trigger: Slack message creates a request
  • Request enters Processing
  • Agent evaluates and triggers escalation
  • Request moves to Needs Input
  • User responds (via Slack/UI)
  • Request resumes → returns to Processing
  • Request completes → Completed
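The flow above can be read as a small state machine owned by the request. Below is a minimal sketch under stated assumptions: `HitlRequest`, `HistoryEntry`, `createRequest`, and `transition` are hypothetical names made up for illustration, and the transition table is only inferred from the steps listed here. It is not the spike's actual implementation.

```typescript
// Hypothetical request model: every name here is illustrative only.
type RequestStatus = "Processing" | "Needs Input" | "Completed";

interface HistoryEntry {
  from: RequestStatus | null; // null for the initial creation entry
  to: RequestStatus;
  reason: string;
  at: string; // ISO timestamp
}

interface HitlRequest {
  id: string;
  status: RequestStatus;
  history: HistoryEntry[];
}

// Allowed transitions, inferred from the flow in this section (assumption).
const ALLOWED: Record<RequestStatus, RequestStatus[]> = {
  "Processing": ["Needs Input", "Completed"],
  "Needs Input": ["Processing"],
  "Completed": [],
};

// A new request starts in Processing, e.g. after a Slack message arrives.
function createRequest(id: string, reason: string): HitlRequest {
  const first: HistoryEntry = {
    from: null,
    to: "Processing",
    reason,
    at: new Date().toISOString(),
  };
  return { id, status: "Processing", history: [first] };
}

// Returns a new request in the target state and appends a history entry.
// Throws if the flow does not allow the move.
function transition(
  req: HitlRequest,
  to: RequestStatus,
  reason: string
): HitlRequest {
  if (ALLOWED[req.status].indexOf(to) === -1) {
    throw new Error(`Illegal transition: ${req.status} -> ${to}`);
  }
  const entry: HistoryEntry = {
    from: req.status,
    to,
    reason,
    at: new Date().toISOString(),
  };
  return { ...req, status: to, history: [...req.history, entry] };
}
```

Keeping the history on the request itself means each state change can be read back without consulting the queue.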

Success Criteria

  • Request state is explicit (Processing → Needs Input → Completed)
  • Request is the primary object (not Bull job)
  • Full history is reconstructable from request (not queue)
  • Bull handles execution only (not lifecycle/state)
  • Both the happy path and a failure (sad) path are demonstrated
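One way to picture "Bull handles execution only" is a job body that loads the request, runs one step, and writes the outcome back to the request. This is a sketch under stated assumptions: `RequestRecord`, `Decision`, `runStep`, `resume`, and the `Enqueue` callback are all hypothetical. In the spike, `Enqueue` could wrap a Bull queue's `add()` call; it is kept abstract here so the example runs without Redis.

```typescript
// Hypothetical sketch: all names below are illustrative, not an existing API.
type Status = "Processing" | "Needs Input" | "Completed";

interface RequestRecord {
  id: string;
  status: Status;
  events: string[]; // append-only history kept on the request
}

// Stand-in for the queue (assumption). A real version might call
// queue.add({ requestId }) on a Bull queue.
type Enqueue = (requestId: string) => void;

// The agent's decision for one execution step (assumed shape).
type Decision =
  | { kind: "escalate"; question: string }
  | { kind: "done" };

// Body of one job: the job carries only a request id and holds no
// lifecycle state of its own.
function runStep(
  store: Map<string, RequestRecord>,
  requestId: string,
  decide: (req: RequestRecord) => Decision
): void {
  const req = store.get(requestId);
  // Stale or duplicate job: the request is the source of truth, so skip.
  if (!req || req.status !== "Processing") return;

  const decision = decide(req);
  if (decision.kind === "escalate") {
    req.status = "Needs Input";
    req.events.push(`escalated: ${decision.question}`);
    // Nothing is enqueued here: "paused" simply means no work is queued.
  } else {
    req.status = "Completed";
    req.events.push("completed");
  }
}

// Called when the human answers via Slack/UI. The request is updated
// first, then a fresh execution step is scheduled.
function resume(
  store: Map<string, RequestRecord>,
  enqueue: Enqueue,
  requestId: string,
  answer: string
): void {
  const req = store.get(requestId);
  if (!req || req.status !== "Needs Input") {
    // Sad path: a response arrived for a request that is not waiting.
    throw new Error(`Request ${requestId} is not waiting for input`);
  }
  req.status = "Processing";
  req.events.push(`human answered: ${answer}`);
  enqueue(requestId);
}
```

In this shape the pause does not depend on any queue-level pause feature: the request sits in Needs Input with no job outstanding, and a human response creates the next job.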

Out of Scope

  • Final architecture decisions
  • Scalability guarantees
  • UI polish
  • Production-ready durability

Timeline
~1 week spike, demo on Thursday (16th)

Status

In progress