-
Fix ResourceWarning: close TaskQueue and sqlite3 connections properly. 02ae42f
Wrap TaskQueue in a with-block in _run_start so the connection is closed on all exit paths, including sys.exit() from container startup errors.
Replace five `with sqlite3.connect(...) as conn:` patterns in test_executor.py with explicit open/close; the with-form only manages transactions, leaving connections open until GC.
co-authored-by: Claude Sonnet 4.6 noreply@anthropic.com
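A minimal standard-library sketch of the behaviour this entry describes. The commit itself used explicit open/close; `contextlib.closing()` is shown here only as one equivalent alternative, and the table name is illustrative.

```python
import sqlite3
from contextlib import closing

# The with-form commits or rolls back the transaction but does not close.
with sqlite3.connect(":memory:") as conn:
    conn.execute("CREATE TABLE t (x INTEGER)")
conn.execute("INSERT INTO t VALUES (1)")  # still usable: connection is open
conn.close()

# contextlib.closing() calls close() when the block exits.
with closing(sqlite3.connect(":memory:")) as conn2:
    conn2.execute("CREATE TABLE t (x INTEGER)")

try:
    conn2.execute("SELECT 1")
    closed = False
except sqlite3.ProgrammingError:
    closed = True  # operating on a closed connection raises ProgrammingError
```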
-
Add heartbeat thread to _process_task in reference agent. 9499c19
Task 6 from pr-21-fixes.md:
- _process_task now starts a daemon threading.Thread that calls client.heartbeat(task.task_id) every _HEARTBEAT_INTERVAL (25 s) while triage runs so the harness does not re-queue the task mid-flight
- threading.Event stops the heartbeat thread in a finally block after triage returns or raises
- import threading added; _HEARTBEAT_INTERVAL module constant added
co-authored-by: Claude Sonnet 4.6 noreply@anthropic.com
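The pattern in this entry can be sketched roughly as follows. `run_with_heartbeat`, `heartbeat`, and `work` are hypothetical names standing in for `_process_task`, `client.heartbeat(task.task_id)`, and the triage call; the interval is shortened so the sketch finishes quickly.

```python
import threading
import time

_HEARTBEAT_INTERVAL = 0.01  # seconds; shortened for the demo (the commit uses 25 s)


def run_with_heartbeat(heartbeat, work):
    """Call heartbeat() periodically on a daemon thread while work() runs."""
    stop = threading.Event()

    def _beat():
        # Event.wait returns False on timeout and True once stop is set.
        while not stop.wait(_HEARTBEAT_INTERVAL):
            heartbeat()

    thread = threading.Thread(target=_beat, daemon=True)
    thread.start()
    try:
        return work()
    finally:
        stop.set()  # runs whether work() returned or raised
        thread.join()


beats = []
result = run_with_heartbeat(lambda: beats.append(1), lambda: time.sleep(0.1) or "done")
```

Waiting on the event instead of sleeping lets the thread stop promptly once the work finishes.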
-
Add task_id identity callout to complete_task docs. 596e38d
Warns readers that task_id and decision.task_id must match; a silent mismatch causes the drain loop to miss the result.
co-authored-by: Claude Sonnet 4.6 noreply@anthropic.com
-
Add configurable timeout parameter to ForemanClient. 05d81cc
Exposes `timeout: float = 5.0` on `ForemanClient.__init__` and forwards it to `httpx.Client`, so callers can tune per-deployment latency requirements without monkey-patching the transport.
co-authored-by: Claude Sonnet 4.6 noreply@anthropic.com
-
Add integration test for agent restart resilience (Task 17, Phase 6). 1549117
Implements the MVP acceptance criterion: zero task loss under a simulated agent restart. The test uses a minimal in-process harness (real TaskQueue, real MemoryStore) and exercises the actual ForemanClient + agent startup-poll code path without live network sockets.
Also adds --run-integration pytest flag and integration marker so the test is skipped in CI by default.
co-authored-by: Claude Sonnet 4.6 noreply@anthropic.com
-
Add write-an-agent how-to guide (Task 16, Phase 6). 789dbc2
Documents the foreman-client SDK for agent authors: install, ForemanClient constructor args, next_task/complete_task/heartbeat methods, claim timeout, heartbeat cadence, idempotency contract, and a ≤30-line minimal example.
co-authored-by: Claude Sonnet 4.6 noreply@anthropic.com
-
Add initial `.superset/config.json` and `.memsearch/memory/` tooling artifacts. e59100e
- Introduce `.superset/config.json` with an empty setup, teardown, and run configuration.
- Add `.memsearch/memory/2026-04-26.md` for session logging and transcript retention.
-
Address Phase 3 code review: fix resource leak, export types, clean up tests. 38c72c0
- Add close(), `__enter__`, `__exit__` to ForemanClient to prevent httpx connection pool leak
- Export LLMBackendRef and TaskContext from the foremanclient package `__init__`
- Move import json to module level in test_client.py; remove misleading call-ordering comment
- Add TestForemanClientLifecycle tests for close() and context manager behaviour
- Mark Phase 3 plan tasks and checkpoint complete; add phase-3-review.md
co-authored-by: Claude Sonnet 4.6 noreply@anthropic.com
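A rough sketch of the lifecycle protocol this entry adds. `ClosingClient` is a hypothetical stand-in, not the real ForemanClient; it only tracks a flag where a real client would release its connection pool.

```python
class ClosingClient:
    """Hypothetical stand-in for a client that owns a connection pool."""

    def __init__(self):
        self.closed = False

    def close(self):
        self.closed = True  # a real client would release its pool here

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        self.close()
        return None  # do not swallow exceptions


with ClosingClient() as client:
    inside = client.closed
```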
-
Add threading lock to TaskQueue for improved concurrency safety. a3633be
Refactored claim_next to use threading lock in conjunction with `BEGIN IMMEDIATE` for same-process thread serialization. Updated related tests and improved cleanup with explicit resource management using close().
-
Add QueueConfig to config.py (Task 1). f3a548d
Extends ForemanConfig with a new QueueConfig model matching the queue-mediated agent protocol spec. Adds corresponding tests and documents the new section in config.example.yaml.
co-authored-by: Claude Sonnet 4.6 noreply@anthropic.com
-
Resolve high-priority issues from phase-3 review. f01758a
- Wrap `_drain_loop` and `_requeue_loop` bodies in exception handlers to ensure background loops do not terminate on errors.
- Split `drain_completed` into a new `mark_done` method for per-task completion after successful execution.
- Update startup poll to drain all queued tasks on agent boot.
- Add heartbeat thread to `_process_task` to prevent requeue during long-running LLM calls.
- Publicize `Dispatcher.executor` to remove private attribute access between modules.
-
Mark all pr-21-fixes.md acceptance criteria complete. 0a36267
co-authored-by: Claude Sonnet 4.6 noreply@anthropic.com
-
Publicize Dispatcher.executor (remove private-attribute cross-module access). 74cf0e0
Task 5 from pr-21-fixes.md:
- Rename Dispatcher._executor → Dispatcher.executor (public attribute)
- main.py updated to use dispatcher.executor instead of dispatcher._executor
co-authored-by: Claude Sonnet 4.6 noreply@anthropic.com
-
Drain all queued tasks on agent startup (loop until empty). 1a9bf71
Task 3 from pr-21-fixes.md:
- _lifespan startup poll now loops calling next_task() until it returns None, processing each task before moving to the next; previously only one task was claimed, leaving N-1 accumulated tasks permanently stuck
- New test: startup poll with 3 queued tasks drains all 3 before yield
co-authored-by: Claude Sonnet 4.6 noreply@anthropic.com
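The loop-until-None pattern might look roughly like this. `drain_all` and its callables are hypothetical; a `deque` stands in for the harness queue behind `next_task()`.

```python
from collections import deque


def drain_all(next_task, process):
    """Claim and process tasks until next_task() returns None."""
    handled = 0
    while (task := next_task()) is not None:
        process(task)
        handled += 1
    return handled


queue = deque(["t1", "t2", "t3"])
seen = []
count = drain_all(lambda: queue.popleft() if queue else None, seen.append)
```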
-
Wrap _requeue_loop body in exception handler. 0712ad8
Task 4 from pr-21-fixes.md:
- requeue_stale() + fail_exhausted() wrapped in try/except Exception so one bad cycle does not kill the requeue loop permanently
- _lifespan finally uses suppress(CancelledError, Exception) for requeue_task
co-authored-by: Claude Sonnet 4.6 noreply@anthropic.com
-
Split drain_completed/add mark_done; wrap _drain_loop in exception handlers. 4cf9097
Tasks 2+1 from pr-21-fixes.md:
- drain_completed() no longer marks rows done; rows stay 'completed'
- New mark_done(task_id) transitions completed→done after successful execute
- _drain_loop wraps drain_completed() in outer try/except (loop never dies)
- _drain_loop wraps per-task execute+memory+mark_done in inner try/except (one bad task does not abort others in the same batch)
- _lifespan finally uses suppress(CancelledError, Exception) for drain_task
co-authored-by: Claude Sonnet 4.6 noreply@anthropic.com
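A simplified, synchronous sketch of the two-level guard described above. `drain_once` and its parameters are hypothetical stand-ins for one iteration of `_drain_loop`; the real loop is asynchronous and also updates memory.

```python
def drain_once(drain_completed, execute, mark_done, log):
    """One drain cycle: isolate batch-level and per-task failures."""
    try:
        batch = drain_completed()
    except Exception as exc:  # outer guard: a failed fetch must not kill the loop
        log(f"drain failed: {exc}")
        return
    for task_id in batch:
        try:
            execute(task_id)
            mark_done(task_id)  # only after execute succeeded
        except Exception as exc:  # inner guard: one bad task does not abort the rest
            log(f"task {task_id} failed: {exc}")


done, errors = [], []


def execute(task_id):
    if task_id == "bad":
        raise RuntimeError("boom")


drain_once(lambda: ["a", "bad", "b"], execute, done.append, errors.append)
```

Because `mark_done` runs only after `execute` returns, a failed task stays in the completed state and can be picked up again on a later cycle.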
-
[pre-commit.ci] pre-commit autoupdate. 16cc083
updates: - github.com/astral-sh/ruff-pre-commit: v0.15.11 → v0.15.12
-
Mark verification steps complete for Phase 6 tasks in plan. f10729f
-
Mark Phase 6 Task 17 complete in plan. 525e797
co-authored-by: Claude Sonnet 4.6 noreply@anthropic.com
-
Mark Phase 5 tasks complete in plan. 1156699
co-authored-by: Claude Sonnet 4.6 noreply@anthropic.com
-
Implement Phase 5: update issue-triage agent to use ForemanClient. 4ce2bec
POST /task now returns 202 immediately and fires a background task that claims the pending task via ForemanClient.next_task(), runs triage, and reports back via complete_task(). Lifespan startup poll picks up any tasks queued while the agent was down.
Inline protocol models removed; foremanclient.models is the single source of truth for TaskMessage / DecisionMessage across agent and tests.
co-authored-by: Claude Sonnet 4.6 noreply@anthropic.com
-
Convert TaskQueue tests to use context manager and update installed packages. a0ba8dd
-
Implement Phase 4 Task 12: add --queue-db CLI arg and wire TaskQueue. 170d707
Add --queue-db argument to the start subcommand so users can override the queue database path without changing config. Priority: --queue-db > config db_path > ~/.agent-harness/queue.db default. Update plan.md to mark Tasks 11, 12, 13 and Phase 4 checkpoint complete.
co-authored-by: Claude Sonnet 4.6 noreply@anthropic.com
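The priority order above could be resolved along these lines. This is a sketch: `resolve_queue_db` is a hypothetical helper and the example paths are made up; only the `--queue-db` flag, the `start` subcommand, and the default path come from this entry.

```python
import argparse
from pathlib import Path

DEFAULT_DB = Path("~/.agent-harness/queue.db")


def resolve_queue_db(cli_value, config_value):
    """Return the first value that is set: CLI flag, then config, then default."""
    for candidate in (cli_value, config_value):
        if candidate:
            return Path(candidate).expanduser()
    return DEFAULT_DB.expanduser()


parser = argparse.ArgumentParser(prog="foreman")
sub = parser.add_subparsers(dest="command")
start = sub.add_parser("start")
start.add_argument("--queue-db", default=None)

args = parser.parse_args(["start", "--queue-db", "/tmp/q.db"])
chosen = resolve_queue_db(args.queue_db, "/var/lib/foreman/queue.db")
fallback = resolve_queue_db(None, None)
```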
-
Implement Phase 4 Task 11: drain and requeue background loops in lifespan. 3ba1ceb
Add two background asyncio tasks started in a FastAPI lifespan context manager:
- _drain_loop: wakes on drain_event or drain_interval_seconds; calls TaskQueue.drain_completed(), executor.execute(), and memory.upsert_memory_summary() for each completed task.
- _requeue_loop: runs every requeue_interval_seconds; calls requeue_stale() and fail_exhausted(max_retries=config.queue.max_retries).
Both tasks cancel cleanly on shutdown. The lifespan also initialises app.state.drain_event so /harness/result and /queue/complete can signal it. main.py wires app.state.executor, .memory, and .config for the lifespan.
co-authored-by: Claude Sonnet 4.6 noreply@anthropic.com
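One way to get the "wake on event or interval" behaviour is to wait on the event with a timeout. The sketch below assumes that approach; `drain_loop`'s signature and `on_wake` are hypothetical, and the real loop does its queue work where `on_wake()` is called.

```python
import asyncio
from contextlib import suppress


async def drain_loop(event, interval, on_wake):
    """Wake when the event is set or the interval elapses, whichever is first."""
    while True:
        with suppress(asyncio.TimeoutError):
            await asyncio.wait_for(event.wait(), timeout=interval)
        event.clear()
        on_wake()


async def main():
    event = asyncio.Event()
    wakes = []
    task = asyncio.create_task(drain_loop(event, 60.0, lambda: wakes.append(1)))
    event.set()  # nudge: the loop should wake well before 60 s
    await asyncio.sleep(0.05)
    task.cancel()  # clean shutdown, as the lifespan would do
    with suppress(asyncio.CancelledError):
        await task
    return len(wakes)


wake_count = asyncio.run(main())
```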
-
Implement Phase 4 Task 10: refactor Dispatcher to enqueue + nudge. 1e2283e
Replace synchronous POST→parse dispatch with durable enqueue:
- Dispatcher.dispatch() now enqueues the TaskMessage in TaskQueue and sends a fire-and-forget nudge ({"task_id": ...}) to the agent endpoint.
- DecisionMessage parsing and executor.execute() are removed from dispatch(); those belong to the drain loop (Task 11).
- `Dispatcher.__init__` gains a required task_queue: TaskQueue parameter.
- main.py creates TaskQueue from config.queue and passes it to Dispatcher.
- Integration and server tests updated to reflect new enqueue-based protocol.
co-authored-by: Claude Sonnet 4.6 noreply@anthropic.com
-
Implement Phase 3: foreman-client package with ForemanClient. adffcef
Creates the standalone `foreman-client/` package that agent authors install to communicate with the harness queue. Exposes `next_task()`, `complete_task()`, and `heartbeat()` over synchronous httpx, with structlog events and `ForemanClientError` on non-2xx responses. 100% line and branch coverage via respx HTTP mocks.
Also excludes `foreman-client/` and `agents/` from root pytest collection, and excludes `foreman-client/` from the root mypy pre-commit hook to prevent duplicate module name conflicts.
co-authored-by: Claude Sonnet 4.6 noreply@anthropic.com
-
Implement Phase 2: queue HTTP endpoints and harness result nudge. 89316f3
- foreman/routers/queue.py: POST /queue/next (claim task or 204), POST /queue/complete (store decision + signal drain), POST /queue/heartbeat
- foreman/routers/result.py: POST /harness/result (drain-loop nudge)
- server.py: register both new routers on the FastAPI app
- tests/test_queue_router.py, tests/test_result_router.py: HTTP contract tests using FastAPI TestClient with dependency_overrides (no SQLite in router tests)
- pyproject.toml: per-file-ignores for FastAPI router B008/TC001/TC003 patterns
co-authored-by: Claude Sonnet 4.6 noreply@anthropic.com
-
Implement TaskQueue and tests (Tasks 2 & 3). 73cf3cb
SQLite-backed task queue with enqueue, claim_next (concurrency-safe via BEGIN IMMEDIATE), complete, heartbeat, drain_completed, requeue_stale, and fail_exhausted. 21 tests cover all methods including concurrent claim.
co-authored-by: Claude Sonnet 4.6 noreply@anthropic.com
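A minimal sketch of a claim guarded by `BEGIN IMMEDIATE`. The schema and status values are assumptions for illustration, not the actual TaskQueue tables.

```python
import sqlite3

# isolation_level=None puts the connection in autocommit mode so the
# explicit BEGIN IMMEDIATE below is the only transaction control in play.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE tasks (id INTEGER PRIMARY KEY, status TEXT NOT NULL)")
conn.executemany("INSERT INTO tasks (status) VALUES (?)", [("pending",), ("pending",)])


def claim_next(db):
    """Atomically move the oldest pending row to 'claimed' and return its id."""
    db.execute("BEGIN IMMEDIATE")  # take the write lock before reading
    try:
        row = db.execute(
            "SELECT id FROM tasks WHERE status = 'pending' ORDER BY id LIMIT 1"
        ).fetchone()
        if row is not None:
            db.execute("UPDATE tasks SET status = 'claimed' WHERE id = ?", (row[0],))
        db.execute("COMMIT")
    except Exception:
        db.execute("ROLLBACK")
        raise
    return None if row is None else row[0]


claimed = [claim_next(conn), claim_next(conn), claim_next(conn)]
```

Taking the write lock before the SELECT means two claimers cannot both read the same pending row and then race to update it.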
-
Update minimal example and Startup Poll docs to use drain loop lifespan. a7f2905
Task 7 from pr-21-fixes.md:
- Minimal example now uses @asynccontextmanager lifespan: creates ForemanClient, drains queued tasks via while-loop, yields, closes client
- FastAPI(lifespan=lifespan) used instead of bare FastAPI()
- Startup Poll section updated from single next_task() call to the correct loop-until-None pattern with an explanation of why a single call is wrong
co-authored-by: Claude Sonnet 4.6 noreply@anthropic.com
-
Remove obsolete "How Tos" index and fix installation link in write-an-agent guide. 23a836e
-
Update messaging protocol design spec to propose queue-mediated agent architecture. 99b02c8
Adds detailed problem statement, design rationale, MVP scope, key assumptions, and open questions for implementing a robust task queue backed by SQLite. Documents at-least-once delivery, claim/requeue logic, and API adjustments. Addresses gaps in current synchronous dispatch handling.
-
Add design system assets, CSS variables, and comprehensive API reference structure. 38cfce0
-
Add CHANGELOG.md to excluded files in linter configuration. a3fa809
- Restructure and update design specs; add messaging update proposal and index file. f8027a5
- Remove outdated tutorials and API docs; add home page layout, visual assets, and updated CSS. 8b2a2fc
-
Add reference documentation for agent protocol, CLI commands, and configuration schema. b35c600
-
Add rumdl linting support, update README link, and configure pre-commit hooks. 68c7d76
-
Reformat several Markdown files. b97ec91
-
Mark Phase 5 and Final Checkpoint tasks as complete in todo.md. 9149c15
-
Task 17: mark Phase 7 tasks complete; final coverage at 96%. f8f6d35
config.example.yaml already matches full schema and loads cleanly. CHANGELOG.md already maintained by bump-my-version toolchain. 214 tests passing, 96% line coverage (target ≥85%), pre-commit clean.
co-authored-by: Claude Sonnet 4.6 noreply@anthropic.com
-
Task 16: End-to-end integration test for full issue triage pipeline. 440ecec
Covers the complete path: poller event → router → dispatcher → executor → memory (real SQLite DB). Mocks are limited to PyGithub and httpx boundaries.
Six tests across two classes:
- TestFullTriagePipeline: label+comment applied, memory updated, action logged before GitHub call, prior summary injected, close_issue blocked when allow_close=False
- TestPollerFeedsDispatcher: poller.poll_all callback routes and dispatches a polled issue end-to-end
214 tests passing.
co-authored-by: Claude Sonnet 4.6 noreply@anthropic.com
-
[pre-commit.ci] pre-commit autoupdate. 068ab20
updates: - github.com/astral-sh/ruff-pre-commit: v0.15.10 → v0.15.11
-
Remove redundant sections from CONTRIBUTING.md and fix Code of Conduct link. 5c891a5
-
Remove outdated agent-harness spec, update CLAUDE.md with spec-driven development process. bc252ba
-
Wire `ContainerManager` and agent lifecycle into `foreman start`. Update agent paths, config, tests, and Dockerfile to align with refactored `issue-triage` structure. Mark Phase 6 tasks as complete. 7e7846d
-
Use `SecretStr` for sensitive fields in configuration and GitHubPoller, removing custom masking logic. Update tests accordingly. d2e437a
-
Task 15: Triage logic and prompt (prompts/triage.py). 6518095
- build_prompt: formats issue title/body/author/labels + memory_summary
- parse_llm_response: extracts JSON from prose, validates decision type, applies allow_close guard, defaults to skip on parse failure
- _call_llm: LiteLLM wrapper (provider/model from task context)
- run_triage: duplicate-comment guard (memory keyword check) before LLM call
- 18 triage tests + full suite at 195 passing
co-authored-by: Claude Sonnet 4.6 noreply@anthropic.com
-
Task 14: Agent HTTP server scaffold + Dockerfile. 60778eb
- FastAPI app with POST /task (DecisionMessage) and GET /health (200 ok)
- Self-contained protocol models (TaskMessage, DecisionMessage, ActionItem)
- triage() delegates to prompts/triage.run_triage() — stub for Task 15
- Dockerfile installs deps and runs uvicorn on port 8000
- agents/issue-triage/pyproject.toml with runtime deps
- 7 agent server tests; full suite at 177 passing
co-authored-by: Claude Sonnet 4.6 noreply@anthropic.com
-
Task 13: Container lifecycle manager (foreman/containers.py). 7e7c407
- ContainerManager pulls images on demand, starts containers, waits for /health
- stop_all() stops all managed containers; safe to call multiple times
- handle_container_exit() logs error and restarts once; marks failed on second exit
- ContainerError raised when Docker socket is unavailable at init
- 14 tests covering all acceptance criteria; full suite at 170 passing
co-authored-by: Claude Sonnet 4.6 noreply@anthropic.com
-
Set environment to `github-pages` for `publish-docs` workflow. e2f100f
-
Add .api-env to .gitignore. ff63ae3
Prevents accidental commit of local env file containing GitHub token and API keys.
co-authored-by: Claude Sonnet 4.6 noreply@anthropic.com
-
Add initial README with project description, features, requirements, and setup instructions. 3a9e9ba
-
Phase 5 — Harness Core + polling error visibility. 0a3c781
Implements router, server dispatch loop, and main entrypoint (Tasks 10–12). Fixes two bugs found during integration testing:
- SQLite connection used across threads now opens with check_same_thread=False
- Poller task was created but never awaited; fixed by running concurrently in _run_loop
Also fixes silent failure on GitHub API errors: non-rate-limit exceptions (including 401 bad credentials) are now logged immediately at critical/error level instead of being swallowed until process shutdown. Done callback on the poller task surfaces any unexpected crash in real time.
156 tests passing, all pre-commit hooks green.
co-authored-by: Claude Sonnet 4.6 noreply@anthropic.com
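For reference, a small sketch of the first fix listed above. The table and the lock are illustrative; the lock is an assumption about how callers might serialize access once sqlite3's same-thread check is disabled.

```python
import sqlite3
import threading

# check_same_thread=False disables sqlite3's same-thread check; callers are
# then responsible for serializing access, here with a lock.
conn = sqlite3.connect(":memory:", check_same_thread=False)
conn.execute("CREATE TABLE log (msg TEXT)")
lock = threading.Lock()


def write(msg):
    with lock:
        conn.execute("INSERT INTO log VALUES (?)", (msg,))
        conn.commit()


worker = threading.Thread(target=write, args=("from worker",))
worker.start()
worker.join()
rows = conn.execute("SELECT msg FROM log").fetchall()
conn.close()
```

Without the flag, the `execute` call inside the worker thread would raise `sqlite3.ProgrammingError`.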
-
Update license in README to MIT. 64a1e71
-
Update dependency versions in `uv.lock` file, including FastAPI (0.136.0), FastAPI Cloud CLI (0.17.0), FileLock (3.28.0), HuggingFace Hub (1.11.0), Identify (2.6.19), MkDocStrings (1.0.4), Packaging (26.1), and Virtualenv (21.2.4). e0bf184
-
Bump the uv group with 2 updates. b31044f
Bumps the uv group with 2 updates: litellm and uv.
Updates `litellm` from 1.83.7 to 1.83.9
Updates `uv` from 0.11.6 to 0.11.7
updated-dependencies: - dependency-name: litellm dependency-version: 1.83.9 dependency-type: direct: production update-type: version-update:semver-patch dependency-group: uv
signed-off-by: dependabot[bot] support@github.com
-
Use `TYPE_CHECKING` for imports in test files and update Phase 4 todo items. 6043d54
-
Phase 4: implement GitHub executor and poller (Tasks 8 & 9). 9efa175
executor.py:
- GitHubExecutor.execute() logs decision to action_log BEFORE any GitHub API call
- Handles add_label, comment, close_issue (with allow_close guard)
- Raises UnknownActionError for unrecognized action types
poller.py:
- GitHubPoller.poll_repo() fetches issues since last_polled, skips collaborator issues
- poll_all() runs repos concurrently via asyncio + semaphore (default max 5)
- Exponential backoff on 403/429; other GithubExceptions propagate
- Continuous run() loop at configurable interval
memory.py:
- Add poll_state table with get_last_polled() / set_last_polled() methods
- Timestamps stored as ISO-8601 strings, returned as timezone-aware datetime
39 new tests; 125 total passing.
co-authored-by: Claude Sonnet 4.6 noreply@anthropic.com
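The "asyncio + semaphore" concurrency bound can be sketched as below. The signature of `poll_all` here is hypothetical, and `fake_poll` replaces the real GitHub call so the sketch runs offline.

```python
import asyncio


async def poll_all(repos, poll_repo, max_concurrent=5):
    """Poll every repo concurrently, with at most max_concurrent in flight."""
    semaphore = asyncio.Semaphore(max_concurrent)

    async def _bounded(repo):
        async with semaphore:
            return await poll_repo(repo)

    return await asyncio.gather(*(_bounded(r) for r in repos))


async def main():
    in_flight = 0
    peak = 0

    async def fake_poll(repo):
        nonlocal in_flight, peak
        in_flight += 1
        peak = max(peak, in_flight)  # record the highest observed concurrency
        await asyncio.sleep(0.01)
        in_flight -= 1
        return repo.upper()

    results = await poll_all([f"r{i}" for i in range(12)], fake_poll)
    return results, peak


results, peak = asyncio.run(main())
```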
- Remove draft flag from release creation script. 131ea10
-
Fix unclosed DB connection warnings in test_memory.py. 982f6ec
Switch store fixtures to yield+context-manager so the connection is closed after each test, and remove manual store.close() calls that were no longer needed with WAL mode + committed writes.
co-authored-by: Claude Sonnet 4.6 noreply@anthropic.com
- Add docstrings for clarity in LLM backend tests, remove unused imports, and update CLAUDE.md with test-writing guidance. 0b73671
-
Replace `mkdocs gh-deploy` with `zensical build --clean` in docs workflows. 4e22796
-
Generated the changelog. 2f35d59
-
Bump the github-actions group with 10 updates. f1cb391
Bumps the github-actions group with 10 updates:
| Package | From | To |
| --- | --- | --- |
| actions/checkout | `4` | `6` |
| actions/download-artifact | `4` | `8` |
| actions/setup-python | `5` | `6` |
| astral-sh/setup-uv | `5` | `7` |
| github/codeql-action | `3` | `4` |
| docker/login-action | `3` | `4` |
| docker/metadata-action | `5` | `6` |
| docker/build-push-action | `6` | `7` |
| actions/attest-build-provenance | `2` | `4` |
| softprops/action-gh-release | `2` | `3` |

Updates `actions/checkout` from 4 to 6
Updates `actions/download-artifact` from 4 to 8
Updates `actions/setup-python` from 5 to 6
Updates `astral-sh/setup-uv` from 5 to 7
Updates `github/codeql-action` from 3 to 4
Updates `docker/login-action` from 3 to 4
Updates `docker/metadata-action` from 5 to 6
Updates `docker/build-push-action` from 6 to 7
Updates `actions/attest-build-provenance` from 2 to 4
Updates `softprops/action-gh-release` from 2 to 3
updated-dependencies: - dependency-name: actions/checkout dependency-version: '6' dependency-type: direct: production update-type: version-update:semver-major dependency-group: github-actions
signed-off-by: dependabot[bot] support@github.com
-
Phase 3 Tasks 6-7: implement LLM backend abstraction. 02733dc
- LLMBackend ABC with complete() method and from_config() factory in base.py
- AnthropicBackend and OllamaBackend wrapping LiteLLM
- Recorded fixture files for both backends (no live LLM calls in tests)
- 16 new tests across test_llm_base.py and test_llm_backends.py
co-authored-by: Claude Sonnet 4.6 noreply@anthropic.com
-
Refine type annotations and optimize imports in protocol and memory tests. 12a6bd8
-
Phase 2 human review approved. 3846ea8
co-authored-by: Claude Sonnet 4.6 noreply@anthropic.com
-
Mark Phase 2 tasks complete in todo.md. 78318a0
co-authored-by: Claude Sonnet 4.6 noreply@anthropic.com
-
Phase 2 Task 5: implement SQLite memory store. 6b39f0b
Add MemoryStore with action_log and memory_summary tables (WAL mode). log_action(), get_memory_summary(), and upsert_memory_summary() are covered by 13 tests using real temp-file DBs, with no mocks.
co-authored-by: Claude Sonnet 4.6 noreply@anthropic.com
-
Phase 2 Task 4: implement agent protocol Pydantic models. 829f47f
Add TaskMessage, DecisionMessage, ActionItem, LLMBackendRef, TaskContext, and DecisionType to foreman/protocol.py with 22 tests.
co-authored-by: Claude Sonnet 4.6 noreply@anthropic.com
-
Phase 1: scaffold, config system, and credential injection. 9f21485
- pyproject.toml: add runtime deps (PyYAML, PyGithub, litellm, httpx, docker), uncomment [project.scripts] entry pointing to foreman.main:main
- Add stub modules for all planned foreman/ submodules and llm/ package
- Add agents/issue-triage/ scaffolding (Dockerfile placeholder, prompts/)
- Implement foreman/config.py: YAML loader with ${VAR} env resolution, Pydantic validation, ConfigError, secret-masking repr for tokens/keys
- Implement foreman/credentials.py: resolve_env_refs(), get_github_token(), CredentialError (variable name only — no secrets in error messages)
- Add config.example.yaml matching the full schema from spec §5
- Add types-PyYAML to mypy pre-commit additional_dependencies
- 35 tests pass; coverage >85% on new modules
co-authored-by: Claude Sonnet 4.6 noreply@anthropic.com
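A rough sketch of `${VAR}` resolution under stated assumptions: the regex, the `env` parameter, and `MissingVarError` are illustrative and not taken from foreman/config.py or foreman/credentials.py. The error message names the variable only, in line with the entry above.

```python
import os
import re

_VAR = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}")


class MissingVarError(Exception):
    """Hypothetical error type; the message names the variable only."""


def resolve_env_refs(text, env=None):
    """Replace each ${VAR} in text with its value from env."""
    env = os.environ if env is None else env

    def _sub(match):
        name = match.group(1)
        if name not in env:
            raise MissingVarError(f"environment variable {name} is not set")
        return env[name]

    return _VAR.sub(_sub, text)


resolved = resolve_env_refs("token: ${GH_TOKEN}", {"GH_TOKEN": "abc123"})
```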
-
Remove unused GitHub Actions workflows and update dependabot configuration. 7bbcfb0
-
Update httpx requirement from >=0.27 to >=0.28.1. 5cef88e
Updates the requirements on httpx to permit the latest version.
updated-dependencies: - dependency-name: httpx dependency-version: 0.28.1 dependency-type: direct:production
signed-off-by: dependabot[bot] support@github.com
-
Update pydantic-settings requirement from >=2.8.1 to >=2.13.1. 624e336
Updates the requirements on pydantic-settings to permit the latest version.
updated-dependencies: - dependency-name: pydantic-settings dependency-version: 2.13.1 dependency-type: direct: production
signed-off-by: dependabot[bot] support@github.com
-
Update opentelemetry-api requirement from >=1.32.0 to >=1.41.0. ee4c822
Updates the requirements on opentelemetry-api to permit the latest version.
updated-dependencies: - dependency-name: opentelemetry-api dependency-version: 1.41.0 dependency-type: direct: production
signed-off-by: dependabot[bot] support@github.com
-
Update docker requirement from >=7.0 to >=7.1.0. e673884
Updates the requirements on docker to permit the latest version.
updated-dependencies: - dependency-name: docker dependency-version: 7.1.0 dependency-type: direct:production
signed-off-by: dependabot[bot] support@github.com
-
Update structlog requirement from >=23.1.0 to >=25.5.0. 56a01b0
Updates the requirements on structlog to permit the latest version.
updated-dependencies: - dependency-name: structlog dependency-version: 25.5.0 dependency-type: direct:production
signed-off-by: dependabot[bot] support@github.com
-
Update HealthCheckModel dependencies type annotation for clarity. a0ad023
-
Remove outdated test, add CLAUDE.md for developer guidance, and update scaffolding notes. dae2a06
- Initial commit. 127955f