This project is deployed as a containerized Bun service on Azure Container Apps.
See also: Architecture for system design and module map, Configuration for the complete
.kodiai.yml reference, Graceful Restart Runbook for zero-downtime deploys.
- Azure resource group: rg-kodiai
- Azure Container Registry (ACR): kodiairegistry
- Container Apps environment: cae-kodiai
- Container app: ca-kodiai
- ACA job: caj-kodiai-agent
The service exposes:
- Webhook endpoint: POST /webhooks/github
- Slack events endpoint: POST /webhooks/slack/events
- Slack commands endpoint: POST /webhooks/slack/commands/*
- Health: GET /healthz
- Readiness: GET /readiness
Use deploy.sh to provision and deploy.
Key properties:
- Idempotent: safe to re-run
- Remote build: uses az acr build (Docker not required locally)
- Managed identity: used for ACR pull (AcrPull role)
- Probes: liveness/readiness/startup are configured in the app template
- Existing app updates are single-shot full YAML updates: image, env, probes, secrets, scale, ingress, and volume mounts are rendered together for az containerapp update --yaml
- Success output now prints the active revision plus deploy proof URLs for /healthz and /readiness
Azure Container Apps update semantics are destructive for omitted fields. A two-step update like:
- az containerapp update --set-env-vars ... --image ...
- az containerapp update --yaml <partial-template-with-volume-mount>
can create a second revision that drops env vars and probes if the YAML omits them. In single-revision mode, that broken revision can become the active revision and take the app down.
The deploy script avoids that failure mode by updating existing apps with one full YAML payload.
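The failure mode above can be guarded against before any update is sent. The following is a minimal sketch, not part of deploy.sh: check_full_template is a hypothetical helper, and it assumes the rendered payload uses the standard Container Apps YAML keys (image, env, probes, secrets, scale, ingress, volumeMounts). It is a text-level check rather than a YAML parser, so it only catches sections that are missing outright.

```shell
#!/bin/sh
# Hypothetical pre-flight guard (not part of deploy.sh): refuse a YAML payload
# that omits a section the update would otherwise strip from the next revision.
# Assumption: the template uses the standard Container Apps YAML key names.
check_full_template() {
  template_file="$1"
  missing=0
  for key in image env probes secrets scale ingress volumeMounts; do
    # Match "key:" at the start of a line, allowing indentation and a list dash.
    if ! grep -Eq "^[[:space:]]*(-[[:space:]]+)?${key}:" "$template_file"; then
      echo "missing section: ${key}" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Usage (the file name is illustrative):
#   check_full_template rendered-app.yaml \
#     && az containerapp update --name ca-kodiai --resource-group rg-kodiai --yaml rendered-app.yaml
```

A non-zero return means at least one section is absent, which is exactly the case where a partial YAML update would produce a stripped revision.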
deploy.sh requires:
- GITHUB_APP_ID
- GITHUB_PRIVATE_KEY_BASE64 (base64-encoded PEM)
- GITHUB_WEBHOOK_SECRET
- CLAUDE_CODE_OAUTH_TOKEN
- VOYAGE_API_KEY
- SLACK_BOT_TOKEN
- SLACK_SIGNING_SECRET
- SLACK_BOT_USER_ID
- SLACK_KODIAI_CHANNEL_ID
- DATABASE_URL
Optional:
- SHUTDOWN_GRACE_MS (defaults to 300000)
- BOT_USER_PAT (optional; enables fork/gist bot-user flows when paired with BOT_USER_LOGIN)
- BOT_USER_LOGIN (optional; enables fork/gist bot-user flows when paired with BOT_USER_PAT)
Notes:
- The app runtime expects GITHUB_PRIVATE_KEY; the deploy script stores the base64 PEM in an Azure secret and maps it to GITHUB_PRIVATE_KEY.
- CLAUDE_CODE_OAUTH_TOKEN must be the 1-year token from claude setup-token. Do not point it at ~/.claude/.credentials.json claudeAiOauth.accessToken; that rotating login token is rejected by the deployed runtime path.
- Structural-impact output depends on the review-graph and canonical-code substrates being reachable in the deployed environment.
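The required and optional variables listed above can be checked before a deploy is started. This is a sketch under stated assumptions: preflight_env and REQUIRED_VARS are hypothetical names, and deploy.sh may already perform its own validation in a different way. The pairing rule for BOT_USER_PAT and BOT_USER_LOGIN is reported as a warning only, since both are optional.

```shell
#!/bin/sh
# Hypothetical pre-flight check (not part of deploy.sh).
# The variable names come from the required list in this document.
REQUIRED_VARS="GITHUB_APP_ID GITHUB_PRIVATE_KEY_BASE64 GITHUB_WEBHOOK_SECRET
CLAUDE_CODE_OAUTH_TOKEN VOYAGE_API_KEY SLACK_BOT_TOKEN SLACK_SIGNING_SECRET
SLACK_BOT_USER_ID SLACK_KODIAI_CHANNEL_ID DATABASE_URL"

preflight_env() {
  status=0
  for name in $REQUIRED_VARS; do
    # Look up the value of the variable whose name is stored in $name.
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "missing required variable: ${name}" >&2
      status=1
    fi
  done
  # The bot-user flows are only enabled when both variables are set.
  if [ -n "${BOT_USER_PAT:-}" ] && [ -z "${BOT_USER_LOGIN:-}" ]; then
    echo "warning: BOT_USER_PAT is set without BOT_USER_LOGIN" >&2
  fi
  if [ -z "${BOT_USER_PAT:-}" ] && [ -n "${BOT_USER_LOGIN:-}" ]; then
    echo "warning: BOT_USER_LOGIN is set without BOT_USER_PAT" >&2
  fi
  return "$status"
}

# Usage:
#   preflight_env && ./deploy.sh
```

The check reports every missing variable in one pass rather than stopping at the first, so a single run shows everything that still needs to be exported.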
./deploy.sh

On success, the script prints:
- Active revision
- App URL
- Health URL (/healthz)
- Readiness URL (/readiness)
- Webhook URL to configure in the GitHub App
See .env.example for the full list of environment variables. For repository-level behavior configuration (review rules, mention handling, knowledge features), see Configuration.
Azure secrets created by deploy.sh:
- github-app-id
- github-private-key
- github-webhook-secret
- claude-code-oauth-token
- voyage-api-key
- slack-bot-token
- slack-signing-secret
- database-url
- bot-user-pat (only when both BOT_USER_PAT and BOT_USER_LOGIN are set)
Runtime env vars set by deploy.sh on the app template:
- GITHUB_APP_ID
- GITHUB_PRIVATE_KEY
- GITHUB_WEBHOOK_SECRET
- CLAUDE_CODE_OAUTH_TOKEN
- VOYAGE_API_KEY
- SLACK_BOT_TOKEN
- SLACK_SIGNING_SECRET
- DATABASE_URL
- SLACK_BOT_USER_ID
- SLACK_KODIAI_CHANNEL_ID
- BOT_USER_PAT (only when both BOT_USER_PAT and BOT_USER_LOGIN are set)
- BOT_USER_LOGIN (only when both BOT_USER_PAT and BOT_USER_LOGIN are set)
- SHUTDOWN_GRACE_MS
- PORT=3000
- LOG_LEVEL=info
The app runtime still has built-in defaults for the ACA job launch contract in src/config.ts. When these env vars are absent, the runtime falls back to:
- ACA_JOB_IMAGE=kodiairegistry.azurecr.io/kodiai-agent:latest
- ACA_JOB_NAME=caj-kodiai-agent
- ACA_RESOURCE_GROUP=rg-kodiai
- MCP_INTERNAL_BASE_URL=http://ca-kodiai
Those defaults are the current truth. deploy.sh does not inject them into the container app template anymore.
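To see which values the ACA job launch contract would resolve to in a given shell environment, the documented fallbacks can be mirrored with parameter expansion. This is an illustration only: print_aca_job_contract is a hypothetical helper, the real fallback logic lives in src/config.ts, and this sketch assumes an empty value is treated the same as an absent one, which the runtime may or may not do.

```shell
#!/bin/sh
# Hypothetical helper mirroring the documented defaults; it does not read
# src/config.ts. Assumption: empty values fall back the same way as unset ones.
print_aca_job_contract() {
  printf 'ACA_JOB_IMAGE=%s\n' "${ACA_JOB_IMAGE:-kodiairegistry.azurecr.io/kodiai-agent:latest}"
  printf 'ACA_JOB_NAME=%s\n' "${ACA_JOB_NAME:-caj-kodiai-agent}"
  printf 'ACA_RESOURCE_GROUP=%s\n' "${ACA_RESOURCE_GROUP:-rg-kodiai}"
  printf 'MCP_INTERNAL_BASE_URL=%s\n' "${MCP_INTERNAL_BASE_URL:-http://ca-kodiai}"
}

# Usage:
#   print_aca_job_contract
#   ACA_JOB_NAME=my-test-job print_aca_job_contract
```

If these defaults change in src/config.ts, a helper like this has to be updated by hand, so it is best treated as a debugging aid rather than a source of truth.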
deploy.sh currently pins:
- min-replicas 1
- max-replicas 1
This avoids webhook timeouts from cold starts and reduces concurrency surprises.
Configured in the container app template:
- Liveness: GET /healthz
- Readiness: GET /readiness
- Startup: GET /healthz
Important:
- Existing app updates must render the full container app template in one YAML payload.
- Partial YAML updates that only add a volume mount can wipe env vars or probes from the next revision.
- In single-revision mode, a stripped revision can become active immediately and fail startup with missing env vars.
These are the fastest operator checks after a deploy:
- Active revision selection
- GET /healthz
- GET /readiness
- Deploy output showing the exact proof URLs that were just probed
- bun run verify:m052 when Slack webhook relay is enabled
- Manual re-request / explicit @kodiai review debugging: docs/runbooks/review-requested-debug.md
Show active revision:
az containerapp revision list \
--name ca-kodiai \
--resource-group rg-kodiai \
--query "[?properties.active].name | [0]" \
--output tsv

Fetch FQDN:
az containerapp show \
--name ca-kodiai \
--resource-group rg-kodiai \
--query properties.configuration.ingress.fqdn \
--output tsv

Health checks:
curl -fsS "https://$(az containerapp show --name ca-kodiai --resource-group rg-kodiai --query properties.configuration.ingress.fqdn -o tsv)/healthz"
curl -fsS "https://$(az containerapp show --name ca-kodiai --resource-group rg-kodiai --query properties.configuration.ingress.fqdn -o tsv)/readiness"
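The two health checks above each resolve the FQDN with a separate az call. As a sketch, the lookup can be done once and both endpoints probed in a loop. proof_urls and probe_all are hypothetical helper names, and the 10-second timeout is an arbitrary choice rather than a value taken from deploy.sh.

```shell
#!/bin/sh
# Hypothetical helpers (not part of deploy.sh).

# Print the two proof URLs for a given FQDN, one per line.
proof_urls() {
  printf 'https://%s/healthz\n' "$1"
  printf 'https://%s/readiness\n' "$1"
}

# Probe each proof URL in order and stop at the first failure.
probe_all() {
  for url in $(proof_urls "$1"); do
    if ! curl -fsS --max-time 10 "$url" >/dev/null; then
      echo "FAIL ${url}" >&2
      return 1
    fi
    echo "OK ${url}"
  done
}

# Usage:
#   fqdn="$(az containerapp show --name ca-kodiai --resource-group rg-kodiai \
#     --query properties.configuration.ingress.fqdn --output tsv)"
#   probe_all "$fqdn"
```

Because curl runs with -f, any HTTP status of 400 or above counts as a failure, so a non-zero exit from probe_all means at least one endpoint did not answer successfully.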