Commit c02c3b4

Merge pull request #7 from grove-platform/feat/operator-ui-audit
feat(operator): operator UI, GitHub PAT auth, AI rule suggester
2 parents f61929e + f41bc52 commit c02c3b4

46 files changed

Lines changed: 8193 additions & 177 deletions

.github/workflows/ci.yml

Lines changed: 4 additions & 0 deletions
Original file line number | Diff line number | Diff line change
@@ -128,7 +128,11 @@ jobs:
128128
--region $REGION \
129129
--project $PROJECT_ID \
130130
--allow-unauthenticated \
131+
<<<<<<< HEAD
132+
--set-env-vars="^|^CONFIG_REPO_OWNER=grove-platform|CONFIG_REPO_NAME=github-copier|CONFIG_REPO_BRANCH=main|PEM_NAME=CODE_COPIER_PEM|WEBHOOK_SECRET_NAME=webhook-secret|MONGO_URI_SECRET_NAME=mongo-uri|WEBSERVER_PATH=/events|MAIN_CONFIG_FILE=.copier/main.yaml|USE_MAIN_CONFIG=true|DEPRECATION_FILE=deprecated_examples.json|COMMITTER_NAME=GitHub Copier App|COMMITTER_EMAIL=bot@mongodb.com|GOOGLE_CLOUD_PROJECT_ID=github-copy-code-examples|COPIER_LOG_NAME=code-copier-log|AUDIT_ENABLED=true|METRICS_ENABLED=true|OPERATOR_UI_ENABLED=true|OPERATOR_AUTH_REPO=grove-platform/github-copier|OPERATOR_REPO_SLUG=grove-platform/github-copier|LLM_PROVIDER=anthropic|LLM_BASE_URL=https://grove-gateway-prod.azure-api.net/grove-foundry-prod/anthropic|LLM_MODEL=claude-haiku-4-5|ANTHROPIC_API_KEY_SECRET_NAME=anthropic-api-key|GITHUB_APP_ID=${{ secrets.APP_ID }}|INSTALLATION_ID=${{ secrets.INSTALLATION_ID }}" \
133+
=======
131134
--set-env-vars="^|^CONFIG_REPO_OWNER=grove-platform|CONFIG_REPO_NAME=github-copier|CONFIG_REPO_BRANCH=main|PEM_NAME=CODE_COPIER_PEM|WEBHOOK_SECRET_NAME=webhook-secret|MONGO_URI_SECRET_NAME=mongo-uri|WEBSERVER_PATH=/events|MAIN_CONFIG_FILE=.copier/main.yaml|USE_MAIN_CONFIG=true|DEPRECATION_FILE=deprecated_examples.json|COMMITTER_NAME=GitHub Copier App|COMMITTER_EMAIL=bot@mongodb.com|GOOGLE_CLOUD_PROJECT_ID=github-copy-code-examples|COPIER_LOG_NAME=code-copier-log|AUDIT_ENABLED=true|METRICS_ENABLED=true|GITHUB_APP_ID=${{ secrets.APP_ID }}|INSTALLATION_ID=${{ secrets.INSTALLATION_ID }}" \
135+
>>>>>>> origin
132136
--set-build-env-vars="VERSION=${{ steps.version.outputs.tag }}" \
133137
--tag="${{ steps.version.outputs.traffic_tag }}" \
134138
--max-instances=10 \

AGENT.md

Lines changed: 146 additions & 90 deletions
Large diffs are not rendered by default.

CHANGELOG.md

Lines changed: 30 additions & 1 deletion
@@ -6,6 +6,35 @@ The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/).
66

77
## [Unreleased]
88

9+
### Added
10+
11+
- **Operator UI — comprehensive writer + operator dashboard** at `/operator/` (`OPERATOR_UI_ENABLED=true`). Five tabs (Overview, Webhooks, Audit, Workflows, System), sticky status bar, dark mode, keyboard shortcuts, shareable URLs, and a writer/operator mode toggle persisted to localStorage.
12+
- **GitHub PAT authentication** — users sign in with their personal access token; role is derived from their permission on `OPERATOR_AUTH_REPO` (admin/maintain → operator, write/triage/read → writer). Operator actions (replay, release, AI settings) require an explicit admin or maintain grant, since most writers have `write` on the auth repo. Replay additionally enforces read access on the source repo for that specific delivery.
13+
- **AI rule suggester** — paste a source path and desired target state, receive a suggested workflow rule with self-verification via the in-process `PatternMatcher`. Two providers supported:
14+
- **Anthropic (hosted)** — default for Cloud Run. API key loaded from Secret Manager via `ANTHROPIC_API_KEY_SECRET_NAME`. No infra required; operators switch between Haiku / Sonnet / Opus from the UI.
15+
- **Ollama (local)** — for dev or self-hosted deployments. UI manages connection, model pulls, deletes, and active-model switching without a redeploy.
16+
- **Writer-facing features** — workflow browser with per-rule coverage, PR lookup by URL, recent copies feed, file match tester (with clear button and Python-style `(?P<name>)` regex translation for in-browser use), PR timeline, and in-app help overlay.
17+
- **Per-delivery log viewer** — context-tagged ring buffer captures logs per webhook delivery, surfaced in an audit drawer alongside the trace and outcome summary.
18+
- **Audit event enrichment** — `processed_ok` traces now include destination repo(s), files matched / uploaded / failed, and commit SHA.
19+
- **Startup banner** — Operator UI, auth repo, AI model, and AI base URL are now surfaced when the app boots (local and Cloud Run).
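The per-delivery log viewer above is described as a context-tagged ring buffer. The following is a minimal sketch of that idea, assuming a single fixed-capacity buffer shared by all deliveries and a capacity greater than zero; the type and method names (`ringLog`, `Add`, `ForDelivery`) are hypothetical and not taken from the repository.

```go
package main

import "fmt"

// logEntry pairs a log line with the webhook delivery that produced it.
type logEntry struct {
	deliveryID string
	message    string
}

// ringLog is a fixed-capacity buffer; once full, each Add overwrites the
// oldest entry. Hypothetical type, for illustration only.
type ringLog struct {
	entries []logEntry
	next    int  // index the next Add writes to
	full    bool // true once the buffer has wrapped at least once
}

// newRingLog assumes capacity > 0.
func newRingLog(capacity int) *ringLog {
	return &ringLog{entries: make([]logEntry, capacity)}
}

// Add stores a message tagged with its delivery ID.
func (r *ringLog) Add(deliveryID, message string) {
	r.entries[r.next] = logEntry{deliveryID, message}
	r.next = (r.next + 1) % len(r.entries)
	if r.next == 0 {
		r.full = true
	}
}

// ForDelivery returns the retained messages for one delivery, oldest first.
func (r *ringLog) ForDelivery(deliveryID string) []string {
	start, count := 0, r.next
	if r.full {
		start, count = r.next, len(r.entries)
	}
	var out []string
	for i := 0; i < count; i++ {
		e := r.entries[(start+i)%len(r.entries)]
		if e.deliveryID == deliveryID {
			out = append(out, e.message)
		}
	}
	return out
}

func main() {
	logs := newRingLog(3)
	logs.Add("d1", "received")
	logs.Add("d2", "received")
	logs.Add("d1", "matched 2 files")
	logs.Add("d1", "uploaded") // overwrites the oldest entry (d1 "received")
	fmt.Println(logs.ForDelivery("d1")) // prints [matched 2 files uploaded]
}
```

A bounded buffer keeps memory use constant, at the cost of dropping the oldest lines for busy deliveries.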
20+
21+
### Changed
22+
23+
- **MongoDB audit logging enabled in production** — the Cloud Run deploy previously forced `AUDIT_ENABLED=false`; it is now `true`, aligning with the v0.3.0 "enabled by default" change.
24+
- **Operator auth hardened** — token-based auth (`OPERATOR_UI_TOKEN`) removed entirely; GitHub PAT is the only supported mechanism. `OPERATOR_UI_ENABLED=true` now requires `OPERATOR_AUTH_REPO` at config load (validated in `validateOperatorAuth`).
25+
- **`createPullRequest` skipped for empty commits** — `commitFilesToBranch` now returns an `errTreeUnchanged` sentinel so `addFilesViaPR` no longer calls the GitHub PR API with an unchanged tree (previously 422'd).
26+
- **MongoDB driver v2 ObjectID decoding** — audit reads set `ObjectIDAsHexString: true` to avoid "error decoding key `_id`" on queries.
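The empty-commit change above relies on a sentinel error. A minimal sketch of that pattern follows, assuming the unchanged-tree case can be detected by comparing tree SHAs; `commitFiles` and `addViaPR` are simplified hypothetical stand-ins, not the repository's `commitFilesToBranch` or `addFilesViaPR`.

```go
package main

import (
	"errors"
	"fmt"
)

// errTreeUnchanged signals that the commit would not change the tree.
var errTreeUnchanged = errors.New("tree unchanged")

// commitFiles is a hypothetical stand-in for the commit step: it reports
// errTreeUnchanged when the new tree is identical to the base tree.
func commitFiles(baseTreeSHA, newTreeSHA string) error {
	if baseTreeSHA == newTreeSHA {
		return errTreeUnchanged
	}
	return nil
}

// addViaPR reports whether a pull request would be opened. An unchanged
// tree is treated as "nothing to do" rather than as a failure.
func addViaPR(baseTreeSHA, newTreeSHA string) (openedPR bool, err error) {
	if err := commitFiles(baseTreeSHA, newTreeSHA); err != nil {
		if errors.Is(err, errTreeUnchanged) {
			return false, nil // skip the PR API call entirely
		}
		return false, err
	}
	return true, nil // a real implementation would create the PR here
}

func main() {
	fmt.Println(addViaPR("abc123", "abc123")) // prints false <nil>
	fmt.Println(addViaPR("abc123", "def456")) // prints true <nil>
}
```

Checking with `errors.Is` keeps the comparison correct even if the sentinel is later wrapped with additional context.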
27+
28+
### Fixed
29+
30+
- **gosec G107 / G704 SSRF findings** — GitHub API URL construction in `services/operator_auth.go` now validates path components against strict RE2-compatible whitelists (`ghUsernameRe`, `ghRepoNameRe`) and escapes them with `url.PathEscape` before request construction; `slack_notifier.go` `#nosec` annotation extended to cover `NewRequestWithContext`.
31+
- **Keyboard-shortcut overlay wouldn't close** — `.help-bg[hidden]` now wins over the base `display:flex`.
32+
- **File match tester returned no matches for Java files** — JavaScript `RegExp` does not support Python-style `(?P<name>)` named groups; the tester now rewrites `(?P<` → `(?<` before compilation.
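The named-group rewrite in the file match tester fix can be illustrated with a short sketch. The actual fix runs in the browser; this version is written in Go purely for illustration, and `toJSNamedGroups` is a hypothetical name. It is deliberately naive and assumes patterns do not contain an escaped literal `\(?P<`.

```go
package main

import (
	"fmt"
	"strings"
)

// toJSNamedGroups rewrites Python-style named groups "(?P<name>...)" into
// the "(?<name>...)" form that JavaScript's RegExp accepts.
// Naive by design: an escaped literal such as `\(?P<` would also be rewritten.
func toJSNamedGroups(pattern string) string {
	return strings.ReplaceAll(pattern, "(?P<", "(?<")
}

func main() {
	fmt.Println(toJSNamedGroups(`^src/(?P<lang>[^/]+)/(?P<file>.+\.java)$`))
	// prints ^src/(?<lang>[^/]+)/(?<file>.+\.java)$
}
```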
33+
34+
### Security
35+
36+
- **Token auth removed** — the operator UI no longer accepts a shared bearer token; all access is per-user via GitHub PAT with repo-scoped permission checks.
37+
938
## [v0.3.1] - 2026-04-30
1039

1140
### Fixed
@@ -14,7 +43,7 @@ The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/).
1443

1544
### Security
1645

17-
- **Removed accidentally committed secrets and config files** from the repository.
46+
- **Removed unneeded config files** from the repository.
1847

1948
## [v0.3.0] - 2026-04-14
2049

README.md

Lines changed: 50 additions & 0 deletions
@@ -29,6 +29,13 @@ A GitHub app that automatically copies code examples and files from source repos
2929
- **Development Tools** - Dry-run mode, CLI validation, enhanced logging
3030
- **Thread-Safe** - Concurrent webhook processing with proper state management
3131

32+
### Operator UI
33+
- **Web dashboard at `/operator/`** - Five-tab UI (Overview, Webhooks, Audit, Workflows, System) with dark mode, keyboard shortcuts, and shareable URLs
34+
- **GitHub PAT authentication** - Users sign in with their personal access token; role is derived from their permission on a configured auth repo (`admin`/`maintain` → operator, `write`/`triage`/`read` → writer)
35+
- **Per-repo replay authorization** - Replay requires the caller's PAT to have read access to the source repo of the webhook being replayed
36+
- **Writer-facing tools** - Workflow browser, PR lookup, recent copies feed, file match tester, audit drawer, per-delivery log viewer
37+
- **AI rule suggester** - Paste a source/target pair; get a generated copier rule self-verified against the in-process pattern matcher. Two providers: [Anthropic](https://www.anthropic.com/) (hosted, default in prod via the Grove Foundry APIM gateway) or [Ollama](https://ollama.com) (local, for dev)
38+
3239
## 🚀 Quick Start
3340

3441
### Prerequisites
@@ -385,6 +392,47 @@ Get performance metrics:
385392
curl http://localhost:8080/metrics
386393
```
387394

395+
## Operator UI
396+
397+
The operator UI is a web dashboard served from `/operator/` for diagnosing webhook processing, replaying failed deliveries, browsing workflows, and generating copier rules with AI assistance.
398+
399+
### Enabling the UI
400+
401+
Set the required env vars:
402+
403+
```yaml
404+
OPERATOR_UI_ENABLED: "true"
405+
OPERATOR_AUTH_REPO: "your-org/some-repo" # user permissions here determine role
406+
OPERATOR_REPO_SLUG: "your-org/some-repo" # optional; enables audit-row deep links
407+
```
408+
409+
**Startup fails** if `OPERATOR_UI_ENABLED=true` without `OPERATOR_AUTH_REPO` — this prevents an accidentally-open operator UI.
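The fail-closed startup check can be sketched as follows. This is an illustration under assumptions, not the repository's `validateOperatorAuth`: the `operatorConfig` struct, its field names, and the error text are hypothetical.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// operatorConfig is a hypothetical subset of the app configuration.
type operatorConfig struct {
	UIEnabled bool   // from OPERATOR_UI_ENABLED
	AuthRepo  string // from OPERATOR_AUTH_REPO, e.g. "your-org/some-repo"
}

// validateOperatorConfig refuses to start with the UI enabled but no auth
// repo, so the dashboard can never be served without a permission check.
func validateOperatorConfig(c operatorConfig) error {
	if c.UIEnabled && strings.TrimSpace(c.AuthRepo) == "" {
		return errors.New("OPERATOR_UI_ENABLED=true requires OPERATOR_AUTH_REPO")
	}
	return nil
}

func main() {
	fmt.Println(validateOperatorConfig(operatorConfig{UIEnabled: true}))
	fmt.Println(validateOperatorConfig(operatorConfig{UIEnabled: true, AuthRepo: "your-org/some-repo"}))
}
```

Validating at config load, rather than on the first request, surfaces the misconfiguration before the service accepts traffic.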
410+
411+
### Authentication and roles
412+
413+
Each user authenticates with their own **GitHub Personal Access Token**. Paste the PAT into the sign-in prompt; the server checks the user's permission on `OPERATOR_AUTH_REPO` and assigns a role:
414+
415+
| GitHub permission | Operator UI role | Can do |
416+
|---|---|---|
417+
| `admin` / `maintain` | **operator** | View everything; replay deliveries; cut release tags; change AI settings |
418+
| `write` / `triage` / `read` | **writer** | View workflows, audit, recent copies, file match tester, AI rule suggester |
419+
| None | **denied** | 401 Unauthorized |
420+
421+
`write` maps to writer (not operator) so typical docs contributors with repo write access can't replay deliveries or cut releases — those need an explicit `admin` / `maintain` grant.
422+
423+
On top of the role, **replay is repo-scoped**: the user's PAT must also have read access to the source repo of the webhook being replayed.
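The role table and the repo-scoped replay rule can be summarized in a few lines of Go. This is a sketch of the mapping as documented above, assuming the permission arrives as a plain string; the `role` type and the function names are hypothetical, and the real server may structure the check differently.

```go
package main

import "fmt"

type role string

const (
	roleOperator role = "operator"
	roleWriter   role = "writer"
	roleDenied   role = "denied"
)

// roleForPermission maps a GitHub permission on OPERATOR_AUTH_REPO to an
// operator UI role, following the table above.
func roleForPermission(permission string) role {
	switch permission {
	case "admin", "maintain":
		return roleOperator
	case "write", "triage", "read":
		return roleWriter
	default:
		return roleDenied // no access on the auth repo
	}
}

// canReplay applies both conditions: the operator role and read access to
// the source repo of the delivery being replayed.
func canReplay(r role, canReadSourceRepo bool) bool {
	return r == roleOperator && canReadSourceRepo
}

func main() {
	fmt.Println(roleForPermission("maintain"))              // prints operator
	fmt.Println(roleForPermission("write"))                 // prints writer
	fmt.Println(canReplay(roleForPermission("admin"), false)) // prints false
}
```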
424+
425+
### AI rule suggester
426+
427+
The operator UI includes an LLM-backed helper that takes a source/target file pair and returns a generated copier workflow rule, self-verified against the in-process pattern matcher before display.
428+
429+
Two providers are supported via `LLM_PROVIDER`:
430+
431+
- **`anthropic`** (default in Cloud Run): calls the Anthropic Messages API. For MongoDB deployments this routes through the Grove Foundry APIM gateway — set `LLM_BASE_URL=https://grove-gateway-prod.azure-api.net/grove-foundry-prod/anthropic` and load the gateway key from Secret Manager via `ANTHROPIC_API_KEY_SECRET_NAME`.
432+
- **`ollama`** (default for local dev): runs against a local Ollama instance at `http://localhost:11434`. Connect, pull models, and switch the active model from the UI's System → AI settings panel without a redeploy.
433+
434+
Smoke-test the LLM provider end-to-end with [`cmd/test-llm`](cmd/test-llm/README.md).
435+
388436
## Audit Logging
389437

390438
When enabled, all operations are logged to MongoDB:
@@ -598,4 +646,6 @@ See [DEPLOYMENT.md](./docs/DEPLOYMENT.md) for the complete deployment and rollba
598646

599647
- **[Config Validator](cmd/config-validator/README.md)** - CLI tool for validating configs
600648
- **[Test Webhook](cmd/test-webhook/README.md)** - CLI tool for testing webhooks
649+
- **[Test PEM](cmd/test-pem/README.md)** - CLI tool for verifying the GitHub App private key
650+
- **[Test LLM](cmd/test-llm/README.md)** - CLI tool for smoke-testing the AI rule suggester's LLM provider
601651
- **[Scripts](scripts/README.md)** - Helper scripts for deployment, testing, and releases

app.go

Lines changed: 43 additions & 13 deletions
@@ -63,6 +63,15 @@ func main() {
6363
os.Exit(1)
6464
}
6565

66+
// Anthropic API key is only needed when the operator UI's AI suggester uses
67+
// the anthropic provider. Failure to load is non-fatal — the UI will show
68+
// "not configured" and writers can still use every other feature.
69+
if config.OperatorUIEnabled && config.LLMProvider == "anthropic" {
70+
if err := services.LoadAnthropicAPIKey(ctx, config); err != nil {
71+
fmt.Printf("⚠️ Anthropic API key not loaded: %v (AI suggester will be disabled)\n", err)
72+
}
73+
}
74+
6675
// Override dry-run from command line
6776
if dryRun {
6877
config.DryRun = true
@@ -136,15 +145,35 @@ func printBanner(config *configs.Config, container *services.ServiceContainer) {
136145
fmt.Printf("║ Version: %-48s║\n", version)
137146
fmt.Printf("║ Port: %-48s║\n", config.Port)
138147
fmt.Printf("║ Webhook Path: %-48s║\n", config.WebserverPath)
139-
fmt.Printf("║ Config File: %-48s║\n", config.EffectiveConfigFile())
148+
fmt.Printf("║ Config File: %-48s║\n", truncMiddle(config.EffectiveConfigFile(), 48))
140149
fmt.Printf("║ Dry Run: %-48v║\n", config.DryRun)
141150
fmt.Printf("║ Audit Log: %-48v║\n", config.AuditEnabled)
142151
fmt.Printf("║ Metrics: %-48v║\n", config.MetricsEnabled)
143152
fmt.Printf("║ Slack: %-48v║\n", config.SlackEnabled)
153+
fmt.Printf("║ Operator UI: %-48v║\n", config.OperatorUIEnabled)
154+
if config.OperatorUIEnabled {
155+
fmt.Printf("║ Auth Repo: %-48s║\n", truncMiddle(config.OperatorAuthRepo, 48))
156+
fmt.Printf("║ AI Provider:%-48s║\n", truncMiddle(config.LLMProvider, 48))
157+
fmt.Printf("║ AI Model: %-48s║\n", truncMiddle(config.LLMModel, 48))
158+
fmt.Printf("║ AI URL: %-48s║\n", truncMiddle(config.LLMBaseURL, 48))
159+
}
144160
fmt.Println("╚════════════════════════════════════════════════════════════════╝")
145161
fmt.Println()
146162
}
147163

164+
// truncMiddle shortens s to max bytes, replacing the middle with "..." when
165+
// too long. Uses ASCII so Go's byte-count-based %-Ns padding stays aligned.
166+
func truncMiddle(s string, max int) string {
167+
if len(s) <= max {
168+
return s
169+
}
170+
if max < 6 {
171+
return s[:max]
172+
}
173+
keep := (max - 3) / 2
174+
return s[:keep] + "..." + s[len(s)-(max-3-keep):]
175+
}
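For reference, the behavior of `truncMiddle` can be checked with a small standalone program. The function body is copied from the diff above; the inputs are made-up examples.

```go
package main

import "fmt"

// truncMiddle is copied from the diff above: it shortens s to max bytes,
// replacing the middle with "..." when s is too long.
func truncMiddle(s string, max int) string {
	if len(s) <= max {
		return s
	}
	if max < 6 {
		return s[:max]
	}
	keep := (max - 3) / 2
	return s[:keep] + "..." + s[len(s)-(max-3-keep):]
}

func main() {
	fmt.Println(truncMiddle("abcdefghijklmnopqrstuvwxyz", 11)) // prints abcd...wxyz
	fmt.Println(truncMiddle("short", 48))                      // prints short
	fmt.Println(truncMiddle("abcdefgh", 5))                    // prints abcde
}
```

Because the slicing is byte-based, a non-ASCII input could be cut in the middle of a multi-byte character; the banner values in this commit are expected to be ASCII.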
176+
148177
func validateConfiguration(container *services.ServiceContainer) error {
149178
ctx := context.Background()
150179
_, err := container.ConfigLoader.LoadConfig(ctx, container.Config)
@@ -155,24 +184,22 @@ func startWebServer(config *configs.Config, container *services.ServiceContainer
155184
// Create HTTP handler with all routes
156185
mux := http.NewServeMux()
157186

158-
// Webhook endpoint
159-
mux.HandleFunc(config.WebserverPath, func(w http.ResponseWriter, r *http.Request) {
160-
handleWebhook(w, r, config, container)
161-
})
162-
163-
// Liveness probe — lightweight, always 200 if process is running
187+
// Register built-in paths before the configurable webhook route so a mis-set
188+
// WEBSERVER_PATH can never shadow /health, /ready, /metrics, /config, or /operator.
164189
mux.HandleFunc("/health", services.HealthHandler(container.StartTime, version))
165-
166-
// Readiness probe — checks GitHub auth, MongoDB connectivity
167190
mux.HandleFunc("/ready", services.ReadinessHandler(container))
168-
169-
// Metrics endpoint (if enabled)
170191
if config.MetricsEnabled {
171192
mux.HandleFunc("/metrics", services.MetricsHandler(container.MetricsCollector, container.FileStateService))
172193
}
173-
174-
// Config diagnostic endpoint — shows resolved config with secrets redacted
175194
mux.HandleFunc("/config", services.ConfigDiagnosticHandler(container, version))
195+
if config.OperatorUIEnabled {
196+
services.RegisterOperatorRoutes(mux, config, container, version)
197+
}
198+
199+
// GitHub webhook (configurable path, typically /events)
200+
mux.HandleFunc(config.WebserverPath, func(w http.ResponseWriter, r *http.Request) {
201+
handleWebhook(w, r, config, container)
202+
})
176203

177204
// Info endpoint
178205
mux.HandleFunc("/", func(w http.ResponseWriter, r *http.Request) {
@@ -189,6 +216,9 @@ func startWebServer(config *configs.Config, container *services.ServiceContainer
189216
if config.MetricsEnabled {
190217
_, _ = fmt.Fprintf(w, "Metrics: /metrics\n")
191218
}
219+
if config.OperatorUIEnabled {
220+
_, _ = fmt.Fprintf(w, "Operator UI: /operator/ (authenticate with a GitHub PAT; role from %s)\n", config.OperatorAuthRepo)
221+
}
192222
})
193223

194224
// Create server

cmd/test-llm/README.md

Lines changed: 79 additions & 0 deletions
@@ -0,0 +1,79 @@
1+
# test-llm
2+
3+
Smoke-test the operator UI's LLM client against the configured provider.
4+
5+
## Purpose
6+
7+
Verify end-to-end that:
8+
9+
- The provider URL and API key are reachable from your machine
10+
- Auth headers are accepted (direct Anthropic API or APIM-fronted gateway)
11+
- The active model responds to a real rule-suggester prompt and returns valid JSON
12+
13+
Useful after rotating `ANTHROPIC_API_KEY`, changing `LLM_BASE_URL`, or pointing at a new gateway.
14+
15+
## Build
16+
17+
```bash
18+
go build -o test-llm ./cmd/test-llm
19+
```
20+
21+
## Usage
22+
23+
```bash
24+
./test-llm [-env <path>] [-timeout <duration>]
25+
```
26+
27+
The tool reads standard env vars — `LLM_PROVIDER`, `LLM_BASE_URL`, `LLM_MODEL`, `ANTHROPIC_API_KEY` — from the process environment. Use `-env` to load a `.env`-style file first. Inline env vars on the command line override file values.
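The precedence rule described above (process environment wins over the `-env` file) can be sketched like this. It is an illustration only: `parseEnv` and `resolve` are hypothetical helpers, and the parsing is a simplified `KEY=VALUE` reader rather than whatever `test-llm` actually uses.

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// parseEnv reads KEY=VALUE lines from .env-style text, skipping blank lines
// and "#" comments and stripping surrounding quotes from values.
func parseEnv(text string) map[string]string {
	vals := make(map[string]string)
	for _, line := range strings.Split(text, "\n") {
		line = strings.TrimSpace(line)
		if line == "" || strings.HasPrefix(line, "#") {
			continue
		}
		key, value, ok := strings.Cut(line, "=")
		if !ok {
			continue // not KEY=VALUE; ignored in this sketch
		}
		vals[strings.TrimSpace(key)] = strings.Trim(strings.TrimSpace(value), `"'`)
	}
	return vals
}

// resolve returns the value for key, preferring the process environment
// (via lookup) and falling back to the file values.
func resolve(key string, fileVals map[string]string, lookup func(string) (string, bool)) string {
	if v, ok := lookup(key); ok {
		return v
	}
	return fileVals[key]
}

func main() {
	fileVals := parseEnv("# sample .env\nLLM_PROVIDER=anthropic\nLLM_MODEL=\"claude-haiku-4-5\"\n")
	// The output depends on whether LLM_PROVIDER is set in the calling shell.
	fmt.Println(resolve("LLM_PROVIDER", fileVals, os.LookupEnv))
}
```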
28+
29+
## Examples
30+
31+
Smoke-test against the local `.env.test`:
32+
33+
```bash
34+
./test-llm -env .env.test
35+
```
36+
37+
Override the key without editing the env file:
38+
39+
```bash
40+
ANTHROPIC_API_KEY='sk-...' ./test-llm -env .env.test
41+
```
42+
43+
Test Ollama locally:
44+
45+
```bash
46+
LLM_PROVIDER=ollama LLM_BASE_URL=http://localhost:11434 LLM_MODEL=qwen2.5-coder:7b ./test-llm
47+
```
48+
49+
## Output
50+
51+
On success:
52+
53+
```
54+
Provider: anthropic
55+
Base URL: https://grove-gateway-prod.azure-api.net/grove-foundry-prod/anthropic
56+
Model: claude-haiku-4-5
57+
API key: sk-a…xyz9
58+
59+
✅ Ping OK
60+
✅ ListModels: 3 models
61+
- claude-opus-4-7
62+
- claude-sonnet-4-6
63+
- claude-haiku-4-5-20251001
64+
✅ GenerateJSON parsed OK:
65+
{
66+
"transform_type": "move",
67+
"transform_from": "agg/python/models",
68+
...
69+
}
70+
71+
🎉 All checks passed — the LLM provider is reachable and usable.
72+
```
73+
74+
## Exit Codes
75+
76+
| Code | Meaning |
77+
|------|--------------------------------------|
78+
| 0 | All checks passed |
79+
| 1 | Any failure (auth, network, parsing) |
