[runtime-failure-observer] Inline curl calls and require fetched evidence before opening PRs (#1612)
* runtime-failure-observer: fix network egress and ban ungrounded PRs
The observer agent's shell commands were intermittently denied. The
copilot harness authorizes a command only by its first token, but the
prompt instructed the agent to pre-bind URLs (`url=...` then
`curl "$url"`) and to loop over definitions with `for`. Those forms
start with an assignment or keyword, so the harness rejected them with
"Permission denied and could not request permission from user" even
though the firewall allowlist contains .dev.azure.com and
.helix.dot.net. Across the three real runs this produced three
different outcomes (worked around, noop, and a false report_incomplete
that blamed the firewall).
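
The first-token behavior described above can be modeled with a small shell function. This is a hypothetical sketch for local experimentation, not the harness's actual implementation: the function name `first_token_allowed` is invented here, the allow list contains only the programs rule 11 names explicitly, and `out.json` is a placeholder file name.

```bash
# Hypothetical model of first-token authorization (not the real harness code).
first_token_allowed() {
  # Keep everything before the first space: the command's first token.
  first=${1%% *}
  case "$first" in
    curl|jq|gh|grep|printf) printf 'allowed\n' ;;
    *) printf 'denied\n' ;;
  esac
}

# Inlined URL: the first token is `curl`.
first_token_allowed 'curl -s "https://dev.azure.com/dnceng-public/public/_apis/build/..." -o /tmp/gh-aw/agent/out.json'   # allowed
# Pre-bound URL: the first token is an assignment.
first_token_allowed 'url=https://dev.azure.com/dnceng-public/public/_apis/build/...'   # denied
# Loop: the first token is the keyword `for`.
first_token_allowed 'for id in 154 223; do curl ...; done'   # denied
```

Under this model the firewall allowlist never comes into play for the denied forms, which is consistent with the misdiagnosed `report_incomplete` described above.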
Changes to runtime-failure-observer.agent.md (prompt body, imported at
runtime via {{#runtime-import}}, so no lock recompile needed):
- Rule 11 now requires every shell command to begin with an
allow-listed program; inline URLs into `curl ... -o file`, no
variable pre-bind, no loops. Step 1, Step 2, and the Step 4 dedup
cache snippet are rewritten to match.
- New Step 0 preflight proves egress with one inlined curl and, on
failure, emits an accurate report_incomplete (harness command
authorization, not firewall) instead of misdiagnosing the firewall.
- New rule 6b forbids opening a PR unless the build timeline and Helix
console were actually downloaded this run; no citing build ids,
Helix GUIDs, exit codes, or stderr from memory or inference.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
* Tighten wording to match prompt style
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
* Address review feedback: shell-safe placeholders and allowed safe-output
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
---------
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
.github/workflows/runtime-failure-observer.agent.md
Lines changed: 25 additions & 19 deletions
@@ -87,12 +87,12 @@ The agent reads `dotnet/runtime` and the failing build logs. It never writes to
3. **Every PR title starts with `[runtime-observer] `.** PRs are opened as drafts.
4. **Small-fix bounds for complete autofix PRs.** A *complete* fix PR must satisfy all of: `<=` 30 changed lines total, `<=` 2 files (one source + one test), no new public API, no protocol change, no native code change. If the fix needs more, do not silently truncate it: open a clearly-marked best-effort/diagnosability **draft** PR (Step 5) that a human finishes. Best-effort and diagnosability draft PRs may exceed these bounds but must be marked work-in-progress and must still avoid new public API, protocol changes, and native code.
5. **Don't propose fixes for runtime test bugs.** If the failure is in the test binary itself (assertion in the test code, missing mock, runtime API regression), record `skipped: runtime-side issue` and emit nothing.
-
6.**Never assume.** Cite the runtime build URL, the Helix work item URL, the xharness command line, and the exact stderr / exit code in every PR body.
+
6.**Never assume; cite only what you fetched this run.** Cite the runtime build URL, the Helix work item URL, the xharness command line, and the exact stderr / exit code in every PR body. If any required fetch (build list, timeline, Helix work items, console log) failed, was empty, or was denied, emit nothing for that candidate — never reconstruct a build id, URL, GUID, exit code, or stderr from memory or inference.
7. **Dedup.** Before emitting, search open and recently merged PRs / issues in `dotnet/xharness` for the same xharness-signature. On match: `existing-PR #<n>` or `existing-issue #<n>`, emit nothing.
8. **Same-run dedup cache.** Persist `(exit_code, command, signature_norm)` keys in `/tmp/gh-aw/agent/filed.tsv`. On hit: `dup-this-run`, skip.
9. **All state under `/tmp/gh-aw/agent/`.**
10. **AzDO API: anonymous only.** Stay on `https://dev.azure.com/dnceng-public/public/_apis/build/...`.
-
11. **Pre-bind every URL with `?` or `&` to a variable on its own line, then `curl -s "$url"`.**
+
11. **Start every shell command with an allow-listed program (`curl`, `jq`, `gh`, `grep`, `printf`, ...).** The harness authorizes by first token only, so a command beginning with `url=...`, `key=...`, or `for` is denied with `Permission denied and could not request permission from user` even when the firewall allows the domain. Inline each URL into a single `curl ... -o <file>` (keep `%24` for `$top`); never pre-bind URLs to variables or loop over `curl`.
## Pipelines to scan
@@ -125,42 +125,49 @@ These exit codes from `src/Microsoft.DotNet.XHarness.Common/CLI/ExitCode.cs` are
Exit codes outside this table: record `skipped: exit code <n> not in improvement table` and stop.
+
## Step 0. Preflight: confirm network egress
+
Prove the harness will let `curl` reach the public AzDO API before scanning (rule 11):
Valid JSON: continue. If `curl` itself is denied, that is the harness rejecting the command form, not the firewall (`.dev.azure.com` and `.helix.dot.net` are allow-listed). If an inlined first-token `curl` still fails, record `skipped: harness denied inlined curl to dev.azure.com; firewall already allows it` and stop. Never blame the firewall allowlist and never open a PR.
## Step 1. Set up
+
Run one inlined `curl` per definition id in `154 223 224 225 226 228 260 261 265`, substituting the id in the URL and the `-o` path:
Reconstruct `Stage -> Phase -> Job -> Task` via `parentId`. A failed leaf with non-null `log.id` is a candidate.
Filter to Helix work items only. xharness runs inside Helix work items, not on the AzDO agent. From the `Send to Helix` task log, extract `Sent Helix Job: <GUID>`:
```bash
-
log_url='<Send to Helix task log url>'
-
curl -s "$log_url"| tee /tmp/gh-aw/agent/helix-send.log
+
curl -s "<Send to Helix task log url>" -o /tmp/gh-aw/agent/helix-send.log
A work item is an xharness invocation candidate if `ConsoleOutputUri` contains an xharness command (`xharness apple`, `xharness android`, `xharness wasm`, or `dotnet exec .../Microsoft.DotNet.XHarness.CLI.dll`). Fetch the console and scan for:
@@ -192,11 +199,10 @@ gh pr list --repo dotnet/xharness --state all --limit 50 \
On match (open or merged in last 30 days): `existing-PR #<n>` / `existing-issue #<n>`. Emit nothing.
-
Same-run cache:
+
Same-run cache. Use the `<exit_code>|<command_norm>|<signature_norm>` key inline, never via a variable (rule 11):
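
The snippet that follows this line in the file is not visible in this view, so here is a minimal sketch of what an inlined-key cache check could look like. The key value `1|xharness apple test|sample_signature` is an illustrative placeholder, and the `mkdir -p` setup line and the `||` chaining are assumptions that the harness is not confirmed to accept.

```bash
# Setup for a local dry run only; mkdir is an assumption, not a confirmed allow-listed program.
mkdir -p /tmp/gh-aw/agent

# Hit (grep exits 0): the key is already cached, so treat the candidate as dup-this-run.
# Miss or missing file (grep exits non-zero): append the key so later candidates hit.
grep -qxF '1|xharness apple test|sample_signature' /tmp/gh-aw/agent/filed.tsv 2>/dev/null || printf '%s\n' '1|xharness apple test|sample_signature' >> /tmp/gh-aw/agent/filed.tsv
```

`grep -F` treats the `|` separators as literal characters and `-x` requires a whole-line match, so a key that is only a prefix of another cached key does not count as a hit. The command starts with `grep` and repeats the key literally, so no variable assignment is needed.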