Commit 46173be

Add Claude-powered CVE triage to scheduled scan
When the weekly Trivy scan finds new fixable HIGH/CRITICAL CVEs, hand them off to Claude Sonnet 4.6 (on Vertex AI in viral-seq-ai via Workload Identity Federation) for analysis, then file one GitHub issue per CVE explaining the vuln, the dependency chain, why the Rego policy didn't suppress it, and a recommended fix informed by historical patterns in the repo.

"New" is determined by issue-existence (open OR closed) -- a CVE with no issue whose title contains the CVE ID is treated as new.

workflow_dispatch inputs for testing without waiting for the weekly schedule:
  - test_cve_id: bypass diff and force-analyze a specific CVE ID
  - dry_run: run analysis but skip gh issue create (artifact is still uploaded for inspection)

Required GH repo variables (already set):
  GCP_PROJECT_ID   - viral-seq-ai
  GCP_WIP_PROVIDER - full WIF provider resource path
  GCP_SA_EMAIL     - viral-ngs-cve-triage@viral-seq-ai.iam.gserviceaccount.com

GCP-side: WIF pool github-actions-pool + provider broadinstitute-github (gated by repository_owner == broadinstitute); SA has roles/aiplatform.user and roles/serviceusage.serviceUsageConsumer.
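The "new by issue-existence" rule described above can be tried offline with a small sketch. This is not the workflow's code: the `existing_titles` array is a hypothetical stand-in for the issue tracker (the workflow queries it with `gh search issues`), the function name `filter_new_cves` is made up for illustration, and the CVE IDs are placeholders.

```shell
# Hypothetical stand-in for the tracker: titles of existing issues, open or closed.
existing_titles=(
  "[CVE-2024-0001] examplepkg: placeholder title"
  "Unrelated bug report"
)

# Print every CVE ID passed as an argument that appears in no existing title.
# Assumption: plain substring matching approximates the title search.
filter_new_cves() {
  local cve title seen
  for cve in "$@"; do
    seen=false
    for title in "${existing_titles[@]}"; do
      if [[ "$title" == *"$cve"* ]]; then
        seen=true
        break
      fi
    done
    if [[ "$seen" == false ]]; then
      echo "$cve"
    fi
  done
}

# CVE-2024-0001 already has an issue, so only CVE-2024-0002 is printed.
filter_new_cves CVE-2024-0001 CVE-2024-0002
```

Because closed issues count as existing, a CVE whose issue was closed is not reported again under this rule, even if it still appears in later scans.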
1 parent 6d229c8 commit 46173be

1 file changed

Lines changed: 248 additions & 1 deletion

.github/workflows/container-scan.yml

@@ -5,6 +5,17 @@ on:
     # Weekly scan of main branch mega image every Monday at 06:00 UTC
     - cron: '0 6 * * 1'
   workflow_dispatch:
+    inputs:
+      test_cve_id:
+        description: 'Optional: bypass new-CVE detection and force-analyze this specific CVE ID (for testing the Claude pipeline)'
+        required: false
+        type: string
+        default: ''
+      dry_run:
+        description: 'Run Claude analysis but do NOT file GitHub issues (artifact still uploaded)'
+        required: false
+        type: boolean
+        default: false
 
 permissions: {}
 
@@ -18,6 +29,8 @@ jobs:
       contents: read
       packages: read
       security-events: write
+      issues: write # for filing CVE issues
+      id-token: write # for OIDC token to GCP via WIF
     steps:
       - name: Checkout repository
         uses: actions/checkout@v4
@@ -49,7 +62,7 @@ jobs:
           format: 'json'
           output: 'trivy-results.json'
           severity: 'CRITICAL,HIGH'
-          exit-code: '1'
+          exit-code: '0' # don't fail here — Claude pipeline + final-step gate handles signaling
           ignore-unfixed: true
           trivyignores: '.trivyignore'
           ignore-policy: '.trivy-ignore-policy.rego'
@@ -75,3 +88,237 @@ jobs:
         with:
           name: trivy-mega-scheduled
           path: trivy-results.json
+
+      # === Claude triage pipeline ===
+      # If new fixable HIGH/CRITICAL CVEs are found, hand them to Claude (Sonnet 4.6
+      # on Vertex AI) for analysis, then file GitHub issues. Source of truth for
+      # "new" is GH issues themselves: a CVE is new if no existing issue (open OR
+      # closed) has the CVE ID in its title.
+
+      - name: Identify new fixable CVEs
+        id: triage
+        env:
+          GH_TOKEN: ${{ secrets.GITHUB_TOKEN }}
+          TEST_CVE_ID: ${{ inputs.test_cve_id }}
+        run: |
+          set -euo pipefail
+
+          # Test mode: bypass scan-diff and use the provided CVE ID directly.
+          if [ -n "${TEST_CVE_ID:-}" ]; then
+            echo "::notice::Test mode active — analyzing TEST_CVE_ID=$TEST_CVE_ID"
+            echo "cve_ids=$TEST_CVE_ID" >> "$GITHUB_OUTPUT"
+            echo "test_mode=true" >> "$GITHUB_OUTPUT"
+            exit 0
+          fi
+
+          # Production mode: parse trivy JSON for fixable HIGH/CRITICAL CVEs.
+          all_cves=$(jq -r '
+            [.Results[]?.Vulnerabilities[]?
+              | select((.Severity == "HIGH" or .Severity == "CRITICAL")
+                       and (.FixedVersion // "") != "")
+              | .VulnerabilityID]
+            | unique[]
+          ' trivy-results.json)
+
+          if [ -z "$all_cves" ]; then
+            echo "::notice::No fixable HIGH/CRITICAL CVEs in scan."
+            echo "cve_ids=" >> "$GITHUB_OUTPUT"
+            echo "test_mode=false" >> "$GITHUB_OUTPUT"
+            exit 0
+          fi
+
+          # Dedup against existing GH issues (open + closed) by title-substring search.
+          new_cves=()
+          for cve in $all_cves; do
+            count=$(gh search issues \
+              --repo "$GITHUB_REPOSITORY" \
+              --state=all \
+              "\"$cve\" in:title" \
+              --json url --jq 'length')
+            if [ "$count" = "0" ]; then
+              new_cves+=("$cve")
+              echo "  NEW: $cve"
+            else
+              echo "  existing issue for $cve, skipping"
+            fi
+          done
+
+          new_list="${new_cves[*]:-}"
+          echo "::notice::Found ${#new_cves[@]} new fixable CVE(s)"
+          echo "cve_ids=$new_list" >> "$GITHUB_OUTPUT"
+          echo "test_mode=false" >> "$GITHUB_OUTPUT"
+
+      - name: Authenticate to GCP via Workload Identity Federation
+        if: steps.triage.outputs.cve_ids != ''
+        uses: google-github-actions/auth@v2
+        with:
+          workload_identity_provider: ${{ vars.GCP_WIP_PROVIDER }}
+          service_account: ${{ vars.GCP_SA_EMAIL }}
+
+      - name: Ensure issue labels exist
+        if: steps.triage.outputs.cve_ids != '' && inputs.dry_run != true
+        env:
+          GH_TOKEN: ${{ secrets.GITHUB_TOKEN }}
+        run: |
+          # Idempotent: gh label create exits non-zero if label exists; ignore that.
+          gh label create security --color B60205 --description "Security-related issue" --repo "$GITHUB_REPOSITORY" 2>/dev/null || true
+          gh label create cve --color B60205 --description "CVE tracked in container scans" --repo "$GITHUB_REPOSITORY" 2>/dev/null || true
+          gh label create test --color FBCA04 --description "Test issue (filed by workflow_dispatch test_cve_id)" --repo "$GITHUB_REPOSITORY" 2>/dev/null || true
+
+      - name: Claude analysis on Vertex AI
+        if: steps.triage.outputs.cve_ids != ''
+        uses: anthropics/claude-code-action@beta
+        env:
+          CLAUDE_CODE_USE_VERTEX: '1'
+          CLOUD_ML_REGION: global
+          ANTHROPIC_VERTEX_PROJECT_ID: ${{ vars.GCP_PROJECT_ID }}
+        with:
+          claude_args: |
+            --model claude-sonnet-4-6
+            --max-turns 30
+          settings: |
+            {
+              "permissions": {
+                "allow": [
+                  "Read",
+                  "Write",
+                  "Bash(git log:*)",
+                  "Bash(git show:*)",
+                  "Bash(git rev-parse:*)",
+                  "Bash(grep:*)",
+                  "Bash(find:*)",
+                  "Bash(jq:*)",
+                  "Bash(ls:*)",
+                  "Bash(mkdir:*)",
+                  "Bash(cat:*)",
+                  "Bash(head:*)",
+                  "Bash(tail:*)"
+                ]
+              }
+            }
+          prompt: |
+            You are triaging container vulnerabilities for the broadinstitute/viral-ngs repo.
+
+            ## Your task
+
+            For each CVE ID listed below, write a triage report to `/tmp/issues/<CVE-ID>.md`.
+            The reports will be filed verbatim as GitHub issues by the next workflow step.
+
+            **CVE IDs to analyze:** ${{ steps.triage.outputs.cve_ids }}
+
+            **Test mode:** ${{ steps.triage.outputs.test_mode }}
+            (If `true`, the CVE was supplied manually via `test_cve_id` and may not appear in
+            the current scan's `trivy-results.json`. Use your training knowledge in that case
+            and add a `> _Test analysis_` blockquote at the top of the report so reviewers
+            know it was generated for pipeline validation, not from a real scan finding.)
+
+            ## Required reading (do this BEFORE writing reports)
+
+            1. `trivy-results.json` (in the workspace root) — authoritative metadata for every
+               CVE flagged in the current scan. ALWAYS check here first for CVE details
+               (severity, vector, package path, fix version, references). Use `jq` to query.
+            2. `.agents/skills/container-vulns/SKILL.md` — read fully. This is the repo's
+               container-vulnerability playbook and tells you what the maintainers consider
+               actionable vs. accepted risk.
+            3. `.trivyignore` — existing per-CVE exceptions with their justifications. Mirror
+               the writing style and depth of justification when you recommend `.trivyignore`
+               additions.
+            4. `.trivy-ignore-policy.rego` — Rego policy for class-level CVE filtering.
+               Understand what it filters and why.
+            5. `docker/Dockerfile.*` — container build files showing dep installs and inline
+               mitigations. Look for prior fixups (`find ... -exec rm`, `gem install`, etc.)
+               applied to similar packages.
+            6. `docker/requirements/*.txt` — conda dependency lists. Use `grep` to find
+               which file pulls in the affected package.
+            7. Recent git history — `git log --all --oneline --grep <package>`,
+               `git log --all --oneline --grep CVE-`, and `git show <sha>` to inspect prior
+               fix patterns. ALWAYS verify a commit SHA exists before citing it.
+
+            ## Required structure for each report
+
+            File path: `/tmp/issues/<CVE-ID>.md` (filename MUST match the CVE ID exactly).
+            **First line MUST be a single H1 used as the issue title:**
+            `# [CVE-YYYY-NNNN] <package>: <one-line description>`
+
+            Then sections (use H2 `##` headers):
+
+            1. **Summary** — 2–3 sentences: what it is, severity, where it came from.
+            2. **Vulnerability details** — CVSS score + vector + plain-English meaning;
+               2–4 sentences explaining the bug technically.
+            3. **Dependency chain** — name the direct conda package or Docker layer that
+               pulls this in. Trace transitive deps where you can. If you can't determine
+               this confidently, say so explicitly — do NOT guess.
+            4. **Why the Rego policy didn't suppress it** — explain in terms of the AV/PR/UI/S
+               vector classes the policy filters and why this CVE's vector doesn't match.
+            5. **Recommended fix** — concrete and actionable. Options:
+               - Version bump (which file, which floor)
+               - Inline Dockerfile mitigation (which Dockerfile, what RUN-block addition)
+               - `.trivyignore` entry (with justification matching the existing style)
+               Cite historical precedent when applicable: `(mirror the fix in commit <sha>)`.
+            6. **Practical exploitability** — in this deployment model (ephemeral batch
+               containers, no network-facing services, no untrusted user input at runtime),
+               is this actually reachable? Be honest and specific.
+            7. **References** — GHSA URL, NVD URL, vendor advisory.
+
+            ## Constraints
+
+            - `mkdir -p /tmp/issues` first.
+            - One file per CVE.
+            - Be concise. Each report should be readable in 1–2 minutes (target: 300–600 words).
+            - Do NOT hallucinate package versions, file paths, or commit SHAs. Verify with
+              tools when in doubt.
+            - If you finish all reports with budget remaining, do NOT pad — stop.
+
+      - name: Upload Claude analysis as artifact
+        if: steps.triage.outputs.cve_ids != '' && always()
+        uses: actions/upload-artifact@v4
+        with:
+          name: claude-cve-analysis
+          path: /tmp/issues/
+          if-no-files-found: warn
+
+      - name: File GitHub issues
+        if: steps.triage.outputs.cve_ids != '' && inputs.dry_run != true
+        env:
+          GH_TOKEN: ${{ secrets.GITHUB_TOKEN }}
+          TEST_MODE: ${{ steps.triage.outputs.test_mode }}
+        run: |
+          set -uo pipefail
+
+          if [ ! -d /tmp/issues ] || [ -z "$(ls -A /tmp/issues 2>/dev/null)" ]; then
+            echo "::error::No analysis files in /tmp/issues — Claude may have failed silently"
+            exit 1
+          fi
+
+          failed=0
+          for f in /tmp/issues/*.md; do
+            cve=$(basename "$f" .md)
+            title=$(head -1 "$f" | sed 's/^# *//')
+            body=$(tail -n +2 "$f")
+
+            labels="security,cve"
+            if [ "$TEST_MODE" = "true" ]; then
+              labels="$labels,test"
+            fi
+
+            echo "Creating issue for $cve: $title"
+            if ! gh issue create \
+              --repo "$GITHUB_REPOSITORY" \
+              --title "$title" \
+              --body "$body" \
+              --label "$labels"; then
+              echo "::error::Failed to create issue for $cve"
+              failed=$((failed+1))
+            fi
+          done
+
+          if [ $failed -gt 0 ]; then
+            echo "::error::$failed issue(s) failed to create"
+            exit 1
+          fi
+
+      - name: Fail job if new CVEs were found (production mode only)
+        if: steps.triage.outputs.cve_ids != '' && steps.triage.outputs.test_mode != 'true'
+        run: |
+          echo "::error::Scan found new fixable HIGH/CRITICAL CVEs. See filed issues for details."
+          exit 1
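The title/body split and label selection in the "File GitHub issues" step can be checked locally without calling `gh`. The sketch below reuses the step's `head`/`sed`/`tail` commands on a hypothetical sample report; the report contents, the placeholder CVE ID, and the hand-set `TEST_MODE` value are assumptions made for illustration.

```shell
# Hypothetical sample report in the shape the prompt asks for (H1 title, then H2 sections).
report=$(mktemp)
printf '%s\n' \
  '# [CVE-2024-0002] examplepkg: sample description' \
  '' \
  '## Summary' \
  'Sample body text.' > "$report"

# Same commands as the workflow step.
title=$(head -1 "$report" | sed 's/^# *//')   # first line without the leading "# "
body=$(tail -n +2 "$report")                  # everything after the first line

# TEST_MODE is set by hand here; in the workflow it comes from the triage step output.
TEST_MODE=true
labels="security,cve"
if [ "$TEST_MODE" = "true" ]; then
  labels="$labels,test"
fi

echo "title:  $title"
echo "labels: $labels"
rm -f "$report"
```

If a report's first line is not an H1, the `sed` expression leaves it unchanged, so that raw first line would become the issue title.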
