
Commit c7d8089

dpark01 and claude committed
Pin snpeff=5.1 (log4j 2.17.1, openjdk 8); add vuln management docs to AGENTS.md
snpeff 5.1 already bundles log4j-core 2.17.1 (Log4Shell fix) and uses openjdk 8, avoiding the ARM64 icu solver conflict from 5.2+.

Add Container Vulnerability Management section to AGENTS.md covering:
- Trivy scanning setup (Rego policy, .trivyignore, JSON artifacts)
- Common vulnerability sources (Java fat JARs, Go binaries, vendored deps)
- ARM64 solver conflict patterns and mitigations
- CVE triage decision process

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
1 parent 2a3d6d2 commit c7d8089

3 files changed

Lines changed: 72 additions & 2 deletions

AGENTS.md

Lines changed: 70 additions & 0 deletions
@@ -147,6 +147,7 @@ viral-ngs/
 │ ├── core.txt
 │ ├── core-x86.txt # x86-only core packages
 │ ├── assemble.txt
+│ ├── assemble-x86.txt # x86-only assembly packages
 │ ├── classify.txt
 │ ├── classify-x86.txt # x86-only classify packages
 │ ├── phylo.txt
@@ -503,6 +504,75 @@ micromamba search -c bioconda <package> --subdir linux-aarch64
---

## Container Vulnerability Management

### Scanning

Container images are scanned for vulnerabilities using [Trivy](https://aquasecurity.github.io/trivy/):

- **On every PR/push**: `docker.yml` scans each image flavor after build (SARIF → GitHub Security tab, JSON → artifact)
- **Weekly schedule**: `container-scan.yml` scans the latest published images
- Scans report only **CRITICAL/HIGH** severity findings, skip vulnerabilities that have no fix yet (**ignore-unfixed**), and apply a Rego policy (`.trivy-ignore-policy.rego`)
- Per-CVE exceptions go in `.trivyignore` with mandatory justification comments

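As a rough illustration of the scan settings listed above, the following Python sketch assembles a `trivy image` command line with the same filters. The function name `build_trivy_command`, the default output path, and the image reference are hypothetical; the actual workflows may invoke Trivy differently (for example through a GitHub Action), so treat this as a sketch rather than the repository's configuration.

```python
import shlex


def build_trivy_command(image, output="trivy-results.json"):
    """Return an argv list for a Trivy scan that mirrors the documented filters."""
    return [
        "trivy", "image",
        "--severity", "CRITICAL,HIGH",                   # report only CRITICAL/HIGH findings
        "--ignore-unfixed",                              # skip CVEs with no fix available
        "--ignore-policy", ".trivy-ignore-policy.rego",  # class-level Rego filter
        "--ignorefile", ".trivyignore",                  # per-CVE exceptions
        "--format", "json",
        "--output", output,
        image,
    ]


if __name__ == "__main__":
    # Hypothetical image reference, for illustration only.
    print(shlex.join(build_trivy_command("example.org/viral-ngs:latest")))
```

Returning an argv list instead of a shell string keeps the command safe to pass to `subprocess.run` without quoting issues.
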
### Rego Policy (`.trivy-ignore-policy.rego`)

The Rego policy filters out CVEs whose CVSS vector makes them architecturally inapplicable to ephemeral batch containers:

- **AV:P** (Physical access required): containers are cloud-hosted
- **AV:A** (Adjacent network required): no attacker on same network segment
- **AV:L + UI:R** (Local + user interaction): no interactive sessions
- **AV:L + PR:H** (Local + high privileges): containers run non-root
- **AV:L + S:U** (Local + scope unchanged): attacker already has code execution and impact stays within the ephemeral container

Changes to this policy should be reviewed carefully. The comments in the file explain the rationale and risk for each rule.

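To make the five rules concrete, here is a minimal Python sketch of the same decision logic applied to a CVSS v3 vector string. It is not a translation of the actual `.trivy-ignore-policy.rego` file, which may handle more cases; the function names `parse_cvss_vector` and `is_filtered` are hypothetical, and the sketch covers only the rules as listed above.

```python
def parse_cvss_vector(vector):
    """Split 'CVSS:3.1/AV:L/AC:L/...' into a {metric: value} dict."""
    metrics = {}
    for part in vector.split("/"):
        key, _, value = part.partition(":")
        if key != "CVSS":  # skip the version prefix
            metrics[key] = value
    return metrics


def is_filtered(vector):
    """Return True if the vector matches one of the five documented rules."""
    m = parse_cvss_vector(vector)
    av = m.get("AV")
    if av in ("P", "A"):  # physical or adjacent-network access required
        return True
    if av == "L":
        return (
            m.get("UI") == "R"     # local + user interaction
            or m.get("PR") == "H"  # local + high privileges
            or m.get("S") == "U"   # local + scope unchanged
        )
    return False  # network-reachable (AV:N) findings are never filtered here


if __name__ == "__main__":
    # A network-exploitable vector is kept; a local, scope-unchanged one is dropped.
    print(is_filtered("CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:C/C:H/I:H/A:H"))
    print(is_filtered("CVSS:3.1/AV:L/AC:L/PR:L/UI:N/S:U/C:H/I:H/A:H"))
```
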
### Common Vulnerability Sources

**Python transitive deps**: Pin minimum versions in `docker/requirements/*.txt`. Prefer conda packages over pip. Check conda-forge availability before assuming a version exists, because conda-forge often lags PyPI by days or weeks.

**Java fat JARs** (picard, gatk, snpeff, fgbio): Bioinformatics Java tools are distributed as uber JARs with all dependencies bundled inside. Trivy detects vulnerable libraries (log4j, commons-compress, etc.) baked into these JARs. Version bumps can cause ARM64 conda solver conflicts because Java tools pull in openjdk → harfbuzz → icu version chains that clash with other packages (r-base, boost-cpp, pyicu). Always check:
1. Whether the tool is actually flagged by Trivy (don't bump versions unnecessarily)
2. Whether the CVE applies (e.g., log4j 1.x is NOT vulnerable to Log4Shell)
3. Whether the desired version resolves on ARM64 before pushing

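When checking whether a CVE applies, it can help to see which library versions a fat JAR actually bundles. The sketch below reads the `META-INF/maven/.../pom.properties` entries that Maven-built JARs often retain. This is an assumption about how a given tool was packaged: some uber JARs strip these files, in which case the result is empty. The function name `bundled_maven_artifacts` is hypothetical.

```python
import zipfile


def bundled_maven_artifacts(jar_path):
    """Return {artifactId: version} from META-INF/maven/*/*/pom.properties entries."""
    found = {}
    with zipfile.ZipFile(jar_path) as jar:
        for name in jar.namelist():
            if not (name.startswith("META-INF/maven/") and name.endswith("/pom.properties")):
                continue
            props = {}
            for line in jar.read(name).decode("utf-8", "replace").splitlines():
                # Skip comment lines; keep simple key=value pairs.
                if "=" in line and not line.startswith("#"):
                    key, _, value = line.partition("=")
                    props[key.strip()] = value.strip()
            if "artifactId" in props:
                found[props["artifactId"]] = props.get("version", "unknown")
    return found
```

For example, calling `bundled_maven_artifacts` on a JAR path and looking up `"log4j-core"` in the result shows whether a 2.x release is bundled and at which version, assuming the metadata is present.
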
**Go binaries**: Some conda packages bundle compiled Go binaries (e.g., mafft's `dash_client`, google-cloud-sdk's `gcloud-crc32c`). If the binary is unused, delete it in the Dockerfile. Delete from **both** the installed location and `/opt/conda/pkgs/*/` (conda package cache), because Trivy scans the full filesystem.

**Vendored copies**: Packages like google-cloud-sdk and setuptools bundle their own copies of Python libraries that may be older than what's in the conda environment. Trivy flags these vendored copies separately. Options: delete the vendored directory (if not needed at runtime), or accept the risk in `.trivyignore` with justification.

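Because a deleted Go binary can survive in the conda package cache, a quick check that no copies remain may be useful after editing the Dockerfile. The sketch below is one hypothetical way to do that in Python; the helper name `find_copies` is an assumption, and the `/opt/conda` default is taken from the cache path mentioned above.

```python
from pathlib import Path


def find_copies(filename, prefix="/opt/conda"):
    """Return every regular file under `prefix` named `filename`, sorted by path."""
    return sorted(p for p in Path(prefix).rglob(filename) if p.is_file())


if __name__ == "__main__":
    # Lists any remaining copies, including those under the pkgs/ cache.
    for path in find_copies("dash_client"):
        print(path)
```

Since the package cache lives under the same prefix, a single recursive search covers both the installed location and `pkgs/`.
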
### ARM64 Solver Conflicts

The conda solver on ARM64 (linux-aarch64) is more constrained than amd64 because fewer package builds exist. Common conflict patterns:

- **icu version conflicts**: Many packages (openjdk, r-base, boost-cpp, pyicu) pin specific icu version ranges. Bumping one package can make the entire environment unsolvable.
- **libdeflate/htslib conflicts**: lofreq 2.1.5 pins old htslib/libdeflate versions that conflict with newer pillow/libtiff.
- **openjdk version escalation**: snpeff 5.2+ requires openjdk>=11, 5.3+ requires openjdk>=21. Higher openjdk versions pull in harfbuzz→icu chains that conflict with the other icu-pinned packages (r-base, boost-cpp, pyicu).

When a solver conflict occurs: revert the change, check what version the solver was picking before, and pin to that exact version if it already addresses the CVE.

### Mitigation Decision Process

When triaging a CVE:

1. **Check the CVSS vector**: does the Rego policy already filter it?
2. **Identify the source package**: use Trivy JSON output (`PkgName`, `PkgPath`, `InstalledVersion`)
3. **Check if a fix version exists on conda-forge/bioconda**, not just on PyPI
4. **Test on ARM64**: solver conflicts are the most common failure mode
5. **If the fix version conflicts**: consider whether the CVE is exploitable in your deployment model. Document the risk assessment in `.trivyignore` or `vulnerability-mitigation-status.md`.
6. **If the vulnerable code is unused**: delete the binary/file inline in the Dockerfile (same RUN layer as install to avoid bloating images)

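Step 2 of the list above can be sketched in Python as follows. The sketch assumes the usual Trivy JSON report layout (a top-level `Results` list whose entries carry a `Vulnerabilities` list); the function name `summarize_findings` and the artifact filename in the usage example are hypothetical.

```python
import json


def summarize_findings(report):
    """Yield (CVE id, PkgName, PkgPath, InstalledVersion, FixedVersion) per finding."""
    for result in report.get("Results", []):
        # "Vulnerabilities" may be absent or null when a target has no findings.
        for vuln in result.get("Vulnerabilities") or []:
            yield (
                vuln.get("VulnerabilityID"),
                vuln.get("PkgName"),
                # Fall back to the scan target when no per-package path is recorded.
                vuln.get("PkgPath", result.get("Target")),
                vuln.get("InstalledVersion"),
                vuln.get("FixedVersion"),
            )


if __name__ == "__main__":
    # Hypothetical artifact name; substitute the JSON file produced by the scan.
    with open("trivy-results.json") as handle:
        for row in summarize_findings(json.load(handle)):
            print("\t".join(str(field) for field in row))
```

The `PkgPath` column is what distinguishes a library baked into a fat JAR or vendored directory from one installed in the conda environment.
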
### Key Files

| File | Purpose |
|------|---------|
| `.trivy-ignore-policy.rego` | Rego policy for class-level CVE filtering |
| `.trivyignore` | Per-CVE exceptions with justifications |
| `.github/workflows/docker.yml` | Build-time scanning (SARIF + JSON) |
| `.github/workflows/container-scan.yml` | Weekly scheduled scanning |
| `vulnerability-mitigation-status.md` | Local-only tracking doc (not committed) |

---

## Troubleshooting

### Circular Import Errors

docker/requirements/phylo.txt

Lines changed: 1 addition & 1 deletion
@@ -4,5 +4,5 @@ lofreq>=2.1.5
 mafft>=7.508
 mummer4>=4.0.0rc1
 muscle=3.8.1551
-snpeff>=4.3.1t
+snpeff=5.1
 vphaser2>=2.0

src/viral_ngs/phylo/snpeff.py

Lines changed: 1 addition & 1 deletion
@@ -23,7 +23,7 @@
 _log = logging.getLogger(__name__)

 TOOL_NAME = 'snpEff'
-TOOL_VERSION = '4.3.1t'
+TOOL_VERSION = '5.1'


 class SnpEff(core.Tool):
