Fix CI: Install CUDA toolkit for flashinfer JIT compilation

JenniferWang · JenniferWang · commit da56a0deb97a · 2026-02-02T10:39:25.000-05:00
vLLM 0.13.0 uses flashinfer as its default attention backend, which
requires nvcc for JIT compilation of CUDA kernels at runtime. The CI
runner has GPU runtime libraries but not the CUDA development toolkit,
causing the error:
  RuntimeError: Could not find nvcc and default cuda_home='/usr/local/cuda' doesn't exist

This change installs cuda-toolkit-12-8, sets the CUDA_HOME environment
variable, and adds the toolkit's bin directory to PATH before
installing torchforge.
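
A quick way to confirm that nvcc is discoverable after the toolkit step is
sketched below. The function name find_nvcc is hypothetical, and the lookup
order (CUDA_HOME first, then PATH) is an assumption for illustration, not
flashinfer's actual search logic. The /usr/local/cuda fallback is taken from
the error message above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the path of the nvcc that would be used, or fail.
# Assumed lookup order: $CUDA_HOME/bin/nvcc first, then whatever is on PATH.
find_nvcc() {
  # Default mirrors the cuda_home reported in the RuntimeError above.
  local cuda_home="${CUDA_HOME:-/usr/local/cuda}"
  if [ -x "${cuda_home}/bin/nvcc" ]; then
    echo "${cuda_home}/bin/nvcc"
  elif command -v nvcc >/dev/null 2>&1; then
    command -v nvcc
  else
    echo "nvcc not found (CUDA_HOME=${cuda_home})" >&2
    return 1
  fi
}
```

Calling `find_nvcc || exit 1` in a later workflow step would make a missing
compiler fail fast, instead of surfacing only when a kernel is JIT-compiled.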

Note: A branch trigger was temporarily added to test this fix; remove it after merging.
diff --git a/.github/workflows/integration_test.yaml b/.github/workflows/integration_test.yaml
@@ -2,7 +2,7 @@ name: Integration Tests (8 card)
 
 on:
   push:
-    branches: [ main ]
+    branches: [ main, fix-ci-cuda-toolkit ]  # TODO: remove fix-ci-cuda-toolkit after testing
   workflow_dispatch:
 
 concurrency:
@@ -33,6 +33,12 @@ jobs:
           python-version: '3.12'
       - name: Update pip
         run: python -m pip install --upgrade pip
+      - name: Install CUDA toolkit
+        run: |
+          # flashinfer (used by vLLM 0.13.0) requires nvcc for JIT compilation
+          sudo dnf install -y cuda-toolkit-12-8
+          echo "CUDA_HOME=/usr/local/cuda-12.8" >> $GITHUB_ENV
+          echo "/usr/local/cuda-12.8/bin" >> $GITHUB_PATH
       - name: Install torchforge
         run: pip install uv && uv pip install . && uv pip install .[dev]
       - name: Run weight sync integration test