Commit 06a9409

ci: align integration test runners with downstream CIs, add multi-GPU peft
Key changes after digging into each downstream project's own CI:

Runner updates:
- transformers: T4 → A10G (bandb-aws-g5-4xlarge-plus). Current upstream transformers quantization CI runs on g5.4xlarge (A10G); our earlier T4 choice came from a stale Feb-2024 fork.
- peft (single GPU): A10 → L4 (bandb-aws-g6-4xlarge-plus). Matches peft's aws-g6-4xlarge-plus runner group exactly.

PEFT filter:
- Switched from `-m "single_gpu_tests and bitsandbytes"` (both test files) to Benjamin Bossan's recommendation: `-m single_gpu_tests -k PeftBnbGPUExampleTests tests/test_gpu_examples.py`. Narrower scope (20 vs. 86 tests) focused on the end-to-end QLoRA-style integration signal, with less noise from tests where bnb is incidental.

New multi-GPU peft job:
- Uses bandb-aws-g6-12xlarge-plus (4× L4, CUDA_VISIBLE_DEVICES=0,1), mirroring the legacy peft nightly-bnb.yml deleted in peft#2858.
- Filter: `-m multi_gpu_tests -k PeftBnbGPUExampleTests`.
- Note: this runner is still being provisioned by infra; the job will fail to pick up a runner until that's done.

Accelerate:
- Added `-rs` to surface skip reasons. The previous run showed 26 silent skips that produced a false "pass"; `-rs` prints the reason for each.

Report job's `needs:` updated to include test-peft-multigpu.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
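The 86 → 20 narrowing above comes from combining pytest's `-m` marker filter with a `-k` name filter. A minimal self-contained sketch of that interaction (toy test names, not peft's real suite; assumes pytest is installed):

```python
# Toy demonstration of pytest -m / -k selection (hypothetical test names,
# not peft's actual suite). -m selects by marker; -k then narrows by
# name substring, mirroring the 86 -> 20 narrowing described above.
import os
import subprocess
import sys
import tempfile
import textwrap

TEST_SRC = textwrap.dedent("""\
    import pytest

    @pytest.mark.single_gpu_tests
    class TestPeftBnbGPUExample:   # matched by -k below
        def test_qlora(self):
            pass

    @pytest.mark.single_gpu_tests
    class TestOtherSingleGpu:      # marker matches, but -k filters it out
        def test_misc(self):
            pass
""")

# Register the custom marker so pytest doesn't warn about it.
INI = "[pytest]\nmarkers =\n    single_gpu_tests: demo marker\n"


def collected(*extra_args):
    """Return how many tests pytest would collect with the given filters."""
    with tempfile.TemporaryDirectory() as d:
        with open(os.path.join(d, "test_demo.py"), "w") as f:
            f.write(TEST_SRC)
        with open(os.path.join(d, "pytest.ini"), "w") as f:
            f.write(INI)
        out = subprocess.run(
            [sys.executable, "-m", "pytest", "--collect-only", "-q", *extra_args],
            capture_output=True, text=True, cwd=d,
        ).stdout
        # Each selected test prints as path::Class::test_name.
        return out.count("::test_")


print(collected("-m", "single_gpu_tests"))                                 # all marked tests
print(collected("-m", "single_gpu_tests", "-k", "TestPeftBnbGPUExample"))  # narrowed
```

Against peft's actual suite the same narrowing is what drops the selection from 86 to the 20 PeftBnbGPUExampleTests cases.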
1 parent 2479b41 commit 06a9409

1 file changed: .github/workflows/tests-integration-nightly.yml (67 additions & 8 deletions)
@@ -43,9 +43,9 @@ jobs:
   # This reduces spurious failures from expected values calibrated on their runners.
 
   test-transformers:
-    name: Transformers bnb tests
+    name: Transformers bnb tests (single GPU)
     if: github.repository == 'bitsandbytes-foundation/bitsandbytes'
-    runs-on: bandb-aws-g4dn-4xlarge-plus-use1-public-80 # T4
+    runs-on: bandb-aws-g5-4xlarge-plus-use1-public-80 # A10G (matches transformers CI)
     steps:
       - name: Show GPU information
         run: nvidia-smi
@@ -140,7 +140,7 @@ jobs:
         run: |
           mkdir -p ${GITHUB_WORKSPACE}/reports
           python -m pytest tests/test_quantization.py \
-            -s -v \
+            -s -v -rs \
             -k "not multi_device" \
             --junitxml=${GITHUB_WORKSPACE}/reports/accelerate.xml \
             -o junit_logging=all \
@@ -155,9 +155,9 @@ jobs:
           retention-days: 7
 
   test-peft:
-    name: PEFT bnb tests
+    name: PEFT bnb tests (single GPU)
     if: github.repository == 'bitsandbytes-foundation/bitsandbytes'
-    runs-on: bandb-aws-g5-4xlarge-plus-use1-public-80 # A10
+    runs-on: bandb-aws-g6-4xlarge-plus-use1-public-80 # L4 (matches peft CI)
     steps:
       - name: Show GPU information
         run: nvidia-smi
@@ -196,8 +196,9 @@ jobs:
         run: |
           mkdir -p ${GITHUB_WORKSPACE}/reports
           python -m pytest \
-            -m "single_gpu_tests and bitsandbytes" \
-            tests/test_gpu_examples.py tests/test_common_gpu.py \
+            -m single_gpu_tests \
+            -k PeftBnbGPUExampleTests \
+            tests/test_gpu_examples.py \
             -v \
             --junitxml=${GITHUB_WORKSPACE}/reports/peft.xml \
             -o junit_logging=all \
@@ -211,6 +212,64 @@ jobs:
           path: reports/
           retention-days: 7
 
+  test-peft-multigpu:
+    name: PEFT bnb tests (multi GPU)
+    if: github.repository == 'bitsandbytes-foundation/bitsandbytes'
+    runs-on: bandb-aws-g6-12xlarge-plus-use1-public-80 # 4× L4
+    steps:
+      - name: Show GPU information
+        run: nvidia-smi
+
+      - uses: actions/checkout@v4
+
+      - name: Setup Python
+        uses: actions/setup-python@v5
+        with:
+          python-version: ${{ env.PYTHON_VERSION }}
+
+      - name: Install torch + bnb (from continuous-release)
+        run: |
+          pip install torch==${TORCH_VERSION} --index-url ${PYPI_INDEX}
+          pip install "bitsandbytes[test] @ ${BNB_WHEEL_URL}"
+
+      - name: Install peft and clone matching tag
+        run: |
+          pip install peft transformers accelerate datasets
+          PEFT_VERSION=$(pip show peft | awk '/^Version:/ {print $2}')
+          echo "Installed peft v${PEFT_VERSION}"
+          git clone --depth=1 --branch "v${PEFT_VERSION}" \
+            https://github.com/huggingface/peft.git /tmp/peft
+
+      - name: Show environment
+        run: |
+          pip list
+          python -m torch.utils.collect_env
+
+      - name: Run peft bnb tests
+        working-directory: /tmp/peft
+        env:
+          IS_GITHUB_CI: "1"
+          CUDA_VISIBLE_DEVICES: "0,1"
+        shell: bash -o pipefail {0}
+        run: |
+          mkdir -p ${GITHUB_WORKSPACE}/reports
+          python -m pytest \
+            -m multi_gpu_tests \
+            -k PeftBnbGPUExampleTests \
+            tests/test_gpu_examples.py \
+            -v \
+            --junitxml=${GITHUB_WORKSPACE}/reports/peft-multigpu.xml \
+            -o junit_logging=all \
+            2>&1 | tee ${GITHUB_WORKSPACE}/reports/peft-multigpu.log
+
+      - name: Upload JUnit XML and log
+        if: always()
+        uses: actions/upload-artifact@v4
+        with:
+          name: reports-peft-multigpu
+          path: reports/
+          retention-days: 7
+
   # ─── Consolidated report ──────────────────────────────────────────────────
   # Runs after all three test jobs finish (success or failure).
   # Downloads the JUnit XMLs, runs our report script, writes to the job
@@ -221,7 +280,7 @@ jobs:
 
   report:
     name: Consolidated report
-    needs: [test-transformers, test-accelerate, test-peft]
+    needs: [test-transformers, test-accelerate, test-peft, test-peft-multigpu]
     if: always() && github.repository == 'bitsandbytes-foundation/bitsandbytes'
     runs-on: ubuntu-22.04
     steps:
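The `-rs` flag added to the accelerate invocation makes pytest print each skip's reason in the short test summary, so a run dominated by skips can no longer look like a clean pass. A small self-contained sketch of the difference (throwaway test file with an illustrative name, not from accelerate's suite; assumes pytest is installed):

```python
# Demonstrates pytest's -rs flag: skip *reasons* only appear in the
# short test summary when -r s is requested. The test file here is a
# throwaway demo, not part of accelerate's real suite.
import os
import subprocess
import sys
import tempfile
import textwrap

SRC = textwrap.dedent("""\
    import pytest

    @pytest.mark.skip(reason="requires 2 GPUs")
    def test_multi_device():
        pass
""")


def run_pytest(*extra_args):
    """Run pytest on a one-test file and return its stdout."""
    with tempfile.TemporaryDirectory() as d:
        with open(os.path.join(d, "test_skips.py"), "w") as f:
            f.write(SRC)
        return subprocess.run(
            [sys.executable, "-m", "pytest", "-q", *extra_args],
            capture_output=True, text=True, cwd=d,
        ).stdout


print("requires 2 GPUs" in run_pytest())       # reason hidden -> False
print("requires 2 GPUs" in run_pytest("-rs"))  # reason surfaced -> True
```

In CI, the surfaced reasons make the 26 previously silent accelerate skips visible in the job log.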
