[Environment Adaptation] Add sm_120 to Blackwell CUDA arch selection #79331

Open
BostonSupremeMantou wants to merge 4 commits into PaddlePaddle:develop from BostonSupremeMantou:sm120-blackwell-cuda-arch

Conversation

BostonSupremeMantou commented Jun 17, 2026

PR Category

Environment Adaptation

PR Types

Improvements

Description

This PR updates CUDA architecture selection for Blackwell GPUs by adding sm_120 alongside sm_100 when the CUDA compiler supports native Blackwell code generation.

It also guards CUDA_ARCH_NAME=Blackwell behind CUDA 12.8+, because older CUDA toolchains such as CUDA 12.4 do not recognize compute_120. For CUDA 12.x releases before 12.8, CUDA_ARCH_NAME=All continues to exclude sm_100/sm_120 to avoid generating unsupported NVCC flags.
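The selection rule described above can be sketched in Python. This is only an illustration of the logic, not the code in cmake/cuda.cmake: the function name `select_archs`, the constants, and the pre-Blackwell architecture list are hypothetical placeholders; only the 12.8 threshold and the sm_100/sm_120 pair come from the description.

```python
from typing import List, Tuple

# Taken from the PR description: Blackwell targets and the minimum CUDA version.
BLACKWELL_ARCHS = ["100", "120"]
MIN_BLACKWELL_CUDA = (12, 8)

# Hypothetical placeholder for whatever "All" expands to before Blackwell.
PRE_BLACKWELL_ARCHS = ["80", "86", "89", "90"]


def select_archs(arch_name: str, cuda_version: Tuple[int, int]) -> List[str]:
    """Return the SM numbers to build for a named arch and a (major, minor) CUDA version."""
    supports_blackwell = cuda_version >= MIN_BLACKWELL_CUDA

    if arch_name == "Blackwell":
        # Fail early instead of passing compute_120 to an NVCC that rejects it.
        if not supports_blackwell:
            raise ValueError(
                "CUDA_ARCH_NAME=Blackwell requires CUDA 12.8+, got %d.%d" % cuda_version
            )
        return list(BLACKWELL_ARCHS)

    if arch_name == "All":
        # Older toolchains silently skip the Blackwell targets.
        archs = list(PRE_BLACKWELL_ARCHS)
        if supports_blackwell:
            archs += BLACKWELL_ARCHS
        return archs

    raise ValueError(f"unknown arch name: {arch_name}")
```

Under these assumptions, `select_archs("All", (12, 4))` leaves out 100 and 120, while `select_archs("Blackwell", (12, 4))` raises rather than producing flags the compiler cannot accept.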

This is a build configuration readiness change, not a full claim of RTX 50-series runtime support.

Related context: #79314 handles CUDA 13 sm_121 for GB10 / DGX Spark. This PR is intentionally focused on sm_120 and the CUDA 12.8+ guard.

Tests run:

  • git diff --check
  • pre-commit run --files cmake/cuda.cmake test/compat/test_cpp_extension_api.py
  • CMake probe with CUDA 12.4: CUDA_ARCH_NAME=Blackwell fails with the expected CUDA 12.8+ requirement
  • CMake probe with CUDA 12.4: CUDA_ARCH_NAME=All does not emit sm_100/sm_120
  • CMake probe with CUDA 12.8: CUDA_ARCH_NAME=Blackwell emits sm_100/sm_120
  • python -m unittest discover -s test/compat -p test_cpp_extension_api.py -k blackwell
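The CMake probes above check whether sm_100/sm_120 appear in the generated flags. A minimal sketch of such a check follows; the helper names and the sample flag strings are hypothetical and were not captured from a real Paddle build, though `-gencode arch=compute_XX,code=sm_XX` is standard NVCC syntax.

```python
import re
from typing import Set

# The two Blackwell targets named in the PR description.
BLACKWELL_SMS = {"100", "120"}


def emitted_sms(nvcc_flags: str) -> Set[str]:
    """Collect every SM number that appears as an sm_XX target in a flag string."""
    return set(re.findall(r"\bsm_(\d+)", nvcc_flags))


def emits_blackwell(nvcc_flags: str) -> bool:
    """True if any Blackwell target is present in the flags."""
    return bool(emitted_sms(nvcc_flags) & BLACKWELL_SMS)


# Hypothetical flag strings standing in for probe output.
flags_cuda_12_4 = (
    "-gencode arch=compute_80,code=sm_80 "
    "-gencode arch=compute_90,code=sm_90"
)
flags_cuda_12_8 = flags_cuda_12_4 + (
    " -gencode arch=compute_100,code=sm_100"
    " -gencode arch=compute_120,code=sm_120"
)
```

With these sample strings, `emits_blackwell(flags_cuda_12_4)` is false and `emits_blackwell(flags_cuda_12_8)` is true, matching what the CUDA 12.4 and CUDA 12.8 probes expect.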

Does this cause precision changes?

CLAassistant commented Jun 17, 2026

CLA assistant check
All committers have signed the CLA.

BostonSupremeMantou force-pushed the sm120-blackwell-cuda-arch branch from a06d4c3 to ceb2c94 on June 17, 2026 21:22
BostonSupremeMantou marked this pull request as ready for review on June 17, 2026 21:28
PaddlePaddle-bot: three earlier comments were marked as outdated.

PaddlePaddle-bot left a comment

🤖 Paddle-CI-Agent | pr_review | 2026-06-18 11:16:46

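📋 Review Summary

PR overview: Adds sm_120 to Blackwell CUDA architecture selection and gates toolchains older than CUDA 12.8.
Change scope: cmake/cuda.cmake, python/paddle/utils/cpp_extension/extension_utils.py, test/compat/test_cpp_extension_api.py
Impact tags: Build, Python API, Tests

Issues

No blocking issues found.

Status of previous findings

| Finding | Issue | Status |
| F1 | Writing a numeric Blackwell architecture explicitly in PADDLE_CUDA_ARCH_LIST bypassed the CUDA 12.8 check | ✅ Fixed |

📝 PR convention check

Conforms to conventions.

Overall assessment

The current diff checks Blackwell numeric architectures in one place, after named architectures are expanded. This covers 10.0, 10.1, 12.0+PTX, and the auto-detected 12.0 path, so the previously reported bypass is fixed. No new blocking issues were found. A local attempt to run python -m unittest discover -s test/compat -p test_cpp_extension_api.py -k blackwell could not execute the test assertions because the current environment lacks paddle.base.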
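The post-expansion check described in the review can be sketched as follows. This is an assumption-laden illustration, not the PR's implementation: the function names are hypothetical, the list is assumed to be separated by spaces or semicolons, and any entry with a major version of 10 or higher is treated as Blackwell-class, which is a simplification chosen here.

```python
import re
from typing import List, Tuple

MIN_BLACKWELL_CUDA = (12, 8)  # threshold stated in the PR description


def blackwell_entries(arch_list: str) -> List[str]:
    """Return numeric entries such as "10.0" or "12.0+PTX" whose major version is >= 10.

    Named architectures are assumed to be expanded already, so non-numeric
    entries are ignored here.
    """
    found = []
    for entry in arch_list.replace(";", " ").split():
        match = re.fullmatch(r"(\d+)\.(\d+)(\+PTX)?", entry)
        if match and int(match.group(1)) >= 10:
            found.append(entry)
    return found


def check_arch_list(arch_list: str, cuda_version: Tuple[int, int]) -> None:
    """Raise if the list names a Blackwell-class arch on a CUDA older than 12.8."""
    offending = blackwell_entries(arch_list)
    if offending and cuda_version < MIN_BLACKWELL_CUDA:
        raise ValueError(
            "architectures %s require CUDA 12.8+, got %d.%d"
            % (", ".join(offending), cuda_version[0], cuda_version[1])
        )
```

Because the check runs on the expanded numeric list, an explicit entry such as 12.0+PTX is rejected on CUDA 12.4 just as CUDA_ARCH_NAME=Blackwell would be, which is the bypass that finding F1 reported.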

3 participants