
fix: tag tritonfrontend wheel with arch-specific platform tag (TRI-983)#8761

Open
mc-nv wants to merge 5 commits into main from mchornyi/TRI-983-wheel-tag

@mc-nv mc-nv commented Apr 29, 2026

What does the PR do?

Fix tritonfrontend Python wheel so it ships with a correct architecture-specific
platform tag instead of py3-none-any.

The root cause was a bdist_wheel.get_tag() override in src/python/setup.py that
hard-coded py3-none-any when --plat-name was not passed (which it never was).

Changes:

  • src/python/setup.py — replace the bdist_wheel + get_tag() override with
    BinaryDistribution.has_ext_modules()=True, mirroring the fix applied to
    tritonserver in triton-inference-server/core. Setuptools now auto-derives the
    correct cpXY-cpXY-linux_<arch> tag from the current interpreter.
  • src/python/build_wheel.py — add _repair_wheel_with_auditwheel() (ported from
    core/python/build_wheel.py) to upgrade the wheel to manylinux_2_X_<arch> for
    PyPI compatibility; add _detect_cuda_version() and _compose_version() for the
    +nv{release}.cu{cudaXY} local-version segment; add PEP 427 build-tag sourcing
    (CI_PIPELINE_ID / NVIDIA_BUILD_ID / BUILD_NUMBER) passed as
    --build-number=<N> to setup.py bdist_wheel; replace deprecated
    distutils.dir_util.copy_tree with shutil.copytree.
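The build-tag precedence and the local-version segment described above can be sketched as follows. This is a minimal illustration, not the code in build_wheel.py: the helper names `pick_build_tag` and `compose_version` are hypothetical, the rule that skips values not starting with a digit is an assumption based on the PEP 427 build-tag format, and the mapping of a CUDA version such as 12.8 to `cu128` is an assumed reading of `cudaXY`.

```python
import os
import re


def pick_build_tag(env=os.environ):
    """Return the first usable build identifier, or None for local builds.

    Hypothetical helper: checks CI_PIPELINE_ID, then NVIDIA_BUILD_ID,
    then BUILD_NUMBER, in that order.
    """
    for name in ("CI_PIPELINE_ID", "NVIDIA_BUILD_ID", "BUILD_NUMBER"):
        value = env.get(name, "").strip()
        # Assumption: skip values that cannot be a PEP 427 build tag,
        # which must start with a digit.
        if value and value[0].isdigit():
            return value
    return None


def compose_version(base, release=None, cuda=None):
    """Append a +nv{release}.cu{cudaXY} local-version segment when possible.

    Hypothetical helper: returns the base version unchanged if either
    the release or a parseable CUDA version is missing.
    """
    if not release or not cuda:
        return base
    match = re.match(r"(\d+)\.(\d+)", cuda)
    if match is None:
        return base
    return f"{base}+nv{release}.cu{match.group(1)}{match.group(2)}"
```

With made-up inputs, `compose_version("2.60.0", "25.04", "12.8")` yields `2.60.0+nv25.04.cu128`, and an empty environment yields no build tag, so a local build would pass no `--build-number`.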

The container image continues to receive the linux_<arch> wheel from
generic/wheel/dist/; the manylinux wheel is written to generic/ for future PyPI
publishing.

CI (internal): [#49841381](http://tritonserver.local/ci/pipelines/49841381)
Resolves: TRI-983
Related: triton-inference-server/core#495

Checklist

  • PR title reflects the change and is of format <commit_type>: <Title>
  • Changes are described in the pull request.
  • Related issues are referenced.
  • Populated github labels field
  • Added test plan and verified test passes.
  • Verified that the PR passes existing CI.
  • Verified copyright is correct on all changed files.
  • Added succinct git squash message before merging ref.
  • All template sections are filled out.

Commit Type:

  • fix

Related PRs:

triton-inference-server/core#495

Where should the reviewer start?

src/python/setup.py: the BinaryDistribution class replaces the old get_tag() override.
src/python/build_wheel.py: _repair_wheel_with_auditwheel(), _compose_version(),
and the build-tag sourcing block.
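For orientation, the general shape of the `has_ext_modules()` pattern is sketched below. It is a generic setuptools illustration, not a copy of the class in src/python/setup.py, and it assumes setuptools is installed in the build environment.

```python
from setuptools import Distribution


class BinaryDistribution(Distribution):
    """Distribution that always reports compiled content.

    Returning True here lets bdist_wheel derive an interpreter- and
    platform-specific tag instead of a pure-Python one, even when the
    .so file is shipped as package data rather than built by setup.py.
    """

    def has_ext_modules(self):
        return True


# In a setup.py, the class would be handed to setuptools via the
# distclass keyword of setup(), e.g. distclass=BinaryDistribution.
```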

Test plan:

Build inside the SDK container (requires auditwheel — added to Dockerfile.sdk in #8743):

python3 src/python/build_wheel.py \
  --dest-dir /tmp/whl \
  --binding-path <path-to-tritonfrontend_bindings.so>
ls /tmp/whl/              # manylinux wheel
ls /tmp/whl/wheel/dist/   # linux_<arch> wheel (for container install)
python3 -m wheel tags --list /tmp/whl/wheel/dist/*.whl
# Must NOT be py3-none-any
  • CI Pipeline ID: 49841381
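The "must NOT be py3-none-any" check can also be scripted from the filename alone. The sketch below relies only on the PEP 427 filename layout (name, version, optional build tag, then python, ABI, and platform tags); the helper names and the example filenames are hypothetical.

```python
def wheel_tags(filename):
    """Return (python_tag, abi_tag, platform_tag) parsed from a wheel filename."""
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel: {filename}")
    parts = filename[: -len(".whl")].split("-")
    # Layout: name-version-python-abi-platform, with an optional
    # build tag between the version and the python tag.
    if len(parts) not in (5, 6):
        raise ValueError(f"unexpected wheel filename: {filename}")
    return tuple(parts[-3:])


def is_arch_specific(filename):
    """True when the wheel is not tagged as pure-Python, any-platform."""
    _python_tag, abi_tag, platform_tag = wheel_tags(filename)
    return not (abi_tag == "none" and platform_tag == "any")
```

A CI step could call `is_arch_specific()` on each file in the dist directory and fail the job when it returns False.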

Caveats:

PyPI publishing CI jobs (py3-tritonfrontend-publish) are a follow-up task.

Background

Both tritonserver and tritonfrontend wheels were shipping py3-none-any despite
containing arch-specific .so extensions — causing Poetry/pip-tools lock file issues
for users on multi-arch setups. The tritonserver fix lives in triton-inference-server/core#495.

Related Issues:

  • Resolves: TRI-983
  • NVBug: 6098081 / JIRA: DLIS-8648

Replace the legacy bdist_wheel.get_tag() override that hard-coded
py3-none-any with BinaryDistribution.has_ext_modules()=True, mirroring
the fix applied to the tritonserver wheel in core/. Add
_repair_wheel_with_auditwheel() to build_wheel.py so the wheel is
upgraded from linux_<arch> to manylinux_2_X_<arch> for PyPI
compatibility.

The container image still receives the linux_<arch> wheel from
generic/wheel/dist/ (correct for in-container pip install); the
manylinux wheel is written to generic/ for future PyPI publishing.

Also replace the deprecated distutils.dir_util.copy_tree with
shutil.copytree (symlinks=True, dirs_exist_ok=True).
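As a small illustration of the replacement call, the sketch below copies a tree that contains a symlink into a directory that already exists. The helper names and the file names are hypothetical, and the demo assumes a platform where creating symlinks is permitted.

```python
import os
import shutil
import tempfile


def copy_into(src, dst):
    """Copy src into dst, keeping symlinks as links and tolerating an existing dst."""
    return shutil.copytree(src, dst, symlinks=True, dirs_exist_ok=True)


def demo():
    """Build a tiny tree with one symlink and copy it into a pre-existing directory."""
    with tempfile.TemporaryDirectory() as root:
        src = os.path.join(root, "src")
        dst = os.path.join(root, "dst")
        os.makedirs(src)
        os.makedirs(dst)  # already exists, as a staging directory might
        with open(os.path.join(src, "lib.so.1"), "w") as handle:
            handle.write("payload")
        os.symlink("lib.so.1", os.path.join(src, "lib.so"))
        copy_into(src, dst)
        copied_link = os.path.join(dst, "lib.so")
        # Report whether the link survived as a link, and where it points.
        return os.path.islink(copied_link), os.readlink(copied_link)
```

Without `dirs_exist_ok=True` the call would raise `FileExistsError` for the existing destination, and without `symlinks=True` the link would be replaced by a copy of its target.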
mc-nv added 4 commits April 29, 2026 11:30
…983)

Port _detect_cuda_version() and _compose_version() from core/python/
build_wheel.py so the tritonfrontend wheel gets the same
+nv{release}.cu{cudaXY} local-version segment as tritonserver.

Add PEP 427 build-tag sourcing (CI_PIPELINE_ID / NVIDIA_BUILD_ID /
BUILD_NUMBER) passed as --build-number=<N> to setup.py bdist_wheel,
mirroring the logic in core/python/build_wheel.py.

build_wheel.py prefers CI_PIPELINE_ID over NVIDIA_BUILD_ID as the PEP 427
wheel build tag, but CI_PIPELINE_ID was never forwarded into the build
container. NVIDIA_BUILD_ID is baked in as ENV from build.py --build-id,
so add the same treatment for CI_PIPELINE_ID: read it from the host
environment and emit ENV CI_PIPELINE_ID <value> in both the Linux and
Windows buildbase Dockerfiles, but only when the variable is non-empty
so local builds are unaffected.
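The "only when non-empty" rule can be sketched as a small generator of Dockerfile lines. This is an illustration under stated assumptions, not the code in build.py: the helper name `env_lines` is hypothetical, and only the `ENV <name> <value>` output form is taken from the commit message.

```python
import os


def env_lines(names, env=os.environ):
    """Return Dockerfile ENV lines for the variables that are set and non-empty.

    Hypothetical helper: unset or empty variables produce no line, so a
    local build without CI variables gets an unchanged Dockerfile.
    """
    lines = []
    for name in names:
        value = env.get(name)
        if value:
            lines.append(f"ENV {name} {value}")
    return lines
```

For example, `env_lines(["CI_PIPELINE_ID"], {"CI_PIPELINE_ID": "49841381"})` gives a single `ENV CI_PIPELINE_ID 49841381` line, while an empty mapping gives no lines.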
@mc-nv mc-nv self-assigned this Apr 29, 2026
@mc-nv mc-nv marked this pull request as ready for review April 29, 2026 20:03