[Paddle TensorRT] Fix int64 shape tensor in stack/arange converters breaking TRT>=10.8 engine build #79320

Open
WheelWell9876 wants to merge 1 commit into PaddlePaddle:develop from WheelWell9876:fix-pir-trt-int64-stack-arange

@WheelWell9876

PR Category

Inference

PR Types

Bug fixes

Description

Problem. On TensorRT ≥ 10.8 the Shape op returns int64 (it returned int32 up to 10.0.1).
The pd_op.stack converter builds its output-shape subgraph from a raw network.add_shape(...) and
then concatenates that int64 tensor with the int32 add_1D_constant_layer(network, 1), which breaks the
TensorRT engine build. Paddle already handles this elsewhere via converter_utils.py::trt_shape() (which
casts the Shape result back to int32 on TRT ≥ 10, per its docstring); stack_converter simply wasn't
routed through it. arange_converter has the analogous gap — only its float branch casts the quotient.

Fix (2 files):

  • python/paddle/tensorrt/impls/manipulation.py, stack_converter: replace raw
    network.add_shape(...) with trt_shape(...) so the shape tensor is int32 before the
    add_concatenation with the int32 constant.
  • python/paddle/tensorrt/impls/creation.py, arange_converter: cast the integer-branch quotient to
    int32 (mirroring the existing float branch).
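
The dtype mismatch and the fix can be illustrated with a toy model. This is a minimal sketch in plain Python, not TensorRT or Paddle code: `ToyTensor`, `toy_shape`, `toy_cast`, `toy_concat`, and `toy_shape_int32` are hypothetical stand-ins invented for illustration, and the `shape_is_int64` flag simply plays the role of "running on a TensorRT version whose Shape op returns int64".

```python
from dataclasses import dataclass
from typing import Sequence, Tuple


@dataclass(frozen=True)
class ToyTensor:
    """Hypothetical stand-in for a network tensor: a dtype tag plus values."""
    dtype: str
    values: Tuple[int, ...]


def toy_shape(dims: Sequence[int], shape_is_int64: bool) -> ToyTensor:
    """Model a Shape op whose output dtype depends on the runtime version."""
    return ToyTensor("int64" if shape_is_int64 else "int32", tuple(dims))


def toy_cast(t: ToyTensor, dtype: str) -> ToyTensor:
    """Model a cast layer: same values, new dtype tag."""
    return ToyTensor(dtype, t.values)


def toy_concat(parts: Sequence[ToyTensor]) -> ToyTensor:
    """Model a concatenation that rejects mixed operand dtypes."""
    dtypes = {p.dtype for p in parts}
    if len(dtypes) != 1:
        raise TypeError(f"incompatible types: {sorted(dtypes)}")
    values = tuple(v for p in parts for v in p.values)
    return ToyTensor(parts[0].dtype, values)


def toy_shape_int32(dims: Sequence[int], shape_is_int64: bool) -> ToyTensor:
    """Model the helper pattern: take the shape, then normalize it to int32."""
    t = toy_shape(dims, shape_is_int64)
    return t if t.dtype == "int32" else toy_cast(t, "int32")


one = ToyTensor("int32", (1,))  # plays the role of the int32 constant

# Raw shape + int32 constant: mixed dtypes, so the toy concat refuses.
try:
    toy_concat([toy_shape([2, 3], shape_is_int64=True), one])
    raw_ok = True
except TypeError:
    raw_ok = False

# Normalized shape + int32 constant: dtypes agree, so the concat succeeds.
fixed = toy_concat([toy_shape_int32([2, 3], shape_is_int64=True), one])
print(raw_ok, fixed)
```

Routing every shape query through one normalizing helper keeps the dtype decision in a single place, which is the design the description attributes to trt_shape().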

Verification — the repo's own tests go FAIL → PASS, on Turing and Blackwell. Built the released
paddlepaddle-gpu==3.3.0 (cu129) + tensorrt==10.15.1.29 and ran test/tensorrt/test_converter_*.py:

[Tesla T4, sm_75]  and  [RTX PRO 6000, sm_120]  — identical:

BEFORE this fix:
  test_converter_manipulation.py::TestStackTRTPattern::test_trt_result       FAILED
  test_converter_manipulation.py::TestStackCase2TRTPattern::test_trt_result  FAILED
    [TRT] [E] Error Code 2: graphShapeAnalyzer ... Assertion !isInFlight(p.second.symbolicRep) failed
    E  AttributeError: 'NoneType' object has no attribute 'axis'  (manipulation.py:999)
  => 2 failed, 2 passed

AFTER this fix (same tests):
  test_converter_manipulation.py::TestStackTRTPattern::test_trt_result       PASSED
  test_converter_manipulation.py::TestStackCase2TRTPattern::test_trt_result  PASSED
  => 4 passed

The existing TestStackTRTPattern / TestStackCase2TRTPattern already cover the path, so no new test
is needed: they fail on TRT ≥ 10.8 without this change and pass with it (and pass on TRT < 10.8
either way, where Shape still returns int32, which is why this regressed silently on CI). In larger
graphs the same int64 shape tensor instead trips Error Code 4: ... incompatible types Int32 and Int64
(observed with PP-FormulaNet_plus-L).

Note on arange: the existing arange tests use constant inputs and pass before and after, so they
don't independently exercise this; the integer-branch cast is included as the obvious consistency fix
matching the float branch (the same issue surfaces at the model level, e.g. PP-DocLayout-L). Happy to
split it into a follow-up if preferred.
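
For context on why the quotient's dtype matters, the sketch below computes the element count of an integer arange in plain Python. It assumes the count is derived as max(0 - floor((start - end) / step), 0), which is one standard way to obtain ceil((end - start) / step) with integer arithmetic; the converter's actual arithmetic is not reproduced on this page, and `arange_count` is a hypothetical name. In a typed graph, the zero and the quotient are operands of the same subtraction, so they must share a dtype.

```python
def arange_count(start: int, end: int, step: int) -> int:
    """Number of elements in an integer arange(start, end, step)."""
    if step == 0:
        raise ValueError("step must be non-zero")
    # Floor division; negative or zero whenever the range is non-empty.
    quotient = (start - end) // step
    zero = 0
    # Subtract from zero to flip the sign, then clamp empty ranges to 0.
    return max(zero - quotient, zero)


# Compare against Python's built-in range for a few integer cases.
for args in [(0, 10, 3), (10, 0, -3), (5, 0, 1), (0, 9, 3)]:
    print(args, arange_count(*args), len(range(*args)))
```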

Regression sources: pd_op.stack (#68839), pd_op.arange (#68757). Good cherry-pick candidate for
release/3.3 / release/3.4.

Fixes #79319.

Does this change precision?

No. Shape/index tensors are bounded well within int32 range; the change only makes the TensorRT
operand dtype consistent so the engine builds, and is numerically equivalent.
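
The equivalence claim rests on the values fitting in int32. A minimal sketch of that reasoning, using only the Python standard library; the helper names and sample dimensions are illustrative and not taken from the PR:

```python
import ctypes

INT32_MIN, INT32_MAX = -(2**31), 2**31 - 1


def fits_int32(value: int) -> bool:
    """True when the value is representable as a signed 32-bit integer."""
    return INT32_MIN <= value <= INT32_MAX


def cast_to_int32(value: int) -> int:
    """Wrap-around cast to a signed 32-bit integer (ctypes does no overflow check)."""
    return ctypes.c_int32(value).value


# Illustrative shape dimensions: each one survives the cast unchanged.
dims = [1, 3, 224, 224]
lossless = all(fits_int32(d) and cast_to_int32(d) == d for d in dims)

# A value outside the range does not survive, which is the case the
# "bounded well within int32" argument rules out.
too_big = 2**31
print(lossless, fits_int32(too_big), cast_to_int32(too_big) == too_big)
```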

@CLAassistant

CLAassistant commented Jun 15, 2026

CLA assistant check
All committers have signed the CLA.

@PaddlePaddle-bot PaddlePaddle-bot left a comment

🤖 Paddle-CI-Agent | pr_review | 2026-06-16 07:21:05

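📋 Review Summary

PR overview: Fixes an engine build failure under TensorRT 10.8+ caused by the stack/arange converters mixing an int64 shape tensor into an int32 shape subgraph.
Scope of changes: python/paddle/tensorrt/impls/creation.py, python/paddle/tensorrt/impls/manipulation.py
Impact tag: [Inference]

Issues

| Level | File | Summary |
|---|---|---|
| 🟡 Suggestion | python/paddle/tensorrt/impls/creation.py:134 | The runtime-tensor integer branch of arange has no corresponding regression test |

📝 PR Convention Check

Compliant.

Overall Assessment

Switching stack_converter to trt_shape() is consistent with the semantics of the existing shape helper and normalizes the shape output on TRT>=10 back to int32; the integer-branch cast in arange_converter likewise aligns with the float branch and with the int32 constraint of the zero_tensor used afterwards. The main remaining risk is that the existing arange tests cover only constant inputs, so adding a case with runtime tensor inputs is recommended to lock in this fix.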

quotient_tensor = f_quotient_tensor
# zero_tensor (above) is int32; on TRT>=10 the integer quotient is int64,
# which would mismatch in the trt_sub below. Cast it to int32 too.
quotient_tensor = trt_cast(

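🟡 Suggestion: This cast fixes the non-float branch taken when start/end/step are TensorRT input tensors, but the existing arange converter tests use only constant inputs (test/tensorrt/test_converter_creation.py:109-114, where feed_list is empty). The constant-folding path does not reliably cover the runtime subgraph in which f_quotient_tensor becomes int64 on TRT>=10 and is then combined with zero_tensor in trt_sub; if this cast is later removed or broken, the existing tests may still pass.

Suggested fix:
Add an arange TensorRT test case that puts start, end, and step into feed_list (for example, three int64 inputs of shape [1]) and sets the corresponding input shape data, so that the integer branch of pd_op.arange reaches the converter with runtime tensors; on TRT>=10.8 this case should fail the engine build if the cast is removed.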

@paddle-bot paddle-bot Bot added the contributor External developers label Jun 16, 2026

Labels

contributor External developers

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[PIR-TensorRT] pd_op.stack / pd_op.arange converters fail the TensorRT engine build on TRT >= 10.8 (int64 shape tensor / Int32-Int64 mismatch)

3 participants