ONNX 1.21.0 integration by titaiwangms · Pull Request #27601 · microsoft/onnxruntime

titaiwangms · 2026-03-09T22:06:37Z

This pull request updates ONNX Runtime to support ONNX opset 26, including new operator implementations and related infrastructure changes. The most important changes are the upgrade of the ONNX dependency, addition of new opset 26 kernels (such as CumProd and BitCast), and updates to macros and versioning to ensure compatibility. Below are the key changes grouped by theme:

ONNX Dependency Upgrade:

Updated ONNX submodule and source references to the latest commit supporting opset 26, and changed versioning in vcpkg.json from 1.20.1 to 1.21.0. (cmake/deps.txt, cmake/external/onnx, cmake/vcpkg-ports/onnx/portfile.cmake, cmake/vcpkg-ports/onnx/vcpkg.json) [1] [2] [3] [4]

Opset 26 Kernel Support:

Registered new opset 26 kernels for BitCast and all supported types of CumProd in the CPU execution provider, including their instantiation and build logic. (onnxruntime/core/providers/cpu/cpu_execution_provider.cc, onnxruntime/core/providers/cpu/math/cumprod.cc, onnxruntime/core/providers/cpu/math/cumprod.h) [1] [2] [3] [4]
Increased the maximum supported opset version in the optimizer API from 25 to 26. (onnxruntime/core/optimizer/transpose_optimization/optimizer_api.h)

Build and Patch Updates:

Added a new ONNX_MINIMAL_BUILD option to ONNX CMake configuration and updated patch files for compatibility with the new ONNX version. (cmake/patches/onnx/onnx.patch, cmake/vcpkg-ports/onnx/binskim.patch) [1] [2] [3]

Macro Improvements:

Updated operator schema macros to use [[maybe_unused]] instead of the deprecated ONNX_UNUSED attribute, improving code clarity and modernizing macro usage. (onnxruntime/core/graph/contrib_ops/contrib_defs.h, onnxruntime/core/graph/dml_ops/dml_defs.h) [1] [2]

ONNX Dependency Upgrade

Updated ONNX submodule and source references to the latest commit supporting opset 26, and changed versioning in vcpkg.json from 1.20.1 to 1.21.0. [1] [2] [3] [4]

Opset 26 Kernel Support

Registered new opset 26 kernels for BitCast and all supported types of CumProd in the CPU execution provider, including their instantiation and build logic. [1] [2] [3] [4]
Increased the maximum supported opset version in the optimizer API from 25 to 26.

Build and Patch Updates

Added a new ONNX_MINIMAL_BUILD option to ONNX CMake configuration and updated patch files for compatibility with the new ONNX version. [1] [2] [3]

Macro Improvements

Updated operator schema macros to use [[maybe_unused]] instead of the deprecated ONNX_UNUSED attribute, improving code clarity and modernizing macro usage. [1] [2]

Update ONNX submodule to rel-1.21.0 branch (commit fbbe45b8e2). Update cmake/deps.txt with new URL and SHA1. Update vcpkg port (portfile.cmake, vcpkg.json) for 1.21.0. Regenerate onnx.patch and binskim.patch for 1.21.0 CMakeLists.txt changes. Update all 7 requirements.txt files to onnx==1.21.0. Bump kMaxSupportedOpset from 25 to 26 in optimizer_api.h. Fix ONNX_UNUSED macro removal (replaced with [[maybe_unused]]) in contrib_defs.h, dml_defs.h, and test_opaque_api.cc. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

BitCast (opset 26): Zero-copy tensor type reinterpretation for types with matching bit-widths. Supports all standard numeric types. Registered in cpu_execution_provider.cc with 17 passing tests. CumProd (opset 26): Cumulative product along a given axis with optional exclusive and reverse attributes. Supports float, double, int32, int64, uint32, uint64. Identity element is 1 (vs 0 for CumSum). Registered in cpu_execution_provider.cc with 33 passing tests. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Add BitCast and CumProd entries to the CPU provider kernel documentation. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

github-actions

You can commit the suggested changes from lintrunner.

Copilot

Pull request overview

Updates ONNX Runtime’s ONNX dependency and kernel surface to support ONNX opset 26 (aligned with ONNX 1.21.0), including new CPU kernels and associated CI/build/doc updates.

Changes:

Bumped ONNX Python dependencies and vcpkg/zip-based ONNX sources to 1.21.0 (and updated submodule ref).
Added new CPU opset 26 kernels (BitCast, CumProd) plus extensive unit tests.
Updated schema-registration macros to use [[maybe_unused]] and refreshed generated operator-kernel documentation.

Reviewed changes

Copilot reviewed 25 out of 25 changed files in this pull request and generated 5 comments.

Show a summary per file

File	Description
tools/ci_build/github/windows/python/requirements.txt	Bump CI Python ONNX requirement to 1.21.0.
tools/ci_build/github/linux/python/requirements.txt	Bump CI Python ONNX requirement to 1.21.0.
tools/ci_build/github/linux/docker/scripts/requirements.txt	Bump docker scripting ONNX requirement to 1.21.0.
tools/ci_build/github/linux/docker/scripts/manylinux/requirements.txt	Bump manylinux image ONNX requirement to 1.21.0.
tools/ci_build/github/linux/docker/scripts/lort/requirements.txt	Bump LoRT docker ONNX requirement to 1.21.0.
tools/ci_build/github/linux/docker/inference/aarch64/python/cpu/scripts/requirements.txt	Bump aarch64 inference image ONNX requirement to 1.21.0.
onnxruntime/test/python/requirements.txt	Bump test Python ONNX requirement to 1.21.0.
onnxruntime/test/providers/cpu/tensor/bitcast_op_test.cc	Adds CPU unit tests for new `BitCast` kernel.
onnxruntime/test/providers/cpu/math/cumprod_test.cc	Adds CPU unit tests for new `CumProd` kernel.
onnxruntime/test/opaque_api/test_opaque_api.cc	Updates schema-registration macro to use `[[maybe_unused]]`.
onnxruntime/core/providers/cpu/tensor/bitcast_op.h	Declares new CPU `BitCast` kernel.
onnxruntime/core/providers/cpu/tensor/bitcast_op.cc	Implements and registers opset-26 CPU `BitCast`.
onnxruntime/core/providers/cpu/math/cumprod.h	Declares templated CPU `CumProd` kernel and axis helper.
onnxruntime/core/providers/cpu/math/cumprod.cc	Implements and registers opset-26 CPU `CumProd`.
onnxruntime/core/providers/cpu/cpu_execution_provider.cc	Registers new opset-26 CPU kernels into the EP registry.
onnxruntime/core/optimizer/transpose_optimization/optimizer_api.h	Extends optimizer API max supported opset to 26.
onnxruntime/core/graph/dml_ops/dml_defs.h	Modernizes schema macro to `[[maybe_unused]]`.
onnxruntime/core/graph/contrib_ops/contrib_defs.h	Modernizes schema macro to `[[maybe_unused]]`.
docs/OperatorKernels.md	Refreshes generated operator-kernel listing (adds opset-26 ops, alters provider sections).
cmake/vcpkg-ports/onnx/vcpkg.json	Bumps vcpkg ONNX port to 1.21.0.
cmake/vcpkg-ports/onnx/portfile.cmake	Updates ONNX source ref/SHA to build ONNX 1.21.0 content.
cmake/vcpkg-ports/onnx/binskim.patch	Updates ONNX patch hunks for new upstream version (adds ONNX_MINIMAL_BUILD option).
cmake/patches/onnx/onnx.patch	Updates ONNX patch hunks for new upstream version (adds ONNX_MINIMAL_BUILD option).
cmake/external/onnx	Updates ONNX submodule commit to newer ref.
cmake/deps.txt	Updates ONNX zip dependency URL/hash to newer ref.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Change onnx==1.21.0 to onnx==1.21.0rc1 in all 7 requirements.txt files since the final 1.21.0 release is not yet published. Apply lintrunner auto-formatting fixes to whitespace/alignment. Verified SHA1 (deps.txt) and SHA512 (vcpkg portfile) hashes match the downloaded archives. No v1.21.0 tag exists yet — commit hash URL is correct. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

- cumprod.cc: Add #include <numeric>, validate axis tensor has exactly one element (0-D scalar or 1-D shape [1]) - bitcast_op.cc: Add null check for TensorTypeFromONNXEnum return value - OperatorKernels.md: Restore DML section that was accidentally removed during regeneration, add BitCast and CumProd entries manually Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

ONNX 1.21.0 (onnx/onnx#7675) added stricter raw_data size validation in ParseData<T>. The test had shape {4} but only 3 values {2, 64, 32}, which old ONNX silently ignored. Fix shape to {3}. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Copilot

Pull request overview

Copilot reviewed 27 out of 27 changed files in this pull request and generated 8 comments.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

- Update ONNX submodule, deps.txt, vcpkg portfile to rc2 commit a51ac075 - Update onnx==1.21.0rc2 in all 7 requirements.txt files - Fix cumprod.cc review comments (namespace, ORT_ENFORCE, type, closing brace) - Add 5 test exclusions: 4 DFT rfft/irfft tests (ORT lacks IRFFT) + 1 BitCast bool test Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

…broken tests The JSONC filter only covers Python backend tests. The C++ onnx_test_runner uses hardcoded arrays in TestCase.cc GetBrokenTests(). Add BitCast bool and DFT rfft/irfft filters to cover the C++ test runner path. ORT BitCast kernel doesn't register bool type, and ORT DFT kernel lacks IRFFT (inverse real FFT) support. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> Agent-signed-off: Developer (45720d0d) [claude-opus-4.6] Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

ONNX 1.21.0rc2 enables _GLIBCXX_ASSERTIONS (onnx/onnx#7601) which exposes pre-existing undefined behavior in Slice shape inference: std::clamp(start, 0, dim_value-1) with dim_value=0 violates lo<=hi. Add early-exit guard for both opset 10 and 11 locations in old.cc. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

The onnx.patch fix must also be in binskim.patch for Windows CI builds. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Covers the third std::clamp UB location in processSliceInputs. All three sites now patched: old.cc:2646, old.cc:6329, defs.cc:792. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

- Update cmake/deps.txt: commit hash and SHA1 for rc3 zip - Update cmake/external/onnx submodule to rc3 commit (e6c12c5fa) - Update cmake/vcpkg-ports/onnx/portfile.cmake: REF and SHA512 - Update onnx==1.21.0rc3 in all 7 requirements.txt files - Verified all vcpkg patches (binskim, fix-cmakelists, fix-dependency-protobuf) and cmake/patches/onnx/onnx.patch apply cleanly to rc3 Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> Agent-signed-off: Developer (257e49bb) [claude-opus-4.6] Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

The Slice shape inference fix for dim_value==0 (tensor/old.cc and tensor/defs.cc) was cherry-picked into ONNX rc3 natively (commit 33afebf43, PR #7739). The parameter was also renamed from 'input_rank' to 'input_dim_size_or_value'. Remove these 3 hunks from both onnx.patch and binskim.patch to prevent build failures. Retained hunks: CMakeLists ONNX_MINIMAL_BUILD, Utils.cmake protobuf warnings, GroupNormalization Deprecate removal. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> Agent-signed-off: Developer (257e49bb) [claude-opus-4.6] Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Replace the 4 sequential outer loops (forward/reverse × exclusive/non-exclusive) with concurrency::ThreadPool::TryBatchParallelFor. Each outer iteration processes an independent slice, making them safe to parallelize. Refactored from sequential pointer arithmetic (input_iter++/output_iter++) to index-based access using base offset = outer * dim * lower_dim_size, which is required for parallel execution where iterations cannot share mutable iterators. Agent-signed-off: Developer (257e49bb) [claude-opus-4.6] Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

…nx-1.21.0-integration

Update ONNX dependency from 1.21.0rc3 to 1.21.0rc4 (commit c751ddbce897). RC4 includes bug fixes (Slice SIGABRT on empty dimensions) and security hardening (ExternalDataInfo attribute injection). Changes: - cmake/deps.txt: Updated archive URL and SHA1 hash - cmake/external/onnx: Updated submodule to rc4 commit - cmake/vcpkg-ports/onnx/portfile.cmake: Updated REF and SHA512 - 7 requirements.txt files: onnx==1.21.0rc4 Agent-signed-off: Developer (dc55daf6) [claude-opus-4.6] Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

### Integrate ONNX 1.22.0rc1 (opset 27) Resolves #28752. Pin: `onnx/onnx@bc3be77bec2f628788796dff60819186bacf49df` (VERSION_NUMBER `1.22.0rc1`). ONNX **1.21.0 → 1.22.0rc1**. Max ai.onnx opset **26 → 27**. IR version **unchanged (13 / `0x0D`)**. This is the **RC validation phase** of an incremental integration (same strategy as the ONNX 1.21 bump, #27601). The formal `v1.22.0` GitHub release is still a **draft** (no git tag yet), so re-pinning to the released tag is deferred to **Phase 2** (see Follow-ups). Landing the RC now validates ONNX 1.22 against ORT before ONNX publishes the formal release. --- ### Update — ONNX 1.22.0 **FINAL** re-pin + rebase onto `upstream/main` + closes #28969 ONNX published the formal **`v1.22.0`** GitHub release, so this PR is re-pinned **rc2 → FINAL** (`onnx/onnx@v1.22.0`) — the Phase-2 step deferred in the rc1 description below. The branch was also **rebased onto `upstream/main`** to pick up the intervening optimizer/opset-26 work. The released tag tarball is a different asset hash than the RCs, so the vcpkg MS-internal asset mirror was re-seeded for the final tag (otherwise `--use_vcpkg` legs 404). **Also closes #28969** (WebGPU binary-elementwise broadcast `SIZE_MAX` underflow). ONNX 1.22's expanded-Attention reference tests exposed a latent WebGPU bug where a broadcast shape computed `dim - 1` on a zero/unit dimension and underflowed to `SIZE_MAX`; the fix is included here and the previously-skipped reference tests are re-enabled. **Opset-27 `*CurrentOpset` test handling.** ONNX 1.22.0 FINAL ships `DomainToVersionRange` **map-max 27** while the last *released* opset is **26**, so **opset 27 stays under development** for the whole 1.22 cycle. Strict legs (the default, or `ALLOW_RELEASED_ONNX_OPSET_ONLY=1`) therefore throw *"Opset 27 under development"* at model load on every `*CurrentOpset` fusion test that builds at the max opset. These tests now load with per-model `ModelOptions{/*allow_released_opsets_only*/ false, /*strict_shape_type_inference*/ false}`, extending the existing `38f17243b` / GatherToSlice precedent to the rest of the `*CurrentOpset` suite. This is **leg-agnostic** (exercises opset 27 on every CI leg, not just the relaxed ones) and **preserves opset coverage** (vs. `GTEST_SKIP`). Each call site is annotated with a one-line WHY + tracking issue (#28966) so the relaxation can be removed once opset 27 is released. `Resolves #28752` (unchanged). Closes #28969. ### Update — ONNX 1.22.0rc2 re-pin + ConvTranspose conforms to ONNX `output_shape` spec Since the original rc1 description below, this PR was re-pinned **rc1 → rc2** (`onnx/onnx@b124e0188a`, `VERSION_NUMBER 1.22.0rc2`) to pick up the upstream Xcode/iOS CMake fix (onnx#8056). rc2 also carries onnx#8051, which tightened `convTransposeShapeInference` to reject an `output_shape`/`output_padding` whose size does not match the number of spatial dimensions (per the ONNX spec clarification onnx#5400). **ONNX Runtime now conforms to that spec** instead of patching ONNX to preserve a non-standard form. **⚠️ Breaking change — ConvTranspose `output_shape` now follows the ONNX spec (spatial dimensions only).** ORT previously also accepted a non-standard `rank + 2` form that included batch and channel, i.e. `(N, C, H, W)`. As of ONNX 1.22, a `rank + 2` `output_shape` on a ConvTranspose whose input has a **statically-known rank** is rejected at `Graph::Resolve` with *"Attribute output_shape has incorrect size"*. **Migration:** specify `output_shape` with spatial dimensions only — e.g. `{1, 1, 1, 14}` → `{1, 14}` (batch and channel are always inferred from the input and weight, so results are identical; the kernel ignores `N, C`). Models whose ConvTranspose input has a **dynamic/unknown rank are unaffected** — ONNX skips the size check and ORT computes the same result (covered by the new `ConvTranspose_RankPlus2_OutputShape_DynamicRankInput_Runtime` test). **Patch inventory — supersedes "2 files, 3 hunks" below.** `cmake/patches/onnx/onnx.patch` (and its byte-identical `binskim.patch` mirror) carries **only** the `ONNX_MINIMAL_BUILD` option hunk and the GroupNormalization-18 `.Deprecate()` removal — **no ConvTranspose hunks**. rc2's strict shape-inference check is kept as-is; ORT's own test models were conformed to the spec. The upstream archive hash, `deps.txt`, `portfile.cmake`, `vcpkg.json`, and the submodule pin are unchanged. **Additional rc2 test conform.** rc2 also tightened `convPoolShapeInference` to reject `Conv` inputs with rank < 3 (*"Input tensor must have at least 3 dimensions"*). The hand-authored model in `onnxruntime/test/python/quantization/test_op_split.py` declared a spec-invalid rank-2 `Conv` input/weight; it was conformed to a valid NCHW shape (`[6, 3]` → `[1, 1, 6, 3]`, weight → `[2, 1, 1, 1]`), keeping the quantized-Split graph and expected outputs identical. No ORT source change. > This note should also seed the GitHub Release notes for the ONNX 1.22 / opset 27 milestone and the squash-commit message. --- ### What changed (29 files) **Version plumbing** - `cmake/deps.txt` — onnx archive URL → rc1 commit zip + SHA1 `421e5a9afb6c41a54696e424e5b9a3796aab6821`. - `cmake/external/onnx` — submodule → `bc3be77b`. - `cmake/vcpkg-ports/onnx/portfile.cmake` — `REF` commit form + tar.gz SHA512 `e0c526f5…3ce467`. - `cmake/vcpkg-ports/onnx/vcpkg.json` — `version-semver` `1.22.0`, `port-version` 0. - `cmake/patches/onnx/onnx.patch` + `cmake/vcpkg-ports/onnx/binskim.patch` — **byte-identical** rebase onto 1.22 (2 files, 3 hunks): kept the `ONNX_MINIMAL_BUILD` option (restructured for 1.22's new `onnx_core` OBJECT-lib / `add_subdirectory(onnx)` layout) and the GroupNormalization-18 `.Deprecate()` removal; **dropped** the `Utils.cmake` protobuf-warnings hunk (already merged upstream in 1.22). **Opset-27 op enablement (Range)** - `onnxruntime/core/providers/cpu/generator/range.cc` — split into versioned `[11, 26]` + a new unversioned `27` registration. The opset-27 kernel natively supports the existing common numeric types (float/double/int16/int32/int64). **fp16 Range is covered** via ONNX's Range-27 **function body**, which ORT expands into primitive ops at partition time. **bf16 Range is deferred to that same function expansion** — there is no native bf16 kernel, and its bf16 reference node test (`test_range_bfloat16_type_positive_delta`, base + `_expanded`) is not exercised by the Python/numpy ONNX backend series, whose harness cannot materialize bf16 (`Numpy_type 256`); a native fp16/bf16 kernel + `stash_type` handling is a follow-up (efficiency, not correctness). - `onnxruntime/core/providers/cpu/cpu_execution_provider.cc` — versioned the Range forward-declare + `BuildKernelCreateInfo` entries and added the opset-27 registration. - **CUDA Range** — same versioned `[11, 26]` + opset-27 split as CPU (`onnxruntime/core/providers/cuda/generator/range.cc` + `cuda_execution_provider.cc`); GPU-verified locally: `onnx_test_runner -e cuda` 8/8 opset-27 Range node tests pass, native Range-27 placed on CUDAExecutionProvider (fp16/bf16 via function expansion). **Optimizer / EP opset ceilings** - `…/transpose_optimization/optimizer_api.h` — `kMaxSupportedOpset` **26 → 27**. - `coreml`/`nnapi`/`vsinpu`/`webnn` `base_op_builder.h` — `GetMaxSupportedOpSet()` **25 → 27** (upper guard only; per-op support checks still gate — these EPs gain no new kernels here). **Fusion updates** - `onnxruntime/core/optimizer/gather_fusion.cc` — GatherToSlice Range version list `{1,11}` → `{1,11,27}`. - `onnxruntime/core/optimizer/embed_layer_norm_fusion.cc` — add `27` to the two Range path-matchers (`parent_path_3/4`) so embedding fusion still matches opset-27 models. - `onnxruntime/test/optimizer/graph_transform_test.cc` — new opset-27 GatherToSliceFusion test. **Requirements (7 bumped)** - All 7 CI `requirements.txt` → `onnx==1.22.0rc1` (rc1 wheel is on PyPI). The 3 transformers pins remain frozen at `1.18.0` (unrelated to this bump; intentionally untouched). **Generated docs / test data** - `js/web/docs/webgl-operators.md` — regenerated. - `docs/OperatorKernels.md` — **surgical** edit: CPU EP **and** CUDA EP Range rows (`27+` + `[11, 26]` continuation each); see caveats. - `onnxruntime/test/testdata/onnx_backend_test_series_filters.jsonc` — **comment-only**: documents why no opset-27 CPU exclusions are needed (all opset-27 node tests pass via function expansion). **Docs** - `.agents/skills/onnx-opset-bump-checklist/SKILL.md` — new reusable checklist skill distilled from this integration. Now also documents the "bump **all** execution providers together" tradition (CPU + CUDA + JS/DML assessment in one pass) so future opset bumps don't ship a partial EP set. --- ### Validation (CPU EP + CUDA EP, Linux x64) - Full build ✅ - `--minimal_build extended` build ✅ (validates the rebased `ONNX_MINIMAL_BUILD` patch hunk independently of the vcpkg mirror path) - `onnxruntime_test_all` ✅ — **1595 passed / 0 failed** - `onnx_test_runner -e cpu` on the ONNX 1.22 opset-27 node tests ✅ — **62/62 pass** via ONNX function-body expansion (run with `ALLOW_RELEASED_ONNX_OPSET_ONLY=0`), including CausalConvWithState, LinearAttention, and fp16/bf16 Range — despite no native kernels for them. - **CUDA EP (H100):** built `--use_cuda` clean in both **Debug** and **RelWithDebInfo** ✅; `onnx_test_runner -e cuda` on the opset-27 Range node tests ✅ — **8/8 pass**, with native Range-27 placed on CUDAExecutionProvider (no CPU fallback) and fp16/bf16 covered via function-body expansion. --- ### Standing caveats (please read before reviewing) 1. **CUDA EP now locally verified for Range; other GPU EPs/ops still CI-only.** The CUDA EP was built and the opset-27 **Range** node tests run locally on an H100 (8/8 pass). DML and the remaining GPU EPs/ops were **not** exercised here. Function-body expansion is EP-agnostic, so other opset-27 models are expected to run on those EPs too, but broader GPU coverage remains a CI/follow-up item. 2. **`OperatorKernels.md` updated surgically** (CPU Range row only). A CPU-only *full* regen would destructively wipe the CUDA/DML/other-EP sections (the generator only emits rows for the EPs in the built module). A correct multi-EP regen needs a build per EP and is a follow-up. 3. **Opset 27 is "under development"** in ONNX's released-versions map. ORT's load-time validation rejects opset-27 models unless `ALLOW_RELEASED_ONNX_OPSET_ONLY=0` (ORT CI already sets this). The opset-27 **schemas are always compiled in from the submodule** regardless — this gate only affects model load-time acceptance, not schema availability. 4. **EP `GetMaxSupportedOpSet` jumped 25 → 27** (skips 26). This is an *upper* guard only; raising it merely lets opset-26/27 nodes reach the per-op support checks that still gate correctness. No regression — it also retroactively un-caps opset-26 for these EPs. 5. **iOS/macOS Xcode framework build is currently broken by an upstream ONNX CMake regression** (the `onnx_core` OBJECT-library split in onnx/onnx#7733 reintroduced the Xcode breakage originally fixed by onnx/onnx#7515 for onnx/onnx#7514). This is **NOT** caused by this opset bump. Tracked upstream at [onnx/onnx#8053](onnx/onnx#8053). Non-Xcode builds (Linux/Windows/Android/WASM) and all CPU/CUDA validation are unaffected. This resolves at the **Phase 2** formal `v1.22.0` re-pin once ONNX ships the fix. --- ### Follow-ups (explicitly NOT in this PR) - **GPU/multi-EP coverage:** run opset-27 CUDA/DML node tests; regenerate `OperatorKernels.md` across all EPs. - **JS EP Range** `[11, 26]` + `27` split (currently registered open-ended at `11`; mirror the CPU/CUDA versioned split). - **DML Range opset-27 assessment** (DML uses its own `REG_INFO` registration system — assess whether an opset-27 entry is needed). - **WebGPU EP Range** opset-27 split — `range.cc` registers `Range` `.SinceVersion(11)` open-ended, so it already claims opset-27 Range; only the new bf16 type is unsupported and falls back via the `T` type-constraint (function expansion). Mirror the CPU/CUDA versioned `[11, 26]` + `27` split. - **Native kernels:** implement CPU (and EP) `CausalConvWithState` and `LinearAttention` kernels, and a native fp16/bf16 + `stash_type` Range-27 kernel (replace today's function-expansion path with efficient kernels). - **Phase 2 — formal `v1.22.0` re-pin:** re-pin `deps.txt`/submodule/portfile/requirements to the released tag once ONNX publishes it (currently blocked on ONNX tagging the release); upload the tag tarball to the vcpkg mirror. **This also restores the iOS/macOS Xcode framework build** once the upstream onnx OBJECT-library Xcode regression (caveat 5) is resolved and re-pinned. - **Tooling:** fix the pre-existing crash in `find_optimizer_opset_version_updates_required.py` (placeholder `ver` parsed as int) so it can be relied on for future bumps. --------- Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

titaiwangms and others added 3 commits March 9, 2026 21:34

Regenerate OperatorKernels.md for ONNX opset 26

f2bc8bf

Add BitCast and CumProd entries to the CPU provider kernel documentation. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

titaiwangms requested a review from Copilot March 9, 2026 22:06

github-advanced-security AI found potential problems Mar 9, 2026

View reviewed changes

Comment thread onnxruntime/core/providers/cpu/math/cumprod.cc Fixed

Comment thread onnxruntime/core/providers/cpu/tensor/bitcast_op.cc Fixed

Comment thread onnxruntime/test/providers/cpu/tensor/bitcast_op_test.cc Fixed

github-actions Bot reviewed Mar 9, 2026

View reviewed changes

Copilot AI reviewed Mar 9, 2026

View reviewed changes

titaiwangms and others added 3 commits March 9, 2026 22:26

Merge branch 'main' into onnx-1.21.0-integration

0d2a0d6

titaiwangms closed this Mar 10, 2026

titaiwangms reopened this Mar 10, 2026

titaiwangms and others added 2 commits March 11, 2026 17:17

webgl-operators.md update

29744da

titaiwangms requested a review from Copilot March 11, 2026 17:39

Copilot AI reviewed Mar 11, 2026

View reviewed changes

titaiwangms and others added 5 commits March 12, 2026 19:56

Sync Slice dim_value==0 fix to vcpkg binskim.patch

1d32aea

The onnx.patch fix must also be in binskim.patch for Windows CI builds. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Add Slice dim_value==0 fix for defs.cc (opset 13)

b691c52

Covers the third std::clamp UB location in processSliceInputs. All three sites now patched: old.cc:2646, old.cc:6329, defs.cc:792. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

xadupre reviewed Mar 16, 2026

View reviewed changes

Comment thread onnxruntime/core/providers/cpu/math/cumprod.cc Outdated

titaiwangms and others added 2 commits March 16, 2026 19:01

titaiwangms closed this Mar 16, 2026

titaiwangms reopened this Mar 16, 2026

titaiwangms and others added 4 commits March 16, 2026 21:35

Merge branch 'main' into onnx-1.21.0-integration

5812ebf

Merge remote-tracking branch 'origin/onnx-1.21.0-integration' into on…

139ca4e

…nx-1.21.0-integration

This was referenced May 25, 2026

Bump the minor-and-patch group with 16 updates dantte-lp/dc-research-mcp#12

Closed

Bump FluentAssertions and 13 others dantte-lp/dc-research-mcp#13

Closed

This was referenced Jun 1, 2026

Bump the minor-and-patch group with 16 updates dantte-lp/dc-research-mcp#14

Closed

Bump FluentAssertions and 14 others dantte-lp/dc-research-mcp#15

Closed

titaiwangms mentioned this pull request Jun 2, 2026

Integrate ONNX 1.22.0 (opset 27) — issue #28752 #28754

Merged

This was referenced Jun 8, 2026

Bump the minor-and-patch group with 17 updates dantte-lp/dc-research-mcp#16

Closed

Bump FluentAssertions and 14 others dantte-lp/dc-research-mcp#17

Closed

This was referenced Jun 15, 2026

Bump the minor-and-patch group with 17 updates dantte-lp/dc-research-mcp#18

Closed

Bump FluentAssertions and 15 others dantte-lp/dc-research-mcp#22

Closed

Conversation

titaiwangms commented Mar 9, 2026

Uh oh!

Uh oh!

Uh oh!

Uh oh!

github-actions Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

5 participants