contrib: add mask/input shape consistency checks in MaxpoolWithMask::Compute by Copilot · Pull Request #28223 · microsoft/onnxruntime

Copilot · 2026-04-24T15:00:34Z

Description

Replaces the long-standing // TODO: fix this checker later comment in MaxpoolWithMask::Compute with real input validation. Without these checks, a mismatched mask silently causes out-of-bounds memory access.

Changes:

contrib_ops/cpu/maxpool_with_mask.h — Added three ORT_RETURN_IF_NOT guards:
- Mask must have the same number of dimensions as the input tensor
- Mask N and C dimensions must be nonzero when input is non-empty (prevents modulo-by-zero in total_mask_channels)
- Each spatial dimension (dim ≥ 2) of the mask must match the corresponding input dimension
test/contrib_ops/maxpool_mask_test.cc — Added three failure-case tests:
- MaxPoolWithMask_SpatialDimMismatch — mask spatial dims differ from input
- MaxPoolWithMask_DimCountMismatch — mask rank differs from input rank
- MaxPoolWithMask_MaskEmptyBatchDim — mask N=0 with non-empty input triggers the nonzero N/C guard

Motivation and Context

The mask tensor is indexed using the input's spatial step size (x_step = height * width, etc.), so a shape mismatch leads to silent out-of-bounds reads. Additionally, total_mask_channels = m_shape[0] * m_shape[1] is used as a modulo divisor in the per-channel offset formula; if either dimension is zero while the input is non-empty, this causes undefined behaviour (division by zero). The original code had a commented-out check with a TODO acknowledging this gap; this PR closes it.

Agent-Logs-Url: https://github.com/microsoft/onnxruntime/sessions/3ff41f92-9af2-46fc-9d9f-029cd4f52233 Co-authored-by: xadupre <22452781+xadupre@users.noreply.github.com>

Copilot

Pull request overview

This PR replaces a placeholder TODO in the com.microsoft::MaxpoolWithMask CPU kernel with concrete mask-vs-input shape validation to prevent unsafe indexing when mask shapes don’t match the input tensor.

Changes:

Added runtime guards in MaxpoolWithMask::Compute to require identical rank for X and M, and identical spatial dimensions (dims ≥ 2).
Added negative tests to ensure shape mismatches fail with clear error messages.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.

File	Description
onnxruntime/contrib_ops/cpu/maxpool_with_mask.h	Adds input/mask rank + spatial-dimension consistency checks in `Compute`.
onnxruntime/test/contrib_ops/maxpool_mask_test.cc	Adds two failure-case tests covering rank mismatch and spatial-dimension mismatch.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>

github-actions

You can commit the suggested changes from lintrunner.

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>

titaiwangms · 2026-05-12T17:37:59Z

Multi-reviewer synthesis (readability + code + critical + deep)

Strong cross-reviewer consensus. The new guards do close the OOB they target — deep-reviewer derived the bound rigorously: with total_mask_channels ≥ 1, x_step ≥ 1, max mask access (total_mask_channels − 1) + (x_step − 1) < total_mask_channels · x_step ✓. Empty-input boundary cases (total_channels == 0, x_step == 0) are also safe via the parallel-for short-circuit. However, the analysis surfaced a deeper pre-existing indexing bug that this PR effectively papers over by deleting the TODO.

Major

1. Pre-existing mask-channel offset formula is mathematically wrong.
At maxpool_with_mask.h:44, 91, 151:

const int32_t* m_d = M_data + (c * x_step) % total_mask_channels;

Result is bounded by total_mask_channels − 1 — i.e. always a small offset into mask-channel 0. To address mask channel c (after the new spatial-equality guard makes mask spatial size == x_step), the correct formula is (c % total_mask_channels) * x_step.

Hand-traced the existing passing test (x = m = {1,3,8,8}, x_step=64, total_mask_channels=3):

c	code computes	should be	result
0	`(0·64)%3 = 0`	0	start of mask ch0 ✓
1	`(1·64)%3 = 1`	64	2nd element of mask ch0 ✗
2	`(2·64)%3 = 2`	128	3rd element of mask ch0 ✗

The existing test only passes because the per-channel maxima all happen to live in column 0. Move the channel-1 max to column 3+ and the existing test would fail with the buggy formula.

This PR removes the // TODO: fix this checker later comment, signalling the kernel is "fixed" — but only the bounds aspect is fixed; the channel indexing remains broken for any C ≥ 2 with non-uniform per-channel masks.

Ask: either (a) tighten guard #2 to require strict m_shape[0] == x_shape[0] && m_shape[1] == x_shape[1] and switch the offset to M_data + c * x_step (no modulo, simplest contract), or (b) fix the modulo to (c % total_mask_channels) * x_step, document the broadcast semantics, and add a multi-channel test where per-channel maxima are NOT at column 0 so the math bug is observable.

2. Guard #2 phrasing is awkward and hides the actual invariant.

const bool input_has_nonzero_channels = x_shape[0] > 0 && x_shape[1] > 0;
ORT_RETURN_IF_NOT(!input_has_nonzero_channels || (m_shape[0] > 0 && m_shape[1] > 0), ...);

Issues:

The variable name says "channels" but x_shape[0] is N (batch).
The implication / De Morgan form obscures the real reason: prevent UB from % total_mask_channels when divisor would be 0.
The error message describes a symptom of the implementation, not the contract.

Cleaner equivalent (still empty-input-safe because the parallel-for body never runs when total_channels == 0):

const int64_t total_channels      = x_shape[0] * x_shape[1];
const int64_t total_mask_channels = m_shape[0] * m_shape[1];
ORT_RETURN_IF_NOT(total_channels == 0 || total_mask_channels > 0,
                  "Mask must have at least one channel when input is non-empty. ...");

3. kernel_shape.size() == input_rank − 2 not enforced.
The new spatial loop runs over every dim ≥ 2 of X but doesn't validate that kernel_shape.size() matches the input's spatial rank. A model with rank-3 X and 2D kernel_shape still passes the new mask validation, then later code reads x_shape[3] / output_dims[3] (out of range) and pool_attrs_.SetOutputSize indexes kernel_shape[dim] past its end. The trust-boundary spirit of this PR argues for an explicit guard:

const size_t spatial_rank = x_shape.NumDimensions() - 2;
ORT_RETURN_IF_NOT(spatial_rank == pool_attrs_.kernel_shape.size(), ...);
ORT_RETURN_IF_NOT(spatial_rank >= 1 && spatial_rank <= 3, ...);

Minor

No test for guard Remove vsts test runner in cmake file #2 (the div-by-zero one) — every reviewer flagged this. The most algorithmically novel guard, with the most awkward message, has no executable documentation. Add e.g. x = {1,1,8,8}, m = {0,1,8,8} expecting the guard's substring.
N/C mismatch behavior is undocumented and untested. The new checks deliberately don't enforce m_shape[0] == x_shape[0] / m_shape[1] == x_shape[1], but no test pins down whether channel broadcasting is allowed-and-correct or allowed-and-broken. Coupled to finding Set up CI with Azure Pipelines #1.
Spatial loop bound for (size_t i = 2; i < x_shape.NumDimensions(); ++i) is over-strict if x_shape.NumDimensions() > kernel_shape.size() + 2. Should be bounded by kernel_shape.size() + 2 to match what the kernel actually reads.

Nits

Inconsistent formatting between guard Set up CI with Azure Pipelines #1's "mask dims: X input dims: Y" and guard Remove vsts test runner in cmake file #2's "mask N=X C=Y …".
Test placeholder comment ("Placeholder output shape and values…") duplicated verbatim in both new tests — factor into a helper or top-of-section comment.
Test suite name MaxPoolWithMask_* (capital P) doesn't match op name MaxpoolWithMask (lowercase p). Pre-existing inconsistency, but the new tests propagate it.
Guard update HighLevelDesign.md #3's error message could include both shapes inline like guard Set up CI with Azure Pipelines #1 does, for parity.

Praise

Replacing a long-standing TODO with real validation is the right move.
Guard ordering is correct: rank → nonzero N/C → per-spatial. Each later guard relies on the earlier holding, avoiding OOB during validation itself.
Guard Remove vsts test runner in cmake file #2's empty-input short-circuit means valid empty graphs aren't rejected.
Tests use kExpectFailure with substring matching — proper regression discipline.
Bounds analysis: the three guards together make the modulo-based offset provably in-bounds, including all empty-input boundary cases.

Recommendation

Request changes on finding #1 (the offset formula): either tighten the contract (simplest) or fix the math + add a multi-channel test. Without that, this PR fixes the OOB but cements the silently-incorrect indexing. Findings #2 and #3 are cheap follow-ups worth folding in. Minors are nice-to-have.

Reviewed by a cross-family reviewer team (Claude readability + GPT code & critical + Claude-high deep). Bounds derivation and the multi-channel offset-formula trace independently verified.

tianleiwu

Review Summary

This PR correctly addresses a real safety gap by replacing a long-standing TODO with proper mask-vs-input shape validation in MaxpoolWithMask::Compute. The spatial dimension loop check is well-generalized (works for 1D/2D/3D), the N/C nonzero guard prevents modulo-by-zero crashes, and error messages are clear. Two failure-mode tests cover the key validation paths.

Minor suggestions (non-blocking):

N/C batch-broadcast documentation: The validation ensures mask N and C are nonzero but does not check whether they match the input's N and C. The downstream indexing uses (c * x_step) % total_mask_channels, which silently "broadcasts" the mask when m_shape[0] * m_shape[1] != x_shape[0] * x_shape[1]. If this modulo-based broadcasting is intentional, a brief comment would prevent future confusion.
Missing test for N/C nonzero check: The PR adds the input_has_nonzero_channels validation path but has no test exercising the failure case (e.g., mask shape {0, 1, 8, 8} with input shape {1, 1, 8, 8}).

Overall: well-scoped, low-risk change with good test coverage.

github-actions

You can commit the suggested changes from lintrunner.

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>

Agent-Logs-Url: https://github.com/microsoft/onnxruntime/sessions/9e152742-abd5-4f85-a3dc-dc7073befa07 Co-authored-by: xadupre <22452781+xadupre@users.noreply.github.com>

tianleiwu

Review Summary

Second review on the updated head. Both suggestions from the prior round are addressed:

N/C broadcasting comment: Added, documenting that the modulo-based mask broadcasting is intentional.
N/C nonzero test: MaxPoolWithMask_MaskEmptyBatchDim now covers the guard against division-by-zero in total_mask_channels.

The three-tier validation (rank equality → N/C nonzero guard → spatial dimension match) is correctly ordered and all branches have test coverage. Clean and ready to merge.

tianleiwu · 2026-05-14T23:53:02Z

@xadupre, please merge/rebase main branch to pass CI.

…copilot/check-consistency-mask-shape

Initial plan

d3e93c1

Copilot AI assigned Copilot and xadupre Apr 24, 2026

Copilot started work on behalf of xadupre April 24, 2026 15:00 View session

Copilot AI linked an issue Apr 24, 2026 that may be closed by this pull request

function onnxruntime!onnxruntime::contrib::MaxpoolWithMask::Compute, check consistency between the mask and shape #28222

Closed

Copilot AI and others added 2 commits April 24, 2026 15:48

Add shape consistency validation in MaxpoolWithMask::Compute

2735242

Agent-Logs-Url: https://github.com/microsoft/onnxruntime/sessions/3ff41f92-9af2-46fc-9d9f-029cd4f52233 Co-authored-by: xadupre <22452781+xadupre@users.noreply.github.com>

Clarify placeholder output comments in failure test cases

b050564

Agent-Logs-Url: https://github.com/microsoft/onnxruntime/sessions/3ff41f92-9af2-46fc-9d9f-029cd4f52233 Co-authored-by: xadupre <22452781+xadupre@users.noreply.github.com>

Copilot AI changed the title ~~[WIP] Fix consistency check between mask and shape in MaxpoolWithMask~~ contrib: add mask/input shape consistency checks in MaxpoolWithMask::Compute Apr 24, 2026

Copilot finished work on behalf of xadupre April 24, 2026 15:51

Copilot AI requested a review from xadupre April 24, 2026 15:51

xadupre requested a review from Copilot May 5, 2026 15:45

Copilot started reviewing on behalf of xadupre May 5, 2026 15:46 View session

Copilot AI reviewed May 5, 2026

View reviewed changes

Comment thread onnxruntime/contrib_ops/cpu/maxpool_with_mask.h

Potential fix for pull request finding

8e3bc56

Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>

github-actions Bot reviewed May 5, 2026

View reviewed changes

Comment thread onnxruntime/contrib_ops/cpu/maxpool_with_mask.h Outdated

Update onnxruntime/contrib_ops/cpu/maxpool_with_mask.h

4bc6022

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>

xadupre marked this pull request as ready for review May 6, 2026 12:01

titaiwangms requested a review from tianleiwu May 12, 2026 18:07

tianleiwu previously approved these changes May 12, 2026

View reviewed changes

Comment thread onnxruntime/contrib_ops/cpu/maxpool_with_mask.h

Comment thread onnxruntime/test/contrib_ops/maxpool_mask_test.cc

add comment

94b3431

xadupre dismissed tianleiwu’s stale review via 94b3431 May 13, 2026 08:27

Copilot started work on behalf of xadupre May 13, 2026 08:30 View session

github-actions Bot reviewed May 13, 2026

View reviewed changes

Comment thread onnxruntime/contrib_ops/cpu/maxpool_with_mask.h Outdated

xadupre and others added 2 commits May 13, 2026 10:52

Update onnxruntime/contrib_ops/cpu/maxpool_with_mask.h

03ea614

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>

Add test for mask N/C nonzero validation path in MaxpoolWithMask

1f73169

Agent-Logs-Url: https://github.com/microsoft/onnxruntime/sessions/9e152742-abd5-4f85-a3dc-dc7073befa07 Co-authored-by: xadupre <22452781+xadupre@users.noreply.github.com>

Copilot finished work on behalf of xadupre May 13, 2026 09:07

xadupre approved these changes May 13, 2026

View reviewed changes

tianleiwu approved these changes May 13, 2026

View reviewed changes

xadupre added 2 commits May 18, 2026 11:03

Merge branch 'main' of https://github.com/microsoft/onnxruntime into …

3d4849d

…copilot/check-consistency-mask-shape

Merge branch 'main' of https://github.com/microsoft/onnxruntime into …

2b2fa57

…copilot/check-consistency-mask-shape

tianleiwu merged commit 9883e93 into main May 18, 2026
88 checks passed

tianleiwu deleted the copilot/check-consistency-mask-shape branch May 18, 2026 22:27

BrewTestBot mentioned this pull request Jun 19, 2026

onnxruntime 1.27.0 Homebrew/homebrew-core#288892

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

contrib: add mask/input shape consistency checks in MaxpoolWithMask::Compute#28223

contrib: add mask/input shape consistency checks in MaxpoolWithMask::Compute#28223
tianleiwu merged 10 commits into
mainfrom
copilot/check-consistency-mask-shape

Copilot AI commented Apr 24, 2026 •

edited

Loading

Uh oh!

Copilot AI left a comment

Uh oh!

Uh oh!

github-actions Bot left a comment

Uh oh!

Uh oh!

titaiwangms commented May 12, 2026

Uh oh!

tianleiwu left a comment

Uh oh!

Uh oh!

Uh oh!

github-actions Bot left a comment

Uh oh!

Uh oh!

tianleiwu left a comment

Uh oh!

tianleiwu commented May 14, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

5 participants

Conversation

Copilot AI commented Apr 24, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Description

Motivation and Context

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

Uh oh!

github-actions Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

titaiwangms commented May 12, 2026

Multi-reviewer synthesis (readability + code + critical + deep)

Major

Minor

Nits

Praise

Recommendation

Uh oh!

tianleiwu left a comment

Choose a reason for hiding this comment

Review Summary

Uh oh!

Uh oh!

Uh oh!

github-actions Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

tianleiwu left a comment

Choose a reason for hiding this comment

Review Summary

Uh oh!

tianleiwu commented May 14, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

5 participants

Copilot AI commented Apr 24, 2026 •

edited

Loading