Add more fusion test cases (part 1) #2896
Merged
gramalingam merged 16 commits into main · Apr 22, 2026
Conversation
Use @script with msft_op for model construction. Positive cases with numerical validation: all bias combinations (Q+K+V, Q-only, K-only, V-only, Q+K) verify that the original and fused models produce equivalent outputs via ORT. Negative cases: no biases, INT32 dtype rejection, rank-2 shape mismatch. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Models now use symbolic dimension names in input_types/output_types to better reflect real-world models, while concrete values (_B, _S) are still used for numpy test data generation. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
5 tests covering:
- Scalar float constant scale → fused into MHA scale attribute
- Integer scale constant → fused
- Existing MHA scale attribute → combined with external scale
- No Mul before MHA → no fusion (negative)
- Dynamic (non-constant) scale input → no fusion (negative)
All positive tests include numerical validation via ORT.
Uses symbolic dims ("B", "S") in input/output types.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
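The scale fusion above rests on a simple algebraic fact: a constant Mul feeding MHA commutes out of the QK^T product, so it can be folded into the MHA `scale` attribute. A minimal numpy sketch (illustrative names, not the onnxscript API) of why the original and fused graphs agree numerically:

```python
import numpy as np

rng = np.random.default_rng(0)
q = rng.standard_normal((4, 8))
k = rng.standard_normal((4, 8))
v = rng.standard_normal((4, 8))

def softmax(x):
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def attention(q, k, v, scale):
    # Scaled dot-product attention: scores = scale * (Q @ K^T)
    return softmax(scale * (q @ k.T)) @ v

mha_scale = 1.0 / np.sqrt(8)  # MHA's own scale attribute
extra = 0.5                   # constant Mul applied to Q before MHA

# Original graph: Mul(extra) on Q, then MHA with mha_scale.
original = attention(extra * q, k, v, mha_scale)
# Fused graph: the Mul folded into the MHA scale attribute.
fused = attention(q, k, v, mha_scale * extra)

assert np.allclose(original, fused)
```

This is also why a dynamic (non-constant) scale input blocks the fusion: the combined attribute value must be known at rewrite time.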
Use op.Constant(value=ir.tensor(...)) inside @script() functions to define scale as a graph constant directly, instead of creating it as a graph input then converting post-hoc. Simpler and more realistic. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Use tuples instead of lists for class-level _3D and _OUT constants. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Use "from onnxscript import values" instead of "import onnxscript" to avoid mixing import and from-import for the same module. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
4 tests covering:
- Both Mul orderings: scale*normalized and normalized*scale (parameterized)
- Mixed-precision: fp16 input with fp32 compute via Cast
- Integer input dtype rejected (negative)
All positive tests include numerical validation via ORT.
Uses symbolic dims ("B", "S") and @script() model construction.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
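The mul-ordering and mixed-precision cases can be sketched in numpy (the formula follows the ONNX RMSNormalization definition, x / sqrt(mean(x²) + eps) · scale; function names here are illustrative):

```python
import numpy as np

def rms_norm(x, scale, eps=1e-5, mul_order="scale*norm"):
    # RMS normalization: x / sqrt(mean(x^2) + eps), then scaled
    norm = x / np.sqrt((x * x).mean(axis=-1, keepdims=True) + eps)
    return scale * norm if mul_order == "scale*norm" else norm * scale

x = np.random.default_rng(1).standard_normal((2, 3, 8)).astype(np.float32)
scale = np.full(8, 0.5, dtype=np.float32)

# Mul is commutative, so both graph orderings must fuse to the same op.
a = rms_norm(x, scale, mul_order="scale*norm")
b = rms_norm(x, scale, mul_order="norm*scale")
assert np.allclose(a, b)

# Mixed precision: fp16 input, fp32 compute via Cast, fp16 output
# (corresponds to a fused op computing in float32).
x16 = x.astype(np.float16)
y16 = rms_norm(x16.astype(np.float32), scale).astype(np.float16)
assert y16.dtype == np.float16
```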
5 structural tests covering 4 rule variants:
- Basic MHA with key transposed (BHSd format)
- Basic MHA with key not transposed (BSHd format)
- MHA with past key/value (has_past_present=True, 3 outputs)
- MHA with RotaryEmbedding on Q and K
- Rank-2 query shape rejection (negative)
Tests are structural-only (no ORT run) since the pattern requires internal SDPA nodes (ai.onnxruntime._fusion) that ORT cannot execute. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
The original negative test had a tuple-vs-list comparison bug: get_ints() returns a tuple, so perm == [0,2,1,3] was always False, meaning the corruption never happened. Fixed to compare with tuple. Confirmed the fusion correctly rejects mismatched Transpose perms. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
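The bug hinges on a Python comparison rule: a tuple never compares equal to a list, even with identical elements. A minimal repro (the `perm` value stands in for what `get_ints()` returns):

```python
# get_ints() returns a tuple, so the original guard compared tuple to list.
perm = (0, 2, 1, 3)

# Buggy comparison: tuple == list is always False in Python,
# so the intended corruption branch was never taken.
assert (perm == [0, 2, 1, 3]) is False

# Fixed comparison: tuple to tuple.
assert perm == (0, 2, 1, 3)
```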
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
22 new tests across 4 files:

rotary_embedding_unit_test.py (3 tests):
- Full rotary embedding pattern fusion
- Partial rotary embedding (adds rotary_embedding_dim attribute)
- 3D input rejection (negative)

skip_normalization_unit_test.py (8 tests):
- SkipRmsNorm: both Add orderings via OrValue (parameterized)
- SkipRmsNorm: post-add bias and pre-add bias variants
- SkipLayerNorm: no bias and post-add bias
- No skip Add (negative), rank-2 input (negative)

_rms_normalization_extended_test.py (5 tests):
- Both mul_order variants: scale*norm and norm*scale (parameterized)
- Mixed-precision: fp16 input with fp32 compute via Cast
- Double precision
- Integer input rejection (negative)

_layer_norm_extended_test.py (6 tests):
- OrValue: Pow(deviation, 2) vs Mul(deviation, deviation)
- OrValue: Div(deviation, std_dev) vs Mul(deviation, Reciprocal)
- Both OrValue alternatives combined
- Div + bias fusion
- Double precision
- fp16 input rejection (negative)

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
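The SkipRmsNorm cases exercise the residual-Add-plus-normalization composite; because Add is commutative, the OrValue pattern must match both operand orderings. A numpy sketch of the equivalence the parameterized tests rely on (function names illustrative):

```python
import numpy as np

def rms_norm(x, scale, eps=1e-5):
    # RMS normalization followed by elementwise scale
    return scale * x / np.sqrt((x * x).mean(axis=-1, keepdims=True) + eps)

rng = np.random.default_rng(2)
x = rng.standard_normal((2, 8)).astype(np.float32)
skip = rng.standard_normal((2, 8)).astype(np.float32)
scale = rng.standard_normal(8).astype(np.float32)

# SkipRmsNorm fuses the residual Add into the normalization; both
# Add orderings are the same computation, hence one fused op.
a = rms_norm(x + skip, scale)
b = rms_norm(skip + x, scale)
assert np.allclose(a, b)
```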
Use ONNX reference implementation (ORT lacks RMSNormalization kernel) to verify original and fused models produce identical results for float32 and fp16 tests. Double precision remains structural-only since the reference impl doesn't support stash_type=DOUBLE. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
All 6 positive tests now verify original vs fused model outputs match using ORT inference. Uses concrete dims for test data while keeping the structural assertions. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- Remove unused local variable 'input_data' in _layer_norm_extended_test.py
- Remove unused global '_EPS_F' in skip_normalization_unit_test.py
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
justinchuby
approved these changes
Apr 21, 2026
Collaborator
Some tests failing
Serializing fused models with RMSNormalization requires onnx opset >= 23. On older onnx versions, tests now fall back to structural checks only (fusion count + op type assertions still run). Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
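The fallback can be expressed as a simple version gate. A hypothetical sketch (helper names are mine; the assumption is that opset 23, which introduced RMSNormalization, first shipped in onnx 1.18):

```python
def parse_version(v: str) -> tuple:
    # Reduce "1.17.0" -> (1, 17) for ordered comparison
    return tuple(int(p) for p in v.split(".")[:2])

def can_serialize_rms_norm(onnx_version: str) -> bool:
    # RMSNormalization requires opset >= 23; assumed first available
    # in onnx 1.18, so older installs get structural checks only.
    return parse_version(onnx_version) >= (1, 18)

assert not can_serialize_rms_norm("1.17.0")  # CI pin: structural-only path
assert can_serialize_rms_norm("1.18.1")      # full serialization + ORT path
```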
Collaborator
Author
Addressed. This is due to CIs that use onnx==1.17. I wonder why we have so many of these (compared to CIs on later onnx versions). According to Copilot, this is what we have:

CI onnx version matrix
justinchuby
approved these changes
Apr 21, 2026
Add more unit test cases for fusion rules. (Currently, these are tested via real-world models in the benchmark-suite elsewhere, but unit tests are missing for various fusions.)