Pull request overview
Adds a new Python E2E test to validate that enabling SparseAttentionMode.XATTENTION does not significantly degrade generation similarity versus the non-sparse baseline, and wires this test into the Linux CI matrix for continuous batching.
Changes:
- Added test_xattention.py to compare similarity between xattention-enabled and xattention-disabled ContinuousBatchingPipeline runs.
- Added a new Linux CI step to execute the xattention test under the continuous batching affected-components gate.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| tests/python_tests/test_xattention.py | New similarity-based regression test for XAttention vs baseline. |
| .github/workflows/linux.yml | Runs the new xattention test as an additional Cacheopt E2E CI step. |
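For readers unfamiliar with similarity-based regression checks, the idea behind the new test can be sketched roughly as follows. This is not the code from test_xattention.py: the helper names, the use of difflib as the similarity metric, and the 0.9 threshold are all assumptions made for illustration, and the two pipeline runs are replaced by plain lists of strings.

```python
from difflib import SequenceMatcher

# Hypothetical threshold; the value used by the real test is not shown in this PR page.
SIMILARITY_THRESHOLD = 0.9


def mean_similarity(baseline_outputs, candidate_outputs):
    """Average character-level similarity between paired generations."""
    if len(baseline_outputs) != len(candidate_outputs):
        raise ValueError("both runs must produce the same number of outputs")
    if not baseline_outputs:
        raise ValueError("at least one output pair is required")
    ratios = [
        SequenceMatcher(None, base, cand).ratio()
        for base, cand in zip(baseline_outputs, candidate_outputs)
    ]
    return sum(ratios) / len(ratios)


def check_no_degradation(baseline_outputs, candidate_outputs,
                         threshold=SIMILARITY_THRESHOLD):
    """Return (passed, score) for a sparse-attention run versus the baseline."""
    score = mean_similarity(baseline_outputs, candidate_outputs)
    return score >= threshold, score


if __name__ == "__main__":
    # Stand-ins for the text generated with sparse attention disabled and enabled.
    baseline = ["The capital of France is Paris."]
    sparse = ["The capital of France is Paris."]
    passed, score = check_no_degradation(baseline, sparse)
    print(f"passed={passed} score={score:.3f}")
```

In an actual test the two lists would come from running the same prompts through the pipeline with and without the sparse attention mode, and the assertion would be on the returned `passed` flag.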
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
…ssert xattention in test
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
rkazants
left a comment
Please provide a proper PR description. The PR title does not reflect the actual source changes. You appear to be adding get_schedule_config() here. Please explain in the description why it is needed.
Updated the description. Thanks! @rkazants
Description
CVS-175120
Checklist: