
Gpu acceleration #15930

Closed
dmeliksetian wants to merge 2 commits into Qiskit:main from dmeliksetian:GPU-acceleration


@dmeliksetian
Contributor

Summary

Offloads the boolean tensor operations in SparsePauliOp.compose() to the GPU via CuPy when available and the intermediate tensor size exceeds 5,000,000 elements (self.size × other.size × num_qubits). Falls back silently to NumPy when CuPy is not installed or the tensor is below the threshold — no behaviour change for existing users.
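The dispatch rule described above can be sketched as follows. This is a minimal illustration, not the code in the PR: the function names `intermediate_tensor_size` and `select_backend` are hypothetical, the constant mirrors the 5,000,000-element threshold quoted above, and CuPy availability is probed with the standard library instead of Qiskit's optional-dependency machinery.

```python
import importlib.util

# Threshold quoted in the PR description, in elements of the intermediate tensor.
GPU_COMPOSE_THRESHOLD = 5_000_000


def intermediate_tensor_size(self_size: int, other_size: int, num_qubits: int) -> int:
    """Element count used for the decision: self.size * other.size * num_qubits."""
    return self_size * other_size * num_qubits


def select_backend(self_size: int, other_size: int, num_qubits: int, has_cupy=None) -> str:
    """Return "cupy" only when CuPy is importable and the tensor exceeds the threshold.

    Hypothetical helper: `has_cupy` can be passed explicitly (useful in tests);
    when it is None, availability is detected without importing CuPy.
    """
    if has_cupy is None:
        has_cupy = importlib.util.find_spec("cupy") is not None
    size = intermediate_tensor_size(self_size, other_size, num_qubits)
    if has_cupy and size > GPU_COMPOSE_THRESHOLD:
        return "cupy"
    # Silent fallback: CuPy missing, or the tensor is at or below the threshold.
    return "numpy"
```

As a worked case, and assuming both operands have 800 terms on 10 qubits, the intermediate tensor has 800 × 800 × 10 = 6,400,000 elements, which exceeds the threshold, so `select_backend(800, 800, 10, has_cupy=True)` picks `"cupy"`. Two 100-term operands give 100,000 elements and stay on NumPy.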

Closes #15929

Details and comments

Changes

  • qiskit/utils/optionals.py — add HAS_CUPY lazy optional tester
  • SparsePauliOp.compose() — select xp = cupy / numpy based on tensor size; keep the entire qargs branch (including repeat + scatter assignment) on GPU, transferring back to CPU only once before constructing BasePauli
  • test/python/.../test_sparse_pauli_op.py — add TestSparsePauliOpGPU correctness tests (skip when CuPy not installed) covering qargs=None, front=True, qargs set, and CPU/GPU parity
  • test/benchmarks/quantum_info.py — add SparsePauliOpGPUComposeBench and SparsePauliOpGPUComposeQargsBench with paired time_compose_cpu / time_compose_gpu methods for direct ASV comparison
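The "transfer back to CPU only once" idea in the `SparsePauliOp.compose()` item can be illustrated with a small array-module sketch. It is an assumption-laden illustration and not the PR's implementation: `embed_sub_block` and its parameters are hypothetical names, and the layout (repeat the base rows, then scatter the sub-block into the `qargs` columns) is a simplified stand-in for the real qargs branch.

```python
import numpy as np

try:
    import cupy as cp  # optional dependency; the sketch also runs without it
except ImportError:
    cp = None


def embed_sub_block(sub_bits, base_bits, qargs, repeats, xp=np):
    """Hypothetical helper: repeat rows and scatter a sub-block, all on one device.

    sub_bits:  (n_terms * repeats, len(qargs)) boolean block computed on `qargs`
    base_bits: (n_terms, num_qubits) boolean block of the full-width operator
    xp:        array module to compute with (numpy, or cupy when available)
    """
    sub_dev = xp.asarray(sub_bits)
    # Repeat on the device, so no host round trip happens before the scatter.
    full_dev = xp.repeat(xp.asarray(base_bits), repeats, axis=0)
    # Scatter assignment into the selected qubit columns, still on the device.
    full_dev[:, xp.asarray(qargs)] = sub_dev
    # Single device-to-host transfer at the very end (no-op for NumPy).
    if cp is not None and xp is cp:
        return cp.asnumpy(full_dev)
    return full_dev
```

Calling `embed_sub_block(sub, base, [1, 3], repeats=2)` with NumPy inputs exercises the CPU path; passing `xp=cp` on a machine with CuPy would keep the repeat and scatter steps on the GPU and copy the result back once.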

Benchmarks

Measured on AMD Ryzen 9 7950X + NVIDIA RTX PRO 4500 Blackwell, CUDA 13.2, cupy-cuda12x:

qargs=None:

| Operator size | CPU | GPU | Speedup |
|---|---|---|---|
| 10 qubits, 800 terms | 34 ms | 22 ms | 1.6x |
| 10 qubits, 1500 terms | 104 ms | 59 ms | 1.8x |
| 10 qubits, 2300 terms | 238 ms | 136 ms | 1.8x |
| 50 qubits, 1000 terms | 73 ms | 48 ms | 1.5x |

qargs set (e.g. apply_layout):

| Operator size | CPU | GPU | Speedup |
|---|---|---|---|
| 50 total / 10 sub, 800 terms | 66 ms | 40 ms | 1.6x |
| 50 total / 30 sub, 420 terms | 28 ms | 16 ms | 1.8x |

Test plan

  • stestr run quantum_info.operators.symplectic.test_sparse_pauli_op.TestSparsePauliOpGPU — all 4 pass with CuPy installed
  • stestr run quantum_info.operators.symplectic.test_sparse_pauli_op.TestSparsePauliOpMethods — 272 pass, 2 skip (pre-existing), 0 fail
  • asv run --python=same --bench SparsePauliOpGPUCompose — both benchmark classes report GPU speedup

dmeliksetian and others added 2 commits March 31, 2026 14:19
Offloads the intermediate boolean tensor operations in SparsePauliOp.compose()
to the GPU when CuPy is available and the tensor size exceeds _GPU_COMPOSE_THRESHOLD
(5M elements), falling back silently to NumPy otherwise.

- Add HAS_CUPY lazy optional to qiskit.utils.optionals
- Add GPU/CPU correctness tests in TestSparsePauliOpGPU
- Add CPU-vs-GPU ASV benchmarks in SparsePauliOpGPUComposeBench

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Defer cp.asnumpy() in the qargs branch until after cp.repeat and
scatter assignment, so the full embedding stays on GPU. Previously
x3/z3/phase were transferred back to CPU before np.repeat was called.

Also expand ASV benchmarks with SparsePauliOpGPUComposeQargsBench to
measure CPU vs GPU speedup for the qargs path across varying total/sub
qubit counts.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@dmeliksetian dmeliksetian requested a review from a team as a code owner April 1, 2026 00:45
@qiskit-bot qiskit-bot added the Community PR label (PRs from contributors that are not 'members' of the Qiskit repo) Apr 1, 2026
@qiskit-bot
Collaborator

Thank you for opening a new pull request.

Before your PR can be merged it will first need to pass continuous integration tests and be reviewed. Sometimes the review process can be slow, so please be patient.

While you're waiting, please feel free to review other open PRs. While only a subset of people are authorized to approve pull requests for merging, everyone is encouraged to review open pull requests. Doing reviews helps reduce the burden on the core team and helps make the project's code better for everyone.

One or more of the following people are relevant to this code:

  • @Qiskit/terra-core

@jakelishman
Member

See justification in #15929 (comment).


Labels

Community PR PRs from contributors that are not 'members' of the Qiskit repo

Projects

Status: Done

Development

Successfully merging this pull request may close these issues.

Add GPU acceleration for SparsePauliOp hot paths via CuPy

3 participants