Use nalgebra Matrix2 internally in the TwoQubitBasisDecomposer by mtreinish · Pull Request #15928 · Qiskit/qiskit

mtreinish · 2026-03-31T23:37:40Z

Summary

This commit switches the TwoQubitBasisDecomposer to use Matrix2 as its internal array type. Matrix2 is a fixed-size, stack-allocated matrix type with several performance advantages, especially for matmul, because the compiler can reason about a fixed number of operations and optimize the implementation better. It also avoids a lot of heap allocations. This should improve the runtime performance of the two-qubit basis decomposer.

This is part of the ongoing effort to move to nalgebra's fixed-size matrix types, Matrix4 and Matrix2, in all of the two-qubit decomposer paths. We will still use faer for the more involved linear algebra, such as eigenvalue decomposition, where it is faster and more numerically stable. This PR doesn't get us all the way to that goal; it's just another step on the journey.

There are still places in the module that use ndarray as the array type, mostly because they are used with either the Weyl decomposition or the one-qubit Euler decomposition. In particular, there are a couple of duplicate methods, prefixed or suffixed with nalgebra, that return or convert to/from an nalgebra object. These are temporary while we're in the middle of the transition; the goal is to remove them as we migrate the rest of the two-qubit decomposers to nalgebra as the storage type.
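To make the allocation argument concrete, here is a minimal std-only sketch. It is not the code from this PR and not nalgebra's implementation: `Mat2` is a hypothetical stand-in for a fixed-size type such as `Matrix2`, `matmul_dyn` is a hypothetical stand-in for a dynamically sized heap-backed array, and real `f64` entries are used instead of complex ones to keep it short.

```rust
/// Hypothetical stand-in for a fixed-size 2x2 matrix such as `Matrix2<f64>`.
/// The four entries live inline in the struct, so a value sits on the stack.
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct Mat2 {
    data: [[f64; 2]; 2],
}

impl Mat2 {
    /// Entries are given in row-major reading order.
    pub const fn new(m00: f64, m01: f64, m10: f64, m11: f64) -> Self {
        Mat2 { data: [[m00, m01], [m10, m11]] }
    }

    /// 2x2 product written out in full: 8 multiplies and 4 adds, no loops
    /// with run-time bounds and no allocation.
    pub fn matmul(&self, rhs: &Mat2) -> Mat2 {
        let (a, b) = (&self.data, &rhs.data);
        Mat2::new(
            a[0][0] * b[0][0] + a[0][1] * b[1][0],
            a[0][0] * b[0][1] + a[0][1] * b[1][1],
            a[1][0] * b[0][0] + a[1][1] * b[1][0],
            a[1][0] * b[0][1] + a[1][1] * b[1][1],
        )
    }
}

/// Hypothetical stand-in for a dynamically sized array: `n` is only known at
/// run time, and every call allocates a fresh `Vec` for the result.
pub fn matmul_dyn(a: &[f64], b: &[f64], n: usize) -> Vec<f64> {
    assert_eq!(a.len(), n * n);
    assert_eq!(b.len(), n * n);
    let mut out = vec![0.0; n * n];
    for i in 0..n {
        for j in 0..n {
            for k in 0..n {
                out[i * n + j] += a[i * n + k] * b[k * n + j];
            }
        }
    }
    out
}
```

Both functions compute the same product; the difference is only in where the data lives and how much the compiler knows about the sizes at compile time.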

qiskit-bot · 2026-03-31T23:37:45Z

One or more of the following people are relevant to this code:

@Qiskit/terra-core
@levbishop

mtreinish · 2026-03-31T23:39:26Z

I ran some asv benchmarks and there wasn't much of an improvement: asv did not report a statistically significant change in any of the numbers. But as I said in the commit message, this is just one step towards using Matrix2 and Matrix4 as the array type everywhere in the decomposer.

Benchmarks that have stayed the same:

| Change   | Before [29fdcad7]    | After [8dfbc15d]    |   Ratio | Benchmark (Parameter)                                                                           |
|----------|----------------------|---------------------|---------|-------------------------------------------------------------------------------------------------|
|          | 3.07±0.01ms          | 3.09±0.02ms         |    1.01 | utility_scale.UtilityScaleBenchmarks.time_bvlike('cx')                                          |
|          | 3.06±0.01ms          | 3.09±0.01ms         |    1.01 | utility_scale.UtilityScaleBenchmarks.time_bvlike('cz')                                          |
|          | 150±0.4ms            | 151±0.4ms           |    1.01 | utility_scale.UtilityScaleBenchmarks.time_qaoa('cx')                                            |
|          | 33.4±0.2ms           | 33.2±0.1ms          |    1    | utility_scale.UtilityScaleBenchmarks.time_bv_100('cz')                                          |
|          | 33.8±0.2ms           | 33.8±0.2ms          |    1    | utility_scale.UtilityScaleBenchmarks.time_bv_100('ecr')                                         |
|          | 3.07±0.01ms          | 3.08±0.02ms         |    1    | utility_scale.UtilityScaleBenchmarks.time_bvlike('ecr')                                         |
|          | 20.6±0.02s           | 20.6±0.02s          |    1    | utility_scale.UtilityScaleBenchmarks.time_hwb12('cz')                                           |
|          | 19.6±0.05s           | 19.5±0.01s          |    1    | utility_scale.UtilityScaleBenchmarks.time_hwb12('ecr')                                          |
|          | 222±3ms              | 222±0.8ms           |    1    | utility_scale.UtilityScaleBenchmarks.time_parse_hwb12('ecr')                                    |
|          | 179±0.7ms            | 179±0.5ms           |    1    | utility_scale.UtilityScaleBenchmarks.time_qaoa('cz')                                            |
|          | 171±0.5ms            | 171±0.5ms           |    1    | utility_scale.UtilityScaleBenchmarks.time_qaoa('ecr')                                           |
|          | 295±0.6ms            | 294±0.8ms           |    1    | utility_scale.UtilityScaleBenchmarks.time_qft('cz')                                             |
|          | 397±1ms              | 396±0.6ms           |    1    | utility_scale.UtilityScaleBenchmarks.time_qv('cz')                                              |
|          | 398±0.3ms            | 397±0.7ms           |    1    | utility_scale.UtilityScaleBenchmarks.time_qv('ecr')                                             |
|          | 47.8±0.1ms           | 47.8±0.5ms          |    1    | utility_scale.UtilityScaleBenchmarks.time_square_heisenberg('ecr')                              |
|          | 3.63±0.01ms          | 3.60±0.01ms         |    0.99 | transpiler_levels.TranspilerLevelBenchmarks.time_transpile_from_large_qasm(0)                   |
|          | 5.67±0.03ms          | 5.62±0.03ms         |    0.99 | transpiler_levels.TranspilerLevelBenchmarks.time_transpile_from_large_qasm(1)                   |
|          | 5.27±0.01ms          | 5.22±0.01ms         |    0.99 | transpiler_levels.TranspilerLevelBenchmarks.time_transpile_from_large_qasm(2)                   |
|          | 5.47±0.03ms          | 5.43±0.01ms         |    0.99 | transpiler_levels.TranspilerLevelBenchmarks.time_transpile_from_large_qasm(3)                   |
|          | 14.4±0.1ms           | 14.3±0.05ms         |    0.99 | transpiler_levels.TranspilerLevelBenchmarks.time_transpile_from_large_qasm_backend_with_prop(0) |
|          | 19.1±0.06ms          | 18.9±0.1ms          |    0.99 | transpiler_levels.TranspilerLevelBenchmarks.time_transpile_from_large_qasm_backend_with_prop(1) |
|          | 4.05±0.01ms          | 3.99±0.01ms         |    0.99 | transpiler_levels.TranspilerLevelBenchmarks.time_transpile_from_large_qasm_backend_with_prop(2) |
|          | 6.25±0.1ms           | 6.16±0.1ms          |    0.99 | transpiler_levels.TranspilerLevelBenchmarks.time_transpile_qv_14_x_14(0)                        |
|          | 36.5±0.3ms           | 36.3±0.3ms          |    0.99 | utility_scale.UtilityScaleBenchmarks.time_bv_100('cx')                                          |
|          | 11.5±0.2ms           | 11.4±0.1ms          |    0.99 | utility_scale.UtilityScaleBenchmarks.time_circSU2('cx')                                         |
|          | 13.2±0.1ms           | 13.1±0.2ms          |    0.99 | utility_scale.UtilityScaleBenchmarks.time_circSU2('cz')                                         |
|          | 15.3±0.01s           | 15.2±0.02s          |    0.99 | utility_scale.UtilityScaleBenchmarks.time_hwb12('cx')                                           |
|          | 223±2ms              | 221±0.9ms           |    0.99 | utility_scale.UtilityScaleBenchmarks.time_parse_hwb12('cz')                                     |
|          | 3.79±0.02ms          | 3.73±0.01ms         |    0.99 | utility_scale.UtilityScaleBenchmarks.time_parse_qaoa_n100('ecr')                                |
|          | 41.2±0.1ms           | 40.8±0.2ms          |    0.99 | utility_scale.UtilityScaleBenchmarks.time_parse_qft_n100('ecr')                                 |
|          | 13.2±0.05ms          | 13.1±0.01ms         |    0.99 | utility_scale.UtilityScaleBenchmarks.time_parse_square_heisenberg_n100('cx')                    |
|          | 13.2±0.04ms          | 13.1±0.03ms         |    0.99 | utility_scale.UtilityScaleBenchmarks.time_parse_square_heisenberg_n100('cz')                    |
|          | 13.3±0.07ms          | 13.1±0.08ms         |    0.99 | utility_scale.UtilityScaleBenchmarks.time_parse_square_heisenberg_n100('ecr')                   |
|          | 249±0.6ms            | 247±0.7ms           |    0.99 | utility_scale.UtilityScaleBenchmarks.time_qft('cx')                                             |
|          | 301±1ms              | 299±1ms             |    0.99 | utility_scale.UtilityScaleBenchmarks.time_qft('ecr')                                            |
|          | 44.3±0.1ms           | 43.8±0.2ms          |    0.99 | utility_scale.UtilityScaleBenchmarks.time_square_heisenberg('cx')                               |
|          | 50.1±0.5ms           | 49.8±0.2ms          |    0.99 | utility_scale.UtilityScaleBenchmarks.time_square_heisenberg('cz')                               |
|          | 4.24±0.01ms          | 4.16±0ms            |    0.98 | transpiler_levels.TranspilerLevelBenchmarks.time_transpile_from_large_qasm_backend_with_prop(3) |
|          | 9.18±0.1ms           | 8.95±0.07ms         |    0.98 | transpiler_levels.TranspilerLevelBenchmarks.time_transpile_qv_14_x_14(1)                        |
|          | 13.5±0.09ms          | 13.2±0.2ms          |    0.98 | utility_scale.UtilityScaleBenchmarks.time_circSU2('ecr')                                        |
|          | 224±3ms              | 221±1ms             |    0.98 | utility_scale.UtilityScaleBenchmarks.time_parse_hwb12('cx')                                     |
|          | 3.80±0.03ms          | 3.74±0.01ms         |    0.98 | utility_scale.UtilityScaleBenchmarks.time_parse_qaoa_n100('cx')                                 |
|          | 3.81±0.01ms          | 3.73±0.01ms         |    0.98 | utility_scale.UtilityScaleBenchmarks.time_parse_qaoa_n100('cz')                                 |
|          | 41.4±0.3ms           | 40.6±0.2ms          |    0.98 | utility_scale.UtilityScaleBenchmarks.time_parse_qft_n100('cx')                                  |
|          | 41.3±0.2ms           | 40.7±0.1ms          |    0.98 | utility_scale.UtilityScaleBenchmarks.time_parse_qft_n100('cz')                                  |
|          | 384±2ms              | 377±0.6ms           |    0.98 | utility_scale.UtilityScaleBenchmarks.time_qv('cx')                                              |
|          | 3.04±0.1s            | 2.95±0.07s          |    0.97 | utility_scale.UtilityScaleBenchmarks.time_circSU2_89('cz')                                      |
|          | 21.2±0.2ms           | 20.4±0.08ms         |    0.96 | transpiler_levels.TranspilerLevelBenchmarks.time_transpile_qv_14_x_14(3)                        |
|          | 19.0±0.2ms           | 18.0±0.08ms         |    0.95 | transpiler_levels.TranspilerLevelBenchmarks.time_transpile_qv_14_x_14(2)                        |
|          | 3.12±0.02s           | 2.93±0.07s          |    0.94 | utility_scale.UtilityScaleBenchmarks.time_circSU2_89('cx')                                      |
|          | 3.11±0.06s           | 2.88±0.03s          |    0.93 | utility_scale.UtilityScaleBenchmarks.time_circSU2_89('ecr')                                     |

BENCHMARKS NOT SIGNIFICANTLY CHANGED.

raynelfss

I couldn't find anything particularly concerning in the code here. I just had a very minor comment.

ShellyGarion

I have some minor comments.
In addition, is there a plan to convert the other two-qubit decomposers (TwoQubitWeylDecomposition and TwoQubitControlledUDecomposer) to nalgebra as well?

ShellyGarion · 2026-04-05T08:46:38Z

-use super::common::{
-    DEFAULT_FIDELITY, IPZ, TraceToFidelity, rx_matrix, rz_matrix, transpose_conjugate,
-};
+use super::common::{DEFAULT_FIDELITY, TraceToFidelity, rx_matrix_nalgebra, rz_matrix_nalgebra};


note that IPZ has been moved to common.rs since it's also used in weyl_decompositions.rs

This is temporary, just for this PR: the version in common.rs is replaced with a Matrix2 version in the next PR, #15960. That PR moves all the static IPZ, IPY, and IPX definitions to use Matrix2 instead of [[Complex64; 2]; 2], as all the usage has been moved to nalgebra in that PR.

I think that this was just a bad merge with PR #15880, but if you prefer to merge this PR now and fix it in #15960, then it's OK with me.

ShellyGarion · 2026-04-05T08:47:29Z

 use qiskit_circuit::{NoBlocks, Qubit};
-use qiskit_util::alias::GateArray1Q;
-use qiskit_util::complex::{C_M_ONE, C_ONE, IM, M_IM, c64};
+use qiskit_util::complex::{C_M_ONE, C_ONE, C_ZERO, IM, M_IM, c64};


if we don't define IPZ in this file, then C_ZERO is not needed here.

#15928 (comment)

ShellyGarion · 2026-04-05T08:47:39Z

-    [c64(0., FRAC_1_SQRT_2), c64(FRAC_1_SQRT_2, 0.)],
-    [c64(-FRAC_1_SQRT_2, 0.), c64(0., -FRAC_1_SQRT_2)],
-];
+static IPZ: Matrix2<Complex64> = Matrix2::new(IM, C_ZERO, C_ZERO, M_IM);


note that IPZ has been moved to common.rs since it's also used in weyl_decompositions.rs

#15928 (comment)

ShellyGarion · 2026-04-05T08:50:31Z

+static IPZ: Matrix2<Complex64> = Matrix2::new(IM, C_ZERO, C_ZERO, M_IM);
+
+static HGATE: Matrix2<Complex64> =
+    Matrix2::new(H_GATE[0][0], H_GATE[0][1], H_GATE[1][0], H_GATE[1][1]);


The name HGATE may be a bit confusing next to H_GATE. Maybe call it HGATE_matrix or H_matrix?
Also, why not use the matrix method of the standard gates here?

I can rename the variable; I just picked something that wouldn't conflict with the static being imported here. But as a static it needs to be all capital letters, otherwise the compiler's non_upper_case_globals lint will complain.

The reason I didn't use the StandardGate::matrix() method here is that the matrix method is not defined as a const method and I can't call it from a static context. The matrix method in particular is written to return an owned Array2<Complex64> and we wouldn't be able to use that in a const function anyway because it relies on dynamic memory allocation. We might be able to make the matrix_as_static_1q and matrix_as_static_2q methods const functions, but it would require everything in them to be defined as const functions.

I wanted this to be a static so we only have a single Matrix2 that is used by reference for the Hadamard matrix when we need it. This avoids allocating a temporary 2x2 matrix in any form. The problem with the existing static is that it's a [[Complex64; 2]; 2], which isn't compatible with nalgebra types for matrix multiplication or other operations, so I needed a static of type Matrix2 for this use case.

maybe HGATE_MATRIX? or H_GATE_MATRIX?
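The constraint discussed in this thread can be sketched without any external crates. Everything below is hypothetical: `Mat2` stands in for `Matrix2<Complex64>`, real `f64` entries stand in for complex ones (the Hadamard entries are real anyway), `H_GATE_MATRIX` is one of the names suggested above, and `h_gate_owned` stands in for a method that returns an owned heap array. The point is only that a `static` initializer may call `const fn`s, so a constructor that just moves values works there, while a function that allocates cannot be used.

```rust
use std::f64::consts::FRAC_1_SQRT_2;

/// Hypothetical fixed-size 2x2 matrix with a `const fn` constructor.
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct Mat2 {
    data: [[f64; 2]; 2],
}

impl Mat2 {
    /// `const fn`, so it can be called while initializing a `static`.
    pub const fn new(m00: f64, m01: f64, m10: f64, m11: f64) -> Self {
        Mat2 { data: [[m00, m01], [m10, m11]] }
    }

    pub fn get(&self, row: usize, col: usize) -> f64 {
        self.data[row][col]
    }

    pub fn matmul(&self, rhs: &Mat2) -> Mat2 {
        let (a, b) = (&self.data, &rhs.data);
        Mat2::new(
            a[0][0] * b[0][0] + a[0][1] * b[1][0],
            a[0][0] * b[0][1] + a[0][1] * b[1][1],
            a[1][0] * b[0][0] + a[1][1] * b[1][0],
            a[1][0] * b[0][1] + a[1][1] * b[1][1],
        )
    }
}

/// Existing-style table: a plain nested array with no matrix operations.
pub const H_GATE: [[f64; 2]; 2] = [
    [FRAC_1_SQRT_2, FRAC_1_SQRT_2],
    [FRAC_1_SQRT_2, -FRAC_1_SQRT_2],
];

/// Built once at compile time from the nested array; callers borrow it.
pub static H_GATE_MATRIX: Mat2 =
    Mat2::new(H_GATE[0][0], H_GATE[0][1], H_GATE[1][0], H_GATE[1][1]);

/// Not a `const fn`: `vec!` allocates on the heap, so this could not be
/// used to initialize a `static`.
pub fn h_gate_owned() -> Vec<f64> {
    vec![H_GATE[0][0], H_GATE[0][1], H_GATE[1][0], H_GATE[1][1]]
}
```

A caller would take `&H_GATE_MATRIX` wherever the Hadamard matrix is needed, for example `H_GATE_MATRIX.matmul(&H_GATE_MATRIX)`, without constructing a new matrix for the operand.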

ShellyGarion · 2026-04-05T08:56:17Z

+#[inline]
+pub fn ndarray_to_matrix2<T: Copy>(view: ArrayView2<T>) -> Matrix2<T> {
+    Matrix2::new(view[[0, 0]], view[(0, 1)], view[(1, 0)], view[(1, 1)])
+}


In https://github.com/Qiskit/qiskit/blob/main/crates/synthesis/src/linalg/mod.rs there are several methods to convert ndarrays to and from nalgebra and faer.
Perhaps it's worth moving this function there too?
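As a rough illustration of why an element-by-element conversion like the one in the hunk above is safe regardless of memory layout, here is a std-only sketch. `View2` and `Mat2` are hypothetical stand-ins (for a strided read-only 2-D view and for a column-major fixed-size matrix whose constructor takes entries in row-major reading order); neither is the real ndarray or nalgebra type.

```rust
/// Hypothetical strided read-only 2-D view: a data slice plus one stride
/// per axis, loosely modelled on what an array view carries.
pub struct View2<'a, T> {
    data: &'a [T],
    strides: [usize; 2],
}

impl<'a, T: Copy> View2<'a, T> {
    pub fn new(data: &'a [T], strides: [usize; 2]) -> Self {
        View2 { data, strides }
    }

    /// Logical (row, col) lookup; the strides hide the memory layout.
    pub fn get(&self, row: usize, col: usize) -> T {
        self.data[row * self.strides[0] + col * self.strides[1]]
    }
}

/// Hypothetical fixed-size 2x2 matrix stored column by column.
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct Mat2<T> {
    cols: [[T; 2]; 2],
}

impl<T: Copy> Mat2<T> {
    /// Arguments are in row-major reading order; storage is column-major.
    pub fn new(m00: T, m01: T, m10: T, m11: T) -> Self {
        Mat2 { cols: [[m00, m10], [m01, m11]] }
    }

    pub fn get(&self, row: usize, col: usize) -> T {
        self.cols[col][row]
    }
}

/// Copy the four logical entries one by one, so the result does not depend
/// on whether the source view is row-major, column-major, or a slice.
pub fn view_to_mat2<T: Copy>(view: &View2<'_, T>) -> Mat2<T> {
    Mat2::new(view.get(0, 0), view.get(0, 1), view.get(1, 0), view.get(1, 1))
}
```

For instance, a row-major buffer `[1, 2, 3, 4]` with strides `[2, 1]` and a column-major buffer `[1, 3, 2, 4]` with strides `[1, 2]` describe the same logical matrix, and both convert to the same `Mat2`.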

mtreinish · 2026-04-07T20:31:30Z

In addition, is there a plan to convert the other two-qubit decomposers (TwoQubitWeylDecomposition and TwoQubitControlledUDecomposer) to nalgebra as well?

Yes. As I mentioned in the commit message and PR summary, this is just an incremental step towards migrating to nalgebra for small fixed-size matrices in the two-qubit decomposer code. I already have the second PR open for the TwoQubitWeylDecomposition: #15960. Several of your comments here are already addressed in that PR, as some of the statics move to the common module there. As I mentioned in the PR summary, I did several temporary things in this PR to make it work standalone. This is a self-contained initial step, and things get a bit cleaner once all the pieces are finalized. I wanted to do this as smaller PRs, one per decomposer, because the logic is particularly dense in all of these modules and keeping the changes as minimal as possible makes them much easier to follow.

ShellyGarion

I had some minor comments on the names and on moving some code (I think this was just a bad merge after PR #15880), but if you prefer to merge this PR now and fix it in #15960, then it's OK with me.

mtreinish added this to the 2.5.0 milestone Mar 31, 2026

mtreinish requested a review from a team as a code owner March 31, 2026 23:37

mtreinish added performance Changelog: None Do not include in the GitHub Release changelog. Rust This PR or issue is related to Rust code in the repository labels Mar 31, 2026

Merge remote-tracking branch 'origin/main' into matrix2-basis-decomp

40a050c

raynelfss reviewed Apr 3, 2026

Comment thread crates/synthesis/src/two_qubit_decompose/basis_decomposer.rs Outdated

ShellyGarion reviewed Apr 5, 2026

mtreinish mentioned this pull request Apr 7, 2026

Switch TwoQubitWeylDecomposition to use nalgebra internally #15960

Merged

mtreinish added 2 commits April 7, 2026 17:38

Merge remote-tracking branch 'origin/main' into matrix2-basis-decomp

b6a2409

Move static definitions to the top of the file

93bb69b

ShellyGarion approved these changes Apr 9, 2026

raynelfss approved these changes Apr 9, 2026

raynelfss added this pull request to the merge queue Apr 10, 2026

Merged via the queue into Qiskit:main with commit a2c1136 Apr 10, 2026
26 checks passed

mtreinish deleted the matrix2-basis-decomp branch April 13, 2026 11:26

ShellyGarion self-assigned this Apr 14, 2026

mtreinish mentioned this pull request Apr 15, 2026

Switch Two qubit decompose to nalgebra #15017

Closed

ShellyGarion added this to Qiskit 2.5 Apr 19, 2026

github-project-automation Bot moved this from Ready to Done in Qiskit 2.5 Apr 19, 2026

github-project-automation Bot moved this to Ready in Qiskit 2.5 Apr 19, 2026
