Use nalgebra::Matrix4 as output for instructions_to_matrix by mtreinish · Pull Request #15871 · Qiskit/qiskit

mtreinish · 2026-03-24T20:50:41Z

Summary

This commit switches the return type of
convert_2q_block_matrix::instructions_to_matrix() from a dynamically
allocated ndarray Array2 to an nalgebra Matrix4. The function always
returns a 4x4 Complex64 matrix. A fixed-size nalgebra matrix is stack
allocated, so we avoid a dynamic allocation. This should also speed up
matrix multiplication, since nalgebra can leverage SIMD better (either
directly or implicitly via the compiler) because it knows the fixed set
of operations needed. We have been set up to do this since #13649, which
moved to using nalgebra internally for the 1q component, but that PR
didn't update the whole path to use an nalgebra matrix for everything.
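
To illustrate the allocation difference described above, here is a minimal std-only sketch. It is not the code from this PR: `C64`, `Mat4`, `DynMat`, `matmul4`, and `matmul_dyn` are hypothetical stand-ins for `Complex64`, `Matrix4`, `Array2`, and their products. The fixed-size version has compile-time loop bounds and returns its result by value, while the dynamic version allocates a fresh buffer on every call.

```rust
// Hypothetical std-only stand-in for num_complex::Complex64.
#[derive(Clone, Copy, Debug, PartialEq)]
struct C64 {
    re: f64,
    im: f64,
}

impl C64 {
    const ZERO: C64 = C64 { re: 0.0, im: 0.0 };
    const ONE: C64 = C64 { re: 1.0, im: 0.0 };

    fn mul(self, o: C64) -> C64 {
        C64 {
            re: self.re * o.re - self.im * o.im,
            im: self.re * o.im + self.im * o.re,
        }
    }

    fn add(self, o: C64) -> C64 {
        C64 { re: self.re + o.re, im: self.im + o.im }
    }
}

// Fixed-size 4x4 matrix: stored inline, size known at compile time.
type Mat4 = [[C64; 4]; 4];

// Dynamically sized matrix: the element buffer lives on the heap.
struct DynMat {
    rows: usize,
    cols: usize,
    data: Vec<C64>, // row-major
}

// All loop bounds are constants, so the compiler is free to unroll or vectorize.
fn matmul4(a: &Mat4, b: &Mat4) -> Mat4 {
    let mut out = [[C64::ZERO; 4]; 4];
    for i in 0..4 {
        for j in 0..4 {
            for k in 0..4 {
                out[i][j] = out[i][j].add(a[i][k].mul(b[k][j]));
            }
        }
    }
    out
}

// Same product, but shapes are only known at run time and the output is heap allocated.
fn matmul_dyn(a: &DynMat, b: &DynMat) -> DynMat {
    assert_eq!(a.cols, b.rows);
    let mut data = vec![C64::ZERO; a.rows * b.cols];
    for i in 0..a.rows {
        for j in 0..b.cols {
            for k in 0..a.cols {
                let idx = i * b.cols + j;
                data[idx] = data[idx].add(a.data[i * a.cols + k].mul(b.data[k * b.cols + j]));
            }
        }
    }
    DynMat { rows: a.rows, cols: b.cols, data }
}
```

Both functions compute the same product; the only difference the sketch is meant to show is where the result lives and what the compiler knows about the loop bounds.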

Ideally we'd be using nalgebra data types throughout the two qubit
decomposers too. We should still use faer for the more involved linear
algebra operations in that module, but we should be using Matrix4 and
Matrix2 wherever the matrix size is known, and generate faer MatRefs
with qiskit_synthesis::linalg::nalgebra_to_faer() when we need to do
linear algebra with the matrix. This PR is an incremental step towards
doing that.

Details and comments

~~This PR is based on top of #15858 and will need to be rebased after that merges. In the meantime you can view the contents of just this PR by looking at the HEAD commit: 29a718d~~ Rebased now

qiskit-bot · 2026-03-24T20:50:46Z

One or more of the following people are relevant to this code:

@Qiskit/terra-core

mtreinish · 2026-03-24T21:28:29Z

I ran a pair of quick asv benchmarks, but I don't trust these numbers: there seemed to be a fair amount of system noise and the variability is high. I'll do more thorough benchmarking after the parent PR merges and this is unblocked.

Benchmarks that have improved:

| Change   | Before [572045f5] <fixed-size-2q-matrices~1>   | After [29a718d8] <use-nalgebra-arrays-for-consolidate>   |   Ratio | Benchmark (Parameter)                                             |
|----------|------------------------------------------------|----------------------------------------------------------|---------|-------------------------------------------------------------------|
| -        | 8.58±0.08ms                                    | 7.77±0.07ms                                              |    0.91 | utility_scale.UtilityScaleBenchmarks.time_parse_qaoa_n100('cx')   |
| -        | 78.4±1ms                                       | 68.9±2ms                                                 |    0.88 | utility_scale.UtilityScaleBenchmarks.time_square_heisenberg('cz') |
| -        | 69.8±2ms                                       | 59.9±1ms                                                 |    0.86 | utility_scale.UtilityScaleBenchmarks.time_bv_100('cz')            |
| -        | 435±20ms                                       | 367±10ms                                                 |    0.84 | utility_scale.UtilityScaleBenchmarks.time_qft('cz')               |
| -        | 250±9ms                                        | 204±5ms                                                  |    0.81 | utility_scale.UtilityScaleBenchmarks.time_qaoa('cz')              |

Benchmarks that have stayed the same:

| Change   | Before [572045f5] <fixed-size-2q-matrices~1>   | After [29a718d8] <use-nalgebra-arrays-for-consolidate>   | Ratio   | Benchmark (Parameter)                                                         |
|----------|------------------------------------------------|----------------------------------------------------------|---------|-------------------------------------------------------------------------------|
|          | 36.5±0.8s                                      | 27.4±0.4s                                                | ~0.75   | utility_scale.UtilityScaleBenchmarks.time_hwb12('cz')                         |
|          | 61.5±1ms                                       | 66.8±2ms                                                 | 1.09    | utility_scale.UtilityScaleBenchmarks.time_square_heisenberg('ecr')            |
|          | 6.12±0.03s                                     | 6.46±0.04s                                               | 1.06    | utility_scale.UtilityScaleBenchmarks.time_circSU2_89('cx')                    |
|          | 6.08±0.02s                                     | 6.46±0.02s                                               | 1.06    | utility_scale.UtilityScaleBenchmarks.time_circSU2_89('cz')                    |
|          | 6.05±0.04s                                     | 6.41±0.1s                                                | 1.06    | utility_scale.UtilityScaleBenchmarks.time_circSU2_89('ecr')                   |
|          | 5.85±0.1ms                                     | 6.11±0.2ms                                               | 1.05    | utility_scale.UtilityScaleBenchmarks.time_bvlike('ecr')                       |
|          | 63.8±5ms                                       | 66.4±3ms                                                 | 1.04    | utility_scale.UtilityScaleBenchmarks.time_bv_100('cx')                        |
|          | 247±2ms                                        | 256±2ms                                                  | 1.04    | utility_scale.UtilityScaleBenchmarks.time_qft('cx')                           |
|          | 383±20ms                                       | 394±10ms                                                 | 1.03    | utility_scale.UtilityScaleBenchmarks.time_qft('ecr')                          |
|          | 16.7±0.2s                                      | 16.9±0.1s                                                | 1.02    | utility_scale.UtilityScaleBenchmarks.time_hwb12('cx')                         |
|          | 83.7±0.4ms                                     | 85.4±1ms                                                 | 1.02    | utility_scale.UtilityScaleBenchmarks.time_parse_qft_n100('ecr')               |
|          | 27.7±0.6ms                                     | 28.2±0.9ms                                               | 1.02    | utility_scale.UtilityScaleBenchmarks.time_parse_square_heisenberg_n100('cx')  |
|          | 456±20ms                                       | 459±4ms                                                  | 1.01    | utility_scale.UtilityScaleBenchmarks.time_parse_hwb12('cz')                   |
|          | 28.5±1ms                                       | 28.9±0.9ms                                               | 1.01    | utility_scale.UtilityScaleBenchmarks.time_parse_square_heisenberg_n100('cz')  |
|          | 141±2ms                                        | 142±3ms                                                  | 1.01    | utility_scale.UtilityScaleBenchmarks.time_qaoa('cx')                          |
|          | 26.0±0.6ms                                     | 25.9±0.4ms                                               | 1.00    | utility_scale.UtilityScaleBenchmarks.time_circSU2('ecr')                      |
|          | 84.7±2ms                                       | 84.8±0.3ms                                               | 1.00    | utility_scale.UtilityScaleBenchmarks.time_parse_qft_n100('cx')                |
|          | 5.91±0.3ms                                     | 5.83±0.1ms                                               | 0.99    | utility_scale.UtilityScaleBenchmarks.time_bvlike('cx')                        |
|          | 26.2±1ms                                       | 26.0±0.3ms                                               | 0.99    | utility_scale.UtilityScaleBenchmarks.time_circSU2('cz')                       |
|          | 85.5±2ms                                       | 84.6±0.7ms                                               | 0.99    | utility_scale.UtilityScaleBenchmarks.time_parse_qft_n100('cz')                |
|          | 62.1±2ms                                       | 60.7±0.9ms                                               | 0.98    | utility_scale.UtilityScaleBenchmarks.time_bv_100('ecr')                       |
|          | 22.4±0.9ms                                     | 22.0±0.8ms                                               | 0.98    | utility_scale.UtilityScaleBenchmarks.time_circSU2('cx')                       |
|          | 25.4±0.4s                                      | 24.9±0.3s                                                | 0.98    | utility_scale.UtilityScaleBenchmarks.time_hwb12('ecr')                        |
|          | 461±20ms                                       | 453±9ms                                                  | 0.98    | utility_scale.UtilityScaleBenchmarks.time_parse_hwb12('ecr')                  |
|          | 60.6±2ms                                       | 59.6±0.9ms                                               | 0.98    | utility_scale.UtilityScaleBenchmarks.time_square_heisenberg('cx')             |
|          | 8.05±0.2ms                                     | 7.85±0.1ms                                               | 0.97    | utility_scale.UtilityScaleBenchmarks.time_parse_qaoa_n100('ecr')              |
|          | 462±8ms                                        | 449±30ms                                                 | 0.97    | utility_scale.UtilityScaleBenchmarks.time_qv('cz')                            |
|          | 189±8ms                                        | 182±5ms                                                  | 0.96    | utility_scale.UtilityScaleBenchmarks.time_qaoa('ecr')                         |
|          | 6.13±0.2ms                                     | 5.83±0.2ms                                               | 0.95    | utility_scale.UtilityScaleBenchmarks.time_bvlike('cz')                        |
|          | 478±20ms                                       | 447±3ms                                                  | 0.94    | utility_scale.UtilityScaleBenchmarks.time_parse_hwb12('cx')                   |
|          | 8.41±0.3ms                                     | 7.91±0.1ms                                               | 0.94    | utility_scale.UtilityScaleBenchmarks.time_parse_qaoa_n100('cz')               |
|          | 29.3±0.4ms                                     | 27.6±0.2ms                                               | 0.94    | utility_scale.UtilityScaleBenchmarks.time_parse_square_heisenberg_n100('ecr') |
|          | 404±20ms                                       | 378±8ms                                                  | 0.94    | utility_scale.UtilityScaleBenchmarks.time_qv('cx')                            |
|          | 473±10ms                                       | 441±5ms                                                  | 0.93    | utility_scale.UtilityScaleBenchmarks.time_qv('ecr')                           |

SOME BENCHMARKS HAVE CHANGED SIGNIFICANTLY.
PERFORMANCE INCREASED.

alexanderivrii

Thanks Matthew, overall this looks good to me, just two minor questions.

alexanderivrii · 2026-03-25T08:29:25Z

+    let OperationRef::Gate(gate) = inst.op.view() else {
+        return Err(QiskitError::new_err(
+            "Can't compute matrix of non-unitary op",
+        ));
+    };


Should we also consider OperationRef::Operation variant here (and if so, then also in get_matrix_from_inst)?

We should also extend this to PauliProductRotation, but this can be done for all relevant functions in a separate PR.

I just mirrored what was already there. I think we do want to expand this to any unitary operation, so that will be PPR and Operations that have a unitary matrix. Ideally we'd probably cover PPR and unitary OperationRef::Operations in PackedInstruction::try_matrix_as_nalgebra_2q; this check is just the fallback for whether we call out to Python and use quantum_info.Operator to compute the unitary from a gate's definition.

That being said, I think we should fix this in a separate PR that fixes all the matrix methods.
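
The guard under discussion uses Rust's `let ... else` to accept only one enum variant and return early otherwise. A minimal sketch of that pattern, assuming a hypothetical `OpRef` enum and `MatrixError` type in place of Qiskit's `OperationRef` and `QiskitError`:

```rust
// Hypothetical stand-ins for Qiskit's internal types; the names are illustrative only.
#[derive(Debug)]
enum OpRef<'a> {
    Gate(&'a str),
    Instruction(&'a str),
    Operation(&'a str),
}

#[derive(Debug, PartialEq)]
struct MatrixError(String);

// Returns the gate name when `op` is a gate and an error for every other variant.
fn require_gate<'a>(op: OpRef<'a>) -> Result<&'a str, MatrixError> {
    let OpRef::Gate(name) = op else {
        // Any non-Gate variant lands here, including Operation.
        return Err(MatrixError(
            "Can't compute matrix of non-unitary op".to_string(),
        ));
    };
    Ok(name)
}
```

In this sketch `require_gate(OpRef::Operation("custom"))` returns the error even if the operation happens to be unitary, which is the gap the review comment points at. Widening the check would mean matching the additional variants before falling through to the error.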

alexanderivrii · 2026-03-25T08:41:29Z

+static IDENTITY_2Q: Matrix4<Complex64> = Matrix4::new(
+    // Row 1
+    Complex64::ONE,
+    Complex64::ZERO,
+    Complex64::ZERO,
+    Complex64::ZERO,
+    // Row 2
+    Complex64::ZERO,
+    Complex64::ONE,
+    Complex64::ZERO,
+    Complex64::ZERO,
+    // Row 3
+    Complex64::ZERO,
+    Complex64::ZERO,
+    Complex64::ONE,
+    Complex64::ZERO,
+    // Row 4
+    Complex64::ZERO,
+    Complex64::ZERO,
+    Complex64::ZERO,
+    Complex64::ONE,
+);


I like this matrix: no need to think whether the elements are passed in row-major or column-major order 😄

Ideally I would have been able to use Matrix4::identity() here, but it's not a const function so I couldn't use it in a static context. This was the best way I could come up with, but I also do a double take every time I work with nalgebra about row major vs col major because I've gotten it backwards too many times.
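
The ordering confusion mentioned here comes from nalgebra's `new` constructors taking their arguments in row-major (reading) order while the storage is column-major. A small std-only sketch of the same arrangement, using a hypothetical `Mat2` type with `f64` entries for brevity; the `const fn` constructor is what makes it usable in a `static`:

```rust
// Hypothetical 2x2 matrix that stores columns but is constructed in reading order.
#[derive(Debug, PartialEq, Clone, Copy)]
struct Mat2 {
    cols: [[f64; 2]; 2], // cols[j][i] is the entry at row i, column j
}

impl Mat2 {
    // Arguments are given row by row; `const fn` allows use in a `static`.
    const fn new(m11: f64, m12: f64, m21: f64, m22: f64) -> Mat2 {
        Mat2 { cols: [[m11, m21], [m12, m22]] }
    }

    // Arguments are given in storage (column-major) order.
    const fn from_column_major(d: [f64; 4]) -> Mat2 {
        Mat2 { cols: [[d[0], d[1]], [d[2], d[3]]] }
    }

    fn get(&self, row: usize, col: usize) -> f64 {
        self.cols[col][row]
    }
}

// The identity is symmetric, so both argument orders give the same matrix.
static IDENTITY: Mat2 = Mat2::new(1.0, 0.0, 0.0, 1.0);

// A non-symmetric matrix is where the argument order starts to matter.
static UPPER: Mat2 = Mat2::new(1.0, 2.0, 0.0, 1.0);
```

For `IDENTITY` the two constructors agree, which is why the matrix in the diff is safe to write either way. For `UPPER`, passing the same four numbers to `from_column_major` would yield the transpose.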

alexanderivrii

Thanks, LGTM!

In the recently merged Qiskit#15871 we updated the consolidate blocks pass to use Matrix4 in the common path of a 2q block being consolidated. This is roughly 90% or more of what the pass does when run in the preset pass manager. However, the uncommon cases around the handling of a block made up of a single gate outside the target were not updated to use nalgebra matrices for fixed-size 1q or 2q gates. This commit updates these uncommon paths so that we always return an nalgebra matrix in the output UnitaryGate if the block being consolidated acts on one or two qubits.
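
One way to picture "always return a fixed-size matrix for one or two qubits" is a dispatch on qubit count. The sketch below is an assumption for illustration only: `BlockMatrix` and `from_row_major` are hypothetical names, not Qiskit's API, and entries are `f64` rather than `Complex64` to keep it short.

```rust
// Hypothetical container for a consolidated block's unitary.
#[derive(Debug, PartialEq)]
enum BlockMatrix {
    OneQ([[f64; 2]; 2]),
    TwoQ([[f64; 4]; 4]),
    Dynamic { dim: usize, data: Vec<f64> },
}

// Picks a fixed-size variant whenever the qubit count allows it.
// Returns None if `data` does not hold exactly dim * dim row-major entries.
fn from_row_major(num_qubits: u32, data: &[f64]) -> Option<BlockMatrix> {
    let dim = 1usize.checked_shl(num_qubits)?;
    if dim.checked_mul(dim)? != data.len() {
        return None;
    }
    Some(match num_qubits {
        1 => {
            let mut m = [[0.0; 2]; 2];
            for (i, row) in data.chunks(2).enumerate() {
                m[i].copy_from_slice(row);
            }
            BlockMatrix::OneQ(m)
        }
        2 => {
            let mut m = [[0.0; 4]; 4];
            for (i, row) in data.chunks(4).enumerate() {
                m[i].copy_from_slice(row);
            }
            BlockMatrix::TwoQ(m)
        }
        // Anything else keeps a heap-allocated buffer.
        _ => BlockMatrix::Dynamic { dim, data: data.to_vec() },
    })
}
```

With a type like this, every caller that produces a 1q or 2q unitary goes through the same fixed-size variants, regardless of which code path built the block.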


mtreinish added this to the 2.5.0 milestone Mar 24, 2026

mtreinish requested a review from a team as a code owner March 24, 2026 20:50

mtreinish added the on hold, performance, Changelog: None, Rust, and mod: transpiler labels Mar 24, 2026

alexanderivrii reviewed Mar 25, 2026

mtreinish force-pushed the use-nalgebra-arrays-for-consolidate branch from 29a718d to 35f8d27 Compare March 25, 2026 13:30

Update safety comment

5ed8f82

mtreinish removed the on hold label Mar 25, 2026

mtreinish requested a review from alexanderivrii March 25, 2026 13:42

mtreinish assigned alexanderivrii Mar 25, 2026

alexanderivrii approved these changes Mar 25, 2026

alexanderivrii added this pull request to the merge queue Mar 25, 2026

Merged via the queue into Qiskit:main with commit 3e80326 Mar 25, 2026
25 checks passed

mtreinish deleted the use-nalgebra-arrays-for-consolidate branch March 25, 2026 15:10

mtreinish mentioned this pull request Mar 26, 2026

Use nalgebra matrices for uncommon paths in consolidate blocks #15881

Merged

ShellyGarion added this to Qiskit 2.5 Apr 15, 2026

github-project-automation Bot moved this from Ready to Done in Qiskit 2.5 Apr 15, 2026

github-project-automation Bot moved this to Ready in Qiskit 2.5 Apr 15, 2026

alexanderivrii merged 2 commits into Qiskit:main from mtreinish:use-nalgebra-arrays-for-consolidate
