Switch Two qubit decompose to nalgebra #15017

Closed
gadial wants to merge 10 commits into Qiskit:main from gadial:two_qubit_decompose_to_nalgebra

@gadial
Contributor

@gadial gadial commented Sep 16, 2025

Summary

Changing the two qubit decompose (and related files) to work with Matrix4 and Matrix2 instead of arbitrary sized matrices.

Details and comments

Fix #13665

@gadial gadial requested a review from a team as a code owner September 16, 2025 12:46
@qiskit-bot
Collaborator

One or more of the following people are relevant to this code:

  • @Qiskit/terra-core
  • @levbishop

@gadial gadial marked this pull request as draft September 16, 2025 12:46
@coveralls

coveralls commented Sep 16, 2025

Pull Request Test Coverage Report for Build 18377399152

Details

  • 476 of 518 (91.89%) changed or added relevant lines in 6 files are covered.
  • 19 unchanged lines in 5 files lost coverage.
  • Overall coverage increased (+0.01%) to 88.231%

| Changes Missing Coverage                             | Covered Lines | Changed/Added Lines | %      |
|------------------------------------------------------|---------------|---------------------|--------|
| crates/transpiler/src/passes/high_level_synthesis.rs | 4             | 5                   | 80.0%  |
| crates/transpiler/src/passes/split_2q_unitaries.rs   | 17            | 18                  | 94.44% |
| crates/transpiler/src/passes/unitary_synthesis.rs    | 24            | 25                  | 96.0%  |
| crates/synthesis/src/two_qubit_decompose.rs          | 424           | 463                 | 91.58% |

| Files with Coverage Reduction               | New Missed Lines | %      |
|---------------------------------------------|------------------|--------|
| crates/qasm2/src/expr.rs                    | 1                | 93.82% |
| crates/synthesis/src/two_qubit_decompose.rs | 3                | 91.54% |
| crates/qasm2/src/lex.rs                     | 4                | 91.26% |
| crates/circuit/src/parameter/symbol_expr.rs | 5                | 72.82% |
| crates/qasm2/src/parse.rs                   | 6                | 96.62% |

| Totals                              |        |
|-------------------------------------|--------|
| Change from base Build 18350473782  | 0.01%  |
| Covered Lines                       | 93224  |
| Relevant Lines                      | 105659 |

💛 - Coveralls

@gadial
Contributor Author

gadial commented Sep 18, 2025

Current benchmarking results for QSD decomposition show a minor improvement, which is good when trying to set a baseline but far from enough at this stage:

╒═════════════════════════╤════════════╤════════════╤═══════════════════╕
│ Test Case               │        old │        new │   Ratio (old/new) │
╞═════════════════════════╪════════════╪════════════╪═══════════════════╡
│ num_qubits: 5 seed 13   │ 0.00636392 │ 0.00462999 │          1.3745   │
├─────────────────────────┼────────────┼────────────┼───────────────────┤
│ num_qubits: 5 seed 1    │ 0.00678139 │ 0.00527153 │          1.28642  │
├─────────────────────────┼────────────┼────────────┼───────────────────┤
│ num_qubits: 5 seed 1089 │ 0.00701723 │ 0.00557046 │          1.25972  │
├─────────────────────────┼────────────┼────────────┼───────────────────┤
│ num_qubits: 5 seed 42   │ 0.00754852 │ 0.00485458 │          1.55493  │
├─────────────────────────┼────────────┼────────────┼───────────────────┤
│ num_qubits: 5 seed 500  │ 0.00858774 │ 0.0052814  │          1.62603  │
├─────────────────────────┼────────────┼────────────┼───────────────────┤
│ num_qubits: 6 seed 500  │ 0.0302784  │ 0.033772   │          0.896553 │
├─────────────────────────┼────────────┼────────────┼───────────────────┤
│ num_qubits: 6 seed 1089 │ 0.0303608  │ 0.0247061  │          1.22888  │
├─────────────────────────┼────────────┼────────────┼───────────────────┤
│ num_qubits: 6 seed 1    │ 0.0316213  │ 0.0263052  │          1.20209  │
├─────────────────────────┼────────────┼────────────┼───────────────────┤
│ num_qubits: 6 seed 42   │ 0.0323374  │ 0.0271037  │          1.1931   │
├─────────────────────────┼────────────┼────────────┼───────────────────┤
│ num_qubits: 6 seed 13   │ 0.0332036  │ 0.032197   │          1.03127  │
├─────────────────────────┼────────────┼────────────┼───────────────────┤
│ num_qubits: 7 seed 13   │ 0.200143   │ 0.192108   │          1.04182  │
├─────────────────────────┼────────────┼────────────┼───────────────────┤
│ num_qubits: 7 seed 1    │ 0.204011   │ 0.156909   │          1.30019  │
├─────────────────────────┼────────────┼────────────┼───────────────────┤
│ num_qubits: 7 seed 42   │ 0.209486   │ 0.199668   │          1.04917  │
├─────────────────────────┼────────────┼────────────┼───────────────────┤
│ num_qubits: 7 seed 1089 │ 0.218755   │ 0.177695   │          1.23107  │
├─────────────────────────┼────────────┼────────────┼───────────────────┤
│ num_qubits: 7 seed 500  │ 0.235781   │ 0.187665   │          1.25639  │
├─────────────────────────┼────────────┼────────────┼───────────────────┤
│ num_qubits: 8 seed 1    │ 0.929031   │ 1.07564    │          0.863699 │
├─────────────────────────┼────────────┼────────────┼───────────────────┤
│ num_qubits: 8 seed 42   │ 0.932327   │ 0.851199   │          1.09531  │
├─────────────────────────┼────────────┼────────────┼───────────────────┤
│ num_qubits: 8 seed 13   │ 0.936288   │ 0.906832   │          1.03248  │
├─────────────────────────┼────────────┼────────────┼───────────────────┤
│ num_qubits: 8 seed 1089 │ 0.954443   │ 0.869523   │          1.09766  │
├─────────────────────────┼────────────┼────────────┼───────────────────┤
│ num_qubits: 8 seed 500  │ 0.976067   │ 0.85258    │          1.14484  │
├─────────────────────────┼────────────┼────────────┼───────────────────┤
│ num_qubits: 9 seed 1    │ 5.66242    │ 5.45405    │          1.0382   │
├─────────────────────────┼────────────┼────────────┼───────────────────┤
│ num_qubits: 9 seed 1089 │ 5.68774    │ 5.42259    │          1.0489   │
├─────────────────────────┼────────────┼────────────┼───────────────────┤
│ num_qubits: 9 seed 13   │ 5.75313    │ 5.30579    │          1.08431  │
├─────────────────────────┼────────────┼────────────┼───────────────────┤
│ num_qubits: 9 seed 500  │ 6.15012    │ 5.75433    │          1.06878  │
├─────────────────────────┼────────────┼────────────┼───────────────────┤
│ num_qubits: 9 seed 42   │ 8.13967    │ 6.05398    │          1.34452  │
╘═════════════════════════╧════════════╧════════════╧═══════════════════╛
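
For reading the table above, here is a minimal sketch of how the `Ratio (old/new)` column is derived. The helper name `speedup` is hypothetical and only for illustration, and the rows are copied from the table. Values above 1 mean the new code is faster. The asv tables later in this thread use the opposite convention (after/before, so values below 1 are improvements).

```rust
/// Hypothetical helper: speedup as reported in the table, old time / new time.
/// Values above 1.0 mean the new code is faster.
fn speedup(old_secs: f64, new_secs: f64) -> f64 {
    old_secs / new_secs
}

fn main() {
    // (label, old, new) rows copied from the table above.
    let rows = [
        ("num_qubits: 5 seed 13", 0.00636392, 0.00462999),
        ("num_qubits: 6 seed 500", 0.0302784, 0.033772),
        ("num_qubits: 8 seed 1", 0.929031, 1.07564),
    ];
    for (label, old, new) in rows {
        let ratio = speedup(old, new);
        let verdict = if ratio > 1.0 { "faster" } else { "slower" };
        println!("{label}: {ratio:.4} ({verdict})");
    }
}
```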

This commit reduces the number of copies made around nalgebra
conversions. In places where we use ndarray, faer, or fixed-size arrays
and/or slices, the conversion between the types was typically done by
copying. In some places the copying is desirable, such as going from a
dynamic container type like an owned ndarray Array2 or faer Mat to a
fixed-size type like Matrix2 or Matrix4. In other places, especially
going from nalgebra -> ndarray or faer, we should be able to do the
conversion without any copying. This commit makes these changes in the
following ways:

- Going from nalgebra to ndarray, we create an ArrayView2 pointing to the
  underlying Matrix storage.
- Going from nalgebra to faer, this calls the ArrayView2 generator and
  then uses faer-ext to create a MatRef from the ArrayView2.
- The `static [[Complex64; 2]]` and `static [[Complex64; 4]]` usage is
  replaced with direct `static Matrix2<Complex64>` and
  `static Matrix4<Complex64>`. This enables directly using the statics
  in combination with dynamically created matrices.

The conversion from faer and ndarray to owned matrix types

As follow-ons to this commit we should look at places where we are using
faer (or to a lesser extent ndarray) and see if wrapping the return from
faer functions in a dynamic MatrixView instead of an owned Matrix4 or
Matrix2 makes sense. For example, if the result is only used in
computation with nalgebra types and never stored, then we don't need an
owned Matrix object, and doing faer -> MatrixView with no copies is
potentially faster (see Qiskit#15059 for a conversion function doing that).
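
The zero-copy idea in the commit message above can be sketched with the standard library alone. This is not the nalgebra, ndarray, or faer API: `Mat4` and `View2` are hypothetical stand-ins, and `f64` replaces `Complex64` to keep the example dependency-free. The point is that a view only needs a borrowed slice plus strides, so a column-major fixed-size matrix can be read through it without copying any elements.

```rust
/// Hypothetical stand-in for a fixed-size 4x4 matrix stored column-major
/// (the layout nalgebra uses for its matrices).
struct Mat4 {
    data: [f64; 16],
}

/// Hypothetical borrowed view: a slice plus strides, with no owned storage.
struct View2<'a> {
    data: &'a [f64],
    rows: usize,
    cols: usize,
    row_stride: usize,
    col_stride: usize,
}

impl View2<'_> {
    /// Read element (r, c) by applying the strides to the borrowed slice.
    fn get(&self, r: usize, c: usize) -> f64 {
        assert!(r < self.rows && c < self.cols, "index out of bounds");
        self.data[r * self.row_stride + c * self.col_stride]
    }
}

impl Mat4 {
    /// Build from row-wise input for readability; the storage is column-major.
    fn from_rows(rows: [[f64; 4]; 4]) -> Self {
        let mut data = [0.0; 16];
        for (r, row) in rows.iter().enumerate() {
            for (c, value) in row.iter().enumerate() {
                data[c * 4 + r] = *value;
            }
        }
        Mat4 { data }
    }

    /// Zero-copy conversion: the view borrows the matrix storage directly.
    fn view(&self) -> View2<'_> {
        View2 {
            data: &self.data,
            rows: 4,
            cols: 4,
            row_stride: 1, // consecutive rows are adjacent in memory
            col_stride: 4, // consecutive columns are 4 elements apart
        }
    }
}

fn main() {
    let m = Mat4::from_rows([
        [0.0, 1.0, 2.0, 3.0],
        [4.0, 5.0, 6.0, 7.0],
        [8.0, 9.0, 10.0, 11.0],
        [12.0, 13.0, 14.0, 15.0],
    ]);
    let v = m.view();
    // The view reads the same element the matrix holds at row 1, column 2.
    assert_eq!(v.get(1, 2), 6.0);
    // The view points at the matrix's own buffer, so nothing was copied.
    assert!(std::ptr::eq(v.data.as_ptr(), m.data.as_ptr()));
    println!("element (1, 2) through the view: {}", v.get(1, 2));
}
```

The lifetime on `View2` ties the view to the matrix it borrows from, which is the trade-off the commit message hints at: views avoid copies but cannot outlive or be stored independently of their source.
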
@mtreinish
Member

Thanks for getting this started. I know that this is still a draft, but I started to review it, and I couldn't help myself and pushed some small changes that I think will help move this toward a better end state. Commit 67e3e2a tries to reduce some of the copy overhead and shows some ideas for working between the three array/matrix/linear algebra libraries being used in the module now. I tried to keep the changes minimal here to not step on your toes while you're finishing this. If you don't like the commit, feel free to revert it and continue working without my interjections.

I'm running some benchmarks now. But I don't think we'll get most of the advantages of this migration until a follow-up to this PR that leverages nalgebra for the *_inner() Rust interfaces that get called by transpiler passes like UnitarySynthesis. That way we can basically collect UnitaryGates into a Matrix4 from consolidate blocks and keep it all as a fixed-size nalgebra matrix without any copying. We shouldn't try to do this as part of this PR because there will be too many moving pieces. As it is, looking through the PR, we should probably split two_qubit_decompose.rs out into a directory that breaks things up a bit more to make it easier to follow. Right now the file is a bit too large to keep all the context in your head at once.

@mtreinish
Member

mtreinish commented Oct 8, 2025

Hmm, it's failing CI tests. I know it passed tests locally before I pushed. I'll look at this in the morning and fix whatever I broke/got wrong. This should be fixed by: e2ac69d

@mtreinish
Member

From my local benchmark run it's showing a nice transpile speedup:

Benchmarks that have improved:

| Change   | Before [83b762d3] <two_qubit_decompose_to_nalgebra~1^2>   | After [67e3e2a4] <two_qubit_decompose_to_nalgebra>   |   Ratio | Benchmark (Parameter)                                                                |
|----------|-----------------------------------------------------------|------------------------------------------------------|---------|--------------------------------------------------------------------------------------|
| -        | 74.3±1ms                                                  | 67.3±0.8ms                                           |    0.91 | utility_scale.UtilityScaleBenchmarks.time_square_heisenberg('cx')                    |
| -        | 135±3ms                                                   | 121±1ms                                              |    0.9  | transpiler_levels.TranspilerLevelBenchmarks.time_quantum_volume_transpile_50_x_20(2) |
| -        | 22.5±2ms                                                  | 20.3±0.5ms                                           |    0.9  | transpiler_levels.TranspilerLevelBenchmarks.time_transpile_qv_14_x_14(1)             |
| -        | 615±40ms                                                  | 552±7ms                                              |    0.9  | utility_scale.UtilityScaleBenchmarks.time_qft('cz')                                  |
| -        | 468±30ms                                                  | 420±8ms                                              |    0.9  | utility_scale.UtilityScaleBenchmarks.time_qv('cx')                                   |
| -        | 14.7±0.7ms                                                | 12.7±0.2ms                                           |    0.87 | transpiler_levels.TranspilerLevelBenchmarks.time_transpile_qv_14_x_14(0)             |
| -        | 40.5±1ms                                                  | 35.2±0.9ms                                           |    0.87 | transpiler_levels.TranspilerLevelBenchmarks.time_transpile_qv_14_x_14(2)             |
| -        | 353±20ms                                                  | 302±4ms                                              |    0.86 | utility_scale.UtilityScaleBenchmarks.time_qaoa('cz')                                 |
| -        | 33.8±0.6ms                                                | 28.7±0.6ms                                           |    0.85 | transpiler_levels.TranspilerLevelBenchmarks.time_quantum_volume_transpile_50_x_20(0) |
| -        | 602±20ms                                                  | 513±10ms                                             |    0.85 | utility_scale.UtilityScaleBenchmarks.time_qv('ecr')                                  |
| -        | 49.9±3ms                                                  | 40.9±1ms                                             |    0.82 | transpiler_levels.TranspilerLevelBenchmarks.time_transpile_qv_14_x_14(3)             |
| -        | 613±20ms                                                  | 489±2ms                                              |    0.8  | utility_scale.UtilityScaleBenchmarks.time_qv('cz')                                   |

Benchmarks that have stayed the same:

| Change   | Before [83b762d3] <two_qubit_decompose_to_nalgebra~1^2>   | After [67e3e2a4] <two_qubit_decompose_to_nalgebra>   | Ratio   | Benchmark (Parameter)                                                                                  |
|----------|-----------------------------------------------------------|------------------------------------------------------|---------|--------------------------------------------------------------------------------------------------------|
|          | 9.95±0.6ms                                                | 11.0±0.9ms                                           | ~1.11   | transpiler_levels.TranspilerLevelBenchmarks.time_transpile_from_large_qasm(2)                          |
|          | 66.0±0.7ms                                                | 59.7±1ms                                             | ~0.90   | transpiler_levels.TranspilerLevelBenchmarks.time_quantum_volume_transpile_50_x_20(1)                   |
|          | 6.80±0.3s                                                 | 6.07±0.04s                                           | ~0.89   | utility_scale.UtilityScaleBenchmarks.time_circSU2('cx')                                                |
|          | 6.70±0.3s                                                 | 5.89±0.09s                                           | ~0.88   | utility_scale.UtilityScaleBenchmarks.time_circSU2('cz')                                                |
|          | 0                                                         | 0                                                    | n/a     | utility_scale.UtilityScaleBenchmarks.track_bvlike_depth('cx')                                          |
|          | 0                                                         | 0                                                    | n/a     | utility_scale.UtilityScaleBenchmarks.track_bvlike_depth('cz')                                          |
|          | 0                                                         | 0                                                    | n/a     | utility_scale.UtilityScaleBenchmarks.track_bvlike_depth('ecr')                                         |
|          | 12.5±0.2ms                                                | 13.5±2ms                                             | 1.08    | transpiler_levels.TranspilerLevelBenchmarks.time_transpile_from_large_qasm(3)                          |
|          | 26.0±0.7ms                                                | 27.7±2ms                                             | 1.06    | transpiler_levels.TranspilerLevelBenchmarks.time_transpile_from_large_qasm_backend_with_prop(0)        |
|          | 10.9±0.2ms                                                | 11.4±0.3ms                                           | 1.05    | transpiler_levels.TranspilerLevelBenchmarks.time_transpile_from_large_qasm(1)                          |
|          | 28.6±1ms                                                  | 29.5±1ms                                             | 1.03    | utility_scale.UtilityScaleBenchmarks.time_parse_square_heisenberg_n100('ecr')                          |
|          | 72.0±1ms                                                  | 72.8±2ms                                             | 1.01    | transpiler_levels.TranspilerLevelBenchmarks.time_schedule_qv_14_x_14(1)                                |
|          | 7.12±0.2ms                                                | 7.21±0.06ms                                          | 1.01    | transpiler_levels.TranspilerLevelBenchmarks.time_transpile_from_large_qasm(0)                          |
|          | 8.50±1ms                                                  | 8.62±0.3ms                                           | 1.01    | transpiler_levels.TranspilerLevelBenchmarks.time_transpile_from_large_qasm_backend_with_prop(3)        |
|          | 27.1±0.8ms                                                | 27.3±0.07ms                                          | 1.01    | utility_scale.UtilityScaleBenchmarks.time_parse_square_heisenberg_n100('cz')                           |
|          | 84.9±9ms                                                  | 84.8±3ms                                             | 1.00    | transpiler_levels.TranspilerLevelBenchmarks.time_schedule_qv_14_x_14(0)                                |
|          | 7.86±0.5ms                                                | 7.85±0.2ms                                           | 1.00    | transpiler_levels.TranspilerLevelBenchmarks.time_transpile_from_large_qasm_backend_with_prop(2)        |
|          | 1429                                                      | 1429                                                 | 1.00    | transpiler_levels.TranspilerLevelBenchmarks.track_depth_quantum_volume_transpile_50_x_20(0)            |
|          | 1316                                                      | 1316                                                 | 1.00    | transpiler_levels.TranspilerLevelBenchmarks.track_depth_quantum_volume_transpile_50_x_20(1)            |
|          | 1174                                                      | 1174                                                 | 1.00    | transpiler_levels.TranspilerLevelBenchmarks.track_depth_quantum_volume_transpile_50_x_20(2)            |
|          | 1213                                                      | 1213                                                 | 1.00    | transpiler_levels.TranspilerLevelBenchmarks.track_depth_quantum_volume_transpile_50_x_20(3)            |
|          | 2705                                                      | 2705                                                 | 1.00    | transpiler_levels.TranspilerLevelBenchmarks.track_depth_transpile_from_large_qasm(0)                   |
|          | 2005                                                      | 2005                                                 | 1.00    | transpiler_levels.TranspilerLevelBenchmarks.track_depth_transpile_from_large_qasm(1)                   |
|          | 7                                                         | 7                                                    | 1.00    | transpiler_levels.TranspilerLevelBenchmarks.track_depth_transpile_from_large_qasm(2)                   |
|          | 7                                                         | 7                                                    | 1.00    | transpiler_levels.TranspilerLevelBenchmarks.track_depth_transpile_from_large_qasm(3)                   |
|          | 11117                                                     | 11117                                                | 1.00    | transpiler_levels.TranspilerLevelBenchmarks.track_depth_transpile_from_large_qasm_backend_with_prop(0) |
|          | 5015                                                      | 5015                                                 | 1.00    | transpiler_levels.TranspilerLevelBenchmarks.track_depth_transpile_from_large_qasm_backend_with_prop(1) |
|          | 16                                                        | 16                                                   | 1.00    | transpiler_levels.TranspilerLevelBenchmarks.track_depth_transpile_from_large_qasm_backend_with_prop(2) |
|          | 16                                                        | 16                                                   | 1.00    | transpiler_levels.TranspilerLevelBenchmarks.track_depth_transpile_from_large_qasm_backend_with_prop(3) |
|          | 1035                                                      | 1035                                                 | 1.00    | transpiler_levels.TranspilerLevelBenchmarks.track_depth_transpile_qv_14_x_14(0)                        |
|          | 817                                                       | 817                                                  | 1.00    | transpiler_levels.TranspilerLevelBenchmarks.track_depth_transpile_qv_14_x_14(1)                        |
|          | 615                                                       | 615                                                  | 1.00    | transpiler_levels.TranspilerLevelBenchmarks.track_depth_transpile_qv_14_x_14(2)                        |
|          | 634                                                       | 634                                                  | 1.00    | transpiler_levels.TranspilerLevelBenchmarks.track_depth_transpile_qv_14_x_14(3)                        |
|          | 8.44±0.4ms                                                | 8.45±0.3ms                                           | 1.00    | utility_scale.UtilityScaleBenchmarks.time_parse_qaoa_n100('cz')                                        |
|          | 84.3±2ms                                                  | 84.3±0.8ms                                           | 1.00    | utility_scale.UtilityScaleBenchmarks.time_parse_qft_n100('cx')                                         |
|          | 400                                                       | 400                                                  | 1.00    | utility_scale.UtilityScaleBenchmarks.track_bv_100_depth('cx')                                          |
|          | 400                                                       | 400                                                  | 1.00    | utility_scale.UtilityScaleBenchmarks.track_bv_100_depth('cz')                                          |
|          | 400                                                       | 400                                                  | 1.00    | utility_scale.UtilityScaleBenchmarks.track_bv_100_depth('ecr')                                         |
|          | 300                                                       | 300                                                  | 1.00    | utility_scale.UtilityScaleBenchmarks.track_circSU2_depth('cx')                                         |
|          | 300                                                       | 300                                                  | 1.00    | utility_scale.UtilityScaleBenchmarks.track_circSU2_depth('cz')                                         |
|          | 300                                                       | 300                                                  | 1.00    | utility_scale.UtilityScaleBenchmarks.track_circSU2_depth('ecr')                                        |
|          | 1590                                                      | 1590                                                 | 1.00    | utility_scale.UtilityScaleBenchmarks.track_qaoa_depth('cx')                                            |
|          | 1603                                                      | 1603                                                 | 1.00    | utility_scale.UtilityScaleBenchmarks.track_qaoa_depth('cz')                                            |
|          | 1603                                                      | 1603                                                 | 1.00    | utility_scale.UtilityScaleBenchmarks.track_qaoa_depth('ecr')                                           |
|          | 2692                                                      | 2692                                                 | 1.00    | utility_scale.UtilityScaleBenchmarks.track_qft_depth('cx')                                             |
|          | 2744                                                      | 2744                                                 | 1.00    | utility_scale.UtilityScaleBenchmarks.track_qft_depth('cz')                                             |
|          | 2744                                                      | 2744                                                 | 1.00    | utility_scale.UtilityScaleBenchmarks.track_qft_depth('ecr')                                            |
|          | 2571                                                      | 2571                                                 | 1.00    | utility_scale.UtilityScaleBenchmarks.track_qv_depth('cx')                                              |
|          | 2571                                                      | 2571                                                 | 1.00    | utility_scale.UtilityScaleBenchmarks.track_qv_depth('cz')                                              |
|          | 2571                                                      | 2571                                                 | 1.00    | utility_scale.UtilityScaleBenchmarks.track_qv_depth('ecr')                                             |
|          | 480                                                       | 480                                                  | 1.00    | utility_scale.UtilityScaleBenchmarks.track_square_heisenberg_depth('cx')                               |
|          | 480                                                       | 480                                                  | 1.00    | utility_scale.UtilityScaleBenchmarks.track_square_heisenberg_depth('cz')                               |
|          | 480                                                       | 480                                                  | 1.00    | utility_scale.UtilityScaleBenchmarks.track_square_heisenberg_depth('ecr')                              |
|          | 5.77±0.2ms                                                | 5.71±0.2ms                                           | 0.99    | utility_scale.UtilityScaleBenchmarks.time_bvlike('cx')                                                 |
|          | 6.57±0.1s                                                 | 6.49±0.1s                                            | 0.99    | utility_scale.UtilityScaleBenchmarks.time_circSU2('ecr')                                               |
|          | 7.77±0.2ms                                                | 7.72±0.06ms                                          | 0.99    | utility_scale.UtilityScaleBenchmarks.time_parse_qaoa_n100('cx')                                        |
|          | 34.1±0.8ms                                                | 33.3±3ms                                             | 0.98    | transpiler_levels.TranspilerLevelBenchmarks.time_transpile_from_large_qasm_backend_with_prop(1)        |
|          | 77.8±3ms                                                  | 75.3±0.8ms                                           | 0.97    | utility_scale.UtilityScaleBenchmarks.time_bv_100('cx')                                                 |
|          | 86.6±2ms                                                  | 84.0±2ms                                             | 0.97    | utility_scale.UtilityScaleBenchmarks.time_bv_100('cz')                                                 |
|          | 6.25±0.1ms                                                | 6.04±0.3ms                                           | 0.97    | utility_scale.UtilityScaleBenchmarks.time_bvlike('cz')                                                 |
|          | 87.1±3ms                                                  | 84.3±1ms                                             | 0.97    | utility_scale.UtilityScaleBenchmarks.time_parse_qft_n100('ecr')                                        |
|          | 87.9±2ms                                                  | 84.4±0.4ms                                           | 0.96    | utility_scale.UtilityScaleBenchmarks.time_parse_qft_n100('cz')                                         |
|          | 67.4±2ms                                                  | 64.0±0.6ms                                           | 0.95    | utility_scale.UtilityScaleBenchmarks.time_bv_100('ecr')                                                |
|          | 5.87±0.2ms                                                | 5.60±0.04ms                                          | 0.95    | utility_scale.UtilityScaleBenchmarks.time_bvlike('ecr')                                                |
|          | 218±9ms                                                   | 206±4ms                                              | 0.95    | utility_scale.UtilityScaleBenchmarks.time_qaoa('ecr')                                                  |
|          | 170±6ms                                                   | 160±7ms                                              | 0.94    | utility_scale.UtilityScaleBenchmarks.time_qaoa('cx')                                                   |
|          | 87.4±5ms                                                  | 82.5±0.9ms                                           | 0.94    | utility_scale.UtilityScaleBenchmarks.time_square_heisenberg('ecr')                                     |
|          | 176±4ms                                                   | 163±2ms                                              | 0.93    | transpiler_levels.TranspilerLevelBenchmarks.time_quantum_volume_transpile_50_x_20(3)                   |
|          | 8.31±0.4ms                                                | 7.77±0.1ms                                           | 0.93    | utility_scale.UtilityScaleBenchmarks.time_parse_qaoa_n100('ecr')                                       |
|          | 29.1±0.7ms                                                | 27.2±0.06ms                                          | 0.93    | utility_scale.UtilityScaleBenchmarks.time_parse_square_heisenberg_n100('cx')                           |
|          | 298±10ms                                                  | 278±6ms                                              | 0.93    | utility_scale.UtilityScaleBenchmarks.time_qft('cx')                                                    |
|          | 425±20ms                                                  | 396±6ms                                              | 0.93    | utility_scale.UtilityScaleBenchmarks.time_qft('ecr')                                                   |
|          | 115±4ms                                                   | 106±1ms                                              | 0.93    | utility_scale.UtilityScaleBenchmarks.time_square_heisenberg('cz')                                      |

SOME BENCHMARKS HAVE CHANGED SIGNIFICANTLY.
PERFORMANCE INCREASED.

In the previous commit the internal logic for the conversion from faer
Mat to an owned Matrix4 was updated to avoid the overhead of checking
the matrix shape on each execution by using a debug_assert. The function
is only used internally with our generated matrices, which should always
be 4x4 based on how it's called, so the shape check is needless
overhead. However, when the logic was moved from an if check with an
error returned to a debug_assert, the condition wasn't updated
accordingly. This caused debug builds to fail the assertion when the
matrix was 4x4. This commit fixes this oversight and fixes the
conversion function in debug mode.
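
The pitfall described in the commit message above can be illustrated with a small sketch. The names here (`DynMat`, `to_fixed_checked`, `to_fixed_unchecked`, `copy16`) are hypothetical and are not the functions in the PR. The buggy condition shown in the comment is only one plausible shape of the oversight, since the thread does not quote the original code. An `if` guard tests for the failure case, while `debug_assert!` must state the success case, so the condition has to be negated when porting.

```rust
/// Hypothetical stand-in for a dynamically sized matrix such as faer's Mat.
struct DynMat {
    rows: usize,
    cols: usize,
    data: Vec<f64>, // column-major, assumed to hold rows * cols elements
}

/// Copy the first 16 elements into a fixed-size buffer.
fn copy16(m: &DynMat) -> [f64; 16] {
    let mut out = [0.0; 16];
    out.copy_from_slice(&m.data[..16]);
    out
}

/// Checked style: the `if` tests the *error* condition and returns early.
fn to_fixed_checked(m: &DynMat) -> Result<[f64; 16], String> {
    if m.rows != 4 || m.cols != 4 {
        return Err(format!("expected 4x4, got {}x{}", m.rows, m.cols));
    }
    Ok(copy16(m))
}

/// Assert style: the condition states what must be *true* for valid input.
fn to_fixed_unchecked(m: &DynMat) -> [f64; 16] {
    // A careless port would keep the error condition verbatim:
    //     debug_assert!(m.rows != 4 || m.cols != 4);
    // which panics in debug builds exactly when the input is a valid 4x4.
    debug_assert!(m.rows == 4 && m.cols == 4, "expected 4x4 input");
    copy16(m)
}

fn main() {
    let good = DynMat {
        rows: 4,
        cols: 4,
        data: (0..16).map(|i| i as f64).collect(),
    };
    // Both paths agree on valid input, in debug and release builds.
    assert_eq!(to_fixed_checked(&good).unwrap(), to_fixed_unchecked(&good));

    let bad = DynMat { rows: 2, cols: 2, data: vec![0.0; 4] };
    // Only the checked path is safe to call with a wrong shape.
    assert!(to_fixed_checked(&bad).is_err());
    println!("shape checks behave as expected");
}
```

Because `debug_assert!` compiles to nothing in release builds, an inverted condition like this passes release testing and only surfaces in debug builds, which matches the local-pass, CI-fail pattern reported earlier in the thread.
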
@jakelishman
Member

Is this PR still looking to merge, or did it get overtaken by other linear-algebra performance PRs?

@mtreinish
Member

Yeah, we can close this. It's been superseded by #15928 and #15960. The follow-on work I outlined in #15017 (comment) still needs to be done to take full advantage of the changes, but that will come after #15960 merges.

@mtreinish mtreinish closed this Apr 15, 2026
@github-project-automation github-project-automation Bot moved this from Ready to Done in Qiskit 2.5 Apr 15, 2026

Labels

None yet

Projects

Status: Done

Development

Successfully merging this pull request may close these issues.

Migrate linear algebra code from faer-rs to nalgebra

7 participants