Skip to content

Reimplementing QSD using faer instead of nalgebra.#15874

Merged
mtreinish merged 16 commits intoQiskit:mainfrom
alexanderivrii:reimplement-qsd-using-faer
Apr 9, 2026
Merged

Reimplementing QSD using faer instead of nalgebra.#15874
mtreinish merged 16 commits intoQiskit:mainfrom
alexanderivrii:reimplement-qsd-using-faer

Conversation

@alexanderivrii
Copy link
Copy Markdown
Member

Summary

This PR reimplements quantum Shannon decomposition to use faer instead of nalgebra as faer is significantly better for involved linear algebra calculations.

Fixes #15870.

@alexanderivrii alexanderivrii added this to the 2.5.0 milestone Mar 25, 2026
@alexanderivrii alexanderivrii requested a review from a team as a code owner March 25, 2026 11:28
@alexanderivrii alexanderivrii added the Changelog: Fixed Add a "Fixed" entry in the GitHub Release changelog. label Mar 25, 2026
@qiskit-bot
Copy link
Copy Markdown
Collaborator

One or more of the following people are relevant to this code:

  • @Qiskit/terra-core

@mtreinish mtreinish self-assigned this Mar 25, 2026
@mtreinish
Copy link
Copy Markdown
Member

I'm curious is there any runtime difference between this and the previous implementation. Since this is a correctness fix it doesn't really matter since we're going to do it anyway, I'm just curious.

Copy link
Copy Markdown
Member

@ShellyGarion ShellyGarion left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thnaks for fixing this bug so quickly... I do hope that with faer we won't have the numeric issues in QSD anymore...

Comment thread crates/synthesis/src/linalg/mod.rs Outdated
Comment thread crates/synthesis/src/linalg/mod.rs Outdated
Comment thread crates/synthesis/src/linalg/mod.rs Outdated
Comment thread crates/synthesis/src/qsd.rs Outdated
Comment thread crates/synthesis/src/qsd.rs Outdated
Comment thread crates/synthesis/src/qsd.rs Outdated
Comment thread crates/synthesis/src/qsd.rs Outdated
Comment thread crates/synthesis/src/qsd.rs Outdated
Comment thread releasenotes/notes/reimplement-qsd-using-faer-6f402a8d766e5db1.yaml Outdated
Copy link
Copy Markdown
Member

@mtreinish mtreinish left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've only gotten through the linear algebra module and not the bulk of the qsd yet. But I'm about to step away so I thought I'd share my comments so far

Comment thread crates/accelerate/Cargo.toml Outdated
Comment thread crates/synthesis/src/linalg/mod.rs Outdated
Comment thread crates/synthesis/src/linalg/mod.rs Outdated
pub fn svd_decomposition_faer(
mat: MatRef<Complex64>,
) -> (Mat<Complex64>, Mat<Complex64>, Mat<Complex64>) {
let svd = mat.svd().expect("Call to faer SVD failed");
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we handle this and return a user facing error like we do for nalgebra. From what I can see in the faer docs this will be a convergence error which I'm not sure we'd want to result in a panic.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have improved the error checking as part of a4a1172, however I did not succeed to wrap errors produced by faer similarly to errors coming from python or circuit data (while there is a class FaerError, see https://docs.rs/faer/latest/faer/sparse/enum.FaerError.html, it does not seem to account for errors coming from say SVD decomposition).

d: MatRef<Complex64>,
) -> Mat<Complex64> {
let n = a.nrows();
let mut block_matrix = Mat::<Complex64>::zeros(2 * n, 2 * n);
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you need to initialize this with zero? It feels unnecessary because you should be populating every element in the matrix based on the blocks.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The only way I can see how to avoid this (keeping the current code structure) is doing something like

 let mut block_matrix = Mat::<Complex64>::with_capacity(2 * n, 2 * n);
 unsafe { block_matrix.set_dims(2 * n, 2 * n) };

However, I semi-expect that the compiler should be able to optimize this itself. I have kept the original code for now, but replaced the loops as per Shelly's suggestion. I am also not very worried about this specific code as for now it's only used in debug assertions.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was thinking something along those lines. I wasn't sure if faer's Mat type had a method for creating an uninitialized matrix directly or you had to do something like that. But it would be inside an unsafe either way. The other option would be to use from_fn and have conditions on the index ranges.

I find it unlikely that the compiler will able to optimize it away because Mat::zeros() is calling Mat::from_fn which is explicitly calling the closure that returns zero for every element in the array. But it's only a small performance difference, it's something we can explore later if this becomes a bottleneck.

@alexanderivrii
Copy link
Copy Markdown
Member Author

A quick answer on performance comparison.

Running the following simple script on my laptop (generating 10 random unitary matrix over nq qubits, running qs_decomposition, and taking the average of runtimes):

run_times = []
num_trials = 10
for nq in range(1, 11):
    total_time = 0
    for seed in range(num_trials):
        n = 2 ** nq
        mat = random_unitary(n, seed=seed).data
        time_start = time.time()
        qc = qsd.qs_decomposition(mat)
        time_end = time.time()
        total_time += time_end - time_start
    avg_time = total_time / num_trials
    run_times.append(avg_time)
print(run_times)

returns the following

main = [9.09566879272461e-05, 4.303455352783203e-05, 0.00023097991943359374, 0.0009044647216796875, 0.004224538803100586, 0.019590377807617188, 0.10117588043212891, 0.5623687744140625, 3.393336462974548, 22.304159712791442]

this_pr = [9.09566879272461e-05, 4.079341888427734e-05, 0.00023005008697509765, 0.0009874343872070313, 0.0043228864669799805, 0.018857145309448244, 0.09331314563751221, 0.47628915309906006, 2.518214392662048, 12.950119781494141]
image

Overall it seems like we have a performance improvement as well, especially for larger number of qubits (I believe it's due to translating from nalgebra to faer and back for the SVD).

ShellyGarion
ShellyGarion previously approved these changes Mar 26, 2026
Copy link
Copy Markdown
Member

@ShellyGarion ShellyGarion left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. but I'll wait with merging to see if @mtreinish has any further comments.

Copy link
Copy Markdown
Member

@mtreinish mtreinish left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall this is fairly straightforward. I left a few inline comments. The biggest thing is I don't think we need to add faer as a dependency to the transpiler crate. We should just adjust our public interface to use ndarray types, because that's (and nalgebra for 1q and 2q unitaries) actually what we use for storage in the transpiler. So it's more natural if we just move the ndarray -> faer conversion inside the qsd function (and do it without copies).

Comment thread crates/synthesis/src/linalg/mod.rs Outdated
Comment thread crates/synthesis/src/linalg/mod.rs Outdated
Comment thread crates/transpiler/src/passes/unitary_synthesis/mod.rs Outdated
Comment thread crates/synthesis/src/qsd.rs Outdated
/// Decomposed quantum circuit.
pub fn quantum_shannon_decomposition(
mat: &DMatrix<Complex64>,
mat: MatRef<Complex64>,
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I mentioned this in the transpiler module, but I think we should make this an ArrayView2, we only use qsd from either Python or the transpiler. In both contexts it's easier to work with an ndarray view, since rust-numpy gives us an ndarray array view and similarly we can run UnitaryGate::matrix_view() returns an ArrayView2. We can just make the call crate::linalg::ndarray_to_faer to get the MatRef from the ArrayView2 with no copies when we start using faer.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This makes sense - I love this! With this change, only the synthesis crates needs to be aware about faer. Does it still make sense to include the faer version in the topmost level Cargo.toml file (and set faer.workspace in the synthesis toml file), or is the older version of setting the faer version in synthesis toml directly preferable?

Comment thread crates/synthesis/src/qsd.rs Outdated
Comment thread crates/synthesis/src/qsd.rs Outdated
Comment on lines +335 to +340
let (v, sigma, w_dg) = svd_decomposition_faer(a)?;

let s = &v * &sigma * &v.adjoint();
let s = v.as_ref() * sigma * v.adjoint();
let u = v * w_dg;

(s, u)
Ok((s, u))
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The helper makes an owned copy of the svd's output matrices. When I see this I'm wondering do we even want that helper method? You can avoid these copies if you just call faer directly just like the comment I left in the linalg file yesterday. I assumed the to_owned calls in the helper function were because something needed an owned copy, but this could work with references instead.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried removing this helper, but surprisingly it did not noticeably change runtimes, even for large matrices, while making the code a bit uglier. But I ended up removing the helper function zxz_decomp_svd which also copies matrices.

Comment thread crates/synthesis/src/qsd.rs Outdated
Comment thread crates/synthesis/src/linalg/mod.rs Outdated
Comment thread crates/synthesis/src/linalg/mod.rs Outdated
Comment thread crates/synthesis/src/linalg/mod.rs Outdated
Comment thread crates/synthesis/src/linalg/mod.rs Outdated
Comment thread crates/synthesis/src/qsd.rs Outdated
@alexanderivrii alexanderivrii requested a review from mtreinish April 6, 2026 09:21
Copy link
Copy Markdown
Member

@mtreinish mtreinish left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for all the updates and work on this. I just had one small comment inline about error messages. Right now I'm a bit worried they're a bit opaque for the internal tolerance checks or the decompositions fail. After updating that that I'm ready to approve I think.

QSDError::SchurDecompositionFailed => {
QiskitError::new_err("Schur decomposition failed")
}
QSDError::ErrorFromLinAlg(err) => err.into(),
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do wonder if we want a better error message here. Right now a user calls qsd, unitary synthesis, or the transpiler and will get an exception: `QiskitError: Eigen decomposition failed" that points to the qsd rust function. I'm worried this is too opaque since this will end up potentially user facing. Maybe we should change the error message to something like, "internal eigendecomposition run as part of qsd failed. This can point to a numerical tolerance issue...."

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Improved error messages in 91f40b5.

rand_pcg.workspace = true
rand_distr.workspace = true
faer = "0.24"
faer.workspace = true
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't really need this change anymore but it doesn't hurt either, since we've agreed at this point we'll rely on faer for all the heavy linear algebra. So there is a fairly good chance we'll use it in another crate at some point.

@alexanderivrii alexanderivrii requested a review from mtreinish April 9, 2026 12:21
Copy link
Copy Markdown
Member

@mtreinish mtreinish left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is good to merge now, thanks for all your hard work on this. I'm still a bit worried about the error message text not being descriptive enough for end users. But I don't have a concrete suggestion on how to improve it right now. So if we come up with something better we can update the error message text in the future.

@mtreinish mtreinish added this pull request to the merge queue Apr 9, 2026
Merged via the queue into Qiskit:main with commit f8bd5d0 Apr 9, 2026
26 checks passed
@github-project-automation github-project-automation Bot moved this from Ready to Done in Qiskit 2.5 Apr 9, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Changelog: Fixed Add a "Fixed" entry in the GitHub Release changelog.

Projects

Status: Done

Development

Successfully merging this pull request may close these issues.

Schur decomposition fails during transpilation of a circuit built from a unitary matrix.

4 participants