
Add new multithreaded TwoQubitPeepholeOptimization pass #13419

Open
mtreinish wants to merge 86 commits into Qiskit:main from mtreinish:two-qubit-peephole-parallel-pass

Conversation

@mtreinish
Member

@mtreinish mtreinish commented Nov 10, 2024

Summary

This commit adds a new transpiler pass for physical optimization,
TwoQubitPeepholeOptimization. This replaces the use of Collect2qBlocks,
ConsolidateBlocks, and UnitarySynthesis in the optimization stage for
a default pass manager setup. The pass logically works the same way:
it analyzes the DAG to get a list of 2q runs, calculates the matrix
of each run, and then synthesizes the matrix and substitutes it in place.
The distinction is that this pass does all of this in a single
pass, and it also parallelizes the matrix calculation and synthesis steps
because there is no data dependency between runs.
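The flow described above can be sketched in pure Python (the real implementation is in Rust; `collect_2q_runs`, `synthesize_run`, and the tuple-based "gate" representation here are hypothetical placeholders, not Qiskit API):

```python
from concurrent.futures import ThreadPoolExecutor

def collect_2q_runs(dag):
    """Group consecutive gates acting on the same qubit pair into runs.
    A "gate" here is just a (name, qubits) tuple for illustration."""
    runs, current, current_qubits = [], [], None
    for gate in dag:
        _, qubits = gate
        if qubits == current_qubits:
            current.append(gate)
        else:
            if current:
                runs.append(current)
            current, current_qubits = [gate], qubits
    if current:
        runs.append(current)
    return runs

def synthesize_run(run):
    """Placeholder for: multiply the run's gate matrices, then synthesize
    a replacement with the two-qubit decomposers. Here we just collapse
    the run to a single stand-in gate."""
    return [("synth", run[0][1])]

def peephole(dag):
    runs = collect_2q_runs(dag)
    # Matrix calculation and synthesis have no data dependency between
    # runs, so each run can be processed in parallel.
    with ThreadPoolExecutor() as pool:
        replacements = list(pool.map(synthesize_run, runs))
    out = []
    for repl in replacements:
        out.extend(repl)
    return out
```

The key point the sketch shows is that collection, matrix calculation, synthesis, and substitution happen in one traversal rather than three separate passes.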

This new pass is not meant to fully replace the Collect2qBlocks,
ConsolidateBlocks, or UnitarySynthesis passes as those also run in
contexts where we don't have a physical circuit. This is meant instead
to replace their usage in the optimization stage only. Accordingly this
new pass also changes the logic on how we select the synthesis to use
and when to make a substitution. Previously this logic was primarily done
via the ConsolidateBlocks pass by only consolidating to a UnitaryGate if
the number of basis gates needed based on the Weyl chamber coordinates
was less than the number of 2q gates in the block (see #11659 for
discussion on this). Since this new pass skips the explicit
consolidation stage, we go ahead and try all the available synthesizers.

Right now this commit has a number of limitations, the largest are:

  • Only supports the target
  • It doesn't support the XX decomposer because it's not in Rust (the TwoQubitBasisDecomposer and TwoQubitControlledUDecomposer are used)

This pass doesn't support using the unitary synthesis plugin interface, since
it's optimized to use Qiskit's built-in two qubit synthesis routines written in
Rust. The existing combination of ConsolidateBlocks and UnitarySynthesis
should be used instead if the plugin interface is necessary.

Details and comments

Fixes #12007
Fixes #11659

TODO:

@mtreinish mtreinish added performance Changelog: Added Add an "Added" entry in the GitHub Release changelog. Rust This PR or issue is related to Rust code in the repository mod: transpiler Issues and PRs related to Transpiler labels Nov 10, 2024
@mtreinish mtreinish added this to the 2.0.0 milestone Nov 10, 2024
@coveralls

coveralls commented Nov 10, 2024

Pull Request Test Coverage Report for Build 15654439601

Details

  • 576 of 611 (94.27%) changed or added relevant lines in 12 files are covered.
  • 22 unchanged lines in 4 files lost coverage.
  • Overall coverage increased (+0.02%) to 88.02%

Changes Missing Coverage Covered Lines Changed/Added Lines %
crates/synthesis/src/two_qubit_decompose.rs 16 17 94.12%
crates/transpiler/src/passes/unitary_synthesis.rs 12 13 92.31%
crates/transpiler/src/passes/two_qubit_unitary_synthesis_utils.rs 133 139 95.68%
crates/transpiler/src/passes/two_qubit_peephole.rs 376 403 93.3%
Files with Coverage Reduction New Missed Lines %
crates/qasm2/src/lex.rs 2 92.48%
crates/transpiler/src/passes/unitary_synthesis.rs 2 94.33%
crates/circuit/src/symbol_expr.rs 6 73.69%
crates/qasm2/src/parse.rs 12 96.68%
Totals Coverage Status
Change from base Build 15635869582: 0.02%
Covered Lines: 83570
Relevant Lines: 94944

💛 - Coveralls

Right now this commit has a number of limitations, the largest are:

- Only supports the target
- It doesn't support any synthesizers besides the TwoQubitBasisDecomposer,
  because it's the only one in Rust currently.

For plugin handling I left the logic as running the three pass series,
but I'm not sure this is the behavior we want. We could keep the
synthesis plugins for `UnitarySynthesis` only and then rely on our
built-in methods for physical optimization only. But this also seems less
than ideal because the plugin mechanism is how we support synthesizing
to custom basis gates, and also more advanced approximate synthesis
methods. Both of those are things we need to do as part of the synthesis
here.

Additionally, this is currently missing tests and documentation, and while
running it manually "works" in that it returns a circuit that looks valid,
I've not done any validation yet. This also likely will need several
rounds of performance optimization and tuning. At this point this is
just a rough proof of concept and will need a lot of refinement along with
larger changes to Qiskit's Rust code before this is ready to merge.

Fixes Qiskit#12007
Fixes Qiskit#11659
Since Qiskit#13139 merged we have another two qubit decomposer available to
run in rust, the TwoQubitControlledUDecomposer. This commit updates the
new TwoQubitPeepholeOptimization to call this decomposer if the target
supports appropriate 2q gates.
Clippy is correctly warning that the size difference between the two
decomposer types in the TwoQubitDecomposer enum is large.
TwoQubitBasisDecomposer is 1640 bytes while TwoQubitControlledUDecomposer
is only 24 bytes. This means each ControlledU element wastes
> 1600 bytes. However, in this case that is acceptable in order to
avoid a layer of pointer indirection as these are stored temporarily
in a vec inside a thread to decompose a unitary. A trait would be a more
natural way to define a common interface between all the two qubit
decomposers, but since we keep them instantiated for each edge in a Vec
they need to be sized, and working around that with something like
`Box<dyn TwoQubitDecomposer>` (assuming a trait `TwoQubitDecomposer`
instead of an enum) would have additional runtime
overhead. This is also considering that TwoQubitControlledUDecomposer
is far less likely to be used in practice, as it only works with targets
that have RZZ, RXX, RYY, or RZX gates on an edge, which is less common.
Also don't run scoring more than needed.
@ShellyGarion
Member

Copy here the comment of @t-imamichi #13568 (comment)
and my reply: #13568 (comment)

I think this closes #13428. How about adding a test case of consecutive RZZ (RXX, and RYY) gates?

We should make sure that after PR #13568 and this PR will be merged, we can efficiently transpile circuits into basis fractional RZZ gates .

@mtreinish
Member Author

I added support for using the ControlledUDecomposer to the new pass back in early December with this commit: 746758f although looking at that now with fresh eyes I need to check that the gate is continuous in the target, right now it only looks at the supported gate types.

@ShellyGarion ShellyGarion self-assigned this Feb 3, 2025
We don't want to spend time reconstructing an exact copy of the dag if
there are no substitutions needed. Prior to using a vec for tracking the
run indices that nodes are part of, we would check if that map was empty.
The vec is always populated, and to determine if there are no entries
we'd have to do a worst-case O(n) lookup to determine if any entries are
set. To avoid that overhead while keeping the check, this adds an atomic
bool that is used to track whether we've substituted any blocks. If this
is not set to true we can just exit early since there are no
substitutions to make.
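The early-exit described in this commit message can be sketched in Python (the real code uses a Rust AtomicBool across rayon threads; `apply_substitutions` and `try_substitute` are hypothetical names):

```python
from concurrent.futures import ThreadPoolExecutor
import threading

def apply_substitutions(dag, blocks, try_substitute):
    """Rebuild the circuit only if at least one block was replaced.

    `try_substitute` returns a replacement or None. A shared flag (the
    Python analogue of the Rust atomic bool) records whether any worker
    found a substitution, so the expensive reconstruction is skipped
    entirely when nothing changed.
    """
    modified = threading.Event()

    def worker(block):
        repl = try_substitute(block)
        if repl is not None:
            modified.set()  # at least one substitution happened
        return repl

    with ThreadPoolExecutor() as pool:
        results = list(pool.map(worker, blocks))
    if not modified.is_set():
        return dag  # early exit: return the input unchanged
    return [r if r is not None else b for b, r in zip(blocks, results)]
```

Checking a single flag replaces the worst-case O(n) scan over the run-index vec described above.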
@mtreinish
Member Author

I think this is almost ready; the only missing piece is that I want to figure out a good block count to use for the parallel threshold. There is overhead associated with launching the thread pool, and for circuits with few blocks to check/optimize that overhead can be higher than the runtime of the pass when run serially. We'll need to do some benchmarking to figure out where the crossover point is and then add that to the pass.
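The serial-versus-parallel dispatch being discussed is a simple guard; a sketch (the threshold value 50 is an arbitrary placeholder, not the benchmarked number the comment asks for):

```python
from concurrent.futures import ThreadPoolExecutor

def run_blocks(blocks, synth, parallel_threshold=50):
    """Process blocks serially below a block-count threshold, to avoid
    paying thread-pool launch overhead on small circuits.

    `parallel_threshold` is a placeholder value; the comment above notes
    it should be chosen by benchmarking the crossover point.
    """
    if len(blocks) < parallel_threshold:
        return [synth(b) for b in blocks]       # serial: no pool overhead
    with ThreadPoolExecutor() as pool:          # parallel: amortized overhead
        return list(pool.map(synth, blocks))
```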

@ShellyGarion
Member

ShellyGarion commented Apr 5, 2026

  1. Since the two-qubit decomposers tend to add some redundant one-qubit gates, I think that this pass should generally be followed by a one-qubit optimization pass (in the relevant optimization levels), and perhaps it makes sense to benchmark the gate counts of both passes together.
    An even better optimization may be to add the euler_one_qubit_decomposer pass before calculating the score (this can be done in a follow-up PR).

  2. The two-qubit decompositions provide a canonical form inside the Weyl chamber. Hence, in order to reduce the number of one-qubit gates, a possible optimization could be to try to resynthesize both U and U.inverse() and choose the best decomposition.
    This optimization could be done in a follow-up PR, and should be benchmarked since it will increase the running time.

Here are a few examples of some circuits when using the following optimization code.

        basis_gates = ["rzz", "rx", "rz", "cz"]
        translate = PassManager(
            [
                TwoQubitPeepholeOptimization(backend.target),
                Optimize1qGatesDecomposition(basis_gates),
            ]
        )
        trans = translate.run(qc)
        print(trans)

Example 1:

        qc = QuantumCircuit(2)
        qc.rzz(0.1, 0, 1)
        qc.rzz(0.2, 0, 1)

gives:

global phase: π
                                  
q_0: ──────────■──────────────────
     ┌───────┐ │ZZ(-0.3) ┌───────┐
q_1: ┤ Rx(π) ├─■─────────┤ Rx(π) ├
     └───────┘           └───────┘

while the inverse circuit:

        qc = QuantumCircuit(2)
        qc.rzz(-0.1, 0, 1)
        qc.rzz(-0.2, 0, 1)

gives:

q_0: ─■─────────
      │ZZ(-0.3) 
q_1: ─■─────────

Example 2:

        qc = QuantumCircuit(2)
        qc.rzz(0.1, 0, 1)
        qc.cz(0, 1)

gives:

global phase: π/4
                   ┌─────────┐
q_0: ─■────────────┤ Rz(π/2) ├
      │ZZ(-1.4708) ├─────────┤
q_1: ─■────────────┤ Rz(π/2) ├
                   └─────────┘

while the inverse circuit:

        qc = QuantumCircuit(2)
        qc.cz(0, 1)
        qc.rzz(-0.1, 0, 1)

gives:

global phase: 3π/4
     ┌───────┐              ┌─────────┐ ┌───────┐
q_0: ┤ Rx(π) ├─■────────────┤ Rz(π/2) ├─┤ Rx(π) ├
     └───────┘ │ZZ(-1.4708) ├─────────┴┐└───────┘
q_1: ──────────■────────────┤ Rz(-π/2) ├─────────
                            └──────────┘         

@ShellyGarion
Member

My concern is that without performing one-qubit optimizations, the two-qubit decomposers may produce too many unnecessary one-qubit gates, that may eventually negatively affect the two-qubit gates optimizations.

Here is an example:

        qc = QuantumCircuit(2)
        qc.rzz(-0.1, 1, 0)
        qc.rzz(-0.2, 0, 1)
        basis_gates = ["rzz", "rx", "rz", "cz"]

        backend = GenericBackendV2(
            num_qubits=2, basis_gates=basis_gates,
        )
        peephole = TwoQubitPeepholeOptimization(backend.target)
        result = peephole(qc)
        print(result)
        print(result.count_ops())

gives:

     ┌──────────┐┌─────────┐┌─────────┐┌──────────┐┌──────────┐┌─────────┐»
q_0: ┤ Rz(-π/2) ├┤ Rx(π/2) ├┤ Rz(π/2) ├┤ Rz(-π/2) ├┤ Rx(-π/2) ├┤ Rz(π/2) ├»
     ├──────────┤├─────────┤├─────────┤├──────────┤├──────────┤├─────────┤»
q_1: ┤ Rz(-π/2) ├┤ Rx(π/2) ├┤ Rz(π/2) ├┤ Rz(-π/2) ├┤ Rx(-π/2) ├┤ Rz(π/2) ├»
     └──────────┘└─────────┘└─────────┘└──────────┘└──────────┘└─────────┘»
«                ┌─────────┐┌──────────┐┌──────────┐┌─────────┐┌─────────┐»
«q_0: ─■─────────┤ Rz(π/2) ├┤ Rx(-π/2) ├┤ Rz(-π/2) ├┤ Rz(π/2) ├┤ Rx(π/2) ├»
«      │ZZ(-0.3) ├─────────┤├──────────┤├──────────┤├─────────┤├─────────┤»
«q_1: ─■─────────┤ Rz(π/2) ├┤ Rx(-π/2) ├┤ Rz(-π/2) ├┤ Rz(π/2) ├┤ Rx(π/2) ├»
«                └─────────┘└──────────┘└──────────┘└─────────┘└─────────┘»
«     ┌──────────┐
«q_0: ┤ Rz(-π/2) ├
«     ├──────────┤
«q_1: ┤ Rz(-π/2) ├
«     └──────────┘

and the gate counts are {'rz': 16, 'rx': 8, 'rzz': 1}.

As we have seen above, with optimizing the one-qubit gates, one can get the gate counts of {'rzz': 1}.
However, without optimizing the one-qubit gates, if error(rx) > 1/8 * error(rzz), then this pass will not reduce the number of two-qubit rzz gates.
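The arithmetic behind that threshold can be checked with a small first-order error model (the error rates here are made-up illustrative numbers, and rz is assumed virtual and error-free):

```python
def estimated_error(counts, gate_errors):
    """First-order estimate: per-gate errors simply add."""
    return sum(n * gate_errors.get(name, 0.0) for name, n in counts.items())

# Hypothetical error rates chosen so that error(rx) > error(rzz) / 8:
gate_errors = {"rzz": 8e-3, "rx": 2e-3, "rz": 0.0}

original = {"rzz": 2}                          # the two consecutive rzz gates
synthesized = {"rz": 16, "rx": 8, "rzz": 1}    # counts from the output above

# original:    2 * 0.008             = 0.016
# synthesized: 8 * 0.002 + 1 * 0.008 = 0.024
# Without 1q optimization, the synthesized block is worse by estimated
# error despite saving one rzz, which is exactly the regime described.
```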

@mtreinish
Member Author

mtreinish commented Apr 8, 2026

Since the two-qubit decomposers tend to add some redundant one-qubit gates, I think that this pass should generally be followed by a one-qubit optimization pass (in the relevant optimization levels), and perhaps it makes sense to benchmark the gate counts of both passes together.
An even better optimization may be to add the euler_one_qubit_decomposer pass before calculating the score (this can be done in a follow-up PR).

I agree you almost always want to follow this pass with a 1q optimization pass. But the way the algorithm works, it doesn't have the full context of the circuit when it is evaluating peephole optimizations to make. Each synthesis decision is made in the isolated context of a single block. To get a benefit from the Optimize1qGatesDecomposition pass we need to put all the substitutions back together first so we can find the runs between the blocks. Right now the synthesis we do via the 2q decomposers already runs the one qubit Euler decomposition on each of the 4 one qubit components that result as part of the decomposition. We get the compatible basis sets available from the Target (this code is reused from UnitarySynthesis), so in the context that the pass is doing synthesis we are already doing the best 1q synthesis we can, and we can't really do better in this pass on its own (or at least not in a way that I can see).

The two-qubit decompositions provide a canonical form inside the Weyl chamber. Hence, in order to reduce the number of one-qubit gates, a possible optimization could be to try to resynthesize both U and U.inverse() and choose the best decomposition.
This optimization could be done in a follow-up PR, and should be benchmarked since it will increase the running time.

This should already be factored into the unitary synthesis code and will depend on your target. This pass doesn't really add new logic around this; we give the matrix and the qubits it operates on to the existing 2q synthesis code from UnitarySynthesis and it returns the synthesis results. In the examples you gave, what was the target you used? If we define cz or rzz bidirectionally with the same error rates it should prefer the synthesis that results in lower predicted error. But there is a lot of funky legacy logic around picking a direction that has baked-in assumptions about how IBM hardware worked many years ago (looking at the gate duration to find the natural direction of a bidirectional cx, for example). So I could see the logic around this needing an improvement, but I'm not sure I would want to try to do it as part of this PR.

My concern is that without performing one-qubit optimizations, the two-qubit decomposers may produce too many unnecessary one-qubit gates, that may eventually negatively affect the two-qubit gates optimizations.

There is always going to be a compromise with these kinds of optimization passes. Those tradeoffs don't change with this new pass; in particular, the combination of ConsolidateBlocks followed by UnitarySynthesis has the same issues you're highlighting. The difference is that the heuristic used in that combination of passes doesn't consider 1q gates at all; it just looks at the Weyl chamber coordinates and determines whether there is an expected decrease in the number of 2q gates. The new pass at least will pick the result with lower predicted error, and fewer gates too if the number of 2q gates is the same. This is why you typically don't run a single optimization pass in isolation and consider the job done. We need to run these passes together, as they make some aspects of the circuit better but could negatively impact others. Typically in a full workflow you want to follow TwoQubitPeepholeOptimization with Optimize1qGatesDecomposition. That's what we do in the preset pass managers right now for 2q peephole, and what we will do when we start looking at integrating this as part of the preset pass managers.

As we have seen above, with optimizing the one-qubit gates, one can get the gate counts of {'rzz': 1}.
However, without optimizing the one-qubit gates, if error(rx) > 1/8 * error(rzz), then this pass will not reduce the number of two-qubit rzz gates.

In this particular case this is actually partially covered. The estimated fidelity of the synthesis is already factored into the algorithm. However, as you point out, if the number of 2q gates decreases that takes priority over the fidelity estimate. When I originally wrote this pass I had the scoring function compare (estimated_error, num_2q_gates, num_1q_gates) between synthesis results and the original circuit. But during debugging (probably about a year ago) the 2q gate count was significantly higher and I was worried that the heuristic wasn't working as I expected, so I switched it to prioritize decreasing 2q gates. We can try tweaking the heuristic back to how I had it originally.
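Both orderings being discussed are just tuple comparisons; a sketch showing how the priority choice changes the winner (all candidate data here is illustrative, not from a real run):

```python
def score_2q_first(c):
    """Current heuristic: fewer 2q gates wins, error breaks ties."""
    return (c["num_2q_gates"], c["error"], c["num_1q_gates"])

def score_error_first(c):
    """Original heuristic: lower estimated error wins, gate counts break ties."""
    return (c["error"], c["num_2q_gates"], c["num_1q_gates"])

candidates = [
    {"name": "original", "num_2q_gates": 2, "error": 0.016, "num_1q_gates": 0},
    {"name": "synth",    "num_2q_gates": 1, "error": 0.024, "num_1q_gates": 24},
]

# Prioritizing 2q count substitutes the synthesized block; prioritizing
# error keeps the original, matching the tradeoff described above.
by_2q = min(candidates, key=score_2q_first)["name"]       # "synth"
by_error = min(candidates, key=score_error_first)["name"]  # "original"
```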

This is actually a case where I think we need to talk about benchpress (and maybe other benchmark suites too), because right now benchpress is not factoring in this kind of tradeoff. It only looks at 2q gate count and 2q depth, so the optimization the pass is doing right now, regardless of the gates' error rates, makes us look better in benchpress. Personally I'm fine doing what we think results in a better quality compilation (which should be higher estimated fidelity) if we can determine that with some certainty, even if it potentially makes us look worse in benchpress, since it's the real-world results that matter and not what a synthetic benchmark says. But this is the kind of discussion we should feed back into benchpress to improve the quality of benchmarking there.

mtreinish added a commit to mtreinish/qiskit-core that referenced this pull request Apr 9, 2026
This commit switches the topological sort function we use in transpiler
passes when reconstructing a dag from scratch. In several passes where
we typically replace or remove a large number of gates, we iterate over
the input dag in topological order and construct a copy of it, making the
alterations as we go. Right now when we do this we rely on
`DAGCircuit::topological_op_nodes`, which makes sense because it's our
built-in method for iterating over a dag's op nodes in topological
ordering. Internally this uses rustworkx's lexicographical topological
sort function with our custom sort function that maintains our desired
tie breaker using the bits of a node. However, since Qiskit#14762, where we're
asserting structural equality in passes, we don't need to use that sort
anymore for these reconstruction cases; we just need a consistent
topological sort. In optimizing Qiskit#13419 one thing that showed up in profiles
was the overhead of the lexicographical topological sort during iteration
for very large circuits. The toposort function in petgraph
has lower overhead because it doesn't need to worry about the
lexicographical tie breaker. By switching to it we can
reduce the overhead of the final sort in all these passes. In asv
benchmarking this commit makes transpiler benchmarks 2-5%
faster (although without asv flagging it as significant).
Member

@ShellyGarion ShellyGarion left a comment


I think this is an important addition to the unitary synthesis.
This is a preliminary review with some minor comments (there are still a few comments from my previous review).
I still need to look at the score calculation and the tests.

Comment thread crates/transpiler/src/passes/unitary_synthesis/mod.rs Outdated
Comment thread crates/transpiler/src/passes/unitary_synthesis/mod.rs Outdated
Comment thread crates/synthesis/src/two_qubit_decompose/basis_decomposer.rs
Member

@ShellyGarion ShellyGarion left a comment


Another question: how does this new transpiler pass appear in the default optimization levels?
In some places you mention that it should replace UnitarySynthesis and ConsolidateBlocks, but should it be added to the default optimization levels?

Comment thread releasenotes/notes/two_qubit_peephole-de6d3438ed7df6a9.yaml
and :class:`.TwoQubitControlledUDecomposer` for synthesizing the two qubit unitaries.
You should not use this pass if you need the pluggable interface and the ability
to use different synthesis algorithms; instead you should use a combination of
:class:`.ConsolidateBlocks` and :class:`.UnitarySynthesis` to use the plugin mechanism.
Member

@ShellyGarion ShellyGarion Apr 14, 2026


maybe add some of this as a note? (as there are too many details here).
also, it may be good to add some example (like you did in the release notes).

@mtreinish
Member Author

Another question: how does this new transpiler pass appear in the default optimization levels?
In some places you mention that it should replace UnitarySynthesis and ConsolidateBlocks, but should it be added to the default optimization levels?

This pass only makes sense as a physical optimization pass where we've already run layout and translation on the circuit since we need to compare the estimated fidelity of the block against the available synthesis outcomes. So this pass is designed to replace the use of ConsolidateBlocks and UnitarySynthesis in the optimization stage. Specifically here:

pre_loop = [
    ConsolidateBlocks(
        basis_gates=pass_manager_config.basis_gates,
        target=pass_manager_config.target,
        approximation_degree=pass_manager_config.approximation_degree,
    ),
    UnitarySynthesis(
        pass_manager_config.basis_gates,
        approximation_degree=pass_manager_config.approximation_degree,
        coupling_map=pass_manager_config.coupling_map,
        method=pass_manager_config.unitary_synthesis_method,
        plugin_config=pass_manager_config.unitary_synthesis_plugin_config,
        target=pass_manager_config.target,
    ),

and

ConsolidateBlocks(
    basis_gates=pass_manager_config.basis_gates,
    target=pass_manager_config.target,
    approximation_degree=pass_manager_config.approximation_degree,
),
UnitarySynthesis(
    pass_manager_config.basis_gates,
    approximation_degree=pass_manager_config.approximation_degree,
    coupling_map=pass_manager_config.coupling_map,
    method=pass_manager_config.unitary_synthesis_method,
    plugin_config=pass_manager_config.unitary_synthesis_plugin_config,
    target=pass_manager_config.target,
),

we'd replace the usage in those 2 places with this new pass. This is for level 2 and 3 respectively which matches where we are currently doing this peephole optimization. But, we still would use ConsolidateBlocks in the default init stage and UnitarySynthesis in the default translation stage.

@ShellyGarion
Member

we'd replace the usage in those 2 places with this new pass. This is for level 2 and 3 respectively which matches where we are currently doing this peephole optimization. But, we still would use ConsolidateBlocks in the default init stage and UnitarySynthesis in the default translation stage.

This should be done in a follow-up PR?

@mtreinish
Member Author

This should be done in a follow-up PR?

Yes, that's the plan. This PR is big enough on its own, and we should handle it as adding a new pass without worrying about the impact on preset pass managers. Then in a follow-up PR we can experiment with adding it to our default pipeline and see what the impacts are from doing that in isolation.

@mtreinish mtreinish requested a review from ShellyGarion April 25, 2026 12:30
Previously there was a mismatch between the scoring of synthesis
results and the peephole pass's comparison with the original block. The
pass is documented as using the tuple (num_2q_gates, error, num_gates)
and picking the min of all the choices. But, when we called the unitary
synthesis function that selects the best synthesis outcome it was
maximizing the estimated fidelity but not considering the gate counts
like the pass is documented as doing. This commit corrects the mismatch by
updating the function doing the synthesis to be generic over the score type
and taking a scorer callback. This lets the peephole pass control the
heuristic used for selecting the best score.
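The refactor this commit describes is straightforward to picture in Python (the Rust version is generic over the score type; `select_best` and the candidate dicts here are hypothetical illustrations):

```python
def select_best(results, scorer):
    """Generic selection: the caller supplies the scoring callback, so the
    peephole pass and plain unitary synthesis can rank the same candidate
    list by different criteria."""
    return min(results, key=scorer)

results = [
    {"num_2q_gates": 2, "error": 0.010, "num_gates": 10},
    {"num_2q_gates": 1, "error": 0.030, "num_gates": 30},
]

# Fidelity-only scorer (the old behavior: maximize estimated fidelity,
# i.e. minimize error) versus the documented tuple scorer:
fidelity_pick = select_best(results, lambda r: r["error"])
tuple_pick = select_best(
    results, lambda r: (r["num_2q_gates"], r["error"], r["num_gates"])
)
# The two scorers pick different candidates, which is the mismatch the
# commit fixes by letting the pass inject its own scorer.
```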
Member

@ShellyGarion ShellyGarion left a comment


I'm happy to see that finally this PR is ready.
I have some minor suggestions on the tests.

Comment thread test/python/transpiler/test_two_qubit_peephole.py
from ..legacy_cmaps import YORKTOWN_CMAP


class FakeBackend2QV2(GenericBackendV2):
Member


why are this class and the following class here and not in test.python.providers?
I noticed that we have several tests in which we define these classes (like test_unitary_synthesis), and wondered if we should put these classes in a shared place?

Member Author


I think in this case that's fine and almost expected. In general, test code is a bit different from normal library code. You want test code to be explicit in what it's testing, to make it easier to debug and understand what's going on. You want the code structure to be flatter and easier to understand, because test code needs to be easy to understand and debug when it fails. That sometimes comes at the cost of some code duplication or decentralization. There's a tradeoff either way, but I think for this case it's fine.

Comment thread test/python/transpiler/test_two_qubit_peephole.py
Member

@ShellyGarion ShellyGarion left a comment


LGTM. I think it's an important contribution to Qiskit. I'm not merging it yet, since perhaps others would like to review it.


Labels

Changelog: Added Add an "Added" entry in the GitHub Release changelog. mod: transpiler Issues and PRs related to Transpiler performance Rust This PR or issue is related to Rust code in the repository

Projects

Status: In development

Development

Successfully merging this pull request may close these issues.

Add parallel synthesis interface to default unitary synthesis plugin
ConsolidateBlocks does not have a good logic for heterogeneous gates

9 participants