Rewrite QPY ParameterExpression handling in pure Polish form#15934
Rewrite QPY ParameterExpression handling in pure Polish form#15934mtreinish merged 12 commits intoQiskit:mainfrom
ParameterExpression handling in pure Polish form#15934Conversation
This patch is a fairly invasive change, but one that fixes the known edge cases of replay generation from `ParameterExpression` itself. This fixes several `pickle`, `copy`/`deepcopy` and QPY bugs around `ParameterExpression`. The Python-space QPY loader is modified in a small way to allow it to handle QPY-format permitted "replay" elements that act between two bare values; this was always allowed by the QPY spec, but couldn't be generated by Python-space QPY/`ParameterExpression`. Doing so actually significantly simplifies the loading logic, since it no longer needs to reflect certain arithmetic operations. The new Rust-space `ParameterExpression` uses binary operations with two numeric operands (e.g. `Add(2, 0)`) to safely propagate bare values through the QPY replay. This was not previously necessary in Python space because there was no public interface to create a bare-numeric `ParameterExpression` without replayable expressions having been tracked. Similarly, cancelled-out symbols are propagated in the replay with `Sub(x, x)` operations, which already must work across all compliant QPY loaders, and both Rust- and Python-space `ParameterExpression. History ------- The QPY replay was _intended_ to be a pure Polish-notation representation of the operations to take to rebuild the expression. This mostly worked fine at the time, but the patch was written under a lot of stress and without full CI tooling, due to it being in response to a security bug. This led to some less-than-ideal parts of the format specification, such as the start/end recursion opcodes in the QPY format not actually being necessary; they are actually zero-operand no-ops to the stack, but this wasn't noticed during the patch due to time pressure. Further, as `ParameterExpression` moved to Rust, the tracked internal "replay" list disappeared, since we could now reliably walk the expression tree and issue rebuild commands directly. This generated replay, however, _only_ examined the expression and not potentially cancelled-out parameters. All together, there were several problems affecting QPY (both Rust and Python), and `ParameterExpression`'s interaction with `pickle` and the copy module, mostly relating to situations where the resulting expression had cancelled-out parameters or had degraded to be a bare value. For example, expressions like `x - x` (or `0*x + 3`, or arbitrarily complex extensions of these) are supposed to retain a reference to `x` for binding purposes, but failed to do so after pickle or QPY roundtrips. Several of these expressions failed during QPY deserialisation, as the replay stack handling failed to cope with bare values.
|
One or more of the following people are relevant to this code:
|
Gadi did much of the exploratory work of the previous commit, and wrote several of the test cases I pulled in - I forgot to include the line before. Co-authored-by: Gadi Aleksandrowicz <gadial@gmail.com>
|
Whoops, forgot to include Gadi as a co-author in the first commit. The only failure in QPY backwards-compat is the broken 2.4.0rc1; I didn't retain the compatibility handling that's needed to make that work, to simplify the |
I was misled by `Symbol::name` not returning the complete name. We should probably rename that method in a follow-up.
Cryoris
left a comment
There was a problem hiding this comment.
The approach here LGTM and seems to simplify the logic, which is a good sign. I haven't read the QPY serialization/deserialization bit in Rust (param.rs) yet, but I already wanted to submit these comments now 🙂
Using `Mul(sym, 0)` also cancels out the symbol, and doesn't require an internal clone. Co-authored-by: Julien Gacon <jul@zurich.ibm.com>
Cryoris
left a comment
There was a problem hiding this comment.
I ran out of time before the easter break but this looks pretty good to me -- if someone else could have a look at the changes to params.rs in particular I'd feel good about merging 🙂
| // This variant should not exist; see its documentation comment. We have to | ||
| // silently skip it to handle loading incorrect QPY files from Qiskit 2.0 with | ||
| // substitutions involving expressions. |
There was a problem hiding this comment.
Is there a test covering this in the QPY backwards compat tests? Or no because it was an incorrect format also in 2.0?
There was a problem hiding this comment.
One of the backwards compatibility tests (the one with ParameterExpression.subs calls in tests/qpy_compat/test_qpy.py) triggers this for the Qiskit 2.0.x series, yeah - you need to have Python-space (i.e. symengine/sympy) ParameterExpression and also the QPY replay. There's another comment elsewhere in this file at the point that we skip it all - it's safe to do this, and the Python QPY loaders effectively do the same thing (though it's a bit more obfuscated through a couple of layers).
mtreinish
left a comment
There was a problem hiding this comment.
Overall this looks good to me. Most of the logic matches my expectations on how the replay should be handled as a stack based sequence of operations. I just had a few minor inline comments. The only big one is really about the testing. I'm thinking we might want to expand the testing a bit in the qpy backwards compat suite to make sure we have sufficient coverage of parameter expression handling there.
| qc2 = QuantumCircuit(1) | ||
| theta = Parameter("θ") | ||
| rz = Parameter("rz") | ||
| exp = theta + np.pi | ||
| exp = exp.subs({theta: rz}) | ||
| exp = exp.subs({rz: theta}) | ||
| qc2.rz(exp, 0) | ||
|
|
||
| pv = ParameterVector("rz", 2) | ||
| qc3 = QuantumCircuit(1, name="subs-vector") | ||
| qc3.rz((pv[0] + 0.5).subs({pv[0]: pv[1]}), 0) | ||
|
|
||
| return [qc, qc2, qc3] |
There was a problem hiding this comment.
I'm wondering if there is more testing we want to embed here to ensure we have thorough coverage here. We've had a fair number of issues with ParameterExpression and qpy over the years (not just with this release). So it might behoove us to improve the coverage in the backwards compat suite. Even just embedding all of or unit test with examples
The other obvious thing is we need to expand the backwards compatibility test suite to cover using the version under test to generating from QPY versions starting at QPY_COMPATIBILTY_VERSION and loading from the previous major version through the most recent release to ensure we have long term coverage of this path (especially after we remove the python code). But this can be in a follow up PR.
There was a problem hiding this comment.
Yeah, there's quite a lot of better testing we ought to be doing, I think.
These tests are the ones known to have triggered bad behaviour, but we definitely should expand the backwards compatibility tests with more of the special cases (cancellations, values, etc) too.
There was a problem hiding this comment.
I added a little more testing in c042d82, but I haven't gone so far as to rewrite the test matrix to go up from the compatibility version in this PR.
| // it. Actually, anything other than a "value" that's the zero-data case (meaning | ||
| // it's just a `Parameter`/`ParameterVectorElement` definition) is completely | ||
| // meaningless; all versions of Python-space Qiskit QPY loads would immediately |
There was a problem hiding this comment.
What is "value" here? IIRC the zero-data case was a type code of "p" for a Parameter object in the map. I get the other values aren't valid in the symbol_map despite how it's documented. But I'm a bit confused by the wording here.
There was a problem hiding this comment.
This format refers to entries in a map, and value is what the documentation calls it: https://quantum.cloud.ibm.com/docs/en/api/qiskit/qpy#parameter_expr-1
| if expression_data.OP_CODE in (0, 1, 2, 3, 4, 13, 15, 18, 19, 20): | ||
| rhs = stack.pop() | ||
| lhs = stack.pop() | ||
| # Reverse ops for commutative ops, which are add, mul (0 and 2 respectively) | ||
| # op codes 13 and 15 can never be reversed and 18, 19, 20 | ||
| # are the reversed versions of non-commutative operations | ||
| # so 1, 3, 4 and 18, 19, 20 handle this explicitly. | ||
| if ( | ||
| not isinstance(lhs, ParameterExpression) | ||
| and isinstance(rhs, ParameterExpression) | ||
| and expression_data.OP_CODE in {0, 2} | ||
| ): | ||
| if expression_data.OP_CODE == 0: | ||
| method_str = "__radd__" | ||
| elif expression_data.OP_CODE == 2: | ||
| method_str = "__rmul__" | ||
| stack.append(getattr(rhs, method_str)(lhs)) | ||
| else: | ||
| stack.append(getattr(lhs, method_str)(rhs)) | ||
| if not isinstance(lhs, ParameterExpression): | ||
| lhs = ParameterExpression._Value(lhs) | ||
| stack.append(getattr(lhs, method_str)(rhs)) |
There was a problem hiding this comment.
Are these changes correcting a correctness issue or are you just trying to optimize the code a bit here? It reads to me like you're just trying to improve the efficiency of the legacy python code. Unless I'm missing there is a correctness bug in the old code that this fixes I would suggest not doing this in a separate PR to keep this concentrated on just fixing the issues with the new code.
There was a problem hiding this comment.
It's correctness: the previous form couldn't handle a valid stack operation on two immediate values if neither were a Parameter or expression. I had to fix it here to let the full cross Python/Rust test matrix work, but it's totally valid QPY, so it should be handled.
|
If the comments are the default hasher, adding a comment and expanding the test suite, can they be later follow-ups? I'm not at my laptop right now to make substantial changes. |
All versions of Qiskit that have Rust-space `ParameterExpression` (introduced in v2.2) up to (excluding) the roll-up commit of this PR (intended release in v2.4.0rc3) produce invalid QPY for expressions with cancellations. Here, we skip the known-bad versions, since there is no way to completely recover from such files; the cancelled-out expression _might_ be zero, or might truly have been an arbitrary numeric value (e.g. in the case `0*x + 1.5`, the `1.5` is completely lost in the broken versions of Qiskit). The previous version-handling mechanisms in the QPY backwards compatibility tests couldn't cope with an `rc` being an upper bound in a version range (since it only dealt with the "release" component), so here we also update the checking system to use the complete version of `packing.version` comparisons.
mtreinish
left a comment
There was a problem hiding this comment.
It looks like you missed installing packaging inside the venvs used for the symengine version checks used at the end of run_tests.sh.
mtreinish
left a comment
There was a problem hiding this comment.
Thanks for the updates. I think this looks good to go for the most part. I just have one small comment on the backwards compat test organization. After that's updated I think this should be good to merge.
mtreinish
left a comment
There was a problem hiding this comment.
LGTM now, thanks for the fast updates.
* Rewrite QPY `ParameterExpression` handling in pure Polish form This patch is a fairly invasive change, but one that fixes the known edge cases of replay generation from `ParameterExpression` itself. This fixes several `pickle`, `copy`/`deepcopy` and QPY bugs around `ParameterExpression`. The Python-space QPY loader is modified in a small way to allow it to handle QPY-format permitted "replay" elements that act between two bare values; this was always allowed by the QPY spec, but couldn't be generated by Python-space QPY/`ParameterExpression`. Doing so actually significantly simplifies the loading logic, since it no longer needs to reflect certain arithmetic operations. The new Rust-space `ParameterExpression` uses binary operations with two numeric operands (e.g. `Add(2, 0)`) to safely propagate bare values through the QPY replay. This was not previously necessary in Python space because there was no public interface to create a bare-numeric `ParameterExpression` without replayable expressions having been tracked. Similarly, cancelled-out symbols are propagated in the replay with `Sub(x, x)` operations, which already must work across all compliant QPY loaders, and both Rust- and Python-space `ParameterExpression. History ------- The QPY replay was _intended_ to be a pure Polish-notation representation of the operations to take to rebuild the expression. This mostly worked fine at the time, but the patch was written under a lot of stress and without full CI tooling, due to it being in response to a security bug. This led to some less-than-ideal parts of the format specification, such as the start/end recursion opcodes in the QPY format not actually being necessary; they are actually zero-operand no-ops to the stack, but this wasn't noticed during the patch due to time pressure. Further, as `ParameterExpression` moved to Rust, the tracked internal "replay" list disappeared, since we could now reliably walk the expression tree and issue rebuild commands directly. This generated replay, however, _only_ examined the expression and not potentially cancelled-out parameters. All together, there were several problems affecting QPY (both Rust and Python), and `ParameterExpression`'s interaction with `pickle` and the copy module, mostly relating to situations where the resulting expression had cancelled-out parameters or had degraded to be a bare value. For example, expressions like `x - x` (or `0*x + 3`, or arbitrarily complex extensions of these) are supposed to retain a reference to `x` for binding purposes, but failed to do so after pickle or QPY roundtrips. Several of these expressions failed during QPY deserialisation, as the replay stack handling failed to cope with bare values. * Credit Gadi Gadi did much of the exploratory work of the previous commit, and wrote several of the test cases I pulled in - I forgot to include the line before. Co-authored-by: Gadi Aleksandrowicz <gadial@gmail.com> * Correct handling of vector-symbol names in map I was misled by `Symbol::name` not returning the complete name. We should probably rename that method in a follow-up. * Avoid symbol cloning in QPY replay output Using `Mul(sym, 0)` also cancels out the symbol, and doesn't require an internal clone. Co-authored-by: Julien Gacon <jul@zurich.ibm.com> * Make inner-worker recursive call clearer * Add hasher to `IndexSet` * Update comments * Expand `ParameterExpression` backwards-compatibility testing * Skip backwards-compatibility tests for broken Qiskit versions All versions of Qiskit that have Rust-space `ParameterExpression` (introduced in v2.2) up to (excluding) the roll-up commit of this PR (intended release in v2.4.0rc3) produce invalid QPY for expressions with cancellations. Here, we skip the known-bad versions, since there is no way to completely recover from such files; the cancelled-out expression _might_ be zero, or might truly have been an arbitrary numeric value (e.g. in the case `0*x + 1.5`, the `1.5` is completely lost in the broken versions of Qiskit). The previous version-handling mechanisms in the QPY backwards compatibility tests couldn't cope with an `rc` being an upper bound in a version range (since it only dealt with the "release" component), so here we also update the checking system to use the complete version of `packing.version` comparisons. * Add known-issue release note * Add missing dependency installation * Separate conditional test into separate file --------- Co-authored-by: Gadi Aleksandrowicz <gadial@gmail.com> Co-authored-by: Julien Gacon <jul@zurich.ibm.com> (cherry picked from commit 8064d27)
… (#15961) * Rewrite QPY `ParameterExpression` handling in pure Polish form This patch is a fairly invasive change, but one that fixes the known edge cases of replay generation from `ParameterExpression` itself. This fixes several `pickle`, `copy`/`deepcopy` and QPY bugs around `ParameterExpression`. The Python-space QPY loader is modified in a small way to allow it to handle QPY-format permitted "replay" elements that act between two bare values; this was always allowed by the QPY spec, but couldn't be generated by Python-space QPY/`ParameterExpression`. Doing so actually significantly simplifies the loading logic, since it no longer needs to reflect certain arithmetic operations. The new Rust-space `ParameterExpression` uses binary operations with two numeric operands (e.g. `Add(2, 0)`) to safely propagate bare values through the QPY replay. This was not previously necessary in Python space because there was no public interface to create a bare-numeric `ParameterExpression` without replayable expressions having been tracked. Similarly, cancelled-out symbols are propagated in the replay with `Sub(x, x)` operations, which already must work across all compliant QPY loaders, and both Rust- and Python-space `ParameterExpression. History ------- The QPY replay was _intended_ to be a pure Polish-notation representation of the operations to take to rebuild the expression. This mostly worked fine at the time, but the patch was written under a lot of stress and without full CI tooling, due to it being in response to a security bug. This led to some less-than-ideal parts of the format specification, such as the start/end recursion opcodes in the QPY format not actually being necessary; they are actually zero-operand no-ops to the stack, but this wasn't noticed during the patch due to time pressure. Further, as `ParameterExpression` moved to Rust, the tracked internal "replay" list disappeared, since we could now reliably walk the expression tree and issue rebuild commands directly. This generated replay, however, _only_ examined the expression and not potentially cancelled-out parameters. All together, there were several problems affecting QPY (both Rust and Python), and `ParameterExpression`'s interaction with `pickle` and the copy module, mostly relating to situations where the resulting expression had cancelled-out parameters or had degraded to be a bare value. For example, expressions like `x - x` (or `0*x + 3`, or arbitrarily complex extensions of these) are supposed to retain a reference to `x` for binding purposes, but failed to do so after pickle or QPY roundtrips. Several of these expressions failed during QPY deserialisation, as the replay stack handling failed to cope with bare values. * Credit Gadi Gadi did much of the exploratory work of the previous commit, and wrote several of the test cases I pulled in - I forgot to include the line before. * Correct handling of vector-symbol names in map I was misled by `Symbol::name` not returning the complete name. We should probably rename that method in a follow-up. * Avoid symbol cloning in QPY replay output Using `Mul(sym, 0)` also cancels out the symbol, and doesn't require an internal clone. * Make inner-worker recursive call clearer * Add hasher to `IndexSet` * Update comments * Expand `ParameterExpression` backwards-compatibility testing * Skip backwards-compatibility tests for broken Qiskit versions All versions of Qiskit that have Rust-space `ParameterExpression` (introduced in v2.2) up to (excluding) the roll-up commit of this PR (intended release in v2.4.0rc3) produce invalid QPY for expressions with cancellations. Here, we skip the known-bad versions, since there is no way to completely recover from such files; the cancelled-out expression _might_ be zero, or might truly have been an arbitrary numeric value (e.g. in the case `0*x + 1.5`, the `1.5` is completely lost in the broken versions of Qiskit). The previous version-handling mechanisms in the QPY backwards compatibility tests couldn't cope with an `rc` being an upper bound in a version range (since it only dealt with the "release" component), so here we also update the checking system to use the complete version of `packing.version` comparisons. * Add known-issue release note * Add missing dependency installation * Separate conditional test into separate file --------- (cherry picked from commit 8064d27) Co-authored-by: Jake Lishman <jake.lishman@ibm.com> Co-authored-by: Gadi Aleksandrowicz <gadial@gmail.com> Co-authored-by: Julien Gacon <jul@zurich.ibm.com>
This patch is a fairly invasive change, but one that fixes the known edge cases of replay generation from
ParameterExpressionitself. This fixes severalpickle,copy/deepcopyand QPY bugs aroundParameterExpression.The Python-space QPY loader is modified in a small way to allow it to handle QPY-format permitted "replay" elements that act between two bare values; this was always allowed by the QPY spec, but couldn't be generated by Python-space QPY/
ParameterExpression. Doing so actually significantly simplifies the loading logic, since it no longer needs to reflect certain arithmetic operations. The new Rust-spaceParameterExpressionuses binary operations with two numeric operands (e.g.Add(2, 0)) to safely propagate bare values through the QPY replay. This was not previously necessary in Python space because there was no public interface to create a bare-numericParameterExpressionwithout replayable expressions having been tracked. Similarly, cancelled-out symbols are propagated in the replay withSub(x, x)operations, which already must work across all compliant QPY loaders, and both Rust- and Python-space `ParameterExpression.History
The QPY replay was intended to be a pure Polish-notation representation of the operations to take to rebuild the expression. This mostly worked fine at the time, but the patch was written under a lot of stress and without full CI tooling, due to it being in response to a security bug. This led to some less-than-ideal parts of the format specification, such as the start/end recursion opcodes in the QPY format not actually being necessary; they are actually zero-operand no-ops to the stack, but this wasn't noticed during the patch due to time pressure.
Further, as
ParameterExpressionmoved to Rust, the tracked internal "replay" list disappeared, since we could now reliably walk the expression tree and issue rebuild commands directly. This generated replay, however, only examined the expression and not potentially cancelled-out parameters.All together, there were several problems affecting QPY (both Rust and Python), and
ParameterExpression's interaction withpickleand the copy module, mostly relating to situations where the resulting expression had cancelled-out parameters or had degraded to be a bare value. For example, expressions likex - x(or0*x + 3, or arbitrarily complex extensions of these) are supposed to retain a reference toxfor binding purposes, but failed to do so after pickle or QPY roundtrips. Several of these expressions failed during QPY deserialisation, as the replay stack handling failed to cope with bare values.Summary
Details and comments
Close #15355
Close #15900 (obsolete)