-
Notifications
You must be signed in to change notification settings - Fork 2.9k
Expand discussion of LLM use in contributions #15923
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Changes from all commits
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
| Original file line number | Diff line number | Diff line change | ||||||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
|
|
@@ -11,6 +11,9 @@ community in this goal. | |||||||||||||
| * [Set up Python virtual development environment](#set-up-python-virtual-development-environment) | ||||||||||||||
| * [Installing Qiskit from source](#installing-qiskit-from-source) | ||||||||||||||
| * [Issues and pull requests](#issues-and-pull-requests) | ||||||||||||||
| * [Pull request checklist](#pull-request-checklist) | ||||||||||||||
| * [Code review](#code-review) | ||||||||||||||
| * [Use of AI tools](#use-of-ai-tools) | ||||||||||||||
| * [Contributor Licensing Agreement](#contributor-licensing-agreement) | ||||||||||||||
| * [Changelog generation](#changelog-generation) | ||||||||||||||
| * [Release notes](#release-notes) | ||||||||||||||
|
|
@@ -216,23 +219,32 @@ please ensure that: | |||||||||||||
| If your code fails the local style checks (specifically the black | ||||||||||||||
| or Rust code formatting check) you can use `tox -eblack` and | ||||||||||||||
| `cargo fmt` to automatically fix the code formatting. | ||||||||||||||
|
|
||||||||||||||
| 2. The documentation has been updated accordingly. In particular, if a | ||||||||||||||
| function or class has been modified during the PR, please update the | ||||||||||||||
| *docstring* accordingly. | ||||||||||||||
|
|
||||||||||||||
| If your pull request is adding a new class, function, or module that is | ||||||||||||||
| intended to be user facing ensure that you've also added those to a | ||||||||||||||
| documentation `autosummary` index to include it in the api documentation. | ||||||||||||||
|
|
||||||||||||||
| 3. If you are of the opinion that the modifications you made warrant additional tests, | ||||||||||||||
| feel free to include them | ||||||||||||||
|
|
||||||||||||||
| 4. Ensure that if your change has an end user facing impact (new feature, | ||||||||||||||
| deprecation, removal etc) that you have added a reno release note for that | ||||||||||||||
| change and that the PR is tagged for the changelog. | ||||||||||||||
|
|
||||||||||||||
| 5. All contributors have signed the CLA. | ||||||||||||||
|
|
||||||||||||||
| 6. The PR has a concise and explanatory title (e.g. `Fixes Issue1234` is a bad title!). | ||||||||||||||
|
|
||||||||||||||
| 7. If the PR addresses an open issue the PR description includes the `fixes #issue-number` | ||||||||||||||
| syntax to link the PR to that issue (**you must use the exact phrasing in order for GitHub | ||||||||||||||
| to automatically close the issue when the PR merges**) | ||||||||||||||
| syntax to link the PR to that issue (**you must use the exact phrasing in order for GitHub | ||||||||||||||
| to automatically close the issue when the PR merges**) | ||||||||||||||
|
|
||||||||||||||
| 8. You have disclosed all substantial use of AI tooling, including large language models (LLMs). | ||||||||||||||
| See [Use of AI tools](#use-of-ai-tools) for your responsibilities. | ||||||||||||||
|
|
||||||||||||||
| ### Code Review | ||||||||||||||
|
|
||||||||||||||
|
|
@@ -253,23 +265,99 @@ sure to always be kind and respectful in your interactions with maintainers and | |||||||||||||
|
|
||||||||||||||
| ### Use of AI tools | ||||||||||||||
|
|
||||||||||||||
| > [!WARNING] | ||||||||||||||
| > If you use any AI tool while preparing your code contribution, you **must** disclose the name of the tool and its version in the PR description. | ||||||||||||||
| > [!NOTE] | ||||||||||||||
| > By "AI tools", we mean generative langauge tools like large language models (LLMs). | ||||||||||||||
|
|
||||||||||||||
| *You are responsible for any code you submit to Qiskit, no matter how it was generated.* | ||||||||||||||
|
|
||||||||||||||
| We discourage the use of LLM code generation, but do not forbid it in high-quality pull requests. | ||||||||||||||
|
Collaborator
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I'm not convinced we should categorically discourage it like this, I think the above statement about responsibility is clear enough. I don't think we can control people using it, it's a tool in the toolbox by now... If you want to keep it, we could maybe rephrase it in a constructive manner like
Suggested change
Or something?
Member
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I don't think the responsibility section says anything about code generation. I do want to discourage LLM code generation overall: empirically, code we receive that is LLM generated is worse on average, and it is a negative modifier on the chance of the PR getting accepted. |
||||||||||||||
|
|
||||||||||||||
| Qiskit maintainers are not obliged to accept any pull request, even if it follows these guidelines. | ||||||||||||||
| Pull requests that require excessive maintainer time may be closed, even with no proposed | ||||||||||||||
| alternative. | ||||||||||||||
|
Comment on lines
+275
to
+277
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I believe this may be a more conservative/commanding way of saying this, but just a nit
Suggested change
|
||||||||||||||
|
|
||||||||||||||
| #### Your responsibilities | ||||||||||||||
|
|
||||||||||||||
| Your responsibilities for your code are not changed by using generative AI tooling. Some of these | ||||||||||||||
| (this is not an exhaustive list) are: | ||||||||||||||
|
|
||||||||||||||
| - You have reviewed and fully understand all code you submit, and explain the reasoning for it. | ||||||||||||||
| Asking an LLM to explain the reasoning for you is not acceptable. | ||||||||||||||
|
Collaborator
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Suggested change
maybe this?
Member
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I'd rather be stronger here. It isn't acceptable to use an LLM to provide the reasoning to a question a maintainer asks you - you have to be able to explain it yourself, and you shouldn't need the LLM to explain it to you first once it's reached the review process, or you didn't follow the rules before submission.
Comment on lines
+284
to
+285
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Nit: It's mostly about using stronger language here, but also okay not to include.
Suggested change
|
||||||||||||||
|
|
||||||||||||||
| - Your use of the AI tool, or the use of the output in Qiskit, does not violate any third-party | ||||||||||||||
| license obligations of source code used during the generation. This may mean including license | ||||||||||||||
| notices or source attribution with the generated code. | ||||||||||||||
|
|
||||||||||||||
| - The AI tool's terms and conditions must allow its output to be used in open-source projects, and | ||||||||||||||
| they are compatible with submitting code under the [Qiskit license](LICENSE.txt), and abiding by | ||||||||||||||
| the [Qiskit CLA](https://qisk.it/cla) and these contribution guidelines. | ||||||||||||||
|
|
||||||||||||||
| By submitting code to Qiskit, you are asserting that your submission is your own original work of | ||||||||||||||
| authorship, as required by the [Contributor License Agreement (CLA)](https://qisk.it/cla) that you | ||||||||||||||
| signed (or will sign) on your first contribution to Qiskit. | ||||||||||||||
|
|
||||||||||||||
| Any use of generative tooling to produce code or public communications (for example, comments or | ||||||||||||||
| pull-request descriptions) must be disclosed in the pull-request description. You can say: | ||||||||||||||
|
|
||||||||||||||
| > LLM disclosure: ChatGPT with GPT-5.4 was used to help write the pull request description. | ||||||||||||||
|
|
||||||||||||||
| or | ||||||||||||||
|
|
||||||||||||||
| > LLM disclosure: Claude Opus 4.6 was used to generate initial prototypes, which I then modified. | ||||||||||||||
|
|
||||||||||||||
| All non-trivial generated code must be clearly marked inline with code comments that: | ||||||||||||||
|
Collaborator
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Should we mention that one can also put it in the commit message if there's no clear block that was generated, but it was used throughout the code?
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. As a follow up to @Cryoris , this section implies the disclosure needs to be in the PR description (which turns into the commit message when squash merged) and also inline. What's the rationale for the redundancy?
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I would advocate against inline code attribution. I don't think this adds more clarity to the contributor's responsibility when it comes to using gen AI tools, and it also has the downside of cluttering the code with metadata that does provide value to better understanding the code, like regular comments do. |
||||||||||||||
|
|
||||||||||||||
| - include the name and version of the model and tool used; | ||||||||||||||
| - indicate the start and end of the generated code. | ||||||||||||||
|
|
||||||||||||||
| For example, you might write: | ||||||||||||||
|
|
||||||||||||||
| ```python | ||||||||||||||
| class SomeTranspilerPass: | ||||||||||||||
|
Collaborator
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Claude Opus 4.6 forgot to derive the base class 😛
Suggested change
Member
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I mean, if you're going to nitpick this kind of thing then it also needs the imports to work, a docstring to pass lint, and the code body needs to have a statement and not just comments so it's syntactically valid. I wouldn't do those, though - they're just noise that's not relevant to the point. |
||||||||||||||
| # This `run` method was originally generated using Claude Opus 4.6, then tidied up by hand. | ||||||||||||||
| def run(self, dag): | ||||||||||||||
| # [ ... generated code ... ] | ||||||||||||||
| ``` | ||||||||||||||
|
|
||||||||||||||
| It is never appropriate to allow an unsupervised agent to interact publicly with the Qiskit | ||||||||||||||
| repository. | ||||||||||||||
|
|
||||||||||||||
| #### Appropriate use of AI tools | ||||||||||||||
|
|
||||||||||||||
| We understand that using AI tooling is fun, feels productive, and can help communicate, particularly | ||||||||||||||
| for people whose primary language is not English. However, AI tooling is just a tool, and is not | ||||||||||||||
| always appropriate. | ||||||||||||||
|
|
||||||||||||||
| It is generally fine to use AI tooling privately in chat mode to ask about the existing code, | ||||||||||||||
| prototype solutions, and debug issues. Private use, where no LLM output becomes submitted code or | ||||||||||||||
| public communications, does not need disclosure. Be aware that writing code that is based on other | ||||||||||||||
| code (such as re-implementing LLM-output code yourself) may still be subject to licensing | ||||||||||||||
| restrictions from the source. | ||||||||||||||
|
|
||||||||||||||
| It is generally not ok to use AI tooling to produce code that you could not subsequently reproduce | ||||||||||||||
| yourself later without tooling, including if the AI tooling only provided explanations. This would | ||||||||||||||
| indicate that you do not sufficiently understand the code produced. | ||||||||||||||
|
Comment on lines
+325
to
+339
Collaborator
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. A bit like above, but I'm not convinced we should put this here. It's unlikely going to stop anyone from using LLMs and I don't think we should give an opinion on how to use tools, but rather focus on what they should submit -- exactly like the points you're listing below. How about removing these 3 paragraphs and just keep the bullets below as guideline?
Member
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I very much want to include this section. I think it's clear examples of what kind of use is totally fine for submission to Qiskit, what definitely isn't, and it's something pre-written and public to point to when responding on PRs. I'm fine to reword it to make it clear that I mean "Use of AI tools in submissions to Qiskit" and not "for personal use" though. I think trying to pretend that we don't actually care about how you use such a tool whose improper use has such a negative impact on maintainers is disingenuous and unhelpful to users trying to understand: we do care. |
||||||||||||||
|
|
||||||||||||||
| Some questions to ask yourself: | ||||||||||||||
|
|
||||||||||||||
| - *Have I reviewed and thought about the code in as much depth as I would have without the tool?* | ||||||||||||||
|
|
||||||||||||||
| When using AI tools for code generation, your submission must still be your own original work of authorship, as required by the [Contributor License Agreement (CLA)](https://qisk.it/cla). It is your responsibility to make sure that: | ||||||||||||||
| The hardest part of producing high-quality software is not writing lines of code. It is far | ||||||||||||||
| harder to review code for correctness and for whole-project architectural fit, and to make sure | ||||||||||||||
| that code is solving problems in the right way. | ||||||||||||||
|
|
||||||||||||||
| - You review and fully understand the generated code, and you can explain the reasoning behind it during review. | ||||||||||||||
| - The usage of the AI tool does not violate any third-party license obligations. | ||||||||||||||
| - The AI tool's terms and conditions allow its output to be used in open source projects and are compatible with the [Qiskit license](LICENSE.txt), the [Qiskit CLA](https://qisk.it/cla), and [these Contributor Guidelines](CONTRIBUTING.md). | ||||||||||||||
| - You only use AI tools that have features to: | ||||||||||||||
| * filter out generated code substantially similar to training data, or | ||||||||||||||
| * identify similar training code so you can comply with the original license obligations (notice, attribution, etc.) and only contribute if it's compatible with the [Qiskit license](LICENSE.txt). | ||||||||||||||
| - You disclose the name and version of the AI tool in your PR description. | ||||||||||||||
| - *Have I been respectful of the time the code review will take?* | ||||||||||||||
|
|
||||||||||||||
| Submissions that appear unreviewed or copied directly from an AI tool without proper understanding may be requested to be revised or declined. | ||||||||||||||
| By submitting code to Qiskit, you are asking a maintainer to review it. It is very easy to | ||||||||||||||
| quickly generate large quantities of code with an LLM, yet all code must be reviewed. Qiskit | ||||||||||||||
| maintainers will close PRs that indicate the submitter is not respecting the time of the humans | ||||||||||||||
| on the maintenance side. | ||||||||||||||
|
|
||||||||||||||
| Remember that spamming issues or pull requests with AI-generated comments is prohibited under the [Qiskit Code of Conduct](https://qisk.it/coc). | ||||||||||||||
| - *What am I, the person, bringing to this process?* | ||||||||||||||
|
|
||||||||||||||
| Remember that Qiskit maintainers have access to AI tools too. If the majority of your involvement | ||||||||||||||
| was to point an AI tool at an open issue and ask it to fix it, consider that we could have done | ||||||||||||||
| that too and there was a reason we didn't. | ||||||||||||||
|
Collaborator
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I think we can also skip this one -- or make it a bit more constructive 😂
Member
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Imo this is constructive: I think people don't actually consider this, or think that "point an LLM at an issue" is actually helping.
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Judging based on some recent AI-generated PRs I think this point is important. But there might be valid reasons why a long-standing issue is still open, like priorities and bandwidth of the core team, so it may still be valuable for someone else to address the issue. How about we add something to encourage contributors to consult the maintainers first (in the issue) in case of doubts or questionable added value before submitting a PR? |
||||||||||||||
|
|
||||||||||||||
| ## Contributor Licensing Agreement | ||||||||||||||
|
|
||||||||||||||
|
|
||||||||||||||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.