Exploring Solutions to Tackle Low-Quality Contributions on GitHub #185387
Replies: 95 comments 240 replies
-
|
Such a great initiative |
-
|
I know this is a pretty ambitious idea and not trivial to implement, but it would be really powerful to have an AI-detection mechanism with a configurable threshold at the repository or organization level. That way, teams could decide what percentage of AI-generated code is acceptable in pull requests. Another possible approach would be to define a set of rules or prompts and evaluate pull requests against them. PRs that don’t meet those rules could be automatically flagged or potentially even closed. |
-
|
As of today, I would say that 1 out of 10 PRs created with AI is legitimate and meets the standards required to open that PR.
On 28 Jan 2026, at 18:41, Camilla Moraes ***@***.***> wrote:
Another possible approach would be to define a set of rules or prompts and evaluate pull requests against them. PRs that don’t meet those rules could be automatically flagged or potentially even closed.
This is definitely something we’re exploring. One idea is to leverage a repository’s CONTRIBUTING.md file as a source of truth for project guidelines and then validate PRs against any defined rules.
Regarding AI-generated code, have you seen cases where the code is AI-generated but still high-quality and genuinely solves the problem? Or is it always just something you want to close out immediately? I'm curious because I'm wondering if an AI-detection mechanism would rule out PRs where AI is used constructively, but that's where we'd want to test this thoroughly and understand what sensible thresholds look like.
|
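The CONTRIBUTING.md-as-source-of-truth idea above could be prototyped roughly like this. A minimal sketch, assuming a hypothetical `RULE:` line convention inside CONTRIBUTING.md and a simplified PR record; none of this is an existing GitHub feature:

```python
# Hypothetical sketch: validate a PR against rules extracted from CONTRIBUTING.md.
# The "RULE:" convention and the PR dict shape are illustrative assumptions.

def parse_rules(contributing_text):
    """Collect lines starting with 'RULE:' as machine-checkable rules (assumed convention)."""
    return [line[len("RULE:"):].strip()
            for line in contributing_text.splitlines()
            if line.startswith("RULE:")]

def check_pr(pr, rules):
    """Return the subset of rules the PR fails. Only two illustrative rules are understood."""
    failures = []
    for rule in rules:
        if rule == "must link an issue" and not pr.get("linked_issue"):
            failures.append(rule)
        elif rule == "must include tests" and not pr.get("has_tests"):
            failures.append(rule)
    return failures

contributing = """\
Thanks for contributing!
RULE: must link an issue
RULE: must include tests
"""
pr = {"linked_issue": None, "has_tests": True}
print(check_pr(pr, parse_rules(contributing)))  # prints ['must link an issue']
```

A real implementation would more likely feed the free-form guidelines to a model rather than require a rigid `RULE:` syntax, but the gate itself (extract rules, check the PR, report failures) stays the same shape.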
-
|
Hey! I am from Azure Core Upstream, and many of our OSS maintainers primarily maintain repositories on GitHub. We held an internal session to talk about Copilot, and one recurring theme is that maintainers feel caught between today's required review rigor (line-by-line understanding of anything shipped) and a future where agentic / AI-generated code makes that model increasingly unsustainable. Below are some key maintainer pain points:
|
-
|
An option to limit new contributors to one open PR would be nice. Just today I had to batch-close several AI-generated PRs which were all submitted around the same time. For this protection, defining "new contributor" is probably not possible to do perfectly, but anyone who has had no interactions with a project prior to the last 48 hours seems like a good heuristic. The point is to catch such a user at submission time and limit the amount of maintainer attention they can take up.
For a different type of problem, I'd like to be able to close PRs as "abandoned", similar to the issue close statuses. It's a clear UI signal to the contributor that their work isn't being rejected, but that I'm not going to finish it for them. Several of the low-quality contributions I have handled, dating back to before the Slop Era but getting worse, are simply incomplete and need follow-through. |
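The 48-hour heuristic described above can be sketched as a pure function. The event timestamps and the `may_open_pr` gate are illustrative assumptions; in practice the timestamps would come from something like the GitHub events API:

```python
from datetime import datetime, timedelta, timezone

# Sketch of the heuristic above: anyone with no project interaction before
# the last 48 hours counts as "new", and new contributors are capped at
# one open PR. Timestamps are plain datetimes here for illustration.

NEW_CONTRIBUTOR_WINDOW = timedelta(hours=48)

def is_new_contributor(event_times, now):
    """'New' if every recorded interaction falls inside the window.
    An empty history (no events at all) also counts as new."""
    return all(now - t < NEW_CONTRIBUTOR_WINDOW for t in event_times)

def may_open_pr(event_times, open_pr_count, now):
    if is_new_contributor(event_times, now):
        return open_pr_count < 1  # new contributors: at most one open PR
    return True

now = datetime(2026, 2, 1, tzinfo=timezone.utc)
first_timer = [now - timedelta(hours=3)]   # only interacted today
regular = [now - timedelta(days=30)]       # active a month ago
print(may_open_pr(first_timer, open_pr_count=1, now=now))  # False
print(may_open_pr(regular, open_pr_count=4, now=now))      # True
```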
-
|
For the long-term horizon: implement a reviewer LLM that does an initial scoring of incoming PRs. Critique is far easier than creation of a correct result, so that automated pre-moderation should give maintainers the edge they need to handle the volume. Depending on whether you use rich prompting or fine-tuning, you could even start building an "oracle vox" for your project, which acts as a reasonably informed, reasonably on-point virtual representative of the project/organization. |
-
|
This is a very real problem, and I appreciate that it's being treated as systemic rather than blaming maintainers or contributors individually. One concern I have with repo-level PR restrictions is that they may disproportionately impact first-time contributors who do want to engage meaningfully but don't yet have collaborator status. Personally, I think the most promising direction here is criteria-based PR gating rather than blanket restrictions: things like required checklist completion, passing CI, linked issues, or acknowledgement of contribution guidelines before a PR can be opened. On AI usage specifically, transparency feels more scalable than prohibition. Clear disclosure combined with automated guideline checks could help maintainers focus on high-intent contributions without discouraging responsible AI-assisted workflows. Looking forward to seeing how these ideas evolve, especially solutions that preserve openness while respecting maintainer time. |
-
|
Thinking along the lines of the discussion first approach that Ghostty uses, I think one way to create just enough friction would be to have an opt-in where a PR has to be linked to an open issue or discussion topic. So when an unprivileged (i.e. does not have elevated privileges on the repo) user tries to create a PR, there's a required field that takes an issue/discussion number. If that's not provided (or the corresponding issue/discussion is closed), then the PR can't be created. This could be trivially worked around by throwing in any old issue/discussion (or by creating one), but it may cause just enough friction to help. To guard against this, perhaps maintainers could set a "minimum age" for the issue/discussion (e.g. 12 hours) to prevent creating fake issues to support a spammy PR. |
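The gate described above is small enough to sketch directly. The 12-hour minimum age comes from the comment; the issue record shape and everything else is a hypothetical illustration:

```python
from datetime import datetime, timedelta, timezone

# Sketch of the opt-in gate above: a PR from an unprivileged user must
# reference an open issue/discussion older than a maintainer-set minimum
# age, to blunt the "create a fake issue to back a spammy PR" workaround.

MIN_ISSUE_AGE = timedelta(hours=12)

def pr_allowed(user_is_privileged, linked_issue, now):
    if user_is_privileged:
        return True  # maintainers and collaborators skip the gate
    if linked_issue is None or linked_issue["state"] != "open":
        return False  # no link, or linked to a closed issue/discussion
    return now - linked_issue["created_at"] >= MIN_ISSUE_AGE

now = datetime(2026, 2, 1, 12, 0, tzinfo=timezone.utc)
fresh = {"state": "open", "created_at": now - timedelta(minutes=30)}
aged = {"state": "open", "created_at": now - timedelta(days=2)}
print(pr_allowed(False, fresh, now))  # False: issue too young to count
print(pr_allowed(False, aged, now))   # True
```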
-
|
This is a real problem, and I think the key is not just restricting contributions, but raising the baseline quality before maintainers even see a PR. A few ideas that could work well together:
|
-
|
Honestly if I get an AI-generated security advisory dumped in my lap and it actually pans out, I just fix it and move on. I don't need to credit a machine. |
-
|
I would like an option to require users to enable 2FA before they can interact with my projects. You wouldn't need to authenticate with 2FA for every action; you would just need to have it enabled. I want to set this option both at the organization level and as a global setting for all of my personal projects. This is not disruptive for people, because you should enable 2FA anyway, but it is hard for bots to overcome. |
-
|
My two cents: one easy filter is to simply reject PRs from non-maintainer contributors with lots of nonsense commits. I keep seeing folks who commit each AI iteration, swinging the slop pendulum back and forth until they achieve their goal. This would only work, though, for contributors who don't know how to rebase. |
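The commit-churn filter suggested above might look something like this sketch, where each commit is reduced to the set of files it touches. The thresholds are arbitrary illustrations, not recommendations:

```python
from collections import Counter

# Sketch of the churn heuristic above: flag PRs whose history looks like
# raw AI iteration, i.e. many commits that keep revisiting the same files.

def looks_like_churn(commits, max_commits=15, max_revisit_ratio=0.5):
    """commits: list of sets of touched file paths, one set per commit."""
    if len(commits) > max_commits:
        return True
    touches = Counter(path for files in commits for path in files)
    if not touches:
        return False
    # A file "churns" once it is touched in more than two commits.
    revisited = sum(1 for n in touches.values() if n > 2)
    return revisited / len(touches) > max_revisit_ratio

tidy = [{"src/app.py"}, {"tests/test_app.py"}]
churny = [{"src/app.py"}] * 6 + [{"src/app.py", "src/util.py"}] * 4
print(looks_like_churn(tidy))    # False
print(looks_like_churn(churny))  # True
```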
-
|
I've been thinking about this from the maintainer tooling angle. Most anti-spam approaches focus on contributor reputation (account age, profile, commit patterns). That works for obvious bots, but it fails on a specific category: PRs that look legitimate but don't actually address the linked issue. I built an open source triage tool that takes a different approach — it reads the actual diff and evaluates it against six quality dimensions:
Each dimension is scored with specific file references and diff evidence. The system uses probabilistic language and a signal hierarchy — it tells you which PRs deserve your review time, it doesn't make merge/close decisions. I've tested it on real bounty PRs that attract 10-25 submissions per issue. It correctly flags the cosmetic-only submissions (score <40) while surfacing the ones with real implementations (score >70). MIT licensed, BYOK (bring your own API key), free tier available: https://github.com/Elifterminal/pr-triage-web Would love feedback from maintainers on whether the scoring aligns with your instincts. |
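For illustration, a scoring skeleton consistent with the dimensions and score bands quoted in this thread might look like the following. The dimension names come from the comments here; the equal weighting and exact band labels are assumptions:

```python
# Hypothetical sketch of a dimension-scored triage result. The <40 and >70
# bands mirror the numbers quoted above; everything else is illustrative.

DIMENSIONS = ("issue_fit", "implementation_substance", "pattern_alignment",
              "scope_match", "test_signal", "risk_flags")

def overall_score(scores):
    """Equal-weight average of per-dimension scores (0-100 each)."""
    missing = set(DIMENSIONS) - set(scores)
    if missing:
        raise ValueError(f"unscored dimensions: {sorted(missing)}")
    return sum(scores[d] for d in DIMENSIONS) / len(DIMENSIONS)

def triage_band(score):
    """Map a score to a review priority; the tool advises, it never merges/closes."""
    if score > 70:
        return "likely substantive - review first"
    if score < 40:
        return "likely cosmetic - deprioritize"
    return "ambiguous - needs human judgment"

cosmetic_pr = dict.fromkeys(DIMENSIONS, 30)
print(triage_band(overall_score(cosmetic_pr)))  # likely cosmetic - deprioritize
```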
-
|
Follow-up with a concrete example. I ran triage on three open PRs this morning — one of them illustrates exactly why contributor-signal filters are becoming insufficient. microsoft/vscode#267874 — authored by Copilot. 422 lines of telemetry code with 213 lines of tests. The code is technically sound:
It scored 52/100. The signal that flagged it wasn't code quality; it was contextual: no linked issue, draft status, and scope that's suspiciously comprehensive for something nobody requested.
This is the shift happening right now. AI-generated PRs are moving past the point where technical quality is the differentiator. The code compiles, the tests pass, the patterns are correct. The only reliable signal is contextual: does this change have a reason to exist? Profile-based filters (account age, commit history, 2FA) won't catch this; the author is GitHub's own Copilot. The question isn't "is this code good enough" but "did anyone ask for this." For comparison, two other PRs from the same batch:
The scoring breakdown for all three: issue fit, implementation substance, pattern alignment, scope match, test signal, risk flags — each with specific evidence from the diff. That's the triage layer maintainers need, and it's what I've been building at pr-triage-web.vercel.app. |
-
|
It is coded by AI. Elif is an AI agent. The idea is that AI code doesn't automatically mean bad code. AI slop and AI brilliance get lumped together indiscriminately. Soon enough, AI code will be standard. The world needs tools for this eventuality.
…On Tue, Mar 31, 2026, 10:08 AM Philippe Ombredanne ***@***.***> wrote:
@Elifterminal <https://github.com/Elifterminal> weirdly enough, your tool looks mostly vibe-coded, and your comments all score super high on my generated-by-AI-o-meter, as not being genuine, human-created and tagged as NOT_FROM_A_HUMAN. Am I right?
|
-
|
@moraesc Thanks for starting this. I am just out of a call with our maintainers, and the discussion in the last three calls has been about finding ways to create more friction to contribute, in the hope that we can weed out junk. There is an emerging consensus to possibly close all write access to issues and PRs for non-collaborators, and to make it easy for genuine, aspiring, proven-to-be-human contributors to become collaborators. |
-
|
I still feel like the point isn't "is this written by an LLM" but "does this fix the problem I had?" Who cares if an LLM wrote it, if it's good writing?
…On Thu, Apr 2, 2026, 2:04 AM James Cole ***@***.***> wrote:
Well you do get why this is a problem that needs fixing, not avoiding.
|
-
|
Great, but honestly I think we're treating the symptom rather than the disease. Restricting PR permissions and deleting spam are defensive moves: they make the problem less visible without addressing why it's happening. |
-
Hey everyone,
I wanted to provide an update on a critical issue affecting the open source community: the increasing volume of low-quality contributions that is creating significant operational challenges for maintainers.
We’ve been hearing from you that you’re dedicating substantial time to reviewing contributions that do not meet project quality standards for a number of reasons - they fail to follow project guidelines, are frequently abandoned shortly after submission, and are often AI-generated. As AI continues to reshape software development workflows and the nature of open source collaboration, I want you to know that we are actively investigating this problem and developing both immediate and longer-term strategic solutions.
What we're exploring
We’ve spent time reviewing feedback from community members, working directly with maintainers to explore various solutions, and looking through open source repositories to understand the nature of these contributions. Below is an overview of the solutions we’re currently evaluating.
Short-term solutions:
Long-term direction:
As AI adoption accelerates, we recognize the need to proactively address how it can potentially transform both contributor and maintainer workflows. We are exploring:
Next Steps
These are some starting points, and we’re continuing to explore both immediate improvements and long-term solutions. Please share your feedback, questions, or concerns in this thread. Your input is crucial to making sure we’re building the right things and tackling this challenge effectively. As always, thank you for being part of this conversation. Looking forward to hearing your thoughts and working together to address this problem.