FAQ: Privacy Statement update on Copilot data use for model training (Free/Pro/Pro+) #188488
99 comments · 125 replies
-
Why not let people choose per project if they want to opt out?
-
The way to "proudly ask for consent" would be to require the user to opt in, instead of notifying them that this feature will be turned on at a future date and thus requiring them to opt out.
-
This is not okay. Let users opt in, and please let this AI fad die.
-
At least offer a discount if we opt in to benefit the models you sell.
-
The last announcements for Copilot were horrible, guys. If you continue like this, just delete it and do us a favour.
-
Thank you microslop, I hope you will train your shitty model with this comment :)
-
And still no ability to completely disable Copilot.
-
As a new user to GitHub (sorta), I'm floored by this information. Can a mere person escape all this??? How do I opt out of all this? I think I've disabled most features Copilot has in my settings, but some say "Enabled", and I don't know how to turn them off. I hate this timeline, thanks.
-
Dark pattern: the email with instructions to disable it doesn't actually link to the page where you update your settings.
-
All this data and you still can't add a search feature to VSCode Copilot Chat.
-
Unless I'm not getting something, this is a complete and utter nothing burger of an answer. How does this address the question at all?
-
First, please fix your bug: I cannot use my free quota; it does not reset.
-
Just because the whole industry is doing opt-in by default doesn't make it less shitty.
-
Missing Q: Why would anyone trust anything Microsoft says when the punishments for corpos violating user data privacy are nonexistent?
-
Seems weird that Microslop is giving us the illusion of free will here.
-
So all this to train microslop... Is there an alternative to GitHub? This sucks.
-
I want clarification, because the post didn't state it: do public repositories also get trained on even if you opted out? If so, I'm ditching it without hesitation, because the reason I use GitHub is to share my work with other people, and now you announce that you will use our work to train your AI. That's kinda unfair, you know? So I want clarification on that one: do our public works get trained on?
-
Absolutely appalling. GitHub was the one vendor I assumed would not start doing opt-in by default. We've already had this pulled on us by Figma with their default opt-in for paid accounts. Please understand our concerns. We have IP in our repos that took years to develop. We cannot have that exposed to models in order to train them. The attitude you are taking on this is not ok. We're now stuck playing whack-a-mole as you slowly change the settings until a gap appears that leaks our data out.
-
I'd like a bit more granularity on it. On my public repos I don't mind you training on them, but I do not want you training on my chat inputs, as often I have to refer to local resources, internal passwords, etc. So if you split out the permissions to inputs/outputs/etc., then I'll enable everything but inputs. But until I can block inputs, my setting will be opted out.
-
We take the protection of our intellectual property very seriously. If GitHub cannot provide a strong guarantee that our organization’s code and interaction data will not be used for model training under the current Copilot Free setup, we will have no choice but to cancel our Copilot subscriptions and explore alternative solutions.
-
Can anyone suggest another service?
-
A plain "Yes" or "No" doesn't make sense. This is just copied from a paper! ("Specifically, GitHub uses the following data: … Excluded are: …")
-
You guys have to be very careful with this. The AI cannot use code that is found directly in another repository unless credit is given to the author of that code. Changing variable names isn't enough to skirt copyright laws. In my opinion, AI should not be using any code found in other work directly. It can use it to learn, but it should be writing its own code from the ground up with the knowledge it has.

Another feature that should be added, and this is a HUGE one, is having the AI remember what it has already done on the repo, so it doesn't go and muck up code that may have been written with the help of AI, forcing you to go through and fix the issues with the generated code, sometimes with the assistance of AI to do so. If it sees modifications to code it has written, that should be a fairly good indication that the AI-assisted code needed to be corrected because it didn't work properly; it should see that and use it to correct its knowledge. I have had AI change something that I had already fixed because what it did previously didn't work 100% correctly. Having to go back and change it again is a big annoyance.

AI should also not be omitting or removing any comments that are in code. I have seen it do this numerous times as well; it adds a lot of extra work to fix things like that.

Another thing that should be fixed is AI doing what it is asked/told to do. Too many times I have given a task to AI only for it to partially do what I asked. Even after asking repeatedly and getting more explicit each time, it still never completes the work, and it sometimes starts to undo what it had previously done. Now, I know that AI is supposed to mimic human behavior, but for something like assistance with writing code, the laziness portion of human behavior needs to be removed completely; it's counterproductive.
-
It is unclear whether or not the feature requested in this post has been implemented. Is there a way to opt out of GitHub/Microsoft doing machine learning/model training on our repositories/code/intellectual property at rest? Some statements could be interpreted to include repos/IP/code at rest, but other statements seem to mean this may not apply to our repos/IP/code at rest.
-
I think that, just like a company has to tell you when they are going to record a phone call, the same thing should exist when using AI. People need to be told right then and there that the session is going to be used to train their AI model; if the session is not going to be used, the message would not appear. The fact that there is a giant question mark over what is actually being collected, and when it is being collected, is the problem at hand. The only thing the user can do is assume that everything is being collected all the time. Right in GitHub's statement about collecting the data, it states that they "may" collect data, not that they will 100% of the time. This looming shadow of secrecy over when it is happening and what exactly is being collected is the problem. What is being collected needs to be far more specific, and when it happens also needs to be specific.
-
If you’re excluding business customers from this, how is it that every user on our Business Plan has the “Allow GitHub to use my data for AI model training” option set to “Enabled” and has to actively opt out of it, just like any regular individual user?
-
All I wanna say is I believe in AI and its potential to be great. It's a great idea and a TERRIBLE one. Humans are totally gonna fuck this all up for everybody else. It's already too late now; might as well improve it or turn it to our advantage. Any way I can help out, I will. I made some $$$ on the stock market with my AI. The least I can do is pay it forward any way I can to improve it.
Hello GitHub Community👋
We’re sharing an update to our Privacy Statement and Terms of Service about how we use personal data to develop, improve, and secure GitHub products and services, including training AI and machine learning models that power GitHub Copilot.
For the full announcement and complete details, please visit the blog post: Updates to GitHub Copilot interaction data usage policy.
Frequently Asked Questions
Below is an FAQ covering what’s changing, who’s affected, what data may be used (when enabled), safeguards, and opt-out instructions.
Why is GitHub making this change and when will it go into effect?
Why are you only using data from individuals while excluding businesses and enterprises?
Are students and teachers that access Copilot Pro for free affected by this update?
What data are you collecting?
What can individuals do if they don’t want their inputs, outputs, or code snippets used for model training?
Who will have access to this data outside of GitHub?
What is a "GitHub affiliate" and who does that include?
Companies that provide AI models or other services to GitHub — such as model providers, cloud hosting vendors, and other service providers — are not affiliates. They are service providers or subprocessors, and they are bound by contractual obligations that restrict how they can use your data. Specifically, service providers and subprocessors may only process your data on GitHub's behalf and at GitHub's direction — they do not receive your data for their own independent purposes, including their own model training. You can see GitHub's current list of subprocessors here.
In short: affiliates are part of our corporate family. Service providers work for us under contract. These are distinct relationships with different rights and obligations.
How do you protect sensitive data?
Do I need to do anything if I previously disabled the setting titled "Enabling or disabling prompt and suggestion collection"?
Will code stored in private repositories be used for model training?
What safeguards are in place to prevent enterprise code being used for model training due to an individual using a personal Copilot license while working in their employer’s codebase?
Other companies aren’t using user data to train models. Why is GitHub?
You’re collecting code snippets, prompt text, AI responses, and detailed interaction patterns. How is this not giving you my entire codebase?
Security researchers found that Copilot Chat could expose private code from repositories that were temporarily public then set back to private. Why should we trust your guarantees about protecting our data?
I selected Copilot because GitHub said it didn’t train on user data. This feels like a bait and switch.
If this data collection is truly safe, why don’t you enable it by default for enterprise customers as well?
If your AI needs real user code to be competitive, isn’t that an admission that your advantage comes from exploiting your existing user base rather than better research?
If training on user code is so obviously beneficial, why does every company try to hide it behind toggles, footnotes, or tiered pricing instead of proudly asking for consent?
I noticed the private repository access language was removed from the Privacy Statement. Where did it go?
Join the discussion
Have additional questions or feedback? Please share them in the comments below.