FAQ: Privacy Statement update on Copilot data use for model training (Free/Pro/Pro+) #188488
99 comments · 125 replies
-
Why not let people choose per project if they want to opt out?
-
The way to "proudly ask for consent" would be to require the user to opt in, instead of notifying them that this feature will be turned on at a future date and thus requiring them to opt out.
-
This is not okay. Let users opt in, and please let this AI fad die.
-
At least offer a discount if we opt in to benefit the models you sell.
-
The last announcements for Copilot were horrible, guys. If you continue like this, just delete it and do us a favour.
-
Thank you microslop, I hope you will train your shitty model with this comment :)
-
And still no ability to completely disable Copilot.
-
As a new user to GitHub (sorta), I'm floored by this information. Can a mere person escape all this??? How do I opt out of all this? I think I've disabled most features Copilot has in my settings, but some say "Enabled", and I don't know how to turn them off. I hate this timeline, thanks.
-
Dark pattern: the email with instructions to disable it doesn't actually link to the page where you update your settings.
-
All this data and you still can't add a search feature to VSCode Copilot Chat.
-
Unless I'm not getting something, this is a complete and utter nothing burger of an answer. How does this address the question at all?
-
First, please fix your bug: I cannot use my free quota; it does not reset.
-
Just because the whole industry is doing opt-in by default doesn't make it less shitty.
-
Missing Q: Why would anyone trust anything Microsoft says when the punishments for corpos violating user data privacy are nonexistent?
-
Seems weird that Microslop is giving us the illusion of free will here.
-
So all this to train microslop... Is there an alternative to GitHub? This sucks.
-
I want clarification, because the post didn't state it: do public repositories also get trained on even if you opted out? If so, I'm ditching it without hesitation, because the reason I use GitHub is to share my work with other people, and now you announce that you will use our work to train your AI. That's kinda unfair, you know? So I want clarification on that one: do our public works get trained on?
-
Absolutely appalling. GitHub was the one vendor I assumed would not start doing opt-in by default. We've already had this pulled on us by Figma with their default opt-in for paid accounts. Please understand our concerns. We have IP in our repos that took years to develop. We cannot have that exposed to models in order to train them. The attitude you are taking on this is not ok. We're now stuck playing whack-a-mole as you slowly change the settings until a gap appears that leaks our data out.
-
I'd like a bit more granularity on it. On my public repos I don't mind you training on them, but I do not want you training on my chat inputs, as often I have to refer to local resources, internal passwords, etc. So if you split out the permissions to inputs/outputs/etc., then I'll enable everything but inputs. But until I can block inputs, my setting will be opted out.
-
We take the protection of our intellectual property very seriously. If GitHub cannot provide a strong guarantee that our organization’s code and interaction data will not be used for model training under the current Copilot Free setup, we will have no choice but to cancel our Copilot subscriptions and explore alternative solutions.
-
Can anyone suggest another service?
-
A plain "Yes" or "No" doesn't make sense. This is just copied from a paper! ("Specifically, GitHub uses the following data: … Excluded are: …")
-
You guys have to be very careful with this. The AI cannot use code that is found directly in another repository unless credit is given to the author of that code. Changing variable names isn't enough to skirt copyright laws. In my opinion, AI should not be using any code found in other work directly. It can use it to learn, but it should be writing its own code from the ground up with the knowledge it has.

Another feature that should be added, and this is a HUGE one, is having the AI remember what it has already done on the repo, so it doesn't go and muck up code that may have been written with the help of AI, forcing you to go through and fix the issues with the generated code, sometimes with the assistance of AI to do so. If it sees modifications to code it has written, that should be a fairly good indication that the AI-assisted code needed to be corrected because it didn't work properly; it should see that and use it to correct its knowledge. I have had AI change something that I had already fixed because what it did previously didn't work 100% correctly. Having to go back and change it again is a big annoyance.

AI should also not be omitting or removing any comments that are in code. I have seen it do this numerous times as well; it adds a lot of extra work to fix things like that.

Another thing that should be fixed is AI doing what it is asked/told to do. Too many times I have given a task to AI only for it to partially do what I asked. Even after asking repeatedly and getting more explicit each time, it still never completes the work, and it sometimes starts to undo what it had previously done. Now, I know that AI is supposed to mimic human behavior, but for something like assistance with writing code, the laziness portion of human behavior needs to be removed completely; it's counterproductive.
-
It is unclear whether or not the feature requested in this post has been implemented. Is there a way to opt out of GitHub/Microsoft doing machine learning/model training on our repositories/code/intellectual property at rest? Some statements could be interpreted to include repos/IP/code at rest, but other statements seem to mean this may not apply to our repos/IP/code at rest.
-
I think that, just like a company has to tell you when they are going to record a phone call, the same thing should exist when using AI. People need to be told right then and there that the session is going to be used to train their AI model; if the session is not going to be used, the message would not appear. The fact that there is a giant question mark over what is actually being collected, and when it is being collected, is the problem at hand. The only thing the user can do is assume that everything is being collected all the time. Right in GitHub's statement about collecting the data, it states that they "may" collect data, not that they will 100% of the time. This looming shadow of secrecy over when it is happening and what exactly is being collected is the problem. What is being collected needs to be far more specific, and when it happens also needs to be specific.
-
If you’re excluding business customers from this, how is it that every user on our Business Plan has the “Allow GitHub to use my data for AI model training” option set to “Enabled” and has to actively opt out of it, just like any regular individual user?
-
All I wanna say is I believe in AI and its potential to be great. It's a great idea and a TERRIBLE one. Humans are totally gonna fuck this all up for everybody else. It's already too late now; might as well improve it or turn it to our advantage. Any way I can help out, I will. I made some $$$ on the stock market with my AI. The least I can do is pay it forward any way I can to improve it.
Hello GitHub Community👋
We’re sharing an update to our Privacy Statement and Terms of Service about how we use personal data to develop, improve, and secure GitHub products and services, including training AI and machine learning models that power GitHub Copilot.
For the full announcement and complete details, please visit the blog post: Updates to GitHub Copilot interaction data usage policy.
Frequently Asked Questions
Below is an FAQ covering what’s changing, who’s affected, what data may be used (when enabled), safeguards, and opt-out instructions.
Why is GitHub making this change and when will it go into effect?
Why are you only using data from individuals while excluding businesses and enterprises?
Are students and teachers that access Copilot Pro for free affected by this update?
What data are you collecting?
What can individuals do if they don’t want their inputs, outputs, or code snippets used for model training?
Who will have access to this data outside of GitHub?
What is a "GitHub affiliate" and who does that include?
Companies that provide AI models or other services to GitHub — such as model providers, cloud hosting vendors, and other service providers — are not affiliates. They are service providers or subprocessors, and they are bound by contractual obligations that restrict how they can use your data. Specifically, service providers and subprocessors may only process your data on GitHub's behalf and at GitHub's direction — they do not receive your data for their own independent purposes, including their own model training. You can see GitHub's current list of subprocessors here.
In short: affiliates are part of our corporate family. Service providers work for us under contract. These are distinct relationships with different rights and obligations.
How do you protect sensitive data?
Do I need to do anything if I previously disabled the setting titled "Enabling or disabling prompt and suggestion collection"?
Will code stored in private repositories be used for model training?
What safeguards are in place to prevent enterprise code being used for model training due to an individual using a personal Copilot license while working in their employer’s codebase?
Other companies aren’t using user data to train models. Why is GitHub?
You’re collecting code snippets, prompt text, AI responses, and detailed interaction patterns. How is this not giving you my entire codebase?
Security researchers found that Copilot Chat could expose private code from repositories that were temporarily public then set back to private. Why should we trust your guarantees about protecting our data?
I selected Copilot because GitHub said it didn’t train on user data. This feels like a bait and switch.
If this data collection is truly safe, why don’t you enable it by default for enterprise customers as well?
If your AI needs real user code to be competitive, isn’t that an admission that your advantage comes from exploiting your existing user base rather than better research?
If training on user code is so obviously beneficial, why does every company try to hide it behind toggles, footnotes, or tiered pricing instead of proudly asking for consent?
I noticed the private repository access language was removed from the Privacy Statement. Where did it go?
Join the discussion
Have additional questions or feedback? Please share them in the comments below.