who said they haven't. for something to show up verbatim in the output of a text...

belorn · on Oct 16, 2022

Microsoft has a public statement that they don't use proprietary code, only public code with public licenses. They have a lot of companies as customers who use GitHub, and they also use a lot of third-party code in their own products.

stefan_ · on Oct 16, 2022

Even BSD et al. have attribution requirements - that must be a vanishingly small amount of code to be used. Methinks the people who run GitHub (who have apparently decided to abandon the core business for the latest fun project) aren't being entirely upfront.

pabs3 · on Oct 17, 2022

I thought they said all public repos without regard to the license they are under, which could be a proprietary EULA.

akudha · on Oct 16, 2022

With the amount of resources that Microsoft has, how hard can it be for them to exclude proprietary code that other people have stolen? I’d bet it is easy for them, but they won’t do it. Because they don’t care, because who is gonna take them on?

Will they “accidentally” include proprietary code from, say, Oracle? Nope. They’ll make sure of it. But Joe Random? Sure.

make3 · on Oct 16, 2022

there's exactly no way they have

naikrovek · on Oct 17, 2022

I'm curious how you could possibly know that for sure.

make3 · on Oct 18, 2022

because Microsoft is known to be extremely protective of their code. there is just no way they would expose their internal code to being straight up decoded from the model, when they can just train the model on the huge public data of GitHub