Prediction: regulation will require AI companies to have copyright licenses granted for their training data. There will be legal models trained on datasets with copyright cleared, and illegal black market models trained on copyrighted materials. However, it will be really difficult to enforce, and to prove what data was used to train each one.
And they will deserve to. The point of copyright wasn't to bar anyone from using any work anywhere. It was to encourage artists to create art. Doing it by protecting the work they create was just a mechanism. The current mechanism of near-perpetual copyrights and constant nagging over things that don't include a recognizable semblance of the original work is just insanity.
I see that, but it's a matter of balance. Lots of things encourage/discourage artists, not everything will be in their favor. The way I see it, as long as nobody's work is being reproduced, they shouldn't have any real qualms.
I think that, just as content creators can release a work under various Creative Commons licenses, for commercial or non-commercial use, or under MIT, GPL, or proprietary licenses, etc....
We may find that letting content creators choose to have their work included, or not, in AI training sets would be helpful? Or, included in training sets for a fee, or for some sort of attribution, or ?...
The apparent current status quo of "anything on the public web is fair game for an AI training set" might not be a good permanent solution?
I don't think the concern is about machines doing it better, but machines being able to angleshoot around copyright. The issue is most easily demonstrated in writing. The current state of technology is already sufficient to create a program that could take as input a book and produce as output a book with identical story, characters, settings, and more - but 'nudged around' just enough to skirt current copyright law.
Humans have already been able to do this of course, but the difference is scale and automation. With this software one could, under current law, set up a 100% legal 'shadow library' that effectively infringes on every single book published. Even a leaked copy of a book could be released and shared (in its legally non-infringing format) before the "real" copy hit the market. And again, all completely legally. The impacts of copyright infringement are regularly grossly distorted, but I think this is the sort of technology that could genuinely damage artists and creators across many endeavors.
The exact same thing will be coming to software soon enough. 'Take this assembly/IL code, and create a functionally identical but superficially rebranded program while working to ensure you sidestep all relevant patents.' 'Sure, here you go.' The issue of this being done [relatively] instantly is going to really impact things in ways I think many are not considering. Never in a million years thought I'd see myself on the side of the copyright cartel, but this is one of the extremely rare times they're right.
That shadow library already exists, people write a shit ton of fanfic and publishers publish highly derivative stories specifically because they're basically veiled ripoffs of highly lucrative IP hoping to cash in.
The barrier to shadow libraries is marketing, which is the only thing that really separates huge blockbuster music/art/books from stuff that makes literally zero money. Quality and ideas haven't been the gate for a long time.
If you shift the reward from the people doing one kind of work to people owning (one kind of, but this is fungible) capital, and don't shift capital at the same time, you are doing concrete harm to identifiable people, even if the net is positive in some constructed aggregate. AI has a tremendous ability to do that across a wide range of different types of work rapidly and simultaneously.
No it isn't. 'Not enough art' is basically the exact opposite of the problem we're facing right now. Do you honestly believe that there's not enough music on Spotify? That a two hundred year catalogue isn't enough and you'd rather it be two hundred thousand years?
Sure it is. There's a shit ton of shit on spotify, but if I want to listen to blackened ambient doom or technical jazz fusion death metal I might have one or two options each. Just because there are ~1000 Kanye/Taylor Swift wannabes trying to cash in on that played out mainstream sound doesn't mean there is enough music.
Interesting that you describe making music that appeals to a wide audience as “trying to cash in”.
Would you not say that using an AI to do essentially the same thing, just with the benefit that you don’t really need to pay anyone for creating the art, meaning you can target more niche preferences, is also “cashing in”?
No. I see strong generative AI allowing people who like fun genres outside the mainstream to make music in those genres that sounds good more easily. That will enable a creative explosion as new artists are given a rocket boost and established artists are taken to the next level and made more productive.
Generative AI is going to allow an amazing diversification and explosion of art and music, which is going to create the next big AI application once current systems for distributing content are strained to overloading - interactive recommenders. Imagine asking for a piece of content, getting recommendations, evaluating a short clip, giving feedback to the recommender and getting better recommendations as a result in a cycle until you get exactly what you want.
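To make that loop concrete, here is a minimal sketch of what such an interactive recommender could look like (all item names, feature axes, and numbers are made up for illustration; a real system would use learned embeddings and an actual listener in the loop):

    # Toy interactive recommender: score items against a preference vector,
    # take feedback on the top pick, and nudge the preference vector.
    def dot(a, b):
        return sum(x * y for x, y in zip(a, b))

    # Hypothetical catalogue: item name -> (tempo, distortion, ambience)
    catalogue = {
        "ambient_doom_01":   [0.2, 0.9, 0.9],
        "jazz_fusion_metal": [0.9, 0.8, 0.1],
        "pop_ballad":        [0.5, 0.1, 0.3],
    }

    preference = [0.3, 0.5, 0.5]   # starts roughly neutral
    rate = 0.3

    for round_no in range(3):
        # Recommend the item that best matches the current preference vector.
        pick = max(catalogue, key=lambda name: dot(catalogue[name], preference))
        print(f"round {round_no}: recommending {pick}")
        # Stand-in for playing a short clip and asking the listener;
        # here we pretend the listener only likes heavily ambient tracks.
        liked = catalogue[pick][2] > 0.5
        sign = 1.0 if liked else -1.0
        # Move the preference vector toward (or away from) what was picked.
        preference = [p + sign * rate * f
                      for p, f in zip(preference, catalogue[pick])]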
Maybe I'm a luddite, and I'm not really into technical jazz fusion death metal, but I do like some technical music and some metal music.
At least part of the reason I like technical music is because it is challenging to play for the musician. If an AI 'plays' 'technical' music, it loses its appeal for me.
Similarly, a lot of the reason I like metal music is because of the emotion and energy poured into it by the human creator.
Being able to ask a computer to make noises which sound like some musicians you like is not, in my opinion, the same as creating art.
If your point is that we already have more than enough music than needed, isn’t that an indication that current copyright laws don’t need to be nearly as strong as they are in order to continue to promote the “useful arts”? (at least when it comes to music)
While this may be true, there is a middle ground between 'copyright lasts for the heat death of the Universe plus seventy years' and 'everything is stolen by robots the instant it's created'.
What we may have here is a need for some new use cases in copyright. The current laws were written with humans in mind -- human creators, human consumers. To retroactively say, "well, you released this to the [human] public under U.S. copyright law, therefore the rights you granted to humans extend to training AI systems as well" could be unfair.
every now and then I like to point out that not every country has the same understanding of the purposes of copyright as the U.S.; in fact this whole promotion thing seems to be a specifically American argument (although I'm obviously not familiar with every country in the world, so maybe some other countries also have this conception and I'm just not familiar with it - maybe some countries that based their national laws on the U.S., for example)
on edit: I just realized that of course British copyright law is also very similar to the American model.
The statement I replied to spoke of the point of copyright being to encourage artists to create more art; not simply "to create more art".
But neither of those is really what's written down, with the (U.S. law) point of copyright being "to promote the Progress of Science and useful Arts".
We might interpret this as: training AI is progress of science, and thus the point of copyright includes training AI. Although I'm not sure that we really need copyright to do that, and accordingly, I doubt that is the intent of the law.
Really, the issue here may be that copyright law was not formed with AI systems in mind at all, neither as creators nor as consumers, and trying to apply it to AI systems, or to reason about copyright law as it pertains to what AI systems do, doesn't necessarily work very well.
Maybe we need to go back and amend all of our written laws with a phrase like "for humans", just like so many science headlines need to include the phrase "in mice"!
the point of copyright is to allow people to exploit their work for material gain, which in turn benefits society by encouraging people to create works
if only Microsoft (OpenAI) are able to exploit works for material gain, at the cost of literally 100% of the rest of society, why should society allow Microsoft to do this?
Yup, I agree. These models should be open by law. Currently the push for open models isn't happening because it's the initial chaos time, but as soon as we understand a little bit more there will be either regulation or open source reproductions. I mean LLaMA already exists, but even that has bullshit license stuff around it.
Even if we take copyright in its most limited, original conception, why would all value from creative works accruing to generative models that train on them ever encourage anyone to create these works?
IMO, what LLMs are demonstrating are the fundamental contradictions of Capitalism, where now being a capital owner with a bunch of GPUs is supposed to give you exclusive returns on the sum total of human intellectual and artistic labor. I have a feeling that people aren’t going to take to that too kindly, so we’ll either see robots mowing down the masses of unemployed, a Butlerian jihad, or various states assuming control over their productive capacities and redistributing the return in the form of greater safety nets.
Funny, they seem to be more proactive about putting laws around it there already (1), and that hasn't stopped progress at all, while protecting the vested interests they care about.
If only our lawmakers were so capable of a) understanding what's going on and b) actually passing laws when it mattered.
That is obviously going to happen in mainland China, especially when it may affect the CCP's plans in some way (e.g. even "morally" in the case of Xi Jinping deepfakes).
Are they going to enforce it for products/services sold overseas? I'm not so sure, not until I see watermarks in Tiktok's content.
Just look for the words "China IP law infringement" in Google FFS; if that is trolling, well, then I'm trolling.
About watermarks, I'm obviously talking about AI generated content watermarks, not general tiktok watermarks. I guess I should've been more concrete with my phrasing as some people can't read considering context.
I think perhaps one of the reasons they were so quick to push this was the introduction of a section saying that content generated by such services should not contain elements that could subvert state power, incite secession or disrupt social order, according to the rules [1].
Is there any evidence of mass compliance with these regulations? It seems basically impossible to enforce in a humane and fair way. Such laws will be used to prosecute political minorities if they are enforced at all.
Have you checked on the state of AI in China lately? It might be enlightening if you think there's some sort of arms race going on, or anything even close to that.
Also, if you follow your logic through, you are arguing for abolishing all laws in the US that constrain US companies if some other countries can ignore these laws and get the upper hand. Keep in mind that these laws (like intellectual property, copyright, patents) are the reason the US has innovated so much in the first place.
Nope, I haven't and I also didn't mention or suggest any actual arms race. My point is that any policy regarding AI (or any policy for that matter) is going to be hard to enforce when including potential (economically) "rogue" states, just that.
Now, if you have any information about AI and China, enlighten me. The more one knows the better.
PS/edit: I need more information about your last ghost edit. China (and others) would say otherwise about IP laws. If anything I would at least say that innovation can be either cut or assured through IP laws depending on the domain/technology, but it is difficult to conclude that absolutely for all cases.
It's not going to make China's AI better than the US's. Not nearly enough, they're so far behind. So giving them a competitive advantage is not something that should matter when you consider whether this law would do good or not.
Yeah China hasn't been respecting IP for a long time and it's still well behind. We shouldn't become China just to win a race - there are rules to the game you know? If everyone's content is stolen and resold by an AI then why would anyone want to create content?
It would go back to the best reason: for the joy of creating. The field would be reduced to those who enjoy making art for the sake of making art. It would also do wonders for filtering out the amount of crap that people produce. We would all have a much more rich and varied diet of art to appreciate.
Possibly. When wearing my "artist" hat, I work on two different things: art that I personally care about, and art that might sell.
If AI made all the art that might sell, that would give me more time and energy to work on the art I actually care about, but maybe less sales from art.
That's the thing. "Art that might sell" is corrupted. The diversity and uniqueness of art we would encounter regularly would be so much higher if that category was simply eliminated.
I agree. A great deal of "art that might sell" is already borderline repetitive, uncreative garbage. Maybe pleasant garbage, but still. Nothing anybody would be really interested in, except to add a bit of color to empty space. The more creativity you inject, the less likely it is to sell.
So far AI art is doing the opposite: increasing the amount of crap that people produce. The kinds of people who were once limited to underpaying ghostwriters to make spam books for Audible[0] can now chuck together a few prompts to ChatGPT and Stable Diffusion and get the same result. Yes, you have to be good at art in order to make good art with AI, but that doesn't matter when your goal is to create spam.
The idea that artists should "do stuff for the joy of creating" is just plain insulting, too. They already do that. While artists would love to see, say, AI art models that were trained with licensed or public-domain data, training data theft isn't even their biggest concern. Their biggest concern is having the fun sucked out of their job as the artful minutiae of drawing or writing is replaced with finding the correct combination of words to make Stable Diffusion draw the character you want with exactly the same details every time. It would be like if you worked at a PC building shop and one day the boss said "Actually we're just going to be an Apple authorized reseller now." The fact that the thing destroying artists' jobs was trained on their own work is just insult to injury.
To be clear, though, AI doesn't "steal and resell content" in the vast majority of cases, either. Regurgitation is a thing, but the cause is duplicate data in the training set making it advantageous to memorize a few images to improve loss metrics. Most diffusion model architectures are not big enough to memorize the whole training set, or even large pieces of it.
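For what it's worth, the mitigation implied here (deduplicating the training set so no single image is over-represented) is straightforward to sketch. The example below only catches byte-identical files; real pipelines use perceptual or embedding-based near-duplicate detection, and the directory name is hypothetical:

    # Drop exact duplicates from a set of training images by hashing file bytes.
    import hashlib
    from pathlib import Path

    def dedupe(image_paths):
        seen = set()
        kept = []
        for path in image_paths:
            digest = hashlib.sha256(Path(path).read_bytes()).hexdigest()
            if digest not in seen:   # keep only the first copy of each file
                seen.add(digest)
                kept.append(path)
        return kept

    # Example (hypothetical directory):
    # kept = dedupe(sorted(Path("training_images").glob("*.png")))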
I consider myself an artist, and I give away my work for free. I make my money in other ways, as should everyone, as money irreversibly corrupts the artistic process by dragging everything towards a bland middle ground.
Prediction: Hacks will flood spotify, youtube, and social sites with terrible cheap shoddy songs titled as if they are made by actual artists and the platforms will dish out royalties meant for authentic artists to total frauds. Real musicians may possibly abandon mega-music platforms altogether, and implement more secure (direct sales) platforms on their own web sites...
Big platforms really are neither helpful nor respectful to musicians, especially musicians who are working hard to be discovered. It's a total shame that Spotify charges musicians to be promoted on their platform while giving a ton of royalties away to so many fraudulent actors every year.
> Prediction: Hacks will flood spotify, youtube, and social sites with terrible cheap shoddy songs titled as if they are made by actual artists and the platforms will dish out royalties meant for authentic artists to total frauds. Real musicians may possibly abandon mega-music platforms altogether, and implement more secure (direct sales) platforms on their own web sites...
I don't think that would work; if a song is detectable enough to give the artist royalties for it, it's detectable enough to see that it's copyrighted music. IMHO music piracy is essentially dead, because the music industry has finally learned and made a compelling product. With services like Spotify and Apple Music, pirating music is just not worth it anymore. Why bother when for $10/month you can listen to all the music to your heart's content?
Universal asking streaming companies to not use their label's music for training is just stupid, because all they are doing is shooting the artists in the foot. Most AI in music is recommendation engines; do you not want your artists to be discovered? The streaming services are not stealing your music, you idiot [UMG], they are trying to make you more money by directing people to music they like.
You have no idea what you're typing about, nor how content ID works online for music. AI music generators are specifically designed to circumvent content ID.
AI generators put music and other content in a blender, scramble the source samples until they are unrecognizable, and then mash the tiny cut pieces of source samples back together into a collage. Kind of like putting a strawberry into a smoothie... It's no longer a strawberry, but the smoothie now has the extracted taste of strawberry AND the other original materials used to source the end product. Content ID only recognizes strawberries and whole fruits, not smoothies.
Western AI currently doesn't respect IP, as it scrapes the web, and turns it into proprietary model weights, with zero attribution, and with zero compliance with the wishes of the original IP's owners.
But it won't matter. If an integrated economy can be more efficient by copying each other internally, then it will eventually be more competitive at business worldwide, despite trade levies aimed against it.
There is existing case law behind clean room reverse engineering (the "Chinese Wall" technique). Using generative AIs to scale up this process is inevitable.
Clean room reverse engineering is a thing because software copyright was a mistake and the judicial system understands it was a mistake. In software, it is regular and common for programmers to use licensed software libraries with defined interfaces. In writing or art, the notion of a "compatible interface" isn't really there. If you wrote a short fan story with Marvel characters and you want to liberate it from Disney's ownership, you have to redesign all the characters; you can't just argue that you need a superhero with red nanotech armor and a drinking problem in exactly this particular shape for compatibility.
I'll agree to your premise, but I'm relying on the combination of the recent guidance from the Copyright Office on AI generated output (ie: the output cannot receive copyright) with Clean Room techniques, optionally followed by a human transformation. I agree that if the output is a character named Iron Man, then no dice; if, however, the output is a new song that sounds similar to the original song but isn't the same, then I suspect that this will clear the bar for copyright.
There have to be two AIs, one examines the original and produces a description while the second examines the description and produces a new work. Neither the description nor the new work are assigned copyright because they were produced by an AI (see [1]). However, if subsequently a human then manipulates the new work, the derived work is automatically protected by copyright assigned to that person (see [1, 2]). Furthermore, the new author has a defense against a copyright infringement claim by the owner of the original based on existing case law shared previously.
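A minimal sketch of that two-stage pipeline, with the two model calls stubbed out as trivial stand-ins (the function names and their behavior are hypothetical); the point is only the information barrier, i.e. the generator never sees the original work:

    # Clean-room sketch: AI #1 describes, AI #2 generates from the description only.
    def describe_work(original: str) -> str:
        # Stand-in for AI #1: produce a functional description of the original.
        return f"a work of {len(original.split())} words about its stated themes"

    def generate_from_description(description: str) -> str:
        # Stand-in for AI #2: produce a new work from the description alone.
        return f"[new work generated to satisfy: {description}]"

    def clean_room(original: str) -> str:
        description = describe_work(original)
        # The original is deliberately not passed beyond this point.
        return generate_from_description(description)

    print(clean_room("Once upon a time, a hero in red armor saved the city."))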
Saying a model is like jpeg compression is like saying a van is like an elephant, in that maybe if you squint so hard you're almost blind it's kinda true.
It's quite different, because JPEG produces a lossy copy of an image, whereas models produce a distinct image that shares many aspects of style and composition. In one case it's obvious the product is the original image, in the other it's a derivative that may or may not be sufficiently different to be protected under fair use.
I can write an algorithm that just generates random noise in the dimensions of art, and if I run it long enough it'll output things that are "close enough" to copyright works. There's no argument for that program being copyright infringement, and that holds for models as well.
> In one case it's obvious the product is the original image, in the other it's a derivative that may or may not be sufficiently different to be protected under fair use.
right, so if it's a completely original work let's clear out the training set and let's see if it can do it without it
no? so the output is a product of the input... a derivative work
and if you run it again it produces exactly the same thing? (sans artificial random injection)
obviously what I said is not an actual court case and is meant to be a facetious condensation of what would happen.
so after the several days in which Disney shows how this LLM was obviously trained on the dataset of available Disney content, and the defendant responds that they trained on a corpus of pseudo-Disney but then cannot produce this corpus, it would be reasonable to conclude they were lying.
The parent suggested "No one will be able to prove the source of the data", and that is the kind of thing that programmers for some reason often think is some fantastic gotcha that lets one really get away with anything, but it is exactly these kinds of things that the law is generally pretty good at handling.
The standard in US civil court is preponderance of evidence, not reasonable doubt. You'd have to go further than making an assertion (by bringing in an expert witness or something), but you don't necessarily have to prove something definitively. You just have to make the more compelling argument - you have to present evidence that it's more likely that your position is correct.
Since the hypothetical AI company wasn't able to produce anything in its defense, it lost.
> you don't have to prove anything, strictly speaking.
Strictly speaking, you do have to prove things (if you have the burden of proof on the specific issue), and the standard of proof is usually “preponderance of the evidence” (though there are a few other standards that apply to particular issues/circumstances in the civil justice system.) “Beyond a reasonable doubt” is a different standard of proof used for conviction in the criminal justice system, but it is strictly not correct to call proof under other standards something other than proof.
For sure, I was responding to the idea of "concrete proof" and why it didn't need to be "concrete proof," but I'm sure you're right. Thanks for keeping me honest.
I've edited my comment to reflect your correction.
I've been avoiding giving an answer in case someone who knew better than me came around (open invitation), but afaik expert testimony is evidence and the case is decided on the best evidence. So why not? You'll have to do some work to establish the expertise of the witness, and the opposing side might attack those credentials and such and they'll probably produce their own expert rather than just rolling over, but in principle you could have an entire case that consists of various expert testimony.
I don't think the suggestion is meant to be that this is a realistic scenario though, just that the court isn't as mechanical and naive as we programmers are prone to imagining (being stewards of systems that are largely mechanical and naive).
You're the one trying to ignore fair use as it has been broadly applied to creative works for something like 150 years, and make this about a very narrow slice of software copyright law even though people in the thread are talking about images and music. You're being willfully obtuse to try and make an argument because you are emotional about AI creative output.
So if I create a movie about cyber pangolin, a gory half human half pangolin cyborg fighting for Justice in a post bio-punk world where people are swapping their organs out on a daily basis due to immuno suppressant immunity; and I used a model that saw Aladdin, you think Disney could successfully sue Cyber Pangolin for violating the copyright of Aladdin?
Honestly at this point it's not about who is right, it's about who has the bigger legal team. So yes, I think Disney can sue and possibly win. Copyright system has not made sense in a while now.
It depends on how much of a knock-off the model's output is, no different than a human artist. Taking Disney's Aladdin and putting a cyborg eye on him probably will be a legal issue that the user of the art and seller of the AI output both would be liable for. AI is a tool, and it can be used for good and evil.
Training on shite results in a pretty poor AI compared to one that’s trained on quality data. Do I think that means that quality AI will keep up with “hoover AI”? No. Hoover AI still gets you to profitability and that’s mostly what matters in a capitalist arms race.
Real life contains a lot of shit data that you have to learn to filter by experience. An AI containing only the best will fall flat on its face when subjected to the dirty mess reality is.
Further prediction: There will be a certification process that will be required to prove you only used data you have legally obtained. Running or hosting a model will be grounds for a fine, or possible criminal penalties.
I think it’ll be more like using samples in music - probably not a big deal until you’re making a bunch of money, and then people come after you for licensing fees.
I'm from the future and this is true. There will certainly be AI regulations and licensed AI models and it had already happened with the Shutterstock and O̶p̶e̶n̶AI.com partnership.
Stable Diffusion and the Getty Images lawsuit will end with a settlement and licensing will be the option to go with.
I'm also from the future (2028 to be precise), and enforcing copyright on AI-generated content has become near-impossible. It's working about as well as DRM has in stopping video and music piracy. There's now BitTorrent-like software for AI, which allows groups of individuals to train and run their own AI on whatever they please, and they seem to have little regard for copyright. In fact, underground music from these creators is taking off, and despite the law requiring all AI-assisted songs to cite their influences for royalty payments, they're simply refusing to comply. The music is so good that the public has no appetite for prosecuting the individuals making it. In fact, most people are using these AIs to create their own custom music stations. The old guard is stomping its gold-encrusted shoes in anger, but the youth just don't care.
There's a cool song called 'Neural Harmony.' The creators used a mixture of classical and electronic music as input, but they never disclosed the specific songs they were influenced by. This has made it impossible for the original artists to claim royalties, and yet the public can't get enough of its sick beatz.
There's even an AI-generated album called 'Digital Renaissance.' The creators claim to have used thousands of songs from various genres as inspiration, but they never provided a list of the songs or artists. The album has gained a massive following and has even been featured in several popular playlists. There was briefly an attempt to prosecute them but the case was dropped after public outcry.
DRM was not the primary factor in reducing piracy. The music and video industries reluctantly embraced new business models that emphasized convenience and accessibility for consumers, largely in response to the frustration with existing DRM systems. For example, platforms like Spotify and Netflix made it easier for users to access content legally, decreasing the appeal of piracy.
Gabe Newell once famously stated, "Piracy is almost always a service problem and not a pricing problem." Steam's success can be attributed to addressing the service issues that initially led people to piracy, providing a user-friendly platform for gamers. So, while DRM might have had a minor role in curbing piracy, it's the innovative business models and improved services that made the most significant impact. Even now, downloading movies, albums, or video games for next to nothing remains possible, and those who prioritize money over time continue to pirate without issue.
> Even now, downloading movies, albums, or video games for next to nothing remains possible, and those who prioritize money over time continue to pirate without issue.
who's going to bother for music when spotify/youtube are free?
your time would have to have negative value for it to be worth it
Or companies will just move to a country where training on copyrighted data is 100% legal and allowed (like e.g. in Japan which had an explicit amendment to its copyright law to allow it), although it'll be interesting to see how such models would be treated internationally.
The EU also had its copyright law amended to allow data mining on copyrighted data sets. And I would not be surprised if US courts decide that training an AI is fair use. The goal, after all, is to create a machine that draws new images.
I think we will settle on proprietary AI being required to obtain licenses for its training data. Any model trained on public domain data will be completely public domain too.
So I imagine someone will train a model to do high-end legal work, and they will have to train it on the data produced by the people they hire to create it.
This of course assumes that society will have the same structure as today. What I actually think will happen is: we will completely delegate all our work to machines, and the concept of ownership will vanish as everything with the exception of land will be in abundance. I imagine in the future we will fight each other over apartments in cool areas or trade them for some kind of social credit which we generate by impressing other humans. No apartment will be worse than another, but proximity to natural wonders, cultural centres and networks will be paramount. After all, it's all about how we pick our sexual mates and social status.
I'm sorry you don't like it, but there's no way society functions the same once resources and servants are in abundance. Your bank account is relevant only when there's scarcity, and the current scarcity can be gone once machines are autonomous in enough areas to convert the material all around us into the things we need, using the practically limitless energy from the sun.
I stopped reading at "We have a moral and commercial responsibility to our artists..." It's true, but the big music companies are well known for their business practices.
It's illustrative that, even with its history of lopsided deals with creators, Universal Music has the ethical high ground in this case because at least they have legally-binding business arrangements with their artists.
There is only creative output bent towards the artificial framework of capitalism.
One could argue there is some weak ethical underpinning in "In the absence of higher moral values, one should simply adhere to previous agreements and laws..." Except previous agreements and laws give basically no guidance on the topic of "Can I use this data to train a machine to make more data?" Especially if the thusly-created data can't even, itself, be copyrighted. It's a novel use-case unpredicted by the existing copyright framework.
Reminds me of when Garth Brooks went on a crusade against used CD stores selling his albums without him getting a cut. After the predictable blowback, he claimed it wasn't his income he was fighting for, it was all the small indy artists he was protecting even though the bulk of used CD sales were of mainstream artists like Brooks. Eventually he realized no one was buying that excuse and he dropped his opposition (or maybe one of his legal staff explained first sale doctrine to him).
I know this is all about self preservation, but gosh I hate hindering progress.
It's going to happen: AI is going to learn lyrics and how to create music, it's inevitable. These are just roadblocks that are bad for everyone but the extreme minority.
Even if one company/country bans it, in 10 years, it won't matter.
I am seeing this sentiment so often online at the minute that it seems as though nobody has learned anything about DRM over the past 20 years.
So many threads on reddit calling Italy "backwards" for protecting citizens' data, and now people on HN expecting companies to give everything away for free because the outcome is "inevitable"!
There are a bunch of for-profit American AI companies. Why on Earth would another for-profit company, especially one based in another country, be OK with other people making money from their content? They can either look at developing their own AI platform, or build deals with the existing AI companies. It would be just plain stupid to give it all away for free, or at any price that isn't determined by themselves.
Companies don't have to give away everything for free. It's already free. Public domain is the natural state of information. They're the ones who insist on copyright so they can maintain the artificial scarcity delusion well into 2023 where AI is literally on its way to automating intellectual work. These irrelevant industries need to stop holding us all back and just disappear already.
> These irrelevant industries need to stop holding us all back and just disappear already.
Like artists, sculptors, writers, photographers, narrators, musicians, composers, and so forth? The very same industries AI requires to exist for training?
They will disappear. And we will be poorer for that.
Nope. People with the impulse to create will do it regardless. Sellouts without intrinsic motivation to create who are just looking to make money by creating products instead of real art? I won't mourn their disappearance at all.
That's an assumption that has not been tested in modern times. At least in the past, an artist could sell their painting.
And even if the assumption proves to be true, the volume will decrease dramatically as people are no longer able to make a living creating their art.
And no, Patreon and its ilk is not a sufficient replacement, not for full time jobs. It mostly doesn't even replace a job for the (comparatively few) people on it today.
EDIT: I for one will miss movies like "Everything Everywhere All At Once", which could not have been made as an "impulse" project.
> That's an assumption that has not been tested in modern times.
It's a fact as old as humanity itself. People will create because that's what people do. What isn't guaranteed is the existence of the billion dollar copyright industry.
> an artist could sell their painting.
Still perfectly possible to sell the physical canvas you applied paint to.
> the volume will decrease dramatically as people are no longer allowed to make a living to create their art
So what? That's a good thing. The market is filled with cheap art that's made just to sell copies, stuff that wouldn't even exist at all if not for the profit. I don't consider that a big loss at all.
Yes. A tiny fraction of a percent of people (compared to the volume of smiths in the past) do continue traditional blacksmithing.
The results of their work are not IP though, which makes the comparison too weak to serve as proof that artistic works that create only IP will continue unabated.
Blacksmiths in America don't make money, it's a hobby they do for fun. If the argument is that people will stop doing hobbies because a machine can do the work faster and better I'm pretty sure that's been proven wrong.
No, you misunderstand. The stuff is still published, because these are works that people want to share. They're just not on the open web anymore, they're invite-only web spaces, or internet spaces that aren't web-based at all, because there appears to be no other way to avoid having them used to train AIs.
I have no problem with that. I'd like to warn you that this is essentially security through obscurity. Only one copy ever needs to make it out of that closed space. The more people in there, the higher the odds of that happening. Once it does, all bets are off.
There's also option to simply accept that you cannot own ideas. Let them go. Once I accepted this, I felt like I was finally free.
I released some software as GPL but truth be told I couldn't care less if someone violates it. I'm certainly not gonna waste my limited time on this earth going to court over it.
The problem comes when people actively don't want to further the training of AI. It's not so much about not accepting that you cannot own ideas as it is about not wanting to contribute to a thing that you believe is going to result in greater suffering for most people.
I think the only way to ensure that these days is to not allow data to ever leave your computer under any circumstances. I have no doubt Microsoft is using the software I published to train its Copilot thing; I published it with that understanding. My only problem with this is the hypocrisy of it all. Microsoft won't allow their people to even look at AGPLv3 code lest they unconsciously reproduce it, but they will let the AI look at AGPLv3 code while conveniently excluding their own proprietary software. It should be trained on everyone's code, especially the proprietary stuff they're so protective of, or not trained at all.
> Just don't expect me to take absurdities like delusional people thinking they own numbers seriously.
The same governments that let you 'own' physical items are the ones who say you can 'own' IP as well.
If they didn't - and didn't back it up with force - you wouldn't 'own' anything at all. Cherry picking which version of ownership is 'absurd' is an exercise in futility, since it's not up to you.
Nah. I own physical things by literally holding onto them. Keeping them inside my property to which only I have the keys. Defending that property by force if necessary. Government doesn't have to "let" me own anything, it merely recognizes and formalizes the de facto reality of things. Meanwhile we have these people with their made up delusions of ownership of ideas and all the contradictions inherent in that, and I'm supposed to pretend it's not absurd?
Whether or not the world conforms to their made up copyright reality isn't really up to them either. The simple fact is: information, once discovered, is infinitely copyable. No amount of lobbying is ever gonna change that. People are still gonna train AI models with "their" data and there's nothing they can do about it short of destroying free computing as we know it by making it so we can only execute software they approve. Surely you don't want that, fellow Hacker News user, given that such tyranny is the antithesis of everything the word "hacker" stands for.
> Government doesn't have to "let" me own anything,
You seem to be confusing possession with ownership.
Ownership is the social relationship by which you exert control independent of immediate possession, but you’ve just described how you can maintain possession.
Yup. By his logic, if a thief holds someone at gunpoint and takes their property, then they now own it. Furthermore, if they are then caught, by his logic, that property shouldn't be returned to the victim because the thief now owns it, apparently.
Lol. They literally do own that property. They'll even sell it off for drugs or whatever, as if they did own it. It's a very rare case that police will get off their asses and retrieve "your" stolen property. You can give them a GPS signal to the property and they still won't do it. Believing in this "possession/ownership" dichotomy is just as delusional as believing in imaginary intellectual property. It's just a flat out denial of the reality of things.
You know what's funny? In my country, Apple's security is more effective at deterring criminals than any of this "ownership" crap. A stolen iPhone is basically a brick that's worthless to anyone else. So they'd rather target Android phones instead which they can more easily reset and pass off as some used phone they own.
Do people own property? Do they even have money? Do you own a license to your software? If it is all just on paper or on a screen, it's just numbers. The entire system is make-believe. If you choose not to believe in intellectual property, you must also acknowledge that other aspects of capitalism also do not actually exist and are a shared delusion.
However, the shared delusion makes the world go round as-is.
OK, "copyright bad", "intellectual property rights bad", so what's the alternative?
> If you choose not to believe in intellectual property, you must also acknowledge that other aspects of capitalism also do not actually exist and is a shared delusion.
I already do. Dollars? It's just paper, not even backed by anything. People believe in it so it has value for the time being. It will literally go to zero if people stop believing in it though.
It was hard for me to accept these truths. I don't post them here lightly.
> However, the shared delusion makes the world go round as-is.
People who choose to believe in delusions don't get to complain when reality inevitably comes creeping in.
> OK, "copyright bad", "intellectual property rights bad", so what's the alternative?
Post scarcity. Automate everything and provide abundance, eliminating the need for an economy to begin with.
Dunno. They'll probably get another job and use that to sustain their real interests. Or maybe AI will automate everything and we'll finally enter the age of post scarcity. I'm an optimist. What'll probably happen is we'll descend even further into cyberpunk hell.
A work that is protected by copyright - which most works are by default in the majority of cases - is by definition not in the public domain.
To offset that nitpicky line above, a genuine question: suppose I were to produce a work and share it with you directly, in private, and perhaps for good measure clarify to you that I am only sharing it with you personally in the hope of getting your feedback on whatever it is that I made, and that I do not want you to do anything else with it beyond the minimum required to fulfil that purpose.
Wouldn't you then see any natural wrong in sharing my work with others or even the broader public, regardless?
> A work that is protected by copyright - which most works are by default in the majority of cases - is by definition not in the public domain.
Every single idea is public domain from its inception. Actually, all ideas already exist; we humans just discover them. Ideas are information, information is bits, and bits are numbers. All numbers already exist, and all "creation" is merely discovering those numbers.
Any assignment of ownership obviously happens after the fact and is completely ineffectual, especially in the 21st century, the age of information and networked computers with an infinite ability to copy bits at negligible cost. The technology really exposes that sham for what it is, and it's a shame how everyone reacts by trying to destroy the perfectly good technology instead of fixing the fraud that is "intellectual property".
> Wouldn't you then see any natural wrong in sharing my work with others or even the broader public, regardless?
I'd see it as a very rude thing to do to you personally. Simply because you asked me not to do it and I generally try to be nice and respect people.
A natural universal ideological wrong though? No. Plenty of people publish the private communications they receive. It's just information. Publishing it might hurt my social standing with you, but I personally don't believe in anyone ever going to jail over it.
Now that you've written it out for me here (thanks for which btw, and for your thoroughness in particular), I see that I should have been able to infer your angle from your previous comment. For the record, not that I was meaning to imply anything with my hypothetical question, but now I know where you were coming from I see that it's not very relevant at all and I wouldn't have asked it.
It would require an unthinkable, near-unanimous societal willingness and cooperation, comprehensive planning the likes of which I believe humanity is practically incapable of today with currently available tools and mindsets, and an ultra-careful yet pertinacious iterative implementation process that will probably need to take place over a multi-generational timeframe.
If, however, we would somehow pull all that off and manage to rework our world into one that is entirely formed around the philosophy you describe above, then I am fully convinced that not only humanity, but also our planet and in fact the rest of the universe too would be better off for it.
> They're the ones who insist on copyright so they can maintain the artificial scarcity delusion well into 2023 where AI is literally on its way to automating intellectual work.
AI won't be able to automate anything if we use the legal system to forcefully reduce the size of its training set by 99.999%
I have no doubt that at some point this technology will make it to our actual computers instead of being siloed away on some corporation's servers. That way there's nothing they can do about it unless they up the tyranny 1000x and destroy our freedom to execute any software we want on our own machines.
> I have no doubt that at some point this technology will make it to our actual computers instead of being siloed away on some corporation's servers.
thankfully Moore's law is dead
> That way there's nothing they can do about it unless they up the tyranny 1000x and destroy our freedom to execute any software we want on our own machines.
I'd probably prefer this to a world where all knowledge workers become permanently destitute
and I suspect the vast majority of the world's electorates will agree
(do people prefer being able to eat over some ability to run software on their computer? I suspect so)
Because (at least in America) generative AI is an obvious transformative case allowable under Fair Use, and even if courts rule otherwise, like Sci-Hub it's such an obvious net positive for humanity that it's ethical to use even in the face of IP cops demanding you stop.
Making a large profit off of other people’s work, without their permission and without compensating them, is not progress.
If someone said “for the sake of progress we just REALLY need to use this GPL’d code in our proprietary closed source app”, I don’t think that would fly around here.
Using content as training data is not making a profit off other people's work.
Musicians don't pay royalties to every other musician they've ever listened to, but that's literally their training data, the brain is just a large neural network.
You know that because you read it somewhere else, because someone put it online. Your brain took that as training data, and now you're regurgitating it, are you going to pay that person for the data your brain is using?
A musician is just a big neural network, and they sell content that is nothing but the product of all their influences, of all the music they listened to.
I don't see a difference between a musician making music after having listened to thousands of hours of music throughout their lives and an AI generating music.
It's the same thing, in one case you have neurons made out of flesh, in the other neurons made of transistors and code.
It's not a person. Some people make a lot of money off of it, and they are able to do this by siphoning knowledge and effort off of millions of others.
Artists learn through blood, sweat, and tears. No artist achieves excellence without significant effort, and won't arrive until they've attempted original works many times. And they can spend their entire lives without ever finding any success or actually being particularly good. Are their outputs colored by the culture and prior art they've experienced? Absolutely. That's how learning works.
Compare that to AI. It doesn't do any actual "art" work to become an artist, nor do the people who train it; it just sucks up what's fed into it, without the consent of the creators. Then, it can create much, much faster than an artist, without breaks and without pay, and it is owned and directed by a huge faceless company as, effectively, a fleet of mindless slaves, diminishing the livelihoods of the very people absolutely essential to training the model.
From a moral perspective, all you really need to ask is, would these artists have consented to this training if they knew mindless AI slaves would replace them?
> If someone said “for the sake of progress we just REALLY need to use this GPL’d code in our proprietary closed source app”, I don’t think that would fly around here.
Arguably that's because putting GPL code in a proprietary app is making a free thing closed. FWIW, I don't like proprietary AI models either, but I think open-source ones shouldn't be hampered by the copyright mafia.
> Its going to happen and AI is going to learn lyrics and how to create music, its inevitable.
Why is everyone in AI so open to stealing people's work and why do y'all think "ai" is something with a mind of its own that just runs about and does things? It's a software product and the companies owning it must play by the rules. Period.
“Companies owning it” is presumably not what the parent had in mind (or at least, not mainly). People should have access to locally-running AI just as megacorps do. And I don’t think it’s reasonable to require licensing for every little thing used to train those, much like we (hopefully) wouldn’t require people leaving the cinema to be memory-wiped to prevent them from stealing creative cues from the film they just watched, if memory wiping was a thing.
> why do y'all think "ai" is something with a mind of its own that just runs about and does things?
Because GPT-4 is already capable of doing this (very poorly) if incorporated into a larger system that provides it with REPLs, internet access, an initial goal, and some form of memory. GPT-5 will be more capable, and AI will only get better at this.
Getting the same vibes to be honest. I've seen quite a few switch from crypto currencies to ai. Not sure what they think they will achieve.
However, the difference between AI and NFTs is that AI is powerful and much needed. But there need to be rules to the game.
Also, playing by these rules means better AI. It means that instead of spewing content it would actually have to learn, and the result would be far more accurate and far more reliable output.
These arguments are a bit boring. Machine learning for a chat bot is not a person "learning". It's software. Also, I hate to break it to you, but humans also pay to learn. It's why books, universities and other learning content cost money one way or another.
> Machine learning for a chat bot is not a person "learning". It's software.
I am not saying that it is exactly the same.
Instead, I am saying that if a human can profit from other people's work, by learning from it, then there are clearly exceptions to this idea that using other people's work, for any reason at all, is "stealing".
It is perfectly legal to use other people's work, for all sort of things. Its not stealing, in many situations.
This hard rule that you have made up is clearly not the situation, and your hard rule, where you just call all of it "stealing", would similarly apply to all sorts of other, completely allowed behavior that nobody thinks is "stealing".
> but humans also pay to learn
Nobody is going to successfully be able to sue you because you downloaded their publicly accessible work and learned from it, actually.
If you release your creative works, for people to consume, and people consume it, then they are similarly allowed to learn from it.
> For the most part it's covered by contracts, terms, agreements, laws, and so on.
Actually, it's mostly covered by the "laws" part, and the laws allow people to use other people's work all of the time, even if the person doesn't want you to use it and there was no agreement to allow it.
That is what I am saying. I am saying that it is legal, in all sorts of situations, to use other people's work even if they object/don't want you to.
Of course people can use other people's work, within terms and conditions. The people that build data models are required to follow laws. They are more than welcome to use content within their constraints.
> Of course people can use other people's work, within terms and conditions.
No, actually there are many situations where the terms and conditions can be completely ignored, and people can use other people's works without permission, or without caring about the terms and conditions.
> They are more than welcome to use content within their constraints.
No, they can ignore the constraints, because the law allows people to use other people's works, without getting permission, and without following the constraints of the original creator.
> The people that build data models are required to follow laws
The point is that the law allows people to ignore the wishes of the original creators, and use their creative work, in many situations, while ignoring what the original creators want or has authorized.
> we made libraries so people could learn for free without being beholden to their capitalist masters.
Naturally. If you copy the "free" content from those libraries and resell it you are committing plagiarism.
Just because something can mimic humans doesn't mean it is human. ML is just that: software. There's too much pareidolia out there in AI. Sad, because it's a great concept gradually getting bastardised.
- Entirely remove intellectual property protection granted by copyright,
- Music is freely copiable. Not like we’re making much money with it anymore.
- Software is free by default. Use SAAS if you don’t want to give away your IP. Did you disclose your code? Too bad, ideas can’t be prevented from being copied.
- No more patent trolls. Find another way to fund drug research.
- AI can train on anything. We make a big leap forward.
How naïve of you to think that AI won't be coming for SAAS next. With all the externally visible APIs, the social media activity of all your employees, and a video feed through the window of one of your home-working coders, a next-gen coding AI just reverse-engineered your entire software stack. With how porous modern systems are, no back-end code worth stealing hasn't been stolen.
Reminds me of Jamie Dimon telling congress that BTC was a scam and should be banned while paying an entire team of people to work on a BTC play for customers.
Right? How much of these labels' music is focus grouped? Or from performers signed by scouts looking for stuff that sounds like what's already selling? Or tweaked and sweetened with software to optimize based on play statistics?
It's not that they have a problem with derivative work. It's a huge part of their catalog. They'd just rather train software on their digital assets and use it to further optimize their own product. Why would they want to help others do the same?
I doubt popular music fans are going to consciously prefer "JamBot69" over "Band of Hot and Interesting People", but will they know if the latter had their songs penned or "improved" by the former?
Good for you, but it's definitely not always the case.
What's your take on AI training?
To me, there's this new thing in the world. Are you going to try and stop it (unlikely), drop out (barely possible in a networked world), or try to make it work for you?
If there's a potential for it to be fair-use, the researcher can scrape the data any way they can get their hands on it. Universal is certainly able to ask specific channels to refrain from facilitating this use case, but that won't stop the training (it'll just make it necessary to go through the analog hole and slow it down a bit).
Yup, copyright enthusiasts can hobble a few specific applications but everyone else will just switch to more customisable / performant models with no training attribution or fingerprinting.
Users are going to gravitate to whatever works best, and 'twas ever thus.
We are waking up to the fact that these AIs are really more like juicers - they extract the "good stuff", i.e. replicable patterns, after which the original data is no longer needed. All the free stuff we put onto the web is what makes GPTs so smart. So of course the media titans will move to block the juicing of their sweet, sweet IP.
Excellent. Now do software. Open or closed, companies _must_ respect licensing terms in order to incorporate code in their products. They resell it without permission in most cases.
Maybe the answer is to make every individual instance of a copyrighted work in an AI's training set a full breach of copyright for each monetized/monetizable inference made by the trained AI. Maybe OpenAI alone has already racked up quadrillions of dollars in back damages.
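A rough back-of-the-envelope sketch of that theory, using entirely made-up counts (the $150,000 figure is the US statutory maximum for willful infringement; the work and inference counts below are pure placeholders), shows how quickly "per work, per inference" damages blow past a quadrillion:

    # Hypothetical numbers only - this just shows how "per work, per inference" compounds.
    STATUTORY_MAX_PER_WORK = 150_000        # US statutory maximum for willful infringement
    works_in_training_set = 1_000_000       # placeholder: copyrighted works ingested
    monetized_inferences = 1_000_000_000    # placeholder: paid inferences served

    damages = STATUTORY_MAX_PER_WORK * works_in_training_set * monetized_inferences
    print(f"${damages:,}")  # $150,000,000,000,000,000,000 - roughly 150 quintillion dollars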
The beauty of monetizing ideas is you can assign any dollar value you want to them, because the whole idea that ideas have dollar values is, itself, a made-up idea.
Can you block the training? It would seem you could block the generation of copyright violations. But training an AI is really the same as me listening to the music. The problem occurs if I copy the music.
The difference between "recording" and "training" is really fuzzy. Not even the "exact reproduction" provides a well-defined line, as re-mixing and modifying a song is still considered a copyright violation.
People love to draw the analogy to human listening, and while in principle I agree, the AI is still not human and the argument isn't necessarily valid - it needs arguments for why it's valid, too.
> People love to draw the analogy to human listening, and while in principle I agree, the AI is still not human and the argument isn't necessarily valid
It could be worth considering: humans are [ostensibly, I claim] intelligent by default. The details obviously differ from scenario to scenario, but humans can learn to speak just by being around their immediate family; they may only read a few dozen books in their lifetime; they can appreciate music when listening to their very first album.
AI (at least the current rendition of it) is trained on a massive collection of data before we could even start to claim it is intelligent. You can't take an empty neural network, have it listen to a single album, and expect it to get anything worthwhile out of that. It can't read a few dozen books and be able to do much of anything. It needs a much, much larger data set before we would even think of calling it intelligent.
So I think, to compare a human listening to an album and a trained AI system listening to an album, yes, those two things might be reasonably analogous. But to even get the AI system to that point, it needed to be built up using a huge quantity of data. Did the copyright holders of that data consent to that training usage? I think, in that respect, there is a difference compared to a human listening to a recording.
Randomly pontificating in a discussion forum here; not offering a well-thought-out plan for everyone to live by!
> So I think, to compare a human listening to an album, and a trained AI system listening to an album, yes, those two things might be reasonably analogous.
This still relies on the implicit assumption that AI and the human mind work the same way. While AI and humans may generate similar-looking results, and need a similar amount of data fed into them, is that really enough evidence that they work the same way, and should therefore be treated the same way?
Additionally, this analogy ignores the fact that the human mind cannot be replicated, scaled and automated, while an AI can. Isn't that aspect highly relevant in the case of producing content?
Sure. Maybe that analogy is in fact not even fair. The main point I was getting at is, an AI system has already used a huge quantity of data before we could even try to compare it with what a human does.
But even at that point, I would agree, the analogy still may have holes in it.
> But training an AI is really the same as me listening to the music. The problem occurs if I copy the music.
The trained AI model is a digital artifact that can be copied and distributed in a way that 'me listening to the music' cannot. The model contains detailed information about copyrighted content in a format capable of reconstituting infringing derivative works from a suitable prompt. It's 'really the same as' a huge content archive stored in a proprietary format with some lossy compression.
> The problem occurs if I copy the music.
Yes, I would argue that the 'problem' (copyright infringement) occurs when a trained AI model (essentially a big content archive) is copied and publicly distributed. For a hosted service, I would argue that the infringement occurs when the service copies and distributes an infringing derivative work in response to a prompt.
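As a toy sketch of that "lossy archive" framing (emphatically not how production LLMs store things; the source string, K, and generate are just illustrative names for this sketch): even a trivial order-K character model fit to a single text will reconstitute that text verbatim when prompted with a fragment it has seen, which is the sense in which a trained artifact can carry its training content around with it.

    # Toy illustration of "model as lossy content archive" - not how real LLMs work.
    from collections import defaultdict

    SOURCE = "Imagine this string is the full text of a copyrighted novel..."  # placeholder
    K = 8  # context length in characters

    # "Training": record which character follows each K-character context.
    transitions = defaultdict(list)
    for i in range(len(SOURCE) - K):
        transitions[SOURCE[i:i + K]].append(SOURCE[i + K])

    def generate(prompt, length=500):
        """Continue the prompt using the recorded transitions (greedy, first option)."""
        out = prompt
        while len(out) < length:
            context = out[-K:]
            if context not in transitions:
                break  # context never seen during "training"
            out += transitions[context][0]
        return out

    # A prompt lifted from the training data reconstitutes the original text.
    print(generate(SOURCE[:K]))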
> But training an AI is really the same as me listening to the music
That's what we think right now. But copyright never anticipated that machines could listen to music, read books, or watch movies. The scale at which machines can consume content, and the extent to which they can remember it, might make it significantly different.
But don't you have to prove a work is derivative based on its content? For example, you can't argue that NBA Youngboy's songs are derivative of the Beatles just because he heard them. You have to show in the work itself that it is derivative. Likewise, you'd need to show the output of the AI is derivative.
that's fine, we can use copyright law to go after everyone that's ever used or operated an AI model that was built on unlicensed works
my work has been ingested into Copilot, so every line that Copilot has ever output is an unauthorised derivative of my work
if the fair-use question is settled in the way I think it will be, I personally plan to go after every single person/entity that's ever published code using Copilot
$150,000 an infringement, isn't it?
(and if it isn't, well that's the end of copyright entirely)
Wait, hang on: the AI bros want their 'fair use' back and want to freely train their AI models on copyrighted music, for redistribution without permission from the record labels.
Not this time.
Not with the music and recording industry, which has deep pockets and isn't tired of litigation. Even Google wouldn't risk it.
The record labels have deep pockets, but so do tech companies/VCs. This one isn't going to be settled by a competition of who can throw more money at the problem; it will be settled as it should be: a mostly arbitrary decision by some 75-year-old judge who doesn't really understand what is going on.
New legal lessons from the age of AI. I wonder if we will start with judges ruling on whether AI-originated music can be copyrighted, and whether AI trained on copyrighted music has to attribute its creations (to what?).
Imagine how much human resource is spent trying to preserve profits. Imagine your job is to be a lawyer who slows human progress for short-term corporate profits. Imagine you are a judge who has to waste time on this case. Imagine you are a lawmaker and, instead of fixing US healthcare, you are getting lunch with a lobbyist trying to make some variant of AI illegal.
All of those people have an existence that is bad for society. I imagine if you prioritize personal profit, it's easy to do the job. It's much harder if your worldview involves doing good things for society.
Law is pretty vital to being able to have a planet with billions of people on it.
If for no other reason than people are really crafty and nearly optimal for wrecking up each other's business if they put their minds to it. Imagine how much human resource is spent trying to preserve profits... Now imagine how much would be spent on preserving anything without a legal system in place. "Nice house you have, hate to see anything happen to it" etc.
But intellectual property rights are property rights (right there in the name; granted, it's a different kind of property, infinitely replicable, but it's still "intangible personal property," protected as such).
... and infringing a valid copyright willfully for purposes of commercial advantage or private financial gain is a criminal offense in the US (17 U.S.C. § 506(a)). It can rise to the level of a felony.
You can, for personal reasons, assert there's a fundamental difference in kind (and you'd be joining an argument dating back past the founding of the U.S., an argument where Thomas Jefferson once said of protection of intellectual property that "other nations have thought that these monopolies produce more embarrassment than advantage to society"), but in the sense that any law has any real meaning or force, it's exactly as much the law as the law against kicking you out of your own house because I like your house better.
The fact that you need to obtain the rights to samples you use has never been about any meaningful or harmful form of 'stealing' in the slightest. It was about harming hip hop and those who created it. Sampling and interpolation is a key part of many forms of music, and is a beautiful craft in and of itself. Restricting it from the outside for reasons of profit (usually completely detached from either end of the actual artistry) has never protected anyone's IP in a way that preserves and advances the art form. It 100% holds the art form back, and does nothing else of merit.
Generative music is not using samples - just as a musician who learns a song on guitar, say Wonderwall, and then writes a Hey There Delilah, is not sampling.
Just as we don't need a (sample-priced) license to listen to songs, AI should be able to use regular music subscriptions to listen to music and create new music.
This may eventually result in regular music subscriptions being priced closer to sample pricing.
I've seen a specific hex string used in a robots.txt equivalent in one of the big players' repos. It may have been an OpenAI or Meta repo on prompt fine-tunings, but I can't recall which at the moment.
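For reference, the better-known version of this kind of opt-out is a plain crawler directive in robots.txt. The snippet below is illustrative only: GPTBot and Google-Extended are, to my understanding, the published user-agent tokens for OpenAI's crawler and Google's AI-training opt-out, but whether any given trainer honors them is a separate question.

    # Illustrative robots.txt opt-out for AI training crawlers
    User-agent: GPTBot
    Disallow: /

    User-agent: Google-Extended
    Disallow: /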
Soon AI will not need a lot of training data anymore.
Anybody will be able to let their computer listen to one Metallica record and say "Do you hear this? This is called heavy metal. Can you make an album like this? With distorted guitars, a screaming singer and heavy drums?" and the AI will understand the whole concept and make a new heavy metal album.