this post was submitted on 09 Aug 2025

831 points (98.6% liked)

AI industry horrified to face largest copyright class action ever certified (arstechnica.com)

submitted 4 days ago by Davriellelouna@lemmy.world to c/technology@lemmy.world

117 comments

[–] panda_abyss@lemmy.ca 36 points 3 days ago (1 children)

Well, maybe they shouldn't have committed one of the largest violations of copyright and intellectual property ever.

Probably the largest single instance ever.

[–] ivanafterall@lemmy.world 9 points 3 days ago

I feel like it can't even be close. What would even compete? I know I've gone a little overboard with my external hard drive, but I don't think even I'm to that level.

[–] nulluser@lemmy.world 218 points 4 days ago (1 children)

threatens to "financially ruin" the entire AI industry

No. Just the LLM industry and the AI slop image and video generation industries. All of the legitimate uses of AI (drug discovery, finding solar panel improvements, self-driving vehicles, etc.) are completely immune from this lawsuit, because they're not dependent on stealing other people's work.

[–] a_wild_mimic_appears@lemmy.dbzer0.com 19 points 4 days ago (4 children)

But it would also mean that the Internet Archive is illegal, even though it doesn't profit: if scraping the internet is a copyright violation, then it is as guilty as Anthropic.

[–] magikmw@piefed.social 24 points 4 days ago (6 children)

IA doesn't make any money off the content. Not that LLM companies do, but that's what they'd want.

[–] axmo@lemmy.ca 15 points 4 days ago

Profit (or even revenue) is not required for it to be considered an infringement, in the current legal framework.

[–] omxxi@feddit.org 1 points 2 days ago

Scraping the Internet is not illegal. All AI companies went well beyond that: they accessed private writings, private code, and copyrighted images; they scanned copyrighted books (and then destroyed them); they downloaded terabytes of copyrighted torrents; etc.

So, the message is like piracy is OK when it's done massively by a big company. They're claiming "fair use" and most judges are buying it (or being bought?)

[–] arararagi@ani.social 15 points 3 days ago

Meanwhile, some Italian YouTuber was raided because some portable consoles already came with ROMs in their memory. They only go after individuals.

[–] anarchy79@lemmy.world 18 points 3 days ago (2 children)

I am holding my breath! Will they walk free, or get a $10 million fine and then keep doing what every other thieving, embezzling, looting, polluting, swindling, corrupting, tax-evading mega-corporation has been doing for a century?

[–] cmeu@lemmy.world 4 points 2 days ago

It would be better if the fine were nominal but all their training data could never be used again. Start them over from scratch and make it illegal to use anything the models know now. Kneecap these frivolous little toys.

[–] hansolo@sh.itjust.works 5 points 3 days ago

This is how corruption works - the fine is the cost of business. Being given only a fine of $10 million is such a win that they'll raise $10 billion in new investment on its back.

[–] halcyoncmdr@lemmy.world 134 points 4 days ago (18 children)

As Anthropic argued, it now "faces hundreds of billions of dollars in potential damages liability at trial in four months" based on a class certification rushed at "warp speed" that involves "up to seven million potential claimants, whose works span a century of publishing history," each possibly triggering a $150,000 fine.

So you knew what stealing the copyrighted works could result in, and your defense is that you stole too much? That's not how that works.

[–] zlatko@programming.dev 39 points 4 days ago

Actually that usually is how it works. Unfortunately.

"Too big to fail" was probably made up by the big ones.

[–] a_wild_mimic_appears@lemmy.dbzer0.com 10 points 4 days ago (1 children)

If scraping is illegal, so is the Internet Archive, and that would be an immense loss for the world.

[–] Signtist@bookwyr.me 8 points 4 days ago (3 children)

This is the real concern. Copyright abuse has been rampant for a long time, and the only reason things like the Internet Archive are allowed to exist is because the copyright holders don't want to pick a fight they could potentially lose and lessen their hold on the IPs they're hoarding. The AI case is the perfect thing for them, because it's a very clear violation with a good amount of public support on their side, and winning will allow them to crack down even harder on all the things like the Internet Archive that should be fair use. AI is bad, but this fight won't benefit the public either way.

[–] WereCat@lemmy.world 13 points 3 days ago (1 children)

We just need to show that ChatGPT and the like can generate Nintendo-based content and let them fight it out between themselves.

[–] anarchy79@lemmy.world 6 points 3 days ago (3 children)

They will probably just merge into another mega-golem controlled by one of the seven people who own the planet.

[–] RagingRobot@lemmy.world 8 points 3 days ago

Is this how Disney becomes the owner of all of the AI companies too? Lol

[–] FauxLiving@lemmy.world 37 points 4 days ago* (last edited 3 days ago) (3 children)

An important note here: the judge has already ruled in this case, in the summary judgment order, that using Plaintiffs' works "to train specific LLMs [was] justified as a fair use" because "[t]he technology at issue was among the most transformative many of us will see in our lifetimes."

The plaintiffs are not suing Anthropic for infringing on their copyright by training on their works; the court has already ruled that it was so obvious that they could not succeed with that argument that it could be dismissed. Their only remaining claim is that Anthropic downloaded the books from piracy sites using BitTorrent.

This isn't about LLMs anymore, it's a standard "You downloaded something on BitTorrent and made a company mad"-type case, the kind that has been going on since Napster.

Also, the headline is incredibly misleading. It's ascribing feelings to an entire industry based on a common legal filing that is not by itself noteworthy. Unless you really care about legal technicalities, you can stop here.

The actual news, the new factual thing that happened, is that the Consumer Technology Association and the Computer and Communications Industry Association filed an amicus brief in Anthropic's appeal of an issue on which the court ruled against it.

This is a pretty normal legal filing about legal technicalities. This isn't really newsworthy outside of, maybe, some people in the legal profession who are bored.

The issue was class certification.

Three people sued Anthropic. Instead of just suing Anthropic on behalf of themselves, they moved to be certified as a class. That is to say that they wanted to sue on behalf of a larger group of people, in this case a "Pirated Books Class" of authors whose books Anthropic downloaded from the book piracy websites.

The judge ruled they can represent the class, and Anthropic appealed the ruling. During this appeal, an industry group filed an amicus brief with arguments supporting Anthropic's position. This is not uncommon: The Onion famously filed an amicus brief with the Supreme Court when it was about to rule on issues of parody. Like everything The Onion writes, it's a good piece of satire: link

[–] herseycokguzelolacak@lemmy.ml 3 points 2 days ago (1 children)

I love this. I hope big-tech/big-AI destroys big-copyright industry.

[–] squaresinger@lemmy.world 1 points 13 hours ago

Nah, the only thing that could realistically happen is that copyright doesn't apply to AI hosted by large corporations. In no way will this destroy copyright claims against individuals or small companies.

[–] a_person@piefed.social 11 points 3 days ago

Good fuck those fuckers

[–] chaosCruiser@futurology.today 64 points 4 days ago (2 children)

Oh no! Building a product with stolen data was a rotten idea after all. Well, at least the AI companies can use their fabulously genius PhD level LLMs to weasel their way out of all these lawsuits. Right?

[–] Rooskie91@discuss.online 45 points 4 days ago (2 children)

I propose that anyone defending themselves in court over AI stealing data must be represented exclusively by AI.

[–] Regna@lemmy.world 13 points 4 days ago (1 children)

Hilarious.

[–] PushButton@lemmy.world 45 points 4 days ago (12 children)

Let's go baby! The law is the law, and it applies to everybody

If the "genie doesn't go back in the bottle", make him pay for what he's stealing.

[–] Zetta@mander.xyz 31 points 4 days ago (2 children)

The law absolutely does not apply to everybody, and you are well aware of that.

[–] Plurrbear@lemmy.world 13 points 3 days ago (3 children)

Fucking good!! Let the AI industry BURN!

[–] westingham@sh.itjust.works 34 points 4 days ago (5 children)

I was reading the article and thinking "suck a dick, AI companies" but then it mentions the EFF and ALA filed against the class action. I have found those organizations to be generally reputable and on the right side of history, so now I'm wondering what the problem is.

[–] kibiz0r@midwest.social 39 points 4 days ago (3 children)

They don’t want copyright power to expand further. And I agree with them, despite hating AI vendors with a passion.

For an understanding of the collateral damage, check out How To Think About Scraping by Cory Doctorow.

[–] Jason2357@lemmy.ca 13 points 4 days ago (3 children)

Take scraping. Companies like Clearview will tell you that scraping is legal under copyright law. They’ll tell you that training a model with scraped data is also not a copyright infringement. They’re right.

I love Cory's writing, but while he does a masterful job of defending scraping, and makes a good argument that in most cases, it's laws other than Copyright that should be the battleground, he does, kinda, trip over the main point.

That is that training models on creative works and then selling access to the derivative "creative" works that those models output very much falls within the domain of copyright - on either side of a grey line we usually call "fair use" that hasn't been really tested in courts.

Let's take two absurd extremes to make the point. Say I train an LLM directly on Marvel movies, and then sell movies (or maybe movie scripts) that are almost identical to existing Marvel movies (maybe with a few key names and features altered). I don't think anyone would argue that is not a derivative work, or that it falls under "fair use." However, if I used literature to train my LLM to be able to read, and used that to read street signs for my self-driving car, well, yeah, that might be something you could argue is "fair use" to sell. It's not producing copy-cat literature.

I agree with Cory that scraping, per se, is absolutely fine, as is even re-distributing the results in some ways that are in the public interest or fall under "fair use", but it's hard to justify the slop machines as not a copyright problem.

In the end, yeah, fuck both sides anyway. Copyright was extended too far and used for far too much, and the AI companies are absolute thieves. I have no illusions this type of court case will do anything more than shift wealth from one robber baron to another, and it won't help artists and authors.

[–] SugarCatDestroyer@lemmy.world 26 points 4 days ago* (last edited 4 days ago) (1 children)

Unfortunately, this will probably lead to nothing: in our world, only the poor seem to be punished for stealing. Well, corporations always get away with everything, so we sit on the couch and shout "YES!!!" for the fact that they are trying to console us with this.

[–] Deflated0ne@lemmy.world 26 points 4 days ago (3 children)

Good. Burn it down. Bankrupt them.

If it's so "critical to national security" then nationalize it.

[–] crystalmerchant@lemmy.world 12 points 3 days ago

Ashley is a senior policy reporter for Ars Technica, dedicated to tracking social impacts of emerging policies and new technologies. She is a Chicago-based journalist with 20 years of experience.

And yet, despite 20 years of experience, the only side Ashley presents is the technologists' side.

[–] keyhoh@piefed.social 28 points 4 days ago* (last edited 4 days ago)

I thought it was hilarious how there was a quote in the article that said

immense harm not only to a single AI company, but to the entire fledgling AI industry and to America’s global technological competitiveness

It will only do this because all these idiotic American companies fired all their employees to replace them with AI. Hire them back and the edge won't dull. But we all know that they won't do this; they'll just cry and point fingers, wondering how they ever lost a technology race.

Edited because it's my first time using quotes and I don't know how to use them properly haha

[–] Treczoks@lemmy.world 24 points 4 days ago (1 children)

Well, theft has never been the best foundation for a business, has it?

While I completely agree that copyright terms are completely overblown, they are valid law that other people suffer under, so it is 100% fair to make them suffer the same. Or worse, as they all broke the law for commercial gain.

[–] 9point6@lemmy.world 24 points 4 days ago

Probably would have been cheaper to license everything you stole, eh, Anthropic?

[–] ZILtoid1991@lemmy.world 5 points 3 days ago

Now they're in the "finding out" phase of the "fucking around and finding out".

[–] LucidLyes@lemmy.world 8 points 3 days ago (2 children)

I hope LLMs and generative AI crash and burn.

[–] FauxLiving@lemmy.world 13 points 4 days ago* (last edited 4 days ago) (6 children)

People cheering for this have no idea of the consequence of their copyright-maximalist position.

If using images, text, etc. to train a model is copyright infringement, then there will be NO open models, because open source model creators could not possibly obtain all of the licensing for every piece of written or visual media in the Common Crawl dataset, which is what most of these things are trained on.

As it stands now, corporations don't have a monopoly on AI specifically because copyright doesn't apply to AI training. Everyone has access to Common Crawl and the other large, public, datasets made from crawling the public Internet and so anyone can train a model on their own without worrying about obtaining billions of different licenses from every single individual who has ever written a word or drawn a picture.

If there is a ruling that training violates copyright, then the only entities that could possibly afford to train LLMs or diffusion models are companies that own a large amount of copyrighted materials. Sure, one company will lose a lot of money and/or be destroyed, but the legal precedent would be set so that it is impossible for anyone that doesn't have billions of dollars to train AI.

People are shortsightedly seeing this as a victory for artists or some other nonsense. It's not. This is a fight where large copyright holders (Disney and other large publishing companies) want to completely own the ability to train AI because they own most of the large stores of copyrighted material.

If the copyright holders win this then the open source training material, like Common Crawl, would be completely unusable to train models in the US/the West because any person who has ever posted anything to the Internet in the last 25 years could simply sue for copyright infringement.

[–] barryamelton@lemmy.world 11 points 4 days ago* (last edited 4 days ago)

Anybody can use copyrighted works under fair use for research, more so if your LLM model is open source (I would say this fair use should only actually apply if your model is open source...). You are wrong.

We don't need to break copyright rights that protect us from corporations in this case, or also incidentally protect open source and libre software.

[–] JustARaccoon@lemmy.world 10 points 4 days ago (2 children)

In theory sure, but in practice who has the resources to do large scale model training on huge datasets other than large corporations?

[–] a_wild_mimic_appears@lemmy.dbzer0.com 13 points 4 days ago* (last edited 4 days ago)

So, the US now has a choice: rescue AI and fix their fucked up copyright system, or rescue the fucked up copyright system and fuck up AI companies. I'm interested in the decision.

I'd personally say that the copyright system needs to be fixed anyway, because it's currently just a club for the RIAA & MPAA to wield against everyone (remember the lawsuits against single persons with alleged damages in the millions for downloading a few songs? Or the current attempts to fuck over the Internet Archive?). Should the copyright side win, then we can say goodbye to things like the Internet Archive or open-source AI; copyright holders will then be the AI companies, since they have the content.

[–] Lexam@lemmy.world 13 points 4 days ago (1 children)

No it won't. Just their companies. Which are the ones making slop. If your AI does something actually useful it will survive.
