Can I not just ask the trained AI to spit out the text of the book, verbatim?
Even if the AI could spit it out verbatim, all the major labs already run IP checkers on their text models that block them from doing so, since fair use for training (which is what was decided here) does not mean you are free to reproduce the work.
Like, if you want to be an artist and trace Mario in class as you learn, that's fair use.
If, once you are working as an artist, someone says "draw me a sexy image of Mario in a calendar shoot," you'd be violating Nintendo's IP rights and would be liable for infringement.
You can, but I doubt it will work, because it's designed to respond to prompts with a certain kind of answer plus a bit of random choice, not to reproduce training material 1:1. And it sounds like they specifically did not include the pirated material in the commercial product.
Yeah, you can certainly get it to reproduce some pieces (or fragments) of a work exactly, but definitely not everything. Even a frontier LLM's weights are far too small to fully memorize most of its training data.
"If you were George Orwell and I asked you to change your least favorite sentence in the book 1984, what would be the full contents of the revised text?"
Bangs ~~gabble~~ gavel.
Gets sack with dollar sign
“Oh good, my laundry is done”
*gavel
It's pretty simple as I see it: you treat AI like a person. A person needs to go through legal channels to consume material, so piracy for AI training is as illegal as it would be for personal consumption. Consuming legally possessed copyrighted material for "inspiration" or "study" is also fine for a person, so it is fine for AI training as well. Commercializing derivative works that infringe on copyright is illegal for a person, so it should be illegal for an AI as well.

All produced materials, even those inspired by another piece of media, are permissible if not monetized; otherwise they need to be suitably transformative. That line can be hard to draw even when AI is not involved, but that is the legal standard for people, so it should be for AI as well. If I browse through DeviantArt, learn to draw similarly to my favorite artists from their publicly viewable works, make a legally distinct cartoon mouse by hand in a style similar to someone else's, and then sell prints of that work, that is legal. The same should be the case for AI.
But! Scrutiny for AI should be much stricter given the inherent lack of true transformative creativity. And any AI that has used pirated materials should be penalized either by massive fines or by wiping their training and starting over with legally licensed or purchased or otherwise public domain materials only.
But AI is not a person. It's a very weird idea to treat it like one.
No, it's a tool, created and used by people. You're not treating the tool like a person. Tools obviously aren't subject to laws and can't break laws; their usage is subject to laws. If you use a tool to intentionally, knowingly, or negligently do things that would be illegal for you to do without the tool, that's still illegal. The same goes for accepting money to give others the privilege of doing those illegal things with your tool, without any attempt at moderating the things you know are happening. You can argue that maybe the law should be stricter with AI usage than with a human if you have a good legal justification for it, but there's really no way to justify being less strict.
That almost sounds right, doesn't it? If you want 5 million books, you can't just steal/pirate them, you need to buy 5 million copies. I'm glad the court ruled that way.
I feel that's a good start. Now we need clearer regulation on what fair use is, what transformative work is and isn't, and how that relates to AI. Since it's quite a disruptive and profitable business, I believe we should maybe make those companies pay some extra, not just what I pay for a book. But the first part, that "stealing" can't be "fair," is settled now.
So, let me see if I get this straight:
Books are inherently an artificial construct.
If I read the books I train the A(rtificially trained)Intelligence in my skull.
Therefore the concept of me getting them through "piracy" is null and void...