this post was submitted on 26 Jun 2025
174 points (97.8% liked)


Link without the paywall

https://archive.ph/OgKUM

FatCrab@slrpnk.net 1 point 5 hours ago

You are agreeing with the post you responded to. This ruling is only about training a model on legally obtained training data. It does not say it is OK to pirate works: if you pirate a work, then no matter what you do with the infringing copy you've made, you've committed copyright infringement. It also does not address model outputs, which is a very nuanced issue and likely to fall along similar lines of analysis as music copyright, imo. It only addresses whether training a model is intrinsically an infringement of copyright. And it isn't, because any other conclusion would be insane and functionally impossible to differentiate from learning a writing technique by reading a book you bought from an author. Even when a model has overfit its training data, the model itself is in no way recognizable as any particular training datum. It's a high-dimensional matrix of numbers defining relationships between features and relationships between relationships.