this post was submitted on 07 Oct 2025
Technology
That’s part of the reason these models haven’t improved much in the last year or so. They’ve absorbed all of the public-facing internet and whatever copyrighted works they could get away with pirating (pretty much all printed work), and now they are faced with a brick wall. They haven’t come up with a way to create new content, to reinforce a “correct” statistical model without causing model collapse, and I don’t think they ever will. The well (the public internet) is already thoroughly poisoned, so they have to use a snapshot of the pre-LLM internet, not even an up-to-date one.
If it isn’t good enough after consuming almost the entirety of humanity’s written output since the invention of the printing press, it’s never going to be.