this post was submitted on 21 Jul 2025

Technology


This article describes what I've been thinking about for the last week: how will big tech's billions in investment actually create something significantly better than what we already have today?

There are major issues ahead, and I'm not sure they can be solved. Read the article.

top 9 comments
[–] proceduralnightshade@lemmy.ml 2 points 6 hours ago (1 children)

tl;dr AI companies are slowly running out of data to train their models; synthetic data is not a viable alternative.

I can't remember where I saw it, but someone on YouTube suspected the next step for OpenAI and the like would be to collect user data directly: recording users' conversations and using that data to train models further.

If I find the vid I will add a link here.

[–] 1984@lemmy.today 2 points 6 hours ago

Yeah, that would be the logical endgame, since companies have now invested billions into this trend.

[–] mesamunefire@piefed.social 6 points 20 hours ago

Interesting: https://arxiv.org/pdf/2305.17493

"The Curse of Recursion: Training on Generated Data Makes Models Forget"

A great read on the referenced paper.
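The paper's core failure mode can be sketched with a toy simulation (my own illustration, not code from the paper): each "generation" of a model is trained only on samples produced by the previous generation, so estimation error compounds and the learned distribution drifts away from the original data.

```python
import random
import statistics

def fit_gaussian(samples):
    # The "training" step: estimate mean and stdev from data.
    return statistics.mean(samples), statistics.stdev(samples)

random.seed(0)

# Generation 0 trains on "real" data from a standard normal distribution.
data = [random.gauss(0.0, 1.0) for _ in range(500)]
mu, sigma = fit_gaussian(data)
print(f"gen  0: mean={mu:+.3f} stdev={sigma:.3f}")

# Every later generation trains only on samples from the previous model,
# so its errors feed forward instead of averaging out.
for gen in range(1, 11):
    data = [random.gauss(mu, sigma) for _ in range(500)]
    mu, sigma = fit_gaussian(data)
    print(f"gen {gen:2d}: mean={mu:+.3f} stdev={sigma:.3f}")
```

With enough generations the fitted parameters random-walk away from (0, 1), and in particular the tails of the distribution get lost, which is the "forgetting" the paper describes for LLMs trained on their own output.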

[–] some_kind_of_guy@lemmy.world 3 points 19 hours ago (1 children)

I wonder if AI applications other than "be a generalist chat bot" would run into the same thing. I'm thinking about pharma, weather prediction, etc. They would still have to "understand" their English-language prompts, but LLMs can do that just fine today, and could feed systems designed to iteratively solve problems in those areas. A model feeding into itself or other models doesn't have to be a bad thing.

[–] homesweethomeMrL@lemmy.world 1 points 17 hours ago

Only in the sense that those “words” they know are pointers to likely connected words. If the concepts line up, then theoretically all is good. But beyond FAQs and such, I'm not seeing anything that would indicate it's ready for anything more.

[–] Xaphanos@lemmy.world 2 points 21 hours ago (1 children)

My company is in AI. One of our customers pays for systems capable of the hard computational work needed to design drugs to treat Parkinson's. This is only newly possible with the latest technology.

[–] MysteriousSophon21@lemmy.world 3 points 14 hours ago (1 children)

This is actually one of the most promising applications - AI can screen millions of potential drug compounds and predict protein interactions in hours instead of months, which is why we're seeing breakthroughs in neurodegenerative disease research.

[–] altkey@lemmy.dbzer0.com 3 points 12 hours ago (1 children)

That's probably machine learning (the root category of tools and the origin of LLMs), not the large language models themselves that we call 'AI'. ML has many applications it is efficient at, gradually explored since the '80s, I believe, while the AI boom involving Google, Meta, OpenAI and others is about generalist chatbots that are bad at just about everything they're used for. I'm making that distinction not because I'm an ass, but because I don't want the hype wave to gain more credibility on the back of real scientific and technological progress.

[–] Womble@lemmy.world 2 points 11 hours ago

It depends: if they're using a transformer- or diffusion-based architecture, I think it would be fair to include it in the same "AI wave" that's been breaking since ChatGPT's public release.