this post was submitted on 07 Mar 2026
839 points (98.9% liked)
Technology
The bubble popping seems inevitable at this point. Before, the giants were funding this through their core businesses plus loans backed by those businesses. Now they've stretched their credit so far that no one will lend to them anymore, and instead of cutting back on the building spree they're making cuts to their core businesses.
They're betting that their customers are so locked in that they won't leave despite the degradation in service. How deep Oracle's, AWS's, and Google's hooks really go remains to be seen; people seem to tolerate a lot of enshittification, but there's got to be a tipping point. Once they reach it and the core business crashes, all the rest of the dominoes will fall.
Once these companies have to start charging what it really costs to maintain and run these huge models, the number of use cases will shrivel.
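To put rough numbers on that, here's a back-of-the-envelope sketch of how a subsidized price compares to a break-even price once serving costs have to be covered. Every figure here is an invented assumption for illustration, not real vendor pricing:

```python
# Back-of-the-envelope: what a provider must charge per million tokens
# just to cover GPU time. All numbers below are invented assumptions.

def breakeven_price_per_mtok(gpu_hour_cost: float, tokens_per_gpu_hour: float) -> float:
    """Price per million tokens needed to cover raw GPU-hour cost."""
    return gpu_hour_cost / (tokens_per_gpu_hour / 1_000_000)

# Assumed: $3.00 per GPU-hour, 500k tokens served per GPU-hour on a large model.
true_cost = breakeven_price_per_mtok(3.0, 500_000)   # -> 6.0 ($/Mtok)
subsidized = 1.50                                    # assumed list price

print(f"break-even ${true_cost:.2f}/Mtok vs charged ${subsidized:.2f}/Mtok")
```

Under these made-up numbers the provider loses money on every token; raising the price fourfold to break even is exactly the kind of change that would price out the marginal use cases.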
Models are becoming more optimized. I've recently tried the small version of LFM2.5, and it's ridiculously close in usefulness to Qwen3.5, for example. Or RNJ-1.
As for maintenance - meaning keeping the datasets current - that's somewhat expensive, but they were assembling those datasets as a side effect of their main businesses anyway.
So that's not what will kill them. Their size will. These are very big companies, with lots of internal corruption and inefficiency pulling them down. As for the newer AI companies, the ones centered around specific products are, I think, going to survive; some will die, but I'd expect LiquidAI or Anthropic or the like to still be around some time after the crash.
The crash might coincide with a bubble burst, but notice that this family of technologies really is delivering results. Instead of using a bunch of specialized applications, people are asking LLMs and often getting good-enough answers. LLM agents can retrieve data from web services, perform operations, and assist in using tools.
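That agent loop - model proposes a tool call, the runtime executes it, and the result is fed back in until the model answers - can be sketched in a few lines. The "model" below is a stub standing in for a real LLM call; the tool and prompt formats are made up for illustration:

```python
# Minimal sketch of the tool-use loop described above.
# `fake_model` is a stand-in for an LLM: it emits either a tool call
# or a final answer, depending on what it has seen in the prompt.

def fake_model(prompt: str) -> dict:
    if "weather" in prompt and "72F" not in prompt:
        return {"tool": "get_weather", "args": {"city": "Oslo"}}
    return {"answer": "It's 72F in Oslo."}

TOOLS = {
    # Pretend web-service call; a real tool would hit an API here.
    "get_weather": lambda city: f"{city}: 72F",
}

def run_agent(user_prompt: str) -> str:
    prompt = user_prompt
    for _ in range(5):                        # cap the loop to avoid runaway calls
        reply = fake_model(prompt)
        if "answer" in reply:
            return reply["answer"]
        result = TOOLS[reply["tool"]](**reply["args"])
        prompt += f"\n[tool result] {result}"  # feed the observation back in
    return "gave up"

print(run_agent("What's the weather in Oslo?"))
```

The whole pattern is just this loop plus a capable model; that's why the same mechanism generalizes from weather lookups to operating arbitrary web services.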
You shouldn't look at the big ones in the cloud, but rather at what value local LLMs give you for the energy spent. Right now it's not that good, but it's honestly approaching good, and I don't feel like they've stopped getting better. Human time is still more expensive. The tools are there, they are being improved, and humans are slowly gaining experience in using them, which makes them more efficient at various tasks.
They are to all kinds of reference and knowledge tools what Google was to search.
And there's one just amazing thing about these models: they are self-contained, even if some can use tools to access external sources. Our corporate overlords spent 20 years building a dependent, networked world, only to break it by popularizing a technology that almost neuters it. They probably thought they were reaping the crops of the web for themselves; instead they taught everyone that you don't have to eat at the diner, you can take the food home.
I like local LLMs as much as the next person, but the issue is that they don't scale the way companies need them to.
As a personal assistant? Sure, I agree, they're useful at times. But as soon as you need multiple models running simultaneously, you're going to hit resource issues.
What Oracle and the others were banking on is engineers and others running a lot of agents in parallel, composing different things together - or one input that multiple server-side agents take and execute numerous tasks on. That's something you can't run on an individual machine right now, and given the way these models currently work, I don't envision you will be able to anytime soon.
There are lightweight models as good as some heavier ones. It's a bit like Intel's advertised tick-tock process: heavy, memory-hungry models are the "tick", but there's a "tock" - say, the light version of the "lfm2.5-thinking" model in the ollama repository seems almost as good as qwen3.5 to me, except it's very lightweight and lightning-fast by comparison.
These things are being optimized. It's just that during the market-capture phase nobody bothered.
As for them not being used correctly - yeah, absolutely. My idea of their proper use is some graph-based system, with each node processed by a chosen LLM (or just a piece of logic), with a chosen set of tools, actions, and choices available to each node. A bit like ComfyUI, but saner than a zoom-based web UI - more like the macOS Automator application.
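The graph idea above can be sketched with plain functions as node processors. This is a hypothetical design, not an existing tool; the "summarize" node stands in for an LLM call, and the node and edge names are invented:

```python
# Sketch of the graph-based system described above: each node is either
# a piece of logic or an (stubbed) LLM call, wired together by explicit
# edges, and text flows through the graph node by node.

from typing import Callable

class Node:
    def __init__(self, name: str, fn: Callable[[str], str]):
        self.name, self.fn = name, fn

def run_graph(nodes: dict[str, Node], edges: dict[str, str],
              start: str, text: str) -> str:
    """Follow edges from `start`, applying each node's processor in turn,
    until reaching a node with no successor."""
    current = start
    while True:
        text = nodes[current].fn(text)
        if current not in edges:
            return text
        current = edges[current]

nodes = {
    "clean":     Node("clean", str.strip),
    "summarize": Node("summarize", lambda t: t.split(".")[0] + "."),  # LLM stub
    "shout":     Node("shout", str.upper),
}
edges = {"clean": "summarize", "summarize": "shout"}

print(run_graph(nodes, edges, "clean", "  hello world. more text.  "))
```

Because each node declares its own processor, a real version could pin a small local model to cheap nodes and a heavier one to hard nodes - which is exactly the per-node selectivity the comment is asking for.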