Researchers figured out how to run a 120-billion parameter model across four regular desktop PCs (actu.epfl.ch)

submitted 2 days ago by noumenon@lemmy.world to c/technology@lemmy.world

[–] madcaesar@lemmy.world 1 points 1 day ago (1 children)

So what do you get with a home-run LLM? How capable is it, and what can you use it for?

[–] afk_strats@lemmy.world 1 points 5 hours ago

I still think AI is mostly a toy and a corporate inflation device. There are valid use cases, but I don't think they make up the majority of the bubble.

For my personal use, I used it to learn how models work from a compute perspective. I've been interested and involved with natural language processing and sentiment analysis since before LLMs became a thing. Modern models are an evolution of that.
A small, consumer-grade model like gpt-oss-20b is around 13GB and can run on a single mid-range consumer GPU, maybe with some system RAM. It's capable of parsing and summarizing text, troubleshooting computer issues, and some basic coding or code review for personal use. I built some bash and Home Assistant automations for myself using these models as crutches. Also, there is software that can index text locally so you can have conversations with large documents. I use this with the documentation for my music keyboard, which is a nightmare to program, and with complex APIs.
A mid-size model like Nemotron3 30B is around 20GB, can run on a larger consumer card (like my 7900 XTX with 24GB of VRAM, or two 5060 Tis with 16GB of VRAM each), and will have roughly the same usability as the small commercial models, like Gemini Flash or Claude Haiku. These can write better, more complex code. I also use these to help me organize personal notes: I dump everything in my brain to text and have the model give it structure.
A large model like GLM4.7 is around 150GB and can do all the things ChatGPT or Gemini Pro can do, given web access and a pretty wrapper. This requires either a lot of RAM and some patience, or a lot of VRAM. There is software designed to run these larger models from RAM faster, namely ik_llama, but at this scale you're throwing money at AI.
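
If you want to sanity-check those sizes against your own hardware, here is a rough back-of-the-envelope sketch in Python. It only uses the file sizes and VRAM figures quoted above. The parameter counts are read off the model names, and the 2GB overhead for context and runtime buffers is my own assumption, not a measured number, so treat the results as a first guess rather than a guarantee.

```python
# Rough sizing sketch, not a benchmark. Sizes are the figures quoted in this
# comment; parameter counts are approximations taken from the model names.

def implied_bits_per_weight(file_size_gb: float, params_billion: float) -> float:
    """Average bits stored per parameter (1 GB taken as 10^9 bytes)."""
    return file_size_gb * 8 / params_billion


def fits_in_vram(file_size_gb: float, vram_gb_per_card: float,
                 cards: int = 1, overhead_gb: float = 2.0) -> bool:
    """True if the weights plus an ASSUMED overhead fit in the combined VRAM.

    overhead_gb is a placeholder for context cache and runtime buffers.
    Splitting across cards is treated as simple pooling, which is optimistic.
    """
    return file_size_gb + overhead_gb <= vram_gb_per_card * cards


if __name__ == "__main__":
    # Small model: ~20B parameters in ~13GB
    print(round(implied_bits_per_weight(13, 20), 1))
    # Mid-size model: ~30B parameters in ~20GB
    print(round(implied_bits_per_weight(20, 30), 1))
    # Mid-size model on one 24GB card, and on two 16GB cards
    print(fits_in_vram(20, 24), fits_in_vram(20, 16, cards=2))
    # Large ~150GB model on a single 24GB card
    print(fits_in_vram(150, 24))
```

Both quoted sizes work out to a bit over 5 bits per parameter, which is why a "30B" model does not need 60GB the way it would at 16 bits per weight.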

I played around with image generation, but for me it is nothing more than a toy. I take pictures with a camera.