this post was submitted on 23 Mar 2025
1244 points (98.2% liked)

[–] TheMightyCat@lemm.ee 10 points 1 month ago (11 children)

No?

Anyone can run an AI, even on the weakest hardware; there are plenty of small open models for this.
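
For example, a ~1B-parameter open model will generate text on an ordinary laptop CPU. A minimal sketch with Hugging Face transformers (the model choice here is just an illustration, one small open model among many):

```python
# A minimal sketch using the Hugging Face transformers library; TinyLlama
# is just one example of a small open model that runs without a big GPU.
from transformers import pipeline

pipe = pipeline(
    "text-generation",
    model="TinyLlama/TinyLlama-1.1B-Chat-v1.0",  # ~1.1B params, runs on CPU
)

out = pipe("Explain open-weight models in one sentence.", max_new_tokens=64)
print(out[0]["generated_text"])
```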

Training an AI requires very strong hardware; however, that is not an insurmountable hurdle, as the models on Hugging Face show.

[–] CodeInvasion@sh.itjust.works 7 points 1 month ago (3 children)

Yah, I'm an AI researcher, and with the weights released for DeepSeek, anybody can run an enterprise-level AI assistant. Running the full model natively does require about $100k in GPUs, but anyone with that hardware could easily fine-tune it with something like LoRA for almost any application. That model can then be distilled and quantized to run on gaming GPUs.
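
For the fine-tuning step, the peft library makes LoRA almost boilerplate. A rough sketch (the model name and target modules are illustrative placeholders, not a tested recipe for DeepSeek specifically):

```python
# A rough sketch of LoRA fine-tuning with Hugging Face's peft library.
# Model name and target_modules are illustrative, not a tested recipe.
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

model = AutoModelForCausalLM.from_pretrained(
    "deepseek-ai/DeepSeek-R1",   # the released weights; needs serious GPU memory
    trust_remote_code=True,
    device_map="auto",
)

config = LoraConfig(
    r=16,                                 # low-rank update dimension
    lora_alpha=32,
    target_modules=["q_proj", "v_proj"],  # adapt attention projections only
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)

model = get_peft_model(model, config)
model.print_trainable_parameters()  # typically well under 1% of the weights
```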

It's really not that big of a barrier. Yes, $100k in hardware is a lot for an individual, but from a non-profit entity's perspective that is peanuts.
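
To give a sense of the distill-and-quantize step: once you have a distilled checkpoint, 4-bit loading is a few lines. A sketch only; the model name below is one of the released distills, but I haven't benchmarked this exact setup:

```python
# A sketch of loading a distilled model in 4-bit via bitsandbytes, which
# is what lets it fit on a consumer gaming GPU.
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

bnb = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",          # 4-bit NormalFloat quantization
    bnb_4bit_compute_dtype="bfloat16",  # compute in bf16, store in 4-bit
)

model = AutoModelForCausalLM.from_pretrained(
    "deepseek-ai/DeepSeek-R1-Distill-Qwen-7B",  # a released distilled model
    quantization_config=bnb,
    device_map="auto",
)
```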

Also, adding a vision encoder for images to DeepSeek would not be that difficult in theory, for the same reason. In fact, I'm working on research right now that finds GPT-4o and o1 have similar vision capabilities, implying they share the same first-layer vision encoder, with the textual chain-of-thought tokens read by subsequent layers. (This is a very recent insight as of last week by my team, so if anyone can disprove it, I would be very interested to know!)
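
To make the "first-layer vision encoder" idea concrete, this is the standard LLaVA-style wiring in toy PyTorch; it's a generic illustration with made-up dimensions, not DeepSeek's or OpenAI's actual code:

```python
# A toy, LLaVA-style illustration of how a vision encoder attaches to an
# LLM: a frozen encoder yields patch features, a small projector maps them
# into the LLM's token-embedding space, and the LLM then reads them like
# ordinary prefix tokens.
import torch
import torch.nn as nn

VISION_DIM, LLM_DIM = 1024, 4096  # made-up sizes for illustration

class VisionProjector(nn.Module):
    def __init__(self):
        super().__init__()
        self.proj = nn.Sequential(
            nn.Linear(VISION_DIM, LLM_DIM),
            nn.GELU(),
            nn.Linear(LLM_DIM, LLM_DIM),
        )

    def forward(self, patch_features: torch.Tensor) -> torch.Tensor:
        # (batch, num_patches, VISION_DIM) -> (batch, num_patches, LLM_DIM)
        return self.proj(patch_features)

patches = torch.randn(1, 576, VISION_DIM)  # e.g. ViT patch features for one image
image_tokens = VisionProjector()(patches)  # prepend these to the text embeddings
```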

[–] cyd@lemmy.world 2 points 1 month ago* (last edited 1 month ago)

It's possible to run the big DeepSeek model locally for around $15k, not $100k. People have done it with 2x M3 Ultras, or the equivalent.

Though I don't think it's a good use of money personally, because the requirements are dropping all the time. We're starting to see some very promising small models that use a fraction of those resources.
