I suggest looking at the LLM Arena leaderboards filtered to open-weight models. They offer fairly complete, statistically detailed benchmarks and are usually up to date when new models come out. The new Gemma that just came out might be the best for a single GPU, and if you have a bunch of VRAM, check out the larger Chinese models.
I find Qwen3.5 is the best at tool calling and agent use; otherwise Gemma4 is a very solid all-rounder and should be the first one you try. Tbh gpt-oss is still good to this day, are you running into any problems with it?
No problems per se. I just realized I hadn't checked for an update in a while.
I'm not on there, but you might have more luck in !localllama@sh.itjust.works
You might also want to list the hardware that you plan to use, since that'll constrain what you can reasonably run.
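As a rough back-of-the-envelope (my own rule of thumb, not an official sizing formula), you can estimate how much memory a quantized model needs as parameters × effective bits-per-weight / 8, plus some headroom for the KV cache and runtime buffers:

```python
def estimate_model_gb(params_billion: float, bits_per_weight: float, overhead: float = 1.2) -> float:
    """Rough memory estimate for running a quantized model.

    params_billion:  parameter count in billions (e.g. 26 for a 26B model)
    bits_per_weight: effective bits after quantization (~4.5 for a typical 4-bit GGUF)
    overhead:        fudge factor for KV cache and buffers (a guess, varies with context length)
    """
    weight_bytes = params_billion * 1e9 * bits_per_weight / 8
    return weight_bytes * overhead / 1e9  # gigabytes

# A 26B model at ~4.5 bits/weight lands somewhere around 17-18 GB with headroom,
# so it won't fit entirely on a 16 GB card without offloading some layers.
print(round(estimate_model_gb(26, 4.5), 1))
```

The numbers here (4.5 bits, 1.2× overhead) are assumptions; actual usage depends on the quant format and how much context you allocate.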
I'd say Qwen 3.5 and Gemma 4 beat GPT OSS in every aspect.
The latest open-weights model from Google might be a good fit for you. The 26B model works pretty well on my machine, though the performance isn't great (6 tokens per second, CPU only).
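CPU-only generation speed like that is mostly memory-bandwidth bound: each generated token streams roughly the whole model through RAM once. A crude sketch of that heuristic (my own back-of-the-envelope, ignoring caches and MoE routing):

```python
def estimate_tokens_per_sec(model_gb: float, bandwidth_gb_s: float) -> float:
    """Upper-bound throughput estimate for dense-model token generation:
    every token reads all weights once, so tok/s ~ bandwidth / model size."""
    return bandwidth_gb_s / model_gb

# Hypothetical example: dual-channel desktop RAM at ~50 GB/s over a
# ~15 GB quantized 26B model gives low single-digit tokens per second,
# the same order of magnitude as the ~6 tok/s reported above.
print(round(estimate_tokens_per_sec(15, 50), 1))
```

The 50 GB/s and 15 GB figures are illustrative assumptions, not measurements; faster RAM or a smaller quant moves the estimate accordingly.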
How much VRAM?
I'm in the same boat. You'll get better responses if you post your machine specs.
What are your computer specs?
I did just update my post with the specs. Maybe it takes a while to federate?
I must not have refreshed, ignore my comment.