Selfhosted

56957 readers

1306 users here now

A place to share alternatives to popular online services that can be self-hosted without giving up privacy or locking you into a service you don't control.

Rules:

Be civil: we're here to support and learn from one another. Insults won't be tolerated. Flame wars are frowned upon.
No spam posting.
Posts have to be centered around self-hosting. There are other communities for discussing hardware or home computing. If it's not obvious why your post topic revolves around selfhosting, please include details to make it clear.
Don't duplicate the full text of your blog or github here. Just post the link for folks to click.
Submission headline should match the article title (don’t cherry-pick information from the title to fit your agenda).
No trolling.
No low-effort posts. This is subjective and will largely be determined by the community member reports.

Resources:

selfh.st Newsletter and index of selfhosted software and apps
awesome-selfhosted software
awesome-sysadmin resources
Self-Hosted Podcast from Jupiter Broadcasting

Any issues on the community? Report it using the report flag.

Questions? DM the mods!

founded 3 years ago

MODERATORS

HybridSarcasm@lemmy.world

HybridSarcasm@lemmy.hybridsarcasm.xyz

I wanted Claude Code-style workflows without sending code to the cloud, so I built Loki (lemmy.world)

submitted 3 hours ago by aclarke@lemmy.world to c/selfhosted@lemmy.world

26 comments fedilink hide all child comments

For the longest time, I've been trying to figure out a way to "survive" in this new AI age without having to fork over a ton of money just to keep up. I've tried using local models via Ollama, and while they definitely work to a degree, they're (unsurprisingly) not as good as the big model providers.

The local models tend to

Forget what they're doing
Struggle to break larger tasks into smaller ones
Lose focus easily
Have weaker coding performance
Drift over longer sessions

So to improve the reliability of fully local, smaller models (and to keep all my data local and in my own network), I created Loki.

It's a local-first, batteries-included command line tool and runtime for building and running LLM workflows locally. It's model agnostic and supports things like

Agents and agent delegation
Roles/personas
MCP Servers
RAG
Custom tools
Macros
Workflow Scripting

A lot of the features it supports are specifically designed to compensate for weaknesses in smaller local models. For example:

Auto continuation to keep pushing models to completion instead of stopping halfway through problems
Parallel agent delegation so tasks can be split into smaller, focused scopes
Workflow-based execution ("If this, do that") for building more reliable and repeatable automations

It also supports the major cloud providers if you want them (which definitely helped while testing 😄), but my long-term goal is simple:

Get as close as possible to Claude Code-style reliability using fully local models.

I'm always open to feedback, questions, or ideas.

Repo: https://github.com/Dark-Alex-17/loki

you are viewing a single comment's thread
view the rest of the comments

[–] aclarke@lemmy.world 10 points 3 hours ago (1 children)

I'm using a ton of different ones but the main ones I use daily are

gemma4:26b
deepseek-coder
deepseek-r1:32b
devstral:24b
granite-code:34b
openthinker:latest
phi4:latest
qwen3:30b
mixtral:8x22b

I'm also going to use this opportunity to plug an amazing project to help figure out which models will work well on my hardware: https://github.com/AlexsJones/llmfit Is amazing!

[–] Blue_Morpho@lemmy.world 6 points 2 hours ago (1 children)

Isn't it a huge delay to swap out to a different ~30b model every few minutes depending on the use case?

[–] aclarke@lemmy.world 4 points 2 hours ago

Unfortunately, yes. It's one reason I'm trying to figure out a good mechanism to maybe do something like multiple ollama hosts. So like: you can specify what model to use specifically in an agent. But if an agent delegates to a sub-agent, it unloads that model and loads the new one. I'm trying to figure out if there's a way to "alternate" between multiple hosts (say, ollama running locally and one running on your server), so that when a switch happens, it does it on the secondary host while also looking ahead to see what needs to be switched, if anything, on the primary host.

It supports multiple Ollama hosts right now as-is so what I've honestly been doing for the time being is specify which model on which host each agent uses so there's only loading of one model at the beginning of a session. Then there's no unloading/loading/etc. The other thing I've been trying is to see how small I can get the models to be without losing performance. While the tricks implemented in Loki help dramatically, I know there's still a lot more I can do to improve it further.