this post was submitted on 10 Apr 2025
27 points (67.5% liked)

Selfhosted


For context, I created a video search engine last year, then shut it down and put the data online. You can read about it here: https://www.bendangelo.me/2024/07/16/failed-attempt-at-creating-a-video-search-engine/

I put that project on hold because of scaling issues, but I'm back with another idea. I've been frustrated with how AI slop is ruining the internet, and recently it's been hitting YouTube pretty hard with AI videos. I'm brainstorming a tool for people to selfhost:

Self-hosted crawler: Pick which sites/videos to index (blogs, forums, YT channels, etc.).

AI chat interface: Ask questions like, “Show me Rust tutorials from 2023” or “Summarize recent posts about homelab backups.”

Optional sharing: Pool indexes with trusted friends/communities.
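To make the "pick which sites to index" idea concrete, here is a minimal sketch of an allow-list check a crawler could run before fetching anything. The host names and the `is_allowed` helper are hypothetical, made up for illustration; this is not the prototype's actual code.

```python
from urllib.parse import urlsplit

# Hypothetical allow-list: only these sites (and their subdomains) get crawled.
ALLOWED_HOSTS = {"example-blog.org", "forum.example.net"}


def is_allowed(url: str, allowed_hosts=ALLOWED_HOSTS) -> bool:
    """Return True if the URL is http(s) and its host is on the allow-list."""
    parts = urlsplit(url)
    if parts.scheme not in ("http", "https"):
        return False
    host = (parts.hostname or "").lower()
    # Match the exact host or any subdomain, but not look-alike domains.
    return any(host == h or host.endswith("." + h) for h in allowed_hosts)


if __name__ == "__main__":
    for candidate in (
        "https://example-blog.org/post/1",
        "https://www.example-blog.org/about",
        "https://evil-example-blog.org/",
    ):
        print(candidate, is_allowed(candidate))
```

Checking the parsed host rather than doing a substring match on the URL keeps look-alike domains out of the index.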

Why?

No Google/YouTube spam: only content you choose.

Works offline (archive forums, videos, docs).

Local AI (Mistral) or cloud (paid) for smarter searches.

Would this be useful to you? What sites would you crawl? Any killer features I’m missing?

Prototype in progress—just testing interest!

[–] CameronDev@programming.dev 51 points 2 weeks ago (8 children)

I personally have zero interest in AI search, if you mean LLM. The fact that it can make stuff up also means it can miss stuff. Neither is acceptable for a search engine.

If you mean some kind of deterministic algorithm for indexing and searching, then maybe.
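As one illustration of what a deterministic approach could look like, here is a minimal inverted-index sketch: the same documents and the same query always return the same results, and every matching document is returned. The function names and sample documents are hypothetical, and a real engine would add ranking, stemming, and persistence.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(docs):
    """Map each token to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for token in tokenize(text):
            index[token].add(doc_id)
    return index


def search(index, query):
    """Return the sorted ids of documents containing every query token."""
    tokens = tokenize(query)
    if not tokens:
        return []
    result = set(index.get(tokens[0], set()))
    for token in tokens[1:]:
        result &= index.get(token, set())
    return sorted(result)


if __name__ == "__main__":
    docs = {
        "a": "Rust tutorials from 2023",
        "b": "Homelab backups with restic",
        "c": "Rust backups tool",
    }
    index = build_index(docs)
    print(search(index, "rust backups"))
```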

Also, attempting to crawl sites locally sounds like a great way to get banned from those sites for looking like a bot.
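A rough sketch of how a local crawler might reduce that risk: honour each site's robots.txt and space out requests per host. It uses Python's standard `urllib.robotparser`; the user-agent string, the 5-second default delay, and the `HostThrottle` class are assumptions for illustration, and none of this guarantees a site won't still block the crawler.

```python
import urllib.robotparser
from urllib.parse import urlsplit

USER_AGENT = "example-selfhosted-crawler"  # hypothetical user-agent name


def load_rules(robots_txt: str) -> urllib.robotparser.RobotFileParser:
    """Parse robots.txt text that has already been downloaded."""
    parser = urllib.robotparser.RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


class HostThrottle:
    """Track, per host, how long to wait before the next request."""

    def __init__(self, min_delay: float = 5.0):
        self.min_delay = min_delay  # assumed delay in seconds, tune per site
        self._next_allowed = {}

    def wait_time(self, url: str, now: float) -> float:
        """Return seconds to sleep before fetching url, and reserve the slot."""
        host = urlsplit(url).netloc
        ready_at = self._next_allowed.get(host, now)
        wait = max(0.0, ready_at - now)
        self._next_allowed[host] = max(ready_at, now) + self.min_delay
        return wait


if __name__ == "__main__":
    rules = load_rules("User-agent: *\nDisallow: /private/\n")
    print(rules.can_fetch(USER_AGENT, "https://example.com/blog/post"))
    print(rules.can_fetch(USER_AGENT, "https://example.com/private/page"))
```

In a real crawl loop, `wait_time` would be called with `time.monotonic()` and the result passed to `time.sleep` before each fetch.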

[–] T156@lemmy.world 8 points 2 weeks ago (7 children)

I can't imagine self-hosting an LLM-based search engine would be too viable. The hardware demands, even for a relatively small quantised model, are considerable. Doubly so if you don't have a GPU to accelerate it.
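For a rough sense of scale, the memory needed just to hold a model's weights is about parameter count times bits per weight, divided by 8. The sketch below applies that arithmetic; the 7-billion-parameter size is an illustrative assumption, and the estimate ignores the KV cache, activations, and quantisation overhead, so real usage is higher.

```python
def weight_memory_gb(num_params: float, bits_per_weight: float) -> float:
    """Approximate memory (in GB, 10^9 bytes) needed for the weights alone."""
    return num_params * bits_per_weight / 8 / 1e9


if __name__ == "__main__":
    # Hypothetical 7-billion-parameter model at common precisions.
    for bits in (16, 8, 4):
        print(f"{bits}-bit: {weight_memory_gb(7e9, bits):.1f} GB")
```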

[–] JeremyHuntQW12@lemmy.world 1 points 2 weeks ago (1 children)

You can run DeepSeek on a Raspberry Pi.

[–] Senal@programming.dev 1 points 2 weeks ago

At a level you'd need to use for a search engine?
