this post was submitted on 28 Sep 2025
56 points (98.3% liked)

When I first got into self-hosting, I originally wanted to join the Fediverse by hosting my own instance. After realizing I wasn't that committed to the idea, I went in a simpler direction.

Originally I was using Cloudflare's tunnel service. Watching the logs, I would get traffic from random corporations and places.

Being uncomfortable with Cloudflare after pivoting away from social media, I learned how to secure my device myself and started using an uncommon port with a reverse proxy. My logs now only ever show activity when I am connecting to my own site.

Which is what led me to this question.

What do bots and scrapers look for when they come to a site? Do they mainly target known ports like 80 or 22 for insecurities? Do they ever scan other ports looking for other common services that may be insecure? Is it even worth their time scanning for open ports?

Seeing as I am tiny and obscure, I most likely won't need to do much research into protecting myself from such threats but I am still curious about the threats that bots pose to other self-hosters or larger platforms.

Cyberflunk@lemmy.world 8 points 1 day ago

Read up on shodan.io. Bot networks and scrapers can use its database as a seed to find open ports.

The CLI tool masscan can (under reasonable conditions) scan the entire IPv4 address space for a single port in about 3 minutes. Even at that rate, sweeping all 65,535 ports across the entire IPv4 space would take roughly 136 days.
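As a quick back-of-envelope check on that full-port-range estimate (assuming the 3-minutes-per-port figure above holds, which is a claimed rate, not a measurement):

```python
# Back-of-envelope estimate, not a benchmark: if one full IPv4 sweep
# for a single port takes ~3 minutes, how long does sweeping every
# TCP port take at the same rate?

MINUTES_PER_PORT_SWEEP = 3      # assumed rate from the comment above
TOTAL_TCP_PORTS = 65_535

total_minutes = MINUTES_PER_PORT_SWEEP * TOTAL_TCP_PORTS
total_days = total_minutes / (60 * 24)

print(f"{total_minutes:,} minutes ~= {total_days:.1f} days")
```

So a full 65k-port sweep of IPv4 is slow enough to be impractical for casual scanners, which is exactly why seeding from an existing database is attractive.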

So using a seed like Shodan can complement scanners/scrapers by isolating IP addresses for further recon.

I honestly don't know if this answers your question, and I don't actually know how services in general deal with nonstandard ports, but I've written a lot of scanning agents (not AI, old-school agents) to do recon for red/blue teams. I never started with raw internet guesses; I always used a seed: Shodan, or other scan results.
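The seed-first workflow described above can be sketched roughly like this: start from prior scan results rather than guessing addresses, and reduce them to a per-host target list. The record format and field names here are hypothetical, not Shodan's actual export schema.

```python
# Illustrative sketch of seed-based recon: collapse prior scan hits
# (e.g. an exported result set) into one entry per host, listing the
# ports a previous scan already saw open.

from collections import defaultdict

def build_target_list(seed_records):
    """Group seed hits by IP so each host is probed once."""
    targets = defaultdict(set)
    for record in seed_records:
        targets[record["ip"]].add(record["port"])
    return {ip: sorted(ports) for ip, ports in targets.items()}

# Hypothetical seed data (documentation-range IPs)
seed = [
    {"ip": "198.51.100.7", "port": 8080},
    {"ip": "198.51.100.7", "port": 22},
    {"ip": "203.0.113.42", "port": 8443},
]
print(build_target_list(seed))
# → {'198.51.100.7': [22, 8080], '203.0.113.42': [8443]}
```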

Thanks for the insight. It's useful to know what tools are out there and what they can do. I was only aware of nmap before which I use to make sure the only ports open are the ports I want open.
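For anyone curious what a check like that boils down to, here is a toy version of what a TCP connect scan (the `nmap -sT` style) does for a single port: attempt a handshake and report whether anything answered. Only point this at hosts you own.

```python
# Minimal single-port connect check: returns True if a TCP handshake
# completes, False on refusal or timeout.

import socket

def port_is_open(host, port, timeout=1.0):
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

print(port_is_open("127.0.0.1", 80))
```

Real scanners add rate limiting, SYN-only probes, and service fingerprinting on top, but the core question per port is just this.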

My web-facing device only serves static sites and a file server with non-identifiable data that I don't mind being on the internet. No databases, and no stress if it gets targeted or goes down.

Even then, I still like to know how things work. Technology today is built on so many layers of abstraction that it all feels like an infinite rabbit hole now. It's hard to look at any piece of technology as secure these days.