What are the platforms on the Fediverse doing to prevent data scraping and prevent bots?
(piefed.blahaj.zone)
Your original post didn't specify a particular kind of data scraping. TropicalDingdong had no way to know you were only interested in that one kind, so his comment is appropriate: you can't stop data scraping in general, and attempting to do so in the general case goes directly against the goal of ActivityPub.
I guess. But that was an assumption on you guys' part as well. Not that there's anything wrong with that.
I'm curious about the "in general" part, though. Maybe that's a part of the philosophy I don't quite understand yet, but how's the kind of scraping that I mentioned any good? Or is that not the right question to ask?
I didn't say anything about the "prevent instances from being overloaded" part being good or bad. I didn't even give an opinion on ActivityPub, just pointed out the practical limitations and incompatible design goals.
Personally, I've got no problem with websites implementing rate caps and whatnot to ensure that their traffic remains within the limits they can handle, or throttling specific IPs. I am very concerned with how Cloudflare in particular has become the single centralized "gatekeeper" for vast swaths of the Internet, though. If they decide that some particular client isn't allowed to see stuff then poof, a big chunk of the Internet is cut off. That's worrisome IMO.
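For anyone wondering what "rate caps" and "throttling specific IPs" might look like in practice, here's a rough sketch of a per-IP token bucket. To be clear, this isn't how any particular Fediverse platform does it; the class name `PerIPRateLimiter` and the `rate`/`burst` parameters are made up for illustration, and a real instance would more likely do this at the reverse proxy than in application code.

```python
import time


class PerIPRateLimiter:
    """Hypothetical per-IP token bucket (illustrative only)."""

    def __init__(self, rate, burst, clock=time.monotonic):
        self.rate = rate      # tokens refilled per second, per IP
        self.burst = burst    # maximum tokens an IP can hold
        self.clock = clock    # injectable clock, handy for testing
        self._buckets = {}    # ip -> (tokens, time of last update)

    def allow(self, ip):
        """Return True if this request fits the IP's budget."""
        now = self.clock()
        # An IP we have not seen before starts with a full bucket.
        tokens, last = self._buckets.get(ip, (self.burst, now))
        # Refill for the elapsed time, capped at the burst size.
        tokens = min(self.burst, tokens + (now - last) * self.rate)
        if tokens >= 1:
            self._buckets[ip] = (tokens - 1, now)
            return True
        self._buckets[ip] = (tokens, now)
        return False


if __name__ == "__main__":
    limiter = PerIPRateLimiter(rate=1.0, burst=5)
    # Fire 7 requests back to back from one (documentation-range) address.
    results = [limiter.allow("203.0.113.5") for _ in range(7)]
    print(results)
```

A server would typically answer a rejected request with HTTP 429 (Too Many Requests). Note that this only caps load per address; it does nothing to stop a scraper that stays under the limit or spreads itself across many IPs, which is the "you can't stop it in general" point from earlier in the thread.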