What are the platforms on the Fediverse doing to prevent data scraping and prevent bots?
(piefed.blahaj.zone)
I don't understand the part where you say that my view is narrow. I am talking about a specific kind of data scraping. I'm not sure what I've said that has led you and a few other people to believe I'm necessarily worried about people getting hold of "my data".
Am I just expressing myself badly here?
As for the rate limiting, that's closer to what I wanted to know. Thanks.
Your original post didn't specify a particular kind of data scraping. TropicalDingdong had no way to know you were interested in only that one kind, so his comment is appropriate: you can't stop data scraping in general, and attempting to do so in the general case goes directly against the goal of ActivityPub.
I guess. But that was an assumption on you guys' part as well. Not that there's anything wrong with that.
I'm curious about the "in general" part, though. Maybe that's a part of the philosophy I don't quite understand yet, but how's the kind of scraping that I mentioned any good? Or is that not the right question to ask?
I didn't say anything about the "prevent instances from being overloaded" part being good or bad. I didn't even give an opinion on ActivityPub, just pointed out the practical limitations and incompatible design goals.
Personally, I've got no problem with websites implementing rate caps and whatnot to ensure that their traffic remains within the limits they can handle, or throttling specific IPs. I am very concerned with how Cloudflare in particular has become the single centralized "gatekeeper" for vast swaths of the Internet, though. If they decide that some particular client isn't allowed to see stuff then poof, a big chunk of the Internet is cut off. That's worrisome IMO.
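For anyone wondering what a per-IP rate cap like the one mentioned above might look like, here is a minimal token-bucket sketch in Python. It is not taken from Lemmy, PieFed, Mastodon, or any other Fediverse server; the class name `PerIpRateLimiter` and its parameters are made up for illustration, and a real instance would more likely do this in its reverse proxy or web framework.

```python
import time


class PerIpRateLimiter:
    """Hypothetical token-bucket limiter keyed by client IP (illustration only)."""

    def __init__(self, rate_per_sec, burst, clock=time.monotonic):
        self.rate = rate_per_sec   # tokens refilled per second
        self.burst = burst         # maximum tokens a bucket can hold
        self.clock = clock         # injectable clock, handy for testing
        self.buckets = {}          # ip -> (tokens_left, last_seen_time)

    def allow(self, ip):
        """Return True if this request from `ip` is within the cap."""
        now = self.clock()
        # A new IP starts with a full bucket.
        tokens, last = self.buckets.get(ip, (self.burst, now))
        # Refill in proportion to elapsed time, capped at the burst size.
        tokens = min(self.burst, tokens + (now - last) * self.rate)
        if tokens >= 1:
            self.buckets[ip] = (tokens - 1, now)
            return True
        self.buckets[ip] = (tokens, now)
        return False
```

A server would call `allow(client_ip)` on each request and answer with HTTP 429 when it returns False. Note that this only throttles by address: it limits load on the instance, but a scraper spread across many IPs, or one that simply federates as its own instance, would not be stopped by it.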