this post was submitted on 01 Jul 2026
Technology
As opposed to the plain old scraping they do to train AI, and generate revenue by selling user comments for others to train AI.
I read a half-cocked internet theory that a certain someone might've purchased Twitter just to gain access to an ex-gf's personal tweets. I judged it as possible but unlikely, as that's a lot of money to spend on such a thing.
Now, we've all heard stories about Reddit blocking accounts for no published reason and tracking folks down across accounts/IP addresses/etc. That code must be pretty expansive to do the things they've done. So one has the thought: if you've ever reached out to the Reddit hive mind for support with a personal issue of any kind, then that data about you is still floating around in their database, tied to whatever alternate accounts you have, even if you did it back in the "good old days."
Abusive scraping, my ass.
Reddit's moderation is very expensive, which is why they allow Google to scrape their data. The v3 captcha system is Google's thing; Reddit can't afford that, but Google lets them use it in exchange for AI/data mining. Reddit and Google are quite intertwined, like Mozilla and Google.
Had no idea they used that. I edited all my comments to crap, then deleted them around the time the admin monkeyed with the backend database, and stopped using old.reddit to browse once I found Lemmy. I once went through the effort of making a temp account to reply to someone's comment there, because they had suggested trying something specifically dangerous and didn't seem to know it was. I double-checked later and the comment I wrote had been caught in some filter, likely because the account was too new. I can't imagine what garbage that site will be in the years to come.