overview for wjs018

FOSS infrastructure is under attack by AI companies in c/technology@lemmy.world

wjs018@piefed.social 80 points 1 month ago (2 children)

The lead maintainer's theory (he is an actual software developer; I just dabble) is that it might be a type of reinforcement learning:

1. Get your LLM to create what it thinks are valid bug reports/issues
2. Monitor the outcome of those issues (closed immediately, discussion, eventual pull request)
3. Use those outcomes to assign how "good" or "bad" that generated issue was
4. Use that scoring as a way to feed back into the model to influence it to create more "good" issues

If this is what's happening, then it's essentially offloading your LLM's reinforcement learning scoring to open source maintainers.
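To make the theorized loop concrete, here is a minimal Python sketch of the scoring step. Everything in it is an assumption for illustration: the `Outcome` names, the reward values, and the `score_issues` and `advantages` helpers are hypothetical, and nothing here is confirmed to be what anyone is actually running.

```python
from dataclasses import dataclass
from enum import Enum


class Outcome(Enum):
    """What eventually happened to a generated issue (hypothetical labels)."""
    CLOSED_IMMEDIATELY = "closed_immediately"
    DISCUSSION = "discussion"
    PULL_REQUEST = "pull_request"


# Assumed reward values: the only claim is that some outcomes count as
# "good" and others as "bad"; the exact numbers are made up.
REWARDS = {
    Outcome.CLOSED_IMMEDIATELY: -1.0,
    Outcome.DISCUSSION: 0.5,
    Outcome.PULL_REQUEST: 1.0,
}


@dataclass
class GeneratedIssue:
    text: str
    outcome: Outcome


def score_issues(issues):
    """Pair each generated issue with the reward implied by its outcome."""
    return [(issue.text, REWARDS[issue.outcome]) for issue in issues]


def advantages(scored):
    """Subtract the batch mean so above-average issues get a positive signal."""
    if not scored:
        return []
    mean = sum(reward for _, reward in scored) / len(scored)
    return [(text, reward - mean) for text, reward in scored]


batch = [
    GeneratedIssue("Type-checking error in parser", Outcome.CLOSED_IMMEDIATELY),
    GeneratedIssue("Crash on empty input", Outcome.PULL_REQUEST),
]
training_signal = advantages(score_issues(batch))
```

The key point is that the `outcome` field can only be filled in by a human maintainer triaging the issue, which is exactly the labor being offloaded.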

FOSS infrastructure is under attack by AI companies in c/technology@lemmy.world

wjs018@piefed.social 114 points 1 month ago* (last edited 1 month ago) (7 children)

Really great piece. We have seen many popular Lemmy instances struggle under recent scraping waves, and that is hardly the first time it's happened. I have some firsthand experience with the second part of this article, which talks about AI-generated bug reports/vulnerabilities for open source projects.

I help maintain a Python library and, a couple of weeks back, got a bug report from a user describing a type-checking issue, along with a bit of additional information. It didn't strictly follow the bug report template we use, but it was organized well enough, so I spent some time digging into it and could not find any way to reproduce it. Thankfully, the lead maintainer was able to spot the report for what it was and just closed it, saving me from further efforts to diagnose the issue (after an hour or two had already been burned).

PhysicsForums and the Dead Internet Theory [old specialized forums have started backdating millions of LLM-generated posts] in c/technology@lemmy.world

wjs018@piefed.social 1 points 2 months ago* (last edited 2 months ago)

Official response from Greg Bernhardt

It's been years since I last used PhysicsForums, but I found it immensely useful in the old days while going through my undergrad physics degree (it was less useful for PhD courses). I am not morally opposed to providing AI attempts at an answer in threads where nobody else chimes in. However, using real accounts that belong to other users is wildly over the line. I was surprised to see that the existing users didn't really call this out in the official response thread, as that is the most egregious part of all this to me.