If so, are these programs that claim to 'poison' the training datasets effective?

[–] BB84@mander.xyz 10 points 1 day ago (1 children)

You took those quotes wildly out of context. Of course there is a hard limit on how much information can be extracted from data, and clever processing won't break that limit. But only in basic cases do we have proofs that particular statistical inference methods make optimal use of the data; in complicated systems like neural nets such optimality is basically impossible to prove. In fact, the models are almost certainly not using the data optimally. Processing can help. A lot.

[–] pglpm@lemmy.ca 16 points 1 day ago* (last edited 1 day ago) (1 children)

They aren't out of context, and you have just said the same thing. Data processing can help remove noise, but it can't create information, or extract information that wasn't there in the first place. In fact – again, as you said – it can end up destroying part of the original information.
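
For reference, the formal version of this claim is the data processing inequality from information theory. A minimal statement of the textbook result, with I denoting mutual information:

```latex
% Data processing inequality: if X -> Y -> Z is a Markov chain
% (Z is computed from Y alone, with no separate access to X),
% then no amount of processing can increase the information
% that Y carries about X:
\[
  X \to Y \to Z
  \quad\Longrightarrow\quad
  I(X; Z) \le I(X; Y),
\]
% with equality iff Z is a sufficient statistic of Y for X.
```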

LLMs extract word correlations from textual data. Already in this process they lose information: they can't capture correlations beyond a certain (albeit large) context length, and they don't capture every correlation within it. And when generating output they insert spurious correlations that replace (destroy) some of the original ones. That output therefore contains even less information than the original training data, so a new LLM trained on it will give back even less.
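
A toy sketch of that degradation, under illustrative assumptions only (a fitted Gaussian stands in for the model; the sample size and generation count are arbitrary): each generation is trained on the previous generation's output, and the fitted spread collapses.

```python
import numpy as np

rng = np.random.default_rng(0)

# Generation 0: the "real" data, a standard normal sample.
data = rng.normal(0.0, 1.0, size=100)

for gen in range(1, 501):
    # "Train": fit a Gaussian to the current data.
    mu_hat, sigma_hat = data.mean(), data.std()
    # "Generate": the next training set is drawn from the fit,
    # not from the original source.
    data = rng.normal(mu_hat, sigma_hat, size=100)
    if gen % 100 == 0:
        print(f"gen {gen}: sigma_hat = {sigma_hat:.4f}")

# The fitted spread performs a downward-biased random walk and
# gradually collapses toward zero: each generation retains only
# what the previous fit captured, a crude analogue of the
# information loss described above.
```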

[–] BB84@mander.xyz 1 points 23 hours ago (1 children)

No one feeds unfiltered LLM output straight back in, though. The whole idea of reinforcement learning is that you take some model output, check whether it is good, and push the model in that direction if it is.

As long as you believe that, for example, verifying a mathematical result is easier than coming up with one, RL should work.
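
A minimal sketch of that verify-then-reinforce loop, under toy assumptions: the "model" is just a categorical distribution over candidate answers, and the verifier is an exact arithmetic check, cheap to run even though producing the answer isn't. Nothing here mirrors a real RLHF pipeline.

```python
import random

random.seed(0)

# Toy "model": a categorical distribution over candidate answers
# to the question 7 * x == 56. Real RL updates network weights;
# bumping category weights stands in for that here.
candidates = list(range(1, 11))
weights = [1.0] * len(candidates)

def verify(x: int) -> bool:
    # Verification is cheap even when generation isn't:
    # just check the proposed answer.
    return 7 * x == 56

for step in range(200):
    # Sample an answer from the current policy.
    x = random.choices(candidates, weights=weights)[0]
    # Reinforce only outputs that pass the verifier, so the
    # training signal comes from the check, not from trusting
    # raw model output.
    if verify(x):
        weights[candidates.index(x)] += 0.5

total = sum(weights)
print({c: round(w / total, 2) for c, w in zip(candidates, weights)})
# Probability mass concentrates on x = 8: verified outputs are
# fed back, unverified ones are not.
```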

[–] athatet@lemmy.zip 1 points 22 hours ago (1 children)

It will still, over time, produce fewer and fewer good results to feed back into it.

[–] BB84@mander.xyz 1 points 20 hours ago

Reinforcement learning makes the model better over time, so why should there be fewer and fewer good results?

If you're talking about the rate of improvement going down, then yes, of course. That's bound to happen (unless you have an actual intelligence explosion, but in that case you won't know what "good results" even means anyway).