this post was submitted on 08 Jan 2026
601 points (99.7% liked)

[–] webadict@lemmy.world 4 points 2 days ago (1 children)

Actually, the Rs issue is funny because it WAS trained on that exact information, which is why it says strawberry has two Rs, so it's actually more proof that it only knows what it has been given data on. The thing is, when people misspell strawberry as "strawbery", others naturally respond, "Strawberry has two Rs." The problem is that an LLM has no concept of context because it isn't really learning anything. Its reinforcement mechanism is whatever the majority of its data tells it. It regurgitates that strawberry has two Rs because that has been reinforced by its dataset.

[–] rumba@lemmy.zip 2 points 2 days ago (1 children)

Interesting story, but I've seen the same thing happen with how many "ass" are in "assassin".

You can probe the stuff it's bad at, and a lot of it doesn't line up well with the story that it comes from how people were corrected.

[–] webadict@lemmy.world 1 points 2 days ago

But that's exactly how an LLM is trained. It doesn't know how words are spelled because words are turned into numbers (tokens) before they are processed. But it does know when its dataset has multiple correlations for something. Specifically, people spell out words, so it will regurgitate to you how to spell strawberry, but it can't count letters because that's not a thing that language models do.
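
To make the "words are turned into numbers" point concrete, here is a rough Python sketch. It is a toy, not how any real model tokenizes: the vocabulary `TOY_VOCAB`, its ID numbers, and the helper names `toy_tokenize` and `count_occurrences` are all made up for illustration, and real tokenizers use learned subword vocabularies that split words differently.

```python
# Toy illustration only: real LLM tokenizers use learned subword vocabularies.
# The pieces and ID numbers below are invented for this example.
TOY_VOCAB = {"straw": 101, "berry": 102, "ass": 201, "in": 202}


def toy_tokenize(word, vocab=TOY_VOCAB):
    """Greedy longest-match split of `word` into pieces, returned as integer IDs."""
    ids = []
    i = 0
    while i < len(word):
        # Try the longest remaining slice first, then shrink it.
        for j in range(len(word), i, -1):
            piece = word[i:j]
            if piece in vocab:
                ids.append(vocab[piece])
                i = j
                break
        else:
            raise ValueError(f"no toy vocabulary piece matches at position {i}")
    return ids


def count_occurrences(word, fragment):
    """Count non-overlapping occurrences of `fragment` in the raw characters."""
    return word.count(fragment)


# What a model would "see" under this toy scheme: IDs with no letters in them.
print(toy_tokenize("strawberry"))  # [101, 102]
print(toy_tokenize("assassin"))    # [201, 201, 202]

# Counting is trivial when you have the characters themselves.
print(count_occurrences("strawberry", "r"))  # 3
print(count_occurrences("assassin", "ass"))  # 2
```

Under these assumptions, the count is easy on the character string, but nothing in `[101, 102]` says how many Rs are inside each piece. That information would have to come from somewhere else.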

Generative AI and LLMs are just giant reconstruction bots that take all the data they have and reconstruct something. That's literally what they do.

Like, without knowing what your answer is for assassin, I will assume that your issue is that the question is probably "How many asses are in assassin?" But, like, that's a joke. An assassin only has one ass, just like the rest of us. And nobody would ever spell assassin as "assin", so why would it learn that there are two asses in assassin?

I'm confused about where you are getting your information from, but this is not particularly special behavior.