Anthropic's Claude 3.5 Sonnet is an old mid-tier model. GPT-4o is also from multiple generations ago.
Newer models handle this much better. Not claiming sentience or anything.
The models tested are so old they're from the era when models couldn't pass a math test or count the letters in a word.