Advanced AI models suffer a near-total collapse on classic psychology test as cognitive demands increase (www.psypost.org)

submitted 2 days ago by sanitation@lemmy.today to c/technology@lemmy.world

[–] zbyte64@awful.systems 3 points 2 days ago (1 children)

Using computers to search for a counterexample to a conjecture isn't exactly new ground, and I suspect they did so with the aid of some harness tweaks, like some numerical LSP. Cool, it pushed the envelope, but like the parent said, they grafted on the ability to do a specific task.

[–] communist@lemmy.frozeninferno.xyz 0 points 1 day ago* (last edited 1 day ago) (1 children)

That doesn't change the fact that LLMs are capable of acing math olympiads. So what if they use tools? You probably would too. I doubt anybody there did it without a calculator.

https://www.nature.com/articles/d41586-025-02343-x

[–] zbyte64@awful.systems 1 points 1 day ago* (last edited 1 day ago) (1 children)

Aren't you the least bit curious what tools they gave the LLM and how the LLM used those tools? It's like being back in math class: you're asked to solve a quadratic equation but you've forgotten how. So you use the calculator to try different numbers, and the calculator tells you whether you're getting closer. Sure, I got the right answer, but it's hardly a testament to my math skills.
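
To make the analogy concrete, here is a rough sketch of the two routes to the same answer. The coefficients and function names are made up for illustration, and bisection stands in for "try numbers and see if you're getting closer"; it is not a claim about what any LLM harness actually does.

```python
import math

def f(x, a=1.0, b=-3.0, c=-4.0):
    """Evaluate a*x^2 + b*x + c (hypothetical example coefficients)."""
    return a * x * x + b * x + c

def guess_and_check(lo, hi, tol=1e-9):
    """Bisection: guess the midpoint and use the sign of f as feedback."""
    assert f(lo) * f(hi) < 0, "bracket must contain a sign change"
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if f(lo) * f(mid) <= 0:
            hi = mid  # the root lies in the lower half
        else:
            lo = mid  # the root lies in the upper half
    return (lo + hi) / 2

def quadratic_formula(a=1.0, b=-3.0, c=-4.0):
    """Closed form: requires knowing the formula, no feedback loop needed."""
    root = math.sqrt(b * b - 4 * a * c)
    return ((-b - root) / (2 * a), (-b + root) / (2 * a))

print(guess_and_check(0.0, 10.0))  # approximates the root near 4
print(quadratic_formula())         # both roots, computed directly
```

Both land on the same root, but only the second one shows I know the math.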

[–] communist@lemmy.frozeninferno.xyz 1 points 1 day ago* (last edited 1 day ago) (1 children)

The calculator does not tell them if they're getting closer. That isn't how any of this works. No, I can't say I'm very interested in whether or not the LLM has access to Python or a calculator; as long as it completes the task, that doesn't matter.

[–] zbyte64@awful.systems 1 points 1 day ago (1 children)

If you are not interested in how it completes the task then you are not an authority on how it works.

[–] communist@lemmy.frozeninferno.xyz 1 points 1 day ago* (last edited 1 day ago) (1 children)

I'm academically interested. What I mean when I say I'm not interested is that I just don't see the significance when we're talking about whether it's capable of the task.

[–] zbyte64@awful.systems 1 points 1 day ago (1 children)

How are you able to understand its capability without understanding what tools it is capable of manipulating to good effect?

[–] communist@lemmy.frozeninferno.xyz 1 points 1 day ago (1 children)

You aren't, and that's exactly what I'm saying: it's capable of doing these things with tools, therefore it's capable of doing these things.

[–] zbyte64@awful.systems 1 points 1 day ago (1 children)

So why are you allergic to people talking about the quality of the tools with regard to capability?

[–] communist@lemmy.frozeninferno.xyz 1 points 23 hours ago (1 children)

I don't know what you mean. I wasn't the one who claimed they couldn't do something they clearly can.

[–] zbyte64@awful.systems 1 points 23 hours ago (1 children)

You are the one collapsing tool use into a binary when there are varying degrees of competency and hand-holding.

[–] communist@lemmy.frozeninferno.xyz 1 points 22 hours ago (1 children)

I am not. You inaccurately said that the math olympiad was not bested by LLMs because they had a tool that told them if they were close but incorrect and could just try an infinite number of times. That just isn't true: they had a number of tries with Python. I think besting it with the use of Python is equally significant and still counts as besting it, and saying they can't do math work is absurd.

[–] zbyte64@awful.systems 1 points 13 hours ago* (last edited 12 hours ago) (1 children)

It's not "bested" by the LLM though; a mathematician used the LLM as a tool to disprove a conjecture. Subtract the mathematicians from the process and the LLM would not have successfully completed the task. It would be more accurate to say a mathematician with an LLM was able to best a mathematician who did not have an LLM. Which is cool, but we don't need to pretend the LLM is something other than a tool, something that "understands" math like a mathematician.

[–] communist@lemmy.frozeninferno.xyz 1 points 8 hours ago* (last edited 8 hours ago) (1 children)

You're confusing the olympiad with the Erdős conjecture. And this is just really not true: they just asked it and it found a solution, and the mathematician then used its solution as inspiration to create a better one. It still essentially did it on its own, and they certainly do the olympiad on their own.

[–] zbyte64@awful.systems 1 points 5 hours ago* (last edited 5 hours ago)

The description that the LLM did it on its own is subjective at best. I'll just leave it at that. Have a good one.