this post was submitted on 08 Aug 2025
771 points (96.6% liked)

Technology


Or my favorite quote from the article:

"I am going to have a complete and total mental breakdown. I am going to be institutionalized. They are going to put me in a padded room and I am going to write... code on the walls with my own feces," it said.

[–] prole@lemmy.blahaj.zone 10 points 3 days ago (1 children)

This is the conclusion that anyone with any bit of expertise in a field has come to after 5 mins talking to an LLM about said field.

The more this broken shit gets embedded into our lives, the more everything is going to break down.

[–] jj4211@lemmy.world 5 points 2 days ago

after 5 mins talking to an LLM about said field.

The insidious thing is that LLMs tend to be pretty good at 5-minute first impressions. I've repeatedly seen people set out to evaluate an LLM, and they generally fall back to: "OK, if this were a human, I'd ask a few job-interview questions, well known enough that they have a shot at answering, but tricky enough to show they actually know the field."

As an example, a colleague became a true believer after being directed by management to evaluate it. He asked it to "generate a utility to take in a series of numbers from a file, sort them, and report the min, max, mean, median, mode, and standard deviation". It did so instantly, with "only one mistake". Then he tried the exact same question later in the day, it happened not to make that mistake, and he concluded it must have 'learned' how to do it in the last couple of hours. Of course that's not how it works; the output is sampled probabilistically, and any perturbation of the prompt can produce unexpected variation. But he doesn't know that...
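
For what it's worth, the task itself is boilerplate. A minimal sketch of the kind of utility described (the file name and the one-number-per-line layout are assumptions, not anything from the comment):

```python
#!/usr/bin/env python3
# Hypothetical sketch: read one number per line from a text file,
# sort them, and report basic summary statistics.
import statistics
import sys

def summarize(path):
    with open(path) as f:
        numbers = sorted(float(line) for line in f if line.strip())
    if not numbers:
        raise ValueError(f"no numbers found in {path}")
    print("min:   ", numbers[0])
    print("max:   ", numbers[-1])
    print("mean:  ", statistics.mean(numbers))
    print("median:", statistics.median(numbers))
    print("mode:  ", statistics.mode(numbers))
    # Sample standard deviation needs at least two values.
    print("stdev: ", statistics.stdev(numbers) if len(numbers) > 1 else 0.0)

if __name__ == "__main__":
    # Assumed invocation: python summarize.py numbers.txt
    summarize(sys.argv[1] if len(sys.argv) > 1 else "numbers.txt")
```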

Note that management frequently never gets beyond tutorial/interview-question fodder when it comes to the technical side of their teams' work, so you can see how they might tank their companies because the LLMs "interview well".