Technology

I Went All-In on AI. The MIT Study Is Right. (open.substack.com)

submitted 2 weeks ago* (last edited 2 weeks ago) by AutistoMephisto@lemmy.world to c/technology@lemmy.world

Just want to clarify, this is not my Substack, I'm just sharing this because I found it insightful.

The author describes himself as a "fractional CTO" (no clue what that means, don't ask me) and advisor. His clients asked him how they could leverage AI, so he decided to experience it for himself. From the author (emphasis mine):

I forced myself to use Claude Code exclusively to build a product. Three months. Not a single line of code written by me. I wanted to experience what my clients were considering—100% AI adoption. I needed to know firsthand why that 95% failure rate exists.

I got the product launched. It worked. I was proud of what I’d created. Then came the moment that validated every concern in that MIT study: I needed to make a small change and realized I wasn’t confident I could do it. My own product, built under my direction, and I’d lost confidence in my ability to modify it.

Now when clients ask me about AI adoption, I can tell them exactly what 100% looks like: it looks like failure. Not immediate failure—that’s the trap. Initial metrics look great. You ship faster. You feel productive. Then three months later, you realize nobody actually understands what you’ve built.

[–] dsilverz@calckey.world 42 points 2 weeks ago (9 children)

@AutistoMephisto@lemmy.world @technology@lemmy.world

I've been programming since I was 9 years old, and my professional career in DevOps started several years later, in 2013. I dealt with lots of other people's code, legacy code, very shitty code (especially written by my "managers" who cosplayed as programmers), and tons of technical debt.

Even though I'm quite an LLM power user (because I'm a person devoid of other humans in my daily existence), I never relied on LLMs to "create" my code. What I did a lot was tinker with different LLMs to "analyze" code I wrote myself. Partly this was to experiment with their limits: I wrote a lot of cryptic, code-golf one-liners and fed them to the LLMs to test their ability to "connect the dots" on whatever was happening behind the cryptic syntax. Partly it was to use them as a pair of external eyes beyond mine, relying on that same ability to "connect the dots", by which I mean their ability, as fancy Markov chains, to relate tokens to other tokens with similar semantic proximity.

I did test them (especially Claude/Sonnet) for their "ability" to output code. I didn't intend to use the code, because I'm better off writing my own thing, but you likely know the maxim: one can't criticize what they don't know. So I tried to know them so I could criticize them. To me, the code is... pretty readable. Definitely awful code, but readable nonetheless.

So, when the person says...

The developers can’t debug code they didn’t write.

...even though they claim to have more than 25 years of experience, it feels to me like they don't.

It's one thing to say "developers find it pretty annoying to debug code they didn't write", a statement I'd totally agree with! It's awful to try to debug others' code (human or otherwise), because you need to put yourself in their shoes without knowing what their shoes are like... But it's doable, especially for people who have dealt with programming logic since childhood.

Saying "developers can't debug code they didn't write", to me, sounds like something from a layperson who doesn't belong to the field of Computer Science, doesn't like programming, and/or only pursued a "software engineer" career purely because of money or a capitalistic mindset. Either way, if a developer can't debug others' code, sorry to say, but they're not a developer!

Don't get me wrong: I'm not intending to be prideful or pretending to be awesome. This is beyond my person; I'm nothing, I'm no one. I abandoned my career because I hate the way technology is growing more and more enshittified. Working as a programmer for capitalistic purposes ended up depleting the joy I used to have back when I coded on a daily basis. I'm not on the "job market" anymore, so what I'm saying is based on more than 10 years of former professional experience. And my experience says: a developer who can't bring themselves to at least try to understand the worst code out there can't call themselves a developer, full stop.

[–] Munkisquisher@lemmy.nz 6 points 2 weeks ago (1 children)

Now that generating new code has become so cheap, and the cost of devs maintaining code they didn't write keeps getting higher, there's a huge shift happening toward just throwing out the code and regenerating it instead. Next year will be the "find out" phase, where the massive decline in code quality catches up with big projects.

[–] MangoCats@feddit.it 3 points 2 weeks ago

where the massive decline in code quality catches up with big projects.

That's going to depend, as always, on how the projects are managed.

In my experience, LLMs never "get it right" on the first pass, at least for anything of non-trivial complexity. But their power is that they're right more than half of the time, AND they can be told when they are wrong (whether by a compiler, a syntax nanny tool, or a human tester), AND they can then try again, and again, as long as necessary to reach a final state of "right," as defined by their operators.
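The generate, check, retry loop described above can be sketched roughly like this. It is only an illustration under stated assumptions: `refine_until_accepted`, `fake_generate`, and `fake_check` are hypothetical names, the "model" is a stand-in that is wrong on its first pass, and the "checker" plays the role of the compiler, lint tool, or human tester.

```python
from typing import Callable, Optional


def refine_until_accepted(
    generate: Callable[[str, Optional[str]], str],
    check: Callable[[str], Optional[str]],
    task: str,
    max_attempts: int = 5,
) -> tuple[Optional[str], int]:
    """Ask for a candidate, check it, feed any failure back, and retry.

    Returns (accepted candidate or None, number of attempts used).
    """
    feedback: Optional[str] = None
    for attempt in range(1, max_attempts + 1):
        candidate = generate(task, feedback)
        feedback = check(candidate)  # None means the checker accepted it
        if feedback is None:
            return candidate, attempt
    return None, max_attempts  # gave up: never reached "right"


# Hypothetical stand-in for an LLM: wrong first, corrected once told why.
def fake_generate(task: str, feedback: Optional[str]) -> str:
    if feedback is None:
        return "def add(a, b): return a - b"
    return "def add(a, b): return a + b"


# Hypothetical stand-in for a compiler/tester: runs the candidate and tests it.
def fake_check(candidate: str) -> Optional[str]:
    namespace: dict = {}
    exec(candidate, namespace)
    if namespace["add"](2, 3) != 5:
        return "add(2, 3) should be 5"
    return None


result, attempts = refine_until_accepted(fake_generate, fake_check, "write add(a, b)")
```

The `max_attempts` cap is a design choice of this sketch rather than anything from the comment: "as long as necessary" still needs a budget in practice, and the definition of "right" lives entirely in whatever `check` the operators supply.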

The trick, as always, is getting the managers to allow the developers to keep polishing the AI's (or human developer's) output until it's actually good enough to ship.

The question is: which approach will take longer, and which will require more developer "head count" during that time, to get it right - or at least good enough for business?

I feel like the answers all depend on the particular scenario. In some places, for some applications, current state-of-the-art AI can deliver the same "good enough" product we have always had, with lower developer head count and/or shorter delivery cycles. For other organizations with other product types, it will certainly take longer and/or more budget.

However, the needle is off 0: there are some places where it really does help, a lot. The other thing I have seen over the past 12 months is that it's improving rapidly.

Will that needle ever pass 90% of all software development benefitting from LLM agent application? I doubt it. In my outlook, I see that needle passing +50% in the near future - but not being there quite yet.
