The research I saw describing LLMs as fairly good at chess came with the caveat that it allowed up to 20 attempts, to cover for the model just making up invalid moves that merely sounded like legit moves.
jj4211
I remember seeing that. Early on it seemed fairly reasonable, then they started materializing pieces out of nowhere and convincing each other that they had already lost.
Because the business leaders are famously diligent about putting aside the marketing push and reading into the nuance of the research instead.
To reinforce this, I just had a meeting with a software executive who has no coding experience but is nearly certain he's going to lay off almost all his employees, because the value is all in the requirements he manages and he can feed those to a prompt just as well as any human can.
He does tutorial-fodder introductory applications and assumes all the work is that way. So he is confident that he will save the company a lot of money by laying off these obsolete computer guys and focusing on his "irreplaceable" insight. He's convinced that all the negative feedback is just people trying to protect their jobs or people stubbornly not getting on board with new technology.
Questions would be:
Was this a sincere account or satire?
Even if sincere, did that post actually exist?
It just seems a bit too much to believe someone would admit it is just for the lulz of making liberals upset. Maybe they'd admit to it being a bonus, but to say it is the point? Particularly in a scenario where that admission very explicitly amounts to a self-own...
Yes, as common as that is, in the scheme of driving it is relatively anomalous.
By hours in the car, most of the time is spent on a freeway driving between two lines, either at cruising speed or in a traffic jam. Those are the most mind-numbing things for a human, and pretty comfortably in the wheelhouse of automated driving.
Once you are dealing with pedestrians, signs, intersections, etc., all of those, despite being 'common', are anomalous enough to be dramatically more tricky for these systems.
At least in my car, the lane following (not the lane keeping system) is handy because the steering wheel naturally tends to go where it should, and less often am I "fighting" the tendency to center. The keeping system is, at least for me, largely nothing. If I use my turn signal, it ignores me crossing a lane. If circumstances demand an evasive maneuver that crosses a line, its resistance isn't enough to cause an issue. At least mine has fared surprisingly well in areas where the lane markings are all kind of jacked up due to temporary changes for construction. If it is off, then my arms just have to exert more effort to be in the same place I was going to be with the system. Generally no passenger notices when the system engages or disengages, except for the chiming it does when it switches over to unaided operation.
So at least my experience has been a positive one, but it strikes just the right balance between intervention and human attention, including monitoring my gaze to make sure I am looking where I should. However, there are people who test "how long can I keep my hands off the steering wheel", which is a more dangerous mode of thinking.
And yes, having cameras everywhere makes fine maneuvering so much nicer, even with the limited visualization possible in the synthesized 'overhead' view of your car.
To the extent it is people trying to fool people, it's rich people looking to fool poorer people for the most part.
To the extent it's actually useful, it's to replace certain systems.
Think of the humble phone tree, designed so that humans don't have to respond to, triage, and route calls. An AI system can significantly shorten that process: instead of navigating a tedious, long maze of options, you have a couple of sentences back and forth and you either get the portion of automated information that would suffice or get routed to a human to take care of it. The same goes for a lot of online interactions where you have to input way too much, and where the automated response is a wall of text from which you'd like something to distill the 3 or 4 sentences relevant to your query.
So there are useful interactions.
However, it's also true that it's dangerous, because the "make the user approve of the interaction" goal can bring out the worst in people when they feel like something is just always agreeing with them. Social media has been bad enough, but chatbots that by design want to please the end user and look almost legitimate really can inflame the worst in our minds.
The thing about self driving is that it has been like 90-95% of the way there for a long time now. It made dramatic progress then plateaued, as approaches have failed to close the gap, with exponentially more and more input thrown at it for less and less incremental subjective improvement.
But your point is accurate, that humans have lapses and AI has lapses. The nature of those lapses is largely disjoint, so that creates an opportunity for AI systems to augment a human driver to get the best of both worlds. A constantly, consistently vigilant computer monitors and tends the steering, acceleration, and braking to do the 'right' thing in a neutral way, with the human looking for the more anomalous situations that the AI tends to get confounded by, and making the calls on navigating certain intersections that the AI FSD still can't figure out. At least for me, the worst part of driving is the long-haul monotony on the freeway where nothing happens, and AI excels at not caring about how monotonous it is and just handling it, so I can pay a bit more attention to what other things on the freeway are doing that might cause me problems.
I don't have a Tesla, but I have a competitor's system and have found it useful, though not trustworthy. It's enough to greatly reduce the drain of driving, but I have to be always looking around, and have to assert control if there's a traffic jam coming up (it might stop in time, but it certainly doesn't slow down soon enough) or if I have to do a lane change in some traffic (if traffic conditions are light, it can change lanes nicely, but without a whole lot of breathing room, it won't do it, which is nice when I can afford to be stupidly cautious).
I think the self driving is likely to be safer in the most boring scenarios, the sort of situations where a human driver can get complacent because things have been going so well for the past hour of freeway driving. The self driving is kind of dumb, but it's at least consistently paying attention, and literally has eyes in the back of its head.
However, there's so much data about how it fails in stupidly obvious ways that it shouldn't, so you still need the human attention to cover the more anomalous scenarios that foul up self driving.
Now there’s models that reason,
Well, no, that's mostly a marketing term applied to expending more tokens on generating intermediate text. It's basically writing a fanfic of what thinking about a problem would look like. If you look at the "reasoning" steps, you'll see artifacts where it just goes disjoint: generated output that is structurally sound but not logically connected to the bits around it.
If they marketed on the actual capability, customer executives wouldn't be as eager to open their wallets. Get them thinking they can reduce headcount and they'll fall over themselves. Tell them their staff will remain about the same but some facets of the job will be easier, and they are less likely to recognize the value.