jj4211

joined 2 years ago
[–] jj4211@lemmy.world 2 points 28 minutes ago

Mine would be: "I have no idea" - an answer LLMs generally refuse to give by their nature (when they do decline to answer, it's usually because something in the context indicates that refusing is the proper text).

If you really pressed them, they'd probably google each thing and sum the results, so the estimates would be as consistent as first google results.

LLMs have a tendency to emit a plausible answer without regard for facts one way or the other. We try to steer things by stuffing the context with facts roughly based on traditional 'fact' based measures, but if the context doesn't have factual data to steer the output, the output is purely based on narrative consistency rather than data consistency. Sometimes it may even do that when the context does have fact-based content in it.

[–] jj4211@lemmy.world 1 points 1 hour ago

Note that a successful run could prove you have it, but a failure to execute does not prove you are secure.

For example, someone reported to me that their RHEL9 system was not vulnerable based on this result. But that was because Python was 3.9 and didn't have os.splice, so the demonstrator failed even though the actual issue was still there.

Similarly, if '/usr/bin/su' isn't exactly there (maybe it's in /bin/su, or in /sbin/su, or /usr/sbin/su, or not there at all), the demonstrator will fail, but the kernel may still have the vulnerability. You just have to select a different victim utility (or change the cache of some data other than an executable for other effects).
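
To make the point concrete, here is a minimal sketch of a preflight check that separates "the demonstrator cannot run here" from "the kernel is fine". The helper name `demonstrator_blockers` and the candidate list are hypothetical and not part of the published demonstrator; the only facts it relies on are that `os.splice` was added in Python 3.10 and that the demonstrator targets '/usr/bin/su'. An empty result does not prove anything about the kernel either way.

```python
import os
import sys

# Hypothetical candidate list; the demonstrator itself only looks at the first path.
SU_CANDIDATES = ("/usr/bin/su", "/bin/su", "/usr/sbin/su", "/sbin/su")


def demonstrator_blockers(candidates=SU_CANDIDATES):
    """Return reasons the demonstrator could fail that say nothing about the kernel."""
    blockers = []

    # os.splice only exists on Python 3.10+ (and only on Linux).
    if not hasattr(os, "splice"):
        blockers.append(
            "os.splice is missing (Python %d.%d); it was added in Python 3.10"
            % sys.version_info[:2]
        )

    # The expected victim binary may live elsewhere or be absent entirely.
    if not os.path.exists(candidates[0]):
        found = [p for p in candidates if os.path.exists(p)]
        blockers.append(
            "%s not found; other candidates present: %s"
            % (candidates[0], found or "none")
        )

    return blockers


if __name__ == "__main__":
    for reason in demonstrator_blockers():
        print("inconclusive:", reason)
```

If this prints anything, a failed demonstrator run should be read as "inconclusive", not as "not vulnerable".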

[–] jj4211@lemmy.world 1 points 1 hour ago

Looking at the binary blob, it's a payload to assume whatever privileges are possible and exec sh. So replace su with that and the binary gets to use su's filesystem privileges without needing access to actually write it.

The vulnerability itself is that the door is open to replace any file's read cache with arbitrary content. The binary payload is just an obvious example of the sort of payload that could do a ton of damage.

[–] jj4211@lemmy.world 2 points 2 hours ago

Note that this is a rather narrow view of the scope of things.

Yes, the demonstrator is a python script that opens up 'su' and uses splice+this vulnerability to change it to 'just assume all privileges and become sh'.

However, the real issue is that any process in any namespace can leverage a certain socket type and splice to effectively modify any filesystem content it wants. It's easy to see how this could be part of a chained attack to, for example, replace a protected service that is firewalled off with a shell. An RCE in one service permits rewriting nginx in an entirely different container and replacing it with a shell backend of your choosing.

That 'flatpak' application on your single user system that is guarded from touching your files that aren't related? That isolation doesn't mean anything if this issue is in play.

In terms of shared systems, while it should be avoided if possible, practically speaking there's a lot of shared resources.

I don't get why I've seen so many people saying "ehh, no big deal, privilege escalation is just a fact of life".

[–] jj4211@lemmy.world 7 points 2 days ago

In my experience, the bigger the codebase gets, the more confounded an LLM gets when trying to make coherent changes. So LLM projects start on shaky ground and just get worse because the LLMs can't maintain the stuff they themselves generated.

I've seen what an LLM can do and it is certainly interesting and can do some stuff, but the vast majority of my experience is someone who had not coded before "vibing" themselves into a corner and demanding help to dig them out. A bit irritating, because before we could reasonably prioritize requests to do stuff since management understood that making something from nothing was real work, but now management says "they aren't asking you to make something, just help them fix something that already exists, should be easy!"

On the ELOC metric, for a long time I pointed out how disastrous I must be because my contribution to a project I was on was about -10,000 lines of code by the time I went to something else.

[–] jj4211@lemmy.world 5 points 2 days ago

While I despise the captchas from a human perspective, the fact that an LLM can solve the challenge isn't a deal breaker. It doesn't need to be impossible for a non-human to solve, it just has to be too expensive.

It does certainly shift the equation to stuff like proof of work since a computer can solve it anyway, might as well not annoy the human.

[–] jj4211@lemmy.world 3 points 2 days ago

Seems utterly pointless though...

With the proof of work approach, at least it's demanding the client consume some resources, though the 'right' amount is a tricky question: either it's so trivial as to hardly matter to the scrapers, or it's hard enough to put a dent in the scrapers' build, but human operated low end devices are royally screwed.

Here the crawler simply schedules a resumption and moves on to other work. The crawler doesn't need it right now and it's free for it to wait.

[–] jj4211@lemmy.world 1 points 2 days ago

Yeah, seems like the problem is that fundamentally it could work by upping the difficulty a smidge to make it meaningfully expensive, but the spread between the slowest edge device and the high end means it's impossible to chase that difficulty without screwing over low end device users.
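
For anyone who hasn't looked at how these challenges work, here is a rough hashcash-style sketch. It is an assumption for illustration, not the scheme any particular anti-scraper tool uses, and the names `solve`, `verify`, and `leading_zero_bits` are made up. The client searches for a nonce whose SHA-256 hash has at least `bits` leading zero bits, which takes about 2^bits hashes on average, while the server verifies with a single hash.

```python
import hashlib
import itertools
import time


def leading_zero_bits(digest: bytes) -> int:
    """Count the leading zero bits of a digest."""
    value = int.from_bytes(digest, "big")
    return len(digest) * 8 - value.bit_length()


def verify(challenge: bytes, nonce: int, bits: int) -> bool:
    """Server side: one hash to check a submitted nonce."""
    digest = hashlib.sha256(challenge + str(nonce).encode()).digest()
    return leading_zero_bits(digest) >= bits


def solve(challenge: bytes, bits: int) -> int:
    """Client side: brute-force nonces until one meets the difficulty."""
    for nonce in itertools.count():
        if verify(challenge, nonce, bits):
            return nonce


if __name__ == "__main__":
    challenge = b"example-challenge"  # hypothetical per-request challenge
    for bits in (8, 12, 16):
        start = time.perf_counter()
        nonce = solve(challenge, bits)
        elapsed = time.perf_counter() - start
        print(f"{bits} bits: nonce={nonce}, {elapsed:.3f}s")
```

Each extra bit doubles the expected work for every client alike, so the ratio between a slow phone and a fast server stays the same no matter where the difficulty is set.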

[–] jj4211@lemmy.world 15 points 2 days ago (2 children)

They are no longer with us.

Hey, I'm annoyed by slop coding work as much as the next guy, but murder seems a bit much as a reaction...

[–] jj4211@lemmy.world 56 points 2 days ago (5 children)

Fun story from this week, we had a chore for the frontend to refresh to a new version of the UI framework. Fairly simple task, so off to a junior developer. Within a couple hours there was a merge request ready to go. Ok, a fairly normal amount of time to change version and at least do a sniff test and find nothing changed so I go in assuming I'll look at a few version bumps, maybe one or two tweaks... I see the junior dev was proposing over 1,000 lines of code to be added... WTF...

I crack it open and there was just a firehose of CSS rules, all marked '!important'. Looking at one example, it repeated the same selector with the same exact bunch of rules 5 times in a row. It was like it found every possible derived CSS class combination with tag and defined !important CSS for most everything about it.

So I find out that the junior dev asked the LLM to rebase and it did what he expected: just changed some versions and was done. He tried it and, due to a framework change, one element was misaligned by a little bit. So he gave that feedback to the LLM and tried again... and it failed, and he tried again and it failed, and after 5 rounds it finally got the element aligned and he hit 'merge request'. For fun I opened up his proposed change, and so much was just a bit dodgy CSS-wise because it screwed with so much stuff, but the junior dev only concerned himself with the page as it opened.

So I said screw it, I'll do it myself, and added the singular rule that was needed to adapt to the framework change, making it overall about a 5 line change including versioning and such.

Depressingly, I suspect an executive would consider me far less productive because I only did 5 lines of change and the junior dev would have done thousands...

[–] jj4211@lemmy.world 1 points 3 days ago (1 children)

My electric bill last month was $15. It eventually is financially worth it.

Don't know how hard you went on your home lab. I use office rather than datacenter equipment, and it's quiet and plenty for my needs. For my professional test/dev needs I have such equipment at work, so I don't need to home train on gear for the sake of competence in the field.
