Mniot

joined 9 months ago
[–] Mniot@programming.dev 6 points 5 months ago

Looking at the code, it reads like it was written by an LLM: chatty commit messages, a total lack of spelling/capitalization errors, bullet points galore, a shit-ton of "Fix X" commits that don't read increasingly frustrated, and worthless comments randomly scattered like "i + 1 // add 1 to i" with no other comments on the page.

No security review because none of the code has been reviewed and he doesn't know what's in it.

[–] Mniot@programming.dev 18 points 5 months ago (1 children)

I once did some programming on the Cybiko, a device from 2000 that could form a wireless mesh network with peers. The idea was that you could have a shopping mall full of teens and they'd be able to chat with each other from one end to the other by routing through the mesh. It was a neat device!

[–] Mniot@programming.dev 1 points 6 months ago (1 children)

> This is good advice for all tertiary sources such as encyclopedias, which are designed to introduce readers to a topic, not to be the final point of reference. Wikipedia, like other encyclopedias, provides overviews of a topic and indicates sources of more extensive information.

The whole paragraph is kinda FUD except for this. Normal research practice is to (get ready for a shock) do research, not just copy a high-level summary of what other people have done. If your professors were saying "don't cite encyclopedias, and that includes Wikipedia," then that's fine. But my experience was that Wikipedia was specifically called out as especially unreliable, and that's just nonsense.

> I personally use ChatGPT like I would Wikipedia

Eesh. The value of a tertiary source is that it cites the secondary sources (which cite the primary). If you strip that out, how's it different from "some guy told me..."? I think your professors did a bad job of teaching you about how to read sources. Maybe because they didn't know themselves. :-(

[–] Mniot@programming.dev 2 points 6 months ago (1 children)

I think it was. When I think of Wikipedia, I'm thinking about how it was in ~2005 (20 years ago) and it was a pretty solid encyclopedia then.

There were (and still are) some articles that are very thin. And some that have errors. Both of these things are true of non-wiki encyclopedias. When I've seen a poorly-written article, it's usually on a subject that a standard encyclopedia wouldn't even cover. So I feel like that was still a giant win for Wikipedia.

[–] Mniot@programming.dev 8 points 6 months ago (6 children)

I think the academic advice about Wikipedia was sadly mistaken. It's true that Wikipedia contains errors, but so do other sources. The problem was that it was a new thing and the idea that someone could vandalize a page startled people. It turns out, though, that Wikipedia has pretty good controls for this over a reasonable time-window. And there's a history of edits. And most pages are accurate and free from vandalism.

Just as you should not uncritically read any of your other sources, you shouldn't uncritically read Wikipedia as a source. But if you are going to uncritically read, Wikipedia's far from the worst thing to blindly trust.

[–] Mniot@programming.dev 42 points 6 months ago

I don't think the article summarizes the research paper well. The researchers gave the AI models simple-but-large puzzles (which they confusingly called "complex"), like Towers of Hanoi but with 25 discs.

The solution to these puzzles is nothing but patterns. You can write code that solves the Tower puzzle for any size n, and the whole program fits on less than a screen.
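
For instance, a minimal sketch of a complete recursive solver in Python (peg names and the demo disc count are arbitrary):

```python
# Minimal recursive Tower of Hanoi solver: prints the full move sequence for n discs.
def hanoi(n, source="A", target="C", spare="B"):
    if n == 0:
        return
    hanoi(n - 1, source, spare, target)            # park n-1 discs on the spare peg
    print(f"move disc {n}: {source} -> {target}")  # move the largest disc
    hanoi(n - 1, spare, target, source)            # stack the n-1 discs back on top

hanoi(4)   # small demo; hanoi(25) works the same way, just 2**25 - 1 = 33,554,431 moves
```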

The problem the researchers see is that on these long, pattern-based solutions, the models follow a bad path and then just give up long before they hit their limit on tokens. The researchers don't have an answer for why this is, but they suspect that the reasoning doesn't scale.

[–] Mniot@programming.dev 3 points 6 months ago

Just saw: they're changing it back because you don't appreciate it enough :(

[–] Mniot@programming.dev 3 points 6 months ago

Thanks for linking that. Reading the paper, it looks like the majority of the "self-host" population they're capturing is people who have a WordPress site. By my reading, the wording of the paper would disqualify a wordpress.com-hosted site as "self-hosted", but I'd be very suspicious of their methodology and would expect that quite a few people on wordpress.com hosting reported themselves as self-hosted, because the language is pretty confusing.

[–] Mniot@programming.dev 13 points 7 months ago (1 children)

I was at Google when they announced that only AI-related projects would be able to request increased budget. I don't know if they're still doing that specifically, but I'm sure they are still massively incentivizing teams to slap an "AI Inside" sticker on everything.

[–] Mniot@programming.dev 13 points 7 months ago (2 children)

lol is it even worth tracking what's tariffed today?

[–] Mniot@programming.dev 71 points 7 months ago (6 children)

Though, do be careful because there are abusive same-sex relationships and sometimes it's even harder to get away because the people around you are telling you "but women can't be abusers!"

[–] Mniot@programming.dev 7 points 7 months ago

To someone watching network traffic, a VPN connection looks like two machines exchanging encrypted packets. You can't see the actual data inside a packet, but you can see all the metadata (who it's addressed to, how big it is, whether it's TCP or UDP, when it's sent). From the metadata, you can make guesses about the content, and a VPN would be pretty easy to guess.

When sending a packet over the Internet, there are two parts to the address: the IP address and the port. The IP address is a specific Internet location; blocks of IP addresses are owned by organizations (who owns what is public info), and there are many services that do geo-IP mappings. So if you're connecting to an IP address that belongs to a known VPN provider, that's easy.
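
As a rough sketch of that check in Python (the CIDR block here is a documentation range standing in for a real provider's published addresses), it's just set membership:

```python
# A sketch: does this destination IP fall inside an address block attributed to a VPN provider?
import ipaddress

KNOWN_VPN_BLOCKS = [ipaddress.ip_network("198.51.100.0/24")]  # placeholder block for illustration

def looks_like_vpn_provider(ip: str) -> bool:
    addr = ipaddress.ip_address(ip)
    return any(addr in block for block in KNOWN_VPN_BLOCKS)

print(looks_like_vpn_provider("198.51.100.42"))  # True: inside a listed block
print(looks_like_vpn_provider("203.0.113.7"))    # False: not in any listed block
```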

The second part of the address is the port number. Servers choose which port numbers to listen on, and the common convention is to use well-known ports. So, for example, HTTPS traffic is on port 443. If you see a computer making a lot of requests to port 443, then even though the traffic is encrypted we can guess that it's browsing the web. Wikipedia has a list of well-known ports (incomplete, because new software can be written at any time and pick whatever new port it prefers), and you can see lots of VPN software on there. If you're connecting to a port that's known to be used by VPN software, we can guess that you're using VPN software.
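
To make that concrete, here's a toy lookup table in Python (the ports are the conventional defaults for these protocols; the list is necessarily incomplete):

```python
# Conventional default ports for common VPN protocols (far from exhaustive).
WELL_KNOWN_VPN_PORTS = {
    500: "IPsec IKE",
    1194: "OpenVPN (default)",
    1701: "L2TP",
    4500: "IPsec NAT traversal",
    51820: "WireGuard (default)",
}

def guess_from_port(port: int) -> str:
    return WELL_KNOWN_VPN_PORTS.get(port, "no obvious VPN association")

print(guess_from_port(51820))  # "WireGuard (default)"
print(guess_from_port(443))    # "no obvious VPN association"
```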

Once you're running VPN software on an unknown machine and have configured it to use a non-standard port, it's a bit harder to tell what's happening, but it's still possible to make a pretty confident guess. Some VPN setups use "split-tunnel", where some traffic goes over the VPN and some over the public Internet. (This is most common in corporate use, where private company traffic goes in the tunnel but browsing Lemmy would go over the public Internet.) Sometimes DNS doesn't go through the VPN, which is a big giveaway: you looked up "foo.com" and sent traffic to 172.67.137.159. Then you looked up "bar.org" and sent traffic to the same 172.67.137.159. Odds are that thing is a VPN (or other proxy).
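
That DNS-mismatch heuristic is easy to sketch in Python (the hostnames, the threshold, and the observation list are made up for illustration):

```python
# A sketch of the giveaway above: lots of unrelated DNS lookups, yet the traffic
# all flows to one destination IP -> that destination is probably a VPN or proxy.
from collections import defaultdict

observed = [  # (hostname the client looked up, IP it then sent traffic to) -- made-up data
    ("foo.com", "172.67.137.159"),
    ("bar.org", "172.67.137.159"),
    ("baz.example", "172.67.137.159"),
]

hosts_per_destination = defaultdict(set)
for hostname, dest_ip in observed:
    hosts_per_destination[dest_ip].add(hostname)

for dest_ip, hostnames in hosts_per_destination.items():
    if len(hostnames) >= 3:  # arbitrary threshold for illustration
        print(f"{dest_ip} looks like a VPN/proxy endpoint (serving {sorted(hostnames)})")
```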

Finally, you can just look at more complex patterns in the traffic. If you're interested, you could install Wireshark or just run tcpdump and watch your own network traffic. Basic web-browsing is very visible: you send a small request ("HTTP GET /index.html") and you get a much bigger response back. Then you send a flurry of smaller requests for all the page elements and get a bunch of bigger responses. Then there's a huuuuge pause. Different protocols will have different shapes (a MOBA game would probably show more even traffic back-and-forth).
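
As a toy illustration of what "shape" means here (Python; the packet sizes and thresholds are invented, not from any real classifier):

```python
# Toy classifier: does this flow's packet-size pattern look like web browsing?
def classify_flow(bytes_out: list[int], bytes_in: list[int]) -> str:
    if not bytes_out or not bytes_in:
        return "unknown"
    small_requests = all(size < 1_000 for size in bytes_out)   # small outbound requests
    large_responses = sum(bytes_in) > 10 * sum(bytes_out)      # much bigger responses
    if small_requests and large_responses:
        return "looks like web browsing"
    return "some other shape (steady back-and-forth, streaming, ...)"

print(classify_flow([300, 250, 280], [15_000, 60_000, 8_000]))  # looks like web browsing
print(classify_flow([900, 850, 880], [1_000, 950, 1_020]))      # some other shape
```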

You wouldn't be able to be absolutely confident with this, but over enough time and people you can get very close. Or you can just be a bit aggressive and incorrectly mark things as VPNs.
