this post was submitted on 07 Jun 2026
414 points (98.6% liked)
Technology
I've been experimenting with agentic coding for the past couple of weeks. The task is to write a data scraper for a report file produced by a commercial tool I have to use for work.
It's a pain of a format because it wasn't designed with machine parsing in mind. It's verbose, contains loads of redundant parts, and doesn't have good delimiters around the data. It's big too: 500MB uncompressed, so we keep the files gzip'd.
All reasons why I don't want to write the code to do it.
The model identifies the file format without me saying where it came from, but it sits in this loop:
It does this for hours.
The tiny bits of code I've actually managed to get out of it are really bad. It's like the code you'd get back from some race-to-the-bottom offshore software "team" you were forced to work with 10-15 years ago because your boss had found an "amazing opportunity". In actuality it was somebody's teenage nepo-hire. Similar adherence to rules and standards too.
I already have a rough data scraper for this file. It's a couple of hundred lines of Python that I wrote in an afternoon. It's not great, and it doesn't get everything I want out. However, it exists and is usable. This isn't an intractable task.
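To give a rough idea of the shape of such a scraper, here is a minimal sketch. It is not the actual script: the real report format isn't shown here, so the `key : value` line pattern, the `FIELD_RE` regex, and the `scrape_report` name are all made up for illustration. The only real point is that reading the gzip'd file as a stream keeps memory use flat even at 500MB uncompressed.

```python
import gzip
import re

# Hypothetical pattern: the real report format is not shown, so this
# assumes the lines of interest look like "Some Name : 123.4".
FIELD_RE = re.compile(
    r"^\s*(?P<key>[A-Za-z][\w ]*?)\s*:\s*(?P<value>-?\d+(?:\.\d+)?)\s*$"
)


def scrape_report(path):
    """Yield (key, value) pairs from a gzip'd text report, one line at a time."""
    # Mode "rt" decompresses on the fly and decodes to text, so the
    # whole file is never held in memory or unpacked to disk.
    with gzip.open(path, "rt", encoding="utf-8", errors="replace") as fh:
        for line in fh:
            match = FIELD_RE.match(line)
            if match:
                # Lines that don't match (banners, redundant text) are skipped.
                yield match.group("key"), float(match.group("value"))
```

Because `scrape_report` is a generator, the caller can filter or aggregate as it goes, e.g. `for key, value in scrape_report("report.txt.gz"): ...`, where the filename is again just a placeholder.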