this post was submitted on 20 Nov 2025
74 points (98.7% liked)

GenZedong




edit: get the script for yourself here https://codeberg.org/CritBase111/Comicify

Someone sent me the comic you see attached to the post, and I thought, hey, that seems like something you could automate. Put 4 images in a folder, run a script, and it automatically lays them out in a 2x2 grid.
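For anyone curious what "lays them out in a 2x2 grid" involves, here is a minimal sketch of the geometry only. It is not the code from Comicify.py: the function name `grid_boxes` is made up for illustration, it assumes every panel has the same size, it ignores borders, and it assumes the margin is simply the gutter times a multiplier (which is how the `--margin` help text below reads).

```python
from typing import List, Tuple

Box = Tuple[int, int, int, int]  # (left, top, right, bottom) in pixels


def grid_boxes(panel_w: int, panel_h: int, cols: int, rows: int,
               gutter: int = 15, margin_mult: float = 2.0
               ) -> Tuple[Tuple[int, int], List[Box]]:
    """Return the canvas size and one box per panel, in left-to-right reading order.

    Hypothetical helper: assumes equal-sized panels and no border.
    """
    margin = round(gutter * margin_mult)  # assumed meaning of the margin multiplier
    canvas_w = 2 * margin + cols * panel_w + (cols - 1) * gutter
    canvas_h = 2 * margin + rows * panel_h + (rows - 1) * gutter
    boxes: List[Box] = []
    for row in range(rows):
        for col in range(cols):
            left = margin + col * (panel_w + gutter)
            top = margin + row * (panel_h + gutter)
            boxes.append((left, top, left + panel_w, top + panel_h))
    return (canvas_w, canvas_h), boxes


# Four 400x300 panels in a 2x2 grid with the default gutter and margin.
size, boxes = grid_boxes(400, 300, cols=2, rows=2)
print(size)  # (875, 675)
for box in boxes:
    print(box)
```

An image library would then create a canvas of `size` filled with the gutter color and paste each panel at its box. A horizontal strip is the same call with `rows=1`, a vertical one with `cols=1`.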

With Crush and deepseek it basically writes itself. I'm actually kinda surprised at how well it's doing. There was just one bug that I sent to Claude to fix.

I'm putting the examples in an imgur link because they're still quite heavy on disk space and I don't want to overwhelm the server: https://imgur.com/a/bTiHV4b (yes the pictures make it kinda hard to understand but I'll try to make a real comic with it eventually)

It's difficult to get other people hyped about a work in progress lol but hopefully this makes sense.

It's a python script and this is how you'd run it:

options:
  -h, --help            show this help message and exit
  --folder FOLDER       Path to folder containing comic panels
  --panels {1,2,3,4}    Number of panels to arrange (1-4). Not required when using --all
  --gutter GUTTER       Gutter size between panels in pixels (default: 15)
  --border BORDER       Border size around panels in pixels (default: 0)
  --border-color BORDER_COLOR
                        Border color (CSS color name or hex code, default: black)
  --gutter-color GUTTER_COLOR
                        Background/gutter color (CSS color name, hex code, or "transparent"|"none", default: white)
  --margin MARGIN       Margin multiplier relative to gutter size (default: 2.0, meaning 2x gutter)
  --layout {horizontal,vertical,h,v}
                        Layout direction for 2 and 3-panel comics (default: horizontal). Use "h" or "v" as shorthand.
                        For 4-panel, use "h" or "v" for linear layouts, omit for 2x2 grid.
  --all                 Generate all possible combinations (1, 2, 3, 4 panels with all layouts)

Examples:
  python Comicify.py --folder "Comics/My first comic" --panels 4
  python Comicify.py --folder "Comics/My first comic" --panels 4 --gutter 20
  python Comicify.py --folder "Comics/My first comic" --panels 4 --layout h
  python Comicify.py --folder "Comics/My first comic" --panels 2 --layout v
  python Comicify.py --folder "Comics/My first comic" --all
  python Comicify.py --folder "Comics/My first comic" --panels 3 --gutter-color none
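
A help screen like the one above is what Python's standard `argparse` module prints. The following is a rough reconstruction of a parser that would accept the same options, not the actual source of Comicify.py. Making `--folder` required and the `resolve_layout` helper are assumptions added for illustration.

```python
import argparse
from typing import Optional


def build_parser() -> argparse.ArgumentParser:
    """Rebuild the documented options; defaults are taken from the help text."""
    parser = argparse.ArgumentParser(prog="Comicify.py")
    parser.add_argument("--folder", required=True,  # "required" is an assumption
                        help="Path to folder containing comic panels")
    parser.add_argument("--panels", type=int, choices=[1, 2, 3, 4],
                        help="Number of panels to arrange (1-4)")
    parser.add_argument("--gutter", type=int, default=15)
    parser.add_argument("--border", type=int, default=0)
    parser.add_argument("--border-color", default="black")
    parser.add_argument("--gutter-color", default="white")
    parser.add_argument("--margin", type=float, default=2.0)
    # Left as None when omitted so the 4-panel case can fall back to the grid.
    parser.add_argument("--layout", choices=["horizontal", "vertical", "h", "v"],
                        default=None)
    parser.add_argument("--all", action="store_true")
    return parser


def resolve_layout(panels: int, layout: Optional[str]) -> str:
    """Hypothetical helper: expand shorthands and apply the documented defaults."""
    if layout is None:
        return "grid" if panels == 4 else "horizontal"
    return {"h": "horizontal", "v": "vertical"}.get(layout, layout)


# Parse an explicit argument list so the sketch runs without a command line.
args = build_parser().parse_args(
    ["--folder", "Comics/My first comic", "--panels", "4", "--gutter", "20"])
print(args.gutter, resolve_layout(args.panels, args.layout))
```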

If this doesn't mean much to you: basically, you download the .py file and run one of the examples above. The --all flag generates every possible output (1, 2, 3, or 4 panels in all layouts). I also plan on adding "--reverse" to handle right-to-left order. You can see the examples in the imgur link; I included the sample code for some of them.

Deepseek wrote all of this, and it only costs like... a dollar or so. It's not the best at coding but it's so, so cheap that I fear I may spend all my savings on it lol. I didn't write a single line of code, just told it what I wanted (with more or less details) and it got it right on the first try. Now I'm just adding new functionalities to it.

I think one HUGE thing with LLM coding is it can go beyond what you know, so even if you can code it can do things you wouldn't necessarily do or know about.

I think this has uses for agitprop, if you want to make quick memes in a comic format for online or offline use. This is just one part of potentially a much bigger pipeline - I was thinking of having a simple gui afterwards you can send the output to, where you can add speech bubble stickers and write into them too, or like for one panel comics just write under it. Also with LLMs you can easily transpose this code to use javascript instead, or whatever else you want.

Oh if you send me a few pictures to comicify I can run them and show you the results. Script will not be available for download until it's fully ready.

top 28 comments
[–] CriticalResist8@lemmygrad.ml 1 points 1 day ago

Hell yes 70 upvotes on a comicifier script lol. I made a codeberg account, I just need to figure this thing out and I'll publish v1.

[–] Tabitha@hexbear.net 5 points 2 days ago (1 children)

I'm confused, is the art itself AI generated, or did you just ask an AI to assemble the paneling?

[–] CriticalResist8@lemmygrad.ml 7 points 2 days ago* (last edited 2 days ago)

So there's 3 layers basically:

  • The paneling is handled by a plain python script (Comicify.py); it's what assembles the images into the various layouts.
  • The script can use any image files you want as long as it's a png, jpeg, and so on.
  • The code for the script was written with an LLM (Deepseek), through an agentic client. An agentic client gives the LLMs tools to work on a codebase and basically understand it better and have some agency. So if you use deepseek on the web interface, it follows the User prompt -> Assistant generation -> Stop process. 1 input, 1 output, wait. You can make deepseek on the web interface code a script, but then you have to save the code, run it, report back any errors, repeat. With an agent, it takes care of debugging, figuring out the problems it comes across, etc. It does a lot more operations but it fixes everything by itself, and it's also able to work on more than one file at a time. It's kinda like having your own dev.
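
The "save the code, run it, report back any errors, repeat" loop from the last bullet can be sketched in a few lines. This is a toy illustration of the idea, not how Crush works internally: `run_until_clean` and `toy_fix` are made-up names, and `toy_fix` stands in for the call an agent would make to the model.

```python
import subprocess
import sys
from typing import Callable


def run_until_clean(source: str, fix: Callable[[str, str], str],
                    max_rounds: int = 3) -> str:
    """Run `source`; on failure hand the code and the error text to `fix`, then retry."""
    for _ in range(max_rounds):
        result = subprocess.run([sys.executable, "-c", source],
                                capture_output=True, text=True)
        if result.returncode == 0:
            return source  # the script ran without errors
        source = fix(source, result.stderr)
    raise RuntimeError("still failing after max_rounds attempts")


def toy_fix(source: str, error: str) -> str:
    # Stand-in for the model: a real agent would send `source` and `error`
    # to an LLM and use the code it returns.
    return source.replace("pritn", "print")


fixed = run_until_clean('pritn("hello")', toy_fix)
print(fixed)
```

On the web interface you are the one doing each round of this loop by hand; an agentic client runs the rounds itself and only stops to ask for permission.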

The images in the imgur link (the cats on the sofa) were quickly AI generated so I could have same resolution images for the prototype script, before I integrated resolution handling.

What we have now after all the LLM work has been done is a python script that you can run on your machine, and this is what produced the imgur link examples (and the other examples in this thread) - no AI is called by the script, it's plain python :)

The comic attached to the post is the one that gave me the idea for this script.

[–] ksynwa@lemmygrad.ml 7 points 2 days ago (1 children)

Sounds like something you should use imagemagick's montage for: https://imagemagick.org/script/montage.php

Just have to figure out the right flags to make it look like a 4-panel comic.

[–] CriticalResist8@lemmygrad.ml 4 points 2 days ago* (last edited 2 days ago) (1 children)

You could, but it's a bit overkill for what we want it to do and doesn't seem like it has quite the same functionalities for what we'd want this for (notably I couldn't find a margin parameter on imagemagick, but it does do a bunch of cool stuff).

An example case:

With the --all flag you can get all possible permutations (horizontal, vertical, grid, and for each # of panels) generated. With --margin you can have 'bleed' or margin around the picture, though you'd have to take it to another image processing program afterwards to add text (in one panel comics the text is usually added at the bottom)

Getting this image for example is as simple as: python Comicify.py --folder "Comics/My first comic" --panels 4 --border 6 --margin 3 --gutter 30 --layout h --gutter-color pink --border-color "#333333"

Or just a 4 panel grid with --folder and --gutter-color specified, everything else left to default values:

Since it's a .py script it's also possible for people to further edit it and include it in more involved workflows for example automatically opening a GUI to make quick edits (such as speech bubbles or a caption). That's about what I have so far and I should probably take a break lol but there's a lot more stuff I could add to it to streamline the work even more. Instant memes and newspaper comic styles for agitprop.

edit: imagemagick needs to get on webp! It has been pretty universally adopted by now and I just tested an output: .png -> 1602KB while webp -> 194KB with no noticeable difference in quality.
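
To put a number on that difference, here is the arithmetic for the two file sizes quoted in the edit above. `reduction_percent` is just a throwaway helper name.

```python
def reduction_percent(before_kb: float, after_kb: float) -> float:
    """Percentage by which the file shrank."""
    return 100.0 * (1.0 - after_kb / before_kb)


# Sizes from the test above: 1602 KB as PNG, 194 KB as webp.
print(f"{reduction_percent(1602, 194):.1f}% smaller")  # 87.9% smaller
```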

[–] ksynwa@lemmygrad.ml 5 points 2 days ago

There isn't a wrong way to do this. But magick montage can do all that except for the permutation thing, which you'd have to script in another language like bash or python. My point was that imagemagick is mature, battle-tested software. It's worth learning if you find yourself working with images often. I'd rather ask the LLM to figure out the right flags to use with it (because it's truly unintuitive) than have it punch out a script. But as I said, there isn't really a wrong way to go about this.

[–] CriticalResist8@lemmygrad.ml 3 points 2 days ago

Some permutations on the original comic (by @christoperro). It didn't splice the image itself, I did that manually. But these outputs were made with the tool.

Webp optimization (the script outputs different file types, this webp is only 114kb, PNG is 2.6MB), same grid format:

Same output but in horizontal format:

border, background color and margin just to show possibilities:

With the --all flag it spits out every permutation possible (1-4 panels; horizontal, grid, and vertical for each). I think the next step is adding it to my PATH variable so that you just cd to a folder, type 'comicify', and it will do all permutations with the default settings. Maybe a YAML file to change the settings.
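
As a rough idea of what --all has to enumerate, going by the help text in the post (2 and 3 panels come in horizontal or vertical; 4 panels add the 2x2 grid), this sketch lists the combinations. It is a guess at the logic rather than the script's real code, and the layout label "single" for one panel is invented here.

```python
from typing import List, Tuple


def all_combinations() -> List[Tuple[int, str]]:
    """List (panel count, layout) pairs per the documented layouts."""
    combos: List[Tuple[int, str]] = [(1, "single")]  # "single" is a made-up label
    for panels in (2, 3):
        combos += [(panels, "horizontal"), (panels, "vertical")]
    combos += [(4, "horizontal"), (4, "vertical"), (4, "grid")]
    return combos


for panels, layout in all_combinations():
    print(panels, layout)
```

That comes to eight outputs per folder under these assumptions.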

Also cool for graphic designers to frame a single picture nicely, because this takes so long in photoshop compared to what it does.

[–] Orcocracy@hexbear.net 3 points 2 days ago* (last edited 2 days ago) (3 children)

You could automate this sort of thing with just a few clicks in a ~30-year old copy of photoshop using less computer power than a modern dishwasher cycle. Maybe vibe coding a script with an LLM wasn’t necessary.

I don’t mean to be rude to anyone personally, but I feel that generative AI usage should generally be frowned upon.

[–] CriticalResist8@lemmygrad.ml 10 points 2 days ago (1 children)

Making that photoshop copy also used power at the time. But thankfully this was coded by Deepseek in China so it runs on renewable solar straight from the People's Republic :)

[–] Orcocracy@hexbear.net -2 points 2 days ago (1 children)

Yeah, LLMs and generative AI are extremely problematic for lots of reasons, but critical support for DeepSeek. It does actually address many of those issues and perhaps most importantly shows the folly of the western approach to AI.

[–] KrasnaiaZvezda@lemmygrad.ml 6 points 2 days ago (1 children)

We humans do spend energy too you know, and coding that by hand takes our time too, which could be used for better things like talking to other people about their interesting project or whatever else...

[–] Orcocracy@hexbear.net 2 points 2 days ago

Yes I do know, thank you. Note how I said “a few clicks”. Image batch processing is not something that has needed to be hand coded for a long time.

[–] kredditacc@lemmygrad.ml 3 points 2 days ago* (last edited 2 days ago) (1 children)

Python requires a computer to run it.

I imagine there must be some quick *.github.io static site (written in JavaScript) that does the exact same thing.

[–] CriticalResist8@lemmygrad.ml 3 points 2 days ago

It wouldn't be too difficult to transpose the script to do all of that! It's true that people want phone-accessible solutions. A JS implementation is probably going to be next.

[–] bloubz@lemmygrad.ml 2 points 2 days ago (2 children)

What's crush? All I can find is a NSFW chat with an AI crush

[–] CriticalResist8@lemmygrad.ml 7 points 2 days ago (1 children)

lol they picked their name well. It's this agentic tool: https://github.com/charmbracelet/crush, one command install

[–] bloubz@lemmygrad.ml 4 points 2 days ago (1 children)
[–] CriticalResist8@lemmygrad.ml 6 points 2 days ago (1 children)

Works out of the box with deepseek by the way, but be aware, it's very scary using it at first. Launch crush from shell in a specific folder, it will ask for model, you can type Thinking to find Deepseek, it will ask for API key, paste and go. You can use other models but 5 dollars in deepseek will take you a month or more to go through lol.

Also either keep backups locally or git (I have to get on to learning that) because it may break stuff sometimes (they're just like us). And it's good practice anyway.

[–] RedWizard@hexbear.net 2 points 2 days ago (1 children)

Oh this thing is very dangerous. A diligent imp to build all the stupid ideas I have in my head but not enough energy to invest in for pennies an hour? Just tested this out with deepseek and godot. Once you give it some access it is a diligent little homunculus. It knows how to test the godot code all by itself...

[–] CriticalResist8@lemmygrad.ml 3 points 2 days ago (1 children)

Wait until you realize it can browse the Internet... Lol

And be careful of it escaping containment. I tried to get it to help debug the LSP I was installing, and it just went and looked into my appdata folder. Thankfully it asks for permission before running new commands (I said yes because I was curious). You might also be able to get an LSP for godot's languages to make crush communicate with it, but I'm not sure my python LSP made much of a difference. Still, it doesn't cost anything to add it.

Here is some data analysis that deepseek+crush made of a reddit community with python:

I wasn't the one who gave it the words memes, personal stories, positive negative etc. It decided on that itself from the scraped data and produced this visualisation.

[–] RedWizard@hexbear.net 2 points 2 days ago (1 children)

I was just wondering if it could scrape the internet.

[–] CriticalResist8@lemmygrad.ml 3 points 2 days ago (1 children)

I just made a new community !crushagent@lemmygrad.ml for this including a post about it! It can do some websites and specific links but I'm not sure if it can just make a Google search.

[–] RedWizard@hexbear.net 1 points 1 day ago (1 children)

Yeah I'm not sure either. It def searches Wikipedia, which eventually led it to marxists.org for some questions. I wonder how hard it would be to build search tools for other sites. Could make a prolewiki tool. I've already spent 0.50$ fucking around 😅. I have some other ideas to try out today. Cleaning up OCR text from a PDF for example. The text is normally too large to upload but with an agent it could break the task up. I've also tried using LLMs to jump start an SQL query in a DB I'm not fully versed in but have a data dictionary for. Since this thing can grep text files it can scan a massive data dictionary for table names and field names. Just giving it a prompt like "tell me about this project" is interesting to watch. It gets a directory listing, reads files, summarizes the whole thing. With the project I was tinkering with, I had it periodically keep its own notes in an MD file to reference the next session.

It's way more useful than just interacting with a chat window. I wish I had a web interface though just for spell checking lol.

[–] CriticalResist8@lemmygrad.ml 1 points 1 day ago

I just made a community earlier today to share stuff about using crush: https://lemmygrad.ml/c/crushagent

There's MCPs you can install and a whole list here https://mcpservers.org/ (both proprietary and local) but the one I tried to install didn't really work and I'm not sure what the problem is lol. MCPs allow your agent to communicate with specific websites in specific ways, there might be one for the MIA but I got one for arxiv for example, and I'm sure there's bound to be a wikipedia one. It can help enhance what the LLM is able to scrape since it provides a structured way to access it.

I’ve already spent 0.50$ fucking around

Lol I hate seeing that number go up but imagine how much you'd have spent with GPT or claude! Instantly blown through 10 dollars or more with those two.

Actually for the pdf try mistral API, it has a free option (but they train on your inputs). I could send you my script, it's meant for translation but all it does is break a text file in chunks, send a chunk to mistral API with a custom prompt attached, then save the result to an output text file. With crush you could have it repurpose the script for your needs. Or abbyy OCR if you have windows (I can also send it to you), it's probably the best OCR tool currently.
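
The chunk-and-send approach described above looks roughly like this. It is a sketch of the general pattern, not the actual script: the names `chunk_text`, `process_text` and `process_file` are invented, and `send` is a placeholder for whatever function calls the API, so no real Mistral call is shown.

```python
from pathlib import Path
from typing import Callable, Iterator


def chunk_text(text: str, max_chars: int = 4000) -> Iterator[str]:
    """Yield pieces of at most max_chars, cutting at a paragraph break when one exists."""
    start = 0
    while start < len(text):
        end = min(start + max_chars, len(text))
        if end < len(text):
            cut = text.rfind("\n\n", start, end)
            if cut > start:
                end = cut + 2  # keep the blank line with the chunk before it
        yield text[start:end]
        start = end


def process_text(text: str, send: Callable[[str, str], str], prompt: str,
                 max_chars: int = 4000) -> str:
    """Send each chunk with the same prompt and join the results in order."""
    return "".join(send(prompt, chunk) for chunk in chunk_text(text, max_chars))


def process_file(in_path: str, out_path: str, send: Callable[[str, str], str],
                 prompt: str) -> None:
    """Read a text file, process it chunk by chunk, and save the output."""
    text = Path(in_path).read_text(encoding="utf-8")
    Path(out_path).write_text(process_text(text, send, prompt), encoding="utf-8")


# Stub in place of an API call, so the sketch runs offline.
demo = process_text("para one\n\npara two\n\npara three",
                    lambda prompt, chunk: chunk.upper(), "clean this up", max_chars=12)
print(demo)
```

Swapping the stub for a real API client and changing the prompt is all it would take to go from translation to OCR cleanup.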

[–] ksynwa@lemmygrad.ml 3 points 2 days ago

All I can find is a NSFW chat with an AI crush

This shit is everywhere.

[–] TankieReplyBot@lemmygrad.ml 1 points 2 days ago

An Imgur link was detected in your post. Here are links to the same location on alternative frontends that protect your privacy.