Discussion of Free and Open Source Text-to-Image/Video Models
Prev: >>106193870

https://rentry.org/ldg-lazy-getting-started-guide

>UI
SwarmUI: https://github.com/mcmonkeyprojects/SwarmUI
re/Forge/Classic: https://rentry.org/ldg-lazy-getting-started-guide#reforgeclassic
SD.Next: https://github.com/vladmandic/sdnext
ComfyUI: https://github.com/comfyanonymous/ComfyUI
Wan2GP: https://github.com/deepbeepmeep/Wan2GP
>Checkpoints, LoRAs, Upscalers, & Workflows
https://civitai.com
https://tensor.art
https://openmodeldb.info
https://openart.ai/workflows
>Tuning
https://github.com/spacepxl/demystifying-sd-finetuning
https://github.com/Nerogar/OneTrainer
https://github.com/kohya-ss/sd-scripts/tree/sd3
https://github.com/derrian-distro/LoRA_Easy_Training_Scripts
https://github.com/tdrussell/diffusion-pipe
>WanX
https://github.com/Wan-Video
2.2 Guide: https://rentry.org/wan22ldgguide
https://alidocs.dingtalk.com/i/nodes/EpGBa2Lm8aZxe5myC99MelA2WgN7R35y
>Chroma
https://huggingface.co/lodestones/Chroma1-Base/tree/main
Training: https://rentry.org/mvu52t46
>Illustrious
1girl and Beyond: https://rentry.org/comfyui_guide_1girl
Tag Explorer: https://tagexplorer.github.io/
>Misc
Local Model Meta: https://rentry.org/localmodelsmeta
Share Metadata: https://catbox.moe | https://litterbox.catbox.moe/
Img2Prompt: https://huggingface.co/spaces/fancyfeast/joy-caption-beta-one
Samplers: https://stable-diffusion-art.com/samplers/
Txt2Img Plugin: https://github.com/Acly/krita-ai-diffusion
Archive: https://rentry.org/sdg-link
Bakery: https://rentry.org/ldgcollage
>Neighbours
https://rentry.org/ldg-lazy-getting-started-guide#rentry-from-other-boards
>>>/aco/csdg
>>>/b/degen
>>>/b/realistic+parody
>>>/gif/vdg
>>>/d/ddg
>>>/e/edg
>>>/h/hdg
>>>/trash/slop
>>>/vt/vtai
>>>/u/udg

>Local Text
>>>/g/lmg

>Maintain Thread Quality
https://rentry.org/debo
Remove Chroma from the rentry.
>>106197528 (OP)
>Local Model Meta: https://rentry.org/localmodelsmeta
>Edit: 07 Aug 2025
>I haven't updated this in a while. Sorry. I've been busy. I'll try to get back to it over the next couple of weeks, same with the Wan rentry. If not, someone else can take over.
nice
>>106197549
>Afraid to quote
>>106197528 (OP)
>>Chroma
>https://huggingface.co/lodestones/Chroma1-Base/tree/main
Update to https://huggingface.co/lodestones/Chroma1-HD/tree/main
and bring back civitaiarchive. Maybe throw the Wan2.1 guide back in.
What can cause a washed out effect on wan2.2 generated videos? I've already ruled out a faulty vae.
CFG > 1? Torch compile?
so uh I downloaded 50a, but is there a better one? 49 is better somehow??
>>106197579
maybe download 50 non-a?
Reddit says there are more Chroma versions coming? I thought v50 was the last one.
>>106197579
48 is correct, imo
Poll (from last thread):
https://poal.me/z3ek04
https://poal.me/z3ek04
https://poal.me/z3ek04
https://poal.me/z3ek04
>>106197002
trying to figure this out right now anon, any luck on your end? atm im trying to find the set ratio for snr and tmoe, then to see how to read it in comfyui
>>106197591
the checkpoint training is done but anyone can still fuck with it if they're so inclined
>>106197718
Nope sadly, no one seems interested in that. Which is weird, people just randomly chose "half the steps" as the way.
>>106197718
try the reddit thread he grabbed it from
>>106197728
yeah super odd, seems like something that's quite important to the whole architecture lol
and it seems fairly easy, unless i am completely misunderstanding it and it's not actually readable or something
>>106197733
I got it from the github: https://github.com/Wan-Video/Wan2.2?tab=readme-ov-file#introduction-of-wan22
If there is a reddit thread about it and how to do that, please share it.
>>106197728
yeah same on my side, people just trying stuff and recording what happens
found this browsing around tho, seems to be some active discussion going on and people experimenting with it
>https://www.reddit.com/r/StableDiffusion/comments/1mkv9c6/wan22_schedulers_steps_shift_and_noise/
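>>106197718
if the boundary really is just a timestep cutoff (0.875 for t2v, 0.9 for i2v in the official configs, if I'm reading them right -- verify against the repo), you can compute the high/low split for any step count and shift instead of eyeballing "half the steps". rough sketch, assuming the standard shifted linear sigmas:

# count which sampling steps land on the high-noise expert,
# given a boundary sigma (assumption: boundary compares against sigma = t/1000)
def shifted_sigmas(steps, shift=8.0):
    sigmas = [1 - i / steps for i in range(steps + 1)]
    return [shift * s / (1 + (shift - 1) * s) for s in sigmas]

steps, boundary = 20, 0.875  # t2v value, check the Wan2.2 readme
high = sum(s >= boundary for s in shifted_sigmas(steps)[:-1])
print(f"{high} high-noise steps, {steps - high} low-noise steps")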
>flux dev
>flux krea
>chroma
>wan
which is best for realism?
>>106197822
That chart seems to be without the light2x lora since the steps are 10+. Does it translate 1:1 into a 4-step total run?
>>106197822
Interesting, it's somewhat active too.
>>106197837
Not sure, give it a shot and use the good ole eyeball test
>>106197851
Wan is so good man
>>106197851
so close, it's come a long way anon
>>106197578
Only way to not get that washed out effect is to use lightx2v, which makes no sense...
>>106197851
crazy how this random qwan image keeps popping up
>>106197875
I've been using it to test WFs cause it's such a good test of prompt following / motion quality when using the light lora
gib wan 2.2 t2i workflow pls
>>106197886
https://files.catbox.moe/hmdb2p.json
https://github.com/Extraltodeus/Skimmed_CFG
Anything recent like that exists?
>>106197823
Qwen really doesn't work as advertised with text.
>>106197949
I'm using fp8, I only have 24GB VRAM. I suspect the full weights are superior, but fp8 is still a qualitative step above any other open model for text. It has some weird quirks though, and that issue is compounded by how slow it runs.
====PSA PYTORCH 2.8.0 (stable) AND 2.9.0-dev ARE SLOWER THAN 2.7.1====
tests run on rtx 3060 12gb / 64gb ddr4 / i5 12400f, driver 570.133.07, cuda 12.8
all pytorches were cu128
>inb4 how do i go back
pip install torch==2.7.1 torchvision==0.22.1 torchaudio==2.7.1 --index-url https://download.pytorch.org/whl/cu128
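to check what you're actually on before/after (assuming the venv is active):
python -c "import torch; print(torch.__version__, torch.version.cuda)"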
maybe the new shit just runs ass on your outdated shit gpu nigga
>>106198083
>tests ran on rtx 3060
jesus christ, aren't there standards to post here? like a height limit or something?
>>106198108
im 5'2" btw
>>106198105
maybe do a test and show if there's any change? 3090 chads might benefit from this info
I'm having far too much fun with this
it's too late to go to bed, may as well stay up
>>106198115
doesnt it feel like magic anon, im not sleeping either
>>106198115
so many questions
what was she doing confronting a pimp
did she know the pimp was a wizard
why is the pimp a wizard
what's in her handbag
how's she gonna get home
>>106198112
>3090 chads
>5 year old gpu
peasants you mean
I've got 2 3090s but I am starting to feel their age. Having to do silly little things to make certain things work where it just werks on a 4090. I think I'll just fork out for a 5090 when the time comes.
>>106198212
like what? i never have to do silly little things on my 3060, everything just werks
chroma, flux, krea, wan, hunyuan, sdxl tunes
LLMs
I tried a few times for the sign to read "planned parenthood" but it wouldn't work with me.
>>106198231
Torch version not playing well with multi-GPU training. No fp8, no access to some of those sweet fp8 speed optimizations.
They are getting old.
>>106198242
yea i feel u, only thing that doesnt work for me is sageattn2++ because it needs 4000+
i guess since you're training loras it must be tougher, im just a consoomer
animate your fav old gens, super fun
>>106198265
If it trains it trains, and it probably will for a few more years. It's just, like, not fun looking at people having more fun and not quite having enough fun to justify sinking the cash to match them.
>>106198242
shit, I remember when I saw fp8 on my 4090 vs 3090. night and day. then I got a 5090 and my fucking 4090 felt slow. march of godforsaken progress.
>>106194872
uh anon can I get a proooompt for that effect...
>>106198212
If you're gonna spend the money on a 5090, get a 4090 with 48gb. I have an A6000 and I've realised 2 things: I wish I bought a 4090 48gb, and I would buy a 96gb card immediately if I could.
>>106198347
mmm long legs i want to fuck that
>>106198356
The 48gb 4090s are just bootleg chinese soldered 4090s right? Do they have a warranty?
>>106198376
Yeah, that's basically what they are. No warranty, or rather you'd be at the inconvenience of dealing with shipping and whether the seller honours the warranty. I'm looking to buy one but it's only because I want to speed up gens. I have a 4090, 3090, and A6000. If it fits on the 4090 I run it there, otherwise I use the A6000. Really, I want a Blackwell 96gb but I just can't justify the cost (yet).
A man holds an ak-47 and fires it at the camera
basic test but works
>>106198356
>>106198376
you'd have to be an actual retard to buy a *used* 4090D that was hacked up by some chinese fuck who got the job cause he sucks good dick, and then resold through a sketchy ass website for 3k usd.
>>106198356
>>106198481
I'm using my work budget to buy some RTX 6000 Pros
>>106197832
Wan, then Chroma, then Krea
>>106197528 (OP)
>>106195231
>>106195316
Huh? So it turns out Chroma HD has to be prompted at a higher res, bros. Really dumb but makes sense, it gets rid of the slop look entirely
in comfy video nodes what is crf and what is max quality?
>>106198538
video encoding parameters, like you'd use in ffmpeg: CRF is the constant rate factor (lower = higher quality and bigger files)
>>106198537
Yeah, that was it. It's a bit silly. But then the question is, is limb accuracy and overall image coherence significantly higher for it to make a difference? And is there more detail?
Chroma users are so fucking cooked right now.
>>106198500
flux dev is better if you know how to use it
the man in the grey jacket grabs a gun and fires it at the people behind him, who fall down on the ground.
NOT SO FAST!
>>106198588
What do you use to make these comparisons?
wan doesnt like violence without loras
the man in the grey jacket does a karate chop to the people behind him, who fall down on the ground.
>>106198490
attack of the anime yap mode, first time seeing it for me
>>106198642
Don't specify that they fall, just say the man in the grey jacket shoots the other people or something like that
>>106194877
Insane result on 1152, bros. I say we are so back, we have never been more back! Not only is the count correct, it's adhering to the prompt perfectly!
https://files.catbox.moe/h8jy5o.png
>>106198588
How high did you have to go?
>>106198660
nice gen, cute youmu
>>106198672
nvm, just saw, 1152, that's not a large increase
okay ive managed to improve the fps with interpolation in kijais workflow like so:
>>106198699
result: 16 to 32 fps
>>106198704
try "roundhouse kick"
>>106198623
Used Python to create a script, plus a bat file that I edit to run it. Not the best way to do it since it could all be made into a single file, but here you go if interested:
https://files.catbox.moe/ovqm80.py
https://files.catbox.moe/nxw5hu.py
https://files.catbox.moe/3jpvfb.bat
you create a ./venv in the current folder and edit the bat file with the names, then run png2jpg: "python .\png2jpg.py --files .\my_final_comparison.png --output-dir ." (could also be automated with the bat file, and output file names could be randomized)
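(assuming the scripts only need Pillow -- check their imports -- the setup would be something like:)
python -m venv venv
venv\Scripts\activate
pip install pillow
python .\png2jpg.py --files .\my_final_comparison.png --output-dir .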
>>106198702
I have a 4080, not gonna spend $5000 on a 5090 for a few fps or slightly bigger 720p
>>106198708
>python
Gemini*
beaten by psychic powers:
what is "max" crf value
wish I didn't need to be a rocket scientist to decipher all these nodes and values
>>106198679
yeah takes just a few secs extra on my 3090, though it's possible to go lower to like 1096 (but I wouldn't go too low because I see more bokeh). If I go higher it's possibly less slopped then? Not sure, sucks to be capped by the 3090 in that regard.
>>106198807
Can qwen do ponos and vagoo? If not, how easy is it to train on compared to flux?
>At a fast food restaurant. Camera pans back. An overweight man from Ohio foams at the mouth in excitement.
Background looking more coherent now
film vfi interpolation at x3 seems to work okay; default is 16fps, so 48fps:
more generated frames can mean more artifacts though so can adjust to preference.
>>106198026
As a fellow 24GB peasant, use Q8 GGUFs instead of fp8, the quality is much closer to the full model.
>>106198874
Try GIMM VFI, it handles fast motion a bit more accurately. It's also slower though
how can I use qwen image on a 16gb gpu?
Is there a way to stop comfyui from unloading models into system memory? WAN is eating up my RAM like crazy and won't release it. Wish it would just load whatever model it needs at that moment off my nvme.
>>106198925
back to default with no interpolation (16fps), I think ill stick with this cause interpolation can cause issues with high motion gens.
>>106198115
upload this LoRA on civitai, my nigga
>>106198942
*also, it added like a minute to my gen process, 188 seconds vs 130.
Well, geez. That took a bit.
Here are some fuck-ass large plots for Chroma HD non-annealed.
Why? Why not.
Titty fairy: https://files.catbox.moe/ni9b5o.jpg
Trash girl: https://files.catbox.moe/poj393.jpg
Rat girl: https://files.catbox.moe/bmo5yu.jpg
Mixed media: https://files.catbox.moe/dd8jka.jpg
Oil: https://files.catbox.moe/dfr3aa.jpg
Full subplots again on: https://unstabledave34.neocities.org/comparisons
>>106198940
start comfyui with the parameters:
--cache-none --disable-smart-memory
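i.e., assuming the usual git checkout, the full launch line would be:
python main.py --cache-none --disable-smart-memory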
>>106198961
chroma is a joke
>Trying to calculate how many samples I need to conclude that a new embedding/vae/text encoder/etc. makes gens of comparable quality or better than not using it at least 65% of the time (an arbitrary threshold I picked to deem some method/model worthwhile for making slop).
>With the bog standard 95% confidence interval and 5% margin of error I need THREE HUNDRED FUCKING FIFTY sample pairs to conclude that with reasonable confidence.
>Even lowering the confidence interval to an unusually low 85% and increasing the margin of error to a sloppy 10%, realistically as much as I can stretch before the whole experiment becomes borderline worthless, I need 48 sample pairs to test any random embedding, vae, prompting method or whatever I see on Civitai.
So how do you test stuff?
I tried the "just roll with a dozen samples and call it a day" thing in the past and it actually misled me into doing some BS that I later realized was in fact not making gens better. That's why I think more rigorous testing is necessary, though that demands more time and effort investment than this $0/h hobby warrants, I think.
I am in a conundrum.
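Your numbers check out, for what it's worth; the standard normal-approximation sample size for a proportion reproduces both:

from math import ceil
from statistics import NormalDist

# n = z^2 * p * (1 - p) / E^2, with p = the 65% "is it better" threshold
def pairs_needed(p, confidence, margin):
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # two-sided z-score
    return ceil(z * z * p * (1 - p) / margin ** 2)

print(pairs_needed(0.65, 0.95, 0.05))  # 350
print(pairs_needed(0.65, 0.85, 0.10))  # 48

a dozen samples is nowhere near enough to tell a 65/35 split from a coin flip, which is exactly why it misled you.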
>>106198942
now it seems more natural in general.
>>106198964
>Full subplots again on: https://unstabledave34.neocities.org/comparisons
Nice site. And neat picrel.
TY for the grids.
>>106198977
It literally directly improved, anon. lodestone has delivered. Ngl, after trying it at 1024 I doubted him for a sec, but when increasing the res it's a different beast. This may not be entirely coherent, but it's a lot more coherent than whatever incoherent garbage we were getting before for images in this size and orientation, especially with the backgrounds and subjects.
>>106198807
Wait till Plebbitors hear of Chroma HD at 1152
>>106198943
forgot civit was blocked in this country, what a pain in the ass
civitai.com/api/download/models/2095264?type=Model&format=SafeTensor
there we go, camera motion.
man holds up a plate with a mcdonalds cheeseburger and mcdonalds fries and smashes the plate on the table in front of him, while looking upset. the camera zooms out to show a wooden table.
Whenever I make a video in Wan, it generates a png file too. Any way to stop it from doing this?
>>106199067
Whatever image processing nodes you are using for i2v, one of them is saving the file. Find and replace it.
>>106199028
https://files.catbox.moe/3gmao9.png
Looking for other anons that are using local-AI for their gamedev/art projects. In cartooning or anime styles, is the best option still just something like Illustrious? I feel like I'm constantly in a trade-off between quality and prompt adherence. Models like ILL have great quality and customization support with loras, but I fucking hate how character-focused they are. Even with negative prompts, I too often struggle to get a single object to render without a superfluous butt, tiddy, or hand.
Is there a model of similar quality that is more tuned for environmental details and objects over characters? Loras haven't been cutting it.
>pic not related, just something I put together for fun
>>106198985
I usually gen in a series of batches that I square exponentially. If the change didn't obviously degrade quality in 2 batches, then I do 4 batches, then 8 and so on. Once it's at >200 images per iteration, I assume it's fine enough and feel okay letting it run overnight. The only advice I can give is try and find any ways you can to shave a few seconds per image generation. It saves so much fucking time over a large run.
>>106199190
Was the "theme" retarded again?
Like:
>Anime
>Kneepad (Anime)
>Anime
>>106199108
Based cat, off to shitpost on /ldg/
Contortionist prompt test. Magic extra hand there on this seed but I'll take it
>>106199288
Lot less likely to fuck up on seeds/situations it likely would have before
>>106199301
Come again anon?
>/ldg/ completely taken over by a disgusting 3dpd footfag degenerate who should be on /b/
This is why I'm so glad we have /adt/.
>>106199132
Try qwen image
>>106199385
>3dpd footfag degenerate
Chroma doubters like yourself have been BTFO'd quite hard. What do you have to say now about this unprecedented coherence?
Also, the containment thread is that way
>>106199421
there is weird stuff all over that image
>>106199385
Please stay there
>>106199421
i don't care about "chroma".
I don't like your disgusting foot fetish images. That's all.
what a horrible horrible thread
man puts on a baseball cap saying "RETARD" in playful text.
pretty fast at 640x480 with proportions set to crop, 98 seconds.
>A gravure photoshoot of a Japanese woman poolside. A Japanese woman with large breasts is wearing a one-piece swimsuit, sitting poolside, while doing the splits. She is holding one of her legs above her head. Her hair is long and black and she is wearing mascara and eye-liner.
>Ultra HD, 4K, Gravure, AV, JAV, Realistic, Professional.
qwen image sure is fun
>>106199492
>I don't like your specific gens because foots are disgusting
>I'm not straight, I obviously like hairy men
>I don't like seeing women, period
>Let me just come to your thread, shit it up and talk about it
>I'm also a tranny btw
Ok.
Btw anons, night time photos look a lot cleaner now. Before, they occasionally had that weird melted feel to them, but now they've gotten a lot sharper and better looking.
https://xcancel.com/bdsqlsz/status/1954109353209819277
>>106199108
Make the cat wear a kippah and do the hand rubbing gesture.
why isn't sage attention making qwen image faster?
I said kicks the wooden desk...
>>106199132
I think Flux with loras produces better pixel art than Illustrious does. The hues/colours tend to pop a lot better, while Illustrious pixel art's colours usually look a lot more washed out and flat/boring. The structure/lineart is incredible in either model though.
Illustrious does have a no_humans tag you might want to take advantage of, for producing objects. Also: white_background, simple_background.
>>106197871
OK, I found out what it was: it was loras stacking and some of them being weighted too high.
For some reason it made everything look washed out. Tweaking the weight down solved it.
>>106199665
also put "nsfw" in the negative prompt when using hentai models to produce non-hentai
>>106198265
>sageattn2++ because it needs 4000+
It worked perfectly fine on my 3090. I don't think there is any limitation outside of the minimal CUDA version required (12.8+).
>>106199618
It doesn't really do much for image generation at all, not sure why.
Perhaps SageAttention3 will, since it's a larger performance optimization than the previous one; it will only work on Blackwell cards though.
>prompt: a girl
>output a chinese loli
so this is the power of qwen image
>>106199719
Probably what 99% of qwen prompters were hoping for, so they just made it easy
does the negative prompt actually do something these days? I feel like fewer and fewer new models take the neg prompt seriously and straight up ignore it
>>106198396
>want a Blackwell 96gb but I just can't justify the cost (yet).
I still don't get the appeal unless you batch image generations to get to full vram use.
From what I've seen, if you have fast ram, offloading to it doesn't cost much more in generation time.
And getting a 5090 + 128GB of fast ram is still so much cheaper than just 1 6000.
I was trying to find out if using it would speed up wan video generation in fp8 or allow batching them, but the rare people who have this card don't seem to use it for videogen.
>>106199539
It is good and can follow the prompt great, but is a wee bit slopped. I'd love a Chroma-style tune for Qwen, but it is too much since that model is huge. Chroma will be the realism SOTA for years to come.
>>106199719
you asked for a girl, you got a girl
you gotta be precise
>>106199766
I started from a 4060ti 16GB. Then bought a 4090, then bought a 3090, then bought an A6000. I remember each time I said "this will be enough." Aside from that, for training purposes it will be nice, and for video/image generation it allows for higher resolution gens + using full weights. I'm using full qwen-image weights and it takes around 40GB with the text encoder loaded. I also use hunyuan3d 2.1 and that takes 40+.
All in all, it's not a wise purchase, but to me I'd not waste my money on a 5090 (32 isn't all that more useful than 24). Having all this VRAM, my experience is 24gb is the minimum, then you need 48 for it to be meaningful in use, and after that you can't get enough.
I use LLMs, image gen, video gen, as well as train. A single 5090 (4090 is enough really) + 128GB RAM is cheaper, but it has its limitations.
so i should go with cuda 12.7, not 13.0 right?
>>106199796
haven't run chroma yet, but so far the prompt adherence is much appreciated over base flux. Also has a lot of style flexibility. But yes, some of the images definitely look sloppy (but so does chroma from what I've seen shared).
>>106199760
negative prompts are very useful for improving quality, but the reason many models these days aren't supporting them is that cfg 1.0 gets you double the generation speed, and at cfg 1.0 the negative prompt is ignored.
so it's a matter of hardware speed and us trying to get better performance that's causing us to lose out on negative prompts.
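it falls straight out of the guidance formula; a minimal sketch of why cfg 1.0 kills the negative:

# classifier-free guidance combines two model passes per step:
#   out = uncond + cfg * (cond - uncond)
# at cfg == 1.0 this collapses to out == cond, so the negative/uncond
# pass contributes nothing and samplers skip it entirely -- that's the
# 2x speedup, and why the negative prompt is silently ignored
def cfg_combine(cond, uncond, cfg):
    return uncond + cfg * (cond - uncond)

assert cfg_combine(cond=0.8, uncond=0.3, cfg=1.0) == 0.8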
>>106199810
>Then bought a 4090, then bought a 3090, then bought an A6000
And you crammed all of this in the same computer?
Does it work well?
I have a 3090 + 5090 in mine and it's already annoying for specific stuff.
>Aside from that, for training purposes it will be nice
Oh yeah that's one use case I don't really care about, but for sure this one benefits from more vram.
>but to me I'd not waste my money on a 5090 (32 isn't all that more useful than 24).
Vram wise no, but compute wise it is.
Offloading with fast ram is pretty ok in my experience for inference use.
>A single 5090 (4090 is enough really) + 128GB RAM
I'd argue here that the 5090 speedup is quite substantial vs the 4090, but sure.
>>106199819
Qwen is impressive but somewhat slopped and censored. That can easily be fixed with loras though, so hopefully we'll see some soon (assuming Civitai adds a qwen section).
On the other hand it is very demanding and slow, which likely impedes overall uptake and lora finetuning. We'll have to see how the 'community support' for this will turn out.
>>106199812
Get the minimal CUDA version that makes your gpu and use case work.
Usually 12.8 if you go comfy + sage2++ + any 3000/4000/5000 card.
13.0 just got released, I would steer away from it.
Does anyone here have a dual 5090 setup in a conventional pc case?
What model of 5090 would you recommend? The AIO version?
Even powerlimiting them to 400W would dump 800W inside the case, and no matter the case it feels like a great way to fry everything inside.
>>106199819
Sure, but for realism you are very much limited to the clean, polished, smooth-skin look you are showing in your images, plus bokeh. So only 1 style in that case. Chroma isn't slopped to the same extent; only some rare tokens in Chroma tend to trigger slop, but that has been greatly reduced in v50 at resolutions higher than 1024.
>>106199609
bruh he moved that table like it was made of polystyrene or something
>>106199855
>And you crammed all of this in the same computer?
>Does it work well?
Yep, originally had a large case to put my 4090 + 4060ti in. Then swapped the 4060ti out for the A6000. Now I have an open frame with the 4090, 3090, and A6000 mounted and installed using a bifurcator card. Works great, but whenever I'm splitting things across them the bottleneck is the slowest card (the 3090 in this case).
>Vram wise no but compute wise it is.
Agreed. I favor VRAM personally, but I can't lie and say I wouldn't appreciate having the compute of a 4090 or 5090 with 48GB+ VRAM for a decent price.
>Offloading with fast ram is pretty ok in my experience for inference use.
Yep, I have 128GB RAM (DDR4) too and I use it when running dsr1 or other massive MoE models. Was handy for block swapping with hunyuan video before I had the A6000.
>I'd argue here that the 5090 speedup is quite substantial vs the 4090 but sure.
I believe you about the speedup. I see it with my 4090 vs the A6000/3090. But when things don't fit, the A6000 comes in handy (and has many times).
Dear feetnigga,
Can chroma do skinnier toes? I feel like every gen I see has stumpier digits and wider soles.
>>106199895
I personally wonder if it can do dainty, nice, clean, smooth feet, and not the dirty-looking ones favored here.
>>106199883
I'm keen to compare with chroma. I'll get it up and running someday in the next week or so.
Got a prompt you want me to try with qwen?
anyone here trained a qwen lora yet? i'm gonna give it a go.
i assume you need to rent a H100 and an A40 won't cut it
>>106199882
just have a case with decent airflow/fans. and if it is still too hot, mount big heat-transport AIO coolers, yes.
>>106199894
>installed using a bifurcator card
What is this?
>bottleneck is based on the slowest card (3090 in this case).Yeah frankly speaking I'll probably sell my 3090 soon, it makes no sense to use it when the 5090 is so fast the 3090 became kind of useless.
>whenever I'm splitting things across themI guess you mean anything non image or video inference right? Unless you can now distribute stable diffusion inference across cards, even more across heterogeneous models.
how do I do image editing with qwen image in comfyui?
>>106199895
Use WAN or SDXL if VRAM allows it, better composition and stability.
>>106199925
>just have a case with decent airflow/fans
What would you recommend?
>>106199922
An anon a few threads back did it but when I asked for details didn't get a reply. Probably need like 80+GB. But I dunno, I'm waiting to see some numbers or will probably attempt it on my A6000 later in the week if I'm bothered.
>>106199926
>>installed using a bifurcator card
>What is this?
Splits a PCIe slot up to connect multiple things to one slot. I'm using the motherboard of my old gaming PC from 2019 with a bifurcator card to let me install up to 4 cards in one slot (have 3 at the moment).
>Yeah frankly speaking I'll probably sell my 3090 soon
same, it's good value but once you have tried a fast card you can't go back to a slow one
>I guess you mean anything non image or video inference right? Unless you can now distribute stable diffusion inference across cards, even more across heterogeneous models.
Yes, mostly for non video or image inference. But hunyuan video and wan have distributed GPU inference (kind of) though I haven't tried it. I do utilise multiple GPUs for image/video by having one card loaded with the text encoder and another with the diffusion model. Lets me keep each loaded and also allows the card with the diffusion model to have more VRAM free for higher res gens.
>>106199883
You're right about the skin stuff. I tried this prompt and it didn't change despite being different from the earlier prompts (in terms of styling phrases).
>A footpath in America at night with a girl drunk and sitting down on the pavement. A woman dressed in denim short-shorts spreading her legs apart. She has long black hair, mascara, eye-liner, and lipstick. She's wearing a black strapless croptop. Her expression is drunk while holding a beer in her hand. The photo is taken from an iPhone at night time with the camera flash on.
>Low detail, iPhone photo, night mode photo, grainy, low quality, blurry.
the main difference from the others is that the 2nd line read:
>HD, iPhone, night mode photo, grainy, realistic.
>>106199957
I did post a little update but nobody replied or cared.
I trained a LoRA 1024x1024. Cut it off around 2000 steps because I was only really curious to see if it worked. And it does. For some reason it like doubled my inference time though.
As for the memory needed to train it. Across two 3090s I think I capped out at like 14gb per GPU at 1024x1024 at rank 32. That was the fp16 model.
I might try a different subject at some time in the future but suffice to say it works but there are a few questions remaining.
https://www.youtube.com/watch?v=hkAH7-u7t5k
>>106199943
FD Meshify 3 XL, Antec Flux, Corsair 4000D, something like that. there is a thread over in >>>/g/pcbg that is maybe more up to date
My only issue with Chroma right now is that the outputs are noisy as fuck. Like visually noisy in a way that becomes distracting if you look too closely, because you realize you aren't even sure what you're looking at.
>>106200017
you could try doing a 2nd vae decode (like use impact-pack's iterative upscale). it tends to zap away noise
>>106200005
it's funny seeing people calming down on this in real time
>>106200003
Oh I did read that reply. Didn't realise it was you. Thanks for the numbers, that gives me confidence to give it a go and not feel like I'm wasting time on my card.
Diffusion pipe, or just write your own script?
>KSamplerAdvanced ImportError: DLL load failed while importing cuda_utils
chatgpt has me running in circles
>>106199957
>Splits a PCIe slot up to connect multiple things to 1 slot.
Wouldn't that limit your pcie lanes to x1 for each?
Isn't that super slow? No wonder you want to keep everything in vram lol.
>same, it's good value but once you have tried a fast card you can't go back to a slow one
Yeah it served me well but I'd rather sell it to some new local ai hobbyist or gamer than have it use space for nothing in my case.
>wan have distributed GPU inference (kind of)
Can you share the github of that?
>I do utilise multiple GPUs though for image/video by having one card loaded with the text encoder and another with the diffusion model.
When I had my dual 3090s setup I tried that but gave up, as the speedup was barely noticeable compared to firing up two different comfyui sessions and having both generate the same thing for me, effectively doubling the average output.
>>106200047
did you update comfy and the requirements?
>>106200039
GPT-5 is very obviously made in the interest of keeping the lights on rather than pushing new models. Like, I would not be surprised if they were in some very dire financial trouble.
>>106197665
Why no "chroma is shit" option?
>>106200069
>Wouldn't that limit your pcie lanes to x1 for each?
>Isn't that super slow? No wonder you want to keep everything in vram lol.
4x. Even when I unload and load stuff, the speed isn't bad. I've run a 4090 on 1x using block swap and a 4090 on 16x using block swap and the time difference (hunyuan video) was less than 2 seconds. PCIe bandwidth doesn't matter for anything other than model load and unload speeds mostly. I want everything in VRAM because large models need large VRAM and it's the difference between loading and running vs not running at all.
>>wan have distributed GPU inference (kind of)
>Can you share the github of that?
https://github.com/Wan-Video/Wan2.2?tab=readme-ov-file#run-text-to-video-generation
https://github.com/Tencent-Hunyuan/HunyuanVideo?tab=readme-ov-file#-parallel-inference-on-multiple-gpus-by-xdit
they're examples of docs that have it. If you're using comfy, you won't have access to it. I mostly just use CLI to run this stuff.
>A class photo of a school of maids. The class photo shows women dressed in maid uniforms and a few men dressed in tuxedos. The overall look of the photo is that of an old 80s polaroid photo. The background shows an opulent hall.
>Polaroid, low quality, bad photo, blurry.
>>106199922 (me)
looks like 512x512 lora training on qwen takes 31GB VRAM
will try higher resolutions later, after i get some preliminary results from this lora
>>106200098
How come this stuff never makes it into comfy? I think Hunyuan had this too.
>>106199997
>Faggot dev changed the filename, last thread and every thread he was the only one promoting his shitty model.
BUY AN AD BRO
CHROMA SUCKS AND I'M NOT USING YOUR SHIT.
YOU ARE CANCER TO THE LOCAL COMMUNITY
>>106200115
Outside of sameface, it's convincing
>>106200134
Hope it's not like with HiDream where you had to train on 1024 or the results would be crap.
Is this with any offloading to ram / quantization?
>>106200168
You can train on a single 24gb gpu with offloading. But only at 512
>chroma finally released
>checks dev repo for scraps
>2k chroma lora spotted!
https://huggingface.co/lodestones/chroma-debug-development-only/blob/main/2k/2025-08-09_01-53-00-comfyui.safetensors
I tire of this post SD 1.5 slop.
>>106200168no offloading, but it is training using fp8 quant. so for fp16 training you'd need like double the vram i imagine.
this is on a rented H100, i'm just testing at the moment to see if 512x512 training is even viable
if not i will try higher resolutions
>>106200147
I genned that with qwen
>>106200098
>I've ran a 4090 on 1x using block swap and a 4090 on 16x using block swap and the time difference (hunyuan video) was less than 2 seconds.
Good to know, hunting for used x570 motherboards allowing a proper x8/x8 split instead of x8/x4 when using two gpus was a pain.
>If you're using comfy, you won't have access to it.
Welp, shit.
>>106200140
I don't know why, but for some reason pooling gpus, or multigpu in general, isn't very popular in image or videogen, while it's pretty solved in the LLM space.
Any good, non-bloated workflow for Wan 2.2 i2v and i2i that presumably incorporates this? https://huggingface.co/lightx2v/Wan2.2-Lightning/tree/main
>>106200210
Yikes, 31gb vram with fp8, Qwen sure is a big one
>>106200245
Is there an issue with just plugging it into the default workflow yourself?
>>106200152
Most I've tried with qwen was specifying 3 subjects and their appearances. This one I wanted to see whether it would sameface if I didn't specify details.
>>106200255
>default workflow
Huh? What are you referring to?
>>106200270
The default comfyui Wan 2.2 workflow. Just plug the LoRA loader into that.
I downloaded a recent WAN 2.2 workflow and I saw it was using models from this link:
https://huggingface.co/Comfy-Org/Wan_2.2_ComfyUI_Repackaged/tree/main/split_files/diffusion_models
wan2.2_i2v_high_noise_14B_fp8_scaled.safetensors
wan2.2_i2v_low_noise_14B_fp8_scaled.safetensors
However, I've been using:
wan2.2_i2v_high_noise_14B_Q8_0.gguf
wan2.2_i2v_low_noise_14B_Q8_0.gguf
I think they're all pretty much the same filesize (14.3GB). But does anyone know which ones I should use before I have to spend an hour comparing them?
maybe it's the case that .gguf is higher quality but is slower? that would be my assumption.
>>106200338
GGUF quants are almost always of a higher quality than their FP8 retarded brothers.
>>106200338
i think the q8 gguf quants are better than fp8, but i have not yet tested this nearly enough to be more certain
>>106200374
did you test it?
I want to use InvokeAI but I'm having python conflicts. Does anybody know how to venv Invoke?
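the generic pip recipe usually works; this assumes current InvokeAI still ships the invokeai-web entry point and supports your python version, so check their install docs first:
python3.11 -m venv .venv
source .venv/bin/activate
pip install invokeai
invokeai-web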
>>106200338
They should be more or less the same.
Q8 gguf are good, but fp8 scaled are great too.
Note that it's "fp8 scaled", not just fp8.
FP8 itself is behind any of these.
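the difference, roughly (toy per-tensor sketch; the real *_fp8_scaled checkpoints may scale per-channel or per-block instead):

import torch

w = torch.randn(4096, 4096) * 0.05  # small-magnitude weights, typical

# plain fp8 cast: tiny values fall into fp8's coarse subnormal range
plain = w.to(torch.float8_e4m3fn).to(torch.float16)

# "scaled" fp8: stretch the tensor toward fp8's max normal value
# (448 for e4m3fn), store the scale, undo it on dequant
scale = w.abs().max() / torch.finfo(torch.float8_e4m3fn).max
scaled = (w / scale).to(torch.float8_e4m3fn).to(torch.float16) * scale

print((w.half() - plain).abs().mean().item())   # larger error
print((w.half() - scaled).abs().mean().item())  # smaller error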
>>106200374
>4k chroma lora
>tested it?
It's better than not at 2k, but not a ton different than base 49
>>106200387 (You)
>*2k lora
Apologies
>>106200387
>her smile and optimism, back
They all got that chroma eczema.
>>106200506
Use a facedetailer or an upscaler, big dawg, can barely tell this is Emma in some of these gens.
>>106200450
Well... it is skin detail, isn't it?
>>106200394
mmmm sexy granny
>>106200553
>Use a facedetailer or an upscaler big dawg
This is just a bit of testing, lil bro. New = shiny. idgaf about impressing people
>>106200564
>sexy granny
But could you tell that was AI? Best realism model IMO
>>106200600
>But could you tell that was AI? Best realism model IMO
Nta, but it's very 'in your face ai' due to extremely visible artifacts, messed up perspective and the 'chroma overlay' over all realistic photos. I can spot chroma much better than wan.
Am I crazy or are the Chroma believers literally suffocating on their own copium right now?
>>106198843
Cool, gonna train the loras for this one, been prepping the dataset throughout the week
>>106200375
>no adetailer or variants
>no hiresfix
>can't load json workflows
why?
least gay way to train a Qwen lora?
>>106200657
it was foretold
>>106199539
I wonder if I can train my dataset of imouto on it, does lora training exist for this?
>>106200687
"Chroma believers" is literally lodestones (the dev) samefagging since yesterday. All it does is further expose his shitty model.
>>106199895
the toes should be as succulent as the piggy.
>>106200707
>>106200687
>>106200657
the absolute state of nogen vramletkids kek
I miss the sd 1.5 days when controlling the camera angle was easier
ew feet. disgusting degenerate, go away.
being attracted to f**t is a brain defect.
>>106199834
>Open up WAN 2.2 workflow from civitai
>Oh look they've got a custom negative prompt, this'll be interesting... wait a second, 1.0 cfg?
>generate video
>remove negative prompt
>generate video again
>get exact same video
I'm the only smart person on this gay earth
>>106200799
imagine not using wan nag
>>106200741
The furries ignored him.
The bronies the same.
In realism, the community didn't pay attention; they preferred to make NSFW adaptations for Flux or WAN.
Anime was never his focus.
What's he got left?
Shilling and samefagging here, hoping his shitty model turns into a 'cult classic.'
He wasted money and effort on a dead end.
Eventually he will kill himself.
Why doesn't everybody shut up about the models they don't like? Post gens that make people want to try different models.
Anyway,
For stylized gens, I think v48 is better than v50 of Chroma. It's hard to put a finger on what is different, but I'm finding it hard to like v50 or v50 annealed. Shall I bother testing v49?
>>106200741
chromakek please, i'm genning 720p videos
>>106200630
The biggest issue still with AI pictures is that they usually have angles that no one would take pictures at, and the composition of the photo always makes no sense.
Did anyone figure out this slowmo gayness with these lightx loras?
>>106200824
>Anyway,
KYS lodestones
>>106200810
>they preferred to make NSFW adaptations for Flux
not a serious person, thanks for exposing yourself
This is a battle between Chromaschizo and AntiChromaschizo, am I right?
>>106200826
im sure you are, 720x480p 4 step 4 speed up loras in wangp, sis
>chroma-unlocked-v50-annealed.safetensors
>chroma-unlocked-v50-flash-heun.safetensors
>chroma-unlocked-v50.safetensors
why doesn't lod explain what the fuck the difference is between these?
>>106200884
The discord is full of non-social autistic furries, they need someone to do proper PR. I might take up the helm out of kindness.
When I was a kid, all I wanted was moon shoes. You know, the mini trampolines you put on your feet? I was so excited when I found out I was getting a pair for Christmas. Every day I'd imagine how fun it would be to bounce around like I'm literally on the moon. Then the day came. I put them on... and they were shit. Of course. I didn't want to disappoint my parents, so I jumped around and made sure everyone knew how happy I was about finally getting my moon shoes. I jumped around in front of everyone. Made sure everyone knew how satisfied I was with Chroma epoch 50. I wanted everyone to think I was the happiest boy in the universe. But on the inside, I was disappointed.
>>106200884
flash-heun uses heun and is flashy
>>106200900
so it's shit and i should avoid it.
what about annealed?
is v50 the 'detail-calibrated' version or is it like base? EXPLAIN
>>106198436
can we see 11 frames per second windable cam with 16mm grain
>>106200824
>Why doesn't everybody shut up about the models they don't like?
Because if I don't have a GPU that can generate things with a model fast, no one else should be able to get positive attention from using that model online.
>>106200891
that was remarkably close to my review of Happy Gilmore 2
>>106200909
V49 was the epoch with the only true hi-res addition training.
V50 I wasn't sure what he did, it wasn't very clear.
V50-Annealed was a merge of the last ten epochs
>>106198908
they said one was built for a museum and it was coral colored, maybe with some metallic elements
>>106200003
I figure you must have used diffusion-pipe. I'll try it with my BBW dataset. I just finished my wan 2.2 t2v bbw lora, by the way:
https://huggingface.co/quarterturn/wan2.2-14b-t2v-bbwhot
https://files.catbox.moe/ml96j2.mp4
(yeah I know the animation has a fuckup midway, I just finished epoch 80 and wanted to try it, at least it's learned the tits and tummy)
>>106200889
except the overwhelming majority of gens posted there are non-furry. you'd know that if you were actually in the discord.
>>106200860
who cares, filter both.
>>106200932
>V49 was the epoch with the only true hi-res addition training.
huh, but I thought that was meant for v50? so v50 does not have 1024x1024 training?
>>106200932
>V50-Annealed was a merge of the last ten epochs
wrong.
but lodekek himself said that annealed is "shit"
>>106200968
someone posted comparisons between v50 and v50-annealed, and yeah annealed looked worse with every sampler. not sure why he even did that
So uh, any real use case for wan besides turning reaction images into reaction videos?
>qwen on 4gb using gguf
All toasters toast toast, but damn if it isn't fun. Has kind of boring poses though, I got spoiled by 1.5's implicit variety.
>>106200090
It's kind of a self-own, though. There are a few very trivial things they could do to get more users, which they don't mostly out of spite. Altman just went on a podcast where he smugly implied Ani was some horrible evil thing OpenAI would never do, but if you are not offering people a product they want to pay for, they won't pay for it.
>>106200944
It's the craziest thing. I also tested my LoRA with a bbw dataset.
>>106201019
90% of Wan gens are porn related, so the answer is yes
>>106200630
>messed up perspective
The bed being too short is my only tell it's fake, but I also never learned how to draw.
>extremely visible artifacts, and the 'chroma overlay' over all realistic photos
There's no flux lines or anything. Care to be specific? You might be sensitive enough to see differences in the VAE at this point.
>>106200632
>prompt issue
100% this
>She is sitting on the edge of a bed, facing you. She has a large, light-green towel wrapped around her torso, and another smaller white towel is wrapped around her hair like a turban. She is leaning forward a little bit, with both of her hands resting on her knees. She is looking at you with a simple, neutral expression, as if she is listening to someone talk. Her shoulders are bare, and her skin looks smooth. The bed she is sitting on has a dark brown wooden frame, and the blanket on it is a simple white color. The room is bright with daylight coming from a window that is out of view.
>>106201032
Not op but gross
picrel is wan 2.1 t2i from Mar 15
I have some old chroma schnell lora. Are all the newest chroma lowstep cfg1 loras still deepfrying the pics?
>>106201097
what the fuck are you talking about and what the fuck is an Ani
>>106201205
Probably meant AGI
>>106201205
>>106201212
no, Ani is the elon grok waifu app
>>106200824
Cleans up the image for me. The 2k lora on top is nice too. Today's gens have been using 28 steps with the optimal steps scheduler
>comfy gens slow down to a crawl after a couple of gens running fine
what was it again that needed to be done to that guy?
>>106201097
i always thought it was just legal liability and wanting to dodge lawsuits, but if they are all giving up a load of money and user satisfaction just so they can feel better about their company then kek
https://youtu.be/hmtuvNfytjM?si=q-VZKqyqYODA6Pbl&t=2961
Any software to let me search images by generation parameters like https://github.com/zanllp/sd-webui-infinite-image-browsing but one that works with ComfyUI images?
>>106201205
>>106201212
Ani is Grok's "Waifu Mode." It is just a glorified Koikatsu model on a dancing rig and with the voice in ASMR mode, but it made quite a stir on both the pro- and anti- side. She looks like Misa, and the joke is Ani-May. A lot of people, including OpenAI employees, went crazy about it, accusing Musk of all sorts of irresponsible behaviors because Ani has twintails and big boobs. See https://gelbooru.com/index.php?page=post&s=list&tags=ani_%28xai%29 (to not take up an image limit slot).
Most recently, Altman made the following comment (see ~49 minutes in)
https://youtu.be/hmtuvNfytjM
>We haven't put a sex bot avatar into ChatGPT yet
>Seems like that would get a lot of engagement
>Apparently it does
The irritated "apparently it does" is the key. He is not just speculating, he is annoyed at someone having already done something, the most likely candidate being Ani.
>>106201205
ani is anistudio's mascot. the dev posted here since the beginning
>>106201318
I think ani should sue elon
>A landscape photo of a lake with mountains off in the distance. It is a dark, starry night, with some planets visible in the sky. The lake has fireflies and wisps of fog over the water. A grassy field surrounds the lake with various flowers and a large oak tree. A mystical Fenrir made of molten lava and fire is roaming the grassy field.
>Ultra HD, 4K, cinematic composition.
cfg 4, seed 42, steps 40
>>106201265
For all the screeching about how Ani isn't a hag, Musk should jump the shark and add a loli mode to it. Stirring up shit is right up his alley.
>>106198896
I got a slight improvement from the q8 GGUF in one test but the perf is worse and it's almost crashing my computer. so I'm giving up on it, will have to stick to fp8.
also, I tried q6 and it took just as much VRAM as q8 and was just as crash-prone? what the fuck is the point of lower quants then? is this a problem with rocm and/or the 7900 xtx architecture?
>>106201318
I don't think so. I think he's more into blue chibi foxes.
>>106201364
this is planned for anistudio
>>106201231
Go use forge and have fun with SDXL
>>106201367
q8 and q6 both operate in fp16 after dequant. Also seems like you have real fp8 support
>>106201367
Ayymd surely isn't helping, but at least find fp8 scaled instead of just fp8
>>106201466
He's gooning in our AGP discord, wait a bit, chuddie
Friendly reminder to NOT update your comfy to the latest release v0.3.49
>>106200884
base is base
annealed is the new detail-calibrated and is slightly better for high res detail but worse at other things
flash is for 8 steps, CFG 1, heun sampler, beta scheduler
i have no idea and could be wrong though
>>106201602
>(ComfyUI) archlinux% git pull
>Already up to date.
/ldg/ is so comfy when there aren't any atherless avatartrannying faggots polluting it
>>106201664
way to shit things up by being a schizoid
>>106201639
I... I'm so sorry
>>106201602
ive been on 3.49 since release with no issues
>>106198660
Great movement, catbox? Or if you don't want to, at least the prompt?