Qwenimageeditmodelbros Edition
Discussion of Free and Open Source Text-to-Image/Video Models
Prev: >>106180771

https://rentry.org/ldg-lazy-getting-started-guide
>UI
SwarmUI: https://github.com/mcmonkeyprojects/SwarmUI
re/Forge/Classic: https://rentry.org/ldg-lazy-getting-started-guide#reforgeclassic
SD.Next: https://github.com/vladmandic/sdnext
ComfyUI: https://github.com/comfyanonymous/ComfyUI
Wan2GP: https://github.com/deepbeepmeep/Wan2GP
>Checkpoints, LoRAs, Upscalers, & Workflows
https://civitai.com
https://civitaiarchive.com
https://tensor.art
https://openmodeldb.info
https://openart.ai/workflows
>Tuning
https://github.com/spacepxl/demystifying-sd-finetuning
https://github.com/Nerogar/OneTrainer
https://github.com/kohya-ss/sd-scripts/tree/sd3
https://github.com/derrian-distro/LoRA_Easy_Training_Scripts
https://github.com/tdrussell/diffusion-pipe
>WanX
https://github.com/Wan-Video
2.2 Guide: https://rentry.org/wan22ldgguide
https://alidocs.dingtalk.com/i/nodes/EpGBa2Lm8aZxe5myC99MelA2WgN7R35y
>Chroma
https://huggingface.co/lodestones/Chroma1-Base/tree/main
Training: https://rentry.org/mvu52t46
>Illustrious
1girl and Beyond: https://rentry.org/comfyui_guide_1girl
Tag Explorer: https://tagexplorer.github.io/
>Misc
Local Model Meta: https://rentry.org/localmodelsmeta
Share Metadata: https://catbox.moe | https://litterbox.catbox.moe/
Img2Prompt: https://huggingface.co/spaces/fancyfeast/joy-caption-beta-one
Samplers: https://stable-diffusion-art.com/samplers/
Txt2Img Plugin: https://github.com/Acly/krita-ai-diffusion
Archive: https://rentry.org/sdg-link
Bakery: https://rentry.org/ldgcollage
>Neighbours
https://rentry.org/ldg-lazy-getting-started-guide#rentry-from-other-boards
>>>/aco/csdg
>>>/b/degen
>>>/b/realistic+parody
>>>/gif/vdg
>>>/d/ddg
>>>/e/edg
>>>/h/hdg
>>>/trash/slop
>>>/vt/vtai
>>>/u/udg

>Local Text
>>>/g/lmg

>Maintain Thread Quality
https://rentry.org/debo
Trying to download the model files for qwen image so I can test LoRA training. Why is hugginface cli so shit?
>>106185861
>hugginface cli
i see we have our best on the job
>>106185903
You get what you pay for.
How do I get into this as a total newfag? Do I need a PC that costs thousands?
>>106184597
shut the fuck up retard
>>106185951
Depends on how coherent you want your waifu to look
>>106185951
If you even have to ask about this, just stick to the web saas/API models. I am serious
>>106185861
>git pull huggingface_repo_url
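For anyone actually stuck on the download step: `huggingface-cli download` with `--local-dir` is generally less painful than cloning with git-lfs. A minimal sketch of building the invocation — the repo id and target directory below are placeholders, not the actual qwen image files:

```python
import subprocess

def hf_download_cmd(repo_id: str, local_dir: str) -> list[str]:
    """Build the huggingface-cli download command; run it via subprocess or a shell."""
    return ["huggingface-cli", "download", repo_id, "--local-dir", local_dir]

# repo id is an example -- substitute the repo you actually need
cmd = hf_download_cmd("Qwen/Qwen-Image", "./models/qwen-image")
# subprocess.run(cmd, check=True)  # uncomment to actually download
```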
the man smiles and holds a cardboard cutout of an anime style Miku Hatsune, standing in the snow.
based kijai making the 2.2 i2v lora work normally (1 strength for both)
>>106181046
>is there even any point of the resize image v2 node? why don't i just plug the image straight in to the WanVideo ImageToVideo Encode? I can just set the dimensions there.
kijai's sampler will sperg out if it doesn't like the dimensions. i dunno
>>106185803 (OP)
Can someone share the json from the wan rentry?
/ldg/ Comfy T2V 480p FAST workflow (by bullerwins): ldg_2_2_t2v_14b_480p.json (updated 2nd August 2025)
https://files.catbox.moe/ygfoxx.json
I get a white page.
>>106186064
i'll also try i2v. wan2.2 t2v is painful for now
the man looks up at the sky and sees a giant cardboard cutout of an anime style Miku Hatsune, standing in the snow.
kino
>>106186117
>not a hologram
are you even trying?
>>106185951
idle here for a week and then decide if this is who you want to become.
>>106186123
patience! one idea at a time. i'll get there
>>106186140
here
the man looks up at the sky and sees a giant hologram of an anime style Miku Hatsune, standing in the snow.
it'd be easier if it was a screenshot at night, but it works!
how much vram does qwen actually consume when quantized?
>>106186117
>>106186147
i2v looks so kino compared to t2v
got a BIG miku this time.
>>106186166
i2v is fun cause you avoid most of the randomness, you're just prompting what you want to happen next. often with hilarious results.
>>106186173
this time, "gigantic hologram":
Fight scenes are definitely possible with wan2.2. I think I might have unlocked the Ryona fetish.
I don't think my wan kijai txt to image works as intended lol
>>106186105
https://huggingface.co/bullerwins/Wan2.2-T2V-A14B-GGUF/blob/main/wan2_2_14B_t2v_example.png
it's not set up with light though
>>106186189
>the strongest woman vs the weakest man
>>106186222
Thank you anon, and that's fine, that part is easy to set up. I was mainly wondering what to use instead of the WanVideo Sampler kijai uses.
the man looks up at the sky and sees a gigantic hologram of an anime style Miku Hatsune waving hello at the man.
there, beeg miku
poorfag
why are you not making money with your shit, /ldg/?
>>106186335
Who is even paying to look at slop?
>>106186292
I don't think wan knows what a hologram is
>>106186367
prompt (slightly diff): the man looks up at the sky and sees a gigantic hologram of an anime style Miku Hatsune singing with a microphone.
>>106186335
I have a good paying job, I want to generate stuff for me.
>>106186335
for 1 successful, 100 like this at $5 per month total
For anons having a 5090, I'd like to change the nvidia-smi power target to something lower.
What is the sweet spot for inference? 350W?
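Setting the cap is a single nvidia-smi call. A hedged sketch that just builds the argv (it needs root and resets on reboot unless persistence mode is on; 350W is the anon's guess, not a measured sweet spot — benchmark it/s at a few caps to find your own knee):

```python
def power_limit_cmd(watts: int, gpu_index: int = 0) -> list[str]:
    """Build the nvidia-smi power-limit command for one GPU (run it with sudo)."""
    if watts <= 0:
        raise ValueError("power limit must be positive")
    return ["nvidia-smi", "-i", str(gpu_index), "-pl", str(watts)]

cmd = power_limit_cmd(350)
# e.g. subprocess.run(["sudo", *cmd], check=True)
```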
>>106186335
I'd have to change too much of myself to be successful with that sort of thing
Is this snakeoil any good? It's been slapped on every other wan2.1 workflow I came across but not on 2.2
>>106185333
>bottom line is qwen image is our deepseek
on the right track but i don't know about calling it the deepseek of image gen. doesn't sound quite right.
>>106186404
>350W is power limiting territory for a 5090
>mfw I'm running my 3090 at 210W
Is electricity free where you live or something?
>>106186448
My 3090 is power limited at 260W, it's a good compromise. 210 sounds a bit low.
The 5090 I have no idea what to set yet but its tdp is 575W.
>>106186414
i don't think anyone, even the creator of it, knows what it does
>>106185524
a chroma centaur. note this is an old gen, this is chroma 33 so a newer one would likely be a bit better. but clearly it's superior to qwen's abortive attempt
>>106186414
you can probably rename it to cargo cult node
Okay, got qwen to begin training at fp16 across 2x RTX 3090s.
It was a bit of a headache to set up, mostly because the diffusion pipe repo casually forgot to mention I would need to upgrade transformers as well. I'm not sure how I was the only one experiencing that issue.
>>106186472
It does some weird shit where it averages out some numbers on each step or something, idk really. It's extremely subtle. It's not fake and gay, but it may as well be due to how little it impacts the output
>>106186523
Getting about this much vram usage at 1024 at a batch size of one and rank of 32.
Seems perfectly trainable to me desu.
and this is why i2v is fun, silly shit
the man hits a baseball with a baseball bat.
>>106186581
Is this with the new light LoRAs? I've noticed my I2V outputs desperately resist the subject turning around in them. Might just be my imagination though.
>>106186594
2.2 i2v, just testing random stuff atm.
>>106186625
*it's the lightning lora but the one kijai posted, 1 str for both
>>106186633
Yeah, just something I noticed since updating that I didn't notice before: the subject doesn't like to change directions. Then again, I don't have many examples.
>>106186064
is that the repack?
>>106186691
just kijai workflow with the 2.2 i2v loras
https://github.com/kijai/ComfyUI-WanVideoWrapper/blob/main/example_workflows/wanvideo2_2_I2V_A14B_example_WIP.json
https://huggingface.co/Kijai/WanVideo_comfy/tree/main/Wan22-Lightning
>>106186711
oh those, they're fast but censored :/
>>106186670
ay yo i din kno cap picard was slick wid it like that!
genning i2v at 720p, 121 frames. tried kijai's 2.2 loras. switched back to his original workflow with "lightx2v_I2V_14B_480p_cfg_step_distill_rank64_bf16.safetensors" at 3.0 on high and 1.0 on low. i find the 2.1 loras have superior prompt adherence, faster motion and resist looping better. i don't know if the 2.2 loras are meant for 480p but the visual quality is about the same. wtf is going on with this shit
>>106186335
I started a few months back, got a sub, then life happened and I had to stop. Then I lost that sub recently. Welp. No idea how these guys got hundreds to thousands of subs though. Insane. I put in a lot of effort and it was all high quality stuff.
>>106186792
I made a porn game that brought in a few K every month. I couldn't maintain it after I got a new job though, and I was creatively drained. I was no longer enjoying the fetish I was facilitating.
Alright, I've been having a good bit of fun with 2.1 (mostly getting my NSFW images to move). How does 2.2 compare? How is it with weirder sizes? I know with 2.1 if you didn't stick exactly at 832x480, things could go pretty soft in terms of detail. And for some reason fluids exclusively genned as torrents and/or giant globs. Like say you prompt saliva, that bitch was ejecting giant globs of spit out of her mouth. I kinda figured that was down to some weird shenanigans with resolution and denoising, where it can't see the small details in the image and kinda resolved it with larger ones, but I'm not sure.
But yeah, is it a bit more flexible with resolutions and framerates?
Base 2.1 tended to be pretty shit where it'd do the slow motion stuff or loop back too.
I'm guessing requirements are mostly the same too? It's just an incremental update, so I figure it's more just optimizing and improving what's there.
I thought there was a comfyui node that let you run a python script but I can't find it for the life of me
>>106186857
I can't believe you haven't moved to 2.2 already. What the fuck.
>>106186867
I was on vacation, then I had to work ;_;
Haven't had time to pick back up until now.
>>106186850
story of every indie porn game developer
I had an idea on how to make like a video with say, only 5 frames. This would save a fuckload of gen time.
Obviously setting Length to 5 doesn't work, it has to be set to 81 for the structure to be established. But what if you establish the structure by generating for 1 step, then remove all the other latent frames, then continue generation? Would it work?
There's nodes like "TrimVideoLatent" and "Frames Slice Latent" which allow you to remove latent frames, we would just need a node that only keeps every 16th frame (when using 81 length, to make it 5 frames).
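Worth sanity-checking the stride arithmetic before wiring up nodes (a sketch; note wan's VAE also compresses time, so the latent frame count won't equal the pixel frame count):

```python
def kept_frame_indices(length: int, stride: int) -> list[int]:
    """Indices that survive keeping every `stride`-th frame of a `length`-frame sequence."""
    return list(range(0, length, stride))

# 81 frames with stride 16 actually keeps 6 frames, not 5: [0, 16, 32, 48, 64, 80]
# to get exactly 5 evenly spaced frames out of 81, use stride 20: [0, 20, 40, 60, 80]
```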
>>106186874
Go, go download 2.2. It's basically better in every conceivable way.
>>106186881
It was just too much man, and the DMs asking me to add in stuff I found revolting. The insane impossible requests that would be entire games in and of themselves. The fear of people being unhappy with the update.
Yeah. I might pick it up again one day, but it will be a new project that I actually want to work on. Passion is something intangible that users can feel.
DAE not think Qwen is actually very good at all? It looks like absolute shit compared to Flux Krea for everything I want to gen personally
>>106186896
haven't tried Krea, but I'm having a blast genning heaps of shit with qwen image, especially character reference frames to feed into hunyuan3d2.1
>a 2 panel comic portraiting Hatsune Miku, and John Wick shopping. the left panel shows John Wick holding a GPU saying "VRAM..." with Hatsune Miku looking excited. the right panel shows John Wick and Hatsune Miku walking away together saying "it's GENNING time". Ultra HD, 4K, comic, anime.
qwen image + wan2.2 back to back is hours of fun
>>106186896
I think the outputs all look "solid". Like they are very clean, especially for anime stuff. If you do an honest comparison to default flux, it trounces it pretty handily. It's also not distilled right off the bat, which makes it a much more attractive proposition (vramlets don't @ me)
>>106186926
kek, what did you prompt
>>106186929
Hehe yeah what did you prompt? lol. I wanna know hehe. It would be really funny if a giant naked lady just picked up a tiny little man right? heh. You know, just a fun thing?
>>106186939
oh maybe prompting it for attack on titan might work?
>>106186929
i sent a basic prompt about him being grabbed by a giant miku through grok and got this:
In the haunting, snow-laden climax of Blade Runner 2049, K, the weary replicant portrayed by Ryan Gosling, sits slumped on the icy steps outside the Wallace Corporation, his bloodied face and tattered coat bathed in the soft glow of falling snow. As he gazes upward into the swirling, pale sky, a colossal 2D anime hologram of Hatsune Miku materializes, her vibrant teal twin-tails cascading like neon waterfalls, dominating the desolate urban horizon. Towering over K, her luminous figure radiates an ethereal warmth against the cold, dystopian backdrop. The camera slowly pulls back, revealing the staggering scale of Miku’s hologram as she fixes her playful, glowing eyes on him. With a single, fluid motion, her enormous hand descends, effortlessly scooping K from the steps like a fragile doll. She lifts him skyward, his body suspended weightlessly against the stormy expanse, snowflakes swirling around him as her vibrant presence contrasts with his quiet resolve, the city fading below in a breathtaking ascent.
>>106186960
Nah, doesn't work for Wan. I actually tried something similar earlier today.
>>106186967
>teal
Isn't she more of a turquoise?
>>106186883
ah it seems it's the node called "Select Every Nth Latent" that allows you to do this.
Anyone know how to get previews working for the Phr00t AiO workflow in the Rentry?
>>106186928
default Flux sure but Krea takes a shit on anything that isn't WAN for photographic gens, nothing else is remotely close to as detailed in that regard
>>106186985
Nevermind, I'm a blind retard.
Just had to pan down.
>>106186414
All the snakeoils were more useful on 2.1. And 2.2 fixed a lot of the issues the old model had, so these don't do as much as they used to.
>>106186993
True, but Krea seems very much in that niche of realism. Qwen feels like a blank canvas. Think back to SDXL and its release and how god awful it was and it still turned out great. I don't know if Qwen can achieve that due to its size, but the potential to be truly great is there.
>>106186993
Nah, Chroma is still the king of photorealism. It's not even close. There are hundreds of different things you can do with Chroma that you can't do with Krea, due to its uncensored nature. The pretty Krea images you can get out of Chroma with good prompt engineering.
>>106186670
it works no problem with 2.1 light lora
>>106186884
You weren't kidding. This shit is way better. And it takes like half the time (110s for 2.2 on a first gen, compared to 200s for 2.1).
Only problem is that things are a little grainy. I grabbed the Phr00t workflow from the rentry as a bit of a "quick start" to try shit out, so maybe that has something to do with it. Or maybe it's just more sensitive to resolution than before?
2.1 light, 161 frames. not great but look how coherent
>>106187055
the AIO is ass
>>106187042
One simple and often overlooked example. Soles. Qwen can do them, but they are blurry and slopped. Know what else is blurry and slopped? Cloudshit models. From Reve, to Gemini, to Imagen, to GPT 4o, all slopped. Chroma is unique in that it's the only unslopped model that can do soles in any situation.
chroma is king of random noise and garbage
>>106187066
>the AIO is ass
Sheeeeeeit.
The fuck do I do then? The rentry is kind of a weird mix of 2.1/2.2 info. 2.2 isn't just a drop in replacement, is it?
>>106187052
I saw a r*ddit post that said they got amazing movement combining the 2.1 lora at strength 3 with the 2.2 at 1 on the high model, and on the low model the 2.2 at 1 with the 2.1 at 0.25.
Both with a cfg of 2.5
>>106187075
The light LoRA that is.
the professional style foot pics were like a breath of fresh air
>>106187074
use kijai's workflow
https://github.com/kijai/ComfyUI-WanVideoWrapper/blob/main/example_workflows/wanvideo2_2_I2V_A14B_example_WIP.json
>>106187083
If I wanted pro style foot pics I would just prompt for them. They'd still come out better than distilled Seedream 3.0 slop. That's the freedom that models like Chroma give you. But you are an idiot.
>>106186967
neat, I might try that with lm studio too but prob have to unload the model (for vram), grok/etc are probably ideal options for generating detailed stuff.
5 epochs in on Qwen training. I'm wondering how it will handle respecting the captions. Flux had an awful tendency to bleed characters and everyone called me a schizo when I called it out.
>>106187021
HiDream was MIT licensed and the Full version wasn't distilled, and AFAIK would be much easier to train than Qwen in terms of resource needs.
>>106187104
Generate an image of a man standing in the center of a large, empty room with high ceilings, dressed in casual attire - jeans and a plain white t-shirt, with a look of wonder and awe on his face as he gazes upwards at a giant holographic image of Hatsune Miku suspended above him by an invisible field of energy or magic. The holographic Miku should be at least 10 times larger than the man, with intricate details visible even from this distance, her facial expression one of calm serenity and gentle benevolence as she looks down at the man, her eyes cast directly at him as if seeing right through to his soul. A halo of soft, pulsing light surrounds Miku's image, diffused and scattered throughout the room, creating an otherworldly ambiance that makes the viewer feel like they're witnessing something truly magical, with subtle hints of digital code or programming languages floating in the background.
neat
>>106187120
and the prompt in lm studio was: make a detailed stable diffusion prompt for a man looking up at a giant holographic Miku Hatsune in one paragraph.
will prob use grok so I don't have to unload the model, it uses like 5-6gb.
>>106187119
Nobody really gave a shit about HiDream. Myself included. I don't think any of the outputs really wowed anybody.
>>106187042
No you can't lmao, Chroma straight up does not have that level of fidelity in it at the moment due to being trained at 512px up until now. It also has half the context length of both regular Dev and Krea because Schnell did (256 tokens versus 512 tokens).
>>106187128
I haven't really seen a single "wow" Qwen output either quite frankly. Just a lot of people using extremely easy prompts as supposed evidence of the great prompt adherence.
>>106187136
There were some that came out during the release of Flux to test how far T5 could go, someone had a long elaborate prompt for a black woman. Too lazy to look it up.
>>106187075
lmaoing at the jeets there that can't figure out how to connect a lora node
>>106187119
>4 (four) text encoders
Which wan 2.1/2.2 i2v model gives me reasonable speed with a 4070s and 32gb of ram?
I'm using wangp, wan2.1 Image2video 480p 14B, and profile 4 (12gb vram and 32gb ram), it takes 7 minutes to make 5 seconds
>>106187142
I found one lazily searching back in the archives from last year.
This is a digitally drawn anime-style image featuring Hatsune Miku. She is seated at a wooden desk in a modern office setting. On the desk is part of a half-eaten hot dog and crumbs, the hot dog has a missing part that was bitten off and it's incomplete. She has a serious expression as she extends her right hand to shake hands with a person off-screen to the left. Likely an office colleague. Indicating a break or snack time. The desk is cluttered with various office supplies, including a pencil cup filled with colored pens and markers, a calculator, and a notebook. A green potted plant is visible on the left side of the desk, adding a touch of nature to the otherwise busy workspace. The background features a large window with multiple panes, allowing sunlight to stream in and illuminate the room. Outside the window, lush green trees are visible, suggesting an office with a view of nature. The walls are adorned with bookshelves filled with neatly organized binders and books.[\code]
This is what Flux cooked up.
>>106187150
Don't lmao too hard. I saw a guy here asking how to connect two LoRA nodes together over the space of 24 hours.
>>106187169
Fuck me. I'll just quote it.
>This is a digitally drawn anime-style image featuring Hatsune Miku. She is seated at a wooden desk in a modern office setting. On the desk is part of a half-eaten hot dog and crumbs, the hot dog has a missing part that was bitten off and it's incomplete. She has a serious expression as she extends her right hand to shake hands with a person off-screen to the left. Likely an office colleague. Indicating a break or snack time. The desk is cluttered with various office supplies, including a pencil cup filled with colored pens and markers, a calculator, and a notebook. A green potted plant is visible on the left side of the desk, adding a touch of nature to the otherwise busy workspace. The background features a large window with multiple panes, allowing sunlight to stream in and illuminate the room. Outside the window, lush green trees are visible, suggesting an office with a view of nature. The walls are adorned with bookshelves filled with neatly organized binders and books.
>>106187169
Someone plug this into Qwen, curious to see how it handles it and my GPUs are all occupied right now.
>>106187136
got an idea for a prompt you'd like to try? I can run it if you want
>>106187126
this time grok:
A gritty cyberpunk metropolis at night, rain-slicked streets glowing with neon reflections, a lone man in a worn trench coat staring upward in awe, a colossal holographic Miku Hatsune dominating the skyline, her vibrant teal twin-tails shimmering with intricate digital patterns, her form translucent yet luminous, surrounded by floating data streams, towering dystopian skyscrapers and flickering holographic billboards in the background, bathed in moody cyan and magenta neon hues, cinematic lighting, ultra-detailed, in the high-tech, noir aesthetic of Blade Runner 2077, immersive, futuristic atmosphere.
trippy
>>106187191
running now, will make 2 versions (wide and square)
Best settings for character training in wan 2.2 for high and low?
Is rank 64 or higher bucket size worth it?
>>106187130
Yet only with Chroma can you do proper feet, creepshots, gore, nudity, sex, bondage, yoga, contortions, etc... the list goes on and on, anon. And also Chroma follows the prompt better than Flux dev/Krea for these reasons.
>>106187212
I did some rudimentary experiments with 2.2 at rank 64. If you're training low, you can plug the LoRA for the character into the low node and the output will look like that character while the motion remains intact. I just did 1024x1024 images only as a test. Video I also tried but I don't have the will to really suss it out yet.
>>106187218
Looks a bit noisy. Are you putting LoRAs you shouldn't in the low noise output, or at too high a strength?
>>106187215
got an example prompt of creepshots? do you mean images that are peeping or cctv? I've tried running cctv prompts in qwen image and it's OK but I think I'm a promptlet at directing how and where the camera is (can't get top down camera sitting in the corner of a room shot)
Does anyone know which of the following params ComfyUI uses by default?
[-h] [--listen [IP]] [--port PORT] [--tls-keyfile TLS_KEYFILE] [--tls-certfile TLS_CERTFILE] [--enable-cors-header [ORIGIN]]
[--max-upload-size MAX_UPLOAD_SIZE] [--base-directory BASE_DIRECTORY] [--extra-model-paths-config PATH [PATH ...]] [--output-directory OUTPUT_DIRECTORY]
[--temp-directory TEMP_DIRECTORY] [--input-directory INPUT_DIRECTORY] [--auto-launch] [--disable-auto-launch] [--cuda-device DEVICE_ID]
[--cuda-malloc | --disable-cuda-malloc] [--force-fp32 | --force-fp16]
[--fp32-unet | --fp64-unet | --bf16-unet | --fp16-unet | --fp8_e4m3fn-unet | --fp8_e5m2-unet | --fp8_e8m0fnu-unet] [--fp16-vae | --fp32-vae | --bf16-vae]
[--cpu-vae] [--fp8_e4m3fn-text-enc | --fp8_e5m2-text-enc | --fp16-text-enc | --fp32-text-enc | --bf16-text-enc] [--force-channels-last]
[--directml [DIRECTML_DEVICE]] [--oneapi-device-selector SELECTOR_STRING] [--disable-ipex-optimize] [--supports-fp8-compute]
[--preview-method [none,auto,latent2rgb,taesd]] [--preview-size PREVIEW_SIZE] [--cache-classic | --cache-lru CACHE_LRU | --cache-none]
[--use-split-cross-attention | --use-quad-cross-attention | --use-pytorch-cross-attention | --use-sage-attention | --use-flash-attention]
[--disable-xformers] [--force-upcast-attention | --dont-upcast-attention] [--gpu-only | --highvram | --normalvram | --lowvram | --novram | --cpu]
[--reserve-vram RESERVE_VRAM] [--async-offload] [--default-hashing-function {md5,sha1,sha256,sha512}] [--disable-smart-memory] [--deterministic]
[--fast [FAST ...]] [--mmap-torch-files] [--dont-print-server] [--quick-test-for-ci] [--windows-standalone-build] [--disable-metadata]
Had to cut some out due to character limit. I know that vae is run in bf16 by default for example, I am asking like that.
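One way to answer this without trial and error: argparse stores every flag's default on the parser itself, so you can dump them all. A minimal sketch with stand-in flags; the same loop should work against ComfyUI's own parser (it lives in `comfy/cli_args.py` in current checkouts — an assumption, verify against your tree — via `from comfy.cli_args import parser`):

```python
import argparse

# stand-in parser mirroring two real ComfyUI flags; substitute ComfyUI's parser
parser = argparse.ArgumentParser()
parser.add_argument("--fp16-text-enc", action="store_true")
parser.add_argument("--reserve-vram", type=float, default=0.0)

# every registered flag's default, without launching anything
defaults = {a.dest: a.default for a in parser._actions if a.dest != "help"}
print(defaults)
```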
>>106187191
Made 2 gens. One gen gets the office items positioned better but messes up the handshake, and this one gets the items and handshake just in the wrong position. Neither had a bite out of the hotdog.
cfg 4.5, steps 50
>>106187235
Any kind of image anon.
>Amateur photograph, a Japanese woman dressed as a maid, sleeping on the Tokyo Metro, her panties are slightly visible
That is one example of the kind of stuff Chroma gets right. You could do cctv, walking up a flight of stairs, peeping, etc... any kind of creepshot that has a natural description, you can do, (though Chroma just like other models benefits from a good prompt, you can enhance with VLMs)
>>106187251
Now you gotta do the Chinese version
>>106187237
Check the code
>>106187251
>>106187286
I think both are certainly more coherent than FLUX, but it seems like there is still a ways to go on the prompt adherence front. It is a noticeable step up though.
So with the lightx2v 2.2 workflow in the rentry, where do I set the virtual VRAM usage type thing like it was in the 2.1 workflow (the UnetLoaderGGUFDisTorchMultiGPU node)?
Gens are still around the same time as 2.1 even without it, but I don't know if I'm fucking myself over or not.
Also 2 more questions.
How do I plug loras into this? Do I just put them inline after the lightx2v loras (assuming I use a high/low lora)?
What's the difference between the e4m3fn and e5m2 versions of the i2v models?
>>106187075
might be something to that, tried it and got results that looked better compared to without, but the 2.5 cfg deep fries it, keeping it at 1
>>106187237
>>106187312
Well I guess there is no lovely default params list somewhere out there, is there?
Anyway just trial and error'd what I wanted to learn, it uses fp16 precision for text encoder by default, at least on my system.
>>106187265
here ya go
>>106187264
I'll give it a try now
>>106187369
It's comfyui so probably not, check the code, it should all be in one place, but then again it's comfyui so probably not
>>106187383
I have another one when you have the chance, just to see at what point it gets overloaded in the description with characters.
>This is a colorful digital drawing in an anime style, featuring four young girls playing a chess game on a pink table in a bedroom. The girls are dressed in school uniforms with white sailor collars and blue skirts. The girl on the left is Sailor Moon and has long blonde hair tied into twin ponytails, the girl in the center has pink hair styled in pigtails, the girl on the right has dark blue hair, the girl in the bottom right is Hatsune Miku, and there's a small black cat sitting on the bed on the far right. They are all sitting on the floor, focused on the game. Behind them, there is a large bed with a blue and yellow striped blanket. The room has pastel-colored walls with a window that shows a bright blue sky. The overall atmosphere is playful and cheerful, with bright colors and simple, clean lines typical of anime art.
Flux failed to gen Miku and the image is severely degraded.
>>106187423
here's a link to the thread where I was sharing some stuff.
https://desuarchive.org/g/thread/106170414/#106172707
and trying out qwen as an image ref for hunyuan 3d2.1
https://desuarchive.org/g/thread/106174863/#q106175342
I'll try the prompt you shared now in a moment, I'm testing the peeping prompt at the moment
A rain-soaked cyberpunk city at night, neon reflections shimmering on wet streets, Ryan Gosling as a rugged man in a sleek trench coat, pointing upward with intensity and awe, a colossal holographic Miku Hatsune dominating the skyline, dynamically dancing and singing into a glowing microphone, her teal twin-tails swirling with vibrant digital patterns, her translucent form radiating ethereal light, surrounded by pulsating data streams and musical notes, dystopian skyscrapers and flickering holographic billboards in the background, drenched in moody cyan and magenta neon hues, cinematic lighting, ultra-detailed, in the gritty, high-tech noir style of Blade Runner 2077, immersive and atmospheric.
neat, I need to llm-max prompts more often, just get the basic idea and let the model elaborate/add detail.
I saw a blue Prius while walking today and laughed
>>106185951
it's like some of you guys outright refuse to read the stickies
>>106187264
doesn't get the hint on the panties but I think if I added "her legs are slightly spread apart" it might get it.
>>106187442
It's interesting, with enough patience you could remake whole movies into memes
You know someone will do this
>>106187503
wide version. almost got it but the pillow just out of nowhere lmao
there we go, slight change to the prompt request.
A rain-slicked cyberpunk city at night, neon lights casting vibrant reflections on wet pavement, Ryan Gosling as a rugged man in a sleek trench coat, gently holding hands with a life-sized holographic Miku Hatsune, her translucent form glowing softly as she smiles warmly, her teal twin-tails shimmering with intricate digital patterns, faint data streams swirling around her, dystopian skyscrapers and flickering holographic billboards in the background, bathed in moody cyan and magenta neon tones, cinematic lighting, ultra-detailed, in the gritty, high-tech noir style of Blade Runner 2077, intimate and atmospheric.
>>106187512
and all I asked grok (free) was: make a stable diffusion prompt for a man holding hands with a holographic Miku Hatsune who is smiling, in the style of Blade Runner 2077, with Ryan Gosling.
>>106187440
Thanks. Seems like it will probably make the same mistakes as what I linked.
One last prompt, forgive me.
>This image is a digitally drawn cartoon in a typical comic strip format. The scene is set in an art gallery, with a girl on the left side wearing a teal blazer and light brown pants, pointing to a framed painting on the wall. The painting, which is green with a yellow border, depicts a bowl of fruit including apples, grapes, and bananas, with a price tag of "$500" attached to the lower right corner. Another identical painting, identical in style and content, hangs on the wall to the right, priced at "$1500". In the foreground, two people are standing, observing the paintings. One person, a bald man with a blue plaid shirt and brown pants, is looking at the paintings with a confused expression. The other person, a woman with dark hair and a sleeveless dress, is standing behind the bald man, watching the scene with a neutral expression. The background features a beige wall with a few other paintings, and the gallery is lit with soft, even lighting. A humorous caption at the bottom of the image reads: "It is more expensive because it took the artist several weeks to paint it, while the other one was generated in 10 seconds on my computer."
Should test out text and formatting to the extreme.
>>106187423
here you go. seems to handle multiple subjects very well in a prompt. I'm satisfied with qwen-image a lot and it is a night and day improvement over flux for me
>>106187383>here ya goEither you misunderstood me or you are a funny guy.
A rain-soaked cyberpunk city at night, neon lights casting vibrant reflections on slick streets, Ryan Gosling as a rugged man in a sleek trench coat, standing captivated as he gazes at a massive billboard displaying a holographic Miku Hatsune, her translucent form reaching out toward him with a gentle, inviting gesture, her teal twin-tails glowing with intricate digital patterns, faint data streams swirling around her, dystopian skyscrapers and flickering holographic signs in the background, drenched in moody cyan and magenta neon hues, cinematic lighting, ultra-detailed, in the gritty, high-tech noir style of Blade Runner 2077, immersive and atmospheric.
cool
>>106187503>>106187511Something weird I've noticed with Qwen is panties often come with a thigh strap.
>>106187538Yeah, it's definitely an improvement and got all the major elements. Still has some ways to go with the chess pieces and hands but that is minor.
>>106187604I've noticed it does that too. But I managed to get it to stop doing that when genning magazine photoshoot photos. I'd post it here but it's a blue board
>>106187369I know no one cares but to add on, even FP32 text encoders are loaded in FP16 unless you manually launch with --fp32-text-enc.
Kinda weird behavior desu. It definitely affects images.
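If you want to see why the cast matters, here is a minimal PyTorch sketch (generic, nothing ComfyUI-specific assumed) of the FP16 round-trip loss:

```python
import torch

# FP32 weights cast down to FP16 and back: the round-trip is lossy,
# which is one reason text encoder outputs (and thus images) shift
# when an FP32 checkpoint gets loaded at FP16.
w = torch.randn(1024, dtype=torch.float32)
roundtrip = w.to(torch.float16).to(torch.float32)

max_err = (w - roundtrip).abs().max().item()
print(max_err)  # small but nonzero
```

The per-weight error is tiny, but it compounds through every layer of the encoder, which is why the final image visibly changes.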
>>106187649I'd need to see more examples to really care. Good to know though.
>>106187330>So with the lightx2v 2.2 workflow in the rentry, where do I set the virtual VRAM usage type thing like it was in the 2.1 workflow (the UnetLoaderGGUFDisTorchMultiGPU node)?You don't, instead you set the number of "blocks" you offload to swap. See picrel.
There are a total of 40 blocks in wan, and swapping 20 allows for 81 frames generated in 720p on a 24GB card.
If you send the whole 40 to swap, then you can go above at the price of longer generation time.
what's the best version of the rapid all in one wan? only just now moving from 2.1 to 2.2
>>106187523That prompt seemed to trip it up a bit. I tried cfg of 3.5, 4.5, and 5.5 with a batch of 2 seed 42. zipped all the attempts for comparison
https://files.catbox.moe/zi1948.zip
>>106187330>How do I plug loras into this? Do I just put them inline after the lightx2v loras (assuming I use a high/low lora)?Yeah, add a WanVideo Lora Select Multi and connect it behind the lightx lora loader with prev_lora. One for each lora loader.
>>106187330>What's the difference between the e4m3fn and e5m2 versions of the i2v models?e5m2 -> use with 3000 cards
e4m3fn -> use with 4000/5000 cards
>>106187668The image I posted may or may not have been cherrypicked but yeah, I guess this is still one area that needs work: prompts that mix text with multiple subjects. Thanks for the hard work, I really appreciated the time and energy you spent satisfying my curiosity.
lmao
A sleek, futuristic car interior from the driver's seat perspective, Ryan Gosling gripping the steering wheel with intensity, his face lit by the soft glow of a high-tech dashboard, driving at dusk on a winding road through a lush tropical island, a massive, eerie sign reading "EPSTEIN ISLAND" in bold, neon-lit letters looming ahead, surrounded by dense jungle and turquoise ocean views, vibrant sunset casting orange and purple hues, cinematic lighting, ultra-detailed, in a suspenseful, noir-inspired style, immersive and atmospheric.
asked grok to make a prompt of him driving a car on an island with a sign.
>>106187757Meant to quote
>>106187715>>106187668I wanted to post this separately.
https://www.ai-image-journey.com/2024/12/image-difference-t5xxl-clip-l.html
Use Q6_K or higher GGUF or FP8_scaled if you absolutely need to quantize your text encoders.
>>106187770revision, blue sky with clouds.
>>106187716Didn't know about the select multi node. I was putting a normal lora select before the lightx2v lora in the chain. Outputs were fucked beyond belief. They were sped up, and incoherent blobs.
>e4m3fn -> use with 4000/5000 cards
Good to know.
Last question. How does framerate factor into this one? I know before it output at 16fps, you'd interpolate to 32. But when Riflex was a thing, you'd do like 121 frames and output straight to 24fps. That still the same?
>>106187802>Last question. How does framerate factor into this one? I know before it output at 16fps, you'd interpolate to 32. But when Riflex was a thing, you'd do like 121 frames and output straight to 24fps. That still the same?
No idea for the interpolation part, I interpolate videos I like from 16 to 60 fps in Topaz instead of adding it to the wf. I don't think interpolation was added in the rentry wf, but I modified mine so I'm not sure.
>>106187774no worries mate
>>106187596I tried mate, but I just can't figure out how to get bite marks. If you want, share a prompt and I'll see if it makes it china enough for ya
>>106187831I think he means putting the prompt in Chinese and seeing if it does better. The dude is dumb for saying way too little about what he wants and expecting it to fall out of the sky magically.
>>106187848If that's the case there'd be no tells on whether it was gen'd via a Chinese text prompt or not. But heck, I'll try that too. I'll ask deepseek to translate the prompt and run it through
>>106187503Just gotta have it move a little!
>>106187867Man I really wanna get the large models up and running (without comfy). I can only run the 5B video model. Gotta look into loading the quanted models and edit the inference code to be able to load the ggufs. If I can't get it done in a week, I'll probably cave and install comfy
Noisy first frame. Used https://huggingface.co/Phr00t/WAN2.2-14B-Rapid-AllInOne V4 model
>>106187264got it with a slightly modified prompt
>Amateur photograph, a Japanese woman dressed as a maid, sleeping on the Tokyo Metro, her thighs are spread slightly apart and her panties are slightly visible.
>iPhone photo, 4K, Ultra HD.
took testing 2 seeds though. cfg 4.5, seed 43, steps 45
>>106187848>>106187265Got deepseek to translate the prompt, and this is the first seed output. 2nd one is baking.
cfg 4.5, seed 42, steps 45
original prompt
>This is a digitally drawn anime-style image featuring a Chinese warrior woman with hair buns, foggy round glasses with spirals on them, and wearing a blue dress with whale symbols all over it. She is seated at a wooden desk in a modern office setting. On the desk is part of a half-eaten hot dog and crumbs, the hot dog has a missing part that was bitten off and it's incomplete. She has a serious expression as she extends her right hand to shake hands with a person off-screen to the left. Likely an office colleague. Indicating a break or snack time. The desk is cluttered with various office supplies, including a pencil cup filled with colored pens and markers, a calculator, and a notebook. A Chinese flag is on the right side of the desk. A green potted plant is visible on the left side of the desk, adding a touch of nature to the otherwise busy workspace. The background features a large window with multiple panes, allowing sunlight to stream in and illuminate the room. Outside the window, lush green trees are visible, suggesting an office with a view of nature. The walls are adorned with bookshelves filled with neatly organized binders and books.
>>106187960seed 43
deepseek translation
>这是一幅数字绘制的动漫风格图像,描绘了一位中国女武士。她梳着发髻,戴着雾面圆框螺旋纹眼镜,身穿蓝色连衣裙,裙上布满鲸鱼图案。她坐在现代办公室的木桌前,桌上有一个被咬了一半的热狗和碎屑,热狗缺了一块,显然被咬过。她表情严肃,正伸出右手与画面左侧的屏幕外人物握手,可能是同事,暗示休息或零食时间。桌上凌乱地摆放着各种办公用品,包括装满彩色笔和马克笔的笔筒、计算器和笔记本。桌子右侧有一面中国国旗,左侧有一盆绿色盆栽,为繁忙的工作空间增添了一丝自然气息。背景是一扇多格大窗,阳光透过窗户洒进房间。窗外可见茂密的绿树,表明办公室外是自然景观。墙上装饰着书架,整齐地摆满了文件夹和书籍。
>>106187075it gives more movement but changes things from the original image, lmao
>>106187075were these using kijai loras?
What is lightning in the context of wan2.2?
>update comfy yesterday
>bricked everything
>fresh install, fresh nodes, updated from cuda 12.4 to 12.8 and updated to python 3.12, installed triton 3.3, sageattention 2.2.1
>old gens before the fresh install
>2 min
>new gens after fresh install
>10 min
Fuck sake. Does anyone know what's possibly happening here? Spent 4 hours today and got it to finally gen but it's slow as fuck
>>106188102damn this is a tricky one
>>106188129portable moment
>>106185803 (OP)>made the collage again neato :3
is the lightning workflow now the best workflow for i2v? does it beat kijai's workflow?
>>106188129Same shit here. New Comfy update is completely broken. I get warnings like this when I try to video gen.
>Lib\site-packages\torch\_inductor\utils.py:1436] [0/0] Not enough SMs to use max_autotune_gemm mode
>\ComfyUI\ComfyUI_windows_portable\python_embeded\Lib\site-packages\torch\_inductor\compile_fx.py:282: UserWarning: TensorFloat32 tensor cores for float32 matrix multiplication available but not enabled. Consider setting `torch.set_float32_matmul_precision('high')` for better performance.
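As an aside, the TensorFloat32 warning is only a suggestion, and enabling it is a one-liner using the standard PyTorch API the message itself names (nothing ComfyUI-specific assumed):

```python
import torch

# Opt in to TF32 matmuls for float32, as the UserWarning suggests.
# 'high' lets Ampere+ GPUs use TensorFloat32 tensor cores, trading
# a little float32 precision for a large matmul speedup.
torch.set_float32_matmul_precision('high')
print(torch.get_float32_matmul_precision())  # 'high'
```

It does not fix the "Not enough SMs" message, which is a separate check on the GPU itself.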
>>106188187>the final model is here
i thought v50 would be a 2nd high res epoch but it's a merge of the one high res epoch
>>106188183
>Not enough SMs
Means your gpu is too old https://arnon.dk/matching-sm-architectures-arch-and-gencode-for-various-nvidia-cards/
>49 and 50
Why even bother with 49?
>>106188208>Means your gpu is too old4070ti super. Not the newest or the best, but I think it should still work.
>>106188129
12.8 is for triton 3.4
>>106188183I went back to an older version of comfyui, an early july release when it worked. It works now but I can only imagine it has something to do with custom nodes (probably wavespeed or some shit).
In your case, you might have to copy the python folders:
C:\Users\ieatassallday\AppData\Local\Programs\Python\Python31(YOUR VERSION NUMBER)\include
C:\Users\ieatassallday\AppData\Local\Programs\Python\Python31(YOUR VERSION NUMBER)\libs
To your comfyui python embedded
C:\ai\yourcomfyuifolder\python_embeded\include
C:\ai\yourcomfyuifolder\python_embeded\libs
Then again, unironically chatgpt5 helped with my issue, yours could be different
>the final chroma model meant for others to train on is a slopmerge and not an actual trained epoch
smoothbrained dev
>>106188187Yes! Finally time to retrain my v44 loras
sage attention fixed for qwen WHEN?????????
>>106188255It is an actual epoch (v49), not two epochs though.
>>106188228I'm actually retarded, I had to replace a wan node for it to work and forgot to change the settings from 20 steps back to 4, kek
Backing up this install 5 times, fuck doing that again
>>106188247>I went back to an older version of comfyui, an early july release when it worked.
Which version is it? Have to try it if the python file copy doesn't work
Well I did some moar testing on this FP32 CLIP.
Seems to have potential imo. There is also another FP32 Illustrious CLIP floating around that I want to get around to testing.
Not shown here, I have tested another FP16 CLIP, which seems to have exhibited behavior more similar to the FP16 CLIP inside the model (don't have a huge experiment sample size for that, admittedly).
While whatever "restoration" this guy did also has a significant effect on quality most probably, I believe that the text encoder benefits from FP32 precision. Further evidenced by the change seen here when loading FP32 CLIP as FP16
>>106187649. (Conversely, the image also changes when you load the FP16 CLIP as FP32, not necessarily for the better or worse though)
>>106188353and what are you comparing on that pic?
>>106188349>https://github.com/comfyanonymous/ComfyUI/releases/tag/v0.3.44Yeah its always worth a revert. Others been having issues with the new update too, so I'm going to wait until further updates on a separate install
>>106188369I believe it is written rather in a self-explanatory manner at the top of the image.
>>106188378so fp16 is much better than fp32, thanks
>chroma-unlocked-v50.safetensors
Aaahhh shit, here we go again.
Say it with me guys. TWO MORE EPOCHS
>>106188383(You) (You) (You) (You) (You)
(You) (You) (You) (You) (You)
(You) (You) (You) (You) (You)
How do I prevent ComfyUI from eating all the RAM when changing loras? I've tried different "unload" and "free memory" nodes, toggling smart memory, but it still eats extra 20GBs and I have to restart manually. It's unbearable.
>>106188416download more ram
>>106188405Yes, it's finally done, now off to train loras!
>>106188423I already bought extra 32GBs just for wan.
>upgrade WAN to 2.2 in Comfy
>now every other gen crashes my PC and causes it to reboot
Alright
What the fuck is going on
I thought it was power spiking causing my 5090 to freak out at first but I've been monitoring the power usage and it draws less than gaming at peak load so that can't be it
And image generation still doesn't cause any issues
when adding the lightning i2v lora to kiji's workflow, do you have to change any weights or cfg?
>>106188441Does it only happen when you are using Comfy ? As in no problems when gaming etc ?
>>106188421
>Peach
Her gem is no longer floating on her chest and there is only one brick instead of them getting spammed in classic AI slop fashion even though the prompt says "a brick"
>Lara
Her hair is worse and so is her costume, arguably, but the background has improved
>Green hair girl
FP32 looks better
>Venom
No longer has the faded watermark and has more detail
>Peach and Rosalina
Better image. Rosalina no longer has a ghost arm and a deformed hand.
>Juri
Tossup, but you could argue the original is better
>Grey hair girl
Tossup, you might argue about the lack of background, but the prompt says nothing about it so it is not the text encoder's fault
>Samus
Pure tossup
>Wizard
The ONLY one where FP32 performed undoubtedly worse
>Zelda
Tossup, but I prefer the FP32 one.
So you have like 1 image where fp32 performed worse, the rest are either better or equal.
>>106186335I tried but ended up like
>>106186399 said, you need an already big community/a lot of followers on social media, or to have started back when 1.5 released, AND also play it safe/censor yourself from anything that would get you banned from patreon
>tried to get into comfy UI
>use WAN i2v
>generated a blurry mess
I'm now #redpilled against diffusion.
Fuck this.
>>106188529>he gave up after his first failure i bet thats gotten you far in life
>>106188535It works, it's just shit.
>>106188529wan2gp link in op. comfyorg is currently destroying their ui making it as unstable and uncomfortable as possible
what is the annealed chroma v50?
>>106188588It's a form of model optimization, from early tests this version seems the be the best
>gotoh hitori holding a guitar from the TV Anime bocchi the rock!
so this is the power of Chroma
Is there any reason why video generation has such wildly varying speeds? like sometimes it's only 30s/it and then next gen it's 80s/it.
>>106188602>Tsukasa Jun artbased
>>106188519by pairs
"holds a brick" in the positive - fp16 follows
"green swimsuit" in the positive - fp16 follows
"bra" in the positive - fp16 follows
etc
flawless fp16 victory
>looks worse/better
subjective, needs more samples, useless otherwise
>>106188608wow dude nice gen!
why does it do this with wan video gen
>>106188610the sampler has an effect. try out euler instead of unipc it's more consistent in gen times
OK I am convinced that it is an LLM instructed to troll now, well played but I am done with giving (You)s, cya.
>>106188670You are not gonna get far without posting workflow I think.
You are probably using a wrong node somewhere.
>>106188294my jewgyptian snow white wife
>>106188610go into nvidia settings and set this to Prefer No Sysmem Fallback so that Comfyui will stop trying to fuck you over with normal RAM gens.
for example you might open a side application that uses 1GB of VRAM and then your gens will start trying to use normal RAM which is slow.
>>106188731I mean I'm only using 28GB out of 32 on the vram department, so that's not really an issue.
>>106188588>>106188603
>chroma
Dev samefagging again. Stop shilling man, it's not funny, move on with your life lodestones. Chroma is dead, people here aren't using your furry shit. MOVE ON!
is it me or is chroma v50 more coherent but also more sloppy?
>>106188828Can you post some examples? Waiting for the gguf.
the day of cope has arrived
>>106186711I tried this and my outputs start turning red. How should I set the weights? Kijai has it 3 for high noise and 1 for low noise but I don't know if that works with the new lightning 2.2 loras.
so what happened to 1 month to cook at 1024 for chroma?
>Chroma still has jacked up hands.
Well, it was a good run fellas.
>>106188173smell ya later <3
2.1 lightx2v > 2.2 lightning lora
At least for anime. No doubt in my mind I'm getting better results with kijai's workflow using the old lora.
The new one clearly has more 3D-like motion, which doesn't work well for anime.
>>106188896You throw more GPU at it, then it goes faster
why are doomGODS always right?? hopetards continue to guzzle slop to the point of embarrassment. SDXL remains winning over 2 years later
>>106188441You can hard-lock your system if you mix up model datatypes. You might see them show up as "shape" errors. Anyway, I think what happens is the GPU locks up and falls off the PCIe bus, then the video driver gets rugpulled, and at that point your system is crashed and the kernel/windows reboots it.
>>106188901yeah it's pretty ass, I'll keep using lightx2v.
>>106188900since dubs, get one free<3
>LOVE & LOVE IS THE ONLY THING
>>106188187>>106188603Flux pro at home for free! Thank you lodestones
>>106189027Nice. So our universe sits in a dew drop, well, at least it's better than it all being a simulation.
>>106188903it went x3 faster than expected and both last versions came at the same time?
>>106189103This nigga is making collage bait!
>>106189090Different anon, also confused about 49+50 concurrent release. Though, I don't feel like it was getting any sharper after testing this every day https://huggingface.co/lodestones/chroma-debug-development-only/tree/main/staging_base_4
>>106189090AFAIK they didn't train the last epoch (v50) fully, instead they merged it with another high resolution training fork they had made.
So technically v49 is the 'true' release since it was a full 1024 resolution epoch and not merged with a fork. Of course the only thing that matters is which gives the best results.
>final chroma version
>anatomy still fucked
alright, what will be shilled next?
>chroma
this is a QWEN thread, poorfags!!!
https://huggingface.co/lodestones/Chroma1-HD/tree/main
its up, hands seem to be fixed at least
same with eyes on larger images
>>106186926That one is particularly nice.
yea, all small details seem to be fixed now, and the prompt following is great, maybe not quite qwen level, but its also not style locked like qwen is
So do I grab both versions of chroma?
>>106185951
2080TI will suffice for most picture genning up to Illustrious
>>106189309she has different eye color and ear piercing also long nails
>>106189286Looks great! What prompts did you use to get that cinematic look?
some nsfw ones
https://files.catbox.moe/txo33c.jpeg
https://files.catbox.moe/op2bl4.png
https://files.catbox.moe/655vue.png
>>106189304HOLY FUGGIN SLOPPA
>>106189341These aren't even made with the newest checkpoint though kek, doing it a disservice.
>>106189143Seriously, why is chroma often broken for most basic ass shit?
I am getting anatomy errors that weren't common in SD1.5 days.
Did they fuck up training params so much that they destroyed base model's knowledge?
This was such a wasted opportunity to become the next big thing in local genning.
Shame.
>>106189388can you show me an example? maybe you're using a bad sampler combo
>MFW we failed to solve the nogen negativity disease
So Chroma-annealed is just the new name for detail-calibrated then?
>>106189372
>it must be the NEWEST
>if i can recognize it, its SHIT!!
you are autistic and annoying as fuck bro
>>106189434I don't think so? I could not find any info on what that is
>>106189388because it was trained at 512x512 on a lobotomized version of the already underperforming flux schnell. not only did it have to re-learn basic coherence which was lost during de-distillation, it also had to try and learn new anatomy/tags on top. chroma is a foundational model project being trained on a SDXL finetune budget. he constantly tweaks things and merges things every other epoch. the dataset kept shrinking as the epochs were 'taking too long'.
anyone who was a veteran of the "resonance cascade" furfag failbake knew what to expect with this one. 'locking in' isn't a thing, you can tell by epoch 13 whether or not a model will sort itself out. chroma could've worked if it was trained normally on a bigger dataset with more compute, but compute (money) remains the ultimate moat keeping local NSFW finetunes from ever reaching their full potential.
>>106189286Nice, looks like a promotional film still from a movie ~2010
>>106189445Looks like he wants to keep it a mystery or maybe he'll explain it in the new model card.
>replying to yourself to sound "smart"
ooffff
>doomfags were right again
that's it, i'm subbing to midjourney
any checkpoints for making looping animations\gifs\webm???
apparently this is the TE to use?
https://huggingface.co/silveroxides/flan-t5-xxl-encoder-only-GGUF/blob/main/flan-t5-xxl-Q8_0.gguf
at least according to https://civitai.com/models/1825018/chroma-wf-done-properly
>>106189522WHERE AT BIG DAWG
QUICK someone besides the schizo make the fucking bake
>>106189434My guess is 50a is a 49 detail merge with a smidge of extra training at a lower LR
>>106189388>>106189458Obvious samefag, stop being so pathetic
Chroma is easily the best base model for photorealism and equally good as any other for art, and yes, despite being trained on a shoestring budget compared to its competition.
Like with every previous successful model, the potential comes with loras and finetunes, and this excels with loras already, super easy to train a person or style lora.
And of course it is uncensored, with understanding of genitals trained back in, and no mutilated nipples, so training NSFW loras for this will be a breeze.
>>106189589hes literally in the thread still dumbass
I will never not be smug about the failure of chroma.
>>106189397I delete most of the deformed slop I get but here are some that I forgot to:
https://litter.catbox.moe/kdmmggdk3wlmwaom.png
https://litter.catbox.moe/10q16jjat4sxih9w.png
https://litter.catbox.moe/258kgr18nsq123dz.png
>>106189458Thanks for the response.
More or less what I expected to hear.
>>106189571Stop being a schizo.
I think chroma CAN make good gens under some select circumstances but the overall package is too damaged to be worthwhile.
I can just gen NSFW on SDXL finetunes which are much faster and reliable.
Chroma does have the advantage of better text and prompting, but this is rather niche for NSFW.
Oh also I trained that qwen LoRA as a test and yeah it works but maybe I'm just fucking crazy but it like doubled inference time and the results were kind of meh. I probably way undertrained it though, only around 2000 steps.
>>106189628that is V48 though, try V50
also it looks like your using the wrong text encoder
>>106189123what an absolute clusterfuck of autism, but what can you expect from a furfag
>>106189418that is complete shit
>>106189628
>I can just gen NSFW on SDXL finetunes which are much faster and reliable
And you will be able to gen even better NSFW on Chroma loras and finetunes, do you not know the HUGE difference between SDXL and said finetunes?
Like SDXL and Flux etc, Chroma is a BASE model, it will not excel at specific things because it is a general model made to be extended through loras and finetunes
Nobody uses plain SDXL or Flux for anything, stop being retarded
here is my first try with chroma https://files.catbox.moe/6cg7to.png
>>106189270>-HDwhats the diff with v50?
>>106189685WOW surely you have a link to those chroma finetunes
>>106189643I EXTREMELY STRONGLY doubt all the problems magically went away in the last two epochs, but I eventually intend to check it out.
Also t5 xxl is indeed the correct text encoder, as mentioned on chroma's huggingface.
>>106189685I DON'T expect it to get all the fetishes and styles out of the box right
I DO expect it NOT to make unbelievably simple anatomy errors we don't deal with in any other model, and I DON'T expect any finetune or LORA to fix such grave, foundational problems.
>>106189708nta but you realize it just released today right?
>>106189717you realize that there are 45643216549 "BASE" models that have never gotten trained to be actually usable
this moronic over optimism is crazy, and it happens every time a model releases
>>106189708It came out 2 hours ago you absolute mong
How long do you think it took for SDXL to get finetunes ? Like holy shit how stupid are you ?
>>106189733All it takes to get great results on Chroma is to train a fucking lora, which you can do on low end cards like a 3060 in ~4 hours
>went from "chroma is the finetune flux needed!" to "chroma is just a base model for future finetunes!"
the absolute goalpost moving cope. 512x512 training killed the potential. even 20 epochs at 1024x1024 would've resulted in a better model. holy shit it's 2022-tier
>>106189748how long do you think its going to take for a model like chroma that is several times more expensive than sdxl then? by the time there is a team willing to drop a huge sack of money on it, there will be a new next gen meme
here is that anon's batman one on v50
>>106189787>great resultsoooffff
>downloaded wan 2.2 from the op, can i2v generate fine
>get a t2i prompt so i can then use those images to i2v for funsies
>get this after downloading all requirements for the t2i workflow
what am i doing wrong exactly
>>106189790???
All the finetunes on Pony are on significantly smaller datasets.
>>106189733chroma was usable months ago already, sis
>>106189733I dont get why you are so negative? since when did we ever get such a uncensored model? sd1.5 was the closest and that wasn't even close in how uncensored it was on top of the prompt following / quality difference
>>106189787Best trainer software for Chroma?
>>106189799show me 1 (one) finetune of pony that has meaningfully improved it
>>106189817nearly any of them?
there are characters\danbooru tags etc
have you been under a rock the last 1000 days?
>>106189826are you serious right now? There are like hundreds, all for different styles or content focuses. I think you're just retarded anon
>>106189788You know this is something I don't understand.
How did they expect it to work out? Asking this rather seriously.
Models learn to generate resolutions they are trained at. (With some wiggle room for nearby resolutions)
A 512x model will shit itself when trying to generate 1024x1024, conversely a 1024x model won't make a good 512x512 image.
If you mix both, instead of getting a model that can do both, you get a confused model that can do neither well.
>>106189822>>106189826a lora for a specific style or a character is NOT a finetune
>>106189817You mean like Pony Realism? The one everyone used for months?
>>106189814I've been using Diffusion-Pipe which works great, there's also AI-Toolkit and I think Kohya is adding support
Hopefully OneTrainer will get it as well, but it seems to be in a development hiatus
>>106189843for style it is, there are realism finetunes that are night and day, cartoon looking ones, 2d, 2.5d, there are ones trained on different prompting rules, one better at furry, ones better at animie , ones better at horror...
>>106189796wrong text encoder probably
>>106189843>goalposts movedjust admit you are wrong for once
>>106189855AI-Toolkit has been kinda ass for chroma and wan for me personally, dunno why.
>>106189878He can't because he'd have to admit he can't run Chroma on his mother's laptop.
>>106189882nta but I always liked diffusion pipe best, needs wsl2 though, here is help on installing it
https://civitai.com/articles/12837/full-setup-guide-wan21-lora-training-on-wsl-with-diffusion-pipe
>>106189882Hmm.. perhaps give Diffusion-Pipe a try, if you are on Windows you will have to run it through WSL2 though
>>106189872is there a decent t2i guide? i dont see one in the op and id love to play around with this shit while i work
>>106189805This guy has been hating on Chroma AND defending BFL for ages, samefagging like a madman
Enter conversations with him knowing that
im downloading chroma v50 I will compare the slops with qwen.
give me prompts I will do some runs
>>106189635What was the VRAM usage? Similar to inference? Guessing you used an H100?
>>106189923Make sure to do porn prompts so you can learn the real difference.
>>106189948qwen can half do breasts but that is about it. The main issue with qwen after a few days of using it is that it is overcooked: you will only get super samey images, though they look good. And good luck changing the style much
>>106189915im not whoever you think i am because i also dislike flux and the entirety of bfl
>>106189863>>106189845i will concede pony, im from the anime side and there was pretty much no progress until illustrious, which was hardly ideal
It seems good enough to me.
>>106189977
>pretty much no progress
you're wrong, pony realism was still night and day better till just super recently and now chroma, and I still use some pony tunes for certain styles instead of illustrious / noob
>>106189977What progress do you think can be made? SDXL is a very shitty architecture with one of the worst text encoders imaginable.
>>106190000
>if i see PONY I WILL CALL IT SLOPPAAAA!!
yep, its HIM.
>>106189970I haven't had style change issues, granted I've only tried 3DCG, pixel art, comic, anime, crayon and pencil drawings, and realism. It failed to do CCTV and grainy film footage but I probably didn't prompt well enough. Breasts were on average ok but depending on seed they get better. Genitals is just ugly bulges and weird shapes.
>>106190000Have you seen SDXL outputs or are you a Jeet and can't tell what's AI slop?
>>106189987yes and you can see all the annoying quirks of pony in those styles
>>106189992what progress do you think can be made with chroma? sure there will be lora styles and celebrities and smaller scale trainings, but i doubt anyone will pretty much retrain the entire model AGAIN to remove all its problems
>>106190012you can use depreciated models\software and still create art with merit anon
you are an autist
>>106190019Finetunes for style and coherence are significantly smaller with less than 10,000 images. A base model ultimately is throwing lots of shit into the model's knowledge and it's like creating lots of radio stations, a finetune locks in the signal of a specific radio station.
>>106190019BigASP guy already stated he is planning to use Chroma as a base for a finetune
>>106190065i think his trainings are very cool and the stuff he writes about them and his tools, but i dont think many people use them in practice
>>106190064from what i can tell from their discord lodestone used some sort of RL and step distilled it, im not so sure about the base model status
How do I make this motion more aggressive? It has the general fluidity of a punch; its just slow and weak as fuck
>>106190006now try getting different looking images with somewhat the same prompt and different seeds
>>106190065i thought he was still deliberating?
>>106190124she almost looks native alaskan kek
>>106190111It trains just fine.
>>106190153i can go make a lora for base sdxl or something as well, but that doesnt say shit about how larger scale training will go
1boy, 1girl, couple, hug from behind, hand on own chin, hands on another's waist, wariza, art by Incase, from above, limited palette, orange theme, color lineart, woven hatching, blue outline, muted color, slice of life, chiaroscuro, stage lights, acrylic paint (medium), rating:general
>>106190165"Large scale". Hate to break it to you anon, but no one does 100k+ image finetunes. They're using 150 synthetic images.
>use 2.2 guide
>turn animated previews on like it suggests
>no animated preview
what gives
>>106190175kek well yeah but then we have come full circle to what i was saying in the beginning, that it's not wise to expect a finetune to "save" chroma or any other """base""" model
>>106190181the 2.2 guide is honestly fucking garbage. the t2v workflow it provides doesn't even work with the stuff it makes you download.
>>106189970it's not overcooked, it's trained on long detailed prompts, so when you use only simple words it will result in the samey appearance
>>106190205Anon, your assertion on its face is retarded. You can use Chroma right now, it doesn't need to be "saved". And you're not the king of diffusion models, so you being unable to run Chroma (the real problem) is not my problem. It's funny though, you're the reason why they distill these models to produce a very strict range of images so you will clap like a retarded seal.
Ok gonna try chroma after only gooning with SDXL for the most part
How do I prompt it?
>>106190269r e n t f r e e
>>106190272Boomer prompt it to the max.
>>106190181Open the comfyui manager and there's an option for previews, set it to auto
>>106190272Don't forget to get t5 as well.
>>106190266well if i wanted silly meme images i can just use qwen or base flux with some lora, but i doubt it can make realism even close to wan or some pony finetune for porn or anime close to illustrious/noob or novelai
>>106190316> but i doubt it can make realism even close to wan or some pony finetuneOkay you're just trolling.
>A cinematic screencap from an action movie. Master Chief from Halo holds a radio to his face and the subtitle caption reads "Get the poorfags out of here!" The frame at the top shows him talking into a radio. Kirby can be seen floating in the background, he is a soft fuzzy round character. The shot is letterboxed. The background depicts a wartorn cityscape.Pick your favorite SDXL finetune.
soo do i pick the annealed version or not?
chroma1 hd fp8 where please and thank you
>sliding this trash off the catalog
>>106190255I did, I used the qwen prompt extender and changed a few sentences, still almost the same generation, which means the model is actually overtrained. It could maybe be fixed with a finetune, but just saying
>>106189923I am testing v50 annealed. One thing I notice is higher color saturation.
>>106190305I have t5 from flux days, I guess this hasn't changed?
chroma has been a great model for many epochs. its already better at most things than most models. anatomy issues can be somewhat alleviated with a higher cfg. the model however does need a realism LoRA to stabilize outputs.
>>106190328I think annealed means easier to finetune somehow? At least going by the definition, more malleable
>>106190327yes a nice meme image that is full of blur
>>106190334what do you mean anon?
Is huggingface chroma workflow good enough or is there something better?
>>106190366Where's that SDXL greyslop again? I assume you tried the prompt and gave up.
>>106190380how is he supposed to compare with 4GB of integrated graphics, anon?
>>106190374It should work fine, perhaps use min_padding 1 instead of the Comfy default 0
>>106190372>page 10fuck you ai niggers im bumping all other threads until you die
>>106190340It looks less responsive to a character Lora I have. I am going to try v50 and compare.
>>106190389>muh pissin contestyour 50 series will not get you laid anon and your gens are still shiddy\fardy\brappy
>>106190397What a sad life, such limp dick energy
>>106190405you being poor and angry hurts your chances thats for sure
>>106190405You know, because we're not Zoomers we don't need your approval to enjoy anything. You should try it sometime.
Ok tech retard here, I got venv for comfy with python 3.12, how can I update it to 3.13? Or do I have to redownload all of the packages?
>>106190345Nope. Uses same text encoder.
>>106190424Why the fuck are you upgrading Python?
>>106190424YES
Delete venv. Create a 3.13 venv. source venv/bin/activate. pip install -r requirements.txt
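The steps above as a minimal sketch. Assumptions: python3 stands in for whatever versioned binary you want (e.g. python3.13), the temp dir and empty requirements.txt stand in for your ComfyUI checkout and its real requirements.txt; site-packages are tied to the interpreter's minor version, so everything has to be reinstalled.

```shell
# Recreate the venv from scratch under the new interpreter.
cd "$(mktemp -d)"               # stand-in: in practice, cd into your ComfyUI folder
rm -rf venv                     # the old 3.12 venv can't be upgraded in place
: > requirements.txt            # stand-in: ComfyUI's real one pulls in torch etc.
python3 -m venv venv            # swap python3 for python3.13 if it's on your PATH
source venv/bin/activate
pip install -r requirements.txt
python -c 'import sys; print("venv python:", sys.version.split()[0])'
```

Custom nodes with their own requirements.txt need the same pip install pass again.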
>>106190171chroma (sloppy) cfg 4 seed 42.
>>106190447qwen.. i dont understand whats happening with chroma, I've used both the integrated comfy workflow and the one they have on the model card
qwen-image just keeps growing on me. It looks great at 50 steps cfg 3.5 ddim/ddim_uniform. It handles a complex prompt well.
I'm very much looking forward to loras and finetunes for this model.
>>106190418meanwhile you do nothing but hate on anon