Anonymous No.107054044 [Report] >>107057813
/ldg/ - Local Diffusion General
Discussion of Free and Open Source Text-to-Image/Video Models

Prev: >>107049284

https://rentry.org/ldg-lazy-getting-started-guide

>UI
ComfyUI: https://github.com/comfyanonymous/ComfyUI
SwarmUI: https://github.com/mcmonkeyprojects/SwarmUI
re/Forge/Classic/Neo: https://rentry.org/ldg-lazy-getting-started-guide#reforgeclassicneo
SD.Next: https://github.com/vladmandic/sdnext
Wan2GP: https://github.com/deepbeepmeep/Wan2GP

>Checkpoints, LoRAs, Upscalers, & Workflows
https://civitai.com
https://civitaiarchive.com/
https://openmodeldb.info
https://openart.ai/workflows

>Tuning
https://github.com/spacepxl/demystifying-sd-finetuning
https://github.com/Nerogar/OneTrainer
https://github.com/kohya-ss/sd-scripts
https://github.com/tdrussell/diffusion-pipe

>WanX
https://comfyanonymous.github.io/ComfyUI_examples/wan22/
https://github.com/Wan-Video

>Neta Yume (Lumina 2)
https://civitai.com/models/1790792?modelVersionId=2298660
https://nieta-art.feishu.cn/wiki/RY3GwpT59icIQlkWXEfcCqIMnQd
https://gumgum10.github.io/gumgum.github.io/
https://neta-lumina-style.tz03.xyz/
https://huggingface.co/neta-art/Neta-Lumina

>Chroma
https://huggingface.co/lodestones/Chroma1-Base
Training: https://rentry.org/mvu52t46

>Illustrious
1girl and Beyond: https://rentry.org/comfyui_guide_1girl
Tag Explorer: https://tagexplorer.github.io/

>Misc
Local Model Meta: https://rentry.org/localmodelsmeta
Share Metadata: https://catbox.moe | https://litterbox.catbox.moe/
GPU Benchmarks: https://chimolog.co/bto-gpu-stable-diffusion-specs/
Img2Prompt: https://huggingface.co/spaces/fancyfeast/joy-caption-beta-one
Txt2Img Plugin: https://github.com/Acly/krita-ai-diffusion
Archive: https://rentry.org/sdg-link
Bakery: https://rentry.org/ldgcollage

>Neighbors
>>>/aco/csdg
>>>/b/degen
>>>/b/realistic+parody
>>>/gif/vdg
>>>/d/ddg
>>>/e/edg
>>>/h/hdg
>>>/trash/slop
>>>/vt/vtai
>>>/u/udg

>Local Text
>>>/g/lmg

>Maintain Thread Quality
https://rentry.org/debo
Anonymous No.107054061 [Report] >>107054106 >>107054121 >>107054221
Imagine the kino level of shitpost if we really get suno 4.5 at home
Anonymous No.107054091 [Report]
>all chroma shit
>probably own gens too
kys OP frfr
Anonymous No.107054106 [Report] >>107054110
>>107054061
>rainbow
>woman avatar
....... is he cooking?
Anonymous No.107054110 [Report]
>>107054106
>woman
I think it's a man avatar, the hair is short
Anonymous No.107054121 [Report] >>107054155
>>107054061
If it was so easy suno would have been as good as udio long ago, but udio was always better, so I have my doubts that local can be as good.
Anonymous No.107054137 [Report]
>>107054015
gonna depend on the model, lora strength, captions etc. this is generally the reason why you use nonstandard words to invoke the lora to avoid confusion in the model.
Anonymous No.107054151 [Report]
Anonymous No.107054155 [Report]
>>107054121
this, suno is overrated as fuck, they always pretend it's at the same level as udio when it's definitely not
Anonymous No.107054162 [Report]
>caring about the fagollage
Anonymous No.107054175 [Report]
Pay debo no mind he's disabled
Anonymous No.107054183 [Report] >>107054206 >>107054244
>>107054047
https://vocaroo.com/1lhI4LNQojvT
Udio 1.0 of course.
Anonymous No.107054206 [Report] >>107054450
>>107054183
udio is amazing desu
Anonymous No.107054221 [Report] >>107054248
>>107054061
I listened to his samples, they are struggling to even reach the quality of YuE.
Anonymous No.107054227 [Report] >>107054322 >>107054339 >>107054350 >>107054367
So wan q8 gguf is like 95% as good as fp16 and a little faster?
Anonymous No.107054244 [Report]
>>107054183
meh
Anonymous No.107054248 [Report] >>107057648
>>107054221
show some of those samples here anon, I don't want to go to trooncord
Anonymous No.107054322 [Report]
>>107054227
with a little optimism and cope
Anonymous No.107054339 [Report]
>>107054227
yeah, the quality is really equivalent and it's 2x lighter in terms of size
Anonymous No.107054350 [Report]
>>107054227
I get almost the same speed with fp16 vs q8 on a 5090, I just block swap half the model.
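To spell out what block swap does: layers get streamed between CPU and GPU during the forward pass. A minimal sketch of the idea (names are made up; real implementations like kijai's wrapper hide the copies with pinned memory and async streams):

    import torch

    def forward_with_block_swap(blocks, x, blocks_to_swap):
        # keep the early blocks resident in VRAM; stream the rest in on demand
        threshold = len(blocks) - blocks_to_swap
        for i, block in enumerate(blocks):
            if i >= threshold:
                block.to("cuda")   # pull the offloaded block into VRAM
            x = block(x)
            if i >= threshold:
                block.to("cpu")    # evict it again so the next one fits
        return x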
Anonymous No.107054367 [Report]
>>107054227
Half the precision, half as good
But being able to run it makes you twice as blind
Anonymous No.107054430 [Report]
Anonymous No.107054444 [Report] >>107054448 >>107054509 >>107054550 >>107056231
Contrastive flow matching is tight.
Anonymous No.107054448 [Report] >>107054492
>>107054444
>Contrastive flow matching
what is that?
Anonymous No.107054450 [Report]
>>107054206
Yes, I really want to hope that local will catch up but that seems like a leap from nothing, not even SD, to a Dalle 3 tier music model. High quality manually captioned audio data is probably a must for such results, and then a really good DPO process.
Anonymous No.107054482 [Report] >>107054512 >>107054531 >>107054564 >>107054629 >>107056057 >>107057615
opensource emu3.5 with 32b, which according to the authors is supposed to be superior to nano banana in every way.
Looking at the sample images, I have my doubts about that.
Anonymous No.107054492 [Report] >>107054498
>>107054448
It's a new version of flow matching that encourages the model to find unique paths, which speeds up convergence, gives sharper results because it's not blending paths, and also encourages diverse results (because paths aren't blended).
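Roughly, a minimal sketch of the objective as I understand the paper (function names and the lambda value are mine, not canonical) - the negatives are just the velocities of mismatched noise/data pairs built by shuffling the batch:

    import torch
    import torch.nn.functional as F

    def contrastive_fm_loss(model, x0, x1, t, lam=0.05):
        # x0: noise batch, x1: data batch, t: timesteps in [0,1] shaped (B,1,1,1)
        xt = (1 - t) * x0 + t * x1        # point on the straight path
        target = x1 - x0                  # true velocity for this pairing
        pred = model(xt, t)
        fm_term = F.mse_loss(pred, target)
        # contrastive term: push the prediction away from wrong pairings
        neg_target = x1[torch.randperm(x1.size(0))] - x0
        contrast_term = F.mse_loss(pred, neg_target)
        return fm_term - lam * contrast_term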
Anonymous No.107054498 [Report] >>107054508
>>107054492
I see, and I guess you're using that method to make a lora right?
Anonymous No.107054508 [Report] >>107054518 >>107054524 >>107054530 >>107056231
>>107054498
I'm using the method to train a 1.5B model from scratch.
Anonymous No.107054509 [Report]
>>107054444
test finetune or what
Anonymous No.107054512 [Report]
>>107054482
its chinkslop. if its not good they lie and say it is. if it is good they make you pay for it. the good thing about china isnt that its better, its that its cheaper
Anonymous No.107054518 [Report] >>107054542
>>107054508
Tell us more. This sounds interesting.
Anonymous No.107054524 [Report] >>107054542
>>107054508
nice, how much faster is it compared to the previous method?
Anonymous No.107054530 [Report]
>>107054508
i believe in bigma
Anonymous No.107054531 [Report] >>107054546 >>107054547
>>107054482
even if it's true, 32b is just too big
Anonymous No.107054542 [Report] >>107054591 >>107054636 >>107056231
>>107054518
It's just a revision of the Pixart model I'm working on. 1.5B, Pixart architecture with the HDM mlp, and Ostris's 16 channel VAE.

>>107054524
Insanely fast compared to MSE, dropping like 0.02 loss per day, which means a full from-scratch model on a single 5090 in 50 days.
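(napkin math: that's a ~1.0 total loss drop at 0.02/day ≈ 50 days, assuming the rate holds linearly, which it won't exactly)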
Anonymous No.107054546 [Report]
>>107054531
>is just too big
Anonymous No.107054547 [Report]
>>107054531
32b for a model claiming to do everything they say it can is quite impressive actually. the problem is benchmarks don’t hold up against reality.
Anonymous No.107054550 [Report]
>>107054444
Based quads
Anonymous No.107054564 [Report] >>107054576 >>107054595
>>107054482
>according to the authors
And why should I believe authors this time? They lie as often as the common whore.
Anonymous No.107054576 [Report]
>>107054564
>And why should I believe authors this time?
you should never believe them, like everything you test it out yourself and see that 95% of the time it's a big nothingburger
Anonymous No.107054591 [Report] >>107054636
>>107054542
Pretty cool tinkering. What sort of database are you using?
Anonymous No.107054595 [Report]
>>107054564
My text consisted of two parts. Why are you asking me something that I answer in the second sentence?
Anonymous No.107054629 [Report] >>107054657
>>107054482
It's unfortunate that all we get from the Chinese are models on par with Seedream, but they are made out to be something more. I'll give them props for catching up to Seedream by using Seedream-based synthetic slop in conjunction with 4o though.
Anonymous No.107054636 [Report] >>107054642 >>107054706 >>107054725 >>107054938 >>107056231
>>107054542
Sexy loss graph. The new training run is switching from a 3d perlin noise Automagic warm up to AdamW.

>>107054591
I scraped millions of images from duckduckgo but I also have e621, danbooru and gelbooru (200k images). Then a lot of let's plays from Youtube for games. A few movie screencaps from stuff I have / famous movies. It's generally pop culture and art centered.
Anonymous No.107054642 [Report] >>107054659
>>107054636
so like the dark blue was the normal method and the light blue is the contrastive flow thing?
Anonymous No.107054655 [Report]
Weird how no local models can break into the top 10 anymore in arenas. Shame they fell so far behind
Anonymous No.107054657 [Report] >>107054907
>>107054629
Arguably, Qwen already did that though. We are stuck getting the same model over and over again. One slightly more fancy than the previous.
Anonymous No.107054659 [Report]
>>107054642
No, dark blue was me experimenting with an Automagic idea I had, which basically was activating layers and parameters with 3d perlin noise, much like how wind is simulated in a video game. Light blue is AdamW, which is much faster than the other optimizer, but I think the Automagic warmup was probably helpful, especially for forcing the layers and parameters to be randomly and usefully activated.
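No idea if this matches the actual Automagic variant, but the gist of noise-gated updates can be faked in a few lines (plain sine gating standing in for real 3d perlin noise, purely illustrative):

    import torch

    def noise_gated_step(params, lr, step):
        # crude stand-in: a smooth mask that drifts over steps, so different
        # parameter regions get activated/updated at different times
        for p in params:
            if p.grad is None:
                continue
            phase = torch.linspace(0, 6.28, p.numel(), device=p.device)
            mask = (torch.sin(phase + 0.01 * step) > 0).float().view_as(p)
            p.data.add_(p.grad * mask, alpha=-lr)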
Anonymous No.107054706 [Report] >>107054715 >>107054947
>>107054636
>I scraped millions of images from duckduckgo
Yandex used to be so good for this. Now it barely works as image search.
Anonymous No.107054715 [Report] >>107054727
>>107054706
Unfortunately all search results are fucking garbage now because AI bullshit is everywhere. I scraped early last year before the flood.
Anonymous No.107054725 [Report] >>107054736 >>107054902
>>107054636
retard here, what exactly does loss mean in this context?
Anonymous No.107054727 [Report]
>>107054715
I think to be sure you should scrape every image from before 2022, the last year before the AI flood
Anonymous No.107054736 [Report] >>107054980
>>107054725
the loss is basically the error between the image you trained on and the model's recreation of it; the goal is to get the lowest loss possible
Anonymous No.107054798 [Report]
TTS with voice cloning capabilities that you can set up using docker compose with the relevant options for your system; it comes with an API and GUI set up too.

https://github.com/devnen/Chatterbox-TTS-Server
Anonymous No.107054902 [Report] >>107054980 >>107055036
>>107054725
0 loss means the model’s prediction perfectly matches the objective, denoising an image or finding a flow. In practice this is catastrophic failure and memorization if you ever get to 0. Normally for diffusion models (e.g. SDXL) loss is how well the image is denoised. Models like Flux have loss based on finding paths in latent space to the image. Contrastive flow adds an additional objective that paths must be unique.
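For the SDXL case that's literally just noise-prediction MSE. A minimal sketch (schedule details omitted, eps-prediction assumed, names are mine):

    import torch
    import torch.nn.functional as F

    def denoising_loss(model, x0, t, alphas_cumprod):
        # add noise at strength t, then grade how well the model predicts it;
        # alphas_cumprod is the 1D schedule tensor indexed by integer timestep
        noise = torch.randn_like(x0)
        a = alphas_cumprod[t].view(-1, 1, 1, 1)
        xt = a.sqrt() * x0 + (1 - a).sqrt() * noise
        return F.mse_loss(model(xt, t), noise)   # hits 0 only on perfect recall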
Anonymous No.107054907 [Report]
>>107054657
you forgot slower every iteration
Anonymous No.107054935 [Report] >>107057278 >>107057327
Hello people, it's my first time using chroma. How can I get my pics to be higher quality? Higher steps or more negative prompts?
Anonymous No.107054938 [Report]
>>107054636
Cool stuff. It's a great learning experience.
Anonymous No.107054947 [Report] >>107055017
>>107054706
Yeah, quite sad what they did to it.
Anonymous No.107054969 [Report] >>107055046
>>107053778
3.5 Medium was stronger at the top end of its resolution range than the lower end TBQH, like generating at e.g. 1216x1600 would very often be way more coherent and better looking than 832x1216 on the same seed. So for a high res use case like this it might make sense especially if they've tuned it any past the base model. Attached pic is a native 1280x1536 one-shot Medium gen, for example.
Anonymous No.107054980 [Report] >>107055016
>>107054736
>>107054902
i see
interesting and makes me wish i wasn't a retard
Anonymous No.107055016 [Report]
>>107054980
It's not that complicated and all the smart people already did the math for you. I'm half a retard and I just experiment with bleeding edge research other people have already done. Really to get into any of this you just have to drop into it and be willing to sweat, a lot of this shit is just tedious churning, especially captioning. For example I'm working on finetuning Joycaption to be better which requires handcaptioning thousands of images.
Anonymous No.107055017 [Report]
>>107054947
It's an AI-filtered walled garden.
Speculation-wise, maybe Cuckflare and others restricted their web crawlers because of the geopolitical incidents. It's okay because Google can leech everything.
Anonymous No.107055036 [Report]
>>107054902
>Contrastive flow adds an additional objective that paths must be unique.
that's quite a smart idea when you think about it, it forces the model to not be lazy and work on every edge case
Anonymous No.107055043 [Report]
Anonymous No.107055046 [Report] >>107055061 >>107056557
>>107054969
And this one's 1600x1216
Anonymous No.107055061 [Report] >>107055096 >>107055103
>>107055046
illustrious is the peak of local diffusion. no other model comes close in terms of character knowledge, style knowledge, concept knowledge, and overall fidelity. Chroma is a disgusting blurry mangled mess, neta knows a fraction of the styles, qwen is a bloated stopgap that can’t even compete with seedream 3. SDXL is an absolute triumph and will likely not be surpassed for years
Anonymous No.107055077 [Report] >>107055080 >>107055697
Anonymous No.107055080 [Report]
>>107055077
depends on the scope of the finetune dataset. they'll probably manage to make the girls/boys hotter, among some other things. it's probably fixing the biggest popular issue in a bunch of months or so?

idk if anyone will get enough compute to train the boorus, real fashion/nsfw collections, cjk idols and so on.
Anonymous No.107055096 [Report] >>107055100
>>107055061
(You)
Anonymous No.107055100 [Report]
>>107055096
i think people actually will start to finetune it with ramtorch or w/e. it'll likely be quite slow.
Anonymous No.107055103 [Report] >>107055110 >>107056262
>>107055061
There are multiple small model projects now that have proven you can train a from-scratch DiT model for a couple thousand dollars. The fact is many people don't want to stick their necks out, especially if they're in North America or Europe.
Anonymous No.107055110 [Report] >>107055149
>>107055103
Not really. Multigpu is mainly used for higher batch sizes, not learning rate. A single gpu would only go ~4 times slower than the typical setup, and you would only rake in more donations by going slow and doing incremental releases. Recent papers have proven that with modern optimizers the results are no worse too.
Anonymous No.107055149 [Report] >>107055370 >>107056262 >>107057925
>>107055110
That's not what I said. There's nothing stopping you from taking HDM and scaling it 4x. The AMD 340m toy model was trained in 1.5 days. Pixart 600m is better than SD 1.5 and on par with SDXL. So you can assume a modern 2B DiT model could be trained in less than 30 days and be SOTA as a booru model.
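(napkin math: 2B is ~6x the 340m toy model's params, so ~6 x 1.5 days ≈ 9 days at the same data scale; the rest of the 30-day budget is headroom for a much bigger dataset and more steps. rough linear-scaling assumption, obviously)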
Anonymous No.107055167 [Report] >>107055184
>>107053799
>Udio partnership with UMG

https://desuarchive.org/g/thread/106957370/#106958310

So it has begun. First they will figure out which model is best, give their artists exclusive access to this model, and then give the general public a watered down version of the model, if the public gets a version at all.
Anonymous No.107055177 [Report] >>107057255
Sage attention 3 when.
Anonymous No.107055184 [Report]
>>107055167
Remember, Udio is no joke compared to everyone else. Going after them specifically is very strategic.
Anonymous No.107055277 [Report] >>107055283 >>107055541
Finished baking male vocals version of:
>>107053777
Lyrics I posted here already:
>>107053946
https://voca.ro/1muX2AJkOy1P
Anonymous No.107055283 [Report]
>>107055277
cringe
im waiting for acestep 1.5
Anonymous No.107055288 [Report] >>107055297 >>107055412 >>107055603
>"Udio is da best!!!11!!1"
Reality is picrel
You can make direct comparisons yourselves in the Music Arena if you want
Udio is only "better" if you are into gacha and generate parts of the song multiple times until it does something good, and Suno allows you to do something like that as well
Anonymous No.107055297 [Report] >>107058213
>>107055288
>mememarks
>udio v1.5
Anonymous No.107055321 [Report] >>107055359 >>107055483 >>107055531 >>107055650
I haven't been in the game for like a year now and I'm starting to plan a full system upgrade. Why is there no normal middle ground between 16GB and 32GB VRAM cards, nvidia?
I have a 2060 super, 8GB vram. My question is, assuming I just go for a 5080 or something for the 16GB, could I use it together with the 2060 to add up the VRAM and leave the rest to regular RAM? Or should I not bother and just make do with a 5080?
I could afford a 5090, I just can't help but feel like I'm being ripped off, and honestly more than anything else I'm worried about it melting and exploding or something. Is undervolting/underclocking a good idea?
Anonymous No.107055359 [Report]
>>107055321
Im retarded and forgot to mention that I want to get into video generation. the pastebin doesn't go too much into detail on multi-card drifting, for what it's worth
Anonymous No.107055370 [Report] >>107055524
>>107055149
I think the best possible way to train an anime model would be to first train your own captioner model that took tag lists along with an image as input, and interleaved the tags into proper sentences (in a way that would try to be grammatically correct but not necessarily to a fault) based on what it could actually see, while also adding spatial information where it could and where it made sense. Then you could just run that model on the Danbooru dataset directly with the original accurate tag lists for each image.

I'd also pick just a maximum resolution and proportionally downscale larger images to that if needed, but not *upscale* anything whatsoever, rather just bucketing everything at as close to the original upload res as possible. So the end result would be basically a fairly robust mixed-res model that could coherently do a wide range of resolutions rather than just focusing on one range.
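The downscale-only bucket picker is a few lines if anyone wants the gist (bucket list format and names are made up):

    def pick_bucket(w, h, buckets):
        # only consider buckets the source can fill without upscaling;
        # fall back to the smallest bucket for tiny images
        fits = [(bw, bh) for bw, bh in buckets if bw <= w and bh <= h]
        pool = fits or [min(buckets, key=lambda b: b[0] * b[1])]
        # pick the candidate whose aspect ratio is closest to the source
        return min(pool, key=lambda b: abs(b[0] / b[1] - w / h))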
Anonymous No.107055412 [Report] >>107055603
>>107055288
Udio 1.0 was best by far. They neutered the model after that. I've never seen a Suno gen on par with Udio 1.0 composition wise.
Anonymous No.107055418 [Report]
Anonymous No.107055483 [Report]
>>107055321
If you're patient you could wait for the 50 series super refresh as the 5080 super is supposed to have 24gb vram. Can't speak to multi gpu use, but 8gb seems rather abysmal and not worth the hassle especially when you consider you can offload to system memory.
As for the 5090, I have one and I've undervolted, overclocked it and have capped the power at 80% (460W) without any issues. In any case make sure to get at least 64gb of system ram if you're gonna gen videos.
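fwiw if you're on linux, the cap part is just something like: sudo nvidia-smi -pl 460 (doesn't survive a reboot without persistence mode; nvidia-smi -q -d POWER shows the allowed range). Undervolting proper needs a curve editor on Windows, or clock locking via nvidia-smi -lgc on linux as a rough equivalent.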
Anonymous No.107055524 [Report] >>107055797
>>107055370
How I'd do it is have the same image with three different captions:
- tags as seen on the booru site
- short description
- long description

Each caption really is a supported way for a user to prompt the model, and the model will naturally learn how to mix the different caption types. The problem we've now seen multiple times is people training on caption blob balls and forcing the model to be reliant on long captions if you want maximum output quality.
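If anyone wants to replicate it, this is basically just serving one of the parallel captions per image at load time (sketch with made-up field names; duplicating the image once per caption type amounts to the same thing):

    import random

    def pick_caption(entry):
        # every image carries all three caption styles; serve a random one
        # so the model stays promptable with tags, short or long text
        return entry[random.choice(["tags", "short", "long"])]

    entry = {
        "image": "12345.png",
        "tags": "1girl, straw hat, field, smile",
        "short": "a smiling girl in a straw hat standing in a field",
        "long": "a girl wearing a wide-brimmed straw hat stands in a sunlit field, smiling at the viewer...",
    }
    caption = pick_caption(entry)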
Anonymous No.107055531 [Report]
>>107055321
There's really no downside to undervolting a 5090: you lose ~5% performance and reduce power draw by ~30%.
Anonymous No.107055541 [Report] >>107055599
>>107055277
Bridges and outro are pretty weak https://voca.ro/12sNX01jyU6M
Anonymous No.107055599 [Report]
>>107055541
It's still pretty good if it's local.
E.g. in terms of slop.
Anonymous No.107055603 [Report] >>107055685 >>107056111
>>107055412
>>107055288
Suno very likely is trained on royalty-free music libraries like Audiojungle, which is why it sounds worse but also more "polished". I'm guessing Udio was trained on more copyrighted music even before the UMG deal, so it's more random but gives more interesting results.
Anonymous No.107055650 [Report]
>>107055321
3090's are pretty cheap
Anonymous No.107055685 [Report]
>>107055603
Old Udio was overfit on copyrighted stuff, if you input the same tags and lyrics some tracks had, you'd get nearly identical outputs
Anonymous No.107055697 [Report] >>107055752
>>107055077
Anonymous No.107055713 [Report] >>107056586
>>>/pol/520205588
How do I do this on my laptop?
Anonymous No.107055729 [Report]
>https://huggingface.co/nvidia/ChronoEdit-14B-Diffusers
anyone tried the new nvidia edit model?
Anonymous No.107055752 [Report] >>107056080
>>107055697
Anonymous No.107055797 [Report] >>107055828
>>107055524
I think your way might work if you literally swapped out the sets of captions for each image between epochs, probably better than slapping them all in one caption file
Anonymous No.107055826 [Report] >>107055860 >>107055951
Anonymous No.107055828 [Report]
>>107055797
That's what I mean, you duplicate the image for each caption type. And if you practice VAE jitter you prevent memorization.
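For anyone wondering, the jitter trick just means the cached latents get perturbed a little so the exact same target never repeats. A minimal sketch (sigma is arbitrary, and sampling the VAE encoder's posterior instead of taking its mean accomplishes the same thing):

    import torch

    def jitter_latents(latents, sigma=0.02):
        # tiny gaussian perturbation of the cached VAE latents; breaks exact
        # repetition of targets, which is what drives memorization
        return latents + sigma * torch.randn_like(latents)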
Anonymous No.107055860 [Report]
>>107055826
Nice. Maybe more... try dynamic angle, rim light. Really good.
Anonymous No.107055921 [Report] >>107056239
Anonymous No.107055946 [Report]
does infinite talk work with wan2.2?
Anonymous No.107055951 [Report] >>107055982 >>107056008
>>107055826
Anonymous No.107055982 [Report] >>107056008
>>107055951
Love it. What's the model? I should try and gen something related to this.
Anonymous No.107056008 [Report] >>107056018
>>107055951
>>107055982
Qwen
Anonymous No.107056018 [Report]
>>107056008
Cinematic Redmond had these vibes. Cool that Qwen can be grainy too. Of course it's probably pretty stiff but that's what they all are.
Anonymous No.107056057 [Report]
>>107054482
>emu3.5
HF links are all dead
I can test it if the models are available somewhere
Anonymous No.107056080 [Report]
>>107055752
I can't make her sit on the pig, but it's still funny
Anonymous No.107056110 [Report]
Is there a way for Librewolf (flatpak) to remember its last directory? In Linux Mint.
It's somewhat tiring to use ComfyUI and I need to open a file dialog to traverse all the way up from /home/ to my work mount...
Anonymous No.107056111 [Report] >>107056121
>>107055603
I don't think they're prompted the same. Udio has a better understanding of music, that shows because with a good prompt it destroys almost any Suno song I've ever heard

https://www.udio.com/songs/2bXYLKaVDyVwi1GAb6pSkR

This is a very hard song
https://www.udio.com/songs/7zrLreMnwCYrdBqQkGtEXM

The musical depth I've witnessed out of this model truly is insane. Unprecedented connection between lyrics and musical notes. It has mastered vocals and intonation in a way Suno has not.

Using high quality copyrighted music in conjunction with whatever royalty-free music is available for training the model is the way to go.
Anonymous No.107056121 [Report] >>107056281
>>107056111
Yeah, that's just subjective to people who have never played any instrument in their lives.
Anonymous No.107056151 [Report] >>107056670
Do any of you have a recommendation for generating videos for a music video? I'm looking for 16:9, some kind of 35mm grain/look, mostly still shots but with some travelling too. Theme is urban 90's/2000's workers working daily shifts.

I only know of Veo 3 so far, and looking at Runway.
Anonymous No.107056217 [Report]
so has the copyrightpocalypse finally started
Anonymous No.107056231 [Report] >>107056241
>>107054444
>>107054508
>>107054542
>>107054636
Are you using TREAD like HDM too? And have you considered going VAE-less using SVG (or at least using EQ-VAE like HDM)? https://arxiv.org/pdf/2510.15301

if not, you are missing out on huge speedups
Anonymous No.107056239 [Report] >>107056256
>>107055921
this is a very nice gen
what model did you use?
Anonymous No.107056241 [Report]
>>107056231
TREAD is much harder to implement if you want the real speed up. The 16-channel VAE I'm using has been EQ'd yeah.
Anonymous No.107056256 [Report] >>107056349
>>107056239
NovaOrangleXL_v120
I'm just testing my linux installation, I deleted all my previous models. Don't have noob or anything else.
Anonymous No.107056262 [Report]
>>107055149
>>107055103
THIS, we are on the cusp of home baked local SOTA.
Anonymous No.107056281 [Report] >>107056305
>>107056121
Udio literally just got acquired by the largest music label. That should tell you all you need to know.
Anonymous No.107056305 [Report]
>>107056281
I don't know, really.
Anonymous No.107056317 [Report]
Test
Anonymous No.107056339 [Report]
Anonymous No.107056342 [Report]
Anonymous No.107056349 [Report] >>107056389
>>107056256
thank you anon, hows your linux experience going?
Anonymous No.107056375 [Report] >>107056415 >>107056597
Yume hands do work but you have to describe them in the prompt, treat it more like an LLM
Anonymous No.107056376 [Report]
/iemg/ lore, you wouldnt get it
Anonymous No.107056382 [Report]
Anonymous No.107056389 [Report] >>107056405 >>107056432
>>107056349
Yeah, well, I'm an experienced faggot but I wouldn't recommend it to normal people. Even with the most common interfaces, it's been 20 years and they still can't get a file save dialog right.
I have used IRIX and it never had these issues.
Like save a file from Cum and it defaults to some ~/.
Open a file...
It's great if you are a developer but for a normal person just use Windows.
I feel like Linux environments have gone backwards since I last used them 15 years ago.
Anonymous No.107056405 [Report] >>107056432
>>107056389
Flatpak browser does not remember the save file location from Cum.
This is what I mean.
I need to browse in 5+ deep to just get to the directory I want.
Anonymous No.107056410 [Report]
Anonymous No.107056412 [Report] >>107056440
remember that guy that was training a model and said his image of a brown splotch for the prompt "a woman" was 80% of the way there?
Anonymous No.107056415 [Report] >>107056421
>>107056375
Okay, then what's the optimal total prompt length in your experience?
Anonymous No.107056421 [Report] >>107056455 >>107056490
Text can be consistent with small phrases from the looks of it, and it is really sensitive to artist. The wrong tag will completely fuck everything, which lends credence to it needing more training.
>>107056415
I never pay attention to that; it doesn't seem to matter from my testing.
Anonymous No.107056432 [Report] >>107056441
>>107056405
>>107056389
im a linux user, what distro r u on?
im on debian and the default file save dialog in brave/mullvad remembers locations (actually not sure about mullvad because it has crazy settings, but firefox did work) for extensions
what environment are u using? i use dwm so less ram/vram is used
Anonymous No.107056440 [Report] >>107056451
>>107056412
how's your dataset going? still 0%?
Anonymous No.107056441 [Report] >>107056458
>>107056432
Yeah, comparing distros is like comparing dicks. I think I'm using Librewolf and it's a flatpak - this explains why it does not remember the directories.
Anonymous No.107056451 [Report] >>107056465
>>107056440
how's your training going 2 years later? still at 80% there?
Anonymous No.107056455 [Report]
>>107056421
>hands do work but you have to describe them in the prompt
Sounded like your approach is to really bloat the prompt with specifics, but I guess I misunderstood.
Anonymous No.107056458 [Report] >>107056491
>>107056441
try firefox or brave, or look for settings to disable forgetting save directory in about:config
flatpak is likely the issue because of muh sandboxing
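you can also punch a hole in the sandbox so the portal picker can start where you want, something like: flatpak override --user --filesystem=/path/to/your/workdir io.gitlab.librewolf-community (check the exact app id with flatpak list). no guarantee it fixes the remembering, but at least the dir becomes reachable in one click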
Anonymous No.107056465 [Report] >>107056476
>>107056451
it doesn't take much philosophy to understand you don't do anything and thus won't achieve anything, don't put your insecurities of failure on me, thanks :)
Anonymous No.107056476 [Report] >>107056497
>>107056465
yup, still 80% there confirmed lmao
Anonymous No.107056490 [Report] >>107056511
>>107056421
do you switch the system prompt like how the guide says to help with text or? i have yet to really fuck with text on it
Anonymous No.107056491 [Report] >>107056620
>>107056458
Nah, your advice is just like any of the useless non-tech advice - changing distro or even browser does not accomplish anything. If it works it works, and if it does not there is a way to fix it, but sure as hell it is not by reinstalling my disks.
Anonymous No.107056497 [Report] >>107056518
>>107056476
actually that doesn't mean anything, for all you know I've already released a model, but what we both know is in 2 years you don't have anything except a bitter attitude
it's truly funny I'm living rent free in your brain though
Anonymous No.107056511 [Report]
>>107056490
I treat it like chroma, and I also use my own system prompt. I'll look into the guide again, but pigeonholing it to anime only doesn't do much for me
Anonymous No.107056518 [Report] >>107056537
>>107056497
for all anyone knows you haven't released anything lmao
Anonymous No.107056525 [Report]
Anonymous No.107056537 [Report] >>107056560
>>107056518
Feel free to explain why anyone would ever attach any of their professional work to 4chan. No one releasing a model with their name attached to it would ever link it on 4chan if they wanted to be taken seriously.
Anonymous No.107056557 [Report]
>>107055046
One more
Anonymous No.107056560 [Report] >>107056567
>>107056537
how convenient
my locally trained 300B model is going great too
Anonymous No.107056567 [Report] >>107056585
>>107056560
The only thing you're developing is suicidal thoughts.
Anonymous No.107056581 [Report] >>107056596 >>107056650
Anonymous No.107056585 [Report]
>>107056567
let's see a 1girl gen, bro, I'm sure it will be great bro, two years of improvement bro
Anonymous No.107056586 [Report]
>>107055713
This is brilliant desu. Now they just need to make it uncensored so that a guy jerking off shows a girl fingering her pussy. Then bye bye thots, any guy can become an OnlyFans whore.
Anonymous No.107056596 [Report] >>107056640
>>107056581
Great stuff.
Anonymous No.107056597 [Report] >>107056603 >>107056649 >>107057111
>>107056375
That's not really true at all for Yume 3.5 IMO, you can absolutely even booru prompt it straight up as long as you leave the Gemma boilerplate in properly, in my experience. The more likely issue for some people is that the generally recommended sampling configs are both not that good; DPM++ 2S Ancestral at 4.5 to 5.5 CFG gives massively better results most of the time for me. It is slower though.
Anonymous No.107056603 [Report]
>>107056597
Oh I forgot to say, that's with Linear Quadratic.
Anonymous No.107056614 [Report] >>107056687
https://huggingface.co/lightx2v/Wan2.2-Distill-Models/tree/main

new models, anyone test?
Anonymous No.107056620 [Report] >>107056652
>>107056491
did saving files with librewolf remember?
Anonymous No.107056635 [Report]
Lustify is pretty good for off-topic gens.
Anonymous No.107056640 [Report] >>107056650 >>107056679
>>107056596
ty
Anonymous No.107056644 [Report]
Anonymous No.107056649 [Report]
>>107056597
sadly that sampler is not in neo forge for some reason but I have good luck with DPM++ 2M
Anonymous No.107056650 [Report] >>107056819
>>107056640
>>107056581
huh? aren't these just film stills?
Anonymous No.107056652 [Report]
>>107056620
It does remember the last directory for images, but with cum ui it does not.
Anonymous No.107056666 [Report]
Anonymous No.107056670 [Report] >>107056708
>>107056151
>Any of you has a recommandation to generate videos for a music video? I'm looking for 16:9, some kind of 35mm grain/look, mostly still shots but with some travelling too. Theme is urban 90's/2000's workers working daily shifts.

>I only know of Veo 3 so far, and looking at Runway.
This is the local general so I'll give you advice for a model you can run on a GPU on your home computer

Your only real option for cinematic stuff is Wan 2.2 or 2.1 with the MoviiGen lora. I would recommend trying a 2.2 workflow + that Lora at 720 using a 5090, or FusionX if you choose to use 2.1

If you don't need the lack of censorship of WAN and you have money to spend, I'd just use runway for this. Higgsfield AI may also be interesting to you because they have specific stuff for music videos
Anonymous No.107056674 [Report] >>107056688 >>107056698
Anonymous No.107056679 [Report] >>107056709 >>107056819
>>107056640
I am going to give a you tip:
watch Bram Stoker's Dracula (90's) and take a couple of screenshots, there's Lucy and all that. Then img2img them. That'll be great.
Anonymous No.107056683 [Report]
Anonymous No.107056687 [Report]
>>107056614
>more i2v
I sleep
Anonymous No.107056688 [Report]
>>107056674
i wish this were me right now
Anonymous No.107056690 [Report] >>107056726
Kek, I was playing the Suno side and thought local already caught up somehow

https://levo-demo.github.io/

Very disingenuous demo
Anonymous No.107056698 [Report]
>>107056674
You should put him in a van. And make him go.
Anonymous No.107056708 [Report]
>>107056670
Cool thanks
Anonymous No.107056709 [Report]
>>107056679
>Bram Stoker's Dracula
The best shots are a couple of still frames from inside the film, not these tiktok screenshots etc.
Anonymous No.107056726 [Report] >>107056742 >>107056890
>>107056690
Kek, who trained this model? It spits out Adele unprompted?

https://levo-demo.github.io/static/audio_sample/overview/04_en.mp3

It might be good. How come I've never heard of it.
Anonymous No.107056732 [Report] >>107056744
Anonymous No.107056742 [Report] >>107056773
>>107056726
Is there a reason why you feel the need to talk about an unrelated subject in the thread when it can exist in its own thread with actual documentation we can grab from the OP and all use?
Just seems odd you can't do that instead
Anonymous No.107056744 [Report]
>>107056732
bigger
Anonymous No.107056751 [Report]
Anonymous No.107056773 [Report]
>>107056742
There's no comfy workflow, and the model seems like some experimental half trained model, what is there to talk about?
Anonymous No.107056799 [Report]
Anonymous No.107056819 [Report] >>107056836
>>107056650
Remember this scene?

>>107056679
It's a decent movie but I would unironically remake all Keanu Reeves dialogue with AI
Anonymous No.107056836 [Report]
>>107056819
bro dont upload that scary ass shit here
Anonymous No.107056867 [Report] >>107056915 >>107056983
I'm getting closer
Anonymous No.107056890 [Report] >>107056904 >>107057924
>>107056726
Tencent is actually training their own music model.

https://huggingface.co/tencent/SongGeneration

>TODOs
>Release SongGeneration-v1.5 (trained on a larger multilingual dataset, supports more languages, and integrates a Reward Model with Reinforcement Learning to enhance musicality and lyric alignment)

And the data is so copyrighted it spits out Adele unprompted as you can see on their demo. That is wild, with Qwen doing the same, my faith in China has been restored.
Anonymous No.107056904 [Report] >>107056926 >>107056935
>>107056890
What does this have to do with image diffusion?
Anonymous No.107056915 [Report] >>107056946
>>107056867
to killing urself? never been happier for u
Anonymous No.107056926 [Report] >>107056946
>>107056904
There is no music thread. It's either here or /lmg/, the only two places we can discuss local models.
Anonymous No.107056930 [Report]
Anonymous No.107056935 [Report] >>107056946
>>107056904
this is local diffusion general, we accept video and audio related content here.
Anonymous No.107056943 [Report]
i for one welcome our music gen brothers
Anonymous No.107056946 [Report] >>107056984
>>107056935
>Discussion of Free and Open Source Text-to-Image/Video Models
>>107056915
>>107056926
You revealed yourself go back to your containment thread
Anonymous No.107056958 [Report]
Anonymous No.107056983 [Report] >>107056987
>>107056867
closer to approaching the quality of a quantized 2gb illustrious model? maybe
Anonymous No.107056984 [Report] >>107056987 >>107057002
>>107056946
Comfy has audio models. We should be allowed to discuss anything comfy adopts as long as it is local.
Anonymous No.107056987 [Report]
>>107056983
>>107056984
You're so fucking pathetic dude
Anonymous No.107057002 [Report]
>>107056984
Besides, good audio models are pivotal for video. Since Sora 2 it's not the muted-audio era anymore, the SOTA has changed, so all discussion on audio research is welcome.
Anonymous No.107057004 [Report]
uh oh, melty
Anonymous No.107057066 [Report] >>107057119 >>107057157
I'll give the NetaYume shill this, the model requires a whole lot of gacha but at least it has some actual variation in its outputs.
Anonymous No.107057081 [Report]
Im running ComfyUI and following the guide; ive been playing around with the hand and face detailer. Is there an equivalent for feet/toes? Id like to be able to fix those too.
Anonymous No.107057093 [Report]
Anonymous No.107057111 [Report] >>107057320
>>107056597
Some are better than others, clearly, but IMHO much of sampler/scheduler choice is subjective. The latter moreso than the former in my estimation.
Anonymous No.107057112 [Report]
Anonymous No.107057119 [Report] >>107057137
>>107057066
You haven't taken any steps to learn the model and it shows. Why not explore something before going on multi-day complaints?
Anonymous No.107057137 [Report] >>107057145
>>107057119
I barely post in this thread, you're tilting at the wrong windmill friend. And I'm saying I like the model, I get better results out of it for the particular thing I'm prompting than I get out of the other boomerprompt models.
Anonymous No.107057145 [Report] >>107057165
>>107057137
Anything to show?
There has been this constant wave of anons that complain about this model but don't post anything. I know you're just wasting time but take your low skill ass to one of the other threads
Anonymous No.107057155 [Report]
Anonymous No.107057157 [Report] >>107057165
>>107057066
>the model requires a whole lot of gacha
Describe the poses/gestures better
Anonymous No.107057165 [Report] >>107057172
>>107057157
"Face and proportions that don't look weird"
>>107057145
>take your low skill ass to one of the other threads
OK
Anonymous No.107057172 [Report] >>107057334
>>107057165
Fuck off now thanks!
Anonymous No.107057175 [Report] >>107057196
Anonymous No.107057196 [Report]
>>107057175
illustrious 2gb?
Anonymous No.107057206 [Report] >>107057391
why is netayume so sloppy bros??
Anonymous No.107057213 [Report]
*yawn*
Anonymous No.107057227 [Report] >>107057303
>Mindbroken because hen ever made anything good in his life
Anonymous No.107057255 [Report]
>>107055177
it's already there
Anonymous No.107057278 [Report]
>>107054935
both
Anonymous No.107057303 [Report]
>>107057227
damn you melting so hard you cant even spell yumebro
Anonymous No.107057317 [Report]
https://www.youtube.com/watch?v=xboXFT46XSo
Anonymous No.107057320 [Report]
>>107057111
DPM++ 2S Ancestral is pretty objectively better than Res Multistep, at least for details like hands and text, using Linear Quadratic for both
Anonymous No.107057327 [Report] >>107057391 >>107057457
>>107054935
You typically don't need more than 25 steps. Most of my 50 step outputs have been either a sidegrade or even a downgrade in terms of quality.
Don't forget that chroma can gen pics above 1024 dimensions.
Anonymous No.107057334 [Report] >>107057352 >>107057355
>>107057172
It's clearly the same fairly bad troll as yesterday, he's blatantly ragebaiting
Anonymous No.107057352 [Report] >>107057408
>>107057334
yeah I agree fellow yumebro, theres totally not a vast majority of people that find this model trash
Anonymous No.107057355 [Report]
>>107057334
It's the same retard from the rentry, he spends his entire life doing this for years and is just reduced to a bitter faggot.
Anonymous No.107057391 [Report]
>>107057206
kek yeah that poster was an idiot
>>107057327
nice
Anonymous No.107057408 [Report]
>>107057352
You're right, there's in fact not a vast majority of such people
Anonymous No.107057457 [Report] >>107057474 >>107058352
>>107057327
NTA. Your pic is neat af. This is also 25 steps
Anonymous No.107057474 [Report] >>107057505
>>107057457
Oh, this is neat too, how's radiance compared to DC-2K?
Anonymous No.107057497 [Report] >>107057589
Anonymous No.107057505 [Report]
>>107057474
>How's radiance compared to DC-2K?
Couldn't tell you, but I loved the 2k debug ones. There's still a lack of blending the macro pixels but it's mostly good
Anonymous No.107057515 [Report] >>107057589
Anonymous No.107057521 [Report] >>107057530 >>107057589
Anonymous No.107057530 [Report]
>>107057521
Cinematic Redmond is great.
Anonymous No.107057553 [Report]
Anonymous No.107057588 [Report]
Anonymous No.107057589 [Report] >>107057609 >>107057686 >>107057713
>>107057497
"The lighting is even with no strong shadows." compared to "Cinematic lighting, dark background, deep shadows, detailed skin. Sharp HDR."

>>107057515
>>107057521
very cool
Anonymous No.107057591 [Report]
Anonymous No.107057595 [Report]
Anonymous No.107057609 [Report]
>>107057589
https://www.youtube.com/watch?v=ZEWGyyLiqY4
Anonymous No.107057615 [Report] >>107057690
>>107054482
>32b
Mostly useless for local. Viable for use with quantization, especially nunchaku, but LoRA training will be a nightmare, and a model without low cost LoRA training is pointless beyond ten minutes of novelty use.
Anonymous No.107057648 [Report] >>107057829
>>107054248
We haven't seen any samples from the final pretrained model, but this is how it sounded as it was training

J-pop song
https://vocaroo.com/19CHG4V410OP

Some pop song
https://vocaroo.com/1i7OjKcLbmnO

Some opera song
https://vocaroo.com/1f64Fkmpn9Ax

Idk, maybe with the SFT phase it'll catch up to where it needs to be, but those outputs are very underwhelming. Just a bit concerning, but I don't know jack shit about these models.
Anonymous No.107057655 [Report] >>107057838 >>107057980
Slowly getting it together, still need to learn composition better
Anonymous No.107057657 [Report] >>107057671 >>107057672
What was that feature of comfyui that was being advertised a while ago where you bundle a bunch of nodes together and then you can re-use that as one node?

did this ever actually happen?
Anonymous No.107057671 [Report] >>107058085
>>107057657
subgraphs? didn't really change anything and was kind of a letdown. the node implementation in general is lacking too much and everything done to the front end has been lipstick on a pig
Anonymous No.107057672 [Report] >>107058085
>>107057657
subgraphs?
they're pretty great to clean up wf and only see what you actually need to see
Anonymous No.107057677 [Report] >>107057718
>https://huggingface.co/meituan-longcat/LongCat-Video
>We introduce LongCat-Video, a foundational video generation model with 13.6B parameters, delivering strong performance across Text-to-Video, Image-to-Video, and Video-Continuation generation tasks. It particularly excels in efficient and high-quality long video generation, representing our first step toward world models.
Anyone tried it? Works with KJ wanvideowrapper
Anonymous No.107057686 [Report]
>>107057589
I had an antiquated gpu. But jesus, the boost even SDXL has gotten in terms of noise... Sounds like faggotry.
Anonymous No.107057690 [Report]
>>107057615
>LoRA training will be a nightmare
ostrisai's trainer has supported 3bit quants for a while now.. wouldn't that be sub-16gb? https://xcancel.com/ostrisai/status/1953933728948121838
Anonymous No.107057700 [Report]
anyone knows if infinite talk works with wan2.2, or is it just for 2.1?
Anonymous No.107057713 [Report] >>107057744
>>107057589
what model is this?
Anonymous No.107057718 [Report] >>107057731
>>107057677
some onions have been trying it out. doesn't look that much different from context window jerkiness after every 5 seconds
Anonymous No.107057731 [Report]
>>107057718
>onions
anons filters to onions sometimes or something? the more you know I guess
Anonymous No.107057744 [Report] >>107058315
>>107057713
Chroma-DC-2K-T2-SL4-Q8_0
Anonymous No.107057754 [Report]
Anonymous No.107057788 [Report] >>107057810
>python?
>no, that shit is gay
Anonymous No.107057810 [Report]
>>107057788
based chink
Anonymous No.107057813 [Report] >>107057823 >>107057836 >>107057838
>>107054044 (OP)
slowly but surely, mistakes were made, just need to adjust values
Anonymous No.107057823 [Report] >>107057922
>>107057813
Thank you Ran. You wanted some attention.
Anonymous No.107057829 [Report]
>>107057648
It's impressive to see the vocals don't sound anywhere near as robotic as original ACE-Step though. If they catch up to Suno 4.5 maybe there's a chance of getting Udio tier kino now and then.
Anonymous No.107057836 [Report]
>>107057813
you can't adjust values if you are worthless
Anonymous No.107057838 [Report] >>107058031
>>107057655
>>107057813
the painted nails are nice
Anonymous No.107057860 [Report]
Anonymous No.107057861 [Report] >>107057893 >>107057915
So far I've been using the Wan 2.1 workflow from the rentry but wanted to try out 2.2 from here: https://civitai.com/models/1818841/wan-22-workflow-t2v-i2v-t2i-kijai-wrapper (2.2 I2V)
Why isn't it recognizing the vae? Everything looks correct to me, straight dragging the vae output from the loader to the decoder doesn't do anything either
Anonymous No.107057881 [Report]
Anonymous No.107057893 [Report] >>107058287
>>107057861
it's not connected to the decode node, pull the string from the vae loader to the decode node to connect them
Anonymous No.107057912 [Report]
https://www.youtube.com/watch?v=Gu3TAuw3ZJ8
Anonymous No.107057915 [Report] >>107058287
>>107057861
If it's not bait, as it probably is, it can still be useful for newfags: use the example workflow instead and just load the correct models:
https://raw.githubusercontent.com/Comfy-Org/workflow_templates/refs/heads/main/templates/video_wan2_2_14B_i2v.json
Anonymous No.107057917 [Report] >>107057937
1girl
Anonymous No.107057922 [Report] >>107057950 >>107057959
>>107057823
*MrCatJak
Anonymous No.107057924 [Report]
>>107056890
need them to train a speech model with emotion prompting so we can be freed from the dead end known as vibevoice
Anonymous No.107057925 [Report]
>>107055149
That's what I want to hear.
Start a group, delegate simpler tasks to me, such as some manual captioning, and I'll contribute $250 toward training.
The only catch is that you share the training process and I get to ask a few technical questions.
We can find 20 others; there are plenty of interested people out there.
I don't care if it's a failure.
Anonymous No.107057937 [Report]
>>107057917
I wonder if krea has a buttchin obsession too
Anonymous No.107057949 [Report] >>107057988
Anonymous No.107057950 [Report]
>>107057922
You want to suck off people.
Anonymous No.107057959 [Report]
>>107057922
What took you so long, please offer your asshole.
Anonymous No.107057980 [Report]
>>107057655
hot
Anonymous No.107057988 [Report]
>>107057949
is that...
Anonymous No.107058024 [Report] >>107058217
elongated 1girl
Anonymous No.107058031 [Report] >>107058049 >>107058068 >>107058104
>>107057838
Thanks, I think I'm getting the hang of this model now. The hardest part is finding the right blend of tags for a presentable image, followed by adjustments; starting to feel like 60 steps is the magic number with this model. I wish neoforge had all the samplers, I don't know why he took some away.
One thing I like with this model is I can game due to how little vram it uses compared to chroma.
Sorry, but I have a dedicated sperg that hates me and has been holding a grudge for years; just ignore him
Anonymous No.107058049 [Report]
>>107058031
?
Anonymous No.107058068 [Report]
>>107058031
I respected you for years but not any longer. Seems like you are just spiteful.
Anonymous No.107058084 [Report] >>107058092
>disabled retard noises
Anonymous No.107058085 [Report] >>107058130 >>107058160
>>107057671
>>107057672
Thanks, I've started using subgraphs but I can't figure out how to make all subgraphs reflect each other's changes when I edit one of them. Any ideas? I would expect them to work like Scenes in Godot.
Anonymous No.107058092 [Report]
>>107058084
Why do you refer to yourself in the 3rd person?
Anonymous No.107058104 [Report]
>>107058031
>ran wanted to come out
He manages to spit out a narcissistic rant.
Anonymous No.107058130 [Report] >>107058193
>>107058085
clone it
Anonymous No.107058160 [Report]
>>107058085
>I would expect them to work like Scenes in Godot.
there are a lot of expectations from modern nodegraphs and comfyui ducks up 90% of what's standard
Anonymous No.107058193 [Report] >>107058215
>>107058130
ah, I re-cloned and the clone is working now! I guess I must've cloned too early before, or there was a bug, which caused my clones to become unique (and no longer clones).
Anonymous No.107058204 [Report]
Anonymous No.107058213 [Report]
>>107055297
>pleeeeeease novel ai, i need the model files, my local model is kinda noisy
Anonymous No.107058215 [Report]
>>107058193
if you duplicate you get separate entities, and if you clone you get tied ones
Anonymous No.107058217 [Report]
>>107058024
Buffy x slenderman
Anonymous No.107058274 [Report] >>107058307
Yeah I need to make loras for this model, it's the boost I needed. It should also be pretty fast compared to training chroma
Anonymous No.107058287 [Report]
>>107057893
doing that just crashed comfyui
>>107057915
not bait, I'm just a bit of a brainlet when it comes to this but your workflow works fine, thanks
Anonymous No.107058307 [Report]
>>107058274
Netayume is fucking garbage, holy shit
Anonymous No.107058315 [Report] >>107058355
>>107057744
>Chroma-DC-2K-T2-SL4-Q8_0
nta, nice gens, with lora?
Anonymous No.107058318 [Report]
Netayume is fucking trash and just having it write some text that looks like its done in paint doesnt make it redeemable

Chroma for complex stuff and illustrious for hentai is the way to go
Anonymous No.107058339 [Report]
uh oh meltie
Anonymous No.107058352 [Report]
>>107057457
Shit I didnt notice you replied to me

>25 steps is enough
Thanks for the heads up boss, chroma fp16 with fp16 text encoder doesnt run all that slow on my 5060ti 16gb if I keep it under 30 steps
Anonymous No.107058355 [Report]
>>107058315
Yeah, uploading to civitai right now
Anonymous No.107058370 [Report] >>107058403
does using this node lead to loss in quality?
Anonymous No.107058403 [Report]
>>107058370
not anything visible
Anonymous No.107058408 [Report] >>107058441
You can still download Udio songs on the fly as 320kbps btw, just downloaded a couple of bangers. No need to record or anything like that.
Anonymous No.107058432 [Report] >>107058444 >>107058469 >>107058474
can a pitfag rate this pit for me
Anonymous No.107058441 [Report] >>107058499
>>107058408
from what I read they're limited to 192kbps mp3
I'm getting everything in bulk I saved there
Anonymous No.107058444 [Report]
>>107058432
pits fine but smaller boobs would be more harmonious
Anonymous No.107058469 [Report]
>>107058432
6/10
I prefer mine like this
Anonymous No.107058474 [Report]
>>107058432
Its nuts that i immediately spot a netayume pic every time since it looks so off
Anonymous No.107058482 [Report]
Fresh

>>107058480
>>107058480
>>107058480

Fresh
Anonymous No.107058499 [Report]
>>107058441
Yeah dunno, it's quite strange.

Was able to download a few of them at 320kbps with fetchv, like https://www.udio.com/songs/hoCg4BmayTYXcJfjo4jvbT

But other ones are only 192kbps. Maybe for some reason some of them stream at 320kbps, while other ones don't?
Anonymous No.107059859 [Report]
I see people making AI images of trump and stuff like that.
But some of the stuff is definitely better than others.
Is there any way to set it up such that whatever prompt I give, the character is strictly that one character?
I mean, not just simply typing in the name of the character but making it more realistic?
Like, in a way such that even when I make anime or caricature images, it seems like some professional artist drew that based on the likeness of the person?
I don't know much about the loras and such, that's why I ask.