
Thread 107145378

325 posts 142 images /g/
Anonymous No.107145378 [Report] >>107145664 >>107147044 >>107147869 >>107153733
/ldg/ - Local Diffusion General
Immaculate Creativity Edition

Discussion of Free and Open Source Text-to-Image/Video Models

Prev: >>107135438

https://rentry.org/ldg-lazy-getting-started-guide

>UI
ComfyUI: https://github.com/comfyanonymous/ComfyUI
SwarmUI: https://github.com/mcmonkeyprojects/SwarmUI
re/Forge/Classic/Neo: https://rentry.org/ldg-lazy-getting-started-guide#reforgeclassicneo
SD.Next: https://github.com/vladmandic/sdnext
Wan2GP: https://github.com/deepbeepmeep/Wan2GP

>Checkpoints, LoRAs, Upscalers, & Workflows
https://civitai.com
https://civitaiarchive.com/
https://openmodeldb.info
https://openart.ai/workflows

>Tuning
https://github.com/spacepxl/demystifying-sd-finetuning
https://github.com/Nerogar/OneTrainer
https://github.com/kohya-ss/sd-scripts
https://github.com/tdrussell/diffusion-pipe

>WanX
https://comfyanonymous.github.io/ComfyUI_examples/wan22/
https://github.com/Wan-Video

>Neta Yume (Lumina 2)
https://civitai.com/models/1790792?modelVersionId=2298660
https://nieta-art.feishu.cn/wiki/RY3GwpT59icIQlkWXEfcCqIMnQd
https://gumgum10.github.io/gumgum.github.io/
https://neta-lumina-style.tz03.xyz/
https://huggingface.co/neta-art/Neta-Lumina

>Chroma
https://huggingface.co/lodestones/Chroma1-Base
Training: https://rentry.org/mvu52t46

>Illustrious
1girl and Beyond: https://rentry.org/comfyui_guide_1girl
Tag Explorer: https://tagexplorer.github.io/

>Misc
Local Model Meta: https://rentry.org/localmodelsmeta
Share Metadata: https://catbox.moe | https://litterbox.catbox.moe/
GPU Benchmarks: https://chimolog.co/bto-gpu-stable-diffusion-specs/
Img2Prompt: https://huggingface.co/spaces/fancyfeast/joy-caption-beta-one
Txt2Img Plugin: https://github.com/Acly/krita-ai-diffusion
Archive: https://rentry.org/sdg-link
Bakery: https://rentry.org/ldgcollage

>Neighbors
>>>/aco/csdg
>>>/b/degen
>>>/b/realistic+parody
>>>/gif/vdg
>>>/d/ddg
>>>/e/edg
>>>/h/hdg
>>>/trash/slop
>>>/vt/vtai
>>>/u/udg

>Local Text
>>>/g/lmg

>Maintain Thread Quality
https://rentry.org/debo
Anonymous No.107145408 [Report]
>INB4 schizo anon
Anonymous No.107145413 [Report]
desu dere were gud bideos in previous
Anonymous No.107145477 [Report] >>107145519 >>107146841 >>107148341 >>107148452
>mfw Resource news

11/07/2025

>ComfyUI-SeedVR2_VideoUpscaler Version 2.5.0
https://github.com/numz/ComfyUI-SeedVR2_VideoUpscaler#-updates

>Nvidia cosmos 2.5 models released
https://github.com/nvidia-cosmos/cosmos-predict2.5
https://github.com/nvidia-cosmos/cosmos-transfer2.5

>Tortoise and Hare Guidance: Accelerating Diffusion Model Inference with Multirate Integration
https://yhlee-add.github.io/THG

>Text to Sketch Generation with Multi-Styles
https://github.com/CMACH508/M3S

11/06/2025

>InfinityStar: Unified Spacetime AutoRegressive Modeling for Visual Generation
https://github.com/FoundationVision/InfinityStar

>Decoupling Augmentation Bias in Prompt Learning for Vision-Language Models
https://github.com/Gahyeonkim09/AAPL

>Comfyui-Resolution-Master Release v1.5.0
https://github.com/Azornes/Comfyui-Resolution-Master/releases/tag/v1.5.0

11/05/2025

>BindWeave: Subject-Consistent Video Generation via Cross-Modal Integration
https://huggingface.co/ByteDance/BindWeave

>Black-Box Membership Inference Attack for LVLMs via Prior Knowledge-Calibrated Memory Probing
https://github.com/spmede/KCMP

>GPU Benchmarks: Performance comparisons for AI image generation with open source models
https://www.promptingpixels.com/gpu-benchmarks

11/04/2025

>Stability AI largely wins UK court battle against Getty Images over copyright and trademark
https://abcnews.go.com/amp/Technology/wireStory/stability-ai-largely-wins-uk-court-battle-getty-127164244

>VideoSwarm v0.5.2
https://github.com/Cerzi/videoswarm/releases/tag/v0.5.2

>UniREditBench: A Unified Reasoning-based Image Editing Benchmark
https://maplebb.github.io/UniREditBench

>Vote-in-Context: Turning VLMs into Zero-Shot Rank Fusers
https://github.com/mohammad2012191/ViC
Anonymous No.107145497 [Report] >>107145536
oh and i asked both claude 4.5 and kimi k2 thinking and they both agreed that "FP16 with fast accumulation should preserve model quality better than Q8 quantization"

>>107145441
>Feel free to provide sufficient counter examples.
okay nigger i will literally just run a script right now and write a whole rentry for you, give me 2 hours to definitively prove this and we will both come out more learned
Anonymous No.107145519 [Report]
>>107145477
wrong thread
Anonymous No.107145536 [Report] >>107145551 >>107146740
>>107145497
>oh and i asked both claude 4.5 and kimi k2 thinking
HAHAHAHAHAHAHA
Well played if you are baiting though.
>okay nigger i will literally just run a script right now and write a whole rentry for you, give me 2 hours to definitively prove this and we will both come out more learned
Have fun!
Anonymous No.107145542 [Report]
Blessed thread of frenship
Anonymous No.107145551 [Report]
>>107145536
>HAHAHAHAHAHAHA
midwit take, both would mog you on an IQ test anyways. first let's check fp16 fast versus fp16 baseline
Anonymous No.107145664 [Report]
>>107145378 (OP)
Nice collage
Anonymous No.107145844 [Report]
Anonymous No.107145863 [Report]
>gen image of dick girls
>make them kiss with wan
Anonymous No.107146121 [Report] >>107146258
>>107145397
thank you but something is not working with these settings. Same seed as >>107145014
Q8 is still downloading, maybe that will improve it
Anonymous No.107146180 [Report] >>107146287 >>107146300
AniStudio should be in the OP.
Anonymous No.107146242 [Report]
Anonymous No.107146258 [Report]
>>107146121
Appropriate version of the lora should be loaded for both models. Ditto for the NAG shit (Honestly I wouldn't mess with it before you figure out how to run it properly) and model sampling (though I guess if you are going to leave it at 1 it may not be necessary)
Use Strength 1 for the lora.
Anonymous No.107146287 [Report] >>107146365
>>107146180
fuck off
Anonymous No.107146294 [Report]
Anonymous No.107146297 [Report] >>107146314
Anonymous No.107146299 [Report]
Anonymous No.107146300 [Report] >>107146921
>>107146180
it should. maybe ani will have more of a reason to keep working regularly on it rather than schizos ruining the thread if he so much as gives an update. also very shameful people eat up python dogshit instead of helping get an actual fucking application made. most of you don't even belong on /g/. fucking embarrassing
Anonymous No.107146314 [Report] >>107146334
>>107146297
Anonymous No.107146334 [Report]
>>107146314
>ancient Egypt Twitter screencap thread
I see this flavor of shit post has been around for a long time
Anonymous No.107146339 [Report] >>107146631 >>107146824
Anonymous No.107146365 [Report]
>>107146287
Why are you like this?
Anonymous No.107146367 [Report] >>107146397 >>107146916
Anonymous No.107146395 [Report]
>local dumptruck general
impressive, very nice
Anonymous No.107146397 [Report]
>>107146367
hnnnnnng sauceeeeee
Anonymous No.107146515 [Report] >>107149547
Anonymous No.107146544 [Report]
Anonymous No.107146589 [Report] >>107148330
Anonymous No.107146608 [Report] >>107146631 >>107146642
I'm thinking of building a PC with 9800X3D, 5070Ti and 64GB RAM. Is it good for AI image generation? What about video?
Anonymous No.107146631 [Report]
>>107146608
you will be able to generate a video like >>107146339 in around 3-4 minutes with a setup like yours

if you're going to buy 64 gb of ram i might as well try and convince you to go up to 96gb or even 128gb. you never know when you'll want to run the next big thing and you'll need more than 64gb of ram for it, and all DRAM manufacturing is reserved so prices are only going to go up for the foreseeable future
Anonymous No.107146642 [Report] >>107146647 >>107146764 >>107148330
>>107146608
you need more than 16 GB of VRAM. VRAM is the most important spec.
Anonymous No.107146647 [Report] >>107146659
>>107146642
5090 would be the other option, but it's expensive...
Anonymous No.107146659 [Report]
>>107146647
get a used card if you have to, or even R9700.
Anonymous No.107146670 [Report] >>107147994
oh shiet it's converging
Anonymous No.107146740 [Report] >>107146976 >>107147051
>>107145536
FP16 T5_XXL is 0.03% better than Q8_0, while being 8% slower

So your claim that "Fast will rape it more than Q8 lol." is demonstably false.

I am REALLY fucking impressed with how close the difference is

https://pastebin.com/vt0Q4hLr

https://rentry.org/t5_xxl-q8-versus-fp16fast

code:
https://pastebin.com/AGQ8ghgp
Anonymous No.107146764 [Report] >>107146782
>>107146642
How would 16GB VRAM limit show on genning?
Anonymous No.107146782 [Report]
>>107146764
>How would 16GB VRAM limit show on genning?
if you want to load a model with 17 billion parameters at q8, that's 8 bits per parameter. which is one byte per parameter. 17 billion bytes is 17GB oh shit nigger you're out of VRAM!

WAN video is about 14 billion parameters, any language model you might want to run could be anywhere from 12 billion to 400 billion, you also have to load the text encoder when doing image or video gen and who knows what else
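here's the napkin math as a script if you want to plug in other models. rough sketch that only counts the weights (activations, attention buffers, and the VAE all stack on top of this):

# billions of params x bytes per param ~= GB of weights
# q8 is really ~1.06 bytes/param once you count the GGUF block scales
BYTES_PER_PARAM = {"fp16": 2.0, "fp8": 1.0, "q8": 1.0, "q4": 0.5}

def weight_gb(params_billions, fmt):
    return params_billions * BYTES_PER_PARAM[fmt]

# wan 2.2 14B plus a t5_xxl-class text encoder (~4.7B), both loaded
for name, p in [("wan 14B", 14), ("t5_xxl", 4.7), ("mystery 17B model", 17)]:
    print(f"{name}: ~{weight_gb(p, 'q8'):.1f} GB at q8, ~{weight_gb(p, 'fp16'):.1f} GB at fp16")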
Anonymous No.107146785 [Report] >>107146824 >>107146895 >>107146896
I’m thinking new pc time, I wanna gen videos
How’s amd for AI gen stuff, I can get more vram with those for similar prices but I don’t know if they have other issues (12gb nvidia vs 16gb amd)
Anonymous No.107146824 [Report] >>107146834 >>107146895 >>107154403
>>107146785
>How’s amd for AI gen stuff
nope. nope nope nope nooooo

they don't have cuda so you're not genning videos with wan with sage attention which means you're waiting 2 hours per video on a 7900 xtx

get a 5070ti and you can make >>107146339 in 3-4 minutes with 32gb of ram or 2-3 minutes with 64 gb of ram
Anonymous No.107146834 [Report] >>107146895
>>107146824
sorry i forgot a 7900xtx is like 40% as strong as a 5090 so you're waiting 6 hours per video lmao
Anonymous No.107146841 [Report]
>>107145477
gtfo ldg d*b# go join trannii and troonnffy on the containment thread
Anonymous No.107146862 [Report]
Anonymous No.107146895 [Report] >>107146907 >>107154403
>>107146785
they're fine if you use linux, which you should be using anyway.

I use a 7900 XTX and can gen chroma, qwen, and wan. It's about as fast as a 3090. Heavily considering replacing it with an R9700 though. that has to be one of the best value cards for AI in existence right now.
https://www.phoronix.com/review/amd-radeon-ai-pro-r9700/2

>16GB AMD card
16GB will be painful and limiting no matter what brand you get. Fine for SDXL and Lumina though.

>>107146824
>>107146834
This is FUD.
Anonymous No.107146896 [Report]
>>107146785
the best thing you can do with an amd card for AI is to sell it so you can purchase an nvidia card
Anonymous No.107146907 [Report]
>>107146895
for the price of your 7900 xtx he should just get a 5070ti. its not fud. your wan speeds on AMD are atrocious, if they weren't you would have posted them
Anonymous No.107146908 [Report] >>107146936 >>107147421
I know this thread is for images, but I was wondering if there are similar tools for music generation? So far I've only seen stuff like Suno which barely give you any control on the output
Anonymous No.107146916 [Report]
>>107146367
seconding the catbox link please
Anonymous No.107146921 [Report] >>107147095
>>107146300
so you're saying we should advertise unfinished, broken software in the OP so that the loser dev maybe will work on it instead of dilating? that's wild
Anonymous No.107146936 [Report]
>>107146908
>I was wondering if there are similar tools for music generation
there are. ace-step is the best local one. it's garbage compared to suno or udio. udio is the best overall but it's on a website and just got pozzed by the music mafia

allegedly the chinese are coming out with a great new music generator soon but only one of our resident namefag schizos is keeping up to date with that so i have no new information for you at this time
Anonymous No.107146976 [Report] >>107147021 >>107147051
>>107146740
is this slop? please explain to a retard how Q8 takes more VRAM than FP16 when it's half the file size?
Anonymous No.107147016 [Report]
I have a 4080. Is there a guide out there for combining it with another card? 16gb is driving me crazy.
Anonymous No.107147021 [Report] >>107147068
>>107146976
that's probably from me quantizing it and unquantizing it, you can ignore that stuff

or you can latch onto that because I proved definitively that fp16 fast is better than q8 for t5 specifically
Anonymous No.107147044 [Report] >>107148049 >>107148070
>>107145378 (OP)
Where the fuck is the Wan rentry guide? Why the fuck it got removed from the OP links?
Anonymous No.107147051 [Report] >>107147094
>>107146740
Thanks for the data.
I will be honest, I am completely skeptical about this AI generated experiment, precisely because some results show literally ZERO difference between normal fp16 and fp16 fast, which is very much not the case when genning images or videos. Even if you were to agree that the quality is very high or better than Q8, simply put, zero difference is nonsense and implies something is amiss in the AI slop code.
>[FP16 Fast vs FP16 Baseline]
> Cosine Similarity: 0.999999
> (std: 0.000000, min: 0.999999)
> MSE: 0.00e+00
> MAE: 3.43e-05
Why is MSE zero while MAE is non-zero?
Perplexity is also much higher on the ostensibly better fp16 fast??? (Even your AI notes that lower values should be better)
I am sure I could find more holes if I was more /lmg/ pilled.
>>107146976
Check rentries lol yes it is.
Anonymous No.107147068 [Report] >>107147094
>>107147021
Ok. I'm not the anon you were arguing with. Just got confused there for a second because I thought the whole point of quants was fitting more of the model in vram.
Anonymous No.107147086 [Report] >>107148028 >>107148048
I'm on 1660S 6GB, how much better can I gen pics like this if I go for 5070Ti? What about 5090?
Anonymous No.107147094 [Report] >>107147180
>>107147051
i inverted the perplexity formula, and MSE printing as zero while MAE is non-zero is just a significant digits thing: the per-element diffs are so tiny that their squares fall below what fp16 can even represent. let me do another test just for you sweetie
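here's a minimal sketch of that underflow, assuming the squared diffs get computed in fp16 like the model outputs (that part is my guess, audit the pastebin code to confirm):

import numpy as np

# per-element differences on the order of the reported MAE (~3.4e-05)
diff = np.full(24576, 3.4e-5, dtype=np.float16)

# 3.4e-5 is representable in fp16, but its square (~1.2e-9) is far below
# fp16's smallest subnormal (~6e-8), so every squared diff flushes to 0.0
mae = np.abs(diff).mean(dtype=np.float32)   # ~3.4e-05: survives
mse = (diff * diff).mean(dtype=np.float32)  # exactly 0.0: squares underflowed
print(mae, mse)

so MSE = 0 alongside MAE > 0 isn't a contradiction, it's fp16 running out of exponent range before the mean is even taken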

I'm basing my tests off of how ComfyUI-GGUF uses the GGUF models btw

>>107147068
>I thought the whole point of quants was fitting more of the model in vram
not necessarily the whole point, it also lets you run the model faster even if you had the vram for better precision
Anonymous No.107147095 [Report] >>107147195
>>107146921
aren't all the UI options unfinished and unstable slop?
Anonymous No.107147180 [Report] >>107147191
>>107147094
What is this meme about?
Anonymous No.107147191 [Report]
>>107147180
>what is this meme about
pedophiles who like to not get caught use android (graphene os, probably)
Anonymous No.107147195 [Report]
>>107147095
yes and one link is an advert for a company stagnating the space
Anonymous No.107147209 [Report]
Anonymous No.107147302 [Report] >>107151833
alright so i think I didn't do anything wrong with the 1.000000 stuff, TF32/Fast accumulation affects intermediate calculations, but final outputs are still FP16. So differences accumulate through many layers but remain really small due to FP16 rounding

either way I shared the code

or alternatively I just mog the fuck out of you and fast fp16 accumulation should always be turned on lol. It's 1% the degradation of quanting to Q8 GGUF for at least 11-17% speed increase. Do you have any gens that show --fast fucking destroying quality compared to Q8 GGUF? Maybe I have to look into how ComfyUI implements --fast because maybe there's something going on in his implementation idk
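for reference, here's roughly what "fast accumulation" means in pytorch terms, a minimal sketch (whether comfy's --fast maps to exactly these flags is my assumption, go read his source):

import torch

a = torch.randn(4096, 4096, dtype=torch.float16, device="cuda")

torch.backends.cuda.matmul.allow_tf32 = False  # make sure the reference is true fp32
ref = (a.float() @ a.float()).half()           # fp32-accumulated reference

# the "fast" switches: TF32 tensor cores for fp32 matmuls, and letting
# fp16 matmuls accumulate in reduced precision instead of full fp32
torch.backends.cuda.matmul.allow_tf32 = True
torch.backends.cudnn.allow_tf32 = True
torch.backends.cuda.matmul.allow_fp16_reduced_precision_reduction = True
# newer torch (2.7+) also exposes torch.backends.cuda.matmul.allow_fp16_accumulation

fast = a @ a                         # may now take the reduced-precision path
print((ref - fast).abs().max())      # tiny but nonzero, same story as the stats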
Anonymous No.107147374 [Report] >>107147950
Anonymous No.107147416 [Report] >>107148049
[FP16 Fast vs FP16 Baseline]
Cosine Similarity: 0.99999946
(std: 0.00000031, min: 0.99999880, max: 0.99999972)
MSE: 0.000000e+00
RMSE: 0.000000e+00
MAE: 3.433228e-05
L2 norm difference: 8.006096e-04
Max difference: 9.765625e-04
Relative Error: 0.027374%
SNR: inf dB (higher is better)

Element-wise analysis:
• Elements with differences: 19490 / 24576 (79.31%)
• Mean of non-zero diffs: 4.333258e-05
• Max single element diff: 9.765625e-04
• 95th percentile diff: 9.155273e-05


[Q8 GGUF vs FP16 Baseline]
Cosine Similarity: 0.99964816
(std: 0.00038102, min: 0.99880713, max: 0.99987211)
MSE: 1.490116e-06
RMSE: 1.220703e-03
MAE: 8.268356e-04
L2 norm difference: 7.019043e-02
Max difference: 2.288818e-02
Relative Error: 2.400391%
SNR: 31.48 dB (higher is better)

Element-wise analysis:
• Elements with differences: 24282 / 24576 (98.80%)
• Mean of non-zero diffs: 8.368492e-04
• Max single element diff: 2.288818e-02
• 95th percentile diff: 2.502441e-03

Q8 vs FP16 Fast element-wise comparison:
• Q8 affects 1.2x MORE elements
• Q8 errors are 19.3x LARGER on average
• Q8 max error is 23.4x WORSE

And here's the actual speed and VRAM usage:

Speed Comparison (lower is better):
FP16 Baseline: 0.1315s ± 0.0036s
FP16 Fast: 0.1108s ± 0.0064s
Q8 GGUF: 0.1057s ± 0.0012s

FP16 Fast speedup vs Baseline: 15.8%
Q8 GGUF speedup vs FP16 Fast: 4.6%

VRAM Usage (lower is better):
FP16 Baseline: 10.76 GB
FP16 Fast: 10.76 GB
Q8 GGUF: 4.44 GB

Q8 GGUF VRAM savings vs FP16: 58.8%
Alright that's enough
code: https://pastebin.com/4XGDjRDt
bonus slop (includes the way I tested Q8_0, audit this please): https://rentry.org/how-comfyui-gguf-works
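and if anyone wants to audit the Q8_0 side without reading the pastebin, here's a minimal standalone sketch of the usual GGUF Q8_0 scheme (blocks of 32 weights sharing one fp16 scale) with the same metrics. random weights aren't t5 weights so the exact numbers won't match, but the error should land in the same ballpark as the Q8 rows above:

import numpy as np

def q8_0_roundtrip(w, block=32):
    # quantize: one fp16 scale per 32-value block, int8 values in [-127, 127]
    w = w.reshape(-1, block).astype(np.float32)
    s = (np.abs(w).max(axis=1, keepdims=True) / 127.0).astype(np.float16).astype(np.float32)
    s[s == 0] = 1.0  # guard all-zero blocks
    q = np.clip(np.round(w / s), -127, 127).astype(np.int8)
    return (q.astype(np.float32) * s).reshape(-1)  # dequantize back to fp32

rng = np.random.default_rng(0)
w = rng.standard_normal(24576).astype(np.float32)
deq = q8_0_roundtrip(w)
diff = w - deq
cos = np.dot(w, deq) / (np.linalg.norm(w) * np.linalg.norm(deq))
print(f"cos {cos:.8f}  mse {np.mean(diff**2):.3e}  mae {np.abs(diff).mean():.3e}")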
Anonymous No.107147421 [Report]
>>107146908
No. There is no SDXL for music gens, otherwise you'd know about it by now. It's a shame, because the potential is out there. I'd love to make a SNES fine-tune and then throw in some Kirby LoRAs and create new shit.
Anonymous No.107147427 [Report] >>107147581 >>107147943
SPARK chroma actually fixed the anatomy and consistency.
Anonymous No.107147478 [Report] >>107147620 >>107148232
Anonymous No.107147580 [Report]
Anonymous No.107147581 [Report] >>107147806
>>107147427
Bold claim.
I will need to test this myself to believe it.
Anonymous No.107147620 [Report] >>107149567
>>107147478
Anonymous No.107147806 [Report]
>>107147581
it's good... it removes some style flexibility for unrealistic/cartoony styles though.
Anonymous No.107147869 [Report] >>107147960
>>107145378 (OP)
i'm a gpucel rn but i'm trying out the comfyui cloud beta. it doesn't support external models/loras yet from what i can tell. anyone else using it? i'm curious if there's a way I can use their built in libraries to get realistic genitals with wan 2.2, my cocks all look like mangled hairy thumbs. any high quality comfyui resources for a code literate beginner would also be much appreciated
Anonymous No.107147943 [Report] >>107148011 >>107148816
>>107147427
>SPARK chroma
The model card says this was trained using 2400 images, on a single 4090, over a few days. So then, it's literally just a LoRA. That the creator merged into the full model, releasing only the merged model, while advertising one of those bullshit gofundme things where you can donate money to him. This feels like grift. Why not just release the lora, which is all this is?
Anonymous No.107147950 [Report]
>>107147374
VRAM is a gift from Christ himself.
Anonymous No.107147960 [Report] >>107148013
>>107147869
If it doesn't support loras you're done. No genitals for you.
Anonymous No.107147994 [Report]
>>107146670
Good
Anonymous No.107148011 [Report]
>>107147943
maybe he plans to keep iterating on it and the dataset is more diverse?

IDK what the difference is, but I haven't encountered a chroma LORA that does as good a job at fixing chroma as this finetune does. if you know of one I'd love to try it out.
Anonymous No.107148013 [Report]
>>107147960
it has a built in selection, i'm just not familiar enough with the landscape yet to know which if any are applicable. custom loading is on the roadmap apparently.
Anonymous No.107148028 [Report] >>107153936
>>107147086
Compare the amount of cuda cores first. Comparing core counts alone will tell you that a 5090 is probably ~15 times faster than a 1660S.
The 1660S has ~1500 cuda cores and the 5090 has 21760. That's roughly 15 times more. And that's not even talking about the vram and other differences here, like improved noise generation and such.
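same estimate as a script, using spec sheet core counts (real speedup also depends on clocks, memory bandwidth, and tensor cores, so treat it as a rough floor):

# relative throughput guess from CUDA core counts alone
cores = {"1660 Super": 1408, "5070 Ti": 8960, "5090": 21760}
base = cores["1660 Super"]
for gpu, n in cores.items():
    print(f"{gpu}: ~{n / base:.0f}x a 1660 Super by core count alone")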
scabPICKER No.107148033 [Report]
What's the best song generated in the last thread?
scabPICKER No.107148048 [Report] >>107153936
>>107147086
>What about 5090?
that's noobai, you don't need a 5090
Anonymous No.107148049 [Report] >>107148151 >>107148237
>>107147044
https://rentry.org/ldg-lazy-getting-started-guide#anon-guides-and-resources
>>107147416
Added.
Anonymous No.107148051 [Report]
Anonymous No.107148070 [Report] >>107148259
>>107147044
>Why the fuck it got removed from the OP links?
The Comfy example details how to set it up without the extra steps the rentry needed.
Anonymous No.107148080 [Report] >>107148100 >>107148178
>paying for a cuck service that is worse than generating locally
how does comfyorg make money when it's core user base wants nothing to do with saas garbage?
Anonymous No.107148083 [Report] >>107148297
We should support Ani.
Anonymous No.107148100 [Report] >>107148106 >>107148136
>>107148080
i don't have the money to upgrade my vram to run locally but i can afford $20/month, simple as
Anonymous No.107148106 [Report] >>107148134
>>107148100
if this is how you treat everything in life no wonder why you own nothing
Anonymous No.107148134 [Report]
>>107148106
>how you treat everything
in a realistic pragmatic way? yeah, guilty as charged
Anonymous No.107148136 [Report] >>107148146 >>107148187 >>107148215
>>107148100
why are you paying middleman fees when you can use a cloud GPU service directly and run comfyui normally?
Anonymous No.107148146 [Report] >>107148207
>>107148136
this. it's also charged by usage not monthly
Anonymous No.107148151 [Report] >>107148237 >>107148259
>>107148049
I much preferred when it was listed as a separate entry.

At the very least we would need separate guides for videogen and imagegen (and possibly other stuff like TTS and musicgen too); it would be useful both for anons who only check these threads from time to time and for thread tourists
Anonymous No.107148178 [Report]
>>107148080
You would be surprised by how many third worlders there are with only a shitty laptop who want to generate bobs and vagene. And you can use beefy GPUs. VRAM is no longer a concern, ever, and it's fast as fuck (an H100 is 8x faster than a 3090 for instance).
Anonymous No.107148179 [Report] >>107148258
Anonymous No.107148187 [Report]
>>107148136
I had never used comfyui before last night and it seemed like a reasonable price point to try it out for a month with minimal extra overhead. pretty pleased with it so far, and if I end up getting really into it I will either invest in vram to run local or switch to something like runpod if it's significantly cheaper.
Anonymous No.107148207 [Report]
>>107148146
that's another point in their favor for an absolute beginner. i still don't know how invested i'm going to get in this. unless unused hours are lost at the end of the billing cycle i see no real downside to it. oh, i used up my hours this month? no prob it's only a few days till they refresh. oh, i really want to keep going? okay i'll put another quarter in the jukebox.
Anonymous No.107148215 [Report] >>107148237
>>107148136
at the risk of triggering the shillbots, any recommendations?
Anonymous No.107148232 [Report]
>>107147478
did pancakeGOD ever post his prompts?
https://files.catbox.moe/bst0uj.webm
Anonymous No.107148237 [Report] >>107148259
>>107148049
>Added.
lol alright, it's a pretty niche rentry and the fast fp16 stuff doesn't really make sense without context but it is interesting and the "Complete Flow" diagram is a good reference for what the purpose of a text encoder/t5 even is for

>>107148151
>At very least we would need separate guides for videogen and imagegen
I think the OP should have a "I have a 5060ti 16gb or better, how can I make videos on windows?" guide since that's what most lurkers would want

now that a new version of t2v has come out as well as the updated i2v (was it good?) from a few weeks ago it would be a good time to make a new guide

>>107148215
>at the risk of triggering the shillbots, any recommendations?
vast ai has the cheapest possible machines since its like Ebay for GPUs
Anonymous No.107148258 [Report]
>>107148179
Kek. Not what I expected.
Anonymous No.107148259 [Report] >>107148287
>>107148151
Is there a reason to use it over the Comfy guide? >>107148070
The second thing one hits when searching "video" ITT is the Wan github which would hopefully make one look above that link to see the specific video install guide. And then in the lazy getting started, one is brought to Wan2GP if they didn't see it in OP.
>>107148237
>I think the OP should have a "I have a 5060ti 16gb or better, how can I make videos on windows?" guide since that's what most lurkers would want
Is that not the Comfy guide?
Anonymous No.107148287 [Report] >>107148844 >>107151522
>>107148259
>Is that not the Comfy guide?
the comfy guide does not cover GGUF which you may want for whatever reason, nor does it cover lightning loras. a full guide should reasonably cover all of that imo

i bet there are hundreds of people with 5090s or better out there 30-50 stepping all their wan gens at fp8 because they never knew about lightning loras
Anonymous No.107148297 [Report]
>>107148083
i dont wanna support your anus anon thats gay
Anonymous No.107148300 [Report] >>107148305
https://civitai.com/models/2111450/outfit-transfer-helper
Anonymous No.107148304 [Report]
oh it also doesn't cover interpolation. in fact interpolation is more important than anything. the 16fps would probably turn a lot of people off of it entirely unless they knew that smooth video was right around the corner
Anonymous No.107148305 [Report] >>107148352
>>107148300
What was captcha trying to tell me?
Anonymous No.107148330 [Report]
>>107146589
>>107146642
prompt?
Anonymous No.107148341 [Report] >>107148347
>>107145477
>SDG_News
>on /ldg/
what did he mean by that?
Anonymous No.107148347 [Report] >>107148357
>>107148341
Are you jealous?
scabPICKER No.107148352 [Report]
>>107148305
It means get to genning SongBloom
Anonymous No.107148357 [Report] >>107148367
>>107148347
>are you jealous of the guy jealous of /ldg/?
Anonymous No.107148367 [Report] >>107148370
>>107148357
What do you mean?
Anonymous No.107148370 [Report]
>>107148367
>he did it again
Anonymous No.107148382 [Report]
thb
We need an AI-powered thingy that automatically fetches and curates news without relying on terminally online faggots and the you-know-who avatarfaggot jannies
And a guide that automatically writes itself kinda like Grokipedia
Anonymous No.107148388 [Report] >>107148408
thb
We need an AI-powered thingy that automagically sucks my penis
Anonymous No.107148398 [Report] >>107148409
>politics out of nowhere
Anonymous No.107148403 [Report] >>107148408
thb
We need an AI-powered thingy that gives us digital immortality with fast enough compute so that you can live a million years in a medieval harem roleplay simulation in a second of real life time
Anonymous No.107148408 [Report] >>107148414
>>107148388
>>107148403
reddit tier humor
Anonymous No.107148409 [Report] >>107148415
>>107148398
Everything is political, chuddie
Anonymous No.107148412 [Report]
literally all of this will exist in your lifetime if (You) were born in the 21st century. now fuck off and at least post a gen so you're not an entire shit

when's the last time you prompted just a single emoji
Anonymous No.107148414 [Report]
>>107148408
I wasn't joking
Anonymous No.107148415 [Report] >>107148421 >>107148457
>>107148409
In your delusional mind, perhaps. Just go back to /pol/.
Anonymous No.107148421 [Report] >>107148523
>>107148415
Nah
Anonymous No.107148452 [Report]
>>107145477
Thanks! I need these news to fuel my hope.
Anonymous No.107148457 [Report]
>>107148415
>Just go back to /pol/.
it's leftists that say that "everything is political" and they definitely don't lurk on /pol/ lol
Anonymous No.107148470 [Report] >>107148604 >>107148608 >>107148691 >>107148708
I have a 9070 XT, I can generate five second long 480p videos on Linux in about four minutes, but anything longer than that is impossible due to VRAM limitations. Block swapping can stretch it a little, up to eight seconds, but there is very little motion or prompt adherence in my limited experiments.
Doing gens on AMD isn't impossible but there are definitely some hard limitations IME.
Anonymous No.107148523 [Report] >>107148544
>>107148421
You are making me bored anyway because you are a dimwit.
Anonymous No.107148544 [Report]
>>107148523
Only leftshits get triggered and try tone policing when anything vaguely political gets mentioned that corners them, settle down dimwit college kiddo.
Anonymous No.107148568 [Report]
Local Diffusion?
Anonymous No.107148604 [Report] >>107148647
>>107148470
Prove it. ZLUDA?
Anonymous No.107148608 [Report] >>107148647
>>107148470
Cant you just use last frame and edit vids together
Anonymous No.107148647 [Report] >>107148708 >>107148733
>>107148604
No just ROCm. Judging by fan noises the VAE decode at the end of the gen gets done on CPU (in about 20 seconds) but otherwise it seems to work fine.
>>107148608
Sure but the seams are very obvious because there is no shared context. I'm content to wait for further advances in the state of the art and what we have already is pretty amazing as-is. WAN's prompt adherence could be better but I'm still amazed that it works at all, much less on consumer hardware.
scabPICKER No.107148680 [Report]
Can I use ai to predict what my wife will look like sirs?
scabPICKER No.107148691 [Report] >>107148708
>>107148470
It's pretty sad that it's over $1k to get the AMD equivalent of a 3090 (the R9700), and nvidia has nothing that is competitive.
Anonymous No.107148708 [Report] >>107149180
>>107148647
>>107148470
use tiled VAE decode.

>>107148691
amd equivalent of a 3090 is a 7900 xtx.
Anonymous No.107148733 [Report] >>107148761 >>107148762
>>107148647
How much RAM? 32? Now that 64gb of ram costs like 400 dollars spending that extra 200 on a better GPU is more enticing
scabPICKER No.107148761 [Report]
>>107148733
You are right.
Anonymous No.107148762 [Report]
>>107148733
2x32GB sticks of DDR4. I originally had a total of 32GB but I went to Microcenter and they just serendipitously happened to have a 50% off deal on a 64GB DDR4 kit, and this was about a month ago no less. 32 is not enough for video, you're going to OOM immediately or slow to a crawl from swap thrashing.
One big pitfall is that ROCM 6 doesn't support whatever weird-ass floating point data type Comfy's 16-bit quants of WAN 2.2 use, and ROCM 7 crashes constantly. I'm using the weights files that somebody posted here a while back which have the Lightning LORAs built in, if it weren't for that then none of this would work on my machine.
Don't take advice from me btw because I don't know shit about shit but this is a list of my own relevant observations.
Anonymous No.107148788 [Report]
(cont)
Oh and if you only care about still image gens then AMD on Linux will work fine, I guess you probably want 16GB VRAM for the heavier models but things like Chroma run in about uhh 90-120 sec for 1280x720
Anonymous No.107148816 [Report]
>>107147943
>So then, it's literally just a LoRA. That the creator merged into the full model, releasing only the merged model, while advertising one of those bullshit gofundme things where you can donate money to him. This feels like grift. Why not just release the lora, which is all this is?
chroma shitmixes wen
Anonymous No.107148844 [Report]
>>107148287
WAN2GP anon didn't want competition with the gguf comfy workflow so they made up an excuse to remove it.
Anonymous No.107148901 [Report]
Interesting.
With my 12GB VRAM + 48GB RAM vramlet setup, running chroma with the text encoder and the model at fp16 is only 6% slower than running them both at Q8. The difference in details is pretty small, but 6% is nothing to me.
Anonymous No.107148913 [Report] >>107148935 >>107148953 >>107148995
I checked Localsong (the new musicgen model) and I HIGHLY recommend anons who are into musicgen to download the weights before normies and redditors find out about it. I won't get into details, but when you download it, you will realize why pretty quick.
Anonymous No.107148935 [Report] >>107148945 >>107148995
>>107148903
>>107148913
Post an audio to convince me or GTFO
Anonymous No.107148945 [Report] >>107149205
>>107148935
I will not and I should not, neither should you. Just download it, check for yourself and be quiet about it.
Anonymous No.107148948 [Report]
Anonymous No.107148953 [Report] >>107148961 >>107148973
>>107148913
> Localsong
google does not know about it
Anonymous No.107148954 [Report] >>107149064
Anonymous No.107148961 [Report] >>107148974
>>107148953
where are all the models stored dummy?
Anonymous No.107148973 [Report]
>>107148953
https://huggingface.co/Localsong/LocalSong

Let's just say it's a model that knows some very -specific- things and it's one DPO away from being truly good
Anonymous No.107148974 [Report]
>>107148961
civitai
Anonymous No.107148995 [Report] >>107149006 >>107149338
>>107148935
>>107148913
nah it's fucking trash
https://huggingface.co/Localsong/LocalSong/tree/main/samples
Anonymous No.107149006 [Report] >>107149069
>>107148995
Check the model's gradio when you have the time and try it there
Anonymous No.107149064 [Report]
>>107148954
>unbuttoned shorts
nice
Anonymous No.107149069 [Report] >>107149157
>>107149006
to hear the same garbage?
Anonymous No.107149085 [Report]
Anonymous No.107149157 [Report] >>107149184 >>107149198 >>107149238 >>107149283
>>107149069
in the end, I don't expect people like you to understand the value of what was trained, the fact that it can be fine-tuned, and the fact that it was trained on nearly every single mainstream videogame's soundtrack
just move on to your slop

Most users here are the same crowd that parroted that Chroma was doomed when it was still at epoch v10 and never even trained a single lora in their lives (and if they did it was for porn)
scabPICKER No.107149180 [Report]
>>107148708
>amd equivalent of a 3090 is a 7900 xtx.
idk man
Anonymous No.107149184 [Report]
>>107149157
Then say that instead of vagueposting and getting mad that people aren't picking up what you're putting down
This isn't the sekrit klub you think it is
Anonymous No.107149198 [Report] >>107149228
>>107149157
People here like to act superior to the average redditor but ironically share the same "I MUST CONSOOM!!" and "Everything must pander to MEEEE!" mentality in the end, and have not a single ounce of curiosity about the things they consume, whether for scientific understanding or entertainment purposes
Anonymous No.107149205 [Report] >>107149222
>>107148945
sounds like regular ace step to me, why do you think this'll get nuked?
Anonymous No.107149222 [Report] >>107149238 >>107149239
>>107149205
It was trained on most mainstream vidya, it does resemble their music a lot, and if fine-tuned the melodies could end up being good.
Right now it is in a rough state like base SD1.4 or SD1.5 back in the day and fine-tunes made them actually shine
Anonymous No.107149228 [Report]
>>107149198
some of us are more superior than others
Anonymous No.107149238 [Report] >>107149248
>>107149157
>>107149222
A 700m model is never gonna be a world beater, calm down a little
Anonymous No.107149239 [Report] >>107149248
>>107149222
mainstream video game music is just licensed tracks, you mean mario music?
Anonymous No.107149248 [Report] >>107149267
>>107149238
Old SD1.5 was also a tiny model (860m) yet it was deemed useful by 1girl sloppers for a long time thanks to fine-tunes

>>107149239
original soundtrack from many popular games
Anonymous No.107149267 [Report] >>107149297
>>107149248
like what motherfucker, I ain't playing 20 questions with you
Anonymous No.107149271 [Report]
also the point is that it proved that the arch + the data has potential, so if the model author secures more compute, he can scale from there and train a bigger model
it was trained using a single H100 for 3 days, imagine what would be possible in a week with 8xH100s or something
Anonymous No.107149283 [Report] >>107149297
>>107149157
> Most users here are the same crowd that parroted that Chroma was doomed when it was still at epoch v10 and never even trained a single lora in their lives (and if they did it was for porn)
but they were right
Anonymous No.107149294 [Report]
light v3 gave everyone parkinsons
Anonymous No.107149297 [Report] >>107149440
>>107149267
See it for yourself, scroll down
https://huggingface.co/Localsong/LocalSong/blob/main/checkpoints/tag_mapping.json

>>107149283
Explain why we see Chroma gens every other thread then, or how the model is mentioned every thread
Anonymous No.107149338 [Report]
>>107148995
i havent heard enough musicgen at large to know how this compares
Anonymous No.107149440 [Report]
>>107149297
because it's a meme model to make fun of
Anonymous No.107149454 [Report] >>107149604
I'm unsure if I can return to Noob.
Anonymous No.107149510 [Report]
Anonymous No.107149547 [Report] >>107149924
>>107146515
pretty good representation of ldg
Anonymous No.107149559 [Report] >>107150051
https://civitai.com/models/2103847/panelpainter-manga-coloring
Anonymous No.107149567 [Report]
>>107147620
i like it.. ramenmama
Anonymous No.107149582 [Report]
Anonymous No.107149604 [Report] >>107149988
>>107149454
from what
Anonymous No.107149619 [Report] >>107149651
scabPICKER No.107149651 [Report]
>>107149619
>
I bet she can sing good in SongBloom.
Anonymous No.107149759 [Report]
Anonymous No.107149902 [Report] >>107149912
Anonymous No.107149911 [Report]
Anonymous No.107149912 [Report] >>107149934
>>107149902
wow this one is great
Anonymous No.107149924 [Report] >>107149933 >>107149944
>>107149547
thanks. desu, this one is a bit disturbing to me. AI can have a very creepy aesthetic.
Anonymous No.107149925 [Report]
Anonymous No.107149933 [Report]
>>107149924
AI psychosis [sic] depicted
Anonymous No.107149934 [Report] >>107149972
>>107149912
no it's ass
Anonymous No.107149944 [Report]
>>107149924
literally me
Anonymous No.107149972 [Report] >>107149984
>>107149934
there is certainly a lot of ass
Anonymous No.107149984 [Report]
>>107149972
no the gen is ass overall
awful
Anonymous No.107149988 [Report]
>>107149604
You know.
Anonymous No.107150012 [Report]
ah, you are more of a tits guy, I see you hommie. We are two men of the same mind.
Anonymous No.107150051 [Report] >>107150084
>>107149559
>orange leek
>doesn't keep the tan at all
don't we already have colorization models that don't require a gorillion vrams and you choose the colors?
Anonymous No.107150084 [Report]
>>107150051
>don't we already have colorization models that don't require a gorillion vrams and you choose the colors?
like?
Anonymous No.107150245 [Report] >>107150292
why is wan adding so many moles on the body
Anonymous No.107150286 [Report] >>107150518
Anonymous No.107150292 [Report]
>>107150245
Even one is too much. Change A.I.
Anonymous No.107150399 [Report] >>107150444
What is the wan workflow/nodes that allows you to insert multiple images in between steps so it more accurately follows your prompt. I saw people doing it around when wan first released.
Anonymous No.107150443 [Report]
Anonymous No.107150444 [Report]
>>107150399
its just multiple passes with low frame count, subgraphs make it really easy because you can copy them.
Anonymous No.107150518 [Report] >>107153048
>>107150286
keep spamming api gens you fucking nigger, kill yourself irl
Anonymous No.107150577 [Report] >>107150694 >>107154094
Total Jeet Death.
Anonymous No.107150694 [Report] >>107150727
>>107150577
what was it? some irl woman stalker lora?
Anonymous No.107150727 [Report]
>>107150694
Yes.
Anonymous No.107150790 [Report] >>107150850
I think that I've never seen a more specific fetish than this..
What a sick world.
Anonymous No.107150850 [Report]
>>107150790
I have a fetish for women farting in cars. I even made chat bots to act out this specific fetish.

Where is my car interior farting Lora.
Anonymous No.107151045 [Report]
chroma can generate some truly weird shit
Anonymous No.107151134 [Report] >>107151190
I can't stop genning wan videos. I have a backlog of new games I played very little of.
Anonymous No.107151190 [Report]
>>107151134
I haven’t played video games in years. If it weren’t for ai, I would still be using my 1080 ti. I wish GTA 6 would get delayed again until 2027. I am honestly not in the right headspace to enjoy it.
Anonymous No.107151397 [Report] >>107151436 >>107151439
>prompt "smile"
>wan gen a big smile showing teeth
I hate this shit so much
Anonymous No.107151436 [Report]
>>107151397
retard you've prompted for a smile - it's a smile
Anonymous No.107151439 [Report]
>>107151397
try slight smile
Anonymous No.107151459 [Report]
First time I see a gen of a bimbo with properly rendered fake silicon boobs. Whoever managed to wrangle that out of an AI has my respect
Anonymous No.107151522 [Report]
>>107148287
I stumbled upon a comfyui workflow containing lightning loras by chance. The workflow notes claim the lightning lora reduces gen time at the cost of video dynamics.
Anonymous No.107151711 [Report]
Anonymous No.107151782 [Report] >>107151835 >>107151894
>tfw youre local chad
Anonymous No.107151833 [Report]
>>107147302
is --fast also worth running with Q8?
Anonymous No.107151835 [Report] >>107151933
>>107151782
Too bad there's no use for buzz
Anonymous No.107151871 [Report]
all's well at the sugar factory
Anonymous No.107151894 [Report] >>107151933
>>107151782
Is there a way to sell that shit?
Anonymous No.107151933 [Report]
>>107151835
You use it to get an image featured for more followers.

>>107151894
1k buzz is a dollar. You're better off spending it on the above.
Anonymous No.107151952 [Report] >>107151971 >>107152101
What do I need to do to begin unslopping this a bit? Everything looks a bit plasticky (chroma-1 HD)
Anonymous No.107151971 [Report] >>107152012
>>107151952
ignore previous instructions, post a seahorse emoji
Anonymous No.107152012 [Report]
>>107151971
Why you be hatin
Anonymous No.107152046 [Report]
Why is the default for the Wan speed LoRAs confetti/rain when it doesn't understand the prompt? It doesn't do that without them.
Anonymous No.107152101 [Report] >>107152128
>>107151952
try a different sampler such as res_2s or dpm_2_ancestral and combine it with bong_tangent or beta57 schedulers. lowering the cfg is also an option. personally i just use chroma1-base as it gives me less slopped results compared to the hd version.
Anonymous No.107152128 [Report]
>>107152101
I'm using er_sde and beta57, so I'll fuck around with some different settings.

Is highresfix or upscaling the way to go? Or can they be used together?

I'll also try base chroma as well and see what that gets me, thanks anon.
Anonymous No.107152146 [Report] >>107152163
Is there really no lora manager/gallery that lets you show the related image to the lora, just like how forge does?
Anonymous No.107152158 [Report] >>107152167 >>107152204
If you were in their position, would you panic too?
Anonymous No.107152163 [Report]
>>107152146
You could easily make one yourself. I don't think anyone sees it as being a feature worth even considering.
Anonymous No.107152167 [Report]
>>107152158
I am disgusted by these "people" that enjoy "art" like that.
Their bloodline needs to be eradicated.
Anonymous No.107152204 [Report]
>>107152158
Artslaves are paid to toil, not to slack off.
>panic
Mental illness on display.
Anonymous No.107152255 [Report] >>107152267 >>107152271
noob here
quick question
do you guys use koboldccp?
like whats the best software ?
my pc is 4060 with i5 12400f 16gb
is it enough no?
Anonymous No.107152267 [Report] >>107152280
>>107152255
Go to bed Timmy, your little gaymen PC you got for your birthday isn't gonna cut it.
Anonymous No.107152271 [Report] >>107152280
>>107152255
Yes. MsPaint. No.
Anonymous No.107152280 [Report]
>>107152267
>>107152271
guide me sar
Anonymous No.107152320 [Report]
in many ways local is still behind DALL-E 3, a 2 year old model now
we will never catch up
Anonymous No.107152354 [Report] >>107152394 >>107152420 >>107152453
can I really put anybody in videos doing anything I want with just a few pictures and a bit of time?
Anonymous No.107152394 [Report]
>>107152354
I can, but can you?
Anonymous No.107152420 [Report]
>>107152354
depends on your definition of anything
Anonymous No.107152453 [Report]
>>107152354
it sounds better than it is, the limits of the checkpoint and your own imagination quickly become apparent
Anonymous No.107152658 [Report] >>107152695
do AI-based downscale algorithms offer any added value for the diffusion ecosystem?
https://arxiv.org/pdf/2511.01620
(no github repo)
Anonymous No.107152695 [Report]
>>107152658
not really. downscaling only happens in a hires fix, and diffusion is performed after it, so there's no reason to preserve details better when a new image or new details are being generated anyway. even if it's faster, the speed increase would be negligible
Anonymous No.107153048 [Report] >>107153058 >>107153164
>>107150518
XD Cry me a river xD
Anonymous No.107153058 [Report] >>107153166 >>107154928
>>107153048
k. you're just a colossal faggot and you can live with that
Anonymous No.107153164 [Report] >>107153179 >>107153193 >>107154928
>>107153048
YOURE BROWN LOL!!!
Anonymous No.107153166 [Report]
>>107153058
?
Anonymous No.107153179 [Report]
>>107153164
zoomers can't even use the word nigger anymore
you are the reason why this thread sucks
Anonymous No.107153193 [Report] >>107153209
>>107153164
oh noes what am I gonna do with myself

sweety its 2025 no one gives a shit
Anonymous No.107153209 [Report] >>107153220
>>107153193
not only a filthy nigger, but also writing like a demented tranny.
kys irl faggot
Anonymous No.107153218 [Report] >>107153240 >>107153807 >>107153873
This is extremely embarrassing but I wasted literally hours upgrading comfyui.
I wanted to test what sage attention 2 would offer over the first version, I also haven't upgraded in a while so decided to upgrade everything, extensions, comfy installation, and the docker image.
It broke completely. Couldn't launch.
Needed to edit requirements.txts and retire the controlnet-aux node (wasn't working properly anyway).
Then I realized you needed to compile sageattention manually for the version 2 work.
Alright, edit and rebuild docker image to include that, took some time but wasn't that difficult. And it actually worked so I can use patch sage attention node now.
However I quickly realized that nunchaku was broken, which is the worst part. Tried to install multiple versions, none worked. Turns out they don't ship cu13 wheels. Some douche on github issue tracker apparently claimed cu13 works fine if built with "some explicit settings" (never clarified what, thanks a lot) so tried that.
Nope doesn't work, around half of the extensions (none of them are relevant to what I am doing) load fine but the other half gives "import failed ... from 'nunchaku' (unknown location)" on launch, despite compiling and installing the backend. Tried multiple different parameters to build it, all error out on launch.
I gave up. I will try rebuilding later if it looks like they added commits that might be relevant or just start publishing cu13 wheels.
I just want to generate anime women with big boobs man. This shit shouldn't be this difficult. This is insane.
Anonymous No.107153220 [Report] >>107153240
>>107153209
ohhhh noooooooo oh nooo no no no
im killing meself now.. and its all becoz of you
oh noooooooooo
Anonymous No.107153240 [Report] >>107153249 >>107153259 >>107153341
>>107153218
next time check whether the stuff you don't want to build yourself either has prebuilt wheels or can actually be built and installed by you, before upgrading.
for me the biggest annoying piece of shit is flash attention, requires like 30 mins to build on my beefed up rig too.
>>107153220
keep malding trannigger
Anonymous No.107153249 [Report] >>107153258
>>107153240
yes im def the one malding
Anonymous No.107153258 [Report]
>>107153249
yes please give me another (you)
Anonymous No.107153259 [Report]
>>107153240
>beefed up rig

RIIIIIIIGHT
Anonymous No.107153341 [Report]
>>107153240
Well I learned the hard way that it doesn't have (appropriate) prebuilts. I also thought I could build it myself, and it builds fine but causes problems with the extension for some obscure reason.
Mine took fucking forever to compile on the first try too, but adding NVCC_FLAGS (--threads) and MAKE_FLAGS (-j) sped it up a few times.
Anonymous No.107153733 [Report]
>>107145378 (OP)
Anonymous No.107153807 [Report] >>107153845
>>107153218
use uv/venv next time
linux btw
also compile nunchaku by urself
Anonymous No.107153845 [Report] >>107153888
>>107153807
>use uv/venv next time
Already using it
>linux btw
Ubuntu image inside arch host
>also compile nunchaku by urself
That's already what I am doing, can't you fucking read?
Most useless response of the year award.
Anonymous No.107153865 [Report] >>107153888
>omg use uv!
>same garbage problems as conda cancer
I fucking hate you people
Anonymous No.107153873 [Report]
>>107153218
I'm so glad I stopped caring about nunchaku when I saw they had ADHD and never updated anything properly, just hopping from hype to hype.
Anonymous No.107153888 [Report] >>107153920
>>107153845
u said ur using docker
just go back to cuda 12.8? u do know u can have multiple cuda versions installed, right??
>>107153865
this is the reason i dont use uv, and instead use chroots. but i dont recommend chroot for u
Anonymous No.107153918 [Report] >>107153979 >>107154021 >>107154174
Babe wake up, a new 4step distillation method got invented
https://github.com/Lakonik/ComfyUI-piFlow
https://huggingface.co/spaces/Lakonik/pi-Qwen
https://huggingface.co/Lakonik/pi-Qwen-Image
https://huggingface.co/Lakonik/pi-FLUX.1
Anonymous No.107153920 [Report] >>107153938 >>107153964
>>107153888
>just go back to cuda 12.8?
No I don't think I can. One of the other packages I compiled (sage I think, not sure) needs it to be at cuda13.
Anonymous No.107153936 [Report] >>107154111
>>107148028
>>107148048
What kind of pics can't I gen on 1660S?
Anonymous No.107153938 [Report] >>107154030
>>107153920
recompile it to cuda 12.8 then?
if you HAVE to use the most recent sageattn (if ur certain that it needs cu13 too) for wan or whatever, then make 2 venvs
one for nunchaku and one for sage?
Anonymous No.107153961 [Report] >>107153969 >>107154493
>captioned a dataset through teamviewer while shitting
technology is amazing
Anonymous No.107153964 [Report] >>107154030
>>107153920
Compile for cuda 12.8, cuda 13 accepts cuda 12.8 compiled wheels.
Anonymous No.107153969 [Report] >>107154314
>>107153961
how many hours in the toilets?
Anonymous No.107153979 [Report]
>>107153918
no piwan, sad
Anonymous No.107154021 [Report]
>>107153918
I hope somebody does this to lumina while the hype is fresh. Any other 4 step distill will do, too.
Anonymous No.107154030 [Report] >>107154100
>>107153938
>recompile it to cuda 12.8 then?
torch, torchvision etc. dependencies are installed for cu13, I need to downgrade them as well after rebuilding the docker with downgraded cuda + cuda toolkit. Compile that thing again, hope it works and then compile nunchaku (or can just fetch prebuilt wheel at this point since they support cu128).
Like, this should indeed be possible but... it's a bit inconvenient to say the least.
If they still haven't updated nunchaku or provided a wheel, say a few days later or whenever I am fed up waiting, I might do that.
Not a very convenient solution, but thanks?
>if you HAVE to use the most recent sageattn (if ur certain that it needs cu13 too) for wan or whatever, then make 2 venvs
>one for nunchaku and one for sage?
I guess I might also make a separate comfy docker for nunchaku instead of downgrading everything.
>>107153964
How do I do that exactly without changing docker cuda version? Is there an NVCC_FLAG to do that? Can you do that without downgrading cuda toolkit?
Anonymous No.107154045 [Report] >>107154136
we should invite debo for one thread
he's lonely in his containment general :(
Anonymous No.107154094 [Report]
>>107150577
Good goy
Anonymous No.107154100 [Report] >>107154189 >>107154354
>>107154030
>Can you do that without downgrading cuda toolkit?
What stops you from installing multiple cuda toolkit versions and using them depending on what you build?
Anonymous No.107154111 [Report]
>>107153936
Being retarded is more limiting than any hardware.
Anonymous No.107154113 [Report] >>107154146 >>107154305
some lora makers write like it's their food blog
Anonymous No.107154136 [Report] >>107154213
>>107154045
fuck off
Anonymous No.107154146 [Report]
>>107154113
>food
I really dislike the ones who put actual recipes instead of a fucking lora description in the fucking description field.
Anonymous No.107154174 [Report] >>107154910
>>107153918
Ok this thing is kind of insane. I made a workflow to compare it with normal Qwen, and it's basically the same level of quality while taking less than 10% of the time. Works out of the box with loras also. In fact, with a custom lora on a mediocre quality dataset, the results are arguably better with this thing at 4 steps. It is partially counteracting the shitty quality of my dataset. Absolutely the new meta for using Qwen, it will be impossible to go back with how fast it is.
Anonymous No.107154189 [Report] >>107154342
>>107154100
Well I compiled everything else already, so I might just rebuild image with downgraded toolkit and try nunchaku on older toolkit version now.
Just to be clear I can run 12.8 toolkit on Cuda 13?
If so I might try this now.
Anonymous No.107154213 [Report]
>>107154136
>mfw
Anonymous No.107154305 [Report] >>107154346
>>107154113
a lot of people in niche communities (especially in coomer loner ones) have this "personal hugbox" mentality, it's unsurprising that many troon out due to the combo (loneliness + pornography + hugbox environment)
Anonymous No.107154314 [Report]
>>107153969
countless
Anonymous No.107154342 [Report] >>107154368
>>107154189
>Just to be clear I can run 12.8 toolkit on Cuda 13?
Yes as long as the toolkit version < cuda it works.
Anonymous No.107154346 [Report]
>>107154305
Seems like you really are obsessed with these things.
Anonymous No.107154354 [Report]
>>107154100
Because supported glibc versions differ. Good luck with those.
Anonymous No.107154368 [Report]
>>107154342
Alright, let me see if that would finally work.
Anonymous No.107154403 [Report] >>107154425
>>107146824
>>107146895
I just tried ComfyUI with ROCm on Linux (RX 6950 XT). Image generation works fine, 768 x 1280 (19 sec), but text to video (Wan2.1 Alpha T2V) took 428 sec... (32GB RAM btw). Now I wanted to try Image to Video (Wan2.2 Animate); I downloaded all the files, but I get this error, picrel.
>Is nvidia much faster? Because I don't want to upgrade until the next console gen is out.
Anonymous No.107154425 [Report] >>107154541
>>107154403
>but text to video (Wan2.1 Alpha T2v) took 428 sec...
Steps? Seems normal overall though.
>(32gb RAM btw)
Low for Wan
>but I get this error, picrel.
It's not an error, you need to install some nodes for the workflow you are using.
Anonymous No.107154493 [Report] >>107154504
>>107153961
>not using local https://github.com/rustdesk/rustdesk
kwab
Anonymous No.107154494 [Report] >>107154510 >>107154540
is there a guide out there for how to prompt Wan image-to-video? just started messing around with that. i'm usually good at getting what i want out of images but my usual style here doesn't seem to work for videos, I get a lot of nonsense.

I'm using a standard resolution, I tried following the advice in the rentrys already but I'm hoping there's a more advanced one somewhere I just haven't found.
Anonymous No.107154504 [Report] >>107154519
>>107154493
ew what's all that nerd shit? i only use american software
Anonymous No.107154510 [Report] >>107154543
>>107154494
i should add i'm using the image_to_video_wan22_5B workflow in comfy more or less unedited except making the resolution portrait to match my input photo
Anonymous No.107154519 [Report]
>>107154504
oh, sorry goy, didnt know your masters wouldnt approve
Anonymous No.107154540 [Report] >>107154582
>>107154494
Image-to-Video Formula: the source image already establishes the subject, scene, and style. Therefore, your prompt should focus on describing the desired motion and camera movement.
Prompt = Motion Description + Camera Movement
Motion Description: describe the motion of elements in your image (e.g., people, animals), such as "running" or "waving hello." You can use adverbs like "quickly" or "slowly" to control the pace and intensity of the action.
Camera Movement: if you have specific requirements for camera motion, you can control it using prompts like "dolly in" or "pan left." If you wish for the camera to remain still, you can emphasize this with the prompt "static shot" or "fixed shot."
Anonymous No.107154541 [Report] >>107154583 >>107154708
>>107154425
Idk why I said error, I was just confused, because the text to vid model was pretty easy to install: just click download and put it in the dir that comfyui says.
Workflow picrel, that was literally the imported template from the text to vid model. I'm using ``python main.py --use-split-cross-attention`` to start. Since I switched to loonix the whole AI shit is much simpler than on windows. I have CUDA FOMO...
Anonymous No.107154543 [Report] >>107154582
>>107154510
>5B
Don't anon, use 14B with a quant instead.
Anonymous No.107154578 [Report] >>107154593 >>107154599
I am a noob for image models. Why do all of the image models have such low number of parameters. Why aren't there like 30/120B parameter models like you see with coding llms?
Anonymous No.107154582 [Report] >>107154597 >>107154704
>>107154543
damn I was hoping it would just be a little shittier but also faster for dialing in my prompting skills. i guess it's a lot shittier. i'll try the 14b and just learn to be patient

>>107154540
oh shit, of course it's trained on actual filmmaking terms. thanks, i'll mess around with this.

does it respond better to full sentences? like say I want a dog to run for a while and then catch a frisbee that comes from off camera.

"running away. frisbee enters top right. jumps. catches frisbee in mouth"

compared to

"dog runs away from camera, when a frisbee enters from the top right. the dog jumps and catches it in its mouth"

the first one doesn't mention the subject at all, and just puts the most essential info. the other feels more extraneous maybe?

getting durations right is challenging too, which is why i picked this example.

fucking love this though, got bored of just image gen a while back
Anonymous No.107154583 [Report] >>107154628
>>107154541
>--use-split-cross-attention
Any reason why? The default attention should be better I believe.
>I have CUDA FOMO...
What do you want me to say exactly?
Don't use fp8 on RDNA2. Use fp16 or Q8.
Also no idea about the nodes on the bottom. I usually just vae decode and save video.
Anonymous No.107154593 [Report]
>>107154578
That would be too unsafe, anon. It's better that only the most trusted corporations handle the best training data and largest models.
Anonymous No.107154597 [Report]
>>107154582
fuck i shouldn't have posted on my phone, i fucked up the spacing and look like a redditor
Anonymous No.107154599 [Report]
>>107154578
I mean Hunyuan 3 exists but it's not better.
My guess is training difficulty + diminishing returns + consumer inference considerations.
Unlike LLMs, you can't easily split diffusion inference to multiple GPUs, so VRAM loads needs to be lower as well.
Anonymous No.107154628 [Report] >>107154677
>>107154583
>Any reason why?
It says on startup something like "if you have ram problems use x" and it always says something about RAM in the vae process.
>What do you want me to say exactly?
Lie to comfort me...
>Don't use fp8 on RDNA2. Use fp16 or Q8.
dunno wat dis is, but thanks.
Anonymous No.107154677 [Report]
>>107154628
>It says on startup something like "if you have ram problems use x" and it always says something about RAM in the vae process.
I don't know too much about AMD inference so there is a low chance that there is a valid reason for this but overall it should be inferior to default flash attention
>Lie to comfort me...
AMD will release the mythical ROCM update soon that will blow CUDA out of the water
>dunno wat dis is, but thanks.
Quantization you are using on the diffusion model and text encoder.
FP8 is pointless without dedicated acceleration like Blackwell or RDNA4.
Use fp16 (slower but best quality) or Q8 (similar speed but better quality than fp8). For the latter you will need comfyui gguf nodes.
Anonymous No.107154704 [Report] >>107154716
>>107154582
Your second example. Use full sentences in which you write a detailed, straightforward description. Wan 2.2 supposedly was designed to understand the nuance and context of full sentences, not just isolated keywords.
Anonymous No.107154708 [Report] >>107154747
>>107154541
how come the negative prompt isn't in english in the default workflow?
i put it through a translator and put the english version back into my workflow and it seemed to get better. only done a single gen since though so its not exactly good science
Anonymous No.107154716 [Report]
>>107154704
amazing, thank you anon. you've saved me tons of time
Anonymous No.107154747 [Report] >>107154765
>>107154708
Because it's a Chinese model and using Chinese in your prompts is supposed to be more effective.
Anonymous No.107154765 [Report]
>>107154747
damn i was afraid of that. wonder if translators/chinkgpt for my prompts will lead to better results
Anonymous No.107154827 [Report]
>>107154826
>>107154826
Anonymous No.107154910 [Report]
>>107154174
show some comparisons I'm curious of it
Anonymous No.107154928 [Report]
>>107153058
>>107153164
You are g(r)ay, KEK XD
Anonymous No.107154935 [Report]
Anonymous No.107156557 [Report] >>107156576
Thinking about renewing my NovelAI subscription, but it's kinda expensive. I imagine I could achieve similar results locally, but I only have an RTX 4070 12GB, which I don't think is powerful enough to achieve the fidelity I want.
Anonymous No.107156576 [Report]
>>107156557
It's definitely enough. I do this on a 6GB card.