
Thread 106267568

327 posts 158 images /g/
Anonymous No.106267568 >>106267605 >>106271344
/ldg/ - Local Diffusion General
e5 tts Edition

Discussion of Free and Open Source Text-to-Image/Video Models

Prev: >>106264704

https://rentry.org/ldg-lazy-getting-started-guide

>UI
ComfyUI: https://github.com/comfyanonymous/ComfyUI
SwarmUI: https://github.com/mcmonkeyprojects/SwarmUI
re/Forge/Classic: https://rentry.org/ldg-lazy-getting-started-guide#reforgeclassic
SD.Next: https://github.com/vladmandic/sdnext
Wan2GP: https://github.com/deepbeepmeep/Wan2GP

>Checkpoints, LoRAs, Upscalers, & Workflows
https://civitai.com
https://civitaiarchive.com/
https://tensor.art
https://openmodeldb.info
https://openart.ai/workflows

>Tuning
https://github.com/spacepxl/demystifying-sd-finetuning
https://github.com/Nerogar/OneTrainer
https://github.com/kohya-ss/sd-scripts/tree/sd3
https://github.com/derrian-distro/LoRA_Easy_Training_Scripts
https://github.com/tdrussell/diffusion-pipe

>WanX
https://github.com/Wan-Video
2.1: https://rentry.org/wan21kjguide
2.2: https://rentry.org/wan22ldgguide
https://alidocs.dingtalk.com/i/nodes/EpGBa2Lm8aZxe5myC99MelA2WgN7R35y

>Chroma
https://huggingface.co/lodestones/Chroma1-HD/tree/main
Training: https://rentry.org/mvu52t46

>Illustrious
1girl and Beyond: https://rentry.org/comfyui_guide_1girl
Tag Explorer: https://tagexplorer.github.io/

>Misc
Local Model Meta: https://rentry.org/localmodelsmeta
Share Metadata: https://catbox.moe | https://litterbox.catbox.moe/
Img2Prompt: https://huggingface.co/spaces/fancyfeast/joy-caption-beta-one
Samplers: https://stable-diffusion-art.com/samplers/
Txt2Img Plugin: https://github.com/Acly/krita-ai-diffusion
Archive: https://rentry.org/sdg-link
Bakery: https://rentry.org/ldgcollage

>Neighbours
https://rentry.org/ldg-lazy-getting-started-guide#rentry-from-other-boards
>>>/aco/csdg
>>>/b/degen
>>>/b/realistic+parody
>>>/gif/vdg
>>>/d/ddg
>>>/e/edg
>>>/h/hdg
>>>/trash/slop
>>>/vt/vtai
>>>/u/udg

>Local Text
>>>/g/lmg

>Maintain Thread Quality
https://rentry.org/debo
Anonymous No.106267584 >>106267623
Tried doing the upscale in two steps.
Anonymous No.106267605 >>106267623
>>106267568 (OP)
>I'm tired, boss.
>goes to bed
I would pay money for this one, it is pretty good.
Anonymous No.106267606
Anonymous No.106267608
Anonymous No.106267623 >>106267645
>>106267584
why do you put cinematic, 4k, hd into your negs? did you do some a/b testing and find that those negs reduce sloppa?

>>106267605
knowing what i know now, i would pay 2000 dollars (ok but unironically 500 dollars) for the weights to wan 2.2 assuming loras are free or like a dollar each
i think i have hit 30 cooms with just 2.2, probably close to 100 between hyvid, wan 2.1 and 2.2
Anonymous No.106267645 >>106267656
>>106267623
It's in the positive. I just left a paragraph so I didn't accidentally delete it each time.
Anonymous No.106267650
Is "e5 tts" as easy to set up and use as chatterbox? I don't want more python env madness.
Anonymous No.106267656 >>106267667
>>106267645
thanks again
i swear, if the reason my pov prompts have been failing is because i wrote "pov" instead of "point of view" i'm going to be angery
Anonymous No.106267660
>2.5D anime style animated footage. Low angle shot. There are two women wearing bikini armor. The taller one is grabbing the shorter one by the ponytail and shoving her face into the mud, the shorter is on her knees and her face is closer to the camera and is pushed closer as her head is forced into the mud. The focal point changes between characters going from the taller girl to the shorter one.
>4K cinematic, HD
Anonymous No.106267667 >>106267937
>>106267656
I think handheld footage is pretty strong too.
Anonymous No.106267673 >>106267686 >>106267694 >>106267937 >>106269569
>upgraded to 64gb ram
>speeds now constantly 80-90 secs each sampler instead of 120-150 secs each
feels good
Anonymous No.106267686 >>106267733 >>106267815
>>106267673
this hobby truly is a race to an empty wallet
at least I can go to sleep knowing I am not an audiophile
Anonymous No.106267694 >>106267703 >>106267737
>>106267673
I wonder how good 128 feels.
Anonymous No.106267703 >>106267713
>>106267694
NTA but my second kit of 64gb is arriving today.
Hell yeah.
Anonymous No.106267713
>>106267703
Jelly. I had some large expenses recently so I can't splurge on hardware.
Anonymous No.106267732 >>106268058
Blessed thread of frenship
Anonymous No.106267733 >>106267742 >>106267745
>>106267686
>knowing I am not an audiophile
from my experience, cars are even worse
pc stuff is relatively cheap compared to a lot of hobbies
Anonymous No.106267737 >>106267769 >>106270645
>>106267694
I have 128, only point is if you have 2 gpus, otherwise it's a waste of ram.
Anonymous No.106267742 >>106267937
>>106267733
cars can help you make friends, which is a rarity as most hobbies are inherently antisocial
Anonymous No.106267745 >>106267752 >>106267937
>>106267733
>Car
>Drums
>ImageGen
>Music collection
Out of all of those, image gen costs the least (unsure about the power bill yet, however). Thank god I'm not into local LLMs.
Anonymous No.106267752
>>106267745
>local LLMs
Yeah these can cost an insane amount if you want to use good models.
Anonymous No.106267769 >>106267781
>>106267737
you don't understand, I have more than two tabs open in chrome
Anonymous No.106267781
>>106267769
lmao
Anonymous No.106267790
https://arxiv.org/html/2508.10711v1

Wow... who doesn't love themselves another clearly GPT-slopped model.
Anonymous No.106267813
>>106263686
Gonna release it? I wanna try

>>106264995
>>106265014
https://voca.ro/19MX8lt6IPIi
Anonymous No.106267815
>>106267686
true, thankfully got a good deal and managed to snag these for $40
Anonymous No.106267844
is there any way to make autocomplete show more than 5 tags on forge?
I altered some settings but it still only shows five.
Anonymous No.106267848 >>106267855 >>106267873 >>106267876
Another model just dropped

https://github.com/stepfun-ai/NextStep-1?tab=readme-ov-file
Anonymous No.106267855
>>106267848
w-what are you doing step(bro)fun-ai
Anonymous No.106267873
>>106267848
>Next step

Cool name.
Anonymous No.106267876 >>106267886
>>106267848
the images they showed in the paper looked really shitty desu
Anonymous No.106267886 >>106267904 >>106268101
>>106267876
researchers are known for their inability to generate kinosovl
Anonymous No.106267889
eta to a fennec pic and link to github update?
Anonymous No.106267894
Anonymous No.106267904 >>106268137
>>106267886
well the method is interesting and different at least.
If it's just generating the next couple of blocks at a time it might hypothetically scale to an arbitrarily high aspect ratio image. Its architecture might allow it to more trivially perform outpainting.
Other than that I don't see the merits of the approach

I'd like to see what the limitations are.
Anonymous No.106267937 >>106267975 >>106268003
>>106267667
>I think handheld footage is pretty strong too.
it wasnt too impressive in my testing but i am also a (v)ramlet promptlet creativitylet that uses the lightning slop. the prompt guide looks like it was trained on music video scenes so it seems that there really is no magic token for true candid/handheld vlog footage stuff

i almost want to ask the guy who made the upskirt helper lora if he could make a candid helper lora for more general candid or handheld stuff than just creepshots

>>106267673
>upgraded to 64gb ram
>speeds now constantly 80-90 secs each sampler instead of 120-150 secs each
this was the post that convinced me.

i hate the XMP antichrist (just leave me alone, man) so my 2x16 is already running at 4800 so I'm assuming if I just get another 2x16 it'll be fine at 4800 and I'll have nothing but upside right

>>106267742
>cars can help you make friends, which is a rarity as most hobbies are inherently antisocial
i saw a HN discussion where someone basically said that men don't actually have friends, just guys they like to do things with, and i feel like that's very accurate and true at least for me

>>106267745
>Thank god I'm not into local LLMs
the people into local LLMs aren't even into local LLMs anymore because they all suck so hard right now kek, unless they're literal researchers writing code and implementations
Anonymous No.106267964 >>106268002
We're all getting 1024GB VRAM for Christmas.
Anonymous No.106267975 >>106268002
>>106267937
real friends exist, you'd just be lucky to have one or two.
but yeah everyone else is of the other variety
Anonymous No.106268002
>>106267964
i'd rather get 10 second capability for christmas which might actually happen now that i think about it

>>106267975
>real friends exist, you'd just be lucky to have one or two.
my only real friend is my wife which hardly counts because its a little different. my 3 guy friends are all hobby friends and i haven't seen any of them in weeks
Anonymous No.106268003 >>106268024
>>106267937
i managed to get lucky with the 2 16gb sticks i just bought and they were the exact same model as the ones i've been running on my PC for years now so it let me run all 4 on XMP.
Anonymous No.106268024 >>106268045
>>106268003
you actually got even luckier because even with the exact same model you might still run into issues unless they all came out of the same package but thanks for flexing on me

i should go sleep now i have a lot of "point of view" shots of living with a latina bimbo and spying on her in the bathroom etc to generate tomorrow and i really need to try out that butt shot composition the kind prompt sharer japanese enthusiast anon shared the other day too but its actually kind of depressing how impossible a truly fat ass that actually sticks out (z axis) is to get, not as big of a problem for asians but still
Anonymous No.106268044 >>106268055
https://huggingface.co/tencent/Hunyuan-GameCraft-1.0
model released. comfyui when?
Anonymous No.106268045
>>106268024
>Handheld footage point of view, the camera looks through the keyhole of the door to reveal -
Anonymous No.106268050
Anonymous No.106268055
>>106268044
>Hunyuan
trash.
Statler&(maybe)Waldorf No.106268058
>>106267732
BEAHAGAHAH
Anonymous No.106268059
Anonymous No.106268074 >>106268082 >>106268266
Followed the wan2.2ldg rentry to the letter, using the suggested Kijai 480p workflow. getting the following

CompilationError: at 1:0:
def triton_poi_fused__to_copy_mul_0(in_ptr0, in_ptr1, out_ptr0, xnumel, XBLOCK : tl.constexpr):
^
ValueError("type fp8e4nv not supported in this architecture. The supported fp8 dtypes are ('fp8e4b15', 'fp8e5')")

I would assume there's a value that's not configured right but I'm not used to the UI
Anonymous No.106268082 >>106268104
>>106268074
disable the torch compile node
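if you want to sanity-check that it's the card and not a setting: Triton's fp8e4nv kernels generally need compute capability 8.9+ (Ada/Hopper), and older cards only get fp8e4b15/fp8e5, which is exactly that ValueError. rough sketch, assuming a CUDA build of torch (the 8.9 threshold is my understanding, not something from the rentry):

import torch

# fp8 e4m3 ("fp8e4nv") kernels want an Ada/Hopper class GPU; on anything
# older, torch.compile + Triton throws the error quoted above
major, minor = torch.cuda.get_device_capability()
print(f"compute capability: {major}.{minor}")
print("fp8e4nv likely supported:", (major, minor) >= (8, 9))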
Anonymous No.106268090 >>106268504
Anonymous No.106268101 >>106268140
>>106267886
dont think ive seen one single model that had slop in its paper and wasnt slop when actually using it
Anonymous No.106268104
>>106268082
appreciated, thank you
Anonymous No.106268123 >>106268128
Anonymous No.106268128
>>106268123
>typing: post more pussy
Based cat
Anonymous No.106268130 >>106268184
kek
Anonymous No.106268137
>>106267904
I only had a quick glance but it just appears to be another autoregressive model, right? We already have Lumina-mGPT as an autoregressive model, so I don't think it will change much in terms of the things you mentioned. Unless I'm also unaware of these Lumina upsides. Might be, I'm retarded.
Anonymous No.106268139 >>106268168
>try comfy's test prompt for qwen-image in chroma
>cute anime girl with massive fennec ears and a big fluffy fox tail with long wavy blonde hair between eyes and large blue eyes blonde colored eyelashes chubby wearing oversized clothes summer uniform long blue maxi skirt muddy clothes happy sitting on the side of the road in a run down dark gritty cyberpunk city with neon and a crumbling skyscraper in the rain at night while dipping her feet in a river of water she is holding a sign that says "ComfyUI is the best" written in cursive
what did chroma mean by this?
Anonymous No.106268140 >>106268157
>>106268101
Every model has slop in its paper
Anonymous No.106268157 >>106268185
>>106268140
nta, but the examples here look pretty shit. idk if it's actually as bad as they're showing but my expectations are low.
Anonymous No.106268168
>>106268139
improvement desu
Anonymous No.106268184
>>106268130
lol
Anonymous No.106268185
>>106268157
7/10 new models are shit, that's the rule of thumb, we'll soon know where this sits
Anonymous No.106268193 >>106268204 >>106268449
Just noticed lightx2v team published an official native workflow for their wan2.2 lightning, so i decided to test it out.

Lightx2v 2.1 sticks to the original image artstyle better (Lightning 2.2 increased contrast and changed how the eyes look), and followed prompt better (lightx2v 2.1 had popslop dripping from her mouth at the end).

TL;DR lightning 2.2 is still asscheeks.
Anonymous No.106268194 >>106268211 >>106268271 >>106268278 >>106268289
>3-day ban for posting the exact image that was in a previous thread
S C H I Z O
D I S C O R D
T R A N N Y F A G S
Anonymous No.106268204
>>106268193
It completely blows colors out, ruins t2v output styles, turns movements into smooth robotic bullshit.
The lightx2v team should apologize for uploading it in that state.
Anonymous No.106268211 >>106268278
>>106268194
>Arbitrary banning rules on 4chan ?
It's more likely than you think
Anonymous No.106268240
Sometimes you see pictures of giant nipples stay up until the thread 404s, other times you get banned because of the suggestion of cameltoe. That just be how it be.
Anonymous No.106268250 >>106270885
Whichever anon suggested this, good idea
Anonymous No.106268266
>>106268074
You can also download the model versions that match fp8e4b15 and fp8e5, which may let you switch torch compile back on. bullerwins or city96 might have the ggufs, idk; non-gguf versions with those specific fp8 dtypes may also be available if you have the vram.
hope it's some help if you want to explore :)
Statler&(maybe)Waldorf No.106268271 >>106268278
>>106268194
>schizo report bombed something he doesnt like
first time? BEAHAgaAhAH
what is this your first day here?
Anonymous No.106268278 >>106268284
>>106268271
>>106268211
>>106268194
>>106256419

ONLY the LDG post was deleted you fucking faggot schizo
reporting bombing bitch
im gonna shit up ldg so fucking much its NEVER gonna stop
you're gonna wish you never poked the bear
Statler&(maybe)Waldorf No.106268284
>>106268278
BEAHAGAhAH
Anonymous No.106268285
Anonymous No.106268289 >>106268292
>>106268194
sometimes I get banned from all boards because the janitor didnt like one image I posted (despite it not breaking any actual rules).
Anonymous No.106268292
>>106268289
its clearly the fag baker
Anonymous No.106268298 >>106268321
>image i made only makes the collage once she is tattooed and mutilated
FUCK YOU OP
Anonymous No.106268303
Anonymous No.106268319 >>106268458
needs better captioning prolly
Anonymous No.106268321 >>106268326 >>106268329 >>106268890
>>106268298
Trying to please the collage maker is a fruitless task.
Anonymous No.106268326
>>106268321
he has sdg derangement syndrome
Statler&(maybe)Waldorf No.106268329
>>106268321
even your ai waifu is ashamed to be caught here
beahagahaha
Anonymous No.106268367
discordfaggeneral
Anonymous No.106268392
dead general
must not feel like talking to yourself tonight
Anonymous No.106268399 >>106268404 >>106268407 >>106268440 >>106268455 >>106268457 >>106271177
Man. Wan is cool.
>Prompt for unseen force pulling the cloth away
>Okay
https://files.catbox.moe/wnntmp.mp4
Anonymous No.106268404
>>106268399
hhhnnngg
Anonymous No.106268407
>>106268399
My wife does this to me when I sleep frfr.
Anonymous No.106268422
Is it too much to ask for k-sampler that takes upscale models AND works with video?
Anonymous No.106268440 >>106268467
>>106268399
SLOP.
but HOT position\prompt. keep at it.
Anonymous No.106268449 >>106268845
>>106268193
>TL;DR lightning 2.2 is still asscheeks.
nothing like wasting a few days trying to find the correct settings
Anonymous No.106268455 >>106268577
>>106268399
very cool
Anonymous No.106268457 >>106268462
>>106268399
>literal NUDITY is allowed
my animation is removed.
FUCK YOU IM REPOSTING IT
Statler/Waldorf No.106268458
>>106268319
BEAHAGAHAHA
Statler/Waldorf No.106268462
>>106268457
>enjoy your vacation
BEAHAGAAHAH
Anonymous No.106268467
>>106268440
What do you think is sloppy about it?
But thanks man.
Anonymous No.106268472
Anonymous No.106268478
I now have an extremely strong dislike for seeing anything in slow-mo. The Matrix is ruined.
Anonymous No.106268504
>>106268090
holy shit lmfao, wan is truly magical
Anonymous No.106268536 >>106268542 >>106268566 >>106268569 >>106270672
Should I invest $4,102.63 for a RTX 4090 24GB, 32 GB RAM setup or invest $5615.43 for a RTX 5090 32 GB, 64 GB RAM setup
Mind you
I can't afford either as I am a NEET but I'll give suicide thread to my family if they don't buy me a computer
They know my laptop is the only thing that keeps me sane

Or should I not be greedy and settle for a 12 GB or 16 GB VRAM setup?
Anonymous No.106268542
>>106268536
The fucking clown world prices man.
Anonymous No.106268549 >>106269084
FUCKING STOP SHITTING UP /NAPT/ SCHIZO ANON
Anonymous No.106268566 >>106268612
>>106268536
>but I'll give suicide thread to my family if they don't buy me a computer

The fukken state of you
Anonymous No.106268569 >>106268612
>>106268536
Go for a 16gb 5060 ti or something, else your family will kill you in your sleep to save money not only on the card, but on electricity
Anonymous No.106268577
>>106268455
Anonymous No.106268607 >>106269513
Anonymous No.106268612 >>106268656 >>106268663
>>106268569
>electricity
Running the RTX 5090 (32 GB) at ~550 W 24/7 for a month would cost me about $25
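(back-of-envelope; the rate is just what makes that number work, not a quoted figure:)

kwh_per_month = 0.55 * 24 * 30   # 550 W running nonstop for a 30-day month = 396 kWh
print(25 / kwh_per_month)        # ~0.063 $/kWh implied; at $0.15/kWh it's closer to $60/month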
>>106268566
not my fault I'm disabled
Anonymous No.106268618 >>106269513
Anonymous No.106268627 >>106268642
What's the best way to do vid2vid together with first and last frame using kijai's wan nodes in Comfy? Whatever I'm doing results in extremely noticeable saturation changes in the first few frames, which isn't the case when doing just vid2vid or firstlastframe generations. Using the native Comfy wan nodes doesn't cause the same issue, but it takes much longer.
Anonymous No.106268642
>>106268627
As to not XY problem myself, I'm just trying to upscale a 480p gen to 720p.
Anonymous No.106268656 >>106268779
>>106268612
They should let you kill yourself. You have 0 value in this world
Anonymous No.106268660
>>106267400
>>106267317
FUCK OFF
Anonymous No.106268663 >>106268779 >>106268798
>>106268612
>I'm disabled
Not an excuse faggot, telling your family youre going to (not) kill your self because you want a shiny new toy is pretty scummy
Anonymous No.106268683
Anonymous No.106268721 >>106269046
Anonymous No.106268725
Anonymous No.106268779
>>106268663
>>106268656
Anonymous No.106268798
>>106268663
maybe he comes from a rich ass family so it's not that big of a deal... hopefully.
Anonymous No.106268813
Is it possible to split outputs in comfyui? Like for example in UE5 you can split the output of a node so you can handle all of the variables it outputs individually
Anonymous No.106268819 >>106268839
>20 steps qwen-q8
Impressive prompt adherence but it doesn't seem to be as good as chroma when it comes to medium tags
Anonymous No.106268830
what an evil troglodyte piece of shit. Ancient Rome had it right with the father being able to kill his own kids

I'd spam real child torture in this thread just to make you feel a sinking feeling in your stomach because the good of making evil like you feel bad easily outweighs the evil of sharing that material publically and affecting the bystanders
Anonymous No.106268839 >>106268893
>>106268819
Same prompt in chroma. Qwen doesn't seem to understand what an oil painting is.
Anonymous No.106268845
>>106268449
You have to look at it from a glass half full perspective. you didn't waste time and failed, you just found a method that doesn't work :)
Anonymous No.106268874
lmao this larping furry threatening to spam shit
Anonymous No.106268890 >>106268949
>>106268321
>Trying to please the collage maker is a fruitless task.
Just gen goth girls if the last collage had a blonde and vice versa

Also WanVideo anon please consider making a neocities or static site or something because you're the only one doing bespoke hand crafted t2v prompts with wan and you're willing to share them and you often have good ideas and novel compositions and it would be nice to have a place that they are all together

Is this how nogen beggars feel? You're the only one I've ever actually asked this for. If 4chan didn't strip exif then I wouldn't even have to ask
Anonymous No.106268893 >>106268908
>>106268839
Another example. Qwen.
Anonymous No.106268896
Anonymous No.106268906
What's the consensus? is the 2.1 I2V 4step lora better than the 2.2 I2V 4step loras?
Anonymous No.106268908 >>106268984
>>106268893
Chroma
Maybe my prompting style is the issue but qwen is quite rigid in terms of art styles
Anonymous No.106268949
>>106268890
I have other creative pursuits outside of genning I share. But it's all porn shit and I'd be lambasted for it if I ever revealed it.
Anonymous No.106268984 >>106269007 >>106269138
>>106268908
>qwen is quite rigid
Yeah it's insanely rigid.
Anonymous No.106269007 >>106269013 >>106269014 >>106269138 >>106269212
>>106268984
Another example. "A blurry low quality 80s polaroid photo".
Qwen completely ignores the quality and medium tags.
Anonymous No.106269013
>>106269007
it gave you a low quality gen
Anonymous No.106269014 >>106269064
>>106269007
Chroma.
I guess Qwen's output in this example is more stylish and beautiful, but it comes at the price of worse medium comprehension
Anonymous No.106269023
Anonymous No.106269046 >>106269063
>>106268721
can she spank her?
Anonymous No.106269057
Anonymous No.106269063 >>106270261
>>106269046
No, she genuinely hates her and wants her to die.
Anonymous No.106269064
>>106269014
kino
Anonymous No.106269082 >>106269101 >>106269105
>try wan 2.2
>first checkpoint works fine
>comfy tries loading second one
>crashes
very functional program
Anonymous No.106269084
>>106268549
kek
Anonymous No.106269101
>>106269082
>stuck on one tool
>cant find a working solution for himself
very functional anon
Anonymous No.106269105
>>106269082
>TOOL FAULT TOOL FAULT
holy retard
Anonymous No.106269138 >>106269291
>>106269007
>>106268984
Chroma is great for faces too, it seems to generate a unique face every time (I do a lot of image to image stuff). Whereas Qwen seems to suffer from same-face syndrome, sadly.
Anonymous No.106269192 >>106269380
starting to like statler honestly
Anonymous No.106269206 >>106269380
stating to like startler honestly
Anonymous No.106269212
>>106269007
t2v with qwen
i2v with chroma
Anonymous No.106269218 >>106269225
can I run Wan and Lightx2V decently with two nvidia 3060 (12 gb each)?
haven't tried video models at all
Anonymous No.106269225 >>106269261
>>106269218
yes
Anonymous No.106269261
>>106269225
cool. thanks
Anonymous No.106269275 >>106269329
Anonymous No.106269283 >>106269299
HALP
I generate spritesheet, but when i try to resize into 128x128, its always remove the pixel
Is there any node that handle resize for pixel art?
Anonymous No.106269291
>>106269138
Same-face is good for i2v though. I'm putting together an i2v short made of qwen-image gens, it helps immensely to have them be consistent.
Anonymous No.106269295 >>106269369
>press unload models and free node cache buttons
>mfw
What the fuck is being stored in the memory?
Anonymous No.106269299 >>106269366 >>106270005
>>106269283
https://github.com/tauraloke/ComfyUI-Unfake-Pixels/
Also what are you using for spritesheets?
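if you just want to do it outside comfy, plain nearest-neighbour resampling is the usual fix so the pixels don't get blended away (quick sketch, filename made up):

from PIL import Image

# nearest-neighbour keeps hard pixel edges instead of interpolating them away
img = Image.open("spritesheet.png")
img.resize((128, 128), Image.NEAREST).save("spritesheet_128.png")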
Anonymous No.106269318 >>106269352
Anonymous No.106269329
>>106269275
catbox?
Anonymous No.106269348
I wonder if the new wan 2.2 flash model will release weights soon?
Anonymous No.106269352 >>106269398
>>106269318
Suddenly the white woman has black arms...
Anonymous No.106269366
>>106269299
I train Illustrious using Pokemon spritesheet
Anonymous No.106269369
>>106269295
chrome
Anonymous No.106269380 >>106269606
>>106269206
>>106269192
fucking waldorf haters make me sick
Anonymous No.106269381
So Qwen Nunchaku is out, but no comfy workflow yet...
Anonymous No.106269398 >>106269408
>>106269352
it spreads
Anonymous No.106269408
>>106269398
Makes you wonder, in the world of woke, what is worse, blackface or blackarms ?
Anonymous No.106269437 >>106269475 >>106269482 >>106269665
https://huggingface.co/nunchaku-tech/nunchaku-qwen-image

qwen image is fast as fuck now, really wish they would do a wan version
Anonymous No.106269475
>>106269437
Extra fast on Blackwell, I guess they're making the best of the 4bit hardware support it has
Anonymous No.106269482 >>106269516
>>106269437
How do these differ from just running a q8?
Anonymous No.106269513
>>106268607
>>106268618
Hot af
Anonymous No.106269516 >>106269528
>>106269482
it runs inference in 4 bit with actually better quality than FP8, it's black magic.
Anonymous No.106269528 >>106269533
>>106269516
get these people on wan asap wtf
Anonymous No.106269533
>>106269528
they have it on their road map at least
Anonymous No.106269557
Anonymous No.106269558
Anonymous No.106269561 >>106269566
Are the qwen nsfw loras just generic vanilla shit? Am I still locked to chroma for native freakshit?
Anonymous No.106269566
>>106269561
>gen something in qwen
>I2I in chroma or noob/illust
>inpaint as desired
Anonymous No.106269569 >>106269680
>>106267673
I want to upgrade my ram also but I have 32gb. Can i buy another 32gb and make it 64gb?
Anonymous No.106269592
Anonymous No.106269606 >>106269740
>>106269380
>>106269095
hes mean to ani :(
Anonymous No.106269622 >>106269659 >>106269698
Now that qwen nunchaku is out, what makes it better than flux?
Anonymous No.106269653 >>106270694
Newfag here, trying to figure out local gen stuff, so sorry in advance if my post seems like retarded bait. Basically
>managed to install stable-diffusion-webui-amdgpu on my pc
>I have 8gb vram and 16gb ram
>it worked (slowly, I remember some gens being around 3 or 4 secs/it, plus occasionally crashes with oom errors) but I only messed around with it a little, using the Illustrious XL 2.0 model suggested in one of the renpy links
>didn't really touch the settings since it was all gobbledigook that I wanted to google before fucking shit up, so I don't think I was using any sort of offloading
>went on 2 week vacation (no access to the pc) during which I decided to commit and move from windows to Linux (Mint, in case there's something specific that I should keep in mind with it and local models)
I'm still getting stuff ready to migrate to Linux but I want to start getting an idea of what might be the best and/or most noob friendly UI to start off with, what models might be able to run on my pc, what I can expect quality wise or if I shouldn't even bother until I upgrade my hardware.
I saw >>106267379 and >>106267432 from the previous thread, but I wouldn't mind some extra info to go along with the stuff in the renpies and whatever else I might find online.
Fwiw I was thinking of using ComfyUI and get the learning curve out of the way but that might be a bad idea. Maybe (re)Forge is a better option atm? As for models, I'm probably gonna try Illustrious again, and maybe whatever version of Qwen I might be able to run, if there aren't better options.
Anonymous No.106269659
>>106269622
Qwen is controlled by the Communist Party of China.
Anonymous No.106269665 >>106269695
>>106269437
They said that this is their next goal
Anonymous No.106269680
>>106269569
Yes, unless you're on a laptop
Anonymous No.106269695 >>106269707
>>106269665
Why the fuck would they prioritize qwen over wan?
Anonymous No.106269698 >>106270042
>>106269622
It's better than Flux at every metric, but while it is a LOT less slopped than Flux, it is still slopped

Loras will fix that though, in a much better way than loras that tried to circumvent Flux sloppiness and censorship
Anonymous No.106269707
>>106269695
Probably easier since they already have experience with making Nunchaku image models
Anonymous No.106269740 >>106270500
>>106269606
Weird
One would guess that pedos get along with each other
Anonymous No.106269775 >>106269887
a toast to the death of /lmg/!
training wizza No.106269822 >>106270152
this might help noobs who are genning: if you are trying to do a 5 second clip from an image and the image's resolution ratio is not 1:1, it will take more memory to generate
if you're having trouble with memory allocation, you can try to trim the image so its length and width are as close as possible, if not the same
Anonymous No.106269885
I hate square format
I hate square format
Anonymous No.106269887
>>106269775
Anonymous No.106269898
Anonymous No.106269899 >>106269930
Anonymous No.106269930 >>106269961
>>106269899
catbox?
Anonymous No.106269961
>>106269930
https://files.catbox.moe/50anxm.mp4
Anonymous No.106269988 >>106270088 >>106270156 >>106270305
Is there a chroma compatible node that has the the resolutions/aspect ratio as a drag and drop or has it in presets?
Anonymous No.106270000 >>106270061
Is there any reason to use Q8 over FP8?
Anonymous No.106270005
>>106269299
oh cool thank
its work
Anonymous No.106270027
Anonymous No.106270042 >>106270103 >>106270149
>>106269698
>less slopped than flux
come on now
Anonymous No.106270061
>>106270000
Q8 is basically better in every sense.
Anonymous No.106270062 >>106270105
which of these do i get?
Anonymous No.106270088
>>106269988
In the last thread or the one before anons were recommending such a node for SDXL. One was D2 size selector
Anonymous No.106270103
>>106270042
Hve you ever used Flux ? The Flux chin alone means more slopped, but the skin is literally plastic
Anonymous No.106270105 >>106270127
>>106270062
gee, if only there was a section right in the cover page of the repo specifying what each one of these are, right?
Anonymous No.106270127
>>106270105
i just grabbed the largest one
Anonymous No.106270149
>>106270042
>implying the ultra-distilled model is less slopped
Anonymous No.106270152 >>106270204
>>106269822
A 1:1 image is gonna take less memory than a 16:9 image with the same pixel count? I don't know about that.
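the latent tensor itself only depends on pixel count, not aspect ratio; quick sketch assuming the usual 8x spatial VAE downscale and 16 latent channels (whatever extra memory wizza is seeing would have to come from somewhere else, e.g. the upscaler):

# elements in one latent frame: channels x (W/8) x (H/8)
def latent_elems(width, height, channels=16, downscale=8):
    return channels * (width // downscale) * (height // downscale)

print(latent_elems(768, 768))    # 1:1                    -> 147456
print(latent_elems(1024, 576))   # 16:9, same pixel count -> 147456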
Anonymous No.106270156 >>106270208 >>106270210
>>106269988
>That pic
Bro you literally generated one of my dreams / disturbing visions and got double dubs on top.
Anonymous No.106270198
>32gb -> comfyui server crash
>64gb -> 72 secs
>96gb -> 68 secs
so the more ram the better
interesting
training wizza No.106270204
>>106270152
maybe it's the upscaler, but it does; i noticed that every video where i had to bring the total frames down was because of the aspect ratio
Anonymous No.106270208 >>106270227
>>106270156
schizo lain anime
Anonymous No.106270210 >>106270227
>>106270156
We Are Watching You
Anonymous No.106270227 >>106270263 >>106270309
>>106270208
>>106270210
I'm going to create a novel based on that. I know what dreams are and that "dream" was way too clear to be a dream.
Anonymous No.106270260 >>106270293 >>106270714
Anonymous No.106270261
>>106269063
sounds unpleasant
Anonymous No.106270263 >>106270290 >>106270305
>>106270227
My current schizo theory is that the minds of AI users are locked into a group consciousness like a mass latent. And then the minds start blending together or pieces changing places
Anonymous No.106270268 >>106270273 >>106270286 >>106270436
>have to use quants to cope with 8gb vram + 32gb sysram
>quants look like shit
is this a sign to buy more ram
Anonymous No.106270273
>>106270268
Just download more.
Anonymous No.106270286
>>106270268
gguf q8 looks close to the same as the original on most models we can run
Anonymous No.106270290 >>106270305
>>106270263
Can you latent me a gf?
Anonymous No.106270293
>>106270260
>Early 2000 japanese mecha movie
I rike it
Anonymous No.106270305 >>106270320 >>106270324 >>106270337
>>106270263
I'm not an AI user Anon, I saw that pic >>106269988 when I opened /g/ in the thread view.

>>106270290
Do NOT summon a tulpa you moron, go to >>>/x/ and read up before you try to manifest shit.
Anonymous No.106270309 >>106270343 >>106270360 >>106270568
>>106270227
>I know what dreams are and that "dream" was way too clear to be a dream.
t. Cyrus the Great
Anonymous No.106270320 >>106270360
>>106270305
>Do NOT summon a tulpa
oh he's a schizo
Anonymous No.106270324 >>106270360
>>106270305
tulpa? you mean the realistic dreams I used to have as a kid and which I could control like in lucid dreams?
Anonymous No.106270337 >>106270360
>>106270305
anytime i hear someone mention that shit im reminded of the anon who apparently started permanently hallucinating some pony but with an extremely contorted face that did nothing but scream in agony
Anonymous No.106270343
>>106270309
Gaming was a mistake
Anonymous No.106270360 >>106270381
>>106270309
What, me? No man, no.
>>106270320
I'm not a schizo, I just enjoy reading retarded shit on /x/
>>106270324
>>106270337
Honestly, I don't really know what a tulpa is, I was just LARPing lmao. Something that you could manifest and that would take over your mind and kill your sanity, that's all I know. I don't know how to manifest that shit.
Anonymous No.106270376 >>106270385
Was sage attention updated to work on qwenimg? Any news?
Anonymous No.106270381
>>106270360
sounds like a djin, or demon possession
Anonymous No.106270385 >>106270398 >>106270416
>>106270376
>Any news?
https://huggingface.co/nunchaku-tech/nunchaku-qwen-image
read the thread
Anonymous No.106270398
>>106270385
What does this have to do with sage attention?
Anonymous No.106270409 >>106270429 >>106270600
Anonymous No.106270411 >>106270550
nunchaku qwen image has no comfy support?
Anonymous No.106270413
for those of you wondering what a tulpa is and why you should not generate one, please read the "summoning a pinkie pie tulpa demon" greentext
Anonymous No.106270416
>>106270385
ok nigger but comfy support when?
Anonymous No.106270429 >>106270545
>>106270409
latina diffusion general

sad to see that you're stuck on a vramlet resolution
Anonymous No.106270436
>>106270268
nigga ram isn't gonna help. you need a better graphics card
Anonymous No.106270443
Still no LXTV i2v video guidance?
Anonymous No.106270462
Anonymous No.106270490 >>106270611
dumb question but if i have two lora nodes next to each other and they're both loading the same lora twice at 0.4 weight, will the total weight of the lora applied be 0.8?

i just tried a gen on the old 2.1 lightx2v and it mogs but its actually too much movement so i'm thinking of mixing 2.1 and 2.2 at 0.4 each for a total of 0.8 but im not sure if thats how the math works
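actually, sketching it out, i think it does just add, at least for the same lora twice (toy shapes; assumes the loader patches weights as W + strength * (B @ A), which is how lora merging normally works — two different loras at 0.4 each is another story):

import torch

W = torch.randn(64, 64)
A, B = torch.randn(8, 64), torch.randn(64, 8)
once_at_08  = W + 0.8 * (B @ A)
twice_at_04 = W + 0.4 * (B @ A) + 0.4 * (B @ A)
print(torch.allclose(once_at_08, twice_at_04, atol=1e-6))   # True: 0.4 + 0.4 behaves like 0.8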
Anonymous No.106270500 >>106270508
>>106269740
QRD?
Anonymous No.106270508 >>106270539
>>106270500
everyone I don't like is a pedo. it came to me in a dream
Anonymous No.106270533 >>106270612
wow combining lightx2v 2.2v1.1 and 2.1 actually works pretty well for keeping the movement

this is 0.6 2.2v1.1 lightning + 0.4 2.1 lightx2v

will try 0.8 and 0.2 next but i feel like the sweetspot is somewhere around here
Anonymous No.106270539
>>106270508
gotcha ;^)
Anonymous No.106270545 >>106270564
>>106270429
yeah sucks being a 4090 vramlet loser
Anonymous No.106270550
>>106270411
Not yet, and probably not until monday unless someone else other than the Nunchaku guys implements it
Anonymous No.106270557 >>106270564
based on my observations while lurking, Statler&Waldorf posting hours coincide with somewhere in Russia, but the resolution of the gens fluctuates between 480p & 720p... so how did a slav afford a 40 series?
Anonymous No.106270564 >>106270575
>>106270545
it unironically does

>>106270557
russia has 11 time zones your "observations" are completely retarded
Anonymous No.106270568 >>106270590
>>106270309
You can't even see what game they're playing in the window reflection

Wan sucks
Anonymous No.106270575
>>106270564
kek
Anonymous No.106270590 >>106270631
>>106270568
this could be a step distill issue or just a lightning 2.2 being garbage issue
Anonymous No.106270600 >>106270615
>>106270409
is this wan 2.2?
Anonymous No.106270611 >>106270626
>>106270490
those are separate loras though
Anonymous No.106270612
>>106270533
>v1.1 and 2.1
they've released multiple versions of the 2.2 loras?
Anonymous No.106270615
>>106270600
yes
Anonymous No.106270618 >>106270748
So I have a problem. In comfy I can't slot in an upscaler into the load upscale model node. I tried a couple.

Followed this guide https://docs.comfy.org/tutorials/basic/upscale put it in the upscale_models folder and I can't slot it in the node and the node doesn't show a list when I click it.
Anonymous No.106270624
>install latest sageattention 2.2 wheel
>go from 28 to 16 seconds chroma gens on rtx5090
sheesh
Anonymous No.106270626
>>106270611
sure, but lets pretend they were the same
Anonymous No.106270631 >>106270647
>>106270590
I was just joking, mimicking retards
Anonymous No.106270645
>>106267737
If you're going over 64 GB usage then it's not a waste. Though I don't know if you can actually hit that; I'm a 32 GB peasant and I max that out 99% of the time, which causes chugs.
Anonymous No.106270647 >>106270712
>>106270631
i cant tell what is real or fake anymore, man
also wan does reflections really good, i bet if i mentioned "the video game being played on the tv being reflected in the windows as they tongue kiss" it would have generated it as well
Anonymous No.106270672 >>106270676
>>106268536
dude just pick up some gig jobs to save up that money
Anonymous No.106270676 >>106270730
>>106270672
>gig jobs
elaborate
Anonymous No.106270694 >>106271149
>>106269653
>Maybe (re)Forge is a better option atm?
Forge was easy to learn. I assume reforge is the same. You should be able to run illustrious just fine but you will max out your 16 GB of system ram, which might become a hindrance to using your PC while genning.
Anonymous No.106270712
>>106270647
>i cant tell what is real or fake anymore, man
I get you, there's was a time when you could easily identify shitposting, but there's just too many retards these days
Anonymous No.106270714
>>106270260
you got more like this?
Anonymous No.106270718 >>106270744 >>106271257 >>106271533
https://civitai.com/articles/18185
>As a final note, you could easily train SDXL to use the Flux VAE directly

Someone please do this (I don't know how). Thank you.
Anonymous No.106270730
>>106270676
Anonymous No.106270744 >>106270841
>>106270718
it's easy! just retrain SDXL from scratch but using the flux vae instead.
Anonymous No.106270748
>>106270618
Prease herp why it no work.
Anonymous No.106270752 >>106270820
disregard females, acquire killstreaks

this is 0.8 2.2v1.1 and 0.2 2.1 on both high and low. i don't actually think you need the 2.1 lora for the low noise pass at all since the motion gets encoded in the first half right? I'm going to try not using the 2.1 lora at all for the second sampler
Anonymous No.106270800 >>106270822
Rank 32 version of Qwen digital camera lora
Night and day difference from the first version
I won't release it just yet because I have to fix the fucking dates from the dataset (although it can arguably make images more soulful)
Anonymous No.106270819
Anonymous No.106270820 >>106271005
>>106270752
You get a blurry noisy result to some extent if you do that, well at least I did when I was testing
Anonymous No.106270822
>>106270800
>fix the fucking dates from the dataset
that's what you want though? unless it isn't tagged and appears on its own
Anonymous No.106270841
>>106270744
Also all the finetunes, the finetunes of finetunes, loras...

It's easy, why hasn't someone done this already ?
Anonymous No.106270858 >>106270866 >>106270936
Why does V50 go blind sometimes all of a sudden?
Anonymous No.106270866
>>106270858
Poorly made model, this is why you dont "cheat" your final version by doing a shitmerge. Use V49.
Anonymous No.106270885
>>106268250
the pillow should be long
Anonymous No.106270936
>>106270858
I would too if I had to learn off that fags dataset
Anonymous No.106270965 >>106271005
is it just me or is the 2.2 lightning lora really resistant to moving the camera
Anonymous No.106271005
>>106270820
i dont see those issues but it does seem to be worse without having it in both high and low steps

i will try a 0.6 + 0.4 mix of 2.2v1.1 and 2.1 for a few hours

>>106270965
no that's just 2.2 lightning unfortunately
Anonymous No.106271035 >>106271168
Anonymous No.106271149 >>106271218 >>106271219
>>106270694
Actually, can forge/reforge be used with AMD? There's the amd/directml stable diffusion version I mentioned in my other post, as well as a few guides to get Comfy and A1111 working on AMD, but I didn't see anything about (re)forge specifically.
>You should be able to run illustrious just fine
Neat.
>but you will max out your 16 GB of system ram which might become a hindrance to use your PC while genning.
On the gens I tried out when I first installed it, I was already doing nothing extra on the PC to avoid oom crashes, so I guess not much will change kek
That said, I'm guessing that the "strategy" of using models of a certain size, depending on your vram/ram, is different between image gen and llms? As in, for llms, I seem to be mostly fine if I use models that can fit in my vram (so, models that are under 8gb in size, pretty much), but you're saying that Illustrious' model will "spill over" and need to use the ram as well, despite it being around 6.5gb in size. I mean, it makes sense, since it's going through a bunch of images instead of "just" text, but is there some general rule like that for image models?
Again, sorry if these questions are dumb, just trying to sort out some "basic" (?) stuff so I don't have to spend more time reinstalling stuff than experimenting with actual genning.
Anonymous No.106271168 >>106271183
>>106271035

KEK, did you make an i2v lora of chris? And can he pop into any image?
Anonymous No.106271177
>>106268399
hot
Anonymous No.106271181
Has anyone done anything interesting with this long generation workflow? Seems like it maintains quality pretty well

https://civitai.com/models/1866565/wan22-continous-generation-subgraphs
Anonymous No.106271183 >>106271242 >>106271333
>>106271168
No my solution was much more stupid.
Anonymous No.106271218
>>106271149 (Me)
Found https://github.com/likelovewant/stable-diffusion-webui-forge-on-amd
Might be a bit outdated compared to the latest release of the normal ReForge. Gonna look it up a bit more, but I guess I can try this one out if I find nothing else.
Anonymous No.106271219 >>106271330
>>106271149
I'm new to this so I don't really know what takes up space in RAM but something for sure does. If things spill over from VRAM that's one case which slows down gens with a big impact if I understand correctly. I think LoRAs are loaded in RAM?

Mostly I've noticed regular high RAM usage both in video and image gen and going by that I'm saying 16 GB will struggle. Maybe there's tricks to optimize that but idk.
Anonymous No.106271242
>>106271183
>my solution was much more stupid.
i disagree heh nice anon
Anonymous No.106271257 >>106271326 >>106271436 >>106271514
>>106270718
>>you could easily train SDXL to use the Flux VAE directly
AHAHAHAHAH
Anonymous No.106271326 >>106271436
>>106271257
Not gonna happen
Anonymous No.106271330
>>106271219
Fair enough. Already have a few options lined up, so I'll just try it out and see what happens. Thanks anon!
Anonymous No.106271333
>>106271183
I am also doing something similar to that, I remove the background from the characters/people I want to gen, add a white background and put them together and then prompt something like
>the scene flashes from white to (whatever setting you want), (rest of the prompt)...
Anonymous No.106271344 >>106271359 >>106271365 >>106271468
>>106267568 (OP)
Can you use AMD graphics cards for AI stuff now as well? About 2 years ago it was only really possible to do AI stuff with Nvidia cards.
Anonymous No.106271359 >>106271468
>>106271344
Nope sadly
Anonymous No.106271360 >>106271394 >>106271467
Aw man, fuck it
Here is the v2 for Qwen digital cam lora

I had to divide in two parts since it's a big lora (rank 32)

https://files.catbox.moe/pboio2.001
https://files.catbox.moe/5axjtg.002

Rename them to the same name and extract
Yes, it shows date texts unprompted, you gotta add a negative prompt of it

I am having some weird artifacts with Qwen even without loras (check the grass in picrel), if someone knows how to fix it, let me know

If I train it again, I may try a resolution larger than 1024p and I will ensure every date text is gone

But for now, I'm gonna train something else
Anonymous No.106271365
>>106271344
It would be a pain but I don't know anything about it
Anonymous No.106271367 >>106271397 >>106271400 >>106271452
How do I get faster output times for wan2gp? I have an rtx 3080 and it took 1hr to output 5s
Anonymous No.106271394 >>106271475
>>106271360
nice
>But for now, I'm gonna train something else
what will you train next?
Anonymous No.106271397
>>106271367
at 480p - any workarounds for faster times / higher quality outputs?
Anonymous No.106271400 >>106271410
>>106271367
>1hr for 5s
Jesus christ. I'm a noob and I don't know much but this https://civitai.com/models/1802623?modelVersionId=2039975 is decently fast for me on a 2070 so it should work better for you. I don't know if you can run the 720p on a 3080 but the 480p will work for sure.
Anonymous No.106271410 >>106271430
>>106271400
>2.1
Anonymous No.106271430
>>106271410
All I know is I'm not taking an hour per 5 seconds of video on this. More like 5-8 minutes.
Anonymous No.106271436
>>106271257
>>106271326
trust in ostris
Anonymous No.106271452
>>106271367
>How do I get a faster output times for wan2gp?
Killing yourself would be easier
Anonymous No.106271467
>>106271360
>I am having some weird artifacts with Qwen even without loras
Looks similar to what Flux dev does sometimes. Very strange.
Anonymous No.106271468 >>106271498
>>106271344
>>106271359
Depends on the GPU model, but it's possible. I literally linked a github related to it a few posts ago. There also a couple links in the renpys about genning with AMD.
Keep in mind, from what I've looked up so far, it seems to work better on Linux (because of rocm support being Linux only atm), but there are a couple options for Windows. Look up stable diffusion amd directml, i think it was something along those lines.
Anonymous No.106271475 >>106271526 >>106271654
>>106271394
Probably the 80s dark fantasy stills since people liked that the other thread

Or 80s / late showa era japanese photographs (which is my best dataset by far)


I need someone to tell me if there is a fucking fix for this:

https://huggingface.co/Qwen/Qwen-Image/discussions/52

They apparently released a broken VAE, but this needs confirmation
Anonymous No.106271484
>>106271478
>>106271478
>>106271478
>>106271478
Anonymous No.106271498
>>106271468
directml is crap, plus you can't train anything unless it's cuda
Anonymous No.106271514 >>106271538 >>106271553
>>106271257
>t. someone who has no idea how any of this works
SDXL and Flux VAE are both 8x spatial compression. Flux just has more channels. Right before the models project to VAE space, the vectors fundamentally must contain the same information, since they are encoding for 8x8 pixel patches. I bet you could get 95% of the way to adapting SDXL to Flux VAE with literally a single linear projection layer. A fine tune of the last few layers on a few million diverse images would get you 100% of the way there.
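rough illustration of that single projection idea (1x1 convs between the 4-channel SDXL latent and the 16-channel Flux latent; the "95%" is my bet, not an established result):

import torch
import torch.nn as nn

sdxl_to_flux = nn.Conv2d(4, 16, kernel_size=1)   # SDXL latent channels -> Flux latent channels
flux_to_sdxl = nn.Conv2d(16, 4, kernel_size=1)   # and back

x = torch.randn(1, 4, 128, 128)                  # a 1024x1024 image after the 8x downscale
print(sdxl_to_flux(x).shape)                     # torch.Size([1, 16, 128, 128])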
Anonymous No.106271526
>>106271475
>80s dark fantasy stills
this pretty please
Anonymous No.106271533
>>106270718
it might take a decent amount of effort to adapt it to the flux latent space, but you can actually modify the sdxl vae with not much training to get about the same amount of compression as the flux vae, though you will still have to train sdxl itself to learn to use all the new information and there might very well be downsides as well
Anonymous No.106271538
>>106271514
im pretty sure ostris tried exactly this when training adapters for his vae. and we know how that turned out
Anonymous No.106271553
>>106271514
unironically look forward to your experiments
Anonymous No.106271619 >>106271640 >>106271713
what happened to the 16 channel sdxl vae ostris was making?
Anonymous No.106271640 >>106271694
>>106271619
flux released and he stopped caring
Anonymous No.106271654
>>106271475
>Or 80s / late showa era japanese photographs
Can't go wrong with this one
Anonymous No.106271694 >>106271712
>>106271640
this is after flux. he is using the flux vae and converting it
Anonymous No.106271712 >>106271727
>>106271694
>he is using the flux vae
he made his own vae
Anonymous No.106271713
>>106271619
I don't believe he ever figured out how to make it converge. But he's always got multiple projects going so I'm sure he just became more interested in something else. Maybe his trainer.
Anonymous No.106271727
>>106271712
using the same arch. it's probably like that neural net that converts sd1.5 noise to sdxl
Anonymous No.106272309