
Thread 105658422

314 posts 182 images /g/
Anonymous No.105658422 >>105658450 >>105658752 >>105660048 >>105661768
/ldg/ - Local Diffusion General
Not a Lora Edition

Discussion of Free and Open Source Text-to-Image/Video Models

Prev: >>105656291

https://rentry.org/ldg-lazy-getting-started-guide

>UI
SwarmUI: https://github.com/mcmonkeyprojects/SwarmUI
re/Forge/Classic: https://rentry.org/ldg-lazy-getting-started-guide#reforgeclassic
SD.Next: https://github.com/vladmandic/sdnext
ComfyUI: https://github.com/comfyanonymous/ComfyUI
Wan2GP: https://github.com/deepbeepmeep/Wan2GP

>Models, LoRAs, & Upscalers
https://civitai.com
https://civitaiarchive.com
https://tensor.art
https://openmodeldb.info

>Cook
https://github.com/spacepxl/demystifying-sd-finetuning
https://github.com/Nerogar/OneTrainer
https://github.com/kohya-ss/sd-scripts/tree/sd3
https://github.com/derrian-distro/LoRA_Easy_Training_Scripts
https://github.com/tdrussell/diffusion-pipe

>Chroma
Training: https://rentry.org/mvu52t46

>WanX (video)
https://rentry.org/wan21kjguide
https://github.com/Wan-Video/Wan2.1

>Misc
Share Metadata: https://catbox.moe | https://litterbox.catbox.moe/
Img2Prompt: https://huggingface.co/spaces/fancyfeast/joy-caption-beta-one
Archive: https://rentry.org/sdg-link
Samplers: https://stable-diffusion-art.com/samplers/
Txt2Img Plugin: https://github.com/Acly/krita-ai-diffusion
Bakery: https://rentry.org/ldgcollage | https://rentry.org/ldgtemplate
Local Model Meta: https://rentry.org/localmodelsmeta

>Neighbors
https://rentry.org/ldg-lazy-getting-started-guide#rentry-from-other-boards
>>>/aco/csdg
>>>/b/degen
>>>/b/celeb+ai
>>>/gif/vdg
>>>/d/ddg
>>>/e/edg
>>>/h/hdg
>>>/trash/slop
>>>/vt/vtai
>>>/u/udg

>Local Text
>>>/g/lmg

>Maintain Thread Quality
https://rentry.org/debo
Anonymous No.105658446 >>105658453 >>105658456 >>105658470
Video collage is separate because the initial frames render white even when combined with images.
>>105639397
It is incredibly quick now however. Thank you anon.
Anonymous No.105658449
hackerman JC:
Anonymous No.105658450
>>105658422 (OP)
op why didn't you include my gen in the collage
Anonymous No.105658453
>>105658446
>the initial frames are white even when combined with images.
did you ask claude to fix it?
Anonymous No.105658456
>>105658446
Doesn't happen on my end, but I'll look into it.
Anonymous No.105658470 >>105658504
>>105658446
I hereby disavow this disgusting mockery of our savior, Xi.
Anonymous No.105658492
Blessed thread of frenship
Anonymous No.105658504
>>105658470
>I hereby disavow this disgusting mockery of our savior, Xi.
same, that man saved local, have some respect!
Anonymous No.105658519 >>105661712 >>105661937
Anonymous No.105658535 >>105658539 >>105658543
does longer length of video gen with wan require more vram? or is it just a matter of more time if you can already make 5s clips?
wondering if i can leave my pc and come back to a clip that's a minute long or it'll just get bottlenecked somewhere
Anonymous No.105658539
>>105658535
>does longer length of video gen with wan require more vram?
yes
Anonymous No.105658543
>>105658535
81 frames, 5 seconds max, Wan breaks otherwise. With RifleXRope, 129 frames, 8 seconds, but it tends to want to loop.
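quick sanity check on the math, assuming Wan's usual 16 fps output (frame counts follow the 4n+1 rule, hence 81 and 129):

# rough sketch: Wan 2.1 renders at 16 fps; the frame counts above are 4n+1
WAN_FPS = 16

def seconds_for(frames: int) -> float:
    return frames / WAN_FPS

print(seconds_for(81))   # ~5.1 s, the normal ceiling
print(seconds_for(129))  # ~8.1 s with RifleXRope, tends to loop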
Anonymous No.105658544
>>105658534
>Krea is 100% not going to be released
>I don't base this on lazy dooming but because I tried the website and saw that they have all sorts of midjourney-like tooling built up around it, custom style training and image prompts and stuff
>no company that's going to drop the weights does that. this model is staying closed
Anonymous No.105658548
Anonymous No.105658553
>my asuka is double major
feels good
Anonymous No.105658597 >>105658715 >>105658729 >>105658735 >>105658995
Anonymous No.105658616 >>105658640
Anonymous No.105658617 >>105658625 >>105658631 >>105658898 >>105658912
https://huggingface.co/rocca/chroma-nunchaku-test/discussions/1#68557c81961b7e57afe5f902
soon
Anonymous No.105658618
>>105656515
All the time
Anonymous No.105658625 >>105658987
>>105658617
>NOTE: The quality of outputs produced by the models in this repo are not as good as they could be, probably due to bugs in my code. You may need to wait for official Nunchaku support if you want good quality outputs.
I sleep
Anonymous No.105658631 >>105658636
>>105658617
v29 was the last good one though, there's no need to bother with quanting the rest
Anonymous No.105658636
>>105658631
>v29 was the last good one though
this
Anonymous No.105658640
>>105658616
kek
Anonymous No.105658659 >>105658669 >>105658671 >>105658675
i hope you guys aren't wasting any of your ram
Anonymous No.105658663
Anonymous No.105658669
>>105658659
My /lmg/iggas don't have this problem
Anonymous No.105658671
>>105658659
>23.6
uh oh stinky
Anonymous No.105658675 >>105658692
>>105658659
why does it eat so much ram on your side? are you going for a 1080p video or something?
Anonymous No.105658690 >>105658698 >>105658715 >>105658817
>Even Midjourney video is mogging us
as much as I like Wan, I'd like to see an improvement over that, the API faggots are starting to pull too far ahead of us :(
Anonymous No.105658692
>>105658675
nta but I have 80GB and if I run a workflow that loads a few different large models in succession (e.g. generate an init image with an SDXL or Flux model and then run it through Hidream to sharpen it up), the ram fills up

It's a good thing though: it's comfy storing the unused weights in ram so that when it's time to swap, it can transfer them back onto the GPU more quickly without having to touch the ssd again
Anonymous No.105658698 >>105658700
>>105658690
do we know which base i2v model MJ built theirs on? I doubt they made one from scratch
would the WAN license allow it?
Anonymous No.105658700
>>105658698
>would the WAN license allow it?
it would, the Apache 2.0 licence means the model belongs to everyone
Anonymous No.105658715 >>105658729 >>105661692
>>105658690
at least post something as high a res as >>105658597 and also with titty (you cant thoever)
Anonymous No.105658726 >>105659762
https://x.com/pabloprompt/status/1935822625663861192

more crazy hailuo clips itt, america is being humiliated atm
Anonymous No.105658729
>>105658715
>thoever
more like tHOEver am I right? >>105658597
Anonymous No.105658735
>>105658597
holy based, thanks
Anonymous No.105658737
Anonymous No.105658752
>>105658422 (OP)
digits witnessed
>>>/vp/napt
is your neighbor
Anonymous No.105658758
Anonymous No.105658766 >>105658836
Anonymous No.105658798
https://youtu.be/8sBC3YCTn_o?t=43
Anonymous No.105658817
>>105658690
>robotron walk engage
Anonymous No.105658836
>>105658766
Anonymous No.105658838 >>105659725
Anonymous No.105658898
>>105658617
Should be officially out soon though.
Anonymous No.105658912 >>105659073
>>105658617
do we know how much quality loss we're dealing with? is this equivalent to Q4 in terms of quality?
Anonymous No.105658987 >>105659033
>>105658625
>applying various snakeoils to wan? yes please
>applying snakeoils to chroma? wtf is wrong with you
Anonymous No.105658995
>>105658597
Those rococo dudes were on to something back then
Anonymous No.105659010 >>105659058 >>105659061
So, to use sage attention in Comfy, do I need to do anything more than pip install sageattention in my venv and launch Comfy with --use-sage-attention ?
Anonymous No.105659017
What is prompt matrix and what is prompt from file? how does each work?
Anonymous No.105659018 >>105659037
https://www.youtube.com/watch?v=JA5FgJ1c8-w
Anonymous No.105659033
>>105658987
Strawman, you don't know my position regarding chroma's tools.
Anonymous No.105659037
>>105659018
migu nooo :(
Anonymous No.105659048
Anonymous No.105659058 >>105659114
>>105659010
No that should be enough.
KJnodes has a node that lets you enable or disable sage attention without restarting comfyui though, bit more convenient.
Anonymous No.105659061
>>105659010
yep, that's all there is to it. speaking of sageattention, we'll get an upgrade of it soon
>>105650088
>>105650107
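if you want a quick sanity check that the venv Comfy actually launches from has it (easy to pip install into the wrong one), something like this works:

# run with the same python that launches ComfyUI
import importlib.util

if importlib.util.find_spec("sageattention") is None:
    raise SystemExit("sageattention missing in this venv, pip install sageattention")
print("sageattention found, launch ComfyUI with --use-sage-attention")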
Anonymous No.105659073
>>105658912
Should be better than Q4 in terms of quality loss
Anonymous No.105659114 >>105659117 >>105659132
>>105659058
Is there any particular reason I would want to disable it ?
Anonymous No.105659117
>>105659114
0 reason at all lol
Anonymous No.105659132
>>105659114
bc it reduces accuracy a little bit. Not as bad as teacache but it does shuffle a few pixels around, might fuck something important up slightly once in every few gens (finger looks weird, eye slightly off etc. you know what I mean)
Anonymous No.105659139 >>105659175
*sip*
Anonymous No.105659152
Anonymous No.105659168 >>105659173 >>105659178 >>105659182 >>105659187
Anonymous No.105659173 >>105659179
>>105659168
is that that redscare chick? looks kinda like her though not quite
Anonymous No.105659175
>>105659139
OY OY OY DOKTOR MEOW!!! HOLY FUKE!!!
Anonymous No.105659178
>>105659168
she would destroy me
Anonymous No.105659179 >>105659184
>>105659173
nigga no....
Anonymous No.105659182
>>105659168
thats a man
Anonymous No.105659184
>>105659179
it does tho
Anonymous No.105659187
>>105659168
he would destroy me
Anonymous No.105659215 >>105659230
https://www.wan-ai.org/ja/models/Wan-3.1
>Wan 3.1
I really doubt they'll make this one local though, prepare for a betrayal
Anonymous No.105659230
>>105659215
seems like a scam (fake / unofficial) website
Anonymous No.105659236 >>105659243
Anonymous No.105659243 >>105659259
>>105659236
see you in 3 days anon :(
Anonymous No.105659259 >>105659275
>>105659243
Oh was that a rule here...... ah well oof.....
huh....
Anonymous No.105659267 >>105659285
I do think alibaba may betray us if they cook something good with a hypothetical Wan 3 though. They have virtually no competition in the open space right now and the hunyuan guys are no longer fans of open source
Anonymous No.105659275 >>105659288
>>105659259
>was that a rule here
well yeah /g/ is a blue board
Anonymous No.105659281
i dont see any nipple or vagene
Anonymous No.105659284 >>105659443
Anonymous No.105659285 >>105659295 >>105659309
>>105659267
How would they "betray" you? 2.1 is here to stay until better local solutions come around, from alibaba or not.
Anonymous No.105659288
>>105659275
wellllll shiiiiii

I don't know shit here then.
Anonymous No.105659295 >>105659318
>>105659285
>How would they "betray" you
by making the next model not local with an API paywall
Anonymous No.105659309 >>105659314 >>105659325
>>105659285
>until better local solutions come around, from alibaba or not.
The only team I can see delivering something comparable to or better than Wan that could be open is BFL (even then it would likely be distilled)
Chinks seem to be all about API these days on the video front, and the west is too busy scamming VCs or producing models cucked by copyright constraints and censorship
Anonymous No.105659314
>>105659309
>The only team I can see delivering something comparable or better than Wan that could be open are BFL (even then it would likely be distilled)
I would be doubtful of that but since they made Kontext pro I might agree that they know what they're doing
Anonymous No.105659318 >>105659332 >>105659342 >>105659354
>>105659295
How is that a betrayal? They don't owe us shit, but unlike huge western tech companies (with the surprising exception of Meta, lol), they've actually given us a fantastic open video model.

I laughed my ass off when Black Forest Labs just memory-holed their own upcoming video model when Wan released, BFL's was obviously very inferior and censored to hell like the Flux model.
Anonymous No.105659325
>>105659309
>a new smaller team forms from 's ex employees
>they release a newer, hotter local model that's on par with paid services
Happened more than once before
>>105659318
>They don't owe us shit
we gave them the free advertisement, what are you talking about? do you really believe those companies release products that cost millions of dollars to the public for free? nah, it was an exchange
>gcc compiler fixes
>some more compile fixes and added a GL wrapper
hmm...
>>105659318
>unlike huge western tech companies (with the surprising exception of Meta, lol), they've actually given us a fantastic open video model.
meta? I'd argue that Mistral gave the community better quality models
>>105659332
>we gave them the free advertisment what are you talking about?
Fuck off you sad leech
>>105659344
cope and seethe, corpo bootlicker
>>105659318
Saying "betrayal" is just a figure of speech / thread lore banter.
The point is that those companies doing a 180 is not uncommon and it sucks, lol

>>105659332
Now that's a stretch, lol

>>105659342
Reminder that Meta never really open sources stuff beyond LLMs. They never open sourced their image and video gen models, and they have one of the best audio gen models in the industry (Audiobox) and never open sourced that either
>>105659354
>that's a stretch
what stretch? which company would say something like: "all right, we spent millions of dollars making that model, how about we release it into the wild without asking for anything in exchange, how does that sound?", do you seriously believe they did that because they wanted to give us a present or something? they did it because they knew that nothing beats organic free advertisement from the local community, that's also why that Krea CEO fag did a bait and switch to get some hype for his new product
>>105659354
>>105659370
this is the wet dream of every advertiser, that random people are willing to say the name of their product over and over on the internet
>>105659348
You're the corpo leech, crying about a potential model perhaps not having a open release.

And there's not even any reason to think they would not release a new model to the public, if nothing else to make western companies seethe, like holy shit they must fucking hate the existence of Wan video.
wan wan
>>105659348
it's a SaaS shillbot that has been operating for months now. he usually spits out the same garbage about how developers are risking their lives to develop AI so we should worship every release no matter how bad and if we don't like it we should subscribe to OpenAI
>>105659382
>if nothing else to make western companies seethe, like holy shit they must fucking hate the existence of Wan video.
based, fuck cucked western pigs
>>105659370
You guys should really stop talking about Krea, it seems like another Mogao situation, lol. If it happens, nice, but until then, they're just another SaaS cloudshit
wanxing so hard right noew
Makes me wonder if NVidia is helping Alibaba fund these open video models, the existence of Wan must have sold so many GPUs. I have so many friends who were like 'yeah, AI images, that's cool, anyway...' but now with Wan video they're like 'holy shit! what GPU do I need to buy ?'
>>105659386
>it's a SaaS shillbot that has been operating for months now.
I really hope he gets paid for that, imagine selling your soul for free lol
>>105659412
local AI users are a speck of dust, if we all stopped buying GPUs it would make no difference. it's more likely nvidia is funding pytorch and AMD to guarantee total enterprise market capture
>>105659382
>if nothing else to make western companies seethe, like holy shit they must fucking hate the existence of Wan video.
Nah, they likely enjoy the free research people publish on these open weights models so they can be parasitic and implement new features / performance improvements into their proprietary models, I know I would

The thing is that companies like Alibaba are playing the long game, they are a cloud infra company, not an API company, unlike those smaller labs which will very likely become obsolete and stop existing in a few years
>>105659425
Nothing in this post made any sense at all, are you a bot ?
>>105659425
>local AI users are a speck of dust, if we all stopped buying GPUs it would make no difference.
but companies buy a lot of GPUs to train local models, so Nvidia gets something from the local ecosystem, not from us directly but from the companies that made Wan or SDXL for example
>>105659433
>they likely enjoy the free research people published on these open weights models so they can be parasitic and implement new features / performance improvements to their proprietary models, I know I would
not only that, but since Wan is Apache 2.0 licenced, nothing is preventing a SaaS company from just finetuning on top of it and calling it a brand new model
>>105659284
happy pride month
>>105659439
nobody is buying enterprise gpus to train local models lol. they're buying enterprise gpus to train SaaS models that they sometimes release openly for publicity. if they didn't think SaaS would work, there would be no purchasing of gpus and thus no local models.
>>105659442
I wouldn't be surprised if Midjourney's new video model is just a glorified Wan fine-tune.
They are a small team with limited resources
>>105659444
>nobody is buying enterprise gpus to train local models lol. they're buying enterprise gpus to train SaaS models that they sometimes release openly for publicity.
what's the difference? Nvidia gets that money anyway
>>105659450
I also believe it's a Wan finetune, they improved it a bit and attached the MJ aesthetic on top of it, desu that's smart if they did something like that
I don't see how those smaller labs will survive at all in the long run. Their moat is entirely temporary, and they are on life support because VCs haven't realized they're in a bubble or how fragile those companies' business models really are.
Those startups will never be able to compete with the IaaS giants on pricing, reach and availability, and Google in particular will always release models that mog theirs anyway, with better prices and inference speeds, running on their custom TPU clusters

Most VCs will never get their money back, since it won't take long for all of those startups to go bankrupt, lol

I think Alibaba Cloud's goal in giving away all those free models (Qwen, Wan and others) is to use whatever the community and researchers cook up to make the inference backend for their APIs cheaper, more broadly available and enticing; a few thousand guys running models on high end gaming GPUs is not going to hurt their overall business, it's a drop in the ocean
I have a small ai gen channel on pixiv and someone's asking if i got the time for a project. Scam?
>>105659620
Possibly, but maybe it's just a way to get around the fanbox AI ban
>still no sageattention2++
BOOOOOOO
>>105658838
naisu
>>105658726
I don't think they're actually behind, do you think they are? I think they are just waiting for a decision to be made on whether or not training these models on copyrighted content is fair use. They already have video models that are on par, or probably better, but those are sitting inside a lab. There's no way ClosedAI doesn't have a model significantly better than Sora.
>>105659665
false advertisement
i want a refund
>>105659665
they said "around 20th of June" so it's probably a bit later than that
What's the best way to minimize inpainting time for a small region that needs to be fixed? If I select "Inpaint area: Only masked", it attempts to draw a whole copy of the prompt into the small space (even if I delete the character details from the prompt, it still tries to add in humans cause that's all these models were baked on). So I have to regen the entire image (which takes minutes) just for that tiny space. Are there some tricks to get around this? Possibly like how tiled upscaling or adetailer works?
>>105659858
Controlnet Tile.
>>105659858
KRITA
R
I
T
A
the king of kong strikes again.
>>105659858
The best quality inpainting is done using the "only masked" method, where it upscales the masked region, applies diffusion to it, then downscales and stitches it back in.
What the other anon said is right. Any kind of upscaling (including inpainting) should have controlnet tile enabled alongside it. 0.45 strength, 0.7-0.9 end step depending on the image. With that, you can bump denoise up to 0.6 or so. You keep the original look while getting a nice, sharp and detailed inpaint.
If there's no controlnet tile for the model you're using, RIP. You're stuck with lowering denoise to 0.3 or so, any higher and you'll get exactly what you're getting now.
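in rough python the "only masked" path is basically this (a minimal sketch; run_diffusion is a hypothetical stand-in for your sampler + tile controlnet, not a real API):

from PIL import Image

def run_diffusion(img):
    # hypothetical stand-in: sample at ~0.6 denoise with tile CN per above
    return img

def inpaint_only_masked(img, mask_box, pad=64, work_res=1024):
    x0, y0, x1, y1 = mask_box
    # pad the masked region so the model sees some surrounding context
    box = (max(x0 - pad, 0), max(y0 - pad, 0),
           min(x1 + pad, img.width), min(y1 + pad, img.height))
    crop = img.crop(box)
    # upscale the crop to the model's native res (1536 for Illustrious 2.0)
    up = crop.resize((work_res, work_res), Image.LANCZOS)
    fixed = run_diffusion(up)
    # downscale and stitch the fixed region back into the original
    img.paste(fixed.resize(crop.size, Image.LANCZOS), box[:2])
    return img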
>>105659904
No fun unless he mocks Jobst
>>105659865
Any particular settings? I'm just messing with whatever I can find here, can't figure out how to get it to do what I want. I've done the controlnet union+ultimate sd upscale thing before, but that still draws every tile even the ones I don't need, and still sometimes doesn't "fix" the masked area in reforge inpaint tab.
>>105659903
ah crap, I've been meaning to check that out, but I've only got a little 3060.. soon...
>>105659908
Ok giving it a shot.. I don't know if I have the "right" controlnet tile thing though?
>>105659926
>SDXL : controlnet-union-sdxl-1.0-promax
>ILLUSTRIOUS: illustriousXL_tile_v2.5_controlnet_ep0_step5000
>NOOB: noob-sdxl-controlnet-tile
SDXL's union works with ill and noob for tile and I didn't notice any real difference or better option when I last tested it, but they do have their own tile models all the same.
With denoise, if you're doing mangled hands/body parts, you can bump it up to like 0.7 or so, and maybe lower end step a little (to 0.6-ish).
>>105659904
>>105659912
>No fun unless he mocks Jobst
he's mocking Icuckz recently, what a gigachad
https://xcancel.com/BillyPacMan/status/1933132173697331472#m
>>105659582
VCs already know that the global scale services of the big tech monopolies are the virtual equivalent of building iPhone assembly plants in India, and that not backing small, independent labs that can pursue novel research or compete quickly on innovation will leave us without a pool of people who can.
not that it matters since I'm already prompting in Chinese.
>>105659952
he's a rich man now.
>>105659970
that's better, now it's karl's money.
>>105659921
Also, you need to change the resolution to 1024x1024 when you inpaint on forge. Pic related is basically how you want inpainting set up. If you were using Illustrious 2.0, you'd set it to 1536x1536. Basically always set it to the max resolution the model was trained at. And make sure Ultimate SD Upscale is turned off too.
>>105659992
she should be thanking her lucky stars it's just milk this time
>>105659986
Thanks, downloading the illustrious tile now.
The problem I'm having is I've got this full body character (normal), and then there's a sword in the water next to her, then there's an extra hand behind it. Every time I try to inpaint it, if I do a full image inpaint, it "works" (but is slow and often leaves the painted area looking jaggy), and all the tile attempts I've done fill it in with another person or flesh-colored blob (tried changing fill/latent noise/latent nothing etc)
will give the settings a shot
check em
>>105660004
Areas like that are tricky because from the model's perspective, with high enough denoise, it looks like a person. Set tile strength higher and/or denoise lower, and change the prompt to "hand holding sword", "hand next to sword", "sword, hand, close up" or something like that.
>>105660013
or that
>>105658422 (OP)
How often does WAI generate creepy stuff?
I've been traumatized by that webm of the Chinese guy's head exploding into spaghetti and gore
>>105660048
You mean wan? Never unless you prompt for it. I've genned several hundred vids at this point and never gotten anything close to that. The worst you see with typical use is deformed privates if you're making coom gens.
Can I animate in Wan in 4:3 aspect ratio or is it all locked to 16:9?
>>105660063
Good to hear
>The worst you see with typical use is deformed privates if you're making coom gens.
I assume because it wasn't trained on it?
>>105660068
kek good one
>generate dozens of naked bitches in wan jiggling their asses using the twerk lora
>meh
>generate dozens of bitches in wan jiggling their asses using the twerk lora, only now they're wearing midriff tops and tight white jean shorts
>diamonds
Life's funny sometimes
>>105660069
any aspect ratio is fine
>>105660085
Post an example pls.
>>105660104
GET YOUR OWN BITCHES
>>105659986
holy moly, it worked, and it was fast! thanks!
>>105660110
can still see the faint outline of where the hand was, run it through inpaint again and it'll go away
>>105660004
>>105660110
Cleanest way to remove stuff like say that hand isn't to try and just inpaint it out, it'll be hit or miss. I just Photoshop it out using the AI powered remove tool or just paint over it in a color close to the background color, then I run it through inpaint to clean it up
>most coom vids I make give the women weird fleshy growths between their legs
>can't escape this futa hell

Any tips on how to prevent this shit? I'm using the WAN NSFW and a motion lora.
>>105660175
>>105660175
looks like you're subconsciously orienting the model towards your real desires
>>105660175
Use a better NSFW anatomy LoRA?
>>105658076
BECAUSE WE HAVE NONE
ITS ALL DECODERS
>>105660211
>ITS ALL DECODERS
not at all, LLMs are encoders + decoders, that's why Hunyuan used llama3 for example, they managed to take only the encoding part of llama3
Can someone smarter than me explain why you have to shuffle tags during lora training? Is the tag order relevant during training?
>>105660340
It probably matters a little bit less if you don't train the text encoder as well. However, the embeddings the encoder provides will contain some information about the order of the words. That's a pattern you don't want the diffusion model to pick up on, so you shuffle the tags.
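roughly what the trainer does to each caption every epoch, a simplified sketch of kohya-style shuffle_caption with keep_tokens (which pins the first tags, e.g. your character token):

import random

def shuffle_tags(caption: str, keep_tokens: int = 1) -> str:
    tags = [t.strip() for t in caption.split(",")]
    head, tail = tags[:keep_tokens], tags[keep_tokens:]
    random.shuffle(tail)  # break positional patterns before they reach the unet
    return ", ".join(head + tail)

print(shuffle_tags("mychar, 1girl, smile, outdoors, night"))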
>>105660234
Llama3 is a decoder only llm. What hunyuan did was a dirty hack to allow some sort of encoding, but it was never properly trained for the task like t5 was.
>>105660353
I want to train some loras and from what I see the 0.1 illustrious model is the best one. Or so I believe. I'm tagging with booru tags and I want the best quality possible.

Should I change anything?
>>105660234
>year 2042
>latest local image gen still uses t5-xxl
>>105660362
>Llama3 is a decoder only llm.
but when you write to a LLM, the llm has to encode your text first before understanding it right?
>>105660376
I just recently trained one as well and did it with Illustrious 0.1, also questioning that choice. However, it ended up generalizing to all Illustrious derivatives I have, so I was convinced.

What's the purpose of the lora you're training? For my character lora, I kept the first two or three tags fixed (which included my unique character token)
>>105659665
>>105659793
>wakeup
>21st
>still no sageattention2++
it hurts bros

>>105660069
glorious 640 x 480 works perfectly for my 4:3 monitor
it's too hot to gen
i'm dying rn but i want to gen so badly.
>>105660175
kek, yea. make sure your image shows visible butthole and vag otherwise wan will make shit up. great for twerkan
I wish we had an updated rentry for lora training
Those civitai guides are all so bloated, hell most of them aren't even local or up to date
>>105660411
What is better in sageattn2++? from what chatgpt shat out at me, it just seems better for 40/50 series cards.
and wouldn't sageattn3 be better overall?
>>105660430
>What is better in sageattn2++?
https://arxiv.org/html/2505.21136v1
>Our experiments show that SageAttention2++ achieves a 3.9Γ— speedup over FlashAttention while maintaining the same attention accuracy as SageAttention2.
Since SA2 has a 3x speedup over FA, that means SA2++ has a ~30% speedup over SA2 (3.9 / 3.0 ≈ 1.3)
>trying wan like every day
>gen 1 fin
>gen 2:
>the previous gen thumb preview started getting artifacts
>screen froze
>scanning for video output.jpg
>panic
>went back but holy scare
>>105660447
And only 5090 benefits, not 4090 or below
>>105660484
>And only 5090 benefits, not 4090 or below
no, you're talking about SA3
>>105660447
>>105660484
>>105660496
Oh sweet, 4090 still getting love!
>>105659959
KEK
https://x.com/bluedingo/status/1934428019935879358
https://x.com/LAME1116422/status/1933239490795221362
>>105660588
>twitter drama
who cares? this site is a cesspool
>>105660588
oh my god fuck off who cares.
Wan is so retarded, kek.
>>105660004
to remove an object easily in inpainting, choose latent noise, denoising strength 1, whole picture. then just put something like "background" in the prompt. read that on reddit once and it works very well.
>>105660679
Wan i2v struggles if input image has no depth
>>105660410
I want to replicate characters in a certain artstyle
>>105660679
that went from d'aaw to AAAAAA real fukin fast
Is it possible to include Ashley TTS to generate voice for the WAN videos?
How can I prevent the camera movement and make the wiggling more dynamic? I have the lora strength set to 0.8.
>>105660829
I see. In that case, you should probably tag absolutely everything using booru tags, which it sounds like you are already doing.

My advice would be to just get started with something. You can always revise the tagging if the training doesn't go well.
>>105660234
Lumina also uses Gemma 2 2B. Using an encoder/decoder model instead of t5/clip significantly improves prompt comprehension.
>>105660857
You would need a video to text model that can generate subtitles and timing from lip reading, not to mention people talking off screen. Rather improbable anon.
>>105660965
the camera is still/motionless, etc
>>105660986
I curated my tags and I'm training right now, let's see what I get...
Noob here who so far has only used stablediffusion to gen static images. How do I into video generation? I have a 3090
Why can't I do video batches? Does it need to stuff all of them into my vram at once?
>>105661078
read the OP ffs.
https://rentry.org/wan21kjguide
>>105661078
https://github.com/deepbeepmeep/Wan2GP
save yourself some headaches. dunno why people recommend comfy to complete noobs it's just pain
>>105661100
I can't blame the anon because every AI general's OP is bloated with 90% useless or out of date information

>>105661135
Same reason why I'd recommend someone wanting to learn to paint to just dive in with a blank canvas instead of buying one of those "paint inside the lines" kits
>>105661152
but ani is deprecating comfyui frontend anyways. there isn't a reason to get that into comfy anymore
>>105661117
ponyslop face. ya hate to see it.
WHERE SAGE ATTENTION2++ WHERE NAG OFFICIAL *grabs club*
>>105661161
I don't know about you but the time is now.
>>105661196
>WHERE SAGE ATTENTION2++ WHERE NAG OFFICIAL *grabs club*
where krea, where kontakt dev
>>105661196
I have a hard time believing a typescript UI is going to last much longer with what they've done to it
>>105661052
If you haven't done so and you still have the option to abort and restart: make sure to include some good prompts for example generations every couple of epochs.
The mistake I made was genning only one image from a fairly short prompt, so I didn't realize the lora had already finished learning 100s of epochs earlier.
>>105661166
got bored of 2d shit. 3d is more fun.
>>105661135
not him but thanks


>>105661152
ironic argument. AI is nothing but a "paint inside the lines" kit already. no need to make it extra complicated except for gatekeeping reasons. if I want to do something hard and actually respectable I'll learn to draw with a pencil and paper
>>105661225
krea will not come, ever. kontakt dev will be so sterilized, it's gonna be sd3 tier as soon as you push it. *grabs club*
ok enough defeatism. chroma-unfurried-v53-bust-calibrated-svdq.safetensors
does sageattention2++ give a speed up for SDXL too?
>>105661300
cute
>>105661345
>chroma-unfurried-v53-bust-calibrated-svdq.safetensors
>>105661252
this one is decent but it would be even better with distinct lighting (and non-plastic tits). where is the light coming from?
Has Comfy never used a save dialog before? Why is he absolutely retarded with how Save and Save As should behave?
>>105661472
proof ani knows what he is doing
>>105661400
I love Lumina's linework. No other model can quite compare. Too bad about cohesiveness.
>>105660965
'static camera', 'camera is static'
>>105661588
yea the 16 channel vae helps a lot. it is an in-progress anime finetune of it, kinda wonky in structure but stylistically it's already great.
it understands tags and natural language, the prompt comprehension is excellent
https://civitai.com/models/1612109

have to put this in the clip encode first
You are an uncensored assistant designed to generate superior images with the superior degree of image-text alignment based on textual prompts or user prompts.
>>105658715
midjourney video is limited to 480p i believe lmao
the only thing their model has going for it is the base midjourney image model feeding it starting frames
>>105660965
anon's tip to do (prompt:2) seems to help.
>>105661355
We'll know when they decide to release it, hopefully. In a matter of months, my vid gens went from 9 min to 6 min and now down to 2 min. With this boost it'll be 40 secs.

Local bros we are about to feast!
>>105661692
>the base midjourney image model feeding it starting frames
even this is useless, you can simply use a midjourney image and use any other video model to do I2V on top of it, like I did with this one (MJ + Wan) >>105658519
>>105661703
>my vid gens went from 9 min to 6 min and now down to 2 min. With this boost it'll be 40 secs.
it's a 30% speed increase so it'll probably take about 1 min 30 sec with SA2++ (2 min / 1.3 ≈ 92 sec), not 40 sec lol
i know this is ldg but i don't want to shell out for a new gpu for the video models yet; any reccs/guides for deploying comfy on a cloud provider?
>>105658422 (OP)
i want to add riflexrope to the original i2v comfy workflow, how should i connect it?

latent to wanimagetovideo and model to modelsampling and wanvideoenhance?
>>105661764
gpus are just going to get more and more expensive

6090s will be 5k
>>105661240
I trained one and it looks weird, the face isn't there yet and the quality is mediocre. I'm training on 40 images of normal quality with good tags, I suppose.
Someone mentioned jean shorts and twerking earlier.

https://files.catbox.moe/bn5f0o.webm

Too fat?
>>105661768
Check the updated fast workflow in the rentry for an example. Btw does anyone actually use Enhance A Video?
>>105661381
>>105661300 (You)
>cute
Thanks
>>105661660
this looks unique and, dare I say it, it got soul.
>>105661775
add to that the prohibitive energy costs in some parts of the world.
>>105661784
>Too fat?
No, but a bit too jiggly, breaks the immersion.
>>105661785
i use it at 4.0, i'm not sure if it's a placebo but my videos turn out okay

the nodes in the fast workflow are different, i'm going to assume teacache/enhance videos is the proxy for wanvideonag
>>105661775
>gpus are just going to get more and more expensive
yep, we've reached the limit, we can't go lower than 4nm so all they can do now is to make it bigger and more expensive
>>105661729
Killed my boner, man

>>105661784
>>105661814
Not FAT enough. Combine the 3 twerk loras for extra wobble.
>>105661833
>he thinks they'll increase pefromance
lol
>>105661850
they always increased performance, even the transition from the 4090 to the 5090 had a performance improvement (albeit small)
>>105658519
midjourney sistas.. this base image looks like shit
>>105661937
>midjourney sistas.. this base image looks like shit
https://getlatka.com/companies/midjourney
>In 2025, Midjourney’s revenue reached $500M up from $300M in 2024. The company previously reported $300M in 2024, $200M in 2023, $50M in 2022. Since its launch in 2022, Midjourney has shown consistent revenue growth, reflecting its expanding user base and increasing adoption across various industries.
yes, npcs like slop
The term you're referring to is **Net Income** (also known as **Net Profit** or **Bottom Line**). Here's the breakdown:

- **Total Revenue (Sales):** $150 billion (money earned from selling products).
- **Expenses:** $200 billion (costs to run the company, salaries, etc.).
- **Net Income:** $150B (revenue) - $200B (expenses) = **-$50 billion**.

Net income represents the actual profit after all expenses are deducted from total revenue. Revenue alone refers to the total income generated from sales, while **net income** is what remains after costs.
MJSissies.. not like this
>>105661985
they run a small team tho, they make cash. still, midjourney:
>>105662010
whether this shit is midjourney or not, holy shit its bad
>>105662018
>too retarded to read the filename to know the source
MJHaters... not like this
>>105661784
SU_Twrk is too jiggly, while Real Twerk is a little too stiff. Try su_twrk at 0.5, RT at 0.8
>>105661965
>In 2025, Midjourney’s revenue reached $500M up from $300M in 2024.
Revenue is not profit, it costs a shit ton to train these models, and since MJ overtrains like morons, it costs them even more.

And now they are being sued by Disney et al, and they will lose, since again they overtrain to the point of their models producing near-1:1 shots of existing movie IP.

They are beyond retarded.
I trained my Lora, and the sample images look low quality, crunchier and lower resolution looking, even though they are the same resolution as my dataset.

Any ideas?
>>105662057
>And now they are sued by Disney et al, and they will lose
can you also tell us when they'll release Sageattention2++?
>>105660429
I have only trained a few loras but it seems like it would be hard to write a general purpose guide, as parameters and captioning often change between models.
>MFW MJcels think appending "epicrealismXL" to a dogshit mj gen makes it good
just admit it looks like a PS2 cutscene, anon
>>105662068
What do you mean by sample images? The ones made as the LoRA trains? Are you still training it right now?
>>105662068
what base model? show some examples
>>105661300
>>105662084
but that's not a MJ image though, it's just an epicrealismXL image, that's all
>>105662095
really impressive, I'm pretty sure Wan wasn't trained on ice cream camel, yet it nailed that shit lol
>MJcels moving goalposts to Narnia
>filename = holy scripture now?
that is an epicrealism sdxl gen, too lazy to doctor my filenames. was supposed to be funny because of the 'slight disgust' expression in combination with the overall aesthetics (bimbo9000). ah well
>>105662093
I'm using illustrious 0.1
>>105662089
They resemble the dataset, but in lower quality. Less resolution, like an image uploaded to 4chan, with worse quality overall. The face looks mediocre. It's like a mediocre artist trying to replicate the original image. The tags are all there, I can see it's being trained, just low quality.
>>105662072
Well it's obvious they will lose. They generate frames from Disney et al, movies which are practically 1:1 with the original.

The whole defence against copyright infringement rests upon how transformative a derivative work is, and they aren't even a little bit transformative.

This is of course why they are being targeted with a lawsuit, it's a slam dunk win.
>>105662144
That's normal for ill. Ignore the quality. What you do is, set the sample images to use the same seed, then watch for when the sample images start to look identical at that same seed. No point training past that point, it means the model has learned all it can from your set and is starting to overtrain.
Also, why ill 0.1 and not 1.0?
>>105660429
Is a guide really needed when you can yank the metadata from a good lora and simply copy the settings?
>>105662095
please, more cute busty cake girls covered in syrup, nobody can fap to this!
Did the Mayli anon deliver?

do we have a download link for his Mayli pics?
>>105662144
>>105662162
Why 1.0 and not 2.0?
>>105662144
Kohya's sample images always look shitty for illustrious, might be the same for other models too. All my LoRAs look like shit in the samples, but then look fine genned in comfy and reforge. I only use sample images to see how well it's learning, like >>105662162 guy said
>>105662144
Optimizer: AdamW8bit
LR scheduler: cosine
Loss type: L2
Learning rate: 0.0003
Unet learning rate: 0.0003
Min SNR gamma: 8
1024x1024 resolution
40 images
Training precision: fp16
Batch size: 4
5 repeats in the dataset, shuffling all tags
Gradient accumulation: 1
2000 steps
I'm sampling with DPM-Solver++ every 100 steps
Warmup ratio: 0.05
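quick step math on those settings, for anyone checking:

images, repeats, batch = 40, 5, 4
steps_per_epoch = images * repeats // batch   # 200 samples / 4 = 50 steps per epoch
epochs = 2000 // steps_per_epoch              # 40 epochs over the whole run
print(steps_per_epoch, epochs)                # sampling every 100 steps = every 2 epochs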
>>105662183
Anons told me 2.0 sucks, told me it's better to train on 0.1
>>105662176
is that the facial abuse girl? i can fix her and inherit her father's wealth.
>>105662188
I'm using Lora easy trainer, does it happen there too? I suppose I will have to test in forge.
>>105662199
Nobody uses 0.1 anymore. You either use 1.0 or some people use wai, which was trained on 1.0
>>105662158
>This is of course why they are being targeted with a lawsuit, it's a slam dunk win.
I doubt they're gonna win. Trump (and the US in general) is aware that they're racing China on AI and that they must win this one, and China doesn't give a fuck about IP; if the US shoots itself in the foot with copyright faggotery, it's gonna lose the technological war of the 21st century
https://arstechnica.com/tech-policy/2025/03/openai-urges-trump-either-settle-ai-copyright-debate-or-lose-ai-race-to-china/
>If the PRC’s developers have unfettered access to data and American companies are left without fair use access, the race for AI is effectively over
Is wai 14.0 better than NooBAI?
>>105662209
I've never used it, dunno. Just wait and give it a proper test once it's done, 99% sure it'll be fine
>>105662199
2.0 is the most mature version. It's better than the other versions.
>>105662232
Noob is overrated and the backgrounds suck. Tryhard artfags love it though because it's trained on all their favorite nippon artists, and they get their jollies off making gens that look hand drawn. For gooning and general coom, wai is easier to use and looks better
>he can't make gens that look hand drawn
this is the only gen out of 16 that had her in a 2-seater, i guess big helicopter interiors are more represented in the training data

>>105662214
you have to remember the big Jew in the room: Disney (and to a lesser extent the New York Times et al)

This really is the Jewish-American civil war, between the Old Money Media Jews and the New Money Tech Jews, and I'm not even being antisemitic with this description
>>105662212
>Nobody uses 0.1 anymore
Training lora on 0.1 gives compatibility across several Illustrious 0.1 finetunes. You don't actually have to use 0.1.
>>105662232
wai is noobai epred with a bunch of loras mixed in
>>105662261
>go to booru
>find artist whose art looks hand drawn
>insert artist:hippo into noob, add pretentious tags, rinse and repeat
>>105662232
People say different things, some even saying the base illustrious models are better. I don't know whom to trust at this point.
>>105662291
>doesnt post the output
i 1nder y
>>105662266
>Illustrious 0.1 finetunes
Obsolete.
>>105662308
My GPU is for coom, not artfaggotry
>>105662295
>People say different things, even saying the base illustrious models are better, I don't know whom to trust at this point.
You trust your own eyes. Download the checkpoints and experiment with which ones give you the results you want to see.

>>105662309
>Obsolete.
I wish!
I just make finetunes of noobai vpred for my "loras." Yes it takes a ton of space and yes it takes a really long time, but they come out really good.
>>105662328
concession accepted
>>105662295
You train on whatever model you intend to use it on. It's not a difficult concept.
>>105662352
poopcession accepted
>>105662295
people still use sd1.5 embeddings in their gens. clueless is the norm when it comes to stable diffusion.
>coom and artfaggotry are mutually exclusive
erm... anonie?
>geometric shape you dirty slut I am coooominggg
>>105662401
M.C. Edger
Of course anon will say the difficult to use models suck because they have trouble using them. But they yield superior results in the right hands.
https://rentry.org/localmodelsmeta#noobai
>Noob is often considered a "pro" prompting model, given it's much more difficult to produce "quality" outputs with basic prompts. It reacts poorly to simple tagging, especially without artist tags, but excels when you give it very detailed prompts for the character(s), pose, setting, lighting, etc, using as much detail as possible.
>>105662415
No lies detected
>>105662415
>even discussing this
>>105662214
>I doubt they're gonna win
They will, because MJ overtrained their models way past the barrier for copyright infringement.

Makes you wonder if they really are that stupid or if they are controlled opposition.
>>105662415
I never said it sucks, I said it's an artfaggot model for fart sniffers. Whoever wrote that is definitely high on his own supply too. Bet he gets rock hard when he sees le serious face anime girl in stark lines with shades of red and black kek
>>105662415
https://rentry.org/localmodelsmeta#illustriousxl desu it should mention that 2.0 has the largest dataset of all the versions, 20m images compared to 1.0's 10m
also its ability to output 1536x1536px native
>>105662453
>overtrained their models way past the barrier for copyright infingement
does that look familiar to you?
https://www.youtube.com/watch?v=5yI9wEys2dc&t=749s
Fresh

>>105662481
>>105662481
>>105662481

Fresh
>>105662454
>noobai
>artfag model
LMAOEOEOEOEOEOOE