
Thread 105774047

336 posts 96 images /g/
Anonymous No.105774047
/ldg/ - Local Diffusion General
Pinnacle of God's Beauty Edition

Discussion of Free and Open Source Text-to-Image/Video Models

Prev: >>105769424

https://rentry.org/ldg-lazy-getting-started-guide

>UI
SwarmUI: https://github.com/mcmonkeyprojects/SwarmUI
re/Forge/Classic: https://rentry.org/ldg-lazy-getting-started-guide#reforgeclassic
SD.Next: https://github.com/vladmandic/sdnext
ComfyUI: https://github.com/comfyanonymous/ComfyUI
Wan2GP: https://github.com/deepbeepmeep/Wan2GP

>Checkpoints, LoRAs, & Upscalers
https://civitai.com
https://civitaiarchive.com
https://tensor.art
https://openmodeldb.info

>Tuning
https://github.com/spacepxl/demystifying-sd-finetuning
https://github.com/Nerogar/OneTrainer
https://github.com/kohya-ss/sd-scripts/tree/sd3
https://github.com/derrian-distro/LoRA_Easy_Training_Scripts
https://github.com/tdrussell/diffusion-pipe

>WanX (video)
Guide: https://rentry.org/wan21kjguide
https://github.com/Wan-Video/Wan2.1

>Chroma
Training: https://rentry.org/mvu52t46

>Illustrious
1girl and beyond: https://rentry.org/comfyui_guide_1girl
Tag explorer: https://tagexplorer.github.io/

>Misc
Local Model Meta: https://rentry.org/localmodelsmeta
Share Metadata: https://catbox.moe | https://litterbox.catbox.moe/
Img2Prompt: https://huggingface.co/spaces/fancyfeast/joy-caption-beta-one
Samplers: https://stable-diffusion-art.com/samplers/
Txt2Img Plugin: https://github.com/Acly/krita-ai-diffusion
Archive: https://rentry.org/sdg-link
Bakery: https://rentry.org/ldgcollage | https://rentry.org/ldgtemplate

>Neighbours
https://rentry.org/ldg-lazy-getting-started-guide#rentry-from-other-boards
>>>/aco/csdg
>>>/b/degen
>>>/b/celeb+ai
>>>/gif/vdg
>>>/d/ddg
>>>/e/edg
>>>/h/hdg
>>>/trash/slop
>>>/vt/vtai
>>>/u/udg

>Local Text
>>>/g/lmg

>Maintain Thread Quality
https://rentry.org/debo
Anonymous No.105774062
Cream
Anonymous No.105774069
>>105774058
BTFO
Pay up son
Anonymous No.105774082 >>105774091
we finna fight the entire thread again??
Anonymous No.105774087 >>105774120 >>105774178
all my gens looks like shitty western cartoons now
Anonymous No.105774091 >>105774093 >>105774095 >>105774144 >>105774177
>>105774082
i figured out that the schizophrenic baker puts his own images in the collages
he never puts text with his images
Anonymous No.105774093 >>105774144
>>105774091
>citation needed
Anonymous No.105774095 >>105774144
>>105774091
source: it came to me in a dream
Anonymous No.105774112 >>105774164 >>105776386
>>105770040
> ======PSA NVIDIA FUCKED UP THEIR DRIVERS AGAIN======
> minor wan2.1 image to video performance regression coming from 570.133.07 with cuda 12.6 to 570.86.10 (with cuda 12.8 and 12.6)
> I tried 570.86.10 with cuda 12.6, the performance regression was still the same. Additionally I tried different sageattn versions (2++ and the one before 2++)
> reverted back to 560.35.03 with cuda 12.6 for good measure and the performance issue was fixed
> picrel is same workflow with same venv. the speeds on 560.35.03 match my memory of how fast i genned on 570.133.07
> t. on debian 12 with an RTX 3060 12GB

Could you share the workflow and the input picture please? Want to test it on Arc B580.

And what is your ram? I have to restart comfy after each 4th gen or it OOMs for ram during vae decoding of the 5th.
Anonymous No.105774120 >>105775594 >>105775657
>>105774087
you did this to yourself
this is your doomed future you created
>>105773833
>>105773848
WELL?!?
Anonymous No.105774129 >>105774137
>>105773897
If she chew Bubblegum (Crisis)
Anonymous No.105774134 >>105774174 >>105774600
Anonymous No.105774137 >>105774165
>>105774129
Better.
Still ugly as shit though
Anonymous No.105774144
>>105774091
>>105774093
>>105774095
The source = open your fucking eyes bitch
Anonymous No.105774153 >>105774161 >>105774335
>generate images on tensor
>shidpost them in /ldg/
>??????
>profit
I have been doing this for months and no one even knows
Anonymous No.105774161 >>105774335
>>105774153
no one care you're generating a local model render through API, as long as it's from a local model and not 4o it's all right lol
Anonymous No.105774164 >>105776840
>>105774112
>I have to restart my firehazard pc repeatedly while genning
Uh oh
Anonymous No.105774165 >>105774178
>>105774137
>still ugly as shit though
Anonymous No.105774174 >>105774208 >>105774236
>>105774134
>tfw Lora is trained on your ex gf
Anonymous No.105774177 >>105774600
>>105774091
>he never puts text with his images
why put text
Anonymous No.105774178 >>105774246
>>105774165
>>105774087
Anonymous No.105774182 >>105774187
Often the controlnet preview looks better than the final gen
Anonymous No.105774185 >>105774261 >>105774273 >>105774559 >>105774569 >>105774975 >>105775424
Can this be combined with SageAttention though?
https://www.reddit.com/r/StableDiffusion/comments/1lpfhfk/radial_attention_onlogn_sparse_attention_with/
Anonymous No.105774187
>>105774182
Screenshot it, I do it often
It could just all be in your head,
It's like western vs Japanese videogame boxart
Neither are bad they're just different
The brain wants the different one also
Anonymous No.105774198
anime girl is holding a book with the text "how to gen 1girls" in scribbled font.
Anonymous No.105774204
pool tags for illustrious?
also need a way to sort artists by nationality
Anonymous No.105774208 >>105774230 >>105774243 >>105774354 >>105774600 >>105775555
>>105774174
>>tfw Lora is trained on your ex gf
damn dude I just interrogated some old playboy model images
Anonymous No.105774224
>>105773922
>MY PC!
>anon, no! it's too late!
Anonymous No.105774230 >>105774265
>>105774208
It's the eye shape and cheekbones kek
>>105773834
>>105773843
>>105773867
>spoon feeds you
https://civitai.com/models/1620140/matrix-bullet-time-camera-effect-wan21-i2v-lora
Anonymous No.105774236 >>105774249 >>105774335 >>105774507 >>105776072
>>105774174
no need for that anymore, just one image is enough to get what you want with kontext
Anonymous No.105774243 >>105774265
>>105774208
If you put a Covid mask on her it's 1:1
>>105774178
Trying backgrounds to see if that's any better
>>105774236
Years of academy training wasted manually editing/blending multiple layers in gimp kek
>>105774246
dont bend the knee to these bastards make some aeon flux / voltron 1girls
>>105774185
>Can this be combined with SageAttention though?
it will
https://www.reddit.com/r/StableDiffusion/comments/1lpfhfk/comment/n0vguv0/?utm_source=share&utm_medium=web3x&utm_name=web3xcss&utm_term=1&utm_content=share_button
>Radial attention is orthogonal to Sage. They should be able to work together. We will try to make this happen in the ComfyUI integration.
god bless those chinks, they can't stop deliver good shit
>>105774243
>>105774230
>wahh i dated a hot girl wahh
fucking normie
>>105774185
>>105774261
Can RingAttention be used on image models as well?
>>105774259
The schizophrenic baker will get his way no matter what
>>105774291
he'll just say that he's watching you until you get paranoid enough to leave forever and stop posting kek
>>105774295
>>105774291
Local diffusion?
>>105774236
>>105774161
>>105774153
LOCAL diffusion??
>>105774335
why did you quote me? when I said "Kontext" I was talking about Kontext Dev
when i generate videos it uses my 16gb ram. do you think i can still play old emulation games in the meanwhile or it will break everything? keep in mind i'm technologically dumb
>>105774208
this is a good one, nice.
wan fun inp vs flf2v, which one is better?
also
wan fun control vs vace, which one is better?
>>105774345
only one way to find out. emus usually hog the cpu and if the game has a small mem footprint you *should* be fine. you can also instruct comfyui to leave a bit of room vram wise
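On the "leave a bit of room vram wise" point: recent ComfyUI builds expose launch flags for exactly this. A sketch only; flag availability depends on your ComfyUI version, so check `python main.py --help` first:

```shell
# leave ~2 GB of VRAM free for other programs (e.g. an emulator)
python main.py --reserve-vram 2.0

# or force more aggressive offloading of model weights to system RAM
python main.py --lowvram
```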
>>105774363
>wan fun control vs vace, which one is better?
vace is the goat
>>105774354
it's the real one lol
how do I generate two photos of the same person and same background but with different poses?
>>105774391
o I see so that is the scan you used. good shit, me like. found her, hah. Katariina Souri. https://www.listal.com/viewimage/1267543
>>105774433
grok is really nice for making prompts out of those. much more accurate than joy-caption, just doesn't accept nudity
>>105774449
we had this discussion a while ago. I went through a number of vision models via ollama/comfy but the results were pretty meh esp. with nsfw content, joy is alright but well its also pretty large. someone recommended I try gemini via api (semi-free with a generous daily quota) and it's amazing and zero censorship. also perfect for using it w/ large models because nothing needs to be loaded into the vram. (I haven't fed it real hc or other extreme content tho because I don't gen that shit). supposedly the large minicpm-v barely fits into 24gb vram and is pretty good.
>>105774493
just use kontext dev to get the same exact character bro >>105774236
>>105774507
don't tell me what to do bitch
>>105773210
>anons who post here go on to make money from their gens
based
>>105774493
>gemini via api
thanks for the tip, gotta try it next
>>105774516
>seething this hard over an advice
take some pills you mentally ill retard
>>105774517
and on and on it spins
>>105774517
that sounds like a funny fan-fiction, but I have yet to see such thing happen in real life, even once kek
>>105774531
>no one is making money using Ai
uh oh stinky uh oh melty
>>105774449
ignore the part about gemini being uncensored, just threw a nude gen at it and got a refusal. weird, had it accurately describe various sex toys without issues but a mumu and tits is where it draws the line.
>>105774507
I haven't even installed it yet. next week when this fucking heatwave is over I'll git to it
>>105774537
strawman, read the post again, it says "an anon leaves /ldg/ and then makes a career out of his gens" >>105773210
>>105774507
Kontext is still quite hit or miss with likeness and it often adds the dreaded plastic skin since it's trained over base Flux
Plus it cannot do NSFW/porn like Chroma does.
But ideally, for likeness you'd actually train a Lora with Kontext as the base
>>105774545
>likely
you might want to learn how to read
>>105774185
>Radial Attention enables 4× longer video generation with LoRA tuning, outperforming dense attention in vision rewards, while achieving 3.7× speedup and 4.4× lower tuning costs.
So you could do 20s video with Wan for the same amount it takes to do 5sec now? Yeah, sure
I'm tired of all these papers which end up being snake oil doing nothing
>>105774556
>I admit it's a fan fiction
thank you
damn bro that troll nigga last thread trolled so hard his shidpoasts are still being linked
>>105774559
no, they said it's 4x faster for training, but for inference it's 1.9x faster, look the video it says it all >>105774185
>>105774561
>I admit my reading comprehension is quite poor and I don't understand probability, assumptions, or likeliness whatsoever
>>105774578
>I admit my reading comprehension is quite poor
you sure do

>>105774537
>>no one is making money using Ai
>>105774545
>strawman, read the post again, it says "an anon leaves /ldg/ and then makes a career out of his gens
>>105774578
>assumptions
a.k.a, a fan fiction
I accept your apology again.
>hr when someone gets fired
>>105774391
>>105774208
>>105774177
>>105774134
reported for avatar-posting :^)
>>105774600
based, death to chromakeks
>>105774598
ngl i have been fuggin ROASTED by hr many times
>>105774600
a bitter C U N T is what you are, rocketboy. and I was right, you are a psycho
>>105774608
>>105774600
I hate you disgusting dickheads so much it's unreal
>>105774615
>>105774616
Sometimes I wonder who the most mentally ill is here
>>105774615
nigga is the rocketgirl in the room wirh us right now?
>>105774615
>>105774616
>avatarfag seething noises
the best sound in the world
>>105774626
>>105774588
Speaking of assumptions…
temperate today, perfect weather to bake a lora
>>105774615
why would R-avatarfag report people for avatarfagging, he's all for this lol
>>105774634
>tfw he is baking a Lora of my ex gf
>>105774422
help
>>105774598
I hate HRs so much it's unreal
https://www.youtube.com/watch?v=q6iGllJ-3aQ
>>105774646
by typing prompts??
>>105774643
you got a problem with that?
>>105774653
you better hurry up it'll be illegal soon
>>105774642
rgal is rangebanned for cunnyposting
>>105774648
>>105774698
>hair on her tongue
ewww...
>>105774707
happens if you smoke too many cigarettes
>>105774749
Low testosterone
is chroma not shit yet?
>>105774767
Judging by the last two threads, probably no
>>105774767
it'll never happen, it's distilled now
>>105774777
>probably
careful we don't understand Inference here or assumptions
>>105774779
why does compute always land in the hands of the unqualified
>>105774782
>assumptions
you mean fan fictions?
>>105774785
In a word? Outsourcing
>>105774800
do you have to take the bait every single fucking time? ngl nigga annoying for real
wan 2.2 will be a nothingburger btw
>>105774811
Pls no
>>105774809
>I was pretending to be retarded
nah, you're naturally like that
>>105774811
why would it? they perfectly nailed the previous version, they aren't a team of unqualified retard like that horse fucker >>105774779
>>105774598
>hr
Not ugly enough
>>105774698
Lore accurate
>>105774811
I would be very happy if they at very leas/t retrained the whole thing on better captions OR fine-tuned it to make 8/10 second long videos instead of 5
>>105774811
I can feel this model will be unified, like Kontext, like it'll do t2v, i2v, r2v, all in one model, that's the future
>>105774821
it'll be fine as long as expectations are kept in check. go in expecting wan 2.1 with little tiny improvements and you'll be happy
>>105774835
I'd be okay with that, I don't see where else that can take it without adding billions more parameters and abandoning local
>>105774835
It's the future, but will it be as efficient as separate dedicated models for t2v, i2v, r2v would be?
The size of the model is a factor for local use, i'd be happy to have 3 separate models I can run wholly in 24gb than 1 unified model i have to use gguf to fit in. But i get your point.
>>105774846
or it's exactly the same but works faster, that'd be cool too
You niggas have to always assume those guys are not doing things for people with gaming GPUs, they are training models meant to be available on their API services, releasing the weights is a mere afterthought
So I doubt they are training models to make them faster or some shit except if there is a fundamental arch change
>>105774909
BFL? no shit
>>105774879
my wife
>>105774779
genuinely curious how chroma would've went if he just trained on dev at 1024x to start, no de-distillation or anything. i think that pixelflow finetune did it and it worked pretty well, though it was a stylistic finetune rather than a concept one. where is he getting the money for these experiments? he's constantly changing shit all the time as if he has unlimited budget yet the donation page hasn't even reached ~20 epochs worth of funding
>>105774914
Any major lab including Alibaba
If my second 3090 is running at x4, should I just train using only the 3090 that's running at x16 and ignore my second gpu?
Ain't no way alibaba is going to keep releasing open source models forever, at some point they will just release a Wan 3.0 api only
>>105774921
>genuinely curious how chroma would've went if he just trained on dev at 1024x to start, no de-distillation or anything.
the issue is that going from 512x512 to 1024x1024 is 4x slower :(
>>105774934
>keep releasing open source models forever
>he doesn't know about Wan 2.1 Pro
>>105774366
ty
>>105774422
Not possible.
Give up.
>>105774819
This is Ai? Wow
>>105774422
Flux Kontext does that out of the box
>>105774948
kek
>>105774950
>>105774909
surely china is making progress on their own hardware, right? there is no way they're content paying for overpriced import-nerfed nvidia
>>105774950
>This is Ai?
yes
>Wow
Ikr
>>105774185
https://github.com/mit-han-lab/radial-attention
>Wan2.1-14B, HunyuanVideo, and Mochi-1 are supported for fast video generation with high quality under 1-4× video length
>1-4× video length
>Release LoRA checkpoints for longer-video generation

While exciting, I'm a little bit concerned about OOM. So in my understanding, this allows for continued generation WITHOUT the degradation beyond the Wan 5 sec limitation. So you're saying, if I load up 324 frames (20 secs), this should generate without the errors? Currently, if I load anything past 230 frames, I OOM.
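As a sanity check on the frame math: a rough sketch assuming Wan 2.1's documented 16 fps output and its 4k+1 frame-count constraint; `n_frames` and `latent_frames` are illustrative helpers, not from any repo:

```python
FPS = 16  # Wan 2.1 generates video at 16 fps

def n_frames(seconds):
    """Round a duration up to the nearest valid 4k+1 frame count."""
    f = round(seconds * FPS)
    return f + (-(f - 1) % 4)

def latent_frames(frames):
    """Temporal latent length: the VAE compresses 4 frames per latent step."""
    return (frames - 1) // 4 + 1

print(n_frames(5), latent_frames(n_frames(5)))  # the usual 5 s gen: 81 frames, 21 latent steps
print(n_frames(20))                             # ~20 s needs 321 frames, roughly 4x the memory
```

So "324 frames = 20 secs" lines up with the 16 fps rate, and a valid count near it is 321.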
>>105774960
They are just starting to make their own EUV machines and semiconductor foundries, it will take some time until mass production
>>105774975
it means that this method will help the model not produce shit if you go for more than 5 seconds, but it also means that you have to be able to handle the additional memory if you want to go for something longer of course
>>105774960
>import-nerfed nvidia
aren't people on weibo bragging about importing grey h200s
If I install Comfy portable, can I set it up so it opens in it's own browser window and not just in a tab?
https://www.youtube.com/watch?v=2PkMO3yVz7g&list=PLHRLDTelHSJpnC7ML-kNvPWhSyXH4nXzt
>>105774998
have two browsers
>>105774955
>local
>>105775033
?
Kontext Dev is local
>>105774987
Ah, figured that would be the case. It's a good thing I'm upgrading, jej
>>105775017
Damn Rene
>>105775044
>upgrading
unless you're getting a RTX PRO 6000, you wont have enough
>>105775043
I'm waiting for nsfw lora for it
>>105774987
>>105775044
>>105775050
won't the memory requirement be less of a pain if they went for O(n log n) instead of O(n²)?
>>105775059
There is one to remove clothes, but the nipples look weird since the base model is censored and it's hard to train the entire concept of nudity
Generating shit on tensor.art now costs more credits. Sucks.
https://openart.ai/workflows/amadeusxr/change-any-image-to-anything/5tUBzmIH69TT0oqzY751

neat multi image workflow with some style presets you can enable/disable

pixel art option on: anime girl is holding a book with the text "how to gen 1girls" in scribbled font.
I am still waiting for a method of reliably generating shit that is NOT slow motion while using the lightx2v lora
>>105775078
>style presets
nigger, it takes 2 seconds to write "pixel style" on the prompt box lol
>>105775091
I know, it's not necessary it's just there, I might even remove the node but I dont want to break the workflow. can just bypass anyway
>>105775050
And why wouldn't I? It OOMs on my 16gb card at 230 frames, I'm pretty sure even a 24gb can handle 320 frames. But I'm going for the 48gb card
>>105775103
yeah fair enough, thanks for sharing the workflow anon
anime girl is sitting at a desk in a dimly lit office. she is wearing a pink tracksuit and holding a book with the text "how to hide a body" in scribbled font.
>>105775118
it works well for combining stuff (ie pepe + character), I leave the reference images to 2 and just bypass the other image inputs if I want a single input. otherwise I enable it if I want to combine stuff. works decent enough, I just wanted a multi-image option for combining stuff if I want to.
Why do models use different vae?
Wouldn't it be nice if all of them work with the same vae so that you can mix their results in the same latent space? Is there any reason not to do this?
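One concrete reason the latents can't be mixed: the tensor shapes don't even match across VAE families. A sketch; channel counts are from the public model configs, the helper is illustrative:

```python
# (latent channels, spatial downscale factor) per VAE family
VAES = {
    "sd1.5/sdxl": (4, 8),   # 4-channel latents
    "sd3/flux":   (16, 8),  # 16-channel latents, a different learned space
}

def latent_shape(vae, height, width):
    channels, factor = VAES[vae]
    return (channels, height // factor, width // factor)

# same 1024x1024 image, incompatible latent tensors:
print(latent_shape("sd1.5/sdxl", 1024, 1024))  # (4, 128, 128)
print(latent_shape("sd3/flux", 1024, 1024))    # (16, 128, 128)
```

And even when shapes happened to match, each VAE's latent space is separately learned, so the values mean different things anyway.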
>>105775128
that workflow stitches images together right? do you prefer it over the reference conditioning cascade thing?
https://www.reddit.com/r/StableDiffusion/comments/1lo4lwx/here_are_some_tricks_you_can_use_to_unlock_the/
>>105775133
not all vaes are created equal
https://huggingface.co/spaces/rizavelioglu/vae-comparison
>>105775133
>Why do models use different vae?
because each company want to make their own vae and claim they made the best vae ever, it's a competition lol
>>105775141
What does it have to do with my question?
>>105775138
both work, I use both im just testing diff workflows to see how to get multiple images to interact, if you reference "green frog" it will understand, I think they are both considered separate in latent space or whatever then when generating it combines them.

ie: anime girl is sitting at a desk in a dimly lit office with a green cartoon frog that is wearing a red tshirt and blue shorts. she is wearing a pink tracksuit and holding a book with the text "how to hide a body" in scribbled font. keep the frog's expression the same.

originally I just used an image stitch node but sometimes it messed things up, the separate image inputs seems to work much nicer. can take a couple gens to get the pepe right though (you get a frog but not the exact face), but it works:
>>105775153
>>105775171
Do you understand the question?
>>105775167
also "keep expression the same" works really well for maintaining a face, otherwise you can get random faces which can be funny but not pepe.
>>105775153
you're literally asking why a gamecube disk doesn't work on a ps2 console, the consoles (models) have different architectures, they are not compatible at all
>>105775192
I know it's not compatible. I'm asking why they don't collaborate.
>>105775200
>I'm asking why they don't collaborate.
are you retarded? why would companies collaborate with each other? they are rivals not friends
>>105775210
Thank you for finally trying to be on point and answer the original question.
>>105775224
this is such a retarded question though
Don't lie. If you guys were a company/lab, would you jew out the weights too? Or would you do like based Emad, scam VCs and give away all weights for free?
>>105775230
People must find you to be hard to talk to irl.
Scattered mind of reasoning lol
>>105775188
like so:
>>105775239
nah, I'm not surrounded by retards like you so it goes pretty smoothly
>>105775239
you're either extremely ESL, retarded, or both. your question is shit and i feel bad for the anons who wasted their time answering you.
>>105775237
>Don't lie. If you guys were a company/lab, would you jew out the weights too? Or would you do like based Emad, scam VCs and give away all weights for free?
it really depends desu, if my company is big like Alibaba, I wouldn't mind sharing my best models
>>105775239
Ignore that nigga he does this shit every thread
>>105775073
NO PLS NO
>>105775067
>won't the memory requirement be less of a pain if they went for O(n log n) instead of O(n²)?
chat is it true?
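The asymptotics do say yes: a rough sketch that ignores constants and the actual radial mask pattern, just comparing growth rates:

```python
import math

def dense_cost(n):
    return n * n              # O(n^2): full attention over n tokens

def nlogn_cost(n):
    return n * math.log2(n)   # O(n log n): sparse attention

n = 1 << 14                   # some baseline token count
# generating 4x longer video means ~4x more tokens:
print(dense_cost(4 * n) / dense_cost(n))   # dense blows up 16x
print(nlogn_cost(4 * n) / nlogn_cost(n))   # sparse grows only ~4.6x
```

Note this is attention compute/memory only; activations, KV, and VAE decode still scale linearly with length, so longer gens still need more VRAM overall.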
>>105775237
>based Emad
didn't he screech at compvis \ runway?
>>105775258
I really wish there's a unified vae
>>105775237
>based Emad
your "based" Emad wanted to cuck SD1.5, but Runway (the guys who trained the model) released it on an uncucked form, and Emad did everything in his power to nuke the model out of the internet's existence lol
>>105775266
Go see for yourself. More resolution - more credits cost.
>>105775237
If I were BFL, I would not release the weights for Kontext. It's just too useful and there is no other open source equivalent (omnigen 2 etc are not on the same level)

>>105775256
It pisses me off that Bytedance doesn't release their models. Their main business is social media, it makes no sense locking down their models behind API since it was never their thing anyway.
>>105775306
>It's just too useful
Idk man, for a professional setting I doubt they'll be using a model like that, it adds jpg artifacts on each iteration edit
>>105775319
Still miles ahead the alternatives for image editing, including mainstream SaaS models like 4o and gemini
>>105775306
>It pisses me off that Bytedance doesn't release their models.
they just released XVerse though
https://bytedance.github.io/XVerse/
>>105775242
anime girl is wearing a white tshirt with an image of the green cartoon frog that is wearing a red tshirt and blue shorts. She is at the beach holding a bottle of water.
how long before we have video 2 video kontext?
I need to nudify videos!
>>105775340
kek this is really good
>>105775328
if I were a company that does editing images, why would I use kontext dev? I would use their kontext pro/max API, that's the SOTA model
>>105775339
The good stuff, Seedream (image model), Seedance (video model), they keep to themselves, and both of them mog the local alternatives
>>105775340
one more

anime girl is wearing a white tshirt with an image of the green cartoon frog that is wearing a red tshirt and blue shorts. She is at the beach is waving hello. keep her blue and yellow hairclip the same.
>>105775340
drop a catbox of the workflow please
Why would a company keep locking down a model which have been outclassed by other companies, for the same price?
Why not just open source your old model and get free advertisement, and keep your newest one api only? Well, I guess other big tech companies would just train on top of it to fuck you over
Where is illu 3.5
>>105775368
it's this one:

https://openart.ai/workflows/amadeusxr/change-any-image-to-anything/5tUBzmIH69TT0oqzY751

just with the gguf q8 model + node instead of the default.
>>105775349
For topless I think you can just ask for undressing since WAN is good at taking clothes off, but for bottomless other than buttcheekage there's no point because you'll get weird scary pepperoni genitals
clip is wrong side but you get the idea:
>>105775383
reference images is set to 2, if I wanna do a single image I just bypass the second or third image inputs, works fine.
>>105775400
*because the selector can bug out for some reason, going to 0, so I just bypass the second image if I dont want a second reference.
>>105775364
>Seedance (video model), they keep to themselves, and both of them mog the local alternatives
This is true but from what it seems normies already don't care about AI video, and Seedance is ironically harder to set up than WAN if you've used AI before but haven't used tiktok/capcut. Also watermarked stuff is usually DOA at this point because if you cannot generate fake Gaza footage what's the point at all
>>105774185
Fuckos should've had the comfy node ready to go, nobody is gonna infer wan via their gay ass script
>>105775413
???
None of those problems would exist running local, my point was that they outright refuse to release the weights while I doubt this would hurt their business at all
>>105775413
>This is true but from what it seems normies already don't care about AI video
if this was true, I wouldn't get my tiktok feed flooded by veo3's videos (don't get me wrong they are really funny)
>>105775424
yeah, I hate when they do that, first they announce their new method, then we have to wait for ComfyUi's node to appear, they should do it all in one shot, the wow effect will have a much bigger impact
I made a nudey kontext LoRA that's slightly better than the old one. It's still pretty cooked given nude LoRA's for flux are generally shit no matter what, but it works better than the other one. Combine it with a Chroma inpaint on the nude bits afterwards and the results are pretty fucking great. Can upload it, but don't know where, catbox only takes like 200 and it's 350.
>>105775456
>Can upload it, but don't know where
huggingface?
>>105775456
>catbox only takes like 200
https://litterbox.catbox.moe/
upload here temp, let's see it
>>105775456
Compress the file splitting into two, then upload the catbox
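No rar needed, plain `split`/`cat` also gets a 350 MB file under catbox's ~200 MB cap. A sketch using a small dummy file; swap in the real .safetensors path:

```shell
# stand-in for the 350 MB lora (use the real file in practice)
head -c 1M /dev/urandom > lora.safetensors

# cut into <200 MB pieces named kn_v1.part-aa, kn_v1.part-ab, ...
split -b 190M lora.safetensors kn_v1.part-

# downloader's side: concatenate the parts back together
cat kn_v1.part-* > rejoined.safetensors
cmp lora.safetensors rejoined.safetensors && echo "files match"
```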
>>105775364
was using seedream today, it's such a fucking perfect model and it makes me so angry we don't have it. it really feels like SDXL 2.0: wide range of styles, 1.2k res, minimal slopped look, and not too heavily biased towards anything.
>>105775429
Yeah anon, YOUR feed. Normies under 50 hate AI because "I hate the current thing" also a little bit of TDS

>>105775426
How would it help their business at all though? Especially if seedream isn't runnable on a 5090. How has releasing WAN helped Ali Baba's business?
>>105775507
>also a little bit of TDS
well, technically it would be called AIDS (literally KEEEEK)
>>105775456
>>105775477
>>105775480
>https://files.catbox.moe/8s88kw.rar
>rename to kn_v1.part1.rar
https://files.catbox.moe/0b7der.rar
>rename to kn_v1.part2.rar
Was more of a test of my dataset and captioning to see if I could get it to work. It's undertrained, but it does indeed werk.
Password is three letters and the name of this general. I'll post any future versions here too.
>>105775497
is it good tho? it turned a pretty solid description of this >>105774208 into this slop
>>105774955
what can kontext do and why would people use it?
>>105774120
On it boss, wait 3 minutes.
>>105774835
But Kontext unified into shit, please no Wan!
>>105775594
shiieet wan turned her into something decent, figures.
>>105774879
Very nice, but she has the hands of someone who has worked as a dishwasher for 30 years, surely she could afford a maid...
I've read what feels like every manga and manhwa on earth at this point with this hobby. What do you all do while waiting?
>>105775634
collect real images and videos for future use
>>105775555
what is the prompt and where are you using it?
>>105774909
Yes, obviously we understand that the watered down, heavily-censored, distilled models BFL releases to the public are crippled so as to make people gravitate towards their paid API versions.

This is also why they hate and want to destroy NSFW loras for their models, since that is something they will never offer on their API, and it makes the public versions more valuable.
>>105774120
K here u go boss.
>>105775523
>technically it would be called AIDS
Compute pool's closed due to AIDS

>>105775634
>What do you all do while waiting?
Video takes two minutes to generate now so I just stroke my penis (goon) or craft the next prompt
Most of my waiting is for my wife to be asleep or leave the room
>>105775634
How is the temp only 64?
Mine reaches 90
>>105775660
Probably power limiting, which you should almost certainly do if you have a strong GPU as well
I lowered my power on my 5070ti from 300w to 250w and fan speeds and temperature went down at no cost to gen time
>>105775634
Work in parallel bro. Get another image/workflow ready.
>>105774934
As long as they do, I will gladly accept.

Wan 2.1 was already a massive revolution in local video generation. With further optimizations and more loras there's a lot untapped potential there.

But they are at least releasing open Wan 2.2, so we will eat good once more at least.
>>105775673
how to do it in a laptop?
MSI shows the option as greyed out
>>105774960
It's not an actual problem, as it stands, China can just get cards from countries where there is no embargo. For chinese citizens and smaller companies there, yes, it is a problem, but for the Chinese state, or huge chinese companies like Alibaba, it's no problem at all.

That said, they are obviously making their own chips, it will take time though, still they are moving faster than expected.
>>105775594
dude on closer inspection.. this is amazing. are you willing to share the workflow+prompt?
>>105775652
something along the lines of
"A close-up portrait captures a young woman with captivating emerald green eyes and vibrant red lips, offering a gentle smile directly at the viewer. Her dark, straight hair features a short fringe, neatly framing her face. She wears an elaborate, traditional-style hat adorned with brown and white fur, featuring a richly embroidered band in reds, blues, and white, with a long blue fabric tail falling over her shoulder. A substantial silver necklace hangs around her neck, composed of numerous coin-like medallions and small dangling bells, reflecting the bright sunlight. Lush green leaves and clusters of glossy red berries from a tree branch artfully frame the upper left and right portions of the image, subtly diffusing the natural light. The background remains softly blurred, indicating an outdoor setting bathed in brilliant sunlight, which casts a gentle warmth and subtle highlights across the scene, imparting a serene and inviting atmosphere."
and I just used the free seedream image gen page at https://seedream.pro/
there are a decent amount of 32gb+ inference options, but almost all of them are for LLMs and would shit themselves trying to run image/video models.
>>105775657
damn that's pretty neat, technologically speaking of course
>>105775276
Yes, first Emad was saying he wanted uncensored models because he hated censorship.

Then he dragged his feet for months releasing SD1.5.

Then Runway, their partner in making SD1.5, realized that unless they released it themselves, it would take another year since Emad was busy censoring the model like a motherfucker.

When Runway released SD1.5, Emad was pissed and there were some really passive-aggressive posts from him and his employees, ending all further partnership with Runway.

If not for Runway, SD1.5 would have come MUCH later, if ever, and as a heavily censored version.
>>105775722
>https://seedream.pro/
you got saar'd, that's not the actual site. it's on dreamina
>>105775698
use nvidia-smi
>>105775594
Encounters of the Double Kind
>>105775722
It's just the NAG + lightx2v workflow from the rentry with some breast jiggle loras from civit. I'm running on a RTX 6000 pro so I'm loading the full WAN model, might be making a difference.
>>105775547
That works pretty good, can you share your data set?
>>105775770
>6000 pro
is there a max resolution that the vid quality starts to degrade?
>>105775777
No.
>>105775770
thanks, I'll figure it out. what would the prompt look like, just in general? sorry for the smoothbrain question, I've yet to gen a single video. ahem.
>>105775746
Thanks
Running nvidia-smi -q -d POWER gives this output
GPU 00000000:01:00.0
GPU Power Readings
Average Power Draw : N/A
Instantaneous Power Draw : 76.15 W
Current Power Limit : 95.00 W
Requested Power Limit : 95.00 W
Default Power Limit : 80.00 W
Min Power Limit : 1.00 W
Max Power Limit : 95.00 W

What should I set it to? 75? 60?
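For reference, a cap in that range is applied with `nvidia-smi -pl <watts>` (a real nvidia-smi flag; needs admin rights, and the driver must accept the value). The helper below is hypothetical, just a sketch that clamps the request to the Min/Max limits reported above and emits the command:

```python
# Hypothetical helper: clamp a requested power cap to the card's reported
# Min/Max Power Limit (1 W and 95 W in the nvidia-smi output above) and
# build the matching nvidia-smi invocation.
def power_limit_cmd(watts: float, lo: float = 1.0, hi: float = 95.0) -> str:
    capped = max(lo, min(hi, watts))
    return f"nvidia-smi -pl {capped:.0f}"

print(power_limit_cmd(75))   # stays under the 95 W max
print(power_limit_cmd(120))  # clamped down to 95
```

Something around 75 W keeps you under the 95 W max while leaving headroom below the 80 W default; run the printed command as root/admin.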
>>105775799
why, is it based on your girlfriend's nudes?
>>105775844
what I did was overclock a bit and then cap the max clock
>>105775850
The 400 image set is 800MB and the 800 set is twice that. I couldn't be arsed uploading it.
A monkey could make a nude dataset for Kontext though. It's easy.
>collect images of nude women from some artsy nude site, go for quality and variety over quantity, with different body types
>run them through kontext using nested wildcards that give the nude women various types of clothing
>it will glitch on some, rerun it on the ones that don't work correctly until it does
>you now have two datasets, the control (AI clothed women) and the target (the source nude images)
>caption each with "Remove their clothing and make them nude. They are blah blah blah" describing their body features
>can use a local vision model to caption them, then append the "remove" caption at the start of each caption file
Easy as that.
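The last caption step can be scripted. A minimal sketch, assuming one `.txt` caption file per image from your vision model (the directory layout and function name are made up for illustration); it prefixes each file with the edit instruction and is safe to re-run:

```python
from pathlib import Path

# Edit instruction to prepend; the text after it is the vision-model caption.
PREFIX = "Remove their clothing and make them nude. "

def prepend_instruction(caption_dir: str, prefix: str = PREFIX) -> int:
    """Prefix every .txt caption in caption_dir; returns files processed."""
    count = 0
    for txt in sorted(Path(caption_dir).glob("*.txt")):
        body = txt.read_text(encoding="utf-8")
        if not body.startswith(prefix):  # skip files already prefixed
            txt.write_text(prefix + body, encoding="utf-8")
        count += 1
    return count
```

Run it once over the target-side captions after the vision model finishes; re-running is a no-op thanks to the prefix check.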
>>105775657
MY MAN
HELL YEAH
>>105775577
I just use it for funny things
For any real anon reading, is there any good AI community where people are good with training and know what they are doing? I wanna discuss training settings and I'm not getting much help here, and based on uploaded images alone, I suspect half of the posters in this general are indians.
Is there a thorough guide I can read for Kontext? I got Comfy installed and used an anon's workflow for two images into one ("shaking hands with the other woman", that one), and it takes 7 minutes on my 4060ti.
Obv something is wrong or I misunderstand Kontext, and I need to figure out which it is.
>>105775961
I'd move to plan B
>>105775961
Pls sir think of my village please send btc for helping u
>el Dee gee is ass/garbage
Wow cool your eyes work great
>and it takes 7 minutes on my 4060ti.
>Obv something is wrong
>anon posts lora based on heavily censored model and knows how to uncensor it
>clearly knows what hes doing
>random retard "duuuuh where da trainers at you ppl are stooopid"
good job, tardlinger
>>105775961
go to reddit then, faggot
>>105775984
On other boards ragebaiting/insulting is the fastest way to get information (false-flagging) so…
>>105775984
He won't share his dataset though
>>105776003
You can't find naked photos of women on the internet??
>>105775980
Sorry i'm not up on memes, I have no idea what that sog is specifically telling me?
Start from scratches?
That's the way the dog bone crumbles?
I just farted?

any other options?
>>105776003
>anon posts his exact recipe on how he did it
>random faggot: "duuuh lemme borrow your homework so I don't have to do a thing myself? I'm too busy jerkin my meatstick to make a dataset"
good job, cock snorkeler
>>105775889
Who can help me make a Prophet Muhammad LoRA?
I can zip all the images I have. I don't know how to caption.
We must make one for free speech.
They removed the Jesus lora though
>>105776010
Time, the time-knife
It's referencing the short amount of time you spent, compared to all the other toaster-GPU anons here
>>105776021
>they removed the Jesus Lora
Good.
>>105776021
In my experience, nobody is going to actually do it for you, especially niche shit like that. There's no collaboration here when it comes to LoRA's or training. At the most, anons will help you with advice or settings, and you'll need to learn from researching shit and trial and error.
>>105776041
So like any other hobby?
>>105776041
I would have done it myself if I didn't have a 6 GB gpu
>anons will help you with advice or settings
okay tell me
is it even possible for me to create a SDXL lora?
>>105776044
Pretty much, yeah. Eventually local assistant AI's will be powerful enough to do stuff like that, ie drop a folder of images and say "Make me a LoRA dataset and train it up, bitch", and they will. We're not there yet though.
>>105775961
>I wanna discuss training settings and I'm not getting much help here,
I don't know either, I just copypaste training settings I see on civitai to see what works and what don't
Though quite a lot of them have batch >= 8
>>105775999
>satanic trips of (mostly) truth
>>105776021
>Who can help me make a Prophet Muhammad LoRA?
>>105774236
>>105775961
Ask good questions, get good answers

You are probably asking the most retarded shit
>>105776072
Y tho
>>105776023
>Time, the time-knife
If it's normal for that card then I apologise, I guess it was my lack of experience with Kontext and the way other models are so comparatively quick at genning.
>>105776086
Hypothesis: most people here don't even have a pc
Despite all the snobbery, everyone here is crying about tensor rolling back the free gravy train, & now they have to spend their mctendies money to keep gooning
ignore the age, this is your reminder that exercise balls are a thing and women jiggle barefoot when bouncing on them
>>105775770
Pika finna catch a SA court case
>>105776112
>>105776072
Ask the guys at Charlie Hebdo, I'm sure they can help
>>105776083
Y not tho
>>105776078
>Assuming this thread have smart anons with standards
Anon...
>>105775994
I'm thinking about this, not joking. What a shame that I would probably get higher quality feedback on fucking reddit than on 4chan. How the mighty have fallen.
>>105776112
FED
when 48GB goes mainstream, will we finally see LLM+t2i combos? So that the model talks to you, asking questions on how to refine the picture and what tags it understands
the current black-box approach wastes a lot of time
>>105776176
>muh mighty
Read the guide
Follow the instructions literally in the exact thread you're whingeing in
Experiment
& most importantly, Have fun ;^)
>>105776176
I hope you overcome whatever is stopping you
>>105776183
this has been attempted multiple times on cloud models and never really went anywhere. something about refining back-and-forth just doesn't work. maybe related to that one paper from 2 years ago about feeding AI its own output and quickly collapsing its internal "world model" as a result
>>105775961
yes, https://arcenciel.io/
this is the only training community i found and they talk about mostly training illustrious loras. i stopped training months ago and don't care for realism so i'm not sure what else is out there.
>>105776205
My question is can a 3060 train
And will it take more than 1 hour
>>105776183
>when 48GB drops onto mainstream
So, perhaps year 2032 then
>>105776183
>when
if*
it has been 5 years of 24gb and there are no signs of it improving any time soon
>>105776227
it has literally improved to 32gb this year
>>105776227
There's an anon with RTX 6000 ITT
>>105776237
Allegedly
>>105776205
What guide? It's completely shit. You don't know how to make a good tutorial. It's just snake oil.
>>105776218
Train what? What model, how many images, what resolution?

For example, a 3060 12gb can train a Flux lora at 512x512 resolution with NF4 quantization, and the results will be good enough; with 20-30 images it will take 4-6 hours to do 100 epochs, which transfers the style/likeness well enough.
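As a sanity check on those numbers, total optimizer steps are just images per epoch times epochs. A rough sketch (batch size 1 assumed, gradient accumulation ignored):

```python
import math

def total_steps(num_images: int, epochs: int, batch_size: int = 1) -> int:
    """Optimizer steps for a full run: ceil(images / batch) per epoch, times epochs."""
    return math.ceil(num_images / batch_size) * epochs

# 20-30 images, 100 epochs, batch 1 -> 2,000-3,000 steps total, so roughly
# 5-10 seconds per step lands in the quoted 4-6 hour range.
print(total_steps(20, 100))
print(total_steps(30, 100))
```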
>>105776238
Careful, we don't talk about things that COULD have allegedly happened, we cannot speculate whatsoever, we can not use inference. Period.
>>105776236
woah, really??? sweet i can finally run SDXL slightly faster than 3 years ago!!
>>105776242
>512x512
:(

I think the anon from a few threads ago was right, just train online with rented hardware
I break my render times up in groups to prevent heat-death
6 hours is never happening
>>105776249
that has nothing to do with vram lmao. lurk moar before you talk about things you do not understand
>>105776249
Don't belittle progress
Things you take for granted can evaporate overnight
>>105776252
For all the problems with Flux, it's amazing at training at low resolution while still generating high resolution images with the person/style intact, without artifacts or loss in detail. Unless you are training something like detailed full posters, 512-640 is often easily enough, and it obviously speeds up training a lot.
>>105776257
>Things you take for granted can evaporate overnight
i can feel the bytes from my safetensors files slowly evaporating from radioactive decay as we speak
>>105776257
nvidia drip-feeding gaymerslop to protect their already outdated a100s is not progress. AMD or intel developing an adequate rival to Cuda would be.
>>105776298
they both had projects, then scuttled them
then they both funded ZLUDA, then dropped it

they failed, and they lost. the only person who doesnt deserve the seethe are consumers left without a choice
>>105776295
Do you want SSD failure?
Because this is how you get SSD failure
>>105776272
The face being accurate is v important to me
>>105776135
gonna need a ouija board
>>105774112
i have 64gb ram
https://files.catbox.moe/3aq184.mp4
exact workflow i use^^^
maybe increase your block swap size? if you don't have much ram you should use gguf and maybe load fewer loras? if you have plenty of ram then load loras in low mem mode
ps: fp8 is faster than q4 or q8
>>105776374
>open the portal! XD
Ya good idea
subgraphs when
>bump limit reached
>44 images
>>105776456
holy sdxl frying
>>105775770
hello rtx 6000 pro enjoyer. please share your workflow
>>105775961
you're limiting yourself very hard if you rely on /g/ for a decent AI community. There are multiple other diffusion threads on different boards, and subreddits have decent discussions and gens posted. /g/ is only a place of shills who only get excited to shill the next fancy FOTM meme model™ and ComfyUI workflow. You're better off going to /trash/ or /b/ for advice on lora training from other vramlets.
>>105776491
>/g/ is only place of shills
hello sir
>>105775994
reddit has plenty of good 18+ ai subreddits that i lurk around.
Lets be honest, the only reason you should be on /g/ AI threads is if you're into celebs or kids or something else illegal so you can't post on reddit or discord

>n-no I'm just german so i just care about privacy and anonymity also I hate normies
that's just being into kids with extra steps
ok so why won't you go there and stay there?
>>105776527
Infected neovagina spotted
>>105776527
what the hell? what the helli?
https://www.tiktok.com/@fanduel/video/7503426695345573166
>>105776491
>bro just hang out the communities using SDXL finetunes they're going to help you with training Wan
What's the difference between let's say chroma-unlocked-v41.safetensors and chroma-unlocked-v41-detail-calibrated.safetensors?
>>105774164
>200W
>firehazard
>>105776824
>What's the difference between let's say chroma-unlocked-v41.safetensors and chroma-unlocked-v41-detail-calibrated.safetensors?
>>105774779
"detail calibrated" is "large", it means he's training the model at 1024x1024 instead of 512x512
>>105776456
really a bit thin eh. sorry I can't provide any new chroma material, working on sdxl set
>>105776824
'detail calibrated' has better detail, thus provides better detail on detail. insert graph here.
>>105776824
unlocked 41 gives you vintage analog grain (noise artifacts) from training at 512x512 that make the model look so authentic to the early 2000s myspace era
detail calibrated is trained at 1024x1024 so you get the extra-smooth vaseline smear from the High Quality synthetic dall-e 3 dataset.
>>105776859
Do you have anything better to use in mind?
>>105776859
>detailed calibrated is trained at 1024x1024 so you get extra-smooth vaseline smear from the High Quality synthetic dall-e 3 dataset.
that's because he started making the detail calibrated version at v34, and since v34 calibrated is effectively v1 for large, the large model is really undertrained. for example on v41 he's basically mixing a v41 "base" model (still partly distilled with fast) with a v7 large model, this is such a shitshow lol
>>105776869
no it was a joke. but generally i still prefer non-detail because detail feels a bit slopped in comparison still.
For those who want to install the newest version of Sage2++, you can use this
https://www.youtube.com/watch?v=QCvrYjEqCh8
>>105776242
> with 20-30 images it will take 4-6 hours to do 100 epochs
> 2000000-3000000 steps in 4-6 hours
Damn, 3060 is huge.
>>105776859
Bullshit. You will get less analog feel with 512x512 vs 1024x1024 because so many 'noisy' details are lost when converting them to latents to be trained.

You don't know anything about the subject yet you keep spouting off nonsense.

Seek help.
GET THE FUCK OUT
>>105776972
>>105776972
>>105776972
>>105776943
It's a good budget card; unlike the 4060 and 5060 models, which have a 128-bit bus, the 3060 has a 192-bit bus. The 5060 has GDDR7 vram though, which speeds things up. The 4060 series is absolute shit though, steer clear.

Best bang for buck still has to be the 3060 Ti, I doubt it will ever be beaten in value.
>>105776994
what about a 4090 card?
>>105777111
4070 and upwards have 192-bit bus or larger, 4090 has 384-bit bus
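Bus width matters because it multiplies directly into peak memory bandwidth, which is what inference is usually bound by: bytes/s = (bus bits / 8) x per-pin data rate. A quick sketch; the per-pin rates are the commonly quoted spec-sheet numbers, so treat them as approximate:

```python
def bandwidth_gbs(bus_bits: int, gbps_per_pin: float) -> float:
    """Peak memory bandwidth in GB/s: bytes per transfer times data rate."""
    return bus_bits / 8 * gbps_per_pin

# (bus width in bits, per-pin rate in Gbps) from public spec sheets
cards = [("RTX 3060", 192, 15.0), ("RTX 3060 Ti", 256, 14.0),
         ("RTX 4060", 128, 17.0), ("RTX 4090", 384, 21.0)]
for name, bits, rate in cards:
    print(f"{name}: {bandwidth_gbs(bits, rate):.0f} GB/s")
```

By this arithmetic the 3060's ~360 GB/s beats the newer 4060's ~272 GB/s despite the older memory generation.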
>>105776890
Pretraining on a lower resolution is how every model is initially trained... the difference here is the smaller dataset size and the weird mixing method. Who knows what their basis for the latter is. They aren't dumb so I'll have some faith in the anthro wolf shaggers
>>105776943
Your maths might be a little off
>RifleXRoPE extends WAN video output by an additional 3 seconds, increasing the frame count from 81 to 129. However, it comes with some limitations: Higher VRAM usage, Longer generation time, A tendency to revert/loop the scene to its original state by the end of the video
How is this any different than just increasing the length?
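One intuition for the difference: it rescales the rotary position embedding frequencies rather than just sampling more frames. In plain RoPE each temporal component repeats with a fixed period, so any component whose period is shorter than the clip wraps around, which lines up with the revert-to-start behaviour; RifleX-style tricks lower the key frequency so its period covers the longer clip. A toy sketch (the dimension and base here are illustrative, not WAN's real values):

```python
import math

# Plain RoPE: component pair k rotates with frequency base**(-k/dim), so its
# period in positions is 2*pi * base**(k/dim). Components with periods
# shorter than the clip complete a full cycle and repeat.
def rope_periods(dim: int, base: float = 10000.0) -> list:
    return [2 * math.pi * base ** (k / dim) for k in range(0, dim, 2)]

frames = 129  # the 81 -> 129 extension quoted above
wrapping = [p for p in rope_periods(16) if p < frames]
print(len(wrapping))  # temporal components that wrap within the clip
```

Naively extending the length leaves those wrapping components untouched, which is consistent with the looping tendency the quote describes.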