Discussion of Free and Open Source Text-to-Image/Video Models
Prev: >>106221281
https://rentry.org/ldg-lazy-getting-started-guide
>UI
ComfyUI: https://github.com/comfyanonymous/ComfyUI
SwarmUI: https://github.com/mcmonkeyprojects/SwarmUI
re/Forge/Classic: https://rentry.org/ldg-lazy-getting-started-guide#reforgeclassic
SD.Next: https://github.com/vladmandic/sdnext
Wan2GP: https://github.com/deepbeepmeep/Wan2GP
>Checkpoints, LoRAs, Upscalers, & Workflows
https://civitai.com
https://civitaiarchive.com/
https://tensor.art
https://openmodeldb.info
https://openart.ai/workflows
>Tuning
https://github.com/spacepxl/demystifying-sd-finetuning
https://github.com/Nerogar/OneTrainer
https://github.com/kohya-ss/sd-scripts/tree/sd3
https://github.com/derrian-distro/LoRA_Easy_Training_Scripts
https://github.com/tdrussell/diffusion-pipe
>WanX
https://github.com/Wan-Video
2.1: https://rentry.org/wan21kjguide
2.2: https://rentry.org/wan22ldgguide
https://alidocs.dingtalk.com/i/nodes/EpGBa2Lm8aZxe5myC99MelA2WgN7R35y
>Chroma
https://huggingface.co/lodestones/Chroma1-HD/tree/main
Training: https://rentry.org/mvu52t46
>Illustrious
1girl and Beyond: https://rentry.org/comfyui_guide_1girl
Tag Explorer: https://tagexplorer.github.io/
>Misc
Local Model Meta: https://rentry.org/localmodelsmeta
Share Metadata: https://catbox.moe | https://litterbox.catbox.moe/
Img2Prompt: https://huggingface.co/spaces/fancyfeast/joy-caption-beta-one
Samplers: https://stable-diffusion-art.com/samplers/
Txt2Img Plugin: https://github.com/Acly/krita-ai-diffusion
Archive: https://rentry.org/sdg-link
Bakery: https://rentry.org/ldgcollage
>Neighbours
https://rentry.org/ldg-lazy-getting-started-guide#rentry-from-other-boards
>>>/aco/csdg
>>>/b/degen
>>>/b/realistic+parody
>>>/gif/vdg
>>>/d/ddg
>>>/e/edg
>>>/h/hdg
>>>/trash/slop
>>>/vt/vtai
>>>/u/udg
>Local Text
>>>/g/lmg
>Maintain Thread Quality
https://rentry.org/debo
So did a few more tests with Heun as the high noise sampler. It's just too chaotic. It works great as a low noise sampler though. LCM for high and Heun for low works well for me.
>>106224373 are you using lightning lora on high? if not, how many steps?
>>106224163absolutely based, thanks anon
Blessed thread of frenship
no spelling, just a sign test
>>106224405she's holding that Z sideways, what a retard
>>106224405the anime girl is wearing a Super Mario outfit and Super Mario cap, and has a black moustache. She is holding a yellow star.
>>106224430A man of refined taste
wan is a funny meme maker.
one more: but with science
>>106224496logistic or agricultural?
>>106224509depends whats in the beaker
much better slam movement without lightning lora on high noise
sucks that it takes so much longer to gen
will the slowmo vids ever stop or
does comfyui do some kind of weird caching were gens bleed into each other? im using wan 2.2. first prompt was to have a girl angrily talk to the camera which went fine. second prompt was to have her put her hands behind her head but it genned her angrily rubbing her own tits.
>>106224559Grammar and writing is the great filter.
>>106224557 I've had luck with enclosing the entire caption with a weight between 2 and 3. Applying motion blur to the starting image works too but obviously you lose detail.
>generate image without pixel art LoRA
>img2img it with a pixel art LoRA added, but keeping everything else (especially seed) the same
This seems to keep the art style of the other LoRAs more intact than doing the first gen with a pixel art LoRA added. Denoise of 1 on the img2img pass gives the best looking result, but it's farther from the base image than with lower denoises.
Chat, is this okay? Webm related took 23 minutes to generate and ate 60 gigs of ram on top of full 12gb vram while at it.
>>106224668Here's another image I made with the same workflow, just a different prompt and style LoRA
>>106224687 Think of it this way, how many images did you generate? 128 frames? How long would 128 images on Flux take to generate?
>>106224696 it's neat how all these tools work together, can make an anime gen in reforge, edit it in kontext, and have wan animate it.
>>106224689pretty good. which model?
this one came out great.
prompt was:
the girl holds a small golden star. she quickly eats the star by and then stares at the viewer intimidatingly as her fake mustache falls off her face.
>>106224750yeah, it is exceedingly fun doing that.
>>106224764
>the star by
ah what the fuck.
i'm lowering the res, these are taking too long for simple shitposts
comfy should be dragged out on the street and shot
I'm on a shitty Asus F15 with a 4GB VRAM RTX 3050, am I cooked if I want to attempt local diffusion?
>>106224754Illustrious, specifically WAI NSFW
Hassaku XL also handles pixel art pretty well, but not quite as good
>>106224796 i'm on a 6gb 3060. at best i can run sdxl and it still takes far too long. 4gb 3050 is not even worth running a model.
anon who trained the digital camera lora on chroma, did you use diffusion-pipe or ai-toolkit?
>>106224794here's your (You). you happy, faggot?
why does she look so evil.
i love it
who the fuck said wan is good for single image photorealism?
it's fucking trash.
come out and I'll beat you in the ass with my chroma
>>106224812NTA but any lora I've made so far on AI-Toolkit has been ass. A lot of banding and doesn't look as good as some I've seen.
Guys I have Pentium 1, can I play Sim City 3000?
i love this node so much
>>106224834you gonna hit me with noise?
>>106224834Crawling on my gens, these macroblocks cannot heal
>>106224808 so what am I looking at here for comfortable production, 5070 with 12GB?
>>106224789 for the miku I did 640x640 and it was like 2 min or so, pretty fast. if you do 1280x720 it's gonna take much longer; smaller res still works fine and is quicker.
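A rough way to see why the jump from 640x640 to 1280x720 hurts is to compare pixel counts per frame. This is only a sketch: the function name is made up for illustration, and real gen time usually grows faster than the pixel count because attention cost is not linear in the number of tokens.

```python
def pixel_ratio(res_a, res_b):
    """Return how many times more pixels res_b has than res_a."""
    (wa, ha), (wb, hb) = res_a, res_b
    return (wb * hb) / (wa * ha)

# Resolutions taken from the post above
ratio = pixel_ratio((640, 640), (1280, 720))
print(f"1280x720 has {ratio:.2f}x the pixels of 640x640")
```

So every frame at 1280x720 carries a bit more than twice the data of a 640x640 frame, before any extra cost from the model itself.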
>>106224853i love you anon
>>106224834sounds like a skill issue, it's the best model atm
Prompt weights are given like (prompt:1.x) and the higher the x, the higher the priority, right? I've been having an issue where a hat doesn't get removed from the head; it just gets duplicated and the duplicate moves. I was wondering if there was a better phrasing.
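For reference, the (text:weight) syntax can be pictured as splitting the prompt into chunks, each with a multiplier, where anything unmarked counts as 1.0. The parser below is a simplified sketch and not the code any particular UI uses: it ignores nesting and escaping, and the function name and example prompt are invented for illustration.

```python
import re

# Matches a non-nested "(some text:1.3)" group
WEIGHTED = re.compile(r"\(([^():]+):([0-9]*\.?[0-9]+)\)")

def parse_weights(prompt):
    """Split a prompt into (text, weight) pairs; unweighted text gets 1.0."""
    parts = []
    pos = 0
    for m in WEIGHTED.finditer(prompt):
        before = prompt[pos:m.start()].strip(" ,")
        if before:
            parts.append((before, 1.0))
        parts.append((m.group(1).strip(), float(m.group(2))))
        pos = m.end()
    tail = prompt[pos:].strip(" ,")
    if tail:
        parts.append((tail, 1.0))
    return parts

print(parse_weights("a girl, (removing her hat:1.3), smiling"))
```

Values above 1.0 push the chunk harder and values below 1.0 pull it back. Whether a given text encoder and model respond well to weights is a separate question.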
Can finetunes fix the anatomy on chroma once the base model is finished, or is it inherent?
>>106224888weird how you forgot to provide an example gen
>>106224668
>>106224689
Looks very good, what pixel lora are you using?
Typically you will see a lot of dithering when looking closely that gives it away, but here there is very little.
>>106224898It'll definitely fix it
>>106224853i was looking for that node actually. thanks. i thought nearest-exact was bad or no?
>>106224754That's pretty funny.
>>106224373I go simple, dpm++sde everything, though I don't know if any other would look better, this looks fine.
>>106224867 for wan i have a 24gb card because sure i can offload etc but.. i don't wanna. and it's more futureproof, see Chroma and Qwen Image
>>106224916honestly i'm not sure. i'd have to play around with it more.
>>106224928ooooo this one looks great, I humbly ask that you teach me your ways sensei.
>>106224867 You will be left behind if you have less than 24 GB of VRAM, especially given how much money you have to spend even on a 5070.
>>106224867 Don't buy anything with less than 16gb at this point. Overall performance-wise, I'd say a 5060 ti 16gb is the minimum if you want to buy a card now.
>>106224960 24gb is an odd size for memory, which GPU are you running? 4090?
i love how filtering and sorting images on civit just doesn't fuckin work
>>106225021how is it an odd size? also yes, 4090.
eh, it kinda kept the pixelart style.
>>106225026real and true. as expected of a vibe coded website.
>>106224994 Not bothered to double check but as far as I remember the 12gb 5070 beats the 16gb 5060 (in games). Not an expert in diffusion but I'm guessing for this task more VRAM is better even if the rest of the specs are worse?
>>106224906I used Luminable's Pixel Art LoRA for both but Skormino's also seems good
Luminable's has a bias toward anime while Skormino's has a bias toward realism, and Luminable's seems more robust.
>>106225074 the 5070 has more cuda cores and a higher memory bandwidth so it'll be faster but only if the model fits into vram completely
but 12gb is indeed peanuts these days
A wan 2.2 14b + lightning 8/2 gen based off a qwen-image 20b still
god damn, that train is done so well
>>106225074 The problem here is the vram size. With 16gb you will have a lot less offloading when doing video and high quality image gen than with 12gb, which makes the cuda core / bus difference between 5070 and 5060 ti not worth the money.
If you want better performance than a 5060 ti 16gb, get a 5070 ti 16gb.
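The "does it fit" question can be estimated on the back of an envelope from parameter count and bytes per weight. The sketch below uses the 14B and 20B figures mentioned in the thread and only counts the diffusion model's weights. Text encoder, VAE, latents and activations come on top, so treat the numbers as a lower bound. Function names are made up for illustration.

```python
# Bytes per parameter for common storage precisions
BYTES_PER_PARAM = {"fp32": 4, "fp16": 2, "fp8": 1}

def weights_gb(params_billion, dtype):
    """Approximate size of the weights alone, in GB (1 GB = 1e9 bytes)."""
    return params_billion * BYTES_PER_PARAM[dtype]

def fits(params_billion, dtype, vram_gb):
    """True if the weights alone fit in vram_gb (ignores all other memory use)."""
    return weights_gb(params_billion, dtype) <= vram_gb

for name, size in [("Wan 2.2 14B", 14), ("Qwen-Image 20B", 20)]:
    for dtype in ("fp16", "fp8"):
        gb = weights_gb(size, dtype)
        print(f"{name} {dtype}: ~{gb} GB of weights, "
              f"fits 12 GB: {fits(size, dtype, 12)}, fits 16 GB: {fits(size, dtype, 16)}")
```

Under these assumptions a 14B model at fp8 squeezes into 16 GB on paper but not into 12 GB, which is the gap the posts above are arguing about. Anything that does not fit gets offloaded to system RAM and slows down.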
another similar gen... look how well wan 2.2 rotates the whole view... god damn
>>106225126This was impressive, people working as animators must be having ulcers at this point
>>106225126Man. Cool gen. Really makes me miss going to lawson at night for some sandwiches and drinks.
>>106225026 For as great as it is as a hub, it's a fucking terrible website to use and navigate. I hate how it manages model variants and such on a page with the buttons at the top and the search page is shit. Just a poor poor website designed around being showoffy more than usability
>>106225185Yes, thankful it exists, but the navigation and incredibly bloated ui is awful.
Actually the only place worse in the AI space would be Hugging Face, it's like it's deliberately made to make you never find anything you're looking for.
>>106224906
>>106225080
Oh also, I scaled it down to 1/8 then back up to original size with nearest neighbor. The original output is pretty clean so the difference isn't that noticeable from a distance, but it crisps everything up well.
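The down-then-up trick can be sketched without any image library, using a plain 2D list as a stand-in for the image. This is a toy version under stated assumptions: it assumes the dimensions divide evenly by the factor, and it samples the top-left pixel of each block, while real resizers typically sample nearer the block centre. Function names are invented for illustration.

```python
def downscale_nearest(img, factor):
    """Keep one sample per factor x factor block (top-left pixel of each block)."""
    return [row[::factor] for row in img[::factor]]

def upscale_nearest(img, factor):
    """Repeat every pixel factor times horizontally and vertically."""
    out = []
    for row in img:
        wide = [px for px in row for _ in range(factor)]
        out.extend([list(wide) for _ in range(factor)])
    return out

def crisp(img, factor=8):
    """Scale down by factor, then back up, so each block becomes one flat colour."""
    return upscale_nearest(downscale_nearest(img, factor), factor)

# 16x16 toy "image" of grayscale values
img = [[(x + y) % 256 for x in range(16)] for y in range(16)]
result = crisp(img, 8)
print(len(result), len(result[0]), result[0][0], result[0][8], result[15][15])
```

The output has the original size, but every 8x8 block is a single value, which is what removes the soft in-between pixels.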
>>106225125thanks will have that in mind
I wonder if you could make a basic FMV game now...
>>106225105
>>106225126
it's good but the movement should be done at 16 fps and not 24 fps, those fucking lighting loras are so fucking bullshit, I don't know why they released that crap
>>106225292to me it feels like they rushed it out the door unfinished, because of the amount of people pestering them and also they wanted to be the first ones
Is comfyui the only real way for video at the moment? I fucking hate it so much I tried and for what I do it was just way too overcomplicated compared to forge classic (and classic dev just cucked and created a flux branch so even less reason to leave it now lol). If it's the only way I guess I'll struggle through it, but if there's a more "webui" front end for it that I don't know of
>>106225214Thanks! I wonder how many indie devs are training models on pixel graphics now
>>106225105Waiting for the train with Miku
>>106225313 what makes me mad is that you could work around the slowness of the 2.1 lora by generating 121 frames instead of 81, but when you try to generate 121 frames with the I2V 2.2 model all you get is some jittery shit
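The frame counts and frame rates being argued about translate into clip length with simple division. A quick sketch, using the 81 and 121 frame counts and the 16 and 24 fps figures from the posts above (the function name is made up):

```python
def clip_seconds(frames, fps):
    """Playback length of a clip in seconds."""
    return frames / fps

for frames in (81, 121):
    for fps in (16, 24):
        print(f"{frames} frames @ {fps} fps = {clip_seconds(frames, fps):.2f} s")
```

The same 81 frames last about 5.06 s at 16 fps and about 3.38 s at 24 fps, so playing a 16 fps gen at 24 fps speeds the motion up by 1.5x. Going from 81 to 121 frames at 16 fps stretches the clip to about 7.56 s.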
>>106225347Rethought suicide with Miku. Train too slow so she thought better of it, and decided to continue spreading her love until the train can move faster.
>>106225355Nice, very nice.
anyone try this? looks like kontext at home: https://huggingface.co/Skywork/Skywork-UniPic-1.5B
i hate civitai so much its unreal. if you have R, X, & XXX enabled, it will hide some SFW results. If you have them disabled, it will hide some NSFW results. you can't view them both at the same time.
this is so ridiculously annoying holy shit
to the anon a while back who recommended chroma for wonky faces, holy shit man you weren't kidding. this reminds me of sd1.5 but i can get a wonky realistic person with unique features every gen, noice!
>>106225586They have to carefully dance around the payment processor pron rules or they will get cancelled by the Christians.
>>106225329I think training a model might be overkill but I'm trying to make a simple game with these tools as practice with using them, and based on the two pics I posted it seems like signs are good that I can get it to generate the types of assets I want. I'll have to do manual cleanup of the outputs to maintain better style consistency but you can get a lot farther with this stuff than I expected.
It'll be interesting to see what people who are willing to do more sophisticated workflows will make with this stuff in the next few years
>>106225629Can they not make some option for premium users to view everything unfiltered? This is making it hard to hoard loras.
>>106225629lmao are you stuck in the 90s
progressives and people throwing a hissy fit over porn and sexy white women aren't christians
you have truly been mindfucked that you let marxists trick you like this
>>106225706Umm, this entire fucking payment processor crisis is caused by Christcucks.
Lefties and women are sex positive, have you seen the degeneracy they post and enjoy?
Search your feelings, you know it to be true.
>>106225520Kontext is kontext at home, though.
>>106225629It's not the christians who own the (((payment processors))) anon
This shit started way back during Biden's presidency, with (((them))) banning tons of stuff off Paypal and trying to ban OnlyFans, but for some reason backing off the latter
Now they keep banning, doesn't matter who is in the white house, this is a larger agenda
which one looks the most real?
1/3
>>106225728that's fair but this is a 1.5b model, pretty sure i could run it on my shitass phone.
>>106225725lmao okay buddy, you're a useful idiot
Collective Shout isn't Christian, they're feminazi socialist atheists.
>>106225737Hand gives it away, although not by much, just slightly off
>>106225747Hands give it away
>>106225756This wins
>>106225737this looks exactly like over edited gravure photo that japanese pump out 24/7
>>106225773
>This wins
hard disagree. they're all very obvious ai
>>106225786The question was which looks most real, 3 looks most real
>>106225798real meaning non ai generated, yes?
seeing any text garbled like that immediately makes it look not realistic. i understand what you mean but realistic also means realistic text.
to that end, the swimsuit image is far more realistic (imo)
>>106225643 You can make all the options that you want anon, the problem is that no payment processor will work with you. that's how it is right now, it's hard as fuck to monetize AI (and) nsfw stuff, there is a big fucking agenda against it
>>106225818i feel like the swimsuit image has a bad case of plastic flux face, especially the teeth
>>106225831only big corpos are allowed to monetize nsfw material right now (aka onlyfans parent companies), they are becoming a monopoly, all other attempts are getting outlawed
>>106225756great until you zoom in
>>106225831Yes, because NSFW is what the tech giants can't offer with their SAAS services, so they know this drives people to local, if people use local they have no need to pay for SAAS, and they also lose their ability to socially engineer you
They want local dead, the payment processors are owned by the same companies (like (((BlackRock))) who also own large stakes in SAAS AI companies
>>106225831The obvious mistake is these retards used Stripe for porn when there are already existing porn credit card companies, it's a rookie mistake. There's a reason why every porn site used CCBill. The reality outside of the moral bullshit from feminazis is porn/AI is likely way above normal for fraud and chargebacks.
>>106225875
>way above normal for fraud and chargebacks
I always hear that and I don't get why it's the case.
>>106225893Underaged and getting caught. Underaged steal their parents credit card or otherwise someone gets caught buying porn and refuse to admit they did. Then consider the addiction aspect especially for AI, it can get very expensive if you're paying for OnlyFans, cam girls, etc. Just consider how much fraud happened with AI Chatbot General with stealing keys and doing API usage fraud, the same stuff happens with AI and porn.
>>106225861That's Hebrew.
>>106225741What are you on about? A quick look shows the Collective Shout founder and current campaign manager are Christcucks.
>>106225924I don't always diffuse, but when I do, it's local
>>106225914Oh I see.
In that case it's still not really a problem, you can price that in taking into account these costs.
But the payment processors don't even talk about that, all they're about is "brand reputation", which makes no sense.
>>106225946Are you just making shit up? I get it you demon. Not once is Jesus or Christ mentioned on the About page. But do you know what they have in common? Being feminazi progressives.
>>106225391 that's called 飛び込み自殺, meaning "jump suicide"
how do you do inpaint with qwen?
>>106225955 I think in this case they're massive morons run by people who think it's 2017 and they're going to get #metoo canceled. But they've never learned that they've done massive brand damage with the quiet majority. Even me personally, I now have a massively negative opinion of MasterCard and I didn't even realize they were entrenched in the transactions on Steam.
>>106226000you don't. they haven't released their edit model yet.
>>106225292I hear you, but I started out with 24fps as a mistake and I kind of like it, so I just use it. Also, no lighting LoRA, I deliberately asked for orange/teal teehee!
>>106225914
>Underaged and getting caught. Underaged steal their parents credit card or otherwise
that's a tale as old as time and doesn't relate only to porn but to every platform that allows payment. it could just as well be for buying shit and/or gambling, hell, that shit has been showcased even in cartoons like the simpsons
>>106225962Expand your search beyond just their website.
>>106226013 Well, for many years the marketing department in my company was obsessed with looking good on Twitter and social media in general, mainly because terminally online recruits thought the opinions there represented the norm about the company's image.
>>106225875
>CCBill
I work with sites that use that payment processor, and oh boy it makes it hard for people to buy/pay for stuff
https://files.catbox.moe/jyyizv.zip
Anons please advise me am I tagging this shit correctly. Am I retarded? Say if I am.
>>106226039Yeah fucking retard you think Stripe works with gambling companies too? Why do you people always have dumbass takes. You do realize chargebacks are publicly available information? It takes almost no common sense to understand why someone would chargeback $15 after jerking off and why someone wouldn't chargeback a $15 grocery store purchase.
>>106226080you are very retarded. there are no txt tags in there.
>>106226102
https://files.catbox.moe/kqieyf.zip
Sorry I uploaded the wrong one.
>>106225520the benchmarks make it look like it's the same as Omnigen2, which is a pretty flawed model. but maybe it's good in ways the benchmarks don't show
>>106226125is it for illustrious based model?
>>106226129 Theoretically a ~1.5B model with modern transformers and attention should perform twice as well as SDXL, which is on a very outdated and stinky architecture.
the green cartoon frog is typing at a computer in an office. they are wearing a blue shirt and red shorts. keep the frog's expression the same.
>>106226192Looks good to me. I suggest you train two models, one with these tags and then one with added "pixel art" tag because this might help it steer towards pixel art style even without prompting for it
>>106225520stinks of gpt piss filter
>>106226209i want to hug fatty migu and encourage her to go on runs with me
>>106226213I'm totally OK with that.
Additionally, for an Illustrious model, do we know what the parameters should be all around?
This would be my first LORA and my first 10 epochs were very scuffed and blurry and completely unrelated to my concept.
>>106226244 AdamW8bit at 1e-4. 1 repeat. Tune on steps and epochs.
>>106226244 train only UNET. Loras trained on Illustrious models learn really well so you can use up to batch 4 with 4 gradient accumulation
>>106226281 this + use cosine.
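To make those settings concrete, here is a sketch of what a cosine schedule starting at 1e-4 does over a run, and how batch size and gradient accumulation change the step count. It reads "4 gradient" as 4 gradient accumulation steps, which is an assumption. The 200-image dataset and 10 epochs are hypothetical example numbers, and real trainers may count steps or add warmup differently.

```python
import math

def cosine_lr(step, total_steps, base_lr=1e-4, min_lr=0.0):
    """Cosine decay from base_lr at step 0 down to min_lr at total_steps."""
    progress = min(step, total_steps) / total_steps
    return min_lr + 0.5 * (base_lr - min_lr) * (1 + math.cos(math.pi * progress))

def total_steps(num_images, repeats, epochs, batch_size, grad_accum):
    """Optimizer steps for a run; one step consumes batch_size * grad_accum images."""
    per_epoch = math.ceil(num_images * repeats / (batch_size * grad_accum))
    return per_epoch * epochs

# Hypothetical run: 200 images, 1 repeat, 10 epochs, batch 4, 4 accumulation steps
steps = total_steps(num_images=200, repeats=1, epochs=10, batch_size=4, grad_accum=4)
print("optimizer steps:", steps)
for s in (0, steps // 2, steps):
    print(f"step {s}: lr = {cosine_lr(s, steps):.2e}")
```

With an effective batch of 16 the run above only takes 130 optimizer steps, so a lora that looks undertrained may simply need more epochs rather than a higher learning rate.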
>have spent every waking second in comfyui
this is getting bad. im glued to my chair
>>106226323
>>106226281
Thank you anons. This shit has been truly grating. I don't know if I'm a techlet but I lost a lot of hair on my head to even get to training.
>>106226333
>This shit has been truly grating
It's part of the hobby, don't worry about it. You can always retrain old loras when you know more and make them better.
I can only kind of get Chroma to work with anatomy, and if the SD1.5/XL refiner is too aggressive you lose the unique Chroma look. I like the model a lot, I just wish I could wrangle it more effectively.
What's the lowest amount of pics I can train a chroma lora on? I have datasets with about 200 pics each.
>>106226333You'll spend more time figuring out how to make datasets. Think of this like craft project. Once you get the trainer set up you'll never need to change the settings and AdamW @ 1e-4 works on every AI model for LoRAs.
>>106226372Nice, very non-ai in look
>>106226433 20 is the safest minimum but i suppose that depends heavily on the model
>>106226391I've been testing impact pack iterative upscaler. It removes slop look very well
Delete chroma from OP
Regex to filter it:
change the text from "Pokemon" to "LDG". Change the date "07.22.2025" to "08.11.25".
put a yellow sign over the woman's ass saying "whoops! can't show that on a christian website!"
good prompt comprehension.
>have to use non lightning or else details are lost
>>106226564one more:
give the woman a red skirt.
pretty good, given the odd perspective.
>>106226485Very interesting. Thanks anon, I'll run some experiments.
remove the hair on the anime girl and replace it with shoulder length, cyan color hair.
actually turned out cute.
>>106226687yeah, using this workflow:
https://www.reddit.com/r/StableDiffusion/comments/1m5wpmv/flux_kontext_psa_you_can_load_multiple_images/
if I do a single image gen I just bypass the 3 nodes for the other image.
>>106226564you can do this in ms paint though
>>106226718sure, but the skirt one is not so easy
The anime girl is facing the camera and holding a green leek vegetable.
>>106226718I can also dig a hole with my bare hands instead of using a shovel.
>>106226326Same but with forge. Got it working Friday night with sdxl. By Saturday morning I was on illustrious noobai and all kinds of fine tunes. I haven beat off so much since high school long ago, my cock might fall off
>>106225324
https://github.com/deepbeepmeep/Wan2GP
>>106226770 using ai would be like building a robot to dig with its bare hands
>>106226752okay, now we are in leek country.
>>106226770I'm not poor so using Kontext is faster than using MS Paint.
>>106226699thanks, but I don't think it's useful in my case. For example, if I want to generate a naked man with a large penis next to someone, it wouldn't work, since kontext is censored, right?
I could pre-generate an image of the naked man, but how would that be any different from just using photoshop to place them in the image?
>>106226782there are flux kontext clothes remover loras that work. you can do anything with the right tools.
>>106226326I'm tired of noodling so goddamn much I switched back to forge. comfy has been a gigantic waste of more time I could use to goon
>>106225520
>kontext at home
Do you have any eyes? trash kontext dev barely changes the image with some prompts while this shit changes the face this much when you ask to replace the hat
Just wait for qwen image edit
am i schizophrenic for using fp32 wan vae? it's only 250mb difference
>>106226801part of the fun is designing awesome workflows that give me more flexibility and control. forge is too boring for me.
>>106226819I'm sorry spending 5 hours to get the same quality images is more enjoyable for you
>>106226819so comfy IS the autists choice
>>106226819Same. For me, at this point, it feels like cooking spaghetti is like building lego, where you put the blocks together to build something novel. And then, at the end, you get a cute 1girl as a reward for your building. I don't like the troubleshooting aspect in terms of installations and stuff, when it breaks it breaks, but when stuff works even the failures are fun. It is like building a 1-click goon factory.
>>106226833sure, if all you care about is generic 1girl gens, more power to you. with my workflows I can do way more than you ever could with just using forge. there is no comparison. there is no argument.
>>106226869like doing what? using cnets, wildcards and conditioning for example are shitty in comfy. don't even get me started about the inpainting
>>106226333What might be causing this shit? That exact dataset, 1e-4, AdamW8Bit. 32/32 Alpha/rank. SNR Gamma: 5.0 I dunno where I'm going wrong.
Trained on this model. https://civitai.com/models/1856313?modelVersionId=2100907
Can you train a lora on a model with the dmd lighting lora baked in?
>>106226901 comfy killed inpainting so we could never progress past sdxl quality ever again
The anime girl is sitting on a white beach chair on a sunny beach, drinking a bottle of water.
not bad
wan2.2 is something else
https://files.catbox.moe/ebmes6.mp4
>>106226980can you ask kontext to remove yourself from reality? thanks
>>106226992im not an image file, it can't do that!
>>106227001It'll learn. I watched Terminator, I've seen the future.
>>106227001well, time to stop using it
how come wan 2.2 self forcing has way better movement? i have seen people post the opposite but it's not true. self forcing loses some details, but i cannot get the same good movement without it, and it takes 30 minutes per attempt to trial and error
>>106227007relax, go enjoy the sun
>>106227027I still post 2.1 and everyone thinks it's 2.2. I don't think anyone knows what they are talking about including you
workflow? how is your action so smooth
>>106226869>>106226901ComfyUI is good for 1girl slop too, you just need the correct custom nodes. This is Qwen, except the 1girl is automatically segmented out for inpainting with an SD1.5 coom model, so we keep the watercolor backgrounds combined with the pinup foreground. This includes a dedicated step for automatically masking out her chest and increasing the mask size to give her proportions impossible for regular Qwen. If you have the efficient loader custom node, controlnets are also very easy to work with and to turn on and off. And once it is set up, you can gen a bunch of variations no problem. It's just fun, giving me the same satisfaction as programming, just for puerile purposes.
>>106226970What's the issue with inpainting? Are people using VAE Encode for Inpainting instead of Set Latent Noise Mask?
>>106226989could've done without the shitty music, but yes wan in general is nice
>>106227055frame interp idiot
>>106227064
>ComfyUI is good for 1girl slop too
I never said it wasn't. Literally anything can do decent 1girl, even shitty online generators, so there's no point in using that as a metric for comparison.
>>106227064 if all that work is just to make your picrel, it falls under the 5 hours for the same result you'd get in forge without thinking or doing much. what exactly is the point of doing all these node gymnastics for something so bland? as for the inpainting, the interface sucks, comfy's frontend runtime is abysmal, and you increase the surface area of the workflow, which means more dragging around the workflow slideshow. the frontend is using more memory than gradio, so anything comfy said about his UI being more lightweight is now moot. enshittification is the straw that broke the camel's back
Alright, now that the nodes have settled, what is the ACTUAL best wan 2.2 workflow for high quality with light lora?
>>106227157none of the nodes settled. things still randomly break or memory leak and there isn't a fucking way to free the ram.
>>106227169From what I've seen a lot of wan 2.2 gens are much better in complex prompt following but the image quality was shit with initial workflows posted.
As someone who genned with 2.1 for 40min on 3090 for production level quality and later found a great light lora workflow I waited for another similar workflow to pop up so that it's worth switching.
https://civitai.com/models/1719863?modelVersionId=2041861
>>106227148
>what exactly is the point of doing all these node gymnastics for something so bland?
Because after you set it up once, it is automatic after that. To get the specific thing I want for my fetish, which is the Qwen watercolor background and the coomy 1girl foreground with bigger boobs, you have to both add and subtract masks (for example, wings/chest), load multiple models in sequence, etc. I don't have to manually redraw the masks every time, because they get automatically segmented, because the masks can get automatically refined against the original image (custom node) I don't have to be careful with mask drawing at all, which in turn means I can pump up the denoise on the refiner without worrying about the background, etc.
Once it is set up, the workflow just goes, and I can just click once for the dopamine hit. And I can do it for every image of this type, of a Qwen background and an SD1.5 1girl pinup foreground, without ever modifying the workflow.
>>106227255manual masking is much more efficient than seg models. you have to spend 20 seconds calculating the mask when it takes one second to make a smudge over the thing that needs fixing. forge also has the same shit. with these kinds of workflows it always needs manual tweaking all over the place for edge cases. it's rarely a workflow that requires babysitting other than the most basic shit which is done better in other uis
>>106227157the one in the wan2.2 rentry guide has been rock solid for me, 120s per gen on a 4080
>>106227293
>manual masking is much more efficient than seg models. you have to spend 20 seconds calculating the mask
That doesn't sound right at all. What are you using for segmentation? The most tweaking I do is changing the confidence value in SAM2/GroundingDino, which at least for an artlet like me is usually significantly faster than trying to follow the contour of say a high heel boot, and doing that twenty times instead of letting the Art Butler do it for me.
>>106227322you are saying this as if you get the perfect result every time. you have to reroll. comfy cache is too bullshit to do it efficiently so you have to add more nodes and start passthroughing and shit. also no you don't have to be exact with the masks retard and gradio uis have vector drawing tools as well so it's much easier if you do want that for whatever reason
>>106227390
>so you have to add more nodes and start passthroughing and shit. also no you don't have to be exact with the masks retard and gradio uis have vector drawing tools as well so it's much easier if you do want that for whatever reason
The way I have it set up:
1) SAM2/GroundingDino segmentation generates a rough mask
2) Pass the mask to a Mask Edge Detail custom node, which automatically refines the mask against the original image.
3) Pass the refined mask to a Gaussian Blur, which softens the edges slightly to ensure the inpaint is natural
4) Pass this to the Latent Noise Mask
Because the edges follow contours while also having the slight blur, the denoise value on the refiner can go higher without interfering with the background. Yes, it adds more nodes to the workflow, but I personally don't have to do any additional work to make it go, nor do I have to do a new manual inpainting job for every new 1girl. If you are efficient with manual masks, hey, more power to you, but the spaghetti strand automation makes all this a lot simpler for me.
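To show why the blur in step 3 matters, here is a toy sketch of softening a hard mask and using it to blend an inpainted result back over the original. It is plain Python on 2D lists, not ComfyUI code. The segmentation and edge-refinement steps are skipped, all function names are invented for illustration, and the blend is only a conceptual stand-in for what a noise mask does during sampling.

```python
import math

def gaussian_kernel(radius, sigma):
    """1D Gaussian weights, normalised to sum to 1."""
    weights = [math.exp(-(i * i) / (2 * sigma * sigma)) for i in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]

def blur_rows(mask, kernel):
    """Blur each row, clamping reads at the borders."""
    radius = len(kernel) // 2
    width = len(mask[0])
    return [
        [sum(k * row[min(max(x + i - radius, 0), width - 1)] for i, k in enumerate(kernel))
         for x in range(width)]
        for row in mask
    ]

def transpose(m):
    return [list(col) for col in zip(*m)]

def blur_mask(mask, radius=2, sigma=1.0):
    """Separable Gaussian blur: rows first, then columns."""
    kernel = gaussian_kernel(radius, sigma)
    return transpose(blur_rows(transpose(blur_rows(mask, kernel)), kernel))

def blend(original, inpainted, mask):
    """Per-pixel mix: mask 1.0 takes the inpainted value, 0.0 keeps the original."""
    return [
        [m * b + (1 - m) * a for a, b, m in zip(ra, rb, rm)]
        for ra, rb, rm in zip(original, inpainted, mask)
    ]

# Hard-edged 8x8 toy mask: a 4x4 square of ones in the middle
hard = [[1.0 if 2 <= x < 6 and 2 <= y < 6 else 0.0 for x in range(8)] for y in range(8)]
soft = blur_mask(hard)
print([round(v, 3) for v in soft[4]])
```

The middle of the mask stays close to 1.0 while the border ramps down gradually, so the inpainted region fades into the untouched background instead of leaving a visible seam.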
>wan2.2 rentry guide
>fp8 scaled instead of q8
Yeah, into the trash it goes
ugh bros its just one of those days where none of your gens turn out. its exhausting and my room is extremely hot
>>106227544My card that usually idles well below 40C has been running on near 55C and I'm not even genning right now.
I'm sweating like a pig.
what is the best way to generate videos locally?
>>106227598using a computer
>>106227598Depends on your VRAM, but currently Wan 2.x is the meta.
>>106227193
>kijai 2.2 workflow just werks
>this just OOMs instantly
I am mad now.
>>106224750
>Fingers that are lifted close back down around the mic
TOO UNCANNY
>genning wan 2.2 on 32gb ram
How much longer does my ssd have left with all the swapping and stalling?
>>106227732not any time less since reads off ssd are free, zoomer
>>106227620even T2I Wan2.2 is killing it, can a chroma bro genning something as fucking detailed as this ???
>>106227751Colors and skin looks default flux tier plastic compared to chroma
>>106227620i have a 4060 ti 16GB
>>106227751you should lurk more
>>106227768Have you ever taken a picture with a camera that cost more than 50 dollars?
>>106227524fp8 scaled looks better in those images. you aren't helping your arguments.
>>106227805Yes, I also have eyes and know what plastic flux tier skin looks like
>>106227524gguf implementation is shit though. slow, crash prone. fp8 is a much better starting point.
>>106227750>>106227751the good ol' 1girl face from lightx2v, I cannot unsee it, whats the point of wan with that lora when every female gen has the same face?
i dont understand this stuff bros, i just make hot dick girls
>>106227818Indeed, having less defined details and 6 fingers with a thumb missing is obviously better.
Here's another comparison to see which quant is closer to actual fp16 output.
Any more cognitive dissonance cope for your cucklet quant for this one? I guess the originally trained model at full precision is the wrong one.
>>106227751>>106227768the woman in that pic might look fake and plastic, but real women look that way. you have to admit chroma can't render fingers/toes/anatomy that clean
what happened to genjam? isn't someone gonna do it?
>>106225105Howz it goin' Fat-tsune Miniku?
>>106227828no it's not. i haven't crashed a single time out of the 1000+ wan gens I've done. 'slow' is too vague to even make a counterpoint for. you have no idea what you're talking about.
Not as much anime posted here these days i can see.
>>106227857No, real women don't look like flux generations.
>you have to admit chroma can't render fingers/toes/anatomy that clean
Sure, but it still does it well enough and fingers and hands don't matter if it still has the plastic skin.
>>106227868ani won't continue it. what a jackass
>>106227857not only the girl, the ground, the paint, the chair... texture of flipflop....
>>106227909Wait, ani was genjam anon? Proof?
>>106224812lora trainer anon here
I used diffusion-pipe
>>106226923Bros...
captcha:sssvy
>>106227951
>she's in space, but also somehow flying over an ocean
>>106227909Hahahahaha so it was trani??? That's why it datamined users gmails.
schizo anon warned us about trani trying to catch "schizo anon" through google forms
>>106227898
>real women don't look like flux generations
not all of them do, but the sad truth is that many women are basically just IRL flux gens, complete with LLM slop for conversation
>>106227920my bad. ani just said he didn't want to do it after the genjam anon gave up on it because of
>>106227976 schizo
see:
>>106182628>>106182642>>106182661>>106182679>>106182703
>>106227951
>woman struggles to parallel park even in space
>>106227898
>Sure, but it still does it well enough and fingers and hands don't matter if it still has the plastic skin.
nta you replied to, but to each their own, for me the horror chroma anatomy destroys immersion way more than what you call "plastic skin" but is actually a shot you'd get from a professional DSLR as opposed to an old nokia smartphone
can we please stop talking about people who don't exist and don't matter? jesus fuck you are all insufferable. can't wait to get called out as the infamous xyz shizo for just wanting this shit to stay away from these generals.
Guys I need a guide on how to use comfyui
Maybe this space needs to have more artists
Even /adt/ has nicer gens like
>>106227731
>>106228005
> from a professional DSLR as opposed to an old nokia smartphone
Do you really think chroma boiz know what a dslr is? they don't even know what a toe is
>>106227751share prompt, I want to see how funny the chroma gen for this is.
is this wan2.2 face considered plastic ?
>>106228073yeah are you blind? she looks like she's from Death Stranding
>>106228071https://files.catbox.moe/ygf3w5.png
can't wait for melted feet :"3
>>106228073maybe, but the scalp hair looks like doll hair
>>106228073I don't know... can you generate a different face? kek
can I see a non plastic face chroma's users are talking about ? I can't gening rn...
>>106227999it didn't take a brain to know this was forced as shit
Is there a flux 4step lightning lora anywhere? I have 8 step but want to go faster
What image model would work best for training a LoRA on a 4080 super? I don't have a strong preference, I'd be fine with Chroma, Wan or Qwen.
>>106227751Feet slurp slurp
fuck amd, fuck rocm, and fuck pytorch, i trained flux loras just fine on my 16gb card but now can't get kohya scripts to work
i bet the solution is to switch to some year old rocm version, 6.1 maybe
>>106228191I've trained wan loras on a 12gb 3060, image only of course
I don't remember how or what settings I used, I just know I did because I had no other choice, and the character loras had very good likeness
musubi or sd-scripts don't support chroma so I've never tried, and qwen is extremely hungry and maxes out my 24gb even on 512x512
>>106228073Until your gens look like this, the answer will always be "yes"
what's the time complexity of the collage algorithm?
>>106228311what do you mean?
>>106227951Fruitiger Atmos
>>106228073Yes but it's not bad.
Chroma 49
>A portrait photo of her face centered in the frame. The background is a plain white wall.
>>106228311It's just a rectangle box sort
Having some problems with the I2V Wan 2.2 workflow in the OP. First I had to disable "merge loras" on the lora loader nodes, otherwise I couldn't gen. The gens I am getting all look like they have this weird red / orange aura around them. Anyone have any idea what the problem could be? I didn't change anything else.
>>106228334I mean, is it hard to fit different aspect ratios into a rectangle? I guess it does it greedily and crops the rest.
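Nobody in the thread posted the actual collage script, so what follows is only a guess at what "rectangle box sort" plus "greedily" could look like: a shelf packer that sorts images by height and fills rows left to right. The function name, the sizes, and the canvas width are all hypothetical, and it does no cropping. Under that assumption the sort dominates, so the cost is O(n log n) for n images.

```python
def shelf_pack(sizes, canvas_width):
    """Greedy shelf packing (assumed approach, not the real collage script).

    sizes: list of (width, height) tuples.
    Returns (placements, total_height), where placements[i] is the
    (x, y, width, height) box of sizes[i] on the canvas.
    """
    # Tallest first, so each shelf's height is set by its first image.
    order = sorted(range(len(sizes)), key=lambda i: sizes[i][1], reverse=True)
    placements = [None] * len(sizes)
    x = 0
    y = 0
    shelf_height = 0
    for i in order:
        w, h = sizes[i]
        if w > canvas_width:
            raise ValueError("image is wider than the canvas")
        if x + w > canvas_width:
            # Current shelf is full: start a new one below it.
            y += shelf_height
            x = 0
            shelf_height = 0
        placements[i] = (x, y, w, h)
        x += w
        shelf_height = max(shelf_height, h)
    return placements, y + shelf_height

# Hypothetical thumbnails with mixed aspect ratios on an 8-unit-wide canvas.
boxes, height = shelf_pack([(4, 3), (2, 5), (3, 3), (5, 2)], canvas_width=8)
print(boxes)   # [(2, 0, 4, 3), (0, 0, 2, 5), (0, 5, 3, 3), (3, 5, 5, 2)]
print(height)  # 8
```

Optimal rectangle packing is a much harder problem, which is why a greedy pass like this (and cropping whatever doesn't fit) is the usual shortcut.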
>>106228294Looks like a sculpture. A real woman has more mustache and would cover that shitty skin.
>>106228359that looks like either fucked CFG settings or you're doing too many steps on high noise.
make sure you do only the first half of the steps (for example, 0 to 4 out of 8) on high, and then the second half (4 to 8) on low
and if you're using the lightning lora, use CFG 1 for everything but the first step (even on just the first step, a higher cfg will sometimes result in what you pictured)
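A minimal sketch of that split as plain arithmetic, in case it helps when wiring up two samplers. The function names, the 0.5 split fraction, and the 2.0 first-step CFG are illustrative assumptions, not values taken from any workflow; ranges are (start, end) with the end exclusive.

```python
def split_steps(total_steps, high_fraction=0.5):
    """Return ((high_start, high_end), (low_start, low_end)), end exclusive."""
    if total_steps < 2:
        raise ValueError("need at least 2 steps to split between two models")
    # Round down, then make sure each model gets at least one step.
    switch = int(total_steps * high_fraction)
    switch = min(max(switch, 1), total_steps - 1)
    return (0, switch), (switch, total_steps)

def cfg_schedule(total_steps, first_step_cfg=1.0, rest_cfg=1.0):
    """Per-step CFG list: an optional bump on step 0, rest_cfg everywhere else."""
    return [first_step_cfg if step == 0 else rest_cfg
            for step in range(total_steps)]

high, low = split_steps(8)
print(high, low)  # (0, 4) (4, 8)

# Hypothetical: bump CFG on the first step only, CFG 1 afterwards.
print(cfg_schedule(8, first_step_cfg=2.0))
```

If the high noise sampler ends up covering the whole range, (0, 8) instead of (0, 4), that matches the "too many steps on high noise" failure described above.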
https://www.reddit.com/r/StableDiffusion/comments/1mnbqxv/introducing_a_comfyui_ksampler_mod_for_wan_22_moe/
Supposed optimal auto switching to low noise
https://github.com/stduhpf/ComfyUI-WanMoeKSampler/tree/master
>>106226769I'll have to try it and play around later to see how it is, thanks anon.
Where are the Chroma finetunes? Or even the loras?
>>106228390np, at the very least it's the most stable and werks on 8gb
how long has comfy been bleeding users? seems to be a really common occurrence nowadays
>>106228294>>106228342personally i think wan does better skin than both of those pictures (picrel for example)
but i'm not here to shit on chroma, i use it sometimes too, it does shine at some things like subject variety, and nsfw, and text (at least compared to wan, not qwen. but qwen is very slopped)
>>106228424
>how long has comfy been bleeding users?
If by bleeding you mean gaining, then for a long time, and it will continue to gain them for a long time, given that the modular workflow design is the only one that can keep up with all the advancements people want to have right away.
It's still the only way to have actual control over video gen.
>>106228389Interesting. I'll give this a shot later.
>>106228393
> finetunes
motherfucker the model has been out like 2 days jesus
>>106228424I've noticed a particular anon always makes these vague posts trying to undermine the success of Comfy. It's not working. Comfy is not bleeding users. It's more popular than ever.
Get some help.
>>106228393A few were posted by anon if you check the archives. 2000's photo, game screen, etc.
>>106228393T-they are in... *cums in his mouth*
>>106228490game screen? that sounds cool.
>>106228424Aren't they all dependent on Comfy's backend? So it's just the llama.cpp situation for image models.
what's the proper name of the swords pirates used to use?
>>106228475Comfy is the least bad option, it's not going to be the UI we're using because Comfy basically phones it in and hates actually making real change. It's been a year, still no factory nodes, still no basic UX like grouping variables to make generating easier.
why doesnt the OP have any tutorial, guides or rentry for using forge with sdxl?
>>106228546wait no, cutlass
>>106228544Because it's the most simple thing to do
>>106228563Not if you're new and don't know where everything is or how to even use those tools. There's 0 guides
>>106228519if that was the case diffusers wouldn't exist
>>106228375Not sure what the problem was. I went to Kijai's repo and grabbed his workflow from there and it worked. Looks like the only differences were:
- Split step was utilized (in the other version it wasn't).
- 20 blocks swapped instead of 30.
- 1 vace block swapped instead of 0.
>>106228538I still can't believe we've talked about this many times and people keep refusing to address a simple fact. ComfyUI's main strength is model support and its backend to run diffusion models. If there is no viable alternative backend that can support all the new models, then it will continue to be popular even if you use a different UI because people want to try new shit and almost every single UI out there uses ComfyUI's backend code in some form.
>>106228544
>>106228576
>download program
>download model from civitai
>place model in models folder
>start program
>type 1girl large breasts
>hit big orange button
>>106228589It is its main strength but only because we have many models coming out. But now that things are calming down and we're getting basically only one or two flagship models, it's going to make sense to start making UIs only for those flagship models and something like Comfy will be much less useful and as I already said, it's only tolerated because it's the least bad, not because it's good. Wan2GP is a taste of where we're going.
>>106228588
>- Split step was utilized (in the other version it wasn't).
i remember when i started i grabbed some workflow where this wasn't properly connected and i ended up doing the full step length on the high noise instead of just half and it got deep fried like you pictured
>>106228544What do you even mean? Provide more information. Have you installed forge? Can you use a computer beyond the graphical UI?
>>106228624
>But now that things are calming down
Lmao
>>106228644>>106228600Nevermind, Using Comfy. Much more helpful community.
>nerfed to 14k cuda cores (8k less than a 5090)
>4k price tag
who are these niggas trying to fool with this?
>>106228624Comfy needs better memory management
>>106228650They are, retard. Wan, Hyvid, Flux Kontext, and Qwen Image in the last year.
>>106228658Nta but its all the same community, everyone here is either using comfyui or comfyui and everything else they need including forge
>>106228557indeed.
god i love playing with models, that fact it understands "shitty mspaint fanart" (not pictured) is amazing.
>>106228589ok but all the latest models are slow, have optional snake oils that crash everything and comfy's dev doesn't actually give a shit about the front end. the backend is a mess because they cemented shittily defined nodes in there with dumbass names like having SD3 in the name or separate checkpoint loaders instead of universal ones. The more the shit piles up the more memory it uses for junk, the front end runs slower and dependencies will constantly conflict because of retarded custom node devs
>>106228666And I should say, what made Comfy useful is there were many many new, real tech being added with SD 1.5 and SDXL. Now that the software and basic architecture has settled and you don't need to refactor every two months any more, it makes sense to contract back to user usage.
>>106228666Wan 2.1 workflows were being optimized until wan 2.2 came out with v2 lightx2v and wan 2.2 still doesnt have a proper non-cope gguf workflow that can match 2.1
New speed loras, samplers, tools etc all need special tooling that need special UI design for frontends like forge which won't happen unless months and months pass and a "proper" way to do it is locked in.
We still don't have good t2i edit workflows despite having kontext even with comfy because kontext is a mid model and we are waiting for qwen image which isn't even out.
Nothing settled down nor will it for years to come