Discussion of Free and Open Source Text-to-Image/Video Models
Prev: >>106197528
https://rentry.org/ldg-lazy-getting-started-guide
>UI
SwarmUI: https://github.com/mcmonkeyprojects/SwarmUI
re/Forge/Classic: https://rentry.org/ldg-lazy-getting-started-guide#reforgeclassic
SD.Next: https://github.com/vladmandic/sdnext
ComfyUI: https://github.com/comfyanonymous/ComfyUI
Wan2GP: https://github.com/deepbeepmeep/Wan2GP
>Checkpoints, LoRAs, Upscalers, & Workflows
https://civitai.com
https://tensor.art
https://openmodeldb.info
https://openart.ai/workflows
>Tuning
https://github.com/spacepxl/demystifying-sd-finetuning
https://github.com/Nerogar/OneTrainer
https://github.com/kohya-ss/sd-scripts/tree/sd3
https://github.com/derrian-distro/LoRA_Easy_Training_Scripts
https://github.com/tdrussell/diffusion-pipe
>WanX
https://github.com/Wan-Video
2.2 Guide: https://rentry.org/wan22ldgguide
https://alidocs.dingtalk.com/i/nodes/EpGBa2Lm8aZxe5myC99MelA2WgN7R35y
>Chroma
https://huggingface.co/lodestones/Chroma1-Base/tree/main
Training: https://rentry.org/mvu52t46
>Illustrious
1girl and Beyond: https://rentry.org/comfyui_guide_1girl
Tag Explorer: https://tagexplorer.github.io/
>Misc
Local Model Meta: https://rentry.org/localmodelsmeta
Share Metadata: https://catbox.moe | https://litterbox.catbox.moe/
Img2Prompt: https://huggingface.co/spaces/fancyfeast/joy-caption-beta-one
Samplers: https://stable-diffusion-art.com/samplers/
Txt2Img Plugin: https://github.com/Acly/krita-ai-diffusion
Archive: https://rentry.org/sdg-link
Bakery: https://rentry.org/ldgcollage
>Neighbours
https://rentry.org/ldg-lazy-getting-started-guide#rentry-from-other-boards
>>>/aco/csdg
>>>/b/degen
>>>/b/realistic+parody
>>>/gif/vdg
>>>/d/ddg
>>>/e/edg
>>>/h/hdg
>>>/trash/slop
>>>/vt/vtai
>>>/u/udg
>Local Text
>>>/g/lmg
>Maintain Thread Quality
https://rentry.org/debo
Any software to let me search images by generation parameters like https://github.com/zanllp/sd-webui-infinite-image-browsing but one that works with ComfyUI images?
Is python a good language to learn if I'd want to use it also outside of AI shit?
>>106201820
qwen struggling a bit here lol...
>>106201478
>seems like you have real fp8 support
interestingly, little to no info confirming that this is supported on the XTX shows up on google
>>106201515
>find fp8 scaled instead of just fp8
Has anyone even uploaded a scaled fp8 for qwen yet? all I can find is these:
https://huggingface.co/Comfy-Org/Qwen-Image_ComfyUI/blob/main/split_files/diffusion_models/qwen_image_fp8_e4m3fn.safetensors
https://huggingface.co/Comfy-Org/Qwen-Image_ComfyUI/blob/main/non_official/diffusion_models/qwen_image_distill_full_fp8_e4m3fn.safetensors
>>106201882
Those seem to be it, yes
Blessed thread of frenship
>>106201793
it's a lang for prototyping and automating other applications' functions. if you are just toying around then the toy lang fits. making anything bigger than some one-trick pony ends up in dependency hell and also becomes unstable (see: comfyui)
>>106201897*hugs you*
Fuck the haters
after actually reading https://docs.comfy.org/tutorials/image/qwen/qwen-image, yeah the distilled version is OK at 15 steps and 4 CFG (pic), good for iteration at least. but it looks hideous at 1 CFG, why do model bakers suggest things without even testing them?
>>106201890
neither of those are scaled AFAICT.
comfy should be dragged out on the street and shot
>>106202007
>neither of those are scaled AFAICT.
e4m3fn should be the scaled format
>>106202007
non distilled at 20 steps 4 CFG. clear improvement in the text, I should have been using this one for text-heavy prompts. fuck GGUF, what a broken piece of shit that is
>>106202030
Ah OK, I think you're right. I saw some search results where people were spreading incorrect info about this. we're good then
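Filenames don't really settle this. `e4m3fn` names the FP8 number format itself (4 exponent bits, 3 mantissa bits, finite-only), so a file can be e4m3fn whether or not it is "scaled". A rough way to check is to read the safetensors header, which is plain JSON, and look at the dtypes and key names. Minimal sketch, stdlib only. The idea that a scaled checkpoint carries extra tensors with "scale" in their names is an assumption, and `parse_header`, `read_header` and `summarize` are made-up helper names.

```python
import json
import struct


def parse_header(raw):
    """Parse the JSON header from the leading bytes of a .safetensors file."""
    # The first 8 bytes are a little-endian u64 giving the header size in bytes.
    (header_len,) = struct.unpack("<Q", raw[:8])
    return json.loads(raw[8:8 + header_len])


def read_header(path):
    """Read only the header from disk; no tensor data is loaded."""
    with open(path, "rb") as f:
        prefix = f.read(8)
        (header_len,) = struct.unpack("<Q", prefix)
        return parse_header(prefix + f.read(header_len))


def summarize(header):
    """Count tensors per dtype and collect key names that mention 'scale'."""
    dtypes = {}
    scale_keys = []
    for name, info in header.items():
        if name == "__metadata__":  # optional metadata entry, not a tensor
            continue
        dtypes[info["dtype"]] = dtypes.get(info["dtype"], 0) + 1
        if "scale" in name.lower():  # assumption: scaled files expose scale tensors
            scale_keys.append(name)
    return dtypes, scale_keys
```

Point `read_header` at either of the files linked above and pass the result to `summarize`. If the weights show up as `F8_E4M3` and no scale-like keys are listed, it is probably plain fp8.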
>>106202026
After reinstalling everything 3 times and everything still being broken I'm starting to understand this grudge
>>106201767 (OP)another shit bake. last one was so bad, it got deleted. kek
if you have 24GB VRAM why do people use the smaller versions of the text encoder instead of the biggest one?
doesn't the text encoding stage take place first, so it can then be unloaded from vram?
>>106202026Mildly baste
>>106202074
Yeah I had the same issues recently. Had to do a completely fresh comfy install but with a slightly older version: https://github.com/comfyanonymous/ComfyUI/releases/tag/v0.3.44
>>106202166
I thought fp16 was the biggest umt actually, did anyone compare 16 vs 32?
>doesn't the text encoding stage take place first, so it can then be unloaded from vram?
Correct, you should use the biggest one that fits.
>>106202198
I literally never have issues and pull master branch every other day. Do you not clone from the repo?
>>106202166
fp16 is way faster than fp32 at a tiny loss in quality
>>106202166
Normally I'd agree but it's so minuscule that like what's even the point?
are there any guides on making wan workflows? i opened a workflow and i have no idea what is going on
>>106202198
>I literally never have issues and pull master branch every other day. Do you not clone from the repo?
Been always using update via .bat and updating requirements if there's any problem. Everything else except wan works.
>>106202166
gguf text encoders?
i thought text encoders were loaded into ram not vram? why would text need the gpu
>>106202339
Are your custom_nodes also up to date? Those are usually the cause of breakage. Kijai is really fast about fixing issues (and basically everything) if you're using ComfyUI-WanVideoWrapper
>>106202459
>Are your custom_nodes also up to date?
Yeah, reinstalled and updated several times. It just OOMs during the high/low noise model change or when it goes through the VAE. The early July version worked perfectly.
Is new pytorch an upgrade for wan? I got like two seconds in a minute long gen with chroma
>>106202531Now an actual battle axe, and not an axe you can buy in a store
>>106202026>>106202074works on my machine. I bet you're a windows user
>>106202198
Good typography here.
>>106202814>t. doesn't use bleeding edge snake oils
>>106202814>I bet you're a windows userYeah, win11. It's like staring at gaping asshole.
>>106202613
Try --disable-mmap because that was added recently. I'm just a guy and not comfy so ymmv
>>106202830seed 0 gives and takes
Is stable diffusion webUI obsolete?
So, is v49 better than v50?
>v48 won the poll
I'm confused now
in my testing, chroma 49 seems to be the clear winner for 1MP resolutions, even compared to older chroma (detailed or non detailed) versions. I haven't seen a comparison yet where it's not superior.
which version were people saying should be genned at higher res? does anyone have some good resolution lists for different aspect ratios at that res?
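A resolution list for any target size can be computed rather than looked up. The sketch below keeps the pixel count of a square `side` x `side` image and snaps each dimension to a multiple of 64. The multiple of 64 and the ratio list are assumptions, not something any model card in this thread states, and `resolutions` is a made-up helper name.

```python
import math


def resolutions(side=1152, ratios=((1, 1), (4, 3), (3, 2), (16, 9)), multiple=64):
    """Return {ratio: (width, height)} with roughly side*side pixels each."""
    area = side * side

    def snap(value):
        # Round to the nearest multiple, never below one multiple.
        return max(multiple, round(value / multiple) * multiple)

    table = {}
    for rw, rh in ratios:
        width = math.sqrt(area * rw / rh)   # width/height == rw/rh
        height = math.sqrt(area * rh / rw)  # width*height == area
        table[f"{rw}:{rh}"] = (snap(width), snap(height))
    return table


if __name__ == "__main__":
    for ratio, (w, h) in resolutions(1152).items():
        print(ratio, f"{w}x{h}", "landscape /", f"{h}x{w}", "portrait")
```

For example `resolutions(1152)["1:1"]` gives `(1152, 1152)` and `resolutions(1024)["16:9"]` gives `(1344, 768)`. Swap width and height for portrait.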
>>106202836
everything we're using is bleeding edge. it's easy to forget this because you're so used to it, but even the most basic local AI setup is like black magic alchemy to 99.9% of the population. and the devs for this shit don't know what's going on half the time.
>>106202897
v49 and v50 are a mess because the creator took shortcuts at high res training. low funds so he messed with the dataset. he ran a subset of the dataset for 1 epoch instead of the rumoured 2 epochs at full dataset.
>>106202993can someone message the furry so he'll use EQ VAE to get a 7x speedup and quality improvement?? why is he messing with this other stuff instead of a proven method like that?
I was using the wrong CFG before. Oops.
>>106202962
>>106198671
this guy is simply wrong and/or his "improvement" only applies to a very narrow usecase/style. both v50 and v49 fall apart at hands and text when using a 1.125x resolution like that, also losing style and shading. pic related.
v49 at 1MP res is the best chroma setup overall for base gens, and it's very clear when you test a few different styles and complex prompts.
>>106203145
Can you provide a catbox? I'd like to try around with that prompt a bit, looks like a very decent test case for further tests.
If you'd be alright, I would perhaps replace one of my plot prompts with yours. Hands/Text/Composition would be very nice for my plots, I think.
mad men or moving, they already got blocked here... you fucking hurt my family i will gut you like a fish with my fucking bare hands!
>>106203017
Can you link the eq vae for chroma?
The Flux one from https://huggingface.co/Anzhc/MS-LC-EQ-D-VR_VAE/tree/main results in noise artifacts
>>106203017
>why is he messing with this other stuff instead of a proven method like that?
because he has the money and he is surrounded by yes men who know shit about the tech. I think it's over dude. he has moved on to making an RL.
it is reeeeeeeeeeeear trash stop potinh ty.k
>>106203203
this stuff isn't plug and play, he would need to train chroma to adapt to it first
and even then it's not going to magically make your gens prettier, it's to speed up convergence of the diffusion model
we won more cope is always welcome,
but you self centered mnd!!!! ;0 mong#mind
you are here forever...
What happened to the WAN videos? Why are we back to the SD1.5 trash? Do the VRamGODS go out on weekends and only the VRamlets stay behind cope-genning in their basement?
the ai is so advanced that the people who use it are not fucking thick...
>>106203066That image just reminded me of this image I have.
https://files.catbox.moe/qjws2v.jpg
Happening?
https://huggingface.co/lightx2v/Qwen-Image-Lightning/tree/main
>>106203066Are you going to animate it?
then why don you fix it then you fucking retard? been here all summer! loli wish, yes loli fucking kys fagggots!
there is only every one result and that is death.
loli is a distraction understand that.
>>106203188positive:
Masterpiece extremely dark French Neoclassical orientalist oil painting by Jean-Auguste-Dominique Ingres with visible brushstrokes Tenebrism and Craquelure.
An evil fat old wizard wearing a filthy ornate wizard robe with a dark hood with very shadowy face and glowing eyes. He's standing over a table holding his hand out over the table ritualistically, with dark red magic trails coming from his hands. The table has a glowing blood pentagram on it with candles. The table has a human skull on the corner. floating dancing in the middle of the pentagram on the table is a small tiny mini fairy-size mature succubus woman summoned from hell. The succubus is a demonic black haired 25 year old succubus devil woman with gigantic breasts, wide hips, bat wings and a devil tail, covered by skimpy black leather clothing. She's smiling alluringly at the viewer. extreme closeup on devil woman. the table has the word "1GIRL" inscribed on it.
The background is a dungeon cave full of sharp rocks and bats and is extremely dark, shadowy, smoky, and sinister.
negative:
furry, e621, nipples, plastic, shiny skin, cgi, bad anatomy, missing limbs, extra limbs, missing fingers, extra fingers, low resolution, framed, low quality
>>106203203
>>106203216
after looking into this further, an EQ VAE is not compatible with models that weren't trained to use it. IDK how easily the furry could implement it now. though here he says he's doing some other VAE experiment:
https://xcancel.com/LodestoneE621/status/1952176708972601720#m
>>106203265
yeah you're right. IDK about the prettier gens part though, doesn't EQ VAE help unscramble hands?
just cast frost nova, bro
>>106203333it's europoor hours
>>106203411>distilled Qwen
>>106203066you brain is wired on one thing.
chroma isn't your life anon do not gloat with it like an idiot.
the model was a completely success of coarse.
only muh fucking haters are making noise.
>>106203512>muh ignore, muh self interpretation of conflict, how will i ever cope with people that ignore me?
any way to get nsfw LORAs for video gens? i'm using wan2.2 if it makes any difference
>>106203533works mostly but you might need to play with it so consider 2.00 strength, but here is the deal this shit is still new as fuck and some are getting good result some are getting bad. But over all i felt that wan 2.2 is better are porn you just got to do it right.
>>106203533you have to use 2 samplers for wan 2.2
if that is too much then go away, just go away... yeah fucking wait till you learn its 2 models...
>>106203411
How is it just 1.7GB, is it just a lora for the actual Qwen Image?
>>106203488even better
97 seconds at this res and with kijai's 2.2 workflow + his i2v 2.2 lora (lightning), 6 steps (3/3).
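The "3/3" split is just one step schedule cut in two: the first part goes to the high noise model, the second to the low noise one. Tiny sketch of that bookkeeping. `split_steps` is a hypothetical helper, and mapping the two ranges onto the start/end step inputs of two advanced sampler nodes is an assumption about how the workflow is wired.

```python
def split_steps(total_steps, high_noise_steps):
    """Return ((start, end), (start, end)) for the high and low noise passes."""
    if not 0 < high_noise_steps < total_steps:
        raise ValueError("high_noise_steps must be between 1 and total_steps - 1")
    high = (0, high_noise_steps)            # first sampler, high noise model
    low = (high_noise_steps, total_steps)   # second sampler, low noise model
    return high, low


if __name__ == "__main__":
    high, low = split_steps(6, 3)
    print("high noise: steps", high[0], "to", high[1])
    print("low noise:  steps", low[0], "to", low[1])
```

Both passes share the same total step count; only the start and end indices differ, so the second sampler picks up where the first one stopped.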
>>106203333Can't gen to test out different Chroma versions and Wan at the same time
Chroma images are not so noisy anymore at res higher than 1024 bros! They look really clean.
are there local models for lip syncing? i want to gen a video of a girl. gen a voice for her, then make her say it with lip sync
chroma 50 is amazing and thank you for this amazing model. everything that was done made this model perfect. ignore the retards.
Gohei too big to be smashed completely on her head, but that's surprisingly canon
>>106203546Very nice , i've animated it for you
chroma 50 is completely uncensored and produces realistic images, why are people bashed it?
>>106203145
>this guy is simply wrong and/or his "improvement" only applies to a very narrow usecase/style.
Photorealism is not a narrow use case. I mean, test it out for the purpose yourself anon. Most users, including myself, noticed a direct downgrade. There's a noticeable fuzziness to the gens. This downgrade may not apply to artsy or anime gens though.
qwen lets me successfully put a bunny girl inside the shop window display. flux can't do it, krea can't do it, hidream can't do it, and chroma can barely do it. the prompt adherence and understanding seem pretty good. once flux redux equivalent is out for style transfer, I think qwen would be the meta and it'll be bye bye for flux
>>106201367
>7900 xtx
Oh, that might be your problem more than anything else. I have a 4090 (Strix OC) and the performance hit is almost imperceptible with GGUF. Qwen Q8 also fits nicely into VRAM with room to spare.
>>106203462
>doesn't EQ VAE help unscramble hands?
My current Jenny LoRA used EQ VAE during training and I haven't noticed any less shitty hands during inference, but I have noticed that there's less distortion in the likeness when the head/face is at extreme angles now. Leaning far over or even upside down no longer breaks the likeness.
>>106203679it fucking slow yes, it i but some of us do not care we just want quality an a model that does exactly what we tell it.
most flux loras work with it good luck get them to work with base flux.
not even ai is powerful enough to make women shut the hell up
I am really enjoying Qwen, but having to run it through a 1.5 step to make the girls have huge honkers is kind of a downer.
>>106203690Qwen with style transfer and/or controlnets would be THE meta, as far as I am concerned.
>>106203690
>Bragging about a skill issue
>The gen looks like plastic, not even the actual memeable potentially you'd get out of Chroma when your skill issue is solved.
I have prompted and shared similar images with Chroma before, it's not hard anon.
>>106203690I think it got a little confused with her hand
>>106203656Isn't it just Wan's general avoidance of hitting people ?
>>106203736I don't listen to no gen
So, has anyone successfully trained a Qwenimg lora with 24gb~48gb vram yet?
okay. there we go, he finally used frost nova.
>>106203674Cool, I like how it came out.
>>106203736NTA but over the past few years I have genuinely developed a fetish for plastic AI girls. It is like a specific subset of bimbo style uncanny valley, equivalent to when robot girls in anime have unnatural movements.
>>106203563>>106203579like this, right?
but the base lora thinks of a penis like it's a log of shit, that's why i'm asking
this scaled chroma is not bad. it's not faster, but it's nice to have more free VRAM and less threat of OOM due to other processes. will come in handy when we get chroma controlnets too.
>>106203689
I'm not disagreeing there's a downgrade in v50. I just think it should be avoided altogether. 49 or maybe older versions are the best ones rn.
>>106203640
Nah, it knows many concepts for sure but the gens that it gives out don't really seem worth waiting a minute for despite what the shills say. There's still issues with mangled hands and anatomy and weird artifacts that become noticeable once you've used it enough. It still feels like an unfinished product imo.
>>106203690and it's less censored than flux
>>106201773
https://github.com/RupertAvery/DiffusionToolkit
>>106201773
>>106203901
>but one that works with ComfyUI images?
oh nvm
>>106203411
>try this out
>lora key not loaded
>look at key names inside safetensors
>it's not ComfyUI format
>it's not Diffusers format either
How fucking hard is it to release a lora in a normal format that works with the inference programs people actually use? There's no excuse at this point. Absolutely insane that people keep inventing new formats and then relying on others to convert it or complaining to comfy to add support.
>>106203869
It's a base model, like SDXL and Flux which only took off when they got finetunes and/or loras. lodestone has stated many times that Chroma is a base model meant for further finetuning, primarily loras since that's the most efficient way.
if the mouth is closed on the first frame, then it seems like it's about a 50/50 chance whether they yap or not.
if it's open on the first frame then they are almost guaranteed to yap.
>>106203793
>>106203690
This is my first try anon, I'm not even trying.
https://files.catbox.moe/10nnad.png
Also worth noting I can make her naked and any number of poses that got censored out of Qwen's dataset.
>>106203986
the hard part is to overlay the street background to the character, which chroma can barely do
>>106203980
you want to put yapping in negative prompt, but you'll need to set up "WanVideoNAG" and "WanVideo Apply NAG" nodes
>>106204022
negatives do not work bro I tried every token I could think of
>>106204038
Have you tried negs in Chinese?
Any working T2I wan 2.2 workflow with the wrapper?
If I just use 1 frame the gens are all fucked up with mutations and the loras don't work.
>>106204038
yeah negative prompt doesn't work without NAG nodes. Or are you saying you're using NAG nodes and it still doesn't work?
>>106204022
i guess i'll try those nodes, but the default node doesn't work very well
>>106204069
negs work without the distill. we already did tests when wan first came out with the vanilla weights
>>106203969
NTA but I hope you're right. It's just that it's not a great start to the whole chroma ecosystem when there is not even a clear consensus which model should be considered base. 48, 49 or 50. Good luck everyone.
>>106204111
Yeah I have to agree with that, the release could have been done much better.
As for 48, 49, 50, that is only a problem because all are available, actually ALL epochs are available, meaning people will try them and make a subjective call on which works best for their prompts.
Typically you only get one release, that's it, it might not be the best checkpoint for your needs, but you will never know because it's the only one available. There are benefits and negatives to both approaches.
>>106203986
Nah, the hard part is the photoreal look. Without it what you have is slop, and any model can produce slop, which is not impressive. What you posted doesn't look realistic at all. It wouldn't fool anyone. Let me know when Qwen can do this and have it look this clean. I'd be delighted if a LoRA were enough. It likely won't be able to generalize the style without a finetune, and Chroma took a lot of work.
>>106203986>>106204003seems like anon got a gimp moment here lol
>>106204111
The biggest mistake Lodestone made was doing experiments during the continual pre-training run, instead of saving the extra compute he was given for afterwards.
Idk why he didn't just keep things as they were until version 50, then decide what to do after. This community never learns, and it's just exhausting at this point baka.
Now we're stuck with Flux with a horrible license and censorship, Wan which is great but stuck with T5, and Qwen Image which is fantastic at following prompts, but post trained on slop which limits its flexibility.
>>106203855mfw when your face is my face.
play with strengths and also you should be using the high noise then the low noise, and it's fucking gay but it's what works.
>>106204227>>106204227Auch
you you should understand is is by design! meaning a fucking ape can't use it.
>>106204228reminiscent of those stock images showing bizarre things like using a cordless drill on a PSU, or holding a soldering iron by the iron
So what is the annealed for?
>>106203333>"there can be only wan"
>>106204261it's his own personal jeetmix he made available as well. mostly a nothingburger
musubi's double training works with no offload but ooms with offloading enabled
explain that
>>106204261
It was just an experiment, someone in the discord said they got better results with it, YMMV.
>>106203690
>>106203986
>>106204003
here's a similar prompt.
chroma really does struggle to put her behind the glass, even when she's behind it she destroys the reflections. lots of bad hand results too from every version I tried.
I like chroma but we're at v50 and hands and text still aren't consistent. its understanding of complex multilayered, multisubject prompts has improved significantly. but I also have started to find I have to rework old prompts when I'm testing them on the new releases. that really shouldn't happen if prompt comprehension is improving across the board, I'm still engineering prompts to avoid triggering chroma's weird biases and meltdowns.
every time we get a new model like a chroma release or qwen I try it out. but they're still very incomplete compared to SDXL anime models lmao. we probably have another year or two before we get a SOTA model that can do these things:
>styles
>artist names
>tags instead of boomerprompting
>hands
>bonus (difficulty: impossible): it's not bloated and slow
hilarious how SAAS models other than MJ and NovelAI have the same problems too.
>>106204261do not want.
i do not fucking know man. seems you have a sore ass?
>>106204285
>double training
How does that work? Does it spit out separate high noise and low noise loras? Or just one you apply to both?
learning how to prompt is more tedious than getting this shit set up in the first place
>>106204357Learning how to prompt is the fun part
>>106204349
the latter I'm pretty sure, it only spits out one model
there's all kinds of new parameters I'm trying to work out, where timesteps and thresholds correlate to high or low noise
prompting before:
>keywords and common Internet english
>emojis work
>vast artist knowledge
>porn in the datasets
prompting now:
>chingrish edited synthslop captions
>basic art styles and almost all Photoshop filter tier
>have to use another model to "fix" your prompt to the same slop it was trained on
grim
>>106204178that's just still not a good gen anon. her legs aren't even attached to her. and it has feet, which is disgusting.
>>106203826Thanks, but you've made another cool image, had to do it again!
>>106204473they aren't moving forward at all. that's busting my brain
>>106204300
Your gen has the 1024p fuzziness. The hands are way more fucked with that. Try increasing the res to 1152x1152 for better results. I agree that it could be better, but its current state is very good. It mostly gets hands right on first try when you increase res.
>>106203901
>>106203911
Actually diffusiontoolkit works too since it can search raw metadata
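For ComfyUI outputs specifically, the raw metadata is easy to search by hand too. The sketch below assumes the image was saved by the stock save node with metadata enabled, which as far as I know writes the graph as JSON into PNG tEXt chunks under the keywords `prompt` and `workflow`. `png_text_chunks` and `matches` are made-up names, and CRCs are not verified.

```python
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def png_text_chunks(data):
    """Return {keyword: text} for every tEXt chunk in raw PNG bytes."""
    if not data.startswith(PNG_SIGNATURE):
        raise ValueError("not a PNG file")
    found = {}
    pos = len(PNG_SIGNATURE)
    while pos + 8 <= len(data):
        # Each chunk: 4-byte big-endian length, 4-byte type, data, 4-byte CRC.
        length, ctype = struct.unpack(">I4s", data[pos:pos + 8])
        body = data[pos + 8:pos + 8 + length]
        if ctype == b"tEXt":
            keyword, _, text = body.partition(b"\x00")
            found[keyword.decode("latin-1")] = text.decode("latin-1")
        elif ctype == b"IEND":
            break
        pos += 12 + length
    return found


def matches(data, needle, keyword="prompt"):
    """Case-insensitive substring search inside one embedded text chunk."""
    return needle.lower() in png_text_chunks(data).get(keyword, "").lower()


if __name__ == "__main__":
    import pathlib
    import sys
    # Usage: python search_png.py <folder> <text to find>
    folder, needle = sys.argv[1], sys.argv[2]
    for path in pathlib.Path(folder).rglob("*.png"):
        if matches(path.read_bytes(), needle):
            print(path)
```

Searching for a sampler name, a LoRA filename or a prompt fragment this way is crude, but it needs nothing beyond the standard library.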
>>106204493Have you ever heard of camera chase?
Here you go then. Snow is showing a better movement.
if i said what i could do here on pot or something i would be insta banned. their is not free speech here. i have not bothered in months. nut today i see the first sings they will delete them selves.
both wan 2.2 and chroma 50 are amazing tools for free. ignore the retards
we done here, we need to move...
So the meta is using Chroma v49 with 1152x1152 as the "official" resolution?
>>106204414
the models that work with "prompting before" didn't magically disappear from our hard drives.
I tested the latest shit and made some memes, I'm going back to noob/illustrious now, the non-anime models simply aren't as fun or beautiful.
>>106204566you are just too retarded to use it fuck off.
it always favored higher resolution's ae you now gonna pretend to b that retard there is only person allowed to be retarded and that person is me.
>>106204357
Write a shit prompt and run it through Qwen2.5-VL with the ref image.
>>106204300
Note that it's not always realistic to get the reflections looking exactly like what you want them to. Only under specific circumstances does that happen IRL. Google real photos of storefront displays, they do not all have that type of reflection you see on your Qwen image. Many of them look exactly like the image I posted, very minimal with the subject still in focus.
Anyways, here's the same prompt, but slightly altered time of day and also description to emphasize reflections. The beauty of Chroma is that you can precisely describe what you want. If the girl is fading due to reflections, do that. The dataset is not missing the data. It simply prefers to show the girl.
If you understand how diffusion models work, it's simple. You just direct attention to what is bothering you. LLMs speak the language that these models expect, and it becomes easier to describe what you want.
https://files.catbox.moe/ig6xu0.png
i was checking chroma v50 in the archives what a fucking piece of shit LMAO
>>106204633onions rage:
99 seconds, new kijai 2.2 lightning lora works great at 6 steps
>>106204660how does it cope with tummy hair?
I finally got a qwen-image lora training session going. Stupid me overlooked the note about not being able to use the comfy version of the VAE. Once I switched to the original it worked.
I'm able to set block_to_swap = 0 and comment out transformer_dtype = 'float8', but it takes almost 46GB of VRAM.
>>106204601do not you me you are all fucking retarded and i thank God you are this retarded
https://www.youtube.com/watch?v=LwucePphA2c
>>106204694You can prompt for anything involving 1girls.
is gonna get metal soon.
they are censoring us in most places...
>>106204769There's at least one anon here with a Blackwell 6000 Pro.
i can feel the smashing of peoples fucking skulls i really do, the streets are dead compared to other days, this is the time its gonna happen/.
cyan hair anime girl does a heart hands emote with her hands.
What should I prompt to get a similar style?
(There is no lora for this artist)
3060 chads rejoice.. sageattn2++ just works
speeds before were 160-165 seconds
What's the best way to merge two LoRAs together with OneTrainer?
I keep training my LoRAs again and again but I am messing up something and have to rely on another LoRA or two to get the results I want.
Pic related is after chaining it with another two LoRAs
>>106204963And if I don't get something like this for example.
>>106204957
so you shaved off 10s for free, nice
anyone got penises to not look like long thumbs on wan t2i 2.2?
>>106204984
yes and upgrading to cuda 12.8 shaved off 10 seconds before that (from 170-175)
free lunch!
What kind of hardware would you need to finetune Chroma locally?
cyan hair anime girl jumps over a large hill of snow with her snowboard.
i wonder where saya no uta anon went, i havent been genning saya no uta with chroma because unofficial chroma nunchaku has broken loras
>>106205007
rtx 6000 pro is enough
>>106203411
>>106203962
Manually check out the comfyui commit that fixes lora loading:
https://github.com/comfyanonymous/ComfyUI/commit/4c3e57b0ae9fb7ff1322977915efe7e98544d15d
Or just check out the latest main commit.
The distill lora works but majorly nukes variety. I can gen 1024x1024 in 4s on a 5000 series card.
I haven't been keeping up with whatever is happening with Qwen. Can I run it on a 3090 if I use FP8?
>have a tranime diffusion thread
>tranny still spams his shit here
Clockwork
>>106205080ladg doesnt exist
maybe you should go over there cloudkuck
>>106205061
With GGUF, Qwen can run on anything. I have 4GB VRAM and am using the Q2 and it works just fine with patience.
>>106205061
I'm running it on a 4060 just fine (well 10 seconds /frame, but is still usable)
>>106205108
> 10 seconds /frame,
did you mean per step?
>4060
how fast do you run wan?
>>106205099>>>/g/adt/I know trannies are mentally ill but you seem to be even worse than the average troon.
adt does not require clodshit, in the very first sentence in the general they say
>Cloud-based anime generation is also welcomeimplying that by default its local first, low IQ tranime retard.
>>106205121cloud based generation shouldnt be welcome at all, cloudtranny
>>106205133>moving the goalpost after being btfodConcession accepted, retarded dog.
>>106205143moving the goalpost?
my initial claim was local anime diffusion general doesn't exist, which you havent disproven
a local first general might exist (i am not debating that, so its besides the point), however it's undesirable for me because cloudtrannies like you can come in and shit it up
ok one more with this miku. wan works better when you 1) establish scene first, then 2) describe actions/details.
https://www.instasd.com/post/wan2-2-whats-new-and-how-to-write-killer-prompts
simple but good reference
>>106205159Nobody is forcing you to generate tranime with clodshit, that requirement does not exist.
Go to tranime gen general and post your low effort 1girl tranime with local models instead.
>>106205176>resorting to strawman after being btfodConcession accepted, retarded dog.
>>106205118
Yes sorry. 6 to 10 seconds per frame on my 4060 16gb, stock (tends to get faster after some frames).
I should obviously try a better workflow with a gguf model, but gen time is still acceptable.
wan? Oh it's very fast, with sageattention and the lightx2v lora. 480p in a couple of minutes, 720p in 4-6 minutes depending on clip length. Results 90% acceptable on the first try.
>>106205183>"i cant go there because its cloudshit!!!!">you dont have to use clodshit>seethesTranny trying to argue without revising what happened status: literally impossible, every time.
anime girl miku hatsune turns around and waves hello.
it actually worked. no 360 lora. wan magic!
>>106205199>"i cant go there because its cloudshit!!!!"that is not my argument, my argument is that cloudshit is allowed there
again, strawman
Tranny trying to argue without using fallacies status: literally impossible, every time.
>>106205211>that is not my argument, my argument is that cloudshit is allowed thereWhich is not an argument.
The point of that thread is to post tranime and not to discuss local models and local tech. So if you want to post irrelevant tranime without adding anything then go to the general made for that, retard.
>>106205221no it is an argument because it's the reason i dont post there
cloudshit is allowed => i dont post there
quite simple, negro
Mikutroon and other tranimetroons don't want to go to anime diffusion thread because their slop is so low effort slop that everyone there would throw them out if they were to spam their low quality shit every third post like they do here in this dead general of schizos.
>>106205245>implying rocketranny posts anything of quality
>>106205197
>>106205118
Still calling 'em frames instead of steps
... gee i'm totally pooped
>>106205197
>>106205266
it's okay anon, we all turn off reasoning sometimes
could you post speed for 640x480? i want to compare it to my 3060 (150s for a 640x480 4 step video)
>>106205245
>>106194093
this is our best and it showcases our taste quite perfectly
>>106205245
i accept your concession
anime girl miku hatsune turns around and holds a large glass jar filled with blue liquid.
migu shield potion
>>106205312
migu pee potion
>>106205342
camera direction fixed it
the camera zooms out as a bald man hits a baseball with his baseball bat.
>>106205272
q8 gguf, 81 frames, 640x480, 6 steps at 24 seconds per step, so around 150 seconds too.
same time but two more steps i guess?
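Rough math for the two timings above, as a sketch only. It assumes total time is just steps times seconds per step and ignores model loading, text encoding and VAE decode, which neither post breaks out. The function name is made up for illustration.

```python
def sampling_seconds(steps: int, seconds_per_step: float) -> float:
    """Estimate sampling time; load/encode/decode overhead is ignored (assumption)."""
    return steps * seconds_per_step


# q8 gguf run from the post: 6 steps at 24 s/step
q8_total = sampling_seconds(6, 24)

# 3060 run from the earlier post: 150 s total for a 4 step video
rtx3060_per_step = 150 / 4

print(q8_total)          # 144
print(rtx3060_per_step)  # 37.5
```

So 6 steps at 24 s/step comes to about 144 s, close to the quoted 150 s, and the 3060 works out to roughly 37.5 s per step if overhead is ignored.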
>>106205245Hey, don't shill our general! We survived the schizo holocaust, and we're rebuilding from scratch!
>noooo! you can't just use SDXL models that are fast and easy to use!!! you have to use my """SOTA""" model that looks like plastic and requires a 1000 word essay and 2 minutes to gen!!!
Qwen is bomb.
>10 GB vram
>fp8
>400 secs
>>106204529
impressive
very nice
>>106204473
>>106204493
Yo that is mindbending.
>>106204529
This looks legit nice though.
a girl with dark hair waves hello and smiles.
very lady like.
>>106204878
Wasn't easy getting qwen to put the bellybutton in the center.
>>106203980
>>106204038
You need to run the models normally, with more steps, CFG, and no speed LoRAs, to make them shut up. Or do a second pass with MultiTalk where they say nothing.
>majicmixRealistic_v5Preview
>deliberate_v2
>stymmidreal25danimestymmidrealvers20
>lazymixRealAmateur_v20
>abyssorangemix3AOM3_aom3a1b
>unstableDiffuser_unstableGrimlock
>edgeOfRealism_eorV20Fp16BakedVAE
>breakdomainrealistic_M2050
>etherRealMix_etherRealMix2
>omegaMixFluffyrockAnd_v3
>expmixLine_v3
>cardosAnimated_v30
>epicrealism_pureEvolutionV3
>>106205522
a girl with dark hair puts her hands on her hips and smiles.
cool, it actually worked.
a large anime girl on a billboard steps out of the billboard and starts walking on the street.
gen was with flux or noob, I forget
>>106205245>Everyone I disagree with is a tranny!What is it about trannies you secretly like so much, Anon-kun?
>>106205631>Everyone I disagree with is a tranny and / or Indian!is more accurate for 4chin I'd say lol
>>106205494
>Qwen is bomb.
Krea does some kino GILFS
>>106205673I usually don't say this, but that genuinely looks like a tranny
>>106205630
God damn, it even did a good job of showing her reflection in the windows at the right time.
I'm going to try using wan 2.2 to gen stills. It's like the only thing which will do a legit dark and gloomy scene with a person in it without needing a lora. Everything else insists on making the person well-lit.
>>106205682Grannies often look like trannies, it's true.
>>106205630
this time running
a large anime girl on a billboard steps out of the billboard and starts running down the street.
>>106201767 (OP)
>https://rentry.org/wan22ldgguide
how do i plug in a virtual vram node in this workflow? are there better versions of wan 2.2 you should use if you have a lot of memory? these are the ones in the guide: Wan2_2-I2V-A14B-HIGH_fp8_e4m3fn_scaled_KJ.safetensors and Wan2_2-I2V-A14B-LOW_fp8_e4m3fn_scaled_KJ.safetensors
the windows getting the reflection is next level stuff, wan is pretty cool
>>106205758
What a time to be alive
>>106205494
>Qwen is bomb.
you mean its shit?
I havent tried it yet.
training is finished and as i expected chroma is not good
>>106205873
it's smart and kind of uncensored but it's too big and bland
>>106205883
how does it compare to chroma50?
>>106205888
>chroma50
you mean 48 right?
>>106205878
I don't plan on using it, but its ugly SD1.5 aesthetic is kind of nostalgic.
a large anime girl on a billboard steps out of the billboard, turns around, and picks up the large building behind her with her hands.
neat
>>106205897
no, 50, it's the most recent one
>>106205878
v49 and v50 are real fucking bad, I have no idea what went wrong. There's pretty much no discussion about it that I could find, but it's a downgrade across every metric I've tried.
I've still been enjoying v48, for what it's worth.
there we go:
a large anime girl on a billboard steps out of the billboard, turns around, and picks up the large building behind her with her hands and throws it into the sky.
>>106205938
I think he just pushed those models out purely because of pressure
>>106205873
Well, think of it as undistilled flux, so it should be much better with loras. It does hands better than flux. It's also somewhat insensitive to the noise seed. The prompt matters much more with qwen.
>>106204935
all of the current models have only the most basic understanding of proper terms used by illustrators, so it's a huge struggle. unfortunately there isn't a good solution. you can try mixing tags for artists with similar styles (this will fail to get the right balance of design and medium), train your own lora, or use openai's image generator iteratively to coerce it in the right direction.
>>106202283
Why would you need to make one? Use kijai's or the one in the rentry. The only settings that you should be concerned with are resolution and length.
>>106205658
kek
"A CGI-rendered image of a war-torn urban environment at sunset. In the center foreground, resting on cracked and broken pavement, is a large, spherical, metallic bomb. The bomb is constructed from riveted metal plates, showing signs of wear and some rust, with a yellow and black striped pattern around its circumference. A complex-looking nozzle is visible on the front of the bomb, and a set of four fins is attached to the rear, from which a thin, wispy plume of white smoke is rising into the air. Lying on the ground just in front of the bomb is a single, vibrant red rose with a long green stem and leaves, creating a stark contrast with the surrounding destruction. The street is filled with rubble, including broken bricks and chunks of concrete. To the right, a heavily damaged wall of a building stands, with a large, jagged hole in it and exposed brickwork. The wall is covered in graffiti, including large, black Arabic calligraphy. In the background, the silhouettes of other ruined buildings are visible through a thick haze of dust and smoke. Further in the distance, against a bright orange and yellow sky, stands the tall, slender minaret of a mosque. The overall lighting is warm and dramatic, suggesting either dawn or dusk. Overlaid in the bottom right corner of the image is white text in an Arabic script, which reads "التلقويات فكب" on the top line and "الفيحالهى التسالكاف" on the bottom line, with both lines enclosed in large quotation marks. The perspective of the image is a low-angle shot, making the bomb appear prominent and imposing."
>>106204935
>>106206020
interesting challenge, what artist name first of all? I bet I can do it with noob, possibly even with chroma.
>>106205991
wtf are you talking about, this is the weirdest way to describe Qwen imaginable lmao. The original Flux Dev is not really relevant at all for realistic gens at this point, also, given that Flux Krea completely mogs everything that isn't WAN in that regard, including Qwen.
>>106203333
I'm working actually.
>>106205682I don't really see it DESU, maybe like a bad TV stereotype of a "transsexual" from the 90s or something.
>>106206098
Found it https://gelbooru.com/index.php?page=post&s=view&id=10041775
>12 images LoRA time.
>>106205658
>>106206083
okay that prompt following by qwen is impressive, but try chroma at 1216x1216 or some shit, they botched the 1024.
>>106206083
What did you use to get that long description?
>>106206168
wat
my image is clearly labelled Flux Dev and Chroma V50, and they're both already 1024x1024 gens
>>106204106
I think he's talking about using lightx2v with cfg 1 which is what most people are doing
>>106206197
Gemini 2.5 Pro on Google AI Studio, with a very long and specific preamble that simultaneously jailbreaks it and tells it how to format the caption. Chroma was captioned with Gemini FYI so ostensibly that SHOULD have been like, a very suitable prompt for it, but seemingly not so much.
>>106206244
I see, thanks anon.
>>106206213
chroma V50 has problems at exactly 1024x1024 because of bad training
>>106204997
There's a dicks lora on covit for 2.1 that kinda works
>>106206290
>covit
where is that?
>>106206213
oh wait I misread the second part of your comment
regardless, no, genning at 1216x1216 doesn't fix shit for Chroma here lol. Nor does adding or removing any particular negative while the positive stays the same, in case you were wondering
>>106205080
People can post whatever they want here as long as it's local
>>106204300
>>106204205
qwen = photoshop
chroma = gimp
simple as
think carefully before you jump to conclusion directly and reply retarded comments like
>>106203736
>>106206290
if you mean civitai and that:
https://civitai.com/models/1734179/wan-genitals-lora?modelVersionId=1969541
it doesn't really work sadly
>>106205755
You can offload blocks. There's a node in there already for that
>>106206314
Could you try it in v48?
>>106206314
I don't know why your gen looks so weird. This was my first try.
>>106206363No it's the one with the thumbnail of two women looking at a black dick.
>>106206352
Chroma is very, very, very good for like, weird hardcore NSFW scenes in 2D and also 3D-kinda-but-not-starkly-realistic styles. You can use natural language to prompt basically anything in that regard in a way that's never been possible before.
It's NOT fucking good, though, at high-fidelity "hard realism", and I have no idea why people keep trying to pretend that it is, or that it even makes sense to expect it to be given who trained it. Use Flux Krea for that kind of thing. Or WAN.
>>106206415
Box or GTFO. Or at least state exact positive / negative / seed / sampler / scheduler / CFG.
>>106206444
positive: copy pasted yours
negative: low quality, illustration
seed: 5
sampler: euler
scheduler: beta
CFG: 4
was the picture really that impressive that you think I lied?
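A settings dump like the one above is easier to compare between two gens if it's kept as structured data. This is only a sketch: the field names mirror the post, the positive prompt is a placeholder because the post only says it was copy pasted, and `settings_diff` plus the second settings dict are hypothetical, invented here for illustration.

```python
import json

# Settings as stated in the post; "positive" is a placeholder, not the real prompt.
posted = {
    "positive": "<copy pasted from the post being replied to>",
    "negative": "low quality, illustration",
    "seed": 5,
    "sampler": "euler",
    "scheduler": "beta",
    "cfg": 4,
}


def settings_diff(a: dict, b: dict) -> dict:
    """Return {key: (value_in_a, value_in_b)} for every key whose values differ."""
    keys = sorted(set(a) | set(b))
    return {k: (a.get(k), b.get(k)) for k in keys if a.get(k) != b.get(k)}


# Hypothetical second run that differs only in CFG, just to show the comparison.
other = {**posted, "cfg": 1}

print(json.dumps(posted, indent=2))
print(settings_diff(posted, other))  # {'cfg': (4, 1)}
```

If two anons both post a dump like this, the diff shows right away which knob is responsible when the gens don't match.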
>>106206493
Nta but I think it's shit
why is v48 so much better than v50?
Do Flux LoRAs work with Chroma?
>>106206431
There are some very unusual people here who, when a shortcoming of chroma is pointed out, will literally bend over backwards to attempt to prove it not to be true.
I think it's mental illness.
>>106206581
some do, here's a checker script: https://github.com/EnragedAntelope/Flux-ChromaLoraConversion