
Thread 106469492

316 posts 206 images /g/
Anonymous No.106469492 >>106474602 >>106476148
/ldg/ - Local Diffusion General
Discussion of Free and Open Source Text-to-Image/Video Models and UI

Prev: >>106464276

https://rentry.org/ldg-lazy-getting-started-guide

>UI
ComfyUI: https://github.com/comfyanonymous/ComfyUI
SwarmUI: https://github.com/mcmonkeyprojects/SwarmUI
re/Forge/Classic: https://rentry.org/ldg-lazy-getting-started-guide#reforgeclassic
Chromaforge: https://github.com/maybleMyers/chromaforge
SD.Next: https://github.com/vladmandic/sdnext
Wan2GP: https://github.com/deepbeepmeep/Wan2GP

>Checkpoints, LoRAs, Upscalers, & Workflows
https://civitai.com
https://civitaiarchive.com/
https://tensor.art
https://openmodeldb.info
https://openart.ai/workflows

>Tuning
https://github.com/spacepxl/demystifying-sd-finetuning
https://github.com/Nerogar/OneTrainer
https://github.com/kohya-ss/sd-scripts/tree/sd3
https://github.com/derrian-distro/LoRA_Easy_Training_Scripts
https://github.com/tdrussell/diffusion-pipe

>WanX
https://rentry.org/wan22ldgguide
https://github.com/Wan-Video
https://alidocs.dingtalk.com/i/nodes/EpGBa2Lm8aZxe5myC99MelA2WgN7R35y

>Chroma
https://huggingface.co/lodestones/Chroma1-Base
Training: https://rentry.org/mvu52t46

>Illustrious
1girl and Beyond: https://rentry.org/comfyui_guide_1girl
Tag Explorer: https://tagexplorer.github.io/

>Misc
Local Model Meta: https://rentry.org/localmodelsmeta
Share Metadata: https://catbox.moe | https://litterbox.catbox.moe/
GPU Benchmarks: https://chimolog.co/bto-gpu-stable-diffusion-specs/
Img2Prompt: https://huggingface.co/spaces/fancyfeast/joy-caption-beta-one
Samplers: https://stable-diffusion-art.com/samplers/
Txt2Img Plugin: https://github.com/Acly/krita-ai-diffusion
Archive: https://rentry.org/sdg-link
Bakery: https://rentry.org/ldgcollage

>Neighbours
https://rentry.org/ldg-lazy-getting-started-guide#rentry-from-other-boards
>>>/aco/csdg
>>>/b/degen
>>>/b/realistic+parody
>>>/gif/vdg
>>>/d/ddg
>>>/e/edg
>>>/h/hdg
>>>/trash/slop
>>>/vt/vtai
>>>/u/udg

>Local Text
>>>/g/lmg

>Maintain Thread Quality
https://rentry.org/debo
Anonymous No.106469514 >>106469555
Blessed thread of frenship
Anonymous No.106469516 >>106469555
They tell me this thread is blessed
Anonymous No.106469536 >>106469565 >>106469705 >>106473856
neta is the future for anime start learning now
https://neta-lumina-style.tz03.xyz/
post. cardnon No.106469555
>>106469516
>>106469514
all of Gods children are blessed by his grace
Anonymous No.106469558 >>106469701
Anonymous No.106469565 >>106469593
>>106469536
>.xyz
Oh yes, anon, this link is definitely trustworthy.
Anonymous No.106469593
>>106469565
Seems fine
Anonymous No.106469612 >>106469621
>normalfags calling other normalfags out as being too eager to use the word "clanker" as a slur
ai bros stay winning
Anonymous No.106469621 >>106469642
>>106469612
I didn't realize styxhexenhammer was this big
Anonymous No.106469624
Anonymous No.106469625 >>106469633 >>106469634 >>106469643 >>106469699
is there any hope for local t2v? except for making shizo videos, local t2v is obsolete. only local i2v is excellent for serious things
Anonymous No.106469633
>>106469625
It's pretty good for porn with porn LoRAs but yeah most videos end up like a fever dream.
Anonymous No.106469634
>>106469625
I mean, we're basically in the Ford Model A era of this stuff and here you are writing off the future of cars. Bit silly don't you think?
Anonymous No.106469642
>>106469621
i dont know who that is or how he relates
Anonymous No.106469643
>>106469625
It's good for producing extremely plausible realistic single frames from the video.
Like if candid realism is your goal I'd just extract a frame from i2v
Anonymous No.106469652 >>106469660
Anti AI fags have gone fucking nuts the last few months.
Anonymous No.106469658 >>106469701
whats the best tool for automated video captioning? I want to try training a t2v lora but I dont wanna caption my entire dataset by hand. I'm trying chatgpt but it cant caption videos for shit.
Anonymous No.106469660
>>106469652
>the last few months.
That's an understatement
Anonymous No.106469699
>>106469625
remember how anon hyped Wan T2I? shame the diaperfag decided to tune qwen instead. maybe the bigasp guy will do it.
Anonymous No.106469701 >>106470085 >>106471018
>>106469658
Gemini. It shits all over the other models for OCR stuff.

For example. >>106469558

Based on the video provided, here is a description of what happens:

The video features two characters from the TV series Star Trek: The Next Generation: the android Lieutenant Commander Data (on the left) and Lieutenant Commander Geordi La Forge (on the right). They are seated side-by-side in the cockpit of what appears to be a shuttlecraft.

The sequence of events is as follows:

Data holds up a blue and white package resembling a pack of cigarettes.

He removes a single cigarette and places it in his mouth.

The tip of the cigarette spontaneously lights up on its own.

Data takes a drag from the cigarette and then exhales smoke.

Throughout this entire action, Geordi La Forge looks at Data with a surprised and bewildered expression.
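
If you want to batch that instead of pasting clips into the web UI, here's a minimal sketch with the google-generativeai Python package (model name and prompt are placeholders, adjust to whatever tier you're on):

import time
import google.generativeai as genai

genai.configure(api_key="YOUR_KEY")  # a free AI Studio key works

# upload the clip; processing is async, so poll until it's ready
video = genai.upload_file(path="clip_0001.mp4")
while video.state.name == "PROCESSING":
    time.sleep(5)
    video = genai.get_file(video.name)

model = genai.GenerativeModel("gemini-1.5-pro")  # placeholder model name
resp = model.generate_content(
    [video, "Caption this clip in one dense paragraph for a t2v LoRA dataset."]
)
print(resp.text)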
Anonymous No.106469703
Anonymous No.106469705 >>106469887 >>106470020
>>106469536
>Boris Vallejo
Looked up his style and... that's quite sad. Of course, nano banana can pull off his style just fine.
Anonymous No.106469729
love me ambiguous language
Anonymous No.106469755 >>106469882 >>106469883 >>106471031
According to Gemini

In this video:

The scene opens with Captain Picard looking thoughtful in his ready room. In the background, the Vocaloid star Hatsune Miku, now impressively dressed in a red Starfleet command uniform, walks into the room holding a Starfleet PADD (Personal Access Display Device).

She stops and stands behind the captain. Then, in a move that would surprise even the Borg, Picard picks up a 20th-century-style handgun from his desk and calmly examines it.

It's a fantastic edit, but a few small clues give it away:

The Starfleet Guest Star: As you know, Hatsune Miku, while a cultural icon here in Japan and worldwide, never actually made it onto the Enterprise-D's crew roster.

The Firearm: The biggest giveaway is the prop. Starfleet exclusively uses energy-based weapons like phasers. A projectile-based handgun like that is a massive anachronism and would be completely out of place in Picard's ready room, unless he was in the middle of a Dixon Hill holodeck program.

This is another great example of combining real footage with a digitally inserted character to create a completely new, surreal, and humorous narrative.
Anonymous No.106469882
>>106469755
Isnt gemini paid? Do you use it with some local client like kobold or silly tavern? Im trying to find a good uncensored llm for images
Anonymous No.106469883 >>106470154
>>106469755
those are not good captions
Anonymous No.106469887 >>106469899 >>106470020 >>106470176 >>106473136
>>106469705
>Two elf female characters with pointed ears. The character in the foreground has long, wavy blonde hair, light skin, and wears a blue, off-shoulder dress with white fur trim. She has a worried expression and blue markings on her forehead. The character behind her has long, orange hair, light brown skin, and wears a sleeveless purple top. She has green markings on her face and is embracing the blonde-haired character from behind, with one arm around her shoulder. The background is a textured, dark green and blue gradient, resembling a forest or cave
Anonymous No.106469899
>>106469887
basterd bitch delete this
Anonymous No.106469922
What's the most viable captioning method for deviantart-tier freak fetish stuff?
Anonymous No.106469998 >>106470011 >>106470024 >>106470041 >>106470085 >>106471396
should i get my lazy ass out of bed and finish installing wan? how long do gens take with a 5090
Anonymous No.106470011 >>106470030
>>106469998
they'll take no time at all you fuckin ass
Anonymous No.106470020 >>106470084 >>106470176 >>106473159
>>106469705
>>106469887
Anonymous No.106470024 >>106470038
>>106469998
sure / depends on settings, but on the order of a few minutes for most of them. you can do 1 minute gens at a not too terrible resolution if you take the fast options with 4 steps or so.
Anonymous No.106470030
>>106470011
my bad man, i genuinely got no clue on this shit
Anonymous No.106470038
>>106470024
thanks anon
Anonymous No.106470041 >>106474735
>>106469998
it takes me roughly 4-5 minutes for a 720p 8 second video
Anonymous No.106470084
>>106470020
Just like my futa doujins!
Anonymous No.106470085 >>106470165
>>106469998
2.2 is so good it convinced me to try training video loras.

>>106469701
thanks bruv. got everything captioned, surprised it let me do them all for free
Anonymous No.106470125
Anonymous No.106470133 >>106470141 >>106472405
Damn, AI inventing new instruments.
Anonymous No.106470141 >>106470148
>>106470133
very mongolian
Anonymous No.106470148
>>106470141
I fed Gemini a Batzorig video screenshot lol
Anonymous No.106470154
>>106469883
Yeah but I didn't prompt it how to caption. I was just more interested to see if it could identify what was happening in the video at all.
Anonymous No.106470165
>>106470085
>surprised it let me do them all for free
np. I assume they do it as a means to capture audience share. Their free stuff is very generous. I honestly just pay for gemini as my GPUs are usually blasting away at training most of the time anyway. It's the best model for captioning in my opinion. And the fact google made veo 3 should indicate as much.
Anonymous No.106470176 >>106470233 >>106470235
>>106469887
>>106470020

>Reference image

Literally just prompt for the guy
>A caveman carrying a wounded woman while pointing a spear a large flying bird over them while sitting atop a rocky hill by Boris Vallejo

https://files.catbox.moe/zcos9q.jpeg

Local would never.
Anonymous No.106470186 >>106470209
did comfy fix the shitty qwen edit text encode node?
Anonymous No.106470209
>>106470186
every fix breaks two more things. python was a mistake
Anonymous No.106470211 >>106470850
Anonymous No.106470233
>>106470176
Do you have some special version lol? I get this on nano which isn't even close.
Anonymous No.106470235 >>106470324
>>106470176
Didn't think it would recognize it. Flux can't really handle a more complex composition like that.
Anonymous No.106470246
Anonymous No.106470324
>>106470235
Not bad. Unlike the original nano banana can't show me booba, so a Chroma LoRA would win anyway.
As for the results you're getting with nano banana, no idea what you're using. I can get his likeness right away even across other seeds.
>A caveman with a shield standing atop a rocky hill while goblins are incoming. A woman kneels beside him by Boris Vallejo

https://files.catbox.moe/ak75k4.jpeg
Anonymous No.106470339
Anonymous No.106470715
ultra cozy
Anonymous No.106470779 >>106470835
is there anything as good as veo3?
Anonymous No.106470835
>>106470779
Yes veo3 is as good as veo3.
Anonymous No.106470850
>>106470211
if not for toes I would say it's not a gen
Anonymous No.106470885 >>106470895 >>106470993 >>106471050
Anonymous No.106470895
>>106470885
I was not aware she had a cat.
Anonymous No.106470908
2.2 for vace soon?
Anonymous No.106470940 >>106470956
Anonymous No.106470956
>>106470940
Expected miku to walk out of that...This thread is getting to me
Anonymous No.106470993 >>106471015
>>106470885
Who are those two from? I recognize them from something...
Anonymous No.106471007
not what I wanted at all but okay.
Anonymous No.106471012
bros... i beg... do loras work with qwen nunchaku yet... bros...
Anonymous No.106471015
>>106470993
Himawari and the flat chested one from Yuru Yuri
Anonymous No.106471018
>>106469701
>He removes a single cigarette and places it in his mouth.

obviously wrong
he was holding the cigarette already
Anonymous No.106471031 >>106471035
>>106469755
>Then, in a move that would surprise even the Borg

wtf

implying they have feels
Anonymous No.106471035
>>106471031
There was that one episode where the borg had feels.
Anonymous No.106471050 >>106474752 >>106476647
>>106470885
Animate this.
Anonymous No.106471067
Anonymous No.106471172 >>106471461
Anonymous No.106471219 >>106471248 >>106471266 >>106471295 >>106471357 >>106472717
>New furk post

See any issues here?
Anonymous No.106471248 >>106471262
>>106471219
>water is wet
Anonymous No.106471259
hey bros anyone got a spare 5090 to donate :) I promise ill train some qwen ToT loras with it
Anonymous No.106471262
>>106471248
I also just found out he blocked me. But look at his loss.
He's basically trained a broken LoRA and bragging about it.
Anonymous No.106471266
>>106471219
Is that 5600 steps??
Anonymous No.106471295
>>106471219
nans for days
Anonymous No.106471307 >>106471339 >>106471391
I'm not terribly impressed by how Qwen handles traditional media.
Anonymous No.106471339
>>106471307
Ask furk to train a nan lora for you.
Anonymous No.106471357 >>106471711
>>106471219
well yeah, they aren't giving consumer cards 96gb vram because it would destroy their enterprise market overnight. that's why i'm hoping for a deepseek-level breakthrough from china but in the hardware space. they already have modded cards. they are also making 96gb custom cards but they're kind of shit because low bandwidth, no cuda, and shit-tier drivers.
Anonymous No.106471391 >>106472947
>>106471307
The paintings themselves in the back are honestly really well done; it's just the girl that's slopped.
Anonymous No.106471396
>>106469998
5 hours of genning god damn. having fun with the I2V
Anonymous No.106471461
>>106471172

just imagine AI generating FPS walkthrough jump-scare game movies in perpetuity
Anonymous No.106471473
Anonymous No.106471573 >>106472639 >>106476646
I shouldn't have updated my OS.
DRAM/VRAM management is kinda fucky now. God damn.
Anonymous No.106471711 >>106471833 >>106472450
>>106471357
It's not his post that's cringe. It's that he's bragging about his hardware while being unaware he is basically showing the world that his LoRA is stillborn.
Anonymous No.106471833 >>106471941
>>106471711
nta, but I can't help but question how the man is such a prolific (shit)poster seemingly everywhere but somehow missed that his training run was cooked from the get-go.
Anonymous No.106471933 >>106472188
Anyone ever used the captioning tool in Onetrainer? Usable or megacopium only good for boorutags?
Anonymous No.106471941
>>106471833
I'm convinced his low intelligence robbed him of his ability to second guess and check himself and by radiating enough confidence in a field most people knew little about, he was able to accidentally grift his way to notoriety by just being a fucking idiot.
Anonymous No.106472188 >>106472218
>>106471933
usable, but you still have to check manually for any flops afterward; if anything it saves time by doing the heavy captioning for you.
Anonymous No.106472218
>>106472188
Can it do nsfw?
Anonymous No.106472220
I know you can upscale and interpolate wan video but is there anything to fix any fuckups in the video like when something gets blurred out or things like that?
Anonymous No.106472242
with wan 2.2 you can save the latent from the high noise sampler and reroll with the low noise sampler to hopefully get a better result
Anonymous No.106472298 >>106472460
Can I run Wan on my M4 Max? How's the speed?
Anonymous No.106472405
>>106470133
>Playing 'viking boat'
Anonymous No.106472413
Anonymous No.106472450
>>106471711
Are you telling me that ohwx man training of himself is not the true way ?

Seriously this guy has been doing this for such a long time yet he has learned nothing, still seems to think there's some magic token combination, still hasn't understood that repeats are only for balancing training data when doing multiple concepts at the same time, doesn't even understand the principle of A/B testing and instead changes lots of parameters between experimental runs.

Snakeoil salesman if there ever was one.
Anonymous No.106472460
>>106472298
You probably can; I saw someone posting that they got it running, but the speed was something horrendous.
Anonymous No.106472485 >>106472495 >>106472516 >>106472526 >>106472643
I'm going insane trying to find a good noob-based model with decent coherence that allows some flexibility beyond basic tags. For example, this one https://civitai.com/models/1201815?modelVersionId=1491533 - you can actually add variations like 'blue glowing tattoo' instead of just 'tattoo' without it breaking. Problem is, these models all have shit mixes and can't follow artist styles closely like vanilla noob does. But I'm too much of a shitter to get kino results with plain noob. Is there a good middle ground model/remix that actually respects artist styles while being more forgiving?
Anonymous No.106472495
>>106472485
anything that uses (only) CLIP will never give you the control you seek
Anonymous No.106472516
>>106472485
Your best bet is to find a model that has an LLM text encoder slapped onto it. Idk if noob has a variant like that tho.
Anonymous No.106472526
>>106472485
illustrious has limited natural language support
Anonymous No.106472544
beeg birb
Anonymous No.106472639 >>106472965
>>106471573
The 1girl machine keeps churning, but memory management sucks.
Anonymous No.106472643
>>106472485
https://huggingface.co/Minthy/RouWei-Gemma

someone's been trying to stitch better encoders to sdxl but i don't see much difference so far
Anonymous No.106472717 >>106472799 >>106473264
>>106471219
>NaN
That explains those terrible LoRAs of himself.
Anonymous No.106472799
>>106472717
lul
Anonymous No.106472882
Anonymous No.106472904 >>106472916 >>106472946
>random character sheet out of nowhere
Thanks, I guess
Anonymous No.106472916 >>106472945
>>106472904
What did you prompt for ?
Anonymous No.106472945
>>106472916
https://genshin-impact.fandom.com/wiki/Jahoda
The appearance paragraph xd. I just specified anime artstyle. I guess AI generated articles are good prompts lmao
Anonymous No.106472946
>>106472904
not a footfag but that little red foot is cute
Anonymous No.106472947
>>106471391
At least it draws really good legs
Anonymous No.106472965
>>106472639
That's not a Luger. Man.
Anonymous No.106473065
beeg guy
Anonymous No.106473120
>multiple seeds, samplers and schedulers
>it keeps giving me ref sheets
There is no way a paragraph can have such strong specific style "vibes" that it fucks with the model. Is this a thing?
Anonymous No.106473136 >>106473159
>>106469887
Seb McKinnon Lora for Flux? That's not available on civitai, I want to test it.
Anonymous No.106473152
>an entire separate general of waisloppers
mortifying
Anonymous No.106473159
>>106473136
wrong post
>>106470020
Anonymous No.106473264 >>106473481
>>106472717
Anonymous No.106473269 >>106473282 >>106473289
It feels like progress is pretty stagnant after a few months. too bad the software still sucks ass and just got worse. any new models on the horizon to look forward to at least?
Anonymous No.106473282 >>106473365
>>106473269
qwen hinted at some updates but that's about it.
Anonymous No.106473289 >>106473365
>>106473269
Goddamn man we just had qwen not even 2 weeks ago. Is this zoomer brain I heard about?
Anonymous No.106473301 >>106473360 >>106473386 >>106473431 >>106473446 >>106473732 >>106473977
Chroma loras are so easy to bake, all you need is:

10-15 512x512 images
natural gemini captions
adamw optimizer
constant scheduler
batch size 1
set it for about 2K steps (about 150ish epochs)
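
None of the trainers share a config format, so purely as an illustration (not any tool's actual schema), the recipe maps to roughly this; translate the keys to OneTrainer / sd-scripts / diffusion-pipe yourself:

# illustrative only, not a real config file for any specific trainer
chroma_lora_recipe = {
    "base_model": "lodestones/Chroma1-Base",
    "dataset_size": "10-15 images",
    "resolution": 512,
    "captions": "natural language (Gemini or JoyCaption)",
    "optimizer": "adamw",
    "lr_scheduler": "constant",
    "learning_rate": 1e-4,   # suggested downthread for people/characters
    "batch_size": 1,
    "max_steps": 2000,       # ~150 epochs with a dataset this small
}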
Anonymous No.106473316
>512x512
we're regressing, not progressing
Anonymous No.106473360 >>106473553
>>106473301
I don't know if I would call that good quality anon. This seems to have the same quality problem flux did if you trained at such low resolution
Anonymous No.106473365 >>106473394 >>106473400 >>106473497
>>106473282
probably some standard controlnet IP adapter thing for their existing stuff.

>>106473289
qwen isn't really as impressive as it should be at that size. synthetic slopped datasets are a step backwards and the two stage models for wan are just annoying for a 10% higher quality video than 2.1
Anonymous No.106473386
>>106473301
>Chroma loras are so easy to bake,
Yes, it's shockingly easy to train Chroma loras effectively

>natural gemini captions
Don't need this, JoyCaption is good enough, and if all you train is a single concept like a person (or even an art style assuming it's consistent) you can train with just a simple 'foobar' nonsense tag and it will have no problem training it

You didn't mention learning rate; for people I would suggest 0.0001 (1e-4), for art styles you probably want to go a bit higher since it's a more abstract concept
Anonymous No.106473394 >>106473497
>>106473365
>qwen isn't really as impressive as it should be at that size
you are right but a lot of people don't want to believe it. people will dismiss the arena rankings as nonsense, but qwen ranks around the same place as hidream and it honestly looks it. i went to train a qwen lora and it was like 60gb worth of slop. it's incredibly bloated for a model that does not feel anywhere near the top-10. if it was good it would've made way more strides like flux dev did compared to SD3/SDXL (though at the time we didnt realize how impossible it would be to tune).
Anonymous No.106473400
>>106473365
You know I was gonna take you seriously but then
>10% higher quality video than 2.1
Ahh another vramlet seethe. Trust me, if you can't run this you should probably just dip from the scene; it's gonna get worse from here.
Anonymous No.106473430
recommend me some cool Illustrious base model hidden gems
Anonymous No.106473431 >>106473478 >>106473571 >>106473577 >>106473723 >>106473943
>>106473301
people should just ignore chroma, and work on qwen. chroma is like bigasp. big potential, crappy results. maybe another model, merged with chroma will save the day
Anonymous No.106473446 >>106473453 >>106473577
>>106473301
when will girls stop having their feet buried in the ground?
Anonymous No.106473453
>>106473446
When you type full body shot or prompt something regarding footwear
Anonymous No.106473473 >>106473538
Do flux loras still work with chroma
Anonymous No.106473478
>>106473431
>chroma is like bigasp
come on now lets not be disingenuous...
at least bigasp was trained at 1024x!
Anonymous No.106473481
>>106473264
Anonymous No.106473497 >>106473534 >>106473546 >>106475056
>>106473365
>>106473394
>qwen isn't really as impressive as it should be
I'd agree, but the alternatives in terms of prompt adherence and non-mangled hands/poses are kinda slim.
But the shocking amount of sameface and general lack of variance between seeds hurt the model a lot.
Picrel, 3 seeds, same prompt.
Anonymous No.106473529
because qwen is DPO'd to shit, it has been said multiple times. you need to inject noise if you want an actual different image per seed
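
For reference, "inject noise" just means perturbing the starting latent a little before sampling so the model can't collapse every seed onto the same composition. A rough torch sketch (the helper name is made up, wire it in however your workflow allows):

import torch

def jitter_latent(latent, strength=0.05, seed=None):
    # add a small amount of gaussian noise to the initial latent;
    # strength is something to tune by eye per model
    gen = torch.Generator(device=latent.device)
    if seed is not None:
        gen.manual_seed(seed)
    noise = torch.randn(latent.shape, generator=gen,
                        device=latent.device, dtype=latent.dtype)
    return latent + strength * noise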
Anonymous No.106473534
>>106473497
Lack of seed variation is actually good for i2v gen purposes. You can change the pose or other details, and the face/character tends to stay the same. I agree it does take some tard wrangling to avoid unwanted generalizations.
Anonymous No.106473538
>>106473473
Some do, some don't. You have to try them one by one.
Anonymous No.106473546
>>106473497
I personally like it; it makes editing an image much easier without just losing the entire damn composition.
Anonymous No.106473553 >>106473577 >>106476856
>>106473360
It's not the resolution, most likely prompted to look like a phone camera shot

This is from a Chroma lora I recently trained at 512 resolution
Anonymous No.106473571 >>106473621 >>106473656 >>106473695
>>106473431
Qwen's comprehension of traditional media styles, as well as creativity, is piss poor. I don't think it will be remedied by finetuning
It's a great model, but more experienced users will get more out of chroma.
Anonymous No.106473577 >>106473766
>>106473431

One of the really popular SDXL model makers is making a finetune right now, I've used a prototype lora of it and it is VERY promising.

>>106473446

This nigga likes feet!

>>106473553

Yeah they're prompted/oversharpened to look like phone shots after they're uploaded to IG.
Anonymous No.106473595 >>106473637
I'm new to I2V and I'm following the guide. When I try to generate with the first workflow (https://rentry.org/wan22ldgguide) I'm getting:
> ValueError("type fp8e4nv not supported in this architecture. The supported fp8 dtypes are ('fp8e4b15', 'fp8e5')")
And ChatGPT insists that fp8e4nv doesn't work on a 3090. Is it wrong?
Anonymous No.106473621
>>106473571
>I don't think it will be remedied by finetuning
I think you can, but it will be very expensive since the model is large and massively overtrained, not sure if anyone with enough money would think it's worth it
Anonymous No.106473637 >>106474533
>>106473595
I had that same issue when I tried a new comfy install, even though I used the e4m3fn models with my old install no problem. I dunno. Just try the e5m2 models.
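
The error is from Triton: its fp8e4nv (e4m3) kernels want Ada/Hopper (sm_89+), while a 3090 is sm_86. fp8 still works as a plain storage format, which is why e4m3fn checkpoints loaded fine on older code paths and why e5m2 is the usual workaround. A quick sketch to check, assuming a reasonably recent PyTorch:

import torch

# a 3090 reports (8, 6); Triton's fp8e4nv (e4m3) kernels want sm_89+
print(torch.cuda.get_device_capability())

# both fp8 formats load fine as storage dtypes
w = torch.randn(4, 4, device="cuda")
print(w.to(torch.float8_e5m2).dtype, w.to(torch.float8_e4m3fn).dtype)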
Anonymous No.106473656 >>106473674
>>106473571
>remedied by finetuning
The fuck, if the distilled horseshit that is flux was able to be unfucked by finetuning, a non-distilled model should be 100-fold easier.
Anonymous No.106473674 >>106473703
>>106473656
qwen would take forever. it needs a bit more elbow grease and cash than chroma
Anonymous No.106473687 >>106474253
https://chromaawards.com/
I think 11labs is paying off civit to not add a chroma category because of this shit
Anonymous No.106473695
>>106473571
>Qwen's comprehension of traditional media styles, as well as creativity, is piss poor
correct
>I don't think it will be remedied by finetuning
it CAN, but nobody will because the model is way too bloated.
>but more experienced users will get more out of chroma.
completely false
Anonymous No.106473703 >>106473716 >>106473719 >>106473733 >>106473749
>>106473674
People keep saying that, but every big finetune needs a lot; chroma needed 105,000 H100 hours, are you saying qwen would need more? SDXL needed a cluster, a finetune will need big hardware for any model. It's such a non-argument.
Anonymous No.106473716
>>106473703
chroma is 8.9b. qwen is 20b
Anonymous No.106473719
>>106473703
>are you saying qwen would need more?
it entirely depends on the size of the model. the other anon is right, it's too bloated but it's also overtuned which is why there isn't much seed variation
Anonymous No.106473720 >>106474052 >>106474103
Anonymous No.106473723 >>106473730
>>106473431
qwen is overfit and bloated to shit. just do wan
Anonymous No.106473730
>>106473723
wan is slightly overfit as well but at least it's in the realm of doable
Anonymous No.106473732 >>106473797 >>106473977
>>106473301
hello beautiful babe
Anonymous No.106473733
>>106473703
yes, qwen would absolutely need more because it's massive in comparison to chroma. chroma already had to cope by removing parameters and training at 1/4 the resolution of fucking sdxl. and even with all that, he still wound up spending $150k on it. acting like the compute costs for these models are the same as SDXL is simply retarded
Anonymous No.106473749
>>106473703
Use your brain
Anonymous No.106473766 >>106473797
>>106473577
>prototype lora

Where?
Anonymous No.106473797 >>106473812 >>106473987 >>106476867
>>106473732

Love it lmao

>>106473766

I've said too much.
Anonymous No.106473812
>>106473797
Oh you meant a chroma tune, that's good; I was wondering how you managed to make a qwen tune look so shitty lol
Anonymous No.106473856
>>106469536
Neta is not perfect, but this is the only anime model that can handle multiple subjects on screen without mangling them
Anonymous No.106473943
>>106473431
Most people outside of here are ignoring chroma
Anonymous No.106473977 >>106473991
>>106473732
>>106473301

tranny hands
Anonymous No.106473987
>>106473797
>prompt for indian man
>get a cholo
Anonymous No.106473991
>>106473977
>t. has futa images saved
Anonymous No.106474052
>>106473720
>white mans kriptonite.png
Anonymous No.106474103 >>106474422
>>106473720
Anonymous No.106474253
>>106473687
Could be, damn
Anonymous No.106474383 >>106476130
Are flux dev and schnell loras interchangeable?
Anonymous No.106474388 >>106474406 >>106474536 >>106474692
>>106474328
Hello, I'm trying to switch from Forge to ComfyUI.
I prefer ComfyUI's interface because my entire txt2img + hires fix workflow fits on my screen without scrolling.
The problem is that I can't get it to work correctly. I've posted more details in the attached thread.
Any help would be appreciated. Thanks!
json: https://files.catbox.moe/nv2b7k.json
Anonymous No.106474406
>>106474388
I think sdxl wants -2 clip layer
Anonymous No.106474422 >>106476679
>>106474103
tranny eyes
Anonymous No.106474533
>>106473637
Yeah that worked.
Anonymous No.106474536 >>106474631
>>106474388
clip needs to be -2. Not sure what you are trying to do with the tiled vae encode/decode nodes. Also 'BREAK' commands don't work in the default CLIP Text Encode nodes, there are custom nodes that use the A1111 parser if you want to keep them but in Comfy you should break each one out into separate text encode nodes and concat them.
Anonymous No.106474602 >>106474619
>>106469492 (OP)
how do i generate abstract happy merchant memes? i am super retarded when it comes to prompting ai
Anonymous No.106474619
>>106474602
literally just img to img and play with the denoise
Anonymous No.106474631 >>106474669 >>106474692
>>106474536
Thanks I am looking here and /adt/ for answers. All those options were loaded by default when I dragged the gen made in Forge to Comfy. I will keep in mind what you tell me.
Anonymous No.106474669
>>106474631
you can also just disable the CLIP Set Last Layer node, ComfyUI will automatically use -2 with SDXL models.
Anonymous No.106474692
>>106474631
>>106474388
I think it would be much faster to just make a workflow from the ground up
Anonymous No.106474735
>>106470041
Fuck me I don't know how you guys put up with that, 5 seconds for my 1girls feels insufferably, inexcusably long as it is. I'm sure someone will chime in with "it's worth it" or whatever but that's into "too long for me to bother" territory. And that's with the fastest consumer card.
Anonymous No.106474752
>>106471050
Well it’s a good thing the oral insertion lora bandit got bored, otherwise…ToT
Anonymous No.106474761 >>106474793 >>106474864 >>106474876
Say I'm training a lora and I want 3000 steps total, with 20 images.
Is it better to do 3 epochs with 1000 steps each, or 6 epochs with 500 steps each?

Is there a noticeable difference between the two at same step amounts (ie, first at epoch 2, 2000 steps, second at epoch 4 at 2000 steps)
Anonymous No.106474793
>>106474761
what is an "epoch" and what is a "step"?
Anonymous No.106474864
>>106474761
it's the same exact thing, 1 epoch is just one complete pass of your entire dataset (including repeats). step count is what matters, so just up your number of epochs until you hit 3000 steps total. repeats are only useful if you are training on multiple datasets at once
Anonymous No.106474876
>>106474761
You are confused

An 'epoch' is having trained all images once, 100 epochs is having trained all images 100 times

A 'step' is a single training iteration, typically that is 1 image, but it can be more if you are using a higher batch

So, to make it simple, if you are training 20 images, 1 epoch = 20 images thus 20 steps IF you are using batch 1 (1 image trained at once)

If you are using batch 2 (2 images trained at once) 1 epoch = 20 images, BUT 10 steps (2 images trained at once)
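
To make the arithmetic concrete with the numbers from your post:

images = 20
batch_size = 1
target_steps = 3000

steps_per_epoch = images // batch_size      # 20 at batch 1, 10 at batch 2
epochs_needed = target_steps / steps_per_epoch
print(steps_per_epoch, epochs_needed)       # 20, 150.0 -> train 150 epochs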
Anonymous No.106474911 >>106474969 >>106475179
I'm going insane trying to upscale chroma, I don't know why it only gives me blurrier results.
I'm using ultimate sd upscale, is there anything better or simpler?
Anonymous No.106474935
Anonymous No.106474969 >>106475008
>>106474911
>0.5 cfg
r u sure?
Anonymous No.106475008 >>106475031
>>106474969
It's just a random example I tried.
I tried everything including this cfg I've seen somewhere else.
Either this node is not compatible with chroma or I'm missing something.
Anonymous No.106475031 >>106475183
>>106475008
nta but 0.22 denoise is too low, especially with a controlnet. try increasing it to 0.35 - 0.5. i haven't used chroma much so idk about what sampler settings it uses but cfg 4.5 should be good unless you're using a lightning lora. 40 steps also seems excessive for an upscale
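
If anyone wants a baseline to start from for Ultimate SD Upscale, something like this (illustrative numbers, not an exact widget-for-widget schema):

usdu_baseline = {
    "upscale_by": 1.5,
    "denoise": 0.4,        # 0.35-0.5; 0.22 is too low to add detail back
    "cfg": 4.5,            # drop it if you're running a lightning lora
    "steps": 20,           # 40 is overkill for a refinement pass
    "tile_width": 1024,
    "tile_height": 1024,
}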
Anonymous No.106475046 >>106475972
https://www.youtube.com/watch?v=YZTvL8C_xz4

not a shill but has anyone tried this method for longer clips?
Anonymous No.106475056
>>106473497
Yeah seriously. I'm a newfag and I don't really fw NSFW at the moment but if it could just mix things up a little bit more it would be perfect. It could be faster of course but a fast model that spits out crap output is infinitely slow.
Anonymous No.106475074 >>106476331
https://github.com/Haoming02/sd-webui-forge-classic/tree/neo
Anonymous No.106475179 >>106475264
>>106474911
youre using some retarded samplers and scheduler
Anonymous No.106475183
>>106475031
OK I will try again, thanks;
Anonymous No.106475199
change the location to a pond in a forest. a tent is nearby with a campfire outside it. keep the character in green and in the same style.

qwen edit is neat, also maintained the pixel style:
Anonymous No.106475264
>>106475179
I've also used euler/simple and unipc, same results.
Anonymous No.106475302 >>106475314 >>106475373
Is it possible to run Wan 2.1 with A1111? And would a 3060 12gb be capable of generating 480p videos?
Anonymous No.106475314 >>106475486
>>106475302
>Is it possible to run Wan 2.1 with A1111
hard no
>And would a 3060 12gb be capable of generating 480p videos?
soft yes
Anonymous No.106475373 >>106475486
>>106475302
use this https://github.com/deepbeepmeep/Wan2GP
Anonymous No.106475382 >>106475401
Is there an "apply controlnet" node with latent instead of image as input?
Anonymous No.106475391
Anonymous No.106475401 >>106475413
>>106475382
no the controlnet needs an image input
Anonymous No.106475413
>>106475401
Ah, welp ok thanks anon.
Anonymous No.106475486
>>106475314
>>106475373
Alright, thanks!
Anonymous No.106475502 >>106475540
>ERROR - An unexpected error occurred during conversion of Tron_Flux.safetensors: [WinError 123]
I removed all spaces and brackets and parentheses from the filenames and this shit still won't start converting.
Anonymous No.106475540
>>106475502
using this btw
https://github.com/EnragedAntelope/Flux-ChromaLoraConversion
Anonymous No.106475569 >>106475578 >>106475579
bros... does nunchaku qwen work with loras...
Anonymous No.106475575
>ERROR - ERROR: No UNet LoRA weights were applied despite being found!
>ERROR - This may indicate a key mapping or base model mismatch.
YOU TOLD ME IT'S COMPATIBLE YOU STUPID FUCKING PROGRAM
Anonymous No.106475578 >>106475601
>>106475569
Not yet.
Anonymous No.106475579
>>106475569
lol man just check the github. It's in like big font every time they update it
Anonymous No.106475585 >>106475611
Why are there 3 imagegen generals?

Anyway

>https://github.com/Tencent-Hunyuan/HunyuanWorld-Voyager

> Available to the world except the European Union, the UK, and South Korea

I know yuros cucked themselves with legislation but what did korea do?
Anonymous No.106475601 >>106475608 >>106475916
>>106475578
im gonna kill myself
Anonymous No.106475605
>animateddiff_\d+.mp4
isn't this localgen, why are you using an online app?
Anonymous No.106475608
>>106475601
Please don't.
Anonymous No.106475611
>>106475585
>korea
What didn't they do.. baka
Anonymous No.106475638
Im no druggie but i know image gen can be better than adderall lol
Anonymous No.106475862 >>106476060
Anonymous No.106475916 >>106475963 >>106476313
>>106475601
Please, think of the bimbos!

Too bad it didn't get the reflection quite right.
Anonymous No.106475939
>Error (code 1224) "The requested operation cannot be performed on a file with a user-mapped section open."
Is the anon who was using the flux lora converter the other day here? How did you make this shit work? Shit collides with itself
Anonymous No.106475963 >>106476079
>>106475916
Great detail
Anonymous No.106475972 >>106475982
>>106475046
It's still only using the last frame for the next part of the generation.
Is there really no way to use more frames in latent to preserve motion?
Anonymous No.106475982 >>106476544
>>106475972
honestly the fastest way would be to gen with lightx2, then have a node that extracts the last frame, and feed that back into the img source, then redo the prompt.

stitch 5 clips = 25s. also, if the second clip is off you can just redo it, idk if batch processing would be ideal.
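
if you'd rather do the glue outside comfy, a sketch with plain ffmpeg via Python (filenames are hypothetical):

import pathlib
import subprocess

clips = [f"clip_{i:02d}.mp4" for i in range(5)]   # five 5s lightx2v gens

# grab the final frame of a clip to seed the next i2v gen
subprocess.run(
    ["ffmpeg", "-y", "-sseof", "-0.1", "-i", clips[0],
     "-frames:v", "1", "next_start.png"],
    check=True,
)

# stitch the finished clips back-to-back without re-encoding
pathlib.Path("list.txt").write_text("".join(f"file '{c}'\n" for c in clips))
subprocess.run(
    ["ffmpeg", "-y", "-f", "concat", "-safe", "0",
     "-i", "list.txt", "-c", "copy", "out.mp4"],
    check=True,
)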
Anonymous No.106476037 >>106476075 >>106476251
Noob here. Any info on using multiple GPUs for stuff anywhere? Can't seem to find much on it. I have two 4070s which ollama will use to load up to ~20GB models, but I don't see that happening when using ComfyUI. Is it just a Comfy thing?
Anonymous No.106476053
Anonymous No.106476060 >>106476071
>>106475862
impressionism unlocked
Anonymous No.106476071 >>106476091
>>106476060
Anonymous No.106476075 >>106476241
>>106476037
you have to use some hacky unstable custom node for it. the only UI using ggml is anistudio but I can't recommend it in its current state
Anonymous No.106476079 >>106476158
>>106475963
I can't stop genning random 1girl slop. Your harry is pretty damn good though. How much hand gacha did that take?
Anonymous No.106476088 >>106476103 >>106476106
so i just discovered i can make 14 second videos with my rtx 3050 8gb in like 12 seconds with fastwan, wtf. for people like me who are stupid technologically speaking, just download the Pinokio app, it's plug and play
Anonymous No.106476091 >>106476158
>>106476071
Lemmy, the Hobo Cop!
Anonymous No.106476103 >>106476106
>>106476088
I cant imagine how garbage it looks, post it
Anonymous No.106476106
>>106476088
>>106476103
12 minutes sorry
Anonymous No.106476110 >>106476154 >>106476353
The best gen you ever did was one you cancelled midway.
Anonymous No.106476130 >>106476656
>>106474383
I have not tried loras but I have tried controlnets and they don't seem to work between them. If chroma is schnell (not sure, I think it started out as such but became some kind of frankenmodel over time) then comfy will let me load flux dev loras on it but they look messed up.
Anonymous No.106476146
>generate with SDXL
>fix with Gemini
>animate with Grok

is there seriously a better way with a measly 8GB VRAM GPU?
Anonymous No.106476148 >>106476305
>>106469492 (OP)
sauce on the nazi anons
Anonymous No.106476154 >>106476271
>>106476110
shut the fuck up disgusting nigger bot
death to israel and your american golems
Anonymous No.106476158
>>106476079
Great k-pop 1girls
>How much hand gacha did that take?
None. Got lucky since I didn't tag characters.

>>106476091
ace of spades, space of aids
Anonymous No.106476198 >>106476386 >>106476395 >>106476657
Anonymous No.106476241
>>106476075
Damn that sucks. Thanks for the response
Anonymous No.106476251
>>106476037
https://github.com/pollockjj/ComfyUI-MultiGPU

VRAM block splitting is supported by this node. If you're looking for parallel inferencing, I have no clue.
Anonymous No.106476271
>>106476154
ai local is chinese. you are on wrong thread faggot shizo
Anonymous No.106476305 >>106476311
>>106476148
What do you mean?
Anonymous No.106476311
>>106476305
I mean a catbox anon
Anonymous No.106476313
>>106475916
i realize i never tried genning mirrors. looks pretty cool
Anonymous No.106476319
Anonymous No.106476331
>>106475074
The Chad haoming just implementing things without fanfare vs the virgin panchovix promising everything with fanfare and BTFOing himself before he even started.
Anonymous No.106476353
>>106476110
>he doesn't have an aggressive preview interval
Ngmi
Anonymous No.106476386 >>106476418
>>106476198
Anonymous No.106476395 >>106476418 >>106477369
>>106476198
how?? you can vid2vid?
Anonymous No.106476396 >>106476406 >>106476435 >>106476461 >>106476605 >>106476630
https://github.com/Haoming02/sd-webui-forge-classic/tree/neo?tab=readme-ov-file#installation

actually works.
Anonymous No.106476406 >>106476435
>>106476396
Features [Sep. 03]
Most base features of the original Automatic1111 Webui should still function

New Features
Support Wan 2.2
txt2img, img2img, txt2vid, img2vid
Support Nunchaku (SVDQ) Models
dev, krea, kontext, t5
Support Flux Kontext
img2img, inpaint
Anonymous No.106476418 >>106476621 >>106476984
>>106476386
>>106476395
its an i2v, flux > wan
Anonymous No.106476435 >>106476447
>>106476396
>>106476406
He's been silently grinding. I've used forge classic for months now and it just works. Haven't upgraded to the flux/wan builds yet cause I don't want to be a guinea pig beta tester, but after a few more releases mite b gud. Panchovix shat on.
Anonymous No.106476436
OMG GUYS! it's gradio!!! So much wow! let's bloat together!
Anonymous No.106476447
>>106476435
I use reforge for illu/noob anime stuff (wai v14, etc) and comfy for wan 2.2 and qwen edit, but im curious how long gens take in this gui. worth a try at least.
Anonymous No.106476461
>>106476396
anyone got gen times comparing to nooodle ui?
Anonymous No.106476544
>>106475982
This works in general, it's not complicated. I haven't been able to find a specific node for extracting the last frame of a batch of images, though.
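
If it really isn't in any node pack you have, a minimal custom-node sketch (untested, assumes the standard ComfyUI IMAGE tensor layout of [batch, height, width, channels]):

# drop into ComfyUI/custom_nodes/last_frame.py and restart
class LastFrameFromBatch:
    @classmethod
    def INPUT_TYPES(cls):
        return {"required": {"images": ("IMAGE",)}}

    RETURN_TYPES = ("IMAGE",)
    FUNCTION = "pick"
    CATEGORY = "image/batch"

    def pick(self, images):
        # slice instead of index so the batch dimension is kept
        return (images[-1:],)

NODE_CLASS_MAPPINGS = {"LastFrameFromBatch": LastFrameFromBatch}
NODE_DISPLAY_NAME_MAPPINGS = {"LastFrameFromBatch": "Last Frame From Batch"}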
Anonymous No.106476556
Anonymous No.106476605
>>106476396
But muh speed! My noodes! You CANT make something easy! We are supposed to be big boys!
Anonymous No.106476621 >>106476657 >>106477369
>>106476418
well her expression looks a lot like this
Anonymous No.106476629 >>106476771 >>106477261 >>106477323
Actual madman
Anonymous No.106476630 >>106476861
>>106476396
Yeeeh! the Forge UI with all the improvements and implementations
Want something? Xformers? CUDA? SageAttention?
Yes? Just install it yourself with this outdated third party guide!
Anonymous No.106476646
>>106471573

i distrohopped and current thing krita works with 1998 hp microsoft srgb profile again
Anonymous No.106476647 >>106477167
>>106471050
Anonymous No.106476656 >>106476661 >>106476832
>>106476130
Anonymous No.106476657
>>106476621
you surely mean this >>106476198
Anonymous No.106476661
>>106476656
bow physics good
boob physics bad
Anonymous No.106476667
>https://civitai.com/models/573152?modelVersionId=2155386
new lustify
Anonymous No.106476679
>>106474422
Perhaps. Hope you get a better computer soon troon destroyer 9000. Show us how it's done

The eyes are fun to gen though. thx original eyes anon for inspo
Anonymous No.106476681
Can anyone throw a qwen edit workflow my way?
Anonymous No.106476720
Anonymous No.106476725 >>106476898
so how come 2.1 lightx2 works better than 2.2, for wan 2.2?
Anonymous No.106476735
love qwen edit. it just works.
>replace the blonde anime girl with Miku Hatsune.
Anonymous No.106476771
>>106476629
it's not that old, man!
Anonymous No.106476809
Anonymous No.106476832 >>106476924
>>106476656
Not bad. I like how the shadow of the bow string tracks along her leg as it moves.
Anonymous No.106476838 >>106476889
Anonymous No.106476839 >>106477189
has anyone gen'd a good medusa? I swear ai cant do a good medusa
Anonymous No.106476856
>>106473553
bruh share!
Anonymous No.106476861 >>106476900
>>106476630
>Yes? Just install it yourself with this outdated third pary guide!
you mean just like we do with comfy?
Anonymous No.106476867
>>106473797
Is this someone you're stalking
Anonymous No.106476889 >>106476938
>>106476838
zoom in, enhance, what is the location?
Anonymous No.106476898 >>106476929 >>106479197
>>106476725
some say because comfy's/kijai's implementation is wrong and the wan moe ksampler is the "correct" implementation but I haven't had good experiences with it in my limited testing
https://github.com/stduhpf/ComfyUI-WanMoeKSampler/tree/master
Anonymous No.106476900
>>106476861
even the org wiki maintainer doesn't know shit about the software
Anonymous No.106476924 >>106476935
>>106476832
are you still using pixelwave, or is this qwen? i haven't kept up with what's new
Anonymous No.106476929
>>106476898
I get better motion with this sampler, whether using the 2.2 lora or not.
Anonymous No.106476935
>>106476924
That picture was variously pixelwave and chroma through different iterations. The last img2img was pixelwave.
Anonymous No.106476938
>>106476889
Outside Chroma HQ with lora on -1. This is inside with lora on 1
Anonymous No.106476946 >>106476973
Can qwen edit run the lightning lora or is it gonna fuck shit up? Also what was the magic number that prevented the zoom?
Anonymous No.106476973 >>106476982
>>106476946
yeah there is an 8 step one specifically for qwen edit.
Anonymous No.106476982
>>106476973
https://huggingface.co/lightx2v/Qwen-Image-Lightning/resolve/main/Qwen-Image-Edit-Lightning-8steps-V1.0-bf16.safetensors?download=true
Anonymous No.106476984
>>106476418
Anonymous No.106477030
Anonymous No.106477085 >>106477127 >>106477155 >>106477161
Man, Qwen is so fucking literal with everything.
Anonymous No.106477127
>>106477085
>People will argue it's a bad thing.
Anonymous No.106477143
>models looking like slopped shit is good because... i can larp as a prompt engineer!
Anonymous No.106477155 >>106477279
>>106477085
You could 100% sell fashion like this these days bro. Go make a bank.
Anonymous No.106477161
>>106477085
What did it take literally?
Anonymous No.106477167
>>106476647
Hahaha. Nice.
Anonymous No.106477189
>>106476839
Best I could do
Anonymous No.106477261
>>106476629
lower ur tone before speaking here again
Anonymous No.106477279
>>106477155
Sounds like work, man.
Anonymous No.106477323
>>106476629
Absolutely based
Anonymous No.106477369
>>106476395
>>106476621
it's from this lora; https://civitai.com/models/1897340/sigma-face-expression
Anonymous No.106477378
Anonymous No.106477380
new
>>106477375
>>106477375
>>106477375
>>106477375
Anonymous No.106477738
Anonymous No.106479197
>>106476898

Blurrier than KJ nodes at 3:3 steps... I'll stick with KJ.