Thread 105842620

316 posts 244 images /g/
Anonymous No.105842620 >>105842632 >>105842642 >>105845302 >>105849251
/ldg/ - Local Diffusion General
Discussion of Free and Open Source Text-to-Image/Video Models

Prev: >>105836648

https://rentry.org/ldg-lazy-getting-started-guide

>UI
SwarmUI: https://github.com/mcmonkeyprojects/SwarmUI
re/Forge/Classic: https://rentry.org/ldg-lazy-getting-started-guide#reforgeclassic
SD.Next: https://github.com/vladmandic/sdnext
ComfyUI: https://github.com/comfyanonymous/ComfyUI
Wan2GP: https://github.com/deepbeepmeep/Wan2GP

>Checkpoints, LoRAs, & Upscalers
https://civitai.com
https://civitaiarchive.com
https://tensor.art
https://openmodeldb.info

>Tuning
https://github.com/spacepxl/demystifying-sd-finetuning
https://github.com/Nerogar/OneTrainer
https://github.com/kohya-ss/sd-scripts/tree/sd3
https://github.com/derrian-distro/LoRA_Easy_Training_Scripts
https://github.com/tdrussell/diffusion-pipe

>WanX (video)
Guide: https://rentry.org/wan21kjguide
https://github.com/Wan-Video/Wan2.1

>Chroma
Training: https://rentry.org/mvu52t46

>Illustrious
1girl and beyond: https://rentry.org/comfyui_guide_1girl
Tag explorer: https://tagexplorer.github.io/

>Misc
Local Model Meta: https://rentry.org/localmodelsmeta
Share Metadata: https://catbox.moe | https://litterbox.catbox.moe/
Img2Prompt: https://huggingface.co/spaces/fancyfeast/joy-caption-beta-one
Samplers: https://stable-diffusion-art.com/samplers/
Txt2Img Plugin: https://github.com/Acly/krita-ai-diffusion
Archive: https://rentry.org/sdg-link
Bakery: https://rentry.org/ldgcollage | https://rentry.org/ldgtemplate

>Neighbours
https://rentry.org/ldg-lazy-getting-started-guide#rentry-from-other-boards
>>>/aco/csdg
>>>/b/degen
>>>/b/celeb+ai
>>>/gif/vdg
>>>/d/ddg
>>>/e/edg
>>>/h/hdg
>>>/trash/slop
>>>/vt/vtai
>>>/u/udg

>Local Text
>>>/g/lmg

>Maintain Thread Quality
https://rentry.org/debo
Anonymous No.105842632
>>105842620 (OP)
GOOD MORNiN :3
Anonymous No.105842642
>>105842620 (OP)
neighbors list seems v outdated
Ai technologies have proliferated to several other boards...
hmmmm
Anonymous No.105842646 >>105842783 >>105842857 >>105842919 >>105842988 >>105843062 >>105843183 >>105843778
New SLG implementation is finally live.

https://github.com/comfyanonymous/ComfyUI/pull/8759
Anonymous No.105842648 >>105842659 >>105842701
comfy should be dragged out on the street and shot
Anonymous No.105842651 >>105842677 >>105849381
>radial attention waiting room
Anonymous No.105842659 >>105842681 >>105842701
>>105842648
that's a bit harsh. I just hope someone btfo his app and it becomes irrelevant
Anonymous No.105842664
Good Evening and Happy where the fuck is radial attention
Anonymous No.105842667 >>105842689
I love ComfyUI so god damn much. Updated frequently, implements improvements from the community, it's fast, it's flexible, very modular, it's clean and easy to develop custom nodes for.

I couldn't ask for anything better. Thank you Comfy for allowing us peasants to seamlessly and effortlessly produce AI content!
Anonymous No.105842677 >>105842884
>>105842651
i wouldn't expect anything until next year considering their current pace. on the bright side, that gives you plenty of time to save up for a 5090 or 6000
Anonymous No.105842681
>>105842659
>that's a bit harsh.
nowhere near enough
Anonymous No.105842689 >>105842722
>>105842667
>I love ComfyUI so god damn much. Updated frequently
this is b8
Anonymous No.105842698 >>105842708 >>105842773
vace + miku + generic model runway video:

this is with causvid, gonna try the lightx2v lora as well.
ポストカード No.105842701
>>105842659
>>105842648
rude.
i still use it for a bunch of autistic stuff
its not my "favorite" interface by any means
but surely you can atleast appreciate its use-case
Anonymous No.105842708
>>105842698
this used the default canny processor, low 0.1, high 0.3 (otherwise it wasnt detecting the edges in the vid)
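
For anyone wondering why the thresholds matter: below is a rough sketch of the hysteresis step that Canny-style detectors use. It assumes the 0.1 / 0.3 values act on gradient magnitudes normalized to 0..1, which is a guess about how the preprocessor scales them, and the function name `hysteresis` is made up for illustration. Pixels at or above `high` seed edges, and pixels at or above `low` are kept only if they connect to a seed. If `high` sits above every gradient in the clip, nothing gets seeded and no edges come out at all.

```python
from collections import deque


def hysteresis(mag, low=0.1, high=0.3):
    """Return a boolean edge map from a 2D grid of gradient magnitudes.

    Assumption: magnitudes are normalized to the 0..1 range.
    """
    h = len(mag)
    w = len(mag[0]) if h else 0
    edges = [[False] * w for _ in range(h)]
    queue = deque()

    # Strong pixels (>= high) are always edges and act as seeds.
    for y in range(h):
        for x in range(w):
            if mag[y][x] >= high:
                edges[y][x] = True
                queue.append((y, x))

    # Weak pixels (>= low) become edges only when 8-connected to an edge.
    while queue:
        y, x = queue.popleft()
        for dy in (-1, 0, 1):
            for dx in (-1, 0, 1):
                ny, nx = y + dy, x + dx
                if (0 <= ny < h and 0 <= nx < w
                        and not edges[ny][nx] and mag[ny][nx] >= low):
                    edges[ny][nx] = True
                    queue.append((ny, nx))
    return edges
```

Lowering `high` lets faint edges seed the map, and lowering `low` lets more faint pixels attach to them.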
Anonymous No.105842722
>>105842689
No, It's not. I mean that from the bottom of my heart. I am sorry you are too low IQ to fully utilize the GOD like power of ComfyUI. Maybe read some books or something so that one day, you too, can become enlightened, my brainlet friend. I will be waiting for you at the Comfy Altar.
Anonymous No.105842727 >>105842773
lightx2v lora instead of causvid at 1.0 strength:

works fine. didnt specify clothes so the clothes here are different. prompt is just "the girl is showing off her clothes."
Anonymous No.105842768 >>105842775
Miku + Kiryu slamming a desk and walking away:

but the lora does work just fine at 1.0 str. need to test more though
ポストカード No.105842773
>>105842727
>>105842698
try: https://tensor.art/models/872743460111704414
&
https://tensor.art/models/839853388687731926
Anonymous No.105842775 >>105842792 >>105842801
>>105842768
When are we gonna get past the point where everything feels like its underwater
Anonymous No.105842783
>>105842646
city96 is a cool dude
Anonymous No.105842787 >>105842809
Any word on local 3D model generation or UIs?
Anonymous No.105842792 >>105842864
>>105842775
the output framerate is low. this is just testing outputs, thats like 12fps. higher fps or interpolation helps a lot.
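
To show what interpolation means at its simplest, here is a minimal sketch that blends neighbouring frames linearly. It is only a cross-fade, not what learned interpolators such as RIFE do (those estimate motion), and the function name and the flat-list frame format are assumptions made for the example.

```python
def interpolate_linear(frames, factor=2):
    """Insert (factor - 1) blended frames between each pair of neighbours.

    Each frame is a flat list of pixel values; all frames share one length.
    """
    if factor < 1:
        raise ValueError("factor must be >= 1")
    if not frames:
        return []
    out = []
    for a, b in zip(frames, frames[1:]):
        for step in range(factor):
            t = step / factor  # 0.0 reproduces frame a exactly
            out.append([(1.0 - t) * pa + t * pb for pa, pb in zip(a, b)])
    out.append(list(frames[-1]))  # keep the final frame as-is
    return out
```

With `factor=2`, n input frames become 2n - 1 output frames, so playing them at double the framerate keeps roughly the same clip length.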
ポストカード No.105842801 >>105842864
>>105842775
adjust negative prompt use proper wan & those errors are (mostly) mitigated
>neg: slow movement, slow motion, freeze-frame, etc
Anonymous No.105842809 >>105842841
>>105842787
it exists, e.g. hunyuan3d-2; the typical ui is comfyui, as nearly always

most people here don't do much or any actual 3d models at this point
Anonymous No.105842812 >>105842827
how am I getting torch oom if I close comfy and reopen it, it worked fine a gen ago.
Anonymous No.105842821
what are the settings for res_3m image 2 image? my shit looks a little cooked
Anonymous No.105842827
>>105842812
I picked a diff clip with shorter length and now it's fine, but the frames are set to 81 so why does it matter?

oh...the canny node is trying to preview all 27 seconds, not the 81 frames (4 seconds)
Anonymous No.105842839 >>105842903 >>105843693
explain comfy memory usage to me. I used to have 32gb and got a deal on 64gb of better latency RAM, yet sometimes 50gb is in use.

what is going on? does it try to populate as much memory as possible?
Anonymous No.105842841
>>105842809
Tripo Studio is getting pretty good, wish I had something like that locally.
Anonymous No.105842857
>>105842646
doesnt matter for us FusionXisters or lightx2virgins right?
Anonymous No.105842864 >>105842894
>>105842792
>>105842801
>no examples
Yeah sure buddy. I have yet to see a local video where the character actually has a "pop" to their movements
Anonymous No.105842884
>>105842677
well they've been updating it around every weekend so, I'd give it 2 months tops

T_T
Anonymous No.105842890 >>105842947
*sips coffee*
Anonymous No.105842894 >>105842915
>>105842864
lurk moar fren
>>>>105835417
Anonymous No.105842903 >>105842996 >>105843170
>>105842839
Stop using custom nodes, update pytorch to the latest version.
Anonymous No.105842915 >>105842962
>>105842894
>more underwater slop
Is this supposed to be a joke or something? Go outside, that's not how people move in real life. Even more so in animated film
Anonymous No.105842919 >>105842930
>>105842646
Sell me on skip layer as a concept, I've never used it, am I being retarded ?
Anonymous No.105842930 >>105842977 >>105842988
>>105842919
It generates good hands. It generates non blurry hands when using TeaCache with WAN.
Anonymous No.105842947
>>105842890
annoying that "coffee" has such a strong bias towards a starbucks cup in wan t2v.
"milk" is heavily biased towards glass bottles and cartons, too.
Anonymous No.105842962
>>105842915
smoke and a pancake?
cigar and a waffle?
THEN THERE IS NO PLEASING U

kling will have "less water" but its queue system is annoying as fuck & no one should be supporting closed-source fag shit
Anonymous No.105842977
>>105842930
Thanks!
ポストカード No.105842988
>>105842646
>>105842930
based anon i'll look into it<3
Anonymous No.105842990 >>105843115
Excuse me, with apologies to the anons I will have a meltie.
Anonymous No.105842996 >>105843041
>>105842903
start explaining shit instead of running away from the problem
Anonymous No.105843008
>thinks its not obvious when he takes off his trip
Anonymous No.105843023
>thinks bananas aren't fruits
Anonymous No.105843041
>>105842996
thats a diff anon, I dont want to break torch so whats the ideal way to do so

or just get rid of some custom nodes?
Anonymous No.105843062 >>105843183 >>105843238
>>105842646
could this fix sd3.5m? testing...
Anonymous No.105843088
Anonymous No.105843091 >>105843107
Anonymous No.105843099
can anyone share a good workflow for the new chroma rl low steps?
Anonymous No.105843107 >>105847380
>>105843091
Anonymous No.105843115 >>105843133 >>105843141 >>105843150 >>105843162 >>105843724 >>105844085 >>105846632
>>105842990

1)The Core Problem Nobody's Addressing

Everyone's avoiding the elephant in the room: these models are fundamentally stupid. People keep vibing merging, but nobody tackles the real cognitive limitations of these models. Yes, we have millions of LoRAs and checkpoints for art styles, copyrighted characters, fixing five fingers, preventing extra arms
MILLIONS OF THEM! STOP MAKING MORE!

2)Basic Spatial Understanding is BROKEN

DON'T YOU REALIZE THAT IF I TELL SDXL TO HAVE MY CHARACTER LOOK AT HIS PALM HE DOESN'T UNDERSTAND WHERE HIS HAND IS OR HIS PALM?

WHY DO I WANT GOOD AESTHETICS OR 2025 ART STYLES? YES, IT'S OBVIOUSLY 1000 TIMES EASIER TO TAKE SCREENSHOTS AND TAG THEM IN A SLOP VLM THAN TACKLE THE REAL COGNITIVE PROBLEM!

3)The "SOVL" Problem

I can't bring my characters to life with these shitty models. Sure, I can generate images, but they lack SOVL. Having to micromanage everything manually just kills the creative process.

I had more fun generating images with NovelAI using temp emails for 30 free 1024x1024 images than with my 24GB VRAM PC. Why does NovelAI have SOVL? It's like it reads my subconscious. It's frustrating we can't get it locally or buy it like a Steam game.

4)Local Models Can't Handle Basic Scenes

Local models constantly ignore prompts:

Character staring at the sea? Nope, they'll be looking anywhere but there
Character spiking a volleyball mid-air, crashing through the net? NO WAY!

It doesn't matter which checkpoint or version - they're all the same stupid model with minor tweaks.

5)TLDR: What's Even the Point?

Local models just create 2D mannequins in random poses. No action, zero sense of motion or energy in the images.

Life's too short to break scenes into 500 tags and read hundreds of articles, only for the model to grasp maybe 15% of your vision and produce the usual slop.
Anonymous No.105843133
>>105843115
I agree except for glazing saas. every model lacks sovl. I don't think data scientists have good taste when it comes to selecting data. it's slop all the way down
Anonymous No.105843141
>>105843115
Yeah I've been playing the patience card for the past year since we saw such exponential growth before then but at this point it's just getting weird how bad prompt comprehension and model intelligence is for local. At least now we can have Kontext fix up mistakes for image gen, but I worry for video gen since it's facing the same issues with more difficult scaling laws
Anonymous No.105843150
>>105843115
you are right about literally everything except novelai. novelai sucks, midjourney and dalle 3 were the only models with actually sovl
Anonymous No.105843151 >>105843161
The true artist does not blame his tools.
Anonymous No.105843161 >>105843205
>>105843151
a true artist has the right tools in the first place
Anonymous No.105843162
>>105843115
If you want great control over the output, just do simple img2img generations like anyone who isn't retarded.

You can get away with VERY SIMPLE drawings / paintings as long as you add a bit of noise to the drawing before you do img2img, this is also the TRUE creativity with ai imagegen, since you are not just rolling the dice hoping for something cool, you are actively directing where things should go, what pose a person should have, the exact composition etc.

Don't blame the tool because you've never even tried to move past the most basic use.
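
A minimal sketch of the "add a bit of noise to the drawing" step, assuming an 8-bit grayscale image stored as a list of rows. The function name, the Gaussian noise model and the default `sigma` are illustrative choices, not settings taken from any particular UI.

```python
import random


def add_noise(pixels, sigma=12.0, seed=None):
    """Return a copy of an 8-bit grayscale image with Gaussian noise added.

    pixels: list of rows, each row a list of ints in 0..255.
    sigma:  noise strength in pixel units (assumed value, tune to taste).
    """
    rng = random.Random(seed)  # a fixed seed gives reproducible noise
    return [
        [min(255, max(0, round(v + rng.gauss(0.0, sigma)))) for v in row]
        for row in pixels
    ]
```

The noisy drawing then goes in as the img2img input. The idea, as described above, is that flat regions of a simple drawing get some texture for the model to work with.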
Anonymous No.105843165
a man wearing dark black sunglasses looking up at the sky, eats a McDonalds cheeseburger.

720p q8 wan but at a smaller size is pretty fast for gens with the lora. (lightx2v)
Anonymous No.105843170
>>105842903
I ran update_comfyui.bat, launches with pytorch version: 2.7.1+cu128 - is that right?
Anonymous No.105843183 >>105843604
>>105843062
>>105842646
sd3.5m still has the bad hands, not sure if I see any improvement from this really.
Anonymous No.105843189
a man wearing dark black sunglasses looking up at the sky, opens a pizza box and eats a pizza slice. (interpolated output)
>why food?
to test.
Anonymous No.105843205
>>105843161
I will keep genning locally, seethe
Anonymous No.105843216 >>105843296
a man wearing dark black sunglasses fires a rocket launcher at the black helicopter in the sky behind him, causing it to explode in fire and smoke.

he didnt do it, but you get some neat special fx anyway!
Anonymous No.105843238 >>105843251 >>105843259 >>105843270
>>105843062
Anonymous No.105843251
>>105843238
Muffin to see here.
Anonymous No.105843259
>>105843238
It was cute until she defiled the blueberry muffin
Anonymous No.105843270 >>105843447
>>105843238
based. can wan render a guy drinking coffee out of her head?
Anonymous No.105843296
>>105843216
okay, ALMOST the desired result.
Anonymous No.105843298 >>105843308 >>105843326 >>105843328 >>105843330 >>105843620 >>105845036 >>105848116
just deleted all of my models and 99.9% of gens

praying I escape for good this time
Anonymous No.105843308 >>105843384
>>105843298
but why, AI is fun, my GPU isnt just for games
Anonymous No.105843321
>tfw ywn have a harem
Anonymous No.105843326 >>105843384
>>105843298
You are just going through a bit of summer depression, you'll be so pissed at what you did when it subsides.
Anonymous No.105843328 >>105843384
>>105843298
nice ... freckles
Anonymous No.105843330 >>105843384
>>105843298
you can't escape ai, goyim
Anonymous No.105843384 >>105845539
>>105843328
thanks, it's all from a lora and 0 skill
>>105843308
I spend enough time behind the computer as it is
>>105843326
nah, i've plateaued in skill and dont feel like learning anymore
>>105843330
true, a lot of businesses use boomer slop AI images in their marketing nowadays

here's the catbox in case somebody cares, maybe sth interesting for 1girl aficionados:
https://files.catbox.moe/ktvraz.jpg
Anonymous No.105843398
a man on a bicycle rides it off a ramp and flies high into the sky. he pumps his fist in the air.

Todd has Skyrim magic.
Anonymous No.105843447 >>105843477
>>105843270
Closest I got that wasn't a dude popping a coffee cup into existence.
Anonymous No.105843477
>>105843447
bruh imagine touching cappuccina ballerina's ass and kissing the rim of her head like that
Anonymous No.105843493 >>105843526
what's best? hunyuan or wan? working with a 12GB 4070 and only really do T2V
Anonymous No.105843526
>>105843493
wan is best, use the rentry workflow + the lora for way faster gens

multigpu node lets you use virtual vram so you can use larger models too.
Anonymous No.105843598 >>105843660
"VHS-style" gens anon, can you kindly share your prompts? I've been replying to your posts in a couple of threads

Older versions of Chroma used to nail the aesthetic with ease, now it only produces cinematic slop
Anonymous No.105843604
>>105843183
a tarantula!
Anonymous No.105843620
>>105843298
fuck you and see you tomorrow
Anonymous No.105843660 >>105843679
>>105843598
dont use detailed
Anonymous No.105843662 >>105843669 >>105843685 >>105843708
A man with a beard holds up a large bag of money with a dollar sign symbol on the bag. He smiles.
Anonymous No.105843669 >>105843717
>>105843662
now make one with the jobst retard lmao
Anonymous No.105843679
>>105843660
Are you that anon? Gib prompt pls
Anonymous No.105843685 >>105847132
>>105843662
changed size, still got same type of result, gen time much faster (messing with 720p Q8 wan, and comparing to 480p)
Anonymous No.105843688 >>105843764
Anonymous No.105843693
>>105842839
>>105841774
> Is it just me or does ComfyUI freezes the pc every few WAN gens?
Try disabling "smart" memory.
Anonymous No.105843708
>>105843662
Kinda needs Jobst crying, but not bad
Anonymous No.105843717 >>105843738
>>105843669
success, picked a random google image result

"a blonde man sits at a desk and starts crying."
Anonymous No.105843724
>>105843115
Enjoy your 200b models with 8xH100 requirements (and 8000xH100 for training).
Anonymous No.105843738 >>105843752
so close. >>105843717
Anonymous No.105843752 >>105843807
>>105843738
there

poor guy cant even use a gun right...
Anonymous No.105843764 >>105843812 >>105845364
>>105843688
box or style info por favor?

>>105834947
https://files.catbox.moe/u1aj0s.png
Anonymous No.105843768
What if... llm agent, but for images? It will analyze a picture by itself and send to img2img models fixing hands and other artifacts iteratively? Or adding something new/changing colors/effects/etc.
Anonymous No.105843778
>>105842646
am I doing it right?
Anonymous No.105843803 >>105847427
>>105839234
Catbox please
Anonymous No.105843807
>>105843752
Have a female asian hand hold the gun.

'Goodbye husbando!'
Anonymous No.105843812 >>105843863
>>105843764
https://files.catbox.moe/mexmlo.png
Anonymous No.105843863 >>105843962
>>105843812
>antialiased latent upscale
how do you get this in comfy? this seems like it could solve the jaggies problem and make latent upscales viable
Anonymous No.105843887
a man wearing black sunglasses picks up a large black bomb and throws it, causing a huge explosion of fire and smoke.
Anonymous No.105843959
Anonymous No.105843962
>>105843863
>no results found
Forgechads I kneel
Anonymous No.105843970
a man jumps off a building into a swimming pool.

kek
Anonymous No.105843998
been gone for a while. did chroma get official nunchaku support yet
Anonymous No.105844016
a man drinks a bottle of beer in a dark room at night.

there we go, including a reference to the light levels made the sudden brightness go away.
Anonymous No.105844049 >>105844074 >>105844089 >>105844097 >>105844108
>spend an eternity looking for wan extension workflows that don't burn or use "last frame"
>think of the loop nodes in comfy but too stupid to figure it out
>find workflow on youtube that does all of that with i2v including vace
>behind a patreon paywall/sign up

I hate youtubers
Anonymous No.105844074
>>105844049
ai youtubers are the worst
Anonymous No.105844085
>>105843115
>No action, zero sense of motion or energy in the images
At this point I can only hope video saves image gen somehow. Maybe if AI can do passable "two cars crashing" in video then we'll get models able to gen good static images of it.
Anonymous No.105844089
>>105844049
anon just use logic, you gotta mask the frames you wanna extend with VACE and thats it
Anonymous No.105844097 >>105844304
>>105844049
farukan gogizur
Anonymous No.105844107
a man drinks opens a brown bag of McDonalds and grabs a McDonalds cheeseburger, and eats it.

JC must consume
Anonymous No.105844108 >>105844129 >>105844304
>>105844049
have you tried this one?
https://www.reddit.com/r/StableDiffusion/comments/1llx9uq/
Anonymous No.105844129 >>105844304
>>105844108
Seconding this one, it's the one I used to make this video.
Anonymous No.105844155 >>105844169 >>105844170 >>105847690
3090gods... we won. Can anyone now send the official /ldg/ memo to lodestonesnigger to stop catering to low step vramlet shitters and stop cucking chroma before he ruins it permanently? Thanks.

https://strawpoll.com/XOgOVDj1Gn3/
Anonymous No.105844169
>>105844155
>3060
you know not all of us have 12 gb cards, some of us gen on a laptop
Anonymous No.105844170
>>105844155
>106 votes
LMAO yeah sure
Anonymous No.105844188
Anonymous No.105844195 >>105844225
is there a node that can extract the last frame of a video input? so I can stitch generated clips together for example. Ideally I wouldnt need to use a web app to extract it every time.
Anonymous No.105844202
Anonymous No.105844225
>>105844195
nm load video (vhsloader) does this
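
For doing the same thing outside ComfyUI, here is a hedged sketch that shells out to ffmpeg. It assumes ffmpeg is installed and on PATH; the function names are made up for the example. `-sseof -1` seeks to one second before the end of the input, and `-update 1` keeps overwriting a single output image, so the file that remains is the last decoded frame.

```python
import subprocess


def last_frame_cmd(video_path, image_path, tail_seconds=1.0):
    """Build the ffmpeg argument list for grabbing a clip's final frame."""
    return [
        "ffmpeg",
        "-y",                             # overwrite the output if it exists
        "-sseof", f"-{tail_seconds:g}",   # start this far before end of file
        "-i", video_path,
        "-update", "1",                   # write every frame to the same file
        image_path,
    ]


def extract_last_frame(video_path, image_path):
    """Run ffmpeg; raises CalledProcessError if ffmpeg reports failure."""
    subprocess.run(last_frame_cmd(video_path, image_path), check=True)
```

Calling `extract_last_frame("clip_01.mp4", "clip_01_last.png")` would leave the final frame in the PNG, ready to feed into the next i2v gen.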
Anonymous No.105844289
Anonymous No.105844304
>>105844097
kek

>>105844108
>>105844129
saw this before, was put off by dicking around with picrel. however, he did comment with an automated version, so this should do the trick: https://pastebin.com/TCs9J88i
Anonymous No.105844317
I think it worked? two clips:
Anonymous No.105844329
>chroma
>all that burnt training on distilled flux instead of training wan 1.3b/14b
OH NO NO NO
Anonymous No.105844340 >>105844366
>chroma
>all that burnt training on distilled flux instead of training sana
OH NO NO NO
Anonymous No.105844351 >>105844366
All that burnt training when they could've done a custom model with a 16 channel VAE.
Anonymous No.105844366 >>105844381
>>105844340
Why not Lumina?
>>105844351
Why not use someone else's already spent massive compute as a foundation?
Anonymous No.105844381 >>105844833
>>105844366
I think you grossly overestimate the compute required to train a model. I also think you grossly underestimate how much compute is wasted undoing the lobotomy / redoing a model's understanding of anatomy.
Anonymous No.105844433
Anonymous No.105844471 >>105844958
Anonymous No.105844477 >>105844533
a house at the top of a hill explodes with smoke and fire everywhere.

pretty cool
Anonymous No.105844533 >>105844626
>>105844477
alternatively,

a white house at the top of a hill launches into the air like a rocket, leaving a rocket trail and flames. The camera pans up to show the house in the sky.

not much elevation, but still pretty good!
Anonymous No.105844583 >>105844652
Wan is seriously better compared to Flux/Chroma at generating still images. The video training translates into better still frames.
Anonymous No.105844597 >>105846001
ComfyUI made me realize this hobby requires at least 120 IQ points.
Anonymous No.105844619 >>105845053
Anonymous No.105844626 >>105844633 >>105847177
>>105844533
okay, giving a distance made it move more.

a white house at the top of a hill launches into the air like a rocket, leaving a rocket trail and flames. The camera pans up to show the house in the sky.
Anonymous No.105844633
>>105844626
er,

a white house at the top of a hill launches miles into the sky like a rocket, leaving a rocket trail and flames. The camera pans up to show the house in the sky.

miles is what did the trick.
Anonymous No.105844652 >>105844666
>>105844583
but can it do bobs and vagene
Anonymous No.105844666
>>105844652
Out of the box it's not great but it takes a bare minimum Lora to get it to A-tier.
Anonymous No.105844699
Anonymous No.105844833 >>105844851 >>105844872
>>105844381
NTA, but my opinion is that the quality of the base model very strongly influences the results of a finetune the size of Chroma. Models like Flux and Wan are trained on literally billions of images. LAION alone is like 5b and that's an older dataset. Chroma's 5 million training dataset is nothing in comparison. You absolutely cannot train a model from scratch on 5m images and have it be any decent.
Anonymous No.105844847 >>105844948
I said to space, seems that is not possible quite yet
Anonymous No.105844851
>>105844833
You really think there are billions of unique images? I think you grossly underestimate how much variety is in 5 million images. You do realize they pad "billions" because most of them are duplicates, resizes and crops right? How many thousands of variations of Harold exist do you think?
Anonymous No.105844872 >>105844894 >>105845192
>>105844833
no way in hell flux trained on that many
novelai claims to have trained from scratch and their dataset is at most like 20m
Anonymous No.105844894
>>105844872
I think they do this to intimidate people from trying to train models. It's important the plebs don't realize they can make their own printing press.
Anonymous No.105844948
>>105844847
groq is this real
Anonymous No.105844958
>>105844471
have her hold her sword in front of her hips and twerk
Anonymous No.105844998
>>105842199
naisu
Anonymous No.105845036
>>105843298
i'll miss you anon
Anonymous No.105845053
>>105844619
nice
Anonymous No.105845130 >>105845158
Give the man black sunglasses, he is holding a large bag of money, overflowing with dollar bills. On the bag is the text "KARL" in scribbled font. He is wearing a black baseball cap that says "KING OF KONG" in white text.

kontext is so fun. it's like inpainting evolved, but does stuff inpainting can't.
Anonymous No.105845158 >>105845188
>>105845130
give him an anime gf
Anonymous No.105845188 >>105845200 >>105845206
>>105845158
anime girl Miku Hatsune is standing beside the man, wearing a black baseball cap saying "karl LOST" in white text.

and this is one image, if I want a better miku I just put a good miku picture in the second image input

workflow: https://openart.ai/workflows/amadeusxr/change-any-image-to-anything/5tUBzmIH69TT0oqzY751
Anonymous No.105845192 >>105845242
>>105844872
Even pixart alpha, that was woefully undertrained, claimed to use at least 25M images. Obviously pixart alpha is too small to be a viable modern base. But if the number of parameters is passable and vae and text encoder are modern, why throw away those 25M images already trained in, unless there's an architectural breakthrough? Lodestone's dataset is around 5 times smaller, if I remember correctly.
Anonymous No.105845197 >>105845205 >>105845237 >>105845249 >>105846001
I'm sick and tired of 1girl effortless AI slop in this thread.
Anonymous No.105845200 >>105845244
>>105845188
make the anime girl smoke a pipe lol
Anonymous No.105845205
>>105845197
>I'm sick
You can always commit suicide, that way all your problems go away (you are the problem)
Anonymous No.105845206 >>105845214
>>105845188
and this is with 2 images (bypass the 2nd input if you just want a solo image for input)

anime girl with teal hair Miku Hatsune is standing beside the man, wearing a black baseball cap saying "karl LOST" in white text.

it just works.jpg
Anonymous No.105845214 >>105845244
>>105845206
and fixed the hat with a simple hat text prompt:
Anonymous No.105845237
>>105845197
for every one kinosoul 1girl there are 5b effortless 1girls
Anonymous No.105845242 >>105845708
>>105845192
Boy you quickly gave up billions of images huh? Maybe it's not much use to talk to someone who is ignorant.
Anonymous No.105845244 >>105845252
>>105845214
>>105845200

one more! revised:

pink hair anime girl is standing beside the man in a black baseball cap. she is smoking a pipe. change the location to a bank. keep her blue and yellow hairclip the same. keep the man's pose the same.
Anonymous No.105845249
>>105845197
>complainer
>nogen
quite literally, everystein.singleberg.timeowitz.
Anonymous No.105845252 >>105845277 >>105845299
>>105845244
kek, double pipe this gen
Anonymous No.105845277
>>105845252
lol
Anonymous No.105845299
>>105845252
bonus: anime billy
Anonymous No.105845302
>>105842620 (OP)
Where did that dual clip thing come from?
Anonymous No.105845307 >>105845312 >>105845389 >>105847154
>3090
>no fp8
>sage attention doesn't work
>torch compile does nothing
lol, lmao even
Anonymous No.105845312
>>105845307
Also almost 5 years old.
Anonymous No.105845324 >>105845344
The man is pointing and laughing at a blonde swedish man wearing a t-shirt that says "KARL JACOBS", who looks very upset.

kek, I dont know if he is swedish so I used that as a generic npc.
Anonymous No.105845344
>>105845324
oops, cant forget to make sure he is still holding money.
Anonymous No.105845345 >>105845349
Anonymous No.105845349
>>105845345
interesting leg
Anonymous No.105845352 >>105845792
change the location to a mcdonalds restaurant. the man is sitting at a table eating a McDonalds Big Mac. His table is surrounded with hundreds of cheeseburgers.

JC needs to eat so he can stop the illuminati
Anonymous No.105845364
>>105843764
Thank you!
Anonymous No.105845389 >>105847154
>>105845307
3090 shills deserve death
Anonymous No.105845416 >>105845829
is ai art getting shittier by the day?
Anonymous No.105845455
Anonymous No.105845539 >>105845570
>>105843384
>nah, i've plateaued in skill
pic unrelated? your gens look like shit dude.
Anonymous No.105845570 >>105845572
>>105845539
I like them, personally.
Anonymous No.105845572
>>105845570
I seen better
Anonymous No.105845643 >>105845654
The man is wearing a hat saying "#1 illuminati fan". keep his pose and expression the same. the image is in a pixel art style.

neat
Anonymous No.105845645 >>105845816 >>105847448
What would you prompt to get weird / unorthodox / asymmetrical lewd swimsuits?
Anonymous No.105845654
>>105845643
and without pixel art
Anonymous No.105845679 >>105845765
Anonymous No.105845708
>>105845242
Nta.
Anonymous No.105845742 >>105845752 >>105845762
two image inputs:

The man is shaking hands with the pink hair anime girl. the background is black.
Anonymous No.105845752
>>105845742
Anonymous No.105845762 >>105845771
>>105845742
please share the catbox
thanks
Anonymous No.105845765
>>105845679
I knew it!
Anonymous No.105845771 >>105845778
this one turned out better:
>>105845762
same prompt I used in the post.
Anonymous No.105845778 >>105845796
>>105845771
I know, I need the workflow
my two image workflow is broken
please fren
P0STCARD No.105845792
>>105845352
excellent
Anonymous No.105845796 >>105845801
>>105845778
https://files.catbox.moe/tfnkvg.png

got it from here: https://openart.ai/workflows/amadeusxr/change-any-image-to-anything/5tUBzmIH69TT0oqzY751
Anonymous No.105845801
>>105845796
thank you
Anonymous No.105845803
there, bit better proportions:
postcard No.105845816 >>105845876
>>105845645
Turn cfg low & let the prompt run for 20 iterations w/ “loose settings”
U sometimes get neat outfits this way
Anonymous No.105845829
>>105845416
all the top posters are rangebanned by the baker again
grim
Anonymous No.105845864
diff image

The man is sitting at a computer and is typing. the pink hair anime girl is waving hello. the background is black. keep the man's expression the same.
Anonymous No.105845876
>>105845816
>>105845754
fat bitch
>>105836371
Anonymous No.105845908
The man is sitting at a computer and is typing in a dimly lit office. A rectangular sign above says "glowie HQ" in yellow text.
Anonymous No.105845936 >>105847463
>show OCD friend my Comfy workflow
>he loses his mind

It's not that bad, right?
Anonymous No.105845985
Anonymous No.105846001
>>105844597
You just need the right motivation (genning the waifu, gooning, etc) to get your footing. It's not TOO horrible... except when everything breaks.

>>105845197
Be the change you want to see!
Anonymous No.105846045
Anonymous No.105846064
Anonymous No.105846073
Anonymous No.105846115
Anonymous No.105846125
quiet tonight...
Anonymous No.105846131
Anonymous No.105846160
Anonymous No.105846174
Anonymous No.105846194 >>105846214
Survey
https://strawpoll.com/XOgOVDj1Gn3/results
Anonymous No.105846214 >>105846220
>>105846194
would have been 17 for 3060 had I voted
Anonymous No.105846220
>>105846214
Survey
https://strawpoll.com/XOgOVDj1Gn3
Anonymous No.105846274
Anonymous No.105846345 >>105846386 >>105846484 >>105846637 >>105846659
>flux kontext
use case?
Anonymous No.105846386 >>105846408
>>105846345
Correct small details and do meme I guess
If it was uncensored AND could keep artstyle, it would have killed loras and this would have been big
Unfortunately it didn't
Anonymous No.105846408
>>105846386
>If it was uncensored AND could keep artstyle, it would have killed loras and this would have been big
>Unfortunately it didn't
When will we get this?
2 more weeks or anything that's actually on the horizon?
Anonymous No.105846484
>>105846345
fry your image in just 5 revisions!
Anonymous No.105846632
>>105843115
>SDXL
Dude. You're using a 2-year-old obsolete clip_l based model. It can at most understand one (1) character doing one (1) simple thing if you prompt it right. With NoobAI/Illustrious we maximized the fuck out of that architecture and did things that shouldn't be possible, but it's like putting a new coat of paint on a 1960s Ford 2. You won't win any racing competition with it.

Flux-dev 1.0 dual clip/t5 has a more recent architecture and prompt comprehension, which means it is only one year obsolete now. Still pretty bad, and Flux-dev has huge issues with its guidance distillation which makes it kinda retarded a lot of the time. But it's technically better than SDXL in prompt following, from 1/10 to 3/10.

If you want decent prompt following check HiDream (a solid 5/10 on prompt following), but for some reason people decided two months ago they didn't like HiDream.
Anonymous No.105846637
>>105846345
I think it's useful if you need specifically the thing it does. For everything else it's just a deepfryer and shitty mememaker
Anonymous No.105846641 >>105846735
>finally set up everything
>can now generate as many fatties as i wish
I will dehydrate from all the gooning holy shit
Anonymous No.105846659
>>105846345
For me? Remove clothes, and change anime girls into realistic. Both are so-so and for now worse than doing manual inpaint and stuff, but you don't need to do manual inpaint.

Also for some reason my outpainting of characters is better with Kontext than with Fill.
Anonymous No.105846735 >>105846749
>>105846641
>anon can generate anything
>wastes it on fatasses
Anonymous No.105846749
>>105846735
I'm considering taking vacation desu
FUCK YES
Anonymous No.105846768
Anonymous No.105846780 >>105846792
wan is still the lightest i2v, right?
Anonymous No.105846792 >>105846807
>>105846780
Isn't it the only good i2v? I mean there are lighter ones but they're more proof of concept, and Hunyuan Video is shit at i2v.
Anonymous No.105846807 >>105846822
>>105846792
I don't know, I don't lurk here much. Just waiting for some simple model that can animate images, maybe even without conditioning. Wan is way too slow.
Anonymous No.105846822 >>105846876
>>105846807
Slow generation times or slow to input what you want?
Anonymous No.105846876 >>105846939
>>105846822
I mean waiting for 20+ minutes for a video that doesn't even follow a simple prompt most of the time
Anonymous No.105846939 >>105846974
>>105846876
sage attention + teacache can help you to halve the render time, but yeah it's slow. But it's the only one that really works beyond trivial stuff.

But if you want trivial stuff like people breathing or something, LTX video and CogVideoX are faster, and far more limited.
Anonymous No.105846974 >>105847002
>>105846939
LTX Video claims to
>produces 30 FPS videos at a 1216×704 resolution faster than they can be watched.
Sounds interesting if it's even remotely true
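
For reference, "faster than they can be watched" just means the generation time is shorter than the clip's playback time. A rough sketch of that check; the function names are made up here, and the frame count and timings in the usage note are placeholders, not measurements of LTX or Wan:

```python
def clip_seconds(num_frames, fps):
    """Playback length of a clip in seconds."""
    return num_frames / fps


def faster_than_playback(num_frames, fps, gen_seconds):
    """True if generating the clip took less time than watching it."""
    return gen_seconds < clip_seconds(num_frames, fps)
```

So a hypothetical 150-frame clip at 30 FPS plays for 5 seconds: generating it in 4 seconds would count as faster than playback, while a 20-minute (1200 s) render would not.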
Anonymous No.105847002 >>105847163
>>105846974
Well, LTX is blazing fast.

It's also shit at anything that isn't "camera immobile/slightly panning in/slightly panning out" and "character standing, breathing" or "character sitting, breathing", occasionally "character walking".
Anonymous No.105847014 >>105847100
>another cool little wan speed boost we'll probably never see for comfy
https://github.com/madebyollin/taehv
Anonymous No.105847090 >>105848289 >>105848314
Can you control an AI by inputting an image that is a screenshot of code and having the AI execute it?
Anonymous No.105847100 >>105847145
>>105847014
Isn't that already implemented on WanVideoWrapper?
Anonymous No.105847132
>>105843685

Do you mind catboxing one of your I2V gens? Thanks in advance!
Anonymous No.105847145 >>105847186
>>105847100
you're right, I was googling the wrong thing, disregard my tard comment

https://huggingface.co/Kijai/WanVideo_comfy/blob/main/README.md
Anonymous No.105847154
>>105845307
>>105845389
anything better for under $1000?
Anonymous No.105847163
>>105847002
>It's also shit at anything that isn't "camera immobile/slightly panning in/slightly panning out" and "character standing, breathing" or "character sitting, breathing", occasionally "character walking".

Though this was their 2b model. I see they've published a 13b model. I know what I'll be testing tonight.
Anonymous No.105847177
>>105844626
KCD IV looks wild.
Anonymous No.105847186 >>105847214
>>105847145
No worries. It might be compatible with vanilla comfyui as well. You can just replace them for a vae.

https://huggingface.co/Kijai/WanVideo_comfy/blob/main/taew2_1.safetensors

I use tiny encoder for sdxl previews, it's really good.
Anonymous No.105847214
>>105847186
Thanks, I'll give it a try later after work
Anonymous No.105847299 >>105847308 >>105847354 >>105848653
Anonymous No.105847308
>>105847299
Anonymous No.105847347 >>105847389 >>105847657
Canny Valley
Anonymous No.105847354
>>105847299
I don't get it, how are you supposed to inspect it with your shirt over your head?
Anonymous No.105847380
>>105843107
I like this
Anonymous No.105847389 >>105847657
>>105847347
more of a canny channel really
Anonymous No.105847427 >>105847614 >>105848006
>>105843803
https://files.catbox.moe/icjkb3.png

There ya go
Anonymous No.105847448
>>105845645
You could also erase some part of a normal one by hand, add extra lines and feed it through again
Anonymous No.105847463 >>105847550
>>105845936
This is nothing yet, it will grow larger in time
Anonymous No.105847550 >>105847604
>>105847463
anon mine hasn't grown larger in 15 years
Anonymous No.105847559 >>105847569 >>105848805
Am i retarded or is there no way of moving a model from one device to another once it's loaded? I have this problem where I run 2 large models one after another and the second one is really slow since my vram can't fit both of them, but swapping them between devices would speed things up
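
For context on the idea itself: in plain PyTorch terms a loaded module can be moved between devices with `.to(...)`, so two models that don't fit in VRAM together can take turns. A minimal sketch, assuming each model is anything with a PyTorch-style `.to(device)` method; `on_device` and `run_in_turn` are made-up helper names, and this says nothing about how ComfyUI's own model management behaves:

```python
from contextlib import contextmanager


@contextmanager
def on_device(model, device="cuda", offload="cpu"):
    """Hold `model` on `device` inside the with-block, then park it on `offload`.

    Hypothetical helper: `model` only needs a PyTorch-style .to(device) method
    (torch.nn.Module.to moves parameters in place and returns the module).
    """
    model.to(device)
    try:
        yield model
    finally:
        # Runs even if inference raises, so the next model gets the room.
        model.to(offload)


def run_in_turn(models, inputs, device="cuda"):
    """Run each model on its input with only one model resident on `device`.

    Inputs are assumed to already live on `device`.
    """
    outputs = []
    for model, x in zip(models, inputs):
        with on_device(model, device):
            outputs.append(model(x))
    return outputs
```

The copy between CPU and GPU costs time on every swap, so this only pays off when it is cheaper than running the second model while it spills out of VRAM.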
Anonymous No.105847568
>Am i retarded
ngl I stopped reading after that
Anonymous No.105847569
>>105847559
in comfy
Anonymous No.105847577 >>105847614
Anonymous No.105847604
>>105847550
Tug on it more maybe something will happen
Anonymous No.105847614
>>105847427
NTA but thank you, this is great.
>>105847577
Maximum neuron activation.
...catbox?
Anonymous No.105847647 >>105847714
SDXL:
>2023 release
>1024x1024
>3.5b parameters
>learns styles in under 10 epochs
>understands complex sex positions
>outputs in 3 seconds without any copechaku quanting needed
Chroma:
>2025 release
>512x512
>8b parameters
>hasnt learned a single style in over 40 epochs
>no characters
>melted anatomy and duplicate limbs
>barely understands POV missionary
>takes 20 seconds per image on a 4090

i'm thinking 3 more years of SDXL
Anonymous No.105847655
you're supposed to prompt the style in, retard
Anonymous No.105847657 >>105847672 >>105848031
>>105847347
>>105847389
>canny channel
Are you kidding me? It's right there
A canny canyon
Fuck you ESLs
Anonymous No.105847663
If you weren't a poorfag you'd use a video model.
Anonymous No.105847672
>>105847657
>A canny canyon
so a cannyon?
Anonymous No.105847690 >>105847837
>>105844155
>no M3/M4
nvidia nerds will never learn what 120GB VRAM feels like
Anonymous No.105847714
>>105847647
U-net vs DiT
Anonymous No.105847837
>>105847690
>540GB/s
>$19,999.99
lol
Anonymous No.105847856
so what upscaler or settings work best for anime/drawn hires fix?
can't figure out or find why it turns into a smudge, while realistic checkpoints nicely enhance details and fix mistakes
Anonymous No.105847884 >>105848259
which one should I buy for genning?
https://mdcomputers.in/catalog/graphics-card/nvidia/rtx-50-graphics-card/rtx-5090-graphics-card
Anonymous No.105847906 >>105847913
anyone have experience with character consistency? i was thinking of genning a face then face swapping it onto the image. and using wan to gen a video, then use frames of that video to have the character standing vs sitting
Anonymous No.105847913 >>105847941
>>105847906
kontext
Anonymous No.105847941
>>105847913
Nta, but I couldn't make kontext *swap* faces in particular. It seems to treat this request as a deepfake threat.
Anonymous No.105848006
>>105847427
Thanks based anon, also this image is good as well.
Anonymous No.105848031
>>105847657
more of a canny chasm really
Anonymous No.105848116
>>105843298
>in a time when models are being reported and deleted for no reason through false reports
what a stupid thing to do.
Anonymous No.105848136 >>105848288 >>105848408
Anonymous No.105848259
>>105847884
Astral is the only real choice because you can check the per pin power/amps to make sure your connector isn't going to fucking melt. That said, I thought it was too loud (even in quiet mode) when genning so I put it in a custom loop.
Anonymous No.105848288
>>105848136
Nice, Chroma looks promising for sure
Anonymous No.105848289
>>105847090
4o can read text in images and can read and interpret code so I guess it can do that.
Anonymous No.105848314
>>105847090
you typically control models through strings (which are converted to tokens blah blah), much more convenient
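
To make the "strings converted to tokens" bit concrete, here is a toy sketch. It assumes a whitespace tokenizer with a tiny vocabulary built on the spot; real models use subword tokenizers with a fixed trained vocabulary, and `build_vocab` / `encode` are illustrative names, not any library's API:

```python
def build_vocab(corpus):
    """Map each distinct lowercase word to an integer id; 0 is reserved for unknowns."""
    vocab = {"<unk>": 0}
    for word in corpus.lower().split():
        # setdefault keeps the first id assigned to a word.
        vocab.setdefault(word, len(vocab))
    return vocab


def encode(prompt, vocab):
    """Turn a prompt string into the list of token ids a model would consume."""
    return [vocab.get(word, 0) for word in prompt.lower().split()]


vocab = build_vocab("a pink hair anime girl waving hello")
print(encode("anime girl waving", vocab))
print(encode("anime robot", vocab))  # "robot" is not in the vocab, so it maps to 0
```

The model never sees the text itself, only the id sequence, which is why feeding it a plain string is simpler than a screenshot that first has to be read back into text.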
Anonymous No.105848353
Anonymous No.105848408
>>105848136
How long did the training take?
Anonymous No.105848481
causvid and other speed-up loras produce almost no motion. the "solution"? let's create a dual sampler workflow that looks like cancer and shill it. both samplers use 8 steps, so that's 16 steps total, what's the fucking point then?
Anonymous No.105848511
vramlet here
Can I use this to make little animations?
https://huggingface.co/CiaraRowles/TemporalDiff/blob/main/temporaldiff-v1-animatediff.safetensors
Anonymous No.105848653 >>105848749
>>105847299
I made those original target pics around the end of May. can I ask how you came across them?
Anonymous No.105848681 >>105848758 >>105848762 >>105848795 >>105848799
umm?
Why did the thread die all of a sudden?
Anonymous No.105848749 >>105848768 >>105848936
>>105848653
>I made those original target pics around the end of May. can I ask how you came across them?
Can't remember, just another prompt in my wildcards. Perhaps you posted your prompt around then and I saved it
Anonymous No.105848758
>>105848681
huh?
Anonymous No.105848762
>>105848681
I stopped genning
Anonymous No.105848768 >>105848771 >>105848936
>>105848749
women as sex robots is the lowest form of sci fi
Anonymous No.105848771
>>105848768
cry more feminist
Anonymous No.105848790 >>105848803
no matter how many times I download these and restart it keeps showing missing again and again
Updated cumfy too
Anonymous No.105848795
>>105848681
China went to bed.
Anonymous No.105848799
>>105848681
because you showed up
Anonymous No.105848803
>>105848790
Install them from their git pages
I noticed this yesterday with some nodes, installing from git fixed it.
Anonymous No.105848805
>>105847559
Anyone?
Anonymous No.105848853 >>105848872 >>105848973
it's a little ropey but that's the effect I wanted
Anonymous No.105848872 >>105848880
>>105848853
That's really great. Wasn't there some 90's tv show with stuff like this
Anonymous No.105848880 >>105848901
>>105848872
funny you should say that, that's what I trained it on
Anonymous No.105848901 >>105848910
>>105848880
Was it the Sabrina witch? Ancestral penis memory
Anonymous No.105848910 >>105848959
>>105848901
yes I sat through 7 seasons of the shit. They really scaled back the effects after the first 2, that's where most of them are
and yeah I did get a chub at parts, but mostly it's horrible cringe and I'm embarrassed for everyone involved
Anonymous No.105848936
>>105848749
awesome, I also posted a collection to civitai with full metadata. prompt sharing is fun

>>105848768
yeah, it's a classic
Anonymous No.105848959
>>105848910
The talking cat looked like shit, that I remember
Anonymous No.105848973
>>105848853
please tell me you did i2v too
Anonymous No.105848980
Finally, Yelakeqrinde
Anonymous No.105848984
>>105848978
>>105848978
>>105848978
>>105848978
Anonymous No.105849251
>>105842620 (OP)
Is there a desktop application that I can use like notebooklm or something, but also allows me to use API keys if necessary?
I usually jailbreak deepseek remotely via an "untrammeled" prompt and it has been great so far just as an erp bot, but I want something that helps me use it to learn and improve my notes.
I also want to get better at prompting, or at least understand it better, since deepseek either has a very disgusting and frustrating aneurysm or gets its ethical guard up as it tries to lecture me on topics I couldn't give a rat's ass about as I press the stop button.
Anonymous No.105849381
>>105842651
Install linux while you wait; a dependency for it is flashinfer and that's linux only.