Discussion of Free and Open Source Text-to-Image/Video Models
Prev: >>105836648

https://rentry.org/ldg-lazy-getting-started-guide

>UI
SwarmUI: https://github.com/mcmonkeyprojects/SwarmUI
re/Forge/Classic: https://rentry.org/ldg-lazy-getting-started-guide#reforgeclassic
SD.Next: https://github.com/vladmandic/sdnext
ComfyUI: https://github.com/comfyanonymous/ComfyUI
Wan2GP: https://github.com/deepbeepmeep/Wan2GP
>Checkpoints, LoRAs, & Upscalers
https://civitai.com
https://civitaiarchive.com
https://tensor.art
https://openmodeldb.info
>Tuning
https://github.com/spacepxl/demystifying-sd-finetuning
https://github.com/Nerogar/OneTrainer
https://github.com/kohya-ss/sd-scripts/tree/sd3
https://github.com/derrian-distro/LoRA_Easy_Training_Scripts
https://github.com/tdrussell/diffusion-pipe
>WanX (video)
Guide: https://rentry.org/wan21kjguide
https://github.com/Wan-Video/Wan2.1
>Chroma
Training: https://rentry.org/mvu52t46
>Illustrious
1girl and beyond: https://rentry.org/comfyui_guide_1girl
Tag explorer: https://tagexplorer.github.io/
>Misc
Local Model Meta: https://rentry.org/localmodelsmeta
Share Metadata: https://catbox.moe | https://litterbox.catbox.moe/
Img2Prompt: https://huggingface.co/spaces/fancyfeast/joy-caption-beta-one
Samplers: https://stable-diffusion-art.com/samplers/
Txt2Img Plugin: https://github.com/Acly/krita-ai-diffusion
Archive: https://rentry.org/sdg-link
Bakery: https://rentry.org/ldgcollage | https://rentry.org/ldgtemplate
>Neighbours
https://rentry.org/ldg-lazy-getting-started-guide#rentry-from-other-boards
>>>/aco/csdg
>>>/b/degen
>>>/b/celeb+ai
>>>/gif/vdg
>>>/d/ddg
>>>/e/edg
>>>/h/hdg
>>>/trash/slop
>>>/vt/vtai
>>>/u/udg

>Local Text
>>>/g/lmg

>Maintain Thread Quality
https://rentry.org/debo
>>105842620 (OP)
neighbors list seems v outdated
Ai technologies have proliferated to several other boards...
hmmmm
New SLG implementation is finally live.
https://github.com/comfyanonymous/ComfyUI/pull/8759
comfy should be dragged out on the street and shot
>radial attention waiting room
>>105842648
that's a bit harsh. I just hope someone btfo his app and it becomes irrelevant
Good Evening and Happy where the fuck is radial attention
I love ComfyUI so god damn much. Updated frequently, implements improvements from the community, it's fast, it's flexible, very modular, it's clean and easy to develop custom nodes for.
I couldn't ask for anything better. Thank you Comfy for allowing us peasants to seamlessly and effortlessly produce AI content!
>>105842651
i wouldn't expect anything until next year considering their current pace. on the bright side, that gives you plenty of time to save up for a 5090 or 6000
>>105842659
>that's a bit harsh.
nowhere near enough
>>105842667
>I love ComfyUI so god damn much. Updated frequently
this is b8
vace + miku + generic model runway video:
this is with causvid, gonna try the light2x lora as well.
>>105842659
>>105842648
rude.
i still use it for a bunch of autistic stuff
it's not my "favorite" interface by any means
but surely you can at least appreciate its use-case
>>105842698
this used the default canny processor, low 0.1, high 0.3 (otherwise it wasn't detecting the edges in the vid)
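For anyone wondering what those low/high values actually do: they feed Canny's double-threshold (hysteresis) step. This is just a minimal numpy sketch of that logic, not the ComfyUI preprocessor's actual code; lowering `low` like the anon did lets weaker edges survive as long as they touch a strong one.

```python
import numpy as np

def double_threshold(grad, low=0.1, high=0.3):
    """Classify normalized gradient magnitudes the way Canny's hysteresis
    thresholds do: strong edges (>= high) are always kept, weak edges
    (between low and high) survive only if they touch a strong edge,
    and everything below low is discarded."""
    strong = grad >= high
    weak = (grad >= low) & ~strong
    # one dilation pass of the strong mask; real Canny iterates to closure
    pad = np.pad(strong, 1)
    near_strong = np.zeros_like(strong)
    for dy in (-1, 0, 1):
        for dx in (-1, 0, 1):
            near_strong |= pad[1 + dy : pad.shape[0] - 1 + dy,
                               1 + dx : pad.shape[1] - 1 + dx]
    return strong | (weak & near_strong)
```

If a faint video frame produces gradients that all sit below `high`, nothing seeds the strong mask and you get no edges at all, which is why dropping the thresholds fixed detection here.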
>>105842689
No, it's not. I mean that from the bottom of my heart. I am sorry you are too low IQ to fully utilize the GOD like power of ComfyUI. Maybe read some books or something so that one day, you too, can become enlightened, my brainlet friend. I will be waiting for you at the Comfy Altar.
lightx2v lora instead of causvid at 1.0 strength:
works fine. didnt specify clothes so the clothes here are different. prompt is just "the girl is showing off her clothes."
Miku + Kiryu slamming a desk and walking away:
but the lora does work just fine at 1.0 str. need to test more though
>>105842727
>>105842698
try: https://tensor.art/models/872743460111704414
&
https://tensor.art/models/839853388687731926
>>105842768
When are we gonna get past the point where everything feels like it's underwater
>>105842646
city96 is a cool dude
Any word on local 3D model generation or UIs?
>>105842775
the output framerate is low. this is just testing outputs, that's like 12fps. higher fps or interpolation helps a lot.
>>105842775
adjust the negative prompt, use proper wan & those errors are (mostly) mitigated
>neg: slow movement, slow motion, freeze-frame, etc
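On the interpolation point: a naive illustration of what frame interpolation does to a low-fps gen. Real interpolators (RIFE, FILM, etc.) estimate motion between frames instead of blending pixels, so this crude averaging sketch is only the shape of the idea, not what those tools actually compute.

```python
import numpy as np

def double_fps(frames):
    """Insert the average of each consecutive pair of frames, roughly
    doubling the effective framerate (e.g. a choppy 12fps wan gen toward
    a smoother ~24fps). frames: list of same-shaped uint8 arrays."""
    out = []
    for a, b in zip(frames[:-1], frames[1:]):
        out.append(a)
        # blend in float to avoid uint8 overflow, then cast back
        out.append(((a.astype(np.float32) + b.astype(np.float32)) / 2)
                   .astype(a.dtype))
    out.append(frames[-1])
    return out
```

Pixel blending ghosts anything that moves fast, which is exactly why the flow-based interpolators exist.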
>>105842787
it exists, like hunyuan3d-2; the typical ui is comfyui, as nearly always
most people here don't do much or any actual 3d modeling at this point
how am I getting torch oom if I close comfy and reopen it, it worked fine a gen ago.
what are the settings for res_3m image 2 image? my shit looks a little cooked
>>105842812
I picked a diff clip with shorter length and now it's fine, but the frames are set to 81 so why does it matter?
oh...the canny node is trying to preview all 27 seconds, not the 81 frames (4 seconds)
explain comfy memory usage to me. I used to have 32gb and got a deal on 64gb of better latency RAM, yet sometimes 50gb is in use.
what is going on? does it try to populate as much memory as possible?
>>105842809
Tripo Studio is getting pretty good, wish I had something like that locally.
>>105842646
doesn't matter for us FusionXisters or lightx2virgins right?
>>105842792
>>105842801
>no examples
Yeah sure buddy. I have yet to see a local video where the character actually has a "pop" to their movements
>>105842677
well they've been updating it around every weekend so, I'd give it 2 months tops
T_T
>>105842839
Stop using custom nodes, update pytorch to the latest version.
>>105842894
>more underwater slop
Is this supposed to be a joke or something? Go outside, that's not how people move in real life. Even more so in animated film
>>105842646
Sell me on skip layer as a concept, I've never used it, am I being retarded?
>>105842919
It generates good hands. It generates non-blurry hands when using TeaCache with WAN.
>>105842890
annoying that "coffee" has such a strong bias towards a starbucks cup in wan t2v.
"milk" is heavily biased towards glass bottles and cartons, too.
>>105842915smoke and a pancake?
cigar and a waffle?
THEN THERE IS NO PLEASING U
kling will have "less water" but its queue system is annoying as fuck & no one should be supporting closed-source fag shit
>>105842646
>>105842930
based anon i'll look into it <3
Excuse me, with apologies to the anons I will have a meltie.
>>105842903start explaining shit instead of running away from the problem
>thinks its not obvious when he takes off his trip
>thinks bananas aren't fruits
>>105842996
that's a diff anon, I don't want to break torch so what's the ideal way to do so
or just get rid of some custom nodes?
>>105842646
could this fix sd3.5m? testing...
can anyone share a good workflow for the new chroma rl low steps?
>>105842990
1) The Core Problem Nobody's Addressing
Everyone's avoiding the elephant in the room: these models are fundamentally stupid. People keep vibe-merging, but nobody tackles the real cognitive limitations of these models. Yes, we have millions of LoRAs and checkpoints for art styles, copyrighted characters, fixing five fingers, preventing extra arms.
MILLIONS OF THEM! STOP MAKING MORE!
2) Basic Spatial Understanding is BROKEN
DON'T YOU REALIZE THAT IF I TELL SDXL TO HAVE MY CHARACTER LOOK AT HIS PALM HE DOESN'T UNDERSTAND WHERE HIS HAND IS OR HIS PALM?
WHY DO I WANT GOOD AESTHETICS OR 2025 ART STYLES? YES, IT'S OBVIOUSLY 1000 TIMES EASIER TO TAKE SCREENSHOTS AND TAG THEM IN A SLOP VLM THAN TACKLE THE REAL COGNITIVE PROBLEM!
3) The "SOVL" Problem
I can't bring my characters to life with these shitty models. Sure, I can generate images, but they lack SOVL. Having to micromanage everything manually just kills the creative process.
I had more fun generating images with NovelAI using temp emails for 30 free 1024x1024 images than with my 24GB VRAM PC. Why does NovelAI have SOVL? It's like it reads my subconscious. It's frustrating we can't get it locally or buy it like a Steam game.
4) Local Models Can't Handle Basic Scenes
Local models constantly ignore prompts:
Character staring at the sea? Nope, they'll be looking anywhere but there
Character spiking a volleyball mid-air, crashing through the net? NO WAY!
It doesn't matter which checkpoint or version - they're all the same stupid model with minor tweaks.
5) TLDR: What's Even the Point?
Local models just create 2D mannequins in random poses. No action, zero sense of motion or energy in the images.
Life's too short to break scenes into 500 tags and read hundreds of articles, only for the model to grasp maybe 15% of your vision and produce the usual slop.
>>105843115
I agree except for glazing saas. every model lacks sovl. I don't think data scientists have good taste when it comes to selecting data. it's slop all the way down
>>105843115
Yeah I've been playing the patience card for the past year since we saw such exponential growth before then but at this point it's just getting weird how bad prompt comprehension and model intelligence is for local. At least now we can have Kontext fix up mistakes for image gen, but I worry for video gen since it's facing the same issues with more difficult scaling laws
>>105843115
you are right about literally everything except novelai. novelai sucks, midjourney and dalle 3 were the only models with actual sovl
The true artist does not blame his tools.
>>105843151
a true artist has the right tools in the first place
>>105843115
If you want great control over the output, just do simple img2img generations like anyone who isn't retarded.
You can get away with VERY SIMPLE drawings / paintings as long as you add a bit of noise to the drawing before you do img2img, this is also the TRUE creativity with ai imagegen, since you are not just rolling the dice hoping for something cool, you are actively directing where things should go, what pose a person should have, the exact composition etc.
Don't blame the tool because you've never even tried to move past the most basic use.
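The add-noise trick that anon describes can be sketched like this. It operates on the raw uint8 image before it ever reaches the sampler; the `strength` default is my guess, tune it per model, and in practice you'd also rely on the denoise setting of the img2img pass itself.

```python
import numpy as np

def noisy_init(drawing, strength=0.15, seed=0):
    """Add gaussian noise to a rough drawing (uint8 HxWxC array) before
    img2img. Flat MS-Paint-style fills give the sampler nothing to latch
    onto; a bit of texture lets it hallucinate detail where you drew it.
    strength is a fraction of the 0-255 range."""
    rng = np.random.default_rng(seed)
    noise = rng.normal(0.0, strength * 255.0, drawing.shape)
    # add in float, then clamp back into valid pixel range
    return np.clip(drawing.astype(np.float32) + noise, 0, 255).astype(np.uint8)
```

Load your sketch with whatever image library you use, run it through this, save, and feed the result to the img2img input.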
a man wearing dark black sunglasses looking up at the sky, eats a McDonalds cheeseburger.
720p q8 wan but at a smaller size is pretty fast for gens with the lora. (light2x)
>>105842903
I ran update_comfyui.bat, launches with pytorch version: 2.7.1+cu128 - is that right?
>>105843062
>>105842646
sd3.5m still has the bad hands, not sure if I see any improvement from this really.
a man wearing dark black sunglasses looking up at the sky, opens a pizza box and eats a pizza slice. (interpolated output)
>why food?
to test.
>>105843161
I will keep genning locally, seethe
a man wearing dark black sunglasses fires a rocket launcher at the black helicopter in the sky behind him, causing it to explode in fire and smoke.
he didnt do it, but you get some neat special fx anyway!
>>105843238
Muffin to see here.
>>105843238
It was cute until she defiled the blueberry muffin
>>105843238
based. can wan render a guy drinking coffee out of her head?
>>105843216
okay, ALMOST the desired result.
just deleted all of my models and 99.9% of gens
praying I escape for good this time
>>105843298
but why, AI is fun, my GPU isn't just for games
>>105843298
You are just going through a bit of summer depression, you'll be so pissed at what you did when it subsides.
>>105843298
nice ... freckles
>>105843298
you can't escape ai, goyim
>>105843328
thanks, it's all from a lora and 0 skill
>>105843308
I spend enough time behind the computer as it is
>>105843326
nah, i've plateaued in skill and don't feel like learning anymore
>>105843330
true, a lot of businesses use boomer slop AI images in their marketing nowadays
here's the catbox in case somebody cares, maybe sth interesting for 1girl aficionados:
https://files.catbox.moe/ktvraz.jpg
a man on a bicycle rides it off a ramp and flies high into the sky. he pumps his fist in the air.
Todd has Skyrim magic.
>>105843270
Closest I got that wasn't a dude popping a coffee cup into existence.
>>105843447
bruh imagine touching cappuccina ballerina's ass and kissing the rim of her head like that
what's best? hunyuan or wan? working with a 12GB 4070 and only really do T2V
>>105843493
wan is best, use the rentry workflow + the lora for way faster gens
multigpu node lets you use virtual vram so you can use larger models too.
"VHS-style" gens anon, can you kindly share your prompts? I've been replying to your posts in a couple of threads
Older versions of Chroma used to nail the aesthetic with ease, now it only produces cinematic slop
>>105843298
fuck you and see you tomorrow
>>105843598
don't use detailed
A man with a beard holds up a large bag of money with a dollar sign symbol on the bag. He smiles.
>>105843662
now make one with the jobst retard lmao
>>105843660
Are you that anon? Gib prompt pls
>>105843662
changed size, still got same type of result, gen time much faster (messing with 720p Q8 wan, and comparing to 480p)
>>105842839
>>105841774
>Is it just me or does ComfyUI freeze the pc every few WAN gens?
Try disabling "smart" memory.
>>105843662
Kinda needs Jobst crying, but not bad
>>105843669
success, picked a random google image result
"a blonde man sits at a desk and starts crying."
>>105843115
Enjoy your 200b models with 8xH100 requirements (and 8000xH100 for training).
>>105843738
there
poor guy can't even use a gun right...
>>105843688
box or style info por favor?
>>105834947
https://files.catbox.moe/u1aj0s.png
What if... llm agent, but for images? It will analyze a picture by itself and send to img2img models fixing hands and other artifacts iteratively? Or adding something new/changing colors/effects/etc.
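The loop that anon is imagining would look something like this. Everything here is hypothetical: `vlm_critique` and `img2img_fix` are stand-ins for whatever VLM/inpainting backend you'd actually wire up (none of these are real APIs), and the toy image is just a dict so the control flow can be shown end to end.

```python
def vlm_critique(image):
    """Stand-in for a VLM call that returns a list of defects it sees
    ("left hand has six fingers", etc). Here the toy image carries its
    own defect list so the loop is demonstrable."""
    return image.get("defects", [])

def img2img_fix(image, defect):
    """Stand-in for an inpaint/img2img call that repairs one defect;
    a real version would mask the region the VLM pointed at."""
    fixed = dict(image)
    fixed["defects"] = [d for d in image["defects"] if d != defect]
    return fixed

def refine(image, max_rounds=3):
    """Critique-and-fix agent loop: describe defects, repair the first
    one, repeat until the critic is satisfied or rounds run out."""
    for _ in range(max_rounds):
        defects = vlm_critique(image)
        if not defects:
            break
        image = img2img_fix(image, defects[0])
    return image
```

The hard part in practice is the middle step, turning a text critique into a reliable mask, which is why nobody has really shipped this yet.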
>>105842646
am I doing it right?
>>105843752
Have a female asian hand hold the gun.
'Goodbye husbando!'
>>105843764
https://files.catbox.moe/mexmlo.png
>>105843812
>antialiased latent upscale
how do you get this in comfy? this seems like it could solve the jaggies problem and make latent upscales viable
a man wearing black sunglasses picks up a large black bomb and throws it, causing a huge explosion of fire and smoke.
>>105843863
>no results found
Forgechads I kneel
a man jumps off a building into a swimming pool.
kek
been gone for a while. did chroma get official nunchaku support yet
a man drinks a bottle of beer in a dark room at night.
there we go, including a reference to the light levels made the sudden brightness go away.
>spend an eternity looking for wan extension workflows that don't burn or use "last frame"
>think of the loop nodes in comfy but too stupid to figure it out
>find workflow on youtube that does all of that with i2v including vace
>behind a patreon paywall/sign up
I hate youtubers
>>105844049
ai youtubers are the worst
>>105843115
>No action, zero sense of motion or energy in the images
At this point I can only hope video saves image gen somehow. Maybe if AI can do passable "two cars crashing" in video then we'll get models able to gen good static images of it.
>>105844049
anon just use logic, you gotta mask the frames you wanna extend with VACE and that's it
>>105844049farukan gogizur
a man opens a brown bag of McDonalds and grabs a McDonalds cheeseburger, and eats it.
JC must consume
>>105844049
have you tried this one?
https://www.reddit.com/r/StableDiffusion/comments/1llx9uq/
>>105844108
Seconding this one, it's the one I used to make this video.
3090gods... we won. Can anyone now send the official /ldg/ memo to lodestonesnigger to stop catering to low step vramlet shitters and stop cucking chroma before he ruins it permanently? Thanks.
https://strawpoll.com/XOgOVDj1Gn3/
>>105844155
>3060
you know not all of us have 12 gb cards, some of us gen on a laptop
>>105844155
>106 votes
LMAO yeah sure
is there a node that can extract the last frame of a video input? so I can stitch generated clips together, for example. Ideally I wouldn't need to use a web app to extract it every time.
>>105844195
nm, load video (vhsloader) does this
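If you'd rather do it outside Comfy, ffmpeg can grab the final frame directly. A small helper that only builds the command line (it assumes ffmpeg is on PATH when you actually run it via `subprocess.run(cmd, check=True)`):

```python
def last_frame_cmd(video_path, out_path="last.png"):
    """ffmpeg invocation that extracts just the final frame of a clip,
    for seeding the next i2v gen. -sseof -0.1 seeks to ~0.1s before the
    end so only the tail gets decoded; -update 1 keeps overwriting the
    single output image, so the last frame written wins."""
    return ["ffmpeg", "-y", "-sseof", "-0.1", "-i", str(video_path),
            "-update", "1", str(out_path)]
```

Feed the resulting image back in as the i2v start frame to chain clips.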
>>105844097
kek
>>105844108
>>105844129
saw this before, was put off by dicking around with picrel. however he did comment with an automated version, so this should do the trick: https://pastebin.com/TCs9J88i
I think it worked? two clips:
>chroma
>all that burnt training on distilled flux instead of training wan 1.3b/14b
OH NO NO NO
>choma
>all that burnt training on distilled flux instead of training sana
OH NO NO NO
All that burnt training when they could've done a custom model with a 16 channel VAE.
>>105844340
Why not Lumina?
>>105844351
Why not use someone else's already spent massive compute as a foundation?
>>105844366
I think you grossly overestimate the compute required to train a model. I also think you grossly underestimate how much compute is wasted undoing the lobotomy / redoing a model's understanding of anatomy.
a house at the top of a hill explodes with smoke and fire everywhere.
pretty cool
>>105844477alternatively,
a white house at the top of a hill launches into the air like a rocket, leaving a rocket trail and flames. The camera pans up to show the house in the sky.
not much elevation, but still pretty good!
Wan is seriously better than Flux/Chroma at generating still images. The video training translates into better still frames.
ComfyUI made me realize this hobby requires at least 120 IQ points.
>>105844533
okay, giving a distance made it move more.
a white house at the top of a hill launches into the air like a rocket, leaving a rocket trail and flames. The camera pans up to show the house in the sky.
>>105844626
er,
a white house at the top of a hill launches miles into the sky like a rocket, leaving a rocket trail and flames. The camera pans up to show the house in the sky.
miles is what did the trick.
>>105844583
but can it do bobs and vagene
>>105844652
Out of the box it's not great but it takes a bare minimum Lora to get it to A-tier.
>>105844381
NTA, but my opinion is that the quality of the base model very strongly influences the results of a finetune the size of Chroma. Models like Flux and Wan are trained on literally billions of images. LAION alone is like 5b and that's an older dataset. Chroma's 5 million training dataset is nothing in comparison. You absolutely cannot train a model from scratch on 5m images and have it be any decent.
I said to space, seems that is not possible quite yet
>>105844833
You really think there are billions of unique images? I think you grossly underestimate how much variety is in 5 million images. You do realize they pad "billions" because most of them are duplicates, resizes and crops right? How many thousands of variations of Harold exists do you think?
>>105844833
no way in hell flux trained on that many
novelai claims to have trained from scratch and their dataset is at most like 20m
>>105844872
I think they do this to intimidate people from trying to train models. It's important the plebs don't realize they can make their own printing press.
>>105844847
groq is this real
>>105844471
have her hold her sword in front of her hips and twerk
>>105843298
i'll miss you anon
Give the man black sunglasses, he is holding a large bag of money, overflowing with dollar bills. On the bag is the text "KARL" in scribbled font. He is wearing a black baseball cap that says "KING OF KONG" in white text.
kontext is so fun. it's like inpainting evolved, but does stuff inpainting can't.
>>105845130
give him an anime gf
>>105845158
anime girl Miku Hatsune is standing beside the man, wearing a black baseball cap saying "karl LOST" in white text.
and this is one image, if I want a better miku I just put a good miku picture in the second image input
workflow: https://openart.ai/workflows/amadeusxr/change-any-image-to-anything/5tUBzmIH69TT0oqzY751
>>105844872
Even pixart alpha, which was woefully undertrained, claimed to use at least 25M images. Obviously pixart alpha is too small to be a viable modern base. But if the number of parameters is passable and the vae and text encoder are modern, why throw away those 25M images already trained in, unless there's an architectural breakthrough? Lodestone's dataset is around 5 times smaller, if I remember correctly.
I'm sick and tired of 1girl effortless AI slop in this thread.
>>105845188make the anime girl smoke a pipe lol
>>105845197
>I'm sick
You can always commit suicide, that way all your problems go away (you are the problem)
>>105845188
and this is with 2 images (bypass the 2nd input if you just want a solo image for input)
anime girl with teal hair Miku Hatsune is standing beside the man, wearing a black baseball cap saying "karl LOST" in white text.
it just works.jpg
>>105845206and fixed the hat with a simple hat text prompt:
>>105845197
for every one kinosoul 1girl there are 5b effortless 1girls
>>105845192
Boy you quickly gave up billions of images huh? Maybe it's not much use to talk to someone who is ignorant.
>>105845214
>>105845200
one more! revised:
pink hair anime girl is standing beside the man in a black baseball cap. she is smoking a pipe. change the location to a bank. keep her blue and yellow hairclip the same. keep the man's pose the same.
>>105845197
>complainer
>nogen
quite literally, everystein.singleberg.timeowitz.
>>105845244
kek, double pipe this gen
>>105845252
bonus: anime billy
>>105842620 (OP)
Where did that dual clip thing come from?
>3090
>no fp8
>sage attention doesn't work
>torch compile does nothing
lol, lmao even
>>105845307
Also almost 5 years old.
The man is pointing and laughing at a blonde swedish man wearing a t-shirt that says "KARL JACOBS", who looks very upset.
kek, I don't know if he is swedish so I used that as a generic npc.
>>105845324
oops, can't forget to make sure he is still holding money.
>>105845345
interesting leg
change the location to a mcdonalds restaurant. the man is sitting at a table eating a McDonalds Big Mac. His table is surrounded with hundreds of cheeseburgers.
JC needs to eat so he can stop the illuminati
>>105845307
3090 shills deserve death
is ai art getting shittier by the day?
>>105843384
>nah, i've plateaued in skill
pic unrelated? your gens look like shit dude.
>>105845539
I like them, personally.
The man is wearing a hat saying "#1 illuminati fan". keep his pose and expression the same. the image is in a pixel art style.
neat
What would you prompt to get weird / unorthodox / asymmetrical lewd swimsuits?
>>105845643
and without pixel art
two image inputs:
The man is shaking hands with the pink hair anime girl. the background is black.
>>105845742
please share the catbox
thanks
this one turned out better:
>>105845762
same prompt I used in the post.
>>105845771
I know, I need the workflow
my two image workflow is broken
please fren
>>105845778
https://files.catbox.moe/tfnkvg.png
got it from here: https://openart.ai/workflows/amadeusxr/change-any-image-to-anything/5tUBzmIH69TT0oqzY751
there, bit better proportions:
>>105845645
Turn cfg low & let the prompt run for 20 iterations w/ "loose settings"
U sometimes get neat outfits this way
>>105845416
all the top posters are rangebanned by the baker again
grim
diff image
The man is sitting at a computer and is typing. the pink hair anime girl is waving hello. the background is black. keep the man's expression the same.
The man is sitting at a computer and is typing in a dimly lit office. A rectangular sign above says "glowie HQ" in yellow text.
comfy
md5: a20835885478ee90f245b386cf722004
๐
>show OCD friend my Comfy workflow
>he loses his mind
It's not that bad, right?
>>105844597
You just need the right motivation (genning the waifu, gooning, etc) to get your footing. It's not TOO horrible... except when everything breaks.
>>105845197
Be the change you want to see!
Survey
https://strawpoll.com/XOgOVDj1Gn3/results
>>105846194
would have been 17 for 3060 had I voted
>>105846214
Survey
https://strawpoll.com/XOgOVDj1Gn3
>>105846345
Correct small details and do memes, I guess
If it was uncensored AND could keep artstyle, it would have killed loras and this would have been big
Unfortunately it didn't
>>105846386
>If it was uncensored AND could keep artstyle, it would have killed loras and this would have been big
>Unfortunately it didn't
When will we get this?
2 more weeks or anything that's actually on the horizon?
>>105846345
fry your image in just 5 revisions!
>>105843115
>SDXL
Dude. You're using a 2-year-old, obsolete clip_l based model. It can at most understand one (1) character doing one (1) simple thing if you prompt it right. With NoobAI/Illustrious we maximized the fuck out of that architecture and did things that shouldn't be possible, but it's like giving a new coat of paint to a 1960s Ford. You won't win any racing competition with it.
Flux-dev 1.0 dual clip/t5 has a more recent architecture and prompt comprehension, which means it is only one year obsolete now. Still pretty bad, and Flux-dev has huge issues with its guidance distillation which makes it kinda retarded a lot of the time. But it's technically better than SDXL in prompt following, from 1/10 to 3/10.
If you want decent prompt following check HiDream (a solid 5/10 on prompt following), but for some reason people decided two months ago they didn't like HiDream.
>>105846345
I think it's useful if you need specifically the thing it does. For everything else it's just a deepfryer and shitty mememaker
>finally set up everything
>can now generate as many fatties as i wish
I will dehydrate from all the gooning holy shit
>>105846345
For me? Remove clothes, and change anime girls into realistic. Both are so-so and for now worse than doing manual inpaint and stuff, but you don't need to do manual inpaint.
Also for some reason my outpainting of characters is better with Kontext than with Fill.
>>105846641
>anon can generate anything
>wastes it on fatasses
>>105846735
I'm considering taking vacation desu
FUCK YES
wan is still the lightest i2v, right?
>>105846780
Isn't it the only good i2v? I mean there are lighter ones but they're more proof of concept, and Hunyuan Video is shit at i2v.
>>105846792
I don't know, I don't lurk here much. Just waiting for some simple model that can animate images, maybe even without conditioning. Wan is way too slow.
>>105846807
Slow generation times or slow to input what you want?
>>105846822
I mean waiting for 20+ minutes for a video that doesn't even follow a simple prompt most of the time
>>105846876
sage attention + teacache can help you halve the render time, but yeah it's slow. But it's the only one that really works beyond trivial stuff.
But if you want trivial stuff like people breathing or something, LTX video and CogVideoX are faster, and far more limited.
>>105846939
LTX Video claims to
>produce 30 FPS videos at a 1216x704 resolution faster than they can be watched
Sounds interesting if it's even remotely true
>>105846974
Well, LTX is blazing fast.
It's also shit at anything that isn't "camera immobile/slightly panning in/slightly panning out" and "character standing, breathing" or "character sitting, breathing", occasionally "character walking".
>another cool little wan speed boost we'll probably never see for comfy
https://github.com/madebyollin/taehv
Can you control an AI by inputting an image that is a screenshot of code, and have the AI execute it?
>>105847014
Isn't that already implemented in WanVideoWrapper?
>>105843685
Do you mind catboxing one of your I2V gens? Thanks in advance!
>>105847100
you're right, I was googling the wrong thing, disregard my tard comment
https://huggingface.co/Kijai/WanVideo_comfy/blob/main/README.md
>>105845307>>105845389anything better for under $1000?
>>105847002
>It's also shit at anything that isn't "camera immobile/slightly panning in/slightly panning out" and "character standing, breathing" or "character sitting, breathing", occasionally "character walking".
Though this was their 2b model. I see they've published a 13b model. I know what I'll be testing tonight.
>>105844626
KCD IV looks wild.
>>105847145
No worries. It might be compatible with vanilla comfyui as well. You can just swap them in for the vae.
https://huggingface.co/Kijai/WanVideo_comfy/blob/main/taew2_1.safetensors
I use the tiny encoder for sdxl previews, it's really good.
>>105847186
Thanks, I'll give it a try later after work
>>105847299
I don't get it, how are you supposed to inspect it with your shirt over your head?
>>105847347
more of a canny channel really
>>105843803
https://files.catbox.moe/icjkb3.png
There ya go
>>105845645
You could also erase by hand some part of a normal one, add extra lines and feed it through again
>>105845936
This is nothing yet, it will grow larger in time
>>105847463
anon mine hasn't grown larger in 15 years
Am i retarded or is there no way of moving a model from one device to another once it's loaded? I have this problem where I run 2 large models one after another and the second one is really slow since my vram can't fit both of them, but swapping them between devices would speed things up
>Am i retarded
ngl I stopped reading after that
>>105847550
Tug on it more, maybe something will happen
>>105847427
NTA but thank you, this is great.
>>105847577
Maximum neuron activation.
...catbox?
SDXL:
>2023 release
>1024x1024
>3.5b parameters
>learns styles in under 10 epochs
>understands complex sex positions
>outputs in 3 seconds without any copechaku quanting needed
Chroma:
>2025 release
>512x512
>8b parameters
>hasn't learned a single style in over 40 epochs
>no characters
>melted anatomy and duplicate limbs
>barely understands POV missionary
>takes 20 seconds per image on a 4090
i'm thinking 3 more years of SDXL
you're supposed to prompt the style in, retard
>>105847347
>>105847389
>canny channel
Are you kidding me? It's right there
A canny canyon
Fuck you ESLs
If you weren't a poorfag you'd use a video model.
>>105847657
>A canny canyon
so a cannyon?
>>105844155
>no M3/M4
nvidia nerds will never learn what 120GB VRAM feels like
>>105847690
>540GB/s
>$19,999.99
lol
so what upscaler or settings work best for anime/drawn hires fix?
can't figure or search out why it turns into smudge, compared to realistic checkpoints where it nicely enhanced details and fixed mistakes
which one should I buy for genning?
https://mdcomputers.in/catalog/graphics-card/nvidia/rtx-50-graphics-card/rtx-5090-graphics-card
anyone have experience with character consistency? i was thinking of genning a face then face swapping it onto the image. and using wan to gen a video, then use frames of that video to have the character standing vs sitting
>>105847913
Nta, but I couldn't make kontext *swap* faces in particular. It seems to treat this request as a deepfake threat.
>>105847427
Thanks based anon, also this image is good as well.
>>105847657
more of a canny chasm really
>>105843298
>in a time when models are being reported and deleted for no reason through false reports
what a stupid thing to do.
>>105847884
Astral is the only real choice because you can check the per-pin power/amps to make sure your connector isn't going to fucking melt. That said, I thought it was too loud (even in quiet mode) when genning so I put it in a custom loop.
>>105848136
Nice, Chroma looks promising for sure
>>105847090
4o can read text in images and can read and interpret code so I guess it can do that.
>>105847090
you typically control models through strings (which are converted to tokens blah blah), much more convenient
>>105848136
How long did the training take?
causvid and other speedup loras produce almost no motion. the "solution": create a dual sampler workflow that looks like cancer and shill it. both samplers use 8 steps, so that's 16 steps total; what's the fucking point then?
vramlet here
Can I use this to make little animations?
https://huggingface.co/CiaraRowles/TemporalDiff/blob/main/temporaldiff-v1-animatediff.safetensors
>>105847299
I made those original target pics around the end of May. can I ask how you came across them?
umm?
Why did the thread die all of a sudden?
>>105848653
>I made those original target pics around the end of May. can I ask how you came across them?
Can't remember, just another prompt in my wildcards. Perhaps you posted your prompt around then and I saved it
>>105848681
I stopped genning
>>105848749
women as sex robots is the lowest form of sci fi
>>105848768
cry more feminist
no matter how many times I download these and restart, it keeps showing missing again and again
Updated cumfy too
>>105848681
China went to bed.
>>105848681
because you showed up
>>105848790
Install them from their git pages
I noticed this yesterday with some nodes, installing from git fixed it.
it's a little ropey but that's the effect I wanted
>>105848853
That's really great. Wasn't there some 90's tv show with stuff like this?
>>105848872
funny you should say that, that's what I trained it on
>>105848880
Was it the Sabrina witch? Ancestral penis memory
>>105848901
yes I sat through 7 seasons of the shit. They really scaled back the effects after the first 2, that's where most of them are
and yeah I did get a chub at parts, but mostly it's horrible cringe and I'm embarrassed for everyone involved
>>105848749
awesome, I also posted a collection to civitai with full metadata. prompt sharing is fun
>>105848768
yeah, it's a classic
>>105848910
The talking cat looked like shit, that I remember
>>105848853
please tell me you did i2v too
>>105842620 (OP)
Is there a desktop application that I can use like notebooklm or something, but that also allows me to use API keys if necessary?
I usually jailbreak deepseek remotely via an "untrammeled" prompt and it has been great so far just as an erp bot, but I want something that helps me use it to learn and improve my notes.
I also want help getting better at prompting, or to understand it better, since deepseek either has a very disgusting and frustrating aneurysm or gets its ethical guard up as it tries to lecture me on topics I couldn't give a rat's ass about as I press the stop button.
>>105842651
Install linux while you wait; a dependency for it is flashinfer and that's linux only.