Discussion of Free and Open Source Text-to-Image/Video Models
GenJam3: https://forms.gle/hWs19H4vTGTdwARq8
Prev: >>106157414

https://rentry.org/ldg-lazy-getting-started-guide

>UI
SwarmUI: https://github.com/mcmonkeyprojects/SwarmUI
re/Forge/Classic: https://rentry.org/ldg-lazy-getting-started-guide#reforgeclassic
SD.Next: https://github.com/vladmandic/sdnext
ComfyUI: https://github.com/comfyanonymous/ComfyUI
Wan2GP: https://github.com/deepbeepmeep/Wan2GP

>Checkpoints, LoRAs, Upscalers, & Workflows
https://civitai.com
https://civitaiarchive.com
https://tensor.art
https://openmodeldb.info
https://openart.ai/workflows

>Tuning
https://github.com/spacepxl/demystifying-sd-finetuning
https://github.com/Nerogar/OneTrainer
https://github.com/kohya-ss/sd-scripts/tree/sd3
https://github.com/derrian-distro/LoRA_Easy_Training_Scripts
https://github.com/tdrussell/diffusion-pipe

>WanX
https://github.com/Wan-Video
2.2 Guide: https://rentry.org/wan22ldgguide
https://alidocs.dingtalk.com/i/nodes/EpGBa2Lm8aZxe5myC99MelA2WgN7R35y

>Chroma
https://huggingface.co/lodestones/Chroma1-Base/tree/main
Training: https://rentry.org/mvu52t46

>Illustrious
1girl and Beyond: https://rentry.org/comfyui_guide_1girl
Tag Explorer: https://tagexplorer.github.io/

>Misc
Local Model Meta: https://rentry.org/localmodelsmeta
Share Metadata: https://catbox.moe | https://litterbox.catbox.moe/
Img2Prompt: https://huggingface.co/spaces/fancyfeast/joy-caption-beta-one
Samplers: https://stable-diffusion-art.com/samplers/
Txt2Img Plugin: https://github.com/Acly/krita-ai-diffusion
Archive: https://rentry.org/sdg-link
Bakery: https://rentry.org/ldgcollage

>Neighbours
https://rentry.org/ldg-lazy-getting-started-guide#rentry-from-other-boards
>>>/aco/csdg
>>>/b/degen
>>>/b/realistic+parody
>>>/gif/vdg
>>>/d/ddg
>>>/e/edg
>>>/h/hdg
>>>/trash/slop
>>>/vt/vtai
>>>/u/udg

>Local Text
>>>/g/lmg

>Maintain Thread Quality
https://rentry.org/debo
Blessed thread of frenship
wincucks can't into qwen-image
>>106162736Aww, Anon-kun...
is diffusion-pipe the most retarded training script ever?
useless deepspeed bloat, nonsensical cryptic output, cannot create samples during training, doesn't work on windows
>>106162774I have no idea, I didn't create it.
>>106162754chroma is number 1 anyway
>>106162717 (OP)>including ani in the collage
Why do normoids like ghibli style gens so much?
>DeepMind Genie3 architecture speculation
Gonna be fun playing with Hunyuan-World2.0 next year or whichever chinese company does open source
Why does Comfy seethe about Ani so much? They used to be friends ;_;
>>106162777>I call this piece 'autistic japanese girls on a beach'
>>106158221is there a HF space for this? I don't want to have to load an LLM every time I want to rice one of my prompts
>>106162878comfy said C was too hard but ani is a gooner idiot and can do it proving comfy is just a lazy asshole phoning it in
>>106162781The only valid complaint there was that it can't create samples, that's a clear negative.
It works fine on Windows using WSL2. It's kind of hilarious, since DeepSpeed is made by Microsoft, yet it doesn't run on native Windows. Then again, all AI research and development is done on Linux.
DeepSpeed is used since it allows training with multiple GPUs.
>>106158052Sombrabox please.
>>106162804This man speaks the truth
>>106162832ani-taur is a cute!!!! OK?!?!?!
>>106162878i think it was something about cumfarts and pedos
How do you guys feel about Statler/Waldorf?
>>106162947wrong
comfy said that it makes no difference because once everything is on the gpu it doesn't matter if the wrapper is python/c/whatever
Also TraniStudio is somehow 10x slower than the same workflow in comfy
I don't know what I'm doing.
Why doesn't this change the image output?
>>106163013forced meme. would be better in moderation
Now trying neta lumina again, and neta yume lumina
Ladies(males)&Gentlemen:
Please
'''
git pull
'''
your ReForge repository.
Panchovix made important changes!
Yes, we are back!
>>106163022comfy also said learn C but doesn't write C himself. ani doesn't have access to a lot of the optimizations since they were written in python first. also there is no vulkan option for comfyui. if people actually care about cleaning up the huge amounts of waste from deps they should be making stuff that works with C/C++ first then provide python bindings not abuse the python wrapper for everything. that's how a disk becomes a landfill fast
>>106161237I have 3 datasets that i got from the internet. Each zip has ~250 pics + captions. Is that too much? Should I thin it out? They are former sdxl datasets.
>>106163022>wahhhh the inference speed is everything!!!! don't think about the bloating or the shitty frontend runtime!!!!
>>106162974I'm sure you can figure it out on your own anon, i'm rooting for you
Holy shit, bong_tangent acts like it's got a built-in controlnet. If you use it in an upscaler and set denoise to 1.00 it sticks really close to the original image.
You can't push it too far (you start getting minor body horror at 3x or 4x resolution), but it's certainly different than the other schedulers that start sprouting multiple heads at 2k resolution.
>>106163072
>ani doesn't have access to a lot of the optimizations since they were written in python first.
This sounds like you're saying it's the tool's fault for being written in python instead of her fault for choosing not to use python
waaaaa my shitUI is better than your shitUI waaaaa
>>106163172it has python interop anyways so just plug in whatever. it's good ani is opening up new paths for people to reverse engineer optimizations that work and trashing dep bloat
Remember:
VRAMlets hobby
VRAMlets general
VRAMlets website
>>106163118No, I can't. You have tried that with the Mercy gen already and it didn't work. Please don't be like that.
>>106163200ok I see how anistudio can destroy comfy now. ani has to work harder though it's taking forever
will the release of the old chatgpt models 3 and 4 have any impact on pic genning?
or is this just for chatting?
>>106163331anon, those don't have any impact at all. it would just outright refuse nsfw
>>106163362Are you arguing for using a super slow tool that's missing 90% of expected features because... people should just use it, mkay?
>>106163362it's only as slow as comfy inference if you just use the interop but it's hard for you to comprehend an actual modular design instead of forced nodeshit
>>106162754Just use the Linux replacement, I mean WSL.
>>106163381it would be nice if it could mix cpp and python nodes so we can watch comfy get deprecated over time as each node gets replaced
Qwen tagged all the GPT slop so you can neg it out... right?
so qwen is the new hot thing to use?
Can you train loras with 24gb vram
>>106163510
>20b
you can't even run fp8 with less than 32GB of vram. I think you would need 48GB to train.
>>106163530I have 12gb Vram and 32gb Ram and I can run fp8 fine.
>open thread
>he's shilling his shitty wrapper again
>close thread
in his defence i actually got it to run the other day but it crashed after a few gens :d
>>106163530I am running it just fine it seems to use 23gb vram with the default comfy setup.
I do have two gpus but multiGPU just doesn't seem to work, that node is worthless
>>106163331those models are getting dunked on as worse than existing open source models
>>106163510With system ram offloading, yes, and of course with quantization you will be able to train without offloading, but quality will suffer
The training will be slow though, it's a big model
>>106163564I wish the metadata/project layout load button functioned so I don't have to set everything up again every time it crashes
>>106163022
>it makes no difference because once everything is on the gpu it doesn't matter if the wrapper is python/c/whatever
GIL and other Python issues do affect the speed at which everything gets onto the GPU. The difference using the same code but with libtorch would be marginal, maybe more of a difference on slower/older CPUs, but any real performance gains come from optimizing the operators and custom kernels.
>aniStudio is somehow 10x slower
That's because sd-cpp has shit kernels
I have a question:
Let's say I have generated an anime-style video involving 1-2 characters in a room. Let's say for example it's a school classroom.
What if I want to gen a continuation of this video in a new cut? For example, a different angle and composition, but the character(s) are in the same pose. wan cannot cut mid-gen so obviously I would need to generate an image that somehow believably captures what's going on in exactly the same environment.
What are my best options for this? Any good workflows? I know openAI's sora is pretty good at this, but the output resolutions are non-standard and I don't want to deal with censorship.
I added Qwen-Image support to diffusion-pipe
The model is large, but with fp8 base model plus a bit of block swapping you can train loras on a single 24GB GPU. I've added instructions and an example config file to do this.
Ideally, you would have something like 2x3090 and use pipeline parallelism to split the layers across two GPUs, which works very well for this model.
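The authoritative example config is the one shipped in the diffusion-pipe repo; purely as a rough idea of the shape, its LoRA configs are TOML files that look something like the sketch below (key names modeled on the repo's other model examples; the `type` string and every value here are illustrative guesses, so defer to the real example file):
'''
# Sketch only -- check the actual Qwen-Image example config in the repo.
output_dir = '/data/qwen_image_lora'
dataset = 'dataset.toml'
epochs = 100
micro_batch_size_per_gpu = 1
pipeline_stages = 1              # 2 would split layers across 2x3090 as described
gradient_accumulation_steps = 4

[model]
type = 'qwen_image'              # assumed name for the new model type
dtype = 'bfloat16'
transformer_dtype = 'float8'     # the fp8 base model mentioned above
blocks_to_swap = 16              # block swapping to fit a single 24GB GPU

[adapter]
type = 'lora'
rank = 32
dtype = 'bfloat16'

[optimizer]
type = 'adamw_optimi'
lr = 2e-5
weight_decay = 0.01
'''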
If the 5070TiS gets 24GB is it gonna be a better purchase over a used 4090?
>>106163510i'm sorry anon
>>106163802Nice, not that I think I'll torture my rusty old 3090 with this model either way, it will likely still feel too slow to be worth it.
I have a 5070Ti, and I am not a vramlet. This is all just petty elitism from embarrassing manchildren. 16GB vram is more than capable for good AI generation in a timely manner.
>>106163856Well played, chuckles were had
>>106163551What fun do you see in running that model at low quality without Loras? That's why I remain loyal to SDXL. Also, for Vramlets and SFW this model is free on LMarena now.
>>106163054 good consistency,
>>106163790Hmm... Probably Design Doll or Cascadeur with controlnet to maintain posing. Those are posing software.
>>106163564Yeah also crashed all the time for me
>>106163887
>Cascadeur
Cascadeur? I recognize this as physics-based animation software.
>>106163872
>I have a 5070Ti, and I am not a vramlet
Said the vramlet
>>106163874This could actually be a scene directly from The Muppets
We could have those Bert and Ernie nazi memes in 720p video, what a time to be alive.
If you don't have at least 48GB of VRAM itt, you are not prepared for this hobby and you are a lesser human being.
High vram cannot compete with a creative mind.
>spend 2k to generate a 5s hit or miss slop video
KEK
>>106163918unironically many do not understand this
better card =/= better gens
Do not forget we have a genjam:
https://forms.gle/DekqEyMcMpjtyxjy9
Is there no I2V light2x 2.2 lora?
>>106163919>"Your hobby is stupid because I can't afford it!"
>>106163979i think kijai released one but it works worse than the 2.1 lora
>>106163899Yeah. Animations.
>>106164022Not really sure what this can do that Blender can't.
Also still not sure how I can take a still frame from a wan gen, then somehow transition to a 3D workflow to change the camera angle while maintaining the same environment and character poses.
>>106162864
>Hunyuan-World2.0 next year or whichever chinese company does open source
Tencent are not big fans of open source anymore
Pray alibaba looks into it
>>106163919my 3090 was 500$ and i can use it for a crap ton of things anyway, sis
>>106161112
>if one attosecond passed for every single possible i2v prompt, how long would it take?
Let's break down what you're asking and work through it carefully.
Goal:
You want to calculate how long it would take if we spent 1 attosecond (1×10^-18 seconds) per image generated by an image-to-video (i2v) model like Stable Diffusion or DALL·E, for every possible image that could be generated, considering:
Every possible text prompt
Every negative prompt
Every resolution (within reason)
Every random seed
Every possible combination of pixel values that a consumer PC can realistically process
Step-by-step Breakdown
We'll estimate the total number of possible outputs, then multiply that by 1 attosecond.
1. Pixel-Level Possibilities
Let's use a single resolution as an example, say 512×512 (a common resolution for models like Stable Diffusion). We'll generalize later.
Bits per pixel
Assume 24-bit RGB images:
Each pixel = 3 color channels × 8 bits = 24 bits = 1 byte per channel × 3 = 3 bytes
So,
512×512 = 262,144 pixels
Total possible pixel combinations:
256^(512×512×3) = 256^786,432 = 2^(8×786,432) = 2^6,291,456
That's ~10^189,404 possible pixel combinations, just for 512×512 RGB images.
2. Resolution Variants
A "consumer PC" can reasonably handle resolutions from around 64×64 up to maybe 2048×2048 (though that's pushing memory limits). That gives a range of image sizes.
There are perhaps a few thousand practical resolution combinations (square, portrait, landscape) within that range.
Let's be generous and say 10^4 resolutions.
3. Random Seeds
Models like Stable Diffusion often use 32-bit seeds. That's:
2^32 = ~4.3 billion seeds ≈ 4.3×10^9
so radial attention was snake oil
>>106164136
4. Prompt and Negative Prompt Space
This is the trickiest part because it's infinite in theory. But you limited it to "everything a consumer PC can handle." Let's assume:
A model can take a max of 77 tokens (common in transformer models like CLIP).
A vocabulary size of ~50,000 tokens.
So, the number of unique prompts is:
∑ from k=1 to 77 of 50,000^k
This grows extremely fast. To get a rough upper bound, we can approximate with:
50,000^77 = 10^(log10(50,000)×77) ≈ 10^(4.7×77) ≈ 10^362
Let's assume the same for negative prompts.
So total prompt combinations ≈ 10^362 × 10^362 = 10^724
Total Possible Unique Generations
Multiply all components:
Pixel combinations (for 512×512): ~10^189,404
Resolutions: ~10^4
Seeds: ~10^9
Prompt + negative prompt combos: ~10^724
Total images ≈ 10^189,404 × 10^4 × 10^9 × 10^724 = 10^(189,404+4+9+724) = 10^190,141
Time in Attoseconds
1 attosecond = 1×10^-18 seconds
So:
Total time = 10^190,141 × 10^-18 = 10^190,123 seconds
Convert to Years
There are ≈ 31.56 million seconds in a year:
10^190,123 ÷ (3.156×10^7) ≈ 10^(190,123 - 7.5) ≈ 10^190,115.5 years
Final Answer:
If 1 attosecond passed for every single possible image under your constraints:
It would take approximately 10^190,115.5 years.
For comparison, the age of the universe is ~13.8 billion years ≈ 1.38×10^10 years, a blip compared to this number.
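If you want to sanity-check that arithmetic yourself, here is a minimal Python sketch of the same estimate under the same assumptions (512×512×3-byte images, ~10^4 resolutions, 32-bit seeds, 77-token prompts over a ~50,000-token vocabulary); it works in log10 space since the raw counts don't fit in a float:
'''
import math

# Same assumptions as the post above, all exponents kept in log10 space.
log10_pixels  = 512 * 512 * 3 * 8 * math.log10(2)   # 2^(8*786,432) pixel combinations
log10_res     = 4                                    # ~10^4 practical resolutions
log10_seeds   = 32 * math.log10(2)                   # 2^32 seeds, ~10^9.6
log10_prompts = 2 * 77 * math.log10(50_000)          # prompt + negative prompt upper bound

log10_images  = log10_pixels + log10_res + log10_seeds + log10_prompts
log10_seconds = log10_images - 18                    # one attosecond per image
log10_years   = log10_seconds - math.log10(3.156e7)  # ~31.56M seconds per year

print(f"total images ~ 10^{log10_images:,.0f}")
print(f"total time   ~ 10^{log10_years:,.0f} years")
'''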
>>106164143too annoying to get running and wastes memory
High-quality digital art in an anime style from pixiv. depicting a fantastical scene of a busy medieval tavern in the fantasy genre. The central figure is a girl wearing fantasy inspired armor resembling a metal bikini and has a toned, athletic physique, she is leaning back in a simple wooden chair with a stylised speech bubble saying "another successful commission!" . Surrounding her is an average looking adventuring party including one wizard with a hat and a paladin with a shield next to his chair she is drinking beer and talking with. The girl is holding a beer glass and waving it around enthusiastically. The atmosphere is celebratory and the people are all looking at each other. The tavern has an aged look, the wooded tables and barstools looking worn. The character's expression is intense, reflecting determination and focus.
is 16gb vram and 32gb of ram enough to generate videos?
>>106164318
8gb is enough so yeah
>>106164304This is a good SDXL replacement for anime
>>106164389it's probably going to have similar problems to flux in refusing to change style much or properly learn characters/artists
>>106163918well if you generate 100 videos or pics and get a good one by chance, it can be more productive than a creative mind. there might also be some AI agent to pick the best ones for you. that could be the future of art
is there a guide for retards on how to actually generate videos once you actually get this shit installed? i've never messed with it, but i'd like to tinker with it while i work from home.
>>106164462>2.2 Guide: https://rentry.org/wan22ldgguide
>>106164486that's what i used for install. i was looking more for a guide on how to properly use workflows and what each section does, etc.
>>106164445You keep relying on machines my guy.
Me? I have more than enough faith in my vision.
>>106164411Oh... so this will be forever SDXL?
>>106164486the 2.2 t2v link is broken. also what is this flowmatch stuff?
>then delete "ModelSamplingSD3" from the workflow and replace "BasicScheduler" with the new "FlowMatchSigmas" nodethose aren't in kijai's workflow
>>106164555No, you must trust
Trust in RowWei
Trust in Neta Lumina
Trust in NovelAI when they release their open weights
Trust in Illustrious
Trust in Noob
If only the outputs weren't completely slopped... (I used the same prompt about 2000s digital camera from the last thread)
I tire of these sidegrades when will we get a Real New Model
>>106164580It's Qwen-Image (forgot to say)
>>106163874what is your height and width set to?
comfy should be dragged out on the street and shot
>>106164445brownest post I've seen in a while
>>106164580i kinda like the slop
anything new since kontext nudify loras?
>>106163790Seems like a LoRA could be created for hard cutting to different angles
>>106164767i got laid (no longer have a use for nudify loras)
>>106164047have you tried flux kontext?
>>106164047in wan
>still scene. time frozen. camera rotates around the room
Something like that, and then find a frame for the angle that you want.
>>106164813I checked my vram situation, and as a dual gpu chad, it seems I still get plenty of free vram in one of the GPUs. It may be worth looking into creating a Comfy flow where I can use Chroma as a second img2img pass on Qwenimg outputs, acting as a filter that unslops the image. The downside is that Chroma (at least the current version) would give bad fingers etc
oh i understand now, it thinks penises are pink soft serve ice cream so that's why smooth chunks get licked off
>>106164875how do i delete a post
>>106164858
>The downside is that Chroma (at least the current version) would give bad fingers etc
for some reason Chroma works really nicely on i2i second pass, doesn't ruin fingers. I use res_multistep + beta
That's it, I'll filter the word chroma, I don't have to see the name of an old and inferior model. Qwen has better composition, and SDXL is the king of quick cooms.
>>106161650
>fucking sad that I have to rely on advanced color grading in premiere pro to fix my gens
does it work well and is it automatic?
Really thinking we need /vdg/ now.
>>106165032
>vdg
exists on /gif/
>>>/gif/29249725
>>106165032You don't post gens here though.
>>106165048Yeah well we need an sfw version. Do not care for hardcore degeneracy.
>>106165057I certainly do.
I have zero interest in chroma or qwen and that's all I see here.
>>106165070ok go cope in /sdg/ with the other vramlets
>>106165080/vdg/ is coming soon.
>>106165080 (me)
oh nvm i misinterpreted your post
flux is megacensored and barely usable even with extreme lora autism
krea looks slightly better but might be even more censored and has no loras
chroma still has sd1.5 anatomy
wan image is less censored, but weirdly sensitive about settings, looks sloppy most of the time, very slow especially with dual model autism, early lora situation
qwen follows a little better than wan with a single model, but looks super sloppy and soft, even slower
who is going to save image gen?
Is it a bad idea to go beyond 81 frames on 2.2?
>>106165145don't do it anon
>>106165145i'm doing 161 frames no problem. well except pic related
>>106165139praying for some richfag to do a proper chroma anatomy finetune
>>106165166What's the output like though? Does the video try to go back to the beginning? Any side effects like earthquakes or overbrightness?
>>106165139Praying for some richfag to do a proper SDXL 2.0
I don't understand, you guys are simping over 5 seconds of the worst animation possible?
>>106165269What you don't understand could fill a library
Hey, what happened to landscape diffusion? Nobody baked anything?
>>106164328
>blonde
>denim shorts
those are some nice genes
>>106164304Mmmm, this is a good base but it's sloppish as hell. Apart from that, I suppose that if SDXL is 1b and this model is 20b, a coomer would have to spend 20x as much to finetune the checkpoint or a LoRA
>>106165446Betting $100 there's a black man in that suit
>>106165181
>Does the video try to go back to the beginning
with complex prompts it does tend to do that but you can kinda fight it with the prompt, like "the woman turns around to face away from the camera. blah blah blah. she continues to face away from the camera."
>Any side effects like earthquakes or overbrightness
no
If I wanted to do image classification for tagging images with a long-form description of what seems to be happening in an image and/or general descriptor tags?
Also is there one for short gifs?
oh wow, qwen->second pass chroma i2i is... really fucking good when it works. like might be all we need
>>106165563sorry I meant to ask what model I might use...
>>106165490he is at once in hell and also having the time of his life
>>106164906A screenshot of your i2i workflow, kind anon?
>>106165563Depends if it's nsfw or not. Joycaption for porn, Qwen2.5-VL-7B-Instruct for non-porn.
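For the long-form caption use case, a minimal sketch of driving Qwen2.5-VL-7B-Instruct with plain transformers (the model id comes from the reply above; `qwen_vl_utils` is the helper package from the official model card, and the prompt text and image path are just placeholders):
'''
from transformers import Qwen2_5_VLForConditionalGeneration, AutoProcessor
from qwen_vl_utils import process_vision_info  # pip install qwen-vl-utils

model_id = "Qwen/Qwen2.5-VL-7B-Instruct"
model = Qwen2_5_VLForConditionalGeneration.from_pretrained(
    model_id, torch_dtype="auto", device_map="auto")
processor = AutoProcessor.from_pretrained(model_id)

messages = [{"role": "user", "content": [
    {"type": "image", "image": "file:///path/to/image.png"},
    {"type": "text", "text": "Describe what is happening in this image, then list general tags."},
]}]

# Build the chat prompt and pack the image + text into model inputs.
text = processor.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
images, videos = process_vision_info(messages)
inputs = processor(text=[text], images=images, videos=videos,
                   padding=True, return_tensors="pt").to(model.device)

out = model.generate(**inputs, max_new_tokens=256)
# Strip the prompt tokens so only the generated caption is decoded.
caption = processor.batch_decode(out[:, inputs.input_ids.shape[1]:],
                                 skip_special_tokens=True)[0]
print(caption)
'''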
>>106162840>"do the fucking pushups">*throws ayyliums*
what is this qwen shit everyone is talking about?
man I swear I'm gone for like a day or two and some new shit pops up instantly.
so we have qwen image with all this fancy stuff like editing and all we can use is t2i?
>>106165710Imagine Flux but scaled up and more Chinese.
>>106165723so it's a new model like a better flux or what?
>>106165575oh wow your pic looks amazing. worth it
>>106165740Better Flux with less censorship. But it's also unreasonably big.
>>106165745he's still genning, be patient
>>106165743Billy has upgraded his Nintendo Super Scope, Jobst better run
>>106165446https://youtu.be/3iLY-EzLefw
>>106165803Since I don't like AI slop it's wan2.2 14b lownoise > flux krea > qwen image > flux
I believe that in a street test, 100 out of 100 passers-by would choose wan without hesitation and say that it is a real photo.
>radial attention and jenga back to getting regular updates
what a time to be alive!
hey KARL, look what i'm doing with your money!
>>106165803
>But it's also unreasonably big.
you can't run it local?
>>106165859To be fair, 90 of them would say dreamshaper looks real too
>>106165918how'd you make it longer than 5sec?
>>106165907A quanted version, but LoRA training is also important
more karl being super rich:
>>106165997*billy
just kidding karl is poor
>>106165903Hot mother and daughter combo, is this what they mean when they say WINNING ?
>>106165918by just doing it
Reminder that if you get OOM after a first successful generation, there is a node called unload-model that can solve the problem.
It did for me.
Just place it before the last node, usually the one for saving the generation.
>>106165904cool gen, my take on it
I've put example images and videos in my lora folders.
Is there a way to explore lora files directly from comfy and have it show the video or image?
>>106166107its ok, an amer*can made the lora
is it true wan 2.2 t2v is the best image gen
if so, why?
>>106166089loramanager extension
>>106166141Basically does porn out of the box, or at least with any 8000-step LoRA.
can qwen do edit like kontext
is it better?
>>106163856First they came for the 12gbs, and I did not speak out, because I was not a POORFAG.
Then they came for the 16gbs, and I did not speak out, because I was not a VRAMLET.
Then they came for the 24gbs, and I did not speak out, because I was not an OBSOLETEFAG.
Then they came for me, and there was no one left to speak for me.
>>106166197<16gbs aren't even human
kek, wan 2.2 camera test
a man runs to the right very fast. the camera tracks his movement as he runs.
Do your qwen image, i2i it to chroma, 20 steps cfg 4.5, res multistep beta, denoise .35, second prompt is just (amateur photo:3) or whatever you want if you want to add in elements that qwen doesn't do well
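Outside Comfy, the same low-denoise second pass looks roughly like this in diffusers (a sketch, not that anon's workflow: the checkpoint path is a placeholder, Chroma support depends on your diffusers version, and the `(amateur photo:3)` weighting syntax is a Comfy/A1111 convention that plain diffusers ignores, so the prompt is unweighted here):
'''
import torch
from diffusers import AutoPipelineForImage2Image
from diffusers.utils import load_image

# Placeholder path: whatever refiner checkpoint you use must be one that
# AutoPipelineForImage2Image knows how to load.
pipe = AutoPipelineForImage2Image.from_pretrained(
    "path/to/refiner-checkpoint", torch_dtype=torch.bfloat16).to("cuda")

base = load_image("qwen_image_output.png")
out = pipe(
    prompt="candid amateur photo",
    image=base,
    strength=0.35,            # low denoise: keep composition, redo surface texture
    guidance_scale=4.5,
    num_inference_steps=20,
).images[0]
out.save("second_pass.png")
'''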
>>106166301add some 90s camera filter and make ragebait post on xister
>>106166301
>>106165446
you can't move like that and still be fat
>>106165776
>>106165639
pure sexo
>>106166197
>>106166217
People doing image / video gen don't even know how good they have it. Over in /lmg/, 72 GB (3x3090) is considered by many to be King of VRAMlets. Only at 96GB+ are you no longer a VRAMlet.
>>106166144This?
https://github.com/willmiao/ComfyUI-Lora-Manager
With its own interface?
Can it interact locally? It looks like it's mostly about pulling stuff from civitai.
>>106166433Yeah, but you can pool memory with LLMs; I don't think you can do that with image and video gen.
>>106166434these interface extensions all universally suck
>>106166433at least you can chain relatively cheap cards. we need the memory all on one card, and the card has to be fast
>>106166342Sadly it still looks kinda slopped with those settings
Less slopped for sure, but still slopped
wan img2vid never gets old
the little things always get me
>>106166516this, like my first time trying SD all over again
a man runs to the right extremely fast and flies into the sky. The sky is blue and cloudy.
>>106166470It definitely has some effect though.
It improves skin (compare with
>>106164580), but it doesn't get rid of the background blur
I used (candid amateur photo:3)
>>106166614maybe try something like "textured wallpaper in background" so it's not just a blurry color? Might need to give it something to force it into bootleg f/22 mode
>>106166570You gotta hand it to the chinks, Wan can handle basically every concept you throw at it, and you can run it locally on consumer hardware
>>106166417
>Women on the highway
Always a recipe for disaster
made a node that gets all the loras from high noise and allows you to reuse them for low noise with adjustable lora strength. reduces boilerplate/clutter from duplicate loras. do you think this is helpful?
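That anon didn't post the code, but the general shape of such a ComfyUI custom node might be something like this (a sketch: it assumes the high-noise loras arrive as a LORA_STACK list of (name, model_strength, clip_strength) tuples, the convention stacker nodes use, and the class/input names here are made up):
'''
import folder_paths
import comfy.sd
import comfy.utils

class ReuseLoraStack:
    """Re-apply a high-noise LORA_STACK to the low-noise model/clip,
    scaled by one global strength knob."""

    @classmethod
    def INPUT_TYPES(cls):
        return {"required": {
            "model": ("MODEL",),
            "clip": ("CLIP",),
            "lora_stack": ("LORA_STACK",),
            "strength_scale": ("FLOAT", {"default": 1.0, "min": 0.0, "max": 2.0, "step": 0.05}),
        }}

    RETURN_TYPES = ("MODEL", "CLIP")
    FUNCTION = "apply"
    CATEGORY = "loaders"

    def apply(self, model, clip, lora_stack, strength_scale):
        # Load each lora once and patch it in with the scaled strengths.
        for name, strength_model, strength_clip in lora_stack:
            lora = comfy.utils.load_torch_file(
                folder_paths.get_full_path("loras", name), safe_load=True)
            model, clip = comfy.sd.load_lora_for_models(
                model, clip, lora,
                strength_model * strength_scale,
                strength_clip * strength_scale)
        return (model, clip)

NODE_CLASS_MAPPINGS = {"ReuseLoraStack": ReuseLoraStack}
'''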
that looks a lot less slopped actually
>>106165918Wait until you find out 4fps/8fps LoRAs work.
>>106166758kinda amazing that wan can even interpret that
>>106166769
>4fps/8fps LoRAs work
the what/where
>>106166776china numba wan
>>106166926If you make a LoRA at 4fps or 8fps you effectively double or quadruple your running length. You basically rescale a clip from 4fps to 16fps (which will appear to be fast forwarded) but trained like that Wan will produce longer length clips. 120 frames = 30 seconds at 4fps, and it works.
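If you want to try building a dataset like that, one way to do the rescale is plain ffmpeg: keep every 4th frame of a 16fps clip and re-timestamp the survivors back to 16fps, so the motion plays 4x fast-forwarded (a sketch; the paths and the 4x factor are just examples):
'''
import subprocess

def speedup_clip(src: str, dst: str, factor: int = 4, out_fps: int = 16) -> None:
    # Keep 1 of every `factor` frames, then re-timestamp them at `out_fps`
    # so the result looks fast-forwarded rather than slow and choppy.
    vf = f"select='not(mod(n\\,{factor}))',setpts=N/({out_fps}*TB)"
    subprocess.run(
        ["ffmpeg", "-y", "-i", src, "-vf", vf, "-r", str(out_fps), "-an", dst],
        check=True)

speedup_clip("clip_16fps.mp4", "clip_4x_fastforward.mp4")
'''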
i am STILL waiting for anon to prove mmaudio can do nsfw
>>106166769isn't the data set standardized at 80 frames 16fps? wouldn't extending the number of frames mess up motion? i was thinking about taking the last frame, re-sending it through the workflow, and splicing them together in post, but that seems like a lot of work
>>106167044
4fps is basically fast forward or sped up footage which is in the dataset, so the concept actually works.
>https://x.com/jkbr_ai/status/1953154961988305384
this shit is honestly crazy, can't wait until some chink opensources the thing with 90% quality
>>106167068oh interesting... i'll have to play around with that, didn't think sped up footage was in there
>>106167069
>actually works, the prompt bled into the sliding door showing the outside
It's a gimmick, but as I suspected the results are way overblown; it's just a slightly refined version of the Minecraft hallucination model.
>>106167077I mean of course it is, sped up footage is in many normal videos. All you're doing is rescaling the 16fps "sped up" footage to 4fps. And of course that's not talking about how Wan is a very smart model in general, it knows a lot of small things due to its size and complexity.
>>106167080wonder how they handled "memory" of actions made in the model at runtime, since if you look away it doesn't make stuff disappear
>>106167096It's just stacked up frames in context. I don't think they ever showed a true memory challenge, only that it doesn't forget within 5 seconds of moving away. I've also noticed in their cherrypicked examples that the motion is generally slow with no forms of fast travel, so maybe if I saw a demo where they were exploring an environment like Myst where they travel in a circle on an island back to a wall with custom writing on it after 30 seconds I'd be more impressed.
>>106167129For example, I think with Wan if you had 20 frames of "context" rather than a straight first frame, you'd have a similar memory solution.
>>106167096might be flux chin, it's always there even if you don't want it to be
>>106167129interesting, this is the only one i've seen with decent interactive memory, but like you said, it never really leaves the frame for more than 5 seconds, so good eye on that
>https://x.com/_rockt/status/1952735159834325210
Finally got FastWan working
>>106167232solid camera shake on this
>>106167224seems like a good match with vr, in that eternal tech demo state vr exists in
>>106167232
>deepfried in corn oil
>my preview nodes stopped working and only the last node is actually saving
wtfff
>>106167254I mean it's good enough to make VR porn, not that long memory is that important.
pro tip: it's hard to forget something when it's right in front of you
>>106167275then how can i forget about all the trash around my monitor so easily?
>>106167298liar, you know it's there because you were trained on slop
>tfw parents are a slop dataset
>>106166758
>>106166959
The future of jaks is bright
still not jumping on the wan bandwagon but would 10 GB be enough, or should i not bother
>>106166042is there a better node system to control model management? i want to decide when a model is loaded or unloaded :/
>>106167403my brainlet lora paying dividends already
a man flies into the sky to the right at extremely high speed. The sky is blue and cloudy.
holy shit, billy became a plane. wan never gets boring. and thanks to the lightx2 i2v loras, it's fast. (no more 15 min gens)
>>106167559my guy became jay jay the jet plane
for video gen, new gpu or more ram? at 64gb ddr5 3090 atm
>>106167593more ram would be a fraction of the cost of a new gpu so why not both?
>>106167559there we go. now he's flying (as a person)
>>106167637
>i'm off to take karl's money again
>>106167681not for that pony
guys, can you give us some better themes? i'm sick of your ugly male gens. i know this thread is blue, but come on
>>106166493Model? This is badass.
>>106167697what do you have in mind
just stopping by to say this is the best fucking collage I can remember lmao
>>106167718Anon posts kinosoul ITT doesn't he
>>106167697too busy losing my sanity with spicy 2.2 gens
>>106167697The male form is beautiful tho and I'm not gay for enjoying it
>70 images @ bump limit
ooooffff ;3
>>106167746i...now...want...to...die...for...the central banking usury debt slave system of made up fiat currency not tied to anything physical...
>>106167709robots theme, medieval, starwars, etc
>>106167681my ass hurts...
>>106167718i like to look at the progression of the collages, with how the number of gens included seems to ebb and flow "naturally": there'll be a bunch in one collage, then the number of gens will slowly decrease over time, and then suddenly there'll be one that includes 20+ gens
>>106167757kek, also have this Kino.
>>106167780Wow how original anon...
Anon-sans, how do I obtain less dithered animations?
>>106167697nope you must visit other generals or wait until your cooldown\throttle is removed ;3
behave !
>>106167839increase resolution. tone down the strength of lightx2v maybe
>>106167700Noobai with pixelization script.
Here's a typical output without it.
>>106167838but nobody made them, except some rare robot gens. the medieval theme is rarely here
>>106163013he targets me a LOT
but such are the ways of \g\ schizos ;c
>>106166959prompt? I have been trying to get someone to puke for months now
>>106168137nta but 2.2 is way better at prompt adherence
>>106167864Very cool. Thank you anon.
>omg, the cat is working at mcdonald's. this is epic!