Manlet Edition
Discussion of Free and Open Source Text-to-Image/Video Models
Prev: >>105712655
https://rentry.org/ldg-lazy-getting-started-guide
>UI
SwarmUI: https://github.com/mcmonkeyprojects/SwarmUI
re/Forge/Classic: https://rentry.org/ldg-lazy-getting-started-guide#reforgeclassic
SD.Next: https://github.com/vladmandic/sdnext
ComfyUI: https://github.com/comfyanonymous/ComfyUI
Wan2GP: https://github.com/deepbeepmeep/Wan2GP
>Models, LoRAs, & Upscalers
https://civitai.com
https://civitaiarchive.com
https://tensor.art
https://openmodeldb.info
>Cook
https://github.com/spacepxl/demystifying-sd-finetuning
https://github.com/Nerogar/OneTrainer
https://github.com/kohya-ss/sd-scripts/tree/sd3
https://github.com/derrian-distro/LoRA_Easy_Training_Scripts
https://github.com/tdrussell/diffusion-pipe
>WanX (video)
Guide: https://rentry.org/wan21kjguide
https://github.com/Wan-Video/Wan2.1
>Chroma
Training: https://rentry.org/mvu52t46
>Illustrious
1girl and beyond: https://rentry.org/comfyui_guide_1girl
Tag explorer: https://tagexplorer.github.io/
>Misc
Local Model Meta: https://rentry.org/localmodelsmeta
Share Metadata: https://catbox.moe | https://litterbox.catbox.moe/
Img2Prompt: https://huggingface.co/spaces/fancyfeast/joy-caption-beta-one
Samplers: https://stable-diffusion-art.com/samplers/
Txt2Img Plugin: https://github.com/Acly/krita-ai-diffusion
Archive: https://rentry.org/sdg-link
Bakery: https://rentry.org/ldgcollage | https://rentry.org/ldgtemplate
>Neighbors
https://rentry.org/ldg-lazy-getting-started-guide#rentry-from-other-boards
>>>/aco/csdg
>>>/b/degen
>>>/b/celeb+ai
>>>/gif/vdg
>>>/d/ddg
>>>/e/edg
>>>/h/hdg
>>>/trash/slop
>>>/vt/vtai
>>>/u/udg
>Local Text
>>>/g/lmg
>Maintain Thread Quality
https://rentry.org/debo
>>105714855 (OP)
>Manlet Edition
I approve
Kontext reminder, try setting ModelSamplingFlux values to 0, which can help depending on the prompt.
>>105714880
Also the "without changing the rest of the image at all" etc. prompt is redundant here
realistic great white shark with its mouth wide open, photo realistic, high quality
it's still not perfect. it definitely can't mix anime with realistic very well; you would have to start with realistic first, then layer the other stuff over the top of it.
>>105714880What the fuck is Kontext?
>>105712745
>https://rentry.org/omnigen2_plots
Added to https://rentry.org/ldg-lazy-getting-started-guide#anon-guides-and-resources thank you anon.
>>105714925jpeg artifact filter
reminder that qwen3 imagegen is going to btfo kontext
>>105714919try removing the realistic
try removing the realistic and adding photograph raw
Inb4 lode announces chroma-kontext.
>>105714932its pretty grim for local models rn
>>105714936
>A great white shark with its mouth wide open, photograph raw
like this? I'm running it to see if that improves. i lowered the cfg to 1.5 and did what >>105714880 said
it knows what a shark is so its not outside of its dataset.
>>105714949
>Inb4 lode announces chroma-kontext.
it won't happen
>>105714949It will never be 'grim' for local models because you will always be limited to neutered, watered down censored models (and have everything you do tracked/monitored) when using online models.
there's a reason local is infinitely more popular and discussed.
50 minutes for this img2vid with 8gb vram. did i do something wrong or is that normal?
>>105715007How much ram offload? Do you have torch compile enabled? It increases gen time. How much ram?
is there a page showing kontext examples, for reference
>>105714936well it didn't work so now i'm gonna try something else.
A photograph of a black cat sitting on a cream colored sofa.
if it can't do that then its going into the trash literally.
>>105715007Most likely you are running out of vram and it's using system ram, which will slow things down a lot, but it also depends on the card model you are using.
Convert to quick pencil sketch
neat, it works. using q8 gguf quants from https://huggingface.co/bullerwins/FLUX.1-Kontext-dev-GGUF/tree/main
>local is so hecking free!
>you MUST add content filters to your outputs or go to jail!
why act like censorship isn't catching up to local? civitai is banning hundreds of loras every day. local finetuners are kekkolds who kneel to licenses. this isn't some magical land of piracy and free expression. everyone involved in local has been getting buckbroken by censors recently and it's not as yippee-freedom as it once was.
>>105715042YOU SUPPORT THIS FUCKING SHIT???
>>105715050
>>105715019i'm a noob and don't understand your questions lol. i have 16 gb ram
>>105715052I dont give a shit about what their rules are, how would they even prove I used their model if I can strip all data from it
>>105715023you need to wait for chroma kontext to get great realism, flux cant come close even with top loras today
>>105715050not my problem, weights are on my pc
nothing like New Thing to bring out the FUD poasters
It can do a harmless pussy cat but sharks are too DANGEROUS! Better censor that just in case someone gens some diver gore. How fucking gays, Germans really are too serious.
>>105715068hive actually detects the model that was used
>>105715087
>run it through img2img on sd for half a second
heh, nothing personell
>>105715095but then it will look like shit. you are like the glaze trannies
it's already more fun than flux fill.
>>105715050Ban pencils and paint brushes, fuck it better just ban thinking all together in case someone imagines something evil. Why does AI make the normie freak so much? Its just another fucking art tool.
>>105715120>put the text "no gays!" on the sign. change the heart on the sign to a rainbow flag.this is a lot of fun and im just doing basic tests.
>>105715070those weights will never do your heckin celebrity porn btw
change her clothes into a white blouse and blue jeans.
magic
imagine living in a country where what you draw on a piece of paper can land you in jail. this is the literal definition of thought crime, and also victimless because it's all fiction. but here we are. i think it is time for western men to just ditch their countries and move to a more sane country.
let it all fucking burn
>>105715050Here's more crazy shit they've 'updated':
>b. Non-Commercial Use Only. You may only access, use, Distribute, or create Derivatives of the FLUX.1 [dev] Model or Derivatives for Non-Commercial Purposes. If you want to use a FLUX.1 [dev] Model or a Derivative for any purpose that is not expressly authorized under this License, such as for a commercial activity, you must request a license from Company, which Company may grant to you in Company's sole discretion and which additional use may be subject to a fee, royalty or other revenue share. Please see www.bfl.ai if you would like a commercial license.
This means that if for example you create a Flux lora and receive buzz for it on Civitai, you need to contact BFL and get a license.
>>105715145it's only natural that the mindless insect fears the machine
https://xcancel.com/ostrisai/status/1938340573557002322
>It works! Fine tuning Kontext on task specific instruction datasets is going to open up a whole new world. Samples at 1,500 steps.
Blessed thread of frenship
>>105715187people will act like this doesnt impact them until 2 years later when there still arent any finetunes, then they'll be asking "what happened guys where are the kontext finetunes???"
Damn, Kontext is pretty cool, actually. Even if it's just a toy, that's pretty fun toy.
>>105715203This has given me hope
>>105715191awesome, catbox by any chance ?
>>105715203HUH NO WAY YOU'RE TELLING ME IT'S POSSIBLE TO FINETUNE KONTEXT !!?!?!?!?!?!??!?!??!?!?!?!? BUT 4CHAN TOLD ME IT"S IMPOSSIBRU!?!?!??!?!?!?!?!?
>>105715203is ostris == debo or something?
>>105714919Default Flux has shit style sorry anon
Change the crates on the man's back to cheeseburgers.
this is an amazing tool desu, infinite meme potential and actual text based inpainting/swaps.
OH MY GOSHHHHHHHH IT CAN PUT PARTY HATS AND SUNGLASSES ON ANIMALS??!?!?!?!?!? LOCAL IS SAVED!!!!!!!!!!
>zoom out, make her breasts gigantic, her hair very long, looking at viewer, split her hair color at her temple so the left side of her hair is gray
Mostly censored and a little limited to the simpler things it knows, but pretty good. takes some rerolling for more complex things, but what doesn't
Shows great promise: future models that are not censored and are capable of proper realism, like Chroma, will one day have this easy-to-use textual editing workflow
>>105715301I'm gonna need a box anon
>>105715291It turns out that that's a really difficult problem to solve and took all of humanity up until this point to achieve. Making the skin look less plastic comparatively takes 30 minutes of time for someone with a >90 IQ. So yeah, local is saved.
>>105715187
>Steals other peoples data
>Trains model with stolen data
>Expects people to pay them royalties
I hope they get sued.
give the anime girl very large breasts and a t-shirt that has a milk bottle on it.
it made thin migu pregnant...
>>105715317You don't even need to sue them, just ignore their meme license. The only case where courts might side with them is if you're competing with them directly as an API provider.
>>105715187Which part of non-commercial do you not understand?
>>105715312
>Making the skin look less plastic comparatively takes 30 minutes of time for someone with a >90 IQ.
kekd
>>105715203Btw, you can already train Kontext with plain t2i training on all training scripts with zero modifications, since the model is architecturally the same as Dev and can be used in normal t2i mode also.
>>105715212
>2 years later
>>105715256No one said it was impossible, it's just Flux with a context latent and the "caption" is an instruction.
>>105715181it will do whatever I want it to, training scripts already exist
>>105715327oh i will anon, its not like they are gonna know when the image has been passed through wan.
>>105715187Prior to this what did you think the purpose of the Flux.1 Dev Non-Commercial license was? To permit commercial use of the model weights? That is just clarification of the existing terms and it's one reason why people are using Schnell-based models instead
so is it a nerfed model? cant do the ghibli meme test
>>105715350That would be retarded because what you want is:
Input Image
Input Caption
=
Output Image
>>105715339
When you receive buzz, you are by definition 'commercial', you are receiving monetary compensation.
Are you retarded ?
>>105715350
I don't doubt you senpai but I wonder why he said
>think I have it properly integrated. Training a test LoRA now with a custom made instruction dataset to put a specific logo on people's t-shirts.
makes it seem like it's not
>>105715375this is the closest so far from a stellar blade image.
Convert Image to Ghibli Style
>>105715372>16 year olds replying to each otherWhy are you like this?
>>105715296Name of this semen demon?
>>105715327
>meme license
and it's exactly that. what are they going to do if i use an image to gen a video using wan? bunch of clowns they are.
>>105715301please post a box i need it too
>>105715387sex
>>105715403>no one said it's impossible>*posts someone literally saying it's impossibleIf this sequence of events upsets you then you should consider detransitioning and checking yourself into a mental health facility.
>>105715419THIS IS FUCKING 4CHAN LITERAL RETARDS LIKE YOU POST HERE
>>105715377Brother, you can do vanilla t2i training just like Dev, to teach it new concepts. And then even use the new concepts in image editing tasks. Source: me, I'm doing that right now, and already have some early lora checkpoints that show it works.
>>105715385He's talking about a custom dataset specifically for image editing in a certain way. And tbhdesu I don't think his dataset is a good example; if you teach the model what the logo is using vanilla t2i training, you shouldn't need the editing-specific examples to make it work.
okay it's pretty good
Convert Image to Ghibli Style + dsp
>>105715419it's not that it's impossible, it just really fucking sucks trying to train a distilled model
>>105715362you and what compute, goy?
so how does comfy do the multi panel comic example?
>>105715325She is bearing our children who will grow up to be big and strong.
>>105715435converted to a ghibli-style image
>>105715452my gpus retard
Convert Image to black and white lineart, done with a pencil.
excited for the 2x3060 finetune, it will be insane bro!
>>105715417thanks
mayhaps i touch myself to her later
>>105715477why do you assume everyone is poor like you bro?
>>105715432I think in general with Kontext we want to train directions, not styles...
give the anime girl a large baseball cap that says "LDG"
My fleet grows
>>105715494replace the green leek the anime girl is holding with a large greatsword.
>Colorize this image
Damn, son.
>>105715505interesting how despite the typo it got the subtitle font style right.
>>105715494give the anime girl blue jeans and a white hoodie.
aqua color strings, nice touch!
>>105715489Let me put it this way:
1. You have a dataset consisting of images of a weird object called a smorgla.
2. You train Kontext using vanilla t2i training on this dataset, using existing training scripts with literally 0 modifications.
3. Now you use the lora in a Kontext image editing workflow, you have an image of a person and say "make him hold a smorgla" and it just werks.
This appears to work perfectly, I am doing it right now. Maybe it won't work well if you have some extremely weird concept that absolutely needs training examples consisting of editing instructions. But for like 99% of what people will want to train on, this is good enough, and you don't need to construct a dataset consisting of an image pair and editing instructions.
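For anyone trying to picture the two dataset shapes being argued about here (plain t2i captions vs. image pair + editing instruction), a rough sketch. Every class and field name below is made up for illustration; no trainer is known to use this exact structure, and the file names are placeholders.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TrainingSample:
    """One training example. Hypothetical structure, for illustration only."""
    target_image: str                    # image the model should learn to produce
    text: str                            # caption (t2i) or editing instruction (edit pair)
    control_image: Optional[str] = None  # input image for edit training; None for plain t2i

    @property
    def is_edit_pair(self) -> bool:
        # An edit sample is one that carries an input image alongside the target.
        return self.control_image is not None


# Vanilla t2i sample: just an image of the new concept plus a caption.
t2i = TrainingSample(
    target_image="smorgla_001.png",
    text="a photo of a smorgla on a table",
)

# Instruction/edit sample: input image + instruction = output image.
edit = TrainingSample(
    target_image="man_holding_smorgla.png",
    text="make him hold a smorgla",
    control_image="man.png",
)
```

The claim in the post above is that the first kind of sample is enough for most concepts; the second kind is what an i2i instruct dataset would look like.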
>>105715551
>You train Kontext using vanilla t2i training on this dataset, using existing training scripts with literally 0 modifications
I assume this is using flux-dev settings. I would still like to train it with an i2i instruct mode for specific tasks, hope the trainers add that soon
>>105715419dude we've already been over this exact chain of events with flux dev
>>105715551Some of us are distinguished scholars
>>105715567ai-toolkit just updated with control image + image/txt pair datasets, I'll be training it out
I was sleeping for years.
Qrd on where did Automatic1111 go? Idle interest.
>>105715605went back to making rimworld mods after seeing the writing on the wall with SDXL
>>105715625you mean just playing DoTA right?
change the text from "VICTORY ROYALE" to "LDG POSTER!"
it maintains the style of the text which is neat, like flux fill
>>105715499
serious hardware swärjebro
>>105715625
>rimworld
i've been playing this recently and was about to go play it again after seeing how bad kontext is. 2 of my colonists have babbies on the way and the new massive freezer warehouse is finished.
>>105715673me? im waiting for the new dlc so i don't get burnt out before
are there any decent image to video generators i can play with on colab? my computer is a decade-old potato
just looking to animate some images, nothing too involved
>>105715697we're probably gonna need to start a new game though due to all the stuff they are adding.
>>105715704i would have been willing to help but i've got to go get something to eat or i will surely die.
why isn't it working, i tried put on clothes on the pregnant girl, put clothes on the girl, put a dress on the girl now
>>105715749consume what you must
>>105715769>The exposed AGP transexual has to invade our thread too
>>105715769censored slop. the absolute state of local releases
Omnigen is kinda blurry (?)
>>105715769sir...
a man with sunglasses and wearing a long black trenchcoat, opens a pizza box and eats a slice of pizza.
Wan is probably really good at making a control dataset. You can make it do everything, especially the magic transitions work fine.
>>105715369
Well, yes, this is mainly clarifying things people thought but weren't sure of.
In short: You can't receive any monetary compensation for loras, fine-tunes, merges etc. based on Flux dev without getting a license from BFL, which of course means paying them.
The good thing is that this clarification means Flux dev is dead for the Civitai lora maker crowd, and most likely they will create loras for Chroma instead, thus making its ecosystem stronger.
Thanks BFL!
>>105715804this one is much better imo
JUST FUCKING WORK
FUCK
FUCK
FUCK
FUCK
FUCK
nice
>a furry and China combined make bfl, the original creators of stable diffusion, irrelevant
what is this timeline?
>>105715939Crazy what nanotech can do
>>105715946what kind of cope is this?
>>105715946It's amazing isn't it ? The offshoot from SAI is going to die the same way as SAI, being greedy and virtue signalling itself into an early grave.
>>105715255https://files.catbox.moe/zwsdto.png
Her name is Rallye, she is an infamous shitposter on /o/.
she is a used rental slut machine who wants you to give her the beans cap'n
>rentry
is there any guide for fast flux?
>>105716021Wan killed BFL's video model they were working on, since it suddenly became obsolete, meanwhile BFL's licensing means that community support is essentially dead for their models, with the exception of Flux Schnell which has a permissive license, and is the one Chroma is based on.
So if you were making loras for Flux dev, and want to keep on making loras, you will most likely target Chroma or some other model with decent licensing terms instead.
>>105716067I see. Can you post some case law for the license thing, I've not heard of anyone getting sued despite there definitely being infringing parties
>>105716067you're right about wan, but i wouldn't say chroma made BFL irrelevant when it's entirely reliant on a BFL base model. it likely killed the BFL-dev license though and guaranteed they will never release anything actually permissive ever again.
>>105714855 (OP)The grey one is pretty cool.
The real truth is you don't need any of these people to make your own model especially if you have H100s.
with kontext dev, can you keep original image's WxH or it must be 1024x1024?
>>105716157You can give it an empty latent, if you don't use your input's latents it does seem to crop like 20px though.
>>105716112
Nobody has been sued AFAIK, but it doesn't matter. BFL coming out now and clarifying that any commercial use of Flux dev or its derivatives needs a license means they most likely are going to contact Civitai etc. and tell them they can't have monetary compensation for loras / finetunes based on Flux dev.
And even if they don't right now, they can do it at any time, meaning the time and money spent on training those are now worthless from a compensation standpoint.
>Can you post some case law for the license thing,
Case law for licensing in general ?
So is kontext good or a nothingburger?
>>105714925chatgpt studio ghibli edit but local
>>105716189chroma kontext is all you will need, until then, this is good for simple changes to the images one might need
remove the man in sunglasses from the photo.
>>105716189Depends on your needs, but personally I can't think of anything I want to use it for that I can't do with img2img, inpainting, and with better results.
It's just kinda meh, perhaps because it was hyped so much that it made me expect something much better.
>>105716228Can it take an image and leave only the outfit against a white background? That would be a pretty huge use case for me.
>>105716206chroma kontext will never be a thing
kek
make the image black and white and give the anime girl a WW2 era Nazi uniform.
>>105716321>tranny raid on /lmg/ AND /ldg/grim
>>105716307I agree, the concept of Kontext is underwhelming and the results are overall really poor.
A week from now it will be forgotten, thus Chroma won't have a kontext model.
>>105714855 (OP)I haven't been involved in local imagen for the past two years, is there anything better than stable diffusion yet
I'm using comfy example
why kontext crop the original image though?
>>105716343No, SD1.5 is still the best.
>>105716346
your image isn't the usual image gen dimensions. make them divisible by 128px, e.g. 768x1280 or 1024x1024
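If you don't want to do the rounding by hand, a small helper along these lines would work. This is a minimal sketch: the function name is made up, and the 128px multiple is just the value from the post above, not something confirmed as a hard requirement of Kontext.

```python
def snap_to_multiple(width: int, height: int, multiple: int = 128) -> tuple[int, int]:
    """Round each side to the nearest multiple (ties round up), never below one multiple."""
    def snap(value: int) -> int:
        # Integer arithmetic avoids float rounding surprises.
        snapped = (value + multiple // 2) // multiple * multiple
        return max(multiple, snapped)

    return snap(width), snap(height)


# A 1000x1300 source image snaps to 1024x1280.
new_size = snap_to_multiple(1000, 1300)
```

Resize or crop the input to the snapped size before feeding it in, so nothing downstream has to crop it for you.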
>>105715850prompting like a real shizo, gj.
>>105716320make the image pencil outlines:
>>105715850
you can crudely put clothes on her in ms paint and the ai will understand at like 0.5 noise
>>105716335it's way too slow for what it is. waiting 20+ seconds to give a giraffe sunglasses is too much. and that's ignoring all the lobotomization and artifacts
change the image to be the style of a pixar movie.
calarts retards are finished
make the image manga style in black and white colors. the girl is holding a green leek with her right hand.
well, it's the right hand on the picture, not her right hand. but still neat.
>>105716189it's pretty gimped, basically just inpainting without manual selection. "tell, don't show" but cursed.
>>105715850one of the first things I did, too. the way it totally fucking ignores prompts, it's like a shy mormon boy that won't even look at a girl that might make him lust.
>>105716412You're saying that putting plastic skin and a Flux chin on a person isn't worth 20+ measly seconds ?
What are you, a russian bot ? I bet you voted for Trump and owns a Tesla!
make a four panel comic with this character in manga style. each panel is black and white. In the first panel the girl is eating a cheeseburger. In the second panel the girl is eating a pizza. In the third panel the girl is drinking a glss of water. In the fourth panel she holds a sign saying "LDG".
so it understands a sequence of directions
>>105716469and there was a typo. new gen:
>>105716148how's the electricity bill anon?
SD 1.5 sovl
Wait wtf there is already Chroma v39? I got 38 like a few days ago.
>>105716186people will just remove the lora's and upload to a site that does not recognize their shitty license.
>>105716519same default workflow?
>>105716528*also which model are you using
>>105715510I still can't get the damn thing to do text correctly, its meant to read: billions must colorize
>>105714855 (OP)
>neighbors:
>>>/vp/napt
we are your relevant related Ai image generation community ;3
>>105716565well im not paying scam altman for prompts
>>105716537>BILLIONS MUST CULTURIZE
Does Chroma also use just euler + whatever or ?
>>105716580
4o is free, flux kontext is a paid api
unless of course you're talking about the censored demo model that serves as little more than an ad for api nodes
remove the red hair anime girl from the image.
asuka got stalin'd
Chroma melting prompt elements into one is just my skill issue?
>>105716537
>>105716584
i've figured it out I think. either it's the fact i've changed the prompt to
text: "my text"
or it's because I'm using the exact resolution of the original image. gonna test both ways to figure out what I've been doing wrong. No idea why it's cutting half the image out, and it still has garbled text at the bottom, but I'm getting there. I just want this for the meme genning potential because it's fast enough for genning quick meme images alongside shitposting.
pretty good free online generator:
https://perchance.org/ai-text-to-image-generator
it seems generally more coherent than what i can do locally with SDXL.
can anyone tell whats going on in the backend and how I can do the same with comfy
it's good at maintaining styles in edits like flux fill.
>>105716652
ahahahaha i got it. you need to set the empty latent node to whatever, 1024x1024, and importantly, set the ModelSamplingFlux node width and height to the reference image's original size. This time it did it correctly.
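For why the width and height on that node matter at all: as far as I remember from ComfyUI's source, ModelSamplingFlux turns them into a single "shift" value by interpolating between base_shift and max_shift based on how many latent tokens the image has. The sketch below is a standalone reconstruction from memory, not the node itself; treat the formula, the constants (256 and 4096 tokens) and the defaults (1.15 and 0.5) as assumptions and check them against your install.

```python
def flux_shift(width: int, height: int,
               max_shift: float = 1.15, base_shift: float = 0.5) -> float:
    """Interpolate the sampling shift from image size (assumed formula, see lead-in)."""
    x1, x2 = 256, 4096  # token counts where base_shift and max_shift apply (assumed)
    slope = (max_shift - base_shift) / (x2 - x1)
    intercept = base_shift - slope * x1
    # 8x VAE downscale and 2x2 patches per side -> one token per 16x16 pixels.
    tokens = width * height / (8 * 8 * 2 * 2)
    return tokens * slope + intercept


shift_square = flux_shift(1024, 1024)          # lands on max_shift
shift_portrait = flux_shift(768, 1280)         # fewer tokens, smaller shift
shift_zeroed = flux_shift(1024, 1024, 0.0, 0.0)  # both values at 0, as in the reminder
```

If that reconstruction is right, setting both values to 0 gives a shift of 0 regardless of size, while leaving the defaults makes the result depend on the width and height you enter.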
>>105716682can it do studio ghibli?
>studio ghibli
>generic anime
>flux face
yea it's flux dev alright
convert the image to manga style, in black and white. the shading is done in halftones.
>>105716682
and i just tried this with this
>>105714880
and it does not work, the text is garbled again, so you do need shift.
lmao
>other ai porn creators follow me
>i dont follow back
>>105716682
>>105716727
so let this be a lesson on why we need shift. but i suspect the comfy implementation is to blame here when it's turning shit into plastic
>>105716729>anon perfects AI pornpriorities amirite?
>>105716740
>but i suspect the comfy implementation is to blame here when its turning shit into plastic
What? Diffusers version doesn't have that? Can you post an example
>>105716749i unironically think god is testing me by letting me peer into the pit
>>105716729
>don't follow this guy because he uses a default loraless model
>don't follow this guy because he has overbaked cfg
>don't follow this guy because he's literally zeroefforting using some shitty web service
>don't follow this guy because his lighting and shading settings make skin look like sand
Maybe I'm just autistic
>>105716756idk desu, a decent pony like cyber realistic does not do plastic skin, it actually gens decently realistic skin. There are examples where kontext does not gen plastic looking skin but only when you turn off shift.
the chalkboard at the back of the room says "LDG" in white chalk.
it was empty before. did you notice the edit?
>>105716776it looks like you painted in photoshop
>>105716776change the girl's clothes into a panda costume.
>>105716787manletification and boob removal complete
>>105716787the girl in the photo is sitting at a desk and writing with a pencil.
not bad
https://xcancel.com/fofrAI/status/1938271690493927932
heh
it literally turns characters into chibi style, what did they mean by this? :D
I'm trying to use Chroma with NAG but I keep getting really bad results. It's all either overcooked or undercooked.
>>105716767
>>don't follow this guy because he uses a default loraless model
I thought you meant he uses a base model with no loras, which I was going to say BASED to until I read the rest of your post
>>105716807the girl in the photo is wearing black eyeglasses and is holding a beaker with bubbling green liquid. she is wearing a lab coat.
>>105716657
>https://perchance.org/ai-text-to-image-generator
looks okay
>>105716808
>2 reference images
ModelSamplingFlux stacking? I'm gonna have to try this.
>>105716592kek why do you think im trans?
im literally libertarian
if you MUST insult
atleast research...
see ya >>>/g/sdg
;3
just woke up, what's the verdict
and is gguf ready yet
>>105716829this would take a human 40 hours lmao
>>105716848https://huggingface.co/bullerwins/FLUX.1-Kontext-dev-GGUF
it's a fun toy, need to mess with it more
>test: OG miku source, "give the anime girl a ww2 nazi outfit and make the image black and white".
replace the headline text with "stupid retards fund DEI game and go broke"
amazing. almost first try.
>>105716881remove the black woman with the red coat from the image.
also good at stalin photography, text based inpainting.
>>105716906the background lines being consistent is a neat touch as well, it's not just a black blob.
>>105716906remove the black woman with the red coat from the image. Replace her with a sexy blonde woman with large breasts, in a black dress.
>>105716872or a chinese street artist 6 minutes
remove the black woman with the red coat from the image. Replace her with a sexy blonde woman with large breasts, in a black dress. remove the black man with the teal coat from the image and replace them with Jackie Chan.
kek
convert to an image in the style of van gogh:
>>105716495It's like $200/mo extra
>>105716808how is this done? the template workflow just has image stitch which is shit
convert to a painting style with visible brushstrokes
convert to a painting with visible brushstrokes. Give the anime girl a red beret and change her shirt to a white blouse.
le Miku:
Alright boys... real talk, what's the general requirements for training a Wan video lora? Is it comparable to XL lora training in terms of hardware? I'm pretty familiar with training those and embeds.
How do you go about training? Gather up videos, chop them up into tiny clips and caption them?
I think I've fucked around with makin vids enough and I wanna bite into some nitty gritty type shit. Hoping my lil 4070ti can chug along and train.
>>105716983If you have to ask you don't have enough
>>105716983I wouldn't bother with less than 24 GB of VRAM especially if you want to do motion. If you're poor you're much better off just using Runpod because it only takes a few hours.
>>105716988;_;
I'm guessing that's a no go then.
>>105716991Rip.
I just wanted to get some loras that actually fucking worked and weren't stupid shit.
convert to a painting with visible brushstrokes. Give the anime girl black hair and a spiked neck collar. Change her shirt to be a Ramones music band tshirt.
neat punk Miku:
>>105716983I read that you can try it with videos using only 12GB vram inside of windows, not sure about linux but it was months ago at time of writing. just check civitai for the method, it will be somewhere under wan2.1
>>105717001what are you after
if it's cartoon shit forget it, otherwise I'm looking for ideas myself
>>105717003Give the anime girl red curly anime hair. Change her outfit into a white dress.
>>105715769I had some success changing "girl" for "character."
>>105716983
>Hoping my lil 4070ti can chug along and train
You can generate videos with a 4070 ti no problem, but training a video lora, that's not realistic.
>>105717041bullshit, people were doing it with 12GB cards months ago, takes a long time though, like 68 hours or more. I'd not bother which is why I haven't.
image lora's on the other hand which really only work in a t2v setting are an option.
>>105717026
>I read that you can try it with videos using only 12GB vram
Wouldn't you only be able to train with videos if you were training a video lora?
>>105717027No clue really. Mainly NSFW shit, fixing some issues I've been having with the general loras. Fluids turning into fucking firehoses is a big one. No matter how I prompt that shit, what are intended to be small drips turn into streams with enough pressure to send a bitch to orbit.
>>105717041I knew it was a pipe dream, but I had to ask.
>>105717062A 5090 is $0.89/hr on Runpod, most Loras can be trained in less than 5 hours.
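Back-of-the-envelope for that, using only the numbers in the post ($0.89/hr, 5 hours as the upper bound). The function name and the retry count are made up for illustration; actual pricing and training time will vary.

```python
def rental_cost(hourly_rate: float, hours: float, runs: int = 1) -> float:
    """Total rental cost in dollars, rounded to cents."""
    return round(hourly_rate * hours * runs, 2)


# One 5-hour run at $0.89/hr.
single_run = rental_cost(0.89, 5)

# Same run repeated three times, e.g. if the first two attempts come out bad.
three_runs = rental_cost(0.89, 5, runs=3)
```

So even a couple of botched runs stays in the range of a few dollars, under these assumptions.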
this isnt the plushie I ordered, I want a refund, the boobs are missing
>>105717062
>Wouldn't you only be able to train with videos if you were training a video lora?
no, it's not exactly what I meant. you can train wan loras with images, they just won't have any motion outside of wan's dataset, but you can combine them with motion loras.
> prompt image with kontext
> nothing changes
i..ok? but why.
>>105717086sounds like a skill issue problem
>>105717086did you try to prompt it with nsfw prompts? Then that is why.
>>105717080remove the plane in the background and replace it with a hot dog stand. A large sign saying "BIG GUYS" is above the hot dog stand.
this is a fun model desu
>>105717088"change the girl's expression to be happy"
i guess kontext is telling me she was already happy kek
>>105717092Avoid abstract concepts, use something like "smile" or "grinning" instead
>>105717092you vill not look happy and you vill be happy
>>105717072Yeah I know that's an option, but there's something better to me about fucking up a training session on my own hardware and the only cost is time, vs paying for time on another machine and then being out a few bucks if it's fucked. I know that's kind of stupid.
I know electricity costs money on my end, but I'm using the same amount of power doing other similarly intensive processes, so it's not really like I'm paying any more than I usually would.
>>105717082Ohhh, I get what you mean, I think. So like you could train a character lora based on images, and it'll use Wan's dataset for motion for that character. Seems like something more meant for text2vid. I'm more looking at image2vid, t2v is neat, but it doesn't seem quite there like i2v is since you can start from something that is already 90% of the way there.
>>105717097ah. that's actually a fair point. fucking chroma and 1girl slop has ruined me.
>>105717055
>takes a long time though, like 68 hours or more.
Yeah, that's the unrealistic part: training for 68 hours or more, and you typically don't get it right the first time, so you need to adjust parameters and possibly the dataset, then train again...
If you have endless patience, yes, as long as the model starts training on your hardware, you technically can train, technically you can train with a cpu instead of a gpu as well...
>>105717100
>Ohhh, I get what you mean, I think. So like you could train a character lora based on images, and it'll use Wan's dataset for motion for that character. Seems like something more meant for text2vid. I'm more looking at image2vid, t2v is neat, but it doesn't seem quite there like i2v is since you can start from something that is already 90% of the way there.
yeah, then you can use a lora stack loader and combine nsfw loras with some image lora. this only works with the text-to-video model or perhaps even the vace/phantom models. wan is very flexible about mixing models, so experiment. but the secret sauce is learning how to take a person and use phantom to get the job done :-)
>>105717122>>105717100i meant to say if you can learn to use phantom really well then you actually don't need a lora.
Change the background to a suit store, the man in the blue shirt is now wearing a black suit and red tie.
what version of sageattention does the OP wan have?
>>105717131the only public version
>>105717129should use the blowjob lora desu
>>105715435do dsp blowing his brains out with a shotgun because he fuckin sucks
Do NAG and the other guidance methods work on Chroma?
>>105717177
i don't think anyone has really done it, and the matrix sizes would probably need to be changed to be compatible. anyone can go ahead and create a node for it.
i did find this
https://www.reddit.com/r/StableDiffusion/comments/1lkubao/anyways_yet_to_improve_infer_speed_on/
more importantly why is chroma so fucking slow? when this flux based kontext model has been pleasurable to use in terms of speed.
>https://github.com/ChenDarYen/ComfyUI-NAG
I don't get it. does nag not work for regular wan?
>>105716830 >>105716968
>I'm passing one image ref condition directly into another and seeing what happens
>>105716808 >>105716968
this, i've been trying to figure it out. stacking the nodes causes oom, batching them only uses one image at a time, and concatenating them does not work. though if i send them concatenated but only send one of them into the FluxKontextImageScale it does influence the image
>>105717208
>needs another fucking snowflake node
>doesn't work if I hook up another NAG node
>>105716593res_multistep + sgm_uniform
>>105717247i tried that, this was giving me oom even when i scaled them down to 512x512 each, you might be doing it differently. do post any results if you get anything decent like what the X user posted.
>>105717261hang on
*blows*
not yet
lumina just oozes with style, i think the base model was trained on a lot of art. that's the thing, china doesn't care about copyright one bit so they don't exclude things from the dataset. slop is a symptom of no art in the dataset
>>105717271
There are hundreds of thousands of works of fine art that are in the public domain
>>105717208+20 seconds to gen time is crazy. Also 6 fingers in neg made her hide her other hand lmao.
>>105717271chinese have a much larger percentage who are into anime girl stuff so they include lots of them in the dataset
>>105717275then it would make sense for black forest labs models to be somewhat competent at replicating varied styles other than photorealism if it has those styles in the dataset?
why do so many models have the plastic sheen/sepia tint?
lumina is a model that seems to be exempt from the "plasticity" of flux despite its smaller size
Replace the text "Concord will go offline" with "SHIT GAME GOES OFFLINE". The black woman with a red coat is holding a bucket of KFC chicken. The text "DEI STRIKES AGAIN" is at the bottom of the image.
it's neat you can do this without even doing inpaint masks.
>>105717291Because Flux is heavily lobotomized for "style" and given how many people praise it it clearly makes the Redditors happy.
remove the four characters in the image and replace them with a sign in the floor that says "note: they were all fired".
we need a reference for types of prompts that kontext understands since there is so much possible stuff.
>>105717247
i've done it i think, use the combine conditioning node perhaps? well it's doing something...
>>105717293replace the text at the top of the image. replace it with "why the fuck did you buy this shit?"
almost, but shows how effective the context based editing is. it also keeps the font style.
>>105717293do you really need ai to do this?
>>105717323there, we have a winner. also, even if you wanted to shoop it, you'd have to have a duplicate font. this allows you to dupe the font via AI knowing the style.
>>105717331without AI i'd need the font IGN used to make it look authentic.
>>105717331For the record it would take longer than 20 seconds to do it in Photoshop
>>105717335*also to remove all the characters and make it look neat it'd take much more time than 30-40s.
also note
>>105717310 as the removed characters are gone but the background lines are still consistent. i'd have to draw those in a shoop.
>>105717247yeah anon that causes oom if you do it that way, combining the conditioning does not oom. but that produces garbage but hang on because now i've found what causes the oom. The ModelSamplingFlux node will be the next thing to combine i'm sure.
>>105717411what? its SHIT
>>105717247
this setup seemed to be working for the first few steps, then it fucks up, so i know i'm close to the solution. i probably just need to change the prompt now and/or the images and steps. i love to tinker with shit like this, it's how i find the best workflows desu.
>>105717422no anon, it's pure gold. gimme more foxes
In the face of incredible leaps in local diffusion, being a vramlet sucks.
>>105717623What did you use to make this and how much vram?
>>105717986KiwiMix-XL V3, 8 GB VRAM.
>>105714932
>qwen3 imagegen
>235b
kek no one is gonna run that
>>105715183
I mean, simple editing shit like that can be done with inpainting. the most interesting thing about 4o and kontext is character reference, but kontext makes the skin plastic and can't keep the drawing style when it takes a character and makes it do something else, that's the problem