Thread 105714855

324 posts 238 images /g/
Anonymous No.105714855 >>105714875 >>105716119 >>105716343 >>105716572
/ldg/ - Local Diffusion General
Manlet Edition

Discussion of Free and Open Source Text-to-Image/Video Models

Prev: >>105712655

https://rentry.org/ldg-lazy-getting-started-guide

>UI
SwarmUI: https://github.com/mcmonkeyprojects/SwarmUI
re/Forge/Classic: https://rentry.org/ldg-lazy-getting-started-guide#reforgeclassic
SD.Next: https://github.com/vladmandic/sdnext
ComfyUI: https://github.com/comfyanonymous/ComfyUI
Wan2GP: https://github.com/deepbeepmeep/Wan2GP

>Models, LoRAs, & Upscalers
https://civitai.com
https://civitaiarchive.com
https://tensor.art
https://openmodeldb.info

>Cook
https://github.com/spacepxl/demystifying-sd-finetuning
https://github.com/Nerogar/OneTrainer
https://github.com/kohya-ss/sd-scripts/tree/sd3
https://github.com/derrian-distro/LoRA_Easy_Training_Scripts
https://github.com/tdrussell/diffusion-pipe

>WanX (video)
Guide: https://rentry.org/wan21kjguide
https://github.com/Wan-Video/Wan2.1

>Chroma
Training: https://rentry.org/mvu52t46

>Illustrious
1girl and beyond: https://rentry.org/comfyui_guide_1girl
Tag explorer: https://tagexplorer.github.io/

>Misc
Local Model Meta: https://rentry.org/localmodelsmeta
Share Metadata: https://catbox.moe | https://litterbox.catbox.moe/
Img2Prompt: https://huggingface.co/spaces/fancyfeast/joy-caption-beta-one
Samplers: https://stable-diffusion-art.com/samplers/
Txt2Img Plugin: https://github.com/Acly/krita-ai-diffusion
Archive: https://rentry.org/sdg-link
Bakery: https://rentry.org/ldgcollage | https://rentry.org/ldgtemplate

>Neighbors
https://rentry.org/ldg-lazy-getting-started-guide#rentry-from-other-boards
>>>/aco/csdg
>>>/b/degen
>>>/b/celeb+ai
>>>/gif/vdg
>>>/d/ddg
>>>/e/edg
>>>/h/hdg
>>>/trash/slop
>>>/vt/vtai
>>>/u/udg

>Local Text
>>>/g/lmg

>Maintain Thread Quality
https://rentry.org/debo
Anonymous No.105714875
>>105714855 (OP)
>Manlet Edition
I approve
Anonymous No.105714880 >>105714901 >>105714925 >>105714964 >>105715360 >>105716727
Kontext reminder, try setting ModelSamplingFlux values to 0, which can help depending on the prompt.
Anonymous No.105714896
Anonymous No.105714901
>>105714880
Also the "without changing the rest of the image at all" etc prompt is redundant here
Anonymous No.105714919 >>105714936 >>105715287
realistic great white shark with its mouth wide open, photo realistic, high quality

it's still not perfect, it definitely can't mix anime with realistic very well. you would have to start with realistic first, then layer the other stuff over the top of it.
Anonymous No.105714923 >>105714985
Anonymous No.105714925 >>105714931 >>105716198
>>105714880
What the fuck is Kontext?
Anonymous No.105714930
>>105712745
>https://rentry.org/omnigen2_plots
Added to https://rentry.org/ldg-lazy-getting-started-guide#anon-guides-and-resources thank you anon.
Anonymous No.105714931
>>105714925
jpeg artifact filter
Anonymous No.105714932 >>105714949 >>105714951 >>105718079
reminder that qwen3 imagegen is going to btfo kontext
Anonymous No.105714936 >>105714964 >>105715023
>>105714919
try removing the realistic
try removing the realistic and adding photograph raw
Anonymous No.105714949 >>105714978 >>105714982
Inb4 lode announces chroma-kontext.

>>105714932
its pretty grim for local models rn
Anonymous No.105714951
>>105714932
>going to
Anonymous No.105714964
>>105714936
>A great white shark with its mouth wide open, photograph raw
like this? I'm running it to see if that improves, i lowered the cfg to 1.5 and did what >>105714880 said

it knows what a shark is so its not outside of its dataset.
Anonymous No.105714978
>>105714949
>Inb4 lode announces chroma-kontext.
it won't happen
Anonymous No.105714982
>>105714949
It will never be 'grim' for local models because you will always be limited to neutered, watered down censored models (and have everything you do tracked/monitored) when using online models.

there's a reason local is infinitely more popular and discussed.
Anonymous No.105714985
>>105714923
Plot twist
Anonymous No.105715007 >>105715019 >>105715031
50 minutes for this img2vid with 8gb vram. i did something wrong or is that normal?
Anonymous No.105715019 >>105715059
>>105715007
How much ram offload? Do you have torch compile enabled? It increases gen time. How much ram?
Anonymous No.105715021
is there a page showing kontext examples, for reference
Anonymous No.105715023 >>105715069
>>105714936
well it didn't work so now i'm gonna try something else.

A photograph of a black cat sitting on a cream colored sofa.

if it can't do that then its going into the trash literally.
Anonymous No.105715031
>>105715007
Most likely you are running out of vram and it's using system ram, which will slow things down a lot, but it also depends on the card model you are using.
Anonymous No.105715042 >>105715052
Convert to quick pencil sketch

neat, it works. using q8 gguf quants from https://huggingface.co/bullerwins/FLUX.1-Kontext-dev-GGUF/tree/main
Anonymous No.105715050 >>105715052 >>105715070 >>105715145 >>105715187
>local is so hecking free!
>you MUST add content filters to your outputs or go to jail!
why act like censorship isn't catching up to local? civitai is banning hundreds of loras every day. local finetuners are kekkolds who kneel to licenses. this isn't some magical land of piracy and free expression. everyone involved in local has been getting buckbroken by censors recently and it's not as yippee-freedom as it once was.
Anonymous No.105715052 >>105715068
>>105715042
YOU SUPPORT THIS FUCKING SHIT??? >>105715050
Anonymous No.105715059
>>105715019
i'm a noob and don't understand your questions lol. i have 16 gb ram
Anonymous No.105715068 >>105715087
>>105715052
I dont give a shit about what their rules are, how would they even prove I used their model if I can strip all data from it
Anonymous No.105715069
>>105715023
you need to wait for chroma kontext to get great realism, flux cant come close even with top loras today
Anonymous No.105715070 >>105715181
>>105715050
not my problem, weights are on my pc
Anonymous No.105715077
nothing like New Thing to bring out the FUD poasters
Anonymous No.105715083
It can do a harmless pussy cat but sharks are too DANGEROUS! Better censor that just in case someone gens some diver gore. How fucking gays, Germans really are too serious.
Anonymous No.105715087 >>105715095
>>105715068
hive actually detects the model that was used
Anonymous No.105715095 >>105715104
>>105715087
>run it through img2img on sd for half a second
heh, nothing personell
Anonymous No.105715104
>>105715095
but then it will look like shit. you are like the glaze trannies
Anonymous No.105715120 >>105715155
it's already more fun than flux fill.
Anonymous No.105715145 >>105715191
>>105715050
Ban pencils and paint brushes, fuck it better just ban thinking all together in case someone imagines something evil. Why does AI make the normie freak so much? Its just another fucking art tool.
Anonymous No.105715155
>>105715120
>put the text "no gays!" on the sign. change the heart on the sign to a rainbow flag.

this is a lot of fun and im just doing basic tests.
Anonymous No.105715181 >>105715362
>>105715070
those weights will never do your heckin celebrity porn btw
Anonymous No.105715183 >>105718098
change her clothes into a white blouse and blue jeans.

magic
Anonymous No.105715184
imagine living in a country where what you draw on a piece of paper can land you in jail, this is the literal definition of thought crime and also victimless because it's all fiction. but here we are, i think it is time for western men to just ditch their countries and move to a more sane country.

let it all fucking burn
Anonymous No.105715187 >>105715212 >>105715220 >>105715317 >>105715339 >>105715369
>>105715050
Here's more crazy shit they've 'updated':

>b. Non-Commercial Use Only. You may only access, use, Distribute, or create Derivatives of the FLUX.1 [dev] Model or Derivatives for Non-Commercial Purposes. If you want to use a FLUX.1 [dev] Model or a Derivative for any purpose that is not expressly authorized under this License, such as for a commercial activity, you must request a license from Company, which Company may grant to you in Company’s sole discretion and which additional use may be subject to a fee, royalty or other revenue share. Please see www.bfl.ai if you would like a commercial license.

This means that if for example you create a Flux lora and receive buzz for it on Civitai, you need to contact BFL and get a license.
Anonymous No.105715191 >>105715255
>>105715145
it's only natural that the mindless insect fears the machine
Anonymous No.105715203 >>105715230 >>105715249 >>105715256 >>105715282 >>105715350
https://xcancel.com/ostrisai/status/1938340573557002322
>It works! Fine tuning Kontext on task specific instruction datasets is going to open up a whole new world. Samples at 1,500 steps.
Anonymous No.105715210
Blessed thread of frenship
Anonymous No.105715212 >>105715352
>>105715187
people will act like this doesnt impact them until 2 years later when there still arent any finetunes, then they'll be asking "what happened guys where are the kontext finetunes???"
Anonymous No.105715220
>>105715187
kek
Anonymous No.105715230
>>105715203
Nice.
Anonymous No.105715231
Damn, Kontext is pretty cool, actually. Even if it's just a toy, that's pretty fun toy.
Anonymous No.105715249
>>105715203
This has given me hope
Anonymous No.105715255 >>105716040
>>105715191
awesome, catbox by any chance ?
Anonymous No.105715256 >>105715361
>>105715203
HUH NO WAY YOU'RE TELLING ME IT'S POSSIBLE TO FINETUNE KONTEXT !!?!?!?!?!?!??!?!??!?!?!?!? BUT 4CHAN TOLD ME IT"S IMPOSSIBRU!?!?!??!?!?!?!?!?
Anonymous No.105715282
>>105715203
is ostris == debo or something?
Anonymous No.105715283
Anonymous No.105715287
>>105714919
Default Flux has shit style sorry anon
Anonymous No.105715289
Change the crates on the man's back to cheeseburgers.

this is an amazing tool desu, infinite meme potential and actual text based inpainting/swaps.
Anonymous No.105715291 >>105715312
OH MY GOSHHHHHHHH IT CAN PUT PARTY HATS AND SUNGLASSES ON ANIMALS??!?!?!?!?!? LOCAL IS SAVED!!!!!!!!!!
Anonymous No.105715296 >>105715360 >>105715407
>zoom out, make her breasts gigantic, her hair very long, looking at viewer, split her hair color at her templer so the left side of her hair is gray
Mostly censored and a little limited to simpler things that it knows, but pretty good. Takes some rerolling for more complex things, but what doesn't

Shows great promise that every model in the future, one that is not censored and capable of proper realism like Chroma, will one day have this easy to use textual editing workflow
Anonymous No.105715301 >>105715310 >>105715413
Anonymous No.105715310
>>105715301
I'm gonna need a box anon
Anonymous No.105715312 >>105715344
>>105715291
It turns out that that's a really difficult problem to solve and took all of humanity up until this point to achieve. Making the skin look less plastic comparatively takes 30 minutes of time for someone with a >90 IQ. So yeah, local is saved.
Anonymous No.105715317 >>105715327
>>105715187
>Steals other peoples data
>Trains model with stolen data
>Expects people to pay them royalties

I hope they get sued.
Anonymous No.105715325 >>105715457
give the anime girl very large breasts and a t-shirt that has a milk bottle on it.

it made thin migu pregnant...
Anonymous No.105715327 >>105715366 >>105715411
>>105715317
You don't even need to sue them, just ignore their meme license. The only case where courts might side with them is if you're competing with them directly as an API provider.
Anonymous No.105715339 >>105715384
>>105715187
Which part of non-commercial do you not understand?
Anonymous No.105715344
>>105715312
>Making the skin look less plastic comparatively takes 30 minutes of time for someone with a >90 IQ.
kekd
Anonymous No.105715350 >>105715377 >>105715385
>>105715203
Btw, you can already train Kontext with plain t2i training on all training scripts with zero modifications, since the model is architecturally the same as Dev and can be used in normal t2i mode also.
Anonymous No.105715352
>>105715212
>2 years later
Anonymous No.105715360
>>105715296
check out >>105714880
Anonymous No.105715361 >>105715372
>>105715256
No one said it was impossible, it's just Flux with a context latent and the "caption" is an instruction.
Anonymous No.105715362 >>105715452
>>105715181
it will do whatever I want it to, training scripts already exist
Anonymous No.105715366
>>105715327
oh i will anon, its not like they are gonna know when the image has been passed through wan.
Anonymous No.105715369 >>105715827
>>105715187
Prior to this what did you think the purpose of the Flux.1 Dev Non-Commercial license was? To permit commercial use of the model weights? That is just clarification of the existing terms and it's one reason why people are using Schnell-based models instead
Anonymous No.105715372 >>105715403
>>105715361
>No one said it was impossible
>>105714765
>>105714895
Anonymous No.105715375 >>105715387
so is it a nerfed model? cant do the ghibli meme test
Anonymous No.105715377 >>105715432
>>105715350
That would be retarded because what you want is:
Input Image
Input Caption
=
Output Image
Anonymous No.105715384
>>105715339
When you receive buzz, you are by definition 'commercial', you are receiving monetary compensation.

Are you retarded ?
Anonymous No.105715385 >>105715432
>>105715350
I don't doubt you senpai but I wonder why he said
>think I have it properly integrated. Training a test LoRA now with a custom made instruction dataset to put a specific logo on people's t-shirts.
makes it seem like it's not
Anonymous No.105715387 >>105715413
>>105715375
this is the closest so far from a stellar blade image.

Convert Image to Ghibli Style
Anonymous No.105715403 >>105715419
>>105715372
>16 year olds replying to each other
Why are you like this?
Anonymous No.105715407 >>105715417
>>105715296
Name of this semen demon?
Anonymous No.105715411
>>105715327
>meme license
and it's exactly that, what they going to do if i use an image to gen a video using wan? bunch of clowns they are.
Anonymous No.105715413
>>105715301
please post a box i need it too
>>105715387
sex
Anonymous No.105715417 >>105715479
>>105715407
jadeyanh
Anonymous No.105715419 >>105715428 >>105715436 >>105715570
>>105715403
>no one said it's impossible
>*posts someone literally saying it's impossible
If this sequence of events upsets you then you should consider detransitioning and checking yourself into a mental health facility.
Anonymous No.105715428 >>105715451
>>105715419
THIS IS FUCKING 4CHAN LITERAL RETARDS LIKE YOU POST HERE
Anonymous No.105715432 >>105715489
>>105715377
Brother, you can do vanilla t2i training just like Dev, to teach it new concepts. And then even use the new concepts in image editing tasks. Source: me, I'm doing that right now, and already have some early lora checkpoints that show it works.
>>105715385
He's talking about a custom dataset specifically for image editing in a certain way. And tbhdesu I don't think his dataset is a good example; if you teach the model what the logo is using vanilla t2i training, you shouldn't need the editing-specific examples to make it work.
Anonymous No.105715435 >>105715463 >>105717164
okay it's pretty good

Convert Image to Ghibli Style + dsp
Anonymous No.105715436
>>105715419
it's not that it's impossible, it just really fucking sucks trying to train a distilled model
Anonymous No.105715451
>>105715428
sneed
Anonymous No.105715452 >>105715468
>>105715362
you and what compute, goy?
Anonymous No.105715453
so how does comfy do the multi panel comic example?
Anonymous No.105715457
>>105715325
She is bearing our children who will grow up to be big and strong.
Anonymous No.105715458
Anonymous No.105715463
>>105715435
converted to a ghibli-style image
Anonymous No.105715468
>>105715452
my gpus retard
Anonymous No.105715472
Convert Image to black and white lineart, done with a pencil.
Anonymous No.105715477 >>105715488
excited for the 2x3060 finetune, it will be insane bro!
Anonymous No.105715479
>>105715417
thanks
mayhaps i touch myself to her later
Anonymous No.105715488
>>105715477
why do you assume everyone is poor like you bro?
Anonymous No.105715489 >>105715551
>>105715432
I think in general with Kontext we want to train directions, not styles...
Anonymous No.105715494 >>105715505 >>105715541
give the anime girl a large baseball cap that says "LDG"
Anonymous No.105715499 >>105715642
My fleet grows
Anonymous No.105715505 >>105715514
>>105715494
replace the green leek the anime girl is holding with a large greatsword.
Anonymous No.105715510 >>105716537
>Colorize this image
Damn, son.
Anonymous No.105715514
>>105715505
interesting how despite the typo it got the subtitle font style right.
Anonymous No.105715541
>>105715494
give the anime girl blue jeans and a white hoodie.

aqua color strings, nice touch!
Anonymous No.105715551 >>105715567 >>105715574
>>105715489
Let me put it this way:

1. You have a dataset consisting of images of a weird object called a smorgla.
2. You train Kontext using vanilla t2i training on this dataset, using existing training scripts with literally 0 modifications.
3. Now you use the lora in a Kontext image editing workflow, you have an image of a person and say "make him hold a smorgla" and it just werks.

This appears to work perfectly, I am doing it right now. Maybe it won't work well if you have some extremely weird concept that absolutely needs training examples consisting of editing instructions. But for like 99% of what people will want to train on, this is good enough, and you don't need to construct a dataset consisting of an image pair and editing instructions.
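
For comparison, the image-pair dataset that this approach avoids could be laid out roughly like this. A minimal sketch only: the `control/` and `target/` folder names, the `.txt` instruction files, and the `EditSample` type are all hypothetical and not any trainer's actual format.

```python
from dataclasses import dataclass
from pathlib import Path


@dataclass
class EditSample:
    control: Path     # input image before the edit
    instruction: str  # editing instruction, e.g. "make him hold a smorgla"
    target: Path      # expected image after the edit


def load_edit_samples(root: Path) -> list[EditSample]:
    """Pair target/<name>.png with control/<name>.png and target/<name>.txt."""
    samples = []
    for target in sorted((root / "target").glob("*.png")):
        control = root / "control" / target.name
        caption = target.with_suffix(".txt")
        # Skip incomplete triplets instead of guessing missing pieces.
        if control.exists() and caption.exists():
            samples.append(
                EditSample(control, caption.read_text().strip(), target)
            )
    return samples
```

Plain t2i training only needs the target image and a caption, which is why it runs on existing scripts unchanged.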
Anonymous No.105715567 >>105715574
>>105715551
>You train Kontext using vanilla t2i training on this dataset, using existing training scripts with literally 0 modifications
I assume this is using flux-dev settings. I would still like to train it with an i2i instruct mode for specific tasks, hope the trainers add that soon
Anonymous No.105715570
>>105715419
dude we've already been over this exact chain of events with flux dev
Anonymous No.105715574
>>105715551
Some of us are distinguished scholars

>>105715567
ai-toolkit just updated with control image + image/txt pair datasets, I'll be training it out
Anonymous No.105715605 >>105715625
I was sleeping for years.
Qrd on where did Automatic1111 go? Idle interest.
Anonymous No.105715623
Anonymous No.105715625 >>105715635 >>105715637 >>105715673
>>105715605
went back to making rimworld mods after seeing the writing on the wall with SDXL
Anonymous No.105715635
>>105715625
Thank you.
Anonymous No.105715637
>>105715625
you mean just playing DoTA right?
Anonymous No.105715639
change the text from "VICTORY ROYALE" to "LDG POSTER!"

it maintains the style of the text which is neat, like flux fill
Anonymous No.105715642
>>105715499
serious hardware swärjebro
Anonymous No.105715673 >>105715697
>>105715625
>rimworld
i've been playing this recently and was about to go play it again after seeing how bad kontext is. 2 of my colonist have babbies on the way and the new massive freezer warehouse is finished.
Anonymous No.105715697 >>105715718 >>105715736
>>105715673
me? im waiting for the new dlc so i don't get burnt out before
Anonymous No.105715704 >>105715749
are there any decent image to video generators i can play with on colab? my computer is a decade-old potato
just looking to animate some images, nothing too involved
Anonymous No.105715718
>>105715697
yup me too.
Anonymous No.105715736
>>105715697
we're probably gonna need to start a new game though due to all the stuff they are adding.
Anonymous No.105715749 >>105715773
>>105715704
i would have been willing to help but i've got to go get something to eat or i will surely die.
Anonymous No.105715769 >>105715785 >>105715790 >>105715801 >>105717033
why isn't it working, i tried put on clothes on the pregnant girl, put clothes on the girl, put a dress on the girl now
Anonymous No.105715773
>>105715749
consume what you must
Anonymous No.105715785
>>105715769
>The exposed AGP transexual has to invade our thread too
Anonymous No.105715790
>>105715769
censored slop. the absolute state of local releases
Anonymous No.105715801
Omnigen is kinda blurry (?)

>>105715769
sir...
Anonymous No.105715804 >>105715829
a man with sunglasses and wearing a long black trenchcoat, opens a pizza box and eats a slice of pizza.
Anonymous No.105715813
Wan is probably really good at making a control dataset. You can make it do everything, especially the magic transitions work fine.
Anonymous No.105715827
>>105715369
Well, yes, this is mainly clarifying things people thought but weren't sure of.

In short: You can't receive any monetary compensation for loras, fine-tunes, merges etc based on Flux dev without getting a license from BFL, which of course means paying them.

The good thing is that this clarification means Flux dev is dead for the Civitai lora maker crowd, and most likely they will create loras for Chroma instead, thus making its ecosystem stronger.

Thanks BFL!
Anonymous No.105715829 >>105715939
>>105715804
this one is much better imo
Anonymous No.105715850 >>105716372 >>105716385 >>105716436
JUST FUCKING WORK
FUCK
FUCK
FUCK
FUCK
FUCK
Anonymous No.105715919 >>105716031
nice
Anonymous No.105715939 >>105715956
>>105715829
720p
Anonymous No.105715946 >>105715968 >>105715982 >>105716021
>a furry and China combined make bfl, the original creators of stable diffusion, irrelevant
what is this timeline?
Anonymous No.105715956
>>105715939
Crazy what nanotech can do
Anonymous No.105715968
>>105715946
what kind of cope is this?
Anonymous No.105715982 >>105716021
>>105715946
It's amazing isn't it ? The offshoot from SAI is going to die the same way as SAI, being greedy and virtue signalling itself into an early grave.
Anonymous No.105716021 >>105716067
>>105715946
>>105715982
?
Anonymous No.105716031
>>105715919
wtf kino
Anonymous No.105716040
>>105715255
https://files.catbox.moe/zwsdto.png
Her name is Rallye, she is an infamous shitposter on /o/.
she is a used rental slut machine who wants you to give her the beans cap'n
Anonymous No.105716053
>rentry
is there any guide for fast flux?
Anonymous No.105716067 >>105716112 >>105716114
>>105716021
Wan killed BFL's video model they were working on, since it suddenly became obsolete, meanwhile BFL's licensing means that community support is essentially dead for their models, with the exception of Flux Schnell which has a permissive license, and is the one Chroma is based on.

So if you were making lora for Flux dev, and want to keep on making loras, you will most likely target Chroma or some other model with decent licensing terms instead.
Anonymous No.105716076
cosmos >>> chroma
Anonymous No.105716082 >>105716138
Anonymous No.105716112 >>105716186
>>105716067
I see. Can you post some case law for the license thing, I've not heard of anyone getting sued despite there definitely being infringing parties
Anonymous No.105716114
>>105716067
you're right about wan, but i wouldn't say chroma made BFL irrelevant when it's entirely reliant on a BFL base model. it likely killed the BFL-dev license though and guaranteed they will never release anything actually permissive ever again.
Anonymous No.105716119
>>105714855 (OP)
The grey one is pretty cool.
Anonymous No.105716138
>>105716082
Anonymous No.105716148 >>105716495
The real truth is you don't need any of these people to make your own model especially if you have H100s.
Anonymous No.105716157 >>105716168
with kontext dev, can you keep original image's WxH or it must be 1024x1024?
Anonymous No.105716168
>>105716157
You can give it an empty latent, if you don't use your input's latents it does seem to crop like 20px though.
Anonymous No.105716186 >>105716515
>>105716112
Nobody has been sued AFAIK, but it doesn't matter. BFL coming out now and clarifying that any commercial use of Flux dev or its derivatives needs a license means they are most likely going to contact Civitai etc and tell them they can't have monetary compensation for loras / finetunes based on Flux dev.

And even if they don't right now, they can do it at any time, meaning the time and money spent on training those are now worthless from a compensation standpoint.

>Can you post some case law for the license thing,
Case law for licensing in general ?
Anonymous No.105716189 >>105716206 >>105716228 >>105716436
So is kontext good or a nothingburger?
Anonymous No.105716198
>>105714925
chatgpt studio ghibli edit but local
Anonymous No.105716206 >>105716307
>>105716189
chroma kontext is all you will need, until then, this is good for simple changes to the images one might need
Anonymous No.105716226
remove the man in sunglasses from the photo.
Anonymous No.105716228 >>105716240
>>105716189
Depends on your needs, but personally I can't think of anything I want to use it for that I can't do with img2img, inpainting, and with better results.

It's just kinda meh, perhaps because it was hyped so much that it made me expect something much better.
Anonymous No.105716240
>>105716228
Can it take an image and leave only the outfit against a white background? That would be a pretty huge use case for me.
Anonymous No.105716273
Anonymous No.105716307 >>105716335
>>105716206
chroma kontext will never be a thing
Anonymous No.105716320 >>105716379
kek

make the image black and white and give the anime girl a WW2 era Nazi uniform.
Anonymous No.105716321 >>105716328
Anonymous No.105716328
>>105716321
>tranny raid on /lmg/ AND /ldg/
grim
Anonymous No.105716335 >>105716412
>>105716307
I agree, the concept of Kontext is underwhelming and the results are overall really poor.

A week from now it will be forgotten, thus Chroma won't have a kontext model.
Anonymous No.105716343 >>105716358
>>105714855 (OP)
I haven't been involved in local imagen for the past two years, is there anything better than stable diffusion yet
Anonymous No.105716346 >>105716363
I'm using comfy example
why does kontext crop the original image though?
Anonymous No.105716358
>>105716343
No, SD1.5 is still the best.
Anonymous No.105716363
>>105716346
your image isn't the usual image gen dimensions. make them divisible by 128px, e.g. 768x1280 or 1024x1024
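
A rough way to snap an input size to such dimensions before feeding it in. A minimal sketch: `snap_to_multiple` is a hypothetical helper, and the 128px multiple is taken from the post above, not from any Kontext documentation.

```python
def snap_to_multiple(width: int, height: int, multiple: int = 128) -> tuple[int, int]:
    """Round each side to the nearest multiple, never going below one multiple."""
    def snap(value: int) -> int:
        # Integer arithmetic avoids float rounding surprises (ties round up).
        return max(multiple, (value + multiple // 2) // multiple * multiple)

    return snap(width), snap(height)


# Example: an 800x1300 image lands on one of the sizes mentioned in the post.
new_width, new_height = snap_to_multiple(800, 1300)
```

Resize or crop the image to the returned size yourself. This only computes the numbers.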
Anonymous No.105716372
>>105715850
prompting like a real shizo, gj.
Anonymous No.105716379
>>105716320
make the image pencil outlines:
Anonymous No.105716385
>>105715850
you can crudely put clothes on her in ms paint and the ai will understand at like 0.5 noise
Anonymous No.105716412 >>105716450
>>105716335
it's way too slow for what it is. waiting 20+ seconds to give a giraffe sunglasses is too much. and that's ignoring all the lobotomization and artifacts
Anonymous No.105716415
change the image to be the style of a pixar movie.

calarts retards are finished
Anonymous No.105716433
make the image manga style in black and white colors. the girl is holding a green leek with her right hand.

well, it's the right hand on the picture, not her right hand. but still neat.
Anonymous No.105716436
>>105716189
it's pretty gimped, basically just inpainting without manual selection. "tell, don't show" but cursed.
>>105715850
one of the first things I did, too. the way it totally fucking ignores prompts, it's like a shy mormon boy that won't even look at a girl that might make him lust.
Anonymous No.105716450
>>105716412
You're saying that putting plastic skin and a Flux chin on a person isn't worth 20+ measly seconds ?

What are you, a russian bot ? I bet you voted for Trump and owns a Tesla!
Anonymous No.105716469 >>105716490 >>105716519
make a four panel comic with this character in manga style. each panel is black and white. In the first panel the girl is eating a cheeseburger. In the second panel the girl is eating a pizza. In the third panel the girl is drinking a glss of water. In the fourth panel she holds a sign saying "LDG".

so it understands a sequence of directions
Anonymous No.105716490
>>105716469
and there was a typo. new gen:
Anonymous No.105716495 >>105716967
>>105716148
how's the electricity bill anon?
Anonymous No.105716503
SD 1.5 sovl
Anonymous No.105716505
Wait wtf there is already Chroma v39? I got 38 like a few days ago.
Anonymous No.105716515
>>105716186
people will just remove the lora's and upload to a site that does not recognize their shitty license.
Anonymous No.105716519 >>105716528
>>105716469
Anonymous No.105716528 >>105716532 >>105716565
>>105716519
same default workflow?
Anonymous No.105716532 >>105716565
>>105716528
*also which model are you using
Anonymous No.105716537 >>105716584 >>105716652
>>105715510
I still can't get the damn thing to do text correctly, its meant to read: billions must colorize
Anonymous No.105716565 >>105716580
>>105716528
>>105716532
its 4o kek.
Anonymous No.105716572 >>105716592
>>105714855 (OP)
>neighbors:
>>>/vp/napt
we are your relevant related Ai image generation community ;3
Anonymous No.105716580 >>105716604
>>105716565
well im not paying scam altman for prompts
Anonymous No.105716584 >>105716652
>>105716537
>BILLIONS MUST CULTURIZE
Anonymous No.105716592 >>105716845
>>105716572
ywn
baw
Anonymous No.105716593 >>105717260
Does Chroma also use just euler + whatever or ?
Anonymous No.105716604
>>105716580
4o is free, flux kontext is a paid api
unless of course you're talking about the censored demo model that serves as little more than an ad for api nodes
Anonymous No.105716612
remove the red hair anime girl from the image.

asuka got stalin'd
Anonymous No.105716625
Chroma melting prompt elements into one is just my skill issue?
Anonymous No.105716649
how do you uncensor fux
Anonymous No.105716652 >>105716682
>>105716537
>>105716584
i've figured it out I think, either it's the fact i've changed the prompt to

text: "my text"

or it's because I'm using the exact resolution of the original image, gonna test both ways to figure out what I've been doing wrong. No idea why it's cutting half the image out, and it still has garbled text at the bottom, but I'm getting there. I just want this for the meme genning potential because it's fast enough for genning quick meme images alongside shitposting.
Anonymous No.105716657 >>105716829
pretty good free online generator:

https://perchance.org/ai-text-to-image-generator

it seems generally more coherent than what i can do locally with SDXL.
can anyone tell whats going on in the backend and how I can do the same with comfy
Anonymous No.105716675
Hmm, almost
Anonymous No.105716681
it's good at maintaining styles in edits like flux fill.
Anonymous No.105716682 >>105716693 >>105716727 >>105716740
>>105716652
ahahahaha i got it, you need to set the empty latent node to whatever, e.g. 1024x1024, and, importantly, set the ModelSamplingFlux node width and height to the reference image's original size. This time it did it correctly.
Anonymous No.105716693
>>105716682
can it do studio ghibli?
Anonymous No.105716703
>studio ghibli
>generic anime
>flux face
yea it's flux dev alrigh
Anonymous No.105716723
convert the image to manga style, in black and white. the shading is done in halftones.
Anonymous No.105716727 >>105716740
>>105716682
and i just tried this with this >>105714880 and it does not work, the text is garbled again, so you do need shift.
Anonymous No.105716729 >>105716749 >>105716767
>other ai porn creators follow me
>i dont follow back
Anonymous No.105716740 >>105716756
>>105716682
>>105716727
so let this be a lesson on why we need shift. but i suspect the comfy implementation is to blame here when its turning shit into plastic
Anonymous No.105716749 >>105716766
>>105716729
>anon perfects AI porn
priorities amirite?
Anonymous No.105716756 >>105716770
>>105716740
>but i suspect the comfy implementation is to blame here when its turning shit into plastic

What? Diffusers version doesn't have that? Can you post an example
Anonymous No.105716766
>>105716749
i unironically think god is testing me by letting me peer into the pit
Anonymous No.105716767 >>105716823
>>105716729
>don't follow this guy because he uses a default loraless model
>don't follow this guy because he has overbaked cfg
>don't follow this guy because he's literally zeroefforting using some shitty web service
>don't follow this guy because his lighting and shading settings make skin look like sand
Maybe I'm just autistic
Anonymous No.105716770
>>105716756
idk desu, a decent pony like cyber realistic does not do plastic skin, it actually gens decently realistic skin. There are examples where kontext does not gen plastic looking skin but only when you turn off shift.
Anonymous No.105716776 >>105716786 >>105716787
the chalkboard at the back of the room says "LDG" in white chalk.

it was empty before. did you notice the edit?
Anonymous No.105716786
>>105716776
it looks like you painted it in photoshop
Anonymous No.105716787 >>105716802 >>105716807
>>105716776
change the girl's clothes into a panda costume.
Anonymous No.105716802
>>105716787
manletification and boob removal complete
Anonymous No.105716807 >>105716825
>>105716787
the girl in the photo is sitting at a desk and writing with a pencil.

not bad
Anonymous No.105716808 >>105716830 >>105716968 >>105717253
https://xcancel.com/fofrAI/status/1938271690493927932
heh
Anonymous No.105716813
it literally turns characters into chibi style, what did they mean by this? :D
Anonymous No.105716821
I'm trying to use Chroma with NAG but I keep getting really bad results. It's all either overcooked or undercooked.
Anonymous No.105716823
>>105716767
>>don't follow this guy because he uses a default loraless model
I thought you meant he uses a base model with no loras which I was going to say BASED to until I read the rest of your post
Anonymous No.105716825
>>105716807
the girl in the photo is wearing black eyeglasses and is holding a beaker with bubbling green liquid. she is wearing a lab coat.
Anonymous No.105716829 >>105716872
>>105716657
>https://perchance.org/ai-text-to-image-generator
looks okay
Anonymous No.105716830 >>105717247
>>105716808
>2 reference images
ModelSamplingFlux stacking? I'm gonna have to try this.
Anonymous No.105716845
>>105716592
kek why do you think im trans?
im literally libertarian
if you MUST insult
at least research...
see ya >>>/g/sdg
;3
Anonymous No.105716848 >>105716874
just woke up, what's the verdict
and is gguf ready yet
Anonymous No.105716872 >>105716932
>>105716829
this would take a human 40 hours lmao
Anonymous No.105716874
>>105716848
https://huggingface.co/bullerwins/FLUX.1-Kontext-dev-GGUF

it's a fun toy, need to mess with it more
>test: OG miku source, "give the anime girl a ww2 nazi outfit and make the image black and white".
Anonymous No.105716881 >>105716906
replace the headline text with "stupid retards fund DEI game and go broke"

amazing. almost first try.
Anonymous No.105716906 >>105716914 >>105716925
>>105716881
remove the black woman with the red coat from the image.

also good at stalin photography, text based inpainting.
Anonymous No.105716914
>>105716906
the background lines being consistent is a neat touch as well, it's not just a black blob.
Anonymous No.105716918
Anonymous No.105716925
>>105716906
remove the black woman with the red coat from the image. Replace her with a sexy blonde woman with large breasts, in a black dress.
Anonymous No.105716932
>>105716872
or a chinese street artist 6 minutes
Anonymous No.105716934
remove the black woman with the red coat from the image. Replace her with a sexy blonde woman with large breasts, in a black dress. remove the black man with the teal coat from the image and replace them with Jackie Chan.

kek
Anonymous No.105716960
convert to an image in the style of van gogh:
Anonymous No.105716967
>>105716495
It's like $200/mo extra
Anonymous No.105716968 >>105717247 >>105717253
>>105716808
how is this done? the template workflow just has image stitch which is shit
Anonymous No.105716969
convert to a painting style with visible brushstrokes
Anonymous No.105716976
Anonymous No.105716980
convert to a painting with visible brushstrokes. Give the anime girl a red beret and change her shirt to a white blouse.

le Miku:
Anonymous No.105716983 >>105716991 >>105717000 >>105717026 >>105717041
Alright boys... real talk, what's the general requirements for training a Wan video lora? Is it comparable to XL lora training in terms of hardware? I'm pretty familiar with training those and embeds.
How do you go about training? Gather up videos, chop them up into tiny clips and caption them?
I think I've fucked around with makin vids enough and I wanna bite into some nitty gritty type shit. Hoping my lil 4070ti can chug along and train.
Anonymous No.105716988 >>105717001
>4070 ti
lol
Anonymous No.105716991 >>105717001
>>105716983
If you have to ask you don't have enough
Anonymous No.105716996
hmm, good effort?
Anonymous No.105717000
>>105716983
I wouldn't bother with less than 24 GB of VRAM especially if you want to do motion. If you're poor you're much better off just using Runpod because it only takes a few hours.
Anonymous No.105717001 >>105717027
>>105716988
;_;
I'm guessing that's a no go then.

>>105716991
Rip.
I just wanted to get some loras that actually fucking worked and weren't stupid shit.
Anonymous No.105717003 >>105717030
convert to a painting with visible brushstrokes. Give the anime girl black hair and a spiked neck collar. Change her shirt to be a Ramones music band tshirt.

neat punk Miku:
Anonymous No.105717026 >>105717062
>>105716983
I read that you can try it with videos using only 12GB vram inside of windows, not sure about linux but it was months ago at time of writing. just check civitai for the method, it will be somewhere under wan2.1
Anonymous No.105717027 >>105717062
>>105717001
what are you after
if it's cartoon shit forget it, otherwise I'm looking for ideas myself
Anonymous No.105717030
>>105717003
Give the anime girl red curly anime hair. Change her outfit into a white dress.
Anonymous No.105717033
>>105715769
I had some success changing "girl" for "character."
Anonymous No.105717041 >>105717055 >>105717062
>>105716983
>Hoping my lil 4070ti can chug along and train
You can generate videos with a 4070 ti no problem, but training video lora, that's not realistic.
Anonymous No.105717055 >>105717118
>>105717041
bullshit, people were doing it with 12GB cards months ago, takes a long time though, like 68 hours or more. I'd not bother which is why I haven't.

image loras, on the other hand, which really only work in a t2v setting, are an option.
Anonymous No.105717062 >>105717072 >>105717082
>>105717026
>I read that you can try it with videos using only 12GB vram
Wouldn't you only be able to train with videos if you were training a video lora?

>>105717027
No clue really. Mainly NSFW shit, fixing some issues I've been having with the general loras. Fluids turning into fucking firehoses is a big one. No matter how I prompt that shit, what are intended to be small drips turn into streams with enough pressure to send a bitch to orbit.

>>105717041
I knew it was a pipe dream, but I had to ask.
Anonymous No.105717071
Anonymous No.105717072 >>105717100
>>105717062
A 5090 is $0.89/hr on Runpod, most Loras can be trained in less than 5 hours.
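Quick back-of-the-envelope math for the numbers above, as a sketch. The rate and the 5 hour ceiling come from the post. The retry count is a made-up assumption to show what a few botched runs would cost.

```python
def rental_cost(rate_per_hour: float, hours: float) -> float:
    """Cost in dollars of renting a GPU for a given number of hours."""
    return rate_per_hour * hours


RATE_5090 = 0.89      # $/hr, figure quoted in the post
MAX_TRAIN_HOURS = 5   # "less than 5 hours" per the post, used as an upper bound
ASSUMED_ATTEMPTS = 3  # hypothetical: two failed runs plus one good one

one_run = rental_cost(RATE_5090, MAX_TRAIN_HOURS)
with_retries = one_run * ASSUMED_ATTEMPTS

if __name__ == "__main__":
    print(f"one run, upper bound: ${one_run:.2f}")
    print(f"{ASSUMED_ATTEMPTS} runs, upper bound: ${with_retries:.2f}")
```

So one full-length run tops out at under five bucks, and even a few fucked runs stay in the low teens.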
Anonymous No.105717079
Anonymous No.105717080 >>105717089
Anonymous No.105717081
this isnt the plushie I ordered, I want a refund, the boobs are missing
Anonymous No.105717082 >>105717100
>>105717062
>Wouldn't you only be able to train with videos if you were training a video lora?
no, it's not exactly what I meant, you can train wan loras with images, they just won't have any motion outside of wan's dataset, but you can combine them with motion loras.
Anonymous No.105717085
Anonymous No.105717086 >>105717087 >>105717088
> prompt image with kontext
> nothing changes

i..ok? but why.
Anonymous No.105717087
>>105717086
sounds like a skill issue
Anonymous No.105717088 >>105717092
>>105717086
did you try to prompt it with nsfw prompts? Then that is why.
Anonymous No.105717089
>>105717080
remove the plane in the background and replace it with a hot dog stand. A large sign saying "BIG GUYS" is above the hot dog stand.

this is a fun model desu
Anonymous No.105717092 >>105717097 >>105717099
>>105717088
"change the girl's expression to be happy"

i guess kontext is telling me she was already happy kek
Anonymous No.105717097 >>105717103
>>105717092
Avoid abstract concepts, use something like "smile" or "grinning" instead
Anonymous No.105717099
>>105717092
you vill not look happy and you vill be happy
Anonymous No.105717100 >>105717122 >>105717127
>>105717072
Yeah I know that's an option, but there's something better to me about fucking up a training session on my own hardware and the only cost is time, vs paying for time on another machine and then being out a few bucks if it's fucked. I know that's kind of stupid.
I know electricity costs money on my end, but I'm using the same amount of power doing other similarly intensive processes, so it's not really like I'm paying any more than I usually would.

>>105717082
Ohhh, I get what you mean, I think. So like you could train a character lora based on images, and it'll use Wan's dataset for motion for that character. Seems like something more meant for text2vid. I'm more looking at image2vid, t2v is neat, but it doesn't seem quite there like i2v is since you can start from something that is already 90% of the way there.
Anonymous No.105717103
>>105717097
ah. that's actually a fair point. fucking chroma and 1girl slop has ruined me.
Anonymous No.105717109
Anonymous No.105717118
>>105717055
>takes a long time though, like 68 hours or more.
Yeah, that's the not realistic part, as in training for 68 hours or more, and you typically don't get it right the first time and need to adjust parameters and possibly data set, and train again...

If you have endless patience, yes, as long as the model starts training on your hardware, you technically can train, technically you can train with a cpu instead of a gpu as well...
Anonymous No.105717122 >>105717127
>>105717100
>Ohhh, I get what you mean, I think. So like you could train a character lora based on images, and it'll use Wan's dataset for motion for that character. Seems like something more meant for text2vid. I'm more looking at image2vid, t2v is neat, but it doesn't seem quite there like i2v is since you can start from something that is already 90% of the way there.
yeah, then you can use a lora stack loader and combine nsfw loras with some image lora. this only works with the text-to-video model or perhaps even vace/phantom models. wan is very flexible with being able to mix models so experiment. but the secret sauce is learning how to take a person and use phantom to get the job done :-)
Anonymous No.105717127
>>105717122
>>105717100
i meant to say if you can learn to use phantom really well then you actually don't need a lora.
Anonymous No.105717129 >>105717150
Anonymous No.105717130
Change the background to a suit store, the man in the blue shirt is now wearing a black suit and red tie.
Anonymous No.105717131 >>105717143
what version of sageattention does the OP wan have?
Anonymous No.105717143
>>105717131
the only public version
Anonymous No.105717150
>>105717129
should use the blowjob lora desu
Anonymous No.105717164
>>105715435
do dsp blowing his brains out with a shotgun because he fuckin sucks
Anonymous No.105717171
Anonymous No.105717177 >>105717208 >>105717212
Do NAG and the other guidance methods work on Chroma?
Anonymous No.105717208 >>105717255 >>105717279
>>105717177
kek
Anonymous No.105717212
>>105717177
i don't think anyone has really done it and it probably needs the size of the matrices(?) changed to be compatible. anyone can go ahead and create a node for it.

i did find this
https://www.reddit.com/r/StableDiffusion/comments/1lkubao/anyways_yet_to_improve_infer_speed_on/
Anonymous No.105717231
more importantly why is chroma so fucking slow? when this flux based kontext model has been pleasurable to use in terms of speed.
Anonymous No.105717235
>https://github.com/ChenDarYen/ComfyUI-NAG
I don't get it. does nag not work for regular wan?
Anonymous No.105717236
Anonymous No.105717247 >>105717263 >>105717317 >>105717366 >>105717424
>>105716830
>>105716968
>I'm passing one image ref condition directly into another and seeing what happens
Anonymous No.105717253
>>105716808
>>105716968
this, i've been trying to figure it out. stacking the nodes causes oom, batching them only uses one image at a time, concatenating them does not work. though if i send them concatenated but only send one of them into the FluxKontextImageScale it does influence the image
Anonymous No.105717255
>>105717208
>needs another fucking snowflake node
>doesn't work if I hook up another NAG node
Anonymous No.105717260
>>105716593
res_multistep + sgm_uniform
Anonymous No.105717261 >>105717270
Has the dust settled?
Anonymous No.105717263
>>105717247
i tried that, this was giving me oom even when i scaled them down to 512x512 each, you might be doing it differently. do post any results if you get anything decent like what the X user posted.
Anonymous No.105717270
>>105717261
hang on
*blows*
not yet
Anonymous No.105717271 >>105717275 >>105717287
lumina just oozes with style, i think the base model was trained on a lot of art. that's the thing, china doesn't care about copyright one bit so they don't exclude things from the dataset. slop is a symptom of no art in the dataset
Anonymous No.105717275 >>105717291
>>105717271
There are hundreds of thousands of works of fine art that are in the public domain
Anonymous No.105717279
>>105717208
+20 seconds to gen time is crazy. Also 6 fingers in neg made her hide her other hand lmao.
Anonymous No.105717287
>>105717271
chinese have a much larger percentage who are into anime girl stuff so they include lots of them in the dataset
Anonymous No.105717291 >>105717296
>>105717275
then it would make sense for black forest labs models to be somewhat competent at replicating varied styles other than photorealism if it has those styles in the dataset?
why do so many models have the plastic sheen/sepia tint?
lumina is a model that seems to be exempt from the "plasticity" of flux despite its smaller size
Anonymous No.105717293 >>105717323 >>105717331 >>105717352
Replace the text "Concord will go offline" with "SHIT GAME GOES OFFLINE". The black woman with a red coat is holding a bucket of KFC chicken. The text "DEI STRIKES AGAIN" is at the bottom of the image.

it's neat you can do this without even doing inpaint masks.
Anonymous No.105717296
>>105717291
Because Flux is heavily lobotomized for "style" and given how many people praise it it clearly makes the Redditors happy.
Anonymous No.105717310 >>105717346 >>105717352
remove the four characters in the image and replace them with a sign in the floor that says "note: they were all fired".

we need a reference for types of prompts that kontext understands since there is so much possible stuff.
Anonymous No.105717317
>>105717247
i've done it i think, use the combine conditioning node perhaps? well it's doing something...
Anonymous No.105717323 >>105717335 >>105717352
>>105717293
replace the text at the top of the image. replace it with "why the fuck did you buy this shit?"

almost, but shows how effective the context based editing is. it also keeps the font style.
Anonymous No.105717331 >>105717335 >>105717337
>>105717293
do you really need ai to do this?
Anonymous No.105717335 >>105717346 >>105717352
>>105717323
there, we have a winner. also, even if you wanted to shoop it, you'd have to have a duplicate font. this allows you to dupe the font via AI knowing the style.
>>105717331
without AI i'd need the font IGN used to make it look authentic.
Anonymous No.105717337
>>105717331
For the record it would take longer than 20 seconds to do it in Photoshop
Anonymous No.105717346
>>105717335
*also to remove all the characters and make it look neat it'd take much more time than 30-40s.

also note >>105717310 as the removed characters are gone but the background lines are still consistent. i'd have to draw those in a shoop.
Anonymous No.105717350
Fresh

>>105717349
>>105717349
>>105717349

Fresh
Anonymous No.105717352
>>105717335
>>105717323
>>105717310
>>105717293
>M-M-MUH GAMES!!!
ew.
Anonymous No.105717366
>>105717247
yeah anon that causes oom if you do it that way, combining the conditioning does not oom. but that produces garbage but hang on because now i've found what causes the oom. The ModelSamplingFlux node will be the next thing to combine i'm sure.
Anonymous No.105717411 >>105717422
>>105713664
god damn
Anonymous No.105717422 >>105717430
>>105717411
what? its SHIT
Anonymous No.105717424
>>105717247
this setup seemed to be working the first few steps then it fucks up, so i know i'm close to the solution. i probably just need to change the prompt now and/or images and steps. i love to tinker with shit like this, it's how i find the best workflows desu.
Anonymous No.105717430
>>105717422
no anon, it's pure gold. gimme more foxes
Anonymous No.105717623 >>105717986
In the face of incredible leaps in local diffusion, being a vramlet sucks.
Anonymous No.105717986 >>105718003
>>105717623
What did you use to make this and how much vram?
Anonymous No.105718003
>>105717986

KiwiMix-XL V3, 8 GB VRAM.
Anonymous No.105718079
>>105714932
>qwen3 imagegen
>235b
kek no one is gonna run that
Anonymous No.105718098
>>105715183
I mean, simple editing shit like that can be done with inpainting. the most interesting thing about 4o and kontext is character reference, and kontext makes the skin plastic and can't keep the drawing style when it uses a character and makes it do something else, that's the problem