Anonymous No.107054044 [Report] >>107057813
/ldg/ - Local Diffusion General
Discussion of Free and Open Source Text-to-Image/Video Models

Prev: >>107049284

https://rentry.org/ldg-lazy-getting-started-guide

>UI
ComfyUI: https://github.com/comfyanonymous/ComfyUI
SwarmUI: https://github.com/mcmonkeyprojects/SwarmUI
re/Forge/Classic/Neo: https://rentry.org/ldg-lazy-getting-started-guide#reforgeclassicneo
SD.Next: https://github.com/vladmandic/sdnext
Wan2GP: https://github.com/deepbeepmeep/Wan2GP

>Checkpoints, LoRAs, Upscalers, & Workflows
https://civitai.com
https://civitaiarchive.com/
https://openmodeldb.info
https://openart.ai/workflows

>Tuning
https://github.com/spacepxl/demystifying-sd-finetuning
https://github.com/Nerogar/OneTrainer
https://github.com/kohya-ss/sd-scripts
https://github.com/tdrussell/diffusion-pipe

>WanX
https://comfyanonymous.github.io/ComfyUI_examples/wan22/
https://github.com/Wan-Video

>Neta Yume (Lumina 2)
https://civitai.com/models/1790792?modelVersionId=2298660
https://nieta-art.feishu.cn/wiki/RY3GwpT59icIQlkWXEfcCqIMnQd
https://gumgum10.github.io/gumgum.github.io/
https://neta-lumina-style.tz03.xyz/
https://huggingface.co/neta-art/Neta-Lumina

>Chroma
https://huggingface.co/lodestones/Chroma1-Base
Training: https://rentry.org/mvu52t46

>Illustrious
1girl and Beyond: https://rentry.org/comfyui_guide_1girl
Tag Explorer: https://tagexplorer.github.io/

>Misc
Local Model Meta: https://rentry.org/localmodelsmeta
Share Metadata: https://catbox.moe | https://litterbox.catbox.moe/
GPU Benchmarks: https://chimolog.co/bto-gpu-stable-diffusion-specs/
Img2Prompt: https://huggingface.co/spaces/fancyfeast/joy-caption-beta-one
Txt2Img Plugin: https://github.com/Acly/krita-ai-diffusion
Archive: https://rentry.org/sdg-link
Bakery: https://rentry.org/ldgcollage

>Neighbors
>>>/aco/csdg
>>>/b/degen
>>>/b/realistic+parody
>>>/gif/vdg
>>>/d/ddg
>>>/e/edg
>>>/h/hdg
>>>/trash/slop
>>>/vt/vtai
>>>/u/udg

>Local Text
>>>/g/lmg

>Maintain Thread Quality
https://rentry.org/debo
Anonymous No.107054061 [Report] >>107054106 >>107054121 >>107054221
Imagine the kino level of shitpost if we really get suno 4.5 at home
Anonymous No.107054091 [Report]
>all chroma shit
>probably own gens too
kys OP frfr
Anonymous No.107054106 [Report] >>107054110
>>107054061
>rainbow
>woman avatar
....... is he cooking?
Anonymous No.107054110 [Report]
>>107054106
>woman
I think it's a man avatar, the hair is short
Anonymous No.107054121 [Report] >>107054155
>>107054061
If it was so easy suno would have been as good as udio long ago, but udio was always better, so I have my doubts that local can be as good.
Anonymous No.107054137 [Report]
>>107054015
gonna depend on the model, lora strength, captions etc. this is generally the reason why you use nonstandard words to invoke the lora to avoid confusion in the model.
Anonymous No.107054151 [Report]
Anonymous No.107054155 [Report]
>>107054121
this, suno is overrated as fuck, they always pretend it's at the same level as udio when it's definitely not
Anonymous No.107054162 [Report]
>caring about the fagollage
Anonymous No.107054175 [Report]
Pay debo no mind he's disabled
Anonymous No.107054183 [Report] >>107054206 >>107054244
>>107054047
https://vocaroo.com/1lhI4LNQojvT
Udio 1.0 of course.
Anonymous No.107054206 [Report] >>107054450
>>107054183
udio is amazing desu
Anonymous No.107054221 [Report] >>107054248
>>107054061
I listened to his samples, they are struggling to even reach the quality of YuE.
Anonymous No.107054227 [Report] >>107054322 >>107054339 >>107054350 >>107054367
So wan q8 gguf is like 95% as good as fp16 and a little faster?
Anonymous No.107054244 [Report]
>>107054183
meh
Anonymous No.107054248 [Report] >>107057648
>>107054221
show some of those samples here anon, I don't want to go to trooncord
Anonymous No.107054322 [Report]
>>107054227
with a little optimism and cope
Anonymous No.107054339 [Report]
>>107054227
yeah, the quality is really equivalent and it's 2x lighter in terms of size
Anonymous No.107054350 [Report]
>>107054227
I get almost the same speed with fp16 vs q8 on a 5090, I just block swap half the model.
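To spell out what block swap does: layers get streamed between CPU and GPU during the forward pass. A minimal sketch of the idea (names are made up; real implementations like kijai's wrapper hide the copies with pinned memory and async streams):

    import torch

    def forward_with_block_swap(blocks, x, blocks_to_swap):
        # keep the early blocks resident in VRAM; stream the rest in on demand
        threshold = len(blocks) - blocks_to_swap
        for i, block in enumerate(blocks):
            if i >= threshold:
                block.to("cuda")   # pull the offloaded block into VRAM
            x = block(x)
            if i >= threshold:
                block.to("cpu")    # evict it again so the next one fits
        return x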
Anonymous No.107054367 [Report]
>>107054227
Half the precision, half as good
But being able to run it makes you twice as blind
Anonymous No.107054430 [Report]
Anonymous No.107054444 [Report] >>107054448 >>107054509 >>107054550 >>107056231
Contrastive flow matching is tight.
Anonymous No.107054448 [Report] >>107054492
>>107054444
>Contrastive flow matching
what is that?
Anonymous No.107054450 [Report]
>>107054206
Yes, I really want to hope that local will catch up but that seems like a leap from nothing, not even SD, to a Dalle 3 tier music model. High quality manually captioned audio data is probably a must for such results, and then a really good DPO process.
Anonymous No.107054482 [Report] >>107054512 >>107054531 >>107054564 >>107054629 >>107056057 >>107057615
opensource emu3.5 with 32b, which according to the authors is supposed to be superior to nano banana in every way.
Looking at the sample images, I have my doubts about that.
Anonymous No.107054492 [Report] >>107054498
>>107054448
It's a new version of flow matching that encourages the model to find unique paths, which speeds up convergence, gives sharper results because it's not blending paths, and also encourages diverse results (because paths aren't blended).
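Roughly, a minimal sketch of the objective as I understand the paper (function names and the lambda value are mine, not canonical) - the negatives are just the velocities of mismatched noise/data pairs built by shuffling the batch:

    import torch
    import torch.nn.functional as F

    def contrastive_fm_loss(model, x0, x1, t, lam=0.05):
        # x0: noise batch, x1: data batch, t: timesteps in [0,1] shaped (B,1,1,1)
        xt = (1 - t) * x0 + t * x1        # point on the straight path
        target = x1 - x0                  # true velocity for this pairing
        pred = model(xt, t)
        fm_term = F.mse_loss(pred, target)
        # contrastive term: push the prediction away from wrong pairings
        neg_target = x1[torch.randperm(x1.size(0))] - x0
        contrast_term = F.mse_loss(pred, neg_target)
        return fm_term - lam * contrast_term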
Anonymous No.107054498 [Report] >>107054508
>>107054492
I see, and I guess you're using that method to make a lora right?
Anonymous No.107054508 [Report] >>107054518 >>107054524 >>107054530 >>107056231
>>107054498
I'm using the method to train a 1.5B model from scratch.
Anonymous No.107054509 [Report]
>>107054444
test finetune or what
Anonymous No.107054512 [Report]
>>107054482
its chinkslop. if its not good they lie and say it is. if it is good they make you pay for it. the good thing about china isnt that its better, its that its cheaper
Anonymous No.107054518 [Report] >>107054542
>>107054508
Tell us more. This sounds interesting.
Anonymous No.107054524 [Report] >>107054542
>>107054508
nice, how much faster is it compared to the previous method?
Anonymous No.107054530 [Report]
>>107054508
i believe in bigma
Anonymous No.107054531 [Report] >>107054546 >>107054547
>>107054482
even if it's true, 32b is just too big
Anonymous No.107054542 [Report] >>107054591 >>107054636 >>107056231
>>107054518
It's just a revision of the Pixart model I'm working on. 1.5B, Pixart architecture with the HDM mlp, and Ostris's 16 channel VAE.

>>107054524
Insanely fast compared to MSE, dropping like 0.02 loss per day, which means a full from-scratch model on a single 5090 in 50 days.
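(napkin math: that's a ~1.0 total loss drop at 0.02/day ≈ 50 days, assuming the rate holds linearly, which it won't exactly)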
Anonymous No.107054546 [Report]
>>107054531
>is just too big
Anonymous No.107054547 [Report]
>>107054531
32b for a model claiming to do everything they say it can is quite impressive actually. the problem is benchmarks don’t hold up against reality.
Anonymous No.107054550 [Report]
>>107054444
Based quads
Anonymous No.107054564 [Report] >>107054576 >>107054595
>>107054482
>according to the authors
And why should I believe authors this time? They lie as often as the common whore.
Anonymous No.107054576 [Report]
>>107054564
>And why should I believe authors this time?
you should never believe them, like everything you test it out yourself and see that 95% of the time it's a big nothingburger
Anonymous No.107054591 [Report] >>107054636
>>107054542
Pretty cool tinkering. What sort of database are you using?
Anonymous No.107054595 [Report]
>>107054564
My text consisted of two parts. Why are you asking me something that I answer in the second sentence?
Anonymous No.107054629 [Report] >>107054657
>>107054482
It's unfortunate that all we get from the Chinese are models on par with Seedream, but they are made out to be something more. I'll give them props for catching up to Seedream by using Seedream-based synthetic slop in conjunction with 4o though.
Anonymous No.107054636 [Report] >>107054642 >>107054706 >>107054725 >>107054938 >>107056231
>>107054542
Sexy loss graph. The new training run is switching from a 3d perlin noise Automagic warm up to AdamW.

>>107054591
I scraped millions of images from duckduckgo but I also have e621, danbooru and gelbooru (200k images). Then a lot of let's plays from Youtube for games. A few movie screencaps from stuff I have / famous movies. It's generally pop culture and art centered.
Anonymous No.107054642 [Report] >>107054659
>>107054636
so like the dark blue was the normal method and the light blue is the contrastive flow thing?
Anonymous No.107054655 [Report]
Weird how no local models can break into the top 10 anymore in arenas. Shame they fell so far behind
Anonymous No.107054657 [Report] >>107054907
>>107054629
Arguably, Qwen already did that though. We are stuck getting the same model over and over again. One slightly more fancy than the previous.
Anonymous No.107054659 [Report]
>>107054642
No, dark blue was me experimenting with an Automagic idea I had, which basically was activating layers and parameters with 3d perlin noise, much like how wind is simulated in a video game. Light blue is AdamW, which is much faster than the other optimizer, but I think the Automagic warmup was probably helpful, especially for forcing the layers and parameters to be randomly and usefully activated.
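No idea if this matches the actual Automagic variant, but the gist of noise-gated updates can be faked in a few lines (plain sine gating standing in for real 3d perlin noise, purely illustrative):

    import torch

    def noise_gated_step(params, lr, step):
        # crude stand-in: a smooth mask that drifts over steps, so different
        # parameter regions get activated/updated at different times
        for p in params:
            if p.grad is None:
                continue
            phase = torch.linspace(0, 6.28, p.numel(), device=p.device)
            mask = (torch.sin(phase + 0.01 * step) > 0).float().view_as(p)
            p.data.add_(p.grad * mask, alpha=-lr)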
Anonymous No.107054706 [Report] >>107054715 >>107054947
>>107054636
>I scraped millions of images from duckduckgo
Yandex used to be so good for this. Now it barely works as image search.
Anonymous No.107054715 [Report] >>107054727
>>107054706
Unfortunately all search results are fucking garbage now because AI bullshit is everywhere. I scraped early last year before the flood.
Anonymous No.107054725 [Report] >>107054736 >>107054902
>>107054636
retard here, what exactly does loss mean in this context?
Anonymous No.107054727 [Report]
>>107054715
I think to be sure you should scrape every image from before 2022, the last year before the AI flood
Anonymous No.107054736 [Report] >>107054980
>>107054725
the loss is basically the error between the image you trained on and the model's recreation of it; the goal is to get the lowest loss possible
Anonymous No.107054798 [Report]
TTS with voice cloning capabilities that you can set up using docker compose with the relevant options for your system; it comes with an API and GUI set up too.

https://github.com/devnen/Chatterbox-TTS-Server
Anonymous No.107054902 [Report] >>107054980 >>107055036
>>107054725
0 loss means the model’s prediction perfectly matches the objective, denoising an image or finding a flow. In practice this is catastrophic failure and memorization if you ever get to 0. Normally for diffusion models (e.g. SDXL) loss is how well the image is denoised. Models like Flux have loss based on finding paths in latent space to the image. Contrastive flow adds an additional objective that paths must be unique.
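For the SDXL case that's literally just noise-prediction MSE. A minimal sketch (schedule details omitted, eps-prediction assumed, names are mine):

    import torch
    import torch.nn.functional as F

    def denoising_loss(model, x0, t, alphas_cumprod):
        # add noise at strength t, then grade how well the model predicts it;
        # alphas_cumprod is the 1D schedule tensor indexed by integer timestep
        noise = torch.randn_like(x0)
        a = alphas_cumprod[t].view(-1, 1, 1, 1)
        xt = a.sqrt() * x0 + (1 - a).sqrt() * noise
        return F.mse_loss(model(xt, t), noise)   # hits 0 only on perfect recall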
Anonymous No.107054907 [Report]
>>107054657
you forgot slower every iteration
Anonymous No.107054935 [Report] >>107057278 >>107057327
Hello people, it's my first time using chroma. How can I get my pics to be higher quality? Higher steps or more negative prompts?
Anonymous No.107054938 [Report]
>>107054636
Cool stuff. It's a great learning experience.
Anonymous No.107054947 [Report] >>107055017
>>107054706
Yeah, quite sad what they did to it.
Anonymous No.107054969 [Report] >>107055046
>>107053778
3.5 Medium was stronger at the top end of its resolution range than the lower end TBQH, like generating at e.g. 1216x1600 would very often be way more coherent and better looking than 832x1216 on the same seed. So for a high res use case like this it might make sense especially if they've tuned it any past the base model. Attached pic is a native 1280x1536 one-shot Medium gen, for example.
Anonymous No.107054980 [Report] >>107055016
>>107054736
>>107054902
i see
interesting and makes me wish i wasn't a retard
Anonymous No.107055016 [Report]
>>107054980
It's not that complicated and all the smart people already did the math for you. I'm half a retard and I just experiment with bleeding edge research other people have already done. Really to get into any of this you just have to drop into it and be willing to sweat, a lot of this shit is just tedious churning, especially captioning. For example I'm working on finetuning Joycaption to be better which requires handcaptioning thousands of images.
Anonymous No.107055017 [Report]
>>107054947
It's an AI-filtered walled garden.
Speculation-wise, maybe Cuckflare and others restricted their web crawlers because of the geopolitical incidents. It's okay because Google can leech everything.
Anonymous No.107055036 [Report]
>>107054902
>Contrastive flow adds an additional objective that paths must be unique.
that's quite a smart idea when you think about it, it forces the model to not be lazy and work on every edge case
Anonymous No.107055043 [Report]
Anonymous No.107055046 [Report] >>107055061 >>107056557
>>107054969
And this one's 1600x1216
Anonymous No.107055061 [Report] >>107055096 >>107055103
>>107055046
illustrious is the peak of local diffusion. no other model comes close in terms of character knowledge, style knowledge, concept knowledge, and overall fidelity. Chroma is a disgusting blurry mangled mess, neta knows a fraction of the styles, qwen is a bloated stopgap that can’t even compete with seedream 3. SDXL is an absolute triumph and will likely not be surpassed for years
Anonymous No.107055077 [Report] >>107055080 >>107055697
Anonymous No.107055080 [Report]
>>107055077
depends on the scope of the finetune dataset. they'll probably manage to make the girls/boys hotter, among some other things. it's probably fixing the biggest popular issue in a bunch of months or so?

idk if anyone will get enough compute to train the boorus, real fashion/nsfw collections, cjk idols and so on.
Anonymous No.107055096 [Report] >>107055100
>>107055061
(You)
Anonymous No.107055100 [Report]
>>107055096
i think people actually will start to finetune it with ramtorch or w/e. it'll likely be quite slow.
Anonymous No.107055103 [Report] >>107055110 >>107056262
>>107055061
There are multiple small model projects now that have proven you can train a from-scratch DiT model for a couple thousand dollars. The fact is many people don't want to stick their necks out, especially if they're in North America or Europe.
Anonymous No.107055110 [Report] >>107055149
>>107055103
Not really. Multigpu is mainly used for higher batch sizes, not learning rate. A single gpu would only go ~4 times slower than the typical setup, and you would only rake in more donations by going slow and doing incremental releases. Recent papers have proven that with modern optimizers the results are no worse too.
Anonymous No.107055149 [Report] >>107055370 >>107056262 >>107057925
>>107055110
That's not what I said. There's nothing stopping you from taking HDM and scaling it 4x. The AMD 340m toy model was trained in 1.5 days. Pixart 600m is better than SD 1.5 and on par with SDXL. So you can assume a modern 2B DiT model could be trained in less than 30 days and be SOTA as a booru model.
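(napkin math: 2B is ~6x the 340m toy model's params, so ~6 x 1.5 days ≈ 9 days at the same data scale; the rest of the 30-day budget is headroom for a much bigger dataset and more steps. rough linear-scaling assumption, obviously)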
Anonymous No.107055167 [Report] >>107055184
>>107053799
>Udio partnership with UMG

https://desuarchive.org/g/thread/106957370/#106958310

So it has begun. First they will figure out which model is best, give their artists exclusive access to this model, and then give the general public a watered down version of the model, if the public gets a version at all.
Anonymous No.107055177 [Report] >>107057255
Sage attention 3 when.
Anonymous No.107055184 [Report]
>>107055167
Remember, Udio is no joke compared to everyone else. Going after them specifically is very strategic.
Anonymous No.107055277 [Report] >>107055283 >>107055541
Finished baking male vocals version of:
>>107053777
Lyrics I posted here already:
>>107053946
https://voca.ro/1muX2AJkOy1P
Anonymous No.107055283 [Report]
>>107055277
cringe
im waiting for acestep 1.5
Anonymous No.107055288 [Report] >>107055297 >>107055412 >>107055603
>"Udio is da best!!!11!!1"
Reality is picrel
You can make direct comparisons yourselves in the Music Arena if you want
Udio is only "better" if you are into gacha and generate parts of the song multiple times until it does something good, and Suno allows you to do something like that as well
Anonymous No.107055297 [Report] >>107058213
>>107055288
>mememarks
>udio v1.5
Anonymous No.107055321 [Report] >>107055359 >>107055483 >>107055531 >>107055650
I haven't been in the game for like a year now and I'm starting to plan a full system upgrade. Why is there no normal middle ground between 16GB and 32GB VRAM cards, nvidia?
I have a 2060 super, 8GB vram. My question is, assuming I just go for a 5080 or something for the 16GB, could I use it together with the 2060 to add up the VRAM and leave the rest to regular RAM? Or should I not bother and just make do with a 5080?
I could afford a 5090, I just can't help but feel like I'm being ripped off, and honestly more than anything else I'm worried about it melting and exploding or something. Is undervolting/underclocking a good idea?
Anonymous No.107055359 [Report]
>>107055321
Im retarded and forgot to mention that I want to get into video generation. the pastebin doesn't go too much into detail on multi-card drifting, for what it's worth
Anonymous No.107055370 [Report] >>107055524
>>107055149
I think the best possible way to train an anime model would be to first train your own captioner model that took tag lists along with an image as input, and interleaved the tags into proper sentences (in a way that would try to be grammatically correct but not necessarily to a fault) based on what it could actually see, while also adding spatial information where it could and where it made sense. Then you could just run that model on the Danbooru dataset directly with the original accurate tag lists for each image.

I'd also pick just a maximum resolution and proportionally downscale larger images to that if needed, but not *upscale* anything whatsoever, rather just bucketing everything at as close to the original upload res as possible. So the end result would be basically a fairly robust mixed-res model that could coherently do a wide range of resolutions rather than just focusing on one range.
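The downscale-only bucket picker is a few lines if anyone wants the gist (bucket list format and names are made up):

    def pick_bucket(w, h, buckets):
        # only consider buckets the source can fill without upscaling;
        # fall back to the smallest bucket for tiny images
        fits = [(bw, bh) for bw, bh in buckets if bw <= w and bh <= h]
        pool = fits or [min(buckets, key=lambda b: b[0] * b[1])]
        # pick the candidate whose aspect ratio is closest to the source
        return min(pool, key=lambda b: abs(b[0] / b[1] - w / h))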
Anonymous No.107055412 [Report] >>107055603
>>107055288
Udio 1.0 was best by far. They neutered the model after that. I've never seen a Suno gen on par with Udio 1.0 composition wise.
Anonymous No.107055418 [Report]
Anonymous No.107055483 [Report]
>>107055321
If you're patient you could wait for the 50 series super refresh as the 5080 super is supposed to have 24gb vram. Can't speak to multi gpu use, but 8gb seems rather abysmal and not worth the hassle especially when you consider you can offload to system memory.
As for the 5090, I have one and I've undervolted, overclocked it and have capped the power at 80% (460W) without any issues. In any case make sure to get at least 64gb of system ram if you're gonna gen videos.
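fwiw if you're on linux, the cap part is just something like: sudo nvidia-smi -pl 460 (doesn't survive a reboot without persistence mode; nvidia-smi -q -d POWER shows the allowed range). Undervolting proper needs a curve editor on Windows, or clock locking via nvidia-smi -lgc on linux as a rough equivalent.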
Anonymous No.107055524 [Report] >>107055797
>>107055370
How I'd do it is have the same image with three different captions:
- tags as seen on the booru site
- short description
- long description

Each caption really is a supported way for a user to prompt the model, and the model will naturally learn how to mix the different caption types. The problem we've now seen multiple times is people training on caption blob balls and forcing the model to be reliant on long captions if you want maximum output quality.
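If anyone wants to replicate it, this is basically just serving one of the parallel captions per image at load time (sketch with made-up field names; duplicating the image once per caption type amounts to the same thing):

    import random

    def pick_caption(entry):
        # every image carries all three caption styles; serve a random one
        # so the model stays promptable with tags, short or long text
        return entry[random.choice(["tags", "short", "long"])]

    entry = {
        "image": "12345.png",
        "tags": "1girl, straw hat, field, smile",
        "short": "a smiling girl in a straw hat standing in a field",
        "long": "a girl wearing a wide-brimmed straw hat stands in a sunlit field, smiling at the viewer...",
    }
    caption = pick_caption(entry)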
Anonymous No.107055531 [Report]
>>107055321
There's really no downside to undervolting a 5090: you lose ~5% performance and reduce power draw by ~30%.
Anonymous No.107055541 [Report] >>107055599
>>107055277
Bridges and outro are pretty weak https://voca.ro/12sNX01jyU6M
Anonymous No.107055599 [Report]
>>107055541
It's still pretty good if it's local.
E.g. in terms of slop.
Anonymous No.107055603 [Report] >>107055685 >>107056111
>>107055412
>>107055288
Suno very likely is trained on royalty-free music libraries like Audiojungle, which is why it sounds worse but also more "polished". I'm guessing Udio was trained on more copyrighted music even before the UMG deal, so it's more random but gives more interesting results.
Anonymous No.107055650 [Report]
>>107055321
3090's are pretty cheap
Anonymous No.107055685 [Report]
>>107055603
Old Udio was overfit on copyrighted stuff, if you input the same tags and lyrics some tracks had, you'd get nearly identical outputs
Anonymous No.107055697 [Report] >>107055752
>>107055077
Anonymous No.107055713 [Report] >>107056586
>>>/pol/520205588
How do I do this on my laptop?
Anonymous No.107055729 [Report]
>https://huggingface.co/nvidia/ChronoEdit-14B-Diffusers
anyone tried the new nvidia edit model?
Anonymous No.107055752 [Report] >>107056080
>>107055697
Anonymous No.107055797 [Report] >>107055828
>>107055524
I think your way might work if you literally swapped out the sets of captions for each image between epochs, probably better than slapping them all in one caption file
Anonymous No.107055826 [Report] >>107055860 >>107055951
Anonymous No.107055828 [Report]
>>107055797
That's what I mean, you duplicate the image for each caption type. And if you practice VAE jitter you prevent memorization.
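For anyone wondering, the jitter trick just means the cached latents get perturbed a little so the exact same target never repeats. A minimal sketch (sigma is arbitrary, and sampling the VAE encoder's posterior instead of taking its mean accomplishes the same thing):

    import torch

    def jitter_latents(latents, sigma=0.02):
        # tiny gaussian perturbation of the cached VAE latents; breaks exact
        # repetition of targets, which is what drives memorization
        return latents + sigma * torch.randn_like(latents)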
Anonymous No.107055860 [Report]
>>107055826
Nice. Maybe more... try dynamic angle, rim light. Really good.
Anonymous No.107055921 [Report] >>107056239
Anonymous No.107055946 [Report]
does infinite talk work with wan2.2?
Anonymous No.107055951 [Report] >>107055982 >>107056008
>>107055826
Anonymous No.107055982 [Report] >>107056008
>>107055951
Love it. What's the model? I should try and gen something related to this.
Anonymous No.107056008 [Report] >>107056018
>>107055951
>>107055982
Qwen
Anonymous No.107056018 [Report]
>>107056008
Cinematic Redmond had these vibes. Cool that Qwen can be grainy too. Of course it's probably pretty stiff but that's what they all are.
Anonymous No.107056057 [Report]
>>107054482
>emu3.5
HF links are all dead
I can test it if the models are available somewhere
Anonymous No.107056080 [Report]
>>107055752
I can't make her sit on the pig, but it's still funny
Anonymous No.107056110 [Report]
Is there a way for Librewolf (flatpak) to remember its last directory? In Linux Mint.
It's somewhat tiring to use ComfyUI and I need to open a file dialog to traverse all the way up from /home/ to my work mount...
Anonymous No.107056111 [Report] >>107056121
>>107055603
I don't think they're prompted the same. Udio has a better understanding of music, that shows because with a good prompt it destroys almost any Suno song I've ever heard

https://www.udio.com/songs/2bXYLKaVDyVwi1GAb6pSkR

This is a very hard song
https://www.udio.com/songs/7zrLreMnwCYrdBqQkGtEXM

The musical depth I've witnessed out of this model truly is insane. Unprecedented connection between lyrics and musical notes. It has mastered vocals and intonation in a way Suno has not.

Using high quality copyrighted music in conjunction with whatever royalty-free music is available for training the model is the way to go.
Anonymous No.107056121 [Report] >>107056281
>>107056111
Yeah, that's just subjective to people who have never played any instrument in their lives.
Anonymous No.107056151 [Report] >>107056670
Do any of you have a recommendation for generating videos for a music video? I'm looking for 16:9, some kind of 35mm grain/look, mostly still shots but with some travelling too. Theme is urban 90's/2000's workers working daily shifts.

I only know of Veo 3 so far, and looking at Runway.
Anonymous No.107056217 [Report]
so has the copyrightpocalypse finally started
Anonymous No.107056231 [Report] >>107056241
>>107054444
>>107054508
>>107054542
>>107054636
Are you using TREAD like HDM too? And have you considered going VAE-less using SVG (or at least using EQ-VAE like HDM)? https://arxiv.org/pdf/2510.15301

if not, you are missing out on huge speedups
Anonymous No.107056239 [Report] >>107056256
>>107055921
this is a very nice gen
what model did you use?
Anonymous No.107056241 [Report]
>>107056231
TREAD is much harder to implement if you want the real speed up. The 16-channel VAE I'm using has been EQ'd yeah.
Anonymous No.107056256 [Report] >>107056349
>>107056239
NovaOrangleXL_v120
I'm just testing my linux installation, I deleted all my previous models. Don't have noob or anything else.
Anonymous No.107056262 [Report]
>>107055149
>>107055103
THIS, we are on the cusp of home baked local SOTA.
Anonymous No.107056281 [Report] >>107056305
>>107056121
Udio literally just got acquired by the largest music label. That should tell you all you need to know.
Anonymous No.107056305 [Report]
>>107056281
I don't know, really.
Anonymous No.107056317 [Report]
Test
Anonymous No.107056339 [Report]
Anonymous No.107056342 [Report]
Anonymous No.107056349 [Report] >>107056389
>>107056256
thank you anon, hows your linux experience going?
Anonymous No.107056375 [Report] >>107056415 >>107056597
Yume hands do work but you have to describe them in the prompt, treat it more like an LLM
Anonymous No.107056376 [Report]
/iemg/ lore, you wouldnt get it
Anonymous No.107056382 [Report]
Anonymous No.107056389 [Report] >>107056405 >>107056432
>>107056349
Yeah, well, I'm an experienced faggot but I wouldn't recommend it to normal people. Even with the most common interfaces, it's been 20 years and they still can't get a file save dialog right.
I have used IRIX and it never had these issues.
Like save a file from Cum and it defaults to some ~/.
Open a file...
It's great if you are a developer but for a normal person just use Windows.
I feel like Linux environments have gone backwards since I last used them 15 years ago.
Anonymous No.107056405 [Report] >>107056432
>>107056389
Flatpak browser does not remember the save file location from Cum.
This is what I mean.
I need to browse in 5+ deep to just get to the directory I want.
Anonymous No.107056410 [Report]
Anonymous No.107056412 [Report] >>107056440
remember that guy that was training a model and said his image of a brown splotch for the prompt "a woman" was 80% of the way there?
Anonymous No.107056415 [Report] >>107056421
>>107056375
Okay, then what's the optimal total prompt length in your experience?
Anonymous No.107056421 [Report] >>107056455 >>107056490
Text can be consistent with small phrases from the looks of it, and it is really sensitive to artist. The wrong tag will completely fuck everything, which lends credence to it needing more training.
>>107056415
I never pay attention to that; it doesn't seem to matter from my testing.
Anonymous No.107056432 [Report] >>107056441
>>107056405
>>107056389
im a linux user, what distro r u on?
im on debian and the default file save dialog in brave/mullvad remembers locations (actually not sure about mullvad because it has crazy settings, but firefox did work) for extensions
what environment are u using? i use dwm so less ram/vram is used
Anonymous No.107056440 [Report] >>107056451
>>107056412
how's your dataset going? still 0%?
Anonymous No.107056441 [Report] >>107056458
>>107056432
Yeah, comparing distros is like comparing dicks. I think I'm using Librewolf and it's a flatpak - this explains why it does not remember the directories.
Anonymous No.107056451 [Report] >>107056465
>>107056440
how's your training going 2 years later? still at 80% there?
Anonymous No.107056455 [Report]
>>107056421
>hands do work but you have to describe them in the prompt
Sounded like your approach is to really bloat the prompt with specifics, but I guess I misunderstood.
Anonymous No.107056458 [Report] >>107056491
>>107056441
try firefox or brave, or look for settings to disable forgetting save directory in about:config
flatpak is likely the issue because of muh sandboxing
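you can also punch a hole in the sandbox so the portal picker can start where you want, something like: flatpak override --user --filesystem=/path/to/your/workdir io.gitlab.librewolf-community (check the exact app id with flatpak list). no guarantee it fixes the remembering, but at least the dir becomes reachable in one click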
Anonymous No.107056465 [Report] >>107056476
>>107056451
it doesn't take much philosophy to understand you don't do anything and thus won't achieve anything, don't put your insecurities of failure on me, thanks :)
Anonymous No.107056476 [Report] >>107056497
>>107056465
yup, still 80% there confirmed lmao
Anonymous No.107056490 [Report] >>107056511
>>107056421
do you switch the system prompt like how the guide says to help with text or? i have yet to really fuck with text on it
Anonymous No.107056491 [Report] >>107056620
>>107056458
Nah, your advice is just like any of the useless non-tech advice - changing distro or even browser does not accomplish anything. If it works it works, and if it does not there is a way to fix it, but sure as hell it is not by reinstalling my disks.
Anonymous No.107056497 [Report] >>107056518
>>107056476
actually that doesn't mean anything, for all you know I've already released a model, but what we both know is in 2 years you don't have anything except a bitter attitude
it's truly funny I'm living rent free in your brain though
Anonymous No.107056511 [Report]
>>107056490
I treat it like chroma, and I also use my own system prompt. I'll look into the guide again, but pigeonholing it to anime only doesn't do much for me
Anonymous No.107056518 [Report] >>107056537
>>107056497
for all anyone knows you haven't released anything lmao
Anonymous No.107056525 [Report]
Anonymous No.107056537 [Report] >>107056560
>>107056518
Feel free to explain why anyone would ever attach any of their professional work to 4chan. No one releasing a model with their name attached to it would ever link it on 4chan if they wanted to be taken seriously.
Anonymous No.107056557 [Report]
>>107055046
One more
Anonymous No.107056560 [Report] >>107056567
>>107056537
how convenient
my locally trained 300B model is going great too
Anonymous No.107056567 [Report] >>107056585
>>107056560
The only thing you're developing is suicidal thoughts.
Anonymous No.107056581 [Report] >>107056596 >>107056650
Anonymous No.107056585 [Report]
>>107056567
let's see a 1girl gen, bro, I'm sure it will be great bro, two years of improvement bro
Anonymous No.107056586 [Report]
>>107055713
This is brilliant desu. Now they just need to make it uncensored so that a guy jerking off shows a girl fingering her pussy. Then bye bye thots, any guy can become an OnlyFans whore.
Anonymous No.107056596 [Report] >>107056640
>>107056581
Great stuff.
Anonymous No.107056597 [Report] >>107056603 >>107056649 >>107057111
>>107056375
That's not really true at all for Yume 3.5 IMO, you can absolutely even booru prompt it straight up as long as you leave the Gemma boilerplate in properly, in my experience. The more likely issue for some people is that the generally recommended sampling configs are both not that good; DPM++ 2S Ancestral at 4.5 to 5.5 CFG gives massively better results most of the time for me. It is slower though.
Anonymous No.107056603 [Report]
>>107056597
Oh I forgot to say, that's with Linear Quadratic.
Anonymous No.107056614 [Report] >>107056687
https://huggingface.co/lightx2v/Wan2.2-Distill-Models/tree/main

new models, anyone test?
Anonymous No.107056620 [Report] >>107056652
>>107056491
did saving files with librewolf remember?
Anonymous No.107056635 [Report]
Lustify is pretty good for off-topic gens.
Anonymous No.107056640 [Report] >>107056650 >>107056679
>>107056596
ty
Anonymous No.107056644 [Report]
Anonymous No.107056649 [Report]
>>107056597
sadly that sampler is not in neo forge for some reason but I have good luck with DPM++ 2M
Anonymous No.107056650 [Report] >>107056819
>>107056640
>>107056581
huh? aren't these just film stills?
Anonymous No.107056652 [Report]
>>107056620
It does remember the last directory for images, but with cum ui it does not.
Anonymous No.107056666 [Report]
Anonymous No.107056670 [Report] >>107056708
>>107056151
>Any of you has a recommandation to generate videos for a music video? I'm looking for 16:9, some kind of 35mm grain/look, mostly still shots but with some travelling too. Theme is urban 90's/2000's workers working daily shifts.

>I only know of Veo 3 so far, and looking at Runway.
This is the local general so I'll give you advice for a model you can run on a GPU on your home computer

Your only real option for cinematic stuff is Wan 2.2 or 2.1 with the MoviiGen lora. I would recommend trying a 2.2 workflow + that Lora at 720 using a 5090, or FusionX if you choose to use 2.1

If you don't need the lack of censorship of WAN and you have money to spend, I'd just use runway for this. Higgsfield AI may also be interesting to you because they have specific stuff for music videos
Anonymous No.107056674 [Report] >>107056688 >>107056698
Anonymous No.107056679 [Report] >>107056709 >>107056819
>>107056640
I am going to give a you tip:
watch Bram Stoker's Dracula (90's) and take a couple of screenshots, there's Lucy and all that. Then img2img them. That'll be great.
Anonymous No.107056683 [Report]
Anonymous No.107056687 [Report]
>>107056614
>more i2v
I sleep
Anonymous No.107056688 [Report]
>>107056674
i wish this were me right now
Anonymous No.107056690 [Report] >>107056726
Kek, I was playing the Suno side and thought local already caught up somehow

https://levo-demo.github.io/

Very disingenuous demo
Anonymous No.107056698 [Report]
>>107056674
You should put him in a van. And make him go.
Anonymous No.107056708 [Report]
>>107056670
Cool thanks
Anonymous No.107056709 [Report]
>>107056679
>Bram Stoker's Dracula
The best shots are a couple of still frames from inside the film, not these tiktok screenshots etc.
Anonymous No.107056726 [Report] >>107056742 >>107056890
>>107056690
Kek, who trained this model? It spits out Adele unprompted?

https://levo-demo.github.io/static/audio_sample/overview/04_en.mp3

It might be good. How come I've never heard of it.
Anonymous No.107056732 [Report] >>107056744
Anonymous No.107056742 [Report] >>107056773
>>107056726
Is there a reason why you feel the need to talk about an unrelated subject in the thread when it can exist in its own thread with actual documentation we can grab from the OP and all use?
Just seems odd you can't do that instead
Anonymous No.107056744 [Report]
>>107056732
bigger
Anonymous No.107056751 [Report]
Anonymous No.107056773 [Report]
>>107056742
There's no comfy workflow, and the model seems like some experimental half trained model, what is there to talk about?
Anonymous No.107056799 [Report]
Anonymous No.107056819 [Report] >>107056836
>>107056650
Remember this scene?

>>107056679
It's a decent movie but I would unironically remake all Keanu Reeves dialogue with AI
Anonymous No.107056836 [Report]
>>107056819
bro dont upload that scary ass shit here
Anonymous No.107056867 [Report] >>107056915 >>107056983
I'm getting closer
Anonymous No.107056890 [Report] >>107056904 >>107057924
>>107056726
Tencent is actually training their own music model.

https://huggingface.co/tencent/SongGeneration

>TODOs
>Release SongGeneration-v1.5 (trained on a larger multilingual dataset, supports more languages, and integrates a Reward Model with Reinforcement Learning to enhance musicality and lyric alignment)

And the data is so copyrighted it spits out Adele unprompted as you can see on their demo. That is wild, with Qwen doing the same, my faith in China has been restored.
Anonymous No.107056904 [Report] >>107056926 >>107056935
>>107056890
What does this have to do with image diffusion?
Anonymous No.107056915 [Report] >>107056946
>>107056867
to killing urself? never been happier for u
Anonymous No.107056926 [Report] >>107056946
>>107056904
There is no music thread. It's either here or /lmg/, the only two places we can discuss local models.
Anonymous No.107056930 [Report]
Anonymous No.107056935 [Report] >>107056946
>>107056904
this is local diffusion general, we accept video and audio related content here.
Anonymous No.107056943 [Report]
i for one welcome our music gen brothers
Anonymous No.107056946 [Report] >>107056984
>>107056935
>Discussion of Free and Open Source Text-to-Image/Video Models
>>107056915
>>107056926
You revealed yourself go back to your containment thread
Anonymous No.107056958 [Report]
Anonymous No.107056983 [Report] >>107056987
>>107056867
closer to approaching the quality of a quantized 2gb illustrious model? maybe
Anonymous No.107056984 [Report] >>107056987 >>107057002
>>107056946
Comfy has audio models. We should be allowed to discuss anything comfy adopts as long as it is local.
Anonymous No.107056987 [Report]
>>107056983
>>107056984
You're so fucking pathetic dude
Anonymous No.107057002 [Report]
>>107056984
Besides, good audio models are pivotal for video. Since Sora 2 it's not the muted-audio era anymore, the SOTA has changed, so all discussion on audio research is welcome.
Anonymous No.107057004 [Report]
uh oh, melty
Anonymous No.107057066 [Report] >>107057119 >>107057157
I'll give the NetaYume shill this, the model requires a whole lot of gacha but at least it has some actual variation in its outputs.
Anonymous No.107057081 [Report]
Im running ComfyUI and following the guide; ive been playing around with the hand and face detailer. Is there an equivalent for feet/toes? Id like to be able to fix those too.
Anonymous No.107057093 [Report]
Anonymous No.107057111 [Report] >>107057320
>>107056597
Some are better than others, clearly, but IMHO much of sampler/scheduler choice is subjective. The latter moreso than the former in my estimation.
Anonymous No.107057112 [Report]
Anonymous No.107057119 [Report] >>107057137
>>107057066
You haven't taken any steps to learn the model and it shows. Why not explore something before going on multi-day complaints?
Anonymous No.107057137 [Report] >>107057145
>>107057119
I barely post in this thread, you're tilting at the wrong windmill friend. And I'm saying I like the model, I get better results out of it for the particular thing I'm prompting than I get out of the other boomerprompt models.
Anonymous No.107057145 [Report] >>107057165
>>107057137
Anything to show?
There has been this constant wave of anons that complain about this model but don't post anything. I know you're just wasting time but take your low skill ass to one of the other threads
Anonymous No.107057155 [Report]
Anonymous No.107057157 [Report] >>107057165
>>107057066
>the model requires a whole lot of gacha
Describe the poses/gestures better
Anonymous No.107057165 [Report] >>107057172
>>107057157
"Face and proportions that don't look weird"
>>107057145
>take your low skill ass to one of the other threads
OK
Anonymous No.107057172 [Report] >>107057334
>>107057165
Fuck off now thanks!
Anonymous No.107057175 [Report] >>107057196
Anonymous No.107057196 [Report]
>>107057175
illustrious 2gb?
Anonymous No.107057206 [Report] >>107057391
why is netayume so sloppy bros??
Anonymous No.107057213 [Report]
*yawn*
Anonymous No.107057227 [Report] >>107057303
>Mindbroken because hen ever made anything good in his life
Anonymous No.107057255 [Report]
>>107055177
it's already there
Anonymous No.107057278 [Report]
>>107054935
both
Anonymous No.107057303 [Report]
>>107057227
damn you melting so hard you cant even spell yumebro
Anonymous No.107057317 [Report]
https://www.youtube.com/watch?v=xboXFT46XSo
Anonymous No.107057320 [Report]
>>107057111
DPM++ 2S Ancestral is pretty objectively better than Res Multistep, at least for details like hands and text, using Linear Quadratic for both
Anonymous No.107057327 [Report] >>107057391 >>107057457
>>107054935
You typically don't need more than 25 steps. Most of my 50 step outputs have been either a sidegrade or even a downgrade in terms of quality.
Don't forget that chroma can gen pics above 1024 dimensions.
Anonymous No.107057334 [Report] >>107057352 >>107057355
>>107057172
It's clearly the same fairly bad troll as yesterday, he's blatantly ragebaiting
Anonymous No.107057352 [Report] >>107057408
>>107057334
yeah I agree fellow yumebro, theres totally not a vast majority of people that find this model trash
Anonymous No.107057355 [Report]
>>107057334
It's the same retard from the rentry, he spends his entire life doing this for years and is just reduced to a bitter faggot.
Anonymous No.107057391 [Report]
>>107057206
kek yeah that poster was an idiot
>>107057327
nice
Anonymous No.107057408 [Report]
>>107057352
You're right, there's in fact not a vast majority of such people
Anonymous No.107057457 [Report] >>107057474 >>107058352
>>107057327
NTA. Your pic is neat af. This is also 25 steps
Anonymous No.107057474 [Report] >>107057505
>>107057457
Oh, this is neat too, how's radiance compared to DC-2K?
Anonymous No.107057497 [Report] >>107057589
Anonymous No.107057505 [Report]
>>107057474
>How's radiance compared to DC-2K?
Couldn't tell you, but I loved the 2k debug ones. There's still a lack of blending the macro pixels but it's mostly good
Anonymous No.107057515 [Report] >>107057589
Anonymous No.107057521 [Report] >>107057530 >>107057589
Anonymous No.107057530 [Report]
>>107057521
Cinematic Redmond is great.
Anonymous No.107057553 [Report]
Anonymous No.107057588 [Report]
Anonymous No.107057589 [Report] >>107057609 >>107057686 >>107057713
>>107057497
"The lighting is even with no strong shadows." compared to "Cinematic lighting, dark background, deep shadows, detailed skin. Sharp HDR."

>>107057515
>>107057521
very cool
Anonymous No.107057591 [Report]
Anonymous No.107057595 [Report]
Anonymous No.107057609 [Report]
>>107057589
https://www.youtube.com/watch?v=ZEWGyyLiqY4
Anonymous No.107057615 [Report] >>107057690
>>107054482
>32b
Mostly useless for local. Viable for use with quantization, especially nunchaku, but LoRA training will be a nightmare, and a model without low cost LoRA training is pointless beyond ten minutes of novelty use.
Anonymous No.107057648 [Report] >>107057829
>>107054248
We haven't seen any samples from the final pretrained model, but this is how it sounded as it was training

J-pop song
https://vocaroo.com/19CHG4V410OP

Some pop song
https://vocaroo.com/1i7OjKcLbmnO

Some opera song
https://vocaroo.com/1f64Fkmpn9Ax

Idk, maybe with the SFT phase it'll catch up to where it needs to be, but those outputs are very underwhelming. Just a bit concerning, but I don't know jack shit about these models.
Anonymous No.107057655 [Report] >>107057838 >>107057980
Slowly getting it together, still need to learn composition better
Anonymous No.107057657 [Report] >>107057671 >>107057672
What was that feature of comfyui that was being advertised a while ago where you bundle a bunch of nodes together and then you can re-use that as one node?

did this ever actually happen?
Anonymous No.107057671 [Report] >>107058085
>>107057657
subgraphs? didn't really change anything and was kind of a letdown. the node implementation in general is lacking too much and everything done to the front end has been lipstick on a pig
Anonymous No.107057672 [Report] >>107058085
>>107057657
subgraphs?
they're pretty great to clean up wf and only see what you actually need to see
Anonymous No.107057677 [Report] >>107057718
>https://huggingface.co/meituan-longcat/LongCat-Video
>We introduce LongCat-Video, a foundational video generation model with 13.6B parameters, delivering strong performance across Text-to-Video, Image-to-Video, and Video-Continuation generation tasks. It particularly excels in efficient and high-quality long video generation, representing our first step toward world models.
Anyone tried it? Works with KJ wanvideowrapper
Anonymous No.107057686 [Report]
>>107057589
I had an antiquated gpu. But jesus, the boost even SDXL has gotten in terms of noise... Sounds like faggotry.
Anonymous No.107057690 [Report]
>>107057615
>LoRA training will be a nightmare
ostrisai's trainer has supported 3bit quants for a while now.. wouldn't that be sub-16gb? https://xcancel.com/ostrisai/status/1953933728948121838
Anonymous No.107057700 [Report]
anyone knows if infinite talk works with wan2.2, or is it just for 2.1?
Anonymous No.107057713 [Report] >>107057744
>>107057589
what model is this?
Anonymous No.107057718 [Report] >>107057731
>>107057677
some onions have been trying it out. doesn't look that much different from context window jerkiness after every 5 seconds
Anonymous No.107057731 [Report]
>>107057718
>onions
anons filters to onions sometimes or something? the more you know I guess
Anonymous No.107057744 [Report] >>107058315
>>107057713
Chroma-DC-2K-T2-SL4-Q8_0
Anonymous No.107057754 [Report]
Anonymous No.107057788 [Report] >>107057810
>python?
>no, that shit is gay
Anonymous No.107057810 [Report]
>>107057788
based chink
Anonymous No.107057813 [Report] >>107057823 >>107057836 >>107057838
>>107054044 (OP)
slowly but surely, mistakes were made, just need to adjust values
Anonymous No.107057823 [Report] >>107057922
>>107057813
Thank you Ran. You wanted some attention.
Anonymous No.107057829 [Report]
>>107057648
It's impressive to see the vocals don't sound anywhere near as robotic as original ACE-Step though. If they catch up to Suno 4.5 maybe there's a chance of getting Udio tier kino now and then.
Anonymous No.107057836 [Report]
>>107057813
you can't adjust values if you are worthless
Anonymous No.107057838 [Report] >>107058031
>>107057655
>>107057813
the painted nails are nice
Anonymous No.107057860 [Report]
Anonymous No.107057861 [Report] >>107057893 >>107057915
So far I've been using the Wan 2.1 workflow from the rentry but wanted to try out 2.2 from here: https://civitai.com/models/1818841/wan-22-workflow-t2v-i2v-t2i-kijai-wrapper (2.2 I2V)
Why isn't it recognizing the vae? Everything looks correct to me, straight dragging the vae output from the loader to the decoder doesn't do anything either
Anonymous No.107057881 [Report]
Anonymous No.107057893 [Report] >>107058287
>>107057861
it's not connected to the decode node, pull the string from the vae loader to the decode node to connect them
Anonymous No.107057912 [Report]
https://www.youtube.com/watch?v=Gu3TAuw3ZJ8
Anonymous No.107057915 [Report] >>107058287
>>107057861
If it's not bait, as it probably is, it can still be useful for newfags: use the example workflow instead and just load the correct models:
https://raw.githubusercontent.com/Comfy-Org/workflow_templates/refs/heads/main/templates/video_wan2_2_14B_i2v.json
Anonymous No.107057917 [Report] >>107057937
1girl
Anonymous No.107057922 [Report] >>107057950 >>107057959
>>107057823
*MrCatJak
Anonymous No.107057924 [Report]
>>107056890
need them to train a speech model with emotion prompting so we can be freed from the dead end known as vibevoice
Anonymous No.107057925 [Report]
>>107055149
That's what I want to hear.
Start a group, delegate simpler tasks to me, such as some manual captioning, and I'll contribute $250 toward training.
The only catch is that you share the training process and I get to ask a few technical questions.
We can find 20 others; there are plenty of interested people out there.
I don't care if it's a failure.
Anonymous No.107057937 [Report]
>>107057917
I wonder if krea has a buttchin obsession too
Anonymous No.107057949 [Report] >>107057988
Anonymous No.107057950 [Report]
>>107057922
You want to suck off people.
Anonymous No.107057959 [Report]
>>107057922
What took you so long, please offer your asshole.
Anonymous No.107057980 [Report]
>>107057655
hot
Anonymous No.107057988 [Report]
>>107057949
is that...
Anonymous No.107058024 [Report] >>107058217
elongated 1girl
Anonymous No.107058031 [Report] >>107058049 >>107058068 >>107058104
>>107057838
Thanks, I think I'm getting the hang of this model now. The hardest part is finding the right blend of tags for a presentable image, followed by adjustments; starting to feel like 60 steps is the magic number with this model. I wish neoforge had all the samplers, I don't know why he took some away.
One thing I like with this model is I can game due to how little vram it uses compared to chroma.
Sorry, but I have a dedicated sperg that hates me and has been holding a grudge for years; just ignore him
Anonymous No.107058049 [Report]
>>107058031
?
Anonymous No.107058068 [Report]
>>107058031
I respected you for years but not any longer. Seems like you are just spiteful.
Anonymous No.107058084 [Report] >>107058092
>disabled retard noises
Anonymous No.107058085 [Report] >>107058130 >>107058160
>>107057671
>>107057672
Thanks, I've started using subgraphs but I can't figure out how to make all subgraphs reflect each other's changes when I edit one of them. Any ideas? I would expect them to work like Scenes in Godot.
Anonymous No.107058092 [Report]
>>107058084
Why do you refer to yourself in the 3rd person?
Anonymous No.107058104 [Report]
>>107058031
>ran wanted to come out
He manages to spit out a narcissistic rant.
Anonymous No.107058130 [Report] >>107058193
>>107058085
clone it
Anonymous No.107058160 [Report]
>>107058085
>I would expect them to work like Scenes in Godot.
there are a lot of expectations from modern nodegraphs and comfyui ducks up 90% of what's standard
Anonymous No.107058193 [Report] >>107058215
>>107058130
ah, I re-cloned and the clone is working now! I guess I must've cloned too early before, or there was a bug, which caused my clones to become unique (and no longer clones).
Anonymous No.107058204 [Report]
Anonymous No.107058213 [Report]
>>107055297
>pleeeeeease novel ai, i need the model files, my local model is kinda noisy
Anonymous No.107058215 [Report]
>>107058193
if you duplicate you get separate entities, and if you clone you get tied ones
Anonymous No.107058217 [Report]
>>107058024
Buffy x slenderman
Anonymous No.107058274 [Report] >>107058307
Yeah I need to make loras for this model, it's the boost I needed. It should also be pretty fast compared to training chroma
Anonymous No.107058287 [Report]
>>107057893
doing that just crashed comfyui
>>107057915
not bait, I'm just a bit of a brainlet when it comes to this but your workflow works fine, thanks
Anonymous No.107058307 [Report]
>>107058274
Netayume is fucking garbage, holy shit
Anonymous No.107058315 [Report] >>107058355
>>107057744
>Chroma-DC-2K-T2-SL4-Q8_0
nta, nice gens, with lora?
Anonymous No.107058318 [Report]
Netayume is fucking trash and just having it write some text that looks like its done in paint doesnt make it redeemable

Chroma for complex stuff and illustrious for hentai is the way to go
Anonymous No.107058339 [Report]
uh oh meltie
Anonymous No.107058352 [Report]
>>107057457
Shit I didnt notice you replied to me

>25 steps is enough
Thanks for the heads up boss, chroma fp16 with fp16 text encoder doesnt run all that slow on my 5060ti 16gb if I keep it under 30 steps
Anonymous No.107058355 [Report]
>>107058315
Yeah, uploading to civitai right now
Anonymous No.107058370 [Report] >>107058403
does using this node lead to loss in quality?
Anonymous No.107058403 [Report]
>>107058370
not anything visible
Anonymous No.107058408 [Report] >>107058441
You can still download Udio songs on the fly as 320kbps btw, just downloaded a couple of bangers. No need to record or anything like that.
Anonymous No.107058432 [Report] >>107058444 >>107058469 >>107058474
can a pitfag rate this pit for me
Anonymous No.107058441 [Report] >>107058499
>>107058408
from what I read they're limited to 192kbps mp3
I'm getting everything in bulk I saved there
Anonymous No.107058444 [Report]
>>107058432
pits fine but smaller boobs would be more harmonious
Anonymous No.107058469 [Report]
>>107058432
6/10
I prefer mine like this
Anonymous No.107058474 [Report]
>>107058432
Its nuts that i immediately spot a netayume pic every time since it looks so off
Anonymous No.107058482 [Report]
Fresh

>>107058480
>>107058480
>>107058480

Fresh
Anonymous No.107058499 [Report]
>>107058441
Yeah dunno, it's quite strange.

Was able to download a few of them at 320kbps with fetchv, like https://www.udio.com/songs/hoCg4BmayTYXcJfjo4jvbT

But other ones are only 192kbps. Maybe for some reason some of them stream at 320kbps, while other ones don't?
Anonymous No.107059859 [Report]
I see people making AI images of trump and stuff like that.
But some of the stuff is definitely better than others.
Is there any way to set it up such that whatever prompt I give, the character is strictly that one character?
I mean, not just simply typing in the name of the character but making it more realistic?
Like, in a way such that even when I make anime or caricature images, it seems like some professional artist drew that based on the likeness of the person?
I don't know much about the loras and such, that's why I ask.