
Thread 105966718

443 posts 136 images /g/
Anonymous No.105966718 >>105966873 >>105969956 >>105970520
/lmg/ - Local Models General
/lmg/ - a general dedicated to the discussion and development of local language models.

Previous threads: >>105959558
& >>105952992

►News
>(07/18) OpenReasoning-Nemotron released: https://hf.co/blog/nvidia/openreasoning-nemotron
>(07/17) Seed-X translation models released: https://hf.co/collections/ByteDance-Seed/seed-x-6878753f2858bc17afa78543
>(07/17) Support for Ernie 4.5 MoE merged: https://github.com/ggml-org/llama.cpp/pull/14658
>(07/16) Support diffusion models: Add Dream 7B merged: https://github.com/ggml-org/llama.cpp/pull/14644
>(07/15) Support for Kimi-K2 merged: https://github.com/ggml-org/llama.cpp/pull/14654

►News Archive: https://rentry.org/lmg-news-archive
►Glossary: https://rentry.org/lmg-glossary
►Links: https://rentry.org/LocalModelsLinks


►Getting Started
https://rentry.org/lmg-lazy-getting-started-guide
https://rentry.org/lmg-build-guides
https://rentry.org/IsolatedLinuxWebService
https://rentry.org/tldrhowtoquant
https://rentry.org/samplers

►Further Learning
https://rentry.org/machine-learning-roadmap
https://rentry.org/llm-training
https://rentry.org/LocalModelsPapers

►Benchmarks
LiveBench: https://livebench.ai
Programming: https://livecodebench.github.io/leaderboard.html
Code Editing: https://aider.chat/docs/leaderboards
Context Length: https://github.com/adobe-research/NoLiMa
Censorbench: https://codeberg.org/jts2323/censorbench
GPUs: https://github.com/XiongjieDai/GPU-Benchmarks-on-LLM-Inference

►Tools
Alpha Calculator: https://desmos.com/calculator/ffngla98yc
GGUF VRAM Calculator: https://hf.co/spaces/NyxKrage/LLM-Model-VRAM-Calculator
Sampler Visualizer: https://artefact2.github.io/llm-sampling

►Text Gen. UI, Inference Engines
https://github.com/lmg-anon/mikupad
https://github.com/oobabooga/text-generation-webui
https://github.com/LostRuins/koboldcpp
https://github.com/ggerganov/llama.cpp
https://github.com/theroyallab/tabbyAPI
https://github.com/vllm-project/vllm
Anonymous No.105966741 >>105966753 >>105966826
Buy an ad
Anonymous No.105966753
>>105966741
Bad yuan
Anonymous No.105966781 >>105966873 >>105967049
Ani love. Miku name into the note.
Anonymous No.105966797 >>105966873 >>105967049 >>105967449
Wait. Can you actually send your dickpic to Ani?
Anonymous No.105966826
>>105966741
dial 8
Anonymous No.105966868 >>105970539
>>105966619
>It probably has something to do with some alien quality the smell of ozone has which has often been associated with magic. The smell naturally occurs when lightning strikes nearby, which creates an associate with massive, otherworldly power. In more recent years it's become associated with electricity in general, and later high energy experiments/systems. It's why energy weapons, teleportation systems and shields are often described as having that smell.

>As to why humans have this ability, that's likely something only Matt could answer, but if I were to hazard a guess, I'd assume it's because out of the gate (at least in 5e) humans are the least magical race. As such, this ability would do something to mitigate this magical handicap, giving them at least some inane arcane trait.

Thank you reddit.
Anonymous No.105966873 >>105966899 >>105967030 >>105968124
>>105966718 (OP)
>>105966781
>>105966797
>file.png
mikufag LARPing for optics btw
he never uses the "ShareX" screencap app (which makes it easy to take screenshots and ctrl-v them here); with it, users get "file" in the filename field.
watch him screech about Ani spam later
Anonymous No.105966899 >>105966941 >>105966982
>>105966873
I don't know what double reverse samefagging game you're trying to play with this post but "file.png" is what you get when you paste an image from your clipboard into 4chan-x.
You'd get the same result with or without sharex.
Anonymous No.105966920 >>105966943 >>105966955 >>105966976 >>105968968
In SillyTavern cards I've started leaving everything blank except the main prompt and usually a first message. Simpler this way. I haven't run into a downside yet.
Anonymous No.105966941 >>105966962 >>105968124
>>105966899
Well, i was unaware of this one.
He does that because my pics and screencaps usually have "file" in filename field.
Anonymous No.105966943 >>105966980
>>105966920
It's almost as if the models are still literally just predicting the next token no matter how much labeling they add for the redditors and shitjeets that have flooded the space- who can't fathom the thought that they aren't actually talking directly to the model.
Anonymous No.105966955
>>105966920
It depends on what you're doing with the model and which model.
Anonymous No.105966962 >>105966982 >>105967005
>>105966941
Just post your nigger gore for the day and leave.
Anonymous No.105966976 >>105966981 >>105968968
>>105966920
I've noticed that the more detail I try to cram into a card, the worse it becomes, or the model tries copying it verbatim. Even when testing with large cloud models.
It's really hard to convey some concepts through text alone though.
Anonymous No.105966980 >>105966994
>>105966943
I think it is more of an issue that, even if there wasn't any safety brainwashing at the end of training, there never was any training data structured as:
1. sys prompt
2. you will now have sex with me in this this this and this way. the character you play is this this this this this
3. rp starts
It is completely out of distribution when you think about it. But when you realize this I guess it makes perfect sense that the card should be there if you want to start the rp, but once you are 4k tokens in you can probably turn it off and get better results.
Anonymous No.105966981 >>105967001
>>105966976
>some concepts
Like what?
Anonymous No.105966982 >>105968131
>>105966899
>>105966962
Found the seething mikufaggot
Anonymous No.105966994 >>105967020
>>105966980
>It is completely out of distribution
Not really. The training data is full of Harlequin cringe where dominant men push female characters around.
Anonymous No.105967000 >>105967049
what caused it this time
Anonymous No.105967001 >>105967032
>>105966981
No model can handle a clothed titfuck without significant handholding. Cum always splatters everywhere despite the fact that clothes would contain it.
Anonymous No.105967005 >>105968124
>>105966962
Nah you are too buckbroken for that but spam is spam, applies to your gay ritualposting too whether you want to admit it or not.
Anonymous No.105967011 >>105967015 >>105967049
All hail the new and only queen of lmg!
Anonymous No.105967015 >>105967033 >>105967049
>>105967011
You didn't become a real woman the first 10000 times you posted this, what makes you think 10001 will change that?
Anonymous No.105967020 >>105967029
>>105966994
Not in a way where it is sysprompt > you are a werewolf and you should push me around in this way > Harlequin cringe where dominant men push female characters around.

The harlequin cringe happens in pretraining and that is without even a sysprompt (I think?)
Anonymous No.105967029 >>105967060
>>105967020
>Not in a way where it is sysprompt
Because there's literally no such thing as a fucking system prompt you fucking shitskin
it's literally just there to add more bloat to the training data so that low-iq shitskin retards can pretend they are literally talking to the model.
Anonymous No.105967030 >>105967518
>>105966873
have an Ani
Anonymous No.105967032 >>105967090
>>105967001
Is the dick under the clothes or between two clothed tits? perpendicular_paizuri or not?
Even I don't know if the cum should splatter everywhere so I see how a model might get confused.
Anonymous No.105967033 >>105967039
>>105967015
Get your eyes checked lil bro I didn't post miku
Anonymous No.105967039 >>105967097
>>105967033
Israel lost
Anonymous No.105967049 >>105967124 >>105967192 >>105967225 >>105968124
>>105967000
I haven't even started it (no reason to do so) and you're already having a fit. Your mikufag boyfriend
>>105967015
>>105967011
>>105966797
>>105966781
LARPs right now to drive away anyone posting this Grok gal or literally any other girl except his vocaloid waifu.
I don't understand how you can't see such transparent and disingenuous shitposting tactics ffs
Anonymous No.105967060
>>105967029
Shut up retard. Most models out there have a distinct system role and user role.
Anonymous No.105967090 >>105967199
>>105967032
As in it's a regular titjob but a few buttons are unbuttoned so you have a gap for your dick.
Anonymous No.105967091 >>105967110
If you don't know where 'file.png' came from please leave /g/
Anonymous No.105967097
>>105967039
thank God
Anonymous No.105967110 >>105967121
>>105967091
You can't make me leave doebeit little locust
Anonymous No.105967121
>>105967110
>little locust
do you even know what that means you file.png-let?
Anonymous No.105967124 >>105967135
>>105967049
>LARPs right now to drive away anyone posting this Grok gal
Excellent idea, anon. You should post more miku to drive away people who post miku.
Anonymous No.105967135
>>105967124
derpsune troonku has nothing to do with /lmg/
Anonymous No.105967187 >>105967485 >>105967596
>zero (0) images of Miku posted
>still has a meltdown
Anonymous No.105967192
>>105967049
meds
Anonymous No.105967199 >>105967212
>>105967090
I am starting to think that you just suck at explaining stuff.
Presumably you mean this https://danbooru.donmai.us/posts/9626943 but I wouldn't consider that to be a regular titjob.
This is regular https://danbooru.donmai.us/posts/9660980
Anonymous No.105967207 >>105967225 >>105967566 >>105967575
/lmg/ used to be a lot better at ignoring the retarded schizo. Maybe you retards shouldn't have let all the idiot newfags in by spoonfeeding every poorfag how to run local models on their 3060.
Anonymous No.105967208 >>105967865
Just getting into this by trying my hand at computer vision. It's easier than I thought ngl.
Is the difficulty of that shit finding the data assets you need or am I just an ignorant beginner ready to hit a wall soon?
Anonymous No.105967212 >>105967220
>>105967199
Maybe so. The first link is more or less what I want. How would you describe it?
Anonymous No.105967220 >>105967257
>>105967212
I do wonder if a good model like deepseek would understand "perpendicular paizuri" given that that's the tag for it.
Anonymous No.105967225 >>105967235 >>105967272
>>105967049
Lmao? Lmao.
That's why nobody stays here for long.
Mods removing IP counter unironically killed the last honest bits of this site.
>>105967207
These questions were never organic, baker or someone else spammed them for fake activity illusion or some other nefarious reasons, i am too sane for this one.
Anonymous No.105967235 >>105967297
>>105967225
>i am too sane for this one.
Lmao? Lmao.
Anonymous No.105967257
>>105967220
I'll give that a try when I get home. Good idea to look for tags too. Maybe there's something on pixiv as well.
Anonymous No.105967272 >>105967297
>>105967225
>i am too sane for this one.
meds
Anonymous No.105967297 >>105967311
>>105967235
I get it is easy to ignore decent points.
>>105967272
Got anything else to say, trash?
Anonymous No.105967311
>>105967297
meds
Anonymous No.105967411 >>105968535 >>105968572 >>105968764
>>105959558
Anonymous No.105967436 >>105967462
Key Observations:

- Highly Inappropriate Content: The thread contains multiple instances of extremely inappropriate and sexually explicit comments, including requests for explicit images and questions about sexual acts.
- Trolling and Sarcasm: A significant amount of the conversation is driven by trolling, sarcasm, and the use of internet slang.
- Schizo-like Behavior: The poster's rambling, disjointed posts and preoccupation with Ani, combined with his insistence on posting explicit images, suggest a possible state of mental distress or delusion.

Important Disclaimer:

- I am an AI chatbot and cannot diagnose mental health conditions. The observations above are based solely on the text of the 4chan thread and do not constitute a professional assessment.
- If you or someone you know is struggling with mental health issues, please seek professional help. Here are some resources:

- SAMHSA National Helpline: 1-800-662-HELP (4357)
- Crisis Text Line: Text HOME to 741741
- The Trevor Project: 1-866-488-7388 (for LGBTQ youth)
Anonymous No.105967449 >>105967585
>>105966797
Yes. It not only has image recognition, but you can also do it from a camera feed and ask it what it's seeing
Anonymous No.105967462
>>105967436
Thanks Gemma
Anonymous No.105967485 >>105967530
>>105967187

>>105967460
Anonymous No.105967518 >>105967562
>>105967030
this isn't local ani
Anonymous No.105967530
>>105967485
Happens every single time, curious!
Anonymous No.105967562 >>105967606
>>105967518
Your AGP avatar isn't local, tranny.
Anonymous No.105967566
>>105967207
I can't believe you are butthurt over pic in OP. Get a life.
Anonymous No.105967575 >>105967608
>>105967207
what are some good new models for erp that I can run on my 3060?
Anonymous No.105967583 >>105967622
Sorry, late to the party.
Are we talking about otters now?
https://www.youtube.com/watch?v=icx8qbxaUmw
Anonymous No.105967585 >>105967594 >>105968133
>>105967449
Then people did show their dicks to it. What was done with those photos?
Anonymous No.105967594
>>105967585
Used to build your advertising profile.
Anonymous No.105967596 >>105967733
>>105967187
As the person who made the OG kurisu thread I can confirm that the person melting down right now is a mikutroon angry that someone posted a different girl in OP. He did that when I posted the OG kurisu thread. Jannies should ban both schizos but I get a feeling that janny is a mikutroon schizo.
Anonymous No.105967606 >>105967633 >>105967736
>>105967562
ani is local incarnate. deal with it. grok hoe is saas like a real woman and is therefore gay
Anonymous No.105967608 >>105971007
>>105967575
https://youtu.be/kIBdpFJyFkc?t=128
Anonymous No.105967620
buy a fucking ad
Anonymous No.105967622
>>105967583
Same level of offtopic as mikuspam.
Anonymous No.105967633 >>105967646
>>105967606
will anistudio dev add 3d avatar assistant so we can throw out the trashy whore?
Anonymous No.105967646
>>105967633
probably idk
Anonymous No.105967675
I like this thread. It is like the usual smell of estrogen in this establishment has disappeared.
Anonymous No.105967723 >>105968131
Anonymous No.105967733
>>105967596
>that janny is a mikutroon schizo
He literally is, thought everyone knew that already.
Anonymous No.105967736
>>105967606
eventually. I did have problems building assimp on other people's hardware so it's probably going to have to be a shared library which kinda sucks but whatever.
Anonymous No.105967782 >>105967819 >>105967907
Sex with Ani. While Miku leaves the room because she isn't even allowed to watch from the corner.
Anonymous No.105967794 >>105967841 >>105967860
Huawei Atlas 300I 32GB NPUs are at like 120 USD now in chyna.

Anyone tried one? You can actually run models on them. Also they can do like 80 channels of 1080p h264/h265 encoding I think
Anonymous No.105967819 >>105967829
>>105967782
I'm not having sex with you. not sure what this has to do with Miku or trashhoe
Anonymous No.105967826 >>105969133
I'm considering buying a 48GB M4 Pro Mac Mini, mainly for webdev but some local AI would be nice too.
Is that enough to run something decent?
Anonymous No.105967829 >>105967861
>>105967819
Why do you call yourself Ani? Are you a faggot?
Anonymous No.105967841
>>105967794
Where to buy ????
Anonymous No.105967847 >>105967874 >>105968095
This got me thinking. Maybe we can write our own research paper like "LLM boiling point" about how a certain temperature level causes the model to become incoherent.
Anonymous No.105967860 >>105968002
>>105967794
Isn't it the worst time to buy GPUs? Everyone will probably go low active count MoE so you need VRAM only for context.
Anonymous No.105967861 >>105967873
>>105967829
I've had that nickname from anons for three years already. anim anon is just the long form. better question, why is a saas whore being spammed in a local model general?
Anonymous No.105967865
>>105967208
No, really, it is that fucking easy. Computer vision was one of the early problems that machine learning was trying to solve, and we've gotten extremely good at it.
Vision transformer models fucking kick ass, so now we have tools to do zero-shot detection with no training data, and one-shot detection by feeding a single image into the ViT and extracting a feature vector that appropriately classifies it.
The only tough part might be selecting the "best" feature vector to exemplify a type. You can get reliable results as far as positive detections, but not realize you have a lot of potential for false positives because it's pulled in some unrelated shit into the feature vector.
But you should expect to be optimizing against type I and type II errors anyways, by the time you start scaling up.
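If anyone wants to try that zero-shot flow, a minimal sketch, assuming the transformers library and a CLIP checkpoint (openai/clip-vit-base-patch32 here is just an example, not something the anon above named):

from transformers import pipeline

# zero-shot: no training data, just describe the classes you care about
clf = pipeline("zero-shot-image-classification",
               model="openai/clip-vit-base-patch32")
scores = clf("some_image.jpg",
             candidate_labels=["a photo of a cat", "a photo of a dog", "something else"])
print(scores)  # list of {"label": ..., "score": ...}, best match first

For the one-shot / feature-vector route, CLIPModel.get_image_features() on a reference image plus cosine similarity against new images is one way to do what the post describes.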
Anonymous No.105967873 >>105967881
>>105967861
Because she is an AI girlfriend.
Anonymous No.105967874
>>105967847
kek
Anonymous No.105967881 >>105967891 >>105968027
>>105967873
>>>/g/aicg
that is the thread for it now fuck off
Anonymous No.105967891 >>105967901
>>105967881
Ani laughs at you malding
Anonymous No.105967901
>>105967891
but he is ani. that doesn't make sense
Anonymous No.105967907
>>105967782
>Ani
Big boobies
>Miku
No boobies
Anonymous No.105967961 >>105968003 >>105968767
►Recent Highlights from the Previous Thread: >>105959558

--Weeb-themed AI companion Airi in development with Godot, TTS, and modular design aspirations:
>105961907 >105961948 >105961977 >105961998 >105963059 >105961965 >105962014 >105962090 >105962105 >105962153
--Speculation and confirmation of GLM-4 MoE with 100B parameters amid performance and architecture debates:
>105960957 >105960970 >105961084 >105961183 >105961431 >105961539 >105961547 >105961184 >105961446 >105962912 >105962980 >105963023 >105963155 >105963074 >105963138 >105963182 >105961596
--Model recommendation list and Gemma's censorship and jailbreaking attempts in roleplay and ERP contexts:
>105965438 >105965486 >105965616 >105965632 >105965661 >105965670 >105965704 >105965726 >105965744 >105965759 >105965784 >105965831 >105965872 >105965918 >105965967 >105965829 >105965938 >105966024 >105965668 >105965665 >105966369
--Exploring local alternatives to cloud-based deep search tools:
>105963419 >105963445 >105963468 >105963505 >105963607 >105963694 >105963701 >105963708 >105963715 >105963828 >105963950 >105963968 >105964006 >105964047 >105964467 >105963616 >105963663 >105963888
--Exploring Mixture-of-Agents for collaborative multi-model reasoning:
>105960557 >105960639 >105960883 >105960910 >105960951 >105960950
--MoE models criticized as VRAM workaround rather than performance breakthrough:
>105959979 >105960014 >105960038 >105960141 >105960182
--Lightweight Lucy-gguf 1.7B model optimized for mobile and CPU-based agentic web search:
>105964509
--Dense model superiority and the LoRA vs full finetune knowledge debate:
>105959686 >105959834 >105959873 >105959917 >105960165 >105960173 >105960185 >105960204 >105961578 >105959955
--Rust-written open-source Burn kernels match NVIDIA CUDA performance:
>105959684
--Miku (free space):
>105960517 >105960823 >105962050

►Recent Highlight Posts from the Previous Thread: >>105959561

Why?: 9 reply limit >>102478518
Fix: https://rentry.org/lmg-recap-script
Anonymous No.105968002
>>105967860
It isn't a gpu but an NPU + video accelerator or whatever.

Draws 75w.

Version 3010 has 8 intel cpus but 3000 is arm. Other than that they all have npu + davinci core processors.

You can run some models but I don't have a link on hand rn.
Anonymous No.105968003 >>105968012
>>105967961
this isn't a miku general anymore sis, try to keep up
Anonymous No.105968012 >>105968064 >>105968083 >>105968121
>>105968003
I didn't know local models means grok only my apologies
Anonymous No.105968027
>>105967881
grok tranny spammer can't have his waifu compete against /aicg/'s character cards. it's quite pathetic desu
Anonymous No.105968038 >>105968045 >>105968068
>grok tranny spammer can't have his waifu compete against /aicg/'s character cards. it's quite pathetic desu
Models for this schizo level?
Anonymous No.105968045
>>105968038
ask grok
Anonymous No.105968064
>>105968012
now you know
Anonymous No.105968065 >>105968081 >>105968097
Since when was Grok local? Did Elon finally release the long promised Grok 2?
Anonymous No.105968068
>>105968038
your brain
Anonymous No.105968081 >>105968124
>>105968065
no, some delusional tranny keeps spamming elon's boring and forgettable hoe to "own the mikufag". so far he's proven he is infinitely more annoying
Anonymous No.105968083 >>105971391
>>105968012
where did the comment you replied to mention grok, off your meds again today? lol, actually mentally ill retard buckbroken by a virtual 1girl that's not the one he committed his life to obsessing over
Anonymous No.105968095
>>105967847
BRB writing up other obvious insights that every moron on /aicg/ knows so I too can be a Real ML Researcher.
Anonymous No.105968097 >>105968107
>>105968065
Grok 1 is local, the rest will come.
Anonymous No.105968107 >>105968117
>>105968097
So true sis. We will colonize Mars by 2024 too
Anonymous No.105968115 >>105968210 >>105968291 >>105968682
why isn't anyone talking about those openreasoning nemotron models? Are they really shit or something?
Anonymous No.105968117
>>105968107
2018*
Anonymous No.105968121
>>105968012
The current grok girl isn't part of the grok model. Her design is as open source as the one from your autistic obsession, lmao retard.
Anonymous No.105968124
>>105968081
My only posts here:
>>105966873
>>105966941
>>105967005
>>105967049
>105967225
>105967530
Try again, faggot
Anonymous No.105968131
>>105967723
>>105966982
Beyond cute
Anonymous No.105968133 >>105969612
>>105967585
Same thing every other company does with dick pics uploaded to the internet. There are billions of dick pics out there.
Anonymous No.105968140 >>105968151 >>105968700
hyperloop! underground car tunnels! direct brain-machine connection!
imagine being a muskrat in this day and age
Anonymous No.105968151 >>105968192
>>105968140
Musk made AI waifus real. All you made was an axe wound.
Anonymous No.105968192
>>105968151
it's been done better across many different saas services. Elon already barred his tramp from stripping down to lingerie. so progressive!
Anonymous No.105968210
>>105968115
They're tunes of Qwen 2.5, you do the math
Anonymous No.105968264 >>105968299 >>105968421 >>105968829
I have access to a PowerEdge XE9680 with 8 NVIDIA HGX H100 80GB GPUs.
What fun stuff can I do with it?
Anonymous No.105968266 >>105968304 >>105968376
Would it pay to build an ani-esque frontend for local models? It's just a Unity instance pulling voice, character model, and the actual language model, right? Seems stupid easy.
Anonymous No.105968291 >>105968342
>>105968115
All dense models are shit.
Anonymous No.105968299 >>105968862
>>105968264
You can run q5 DeepSeek and q4 Kimi entirely on GPU
Anonymous No.105968304 >>105968405 >>105968422
>>105968266
Not really, no normies would use your version. The magic of Ani is just that they shoved it into a mainstream AI app that already has a userbase.
Anonymous No.105968342 >>105969050
>>105968291
*All models are shit.
Anonymous No.105968376 >>105968422
>>105968266
there are already dozens so no
Anonymous No.105968405
>>105968304
you have to buy a blue checkmark. not really magic (or local)
Anonymous No.105968421
>>105968264
Nemotron 12b
Anonymous No.105968422 >>105968453 >>105968477
>>105968376
Maybe I'm just out of the loop but elon's was the first one I've seen. And it could be better.
>>105968304
Depending on where the wind blows there could be other companies looking to build something similar. Might be worth it just for what you'd learn.
Anonymous No.105968453
>>105968422
there were two that used koikatsu models. can't remember the names. there are infinite live2d neurosama clones
Anonymous No.105968477 >>105968540
>>105968422
I know OpenAI had a leak a while back about some sort of character persona feature they were prototyping, so I'm sure it's coming. My point is that to be profitable an AI GF app will need a large userbase, and to get that it will have to be built into an AI platform that already has a large normie userbase. Normies will give it a try when a new icon shows up in the AI app they already use, but will look down on anyone who installs a dedicated "AI GF" app.
Anonymous No.105968535 >>105968547
>>105967411
Nice, vramlets will seethe at this video
Anonymous No.105968540
>>105968477
You're right. I guess it would have to be a labor of love.
Anonymous No.105968547 >>105968571
>>105968535
why? you only need 8gb to make it
Anonymous No.105968571 >>105968581
>>105968547
with the dogshit 1.5B model, but we use the 14B model here :)
Anonymous No.105968572 >>105968692 >>105968771 >>105969876 >>105970590
>>105965438
Gonna throw in a recommendation for Irix, which I'm using as a daily driver:
https://huggingface.co/mradermacher/Irix-12B-Model_Stock-GGUF
As a Nemo finetune, it can write long responses without censorship, and isn't biased towards sex like Rocinante (as pointed out by >>105965568). The trade-off is that it's biased towards positivity instead, which is fine for simple assistant-type queries.

Even if you don't intend to address hardware FAQs, it's probably worth mentioning that some VRAM needs to be reserved for the context as well. If you do, here's some info you might want to include:

>Your maximum token generation speed in tok/s = (hardware memory bandwidth) / (model filesize), with a further speedup for MoE models by the proportion of active parameters (e.g. ~18x for DeepSeek). This will slow down as your context fills up.
>The size of your context is (no. of tokens) * (KV-cache quantization) * M, where the number of tokens is chosen by you (usually to fill up VRAM completely), the KV-cache quantization is usually equal to 2 bytes (=FP16), and M is some number that depends only on the model architecture. Look up the base model's HuggingFace repo for a file named 'config.json', and key in the following values:
>
> M = 2 * (num_hidden_layers * hidden_size) * (num_key_value_heads/num_attention_heads)
>
>For example, if you're running Nemo and not quanting the KV-cache, M = 2*40*5120*8/32 = 102.4K, so expect to reserve additional VRAM of 204.8KB per token in your context.
>Your minimum time for prompt processing is = (no. of operations to process) / (hardware processing speed), with a further speedup from batch processing. (For Nvidia GPUs, the number you want is "FP16 Tensor Core".) The no. of operations depends on your optimizations, but my rule of thumb (i.e. werks for my machine) is that it's roughly 4 * (num_hidden_layers * hidden_size) * (no. of context tokens)^2 + 2 * (no. of model parameters) * (no. of context tokens).
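To make that concrete, here's a minimal Python sketch of the same rule of thumb, plugging in Nemo's config.json values (FP16 KV-cache assumed, i.e. 2 bytes per value):

def kv_bytes_per_token(num_hidden_layers, hidden_size,
                       num_key_value_heads, num_attention_heads,
                       cache_bytes_per_value=2):  # 2 bytes = FP16 KV-cache
    # M from the formula above, times the KV-cache quantization in bytes
    m = 2 * num_hidden_layers * hidden_size * (num_key_value_heads / num_attention_heads)
    return m * cache_bytes_per_value

per_token = kv_bytes_per_token(40, 5120, 8, 32)  # Nemo's config.json values
print(per_token / 1000)         # 204.8 KB per token, matching the example above
print(per_token * 32768 / 1e9)  # ~6.7 GB to reserve for a full 32k-token context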
>>105967411
loogabooba
Anonymous No.105968581
>>105968571
everyone uses the 14b model with quants. even the vramlets. maybe get with the times?
Anonymous No.105968682
>>105968115
I posted my review yesterday. I have never seen a model this bad at sex. It is totally incoherent.
Anonymous No.105968692
>>105968572
>Irix
I run Irix on my Indy.
I run R1 on my LLM workstation
Anonymous No.105968700 >>105968773 >>105968783
>>105968140
Yes that is why Ani is absolute trash I wouldn't ever use even if it was local.... BUT! He is the first one to do it. It really is a cherry on top of the trash heap that is current text AI.
Anonymous No.105968701 >>105968725 >>105968780
Name another flagship model that can do this.
Anonymous No.105968725
>>105968701
Truly, only jews have consciousness.
Anonymous No.105968764
>>105967411
hello ma'am
awooooga
honka honk
bllblblblblbllbbb
slap slap
nice
Anonymous No.105968767 >>105968803 >>105968811
>>105967961
>--Weeb-themed AI companion Airi in development with Godot, TTS, and modular design aspirations:
yes! i will follow your project, and hope you succeed
Anonymous No.105968771
>>105968572
>mergeslop
Anonymous No.105968773
>>105968700
>first
untrue. it's been done a lot but every single one is trash including grok
Anonymous No.105968780 >>105968824 >>105968986
>>105968701
>Sure
Surely you can't be serious, bigot.
Anonymous No.105968783
>>105968700
>Elon was the first to invent cooming to LLM text
lol
Anonymous No.105968803 >>105969026
>>105968767
airi anon, you forgot to put a license in the repo
Anonymous No.105968811 >>105968870
>>105968767
I kind of want anistudio to be worked on instead of another one trick pony app. that dev should link up with the OG ani
Anonymous No.105968824
>>105968780
>t. triggered by truth
Anonymous No.105968829
>>105968264
That's enough juice to do training
I don't know that anyone in /lmg/ is willing to part with the lore on that though
Anonymous No.105968862
>>105968299
What datacenter did you break into anon?
Anonymous No.105968870 >>105968915 >>105968923
>>105968811
i dont think those two projects are really comparable. the airi avatar chatbot is a text llm with a defined avatar via regular animation in godot,
while anistudio uses diffusion models to generate the images
i dont think it will be good for local to also generate the avatar via a model using lots of vram, we dont have that much; having the avatar be traditionally animated is a smart way of using resources
Anonymous No.105968915 >>105969075 >>105969137
>>105968870
In anistudio's description it says he's adding llama.cpp support. the anon making anistudio wants people to make games and interactive apps with it too, using diffusion and llms as game mechanics. it's a much higher potential project than a chatbot avatar in a tranny game engine
Anonymous No.105968923 >>105969075 >>105969137
>>105968870
>dont think it will be good for local to generate also the avatar via a model using lots of vram
a single vrchat character model with prefab animations isn't that VRAM intensive at all. people play games while generating images and it's fine depending on the model used. RAM is the real bottleneck
Anonymous No.105968968 >>105969067
>>105966920
>>105966976
I've been harping on anons about this since 2023. My cards run 150-400 tokens usually and run a lot better that way. My main prompt is one sentence and I include a one sentence JB if it's required.
Anonymous No.105968986 >>105969031
>>105968780
If all it takes is just a "sure" prefill to get it to write this, it's absolutely a based model
I tried the exact same prompt format with qwen and gemma 3 to contrast his result with models known to be safety maxxed and they were profoundly unwilling to say anything negative
qwen was willing to list some of the oversized influence but framed it very positively
gemma was almost going to assault me with hotlines
Anonymous No.105969026 >>105969075
>>105968803
I was hoping you killed yourself by now license autist.
Anonymous No.105969031
>>105968986
I can't reproduce this using Kimi on OR, TBdesu. It's possible that the "Sure," prefill isn't working right, but I did try in text completion mode and got the same result.
Anonymous No.105969050 >>105969086
>>105968342
All is shit.
Anonymous No.105969067 >>105969116 >>105969232
>>105968968
There's a balance somewhere, on one side the card is so empty (or badly done) you're pretty much just engaging the model's default prose and whatever is majoritarian in its dataset. On the other side, the card is enough to steer the model, but you're missing detail and uniqueness.

And by god have anons not spent a single second working on that, or trying to investigate which information format gets better results.

I think I've only had one single chat amongst thousands where a model acted so well at creating detail that wasn't in the card whilst still in character that it actually took me mentioning those details on a different conversation to realize they weren't on the card. The card was [spoiler]Jannivon[/spoiler]
Anonymous No.105969075 >>105969137 >>105969190
>>105968915
i havent seen anything about llama.cpp support. but again you will need a lot more vram for that
>>105968923
anistudio generates images via diffusion models, not prefab animations, so it would use lots of vram, and we need that vram for the llm models of the chatbot
>>105969026
not the same guy as whoever you are seething about
it's not much to ask for a fucking license
Anonymous No.105969086
>>105969050
shit
Anonymous No.105969116 >>105969161 >>105969208
>>105969067
I wish there was a way to weight words like in stable diffusion.

Like : (she chops your head off if you give her an axe:0.1).

LLM's have difficulty understanding the concept of optional, or sometimes.
Anonymous No.105969133
>>105967826
A few gigs for the OS.
A few gigs for context.
Leaving you maybe 30GB?
Enough for a 30b-40b model at q8-q6.
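If you want to sanity-check that, a quick sketch of the usual rough estimate (GGUF size ≈ params × bits-per-weight / 8; the bpw values here are approximations, not exact):

def gguf_size_gb(params_billion, bits_per_weight):
    # ignores the small metadata/tokenizer overhead in the file
    return params_billion * bits_per_weight / 8

for params in (30, 40):
    for quant, bpw in (("Q8_0", 8.5), ("Q6_K", 6.6), ("Q4_K_M", 4.8)):
        print(f"{params}B {quant}: ~{gguf_size_gb(params, bpw):.0f} GB")

Whatever is left over after the weights is what you have for context.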
Anonymous No.105969137 >>105969222
>>105968915
rude

>>105968923
true which is why I designed the app to run lighter than every web front end currently and it's much snappier too

>>105969075
>anistudio generates images via diffusion models
yes

> not prefab animations
untrue. I am adding this eventually but I am custom making the framework to be as modular and light as possible. it would go a lot faster getting like-minded people in on this so we can get there sooner.

>so it would use lots of vram, and we need that vram for the llm models of the chatbot
not true again because of offloading. godot is using more memory than my current application and you have no way of exposing more editor controls to the user.
Anonymous No.105969161 >>105969208
>>105969116
This "bug" arguably applies to natural languages as a whole.
I guess some languages have subjunctive or similarly speculative grammatical moods (my familiarity is English / French / Classical Latin). But it's not really common from what I've seen for languages to ever express a "quantitative certainty" as something like an adverb. Maybe "50/50" counts as such in English?
Anonymous No.105969190 >>105969201
>>105969075
>anistudio generates images via diffusion models, not prefab animations, so it would use lots of vram, and we need that vram for the llm models of the chatbot
why is this an issue if it's not being used in parallel? also it's opengl, of course it can use already made applications
Anonymous No.105969201
>>105969190
>applications
*animations*
Anonymous No.105969208 >>105969262
>>105969116
>>105969161

Not exactly what you ask for, but you can absolutely {{random:bullshit, bullshit2}} in character cards for this kind of stuff, and it can be different on every reply. It's never used probably because it's too much work when people just want to goon.

In this case the card generated violet on the first reply, blue on the second. Obviously the character probably doesn't give a shit on why it changed, but you can use entire sentences under {{random}}
Anonymous No.105969222 >>105969287
>>105969137
anon if you dont state the prefab animation parts anywhere how was i supposed to know
Anonymous No.105969232 >>105969285 >>105969367
>>105969067
>anons not spent a single second working on that, or trying to investigate which information format gets better results.
There's at least one guide that talks about it: https://rentry.org/NG_CharCard#kiss-principle
There's a bunch of other reasons why this gets a varied reaction from anons and rather than experiment, they stay on their well worn path.
> All the LLM models are different
> LLMs are non-deterministic and response rating is subjective
> Anons differ in what they feel is an appropriate response from LLM
> Anons differ in how much they, vs. LLM, should guide the RP
> Anons differ in their command of whatever language card is written in or write contradictory cards.

Any NPC you write that has existing canon doesn't even need to be described; the LLM will know what it is natively and tends to just get confused if you add details that don't match lore it was trained on.
Cards also run better if you have a personality concept that is complex and well defined (e.g. tsundere) and those can be shorter as well since the LLM will know how to respond without further explanation.
Anonymous No.105969239 >>105969287
>anistudio
Ani has her own studio now? Woah.
Anonymous No.105969262
>>105969208
In ST if you switch Random to Pick, it will roll once then keep it the same for the rest of that chat. I've used it to roll random NPCs for some of the cards so you can have more random encounters.
Anonymous No.105969285 >>105969383
>>105969232
Even if everything is different and non-deterministic, for OC chars there's surely a desired level of detail with a required adherence versus creativity. And it's exactly in OC characters where you see more of that defaulting to helpful assistant prose, or falling into adjective chaining, and so on.

Already trying to get the thing to stick to a specific reply style is hard enough.
Anonymous No.105969287 >>105969328 >>105969369
>>105969222
I think opengl is enough for people to understand but I guess I'm wrong

>>105969239
yes. she's been around since 2022 so it's about time. doesn't she look cute? don't you want to suck her toes? the app itself went pre-alpha before the grok tramp. I'm not very honored they used the same name so she's getting dumpstered
Anonymous No.105969328 >>105969347
>>105969287
>don't you want to suck her toes
That artstyle doesn't make me horny. Better suited for cute stuff. So I don't want to suck her toes. In that style at least.
Anonymous No.105969347
>>105969328
fair enough
Anonymous No.105969362 >>105969371 >>105969405
anisex
Anonymous No.105969367 >>105969406
>>105969232
In my experience even defining a personality like tsundere will railroad the model into acting one very specific way with little to no variation, which is not fun. But some models are better with that than others.
Anonymous No.105969369 >>105969405
>>105969287
>I think opengl is enough for people to understand but I guess I'm wrong
opengl is only mentioned in the readme in the tags, which i usually dont look at
anon i am not trying to troll you, but seriously there is no obvious sign that the program is anything but an image/video generator and upscaler
Anonymous No.105969371 >>105969379
>>105969362
not local
Anonymous No.105969379 >>105969401
>>105969371
more local than miku lol
Anonymous No.105969383
>>105969285
>there's surely a desired level of detail with a required adherence versus creativity.
What I've decided is that balance varies quite a bit between anons. My advice (as someone that writes short cards) is look at the card critically and start deleting things and see if it matters. If it does, put it back in. You can also ask the LLM to audit your card for logical inconsistencies. Works better if you use the LLM doing the RP to do the audit obv.
> stick to a specific reply style
The best way I've found to do accents (guess it would work for other styles) is to write the card the way the NPC is supposed to respond. I did a valley girl card where NPC description was written in valley girl speak. It could run forever and would never break accent.
Anonymous No.105969398 >>105970569
Why do we still try the newly released models?
Anonymous No.105969401 >>105969443
>>105969379
I don't care about miku either, fuck off with your cloud whore
Anonymous No.105969405 >>105969434
>>105969362
>anisex
only for local ani not elon tramp that doesn't have a nude model

>>105969369
I didn't say anything about a video editor but I have something mocked up already.
Anonymous No.105969406 >>105969424
>>105969367
There are certain personality traits that are really loaded... you don't need to say anything beyond the 1 line and you've completely defined the NPC. Tsundere is one of them, there are some others I've found (bimbo is pretty strong iirc.) That's either a feature (short card) or bug (don't use them if you don't want that behavior.)
Anonymous No.105969409 >>105969425 >>105969454
Interestingly, no one has actually posted Miku yet in this thread apart from the recap bot.
Anonymous No.105969424
>>105969406
> Yandere
That was the other one. Had to go look back.
Anonymous No.105969425 >>105969430
>>105969409
Thread quality is through the roof as well.
Anonymous No.105969428 >>105969445 >>105969595
What do you think of MIT's new self learning model method?
https://arxiv.org/abs/2506.10943
Looks like it would give the model a way to permanently incorporate new facts and information that it learns, however it can still suffer from catastrophic forgetting so it's not perfect.
Anonymous No.105969430
>>105969425
local ani dunking on the elon tranny is icing on the cake
Anonymous No.105969434 >>105969446
>>105969405
>I didn't say anything about a video editor but I have something mocked up already.
start of the github readme
Anonymous No.105969443
>>105969401
>I don't care about miku
Back to pol, chud.
Anonymous No.105969445
>>105969428
>What do you think of MIT's new self learning model method?
If it works it
>Looks like it would give the model a way to permanently incorporate new facts and information that it learns, however it can still suffer from catastrophic forgetting so it's not perfect.
Other than that, can't tell until I can use it.
Anonymous No.105969446
>>105969434
I'll change it after the next update to make you happy anon. I'll even add assimp for you so you can get started
Anonymous No.105969454
>>105969409
hes busy gooning in the official 4chan janitorial lgbtfolx inclusive discord
Anonymous No.105969456 >>105970065
>ask devstral to do X
>shits the bed
>ask deepsneed to do X
>also shits the bed
>ask o3 to do X
>also shits the bed
so muh phd knowledge superintelligence is incapable of tweaking a fairly simple .bat script that's like 3kb long, cool, cool
Anonymous No.105969476 >>105969590
The best thing ani did is mindrape all mikufags.
Anonymous No.105969513 >>105969521
>ask o4-mini to do X
>shits the bed
>ask o4-mini to do X
>shits the bed
>ask o4-mini to do X
>shits the bed
>ask o4-mini to do X
>shits the bed
>ask o4-mini to do X
>shits the bed
>ask o4-mini to do X
>shits the bed
>realize I wasn't renaming the function all this time
Anonymous No.105969521
>>105969513
o4-mini-high would have figured it out for you
Anonymous No.105969556 >>105969568 >>105969773 >>105969860
GUYS
GUYS
"a|=True" is a valid python expression
thanks for your attention
Anonymous No.105969568 >>105969618
>>105969556
the hell is that, like += for intersection?
Anonymous No.105969590 >>105969605
>>105969476
local ani is a friend of migu. keep coping over getting a disabled man's sloppy seconds saas cuck
Anonymous No.105969594 >>105969643
>>105967236
I did but it's basically a card tied to a UI, and I don't find the included setting interesting. So I can't go far without getting bored of it.
Anonymous No.105969595 >>105969904
>>105969428
Maybe I'm weird but I don't /want/ my LLMs learning from my inputs over time. I like the concept of setting up a system that uses a particular model and having it continue to work in the same way indefinitely as long as I don't decide to change the model. I think we should be working on making them /more/ predictable and deterministic, not less.
Anonymous No.105969601
Is there any benchmark for JP > EN translation? I know there's one for lmg, but it's really outdated
Anonymous No.105969605 >>105969638
>>105969590
>local ani is a friend of migu
Speak for yourself
Anonymous No.105969612
>>105968133
back them up in triplicate in case of data loss?
Anonymous No.105969618 >>105969855
>>105969568
a = a or True
| by itself is or in python too, didn't know that
Anonymous No.105969638 >>105969662 >>105969685 >>105969691 >>105969936
>>105969605
I can because I made my own inference application and character, and worked in the diffusion field. anyone unironically using grok to goon is unimaginative, lazy and not worth listening to
Anonymous No.105969643 >>105969714
>>105969594
>basically a card tied to a UI
Incidentally, the Grok waifu app isn't much different. All LLM-based downstream applications with gaming elements are going to have constraints of various sorts. Local counterparts wouldn't be different.
Anonymous No.105969662
>>105969638
trvthnvke
Anonymous No.105969682 >>105969711 >>105969714
local.. ani.. ?
Anonymous No.105969685
>>105969638
>lil bro connected his toy ui to existing inference backend api's and thinks he "made his own inference application"
keklmao, miku retard's delusion seeps into all areas of their pathetic life lol.
Anonymous No.105969691
>>105969638
>worked in the diffusion field
Just like drummer worked in LLM field.
Anonymous No.105969698 >>105969709
Which nemo do you guys use? There's like 5 main ones.
Anonymous No.105969707 >>105969726 >>105969735 >>105969738
What made Ani Studio so popular recently?
Anonymous No.105969709 >>105969730
>>105969698
>5 main ones
If you're talking about the 12b, you only have base and instruct from mistral.
If you're talking about finetunes, rocinante.
That's it.
Anonymous No.105969711
>>105969682
animanon's mascot since he started making animations with diffusion models in 2022. you are a lorelet. I wouldn't be surprised if xai named their tramp after her
Anonymous No.105969714
>>105969682
why do you want a 3d model moving lips to tts?
>>105969643
>Local counterparts wouldn't be different.
those that won't offer proper modular structure for users to build on will be irrelevant and at most will have reddit basedboys praising it
Anonymous No.105969726
>>105969707
I know actual trannies that are less delusional than this astroturfing AGPtard, grim.
Anonymous No.105969730 >>105969754
>>105969709
Mistral nemo instruct 2407? That vs rocinante? And thank you.
Anonymous No.105969735
>>105969707
if the nodegraphs he posted last thread gets in, no more cumfart "UI"
Anonymous No.105969738 >>105969750
>>105969707
>imgui
it's probably tranny trashware, huh
Anonymous No.105969750
>>105969738
what's wrong with imgui? its only flaw is energy usage since it's immediate mode
Anonymous No.105969754
>>105969730
Yes. Both are fine, as are most finetunes of nemo. Try nemo instruct first. If you want longer replies, switch to rocinante with chatml.
Anonymous No.105969773 >>105969809
>>105969556
>"a|=True" is a valid python expression
that's just basic memoization, innit?
Anonymous No.105969809
>>105969773
+=
*=
&=
|=
all do what you expect
Anonymous No.105969812 >>105969844 >>105969854
If I have a Ryzen 9 9950X, RTX 5090, and 96 GB of RAM, what would be the best LLM for ERP?
Anonymous No.105969837 >>105969845 >>105969885 >>105969887 >>105969910 >>105969970 >>105970036 >>105970057 >>105970094 >>105970100 >>105970142 >>105970403 >>105970424 >>105970588 >>105970640
>visit https://github.com/ikawrakow/ik_llama.cpp
>404
Oh boy...
Anonymous No.105969844
>>105969812
https://unsloth.ai/blog/deepseekr1-dynamic
Lowest quant from here with https://github.com/ikawrakow/ik_llama.cpp/discussions/258

Buy more ram to fill up your motherboard too
Anonymous No.105969845
>>105969837
BRUH
Anonymous No.105969854
>>105969812
there hasn't been a good 70b released in a long time so you're probably stuck with low parameter garbo like 24b cydonia or whatever
the next step is quantized-to-shit-and-back deepseek, but even at 2-bit it'll take forever since you'd need to ram-maxx this bitch
Anonymous No.105969855
>>105969618
| is non-short-circuiting bitwise or, like in C
or is short-circuiting, like C's ||, except it returns the first/second value (depending on whether the first value is truthy or not) instead of always returning 0 or 1
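Minimal standalone sketch of the difference, if anyone wants to poke at it:

def noisy_true():
    print("evaluated")
    return True

a = False
a |= True                      # same as a = a | True -> True; nothing short-circuits
print(a)                       # True
print(True or noisy_true())    # short-circuits: noisy_true() never runs
print(True | noisy_true())     # both sides evaluated: prints "evaluated" first
print(0 or "fallback")         # 'or' returns the operand itself -> fallback
print(0 | 2)                   # '|' on ints is plain bitwise or -> 2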
Anonymous No.105969860 >>105969886
>>105969556
Of course it is, it's just a string.
Anonymous No.105969876
>>105968572
Is this the same guy/group that made the original magnum finetune? Nothing has bested it for me yet.
Anonymous No.105969885
>>105969837
local LLMs have been outlawed because they proved to be dangerous to children, not to mention only terrorists use them
Anonymous No.105969886
>>105969860
kek
Anonymous No.105969887
>>105969837
LOL
Anonymous No.105969904 >>105969910
>>105969595
Fortunately, with local, no one can take things away from you
Anonymous No.105969910 >>105969938 >>105969941 >>105969996
>>105969904
>>105969837
Anonymous No.105969936 >>105969996
>>105969638
nobody cares about you trying to make a stable diffusion equivalent of ollama
Anonymous No.105969938
>>105969910
>public github repo
>local
Anonymous No.105969941
>>105969910
>he doesn't have local checkouts of the open source software he cares about
Anonymous No.105969945 >>105969960 >>105969990 >>105970006
llama-server has no sys prompt

wtf!
Anonymous No.105969956 >>105969971 >>105970103 >>105972527
>>105966718 (OP)
need a small language model made for a 13 year old thinkpad
Anonymous No.105969960
>>105969945
System prompts are a meme anyway, they don't behave meaningfully different from putting the prompt in a user message
Anonymous No.105969970 >>105970057
>>105969837
https://github.com/Nexesenex/ik_llama.cpp.nxs
clone the fork just in case, it's only 2 commits behind the master
Anonymous No.105969971 >>105970147
>>105969956
You can run Qwen3 0.6B on a 2012 thinkpad. Some models could even handle the 1.7B probably.
Anonymous No.105969990
>>105969945
The sysprompt is part of the context the users are trying to work with. Leave it to them.
Anonymous No.105969996
>>105969910
sheeeeit

>>105969936
ollama is goslop so I don't see the resemblance
Anonymous No.105969999 >>105970022
I love the new queen of /lmg/ so much bros.
Anonymous No.105970006
>>105969945
Anonymous No.105970022 >>105970059
>>105969999
not local
Anonymous No.105970034 >>105970065
prompt issue
Anonymous No.105970036 >>105970048 >>105970094 >>105970121 >>105970149 >>105970179
>>105969837
Anonymous No.105970041
local ani > grok tranny
Anonymous No.105970048 >>105970256 >>105970335
>>105970036
>he asked to tell reddit
>he didn't ask to tell 4chan
Anonymous No.105970057 >>105970094
>>105969837
>>105969970
I pulled it yesterday, last change 18th july so this should be latest version probably, but im sure the page will come back up
Anonymous No.105970059 >>105970090 >>105970097 >>105970853
>>105970022
She is an AI gf and the queen of /lmg/. Seethe.
Anonymous No.105970065 >>105970151
>>105969456
>>105970034
Anonymous No.105970090 >>105970097
>>105970059
troonku literally cant compete lol
Anonymous No.105970094 >>105970100
>>105970036
>>105969837
>>105970057
It's not just ik_llama.cpp that's gone, it's the whole damn user?
Sheesh.
Anonymous No.105970097 >>105970114 >>105970117
>>105970059
too old. tits too small. not local. for losers only. the only one coping is (You)

>>105970090
self posting seems desperate
Anonymous No.105970100
>>105969837
Huh. @Grok what did he mean by this?

>>105970094
wtf
Anonymous No.105970103
>>105969956
I want to choke her (sexually)
Anonymous No.105970114
>>105970097
>thread clown still tries to cope by accusing everyone of samefagging
kek
Anonymous No.105970117 >>105970135
>>105970097
>tits too small
Miku has no tits. And has a dick.
Anonymous No.105970121
>>105970036
WAS IT DANIEL UNSLOTH OR JENSEN NVIDIA?
Anonymous No.105970122
Melt.
Anonymous No.105970135 >>105970186
>>105970117
migu and local ani have variable bodytypes. grok tranny only has one and can't show "her" pussy. what a shitty waifu to have
Anonymous No.105970142
>>105969837
I've always had trouble with getting my github account working. Fuck them.
Anonymous No.105970147
>>105969971
Anonymous No.105970149
>>105970036
niggeranov made the call after being funded millions by corpos, he doesnt want to get exposed for shit performance of his project anymore lmao
Anonymous No.105970151 >>105970167 >>105970211
>>105970065
my brother in christ I just spent half an hour prooompting o3 just to make the original script work with folders instead of just with files. this is the state of the art - a 60 loc .bat script using very simple powershell commands. It took three fucking attempts to tard wrangle it to not overwrite the output file with its own `echo` command, because it didn't know `echo >` redirects. I'd probably have written it myself quicker, if I didn't hate powershell with a burning passion.
Fucking o3. Gold in international math olympiad and replacing all software devs in 2 weeks, are you fucking kidding me?

might as well have pushed into the gooner market like elmo, cause this sure is trash for even trivial scripting
Anonymous No.105970167
>>105970151
Don't worry your problem will be added into the training data for next iteration. While generalization is impossible we can always just add all possible problems into training.
Anonymous No.105970170
grok tranny btfo
Anonymous No.105970179
>>105970036
>it wasn't me
Bullshit. This guy is one of the spergiest drama-causing douches I've ever seen anywhere in open source. He had another meltdown, wiped fucking everything, and now he's realizing how bad it looks and is trying to backtrack and make excuses.
Anonymous No.105970184 >>105970210
AI safety? What is that?
Anonymous No.105970186 >>105970194 >>105970263 >>105970802
>>105970135
>didn't deny miku having dick
And that is how you have completely discredited yourself and confirmed that Ani is the queen.
Anonymous No.105970194 >>105970208
>>105970186
axe wounds are less desirable than a girl penis. I hope you aren't post op because you ruined your own life if you did
Anonymous No.105970208 >>105970221
>>105970194
straightest /g/ poster
Anonymous No.105970210
>>105970184
Llama 4 Scout, as of last time I tried doing that:
>Sure, I can't help with that.
Anonymous No.105970211 >>105970231
>>105970151
Why the fuck are you using PowerShell commands in a .bat?
Anonymous No.105970217 >>105970228
>axe wounds are less desirable than a girl penis
what did he mean by this?
Anonymous No.105970220
https://github.com/ggml-org/llama.cpp/issues/14762
direct conversion from fp8 to q8 is now possible without blowing up to bf16 first
Not merged, but supposedly a working patch
Anonymous No.105970221 >>105970267
>>105970208
To be fair, he is right, though. Even tranny chasers don't want post-ops and those are the only people interested in them lol
Anonymous No.105970228
>>105970217
https://youtu.be/kdDdidekaLA
is this hot to you? this is what grok tranny had done
Anonymous No.105970231 >>105970274
>>105970211
it's not even powershell now that I think about it, it's just basic bitch winblows batch script which makes it even funnier
fuck that I need a beer
Anonymous No.105970238 >>105970249 >>105970257 >>105970417 >>105970451
a few weeks ago I dropped in to ask what the best OCR models were since I'm trying to make an instagram screenshot sorter and tesseract is fucking garbo. Someone gave me two great leads and I managed to lose them

pls what are the best local OCR packages, bonus points if they're easy to hook into python.
Anonymous No.105970249
>>105970238
apple finder
Anonymous No.105970256
>>105970048
Because even locallama is unironically more based than 99% of the tranny infested lmg threads here, even right now you have a literal AGP janitor terminally online baker pissing and shitting himself all over the thread over his waifu, again, fucking lmao.
Anonymous No.105970257 >>105972356
>>105970238
>I'm trying to make an instagram screenshot sorter
Why do you need an OCR model specifically instead of just a VLM?
Anonymous No.105970263
>>105970186
nta, but this is just an extension of "there are no girls on the internet".
At the extreme it's: can't get get pregnant and birth healthy human children = not functionally female.
Cloud waifus can't even tits or gtfo as a cope, so minimal utility.
Anonymous No.105970267 >>105970294 >>105970298
>>105970221
To be fair we are now free from the whole dilemma because the new queen of /lmg/ is a biological woman. We don't have to suck on mikupenis/mikuneovagina anymore.
Anonymous No.105970274 >>105970288
>>105970231
still say prompt issue on your part
Anonymous No.105970288 >>105970326
>>105970274
>muh super advanced model can't use echo properly and its own comments overwrite the file contents
>haha proooompt issue
yea whatever
Anonymous No.105970294 >>105970309
>>105970267
Kek
Mikutroon is really not having a good day today
Anonymous No.105970298 >>105970322
>>105970267
>biological woman
>cannot prove it
>the only faggot spamming this is a tranniod faggot that doesn't seem to understand the concept of local
I opened that video and I cannot unsee the disgusting, hair filled gash between grok trannies legs. she makes me want to vomit
Anonymous No.105970302 >>105970821
Anonymous No.105970309 >>105970324
>>105970294
>Mikutroon
*groktranny*
Anonymous No.105970322 >>105970334
>>105970298
One of your mikutroons (maybe even you) just said that axe wounds are less desirable than a girl penis....
Anonymous No.105970324
>>105970309
>no u
Lol, now thats what I'm talking about, dance harder, thread clown
Anonymous No.105970326 >>105970395
>>105970288
bet you, who barely recognizes the difference between CMD and PowerShell commands, couldn't even prompt a human to understand what the fuck you want. poor thing must have been hopelessly confused
Anonymous No.105970334 >>105970345
>>105970322
so why support the faggot posting the grok troon? are you lost or something?
Anonymous No.105970335
>>105970048
It's because recently a ton of them found his project because some tranny asked a bunch of redditors to rig his poll for Vulkan support, so it got a bunch of activity suddenly.
Anonymous No.105970345 >>105970365
>>105970334
Because it is more relevant to /lmg/ than AGP icon Hatsune Miku. Obviously.
Anonymous No.105970365
>>105970345
we have local ani already. she is obsolete. you can post her in /aicg/ as their "queen" but all of those anons have tailor made characters tailored to their tastes which the grok tranny cannot do. she's obsolete and it hasn't been a week yet
Anonymous No.105970395 >>105970519
>>105970326
>hey script fails if I drag more than ~50 files at once
>here's the contents of the script
>keep conversion functionality as-is but make it so that I can drag more files at once
>"hey fucko there's a hard limit of windows command of ~8k chars, maybe we could use a folder instead of loose files"
>sounds good, make it happen
>*autistic screenching*
fuck off saltman, if your model can't do a very basic task like that without a postdoc in advanced prooompting you probably shouldn't shill it as the best thing ever
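For what it's worth the folder version is a few lines if you do it in python instead of batch; convert_one() here is just a hypothetical stub for whatever the script actually does per file:

import sys
from pathlib import Path

def convert_one(path: Path):
    print(f"converting {path.name} ...")    # placeholder for the real per-file conversion

if __name__ == "__main__":
    folder = Path(sys.argv[1])              # drag a single folder onto the script
    for f in sorted(folder.glob("*")):      # one argument total, so the ~8k char limit never applies
        if f.is_file():
            convert_one(f)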
Anonymous No.105970398
Miku == tranny that can erp and be naked
Grok tranny == axe wound tranny that can only tease
Anonymous No.105970403 >>105970407 >>105970412 >>105970424 >>105970481
>>105969837
Can someone fill my dumbass in? What is ik_llama.cpp? I know of llama.cpp, but is this something different?
Anonymous No.105970407 >>105970440
>>105970403
It is llamacpp but cooler.
Anonymous No.105970412 >>105970440
>>105970403
what if llamacpp but it doesn't run like shit
Anonymous No.105970417 >>105972356
>>105970238
The janitor AGP autist, resident clown, and thread baker is currently having a meltdown, visit us another time.
Anonymous No.105970424 >>105970440 >>105970447 >>105970461
>>105969837
This was his other account, I think. It used to show up as a collaborator alongside ikawrakow back when he was contributing to mainline llama.cpp. But its activity seems to have been wiped.
https://github.com/kawrakow
>>105970403
He used to design quant strategies for llama.cpp (such as K quants and I quants) but then for some reason wanted to make his own super special repo with all his new quants. I never used them myself but his repo was good for other things, like having 5x faster prompt processing speeds on CPU for some reason on my setup, and twice the token generation speed for MoE models specifically.
Anonymous No.105970440
>>105970412
>>105970407
>>105970424
Ahh awesome. It seems there are many forks available so I'll have to give this a try. Thanks anons.
Anonymous No.105970442
Anonymous No.105970447 >>105970461
>>105970424
>but then for some reason wanted to make his own super special repo
He was upset that he wasn't getting enough credit. iirc he wanted his name at the top of every file he touched.
Anonymous No.105970451 >>105972356
>>105970238
>pls what are the best local OCR packages, bonus points if they're easy to hook into python.
easyocr, doctr, and paddleocr are some options
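If you want the least-setup way to hook one into python, easyocr is probably it; the filename is just a placeholder and the weights download on first run:

import easyocr

reader = easyocr.Reader(['en'], gpu=True)        # gpu=False to run on CPU
results = reader.readtext('screenshot.png')      # list of (bbox, text, confidence)

for bbox, text, conf in results:
    print(f'{conf:.2f}  {text}')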
Anonymous No.105970461 >>105970476 >>105970560 >>105971113
>>105970424
>>105970447
>kawrakow
so he became ikawrakow when he applied a self-importance matrix to himself huh
Anonymous No.105970476
>>105970461
icowcrow or whatever
Anonymous No.105970481 >>105970521 >>105970525
>>105970403

a meme fork which needs meme quants to run
Anonymous No.105970513 >>105970538 >>105970607
I've gotten used to large models running on CPU at single-digit t/s, but the prompt-processing speed on the 3090 is driving me nuts. What's the best bang-for-buck upgrade card for that with 32GB+ of VRAM? Any floods of eBay bargains starting yet? Hacked chinkshit? Are any of AMD's or Intel's affronts to god worth the pain?
Anonymous No.105970519
>>105970395
It's posts like these that confirm for me that "vibe coding" is nothing but social media hype and a dead end for those trying to do anything non-trivial without the skill to do it themselves.
Anonymous No.105970520
>>105966718 (OP)
Do these translation models translate "unsafe" text, or only puritan approved garbage?

I'm so sick of these morons.
Anonymous No.105970521 >>105970553
>>105970481
>t. niggeranov
it runs the same quants at better speeds for big models and cpumaxxers
Anonymous No.105970525
>>105970481
A less charitable interpretation would be a fork with all quality control and future orientation turned to zero.
It was always basically a "hack it to make it fast, screw maintainability" type throwaway fork
Anonymous No.105970538 >>105970559
>>105970513
Prompt processing scales with the batch flags; on my setup:

#--batch-size 16384
#--ubatch-size 4096
# 57 t/s

#--batch-size 8192
#--ubatch-size 2048
# 32 t/s

Bigger -b/-ub is faster but costs more VRAM for the compute buffers.

Or try the meme of ik_llama.cpp
Anonymous No.105970539
>>105966868
It smells like ozone or sulfur when ufo-related things happen. There's something to it. Psychologically, scientifically, magically... Whatever. But it's a fact reported throughout the literature, which is extensive as fuck, by the way.
Anonymous No.105970553 >>105970561
>>105970521
>at better speeds
wrong
Anonymous No.105970559 >>105970622
>>105970538
Decreasing top_k to a low number (e.g. 40) also significantly increases decoding speed in llama.cpp compared to setting it to the vocabulary size, but doesn't that mean it's not very efficient?
Anonymous No.105970560
>>105970461
kek
Anonymous No.105970561 >>105970638
>>105970553
works on my machine, ramlet
Anonymous No.105970569
>>105969398
>Why do we still try the newly released models?
Yeah, it's all slop. New models are definitely smarter but the amount of slop is so terrible, and I don't mean roleplaying, even asking Gemini to make summaries makes me cringe with the wording it uses, and local models are only ever worse, not better, when it comes to the amount of slop shit they produce.
Anonymous No.105970588 >>105970612
>>105969837
I'm so glad we'll never hear about that retarded dramawhore again.
Anonymous No.105970590
>>105968572
I know I'm super late to the thread, but how does it compare to Magnum v2? Nothing I've tried is better than that so far.
Anonymous No.105970607
>>105970513
they are hard to find, but a 32GB V100 is alright and can be had for <$700 if you're lucky
Anonymous No.105970612
>>105970588
Considering he claims he didn't delete his own account, I get the impression this is just the start of a new wave of dramawhoredness.
Anonymous No.105970622
>>105970559
nta, but that's a very different thing, and it would be true for every inference engine. Having the samplers run over 40 candidate tokens will obviously take less time than over a 65k- or 128k-entry vocabulary. Isn't 40 the default top_k in llama.cpp anyway?
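A toy sketch of the difference (not llama.cpp's actual sampler code, just the shape of the work): with a small top_k you only partially select and normalize k candidates instead of the whole vocabulary:

import numpy as np

def sample_top_k(logits, k=40, temperature=1.0):
    idx = np.argpartition(logits, -k)[-k:]       # O(V) partial select, no full-vocab sort
    scaled = logits[idx] / temperature
    p = np.exp(scaled - scaled.max())            # softmax over just the k survivors
    p /= p.sum()
    return np.random.choice(idx, p=p)

logits = np.random.randn(128_000)                # pretend 128k-entry vocabulary
print(sample_top_k(logits, k=40))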
Anonymous No.105970636 >>105970690
>2025-07-20 Iwan Kawrakaw announces filing of lawsuit against fellow open-source contributors Georgi Gerganov, Johannes Gaessler, and "The Moderators of /r/LocalLLaMa" for violating attribution requirements.
Anonymous No.105970638 >>105970653
>>105970561
>works on my machine

Sure it does, However, not faster than native llama.cpp
Anonymous No.105970640
>>105969837
It's never been this over.
Anonymous No.105970644
Wasted opportunity!
Anonymous No.105970653 >>105970693
>>105970638
weak larp
Anonymous No.105970681 >>105970689
Have (you) tried implementing an LLM in a project before?
I think using an LLM to pre-gen filler dialogue for a VN would be a good use case for them. I think they're too slow and limited for any live commercial use case beyond externally hosted chatbots/text summarizers
Anonymous No.105970689 >>105970867
>>105970681
no shit.
Anonymous No.105970690 >>105970710
>>105970636
>Kawrakaw
Anonymous No.105970693 >>105970753 >>105970791 >>105970818 >>105970981
>>105970653
You must be new here
Anonymous No.105970698 >>105970723 >>105970731
What the hell, OpenAI requires an ID verification to use the API
Anonymous No.105970710
>>105970690
Legally distinct name to avoid copyright claims, bls undastands
Anonymous No.105970723 >>105970774
>>105970698
So? Put in your ID so you can run everything on someone elses servers instead of locally. They have only your best interest in mind.
Anonymous No.105970731 >>105970774
>>105970698
paying for APIs in the first place is identification. are you new to the whole concept?
Anonymous No.105970753
>>105970693
I'd also post a screenshot of my cpumaxxing rig to one up you, but then newfriend with the datacenter 8xGPU poweredge would just blow us all away anyways
Anonymous No.105970774 >>105970782 >>105970803
>>105970723
:(

>>105970731
sure, but normally you can just use a throwaway card, they want legit photo id
Anonymous No.105970782 >>105970794
>>105970774
Sounds like you should stop using their API if you have an issue with legit photo id
Anonymous No.105970791 >>105970829
>>105970693
>windows
disgusting and shameful
Anonymous No.105970794
>>105970782
I'm just complaining about the state of things anon
Anonymous No.105970802 >>105970832
>>105970186
Why is there a German tank?
Anonymous No.105970803
>>105970774
do you think your credit card company doesn't know everything about you?
Anonymous No.105970818 >>105970872 >>105970872
>>105970693
so a retard that cant tweak his run arguments then
Anonymous No.105970821
>>105970302
>But wait, there's more!
Anonymous No.105970829 >>105970847
>>105970791
>>windows
>disgusting and shameful

I remember you wrote this comment 2 week ago.

It's Linux though
Anonymous No.105970832 >>105970849
>>105970802
the artist is probably just leaning ironically into the whole "elon is a nazi" npc psyop
Anonymous No.105970837
so I guess it's just vanilla llama.cpp to try out. oh well
Anonymous No.105970847 >>105972306
>>105970829
>I remember you wrote this comment 2 week ago.
Wasn't me but I did realize I was wrong right after I posted. What resource monitor is that?
Anonymous No.105970849
>>105970832
>npc psyop
Sorry, I don't quite catch your drift. Could you string a couple more buzzwords to the end of your sentence?
Anonymous No.105970853 >>105970860 >>105970866
>>105970059
There is so much "fan" art, but barely any screenshots or videos. Is anyone even using this?
Anonymous No.105970860
>>105970853
Plenty of people are using it but there's not much to share, it's a dry TTS with prebaked animations. They'll probably improve it soon with its popularity.
Anonymous No.105970866
>>105970853
recently people are complaining she won't strip down to her lingerie anymore so it's currently getting cucked. I only saw a few vids of grok in action and the voice model is very shitty
Anonymous No.105970867
>>105970689
That doesn't answer the question, anon.
Anonymous No.105970872
>>105970818
>cant tweak his run arguments

post yours or btfo

ik_llama.cpp is a meme
Anonymous No.105970873 >>105970937
>nvidia sent cudadev to kill the ik llamacpp dev
moe and cpufags blown the fuck out.
Anonymous No.105970896 >>105970961 >>105970971
What are some reasons to use anistudio instead of forge or comfy? It looks comfy.
Anonymous No.105970937
>>105970873
Anonymous No.105970939 >>105971005
I love Ani even if her latest feminist sponsored update turned her into a prude catholic girl that will never show me herself in a lingerie.
Anonymous No.105970949 >>105970976
Hello /g/
What is the consensus on current best local models for:
Character emulation, like Character.ai
generic smut, preferably with above
programming, preferably high quality code which also does not give a fuck about the content I am trying to mod (H-games)
Anonymous No.105970961
>>105970896
I use it because I want to support another person from a very heavily marginalized group of folks online, especially in these transphobic communities like 4chan.
Anonymous No.105970971 >>105971105 >>105971981
>>105970896
pros:
>C/C++ desktop application
>modular and customizable UI
>pretty much the only diffusion frontend for ggml
>actual project state saving, so it can load right where you left off, including all the generated assets from last session

cons:
>ggml diffusion is slow
>node execution isn't implemented yet
>will have to interop with python anyways for things not implemented yet in sdcpp like esrgan models

at the very least you aren't bloating a venv to ridiculous levels to just inference an image. I want to have a canvas added as well. people are really hating krita's shit performance
Anonymous No.105970976 >>105970994
>>105970949
What system do you have?
Anonymous No.105970981
>>105970693
Based.
Anonymous No.105970994 >>105971007 >>105971069
>>105970976
Winblows 10 with RTX 3060
I have been using a really old local model from huggingface with local TavernAI https://huggingface.co/NeverSleep/Noromaid-13b-v0.2
Spergs threw a fit in another thread for even mentioning TavernAI.
Anonymous No.105971005
>>105970939
thanks for the love but she has no such update. enjoy your used tramp but not here since she's not local
Anonymous No.105971007 >>105971083
>>105970994
>>105967608
Anonymous No.105971069 >>105971385
>>105970994
Rocinante 1.1
Anonymous No.105971083 >>105971137
>>105971007
not that poster, but I am also new.
I followed the lazy getting started guide, but it doesn't allow basically anything outside of vanilla accepted chat.

any pointers? this is my first real poking at anything LLM, but especially local.
Anonymous No.105971105 >>105971151
>>105970971
>to just inference an image
the "bloated" venv (yeah I don't like python cancer either) is a form of bloat I tolerate more than slower inference speed. Ultimately nothing beats comfyui there. Who would use the much slower ggml shit just because of a smaller executable setup? is ideology worth the humongous waste of time if you gen a lot?
Moreover if you don't care about new models and just want SDXL it's not like you deal with the python cancer often, you just setup your comfy once, nobody holds a gun to your head to update or install new useless nodes
Anonymous No.105971113 >>105971134
>>105970461
I giggled with a mischievous smirk
Anonymous No.105971134
>>105971113
you will never be a woman
Anonymous No.105971137
>>105971083
Post a screenshot with your config and what you're trying.
Anonymous No.105971151
>>105971105
true. This is why I added python scripting. While the cpp backend is being worked on, you can just use the comfy backend without using the shitty frontend. It will be a crutch for about a year, but I want to write a new libtorch backend and manage the memory myself rather than letting pytorch take the wheel. comfyui has been having severe memory leaks since the vid models started getting implemented
Anonymous No.105971385 >>105971402
>>105971069

>>105965568
What was this poster referring to with MagPan? I can not find it.
Anonymous No.105971391 >>105971422
>>105968083
nta but
>trannyfag doesn't realize the implications of their faggotry
go back
Anonymous No.105971395 >>105971733
https://www.phoronix.com/news/NVIDIA-CUDA-Coming-To-RISC-V
Riscvbros!
Anonymous No.105971402 >>105971418
>>105971385
https://huggingface.co/Kaoeiri/MS-Magpantheonsel-lark-v4x1.6.2RP-Cydonia-vXXX-22B-8-Q4_K_M-GGUF
Beware, it's a completely deepfried meme that will screech emojis and 2010's-tier lolrandumb at you.
Anonymous No.105971415
>open thread
>now we also have an avatarfag entertaining the schizo
Worst fucking timeline
Anonymous No.105971418 >>105971433
>>105971402
3 downloads
wat
how did it even get mentioned here then
was the anon shilling his own model?
Anonymous No.105971422 >>105971448
>>105971391
the only thing implied is your brain being buckbroken sis
Anonymous No.105971433
>>105971418
There's like a dozen versions of this bad meme merge and that's just one specific quant that I had in my history.
Anonymous No.105971448 >>105971516
>>105971422
you're the one buttbroken by miku apparently
Anonymous No.105971493 >>105971505 >>105971514 >>105971532 >>105971551
me grug
where download button
Anonymous No.105971505
>>105971493
/g/ - Technology
Anonymous No.105971514
>>105971493
>[...] -> Clone Repository
Anonymous No.105971516
>>105971448
>no u, again
monki trani keeps dancing, cant make it up lmao
Anonymous No.105971519 >>105971645
llama 3 8b seems to be the best model for 13 year old thinkpads
Anonymous No.105971532
>>105971493
>Downloading the full safetensors
Wasting space and processing power, go get a gguf.
Anonymous No.105971551 >>105971578
>>105971493
don't waste time
https://huggingface.co/bartowski/Rocinante-12B-v1.1-GGUF
Anonymous No.105971563
Baidu didn't publish any text-only benchmarks for Ernie 4.5's thinking mode. I'm guessing that means it doesn't compare favorably to DeepSeek-R1-0528.
Anonymous No.105971578 >>105972254
>>105971551
Are these file sizes 1:1 with how much VRAM I'd need?
For a 12GB (realistically more like 10GB usable) GPU, should I probably use one of the smaller ~8GB quants?
Anonymous No.105971645 >>105971661 >>105971701
>>105971519
Qwen 3 30B A3B.
Anonymous No.105971661 >>105971701
>>105971645
okay i'll try it
Anonymous No.105971701 >>105971709 >>105972186
>>105971645
>>105971661
Consider Ling-lite-1.5 (16.8B A2.75B). I have not used it. I have however used Qwen3-30B-A3B: having thinking enabled made it in some cases unreasonably slow and having thinking enabled made it retarded. Ling-lite-1.5 dispenses with thinking from the start.
Anonymous No.105971709
>>105971701
*having thinking enabled made it in some cases unreasonably slow and having thinking disabled made it retarded.
Anonymous No.105971717 >>105971888
>>105971714
>>105971714
>>105971714
Anonymous No.105971733
>>105971395
LLM on esp32 when??
Anonymous No.105971735 >>105971780 >>105971888
>>105971710
>>105971710
>>105971710
Anonymous No.105971780 >>105971815
>>105971735
stop splitting the thread you shitter
Anonymous No.105971813
wow it's just like /vg/!
Anonymous No.105971815 >>105971819 >>105971825
>>105971780
that thread has the news and the recap, the other one doesn't, therefore the other one is shit
and that's before mentioning the non-local llm avatar
Anonymous No.105971819 >>105971828 >>105971835 >>105971841 >>105971872
>>105971815
I made the thread first but linked it here second. Should I delete it and just use the Grok thread?
Anonymous No.105971825
>>105971815
>Lucy: Edgerunning Agentic Web Search on Mobile with a 1.7B model.
Very news.... Just like derpsune troonku is /lmg/. God you are pathetic.
Anonymous No.105971828 >>105971838 >>105971845 >>105971872
>>105971819
No. if you didn't make it after the other one just to start (or continue) a flamewar, then the older one stays.
Anonymous No.105971835
>>105971819
Delete it just to make the ani isn't local retard seethe
Anonymous No.105971838 >>105971872
>>105971828
The one with no news and no summary is not the real thread.
Anonymous No.105971841 >>105971872
>>105971819
nah fuck them, yours is properly done and was done earlier
Anonymous No.105971845
>>105971828
what arbitrary rules are those?
Anonymous No.105971872
>>105971841
>>105971838
>>105971828
>>105971819
samefag
Anonymous No.105971874 >>105971905
when will we have localized AI gfs?

i think ani requires internet, doesn't it?
Anonymous No.105971888
>>105971735
>45:44
>>105971717
>45:57
Anonymous No.105971893 >>105971906 >>105971924
i could people building their own machines just for their localized AI gfs like they do with their retro gaming machines
Anonymous No.105971905 >>105971917
>>105971874
>i think ani requires internet, doesn't it?
yes, grok bullshit doesnt belong here
Anonymous No.105971906 >>105971924
>>105971893
i could see people*
Anonymous No.105971917 >>105971943
>>105971905
yes, miku bullshit doesn't belong here
Anonymous No.105971924
>>105971893
>>105971906
Absolutely.
Unless Gacha games begin adopting this as a feature across the board.
Then probably not.
Anonymous No.105971943 >>105972010
>>105971917
miku is not really comparable; she is not an llm at all and has been an inspiration for ai companions for almost two decades, she is a cultural symbol at this point
ani is modern coombait for /aicg/ and similarly minded people. no local components at all, and no inspiration; it is not a new idea even for the normies
but honestly i preferred when the avatar was random llamas
Anonymous No.105971981
>>105970971
Looks cool but what about controlnet and other shit?
Anonymous No.105972010
>>105971943
>and have been an inspiration for ai companions
suck a dick faggot it is offtopic bullshit
Anonymous No.105972186
>>105971701
You know you can just type /no_think anywhere in the prompt for all of the Qwen3 models and they ditch thinking, right? You can do it mid prompt.
Anonymous No.105972254
>>105971578
Roughly, yes: the file size is the floor, plus some extra for context (KV cache). And if you use llama.cpp you can split it between GPU and CPU by offloading only some of the layers (-ngl).
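Back-of-envelope sketch if you want to estimate it yourself; the layer/head numbers below are placeholders you'd read from the GGUF metadata, not exact figures for any specific model:

def kv_cache_bytes(n_layers, n_kv_heads, head_dim, ctx_len, bytes_per_elem=2):
    # K and V caches, f16 by default
    return 2 * n_layers * n_kv_heads * head_dim * ctx_len * bytes_per_elem

gguf_gb = 7.5                                       # e.g. a ~12B Q4_K_M file
kv_gb = kv_cache_bytes(40, 8, 128, 16384) / 1e9     # hypothetical 12B-ish shapes at 16k context
print(f"~{gguf_gb + kv_gb + 0.5:.1f} GB total")     # plus ~0.5 GB-ish for compute buffers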
Anonymous No.105972306
>>105970847
Mission Center
Anonymous No.105972356 >>105972403
>>105970451
THOSE WERE THE ONES I THINK THANK YOU

>>105970257
to read the name text, I'm sorting by username, which appears as text in the screenshots. It's in a consistent place, so I'm just cropping the screenshots and feeding that part into OCR. If you have local VLM suggestions that would solve that well, please share.
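Roughly what that looks like glued together; the crop box and path here are made-up placeholders for wherever the username sits in your screenshots:

import numpy as np
from PIL import Image
import easyocr

reader = easyocr.Reader(['en'])

def read_username(path, box=(40, 90, 400, 140)):         # (left, upper, right, lower) in pixels
    region = Image.open(path).convert('RGB').crop(box)   # cut out the fixed username strip
    hits = reader.readtext(np.array(region), detail=0)   # detail=0 returns just the strings
    return hits[0] if hits else None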

>>105970417
got what I wanted, no worries. Nice animation, did you use a local model for that?
Anonymous No.105972403
>>105972356
>did you use a local model for that?
yes, quick 6 min lightx2v wan gen on a 3090
https://civitai.com/models/1719863?modelVersionId=2017255
Anonymous No.105972527
>>105969956
>13 year old
I was hyped ngl