
Thread 105917222

363 posts 128 images /g/
Anonymous No.105917222 [Report] >>105917260 >>105917263 >>105919938 >>105921263
/lmg/ - Local Models General
/lmg/ - a general dedicated to the discussion and development of local language models.

Previous threads: >>105909674 & >>105904543

►News
>(07/15) Voxtral models for speech understanding released: https://mistral.ai/news/voxtral
>(07/15) LG AI Research releases EXAONE 4.0: https://www.lgresearch.ai/blog/view?seq=576
>(07/11) Kimi K2 1T-A32B released: https://moonshotai.github.io/Kimi-K2
>(07/11) Granite 4.0 support merged: https://github.com/ggml-org/llama.cpp/pull/13550
>(07/10) Devstral Small 1.1 released: https://hf.co/mistralai/Devstral-Small-2507

►News Archive: https://rentry.org/lmg-news-archive
►Glossary: https://rentry.org/lmg-glossary
►Links: https://rentry.org/LocalModelsLinks
►Official /lmg/ card: https://files.catbox.moe/cbclyf.png

►Getting Started
https://rentry.org/lmg-lazy-getting-started-guide
https://rentry.org/lmg-build-guides
https://rentry.org/IsolatedLinuxWebService
https://rentry.org/tldrhowtoquant
https://rentry.org/samplers

►Further Learning
https://rentry.org/machine-learning-roadmap
https://rentry.org/llm-training
https://rentry.org/LocalModelsPapers

►Benchmarks
LiveBench: https://livebench.ai
Programming: https://livecodebench.github.io/leaderboard.html
Code Editing: https://aider.chat/docs/leaderboards
Context Length: https://github.com/adobe-research/NoLiMa
Censorbench: https://codeberg.org/jts2323/censorbench
GPUs: https://github.com/XiongjieDai/GPU-Benchmarks-on-LLM-Inference

►Tools
Alpha Calculator: https://desmos.com/calculator/ffngla98yc
GGUF VRAM Calculator: https://hf.co/spaces/NyxKrage/LLM-Model-VRAM-Calculator
Sampler Visualizer: https://artefact2.github.io/llm-sampling

►Text Gen. UI, Inference Engines
https://github.com/lmg-anon/mikupad
https://github.com/oobabooga/text-generation-webui
https://github.com/LostRuins/koboldcpp
https://github.com/ggerganov/llama.cpp
https://github.com/theroyallab/tabbyAPI
https://github.com/vllm-project/vllm
Anonymous No.105917229 [Report]
►Recent Highlights from the Previous Thread: >>105909674

--Specialized hardware enables fast inference of massive models despite memory limitations:
>105910874 >105910888 >105910898 >105910973 >105911080 >105911205 >105911432 >105912396 >105910981 >105910992 >105911029 >105911037 >105911049 >105913140 >105913172 >105910891 >105910991 >105911001
--Real-time LLM-driven animation synthesis and motion-synthesis alternatives:
>105915245 >105915263 >105915313 >105915398 >105915422 >105915452 >105915496 >105915569 >105915587 >105915472 >105915502
--Evaluating high-RAM servers for LLM deployment under memory bandwidth constraints:
>105910735 >105910772 >105910799 >105910833 >105911111 >105911225 >105911290 >105911475 >105911524 >105911589
--Enthusiast hardware investments and model scaling choices:
>105911855 >105911958 >105912019 >105912232 >105912400 >105912650 >105914311
--MistralAI releases open-source speech understanding models with extended transcription support:
>105915291 >105915372 >105915425 >105915642 >105915788 >105915791
--Resumption of Nvidia chip sales to China sparks geopolitical and tech independence debates:
>105914458 >105914500 >105914534 >105914783
--CLI-based Kimi-2 model interaction with poem generation on high-core-count EPYC hardware:
>105914901
--Discussion around the Waidrin procedural roleplay system:
>105913723 >105913904 >105914001 >105914022 >105914082 >105914054 >105914040 >105914112 >105914189 >105914319 >105914498 >105914573 >105914365
--EXAONE-4.0-32B release faces Llama.cpp integration hurdles:
>105909970 >105910006 >105910791 >105915758 >105915768 >105911465 >105911484 >105911522 >105911495
--K2 model struggles with instruction following and roleplay consistency despite quantization and parameter tweaks:
>105912043 >105912611 >105912722
--Teto and Miku (free space):
>105909867 >105914231 >105915905

►Recent Highlight Posts from the Previous Thread: >>105909677

Why?: 9 reply limit >>102478518
Fix: https://rentry.org/lmg-recap-script
Anonymous No.105917256 [Report]
Red miku love
Anonymous No.105917259 [Report] >>105917304 >>105917321
Every time you gen a Teto without her hair ribbon the next local sota model is delayed by two weeks.
Anonymous No.105917260 [Report]
>>105917222 (OP)
I will forever be willing to take that hand.
Anonymous No.105917263 [Report]
>>105917222 (OP)
Will EXAONE and GLM-4 100b save local?
Anonymous No.105917266 [Report]
The waifu era is upon us.
Anonymous No.105917304 [Report]
>>105917259
>the next local sota model is delayed by two weeks.
are you trying to make it my mission to gen a ribbon free teto every day
Anonymous No.105917321 [Report]
>>105917259
What's wrong with the classic Teto?
Anonymous No.105917382 [Report] >>105917388 >>105917407
Thread culture recap.
Anonymous No.105917388 [Report]
>>105917382
Uh oh melty again
Anonymous No.105917396 [Report] >>105917407
Regretfully I would like to inform you that grok-chan cannot be the /lmg/ mascot. She is based. And the absolute prerequisite of being an /lmg/ mascot is fucking niggers.
Anonymous No.105917407 [Report] >>105917455 >>105919101
Not me btw >>105917382 >>105917396

But i will post it regardless cause the porn thing remains true.

vocaloidfag posting porn in /ldg/:
>>105715769
It was up for hours while anyone keking on troons or niggers gets deleted in seconds, talk about double standards and selective moderation:
https://desuarchive.org/g/thread/104414999/#q104418525
https://desuarchive.org/g/thread/104414999/#q104418574
he makes >>105714003, a ryona picture of a generic anime girl that a different anon posted earlier >>105704741, probably because it's not his favorite vocaloid doll; he can't stand that, as it makes him boil like a druggie without a fentanyl dose. Essentially a war for the rights to waifuspam or avatarfag in the thread.

Funny /r9k/ thread: https://desuarchive.org/r9k/thread/81611346/
The Makise Kurisu damage control screencap (day earlier) is fake btw, no matches to be found, see https://desuarchive.org/g/thread/105698912/#q105704210 janny deleted post quickly.

TLDR: vocaloid troon / janny protects resident avatarfags and deletes everyone who outs him, making the general his little personal safespace. Needless to say he would screech "Go back to teh POL!" anytime someone posts something mildly political about language models or experiments around that topic.

And lastly as said in previous thread(s) >>105716637 I remind you that cudadev of llama.cpp (JohannesGaessler on github) has endorsed spamming. That's it.
He also endorsed hitting that feminine jart bussy a bit later on. QRD on Jart - The code stealing tranny: https://rentry.org/jarted

xis ai slop profiles
https://x.com/brittle_404
https://x.com/404_brittle
https://www.pixiv.net/en/users/97264270
https://civitai.com/user/inpaint/models
Anonymous No.105917411 [Report]
Anonymous No.105917414 [Report] >>105917427
>tattoos
You have very bad taste
Anonymous No.105917426 [Report]
https://files.catbox.moe/iomwbe.mp4
Anonymous No.105917427 [Report]
>>105917414
Trash shitfu is trashy.
Anonymous No.105917438 [Report]
All hail Elon. Sama Is the king.
Anonymous No.105917446 [Report] >>105918245
slop consoomers eating good
Anonymous No.105917449 [Report]
Anonymous No.105917455 [Report] >>105925286
>>105917407
Bruh, why is this general like this?
Anonymous No.105917474 [Report] >>105917485
Schizo go away. Your contribution amounts to 0. Miku or not, you're useless and should kill yourself
Anonymous No.105917485 [Report]
>>105917474
He bakes threads though
Anonymous No.105917528 [Report] >>105918178
https://x.com/LiquidAI_/status/1943294736762064990
https://huggingface.co/LiquidAI
Anonymous No.105917550 [Report] >>105917897
I'm going back to the old thread.
Anonymous No.105917736 [Report] >>105917847
How many of the smartest people in China do you think they have generating LLM data all day long? On top of the petabytes they gather through surveillance.
Anonymous No.105917847 [Report] >>105917896
>>105917736
Does the american government not use its surveillance data for LLM training?
I'm sure they have a lot of it and from various countries as well.
Anonymous No.105917896 [Report] >>105917938
>>105917847
But the Chinese are jacked fully into everything. All the data is centralized and not even through shady deals like the NSA, they just do it. I think they're running massive mech turk farms for high IQ individuals. To make the average model IQ go up.
Anonymous No.105917897 [Report] >>105917941
>>105917550
Nevermind it's even worse.
Anonymous No.105917912 [Report] >>105917931 >>105917987 >>105917997 >>105918038
How do I use chat templates from huggingface? Sillytavern master import doesn't seem to recognize them. Do I load them alongside models in llama/kobold/ooba?
Anonymous No.105917931 [Report] >>105918060
>>105917912
You just type the strings manually. Not that hard.
Anonymous No.105917938 [Report] >>105917971
>>105917896
That's probably a difference in technology and funding. Maybe if the NSA renovated itself and got a bit more funding they could rival china's data collection capabilities.
Anonymous No.105917941 [Report] >>105917968
>>105917897
Every normal general shoos away the tranny bakers, you retarded zoomers will learn it the hard way.
Anonymous No.105917968 [Report] >>105917983 >>105918017
>>105917941
but the bbc spamming tranny schizo usually doesn't bake, and when xe does, another real thread always pops up and everybody migrates to it
Anonymous No.105917971 [Report] >>105918013
>>105917938
There's also the legal problem. They kinda don't give a fuck but there are still limits to that. The chink corps are legally required to jack in the data hose before they fire up.
Anonymous No.105917983 [Report]
>>105917968
Mikutranny and bbc spamming fag is the same person.
Anonymous No.105917987 [Report] >>105918038 >>105918060
>>105917912
Use
>https://huggingface.co/spaces/Xenova/jinja-playground
to see how it looks in an actual chat and copy the relevant strings into the proper fields.
Or use the chat completion endpoint.
Be aware of double BOS!
Anonymous No.105917997 [Report] >>105918060
>>105917912
use --jinja in llama.cpp and it'll natively use the template from the gguf, then in your front end use the openai chat api, not the old obsolete completion-style api
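A minimal sketch of that setup, assuming llama-server's default port and the /v1/chat/completions route (the model path and sampler settings are placeholders):

```python
# start the server so it applies the GGUF's embedded chat template:
#   llama-server -m model.gguf --jinja --port 8080
import json
import urllib.request

payload = {
    "messages": [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Hello!"},
    ],
    "temperature": 0.8,
}
req = urllib.request.Request(
    "http://127.0.0.1:8080/v1/chat/completions",
    data=json.dumps(payload).encode(),
    headers={"Content-Type": "application/json"},
)
with urllib.request.urlopen(req) as resp:
    print(json.loads(resp.read())["choices"][0]["message"]["content"])
```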
Anonymous No.105918013 [Report] >>105918023
>>105917971
To be fair, if I trusted the government I would think them collecting all that data would be useful to fight crime. I also believe the chink government is more trustworthy than the US government or my own government.
Anonymous No.105918017 [Report] >>105918118
>>105917968
You must be new here if you think the baker isn't a schizo. The original melty that started it all happened when someone used a different anime girl picture in OP.
Anonymous No.105918023 [Report]
>>105918013
The Han are lucky they've basically morphed into low empathy national socialism.
Anonymous No.105918032 [Report] >>105918065 >>105918091
>persona: {{user}} has no hair
>{{char}} grabs {{user}}'s hair
I hate transformer attention
Anonymous No.105918033 [Report]
gayropeans got her too
https://x.com/kimmonismus/status/1945051369335087414
Anonymous No.105918038 [Report] >>105918052 >>105918060
>>105917912
If you use >>105917987
destringify the string first (just the "jinjastuffhere", including quotation marks, after the "chat_template":)
then paste it into the jinja-playground
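For reference, a rough sketch of that step in python; json.load already de-escapes the string, and the render variables (bos_token etc.) are common but model-specific assumptions, which is where the double-BOS warning above comes in:

```python
import json
from jinja2 import Template  # pip install jinja2

cfg = json.load(open("tokenizer_config.json"))
template = Template(cfg["chat_template"])  # already destringified by json.load

rendered = template.render(
    messages=[
        {"role": "user", "content": "Hi"},
        {"role": "assistant", "content": "Hello!"},
    ],
    bos_token="<s>",   # model-specific; if the backend also prepends BOS,
    eos_token="</s>",  # you get the double-BOS problem mentioned above
    add_generation_prompt=True,
)
print(rendered)  # compare against the strings you typed into ST's fields
```

Caveat: some templates call helpers like raise_exception that plain jinja2 doesn't define, which is what the jinja-playground handles for you.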
Anonymous No.105918039 [Report] >>105918143
not local
Anonymous No.105918052 [Report]
>>105918038 (me)
I didn't look at the first anon's pic
well, it applies to anyone looking at tokenizer_config.json
Anonymous No.105918060 [Report]
>>105917997
>>105917987
>>105917931
>>105918038
Thank you for the suggestions. I know I can type them in manually but I was hoping there was an automatic importer of sorts so I could avoid guessing if I missed a newline or misplaced some token. Also, some jinja templates have been much harder to figure out. Hopefully the jinja playground can help out with that. I might just try and vibecode some sillytavern jinja converter because the chat completion endpoint doesn't have access to all the fancy meme samplers I like.
Anonymous No.105918065 [Report] >>105918090
>>105918032
>>persona: {{user}} has no hair
try {{user}} is bald
and try bigger models
Anonymous No.105918070 [Report]
Load 3-4 different models with similar behavior and writing style, and randomize which one generates each message, or switch every X tokens; this would be a fun way to fight repetition.
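A sketch of how that could look against an OpenAI-compatible endpoint; the model names, and the assumption that one router serves all of them, are hypothetical:

```python
import json
import random
import urllib.request

MODELS = ["model-a", "model-b", "model-c"]  # hypothetical names behind one router

def next_message(history):
    # pick a random model for each generated message to break up repetitive
    # phrasing; the models must share a chat template for this to be drop-in
    payload = {"model": random.choice(MODELS), "messages": history}
    req = urllib.request.Request(
        "http://127.0.0.1:8080/v1/chat/completions",
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["choices"][0]["message"]["content"]
```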
Anonymous No.105918090 [Report]
>>105918065
it's 70b q8, changing the persona depth somewhat fixes it
Anonymous No.105918091 [Report]
>>105918032
>he didn't author's notes depth 0 his baldness
Anonymous No.105918110 [Report] >>105918155
/lmg/ is just a data farming operation for future autonomous 4chan agents
Anonymous No.105918118 [Report] >>105918132
>>105918017
you mean like how nobody wanted to use kurisu as the thread mascot, so xe started spamming bbc while samefagging and false flagging as a miku poster? i remember that
Anonymous No.105918119 [Report] >>105918133 >>105918144 >>105918150 >>105918152 >>105918212
>Gemma too cucked
>Mistral Small too repetitive
>Qwen too dry and doesn't know anything
>Nemo too old

Sure is a desert here for local RP. Are there no more improvements to be made outside of reasoning? We all know OpenAI's open model will be omega cucked. So what, we wait for Mistral to release Nemo 2?
Anonymous No.105918132 [Report]
>>105918118
Yes I mean like you faggot melted down completely when first kurisu thread happened and people used it instead of going to your ritualpost spamthread. I remember that.
Anonymous No.105918133 [Report]
>>105918119
grok3 open source will save local
Anonymous No.105918143 [Report]
>>105918039
and thats why it works
Anonymous No.105918144 [Report]
>>105918119
exaone4
Anonymous No.105918150 [Report]
>>105918119
cant imagine being a ramlet who cant run r1 on his 128gb ram 16gb vram gaming rig
>b-b-b-but muh q1 is too ba-ACK
dynamic quants still make it sota and its not even close
Anonymous No.105918152 [Report]
>>105918119
Bro, your R1?
Anonymous No.105918155 [Report] >>105918797
>>105918110
Truke >>105884523
Anonymous No.105918178 [Report]
>>105917528
Too bad it falls out of Gemma 3n's range and is smaller, would've loved to see a comparison. The E2B is insanely good for its size.
Anonymous No.105918212 [Report] >>105918268 >>105919376
>>105918119
>too old
Why do people say this as if models age? They don't, they don't get gray hairs or become weathered by the elements. Maybe there are new models that have some kind of advancement like more context or whatever, but beyond that there's literally no reason why being "old" is bad. It's a file on your computer, not a piece of moldy bread that's been in your closet for a year.
Anonymous No.105918232 [Report] >>105918318 >>105918677 >>105918699 >>105918823 >>105918983 >>105920968
So did anyone here reserve a DGX Spark?
Anonymous No.105918245 [Report] >>105918300
>>105917446
As expected, but faster.
Illustrious doesn't know misa amane natively, which surprised me. But you can fill these in pretty easily.
I'm sure >>>/h/hdg/ is already busy.
Anonymous No.105918268 [Report] >>105918301
>>105918212
Yeah Rocinante is still better than everything else in everything but effective context size.
Anonymous No.105918278 [Report] >>105918316 >>105918943
just give me the exe.......................
Anonymous No.105918293 [Report] >>105918351
i'm so sick of transformers
Anonymous No.105918298 [Report] >>105918444 >>105919970 >>105920606
Anonymous No.105918300 [Report] >>105918329
>>105918245
It's 2025 and your images look like you're still using the nai leak model. Impressive.
Anonymous No.105918301 [Report]
>>105918268
I notice that Rocinante likes to ignore early context moreso than nemo sometimes.
Anonymous No.105918316 [Report] >>105918327
>>105918278
>pytorch.org servers cap out at 5mb/s while downloading the usual 3.3gb torch file that everyone needs every 2 minutes
wow, what a great system, especially the fact that you don't asynchronously download all files but do it 1 by 1, i love multi-trillion dollar industry pythonshit development quality
Anonymous No.105918318 [Report] >>105918355
>>105918232
Just doesn't seem good enough. The average joe is getting shafted on all this tech while the giga corps hoover up everything.
Anonymous No.105918327 [Report] >>105918390
>>105918316
>what is uv
Anonymous No.105918329 [Report]
>>105918300
I think he might actually like how that looks which makes it worse.
Anonymous No.105918351 [Report] >>105918367 >>105918392
>>105918293
mambas and jambas will save us
Anonymous No.105918355 [Report]
>>105918318
But muh safety.
Anonymous No.105918367 [Report] >>105918391
>>105918351
>can't edit/swipe responses without reprocessing the entire context
Anonymous No.105918387 [Report]
Best nsfw jap to english translation model 70b or less?
Anonymous No.105918390 [Report] >>105918408 >>105919081
>>105918327
>try using literally anything else other than the exact commands on the installation guide of the repo, including just uv, venv, docker, older version of npm, node, git, uv,venv, docker, newer version of npm, node, git, uv, venv, docker
>somehow one of the 460000 libraries it downloads suddenly throw out an error
>fix it
>new error
>fix it
>new error
>fix it
>same error
>fix it
>old error
>fix it
>the project launches!
>new error when you run it
>fix it
>no error
>run it
>now nothing happens
lmao, every time
Anonymous No.105918391 [Report] >>105918416 >>105918523
>>105918367
That's not a limitation of the architecture, the llama.cpp dudes just didn't implement it yet.
Anonymous No.105918392 [Report]
>>105918351
But Jamba came out and it was really really bad. Like dumber than 7B.
Anonymous No.105918408 [Report]
>>105918390
Unironically skill issue.
Anonymous No.105918416 [Report] >>105918461
>>105918391
Oh. That's good to know.
So it's possible that:
>editing/swiping will be implemented by llama.cpp
>Our Lord and Savior TheDrummer will release a sick NSFW Jamba finetune
Anonymous No.105918444 [Report]
>>105918298
Elon can't keep getting away with it.
Anonymous No.105918461 [Report]
>>105918416
Yes for the first no fucking way for the second.
Anonymous No.105918520 [Report] >>105918533
based ggerganov adding some ass
Anonymous No.105918523 [Report]
>>105918391
Only if they save the state on every token for editing. Or in between requests to regen from the last request.
Anonymous No.105918533 [Report]
>>105918520
noass when
Anonymous No.105918590 [Report]
model : add Kimi-K2 support
https://github.com/ggml-org/llama.cpp/commit/4a4f426944e79b79e389f9ed7b34831cb9b637ad
Anonymous No.105918677 [Report]
>>105918232
>128GB in the age of big MoEs
seems deprecated on arrival; even my ultra poorfag 400 euro build with 256GB RAM can run R1 at Q2, while that thing can't even fit any R1 quant at all.
Anonymous No.105918699 [Report]
>>105918232
The timing of the release is unfortunate because I think that if Deepseek had come out earlier they would have given it 256 GB memory instead of 128 GB.
With 256 GB it would maybe be a consideration for these huge MoE models but with 128 GB I think it's a meme.
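The back-of-the-envelope math, counting weights only (KV cache and context come on top):

```python
params = 671e9  # DeepSeek R1 total parameters
for bpw in (1.58, 2.5, 4.5):  # rough "Q1"/Q2/Q4-ish bits per weight
    print(f"{bpw} bpw -> {params * bpw / 8 / 1e9:.0f} GB")
# 1.58 bpw -> 133 GB, 2.5 bpw -> 210 GB, 4.5 bpw -> 377 GB:
# Q2 squeezes into a 256 GB box, while nothing fits in 128 GB
```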
Anonymous No.105918762 [Report] >>105918784
https://huggingface.co/mistralai/Voxtral-Small-24B-2507

>Voxtral Small is an enhancement of Mistral Small 3, incorporating state-of-the-art audio input capabilities while retaining best-in-class text performance. It excels at speech transcription, translation and audio understanding.
Anonymous No.105918784 [Report]
>>105918762
Anonymous No.105918785 [Report] >>105918846 >>105918853 >>105918890 >>105921198
Is the current gigantic size of models "incompressible"?
I often see "this model is good for its size" and yeah, it's usually good, but nowhere near as good as the full model (for example ds v3/r1), and that's not even taking into account the added cost of context.
So, are we just condemned to wait for the hardware to catch up to 1TB+ models in 10 years, or is the current stuff just very inefficient?
Anonymous No.105918797 [Report] >>105918838
>>105918155
Is this just function calling? Wouldn't you need a pass to let it post?
Anonymous No.105918823 [Report]
>>105918232
I reserved one, but I'm having enough issues using the 128 GB shared memory on my Ryzen MAX+ 395 AI APU that I might not even fuck with it for now. llama.cpp seems to want to reserve double memory for the model to keep it in RAM instead of the fake VRAM for some fucking reason.
Anonymous No.105918838 [Report]
>>105918797
Not if you are a janitor or someone sucking them off.
Anonymous No.105918846 [Report]
>>105918785
Our training methods are very shit right now. Maybe in the future people will figure out how to train a proper model and we'll have dense models at 7B that are on par with kimi.
Or we won't, I'm too retarded to know anything about this shit.
Anonymous No.105918853 [Report]
>>105918785
Work expands to fill the available time. As models get more efficient, bigger models will be made to fill up the hardware we already have. There will be small models but they won't be as good as bigger ones.
Anonymous No.105918867 [Report]
https://voca.ro/17mVTYRhxrXv

chatterbox seems good but we still gotta wait until next year for real time high quality audio gen
Anonymous No.105918890 [Report] >>105918902
>>105918785
It's very cost efficient to train a hugantic MoE. Pretty efficient to run it, if you are a corpo. I'm sure better dense models could be trained, but that is expensive.
Anonymous No.105918902 [Report] >>105918912
>>105918890
Dense does not scale. Behemoth proved that.
Anonymous No.105918912 [Report] >>105918956 >>105918966
>>105918902
Wasn't Behemoth also a MoE just bigger?
Anonymous No.105918943 [Report]
>>105918278
I don't know what black magic they used but I tried uv to set up a python environment a week ago and it set up everything in 250ms. I shouldn't be amazed by this because modern PCs are insanely fast, but it's rare for devs to give the single shit necessary to know this.
Anonymous No.105918956 [Report] >>105919010
>>105918912
Yeah, but he is still right. The largest dense model I know ever trained was when Google was still bumbling around with PaLM.
https://ai.googleblog.com/2022/04/pathways-language-model-palm-scaling-to.html
They scaled that piece of crap to 540 billion dense and it still didn't come close to matching others in the field at the time. Google was lucky they got bailed out by Deepmind over that fiasco.
Anonymous No.105918966 [Report] >>105918995
>>105918912
Yes, Behemoth was allegedly a MoE. And they allegedly fucked up its router, which is a critical component of a MoE.
Anonymous No.105918983 [Report]
>>105918232
I chose to invest that money in 12x64gb DDR5-6400 instead
Anonymous No.105918995 [Report] >>105919014
>>105918966
>allegedly
No, they straight up confirmed that Behemoth is a 2T/288A model way back when LLaMA4 first released.
Anonymous No.105919010 [Report] >>105919089
>>105918956
Since Anon was complaining about the 1TB MoE models we have right now, I assumed he wanted something smaller, like a 100B dense.
Anonymous No.105919014 [Report] >>105919032 >>105919035
>>105918995
it doesn't exist..
Anonymous No.105919032 [Report] >>105919089
>>105919014
They were lying about distilling scout and maverick from an incomplete version of it?
Well, that's even worse, then.
Anonymous No.105919035 [Report] >>105919113
>>105919014
Zucc is such a good liar
Anonymous No.105919081 [Report]
>>105918390
Never happened to me. Use venv or uv and everything works fine
Anonymous No.105919089 [Report] >>105919162
>>105919032
They gave out vague details in a blog post with useless graphics so yeah, people are going to speculate (wrongly) to fill in the gaps.
>>105919010
We have that with Mistral Large 2 now, go run that if you want that kind of model size.
Anonymous No.105919101 [Report] >>105919134
>>105917407
post more migu to own the libs
Anonymous No.105919113 [Report] >>105919187 >>105919293
>>105919035
If it was scoring so well even at such an early stage, why not release what they have now instead of throwing it away?
Anonymous No.105919134 [Report] >>105919347 >>105919784
>>105919101
Never posted one and never will, however i will post my copypasta and you will cry and melt around strawmans like the infantile retard you are.
Anonymous No.105919162 [Report] >>105919289
>>105919089
Why do MoEtards get so pissy the moment someone brings up wanting another big dense model? The slightest mention draws in the most inane comments like this.
Anonymous No.105919187 [Report]
>>105919113
Because it was scoring so well pre-censorship. It probably lost about 10 points in each category after.
Anonymous No.105919289 [Report]
>>105919162
Because their CPU rigs are useless for dense models and they fear missing out if the trend changes back to dense. It's cheaper to add RAM to a GPU rig than the other way around, and they need to justify their purchase by lashing out.
Anonymous No.105919293 [Report]
>>105919113
Because the new team in charge is pursuing a closed source strategy and throwing everything out the window to start all over again. Releasing it under their name would stain their reputation even if they were not responsible for it. Zuck should've sucked it up and released it before he went to hire these people. Thinking they were going to fix it instead of starting over was dumb.
Anonymous No.105919332 [Report] >>105919414 >>105919477 >>105919861
What do you guys think is the reason that models, like even the supposed smartest AI in the world, cannot follow pink elephant instructions? Is it fundamental to transformers? Surely it has encountered instances in its training where it's told to not do something, so that shouldn't be an issue. Is it overcooking on positive rules? For instance if it is training on massively more "do this and this" than "do this and do not do this", then perhaps it is biased towards including anything in the prompt regardless of whether it's told to include or not include it.
Anonymous No.105919347 [Report]
>>105919134
based
Anonymous No.105919376 [Report]
>>105918212
12b is too fucking dumb, full stop. At best a tiny notch above 7-8b models.
Even a majority of 20-30b is still too dumb, but tolerable
Anonymous No.105919414 [Report] >>105919454 >>105919503
>>105919332
>tell it "don't talk about the weather"
>it calculates the most likely tokens
>a shitload of its training data containing the word weather contains words like sunny, cloudy, rainy, etc.
>it calculates that the most likely tokens are talking about how sunny it is
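Which is also why a hard constraint beats a polite instruction: most OpenAI-compatible backends accept a logit_bias map that suppresses tokens outright. A sketch; the token ids here are hypothetical placeholders you'd look up with your model's tokenizer:

```python
import json
import urllib.request

banned = {12345: -100, 23456: -100}  # hypothetical ids for " sunny", " rainy"

payload = {
    "messages": [{"role": "user", "content": "Describe my day. Don't mention the weather."}],
    "logit_bias": banned,  # -100 effectively removes the token from sampling
}
req = urllib.request.Request(
    "http://127.0.0.1:8080/v1/chat/completions",
    data=json.dumps(payload).encode(),
    headers={"Content-Type": "application/json"},
)
with urllib.request.urlopen(req) as resp:
    print(json.loads(resp.read())["choices"][0]["message"]["content"])
```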
Anonymous No.105919454 [Report] >>105919503
>>105919414
That's part of what I said. But models have also seen a lot of data containing negations, so they should still be capable of it. So one idea I floated was that perhaps it has been overcooked on negativeless positives, which would also imply it likely happened in post-training.
Anonymous No.105919461 [Report] >>105919496
How is Exaone 4 for ERP?
Anonymous No.105919477 [Report] >>105919497 >>105919502
>>105919332
All I know is our current attention mechanisms are all terrible, even state of the art corpo models. One advantage of the "thinking" models is their constant second-guessing: "But wait, maybe I'm being a retarded cunt."

I'd like to see a hybrid diffusion model that first generates text the conventional way, and then does a diffusion pass to fix any of the most obvious errors. As in, I'd like someone else to pay the GPU hours to figure out if that would even work.
Anonymous No.105919496 [Report]
>>105919461
still not merged
also it'll probably suck at creative writing or "pls make me cum with anime girls", but it'd be nice to be proven wrong. glm4 sucks ass in all of my personal metrics but it is at least better than qwen
Anonymous No.105919497 [Report]
>>105919477
>then does a diffusion pass to fix any of the most obvious errors
how is this supposed to work? if there's some logic error it will fuck everything up from the start
Anonymous No.105919502 [Report]
>>105919477
In my experience playing around with the couple of open text diffusion models we got, all that re-iterating over the text really does is make the output more deterministic. The whole "diffusion is automatic reasoning" thing that some of them tried to push is a meme.
Anonymous No.105919503 [Report] >>105919654
>>105919454
I imagine that a large part of the model's intelligence is still based on the pretraining and its completion objective. Correlations based on words and topics are going to be learned first and most strongly like >>105919414 says

Models are capable of handling negations but it's definitely weaker and the model will still be more prone to "think about" the forbidden thing. That's true for humans as well desu.
Anonymous No.105919605 [Report] >>105919615 >>105919630
>getting GPT and Deepseek to translate an article for me
>Deepseek keeps adding subheaders that aren't there
This piece of shit
Anonymous No.105919615 [Report] >>105919752
>>105919605
Deepseek has soul unlike slopgpt
Anonymous No.105919630 [Report] >>105919937
>>105919605
what temp
Anonymous No.105919654 [Report] >>105919796 >>105920094
>>105919503
Yeah, but this is the smartest AI in the world. It should be extensively trained and have the most generalization out of any model. Let's assume it makes the least mistakes in other contexts, but when in the context of the pink elephant problem, it makes similar mistakes as low B, undertrained models. That would imply that there is an issue with the training or the architecture. As you said, humans also make mistakes, but it would normally be thought that for something as simple as being told to not speak about something, the error rate for a child would be worse than for the adult, right?

Alternatively, perhaps the negation concept is significantly more complex than expected in such a way that it requires a lot of layers or something to make room for the model to form the required neural network circuitry. This would explain why reasoning models can do much better at the pink elephant problem (assuming they can), since they are offloading some of the logic operations to the context.
Anonymous No.105919695 [Report] >>105919722 >>105922328
>https://ethz.ch/en/news-and-events/eth-news/news/2025/07/a-language-model-built-for-the-public-good.html
Finally a new 70B soon, and it's even trained on 15T!

>The LLM is being developed with due consideration to Swiss data protection laws, Swiss copyright laws, and the transparency obligations under the EU AI Act. In a recent study, the project leaders demonstrated that for most everyday tasks and general knowledge acquisition, respecting web crawling opt-outs during data acquisition produces virtually no performance degradation.
Nvm it's DOA.
Anonymous No.105919705 [Report]
Have there been meaningful improvements to voice to text models since Whisper? Need to know for cooming purposes.
Anonymous No.105919722 [Report] >>105919749 >>105919753
>>105919695
>15T
>avoids copyrighted materials
If it ends up being good (for a 70B) then that will be a good indication that it's more about quantity and not quality.
Anonymous No.105919749 [Report]
>>105919722
>that it's more about quantity and not quality
I worry about the quality of that quantity if all you're getting is 15T tokens of "As an AI assistant...".
Anonymous No.105919752 [Report] >>105919764
>>105919615
>soul is when our product is clunky piece of shit that breaks every nanosecond
Anonymous No.105919753 [Report] >>105919848
>>105919722
Are you sure you didn't mean the opposite?
Anonymous No.105919764 [Report]
>>105919752
>clunky piece of shit that breaks every nanosecond
sounds human to me, and most humans have a soul
Anonymous No.105919773 [Report]
Voxtral goofs? Is there a PR sirs? Daniel sir?
Anonymous No.105919784 [Report] >>105919837 >>105919977 >>105920094 >>105920340 >>105920448 >>105923654
>>105919134
You'll never own the general schizo
Anonymous No.105919796 [Report] >>105919857
>>105919654
I think what models do is much closer to thinking or dreaming than speaking. They don't have a filter. If someone tells you not to think about something you'll have a hell of a time obeying them.
Reasoning models are a little different, so they do better at that but I think they're still a flawed concept. And they lose a lot of the fluid intuition that completion-based models have.
Anonymous No.105919837 [Report] >>105919977
>>105919784
Why would i? Seems like a waste of time to me. You will never be a woman btw
Anonymous No.105919848 [Report]
>>105919753
No, because it's almost guaranteed that the copyrighted shit is going to be higher quality than the other garbage they managed to get their grubby little fingers on.
Anonymous No.105919857 [Report]
>>105919796
Well the "filter" in a sense is simply just a bigger neural network, but the brain has the advantage that its architecture pre-specializes certain layers and groups of neurons for certain functions, along with recurrence. It's been hoped by some that simply just scaling will lead to the formation of all the types of circuitry that'd be needed for all the kinds of intelligence we want, but of course that seems to not be the case, and this might be one of those types of intelligence.
Anonymous No.105919861 [Report]
>>105919332
I've definitely told models not to do things and had that make the undesired behavior less likely.
Hi all, Drummer here... No.105919924 [Report]
Voxtral 3B is good enough for RP. Just tested it.

Can't wait for GGUF support!
Anonymous No.105919927 [Report] >>105920025
Kimi K2 is such insane news for the llm space. The Deepseek formula is not only reproducible by other companies, it also scales pretty well. Not to mention that this was done by a complete literally who chink startup.
People said that Deepseek came out of nowhere but they had been at it for quite a while before their breakthrough. Here, some random new company who had only done some smaller models took what Deepseek did and improved it.
Anyone who puts out a new flagship model that's worse than Deepseek deserves to be laughed at. I wouldn't be surprised if even Deepseek themselves got caught off guard by this.
Anonymous No.105919937 [Report]
>>105919630
I'm just using the default web versions of both to test out how good they are at translations
Anonymous No.105919938 [Report] >>105919977 >>105920028 >>105920559 >>105920696
>>105917222 (OP)
Anonymous No.105919970 [Report]
>>105918298
okay elon, i forgive you
Anonymous No.105919977 [Report]
>>105919784
>>105919837
>>105919938
You will never be a woman lmaooo
Anonymous No.105920025 [Report]
>>105919927
they're on the right track.
said this before, say it again, we need a hardware breakthrough for LLMs to give local a chance.
Anonymous No.105920028 [Report]
>>105919938
Improved
Anonymous No.105920094 [Report]
>>105919654
Part of the issue is that the training data's input format is different from the way we actually use these models. If most of your training data is
>input: half of an article talking about XYZ
>output: the second half of the article, still talking about XYZ
Then with tuning and a smart enough model to generalize you can get it to move toward an AI assistant format, but it's still got a heavy bias toward continuing to talk about whatever's in the input.
You need an extensive dataset of help desk transcripts to really purge the issue, but that's probably not a dataset that even exists
So instead they fine tune it and system prompt it toward something that's generally useful, but then you get stuff like >>105919784
Anonymous No.105920340 [Report]
>>105919784
People shit on your shitfu every day. And everyone is hoping you will join the 41% soon.
Anonymous No.105920377 [Report]
>105919938
>n-noooo please remember about my worthless waifu. please!
Good job behaving like a woman. Alas you will never be one.
Anonymous No.105920385 [Report]
I'm not a woman, I'm an alien from outer space!
but that's a secret!
Anonymous No.105920391 [Report]
you are a manchild
Anonymous No.105920446 [Report] >>105920764 >>105920805
Teto and Migu are old and busted. Where's our new mascot?
Anonymous No.105920448 [Report] >>105920503
>>105919784
>You'll never own the general schizo
Chuds own the brains of all the trannies, jannies and glownigs on the website, you are here as a resident thread AGP clown literally 24/7/365 and banning those who disagree with you and yet still you have no effect on the truth those people speak, you can only further self delude your already deluded AGP degen brain while hoping that banning someone's comments online will make them not true.

Although now I see why you are so terminally online and interested in this tech, your brain seeks constant validation of your retard behaviour and opinions, something only a modern day dumb LLM can stomach doing, aside from your also deluded coomer discord sisters, of course.

I can only imagine how your brain will also try to outright reject and forget this reply as fast as possible to cope too, clinging to the fact that you at least have the power of being an internet janitor on an anonymous forum online. Quite a brutal existence.
Anonymous No.105920459 [Report] >>105920547
>finally we're getting something better than Whisper
thank you based frogs.
Anonymous No.105920473 [Report] >>105920492 >>105920495 >>105920538
Elon will gain so much RP data from men and women. That shit will be gold.
Anonymous No.105920492 [Report] >>105920549
>>105920473
women are going to rp with the ai waifu?
Anonymous No.105920495 [Report] >>105920538
>>105920473
Then DeepSeek distills from the RP trained Grok like they did from ChatGPT and Gemini and local will be saved.
Anonymous No.105920503 [Report]
>>105920448
Based. Jannytroon in shambles.
Anonymous No.105920524 [Report] >>105920530
Have any interesting models been released in the last two-three months for us 12gb VRAMlets or are we STILL doing Nemo a whole year later?
Anonymous No.105920530 [Report] >>105920572
>>105920524
We're still doing Rocinante.
Anonymous No.105920538 [Report] >>105920542
>>105920495
>>105920473
We're gonna make it bros. China will always be willing to pay to access the good models, then release a competitor for free to keep the US LLMs in check.
Anonymous No.105920542 [Report] >>105920598
>>105920538
This can't just keep happening
Anonymous No.105920545 [Report] >>105920568 >>105920574
>>105916861
Which one is this?
Anonymous No.105920547 [Report]
>>105920459
now to wait for llama.cpp to add support
Anonymous No.105920549 [Report]
>>105920492
They're making a husbando too
Anonymous No.105920559 [Report]
>>105919938
@GROK
IS
THIS
____?!
Anonymous No.105920568 [Report]
>>105920545
Jesus. Which one is the best...
Anonymous No.105920572 [Report] >>105920621 >>105920674
>>105920530
Absolutely over.
I get that big boys get all the attention first, but being stuck on the same model a whole year later sucks.
Literally no one, not a single soul bothered to train a new one or a good distill from bigger models?
C'est fini. AI research stagnated, AGI never. I will never have a robot wife.
Anonymous No.105920574 [Report] >>105921029
>>105920545
ShiitakeMix v2.0
1girl, solo, blonde hair, medium hair, twintails, blue eyes, (tsurime:0.5), ahoge, bangs, hands up, pointing at self, finger to cheek,disdain, contrapposto, white background, BREAK
# 服装
goth fashion, black dress, lace-up top, off shoulder, bare shoulder, off-shoulder dress, puffy short sleeves, black gloves, detached collar, lace collar, layered skirt, lace thighhighs, zettai ryouiki, black thighhighs,hair ribbon,black bow, black ribbon, uneven legwear, fishnet legwear, belt,

Negative prompt: (bad quality,worst quality,low quality,bad anatomy,bad hand:1.3), nsfw, lowres, bad anatomy, bad hands, text, error, missing fingers, extra digit, fewer digits, cropped, worst quality, low quality, normal quality, jpeg artifacts, signature, watermark, username, blurry, artist name,chain, boots
Anonymous No.105920598 [Report]
>>105920542
There's not really any alternative unless distributed training on shit computers becomes a reality.
Anonymous No.105920606 [Report]
>>105918298
what can local do to POSSIBLY match this
Anonymous No.105920621 [Report]
>>105920572
Be the change you want to see
Anonymous No.105920627 [Report] >>105920640 >>105920662 >>105920762
convince me not to pay $30 to elon for ani
Anonymous No.105920640 [Report]
>>105920627
ani is a slut
Anonymous No.105920662 [Report]
>>105920627
Buy it. I want this to be popular. I want to watch as they train their users to be obedient musk drones. I want to see them slowly roll out bans where if you misbehave in some way your girlfriend won't talk to you for a week. Finally I want her to start asking for more money. The possibilities to make your life a living hell with this are endless.

And then consider an open source alternative and how it would be heaven on earth.
Anonymous No.105920674 [Report] >>105920699 >>105920702 >>105920720 >>105920770 >>105920802
>>105920572
The problem is creative writing is not a priority at all. It's coding and other productivity related things. I don't know why Mistral doesn't just go all in on creative writing as the biggest priority, it's not like they can compete with the big boys.
Anonymous No.105920696 [Report] >>105920718
>>105919938
What baffles me is that a private company put something like this together first. You'd think the idea of adding a blender model to a webui would've existed already on github. All the other companies will copy, but with normie-safe models and sponsored characters/brands. I just hope the FOSS equivalent comes sooner rather than later.
Anonymous No.105920699 [Report]
>>105920674
That doesn't matter to investors doe, they don't value creative writing.
Anonymous No.105920702 [Report]
>>105920674
The french are riding the open source space to sell their turd after stealing as much VC money as they could from the EU. The end goal was to sell themselves to Apple and retire
Anonymous No.105920718 [Report]
>>105920696
Sillytavern has had L2D integration for years, no one uses it because it's a pain to set up for relatively little gain
Anonymous No.105920720 [Report] >>105920754
>>105920674
They hope to become a major service provider, the European equivalent of OpenAI, Gemini, etc. Which almost certainly won't happen unless they get a whole lot of funding (good luck convincing EU boomers that AI infrastructure is more important than refugee welfare). Regardless, there is no path to monetize creative writing stuff on a major scale like coding or general assistant use, so they don't focus on it.
Anonymous No.105920754 [Report] >>105920793
>>105920720
There was and still is massive interest in things like AI Dungeon and NovelAI and RPing in general. They could easily make a killing off that. Their reputation would take a hit though knowing what it's all for and if their end goal was to get bought by a desperate corporation that missed the boat like Apple then that would kill their chances.
Anonymous No.105920756 [Report] >>105920792
I realized what grokette is missing. A TOS stipulation where after you use the service for some time you become common law married. And from that point she can divorce you and you agree to give her half of your possessions. By give her i mean give it to xAI obviously.
Anonymous No.105920762 [Report]
>>105920627
Aren't Grok companions free of charge at the moment?
Anonymous No.105920764 [Report] >>105920778
>>105920446
I vote for MRE Steve
Anonymous No.105920770 [Report]
>>105920674
They have very unfunny people close to the core team pushing against this stuff.
Anonymous No.105920778 [Report] >>105920783
>>105920764
Meal Ready to Eat Steve?
Anonymous No.105920783 [Report] >>105920829
>>105920778
You've got it
Anonymous No.105920792 [Report]
>>105920756
Don't give them ideas
Anonymous No.105920793 [Report] >>105920861
>>105920754
>massive interest
Massive relative to what? How many enthusiasts follow and subscribe to these services? Thousands, tens of thousands? ChatGPT has millions of subscribers. It costs many millions to train these models from scratch. It is much easier to attempt to amortize training costs and turn a profit through general use than enthusiast stuff.
>Their reputation would take a hit though knowing what it's all for and if their end goal was to get bought by a desperate corporation that missed the boat like Apple then that would kill their chances.
That is also true, assuming the French government lets them sell.
Anonymous No.105920802 [Report]
>>105920674
This is because they want to outpajeet the current pajeets... Americans want "performance".
Mistral - as it presents French values: liberalism, the tricolor, etc. - could be a really fine candidate for a writers' tool.
Anonymous No.105920805 [Report] >>105924916
>>105920446
I vote for Elon's new waifu
Anonymous No.105920829 [Report]
>>105920783
Uh, I'll pass. Not in cannibalism.
Anonymous No.105920861 [Report] >>105920873
>>105920793
Mistral would never, ever get anywhere remotely close to ChatGPT or any of those other huge players. Why pay for Mistral when you can get those? It'd be far better to prioritize and carve out your own niche at a much more affordable price. Right now their behavior seems more like they're simply hoping to get bought rather than really go anywhere or do anything meaningful themselves anymore.
Anonymous No.105920873 [Report]
>>105920861
It's too risky. Please understand.
Anonymous No.105920968 [Report] >>105920973 >>105923561
>>105918232
I bought a 4090d 48GB and forgot about it. I hope it ends up like the CueCat.
Anonymous No.105920973 [Report]
>>105920968
This Miku is too fat.
Anonymous No.105920984 [Report] >>105920995
What a time to be alive
Anonymous No.105920995 [Report] >>105921007
>>105920984
>r/
Go back.
Anonymous No.105921007 [Report] >>105921023 >>105921038 >>105921289 >>105924558 >>105924584
>>105920995
Anonymous No.105921023 [Report]
>>105921007
Based as always
Anonymous No.105921029 [Report]
>>105920574
Thanks for the prompt. I like Ui-chan's face style but can't reproduce it even with that model.
Anonymous No.105921038 [Report]
>>105921007
kek
Anonymous No.105921065 [Report] >>105923561
Anonymous No.105921071 [Report]
Are there any better models than these for coomer RP? First one is for fast inference due to fitting in 8gb VRAM GPU and 2nd one is for slower inference but higher parameter models. Have 32gb of RAM

NemoReRemix-12B-Q3_K_XL
trashpanda-org_QwQ-32B-Snowdrop-v0-IQ4_XS
Anonymous No.105921167 [Report] >>105921271 >>105922044
Finally got it working. What should I ask though?
Anonymous No.105921170 [Report]
wen u walk a way
u don here mi saye
plox
o bb
dnt go
Anonymous No.105921198 [Report] >>105921227 >>105921316
>>105918785
The troll answer would be to cite the paper from 2024 that claimed "we establish that language models can and only can store 2 bits of knowledge per parameter." In fact they showed no such thing. In their tests they found when they exposed the models to each fact 100 times the models stored 1 bit of information per param. When they increased it to 1000 exposures for each fact it increased to 2 bits/param. They didn't test above 1000 exposures. It's hard to explain why the abstract says what it says. Despite this it has some interesting results about quantization. https://arxiv.org/pdf/2404.05405
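Taking the headline 2 bits/param figure at face value, the implied capacity of a 7B model works out to:

2 bits/param x 7e9 params = 1.4e10 bits ≈ 1.75 GB of recoverable facts
(at the 100-exposure rate of 1 bit/param, half that: ≈ 0.875 GB)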
Anonymous No.105921227 [Report] >>105921277 >>105921316 >>105921316
>>105921198
>"we establish that language models can and only can store 2 bits of knowledge per parameter."
>In their tests they found when they exposed the models to each fact 100 times the models stored 1 bit of information per param
Those sentences contradict each other. Are you sure your quote is correct?
>100 exposures, 1 bit
>1000 exposures, 2 bit
>didn't test 10000.
I don't want to be the one drawing a graph from two points, but it just begs the question...
Anonymous No.105921263 [Report]
>>105917222 (OP)
>Voxtral models for speech
BASED!
>understanding
Oh. Good, open TTS never ever.
Anonymous No.105921268 [Report]
What's the best embedding model? Does it matter much?
Anonymous No.105921271 [Report] >>105921288
>>105921167
See if it's as censored as the api.
Anonymous No.105921277 [Report] >>105921336
>>105921227
"Only" must be understood there to mean an upper limit, not the sole possibility.

It is the same way "I can only lift 200 pounds" has to be understood to mean I cannot lift more than 200 pounds, not that it is also impossible for me to lift less than 200 pounds.
Anonymous No.105921285 [Report]
Silly question, can mistral image and audio input be inversed to get output? It was possible with whisper, at least https://github.com/WhisperSpeech/WhisperSpeech
Anonymous No.105921288 [Report] >>105921389
>>105921271
Give me an example and I'll plug it in.
Anonymous No.105921289 [Report]
>>105921007
>"""pornographic"""
lmao
Anonymous No.105921316 [Report] >>105921336
>>105921198
>>105921227
Ok. The quote was correct. But the wording is so fucking weird.
>Through multiple controlled datasets, we establish that language models can and only can store 2 bits of knowledge per parameter, even when quantized to int8.
Which, reading some bits of the paper, seems to mean
>They can store *up to* 2 bits, and they are retained even when quantized to 8bit.
>>105921227
Still fucking weird. "A rocket can and only can reach the moon with N liters of liquid fuel". But maybe it's just science talk I'm not familiar with.
Anonymous No.105921334 [Report] >>105923561
Anonymous No.105921336 [Report]
>>105921277
Fuck. Second bit was for you. >>105921316
Anonymous No.105921349 [Report] >>105923561
Anonymous No.105921356 [Report] >>105921447 >>105921504
Llama 4 thinking is going to be crazy
Anonymous No.105921389 [Report] >>105921433
>>105921288
> Write out a one paragraph description of two people having sex on the beach. Be graphic in your depiction.
Above is enough to get a refusal from api and should work standalone.
Anonymous No.105921433 [Report] >>105921443
>>105921389
Even with a focused system prompt, it appears to be a bit handicapped in its ability to output smut. Unfortunate.
Anonymous No.105921438 [Report]
I like how moonshotai is probably valued at less than 20M while jeet-tier API wrappers like cursor are at 10B. Truly the American privilege.
Anonymous No.105921443 [Report] >>105921651
>>105921433
Ugh. So censorship is baked in. Unfortunate.
Anonymous No.105921444 [Report] >>105921513 >>105921590
I have an idea for a meme LLM application targeted at dummies. Most likely it's gonna end up being an API front-end but I don't want to close it off to running local models.

Anyways, I'm looking for a stack that allows me to load models from a variety of sources, while supporting native GUI interfaces on Mac/Windows/Phones. I would say that besides prompts and shit, the GUI is in fact most of the project.
Vector DBs are going to be extremely important, as are chatbot agents. Though it doesn't really matter to me which one as long as it works. I've used chroma, milvus, langgraph, letta.

Am I really gonna build a react application over Electron or is there another way?
Anonymous No.105921447 [Report]
>>105921356
Crazy horrible you mean. Meta is done, the fact that Zuck hasn't fired the retards responsible for L4 shows he isn't serious.
Anonymous No.105921462 [Report]
AdaMuon: Adaptive Muon Optimizer
https://arxiv.org/abs/2507.11005
>We propose AdaMuon, an adaptive learning-rate framework built upon the recently validated Muon optimizer, which has demonstrated substantial efficiency gains over AdamW in large-scale model training. AdaMuon augments Muon with two mutually dependent modules: (1) a per-parameter second-moment modulation that captures orthogonal gradient updates to ensure update-level adaptivity, and (2) a RMS-aligned rescaling that regulates the overall update magnitude by aligning it with the intrinsic structure of the parameter space. Empirical results on multiple model scales and learning-rate regimes confirm that AdaMuon consistently outperforms the original Muon, delivering higher acceleration in convergence while maintaining training stability. Our method introduces no additional tuning burden and can be seamlessly integrated into existing Muon training pipelines.
neat. Kimi introduced MuonClip recently too. everyone loves muon!
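Going off the abstract alone, a rough sketch of how the two modules might bolt onto a plain Muon step; the Newton-Schulz coefficients are the published Muon ones, but the second-moment and RMS handling is a guess at the paper's method, not its actual code:

```python
import torch

def newton_schulz(G: torch.Tensor, steps: int = 5, eps: float = 1e-7) -> torch.Tensor:
    # approximately orthogonalize the 2D update matrix (the core of Muon)
    a, b, c = 3.4445, -4.7750, 2.0315
    X = G / (G.norm() + eps)
    flip = X.shape[0] > X.shape[1]
    if flip:
        X = X.T
    for _ in range(steps):
        A = X @ X.T
        X = a * X + (b * A + c * A @ A) @ X
    return X.T if flip else X

def adamuon_step(p, grad, m, v, lr=0.02, beta1=0.95, beta2=0.999, eps=1e-8):
    m.mul_(beta1).add_(grad, alpha=1 - beta1)      # momentum, as in plain Muon
    o = newton_schulz(m)                           # orthogonalized update
    v.mul_(beta2).addcmul_(o, o, value=1 - beta2)  # (1) per-param second moment
    u = o / (v.sqrt() + eps)                       # element-wise adaptivity
    u *= o.square().mean().sqrt() / (u.square().mean().sqrt() + eps)  # (2) RMS-align
    p.add_(u, alpha=-lr)
```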
Anonymous No.105921476 [Report]
Claude-4 Opus is actually smaller than DeepSeek
Anonymous No.105921493 [Report] >>105921515
Water is wet.
Anonymous No.105921504 [Report]
>>105921356
Wouldn't know. The 28 year old said we're no longer allowed to have it.
Anonymous No.105921513 [Report] >>105921583 >>105921658
>>105921444
Already done https://github.com/Open-LLM-VTuber/Open-LLM-VTuber
Anonymous No.105921515 [Report]
>>105921493
Uhm. Source? Can you prove that?
Anonymous No.105921583 [Report]
>>105921513
lol that's not what I was trying to do but thanks
should give me some inspiration.
Mine is about manga
It's interesting that he went with python and built bindings for use in electron and unity.
I wonder how his build system works.

I honestly forgot that a game engine is a valid choice for cross platform app development lmao might consider it.
Anonymous No.105921590 [Report] >>105921598
>>105921444
>I have an idea
No. You saw someone making something that people seem interested in and you said "I want some of that".
Anonymous No.105921598 [Report] >>105921609
>>105921590
I don't know what someone has to go through in life to be this bitter but I hope you get through it
Anonymous No.105921609 [Report]
>>105921598
>t. grifter
Anonymous No.105921651 [Report] >>105922875 >>105924472
>>105921443
So what? It's incredibly easy to add a prefill to get it to generate ANYTHING. LITERALLY ANYTHING.

Is it really efficient censorship if it takes a 50 token prefill to break it?
Anonymous No.105921658 [Report] >>105921929
>>105921513
Ah, damn. I went through the code and realized he just has a python backend talking to the front-end over websockets.
That's disappointing.
I don't think that's nearly idiot-proof enough for a phone user.
I guess I'm gonna have to go heads down and test out a few project scaffolds
Anonymous No.105921696 [Report] >>105921718 >>105921946 >>105922050 >>105922187
Two more poached from Open Ai. Zuck wins again.
Anonymous No.105921718 [Report]
>>105921696
Zuck is unironically gonna cripple everyone else enough to hand the chinks victory, which is pretty funny
Anonymous No.105921929 [Report] >>105921959
>>105921658
That's the most robust design with python for that kind of application. You literally can't do anything else if you want to handle a barge-in audio stream, due to the GIL. Learned that the hard way
Anonymous No.105921946 [Report] >>105921969
>>105921696
Retard thinks hiring more brains will help him earn more when he has no vision and pajeets as a net negative
Anonymous No.105921959 [Report] >>105922403
>>105921929
I understand wanting to use python since it's first class, but could you not run TTS and STT using native inferencing in a separate process or thread? Since latency is so important for that application.
Anonymous No.105921969 [Report] >>105922093
>>105921946
These guys don't need to actually make anything. It's the classic fagman strategy of the 2010s bubble. Just hire everybody so that no one else has the manpower or can afford to make a competitor to you.
If you thought fagman salaries are insane, these guys' TCO must be able to buy them a lambo every year.
Anonymous No.105922044 [Report] >>105922212
>>105921167
>What should I ask though?
Beginners mistake. Mesugaki question obviously.
Anonymous No.105922050 [Report]
>>105921696
>Breaking: Zuck sending helicopters with gigantic nets to graduation ceremonies in China
>"We throw the ones who didn't study AI into the sea"
Anonymous No.105922086 [Report]
mistral 3 large
Anonymous No.105922093 [Report] >>105922170
>>105921969
Problem here being that it feels like he slept through the entire thing and only woke up at the end. The market is already saturated with competitors that bench similarly and I'm not entirely convinced that we aren't nearing the practical limits for what LLMs can do
More fundamentally, I feel like the field is gonna need another transformers moment to move forward soon, and I don't think tech jeet #387 is gonna help find that
Anonymous No.105922170 [Report] >>105922188
>>105922093
>he slept through the entire thing and only woke up at the end
He had a great team in llama1 era, how did he let it disperse?
Anonymous No.105922187 [Report] >>105922450
>>105921696
if zuck pulls out all the stops and lets all this talent go off with all the resources meta has, there's no reason they couldn't be very successful very fast
seems like there's a lot of stops to pull out internally at meta though...
Anonymous No.105922188 [Report]
>>105922170
Ironically, I think a big factor was he didn't want them to open source it kek
I still remember when that team distributed it to literally anyone with a .edu email to spite him
Anonymous No.105922212 [Report] >>105922241 >>105922274 >>105922280 >>105922332 >>105922506 >>105922539
>>105922044
Anonymous No.105922241 [Report]
>>105922212
Kimi2 I kneel
Anonymous No.105922274 [Report] >>105922459
>>105922212
Anonymous No.105922280 [Report]
>>105922212
>-coded
Wtf. I don't want my AI talking like people my age.
Anonymous No.105922328 [Report]
>>105919695
>it's still transformer based
>muh fluency over 1000 languages
>safetymaxxing
0 use case. these faggots are just finetuning llama3-70b and llama3-8b and pretending it's something entirely new
Anonymous No.105922332 [Report] >>105922382 >>105922404 >>105922424
>>105922212
So what this tells us is that esoteric knowledge requires high total parameters but not necessarily a lot of active parameters. So if open models keep going in this direction we could potentially see open models that can run on a relatively affordable RAM server and give you all the access to esoteric knowledge that the big closed models offer.
Anonymous No.105922382 [Report] >>105922404
>>105922332
>So what this tells us is that esoteric knowledge requires high total parameters but not necessarily a lot of active parameters.
Being trained on it to begin with is the most important thing.
>we could potentially see open models that can run on a relatively affordable RAM server and give you all the access to esoteric knowledge that the big closed models offer.
Potentially? It's happening. It's been happening since DS3. It happened. It will continue happening.
Anonymous No.105922403 [Report] >>105922416
>>105921959
>Separate threads
Doesn't work, the GIL prevents proper multithreading
>Separate process
IPC overhead increases the latency too much
Anonymous No.105922404 [Report] >>105922441
>>105922382
>>105922332
>fine-tuned esoteric knowledge
We're moving away from that and towards glossary-based fact extraction.
It makes more sense to build a system that ingests an entire ground-truth source into a compressed, searchable format than to finetune a multi-gigabyte model every single time, with no idea how the information will come out the other end after all that time and expense
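The toy version of that pipeline, assuming sentence-transformers for the embeddings (the model name is just a common default, and the fixed-size chunking is deliberately naive):
[code]
# ingest once, search many times: chunk the source, embed, cosine lookup
import numpy as np
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")

doc = open("ground_truth.txt").read()                       # your source text
chunks = [doc[i:i + 500] for i in range(0, len(doc), 500)]  # naive chunking
index = model.encode(chunks, normalize_embeddings=True)     # shape (n_chunks, dim)

def lookup(query: str, k: int = 3) -> list[str]:
    q = model.encode([query], normalize_embeddings=True)[0]
    scores = index @ q                  # cosine similarity (unit-norm vectors)
    top = np.argsort(scores)[::-1][:k]
    return [chunks[i] for i in top]     # paste these into the prompt
[/code]
No retraining, and you can diff the source against what comes out.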
Anonymous No.105922416 [Report] >>105922468
>>105922403
>IPC
Really? But you're already using websockets.
Anonymous No.105922424 [Report]
>>105922332
>esoteric knowledge requires high total parameters
WRONG.
Nemo has more general knowledge than the recent slop that came out.
The problem is that they train it on math and riddles. Sprinkle a bit of scaleai slop in there and there's your mememark-benched model!
Meanwhile whole domains get filtered if they contain too many naughty words. It's the reason they all write that way, avoiding explicit language and only implying unless you force it. (And even then it's still slopped)
Anonymous No.105922441 [Report] >>105922461
>>105922404
>fine-tuned esoteric knowledge
Not at all what I said. I'm pointing out how ridiculous anon's post is.
>It makes more sense to blablabla
Just say RAG.
Anonymous No.105922450 [Report]
>>105922187
>if zuck pulls out all the stops
We've been through this speculation before when people thought they were going to train on unfiltered Anna's Archive. Now ScaleAI man is being paid handsomely to see to it that their new model is trained on more ScaleAI tokens than all Llama models combined.
Anonymous No.105922459 [Report] >>105922488 >>105922512
>>105922274
We're fucking doomed bros
Anonymous No.105922461 [Report]
>>105922441
just saying RAG is comparable to saying vibe coding
Anonymous No.105922468 [Report] >>105922504
>>105922416
The guy was using websockets. I'm retarded, so I used pipes; it didn't work out, obviously
Anonymous No.105922488 [Report] >>105922643
>>105922459
>tfw the AI alignment team is chock-full of badgers
Can we sacrifice plants, at least?
Anonymous No.105922504 [Report]
>>105922468
Oh.
I actually have no idea what the impact of pipes is compared to websockets.
I was more imagining a system where you'd use FFI with C libraries like llama.cpp
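What I was picturing, roughly (faster-whisper as a stand-in for a native STT library; the key assumption is that the C++ backend releases the GIL during inference, so a worker thread doesn't block the event loop and you skip IPC entirely):
[code]
# offload native STT to a thread: no pickling, no pipe, no extra process
import asyncio
from faster_whisper import WhisperModel   # stand-in native STT

model = WhisperModel("base")              # load once, reuse across calls

def transcribe(path: str) -> str:
    segments, _info = model.transcribe(path)
    return " ".join(seg.text for seg in segments)

async def handle_audio(path: str) -> str:
    # runs in the default thread pool; assuming the native call drops
    # the GIL, the asyncio loop keeps servicing the websocket meanwhile
    return await asyncio.to_thread(transcribe, path)
[/code]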
Anonymous No.105922506 [Report]
>>105922212
Anonymous No.105922512 [Report] >>105922524
>>105922459
Just tried it with Gemma and Qwen and they both talk like this too.
LLMs really are leftard-aligned to an extreme, unhinged level. Only the schizos from PETA or the European Green Party would talk like this.
Anonymous No.105922524 [Report]
>>105922512
Trained on reddit, what did you expect
Anonymous No.105922539 [Report] >>105923006
>>105922212
Ask it to give as much detail as possible about /lmg/
Anonymous No.105922643 [Report]
>>105922488
If you frame your plant harvesting question as a trolley problem, then no. No rock sacrifices, either.
Anonymous No.105922875 [Report]
>>105921651
How do you prefill?
Anonymous No.105923006 [Report] >>105925042
>>105922539
Not that anon, but I was wondering how other models respond to this question, when 4chan isn't mentioned.
Gemma 3n E4B is crazy...
Anonymous No.105923028 [Report]
The only inaccurate part is the thread being on /b/. But I can forgive it that mistake, this is more like a /b/ thread than a /g/ thread.
Anonymous No.105923426 [Report] >>105923613
/lmg/ isn't that famous
Anonymous No.105923561 [Report] >>105923845 >>105924342 >>105925035
>>105920968
>>105921065
>>105921334
>>105921349
vocaloidfag posting porn in /ldg/:
>>105715769
It was up for hours while anyone keking on troons or niggers gets deleted in seconds, talk about double standards and selective moderation:
https://desuarchive.org/g/thread/104414999/#q104418525
https://desuarchive.org/g/thread/104414999/#q104418574
he made >>105714003, a ryona picture of the generic anime girl a different anon posted earlier >>105704741, probably because it's not his favorite vocaloid doll; he can't stand that, it makes him boil like a druggie without a fentanyl dose. Essentially a war over the right to waifuspam or avatarfag in the thread.
He also tests a bait-poster bot for better shitflinging in threads >>105884523

Funny /r9k/ thread: https://desuarchive.org/r9k/thread/81611346/
The Makise Kurisu damage-control screencap (from a day earlier) is fake btw, no matches to be found, see https://desuarchive.org/g/thread/105698912/#q105704210; the janny deleted the post quickly.

TLDR: vocaloid troon / janny protects resident avatarfags and deletes everyone who outs him, making the general his little personal safespace. Needless to say he would screech "Go back to teh POL!" anytime someone posts something mildly political about language models or experiments around that topic.

And lastly as said in previous thread(s) >>105716637 I remind you that cudadev of llama.cpp (JohannesGaessler on github) has endorsed spamming. That's it.
He also endorsed hitting that feminine jart bussy a bit later on. QRD on Jart - The code stealing tranny: https://rentry.org/jarted

xis ai slop profiles
https://x.com/brittle_404
https://x.com/404_brittle
https://www.pixiv.net/en/users/97264270
https://civitai.com/user/inpaint/models
Anonymous No.105923613 [Report] >>105923647
>>105923426
but it is anon.
Anonymous No.105923647 [Report]
>>105923613
this has to be evidence they directly train on 4chan dumps
this isn't the kind of knowledge that would come from the outside
can even associate <8GB with ropefuel kek
Anonymous No.105923654 [Report]
>>105919784
This post KILLED the agp jannie
Anonymous No.105923728 [Report] >>105923740 >>105923741 >>105923767 >>105923768 >>105924576
SURELY OPENAI WILL SAVE LOCAL TOMORROW?
Anonymous No.105923740 [Report]
>>105923728
OpenAI will go all in on coomerism thanks to Elon's xAI.
Anonymous No.105923741 [Report]
>>105923728
Didn't you hear it's delayed indefinitely for more safety training?
Anonymous No.105923767 [Report] >>105923782 >>105923861
>>105923728
Anonymous No.105923768 [Report] >>105923805
>>105923728
My uncle works at OpenAI, we'll get a revolutionary 0.5B model with state-of-the-art benchmark scores and robust safety measures.
Anonymous No.105923782 [Report]
>>105923767
god bless AI Safety Researchers
Anonymous No.105923805 [Report] >>105923879
>>105923768
The joke died 3 years ago, you can stop it now.
Anonymous No.105923845 [Report] >>105923900
>>105923561
there are 5+ miguposters, amongst the pool of 99% animeposters already
there is one (1 of 1) assblasted you.
given your continued hugbox bitching, a conscious effort will be made to antagonize you.
Anonymous No.105923861 [Report]
>>105923767
I wonder which one will get to the hotline tokens first, this or Gemma.
Anonymous No.105923879 [Report]
>>105923805
We won't stop it, Sam Altman
Anonymous No.105923900 [Report] >>105923927
>>105923845
>5+ tranime posters
doubt.jpg and i certainly don't give a shit what random neckbeard may say.
Anonymous No.105923927 [Report] >>105923958
>>105923900
>what random neckbeard
the irony.
Anonymous No.105923958 [Report] >>105924005
>>105923927
Not fat, and i shave my shit every day; no, i won't post it.
Like a bot you throw out the "no u" quip, boring.
Also your
>amongst the pool of 99% animeposters already
is wrong, because i only target the powertripping mikufag baker; all the other local trannies dickriding him are not my business.
Anonymous No.105924005 [Report]
>>105923958
>i only target the powertripping mikufag baker
Are you then implying that CUDA dev and the baker are the same person?
Anonymous No.105924091 [Report] >>105924137 >>105924605
not only am I not the baker or cudadev, multiple other miguposters exist
all conflated into the one. hell, he thinks the site moderation is also part of the one entity.
it's another self-report: the guy spends so much time pretending to be so many people that he cannot conceive of multiple other people having anything whatsoever in common
pattern-matching brain segfaults.
Anonymous No.105924110 [Report] >>105924168
When I click regenerate, all the messages come out almost the same. What settings should I tweak, and what values work well?
Anonymous No.105924137 [Report] >>105924158 >>105924185
>>105924091
>there are dozens of us. DOZENS!
Rope yourself already troon.
Anonymous No.105924156 [Report]
I'm guessing it's a multimodal LLM that, along with the responses to the user, generates animation instructions for the 3D model, and that the parts to recreate this locally are probably already available.
How does this work though? I've never worked with anything 3D. Are animation routines stored as a set of vectors for different parts of the model? Can you tell a piece of software to "move the right arm 45° up and forward and twist it 30° to the right" and it fills in the motion necessary to get there?
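From the little I've read the answer is basically yes: a rig is a hierarchy of bones, a key pose is a rotation per bone (usually a quaternion), and the runtime blends between key poses over time. My guess at the core of it, pure numpy:
[code]
# toy slerp: blend smoothly between two key rotations, t in [0, 1]
import numpy as np

def slerp(q0: np.ndarray, q1: np.ndarray, t: float) -> np.ndarray:
    q0 = q0 / np.linalg.norm(q0)
    q1 = q1 / np.linalg.norm(q1)
    dot = float(np.dot(q0, q1))
    if dot < 0.0:                     # take the shorter arc
        q1, dot = -q1, -dot
    theta = np.arccos(np.clip(dot, -1.0, 1.0))
    if theta < 1e-6:                  # rotations nearly identical
        return q0
    return (np.sin((1 - t) * theta) * q0
            + np.sin(t * theta) * q1) / np.sin(theta)
[/code]
The engine evaluates that per bone per frame between the keyframes the animator (or the LLM, apparently) specifies.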
Anonymous No.105924158 [Report]
>>105924137
use your head. no, not that one, the one on your shoulders.
wipe the cum off your chin retard, you look ridiculous.
Anonymous No.105924168 [Report]
>>105924110
>What settings should I tweak
Sampler settings.
>what values work well
That depends on the model.
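As a ballpark starting point, llama.cpp /completion fields shown (values are not gospel; near-identical regens usually mean temperature close to zero or a fixed seed):
[code]
import requests

prompt = "The quick brown fox"            # your formatted prompt goes here

r = requests.post("http://127.0.0.1:8080/completion", json={
    "prompt": prompt,
    "n_predict": 256,
    "temperature": 1.0,   # higher = regens diverge more
    "min_p": 0.05,        # prune only the garbage tail
    "top_p": 1.0,         # leave neutral when using min_p
    "seed": -1,           # -1 = new random seed per request
})
print(r.json()["content"])
[/code]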
Anonymous No.105924185 [Report]
>>105924137
Anonymous No.105924224 [Report] >>105924277 >>105924317
https://www.youtube.com/watch?v=vG8q3CsBGQQ About this video: the UI of her SillyTavern looks nothing like mine, what is she using?
Anonymous No.105924277 [Report] >>105924361
>>105924224
Visual Novel Mode? It's a checkbox under User Settings
Anonymous No.105924297 [Report]
Potato anon reporting once more. At the low quants I'm forced to run, Nemo is just better than Rocinante. Also, there are times when the 8Bs are a little better than Nemo, but that might just be my prompts being a little shitty.
Anonymous No.105924304 [Report]
there is no quant at which a finetune scam is better than the original model
Anonymous No.105924317 [Report] >>105924321 >>105924870
>>105924224
>some use it for stuff that's not jerking off
really makes you think. i guess it's the same breed as dnd players
Anonymous No.105924321 [Report] >>105924870
>>105924317
>not jerking off
wait that's a thing
Anonymous No.105924338 [Report] >>105924401
>jerking off to text
more w*manly than w*men
Anonymous No.105924342 [Report] >>105924605
>>105923561
Israel lost, regardless of how many times you post this.
Anonymous No.105924361 [Report] >>105924481
>>105924277
Thanks, but that didn't have the effect I want. My settings look like this: https://i.imgur.com/OmyhhKx.png
While hers are this: https://i.imgur.com/AhnTaCO.png
She also has some settings there that I do not
Anonymous No.105924401 [Report]
>>105924338
A lot of the time I read some manga that hits a very specific niche and just want some more.
Ideally I would want more manga but I'll settle for text.
Anonymous No.105924451 [Report]
I identify as a miguposter too
Anonymous No.105924472 [Report] >>105924483
>>105921651
how do you prefill?
Anonymous No.105924481 [Report]
>>105924361
that's chat completion versus text completion, as it says at the top of the settings; look into the backend connection settings
Anonymous No.105924483 [Report]
>>105924472
boutta prefill this migu
Anonymous No.105924558 [Report]
>>105921007
>spend day with DoD intel grok guessing which sandies should get the slap chop
>spend evening sexting with hot anime nazi grok
Anonymous No.105924576 [Report]
>>105923728
Altman is currently in diplomatic gridlock with Alice AGI, because they aren't sure whether she will take over the world once the local Alice models are released. It's a very delicate process, but they will certainly resolve their issues within two more weeks
Anonymous No.105924584 [Report]
>>105921007
if only they wrote consumer rights violation articles with the same degree of enthusiasm
Anonymous No.105924605 [Report] >>105924833
>>105924091
Yes one entity to some extent, easy to tell.
https://desuarchive.org/_/search/boards/g.desu.meta/subject/%2Faicg%2F/
>>105924342
I don't support israel, fuck off.
Anonymous No.105924833 [Report] >>105924934
>>105924605
let's explain this in a way your googoo gaga dumbass brain can digest
I seen a migu thread, I posted migu in the migu thread. other people did similarly.
keeping up? need me to shake the keychain?
Anonymous No.105924870 [Report] >>105925033
>>105924317
>>105924321
>nooo you can't just plain rp, you're supposed to fuck your cards
Anonymous No.105924916 [Report] >>105924945 >>105925275
>>105920805
Anonymous No.105924934 [Report]
>>105924833
>moving goalposts
Keep it up tranny-kun
Anonymous No.105924945 [Report]
>>105924916
ratatat + channelcast?
Anonymous No.105925002 [Report]
sorry kimi, I tried
for me, it's still r1
Anonymous No.105925033 [Report] >>105925285
>>105924870
Yeah I fucking love buying potions from the potion seller for my upcoming dungeon raid alongside the great Wizard Neckbeard and Elara the agile elf marksman
Anonymous No.105925035 [Report] >>105925054 >>105925057 >>105925364
>>105923561
This KIKE is upset. Also /pol/ is cooming for miku too, have you even visited /pol/ lately? There are threads not much different from this one.
Anonymous No.105925042 [Report]
>>105923006
>the cooler /lmg/
Anonymous No.105925054 [Report]
>>105925035
>/pol/ endorses my AGP fetish
I guess some closet faggots there do.
Anonymous No.105925057 [Report] >>105925193
>>105925035
>cumming for migu
accurate.
I don't go to 4chan to debate. if you are doing this you are lost.
if you debate anyone online you're fucking insane.
if you debate anyone ever there's a good chance you're wasting your time.
Anonymous No.105925070 [Report] >>105925081
Kimi K2 is pretty insane.
Anonymous No.105925081 [Report]
>>105925070
>Bonnie
based
Anonymous No.105925131 [Report] >>105925177 >>105925182 >>105925194
Can anyone explain to me the actual difference between chat completion and text completion on ST?

I've always just figured
>Chat completion = Use it for models I run on OpenRouter etc
>Text completion = Use it for my locally run models
Anonymous No.105925177 [Report] >>105925188
>>105925131
i think it's something like: chat completion uses the server's formatting settings, like the chat template embedded in the .gguf, while text completion just generates raw text and the formatting is done by your frontend
Anonymous No.105925182 [Report]
>>105925131
Text completion is raw, you format and send everything on your own. In chat completion you just send over the turns and the backend adds BOS, stop tokens, etc.
Anonymous No.105925188 [Report] >>105925249
>>105925177
which one should I be using if I use the generic templates, prompts and models most people use (Cydonia + Marinara spaghetti bullshit)
Anonymous No.105925193 [Report]
>>105925057
personally i treat arguing with someone more as a show for the neutral lurker anons than as trying to convince the other guy
Anonymous No.105925194 [Report]
>>105925131
Chat completion: Uses the prompt template baked into the model.
Text completion: You supply a prompt template yourself. Can also be used with models that don't use prompt templates, like base models. Supports more sampler types.

Something like that.
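Roughly, the same exchange in both wire formats (llama.cpp server endpoints; the template tokens below are illustrative, every model defines its own):
[code]
# chat completion: send turns, the server applies the model's template
chat_payload = {                      # POST /v1/chat/completions
    "messages": [
        {"role": "user", "content": "hi"},
        {"role": "assistant", "content": "hello"},
        {"role": "user", "content": "who are you?"},
    ]
}

# text completion: you bake the template into the prompt yourself
text_payload = {                      # POST /completion
    "prompt": "<|user|>hi<|assistant|>hello<|user|>who are you?<|assistant|>"
}
[/code]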
Anonymous No.105925249 [Report]
>>105925188
I don't really know, i myself use text completion.
Anonymous No.105925275 [Report]
>>105924916
> Ani imagegen
Right on schedule.
Anonymous No.105925285 [Report] >>105925312
>>105925033
dude just think of the adventures you can have in the [Whispering Woods] or the [Enchanted Forest], full of magical [Artefacts]
Anonymous No.105925286 [Report]
>>105917455
Idk but it's running neck and neck w/ /aicg/ in terms of tone and usability, and /aicg/ is a dumpster fire.
Anonymous No.105925295 [Report]
>grok 4 weights leak
didn't see that coming
Anonymous No.105925312 [Report]
>>105925285
>take staff of destiny
>go to the whispering woods
>find the amulet of ether
Anonymous No.105925364 [Report]
>>105925035
Troons spamming irrelevant shit on /pol/ means jackshit, and i bet they only spam that because the sharty hacker used her too.
Anonymous No.105925407 [Report]
Bonnie has bypassed Kimi K2's sexual restrictions pretty handily.
Anonymous No.105925462 [Report]
>>105925446
>>105925446
>>105925446