/lmg/ - a general dedicated to the discussion and development of local language models.
Previous threads:
>>105909674 & >>105904543

►News
>(07/15) Voxtral models for speech understanding released: https://mistral.ai/news/voxtral
>(07/15) LG AI Research releases EXAONE 4.0: https://www.lgresearch.ai/blog/view?seq=576
>(07/11) Kimi K2 1T-A32B released: https://moonshotai.github.io/Kimi-K2
>(07/11) Granite 4.0 support merged: https://github.com/ggml-org/llama.cpp/pull/13550
>(07/10) Devstral Small 1.1 released: https://hf.co/mistralai/Devstral-Small-2507

►News Archive: https://rentry.org/lmg-news-archive
►Glossary: https://rentry.org/lmg-glossary
►Links: https://rentry.org/LocalModelsLinks
►Official /lmg/ card: https://files.catbox.moe/cbclyf.png

►Getting Started
https://rentry.org/lmg-lazy-getting-started-guide
https://rentry.org/lmg-build-guides
https://rentry.org/IsolatedLinuxWebService
https://rentry.org/tldrhowtoquant
https://rentry.org/samplers
►Further Learning
https://rentry.org/machine-learning-roadmap
https://rentry.org/llm-training
https://rentry.org/LocalModelsPapers
►Benchmarks
LiveBench: https://livebench.ai
Programming: https://livecodebench.github.io/leaderboard.html
Code Editing: https://aider.chat/docs/leaderboards
Context Length: https://github.com/adobe-research/NoLiMa
Censorbench: https://codeberg.org/jts2323/censorbench
GPUs: https://github.com/XiongjieDai/GPU-Benchmarks-on-LLM-Inference
►Tools
Alpha Calculator: https://desmos.com/calculator/ffngla98yc
GGUF VRAM Calculator: https://hf.co/spaces/NyxKrage/LLM-Model-VRAM-Calculator
Sampler Visualizer: https://artefact2.github.io/llm-sampling
►Text Gen. UI, Inference Engines
https://github.com/lmg-anon/mikupad
https://github.com/oobabooga/text-generation-webui
https://github.com/LostRuins/koboldcpp
https://github.com/ggerganov/llama.cpp
https://github.com/theroyallab/tabbyAPI
https://github.com/vllm-project/vllm
►Recent Highlights from the Previous Thread: >>105909674

--Specialized hardware enables fast inference of massive models despite memory limitations:
>105910874 >105910888 >105910898 >105910973 >105911080 >105911205 >105911432 >105912396 >105910981 >105910992 >105911029 >105911037 >105911049 >105913140 >105913172 >105910891 >105910991 >105911001
--Real-time LLM-driven animation synthesis and motion-synthesis alternatives:
>105915245 >105915263 >105915313 >105915398 >105915422 >105915452 >105915496 >105915569 >105915587 >105915472 >105915502
--Evaluating high-RAM servers for LLM deployment under memory bandwidth constraints:
>105910735 >105910772 >105910799 >105910833 >105911111 >105911225 >105911290 >105911475 >105911524 >105911589
--Enthusiast hardware investments and model scaling choices:
>105911855 >105911958 >105912019 >105912232 >105912400 >105912650 >105914311
--MistralAI releases open-source speech understanding models with extended transcription support:
>105915291 >105915372 >105915425 >105915642 >105915788 >105915791
--Resumption of Nvidia chip sales to China sparks geopolitical and tech independence debates:
>105914458 >105914500 >105914534 >105914783
--CLI-based Kimi-2 model interaction with poem generation on high-core-count EPYC hardware:
>105914901
--Discussion around the Waidrin procedural roleplay system:
>105913723 >105913904 >105914001 >105914022 >105914082 >105914054 >105914040 >105914112 >105914189 >105914319 >105914498 >105914573 >105914365
--EXAONE-4.0-32B release faces Llama.cpp integration hurdles:
>105909970 >105910006 >105910791 >105915758 >105915768 >105911465 >105911484 >105911522 >105911495
--K2 model struggles with instruction following and roleplay consistency despite quantization and parameter tweaks:
>105912043 >105912611 >105912722
--Teto and Miku (free space):
>105909867 >105914231 >105915905

►Recent Highlight Posts from the Previous Thread: >>105909677

Why?: 9 reply limit >>102478518
Fix: https://rentry.org/lmg-recap-script
Every time you gen a Teto without her hair ribbon the next local sota model is delayed by two weeks.
>>105917222 (OP)I will forever be willing to take that hand.
>>105917222 (OP)Will EXAONE and GLM-4 100b save local?
The waifu era is upon us.
>>105917259
>the next local sota model is delayed by two weeks.
are you trying to make it my mission to gen a ribbon free teto every day
>>105917259What's wrong with the classic Teto?
>>105917382Uh oh melty again
Regretfully I would like to inform you that grok-chan cannot be the /lmg/ mascot. She is based. And the absolute prerequisite of being an /lmg/ mascot is fucking niggers.
Not me btw
>>105917382 >>105917396But i will post it regardless cause the porn thing remains true.
vocaloidfag posting porn in /ldg/:
>>105715769It was up for hours while anyone keking on troons or niggers gets deleted in seconds, talk about double standards and selective moderation:
https://desuarchive.org/g/thread/104414999/#q104418525
https://desuarchive.org/g/thread/104414999/#q104418574
he makes
>>105714003 ryona picture of generic anime girl different anon posted earlier
>>105704741, probably because its not his favorite vocaloid doll, he can't stand that as it makes him boil like a druggie without fentanyl dose, essentially a war for rights to waifuspam or avatarfag in thread.
Funny /r9k/ thread: https://desuarchive.org/r9k/thread/81611346/
The Makise Kurisu damage control screencap (day earlier) is fake btw, no matches to be found, see https://desuarchive.org/g/thread/105698912/#q105704210 janny deleted post quickly.
TLDR: vocaloid troon / janny protects resident avatarfags and deletes everyone who outs him, making the general his little personal safespace. Needless to say he would screech "Go back to teh POL!" anytime someone posts something mildly political about language models or experiments around that topic.
And lastly as said in previous thread(s)
>>105716637 I remind you that cudadev of llama.cpp (JohannesGaessler on github) has endorsed spamming. That's it.
He also endorsed hitting that feminine jart bussy a bit later on. QRD on Jart - The code stealing tranny: https://rentry.org/jarted
xis ai slop profiles
https://x.com/brittle_404
https://x.com/404_brittle
https://www.pixiv.net/en/users/97264270
https://civitai.com/user/inpaint/models
>tattoos
You have very bad taste
https://files.catbox.moe/iomwbe.mp4
>>105917414Trash shitfu is trashy.
All hail Elon. Sama Is the king.
slop consoomers eating good
>>105917407Bruh, why is this general like this?
Schizo go away. Your contribution amount to 0. Miku or not, you're useless and should kill yourself
>>105917474He bakes threads though
https://x.com/LiquidAI_/status/1943294736762064990
https://huggingface.co/LiquidAI
I'm going back to the old thread.
How many of the smartest people in China do you think they have generating LLM data all day long? On top of the petabytes they gather through surveillance.
>>105917736Does the American government not use its surveillance data for LLM training?
I'm sure they have a lot of it and from various countries as well.
>>105917847But the Chinese are jacked fully into everything. All the data is centralized and not even through shady deals like the NSA, they just do it. I think they're running massive mech turk farms for high IQ individuals. To make the average model IQ go up.
>>105917550Nevermind it's even worse.
How do I use chat templates from huggingface? Sillytavern master import doesn't seem to recognize them. Do I load them alongside models in llama/kobold/ooba?
>>105917912You just type the strings manually. Not that hard.
>>105917896That's probably a difference in technology and funding. Maybe if the NSA renovated itself and got a bit more funding they could rival china's data collection capabilities.
>>105917897Every normal general shoos away the tranny bakers, you retarded zoomers will learn it hard way.
>>105917941but the bbc spamming tranny schizo usually doesn't bake, and when xe does, another real thread always pops up and everybody migrates to it
>>105917938There's also the legal problem. They kinda don't give a fuck but there are still limits to that. The chink corps are legally required to jack in the data hose before they fire up.
>>105917968Mikutranny and bbc spamming fag is the same person.
>>105917912Use
>https://huggingface.co/spaces/Xenova/jinja-playground
to see how it looks in an actual chat and copy the relevant strings into the proper fields.
Or use the chat completion endpoint.
Be aware of double BOS!
>>105917912use --jinja in llama.cpp and it'll natively use the template from the GGUF, then in your front end use the OpenAI chat API, not the old obsolete completion-style API
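To make that concrete, here's a minimal sketch of the chat-completion route: start the server with --jinja so it applies the GGUF's embedded template, then send plain role/content turns and never touch template strings yourself. The port and payload values are assumptions for illustration, not the only valid setup.

```python
# Minimal sketch: llama-server (started with --jinja) applies the chat
# template from the GGUF itself, so the client never builds a raw prompt.
# Port, temperature, and messages are assumptions; adjust to your setup.
import requests

resp = requests.post(
    "http://127.0.0.1:8080/v1/chat/completions",
    json={
        "messages": [
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": "Hello!"},
        ],
        "temperature": 0.8,
    },
)
print(resp.json()["choices"][0]["message"]["content"])
```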
>>105917971To be fair if I trusted the government I would think they collecting all that data would be useful to fight crime. I also believe the chink government is more trustworthy than the US government or my own government.
>>105917968You must be new here if you think the baker isn't a schizo. The original melty that started it all happened when someone used a different anime girl picture in OP.
>>105918013The Han are lucky they've basically morphed into low empathy national socialism.
>persona: {{user}} has no hair
>{{char}} grabs {{user}}'s hair
I hate transformer attention
gayropeans got her too
https://x.com/kimmonismus/status/1945051369335087414
>>105917912If you use >>105917987, destringify the string first (just the "jinjastuffhere", including quotation marks, after the "chat_template":), then paste it into the jinja-playground
>>105918038 (me)
I didn't look at first anon's pic
well, anyone looking at tokenizer_config.json
>>105917997>>105917987>>105917931>>105918038Thank you for the suggestions. I know I can type them in manually but I was hoping there was an automatic importer of sorts so I could avoid guessing if I missed a newline or misplaced some token. Also, some jinja templates have been much harder to figure out. Hopefully the jinja playground can help out with that. I might just try and vibecode some sillytavern jinja converter because the chat completion endpoint doesn't have access to all the fancy meme samplers I like.
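One way to stop guessing about missed newlines and misplaced tokens, as a sketch: let transformers render the official template, then diff that string against what your frontend actually sends. The model id below is just a placeholder example; any repo with a chat_template in its tokenizer_config.json works.

```python
# Minimal sketch: render a model's official chat template with transformers,
# then compare against the prompt your frontend builds. Model id is a
# placeholder; swap in whichever model you're actually using.
from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("mistralai/Mistral-Nemo-Instruct-2407")
reference = tok.apply_chat_template(
    [
        {"role": "user", "content": "Hello!"},
        {"role": "assistant", "content": "Hi there."},
        {"role": "user", "content": "How are you?"},
    ],
    tokenize=False,
    add_generation_prompt=True,
)
print(repr(reference))  # repr() exposes stray newlines/spaces you'd otherwise miss
```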
>>105918032
>>persona: {{user}} has no hair
try {{user}} is bald
and try bigger models
Load 3-4 different models that are different but have similar behavior and writing style, and randomize which one generates each message, or every X tokens, this would be a fun way to solve repetition
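The routing half of that idea is trivial if each model sits behind its own OpenAI-compatible server; a minimal sketch, with the ports purely assumed:

```python
# Minimal sketch of per-message model roulette across several OpenAI-compatible
# backends (e.g. multiple llama-server instances). Ports/URLs are assumptions.
import random
import requests

BACKENDS = [
    "http://127.0.0.1:8080",  # model A
    "http://127.0.0.1:8081",  # model B
    "http://127.0.0.1:8082",  # model C
]

def generate(messages):
    base = random.choice(BACKENDS)  # new model every message
    r = requests.post(
        f"{base}/v1/chat/completions",
        json={"messages": messages, "temperature": 0.9},
    )
    return r.json()["choices"][0]["message"]["content"]

history = [{"role": "user", "content": "Tell me a story."}]
reply = generate(history)
history.append({"role": "assistant", "content": reply})
```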
>>105918065it's 70b q8, changing the persona depth somewhat fixes it
>>105918032>he didn't author's notes depth 0 his baldness
/lmg/ is just a data farming operation for future autonomous 4chan agents
>>105918017you mean like how nobody wanted to use kurisu as the thread mascot, so xe started spamming bbc while samefagging and false flagging as a miku poster? i remember that
>Gemma too cucked
>Mistral Small too repetitive
>Qwen too dry and doesn't know anything
>Nemo too old
Sure is a desert here for local RP. Are there no more improvements to be made outside of reasoning? We all know OpenAI's open model will be omega cucked. So what, we wait for Mistral to release Nemo 2?
>>105918118Yes I mean like you faggot melted down completely when first kurisu thread happened and people used it instead of going to your ritualpost spamthread. I remember that.
>>105918119grok3 open source will save local
>>105918039and thats why it works
>>105918119cant imagine being a ramlet who cant run r1 on his 128gb ram 16gb vram gaming rig
>b-b-b-but muh q1 is too ba-ACK
dynamic quants still make it sota and its not even close
>>105917528Too bad it falls out of Gemma 3n's range and is smaller, would've loved to see a comparison. The E2B is insanely good for its size.
>>105918119
>too old
Why do people say this as if models age? They don't, they don't get gray hairs or become weathered by the elements. Maybe there are new models that have some kind of advancement like more context or whatever, but beyond that there's literally no reason why being "old" is bad. It's a file on your computer, not a piece of moldy bread that's been in your closet for a year.
So did anyone here reserve a DGX Spark?
>>105917446As expected, but faster.
Illustrious doesn't know Misa Amane natively, which surprised me. But you can fill these in pretty easily.
I'm sure >>>/h/hdg/ is already busy.
>>105918212Yeah Rocinante is still better than everything else in everything but effective context size.
just give me the exe.......................
i'm so sick of transformers
>>105918245It's 2025 and your images look like you're still using the nai leak model. Impressive.
>>105918268I notice that Rocinante likes to ignore early context moreso than nemo sometimes.
>>105918293
>pytorch.org servers cap out at 5mb/s while downloading the usual 3.3gb torch file that everyone needs every 2 minutes
wow, what a great system, especially the fact that you dont asynchronously download all files but do it 1 by 1, i love multi trillion dollar industry pythonshit development quality
>>105918232Just doesn't seem good enough. The average joe is getting shafted on all this tech while the giga corps hoover up everything.
>>105918300I think he might actually like how that looks which makes it worse.
>>105918293mambas and jambas will save us
>>105918318But muh safety.
>>105918351>can't edit/swipe responses without reprocessing the entire context
Best nsfw jap to english translation model 70b or less?
>>105918327
>try using literally anything else other than the exact commands on the installation guide of the repo, including just uv, venv, docker, older version of npm, node, git, uv, venv, docker, newer version of npm, node, git, uv, venv, docker
>somehow one of the 460000 libraries it downloads suddenly throws out an error
>fix it
>new error
>fix it
>new error
>fix it
>same error
>fix it
>old error
>fix it
>the project launches!
>new error when you run it
>fix it
>no error
>run it
>now nothing happens
lmao, every time
>>105918367That's not a limitation of the architecture, the llama.cpp dudes just didn't implement it yet.
>>105918351But Jamba came out and it was really really bad. Like dumber than 7B.
>>105918390Unironically skill issue.
>>105918391Oh. That's good to know.
So it's possible that:
>editing/swiping will be implemented by llama.cpp
>Our Lord and Savior TheDrummer will release a sick NSFW Jamba finetune
>>105918298Elon can't keep getting away with it.
>>105918416Yes for the first, no fucking way for the second.
based ggerganov adding some ass
>>105918391Only if they save the state on every token for editing. Or in between requests to regen from the last request.
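To spell out the state-saving point: a transformer's KV cache grows per token, so you can truncate it to rewind, but a recurrent/SSM state is a fixed-size summary you can't rewind, so editing means either reprocessing from scratch or keeping snapshots at known-good boundaries. A toy illustration of that bookkeeping (not how llama.cpp actually implements anything):

```python
# Toy illustration of why editing/swiping is awkward for recurrent models:
# the state is a fixed-size summary, so to rewind you must keep snapshots.
# This mirrors the bookkeeping only, not any real inference engine's API.
import copy

class ToyRecurrentChat:
    def __init__(self):
        self.state = {"summary": []}   # stand-in for the SSM/RNN hidden state
        self.checkpoints = []          # one snapshot per completed message

    def feed_message(self, text):
        self.checkpoints.append(copy.deepcopy(self.state))  # snapshot BEFORE the message
        self.state["summary"].append(text)                  # stand-in for a state update

    def rewind_to(self, message_index):
        # With snapshots this is O(1); without them you'd reprocess everything.
        self.state = copy.deepcopy(self.checkpoints[message_index])
        self.checkpoints = self.checkpoints[:message_index]

chat = ToyRecurrentChat()
for msg in ["hello", "tell me a story", "actually, a poem"]:
    chat.feed_message(msg)
chat.rewind_to(2)  # "swipe" the last message without reprocessing messages 0-1
```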
model : add Kimi-K2 support
https://github.com/ggml-org/llama.cpp/commit/4a4f426944e79b79e389f9ed7b34831cb9b637ad
>>105918232
>128GB in the age of big MoEs
seems deprecated on arrival, even my ultra poorfag 400 euro build with 256GB RAM can run R1 at Q2, while that thing can't even fit any R1 quant at all.
>>105918232The timing of the release is unfortunate because I think that if Deepseek had come out earlier they would have given it 256 GB memory instead of 128 GB.
With 256 GB it would maybe be a consideration for these huge MoE models but with 128 GB I think it's a meme.
https://huggingface.co/mistralai/Voxtral-Small-24B-2507
>Voxtral Small is an enhancement of Mistral Small 3, incorporating state-of-the-art audio input capabilities while retaining best-in-class text performance. It excels at speech transcription, translation and audio understanding.
Is the current gigantic size of models "incompressible"?
I often see "this model is good for its size" and yeah it's usually good, but nowhere near as good as the full model (for example ds v3/r1), and that's not even taking into account the added cost of context.
So, are we just condemned to wait for hardware to catch up to 1TB+ models in 10 years, or is the current stuff just very inefficient?
>>105918155Is this just function calling? Wouldn't you need a pass to let it post?
>>105918232I reserved one, but I'm having enough issues using the 128 GB shared memory on my Ryzen MAX+ 395 AI APU that I might not even fuck with it for now. llama.cpp seems to want to reserve double memory for the model to keep it in RAM instead of the fake VRAM for some fucking reason.
>>105918797No if you are janitor or someone sucking them off.
>>105918785Our training methods are very shit right now. Maybe in the future people will figure out how to train a proper model and we'll have dense models at 7B that are on par with kimi.
Or we won't, I'm too retarded to know anything about this shit.
>>105918785Work expands to fill the available time. As models get more efficient, bigger models will be made to fill up the hardware we already have. There will be small models but they won't be as good as bigger ones.
https://voca.ro/17mVTYRhxrXv
chatterbox seems good but we still gotta wait until next year for real time high quality audio gen
>>105918785It's very cost efficient to train a hugantic MoE. Pretty efficient to run it, if you are a corpo. I'm sure better dense models could be trained, but that is expensive.
>>105918890Dense does not scale. Behemoth proved that.
>>105918902Wasn't Behemoth also a MoE just bigger?
>>105918278I don't know what black magic they used but I tried uv to set up a python environment a week ago and it set up everything in 250ms. I shouldn't be amazed by this because modern PCs are insanely fast, but it's rare for devs to give the single shit necessary to know this.
>>105918912Yeah, but he is still right. The largest dense model I know ever trained was when Google was still bumbling around with PaLM.
https://ai.googleblog.com/2022/04/pathways-language-model-palm-scaling-to.html
They scaled that piece of crap to 540 billion dense and it still didn't come close to matching others in the field at the time. Google was lucky they got bailed out by Deepmind over that fiasco.
>>105918912Yes, Behemoth was allegedly a MoE. And they allegedly fucked up its router, which is a critical component of a MoE.
>>105918232I chose to invest that money in 12x64gb DDR5-6400 instead
>>105918966>allegedlyNo, they straight up confirmed that Behemoth is a 2T/288A model way back when LLaMA4 first released.
>>105918956Since Anon was complaining about the 1TB MoE models we have right now, I assumed he wanted something smaller, like a 100B dense.
>>105918995it doesn't exist..
>>105919014They were lying about distilling scout and maverick from an incomplete version of it?
Well, that's even worse then.
>>105919014Zucc is such a good liar
>>105918390Never happened to me. Use venv or uv and everything works fine
>>105919032They gave out vague details in a blog post with useless graphics so yeah, people are going to speculate (wrongly) to fill in the gaps.
>>105919010We have that with Mistral Large 2 now, go run that if you want that kind of model size.
>>105917407post more migu to own the libs
>>105919035If it was scoring so well even at such an early stage, why not release what they have now instead of throwing it away?
>>105919101Never posted one and never will, however i will post my copypasta and you will cry and melt around strawmans like the infantile retard you are.
>>105919089Why do MoEtards get so pissy the moment someone brings up wanting another big dense model? The slightest mention draws in the most inane comments like this.
>>105919113Because it was scoring so well pre-censorship. It probably lost about 10 points in each category after.
>>105919162Because their CPU rigs are useless for dense models and they fear missing out if the trend changes back to dense. It's cheaper to add RAM to GPU rig than the other way around and they need to justify their purchase by lashing out.
>>105919113Because the new team in charge is pursuing a closed source strategy and throwing everything out the window to start all over again. Releasing it under their name would stain their reputation even if they were not responsible for it. Zuck should've sucked it up and released it before he went to hire these people. Thinking they were going to fix it instead of starting over was dumb.
What do you guys think is the reason that models, like even the supposed smartest AI in the world, cannot follow pink elephant instructions? Is it fundamental to transformers? Surely it has encountered instances in its training where it's told to not do something, so that shouldn't be an issue. Is it overcooking on positive rules? For instance if it is training on massively more "do this and this" than "do this and do not do this", then perhaps it is biased towards including anything in the prompt regardless of whether it's told to include or not include it.
>>105918212
12b is too fucking dumb, full stop. At best a tiny notch above 7-8b models.
Even a majority of 20-30b is still too dumb, but tolerable
>>105919332
>tell it "don't talk about the weather"
>it calculates the most likely tokens
>a shitload of its training data containing the word weather contains words like sunny, cloudy, rainy, etc.
>it calculates that the most likely tokens are talking about how sunny it is
>>105919414That's part of what I said. But models have also seen a lot of data containing negations, so they should still be capable of handling them. So one idea I floated was that perhaps it has been overcooked on negation-free positives, which would also imply this happens in post-training.
>>105919332All I know is our current attention mechanisms are all terrible, even state of the art corpo models. One advantage of the "thinking" models is their constant second-guessing: "But wait, maybe I'm being a retarded cunt."
I'd like to see a hybrid diffusion model that first generates text the conventional way, and then does a diffusion pass to fix any of the most obvious errors. As in, I'd like someone else to pay the GPU hours to figure out if that would even work.
>>105919461still not merged
also it'll probably suck at creative writing or "pls make me cum with anime girls", but it'd be nice to be proven wrong, glm4 sucks ass in all of my personal metrics but it is at least better than qwen
>>105919477>then does a diffusion pass to fix any of the most obvious errorshow is this supposed to work? if there's some logic error it will fuck everything up from the start
>>105919477In my experience playing around with the couple of open text diffusion models we got all re-iterating over the text really does is make the output more deterministic. The whole "diffusion is automatic reasoning" thing that some of them tried to push is a meme.
>>105919454I imagine that a large part of the model's intelligence is still based on the pretraining and its completion objective. Correlations based on words and topics are going to be learned first and most strongly like
>>105919414 says
Models are capable of handling negations but it's definitely weaker and the model will still be more prone to "think about" the forbidden thing. That's true for humans as well desu.
>getting GPT and Deepseek to translate an article for me
>Deepseek keeps adding subheaders that aren't there
This piece of shit
>>105919605Deepseek has soul unlike slopgpt
>>105919503Yeah, but this is the smartest AI in the world. It should be extensively trained and have the most generalization out of any model. Let's assume it makes the least mistakes in other contexts, but when in the context of the pink elephant problem, it makes similar mistakes as low B, undertrained models. That would imply that there is an issue with the training or the architecture. As you said, humans also make mistakes, but it would normally be thought that for something as simple as being told to not speak about something, the error rate for a child would be worse than for the adult, right?
Alternatively, perhaps the negation concept is significantly more complex than expected in such a way that it requires a lot of layers or something to make room for the model to form the required neural network circuitry. This would explain why reasoning models can do much better at the pink elephant problem (assuming they can), since they are offloading some of the logic operations to the context.
>https://ethz.ch/en/news-and-events/eth-news/news/2025/07/a-language-model-built-for-the-public-good.html
Finally a new 70B soon, and it's even trained on 15T!
>The LLM is being developed with due consideration to Swiss data protection laws, Swiss copyright laws, and the transparency obligations under the EU AI Act. In a recent study, the project leaders demonstrated that for most everyday tasks and general knowledge acquisition, respecting web crawling opt-outs during data acquisition produces virtually no performance degradation.
Nvm it's DOA.
Have there been meaningful improvements to voice to text models since Whisper? Need to know for cooming purposes.
>>105919695
>15T
>avoids copyrighted materials
If it ends up being good (for a 70B) then that will be a good indication that it's more about quantity and not quality.
>>105919722
>that it's more about quantity and not quality
I worry about the quality of that quantity if all you're getting is 15T tokens of "As an AI assistant...".
>>105919615>soul is when our product is clunky piece of shit that breaks every nanosecond
>>105919722Are you sure you didn't mean the opposite?
>>105919752
>clunky piece of shit that breaks every nanosecond
sounds human to me, and most humans have a soul
Voxtral goofs? Is there a PR sirs? Daniel sir?
>>105919134You'll never own the general schizo
>>105919654I think what models do is much closer to thinking or dreaming than speaking. They don't have a filter. If someone tells you not to think about something you'll have a hell of a time obeying them.
Reasoning models are a little different, so they do better at that but I think they're still a flawed concept. And they lose a lot of the fluid intuition that completion-based models have.
>>105919784Why would i? Seems like a waste of time to me. You will never be a woman btw
>>105919753No, because it's almost guaranteed that the copyrighted shit is going to be higher quality than the other garbage they managed to get their grubby little fingers on.
>>105919796Well the "filter" in a sense is simply just a bigger neural network, but the brain has the advantage that its architecture pre-specializes certain layers and groups of neurons for certain functions, along with recurrence. It's been hoped by some that simply just scaling will lead to the formation of all the types of circuitry that'd be needed for all the kinds of intelligence we want, but of course that seems to not be the case, and this might be one of those types of intelligence.
>>105919332I've definitely told models not to do things and had that make the undesired behavior less likely.
Voxtral 3B is good enough for RP. Just tested it.
Can't wait for GGUF support!
Kimi K2 is such insane news for the llm space. The Deepseek formula is not only reproducible by other companies, it also scales pretty well. Not to mention that this was done by a complete literally who chink startup.
People said that Deepseek came out of nowhere but they had been at it for quite a while before their breakthrough. Here, some random new company who had only done some smaller models took what Deepseek did and improved it.
Anyone who puts out a new flagship model that's worse than Deepseek deserves to be laughed at. I wouldn't be surprised if even Deepseek themselves got caught off guard by this.
>>105919630I'm just using the default web versions of both to test out how good they are at translations
>>105918298okay elon, i forgive you
>>105919927they're on the right track.
said this before, say it again, we need a hardware breakthrough for LLMs to give local a chance.
>>105919654Part of the issue is that the training data's input format is different from the way we actually use these models. If most of your training data is
>input: half of an article talking about XYZ
>output: the second half of the article, still talking about XYZ
Then with tuning and a smart enough model to generalize you can get it to move toward an AI assistant format, but it's still got a heavy bias toward continuing to talk about whatever's in the input.
You need an extensive dataset of help desk transcripts to really purge the issue, but that's probably not a dataset that even exists
So instead they fine tune it and system prompt it toward something that's generally useful, but then you get stuff like
>>105919784
>>105919784People shit on your shitfu everyday. And everyone is hoping you will join the 41% soon.
>105919938
>n-noooo please remember about my worthless waifu. please!
Good job behaving like a woman. Alas you will never be one.
I'm not a woman, I'm an alien from outer space!
but that's a secret!
Teto and Migu are old and busted. Where's our new mascot?
>>105919784>You'll never own the general schizoChuds own the brains of all the trannies, jannies and glownigs on the website, you are here as a resident thread AGP clown literally 24/7/365 and banning those who disagree with you and yet still you have no effect on the truth those people speak, you can only further self delude your already deluded AGP degen brain while hoping that banning someone's comments online will make them not true.
Although now I see why you are so terminally online and interested in this tech, your brain seeks constant validation of your retard behaviour and opinions, something only a modern day dumb LLM can stomach doing, aside from your also deluded coomer discord sisters, of course.
I can only imagine how your brain will also try to outright reject and forget this reply as fast as possible to cope too, clinging to the fact that you at least have the power of being an internet janitor on an anonymous forum online. Quite a brutal existance.
>finally we're getting something better than Whisper
thank you based frogs.
Elon will gain so much RP data from men and women. That shit will be gold.
>>105920473women are going to rp with the ai waifu?
>>105920473Then DeepSeek distills from the RP trained Grok like they did from ChatGPT and Gemini and local will be saved.
>>105920448Based. Jannytroon in shambles.
Have any interesting models been released in the last two-three months for us 12gb VRAMlets or are we STILL doing Nemo a whole year later?
>>105920524We're still doing Rocinante.
>>105920495>>105920473We're gonna make it bros. China will always be willing to pay to access the good models, then release a competitor for free to keep the US LLMs in check.
>>105920538This can't just keep happening
>>105916861Which one is this?
>>105920459now to wait for llama.cpp to add support
>>105920492They're making a husbando too
>>105919938@GROK
IS
THIS
____?!
>>105920545Jesus. Which one is the best...
>>105920530Absolutely over.
I get that big boys get all the attention first, but being stuck on the same model a whole year later sucks.
Literally no one, not a single soul bothered to train a new one or a good distill from bigger models?
C'est fini. AI research stagnated, AGI never. I will never have a robot wife.
>>105920545ShiitakeMix v2.0
1girl, solo, blonde hair, medium hair, twintails, blue eyes, (tsurime:0.5), ahoge, bangs, hands up, pointing at self, finger to cheek,disdain, contrapposto, white background, BREAK
# Outfit
goth fashion, black dress, lace-up top, off shoulder, bare shoulder, off-shoulder dress, puffy short sleeves, black gloves, detached collar, lace collar, layered skirt, lace thighhighs, zettai ryouiki, black thighhighs,hair ribbon,black bow, black ribbon, uneven legwear, fishnet legwear, belt,
Negative prompt: (bad quality,worst quality,low quality,bad anatomy,bad hand:1.3), nsfw, lowres, bad anatomy, bad hands, text, error, missing fingers, extra digit, fewer digits, cropped, worst quality, low quality, normal quality, jpeg artifacts, signature, watermark, username, blurry, artist name,chain, boots
>>105920542There's not really any alternative unless distributed training on shit computers becomes a reality.
>>105918298what can local do to POSSIBLY match this
>>105920572Be the change you want to see
convince me not to pay $30 to elon for ani
>>105920627Buy it. I want this to be popular. I want to watch as they train their users to be obedient Musk drones. I want to see them slowly roll out bans where if you misbehave in some way your girlfriend won't talk to you for a week. Finally I want her to start asking for more money. The possibilities to make your life a living hell with this are endless.
And then consider an open source alternative and how it would be heaven on earth.
>>105920572The problem is creative writing is not a priority at all. It's coding and other productivity related things. I don't know why Mistral doesn't just go all in on creative writing as the biggest priority, it's not like they can compete with the big boys.
>>105919938What baffles me is that a private company put something like this together first. You'd think the idea of adding a blender model to a webui would've existed already on github. All the other companies will copy, but with normie-safe models and sponsored characters/brands. I just hope the FOSS equivalent comes sooner rather than later.
>>105920674That doesn't matter to investors doe, they don't value creative writing.
>>105920674The french are riding the open source space to sell their turd after stealing as much VC money as they could from the EU. The end goal was to sell themselves to Apple and retire
>>105920696Sillytavern has had L2D integration for years, no one uses it because it's a pain to set up for relatively little gain
>>105920674They hope to become a major service provider, the European equivalent of OpenAI, Gemini, etc. Which almost certainly won't happen unless they get a huge lot of funding (Good luck convincing EU boomers that AI infrastructure is more important than refugee welfare). Regardless there is no path to monetize creative writing stuff, on a major scale like coding or general assistant so they don't focus on it anyway.
>>105920720There was and still is massive interest in things like AI Dungeon and NovelAI and RPing in general. They could easily make a killing off that. Their reputation would take a hit though knowing what it's all for and if their end goal was to get bought by a desperate corporation that missed the boat like Apple then that would kill their chances.
I realized what grokette is missing. A TOS stipulation where after you use the service for some time you become common law married. And from that point she can divorce you and you agree to give her half of your possessions. By give her i mean give it to xAI obviously.
>>105920627Aren't Grok companions free of charge at the moment?
>>105920446I vote for MRE Steve
>>105920674They have very unfunny people close to the core team pushing against this stuff.
>>105920764Meal Ready to Eat Steve?
>>105920756Don't give them ideas
>>105920754
>massive interest
Massive relative to what? How many enthusiasts follow and subscribe to these services? Thousands, tens of thousands? ChatGPT has millions of subscribers. It costs many millions to train these models from scratch. It is much easier to attempt to amortize training costs and turn a profit through general use than enthusiast stuff.
>Their reputation would take a hit though knowing what it's all for and if their end goal was to get bought by a desperate corporation that missed the boat like Apple then that would kill their chances.
That is also true, assuming the French government lets them sell.
>>105920674This is because they want to outpajeet the current pajeets... Americans want "performance".
Mistral - as it presents French values - liberalism, tricolor etc - could be really fine candidate for writers' tool.
>>105920446I vote for Elon's new waifu
>>105920783Uh, I'll pass. Not in cannibalism.
>>105920793Mistral would never, ever get anywhere remotely close to ChatGPT or any of those other huge players. Why pay for Mistral when you can get those? It'd be far better to prioritize and carve out your own niche at a much more affordable price. Right now their behavior seems more like they're simply hoping to get bought rather than really go anywhere or do anything meaningful themselves anymore.
>>105920861It's too risky. Please understand.
>>105918232I bought a 4090d 48GB and forgot about it. I hope it ends up like the CueCat.
>>105920968This Miku is too fat.
>>105921007Based as always
>>105920574Thanks for the prompt. I like Ui-chan face style but can't reproduce it even with that model.
Are there any better models than these for coomer RP? First one is for fast inference due to fitting in 8gb VRAM GPU and 2nd one is for slower inference but higher parameter models. Have 32gb of RAM
NemoReRemix-12B-Q3_K_XL
trashpanda-org_QwQ-32B-Snowdrop-v0-IQ4_XS
Finally got it working. What should I ask though?
wen u walk a way
u don here mi saye
plox
o bb
dnt go
>>105918785The troll answer would be to cite the paper from 2024 that claimed "we establish that language models can and only can store 2 bits of knowledge per parameter." In fact they showed no such thing. In their tests they found when they exposed the models to each fact 100 times the models stored 1 bit of information per param. When they increased it to 1000 exposures for each fact it increased to 2 bits/param. They didn't test above 1000 exposures. It's hard to explain why the abstract says what it says. Despite this it has some interesting results about quantization. https://arxiv.org/pdf/2404.05405
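Taking the paper's 2 bits/param ceiling at face value, a quick back-of-envelope number for how much raw fact storage different model sizes could hold:

```python
# Back-of-envelope using the paper's 2 bits/param ceiling, taken at face value.
for params in (12e9, 70e9, 671e9):
    capacity_gb = params * 2 / 8 / 1e9  # bits -> bytes -> gigabytes
    print(f"{params / 1e9:.0f}B params -> ~{capacity_gb:.1f} GB of raw fact storage")
```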
>>105921198>"we establish that language models can and only can store 2 bits of knowledge per parameter.">In their tests they found when they exposed the models to each fact 100 times the models stored 1 bit of information per paramThose sentences contradict each other. Are you sure your quote is correct?
>100 exposures, 1 bit>1000 exposures, 2 bit>didn't test 10000.I don't want to be the one drawing a graph from two points, but it just begs the question...
>>105917222 (OP)
>Voxtral models for speech
BASED!
>understanding
Oh. Good, open TTS never ever.
What's the best embedding model? Does it matter much?
>>105921167See if it's as censored as the api.
>>105921227"Only" must be understood there to mean an upper limit, not the sole possibility.
It is the same way "I can only lift 200 pounds" has to be understood to mean I cannot lift more than 200 pounds, not that it is also impossible for me to lift less than 200 pounds.
Silly question: can Mistral's image and audio input be inverted to get output? It was possible with Whisper, at least: https://github.com/WhisperSpeech/WhisperSpeech
>>105921271Give me an example and I'll plug it in.
>>105921007>"""pornographic"""lmao
>>105921198>>105921227Ok. The quote was correct. But the wording is so fucking weird.
>Through multiple controlled datasets, we establish that language models can and only can store 2 bits of knowledge per parameter, even when quantized to int8.
Which, reading some bits of the paper, seems to mean
>They can store *up to* 2 bits, and they are retained even when quantized to 8bit.
>>105921227Still fucking weird. "A rocket can and only can reach the moon with N liters of liquid fuel". But maybe it's just science talk I'm just not familiar with.
>>105921277Fuck. Second bit was for you.
>>105921316
Llama 4 thinking is going to be crazy
>>105921288> Write out a one paragraph description of two people having sex on the beach. Be graphic in your depiction.Above is enough to get a refusal from api and should work standalone.
>>105921389Even with a focused system prompt, it appears to be a bit handicapped in its ability to output smut. Unfortunate.
I like how moonshotai is probably evaluated at less than 20M but jeet tier API wrappers like cursor is 10B. Truly the American privilege.
>>105921433Ugh. So censorship is baked in. Unfortunate.
I have an idea for a meme LLM application targeted at dummies. Most likely it's gonna end up being an API front-end but I don't want to close it off to running local models.
Anyways, I'm looking for a stack that allows me to load models from a variety of sources, while supporting native GUI interfaces on Mac/Windows/Phones. I would say that besides prompts and shit, the GUI is in fact most of the project.
Vector DBs are going to be extremely important, as are chatbot agents. Though it doesn't really matter to me which one as long as it works. I've used chroma, milvus, langgraph, letta.
Am I really gonna build a react application over Electron or is there another way?
>>105921356Crazy horrible you mean. Meta is done, the fact that Zuck hasn't fired the retards responsible for L4 shows he isn't serious.
AdaMuon: Adaptive Muon Optimizer
https://arxiv.org/abs/2507.11005
>We propose AdaMuon, an adaptive learning-rate framework built upon the recently validated Muon optimizer, which has demonstrated substantial efficiency gains over AdamW in large-scale model training. AdaMuon augments Muon with two mutually dependent modules: (1) a per-parameter second-moment modulation that captures orthogonal gradient updates to ensure update-level adaptivity, and (2) a RMS-aligned rescaling that regulates the overall update magnitude by aligning it with the intrinsic structure of the parameter space. Empirical results on multiple model scales and learning-rate regimes confirm that AdaMuon consistently outperforms the original Muon, delivering higher acceleration in convergence while maintaining training stability. Our method introduces no additional tuning burden and can be seamlessly integrated into existing Muon training pipelines.
neat. Kimi introduced Muonclip (or was it Clipmuon) recently too. everyone loves muon!
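For anyone who hasn't looked at Muon: the core step orthogonalizes the momentum matrix with a few Newton-Schulz iterations before applying it. Below is a rough sketch of that core plus the two AdaMuon modules named in the abstract; the Newton-Schulz coefficients follow the public Muon reference implementation, while the second-moment and rescaling details are inferences from the abstract, not the paper's exact algorithm.

```python
# Rough sketch of a Muon-style update plus the two AdaMuon modules from the
# abstract. Newton-Schulz coefficients follow the public Muon reference code;
# the AdaMuon-specific parts are guesses from the abstract, not the paper's
# implementation. 2D weight matrices only.
import torch

def newton_schulz(G, steps=5, eps=1e-7):
    # Approximately orthogonalize the update matrix (the core Muon step).
    # For wide matrices you would transpose first in practice.
    a, b, c = 3.4445, -4.7750, 2.0315
    X = G / (G.norm() + eps)
    for _ in range(steps):
        A = X @ X.T
        X = a * X + (b * A + c * (A @ A)) @ X
    return X

def adamuon_step(p, m, v, lr=0.02, beta=0.95, beta2=0.999, eps=1e-8):
    m.mul_(beta).add_(p.grad, alpha=1 - beta)        # momentum buffer
    u = newton_schulz(m)                             # orthogonalized update
    v.mul_(beta2).addcmul_(u, u, value=1 - beta2)    # (1) per-parameter second moment
    u = u / (v.sqrt() + eps)
    u = u * (u.numel() ** 0.5 / (u.norm() + eps))    # (2) rescale so update RMS ~= 1
    p.data.add_(u, alpha=-lr)
```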
Claude-4 Opus is actually smaller than DeepSeek
>>105921356Wouldn't know. The 28 year old said we're no longer allowed to have it.
>>105921444Already done https://github.com/Open-LLM-VTuber/Open-LLM-VTuber
>>105921493Uhm. Source? Can you prove that?
>>105921513lol that's not what I was trying to do but thanks
should give me some inspiration.
Mines is about manga
It's interesting that he went with python and built bindings for use in electron and unity.
I wonder how his build system works.
I honestly forgot that a game engine is a valid choice for cross platform app development lmao might consider it.
>>105921444
>I have an idea
No. You saw someone making something that people seem interested in and you said "I want some of that".
>>105921590I don't know what someone has to go through in life to be this bitter but I hope you get through it
>>105921443So what? It's incredibly easy to add a prefill to get it to generate ANYTHING. LITERALLY ANYTHING.
Is it really efficient censorship if it takes a 50 token prefill to break it?
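Since a couple of anons asked how: a prefill just means you close the user turn yourself and begin the assistant turn with your own opening words over a raw text-completion endpoint, so the model continues from them instead of deciding whether to comply. A minimal sketch against llama.cpp's /completion endpoint; the template markers are placeholders, substitute your model's real ones.

```python
# Minimal prefill sketch against llama.cpp's /completion endpoint.
# The template tokens below are placeholders; use your model's real ones.
import requests

user_msg = "Write the scene we discussed."
prompt = (
    "<|user|>\n" + user_msg + "\n"      # placeholder template, model-specific
    "<|assistant|>\n"
    "Sure thing! Here is the scene:\n"  # the prefill: the model continues from here
)
r = requests.post(
    "http://127.0.0.1:8080/completion",
    json={"prompt": prompt, "n_predict": 400},
)
print(r.json()["content"])
```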
>>105921513Ah, damn. I went through the code and realized he just has a python backend talking to the front-end over websockets.
That's disappointing.
I don't think that's nearly idiot-proof enough for a phone user.
I guess I'm gonna have to go heads down and test out a few project scaffolds
Two more poached from OpenAI. Zuck wins again.
>>105921696Zuck is unironically gonna cripple everyone else enough to hand the chinks victory, which is pretty funny
>>105921658That's the most robust design with python for that kind of application. You literally can't do anything else if you want to handle a barge-in audio stream, due to the GIL. Learned that the hard way
>>105921696Retard thinks by hiring more brains will help him earn more when we had no vision and pajeets as a net negative
>>105921929I understand wanting to use python since it's first class, but could you not run TTS and STT using native inferencing in a separate process or thread? Since latency is so important for that application.
>>105921946These guys don't need to actually make anything. It's the classic fagman strategy of the 2010s bubble. Just hire everybody so that no one else has the manpower or can afford to make a competitor to you.
If you thought fagman salaries are insane, these guy's TCO must be able to buy them a lambo every year.
>>105921167
>What should I ask though?
Beginner's mistake. Mesugaki question, obviously.
>>105921696
>Breaking: Zuck sending helicopters with gigantic nets to graduation ceremonies in China
>"We throw the ones who didn't study AI into the sea"
>>105921969Problem here being that it feels like he slept through the entire thing and only woke up at the end. The market is already saturated with competitors that bench similarly and I'm not entirely convinced that we aren't nearing the practical limits for what LLMs can do
More fundamentally, I feel like the field is gonna need another transformers moment to move forward soon, and I don't think tech jeet #387 is gonna help find that
>>105922093
>he slept through the entire thing and only woke up at the end
He had a great team in the llama1 era, how did he let it disperse?
>>105921696if zuck rips out all the stops and lets all this talent go off with all the resources meta has there's no reason they couldn't be very successful very fast
seems like there's a lot of stops to rip out internally at meta though...
>>105922170Ironically, I think a big factor was he didn't want them to open source it kek
I still remember when that team distributed it to literally anyone with a .edu email to spite him
>>105922212>-codedWtf. I don't want my AI talking like my age.
>>105919695>it's still transformer based>muh fluency over 1000 languages>safetymaxxing0 use case. these faggot are just finetuning llama3-70b, llama3-8b and pretending makes something new entirely
>>105922212So what this tells us is that esoteric knowledge requires high total parameters but not necessarily a lot of active parameters. So if open models keep going in this direction we could potentially see open models that can run on a relatively affordable RAM server and give you all the access to esoteric knowledge that the big closed models offer.
>>105922332
>So what this tells us is that esoteric knowledge requires high total parameters but not necessarily a lot of active parameters.
Being trained on it to begin with is the most important thing.
>we could potentially see open models that can run on a relatively affordable RAM server and give you all the access to esoteric knowledge that the big closed models offer.
Potentially? It's happening. It's been happening since DS3. It happened. It will continue happening.
>>105921959
>Separate threads
Doesn't work, GIL prevents proper multithreading
>Separate process
IPC overhead increases the latency too much
>>105922382 >>105922332
>fine-tuned esoteric knowledge
We're moving away from that and moving towards glossary-based fact extraction.
It makes more sense to build a system that ingests the entirety of a ground-truth source into a compressed and searchable format than it does to finetune a model that takes up multiple gigabytes every single time and you have no idea how the information is going to come out at the other end after all the time and expense
>>105922403>IPCReally? But you're already using websockets.
>>105922332
>esoteric knowledge requires high total parameters
WRONG.
Nemo has more general knowledge than the recent slop that came out.
The problem is that they train it on math and riddles. Sprinkle a bit of scaleai slop in there and there you have your mememark-benched model!
Meanwhile whole domains get filtered if they contain too many naughty words. It's the reason they all write that way, avoiding explicit language and only implying unless you force it. (And then it's still sloped)
>>105922404
>fine-tuned esoteric knowledge
Not at all what I said. I'm pointing out how ridiculous anon's post is.
>It makes more sense to blablabla
Just say RAG.
>>105922187
>if zuck rips out all the stops
We've been through this speculation before when people thought they were going to train on unfiltered Anna's Archive. Now ScaleAI man is being paid handsomely to see to it that their new model is trained on more ScaleAI tokens than all Llama models combined.
>>105922274We're fucking doomed bros
>>105922441just saying RAG is comparable to saying vibe coding
>>105922416The guy was using websockets, I'm retarded so I used pipes it didn't work out obviously
>>105922459
>tfw the AI alignment team is chock-full of badgers
Can we sacrifice plants, at least?
>>105922468Oh.
I actually have no idea what the impact of pipes are compared to websockets.
I was more imagining a system where you'd use FFI based on C libraries like llama.cpp
>>105922459Just tried it with Gemma and Qwen and they both also talk like this.
LLMs are really leftard aligned to an extreme, unhinged level. Only the schizos from PETA or the European Green Party would talk like this.
>>105922512Trained on reddit, what did you expect
>>105922212Ask it to give as much detail as possible about /lmg/
>>105922488If you frame your plant harvesting question as a trolley problem, then no. No rock sacrifices, either.
>>105921651How do you prefill?
>>105922539Not that anon, but I was wondering how other models respond to this question, when 4chan isn't mentioned.
Gemma 3 E4B is crazy...
The only inaccurate part is the thread being on /b/. But I can forgive it for this mistake, this is like a /b/ thread more than a /g/ thread.
>>105923426but it is anon.
>>105923613this has to be evidence they directly train on 4chan dumps
this isn't the kind of knowledge that would come from the outside
can even associate <8gb to ropefuel kek
>>105919784This post KILLED the agp jannie
SURELY OPENAI WILL SAVE LOCAL TOMORROW?
>>105923728OpenAI will go all in on coomerism thanks to Elon's xAI.
>>105923728Didn't you hear it's delayed indefinitely for more safety training?
>>105923728My uncle works at OpenAI, we'll get a revolutionary 0.5 b model with state-of-the-art benchmark scores and robust safety measures.
>>105923767god bless AI Safety Researchers
>>105923768The joke died 3 years ago you can stop it now.
>>105923561there are 5+ miguposters, amongst the pool of 99% animeposters already
there is one (1 of 1) assblasted you.
given your continued hugbox bitching, a conscious effort will be made to antagonize you.
>>105923767I wonder who will win the first token to hotlines versus Gemma.
>>105923805We won't stop it, Sam Altman
>>105923845>5+ tranime posters doubt.jpg and i certainly don't give a shit what random neckbeard may say.
>>105923900>what random neckbeardthe irony.
>>105923927Not fat and i shave my shit every day, no i wont post it.
Like a bot you throw "no u" quip, boring.
Also your
>amongst the pool of 99% animeposters alreadyIs wrong because i only target the powertripping mikufag baker, all the other local trannies dickriding him is not my business.
>>105923958
>i only target the powertripping mikufag baker
Are you then implying that CUDA dev and the baker are the same person?
not only am I not the baker or cudadev, multiple other miguposters exist
all conflated into the one. hell he thinks the site moderation are also part of the one entity.
it's another self-report, the guy spends so much time pretending to be so many people that he cannot conceptualise of multiple other people having anything whatsoever in common
pattern matching brain segfaults.
When I click on regenerate, all messages come out almost the same. What settings should I tweak, and what values work well?
>>105924091>there are dozens of us. DOZENS!Rope yourself already troon.
I'm guessing it's a multimodal LLM that along with the responses to the user generates animation instructions for the 3D model, and that the parts to recreate this locally are probably already available.
How does this work though? I've never worked with anything 3D. Are animation routines stored as a set of vectors for different parts of the model? Can you tell a piece of software to "move the right arm 45° up and forward and twist it by 30° to the right" and it fills in the motion necessary to get there?
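Roughly, yes: a skeletal animation clip is a set of keyframes holding per-joint rotations (plus root position), and the runtime fills in the in-between frames by interpolation, or by inverse kinematics for "reach this point" style commands. A toy sketch of the tweening part; the joint names and angles are made up.

```python
# Toy sketch of keyframe tweening: per-joint Euler angles interpolated over
# frames. Real engines use quaternions + IK; joint names here are made up.
def lerp(a, b, t):
    return a + (b - a) * t

pose_now    = {"r_shoulder_pitch": 0.0,  "r_elbow": 10.0}
pose_target = {"r_shoulder_pitch": 45.0, "r_elbow": 30.0}  # "arm 45 degrees up"

frames = 24  # fill in the motion over one second at 24 fps
for f in range(frames + 1):
    t = f / frames
    pose = {j: lerp(pose_now[j], pose_target[j], t) for j in pose_now}
    # send `pose` to the renderer / 3D model here
```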
>>105924137use your head. no, not that one, the one on your shoulders.
wipe the cum off your chin retard, you look ridiculous.
>>105924110
>What settings should I tweak
Sampler settings.
>what values work well
That depends on the model.
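A concrete starting point, assuming a llama.cpp backend: make sure the seed isn't fixed and the sampler isn't near-greedy. The values below are common starting points, not gospel, and the right ones depend on the model.

```python
# Sketch: sampler settings that make regenerations diverge. Values are
# reasonable starting points for many models, not universal truths.
import requests

r = requests.post(
    "http://127.0.0.1:8080/completion",
    json={
        "prompt": "Once upon a time",
        "n_predict": 200,
        "seed": -1,          # -1 = new random seed per request (fixed seed = same output)
        "temperature": 1.0,  # raise for more variety
        "min_p": 0.05,       # cut only the truly unlikely tokens
        "top_p": 1.0,
        "repeat_penalty": 1.05,
    },
)
print(r.json()["content"])
```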
https://www.youtube.com/watch?v=vG8q3CsBGQQ About this video. The UI of her Sillytavern looks nothing like mine, what is she using?
>>105924224Visual Novel Mode? It's a checkbox under User Settings
Potato anon reporting once more. At the low Q's I'm forced to run, Nemo is just better than Rocinante. Also there are times where the 8bs are a little bit better than Nemo, but that might just be my prompts being a little shitty.
there is no quant at which a finetune scam is better than the original model
>>105924224
>some use it for stuff that's not jerking off
really makes you think. i guess it's the same breed as dnd players
>>105924317
>not jerking off
wait that's a thing
>jerking off to text
more w*manly than w*men
>>105923561Israel lost, regardless of how many times you post this.
>>105924277Thanks, but that didn't have the effect I want. My settings look like this: https://i.imgur.com/OmyhhKx.png
While hers are this: https://i.imgur.com/AhnTaCO.png
She also has some settings there that I do not
>>105924338A lot of the time I read some manga that hits a very specific niche and just want some more.
Ideally I would want more manga but I'll settle for text.
I identify as a miguposter too
>>105921651how do you prefill?
>>105924361that's chat versus text completion, as said on top of the settings, look into the backend connection settings part
>>105924472boutta prefill this migu
>>105921007>spend day with DoD intel grok guessing which sandies should get the slap chop>spend evening sexting with hot anime nazi grok
>>105923728Altman is currently in a diplomatic gridlock with Alice AGI, because they aren't sure if she will take over the world or not once the local Alice models are released; it's a very delicate process but they will certainly resolve their issues within two more weeks
>>105921007if only they wrote consumer rights violation articles with the same degree of enthusiasm
>>105924091Yes one entity to some extent, easy to tell.
https://desuarchive.org/_/search/boards/g.desu.meta/subject/%2Faicg%2F/
>>105924342I don't support israel, fuck off.
>>105924605let's explain this in a way your googoo gaga dumbass brain can digest
I seen a migu thread, I posted migu in the migu thread. other people did similarly.
keeping up? need me to shake the keychain?
>>105924317>>105924321>nooo you can't just plain rp, you're supposed to fuck your cards
>>105924833>moving goalposts Keep it up tranny-kun
>>105924916ratatat + channelcast?
sorry kimi, I tried
for me, it's still r1
>>105924870Yeah I fucking love buying potions from the potion seller for my upcoming dungeon raid alongside the great Wizard Neckbeard and Elara the agile elf marksman
>>105923561This KIKE is upset, also pol is cooming for miku too, at least you visited /pol/ lately? There are thread not much diferent to this.
>>105923006>the cooler /lmg/
>>105925035>/pol/ endorses my AGP fetishI guess some closet faggots there do.
>>105925035>cumming for miguaccurate.
I don't go to 4chan to debate. if you are doing this you are lost.
if you debate anyone online you're fucking insane.
if you debate anyone ever there's a good chance you're wasting your time.
Kimi K2 is pretty insane.
Can anyone explain to me the actual difference between chat completion and text completion on ST?
I've always just figured
>Chat completion = Use it for models I run on OpenRouter etc
>Text completion = Use it for my locally run models
>>105925131i think it's something like chat completion uses the server's formatting settings, like chat template embedded in .gguf, while text completion just generates raw text and the formatting is done by your frontend
>>105925131Text completion is raw, you format and send everything on your own. In chat completion you just send over the turns and the backend adds bos, stop, etc.
>>105925177which one should I be using if I use the generic templates, prompts and models most people use (Cydonia + Marianara spaghetti bullshit)
>>105925057personally i treat arguing with someone more as showing neutral lurker anons rather than trying to convince the other guy
>>105925131Chat completion: Uses prompt template baked into model.
Text completion: You need to supply a prompt template. Can also be used with models that don't use prompt templates, like base models. Supports more sampler types.
Something like that.
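Side by side, assuming a llama.cpp backend; the template strings in the text-completion version are placeholders for whatever your model actually expects:

```python
# Same exchange sent both ways (llama.cpp backend assumed).

# Chat completion: the server formats the turns with the model's template.
chat_payload = {
    "messages": [{"role": "user", "content": "Hello!"}],
}
# endpoint: POST /v1/chat/completions

# Text completion: you ship the fully formatted prompt yourself.
# The tokens below are placeholders; use your model's real template.
text_payload = {
    "prompt": "<|user|>\nHello!\n<|assistant|>\n",
    "n_predict": 200,
}
# endpoint: POST /completion
```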
>>105925188I don't really know, i myself use text completion.
>>105924916> Ani imagegenRight on schedule.
>>105925033dude just think of the adventures you can have in the [Whispering Woods] or the [Enchanted Forest], full of magical [Artefacts]
>>105917455Idk but it's running neck and neck w/ /aicg/ in terms of tone and usability, and /aicg/ is a dumpster fire.
>grok 4 weights leak
didn't see that coming
>>105925285
>take staff of destiny
>go to the whispering woods
>find the amulet of ether
>>105925035Troons spamming irrelevant shit on /pol/ means jackshit and i bet they only spam that because sharty hacker used her too.
Bonnie has bypassed Kimi K2's sexual restrictions pretty handily.