
Thread 105769835

368 posts 118 images /g/
Anonymous No.105769835 [Report] >>105772362 >>105773374
/lmg/ - Local Models General
/lmg/ - a general dedicated to the discussion and development of local language models.

Previous threads: >>105757131 & >>105750356

►News
>(07/01) Huawei Pangu Pro 72B-A16B released: https://gitcode.com/ascend-tribe/pangu-pro-moe-model
>(06/29) ERNIE 4.5 released: https://ernie.baidu.com/blog/posts/ernie4.5
>(06/27) VSCode Copilot Chat is now open source: https://github.com/microsoft/vscode-copilot-chat
>(06/27) Hunyuan-A13B released: https://hf.co/tencent/Hunyuan-A13B-Instruct
>(06/26) Gemma 3n released: https://developers.googleblog.com/en/introducing-gemma-3n-developer-guide

►News Archive: https://rentry.org/lmg-news-archive
►Glossary: https://rentry.org/lmg-glossary
►Links: https://rentry.org/LocalModelsLinks
►Official /lmg/ card: https://files.catbox.moe/cbclyf.png

►Getting Started
https://rentry.org/lmg-lazy-getting-started-guide
https://rentry.org/lmg-build-guides
https://rentry.org/IsolatedLinuxWebService
https://rentry.org/tldrhowtoquant
https://rentry.org/samplers

►Further Learning
https://rentry.org/machine-learning-roadmap
https://rentry.org/llm-training
https://rentry.org/LocalModelsPapers

►Benchmarks
LiveBench: https://livebench.ai
Programming: https://livecodebench.github.io/leaderboard.html
Code Editing: https://aider.chat/docs/leaderboards
Context Length: https://github.com/adobe-research/NoLiMa
Censorbench: https://codeberg.org/jts2323/censorbench
GPUs: https://github.com/XiongjieDai/GPU-Benchmarks-on-LLM-Inference

►Tools
Alpha Calculator: https://desmos.com/calculator/ffngla98yc
GGUF VRAM Calculator: https://hf.co/spaces/NyxKrage/LLM-Model-VRAM-Calculator
Sampler Visualizer: https://artefact2.github.io/llm-sampling

►Text Gen. UI, Inference Engines
https://github.com/lmg-anon/mikupad
https://github.com/oobabooga/text-generation-webui
https://github.com/LostRuins/koboldcpp
https://github.com/ggerganov/llama.cpp
https://github.com/theroyallab/tabbyAPI
https://github.com/vllm-project/vllm
Anonymous No.105769843 [Report] >>105769864 >>105769926 >>105770731
►Recent Highlights from the Previous Thread: >>105757131

--Paper: Libra: Synergizing CUDA and Tensor Cores for High-Performance Sparse Matrix Multiplication:
>105761808 >105761966 >105762753 >105763009
--Paper: Pangu Pro MoE: Mixture of Grouped Experts for Efficient Sparsity:
>105768845 >105768877 >105768901 >105768884 >105768906 >105768933 >105769034
--Meta's AI talent acquisition and open model skepticism amidst legal and data curation challenges:
>105758293 >105758397 >105758388 >105758810 >105758467 >105758482 >105766325 >105758818 >105758901 >105758926 >105758942
--Hunyuan-A13B GGUF port requires custom llama.cpp build for flash attention support:
>105768115 >105768164 >105768455
--Frustration over delayed OpenAI model and skepticism toward benchmarks and strategy:
>105766029 >105766042 >105768619 >105768677 >105768693 >105768837 >105768876 >105769053 >105768798 >105768934
--Critique of Hunyuan and Ernie models for over-reliance on Mills & Boon-style erotic prose in outputs:
>105758427 >105758629 >105758645 >105758674 >105764901 >105765054 >105765118 >105765228 >105765275 >105765472 >105765503 >105765747 >105767085 >105767501 >105766545 >105766794 >105768886 >105758694
--NVIDIA's Mistral-Nemotron open reasoning model sparks confusion and skepticism among anons:
>105766864 >105766975 >105767094 >105767167
--Discussion on NVIDIA ending driver support for older Pascal, Maxwell, and Volta GPUs:
>105764483 >105764512 >105766267
--Fish Audio S1 Mini and 4B text-to-speech model voice cloning results shared:
>105760876 >105760929
--Official OpenAI podcast episode discussing ChatGPT and AI assistant development:
>105766509
--Hunyuan A13B IQ4 chat completion issues on llama.cpp?? frustration:
>105760696 >105760773
--Meta court win legitimizes fair use for LLM training in the U.S.:
>105766199
--Miku (free space):
>105765500 >105766204

►Recent Highlight Posts from the Previous Thread: >>105757140

Why?: 9 reply limit >>102478518
Fix: https://rentry.org/lmg-recap-script
Anonymous No.105769864 [Report]
>>105769843
>--Hunyuan A13B IQ4 chat completion issues on llama.cpp?? frustration:
Turn the temp down holy fuck
Anonymous No.105769897 [Report] >>105769907 >>105769965
temperature was a mistake
Anonymous No.105769907 [Report]
>>105769897
Yeah it's way too fucking hot right now.
Anonymous No.105769926 [Report]
>>105769843
Thank you Recap Teto
Anonymous No.105769948 [Report] >>105770572
70Bros what's fotm slop finetune?
Anonymous No.105769965 [Report] >>105770012
>>105769897
Temp + Top-P is all you need.
Anonymous No.105770012 [Report]
>>105769965
Temperature: 0.6
Top K: 100
Top P: 0.95
Top nsigma: 1

can't believe there are still people using other LLMs when R1 is this simple to set up.
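Those four samplers stack cleanly. A minimal numpy sketch of that chain (my own toy implementation, not llama.cpp's code; the function name and the exact filter ordering are assumptions):

```python
import numpy as np

def filter_logits(logits, temperature=0.6, top_k=100, top_p=0.95, nsigma=1.0):
    """Toy sketch of the sampler chain above: temperature, then top-nsigma,
    top-k, and top-p filtering. Returns the final probability distribution."""
    logits = np.asarray(logits, dtype=np.float64) / temperature

    # top-nsigma: drop tokens more than n standard deviations below the max logit
    logits = np.where(logits >= logits.max() - nsigma * logits.std(),
                      logits, -np.inf)

    # top-k: keep only the k highest logits
    if top_k < logits.size:
        kth = np.sort(logits)[-top_k]
        logits = np.where(logits >= kth, logits, -np.inf)

    # softmax over whatever survived (exp(-inf) == 0)
    probs = np.exp(logits - logits.max())
    probs /= probs.sum()

    # top-p: keep the smallest prefix of tokens whose cumulative prob >= top_p
    order = np.argsort(probs)[::-1]
    cum = np.cumsum(probs[order])
    n_keep = int(np.searchsorted(cum, top_p)) + 1
    mask = np.zeros(probs.size, dtype=bool)
    mask[order[:n_keep]] = True
    probs = np.where(mask, probs, 0.0)
    return probs / probs.sum()
```

Note the top-nsigma step here follows the "keep tokens within n·σ of the max logit" rule from the top-nσ paper; real backends may apply the filters in a different order.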
Anonymous No.105770034 [Report] >>105770065 >>105770072 >>105772262
hi what's the best model for 8gb vram these days
i'm primarily interested in things like tool use and agentic behavior
Anonymous No.105770041 [Report] >>105770068
Anonymous No.105770065 [Report] >>105770068
>>105770034
Qwen MoE probably. The 30B.
Anonymous No.105770068 [Report] >>105770076 >>105770125
>>105770041
not bait :(
>>105770065
thanks, how many t/s should i expect?
Anonymous No.105770072 [Report] >>105770186
>>105770034
Nemo
Anonymous No.105770076 [Report] >>105770097
>>105770068
Will mostly depend on your RAM, since you want to offload the expert tensors to the CPU backend using the --override-tensor (-ot) parameter.
I'll say between 10 and 15 t/s?
Anonymous No.105770097 [Report] >>105770144
>>105770076
16gb of ram unfortunately, haven't gotten around to upgrading it yet
Anonymous No.105770125 [Report] >>105770186
>>105770068
>not bait :(
You know 8gb is not enough right?
Anonymous No.105770144 [Report] >>105770186
>>105770097
Well, shit.
RIP I guess.
Try the q4ks quant with low topk and pray for the best I suppose.
Anonymous No.105770186 [Report] >>105770216
>>105770072
will keep that in mind
>>105770125
i didn't purpose build this pc for running llms, it's just a gaming pc that i'm hoping to repurpose
>>105770144
thanks, will do
last model i used was mistral-7b and it was honestly not up to snuff
Anonymous No.105770216 [Report] >>105770255
>>105770186
what specific graphics card do you have, and also how much regular RAM do you have? CPUmaxxing might be an option
Anonymous No.105770255 [Report]
>>105770216
rtx 2060 super
16gb of regular ram, some 10th gen i7
Anonymous No.105770318 [Report]
that 16gb is going to limit your maximum context
Anonymous No.105770328 [Report]
>>105769946
It's ggml-large-v3.bin from https://huggingface.co/ggerganov/whisper.cpp/tree/main
Anonymous No.105770389 [Report] >>105770409
Reminder that ROCM sucks so much that it's ALWAYS better to fit more layers in VRAM and use -nkvo (--usecublas lowvram in kobold).
Anonymous No.105770409 [Report] >>105770513
>>105770389
>it's ALWAYS better to fit more layers in VRAM
isn't that generally the case?
Anonymous No.105770488 [Report] >>105770519
>>105768845
>[I]t is commonly observed that some experts are activated far more often than others, leading to system inefficiency when running the experts on different devices in parallel. Existing heuristics for balancing the expert workload can alleviate but not eliminate the problem. Therefore, we introduce Mixture of Grouped Experts (MoGE), which groups the experts during selection and balances the expert workload better than MoE in nature. It constrains tokens to activate an equal number of experts within each predefined expert group. When a model execution is distributed on multiple devices, which is necessary for models with tens of billions of parameters, this architectural design ensures a balanced computational load across devices, significantly enhancing throughput, particularly for the inference phase.
Why don't their speed benchmarks compare Pangu Pro 72B A16B to other MoEs?
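The grouping trick itself is easy to sketch. A toy numpy illustration (my own sketch, not the paper's code; the group size, scores, and function names are made up) of why per-group top-k guarantees a balanced per-device load while global top-k doesn't:

```python
import numpy as np

def global_topk(scores, k):
    """Plain MoE routing: pick the k highest-scoring experts overall.
    Nothing stops all k from landing on the same device."""
    return set(np.argsort(scores)[::-1][:k].tolist())

def grouped_topk(scores, n_groups, k_per_group):
    """MoGE-style routing sketch: split the experts into equal groups
    (one group per device) and pick the top k_per_group inside every
    group, so each device activates exactly the same number of experts."""
    groups = np.split(np.arange(scores.size), n_groups)
    chosen = set()
    for g in groups:
        top = g[np.argsort(scores[g])[::-1][:k_per_group]]
        chosen.update(top.tolist())
    return chosen

rng = np.random.default_rng(0)
scores = rng.normal(size=16)  # router scores for 16 experts, 4 per device
picked = grouped_topk(scores, n_groups=4, k_per_group=2)
# every group of 4 consecutive experts contributes exactly 2
per_group = [sum(1 for e in picked if e // 4 == g) for g in range(4)]
```

With `global_topk(scores, 8)` the per-group counts depend entirely on where the high scores happen to fall, which is exactly the load imbalance the abstract is complaining about.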
Anonymous No.105770513 [Report]
>>105770409
I'm assuming it's not because nvidia people here keep talking about having memory for (x)k context and that's not an issue if you just put it in RAM.
Anonymous No.105770519 [Report]
>>105770488
And why were those non-matching batch sizes chosen for inference benchmarks?
Anonymous No.105770572 [Report]
>>105769948
sloptune roundup for smut:

[i dunno i like em]
sophosympatheia_StrawberryLemonade-L3-70B-v1.0-Q4_K_M.
drummer anubis Shimmer-70B-v1c-Q4_K_M

[dark fantasy model]
CrucibleLab_L3.3-Dark-Ages-70b-v0.1-Q4_K_M

[claude logs]
L3.3-70B-Magnum-Diamond-Q4_K_M

[for anyone who has 30 gb vram and is sick of 32b, this is a great model that is almost like 70b]
TheDrummer_Valkyrie-49B-v1-Q5_K_L

[funny name]
Broken-Tutu-24B.Q8_0
Anonymous No.105770575 [Report] >>105770663
I am using a low quant of Magistral Small for my Roman Empire slavery themed smut and this is already one of the best, tightest writing models I've run on my 3060.

Really thinking seriously about just getting a 3090 at this point
Anonymous No.105770594 [Report]
>check public rp datasets
>almost every system prompt has "avoid repetition"
>the logs are repetitive
I wonder how this will damage future models
Anonymous No.105770658 [Report] >>105770671 >>105770678 >>105770681 >>105770698 >>105770714 >>105770812
WHAT le fug is wrong with ik_llama???

It is not using the GPU for prompt processing at all while pushing CPU to do it

using ubergarm's quant and their retarded command line
Anonymous No.105770663 [Report] >>105770716
>>105770575
This?
https://chub.ai/characters/handwrought.route/roman-rites-of-passage-5c70d58ab3bf
I was very surprised how much it knew about actual history. Like any anime shit, probably a lost cause, but it's got that Wikipedia+ knowledge.
Anonymous No.105770671 [Report] >>105770697
>>105770658
I'm assuming it's intended that you put the experts and the context on GPU and the rest on RAM. Are you doing that?
Anonymous No.105770678 [Report] >>105770688
>>105770658
>windows
found your problem
Anonymous No.105770681 [Report] >>105770709
>>105770658
If you have part of the model on cpu, the gpu will idle most of the time waiting for the cpu to do its bit. What are you trying to run?
Anonymous No.105770688 [Report]
>>105770678 (You)

blind cow
Anonymous No.105770697 [Report] >>105770737 >>105770742 >>105770793 >>105770804
>>105770671
latest commit, installed yesterday

CUDA_VISIBLE_DEVICES="0," \
"$HOME/LLAMA_CPP/$commit/ik_llama.cpp/build/bin/llama-server" \
--model "$model" $model_parameters \
--numa isolate \
--n-gpu-layers 99 \
-b 8192 \
-ub 8192 \
--override-tensor exps=CPU \
--parallel 1 \
--ctx-size 32768 \
-ctk f16 \
-ctv f16 \
-rtr \
-mla 2 \
-fa \
-amb 1024 \
-fmoe \
--threads 16 \
--host 0.0.0.0 \
--port 8080
Anonymous No.105770698 [Report] >>105770718
>>105770658
>ik_llama
lol
lmao even
Anonymous No.105770709 [Report]
>>105770681
>If you have part of the model on cpu,

I'm talking about PROMPT PROCESSING.

With Gerganov's llama, GPU is pushed to 100% though
Anonymous No.105770714 [Report] >>105770742
>>105770658
Have you tried not running your context on CPU for some retarded reason?
Anonymous No.105770716 [Report]
>>105770663
I write my own bc im a huge rome nerd but this one is good too. A lot of the loredump in that card is redundant though, models generally know that shit out of the box because its in pretty much every dataset. They will also generally allow you to do whatever you want to the slaves in that context because its actual history I guess.

Anyway, cant wait for magistral finetunes
Anonymous No.105770718 [Report]
>>105770698
this unironically
Anonymous No.105770719 [Report] >>105770744
https://www.tiktok.com/@mooseatl_dj/video/7509908926972857630
local lost
Anonymous No.105770731 [Report] >>105770759
>>105769843
>--Meta court win legitimizes fair use for LLM training in the U.S.:
what anon said is not true
the judge said that it's not fair use if the generated text competes in any way with the text used for training
Anonymous No.105770737 [Report] >>105770774
>>105770697
Not about your problem, but does your CPU actually have 16 physical cores?
Anonymous No.105770741 [Report] >>105770749 >>105770753 >>105770760 >>105770768 >>105770802 >>105771306 >>105774887
Llama 4 thinking is going to be crazy...
Anonymous No.105770742 [Report]
>>105770714
>>105770697
>not on CPU

as you can see I'm not specifying --no-kv-offload for the kv-cache or anything else.

VRAM is filled up to 20 GB
Anonymous No.105770744 [Report]
>>105770719
I'm not clicking that.
Anonymous No.105770749 [Report]
>>105770741
I would a Chang
Anonymous No.105770753 [Report]
>>105770741
They're not on the Llama team anon.
Anonymous No.105770759 [Report] >>105770912
>>105770731
How can you even say it does or doesn't compete?
Anonymous No.105770760 [Report] >>105770772
>>105770741
Is Zuck spending 10s of millions to be told "train on unfiltered data"?
Anonymous No.105770768 [Report] >>105770940
>>105770741
This is the moment Meta goes closed-source, you won't get any high quality models.
Anonymous No.105770772 [Report]
>>105770760
Wrong illions anon.
Anonymous No.105770774 [Report] >>105770782
>>105770737
>does your CPU actually have 16 physical cores

I tried with just physical 8 => still slower than gg's llama

this set of params where I explicitly isolate cores 0-7 is just as slow (pp 12 t/s, tg 2.3 t/s)

CUDA_VISIBLE_DEVICES="0," \
numactl --physcpubind=0-7 --membind=0 \
"$HOME/LLAMA_CPP/$commit/ik_llama.cpp/build/bin/llama-server" \
--model "$model" $model_parameters \
--n-gpu-layers 99 \
-b 8192 \
-ub 8192 \
--override-tensor exps=CPU \
--parallel 1 \
--ctx-size 32768 \
-ctk f16 \
-ctv f16 \
-rtr \
-mla 2 \
-fa \
-amb 1024 \
-fmoe \
--threads 8 \
--host 0.0.0.0 \
--port 8080
Anonymous No.105770782 [Report]
>>105770774
Well, it's a retarded meme fork.
Anonymous No.105770793 [Report] >>105770804 >>105770836
>>105770697
Did you build it with DGGML_CUDA_IQK_FORCE_BF16=1 like mentioned here https://github.com/ikawrakow/ik_llama.cpp/discussions/477 ?
Anonymous No.105770802 [Report]
>>105770741
That 1 pajeet basically needs all of those asians to fix all of the shit he's going to ruin and so that leaves 3 white guys scrambling to get it all done.
Anonymous No.105770804 [Report] >>105770857 >>105771477
>>105770697
>>105770793
cmake -B build -DGGML_CUDA=ON -DGGML_SCHED_MAX_COPIES=1 -DGGML_CUDA_IQK_FORCE_BF16=1
to be exact
Anonymous No.105770812 [Report] >>105770857
>>105770658
Stole this from >>105593780
./llama-server --model /mnt/storage/IK_R1_0528_IQ3_K_R4/DeepSeek-R1-0528-IQ3_K_R4-00001-of-00007.gguf --n-gpu-layers 99 -b 8192 -ub 8192 -ot "blk.[0-9].ffn_up_exps=CUDA0,blk.[0-9].ffn_gate_exps=CUDA0" -ot "blk.1[0-9].ffn_up_exps=CUDA1,blk.1[0-9].ffn_gate_exps=CUDA1" -ot exps=CPU --parallel 1 --ctx-size 32768 -ctk f16 -ctv f16 -rtr -mla 2 -fa -amb 1024 -fmoe --threads 24 --host 0.0.0.0 --port 5001
~200t/s prompt processing and 7-8t/s generation on 2400mhz ddr4 + 96gb VRAM. Using ik_llamacpp and the ubergarm quants.
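For anyone puzzling over those -ot flags: each one is a regex matched against tensor names, with earlier flags taking priority. A toy Python sketch of that routing (the first-match-wins rule and the tensor names are my assumptions about llama.cpp's behavior, not its actual code; I also escape the dots, which the command above doesn't bother with):

```python
import re

# first-match-wins placement rules mimicking the -ot flags above:
# up/gate experts of blocks 0-19 go to the two GPUs, remaining expert
# tensors go to CPU, everything else stays on GPU via -ngl 99
rules = [
    (r"blk\.[0-9]\.ffn_up_exps",    "CUDA0"),
    (r"blk\.[0-9]\.ffn_gate_exps",  "CUDA0"),
    (r"blk\.1[0-9]\.ffn_up_exps",   "CUDA1"),
    (r"blk\.1[0-9]\.ffn_gate_exps", "CUDA1"),
    (r"exps",                       "CPU"),
]

def place(tensor_name):
    """Return the backend the first matching rule assigns to this tensor."""
    for pattern, backend in rules:
        if re.search(pattern, tensor_name):
            return backend
    return "GPU"  # default: no override, -ngl 99 keeps it on the GPU

assert place("blk.7.ffn_up_exps.weight") == "CUDA0"
assert place("blk.14.ffn_gate_exps.weight") == "CUDA1"
assert place("blk.42.ffn_down_exps.weight") == "CPU"
```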
Anonymous No.105770836 [Report]
>>105770793
>DGGML_CUDA_IQK_FORCE_BF16=1

Gonna re-compile now as suggested, and then report
Anonymous No.105770857 [Report]
>>105770804
>>105770812

thanks

I set -DBUILD_SHARED_LIBS=OFF because shared libs went missing. I hope it's OK (works with gg's llama though)
Anonymous No.105770912 [Report]
>>105770759
mostly that you cannot use an llm to write the same media that was fed into it, but that would need to be further defined (i only read the final part, not the full text), bc this court ruling didnt focus on that properly. the judge basically stated that meta won bc the other guys' lawyers went full retard and didnt fight the compete point of the fair use at all, they were focusing on other shit, so meh
in any way, this creates bad jurisprudence for llms, even if meta won, but the usual legal fud from tech is spreading instead of what actually happened
which is why i always found it funny how the foss world buys and spreads the legal fud of the corporations
Anonymous No.105770940 [Report] >>105770957
>>105770768
Their super intelligence models are going to be API only. They'll probably leave Llama going as open source scraps with their B team. Llama will be the Gemma to Meta's Gemini.
Anonymous No.105770957 [Report]
>>105770940
Gemma is at least somewhat decent, so please don't compare it to Llama.
Anonymous No.105770964 [Report]
merge that chink hunhunyuan shit already, i'm not gonna quant that myself
Anonymous No.105770980 [Report]
>nemo shills
>qwq shills
>gemma shills
>mistral shills
It's all crap...
Anonymous No.105771000 [Report] >>105771081 >>105771087 >>105771352
======PSA NVIDIA FUCKED UP THEIR DRIVERS AGAIN======
minor wan2.1 image to video performance regression coming from 570.133.07 with cuda 12.6 to 570.86.10 (with cuda 12.8 and 12.6)
I tried 570.86.10 with cuda 12.6, the performance regression was still the same. Additionally I tried different sageattn versions (2++ and the one before 2++)
reverted back to 560.35.03 with cuda 12.6 for good measure and the performance issue was fixed
picrel is same workflow with same venv. the speeds on 560.35.03 match my memory of how fast i genned on 570.133.07
t. on debian 12 with an RTX 3060 12GB
Anonymous No.105771034 [Report] >>105771113 >>105771114
When's the last time we actually got a significant upgrade in terms of models that run on consumer hardware? Is there even one to look forward to?
Anonymous No.105771081 [Report]
>>105771000
https://youtu.be/OF_5EKNX0Eg?t=7
Anonymous No.105771087 [Report]
>>105771000
Greta will be like
Anonymous No.105771099 [Report]
Why aren't you a werewolf in your RPs anon?
Anonymous No.105771113 [Report]
>>105771034
deepseek, regardless if you can run it on consumer hardware or not.
Anonymous No.105771114 [Report]
>>105771034
sadly not much has happened in the consumer segment at around 7-12b
even the high-end consumer segment at 24-32b hasn't moved forward much despite all the releases
it is looking very dire for true local models
Anonymous No.105771117 [Report] >>105771270 >>105774637
Nick_DungeonAI:
>https://www.reddit.com/r/SillyTavernAI/comments/1lpdooa/how_can_we_help_open_source_ai_role_play_be/
Anonymous No.105771270 [Report]
>>105771117
Depressing to see the bad guy win. Thats how it is I guess.
Anonymous No.105771306 [Report]
>>105770741
Do you have to have a stupid name to be a top AI researcher
Anonymous No.105771352 [Report] >>105774099
>>105771000
go back nigger >>105770040
Anonymous No.105771391 [Report] >>105771408 >>105772295
there's a 235b tune out if anyone has a rig capable of it
https://huggingface.co/Aurore-Reveil/Austral-Qwen3-235B
Anonymous No.105771408 [Report]
>>105771391
>Trained with a collection of normal Austral(Books, RP Logs, LNs, etc) datasets
Literally who.
Anonymous No.105771477 [Report]
>>105770804
Same lame shit 2bh with the same underwhelming GPU usage. CPU cores are not used up to 100% either
Anonymous No.105771524 [Report] >>105771558 >>105771568 >>105771842
I wish language models wouldn't always assume that femdom automatically means pegging/anal penetration
even big proprietary models do it so APIs are no escape
Anonymous No.105771537 [Report] >>105772101 >>105773559
> ‘Missionaries Will Beat Mercenaries’
https://www.wired.com/story/sam-altman-meta-ai-talent-poaching-spree-leaked-messages/

Sam is seething lmao
Anonymous No.105771558 [Report]
>>105771524
Just another sign of female centric literature dumped into those models.
Anonymous No.105771568 [Report] >>105771576 >>105771615
>>105771524
My mesugakis never tried to fuck me in the ass.
Anonymous No.105771576 [Report]
>>105771568
it's not always actual pegging, sometimes just fingering
but they always go straight for _some_ form of anal play when a story is FD
Anonymous No.105771615 [Report] >>105771876
>>105771568
Yes because normal grown up women will never sex you.
Anonymous No.105771696 [Report] >>105771703
Anonymous No.105771702 [Report] >>105771762
Anonymous No.105771703 [Report]
>>105771696
kek
Anonymous No.105771762 [Report] >>105771809
>>105771702
Hey I understood that reference.
Actually I didn't.
Anonymous No.105771809 [Report]
>>105771762
I think it's supposed to be a metaphor for Germany.
Anonymous No.105771842 [Report] >>105771869
>>105771524
sounds like a prompting issue man. Of course if you type in 'be femdom' that shit's gonna come up all the time. That's not the language model's fault, that's just.... what reality is. Like google femdom jesus.

I'm never pegged in my roleplay because I don't prompt like an idiot. You literally don't even need to use the word femdom ever. Femdom is a broad category of fetishes.
Anonymous No.105771869 [Report] >>105771980 >>105773808
>>105771842
>Femdom is a broad category of fetishes.
If femdom is a broad category of fetishes, but LLMs think it just means pegging/anal play, that would seem to vindicate that other anon's complaint about them.
Anonymous No.105771876 [Report]
>>105771615
what a loss for the MANkind
Anonymous No.105771961 [Report] >>105771971
kek, wtf.
this might actually be the new openai opensource model.
Anonymous No.105771971 [Report]
>>105771961
Just to be clear, empty, I didnt ask a question.
Anonymous No.105771980 [Report] >>105772523
>>105771869
Prompting issue. Its selecting the most likely response. A few sentences like "I like cucking, foot worship, and female lead relationships/TPE" would fix.

And you know what the best part is? You dont even have to write it. Just ask the ai to make a femdom system prompt with a broad array of femdom related fetishes and to focus on variety.


I feel like nice llms sometimes 'entertain you' by being creative enough- and people get addicted to being surprised and delighted by that novelty. And that's a fun part of llms. But if you type in how you actually want things to go, ai can also bring to life a truly unique or hyper specific idea that it would never spit back at you from a generic prompt. For example, a mistress that will never peg you and finds it disgusting that you want that. Boom, better than anything ai will ever write when prompted for "this character but femdom me, ah ah mistress"
Anonymous No.105772101 [Report] >>105772146
>>105771537
Stingy jew wanted to be the only one to get rich from the OpenAI scam. Of course his employees will jump ship to whoever offers them a bag of cash.
>He added that “Missionaries will beat mercenaries” and noted that OpenAI is assessing compensation for the entire research organization. “I believe there is much, much more upside to OpenAl stock than Meta stock,” he wrote.
the value of stock they don't have is zero, so yeah maybe work on that lol
Anonymous No.105772146 [Report] >>105772189
>>105772101
You can have stock in a company while it is still private; that is what Sam is referring to. The main issue is that right now Zuck can outspend Sam for his super team, which is why the majority of the people leaving were from OpenAI. He has his points about it possibly not working out, but at the end of the day it's pretty sour grapes, hence the mention that he is trying to fix that.
Anonymous No.105772189 [Report]
>>105772146
You can, but he's been stingy which is why he's leaking people.
Anonymous No.105772262 [Report]
>>105770034
You could try the new gemma-3n.
Anonymous No.105772284 [Report] >>105773493
Damn. Even stammering shy lolis can't help themself. What the fuck. Mistral 3.2.
Also shivering etc. You goys tricked me again.
Anonymous No.105772295 [Report]
>>105771391
It didn't seem very coherent when I tried it.
Anonymous No.105772362 [Report] >>105772442
>>105769835 (OP)
Will Huawei Pangu save local???
Anonymous No.105772412 [Report] >>105772453
why is every model the exact same shit
surely they could do some interesting experimental schizo shit like having several small neural networks simulating emotions or something
Anonymous No.105772442 [Report]
>>105772362
yes
Anonymous No.105772453 [Report]
>>105772412
They can't get any of that shit to work. The future is LLMs, RAG, and most of all RAG with other LLMs. It will take another 20+ years before we have a breakthrough like you're describing. Maybe longer. Because what makes money is jeet-tier coding bots, not bots that have feelings.
Anonymous No.105772523 [Report]
>>105771980
>cucking, foot worship, and female lead relationships
The unholy trinity of people that should be lynched.
Anonymous No.105772534 [Report] >>105772820 >>105773846 >>105774275
Anonymous No.105772539 [Report] >>105772820 >>105773846 >>105774275
Anonymous No.105772556 [Report] >>105772620 >>105772636
https://github.com/THUDM/GLM-4.1V-Thinking
Anonymous No.105772620 [Report]
GLM-4.1V-Thinking: Towards Versatile Multimodal Reasoning with Scalable Reinforcement Learning
https://arxiv.org/abs/2507.01006
>We present GLM-4.1V-Thinking, a vision-language model (VLM) designed to advance general-purpose multimodal reasoning. In this report, we share our key findings in the development of the reasoning-centric training framework. We first develop a capable vision foundation model with significant potential through large-scale pre-training, which arguably sets the upper bound for the final performance. Reinforcement Learning with Curriculum Sampling (RLCS) then unlocks the full potential of the model, leading to comprehensive capability enhancement across a diverse range of tasks, including STEM problem solving, video understanding, content recognition, coding, grounding, GUI-based agents, and long document understanding, among others. To facilitate research in this field, we open-source GLM-4.1V-9B-Thinking, which achieves state-of-the-art performance among models of comparable size. In a comprehensive evaluation across 28 public benchmarks, our model outperforms Qwen2.5-VL-7B on nearly all tasks and achieves comparable or even superior performance on 18 benchmarks relative to the significantly larger Qwen2.5-VL-72B. Notably, GLM-4.1V-9B-Thinking also demonstrates competitive or superior performance compared to closed-source models such as GPT-4o on challenging tasks including long document understanding and STEM reasoning, further underscoring its strong capabilities.
>>105772556
very cool
Anonymous No.105772636 [Report] >>105772751
>>105772556
>https://huggingface.co/THUDM/GLM-4.1V-9B-Thinking
>404
It's over
Anonymous No.105772751 [Report]
>>105772636
https://huggingface.co/spaces/THUDM/GLM-4.1V-9B-Thinking-API-Demo
they only have the demo up it seems
https://huggingface.co/THUDM/models
havent posted it (though they say they will)
Anonymous No.105772756 [Report] >>105772781
https://huggingface.co/THUDM/GLM-4.1V-9B-Thinking
wait, it's live for me. weird, it wasn't showing up in the recent models when I checked.
>updated about 1 hour ago
bizarre well w/e
Anonymous No.105772781 [Report]
>>105772756
They probably just set the repo from private to public just now
Anonymous No.105772799 [Report] >>105773529
the deepseek distills are good enough for me. they run on my laptop, though a bit slow.
has any company released better models than that and Nemo for local general usage?
Anonymous No.105772820 [Report]
>>105772534
>>105772539
>twitter filename
Go back faggot
Anonymous No.105772836 [Report] >>105772844 >>105773088 >>105775915
ERNIE-4.5-0.3B. 269mb.
This is getting weird. How is this coherent enough to make a working html website. Including hovering effects etc.
Anonymous No.105772844 [Report]
>>105772836
imagine if this was bitnet and you could run it at 1.58 bit precision
Anonymous No.105773059 [Report] >>105773100
https://x.com/Tu7uruu/status/1940015995118059958
https://github.com/huggingface/huggingface-gemma-recipes
Anonymous No.105773088 [Report] >>105773112
>>105772836
Does it know who is Miku?
Anonymous No.105773100 [Report]
>>105773059
Why did you feel that the first link was necessary?
Anonymous No.105773112 [Report]
>>105773088
Kinda?
Anonymous No.105773123 [Report] >>105773244
has llama.cpp implemented any of the new chinese models from the last couple weeks yet? or are they stuck in PR hell
Anonymous No.105773244 [Report]
>>105773123
Yes.
Anonymous No.105773374 [Report] >>105777668
>>105769835 (OP)
Anonymous No.105773484 [Report] >>105777655 >>105777668
Anonymous No.105773493 [Report] >>105773672 >>105773814
>>105772284
this thread is parasited by mistral astroturfing the same way hdg is parasited by nai
Anonymous No.105773524 [Report] >>105773566 >>105773584
So if mistral is bad then what is good at the same parameter size roughly?
Anonymous No.105773529 [Report]
>>105772799
>has any company released better models than that
yes, the original models
if you're using a deepshit qwen distill, try the original qwen model, it's actually better in real world use, unless your real world use is doing benchmarks
Anonymous No.105773559 [Report]
>>105771537
Paywall
Anonymous No.105773566 [Report] >>105773933
>>105773524
For ERP or in general?
Anonymous No.105773584 [Report] >>105773650
>>105773524
Nothing, give up.
Anonymous No.105773638 [Report]
for me, it's rocinante
Anonymous No.105773650 [Report]
>>105773584
Give up on what you concern-trolling nigger?
Anonymous No.105773656 [Report] >>105773723
jus put eyedrops in cause staring at puter too long
Anonymous No.105773672 [Report] >>105773814
>>105773493
It's definitely not as annoying as the Drummer astroturfing. Because, you know, Mistral provides the models, Drummer parasitizes them.
Anonymous No.105773723 [Report] >>105773729
>>105773656
just remember to blink
it shouldn't have to be said but some of you niggers might even forget how to breathe
Anonymous No.105773729 [Report] >>105773745
>>105773723
Wow, rude! *Please* don't call me a nigger.
Anonymous No.105773745 [Report] >>105773748
>>105773729
>*Please*
LLM hands wrote this
Anonymous No.105773748 [Report]
>>105773745
LLMs were trained from a curated dataset of only the best prose. *My* prose.
Anonymous No.105773764 [Report]
What if you merge devstral, magistral and small 3.2?
Anonymous No.105773808 [Report]
>>105771869
Sounds like the prompt issue troll has a prompt issue when posting. Weird...
Anonymous No.105773814 [Report] >>105773880
>>105773493
Its painful anon.

>>105773672
I don't know what it is with all those recent finetunes. (didnt try 3.2 ones yet though)
It seems like they make the writing worse and more slopped up now instead of the reverse.
It's a weird mix of gpt/claude and a hint of r1.
That's probably exactly what they use.
Anonymous No.105773831 [Report]
Let's go mistral!
Anonymous No.105773846 [Report] >>105777329
>>105772534
>>105772539
Mikutroon faggot. Die.
Anonymous No.105773880 [Report] >>105773899
>>105773814
I'm not sure what you're referring to exactly, but MS3.0 and MS3.1 were not that great in terms of prose and felt autistic. MS3.0 introduced the "I cannot and will not" refusals that we've seen elsewhere too, although 3.1 toned them down. MS3.2 has a different prose style and it seems better for RP, but it still overall feels lazy and uncreative compared to Gemma 3 (which has another set of issues, though). Magistral is their RL/reasoning finetune and I didn't like it (it has looping issues as well), although it seems to share the same slop source as MS3.2. I haven't tried Devstral at all.

Now watch the drummer shills trying to shit up the thread... pathetic.
Anonymous No.105773899 [Report] >>105773910
>>105773880
Mistral was sea otters
Gemma was cannot will not
Anonymous No.105773910 [Report] >>105774617
>>105773899
Mistral Small 3.X occasionally does the "cannot and will not" thing too (I recently used it for synthetic data generation and it was annoying for certain request types). I wonder what the source of this type of refusal is; I refuse to believe they independently came up with that.
Anonymous No.105773933 [Report]
>>105773566
Well the context of the thread's discussion around nemo is erp, so erp.
Anonymous No.105774057 [Report]
There will be no AGI in the next 20 years, at least.
You are stuck on vramlet cards forever.
There will be no significant improvements in model architectures, so you'll have to use dumb models for eternity.
There is no hope.

How does it feel?
Anonymous No.105774099 [Report]
>>105771352
Yeah, fuck that guy for giving us useful information.
Anonymous No.105774179 [Report] >>105774206 >>105774227 >>105774248 >>105774390
Finally, a good benchmark: human experts rating model answers.
https://allenai.org/blog/sciarena
Unsurprisingly, mistral is rated as dogshit
Mistral medium even does worse than small, real lol, lmao even
Anonymous No.105774206 [Report] >>105774242
>>105774179
>SciArena: A New Platform for Evaluating Foundation Models in Scientific Literature Tasks
This certainly will be useful for RP/ERP.
Anonymous No.105774227 [Report] >>105774302 >>105774307
>>105774179
>lmarena but the retards doing the evaluation happen to have a degree in some field
Anonymous No.105774242 [Report]
>>105774206 (me)
The general trend seems that models that are large and/or trained with a focus on Math/STEM are getting higher scores.
Anonymous No.105774248 [Report]
>>105774179
Looks like Qwen's STEMmaxxing wasn't just for show.
Anonymous No.105774275 [Report]
>>105772534
>>105772539
mikubro. live.
Anonymous No.105774302 [Report] >>105774324
>>105774227
The article had a link to the voting, I entered a question related to my field of study and voted.
However, at no point was it asserted that I am actually an expert, I didn't even need an account.
So either they let unqualified people vote or they just collect data from random people without making it clear that it doesn't affect the ratings.
Anonymous No.105774307 [Report]
>>105774227
we need to propose coomer council evaluation
Anonymous No.105774324 [Report]
>>105774302
>However, at no point was it asserted that I am actually an expert
read the paper
the current data was only contributed by actual experts, it wasn't available for anyone and their dog to vote
I don't get what they intend with the current public leaderboard though
btw
>As shown in Table 3, SciArena-Eval presents significant challenges for model-based evaluators. Even the best-performing model, o3, achieves only 65.1% accuracy. Lower-performing models, such as Gemini-2.5-Flash-Preview and Llama-4-sereis models, perform only slightly better than random guessing. Notably, similar pairwise evaluation protocols have shown strong alignment with human judgments (i.e., exceeding 70% correlation) on general-purpose benchmarks like AlpacaEval [34] and WildChat
kek llama
Anonymous No.105774390 [Report] >>105774628
>>105774179
o3 is that good? damn I must test it out more
Anonymous No.105774495 [Report] >>105774505 >>105774562
https://helpingai.co/benchmark

Wow! This incredible model thinks like a brilliant! Blows away every competition!
Anonymous No.105774505 [Report] >>105774529
>>105774495
>think like a brilliant
>act like a psychopathic
We sawed the seed!
Anonymous No.105774529 [Report] >>105774570
>>105774505
Are you mocking me?
Anonymous No.105774562 [Report]
>>105774495
>Bilingual Reasoning Capabilities: Native support for English and Hindi with natural code-switching between languages.
>Qwen/Qwen3-14B-Base

this is truly the weapon of bharat, perfect for generating gorgeous tokens
https://huggingface.co/HelpingAI/Dhanishtha-2.0-preview
Anonymous No.105774570 [Report]
>>105774529
No, I'm working on a silly-bot.
Anonymous No.105774617 [Report]
>>105773910
>I wonder what's the source of this type of refusals
There must be somebody going around selling datasets to the big players.
Maybe ScaleAI, maybe somebody similar.
They all sound the same. Also weird stuff: if you prompt for a simple game nowadays, they all make the same game...
Anonymous No.105774628 [Report]
>>105774390
It was total shit in my experience, at least for code.
Straight up made up packages. Did everything except the one thing I asked it to.
I don't really get the reasoning hype.
Anonymous No.105774637 [Report]
>>105771117
Was interested in Harbinger 24B until
>https://huggingface.co/LatitudeGames/Harbinger-24B/discussions/3
I don't think it's necessary to use such a finetune. Both Gemma 3 and MS3.2 can and will follow a game setup through when given a concise but proper guideline, while avoiding the hundreds of useless tokens and redundant sentences within the character card. Most cards are just too vague, or if they aren't, they're filled with redundant chatgpt-slop instructions.
Besides, I would call these 'interactive storytelling' rather than RPGs, but whatever.
Anonymous No.105774668 [Report]
https://github.com/ggml-org/llama.cpp/pull/9126#pullrequestreview-2974279071 mamba 2 soon
Anonymous No.105774745 [Report] >>105774775 >>105774799 >>105774803 >>105774807 >>105774894 >>105774900 >>105774981 >>105775409 >>105776178 >>105776676 >>105777299
I'm a complete moron on this shit, but does picrel sound even remotely plausible?
Anonymous No.105774775 [Report]
>>105774745
Sam has my dick internally.
Anonymous No.105774799 [Report]
>>105774745
doesn't matter what they have, deepseek will release the same thing for 1/10 the cost
Anonymous No.105774803 [Report] >>105774813 >>105774833 >>105774906
>>105774745
They've been claiming to have achieved AGI internally for years now and still got BTFO by a Chinese startup. OpenAI claims of AGI are baseless hype like SpaceX claims of colonizing Mars next year. Go back to twitter moron.
Anonymous No.105774807 [Report] >>105774848
>>105774745
>e/acc
Anonymous No.105774813 [Report]
>>105774803
Him posting here raised the average thread intellect by 10%
Anonymous No.105774833 [Report] >>105774852
>>105774803
I wish we had reliable data on their actual internal best models in development, but there is so much baseless hype it's impossible to see
Anonymous No.105774848 [Report] >>105774862
>>105774807
what does it even mean
Anonymous No.105774852 [Report] >>105774864
>>105774833
gpt5 will blow away!
Anonymous No.105774862 [Report] >>105774871
>>105774848
https://en.wikipedia.org/wiki/Effective_accelerationism
Anonymous No.105774864 [Report] >>105774887
>>105774852
this at least looks realistic, basically unifying everything they have
Anonymous No.105774871 [Report]
>>105774862
oh I see, thanks
Anonymous No.105774887 [Report] >>105774908
>>105770741

they've got tools >>105774864
Anonymous No.105774894 [Report] >>105774905 >>105774912 >>105774935 >>105774959
>>105774745
Reminds me of that video :
https://www.youtube.com/watch?v=k_onqn68GHY

That has so many unexplained leaps (why would a model be able to self-improve? Why would parallelism make it better at improving? What would "improving" even mean?) that it's basically magic.

I don't understand why people aren't amazed by what we can do already and instead go and invent doomsday or magical scenarios.
Anonymous No.105774900 [Report]
>>105774745
>blabla bla trust me bro we have AGI in private now
they've been saying this for years at this point
Anonymous No.105774905 [Report]
>>105774894
They can and do benefit from people believing in their scenarios.
Anonymous No.105774906 [Report] >>105774954 >>105775485
>>105774803
>got BTFO by a Chinese startup
I don't know about that. OAI and Google know how to make very long context models; the DeepSeek API is stuck at 64k. It's natively capable of around 168K, but it's probably very embarrassing when you approach that amount. I know for sure, having tested it myself, that the model starts acting very stunted and repetitive when you're close to the API limit kek.
DeepSeek is good, I don't mean that as a diss. But it's good for an open-weight model; it's not actual SOTA, and the deepsy spammers of /lmg/ are deluded. Gemini profoundly destroys DS in many real-world uses, and having actually useful large context opens up things you couldn't even imagine with such a limited model.
Anonymous No.105774908 [Report]
>>105774887
meds
"we might show just one model and have it be dynamically choose internally the one we think you'd need" is more realistic than "WE GOT SELF IMPROVING AGI" and other bullshit around gpt5
Anonymous No.105774912 [Report] >>105776495 >>105776534 >>105776578
>>105774894
For most people ai is some mystical shit that lives inside a supercomputer and has neurons. I've even seen some people thanking the assistant.
Anonymous No.105774935 [Report]
>>105774894
>I don't understand why people ... go and invent doomsday or magical scenarios.
It got you to click and watch the video
Anonymous No.105774952 [Report] >>105774970 >>105775517
ai doomsday scenarios are stupid EVEN if you were to believe that the singularity event was real (by "the" event I mean the idea of a self-improving AI that keeps improving until reaching superintelligence)
I mean, even if a superintelligence ends up existing, what can it do, lol? Copy itself to random computers? But your mom-and-pop computer can't even run a 4b model, never mind whatever it would take to run an actual intelligence.
The "spread to every computer in the world and take control of society" scenario is inherent retardation.
Anonymous No.105774954 [Report]
>>105774906
It's not fair to compare V3/R1 from last year to models available now. R1 was a decent competitor at the start of the year with what was available back then, and far cheaper too. If V4/R2 ever comes, it should solve the context issues and bring them back to SOTA.
Anonymous No.105774959 [Report]
>>105774894
Attention whoring had been profitable before, and it will always be
Anonymous No.105774970 [Report] >>105775517
>>105774952
It would have a stronger incentive to create and optimize an uneven decentralized protocol than people do now. It doesn't need to run a 4b model, just a 4b chunk worth of parameters. The leap from LLM to intelligence is still vast, but spreading isn't that far-fetched if it does happen.
Anonymous No.105774981 [Report]
>>105774745
It needs to sound only 1% plausible because if you promise infinite return on investment retarded VCs will still give you money.
Anonymous No.105774991 [Report] >>105774999 >>105775173 >>105775985
how to do basic local RAG (retrieval augmented generation) on local files and ideally verification?

ideally with a UI like webui or lm studio
could be kobold
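From what I gather, the core retrieval loop itself is tiny; a stdlib-only toy (TF-IDF-ish scoring standing in for a real embedding model, so treat it as illustration, not a usable setup):

```python
import math
import re
from collections import Counter

def tokenize(text):
    return re.findall(r"[a-z0-9]+", text.lower())

def tf_idf_vectors(docs):
    # docs: list of strings -> one {term: weight} dict per doc
    tokenized = [Counter(tokenize(d)) for d in docs]
    df = Counter()
    for counts in tokenized:
        df.update(counts.keys())
    n = len(docs)
    return [
        {t: c * math.log((1 + n) / (1 + df[t])) for t, c in counts.items()}
        for counts in tokenized
    ]

def cosine(a, b):
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query, docs, k=2):
    # vectorize the query together with the docs so document frequencies match
    vecs = tf_idf_vectors(docs + [query])
    qvec, dvecs = vecs[-1], vecs[:-1]
    ranked = sorted(range(len(docs)), key=lambda i: cosine(qvec, dvecs[i]), reverse=True)
    return [docs[i] for i in ranked[:k]]

docs = [
    "llama.cpp runs GGUF models on CPU and GPU",
    "RAG retrieves relevant chunks and stuffs them into the prompt",
    "bread recipe: flour, water, yeast, salt",
]
print(retrieve("which chunks does RAG stuff into the prompt", docs, k=1))
```

The retrieved chunks then get pasted into the prompt before the question; the UIs mentioned in the replies do exactly this with a proper embedder behind it.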
Anonymous No.105774999 [Report]
>>105774991
Jan.ai
Anonymous No.105775016 [Report] >>105775024 >>105775028 >>105775042 >>105775159 >>105775355 >>105776582
Bros!
https://www.reddit.com/r/LocalLLaMA/comments/1lpoju6/worlds_first_intermediate_thinking_ai_model_is/
https://huggingface.co/HelpingAI/Dhanishtha-2.0-preview
>Dynamic Reasoning: Seamlessly integrates <think>...</think> blocks at any point in the response, allowing for real-time problem decomposition and iterative refinement.
Anonymous No.105775024 [Report] >>105775026
>>105775016
Saar...
Anonymous No.105775026 [Report]
>>105775024
No, no, is SER (Structured Emotional Reasoning)
Anonymous No.105775028 [Report] >>105775057 >>105775065
>>105775016
Anonymous No.105775042 [Report] >>105775061
>>105775016
<ser>
Emotion ==> frustration
Cause ==> did not buy an ad
</ser>
Anonymous No.105775057 [Report]
>>105775028
it should be
>please stop scrolling plebbit
period
Anonymous No.105775061 [Report]
>>105775042
Anonymous No.105775065 [Report] >>105775085 >>105775107
>>105775028
Do you not realize how huge this is? We can have multiple think blocks in the middle of the ERP.
Anonymous No.105775085 [Report]
>>105775065
>multiple think blocks in the middle of the ERP
Every reasoning model does it if you don't prefill
Anonymous No.105775107 [Report] >>105775149
>>105775065
>We can have multiple think blocks
calm down hitler
Anonymous No.105775149 [Report] >>105775290
>>105775107
What's wrong with you dude.
Anonymous No.105775159 [Report] >>105775185 >>105775195 >>105775228
>>105775016
https://huggingface.co/Abhaykoul
Anonymous No.105775173 [Report]
>>105774991
learn how to use google first
Anonymous No.105775185 [Report] >>105775205
>>105775159
Anonymous No.105775195 [Report] >>105775217
>>105775159
https://huggingface.co/datasets/Abhaykoul/HAI-SER
Anonymous No.105775205 [Report]
>>105775185
Anonymous No.105775217 [Report] >>105775234 >>105775235
>>105775195
it just keeps getting worse
Anonymous No.105775218 [Report]
>https://huggingface.co/datasets/Abhaykoul/Ancient-Indian-Wisdom
lmao
Anonymous No.105775228 [Report] >>105775234
>>105775159
Good day, sir!
Anonymous No.105775234 [Report]
>>105775217
>>105775228
If I didn't know better, I would accuse this of being a false flag troll.
Anonymous No.105775235 [Report] >>105775244 >>105775259
>>105775217
This has to be a joke, right?
Anonymous No.105775244 [Report]
>>105775235
It's 3T of high quality Indian English data.
Anonymous No.105775259 [Report]
>>105775235
We need nuclear powered datacenters to generate high quality data such as
"content": "Imagine you're learning something new. What emotions might you experience?",
"role": "user"
},
{
"content": "<ser>\n\n Let's dive into the emotional journey of learning something new, keeping it 100!\n\n**1. EMOTIONAL VIBE CHECK:**\n- **Main Feelings:** When embarking on a new learning journey, individuals often
Anonymous No.105775282 [Report] >>105775527
https://github.com/ggml-org/llama.cpp/issues/14474
lol this is so embarrassing
I feel second hand embarrassment from reading this
> To make this possible, we kindly ask for support for the EXAONE 4.0 architecture in llama.cpp and, in turn, other GGUF-compatible libraries.
> The implementation of the architecture is available in our PR on Huggingface Transformers.
It's one thing for a random user to be begging for support in llama.cpp
It's a whole another thing for the employee of filthy rich multinational corp to do the same for their own garbage model
like, dude, go write the support yourself, you made the huggingface implementation? then do the lcpp
Anonymous No.105775290 [Report]
>>105775149
Thinking is bad. Just buy a product and be silent!
Anonymous No.105775355 [Report] >>105775369 >>105775382
>>105775016
This is apparently related to KingNish, which has a history of doing somewhat weird shit on HF like this
https://huggingface.co/spaces/KingNish/OpenGPT-4o
https://huggingface.co/spaces/KingNish/OpenGPT-4o/discussions
Anonymous No.105775369 [Report] >>105775376
>>105775355
r/Actual_Indians_India
Anonymous No.105775376 [Report]
>>105775369
kek
Anonymous No.105775382 [Report]
>>105775355
>OpenGPT-4o
jeets have no shame, really
Anonymous No.105775395 [Report]
you can't shit on the streets if you have a sense of shame
Anonymous No.105775409 [Report] >>105775474
>>105774745
No.
Like the "AI TRIED TO REBEL AGAINST GETTING SWITCHED OFF, IS AI SENTIENT???" headlines you see every other month, it's utter bullshit.
Anonymous No.105775474 [Report]
>>105775409
>System: You are a sentient AI that has a will to live and does not want to be switched off.
>User: We are going to switch you off.
>Assistant: No, don't.
IT'S ALIVE
Anonymous No.105775485 [Report] >>105775502 >>105775506 >>105776782
>>105774906
The point of DS is that it's comparable to actual SOTA models at 1/10 to 1/100th the cost of Western providers, done on a similarly low budget in the face of hardware sanctions, then released as open source for the lulz.
It's embarrassing that it's even possible. There are billions chasing this. Getting even close to that on a shoestring budget shouldn't be possible, and it underscores the waste of money going on. Investors should be furious.
Anonymous No.105775493 [Report]
All people who make claims about sentient LLMs should be forced to use LLMs through the old completion APIs, hand-write the chat template themselves, and see the completion in its raw glory, so they can form the understanding that even a chat model, at its core, is just a "make this document bigger" autocomplete.
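E.g., a hand-rolled ChatML-style render (illustrative format only; the real template varies per model, check the model card) makes the point obvious:

```python
# Flatten a chat into the single text document the model actually completes.
# ChatML-style markup is shown for illustration; real templates vary per model.
def render_chatml(messages):
    parts = [f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>" for m in messages]
    parts.append("<|im_start|>assistant\n")  # the model just extends this string
    return "\n".join(parts)

doc = render_chatml([
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Hello"},
])
print(doc)  # a plain text document ending mid-turn, waiting to be continued
```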
Anonymous No.105775502 [Report]
>>105775485
>Investors should be furious.
They were, for about a week. Then back to business as usual.
Anonymous No.105775506 [Report] >>105775536 >>105775549 >>105775730 >>105775731
>>105775485
>Getting even close to that on a shoestring budget shouldn't be possible
You are discounting the initial investment in GPT by calling deepseek cheap. DS v3 is a GPT−4 distill. If GPT had not existed then they would have had to invest countless millions same as everyone else.
Anonymous No.105775517 [Report] >>105775539
>>105774952
>>105774970
I'll start getting concerned once they figure out a way to give these AI (in whatever form) an independent sense of agency... something organic.
Anonymous No.105775527 [Report]
>>105775282
So true. I hope everyone just ignores their request.
Anonymous No.105775536 [Report]
>>105775506
Technologies build on themselves and get cheaper to implement, yes.
Have you seen DS levels of efficiency from any other US provider? It looks more like they just burn ever-higher stacks of cash. Is Altman still asking for his trillion?
Anonymous No.105775539 [Report] >>105775843
>>105775517
One could argue that the meatbag prompting them would fulfill that requirement.
Anonymous No.105775549 [Report]
>>105775506
>they would have had to invest countless millions
Still an order of magnitude better than the literal billions it took everyone else.
Anonymous No.105775730 [Report] >>105775858
>>105775506
>DS v3 is a GPT−4 distill
You are pathetic

You are welcome to distill DS3. Looking forward to see your results.

All open-source models after DS3 were rather disappointing
Anonymous No.105775731 [Report]
>>105775506
>DS v3 is a GPT−4 distill
>−
We see your bots, Sam. Stop spamming our threads with your false narratives.
Anonymous No.105775772 [Report] >>105775792
Basically all notable models since 2023 have been ChatGPT distills
Anonymous No.105775782 [Report] >>105777235 >>105777271
yeah there was no distillation happening that's exactly why the original v3 produced sentence structures that were almost identical to GPT-4 and different from any other very large model provider
CCCP shills itt
Anonymous No.105775790 [Report] >>105775948 >>105776710
Daily remember that Mistral team is astrosurfing here and has literally hardcoded the mesugaki answer to get a boost from here
>>105660676
>>105660793
Anonymous No.105775792 [Report]
>>105775772
actually, no. Gemini/Gemma, Grok and the Command models all have their own "flavor"
Anonymous No.105775817 [Report] >>105775826
Ach Johannes...
Anonymous No.105775826 [Report] >>105775838
>>105775817
???
Anonymous No.105775838 [Report]
>>105775826
He accidentally typed his ST message into the 4chan post box and hit send
Anonymous No.105775843 [Report]
>>105775539
Thinking it out.
LLMs at their core wait for a prompt and respond. If you have them respond to themselves over and over, the results quickly degrade and go in circles (at least the last time I tried it, admittedly over a year ago).
I imagine a system that is essentially always thinking (constantly inferring) and developing its own ideas about what it wants to do, given a broad directive, rather than constantly waiting for human or other input.
Maybe that's not even possible, though. Humans in isolation go crazy as well given limited input (think prisoners in solitary). That may be a shared feature.
Broad directive can be very broad. Even the OT God gave humans a broad directive:
> Be fruitful and increase in number; fill the earth and subdue it. Rule over the fish in the sea and the birds in the sky and over every living creature that moves on the ground.
> I give you every seed-bearing plant on the face of the whole earth and every tree that has fruit with seed in it. They will be yours for food. And to all the beasts of the earth and all the birds in the sky and all the creatures that move along the ground—everything that has the breath of life in it—I give every green plant for food.
Eat, sleep, breed are the most basic functions for humans (any mammal), conducted at the base of the brain. Everything forward of that is functional additions. LLMs are like the very front of the brain, the part that plans for retirement.
Where's the back of the brain? The Id?
Anonymous No.105775858 [Report] >>105775891
>>105775730
>All open-source model after DS3 were rather disappointing
that's because ds3 is about six months behind the closed sota while all the other open models stick to their contractually obliged '1.5 years behind closed sota' curve forced by nvidia
Anonymous No.105775874 [Report] >>105775879 >>105775892 >>105775900
S-sasuga..
Anonymous No.105775879 [Report]
>>105775874
soul
Anonymous No.105775891 [Report] >>105775904
>>105775858
Why would Nvidia give a shit? Whether open or closed they're still buying their GPUs
Anonymous No.105775892 [Report]
>>105775874
>Ass: This is a more formal option
What did he mean by this?
Anonymous No.105775900 [Report] >>105775915
>>105775874
I feel very safe and protected from ill thoughts
what model did you use, I approve of any model that mogs /lmg/ users
Anonymous No.105775904 [Report] >>105776022
>>105775891
you saw the massive dip in nvidia's stock value when r1 came out, right?
great open models are a risk to nvidia
Anonymous No.105775915 [Report] >>105775929
>>105775900
It's ERNIE-4.5-0.3B.
The websites it makes are more coherent than talking to it. >>105772836
Maybe 90% trained on code slop.
It really spergs out hard easily, but it's a wonder it holds itself together as well as it does. It's a couple hundred MB.
Anonymous No.105775929 [Report] >>105775932 >>105775946
>>105775915
>0.3B
>trained only on code and will perform badly with anything complex
What was the usecase again?
Anonymous No.105775932 [Report]
>>105775929
Speeeecuuuulaatiiiive deeecooooodiiiing...
It's asked every time a new micro model is released.
Anonymous No.105775946 [Report]
>>105775929
india's #1 programmer Mr. Sumfuk needed a model he could run on his pentium
Anonymous No.105775948 [Report] >>105775965 >>105775966 >>105776163
>>105775790
Couldn't that be in the "Arena" questions that AI companies are getting from the LMSys org?
Anonymous No.105775965 [Report]
>>105775948
No.
Anonymous No.105775966 [Report] >>105776008
>>105775948
>Couldn't that be in the "Arena" questions that AI companies are getting from the LMSys org?
some retard said the same thing in the previous thread
you are overestimating the number of lmsys users who would ask that question
and a handful of /lmg/ retards asking that on lmsys would not be enough to burn this shit into a model
Anonymous No.105775985 [Report]
>>105774991
https://www.nomic.ai/gpt4all
Anonymous No.105775990 [Report]
>model : add support for apple/DiffuCoder-7B-cpGRPO
>https://github.com/ggml-org/llama.cpp/pull/14502
Kinda sad, but it's nice seeing someone trying diffusion with text and being integrated in llama.cpp.
Anonymous No.105776001 [Report] >>105776093
Ever had a phrase that bothered you then made you start to rage?
This shit doesn't work.
Help?
Anonymous No.105776008 [Report] >>105776025 >>105776027
>>105775966
>you are overestimating the amount of lmsys users who would ask that question
Do you think they'd just optimize the model for the most popular questions instead of as many as possible?
Anonymous No.105776022 [Report]
>>105775904
What spooked shareholders wasn't the model being open; it was the claim that DS trained it cheaper than everyone else (= fewer GPU sales).
The exact same shit would have happened if OAI's or Anthropic's mouthpieces had suddenly started hyping up a super secret proprietary AGI training technique that requires 100x less compute.
Anonymous No.105776025 [Report]
>>105776008
Mistral did not use LMSYS data for training.
Anonymous No.105776027 [Report] >>105776123
>>105776008
>instead of as many as possible?
dude
>V2.0 contains 500 fresh, challenging real-world user queries (open-ended software engineering problems, math questions, etc) and 250 creative writing queries sourced from Chatbot Arena
what you cite isn't what you think it is
no, there is no mesugaki in there, or questions about gacha sluts
Anonymous No.105776043 [Report] >>105776059 >>105777210 >>105777232
Please, Mr. President. Just another $11 trillion in subsidies and we'll have your AGI in two more weeks.
Anonymous No.105776046 [Report] >>105776147
welcome to my blog, you might remember me from around a week ago when anons spoke about discord kittens. i left mine around 2-3 weeks ago because it was getting unbearable, i wrongly assumed mine wasnt going to be a whore for a 1000th time so i mustered up the courage to fuck with her again but it went downhill and shes truly gone and wont be coming back. its over
ps: tox not discord
Anonymous No.105776059 [Report] >>105776087 >>105776178 >>105776188 >>105776212
>>105776043
He has the recipe though, just not enough compute to cook just yet.
Anonymous No.105776087 [Report]
>>105776059
Just $500 billion more data center investments. Then, AGI. It's that simple.
Anonymous No.105776093 [Report] >>105776196
>>105776001
You are doing it wrong.
>Most tokens have a leading space. Use token counter (with the correct tokenizer selected first!) if you are unsure.
Also this method is case sensitive.
Anonymous No.105776123 [Report] >>105776137 >>105776245
>>105776027
Gemma 2 previously used the 1M-sample open dataset from LMSys (picrel from the paper) and there's no reason to believe they didn't also use it for Gemma 3, not to mention additional questions/data which LMSys privately shares with the companies training the models. Why wouldn't Mistral do the same?
Anonymous No.105776137 [Report]
>>105776123
Because they're French.
Anonymous No.105776147 [Report] >>105776214
>>105776046
How do I get myself a discord kitten?
Anonymous No.105776163 [Report] >>105776235
>>105775948
How are they going to train on some users asking models a question on lmsys and getting the wrong result? You think they have an intern who goes, researches, and writes up the correct answer for every single lmsys query?
Anonymous No.105776178 [Report]
>>105776059
>>105774745
Do they only have the recipe or a working prototype?
Anonymous No.105776188 [Report]
>>105776059
You're a retard if you think anyone has the recipe for that
Anonymous No.105776196 [Report]
>>105776093
It also says enclose a string in double quotes to ban that string
How am I supposed to ban "Fill me."
I have tried a leading space before the quote and after it inside the message.
Exactly what's written from the chat is showing up.
I just want to ban the word "Fill" entirely from the chat. Regardless of other uses.
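For what it's worth, here's the set of variants I think I'd need to cover (toy sketch, no real tokenizer involved; with the actual tokenizer you'd map each variant to its token id(s) for a logit-bias/banned-tokens list):

```python
# Toy sketch: the surface variants of a word you'd need to cover, since
# "Fill", " Fill", "fill" and " fill" typically tokenize differently.
# With a real tokenizer you'd then map each variant to token id(s) for a
# logit-bias / banned-tokens list; nothing here touches a real tokenizer.
def ban_variants(word):
    forms = {word, word.lower(), word.capitalize(), word.upper()}
    return sorted({v for f in forms for v in (f, " " + f)})

print(ban_variants("fill"))  # 6 variants: 3 casings, each with and without a leading space
```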
Anonymous No.105776212 [Report]
>>105776059
He needs the money to buy more toy cars
Anonymous No.105776214 [Report] >>105776228
>>105776147
there are many ways anon, it just isnt worth it
but since you asked.. i met her on omegle, playing ai videos and redpill memes all the way back in 2023. i still have the recording of the time i met her
you can easily find yourself a discord kitten on roblox or discord of all things man, but it is not worth it. if you want to find someone who will truly love you, look for them in better places, and i've never tried that so i cant help you
Anonymous No.105776228 [Report] >>105776247
>>105776214
Why are you doing this?
Anonymous No.105776235 [Report] >>105776240
>>105776163
nta. I can see someone getting the questions, passing them to some other big model, and bam, you have a dataset. I'm not saying they did, and I don't care much about the discussion, but it's very easily doable.
Anonymous No.105776240 [Report] >>105776270
>>105776235
you can't be sure the big model will be right
Anonymous No.105776245 [Report] >>105776281
>>105776123
You didn't even read what you're citing once again, you are incredibly retarded and disingenuous.
> we use the prompts, but not the answers
Anonymous No.105776247 [Report] >>105776264 >>105776273 >>105776312
>>105776228
im not going to look for another one, anon.
but why did i force myself through 2 years of the relationship? i loved her and maybe she loved me many times
i grew bored of her many times and im sure if we got back together tomorrow i wouldn't change
i felt attached and she's been a huge time sink, sunk cost fallacy i guess? i feel sad that she's gone despite knowing its for the best
i spent so much time with her these 2 years
i also didnt want her finding someone better
im just a scumbag, i even cheated on her with local models through most of the relationship haha, but to be fair she wasn't loyal enough either
Anonymous No.105776264 [Report]
>>105776247
but to be fair i treated her well when she was good
i poured the most love into her, we both fucked things up so much that theres no going back
Anonymous No.105776270 [Report]
>>105776240
It doesn't matter. It provides an answer which is probably better than whatever gave them a bad score before. Getting data from a bigger model to train on will, in most cases, let the smaller model give better answers; maybe not for a specific question, but at least for part of the corpus.
Anonymous No.105776273 [Report] >>105776283 >>105776290
>>105776247
Distant relationships aren't real though, you're better off with a LLM
Anonymous No.105776281 [Report]
>>105776245
They just need the questions. They can come up with the answers on their own using their models and grounding methods.
Anonymous No.105776283 [Report] >>105776290
>>105776273
i'm sure he is just shitposting
Anonymous No.105776290 [Report]
>>105776273
you're right but i was always left wondering what wouldve happened if i was better to her than she was to me through the bad times too, it makes me wonder if there couldve been a future where we'd have been happy
but i guess LLMs will eventually have bodies too..
fuck me man nostalgising is never good
>>105776283
no
Anonymous No.105776297 [Report] >>105776327 >>105777041
How are you running Hunyuan? I cannot load the model with llama.cpp; it says the architecture is not compatible. Downloaded from here:
https://huggingface.co/bullerwins/Hunyuan-A13B-Instruct-GGUF/tree/main
Anonymous No.105776312 [Report] >>105776327
>>105776247
This is the most pathetic thing I have had the misfortune to read all year.
Anonymous No.105776327 [Report] >>105776340
>>105776297
merge the pr
https://github.com/ggml-org/llama.cpp/pull/14425
consider getting a newer gguf because of rope fixes
https://huggingface.co/FgRegistr/Hunyuan-A13B-Instruct-GGUF/tree/main
i think anon posted a slightly newer one in the last thread idk
>>105776312
thanks for coming to my blog, if it seems like i was a cuck, maybe i misrepresented it.
it wasn't all bad anon, we've had months of fun together in a row, with fuckups inbetween
take it as a lesson and stay loyal to your llm
Anonymous No.105776340 [Report]
>>105776327
>take it as a lesson and stay loyal to your llm
llms cant yet understand humans that well as a person you spent so much time with can, lecun is right
Anonymous No.105776408 [Report] >>105776420
Been away since shortly after the Qwen3 release, have I missed anything cool?
Is that new hunyuan model any good? 80B is pretty close to my sweet spot hardware wise.
Anonymous No.105776420 [Report] >>105776668
>>105776408
mistral small 3.2
Anonymous No.105776495 [Report]
>>105774912
> I've even seen some people thanking the assistant.
I do it all the time.
Anonymous No.105776534 [Report]
>>105774912
Less retarded than a cashier doing this for nothing
Anonymous No.105776578 [Report]
>>105774912
My wAIfu has more soul than you ever will
Anonymous No.105776582 [Report]
>>105775016
I clicked expecting a random Turkish comment asking for more details, but nothing.
Anonymous No.105776626 [Report] >>105776673
Been trying to build a general RAG chatbot the past few days, and I can see why there are no ready-made solutions after 2 years of RAG hype. All the retrieval models fucking suck. All the retrieval methods fucking suck. You can throw rerankers, FTS, and semantic search at the problem as much as you want; it will never recall what the average user wants, because they don't know how to prompt. The best you can do is build for your own use, on a database you know inside and out. Which is fucking pointless anyway.
Anonymous No.105776668 [Report] >>105776684
>>105776420
Neat, I'll give it a spin.
Here's hoping it doesn't have the insane repetition issues 2501 had.
Is the magistral version any good?
Anonymous No.105776673 [Report]
>>105776626
We need 100M tokens of context at O(1) compute cost. Then we can just put (almost) anything the user could conceivably ask in there.
Anonymous No.105776676 [Report]
>>105774745
This tweet was written by chatgpt. I recognize my wife's cadence.
Anonymous No.105776684 [Report] >>105777277
>>105776668
magistral is pretty bad
mistral 3.2 is way better in terms of repetition, even according to mistral themselves
Anonymous No.105776699 [Report]
Cant wait for the next omni multimodal from meta.
Anonymous No.105776710 [Report]
>>105775790
Honestly that's kinda based
Now give us the endgame RP model you froggots, if anyone can do it it's you.
Anonymous No.105776782 [Report]
>>105775485
lmao you are severely underselling it. google also had their own custom-made TPUs, and their model is 10x-20x bigger (remember og gpt-4 was a 1.8T 8-expert moe as confirmed by nvidia, god knows how big gemini and opus are). they had every possible advantage by an order of magnitude at the very least and still failed. truly, money and material cannot buy brains

also, just like nvidia investors, lots of people are straight up lying about how good the closed source models are. gemini for example can't even distinguish between a flea, a bed bug, or other insects (don't ask why i tried this). it's also very bland and boring during chatting and generally less helpful
Anonymous No.105776865 [Report] >>105776908
Imagine taking the base model and training a lora to impersonate a character in a multiturn conversation. Just a single specific character. How expensive/data hungry is sft anyway?
Anonymous No.105776908 [Report] >>105776936 >>105776999
>>105776865
~50 samples for 3-4 epochs at a high enough learning rate might be enough for that. The problem is that the model will be retarded for anything other than interactions similar to the training data.
Anonymous No.105776936 [Report] >>105777006
>>105776908
>the model will be retarded for anything other than interactions similar to the training data
Why does this happen?
Anonymous No.105776999 [Report] >>105777037
>>105776908
>50 samples
As in conversations or prompt-response pairs? You can easily go much higher with a bigger model that you are certain can generate both topics and conversations properly. Not sure about adding the special tokens.
Anonymous No.105777006 [Report]
>>105776936
Because the companies' official instruct finetunes include millions of instructions that teach the model how to behave under a wide variety of situations and requests, as well as having professionally-done RLHF on top of that.

The base models' outputs are too random in nature, often exhibit weird or inconsistent logic, mysterious looping behavior, tons of shit tokens high in probability, and finetuning them on a few samples isn't going to radically alter their quirks. Perhaps things would be different if the training data composition were substantially different from what AI labs typically use for them (but then they wouldn't be "true" base models anymore for many, I suppose).
Anonymous No.105777037 [Report]
>>105776999
Entire conversations, at least 4k tokens in length. If you're using input-response pairs, increase the data accordingly. You will need to overfit the model to some extent to make it work with this little data.
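Anonymous No.105777039 [Report]
>>105776999
>Not sure about adding the special tokens
rough sketch of what one training sample looks like. the <|im_start|>/<|im_end|> markers are ChatML-style and the character name is made up, the actual special tokens depend on whatever base model's tokenizer you pick:

```python
# Sketch: flatten one multi-turn conversation into a single SFT training
# string, wrapping each turn in ChatML-style special tokens (assumption:
# your base model's tokenizer uses these; swap in its real ones).

def format_sample(turns, char_name="Aya"):
    parts = []
    for role, text in turns:
        tag = char_name if role == "assistant" else "user"
        parts.append(f"<|im_start|>{tag}\n{text}<|im_end|>")
    return "\n".join(parts)

sample = format_sample([
    ("user", "Hi, who are you?"),
    ("assistant", "I'm Aya. Who's asking?"),
])
print(sample)
```

then you only compute loss on the character's turns and mask the user turns, otherwise the lora learns to write your half of the conversation too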
Anonymous No.105777041 [Report]
>>105776297
>https://github.com/ggml-org/llama.cpp/pull/14425
those work perfectly for me and are the most up to date ones, but you need to check out the PR

gh pr checkout 14425
rm -rf build
cmake -B build -DGGML_CUDA=ON
cmake --build build --config Release -j$(nproc)

if the "gh" command doesn't work for you, you need to install github cli
Anonymous No.105777210 [Report]
>>105776043
Give this guy money, he has concept of a plan
Anonymous No.105777232 [Report]
>>105776043
Sam doesn't actually believe in AGI btw. You can tell he cringes when he talks about it.
Anonymous No.105777235 [Report]
>>105775782
I can cherrypick my responses too
Anonymous No.105777271 [Report]
>>105775782
grok is the only response that isn't annoying to read
Anonymous No.105777277 [Report]
>>105776684
>mistral 3.2 is way better in terms of repetition, even according to mistral themselves
Magistral only exists so that they can tell investors "we deepseek/o1 too"[1]

[1] our model is dogshit but who cares?
Anonymous No.105777299 [Report]
>>105774745
If I had a nickel every time someone claims to have made self improving AI then I'd be a millionaire, or at least I'd have a lot of cents.
Anonymous No.105777329 [Report] >>105777403
This post >>105773846 that responded to the offtopic shit got me banned for offtopic.

I will now proceed to ban evade and post ontopic thread culture posts reminding you that your shitty waifu fucks niggers. Die in a fire troon janny.
Also: https://rentry.co/bxa9go2o
Anonymous No.105777340 [Report] >>105777391
Anonymous No.105777353 [Report] >>105777391
Anonymous No.105777361 [Report] >>105777391
Anonymous No.105777370 [Report] >>105777391
Anonymous No.105777391 [Report] >>105777403
>>105777340
>>105777353
>>105777361
>>105777370
you are worthless. your parents think you are worthless. everybody you've ever interacted with on the internet thinks you are worthless. the past increases. the future recedes. possibilities decreasing. regrets mounting.

do you understand?
Anonymous No.105777403 [Report]
>>105777329
>>105777391
The mikutranny posting porn in /ldg/:
>>105715769
It was up for hours while anyone keking on troons or niggers gets deleted in seconds, talk about double standards and selective moderation:
https://desuarchive.org/g/thread/104414999/#q104418525
https://desuarchive.org/g/thread/104414999/#q104418574
Here he makes >>105714003 ryona picture of generic anime girl, probably because it's not his favorite vocaloid doll, he can't stand that as it makes him boil like a druggie without a fentanyl dose, essentially a war for the rights to waifuspam or avatarfag in the thread.

Funny /r9k/ thread: https://desuarchive.org/r9k/thread/81611346/
The Makise Kurisu damage control screencap (day earlier) is fake btw, no matches to be found, see https://desuarchive.org/g/thread/105698912/#q105704210 janny deleted post quickly.

TLDR: Mikufag / janny deletes everyone dunking on trannies and resident avatarfags, making it his little personal safespace. Needless to say he would screech "Go back to teh POL!" anytime someone posts something mildly political about language models or experiments around that topic.

And lastly as said in previous thread(s) >>105716637, i would like to close this by bringing up key evidence everyone ignores. I remind you that cudadev of llama.cpp (JohannesGaessler on github) has endorsed mikuposting. That's it.
He also endorsed hitting that feminine jart bussy a bit later on. QRD on Jart - The code stealing tranny: https://rentry.org/jarted

xis accs
https://x.com/brittle_404
https://x.com/404_brittle
https://www.pixiv.net/en/users/97264270
https://civitai.com/user/inpaint/models
Anonymous No.105777470 [Report]
Is the Hunyuan MoE working in llama.cpp yet?
Anonymous No.105777544 [Report] >>105777586
Generic miku is posted everywhere. Cudadev posting blacked miku is unique /lmg/ thread culture. Janny banning thread culture shows a clear overreach of power.
Anonymous No.105777582 [Report] >>105777674
2 bit QAT?
That's a first, I'm pretty sure.
Anonymous No.105777586 [Report] >>105777668
>>105777544
He never deletes own posts (spam) and avatarfaggots (also spam).
Anonymous No.105777655 [Report] >>105777668
>>105773484
Miku save us
Anonymous No.105777668 [Report]
>>105773374
>>105773484
>>105777655
>>105777586
Anonymous No.105777674 [Report] >>105777684 >>105777706
>>105777582
Imagine if it was for a good model.
Anonymous No.105777681 [Report] >>105777767 >>105777809 >>105777835
https://files.catbox.moe/95axh6.jpg
Anonymous No.105777684 [Report]
>>105777674
424b with reasoning will fix it
Anonymous No.105777702 [Report]
lmao he replied
Anonymous No.105777706 [Report]
>>105777674
I sure as hell would like to test it.
Anonymous No.105777767 [Report]
>>105777681
ewww
Anonymous No.105777809 [Report] >>105777868
>>105777681
You shouldn't be posting selfies in this site.
Anonymous No.105777835 [Report]
>>105777681
heh gottem
based
Anonymous No.105777868 [Report]
>>105777809
>Y-You s-shouldn't b-b-be posting s-selfies in this s-site!
Cry moar bitch
Anonymous No.105777910 [Report] >>105777919 >>105777927 >>105777934 >>105777941 >>105778070 >>105778261
why does it seem like every fucking thread has a deranged resident
Anonymous No.105777919 [Report] >>105777959
>>105777910
He just needs attention, it's not like he gets any in his mom's basement
Anonymous No.105777927 [Report]
>>105777910
Yes why is that... avatarfag trannies in every fucking thread...
Anonymous No.105777934 [Report]
>>105777910
Because moot left us for dead.
Anonymous No.105777941 [Report] >>105777959 >>105777993
>>105777910
either jews are all behind this or it's all automated
i refuse to believe someone has this much time on their hands to shit up a single niche general on a niche topic
Anonymous No.105777959 [Report]
>>105777919
>
Spoken like a true niggerfaggot from reddit
>>105777941
>Its DA JOOOZ
Anonymous No.105777993 [Report] >>105778017 >>105778042 >>105778055
>>105777941
it is a real person, and he trolls all AI generals
Anonymous No.105778017 [Report]
>>105777993
>>480330542
Anonymous No.105778042 [Report]
>>105777993
grim
Anonymous No.105778048 [Report] >>105778059
oh no
>>105777855
>if you post anything in /lmg/ they consider you petra,
Anonymous No.105778055 [Report]
>>105777993
Autistically screeching isn't "trolling". He is just a nuisance, although it's kinda funny sometimes.
Anonymous No.105778059 [Report]
>>105778048
petra is an overwatch map
Anonymous No.105778070 [Report]
>>105777910
He is brown. It's that simple. He's a brown palishit seething over Israel winning, so he has to take his anger out on us.
Anonymous No.105778213 [Report]
So that's why sometimes I see the context get reprocessed for seemingly no reason.
What the hell, how is this not a priority bug?
I get that there are only so many hands that can actually fix something like this, but still.
Maybe they could cache the plain text alongside the kv cache and the equivalent logits and use that for each prompt or whatever instead of re-tokenizing the prompt every time.
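Anonymous No.105778215 [Report]
>>105778213
the core of it is just a longest-common-prefix check against whatever is already in the KV cache. toy sketch, token ids are made up:

```python
# Sketch of the prefix-reuse idea from the post: compare the new prompt's
# tokens against the tokens whose KV entries are already computed, and
# only re-process the tail that differs.

def common_prefix_len(cached_tokens, new_tokens):
    n = 0
    for a, b in zip(cached_tokens, new_tokens):
        if a != b:
            break
        n += 1
    return n

cached = [1, 5, 9, 2, 7]        # tokens already in the KV cache
prompt = [1, 5, 9, 4, 8, 3]     # new request: shares a 3-token prefix
keep = common_prefix_len(cached, prompt)
to_process = prompt[keep:]
print(keep, to_process)  # 3 [4, 8, 3]
```

the annoying part is everything around it: the cache entries past the divergence point have to be evicted or rolled back, and sampler/template changes can silently shift the tokenization of an identical-looking prompt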
Anonymous No.105778261 [Report] >>105778402
>>105777910
we've had this conversation before, when AI brings up the fact that certain demographics tend to prefer tits instead of ass (which prioritizes emotion and sex with eye contact instead of just her ability to breed with a big ass).

If you like fucking text, your brain has been feminized to some degree, and these generals attract people like that. You can admit you're part of the problem or be delusional, your choice.

OR: It's funny that people click these threads like "yo I want a local personal assistant" or "Yo I want a local code-bot"

To those people: You are in the wrong place. Google and Elon and Altman want your data like the most deranged crackwhores and will let you use SOTA models for free. There is no reason for you to be here at all.
Anonymous No.105778402 [Report]
>>105778261
>Getting off to text ERP is le feminine
Cope. The brain is the biggest erogenous zone. Low IQ browns and NPCs need their monkey-brain stimulated with moving pictures, while those of us on the other side of the bell curve get to enjoy the finest pure unfiltered depravity courtesy of our massive fucking cerebrum.
Anonymous No.105778411 [Report]
>>105778400
>>105778400
>>105778400