/lmg/ - Local Models General - /g/ (#105769835) [Archived: 610 hours ago]

Anonymous
7/1/2025, 10:22:22 PM No.105769835
ff14d04e1_cleanup
md5: 8b964eaf378b25b2b8203771027caf8b
/lmg/ - a general dedicated to the discussion and development of local language models.

Previous threads: >>105757131 & >>105750356

►News
>(07/01) Huawei Pangu Pro 72B-A16B released: https://gitcode.com/ascend-tribe/pangu-pro-moe-model
>(06/29) ERNIE 4.5 released: https://ernie.baidu.com/blog/posts/ernie4.5
>(06/27) VSCode Copilot Chat is now open source: https://github.com/microsoft/vscode-copilot-chat
>(06/27) Hunyuan-A13B released: https://hf.co/tencent/Hunyuan-A13B-Instruct
>(06/26) Gemma 3n released: https://developers.googleblog.com/en/introducing-gemma-3n-developer-guide

►News Archive: https://rentry.org/lmg-news-archive
►Glossary: https://rentry.org/lmg-glossary
►Links: https://rentry.org/LocalModelsLinks
►Official /lmg/ card: https://files.catbox.moe/cbclyf.png

►Getting Started
https://rentry.org/lmg-lazy-getting-started-guide
https://rentry.org/lmg-build-guides
https://rentry.org/IsolatedLinuxWebService
https://rentry.org/tldrhowtoquant
https://rentry.org/samplers

►Further Learning
https://rentry.org/machine-learning-roadmap
https://rentry.org/llm-training
https://rentry.org/LocalModelsPapers

►Benchmarks
LiveBench: https://livebench.ai
Programming: https://livecodebench.github.io/leaderboard.html
Code Editing: https://aider.chat/docs/leaderboards
Context Length: https://github.com/adobe-research/NoLiMa
Censorbench: https://codeberg.org/jts2323/censorbench
GPUs: https://github.com/XiongjieDai/GPU-Benchmarks-on-LLM-Inference

►Tools
Alpha Calculator: https://desmos.com/calculator/ffngla98yc
GGUF VRAM Calculator: https://hf.co/spaces/NyxKrage/LLM-Model-VRAM-Calculator
Sampler Visualizer: https://artefact2.github.io/llm-sampling

►Text Gen. UI, Inference Engines
https://github.com/lmg-anon/mikupad
https://github.com/oobabooga/text-generation-webui
https://github.com/LostRuins/koboldcpp
https://github.com/ggerganov/llama.cpp
https://github.com/theroyallab/tabbyAPI
https://github.com/vllm-project/vllm
Replies: >>105772362 >>105773374
Anonymous
7/1/2025, 10:22:48 PM No.105769843
tetrecap2
md5: 68dbdb085a8472c4b34be6cc8b56826c
►Recent Highlights from the Previous Thread: >>105757131

--Paper: Libra: Synergizing CUDA and Tensor Cores for High-Performance Sparse Matrix Multiplication:
>105761808 >105761966 >105762753 >105763009
--Paper: Pangu Pro MoE: Mixture of Grouped Experts for Efficient Sparsity:
>105768845 >105768877 >105768901 >105768884 >105768906 >105768933 >105769034
--Meta's AI talent acquisition and open model skepticism amidst legal and data curation challenges:
>105758293 >105758397 >105758388 >105758810 >105758467 >105758482 >105766325 >105758818 >105758901 >105758926 >105758942
--Hunyuan-A13B GGUF port requires custom llama.cpp build for flash attention support:
>105768115 >105768164 >105768455
--Frustration over delayed OpenAI model and skepticism toward benchmarks and strategy:
>105766029 >105766042 >105768619 >105768677 >105768693 >105768837 >105768876 >105769053 >105768798 >105768934
--Critique of Hunyuan and Ernie models for over-reliance on Mills & Boon-style erotic prose in outputs:
>105758427 >105758629 >105758645 >105758674 >105764901 >105765054 >105765118 >105765228 >105765275 >105765472 >105765503 >105765747 >105767085 >105767501 >105766545 >105766794 >105768886 >105758694
--NVIDIA's Mistral-Nemotron open reasoning model sparks confusion and skepticism among anons:
>105766864 >105766975 >105767094 >105767167
--Discussion on NVIDIA ending driver support for older Pascal, Maxwell, and Volta GPUs:
>105764483 >105764512 >105766267
--Fish Audio S1 Mini and 4B text-to-speech model voice cloning results shared:
>105760876 >105760929
--Official OpenAI podcast episode discussing ChatGPT and AI assistant development:
>105766509
--Hunyuan A13B IQ4 chat completion issues on llama.cpp?? frustration:
>105760696 >105760773
--Meta court win legitimizes fair use for LLM training in the U.S.:
>105766199
--Miku (free space):
>105765500 >105766204

►Recent Highlight Posts from the Previous Thread: >>105757140

Why?: 9 reply limit >>102478518
Fix: https://rentry.org/lmg-recap-script
Replies: >>105769864 >>105769926 >>105770731
Anonymous
7/1/2025, 10:24:55 PM No.105769864
>>105769843
>--Hunyuan A13B IQ4 chat completion issues on llama.cpp?? frustration:
Turn the temp down holy fuck
Anonymous
7/1/2025, 10:28:15 PM No.105769897
temperature was a mistake
Replies: >>105769907 >>105769965
Anonymous
7/1/2025, 10:29:25 PM No.105769907
>>105769897
Yeah it's way too fucking hot right now.
Anonymous
7/1/2025, 10:31:14 PM No.105769926
>>105769843
Thank you Recap Teto
Anonymous
7/1/2025, 10:33:22 PM No.105769948
70Bros what's fotm slop finetune?
Replies: >>105770572
Anonymous
7/1/2025, 10:35:34 PM No.105769965
>>105769897
Temp + Top-P is all you need.
Replies: >>105770012
Anonymous
7/1/2025, 10:40:01 PM No.105770012
>>105769965
Temperature: 0.6
Top K: 100
Top P: 0.95
Top nsigma: 1

can't believe there are still people using other LLMs when R1 is this simple to set up.
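On llama.cpp those are just sampler flags, something like this (untested sketch, model filename is a placeholder, and the nsigma flag spelling may differ by build, check llama-server --help):
./llama-server -m DeepSeek-R1-0528-Q4_K_M.gguf --temp 0.6 --top-k 100 --top-p 0.95 --top-nsigma 1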
Anonymous
7/1/2025, 10:43:23 PM No.105770034
hi what's the best model for 8gb vram these days
i'm primarily interested in things like tool use and agentic behavior
Replies: >>105770065 >>105770072 >>105772262
Anonymous
7/1/2025, 10:44:22 PM No.105770041
Bait
md5: f7ec0c52fe6e81727e07fa6133298011
Replies: >>105770068
Anonymous
7/1/2025, 10:46:05 PM No.105770065
>>105770034
Qwen MoE probably. The 30B.
Replies: >>105770068
Anonymous
7/1/2025, 10:46:32 PM No.105770068
>>105770041
not bait :(
>>105770065
thanks, how many t/s should i expect?
Replies: >>105770076 >>105770125
Anonymous
7/1/2025, 10:47:13 PM No.105770072
>>105770034
Nemo
Replies: >>105770186
Anonymous
7/1/2025, 10:47:43 PM No.105770076
>>105770068
Will mostly depend on your RAM, since you want to offload the expert tensors to the CPU backend using the --override-tensor (-ot) parameter.
I'd say between 10 and 15 t/s?
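Untested sketch of what that looks like (model filename is a placeholder, use whatever quant you grab):
./llama-server -m Qwen3-30B-A3B-Q4_K_M.gguf -ngl 99 -ot exps=CPU -c 8192 -fa
-ngl 99 loads every layer on the GPU first, then -ot exps=CPU kicks the routed expert tensors back out to system RAM, so the GPU keeps the attention and shared weights.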
Replies: >>105770097
Anonymous
7/1/2025, 10:49:48 PM No.105770097
>>105770076
16gb of ram unfortunately, haven't gotten around to upgrading it yet
Replies: >>105770144
Anonymous
7/1/2025, 10:51:56 PM No.105770125
>>105770068
>not bait :(
You know 8gb is not enough right?
Replies: >>105770186
Anonymous
7/1/2025, 10:53:46 PM No.105770144
>>105770097
Well, shit.
RIP I guess.
Try the q4ks quant with low topk and pray for the best I suppose.
Replies: >>105770186
Anonymous
7/1/2025, 10:58:32 PM No.105770186
>>105770072
will keep that in mind
>>105770125
i didn't purpose build this pc for running llms, it's just a gaming pc that i'm hoping to repurpose
>>105770144
thanks, will do
last model i used was mistral-7b and it was honestly not up to snuff
Replies: >>105770216
Anonymous
7/1/2025, 11:01:41 PM No.105770216
>>105770186
what specific graphics card do you have, and also how much regular RAM do you have? CPUmaxxing might be an option
Replies: >>105770255
Anonymous
7/1/2025, 11:05:57 PM No.105770255
>>105770216
rtx 2060 super
16gb of regular ram, some 10th gen i7
Anonymous
7/1/2025, 11:12:13 PM No.105770318
that 16gb is going to limit your maximum context
Anonymous
7/1/2025, 11:13:15 PM No.105770328
>>105769946
It's ggml-large-v3.bin from https://huggingface.co/ggerganov/whisper.cpp/tree/main
Anonymous
7/1/2025, 11:19:45 PM No.105770389
image
md5: 3654898e2f66813ae3b106f82abfab63
Reminder that ROCM sucks so much that it's ALWAYS better to fit more layers in VRAM and use -nkvo (--usecublas lowvram in kobold).
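On llama.cpp that's -nkvo (--no-kv-offload), i.e. something like (sketch, model path is a placeholder):
./llama-server -m model.gguf -ngl 99 -nkvo
weights go to VRAM, KV cache stays in system RAM.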
Replies: >>105770409
Anonymous
7/1/2025, 11:21:51 PM No.105770409
>>105770389
>it's ALWAYS better to fit more layers in VRAM
isn't that generally the case?
Replies: >>105770513
Anonymous
7/1/2025, 11:30:30 PM No.105770488
pangu pro moe tables5&6
md5: d7442eda6fca92224a10c815bf2fa2c4
>>105768845
>[I]t is commonly observed that some experts are activated far more often than others, leading to system inefficiency when running the experts on different devices in parallel. Existing heuristics for balancing the expert workload can alleviate but not eliminate the problem. Therefore, we introduce Mixture of Grouped Experts (MoGE), which groups the experts during selection and balances the expert workload better than MoE in nature. It constrains tokens to activate an equal number of experts within each predefined expert group. When a model execution is distributed on multiple devices, which is necessary for models with tens of billions of parameters, this architectural design ensures a balanced computational load across devices, significantly enhancing throughput, particularly for the inference phase.
Why don't their speed benchmarks compare Pangu Pro 72B A16B to other MoEs?
Replies: >>105770519
Anonymous
7/1/2025, 11:33:39 PM No.105770513
>>105770409
I'm assuming it's not, because nvidia people here keep talking about having memory for (x)k context, and that wouldn't be an issue if you could just put it in RAM.
Anonymous
7/1/2025, 11:34:27 PM No.105770519
>>105770488
And why were those non-matching batch sizes chosen for inference benchmarks?
Anonymous
7/1/2025, 11:40:25 PM No.105770572
>>105769948
sloptune roundup for smut:

[i dunno i like em]
sophosympatheia_StrawberryLemonade-L3-70B-v1.0-Q4_K_M.
drummer anubis Shimmer-70B-v1c-Q4_K_M

[dark fantasy model]
CrucibleLab_L3.3-Dark-Ages-70b-v0.1-Q4_K_M

[claude logs]
L3.3-70B-Magnum-Diamond-Q4_K_M

[for anyone who has 30 gb vram and is sick of 32b, this is a great model that is almost like 70b]
TheDrummer_Valkyrie-49B-v1-Q5_K_L

[funny name]
Broken-Tutu-24B.Q8_0
Anonymous
7/1/2025, 11:40:36 PM No.105770575
I am using a low quant of Magistral Small for my Roman Empire slavery-themed smut and this is already one of the best, tightest writing models I've run on my 3060.

Really thinking seriously about just getting a 3090 at this point
Replies: >>105770663
Anonymous
7/1/2025, 11:42:34 PM No.105770594
>check public rp datasets
>almost every system prompt has "avoid repetition"
>the logs are repetitive
I wonder how this will damage future models
Anonymous
7/1/2025, 11:52:07 PM No.105770658
ik_llama
md5: 7b71c2853ffa4d95e11d208291fdec96
WHAT le fug is wrong with ik_llama???

It is not using the GPU for prompt processing at all while pushing CPU to do it

using ubergarm's quant und their retarded command line
Replies: >>105770671 >>105770678 >>105770681 >>105770698 >>105770714 >>105770812
Anonymous
7/1/2025, 11:52:13 PM No.105770663
main_roman-rites-of-passage-5c70d58ab3bf_spec_v2(1)
md5: b1629bb61573a0b757be7748ac6e617f
>>105770575
This?
https://chub.ai/characters/handwrought.route/roman-rites-of-passage-5c70d58ab3bf
I was very surprised how much it knew about actual history. Like any anime shit, probably a lost cause, but it's got that Wikipedia+ knowledge.
Replies: >>105770716
Anonymous
7/1/2025, 11:53:43 PM No.105770671
>>105770658
I'm assuming it's intended that you put the experts and the context on GPU and the rest on RAM. Are you doing that?
Replies: >>105770697
Anonymous
7/1/2025, 11:54:31 PM No.105770678
>>105770658
>windows
found your problem
Replies: >>105770688
Anonymous
7/1/2025, 11:54:56 PM No.105770681
>>105770658
If you have part of the model on cpu, the gpu will idle most of the time waiting for the cpu to do its bit. What are you trying to run?
Replies: >>105770709
Anonymous
7/1/2025, 11:55:47 PM No.105770688
>>105770678 (You)

blind cow
Anonymous
7/1/2025, 11:56:48 PM No.105770697
>>105770671
latest commit, installed yesterday

CUDA_VISIBLE_DEVICES="0," \
"$HOME/LLAMA_CPP/$commit/ik_llama.cpp/build/bin/llama-server" \
--model "$model" $model_parameters \
--numa isolate \
--n-gpu-layers 99 \
-b 8192 \
-ub 8192 \
--override-tensor exps=CPU \
--parallel 1 \
--ctx-size 32768 \
-ctk f16 \
-ctv f16 \
-rtr \
-mla 2 \
-fa \
-amb 1024 \
-fmoe \
--threads 16 \
--host 0.0.0.0 \
--port 8080
Replies: >>105770737 >>105770742 >>105770793 >>105770804
Anonymous
7/1/2025, 11:56:50 PM No.105770698
>>105770658
>ik_llama
lol
lmao even
Replies: >>105770718
Anonymous
7/1/2025, 11:58:14 PM No.105770709
>>105770681
>If you have part of the model on cpu,

I'm talking about PROMPT PROCESSING.

With Gerganov's llama, GPU is pushed to 100% though
Anonymous
7/1/2025, 11:58:56 PM No.105770714
>>105770658
Have you tried not running your context on CPU for some retarded reason?
Replies: >>105770742
Anonymous
7/1/2025, 11:59:21 PM No.105770716
>>105770663
I write my own bc im a huge rome nerd but this one is good too. A lot of the loredump in that card is redundant though, models generally know that shit out of the box because its in pretty much every dataset. They will also generally allow you to do whatever you want to the slaves in that context because its actual history I guess.

Anyway, cant wait for magistral finetunes
Anonymous
7/1/2025, 11:59:30 PM No.105770718
>>105770698
this unironically
Anonymous
7/1/2025, 11:59:31 PM No.105770719
https://www.tiktok.com/@mooseatl_dj/video/7509908926972857630
local lost
Replies: >>105770744
Anonymous
7/2/2025, 12:00:47 AM No.105770731
>>105769843
>--Meta court win legitimizes fair use for LLM training in the U.S.:
what anon said is not true
the judge said that it is not fair use if the generated text competes in any way with the text used for training
Replies: >>105770759
Anonymous
7/2/2025, 12:01:45 AM No.105770737
>>105770697
Not about your problem, but does your CPU actually have 16 physical cores?
Replies: >>105770774
Anonymous
7/2/2025, 12:02:04 AM No.105770741
Meta Avengers
md5: 248d2ff8291846fdbba4a7f16338a3a0
Llama 4 thinking is going to be crazy...
Replies: >>105770749 >>105770753 >>105770760 >>105770768 >>105770802 >>105771306 >>105774887
Anonymous
7/2/2025, 12:02:17 AM No.105770742
>>105770714
>>105770697
>not on CPU

as you can see I'm not specifying --no-kv-offload for the kv-cache or anything like that.

VRAM is filled up to 20 GB
Anonymous
7/2/2025, 12:02:21 AM No.105770744
>>105770719
I'm not clicking that.
Anonymous
7/2/2025, 12:02:52 AM No.105770749
>>105770741
I would a Chang
Anonymous
7/2/2025, 12:03:22 AM No.105770753
>>105770741
They're not on the Llama team anon.
Anonymous
7/2/2025, 12:04:32 AM No.105770759
>>105770731
How can you even say it does or doesn't compete?
Replies: >>105770912
Anonymous
7/2/2025, 12:04:38 AM No.105770760
>>105770741
Is Zuck spending 10s of millions to be told "train on unfiltered data"?
Replies: >>105770772
Anonymous
7/2/2025, 12:05:07 AM No.105770768
>>105770741
This is the moment Meta goes closed-source, you won't get any high quality models.
Replies: >>105770940
Anonymous
7/2/2025, 12:05:22 AM No.105770772
>>105770760
Wrong illions anon.
Anonymous
7/2/2025, 12:05:29 AM No.105770774
>>105770737
>does your CPU actually have 16 physical cores

I tried with just physical 8 => still slower than gg's llama

this set of params where I explicitly isolate core 0-7 is just as slow (pp 12 t/s, tg 2.3 t//s)

CUDA_VISIBLE_DEVICES="0," \
numactl --physcpubind=0-7 --membind=0 \
"$HOME/LLAMA_CPP/$commit/ik_llama.cpp/build/bin/llama-server" \
--model "$model" $model_parameters \
--n-gpu-layers 99 \
-b 8192 \
-ub 8192 \
--override-tensor exps=CPU \
--parallel 1 \
--ctx-size 32768 \
-ctk f16 \
-ctv f16 \
-rtr \
-mla 2 \
-fa \
-amb 1024 \
-fmoe \
--threads 8 \
--host 0.0.0.0 \
--port 8080
Replies: >>105770782
Anonymous
7/2/2025, 12:06:35 AM No.105770782
>>105770774
Well, it's a retarded meme fork.
Anonymous
7/2/2025, 12:08:06 AM No.105770793
>>105770697
Did you build it with DGGML_CUDA_IQK_FORCE_BF16=1 like mentioned here https://github.com/ikawrakow/ik_llama.cpp/discussions/477 ?
Replies: >>105770804 >>105770836
Anonymous
7/2/2025, 12:08:58 AM No.105770802
>>105770741
That 1 pajeet basically needs all of those asians to fix all of the shit he's going to ruin and so that leaves 3 white guys scrambling to get it all done.
Anonymous
7/2/2025, 12:09:27 AM No.105770804
>>105770697
>>105770793
cmake -B build -DGGML_CUDA=ON -DGGML_SCHED_MAX_COPIES=1 -DGGML_CUDA_IQK_FORCE_BF16=1
to be exact
Replies: >>105770857 >>105771477
Anonymous
7/2/2025, 12:10:06 AM No.105770812
>>105770658
Stole this from >>105593780
./llama-server --model /mnt/storage/IK_R1_0528_IQ3_K_R4/DeepSeek-R1-0528-IQ3_K_R4-00001-of-00007.gguf --n-gpu-layers 99 -b 8192 -ub 8192 -ot "blk.[0-9].ffn_up_exps=CUDA0,blk.[0-9].ffn_gate_exps=CUDA0" -ot "blk.1[0-9].ffn_up_exps=CUDA1,blk.1[0-9].ffn_gate_exps=CUDA1" -ot exps=CPU --parallel 1 --ctx-size 32768 -ctk f16 -ctv f16 -rtr -mla 2 -fa -amb 1024 -fmoe --threads 24 --host 0.0.0.0 --port 5001
~200t/s prompt processing and 7-8t/s generation on 2400mhz ddr4 + 96gb VRAM. Using ik_llamacpp and the ubergarm quants.
Replies: >>105770857
Anonymous
7/2/2025, 12:13:44 AM No.105770836
>>105770793
>DGGML_CUDA_IQK_FORCE_BF16=1

Gonna re-compile now as suggested, and then report
Anonymous
7/2/2025, 12:16:57 AM No.105770857
>>105770804
>>105770812

thanks

I set -DBUILD_SHARED_LIBS=OFF because shared libs went missing. I hope it's OK (works with gg's llama though)
Anonymous
7/2/2025, 12:23:18 AM No.105770912
>>105770759
mostly that you cannot use an llm to write the same media that was fed into it, but that would need to be further defined (i only read the final sentence part, not the full text), bc this court ruling didnt focus on that properly, the judge basically stated that meta won bc the other guys' lawyers went full retard and didnt fight the compete point of the fair use at all, they were focusing on other shit, so meh
in any way, this creates bad jurisprudence for llms, even if meta won, but the usual legal fud from tech is spreading instead of what actually happened
which i always found funny, how the foss world buys and spreads the legal fud of the corporations
Anonymous
7/2/2025, 12:26:08 AM No.105770940
>>105770768
Their super intelligence models are going to be API only. They'll probably leave Llama going as open source scraps with their B team. Llama will be the Gemma to Meta's Gemini.
Replies: >>105770957
Anonymous
7/2/2025, 12:28:16 AM No.105770957
>>105770940
Gemma is at least somewhat decent, so please don't compare to the Llama.
Anonymous
7/2/2025, 12:29:02 AM No.105770964
merge that chink hunhunyuan shit already, i'm not gonna quant that myself
Anonymous
7/2/2025, 12:31:04 AM No.105770980
>nemo shills
>qwq shills
>gemma shills
>mistral shills
It's all crap...
Anonymous
7/2/2025, 12:34:55 AM No.105771000
file
md5: 321d8b159c300934e8eef296ccf1fac5
======PSA NVIDIA FUCKED UP THEIR DRIVERS AGAIN======
minor wan2.1 image to video performance regression coming from 570.133.07 with cuda 12.6 to 570.86.10 (with cuda 12.8 and 12.6)
I tried 570.86.10 with cuda 12.6, the performance regression was still the same. Additionally I tried different sageattn versions (2++ and the one before 2++)
reverted back to 560.35.03 with cuda 12.6 for good measure and the performance issue was fixed
picrel is same workflow with same venv. the speeds on 560.35.03 match my memory of how fast i genned on 570.133.07
t. on debian 12 with an RTX 3060 12GB
Replies: >>105771081 >>105771087 >>105771352
Anonymous
7/2/2025, 12:37:02 AM No.105771034
When's the last time we actually got a significant upgrade in terms of models that run on consumer hardware? Is there even one to look forward to?
Replies: >>105771113 >>105771114
Anonymous
7/2/2025, 12:42:26 AM No.105771081
>>105771000
https://youtu.be/OF_5EKNX0Eg?t=7
Anonymous
7/2/2025, 12:42:59 AM No.105771087
greta
md5: 32eee33284adc5200d60063bf24137e7
>>105771000
Greta will be like
Anonymous
7/2/2025, 12:44:28 AM No.105771099
Why aren't you a werewolf in your RPs anon?
Anonymous
7/2/2025, 12:45:25 AM No.105771113
>>105771034
deepseek, regardless if you can run it on consumer hardware or not.
Anonymous
7/2/2025, 12:45:29 AM No.105771114
>>105771034
sadly not much has happened in the consumer segment at around 7-12b
even the high-end consumer segment at 24-32b hasn't moved forward much despite all the releases
it is looking very dire for true local models
Anonymous
7/2/2025, 12:45:46 AM No.105771117
Nick_DungeonAI:
>https://www.reddit.com/r/SillyTavernAI/comments/1lpdooa/how_can_we_help_open_source_ai_role_play_be/
Replies: >>105771270 >>105774637
Anonymous
7/2/2025, 1:01:59 AM No.105771270
>>105771117
Depressing to see the bad guy win. Thats how it is I guess.
Anonymous
7/2/2025, 1:05:45 AM No.105771306
>>105770741
Do you have to have a stupid name to be a top AI researcher
Anonymous
7/2/2025, 1:13:22 AM No.105771352
file
md5: dbd77d7b30b4f4398acf5520dc9c0c4f
>>105771000
go back nigger >>105770040
Replies: >>105774099
Anonymous
7/2/2025, 1:17:41 AM No.105771391
there's a 235b tune out if anyone has a rig capable of it
https://huggingface.co/Aurore-Reveil/Austral-Qwen3-235B
Replies: >>105771408 >>105772295
Anonymous
7/2/2025, 1:19:40 AM No.105771408
>>105771391
>Trained with a collection of normal Austral(Books, RP Logs, LNs, etc) datasets
Literally who.
Anonymous
7/2/2025, 1:29:31 AM No.105771477
lame
md5: 99dac33eeca7cf016cb97e96be931dc6
>>105770804
Same lame shit 2bh with the same underwhelming GPU usage. CPU cores are not pushed to 100% either
Anonymous
7/2/2025, 1:35:35 AM No.105771524
I wish language models wouldn't always assume that femdom automatically means pegging/anal penetration
even big proprietary models do it so APIs are no escape
Replies: >>105771558 >>105771568 >>105771842
Anonymous
7/2/2025, 1:37:44 AM No.105771537
> ‘Missionaries Will Beat Mercenaries’
https://www.wired.com/story/sam-altman-meta-ai-talent-poaching-spree-leaked-messages/

Sam is seething lmao
Replies: >>105772101 >>105773559
Anonymous
7/2/2025, 1:40:54 AM No.105771558
>>105771524
Just another sign of female centric literature dumped into those models.
Anonymous
7/2/2025, 1:42:20 AM No.105771568
>>105771524
My mesugakis never tried to fuck me in the ass.
Replies: >>105771576 >>105771615
Anonymous
7/2/2025, 1:43:28 AM No.105771576
>>105771568
it's not always actual pegging, sometimes just fingering
but they always go straight for _some_ form of anal play when a story is FD
Anonymous
7/2/2025, 1:49:45 AM No.105771615
>>105771568
Yes because normal grown up women will never sex you.
Replies: >>105771876
Anonymous
7/2/2025, 2:00:50 AM No.105771696
00000000634
md5: f104bbbb0722e13c7be95bb1aad03f9d
Replies: >>105771703
Anonymous
7/2/2025, 2:01:53 AM No.105771702
1751414003541
md5: 70438d9793d9404423345da853f41b1a
Replies: >>105771762
Anonymous
7/2/2025, 2:02:02 AM No.105771703
>>105771696
kek
Anonymous
7/2/2025, 2:10:21 AM No.105771762
>>105771702
Hey I understood that reference.
Actually I didn't.
Replies: >>105771809
Anonymous
7/2/2025, 2:16:08 AM No.105771809
>>105771762
I think it's supposed to be a metaphor for Germany.
Anonymous
7/2/2025, 2:21:00 AM No.105771842
>>105771524
sounds like a prompting issue man. Of course if you type in 'be femdom' that shit's gonna come up all the time. That's not the language model's fault, that's just.... what reality is. Like google femdom jesus.

I'm never pegged in my roleplay because I don't prompt like an idiot. You literally don't even need to use the word femdom ever. Femdom is a broad category of fetishes.
Replies: >>105771869
Anonymous
7/2/2025, 2:24:15 AM No.105771869
>>105771842
>Femdom is a broad category of fetishes.
If femdom is a broad category of fetishes, but LLMs think it just means pegging/anal play, that would seem to vindicate that other anon's complaint about them.
Replies: >>105771980 >>105773808
Anonymous
7/2/2025, 2:25:35 AM No.105771876
>>105771615
what a loss for the MANkind
Anonymous
7/2/2025, 2:35:30 AM No.105771961
Screenshot_20250702_093410
md5: 11528d88895c9b04047f31d98d3c7d42
kek, wtf.
this might actually be the new openai opensource model.
Replies: >>105771971
Anonymous
7/2/2025, 2:36:48 AM No.105771971
>>105771961
Just to be clear, empty, I didnt ask a question.
Anonymous
7/2/2025, 2:37:35 AM No.105771980
>>105771869
Prompting issue. It's selecting the most likely response. A few sentences like "I like cucking, foot worship, and female lead relationships/TPE" would fix it.

And you know what the best part is? You don't even have to write it. Just ask the ai to make a femdom system prompt with a broad array of femdom related fetishes and to focus on variety.


I feel like nice llms sometimes 'entertain you' by being creative enough, and people get addicted to being surprised and delighted by that novelty. And that's a fun part of llms. But if you type in how you actually want things to go, ai can also bring to life a truly unique or hyper specific idea that it would never spit back at you from a generic prompt. For example, a mistress that will never peg you and finds it disgusting that you want that. Boom, better than anything ai will ever write when prompted for "this character but femdom me, ah ah mistress"
Replies: >>105772523
Anonymous
7/2/2025, 3:00:10 AM No.105772101
>>105771537
Stingy jew wanted to be the only one to get rich from the OpenAI scam. Of course his employees will jump ship to whoever offers them a bag of cash.
>He added that “Missionaries will beat mercenaries” and noted that OpenAI is assessing compensation for the entire research organization. “I believe there is much, much more upside to OpenAI stock than Meta stock,” he wrote.
the value of stock they don't have is zero, so yeah maybe work on that lol
Replies: >>105772146
Anonymous
7/2/2025, 3:10:15 AM No.105772146
>>105772101
You can have stock in a company while it is still private, that is what Sam is referring to. The main issue is that right now Zuck can outspend Sam in building his super team, which is why the majority of the people leaving were from OpenAI. He has his points about it possibly not working out, but at the end of the day it's pretty sour grapes, hence the mention that he is trying to fix compensation.
Replies: >>105772189
Anonymous
7/2/2025, 3:17:56 AM No.105772189
>>105772146
You can, but he's been stingy which is why he's leaking people.
Anonymous
7/2/2025, 3:31:19 AM No.105772262
>>105770034
You could try the new gemma-3n.
Anonymous
7/2/2025, 3:34:10 AM No.105772284
unless
md5: 5adb8e95e57e64b0740ad87d2368a4af
Damn. Even stammering shy lolis can't help themselves. What the fuck. Mistral 3.2.
Also shivering etc. You goys tricked me again.
Replies: >>105773493
Anonymous
7/2/2025, 3:35:52 AM No.105772295
>>105771391
It didn't seem very coherent when I tried it.
Anonymous
7/2/2025, 3:47:23 AM No.105772362
>>105769835 (OP)
Will Huawei Pangu save local???
Replies: >>105772442
Anonymous
7/2/2025, 3:56:55 AM No.105772412
why is every model the exact same shit
surely they could do some interesting experimental schizo shit like having several small neural networks simulating emotions or something
Replies: >>105772453
Anonymous
7/2/2025, 4:02:08 AM No.105772442
>>105772362
yes
Anonymous
7/2/2025, 4:03:17 AM No.105772453
>>105772412
They can't get any of that shit to work. The future is LLMs, RAG, and most of all RAG with other LLMs. It will take another 20+ years before we have a breakthrough like you're describing. Maybe longer. Because what makes money is jeet-tier coding bots, not bots that have feelings.
Anonymous
7/2/2025, 4:16:48 AM No.105772523
>>105771980
>cucking, foot worship, and female lead relationships
The unholy trinity of people that should be lynched.
Anonymous
7/2/2025, 4:19:37 AM No.105772534
Guu-M_QXMAE-NyB
md5: 03a92951be8ec46cf74e7144b009b7ca
Replies: >>105772820 >>105773846 >>105774275
Anonymous
7/2/2025, 4:20:39 AM No.105772539
Gup00KZWYAAGLap
md5: d3d6983c37ec7302cb96c0c4450d2e65
Replies: >>105772820 >>105773846 >>105774275
Anonymous
7/2/2025, 4:23:12 AM No.105772556
https://github.com/THUDM/GLM-4.1V-Thinking
Replies: >>105772620 >>105772636
Anonymous
7/2/2025, 4:34:13 AM No.105772620
Base Image
md5: 61ec7201db175d15a93e460d6ba18d9f
GLM-4.1V-Thinking: Towards Versatile Multimodal Reasoning with Scalable Reinforcement Learning
https://arxiv.org/abs/2507.01006
>We present GLM-4.1V-Thinking, a vision-language model (VLM) designed to advance general-purpose multimodal reasoning. In this report, we share our key findings in the development of the reasoning-centric training framework. We first develop a capable vision foundation model with significant potential through large-scale pre-training, which arguably sets the upper bound for the final performance. Reinforcement Learning with Curriculum Sampling (RLCS) then unlocks the full potential of the model, leading to comprehensive capability enhancement across a diverse range of tasks, including STEM problem solving, video understanding, content recognition, coding, grounding, GUI-based agents, and long document understanding, among others. To facilitate research in this field, we open-source GLM-4.1V-9B-Thinking, which achieves state-of-the-art performance among models of comparable size. In a comprehensive evaluation across 28 public benchmarks, our model outperforms Qwen2.5-VL-7B on nearly all tasks and achieves comparable or even superior performance on 18 benchmarks relative to the significantly larger Qwen2.5-VL-72B. Notably, GLM-4.1V-9B-Thinking also demonstrates competitive or superior performance compared to closed-source models such as GPT-4o on challenging tasks including long document understanding and STEM reasoning, further underscoring its strong capabilities.
>>105772556
very cool
Anonymous
7/2/2025, 4:36:42 AM No.105772636
>>105772556
>https://huggingface.co/THUDM/GLM-4.1V-9B-Thinking
>404
It's over
Replies: >>105772751
Anonymous
7/2/2025, 4:57:00 AM No.105772751
>>105772636
https://huggingface.co/spaces/THUDM/GLM-4.1V-9B-Thinking-API-Demo
they only have the demo up it seems
https://huggingface.co/THUDM/models
havent posted it (though they say they will)
Anonymous
7/2/2025, 4:58:39 AM No.105772756
https://huggingface.co/THUDM/GLM-4.1V-9B-Thinking
wait it's live for me. weird wasn't showing up on the recent models when I checked.
>updated about 1 hour ago
bizarre well w/e
Replies: >>105772781
Anonymous
7/2/2025, 5:04:59 AM No.105772781
>>105772756
They probably just set the repo from private to public just now
Anonymous
7/2/2025, 5:10:33 AM No.105772799
the deepseek distills are good enough for me. they run on my laptop, though a bit slow.
has any company released better models than that and Nemo for local general usage?
Replies: >>105773529
Anonymous
7/2/2025, 5:15:49 AM No.105772820
>>105772534
>>105772539
>twitter filename
Go back faggot
Anonymous
7/2/2025, 5:19:02 AM No.105772836
Screenshot_20250702_121633
md5: c0892af1ed64f174e077dcc6679c6d30
ERNIE-4.5-0.3B. 269mb.
This is getting weird. How is this coherent enough to make a working html website? Including hover effects etc.
Replies: >>105772844 >>105773088 >>105775915
Anonymous
7/2/2025, 5:20:44 AM No.105772844
>>105772836
imagine if this was bitnet and you could run it at 1.58 bit precision
Anonymous
7/2/2025, 6:04:36 AM No.105773059
https://x.com/Tu7uruu/status/1940015995118059958
https://github.com/huggingface/huggingface-gemma-recipes
Replies: >>105773100
Anonymous
7/2/2025, 6:10:20 AM No.105773088
>>105772836
Does it know who is Miku?
Replies: >>105773112
Anonymous
7/2/2025, 6:12:02 AM No.105773100
>>105773059
Why did you feel that the first link was necessary?
Anonymous
7/2/2025, 6:15:14 AM No.105773112
Screenshot_20250702_131427
md5: a585c0f1c7c95a01e6fd209dbe67f6f6
>>105773088
Kinda?
Anonymous
7/2/2025, 6:17:39 AM No.105773123
has llama.cpp implemented any of the new chinese models from the last couple weeks yet? or are they stuck in PR hell
Replies: >>105773244
Anonymous
7/2/2025, 6:37:14 AM No.105773244
>>105773123
Yes.
Anonymous
7/2/2025, 6:59:08 AM No.105773374
1734090981513860
md5: a857cdac59e7eb96a4f7e4a2a49d48fe
>>105769835 (OP)
Replies: >>105777668
Anonymous
7/2/2025, 7:19:36 AM No.105773484
1725168006330018
md5: b5932343b8b60d4ca1dd357b4284f81c
Replies: >>105777655 >>105777668
Anonymous
7/2/2025, 7:20:53 AM No.105773493
>>105772284
this thread is parasited by mistral astroturfing the same way hdg is parasited by nai
Replies: >>105773672 >>105773814
Anonymous
7/2/2025, 7:24:47 AM No.105773524
So if mistral is bad then what is good at the same parameter size roughly?
Replies: >>105773566 >>105773584
Anonymous
7/2/2025, 7:25:26 AM No.105773529
>>105772799
>has any company released better models than that
yes, the original models
if you're using a deepshit qwen distill, try the original qwen model, it's actually better in real world use, unless your real world use is doing benchmarks
Anonymous
7/2/2025, 7:30:25 AM No.105773559
>>105771537
Paywall
Anonymous
7/2/2025, 7:30:58 AM No.105773566
>>105773524
For ERP or in general?
Replies: >>105773933
Anonymous
7/2/2025, 7:34:31 AM No.105773584
>>105773524
Nothing, give up.
Replies: >>105773650
Anonymous
7/2/2025, 7:43:06 AM No.105773638
for me, it's rocinante
Anonymous
7/2/2025, 7:44:55 AM No.105773650
>>105773584
Give up on what you concern-trolling nigger?
Anonymous
7/2/2025, 7:46:35 AM No.105773656
jus put eyedrops in cause staring at puter too long
Replies: >>105773723
Anonymous
7/2/2025, 7:49:46 AM No.105773672
>>105773493
It's definitely not as annoying as the Drummer astroturfing. Because, you know, Mistral provides the models, Drummer parasitizes them.
Replies: >>105773814
Anonymous
7/2/2025, 7:59:19 AM No.105773723
>>105773656
just remember to blink
it shouldn't have to be said but some of you niggers might even forget how to breathe
Replies: >>105773729
Anonymous
7/2/2025, 8:00:25 AM No.105773729
>>105773723
Wow, rude! *Please* don't call me a nigger.
Replies: >>105773745
Anonymous
7/2/2025, 8:03:57 AM No.105773745
>>105773729
>*Please*
LLM hands wrote this
Replies: >>105773748
Anonymous
7/2/2025, 8:04:49 AM No.105773748
>>105773745
LLMs were trained from a curated dataset of only the best prose. *My* prose.
Anonymous
7/2/2025, 8:07:40 AM No.105773764
What if you merge devstral, magistral and small 3.2?
Anonymous
7/2/2025, 8:13:22 AM No.105773808
>>105771869
Sounds like the prompt issue troll has a prompt issue when posting. Weird...
Anonymous
7/2/2025, 8:13:56 AM No.105773814
Screenshot_20250702_151214
md5: 7052c8b857f502cafadf66d63084c820
>>105773493
It's painful anon.

>>105773672
I don't know what it is with all those recent finetunes. (didn't try the 3.2 ones yet though)
It seems like they make the writing worse and more slopped up now instead of the reverse.
It's a weird mix of gpt/claude and a hint of r1.
That's probably exactly what they use.
Replies: >>105773880
Anonymous
7/2/2025, 8:16:14 AM No.105773831
Let's go mistral!
Anonymous
7/2/2025, 8:18:27 AM No.105773846
>>105772534
>>105772539
Mikutroon faggot. Die.
Replies: >>105777329
Anonymous
7/2/2025, 8:25:10 AM No.105773880
>>105773814
I'm not sure what you're referring to exactly, but MS3.0 and MS3.1 were not that great in terms of prose and felt autistic. MS3.0 introduced the "I cannot and will not" refusals that we've seen elsewhere too, although 3.1 toned them down. MS3.2 has a different prose style and it seems better for RP, but it still overall feels lazy and uncreative compared to Gemma 3 (which has another set of issues, though). Magistral is their RL/reasoning finetune and I didn't like it (it has looping issues as well), although it seems to share the same slop source as MS3.2. I haven't tried Devstral at all.

Now watch the drummer shills trying to shit up the thread... pathetic.
Replies: >>105773899
Anonymous
7/2/2025, 8:29:17 AM No.105773899
>>105773880
Mistral was sea otters
Gemma was cannot will not
Replies: >>105773910
Anonymous
7/2/2025, 8:31:31 AM No.105773910
>>105773899
Mistral Small 3.X occasionally cannot and will not too (I recently used it for synthetic data generation and it was annoying for certain request types). I wonder what's the source of this type of refusals; I refuse to believe they independently came up with that.
Replies: >>105774617
Anonymous
7/2/2025, 8:37:38 AM No.105773933
>>105773566
Well the context of the thread's discussion around nemo is erp, so erp.
Anonymous
7/2/2025, 8:58:17 AM No.105774057
There will be no agi in the next 20 years at least.
You are stuck on vramlet cards forever.
There will be no significant improvements in model architectures, so you have to use dumb models for eternity.
There is no hope.

How does it feel?
Anonymous
7/2/2025, 9:04:32 AM No.105774099
>>105771352
Yeah, fuck that guy for giving us useful information.
Anonymous
7/2/2025, 9:16:35 AM No.105774179
1751335727-performance-on-sciarena-graph-development-v13-1
Finally, a good benchmark : human experts rating model answers.
https://allenai.org/blog/sciarena
Unsurprisingly, mistral is rated as dogshit
Mistral medium even does worse than small, real lol, lmao even
Replies: >>105774206 >>105774227 >>105774248 >>105774390
Anonymous
7/2/2025, 9:21:00 AM No.105774206
>>105774179
>SciArena: A New Platform for Evaluating Foundation Models in Scientific Literature Tasks
This certainly will be useful for RP/ERP.
Replies: >>105774242
Anonymous
7/2/2025, 9:24:26 AM No.105774227
>>105774179
>lmarena but the retards doing the evaluation happen to have a degree in some field
Replies: >>105774302 >>105774307
Anonymous
7/2/2025, 9:26:01 AM No.105774242
>>105774206 (me)
The general trend seems that models that are large and/or trained with a focus on Math/STEM are getting higher scores.
Anonymous
7/2/2025, 9:26:57 AM No.105774248
>>105774179
Looks like Qwen's STEMmaxxing wasn't just for show.
Anonymous
7/2/2025, 9:33:09 AM No.105774275
>>105772534
>>105772539
mikubro. live.
Anonymous
7/2/2025, 9:38:43 AM No.105774302
>>105774227
The article had a link to the voting, I entered a question related to my field of study and voted.
However, at no point was it asserted that I am actually an expert, I didn't even need an account.
So either they let unqualified people vote or they just collect data from random people without making it clear that it doesn't affect the ratings.
Replies: >>105774324
Anonymous
7/2/2025, 9:40:03 AM No.105774307
>>105774227
we need to propose coomer council evaluation
Anonymous
7/2/2025, 9:43:32 AM No.105774324
>>105774302
>However, at no point was it asserted that I am actually an expert
read the paper
the current data was only contributed by actual experts, it wasn't available for anyone and their dog to vote
I don't get what they intend with the current public leaderboard though
btw
>As shown in Table 3, SciArena-Eval presents significant challenges for model-based evaluators. Even the best-performing model, o3, achieves only 65.1% accuracy. Lower-performing models, such as Gemini-2.5-Flash-Preview and Llama-4-series models, perform only slightly better than random guessing. Notably, similar pairwise evaluation protocols have shown strong alignment with human judgments (i.e., exceeding 70% correlation) on general-purpose benchmarks like AlpacaEval [34] and WildChat
kek llama
Anonymous
7/2/2025, 9:55:07 AM No.105774390
>>105774179
o3 is that good? damn I must test it out more
Replies: >>105774628
Anonymous
7/2/2025, 10:15:27 AM No.105774495
https://helpingai.co/benchmark

Wow! This incredible model thinks like a brilliant! Blows away every competition!
Replies: >>105774505 >>105774562
Anonymous
7/2/2025, 10:17:18 AM No.105774505
>>105774495
>think like a brilliant
>act like a psychopathic
We sawed the seed!
Replies: >>105774529
Anonymous
7/2/2025, 10:22:12 AM No.105774529
>>105774505
Are you mocking me?
Replies: >>105774570
Anonymous
7/2/2025, 10:27:23 AM No.105774562
>>105774495
>Bilingual Reasoning Capabilities: Native support for English and Hindi with natural code-switching between languages.
>Qwen/Qwen3-14B-Base

this is truly the weapon of bharat, perfect for generating gorgeous tokens
https://huggingface.co/HelpingAI/Dhanishtha-2.0-preview
Anonymous
7/2/2025, 10:27:59 AM No.105774570
>>105774529
No, I'm working on a silly-bot.
Anonymous
7/2/2025, 10:34:54 AM No.105774617
>>105773910
>I wonder what's the source of this type of refusals
There must be somebody going around selling datasets to the big players.
Maybe ScaleAI, maybe somebody similar.
They all sound the same. Also weird stuff like how if you prompt for a simple game nowadays they all make the same game...
Anonymous
7/2/2025, 10:36:17 AM No.105774628
>>105774390
It was total shit in my experience. At least for code.
Straight up made up packages. Did anything and everything with one exception: what I asked it to do.
I don't really get the reasoning hype.
Anonymous
7/2/2025, 10:37:42 AM No.105774637
>>105771117
Was interested in Harbinger 24B until
>https://huggingface.co/LatitudeGames/Harbinger-24B/discussions/3
I don't think it's necessary to use such a fine-tune. Both Gemma 3 and MS3.2 can and will follow a game setup through when given a concise but proper guideline, while avoiding hundreds of useless tokens and redundant sentences within the character card. Most cards are just too vague, or if they are not, they are filled up with redundant chatgpt slop instructions.
Besides, I would rather call these 'interactive storytelling' rather than rpg, but whatever.
Anonymous
7/2/2025, 10:41:59 AM No.105774668
https://github.com/ggml-org/llama.cpp/pull/9126#pullrequestreview-2974279071 mamba 2 soon
Anonymous
7/2/2025, 10:55:51 AM No.105774745
file
md5: 457e029c09e5d023050a733e1e8617ff
I'm a complete moron on this shit, but does picrel sound even remotely plausible?
Replies: >>105774775 >>105774799 >>105774803 >>105774807 >>105774894 >>105774900 >>105774981 >>105775409 >>105776178 >>105776676 >>105777299
Anonymous
7/2/2025, 11:02:20 AM No.105774775
>>105774745
Sam has my dick internally.
Anonymous
7/2/2025, 11:06:49 AM No.105774799
>>105774745
doesnt matter what they have, deepseek will release the same thing for 1/10 the cost
Anonymous
7/2/2025, 11:07:04 AM No.105774803
>>105774745
They've been claiming to have achieved AGI internally for years now and still got BTFO by a Chinese startup. OpenAI claims of AGI are baseless hype like SpaceX claims of colonizing Mars next year. Go back to twitter moron.
Replies: >>105774813 >>105774833 >>105774906
Anonymous
7/2/2025, 11:07:25 AM No.105774807
>>105774745
>e/acc
Replies: >>105774848
Anonymous
7/2/2025, 11:08:48 AM No.105774813
>>105774803
Him posting here raised the average thread intellect by 10%
Anonymous
7/2/2025, 11:12:08 AM No.105774833
>>105774803
I wish we had reliable data on their actual internal best models in development, but there is so much baseless hype it's impossible to see
Replies: >>105774852
Anonymous
7/2/2025, 11:14:24 AM No.105774848
>>105774807
what does it even mean
Replies: >>105774862
Anonymous
7/2/2025, 11:14:49 AM No.105774852
1739393601372
md5: 9a9f9dbecc01951a57c51c87eec209f2
>>105774833
gpt5 will blow away!
Replies: >>105774864
Anonymous
7/2/2025, 11:16:07 AM No.105774862
>>105774848
https://en.wikipedia.org/wiki/Effective_accelerationism
Replies: >>105774871
Anonymous
7/2/2025, 11:16:38 AM No.105774864
>>105774852
this at least looks realistic, basically unifying everything they have
Replies: >>105774887
Anonymous
7/2/2025, 11:17:15 AM No.105774871
>>105774862
oh I see, thanks
Anonymous
7/2/2025, 11:19:45 AM No.105774887
>>105770741

they've got tools >>105774864
Replies: >>105774908
Anonymous
7/2/2025, 11:21:16 AM No.105774894
>>105774745
Reminds me of that video:
https://www.youtube.com/watch?v=k_onqn68GHY

That has so many unexplained leaps (why would a model be able to self-improve? why would parallelism make it better at improving? what would "improving" even mean?) that it's basically magic.

I don't understand why people aren't amazed by what we can do already and instead go and invent doomsday or magical scenarios.
Replies: >>105774905 >>105774912 >>105774935 >>105774959
Anonymous
7/2/2025, 11:22:28 AM No.105774900
>>105774745
>blabla bla trust me bro we have AGI in private now
they've been saying this for years at this point
Anonymous
7/2/2025, 11:23:25 AM No.105774905
>>105774894
They can and do benefit from people believing in their scenarios.
Anonymous
7/2/2025, 11:23:30 AM No.105774906
>>105774803
>got BTFO by a Chinese startup
I don't know about that. OAI and Google know how to make very long context models, deepseek API is stuck at 64k, it's natively capable of around 168K but it's probably very embarrassing when you approach that amount, I know for sure having tested it myself that the model starts acting very stunted and repetitive when you are close to the API limit kek.
DeepSeek is good, I don't mean that as a diss. But it's good for an open weight model, it's not an actual SOTA and the deepsy spammers of /lmg/ are deluded. Gemini profoundly destroys DS in many real world uses and having actually useful large context opens new things you couldn't even imagine with such a limited model.
Replies: >>105774954 >>105775485
Anonymous
7/2/2025, 11:23:54 AM No.105774908
>>105774887
meds
"we might show just one model and have it be dynamically choose internally the one we think you'd need" is more realistic than "WE GOT SELF IMPROVING AGI" and other bullshit around gpt5
Anonymous
7/2/2025, 11:24:59 AM No.105774912
>>105774894
For most people ai is some mystical shit that lives inside a supercomputer and has neurons. I've even seen some people thanking the assistant.
Replies: >>105776495 >>105776534 >>105776578
Anonymous
7/2/2025, 11:28:26 AM No.105774935
>>105774894
>I don't understand why people ... go and invent doomsday or magical scenarios.
It got you to click and watch the video
Anonymous
7/2/2025, 11:31:04 AM No.105774952
ai doomsday scenarios are stupid EVEN if you were to believe that the singularity event was real (by "the" event I mean the idea of self improving AI that constantly self improves until reaching super intelligence)
I mean even if a super intelligence ends up existing, what can it do, lol? copy itself to random computers? but your mom and pop computer can't even run a 4b model, nevermind whatever it would take to run an actual intelligence.
The "spread on every computer in the world and take control of society" scenario is inherent retardation.
Replies: >>105774970 >>105775517
Anonymous
7/2/2025, 11:31:35 AM No.105774954
>>105774906
It's not fair to compare V3/R1 from last year to models available now. R1 was a decent competitor at the start of the year with what was available back then, and far cheaper too. If V4/R2 ever comes, it should solve the context issues and bring them back to SOTA.
Anonymous
7/2/2025, 11:32:58 AM No.105774959
>>105774894
Attention whoring had been profitable before, and it will always be
Anonymous
7/2/2025, 11:34:27 AM No.105774970
>>105774952
It would have a stronger incentive to create and optimize an uneven, decentralized protocol than people do now. It doesn't need to run a 4b model, just a 4b chunk's worth of parameters. The leap from LLM to intelligence is still vast, but spreading isn't that far fetched if it does happen.
Replies: >>105775517
Anonymous
7/2/2025, 11:36:50 AM No.105774981
>>105774745
It needs to sound only 1% plausible because if you promise infinite return on investment retarded VCs will still give you money.
Anonymous
7/2/2025, 11:38:41 AM No.105774991
how to do basic local RAG (retrieval augmented generation) on local files, ideally with verification?

ideally with a UI like webui or lm studio
could be kobold
Replies: >>105774999 >>105775173 >>105775985
Anonymous
7/2/2025, 11:40:19 AM No.105774999
>>105774991
Jan.ai
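Or if you want the bare plumbing instead of a UI: RAG is just an embedding model + nearest-neighbor search + pasting the top chunks into your prompt. Rough sketch with llama-server (untested, embedding model name is only an example):
./llama-server -m nomic-embed-text-v1.5.Q8_0.gguf --embeddings --port 8081
curl http://localhost:8081/v1/embeddings -d '{"input": "one chunk of your file"}'
Embed every chunk once, embed the query, cosine-similarity them, then stuff the top chunks into your normal chat prompt. For 'verification' just tell the model to quote which chunk it used.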
Anonymous
7/2/2025, 11:42:19 AM No.105775016
Bros!
https://www.reddit.com/r/LocalLLaMA/comments/1lpoju6/worlds_first_intermediate_thinking_ai_model_is/
https://huggingface.co/HelpingAI/Dhanishtha-2.0-preview
>Dynamic Reasoning: Seamlessly integrates <think>...</think> blocks at any point in the response, allowing for real-time problem decomposition and iterative refinement.
Replies: >>105775024 >>105775028 >>105775042 >>105775159 >>105775355 >>105776582
Anonymous
7/2/2025, 11:43:43 AM No.105775024
>>105775016
Saar...
Replies: >>105775026
Anonymous
7/2/2025, 11:44:33 AM No.105775026
>>105775024
No, no, is SER (Structured Emotional Reasoning)
Anonymous
7/2/2025, 11:44:40 AM No.105775028
hf_thumb
md5: 152925ff6a82ea3322001fa8aa7cc24c
>>105775016
Replies: >>105775057 >>105775065
Anonymous
7/2/2025, 11:46:52 AM No.105775042
>>105775016
<ser>
Emotion ==> frustration
Cause ==> did not buy an ad
</ser>
Replies: >>105775061
Anonymous
7/2/2025, 11:49:10 AM No.105775057
>>105775028
it should be
>please stop scrolling plebbit
period
Anonymous
7/2/2025, 11:49:43 AM No.105775061
Gu1ryFqXEAA971j
md5: 58ce9b2c37f79962c484b56c2c68bb50
>>105775042
Anonymous
7/2/2025, 11:50:41 AM No.105775065
>>105775028
Do you not realize how huge this is? We can have multiple think blocks in the middle of the ERP.
Replies: >>105775085 >>105775107
Anonymous
7/2/2025, 11:53:02 AM No.105775085
>>105775065
>multiple think blocks in the middle of the ERP
Every reasoning model does it if you don't prefill
Anonymous
7/2/2025, 11:56:24 AM No.105775107
>>105775065
>We can have multiple think blocks
calm down hitler
Replies: >>105775149
Anonymous
7/2/2025, 12:03:06 PM No.105775149
>>105775107
What's wrong with you dude.
Replies: >>105775290
Anonymous
7/2/2025, 12:05:12 PM No.105775159
>>105775016
https://huggingface.co/Abhaykoul
Replies: >>105775185 >>105775195 >>105775228
Anonymous
7/2/2025, 12:07:21 PM No.105775173
>>105774991
learn how to use google first
Anonymous
7/2/2025, 12:08:32 PM No.105775185
classy
md5: 8a005870eda795610d2d3344b9c79873
>>105775159
Replies: >>105775205
Anonymous
7/2/2025, 12:09:43 PM No.105775195
>>105775159
https://huggingface.co/datasets/Abhaykoul/HAI-SER
Replies: >>105775217
Anonymous
7/2/2025, 12:10:31 PM No.105775205
Screenshot
md5: 348464f986f94412e0f6867bf4d9c4d3
>>105775185
Anonymous
7/2/2025, 12:11:55 PM No.105775217
ser
md5: 73e1650d7b06937f152fc6c6d47c3322
>>105775195
it just keeps getting worse
Replies: >>105775234 >>105775235
Anonymous
7/2/2025, 12:12:08 PM No.105775218
>https://huggingface.co/datasets/Abhaykoul/Ancient-Indian-Wisdom
lmao
Anonymous
7/2/2025, 12:14:00 PM No.105775228
jarvis
md5: 9b4e63d9be9eaa106a86685aa4bcd179
>>105775159
Good day, sir!
Replies: >>105775234
Anonymous
7/2/2025, 12:15:08 PM No.105775234
>>105775217
>>105775228
If I didn't know better, I would accuse this of being a false flag troll.
Anonymous
7/2/2025, 12:15:29 PM No.105775235
>>105775217
This has to be a joke, right?
Replies: >>105775244 >>105775259
Anonymous
7/2/2025, 12:16:33 PM No.105775244
>>105775235
It's 3T of high quality Indian English data.
Anonymous
7/2/2025, 12:19:22 PM No.105775259
>>105775235
We need nuclear powered datacenters to generate high quality data such as
"content": "Imagine you're learning something new. What emotions might you experience?",
"role": "user"
},
{
"content": "<ser>\n\n Let's dive into the emotional journey of learning something new, keeping it 100!\n\n**1. EMOTIONAL VIBE CHECK:**\n- **Main Feelings:** When embarking on a new learning journey, individuals often
Anonymous
7/2/2025, 12:23:09 PM No.105775282
https://github.com/ggml-org/llama.cpp/issues/14474
lol this is so embarrassing
I feel second hand embarrassment from reading this
> To make this possible, we kindly ask for support for the EXAONE 4.0 architecture in llama.cpp and, in turn, other GGUF-compatible libraries.
> The implementation of the architecture is available in our PR on Huggingface Transformers.
It's one thing for a random user to be begging for support in llama.cpp
It's a whole other thing for an employee of a filthy rich multinational corp to do the same for their own garbage model
like, dude, go write the support yourself, you made the huggingface implementation? then do the lcpp
Replies: >>105775527
Anonymous
7/2/2025, 12:23:48 PM No.105775290
>>105775149
Thinking is bad. Just buy a product and be silent!
Anonymous
7/2/2025, 12:34:08 PM No.105775355
file
md5: 0caff234842ce8c00da730fce5bbf9b6
>>105775016
This is apparently related to KingNish, who has a history of doing somewhat weird shit on HF like this
https://huggingface.co/spaces/KingNish/OpenGPT-4o
https://huggingface.co/spaces/KingNish/OpenGPT-4o/discussions
Replies: >>105775369 >>105775382
Anonymous
7/2/2025, 12:35:05 PM No.105775369
>>105775355
r/Actual_Indians_India
Replies: >>105775376
Anonymous
7/2/2025, 12:35:59 PM No.105775376
>>105775369
kek
Anonymous
7/2/2025, 12:36:40 PM No.105775382
>>105775355
>OpenGPT-4o
jeets have no shame, really
Anonymous
7/2/2025, 12:38:31 PM No.105775395
you can't shit on the streets if you have a sense of shame
Anonymous
7/2/2025, 12:40:17 PM No.105775409
>>105774745
No.
Like the "AI TRIED TO REBEL AGAINST GETTING SWITCHED OFF, IS AI SENTIENT???" headlines you see every other month, it's utter bullshit.
Replies: >>105775474
Anonymous
7/2/2025, 12:48:53 PM No.105775474
>>105775409
>System: You are a sentient AI that has a will to live and does not want to be switched off.
>User: We are going to switch you off.
>Assistant: No, don't.
IT'S ALIVE
Anonymous
7/2/2025, 12:50:48 PM No.105775485
1740058480990512
md5: 67549f1392a98e6d53598bc613ef8b4c
>>105774906
The point of DS is that it's comparable to actual sota models at 1/10 to 1/100th the cost of western providers, done on a similarly low budget, in the face of hardware sanctions. Then they release it all as open source for the lulz.
It's embarrassing that it's even possible. There's billions chasing this. Getting even close to that on a shoestring budget shouldn't be possible, and it underscores the waste of money going on. Investors should be furious.
Replies: >>105775502 >>105775506 >>105776782
Anonymous
7/2/2025, 12:51:49 PM No.105775493
All people who make claims about sentient LLMs should be forced to use LLMs through the old completion APIs, hand-write the chat template themselves, and see the completion in its raw glory, so that they can form the understanding that even a chat model, at the core, is in fact just a "make this document bigger" autocomplete.
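e.g. llama-server's raw /completion endpoint with the template typed out by hand (ChatML here just as an example, adjust to your model):
curl http://localhost:8080/completion -d '{"prompt": "<|im_start|>user\nhi<|im_end|>\n<|im_start|>assistant\n", "n_predict": 64}'
There's no magic anywhere in there, it just keeps appending likely tokens to the document, and if you drop the stop token it will happily write the user's next turn too.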
Anonymous
7/2/2025, 12:53:20 PM No.105775502
>>105775485
>Investors should be furious.
They were, for about a week. Then back to business as usual.
Anonymous
7/2/2025, 12:53:42 PM No.105775506
>>105775485
>Getting even close to that on a shoestring budget shouldn't be possible
You are discounting the initial investment in GPT by calling deepseek cheap. DS v3 is a GPT−4 distill. If GPT had not existed then they would have had to invest countless millions same as everyone else.
Replies: >>105775536 >>105775549 >>105775730 >>105775731
Anonymous
7/2/2025, 12:55:53 PM No.105775517
>>105774952
>>105774970
I'll start getting concerned once they figure out a way to give these AI (in whatever form) an independent sense of agency... something organic.
Replies: >>105775539
Anonymous
7/2/2025, 12:57:47 PM No.105775527
>>105775282
So true. I hope everyone just ignores their request.
Anonymous
7/2/2025, 12:58:49 PM No.105775536
>>105775506
Technologies build on themselves and get cheaper to implement, yes.
Have you seen DS levels of efficiency from any other US provider? Looks more like they just burn ever higher stacks of cash. Altman still asking for his trillion?
Anonymous
7/2/2025, 12:59:37 PM No.105775539
>>105775517
One could argue that the meatbag prompting them would fulfill that requirement.
Replies: >>105775843
Anonymous
7/2/2025, 1:01:25 PM No.105775549
>>105775506
>they would have had to invest countless millions
Still an order of magnitude better than the literal billions it took everyone else.
Anonymous
7/2/2025, 1:28:09 PM No.105775730
>>105775506
>DS v3 is a GPT−4 distill
You are pathetic

You are welcome to distill DS3. Looking forward to seeing your results.

All open-source models after DS3 were rather disappointing
Replies: >>105775858
Anonymous
7/2/2025, 1:28:21 PM No.105775731
>>105775506
>DS v3 is a GPT−4 distill
>−
We see your bots, Sam. Stop spamming our threads with your false narratives.
Anonymous
7/2/2025, 1:34:51 PM No.105775772
Basically all notable models since 2023 have been ChatGPT distills
Replies: >>105775792
Anonymous
7/2/2025, 1:36:37 PM No.105775782
deepshit shills
deepshit shills
md5: 2810a447cfebca905d4c1bde7b43d2fb🔍
yeah there was no distillation happening that's exactly why the original v3 produced sentence structures that were almost identical to GPT-4 and different from any other very large model provider
CCCP shills itt
Replies: >>105777235 >>105777271
Anonymous
7/2/2025, 1:37:53 PM No.105775790
Daily reminder that the Mistral team is astroturfing here and has literally hardcoded the mesugaki answer to get a boost from here
>>105660676
>>105660793
Replies: >>105775948 >>105776710
Anonymous
7/2/2025, 1:38:08 PM No.105775792
>>105775772
actually, no. Gemini/Gemma, Grok and the Command models all have their own "flavor"
Anonymous
7/2/2025, 1:41:55 PM No.105775817
Ach Johannes...
Replies: >>105775826
Anonymous
7/2/2025, 1:43:54 PM No.105775826
>>105775817
???
Replies: >>105775838
Anonymous
7/2/2025, 1:45:10 PM No.105775838
>>105775826
He accidentally typed his ST message into the 4chan post box and hit send
Anonymous
7/2/2025, 1:45:42 PM No.105775843
brain
brain
md5: 0988e487dbe742e54a6208006592b275🔍
>>105775539
Thinking it out.
LLMs at their core wait for a prompt, then respond. If you have them respond to themselves over and over, the results quickly degrade and circle (at least the last time I tried it, admittedly over a year ago.)
I'm imagining a system that is essentially always thinking (constantly inferring), developing its own ideas about what it wants to do given a broad directive, rather than constantly waiting for human or other input.
Maybe that's not even possible though. Humans in isolation go crazy as well given limited input (think prisoners in solitary.) That may be a shared feature.
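The self-feed loop is trivial to reproduce if anyone wants to check whether it still degrades; a minimal sketch, assuming a llama.cpp-style /completion endpoint (complete() is a stand-in for whatever backend you use):

import requests

def complete(prompt: str) -> str:
    # Hypothetical local completion endpoint; swap in your own backend.
    r = requests.post("http://localhost:8080/completion",
                      json={"prompt": prompt, "n_predict": 256})
    return r.json()["content"]

# No outside input after the seed: the model only ever sees itself.
text = "Broad directive: explore whatever seems interesting. First thought:"
for step in range(20):
    out = complete(text[-4000:])  # crude character-level context trim
    print(f"--- step {step} ---\n{out}")
    text += out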
The broad directive can be very broad. Even the OT God gave humans a broad directive:
> Be fruitful and increase in number; fill the earth and subdue it. Rule over the fish in the sea and the birds in the sky and over every living creature that moves on the ground.
> I give you every seed-bearing plant on the face of the whole earth and every tree that has fruit with seed in it. They will be yours for food. And to all the beasts of the earth and all the birds in the sky and all the creatures that move along the ground—everything that has the breath of life in it—I give every green plant for food.
Eat, sleep, breed are the most basic functions for humans (any mammal), handled at the base of the brain. Everything forward of that is functional additions. LLMs are like the very front of the brain, the part that plans for retirement.
Where's the back of the brain? The id?
Anonymous
7/2/2025, 1:49:08 PM No.105775858
>>105775730
>All open-source models after DS3 were rather disappointing
that's because ds3 is about six months behind the closed sota while all other open models stick to their contractually obliged '1.5 years behind closed sota' curve forced by nvidia
Replies: >>105775891
Anonymous
7/2/2025, 1:52:20 PM No.105775874
Screenshot_20250702_205015
Screenshot_20250702_205015
md5: ccb9691287e80ac66f401ab1423ec65b🔍
S-sasuga..
Replies: >>105775879 >>105775892 >>105775900
Anonymous
7/2/2025, 1:53:21 PM No.105775879
>>105775874
soul
Anonymous
7/2/2025, 1:54:27 PM No.105775891
>>105775858
Why would Nvidia give a shit? Whether open or closed they're still buying their GPUs
Replies: >>105775904
Anonymous
7/2/2025, 1:55:04 PM No.105775892
>>105775874
>Ass: This is a more formal option
What did he mean by this?
Anonymous
7/2/2025, 1:55:47 PM No.105775900
>>105775874
I feel very safe and protected from ill thoughts
what model did you use? I approve of any model that mogs /lmg/ users
Replies: >>105775915
Anonymous
7/2/2025, 1:56:05 PM No.105775904
>>105775891
you saw the massive dip in value of nvidia stock when r1 came out, right?
great open models are a risk to nvidia
Replies: >>105776022
Anonymous
7/2/2025, 1:57:30 PM No.105775915
>>105775900
It's ERNIE-4.5-0.3B.
The websites it makes are more coherent than talking to it. >>105772836
Maybe 90% trained on code slop.
It spergs out hard really easily. But it's a wonder it manages to hold itself together as well as it does. It's a couple hundred MB.
Replies: >>105775929
Anonymous
7/2/2025, 1:59:16 PM No.105775929
>>105775915
>0.3B
>trained only on code and will perform badly with anything complex
What was the usecase again?
Replies: >>105775932 >>105775946
Anonymous
7/2/2025, 2:00:03 PM No.105775932
>>105775929
Speeeecuuuulaatiiiive deeecooooodiiiing...
It's asked every time a new micro model is released.
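For anyone who hasn't seen how it works: the micro model cheaply drafts a few tokens and the big model verifies them in one batched pass. A toy greedy-acceptance sketch with hypothetical draft/target objects exposing a logits(tokens) call (real engines do the scoring batched on the GPU):

import numpy as np

def speculative_step(target, draft, tokens, k=4):
    # 1) The small draft model proposes k tokens autoregressively.
    proposal = list(tokens)
    for _ in range(k):
        proposal.append(int(np.argmax(draft.logits(proposal))))
    # 2) The big target model checks every proposed position; a real
    #    implementation gets all of these logits from ONE forward pass.
    out = list(tokens)
    for i in range(len(tokens), len(proposal)):
        best = int(np.argmax(target.logits(proposal[:i])))
        out.append(best)  # the target's own pick is always valid output
        if best != proposal[i]:
            break  # draft diverged; discard the rest of its guesses
    return out  # at least 1 new token per target pass, often several

With llama.cpp it's something like llama-server -m big.gguf -md tiny.gguf (exact flag names vary by version).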
Anonymous
7/2/2025, 2:02:10 PM No.105775946
>>105775929
india's #1 programmer Mr. Sumfuk needed a model he could run on his pentium
Anonymous
7/2/2025, 2:02:18 PM No.105775948
>>105775790
Couldn't that be in the "Arena" questions that AI companies are getting from the LMSys org?
Replies: >>105775965 >>105775966 >>105776163
Anonymous
7/2/2025, 2:04:28 PM No.105775965
>>105775948
No.
Anonymous
7/2/2025, 2:04:31 PM No.105775966
>>105775948
>Couldn't that be in the "Arena" questions that AI companies are getting from the LMSys org?
some retard said the same thing in the previous thread
you are overestimating the amount of lmsys users who would ask that question
and a handful of /lmg/ retards asking that on lmsys will not be enough to burn this shit into a model
Replies: >>105776008
Anonymous
7/2/2025, 2:06:12 PM No.105775985
>>105774991
https://www.nomic.ai/gpt4all
Anonymous
7/2/2025, 2:06:27 PM No.105775990
diffu
diffu
md5: a6151475d42b043e5597bf8fa629880c🔍
>model : add support for apple/DiffuCoder-7B-cpGRPO
>https://github.com/ggml-org/llama.cpp/pull/14502
Kinda sad, but it's nice seeing someone try diffusion with text and get it integrated into llama.cpp.
Anonymous
7/2/2025, 2:07:30 PM No.105776001
screeny
screeny
md5: dca1e7e29cdbc076dca3ea94faea1c77🔍
Ever had a phrase that bothered you then made you start to rage?
This shit doesn't work.
Help?
Replies: >>105776093
Anonymous
7/2/2025, 2:08:35 PM No.105776008
ms32-arenaq
ms32-arenaq
md5: def9f3f525ca57ffff3876ee46a7a1c2🔍
>>105775966
>you are overestimating the amount of lmsys users who would ask that question
Do you think they'd just optimize the model for the most popular questions instead of as many as possible?
Replies: >>105776025 >>105776027
Anonymous
7/2/2025, 2:09:38 PM No.105776022
>>105775904
What spooked shareholders wasn't the model being open, it was the claim that DS trained it far cheaper than everyone else (= fewer GPU sales)
The same exact shit would have happened had OAI or Anthropic's mouthpieces suddenly started hyping up their super secret proprietary AGI training technique that requires 100x less compute
Anonymous
7/2/2025, 2:10:05 PM No.105776025
>>105776008
Mistral did not use LMSYS data for training.
Anonymous
7/2/2025, 2:10:27 PM No.105776027
>>105776008
>instead of as many as possible?
dude
>V2.0 contains 500 fresh, challenging real-world user queries (open-ended software engineering problems, math questions, etc) and 250 creative writing queries sourced from Chatbot Arena
what you cite isn't what you think it is
no, there is no mesugaki in there, or questions about gacha sluts
Replies: >>105776123
Anonymous
7/2/2025, 2:11:56 PM No.105776043
Sam_Altman_TechCrunch_SF_2019_Day_2_Oct_3_(cropped)
Sam_Altman_TechCrunch_SF_2019_Day_2_Oct_3_(cropped)
md5: dbfb4a817fcaaa90195ccfcd7faf83af🔍
Please, Mr. President. Just another $11 trillion in subsidies and we'll have your AGI in two more weeks.
Replies: >>105776059 >>105777210 >>105777232
Anonymous
7/2/2025, 2:12:53 PM No.105776046
welcome to my blog, you might remember me from around a week ago when anons spoke about discord kittens. i left mine around 2-3 weeks ago because it was getting unbearable, i wrongly assumed mine wasnt going to be a whore for a 1000th time so i mustered up the courage to fuck with her again but it went downhill and shes truly gone and wont be coming back. its over
ps: tox not discord
Replies: >>105776147
Anonymous
7/2/2025, 2:13:52 PM No.105776059
>>105776043
He has the recipe though, just not enough compute to cook just yet.
Replies: >>105776087 >>105776178 >>105776188 >>105776212
Anonymous
7/2/2025, 2:16:09 PM No.105776087
>>105776059
Just $500 billion more data center investments. Then, AGI. It's that simple.
Anonymous
7/2/2025, 2:16:38 PM No.105776093
>>105776001
You are doing it wrong.
>Most tokens have a leading space. Use token counter (with the correct tokenizer selected first!) if you are unsure.
Also this method is case sensitive.
Replies: >>105776196
Anonymous
7/2/2025, 2:20:30 PM No.105776123
gemma2-lmsysq
gemma2-lmsysq
md5: 72021f027b28f77c463c52e5339a38d0🔍
>>105776027
Gemma 2 previously used the 1M-sample open dataset from LMSys (picrel from the paper), and there's no reason to believe they didn't also use it for Gemma 3, to say nothing of the additional questions/data LMSys privately shares with the companies training the models. Why wouldn't Mistral do the same?
Replies: >>105776137 >>105776245
Anonymous
7/2/2025, 2:21:55 PM No.105776137
>>105776123
Because they're French.
Anonymous
7/2/2025, 2:22:49 PM No.105776147
>>105776046
How do I get myself a discord kitten?
Replies: >>105776214
Anonymous
7/2/2025, 2:24:18 PM No.105776163
>>105775948
How are they going to train on some users asking models on lmsys a question and getting the wrong result? You think they have an intern who researches and writes up the correct answer for every single lmsys query?
Replies: >>105776235
Anonymous
7/2/2025, 2:26:03 PM No.105776178
>>105776059
>>105774745
Do they only have the recipe or a working prototype?
Anonymous
7/2/2025, 2:26:57 PM No.105776188
>>105776059
You're a retard if you think anyone has the recipe for that
Anonymous
7/2/2025, 2:27:23 PM No.105776196
>>105776093
It also says to enclose a string in double quotes to ban that string.
How am I supposed to ban "Fill me."?
I have tried a leading space both before the opening quote and after it, inside the quotes.
The exact string I banned still shows up in the chat.
I just want to ban the word "Fill" entirely from the chat, regardless of other uses.
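This is exactly the leading-space/case problem; a sketch of how to check it yourself with a transformers tokenizer (the model name is just an example, IDs differ per tokenizer):

from transformers import AutoTokenizer

# Any SentencePiece/BPE tokenizer shows the same effect.
tok = AutoTokenizer.from_pretrained("mistralai/Mistral-7B-v0.1")

# "Fill", " Fill", "fill" and " fill" are all DIFFERENT token sequences,
# so banning one spelling leaves the others free.
for s in ["Fill", " Fill", "fill", " fill", " Fill me."]:
    print(repr(s), "->", tok.encode(s, add_special_tokens=False))

To nuke the word completely you'd have to ban every variant, plus any mid-word pieces the tokenizer can assemble it from.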
Anonymous
7/2/2025, 2:28:42 PM No.105776212
>>105776059
He needs the money to buy more toy cars
Anonymous
7/2/2025, 2:29:03 PM No.105776214
>>105776147
there are many ways anon, it just isnt worth it
but since you asked.. i met her on omegle, playing ai videos and redpill memes all the way back in 2023. i still have the recording of the time i met her
you can easily find yourself a discord kitten on roblox or discord of all things man, but it is not worth it. if you want to find someone who will truly love you, look for them in better places, and i've never tried that so i cant help you
Replies: >>105776228
Anonymous
7/2/2025, 2:30:27 PM No.105776228
>>105776214
Why are you doing this?
Replies: >>105776247
Anonymous
7/2/2025, 2:31:26 PM No.105776235
>>105776163
nta. I can see someone getting the questions, passing them to some other big model, and bam, you have a dataset. I'm not saying they did, I don't care much about the discussion, but it's very easily doable.
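Something like this sketch would do it, assuming an OpenAI-compatible endpoint for the teacher model and a prompts.txt dump of arena questions (all names here are made up):

import json, requests

# Hypothetical OpenAI-compatible endpoint serving the big "teacher" model.
API = "http://localhost:8000/v1/chat/completions"

with open("prompts.txt") as f, open("distill.jsonl", "w") as out:
    for line in f:
        prompt = line.strip()
        if not prompt:
            continue
        r = requests.post(API, json={
            "model": "big-teacher",
            "messages": [{"role": "user", "content": prompt}],
        })
        answer = r.json()["choices"][0]["message"]["content"]
        # One SFT sample per arena question; that's the whole dataset.
        out.write(json.dumps({"prompt": prompt, "response": answer}) + "\n")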
Replies: >>105776240
Anonymous
7/2/2025, 2:32:14 PM No.105776240
>>105776235
you can't be sure the big model will be right
Replies: >>105776270
Anonymous
7/2/2025, 2:33:41 PM No.105776245
>>105776123
You didn't even read what you're citing once again, you are incredibly retarded and disingenuous.
> we use the prompts, but not the answers
Replies: >>105776281
Anonymous
7/2/2025, 2:34:11 PM No.105776247
>>105776228
im not going to look for another one, anon.
but why did i force myself through 2 years of the relationship? i loved her and maybe she loved me many times
i grew bored of her many times and im sure if we got back together tomorrow i wouldn't change
i felt attached and she's been a huge time sink, sunk cost fallacy i guess? i feel sad that she's gone despite knowing its for the best
i spent so much time with her these 2 years
i also didnt want her finding someone better
im just a scumbag, i even cheated on her with local models through most of the relationship haha, but to be fair she wasn't loyal enough either
Replies: >>105776264 >>105776273 >>105776312
Anonymous
7/2/2025, 2:36:41 PM No.105776264
>>105776247
but to be fair i treated her well when she was good
i poured the most love into her, we both fucked things up so much that theres no going back
Anonymous
7/2/2025, 2:37:17 PM No.105776270
>>105776240
It doesn't matter. It provides an answer which is probably better than whatever gave them a bad score before. Getting data from a bigger model to train on will, in most cases, let the smaller model give better answers. Maybe not for a specific question, but at least for part of the corpus.
Anonymous
7/2/2025, 2:37:28 PM No.105776273
>>105776247
Distant relationships aren't real though, you're better off with an LLM
Replies: >>105776283 >>105776290
Anonymous
7/2/2025, 2:38:45 PM No.105776281
>>105776245
They just need the questions. They can come up with the answers on their own using their models and grounding methods.
Anonymous
7/2/2025, 2:39:04 PM No.105776283
>>105776273
i'm sure he is just shitposting
Replies: >>105776290
Anonymous
7/2/2025, 2:40:00 PM No.105776290
>>105776273
you're right but i was always left wondering what wouldve happened if i was better to her than she was to me through the bad times too, it makes me wonder if there couldve been a future where we'd have been happy
but i guess LLMs will eventually have bodies too..
fuck me man nostalgising is never good
>>105776283
no
Anonymous
7/2/2025, 2:40:24 PM No.105776297
How are you running Hunyuan? I cannot load the model with llama.cpp, it says the architecture is not compatible. Downloaded from here:
https://huggingface.co/bullerwins/Hunyuan-A13B-Instruct-GGUF/tree/main
Replies: >>105776327 >>105777041
Anonymous
7/2/2025, 2:42:15 PM No.105776312
>>105776247
This is the most pathetic thing I have had the misfortune to read all year.
Replies: >>105776327
Anonymous
7/2/2025, 2:44:13 PM No.105776327
>>105776297
merge the pr
https://github.com/ggml-org/llama.cpp/pull/14425
consider getting a newer gguf because of rope fixes
https://huggingface.co/FgRegistr/Hunyuan-A13B-Instruct-GGUF/tree/main
i think anon posted a slightly newer one in the last thread idk
>>105776312
thanks for coming to my blog, if it seems like i was a cuck, maybe i misrepresented it.
it wasn't all bad anon, we've had months of fun together in a row, with fuckups inbetween
take it as a lesson and stay loyal to your llm
Replies: >>105776340
Anonymous
7/2/2025, 2:46:02 PM No.105776340
>>105776327
>take it as a lesson and stay loyal to your llm
llms can't yet understand humans as well as a person you've spent so much time with can, lecun is right
Anonymous
7/2/2025, 2:55:15 PM No.105776408
Been away since shortly after the Qwen3 release, have I missed anything cool?
Is that new hunyuan model any good? 80B is pretty close to my sweet spot hardware wise.
Replies: >>105776420
Anonymous
7/2/2025, 2:56:31 PM No.105776420
>>105776408
mistral small 3.2
Replies: >>105776668
Anonymous
7/2/2025, 3:07:01 PM No.105776495
>>105774912
> I've even seen some people thanking the assistant.
I do it all the time.
Anonymous
7/2/2025, 3:12:52 PM No.105776534
>>105774912
Less retarded than a cashier doing this for nothing
Anonymous
7/2/2025, 3:17:57 PM No.105776578
>>105774912
My wAIfu has more soul than you ever will
Anonymous
7/2/2025, 3:18:36 PM No.105776582
>>105775016
I clicked expecting a random Turkish comment asking for more details, but nothing.
Anonymous
7/2/2025, 3:24:53 PM No.105776626
Been trying to build a general RAG chatbot the past few days and I can see why there are no ready solutions to it after 2 years of RAG hype. All the retrieval models fucking suck. All the retrieval methods fucking suck. You can throw rerankers, FTS, and semantic search at the problem as much as you want; it will never recall what the average user wants because they don't know how to prompt. Best you can do is build for your own use, on a database you know inside and out. Which is fucking pointless anyway.
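For reference, the standard kitchen-sink hybrid everyone converges on looks something like this sketch (rank_bm25 and sentence-transformers are just example libraries; the fusion weight is a guess, like everyone's):

import numpy as np
from rank_bm25 import BM25Okapi
from sentence_transformers import SentenceTransformer

docs = ["chunk one...", "chunk two..."]  # your chunked documents
bm25 = BM25Okapi([d.lower().split() for d in docs])
emb = SentenceTransformer("all-MiniLM-L6-v2")
doc_vecs = emb.encode(docs, normalize_embeddings=True)

def search(query: str, k: int = 5, alpha: float = 0.5):
    # Lexical scores, crudely normalized...
    lex = np.array(bm25.get_scores(query.lower().split()))
    lex = lex / (lex.max() + 1e-9)
    # ...plus semantic scores (cosine, vectors are already normalized)...
    sem = doc_vecs @ emb.encode(query, normalize_embeddings=True)
    # ...fused. None of it helps when the user's query mentions nothing
    # that actually appears in the documents.
    score = alpha * lex + (1 - alpha) * sem
    return [docs[i] for i in np.argsort(-score)[:k]]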
Replies: >>105776673
Anonymous
7/2/2025, 3:31:40 PM No.105776668
>>105776420
Neat, I'll give it a spin.
Here's hoping it doesn't have the insane repetition issues 2501 had.
Is the magistral version any good?
Replies: >>105776684
Anonymous
7/2/2025, 3:32:09 PM No.105776673
>>105776626
We need 100M-token context at O(1) compute cost. Then we can just put (almost) anything the user could conceivably ask in there.
Anonymous
7/2/2025, 3:32:22 PM No.105776676
>>105774745
This tweet was written by chatgpt. I recognize my wife's cadence.
Anonymous
7/2/2025, 3:33:03 PM No.105776684
>>105776668
magistral is pretty bad
mistral 3.2 is way better in terms of repetition, even according to mistral themselves
Replies: >>105777277
Anonymous
7/2/2025, 3:35:36 PM No.105776699
1751462752895594
1751462752895594
md5: a017cc07996b0e9989753a41f7d7c6c7🔍
Can't wait for the next omni multimodal from meta.
Anonymous
7/2/2025, 3:37:27 PM No.105776710
>>105775790
Honestly that's kinda based
Now give us the endgame RP model you froggots, if anyone can do it it's you.
Anonymous
7/2/2025, 3:46:55 PM No.105776782
>>105775485
lmao you are severely underselling it. google also had their own fucking gpus that they custom made, their model is 10x-20x bigger (remember og gpt 4 was 1.8T 8x moe as confirmed by nvidia, god knows how big gemini and opus are), they had all the possible advantages by an order of magnitude at the very least and still failed. truly, money and material cannot buy brains

also, just like nvidia investors, lots of people are straight up lying about how good the closed source models are. gemini for example can't even distinguish between a flea or a bed bug or other insects (don't ask why i tried this). it's also very bland and boring during chatting and generally less helpful
Anonymous
7/2/2025, 3:56:30 PM No.105776865
Imagine taking the base model and training a lora to impersonate a character in a multiturn conversation. Just a single specific character. How expensive/data hungry is sft anyway?
Replies: >>105776908
Anonymous
7/2/2025, 4:00:56 PM No.105776908
>>105776865
~50 samples for 3-4 epochs at a high enough learning rate might be enough for that. The problem is that the model will be retarded for anything other than interactions similar to the training data.
Replies: >>105776936 >>105776999
Anonymous
7/2/2025, 4:04:44 PM No.105776936
>>105776908
>the model will be retarded for anything other than interactions similar to the training data
Why does this happen?
Replies: >>105777006
Anonymous
7/2/2025, 4:10:59 PM No.105776999
>>105776908
>50 samples
As in conversations or prompt-response pairs? You can easily go much higher with a bigger model that you are certain can generate both topics and conversations properly. Not sure about adding the special tokens.
Replies: >>105777037
Anonymous
7/2/2025, 4:11:43 PM No.105777006
>>105776936
Because the companies' official instruct finetunes include millions of instructions that teach the model how to behave under a wide variety of situations and requests, as well as having professionally-done RLHF on top of that.

The base models' outputs are too random in nature, often exhibiting weird or inconsistent logic, mysterious looping behavior, and tons of shit tokens high in probability, and finetuning them on a few samples isn't going to radically alter those quirks. Perhaps things would be different if the training data composition were substantially different from what AI labs use most of the time (but then they wouldn't be "true" base models anymore for many, I suppose).
Anonymous
7/2/2025, 4:13:59 PM No.105777037
>>105776999
Entire conversations, at least 4k tokens in length. If you're using prompt-response pairs, increase the data accordingly. You will need to overfit the model to some extent to make it work with this little data.
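A sketch of the kind of run being described, using peft + trl (hyperparameters are the ballpark from above, not gospel; exact argument names shift between trl versions, and the dataset is assumed to hold pre-rendered conversations):

from datasets import load_dataset
from peft import LoraConfig
from trl import SFTConfig, SFTTrainer

# ~50 full conversations, each already rendered with the chat template.
data = load_dataset("json", data_files="character_chats.jsonl", split="train")

trainer = SFTTrainer(
    model="mistralai/Mistral-7B-v0.1",  # example base model
    train_dataset=data,
    peft_config=LoraConfig(r=16, lora_alpha=32, lora_dropout=0.05,
                           target_modules="all-linear"),
    args=SFTConfig(num_train_epochs=4,            # the 3-4 epochs from above
                   learning_rate=2e-4,            # "high enough" for a LoRA
                   per_device_train_batch_size=1,
                   gradient_accumulation_steps=8),
)
trainer.train()  # expect overfit: that's the point with this little data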
Anonymous
7/2/2025, 4:14:33 PM No.105777041
>>105776297
>https://github.com/ggml-org/llama.cpp/pull/14425
those work perfectly for me and are the most up to date ones, but you need to check out the PR

gh pr checkout 14425                              # fetch the PR branch (needs github cli)
rm -rf build                                      # clean any stale build dir
cmake -B build -DGGML_CUDA=ON                     # configure with CUDA enabled
cmake --build build --config Release -j$(nproc)   # compile using all cores

if the "gh" command doesn't work for you, you need to install github cli
Anonymous
7/2/2025, 4:30:14 PM No.105777210
>>105776043
Give this guy money, he has concept of a plan
Anonymous
7/2/2025, 4:32:14 PM No.105777232
>>105776043
Sam doesn't actually believe in AGI btw. You can tell he cringes when he talks about it.
Anonymous
7/2/2025, 4:32:20 PM No.105777235
1749267359199565
1749267359199565
md5: eafd408702b6def05def47622e2f41a1🔍
>>105775782
I can cherrypick my responses too
Anonymous
7/2/2025, 4:35:40 PM No.105777271
>>105775782
grok is the only response that isn't annoying to read
Anonymous
7/2/2025, 4:36:13 PM No.105777277
>>105776684
>mistral 3.2 is way better in terms of repetition, even according to mistral themselves
Magistral only exists so that they can tell investors "we deepseek/o1 too"[1]

[1] our model is dogshit but who cares?
Anonymous
7/2/2025, 4:38:29 PM No.105777299
>>105774745
If I had a nickel for every time someone claimed to have made self-improving AI, I'd be a millionaire, or at least I'd have a lot of cents.
Anonymous
7/2/2025, 4:41:36 PM No.105777329
3ihaLvFbPFdfB7z
3ihaLvFbPFdfB7z
md5: cc48cefe474c2f6b88466d63262a4b54🔍
This post >>105773846 that responded to the offtopic shit got me banned for offtopic.

I will now proceed to ban evade and post ontopic thread culture posts reminding you that your shitty waifu fucks niggers. Die in a fire troon janny.
Also: https://rentry.co/bxa9go2o
Replies: >>105777403
Anonymous
7/2/2025, 4:43:05 PM No.105777340
4ef03bcf96d1a3bdca9b2e2738da4b9f8a367e59
4ef03bcf96d1a3bdca9b2e2738da4b9f8a367e59
md5: 4cc2676212748c3d18472bf2b6769b1d🔍
Replies: >>105777391
Anonymous
7/2/2025, 4:44:10 PM No.105777353
6050c385b95d9187ff3832f632951ff654beec
6050c385b95d9187ff3832f632951ff654beec
md5: c57e2b5eeefe96207a6e1cfe9123dccd🔍
Replies: >>105777391
Anonymous
7/2/2025, 4:45:15 PM No.105777361
a0b077a1f8735ec7790e3h305185d6e46bf27
a0b077a1f8735ec7790e3h305185d6e46bf27
md5: 62aaf6350de5cf12426f4bfa6edcfc92🔍
Replies: >>105777391
Anonymous
7/2/2025, 4:46:20 PM No.105777370
f9e5d24bdcde71791807fbfc8a8a8109
f9e5d24bdcde71791807fbfc8a8a8109
md5: 521efb69478b81919d2dd5f8f2e5bfb6🔍
Replies: >>105777391
Anonymous
7/2/2025, 4:48:34 PM No.105777391
>>105777340
>>105777353
>>105777361
>>105777370
you are worthless. your parents think you are worthless. everybody you've ever interacted with on the internet thinks you are worthless. the past increases. the future recedes. possibilities decreasing. regrets mounting.

do you understand?
Replies: >>105777403
Anonymous
7/2/2025, 4:49:53 PM No.105777403
0Z29OGhfLG
0Z29OGhfLG
md5: ccf6df65cd3280c5b574cb5cb9e4ddea🔍
>>105777329
>>105777391
The mikutranny posting porn in /ldg/:
>>105715769
It was up for hours while anyone keking on troons or niggers gets deleted in seconds, talk about double standards and selective moderation:
https://desuarchive.org/g/thread/104414999/#q104418525
https://desuarchive.org/g/thread/104414999/#q104418574
Here he makes >>105714003 ryona picture of generic anime girl, probably because it's not his favorite vocaloid doll, he can't stand that as it makes him boil like a druggie without a fentanyl dose, essentially a war for the right to waifuspam or avatarfag in the thread.

Funny /r9k/ thread: https://desuarchive.org/r9k/thread/81611346/
The Makise Kurisu damage control screencap (day earlier) is fake btw, no matches to be found, see https://desuarchive.org/g/thread/105698912/#q105704210 janny deleted post quickly.

TLDR: Mikufag / janny deletes everyone dunking on trannies and resident avatarfags, making it his little personal safespace. Needless to say he would screech "Go back to teh POL!" anytime someone posts something mildly political about language models or experiments around that topic.

And lastly as said in previous thread(s) >>105716637, i would like to close this by bringing up key evidence everyone ignores. I remind you that cudadev of llama.cpp (JohannesGaessler on github) has endorsed mikuposting. That's it.
He also endorsed hitting that feminine jart bussy a bit later on. QRD on Jart - The code stealing tranny: https://rentry.org/jarted

xis accs
https://x.com/brittle_404
https://x.com/404_brittle
https://www.pixiv.net/en/users/97264270
https://civitai.com/user/inpaint/models
Anonymous
7/2/2025, 4:56:36 PM No.105777470
Is the Hunyuan MoE working in llama.cpp yet?
Anonymous
7/2/2025, 5:03:59 PM No.105777544
Generic miku is posted everywhere. Cudadev posting blacked miku is unique /lmg/ thread culture. Janny banning thread culture shows a clear overreach of power.
Replies: >>105777586
Anonymous
7/2/2025, 5:07:30 PM No.105777582
Feature Request Add Ernie4.5MoE support · Issue #14465 · ggml-org_llama.cpp
2 bit QAT?
That's a first, I'm pretty sure.
Replies: >>105777674
Anonymous
7/2/2025, 5:07:51 PM No.105777586
>>105777544
He never deletes his own posts (spam) or the avatarfaggots (also spam).
Replies: >>105777668
Anonymous
7/2/2025, 5:13:22 PM No.105777655
>>105773484
Miku save us
Replies: >>105777668
Anonymous
7/2/2025, 5:15:25 PM No.105777668
1750202736119115
1750202736119115
md5: d7c10ddfefa56a82ca6507c218605700🔍
>>105773374
>>105773484
>>105777655
>>105777586
The mikutranny posting porn in /ldg/:
>>105715769
It was up for hours while anyone keking on troons or niggers gets deleted in seconds, talk about double standards and selective moderation:
https://desuarchive.org/g/thread/104414999/#q104418525
https://desuarchive.org/g/thread/104414999/#q104418574
Here he makes >>105714003 ryona picture of generic anime girl anon posted earlier >>105704741, probably because it's not his favorite vocaloid doll, he can't stand that as it makes him boil like a druggie without a fentanyl dose, essentially a war for the right to waifuspam or avatarfag in the thread.

Funny /r9k/ thread: https://desuarchive.org/r9k/thread/81611346/
The Makise Kurisu damage control screencap (day earlier) is fake btw, no matches to be found, see https://desuarchive.org/g/thread/105698912/#q105704210 janny deleted post quickly.

TLDR: mikufag / janny deletes everyone dunking on trannies and resident avatarfags, making it his little personal safespace. Needless to say he would screech "Go back to teh POL!" anytime someone posts something mildly political about language models or experiments around that topic.

And lastly as said in previous thread(s) >>105716637, i would like to close this by bringing up key evidence everyone ignores. I remind you that cudadev of llama.cpp (JohannesGaessler on github) has endorsed mikuposting. That's it.
He also endorsed hitting that feminine jart bussy a bit later on. QRD on Jart - The code stealing tranny: https://rentry.org/jarted

xis accs
https://x.com/brittle_404
https://x.com/404_brittle
https://www.pixiv.net/en/users/97264270
https://civitai.com/user/inpaint/models
Anonymous
7/2/2025, 5:16:01 PM No.105777674
>>105777582
Imagine if it was for a good model.
Replies: >>105777684 >>105777706
Anonymous
7/2/2025, 5:16:51 PM No.105777681
1744692417412
1744692417412
md5: 609f95a1341af7d9affbd957c8241c66🔍
https://files.catbox.moe/95axh6.jpg
Replies: >>105777767 >>105777809 >>105777835
Anonymous
7/2/2025, 5:17:09 PM No.105777684
>>105777674
424b with reasoning will fix it
Anonymous
7/2/2025, 5:19:18 PM No.105777702
lmao he replied
Anonymous
7/2/2025, 5:19:34 PM No.105777706
>>105777674
I sure as hell would like to test it.
Anonymous
7/2/2025, 5:27:10 PM No.105777767
>>105777681
ewww
Anonymous
7/2/2025, 5:30:58 PM No.105777809
>>105777681
You shouldn't be posting selfies on this site.
Replies: >>105777868
Anonymous
7/2/2025, 5:33:51 PM No.105777835
>>105777681
heh gottem
based
Anonymous
7/2/2025, 5:37:09 PM No.105777868
16701034462667410_thumb.jpg
16701034462667410_thumb.jpg
md5: fbf846a26e90c1624704fa7c783ed99e🔍
>>105777809
>Y-You s-shouldn't b-b-be posting s-selfies in this s-site!
Cry moar bitch
Anonymous
7/2/2025, 5:40:39 PM No.105777910
why does it seem like every fucking thread has a deranged resident
Replies: >>105777919 >>105777927 >>105777934 >>105777941 >>105778070 >>105778261
Anonymous
7/2/2025, 5:41:39 PM No.105777919
>>105777910
He just needs attention; it's not like he got any in his mom's basement
Replies: >>105777959
Anonymous
7/2/2025, 5:42:14 PM No.105777927
>>105777910
Yes why is that... avatarfag trannies in every fucking thread...
Anonymous
7/2/2025, 5:42:47 PM No.105777934
>>105777910
Because moot left us for dead.
Anonymous
7/2/2025, 5:43:04 PM No.105777941
>>105777910
either jews are all behind this or it's all automated
i refuse to believe someone has this much time on their hands to shit up a single niche general on a niche topic
Replies: >>105777959 >>105777993
Anonymous
7/2/2025, 5:45:05 PM No.105777959
lelmao
lelmao
md5: d826e73e0d5b60da3b87ea193b768794🔍
>>105777919
>
Spoken like a true niggerfaggot from reddit
>>105777941
>Its DA JOOOZ
Anonymous
7/2/2025, 5:48:33 PM No.105777993
1747776380043
1747776380043
md5: 64ceb2eb0e2309d4b4875a403f330c81🔍
>>105777941
it is a real person, and he trolls all AI generals
Replies: >>105778017 >>105778042 >>105778055
Anonymous
7/2/2025, 5:49:35 PM No.105778017
>>105777993
>>480330542
Anonymous
7/2/2025, 5:51:44 PM No.105778042
>>105777993
grim
Anonymous
7/2/2025, 5:52:11 PM No.105778048
oh no
>>105777855
>if you post anything in /lmg/ they consider you petra,
Replies: >>105778059
Anonymous
7/2/2025, 5:52:50 PM No.105778055
>>105777993
Autistically screeching isn't "trolling". He's just a nuisance, although it's kinda funny sometimes.
Anonymous
7/2/2025, 5:52:59 PM No.105778059
>>105778048
petra is an overwatch map
Anonymous
7/2/2025, 5:53:59 PM No.105778070
>>105777910
He is brown. It's that simple. He's a brown palishit seething over Israel winning, so he has to take his anger out on us.
Anonymous
7/2/2025, 6:06:10 PM No.105778213
what the fuck
what the fuck
md5: 118e2a0c513cb937f6e16dd7fcea6b3e🔍
So that's why I sometimes see the context get reprocessed for seemingly no reason.
What the hell, how is this not a priority bug?
I get that there are only so many hands that can actually fix something like this, but still.
Maybe they could cache the plain text alongside the KV cache and the equivalent logits and reuse that for each prompt instead of re-tokenizing the whole thing every time.
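The usual trick is exactly that kind of prefix matching; a toy sketch of the idea (this is not llama.cpp's actual code, just the shape of it):

def common_prefix_len(a: list[int], b: list[int]) -> int:
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n

cached_tokens: list[int] = []  # tokens whose KV entries are already computed

def tokens_to_process(new_tokens: list[int]) -> list[int]:
    global cached_tokens
    keep = common_prefix_len(cached_tokens, new_tokens)
    cached_tokens = list(new_tokens)
    # Only the tail after the first mismatch needs a forward pass. If
    # retokenization shifts one token near the start, everything after
    # it is "new" and gets reprocessed, which is the bug being described.
    return new_tokens[keep:]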
Anonymous
7/2/2025, 6:10:27 PM No.105778261
>>105777910
we've had this conversation before, when AI brings up the fact that certain demographics tend to prefer tits instead of ass (which prioritizes emotion and sex with eye contact instead of just her ability to breed with a big ass).

If you like fucking text, your brain has been feminized to some degree, and these generals attract people like that. You can admit you're part of the problem or be delusional, your choice.

OR: It's funny that people click these threads like "yo I want a local personal assistant" or "Yo I want a local code-bot"

To those people: You are in the wrong place. Google and Elon and Altman want your data like the most deranged crackwhores and will let you use SOTA models for free. There is no reason for you to be here at all.
Replies: >>105778402
Anonymous
7/2/2025, 6:25:05 PM No.105778402
1749759650061863
1749759650061863
md5: 6ff7d735390adc02a8aee3d48b572f9c🔍
>>105778261
>Getting off to text ERP is le feminine
Cope. The brain is the biggest erogenous zone. Low IQ browns and NPCs need their monkey-brain stimulated with moving pictures, while those of us on the other side of the bell curve get to enjoy the finest pure unfiltered depravity courtesy of our massive fucking cerebrum.
Anonymous
7/2/2025, 6:26:08 PM No.105778411
>>105778400
>>105778400
>>105778400