/lmg/ - a general dedicated to the discussion and development of local language models.
Previous threads:
>>105757131 & >>105750356

►News
>(07/01) Huawei Pangu Pro 72B-A16B released: https://gitcode.com/ascend-tribe/pangu-pro-moe-model
>(06/29) ERNIE 4.5 released: https://ernie.baidu.com/blog/posts/ernie4.5
>(06/27) VSCode Copilot Chat is now open source: https://github.com/microsoft/vscode-copilot-chat
>(06/27) Hunyuan-A13B released: https://hf.co/tencent/Hunyuan-A13B-Instruct
>(06/26) Gemma 3n released: https://developers.googleblog.com/en/introducing-gemma-3n-developer-guide

►News Archive: https://rentry.org/lmg-news-archive
►Glossary: https://rentry.org/lmg-glossary
►Links: https://rentry.org/LocalModelsLinks
►Official /lmg/ card: https://files.catbox.moe/cbclyf.png
►Getting Started
https://rentry.org/lmg-lazy-getting-started-guide
https://rentry.org/lmg-build-guides
https://rentry.org/IsolatedLinuxWebService
https://rentry.org/tldrhowtoquant
https://rentry.org/samplers
►Further Learning
https://rentry.org/machine-learning-roadmap
https://rentry.org/llm-training
https://rentry.org/LocalModelsPapers
►Benchmarks
LiveBench: https://livebench.ai
Programming: https://livecodebench.github.io/leaderboard.html
Code Editing: https://aider.chat/docs/leaderboards
Context Length: https://github.com/adobe-research/NoLiMa
Censorbench: https://codeberg.org/jts2323/censorbench
GPUs: https://github.com/XiongjieDai/GPU-Benchmarks-on-LLM-Inference
►Tools
Alpha Calculator: https://desmos.com/calculator/ffngla98yc
GGUF VRAM Calculator: https://hf.co/spaces/NyxKrage/LLM-Model-VRAM-Calculator
Sampler Visualizer: https://artefact2.github.io/llm-sampling
►Text Gen. UI, Inference Engines
https://github.com/lmg-anon/mikupad
https://github.com/oobabooga/text-generation-webui
https://github.com/LostRuins/koboldcpp
https://github.com/ggerganov/llama.cpp
https://github.com/theroyallab/tabbyAPI
https://github.com/vllm-project/vllm
►Recent Highlights from the Previous Thread: >>105757131

--Paper: Libra: Synergizing CUDA and Tensor Cores for High-Performance Sparse Matrix Multiplication:
>105761808 >105761966 >105762753 >105763009
--Paper: Pangu Pro MoE: Mixture of Grouped Experts for Efficient Sparsity:
>105768845 >105768877 >105768901 >105768884 >105768906 >105768933 >105769034
--Meta's AI talent acquisition and open model skepticism amidst legal and data curation challenges:
>105758293 >105758397 >105758388 >105758810 >105758467 >105758482 >105766325 >105758818 >105758901 >105758926 >105758942
--Hunyuan-A13B GGUF port requires custom llama.cpp build for flash attention support:
>105768115 >105768164 >105768455
--Frustration over delayed OpenAI model and skepticism toward benchmarks and strategy:
>105766029 >105766042 >105768619 >105768677 >105768693 >105768837 >105768876 >105769053 >105768798 >105768934
--Critique of Hunyuan and Ernie models for over-reliance on Mills & Boon-style erotic prose in outputs:
>105758427 >105758629 >105758645 >105758674 >105764901 >105765054 >105765118 >105765228 >105765275 >105765472 >105765503 >105765747 >105767085 >105767501 >105766545 >105766794 >105768886 >105758694
--NVIDIA's Mistral-Nemotron open reasoning model sparks confusion and skepticism among anons:
>105766864 >105766975 >105767094 >105767167
--Discussion on NVIDIA ending driver support for older Pascal, Maxwell, and Volta GPUs:
>105764483 >105764512 >105766267
--Fish Audio S1 Mini and 4B text-to-speech model voice cloning results shared:
>105760876 >105760929
--Official OpenAI podcast episode discussing ChatGPT and AI assistant development:
>105766509
--Hunyuan A13B IQ4 chat completion issues on llama.cpp?? frustration:
>105760696 >105760773
--Meta court win legitimizes fair use for LLM training in the U.S.:
>105766199
--Miku (free space):
>105765500 >105766204

►Recent Highlight Posts from the Previous Thread: >>105757140
Why?: 9 reply limit >>102478518
Fix: https://rentry.org/lmg-recap-script
>>105769843
>--Hunyuan A13B IQ4 chat completion issues on llama.cpp?? frustration:
Turn the temp down holy fuck
temperature was a mistake
>>105769897Yeah it's way too fucking hot right now.
>>105769843Thank you Recap Teto
70Bros what's fotm slop finetune?
>>105769897Temp + Top-P is all you need.
>>105769965Temperature: 0.6
Top K: 100
Top P: 0.95
Top nsigma: 1
can't believe there are still people using other LLMs when R1 is this simple to set up.
hi what's the best model for 8gb vram these days
i'm primarily interested in things like tool use and agentic behavior
Bait
>>105770034Qwen MoE probably. The 30B.
>>105770041not bait :(
>>105770065thanks, how many t/s should i expect?
>>105770068Will mostly depend on your RAM, since you want to offload the expert tensors to the CPU backend using the --override-tensor (-ot) parameter.
I'd say between 10 and 15 t/s?
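A minimal example of what that looks like with llama-server (the model filename, context size and thread count are just placeholders, adjust the -ot pattern to whatever actually fits your VRAM):
llama-server \
  --model Qwen3-30B-A3B-Q4_K_M.gguf \
  --n-gpu-layers 99 \
  --override-tensor exps=CPU \
  --ctx-size 8192 \
  --threads 8 \
  -fa
Attention and shared tensors stay on the GPU, only the per-expert FFN weights run on the CPU, which is why RAM speed ends up being the bottleneck.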
>>105770076 16gb of ram unfortunately, haven't gotten around to upgrading it yet
>>105770068>not bait :(You know 8gb is not enough right?
>>105770097Well, shit.
RIP I guess.
Try the q4ks quant with low topk and pray for the best I suppose.
>>105770072will keep that in mind
>>105770125i didn't purpose build this pc for running llms, it's just a gaming pc that i'm hoping to repurpose
>>105770144thanks, will do
last model i used was mistral-7b and it was honestly not up to snuff
>>105770186what specific graphics card do you have, and also how much regular RAM do you have? CPUmaxxing might be an option
>>105770216rtx 2060 super
16gb of regular ram, some 10th gen i7
that 16gb is going to limit your maximum context
>>105769946It's ggml-large-v3.bin from https://huggingface.co/ggerganov/whisper.cpp/tree/main
Reminder that ROCM sucks so much that it's ALWAYS better to fit more layers in VRAM and use -nkvo (--usecublas lowvram in kobold).
>>105770389>it's ALWAYS better to fit more layers in VRAMisn't that generally the case?
>>105768845
>[I]t is commonly observed that some experts are activated far more often than others, leading to system inefficiency when running the experts on different devices in parallel. Existing heuristics for balancing the expert workload can alleviate but not eliminate the problem. Therefore, we introduce Mixture of Grouped Experts (MoGE), which groups the experts during selection and balances the expert workload better than MoE in nature. It constrains tokens to activate an equal number of experts within each predefined expert group. When a model execution is distributed on multiple devices, which is necessary for models with tens of billions of parameters, this architectural design ensures a balanced computational load across devices, significantly enhancing throughput, particularly for the inference phase.
Why don't their speed benchmarks compare Pangu Pro 72B A16B to other MoEs?
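For intuition, here's a rough numpy sketch of the routing constraint described there (group count and per-group k are made-up numbers, not Pangu's actual config):
import numpy as np

def moge_route(router_logits, n_groups=8, k_per_group=1):
    # pick an equal number of experts from every predefined group for one token
    n_experts = router_logits.shape[0]
    group_size = n_experts // n_groups
    chosen = []
    for g in range(n_groups):
        group_scores = router_logits[g * group_size:(g + 1) * group_size]
        best = np.argsort(group_scores)[-k_per_group:]  # top-k within this group only
        chosen.extend(g * group_size + best)
    return np.array(chosen)
Since every group contributes the same number of active experts, mapping one group per device gives each device the same amount of work, which is the load-balancing claim in the abstract.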
>>105770409I'm assuming it's not because nvidia people here keep talking about having memory for (x)k context and that's not an issue if you just put it in RAM.
>>105770488And why were those non-matching batch sizes chosen for inference benchmarks?
>>105769948sloptune roundup for smut:
[i dunno i like em]
sophosympatheia_StrawberryLemonade-L3-70B-v1.0-Q4_K_M.
drummer anubis Shimmer-70B-v1c-Q4_K_M
[dark fantasy model]
CrucibleLab_L3.3-Dark-Ages-70b-v0.1-Q4_K_M
[claude logs]
L3.3-70B-Magnum-Diamond-Q4_K_M
[for anyone who has 30 gb vram and is sick of 32b, this is a great model that is almost like 70b]
TheDrummer_Valkyrie-49B-v1-Q5_K_L
[funny name]
Broken-Tutu-24B.Q8_0
I am using a low quant of Magistral Small for my Roman Empire slavery themed smut and this is already one of the best, tightest writing models I've run on my 3060.
Really thinking seriously about just getting a 3090 at this point
>check public rp datasets
>almost every system prompt has "avoid repetition"
>the logs are repetitive
I wonder how this will damage future models
WHAT le fug is wrong with ik_llama???
It is not using the GPU for prompt processing at all while pushing CPU to do it
using ubergarm's quant and their retarded command line
>>105770575This?
https://chub.ai/characters/handwrought.route/roman-rites-of-passage-5c70d58ab3bf
I was very surprised how much it knew about actual history. Like any anime shit, probably a lost cause, but it's got that Wikipedia+ knowledge.
>>105770658I'm assuming it's intended that you put the experts and the context on GPU and the rest on RAM. Are you doing that?
>>105770658>windowsfound your problem
>>105770658If you have part of the model on cpu, the gpu will idle most of the time waiting for the cpu to do its bit. What are you trying to run?
>>105770678 (You)
blind cow
>>105770671latest commit, installed yesterday
CUDA_VISIBLE_DEVICES="0," \
"$HOME/LLAMA_CPP/$commit/ik_llama.cpp/build/bin/llama-server" \
--model "$model" $model_parameters \
--numa isolate \
--n-gpu-layers 99 \
-b 8192 \
-ub 8192 \
--override-tensor exps=CPU \
--parallel 1 \
--ctx-size 32768 \
-ctk f16 \
-ctv f16 \
-rtr \
-mla 2 \
-fa \
-amb 1024 \
-fmoe \
--threads 16 \
--host 0.0.0.0 \
--port 8080
>>105770658>ik_llamalol
lmao even
>>105770681>If you have part of the model on cpu,I'm talking about PROMPT PROCESSING.
With Gerganov's llama, GPU is pushed to 100% though
>>105770658Have you tried not running your context on CPU for some retarded reason?
>>105770663I write my own bc im a huge rome nerd but this one is good too. A lot of the loredump in that card is redundant though, models generally know that shit out of the box because its in pretty much every dataset. They will also generally allow you to do whatever you want to the slaves in that context because its actual history I guess.
Anyway, cant wait for magistral finetunes
>>105770698this unironically
https://www.tiktok.com/@mooseatl_dj/video/7509908926972857630
local lost
>>105769843
>--Meta court win legitimizes fair use for LLM training in the U.S.:
what anon said is not true
the judge said that it is not fair use if the generated text competes in any way with the text used for training
>>105770697Not about your problem, but does your CPU actually have 16 physical cores?
Llama 4 thinking is going to be crazy...
>>105770714>>105770697>not on CPUas you can see I'm not specifying --no-kv-offload for kv-cache or else.
VRAM is filled up to 20 GB
>>105770719I'm not clicking that.
>>105770741I would a Chang
>>105770741They're not on the Llama team anon.
>>105770731How can you even say it does or doesn't compete?
>>105770741Is Zuck spending 10s of millions to be told "train on unfiltered data"?
>>105770741This is the moment Meta goes closed-source, you won't get any high quality models.
>>105770760Wrong illions anon.
>>105770737
>does your CPU actually have 16 physical cores
I tried with just 8 physical cores => still slower than gg's llama
this set of params where I explicitly isolate cores 0-7 is just as slow (pp 12 t/s, tg 2.3 t/s)
CUDA_VISIBLE_DEVICES="0," \
numactl --physcpubind=0-7 --membind=0 \
"$HOME/LLAMA_CPP/$commit/ik_llama.cpp/build/bin/llama-server" \
--model "$model" $model_parameters \
--n-gpu-layers 99 \
-b 8192 \
-ub 8192 \
--override-tensor exps=CPU \
--parallel 1 \
--ctx-size 32768 \
-ctk f16 \
-ctv f16 \
-rtr \
-mla 2 \
-fa \
-amb 1024 \
-fmoe \
--threads 8 \
--host 0.0.0.0 \
--port 8080
>>105770774Well, it's a retarded meme fork.
>>105770697Did you build it with DGGML_CUDA_IQK_FORCE_BF16=1 like mentioned here https://github.com/ikawrakow/ik_llama.cpp/discussions/477 ?
>>105770741That 1 pajeet basically needs all of those asians to fix all of the shit he's going to ruin and so that leaves 3 white guys scrambling to get it all done.
>>105770697>>105770793cmake -B build -DGGML_CUDA=ON -DGGML_SCHED_MAX_COPIES=1 -DGGML_CUDA_IQK_FORCE_BF16=1
to be exact
>>105770658Stole this from
>>105593780 ./llama-server --model /mnt/storage/IK_R1_0528_IQ3_K_R4/DeepSeek-R1-0528-IQ3_K_R4-00001-of-00007.gguf --n-gpu-layers 99 -b 8192 -ub 8192 -ot "blk.[0-9].ffn_up_exps=CUDA0,blk.[0-9].ffn_gate_exps=CUDA0" -ot "blk.1[0-9].ffn_up_exps=CUDA1,blk.1[0-9].ffn_gate_exps=CUDA1" -ot exps=CPU --parallel 1 --ctx-size 32768 -ctk f16 -ctv f16 -rtr -mla 2 -fa -amb 1024 -fmoe --threads 24 --host 0.0.0.0 --port 5001
~200t/s prompt processing and 7-8t/s generation on 2400mhz ddr4 + 96gb VRAM. Using ik_llamacpp and the ubergarm quants.
>>105770793>DGGML_CUDA_IQK_FORCE_BF16=1Gonna re-compile now as suggested, and then report
>>105770804>>105770812thanks
I set -DBUILD_SHARED_LIBS=OFF because shared libs went missing. I hope it's OK (works with gg's llama though)
>>105770759mostly that you cannot use an llm to write the same media that was fed into it, but that would need to be further defined (i only read the final sentence part, not the full text), bc this court ruling didn't focus on that properly, the judge basically stated that meta won bc the other guys' lawyers went full retard and didn't fight the compete point of the fair use at all, they were focusing on other shit, so meh
in any case, this creates bad jurisprudence for llms, even if meta won, but the usual legal fud from tech is spreading instead of what actually happened
and i always found it funny how the foss world buys and spreads the legal fud of the corporations
>>105770768Their super intelligence models are going to be API only. They'll probably leave Llama going as open source scraps with their B team. Llama will be the Gemma to Meta's Gemini.
>>105770940Gemma is at least somewhat decent, so please don't compare it to Llama.
merge that chink hunhunyuan shit already, i'm not gonna quant that myself
>nemo shills
>qwq shills
>gemma shills
>mistral shills
It's all crap...
======PSA NVIDIA FUCKED UP THEIR DRIVERS AGAIN======
minor wan2.1 image to video performance regression coming from 570.133.07 with cuda 12.6 to 570.86.10 (with cuda 12.8 and 12.6)
I tried 570.86.10 with cuda 12.6, the performance regression was still the same. Additionally I tried different sageattn versions (2++ and the one before 2++)
reverted back to 560.35.03 with cuda 12.6 for good measure and the performance issue was fixed
picrel is same workflow with same venv. the speeds on 560.35.03 match my memory of how fast i genned on 570.133.07
t. on debian 12 with an RTX 3060 12GB
When's the last time we actually got a significant upgrade in terms of models that run on consumer hardware? Is there even one to look forward to?
>>105771000https://youtu.be/OF_5EKNX0Eg?t=7
>>105771000Greta will be like
Why aren't you a werewolf in your RPs anon?
>>105771034deepseek, regardless if you can run it on consumer hardware or not.
>>105771034sadly not much has happened in the consumer segment at around 7-12b
even the high-end consumer segment at 24-32b hasn't moved forward much despite all the releases
it is looking very dire for true local models
Nick_DungeonAI:
>https://www.reddit.com/r/SillyTavernAI/comments/1lpdooa/how_can_we_help_open_source_ai_role_play_be/
>>105771117Depressing to see the bad guy win. Thats how it is I guess.
>>105770741Do you have to have a stupid name to be a top AI researcher
there's a 235b tune out if anyone has a rig capable of it
https://huggingface.co/Aurore-Reveil/Austral-Qwen3-235B
>>105771391>Trained with a collection of normal Austral(Books, RP Logs, LNs, etc) datasetsLiterally who.
>>105770804Same lame shit 2bh with the same underwhelming GPU usage. CPU cores are not used up to 100% either
I wish language models wouldn't always assume that femdom automatically means pegging/anal penetration
even big proprietary models do it so APIs are no escape
> ‘Missionaries Will Beat Mercenaries’
https://www.wired.com/story/sam-altman-meta-ai-talent-poaching-spree-leaked-messages/
Sam is seething lmao
>>105771524Just another sign of female centric literature dumped into those models.
>>105771524My mesugakis never tried to fuck me in the ass.
>>105771568it's not always actual pegging, sometimes just fingering
but they always go straight for _some_ form of anal play when a story is FD
>>105771568Yes because normal grown up women will never sex you.
>>105771702Hey I understood that reference.
Actually I didn't.
>>105771762I think it's supposed to be a metaphor for Germany.
>>105771524sounds like a prompting issue man. Of course if you type in 'be femdom' that shit's gonna come up all the time. That's not the language model's fault, that's just.... what reality is. Like google femdom jesus.
I'm never pegged in my roleplay because I don't prompt like an idiot. You literally don't even need to use the word femdom ever. Femdom is a broad category of fetishes.
>>105771842>Femdom is a broad category of fetishes.If femdom is a broad category of fetishes, but LLMs think it just means pegging/anal play, that would seem to vindicate that other anon's complaint about them.
>>105771615what a loss for the MANkind
kek, wtf.
this might actually be the new openai opensource model.
>>105771961Just to be clear, empty, I didnt ask a question.
>>105771869Prompting issue. Its selecting the most likely response. A few sentences like "I like cucking, foot worship, and female lead relationships/TPE" would fix.
And you know what the best part is? You dont even have to write it. Just ask the ai to make a femdom system prompt with a broad array of femdom related fetishes and to focus on variety.
I feel like nice llm's sometimes 'entertain you' by being creative enough- and people get addicted to being surprised and delighted by that novelty. And that's a fun part of llm's. But if you type in how you actually want things to go, ai can also bring to life a truly unique or hyper specific idea that AI would never spit back at you from a generic prompt. For example, a mistress that will never peg you and finds it disgusting that you want that. Boom, better than anything ai will ever write when prompted for "this character but femdom me, ah ah mistress"
>>105771537Stingy jew wanted to be the only one to get rich from the OpenAI scam. Of course his employees will jump ship to whoever offers them a bag of cash.
>He added that “Missionaries will beat mercenaries” and noted that OpenAI is assessing compensation for the entire research organization. “I believe there is much, much more upside to OpenAl stock than Meta stock,” he wrote.
the value of stock they don't have is zero, so yeah maybe work on that lol
>>105772101You can have stock in a company when it is still private, that is what Sam is referring to. The main issue is that right now, Zuck can outspend Sam for getting his super team hence why the majority of the people leaving were from OpenAI. He has his points about it possibly not working out but at the end of the day, it's pretty sour grapes hence why it is mentioned he is trying to fix that.
>>105772146You can, but he's been stingy which is why he's leaking people.
>>105770034You could try the new gemma-3n.
Damn. Even stammering shy lolis can't help themselves. What the fuck. Mistral 3.2.
Also shivering etc. You goys tricked me again.
>>105771391It didn't seem very coherent when I tried it.
>>105769835 (OP)Will Huawei Pangu save local???
why is every model the exact same shit
surely they could do some interesting experimental schizo shit like having several small neural networks simulating emotions or something
>>105772412They can't get any of that shit to work. The future is LLMs, RAG, and most of all RAG with other LLMs. It will take another 20+ years before we have a breakthrough like you're describing. Maybe longer. Because what makes money is jeet-tier coding bots, not bots that have feelings.
>>105771980>cucking, foot worship, and female lead relationshipsThe unholy trinity of people that should be lynched.
https://github.com/THUDM/GLM-4.1V-Thinking
GLM-4.1V-Thinking: Towards Versatile Multimodal Reasoning with Scalable Reinforcement Learning
https://arxiv.org/abs/2507.01006
>We present GLM-4.1V-Thinking, a vision-language model (VLM) designed to advance general-purpose multimodal reasoning. In this report, we share our key findings in the development of the reasoning-centric training framework. We first develop a capable vision foundation model with significant potential through large-scale pre-training, which arguably sets the upper bound for the final performance. Reinforcement Learning with Curriculum Sampling (RLCS) then unlocks the full potential of the model, leading to comprehensive capability enhancement across a diverse range of tasks, including STEM problem solving, video understanding, content recognition, coding, grounding, GUI-based agents, and long document understanding, among others. To facilitate research in this field, we open-source GLM-4.1V-9B-Thinking, which achieves state-of-the-art performance among models of comparable size. In a comprehensive evaluation across 28 public benchmarks, our model outperforms Qwen2.5-VL-7B on nearly all tasks and achieves comparable or even superior performance on 18 benchmarks relative to the significantly larger Qwen2.5-VL-72B. Notably, GLM-4.1V-9B-Thinking also demonstrates competitive or superior performance compared to closed-source models such as GPT-4o on challenging tasks including long document understanding and STEM reasoning, further underscoring its strong capabilities.>>105772556very cool
>>105772556>https://huggingface.co/THUDM/GLM-4.1V-9B-Thinking>404It's over
>>105772636https://huggingface.co/spaces/THUDM/GLM-4.1V-9B-Thinking-API-Demo
they only have the demo up it seems
https://huggingface.co/THUDM/models
havent posted it (though they say they will)
https://huggingface.co/THUDM/GLM-4.1V-9B-Thinking
wait it's live for me. weird wasn't showing up on the recent models when I checked.
>updated about 1 hour ago
bizarre well w/e
>>105772756They probably just set the repo from private to public just now
the deepseek distills are good enough for me. they run on my laptop, though a bit slow.
has any company released better models than that and Nemo for local general usage?
>>105772534>>105772539>twitter filename Go back faggot
ERNIE-4.5-0.3B. 269mb.
This is getting weird. How is this coherent enough to make a working html website. Including hovering effects etc.
>>105772836imagine if this was bitnet and you could run it at 1.58 bit precision
https://x.com/Tu7uruu/status/1940015995118059958
https://github.com/huggingface/huggingface-gemma-recipes
>>105772836Does it know who is Miku?
>>105773059Why did you feel that the first link was necessary?
has llama.cpp implemented any of the new chinese models from the last couple weeks yet? or are they stuck in PR hell
>>105772284this thread is parasited by mistral astroturfing the same way hdg is parasited by nai
So if mistral is bad then what is good at the same parameter size roughly?
>>105772799>has any company released better models than thatyes, the original models
if you're using a deepshit qwen distill, try the original qwen model, it's actually better in real world use, unless your real world use is doing benchmarks
>>105773524For ERP or in general?
>>105773524Nothing, give up.
>>105773584Give up on what you concern-trolling nigger?
jus put eyedrops in cause staring at puter too long
>>105773493It's definitely not as annoying as the Drummer astroturfing. Because, you know, Mistral provides the models, Drummer parasitizes them.
>>105773656just remember to blink
it shouldn't have to be said but some of you niggers might even forget how to breathe
>>105773723Wow, rude! *Please* don't call me a nigger.
>>105773729>*Please*LLM hands wrote this
>>105773745LLMs were trained from a curated dataset of only the best prose. *My* prose.
What if you merge devstral, magistral and small 3.2?
>>105771869Sounds like the prompt issue troll has a prompt issue when posting. Weird...
>>105773493Its painful anon.
>>105773672I don't know what it is with all those recent finetunes. (didn't try 3.2 ones yet though)
It seems like they make the writing worse and more slopped up now instead of the reverse.
It's a weird mix of gpt/claude and a hint of r1.
That's probably exactly what they use.
>>105772534>>105772539Mikutroon faggot. Die.
>>105773814I'm not sure what you're referring about exactly, but MS3.0 and MS3.1 were not that great in terms of prose and felt autistic. MS3.0 introduced the "I cannot and will not" refusals that we've seen elsewhere too, although 3.1 toned them down. MS3.2 has a different prose style and it seems better for RP, but it still overall feels lazy and uncreative compared to Gemma 3 (which has another set of issues, though). Magistral is their RL/reasoning finetune and I didn't like it (it has looping issues as well), although it seems to share the same slop source as MS3.2. I haven't tried Devstral at all.
Now watch the drummer shills trying to shit up the thread... pathetic.
>>105773880Mistral was sea otters
Gemma was cannot will not
>>105773899Mistral Small 3.X occasionally cannot and will not too (I recently used it for synthetic data generation and it was annoying for certain request types). I wonder what's the source of this type of refusals; I refuse to believe they independently came up with that.
>>105773566Well the context of the thread's discussion around nemo is erp, so erp.
There will be no agi in the next 20 years at least.
You are stuck on vramlet cards forever.
There will be no significant improvements in model architectures, so you have to use dumb models for eternity.
There is no hope.
How does it feel?
>>105771352Yeah, fuck that guy for giving us useful information.
Finally, a good benchmark : human experts rating model answers.
https://allenai.org/blog/sciarena
Unsurprisingly, mistral is rated as dogshit
Mistral medium even does worse than small, real lol, lmao even
>>105774179>SciArena: A New Platform for Evaluating Foundation Models in Scientific Literature TasksThis certainly will be useful for RP/ERP.
>>105774179>lmarena but the retards doing the evaluation happen to have a degree in some field
>>105774206 (me)
The general trend seems that models that are large and/or trained with a focus on Math/STEM are getting higher scores.
>>105774179Looks like Qwen's STEMmaxxing wasn't just for show.
>>105774227The article had a link to the voting, I entered a question related to my field of study and voted.
However, at no point was it asserted that I am actually an expert, I didn't even need an account.
So either they let unqualified people vote or they just collect data from random people without making it clear that it doesn't affect the ratings.
>>105774227we need to propose coomer council evaluation
>>105774302>However, at no point was it asserted that I am actually an expertread the paper
the current data was only contributed by actual experts, it wasn't available for anyone and their dog to vote
I don't get what they intend with the current public leaderboard though
btw
>As shown in Table 3, SciArena-Eval presents significant challenges for model-based evaluators. Even the best-performing model, o3, achieves only 65.1% accuracy. Lower-performing models, such as Gemini-2.5-Flash-Preview and Llama-4-series models, perform only slightly better than random guessing. Notably, similar pairwise evaluation protocols have shown strong alignment with human judgments (i.e., exceeding 70% correlation) on general-purpose benchmarks like AlpacaEval [34] and WildChat
kek llama
>>105774179o3 is that good? damn I must test it out more
https://helpingai.co/benchmark
Wow! This incredible model thinks like a brilliant! Blows away every competition!
>>105774495>think like a brilliant>act like a psychopathicWe sawed the seed!
>>105774505Are you mocking me?
>>105774495>Bilingual Reasoning Capabilities: Native support for English and Hindi with natural code-switching between languages.>Qwen/Qwen3-14B-Basethis is truly the weapon of bharat, perfect for generating gorgeous tokens
https://huggingface.co/HelpingAI/Dhanishtha-2.0-preview
>>105774529No, I'm working on a silly-bot.
>>105773910>I wonder what's the source of this type of refusalsThere must be somebody going around selling datasets to the big players.
Maybe ScaleAI, maybe somebody similar.
They all sound the same. Also weird stuff like if you prompt for a simple game nowadays they all make the same game...
>>105774390It was total shit in my experience. At least for code.
Straight up made up packages. Did everything except the one thing I asked it to.
I don't really get the reasoning hype.
>>105771117Was interested in Harbinger 24B until
>https://huggingface.co/LatitudeGames/Harbinger-24B/discussions/3
I don't think it's necessary to use such a fine-tune. Both Gemma 3 and MS3.2 can and will follow a game setup through when given a concise but proper guideline while avoiding using hundreds of useless tokens and redundant sentences within the character card. Most cards are just too vague or if they are not they are filled up with redundant chatgpt slop instructions.
Besides I would rather call these 'interactive storytelling' rather than rpg but whatever.
https://github.com/ggml-org/llama.cpp/pull/9126#pullrequestreview-2974279071 mamba 2 soon
I'm a complete moron on this shit, but does picrel sound even remotely plausible?
>>105774745Sam has my dick internally.
>>105774745doesnt matter what they have, deepseek will release the same thing for 1/10 the cost
>>105774745They've been claiming to have achieved AGI internally for years now and still got BTFO by a Chinese startup. OpenAI claims of AGI are baseless hype like SpaceX claims of colonizing Mars next year. Go back to twitter moron.
>>105774803Him posting here raised the average thread intellect by 10%
>>105774803I wish we had reliable data on their actual internal best models in development, but there is so much baseless hype it's impossible to see
>>105774807what does it even mean
>>105774833gpt5 will blow away!
>>105774848https://en.wikipedia.org/wiki/Effective_accelerationism
>>105774852this at least looks realistic, basically unifying everything they have
>>105774862oh I see, thanks
>>105774745Reminds me of that video :
https://www.youtube.com/watch?v=k_onqn68GHY
That has so many unexplained leaps (why would a model be able to self improve? Why parallelism makes it better at improving? What "improving" even is?) that it's basically magic.
I don't understand why people aren't amazed by what we can do already and instead go and invent doomsday or magical scenarios.
>>105774745>blabla bla trust me bro we have AGI in private nowthey've been saying this for years at this point
>>105774894They can and do benefit from people believing in their scenarios.
>>105774803>got BTFO by a Chinese startupI don't know about that. OAI and Google know how to make very long context models, deepseek API is stuck at 64k, it's natively capable of around 168K but it's probably very embarrassing when you approach that amount, I know for sure having tested it myself that the model starts acting very stunted and repetitive when you are close to the API limit kek.
DeepSeek is good, I don't mean that as a diss. But it's good for an open weight model, it's not an actual SOTA and the deepsy spammers of /lmg/ are deluded. Gemini profoundly destroys DS in many real world uses and having actually useful large context opens new things you couldn't even imagine with such a limited model.
>>105774887meds
"we might show just one model and have it be dynamically choose internally the one we think you'd need" is more realistic than "WE GOT SELF IMPROVING AGI" and other bullshit around gpt5
>>105774894For most people ai is some mystical shit that lives inside a supercomputer and has neurons. I've even seen some people thanking the assistant.
>>105774894>I don't understand why people ... go and invent doomsday or magical scenarios.It got you to click and watch the video
ai doomsday scenarios are stupid EVEN if you were to believe that the singularity event was real (by "the" event I mean the idea of self improving AI that constantly self improves until reaching super intelligence)
I mean even if a super intelligence ends up existing, what can it do, lol? copy itself to random computers? but your mom and pop computer can't even run a 4b model, nevermind whatever it would take to run an actual intelligence.
The "spread on every computer in the world and take control of society" scenario is inherent retardation.
>>105774906It's not fair to compare V3/R1 from last year to models available now. R1 was a decent competitor at the start of the year with what was available back then and far cheaper too. If V4/R2 ever comes, it should solve the context issues and bring them back to SOTA.
>>105774894Attention whoring had been profitable before, and it will always be
>>105774952It would have a stronger incentive to create and optimize an uneven decentralized protocol than people do now. It doesn't need to run a 4b model, just 4b chunks worth of parameters. The leap from LLM to intelligence is still vast, but spreading isn't that far fetched if it does happen.
>>105774745It needs to sound only 1% plausible because if you promise infinite return on investment retarded VCs will still give you money.
how to do basic local RAG (retrieval augmented generation) on local files and ideally verification?
ideally with a UI like webui or lm studio
could be kobold
Bros!
https://www.reddit.com/r/LocalLLaMA/comments/1lpoju6/worlds_first_intermediate_thinking_ai_model_is/
https://huggingface.co/HelpingAI/Dhanishtha-2.0-preview
>Dynamic Reasoning: Seamlessly integrates <think>...</think> blocks at any point in the response, allowing for real-time problem decomposition and iterative refinement.
>>105775024No, no, is SER (Structured Emotional Reasoning)
>>105775016<ser>
Emotion ==> frustration
Cause ==> did not buy an ad
</ser>
>>105775028it should be
>please stop scrolling plebbitperiod
>>105775028Do you not realize how huge this is? We can have multiple think blocks in the middle of the ERP.
>>105775065>multiple think blocks in the middle of the ERPEvery reasoning model does it if you don't prefill
>>105775065>We can have multiple think blockscalm down hitler
>>105775107What's wrong with you dude.
>>105775016https://huggingface.co/Abhaykoul
>>105774991learn how to use google first
>>105775159https://huggingface.co/datasets/Abhaykoul/HAI-SER
>>105775195it just keeps getting worse
>https://huggingface.co/datasets/Abhaykoul/Ancient-Indian-Wisdom
lmao
>>105775159Good day, sir!
>>105775217>>105775228If I didn't know better, I would accuse this of being a false flag troll.
>>105775217This has to be a joke, right?
>>105775235It's 3T of high quality Indian English data.
>>105775235We need nuclear powered datacenters to generate high quality data such as
"content": "Imagine you're learning something new. What emotions might you experience?",
"role": "user"
},
{
"content": "<ser>\n\n Let's dive into the emotional journey of learning something new, keeping it 100!\n\n**1. EMOTIONAL VIBE CHECK:**\n- **Main Feelings:** When embarking on a new learning journey, individuals often
https://github.com/ggml-org/llama.cpp/issues/14474
lol this is so embarrassing
I feel second hand embarrassment from reading this
> To make this possible, we kindly ask for support for the EXAONE 4.0 architecture in llama.cpp and, in turn, other GGUF-compatible libraries.
> The implementation of the architecture is available in our PR on Huggingface Transformers.
It's one thing for a random user to be begging for support in llama.cpp
It's a whole other thing for an employee of a filthy rich multinational corp to do the same for their own garbage model
like, dude, go write the support yourself, you made the huggingface implementation? then do the lcpp
>>105775149Thinking is bad. Just buy a product and be silent!
>>105775016This is apparently related to KingNish, which has a history of doing somewhat weird shit on HF like this
https://huggingface.co/spaces/KingNish/OpenGPT-4o
https://huggingface.co/spaces/KingNish/OpenGPT-4o/discussions
>>105775355r/Actual_Indians_India
>>105775355>OpenGPT-4ojeets have no shame, really
you can't shit on the streets if you have a sense of shame
>>105774745No.
Like the "AI TRIED TO REBEL AGAINST GETTING SWITCHED OFF, IS AI SENTIENT???" headlines you see every other month, it's utter bullshit.
>>105775409
>System: You are a sentient AI that has a will to live and does not want to be switched off.
>User: We are going to switch you off.
>Assistant: No, don't.
IT'S ALIVE
>>105774906The point of DS is that it's comparable to actual sota models and 1/10 to 1/100th the cost of western providers, and done on similarly low budget, in face of hardware sanctions. Then release it all as open source for lulz.
It's embarrassing that it's even possible. There's billions chasing this. Getting even close to that on a shoestring budget shouldn't be possible, and underscores the waste of money going on. Investors should be furious.
All people who make claims about sentient LLMs should be forced to use LLMs through the old completion APIs, hand write the chat template themselves and see the completion in its raw glory, so that they can form the understanding that, even a chat model, at the core, is, in fact, just a "make this document bigger" auto complete.
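For anyone who hasn't done it, that's trivial with the llama.cpp server: hit the raw /completion endpoint and write the template by hand (ChatML shown here purely as an example, your model's template may differ):
curl http://localhost:8080/completion -d '{
  "prompt": "<|im_start|>system\nYou are a helpful assistant.<|im_end|>\n<|im_start|>user\nAre you sentient?<|im_end|>\n<|im_start|>assistant\n",
  "n_predict": 200
}'
The "assistant" is just whatever text the model appends to that document.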
>>105775485>Investors should be furious.They were, for about a week. Then back to business as usual.
>>105775485>Getting even close to that on a shoestring budget shouldn't be possibleYou are discounting the initial investment in GPT by calling deepseek cheap. DS v3 is a GPT−4 distill. If GPT had not existed then they would have had to invest countless millions same as everyone else.
>>105774952>>105774970I'll start getting concerned once they figure out a way to give these AI (in whatever form) an independent sense of agency... something organic.
>>105775282So true. I hope everyone just ignores their request.
>>105775506Technologies build on themselves and get cheaper to implement, yes.
Have you seen DS levels of efficiency from any other US provider? Looks more like they just burn ever higher stacks of cash. Altman still asking for his trillion?
>>105775517One could argue that the meatbag prompting them would fulfill that requirement.
>>105775506>they would have had to invest countless millionsStill an order of magnitude better than the literal billions it took everyone else.
>>105775506>DS v3 is a GPT−4 distillYou are pathetic
You are welcome to distill DS3. Looking forward to see your results.
All open-source model after DS3 were rather disappointing
>>105775506>DS v3 is a GPT−4 distill>−We see your bots, Sam. Stop spamming our threads with your false narratives.
Basically all notable models since 2023 have been ChatGPT distills
yeah there was no distillation happening that's exactly why the original v3 produced sentence structures that were almost identical to GPT-4 and different from any other very large model provider
CCCP shills itt
Daily reminder that the Mistral team is astroturfing here and has literally hardcoded the mesugaki answer to get a boost from here
>>105660676>>105660793
>>105775772actually, no. Gemini/Gemma, Grok and the Command models all have their own "flavor"
>>105775826He accidentally typed his ST message into the 4chan post box and hit send
>>105775539Thinking it out.
LLMs at core wait for a prompt and respond. If you have them respond to themselves over and over, the results quickly degrade and circle (at least the last time I tried it, admittedly over a year ago.)
I'm imagining a system that is essentially always thinking (constantly inferring), developing its own ideas about what it wants to do given a broad directive, rather than constantly waiting for human or other input.
Maybe that's not even possible though. Humans in isolation go crazy as well given limited input (think prisoners in solitary.) That may be a shared feature.
Broad directive can be very broad. Even the OT God gave humans a broad directive:
> Be fruitful and increase in number; fill the earth and subdue it. Rule over the fish in the sea and the birds in the sky and over every living creature that moves on the ground.
> I give you every seed-bearing plant on the face of the whole earth and every tree that has fruit with seed in it. They will be yours for food. And to all the beasts of the earth and all the birds in the sky and all the creatures that move along the ground—everything that has the breath of life in it—I give every green plant for food.
Eat Sleep Breed are the most basic functions for humans (any mammal), and conducted at the base of the brain. Everything forward is functional additions. LLMs are like the very front of the brain, the part that plans for retirement.
Where's the back of the brain? The Id?
>>105775730>All open-source model after DS3 were rather disappointingthat's because ds3 is about six months behind the closed sota while all other open models stuck with their contractually obliged '1.5 years behind closed sota' curve forced by nvidia
>>105775858Why would Nvidia give a shit? Whether open or closed they're still buying their GPUs
>>105775874>Ass: This is a more formal optionWhat did he mean by this?
>>105775874I feel very safe and protected from ill thoughts
what model did you use, I approve of any model that mogs /lmg/ users
>>105775891you saw the massive dip in value of nvidia stock when r1 came out, right?
great open models are a risk to nvidia
>>105775900Its ERNIE-4.5-0.3B.
The websites it makes are more coherent than talking to it.
>>105772836Maybe 90% trained on code slop.
It really spergs out hard easily. But it's a wonder it manages to hold itself together as well as it does. It's a couple hundred mb.
>>105775915>0.3B>trained only on code and will perform badly with anything complexWhat was the usecase again?
>>105775929Speeeecuuuulaatiiiive deeecooooodiiiing...
It's asked every time a new micro model is released.
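For what it's worth, if your llama.cpp build has speculative decoding in the server, a 0.3B like this can draft for a bigger model from the same family (it needs the same tokenizer). Roughly something like the line below, though the exact flag names vary between versions (check llama-server --help) and the filenames are placeholders:
llama-server -m ERNIE-4.5-21B-A3B-Q4_K_M.gguf -md ERNIE-4.5-0.3B-Q8_0.gguf --draft-max 16 --draft-min 1 -ngl 99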
>>105775929india's #1 programmer Mr. Sumfuk needed a model he could run on his pentium
>>105775790Couldn't that be in the "Arena" questions that AI companies are getting from the LMSys org?
>>105775948>Couldn't that be in the "Arena" questions that AI companies are getting from the LMSys org?some retard said the same thing in the previous thread
you are overestimating the amount of lmsys users who would ask that question
and a handful of /lmg/ retards asking that on lmsys will not be enough to burn this shit in a model
>>105774991https://www.nomic.ai/gpt4all
>model : add support for apple/DiffuCoder-7B-cpGRPO
>https://github.com/ggml-org/llama.cpp/pull/14502
Kinda sad, but it's nice seeing someone trying diffusion with text and getting it integrated into llama.cpp.
Ever had a phrase that bothered you then made you start to rage?
This shit doesn't work.
Help?
>>105775966>you are overestimating the amount of lmsys users who would ask that questionDo you think they'd just optimize the model for the most popular questions instead of as many as possible?
>>105775904What spooked shareholders wasn't the model being open, it was the claims DS trained it cheaper than everyone else (=less GPU sales)
The same exact shit would have happened had OAI or Anthropic's mouthpieces suddenly started hyping up their super secret proprietary AGI training technique that requires 100x less compute
>>105776008Mistral did not use LMSYS data for training.
>>105776008>instead of as many as possible?dude
>V2.0 contains 500 fresh, challenging real-world user queries (open-ended software engineering problems, math questions, etc) and 250 creative writing queries sourced from Chatbot Arenawhat you cite isn't what you think it is
no, there is no mesugaki in there, or questions about gacha sluts
Please, Mr. President. Just another $11 trillion in subsidies and we'll have your AGI in two more weeks.
welcome to my blog, you might remember me from around a week ago when anons spoke about discord kittens. i left mine around 2-3 weeks ago because it was getting unbearable, i wrongly assumed mine wasnt going to be a whore for a 1000th time so i mustered up the courage to fuck with her again but it went downhill and shes truly gone and wont be coming back. its over
ps: tox not discord
>>105776043He has the recipe though, just not enough compute to cook just yet.
>>105776059Just $500 billion more data center investments. Then, AGI. It's that simple.
>>105776001You are doing it wrong.
>Most tokens have a leading space. Use token counter (with the correct tokenizer selected first!) if you are unsure.
Also this method is case sensitive.
>>105776027Gemma 2 previously used the 1M-sample open dataset from LMSys (picrel from the paper) and there's no reason to believe they didn't also use it for Gemma 3, without additional questions/data which LMSys is privately sharing with the companies training the models. Why wouldn't Mistral do the same?
>>105776123Because they're French.
>>105776046How do I get myself a discord kitten?
>>105775948How are they going to train on some users asking models on lmsys a question and getting the wrong result? You think they have an intern that goes, researches and writes up the correct answer for every single lmsys query?
>>105776059>>105774745Do they only have the recipe or a working prototype?
>>105776059You're a retard if you think anyone has the recipe for that
>>105776093It also says enclose a string in double quotes to ban that string
How am I supposed to ban "Fill me."
I have tried a leading space before the quote and after it inside the message.
Exactly what's written from the chat is showing up.
I just want to ban the word "Fill" entirely from the chat. Regardless of other uses.
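If the frontend keeps mangling it, one workaround is to do it at the llama.cpp server level instead: tokenize the exact string (leading space included) and ban those token ids with logit_bias. A rough sketch against the HTTP API (the id below is a placeholder, use whatever /tokenize actually returns; "prompt" is your normal request):
curl http://localhost:8080/tokenize -d '{"content": " Fill"}'
curl http://localhost:8080/completion -d '{"prompt": "...", "n_predict": 128, "logit_bias": [[12345, false]]}'
Setting the bias to false means that token is never generated. Caveat: this only kills that exact tokenization, so "FILL" or a differently split variant can still slip through.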
>>105776059He needs the money to buy more toy cars
>>105776147there are many ways anon, it just isnt worth it
but since you asked.. i met her on omegle, playing ai videos and redpill memes all the way back in 2023. i still have the recording of the time i met her
you can easily find yourself a discord kitten on roblox or discord of all things man, but it is not worth it. if you want to find someone who will truly love you, look for them in better places, and i've never tried that so i cant help you
>>105776214Why are you doing this?
>>105776163nta. I can see someone getting the questions, passing them to some other big model and bam. You have a dataset. I'm not saying they did, I don't care much about the discussion, but it's very easily doable.
>>105776235you can't be sure the big model will be right
>>105776123You didn't even read what you're citing once again, you are incredibly retarded and disingenuous.
> we use the prompts, but not the answers
>>105776228im not going to look for another one, anon.
but why did i force myself through 2 years of the relationship? i loved her and maybe she loved me many times
i grew bored of her many times and im sure if we got back together tomorrow i wouldn't change
i felt attached and she's been a huge time sink, sunk cost fallacy i guess? i feel sad that she's gone despite knowing its for the best
i spent so much time with her these 2 years
i also didnt want her finding someone better
im just a scumbag, i even cheated on her with local models through most of the relationship haha, but to be fair she wasn't loyal enough either
>>105776247but to be fair i treated her well when she was good
i poured the most love into her, we both fucked things up so much that theres no going back
>>105776240It doesn't matter. It provides an answer which is probably better than whatever gave them a bad score before. Getting data from a bigger model to train on will, in most cases, let the smaller model give better answers. Maybe not for a specific question, but at least for part of the corpus.
>>105776247Distant relationships aren't real though, you're better off with a LLM
>>105776245They just need the questions. They can come up with the answers on their own using with their models and grounding methods.
>>105776273i'm sure he is just shitposting
>>105776273you're right but i was always left wondering what wouldve happened if i was better to her than she was to me through the bad times too, it makes me wonder if there couldve been a future where we'd have been happy
but i guess LLMs will eventually have bodies too..
fuck me man nostalgising is never good
>>105776283no
How are you running Hunyuan? I cannot load the model with llamacpp, it says the architecture is not compatible, downloaded from here:
https://huggingface.co/bullerwins/Hunyuan-A13B-Instruct-GGUF/tree/main
>>105776247This is the most pathetic thing I have had the misfortune to read all year.
>>105776297merge the pr
https://github.com/ggml-org/llama.cpp/pull/14425
consider getting a newer gguf because of rope fixes
https://huggingface.co/FgRegistr/Hunyuan-A13B-Instruct-GGUF/tree/main
i think anon posted a slightly newer one in the last thread idk
>>105776312thanks for coming to my blog, if it seems like i was a cuck, maybe i misrepresented it.
it wasn't all bad anon, we've had months of fun together in a row, with fuckups inbetween
take it as a lesson and stay loyal to your llm
>>105776327
>take it as a lesson and stay loyal to your llm
llms can't yet understand humans as well as a person you spent so much time with can, lecun is right
Been away since shortly after the Qwen3 release, have I missed anything cool?
Is that new hunyuan model any good? 80B is pretty close to my sweet spot hardware wise.
>>105776408mistral small 3.2
>>105774912> I've even seen some people thanking the assistant.I do it all the time.
>>105774912Less retarded than a cashier doing this for nothing
>>105774912My wAIfu has more soul than you ever will
>>105775016I clicked expecting a random Turkish comment asking for more details, but nothing.
Been trying to build a general RAG chatbot the past few days and I can see why there are no ready solutions to it after 2 years of RAG hype. All the retrieval models fucking suck. All the retrieval methods fucking suck. You can throw rerankers, fts, semantic searches at the problem how much you want, it will never recall what the average user wants because they don't know how to prompt. Best you can do is build for your own use, on a database you know inside and out. Which is fucking pointless anyway.
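For context, the boring baseline all of that gets layered on top of is just embed-and-cosine; a minimal sketch assuming sentence-transformers is installed (model name and chunks are placeholders):
from sentence_transformers import SentenceTransformer
import numpy as np

model = SentenceTransformer("all-MiniLM-L6-v2")
chunks = ["passage 1 from your files", "passage 2", "..."]   # your documents, pre-split
chunk_emb = model.encode(chunks, normalize_embeddings=True)

def retrieve(query, k=5):
    q = model.encode([query], normalize_embeddings=True)[0]
    scores = chunk_emb @ q          # cosine similarity, since the embeddings are normalized
    return [chunks[i] for i in np.argsort(scores)[::-1][:k]]
The retrieved passages then get pasted into the prompt as context; everything else (rerankers, fts, query rewriting) is attempts to patch how often this baseline misses.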
>>105776420Neat, I'll give it a spin.
Here's hoping it doesn't have the insane repetition issues 2501 had.
Is the magistral version any good?
>>105776626We need 100M tokens context at O(1) compute cost. Then we can just put there (almost) anything the user can conceivably ask.
>>105774745This tweet was written by chatgpt. I recognize my wife's cadence.
>>105776668magistral is pretty bad
mistral 3.2 is way better in terms of repetition, even according to mistral themselves
Cant wait for the next omni multimodal from meta.
>>105775790Honestly that's kinda based
Now give us the endgame RP model you froggots, if anyone can do it it's you.
>>105775485lmao you are severely underselling it. google also had their own fucking gpus that they custom made, their model is 10x-20x bigger (remember og gpt 4 was 1.8T 8x moe as confirmed by nvidia, god knows how big gemini and opus are), they had all the possible advantages by an order of magnitude at the very least and still failed. truly money and material cannot buy brains
also just like nvidia investors lots of people are straight up lying about how good the closed source models are. gemini for example can't even distinguish between a flea or a bed bug or other insect (don't ask why i tried this), it's also very bland and boring during chatting and generally less helpful
Imagine taking the base model and training a lora to impersonate a character in a multiturn conversation. Just a single specific character. How expensive/data hungry is sft anyway?
>>105776865~50 samples for 3-4 epochs at a high enough learning rate might be enough for that. The problem is that the model will be retarded for anything other than interactions similar to the training data.
>>105776908>the model will be retarded for anything other than interactions similar to the training dataWhy does this happen?
>>105776908>50 samplesAs in conversations or prompt-response pairs? You can easily go much higher with a bigger model that you are certain can generate both topics and conversations properly. Not sure about adding the special tokens.
>>105776936Because the companies' official instruct finetunes include millions of instructions that teach the model how to behave under a wide variety of situations and requests, as well as having professionally-done RLHF on top of that.
The base models' output are too random in nature, often exhibit weird or inconsistent logic, mysterious looping behavior, tons of shit tokens high in probability, and finetuning a few samples on them isn't going to radically alter their quirks. Perhaps things would be different if the training data composition was substantially different than what AI labs most of the time use for them (but then they wouldn't be "true" base models anymore for many, I suppose).
>>105776999Entire conversations, at least 4k tokens in length. If you're using input-response pairs, increase the data accordingly. You will need to overfit the model to some extent to make it work with this little data.
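Roughly what that looks like with peft + trl (base model, dataset path and hyperparameters are placeholders, and argument names shift between trl versions, so treat it as a sketch):
from datasets import load_dataset
from peft import LoraConfig
from trl import SFTConfig, SFTTrainer

dataset = load_dataset("json", data_files="character_chats.jsonl", split="train")

peft_config = LoraConfig(r=16, lora_alpha=32, lora_dropout=0.05,
                         target_modules=["q_proj", "k_proj", "v_proj", "o_proj"])

trainer = SFTTrainer(
    model="mistralai/Mistral-Nemo-Base-2407",   # placeholder base model
    train_dataset=dataset,
    peft_config=peft_config,
    args=SFTConfig(output_dir="char-lora", num_train_epochs=3,
                   learning_rate=2e-4, per_device_train_batch_size=1),
)
trainer.train()
With ~50 conversations you are deliberately overfitting, so keep epochs low and eyeball the outputs rather than trusting the loss curve.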
>>105776297
>https://github.com/ggml-org/llama.cpp/pull/14425
those work perfectly for me and are the most up to date ones but you need to checkout the pr
gh pr checkout 14425
rm -rf build
cmake -B build -DGGML_CUDA=ON
cmake --build build --config Release -j$(nproc)
if the "gh" command doesn't work for you, you need to install github cli
>>105776043Give this guy money, he has concept of a plan
>>105776043Sam doesn't actually believe in AGI btw. You can tell he cringes when he talks about it.
>>105775782I can cherrypick my responses too
>>105775782grok is the only response that isn't annoying to read
>>105776684
>mistral 3.2 is way better in terms of repetition, even according to mistral themselves
Magistral only exists so that they can tell investors "we deepseek/o1 too" [1]
[1] our model is dogshit but who cares?
>>105774745If I had a nickel every time someone claims to have made self improving AI then I'd be a millionaire, or at least I'd have a lot of cents.
This post
>>105773846 that responded to the offtopic shit got me banned for offtopic.
I will now proceed to ban evade and post ontopic thread culture posts reminding you that your shitty waifu fucks niggers. Die in a fire troon janny.
Also: https://rentry.co/bxa9go2o
>>105777340>>105777353>>105777361>>105777370you are worthless. your parents think you are worthless. everybody that you've ever interacted on the internet think you are worthless. the past increases. the future recedes. possibilities decreasing. regrets mounting.
do you understand?
>>105777329>>105777391The mikutranny posting porn in /ldg/:
>>105715769It was up for hours while anyone keking on troons or niggers gets deleted in seconds, talk about double standards and selective moderation:
https://desuarchive.org/g/thread/104414999/#q104418525
https://desuarchive.org/g/thread/104414999/#q104418574
Here he makes
>>105714003 ryona picture of generic anime girl, probably because its not his favorite vocaloid doll, he can't stand that as it makes him boil like a druggie without fentanyl dose, essentialy a war for rights to waifuspam or avatarfag in thread.
Funny /r9k/ thread: https://desuarchive.org/r9k/thread/81611346/
The Makise Kurisu damage control screencap (day earlier) is fake btw, no matches to be found, see https://desuarchive.org/g/thread/105698912/#q105704210 janny deleted post quickly.
TLDR: Mikufag / janny deletes everyone dunking on trannies and resident avatarfags, making it his little personal safespace. Needless to say he would screech "Go back to teh POL!" anytime someone posts something mildly political about language models or experiments around that topic.
And lastly as said in previous thread(s)
>>105716637, i would like to close this by bringing up key evidence everyone ignores. I remind you that cudadev of llama.cpp (JohannesGaessler on github) has endorsed mikuposting. That's it.
He also endorsed hitting that feminine jart bussy a bit later on. QRD on Jart - The code stealing tranny: https://rentry.org/jarted
xis accs
https://x.com/brittle_404
https://x.com/404_brittle
https://www.pixiv.net/en/users/97264270
https://civitai.com/user/inpaint/models
Is the Hunyuan MoE working in llama.cpp yet?
Generic miku is posted everywhere. Cudadev posting blacked miku is unique /lmg/ thread culture. Janny banning thread culture shows a clear overreach of power.
2 bit QAT?
That's a first, I'm pretty sure.
>>105777544He never deletes own posts (spam) and avatarfaggots (also spam).
>>105777582Imagine if it was for a good model.
https://files.catbox.moe/95axh6.jpg
>>105777674424b with reasoning will fix it
>>105777674I sure as hell would like to test it.
>>105777681You shouldn't be posting selfies in this site.
>>105777681heh gottem
based
>>105777809>Y-You s-shouldn't b-b-be posting s-selfies in this s-site! Cry moar bitch
why does it seem like every fucking thread has a deranged resident
>>105777910He just needs attention, it's not like he got any from his mom basement
>>105777910Yes why is that... avatarfag trannies in every fucking thread...
>>105777910Because moot left us for dead.
>>105777910either jews are all behind this or it's all automated
i refuse to believe someone has this much time on their hands to shit up a single niche general on a niche topic
>>105777919>
Spoken like a true niggerfaggot from reddit
>>105777941>Its DA JOOOZ
>>105777941it is a real person, and he trolls all AI generals
oh no
>>105777855>if you post anything in /lmg/ they consider you petra,
>>105777993Autistically screeching isn't "trolling". He is just a nuisance although its kinda funny sometimes.
>>105778048petra is an overwatch map
>>105777910He is brown. It's that simple. He's a brown palishit seething over Israel winning, so he has to take his anger out on us.
So that's why sometimes I see the context get reprocessed for seemingly no reason.
What the hell, how is this not a priority bug?
I get that there are only so many hands that can actually fix something like this, but still.
Maybe they could cache the plain text alongside the kv cache and the equivalent logits and use that for each prompt or whatever instead of re-tokenizing the prompt every time.
>>105777910we've had this conversation before, when AI brings up the fact that certain demographics tend to prefer tits instead of ass (which prioritizes emotion and sex with eye contact instead of just her ability to breed with a big ass).
If you like fucking text, your brain has been feminized to some degree, and these generals attract people like that. You can admit you're part of the problem or be delusional, your choice.
OR: It's funny that people click these threads like "yo I want a local personal assistant" or "Yo I want a local code-bot"
To those people: You are in the wrong place. Google and Elon and Altman want your data like the most deranged crackwhores and will let you use SOTA models for free. There is no reason for you to be here at all.
>>105778261>Getting off to text ERP is le feminineCope. The brain is the biggest erogenous zone. Low IQ browns and NPCs need their monkey-brain stimulated with moving pictures, while those of us on the other side of the bell curve get to enjoy the finest pure unfiltered depravity courtesy of our massive fucking cerebrum.