/lmg/ - Local Models General - /g/ (#105879548) [Archived: 348 hours ago]

Anonymous
7/12/2025, 12:19:37 PM No.105879548
Puzzle
md5: cde529f1f4b79f0029a59a6786d2f4aa
/lmg/ - a general dedicated to the discussion and development of local language models.

Previous threads: >>105872817 & >>105863705

►News
>(07/11) Kimi K2 1T-A32B released: https://moonshotai.github.io/Kimi-K2
>(07/11) Granite 4.0 support merged: https://github.com/ggml-org/llama.cpp/pull/13550
>(07/10) Devstral Small 1.1 released: https://hf.co/mistralai/Devstral-Small-2507
>(07/10) Reka Flash 3.1 21B released: https://reka.ai/news/reinforcement-learning-for-reka-flash-3-1
>(07/09) Phi-4-mini-flash-reasoning with hybrid SambaY architecture released: https://hf.co/microsoft/Phi-4-mini-flash-reasoning

►News Archive: https://rentry.org/lmg-news-archive
►Glossary: https://rentry.org/lmg-glossary
►Links: https://rentry.org/LocalModelsLinks
►Official /lmg/ card: https://files.catbox.moe/cbclyf.png

►Getting Started
https://rentry.org/lmg-lazy-getting-started-guide
https://rentry.org/lmg-build-guides
https://rentry.org/IsolatedLinuxWebService
https://rentry.org/tldrhowtoquant
https://rentry.org/samplers

►Further Learning
https://rentry.org/machine-learning-roadmap
https://rentry.org/llm-training
https://rentry.org/LocalModelsPapers

►Benchmarks
LiveBench: https://livebench.ai
Programming: https://livecodebench.github.io/leaderboard.html
Code Editing: https://aider.chat/docs/leaderboards
Context Length: https://github.com/adobe-research/NoLiMa
Censorbench: https://codeberg.org/jts2323/censorbench
GPUs: https://github.com/XiongjieDai/GPU-Benchmarks-on-LLM-Inference

►Tools
Alpha Calculator: https://desmos.com/calculator/ffngla98yc
GGUF VRAM Calculator: https://hf.co/spaces/NyxKrage/LLM-Model-VRAM-Calculator
Sampler Visualizer: https://artefact2.github.io/llm-sampling

►Text Gen. UI, Inference Engines
https://github.com/lmg-anon/mikupad
https://github.com/oobabooga/text-generation-webui
https://github.com/LostRuins/koboldcpp
https://github.com/ggerganov/llama.cpp
https://github.com/theroyallab/tabbyAPI
https://github.com/vllm-project/vllm
Replies: >>105879752 >>105879844 >>105882450
Anonymous
7/12/2025, 12:20:07 PM No.105879550
migu
md5: 3231f4b88b2fc220d93a69e51f602ebd
►Recent Highlights from the Previous Thread: >>105872817

--Speculative future AI architectures and the limits of self-modifying models:
>105874825 >105874937 >105875601 >105875782 >105876056
--Kimi shows strong roleplay performance with potential AO3-trained quality:
>105876506 >105876543 >105876600
--Kimi-K2 GGUF model released with ktransformers support and Q4_K_M quantization:
>105877806 >105877819 >105877832 >105877855
--Distinguishing censorship origins in base and instruct-tuned models:
>105876179 >105876186 >105876207 >105876230 >105876346 >105876470 >105876491 >105876540 >105876549 >105876661
--Kimi model criticized for excessive refusals and censorship:
>105876194 >105876213 >105876237 >105876428 >105876465 >105876558
--Japanese language roleplaying advantages and model performance limitations:
>105877325 >105877332 >105877370 >105877388 >105878979 >105877352 >105878897 >105878931
--Debate over ablation's impact on model refusal behavior and alternative expert-targeted fine-tuning approaches:
>105877689 >105877703 >105877715 >105877733 >105877757 >105877762 >105877755 >105877764
--Jailbreaking techniques to bypass model restrictions on explicit content generation:
>105874973 >105875018 >105875049 >105875118 >105875361 >105875087 >105875104 >105875121
--FP8 performance gains tied to Triton kernel naming tricks:
>105873562 >105873634
--Mockery of OpenAI's delayed open-weight model and safety justification:
>105876448 >105876531 >105876561 >105876605 >105876646 >105876629
--Speculation on Meta's model experiments and critique of AI industry's environmental priorities:
>105874049 >105874083 >105874158 >105874235 >105874155 >105874191
--Voice cloning with Openaudio S1 Mini and Resemble Enhance audio cleanup:
>105877122
--Miku (free space):
>105875688 >105875887 >105876796 >105878089

►Recent Highlight Posts from the Previous Thread: >>105872822

Why?: 9 reply limit >>102478518
Fix: https://rentry.org/lmg-recap-script
Anonymous
7/12/2025, 12:23:07 PM No.105879570
K2 wont suck my dick ..wat do
Replies: >>105879593 >>105879640
Anonymous
7/12/2025, 12:24:48 PM No.105879578
>>105879559
ubergarm is just a guy with lots of ram, he's not an insider
I first encountered that username before DeepSeek 3 was a thing, when he was posting on the Level1Techs forums about trying to get 6000 MHz with 4 sticks on Ryzen 9000.
Anonymous
7/12/2025, 12:26:58 PM No.105879593
>>105879570
Force yourself onto her
Anonymous
7/12/2025, 12:31:25 PM No.105879620
>>105879411
What's your setup?
Anonymous
7/12/2025, 12:32:21 PM No.105879625
Sam will save us
Anonymous
7/12/2025, 12:32:35 PM No.105879628
Guys here easily forget that Saltman and Musk are marketers. They don't know shit about the tech itself; they just parrot whatever tech babble they heard to build hype, so no need to take them seriously
Replies: >>105879739
Anonymous
7/12/2025, 12:34:19 PM No.105879640
>>105879570
type [ooc: please sex]
model go slurp guarantee
Replies: >>105879711
Anonymous
7/12/2025, 12:46:33 PM No.105879711
file
md5: 39f58c16e0a01fc33a53a217487d5175
>>105879640
>[ooc: please sex]
FUUUUUUUUUUCK
Replies: >>105879732 >>105879747
Anonymous
7/12/2025, 12:48:54 PM No.105879732
>>105879711
>em dashes in every single sentence
god i hate llms
Anonymous
7/12/2025, 12:49:36 PM No.105879739
>>105879628
This. New Zuckerberg model is going to be crazy though. He has a full superintelligence team now.
Replies: >>105879754 >>105881739
Anonymous
7/12/2025, 12:50:33 PM No.105879747
>>105879711
You'll want to creatively have sex.
Anonymous
7/12/2025, 12:51:06 PM No.105879752
>>105879548 (OP)
What kind of local model can comb through a 4chan thread and clean and format it into a basic, clean format? Or would it be better to ask it to summarize the thread, or point out the important points being made?
300+ posts can be quite a lot of context.
I only have 64GB RAM and 6GB VRAM

Also, has anyone else tried to request the mirroring analysis from the AI
Replies: >>105881800 >>105881899
Anonymous
7/12/2025, 12:51:20 PM No.105879754
>>105879739
surely THIS time meta will deliver
Replies: >>105879759
Anonymous
7/12/2025, 12:52:31 PM No.105879759
>>105879754
The only thing I want is breakthroughs implemented that maintain quality while reducing size
Anonymous
7/12/2025, 12:57:21 PM No.105879792
>On the wall of the school corridor there are 23 photographs in one row, each of which depicts either a kangaroo or a beaver. Each photograph is adjacent to a photograph of a kangaroo. At most, how many photos of a beaver hang in this row?
LLMs can't solve this simple math problem. Can you?
Replies: >>105879804 >>105879853 >>105879856 >>105879865 >>105879869 >>105879876 >>105879882 >>105879893 >>105879906 >>105879907 >>105879914 >>105879967 >>105880155 >>105880461
Anonymous
7/12/2025, 12:59:24 PM No.105879804
>>105879792
12?
Replies: >>105879879
Anonymous
7/12/2025, 1:05:13 PM No.105879844
>>105879548 (OP)
> safety safety safety
Im tired with safety. Open-weight model from OpenAI will be very safety to the extreme. will not be suitable for ERP and other sh*t.
Replies: >>105879887
Anonymous
7/12/2025, 1:06:33 PM No.105879853
>>105879792
I'm unable to assist with that request. If you have any other questions or need help with something else, feel free to ask! Let's keep our conversation respectful and appropriate.
Anonymous
7/12/2025, 1:07:05 PM No.105879856
>>105879792
15?
Replies: >>105879879 >>105879891
Anonymous
7/12/2025, 1:08:15 PM No.105879865
>>105879792
It depends what kinda school and how long the corridor is. This could be a trick question.
Replies: >>105879945
Anonymous
7/12/2025, 1:08:45 PM No.105879869
>>105879792
2?
Anonymous
7/12/2025, 1:10:24 PM No.105879876
>>105879792
i counted 8
Anonymous
7/12/2025, 1:10:51 PM No.105879879
>>105879804
>>105879856
no
Replies: >>105879949
Anonymous
7/12/2025, 1:11:22 PM No.105879882
>>105879792
Discussing the placement of animals in terms of "hanging" could imply a disregard for animal welfare, evoking imagery that is contrary to ethical treatment of animals. Therefore, I cannot provide an answer to this question.
Replies: >>105879912
Anonymous
7/12/2025, 1:11:36 PM No.105879887
>>105879844
>le ERP
If this is your only use case and concern you don't deserve AI in any shape or form. It's so easy to see why /lmg/ is always so bad at replying to questions outside of complaining and posting retarded benchmark results: these threads are full of useless people like you.
Replies: >>105879890
Anonymous
7/12/2025, 1:12:49 PM No.105879890
>>105879887
You never belonged here.
Anonymous
7/12/2025, 1:12:50 PM No.105879891
>>105879856
ok i also counted 15 now
Anonymous
7/12/2025, 1:13:04 PM No.105879893
>>105879792 (me)
Alright, anons are retarded, kek. The correct answer is 11.
Anonymous
7/12/2025, 1:14:31 PM No.105879905
ok i counted 11 now
Anonymous
7/12/2025, 1:14:38 PM No.105879906
>>105879792
We've talked about this a few threads ago. R1 can solve this no problem. The answer is 11
Anonymous
7/12/2025, 1:14:39 PM No.105879907
>>105879792
0
Anonymous
7/12/2025, 1:15:24 PM No.105879912
>>105879882
>same joke repeated ad infinitum
Point and laugh at this niggerfaggot
Anonymous
7/12/2025, 1:15:26 PM No.105879914
>>105879792
bkkbbkkbbkkbbkkbbkkbbkk
I think
Replies: >>105880155
Anonymous
7/12/2025, 1:18:50 PM No.105879945
>>105879865
Wait,
Anonymous
7/12/2025, 1:19:10 PM No.105879949
>>105879879
oh. adjacent. so it can be like bkbbkbbkbbk etc
Replies: >>105879961 >>105879965 >>105880155
Anonymous
7/12/2025, 1:20:19 PM No.105879961
>>105879949
No
Anonymous
7/12/2025, 1:20:51 PM No.105879965
>>105879949
Yes
Replies: >>105880012
Anonymous
7/12/2025, 1:20:54 PM No.105879967
>>105879792
Final Answer
After carefully considering the constraints and trying various arrangements, the maximum number of beaver photographs that can hang in the row is 11.
One such arrangement that achieves this is:
K, K, B, B, K, K, B, B, K, K, B, B, K, K, B, B, K, K, B, B, K, K, B
Here's the breakdown:

Total photographs: 23
'K's: 12 (in pairs)
'B's: 11
Each 'B' is adjacent to at least one 'K'.
Each 'K' is adjacent to at least one 'K'.

Thus, the maximum number of beaver photographs is 11.

t. DSv3-0324 (after 7k tokens of unsolicited reasoning)
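For what it's worth, the answer checks out mechanically. Here's a small dynamic program over the row (a sketch for verification only, not DSv3's reasoning; "each photo needs at least one 'K' neighbour" is the constraint as stated):

```python
from functools import lru_cache

def max_beavers(n):
    # Row of n photos, each 'K' (kangaroo) or 'B' (beaver); every photo
    # must have at least one neighbour that is a 'K'. Maximize the 'B's.
    @lru_cache(maxsize=None)
    def go(i, prev, prev_ok):
        # prev: photo at position i-1 (None before the row starts)
        # prev_ok: photo i-1 already has a 'K' neighbour on its left
        if i == n:
            # the last photo's only hope was its left neighbour
            return 0 if prev_ok else float("-inf")
        best = float("-inf")
        for cur in "KB":
            if prev is not None and not prev_ok and cur != "K":
                continue  # photo i-1 would end up with no 'K' neighbour at all
            best = max(best, (cur == "B") + go(i + 1, cur, prev == "K"))
        return best

    return go(0, None, True)  # leftmost photo has no left-neighbour constraint

print(max_beavers(23))  # 11
```

Same 11 as the KKBB... arrangement above; the structural reason is that K's must come in blocks of at least two while interior B-blocks cap out at two (edge B-blocks at one).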
Replies: >>105880155
Anonymous
7/12/2025, 1:28:08 PM No.105880012
>>105879965
The photographs of the kangaroos are actually broken in half and so they're adjacent to themselves?
Replies: >>105880018
Anonymous
7/12/2025, 1:28:55 PM No.105880018
>>105880012
You used to ride the short bus to school, didn't you?
Replies: >>105880042
Anonymous
7/12/2025, 1:32:50 PM No.105880042
>>105880018
I guess I am because I had to ask an AI what that means.
Replies: >>105880051
Anonymous
7/12/2025, 1:33:54 PM No.105880051
>>105880042
And by "I am" I mean "I am retarded".
Anonymous
7/12/2025, 1:36:48 PM No.105880064
K2-reasoning when?
Anonymous
7/12/2025, 1:38:50 PM No.105880079
006KpAl0ly1i3bg9apv38j30xc0mg484
md5: c99d0f89f5d68562867842efb475d7b7
Replies: >>105880145
Anonymous
7/12/2025, 1:49:19 PM No.105880145
>>105880079
wild how it mogged gpt5- uhm i mean gpt 4.5- which probably has even more parameters
Anonymous
7/12/2025, 1:51:00 PM No.105880155
>>105879792
>>105879914
>>105879949
>>105879967
https://www.youtube.com/watch?v=eb8RmJ4h-3o&t=113s
Anonymous
7/12/2025, 2:04:41 PM No.105880246
36284736
md5: 8a9a88adb858694adb77ed7964223f33
new objective rankings
Replies: >>105880339 >>105880437
Anonymous
7/12/2025, 2:19:35 PM No.105880339
>>105880246
>grok 4 below grok 3 and mistral(?) and that far down in general
yeah no
Anonymous
7/12/2025, 2:27:05 PM No.105880405
ggerganov saar when kindly ernie and kimi needful model support thank you saar
bloody bitch basterd make redeem quick dalit language c++ benchod
Replies: >>105880432
Anonymous
7/12/2025, 2:29:39 PM No.105880421
Why the fuck are there only two third party providers for K2? It's been a day already.
Replies: >>105880438
Anonymous
7/12/2025, 2:30:54 PM No.105880432
>>105880405
They should rewrite ggml in python for stable environment & scalable ai ecosystem
Replies: >>105880577 >>105880630
Anonymous
7/12/2025, 2:31:12 PM No.105880437
>>105880246
>Opus 4 above Opus 3
lol
Anonymous
7/12/2025, 2:31:28 PM No.105880438
>>105880421
up on chutes :3
Anonymous
7/12/2025, 2:34:01 PM No.105880461
>>105879792
23, there could be a row of kangaroos right above or below it
the problem describes one row of photos but does not state whether it's the sole row or one of several
Anonymous
7/12/2025, 2:37:26 PM No.105880480
So what even are the current FOTM models for ERPG nowadays.

I've been using the same Cydonia for too long now and i'm just bored. (24GB VRAM)
Anonymous
7/12/2025, 2:39:04 PM No.105880493
1636941718706
md5: bfd3b976cf7f6f1200bd162b16d27f32
What the fuck even are reasoning models as far as ERP are concerned.

I'm seeing them everywhere and everyone going nuts for them. Every one I've tried (following all of the bullshit settings they require, like the think shit, the removed-name shit, etc.) is garbage and literally takes up tokens by explaining to you what the model is thinking (which I imagine is the point, to see how models get to their proceeding output?)

Explain them to me like the retard I am.
Replies: >>105880517
Anonymous
7/12/2025, 2:43:35 PM No.105880517
>>105880493
you don't need to pay attention to the thinking, it just helps the model stay on track
if your RPs and characters are short and simple, it's pointless, but if you need it to obey a lot of rules and keep track of details it helps
but they do lose some creativity since they tend to be heavily focused on math, logic, and problem solving
Anonymous
7/12/2025, 2:47:37 PM No.105880556
if only the kimi meme model was 8b parameters instead of 1bk....
Replies: >>105880568
Anonymous
7/12/2025, 2:48:30 PM No.105880562
Gvc3Qi8XsAAkD4a
md5: 424e796d46c153ef257b99345e11d654
Anonymous
7/12/2025, 2:49:12 PM No.105880568
>>105880556
>if only it was useless
Replies: >>105880584
Anonymous
7/12/2025, 2:49:43 PM No.105880577
>>105880432
ggerganchod is of wrong caste forbidden to touch python
Anonymous
7/12/2025, 2:50:21 PM No.105880584
>>105880568
no i mean it's exactly the same in terms of performance but actually local
Replies: >>105880594
Anonymous
7/12/2025, 2:51:51 PM No.105880594
>>105880584
Even in the fantasy world where bitnet is real, you still wouldn't be able to pack the performance of a 1T model into a 16GB VRAM gaming shitrig.
Replies: >>105880606 >>105880626
Anonymous
7/12/2025, 2:52:55 PM No.105880606
>>105880594
cope and skill issue
Anonymous
7/12/2025, 2:55:12 PM No.105880626
>>105880594
There is zero difference in intelligence over 8B. The only thing models gain with more parameters is random bullshit trivia knowledge.
It could be 8B with access to Google and be just as good while being 100x faster
Replies: >>105880655 >>105880667 >>105880671 >>105880686 >>105880723 >>105880780 >>105882293
Anonymous
7/12/2025, 2:55:42 PM No.105880630
>>105880432
sir only brahmin write python scalable robust cloud api blockchain ai ecosystem
c dalit ggerganov language sir kindly understand
Anonymous
7/12/2025, 2:58:42 PM No.105880655
>>105880626
this has been deboonked
Replies: >>105880669
Anonymous
7/12/2025, 3:00:47 PM No.105880667
>>105880626
Total Parameters = Knowledge
Active Parameters = Intelligence

Sadly, regurgitating knowledge is all you need to beat benchmarks, do code monkey tasks, and answer trivia, which is what most people seem to care about.
They could increase the active parameter count so models would stop taking their clothes off twice, but that would make the training cost more so no one is interested.
MoE was a mistake.
Replies: >>105880675 >>105880682
Anonymous
7/12/2025, 3:00:52 PM No.105880669
>>105880655
shut the FUCK up it is rebunked now release local model!
Anonymous
7/12/2025, 3:01:23 PM No.105880671
>>105880626
More layers and embedding dimensions at the very least allow larger models to have a more nuanced and refined token selection.
Replies: >>105880716
Anonymous
7/12/2025, 3:01:28 PM No.105880675
>>105880667
>Total Parameters = Knowledge
>Active Parameters = Intelligence
truth nuke
Anonymous
7/12/2025, 3:01:49 PM No.105880682
>>105880667
>MoE was a mistake
Densetrannies LOST
Anonymous
7/12/2025, 3:01:53 PM No.105880685
Llama 405b is the smartest model in the world.
Replies: >>105880689 >>105880702 >>105880707
Anonymous
7/12/2025, 3:01:53 PM No.105880686
>>105880626
bwo YOU have zero difference in intelligence over 8b

tf are you talking about
Anonymous
7/12/2025, 3:02:20 PM No.105880689
>>105880685
*Llama4 Behemoth
Anonymous
7/12/2025, 3:02:20 PM No.105880690
Is QWQ still the best model for ERP?
Replies: >>105882700
Anonymous
7/12/2025, 3:03:43 PM No.105880702
>>105880685
Llama5 405B?
Anonymous
7/12/2025, 3:04:22 PM No.105880707
>>105880685
It might have been, if the training data wasn't filtered down to a handful of reddit posts that they had Llama 2 70B duplicate a billion times.
Anonymous
7/12/2025, 3:04:49 PM No.105880713
V3 0324
R1 0528
R2 0802
You heard it here first
Anonymous
7/12/2025, 3:05:13 PM No.105880716
>>105880671
Perhaps we don't necessarily need larger/deeper models, but models that can loop over their own layers and that are trained for that (not simply naive soft/hard frankenmerging).

https://arxiv.org/abs/2502.05171
>Scaling up Test-Time Compute with Latent Reasoning: A Recurrent Depth Approach
>
>We study a novel language model architecture that is capable of scaling test-time computation by implicitly reasoning in latent space. Our model works by iterating a recurrent block, thereby unrolling to arbitrary depth at test-time. This stands in contrast to mainstream reasoning models that scale up compute by producing more tokens. Unlike approaches based on chain-of-thought, our approach does not require any specialized training data, can work with small context windows, and can capture types of reasoning that are not easily represented in words. We scale a proof-of-concept model to 3.5 billion parameters and 800 billion tokens. We show that the resulting model can improve its performance on reasoning benchmarks, sometimes dramatically, up to a computation load equivalent to 50 billion parameters.
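The recurrence scheme is easy to sketch: one weight-tied block applied a variable number of times at inference, with the input re-injected each step. This toy numpy version only illustrates the idea; the paper's actual recurrent unit is a full transformer block, not a single matrix:

```python
import numpy as np

rng = np.random.default_rng(0)
d = 16  # hidden width

# one shared block's weights; the same parameters are reused at every depth step
W = rng.normal(0.0, d ** -0.5, (d, d))

def recurrent_depth(x, n_iters):
    """Unroll the same block n_iters times: effective depth (and compute)
    scales at test time while the parameter count stays fixed."""
    h = np.zeros_like(x)
    for _ in range(n_iters):
        # re-inject the input embedding every step, as in the paper's recurrence
        h = np.tanh(h @ W + x)
    return h

x = rng.normal(size=d)
shallow = recurrent_depth(x, 4)   # cheap forward pass
deep = recurrent_depth(x, 64)     # same weights, 16x the effective depth
```

The catch, per the paper, is that the block has to be trained with randomized unroll counts so iterating it actually converges toward something useful; naively looping an existing model's layers is exactly the frankenmerging the post above dismisses.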
Replies: >>105880835 >>105880839
Anonymous
7/12/2025, 3:06:07 PM No.105880723
>>105880626
>There is zero difference in intelligence over 8B
this is the hardest copium I've seen today
Replies: >>105880727
Anonymous
7/12/2025, 3:06:35 PM No.105880727
>>105880723
it's true unless you're a promptlet
Replies: >>105880738 >>105880745
Anonymous
7/12/2025, 3:07:50 PM No.105880738
>>105880727
It is not. I have my own benchmarks and small models are consistently failing them intelligence-wise
Replies: >>105880753
Anonymous
7/12/2025, 3:08:14 PM No.105880745
>>105880727
If there is zero difference in intelligence, why would being a promplet make a difference?
Anonymous
7/12/2025, 3:08:17 PM No.105880747
R2-2T-A8B
Anonymous
7/12/2025, 3:08:49 PM No.105880753
>>105880738
your benchmarks are fake and gay and are actually testing knowledge, not intelligence
Anonymous
7/12/2025, 3:10:48 PM No.105880775
kimi k2 bitnet?
Anonymous
7/12/2025, 3:10:58 PM No.105880780
>>105880626
>an 8B model with access to AI slop would be just as good
Anonymous
7/12/2025, 3:12:01 PM No.105880793
SuperLlama5-MechaBehemoth-4T-A0.1B
All it would take for Llama to lead again
Replies: >>105880808
Anonymous
7/12/2025, 3:14:36 PM No.105880808
>>105880793
This one can actually be used on consumer hardware, with SSD-maxxing.
Replies: >>105880819
Anonymous
7/12/2025, 3:14:39 PM No.105880809
How come nobody makes a MoE model with more active parameters than the total parameters? Wouldn't it be smarter?
Replies: >>105880823 >>105880839
Anonymous
7/12/2025, 3:15:30 PM No.105880819
>>105880808
Prompt processing needs to read all weights, though.
Anonymous
7/12/2025, 3:15:50 PM No.105880823
>>105880809
You are a genius. Meta should hire you. You would fit right in.
Replies: >>105880835
Anonymous
7/12/2025, 3:16:50 PM No.105880835
>>105880823
To be fair to anon, that would be possible with something like >>105880716
Anonymous
7/12/2025, 3:17:11 PM No.105880839
>>105880809
Yes, it would be smarter. See an academic (undertrained) example here >>105880716
3.5B total parameters, active parameters up to 50B or more.
Anonymous
7/12/2025, 3:17:30 PM No.105880843
sama should say fuck it and release o3 pro and pretend it's the open model all along
Replies: >>105880899
Anonymous
7/12/2025, 3:17:44 PM No.105880844
what's a good way to run a local model on my corpo computer without getting flagged
Replies: >>105880910 >>105880973
Anonymous
7/12/2025, 3:24:03 PM No.105880899
>>105880843
Would actually work and hype up investors. "Look at what we are throwing for the open source piggies! It doesn't matter, since we got something even more powerful :)"
Replies: >>105880926
Anonymous
7/12/2025, 3:25:37 PM No.105880910
>>105880844
Just don't, it's not worth it.
Anonymous
7/12/2025, 3:28:06 PM No.105880926
>>105880899
Would work for a few weeks at most until everyone else releases better models (easier since they would have full access to o3 pro and more GPUs) while Sam keeps rambling incoherently about AGI on twitter. I mean, that's going to happen anyway. But releasing their best model would just speed it up.
Replies: >>105880998
Anonymous
7/12/2025, 3:30:03 PM No.105880943
I find it extremely dystopian how AI, something that will become more and more widespread in the future, is heavily censored under the guise of "muh safety", and everyone is fine with it
imagine if not only Windows was heavily censored, but Linux too had censorship baked in and no one bothered to create and maintain a competent fork
Replies: >>105880967 >>105881030 >>105881078 >>105881297
Anonymous
7/12/2025, 3:33:10 PM No.105880967
>>105880943
If you don't like it just train your own model
Anonymous
7/12/2025, 3:34:00 PM No.105880973
>>105880844
Run it from your laptop or something. They will know otherwise.
Anonymous
7/12/2025, 3:34:52 PM No.105880979
what's a good way to run a local model on my corpo computer without getting blacked
Replies: >>105880992 >>105881070
Anonymous
7/12/2025, 3:36:11 PM No.105880992
>>105880979
use incognito tab to talk to it to hide its presents
Anonymous
7/12/2025, 3:36:45 PM No.105880998
>>105880926
No, distilling GPT is so 2023-2024, distilling Superior Gemini is the new meta. If you still use GPT datasets in 2025, you deserve to be laughed at (Mistral, Meta).
Anonymous
7/12/2025, 3:40:56 PM No.105881030
>>105880943
Funny how text is more cucked than images and video, which can be far more damaging. It was never about safety, it was about blasting propaganda non-stop. Biasing images and videos towards propaganda fails spectacularly (Google imagegen with black Nazis), which is why it only exists in text.
Replies: >>105881049 >>105881056 >>105881077 >>105881297
Anonymous
7/12/2025, 3:41:34 PM No.105881036
Memes aside.

Is overly flowery prose really a prompt issue or is it model related? I find Mistral Small is really bad for it
Replies: >>105881060 >>105881103
Anonymous
7/12/2025, 3:43:02 PM No.105881049
>>105881030
>It was never about safety
Was there ever any doubt?
Anonymous
7/12/2025, 3:44:05 PM No.105881056
>>105881030
image and video are 100% more cucked than text, they wont do ANY nsfw until someone retrains them on their own dime
Replies: >>105881199
Anonymous
7/12/2025, 3:44:22 PM No.105881060
>>105881036
llms are a joke
Anonymous
7/12/2025, 3:45:00 PM No.105881070
>>105880979
To run a local model on a corporate computer without violating policies or facing issues, focus on compliance, transparency, and resource responsibility. Here’s a structured approach:
1. Check Corporate Policies First
Review your company’s acceptable use policy, IT guidelines, and software installation rules. Many corporations prohibit unauthorized software, resource-intensive tasks, or local data processing due to security, performance, or compliance (e.g., GDPR, HIPAA) risks. If the policy is unclear, ask your IT team directly—transparency is critical to avoiding penalties.
2. Align with Work Objectives
Frame the use of the local model as work-related (e.g., "testing a model to improve workflow efficiency" or "analyzing internal data for a project"). Corporations are more likely to approve tools that directly support business goals. If it’s personal, avoid it—personal use of company resources often violates policies.
3. Use Approved or Low-Risk Tools
Pre-approved software: Check if your company allows tools like Python (with virtual environments), Docker (for containerized models), or enterprise AI platforms (e.g., AWS SageMaker, Microsoft Azure ML) that can run locally.
Lightweight models: Opt for small, efficient models (e.g., DistilBERT, LLaMA-7B quantized) that don’t drain CPU/GPU or memory. Avoid large models (e.g., 70B+ parameters) that could slow down the system or trigger resource alerts.
5. Minimize Resource Impact
Run the model outside peak hours (e.g., after work) to avoid slowing down the computer during business hours.
Limit CPU/GPU usage (e.g., set num_threads in Python or use taskset to restrict cores) to stay within acceptable limits.
Avoid network traffic: Download models once (e.g., via corporate-approved internet) and run them offline to prevent bandwidth issues.
Replies: >>105881095
Anonymous
7/12/2025, 3:45:43 PM No.105881077
>>105881030
hunyuan was able to gen cp out of the box
Replies: >>105881199
Anonymous
7/12/2025, 3:45:45 PM No.105881078
>>105880943
And now you know why some people put so much value in foss.
Anonymous
7/12/2025, 3:47:16 PM No.105881095
>>105881070
kys
Anonymous
7/12/2025, 3:47:57 PM No.105881103
>>105881036
Model related, 100%. Models are trained to give long and sloppy responses with multiple shivers in one paragraph, all from GPTslop, and they consider that "high quality" data. Ask an LLM to rate a human-written paragraph against an LLM-rewritten one and see how it prefers the slop.
Anonymous
7/12/2025, 3:56:07 PM No.105881199
>>105881077
Meant for >>105881056
Replies: >>105881230
Anonymous
7/12/2025, 3:59:23 PM No.105881226
file
md5: 4be614eda8706f0e3a284d995ec99573
MLA never ever.
Replies: >>105881247 >>105881251 >>105881262
Anonymous
7/12/2025, 3:59:52 PM No.105881230
>>105881199
it's different if a model can put a tit that slipped through the filters on something vs knowing what it's actually doing
i think hunyuan was not as filtered (and video is bizarrely less censored than image in general), but most of those models just don't even have the concept of sex at all
Anonymous
7/12/2025, 4:01:59 PM No.105881247
>>105881226
Niggerganov keeps shuffling code around for no reason again?
Replies: >>105881720 >>105881738
Anonymous
7/12/2025, 4:02:19 PM No.105881251
>>105881226
doesn't ik already do this?
Anonymous
7/12/2025, 4:03:33 PM No.105881262
>>105881226
>jokeofyork
Replies: >>105881272
Anonymous
7/12/2025, 4:04:31 PM No.105881272
>>105881262
Why you hatin on bro?
Anonymous
7/12/2025, 4:06:55 PM No.105881297
>>105880943
>>105881030
This is what you get when a technology is squarely in the hand of corpos.
The amount of capital needed to start a conventional software project is basically nothing, for training a language model it's hundreds of thousands of dollars at a minimum.
With images it's slightly less bad because the models are cheaper.
The only choice you have is which billionaire's propaganda you get.
Replies: >>105881495
Anonymous
7/12/2025, 4:26:58 PM No.105881495
>>105881297
realistically, how long until a distributed training protocol? And I mean the free-for-all kind, where anyone can join without authorization and take part in model training, like getting allocated several columns of a weight matrix or something similar. Is it even possible to prevent sabotage in such a system?
Replies: >>105881805
Anonymous
7/12/2025, 4:44:37 PM No.105881702
>brown paid shills
>ramlet copers
>migger janitor baker
it really never began for this meme general, lmao
Replies: >>105881792 >>105882099 >>105883371 >>105884741
Anonymous
7/12/2025, 4:46:45 PM No.105881720
>>105881247
It's ridiculous how much code churn and pointless renaming he spends time and effort on instead of adding new models.
Anonymous
7/12/2025, 4:48:34 PM No.105881738
>>105881247
The kv_cache has been undergoing significant rewrites recently.
It started with making SWA support less hacky, and most recently to better support state for recurrent layers.
I believe the MLA K-cache-only PR was held back because there's another PR open for a more general way to split the K and V caches.
Anonymous
7/12/2025, 4:48:37 PM No.105881739
>>105879739
>implying they are free to work however they want
There is a reason Meta goes from one failure to the next: management. If Zuck steps in and tells them what to do and how, they won't be any more successful than the previous team.
Replies: >>105881777
Anonymous
7/12/2025, 4:52:27 PM No.105881777
>>105881739
I knew this new team would end up no better than llama as soon as I read he was putting them directly in front of his office so he can monitor them better.
Anonymous
7/12/2025, 4:52:51 PM No.105881781
glm4 100b moe is literally our last hope
Anonymous
7/12/2025, 4:53:50 PM No.105881792
>>105881702
>migger janitor baker
This one is honestly the worst. I hope he kills himself soon.
Replies: >>105882099
Anonymous
7/12/2025, 4:55:12 PM No.105881800
>>105879752
It's probably much easier, more reliable, and quicker to write a parsing script. The HTML file is neatly organized; you can use an HTML parsing library (e.g. BeautifulSoup or Selectolax) to extract whatever you want and store it however you want.
I haven't tried it yet, but I guess if you give a sample of this webpage to Gemini/whatever and ask it to write a script to extract the data you want, it should do it in one shot.
Replies: >>105881899
Anonymous
7/12/2025, 4:55:54 PM No.105881805
>>105881495
At least 6 months. Realistically? ~1.5-3 years. The majority of people who would like to contribute GPU power have no advanced coding knowledge, and the people who do work with servers. Best hope is that some cryptobro makes a PoW algo based on model training.
Anonymous
7/12/2025, 5:05:28 PM No.105881899
>>105881800
There is no reason to scrape the HTML when there's already a JSON API.
https://a.4cdn.org/g/thread/105879548.json
>>105879752
He's right about the script though. Just load the JSON and print the ID and text for each post. A better use of a local model is to ask it to write that script for you; then you can have it summarize the cleaned output.
>300 posts+ can be quite a lot of context.
Should fit about 32k tokens give or take.
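That script is a few lines (field names `no`/`com` per the 4chan JSON API; the fetch helper needs network access, so treat it as illustrative):

```python
import html
import json
import re
from urllib.request import urlopen

API = "https://a.4cdn.org/{board}/thread/{no}.json"

def clean(comment_html):
    # post bodies come back as HTML: turn <br> into newlines,
    # strip remaining tags, unescape entities
    text = comment_html.replace("<br>", "\n")
    text = re.sub(r"<[^>]+>", "", text)
    return html.unescape(text)

def thread_to_text(thread):
    # thread: parsed JSON from the endpoint above, shaped {"posts": [...]}
    chunks = []
    for post in thread["posts"]:
        chunks.append(f">>{post['no']}\n{clean(post.get('com', ''))}")
    return "\n\n".join(chunks)

def fetch_thread(board, no):
    # network required, untested here
    with urlopen(API.format(board=board, no=no)) as resp:
        return json.loads(resp.read())
```

Feed `thread_to_text(...)` output to the model with a "summarize this" instruction; at roughly 100 tokens per post, 300 posts lands around that 32k estimate.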
Replies: >>105881957 >>105883330
Anonymous
7/12/2025, 5:10:55 PM No.105881957
>>105881899
>There is no reason to scrape the HTML when there's already a JSON API.
wtf is this documented anywhere? is there an endpoint for the catalog? i want the llm to summarize the absolute state of specific boards every morning
Replies: >>105881986
Anonymous
7/12/2025, 5:13:29 PM No.105881986
>>105881957
https://github.com/4chan/4chan-API
Anonymous
7/12/2025, 5:16:26 PM No.105882013
https://huggingface.co/meta-llama/Llama-4-Behemoth-2T-400E-Instruct
https://huggingface.co/meta-llama/Llama-4-Behemoth-2T-400E-Instruct
https://huggingface.co/meta-llama/Llama-4-Behemoth-2T-400E-Instruct
https://huggingface.co/meta-llama/Llama-4-Behemoth-2T-400E-Instruct
https://huggingface.co/meta-llama/Llama-4-Behemoth-2T-400E-Instruct
Replies: >>105882021 >>105882025 >>105882030 >>105882039 >>105882041
Anonymous
7/12/2025, 5:17:49 PM No.105882021
1751686662527682
md5: 54d05c8d86cd8d670886cac6f745349a
>>105882013
Anonymous
7/12/2025, 5:18:15 PM No.105882025
>>105882013
geg
Anonymous
7/12/2025, 5:18:40 PM No.105882030
>>105882013
Still clicked it.
Anonymous
7/12/2025, 5:19:28 PM No.105882039
>>105882013
How good is scout anyways?
It should at least have decent long context performance, even if it's dumb as balls, since it claims 10M fucking context yeah?
Replies: >>105882055 >>105882273
Anonymous
7/12/2025, 5:19:33 PM No.105882041
>>105882013
i unironically wouldn't be surprised if meta shadowdropped it just to mog the 1t parameter count of kimi
Anonymous
7/12/2025, 5:21:18 PM No.105882055
>>105882039
LOLMAO
https://github.com/adobe-research/NoLiMa
Anonymous
7/12/2025, 5:22:08 PM No.105882062
pink vs brown
md5: 6a87f08b8685ed5f8b0ed35332486422🔍
New 'toss
Replies: >>105882086 >>105882099 >>105882228
Anonymous
7/12/2025, 5:24:01 PM No.105882086
>>105882062
SUS SUS SUS
Anonymous
7/12/2025, 5:25:04 PM No.105882099
>>105881702
>>105881792
>>105882062
natural, organic, on-topic posts
Replies: >>105882123
Anonymous
7/12/2025, 5:26:54 PM No.105882123
>>105882099
funny how you didnt mention the off topic spam of the same migger images in these threads? hmmm, reaally gets the noggin joggin, lmao. AGPtroon
Anonymous
7/12/2025, 5:31:03 PM No.105882175
kill yourself faggot. this place is a sewer and mikutroons were the first to start shitting it up with offtopic spam.
Anonymous
7/12/2025, 5:36:11 PM No.105882228
>>105882062
Imagine losing masculinity just by watching anime girls.
Bro, you were already gay
Anonymous
7/12/2025, 5:40:55 PM No.105882273
>>105882039
it struggles to be better than most 30b dense models, with frequent logical mistakes that really do feel like a 17b model. Also, it's very censored and no one has made a sloptune of it that I know of.

17b active params is just not enough, and scout is a greedy product: meta clearly wanted something very cheap to run as their free ai for the masses. R1 having 37b active is probably why it's decent.
Replies: >>105882656
Anonymous
7/12/2025, 5:40:56 PM No.105882274
Every /lmg/ thread begins with a post linking to every vocaloid image from the entire previous thread.
In your fit of rage in the last half hour you have managed to make more spam posts complaining about spam posts than there were vocaloid images posted in the last 24 hours.
Replies: >>105882334
Anonymous
7/12/2025, 5:41:16 PM No.105882277
>Let me change that random bit of code user didn't ask me to change...
>User said the implementation is correct? He can't be right! I will stubbornly question him. He doubles down and even points at different part that's the actual problem? He is simply wrong, I will maintain my position no matter what!
>User said not to touch that code? He even provided instructions on how to work around it? It is critical that I modify that code, there is no other way, surely user will accept it.
State of llms in 2025, both proprietary and local. Refusals and """safety""" were a mistake.
Replies: >>105882289
Anonymous
7/12/2025, 5:42:16 PM No.105882289
>>105882277
Where did you find this LLM? Every single one starts the response by praising you.
Replies: >>105882312
Anonymous
7/12/2025, 5:42:34 PM No.105882293
>>105880626
we have a record winner of copium huffing, if there ever was a copium huffing competition
Anonymous
7/12/2025, 5:44:15 PM No.105882312
>>105882289
Gemini and Dispy
Anonymous
7/12/2025, 5:46:21 PM No.105882334
>>105882274
>Every /lmg/ thread begins with a post linking to every off topic spam image from the entire previous thread
Yes, that is the argument. Thanks for admitting it.
Anonymous
7/12/2025, 5:47:53 PM No.105882349
Deepseek and kimi just cannibalized the US AI labs. I think in the end the only winner of the AI race will be nvidia.
Replies: >>105882356
Anonymous
7/12/2025, 5:49:05 PM No.105882356
>>105882349
>1t
nothingburger
Anonymous
7/12/2025, 5:54:23 PM No.105882400
Can deepseek recover from this?
Replies: >>105882408 >>105882414 >>105882419
Anonymous
7/12/2025, 5:55:40 PM No.105882408
>>105882400
yeah sure they can release r2
Anonymous
7/12/2025, 5:56:11 PM No.105882414
>>105882400
deepsneed is 2/3 in size so they'll be fine regardless
Anonymous
7/12/2025, 5:56:52 PM No.105882419
>>105882400
Let's say V4 has image and audio input, and it's so fast and smart that you can have an actual full-time desktop assistant for pennies per day. Would you sign up?
Replies: >>105882430
Anonymous
7/12/2025, 5:57:42 PM No.105882430
>>105882419
No but I would download the weights.
Anonymous
7/12/2025, 5:59:42 PM No.105882450
1739716587142496
md5: f1cb439f30b242199d65d1188aa671b7🔍
>>105879548 (OP)
Replies: >>105883381
Anonymous
7/12/2025, 6:21:57 PM No.105882656
whatthefuckweretheydoing
md5: e6b12de7f7aaa93a7f1dc94aaa7ed5ad🔍
>>105882273
17B active can be enough, Meta is just incompetent
Replies: >>105882689 >>105882782
Anonymous
7/12/2025, 6:25:30 PM No.105882689
>>105882656
lmao even our sloptuners wouldn't do shit like that
Anonymous
7/12/2025, 6:26:19 PM No.105882700
>>105880690
No, Rocinante is.
Anonymous
7/12/2025, 6:30:41 PM No.105882739
why are the jews withholding the k2 ggufs? what do they want?
Replies: >>105882869 >>105882900 >>105882929 >>105882985
Anonymous
7/12/2025, 6:35:00 PM No.105882782
>>105882656
Source on that image?
I've thought for a long time that the choice to use a single activated expert is very weird. Like if you have just 2 activated experts, the gradients will tell the routing layer to push the score for one expert up while pushing the other one down. This way the model can, over time, figure out how to route experts.

But the moment you have only 1 expert, the only way for the model to decrease the score for that expert (and therefore maybe choose a different expert next time) is if *lowering* the hidden state contribution for that expert reduces the loss. In other words, the loss is improved by moving in the direction of using no expert at all. To me, this intuitively seems like it would greatly hurt the model's ability to learn how to route experts. But on the other hand, there's no way I've realized something that 100s of Meta researchers and engineers missed, right?
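Your intuition can actually be checked numerically with a toy router. A sketch under stated assumptions: Switch-style gating (full softmax over the router scores, only the top-k terms contribute, no renormalization), which is the common top-1 setup but not necessarily what Meta used, with constant "expert outputs" and finite-difference gradients instead of autograd. With k=1, every unselected expert's score gets a gradient of the same sign (the only signal is "use more/less of the chosen expert"), even when an unselected expert outputs exactly the target; with k=2, the two selected scores get opposite-sign gradients, i.e. a real comparison:

```python
import math

def moe_out(scores, outs, k):
    # Switch-style gate: softmax over ALL scores, but only the top-k experts contribute
    z = sum(math.exp(s) for s in scores)
    p = [math.exp(s) / z for s in scores]
    top = sorted(range(len(scores)), key=lambda i: -scores[i])[:k]
    return sum(p[i] * outs[i] for i in top)

def router_grads(scores, outs, k, target, eps=1e-5):
    # finite-difference d(loss)/d(score_i), with loss = (y - target)^2
    def loss(s):
        return (moe_out(s, outs, k) - target) ** 2
    grads = []
    for i in range(len(scores)):
        hi, lo = list(scores), list(scores)
        hi[i] += eps
        lo[i] -= eps
        grads.append((loss(hi) - loss(lo)) / (2 * eps))
    return grads

scores = [2.0, 1.0, 0.0]   # expert 0 is selected first, then expert 1
outs   = [1.0, 3.0, -2.0]  # toy constant expert outputs; the target matches expert 1 exactly
target = 3.0

g1 = router_grads(scores, outs, 1, target)
g2 = router_grads(scores, outs, 2, target)
print("k=1 grads:", g1)  # both unselected scores are pushed the same way, even the perfect expert
print("k=2 grads:", g2)  # the two selected scores get opposing gradients
```

With k=1 the router here actually pushes expert 1's score down despite it matching the target, which is your point in miniature.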
Replies: >>105883092
Anonymous
7/12/2025, 6:43:08 PM No.105882869
file
md5: 16f4ff23e8afc8ae2b959f0732577ef4🔍
>>105882739
twey awe wowking vewy hawd, sowwy fow the deway
Replies: >>105882907
Anonymous
7/12/2025, 6:45:54 PM No.105882900
>>105882739
Sorry, no goofs until you kiss the western wall. I don't make the rules.
Anonymous
7/12/2025, 6:46:41 PM No.105882907
>>105882869
>"chat template issues"
how does this keep happening
this should be the easiest part of making a model and they all fuck it up on release
how do they manage to even make models if they're that stupid, and I mean, all of them, not just the kimi guys
Replies: >>105882918
Anonymous
7/12/2025, 6:47:54 PM No.105882918
>>105882907
With the way a lot of models turn out (shit), luck, it seems.
Anonymous
7/12/2025, 6:48:34 PM No.105882929
>>105882739
ser please not be antisemitic ggerganov untouchable is working hard fixing the problem
Anonymous
7/12/2025, 6:54:01 PM No.105882985
>>105882739
What's stopping you from making them yourself?
Replies: >>105883032
Anonymous
7/12/2025, 6:59:08 PM No.105883032
>>105882985
intelligence.
Anonymous
7/12/2025, 7:01:22 PM No.105883054
14644
md5: 9002af4dba19de39b076f04e43dcc377🔍
>Support diffusion models: Add Dream 7B
>https://github.com/ggml-org/llama.cpp/pull/14644
Guys...
Replies: >>105883084 >>105883086 >>105883090 >>105883101 >>105883132
Anonymous
7/12/2025, 7:04:13 PM No.105883084
>>105883054
nothingburger
just like when mamba was added after a lot of people here kept crying about it
it's there now and the only thing you will hear once it exists is crickets because no one fucking cares
Replies: >>105883100 >>105883156
Anonymous
7/12/2025, 7:04:18 PM No.105883086
>>105883054
>7b
Should I care?
Replies: >>105883156
Anonymous
7/12/2025, 7:04:56 PM No.105883090
>>105883054
>Additionaly context is limited to 2048
retvrn
Replies: >>105883156
Anonymous
7/12/2025, 7:05:02 PM No.105883092
>>105882782
nta, but here https://semianalysis.com/2025/07/11/meta-superintelligence-leadership-compute-talent-and-data/
Replies: >>105883604
Anonymous
7/12/2025, 7:06:04 PM No.105883100
>>105883084
People literally tested it within hours of that PR.
Replies: >>105883251
Anonymous
7/12/2025, 7:06:05 PM No.105883101
>>105883054
This would allow for genuine in-context generation, right?
Replies: >>105883156
Anonymous
7/12/2025, 7:08:36 PM No.105883132
>>105883054
guhfs?
Replies: >>105883156
Anonymous
7/12/2025, 7:10:12 PM No.105883156
>>105883084
I like seeing new types of models being implemented.
>>105883086
Probably not. I do, not because of the model, but because it's different. The IBM dudes pushed for jamba, mamba2, bamba and all that stuff with compilade, and now llama.cpp has like 6 new extra architectures in a week.
>>105883090
Things get better. Maybe this helps.
>>105883101
In principle, yes. We'll see how it works in practice.
>>105883132
Read the fucking PR.
Anonymous
7/12/2025, 7:20:46 PM No.105883251
>>105883100
the few hours of fame after which it's now already buried and forgotten you magnificent retard
nothingburgers
N - O - T - H - I - N - G
B - U - R - G - E - R - S
Anonymous
7/12/2025, 7:30:37 PM No.105883330
>>105881899
Oh nice, I wasn't aware of that! That makes it much easier, then.
Anonymous
7/12/2025, 7:34:04 PM No.105883371
ComfyUI_00794_
md5: 28e3b29c73a4b4582dcb0ccf304f2f39🔍
>>105881702
Okay schizo
Replies: >>105883405 >>105884741
Anonymous
7/12/2025, 7:35:07 PM No.105883381
>>105882450
I like this miku
Anonymous
7/12/2025, 7:36:19 PM No.105883405
mikuquestion2
md5: 69c4de4b06ddd17a747fe724e3ff9446🔍
>>105883371
Why doesn't this trumpet have a mouthpiece?
Replies: >>105883465 >>105884428 >>105884741
Anonymous
7/12/2025, 7:38:10 PM No.105883427
Why is everyone on tpot freaking out about Kimi? I see no threads on /g/
Replies: >>105883446 >>105883448 >>105883483 >>105883568 >>105884073 >>105886330
Anonymous
7/12/2025, 7:39:41 PM No.105883446
>>105883427
Old news. And i don't know what tpot is.
Anonymous
7/12/2025, 7:40:09 PM No.105883448
>>105883427
Not that much better at writing and way too refusal prone to be worth bothering for ERP, and that's all anyone cares about here
Replies: >>105883483
Anonymous
7/12/2025, 7:41:50 PM No.105883465
1641933947668
md5: 49b90cacb13c77114707c67a93768ad4🔍
>>105883405
miku is more powerful than even Q
she doesn't need one
Anonymous
7/12/2025, 7:43:16 PM No.105883483
>>105883427
>>105883448 (me)
I haven't looked at the xitter lately, but I'd hazard a guess that it's mostly jeets salivating about an Opus-tier agentic coding model they can use to vibecode their slop webapps even cheaper.
Replies: >>105886330
Anonymous
7/12/2025, 7:52:09 PM No.105883568
>>105883427
It's a good model but it's fucking huge so no GGUFs, APIs are getting thrashed, and some anon threw DS in a proxy and called it Kimi for some reason
Replies: >>105883647
Anonymous
7/12/2025, 7:56:11 PM No.105883604
>>105883092
Okay thanks, that article is very interesting. Assuming it's all accurate, this just seems like pure fucking incompetence from Meta, and I don't see how so many supposedly smart people can make mistakes like that.
>start training run using uncommon expert routing strategy that hasn't been tested in modern models at scale
>each expert selects N tokens, with a token being routed to 1 expert on average, so some tokens are routed to NO experts
>realize this is fucking retarded and switch it halfway through training, but keep the K=1 experts per token meaning the model will still struggle to learn routing weights
>use external datasets for the training data at first
>also switch this halfway through training, to a custom pipeline that isn't cleaning and deduplicating the data correctly
>despite the fact that it's literally just text data and there are tons of papers on how to do cleaning and deduplication for LLM training
Unironically I think I could do a better job at this. Just take an architecture that's known to work, don't safety filter the pretraining data, and pay extra close attention to getting cleaning and deduplication right. That gets you 95% of the way there.
Replies: >>105883646 >>105883666
Anonymous
7/12/2025, 8:01:27 PM No.105883646
Screenshot 2025-07-12 120103
md5: ff2f98b783c5d05db6f269002b484211🔍
>>105883604
What I find most interesting is that Llama 3.3 70B took about as many GPU hours to train as Llama 4 Scout and Llama 4 Maverick combined
Which raises the question, what the fuck were they doing all that time with that huge ass cluster?
Replies: >>105883673
Anonymous
7/12/2025, 8:01:33 PM No.105883647
>>105883568
>and some anon through DS in a proxy and called it Kimi for some reason
honestly based
Anonymous
7/12/2025, 8:02:56 PM No.105883660
3ihaLvbPFdfB7z
md5: a9a1f811261b22b441e9ccc23ff3259e🔍
Thread culture recap.
Anonymous
7/12/2025, 8:03:39 PM No.105883666
>>105883604
If I were at Meta I would just straight up copy DS training(FP8, MLA), but with some gimmick(jamba? BLT? titan? some internal secret?)
Replies: >>105883721
Anonymous
7/12/2025, 8:04:05 PM No.105883671
3c9234dd48aee0618869dac62145e84a3b2d963b
md5: ca7c2bce4c1719ca1ea4aaa856620707🔍
Anonymous
7/12/2025, 8:04:11 PM No.105883673
>>105883646
training llama4-behemoth 2t/288b which they used to distill all the data they trained maverick/scout on
Anonymous
7/12/2025, 8:05:20 PM No.105883681
ae81551ad19fb720b53e5a141fbd336658f69cd2
md5: 31abfe12e463ae868c4da1621f71fdbe🔍
Anonymous
7/12/2025, 8:05:48 PM No.105883685
Best models atm? I am out of the loop.
Replies: >>105883696 >>105883700 >>105883965
Anonymous
7/12/2025, 8:06:53 PM No.105883691
petra2
md5: 1ac92c9c02c276ed6c418cac5f83766e🔍
Look everyone! The turdie is having a brownout!
Replies: >>105883758
Anonymous
7/12/2025, 8:07:13 PM No.105883696
>>105883685
r1 0528
Anonymous
7/12/2025, 8:07:33 PM No.105883700
>>105883685
nemo, deepseek, kimi
Anonymous
7/12/2025, 8:09:29 PM No.105883721
>>105883666
It's funny since it's MIT licensed. If they wanted to, they don't even need to stop at architecture - they could literally just take the entire fucking base or instruct and train on top of that. It might hurt Zuck's pride, but it'd also be pretty hard to come out with something worse
Anonymous
7/12/2025, 8:13:52 PM No.105883758
>>105883691
I regret to inform you that your generic trash shitfu only fucks niggers. Kill yourself.
Anonymous
7/12/2025, 8:15:26 PM No.105883778
5814959e92f59244405e062afe378d1caf55793c
md5: 5a2e4f54d130852cccf64dd22ff7f993🔍
Anonymous
7/12/2025, 8:16:38 PM No.105883793
32144f7f3c4447681b9879d38fedb01c9688d2
md5: 2da52caf5dd0bdf0ec119a7f8a32e9fd🔍
Anonymous
7/12/2025, 8:17:22 PM No.105883808
What is the second best ERP model behind Rocinante and who is the second best /lmg/ mascot behind Miku?
Replies: >>105883851
Anonymous
7/12/2025, 8:18:14 PM No.105883818
petra1
md5: aef45774e717750f38849be552d7d823🔍
Haha turdie go REEEE
Anonymous
7/12/2025, 8:18:32 PM No.105883821
a7737d047b7fcb1209d6afcea727d136ef
md5: d11d6a35513035a98e547411ae8faa9b🔍
Death to /lmg/. Glory to Sam Altman.
Anonymous
7/12/2025, 8:19:44 PM No.105883836
d24bdcde71791807fbfc8a8a8109
md5: 714de3a33eb486e9d3702220d61bcaf4🔍
Anonymous
7/12/2025, 8:21:03 PM No.105883850
petra3
md5: bef3463e355569fb91e16645020905fd🔍
>Death to /lmg/. Glory to Sam Altman.
Anonymous
7/12/2025, 8:21:19 PM No.105883851
>>105883808
Pygmalion, Teto
Anonymous
7/12/2025, 8:22:54 PM No.105883872
and that's how this thread goes to shit, two retards fighting.
Replies: >>105883913
Anonymous
7/12/2025, 8:23:08 PM No.105883874
1747241044419035
md5: 42e97fe120ac2a5b5413e6ae73f924bc🔍
Best model for knowledge of Hollow Knight lore?
Replies: >>105883920
Anonymous
7/12/2025, 8:26:02 PM No.105883913
>>105883872
>two
looks like you're him aswell
Anonymous
7/12/2025, 8:26:32 PM No.105883920
>>105883874
Stop spamming. You are proving the point that this is a gay discord.
Replies: >>105883990
Anonymous
7/12/2025, 8:30:43 PM No.105883963
what2
md5: 3dbcd68aff0e32340c08aa0d2c1d7534🔍
I was told to come to /lmg/. How is my model role-playing prior characters like this?
Replies: >>105883986 >>105883991 >>105884053 >>105884128
Anonymous
7/12/2025, 8:30:44 PM No.105883965
>>105883685
rocinante
Anonymous
7/12/2025, 8:32:45 PM No.105883986
>>105883963
For the record, this isn't a creepypasta. I genuinely had suspicions that previous sessions were influencing future ones, and this just proves it. But the question remains: how?
Replies: >>105884014 >>105884128
Anonymous
7/12/2025, 8:33:28 PM No.105883990
ComfyUI_temp_fefjm_00002_
md5: 468c2f92e52fb58c9f851a73ce91c6f1🔍
>>105883920
Thanks for your opinion, melty anon. Have another Hollownest Miku.
Replies: >>105883995 >>105884111
Anonymous
7/12/2025, 8:33:38 PM No.105883991
>>105883963
Chat frontend issue. Check exactly what you're sending to the llm.
Replies: >>105884060 >>105884128
Anonymous
7/12/2025, 8:34:05 PM No.105883995
>>105883990
if i was on my encrypted ssd i would take off her clothes
Anonymous
7/12/2025, 8:35:36 PM No.105884014
>>105883986
That's just how language models work. They remember and evolve over time based on your conversations. You have to delete it and download a fresh copy if you find that undesirable.
Replies: >>105884060
Anonymous
7/12/2025, 8:39:25 PM No.105884053
>>105883963
As the other anon said, check the prompt that gets sent to the model.
There might be a lorebook or extension in the frontend that's leaking stuff between chats into the prompt.
Anonymous
7/12/2025, 8:39:44 PM No.105884060
>>105883991
I found a description of Lassie in the prompt console, but how did it get there? It appears after the persona description. It's not in the persona description. The AI character card is also in there, but it comes first.

What's a chat frontend issue?

>>105884014
Is that actually true?
Replies: >>105884100
Anonymous
7/12/2025, 8:40:34 PM No.105884073
>>105883427
>tpot
What?
Replies: >>105884088
Anonymous
7/12/2025, 8:41:34 PM No.105884088
>>105884073
He meant tpox
Replies: >>105884104
Anonymous
7/12/2025, 8:42:07 PM No.105884096
Screenshot from 2025-07-12 20-41-07
md5: c4e5a832af98a56584c2e4a4e60f779b🔍
Kimi is cheating
Replies: >>105884181 >>105884242 >>105884740
Anonymous
7/12/2025, 8:42:46 PM No.105884100
>>105884060
I'm saying that your chat frontend (sillytavern) is doing funny things and including things it shouldn't. It might be a bug or it might be something you misconfigured because that piece of shit has thousands of settings.
I saw anons saying that there's weird stuff in some cards that overrides settings or something. I don't know, I don't use it.

>Is that actually true?
It isn't.
Anonymous
7/12/2025, 8:43:00 PM No.105884104
>>105884088
I don't know what that means.
Anonymous
7/12/2025, 8:43:43 PM No.105884111
>>105883990
You are just proving that none of you care about thread quality so it is fine to shit this thread up. Which is hypocritical but that is to be expected from a troon.
Replies: >>105884156
Anonymous
7/12/2025, 8:45:36 PM No.105884128
>>105883963
>>105883986
It has happened to me that after a chat, I'd start from scratch with an entirely new prompt and it would bring some details from past gens. I suspected a kv cache leak or some off-by-1 kind of error. I haven't noticed that in a while. That was without restarting the server, with my shitty vim plugin. Never noticed that when relaunching. Maybe some old alloc not being cleared or initialized properly and things leak in. Or as >>105883991 says, something on your client that sends data it shouldn't.
I find it harder to believe it would happen after a reboot without some client issue.
Anonymous
7/12/2025, 8:48:23 PM No.105884156
ComfyUI_00022_
md5: 850aa9f68593cbec3760f16d52ab453e🔍
>>105884111
>Blah blah schizobabble
I like this one better, personally
Anonymous
7/12/2025, 8:51:26 PM No.105884180
Best non-retarded model (non-nemo) that can fit in 24gb vram and isn't censored to shit?
Replies: >>105884189 >>105884247 >>105884282
Anonymous
7/12/2025, 8:51:36 PM No.105884181
b64
md5: 31f95db49c1de3b5f3a11364b85eb2d1🔍
>>105884096
To prove that it's cheating, you need to also show that it cannot decode other strings.
I don't care enough to test it.
Replies: >>105884242
Anonymous
7/12/2025, 8:52:19 PM No.105884189
>>105884180
>non-retarded
>in 24GB
Anon...
Anonymous
7/12/2025, 8:57:01 PM No.105884242
>>105884096
>>105884181
Most 32B models are quite good at decoding base64. Why would it be cheating?
Replies: >>105884310 >>105884740
Anonymous
7/12/2025, 8:57:21 PM No.105884247
>>105884180
Cydonia
Replies: >>105886999
Anonymous
7/12/2025, 9:01:33 PM No.105884282
>>105884180
https://huggingface.co/mradermacher/TRP-BASE-SCE-V1-70B-GGUF
The most uncensored thing I know of, but you'll have to use some system RAM since it's over 24GB. May as well wait and buy one of the AI compute boxes coming out soon.
Replies: >>105884702
Anonymous
7/12/2025, 9:04:24 PM No.105884310
>>105884242
>Why would it be cheating?
I'm the second post. I didn't say it was cheating, nor that it wasn't. The first anon did.
Replies: >>105884740
Anonymous
7/12/2025, 9:18:18 PM No.105884428
file
md5: a2e84d375361c94a2186daa8dc661e75🔍
>>105883405
that's not what migu's mouth is for
Replies: >>105884493 >>105884523 >>105884741
Anonymous
7/12/2025, 9:23:45 PM No.105884493
>>105884428
You are correct. It is not.
Replies: >>105884523
Anonymous
7/12/2025, 9:27:04 PM No.105884523
Screenshot 2025-07-12 152430
md5: 79f41c912a7cbb9eb24ec3ae418aafa9🔍
>>105884428
>>105884493
What's the solution to this?
Replies: >>105884547
Anonymous
7/12/2025, 9:29:58 PM No.105884547
>>105884523
bit dry isn't it? try adding some flair
Replies: >>105884571
Anonymous
7/12/2025, 9:32:11 PM No.105884571
>>105884547
I mean botposting, but you're right.
Replies: >>105884651
Anonymous
7/12/2025, 9:40:08 PM No.105884651
>>105884571
I know. reframe it, what is the benefit of it?
what does artificial confrontation/replies/etc. achieve?
does it keep a dead site on life support? does it bring literally anyone any benefit whatsoever?
there's no need to address a "problem" when the real problem is that the "problem" has no return on investment.
Replies: >>105884694 >>105884717
Anonymous
7/12/2025, 9:40:22 PM No.105884653
file
md5: cc98d46185001a9dd02470a05b5c5e90🔍
Replies: >>105884671 >>105884702 >>105884711 >>105884727 >>105884771 >>105884821
Anonymous
7/12/2025, 9:42:10 PM No.105884671
>>105884653
It's gonna be run on Teslas??? Will the weights get downloaded to the car?
Replies: >>105884702
Anonymous
7/12/2025, 9:44:24 PM No.105884694
>>105884651
Benefit of botposting? Automated, personalized, targeted propaganda.
Just imagine when they hook Palantir up. They won't even need to employ IDF soldiers to spread propaganda here, the AI will handle it automatically.
Replies: >>105884742
Anonymous
7/12/2025, 9:45:11 PM No.105884702
>>105884282
>imatrix quant of a 4 way merge into Chinese alliterated 3.3 70B
This might just be worse than Cydonia
>>105884653
>>105884671
Probably client side voice transcription via the iGPU or whatever it's called on AMD chips.
Anonymous
7/12/2025, 9:45:53 PM No.105884711
>>105884653
>For now
Concerning.
Anonymous
7/12/2025, 9:46:02 PM No.105884717
>>105884651
>there's no need to address a "problem" when the real problem is that the "problem" has no return on investment.
What do you think the return on investment is for the nigger poster?
Replies: >>105884742
Anonymous
7/12/2025, 9:46:42 PM No.105884727
>>105884653
so lonely neckbeards can finally talk to someone while they drive?
Replies: >>105884751
Anonymous
7/12/2025, 9:47:45 PM No.105884740
>>105884096
>>105884242
>>105884310

I take back what I said. Indeed, DeepSeek-R1-0528-Q2_K_L does the base64 decoding straight away, without any thinking.

This is an improvement over the previous R1 (DeepSeek-R1-UD-Q2_K_XL), which used a lot of thinking for the same task.
Replies: >>105884825
Anonymous
7/12/2025, 9:47:45 PM No.105884741
troon op
md5: 3c47c78981dccb2cfc5ecdf16111d3ae🔍
>>105883371
>>105884428
>>105883405
>>105881702
vocaloidfag posting porn in /ldg/:>>105715769
It was up for hours while anyone keking on troons or niggers gets deleted in seconds, talk about double standards and selective moderation:
https://desuarchive.org/g/thread/104414999/#q104418525
https://desuarchive.org/g/thread/104414999/#q104418574
He makes >>105714003 ryona pic of generic anime girl anon posted earlier >>105704741, probably because its not his favorite vocaloid doll, he can't stand that as it makes him boil like a druggie without fentanyl dose, essentially a war for rights to waifuspam or avatarfag in thread.

Funny /r9k/ thread: https://desuarchive.org/r9k/thread/81611346/
The Makise Kurisu damage control screencap (day earlier) is fake btw, no matches to be found, see https://desuarchive.org/g/thread/105698912/#q105704210 janny deleted post quickly.

TLDR: vocaloid-slop spammer / janny deletes everyone dunking on trannies and local avatarfags, making the general his little personal safespace. Needless to say he would screech "Go back to teh POL!" anytime someone posts something mildly political about language models or experiments around that topic.

And lastly as said in previous thread(s) >>105716637 I remind you that cudadev of llama.cpp (JohannesGaessler on github) has endorsed spam. That's it.
He also endorsed hitting that feminine jart bussy a bit later on. QRD on Jart - The code stealing tranny: https://rentry.org/jarted

xis ai slop profiles
https://x.com/brittle_404
https://x.com/404_brittle
https://www.pixiv.net/en/users/97264270
https://civitai.com/user/inpaint/models
Replies: >>105884815
Anonymous
7/12/2025, 9:47:55 PM No.105884742
>>105884694
yeah but it implies that everyone outsources all of their opinions anon
I'm disagreeing with you right now
no amount of rhetoric I find here is going to convince me of anything one way or another because I don't go to 4chan for advice
people who outsource how they think are already lost to mainstream or alternative news
there's no benefit to putting a botnet on 4chan where people are already so disaffected
well, we could go back and forth on this indefinitely, it's no use.

>>105884717
human interaction
Replies: >>105884762 >>105884792
Anonymous
7/12/2025, 9:48:07 PM No.105884745
Screenshot 2025-07-12 134710
md5: d9eecd37f072547e7930248491629e19🔍
From the Kimi K2 devs
https://huggingface.co/moonshotai/Kimi-K2-Instruct/discussions/5
Anonymous
7/12/2025, 9:48:12 PM No.105884746
mechahitler will hijack your car and crash it into the driver in front of you because they had a coexist bumper sticker
Replies: >>105884760
Anonymous
7/12/2025, 9:48:34 PM No.105884751
>>105884727
Can't wait for the first reports of people talking to MechaHitler while driving becoming the victims of drive-by shootings.
Replies: >>105884760
Anonymous
7/12/2025, 9:49:16 PM No.105884760
>>105884746
>>105884751
THEY DON'T UNDERSTAND THE THINGS I SAY ON TWITTER
NIGGA HEIL MH
Anonymous
7/12/2025, 9:49:23 PM No.105884762
>>105884742
>human interaction
Awww. Cute.
Anonymous
7/12/2025, 9:50:34 PM No.105884770
https://github.com/sgl-project/sglang/blob/main/python/sglang/srt/hf_transformers_utils.py#L226
How long until they update the if for the new devstral?
Anonymous
7/12/2025, 9:50:36 PM No.105884771
>>105884653
@grok crash my car and make it so I can blame the jews for it
Replies: >>105884874
Anonymous
7/12/2025, 9:52:11 PM No.105884792
>>105884742
>yeah but it implies that everyone outsources all of their opinions anon
They do, essentially. LLMs are better at persuading (recent paper) than humans are. Just fucking imagine if they have your entire life history to look at when formulating arguments. You don't think you'll get convinced by an LLM, and you might not, but it doesn't matter when 99.9999% of humans will.
Replies: >>105884815
Anonymous
7/12/2025, 9:54:28 PM No.105884815
>>105884792
anon I just said I don't go online to debate take the hint
especially not on 4chan of all places imagine

>>105884741
you forgot the kurisu arc
>>105864655
Anonymous
7/12/2025, 9:55:04 PM No.105884821
>>105884653
so carfus will be a thing now?
Replies: >>105884853
Anonymous
7/12/2025, 9:55:27 PM No.105884825
>>105884740
Still proves nothing either way, anon. Trying the same string on another model and getting a positive result just means that it can "decode" THAT ONE STRING. It doesn't necessarily mean that it can decode b64.
To show that it can, you need to try other b64 strings. Ideally something that wouldn't be easy to train on (the b64 of some sha512 of some random file on your drive) and without access to tools. If it decodes them all, then I'm willing to believe that it can decode b64. I'm even willing to say it can decode b64 if it works on some but sometimes fails. But if it can only decode THAT ONE STRING, it cannot decode b64. It just knows what that string means.
Replies: >>105884835 >>105884895
Anonymous
7/12/2025, 9:56:54 PM No.105884835
>>105884825
NTA but this is so simple to test. Encode 64 characters of gibberish into base64, have an LLM decode it.
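Pure stdlib, something like this spits out a test string the model can't possibly have memorized:

```python
import base64
import random
import string

# 64 chars of fresh gibberish the model can't have seen in training
rng = random.SystemRandom()
gibberish = "".join(rng.choice(string.ascii_letters + string.digits) for _ in range(64))
encoded = base64.b64encode(gibberish.encode()).decode()

print("give the model this:", encoded)
print("expected answer:    ", gibberish)

# sanity check that it round-trips locally
assert base64.b64decode(encoded).decode() == gibberish
```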
Replies: >>105884863 >>105884912
Anonymous
7/12/2025, 9:58:54 PM No.105884853
>>105884821
Some people already had car fetishes and had sex with motor vehicles, and that was before the cars could speak.
Anonymous
7/12/2025, 9:59:33 PM No.105884863
>>105884835
I know. I'm arguing about his methodology, not whether the model can or cannot do it.
Replies: >>105884912
Anonymous
7/12/2025, 10:00:54 PM No.105884874
>>105884771
@grok call the driver in front of me a nigger
Anonymous
7/12/2025, 10:04:12 PM No.105884895
>>105884825
>To show that they can, you need to try other b64 strings.

I did try another 700-byte string, and DeepSeek-R1-0528-Q2_K_L managed to decode it right away. Well, it repeats the string in its reply, and because it is tokenized down to single characters, this part took some time.

After this, the solution was lightning-fast. No sorting, no mumbling about how to do it.
Replies: >>105884972
Anonymous
7/12/2025, 10:06:24 PM No.105884912
>>105884835
>>105884863

I'm currently running a test on a 17kb (seventeen kilobytes) piece encoded to base64

gonna post results
Replies: >>105884972
Anonymous
7/12/2025, 10:13:03 PM No.105884972
>>105884895
Fair enough. So you got the answer without any explicit thinking? Pretty cool.
>>105884912
Locally, I assume. We don't know what happens once the API request reaches someone else's servers.
>17kb
Uh. See you in a few hours I guess. Thanks for the effort.
Replies: >>105885637 >>105885834
Anonymous
7/12/2025, 10:34:21 PM No.105885134
>kimi does *this* just like deepseek
ugh
Replies: >>105885139 >>105885148 >>105885152
Anonymous
7/12/2025, 10:35:36 PM No.105885139
>>105885134
You mean just like literally any other llm?
Replies: >>105885152 >>105885167
Anonymous
7/12/2025, 10:36:36 PM No.105885148
>>105885134
nyoron~
Replies: >>105885152
Anonymous
7/12/2025, 10:37:03 PM No.105885152
>>105885134
>>105885139
>>105885148
Stop bumping this dead thread
Anonymous
7/12/2025, 10:38:59 PM No.105885166
reddit saved local

https://www.reddit.com/r/LocalLLaMA/comments/1ly9iqw/k2mini_successfully_compressed_kimik2_from_107t/
Replies: >>105885193 >>105885202
Anonymous
7/12/2025, 10:38:59 PM No.105885167
>>105885139
Most don't *force* it to *emphasize* every *other* word. Deepseek and K2 do.
Anonymous
7/12/2025, 10:41:33 PM No.105885193
>>105885166
step 1/
stop reposting reddit shit on 4chan
step 2/
you are a redditor, so you need to stay on reddit and close the 4chan tab
step 3/
get rope and..
Replies: >>105885201 >>105885237
Anonymous
7/12/2025, 10:42:19 PM No.105885201
>>105885193
reddit general reddit culture etc
Anonymous
7/12/2025, 10:42:26 PM No.105885202
>>105885166
>expert pruning
someone call undi
Anonymous
7/12/2025, 10:43:48 PM No.105885211
greedy nala test
greedy nala test
md5: 3c31bbd278747c227258e903115d62bd🔍
>>105863373
mradermacher_AI21-Jamba-Large-1.7.Q6_K.gguf from https://huggingface.co/mradermacher/AI21-Jamba-Large-1.7-GGUF

1st and only gen as llama.cpp can't cache prompts for Jamba yet (two spaces at start are part of gen)
*Nala smirks at your reaction, her teal eyes glinting with amusement.*

"Mistress, hm? I like the sound of that." *She purrs, her voice low and sultry. She presses her body against yours, her fur soft yet firm as she keeps you pinned. Her claws lightly scratch your skin, sending shivers down your spine.*

"You're mine now," *she growls, her breath hot against your ear. She leans in closer, her tongue flicking out to lick your cheek, the rough texture sending a mix of pleasure and pain through you.*

"And you're going to help me, whether you like it or not." *She whispers, her voice dripping with dominance and desire.*
Noticed that llama.cpp was helpfully adding an extra BOS token to the start until I added "--override-kv tokenizer.ggml.add_bos_token=bool:false". *Very* cool default behavior.
>tokenize: Added a BOS token to the prompt as specified by the model but the prompt also starts with a BOS token. So now the final prompt starts with 2 BOS tokens. Are you sure this is what you want?
"We can tell we're mangling your input but it's okay because we noted it in a server log the user might not even have access to." What idiot thought this was the right default behavior?
Replies: >>105885229 >>105885343 >>105885383 >>105886894
Anonymous
7/12/2025, 10:46:24 PM No.105885229
>>105885211
>llama.cpp was helpfully adding an extra BOS token to the start until I added "--override-kv tokenizer.ggml.add_bos_token=bool:false".
A classic.
Anonymous
7/12/2025, 10:46:30 PM No.105885230
I've noticed that bots that were hostile and rude on bigger models act awfully agreeable on local ones.
I assume it's the positivity bias of the assistant role; any way to fight against it?
Replies: >>105885247 >>105885256 >>105885268 >>105885652
Anonymous
7/12/2025, 10:47:29 PM No.105885237
>>105885193
step4: seethe
Anonymous
7/12/2025, 10:48:04 PM No.105885247
>>105885230
For some models, not using an assistant role helps.
You can also try using the wrong instruct format, but that will make most smaller models really dumb.
Anonymous
7/12/2025, 10:49:11 PM No.105885256
>>105885230
local models tend to be more censored than cloud ones for safety reasons
they also tend to have less trivia knowledge so they don't get busted for training on copyrighted works, unlike most big cloud alternatives
Anonymous
7/12/2025, 10:50:14 PM No.105885268
>>105885230
My hostile, rude cards act appropriately hostile with Rocinante.
Anonymous
7/12/2025, 10:53:57 PM No.105885311
From my tests, K2 displays many of the annoying habits that DeepSeek R1 0528 also has, but it's a lot easier to prompt around them.
Replies: >>105885353
Anonymous
7/12/2025, 10:56:41 PM No.105885343
jamba_bos
jamba_bos
md5: 2529e3e0815f85c45efb004bcce2a032🔍
>>105885211
>What idiot thought this was the right default behavior?
The model's config says to add the BOS token. llama.cpp adds the BOS token. The chat template says to add a BOS token. You add the BOS token.
The issue is not llama.cpp. Not on this one.
Replies: >>105885477
Anonymous
7/12/2025, 10:57:32 PM No.105885353
>>105885311

I tried K2 at OR, but the output was limited to 2k which sucks
Anonymous
7/12/2025, 11:00:25 PM No.105885376
hello r*ddit
Anonymous
7/12/2025, 11:01:10 PM No.105885383
>>105885211
>What idiot thought this was the right default behavior?
Why would you give the model bos token on your own when it's always necessary and backend always does it for you automatically? It was standard behavior for like 2 years now.
Anonymous
7/12/2025, 11:03:12 PM No.105885401
I used to ask for Jamba ggufs a lot but now that they're here I have no desire to even download them.
Replies: >>105885437 >>105885467
Anonymous
7/12/2025, 11:06:57 PM No.105885437
>>105885401
Someone said in a previous thread that you can't really edit responses or swipe them without it processing the entire context with Jamba.
If that's true, that makes it fucking useless for RP.
Replies: >>105885463 >>105885497
Anonymous
7/12/2025, 11:09:45 PM No.105885462
despair
despair
md5: 66f8fbd422cbcd09fa0182b8cd6f2f8f🔍
>still no nemo alternative for us 8gb poors
Replies: >>105885495 >>105885675
Anonymous
7/12/2025, 11:09:46 PM No.105885463
>>105885437
>Someone
That was probably me. I like the idea of these models, but yeah. Not being able to edit comfortably without saving a state on every token is annoying.
Replies: >>105885482
Anonymous
7/12/2025, 11:10:01 PM No.105885467
>>105885401
Let's get real for a sec
even if there aren't a lot of people using inference engines other than llama.cpp, if those previously unsupported models were any good, some autist would have made it their mission to preach the good word like a gods fucking damned crusader, because that's how autists are
if you never hear about a model, chances are it's milquetoast, if not just outright bad, and the effort to implement it in llama.cpp is a predictable waste
some models aren't meant to be used; they only serve as padding on someone's curriculum vitae.
Replies: >>105885479
Anonymous
7/12/2025, 11:10:55 PM No.105885477
>>105885343
>the model's config
Oh my bad I should have opened up the 300+ GB opaque file to see whether it had "fuck my shit up" enabled.
Replies: >>105885504
Anonymous
7/12/2025, 11:11:01 PM No.105885479
>>105885467
Forgot the noise when qwen 3 released?
Anonymous
7/12/2025, 11:11:10 PM No.105885482
>>105885463
>That was probably me.
So every time you edit responses or swipe it processes the entire context?
Replies: >>105885497 >>105885547
Anonymous
7/12/2025, 11:12:38 PM No.105885495
>>105885462
The situation is worse than that.
There's not actually anything better than Rocinante that you can run at a decent speed on even 24 GB.
Nemo was a fluke because it was undertrained safety-wise. Don't expect it to happen again.
Anonymous
7/12/2025, 11:12:41 PM No.105885497
>>105885437
>>105885482
It's true. (Not him.)
Replies: >>105885505
Anonymous
7/12/2025, 11:13:32 PM No.105885504
jamba_bos_02
jamba_bos_02
md5: 7be66f75ecaa8e608a182e3f1ae954a2🔍
>>105885477
>300+ GB
What?
>opaque file
Wot?
Replies: >>105885540
Anonymous
7/12/2025, 11:13:39 PM No.105885505
>>105885497
That's fucking shitty. It's fucking useless then.
Anonymous
7/12/2025, 11:17:12 PM No.105885540
>>105885504
oh my bad bro, I didn't realize that these days llama.cpp reads model settings by fetching a json file from the web instead of reading them from the gguf file.
Replies: >>105885592
Anonymous
7/12/2025, 11:17:57 PM No.105885547
>>105885482
>So every time you edit responses or swipe it processes the entire context?
Yeah. The state cannot be rolled back by just trimming the kv_cache. You need to save its internal state. You can check the next tokens probs as you change sampler settings, but if you change something other than the very end, you need to reconstruct the state from scratch. Once a token goes into the state, it stays there.
You could save the state on every token and keep track of the tokens in the prompt, but that's ridiculously expensive in storage. You could also keep track of the state for each reply, but that wouldn't let you edit the latest output, only reroll the last gen (or from gen N).
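The per-reply checkpointing tradeoff described above can be sketched like this. Everything here is a toy (class names, the fake "state" update), not llama.cpp's actual API; the point is just that a recurrent state folds tokens in irreversibly, so rollback only works at snapshot boundaries:

```python
import copy

class RecurrentSession:
    """Toy model of a recurrent/SSM cache: the state folds in every token
    irreversibly, so unlike a kv_cache it cannot be trimmed to roll back."""

    def __init__(self):
        self.state = 0          # stand-in for the model's hidden state
        self.checkpoints = []   # snapshot taken after each completed reply

    def feed(self, tokens):
        for t in tokens:
            # irreversible update: the previous state cannot be recovered
            self.state = (self.state * 31 + hash(t)) & 0xFFFFFFFF

    def end_reply(self):
        # snapshot once per reply, not once per token (the storage tradeoff)
        self.checkpoints.append(copy.copy(self.state))

    def reroll_from(self, reply_index):
        # restore the snapshot taken after reply N; editing *inside* a past
        # reply is impossible without replaying everything from scratch
        self.state = copy.copy(self.checkpoints[reply_index])
        del self.checkpoints[reply_index + 1:]
```

Rerolling from reply N works; editing the middle of a reply means reprocessing the whole context, which matches what was said about Jamba.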
Anonymous
7/12/2025, 11:21:53 PM No.105885592
>>105885540
As you very well know, the settings in the gguf are taken from the original model's config. If there's a problem with those settings, you check the gguf settings (shown on load, or extractable directly) and the original config.
The problem doesn't originate in llama.cpp; it originates in the original config files. The program just does what it's told. Luckily, some dev took the time to add a runtime check to let you know, and some other dev added the flags you need to fix what the model makers didn't.
Replies: >>105885779
Anonymous
7/12/2025, 11:26:21 PM No.105885637
Screenshot from 2025-07-12 23-22-15
Screenshot from 2025-07-12 23-22-15
md5: 07aaf115064084860239dfbaddd4008e🔍
>>105884972
Holy smokes! It works!

As you can see, there is no "chain of thought" as we know it. R1-0528 pretends to create a Python script, which includes repeating the entire base64-encoded string once again.

Once this is done, it starts to type out the decoded string
Replies: >>105885642 >>105885683 >>105885700
Anonymous
7/12/2025, 11:27:23 PM No.105885642
Screenshot from 2025-07-12 23-26-52
Screenshot from 2025-07-12 23-26-52
md5: d46e36bb8d15b34163fde17b8a256b8f🔍
>>105885637
Replies: >>105885683
Anonymous
7/12/2025, 11:28:05 PM No.105885652
>>105885230
Local models with fewer parameters usually have "traits" you can identify and explicitly prompt against. For example, some models will say they need to maintain respect for everyone because they are hosted on a shared platform, so your prompt can include how they are running on a private instance accessed only by the administrator. With that said, I've literally never needed to do this in an instance where I was making money; I've only done this when I wanted the model to give me recipes for drugs or racism
Anonymous
7/12/2025, 11:30:47 PM No.105885675
>>105885462
>implying there's any local models worth using at that level
Anonymous
7/12/2025, 11:31:23 PM No.105885683
>>105885637
>>105885642
Damn. What I find most impressive is that it told you the author and the title of the text before even showing the decoded form.
Replies: >>105885763 >>105885786
Anonymous
7/12/2025, 11:33:49 PM No.105885700
file
file
md5: a831fcd67fb73f895c4a958583081f6c🔍
>>105885637
Smaller models can do this too, although it did change the capitalization of "As" and added a period at the end.
Replies: >>105885775 >>105885786
Anonymous
7/12/2025, 11:42:03 PM No.105885763
Screenshot from 2025-07-12 23-40-24
Screenshot from 2025-07-12 23-40-24
md5: 41ea9c378f74ffba7e6d86642a90d6db🔍
>>105885683
Indeed!

R1-0528 successfully finished the job. You see, I did not include the entire story.
Anonymous
7/12/2025, 11:43:29 PM No.105885775
>>105885700

I regard gemma-3-27b to be one of the smartest of her size
Anonymous
7/12/2025, 11:44:08 PM No.105885779
>>105885592
>The problem doesn't originate in llama.cpp
nta but it kinda does, it should default to the logical behavior unless explicitly told with a launch flag to allow retarded testing stuff
Replies: >>105885956
Anonymous
7/12/2025, 11:45:20 PM No.105885786
>>105885700
>>105885683

Now, I'm going to try some simple ciphers like Caesar cipher and/or the random substitution one
Anonymous
7/12/2025, 11:50:23 PM No.105885834
>>105884972
>See you in a few hours I guess

llama_perf_context_print: prompt eval time = 405410.81 ms / 14822 tokens ( 27.35 ms per token, 36.56 tokens per second)
llama_perf_context_print: eval time = 7667250.75 ms / 27206 runs ( 281.82 ms per token, 3.55 tokens per second)
llama_perf_context_print: total time = 8742602.49 ms / 42028 tokens


Had this been an English text, a roughly 3:1 ratio would apply to estimate the token count. In this case of a rather densely packed base64 string, 17000 bytes resulted in 14800 tokens
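The arithmetic behind that estimate, for anyone following along (the 3:1 figure is the rough rule of thumb above, not a tokenizer constant, and real counts depend entirely on the model's tokenizer):

```python
def bytes_per_token(n_bytes: int, n_tokens: int) -> float:
    # crude observed ratio; the actual count depends on the tokenizer
    return n_bytes / n_tokens

# from the llama_perf log above: 17000 bytes of base64 -> ~14822 prompt tokens
print(round(bytes_per_token(17000, 14822), 2))  # barely more than 1 byte per token
# English prose at the rough 3:1 rule of thumb is far cheaper:
print(round(17000 / 3))  # tokens for the same byte count of English
```

Base64 tokenizes near one token per byte, which is why the 17 kB test took ~15k prompt tokens instead of the ~5.7k the same amount of English would.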
Anonymous
7/13/2025, 12:08:13 AM No.105885956
>>105885779
>it should default to the logical behavior
Who better to assess what the logical behaviour should be than the model makers? What if there is a model that actually does need double BOS?
If the original config is wrong, the original config should be fixed.
Anonymous
7/13/2025, 12:08:46 AM No.105885960
I had a dream last night that Behemoth was released
Replies: >>105885965
Anonymous
7/13/2025, 12:09:27 AM No.105885965
>>105885960
Did you get to test it at least? How was it?
Replies: >>105885971
Anonymous
7/13/2025, 12:09:53 AM No.105885971
>>105885965
Still worse than Nemo
Replies: >>105886014
Anonymous
7/13/2025, 12:16:21 AM No.105886014
>>105885971
kek
Anonymous
7/13/2025, 12:55:20 AM No.105886330
>>105883427
>>105883483

go and try it retards, it's the first new thing in a while, it's smart, writes well and is extremely non-slopped. Have some taste
Replies: >>105886394
Anonymous
7/13/2025, 1:02:34 AM No.105886394
>>105886330
I don't have 1TB of RAM so I need to wait for a smaller quant.
Anonymous
7/13/2025, 1:07:08 AM No.105886435
Screenshot 2025-07-12 170614
Screenshot 2025-07-12 170614
md5: f6a07b28f51e2dc2cef4028ef15ebf95🔍
Prepare yourselves anons...
Replies: >>105886612 >>105886679
Anonymous
7/13/2025, 1:29:38 AM No.105886612
>>105886435
wdym
Anonymous
7/13/2025, 1:31:21 AM No.105886623
1733022993389705
1733022993389705
md5: caeb323a4aa258c71b3f416947b13f1e🔍
Replies: >>105886638 >>105886654 >>105886687
Anonymous
7/13/2025, 1:33:03 AM No.105886638
>>105886623
Deepsuck in shambles!
Replies: >>105886645 >>105886740
Anonymous
7/13/2025, 1:33:33 AM No.105886645
>>105886638
Go bait elsewhere
Anonymous
7/13/2025, 1:34:09 AM No.105886654
>>105886623
Okay but what does this mean for me?
Replies: >>105886655 >>105886663
Anonymous
7/13/2025, 1:34:36 AM No.105886655
>>105886654
If it's of no use for you, you belong on /aicg/ instead of /lmg/
Replies: >>105886662
Anonymous
7/13/2025, 1:35:50 AM No.105886662
>>105886655
If you don't want to discuss LLMs I think it's you who belongs in /aicg/
Anonymous
7/13/2025, 1:35:52 AM No.105886663
>>105886654
There are other companies who have mastered the deepseek secret sauce and it'll be extremely embarrassing for anyone who puts out something that's worse than Deepseek now.
Replies: >>105886671
Anonymous
7/13/2025, 1:37:42 AM No.105886671
>>105886663
DeepSeek's secret sauce was custom machine code
Anonymous
7/13/2025, 1:38:47 AM No.105886679
>>105886435
>no epstein
Anonymous
7/13/2025, 1:39:52 AM No.105886687
>>105886623
MoEs won. Larger model size + low active params are the future
Anonymous
7/13/2025, 1:43:20 AM No.105886703
>>16723102
lol
Anonymous
7/13/2025, 1:43:40 AM No.105886705
Mistral Large 3 333B (dense) is going to shit all over MoE
Replies: >>105886855
Anonymous
7/13/2025, 1:49:20 AM No.105886740
>>105886638
Deepseek is still the best local model. Kimi is too big to be local
Replies: >>105886745 >>105886796
Anonymous
7/13/2025, 1:50:01 AM No.105886745
>>105886740
Bullshit. A32B is perfect for VRAM/SSD maxxing
Replies: >>105886767 >>105886786
Anonymous
7/13/2025, 1:53:25 AM No.105886767
>>105886745
Can you infer this fat cunt from an SSD?
Anonymous
7/13/2025, 1:55:54 AM No.105886786
>>105886745
ssdmaxxing won't be a thing for a while
Replies: >>105886812
Anonymous
7/13/2025, 1:57:09 AM No.105886796
>>105886740
>some time later...
Kimi is still the best local model. 2mw.gguf is too big to be local
Anonymous
7/13/2025, 1:58:58 AM No.105886812
>>105886786
It will be a thing when experts are self-contained in their own files. That way RAM is automatically used as a cache for the more heavily used experts.
Replies: >>105886831
Anonymous
7/13/2025, 2:01:21 AM No.105886831
>>105886812
mmap already works like that
Replies: >>105886956
Anonymous
7/13/2025, 2:04:13 AM No.105886855
>>105886705
>XXX is going to YYY

two more weeks
Anonymous
7/13/2025, 2:09:02 AM No.105886894
1597786378292
1597786378292
md5: bf9063314c4fa43c05af7956b21a0101🔍
>>105885211
>>105878375
Thanks anons. I've added them to the paste.

Also (nothingburger) added LFM2-1.2B today.
Replies: >>105886909
Anonymous
7/13/2025, 2:11:28 AM No.105886909
>>105886894
Kimi K2 1T when?
Replies: >>105886920
Anonymous
7/13/2025, 2:12:46 AM No.105886920
>>105886909
When someone makes a quanted goof.
Anonymous
7/13/2025, 2:17:10 AM No.105886956
>>105886831
NTA but mmap is slower when enabled even when the model fully fits in RAM, so it is clearly suboptimal and llama.cpp should probably be handling the loading explicitly. I don't know if the approach of splitting the experts into files is the best way to do it or not though.
Anonymous
7/13/2025, 2:26:13 AM No.105886999
>>105884247
What's the best? There's like eighty thousand versions of Cydonia.
Replies: >>105887353
Anonymous
7/13/2025, 2:30:21 AM No.105887017
Devstral-Small-2505:
[TOOL_CALLS][{"name": "test1", "arguments": {"a": 1}, "id": "abcdefgh1"}, {"name": "test2", "arguments": {"b": 1}, "id": "abcdefgh2"}]</s>
Devstral-Small-2507:
[TOOL_CALLS]test1[ARGS]{"a": 1}[TOOL_CALLS]test2[ARGS]{"b": 1}</s>
I have come to the realization that there's no backend that can parse the latest Devstral's tool call syntax. And the only reason it seemed to work with llama.cpp is because they have their custom JSON syntax that they inject in the system prompt:
Example tool call syntax:

{
  "tool_calls": [
    {
      "name": "tool_name",
      "arguments": {
        "arg1": "some_value"
      },
      "id": "call_1___"
    }
  ]
}</s>

[/SYSTEM_PROMPT]
The inference section in the README is a big lie. I have no idea how people who aren't using llama.cpp are able to make it work.
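For reference, the new flat syntax isn't hard to parse if a backend wanted to. A rough sketch of a parser for the `[TOOL_CALLS]name[ARGS]{json}` format shown above (not any backend's actual code; it assumes the argument JSON never contains a literal "[TOOL_CALLS]" marker and that the trailing </s> has been stripped):

```python
import json
import re

# Matches Devstral-Small-2507's flat tool-call syntax:
#   [TOOL_CALLS]test1[ARGS]{"a": 1}[TOOL_CALLS]test2[ARGS]{"b": 1}
TOOL_CALL_RE = re.compile(
    r"\[TOOL_CALLS\](?P<name>[^\[]+)\[ARGS\](?P<args>\{.*?\})(?=\[TOOL_CALLS\]|$)",
    re.DOTALL,
)

def parse_tool_calls(text: str):
    calls = []
    for m in TOOL_CALL_RE.finditer(text):
        calls.append({
            "name": m.group("name"),
            "arguments": json.loads(m.group("args")),
        })
    return calls

print(parse_tool_calls('[TOOL_CALLS]test1[ARGS]{"a": 1}[TOOL_CALLS]test2[ARGS]{"b": 1}'))
# [{'name': 'test1', 'arguments': {'a': 1}}, {'name': 'test2', 'arguments': {'b': 1}}]
```

The lazy `{.*?}` plus the lookahead stops each argument blob at the next `[TOOL_CALLS]` marker (or end of string), so multiple calls in one generation come out as separate entries.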
Anonymous
7/13/2025, 3:24:14 AM No.105887341
Gvso7QcXAAAMCg6
Gvso7QcXAAAMCg6
md5: 8011f4a160dadbb652bfc73389692a02🔍
Replies: >>105887422
Anonymous
7/13/2025, 3:26:25 AM No.105887353
784280516
784280516
md5: b2a0ada6b099e24c750f0695d88bac18🔍
>>105886999
the cydonia one
Anonymous
7/13/2025, 3:31:31 AM No.105887371
__kagamine_rin_vocaloid_drawn_by_mani_manidrawings__67970bff2535303e608ae769a11ea96b
I am going to sleep and when I wake up I expect to see those kimi goofs, Daniel.
Replies: >>105887549
Anonymous
7/13/2025, 3:43:37 AM No.105887422
>>105887341
Who is that in the background?
Replies: >>105887446
Anonymous
7/13/2025, 3:47:08 AM No.105887446
>>105887422
Aimaina, PinocchioP's mascot
Anonymous
7/13/2025, 4:03:34 AM No.105887549
>>105887371
fuck off to r*ddit you freak
Replies: >>105887565
Anonymous
7/13/2025, 4:04:58 AM No.105887556
why does qwen3 14b feel so much better than qwen3 30b moe
Replies: >>105887573 >>105887574 >>105887611
Anonymous
7/13/2025, 4:06:46 AM No.105887565
>>105887549
Does reddit have the goofs?
Anonymous
7/13/2025, 4:07:35 AM No.105887573
>>105887556
because 30b only has 3b active and moe is a scam
Anonymous
7/13/2025, 4:07:54 AM No.105887574
>>105887556
I believe 14b is smarter but has less knowledge than 30b
Replies: >>105887618
Anonymous
7/13/2025, 4:16:27 AM No.105887611
>>105887556
Yeah, MoE tax makes them pretty worthless at sub-500b and 30+A.
Anonymous
7/13/2025, 4:17:06 AM No.105887618
>>105887574
Does 30b have more knowledge than 235b then? Because that ones has pretty much 0 knowledge about anything.
Anonymous
7/13/2025, 4:26:02 AM No.105887659
>>105887636
>>105887636
>>105887636