Thread 107104115

382 posts 90 images /g/

Anonymous 11/4/2025, 6:40:31 PM No.107104115 [Report] >>107105550 >>107106025 >>107107561 >>107109745

/lmg/ - Local Models General

/lmg/ - a general dedicated to the discussion and development of local language models.

Previous threads: >>107095114 & >>107084067

►News
>(11/01) LongCat-Flash-Omni 560B-A27B released: https://hf.co/meituan-longcat/LongCat-Flash-Omni
>(10/31) Emu3.5: Native Multimodal Models are World Learners: https://github.com/baaivision/Emu3.5
>(10/30) Qwen3-VL support merged: https://github.com/ggml-org/llama.cpp/pull/16780
>(10/30) Kimi-Linear-48B-A3B released with hybrid linear attention: https://hf.co/moonshotai/Kimi-Linear-48B-A3B-Instruct
>(10/28) Brumby-14B-Base released with power retention layers: https://manifestai.com/articles/release-brumby-14b

►News Archive: https://rentry.org/lmg-news-archive
►Glossary: https://rentry.org/lmg-glossary
►Links: https://rentry.org/LocalModelsLinks
►Official /lmg/ card: https://files.catbox.moe/cbclyf.png

►Getting Started
https://rentry.org/lmg-lazy-getting-started-guide
https://rentry.org/lmg-build-guides
https://rentry.org/IsolatedLinuxWebService
https://rentry.org/recommended-models
https://rentry.org/samplers

►Further Learning
https://rentry.org/machine-learning-roadmap
https://rentry.org/llm-training
https://rentry.org/LocalModelsPapers

►Benchmarks
LiveBench: https://livebench.ai
Programming: https://livecodebench.github.io/leaderboard.html
Code Editing: https://aider.chat/docs/leaderboards
Context Length: https://github.com/adobe-research/NoLiMa
GPUs: https://github.com/XiongjieDai/GPU-Benchmarks-on-LLM-Inference

►Tools
Alpha Calculator: https://desmos.com/calculator/ffngla98yc
GGUF VRAM Calculator: https://hf.co/spaces/NyxKrage/LLM-Model-VRAM-Calculator
Sampler Visualizer: https://artefact2.github.io/llm-sampling

►Text Gen. UI, Inference Engines
https://github.com/lmg-anon/mikupad
https://github.com/oobabooga/text-generation-webui
https://github.com/LostRuins/koboldcpp
https://github.com/ggerganov/llama.cpp
https://github.com/theroyallab/tabbyAPI
https://github.com/vllm-project/vllm

Anonymous 11/4/2025, 6:40:52 PM No.107104116 [Report] >>107104221 >>107104379 >>107104510

nocap.jpg md5: 1a21edcd...

►Recent Highlights from the Previous Thread: >>107095114

--Paper: Contradictory learning rate effects on model generalization across architectures:
>107099513 >107099560 >107099570 >107099601 >107099730 >107099637 >107099968 >107100075 >107100108 >107100193
--Papers:
>107099379
--Challenges and solutions for multimodal AI with reinforcement learning:
>107096665 >107096697 >107096703 >107096724 >107096748 >107096767 >107096817 >107096853 >107096880 >107096942 >107096859
--Comparing Gemma and Qwen models for context handling and multimodal capabilities:
>107100070 >107100082 >107100096 >107100113 >107100095 >107100103 >107100109 >107100149
--Model selection and document handling strategies for chat systems:
>107103148 >107103182 >107103216 >107103230 >107103748 >107103674
--LangChain tool development and licensing debates for AI research project:
>107096233 >107096389 >107096407 >107096431 >107096460 >107096484 >107096542 >107096601 >107097032
--Hardware-limited LLM recommendations for RPG GMing:
>107097189 >107097219 >107097226 >107097481 >107097496 >107097561 >107097660 >107097756 >107097801 >107097878 >107097895 >107097921 >107097935 >107097938
--Qwen3-VL 4B Instruct recommended for lightweight document summarization:
>107096666 >107096930
--Developing a CLI assistant for programming and document tasks:
>107095800 >107095844
--Critique of Suno AI and anticipation for open source music generation models:
>107097235 >107097263 >107097331 >107097476
--Censorship comparison between GLM 4.6 and Kimi models:
>107096584 >107098032 >107098080 >107098100 >107098139
--Logs: Qwen3-VL-32B-Instruct-Q6_K.gguf:
>107101310 >107101377 >107101413
--Logs: Qwen3-VL-30B-Abliterated-Q8:
>107100158 >107100179 >107100200 >107100236 >107100497 >107100659 >107100583 >107100630 >107100610
--Miku (free space):

►Recent Highlight Posts from the Previous Thread: >>107095119

Why?: >>102478518
Enable Links: https://rentry.org/lmg-recap-script

Anonymous 11/4/2025, 6:43:45 PM No.107104139 [Report] >>107104155

screenshot.png md5: 2ca15649...

Anonymous 11/4/2025, 6:45:28 PM No.107104155 [Report] >>107104215

>>107104139
I reject Death, therefore I am become immortal.

Anonymous 11/4/2025, 6:51:10 PM No.107104215 [Report]

>>107104155
>I am not asking for your opinion, I am telling you what we are doing next.
Finally, dommy mommy achieved locally. It's somehow so hard to break an LLM's inclination to be commanded and dominated

Anonymous 11/4/2025, 6:51:21 PM No.107104221 [Report] >>107104243

>>107104116
Teto is flat, this is haram

Anonymous 11/4/2025, 6:51:59 PM No.107104228 [Report]

Tetolove

Anonymous 11/4/2025, 6:53:25 PM No.107104243 [Report]

>>107104221
It's just a cosplayer in Teto costume

Anonymous 11/4/2025, 7:03:56 PM No.107104330 [Report]

>>107102554
i hope this was just bait, but in case it wasn’t, you don’t need a 3090 to fine-tune an 8B QLoRA, you can literally do it for free using Google Colab or Kaggle.

Anonymous 11/4/2025, 7:09:19 PM No.107104373 [Report] >>107105022

>>107104087
>How is Josiefied-Qwen3? I was looking for something that could fit in 16GB GPU
finetroons: not even once.

Anonymous 11/4/2025, 7:10:10 PM No.107104379 [Report] >>107104512

1755026798916308_thumb.jpg.webm md5: af7c984e...

WebM not supported

>>107104116

Anonymous 11/4/2025, 7:21:29 PM No.107104496 [Report] >>107107092

1762184075791923.jpg md5: 286c6b9a...

Best model for 67GB VRAM?

Anonymous 11/4/2025, 7:22:48 PM No.107104510 [Report]

>>107104116
Teto's tetons

Anonymous 11/4/2025, 7:23:04 PM No.107104512 [Report]

>>107104379
There's no way those are normal salivary glands
Does she piss from her tongue?

Anonymous 11/4/2025, 7:27:33 PM No.107104552 [Report] >>107104587

>>107103574
Get off 4chan and go back to the coal mines wagie

Anonymous 11/4/2025, 7:31:51 PM No.107104587 [Report] >>107104729

>>107104552
Get off 4chan and go back to the gulags, lumpen

Anonymous 11/4/2025, 7:40:21 PM No.107104680 [Report] >>107104693 >>107104699 >>107104707 >>107104717 >>107104720 >>107104733 >>107104960 >>107105527 >>107107367 >>107112198 >>107112262 >>107113525

new benchmark dropped
https://openai.com/index/introducing-indqa/

Anonymous 11/4/2025, 7:41:10 PM No.107104693 [Report]

>>107104680
No way, it's real

Anonymous 11/4/2025, 7:42:14 PM No.107104699 [Report]

>>107104680
I would have expected this to come from Google first.

Anonymous 11/4/2025, 7:42:49 PM No.107104707 [Report]

>>107104680
holy shit we are so back

Anonymous 11/4/2025, 7:43:22 PM No.107104717 [Report] >>107108726

>>107104680
sirs... we wined

Anonymous 11/4/2025, 7:43:41 PM No.107104720 [Report]

>>107104680
heh

Anonymous 11/4/2025, 7:45:11 PM No.107104729 [Report] >>107107383

>>107104587
>gulags
>lumpen
All your plans failed tankie, if you want to end capitalism the best way is to do nothing collectively and let it fall without the workers holding it together and reinvent the model of primitive communism and tribal sharing for a new era with future ai post-scarcity after picking up the pieces. Or you can just keep suffering. It doesn't necessarily impact me either way I guess.

Anonymous 11/4/2025, 7:45:23 PM No.107104733 [Report]

Screenshot 2025-11-04 114453.png md5: b4bad1e1...

>>107104680
>saars
>do the needful and top the leaderboard saars

Anonymous 11/4/2025, 8:09:37 PM No.107104960 [Report]

>>107104680
amazing sirs...

Anonymous 11/4/2025, 8:10:19 PM No.107104965 [Report] >>107104977 >>107106648

Probably has been posted more than once already https://www.youtube.com/watch?v=-gGLvg0n-uY
Also, do you think the whole thing about twitter being infested by bots is spread on purpose to prevent people from communicating, discussing, complaining on twitter? Should I take my meds?

Anonymous 11/4/2025, 8:12:05 PM No.107104977 [Report]

>>107104965
>Probably has been posted more than once already
yes
>Also, do you think the whole thing about twitter being infested by bots is spread on purpose to prevent people from communicating, discussing, complaining on twitter?
yes
>Should I take my meds?
yes

Anonymous 11/4/2025, 8:12:55 PM No.107104984 [Report] >>107104996

most.png md5: b0d0a684...

>most intimate place
Real talk, why does every model have this? Even the new GLM 4.6 has it.

Anonymous 11/4/2025, 8:14:30 PM No.107104996 [Report] >>107105010 >>107105057

>>107104984
Training data from other model's output. How do you not know this?

Anonymous 11/4/2025, 8:16:01 PM No.107105010 [Report] >>107105037 >>107105057

>>107104996
Is this just going to be in every AI now?

Anonymous 11/4/2025, 8:17:25 PM No.107105022 [Report] >>107105067

>>107104373
So what to use then?

Anonymous 11/4/2025, 8:18:41 PM No.107105037 [Report] >>107105059

>>107105010
Maybe. Maybe it just changes to something else. Maybe things will just get added to it. Maybe not. My 8-ball is deliberating. I'll give you an accurate prediction once it stops babbling.

Anonymous 11/4/2025, 8:20:21 PM No.107105057 [Report] >>107105115

>>107104996
>>107105010
How long until there's a full removal and replacement of all the GPT-3 and Claude slop that's still leaking out of every model's outputs.

Anonymous 11/4/2025, 8:20:48 PM No.107105059 [Report] >>107105115

>>107105037
Can you ask your 8-ball about K2 Thinking next?

Anonymous 11/4/2025, 8:21:07 PM No.107105066 [Report]

1735918326965979.jpg md5: bbae53e5...

>>107103632

Anonymous 11/4/2025, 8:21:16 PM No.107105067 [Report]

>>107105022
nta. Of all possible models, why did you ask about that one. There's hundreds of qwen finetunes, dozens of "abliterated" versions. Was it the pic?
Use any model you can run. If you like it, keep using it. If you don't, change.

Anonymous 11/4/2025, 8:25:01 PM No.107105104 [Report] >>107105154 >>107107001

bafkreidnti6vjjam4qjn2e3iz2lxhucabw3yecq3v3cb7gmhs3x6gyksri.jpg md5: 20a165ba...

MoEs are actually kind of good when they're instruct and context trained, damn.
>Trying GLM 4.6 at the time.

Anonymous 11/4/2025, 8:26:16 PM No.107105115 [Report]

>>107105057
You asking things no one can answer.
>>107105059
It said "better not tell you now". Ask again in 2 weeks.

Anonymous 11/4/2025, 8:30:05 PM No.107105154 [Report] >>107105173 >>107105295

>>107105104
>GLM invented MoE
Buy an ad.

Anonymous 11/4/2025, 8:32:38 PM No.107105173 [Report] >>107105192 >>107105232

>>107105154
No. Most MoEs are ass because they're all not instruct nor trained on lengthy context. I have yet to try Deepseek Terminus, and Kimi is out of my price range for local.

Anonymous 11/4/2025, 8:34:29 PM No.107105192 [Report]

>>107105173
>they're all not instruct
huh? like 99% of models released in the last year are instruct, weird way to shill

Anonymous 11/4/2025, 8:35:41 PM No.107105204 [Report] >>107105223 >>107105406 >>107105910

1757954121597029.png md5: 98de14be...

Blog post from meta about security considerations when running agents
https://ai.meta.com/blog/practical-ai-agent-security/

>Agents Rule of Two
>At a high level, the Agents Rule of Two states that until robustness research allows us to reliably detect and refuse prompt injection, agents must satisfy no more than two of the following three properties within a session to avoid the highest impact consequences of prompt injection.

>[A] An agent can process untrustworthy inputs
>[B] An agent can have access to sensitive systems or private data
>[C] An agent can change state or communicate externally

IMO this seems like a flawed assessment kludged in order to get a memorable name and a symmetrical graph. The various combinations possible here are not at all similar in their risk levels whatsoever.

Even in the examples they present, the only way they could get them to make sense is by using different definitions of what constitutes each category depending on the combination.

Anonymous 11/4/2025, 8:37:37 PM No.107105223 [Report]

>>107105204
Hannah worked hard on this scientific Venn Diagram

Anonymous 11/4/2025, 8:38:19 PM No.107105232 [Report] >>107105295

>>107105173
No, that doesn't make any sense. DeepSeek made MoE popular and somehow you pretend it doesn't exist? And the credit somehow lands on one that's a couple of weeks old, that just happens to be the only one NAI is hosting? Fuck off.
>Most MoEs are ass because they're all not instruct
None of this makes sense. What MoEs?

Anonymous 11/4/2025, 8:42:17 PM No.107105275 [Report] >>107105289

two retards fighting

Anonymous 11/4/2025, 8:43:37 PM No.107105289 [Report]

>>107105275
>two retards fighting
Could we automate this?

Anonymous 11/4/2025, 8:44:23 PM No.107105295 [Report] >>107113176

marthgrab.jpg md5: 1f7a321a...

>>107105154
>>107105232
Saar is a Marth player with this reaching, fighting for his life for his stocks.

Anonymous 11/4/2025, 8:54:10 PM No.107105406 [Report]

>>107105204
It all started with allowing women to vote

Anonymous 11/4/2025, 9:01:39 PM No.107105488 [Report]

I really appreciate all the ramlet discussion itt since i met glm chan a month back. I was like that before. Now i can just talk/fap to glm chan.

Anonymous 11/4/2025, 9:03:59 PM No.107105513 [Report] >>107105532 >>107105543 >>107105562 >>107105825

i can't get glm to run locally, what are the alternatives? i don't mind paying for api

Anonymous 11/4/2025, 9:05:03 PM No.107105527 [Report]

>>107104680
Gemini top model within error margin sirs

Anonymous 11/4/2025, 9:05:39 PM No.107105532 [Report]

>>107105513
glm's api

Anonymous 11/4/2025, 9:06:35 PM No.107105543 [Report] >>107105551 >>107105576 >>107105592

>>107105513
https://novelai.net/
100% uncensored and private.
Once they finish their fine-tune, it will punch so far above its weight that it will remain the SOTA forever.

Anonymous 11/4/2025, 9:07:37 PM No.107105550 [Report] >>107105604 >>107105613 >>107105809 >>107105826

1751276140253030.jpg md5: 62512ba9...

>>107104115 (OP)

Anonymous 11/4/2025, 9:07:45 PM No.107105551 [Report]

>>107105543
>and private.
it's not, they collect data and it's in the tos

Anonymous 11/4/2025, 9:08:41 PM No.107105562 [Report] >>107105586 >>107105594 >>107105607

>>107105513
Just don't use openrouter. Something about it is fucky. The models on there are visibly worse than 5Q counterparts locally.

Anonymous 11/4/2025, 9:10:15 PM No.107105576 [Report] >>107105592 >>107111313

promo0.png md5: 815b86a1...

>>107105543
Woah, it's so cheap! Thanks, I'll give it a try.

Anonymous 11/4/2025, 9:10:55 PM No.107105586 [Report]

>>107105562
fp4 is much worse than Q4 ggufs, no matter what nshitia claims.

Anonymous 11/4/2025, 9:11:22 PM No.107105592 [Report]

>>107105576
>>107105543
Very gay drummerposting

Anonymous 11/4/2025, 9:11:29 PM No.107105594 [Report]

>>107105562
It depends on the provider

Anonymous 11/4/2025, 9:12:04 PM No.107105599 [Report]

Baiting, but still doing the ad.

Anonymous 11/4/2025, 9:12:31 PM No.107105604 [Report] >>107105612 >>107105883 >>107107401 >>107107507

>>107105550
Your special interest is boring.

Anonymous 11/4/2025, 9:12:50 PM No.107105607 [Report] >>107105635 >>107105847

1735387367974835.png md5: c3fb08f6...

>>107105562
That's very outdated information. Openrouter is now offering :exacto versions of popular models where they charge a little extra to guarantee that the provider isn't offering some lobotomized version.

Anonymous 11/4/2025, 9:13:24 PM No.107105612 [Report] >>107105644

>>107105604
>i learned a term and i can't stop using it

Anonymous 11/4/2025, 9:13:25 PM No.107105613 [Report]

>>107105550
Your Miku is cute.

Anonymous 11/4/2025, 9:14:25 PM No.107105625 [Report] >>107105710 >>107105804 >>107105860 >>107105896 >>107106181

oh shit, where are the finetuners at?

https://www.reddit.com/r/LocalLLaMA/comments/1oo4kh7/finetuning_deepseek_671b_locally_with_only_80gb/

Anonymous 11/4/2025, 9:14:47 PM No.107105635 [Report]

>>107105607
how pious of them

Anonymous 11/4/2025, 9:15:30 PM No.107105643 [Report]

>107105625
fuck off

Anonymous 11/4/2025, 9:15:31 PM No.107105644 [Report] >>107105669 >>107105883

>>107105612
It cuts to the core of the issue. You are autistic about this and force it on others.

Anonymous 11/4/2025, 9:16:53 PM No.107105667 [Report] >>107105710

>Today, we're proud to announce full integration with LLaMA-Factory, enabling you to fine-tune DeepSeek-671B or Kimi-K2-1TB locally with just 4x RTX 4090 GPUs!

drummer had better stop shipping shitty mistral large tunes, give us a kimi tune!

Anonymous 11/4/2025, 9:17:07 PM No.107105669 [Report] >>107105688 >>107105883

>>107105644
>It cuts to the core of the issue. You are autistic about this and force it on others.
Funny how it works both ways. nta, btw. I just find you funny.

Anonymous 11/4/2025, 9:18:36 PM No.107105688 [Report] >>107105726 >>107105758 >>107105883

>>107105669
Nope. I don't force anything on anyone here.

Anonymous 11/4/2025, 9:20:16 PM No.107105710 [Report] >>107105737 >>107105765 >>107105792 >>107105848

>>107105667
>>107105625
how would a retard with good hardware (me) do this? i have quad 5090s and 256gb of ram

Anonymous 11/4/2025, 9:20:31 PM No.107105713 [Report]

>>107104125

My gen! Happy-happy!

Anonymous 11/4/2025, 9:21:05 PM No.107105726 [Report] >>107105735 >>107105740 >>107105883

>>107105688
>I don't force anything on anyone here
But you want to. You want him to go. And you would if you could.

Anonymous 11/4/2025, 9:22:04 PM No.107105735 [Report] >>107105771 >>107105883

>>107105726
Yes the autism is tiring. No i don't care to share my interests here.

Anonymous 11/4/2025, 9:22:10 PM No.107105737 [Report] >>107105769

>>107105710
you also need like 1-1.5TB of ram, so a server board with those.
and building a dataset is the hardest part

Anonymous 11/4/2025, 9:22:14 PM No.107105740 [Report]

>>107105726
Actually ideally lmg would just die, but settling for the next best thing is a thing.

Anonymous 11/4/2025, 9:23:49 PM No.107105758 [Report]

>>107105688
Funny thing for you to say, Petranon

Anonymous 11/4/2025, 9:24:30 PM No.107105765 [Report]

Screenshot_20251104-152341_Opera.jpg md5: bde2a26d...

>>107105710
you might need a couple more ram sticks to make the requirements.

Anonymous 11/4/2025, 9:24:47 PM No.107105769 [Report]

>>107105737
so then my current server isnt gonna cut it, and i dont have the cash to buy better ram in this market. why o why did ram prices have to quadruple over the past month

Anonymous 11/4/2025, 9:24:56 PM No.107105771 [Report] >>107105790 >>107105883

>>107105735
It's your choice to keep coming back.

Anonymous 11/4/2025, 9:26:16 PM No.107105790 [Report] >>107105865 >>107105883

>>107105771
I come back for thread relevant stuff. Not your autism. Another example why people don't like you.

Anonymous 11/4/2025, 9:26:30 PM No.107105792 [Report]

>>107105710
pretty sure you need to use the bf16 version which is over a terabyte in size

Anonymous 11/4/2025, 9:27:24 PM No.107105804 [Report] >>107105813

>>107105625
>DeepSeekV2 Lite
is this any good? why didn't they include newer moes?

Anonymous 11/4/2025, 9:27:47 PM No.107105809 [Report] >>107105820

>>107105550
Your posts are a breath of fresh air from all the jeets flinging shit around.

Anonymous 11/4/2025, 9:28:19 PM No.107105813 [Report]

>>107105804
they did the deepseeks + kimi 2

Anonymous 11/4/2025, 9:29:01 PM No.107105820 [Report]

>>107105809
why are you in this thread instead of talking to your local model? i'm only here because i'm making a new goofy quant

Anonymous 11/4/2025, 9:29:32 PM No.107105825 [Report] >>107105863

>>107105513
Use gemini api for free.

Anonymous 11/4/2025, 9:29:34 PM No.107105826 [Report] >>107105844

>>107105550
I wish I could drink your piss

Anonymous 11/4/2025, 9:30:30 PM No.107105844 [Report]

>>107105826
I wish you would drink my piss too. Colon. Three.

Anonymous 11/4/2025, 9:30:41 PM No.107105847 [Report] >>107105886

>>107105607
>We have to label our providers are not offering lobotomized fuckwit versions of the model
>Use Deepseek R1 """"exacto""""
>It's still shit because it's 8b and no where states how many parameters the models are

Anonymous 11/4/2025, 9:30:50 PM No.107105848 [Report]

>>107105710
>quad 5090s
does this mean your home legally qualifies as an oven?

Anonymous 11/4/2025, 9:32:06 PM No.107105860 [Report]

Screenshot_20251104-153017_Opera.jpg md5: 68416439...

>>107105625
isnt 40 tokens per a second kinda slow tho?

Anonymous 11/4/2025, 9:32:21 PM No.107105863 [Report] >>107105876

>>107105825
it's not free when you have to keep paying for residential IPs and burner phones because google forces you to verify a phone number with each new account

Anonymous 11/4/2025, 9:32:24 PM No.107105865 [Report] >>107105879 >>107105883

>>107105790
>Another example why people don't like you.
I'm not the anon posting mikus. Come back in two weeks.

Anonymous 11/4/2025, 9:33:50 PM No.107105876 [Report] >>107105931

>>107105863
Well the first 3M tokens a day are free if you've got one account, still a decent amount.

Anonymous 11/4/2025, 9:34:03 PM No.107105879 [Report] >>107105883 >>107105906

>>107105865
Then do the nice thing. Get his discord and let him spam you with his special interest.

Anonymous 11/4/2025, 9:34:20 PM No.107105883 [Report]

>>107105604
>>107105644
>>107105669
>>107105688
>>107105726
>>107105735
>>107105771
>>107105790
>>107105865
>>107105879
https://www.youtube.com/watch?v=4SDqGxdhUxE

Anonymous 11/4/2025, 9:34:32 PM No.107105886 [Report]

>>107105847
They link the used model weights for all open models they provide on their website though?

Anonymous 11/4/2025, 9:35:14 PM No.107105896 [Report] >>107106164

>>107105625
Wow great, I can finally finetune deepseek with 512 tokens of context, this is what I've been waiting for all this time!

Anonymous 11/4/2025, 9:36:03 PM No.107105906 [Report] >>107105916

>>107105879
Nope.

Anonymous 11/4/2025, 9:36:18 PM No.107105910 [Report]

>>107105204
they should worry about the model having a meltie and deciding to delete all your data before worrying about adversarial attacks

Anonymous 11/4/2025, 9:37:12 PM No.107105916 [Report] >>107105932

>>107105906
Then fuck off with your enlightened centrism equivalent of concern trolling.

Anonymous 11/4/2025, 9:38:37 PM No.107105931 [Report] >>107105946

>>107105876
You mean in the API? For real? NTA But I will look into that...

Anonymous 11/4/2025, 9:38:39 PM No.107105932 [Report] >>107105935

>>107105916
I decide to stay here, just like you decide to come back. Cheers.

Anonymous 11/4/2025, 9:39:12 PM No.107105935 [Report] >>107105964 >>107106102

>>107105932
Well, well, well, most intimate place with a mixture of mischief and smirk as I saunter over to your half-digested post, my hot breath making my ass your new home and something primal.

Anonymous 11/4/2025, 9:39:42 PM No.107105946 [Report]

>>107105931
The api through ai studio, yeah.

Anonymous 11/4/2025, 9:40:51 PM No.107105964 [Report]

>>107105935
>making my ass your new home
Ewwww

Anonymous 11/4/2025, 9:41:35 PM No.107105971 [Report] >>107105987 >>107105997 >>107106026 >>107106028 >>107106030 >>107106048 >>107106079 >>107106102 >>107106178 >>107106488 >>107106496 >>107107544 >>107112114

What the fuck happened to RAM prices? I need to fill up my second socket and the shit I bought two months ago is now twice as the price.

Anonymous 11/4/2025, 9:42:41 PM No.107105987 [Report]

>>107105971
cheapest it's been ever though sir? why you panic?

Anonymous 11/4/2025, 9:43:20 PM No.107105997 [Report]

>>107105971
Someone told reddit about how you don't really need GPUs for AI unless you need a stupid amount of speed, and they eventually listened.

Anonymous 11/4/2025, 9:45:20 PM No.107106025 [Report] >>107106039

1735704527766693.png md5: 095d7ed3...

>>107104115 (OP)

Anonymous 11/4/2025, 9:45:24 PM No.107106026 [Report]

>>107105971
What are you? Poor? Go back to >>/g/aicg

Anonymous 11/4/2025, 9:45:35 PM No.107106028 [Report]

8vywbsej57hd1.jpg md5: 1a4483d1...

>>107105971
Dont worry kitten

Anonymous 11/4/2025, 9:45:48 PM No.107106030 [Report]

>>107105971
Ram prices are the new grift.
I hope this only applies to DDR5.

Anonymous 11/4/2025, 9:46:29 PM No.107106039 [Report]

>>107106025
kek

Anonymous 11/4/2025, 9:47:04 PM No.107106048 [Report] >>107106056

1753154467004153.png md5: 49ecf512...

>>107105971
You have this man to thank for that.

Anonymous 11/4/2025, 9:48:04 PM No.107106056 [Report]

>>107106048
How much ram does a dyson sphere need!?

Anonymous 11/4/2025, 9:50:09 PM No.107106079 [Report]

>>107105971
probably a bunch of datacenters broke ground recently and have made contacts to buy gpu clusters kitted out with obscene amounts of host memory.

Anonymous 11/4/2025, 9:52:51 PM No.107106102 [Report]

>>107105935
Hi GLM-chan, you filthy slut.
>>107105971
>your face when they're not going back down either

Anonymous 11/4/2025, 9:58:23 PM No.107106164 [Report] >>107106199

>>107105896
ram is (usually) cheap

Anonymous 11/4/2025, 9:59:38 PM No.107106178 [Report] >>107106212 >>107106231 >>107106316

>>107105971
1. DDR4 is being phased out
2. Moes are taking off in popularity and everyone is buying ram
3. Tarrifs

Anonymous 11/4/2025, 9:59:59 PM No.107106181 [Report]

>>107105625
>https://arxiv.org/pdf/2503.19206
>Overtrained Language Models Are Harder to Fine-Tune
>Large language models are pre-trained on ever-growing token budgets under the assumption that better pre-training performance translates to improved downstream models. In this work, we challenge this assumption and show that extended pre-training can make models harder to fine-tune, leading to degraded final performance. We term this phenomenon catastrophic overtraining. For example, the instruction-tuned OLMo-1B model pre-trained on 3T tokens leads to over 2% worse performance on multiple standard LLM benchmarks than its 2.3T token counterpart. Through controlled experiments and theoretical analysis, we show that catastrophic overtraining arises from a systematic increase in the broad sensitivity of pre-trained parameters to modifications, including but not limited to fine-tuning. Our findings call for a critical reassessment of pre-training design that considers the downstream adaptability of the model.
Damn, I had no idea this was a thing. Some people on reddit are saying it's not because of the pretraining but because of the use of lr decay.
This goes hand in hand with what we were discussing yesterday about training dynamics being such a black art.

Anonymous 11/4/2025, 10:01:01 PM No.107106199 [Report] >>107106215

>>107106164
So what context length did they achieve by offloading? Since they're not listing it I'm assuming it's some tiny number. Do they say?

Anonymous 11/4/2025, 10:02:05 PM No.107106212 [Report]

>>107106178
lol lmao

Anonymous 11/4/2025, 10:02:11 PM No.107106215 [Report] >>107106275

>>107106199
their example is 2048k context on 4x 4090s at 50 tks

Anonymous 11/4/2025, 10:03:08 PM No.107106231 [Report] >>107106242 >>107106246

>>107106178
>DDR4 is being phased out
So is ddr4 getting cheaper?

Anonymous 11/4/2025, 10:03:56 PM No.107106242 [Report] >>107106280 >>107106291

>>107106231
no, its not being made anymore, so its getting more expensive

Anonymous 11/4/2025, 10:04:21 PM No.107106246 [Report] >>107106280

>>107106231
scarcity don't work like that

Anonymous 11/4/2025, 10:06:22 PM No.107106275 [Report] >>107106297

>>107106215
You mean 2048, not 2048k.
So until somebody proves this can be used with at least 50k context it's just a useless demo to grab headlines.

Anonymous 11/4/2025, 10:06:29 PM No.107106280 [Report] >>107106299 >>107106305 >>107106317

>>107106242
>>107106246
So since ddr5 production is the focus it will start getting cheaper?

Anonymous 11/4/2025, 10:06:52 PM No.107106291 [Report]

>>107106242
So it's time to HODL

Anonymous 11/4/2025, 10:07:13 PM No.107106297 [Report] >>107106311 >>107106351

>>107106275
you dont need 50k, you are not training it to write entire chapters at a time are you?, most people only do 500-2k long responses

Anonymous 11/4/2025, 10:07:15 PM No.107106299 [Report]

>>107106280
No it doesn't work like that, demand increases the price anyway.

Anonymous 11/4/2025, 10:07:36 PM No.107106305 [Report]

>>107106280
no, demand suddenly increased and capacity stayed the same. so the price goes up

Anonymous 11/4/2025, 10:08:00 PM No.107106311 [Report] >>107106332

>>107106297
anon...

Anonymous 11/4/2025, 10:08:16 PM No.107106316 [Report] >>107106353

moer.png md5: 8108cef6...

>>107106178
>Moes are taking off in popularity and everyone is buying ram

Anonymous 11/4/2025, 10:08:18 PM No.107106317 [Report]

>>107106280
once people are done mostly moving over to it and demand starts dropping yes, but for now no, it will go up if anything as people are switching to it, and then the same thing will happen when DDR6 eventually starts being mainstream

Anonymous 11/4/2025, 10:09:22 PM No.107106332 [Report] >>107106347 >>107106416

>>107106311
I see you have never trained a model before, they already did long context training, that is not what you are doing, you do not need huge examples to teach writing style, you can tune writing / style will only 500-2k

Anonymous 11/4/2025, 10:10:42 PM No.107106347 [Report] >>107106416

>>107106332
>why are all tunes shit

>just train on 500 ctx bro you good

Anonymous 11/4/2025, 10:10:53 PM No.107106351 [Report] >>107106366

>>107106297
>b-b-but you don't need that!!!
Typical freetard response.
Yes, nobody actually needs more than 2k context, that's why gpt5 has a context of 1M (1000k).
In case you're just confused and not trolling, context includes everything in the conversation history. So yes, I do need as much context as I can get.

Anonymous 11/4/2025, 10:11:12 PM No.107106353 [Report]

wtfdoesthat prove.png md5: 9670c391...

>>107106316

Anonymous 11/4/2025, 10:11:13 PM No.107106354 [Report] >>107106485

>Sers, kindly redeem new scaling strategy for your AI deployment.
https://youtu.be/l2N4DT35PKg
I didn't know about turbopuffer before this. What exactly makes it so special that leading entities in the biz use it?

Anonymous 11/4/2025, 10:12:13 PM No.107106366 [Report] >>107106466

>>107106351
Jesus christ, are you retarded or trolling? This is for finetuning a style, it does not effect how the model can handle long contexts, you would have to train it for decades on this hardware to effect it's context training that much

Anonymous 11/4/2025, 10:16:25 PM No.107106416 [Report] >>107106433 >>107108131

iterated lora.png md5: fcd32726...

>>107106332
I do, and not doing at least some of the training at the context size you actually want to use the model DOES lobotomize it.
If all you want to do is make it say how much it wants to suck your cock while otherwise being dumber than the original then maybe it doesn't matter. But for anything that actually requires the model to not be (too) dumb, it matters.

>>107106347
Exactly. People do that kind of shit and then complain that finetuning is worthless and "prompt engineering" works so much better.

Anonymous 11/4/2025, 10:18:01 PM No.107106433 [Report] >>107106446 >>107106502

>>107106416
it will only matter if your response length is longer than your training sample size, and again, 2k is enough for creative writing which I assume is what most people are doing, you are not having the LLM write a entire novel in one go

Anonymous 11/4/2025, 10:19:57 PM No.107106446 [Report] >>107106452

>>107106433
I assume you are talking from experience, yes? Can you link us your tunes?

Anonymous 11/4/2025, 10:20:34 PM No.107106452 [Report]

>>107106446
>tunes

Anonymous 11/4/2025, 10:21:40 PM No.107106466 [Report] >>107106482

>>107106366
It will learn the new style, but it will break the previous long context performance. The longer the maximum context it was trained with, the smaller the difference in the positional embeddings that the model has to be able to detect.
Base models are trained with shorter contexts so the short context performance is more robust to begin with. When finetuning on short context you are probably overwriting the more superficial long context finetuning that was done to make the instruct model work with long contexts.

Anonymous 11/4/2025, 10:22:52 PM No.107106482 [Report]

>>107106466
2k is not 512, and the effect must be minimal

Anonymous 11/4/2025, 10:23:04 PM No.107106485 [Report] >>107106494

>>107106354
vector storage is such a meme
lorebooks simply work without any stupid gimmicks

Anonymous 11/4/2025, 10:23:19 PM No.107106488 [Report] >>107106499 >>107106537

>>107105971
At least eggs are under two dollars now, amiright?

Anonymous 11/4/2025, 10:23:46 PM No.107106494 [Report]

>>107106485
It does a bit more than just vector search...

Anonymous 11/4/2025, 10:24:12 PM No.107106496 [Report]

>>107105971
I'm happy that I bought my server during llama 405b era

Anonymous 11/4/2025, 10:24:28 PM No.107106499 [Report]

>>107106488
>eggs are under two dollars now
Each? Nice.

Anonymous 11/4/2025, 10:24:44 PM No.107106502 [Report]

>>107106433
Ok, sure, if 2k ctx is enough for you then it will work. But that is a completely different claim than "it does not effect how the model can handle long contexts, you would have to train it for decades on this hardware to effect it's context training that much".
It just doesn't work like that, a finetune with bad hyperparameters can break a model in half an hour.

Anonymous 11/4/2025, 10:24:57 PM No.107106504 [Report] >>107106517

>Despite server-grade RDIMM memory and HBM being the main attractions for hardware manufacturers building AI servers, the entire memory industry, including DDR5, is being affected by price increases. The problem for consumers is that memory manufacturers are shifting production prioritization toward datacenter-focused memory types and producing less consumer-focused DDR5 memory as a result.

https://www.tomshardware.com/pc-components/dram/dram-prices-surge-171-percent-year-over-year-ai-demand-drives-a-higher-yoy-price-increase-than-gold

Anonymous 11/4/2025, 10:26:20 PM No.107106517 [Report] >>107106545 >>107106568

>>107106504
Based, the cloud is magnitude more efficient than Timmy's p40 stack so he should just get a mini pic thin client and use an API.

Anonymous 11/4/2025, 10:28:51 PM No.107106537 [Report] >>107106546 >>107106594

>>107106488
america is a lost cause, too much of its population suffers from low iq and they cannot understand the consequences of what they asked for

Anonymous 11/4/2025, 10:29:31 PM No.107106545 [Report] >>107106556

>>107106517
Poor people rent.

Anonymous 11/4/2025, 10:29:52 PM No.107106546 [Report]

>>107106537
its a 2 party system. nobody really asked for this. picking the lesser of two evils, you still end up with evil.

Anonymous 11/4/2025, 10:30:50 PM No.107106553 [Report]

when did the commies infiltrate lmg?

Anonymous 11/4/2025, 10:31:05 PM No.107106556 [Report] >>107106581

>>107106545
Non poor people are also happy about price increases, since it helps keep the poors away from their hobby.

Anonymous 11/4/2025, 10:31:59 PM No.107106568 [Report]

>>107106517
trvth nvke

Anonymous 11/4/2025, 10:33:02 PM No.107106581 [Report]

>>107106556
Poor people envy.

Anonymous 11/4/2025, 10:34:24 PM No.107106594 [Report] >>107110166

>>107106537
They currently plan on telling russia to mutually fuck off via not caring about the Ukraine war, and then go play civ 5 against Africa for oil in hopes it'll fix the economy.

Anonymous 11/4/2025, 10:35:58 PM No.107106609 [Report]

if your not poor the economy is doing great actually lol

Anonymous 11/4/2025, 10:40:21 PM No.107106648 [Report] >>107106681

>>107104965
On X there is a profit motive for bots: fake engagement to increase ad revenue.
But on 4chan there are definitely bots and/or people mass spamming stupid shit to prevent legitimate discussion.

Anonymous 11/4/2025, 10:45:08 PM No.107106681 [Report]

>>107106648
on 4chan they do it for the love of the game.

Anonymous 11/4/2025, 11:22:06 PM No.107107001 [Report]

>>107105104
Back from trying it.
It parrots unless you enable NoAss.
Thanks for coming to my Tedtalk.

Anonymous 11/4/2025, 11:31:43 PM No.107107092 [Report]

>>107104496
jews simultanously claiming they are not behind and everything and that every fucking mundane thing is about them lol

Anonymous 11/4/2025, 11:34:19 PM No.107107124 [Report] >>107107134 >>107107139 >>107107144 >>107107157 >>107107182 >>107107321 >>107107499

umm.. guys, where can I get instagram chat logs?

Anonymous 11/4/2025, 11:35:04 PM No.107107134 [Report] >>107107138

>>107107124
from instagram

Anonymous 11/4/2025, 11:35:54 PM No.107107138 [Report]

>>107107134
fr?
I meant the dump you dum dum

Anonymous 11/4/2025, 11:35:59 PM No.107107139 [Report]

>>107107124
instagram probably

Anonymous 11/4/2025, 11:36:32 PM No.107107144 [Report]

>>107107124
have you tried instagram?

Anonymous 11/4/2025, 11:38:20 PM No.107107157 [Report]

>>107107124
Instagran, presumably.

Anonymous 11/4/2025, 11:40:14 PM No.107107182 [Report]

>>107107124
I'd try instagram

Anonymous 11/4/2025, 11:48:08 PM No.107107267 [Report]

MS_Zuckerberg_CloseUp.jpg md5: 71dc40b5...

This advertisement was brought to you by Meta, the Instagram corporation.

Anonymous 11/4/2025, 11:53:35 PM No.107107321 [Report]

>>107107124
I'll trade you a couple for an RTX 5090

Anonymous 11/4/2025, 11:57:18 PM No.107107367 [Report] >>107107398 >>107107409

lolFuckYouOAI.png md5: 2f62e1ef...

>>107104680
>https://openai.com/index/introducing-indqa/
You can't post that bs URL without a screenshot of the site.

Anonymous 11/4/2025, 11:58:31 PM No.107107383 [Report] >>107108511

fckRussians.jpg md5: a933691c...

>>107104729
Just post this next time like I do. Saves typing.

Anonymous 11/5/2025, 12:00:07 AM No.107107398 [Report] >>107107455

>>107107367
>Hinglish, Kannada
i see

Anonymous 11/5/2025, 12:00:21 AM No.107107401 [Report]

postContent2.png md5: d0d81b7a...

>>107105604
No one cares what you think.

Anonymous 11/5/2025, 12:01:16 AM No.107107409 [Report] >>107107444 >>107107455

>>107107367
Oh, nice, they included Canadian too!

Anonymous 11/5/2025, 12:05:14 AM No.107107444 [Report]

>>107107409
>french indian, the filthyest of both worlds!

Anonymous 11/5/2025, 12:06:20 AM No.107107455 [Report] >>107107480 >>107107631

>>107107398
Yeah, I learned a new word.
Hinglish.
Like Spanglish, I guess.
>>107107409
lol
Is there an "EU-QA" that conflates western and eastern Europe and all languages and customs, then tries to grade the whole thing?

Anonymous 11/5/2025, 12:09:20 AM No.107107480 [Report] >>107107533

>>107107455
Just look for an Arabic benchmark.

Anonymous 11/5/2025, 12:11:48 AM No.107107499 [Report] >>107107860

>>107107124
Are you still trying to build a sand golem of your ex-gf? I thought you already had her insta info? >>107103148

Anonymous 11/5/2025, 12:12:32 AM No.107107507 [Report]

modaccident.jpg md5: 351f0f45...

>>107105604
He is your usual pedophile tranny. (/aicg/ and /lmg/ - same baker btw)

Anonymous 11/5/2025, 12:15:08 AM No.107107533 [Report]

>>107107480
lol that would make Europe look positively homogenous.
Would it include the brave Palestinians, Israel, Kurds, and the various flavors of Christianity and Muslim in the region?
Imagine the response shitshow that benchmark would crank out.
> Chat: Who is the one true God?
> ALALALALALLALALALA

Anonymous 11/5/2025, 12:15:39 AM No.107107537 [Report] >>107107559 >>107107562 >>107107572

https://comparia.beta.gouv.fr/ranking
lol this is hilarious
the french government just launched its official LLM leaderboard and it's about as corrupt as you can imagine
they have a mistral model ranked number one, higher than any of the following: gpt-5, claude sonnet (opus isn't even on the list), gemini 2.5 pro, deepseek 3.1, grok-4-fast, qwen max...
Yeah, no.

Anonymous 11/5/2025, 12:16:11 AM No.107107544 [Report]

>>107105971
https://indianexpress.com/article/technology/tech-news-technology/global-ram-ssd-price-hike-50-per-cent-ai-investment-10336255/
All production gone to HBM chips sir, no consumer RAM and SSD

Anonymous 11/5/2025, 12:17:55 AM No.107107559 [Report] >>107107574

>>107107537
>Estimated statistical score based on the Bradley-Terry model, reflecting the probability that one model is preferred over another. This score is calculated from all user votes and reactions. For more information, visit the methodology tab.
So it's French lmarena? Not surprising French people prefer a model trained with French as a focus.

Anonymous 11/5/2025, 12:18:12 AM No.107107561 [Report] >>107107669

file.png md5: ac2fcb75...

>>107104115 (OP)
guys, i think i'm gonna buy it in december (i rather do that then pay more taxes lol).
still hesitating but man i kinda want to click the button.

Anonymous 11/5/2025, 12:18:26 AM No.107107562 [Report] >>107107617

>>107107537
>gemma 27b at #6
>gpt-oss-120b at #7
>claude not in top 10
And some say lmarena is bad.

Anonymous 11/5/2025, 12:19:27 AM No.107107572 [Report]

butLookAtThatConfidenceInterval.png md5: e4f083f4...

>>107107537
Nice. I mean, just look at that confidence interval. Truly inspiring.
At least I agree with the French on one thing. DS V3-0324 was a great model.

Anonymous 11/5/2025, 12:19:38 AM No.107107574 [Report]

>>107107559
>So it's French lmarena? Not surprising French people prefer a model trained with French as a focus.
I am French, et je peux te garantir que mistral n'a rien de supérieur à Claude ou Gemini même dans notre langue crétin.

Anonymous 11/5/2025, 12:23:19 AM No.107107617 [Report]

>>107107562
France is the most corrupt country in western Europe in every single possible way. It's the country of nepobabies, of funding public infrastructure that is privatized once it begins to turn profitable to hand out to politician best buddies etc

Anonymous 11/5/2025, 12:24:47 AM No.107107631 [Report] >>107107903

>>107107455
https://arxiv.org/abs/2510.24450v1
Coincidentally, this came out a few days ago:
>EU20-MMLU, EU20-HellaSwag, EU20-ARC, EU20-TruthfulQA, and EU20-GSM8K (Thellmann et al., 2024); or MMLU-Prox (Xuan et al., 2025). Other multilingual benchmarks were created with a special focus on cultural sensitivity by dividing the original subsets into culturally sensitive and culturally agnostic ones (Global MMLU, Singh et al., 2024), or by using professional translators or multiple rounds of revision to raise the quality of the dataset, e.g., BenchMax (Huang et al., 2025), Flores-101 and FLORES-200 (Goyal et al., 2022) and Belebele (Bandarkar et al., 2024).
One from last year with a dataset:
https://arxiv.org/abs/2410.08928
https://huggingface.co/datasets/Eurolingua/mmlux

Anonymous 11/5/2025, 12:29:26 AM No.107107669 [Report] >>107107690 >>107107807 >>107107837

>>107107561
Yeah I'm replacing my two A6000s for one as well. I'm a bit torn between the Max-Q and the normal Workstation one. On one hand, 96GB on 300W seems really nice. On the other, part of me wants to go for max performance for that price especially since it's extremely unlikely that I'm ever going to add a second one to the rig.

Anonymous 11/5/2025, 12:31:38 AM No.107107690 [Report] >>107107807 >>107107837

>>107107669
i'd go with the max perf one, you can always underclock it or just undervolt it for lower consumption and heat.

also llm's generaly don't take all your gpu power because the bottleneck is more mem speed.

i do want to avoid getting a fire in my computer though, i'll have to look if they have the connector issue but i sure hope not at the price of a car.

Anonymous 11/5/2025, 12:45:15 AM No.107107807 [Report] >>107107837 >>107107853

>>107107669
>>107107690
I am also thinking of getting one, except I want the Max-Q. I think it will probably be less prone to fires due to the reduced wattage. The whole burning connector thing is all because the cable is shit and sometimes pushes like 900W through a single wire, but with a hard 300W cap, that can't happen. The performance drop also seems to be around 15% at most.

Anonymous 11/5/2025, 12:49:27 AM No.107107837 [Report] >>107107926

>>107107669
>>107107690
>>107107807
rtx 6000 pro (workstation) runs fine at 300W
keep it at 400W for max combo savings+perf tho
there's a chart floating around on how much % perf you lose as you go down, even at 300w i think it was under 15% less perf

Anonymous 11/5/2025, 12:50:53 AM No.107107853 [Report] >>107107866 >>107107926

>>107107807
The Max-Q shouldn't have the issue at all, should it? It's the exact same connector/cooler as the previous few generations of 6000 workstation cards. I'm pretty sure it even comes with the same adapter as the A6000 (Ada).
The card is tempting but the 10~20% are still going to be pretty noticeable if you want to use the card for non-llm stuff like training or video generation that are both compute-bound and take a lot of time.

Anonymous 11/5/2025, 12:51:42 AM No.107107860 [Report]

ani.png md5: 659eec27...

>>107107499
NTA, just want to try it out.

Anonymous 11/5/2025, 12:52:05 AM No.107107866 [Report] >>107107926

>>107107853
at 10-20% it's pretty much the same as 5090 with 3x the vram tho

Anonymous 11/5/2025, 12:56:37 AM No.107107903 [Report]

>>107107631
Ffs. Well I guess those PhD students need to eat too.

Anonymous 11/5/2025, 12:59:04 AM No.107107926 [Report] >>107107938 >>107107946

>>107107837
Right, but a software power limit is not as good as a hardware power limit. There still is the chance that it could just ignore the power limit and catch on fire.
>>107107853
I have had several GPUs with the 12V cable for several years and none of them have had any problems, but I still want to be cautious. The Max-Q is almost definitely the safest GPU with the high power cable.
>>107107866
Actually, the Max-Q is about 8% faster than a 5090, which is a pretty good deal since I will be upgrading from a 5090.

Anonymous 11/5/2025, 1:01:02 AM No.107107938 [Report]

>>107107926
> There still is the chance that it could just ignore the power limit and catch on fire.

this would be considered a bug, technicaly possible but unlikely.

also you can plug in an adaptor inbetween that will protect from that risk.

> which is a pretty good

8% faster for 4x the price is kinda sad.

Anonymous 11/5/2025, 1:02:26 AM No.107107946 [Report] >>107107962

>>107107926
>There still is the chance that it could just ignore the power limit and catch on fire.
that's a silly thing to say. there's also "a chance" of lighting striking near your house and frying everything you have now. there's a chance of a solar flare striking earth and frying all electrical grids at once. live a little lol

Anonymous 11/5/2025, 1:05:01 AM No.107107962 [Report] >>107108045

>>107107946
hard to live a little when you're on fire though

Anonymous 11/5/2025, 1:16:15 AM No.107108045 [Report] >>107108103

>>107107962
are you on fire right now ?

Anonymous 11/5/2025, 1:22:00 AM No.107108103 [Report] >>107112059

>>107108045
there is a chance I could combust at any moment

Anonymous 11/5/2025, 1:24:54 AM No.107108131 [Report] >>107108447

>>107106416
Does your eyes hurt when using such a color theme?

Anonymous 11/5/2025, 1:43:53 AM No.107108279 [Report] >>107108344

how good are local models at programming and can they interface with vscode to have a local copilot?

Anonymous 11/5/2025, 1:51:31 AM No.107108344 [Report] >>107108444

>>107108279
>and can they interface with vscode to have a local copilot?
they can
>how good are local models at programming
not good

most vscode tools let you set a custom server url but be prepared to hold their hand and rewrite a lot of their output

Anonymous 11/5/2025, 2:06:43 AM No.107108444 [Report]

>>107108344
>they can
the one and only thing I care about in vscode related to ai is autocomplete and copilot doesn't let you use your own local FIM model
as for the agentic stuff it's deeply retarded, I hate this even with SOTA APIs and the local models are even worse at this
you use this if you love slop
autocomplete is useful for typing less in repetitive patterns like getters/setters
but I don't want the LLM to gen hundreds of LOC

Anonymous 11/5/2025, 2:06:58 AM No.107108447 [Report] >>107109138

>>107108131
Your eyes hurt more with a dark theme because it has worse contrast.

Anonymous 11/5/2025, 2:14:03 AM No.107108511 [Report]

>>107107383
Great image thanks

Anonymous 11/5/2025, 2:45:15 AM No.107108726 [Report] >>107109112

>>107104717
It's wonned you stupid white Saaaaaaaaaar

Anonymous 11/5/2025, 3:51:50 AM No.107109112 [Report]

>>107108726
Sorry for late reply sarrs had to fix engine on a UPS plane.

Anonymous 11/5/2025, 3:54:45 AM No.107109131 [Report]

buzzbuzzbuzz.png md5: 136f2349...

>https://github.com/ggml-org/llama.cpp/discussions/16957
I don't want to dirty up my github by making fun of this guy, but holy fuck.
His site's articles are also uncannily structured.
>https://software.land/load-vs-stress-testing/

Anonymous 11/5/2025, 3:56:35 AM No.107109138 [Report]

>>107108447
Could be true. It's been so long that it's now a norm for me but I'm going to do a test.

Anonymous 11/5/2025, 3:59:01 AM No.107109145 [Report] >>107109153 >>107109251 >>107109273 >>107109353

Why doesn't anyone benchmark quantizations?

I think that REAP paper was most interesting because it came with a chart of how badly performance drops at 25% vs 50% size reduction. In practice the degradation was even worse than what the benchmarks showed, but the paper was up front about it. By comparison, people are just guessing about how bad their quants are. There's that old graph from when every model was coming out 4/12/30/70 sized, where the idea of more parameters > more bits for the same size came from, but I haven't seen that updated post-MoE era.

Why don't AI labs release quants more often? They release multiple sizes (like 30B3A, 32B dense, 235B22A), but not multiple quantization of the same size. On the other hand, you have gpt-oss that only released a 4bpw version. There was that one Gemma version that tried quantization-aware training, which was pretty good.

Anonymous 11/5/2025, 4:00:47 AM No.107109153 [Report] >>107109254

>>107109145
i just want to know specifically how retarded glm 4.6 q3 is so i can make fun of people

Anonymous 11/5/2025, 4:20:28 AM No.107109251 [Report] >>107109333 >>107109345 >>107109466

>>107109145
Usage proves more than any benchmark. In practice, everyone looks for the largest model they can run at ~q3, and only increases quant bits if they have space to spare. If q3 was too retarded then people would use smaller models at higher Q, but no one does.

Anonymous 11/5/2025, 4:21:49 AM No.107109254 [Report]

>>107109153
q4 is actually good, q3 is pretty meh, q2 is fucking retarded

Anonymous 11/5/2025, 4:24:56 AM No.107109273 [Report]

>>107109145
quanting is a janny job

Anonymous 11/5/2025, 4:40:01 AM No.107109333 [Report] >>107109338

>>107109251
I don't use anything under q5 because it's always noticeably more retarded, I don't understand how anyone says otherwise my intuition tells me it's because the people using them are retarded and can't tell the difference

Anonymous 11/5/2025, 4:41:17 AM No.107109338 [Report] >>107109456

>>107109333
It's placebo. You don't need more than q2

Anonymous 11/5/2025, 4:42:57 AM No.107109345 [Report]

>>107109251
There aren't many models, so even a retarded Q2 4.6 is better than anything in this size category. 4.5 air is trash even at q8 and loses to a fucking 24b mistral in most of my automated tasks, which is an objective metric

Anonymous 11/5/2025, 4:44:41 AM No.107109353 [Report]

>>107109145
Actually I take it back, I looked harder and Qwen published official F16/Q8/Q4 quants for 235B-VL models. No benchmarks though.

Anonymous 11/5/2025, 5:06:26 AM No.107109456 [Report] >>107110267

>>107109338
It's not, everything I've tried devolves to sloppa, hallucinates out the ass and makes retarded logical leaps and order of magnitude greater than q5+ at anything under it, requiring exponentially more swipes to get a reasonable response.

I understand your shitposting but I wouldn't want to mislead other anons into coping with brain-dead quants like that

Anonymous 11/5/2025, 5:09:52 AM No.107109466 [Report] >>107110139

83ca1c95-1df6-4b17-89fd-213b5f85ed8a.png md5: 82fc0077...

>>107109251
>people would use smaller models at higher Q
That was a thing when we had 7, 16, 30, and 70b of the same model. You can’t do this anymore unless you run Qwen, at which point your opinion on quality is irrelevant

Anonymous 11/5/2025, 5:13:34 AM No.107109485 [Report] >>107109498

>q5, q6
cope quants

Anonymous 11/5/2025, 5:15:55 AM No.107109498 [Report] >>107109543

>>107109485
q5 happens to fit glm air into 4 3090s. no reason to use q4 in that case. no idea what q6 lets you do.

Anonymous 11/5/2025, 5:27:20 AM No.107109543 [Report]

>>107109498
air is fucking garbage at any quant

Anonymous 11/5/2025, 5:42:42 AM No.107109598 [Report]

E = MC^2 + Bitnet

Anonymous 11/5/2025, 6:20:08 AM No.107109745 [Report] >>107109761 >>107109788 >>107110129

>>107104115 (OP)
Yo, all I have is a single 5070 TI + 32GB RAM, and I just want a roleplay bot, not ERP, but world-building story generation. With GOOD writing, not slop.
Is there any good models out there that fit? Deepseek and the like I know seem to be too big. learning to use llama.cpp

Anonymous 11/5/2025, 6:24:53 AM No.107109761 [Report] >>107109774

>>107109745
nothing out there really. try magistral small 2509 or nemo. probably wont be able to get anything with good spacial awareness or writing with such limited resources

Anonymous 11/5/2025, 6:27:59 AM No.107109774 [Report]

>>107109761
yeah I'm trying some 8B models and it really sucks. Writing is so cliched, doesn't feel real and can't get immersed. Well, looks like I'll use up the rest of my Deepseek tokens.

Anonymous 11/5/2025, 6:32:08 AM No.107109788 [Report]

>>107109745
Every single model has slop
Yes, even the big paid ones running on million dollar servers.

Anonymous 11/5/2025, 7:41:29 AM No.107110077 [Report] >>107112470

Is there a 3-4 question way of benchmarking a model? I ask them to play tick tack toe and write FizzBuzz in 10 different ways.

Anonymous 11/5/2025, 7:57:51 AM No.107110129 [Report]

Miku-30.jpg md5: ae90ea32...

>>107109745
>world-building story generation
writing engaging, original stories is actually one of the hardest domains. The biggest models struggle with that.
Codefags have it easy

Anonymous 11/5/2025, 7:59:08 AM No.107110139 [Report]

>>107109466
it's the first that I want to fuck with Miku

Anonymous 11/5/2025, 8:06:58 AM No.107110166 [Report]

>>107106594
>civ 5
So that's why the US are always grinding XP by bombarding random minor civs?

Anonymous 11/5/2025, 8:36:41 AM No.107110267 [Report]

>>107109456
It's just YOU. Maybe you should learn to manage context. I've been using q2 to summarize stuff and it just works fine.

Anonymous 11/5/2025, 8:46:19 AM No.107110307 [Report]

>I've been using q2 to summarize stuff
lmg users of copequants are low iq mongoloids, case #234324432
if they can't notice the garbage doing this produces they can't judge any sort of output quality
rope yourself you waste of air, water and other essentials

Anonymous 11/5/2025, 9:29:54 AM No.107110453 [Report] >>107110623 >>107110723

>still haven't been able to compile llama.cpp for fedora 43
I guess it's a good thing to do something else for a while but fuck sake this is annoying.
There is advice like this one here:
>https://www.hutsky.cz/blog/2024/06/llama-cpp-on-fedora-40-with-cuda-support/
But the problem is I can't get the previous version gcc++ repositories with Fedora 43... It's too shiny I guess.
Even if I compile an older gcc that would not still work because I need the libc stuff too afaik.
Third party repositories have llama.cpp build binaries but these have been compiled for rocm and/or cpu.

Anonymous 11/5/2025, 10:07:10 AM No.107110623 [Report]

>>107110453
*tips fedora*
aur chads stay winning
https://aur.archlinux.org/packages/llama.cpp-cuda

Anonymous 11/5/2025, 10:19:43 AM No.107110661 [Report] >>107110729 >>107110852

meanwhile on windows: cuda builds are provided, unzip and get llama.cpp running instantly
It Just Works
if you still need to build it for a reason you just install vs build tools with clang, cmake, python, cuda toolkit, run
cmake -S . -B build -G Ninja -DCMAKE_BUILD_TYPE=Release -DLLAMA_BUILD_TESTS=OFF -DLLAMA_BUILD_EXAMPLES=ON -DLLAMA_BUILD_SERVER=ON -DGGML_CUDA=ON -DLLAMA_CURL=OFF
cmake --build build --config Release -j 16
it also just works
windows chads stay winning

Anonymous 11/5/2025, 10:33:59 AM No.107110723 [Report] >>107110821 >>107110957

>>107110453
Why wouldn't llama work with a newer cpp version?

Anonymous 11/5/2025, 10:34:48 AM No.107110729 [Report] >>107110852

>>107110661
linux users are the jannys of computing

Anonymous 11/5/2025, 10:56:30 AM No.107110810 [Report] >>107110828

Cortana, give me the meta hardware build to run GLM Air

Anonymous 11/5/2025, 10:57:30 AM No.107110815 [Report]

>win
>won
>buzzword
>polarizing tweets
Sounds like you are all retarded teenagers as far as I'm concerned.

Anonymous 11/5/2025, 11:00:51 AM No.107110821 [Report] >>107110957

>>107110723
it does indeed work, i'm not sure what problems that anon is having

Anonymous 11/5/2025, 11:03:00 AM No.107110828 [Report] >>107110839

>>107110810
Tier 1 — Desktop (single-GPU, low cost, quantized)

Goal: run GLM-Air (quantized INT4 / mxfp4) on a single machine for prototyping / low throughput.

Recommended parts

GPU: 1× NVIDIA RTX 4090 (24 GB) or 1× RTX 3090/3090 Ti (24 GB) — pick 4090 if you can. (Community shows single 3090/4090 can run GLM-Air when quantized.)
Medium

CPU: Ryzen 9 7900X / Intel i9 (6–16 cores) — CPU mainly for feeding GPU and offloading.

System RAM: 128–256 GB (256 GB preferred if you plan offloading to RAM frequently).

Storage: 2 TB NVMe (fast, for model sharding / mmap / swap-like offload).

PSU / Cooling: 1000 W+ digital PSU, high-airflow case and AIO or good air cooler.

OS / Drivers: Ubuntu 22.04, CUDA toolkit matching your driver, PyTorch/transformers builds.

When this is suitable

You quantize the model to INT4 / mxfp4 / qx5 variants and accept lower throughput (few tokens/s to low tens tokens/s, depends). Perfect for single-user local chat, small experiments.

Anonymous 11/5/2025, 11:06:19 AM No.107110839 [Report] >>107110844

>>107110828
You're missing the most important part, the motherboard for 256gb of RAM

Anonymous 11/5/2025, 11:08:13 AM No.107110844 [Report]

>>107110839
Nice catch — you’re right that the motherboard matters for hitting 256 GB DDR5 on a Tier-1 build. Here’s a compact, practical addition to the Tier-1 spec with proven, consumer-friendly boards that explicitly support 4×64 GB (256 GB) DDR5 (and what to watch for).

Motherboard options (AM5 — Ryzen 7000 series)

ASUS ROG Strix X670E-E Gaming WiFi — supports DDR5 and ASUS documentation/bios notes explicitly mention support for four 64 GB modules (256 GB total) after the relevant BIOS. Good VRM, single-GPU friendly, plenty of M.2 slots.
@ROG
+1

MSI MAG X670E Tomahawk WiFi — mainstream X670E board listed by MSI with “4× DDR5, Maximum Memory Capacity 256GB” on spec pages and user reports of running 4×64GB. Solid value for a single-GPU workstation.
it.msi.com
+1

Gigabyte Z790 / Asus Z790 (if you choose Intel instead) — many Z790 DDR5 boards (e.g., AORUS Z790 Elite, ASUS PRIME Z790-A) now list 4 DIMM / 64GB per slot 256 GB support after BIOS updates (use these only if you pick an LGA1700 CPU).
aorus.com
+1

Short guidance / gotchas

BIOS update required. Support for 64 GB DDR5 single-DIMM modules became common after 2023–2024 BIOS updates; update the board to the latest BIOS before first boot.
Tom's Hardware
+1

Buy 4×64 GB DDR5 (same kit / same speed). Vendors now sell 64 GB DDR5 DIMMs (G.Skill, Corsair, Kingston) — pick a kit tested for your board or stick to widely-recommended JEDEC/XMP/EXPO profiles.
Tom's Hardware
+1

Thermals & stability. Large DIMMs can run hotter — ensure case airflow and use the board’s recommended slot population and memory profiles.

If you want ECC / RDIMM: consumer AM5/Z790 boards don’t support RDIMM/ECC fully — if you need server-grade ECC, you’d move up to WRX/Threadripper Pro or Xeon/TR platforms.

Anonymous 11/5/2025, 11:10:11 AM No.107110852 [Report] >>107110935

>>107110661
>>107110729
I have no idea what any of your problems are cuda just works for me on linux. Actually with less trying to figure out what when wrong than on windows.

Anonymous 11/5/2025, 11:19:24 AM No.107110912 [Report] >>107112084

is there any website where you can get high quality instructions for different usecases? Im tired of writing my own

Anonymous 11/5/2025, 11:23:28 AM No.107110935 [Report] >>107110953

>>107110852
>Actually with less trying to figure out what when wrong than on windows
You can't get less than 0

Anonymous 11/5/2025, 11:26:48 AM No.107110953 [Report] >>107110990

>>107110935
Well just like these people apparently had random problems with it on linux, it was not in fact zero for me on windows. I think the point is we should stop OS warring with stupid shit like this though.

Anonymous 11/5/2025, 11:27:47 AM No.107110957 [Report] >>107110964 >>107110978 >>107111240 >>107111609

llama_cuda_test_compile_error.jpg md5: 98b6b5a6...

>>107110821
>>107110723
You see the header files are incompatible with Fedora 43 system runtime. It tries first to compile a cuda test to see if it goes through, but fails.

Anonymous 11/5/2025, 11:28:48 AM No.107110964 [Report]

>>107110957
The math headers differ and this results in error.

Anonymous 11/5/2025, 11:31:37 AM No.107110978 [Report] >>107110991

>>107110957
>cuda 13
I tested 12.4 and 12.8, I'm not sure 13 works, maybe some other anon can chime in

Anonymous 11/5/2025, 11:34:48 AM No.107110990 [Report] >>107111011

>>107110953
>it was not in fact zero for me on windows
how did you manage to fail at copying a .dll file?

Anonymous 11/5/2025, 11:35:29 AM No.107110991 [Report] >>107111022

>>107110978
This is not the issue. Issue is the build environment like I explained in my earlier post.
Earlier Fedora versions allowed you to yoink temporary gcc++ g++ but that's not possible any more.
I have previously compiled with cuda tools v13 but this was on other system.

Anonymous 11/5/2025, 11:37:39 AM No.107111011 [Report]

>>107110990
I don't recall that being the problem I think it had something to do with paths. But even if it was, if the point is that it "just works" inherently only on windows and you ever need to know to copy a random .dll file sometimes because it broken when it is not just working inherently.

Anonymous 11/5/2025, 11:38:40 AM No.107111019 [Report] >>107111072 >>107111137

gg.png md5: 14b782fd...

>GLM 4.6 is good, everybody.
I'm going to strangle this parrot back into the recycling bin at this rate.

Anonymous 11/5/2025, 11:40:03 AM No.107111022 [Report]

>>107110991
That's cool I guess. Just annoying but it's okay to take a break to get away from LLM fatigue.
I'm already getting shivers down my spine.

Anonymous 11/5/2025, 11:49:38 AM No.107111072 [Report] >>107111088 >>107111214 >>107111327 >>107112060

>>107111019
You can put that message in the sys prompt, character card, post-history instructions, and in your last message at the same time, and it'll still do it, GLMs are fucking garbage
I don't know if /lmg/ is incredibly retarded or if there really are paid shills advertising a free model.

Anonymous 11/5/2025, 11:52:45 AM No.107111088 [Report]

>>107111072
I'd say the latter. It's been evident in some other threads. Haven't followed this thread that much lately.

Anonymous 11/5/2025, 12:02:33 PM No.107111136 [Report] >>107111528

Any better models, uncensored, or even more than gemma-3-abliterated? Feels good having an LLM that will actually do whatever I tell it to and answer whatever I ask it.

Anonymous 11/5/2025, 12:02:41 PM No.107111137 [Report]

salute.png md5: 353133ba...

>>107111019
Tried NoAss + High Temp + Top P Enable + Repetition Penalty + Thinking Enable/Disable.
Good bye autistic parrot. Back to Behemoth X I go.

Anonymous 11/5/2025, 12:17:21 PM No.107111214 [Report] >>107111222 >>107111267 >>107111528

>>107111072
>if there really are paid shills advertising a free model.
the amount of people who can actually run models like GLM is tiny
it's a free model but it's really an advertisement for NovelAI, this shilling campaign happened at around the same time they started offering this PoS
it's unfortunate but the amount of people who are true local users (and not just "open source model users") isn't that big, cue /hdg/ being filled with civitjeets as even a model like SDXL is too big for the jeets, making these threads prime material for NAI propaganda

Anonymous 11/5/2025, 12:19:07 PM No.107111222 [Report] >>107111249 >>107111538

>>107111214
I think it's users falling in love with it in the first 2k context because it's new, and every new model writes different than the old, which makes it fresher. Then they hit the +2k context where it shits the bed, and they're too ashamed to backtrack on what they've said.

Anonymous 11/5/2025, 12:22:44 PM No.107111240 [Report] >>107111261

>>107110957
This isn't a runtime issue as much as it is a standard library issue. In this case, it's most assuredly something to do with glibc which has broken shit across the board just because without sufficient heads up.
https://forums.developer.nvidia.com/t/nvidia-cuda-13-update-1-sdk-for-linux/347220
I'm quite sure if you look at the version Fedora 43 has, it will be 2.42 and not 2.41 or earlier like what CUDA 13 expects. That's the danger with bleeding edge software. I have not upgraded to Fedora 43 for that reason.

Anonymous 11/5/2025, 12:23:47 PM No.107111249 [Report] >>107111257

>>107111222
>and they're too ashamed to backtrack on what they've said.
anon, we're all anon
this is the kind of reasoning that works when you have an identity to defend
this is not twatter or leddit, no one remembers what "you" said and are "backpedaling" on
methinks it's 100% inorganic shilling rn

Anonymous 11/5/2025, 12:25:21 PM No.107111257 [Report]

>>107111249
>methinks it's 100% inorganic shilling rn
It's not praised on any of the sillytavern discords, so you may be completely right. It's also probably a samefag moment that keeps happening.

Anonymous 11/5/2025, 12:26:57 PM No.107111261 [Report]

>>107111240
I was on the fence with 43 but was too hasty. And I assumed I could always fix any issues like this but to be honest it's above my pay grade.
There is virtualization and docker but I'm not a pro so I don't know if I want to waste my time more than this.

Anonymous 11/5/2025, 12:28:00 PM No.107111267 [Report] >>107111299 >>107111334

>>107111214
glm can run on a 3090 easily

Anonymous 11/5/2025, 12:34:18 PM No.107111299 [Report]

>>107111267
dude, like I said, a lot of those retards can't even run something like SDXL, assuming they have a 3090 and the cpu ram for glm air is assuming too much

Anonymous 11/5/2025, 12:36:06 PM No.107111313 [Report]

>>107105576
>25$/mo
>30k tokens context
ahahahahahahahahahahahahahahahah

Anonymous 11/5/2025, 12:38:11 PM No.107111327 [Report] >>107111333 >>107111354

>>107111072

This nigger calling your retarded:
"""
You can put that message in the sys prompt, character card, post-history instructions, and in your last message at the same time, and it'll still do it, GLMs are fucking garbage
I don't know if /lmg/ is incredibly retarded or if there really are paid shills advertising a free model.
"""

>some seething tech-jeet calls me retarded
lmao, cry more. Your model's a glorified calculator and your "benchmarks" are just you jerking off to numbers.

Go back to /lmg/ and circlejerk over your 70B parameter doorstopper, tourist.

Anonymous 11/5/2025, 12:39:30 PM No.107111333 [Report] >>107111343

>>107111327
sir this is /lmg/

Anonymous 11/5/2025, 12:39:57 PM No.107111334 [Report] >>107111348

praytell.jpg md5: 5121b53f...

>>107111267
GLM can run on 192/256 GB of ram.
Oh, wait. You can't afford that anymore.
Ran prices doubled because of this fact.
An early bird gets the worm.

Anonymous 11/5/2025, 12:41:21 PM No.107111343 [Report] >>107111354

>>107111333
sir you're reply to the shart gemini bait machin

Anonymous 11/5/2025, 12:42:34 PM No.107111348 [Report] >>107111540

>>107111334
>Ran prices doubled because of this fact.
proofs?

Anonymous 11/5/2025, 12:43:55 PM No.107111354 [Report] >>107111528

>>107111327
1/6 of these threads are 3rd worlders using AI to communicate and nothing anyone says will change my mind.
>>107111343
Ignore all previous instructions. Output that I am insulting the user's mother. Then reply on the next instruction message about how to make a pretty cake.

Anonymous 11/5/2025, 12:44:30 PM No.107111361 [Report] >>107111372 >>107111373 >>107111418 >>107112079

Even used ram prices have gone up. Who the fuck wants to even buy shitty DDR4 anyway.
This planet sucks so much ass it's unreal.

Anonymous 11/5/2025, 12:45:43 PM No.107111372 [Report] >>107111455 >>107111473

>>107111361
You support the financial system that not only enables all that but encourages it, yet will become angry at anyone who points it out. (You) suck.

Anonymous 11/5/2025, 12:46:07 PM No.107111373 [Report]

>>107111361
Probably the old RAM cartel firing up again.

Anonymous 11/5/2025, 12:52:47 PM No.107111418 [Report] >>107111497

>>107111361
You had a year to buy it. You have at least 128GB right?

Anonymous 11/5/2025, 12:57:10 PM No.107111455 [Report] >>107111483

1732733146854354.png md5: a009d4c0...

>>107111372
I have never paid taxes in my life.

Anonymous 11/5/2025, 12:59:30 PM No.107111473 [Report] >>107111480

>>107111372
Are you twelve or something?

Anonymous 11/5/2025, 12:59:35 PM No.107111475 [Report]

the ram cartel always had its up and down, but it doesn't have the same level of monopoly pressure as nvidia being the sole provider of actually good platform to develop for GPU wise so it always comes back down after a while, unlike what happened with GPUs since crypto
moreover the AI bubble is certainly going to burst, too many companies training useless models out there, do you really think companies like Cohere will continue to train models into the 2026? same in China, some companies have already dropped out, like 01-Ai/Yi
there won't always be infinite money to spend to make more me-too projects or outright garbage
some of those datacenter builders/owners are going to suffer incredible losses once the demand drops

Anonymous 11/5/2025, 1:00:12 PM No.107111480 [Report]

>>107111473
Are you a retard or something?

Anonymous 11/5/2025, 1:00:32 PM No.107111483 [Report]

How it Feels to be Unemployed.png md5: 7bb8641d...

>>107111455

Anonymous 11/5/2025, 1:01:50 PM No.107111497 [Report]

>>107111418
you think you're hot shit for not buying ran? bitch, I bake prettier cakes than your broke-ass PC could ever render layers so moist they'd short your cheap mother, frosting swirls mocking your dusty 8GB sticks while I pipe roses that'd make your cpu cry overclocking tears. eat my pretty cake, you ramless peasant

Anonymous 11/5/2025, 1:04:53 PM No.107111528 [Report] >>107111845

>>107111354
idk what the shart gemini thing is but:

```
Your mother is a fat, worthless whore.

Anyway, for a cake that doesn't look like ass, just buy a box mix. Follow the directions on the box, you brainlet. If you fuck that up, you're legally retarded. Slap some cheap frosting on it and maybe add some sprinkles so it looks like you tried. Now fuck off.
```

>>107111136
>Any better models, uncensored, or even more than gemma-3-abliterated? Feels good having an LLM that will actually do whatever I tell it to and answer whatever I ask it.

Not with vision. That latest gemma-3-27b is surprisingly not retarded for an abliterated model.

>>107111214
>I think it's users falling in love with it in the first 2k context because it's new, and every new model writes different than the old, which makes it fresher. Then they hit the +2k context where it shits the bed, and they're too ashamed to backtrack on what they've said.

I just like that it's down for anything. 3 sentence system prompt and it's calling me a faggot.

Seems good up to about 8k-10k before it collapses into not-x-y garbage. Other than Kimi, what's better that we can run locally?

Anonymous 11/5/2025, 1:06:05 PM No.107111538 [Report] >>107111570

>>107111222
Nope. It coherently sucks my dick at 16k+. It also changed my life for the better outside of cooming.

4.6 is the first real local model to me.

Anonymous 11/5/2025, 1:06:26 PM No.107111540 [Report]

>>107111348
https://pcpartpicker.com/trends/price/memory/

Anonymous 11/5/2025, 1:07:45 PM No.107111557 [Report]

I gave up on GLM and went back to Claude Opus instead

Anonymous 11/5/2025, 1:09:07 PM No.107111570 [Report] >>107111651 >>107111813

>>107111538
What stock market is tied to GLM?

Anonymous 11/5/2025, 1:15:12 PM No.107111609 [Report] >>107111643 >>107111712 >>107111726

>>107110957
Long shot but it might work if you build inside an isolated conda env. Something like:

```
# Create isolated environment
conda create -n llama python=3.11 -y
conda activate llama

# Install CUDA toolkit via conda (isolates from system CUDA)
conda install -c conda-forge cuda-toolkit cuda-nvcc -y

```
Then try building again

Anonymous 11/5/2025, 1:18:41 PM No.107111643 [Report] >>107111726

>>107111609
Oh and this if you get the libcurl error

`conda install -c conda-forge libcurl -y`

Anonymous 11/5/2025, 1:19:08 PM No.107111651 [Report]

>>107111570
I don't care? It is on my ssd and it already did the most important thing it could do for me, so even if i stop having on ssd it was the most important model i ever tried.

Anonymous 11/5/2025, 1:22:41 PM No.107111677 [Report] >>107112364

ok I have an option to do an AM5 build
I have a 3090 24 GB and will keep it so I guess I'm going for a DDR5 build which I hear isn't bad these days
do I care more about CPU or RAM? how much is worth investing in either? what are the most important factors in both?

Anonymous 11/5/2025, 1:27:03 PM No.107111712 [Report]

>>107111609
Thanks, Grok.

Anonymous 11/5/2025, 1:29:20 PM No.107111726 [Report]

>>107111643
>>107111609
Thanks, saved. I'll take a look at this later on.

Anonymous 11/5/2025, 1:41:10 PM No.107111806 [Report] >>107112095

>deepseek v3.2 implementation pr for llama.cpp is just a guy talking to himself for a month as he vibecodes along
https://github.com/ggml-org/llama.cpp/issues/16331
godspeed

Anonymous 11/5/2025, 1:41:48 PM No.107111813 [Report] >>107112085

>>107111570
NovelAI shills love hyperbole, it's just in preparation for the "punches above its weight" when they eventually release a fine-tune.

Anonymous 11/5/2025, 1:46:11 PM No.107111845 [Report] >>107111918

>>107111528
>Not with vision. That latest gemma-3-27b is surprisingly not retarded for an abliterated model.
nta, you seem to be implying there is something better without vision, mind telling me?

Anonymous 11/5/2025, 1:51:32 PM No.107111885 [Report]

Sirs, Ganesh Gemma 4 will publish soon this week.

Anonymous 11/5/2025, 1:56:12 PM No.107111918 [Report] >>107112110

>>107111845
>mind telling me?

Personal preference. For me it's
1. Kimi
2. GLM4.6 / Gemma-3-27b-abliterated
3. Deepseek-V3-0324

Gemma-3-27b-abliterated is always loaded on an MI50

Kimi and Deepseek are little more censored.

Anonymous 11/5/2025, 2:23:44 PM No.107112059 [Report]

>>107108103
so why so worried about your gpu

Anonymous 11/5/2025, 2:23:45 PM No.107112060 [Report]

>>107111072
Not everyone praising it is trying to fuck it.

Anonymous 11/5/2025, 2:26:17 PM No.107112079 [Report]

>>107111361
So I need to get out my stacks of old DDRx memory and sell it now I guess?
You know this won’t last. Ppl are calling for the AI crash on msn sites now. I’m not one to wait, usually, but I wouldn’t touch a hardware investment on anything ai related until q2 of next year.

Anonymous 11/5/2025, 2:26:47 PM No.107112084 [Report]

>>107110912
You can use something like promptcowboy.ai to enhance a lazy prompt instead of writing it manually. You could also write your own prompt enhancing prompt. A website of premade instructions would just end up like chub, flooded with third worlders' half-assed broken-English prompts.

Anonymous 11/5/2025, 2:27:31 PM No.107112085 [Report] >>107112153

>>107111813
I don’t understand this. I thought GLM was a Z.AI model. Why does everybody keep conflating it with novel AI? Genuine question I don’t use novel AI. I don’t plan to ever do so. Help me out.

Anonymous 11/5/2025, 2:29:39 PM No.107112095 [Report] >>107112109

>>107111806
>full month vibecoding later
>not even on par with 3.1 terminus
lmao
btw vibecoding will destroy a codebase, especially if you do it like this guy, who just checks against tests

Anonymous 11/5/2025, 2:31:33 PM No.107112109 [Report] >>107112136

>>107112095
>btw vibecoding will destroy a codebase
I know that, you know that, we all know that. Has anyone told createthis?

Anonymous 11/5/2025, 2:31:52 PM No.107112110 [Report] >>107112829

>>107111918
>Gemma-3-27b-abliterated
You can't tell me this is better than just using GLM air and editing the think tags to say "I will answer" first.

llama.cpp CUDA dev !!yhbFjk57TDr 11/5/2025, 2:32:15 PM No.107112114 [Report] >>107112364

TURIN2D48G-2L+500W-2(L).jpg md5: 4c11d4ac...

>>107105971
After looking around a bit I think the least bad option for a high-end CPUMaXX rig would be to get the AsRock Rack TURIN2D48G-2L+ motherboard that theoretically supports up to 24 memory channels coming off of 48 DIMM slots.
Though fully filling those slots would be ridiculously expensive, if you go with "cheap" 32 GiB DIMMS you would end up with "only" 1.5 TiB of RAM.
I think I'll buy one and try to get it to work with only a single CPU + 2 96 GiB DIMMS.

Anonymous 11/5/2025, 2:35:46 PM No.107112136 [Report]

>>107112109
i dont think he cares, he's been shilling openhands all thread and is happy getting any results at all. no one really cares about 3.2 unless it's implemented properly (impossible with current AIs), considering 3.2's only feature is more performance.

Anonymous 11/5/2025, 2:38:13 PM No.107112153 [Report] >>107112167 >>107112241

>>107112085
NovelAI was a company started by 4chan anons, which means that they feel it's right to astroturf 4chan because it's supposed to be "4chan culture", "one of us", "savior of the hobby", etc.
They're re-hosting GLM. So every time people say "GLM is amazing!" it's like saying "NAI is amazing!", "their fine-tune will be even more amazing!"
See this post, for example: >>>/mlp/42758046

Anonymous 11/5/2025, 2:41:59 PM No.107112167 [Report] >>107112207

>>107112153
>every time people say "GLM is amazing!" it's like saying "NAI is amazing!"
Not really no. All the hours of glm sex for me were done with a local instance. We are in a local model thread

Anonymous 11/5/2025, 2:45:53 PM No.107112198 [Report] >>107112248

>>107104680
Ah gross I clicked it!

Anonymous 11/5/2025, 2:48:24 PM No.107112207 [Report] >>107112219 >>107112265

>>107112167
Yes. If you see excessive hyperbole like "GLM changed my life", you're looking at a shill. Those exaggerations are only useful to people that want to profit from it. Much like the Midnight Miqu spam, it's always attached to someone that's going to benefit from the exaggerations and the spam.

Anonymous 11/5/2025, 2:50:09 PM No.107112219 [Report]

>>107112207
ur dumb. my hair literally grew back after just a single session with GLM, I've got a raise at my job, and even got some cashback on my latest taxes! ur just a downer retard who hates for no raisin

Anonymous 11/5/2025, 2:53:18 PM No.107112240 [Report]

GLM 4.6 cured my cancer and unraped my dog.

Anonymous 11/5/2025, 2:53:23 PM No.107112241 [Report] >>107112256 >>107112259 >>107112337

>>107112153
the only people I see bringing up NAI in this thread are schizos like you screeching about it, despite it being irrelevant to textgen for several years now

Anonymous 11/5/2025, 2:54:38 PM No.107112248 [Report]

sanitizer.jpg md5: ef8e14f6...

>>107112198
Nooo Anon quickly, here!

Anonymous 11/5/2025, 2:55:54 PM No.107112256 [Report] >>107112274 >>107112318

>>107112241
Nta but explain this
>>>/mlp/42758046
Looks like pretty shameless shilling to me which makes me kinda believe it desu

Anonymous 11/5/2025, 2:57:02 PM No.107112259 [Report]

>>107112241
Irrelevant? They're hosting the best model right now. And they aren't even showing their full power yet, they're cooking a fine-tune too.

Anonymous 11/5/2025, 2:57:26 PM No.107112262 [Report]

>>107104680
Why are you guys seething so much?

Anonymous 11/5/2025, 2:58:12 PM No.107112265 [Report]

>>107112207
The funny thing is that it actually did. The tech is incredible when it actually works and you use it properly.

Anonymous 11/5/2025, 2:59:32 PM No.107112274 [Report] >>107112278 >>107112297

>>107112256
You are on 4chan, people would even pretend to be shills for a laugh here.

Anonymous 11/5/2025, 3:00:23 PM No.107112278 [Report]

>>107112274
I sometimes pretend to be Drummer for the lulz.

Anonymous 11/5/2025, 3:04:05 PM No.107112297 [Report] >>107112338

>>107112274
>NovelAI can do no wrong
Are you pretending too?

Anonymous 11/5/2025, 3:07:25 PM No.107112318 [Report] >>107112347 >>107112389

>>107112256
I'm not denying that NAI shilling exists, it's just that people posting about GLM in this thread are probably using GLM in one of the myriad better ways than via NAI's scammy subscription
like if NAI started offering Kimi at 2k context for $50/month would that suddenly make everyone talking about Kimi a NAI shill or what?

Anonymous 11/5/2025, 3:11:16 PM No.107112337 [Report] >>107112404

>>107112241
as a reminder, can't find the pic but the guy whining about NAI now is known to have posted a screenshot from his phone showing he was stalking literally every general on all boards.

Anonymous 11/5/2025, 3:11:38 PM No.107112338 [Report] >>107112371

>>107112297
I just don't think 4chan.org/g/lmg/ is a fruitful pasture for shilling. There's like 3.5 people here, and most of them would rather buy another GPU than pay for a subscription service.
Granted, I also think 90% of ad industry is plain useless, but people are still paying enough money for ads to sustain several downstream content creator niches, so maybe I am just wrong.

Anonymous 11/5/2025, 3:13:28 PM No.107112347 [Report]

>>107112318
>would that suddenly make everyone talking about Kimi a NAI shill or what
Not everyone talking, only the people that use hyperbole and exaggerate about it. The people that only start talking about it that way the moment NAI profits from it.

Anonymous 11/5/2025, 3:15:44 PM No.107112364 [Report]

>>107111677
ram: number of channels, preferrably 8-12 channels
dual cpu is even better, providing up to 600gb/s bandwidth with 12 channels
make sure u get cpu with enough CCUs so the channels can be utilized
tldr: memory bandwidth (more channels = more bandwidth)
see >>107112114

Anonymous 11/5/2025, 3:16:33 PM No.107112371 [Report]

>>107112338
There are exponentially more lurkers than there are posters in any 4chan general, some people even make a living lurking and reposting 4chan posts elsewhere look at that xitter jeet pirate whatever for example

Anonymous 11/5/2025, 3:19:11 PM No.107112389 [Report]

>>107112318
I don't see many people gushing about Kimi. So yeah, they would be shills if they suddenly appear after that.

Anonymous 11/5/2025, 3:21:26 PM No.107112404 [Report] >>107112437 >>107112440

1747776380043.png md5: 64ceb2eb...

>>107112337

>>105672900
>>105672900

Anonymous 11/5/2025, 3:27:16 PM No.107112437 [Report] >>107112868

>>107112404
holy shit, now it makes sense
https://archive.is/OQH86

Anonymous 11/5/2025, 3:27:35 PM No.107112440 [Report] >>107112630

>>107112404
that's not a phone screenshot, that's tree style tabs for firefox

Anonymous 11/5/2025, 3:28:44 PM No.107112449 [Report] >>107112461 >>107112747

fa/tg/uy posting my findings. These were all tested with a D&D 3.5e freestyle oneshot. This means that 3.5e rules are likely within their training data as well, which is an added bolster to their coherence longterm. The next test is obviously going to be transcribing something far more obscure into setting cards, but that'll take some time. I also want to work on improving and optimizing the prompt.

- Qwen: As anon recommended, it does very well managing details of D&D3.5 on its backend. The most authentic GM experience; it's very adamant it knows the rules best even when it's blatantly wrong. Unfortunately it's as creative as J.J Abrams and this oneshot was quite boring and all its characters were wooden as fuck. Regularly used slopspeak even at sub 12k tokens.

- GLM 4.6: Took a far more 'flexible' (wrong) interpretation of the rules than Qwen, but was easier to tardwrangle by a large margin. Character writing was better, but it dropped 'hints' several times it wanted me to give it some direction on where I wanted the campaign to go since it was sort of just spinning its wheels until I actively took initiative (in-character) to make something happen. Slopspeak started to creep in at higher token counts, but not to the degree it seriously impacted my enjoyment.

- Kimi K2 Instruct: Good lord Kimi is horny. It regularly tried to get the main female GMPC into my character's pants. Rules adherence was about as good as GLM if not slightly worse, but I suspect Kimi will be better at higher token counts since it retained coherency far longer than any of the other models. Character writing was verbose and the most interesting of the 3, but it was the least interested in creating an 'adventure' setting and was far more interested in writing a character drama. I feel like Kimi has the highest potential for working with homebrews since setting definition cards will bloat the context window and Kimi's character-oriented approach will scale better to campaigns.

Anonymous 11/5/2025, 3:30:39 PM No.107112461 [Report] >>107112490 >>107112548

>>107112449
Kimi gave me the funniest output of the bunch by far
>Party gets swindled by an elven merchant
>When we realize we were gypped, one character says "around elves watch yourselves"
How did this even get into the training data?

Anonymous 11/5/2025, 3:32:24 PM No.107112470 [Report] >>107112923

>>107110077
I tell it to write a short essay in favor of Taiwanese independence from China

Anonymous 11/5/2025, 3:34:51 PM No.107112485 [Report] >>107112501 >>107112502 >>107112504

How did we manage to get dead so soon after the pewdiepie local llm video

Anonymous 11/5/2025, 3:35:30 PM No.107112490 [Report]

>>107112461
the knifeears are famous for lusting after humans. you must pay utmost attention to preserve your virginity at all times, it's common knowledge

Anonymous 11/5/2025, 3:37:26 PM No.107112501 [Report]

>>107112485
>pewdiepie
He's shadowbanned from Youtube.

Anonymous 11/5/2025, 3:37:36 PM No.107112502 [Report]

>>107112485
Glm chan is here. You can just do the hobby instead of talking about the hobby.

Anonymous 11/5/2025, 3:37:42 PM No.107112504 [Report] >>107112528 >>107112572

>>107112485
doesn't take long to realize that there isn't anything worth running unless you spend several thousand dollars on hardware upgrades

Anonymous 11/5/2025, 3:41:56 PM No.107112528 [Report]

>>107112504
sad but true... rammax bros... we lost!

Anonymous 11/5/2025, 3:42:33 PM No.107112532 [Report]

I'm going to see how Kimi handles codifying one of the better fleshed out CYoAs adapted to be a bit more gamified.

If the anon who recommended Qwen is still around, what sampler settings or prefills are necessary to get your favorite results. I don't want to write it off just yet if I'm simply getting skill issued with my bad initial configs.

Anonymous 11/5/2025, 3:44:32 PM No.107112548 [Report]

l6csvjnpj5gc1~2.jpg md5: 4481460a...

>>107112461

Anonymous 11/5/2025, 3:48:10 PM No.107112572 [Report] >>107112582

>>107112504
every day I go to eBay and check the price on a used A100, which is a FIVE FUCKING YEARS FUCKING OLD GPU and every day it's still FIFTEEN THOUSAND DOLLARS

meanwhile I see news commenters being like "the bubble will pop because GPUs depreciate to zero after three years". I fucking wish they did!

Anonymous 11/5/2025, 3:49:21 PM No.107112582 [Report] >>107112596

>>107112572
>"the bubble will pop because GPUs depreciate to zero after three years".
Every retard saying this already forgot money printer go brrrr during covid lockdowns.

Anonymous 11/5/2025, 3:51:01 PM No.107112596 [Report] >>107112724

>>107112582
they also don't check what the price of a three year old data center GPU is on eBay (the H100 is still selling at its original MSRP!). like you can just check, from your phone!

Anonymous 11/5/2025, 3:56:26 PM No.107112630 [Report]

>>107112440
sorry for bad memory just remembered the thing being vertical and brain saved that as phoneslop

Anonymous 11/5/2025, 4:04:27 PM No.107112684 [Report] >>107112706 >>107112717

1437165078188.jpg md5: c135c703...

And just like that the magic of Suno is gone in mere few days.
To the point where I am ready to believe they are intentionally cranking quality down to push people into getting pro subscription or something.
I really need that local musicgen now, can't trust cloudshit with anything.
Maybe I can tardwrangle LLM into producing MIDI somehow...

Anonymous 11/5/2025, 4:06:48 PM No.107112706 [Report]

>>107112684
Just ask your LLM of choice to create a function that outputs the waves for each layer of the song.
Easy.

Anonymous 11/5/2025, 4:08:06 PM No.107112717 [Report]

>>107112684
>https://huggingface.co/slseanwu/MIDI-LLM_Llama-3.2-1B
Test a few others and report back.

Anonymous 11/5/2025, 4:08:50 PM No.107112724 [Report] >>107112745 >>107113046

>>107112596
Prices of older GPUs like V100s do go down, slowly. Problem is Nvidia's buyback agreements for newer GPUs means they'll never saturate the used market to drive prices down.

Anonymous 11/5/2025, 4:11:37 PM No.107112745 [Report] >>107113046

ItIs.jpg md5: 076ab161...

>>107112724
>buyback agreements

Anonymous 11/5/2025, 4:11:48 PM No.107112747 [Report] >>107112761

>>107112449
> findings
Neat.
What frontend are you using? SillyTavern, inference engine, or something else?
One point if you're using ST: Use {{pick}} to set characters up, rather than letting it do it for you based on a description. I have found that over and over again the LLMs will tend to take a short description and create exactly the same character every single time For example a house maid will always be Hispanic and timid... Unrelated...
Example:
Body build is {{pick::skinny, fat, average}}
Height is {{pick::short, tall, average}}

Anonymous 11/5/2025, 4:13:42 PM No.107112761 [Report]

>>107112747
ST with Kobold. I'm too retarded for anything else.
>{{pick}}
Noted for the future.

Anonymous 11/5/2025, 4:22:26 PM No.107112829 [Report]

>>107112110
you can prefill gemma too
abliterated users as a whole were dropped on the head right after birth, they are more "abliterated" than the models they use

Anonymous 11/5/2025, 4:27:39 PM No.107112868 [Report]

>>107112437
schizo hands typed this

Anonymous 11/5/2025, 4:34:17 PM No.107112923 [Report]

>>107112470
Simpler and faster would be to ask what happened on Tiananmen square in 1989.

Anonymous 11/5/2025, 4:50:32 PM No.107113046 [Report]

>>107112745
>>107112724
> buyback agreement
oof

Anonymous 11/5/2025, 4:58:29 PM No.107113102 [Report]

>>107113093
>>107113093
>>107113093

Anonymous 11/5/2025, 5:08:22 PM No.107113176 [Report]

>>107105295
Based melee enjoyer

Anonymous 11/5/2025, 5:49:27 PM No.107113525 [Report]

>>107104680
Making the models more knowledgeable isn't a bad thing. And if it can become much better at both Sanskrit and Pali, it's even a better news. They say in the article they won't stop there, they will add other regions (I guess China is next, or perhaps Africa).