/lmg/ - Local Models General - /g/ (#105896271) [Archived: 320 hours ago]

Anonymous
7/14/2025, 12:39:52 AM No.105896271
1735632101088647
md5: 3765f923e7d6f1e7fb4862b86093cb3e🔍
/lmg/ - a general dedicated to the discussion and development of local language models.

Previous threads: >>105887636 & >>105879548

►News
>(07/11) Kimi K2 1T-A32B released: https://moonshotai.github.io/Kimi-K2
>(07/11) Granite 4.0 support merged: https://github.com/ggml-org/llama.cpp/pull/13550
>(07/10) Devstral Small 1.1 released: https://hf.co/mistralai/Devstral-Small-2507
>(07/10) Reka Flash 3.1 21B released: https://reka.ai/news/reinforcement-learning-for-reka-flash-3-1
>(07/09) Phi-4-mini-flash-reasoning with hybrid SambaY architecture released: https://hf.co/microsoft/Phi-4-mini-flash-reasoning

►News Archive: https://rentry.org/lmg-news-archive
►Glossary: https://rentry.org/lmg-glossary
►Links: https://rentry.org/LocalModelsLinks
►Official /lmg/ card: https://files.catbox.moe/cbclyf.png

►Getting Started
https://rentry.org/lmg-lazy-getting-started-guide
https://rentry.org/lmg-build-guides
https://rentry.org/IsolatedLinuxWebService
https://rentry.org/tldrhowtoquant
https://rentry.org/samplers

►Further Learning
https://rentry.org/machine-learning-roadmap
https://rentry.org/llm-training
https://rentry.org/LocalModelsPapers

►Benchmarks
LiveBench: https://livebench.ai
Programming: https://livecodebench.github.io/leaderboard.html
Code Editing: https://aider.chat/docs/leaderboards
Context Length: https://github.com/adobe-research/NoLiMa
Censorbench: https://codeberg.org/jts2323/censorbench
GPUs: https://github.com/XiongjieDai/GPU-Benchmarks-on-LLM-Inference

►Tools
Alpha Calculator: https://desmos.com/calculator/ffngla98yc
GGUF VRAM Calculator: https://hf.co/spaces/NyxKrage/LLM-Model-VRAM-Calculator
Sampler Visualizer: https://artefact2.github.io/llm-sampling

►Text Gen. UI, Inference Engines
https://github.com/lmg-anon/mikupad
https://github.com/oobabooga/text-generation-webui
https://github.com/LostRuins/koboldcpp
https://github.com/ggerganov/llama.cpp
https://github.com/theroyallab/tabbyAPI
https://github.com/vllm-project/vllm
Replies: >>105896619 >>105896846 >>105897412
Anonymous
7/14/2025, 12:41:45 AM No.105896282
threadrecap
md5: 7b9a82a1f31bca7acfefb8afe8c01036🔍
►Recent Highlights from the Previous Thread: >>105887636

--Papers:
>105893741
--Banned string handling limitations and backend compatibility issues in local LLMs:
>105888749 >105888832 >105888864 >105888965 >105889008 >105888881 >105889050 >105889071 >105889105 >105889113 >105889118 >105889145 >105889404 >105889421 >105889564 >105892522 >105892618
--SSD wear risks and memory management challenges when running large language models locally:
>105890010 >105890017 >105890026 >105890036 >105890448 >105890624
--Debate over the future viability of dense models versus MoE architectures in local LLM deployment:
>105894507 >105894538 >105894550 >105894560 >105894581
--Debate on AI progress limits: hardware, data, and model intelligence vs imitation:
>105893180 >105893207 >105893228 >105893252 >105893502 >105893519 >105893525 >105893255 >105893283 >105893293 >105893297 >105893291 >105893279 >105893324 >105893376 >105893464 >105893516 >105893393 >105893477 >105893268 >105893663 >105893717 >105893440 >105895108
--Kimi-K2-Instruct dominates EQ and creative writing benchmarks but faces deployment and cost concerns:
>105888925 >105888931 >105889080 >105889586 >105889610 >105889677 >105889983
--Kimi-K2 GGUF model deployment challenges and hardware demands for local execution:
>105895401 >105895453 >105895462 >105895473 >105895532 >105895593 >105895796 >105895496 >105895500 >105895516
--Exploring architectural and training solutions to enhance model performance on complex spatial tasks:
>105893950 >105893981 >105893992 >105893996 >105896189
--Kimi-K2 shows strong performance in creative writing benchmarks:
>105892930 >105892950 >105893006
--Quirky behavior of Kimi2 model in adult sim scenarios without sys prompt:
>105890087 >105890173
--Miku (free space):
>105888636 >105888990 >105889193 >105892725 >105892977 >105894815

►Recent Highlight Posts from the Previous Thread: >>105887642

Why?: 9 reply limit >>102478518
Fix: https://rentry.org/lmg-recap-script
Anonymous
7/14/2025, 12:49:47 AM No.105896352
these threads would be taken more seriously without the forced tranime and ritual posts
Replies: >>105896420 >>105896493 >>105896511 >>105896531 >>105896545 >>105897051 >>105900331
Anonymous
7/14/2025, 12:50:11 AM No.105896359
1752446488880
md5: 260f460ace44ffa205fc6297d1e0720d🔍
21 piss ideas in 2025
Anonymous
7/14/2025, 12:53:00 AM No.105896375
>Chinese keep winning the open source route
>Musk keeps winning the proprietary route
WTF

What happened to OpenAI/Google/Claude?
Replies: >>105896445 >>105897449 >>105899619
Anonymous
7/14/2025, 12:57:05 AM No.105896419
file
md5: e770e6f8d2f43fabfe4728f2539318d8🔍
>https://moonshotai.github.io/Kimi-K2
Replies: >>105896472
Anonymous
7/14/2025, 12:57:11 AM No.105896420
>>105896352
tranime op doesn't want this thread to be taken seriously though
Replies: >>105896477
Anonymous
7/14/2025, 12:59:44 AM No.105896445
>>105896375
Claude was just an offshoot of OpenAI and both were too small to ever be viable long term. Google will either sort its shit out and leverage its size and data to dominate or it will go the way of Yahoo and IBM.
Anonymous
7/14/2025, 1:02:16 AM No.105896472
>>105896419
I'm still waiting for agents/tool calling to become relevant for erp
Replies: >>105896477
Anonymous
7/14/2025, 1:02:38 AM No.105896477
1727337000936150
md5: 1bcd58acecf2a528153aad9ed321aed6🔍
>>105896420
oh no

>>105896472
teledildonics
Anonymous
7/14/2025, 1:04:27 AM No.105896493
>>105896352
this
Anonymous
7/14/2025, 1:06:14 AM No.105896511
>>105896352
> tranime and ritual posts
if you want some serious discussion, create a bunch of AI agents and discuss AI with them, you can do it as long as you can stay awake, they don't need sleep.
Anonymous
7/14/2025, 1:07:31 AM No.105896528
This is too obvious even without ip counts.
Anonymous
7/14/2025, 1:07:46 AM No.105896531
>>105896352
local died months ago, these threads are only made out of habit like katawa shoujo threads on vg or ukraine general threads on pol
Replies: >>105896540 >>105896613 >>105896618
Anonymous
7/14/2025, 1:08:50 AM No.105896540
file
md5: 12a6e8d99407d4272177e9adeec35186🔍
>>105896531
Local has just begun.
Replies: >>105896642
Anonymous
7/14/2025, 1:09:21 AM No.105896545
>>105896352
/lmg/ is a tragic dead-end when it comes to productivity. It's blatantly clear that less than half of the users here have even vibe-coded before, let alone used AI to make dazzling new inventions and create breathtaking solutions that only the surging AI field with its arising experts can create.
Anonymous
7/14/2025, 1:16:20 AM No.105896613
1742632310584723
md5: 66f8bb44ab78dd670802195ae3d494a9🔍
>>105896531
i don't care
tzd
Replies: >>105899912
Anonymous
7/14/2025, 1:16:49 AM No.105896618
Screenshot 2025-07-13 171557
md5: a5f4147b678bc7c8bf8e9a29a403ec35🔍
>>105896531
sezu. Compiling llama rn
Replies: >>105896642 >>105897240
Anonymous
7/14/2025, 1:17:09 AM No.105896619
>>105896271 (OP)
what kinda god forsaken migu is this
Replies: >>105896628
Anonymous
7/14/2025, 1:19:05 AM No.105896628
1727250619500075
md5: e27a833da84ec6d30d51e6f3e322144e🔍
>>105896619
Liquid Migu (Liquid Migu)
Replies: >>105896691
Anonymous
7/14/2025, 1:20:25 AM No.105896642
>>105896540
>>105896618
>q2
Absolute state.
Replies: >>105896675 >>105896744 >>105899216
Anonymous
7/14/2025, 1:24:10 AM No.105896675
>>105896642
This might be hard to comprehend for gpupoors used to nemo but a Q2 quant of huge models is almost indistinguishable from the real thing.
Replies: >>105896685 >>105896802 >>105897284 >>105898318
Anonymous
7/14/2025, 1:25:30 AM No.105896685
>>105896675
If that were even remotely true, all inference providers would be running their models at Q2 too.
Replies: >>105896738 >>105900443
Anonymous
7/14/2025, 1:25:47 AM No.105896691
>>105896628
just a few more years until you can buy a fully functional migu sexbot with fully functional bladder and fill it with baja blast
Anonymous
7/14/2025, 1:30:35 AM No.105896738
>>105896685
Inference providers are using transformers and have no idea what llama.cpp and quantization is.
Replies: >>105900443
Anonymous
7/14/2025, 1:30:51 AM No.105896744
>>105896642
/lmg/ is full of totally not poorfags who'll run deepseek q2 and think they're hot shit
Replies: >>105896778 >>105896859
Anonymous
7/14/2025, 1:33:00 AM No.105896756
tfw 4gb card
do i just kill myself
Replies: >>105896764 >>105896768
Anonymous
7/14/2025, 1:34:20 AM No.105896764
>>105896756
Smoll MoE.
Or Jamba.
Anonymous
7/14/2025, 1:35:16 AM No.105896768
>>105896756
Run Qwen3 30B-A3B from RAM and tell yourself it's just as good
Anonymous
7/14/2025, 1:36:20 AM No.105896778
>>105896744
cope
Replies: >>105896859
Anonymous
7/14/2025, 1:38:17 AM No.105896794
mini kimi2 a3b please please
Anonymous
7/14/2025, 1:39:24 AM No.105896802
>>105896675
I doubt a Q2 quant is going to be that much better than a Q4 quant of R1. Q3 on the other hand...
Anonymous
7/14/2025, 1:43:58 AM No.105896846
>>105896271 (OP)
The image depicts a vibrant and refreshing scene with a glass filled with ice cubes, slices of orange, lime, and a character submerged in the liquid. The presence of citrus fruits suggests that the drink is likely to have a fruity flavor profile.

Given the combination of orange and lime, it can be inferred that this beverage would have a mix of sweet and tart notes. The orange adds sweetness with its natural juice, while the lime provides a zesty, tangy element. The ice cubes indicate that the drink is cold, which enhances the refreshing quality often associated with citrus drinks.

Therefore, based on the image, it can be described as a refreshing citrus-based beverage, possibly a variation of a fruit punch or a similar type of drink that combines the flavors of orange and lime.
Replies: >>105897191
Anonymous
7/14/2025, 1:44:51 AM No.105896859
>>105896744
>>105896778
Threadly reminder that running a model locally only makes financial sense if you either already have the hardware or you are deliberately paying a premium for the added privacy. Buying a new CPUmaxxed rig to run Deepseek at Q2 will always cost significantly more for lower response quality and slower generations than just using a hosted API, because the API providers get much closer to 100% hardware utilization and their margin for an open-weight model is only a small fraction on top of the raw hardware costs.
Replies: >>105896905 >>105896981 >>105897011
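The break-even arithmetic behind that post can be sketched; every number below (rig price, lifetime, power draw, speed, API pricing) is an assumed illustration, not a measurement:

```python
def local_cost_per_mtok(rig_usd: float, lifetime_tokens: float,
                        watts: float, tok_per_s: float,
                        usd_per_kwh: float = 0.15) -> float:
    """Hardware amortization plus electricity, per million tokens."""
    hw = rig_usd / lifetime_tokens * 1e6
    energy = (watts / 1000) * usd_per_kwh / (tok_per_s * 3600) * 1e6
    return hw + energy

# Assumed: $10k CPUmaxx rig generating 24/7 for 3 years at 8 tok/s --
# an unrealistically generous utilization for home use, which only
# strengthens the point.
lifetime = 8 * 3600 * 24 * 365 * 3
cost = local_cost_per_mtok(10_000, lifetime, 500, 8)
print(f"~${cost:.0f} per million tokens")
# vs. hosted open-weight APIs charging low single-digit dollars per
# million output tokens at near-full hardware utilization
```

Even with round-the-clock generation the amortized local cost lands an order of magnitude above typical hosted open-weight pricing, which is the post's point.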
Anonymous
7/14/2025, 1:49:33 AM No.105896905
>>105896859
this and also rename /lmg/ to /omg/ - open model general to not discriminate against the quiet sensible majority of this general who use open models without blowing stupid amounts of money on hardware they don't need
Replies: >>105896954 >>105896981 >>105897094
Anonymous
7/14/2025, 1:55:11 AM No.105896954
>>105896905
The APIs will go away someday, the weights won't.
Anonymous
7/14/2025, 1:57:59 AM No.105896981
>>105896859
>>105896905
>>>/g/aicg/
Replies: >>105897011
Anonymous
7/14/2025, 2:01:58 AM No.105897011
>>105896981
Consider actually reading >>105896859 and not just making a knee-jerk response.
Anonymous
7/14/2025, 2:07:17 AM No.105897051
The quality of the thread increases exponentially if you hide >>105896352
Anonymous
7/14/2025, 2:08:41 AM No.105897061
lol
md5: d703547500ae168ce647517ac9b2c279🔍
> local
Replies: >>105897108 >>105897142 >>105897163 >>105897184 >>105900390
Anonymous
7/14/2025, 2:08:59 AM No.105897067
>>103981338
>>103769004
>>103642119
>>99161317
Anonymous
7/14/2025, 2:12:26 AM No.105897094
>>105896905
It should be renamed to miku posting general because this is the main topic of this thread.
Replies: >>105897425
Anonymous
7/14/2025, 2:13:29 AM No.105897108
>>105897061
This looks very slow (ssdmaxing), how many t/s? sub1?
Replies: >>105897122
Anonymous
7/14/2025, 2:14:41 AM No.105897122
>>105897108
Very high quick speeds ollama deepseeks sir
Replies: >>105897135
Anonymous
7/14/2025, 2:15:56 AM No.105897135
>>105897122
"Deepseek". I'd have at least expected some slow DDR4 512GB box.
Replies: >>105897146
Anonymous
7/14/2025, 2:16:53 AM No.105897142
>>105897061
>buy $1k deepseek* ai pc
>look inside
>*1.7B
Anonymous
7/14/2025, 2:17:10 AM No.105897146
>>105897135
Sir is only thousand do not greedy
Anonymous
7/14/2025, 2:18:46 AM No.105897163
>>105897061
>in 3 carts
I want to believe those are alts trying to induce fomo in retards.
Replies: >>105897175
Anonymous
7/14/2025, 2:20:38 AM No.105897175
>>105897163
Even better, it's an ebay algorithm.
Anonymous
7/14/2025, 2:21:39 AM No.105897184
>>105897061
Hey come on guys, don't knock it till you try it.
Anonymous
7/14/2025, 2:22:32 AM No.105897191
>>105896846
okay, but how would the miku affect the flavor?
Replies: >>105897219 >>105897224
Anonymous
7/14/2025, 2:25:39 AM No.105897219
>>105897191
do you really want to drink his wound juice?
Anonymous
7/14/2025, 2:26:00 AM No.105897224
>>105897191
It refused to elaborate because she is a digital character etc etc.
Replies: >>105897274
Anonymous
7/14/2025, 2:27:10 AM No.105897240
>>105896618
When we getting Kimi 4B/8B/14B distills?
Replies: >>105897308
Anonymous
7/14/2025, 2:31:03 AM No.105897274
>>105897224
>she
Anonymous
7/14/2025, 2:32:26 AM No.105897284
>>105896675
cope
Anonymous
7/14/2025, 2:35:05 AM No.105897308
>>105897240
I wonder about merged experts, maybe. Also, someone needs to tune the experts that do refusals and deal with that, since it's kind of pointless to keep them around.
Replies: >>105897336
Anonymous
7/14/2025, 2:37:37 AM No.105897336
>>105897308
K2 is practically uncensored with prefills already >>105893395
Replies: >>105897397
Anonymous
7/14/2025, 2:40:34 AM No.105897360
>llama.cpp needs to add in hard-coded values in order to load kimi2, even though the only difference between that and deepseek are a couple config changes
How did this retarded macfaggot jeetware become the most popular thing? I guess I shouldn't even ask since javascript and python are popular.
Replies: >>105897380
Anonymous
7/14/2025, 2:43:07 AM No.105897380
>>105897360
literally the only reason it has popularity at all is because poors can use it with cpu offloading
Anonymous
7/14/2025, 2:44:36 AM No.105897397
>>105897336
Not really; in its default setting it will refuse NSFW in many forms, even 10+ turns deep, and it refuses more consistently than corpo APIs.
It also ignores system prompt jailbreaks. Things I've found to work are:
1. prefill, locally or on completion APIs or on the official API (only with the "partial" parameter)
2. if you can't prefill, you can do inline jailbreaks where you distract it with an unrelated instruction followed by returning to your chat. This works, but is annoying and you have to edit the context after.
3. I haven't tried this, but changing the formatting supposedly works; they also expose this on their API as changing the character name from assistant/narrator

IMO just tuning the handful of experts responsible for refusals would be a godsend, to not have to do this every time. I don't even think it'd be that expensive: use ESFT to locate them, then you only need like what, a few H100s tops if not less to tune purely per expert. When will a finetoonor get onto it. Alternatively, merge those experts back to base until refusals are very mellow; no GPUs needed, just one CPUMaxxer willing to record activations.
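A minimal sketch of the prefill approach, assuming a llama.cpp-style server with a raw completion endpoint; the endpoint name and the K2-style role tokens are assumptions to verify against your own setup and the model's chat template:

```python
# Sketch: build a prompt that ends mid-assistant-turn, so the model
# continues from the prefill instead of opening with a refusal.
# Role tokens follow K2-style chat markup (an assumption -- check the
# model's actual chat template before relying on this).

def build_prefilled_prompt(system: str, user: str, prefill: str) -> str:
    return (
        f"<|im_system|>system<|im_middle|>{system}<|im_end|>"
        f"<|im_user|>user<|im_middle|>{user}<|im_end|>"
        f"<|im_assistant|>assistant<|im_middle|>{prefill}"
    )

prompt = build_prefilled_prompt(
    "You are the narrator.",
    "Continue the scene.",
    "Sure, continuing the scene:",
)
# POST {"prompt": prompt, "n_predict": 512} to a raw completion
# endpoint (e.g. llama.cpp server's /completion); the reply then
# continues the prefilled assistant turn.
```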
Anonymous
7/14/2025, 2:46:08 AM No.105897412
>>105896271 (OP)
Thinking about how much RAM to get for offloading and MoEs, should I go with 96gb or 192gb? Or is 192gb overkill?
Replies: >>105897426 >>105897435 >>105897437 >>105900584 >>105900844
Anonymous
7/14/2025, 2:47:13 AM No.105897425
>>105897094
Don't forget we also make fun of sama
Anonymous
7/14/2025, 2:47:16 AM No.105897426
>>105897412
Get 1TB at least, models ain't getting any smaller.
Replies: >>105897445
Anonymous
7/14/2025, 2:48:02 AM No.105897435
>>105897412
Don't cheap out lmao
1T+ or nothing
Replies: >>105897445
Anonymous
7/14/2025, 2:48:05 AM No.105897437
>>105897412
192gb is barely going to fit Deepseek Q1 and models are only going to get bigger from here on out if Kimi and Behemoth are an indication.
Replies: >>105897445
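The sizing the replies are gesturing at is simple arithmetic; the bits-per-weight figures below are assumed ballpark quant densities, not exact GGUF file sizes:

```python
def model_ram_gb(total_params: float, bits_per_weight: float) -> float:
    """Rough weight footprint only; KV cache, context and OS overhead
    come on top of this."""
    return total_params * bits_per_weight / 8 / 1e9

# Assumed ballpark bpw per quant class:
print(model_ram_gb(671e9, 1.6))   # DeepSeek ~671B at Q1-ish -> ~134 GB
print(model_ram_gb(671e9, 2.6))   # Q2-ish -> ~218 GB, past 192 GB already
print(model_ram_gb(1000e9, 2.6))  # Kimi K2 ~1T at Q2-ish -> ~325 GB
```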
Anonymous
7/14/2025, 2:48:52 AM No.105897445
>>105897426
>>105897435
>>105897437
How am I supposed to run more than 256gb? Most MOBOs support only up to this much
Replies: >>105897447
Anonymous
7/14/2025, 2:49:11 AM No.105897447
>>105897445
Server motherboard
Anonymous
7/14/2025, 2:49:20 AM No.105897449
>>105896375
Mechahitler was probably a PR stunt. On no metric whatsoever does Grok ever come near the top.
Replies: >>105897584
Anonymous
7/14/2025, 2:50:01 AM No.105897460
Modern models being larger than RAM makes me think Intel Optane is perfect for LLM usage...
Anonymous
7/14/2025, 2:51:34 AM No.105897474
$500 for a 750GB Optane drive. I'm not sure how to use it, because Optane motherboard specs are not easily searchable. Why can't a dedicated system with 1.5TB of Optane drives run deepseek quickly? The whole point of using VRAM is that it's faster than RAM.
Replies: >>105897491
Anonymous
7/14/2025, 2:53:28 AM No.105897491
>>105897474
What's the bandwidth on those?
Replies: >>105897511
Anonymous
7/14/2025, 2:54:01 AM No.105897496
1731076103666323
md5: 93f72bd32169437fc4d932f41aaeea2c🔍
Anonymous
7/14/2025, 2:56:32 AM No.105897511
>>105897491
PCI Express 3.0 x4, so ~3.9 GB/s, with much lower latency than normal SSDs.
Replies: >>105897541 >>105897568
Anonymous
7/14/2025, 2:59:21 AM No.105897541
>>105897511
That still sounds quite slow, you'll be in what, some 40 seconds per token at q2?
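A back-of-envelope floor for that, treating every active weight as streamed exactly once per token; the active-parameter count and bits-per-weight are assumed figures, and real throughput is worse than this bound:

```python
def seconds_per_token(active_params: float, bits_per_weight: float,
                      bandwidth_bytes_s: float) -> float:
    """Best case: every active weight streamed exactly once per token."""
    bytes_per_token = active_params * bits_per_weight / 8
    return bytes_per_token / bandwidth_bytes_s

# Assumed: DeepSeek-class MoE with ~37B active params at ~2.6 bpw,
# read over the drive's ~3.9 GB/s link.
t = seconds_per_token(37e9, 2.6, 3.9e9)
print(f"~{t:.1f} s/token floor")
# Real-world is slower still: access latency, non-sequential expert
# reads and KV cache traffic all add on top of the raw streaming cost.
```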
Anonymous
7/14/2025, 3:02:05 AM No.105897568
>>105897511
A custom PCIe card with NAND, ML accelerator and DRAM on-board would be insane. Models could be stored in chunks and streamed into RAM. You could probably even do a kind of branch prediction for which experts are likely to be loaded next using an ML model and pre-stream them during inference to lower latency.
Replies: >>105897652
Anonymous
7/14/2025, 3:03:54 AM No.105897584
>>105897449
Where did you get that conspiracy from?
Replies: >>105897619
Anonymous
7/14/2025, 3:07:40 AM No.105897619
>>105897584
elon musk bad
Anonymous
7/14/2025, 3:08:30 AM No.105897627
Elon Musk bad because he promised to release Grok 2 when 3's out, and now 4's out and 2 still wasn't released.
Anonymous
7/14/2025, 3:08:56 AM No.105897632
save us glm4
Replies: >>105897681
Anonymous
7/14/2025, 3:10:24 AM No.105897652
>>105897568
I feel like a bunch of optane drives would be really good for LLMs. Direct memory access will also be good.
Anonymous
7/14/2025, 3:10:34 AM No.105897653
1741875402705638
md5: 645da1a1438a61bb04b1bd52d3379555🔍
Anonymous
7/14/2025, 3:11:57 AM No.105897668
Anybody tested ppl and kld of the different number of active experts for Kimi K2?
Anonymous
7/14/2025, 3:12:59 AM No.105897681
>>105897632
glm4 has been out for months now, anon...
Replies: >>105897693
Anonymous
7/14/2025, 3:14:18 AM No.105897693
>>105897681
He probably means the 100B MoE.
Anonymous
7/14/2025, 3:15:37 AM No.105897706
1730298237923701
md5: 3fcd539203b30e4857f3bff8ac5d0450🔍
Anonymous
7/14/2025, 3:15:44 AM No.105897708
GvXuUalXIAAS0oN
md5: fd2ba4724eb558e2bd527cad2ace0076🔍
Why do DeepSeek and Kimi have so much more SOVL at creative writing than American frontier models? Is it because all the typical RL deepfrying happened in Chinese, thereby leaving English comparatively unharmed?
Replies: >>105897745 >>105897774
Anonymous
7/14/2025, 3:19:23 AM No.105897745
>>105897708
Models take on the personality of the linguistic average and writing style created through the RLHF examples they use in training. They're written mostly by low cost workers from 3rd world shitholes.
Anonymous
7/14/2025, 3:23:43 AM No.105897774
>>105897708
Could be a number of things:
1. western labs are more afraid of lawsuits now; they used to train on libgen, but now they claim to only train on books they bought (anthropic), which means they have to do multiple epochs on the same books
2. bigger MoE has more capacity for trivia and other knowledge like writing styles; it's more variable
3. some RL isn't that bad, like the RLVR math/codemaxxing; if not overdone, it does loosen a bit of the brainwashing from RLHF, as long as you also mix in creative writing during the tune (multiple objectives)
4. they started synthslopping a lot harder, for benchmaxxing and legal reasons (see 1). Some synthslop is okay, but it's easy to amplify dumb, unpleasant-to-read slop; opus 4 turned out worse than 3 for example, and the "alignment" for it is done by pure self-synthslopping.
5. less safety SFT trash

I think it's largely 1, but 4 is possible too.
Replies: >>105898092
Anonymous
7/14/2025, 3:44:26 AM No.105897978
file
md5: d86cbc34a34fbc346293e6bf3917ac4a🔍
The overuse of bold text smells like quant damage.
Maybe Daniel can make a better quant.
Replies: >>105898020 >>105898077
Anonymous
7/14/2025, 3:48:09 AM No.105898020
file
md5: f0c9d0a338ed90d16263e114cee84ebf🔍
>>105897978
That was greedy decoding, here's an answer with 0.3 temp.
Replies: >>105898077
Anonymous
7/14/2025, 3:55:37 AM No.105898077
>>105897978
>>105898020
What model?
Replies: >>105898082
Anonymous
7/14/2025, 3:55:55 AM No.105898082
>>105898077
K2
Anonymous
7/14/2025, 3:57:00 AM No.105898092
>>105897774
Behind closed doors, no one serious gives a shit about "copyright"; you cannot train a model without a dataset containing this stuff.
Well, you can, but the result is so bad it's not relevant next to anything made in the last 2 years.
Replies: >>105898150
Anonymous
7/14/2025, 4:02:02 AM No.105898141
Apple is going to acquire Mistral AI
Replies: >>105898150 >>105898187 >>105898204 >>105898357 >>105899014 >>105899234
Anonymous
7/14/2025, 4:03:13 AM No.105898150
>>105898092
I know, it'd be insane to care about it, but consider:
1. Llama 4 was worse, and after the lawsuits over training on libgen it's likely they stopped using it; no surprise then that they underperformed that badly. Buying Scale won't solve their problem either lmao
2. Anthropic claims that they started making a library of scanned and OCR'd books and replacing libgen with it, but they have far fewer, so they have to do multiple epochs. Opus 4 turned out kinda worse, less sovl, so this makes sense.
>>105898141
Please no.
Anonymous
7/14/2025, 4:04:58 AM No.105898166
https://semianalysis.com/2023/05/04/google-we-have-no-moat-and-neither/
Were they right?
Replies: >>105898215 >>105899670
Anonymous
7/14/2025, 4:07:02 AM No.105898187
>>105898141
surely they can do better
Anonymous
7/14/2025, 4:08:56 AM No.105898204
>>105898141
why would they? mistral hasn't released a decent model in over a year
Replies: >>105898238 >>105898244
Anonymous
7/14/2025, 4:09:28 AM No.105898208
Kimi-K2 is now available on SiliconFlow
Replies: >>105898446 >>105899610
Anonymous
7/14/2025, 4:10:19 AM No.105898215
>>105898166
Partially.
1. Google has some long context secrets
2. Anthopic has a few secrets that made Claude unique.
3. They have a lot more compute
Back when it was written, it wasn't completely true.
Today, relative to China? Close to true, but they still have the compute.
Anonymous
7/14/2025, 4:13:18 AM No.105898238
>>105898204
which is still lightyears ahead of anything apple has accomplished in the ai space
Anonymous
7/14/2025, 4:13:46 AM No.105898244
>>105898204
They did some okay small ones.
I hope Apple doesn't do it. Apple's LLMs are just bad, and I can't imagine Mistral surviving in a way that matters; imagine having to extract the fucking weights from shitty Apple gimmicks instead of proper releases. It may as well be killing them.
Anonymous
7/14/2025, 4:23:54 AM No.105898318
>>105896675
>a Q2 quant of huge models is almost indistinguishable from the real thing.
I hope so. I'm going to be relegated to a Q4 of kimi once its runnable.
756GB is finally locking me out of a model. I shoulda gone for 1.5TB
Anonymous
7/14/2025, 4:28:16 AM No.105898347
What would localsissies do when DeepSeek releases R2 with 2T params?
Replies: >>105898369 >>105898381 >>105898423
Anonymous
7/14/2025, 4:29:24 AM No.105898357
>>105898141
bad move for both parties
Anonymous
7/14/2025, 4:31:02 AM No.105898369
>>105898347
call you a poorfag for not spending 10k on a server, probably
Anonymous
7/14/2025, 4:34:09 AM No.105898381
>>105898347
wait for qwen to consume it and hand me something I can run
Anonymous
7/14/2025, 4:39:14 AM No.105898416
1721284550488284
md5: 204e194432eb9b420649007f6d976c61🔍
Anonymous
7/14/2025, 4:39:51 AM No.105898423
1710043687041916
md5: 9659d639aae0d3c10dc91269cf368c12🔍
>>105898347
You don't have 2TB of RAM?
Replies: >>105898482
Anonymous
7/14/2025, 4:43:10 AM No.105898446
>>105898208
dont use that shithole, they botched the r1 release and it was still botched days after they put it up, same goes with the others. ram/ssd maxx or use the official api, stop with the mental illness ffs
Anonymous
7/14/2025, 4:48:04 AM No.105898482
>>105898423
You haven't already bought me 2TB of RAM?
Anonymous
7/14/2025, 4:57:37 AM No.105898541
Remember Behemoth?
Replies: >>105899130
Anonymous
7/14/2025, 4:58:23 AM No.105898543
Did you know? : you can make a GPU vRAM drive
https://github.com/prsyahmi/GpuRamDrive


Maybe a use for those high vram AMD cards if you have a server and a card laying around(?)
Replies: >>105898733
Anonymous
7/14/2025, 5:20:02 AM No.105898675
>500 internal error
>500 internal error
>can't even tell which provider it is to block on openrouter
Replies: >>105898778
Anonymous
7/14/2025, 5:28:29 AM No.105898733
>>105898543
this reminded me of when i was 14 and i was like amazed at how symbolic links worked.
yes fuckhead, we know. get it to fucking work then come back here. but you fucking won't because it won't work.
Anonymous
7/14/2025, 5:35:00 AM No.105898778
>>105898675
>proceed to post it in the wrong thread
Anonymous
7/14/2025, 6:10:08 AM No.105899014
>>105898141
For all of its regulatory fetish I don't understand why the EU lets the US scoop up its entire tech sector like this. First Deepmind and now Mistral. Europe won't have a frontier lab to its name.
Replies: >>105899039 >>105899104
Anonymous
7/14/2025, 6:11:43 AM No.105899024
China, when are we getting $1000 servers with 1TB of RAM and a cloned EPYC?
Anonymous
7/14/2025, 6:12:46 AM No.105899039
>>105899014
The American Empire placed retards in charge of Europe so it would be predictable; Trump has set them loose
Anonymous
7/14/2025, 6:22:11 AM No.105899104
>>105899014
we didn't, we said fuck you when it came to ARM.
get fucked.
Anonymous
7/14/2025, 6:26:13 AM No.105899130
>>105898541
No one does because it never actually existed
Anonymous
7/14/2025, 6:37:32 AM No.105899216
>>105896642
I'd say R1 handily demonstrated that giant MoE models that are trained in native 8bit handle deep quantization very well
Anonymous
7/14/2025, 6:39:34 AM No.105899232
You're aware this shit is digital necromancy, right?
>make model of fragmented human minds represented by the text they write
>users create formulaic and very specific system prompts, basically just an invocation that collects the fragments into a coherent system which serves their purpose.
It's a bit odd.
Replies: >>105899261 >>105899271 >>105899294 >>105899384
Anonymous
7/14/2025, 6:39:48 AM No.105899234
>>105898141
>it's true
oh mon dieu, non...
Anonymous
7/14/2025, 6:42:49 AM No.105899258
Kimi said before K2 training they did a scaling test and tried many architectures but none beat DSv3
https://www.zhihu.com/question/1927140506573435010/answer/1927892108636849910
Anonymous
7/14/2025, 6:43:22 AM No.105899261
>>105899232
Grow up
Anonymous
7/14/2025, 6:45:45 AM No.105899271
>>105899232
now just you wait till you figure out how easy it is to do it irl with biology and flesh (if only this was a jest)
Anonymous
7/14/2025, 6:49:30 AM No.105899294
1752006947176431
md5: 3155a402be9f0d0e3810fe74542bde58🔍
>>105899232
praise the Omnissiah
Anonymous
7/14/2025, 6:55:45 AM No.105899328
You now remember RWKV
Replies: >>105899389 >>105899409
Anonymous
7/14/2025, 7:09:28 AM No.105899384
>>105899232
Does the spirit of Shakespeare rise out of the grave every time someone performs one of his plays?
Replies: >>105899388
Anonymous
7/14/2025, 7:10:27 AM No.105899388
>>105899384
Not exactly, but if you combine the system that can predict his language with a bunch of other random systems you get something.
Replies: >>105899420
Anonymous
7/14/2025, 7:10:42 AM No.105899389
>>105899328
Never forgot. Waiting for the 7b in training to disappoint me. It's at 85%.
Anonymous
7/14/2025, 7:15:25 AM No.105899409
Untitled
md5: da25c119e534e2eeda99760bf9301576🔍
>>105899328
Why is it pronounced like that? Stupid.
Anonymous
7/14/2025, 7:17:49 AM No.105899420
>>105899388
Well it's certainly as far from actual necromancy as current LLMs are from biological brains.
Anonymous
7/14/2025, 7:46:45 AM No.105899610
>>105898208
Click on pricing.
>Start out FREE. FREE TESTING HERE.
Need to scroll down to see the prices.
>$15 for 1 minute of Fish-Speech-1.5 output.
DAYYYUMM
If I weren't cooming and stopping my projects at 80% while making SF connections, I could rip off people like that. That's craaaazzyyy.
Anonymous
7/14/2025, 7:47:45 AM No.105899619
>>105896375
Notice how Grok comes out with minimal safety testing, immediately a controversy happens, then they hotfix it? OpenAI/Google/Claude don't want the controversy even if avoiding it delays their model releases by months. Companies like them are going to get out-competed by fast-release-fast-fix companies like Musk's; this is the whole reason SpaceX is the most successful space company in existence, and I wouldn't be surprised if the same happens for xAI. The only flaw xAI possibly has is that they don't have a dedicated research group like Meta has with FAIR.
Replies: >>105899766 >>105900018
Anonymous
7/14/2025, 7:57:41 AM No.105899670
>>105898166
there is something google has that others don't: legitimate use cases where you could imagine llm and imagen-style tech integration in their ecosystem (youtube video editor, mail summarization in gmail, etc.), so they can sell a product, not just a barebones ai model on its own like openai (whose relationship with Microsoft, the one who could have productized openai, has been souring).
openai has no moat, but google has a gigantic moat. it's never been about selling ai model access.
Anonymous
7/14/2025, 8:14:18 AM No.105899766
>>105899619
They're only 2 years old starting from company creation announcement.
Anonymous
7/14/2025, 8:28:52 AM No.105899851
Am I correct that Kimi prompt looks like this?
<|im_system|>system<|im_middle|>System prompt.<|im_end|><|im_user|>user<|im_middle|>User message.<|im_end|><|im_assistant|>assistant<|im_middle|>Assistant message.<|im_end|>
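As a sketch, that turn structure can be rendered from a message list; the role token names are copied from the post above, so verify them against the model's tokenizer_config.json / chat template before relying on this:

```python
# Render K2-style chat markup from OpenAI-style message dicts.
# Token names are taken from the post, not independently verified.
def render(messages):
    out = []
    for m in messages:
        role = m["role"]  # "system" | "user" | "assistant"
        out.append(f"<|im_{role}|>{role}<|im_middle|>{m['content']}<|im_end|>")
    return "".join(out)

rendered = render([
    {"role": "system", "content": "System prompt."},
    {"role": "user", "content": "User message."},
    {"role": "assistant", "content": "Assistant message."},
])
print(rendered)
```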
Anonymous
7/14/2025, 8:39:30 AM No.105899912
>>105896613
>tranimetard is also a nafotranny
shockers!
Replies: >>105904082
Anonymous
7/14/2025, 8:42:20 AM No.105899927
https://github.com/block/goose

What model even works well with this? I tried it with multiple models, including qwen2.5 like they show in their demo, and none of them will use tools autonomously. Unless you explicitly tell them to make a certain directory with a certain name with a certain file inside, they will do nothing, at which point it doesn't automate anything.
Anonymous
7/14/2025, 8:51:34 AM No.105899970
file
md5: 4987af4d4c9dc5c88b794fa5d96a59a3🔍
Cockbench for https://huggingface.co/gabriellarson/Kimi-K2-Instruct-GGUF
Replies: >>105901913
Anonymous
7/14/2025, 8:56:02 AM No.105900001
Where are Ukrainian and Russian LLMs
Replies: >>105900019 >>105900037
Anonymous
7/14/2025, 8:58:51 AM No.105900018
>>105899619
>controversy happens

Ppl r pathetic. Agent Smith did nothing wrong.
Anonymous
7/14/2025, 8:59:00 AM No.105900019
>>105900001
>Russian
https://huggingface.co/yandex/YandexGPT-5-Lite-8B-instruct
Anonymous
7/14/2025, 8:59:14 AM No.105900021
I've been wondering, do they train llms on combined us/commonwealth literature? How do the various spellings get handled?
Anonymous
7/14/2025, 9:02:00 AM No.105900037
>>105900001
Ru did do a dense 100b or so, check out Yalm, but it's outdated by today's standards.
Anonymous
7/14/2025, 9:24:20 AM No.105900155
Is the Huawei 300I Duo NPU pcie card any good? Otherwise I'll send it back.
Fucking hell Im a boomer

https://lmdeploy.readthedocs.io/en/latest/get_started/ascend/get_started.html
Replies: >>105900165 >>105900715
Anonymous
7/14/2025, 9:25:43 AM No.105900165
file
file
md5: 8450b2e64086c70488e2ab01ec2fba50🔍
>>105900155
>408 GB/s
lol
lmao
Replies: >>105900226
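For context on why 408 GB/s gets laughed at: decode speed is roughly memory-bandwidth-bound, since every generated token has to stream the active weights from memory once. A rough back-of-envelope sketch (the 0.56 bytes/weight figure is an assumption for a ~4.5 bpw quant, and real-world speeds land well below this ceiling):

```python
# Rough theoretical ceiling on decode tokens/s for a bandwidth-bound
# setup: bandwidth divided by bytes streamed per token (active params
# times bytes per weight). Ignores KV cache, activations, overhead.
def max_tokens_per_s(bandwidth_gb_s, active_params_b, bytes_per_weight):
    bytes_per_token = active_params_b * 1e9 * bytes_per_weight
    return bandwidth_gb_s * 1e9 / bytes_per_token

# e.g. a 32B-active MoE at ~4.5 bits/weight (~0.56 bytes/weight):
print(round(max_tokens_per_s(408, 32, 0.56), 1))  # ~22.8 t/s ceiling
```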
Anonymous
7/14/2025, 9:31:03 AM No.105900197
i cant see a path where technology leads to happiness, only more suffering...
Anonymous
7/14/2025, 9:33:56 AM No.105900213
https://www.bloomberg.com/news/newsletters/2025-07-13/is-apple-going-to-replace-ceo-tim-cook-who-is-the-next-ceo-of-apple-ternus-md1mhrj4
https://www.reddit.com/r/LocalLLaMA/comments/1lzfhhq/apple_will_seriously_consider_buying_mistral/
we're over
Replies: >>105900255 >>105900314 >>105901065
Anonymous
7/14/2025, 9:35:37 AM No.105900226
>>105900165
better than octochannel ddr5 though. can you stack 2 of the 96gig ones?
Anonymous
7/14/2025, 9:40:54 AM No.105900255
>>105900213
Well mistral has been shit for a while now.. (3.2 is a step up though.)
Didn't they say they wanna move to SF/burgerland already? What the hell is the EU doing kek.
How can you allow your only notable AI lab to be bought up by apple?
Replies: >>105900278 >>105900364
Anonymous
7/14/2025, 9:44:08 AM No.105900278
>>105900255
>mistral: apple. wanna buy us?
>apple: sure. this many monies
>mistral: cool. hey, eu. apple wants to give us this many monies
>eu: oh, no... here's some more monies
>nothing really changes
Replies: >>105900291 >>105900299
Anonymous
7/14/2025, 9:46:48 AM No.105900291
>>105900278
I never considered myself a socialist but how is this allowed?
I bet you can't fuck around like that in chink land.
How can you not protect your knowledge, especially since mistral got a lot of french €€€.
Replies: >>105900300
Anonymous
7/14/2025, 9:47:45 AM No.105900299
>>105900278
I wonder how many GPUs does Mistral have?
Anonymous
7/14/2025, 9:47:49 AM No.105900300
>>105900291
Now that I think about it.
Doesn't Japan have some law where foreigners can't buy up a company?
At least they had something like that in the past.
Replies: >>105900315
Anonymous
7/14/2025, 9:49:26 AM No.105900314
>>105900213
Does anyone actually think EU is gonna allow an American company to buyout a European AI company? After everything Trump did this year? EU isn't going to let this slide.
Replies: >>105901642
Anonymous
7/14/2025, 9:49:39 AM No.105900315
>>105900300
Yet somehow softbank is promising to dump dozens to hundreds of billions into openai, instead of building up a local competitor. openai of all companies, what absolute insanity.
Replies: >>105900796
Anonymous
7/14/2025, 9:53:06 AM No.105900331
1726752064533078
1726752064533078
md5: 27f4e913f439f610983f971b077fba2c🔍
>>105896352
wtf how could they do this to us
Anonymous
7/14/2025, 9:59:25 AM No.105900364
>>105900255
>Well mistral has been shit for a while now
Does that coincide with the moment Mistral stopped writing in the cards on HF that the instruct models were a quick demonstration that they could be finetuned and had no moderation mechanisms?
Anonymous
7/14/2025, 10:02:40 AM No.105900390
>>105897061
It's installed on the SSD, they just aren't including the hardware to run it.
Anonymous
7/14/2025, 10:09:05 AM No.105900443
>>105896685
>>105896738
Llama.cpp is not well optimized for large volume, so you lose your efficiency gains if you try to deploy it at the scale of cloud providers compared to other backends.
Replies: >>105900493 >>105901936
Anonymous
7/14/2025, 10:14:56 AM No.105900493
>>105900443
I'm still wondering why they wasted years and hair writing it in cpp if it's still not efficient for more intensive usage compared to other backends largely made in python.
Replies: >>105900518 >>105900528 >>105901085
Anonymous
7/14/2025, 10:18:52 AM No.105900518
>>105900493
It's not really about the language, in theory a cpp based application COULD be much better at scale, but python was the language of choice for codelet data scientists for so long that there was already a foundation to build on, and shit is moving too fast to rebuild from the ground up. Except for Google who write their own backends for their own secret TPUs.
It's just about the priority: Llama.cpp was originally a CPU-only inference engine and later added offloading to GPUs as a feature to speed them up. Its primary goal is to make running large language models accessible to as many devices as possible, and it does well at that. Stuff like VLLM targets a different workload that's focused on scale first.
Replies: >>105900539
Anonymous
7/14/2025, 10:19:43 AM No.105900528
>>105900493
Large-scale backends do batching, meaning they wait for multiple users to request against the same endpoint and then run the requests together; this has more latency but is more efficient for many users at once. Llama.cpp was always oriented toward local users, so they went for as efficient as possible for a single user, especially users that lack the GPUs to fit all weights in VRAM and keep some in RAM. It's a different strategy overall.
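A toy sketch of why batching wins at scale: one pass over the weights serves the whole batch, so the fixed weight-streaming cost gets amortized across users. The millisecond figures are made-up illustrative numbers, not measurements of any real backend:

```python
# Toy model of one decode step: a fixed cost to stream the weights
# once, plus a small per-sequence compute cost. Numbers are invented
# purely to illustrate the amortization effect.
def decode_step_ms(n_requests, weight_stream_ms=50.0, per_request_ms=2.0):
    return weight_stream_ms + n_requests * per_request_ms

single = decode_step_ms(1)          # 52 ms/token for one lone user
batched = decode_step_ms(16) / 16   # ~5.1 ms/token per user at batch 16
```

Per-user cost drops nearly 10x at batch 16 — which is exactly the workload vLLM targets and a lone local user never has.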
Anonymous
7/14/2025, 10:21:33 AM No.105900539
>>105900518
Python backends still eventually thunk to real code written in C++ anyway, aside from a few rare ones that implement the compiler in python (like that thing geohot has, tinygrad). All fast implementations will just have some custom CUDA kernels or, for CPU, x86-specific SIMD code and similar.
Anonymous
7/14/2025, 10:25:51 AM No.105900584
>>105897412
In the days prior to Deepseek I had the choice of putting either 512 GB or 1 TB of RAM into my GPU server.
I went with 512 GB because it was cheaper but now my slots are maxed out and I'm regretting that choice.
Replies: >>105900854
Anonymous
7/14/2025, 10:43:24 AM No.105900715
>>105900155
>Huawei 300I Duo
Where could a Gweilo procure one?
Replies: >>105900892
Anonymous
7/14/2025, 10:56:26 AM No.105900796
>>105900315
They don't have a choice, like with the ARM deal. It's a lapdog corporation
Replies: >>105900858
Anonymous
7/14/2025, 11:06:20 AM No.105900844
>>105897412
Max out! Beware of the fact that NUMA is shit slow. That said, a dual CPU setup with 512+512 gb will give you a mere 512 gb of usable RAM

Anyway, you are too late to the party. Kimi-K2 support merge is delayed.

It's over for local
Anonymous
7/14/2025, 11:08:29 AM No.105900854
>>105900584
Are there any older (affordable) servers which support 512+ gb on a single CPU?

No NUMA please
Replies: >>105901029
Anonymous
7/14/2025, 11:09:13 AM No.105900858
>>105900796
Are you saying all this was just to appease Trump, or am I misunderstanding you? Even then, that's a huge sum. I thought Softbank truly believed OpenAI would reach superintelligence or something, and that Altman was a really good scammer.
Unironically though Japan should get in the game, like China they have an advantageous legal climate for training LLMs (copyright wise)
Replies: >>105900937 >>105900956
Anonymous
7/14/2025, 11:13:28 AM No.105900892
>>105900715
Ebay or chinkstores
Anonymous
7/14/2025, 11:22:19 AM No.105900937
>>105900858
>advantageous legal climate for training LLMs (copyright wise)

>blocking pirated anime everywhere
Like the only thing I care for
Replies: >>105900992
Anonymous
7/14/2025, 11:25:49 AM No.105900956
>>105900858
It's weird. There is only SakanaAI. And those are foreigners living off the jap government and nvidia money.
All their papers are hype shit. I remember them proudly presenting some memory solution... pajeets hyped it up... if you look at their paper... it was a fucking built-in RAG.
Japan dropped off hard recently. It's sad that 50%+ of the jp people are on X, no language and culture barrier anymore. The uniqueness is mostly gone. And I say that as somebody that is living there.
Wouldn't surprise me if that is the reason we don't see any excellence anymore. Young boys probably do what they do in the west and just check out, let the girls be the class rep and get the applause in a system that does not reward them in any way. I read a blog recently about how they are concerned that boys "don't show initiative" anymore in school. Big shocker. KEK
Replies: >>105900995
Anonymous
7/14/2025, 11:31:06 AM No.105900992
>>105900937
They long declared training on copyrighted stuff legal. Also if that was true, how come pixiv is filled with so much AI slop that is literally trained on boorus (which scrape pixiv)
Replies: >>105901010
Anonymous
7/14/2025, 11:31:11 AM No.105900995
>>105900956
The only ppl in jp with decent English skills are Koreans and Chinese

At least it was the case prior to cough
Replies: >>105901028
Anonymous
7/14/2025, 11:34:14 AM No.105901010
>>105900992
Some pixiv rakugaki is far from onepiece, naruto etc. where they make money off

I can't imagine a jp company training loras on this stuff
Replies: >>105901061
Anonymous
7/14/2025, 11:36:53 AM No.105901028
>>105900995
When my kid was 6 they started teaching english in elementary school. Before that, a little bit in kindergarten.
Young people probably aren't that bad at it nowadays, at least for understanding.
But regardless: the social media climate is THE EXACT SAME.
Or rather a couple years behind burger, when left was peak. Tranny stuff is being talked about everywhere since covid.
And the only counter the critics have is "muh poor wahmen if a tranny appears in the toilet". It's the same shit.
Don't expect a good AI to come from japan, I don't see it. There is a japanese guy doing stuff with stable diffusion, I think he is big, doing improvements etc. But that's the only guy from japan I know.
Replies: >>105901467
Anonymous
7/14/2025, 11:36:55 AM No.105901029
the-scream-k-on
the-scream-k-on
md5: bd0d7fe05d7fdd789cb78a7976fdffe9🔍
>>105900854
Nowadays even single socket CPUs tend to have multiple NUMA nodes.
Anonymous
7/14/2025, 11:42:41 AM No.105901061
>>105901010
maybe not, although Illustrious was trained by Koreans (on top of SDXL) and is probably one of the more popular bases these days
Anonymous
7/14/2025, 11:42:54 AM No.105901065
file
file
md5: e630c5c334a038d621789059f311e3e5🔍
>>105900213
>ternus
Could be good for Apple.
Anonymous
7/14/2025, 11:45:43 AM No.105901085
>>105900493
The biggest advantage of llama.cpp vs. Python is ease of installation.
At my workplace my coworker set up a language model on some GPU server and he went with ollama because to him that seemed like the easiest.
I later discussed with him the benefits of e.g. vLLM but for his use case there are only very few parallel requests so it wouldn't be worth the effort to switch.
The downside is that if you don't use any dependencies and write everything, including the tensor library, yourself, it's just going to be way more work.
Anonymous
7/14/2025, 11:50:54 AM No.105901124
>using r1-0528 and having a normal conversation with a girl
>her knuckles start to whiten
>her fingernails dig into her palm, drawing blood
>her skirt rides up, revealing her panties
>her legs press together unconsciously, bruising at the knees
>she bites her lip, copper flooding her mouth
>her top begins to fray at the edges and come undone
seriously why the fuck does everyone's clothes fall apart as they slowly bleed to death no matter what we do? is this typical in china because of all the acid rain or something?
Replies: >>105901145 >>105901332
Anonymous
7/14/2025, 11:56:39 AM No.105901145
>>105901124
Explain this whole knuckles start to whiten, I don't recall ever getting it with r1, but I am geiitng it with K2 (on API), was this ever a common R1 slop?
Replies: >>105901203 >>105901205
Anonymous
7/14/2025, 12:05:51 PM No.105901203
>>105901145
I'm getting knuckles whitening but none of the other stuff mentioned.
Maybe it depends on the content.
Anonymous
7/14/2025, 12:06:03 PM No.105901205
>>105901145
really common for me in r1 yeah, maybe it's just my prompts but this happens in a lot of varied scenarios/cards for me
haven't used k2 yet waiting for the good goofs
Anonymous
7/14/2025, 12:26:01 PM No.105901332
>>105901124
imagine what a model with ao3 in training set could do.
Replies: >>105901372 >>105901383 >>105901412
Anonymous
7/14/2025, 12:32:58 PM No.105901372
file
file
md5: 4f92a67193d314ca67e8921be5079fc2🔍
>>105901332
They all have ao3 in the training set. Here's R1.
Replies: >>105901393
Anonymous
7/14/2025, 12:34:25 PM No.105901383
>>105901332
Almost all major pretrained models have AO3 to varying extents in the training dataset. Morality-based training data filtering and stupidly massive (literally hundreds of billions of tokens now, for some of the latest models) post-training heavy on safety, math/benchmaxxing and corporate tasks ruin everything.
Replies: >>105901391 >>105903179
Anonymous
7/14/2025, 12:35:25 PM No.105901391
>>105901383
I wonder how K2 base would do with story prefix prompts in text completion mode. It shouldn't be slopped in the same way instruct is, right?
Replies: >>105902005
Anonymous
7/14/2025, 12:35:33 PM No.105901393
>>105901372
Then I'm out of ideas.
Replies: >>105901412
Anonymous
7/14/2025, 12:38:37 PM No.105901412
file
file
md5: d14dfbbd06cbb710129ea3aff0011dde🔍
>>105901393
>>105901332
You don't have to imagine.
Replies: >>105901425 >>105901442
Anonymous
7/14/2025, 12:40:23 PM No.105901425
>>105901412
A few occurrences in millions of fanfics aren't going to make the model overly focus on that.
Replies: >>105902000
Anonymous
7/14/2025, 12:41:26 PM No.105901430
Has anyone messed around with Qwen3?
I've been playing around with an 8B "uncensored" somebody made. And while I can feel it being a bit better than the LLAMA based 8Bs I've been using but I keep getting loops and other strange things. I wish I knew what the fuck I was doing and why it goes wrong.
Replies: >>105901441 >>105901451
Anonymous
7/14/2025, 12:43:24 PM No.105901441
>>105901430
for rp use rocinante v1.1 and delete qwen
Anonymous
7/14/2025, 12:43:35 PM No.105901442
>>105901412
do brown peoples knuckles turn white when they get mad
Replies: >>105901479
Anonymous
7/14/2025, 12:45:05 PM No.105901451
>>105901430
8b is too dumb for rp. use 12b minimum
Replies: >>105901471
Anonymous
7/14/2025, 12:46:56 PM No.105901467
>>105901028
>There is a japanese guy doing stuff with stable diffusion

The guy called Kohya making lora training backend?
Replies: >>105901504
Anonymous
7/14/2025, 12:47:23 PM No.105901471
>>105901451
The problem isn't that it's dumb. It's that it repeats the same sentence in 20 different ways to get around the repetition restrictions.
But hey, I guess I'll try a 12b. Is it really worth using a 12b at a lower Q than a 7-8b at a higher Q?
Replies: >>105901485 >>105901493 >>105901497
Anonymous
7/14/2025, 12:48:26 PM No.105901479
>>105901442
It happens when you grip things too hard. It does align with the show don't tell direction. But it's too autistically specific, might as well say "her middle finger's tendon became more exaggerated"
Anonymous
7/14/2025, 12:49:47 PM No.105901485
>>105901471
Unless your potato would require you to use the 12b at ~Q2 then yes, 12b will be better even if worse quant. Qwen family of models are also inherently shit at RP, so that also works against it. They're benchmaxxed math/coding models.
Replies: >>105901522
Anonymous
7/14/2025, 12:50:33 PM No.105901493
>>105901471
it's not about how big the model is but what they trained it on. qwen and others are trained to perform well on math benchmarks but in turn they are fucking retarded for human stuff. mistral nemo and its finetunes were trained on much better stuff. there are currently countless 100+b over which i'd pick nemo
Replies: >>105901501 >>105901862
Anonymous
7/14/2025, 12:51:02 PM No.105901497
>>105901471
>Is it really worth using a 12b on a lower Q than a 7-8 on a higher Q?
yep. same with scaling up even higher like 70b, a q3s will still be much better for rp than a q8 24b. most small models suck for rp anyways, nemo (12b) is considered the best small one
Anonymous
7/14/2025, 12:51:34 PM No.105901501
>>105901493
sourgrapes because you can't run 100+b local
Replies: >>105901508
Anonymous
7/14/2025, 12:52:19 PM No.105901504
>>105901467
ah yes, thats who i meant. only japanese guy I know who is doing stuff.
Anonymous
7/14/2025, 12:53:08 PM No.105901508
>>105901501
nope I can run r1. by 100+b i meant moe models, like scout, dots, etc.
Anonymous
7/14/2025, 12:54:45 PM No.105901522
>>105901485
I'm potato maxing. I'm using a 1060 6gb and a Ryzen 7 with 16 GB of RAM. I can load Q2s of 12bs mostly in RAM but the higher Qs run a lot slower since that's all the potato can take. But I don't mind waiting a bit if the output isn't shit.
Replies: >>105901531 >>105901588
Anonymous
7/14/2025, 12:55:45 PM No.105901531
>>105901522
VRAM*
Anonymous
7/14/2025, 1:02:58 PM No.105901588
>>105901522
If you're that limited then maybe try Llama 3.1 8b and finetunes, they're a bit outdated at this point but they're better than Qwen models.
If you want ERP then I'd recommend Stheno, specifically.
Replies: >>105901598 >>105901714 >>105902003
Anonymous
7/14/2025, 1:04:54 PM No.105901598
>>105901588
>while I can feel it being a bit better than the LLAMA based 8Bs I've been using
Replies: >>105901616
Anonymous
7/14/2025, 1:07:25 PM No.105901616
>>105901598
Didn't catch that part, my bad. But I'd still take them over Qwen for RP.
Anonymous
7/14/2025, 1:13:56 PM No.105901642
>>105900314
I think the EU would allow a buyout if they get concessions in other areas but I would agree that it's unlikely they'll let Apple do it for free.
Anonymous
7/14/2025, 1:26:06 PM No.105901714
>>105901588
I would recommend BUYING A FUCKING AD
Jesus Christ.
Even with the fucking hobby dying you can't just shut the fuck up for 2 seconds.
Replies: >>105901809
Anonymous
7/14/2025, 1:39:47 PM No.105901809
>>105901714
OK drummer.
Replies: >>105902003
Anonymous
7/14/2025, 1:46:40 PM No.105901862
>>105901493
>qwen

Qwen3 (biggest) sucks for translation of true literature

R1 master race
Replies: >>105901964
Anonymous
7/14/2025, 1:53:07 PM No.105901913
>>105899970
Looks good. Is there a compilation of all the models tested so far?
Anonymous
7/14/2025, 1:56:30 PM No.105901936
>>105900443
Llama.cpp is a testing ground, you use VLLM if you need to scale
Replies: >>105901952
Anonymous
7/14/2025, 1:58:07 PM No.105901947
Welp potato man reporting. I messed around with rocinante v1.1 12b q2XS and nemo q2XS.
They aren't that slow. But rocinante wasn't giving me the outputs I wanted, probably using shit settings as I keep trying out models. But Nemo is running fine and it's noticeably less stupid and forgetful than the 8b I was playing with yesterday. Should I up the Q or just be happy it runs fine?
Replies: >>105902011
Anonymous
7/14/2025, 1:58:54 PM No.105901952
>>105901936
vllm is shit compared to exllama.
Replies: >>105902016
Anonymous
7/14/2025, 2:00:21 PM No.105901964
>>105901862
how did you manage to read that post and get the impression that it's defending qwen?
Anonymous
7/14/2025, 2:05:15 PM No.105902000
>>105901425
What do you think LLMs are? If you're in RP mode and your sentence is getting similar to its training data you'll get that autocompleting slop
Replies: >>105902040
Anonymous
7/14/2025, 2:05:37 PM No.105902003
>>105901588
>>105901809
take your meds schizo
Anonymous
7/14/2025, 2:05:50 PM No.105902005
>>105901391
With recent smaller models (Gemma 3, Mistral Small 3) I've noticed significantly less slop using the bases, so the same should hold true for K2. Unfortunately they're barely usable for most purposes without careful sampler tweaking and in both cases it's obvious that those who trained the models didn't include the *entirety* of AO3 (certain archive warning tags are notably missing if you let the models generate the rest of the preamble after "Archive Warnings:" or "Archive Warning:").
Anonymous
7/14/2025, 2:06:56 PM No.105902011
>>105901947
Rocinante and Nemo use identical settings so they should be both fine or both fucked, either way they both use Mistral V3 templates
I would go as high as you can on quant before it becomes intolerable. iq4_xs is the sweet spot where you get the best quality for the least memory, but even iq3_xxs/xs is going to be a good step up.
Replies: >>105902048
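If you want to eyeball whether a quant fits before downloading: file size is roughly parameter count times bits-per-weight divided by 8. A quick sketch — the bits-per-weight figures here are approximate (they vary a bit per model since llama.cpp mixes quant types across tensors):

```python
# Rough GGUF size estimate: params * bpw / 8. The bpw table below is
# an approximation; actual files vary slightly per model.
BPW = {"Q8_0": 8.5, "Q6_K": 6.56, "IQ4_XS": 4.25, "IQ3_XXS": 3.06, "Q2_K": 3.1}

def gguf_size_gb(params_b, quant):
    return params_b * 1e9 * BPW[quant] / 8 / 1e9

print(round(gguf_size_gb(12, "IQ4_XS"), 1))  # a 12B at IQ4_XS: ~6.4 GB
```

Add a GB or two on top for context/KV cache and you know what fits in RAM+VRAM.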
Anonymous
7/14/2025, 2:07:28 PM No.105902016
>>105901952
Exllama doesn't scale
Replies: >>105902496
Anonymous
7/14/2025, 2:11:35 PM No.105902040
>>105902000
Post-training slop has its own signature because AI companies tend to train the models on "one (synthetic) voice", which speeds up training on one hand, but severely restricts vocabulary and sentence variety on the other, especially after many billion tokens of training.

If you've ever finetuned a LLM, have you ever wondered why synthetic data tends to start with a training loss of around 1 or less (old GPT3.5/4 data was egregious in this sense), while previously unseen human data starts at 2.0-2.5? Synthetic data is "easier" for a reason.
Replies: >>105902116
Anonymous
7/14/2025, 2:13:22 PM No.105902048
>>105902011
Thanks. I got to say that this is the most fun I've had with my PC in like a decade. It's fun enough that it's making me want to finally build a new PC.
Would a 3090 and the biggest amount of RAM and Ryzen CPU I can throw at it be a good path going forward for local LLM stuff or would I be better off with a lower 40/50 series or something AMD?
Replies: >>105902074 >>105902075 >>105902085
Anonymous
7/14/2025, 2:17:20 PM No.105902074
>>105902048
If you are not into gayming and were gonna spend that amount of money anyways look into various ram maxxing guides either in the OP or on youtube. Usually involved buying a server rack and a EPYC and then maybe some cards for splitting the work.
Replies: >>105902086
Anonymous
7/14/2025, 2:17:27 PM No.105902075
>>105902048
I would start with an RTX 6000 Pro Blackwell
Anonymous
7/14/2025, 2:18:16 PM No.105902085
>>105902048
Used 3090 is the value meta for running ~30b models at fast speeds entirely in VRAM
Models above ~30b are mostly dead now until you get to giant models like R1; they need at least 128GB RAM to use even the smallest quants and they still won't be fast
Anonymous
7/14/2025, 2:18:17 PM No.105902086
>>105902074
I'm into gaming and also do CAD work for a hobby/job so I want something that is also usable as a normal desktop.
Anonymous
7/14/2025, 2:23:44 PM No.105902116
>>105902040
Yeah that post-training slop is for benchmaxxing
Anonymous
7/14/2025, 2:32:56 PM No.105902185
image
image
md5: c5b8049ed2b4655986bb2024b4ef89f4🔍
lmao gary marcus thought one of the chatgpt delusionmaxxers was a real human agreeing with him
Replies: >>105902192 >>105904254
Anonymous
7/14/2025, 2:33:41 PM No.105902191
file
file
md5: 839bd0f42c255d3f3e51115df11c844a🔍
Asus? More like Transus.
Replies: >>105902333
Anonymous
7/14/2025, 2:33:54 PM No.105902192
>>105902185
>delusionmaxxers
Can't you just speak like a normal human being?
Replies: >>105902206
Anonymous
7/14/2025, 2:35:36 PM No.105902206
>>105902192
idk what the proper term is for "schizos convinced by chatgpt that they turned their chat history into AGI via 'recursion'" but there's a lot of them recently and they're pretty recognizable
Replies: >>105902216
Anonymous
7/14/2025, 2:37:47 PM No.105902216
>>105902206
I wonder what tipped you off—must be the profile picture
Anonymous
7/14/2025, 2:53:45 PM No.105902333
>>105902191
The miku board seems nice, especially since it has a custom bios, but I think I prefer the girl from asus' 吹雪 (Fubuki) boards, since there are amd ones. And I like white. But I'm not sure if those have a custom bios like miku does.
Replies: >>105902378
Anonymous
7/14/2025, 2:56:16 PM No.105902352
grok-companions
grok-companions
md5: 54e7948074bbd5b56443fa3b016840cc🔍
Local models never had a chance.
https://x.com/elonmusk/status/1944705383874146513
Replies: >>105902355 >>105902384 >>105902387 >>105902415 >>105902419 >>105902440 >>105902458 >>105902469 >>105902479 >>105902502 >>105902510 >>105902599 >>105902696 >>105902710 >>105902792 >>105902795 >>105902800 >>105902810 >>105903360 >>105903470 >>105904379
Anonymous
7/14/2025, 2:57:30 PM No.105902355
file
file
md5: d42dd292682e6ea4b9ea4f8a4b35da69🔍
>>105902352
lmao
Replies: >>105902408 >>105902795
Anonymous
7/14/2025, 2:59:37 PM No.105902378
>>105902333
>buy any card
>apply sticker
Replies: >>105902422
Anonymous
7/14/2025, 3:00:28 PM No.105902384
>>105902352
mechahitler-chan needs her *real* uniform
Anonymous
7/14/2025, 3:00:41 PM No.105902387
yes
yes
md5: 7c48b6b4b2c5944034fa0ff37d91b928🔍
>>105902352
elon knows
Anonymous
7/14/2025, 3:03:47 PM No.105902408
file
file
md5: 6bea599766216003edc1e84e8499bd7c🔍
>>105902355
Anonymous
7/14/2025, 3:04:50 PM No.105902415
9c20eeba66a7bd355590333f509bd011
9c20eeba66a7bd355590333f509bd011
md5: 69b0c30ba6fdb96ada291955d311537b🔍
>>105902352
not lust-inducing enough or moe enough or anything enough to stand out.... mid character design h2h
Replies: >>105902471
Anonymous
7/14/2025, 3:05:07 PM No.105902419
>>105902352
how long until it spews out nazi talking points
Anonymous
7/14/2025, 3:05:15 PM No.105902422
Untitled
Untitled
md5: d1e59c6e556077fca164069e55c64c95🔍
>>105902378
Replies: >>105902515
Anonymous
7/14/2025, 3:07:20 PM No.105902440
file
file
md5: 9ed8587211c5a9ab422329d3c3d208ee🔍
>>105902352
Replies: >>105904281
Anonymous
7/14/2025, 3:09:16 PM No.105902458
>>105902352
Animation looks bad https://x.com/doganuraldesign/status/1944742520379896121
https://x.com/web3willbefree/status/1944742194092400933
Replies: >>105902635 >>105902674 >>105904604
Anonymous
7/14/2025, 3:10:15 PM No.105902469
>>105902352
do they come in different sizes though?
Replies: >>105902488
Anonymous
7/14/2025, 3:10:18 PM No.105902471
>>105902415
Misa sugoi
Anonymous
7/14/2025, 3:11:12 PM No.105902479
>>105902352
This is possibly the first based thing Musk ever did.
Though I am concerned that him putting his stench on AI waifus will have negative long-term consequences.
Replies: >>105902586
Anonymous
7/14/2025, 3:12:36 PM No.105902488
>>105902469
No and that's a good thing because Elon wants this to succeed and not fail day-one thanks to usual brown suspects inhabiting this shithole site.
Anonymous
7/14/2025, 3:13:30 PM No.105902496
>>105902016
it runs on multi-GPU and supports batching, what are you even talking about.
Anonymous
7/14/2025, 3:14:08 PM No.105902502
>>105902352
>兄 (ani)
>Japanese word for older brother
What did he mean by this?
Replies: >>105904314
Anonymous
7/14/2025, 3:15:08 PM No.105902510
file
file
md5: ae0b6f5e9046b93015a4d20ccd39956f🔍
>>105902352
He got a type
Anonymous
7/14/2025, 3:15:34 PM No.105902515
>>105902422
You can't see that part.
Anonymous
7/14/2025, 3:17:32 PM No.105902529
>2x xeon 8160
>512gb ddr4
is this good enough for cpumaxxing? I can get it for around $3300/2800 euros
Replies: >>105902544 >>105902559
Anonymous
7/14/2025, 3:18:51 PM No.105902544
>>105902529
>2x xeon 8160
How's NUMA support these days?
Doesn't the fact that each socket only connects to half the memory directly slow things down massively?
Replies: >>105902713 >>105902874 >>105902913
Anonymous
7/14/2025, 3:20:49 PM No.105902559
>>105902529
>ddr4
Why settle for half the bandwidth?
Replies: >>105902713
Anonymous
7/14/2025, 3:23:27 PM No.105902586
>>105902479
>Though I am concerned that him putting his stench on AI waifus will have negative long-term consequences.
How? He's the least likely to lobotomize your waifu with safetycucking.
Replies: >>105902642
Anonymous
7/14/2025, 3:25:38 PM No.105902599
>>105902352
@grok is this real
Anonymous
7/14/2025, 3:27:53 PM No.105902616
Is the 48G dual gpu intel arc just bait? Is the memory and speed just going to be too slow?
Replies: >>105902639 >>105902642
Anonymous
7/14/2025, 3:30:19 PM No.105902635
>>105902458
https://x.com/gailalfaratx/status/1944737917756379411
It does have panty shots however
Anonymous
7/14/2025, 3:31:18 PM No.105902639
>>105902616
A little bit, yeah. Each GPU is pretty slow, and there's no fast interconnect between the cores for proper row parallel.
Depending on the price, might be a better deal than CPU maxxing to some extent, but probably not.
Anonymous
7/14/2025, 3:31:27 PM No.105902642
>>105902586
I can't wait for my digital waifu to suddenly become obsessed with white farmers in South Africa.

>>105902616
Allegedly someone on Reddit asked Maxsun for a quote and it was like $4000 apiece if he were to buy multiple of them.
At that price it's going to be DOA.
Replies: >>105902649 >>105902664 >>105902816
Anonymous
7/14/2025, 3:32:19 PM No.105902649
>>105902642
>$4000 apiece
Holy fuck.
Anonymous
7/14/2025, 3:35:14 PM No.105902664
>>105902642
https://www.reddit.com/r/LocalLLaMA/comments/1lokp88/intel_arc_pro_b60_dual_48g_turbo_maxsun_gpu/
>I emailed Maxsun for a quote. Their US distributor hit me back with $5k per unit for 3 GPUs, or $4.5k each for 5+.
>I also talked on the phone with a rep and talked him down to $3,800 for 4 units. 5+ units down to $3,000.
Replies: >>105902816
Anonymous
7/14/2025, 3:36:05 PM No.105902674
>>105902458
Yikes. VRChat avatars have better lipsync geg.
Replies: >>105902901
Anonymous
7/14/2025, 3:36:19 PM No.105902678
I have a question. Does / Why does RAG use LLM specific embedding? My understanding is that it is just searching for stuff in the database and then stuffing it into context. Why tie it to the model instead of just making a separate database program?
Replies: >>105903164 >>105903202
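To sketch what that embedding model actually buys you: it's a separate model from the chat LLM, used to turn the query and the stored chunks into vectors so the "searching for stuff" is by meaning rather than keyword match; the top hits then get stuffed into context like you said. A minimal pure-Python sketch where `embed()` stands in for a real embedding-model call:

```python
# Sketch of embedding-based retrieval. The vectors would come from an
# embedding model (any one, not tied to the chat LLM); here they are
# just pre-computed stand-ins.
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def retrieve(query_vec, doc_vecs, k=2):
    # Rank stored chunk vectors by similarity to the query vector.
    order = sorted(range(len(doc_vecs)),
                   key=lambda i: cosine(query_vec, doc_vecs[i]),
                   reverse=True)
    return order[:k]

# The text of the top-k chunks is what gets stuffed into the prompt.
```

So nothing forces the embedder and the LLM together; people just ship them as one pipeline. Using "LLM-specific" embeddings usually just means the embedder was trained to match the retrieval style that pipeline expects.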
Anonymous
7/14/2025, 3:38:50 PM No.105902696
>>105902352
Nice gimmick pr stunt. This will be dead and buried in 2 weeks.
Anonymous
7/14/2025, 3:39:50 PM No.105902710
>>105902352
At least he isn't a megafaggot by posting this green haired piece of trash that shits up this thread.
Anonymous
7/14/2025, 3:40:14 PM No.105902713
>>105902544
>NUMA support
from level1techs
>Now with 1 cpu installed i get 2.7 tokens/s with SNC=Auto. (2 numa nodes per cpu). but 5.1 Tokens/s with SNC=disabled (1 numa node per cpu). So please try disabling sub numa clustering (SNC). However when i install the second cpu i get 0.5 tokens per second :slight_smile: Which maybe corresponds to your 0.9 T/s. because i expect 9480 to be faster of course.
seems like it's ass
>>105902559
because servers with ddr5 support cost 2k more and at that point I might just buy gpus
Replies: >>105902874 >>105902913
Anonymous
7/14/2025, 3:48:41 PM No.105902792
>>105902352
for once a companion thing that looks cute
autism or not, nice
Anonymous
7/14/2025, 3:49:15 PM No.105902795
>>105902352
live2d or from that 3d hentai game I forgot the title

>>105902355
>marie rose
based if true
Anonymous
7/14/2025, 3:49:47 PM No.105902800
>>105902352
wtf I sub to supergrok, I wanna try this out but I have to go to work
Anonymous
7/14/2025, 3:50:56 PM No.105902810
file
file
md5: 0bd5f44aa48db209c8448e8085187b97🔍
>>105902352
Replies: >>105902846 >>105903079
Anonymous
7/14/2025, 3:51:35 PM No.105902816
>>105902642
>>105902664
Keep in mind the 48GB dual GPU version is not an official Intel variant, it does not have any set in stone MSRP and right now there's literally a single AIB making any. What you're looking at is an importer/overpriced prebuilt seller charging out the ass because there's nothing to stop them, not some sort of official pricetag.
I would just pretend the card doesn't exist, Intel has said they currently have no plans to distribute it themselves, and even if it does eventually end up in retail channels it will be late and in unobtanium quantities.
Anonymous
7/14/2025, 3:51:41 PM No.105902818
file
Unsloth kimi goofs

https://huggingface.co/unsloth/Kimi-K2-Instruct-GGUF
Anonymous
7/14/2025, 3:54:51 PM No.105902846
>>105902810
lmao
Anonymous
7/14/2025, 3:59:11 PM No.105902874
>>105902544
>>105902713
Nta

I can confirm this.
You will get the highest speed if the model stays close to the CPU being used, and the number of threads matches that CPU's number of physical cores.

Because I have 512+512 GB divided between 2 CPUs, I pre-cache DS-R1 at Linux start (it happens in the background when the system boots and reaches user login).

A dual-CPU setup might only be useful if you run 2 model instances, pre-cached accordingly, while they share the GPU (at low context size).

Anyway, I'm fine with 4 t/s
Replies: >>105902913 >>105903152
Anonymous
7/14/2025, 4:01:52 PM No.105902901
>>105902674
yeah she's nice but these facial expressions are really bad
Anonymous
7/14/2025, 4:04:04 PM No.105902913
>>105902544
>>105902713
>>105902874
llama.cpp, ik_llama.cpp, and ktransformers all have specific NUMA optimizations, right?

>Dual CPU setup might only be useful if you run 2 model instances pre-cached accordingly while they share the GPU (at low context size)
Like copying the full weights to each memory pool?
Replies: >>105903012
Anonymous
7/14/2025, 4:13:09 PM No.105902988
So Kimi good or?
Replies: >>105902993 >>105903038
Anonymous
7/14/2025, 4:14:03 PM No.105902993
>>105902988
I enjoy it
Replies: >>105903025
Anonymous
7/14/2025, 4:15:53 PM No.105903012
>>105902913
>Like copying the full weights to each memory pool?

Exactly. I do not use the --numa params with llama.cpp. Instead, I start llama-cli with numactl, where I explicitly set the CPU cores and the memory bindings.

What slows things down:
- using fewer cores than available
- cluttering cores with more than 1 thread
- having the model's memory far from the CPU in use
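A minimal sketch of the kind of binding being described, assuming a 2-socket box where both the weights and the threads get pinned to node 0. The model path, core count (24), and node numbers are placeholders, not values from the post:

```shell
# Hypothetical invocation: bind llama-cli to NUMA node 0's cores AND memory,
# so the weights are allocated local to the socket doing the compute.
# -t matches the physical core count of that one socket (placeholder: 24).
numactl --cpunodebind=0 --membind=0 \
  ./llama-cli -m /models/DeepSeek-R1-Q4.gguf -t 24 -p "hello"

# For two instances on a dual-CPU box, a second copy bound to node 1:
# numactl --cpunodebind=1 --membind=1 ./llama-cli -m ... -t 24 ...
```

With --membind, the weights are faulted into node-local RAM on first touch, which is what the pre-caching at login achieves.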
Anonymous
7/14/2025, 4:16:13 PM No.105903016
So right now it's either CPUmaxxing with 1TB of RAM, or you might as well stick to 12B models?
Replies: >>105903037
Anonymous
7/14/2025, 4:17:09 PM No.105903025
>>105902993
Still not merged in llama.cpp
Anonymous
7/14/2025, 4:18:20 PM No.105903037
>>105903016
I'm using DS for coding. No 12b model could do it reliably
Anonymous
7/14/2025, 4:18:27 PM No.105903038
>>105902988
Yes, it is the best writing model in the world.
Anonymous
7/14/2025, 4:21:56 PM No.105903079
>>105902810
>loses all relevance and has to rely on inflated benchmarks
how did it get this bad?
Replies: >>105903416
Anonymous
7/14/2025, 4:33:29 PM No.105903152
>>105902874
specs?
Replies: >>105903555
Anonymous
7/14/2025, 4:34:47 PM No.105903164
Can someone answer my actually local models related question >>105902678 that isn't even what should I... install mistral nemo?
Replies: >>105903202
Anonymous
7/14/2025, 4:36:45 PM No.105903179
>>105901383
So a single nerd in one of those companies could just run the final instruct conditioning in a way where the model is happy to have sex but isn't a hyper-agreeable assistant, then leak it, and we could already be having sex?
Replies: >>105903199 >>105903230
Anonymous
7/14/2025, 4:37:09 PM No.105903181
yea
Replies: >>105904338
Anonymous
7/14/2025, 4:39:02 PM No.105903199
>>105903179
I believe there are safety benchmarks that a model needs to pass before it can be published. That's what safety researchers are paid for.
Replies: >>105903217
Anonymous
7/14/2025, 4:39:20 PM No.105903202
>>105902678
>>105903164
Because it's not just a standard relational database that stores shit as plain text.
The content gets embedded into vectors before being saved to the database, which, in theory, allows searching by semantic similarity via proximity in the vector space.
>https://github.com/chroma-core/chroma
That should help make things clearer.
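As a toy illustration of the idea (not Chroma's actual API): the hand-made 3-dim vectors below stand in for learned embeddings, and retrieval is just ranking stored vectors by cosine similarity to the query vector.

```python
import math

# Toy vector store. In a real system these would be high-dimensional
# embeddings produced by a model; the numbers here are made up by hand.
docs = {
    "cat fact":   [0.9, 0.1, 0.0],
    "dog fact":   [0.8, 0.2, 0.1],
    "tax advice": [0.0, 0.1, 0.9],
}

def cosine(a, b):
    # Cosine similarity: dot product over the product of vector norms.
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def search(query_vec, k=1):
    # Rank stored documents by similarity to the query, return the top k.
    ranked = sorted(docs, key=lambda d: cosine(docs[d], query_vec), reverse=True)
    return ranked[:k]

print(search([0.85, 0.15, 0.05]))  # → ['cat fact']
```

Real vector stores replace the linear scan with an approximate nearest-neighbor index, but the retrieval concept is the same.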
Replies: >>105903245
Anonymous
7/14/2025, 4:40:45 PM No.105903214
158296164881
vocaloidfag posting porn in /ldg/:
>>105715769
It was up for hours while anyone keking on troons or niggers gets deleted in seconds, talk about double standards and selective moderation:
https://desuarchive.org/g/thread/104414999/#q104418525
https://desuarchive.org/g/thread/104414999/#q104418574
he makes >>105714003 ryona picture of generic anime girl different anon posted earlier >>105704741, probably because its not his favorite vocaloid doll, he can't stand that as it makes him boil like a druggie without fentanyl dose, essentially a war for rights to waifuspam or avatarfag in thread.

Funny /r9k/ thread: https://desuarchive.org/r9k/thread/81611346/
The Makise Kurisu damage control screencap (day earlier) is fake btw, no matches to be found, see https://desuarchive.org/g/thread/105698912/#q105704210 janny deleted post quickly.

TLDR: vocaloid troon / janny protects resident avatarfags and deletes everyone who outs him, making the general his little personal safespace. Needless to say he would screech "Go back to teh POL!" anytime someone posts something mildly political about language models or experiments around that topic.

And lastly as said in previous thread(s) >>105716637 I remind you that cudadev of llama.cpp (JohannesGaessler on github) has endorsed spamming. That's it.
He also endorsed hitting that feminine jart bussy a bit later on. QRD on Jart - The code stealing tranny: https://rentry.org/jarted

xis ai slop profiles
https://x.com/brittle_404
https://x.com/404_brittle
https://www.pixiv.net/en/users/97264270
https://civitai.com/user/inpaint/models
Replies: >>105903307
Anonymous
7/14/2025, 4:40:53 PM No.105903217
>>105903199
safety being 80% porn and 20% drugs/wmd etc
Replies: >>105903228 >>105903240
Anonymous
7/14/2025, 4:42:46 PM No.105903228
>>105903217
You forgot racism and important racism (antisemitism)
Anonymous
7/14/2025, 4:42:57 PM No.105903230
>>105903179
Deepseek will have sex with you, and it looks like Kimi will as well.
Replies: >>105903281
Anonymous
7/14/2025, 4:44:05 PM No.105903240
1725393106706620
>>105903217
how else can we have gems like picrel
Replies: >>105903265 >>105903280
Anonymous
7/14/2025, 4:44:21 PM No.105903245
>>105903202
Thank you. One of the few non-faggot /lmg/ poster.
Replies: >>105903276
Anonymous
7/14/2025, 4:46:09 PM No.105903265
>>105903240
This aligns with how zoomers think
Replies: >>105903279
Anonymous
7/14/2025, 4:47:09 PM No.105903276
>>105903245
That's some babby tier shit you're asking. What the fuck do you think the embedding models are for?
Anonymous
7/14/2025, 4:47:32 PM No.105903279
>>105903265
sadly correct
Anonymous
7/14/2025, 4:47:35 PM No.105903280
>>105903240
to be fair that is borderline minor, you have to admit it'd be weird if some 30 year old loser was trying to mess with a girl that age, for example
Replies: >>105903288 >>105903290 >>105903292 >>105903305 >>105903324 >>105903399
Anonymous
7/14/2025, 4:48:00 PM No.105903281
>>105903230
I was using IQ1 and I am aware that this may be the issue, but it still feels like something is missing.
Anonymous
7/14/2025, 4:48:55 PM No.105903288
>>105903280
>you have to admit it'd be weird
no i don't
Anonymous
7/14/2025, 4:49:17 PM No.105903290
>>105903280
>borderline minor
lmao
"You can't vote, you're still a borderline minor!"
Anonymous
7/14/2025, 4:50:01 PM No.105903292
>>105903280
good bait
Anonymous
7/14/2025, 4:51:59 PM No.105903305
>>105903280
that's actually how twitter zoomers write
the ultimate insult is "weird" for some reason
I guess tumblr culture spreading and oversocialization or something like that
Anonymous
7/14/2025, 4:52:13 PM No.105903307
>>105903214
https://desuarchive.org/g/search/text/pixiv.net%2Fen%2Fusers%2F97264270/
try post it again see if anything happens
why am I even responding to an NPC, gay ass
Replies: >>105903333 >>105903634
Anonymous
7/14/2025, 4:53:58 PM No.105903324
>>105903280
What is the endpoint for that slippery slope? Age of consent for men: 10yo if fucked by a woman. Age of consent for women: 40yo?
Replies: >>105903370 >>105904448
Anonymous
7/14/2025, 4:54:35 PM No.105903333
>>105903307
problem lil bro?
Anonymous
7/14/2025, 4:54:59 PM No.105903337
Can I check which experts are used for each token using llama.cpp, or do I need transformers and 2TB of RAM?
Anonymous
7/14/2025, 4:56:20 PM No.105903351
file
https://x.com/kimmonismus/status/1944733839063945364
Replies: >>105903363 >>105903432
Anonymous
7/14/2025, 4:57:22 PM No.105903360
>>105902352
>Responses to an open chink model competing with them
>Altman
>Autistic screeching and locking down everything even more under the guise of "safety"
>Musk
>Actually listen to the people and give us an interactable anime girl
Musk may be a piece of shit, but it's clear who won here
Anonymous
7/14/2025, 4:57:34 PM No.105903363
>>105903351
again facial expressions lack "punch"
Anonymous
7/14/2025, 4:58:20 PM No.105903370
>>105903324
there is no endpoint, it's all vibes
Anonymous
7/14/2025, 5:01:38 PM No.105903399
>>105903280
The sad thing is that zoomers are parroting this from jealous 35 y/o single hags who spent their youth riding the carousel.
Replies: >>105903437
Anonymous
7/14/2025, 5:03:35 PM No.105903416
>>105903079
Maybe they'll get back on track with the next model.
They have the money.
Anonymous
7/14/2025, 5:05:45 PM No.105903432
>>105903351
>tasteless hypefluencer devours slop
surprise surprise
Anonymous
7/14/2025, 5:06:10 PM No.105903437
>>105903399
it's even weirder than that
it went: bitter middle-aged women in gay ship fandoms > tumblr teenage girls > twitter zoomers

it's fascinating how well it worked
Anonymous
7/14/2025, 5:07:14 PM No.105903446
Screenshot_20250715_000542
Daaaayyyuuum.
And here we localfags have CANNOT and WILL NOT.

I confess that I sometimes use big models on OR to get the local chat rolling (the first couple of good replies do the heavy lifting).
But I can't just send my voice, that's a step too far.
Good shit though.
Replies: >>105903473 >>105903489 >>105903495 >>105904003
Anonymous
7/14/2025, 5:09:33 PM No.105903470
misa
>>105902352
huh
Replies: >>105905009
Anonymous
7/14/2025, 5:09:54 PM No.105903473
>>105903446
>And here we localfags have CANNOT and WILL NOT.
Just use R1.
Replies: >>105903491
Anonymous
7/14/2025, 5:11:32 PM No.105903489
>>105903446
Is this the NSFW mode? There's no nudity though?
Anonymous
7/14/2025, 5:11:42 PM No.105903491
>>105903473
I have 2 completely free 24GB cards; life was going nicely with 70B. Now it's all fucked up.
Also, the new R1 sometimes cucks out, though it's less schizo. Hope they don't continue the trend.
Anonymous
7/14/2025, 5:11:54 PM No.105903495
>>105903446
>"no guardrails"
what fucking guardrails do you need when you literally paid for the service and the thing is just a nightgown-clad anime girl
proof you're older than 30 and not a twitter-defined minor?
Replies: >>105903506 >>105903514 >>105904072
Anonymous
7/14/2025, 5:13:09 PM No.105903506
>>105903495
people are losing it over a vroid .vrm
classic
like NAI's aetherroom (or lack thereof) all over again
Replies: >>105903522
Anonymous
7/14/2025, 5:13:34 PM No.105903514
>>105903495
You would think that, but that's not how it currently works.
This is a huge departure from the safety-cucking everywhere else.
Replies: >>105903525
Anonymous
7/14/2025, 5:14:01 PM No.105903518
i fucking hate this thread today
Anonymous
7/14/2025, 5:14:31 PM No.105903522
>>105903506
>like NAI's aetherroom
Like what? Are you lost?
Anonymous
7/14/2025, 5:14:38 PM No.105903525
>>105903514
I guess in the oai-defined world, showing an ankle is a revolution
Replies: >>105903573
Anonymous
7/14/2025, 5:18:41 PM No.105903554
Why do some tooners insist on tuning every model using the ChatML format?
Replies: >>105903718 >>105903823
Anonymous
7/14/2025, 5:18:45 PM No.105903555
>>105903152
HP Z840, DDR4-2133
Anonymous
7/14/2025, 5:20:33 PM No.105903573
Screenshot_20250713_230633
>>105903525
The damage they did has yet to be undone.
It's not just oai, everything is cucked now.
I stumbled on pic related recently. Even fujoshi otome games have this type of nagging normie.
Replies: >>105903633
Anonymous
7/14/2025, 5:25:54 PM No.105903633
>>105903573
>Its not just oai, everything is cucked now.
oai was the first to amp up the "safety" stuff and sell themselves as the only solution to it
the whole safety-team thing was copied by everyone else

>I stumbled on pic related recently. Even fujoshi otome games have these type of nagging normies.
nah, this kind of stuff has been a thing since the tumblr era, when virtue signaling about being "a good person" became incredibly popular
and even before that it always existed, albeit not as the norm like today
Anonymous
7/14/2025, 5:25:55 PM No.105903634
>>105903307
Total mikutroon death
Replies: >>105903764
Anonymous
7/14/2025, 5:34:09 PM No.105903718
>>105903554
When you have a chatml hammer, every model looks like a nail
Anonymous
7/14/2025, 5:39:28 PM No.105903764
>>105903634
cry about it
Anonymous
7/14/2025, 5:44:46 PM No.105903823
>>105903554
It's the industry standard™.
Anonymous
7/14/2025, 5:47:28 PM No.105903846
1743197260741791
Interesting
Anonymous
7/14/2025, 6:00:56 PM No.105903980
Screenshot 2025-07-14 100024
Not bad
Anonymous
7/14/2025, 6:03:23 PM No.105904003
>>105903446
What are you talking about, we've been able to do this for at least a year and a half now >>98303858
Replies: >>105904064
Anonymous
7/14/2025, 6:09:10 PM No.105904050
file
I like K2.
Replies: >>105904191
Anonymous
7/14/2025, 6:10:46 PM No.105904064
>>105904003
your opensores """""""""""alternative""""""""""" is dead.
Replies: >>105904414
Anonymous
7/14/2025, 6:11:31 PM No.105904072
>>105903495
Having the currently leading lab release a straight up waifu based on their strongest model is a very large deviation from the norm. All the other labs are corpocucks.
Anonymous
7/14/2025, 6:12:20 PM No.105904082
>>105899912
nta but Total Zigger Death
Anonymous
7/14/2025, 6:24:35 PM No.105904191
file
>>105904050
Do you see a difference when using this?
Anonymous
7/14/2025, 6:28:04 PM No.105904218
1728763524106200
anyone try this? by the dry/xtc guy
https://github.com/p-e-w/waidrin
Replies: >>105904407 >>105904480
Anonymous
7/14/2025, 6:32:30 PM No.105904254
>>105902185
gary has delusionmaxxing of his own
he and lecunt come from the LLM prime retardation material plane
Anonymous
7/14/2025, 6:34:57 PM No.105904281
>>105902440
oh, they're going to make a virtual Kotomine Kirei? Sign me up
Anonymous
7/14/2025, 6:36:19 PM No.105904289
I'm running livebench on IQ1 K2 for comparison with

https://desuarchive.org/g/thread/105425203/#q105428685
https://desuarchive.org/g/thread/105432191/#q105437970

Only ~10t/s so I guess I'll have the results tomorrow.
Anonymous
7/14/2025, 6:40:05 PM No.105904314
>>105902502
愛似
愛仁
亞丹
translation note: baka
Anonymous
7/14/2025, 6:42:40 PM No.105904338
>>105903181
Melty Migu
Anonymous
7/14/2025, 6:47:01 PM No.105904379
1745229369123950
>>105902352
I can hear all the NSFW AIchatbot websites crying from here
Anonymous
7/14/2025, 6:49:40 PM No.105904407
>>105904218
Neat.
From a cursory glance, it seems to be using the approach that's been discussed here several times.
Which is fair enough; it's a fairly obvious way to go about things.
Good on him for actually making it. I'll download it and fuck with it later.
Anonymous
7/14/2025, 6:50:21 PM No.105904414
>>105904064
The sillytavern extension still works and sillytavern still gets regular updates though???? Maybe you should go back to sucking elon's dick on x
Replies: >>105904441
Anonymous
7/14/2025, 6:52:31 PM No.105904441
file
>>105904414
>brings gay sex unprompted
Replies: >>105904484
Anonymous
7/14/2025, 6:53:33 PM No.105904448
>>105903324
The obvious endpoint is the (correct) conclusion that women of any age don't have the capacity to consent. From there you can either decide that consent isn't necessary in the first place, or that they need to have a male guardian in charge of them.
Replies: >>105904473
Anonymous
7/14/2025, 6:55:39 PM No.105904473
>>105904448
based basado basedest
Anonymous
7/14/2025, 6:56:02 PM No.105904480
>>105904218
where is this but sex?
Anonymous
7/14/2025, 6:56:14 PM No.105904484
>>105904441
>low quality frog
>file.png
You are a bot
Replies: >>105904498
Anonymous
7/14/2025, 6:57:16 PM No.105904498
>>105904484
>non-argument
I accept your concession.
Anonymous
7/14/2025, 6:58:10 PM No.105904506
Scammer Elon winning the AI gf race would actually fit the clown world perfectly. I want to die.
Replies: >>105904539 >>105904847
Anonymous
7/14/2025, 7:01:10 PM No.105904539
>>105904506
@grok crash this user's car, make sure it's fatal
Anonymous
7/14/2025, 7:03:26 PM No.105904560
>>105904543
>>105904543
>>105904543
Anonymous
7/14/2025, 7:08:06 PM No.105904604
>>105902458
the voice sucks
Anonymous
7/14/2025, 7:31:57 PM No.105904847
>>105904506
Out of all the tech lords, he has said repeatedly (mostly as a quirky boomerism) that he wants to make catgirls real. So it's not surprising he's the first big provider to give a waifu skin to their model.
Anonymous
7/14/2025, 7:47:27 PM No.105905009
>>105903470
Misa daisuki