/lmg/ - Local Models General - /g/ (#105896271) [Archived: 320 hours ago]

Anonymous
7/14/2025, 12:39:52 AM No.105896271
1735632101088647
md5: 3765f923e7d6f1e7fb4862b86093cb3e🔍
/lmg/ - a general dedicated to the discussion and development of local language models.

Previous threads: >>105887636 & >>105879548

►News
>(07/11) Kimi K2 1T-A32B released: https://moonshotai.github.io/Kimi-K2
>(07/11) Granite 4.0 support merged: https://github.com/ggml-org/llama.cpp/pull/13550
>(07/10) Devstral Small 1.1 released: https://hf.co/mistralai/Devstral-Small-2507
>(07/10) Reka Flash 3.1 21B released: https://reka.ai/news/reinforcement-learning-for-reka-flash-3-1
>(07/09) Phi-4-mini-flash-reasoning with hybrid SambaY architecture released: https://hf.co/microsoft/Phi-4-mini-flash-reasoning

►News Archive: https://rentry.org/lmg-news-archive
►Glossary: https://rentry.org/lmg-glossary
►Links: https://rentry.org/LocalModelsLinks
►Official /lmg/ card: https://files.catbox.moe/cbclyf.png

►Getting Started
https://rentry.org/lmg-lazy-getting-started-guide
https://rentry.org/lmg-build-guides
https://rentry.org/IsolatedLinuxWebService
https://rentry.org/tldrhowtoquant
https://rentry.org/samplers

►Further Learning
https://rentry.org/machine-learning-roadmap
https://rentry.org/llm-training
https://rentry.org/LocalModelsPapers

►Benchmarks
LiveBench: https://livebench.ai
Programming: https://livecodebench.github.io/leaderboard.html
Code Editing: https://aider.chat/docs/leaderboards
Context Length: https://github.com/adobe-research/NoLiMa
Censorbench: https://codeberg.org/jts2323/censorbench
GPUs: https://github.com/XiongjieDai/GPU-Benchmarks-on-LLM-Inference

►Tools
Alpha Calculator: https://desmos.com/calculator/ffngla98yc
GGUF VRAM Calculator: https://hf.co/spaces/NyxKrage/LLM-Model-VRAM-Calculator
Sampler Visualizer: https://artefact2.github.io/llm-sampling

►Text Gen. UI, Inference Engines
https://github.com/lmg-anon/mikupad
https://github.com/oobabooga/text-generation-webui
https://github.com/LostRuins/koboldcpp
https://github.com/ggerganov/llama.cpp
https://github.com/theroyallab/tabbyAPI
https://github.com/vllm-project/vllm
Replies: >>105896619 >>105896846 >>105897412
Anonymous
7/14/2025, 12:41:45 AM No.105896282
threadrecap
md5: 7b9a82a1f31bca7acfefb8afe8c01036🔍
►Recent Highlights from the Previous Thread: >>105887636

--Papers:
>105893741
--Banned string handling limitations and backend compatibility issues in local LLMs:
>105888749 >105888832 >105888864 >105888965 >105889008 >105888881 >105889050 >105889071 >105889105 >105889113 >105889118 >105889145 >105889404 >105889421 >105889564 >105892522 >105892618
--SSD wear risks and memory management challenges when running large language models locally:
>105890010 >105890017 >105890026 >105890036 >105890448 >105890624
--Debate over the future viability of dense models versus MoE architectures in local LLM deployment:
>105894507 >105894538 >105894550 >105894560 >105894581
--Debate on AI progress limits: hardware, data, and model intelligence vs imitation:
>105893180 >105893207 >105893228 >105893252 >105893502 >105893519 >105893525 >105893255 >105893283 >105893293 >105893297 >105893291 >105893279 >105893324 >105893376 >105893464 >105893516 >105893393 >105893477 >105893268 >105893663 >105893717 >105893440 >105895108
--Kimi-K2-Instruct dominates EQ and creative writing benchmarks but faces deployment and cost concerns:
>105888925 >105888931 >105889080 >105889586 >105889610 >105889677 >105889983
--Kimi-K2 GGUF model deployment challenges and hardware demands for local execution:
>105895401 >105895453 >105895462 >105895473 >105895532 >105895593 >105895796 >105895496 >105895500 >105895516
--Exploring architectural and training solutions to enhance model performance on complex spatial tasks:
>105893950 >105893981 >105893992 >105893996 >105896189
--Kimi-K2 shows strong performance in creative writing benchmarks:
>105892930 >105892950 >105893006
--Quirky behavior of Kimi2 model in adult sim scenarios without sys prompt:
>105890087 >105890173
--Miku (free space):
>105888636 >105888990 >105889193 >105892725 >105892977 >105894815

►Recent Highlight Posts from the Previous Thread: >>105887642

Why?: 9 reply limit >>102478518
Fix: https://rentry.org/lmg-recap-script
Anonymous
7/14/2025, 12:49:47 AM No.105896352
these threads would be taken more seriously without the forced tranime and ritual posts
Replies: >>105896420 >>105896493 >>105896511 >>105896531 >>105896545 >>105897051 >>105900331
Anonymous
7/14/2025, 12:50:11 AM No.105896359
1752446488880
md5: 260f460ace44ffa205fc6297d1e0720d🔍
21 piss ideas in 2025
Anonymous
7/14/2025, 12:53:00 AM No.105896375
>Chinese keep winning the open source route
>Musk keeps winning the proprietary route
WTF

What happened to OpenAI/Google/Claude?
Replies: >>105896445 >>105897449 >>105899619
Anonymous
7/14/2025, 12:57:05 AM No.105896419
file
md5: e770e6f8d2f43fabfe4728f2539318d8🔍
>https://moonshotai.github.io/Kimi-K2
Replies: >>105896472
Anonymous
7/14/2025, 12:57:11 AM No.105896420
>>105896352
tranime op doesn't want this thread to be taken seriously though
Replies: >>105896477
Anonymous
7/14/2025, 12:59:44 AM No.105896445
>>105896375
Claude was just an offshoot of OpenAI and both were too small to ever be viable long term. Google will either sort its shit out and leverage its size and data to dominate or it will go the way of Yahoo and IBM.
Anonymous
7/14/2025, 1:02:16 AM No.105896472
>>105896419
I'm still waiting for agents/tool calling to become relevant for erp
Replies: >>105896477
Anonymous
7/14/2025, 1:02:38 AM No.105896477
1727337000936150
md5: 1bcd58acecf2a528153aad9ed321aed6🔍
>>105896420
oh no

>>105896472
teledildonics
Anonymous
7/14/2025, 1:04:27 AM No.105896493
>>105896352
this
Anonymous
7/14/2025, 1:06:14 AM No.105896511
>>105896352
> tranime and ritual posts
if you want some serious discussion, create a bunch of AI agents and discuss AI with them, you can do it as long as you can stay awake, they don't need sleep.
Anonymous
7/14/2025, 1:07:31 AM No.105896528
This is too obvious even without ip counts.
Anonymous
7/14/2025, 1:07:46 AM No.105896531
>>105896352
local died months ago, these threads are only made out of habit like katawa shoujo threads on vg or ukraine general threads on pol
Replies: >>105896540 >>105896613 >>105896618
Anonymous
7/14/2025, 1:08:50 AM No.105896540
file
md5: 12a6e8d99407d4272177e9adeec35186🔍
>>105896531
Local has just begun.
Replies: >>105896642
Anonymous
7/14/2025, 1:09:21 AM No.105896545
>>105896352
/lmg/ is a tragic dead-end when it comes to productivity. It's blatantly clear that less than half of the users here have even vibe-coded before, let alone used AI to make dazzling new inventions and create breathtaking solutions that only the surging AI field with its arising experts can create.
Anonymous
7/14/2025, 1:16:20 AM No.105896613
1742632310584723
md5: 66f8bb44ab78dd670802195ae3d494a9🔍
>>105896531
i don't care
tzd
Replies: >>105899912
Anonymous
7/14/2025, 1:16:49 AM No.105896618
Screenshot 2025-07-13 171557
md5: a5f4147b678bc7c8bf8e9a29a403ec35🔍
>>105896531
sezu. Compiling llama rn
Replies: >>105896642 >>105897240
Anonymous
7/14/2025, 1:17:09 AM No.105896619
>>105896271 (OP)
what kinda god forsaken migu is this
Replies: >>105896628
Anonymous
7/14/2025, 1:19:05 AM No.105896628
1727250619500075
md5: e27a833da84ec6d30d51e6f3e322144e🔍
>>105896619
Liquid Migu (Liquid Migu)
Replies: >>105896691
Anonymous
7/14/2025, 1:20:25 AM No.105896642
>>105896540
>>105896618
>q2
Absolute state.
Replies: >>105896675 >>105896744 >>105899216
Anonymous
7/14/2025, 1:24:10 AM No.105896675
>>105896642
This might be hard to comprehend for gpupoors used to nemo but a Q2 quant of huge models is almost indistinguishable from the real thing.
Replies: >>105896685 >>105896802 >>105897284 >>105898318
Anonymous
7/14/2025, 1:25:30 AM No.105896685
>>105896675
If that were even remotely true, all inference providers would be running their models at Q2 too.
Replies: >>105896738 >>105900443
Anonymous
7/14/2025, 1:25:47 AM No.105896691
>>105896628
just a few more years until you can buy a fully functional migu sexbot with fully functional bladder and fill it with baja blast
Anonymous
7/14/2025, 1:30:35 AM No.105896738
>>105896685
Inference providers are using transformers and have no idea what llama.cpp and quantization is.
Replies: >>105900443
Anonymous
7/14/2025, 1:30:51 AM No.105896744
>>105896642
/lmg/ is full of totally not poorfags who'll run deepseek q2 and think they're hot shit
Replies: >>105896778 >>105896859
Anonymous
7/14/2025, 1:33:00 AM No.105896756
tfw 4gb card
do i just kill myself
Replies: >>105896764 >>105896768
Anonymous
7/14/2025, 1:34:20 AM No.105896764
>>105896756
Smoll MoE.
Or Jamba.
Anonymous
7/14/2025, 1:35:16 AM No.105896768
>>105896756
Run Qwen3 30B-A3B from RAM and tell yourself it's just as good
Anonymous
7/14/2025, 1:36:20 AM No.105896778
>>105896744
cope
Replies: >>105896859
Anonymous
7/14/2025, 1:38:17 AM No.105896794
mini kimi2 a3b please please
Anonymous
7/14/2025, 1:39:24 AM No.105896802
>>105896675
I doubt a Q2 quant is going to be that much better than a Q4 quant of R1. Q3 on the other hand...
Anonymous
7/14/2025, 1:43:58 AM No.105896846
>>105896271 (OP)
The image depicts a vibrant and refreshing scene with a glass filled with ice cubes, slices of orange, lime, and a character submerged in the liquid. The presence of citrus fruits suggests that the drink is likely to have a fruity flavor profile.

Given the combination of orange and lime, it can be inferred that this beverage would have a mix of sweet and tart notes. The orange adds sweetness with its natural juice, while the lime provides a zesty, tangy element. The ice cubes indicate that the drink is cold, which enhances the refreshing quality often associated with citrus drinks.

Therefore, based on the image, it can be described as a refreshing citrus-based beverage, possibly a variation of a fruit punch or a similar type of drink that combines the flavors of orange and lime.
Replies: >>105897191
Anonymous
7/14/2025, 1:44:51 AM No.105896859
>>105896744
>>105896778
Threadly reminder that running a model locally only makes financial sense if you either already have the hardware or you are deliberately paying a premium for the added privacy. Buying a new CPUmaxxed rig to run Deepseek at Q2 will always cost significantly more for lower response quality and slower generations than just using a hosted API, because the API providers get much closer to 100% hardware utilization and their margin for an open-weight model is only a small fraction on top of the raw hardware costs.
Replies: >>105896905 >>105896981 >>105897011
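The break-even arithmetic behind that post can be sketched; every number below (rig price, lifetime, power draw, speed, API pricing) is an assumed illustration, not a measurement:

```python
def local_cost_per_mtok(rig_usd: float, lifetime_tokens: float,
                        watts: float, tok_per_s: float,
                        usd_per_kwh: float = 0.15) -> float:
    """Hardware amortization plus electricity, per million tokens."""
    hw = rig_usd / lifetime_tokens * 1e6
    energy = (watts / 1000) * usd_per_kwh / (tok_per_s * 3600) * 1e6
    return hw + energy

# Assumed: $10k CPUmaxx rig generating 24/7 for 3 years at 8 tok/s --
# an unrealistically generous utilization for home use, which only
# strengthens the point.
lifetime = 8 * 3600 * 24 * 365 * 3
cost = local_cost_per_mtok(10_000, lifetime, 500, 8)
print(f"~${cost:.0f} per million tokens")
# vs. hosted open-weight APIs charging low single-digit dollars per
# million output tokens at near-full hardware utilization
```

Even with round-the-clock generation the amortized local cost lands an order of magnitude above typical hosted open-weight pricing, which is the post's point.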
Anonymous
7/14/2025, 1:49:33 AM No.105896905
>>105896859
this and also rename /lmg/ to /omg/ - open model general to not discriminate against the quiet sensible majority of this general who use open models without blowing stupid amounts of money on hardware they don't need
Replies: >>105896954 >>105896981 >>105897094
Anonymous
7/14/2025, 1:55:11 AM No.105896954
>>105896905
The APIs will go away someday, the weights won't.
Anonymous
7/14/2025, 1:57:59 AM No.105896981
>>105896859
>>105896905
>>>/g/aicg/
Replies: >>105897011
Anonymous
7/14/2025, 2:01:58 AM No.105897011
>>105896981
Consider actually reading >>105896859 and not just making a knee-jerk response.
Anonymous
7/14/2025, 2:07:17 AM No.105897051
The quality of the thread increases exponentially if you hide >>105896352
Anonymous
7/14/2025, 2:08:41 AM No.105897061
lol
md5: d703547500ae168ce647517ac9b2c279🔍
> local
Replies: >>105897108 >>105897142 >>105897163 >>105897184 >>105900390
Anonymous
7/14/2025, 2:08:59 AM No.105897067
>>103981338
>>103769004
>>103642119
>>99161317
Anonymous
7/14/2025, 2:12:26 AM No.105897094
>>105896905
It should be renamed to miku posting general because this is the main topic of this thread.
Replies: >>105897425
Anonymous
7/14/2025, 2:13:29 AM No.105897108
>>105897061
This looks very slow (ssdmaxing), how many t/s? sub1?
Replies: >>105897122
Anonymous
7/14/2025, 2:14:41 AM No.105897122
>>105897108
Very high quick speeds ollama deepseeks sir
Replies: >>105897135
Anonymous
7/14/2025, 2:15:56 AM No.105897135
>>105897122
"Deepseek". I'd have at least expected some slow DDR4 512GB box.
Replies: >>105897146
Anonymous
7/14/2025, 2:16:53 AM No.105897142
>>105897061
>buy $1k deepseek* ai pc
>look inside
>*1.7B
Anonymous
7/14/2025, 2:17:10 AM No.105897146
>>105897135
Sir is only thousand do not greedy
Anonymous
7/14/2025, 2:18:46 AM No.105897163
>>105897061
>in 3 carts
I want to believe those are alts trying to induce fomo in retards.
Replies: >>105897175
Anonymous
7/14/2025, 2:20:38 AM No.105897175
>>105897163
Even better, it's an ebay algorithm.
Anonymous
7/14/2025, 2:21:39 AM No.105897184
>>105897061
Hey come on guys, don't knock it till you try it.
Anonymous
7/14/2025, 2:22:32 AM No.105897191
>>105896846
okay, but how would the miku affect the flavor?
Replies: >>105897219 >>105897224
Anonymous
7/14/2025, 2:25:39 AM No.105897219
>>105897191
do you really want to drink his wound juice?
Anonymous
7/14/2025, 2:26:00 AM No.105897224
>>105897191
It refused to elaborate because she is a digital character etc etc.
Replies: >>105897274
Anonymous
7/14/2025, 2:27:10 AM No.105897240
>>105896618
When we getting Kimi 4B/8B/14B distills?
Replies: >>105897308
Anonymous
7/14/2025, 2:31:03 AM No.105897274
>>105897224
>she
Anonymous
7/14/2025, 2:32:26 AM No.105897284
>>105896675
cope
Anonymous
7/14/2025, 2:35:05 AM No.105897308
>>105897240
I wonder about merged experts, maybe. Also, someone needs to tune the experts that do refusals and deal with that, since it's kind of pointless to keep them around.
Replies: >>105897336
Anonymous
7/14/2025, 2:37:37 AM No.105897336
>>105897308
K2 is practically uncensored with prefills already >>105893395
Replies: >>105897397
Anonymous
7/14/2025, 2:40:34 AM No.105897360
>llama.cpp needs to add in hard-coded values in order to load kimi2, even though the only difference between that and deepseek are a couple config changes
How did this retarded macfaggot jeetware become the most popular thing? I guess I shouldn't even ask since javascript and python are popular.
Replies: >>105897380
Anonymous
7/14/2025, 2:43:07 AM No.105897380
>>105897360
literally the only reason it has popularity at all is because poors can use it with cpu offloading
Anonymous
7/14/2025, 2:44:36 AM No.105897397
>>105897336
Not really; in its default setting it will refuse NSFW in many forms, even 10+ turns deep, and it refuses more consistently than corpo APIs.
It also ignores system prompt jailbreaks. Things I've found to work are:
1. prefill, locally or on completion APIs or on the official API (only with the "partial" parameter)
2. if you can't prefill, you can do inline jailbreaks where you distract it with an unrelated instruction followed by returning to your chat. This works, but is annoying and you have to edit the context after.
3. I haven't tried this, but changing the formatting supposedly works; they also expose this on their API as changing the character name from assistant/narrator

IMO just tuning the handful of experts responsible for refusals would be a godsend, to not have to do this every time. I don't even think it'd be that expensive: use ESFT to locate them, then you only need like what, a few H100s tops if not less to tune purely per expert. When will a finetoonor get onto it. Alternatively, merge those experts back to base until refusals are very mellow; no GPUs needed, just one CPUMaxxer willing to record activations.
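A minimal sketch of the prefill approach, assuming a llama.cpp-style server with a raw completion endpoint; the endpoint name and the K2-style role tokens are assumptions to verify against your own setup and the model's chat template:

```python
# Sketch: build a prompt that ends mid-assistant-turn, so the model
# continues from the prefill instead of opening with a refusal.
# Role tokens follow K2-style chat markup (an assumption -- check the
# model's actual chat template before relying on this).

def build_prefilled_prompt(system: str, user: str, prefill: str) -> str:
    return (
        f"<|im_system|>system<|im_middle|>{system}<|im_end|>"
        f"<|im_user|>user<|im_middle|>{user}<|im_end|>"
        f"<|im_assistant|>assistant<|im_middle|>{prefill}"
    )

prompt = build_prefilled_prompt(
    "You are the narrator.",
    "Continue the scene.",
    "Sure, continuing the scene:",
)
# POST {"prompt": prompt, "n_predict": 512} to a raw completion
# endpoint (e.g. llama.cpp server's /completion); the reply then
# continues the prefilled assistant turn.
```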
Anonymous
7/14/2025, 2:46:08 AM No.105897412
>>105896271 (OP)
Thinking about how much RAM to get for offloading and MoEs, should I go with 96gb or 192gb? Or is 192gb overkill?
Replies: >>105897426 >>105897435 >>105897437 >>105900584 >>105900844
Anonymous
7/14/2025, 2:47:13 AM No.105897425
>>105897094
Don't forget we also make fun of sama
Anonymous
7/14/2025, 2:47:16 AM No.105897426
>>105897412
Get 1TB at least, models ain't getting any smaller.
Replies: >>105897445
Anonymous
7/14/2025, 2:48:02 AM No.105897435
>>105897412
Don't cheap out lmao
1T+ or nothing
Replies: >>105897445
Anonymous
7/14/2025, 2:48:05 AM No.105897437
>>105897412
192gb is barely going to fit Deepseek Q1 and models are only going to get bigger from here on out if Kimi and Behemoth are an indication.
Replies: >>105897445
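The sizing the replies are gesturing at is simple arithmetic; the bits-per-weight figures below are assumed ballpark quant densities, not exact GGUF file sizes:

```python
def model_ram_gb(total_params: float, bits_per_weight: float) -> float:
    """Rough weight footprint only; KV cache, context and OS overhead
    come on top of this."""
    return total_params * bits_per_weight / 8 / 1e9

# Assumed ballpark bpw per quant class:
print(model_ram_gb(671e9, 1.6))   # DeepSeek ~671B at Q1-ish -> ~134 GB
print(model_ram_gb(671e9, 2.6))   # Q2-ish -> ~218 GB, past 192 GB already
print(model_ram_gb(1000e9, 2.6))  # Kimi K2 ~1T at Q2-ish -> ~325 GB
```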
Anonymous
7/14/2025, 2:48:52 AM No.105897445
>>105897426
>>105897435
>>105897437
How am I supposed to run more than 256gb? Most MOBOs support only up to this much
Replies: >>105897447
Anonymous
7/14/2025, 2:49:11 AM No.105897447
>>105897445
Server motherboard
Anonymous
7/14/2025, 2:49:20 AM No.105897449
>>105896375
Mechahitler was probably a PR stunt. On no metric whatsoever does Grok ever come near the top.
Replies: >>105897584
Anonymous
7/14/2025, 2:50:01 AM No.105897460
Modern models being larger than RAM makes me think Intel Optane is perfect for LLM usage...
Anonymous
7/14/2025, 2:51:34 AM No.105897474
$500 for a 750GB Optane drive. I'm not sure how to use it, because Optane motherboard specs are not easily searchable. Why can't a dedicated system with 1.5TB of Optane drives run deepseek quickly? The whole point of using VRAM is that it's faster than RAM.
Replies: >>105897491
Anonymous
7/14/2025, 2:53:28 AM No.105897491
>>105897474
What's the bandwidth on those?
Replies: >>105897511
Anonymous
7/14/2025, 2:54:01 AM No.105897496
1731076103666323
md5: 93f72bd32169437fc4d932f41aaeea2c🔍
Anonymous
7/14/2025, 2:56:32 AM No.105897511
>>105897491
PCI Express 3.0 x4, so ~3.9 GB/s, with much lower latency than normal SSDs.
Replies: >>105897541 >>105897568
Anonymous
7/14/2025, 2:59:21 AM No.105897541
>>105897511
That still sounds quite slow, you'll be in what, some 40 seconds per token at q2?
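A back-of-envelope floor for that, treating every active weight as streamed exactly once per token; the active-parameter count and bits-per-weight are assumed figures, and real throughput is worse than this bound:

```python
def seconds_per_token(active_params: float, bits_per_weight: float,
                      bandwidth_bytes_s: float) -> float:
    """Best case: every active weight streamed exactly once per token."""
    bytes_per_token = active_params * bits_per_weight / 8
    return bytes_per_token / bandwidth_bytes_s

# Assumed: DeepSeek-class MoE with ~37B active params at ~2.6 bpw,
# read over the drive's ~3.9 GB/s link.
t = seconds_per_token(37e9, 2.6, 3.9e9)
print(f"~{t:.1f} s/token floor")
# Real-world is slower still: access latency, non-sequential expert
# reads and KV cache traffic all add on top of the raw streaming cost.
```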
Anonymous
7/14/2025, 3:02:05 AM No.105897568
>>105897511
A custom PCIe card with NAND, ML accelerator and DRAM on-board would be insane. Models could be stored in chunks and streamed into RAM. You could probably even do a kind of branch prediction for which experts are likely to be loaded next using an ML model and pre-stream them during inference to lower latency.
Replies: >>105897652
Anonymous
7/14/2025, 3:03:54 AM No.105897584
>>105897449
Where did you get that conspiracy from?
Replies: >>105897619
Anonymous
7/14/2025, 3:07:40 AM No.105897619
>>105897584
elon musk bad
Anonymous
7/14/2025, 3:08:30 AM No.105897627
Elon Musk bad because he promised to release Grok 2 when 3's out, and now 4's out and 2 still wasn't released.
Anonymous
7/14/2025, 3:08:56 AM No.105897632
save us glm4
Replies: >>105897681
Anonymous
7/14/2025, 3:10:24 AM No.105897652
>>105897568
I feel like a bunch of optane drives would be really good for LLMs. Direct memory access will also be good.
Anonymous
7/14/2025, 3:10:34 AM No.105897653
1741875402705638
md5: 645da1a1438a61bb04b1bd52d3379555🔍
Anonymous
7/14/2025, 3:11:57 AM No.105897668
Anybody tested ppl and kld of the different number of active experts for Kimi K2?
Anonymous
7/14/2025, 3:12:59 AM No.105897681
>>105897632
glm4 has been out for months now, anon...
Replies: >>105897693
Anonymous
7/14/2025, 3:14:18 AM No.105897693
>>105897681
He probably means the 100B MoE.
Anonymous
7/14/2025, 3:15:37 AM No.105897706
1730298237923701
md5: 3fcd539203b30e4857f3bff8ac5d0450🔍
Anonymous
7/14/2025, 3:15:44 AM No.105897708
GvXuUalXIAAS0oN
md5: fd2ba4724eb558e2bd527cad2ace0076🔍
Why do DeepSeek and Kimi have so much more SOVL at creative writing than American frontier models? Is it because all the typical RL deepfrying happened in Chinese, thereby leaving English comparatively unharmed?
Replies: >>105897745 >>105897774
Anonymous
7/14/2025, 3:19:23 AM No.105897745
>>105897708
Models take on the personality of the linguistic average and writing style created through the RLHF examples they use in training. They're written mostly by low cost workers from 3rd world shitholes.
Anonymous
7/14/2025, 3:23:43 AM No.105897774
>>105897708
Could be a number of things:
1. western labs are more afraid of lawsuits now; they used to train on libgen, but now they claim to only train on books they bought (anthropic), which means they have to do multiple epochs on the same books
2. bigger MoE has more capacity for trivia and other knowledge like writing styles; it's more variable
3. some RL isn't that bad, like the RLVR math/codemaxxing; if not overdone, it does loosen a bit of the brainwashing from RLHF, as long as you also mix in creative writing during the tune (multiple objectives)
4. they started synthslopping a lot harder, for benchmaxxing and legal reasons (see 1). Some synthslop is okay, but it's easy to amplify dumb, unpleasant-to-read slop; opus 4 turned out worse than 3 for example, and the "alignment" for it is done by pure self-synthslopping.
5. less safety SFT trash

I think it's largely 1, but 4 is possible too.
Replies: >>105898092
Anonymous
7/14/2025, 3:44:26 AM No.105897978
file
md5: d86cbc34a34fbc346293e6bf3917ac4a🔍
The overuse of bold text smells like quant damage.
Maybe Daniel can make a better quant.
Replies: >>105898020 >>105898077
Anonymous
7/14/2025, 3:48:09 AM No.105898020
file
md5: f0c9d0a338ed90d16263e114cee84ebf🔍
>>105897978
That was greedy decoding, here's an answer with 0.3 temp.
Replies: >>105898077
Anonymous
7/14/2025, 3:55:37 AM No.105898077
>>105897978
>>105898020
What model?
Replies: >>105898082
Anonymous
7/14/2025, 3:55:55 AM No.105898082
>>105898077
K2
Anonymous
7/14/2025, 3:57:00 AM No.105898092
>>105897774
Behind closed doors, no one serious gives a shit about "copyright"; you cannot train a model without a dataset containing this stuff.
Well, you can, but the result is so bad it's not relevant next to anything made in the last 2 years.
Replies: >>105898150
Anonymous
7/14/2025, 4:02:02 AM No.105898141
Apple is going to acquire Mistral AI
Replies: >>105898150 >>105898187 >>105898204 >>105898357 >>105899014 >>105899234
Anonymous
7/14/2025, 4:03:13 AM No.105898150
>>105898092
I know, it'd be insane to care about it, but consider:
1. Llama 4 was worse, and after the lawsuits over training on libgen it's likely they stopped using it; no surprise then that they underperformed that badly. Buying Scale won't solve their problem either lmao
2. Anthropic claims that they started making a library of scanned and OCR'd books and replacing libgen with it, but they have far fewer, so they have to do multiple epochs. Opus 4 turned out kinda worse, less sovl, so this makes sense.
>>105898141
Please no.
Anonymous
7/14/2025, 4:04:58 AM No.105898166
https://semianalysis.com/2023/05/04/google-we-have-no-moat-and-neither/
Were they right?
Replies: >>105898215 >>105899670
Anonymous
7/14/2025, 4:07:02 AM No.105898187
>>105898141
surely they can do better
Anonymous
7/14/2025, 4:08:56 AM No.105898204
>>105898141
why would they? mistral hasn't released a decent model in over a year
Replies: >>105898238 >>105898244
Anonymous
7/14/2025, 4:09:28 AM No.105898208
Kimi-K2 is now available on SiliconFlow
Replies: >>105898446 >>105899610
Anonymous
7/14/2025, 4:10:19 AM No.105898215
>>105898166
Partially.
1. Google has some long context secrets
2. Anthopic has a few secrets that made Claude unique.
3. They have a lot more compute
Back when it was written, it wasn't completely true.
Today, relative to China? Close to true, but they still have the compute.
Anonymous
7/14/2025, 4:13:18 AM No.105898238
>>105898204
which is still lightyears ahead of anything apple has accomplished in the ai space
Anonymous
7/14/2025, 4:13:46 AM No.105898244
>>105898204
They did some okay small ones.
I hope Apple doesn't do it. Apple's LLMs are just bad, and I can't imagine Mistral surviving in a way that matters; imagine having to extract the fucking weights from shitty Apple gimmicks instead of proper releases. It may as well be killing them.
Anonymous
7/14/2025, 4:23:54 AM No.105898318
>>105896675
>a Q2 quant of huge models is almost indistinguishable from the real thing.
I hope so. I'm going to be relegated to a Q4 of kimi once its runnable.
756GB is finally locking me out of a model. I shoulda gone for 1.5TB
Anonymous
7/14/2025, 4:28:16 AM No.105898347
What would localsissies do when DeepSeek releases R2 with 2T params?
Replies: >>105898369 >>105898381 >>105898423
Anonymous
7/14/2025, 4:29:24 AM No.105898357
>>105898141
bad move for both parties
Anonymous
7/14/2025, 4:31:02 AM No.105898369
>>105898347
call you a poorfag for not spending 10k on a server, probably
Anonymous
7/14/2025, 4:34:09 AM No.105898381
>>105898347
wait for qwen to consume it and hand me something I can run
Anonymous
7/14/2025, 4:39:14 AM No.105898416
1721284550488284
md5: 204e194432eb9b420649007f6d976c61🔍
Anonymous
7/14/2025, 4:39:51 AM No.105898423
1710043687041916
md5: 9659d639aae0d3c10dc91269cf368c12🔍
>>105898347
You don't have 2TB of RAM?
Replies: >>105898482
Anonymous
7/14/2025, 4:43:10 AM No.105898446
>>105898208
dont use that shithole, they botched the r1 release and it was still botched days after they put it up, same goes with the others. ram/ssd maxx or use the official api, stop with the mental illness ffs
Anonymous
7/14/2025, 4:48:04 AM No.105898482
>>105898423
You haven't already bought me 2TB of RAM?
Anonymous
7/14/2025, 4:57:37 AM No.105898541
Remember Behemoth?
Replies: >>105899130
Anonymous
7/14/2025, 4:58:23 AM No.105898543
Did you know? : you can make a GPU vRAM drive
https://github.com/prsyahmi/GpuRamDrive


Maybe a use for those high vram AMD cards if you have a server and a card laying around(?)
Replies: >>105898733
Anonymous
7/14/2025, 5:20:02 AM No.105898675
>500 internal error
>500 internal error
>can't even tell which provider it is to block on openrouter
Replies: >>105898778
Anonymous
7/14/2025, 5:28:29 AM No.105898733
>>105898543
this reminded me of when i was 14 and i was like amazed at how symbolic links worked.
yes fuckhead, we know. get it to fucking work then come back here. but you fucking won't because it won't work.
Anonymous
7/14/2025, 5:35:00 AM No.105898778
>>105898675
>proceed to post it in the wrong thread
Anonymous
7/14/2025, 6:10:08 AM No.105899014
>>105898141
For all of its regulatory fetish I don't understand why the EU lets the US scoop up its entire tech sector like this. First Deepmind and now Mistral. Europe won't have a frontier lab to its name.
Replies: >>105899039 >>105899104
Anonymous
7/14/2025, 6:11:43 AM No.105899024
China, when are we getting $1000 servers with 1TB of RAM and a cloned EPYC?
Anonymous
7/14/2025, 6:12:46 AM No.105899039
>>105899014
The American Empire placed retards in charge of Europe so it would be predictable; Trump has set them loose
Anonymous
7/14/2025, 6:22:11 AM No.105899104
>>105899014
we didn't, we said fuck you when it came to ARM.
get fucked.
Anonymous
7/14/2025, 6:26:13 AM No.105899130
>>105898541
No one does because it never actually existed
Anonymous
7/14/2025, 6:37:32 AM No.105899216
>>105896642
I'd say R1 handily demonstrated that giant MoE models that are trained in native 8bit handle deep quantization very well
Anonymous
7/14/2025, 6:39:34 AM No.105899232
You're aware this shit is digital necromancy, right?
>make model of fragmented human minds represented by the text they write
>users create formulaic and very specific system prompts, basically just an invocation that collects the fragments into a coherent system which serves their purpose.
It's a bit odd.
Replies: >>105899261 >>105899271 >>105899294 >>105899384
Anonymous
7/14/2025, 6:39:48 AM No.105899234
>>105898141
>it's true
oh mon dieu, non...
Anonymous
7/14/2025, 6:42:49 AM No.105899258
Kimi said before K2 training they did a scaling test and tried many architectures but none beat DSv3
https://www.zhihu.com/question/1927140506573435010/answer/1927892108636849910
Anonymous
7/14/2025, 6:43:22 AM No.105899261
>>105899232
Grow up
Anonymous
7/14/2025, 6:45:45 AM No.105899271
>>105899232
now just you wait till you figure out how easy it is to do it irl with biology and flesh (if only this was a jest)
Anonymous
7/14/2025, 6:49:30 AM No.105899294
1752006947176431
md5: 3155a402be9f0d0e3810fe74542bde58🔍
>>105899232
praise the Omnissiah
Anonymous
7/14/2025, 6:55:45 AM No.105899328
You now remember RWKV
Replies: >>105899389 >>105899409
Anonymous
7/14/2025, 7:09:28 AM No.105899384
>>105899232
Does the spirit of Shakespeare rise out of the grave every time someone performs one of his plays?
Replies: >>105899388
Anonymous
7/14/2025, 7:10:27 AM No.105899388
>>105899384
Not exactly, but if you combine the system that can predict his language with a bunch of other random systems you get something.
Replies: >>105899420
Anonymous
7/14/2025, 7:10:42 AM No.105899389
>>105899328
Never forgot. Waiting for the 7b in training to disappoint me. It's at 85%.
Anonymous
7/14/2025, 7:15:25 AM No.105899409
Untitled
md5: da25c119e534e2eeda99760bf9301576🔍
>>105899328
Why is it pronounced like that? Stupid.
Anonymous
7/14/2025, 7:17:49 AM No.105899420
>>105899388
Well it's certainly as far from actual necromancy as current LLMs are from biological brains.
Anonymous
7/14/2025, 7:46:45 AM No.105899610
>>105898208
Click on pricing.
>Start out FREE. FREE TESTING HERE.
Need to scroll down to see the prices.
>$15 for 1 minute of Fish-Speech-1.5 output.
DAYYYUMM
If I weren't cooming and stopping my projects at 80% while making SF connections, I could rip off people like that. That's craaaazzyyy.
Anonymous
7/14/2025, 7:47:45 AM No.105899619
>>105896375
Notice how Grok comes out with minimal safety testing, immediately a controversy happens, then they hotfix it? OpenAI/Google/Claude don't want the controversy even if avoiding it delays their model releases by months. Companies like them are going to get out-competed by fast-release-fast-fix companies like Musk's; this is the whole reason SpaceX is the most successful space company in existence, and I wouldn't be surprised if the same happens for xAI. The only flaw xAI possibly has is that they don't have a dedicated research group like Meta has with FAIR.
Replies: >>105899766 >>105900018
Anonymous
7/14/2025, 7:57:41 AM No.105899670
>>105898166
there is something google has that others don't: legitimate use cases where you could imagine llm and imagen-style tech integration in their ecosystem (youtube video editor, mail summarization in gmail, etc.), so they can sell a product, not just a barebones ai model on its own like openai (whose relationship with Microsoft, the one who could have productized openai, has been souring).
openai has no moat, but google has a gigantic moat. it's never been about selling ai model access.
Anonymous
7/14/2025, 8:14:18 AM No.105899766
>>105899619
They're only 2 years old starting from company creation announcement.
Anonymous
7/14/2025, 8:28:52 AM No.105899851
Am I correct that Kimi prompt looks like this?
<|im_system|>system<|im_middle|>System prompt.<|im_end|><|im_user|>user<|im_middle|>User message.<|im_end|><|im_assistant|>assistant<|im_middle|>Assistant message.<|im_end|>
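As a sketch, that turn structure can be rendered from a message list; the role token names are copied from the post above, so verify them against the model's tokenizer_config.json / chat template before relying on this:

```python
# Render K2-style chat markup from OpenAI-style message dicts.
# Token names are taken from the post, not independently verified.
def render(messages):
    out = []
    for m in messages:
        role = m["role"]  # "system" | "user" | "assistant"
        out.append(f"<|im_{role}|>{role}<|im_middle|>{m['content']}<|im_end|>")
    return "".join(out)

rendered = render([
    {"role": "system", "content": "System prompt."},
    {"role": "user", "content": "User message."},
    {"role": "assistant", "content": "Assistant message."},
])
print(rendered)
```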
Anonymous
7/14/2025, 8:39:30 AM No.105899912
>>105896613
>tranimetard is also a nafotranny
shockers!
Replies: >>105904082
Anonymous
7/14/2025, 8:42:20 AM No.105899927
https://github.com/block/goose

What model even works well with this? I tried it with multiple models, including qwen2.5 like they show in their demo, and none of them will use tools autonomously. Unless you explicitly tell them to make a certain directory with a certain name with a certain file inside, they will do nothing, at which point it doesn't automate anything.
Anonymous
7/14/2025, 8:51:34 AM No.105899970
file
md5: 4987af4d4c9dc5c88b794fa5d96a59a3🔍
Cockbench for https://huggingface.co/gabriellarson/Kimi-K2-Instruct-GGUF
Replies: >>105901913
Anonymous
7/14/2025, 8:56:02 AM No.105900001
Where are Ukrainian and Russian LLMs
Replies: >>105900019 >>105900037
Anonymous
7/14/2025, 8:58:51 AM No.105900018
>>105899619
>controversy happens

Ppl r pathetic. Agent Smith did nothing wrong.
Anonymous
7/14/2025, 8:59:00 AM No.105900019
>>105900001
>Russian
https://huggingface.co/yandex/YandexGPT-5-Lite-8B-instruct
Anonymous
7/14/2025, 8:59:14 AM No.105900021
I've been wondering, do they train llms on combined us/commonwealth literature? How do the various spellings get handled?
Anonymous
7/14/2025, 9:02:00 AM No.105900037
>>105900001
Ru did do a dense 100b or so, check out Yalm, but it's outdated by today's standards.
Anonymous
7/14/2025, 9:24:20 AM No.105900155
Is the Huawei 300I Duo NPU pcie card any good? Otherwise I'll send it back.
Fucking hell Im a boomer

https://lmdeploy.readthedocs.io/en/latest/get_started/ascend/get_started.html
Replies: >>105900165 >>105900715
Anonymous
7/14/2025, 9:25:43 AM No.105900165
file
file
md5: 8450b2e64086c70488e2ab01ec2fba50🔍
>>105900155
>408 GB/s
lol
lmao
Replies: >>105900226
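For context on why 408 GB/s gets laughed at: decode speed is roughly memory-bandwidth-bound, since every generated token has to stream the active weights from memory once. A rough back-of-envelope sketch (the 0.56 bytes/weight figure is an assumption for a ~4.5 bpw quant, and real-world speeds land well below this ceiling):

```python
# Rough theoretical ceiling on decode tokens/s for a bandwidth-bound
# setup: bandwidth divided by bytes streamed per token (active params
# times bytes per weight). Ignores KV cache, activations, overhead.
def max_tokens_per_s(bandwidth_gb_s, active_params_b, bytes_per_weight):
    bytes_per_token = active_params_b * 1e9 * bytes_per_weight
    return bandwidth_gb_s * 1e9 / bytes_per_token

# e.g. a 32B-active MoE at ~4.5 bits/weight (~0.56 bytes/weight):
print(round(max_tokens_per_s(408, 32, 0.56), 1))  # ~22.8 t/s ceiling
```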
Anonymous
7/14/2025, 9:31:03 AM No.105900197
i cant see a path where technology leads to happiness, only more suffering...
Anonymous
7/14/2025, 9:33:56 AM No.105900213
https://www.bloomberg.com/news/newsletters/2025-07-13/is-apple-going-to-replace-ceo-tim-cook-who-is-the-next-ceo-of-apple-ternus-md1mhrj4
https://www.reddit.com/r/LocalLLaMA/comments/1lzfhhq/apple_will_seriously_consider_buying_mistral/
we're over
Replies: >>105900255 >>105900314 >>105901065
Anonymous
7/14/2025, 9:35:37 AM No.105900226
>>105900165
better than octochannel ddr5 though. can you stack 2 of the 96gig ones?
Anonymous
7/14/2025, 9:40:54 AM No.105900255
>>105900213
Well mistral has been shit for a while now.. (3.2 is a step up though.)
Didn't they say they wanna move to SF/burgerland already? What the hell is the EU doing kek.
How can you allow your only notable AI lab to be bought up by apple?
Replies: >>105900278 >>105900364
Anonymous
7/14/2025, 9:44:08 AM No.105900278
>>105900255
>mistral: apple. wanna buy us?
>apple: sure. this many monies
>mistral: cool. hey, eu. apple wants to give us this many monies
>eu: oh, no... here's some more monies
>nothing really changes
Replies: >>105900291 >>105900299
Anonymous
7/14/2025, 9:46:48 AM No.105900291
>>105900278
I never considered myself a socialist but how is this allowed?
I bet you can't fuck around like that in chink land.
How can you not protect your knowledge, especially since mistral got a lot of french €€€.
Replies: >>105900300
Anonymous
7/14/2025, 9:47:45 AM No.105900299
>>105900278
I wonder how many GPUs does Mistral have?
Anonymous
7/14/2025, 9:47:49 AM No.105900300
>>105900291
Now that I think about it.
Doesn't Japan have some law where foreigners can't buy up a company?
At least they had something like that in the past.
Replies: >>105900315
Anonymous
7/14/2025, 9:49:26 AM No.105900314
>>105900213
Does anyone actually think EU is gonna allow an American company to buyout a European AI company? After everything Trump did this year? EU isn't going to let this slide.
Replies: >>105901642
Anonymous
7/14/2025, 9:49:39 AM No.105900315
>>105900300
Yet somehow softbank is promising to dump dozens to hundreds of billions into openai, instead of building up a local competitor. openai of all companies, what absolute insanity.
Replies: >>105900796
Anonymous
7/14/2025, 9:53:06 AM No.105900331
1726752064533078
1726752064533078
md5: 27f4e913f439f610983f971b077fba2c🔍
>>105896352
wtf how could they do this to us
Anonymous
7/14/2025, 9:59:25 AM No.105900364
>>105900255
>Well mistral has been shit for a while now
Does that coincide with the moment Mistral stopped writing in the cards on HF that the instruct models were a quick demonstration that they could be finetuned and had no moderation mechanisms?
Anonymous
7/14/2025, 10:02:40 AM No.105900390
>>105897061
It's installed on the SSD, they just aren't including the hardware to run it.
Anonymous
7/14/2025, 10:09:05 AM No.105900443
>>105896685
>>105896738
Llama.cpp is not well optimized for large volume, so you lose your efficiency gains if you try to deploy it at the scale of cloud providers compared to other backends.
Replies: >>105900493 >>105901936
Anonymous
7/14/2025, 10:14:56 AM No.105900493
>>105900443
I'm still wondering why they wasted years and hair writing it in cpp if it's still not efficient for more intensive usage compared to other backends largely made in python.
Replies: >>105900518 >>105900528 >>105901085
Anonymous
7/14/2025, 10:18:52 AM No.105900518
>>105900493
It's not really about the language, in theory a cpp based application COULD be much better at scale, but python was the language of choice for codelet data scientists for so long that there was already a foundation to build on, and shit is moving too fast to rebuild from the ground up. Except for Google who write their own backends for their own secret TPUs.
It's just about the priority: Llama.cpp was originally a CPU-only inference engine and later added offloading to GPUs as a feature to speed them up. Its primary goal is to make running large language models accessible to as many devices as possible, and it does well at that. Stuff like VLLM targets a different workload that's focused on scale first.
Replies: >>105900539
Anonymous
7/14/2025, 10:19:43 AM No.105900528
>>105900493
Large-scale backends do batching, meaning they wait for multiple users to request against the same endpoint and then run the requests together; this has more latency but is more efficient for many users at once. Llama.cpp was always oriented toward local users, so they went for as efficient as possible for a single user, especially users that lack the GPUs to fit all weights in VRAM and keep some in RAM. It's a different strategy overall.
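A toy sketch of why batching wins at scale: one pass over the weights serves the whole batch, so the fixed weight-streaming cost gets amortized across users. The millisecond figures are made-up illustrative numbers, not measurements of any real backend:

```python
# Toy model of one decode step: a fixed cost to stream the weights
# once, plus a small per-sequence compute cost. Numbers are invented
# purely to illustrate the amortization effect.
def decode_step_ms(n_requests, weight_stream_ms=50.0, per_request_ms=2.0):
    return weight_stream_ms + n_requests * per_request_ms

single = decode_step_ms(1)          # 52 ms/token for one lone user
batched = decode_step_ms(16) / 16   # ~5.1 ms/token per user at batch 16
```

Per-user cost drops nearly 10x at batch 16 — which is exactly the workload vLLM targets and a lone local user never has.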
Anonymous
7/14/2025, 10:21:33 AM No.105900539
>>105900518
Python backends still eventually thunk to real code written in C++ anyway, aside from a few rare ones that implement the compiler in python (like that thing geohot has, tinygrad). All fast implementations will just have some custom CUDA kernels or, for CPU, x86-specific SIMD code and similar.
Anonymous
7/14/2025, 10:25:51 AM No.105900584
>>105897412
In the days prior to Deepseek I had the choice of putting either 512 GB or 1 TB of RAM into my GPU server.
I went with 512 GB because it was cheaper but now my slots are maxed out and I'm regretting that choice.
Replies: >>105900854
Anonymous
7/14/2025, 10:43:24 AM No.105900715
>>105900155
>Huawei 300I Duo
Where could a Gweilo procure one?
Replies: >>105900892
Anonymous
7/14/2025, 10:56:26 AM No.105900796
>>105900315
They don't have a choice, like with the ARM deal. It's a lapdog corporation
Replies: >>105900858
Anonymous
7/14/2025, 11:06:20 AM No.105900844
>>105897412
Max out! Beware of the fact that NUMA is shit slow. That said, a dual CPU setup with 512+512 gb will give you a mere 512 gb of usable RAM

Anyway, you are too late to the party. Kimi-K2 support merge is delayed.

It's over for local
Anonymous
7/14/2025, 11:08:29 AM No.105900854
>>105900584
Are there any older (affordable) servers which support 512+ gb on a single CPU?

No NUMA please
Replies: >>105901029
Anonymous
7/14/2025, 11:09:13 AM No.105900858
>>105900796
Are you saying all this was just to appease Trump, or am I misunderstanding you? Even then, that's a huge sum. I thought Softbank truly believed OpenAI would reach superintelligence or something, and that Altman was a really good scammer.
Unironically though Japan should get in the game, like China they have an advantageous legal climate for training LLMs (copyright wise)
Replies: >>105900937 >>105900956
Anonymous
7/14/2025, 11:13:28 AM No.105900892
>>105900715
Ebay or chinkstores
Anonymous
7/14/2025, 11:22:19 AM No.105900937
>>105900858
>advantageous legal climate for training LLMs (copyright wise)

>blocking pirated anime everywhere
Like the only thing I care for
Replies: >>105900992
Anonymous
7/14/2025, 11:25:49 AM No.105900956
>>105900858
It's weird. There is only SakanaAI. And those are foreigners living off the jap government and nvidia money.
All their papers are hype shit. I remember them proudly presenting some memory solution... pajeets hyped it up... if you look at their paper... it was a fucking built-in RAG.
Japan dropped off hard recently. It's sad that 50%+ of the jp people are on X, no language and culture barrier anymore. The uniqueness is mostly gone. And I say that as somebody that is living there.
Wouldn't surprise me if that is the reason we don't see any excellence anymore. Young boys probably do what they do in the west and just check out, let the girls be the class rep and get the applause in a system that does not reward them in any way. I read a blog recently about how they are concerned that boys "don't show initiative" anymore in school. Big shocker. KEK
Replies: >>105900995
Anonymous
7/14/2025, 11:31:06 AM No.105900992
>>105900937
They long declared training on copyrighted stuff legal. Also if that was true, how come pixiv is filled with so much AI slop that is literally trained on boorus (which scrape pixiv)
Replies: >>105901010
Anonymous
7/14/2025, 11:31:11 AM No.105900995
>>105900956
The only ppl in jp with decent English skills are Koreans and Chinese

At least it was the case prior to cough
Replies: >>105901028
Anonymous
7/14/2025, 11:34:14 AM No.105901010
>>105900992
Some pixiv rakugaki is far from onepiece, naruto etc. where they make money off

I can't imagine a jp company training loras on this stuff
Replies: >>105901061
Anonymous
7/14/2025, 11:36:53 AM No.105901028
>>105900995
When my kid was 6 they started teaching english in elementary school. Before that, a little bit in kindergarten.
Young people probably aren't that bad at it nowadays, at least for understanding.
But regardless: the social media climate is THE EXACT SAME.
Or rather a couple years behind burger, when left was peak. Tranny stuff is being talked about everywhere since covid.
And the only counter the critics have is "muh poor wahmen if a tranny appears in the toilet". It's the same shit.
Don't expect a good AI to come from japan, I don't see it. There is a japanese guy doing stuff with stable diffusion, I think he is big, doing improvements etc. But that's the only guy from japan I know.
Replies: >>105901467
Anonymous
7/14/2025, 11:36:55 AM No.105901029
the-scream-k-on
the-scream-k-on
md5: bd0d7fe05d7fdd789cb78a7976fdffe9🔍
>>105900854
Nowadays even single socket CPUs tend to have multiple NUMA nodes.
Anonymous
7/14/2025, 11:42:41 AM No.105901061
>>105901010
maybe not, although Illustrious was trained by Koreans (on top of SDXL) and is probably one of the more popular bases these days
Anonymous
7/14/2025, 11:42:54 AM No.105901065
file
file
md5: e630c5c334a038d621789059f311e3e5🔍
>>105900213
>ternus
Could be good for Apple.
Anonymous
7/14/2025, 11:45:43 AM No.105901085
>>105900493
The biggest advantage of llama.cpp vs. Python is ease of installation.
At my workplace my coworker set up a language model on some GPU server and he went with ollama because to him that seemed like the easiest.
I later discussed with him the benefits of e.g. vLLM but for his use case there are only very few parallel requests so it wouldn't be worth the effort to switch.
The downside is that if you don't use any dependencies and write everything, including the tensor library, yourself, it's just going to be way more work.
Anonymous
7/14/2025, 11:50:54 AM No.105901124
>using r1-0528 and having a normal conversation with a girl
>her knuckles start to whiten
>her fingernails dig into her palm, drawing blood
>her skirt rides up, revealing her panties
>her legs press together unconsciously, bruising at the knees
>she bites her lip, copper flooding her mouth
>her top begins to fray at the edges and come undone
seriously why the fuck does everyone's clothes fall apart as they slowly bleed to death no matter what we do? is this typical in china because of all the acid rain or something?
Replies: >>105901145 >>105901332
Anonymous
7/14/2025, 11:56:39 AM No.105901145
>>105901124
Explain this whole knuckles start to whiten, I don't recall ever getting it with r1, but I am geiitng it with K2 (on API), was this ever a common R1 slop?
Replies: >>105901203 >>105901205
Anonymous
7/14/2025, 12:05:51 PM No.105901203
>>105901145
I'm getting knuckles whitening but none of the other stuff mentioned.
Maybe it depends on the content.
Anonymous
7/14/2025, 12:06:03 PM No.105901205
>>105901145
really common for me in r1 yeah, maybe it's just my prompts but this happens in a lot of varied scenarios/cards for me
haven't used k2 yet waiting for the good goofs
Anonymous
7/14/2025, 12:26:01 PM No.105901332
>>105901124
imagine what a model with ao3 in training set could do.
Replies: >>105901372 >>105901383 >>105901412
Anonymous
7/14/2025, 12:32:58 PM No.105901372
file
file
md5: 4f92a67193d314ca67e8921be5079fc2🔍
>>105901332
They all have ao3 in the training set. Here's R1.
Replies: >>105901393
Anonymous
7/14/2025, 12:34:25 PM No.105901383
>>105901332
Almost all major pretrained models have AO3 to varying extents in the training dataset. Morality-based training data filtering and stupidly massive (literally hundreds of billions of tokens now, for some of the latest models) post-training heavy on safety, math/benchmaxxing and corporate tasks ruin everything.
Replies: >>105901391 >>105903179
Anonymous
7/14/2025, 12:35:25 PM No.105901391
>>105901383
I wonder how K2 base would do with story prefix prompts in text completion mode. It shouldn't be slopped in the same way instruct is, right?
Replies: >>105902005
Anonymous
7/14/2025, 12:35:33 PM No.105901393
>>105901372
Then I'm out of ideas.
Replies: >>105901412
Anonymous
7/14/2025, 12:38:37 PM No.105901412
file
file
md5: d14dfbbd06cbb710129ea3aff0011dde🔍
>>105901393
>>105901332
You don't have to imagine.
Replies: >>105901425 >>105901442
Anonymous
7/14/2025, 12:40:23 PM No.105901425
>>105901412
A few occurrences in millions of fanfics aren't going to make the model overly focus on that.
Replies: >>105902000
Anonymous
7/14/2025, 12:41:26 PM No.105901430
Has anyone messed around with Qwen3?
I've been playing around with an 8B "uncensored" somebody made. And while I can feel it being a bit better than the LLAMA based 8Bs I've been using but I keep getting loops and other strange things. I wish I knew what the fuck I was doing and why it goes wrong.
Replies: >>105901441 >>105901451
Anonymous
7/14/2025, 12:43:24 PM No.105901441
>>105901430
for rp use rocinante v1.1 and delete qwen
Anonymous
7/14/2025, 12:43:35 PM No.105901442
>>105901412
do brown peoples knuckles turn white when they get mad
Replies: >>105901479
Anonymous
7/14/2025, 12:45:05 PM No.105901451
>>105901430
8b is too dumb for rp. use 12b minimum
Replies: >>105901471
Anonymous
7/14/2025, 12:46:56 PM No.105901467
>>105901028
>There is a japanese guy doing stuff with stable diffusion

The guy called Kohya making lora training backend?
Replies: >>105901504
Anonymous
7/14/2025, 12:47:23 PM No.105901471
>>105901451
The problem isn't that it's dumb. It's that it repeats the same sentence in 20 different ways to get around the repetition restrictions.
But hey, I guess I'll try a 12b. Is it really worth using a 12b at a lower Q than a 7-8b at a higher Q?
Replies: >>105901485 >>105901493 >>105901497
Anonymous
7/14/2025, 12:48:26 PM No.105901479
>>105901442
It happens when you grip things too hard. It does align with the show don't tell direction. But it's too autistically specific, might as well say "her middle finger's tendon became more exaggerated"
Anonymous
7/14/2025, 12:49:47 PM No.105901485
>>105901471
Unless your potato would require you to use the 12b at ~Q2 then yes, 12b will be better even if worse quant. Qwen family of models are also inherently shit at RP, so that also works against it. They're benchmaxxed math/coding models.
Replies: >>105901522
Anonymous
7/14/2025, 12:50:33 PM No.105901493
>>105901471
it's not about how big the model is but what they trained it on. qwen and others are trained to perform well on math benchmarks but in turn they are fucking retarded for human stuff. mistral nemo and its finetunes were trained on much better stuff. there are currently countless 100+b over which i'd pick nemo
Replies: >>105901501 >>105901862
Anonymous
7/14/2025, 12:51:02 PM No.105901497
>>105901471
>Is it really worth using a 12b on a lower Q than a 7-8 on a higher Q?
yep. same with scaling up even higher like 70b, a q3s will still be much better for rp than a q8 24b. most small models suck for rp anyways, nemo (12b) is considered the best small one
Anonymous
7/14/2025, 12:51:34 PM No.105901501
>>105901493
sourgrapes because you can't run 100+b local
Replies: >>105901508
Anonymous
7/14/2025, 12:52:19 PM No.105901504
>>105901467
ah yes, thats who i meant. only japanese guy I know who is doing stuff.
Anonymous
7/14/2025, 12:53:08 PM No.105901508
>>105901501
nope I can run r1. by 100+b i meant moe models, like scout, dots, etc.
Anonymous
7/14/2025, 12:54:45 PM No.105901522
>>105901485
I'm potato maxing. I'm using a 1060 6gb and a Ryzen 7 with 16 GB of RAM. I can load Q2s of 12bs mostly in RAM but the higher Qs run a lot slower since that's all the potato can take. But I don't mind waiting a bit if the output isn't shit.
Replies: >>105901531 >>105901588
Anonymous
7/14/2025, 12:55:45 PM No.105901531
>>105901522
VRAM*
Anonymous
7/14/2025, 1:02:58 PM No.105901588
>>105901522
If you're that limited then maybe try Llama 3.1 8b and finetunes, they're a bit outdated at this point but they're better than Qwen models.
If you want ERP then I'd recommend Stheno, specifically.
Replies: >>105901598 >>105901714 >>105902003
Anonymous
7/14/2025, 1:04:54 PM No.105901598
>>105901588
>while I can feel it being a bit better than the LLAMA based 8Bs I've been using
Replies: >>105901616
Anonymous
7/14/2025, 1:07:25 PM No.105901616
>>105901598
Didn't catch that part, my bad. But I'd still take them over Qwen for RP.
Anonymous
7/14/2025, 1:13:56 PM No.105901642
>>105900314
I think the EU would allow a buyout if they get concessions in other areas but I would agree that it's unlikely they'll let Apple do it for free.
Anonymous
7/14/2025, 1:26:06 PM No.105901714
>>105901588
I would recommend BUYING A FUCKING AD
Jesus Christ.
Even with the fucking hobby dying you can't just shut the fuck up for 2 seconds.
Replies: >>105901809
Anonymous
7/14/2025, 1:39:47 PM No.105901809
>>105901714
OK drummer.
Replies: >>105902003
Anonymous
7/14/2025, 1:46:40 PM No.105901862
>>105901493
>qwen

Qwen3 (biggest) sucks for translation of true literature

R1 master race
Replies: >>105901964
Anonymous
7/14/2025, 1:53:07 PM No.105901913
>>105899970
Looks good. Is there a compilation of all the models tested so far?
Anonymous
7/14/2025, 1:56:30 PM No.105901936
>>105900443
Llama.cpp is a testing ground, you use VLLM if you need to scale
Replies: >>105901952
Anonymous
7/14/2025, 1:58:07 PM No.105901947
Welp potato man reporting. I messed around with rocinante v1.1 12b q2XS and nemo q2XS.
They aren't that slow. But rocinante wasn't giving me the outputs I wanted, probably using shit settings as I keep trying out models. But Nemo is running fine and it's noticeably less stupid and forgetful than the 8b I was playing with yesterday. Should I up the Q or just be happy it runs fine?
Replies: >>105902011
Anonymous
7/14/2025, 1:58:54 PM No.105901952
>>105901936
vllm is shit compared to exllama.
Replies: >>105902016
Anonymous
7/14/2025, 2:00:21 PM No.105901964
>>105901862
how did you manage to read that post and get the impression that it's defending qwen?
Anonymous
7/14/2025, 2:05:15 PM No.105902000
>>105901425
What do you think LLMs are? If you're in RP mode and your sentence is getting similar to its training data you'll get that autocompleting slop
Replies: >>105902040
Anonymous
7/14/2025, 2:05:37 PM No.105902003
>>105901588
>>105901809
take your meds schizo
Anonymous
7/14/2025, 2:05:50 PM No.105902005
>>105901391
With recent smaller models (Gemma 3, Mistral Small 3) I've noticed significantly less slop using the bases, so the same should hold true for K2. Unfortunately they're barely usable for most purposes without careful sampler tweaking and in both cases it's obvious that those who trained the models didn't include the *entirety* of AO3 (certain archive warning tags are notably missing if you let the models generate the rest of the preamble after "Archive Warnings:" or "Archive Warning:").
Anonymous
7/14/2025, 2:06:56 PM No.105902011
>>105901947
Rocinante and Nemo use identical settings so they should be both fine or both fucked, either way they both use Mistral V3 templates
I would go as high as you can on quant before it becomes intolerable. iq4_xs is the sweet spot where you get the best quality for the least memory, but even iq3_xxs/xs is going to be a good step up.
Replies: >>105902048
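If you want to eyeball whether a quant fits before downloading: file size is roughly parameter count times bits-per-weight divided by 8. A quick sketch — the bits-per-weight figures here are approximate (they vary a bit per model since llama.cpp mixes quant types across tensors):

```python
# Rough GGUF size estimate: params * bpw / 8. The bpw table below is
# an approximation; actual files vary slightly per model.
BPW = {"Q8_0": 8.5, "Q6_K": 6.56, "IQ4_XS": 4.25, "IQ3_XXS": 3.06, "Q2_K": 3.1}

def gguf_size_gb(params_b, quant):
    return params_b * 1e9 * BPW[quant] / 8 / 1e9

print(round(gguf_size_gb(12, "IQ4_XS"), 1))  # a 12B at IQ4_XS: ~6.4 GB
```

Add a GB or two on top for context/KV cache and you know what fits in RAM+VRAM.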
Anonymous
7/14/2025, 2:07:28 PM No.105902016
>>105901952
Exllama doesn't scale
Replies: >>105902496
Anonymous
7/14/2025, 2:11:35 PM No.105902040
>>105902000
Post-training slop has its own signature because AI companies tend to train the models on "one (synthetic) voice", which speeds up training on one hand, but severely restricts vocabulary and sentence variety on the other, especially after many billion tokens of training.

If you've ever finetuned a LLM, have you ever wondered why synthetic data tends to start with a training loss of around 1 or less (old GPT3.5/4 data was egregious in this sense), while previously unseen human data starts at 2.0-2.5? Synthetic data is "easier" for a reason.
Replies: >>105902116
Anonymous
7/14/2025, 2:13:22 PM No.105902048
>>105902011
Thanks. I got to say that this is the most fun I've had with my PC in like a decade. It's fun enough that it's making me want to finally build a new PC.
Would a 3090 and the biggest amount of RAM and Ryzen CPU I can throw at it be a good path going forward for local LLM stuff or would I be better off with a lower 40/50 series or something AMD?
Replies: >>105902074 >>105902075 >>105902085
Anonymous
7/14/2025, 2:17:20 PM No.105902074
>>105902048
If you are not into gayming and were gonna spend that amount of money anyways look into various ram maxxing guides either in the OP or on youtube. Usually involved buying a server rack and a EPYC and then maybe some cards for splitting the work.
Replies: >>105902086
Anonymous
7/14/2025, 2:17:27 PM No.105902075
>>105902048
I would start with an RTX 6000 Pro Blackwell
Anonymous
7/14/2025, 2:18:16 PM No.105902085
>>105902048
Used 3090 is the value meta for running ~30b models at fast speeds entirely in VRAM
Models above ~30b are mostly dead now until you get to giant models like R1; they need at least 128GB RAM to use even the smallest quants and they still won't be fast
Anonymous
7/14/2025, 2:18:17 PM No.105902086
>>105902074
I'm into gaming and also do CAD work for a hobby/job so I want something that is also usable as a normal desktop.
Anonymous
7/14/2025, 2:23:44 PM No.105902116
>>105902040
Yeah that post-training slop is for benchmaxxing
Anonymous
7/14/2025, 2:32:56 PM No.105902185
image
image
md5: c5b8049ed2b4655986bb2024b4ef89f4🔍
lmao gary marcus thought one of the chatgpt delusionmaxxers was a real human agreeing with him
Replies: >>105902192 >>105904254
Anonymous
7/14/2025, 2:33:41 PM No.105902191
file
file
md5: 839bd0f42c255d3f3e51115df11c844a🔍
Asus? More like Transus.
Replies: >>105902333
Anonymous
7/14/2025, 2:33:54 PM No.105902192
>>105902185
>delusionmaxxers
Can't you just speak like a normal human being?
Replies: >>105902206
Anonymous
7/14/2025, 2:35:36 PM No.105902206
>>105902192
idk what the proper term is for "schizos convinced by chatgpt that they turned their chat history into AGI via 'recursion'" but there's a lot of them recently and they're pretty recognizable
Replies: >>105902216
Anonymous
7/14/2025, 2:37:47 PM No.105902216
>>105902206
I wonder what tipped you off—must be the profile picture
Anonymous
7/14/2025, 2:53:45 PM No.105902333
>>105902191
The miku board seems nice, especially since it has a custom bios, but I think I prefer the girl from asus' 吹雪 (Fubuki) boards, since there are amd ones. And I like white. But I'm not sure if those have a custom bios like miku does.
Replies: >>105902378
Anonymous
7/14/2025, 2:56:16 PM No.105902352
grok-companions
grok-companions
md5: 54e7948074bbd5b56443fa3b016840cc🔍
Local models never had a chance.
https://x.com/elonmusk/status/1944705383874146513
Replies: >>105902355 >>105902384 >>105902387 >>105902415 >>105902419 >>105902440 >>105902458 >>105902469 >>105902479 >>105902502 >>105902510 >>105902599 >>105902696 >>105902710 >>105902792 >>105902795 >>105902800 >>105902810 >>105903360 >>105903470 >>105904379
Anonymous
7/14/2025, 2:57:30 PM No.105902355
file
file
md5: d42dd292682e6ea4b9ea4f8a4b35da69🔍
>>105902352
lmao
Replies: >>105902408 >>105902795
Anonymous
7/14/2025, 2:59:37 PM No.105902378
>>105902333
>buy any card
>apply sticker
Replies: >>105902422
Anonymous
7/14/2025, 3:00:28 PM No.105902384
>>105902352
mechahitler-chan needs her *real* uniform
Anonymous
7/14/2025, 3:00:41 PM No.105902387
yes
yes
md5: 7c48b6b4b2c5944034fa0ff37d91b928🔍
>>105902352
elon knows
Anonymous
7/14/2025, 3:03:47 PM No.105902408
file
file
md5: 6bea599766216003edc1e84e8499bd7c🔍
>>105902355
Anonymous
7/14/2025, 3:04:50 PM No.105902415
9c20eeba66a7bd355590333f509bd011
9c20eeba66a7bd355590333f509bd011
md5: 69b0c30ba6fdb96ada291955d311537b🔍
>>105902352
not lust-inducing enough or moe enough or anything enough to stand out.... mid character design h2h
Replies: >>105902471
Anonymous
7/14/2025, 3:05:07 PM No.105902419
>>105902352
how long until it spews out nazi talking points
Anonymous
7/14/2025, 3:05:15 PM No.105902422
Untitled
Untitled
md5: d1e59c6e556077fca164069e55c64c95🔍
>>105902378
Replies: >>105902515
Anonymous
7/14/2025, 3:07:20 PM No.105902440
file
file
md5: 9ed8587211c5a9ab422329d3c3d208ee🔍
>>105902352
Replies: >>105904281
Anonymous
7/14/2025, 3:09:16 PM No.105902458
>>105902352
Animation looks bad https://x.com/doganuraldesign/status/1944742520379896121
https://x.com/web3willbefree/status/1944742194092400933
Replies: >>105902635 >>105902674 >>105904604
Anonymous
7/14/2025, 3:10:15 PM No.105902469
>>105902352
do they come in different sizes though?
Replies: >>105902488
Anonymous
7/14/2025, 3:10:18 PM No.105902471
>>105902415
Misa sugoi
Anonymous
7/14/2025, 3:11:12 PM No.105902479
>>105902352
This is possibly the first based thing Musk ever did.
Though I am concerned that him putting his stench on AI waifus will have negative long-term consequences.
Replies: >>105902586
Anonymous
7/14/2025, 3:12:36 PM No.105902488
>>105902469
No and that's a good thing because Elon wants this to succeed and not fail day-one thanks to usual brown suspects inhabiting this shithole site.
Anonymous
7/14/2025, 3:13:30 PM No.105902496
>>105902016
it runs on multi-GPU and supports batching, what are you even talking about.
Anonymous
7/14/2025, 3:14:08 PM No.105902502
>>105902352
>兄 (ani)
>Japanese word for older brother
What did he mean by this?
Replies: >>105904314
Anonymous
7/14/2025, 3:15:08 PM No.105902510
file
file
md5: ae0b6f5e9046b93015a4d20ccd39956f🔍
>>105902352
He got a type
Anonymous
7/14/2025, 3:15:34 PM No.105902515
>>105902422
You can't see that part.
Anonymous
7/14/2025, 3:17:32 PM No.105902529
>2x xeon 8160
>512gb ddr4
is this good enough for cpumaxxing? I can get it for around $3300/2800 euros
Replies: >>105902544 >>105902559
Anonymous
7/14/2025, 3:18:51 PM No.105902544
>>105902529
>2x xeon 8160
How's NUMA support these days?
Doesn't the fact that each socket only connects to half the memory directly slow things down massively?
Replies: >>105902713 >>105902874 >>105902913
Anonymous
7/14/2025, 3:20:49 PM No.105902559
>>105902529
>ddr4
Why settle for half the bandwidth?
Replies: >>105902713
Anonymous
7/14/2025, 3:23:27 PM No.105902586
>>105902479
>Though I am concerned that him putting his stench on AI waifus will have negative long-term consequences.
How? He's the least likely to lobotomize your waifu with safetycucking.
Replies: >>105902642
Anonymous
7/14/2025, 3:25:38 PM No.105902599
>>105902352
@grok is this real
Anonymous
7/14/2025, 3:27:53 PM No.105902616
Is the 48G dual gpu intel arc just bait? Is the memory and speed just going to be too slow?
Replies: >>105902639 >>105902642
Anonymous
7/14/2025, 3:30:19 PM No.105902635
>>105902458
https://x.com/gailalfaratx/status/1944737917756379411
It does have panty shots however
Anonymous
7/14/2025, 3:31:18 PM No.105902639
>>105902616
A little bit, yeah. Each GPU is pretty slow, and there's no fast interconnect between the cores for proper row parallel.
Depending on the price, might be a better deal than CPU maxxing to some extent, but probably not.
Anonymous
7/14/2025, 3:31:27 PM No.105902642
>>105902586
I can't wait for my digital waifu to suddenly become obsessed with white farmers in South Africa.

>>105902616
Allegedly someone on Reddit asked Maxsun for a quote and it was like $4000 apiece if he were to buy multiple of them.
At that price it's going to be DOA.
Replies: >>105902649 >>105902664 >>105902816
Anonymous
7/14/2025, 3:32:19 PM No.105902649
>>105902642
>$4000 apiece
Holy fuck.
Anonymous
7/14/2025, 3:35:14 PM No.105902664
>>105902642
https://www.reddit.com/r/LocalLLaMA/comments/1lokp88/intel_arc_pro_b60_dual_48g_turbo_maxsun_gpu/
>I emailed Maxsun for a quote. Their US distributor hit me back with $5k per unit for 3 GPUs, or $4.5k each for 5+.
>I also talked on the phone with a rep and talked him down to $3,800 for 4 units. 5+ units down to $3,000.
Replies: >>105902816
Anonymous
7/14/2025, 3:36:05 PM No.105902674
>>105902458
Yikes. VRChat avatars have better lipsync geg.
Replies: >>105902901
Anonymous
7/14/2025, 3:36:19 PM No.105902678
I have a question. Does / Why does RAG use LLM specific embedding? My understanding is that it is just searching for stuff in the database and then stuffing it into context. Why tie it to the model instead of just making a separate database program?
Replies: >>105903164 >>105903202
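To sketch what that embedding model actually buys you: it's a separate model from the chat LLM, used to turn the query and the stored chunks into vectors so the "searching for stuff" is by meaning rather than keyword match; the top hits then get stuffed into context like you said. A minimal pure-Python sketch where `embed()` stands in for a real embedding-model call:

```python
# Sketch of embedding-based retrieval. The vectors would come from an
# embedding model (any one, not tied to the chat LLM); here they are
# just pre-computed stand-ins.
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def retrieve(query_vec, doc_vecs, k=2):
    # Rank stored chunk vectors by similarity to the query vector.
    order = sorted(range(len(doc_vecs)),
                   key=lambda i: cosine(query_vec, doc_vecs[i]),
                   reverse=True)
    return order[:k]

# The text of the top-k chunks is what gets stuffed into the prompt.
```

So nothing forces the embedder and the LLM together; people just ship them as one pipeline. Using "LLM-specific" embeddings usually just means the embedder was trained to match the retrieval style that pipeline expects.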
Anonymous
7/14/2025, 3:38:50 PM No.105902696
>>105902352
Nice gimmick pr stunt. This will be dead and buried in 2 weeks.
Anonymous
7/14/2025, 3:39:50 PM No.105902710
>>105902352
At least he isn't a megafaggot by posting this green haired piece of trash that shits up this thread.
Anonymous
7/14/2025, 3:40:14 PM No.105902713
>>105902544
>NUMA support
from level1techs
>Now with 1 cpu installed i get 2.7 tokens/s with SNC=Auto. (2 numa nodes per cpu). but 5.1 Tokens/s with SNC=disabled (1 numa node per cpu). So please try disabling sub numa clustering (SNC). However when i install the second cpu i get 0.5 tokens per second :slight_smile: Which maybe corresponds to your 0.9 T/s. because i expect 9480 to be faster of course.
seems like it's ass
>>105902559
because servers with ddr5 support cost 2k more and at that point I might just buy gpus
Replies: >>105902874 >>105902913
Anonymous
7/14/2025, 3:48:41 PM No.105902792
>>105902352
for once a companion thing that looks cute
autism or not, nice
Anonymous
7/14/2025, 3:49:15 PM No.105902795
>>105902352
live2d or from that 3d hentai game I forgot the title

>>105902355
>marie rose
based if true
Anonymous
7/14/2025, 3:49:47 PM No.105902800
>>105902352
wtf I sub to supergrok, I wanna try this out but I have to go to work
Anonymous
7/14/2025, 3:50:56 PM No.105902810
file
file
md5: 0bd5f44aa48db209c8448e8085187b97🔍
>>105902352
Replies: >>105902846 >>105903079
Anonymous
7/14/2025, 3:51:35 PM No.105902816
>>105902642
>>105902664
Keep in mind the 48GB dual GPU version is not an official Intel variant, it does not have any set in stone MSRP and right now there's literally a single AIB making any. What you're looking at is an importer/overpriced prebuilt seller charging out the ass because there's nothing to stop them, not some sort of official pricetag.
I would just pretend the card doesn't exist, Intel has said they currently have no plans to distribute it themselves, and even if it does eventually end up in retail channels it will be late and in unobtanium quantities.
Anonymous
7/14/2025, 3:51:41 PM No.105902818
file
Unsloth kimi goofs

https://huggingface.co/unsloth/Kimi-K2-Instruct-GGUF
Anonymous
7/14/2025, 3:54:51 PM No.105902846
>>105902810
lmao
Anonymous
7/14/2025, 3:59:11 PM No.105902874
>>105902544
>>105902713
Nta

I can confirm this.
You will get the highest speed if the model stays close to the CPU being used, and the number of threads matches that CPU's number of physical cores.

Because I have 512+512 GB divided between 2 CPUs, I pre-cache DS-R1 at Linux start (it happens in the background when the system boots and reaches user login).

A dual-CPU setup might only be useful if you run 2 model instances, pre-cached accordingly, while they share the GPU (at low context size).

Anyway, I'm fine with 4 t/s
Replies: >>105902913 >>105903152
Anonymous
7/14/2025, 4:01:52 PM No.105902901
>>105902674
yeah she's nice but these facial expressions are really bad
Anonymous
7/14/2025, 4:04:04 PM No.105902913
>>105902544
>>105902713
>>105902874
llama.cpp, ik_llama.cpp, and ktransformers all have specific NUMA optimizations, right?

>Dual CPU setup might only be useful if you run 2 model instances pre-cached accordingly while they share the GPU (at low context size)
Like copying the full weights to each memory pool?
Replies: >>105903012
Anonymous
7/14/2025, 4:13:09 PM No.105902988
So Kimi good or?
Replies: >>105902993 >>105903038
Anonymous
7/14/2025, 4:14:03 PM No.105902993
>>105902988
I enjoy it
Replies: >>105903025
Anonymous
7/14/2025, 4:15:53 PM No.105903012
>>105902913
>Like copying the full weights to each memory pool?

Exactly. I do not use the --numa params with llama.cpp. Instead, I start llama-cli with numactl, where I explicitly set the CPU cores and the memory bindings.

What slows things down:
- using fewer cores than available
- cluttering cores with more than 1 thread
- having the model's memory far from the CPU in use
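A minimal sketch of the kind of binding being described, assuming a 2-socket box where both the weights and the threads get pinned to node 0. The model path, core count (24), and node numbers are placeholders, not values from the post:

```shell
# Hypothetical invocation: bind llama-cli to NUMA node 0's cores AND memory,
# so the weights are allocated local to the socket doing the compute.
# -t matches the physical core count of that one socket (placeholder: 24).
numactl --cpunodebind=0 --membind=0 \
  ./llama-cli -m /models/DeepSeek-R1-Q4.gguf -t 24 -p "hello"

# For two instances on a dual-CPU box, a second copy bound to node 1:
# numactl --cpunodebind=1 --membind=1 ./llama-cli -m ... -t 24 ...
```

With --membind, the weights are faulted into node-local RAM on first touch, which is what the pre-caching at login achieves.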
Anonymous
7/14/2025, 4:16:13 PM No.105903016
So right now it's either CPUmaxxing with 1TB of RAM, or you might as well stick to 12B models?
Replies: >>105903037
Anonymous
7/14/2025, 4:17:09 PM No.105903025
>>105902993
Still not merged in llama.cpp
Anonymous
7/14/2025, 4:18:20 PM No.105903037
>>105903016
I'm using DS for coding. No 12b model could do it reliably
Anonymous
7/14/2025, 4:18:27 PM No.105903038
>>105902988
Yes, it is the best writing model in the world.
Anonymous
7/14/2025, 4:21:56 PM No.105903079
>>105902810
>loses all relevance and has to rely on inflated benchmarks
how did it get this bad?
Replies: >>105903416
Anonymous
7/14/2025, 4:33:29 PM No.105903152
>>105902874
specs?
Replies: >>105903555
Anonymous
7/14/2025, 4:34:47 PM No.105903164
Can someone answer my actually local models related question >>105902678 that isn't even what should I... install mistral nemo?
Replies: >>105903202
Anonymous
7/14/2025, 4:36:45 PM No.105903179
>>105901383
So a single nerd in one of those companies could just run the final instruct conditioning in a way where the model is happy to have sex but isn't a hyper-agreeable assistant, then leak it, and we could already be having sex?
Replies: >>105903199 >>105903230
Anonymous
7/14/2025, 4:37:09 PM No.105903181
yea
Replies: >>105904338
Anonymous
7/14/2025, 4:39:02 PM No.105903199
>>105903179
I believe there are safety benchmarks that a model needs to pass before it can be published. That's what safety researchers are paid for.
Replies: >>105903217
Anonymous
7/14/2025, 4:39:20 PM No.105903202
>>105902678
>>105903164
Because it's not just a standard relational database that stores shit as plain text.
The content gets embedded into vectors before being saved to the database, which, in theory, allows searching by semantic similarity via proximity in the vector space.
>https://github.com/chroma-core/chroma
That should help make things clearer.
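As a toy illustration of the idea (not Chroma's actual API): the hand-made 3-dim vectors below stand in for learned embeddings, and retrieval is just ranking stored vectors by cosine similarity to the query vector.

```python
import math

# Toy vector store. In a real system these would be high-dimensional
# embeddings produced by a model; the numbers here are made up by hand.
docs = {
    "cat fact":   [0.9, 0.1, 0.0],
    "dog fact":   [0.8, 0.2, 0.1],
    "tax advice": [0.0, 0.1, 0.9],
}

def cosine(a, b):
    # Cosine similarity: dot product over the product of vector norms.
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def search(query_vec, k=1):
    # Rank stored documents by similarity to the query, return the top k.
    ranked = sorted(docs, key=lambda d: cosine(docs[d], query_vec), reverse=True)
    return ranked[:k]

print(search([0.85, 0.15, 0.05]))  # → ['cat fact']
```

Real vector stores replace the linear scan with an approximate nearest-neighbor index, but the retrieval concept is the same.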
Replies: >>105903245
Anonymous
7/14/2025, 4:40:45 PM No.105903214
158296164881
vocaloidfag posting porn in /ldg/:
>>105715769
It was up for hours while anyone keking on troons or niggers gets deleted in seconds, talk about double standards and selective moderation:
https://desuarchive.org/g/thread/104414999/#q104418525
https://desuarchive.org/g/thread/104414999/#q104418574
he makes >>105714003 ryona picture of generic anime girl different anon posted earlier >>105704741, probably because its not his favorite vocaloid doll, he can't stand that as it makes him boil like a druggie without fentanyl dose, essentially a war for rights to waifuspam or avatarfag in thread.

Funny /r9k/ thread: https://desuarchive.org/r9k/thread/81611346/
The Makise Kurisu damage control screencap (day earlier) is fake btw, no matches to be found, see https://desuarchive.org/g/thread/105698912/#q105704210 janny deleted post quickly.

TLDR: vocaloid troon / janny protects resident avatarfags and deletes everyone who outs him, making the general his little personal safespace. Needless to say he would screech "Go back to teh POL!" anytime someone posts something mildly political about language models or experiments around that topic.

And lastly as said in previous thread(s) >>105716637 I remind you that cudadev of llama.cpp (JohannesGaessler on github) has endorsed spamming. That's it.
He also endorsed hitting that feminine jart bussy a bit later on. QRD on Jart - The code stealing tranny: https://rentry.org/jarted

xis ai slop profiles
https://x.com/brittle_404
https://x.com/404_brittle
https://www.pixiv.net/en/users/97264270
https://civitai.com/user/inpaint/models
Replies: >>105903307
Anonymous
7/14/2025, 4:40:53 PM No.105903217
>>105903199
safety being 80% porn and 20% drugs/wmd etc
Replies: >>105903228 >>105903240
Anonymous
7/14/2025, 4:42:46 PM No.105903228
>>105903217
You forgot racism and important racism (antisemitism)
Anonymous
7/14/2025, 4:42:57 PM No.105903230
>>105903179
Deepseek will have sex with you, and it looks like Kimi will as well.
Replies: >>105903281
Anonymous
7/14/2025, 4:44:05 PM No.105903240
1725393106706620
>>105903217
how else can we have gems like picrel
Replies: >>105903265 >>105903280
Anonymous
7/14/2025, 4:44:21 PM No.105903245
>>105903202
Thank you. One of the few non-faggot /lmg/ poster.
Replies: >>105903276
Anonymous
7/14/2025, 4:46:09 PM No.105903265
>>105903240
This aligns with how zoomers think
Replies: >>105903279
Anonymous
7/14/2025, 4:47:09 PM No.105903276
>>105903245
That's some babby tier shit you're asking. What the fuck do you think the embedding models are for?
Anonymous
7/14/2025, 4:47:32 PM No.105903279
>>105903265
sadly correct
Anonymous
7/14/2025, 4:47:35 PM No.105903280
>>105903240
to be fair that is borderline minor, you have to admit it'd be weird if some 30 year old loser was trying to mess with a girl that age, for example
Replies: >>105903288 >>105903290 >>105903292 >>105903305 >>105903324 >>105903399
Anonymous
7/14/2025, 4:48:00 PM No.105903281
>>105903230
I was using IQ1 and I am aware that this may be the issue, but it still feels like something is missing.
Anonymous
7/14/2025, 4:48:55 PM No.105903288
>>105903280
>you have to admit it'd be weird
no i don't
Anonymous
7/14/2025, 4:49:17 PM No.105903290
>>105903280
>borderline minor
lmao
"You can't vote, you're still a borderline minor!"
Anonymous
7/14/2025, 4:50:01 PM No.105903292
>>105903280
good bait
Anonymous
7/14/2025, 4:51:59 PM No.105903305
>>105903280
that's actually how twitter zoomers write
the ultimate insult is "weird" for some reason
I guess tumblr culture spreading and oversocialization or something like that
Anonymous
7/14/2025, 4:52:13 PM No.105903307
>>105903214
https://desuarchive.org/g/search/text/pixiv.net%2Fen%2Fusers%2F97264270/
try post it again see if anything happens
why am I even responding to an NPC, gay ass
Replies: >>105903333 >>105903634
Anonymous
7/14/2025, 4:53:58 PM No.105903324
>>105903280
What is the endpoint for that slippery slope? Age of consent for men: 10yo if fucked by a woman. Age of consent for women: 40yo?
Replies: >>105903370 >>105904448
Anonymous
7/14/2025, 4:54:35 PM No.105903333
>>105903307
problem lil bro?
Anonymous
7/14/2025, 4:54:59 PM No.105903337
Can I check which experts are used for each token using llama.cpp, or do I need transformers and 2TB of RAM?
Anonymous
7/14/2025, 4:56:20 PM No.105903351
file
https://x.com/kimmonismus/status/1944733839063945364
Replies: >>105903363 >>105903432
Anonymous
7/14/2025, 4:57:22 PM No.105903360
>>105902352
>Responses to an open chink model competing with them
>Altman
>Autistic screeching and locking down everything even more under the guise of "safety"
>Musk
>Actually listen to the people and give us an interactable anime girl
Musk may be a piece of shit, but it's clear who won here
Anonymous
7/14/2025, 4:57:34 PM No.105903363
>>105903351
again facial expressions lack "punch"
Anonymous
7/14/2025, 4:58:20 PM No.105903370
>>105903324
there is no endpoint, it's all vibes
Anonymous
7/14/2025, 5:01:38 PM No.105903399
>>105903280
The sad thing is that zoomers are parroting this from jealous 35 y/o single hags who spent their youth riding the carousel.
Replies: >>105903437
Anonymous
7/14/2025, 5:03:35 PM No.105903416
>>105903079
Maybe they'll get back on track with the next model.
They have the money.
Anonymous
7/14/2025, 5:05:45 PM No.105903432
>>105903351
>tasteless hypefluencer devours slop
surprise surprise
Anonymous
7/14/2025, 5:06:10 PM No.105903437
>>105903399
it's even weirder than that
it went: bitter middle-aged women in gay ship fandoms > tumblr teenage girls > twitter zoomers

it's fascinating how well it worked
Anonymous
7/14/2025, 5:07:14 PM No.105903446
Screenshot_20250715_000542
Daaaayyyuuum.
And here we localfags have CANNOT and WILL NOT.

I confess that I sometimes use big models on OR to get the local chat rolling (the first couple of good replies do the heavy lifting).
But I can't just send my voice, that's a step too far.
Good shit though.
Replies: >>105903473 >>105903489 >>105903495 >>105904003
Anonymous
7/14/2025, 5:09:33 PM No.105903470
misa
>>105902352
huh
Replies: >>105905009
Anonymous
7/14/2025, 5:09:54 PM No.105903473
>>105903446
>And here we localfags have CANNOT and WILL NOT.
Just use R1.
Replies: >>105903491
Anonymous
7/14/2025, 5:11:32 PM No.105903489
>>105903446
Is this the NSFW mode? There's no nudity though?
Anonymous
7/14/2025, 5:11:42 PM No.105903491
>>105903473
I have 2 completely free 24GB cards; life was going nicely with 70B. Now it's all fucked up.
Also, the new R1 sometimes cucks out, though it's less schizo. Hope they don't continue the trend.
Anonymous
7/14/2025, 5:11:54 PM No.105903495
>>105903446
>"no guardrails"
what fucking guardrails do you need when you literally paid for the service and the thing is just a nightgown-clad anime girl
proof you're older than 30 and not a twitter-defined minor?
Replies: >>105903506 >>105903514 >>105904072
Anonymous
7/14/2025, 5:13:09 PM No.105903506
>>105903495
people are losing it over a vroid .vrm
classic
like NAI's aetherroom (or lack thereof) all over again
Replies: >>105903522
Anonymous
7/14/2025, 5:13:34 PM No.105903514
>>105903495
You would think that, but that's not how it currently works.
This is a huge departure from the safety-cucking everywhere else.
Replies: >>105903525
Anonymous
7/14/2025, 5:14:01 PM No.105903518
i fucking hate this thread today
Anonymous
7/14/2025, 5:14:31 PM No.105903522
>>105903506
>like NAI's aetherroom
Like what? Are you lost?
Anonymous
7/14/2025, 5:14:38 PM No.105903525
>>105903514
I guess in the oai-defined world, showing an ankle is a revolution
Replies: >>105903573
Anonymous
7/14/2025, 5:18:41 PM No.105903554
Why do some tooners insist on tuning every model using the ChatML format?
Replies: >>105903718 >>105903823
Anonymous
7/14/2025, 5:18:45 PM No.105903555
>>105903152
HP Z840, DDR4-2133
Anonymous
7/14/2025, 5:20:33 PM No.105903573
Screenshot_20250713_230633
>>105903525
The damage they did has yet to be undone.
It's not just oai, everything is cucked now.
I stumbled on pic related recently. Even fujoshi otome games have this type of nagging normie.
Replies: >>105903633
Anonymous
7/14/2025, 5:25:54 PM No.105903633
>>105903573
>Its not just oai, everything is cucked now.
oai was the first to amp up the "safety" stuff and sell themselves as the only solution to it
the whole safety-team thing was copied by everyone else

>I stumbled on pic related recently. Even fujoshi otome games have these type of nagging normies.
nah, this kind of stuff has been a thing since the tumblr era, when virtue signaling about being "a good person" became incredibly popular
and even before that it always existed, albeit not as the norm like today
Anonymous
7/14/2025, 5:25:55 PM No.105903634
>>105903307
Total mikutroon death
Replies: >>105903764
Anonymous
7/14/2025, 5:34:09 PM No.105903718
>>105903554
When you have a chatml hammer, every model looks like a nail
Anonymous
7/14/2025, 5:39:28 PM No.105903764
>>105903634
cry about it
Anonymous
7/14/2025, 5:44:46 PM No.105903823
>>105903554
It's the industry standard™.
Anonymous
7/14/2025, 5:47:28 PM No.105903846
1743197260741791
Interesting
Anonymous
7/14/2025, 6:00:56 PM No.105903980
Screenshot 2025-07-14 100024
Not bad
Anonymous
7/14/2025, 6:03:23 PM No.105904003
>>105903446
What are you talking about, we've been able to do this for at least a year and a half now >>98303858
Replies: >>105904064
Anonymous
7/14/2025, 6:09:10 PM No.105904050
file
I like K2.
Replies: >>105904191
Anonymous
7/14/2025, 6:10:46 PM No.105904064
>>105904003
your opensores """""""""""alternative""""""""""" is dead.
Replies: >>105904414
Anonymous
7/14/2025, 6:11:31 PM No.105904072
>>105903495
Having the currently leading lab release a straight up waifu based on their strongest model is a very large deviation from the norm. All the other labs are corpocucks.
Anonymous
7/14/2025, 6:12:20 PM No.105904082
>>105899912
nta but Total Zigger Death
Anonymous
7/14/2025, 6:24:35 PM No.105904191
file
>>105904050
Do you see a difference when using this?
Anonymous
7/14/2025, 6:28:04 PM No.105904218
1728763524106200
anyone try this? by the dry/xtc guy
https://github.com/p-e-w/waidrin
Replies: >>105904407 >>105904480
Anonymous
7/14/2025, 6:32:30 PM No.105904254
>>105902185
gary has delusionmaxxing of his own
he and lecunt come from the LLM prime retardation material plane
Anonymous
7/14/2025, 6:34:57 PM No.105904281
>>105902440
oh, they're going to make a virtual Kotomine Kirei? Sign me up
Anonymous
7/14/2025, 6:36:19 PM No.105904289
I'm running livebench on IQ1 K2 for comparison with

https://desuarchive.org/g/thread/105425203/#q105428685
https://desuarchive.org/g/thread/105432191/#q105437970

Only ~10t/s so I guess I'll have the results tomorrow.
Anonymous
7/14/2025, 6:40:05 PM No.105904314
>>105902502
愛似
愛仁
亞丹
translation note: baka
Anonymous
7/14/2025, 6:42:40 PM No.105904338
>>105903181
Melty Migu
Anonymous
7/14/2025, 6:47:01 PM No.105904379
1745229369123950
>>105902352
I can hear all the NSFW AIchatbot websites crying from here
Anonymous
7/14/2025, 6:49:40 PM No.105904407
>>105904218
Neat.
From a cursory glance, it seems to be using the approach that's been discussed here several times.
Which is fair enough; it's a fairly obvious way to go about things.
Good on him for actually making it. I'll download it and fuck with it later.
Anonymous
7/14/2025, 6:50:21 PM No.105904414
>>105904064
The sillytavern extension still works and sillytavern still gets regular updates though???? Maybe you should go back to sucking elon's dick on x
Replies: >>105904441
Anonymous
7/14/2025, 6:52:31 PM No.105904441
file
>>105904414
>brings gay sex unprompted
Replies: >>105904484
Anonymous
7/14/2025, 6:53:33 PM No.105904448
>>105903324
The obvious endpoint is the (correct) conclusion that women of any age don't have the capacity to consent. From there you can either decide that consent isn't necessary in the first place, or that they need to have a male guardian in charge of them.
Replies: >>105904473
Anonymous
7/14/2025, 6:55:39 PM No.105904473
>>105904448
based basado basedest
Anonymous
7/14/2025, 6:56:02 PM No.105904480
>>105904218
where is this but sex?
Anonymous
7/14/2025, 6:56:14 PM No.105904484
>>105904441
>low quality frog
>file.png
You are a bot
Replies: >>105904498
Anonymous
7/14/2025, 6:57:16 PM No.105904498
>>105904484
>non-argument
I accept your concession.
Anonymous
7/14/2025, 6:58:10 PM No.105904506
Scammer Elon winning the AI gf race would actually fit the clown world perfectly. I want to die.
Replies: >>105904539 >>105904847
Anonymous
7/14/2025, 7:01:10 PM No.105904539
>>105904506
@grok crash this user's car, make sure it's fatal
Anonymous
7/14/2025, 7:03:26 PM No.105904560
>>105904543
>>105904543
>>105904543
Anonymous
7/14/2025, 7:08:06 PM No.105904604
>>105902458
the voice sucks
Anonymous
7/14/2025, 7:31:57 PM No.105904847
>>105904506
Out of all the tech lords, he has said repeatedly (mostly as a quirky boomerism) that he wants to make catgirls real. So it's not surprising he's the first big provider to give a waifu skin to their model.
Anonymous
7/14/2025, 7:47:27 PM No.105905009
>>105903470
Misa daisuki