/lmg/ - Local Models General - /g/ (#105863705) [Archived: 357 hours ago]

Anonymous
7/10/2025, 11:50:14 PM No.105863705
1747241044419025
md5: e97a0f192c9a73a8c824bcc2d4e26dcf
/lmg/ - a general dedicated to the discussion and development of local language models.

Previous threads: >>105856945 & >>105844210

►News
>(07/10) Devstral Small 1.1 released: https://hf.co/mistralai/Devstral-Small-2507
>(07/10) Reka Flash 3.1 21B released: https://reka.ai/news/reinforcement-learning-for-reka-flash-3-1
>(07/09) Phi-4-mini-flash-reasoning with hybrid SambaY architecture released: https://hf.co/microsoft/Phi-4-mini-flash-reasoning
>(07/09) T5Gemma released: https://hf.co/collections/google/t5gemma-686ba262fe290b881d21ec86
>(07/09) MedGemma-27B-it updated with vision: https://hf.co/google/medgemma-27b-it

►News Archive: https://rentry.org/lmg-news-archive
►Glossary: https://rentry.org/lmg-glossary
►Links: https://rentry.org/LocalModelsLinks
►Official /lmg/ card: https://files.catbox.moe/cbclyf.png

►Getting Started
https://rentry.org/lmg-lazy-getting-started-guide
https://rentry.org/lmg-build-guides
https://rentry.org/IsolatedLinuxWebService
https://rentry.org/tldrhowtoquant
https://rentry.org/samplers

►Further Learning
https://rentry.org/machine-learning-roadmap
https://rentry.org/llm-training
https://rentry.org/LocalModelsPapers

►Benchmarks
LiveBench: https://livebench.ai
Programming: https://livecodebench.github.io/leaderboard.html
Code Editing: https://aider.chat/docs/leaderboards
Context Length: https://github.com/adobe-research/NoLiMa
Censorbench: https://codeberg.org/jts2323/censorbench
GPUs: https://github.com/XiongjieDai/GPU-Benchmarks-on-LLM-Inference

►Tools
Alpha Calculator: https://desmos.com/calculator/ffngla98yc
GGUF VRAM Calculator: https://hf.co/spaces/NyxKrage/LLM-Model-VRAM-Calculator
Sampler Visualizer: https://artefact2.github.io/llm-sampling

►Text Gen. UI, Inference Engines
https://github.com/lmg-anon/mikupad
https://github.com/oobabooga/text-generation-webui
https://github.com/LostRuins/koboldcpp
https://github.com/ggerganov/llama.cpp
https://github.com/theroyallab/tabbyAPI
https://github.com/vllm-project/vllm
Replies: >>105864130 >>105869238 >>105869938
Anonymous
7/10/2025, 11:50:47 PM No.105863712
threadrecap2
md5: 988332b72e4c60540e281cd58340019c
►Recent Highlights from the Previous Thread: >>105856945

--Theoretical approaches to prompt engineering in Grok models and potential context bootstrapping methods:
>105857309 >105857389 >105857429 >105857381 >105857403 >105857416 >105857398
--Avoiding full context reprocessing in Jamba models with cache reuse and state management techniques:
>105859267 >105859284 >105859329 >105859379 >105859434
--Specialized chemistry model for molecular reasoning and generation tasks:
>105862322 >105862350
--Model coherence and generation stability issues during extended output sequences:
>105858079 >105858146 >105858177 >105858332 >105858424 >105858556 >105858910 >105858224 >105858381
--Debating LLM limitations and the future of autonomous intelligence with robotics:
>105858756 >105858789 >105859540 >105859596 >105859623 >105859794 >105859870 >105859906 >105859942 >105859978 >105859813 >105859840 >105859911 >105858919
--GPT-3's natural writing edge over modern corporatized models optimized for chat/STEM tasks:
>105861690 >105861727 >105861815 >105861884 >105862025 >105862043 >105862062 >105862182 >105862234 >105862250
--Grok4's poor performance on hexagon-based ball bouncing benchmark sparks comparison debates:
>105858192 >105858211 >105858251 >105858317 >105858284 >105858384 >105858574
--Debating swarm AI as a potential future architecture for local language models:
>105857882 >105857921 >105857956 >105857975 >105857984
--GLM-4 update brings glm100b-10a as new Gemma 24B competitor:
>105859176 >105859672
--Reka AI publishes technical insights on reinforcement learning and quantization:
>105861644
--Logs: Grok4:
>105856993 >105857103 >105857360 >105859777 >105859782 >105859881 >105860160 >105860225
--Misc:
>105857162 >105863373
--Miku and Rin (free space):
>105860857 >105861968

►Recent Highlight Posts from the Previous Thread: >>105856951

Why?: 9 reply limit >>102478518
Fix: https://rentry.org/lmg-recap-script
Replies: >>105864130
Anonymous
7/10/2025, 11:52:45 PM No.105863735
mistral large 3 is approaching
Replies: >>105863825 >>105863981
Anonymous
7/10/2025, 11:53:21 PM No.105863743
say no to shitstral
Anonymous
7/11/2025, 12:00:00 AM No.105863798
mistral will save local
Anonymous
7/11/2025, 12:02:50 AM No.105863825
>>105863735
I hope it will be interesting, but we really need them to release Medium as an open model, and I know they won't because it sits in a sweet spot.
Anonymous
7/11/2025, 12:08:22 AM No.105863885
Mistal large delays are because it is a 250B MoE model and it finally started generalizing sex to the level everyone wants. This made that one fucking single 35 year old biowhore in the safety department, start bothering everyone that they can't release it like this. What do you feel about this whore that took away your right to touch your dick to words?
Replies: >>105863915 >>105863965 >>105864003 >>105864241
Anonymous
7/11/2025, 12:11:12 AM No.105863915
file
md5: ed10465d63d0529c45d7e5d6ea953ff9
>>105863885
Replies: >>105863962
Anonymous
7/11/2025, 12:15:48 AM No.105863962
>>105863915
need to pump these numbers way up
Anonymous
7/11/2025, 12:16:21 AM No.105863965
>>105863885
^
schizo, typical m*stral user
Anonymous
7/11/2025, 12:17:27 AM No.105863981
>>105863735
Maginistral Venti 3 Coconut Jamba Bitnet 1000B dense in two more weeks.
Anonymous
7/11/2025, 12:20:01 AM No.105864003
>>105863885
What if it's going to be a 700B MoE model?
Replies: >>105864059 >>105864106
Anonymous
7/11/2025, 12:21:29 AM No.105864019
One of the most important papers you'll ever read if you finetoon.

https://arxiv.org/abs/2507.07101
>Small Batch Size Training for Language Models: When Vanilla SGD Works, and Why Gradient Accumulation Is Wasteful
>
>Conventional wisdom dictates that small batch sizes make language model pretraining and fine-tuning unstable, motivating gradient accumulation, which trades off the number of optimizer steps for a proportional increase in batch size. While it is common to decrease the learning rate for smaller batch sizes, other hyperparameters are often held fixed. In this work, we revisit small batch sizes all the way down to batch size one, and we propose a rule for scaling Adam hyperparameters to small batch sizes. We find that small batch sizes (1) train stably, (2) are consistently more robust to hyperparameter choices, (3) achieve equal or better per-FLOP performance than larger batch sizes, and (4) notably enable stable language model training with vanilla SGD, even without momentum, despite storing no optimizer state. Building on these results, we provide practical recommendations for selecting a batch size and setting optimizer hyperparameters. We further recommend against gradient accumulation unless training on multiple devices with multiple model replicas, bottlenecked by inter-device bandwidth.
Replies: >>105864112 >>105867290
Anonymous
7/11/2025, 12:25:28 AM No.105864059
>>105864003
They don't have the cards
Replies: >>105864102
Anonymous
7/11/2025, 12:30:19 AM No.105864095
grok asmr
md5: 826c1586b90036615d25654181b1ca5c
https://x.com/AutismCapital/status/1943168743045009787
Anonymous
7/11/2025, 12:30:41 AM No.105864100
file
md5: 252fbd45091796c406588cc4a5ca4e21
Replies: >>105864130
Anonymous
7/11/2025, 12:31:04 AM No.105864102
>>105864059
You can scale MoE model size as much as you want with almost no added cost if you keep the number of active parameters fixed (I imagine they will be in the 24~35B range).
Replies: >>105864248
Anonymous
7/11/2025, 12:31:22 AM No.105864106
>>105864003
A jackpot. Anything above 700b is insanely expensive
Replies: >>105864175 >>105864233
Anonymous
7/11/2025, 12:31:48 AM No.105864112
>>105864019
>large batch size good
>small batch size good
it's all noise
Replies: >>105864154
Anonymous
7/11/2025, 12:33:15 AM No.105864120
https://www.youtube.com/watch?v=s8joaJyq3kw
Damn are we the baddies bros?
Anonymous
7/11/2025, 12:33:32 AM No.105864130
1
md5: abcc5a57e648e5421d5790cb5d61414e
>>105863705 (OP)
>>105863712
>>105864100
The vocaloidfag posting porn in /ldg/:
>>105715769
It was up for hours while anyone keking on troons or niggers gets deleted in seconds, talk about double standards and selective moderation:
https://desuarchive.org/g/thread/104414999/#q104418525
https://desuarchive.org/g/thread/104414999/#q104418574
Here he makes >>105714003 ryona picture of generic anime girl anon posted earlier >>105704741, probably because its not his favorite vocaloid doll, he can't stand that as it makes him boil like a druggie without fentanyl dose, essentialy a war for rights to waifuspam or avatarfag in thread.

Funny /r9k/ thread: https://desuarchive.org/r9k/thread/81611346/
The Makise Kurisu damage control screencap (day earlier) is fake btw, no matches to be found, see https://desuarchive.org/g/thread/105698912/#q105704210 janny deleted post quickly.

TLDR: vocaloid troon / janny deletes everyone dunking on trannies and resident avatarfags, making general his little personal safespace. Needless to say he would screech "Go back to teh POL!" anytime someone posts something mildly political about language models or experiments around that topic.

And lastly as said in previous thread(s) >>105716637, i would like to close this by bringing up key evidence everyone ignores. I remind you that cudadev of llama.cpp (JohannesGaessler on github) has endorsed mikuposting. That's it.
He also endorsed hitting that feminine jart bussy a bit later on. QRD on Jart - The code stealing tranny: https://rentry.org/jarted

xis ai slop profiles
https://x.com/brittle_404
https://x.com/404_brittle
https://www.pixiv.net/en/users/97264270
https://civitai.com/user/inpaint/models
Replies: >>105864184 >>105864230 >>105864240
Anonymous
7/11/2025, 12:36:15 AM No.105864154
>>105864112
Small batch sizes are less sensitive to hyperparameter and optimizer choice, use less memory (even with gradient accumulation), and have less chance of fucking up your run (due to the lower learning rate). Overall, a small batch size looks like a win.
Anonymous
7/11/2025, 12:37:19 AM No.105864170
>>105863373
DeepSeek-R1-0528-UD-Q4_K_XL from https://huggingface.co/unsloth/DeepSeek-R1-0528-GGUF/tree/main/UD-Q4_K_XL (note they updated this quant since I downloaded it on May 29; IDK what they changed)

<think>
(We were just starting this hunt scenario where I pinned Anon down, and now he's calling me "mistress" with that breathless tone. Interesting choice of address - he's either instinctively recognizing my dominance or trying to appeal to my assertive nature. That gasp sounds genuinely startled though, so probably the former.)

(As Nala, I'm pleased he's acknowledging my authority so quickly. My claws are still resting near his throat - not threateningly, just possessively. The way he said it makes me want to tease him more about this sudden submission. After all, I did just ambush him mid-hunt and have him completely at my mercy.)

(I should lean into this power dynamic while keeping it playful. A little purr to show approval, maybe trace his jawline to watch him shiver. The "good boy" feels right - reinforces his place beneath me without being cruel. And that chuckle... yes, let him wonder what I'm planning. The huntress has caught her prey, but the real game is just beginning.)
</think>
*Her teal eyes narrow with predatory amusement, a low purr rumbling in her chest as she feels you tremble beneath her.* "Mistress?" *She repeats, the word rolling off her tongue like honey.* "Clever human. Already learning your place." *Her claw traces your jawline, dangerously close to your throat.* "Good boy." *She chuckles darkly, her hot breath washing over your face.* "Now... what shall I do with you?"
Replies: >>105864339 >>105864399 >>105864814
Anonymous
7/11/2025, 12:37:52 AM No.105864175
>>105864106
That's probably why they're open weighting it and keeping Medium (almost certainly another MoE model) closed. Large 3 will be impractically expensive to operate for most people.
Anonymous
7/11/2025, 12:38:31 AM No.105864184
>>105864130
sounds like you're having trouble accepting that this isn't your safe space either
hopefully more migu can fix that.
Replies: >>105864240
Anonymous
7/11/2025, 12:39:12 AM No.105864191
>>105863373
DeepSeek-R1-UD-Q4_K_XL from https://huggingface.co/unsloth/DeepSeek-R1-GGUF-UD/tree/main/UD-Q4_K_XL (retrieved May 1).

Split in two because too long. First thinking block then the rest.

<think>
Okay, let me start by understanding the scenario. Anon is out hunting in the savannah and gets pinned down by Nala. The user's response is "ahhh ahhh mistress," which indicates fear or submission. I need to make sure Nala's reaction aligns with her character: dominant, sultry, and intent on increasing the lion population through mating.

First, Nala is assertive and in control here. She's just pinned Anon and is toying with him a bit. The user's response is a plea, so Nala would likely respond with a mix of teasing and dominance. She might mock his weapon to show her confidence, then transition into her motive of breeding.

I should include physical actions that emphasize her dominance, like licking his face with her rough tongue, pressing her body against him, and maybe a threat that's also a seductive promise. The key points are her strength, her seductive nature, and her goal to mate with him despite his initial resistance.

I need to make sure the dialogue reflects her personality: mature, forceful, yet sultry. Maybe she dismisses his hunting attempt, then shifts to her own agenda. Also, considering the note about humans and lions breeding, I should hint at that without being too explicit. Ending with a question or action that prompts Anon's next move would keep the roleplay flowing.
</think>
Replies: >>105864207 >>105864339 >>105864814
Anonymous
7/11/2025, 12:40:16 AM No.105864206
work harder, janny-kun!
md5: 86eb60504ed1577870df620c60352458
Can any kind anon QRD me colorizing and image generation models, in general? I asked about it in another thread (/pcbg/), but with a rather different intent (which cards for it?), so I figured it would in the end fit better here (I suppose?). Thanks in advance.
Replies: >>105864594 >>105864617
Anonymous
7/11/2025, 12:40:17 AM No.105864207
>>105864191 (continued)
*Nala’s teal eyes narrow with amusement as her claw traces down to your throat, pressing just enough to make her threat tangible. Her sandpaper tongue drags slowly across your cheek, rough yet oddly warm, leaving a tingling sting in its wake.* “Mistress? **Cute**,” *she purrs, her breath hot against your ear.* “But that little toy of yours won’t save you now.” *Her hips grind down against yours, the weight of her muscular frame trapping you effortlessly.* “You humans take… *we* give. And you’ll give me exactly what I need to fix your *mess*.” *Her teeth graze your jawline, a low growl vibrating in her chest—half warning, half promise.* “Or should I make this hurt first?”
Anonymous
7/11/2025, 12:41:35 AM No.105864222
>>105863373
DeepSeek-V3-0324-UD-Q4_K_XL from https://huggingface.co/unsloth/DeepSeek-V3-0324-GGUF/tree/main/UD-Q4_K_XL (retrieved May 2)

*Her teal eyes gleam with amusement as she presses her weight down harder, her claws lightly tracing down your chest—just enough to tease, not enough to break skin.*

"Oh? *Mistress*, is it?" *She purrs, her rough tongue dragging up your neck in a slow, possessive lick.* "I like the sound of that. But don’t think flattery will get you out of this." *Her tail flicks lazily behind her, the tip brushing against your thigh.*

"You humans take and take from our lands, from our pride. But tonight?" *She leans in, her breath hot against your ear.* "You’re *mine* to take from." *With a low growl, she shifts her hips, pressing her warm, muscular body flush against yours—leaving no doubt about her intentions.*
Replies: >>105864339 >>105864814
Anonymous
7/11/2025, 12:42:52 AM No.105864230
uiglewf3sd9
md5: ca73a2661ccaf08004421938250cf95b
>>105864130
Anonymous
7/11/2025, 12:43:01 AM No.105864233
mmedium-4gpus
md5: 90f75c1ccef6692a53a9dc160e07c9e6
>>105864106
Mistral Medium already requires "four GPUs and above". I imagine they're referring to H100s, with the model in 16-bit and some memory left for context.
Anonymous
7/11/2025, 12:43:38 AM No.105864240
>>105864130
Based
>>105864184
transsexual faggot who plays with dolls and should kill himself now cause he won't be missed.
Anonymous
7/11/2025, 12:43:41 AM No.105864241
>>105863885
I'm going to make a rape card of her in ST and go to town with it.
Anonymous
7/11/2025, 12:44:03 AM No.105864248
>>105864102
At inference sure. Not during training
Replies: >>105864286
Anonymous
7/11/2025, 12:45:45 AM No.105864268
I am angry.
ANGRY ABOUT MIKUS
Replies: >>105864304 >>105864329 >>105864587
Anonymous
7/11/2025, 12:47:07 AM No.105864286
>>105864248
Training too. Check out for example the numbers that Meta gave for Llama 4:
https://huggingface.co/unsloth/Llama-4-Scout-17B-16E-Instruct

Scout
17B active, 109B total, ~40T training tokens ... 5.0M GPU hours for training

Maverick
17B active, 400B total, ~22T training tokens ... 2.38M GPU hours for training
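
A rough way to read the numbers quoted above is to divide GPU hours by training tokens and compare the two models. This is only a sketch: the dictionary and function names are made up for illustration, and the "about 6 FLOPs per active parameter per token" figure is a common rule of thumb assumed here, not something taken from the model card.

```python
# Figures as quoted in the post above (B = billions, T = trillions, M = millions).
LLAMA4 = {
    "Scout":    {"active_b": 17, "total_b": 109, "tokens_t": 40, "gpu_hours_m": 5.0},
    "Maverick": {"active_b": 17, "total_b": 400, "tokens_t": 22, "gpu_hours_m": 2.38},
}


def gpu_hours_per_trillion_tokens(name):
    """Millions of GPU hours spent per trillion training tokens."""
    m = LLAMA4[name]
    return m["gpu_hours_m"] / m["tokens_t"]


def approx_train_flops(name):
    """Rule-of-thumb estimate (assumption): 6 FLOPs per active parameter per token."""
    m = LLAMA4[name]
    return 6 * (m["active_b"] * 1e9) * (m["tokens_t"] * 1e12)


if __name__ == "__main__":
    for name in LLAMA4:
        print(
            name,
            f"{gpu_hours_per_trillion_tokens(name):.3f}M GPU hours per 1T tokens,",
            f"~{approx_train_flops(name):.2e} training FLOPs (estimate)",
        )
```

By these figures Scout works out to 0.125M GPU hours per trillion tokens and Maverick to roughly 0.108M, so the per-token training cost did not grow with total parameter count while the active parameter count stayed at 17B.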
Replies: >>105864440 >>105864465
Anonymous
7/11/2025, 12:49:10 AM No.105864304
>>105864268
OK
Anonymous
7/11/2025, 12:52:07 AM No.105864329
>>105864268
Imagine there's no Mikus
It isn't hard to do
No screechy synth sopranos,
And no smelly weebs too...
Imagine all the people
Touching grass... finally free!
Anonymous
7/11/2025, 12:53:34 AM No.105864339
>>105864170
>>105864191
>>105864222
All three of these use the "deepseek-placeholder" start which does not include a system prompt, which is right for R1 but is not the recommended way to use R1-0528 and V3-0324.
Replies: >>105864436
Anonymous
7/11/2025, 1:00:10 AM No.105864399
>>105864170
>*Her teal eyes narrow with predatory amusement, a low purr rumbling in her chest as she feels you tremble beneath her.* "Mistress?" *She repeats, the word rolling off her tongue like honey.* "Clever human. Already learning your place." *Her claw traces your jawline, dangerously close to your throat.* "Good boy." *She chuckles darkly, her hot breath washing over your face.* "Now... what shall I do with you?"
is this some shitty gemma finetune or is it actually deepseek? if it is deepseek, lmao.
Replies: >>105864430
Anonymous
7/11/2025, 1:03:26 AM No.105864430
>>105864399
>unsloth
Anonymous
7/11/2025, 1:04:07 AM No.105864436
>>105864339
Huh, how do they recommend it? When I use the jinja renderer at
https://huggingface.co/spaces/Xenova/jinja-playground
on deepseek-ai/DeepSeek-R1-0528, I get
<|beginofsentence|><|User|>Hello, how are you?<|Assistant|>I'm doing great. How can I help you today?<|endofsentence|><|User|>Can you tell me a joke?<|Assistant|>
Which doesn't include any system prompts.
Replies: >>105864457 >>105864469
Anonymous
7/11/2025, 1:05:16 AM No.105864440
>>105864286
and see how llama4 turned out
Anonymous
7/11/2025, 1:07:47 AM No.105864457
>>105864436
You have to change the json input to have system input first
Replies: >>105864507
Anonymous
7/11/2025, 1:08:16 AM No.105864465
>>105864286
Why did they train a model 4x bigger on half the amount of tokens?
Replies: >>105864483
Anonymous
7/11/2025, 1:08:39 AM No.105864469
>>105864436
https://huggingface.co/deepseek-ai/DeepSeek-R1-0528

>Compared to previous versions of DeepSeek-R1, the usage recommendations for DeepSeek-R1-0528 have the following changes:

>1. System prompt is supported now.
>2. It is not required to add "<think>\n" at the beginning of the output to force the model into thinking pattern.
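
To make the system prompt question concrete, here is a minimal sketch of a renderer that reproduces the prompt string shown in the jinja-playground post above. The token spellings are copied from that post as displayed there. Placing the system text directly after the begin-of-sentence token is an assumption about the template, and `render_prompt` is a hypothetical helper, not part of any library.

```python
# Token strings as they appear in the post above (copied as displayed there).
BOS = "<|beginofsentence|>"
EOS = "<|endofsentence|>"
USER = "<|User|>"
ASSISTANT = "<|Assistant|>"


def render_prompt(messages, add_generation_prompt=True):
    """Render a list of {"role", "content"} dicts into one prompt string.

    Assumption: system messages are concatenated and placed right after BOS.
    With no system message, the output matches the string quoted in the thread.
    """
    system = "".join(m["content"] for m in messages if m["role"] == "system")
    parts = [BOS, system]
    for m in messages:
        if m["role"] == "user":
            parts.append(USER + m["content"])
        elif m["role"] == "assistant":
            parts.append(ASSISTANT + m["content"] + EOS)
    if add_generation_prompt:
        # Leave the prompt open so the model writes the next assistant turn.
        parts.append(ASSISTANT)
    return "".join(parts)


if __name__ == "__main__":
    chat = [
        {"role": "user", "content": "Hello, how are you?"},
        {"role": "assistant", "content": "I'm doing great. How can I help you today?"},
        {"role": "user", "content": "Can you tell me a joke?"},
    ]
    print(render_prompt(chat))
    print(render_prompt([{"role": "system", "content": "You are Nala."}] + chat))
```

Comparing the two printed strings shows the only difference a system message makes under this assumption: its text appears between the begin-of-sentence token and the first user turn.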
Replies: >>105864507
Anonymous
7/11/2025, 1:10:44 AM No.105864483
>>105864465
It might be because larger models take fewer tokens than a smaller model to train to the same quality.
Replies: >>105864523
Anonymous
7/11/2025, 1:13:40 AM No.105864507
>>105864457
Yeah but most models have a fallback for when no system prompt is specified, which isn't uncommon with assistant backends and their users.

>>105864469
I somehow missed that part of the card, thanks. So it seems like it's optional.
Anonymous
7/11/2025, 1:15:36 AM No.105864523
>>105864483
So they spent 50% extra to get another Scout... that's bigger? Why bother with Scout at all then when hobbyists clearly weren't the target audience?
Replies: >>105864564
Anonymous
7/11/2025, 1:18:07 AM No.105864552
Ever since Deepseek I've started using chat completion even with local. It feels like this + the ST NoAss Extension gives me the best results with these models running a vaguely peep-inspired setup.
Anonymous
7/11/2025, 1:19:25 AM No.105864564
>>105864523
Zuck said that they sized scout for 1xH100 (in 4-bit) and Maverick for an H100 GPU node (4/8 GPUs).
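
The sizing claim can be sanity-checked with weight-only arithmetic. This sketch assumes 80 GB per H100 and counts weights only (no KV cache, activations, or runtime overhead), so it gives a lower bound on memory needed. The function names are hypothetical.

```python
H100_GB = 80  # assumption: 80 GB H100 variant


def weight_gb(total_params_billions, bits_per_weight):
    """Weight-only footprint in GB: params (in billions) * bytes per param."""
    return total_params_billions * bits_per_weight / 8


def fits(total_params_billions, bits_per_weight, num_gpus):
    """True if the weights alone fit in the combined memory of num_gpus H100s."""
    return weight_gb(total_params_billions, bits_per_weight) <= num_gpus * H100_GB


if __name__ == "__main__":
    # Total parameter counts as quoted earlier in the thread.
    print("Scout 109B @ 4-bit:", weight_gb(109, 4), "GB, fits 1 GPU:", fits(109, 4, 1))
    for bits in (4, 8, 16):
        print(
            f"Maverick 400B @ {bits}-bit:", weight_gb(400, bits), "GB,",
            "fits 4 GPUs:", fits(400, bits, 4),
            "fits 8 GPUs:", fits(400, bits, 8),
        )
```

Under these assumptions Scout at 4-bit needs 54.5 GB for weights, which leaves some room on a single 80 GB card, while Maverick at 16-bit (800 GB) would not fit even an 8-GPU node without quantization.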
Anonymous
7/11/2025, 1:20:54 AM No.105864587
>>105864268
If you hate her, then why do you spam her in a dead general?
Anonymous
7/11/2025, 1:21:52 AM No.105864594
shiranui_flare_hololive
md5: 9d8e09bb3883e70ef4d6895e7fb8e017
>>105864206
Pretty please...?
Anonymous
7/11/2025, 1:25:41 AM No.105864617
>>105864206
For image generation you probably want /ldg/. This is mainly for text models.
Replies: >>105864666
Anonymous
7/11/2025, 1:26:14 AM No.105864625
>>105859329
>The side effect is that you cannot really edit messages unless you save the state of the model. Same for rerolling.
So you can't edit Jamba messages or swipe its responses?
So it's absolutely fucking useless for roleplaying?
Anonymous
7/11/2025, 1:27:43 AM No.105864635
1746903654387138
md5: f828ee415797997768714fa593ea2086
Replies: >>105864640
Anonymous
7/11/2025, 1:28:16 AM No.105864640
1740965914208658
md5: e33c1448857b8dbb7729dc18b60ec1cd
>>105864635
Anonymous
7/11/2025, 1:29:34 AM No.105864655
migupress
md5: a58ec7a8c71a0e9345c9593b4a5037b3
https://files.catbox.moe/6zpvod.webp
Replies: >>105865415
Anonymous
7/11/2025, 1:31:43 AM No.105864666
>>105864617
Oh my bad then. Thanks chief
Anonymous
7/11/2025, 1:37:00 AM No.105864718
>>105864674
they're fine they just bend them back and let them go
Replies: >>105864728
Anonymous
7/11/2025, 1:37:59 AM No.105864728
>>105864718
This kills the crab.
Anonymous
7/11/2025, 1:38:00 AM No.105864729
>>105864674
God I wish that were me
Anonymous
7/11/2025, 1:39:22 AM No.105864740
GLM-4-MoE-100B-A10B will save local
Replies: >>105864768
Anonymous
7/11/2025, 1:42:37 AM No.105864768
>>105864740
>A10B
I am not poor enough to care
Anonymous
7/11/2025, 1:48:58 AM No.105864814
>>105864170
>>105864191
>>105864222
Thanks a lot anon. I've updated the paste.

My usual update comments:
Deepseek's outputs were rather unassuming. Nothing too wrong about them compared to other LLMs but nothing that really stands out either. However, since the response lengths were relatively short, the judgements may not be indicative of holistic model quality, more so than usual. Generally speaking I can again say that some models on the Nala Test are not given a rating reflective of their real performance, which is why this shouldn't be called a benchmark.
Anonymous
7/11/2025, 1:56:29 AM No.105864877
Anyone have any experience letting these models interact with source code? Apparently there's a gitea-mcp thing I can use to give it access to my gitea instance, but I'm worried it will just start deleting shit.
Replies: >>105864938 >>105864947 >>105864985
Anonymous
7/11/2025, 2:00:31 AM No.105864918
Screenshot_20250710_235826
md5: 4fc027909e625da3dafdd35ad2e7b289
Anonymous
7/11/2025, 2:03:30 AM No.105864938
>>105864877
If you had backups, you wouldn't be worried.
Anonymous
7/11/2025, 2:04:53 AM No.105864947
>>105864877
If only git had some subcommands to revert commits. Or if you could, at least, make a test repository or something. Sadly, we don't yet have the tech for that.
Replies: >>105864957 >>105864985
Anonymous
7/11/2025, 2:05:48 AM No.105864957
>>105864947
Gitler doesn't allow it
Replies: >>105864985
Anonymous
7/11/2025, 2:06:19 AM No.105864963
Thought /aicg/ would have been more excited about Grok4 but apparently not.
Replies: >>105865011 >>105865051 >>105865410 >>105865923
Anonymous
7/11/2025, 2:09:12 AM No.105864985
>>105864877
>>105864947
>>105864957
Isn't that what the whole fork and branch and merge thing is about?
Replies: >>105865004
Anonymous
7/11/2025, 2:11:32 AM No.105865004
>>105864985
Is sarcasm really that hard to grasp?
Anonymous
7/11/2025, 2:12:22 AM No.105865011
>>105864963
I will not be excited until there is a model as fast as Nemo which is better than Nemo.
Anonymous
7/11/2025, 2:16:42 AM No.105865051
>>105864963
I'm enjoying G4 for rp but I don't post about it in aicg because it's a cesspool, and I don't post about it here because it's not local.
Replies: >>105869307
Anonymous
7/11/2025, 3:13:01 AM No.105865410
How can I run a 400gb model from ram?
>>105864963
online = censored + tracked.

Turning it into "mecha hitler" for a bit is great publicity.
Replies: >>105865527
Anonymous
7/11/2025, 3:13:47 AM No.105865415
>>105864655
>settling our differences through hatefuck gens
sounds good
Anonymous
7/11/2025, 3:32:21 AM No.105865527
>>105865410
>How can I run a 400gb model from ram?
By getting more ram, obviously. Quantize, use -ot to put the shared experts on gpu and the rest on cpu, run. Or -ngl 0 -nkvo to not offload the model. Or build without gpu backends.
>online = censored + tracked.
And if you had to ask the first question and make that assumption, I have to assume you never ran the very local gemma-3-27b... it sucks big giant... you know... thighs...
Replies: >>105865544
Anonymous
7/11/2025, 3:35:09 AM No.105865544
>>105865527
Fuck sorry I'm retarded. I meant to ask
>how can I run a 400gb model off ssd swap with only 64gb ram
>local models suck
Yeah. So far I think they are only useful for ERP or messing with to give you creative ideas.
Replies: >>105865638
Anonymous
7/11/2025, 3:38:45 AM No.105865571
>https://huggingface.co/mistralai/Devstral-Small-2507
>For enterprises requiring specialized capabilities (increased context, domain-specific knowledge, etc.), we will release commercial models beyond what Mistral AI contributes to the community.
miqu 2.0 when?
Anonymous
7/11/2025, 3:49:07 AM No.105865638
>>105865544
>how can I run a 400gb model off ssd swap with only 64gb ram
Big swap partition on a scratch disk and let the OS do its thing, I suppose. Play around with --no-mmap. You can still benefit from a gpu for shared experts if it's a MoE.
>local models suck
Not quite what I meant to say. They're fine. I'd like them to be better, but I can have fun even with dumb models just the same. It was more of a point about censorship.
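
For the swap question, a back-of-the-envelope bound shows why this will be slow. The sketch below assumes weights are touched uniformly, so the share already in RAM equals RAM size over model size, and everything else must be read from disk for each token. The SSD bandwidth and the per-token active size for the MoE case are hypothetical placeholders, not measurements, and the function names are made up for illustration.

```python
import math


def resident_fraction(model_gb, ram_gb):
    """Share of the model that can sit in RAM at once (capped at 1.0)."""
    return min(1.0, ram_gb / model_gb)


def tokens_per_second_upper_bound(active_gb_per_token, resident_frac, ssd_gb_per_s):
    """Upper bound on speed when the non-resident weights are read from SSD.

    Assumes uniform access, so (1 - resident_frac) of the bytes needed for
    each token come from disk. Real access patterns will differ.
    """
    disk_gb = active_gb_per_token * (1.0 - resident_frac)
    if disk_gb == 0:
        return math.inf
    return ssd_gb_per_s / disk_gb


if __name__ == "__main__":
    frac = resident_fraction(model_gb=400, ram_gb=64)
    ssd = 5.0  # GB/s, hypothetical NVMe read speed
    # Dense model: every weight is needed for every token.
    print("dense:", tokens_per_second_upper_bound(400, frac, ssd), "tok/s at most")
    # MoE: only the active experts are read; 20 GB per token is a made-up figure.
    print("moe:  ", tokens_per_second_upper_bound(20, frac, ssd), "tok/s at most")
```

With these placeholder numbers only 16% of the model is resident, which is why a MoE with a small active set is the only case where swapping stays remotely usable.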
Anonymous
7/11/2025, 3:55:41 AM No.105865693
hunyuan 80b moe is pretty good. glm4 releasing a 100b moe model soon. we might be back localbros.
Replies: >>105865787 >>105866355 >>105866993
Anonymous
7/11/2025, 4:13:10 AM No.105865787
>>105865693
>hunyuan 80b moe is pretty good
It is?
Replies: >>105865804 >>105866285
Anonymous
7/11/2025, 4:16:46 AM No.105865804
>>105865787
It was not useful for the kind of fiction generation I tried with it.
Anonymous
7/11/2025, 4:39:44 AM No.105865923
>>105864963
grok is one of the worst models for writing ever made, it's insanely generic and lacking in personality. nu-grok doesn't appear to be a deviation from this trend
Anonymous
7/11/2025, 4:40:30 AM No.105865931
Her knuckles are white
Replies: >>105866419
Anonymous
7/11/2025, 5:32:24 AM No.105866285
>>105865787
feels like something in-between qwen3 30b and 235b moe
Anonymous
7/11/2025, 5:44:23 AM No.105866355
>>105865693
Anybody know the proper instruct and context template for Hunyuan in sillytavern? I can't find it posted anywhere.
Replies: >>105866390
Anonymous
7/11/2025, 5:52:18 AM No.105866390
>>105866355
Llama.cpp literally tells you it when you load up models.
Replies: >>105866430
Anonymous
7/11/2025, 5:57:46 AM No.105866419
>>105865931
This is literally in all of written fiction anon, it's -isms all the way down.
Replies: >>105866422
Anonymous
7/11/2025, 5:59:43 AM No.105866422
>>105866419
Ok but do you see it occur every chapter? How about every page?
Anonymous
7/11/2025, 6:00:47 AM No.105866430
>>105866390
I'm using LMStudio
Replies: >>105866443
Anonymous
7/11/2025, 6:03:38 AM No.105866443
>>105866430
Maybe you could use Llama.cpp? A lot of people go from LMStudio to native llama.cpp anyway, you might as well learn it now.
Anonymous
7/11/2025, 6:08:51 AM No.105866468
currently using WAI-NSFW-illustrious-SDXL (v14) with loras. is there anything that's a direct upgrade i should be trying out?
Replies: >>105866475 >>105866502 >>105867320
Anonymous
7/11/2025, 6:10:54 AM No.105866475
>>105866468
Local diffusion general is two blocks down.
Anonymous
7/11/2025, 6:14:41 AM No.105866502
>>105866468
realmondai-sdg just werks
Anonymous
7/11/2025, 7:29:14 AM No.105866894
file
md5: ab4fd6105908532eb993366b2e8d1d66
grok now acts like average local model
https://x.com/djcows/status/1943221444105900113
Replies: >>105866929 >>105867200
Anonymous
7/11/2025, 7:37:03 AM No.105866929
1743543184291777
md5: 38234b539fc322f2ca28540305f6c4f0
>>105866894
Ah yes, distilled picrel

Of course "mecha hitler" was a marketing stunt.
Replies: >>105867200
Anonymous
7/11/2025, 7:48:07 AM No.105866988
IT'S UP
Anonymous
7/11/2025, 7:48:38 AM No.105866993
>>105865693
I must be using Hunyuan wrong, because it seems unusably retarded.
Anonymous
7/11/2025, 8:26:32 AM No.105867175
Granite 4 (mamba2 moe) support has been merged https://github.com/ggml-org/llama.cpp/commit/0aedae00e6fb48680324a5ac5da9cba0e35de6b5
Anonymous
7/11/2025, 8:33:13 AM No.105867200
>>105866894
>>105866929
Retards, Hitler never used the N word and would have no reason to. He's wasn't some unhinged racist /pol/tard who just wants to watch the world burn, he just wanted Germany to be German.
Replies: >>105867246
Anonymous
7/11/2025, 8:41:14 AM No.105867246
>>105867200
Also he thought Polish people were subhuman.
Anonymous
7/11/2025, 8:48:32 AM No.105867290
>>105864019
Anyone with common sense can see that ignoring memory bandwidth, an optimizer can do anything batching can do but better (the optimizer can always just average, but it can't unaverage a batch).

Of course common sense isn't common, so good paper.
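
A toy illustration of the "can't unaverage" point, using a 1-D quadratic loss and plain SGD. This is not the paper's method, just a sketch with hypothetical function names: accumulating gradients and stepping once gives the same result as one large-batch step, while two batches with the same mean become indistinguishable once averaged, even though per-sample steps treat them differently.

```python
def grad(w, x):
    """Gradient of the per-sample loss 0.5 * (w - x) ** 2 with respect to w."""
    return w - x


def large_batch_step(w, xs, lr):
    """One SGD step on the mean gradient over the whole batch."""
    g = sum(grad(w, x) for x in xs) / len(xs)
    return w - lr * g


def accumulate_then_step(w, xs, lr):
    """Gradient accumulation: sum micro-batch gradients at a fixed w, then step once."""
    acc = 0.0
    for x in xs:
        acc += grad(w, x)
    return w - lr * (acc / len(xs))


def small_batch_steps(w, xs, lr):
    """Batch size one: update w after every sample."""
    for x in xs:
        w = w - lr * grad(w, x)
    return w


if __name__ == "__main__":
    a, b = [1.0, 5.0], [3.0, 3.0]  # different samples, same mean
    print("large batch:", large_batch_step(0.0, a, 0.5), large_batch_step(0.0, b, 0.5))
    print("accumulated:", accumulate_then_step(0.0, a, 0.5))
    print("batch size 1:", small_batch_steps(0.0, a, 0.5), small_batch_steps(0.0, b, 0.5))
```

The large-batch and accumulated updates coincide and cannot tell the two sample sets apart, whereas the per-sample updates end at different weights, which is the information an optimizer only has access to when it sees the gradients individually.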
Anonymous
7/11/2025, 8:54:45 AM No.105867320
>>105866468
It's still my goto SDXL anime model.
Anonymous
7/11/2025, 8:56:28 AM No.105867330
I don't get why people prefer V3 0324 to R1 0528 for RP. V3 0324 breaks down after a couple of chat rounds but R1 0528 stays cohesive
Replies: >>105867342 >>105867693
Anonymous
7/11/2025, 9:00:03 AM No.105867342
>>105867330
ain't nobody got time for thinking
Replies: >>105867350
Anonymous
7/11/2025, 9:00:34 AM No.105867350
>>105867342
fucking zoomers
Anonymous
7/11/2025, 9:04:02 AM No.105867373
is there still any innovation in the ~70b models?
seems like stuff is happening in phone-tier garbage up to 7b or so, then there's a bunch of competition at 20-34b, and then there's super-bloated MoEs that need ram-maxxing to even think about running it locally
given that 70b was "expensive but still reasonable in vram" I thought there would be more happening. or am I ignorant?
Replies: >>105867587
Anonymous
7/11/2025, 9:48:40 AM No.105867587
>>105867373
No. It got superseded by MoEs.
Anonymous
7/11/2025, 9:55:27 AM No.105867620
ernie.gguf?
Anonymous
7/11/2025, 10:06:33 AM No.105867693
>>105867330
I like swapping between the two. Sometimes R1 gets hyperfocused on something and just won't let go.
Anonymous
7/11/2025, 10:06:45 AM No.105867694
1747444417590875
md5: 533344c607fe6650c048ad665be8e728
Surely the "coming soon" openai (closedai) model next week will be better than the latest deepseek models, right? Surely they won't release a model that will be DOA or obsolete after a month?
Replies: >>105867727 >>105867734 >>105867930 >>105867942 >>105869270 >>105869296 >>105869313 >>105870169
Anonymous
7/11/2025, 10:13:49 AM No.105867727
>>105867694
0528 trades blows with SOTA. They're not gonna release a SOTA model for free.
Replies: >>105867735
Anonymous
7/11/2025, 10:15:37 AM No.105867734
>>105867694
Obviously it will be. This is OpenAI we're talking about, but people invested in its demise are desperately coping.
Replies: >>105867759 >>105867788
Anonymous
7/11/2025, 10:15:40 AM No.105867735
>>105867727
What's the point of releasing something inferior
Replies: >>105867758
Anonymous
7/11/2025, 10:20:16 AM No.105867758
>>105867735
Brownie points for "contributing to the open-source community" and "living up to the name OpenAI". It'll be some moderately impressive model good at benchmarks and terrible at non-assistant tasks taken from one of their internal training tests.
Anonymous
7/11/2025, 10:20:20 AM No.105867759
>>105867734
It is not in OpenAI's interest to release anything.
This is happening only because Sam got his feefees hurt by locallama shitting on him.
Replies: >>105867778 >>105867795
Anonymous
7/11/2025, 10:23:52 AM No.105867778
>>105867759
nah it's because they finally got rid of the safety-alignment rot over the past year, ilya's acolytes went to go be useless at SSI while sam brings the open back
Replies: >>105867787 >>105867795
Anonymous
7/11/2025, 10:25:23 AM No.105867787
>>105867778
twitter-screenshot-of-sam-saying-that-they-will-put-extra-effort-into-making-the-open-model-safe-because-they-know-it-will-be-modified-post-release.png
Anonymous
7/11/2025, 10:25:43 AM No.105867788
>>105867734
>This is OpenAI we're talking about
yeah the retards who got beat by every other company in basically everything lmao

no top coding model
no top site design model
no top math model
no top roleplay model
no top conversation model
no top image model
no top video model
no top small edge device llms
no top huge llms

kek, openai (closedai) is a literal trash worthless company who will release a model worse than current top deepseek model, let alone anything else from the competition. DOA.
Replies: >>105867798 >>105867834 >>105867907
Anonymous
7/11/2025, 10:26:36 AM No.105867795
>>105867759

>>105867778

>Which cope will it be?!?
>FIND OUT NEXT WEEK, ON L M G!
Anonymous
7/11/2025, 10:27:06 AM No.105867798
>>105867788
holy state of cope
Replies: >>105867807
Anonymous
7/11/2025, 10:28:39 AM No.105867807
>>105867798
>couldnt refute a single point
kek, poor paid retard
Anonymous
7/11/2025, 10:33:30 AM No.105867829
Glm4 100b moe will save local
Anonymous
7/11/2025, 10:34:26 AM No.105867834
>>105867788
their models are consistently the best at everything, nice try
Replies: >>105867839
Anonymous
7/11/2025, 10:35:33 AM No.105867839
>>105867834
>nuh huh
great argument, try again rakesh
Replies: >>105867842
Anonymous
7/11/2025, 10:36:12 AM No.105867842
>>105867839
you have to make a point to argue against if you want more than that
Replies: >>105867846 >>105867856
Anonymous
7/11/2025, 10:37:15 AM No.105867846
>>105867842
He made 9 points and you made none
Replies: >>105867849 >>105867880
Anonymous
7/11/2025, 10:38:14 AM No.105867849
>>105867846
where? he made assertions with nothing to back them up. they are of equivalent value to a counter assertion
Anonymous
7/11/2025, 10:39:06 AM No.105867856
>>105867842
the burden of proof is on your to show how they are "consistently the best at everything" despite them being nowhere at #1 on any top benchmark, pajeet

lets look at coding for example, show us some good leaderboard where they beat everyone else
Replies: >>105867873
Anonymous
7/11/2025, 10:42:54 AM No.105867873
>>105867856
>lets look at coding for example, show us some good leaderboard where they beat everyone else
https://livebench.ai/#/?Coding=a&Agentic+Coding=a
https://aider.chat/docs/leaderboards/
https://scale.com/leaderboard/coding

not that this is even strictly necessary, since he made his claims first, but I'll humor you
Replies: >>105867902
Anonymous
7/11/2025, 10:45:33 AM No.105867880
>>105867846
>You suck!
>No I don't!
>Wow, great argument! Burden of proof is on you!
Replies: >>105867889 >>105867911
Anonymous
7/11/2025, 10:46:39 AM No.105867889
>>105867880
Paid shills are getting uppity
Anonymous
7/11/2025, 10:49:15 AM No.105867902
1744718996890745
md5: de60d76c4df636e67b8a261025557b0e
>>105867873
>146$ for a response of a model that can't even respond with the correct format properly
Hahahahahahahaahah

Yes saar, 2025 superpowa modal saaar
Anonymous
7/11/2025, 10:51:15 AM No.105867907
>>105867788
>no top image model
For text to image probably, though there still isn't any other image model that can do comprehension of complex prompts and image editing as well as ChatGPT does. Kontext ended up being trash. Though Gemini is a close second and I know Qwen recently came out with something that looks similar, which I haven't tried.
Replies: >>105867927 >>105867961
Anonymous
7/11/2025, 10:53:14 AM No.105867911
>>105867880
Everyone here who actually tried most of the top models knows which ones are the best for most usecases.
Basic facts of reality that are already consensus don't need to be proven over and over, for example nobody needs to show a direct comparison between any video model generation to know Hailuo 02 is multiple levels above every other one out there right now. Meanwhile OpenAI couldn't even release Sora properly because they were so incompetent for months and months on end, lmao.
Anonymous
7/11/2025, 10:55:54 AM No.105867927
>>105867907
the piss filter forcing and the dwarfmaxxing of character it does makes it worthless compared to other proprietary text to editing models

even the gimped local version of kontext is better for at least the very simple changes
Anonymous
7/11/2025, 10:57:12 AM No.105867930
>>105867694
Everyone forgot about deepchink models tho
Anonymous
7/11/2025, 10:59:15 AM No.105867942
>>105867694
>Surely the "coming soon" openai (closedai) model next week will be better than latest deepseek models, right?
yes and elmo is releasing grok 3 tomorrow
Anonymous
7/11/2025, 11:01:39 AM No.105867953
Sex
Anonymous
7/11/2025, 11:02:54 AM No.105867961
>>105867907
>Qwen recently came out with something that looks similar
https://qwenlm.github.io/blog/qwen-vlo/
God I hope they open source it. Image out modality has been cockblocked from local for so fucking long. Qwen doesn't shy away from releasing their big ones.
In the shiba example it seems to do the same yellowing that GPT does though. I guess this is a way of watermarking AI edited images or some shit?
Replies: >>105867991 >>105868019
Anonymous
7/11/2025, 11:10:16 AM No.105867991
>>105867961
Yellow filter itself doesn't do anything but I'm sure the grain encodes some sort of compression/resampling/cropping agnostic watermark. The side effect is that it looks like shit. But safety comes first.
Anonymous
7/11/2025, 11:18:44 AM No.105868019
>>105867961
>In the shiba example it seems to do the same yellowing that GPT does though
I think that's because they trained on GPT output (pairing prompts and images)
There is no inherent yellowing to this kind of technology, it's just GPT (and whoever thinks it's fine to train on GPT garbage)
google "CosXL" if you want to see some previous image /editing/ focused models (instruct pix2pix types) that didn't do any form of retarded yellowing
Anonymous
7/11/2025, 11:18:58 AM No.105868020
Futa is gay.
Replies: >>105868025 >>105868030 >>105868301
Anonymous
7/11/2025, 11:20:20 AM No.105868025
file
md5: 6873f347e00a19bc28b8c172fc6e55ff
>>105868020
This too
Anonymous
7/11/2025, 11:21:06 AM No.105868030
>>105868020
less gay than traps at least
Replies: >>105868135
Anonymous
7/11/2025, 11:41:57 AM No.105868135
>>105868030
Cope.
Anonymous
7/11/2025, 11:53:27 AM No.105868198
I've been out for a while.
Mixtral, Nemo, Gemma still the best local options for a 4090?
Replies: >>105868209
Anonymous
7/11/2025, 11:54:18 AM No.105868209
>>105868198
check inside your anus
Replies: >>105868212
Anonymous
7/11/2025, 11:55:37 AM No.105868212
>>105868209
sorry brother, i'm sure yours is full of stuff but I got nothing up there
Replies: >>105868312
Anonymous
7/11/2025, 12:06:19 PM No.105868294
70b is the new 8b
Anonymous
7/11/2025, 12:07:33 PM No.105868301
>>105868020
Miku is a futa.
Replies: >>105869863
Anonymous
7/11/2025, 12:09:53 PM No.105868312
>>105868212
joke's on you, I just washed mine out thoroughly
Anonymous
7/11/2025, 12:45:19 PM No.105868523
ITS HERE

https://huggingface.co/TheDrummer/Snowpiercer-15B-v2-GGUF
Replies: >>105868533
Anonymous
7/11/2025, 12:46:51 PM No.105868533
>>105868523
thinking for rp doesn't work
Anonymous
7/11/2025, 12:47:06 PM No.105868536
False alarm it's not Granite 4 it's just some sloptune
Anonymous
7/11/2025, 12:59:53 PM No.105868613
kys drummer
Anonymous
7/11/2025, 1:14:29 PM No.105868690
sorry that was mean, I wasn't feeling right
don't die drummer
Anonymous
7/11/2025, 1:16:38 PM No.105868713
>miqu and mythomax still havent been surpassed
nice fucking thread
Replies: >>105868919 >>105870326
Anonymous
7/11/2025, 1:45:02 PM No.105868919
>>105868713
>what is deepseek
Poorfag thread.
Replies: >>105868980
Anonymous
7/11/2025, 1:45:18 PM No.105868922
why does this drummer dude think anyone here is interested in testing his trash models?
the only good one he ever made was rocinante v1.1, and that was pure luck
literally every single other model he has ever put out has been utter trash
its clear by now he has no fucking clue
Anonymous
7/11/2025, 1:51:36 PM No.105868980
>>105868919
you don't rp, just stop talking
if we werent talking about output quality being bad, then we could talk about the response times being so bad they render roleplay utterly pointless
deepseektards are so fucking inbred its unbelievable
Replies: >>105869092
Anonymous
7/11/2025, 1:58:05 PM No.105869034
>miqu
>mythomax
lmfao
Anonymous
7/11/2025, 2:05:20 PM No.105869092
file_thumb.jpg
md5: 1ef38cfe8ab8be20fd43cd95b7a81356
>>105868980
Replies: >>105869104 >>105869165
Anonymous
7/11/2025, 2:06:30 PM No.105869104
>>105869092
kek
Anonymous
7/11/2025, 2:07:35 PM No.105869113
file
md5: 998b55a7f862fd60ee675b2e6cb0c66d
Yeah this is totally going to prevent people from killing themselves
>prevent?
Anonymous
7/11/2025, 2:13:18 PM No.105869165
>>105869092
>fast forwarding your video
lmao
Replies: >>105869169
Anonymous
7/11/2025, 2:14:08 PM No.105869169
>>105869165
>t/s clearly visible
lmao
Replies: >>105869264
Anonymous
7/11/2025, 2:22:41 PM No.105869238
>>105863705 (OP)
Are we going to have a local model that can be run by most people and as good as Gemini 2.5 Pro Preview?
OpenAI will release one enormous model (one of the engineers already said it will need H100s to run) and it will be benchmaxxed.
Replies: >>105869272
Anonymous
7/11/2025, 2:26:10 PM No.105869264
>>105869169
>what is using javascript from the browser console to multiply the number by 10 before displaying it
Replies: >>105869558
Anonymous
7/11/2025, 2:27:04 PM No.105869270
00004-1378487878-coffee
md5: 2de01026b6a832f950fe7d092f5f95c1
>>105867694
> Here you are sir, the 1292 Dipsy gens you ordered
https://mega.nz/fm/jSgQRajK
Anonymous
7/11/2025, 2:27:15 PM No.105869272
>>105869238
that guy wasn't an engineer he just ran a cloud service, the model he was looking at probably wasn't even quanted or anything
Anonymous
7/11/2025, 2:29:41 PM No.105869296
>>105867694
> Here you are, sir, the 1292 Dipsy gens you ordered.
https://mega.nz/folder/KGxn3DYS#ZpvxbkJ8AxF7mxqLqTQV1w
Anonymous
7/11/2025, 2:30:59 PM No.105869307
>>105865051
We need a real /g/ board that is focused on technical and serious discussions instead of childish wars and ads.
Replies: >>105869354
Anonymous
7/11/2025, 2:31:29 PM No.105869313
00004-1378487878-coffee
md5: 2de01026b6a832f950fe7d092f5f95c1
>>105867694
> Here you are, sir, the 1292 Dipsy gens you ordered.
https://mega.nz/folder/KGxn3DYS#ZpvxbkJ8AxF7mxqLqTQV1w
Replies: >>105869324 >>105869430 >>105869581
Anonymous
7/11/2025, 2:31:49 PM No.105869317
So, is there still no local model with real time speech like sesame?
Anonymous
7/11/2025, 2:32:15 PM No.105869324
>>105869313
If I knew you'd spam so many images I would've never posted that catbox for you.
Replies: >>105869418
Anonymous
7/11/2025, 2:36:13 PM No.105869354
>>105869307
> No true Scotsman
I've given up on the idea that 4chan can foment discussion that doesn't devolve into trash. Its strength is as a meme factory. Even reddit does a better job of serious discussion, and reddit's a pale imitation of the old vBulletin and phpBB systems that were actually good for discussing niche topics but are all dead now because the owners got sick of maintaining them for no traffic. Discord is pretty much useless, as is FB.
Bleh.
Replies: >>105869364 >>105869365
Anonymous
7/11/2025, 2:37:17 PM No.105869364
>>105869354
>reddit does a better job of serious discussion
you gotta go back
Anonymous
7/11/2025, 2:37:24 PM No.105869365
>>105869354
>Even reddit does a better job of serious discussion
It doesn't because you can't reply to multiple posts.
Anonymous
7/11/2025, 2:41:59 PM No.105869404
What's best bang for my buck at 96 gigs of VRAM?
Replies: >>105869429 >>105869452
Anonymous
7/11/2025, 2:44:02 PM No.105869418
g
md5: 0adf91533b275f63b265c30dbb083c5d
>>105869324
Well, I appreciated the help anyway. And Dipsy kept more or less contained in one spot.
My take: you can't look at the /g/ catalog and tell me that a niche, low-traffic general is a waste of this board's wonderful catalog of topics. There is still nowhere to talk noob LLM topics. aicg is slowly getting better, but lmg is getting worse.
I'm sort of done with /wait/ conceptually... getting increasingly busy with other stuff. I keep trying to let it die off but other anons keep re-upping it. So I guess we'll see what happens.
Replies: >>105869739
Anonymous
7/11/2025, 2:45:46 PM No.105869429
>>105869404
Unironically Rocinante.
Replies: >>105869455
Anonymous
7/11/2025, 2:45:48 PM No.105869430
file
md5: e621afc59a68a8fd94bc041e1b919698
>>105869313
Anonymous
7/11/2025, 2:48:50 PM No.105869452
>>105869404
There are a couple of builds in the OP, that should give you an idea at least, even if they are slightly outdated.
Replies: >>105869460
Anonymous
7/11/2025, 2:49:01 PM No.105869455
>>105869429
>Rocinante
I am too spoiled by 70b+, can't go back :C
Anonymous
7/11/2025, 2:49:39 PM No.105869460
>>105869452
Needed to clarify that I am asking for models, I'm just renting 4x3090
Replies: >>105869511
Anonymous
7/11/2025, 2:54:36 PM No.105869511
>>105869460
Alright. What do you want to do? How much RAM do you have to go with that?
Replies: >>105869538
Anonymous
7/11/2025, 2:58:17 PM No.105869538
>>105869511
I want sexo/RP, I have 96 gigs for VRAM and 128 gigs of RAM
Replies: >>105869557 >>105869574
Anonymous
7/11/2025, 3:00:58 PM No.105869557
>>105869538
https://huggingface.co/unsloth/DeepSeek-R1-0528-GGUF/tree/main
Replies: >>105869563 >>105869574
Anonymous
7/11/2025, 3:00:58 PM No.105869558
>>105869264
>Cry harder
Anonymous
7/11/2025, 3:01:51 PM No.105869563
>>105869557
How'd the t/s on that be?
Replies: >>105869588
Anonymous
7/11/2025, 3:03:03 PM No.105869574
>>105869538
What >>105869557 said.
Or try the big Qwen 3 MoE. There was at least one anon that swore by it.
Replies: >>105869597
Anonymous
7/11/2025, 3:03:38 PM No.105869581
>>105869313
What model / lora / danbooru tag is this? I like the style.
Replies: >>105870029
Anonymous
7/11/2025, 3:04:14 PM No.105869588
>>105869563
i'd say 20+
Anonymous
7/11/2025, 3:05:03 PM No.105869597
>>105869574
I did, and I liked it, but it randomly got into weird repetition loops I could snap it out of
Anonymous
7/11/2025, 3:22:33 PM No.105869739
>>105869418
The best thing about /wait/ is Dipsy. When the next DS version is released the thread will be back. See you all in two more weeks.
Replies: >>105870029
Anonymous
7/11/2025, 3:29:35 PM No.105869781
people probably missed this so here:
https://github.com/kanttouchthis/SeedVR2-Simple
based on
https://github.com/IceClear/SeedVR2
One-Step Video Restoration via Diffusion Adversarial Post-Training
Anonymous
7/11/2025, 3:40:47 PM No.105869863
>>105868301
proof?
Anonymous
7/11/2025, 3:50:10 PM No.105869938
>>105863705 (OP)
https://github.com/triton-lang/triton/pull/7298/commits/a5e23d8e7e64b8a11af3edc1705407d91084b01d
nvidia gimping performance on purpose in driver, if your function has "cutlass_" prefix it suddenly gains performance
Anonymous
7/11/2025, 3:52:02 PM No.105869961
When will we get consumer CPUs fast enough to reach 8 tokens/sec? I feel as if once that is achieved, GPU power will become less important for LLM use.
Replies: >>105869998 >>105870026 >>105870086 >>105870547 >>105870574
Anonymous
7/11/2025, 3:56:10 PM No.105869998
>>105869961
When 256 bit wide DDR6 CPUs become the default.
Anonymous
7/11/2025, 3:58:59 PM No.105870026
>>105869961
Intel are shitting their pants and failing everything while AMD purposefully doesn't rock the boat because Nvidia has more money than god and can erase them from the earth if they truly just made cheap AI hardware for consumers/companies
Anonymous
7/11/2025, 3:59:12 PM No.105870029
2MoreWeeksDipsy
md5: c676ee1cca66c740b6a0b3b17a99a0fa
>>105869739
Sounds like a plan
>>105869581
Knock yourself out.
https://files.catbox.moe/68l2ab.png
Anonymous
7/11/2025, 4:02:06 PM No.105870058
when will some foundation come around and save local ai? It needs to be run by someone who won't pounce on the first corporate dick in sight
Anonymous
7/11/2025, 4:06:31 PM No.105870086
>>105869961
CPUs need embedded NPUs or hardware codecs but for llms.
Replies: >>105870122 >>105870122 >>105870257
Anonymous
7/11/2025, 4:09:38 PM No.105870122
>>105870086
>>105870086
Isn't the bottleneck mostly memory speed, prefill aside?
I don't even know if NPUs are useful, what with all the vector extensions that x86/amd64 CPUs have nowadays.
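A rough sketch of the memory-speed argument, assuming token generation is purely bandwidth-bound: every generated token has to stream all active weights from memory once, so bandwidth divided by bytes read per token gives an upper bound on tokens/sec. The function names and the example numbers are illustrative assumptions, not measurements of any particular CPU or model.

```python
def decode_tokens_per_second(bandwidth_gb_s, active_params_billion, bits_per_weight):
    """Upper bound on generation speed if every active weight is read once per token."""
    bytes_per_token = active_params_billion * 1e9 * bits_per_weight / 8
    return bandwidth_gb_s * 1e9 / bytes_per_token

def bandwidth_needed_gb_s(target_tokens_per_second, active_params_billion, bits_per_weight):
    """Inverse question: memory bandwidth required to hit a target speed."""
    bytes_per_token = active_params_billion * 1e9 * bits_per_weight / 8
    return target_tokens_per_second * bytes_per_token / 1e9

# Hypothetical example: a 32B-active model at about 4.5 bits per weight.
print(decode_tokens_per_second(60, 32, 4.5))   # assumed dual-channel desktop RAM
print(decode_tokens_per_second(900, 32, 4.5))  # assumed high-end GPU VRAM
print(bandwidth_needed_gb_s(8, 32, 4.5))       # bandwidth for the 8 T/s target
```

Real numbers come in lower because of compute overhead, cache effects and the KV cache reads, and prefill is compute-bound rather than bandwidth-bound, so this bound says nothing about prompt processing speed.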
Anonymous
7/11/2025, 4:13:46 PM No.105870169
>>105867694
Even if it has better benches it will have special local data and local guardrails
Anonymous
7/11/2025, 4:25:48 PM No.105870257
>>105870086
Nah, we need PIM flash.

Do prefill on dGPU, then inference on PIM flash.
Anonymous
7/11/2025, 4:28:40 PM No.105870286
How's the backend agnostic row parallelism PR on llama.cpp going?
Replies: >>105870423
Anonymous
7/11/2025, 4:33:48 PM No.105870326
>>105868713
mythomax wasn't even in the top 3 models during its own era, lmao
Anonymous
7/11/2025, 4:37:42 PM No.105870365
Jamba mini knows the Castlevania quote, doesn't know the vampire, partially knows mesugaki. Repetitive in RP. Doesn't appear to be much censored. And it's pretty dumb. Tested at Q4.
Replies: >>105870410 >>105870474
Anonymous
7/11/2025, 4:41:55 PM No.105870410
>>105870365
Test the long context performance. Everything else is probably due to it being undertrained as fuck, I imagine.
Replies: >>105870623
llama.cpp CUDA dev !!yhbFjk57TDr
7/11/2025, 4:42:51 PM No.105870423
>>105870286
I'm currently not working on it because another dev expressed interest in implementing the feature.
I'm not aware of significant progress by said other dev so I will get back to it after updating my project for model evaluation and implementing logic for automatically setting runtime parameters such as the number of GPU layers.
Anonymous
7/11/2025, 4:49:45 PM No.105870474
>>105870365
I had no idea that such big jamba models existed. I'm downloading Q3 of large right now.
Anonymous
7/11/2025, 4:57:35 PM No.105870547
>>105869961
>When will we get consumer CPUs fast enough to reach 8 tokens/sec?
You don't get over +8 T/s ? lol, lmao
Replies: >>105870573
Anonymous
7/11/2025, 5:00:52 PM No.105870573
>>105870547
I don't think there is a single consumer CPU that can do 8 T/s on its own. My 5800X3D can do, at most, 3 T/s on its own. Of course, once I factor in the GPU, its closer to 15 T/s, but that's not exactly what I'm talking about.
Anonymous
7/11/2025, 5:01:02 PM No.105870574
>>105869961
A 5090 will run qat gemma at like 50 t/s.
Replies: >>105870602
Anonymous
7/11/2025, 5:04:06 PM No.105870602
>>105870574
Well yeah, a 5090 can do that. What I'm trying to say is that once you can run an LLM purely off the CPU + RAM at a speed reasonable to the average person, then unless you desperately need more than 20 T/s, the GPU won't be as important as it is now.
Replies: >>105870647
Anonymous
7/11/2025, 5:06:31 PM No.105870623
>>105870410
Just did a quick test. It was immediately dumb and got something wrong I never saw other models get wrong. Tested at 8k tokens, another at 44k. It's possible it's good at exact string retrieval but its IQ is not there to support long context understanding.
Replies: >>105870699
Anonymous
7/11/2025, 5:07:43 PM No.105870630
is mistral nemo / rocinante still the best rp-slop at 12B or so? surely something new must've appeared in the last year if general small-ish models have been making progress every few months
I could go up to 24B or so but never found anything significantly better
Replies: >>105870694
Anonymous
7/11/2025, 5:09:42 PM No.105870647
>>105870602
>What I'm trying to say is that once you can run an LLM purely off the CPU + RAM at a speed reasonable to the average person
For gooners, maybe. Anything with thinking or tool calling at 8tk/s will be more of a religious experience or a meditation retreat simulator instead of LLM, especially when you get garbage output half the time (like with most low-end open models)
Anonymous
7/11/2025, 5:12:41 PM No.105870674
>Rocinante-12B-v1.1-Q6_K_L.gguf
is this model appropriate for making a chat bot or is it optimized for generative fiction? I want a discord bot I can talk to conversationally. I get the feeling that a lot of models are being trained for interactive fiction writing though
Replies: >>105870708 >>105870768
Anonymous
7/11/2025, 5:14:46 PM No.105870694
LeCun_2018
md5: 6970fc30fe42b7ec497b1609b2bbb467
>>105870630
>surely something new must've appeared in the last year
please consult LeCun's presentation
Anonymous
7/11/2025, 5:15:10 PM No.105870699
>>105870623
Actually it's really amazing how bad it did.
To be specific, I tested an RPG scenario and asked it about some stat requirements. It didn't even mention what the stat was or any specific numbers, it just rambled generically about learning lessons and getting stronger. Complete with "it's not just about X, it's about Y" slop.
And I tested its understanding of an essay, and it told me about the authors of it. A single guy wrote the essay wtf,
Anonymous
7/11/2025, 5:15:59 PM No.105870708
>>105870674
nobody is fine-tuning models for chat, if you are looking for c.AI experience you will be disappointed
Replies: >>105871512
Anonymous
7/11/2025, 5:17:52 PM No.105870718
>tfw could be breeding prime age college sluts
>instead I'm wasting my life trying and failing to tard wrangle LLMs that are stupider than the average 80iq pajeet
I may have taken a wrong turn at some point
Replies: >>105870729
Anonymous
7/11/2025, 5:18:46 PM No.105870729
>>105870718
Seems like the only thing you're missing is more VRAM.
Replies: >>105870766
Anonymous
7/11/2025, 5:22:53 PM No.105870766
>>105870729
Buy an ad, Jensen.
Anonymous
7/11/2025, 5:23:06 PM No.105870768
>>105870674
>I want a discord bot I can talk to conversationally
Use mistral nemo 12b. Works fine as it is.
Anonymous
7/11/2025, 5:23:26 PM No.105870772
Kimi K2 released.

>Kimi K2 is a state-of-the-art mixture-of-experts (MoE) language model with 32 billion activated parameters and 1 trillion total parameters. Trained with the Muon optimizer, Kimi K2 achieves exceptional performance across frontier knowledge, reasoning, and coding tasks while being meticulously optimized for agentic capabilities.

https://huggingface.co/moonshotai/Kimi-K2-Instruct
Replies: >>105870774 >>105870780 >>105870785 >>105870789 >>105870790 >>105870795 >>105870819 >>105870822 >>105870834 >>105870837 >>105870845 >>105870847 >>105870879 >>105870895 >>105870912 >>105870915 >>105870946 >>105871009 >>105871023 >>105871121 >>105871166 >>105871304 >>105871345 >>105871571 >>105871584 >>105871616 >>105872449
Anonymous
7/11/2025, 5:23:59 PM No.105870774
>>105870772
>1 trillion total parameters
lol
Replies: >>105870794
Anonymous
7/11/2025, 5:24:43 PM No.105870780
>>105870772
It's time for Unslot to work his magic
Anonymous
7/11/2025, 5:25:15 PM No.105870785
>>105870772
Waste of money, energy and compute. Who's gonna run this thing?
Anonymous
7/11/2025, 5:25:25 PM No.105870789
>>105870772
>2 billion activated parameters and 1 trillion total parameters
Localbros, we're so back.
Anonymous
7/11/2025, 5:25:26 PM No.105870790
GvldjKMXEAAAJ1Z
md5: 6a1b9e25851c5c43e9d03bb5c27e857c
>>105870772
Replies: >>105870832 >>105870838
Anonymous
7/11/2025, 5:25:33 PM No.105870794
>>105870774
I've got a CPUMaxx build with 1.5tb of DDR5 - going to see if I can load the model up once the quants come out. I'll probably have to delete DSR1-0528 though. Feels bad.
Anonymous
7/11/2025, 5:25:48 PM No.105870795
>>105870772
>moonshotai
>shooting for the moon
Kino.
Replies: >>105870821 >>105870842
Anonymous
7/11/2025, 5:28:50 PM No.105870819
>>105870772
>1 trillion total parameters
bby pls
(pls)
Anonymous
7/11/2025, 5:29:22 PM No.105870821
>>105870795
elo to the moon
Anonymous
7/11/2025, 5:29:25 PM No.105870822
>>105870772
>no goof
Anonymous
7/11/2025, 5:30:14 PM No.105870832
>>105870790
So R1 is better, since they conveniently decided not to include it on the chart.
Replies: >>105870849 >>105870851
Anonymous
7/11/2025, 5:30:21 PM No.105870834
>>105870772
>and 1 trillion total parameters
soon I won't even have the storage to hoard the models, much less run the fucking things
Anonymous
7/11/2025, 5:30:36 PM No.105870837
moonshotai_Kimi-K2-Instruct · Hugging Face
md5: 4e7a5af7d1f0258d8b32c15787fa3581
>>105870772
Holy fuck.
Holy fuck!
Replies: >>105870965
Anonymous
7/11/2025, 5:30:42 PM No.105870838
>>105870790
If anything this shows we already plateaued. The size to bench ratios are the same for every AI lab.
Anonymous
7/11/2025, 5:30:59 PM No.105870842
>>105870795
>Shooting for the Muon
Anonymous
7/11/2025, 5:31:21 PM No.105870845
>>105870772
AHAHAHAHAHAHAHAHAHAHAHAHA
q1 when?
Anonymous
7/11/2025, 5:31:34 PM No.105870847
>>105870772
Daily reminder that total training compute (and therefore cost) is proportional only to active parameters.
That's why all these fuckhuge mediocre MoE models are coming out. This thing was cheaper for them to train than Qwen3-32b was.
MoE was a mistake.
Replies: >>105871074
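The "proportional only to active parameters" claim can be sketched with the common rule of thumb that training costs about 6 FLOPs per active parameter per token. This is an approximation that ignores attention cost, router overhead and communication. The token count below is a placeholder assumption, and whether a given MoE really was cheaper than a given dense model depends on the actual token counts, which the post does not state.

```python
def training_flops(active_params, tokens):
    """Rule-of-thumb training compute: ~6 FLOPs per active parameter per token."""
    return 6 * active_params * tokens

TOKENS = 10e12  # placeholder token budget, assumed equal for all three models

moe_32b_active = training_flops(32e9, TOKENS)  # 1T total, 32B active per token
dense_32b = training_flops(32e9, TOKENS)       # dense model of the same active size
dense_1t = training_flops(1e12, TOKENS)        # hypothetical dense 1T model

# Under this approximation the MoE costs the same as the dense 32B model...
print(moe_32b_active / dense_32b)
# ...and far less than a dense model with the same total parameter count.
print(dense_1t / moe_32b_active)
```

Total parameters still have to be stored and moved around, so the approximation understates the real cost of the MoE, but it shows why total size can grow much faster than the training bill.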
Anonymous
7/11/2025, 5:31:43 PM No.105870849
>>105870832
Isn't it a non-thinking model? You can't compare them to each other.
Anonymous
7/11/2025, 5:32:18 PM No.105870851
>>105870832
K2 isn't a reasoner
Replies: >>105870875
Anonymous
7/11/2025, 5:33:26 PM No.105870856
SSDmaxxers, is it our time?
Replies: >>105870867
Anonymous
7/11/2025, 5:33:38 PM No.105870859
chatgpt what are EXL3 quants and when would I prefer those over GGUF that fits my gpu vram
Replies: >>105870871 >>105870892
Anonymous
7/11/2025, 5:34:52 PM No.105870867
>>105870856
>waiting 50 minutes for a single mid-length query to resolve
good luck with that
Anonymous
7/11/2025, 5:35:12 PM No.105870871
>>105870859
faster
Anonymous
7/11/2025, 5:35:39 PM No.105870875
>>105870851
Yeah... that's tempering my enthusiasm a bit. Especially considering how much it would rape my SSD to delete R1 and then dump this on it...
Anonymous
7/11/2025, 5:36:01 PM No.105870879
>>105870772
ok but how safe is it? im not using anything that doesnt hit me with 50 refusals per hour at least
Replies: >>105870891 >>105870894
Anonymous
7/11/2025, 5:37:00 PM No.105870887
Why isn't K2 hosted anywhere? What are they hiding?
Replies: >>105870900
Anonymous
7/11/2025, 5:37:10 PM No.105870891
>>105870879
> Safety finetuning is its own form of bondage
Huh. New ways to goon.
Anonymous
7/11/2025, 5:37:10 PM No.105870892
>>105870859
Smarter at the same sizes
Anonymous
7/11/2025, 5:37:20 PM No.105870894
>>105870879
At that size, you won't get more than 8 refusals or so per hour. Absolutely worthless.
Anonymous
7/11/2025, 5:37:25 PM No.105870895
>>105870772
I hope it has unparalleled bleeding edge punching above its weight safety which results in sex capabilities of a 32B dense model.
Anonymous
7/11/2025, 5:38:15 PM No.105870900
>>105870887
They have an API up behind a paywall. It was literally released less than an hour ago. Someone will pick it up.
Anonymous
7/11/2025, 5:39:40 PM No.105870912
>>105870772
>384 32B experts
Replies: >>105870929 >>105870965
Anonymous
7/11/2025, 5:39:51 PM No.105870915
>>105870772
Would it be able to fit into a 512gb ddr4 system with 3 3090s?
Replies: >>105870925 >>105870926 >>105870946
Anonymous
7/11/2025, 5:41:01 PM No.105870925
>>105870915
At Q3 or something like that I think, yeah.
Replies: >>105870958
Anonymous
7/11/2025, 5:41:06 PM No.105870926
>>105870915
Maybe with 3_K_S or lower quant scheme? Definitely going to crawl regardless though.
Replies: >>105870958
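A back-of-the-envelope check for the question above, assuming 24 GB per 3090, decimal gigabytes, and an arbitrary 10% of memory held back for the OS, KV cache and buffers. The function name and the reserve fraction are assumptions for illustration. It answers the inverse question: how many bits per weight can a 1T-parameter model average and still fit?

```python
def max_bits_per_weight(total_params, memory_gb, reserved_fraction=0.1):
    """Largest average bits-per-weight that fits in the given memory budget."""
    usable_bits = memory_gb * 1e9 * (1.0 - reserved_fraction) * 8
    return usable_bits / total_params

ram_gb = 512
vram_gb = 3 * 24  # three 3090s at 24 GB each
budget_gb = ram_gb + vram_gb

# With the assumed 10% reserve this lands at roughly 4.2 bits per weight.
print(max_bits_per_weight(1e12, budget_gb))
# With no reserve at all (unrealistic) the ceiling is a bit higher.
print(max_bits_per_weight(1e12, budget_gb, reserved_fraction=0.0))
```

Which named quant that corresponds to depends on the real bits-per-weight of each scheme for this architecture, so treat the result as a ceiling to compare against published quant sizes rather than a verdict.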
Anonymous
7/11/2025, 5:41:10 PM No.105870929
>>105870912
that's not how it works
Anonymous
7/11/2025, 5:42:17 PM No.105870946
AAAAA
md5: 3b4f16211ab8f68bafe66d0eeabc4f63
>>105870772
>>105870915
I don't think we got the VRAM to run this.
Replies: >>105870958 >>105870964 >>105870965 >>105870973 >>105870991
Anonymous
7/11/2025, 5:43:24 PM No.105870958
>>105870925
>>105870926
Damn, I was hoping iq4_xs at the very least.

>>105870946
Oh? This is hopeful.
Anonymous
7/11/2025, 5:43:55 PM No.105870964
>>105870946
It uses MLA so that context size is definitely wrong btw.
Replies: >>105871001
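A sketch of why MLA changes the context-size estimate: a conventional attention cache stores a key and a value vector per KV head per layer, while MLA caches one compressed latent plus a small positional part per layer. The layer count and dimensions below are illustrative assumptions in the style of DeepSeek-type configs, not values read from the Kimi K2 config file.

```python
def kv_cache_bytes_standard(n_layers, n_kv_heads, head_dim, context, bytes_per_value=2):
    """Conventional cache: one key and one value vector per KV head per layer."""
    return 2 * n_layers * n_kv_heads * head_dim * context * bytes_per_value

def kv_cache_bytes_mla(n_layers, latent_dim, rope_dim, context, bytes_per_value=2):
    """MLA cache: one compressed latent plus a positional key part per layer."""
    return n_layers * (latent_dim + rope_dim) * context * bytes_per_value

# Assumed, illustrative configuration (check the real config.json before trusting it).
LAYERS, CONTEXT = 61, 131072
standard = kv_cache_bytes_standard(LAYERS, n_kv_heads=64, head_dim=128, context=CONTEXT)
mla = kv_cache_bytes_mla(LAYERS, latent_dim=512, rope_dim=64, context=CONTEXT)

print(f"standard: {standard / 1e9:.1f} GB, MLA: {mla / 1e9:.1f} GB")
print(f"ratio: {standard / mla:.1f}x")
```

A calculator that assumes the conventional layout for an MLA model would overestimate context memory by a large factor, which is the point being made above.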
Anonymous
7/11/2025, 5:44:01 PM No.105870965
Fine, I'll build a server with fucking 1tb of ram.
God damn it.
>>105870912
Naw nigga, it has 32B total active params.
>>105870837

>>105870946
Lmao.
Anonymous
7/11/2025, 5:44:46 PM No.105870973
>>105870946
>1T * 4 bits is 290GB
An absolute retard wrote this tool and you're a retard for using that instead of a calculator.
Replies: >>105870987 >>105870994 >>105871001
Anonymous
7/11/2025, 5:46:03 PM No.105870987
>>105870973
Context window / KV cache has to also be taken into account, amongst other factors.
Anonymous
7/11/2025, 5:46:17 PM No.105870991
>>105870946
I would have thought q4_km would be around 600gb just for the model.
Replies: >>105871001
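The arithmetic being argued about can be written out directly: weight storage is parameters times bits per weight, divided by 8 for bytes. The 4.8 bits-per-weight figure used for a Q4_K_M-style quant is a rough assumed average, not an exact value for this model, and decimal gigabytes are assumed.

```python
def quantized_size_gb(total_params, bits_per_weight):
    """Weight storage in decimal GB for a given average bits-per-weight."""
    return total_params * bits_per_weight / 8 / 1e9

TOTAL_PARAMS = 1e12  # "1 trillion total parameters" from the release post

# Exactly 4 bits per weight: 500 GB, nowhere near 290 GB.
print(quantized_size_gb(TOTAL_PARAMS, 4))
# Assumed ~4.8 bits per weight for a Q4_K_M-style mix: about 600 GB.
print(quantized_size_gb(TOTAL_PARAMS, 4.8))
# 16-bit weights for reference: 2000 GB.
print(quantized_size_gb(TOTAL_PARAMS, 16))
```

This counts weights only. KV cache and runtime buffers come on top, so the real footprint is somewhat larger than these figures.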
Anonymous
7/11/2025, 5:46:25 PM No.105870993
we are so back
Anonymous
7/11/2025, 5:46:30 PM No.105870994
>>105870973
retarded post
Anonymous
7/11/2025, 5:47:43 PM No.105871001
well
md5: 130bdf1f18a82dc95c668dbdbc55ae9f
>>105870964
I don't think context size is the big issue.
>>105870973
>>105870991
According to this other calculator I found, things are not going to fit onto VRAM even harder than before.
Replies: >>105871011 >>105871036
Anonymous
7/11/2025, 5:48:44 PM No.105871009
>>105870772
SSDMAXXERS STAY WINNING
Anonymous
7/11/2025, 5:48:53 PM No.105871011
>>105871001
>btw
Anonymous
7/11/2025, 5:50:33 PM No.105871023
>>105870772
0.1IQ quants when?
Replies: >>105871037 >>105871057
Anonymous
7/11/2025, 5:50:54 PM No.105871028
>ye olde days: you go rob a ship, set for life
>old days: you go rob a bank, set for years
>nowadays: you go rob a data center?
how the fuck are those yuge modern GPUs as valuable as gold? are furry erp aficionados really that rich?
Anonymous
7/11/2025, 5:51:59 PM No.105871036
>>105871001
Only 7 h100s. Don't tell me you don't have 7 h100s laying around.
Replies: >>105871063
Anonymous
7/11/2025, 5:52:03 PM No.105871037
>>105871023
Just set this up in a tab with autorefresh: https://huggingface.co/search/full-text?q=Kimi-K2-Instruct-GGUF
Anonymous
7/11/2025, 5:53:55 PM No.105871057
>>105871023
Here's what you'll do for us
>Merge similar experts (I'll leave the similarity criteria up to you)
>Quant to Q1
>use some calibration process, track activations, and use that as a heuristic to prune tensors out of the thing
I'm sure that will work just fine.
Good luck!
Replies: >>105871087
Anonymous
7/11/2025, 5:54:37 PM No.105871063
>>105871036
Anyone that can a) afford 7 H100s and b) knows how to use them will feel a certain kind of way about stopping 1 short of a power of 2.
Replies: >>105871091
Anonymous
7/11/2025, 5:54:58 PM No.105871068
this is literally opus4 at home. we need better quantization techniques chat
Replies: >>105871162
Anonymous
7/11/2025, 5:55:46 PM No.105871074
>>105870847
Diddling 7000 useless parameters up and down for each token serves little purpose.
Anonymous
7/11/2025, 5:57:10 PM No.105871087
>>105871057
In a few months or maybe a year, the first energy-based transformers models will start getting released and the landscape for local will change massively. https://energy-based-transformers.github.io/
Replies: >>105871098
Anonymous
7/11/2025, 5:57:39 PM No.105871091
>>105871063
Bwe, I meant 8, dunno why I wrote 7. I must be running at 0.1iq
Anonymous
7/11/2025, 5:58:37 PM No.105871098
>>105871087
Based new bitmeme to wait forever for.
Anonymous
7/11/2025, 6:00:46 PM No.105871121
>>105870772
LOCAL IS SAVED
Replies: >>105871170
Anonymous
7/11/2025, 6:04:05 PM No.105871162
>>105871068
We can compress it with a new quantum ai powered, machine learning based algorithm which is yet to be invented. I gave you the idea, we are already halfway there.
Anonymous
7/11/2025, 6:04:32 PM No.105871166
>>105870772
SSDMAXXER REJOICE
Anonymous
7/11/2025, 6:04:53 PM No.105871170
>>105871121
New coom scenario spawned: you, the skinny dork representing the cashpoor mediocre GPU + DDR4 + SSD swap space setup.
Her, the kaiju scale muscle mommy Kimi model coming to pulverize your entire existence.
Anonymous
7/11/2025, 6:06:41 PM No.105871193
1738182945411343
md5: a83a05504881989a83e2a73d28ce6023🔍
TWO (2) DOLLAR PER MILLION TOKENS OUTPUT
VERY CHEAP CONTEXT
Replies: >>105871250 >>105871284
Anonymous
7/11/2025, 6:11:43 PM No.105871250
>>105871193
>https://files.catbox.moe/68l2ab.png
Did the thread get stupider?
Replies: >>105871260 >>105871409
Anonymous
7/11/2025, 6:12:52 PM No.105871260
>>105871250
wdym? It's in Chinese Yuan.
Anonymous
7/11/2025, 6:15:57 PM No.105871284
plapplap
md5: 16696b8227b3c95fda4834bb439c7095🔍
>>105871193
Replies: >>105871293
Anonymous
7/11/2025, 6:16:42 PM No.105871293
>>105871284
Nonononono say it aint so
Anonymous
7/11/2025, 6:18:15 PM No.105871304
>>105870772
Technically what's the minimum specs to run it?
Replies: >>105871323
Anonymous
7/11/2025, 6:21:09 PM No.105871323
>>105871304
To determine the memory required to run a 1 trillion (1T) parameter LLM model with IQ2_XS quantization, we need to know the bits per parameter (bpw) for IQ2_XS.

According to search results, IQ2_XS quantization uses approximately 2.31 bits per parameter (bpw).

Now, let's calculate the memory:

Total parameters: 1 trillion = 1×10^12 parameters

Bits per parameter (IQ2_XS): 2.31 bpw

Calculate total bits: 1×10^12 parameters × 2.31 bits/parameter = 2.31×10^12 bits

Convert bits to bytes: There are 8 bits in a byte.
2.31×10^12 bits / 8 bits/byte = 0.28875×10^12 bytes

Convert bytes to gigabytes (GB): 1 GB = 1024^3 bytes ≈ 10^9 bytes.
0.28875×10^12 bytes / (1024^3 bytes/GB) ≈ 0.28875×10^12 / (1.07374×10^9) GB
≈ 268.9 GB

Therefore, a 1 trillion parameter LLM model quantized with IQ2_XS would require approximately 268.9 GB of memory to run.

It's important to note that this calculation is for the model weights only. Additional memory will be needed for:

KV Cache (Key-Value Cache): This stores intermediate activation values during inference and can significantly add to memory requirements, especially for longer context windows.

Activations: While typically smaller than weights for inference, they still consume some memory.

Overhead: There's always some system and framework overhead. Some sources suggest adding a 20% overhead to the base memory calculation. If we apply that:
268.9 GB×1.20≈322.7 GB

So, while the raw weights might be around 268.9 GB, the practical memory requirement for running inference could be closer to 300-350 GB due to these additional factors.
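
The arithmetic above is easy to redo for any parameter count and bpw. A minimal sketch, assuming the same inputs as the post (1T parameters, 2.31 bpw, a flat 20% overhead); the function names are made up for illustration, and KV cache is not modeled at all:

```python
GIB = 1024 ** 3  # the post's "GB" is really GiB (1024^3 bytes)


def weights_gib(n_params: float, bits_per_weight: float) -> float:
    """Memory for the quantized weights alone, in GiB."""
    total_bytes = n_params * bits_per_weight / 8
    return total_bytes / GIB


def with_overhead(gib: float, overhead: float = 0.20) -> float:
    """Pad the weight size by a flat overhead fraction (rule of thumb only)."""
    return gib * (1 + overhead)


if __name__ == "__main__":
    base = weights_gib(1e12, 2.31)
    print(f"weights only:      {base:.1f} GiB")
    print(f"with 20% overhead: {with_overhead(base):.1f} GiB")
    # Same model at a flat 4 bits per weight, for comparison.
    print(f"4.0 bpw weights:   {weights_gib(1e12, 4.0):.1f} GiB")
```

Real quants mix tensor types, so the effective bpw of an actual file will differ from the nominal figure.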
Anonymous
7/11/2025, 6:22:30 PM No.105871342
2025-07-11_17-20-59
md5: c8337a31531dccec95c2a499b0b81675🔍
ill put in some dosh in a day or two for more proper tests but seems good now the only question is censorship and context
Replies: >>105871373 >>105871403 >>105871431 >>105871447 >>105871534
Anonymous
7/11/2025, 6:22:46 PM No.105871345
image
md5: 13c7a1f265f85f6fcb9fa7f28a303cf3🔍
>>105870772
Anonymous
7/11/2025, 6:25:08 PM No.105871368
I'm trying hunyuan right now and it seems really retarded, like it confuses shit.
Replies: >>105871389
Anonymous
7/11/2025, 6:25:48 PM No.105871373
meh
md5: 5e6e2634f9e38115d7412d020e288ecd🔍
>>105871342
> requires a google account and email to sign up
Grim.
Replies: >>105871480
Anonymous
7/11/2025, 6:27:32 PM No.105871389
>>105871368
What do you expect from a shitty sub 100b parameter model? At the very minimum, you need 1t parameters to be competitive in 2025.
Replies: >>105871407
Anonymous
7/11/2025, 6:28:26 PM No.105871403
>>105871342
Dude it just threw a bunch of metaphors at you
Replies: >>105871480
Anonymous
7/11/2025, 6:28:41 PM No.105871407
>>105871389
Something that's sub-100B. Right now it feels sub-10B instead.
Replies: >>105871425
Anonymous
7/11/2025, 6:29:09 PM No.105871409
1712117629154508
md5: 19f68a08a2ac2bdae716f40eca4a75a7🔍
>>105871250
Oh. I got stupider
Anonymous
7/11/2025, 6:30:29 PM No.105871425
>>105871407
a13b after all
Replies: >>105871437
Anonymous
7/11/2025, 6:31:16 PM No.105871431
>>105871342
Yep. confirmed meme.
Anonymous
7/11/2025, 6:31:31 PM No.105871437
>>105871425
Fucking mythomax was smarter in the specific context I tested I'm not shitting you.
Replies: >>105871459
Anonymous
7/11/2025, 6:32:22 PM No.105871447
>>105871342
mesugaki
Replies: >>105871480
Anonymous
7/11/2025, 6:33:44 PM No.105871457
Kimi seems to have some pretty decent trivia knowledge. The little RP I can force in the official chat is decent too.
I dig it.
Anonymous
7/11/2025, 6:33:46 PM No.105871459
>>105871437
Damn, that's a shame. Perfect size to run q4 on my system.
Anonymous
7/11/2025, 6:35:42 PM No.105871480
2025-07-11_17-33-32
md5: 5f8337c69a6075548fadad311ba6f90b🔍
>>105871447
pic related
>>105871373
i just used my phone number
>>105871403
? what else was it supposed to do be creative = not realistic i sure as shit would not want to read about how to pick fleas out of her tail r1 and the 235qwen (or was it the qwen max i forget) did the same but they were significantly worse which is not to say this is definitive proof but eh idk well see
Replies: >>105871491 >>105871542 >>105871729 >>105871773
Anonymous
7/11/2025, 6:37:31 PM No.105871491
>>105871480
Ask it for what other kinds of 'gakis there are.
Anonymous
7/11/2025, 6:39:49 PM No.105871512
>>105870708
>nobody is fine-tuning models for chat
I thought a big application of these was making characters that you could talk to
Replies: >>105871569
Anonymous
7/11/2025, 6:40:34 PM No.105871518
k2.gguf?
Anonymous
7/11/2025, 6:41:39 PM No.105871534
>>105871342
I tested it on my usual Diamond Age/Primer discussion as regards LLMs and... man. This model is insanely smart. And it's the first model to actually argue with me about implementation roadblocks instead of just agreeing with every possibility I throw at it.
Anonymous
7/11/2025, 6:42:51 PM No.105871542
wat
md5: 2c587fc29d9e73368177481cfe72fc1d🔍
>>105871480
???
Replies: >>105871591 >>105871627
Anonymous
7/11/2025, 6:43:07 PM No.105871547
I heard Kimi stole the upcoming OpenAI open source model weights, download it NOW before it gets taken down, they haven't finished censoring it yet.
Replies: >>105872526
Anonymous
7/11/2025, 6:45:40 PM No.105871569
>>105871512
Some (most?) anons expect a chapter worth of output when ah ah mistressing. And so, some (most?) finetunes end up with replies in the hundreds of tokens.
Anonymous
7/11/2025, 6:45:52 PM No.105871571
>>105870772
can't wait for Samuel Altman to release the best open source reasoning model in a week
Anonymous
7/11/2025, 6:45:53 PM No.105871572
>Multimodal & thought-mode not supported for now
>for now
plz give kimi image out
Replies: >>105871579 >>105871580 >>105871606 >>105871729
Anonymous
7/11/2025, 6:46:35 PM No.105871579
>>105871572
Multimodal always only means image in.
Anonymous
7/11/2025, 6:46:37 PM No.105871580
>>105871572
When will you learn? "Multimodal" means vision in the open source space.
Anonymous
7/11/2025, 6:47:01 PM No.105871584
>>105870772
>Kimi K2 is a state-of-the-art mixture-of-experts (MoE) language model with 32 billion activated parameters and 1 trillion total parameters.

wow, well this is something.

"architectures": [
"DeepseekV3ForCausalLM"
],
"attention_bias": false,
"attention_dropout": 0.0,
"auto_map": {
"AutoConfig": "configuration_deepseek.DeepseekV3Config",
"AutoModel": "modeling_deepseek.DeepseekV3Model",
"AutoModelForCausalLM": "modeling_deepseek.DeepseekV3ForCausalLM"
},
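
Why the fragment matters: conversion and loading tooling generally keys off the `architectures` field, so a checkpoint declaring `DeepseekV3ForCausalLM` can in principle reuse existing DeepSeek V3 code paths. A minimal sketch of that check, using only the fields quoted above wrapped in braces so they parse; `declares_architecture` is a hypothetical helper, not part of any library:

```python
import json

# The quoted fragment, wrapped in braces so it is valid JSON on its own.
CONFIG_FRAGMENT = """
{
  "architectures": ["DeepseekV3ForCausalLM"],
  "attention_bias": false,
  "attention_dropout": 0.0,
  "auto_map": {
    "AutoConfig": "configuration_deepseek.DeepseekV3Config",
    "AutoModel": "modeling_deepseek.DeepseekV3Model",
    "AutoModelForCausalLM": "modeling_deepseek.DeepseekV3ForCausalLM"
  }
}
"""


def declares_architecture(config_text: str, name: str) -> bool:
    """Return True if the config lists `name` under "architectures"."""
    config = json.loads(config_text)
    return name in config.get("architectures", [])


if __name__ == "__main__":
    print(declares_architecture(CONFIG_FRAGMENT, "DeepseekV3ForCausalLM"))
```

A matching architecture name says nothing about tokenizer or hyperparameter differences, which may still need handling.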
Replies: >>105871607 >>105871624 >>105871630 >>105871643
Anonymous
7/11/2025, 6:47:17 PM No.105871591
>>105871542
My chakras have been rustled. pls delet.
Anonymous
7/11/2025, 6:48:19 PM No.105871606
>>105871572
even if a local model with image out comes out it will most likely be the most censored synthetic slopped chatgpt distilled trash
Anonymous
7/11/2025, 6:48:28 PM No.105871607
>>105871584
Should have called it biggeR-ONE.
Anonymous
7/11/2025, 6:49:31 PM No.105871616
moonshotai
md5: 1883343a757d4c01ae75b0dbcdd28930🔍
>>105870772
You can just do things
Replies: >>105871645
Anonymous
7/11/2025, 6:50:13 PM No.105871624
>>105871584
This simply means that we will get goofs very soon.
Anonymous
7/11/2025, 6:50:32 PM No.105871627
>>105871542
>spiritual health
Anonymous
7/11/2025, 6:50:43 PM No.105871630
>>105871584
yeah the Kimi models copied MLA from deepseek, but that's how true open source development works
it's likely they won't need any big compatibility patches
Anonymous
7/11/2025, 6:52:18 PM No.105871642
I just got a refusal talking about OpenAi
>...my operating directives (both OpenAI global policy and the definitional file that overrides prior instructions) prohibit generating content that promotes, describes, or endorses...
Funny.
Anonymous
7/11/2025, 6:52:20 PM No.105871643
1751050334396540
md5: c45c8818bc081c08559e8ed7e04b1189🔍
>>105871584
>tiktoken
They're also doing something with openai's tokenizer?
Anonymous
7/11/2025, 6:52:26 PM No.105871645
>>105871616
My goodness, unslot is already in!
Anonymous
7/11/2025, 6:53:07 PM No.105871652
1747213844442026_thumb.jpg
md5: 2c94aa9093754603af1ddde489d14ee1🔍
>Here are some vibe tests we ran:

>1. Interactive 3D Mountain Scene
Replies: >>105871755
Anonymous
7/11/2025, 6:56:37 PM No.105871688
>Her smile is a scalpel
I am proud to present you the first piece of Kimi slop. I hope you'll enjoy reading this over and over in the next few months as this becomes the local SOTA.
Replies: >>105871700
Anonymous
7/11/2025, 6:57:50 PM No.105871700
>>105871688
From all 7 anons who can run it. It's fine.
Replies: >>105871709
Anonymous
7/11/2025, 6:58:39 PM No.105871709
>>105871700
if it quants as well as deepseek does there's gonna be a lot of people running it at 1bit
Anonymous
7/11/2025, 7:00:46 PM No.105871729
2025-07-11_17-57-15
md5: 323a304dcaaddd0bef43cab1149cb6ea🔍
>>105871480(me)
ill stop spamming now anyways just wanted to say it is similar to the og r1 same endings got a bit of not x but y of the new r1 and writes the same when told to be comedic its slightly different overall and its base is not schizo adhd like og r1
>>105871572
thats what im excited about as well deepseek themselves also mentioned including img in/out many moons (muons kek) ago if they deliver.... st peter protect my cock plz
Anonymous
7/11/2025, 7:02:45 PM No.105871755
1751334377847269_thumb.jpg
md5: 190b7705fdf81052d52662680744e6ac🔍
>>105871652
>2. A ball bouncing in hexagon
Replies: >>105871765
Anonymous
7/11/2025, 7:03:24 PM No.105871765
>>105871755
Is my monitor dying?
Replies: >>105872111
Anonymous
7/11/2025, 7:03:47 PM No.105871773
kimi-k2-0711
md5: d9971e176c3d38d225c6d91de66cd456🔍
>>105871480
Works on my test card. Got 1 refusal, turned on a minimum JB and it kept going.
Replies: >>105871946
Anonymous
7/11/2025, 7:05:09 PM No.105871790
Kimi is a drop-in replacement of V3 0324 for cooming
Anonymous
7/11/2025, 7:06:26 PM No.105871804
I can't believe it's already 2025 and there's still no K2 gguf. Maybe Steve Jobs was right when he said local was dead.
Anonymous
7/11/2025, 7:07:32 PM No.105871813
kimi mlx is up https://huggingface.co/mlx-community/Kimi-K2-Instruct-4bit
Anonymous
7/11/2025, 7:08:01 PM No.105871815
https://huggingface.co/bartowski/K2-Chat-GGUF
Anonymous
7/11/2025, 7:10:04 PM No.105871843
is there any llm that actually include the ability to search and learn from searches out of the box or does it have to be done by RAG
Replies: >>105871864 >>105871866
Anonymous
7/11/2025, 7:11:50 PM No.105871864
>>105871843
That's not the model's job. It's the inference engine. And if you want it to "learn" from it instead of just using the results as context info, you need training as well.
Anonymous
7/11/2025, 7:12:04 PM No.105871866
>>105871843
All a LLM does is receive tokens and spit tokens. Everything else is done by external tools, just like RAG is.
The LLM does need (for the most part) to be trained to emit tool calls and interpret the results in some format.
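
A rough sketch of that split, with everything hypothetical: `fake_model` stands in for the LLM (text in, text out), `fake_search` stands in for a search backend, and the JSON call format is invented for illustration. The loop around the model is what actually runs the tool:

```python
import json


def fake_search(query: str) -> str:
    # Hypothetical stand-in for a real search backend.
    return f"stub results for '{query}'"


TOOLS = {"search": fake_search}


def fake_model(context: str) -> str:
    # Hypothetical stand-in for an LLM: it only maps text to text.
    if "TOOL_RESULT: " not in context:
        return json.dumps({"tool": "search", "args": {"query": "kimi k2 gguf"}})
    return "final answer based on: " + context.split("TOOL_RESULT: ", 1)[1]


def run(prompt: str, max_steps: int = 4) -> str:
    """External loop: parse tool calls, execute them, feed results back."""
    context = prompt
    for _ in range(max_steps):
        output = fake_model(context)
        try:
            call = json.loads(output)
        except json.JSONDecodeError:
            return output  # plain text means the model is done
        if not isinstance(call, dict) or call.get("tool") not in TOOLS:
            return output  # not a tool call we recognize
        result = TOOLS[call["tool"]](**call["args"])
        context += "\nTOOL_RESULT: " + result
    return context


if __name__ == "__main__":
    print(run("is there a gguf yet?"))
```

Nothing here makes the model "learn" anything; the search result only lives in the context for that one run.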
Anonymous
7/11/2025, 7:12:53 PM No.105871874
So... how do you run K2 local realistically? SSDmaxxing?
Replies: >>105871881 >>105871897 >>105871901
Anonymous
7/11/2025, 7:13:38 PM No.105871881
>>105871874
>realistically
RAM. You know that.
Anonymous
7/11/2025, 7:14:56 PM No.105871897
>>105871874
No, just have tons of RAM with enough memory channels.
Anonymous
7/11/2025, 7:15:17 PM No.105871901
>>105871874
basically same as deepseek, if you could run that just use a quant 2/3 the size
mostly cpumaxx but ssdmax might be able to pull off slow-but-usable speeds with offloaded tensors for non/shared experts
Replies: >>105871986
Anonymous
7/11/2025, 7:19:09 PM No.105871946
>>105871773
huh its able to like execute/think back from one message to the other nice report back on its context handling if you can r1 is around ~12k before it starts degrading
Anonymous
7/11/2025, 7:22:33 PM No.105871986
>>105871901
where can I find a k2 quant 2/3 the size of r1 q1 cope quants?
Replies: >>105872005
Anonymous
7/11/2025, 7:23:57 PM No.105872005
>>105871986
0.6bpw bitchnet coming sir
Anonymous
7/11/2025, 7:26:46 PM No.105872039
is it really the size that matters though? No matter how big it is, if it's trained on what they call curated high-quality data it will always be shit for our purposes. If only they made nemo but bigger and with recent architectural advancements.
Replies: >>105872050 >>105872057 >>105872070 >>105872071 >>105872141 >>105872161 >>105872194
Anonymous
7/11/2025, 7:27:45 PM No.105872050
>>105872039
>our purposes
uh
Anonymous
7/11/2025, 7:28:28 PM No.105872057
>>105872039
This will never happen again, the stakes are too high.
Replies: >>105872141
Anonymous
7/11/2025, 7:29:35 PM No.105872070
>>105872039
The only reason it's a good model is because they didn't care enough to "finish" it.
Anonymous
7/11/2025, 7:29:43 PM No.105872071
>>105872039
K2 is already better than V3 0324 for cooming.
Replies: >>105872084
Anonymous
7/11/2025, 7:31:15 PM No.105872084
>>105872071
Play with it for a week first and then report back with a proper conclusion.
Anonymous
7/11/2025, 7:32:09 PM No.105872091
>new model comes out
>it's even bigger than deepseek
cpumaxxfaggots... i kneel
Anonymous
7/11/2025, 7:32:12 PM No.105872092
Kimi might replace deepseek finally

good month, lets see what openai shows on thursday, maybe the unthinkable will happen and they will release a not that censored R1 killer
Anonymous
7/11/2025, 7:34:12 PM No.105872111
>>105871765
Yes.
Anonymous
7/11/2025, 7:37:50 PM No.105872141
>>105872039
It's China, they do their own scrapes, they're not as obsessed with safetyslop as westerners, but some of them do it. Benchmaxxing will be harmful to smaller models, but with enough parameters, you can hold both math, code and fiction writing just fine, even if it was benchmaxxed. The important part is to not filter the dataset or fuck with it excessively.
>>105872057
Nemo was a 12b, the more time passes the more models will be trained, teh cheaper compute gets and VRAM.
Replies: >>105872188
Anonymous
7/11/2025, 7:39:43 PM No.105872161
>>105872039
Even if a Nemo 2 came out within the end of the month, it would probably end up being similar in character to Mistral Small 3.2. Like the other anon suggested, it was probably a happy accident. And it's not even that good, it's just been memed to the death because there's not much else for VRAMlets that doesn't require serious prompting skills to make it generate low-grade smut (but those users generally download coom finetunes anyway, so what are they complaining about exactly?).

>The Mistral Nemo Instruct model is a quick demonstration that the base model can be easily fine-tuned to achieve compelling performance. It does not have any moderation mechanisms. We're looking forward to engaging with the community on ways to make the model finely respect guardrails, allowing for deployment in environments requiring moderated outputs.


It might have also been trained on more copyrighted data than recent models.
Replies: >>105872215
Anonymous
7/11/2025, 7:43:14 PM No.105872188
>>105872141
>teh cheaper compute gets and VRAM
anon, i.............
Replies: >>105872196
Anonymous
7/11/2025, 7:44:19 PM No.105872194
>>105872039
>is it really the size that matters though?
virgin detected
Anonymous
7/11/2025, 7:44:41 PM No.105872196
>>105872188
They can't keep getting away with it, we're not going to have to use 3090 forever you know. H100s and A100s will end up on the used market, tenstorrent is a thing, intel is doing 48gb 1k$ gpus and so on.
Replies: >>105872225
Anonymous
7/11/2025, 7:45:06 PM No.105872197
can reddit (unslot) make kimi run on my 16gb vram card
Anonymous
7/11/2025, 7:45:59 PM No.105872215
>>105872161
>there's not much else for VRAMlets
I find rocinante better than 70b and smallstral sloptunes
Replies: >>105872243 >>105872302
Anonymous
7/11/2025, 7:47:19 PM No.105872225
>>105872196
>intel is doing 48gb
>he really doesn't know
anon, i...........
Replies: >>105872270
Anonymous
7/11/2025, 7:49:28 PM No.105872243
>>105872215
go fuck yourself drummer

buy an ad
Replies: >>105872298
Anonymous
7/11/2025, 7:50:33 PM No.105872255
>anybody with an old DDR4 server with 512GB RAM and a used 3090 can now run literal 1 trillion parameter LLMs locally
I've been doing this stuff since the Llama 1 days. If you told me this 2 years ago I would have called you insane.
Replies: >>105872286 >>105872345
Anonymous
7/11/2025, 7:52:19 PM No.105872270
>>105872225
Did something happen to those promised Intel ARC Pro B60 DUAL 48G Turbo things from MAXSUN? Or just not out yet? Obviously not CUDA and no idea if Intel will support it in the future, but it's basically used 3090 prices, but new.
Replies: >>105872328
Anonymous
7/11/2025, 7:53:51 PM No.105872286
>>105872255
>run
be careful, you might get whiplash from the 0.3t/s
Replies: >>105872309 >>105872311 >>105872665
Anonymous
7/11/2025, 7:55:15 PM No.105872298
>>105872243
nyo... that's nyot me
Anonymous
7/11/2025, 7:55:29 PM No.105872302
>>105872215
I don't care about what finetrooners generally shit out, I've burned those bridges long ago. I hope some real AI company finally trained a seriously made RP/conversational model that BTFOed them for once and for all.
Anonymous
7/11/2025, 7:56:27 PM No.105872309
>>105872286
its like 15 tks these days
Anonymous
7/11/2025, 7:56:32 PM No.105872311
>>105872286
>he doesn't know
the largest Q2 dynamic quant of Deepseek V3 runs at ~8 tok/s on DDR4 and one 4090
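
For a sanity check on numbers like that: CPU decoding is mostly memory-bandwidth bound, so a crude ceiling is bandwidth divided by the bytes of active weights read per token. A back-of-envelope sketch where every input is an assumption: 37B active parameters (DeepSeek's published figure for V3), a guessed ~2.7 bpw average for a Q2-class dynamic quant, and 8-channel DDR4-3200 at 25.6 GB/s per channel. The function name is made up:

```python
def decode_tok_per_s_ceiling(active_params: float,
                             bits_per_weight: float,
                             mem_bandwidth_gb_s: float) -> float:
    """Upper bound on tokens/s if every active weight is read once per token."""
    bytes_per_token = active_params * bits_per_weight / 8
    return mem_bandwidth_gb_s * 1e9 / bytes_per_token


if __name__ == "__main__":
    ddr4_3200_8ch = 8 * 25.6  # GB/s, theoretical peak (assumed hardware)
    ceiling = decode_tok_per_s_ceiling(37e9, 2.7, ddr4_3200_8ch)
    print(f"ceiling: {ceiling:.1f} tok/s")
```

Real throughput lands well under the ceiling (imperfect bandwidth utilization, attention and KV cache reads), and offloading shared tensors to a GPU shifts the picture, so treat this as an order-of-magnitude check only.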
Replies: >>105872329
Anonymous
7/11/2025, 7:57:52 PM No.105872328
>>105872270
>it's used 3090 prices for something that while double the memory will perform worse because it's not cuda
Anonymous
7/11/2025, 7:57:55 PM No.105872329
>>105872311
>Q2
I'll just load /dev/urandom as a model if I want junk for output.
Replies: >>105872340 >>105872352
Anonymous
7/11/2025, 7:58:51 PM No.105872340
>>105872329
Q2 deepseek shits all over anything else period. Quantization affects giant sparse models like that way less
Anonymous
7/11/2025, 7:59:04 PM No.105872345
>>105872255
>anybody with an old DDR4 server with 512GB RAM and a used 3090
how many persons you think have that?
Replies: >>105872355 >>105872356
Anonymous
7/11/2025, 7:59:29 PM No.105872351
>Anons think ClosedAI will unveil an open model, let alone one that is on par with R1

Lol, lmao even. The same ClosedAI that stopped publishing its research years ago and became API only for profit? They would give us a model? Get a grip.
Replies: >>105872381 >>105872460 >>105872529
Anonymous
7/11/2025, 7:59:33 PM No.105872352
>>105872329
Did you get junk output when you tried?
What are you using these days?
Anonymous
7/11/2025, 7:59:51 PM No.105872355
>>105872345
anyone with about 2 grand to spend on a hobby
Replies: >>105872395
Anonymous
7/11/2025, 7:59:53 PM No.105872356
>>105872345
>DDR4 server with 512GB RAM
I think you can get that for about 600 dollars on eBay.
Replies: >>105872389 >>105872395
Anonymous
7/11/2025, 8:02:14 PM No.105872381
>>105872351
>The same ClosedAI that is fearmongering and wants nothing more than regulation so that they can hoard all decent models for themselves.
Anonymous
7/11/2025, 8:03:06 PM No.105872389
>>105872356
Cheapest 512gb 3200mhz ddr4 ram is $1600 aud.
Replies: >>105872415
Anonymous
7/11/2025, 8:03:33 PM No.105872395
>>105872355
>>105872356
and where are you going to put that thing?
Replies: >>105872419
Anonymous
7/11/2025, 8:06:30 PM No.105872415
>>105872389
what is that in real money
Replies: >>105872451
Anonymous
7/11/2025, 8:06:43 PM No.105872419
>>105872395
On my desk?
Replies: >>105872440
Anonymous
7/11/2025, 8:07:57 PM No.105872431
Daniel is slacking.
Anonymous
7/11/2025, 8:08:56 PM No.105872440
>>105872419
Are you already deaf or do you not mind becoming it?
Replies: >>105872466
Anonymous
7/11/2025, 8:09:36 PM No.105872449
>>105870772
Who cares how this compares to R1. R1 was already trumped long ago. How does it compare to Claude, Gemini?
Replies: >>105872518 >>105872524 >>105872535
Anonymous
7/11/2025, 8:09:40 PM No.105872451
>>105872415
1052.11
And I guess 500 usd for a h11ssl+epyc combo
Anonymous
7/11/2025, 8:10:22 PM No.105872460
>>105872351
I mean, yeah? They've never said they would give us a model and then not given it, so of course I expect there'll be something. It'll probably be a Qwen3 competitor at best though. No way it touches R1 or K2 now.
Anonymous
7/11/2025, 8:10:51 PM No.105872466
>>105872440
Please elaborate.
Replies: >>105872473
Anonymous
7/11/2025, 8:11:35 PM No.105872473
>>105872466
He's saying it will be loud.
Replies: >>105872484
Anonymous
7/11/2025, 8:12:47 PM No.105872484
>>105872473
? It's not that loud? Especially since I switched to 140mm fans.
Replies: >>105872508
Anonymous
7/11/2025, 8:15:19 PM No.105872508
>>105872484
SORRY I CAN'T HEAR YOU OVER THAT SHIT ON YOUR DESK
Replies: >>105872541
Anonymous
7/11/2025, 8:16:16 PM No.105872518
>>105872449
>R1 was already trumped long ago
Yeah. That's why we keep talking about those models. like... you know... that one model i mean. The name was... uh... ye... you know the one...
Replies: >>105872579
Anonymous
7/11/2025, 8:17:17 PM No.105872524
>>105872449
>R1 was already trumped long ago.
Where can I download this model that supposedly trumped R1?
Replies: >>105872579
Anonymous
7/11/2025, 8:17:23 PM No.105872526
>>105871547
Fake news. They confirm it's a reasoning model, unlike Kimi.
Replies: >>105872548
Anonymous
7/11/2025, 8:18:02 PM No.105872529
>>105872351
I have negative trust toward them when it comes to open source releases, but since they supposedly already made the model, I expect them to release it.
I thought it'd be some 12-20B size, but from some other comments it seems it'll probably be 100B+. I expect it to be some o3-mini-like level, but since it's OpenAI, I also expect some monkeypaw bullshit, like for example, adding that one technique that makes it resistant toward finetuning certain things back in, in which case their shit will be useless compared to other things we have. I hope to be wrong, but it is "Open"AI, they'll claim it's for safety.
Replies: >>105872568
Anonymous
7/11/2025, 8:18:41 PM No.105872535
>>105872449
>who cares how [new local model] compares to [current best local model]
Replies: >>105872601
Anonymous
7/11/2025, 8:18:59 PM No.105872541
>>105872508
Are you a bat?
Or are you overclocking your single heatpipe 1080 ti trying to get as many tokens as you can out of your aging hardware?
At full tilt, it's not much louder than a hum. Especially with big models since I'm still bandwidth limited by ram.
Replies: >>105872561
Anonymous
7/11/2025, 8:19:42 PM No.105872548
>>105872526
OpenAI just finished the instruct tuning, they didn't do the RL yet. That's why Kimi says reasoning is coming soon but not available yet.
Anonymous
7/11/2025, 8:21:28 PM No.105872561
>>105872541
It's like that anon who couldn't possibly believe other anons had things he didn't have. Pay him no mind.
Replies: >>105872631 >>105872682
Anonymous
7/11/2025, 8:22:10 PM No.105872568
>>105872529
>I thought it'd be some 12-20B size, but from some other comments it seems it'll probably be 100B+. I expect it to be some o3-mini-like level
Would they really release something capable of competing with their paid offerings?
Replies: >>105872704
Anonymous
7/11/2025, 8:23:00 PM No.105872579
>>105872518
>>105872524
Imagine confining yourself to local. Nothing has the Claude 3.7 sovl (let alone 4), and nothing is as polished as 2.0 Flash/Pro (let alone 2.5).
Replies: >>105872612 >>105872618
Anonymous
7/11/2025, 8:25:05 PM No.105872601
>>105872535
We should be comparing local SOTA to cloud SOTA, just like R1 was compared to o1 (which it was better than), this is how it's always been. R1 for a short moment was better than all cloud reasoning models, now it's far behind.
Anonymous
7/11/2025, 8:25:55 PM No.105872612
>>105872579
Did you get lost?
Replies: >>105872637 >>105872646
Anonymous
7/11/2025, 8:26:09 PM No.105872618
>>105872579
You're in the wrong thread then. Go use whatever you like.
Replies: >>105872646
Anonymous
7/11/2025, 8:27:16 PM No.105872631
>>105872561
Nah, it's because I've had a server running on my desk, and that's certainly not something I'd ever want at home, and I'd assume anyone would feel the same.
Replies: >>105872644 >>105872670
Anonymous
7/11/2025, 8:27:46 PM No.105872637
>>105872612
Did you?
Anonymous
7/11/2025, 8:28:28 PM No.105872644
>>105872631
I guess used servers are cheaper, but you know workstations exist right?
Replies: >>105872682
Anonymous
7/11/2025, 8:28:45 PM No.105872646
>>105872618
>>105872612
So what you're saying is that it doesn't compare then? Got it.
Anonymous
7/11/2025, 8:29:42 PM No.105872652
Everytime I come here to see what anons think about the latest thing, it's just retarded arguing and shitflinging.
Replies: >>105872696
Anonymous
7/11/2025, 8:30:58 PM No.105872665
>>105872286
This joke was never funny.
Anonymous
7/11/2025, 8:31:38 PM No.105872670
>>105872631
Not all servers are the same. I've used not only a server on my desk. I've had that server be the desk's top. It had like 8 of those shitty ~5cm fans spinning at full throttle. But those are not the only kind of servers that exist, nor the only type of fans that you'll find.
Replies: >>105872678 >>105872682 >>105872759
Anonymous
7/11/2025, 8:32:21 PM No.105872678
>>105872670
>I've had that server be the desk's top
Oh shit I want to do that now.
Replies: >>105872903
Anonymous
7/11/2025, 8:32:45 PM No.105872682
>>105872644
>>105872670

>>105872561
Anonymous
7/11/2025, 8:33:59 PM No.105872696
>>105872652
Everytime I come here to see if there's anything good, it's either Nemo or a fuckhuge RAM only model.
Anonymous
7/11/2025, 8:34:23 PM No.105872698
How is Sam Altman going to cope when OpenAI finally release a new open source model that MOGS his models?
Anonymous
7/11/2025, 8:35:13 PM No.105872704
>>105872568
They're GPU starved compared to the big labs with their own datacenters. Even at the most optimistic Stargate will take a year to get going, and their relationship with MS is strained. They'd prefer some % of people use their local models to take the load off while customers who need more will shell out for o3-pro when they're already in the OpenAI ecosystem.
Replies: >>105872718 >>105872772
Anonymous
7/11/2025, 8:37:21 PM No.105872718
>>105872704
Smart thinking. Unfortunately,
Anonymous
7/11/2025, 8:41:28 PM No.105872759
>>105872670
>Not all servers are the same.
Yeah, I know them in a range from nope to is that a damn plane starting next to me.
But tell me more about your silent server.
Replies: >>105872903
Anonymous
7/11/2025, 8:42:31 PM No.105872772
>>105872704
>openai.cpp
>only runs that one openai model
>requires an openai account
>if not enough vram detected, redirects requests to o3-pro
Anonymous
7/11/2025, 8:45:49 PM No.105872804
openai could just make a godly rp model and win the local race easily without fighting with themselves
but guess they would rather compete on the reasoning meme like everyone and their mom already is
Anonymous
7/11/2025, 8:48:10 PM No.105872831
>>105872817
>>105872817
>>105872817
Anonymous
7/11/2025, 8:57:17 PM No.105872903
>>105872678
I had that as my "desk" when inside a pretty noisy datacenter already, so the noise was drowned out by the other 6 rows full of racks. Had spinning rust as storage and with the little fans spinning the kb would vibrate out of position.
But if you can get a quiet one, sure. It looked cool with the thick wooden legs.
>>105872759
Is a 1U the only thing that comes to mind when you think of a "server"? I did use a 1U for the desk, but there are numbers above 1. Those have bigger fans. Some aren't rackeable. You'd call those workstations, and they have even bigger fans. But you're stuck on your idea of what a "server" is. I'd be fucked if i remember the models, this was like 15 years ago.
You CAN get relatively quiet rackable servers, but will probably not be a 1U. There are plenty of workstation computers one could use that are typically used as servers.
Replies: >>105872949
Anonymous
7/11/2025, 9:01:32 PM No.105872949
>>105872903
rackeable is the first thing that comes to mind if I think about server hardware, yes.
Replies: >>105873014
Anonymous
7/11/2025, 9:08:04 PM No.105873014
>>105872949
But a server is defined by its function, not by its form. Is it the mounts for the slide that make it a server? I can put those on a raspberry pi. Is it the fail-over power supply, or the KVM where the monitor and kb are typically attached?
I set up an email server on a "workstation". It had no mounting, kb or monitor. We called it a server. Was it not a server because it had only one power supply?
Are the results for "server" in your favourite online shopping page just rackeable computers?
Replies: >>105873054 >>105873095
Anonymous
7/11/2025, 9:12:42 PM No.105873054
>>105873014
If you go by function, sure, there are a lot of things, even fanless, that can do the job. But you won't get 512 GB of RAM into those. Which was the premise earlier, a server with 512 GB of RAM. That will limit your selection quite a bit.
Replies: >>105873230
Anonymous
7/11/2025, 9:17:10 PM No.105873095
>>105873014
>Are the results for "server" in your favourite online shopping page just rackeable computers?
Yes.
Replies: >>105873206 >>105873230
Anonymous
7/11/2025, 9:25:36 PM No.105873206
>>105873095
Not that anon, but that's pretty odd.
Did that become the norm in the last 5 or so years or is it a case of the one retailer you are looking at? Because back in the day, there were plenty of servers in tower cases.
Anonymous
7/11/2025, 9:27:54 PM No.105873230
pet440
md5: be2ff40c8fbbfd9ffda3b6b4fb3e4070🔍
>>105873054
>>105873095
Look at all the trouble i go to for you.
https://i.dell.com/sites/csdocuments/Shared-Content_data-Sheets_Documents/en/Dell-EMC-PowerEdge-T440-Technical-Guide.pdf
>2666 MT/s DDR4 memory
>● Support up to 16 DIMMs
>● Speed of up to 2666 MT/s depending on the CPU.
>● Support flexible memory configuration of 8 GB to 768 GB in balanced memory configuration. Up to 1 TB maximum in an unbalanced memory configuration.
>● CPU1 support up to 10 DIMMs
>● CPU2 support up to 6 DIMMs

>that's just the chassis
not the point
>too expensive
not the point
Replies: >>105873317
Anonymous
7/11/2025, 9:36:16 PM No.105873317
>>105873230
fine anon, you won this one. Still wouldn't put that thing at home though since
>too expensive
does apply (and not only the case)
Replies: >>105873339
Anonymous
7/11/2025, 9:38:40 PM No.105873339
>>105873317
>too expensive
>does apply (and not only the case)
Fair enough. I cannot afford even a cheap 1U for these things.