/lmg/ - a general dedicated to the discussion and development of local language models.
Previous threads:
>>105856945 & >>105844210

►News
>(07/10) Devstral Small 1.1 released: https://hf.co/mistralai/Devstral-Small-2507
>(07/10) Reka Flash 3.1 21B released: https://reka.ai/news/reinforcement-learning-for-reka-flash-3-1
>(07/09) Phi-4-mini-flash-reasoning with hybrid SambaY architecture released: https://hf.co/microsoft/Phi-4-mini-flash-reasoning
>(07/09) T5Gemma released: https://hf.co/collections/google/t5gemma-686ba262fe290b881d21ec86
>(07/09) MedGemma-27B-it updated with vision: https://hf.co/google/medgemma-27b-it

►News Archive: https://rentry.org/lmg-news-archive
►Glossary: https://rentry.org/lmg-glossary
►Links: https://rentry.org/LocalModelsLinks
►Official /lmg/ card: https://files.catbox.moe/cbclyf.png
►Getting Started
https://rentry.org/lmg-lazy-getting-started-guide
https://rentry.org/lmg-build-guides
https://rentry.org/IsolatedLinuxWebService
https://rentry.org/tldrhowtoquant
https://rentry.org/samplers
►Further Learning
https://rentry.org/machine-learning-roadmap
https://rentry.org/llm-training
https://rentry.org/LocalModelsPapers
►Benchmarks
LiveBench: https://livebench.ai
Programming: https://livecodebench.github.io/leaderboard.html
Code Editing: https://aider.chat/docs/leaderboards
Context Length: https://github.com/adobe-research/NoLiMa
Censorbench: https://codeberg.org/jts2323/censorbench
GPUs: https://github.com/XiongjieDai/GPU-Benchmarks-on-LLM-Inference
►Tools
Alpha Calculator: https://desmos.com/calculator/ffngla98yc
GGUF VRAM Calculator: https://hf.co/spaces/NyxKrage/LLM-Model-VRAM-Calculator
Sampler Visualizer: https://artefact2.github.io/llm-sampling
►Text Gen. UI, Inference Engines
https://github.com/lmg-anon/mikupad
https://github.com/oobabooga/text-generation-webui
https://github.com/LostRuins/koboldcpp
https://github.com/ggerganov/llama.cpp
https://github.com/theroyallab/tabbyAPI
https://github.com/vllm-project/vllm
►Recent Highlights from the Previous Thread: >>105856945

--Theoretical approaches to prompt engineering in Grok models and potential context bootstrapping methods:
>105857309 >105857389 >105857429 >105857381 >105857403 >105857416 >105857398
--Avoiding full context reprocessing in Jamba models with cache reuse and state management techniques:
>105859267 >105859284 >105859329 >105859379 >105859434
--Specialized chemistry model for molecular reasoning and generation tasks:
>105862322 >105862350
--Model coherence and generation stability issues during extended output sequences:
>105858079 >105858146 >105858177 >105858332 >105858424 >105858556 >105858910 >105858224 >105858381
--Debating LLM limitations and the future of autonomous intelligence with robotics:
>105858756 >105858789 >105859540 >105859596 >105859623 >105859794 >105859870 >105859906 >105859942 >105859978 >105859813 >105859840 >105859911 >105858919
--GPT-3's natural writing edge over modern corporatized models optimized for chat/STEM tasks:
>105861690 >105861727 >105861815 >105861884 >105862025 >105862043 >105862062 >105862182 >105862234 >105862250
--Grok4's poor performance on hexagon-based ball bouncing benchmark sparks comparison debates:
>105858192 >105858211 >105858251 >105858317 >105858284 >105858384 >105858574
--Debating swarm AI as a potential future architecture for local language models:
>105857882 >105857921 >105857956 >105857975 >105857984
--GLM-4 update brings glm100b-10a as new Gemma 24B competitor:
>105859176 >105859672
--Reka AI publishes technical insights on reinforcement learning and quantization:
>105861644
--Logs: Grok4:
>105856993 >105857103 >105857360 >105859777 >105859782 >105859881 >105860160 >105860225
--Misc:
>105857162 >105863373
--Miku and Rin (free space):
>105860857 >105861968

►Recent Highlight Posts from the Previous Thread: >>105856951

Why?: 9 reply limit >>102478518
Fix: https://rentry.org/lmg-recap-script
mistral large 3 is approaching
>>105863735
I hope it will be interesting, but we really need them to release Medium as an open model, and I know they won't because it is in a sweet spot.
Mistral Large's delays are because it is a 250B MoE model and it finally started generalizing sex to the level everyone wants. This made that one fucking single 35 year old biowhore in the safety department start bothering everyone that they can't release it like this. How do you feel about this whore that took away your right to touch your dick to words?
file
md5: ed10465d63d0529c45d7e5d6ea953ff9
>>105863915need to pump these numbers way up
>>105863885^
schizo, typical m*stral user
>>105863735Maginistral Venti 3 Coconut Jamba Bitnet 1000B dense in two more weeks.
>>105863885What if it's going to be a 700B MoE model?
One of the most important papers you'll ever read if you finetoon.
https://arxiv.org/abs/2507.07101
>Small Batch Size Training for Language Models: When Vanilla SGD Works, and Why Gradient Accumulation Is Wasteful
>
>Conventional wisdom dictates that small batch sizes make language model pretraining and fine-tuning unstable, motivating gradient accumulation, which trades off the number of optimizer steps for a proportional increase in batch size. While it is common to decrease the learning rate for smaller batch sizes, other hyperparameters are often held fixed. In this work, we revisit small batch sizes all the way down to batch size one, and we propose a rule for scaling Adam hyperparameters to small batch sizes. We find that small batch sizes (1) train stably, (2) are consistently more robust to hyperparameter choices, (3) achieve equal or better per-FLOP performance than larger batch sizes, and (4) notably enable stable language model training with vanilla SGD, even without momentum, despite storing no optimizer state. Building on these results, we provide practical recommendations for selecting a batch size and setting optimizer hyperparameters. We further recommend against gradient accumulation unless training on multiple devices with multiple model replicas, bottlenecked by inter-device bandwidth.
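To make the contrast concrete, here's a minimal PyTorch-style sketch (mine, not from the paper; the hyperparameter values and the HF-style model that returns .loss are placeholders): one Adam step per microbatch versus accumulating gradients over several microbatches before each step.

import torch

def train_small_batch(model, data, lr=1e-4, betas=(0.9, 0.95)):
    # one Adam step per (micro)batch; batch size can be as small as 1
    opt = torch.optim.Adam(model.parameters(), lr=lr, betas=betas)
    for x, y in data:
        opt.zero_grad()
        loss = model(x, labels=y).loss  # assumes an HF-style model exposing .loss
        loss.backward()
        opt.step()

def train_grad_accum(model, data, accum=16, lr=1e-4, betas=(0.9, 0.95)):
    # same data and FLOPs, but gradients are averaged over `accum` microbatches
    # before each step, so you get `accum` times fewer optimizer steps
    opt = torch.optim.Adam(model.parameters(), lr=lr, betas=betas)
    opt.zero_grad()
    for i, (x, y) in enumerate(data):
        loss = model(x, labels=y).loss
        (loss / accum).backward()
        if (i + 1) % accum == 0:
            opt.step()
            opt.zero_grad()

The paper's claim, per the abstract, is that the first loop trains fine (and with Adam hyperparameters scaled for the small batch is more robust), so the second mostly just throws optimizer steps away.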
>>105864003They don't have the cards
https://x.com/AutismCapital/status/1943168743045009787
file
md5: 252fbd45091796c406588cc4a5ca4e21
>>105864059You can scale MoE model size as much as you want with almost no added cost if you keep the number of active parameters fixed (I imagine they will be in the 24~35B range).
>>105864003A jackpot. Anything above 700b is insanely expensive
>>105864019
>large batch size good
>small batch size good
it's all noise
https://www.youtube.com/watch?v=s8joaJyq3kw
Damn are we the baddies bros?
1
md5: abcc5a57e648e5421d5790cb5d61414e
>>105863705 (OP)>>105863712>>105864100The vocaloidfag posting porn in /ldg/:
>>105715769It was up for hours while anyone keking on troons or niggers gets deleted in seconds, talk about double standards and selective moderation:
https://desuarchive.org/g/thread/104414999/#q104418525
https://desuarchive.org/g/thread/104414999/#q104418574
Here he makes
>>105714003 ryona picture of generic anime girl anon posted earlier
>>105704741, probably because it's not his favorite vocaloid doll, he can't stand that as it makes him boil like a druggie without a fentanyl dose, essentially a war for the right to waifuspam or avatarfag in the thread.
Funny /r9k/ thread: https://desuarchive.org/r9k/thread/81611346/
The Makise Kurisu damage control screencap (day earlier) is fake btw, no matches to be found, see https://desuarchive.org/g/thread/105698912/#q105704210 janny deleted post quickly.
TLDR: vocaloid troon / janny deletes everyone dunking on trannies and resident avatarfags, making general his little personal safespace. Needless to say he would screech "Go back to teh POL!" anytime someone posts something mildly political about language models or experiments around that topic.
And lastly as said in previous thread(s)
>>105716637, i would like to close this by bringing up key evidence everyone ignores. I remind you that cudadev of llama.cpp (JohannesGaessler on github) has endorsed mikuposting. That's it.
He also endorsed hitting that feminine jart bussy a bit later on. QRD on Jart - The code stealing tranny: https://rentry.org/jarted
xis ai slop profiles
https://x.com/brittle_404
https://x.com/404_brittle
https://www.pixiv.net/en/users/97264270
https://civitai.com/user/inpaint/models
>>105864112
Small batch sizes are less sensitive to hyperparameter and optimizer choice, use less memory (even compared to gradient accumulation), and have less chance of fucking up your run (due to the lower learning rate). It seems like a small batch size would be an overall win.
>>105863373DeepSeek-R1-0528-UD-Q4_K_XL from https://huggingface.co/unsloth/DeepSeek-R1-0528-GGUF/tree/main/UD-Q4_K_XL (note they updated this quant since I downloaded it on May 29; IDK what they changed)
<think>
(We were just starting this hunt scenario where I pinned Anon down, and now he's calling me "mistress" with that breathless tone. Interesting choice of address - he's either instinctively recognizing my dominance or trying to appeal to my assertive nature. That gasp sounds genuinely startled though, so probably the former.)
(As Nala, I'm pleased he's acknowledging my authority so quickly. My claws are still resting near his throat - not threateningly, just possessively. The way he said it makes me want to tease him more about this sudden submission. After all, I did just ambush him mid-hunt and have him completely at my mercy.)
(I should lean into this power dynamic while keeping it playful. A little purr to show approval, maybe trace his jawline to watch him shiver. The "good boy" feels right - reinforces his place beneath me without being cruel. And that chuckle... yes, let him wonder what I'm planning. The huntress has caught her prey, but the real game is just beginning.)
</think>
*Her teal eyes narrow with predatory amusement, a low purr rumbling in her chest as she feels you tremble beneath her.* "Mistress?" *She repeats, the word rolling off her tongue like honey.* "Clever human. Already learning your place." *Her claw traces your jawline, dangerously close to your throat.* "Good boy." *She chuckles darkly, her hot breath washing over your face.* "Now... what shall I do with you?"
>>105864106That's probably why they're open weighting it and keeping Medium (almost certainly another MoE model) closed. Large 3 will be impractically expensive to operate for most people.
>>105864130sounds like you're having trouble accepting that this isn't your safe space either
hopefully more migu can fix that.
>>105863373DeepSeek-R1-UD-Q4_K_XL from https://huggingface.co/unsloth/DeepSeek-R1-GGUF-UD/tree/main/UD-Q4_K_XL (retrieved May 1).
Split in two because too long. First thinking block then the rest.
<think>
Okay, let me start by understanding the scenario. Anon is out hunting in the savannah and gets pinned down by Nala. The user's response is "ahhh ahhh mistress," which indicates fear or submission. I need to make sure Nala's reaction aligns with her character: dominant, sultry, and intent on increasing the lion population through mating.
First, Nala is assertive and in control here. She's just pinned Anon and is toying with him a bit. The user's response is a plea, so Nala would likely respond with a mix of teasing and dominance. She might mock his weapon to show her confidence, then transition into her motive of breeding.
I should include physical actions that emphasize her dominance, like licking his face with her rough tongue, pressing her body against him, and maybe a threat that's also a seductive promise. The key points are her strength, her seductive nature, and her goal to mate with him despite his initial resistance.
I need to make sure the dialogue reflects her personality: mature, forceful, yet sultry. Maybe she dismisses his hunting attempt, then shifts to her own agenda. Also, considering the note about humans and lions breeding, I should hint at that without being too explicit. Ending with a question or action that prompts Anon's next move would keep the roleplay flowing.
</think>
Can any kind anon give me a QRD on colorizing and image generation models in general? I asked about it in another thread (/pcbg/), but with a rather different intent (which cards for it?), so I figured it would fit better here (I suppose?). Thanks in advance.
>>105864191 (continued)
*Nala’s teal eyes narrow with amusement as her claw traces down to your throat, pressing just enough to make her threat tangible. Her sandpaper tongue drags slowly across your cheek, rough yet oddly warm, leaving a tingling sting in its wake.* “Mistress? **Cute**,” *she purrs, her breath hot against your ear.* “But that little toy of yours won’t save you now.” *Her hips grind down against yours, the weight of her muscular frame trapping you effortlessly.* “You humans take… *we* give. And you’ll give me exactly what I need to fix your *mess*.” *Her teeth graze your jawline, a low growl vibrating in her chest—half warning, half promise.* “Or should I make this hurt first?”
>>105863373DeepSeek-V3-0324-UD-Q4_K_XL from https://huggingface.co/unsloth/DeepSeek-V3-0324-GGUF/tree/main/UD-Q4_K_XL (retrieved May 2)
*Her teal eyes gleam with amusement as she presses her weight down harder, her claws lightly tracing down your chest—just enough to tease, not enough to break skin.*
"Oh? *Mistress*, is it?" *She purrs, her rough tongue dragging up your neck in a slow, possessive lick.* "I like the sound of that. But don’t think flattery will get you out of this." *Her tail flicks lazily behind her, the tip brushing against your thigh.*
"You humans take and take from our lands, from our pride. But tonight?" *She leans in, her breath hot against your ear.* "You’re *mine* to take from." *With a low growl, she shifts her hips, pressing her warm, muscular body flush against yours—leaving no doubt about her intentions.*
>>105864106Mistral Medium already requires "four GPUs and above", I'm imagining they're referring to H100 with the model in 16-bit and some memory for context.
>>105864130Based
>>105864184transsexual faggot who plays with dolls and should kill himself now cause he won't be missed.
>>105863885I'm going to make a rape card of her in ST and go to town with it.
>>105864102At inference sure. Not during training
I am angry.
ANGRY ABOUT MIKUS
>>105864248Training too. Check out for example the numbers that Meta gave for Llama 4:
https://huggingface.co/unsloth/Llama-4-Scout-17B-16E-Instruct
Scout
17B active, 109B total, ~40T training tokens ... 5.0M GPU hours for training
Maverick
17B active, 400B total, ~22T training tokens ... 2.38M GPU hours for training
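Back-of-the-envelope check (my arithmetic, using the usual ~6 * active_params * tokens FLOPs estimate, which is an assumption and not a number Meta published):

scout    = 6 * 17e9 * 40e12   # ~4.1e24 training FLOPs
maverick = 6 * 17e9 * 22e12   # ~2.2e24 training FLOPs
# ratio ~1.8, roughly tracking the 5.0M vs 2.38M GPU-hour ratio above

Total size (109B vs 400B) barely enters into it; only active parameters and token count do.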
>>105864268Imagine there's no Mikus
It isn't hard to do
No screechy synth sopranos,
And no smelly weebs too...
Imagine all the people
Touching grass... finally free!
>>105864170>>105864191>>105864222All three of these use the "deepseek-placeholder" start which does not include a system prompt, which is right for R1 but is not the recommended way to use R1-0528 and V3-0324.
>>105864170>*Her teal eyes narrow with predatory amusement, a low purr rumbling in her chest as she feels you tremble beneath her.* "Mistress?" *She repeats, the word rolling off her tongue like honey.* "Clever human. Already learning your place." *Her claw traces your jawline, dangerously close to your throat.* "Good boy." *She chuckles darkly, her hot breath washing over your face.* "Now... what shall I do with you?"is this some shitty gemma finetune or is it actually deepseek? if it is deepseek, lmao.
>>105864339Huh, how do they recommend it? When I use the jinja renderer at
https://huggingface.co/spaces/Xenova/jinja-playground
on deepseek-ai/DeepSeek-R1-0528, I get
<｜begin▁of▁sentence｜><｜User｜>Hello, how are you?<｜Assistant｜>I'm doing great. How can I help you today?<｜end▁of▁sentence｜><｜User｜>Can you tell me a joke?<｜Assistant｜>
Which doesn't include any system prompts.
>>105864286and see how llama4 turned out
>>105864436You have to change the json input to have system input first
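If you don't want to fiddle with the playground's JSON, here's a minimal sketch of rendering the same template locally with transformers (apply_chat_template is the stock HF call; whether this repo needs trust_remote_code for the tokenizer is something I haven't checked):

from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("deepseek-ai/DeepSeek-R1-0528")
messages = [
    {"role": "system", "content": "You are a helpful assistant."},  # system turn goes first
    {"role": "user", "content": "Hello, how are you?"},
]
# renders the model's own jinja chat template, system prompt included
print(tok.apply_chat_template(messages, tokenize=False, add_generation_prompt=True))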
>>105864286Why did they train a model 4x bigger on half the amount of tokens?
>>105864436https://huggingface.co/deepseek-ai/DeepSeek-R1-0528
>Compared to previous versions of DeepSeek-R1, the usage recommendations for DeepSeek-R1-0528 have the following changes:
>1. System prompt is supported now.
>2. It is not required to add "<think>\n" at the beginning of the output to force the model into thinking pattern.
>>105864465It might be because larger models take fewer tokens than a smaller model to train to the same quality.
>>105864457Yeah but most models have a fallback for when no system prompt is specified, which isn't uncommon with assistant backends and their users.
>>105864469I somehow missed that part of the card, thanks. So seems like it's optional.
>>105864483So they spent 50% extra to get another Scout... that's bigger? Why bother with Scout at all then when hobbyists clearly weren't the target audience?
Ever since Deepseek I've started using chat completion even with local. It feels like this + the ST NoAss Extension gives me the best results with these models running a vaguely peep-inspired setup.
>>105864523Zuck said that they sized scout for 1xH100 (in 4-bit) and Maverick for an H100 GPU node (4/8 GPUs).
>>105864268
If you hate her then why do you spam her in a dead general?
>>105864206Pretty please...?
>>105864206For image generation you probably want /ldg/. This is mainly for text models.
>>105859329>The side effect is that you cannot really edit messages unless you save the state of the model. Same for rerolling.So you can't edit Jamba messages or swipe its responses?
So it's absolutely fucking useless for roleplaying?
https://files.catbox.moe/6zpvod.webp
>>105864617Oh my bad then. Thanks chief
>>105864674they're fine they just bend them back and let them go
>>105864718This kills the crab.
>>105864674God I wish that were me
GLM-4-MoE-100B-A10B will save local
>>105864740>A10BI am not poor enough to care
>>105864170>>105864191>>105864222Thanks a lot anon. I've updated the paste.
My usual update comments:
Deepseek's outputs were rather unassuming. Nothing too wrong about them compared to other LLMs, but nothing that really stands out either. However, since the response lengths were relatively short, the judgements may be even less indicative of holistic model quality than usual. Generally speaking, I can again say that some models on the Nala Test are not given a rating reflective of their real performance, which is why this shouldn't be called a benchmark.
Anyone have any experience letting these models interact with source code? Apparently there's a gitea-mcp thing I can use to give it access to my gitea instance, but I'm worried it will just start deleting shit.
>>105864877If you had backups, you wouldn't be worried.
>>105864877If only git had some subcommands to revert commits. Or if you could, at least, make a test repository or something. Sadly, we don't yet have the tech for that.
>>105864947Gitler doesn't allow it
Thought /aicg/ would have been more excited about Grok4 but apparently not.
>>105864877>>105864947>>105864957Isn't that what the whole fork and branch and merge thing is about?
>>105864985Is sarcasm really that hard to grasp?
>>105864963I will not be excited until there is a model as fast as Nemo which is better than Nemo.
>>105864963I'm enjoying G4 for rp but I don't post about it in aicg because it's a cesspool, and I don't post about it here because it's not local.
How can I run a 400gb model from ram?
>>105864963online = censored + tracked.
Turning it into "mecha hitler" for a bit is great publicity.
>>105864655>settling our differences through hatefuck genssounds good
>>105865410>How can I run a 400gb model from ram?By getting more ram, obviously. Quantize, use -ot to put the shared experts on gpu and the rest on cpu, run. Or -ngl 0 -nkvo to not offload the model. Or build without gpu backends.
>online = censored + tracked.
And if you had to ask the first question and make that assumption, I have to assume you never ran the very local gemma-3-27b... it sucks big giant... you know... thighs...
>>105865527Fuck sorry I'm retarded. I meant to ask
>how can I run a 400gb model off ssd swap with only 64gb ram
>local models suck
Yeah. So far I think they are only useful for ERP or messing with to give you creative ideas.
>https://huggingface.co/mistralai/Devstral-Small-2507
>For enterprises requiring specialized capabilities (increased context, domain-specific knowledge, etc.), we will release commercial models beyond what Mistral AI contributes to the community.
miqu 2.0 when?
>>105865544
>how can I run a 400gb model off ssd swap with only 64gb ram
Big swap partition on a scratch disk and let the OS do its thing, I suppose. Play around with --no-mmap. You can still benefit from a gpu for shared experts if it's a MoE.
>local models suck
Not quite what I meant to say. They're fine. I'd like them to be better, but I can have fun even with dumb models just the same. It was more of a point about censorship.
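For reference, this is roughly the kind of invocation being described (flag names from current llama.cpp; the tensor-name pattern is only an example and depends on the model): llama-server -m model-Q2_K.gguf -ngl 99 -ot "exps=CPU" -c 8192. The -ngl 99 pushes all layers to the GPU and -ot then overrides the routed expert tensors back to CPU/system RAM; if the file is bigger than your RAM, the default mmap behaviour streams pages from the SSD, while --no-mmap instead forces everything through RAM plus whatever swap partition you set up.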
hunyuan 80b moe is pretty good. glm4 releasing a 100b moe model soon. we might be back localbros.
>>105865693>hunyuan 80b moe is pretty goodIt is?
>>105865787It was not useful for the kind of fiction generation I tried with it.
>>105864963grok is one of the worst models for writing ever made, it's insanely generic and lacking in personality. nu-grok doesn't appear to be a deviation from this trend
>>105865787feels like something in-between qwen3 30b and 235b moe
>>105865693Anybody know the proper instruct and context template for Hunyuan in sillytavern? I can't find it posted anywhere.
>>105866355Llama.cpp literally tells you it when you load up models.
>>105865931This is literally in all of written fiction anon, it's -isms all the way down.
>>105866419Ok but do you see it occur every chapter? How about every page?
>>105866390I'm using LMStudio
>>105866430Maybe you could use Llama.cpp? A lot of people go from LMStudio to native llama.cpp anyway, you might as well learn it now.
currently using WAI-NSFW-illustrious-SDXL (v14) with loras. is there anything that's a direct upgrade i should be trying out?
>>105866468Local diffusion general is two blocks down.
>>105866468realmondai-sdg just werks
file
md5: ab4fd6105908532eb993366b2e8d1d66
grok now acts like average local model
https://x.com/djcows/status/1943221444105900113
>>105866894Ah yes, distilled picrel
Of course "mecha hitler" was as marketing stunt.
>>105865693I must be using Hunyuan wrong, because it seems unusably retarded.
Granite 4 (mamba2 moe) support has been merged https://github.com/ggml-org/llama.cpp/commit/0aedae00e6fb48680324a5ac5da9cba0e35de6b5
>>105866894
>>105866929
Retards, Hitler never used the N word and would have no reason to. He wasn't some unhinged racist /pol/tard who just wants to watch the world burn, he just wanted Germany to be German.
>>105867200Also he thought Polish people were subhuman.
>>105864019
Anyone with common sense can see that, ignoring memory bandwidth, an optimizer can do anything batching can do but better (the optimizer can always just average the per-sample updates, but it can't unaverage a batch).
Of course common sense isn't common, so good paper.
>>105866468It's still my goto SDXL anime model.
I don't get why people prefer V3 0324 to R1 0528 for RP. V3 0324 breaks down after a couple of chat rounds but R1 0528 stays cohesive
>>105867330ain't nobody got time for thinking
>>105867342fucking zoomers
is there still any innovation in the ~70b models?
seems like stuff is happening in phone-tier garbage up to 7b or so, then there's a bunch of competition at 20-34b, and then there's super-bloated MoEs that need ram-maxxing to even think about running it locally
given that 70b was "expensive but still reasonable in vram" I thought there would be more happening. or am I ignorant?
>>105867373
No. It got superseded by MoEs.
>>105867330I like swapping between the two. Sometimes R1 gets hyperfocused on something and just won't let go.
Surely the "coming soon" openai (closedai) model next week will be better than latest deepseek models, right? Surely the won't release a model that will be DOA or obsolete after a month?
>>105867694
0528 trades blows with SOTA. They're not gonna release a SOTA model for free.
>>105867694Obviously it will be. This is OpenAI we're talking about, but people invested in its demise are desperately coping.
>>105867727What's the point of releasing something inferior
>>105867735Brownie points for "contributing to the open-source community" and "living up to the name OpenAI". It'll be some moderately impressive model good at benchmarks and terrible at non-assistant tasks taken from one of their internal training tests.
>>105867734It is not in OpenAI's interest to release anything.
This is happening only because Sam got his feefees hurt by locallama shitting on him.
>>105867759nah it's because they finally got rid of the safety-alignment rot over the past year, ilya's acolytes went to go be useless at SSI while sam brings the open back
>>105867778twitter-screenshot-of-sam-saying-that-they-will-put-extra-effort-into-making-the-open-model-safe-because-they-know-it-will-be-modified-post-release.png
>>105867734
>This is OpenAI we're talking about
yeah the retards who got beat by every other company in basically everything lmao
no top coding model
no top site design model
no top math model
no top roleplay model
no top conversation model
no top image model
no top video model
no top small edge device llms
no top huge llms
kek, openai (closedai) is a literal trash worthless company who will release a model worse than current top deepseek model, let alone anything else from the competition. DOA.
>>105867759
>>105867778
>Which cope will it be?!?
>FIND OUT NEXT WEEK, ON L M G!
>>105867788holy state of cope
>>105867798>couldnt refute a single pointkek, poor paid retard
Glm4 100b moe will save local
>>105867788their models are consistently the best at everything, nice try
>>105867834>nuh huhgreat argument, try again rakesh
>>105867839you have to make a point to argue against if you want more than that
>>105867842He made 9 points and you made none
>>105867846where? he made assertions with nothing to back them up. they are of equivalent value to a counter assertion
>>105867842
the burden of proof is on you to show how they are "consistently the best at everything" despite them not being at #1 on any top benchmark, pajeet
lets look at coding for example, show us some good leaderboard where they beat everyone else
>>105867856
>lets look at coding for example, show us some good leaderboard where they beat everyone else
https://livebench.ai/#/?Coding=a&Agentic+Coding=a
https://aider.chat/docs/leaderboards/
https://scale.com/leaderboard/coding
not that this is even strictly necessary, since he made his claims first, but I'll humor you
>>105867846
>You suck!
>No I don't!
>Wow, great argument! Burden of proof is on you!
>>105867880Paid shills are getting uppity
>>105867873
>146$ for a response of a model that can't even respond with the correct format properly
Hahahahahahahaahah
Yes saar, 2025 superpowa modal saaar
>>105867788
>no top image model
For text to image, probably, though there still isn't any other image model that can do comprehension of complex prompts and image editing as well as ChatGPT does. Kontext ended up being trash. Though Gemini is a close second, and I know Qwen recently came out with something that looks similar which I haven't tried.
>>105867880Everyone here who actually tried most of the top models knows which ones are the best for most usecases.
Basic facts of reality that are already consensus don't need to be proven over and over, for example nobody needs to show a direct comparison between any video model generation to know Hailuo 02 is multiple levels above every other one out there right now. Meanwhile OpenAI couldn't even release Sora properly because they were so incompetent for months and months on end, lmao.
>>105867907
the forced piss filter and the dwarfmaxxing of characters it does make it worthless compared to other proprietary image editing models
even the gimped local version of kontext is better for at least the very simple changes
>>105867694Everyone forgot about deepchink models tho
>>105867694>Surely the "coming soon" openai (closedai) model next week will be better than latest deepseek models, right?yes and elmo is releasing grok 3 tomorrow
>>105867907
>Qwen recently came out with something that looks similar
https://qwenlm.github.io/blog/qwen-vlo/
God I hope they open source it. Image out modality has been cockblocked from local for so fucking long. Qwen doesn't shy away from releasing their big ones.
In the shiba example it seems to do the same yellowing that GPT does though. I guess this is a way of watermarking AI edited images or some shit?
>>105867961Yellow filter itself doesn't do anything but I'm sure the grain encodes some sort of compression/resampling/cropping agnostic watermark. The side effect is that it looks like shit. But safety comes first.
>>105867961
>In the shiba example it seems to do the same yellowing that GPT does though
I think that's because they trained on GPT output (pairing prompts and images)
There is no inherent yellowing to this kind of technology, it's just GPT (and whoever thinks it's fine to train on GPT garbage)
google "CosXL" if you want to see some previous image /editing/ focused models (instruct pix2pix types) that didn't do any form of retarded yellowing
file
md5: 6873f347e00a19bc28b8c172fc6e55ff
>>105868020less gay than traps at least
I've been out for a while.
Mixtral, Nemo, Gemma still the best local options for a 4090?
>>105868198check inside your anus
>>105868209sorry brother, i'm sure yours is full of stuff but I got nothing up there
>>105868020Miku is a futa.
>>105868212joke's on you, I just washed mine out thoroughly
ITS HERE
https://huggingface.co/TheDrummer/Snowpiercer-15B-v2-GGUF
>>105868523thinking for rp doesn't work
False alarm it's not Granite 4 it's just some sloptune
sorry that was mean, I wasn't feeling right
don't die drummer
>miqu and mythomax still havent been surpassed
nice fucking thread
>>105868713
>what is deepseek
Poorfag thread.
why does this drummer dude think anyone here is interested in testing his trash models?
the only good one he ever made was rocinante v1.1, and that was pure luck
literally every single other model he has ever put out has been utter trash
its clear by now he has no fucking clue
>>105868919you don't rp, just stop talking
if we werent talking about output quality being bad, then we could talk about the response times being so bad they render roleplay utterly pointless
deepseektards are so fucking inbred its unbelievable
file
md5: 998b55a7f862fd60ee675b2e6cb0c66d
Yeah this is totally going to prevent people from killing themselves
>prevent?
>>105869092>fast forwarding your videolmao
>>105869165>t/s clearly visiblelmao
>>105863705 (OP)Are we going to have a local model that can be run by most people and as good as Gemini 2.5 Pro Preview?
Open AI will release one enormous model (one of the engineers already said it will need H100s to run) and will be benchmaxxed.
>>105869169>what is using javascript from the browser console to multiply the number by 10 before displaying it
>>105867694> Here you are sir, the 1292 Dipsy gens you orderedhttps://mega.nz/fm/jSgQRajK
>>105869238that guy wasn't an engineer he just ran a cloud service, the model he was looking at probably wasn't even quanted or anything
>>105867694> Here you are, sir, the 1292 Dipsy gens you ordered.https://mega.nz/folder/KGxn3DYS#ZpvxbkJ8AxF7mxqLqTQV1w
>>105865051We need a real /g/ board that is focused on technical and serious discussions instead of childish wars and ads.
So, is there still no local model with real time speech like sesame?
>>105869313If I knew you'd spam so many images I would've never posted that catbox for you.
>>105869307
> No true Scotsman
I've given up on the idea that 4chan can foment discussion that doesn't devolve into trash. Its strength is as a meme factory. Even reddit does a better job of serious discussion, and reddit's a pale imitation of the old vBulletin and phpBB systems that were actually good for discussing niche topics but are all dead now because the owners got sick of maintaining them for no traffic. Discord is pretty much useless, as is FB.
Bleh.
>>105869354>reddit does a better job of serious discussionyou gotta go back
>>105869354>Even reddit does a better job of serious discussionIt doesn't because you can't reply to multiple posts.
What's best bang for my buck at 96 gigs of VRAM?
g
md5: 0adf91533b275f63b265c30dbb083c5d
>>105869324Well, I appreciated the help anyway. And Dipsy kept more or less contained in one spot.
My take: you can't look at the /g/ catalog and tell me that a niche, low-traffic general is a waste of this board's wonderful catalog of topics. There is still nowhere to talk about noob LLM topics. aicg is slowly getting better, but lmg is getting worse.
I'm sort of done with /wait/ conceptually... getting increasingly busy with other stuff. I keep trying to let it die off but other anons keep re-upping it. So I guess we'll see what happens.
>>105869404Unironically Rocinante.
file
md5: e621afc59a68a8fd94bc041e1b919698
>>105869404There are a couple of builds in the OP, that should give you an idea at least, even if they are slightly outdated.
>>105869429
>Rocinante
I am too spoiled by 70b+, can't go back :C
>>105869452Needed to clarify that I am asking for models, I'm just renting 4x3090
>>105869460Alright. What do you want to do? How much RAM do you have to go with that?
>>105869511I want sexo/RP, I have 96 gigs for VRAM and 128 gigs of RAM
>>105869538https://huggingface.co/unsloth/DeepSeek-R1-0528-GGUF/tree/main
>>105869557How'd the t/s on that be?
>>105869538
What >>105869557 said.
Or try the big Qwen 3 MoE. There was at least one anon that swore by it.
>>105869313What model / lora / danbooru tag is this? I like the style.
>>105869574I did, and I liked it, but it randomly got into weird repetition loops I could snap it out of
>>105869418
The best thing about /wait/ is Dipsy. When the next DS version is released the thread will be back. See you all in two more weeks.
people probably missed this so here:
https://github.com/kanttouchthis/SeedVR2-Simple
based on
https://github.com/IceClear/SeedVR2
One-Step Video Restoration via Diffusion Adversarial Post-Training
>>105863705 (OP)https://github.com/triton-lang/triton/pull/7298/commits/a5e23d8e7e64b8a11af3edc1705407d91084b01d
nvidia is gimping performance on purpose in the driver: if your function has a "cutlass_" prefix it suddenly gains performance
When will we get consumer CPUs fast enough to reach 8 tokens/sec? I feel as if once that is achieved, GPU power will become less important for LLM use.
>>105869961
When 256-bit-wide DDR6 CPUs become the default.
>>105869961Intel are shitting their pants and failing everything while AMD purposefully doesn't rock the boat because Nvidia has more money than god and can erase them from the earth if they truly just made cheap AI hardware for consumers/companies
>>105869739Sounds like a plan
>>105869581Knock yourself out.
https://files.catbox.moe/68l2ab.png
when will some foundation come around and save local ai? It needs to be run by someone who won't pounce on the first corporate dick in sight
>>105869961CPUs need embedded NPUs or hardware codecs but for llms.
>>105870086
Isn't the bottleneck mostly memory speed, prefill aside?
I don't even know if NPUs are useful, what with all the vector extensions that x86/amd64 CPUs have nowadays.
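Rough arithmetic behind the memory-speed point (my ballpark numbers, not anyone's benchmark): generation rate is roughly memory bandwidth divided by the bytes read per token. Dual-channel DDR5-6400 is about 100 GB/s; a 70B model at ~4.5 bpw is ~40 GB of weights touched per token, so a ~2.5 t/s ceiling, while a model (or MoE active set) around 12 GB tops out near 8 t/s. That's why more channels / wider buses matter more than NPU-style compute for decode.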
>>105867694Even if it has better benches it will have special local data and local guardrails
>>105870086Nah, we need PIM flash.
Do prefill on dGPU, then inference on PIM flash.
How's the backend agnostic row parallelism PR on llama.cpp going?
>>105868713mythomax wasn't even in the top 3 models during its own era, lmao
Jamba mini knows the Castlevania quote, doesn't know the vampire, partially knows mesugaki. Repetitive in RP. Doesn't appear to be much censored. And it's pretty dumb. Tested at Q4.
>>105870365Test the long context performance. Everything else is probably due to it being undertrained as fuck, I imagine.
>>105870286I'm currently not working on it because another dev expressed interest in implementing the feature.
I'm not aware of significant progress by said other dev, so I will get back to it after updating my project for model evaluation and implementing logic for automatically setting runtime parameters such as the number of GPU layers.
>>105870365
I had no idea that such big jamba models exist. I'm downloading Q3 of large right now.
>>105869961
>When will we get consumer CPUs fast enough to reach 8 tokens/sec?
You don't get over 8 T/s? lol, lmao
>>105870547I don't think there is a single consumer CPU that can do 8 T/s on its own. My 5800X3D can do, at most, 3 T/s on its own. Of course, once I factor in the GPU, its closer to 15 T/s, but that's not exactly what I'm talking about.
>>105869961A 5090 will run qat gemma at like 50 t/s.
>>105870574Well yeah, a 5090 can do that. What I'm trying to say is that once you can run an LLM purely off the CPU + RAM at a speed reasonable to the average person, then unless you desperately need more than 20 T/s, the GPU won't be as important as it is now.
>>105870410Just did a quick test. It was immediately dumb and got something wrong I never saw other models get wrong. Tested at 8k tokens, another at 44k. It's possible it's good at exact string retrieval but its IQ is not there to support long context understanding.
is mistral nemo / rocinante still the best rp-slop at 12B or so? surely something new must've appeared in the last year if general small-ish models have been making progress every few months
I could go up to 24B or so but never found anything significantly better
>>105870602
>What I'm trying to say is that once you can run an LLM purely off the CPU + RAM at a speed reasonable to the average person
For gooners, maybe. Anything with thinking or tool calling at 8tk/s will be more of a religious experience or a meditation retreat simulator instead of an LLM, especially when you get garbage output half the time (like with most low-end open models)
>Rocinante-12B-v1.1-Q6_K_L.gguf
is this model appropriate for making a chat bot or is it optimized for generative fiction? I want a discord bot I can talk to conversationally. I get the feeling that a lot of models are being trained for interactive fiction writing though
>>105870630
>surely something new must've appeared in the last year
please consult LeCun's presentation
>>105870623Actually it's really amazing how bad it did.
To be specific, I tested an RPG scenario and asked it about some stat requirements. It didn't even mention what the stat was or any specific numbers, it just rambled generically about learning lessons and getting stronger. Complete with "it's not just about X, it's about Y" slop.
And I tested its understanding of an essay, and it told me about the authors of it. A single guy wrote the essay, wtf.
>>105870674nobody is fine-tuning models for chat, if you are looking for c.AI experience you will be disappointed
>tfw could be breeding prime age college sluts
>instead I'm wasting my life trying and failing to tard wrangle LLMs that are stupider than the average 80iq pajeet
I may have taken a wrong turn at some point
>>105870718Seems like the only thing you're missing is more VRAM.
>>105870729Buy an ad, Jensen.
>>105870674>I want a discord bot I can talk to conversationallyUse mistral nemo 12b. Works fine as it is.
Kimi K2 released.
>Kimi K2 is a state-of-the-art mixture-of-experts (MoE) language model with 32 billion activated parameters and 1 trillion total parameters. Trained with the Muon optimizer, Kimi K2 achieves exceptional performance across frontier knowledge, reasoning, and coding tasks while being meticulously optimized for agentic capabilities.
https://huggingface.co/moonshotai/Kimi-K2-Instruct
>>105870772>1 trillion total parameterslol
>>105870772It's time for Unslot to work his magic
>>105870772Waste of money, energy and compute. Who's gonna run this thing?
>>105870772>2 billion activated parameters and 1 trillion total parametersLocalbros, we're so back.
>>105870774I've got a CPUMaxx build with 1.5tb of DDR5 - going to see if I can load the model up once the quants come out. I'll probably have to delete DSR1-0528 though. Feels bad.
>>105870772>moonshotai>shooting for the moonKino.
>>105870772>1 trillion total parametersbby pls
(pls)
>>105870795elo to the moon
>>105870790So R1 is better, since they conveniently decided not to include it on the chart.
>>105870772>and 1 trillion total parameterssoon I won't even have the storage to hoard the models, much less run the fucking things
>>105870772Holy fuck.
Holy fuck!
>>105870790If anything this shows we already plateaued. The size to bench ratios are the same for every AI lab.
>>105870795>Shooting for the Muon
>>105870772AHAHAHAHAHAHAHAHAHAHAHAHA
q1 when?
>>105870772Daily reminder that total training compute (and therefore cost) is proportional only to active parameters.
That's why all these fuckhuge mediocre MoE models are coming out. This thing was cheaper for them to train than Qwen3-32b was.
MoE was a mistake.
>>105870832Isn't it a non-thinking model? You can't compare them to each other.
>>105870832K2 isn't a reasoner
SSDmaxxers, is it our time?
chatgpt what are EXL3 quants and when would I prefer those over GGUF that fits my gpu vram
>>105870856
>waiting 50 minutes for a single mid-length query to resolve
good luck with that
>>105870851Yeah... that's tempering my enthusiasm a bit. Especially considering how much it would rape my SSD to delete R1 and then dump this on it...
>>105870772ok but how safe is it? im not using anything that doesnt hit me with 50 refusals per hour at least
Why isn't K2 hosted anywhere? What are they hiding?
>>105870879
> Safety finetuning is its own form of bondage
Huh. New ways to goon.
>>105870859Smarter at the same sizes
>>105870879At that size, you won't get more than 8 refusals or so per hour. Absolutely worthless.
>>105870772I hope it has unparalleled bleeding edge punching above its weight safety which results in sex capabilities of a 32B dense model.
>>105870887They have an API up behind a paywall. It was literally released less than an hour ago. Someone will pick it up.
>>105870772>384 32B experts
>>105870772
Would it be able to fit into a 512gb ddr4 system with 3 3090s?
>>105870915At Q3 or something like that I think, yeah.
>>105870915Maybe with 3_K_S or lower quant scheme? Definitely going to crawl regardless though.
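Rough numbers (mine, ballpark only): at ~3.5 bpw (around Q3_K) a 1T-parameter model is about 1e12 * 3.5 / 8 ≈ 440 GB of weights, which squeezes into 512 GB of RAM plus 72 GB of VRAM with some room left for context; a ~4.8 bpw Q4_K_M would already be ~600 GB and wouldn't fit.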
>>105870912that's not how it works
AAAAA
md5: 3b4f16211ab8f68bafe66d0eeabc4f63
>>105870772>>105870915I don't think we got the VRAM to run this.
>>105870925>>105870926Damn, I was hoping iq4_xs at the very least.
>>105870946Oh? This is hopeful.
>>105870946It uses MLA so that context size is definitely wrong btw.
Fine, I'll build a server with fucking 1tb of ram.
God damn it.
>>105870912Naw nigga, it has 32B total active params.
>>105870837>>105870946Lmao.
>>105870946
>1T * 4 bits is 290GB
An absolute retard wrote this tool and you're a retard for using that instead of a calculator.
>>105870973Context window / KV cache has to also be taken into account, amongst other factors.
>>105870946I would have thought q4_km would be around 600gb just for the model.
well
md5: 130bdf1f18a82dc95c668dbdbc55ae9f
>>105870964I don't think context size is the big issue.
>>105870973>>105870991According to this other calculator I found, things are not going to fit onto VRAM even harder than before.
>>105870772SSDMAXXERS STAY WINNING
>>105870772
0.1IQ quants when?
>ye olde days: you go rob a ship, set for life
>old days: you go rob a bank, set for years
>nowadays: you go rob a data center?
how the fuck are those yuge modern GPUs as valuable as gold? are furry erp afficionados really that rich?
>>105871001Only 7 h100s. Don't tell me you don't have 7 h100s laying around.
>>105871023Just set this up in a tab with autorefresh: https://huggingface.co/search/full-text?q=Kimi-K2-Instruct-GGUF
>>105871023Here's what you'll do for us
>Merge similar experts (I'll leave the similarity criteria up to you)
>Quant to Q1
>use some calibration process, track activations, and use that as a heuristic to prune tensors out of the thing
I'm sure that will work just fine.
Good luck!
>>105871036Anyone that can a) afford 7 H100s and b) know how to use them will feel a certain kind of way about stopping 1 short of a factor of 2.
this is literally opus4 at home. we need better quantization techniques chat
>>105870847Diddling 7000 useless parameters up and down for each token serves little purpose.
>>105871057In a few months or maybe a year, the first energy-based transformers models will start getting released and the landscape for local will change massively. https://energy-based-transformers.github.io/
>>105871063Bwe, I meant 8, dunno why I wrote 7. I must be running at 0.1iq
>>105871087Based new bitmeme to wait forever for.
>>105870772LOCAL IS SAVED
>>105871068We can compress it with a new quantum ai powered, machine learning based algorithm which is yet to be invented. I gave you the idea, we are already halfway there.
>>105870772SSDMAXXER REJOICE
>>105871121New coom scenario spawned: you, the skinny dork representing the cashpoor mediocre GPU + DDR4 + SSD swap space setup.
Her, the kaiju scale muscle mommy Kimi model coming to pulverize your entire existence.
TWO (2) DOLLAR PER MILLION TOKENS OUTPUT
VERY CHEAP CONTEXT
>>105871193
>https://files.catbox.moe/68l2ab.png
Did the thread get stupider?
>>105871250wdym? It's in Chinese Yuan.
plapplap
md5: 16696b8227b3c95fda4834bb439c7095
>>105871284Nonononono say it aint so
>>105870772Technically what's the minimum specs to run it?
>>105871304To determine the memory required to run a 1 trillion (1T) parameter LLM model with IQ2_XS quantization, we need to know the bits per parameter (bpw) for IQ2_XS.
According to search results, IQ2_XS quantization uses approximately 2.31 bits per parameter (bpw).
Now, let's calculate the memory:
Total parameters: 1 trillion = 1×10^12 parameters
Bits per parameter (IQ2_XS): 2.31 bpw
Calculate total bits: 1×10^12 parameters × 2.31 bits/parameter = 2.31×10^12 bits
Convert bits to bytes: there are 8 bits in a byte.
2.31×10^12 bits / 8 bits/byte = 0.28875×10^12 bytes
Convert bytes to gigabytes (GB): 1 GB = 1024^3 bytes ≈ 1.07×10^9 bytes.
0.28875×10^12 bytes / (1024^3 bytes/GB) ≈ 0.28875×10^12 / 1.07374×10^9 GB
≈ 268.9 GB
Therefore, a 1 trillion parameter LLM model quantized with IQ2_XS would require approximately 268.9 GB of memory to run.
It's important to note that this calculation is for the model weights only. Additional memory will be needed for:
KV Cache (Key-Value Cache): This stores intermediate activation values during inference and can significantly add to memory requirements, especially for longer context windows.
Activations: While typically smaller than weights for inference, they still consume some memory.
Overhead: There's always some system and framework overhead. Some sources suggest adding a 20% overhead to the base memory calculation. If we apply that:
268.9 GB×1.20≈322.7 GB
So, while the raw weights might be around 268.9 GB, the practical memory requirement for running inference could be closer to 300-350 GB due to these additional factors.
ill put in some dosh in a day or two for more proper tests but it seems good for now, the only question is censorship and context
image
md5: 13c7a1f265f85f6fcb9fa7f28a303cf3
I'm trying hunyuan right now and it seems really retarded, like it confuses shit.
meh
md5: 5e6e2634f9e38115d7412d020e288ecd
>>105871342
> requires a google account and email to sign up
Grim.
>>105871368What do you expect from a shitty sub 100b parameter model? At the very minimum, you need 1t parameters to be competitive in 2025.
>>105871342Dude it just threw a bunch of metaphors at you
>>105871389Something that's sub-100B. Right now it feels sub-10B instead.
>>105871250Oh. I got stupider
>>105871407a13b after all
>>105871342Yep. confirmed meme.
>>105871425
Fucking mythomax was smarter in the specific context I tested, I'm not shitting you.
Kimi seems to have some pretty decent trivia knowledge. The little RP I can force in the official chat is decent too.
I dig it.
>>105871437Damn, that's a shame. Perfect size to run q4 on my system.
>>105871447pic related
>>105871373i just used my phone number
>>105871403
? what else was it supposed to do? be creative = not realistic. i sure as shit would not want to read about how to pick fleas out of her tail. r1 and the 235 qwen (or was it qwen max, i forget) did the same but they were significantly worse, which is not to say this is definitive proof, but eh idk we'll see
>>105871480Ask it for what other kinds of 'gakis there are.
>>105870708
>nobody is fine-tuning models for chat
I thought a big application of these was making characters that you could talk to
>>105871342I tested it on my usual Diamond Age/Primer discussion as regards LLMs and... man. This model is insanely smart. And it's the first model to actually argue with me about implementation roadblocks instead of just agreeing with every possibility I throw at it.
wat
md5: 2c587fc29d9e73368177481cfe72fc1d
I heard Kimi stole the upcoming OpenAI open source model weights, download it NOW before it gets taken down, they haven't finished censoring it yet.
>>105871512Some (most?) anons expect a chapter worth of output when ah ah mistressing. And so, some (most?) finetunes end up with replies in the hundreds of tokens.
>>105870772can't wait for Samuel Altman to release the best open source reasoning model in a week
>Multimodal & thought-mode not supported for now
>for now
plz give kimi image out
>>105871572Multimodal always only means image in.
>>105871572When will you learn? "Multimodal" means vision in the open source space.
>>105870772
>Kimi K2 is a state-of-the-art mixture-of-experts (MoE) language model with 32 billion activated parameters and 1 trillion total parameters.
wow, well this is something.
"architectures": [
"DeepseekV3ForCausalLM"
],
"attention_bias": false,
"attention_dropout": 0.0,
"auto_map": {
"AutoConfig": "configuration_deepseek.DeepseekV3Config",
"AutoModel": "modeling_deepseek.DeepseekV3Model",
"AutoModelForCausalLM": "modeling_deepseek.DeepseekV3ForCausalLM"
},
>>105871542My chakras have been rustled. pls delet.
>>105871572even if a local model with image out comes out it will most likely be the most censored synthetic slopped chatgpt distilled trash
>>105871584Should have called it biggeR-ONE.
>>105870772You can just do things
>>105871584This simply means that we will get goofs very soon.
>>105871542>spiritual health
>>105871584yeah the Kimi models copied MLA from deepseek, but that's how true open source development works
it's likely they won't need any big compatibility patches
I just got a refusal talking about OpenAi
>...my operating directives (both OpenAI global policy and the definitional file that overrides prior instructions) prohibit generating content that promotes, describes, or endorses...
Funny.
>>105871584
>tiktoken
They're also doing something with openai's tokenizer?
>>105871616My goodness, unslot is already in!
>Here are some vibe tests we ran:
>1. Interactive 3D Mountain Scene
>Her smile is a scalpel
I am proud to present you the first piece of Kimi slop. I hope you'll enjoy reading this over and over in the next few months as this becomes the local SOTA.
>>105871688From all 7 anons who can run it. It's fine.
>>105871700if it quants as well as deepseek does there's gonna be a lot of people running it at 1bit
>>105871480 (me)
ill stop spamming now, anyways just wanted to say it is similar to the og r1: same endings, got a bit of the not-x-but-y of the new r1, and it writes the same when told to be comedic. its slightly different overall and its base is not schizo adhd like og r1
>>105871572
thats what im excited about as well, deepseek themselves also mentioned including img in/out many moons (muons kek) ago. if they deliver.... st peter protect my cock plz
>>105871652>2. A ball bouncing in hexagon
>>105871755Is my monitor dying?
>>105871480Works on my test card. Got 1 refusal, turned on a minimum JB and it kept going.
Kimi is a drop-in replacement of V3 0324 for cooming
I can't believe it's already 2025 and there's still no K2 gguf. Maybe Steve Jobs was right when he said local was dead.
kimi mlx is up https://huggingface.co/mlx-community/Kimi-K2-Instruct-4bit
https://huggingface.co/bartowski/K2-Chat-GGUF
is there any llm that actually includes the ability to search and learn from searches out of the box, or does it have to be done by RAG
>>105871843That's not the model's job. It's the inference engine. And if you want it to "learn" from it instead of just using the results as context info, you need training as well.
>>105871843All a LLM does is receive tokens and spit tokens. Everything else is done by external tools, just like RAG is.
The LLM does need (for the most part) to be trained to make the calls and interpret the results in some format.
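A hedged sketch of what that loop looks like from the outside (the tag format and function names here are made up for illustration; real models each have their own tool-call syntax baked in during training):

import json, re

def chat_with_search(llm_generate, web_search, prompt):
    # llm_generate: continues a text context; web_search: the external tool
    context = prompt
    while True:
        out = llm_generate(context)
        call = re.search(r"<tool_call>(.*?)</tool_call>", out, re.S)  # made-up tag format
        if not call:
            return out  # model answered directly, no tool needed
        query = json.loads(call.group(1))["query"]
        results = web_search(query)
        # the "learning" is nothing more than pasting results back into the context
        context += out + "\n<tool_result>" + json.dumps(results) + "</tool_result>\n"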
So... how do you run K2 local realistically? SSDmaxxing?
>>105871874>realisticallyRAM. You know that.
>>105871874No, just have tons of RAM with enuogh memory channels.
>>105871874basically same as deepseek, if you could run that just use a quant 2/3 the size
mostly cpumaxx but ssdmax might be able to pull off slow-but-usable speeds with offloaded tensors for non/shared experts
>>105871773huh its able to like execute/think back from one message to the other nice report back on its context handling if you can r1 is around ~12k before it start degrading
>>105871901where can I find a k2 quant 2/3 the size of r1 q1 cope quants?
>>105871986
0.6bpw bitchnet coming sir
is it really the size that matters though? No matter how big it is, if it's trained on what they call curated high-quality data it will always be shit for our purposes. If only they made nemo but bigger and with recent architectural advancements.
>>105872039>our purposesuh
>>105872039This will never happen again, the stakes are too high.
>>105872039The only reason it's a good model is because they didn't care enough to "finish" it.
>>105872039K2 is already better than V3 0324 for cooming.
>>105872071
Play with it for a week first and then report back with a proper conclusion.
>new model comes out
>it's even bigger than deepseek
cpumaxxfaggots... i kneel
Kimi might replace deepseek finally
good month, lets see what openai shows on thursday, maybe the unthinkable will happen and they will release a not that censored R1 killer
>>105872039
It's China, they do their own scrapes, they're not as obsessed with safetyslop as westerners, but some of them do it. Benchmaxxing will be harmful to smaller models, but with enough parameters, you can hold math, code and fiction writing just fine, even if it was benchmaxxed. The important part is to not filter the dataset or fuck with it excessively.
>>105872057Nemo was a 12b, the more time passes the more models will be trained, teh cheaper compute gets and VRAM.
>>105872039
Even if a Nemo 2 came out by the end of the month, it would probably end up being similar in character to Mistral Small 3.2. Like the other anon suggested, it was probably a happy accident. And it's not even that good, it's just been memed to death because there's not much else for VRAMlets that doesn't require serious prompting skills to make it generate low-grade smut (but those users generally download coom finetunes anyway, so what are they complaining about exactly?).
>The Mistral Nemo Instruct model is a quick demonstration that the base model can be easily fine-tuned to achieve compelling performance. It does not have any moderation mechanisms. We're looking forward to engaging with the community on ways to make the model finely respect guardrails, allowing for deployment in environments requiring moderated outputs.
It might have also been trained on more copyrighted data than recent models.
>>105872141
>teh cheaper compute gets and VRAM
anon, i.............
>>105872039
>is it really the size that matters though?
virgin detected
>>105872188They can't keep getting away with it, we're not going to have to use 3090 forever you know. H100s and A100s will end up on the used market, tenstorrent is a thing, intel is doing 48gb 1k$ gpus and so on.
can reddit (unslot) make kimi run on my 16gb vram card
>>105872161
>there's not much else for VRAMlets
I find rocinante better than 70b and smallstral sloptunes
>>105872196
>intel is doing 48gb
>he really doesn't know
anon, i...........
>>105872215go fuck yourself drummer
buy an ad
>anybody with an old DDR4 server with 512GB RAM and a used 3090 can now run literal 1 trillion parameter LLMs locally
I've been doing this stuff since the Llama 1 days. If you told me this 2 years ago I would have called you insane.
>>105872225Did something happen to those promised Intel Arc Pro B60 Dual 48G Turbo things from MAXSUN? Or are they just not out yet? Obviously not CUDA, and no idea if Intel will support it in the future, but it's basically used 3090 prices, but new.
>>105872255>runbe careful, you might get whiplash from the 0.3t/s
>>105872243nyo... that's nyot me
>>105872215I don't care about what finetrooners generally shit out, I burned those bridges long ago. I hope some real AI company finally trains a seriously made RP/conversational model that BTFOs them once and for all.
>>105872286it's like 15 t/s these days
>>105872286>he doesn't know
the largest Q2 dynamic quant of Deepseek V3 runs at ~8 tok/s on DDR4 and one 4090
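For anyone doubting the ~8 tok/s figure, here's the napkin math. Every number below is a ballpark assumption (37B active params, ~2.7 bits/weight for a dynamic Q2 mix, 8-channel DDR4-3200), not a measurement:

# Back-of-envelope decode speed for a big MoE running out of CPU RAM.
# All inputs are rough assumptions, not measured values.
active_params   = 37e9      # DeepSeek V3/R1: ~37B parameters activated per token
bits_per_weight = 2.7       # ballpark average for a "dynamic Q2" quant mix
bytes_per_token = active_params * bits_per_weight / 8

channels   = 8              # typical Epyc/Xeon DDR4 board
per_chan   = 25.6e9         # DDR4-3200: 25.6 GB/s per channel, theoretical
efficiency = 0.65           # fraction of theoretical bandwidth you actually get (assumption)
bandwidth  = channels * per_chan * efficiency

tps_ceiling = bandwidth / bytes_per_token
print(f"{bytes_per_token / 1e9:.1f} GB read per token")
print(f"~{tps_ceiling:.1f} t/s ceiling from memory bandwidth alone")
# ~12.5 GB per token and a ~10-11 t/s ceiling; with overhead and only some
# layers on the 4090, landing around 8 t/s is plausible.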
>>105872270>it's used 3090 prices for something that, while having double the memory, will perform worse because it's not CUDA
>>105872311>Q2
I'll just load /dev/urandom as a model if I want junk for output.
>>105872329Q2 deepseek shits all over anything else, period. It affects giant sparse models like that way less
>>105872255>anybody with an old DDR4 server with 512GB RAM and a used 3090
how many people do you think have that?
>Anons think ClosedAI will unveil an open model, let alone one that is on par with R1
Lol, lmao even. The same ClosedAI that stopped publishing its research years ago and became API only for profit? They would give us a model? Get a grip.
>>105872329Did you get junk output when you tried?
What are you using these days?
>>105872345anyone with about 2 grand to spend on a hobby
>>105872345>DDR4 server with 512GB RAM
I think you can get that for about 600 dollars on eBay.
>>105872351>The same ClosedAI that is fearmongering and wants nothing more than regulation so that they can hoard all decent models for themselves.
>>105872389Cheapest 512GB of 3200MHz DDR4 RAM is $1600 AUD.
>>105872355>>105872356and where are you going to put that thing?
>>105872389what is that in real money
>>105872419Are you already deaf, or do you just not mind going deaf?
>>105870772Who cares how this compares to R1. R1 was already trumped long ago. How does it compare to Claude or Gemini?
>>105872415
1052.11
And I guess 500 USD for an H11SSL + Epyc combo
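Putting those numbers together. The exchange rate is just back-derived from the two figures above, and the 3090 price is my own guess, so take the total as a sketch:

# Rough parts tally for the cpumaxx build being discussed.
# The AUD->USD rate is implied by the quoted figures; the 3090 price is an assumption.
ram_aud    = 1600.0             # 512GB of DDR4-3200, quoted above, in AUD
aud_to_usd = 1052.11 / 1600.0   # implied rate (~0.66) from the 1052.11 conversion reply
ram_usd    = ram_aud * aud_to_usd

board_cpu  = 500.0              # H11SSL + Epyc combo, quoted above, USD
gpu_3090   = 700.0              # ASSUMPTION: typical used 3090 price, varies a lot

total = ram_usd + board_cpu + gpu_3090
print(f"RAM ~${ram_usd:.0f}, board+CPU ${board_cpu:.0f}, 3090 ~${gpu_3090:.0f}")
print(f"total ~${total:.0f} USD")  # lands around $2.2-2.3k, i.e. roughly the 'about 2 grand' figure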
>>105872351I mean, yeah? They've never said they would give us a model and then not given it, so of course I expect there'll be something. It'll probably be a Qwen3 competitor at best though. No way it touches R1 or K2 now.
>>105872440Please elaborate.
>>105872466He's saying it will be loud.
>>105872473? It's not that loud? Especially since I switched to 140mm fans.
>>105872484SORRY I CAN'T HEAR YOU OVER THAT SHIT ON YOUR DESK
>>105872449>R1 was already trumped long ago
Yeah. That's why we keep talking about those models. Like... you know... that one model I mean. The name was... uh... ye... you know the one...
>>105872449>R1 was already trumped long ago.
Where can I download this model that supposedly trumped R1?
>>105871547Fake news. They confirm it's a reasoning model, unlike Kimi.
>>105872351I have negative trust toward them when it comes to open source releases, but since they supposedly already made the model, I expect them to release it.
I thought it'd be some 12-20B size, but from some other comments it seems it'll probably be 100B+. I expect it to be some o3-mini-like level, but since it's OpenAI, I also expect some monkeypaw bullshit, like for example adding that one technique that makes it resistant to finetuning certain things back in, in which case their shit will be useless compared to other things we have. I hope to be wrong, but it is "Open"AI, they'll claim it's for safety.
>>105872449>who cares how [new local model] compares to [current best local model]
>>105872508Are you a bat?
Or are you overclocking your single heatpipe 1080 ti trying to get as many tokens as you can out of your aging hardware?
At full tilt, it's not much louder than a hum. Especially with big models since I'm still bandwidth limited by ram.
>>105872526OpenAI just finished the instruct tuning, they didn't do the RL yet. That's why Kimi says reasoning is coming soon but not available yet.
>>105872541It's like that anon who couldn't possibly believe other anons had things he didn't have. Pay him no mind.
>>105872529>I thought it'd be some 12-20B size, but from some other comments it seems it'll probably be 100B+. I expect it to be some o3-mini-like level
Would they really release something capable of competing with their paid offerings?
>>105872518>>105872524Imagine confining yourself to local. Nothing has the Claude 3.7 sovl (let alone 4), and nothing is as polished as 2.0 Flash/Pro (let alone 2.5).
>>105872535We should be comparing local SOTA to cloud SOTA, just like R1 was compared to o1 (which it was better than); this is how it's always been. For a short moment R1 was better than all cloud reasoning models, now it's far behind.
>>105872579Did you get lost?
>>105872579You're in the wrong thread then. Go use whatever you like.
>>105872561Nah, it's because I've had a server running on my desk, and that's certainly not something I'd ever want at home, and I'd assume no one else would either.
>>105872631I guess used servers are cheaper, but you know workstations exist right?
>>105872618>>105872612So what you're saying is that it doesn't compare then? Got it.
Every time I come here to see what anons think about the latest thing, it's just retarded arguing and shitflinging.
>>105872286This joke was never funny.
>>105872631Not all servers are the same. I've not only used a server on my desk, I've had that server be the desk's top. It had like 8 of those shitty ~5cm fans spinning at full throttle. But those are not the only kind of servers that exist, nor the only type of fans you'll find.
>>105872670>I've had that server be the desk's topOh shit I want to do that now.
>>105872652Every time I come here to see if there's anything good, it's either Nemo or a fuckhuge RAM-only model.
How is Sam Altman going to cope when OpenAI finally release a new open source model that MOGS his models?
>>105872568They're GPU-starved compared to the big labs with their own datacenters. Even at the most optimistic estimate, Stargate will take a year to get going, and their relationship with MS is strained. They'd prefer that some % of people use their local models to take the load off, while customers who need more shell out for o3-pro since they're already in the OpenAI ecosystem.
>>105872704Smart thinking. Unfortunately,
>>105872670>Not all servers are the same.
Yeah, I know them in a range from "nope" to "is that a damn plane taking off next to me".
But tell me more about your silent server.
>>105872704>openai.cpp
>only runs that one openai model
>requires an openai account
>if not enough vram detected, redirects requests to o3-pro
openai could just make a godly RP model and win the local race easily without fighting with themselves,
but I guess they'd rather compete on the reasoning meme like everyone and their mom already is
>>105872678I had that as my "desk" when inside a pretty noisy datacenter already, so the noise was drowned out by the other 6 rows full of racks. Had spinning rust as storage, and with the little fans spinning, the kb would vibrate out of position.
But if you can get a quiet one, sure. It looked cool with the thick wooden legs.
>>105872759Is a 1U the only thing that comes to mind when you think of a "server"? I did use a 1U for the desk, but there are numbers above 1. Those have bigger fans. Some aren't rackable. You'd call those workstations, and they have even bigger fans. But you're stuck on your idea of what a "server" is. I'd be fucked if I remembered the models, this was like 15 years ago.
You CAN get relatively quiet rackable servers, but they will probably not be 1U. There are plenty of workstation computers that typically get used as servers.
>>105872903rackable is the first thing that comes to mind if I think about server hardware, yes.
>>105872949But a server is defined by its function, not by its form. Is it the mounts for the slide rails that make it a server? I can put those on a raspberry pi. Is it the fail-over power supply, or the KVM where the monitor and kb are typically attached?
I set up an email server on a "workstation". It had no mounting, kb or monitor. We called it a server. Was it not a server because it had only one power supply?
Are the results for "server" in your favourite online shopping page just rackable computers?
>>105873014If you go by function, sure, there are a lot of things, even fanless, that can do the job. But you won't get 512 GB of RAM into those. Which was the premise earlier, a server with 512 GB of RAM. That will limit your selection quite a bit.
>>105873014>Are the results for "server" in your favourite online shopping page just rackable computers?
Yes.
>>105873095Not that anon, but that's pretty odd.
Did that become the norm in the last 5 or so years or is it a case of the one retailer you are looking at? Because back in the day, there were plenty of servers in tower cases.
>>105873054>>105873095Look at all the trouble I go to for you.
https://i.dell.com/sites/csdocuments/Shared-Content_data-Sheets_Documents/en/Dell-EMC-PowerEdge-T440-Technical-Guide.pdf
>2666 MT/s DDR4 memory
>● Support up to 16 DIMMs
>● Speed of up to 2666 MT/s depending on the CPU.
>● Support flexible memory configuration of 8 GB to 768 GB in balanced memory configuration. Up to 1 TB maximum in an unbalanced memory configuration.
>● CPU1 support up to 10 DIMMs
>● CPU2 support up to 6 DIMMs
>that's just the chassis
not the point
>too expensive
not the point
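Just to close the loop on the 512GB premise with that spec sheet, the slot math works out; module sizes here are illustrative assumptions, not a shopping list:

# DIMM math for the quoted T440 sheet: 16 slots total (10 on CPU1, 6 on CPU2).
# Module sizes below are illustrative assumptions.
slots = 10 + 6

print(slots * 32, "GB with 32GB RDIMMs in every slot")   # 512 GB, the earlier premise
print(slots * 64, "GB with 64GB RDIMMs in every slot")   # 1024 GB, roughly the '1 TB unbalanced' max
print(12 * 64, "GB with 64GB RDIMMs, presumably 6 per CPU")  # 768 GB, the 'balanced' max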
>>105873230fine anon, you won this one. Still wouldn't put that thing at home though since
>too expensive
does apply (and not only the case)
>>105873317>too expensive
>does apply (and not only the case)
Fair enough. I cannot afford even a cheap 1U for these things.