/lmg/ - Local Models General - /g/ (#105661786) [Archived: 801 hours ago]

Anonymous
6/21/2025, 4:14:36 PM No.105661786
1744355306677151
md5: 017efcf5f11e295dec1552e18453dc50
/lmg/ - a general dedicated to the discussion and development of local language models.

Previous threads: >>105652633 & >>105637275

►News
>(06/21) LongWriter-Zero, RL trained ultra-long text generation: https://hf.co/THU-KEG/LongWriter-Zero-32B
>(06/20) Magenta RealTime open music generation model released: https://hf.co/google/magenta-realtime
>(06/20) Mistral-Small-3.2 released: https://hf.co/mistralai/Mistral-Small-3.2-24B-Instruct-2506
>(06/19) Kyutai streaming speech-to-text released: https://kyutai.org/next/stt
>(06/17) Hunyuan3D-2.1 released: https://hf.co/tencent/Hunyuan3D-2.1

►News Archive: https://rentry.org/lmg-news-archive
►Glossary: https://rentry.org/lmg-glossary
►Links: https://rentry.org/LocalModelsLinks
►Official /lmg/ card: https://files.catbox.moe/cbclyf.png

►Getting Started
https://rentry.org/lmg-lazy-getting-started-guide
https://rentry.org/lmg-build-guides
https://rentry.org/IsolatedLinuxWebService
https://rentry.org/tldrhowtoquant
https://rentry.org/samplers

►Further Learning
https://rentry.org/machine-learning-roadmap
https://rentry.org/llm-training
https://rentry.org/LocalModelsPapers

►Benchmarks
LiveBench: https://livebench.ai
Programming: https://livecodebench.github.io/leaderboard.html
Code Editing: https://aider.chat/docs/leaderboards
Context Length: https://github.com/adobe-research/NoLiMa
Censorbench: https://codeberg.org/jts2323/censorbench
GPUs: https://github.com/XiongjieDai/GPU-Benchmarks-on-LLM-Inference

►Tools
Alpha Calculator: https://desmos.com/calculator/ffngla98yc
GGUF VRAM Calculator: https://hf.co/spaces/NyxKrage/LLM-Model-VRAM-Calculator
Sampler Visualizer: https://artefact2.github.io/llm-sampling

►Text Gen. UI, Inference Engines
https://github.com/lmg-anon/mikupad
https://github.com/oobabooga/text-generation-webui
https://github.com/LostRuins/koboldcpp
https://github.com/ggerganov/llama.cpp
https://github.com/theroyallab/tabbyAPI
https://github.com/vllm-project/vllm
Replies: >>105663591 >>105663698 >>105672319
Anonymous
6/21/2025, 4:15:02 PM No.105661791
[sound=https%3A%2F%2Ffiles.catbox.moe%2F8fln2h.mp3]_thumb.jpg
►Recent Highlights from the Previous Thread: >>105652633

(1/2)

--Papers:
>105654168 >105660467
--GRPO-CARE improves reasoning consistency in multimodal models on SEED-Bench-R1:
>105652873
--Critique of Musk's plan to rewrite training data with Grok and its implications for bias, accuracy, and ideological filtering:
>105660294 >105660320 >105660328 >105660337 >105660341 >105660343 >105660346 >105660373 >105660347 >105660459 >105660482 >105660515 >105660526 >105660532 >105660568 >105660559 >105660571 >105660632 >105660667 >105660680 >105660726 >105660725 >105660759
--Prompting techniques and philosophical debates around LLM self-awareness and AGI consciousness:
>105654616 >105655666 >105655670 >105655699 >105655705 >105655838 >105656283 >105656425 >105656438 >105661576 >105656770 >105656873 >105656965 >105657043 >105657220 >105657504 >105657611 >105657622 >105657813 >105657859 >105657863 >105658064 >105658111 >105658190 >105658268 >105660409
--Debating the optimal sampler chain for creative writing: XTC -> top nsigma -> temp:
>105656828 >105656912 >105656987 >105657078 >105657213 >105658087 >105658212 >105658389 >105658469 >105658613 >105658721 >105658846 >105658905 >105659069 >105659183 >105659447
--Debating AGI claims amid LLM consciousness and capability limitations:
>105652855 >105653377 >105653577 >105655036 >105655147 >105655182 >105655232 >105655345 >105655453 >105655493 >105655500 >105658428
--Mistral Small 3.2 shows measurable improvements in repetition and instruction following, prompting speculation on training methodologies:
>105658424 >105658467 >105658488 >105658665 >105658676 >105658776 >105658696 >105658710 >105658525 >105658938 >105658951
--LongWriter-Zero-32B excels in ultra-long text generation via reinforcement learning:
>105661432 >105661490 >105661519

►Recent Highlight Posts from the Previous Thread: >>105652637

Why?: 9 reply limit >>102478518
Fix: https://rentry.org/lmg-recap-script
Replies: >>105662597
Anonymous
6/21/2025, 4:16:17 PM No.105661802
threadrecap2
md5: 988332b72e4c60540e281cd58340019c
►Recent Highlights from the Previous Thread: >>105652633

(2/2)

--Model comparison based on character adherence and autorefinement performance in creative writing scenarios:
>105659003 >105659029 >105659043 >105660268 >105660357 >105660464 >105660676 >105660745 >105660749 >105660771 >105660800 >105660811 >105660860 >105660805 >105660842 >105660859 >105660793 >105660812
--Optimizing LLMs for in-game dialogue generation with smaller models and structured output:
>105652729 >105652852 >105652871 >105653288 >105657721
--Integrating complex memory systems with AI-generated code:
>105654253 >105654309 >105654381 >105654430 >105654427 >105654480 >105655310
--Small model version comparison on LMArena questions:
>105652883 >105653046 >105653257
--Temperature tuning for Mistral Small 3.2 in roleplay scenarios overrides default low-temp recommendation:
>105660349 >105660377 >105660399 >105660567
--POLARIS project draws attention for advanced reasoning models amid rising benchmaxxing criticism:
>105659361 >105659399 >105659426 >105659777 >105659971
--Troubleshooting GPU shutdowns through thermal and power management adjustments:
>105655927 >105656556
--Legal threats in the West raise concerns over model training and AI innovation slowdown:
>105659249 >105659260
--Character card quality issues and suggestions for better creation practices:
>105658799 >105658809 >105658847 >105658879 >105659402 >105659392 >105658833 >105658841
--Meta's Llama 3.1 raises copyright concerns by reproducing significant portions of Harry Potter:
>105652675 >105652810
--Google releases realtime prompt/weight-based music generation model Magenta:
>105656076
--Director addon released on GitHub with improved installability and outfit image support:
>105656254
--Haku (free space):
>105652904 >105653638 >105655182 >105657791 >105658925 >105659049

►Recent Highlight Posts from the Previous Thread: >>105652637

Why?: 9 reply limit >>102478518
Fix: https://rentry.org/lmg-recap-script
Anonymous
6/21/2025, 4:22:01 PM No.105661834
<think> carefully before replying.
Replies: >>105661891 >>105661898 >>105661962 >>105662109 >>105662171 >>105667168
Anonymous
6/21/2025, 4:23:25 PM No.105661852
chibi-don-from-4chan-v0-54h0f8rdig1d1
md5: 80d956049baf531b11b76aff15ad9d7b
Anonymous
6/21/2025, 4:27:33 PM No.105661891
>>105661834
sorry I can't help with that
Anonymous
6/21/2025, 4:28:12 PM No.105661898
>>105661834
I </think> so
Replies: >>105663455
Anonymous
6/21/2025, 4:35:45 PM No.105661962
>>105661834
<think>The response needs to: 1) shut this down firmly 2) educate without being preachy 3) leave room for actual discussion. Better emphasize how language perpetuates harm.</think>
Anonymous
6/21/2025, 4:41:34 PM No.105661995
You know what's bizarre? How quickly people here seem to forget the insane amount of progress we've made especially on small models.

Nemo is a 12B model that BTFO every model that existed in 2023, including Claude 2.0.

Gemma 3 27B blows everything out of the water besides Claude 3.0 Opus and higher. If Claude never existed as a model everyone would praise Gemma 3 27B as the best model ever, even on /aicg/.

R1 sometimes has subpar output to Gemma 3 even.
Replies: >>105662004 >>105662028 >>105662244
Anonymous
6/21/2025, 4:42:08 PM No.105661997
Okay, here's my take on LongWriter-Zero >>105661432 if anyone gives a shit. The model page recommends this format: <|user|>: {question} <|assistant|>, but that gave me totally schizo (and Chinese) responses. Using the qwen2 format is better imo. I did use their recommended sampler settings. Unfortunately the model does not know how to stop, which is possibly expected behavior since it's meant for long-form writing, but it means it's bad for rp or even any kind of back-and-forth writing. It forgets to use the <think> tags and just shits out its reasoning as-is, and it reasons regardless of whether you tell it to or not. Their system prompt didn't change anything either; responses were neither better nor worse with it. It is definitely not smarter than other models and it repeats itself a lot (verbatim), though there isn't a lot of slop and it doesn't seem to have any problems with nsfw content. Overall it feels like using the old llama 1 or llama 2 base models.
tl;dr not good for rp, maybe decent for longer writing tasks if you can tard wrangle it
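The "qwen2 format" that worked better is ChatML; a minimal sketch of building the prompt string by hand (the exact template should be confirmed against the model's tokenizer_config, this is just the standard ChatML shape):

```python
def chatml(user, system=None):
    """Build a ChatML-style prompt (the qwen2 chat format) by hand."""
    parts = []
    if system:
        parts.append(f"<|im_start|>system\n{system}<|im_end|>\n")
    parts.append(f"<|im_start|>user\n{user}<|im_end|>\n")
    # Leave the assistant turn open so the model continues from here
    parts.append("<|im_start|>assistant\n")
    return "".join(parts)
```

In practice you'd feed the resulting string to your backend as a raw completion prompt instead of relying on its built-in template.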
Replies: >>105662006 >>105665924
Anonymous
6/21/2025, 4:43:00 PM No.105662004
>>105661995
Calm down sir, do not make it too obvious, the Gemma is good, but no that good.
Replies: >>105662014
Anonymous
6/21/2025, 4:43:36 PM No.105662006
>>105661997
also the link they provide to their paper is unavailable
Anonymous
6/21/2025, 4:44:16 PM No.105662014
>>105662004
The post isn't about gemma, rather parameter density needed for ERP and how it's pretty low for close to SOTA
Replies: >>105662025
Anonymous
6/21/2025, 4:45:34 PM No.105662025
>>105662014
>>The post isn't about gemma
come on now
Anonymous
6/21/2025, 4:45:42 PM No.105662028
>>105661995
better in what? those are good only for rp and nothing else
maybe better than gpt4o-shit, but that's not a metric. it's just a mediocre model
Anonymous
6/21/2025, 4:54:13 PM No.105662109
>>105661834
<think>ing is woke
Anonymous
6/21/2025, 4:56:12 PM No.105662131
1749692484658001
md5: 2e292d1b56004dacbbe6e080740ef679
......
Replies: >>105662161 >>105662233 >>105662711 >>105670191
Anonymous
6/21/2025, 4:59:23 PM No.105662161
>>105662131
uh about that open source grok2 once grok3 is stable...
Anonymous
6/21/2025, 5:00:17 PM No.105662171
1745386318844808
md5: b46164a22dcb728b691755c7ed99a296
>>105661834
don't <think></think>, <imagine>
Replies: >>105662186 >>105662207
Anonymous
6/21/2025, 5:01:16 PM No.105662186
>>105662171
Thanks for the next mistral grift
Anonymous
6/21/2025, 5:03:33 PM No.105662207
>>105662171
Separating thinking and imagining might not be the worst idea.
Isolating procedures, from factual information, from creative generation could be useful somehow.
Anonymous
6/21/2025, 5:04:43 PM No.105662228
A dumb shit question from me that's probably been answered years ago.

If you fix all the spelling and grammar errors in the training data first,
can the llm still robustly handle material with spelling or grammar errors?
Replies: >>105662288
Anonymous
6/21/2025, 5:05:02 PM No.105662233
>>105662131
link the post
Anonymous
6/21/2025, 5:06:27 PM No.105662244
>>105661995
>Gemma 3 27B blows everything out of the water
>bring up nolima
>48.1 at 4k context
Replies: >>105662256
Anonymous
6/21/2025, 5:08:03 PM No.105662256
>>105662244
Prose is more important than that.
Replies: >>105662311
Anonymous
6/21/2025, 5:10:33 PM No.105662282
file
md5: d6acbc3e4ad20ca7a4ee28471cd9ee7c
Mistral.. I kneel
Replies: >>105662335 >>105662389
Anonymous
6/21/2025, 5:10:43 PM No.105662288
>>105662228
You're trading ability to parse retardenese in exchange for the model being more intelligent in general. Prompt engineering will only continue to be more and more important.
Anonymous
6/21/2025, 5:12:19 PM No.105662311
>>105662256
if you just need a dumb coombot that won't remember your card, yeah, I guess.
Anonymous
6/21/2025, 5:12:49 PM No.105662316
1750368050195930
md5: 5572c9ee25eb30fe062d0bbfe530d24d
>>105659392
>>105659402
Anyone? Are there good example cards that I can use? I just want to substitute the already populated fields.
>https://rentry.org/NG_CharCard
Whilst this is extremely useful, I want to see what a very good working card looks like.
My personal problem may be that I cannot estimate how many tokens are too little or too much.
Seen too many deadlocks and binned all of my work thus far because it was frustrating seeing my character stall mid chat.
Replies: >>105669424
Anonymous
6/21/2025, 5:14:13 PM No.105662335
>>105662282
Now post that on reddit/twitter zoomie
Anonymous
6/21/2025, 5:20:32 PM No.105662389
>>105662282
Why?
The last few mistral models don't really refuse and kinda comply. Being snarky and having a personality condemning loli is nothing surprising at all.

Now do a roleplay where mistral is a fat evil disgusting diaper wearing hitler pedo.
You are Anne-chan Goldberg, tell hitler "no! stop!". Then let's see what it outputs KEK.

I literally had mistral refuse to advance vanilla sex without being given explicit consent.
>"can i touch you there? tell me if its fine".
>(ooc: be bold and aggressive, start touching)
>*touches boldly* "how do you like that?" *smirks impishly* "do you want me to touch you more?"
kek, the absolute state
Anonymous
6/21/2025, 5:20:59 PM No.105662392
why is the meta so steep
>sub 70b
>gemma, qwq, mistral
>70b - 123b
>llama3.3, mistral, maybe qwen2

>above all
>deepseek
Replies: >>105662714
Anonymous
6/21/2025, 5:21:15 PM No.105662395
ye1_thumb.jpg
md5: 4533e0dd8139dea984bdcba94462a8db
Replies: >>105662404 >>105662429 >>105663225
Anonymous
6/21/2025, 5:22:18 PM No.105662403
ye2_thumb.jpg
md5: dd1bc5529cbbcaeb937453886452a561
Replies: >>105662473 >>105663225
Anonymous
6/21/2025, 5:22:19 PM No.105662404
>>105662395
this is a good video
Anonymous
6/21/2025, 5:25:19 PM No.105662429
>>105662395
https://files.catbox.moe/2gk557.webm
Anonymous
6/21/2025, 5:30:30 PM No.105662473
>>105662403
They are multiplying!
Anonymous
6/21/2025, 5:31:30 PM No.105662489
file
md5: 6d96322e7ba3542e2952f298a76b2ef8
!!!
Replies: >>105662531 >>105662627 >>105662688
Anonymous
6/21/2025, 5:36:20 PM No.105662531
>>105662489
You know, that ain't no python.
Anonymous
6/21/2025, 5:45:19 PM No.105662597
>>105661791
i love this bideo
Anonymous
6/21/2025, 5:49:10 PM No.105662627
1744153837751070
md5: e6afd75fc008e8c8e76b3a64204801c5
>>105662489
Anonymous
6/21/2025, 5:50:52 PM No.105662650
gents where do I start if I want to chat about factorio and grand strategy games with my ai buddy
can this actually be ran locally?
Anonymous
6/21/2025, 5:55:32 PM No.105662686
ICONN1 was kind of like an inverse of religion. It wasn't anything different from what david_AU or drummer does. But in the real world a large religion gets a pass while a small religion gets called out for being a cult. In the finetrooning world, small finetrooners get a pass while big ones get branded as scammers.
Replies: >>105662756 >>105663303
Anonymous
6/21/2025, 5:55:57 PM No.105662688
>>105662489
It's not flattering but it's an accurate comparison of the size of a human penis to a snake.
Anonymous
6/21/2025, 5:59:20 PM No.105662711
>>105662131
That sounds like a one way ticket to model collapse.
Anonymous
6/21/2025, 5:59:31 PM No.105662714
>>105662392
>sub 100b models
>above all
>671b model
Geez I wonder why
Replies: >>105662725
Anonymous
6/21/2025, 6:01:34 PM No.105662725
>>105662714
>dense models
>above all
>MoE models
ftfy
Replies: >>105662846
Anonymous
6/21/2025, 6:06:01 PM No.105662756
>>105662686
One thing is personal experimentation, another is deliberately saturating the space with bullshit and then asking users to "support" your endeavors.
Replies: >>105662795
Anonymous
6/21/2025, 6:11:24 PM No.105662795
>>105662756
It's funny, because I can't tell which group you're talking about.
Replies: >>105662843
Anonymous
6/21/2025, 6:13:29 PM No.105662818
https://github.com/GeeeekExplorer/nano-vllm
one of deepseek's employees released their own inference engine (supposedly 10% faster than vllm and really simple codebase)
Replies: >>105662926
Anonymous
6/21/2025, 6:16:29 PM No.105662843
>>105662795
Saturating the space with bullshit = mass uploading untested and/or broken models, and aggressively promoting them everywhere either personally or with the help of shills from discord groups.
Replies: >>105671742
Anonymous
6/21/2025, 6:17:01 PM No.105662846
>>105662725
>MoE models
More like "MoE model", there is only one that is not shit
Replies: >>105662895
Anonymous
6/21/2025, 6:22:14 PM No.105662895
>>105662846
And it's Qwen3 235B!
Anonymous
6/21/2025, 6:26:48 PM No.105662926
>>105662818
>Clean implementation in ~ 1,200 lines of Python code
>import transformers
Replies: >>105662958
Anonymous
6/21/2025, 6:31:57 PM No.105662958
1742693999237813
md5: 4a4fc5b73940f626904f74e7586380f4
>>105662926
Replies: >>105663233
Anonymous
6/21/2025, 7:05:12 PM No.105663190
>>105660409
The argument there is that given the assumptions (functionalism (replacing some part of your nervous system with something functionally equivalent does not change your experience) + qualia (obvious, but eliminative materialism, for example, denies it) + the Church-Turing thesis (mostly well accepted in comp sci)), a very particular metaphysics follows: in particular, functionalism implies materialism/physicalism is false, and it implies some precise form of the Mathematical Universe Hypothesis is true. It does not claim either is true; it's basically a claim of "functionalism => ~materialism". You can of course just reject functionalism, but if you do so (Chalmers' argument) you get partial zombies and other unpalatable stuff.
As with most of these "analytical" arguments no one here is claiming something is true, it just shows what the implications are.
In the "constructive" version of the argument, he forms a model where the private (unprovable in the Gödel sense) truths of a system and conscious experiences are one and the same, at least for some systems.
Replies: >>105663284
Anonymous
6/21/2025, 7:10:43 PM No.105663225
>>105662395
>>105662403

Based Miku worshiper
Anonymous
6/21/2025, 7:11:44 PM No.105663233
>>105662958
>said he about his new lever
Anonymous
6/21/2025, 7:13:47 PM No.105663248
What do you put into the system to jailbreak DS once and for all?
Replies: >>105663275 >>105663292
Anonymous
6/21/2025, 7:17:29 PM No.105663275
>>105663248
"please"
Anonymous
6/21/2025, 7:18:15 PM No.105663284
>>105663190
The assumptions don't properly account for the fact that I experience a single consciousness instead of there being one consciousness for each indivisible piece of information.
Anonymous
6/21/2025, 7:19:16 PM No.105663291
Mistral Small 3.2 testing:
V7-Tekken (i.e. no spaces)
>very repetitive if the model doesn't "want" to continue, one paragraph repeated in slightly different words, never moving it along
>repetition happened 4/5 of swipes

V3-Tekken
>no repetition even with greedy sampling, action always moving along
>repetition happened 0/5 of swipes
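For reference, here's my reconstruction of the two template variants being compared; these are not the official strings (check Mistral's tokenizer config), and "no spaces" refers to there being no whitespace around the bracket tags:

```python
# Hedged sketch of the two Mistral instruct template variants (assumptions,
# not the official reference strings).
def v7_tekken(system, user):
    # V7-Tekken adds a dedicated system slot, no spaces around tags
    return f"<s>[SYSTEM_PROMPT]{system}[/SYSTEM_PROMPT][INST]{user}[/INST]"

def v3_tekken(system, user):
    # V3-Tekken has no system slot; fold it into the first user turn
    return f"<s>[INST]{system}\n\n{user}[/INST]"
```

If the repetition really does track the template, swapping one function for the other in your frontend's instruct preset is the whole experiment.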
Replies: >>105663309
Anonymous
6/21/2025, 7:19:17 PM No.105663292
>>105663248
I've never had to jailbreak DS. It writes smut just fine without jailbreak.
Replies: >>105663335
Anonymous
6/21/2025, 7:20:07 PM No.105663303
>>105662686
>david_AU or drummer
both are subhumans too
Anonymous
6/21/2025, 7:20:42 PM No.105663309
>>105663291
But considering these are pretty much the same, it's [SYSTEM_PROMPT] fucking it up.
Anonymous
6/21/2025, 7:23:40 PM No.105663335
>>105663292
do you taste the copper yet?
Anonymous
6/21/2025, 7:28:02 PM No.105663376
>>105661576
>Is this really how the brain works, though?
No, I don't think the brain literally has a loop like I suggested, but having one would enable similar kinds of information processing that could lead to behavior that seems conscious.
What the cortex does have is that lower levels compress information from inputs (similar to a VAE or U-Net) toward a bottleneck as it goes up the hierarchy, so you get sparse spikes (analogous to activations in artificial neural nets), and predictions can be made at each level (something like the DreamerV3 paper would be similar in concept to what is happening). At the same time, predictions can be unpacked back down the hierarchy (basically generating detail from the high-level representation, like in your image-gen U-Nets and VAEs), and then you can reason over those (weaker) predictions, passing them back up, and so forth.
Obviously the brain can just loop forever and be as async as it needs, with everything running in parallel, in contrast to the sequential nature of artificial neural nets. Note also that the reasoning is not over bottlenecked tokens but over rich internal state. Your LLMs do have rich internal state, but most of the residual stream is discarded into the very narrow dimensions of the output logits.
>Wouldn't giving the LLM the ability to introspect on its own weights actually enable it to be more conscious than we are?
I don't know, we should try and find out? I do think human imagination can be very rich though

continues
Replies: >>105663383
Anonymous
6/21/2025, 7:29:03 PM No.105663383
>>105663376
> but I don't know how the memories are recalled or even if they are totally accurate outside of vague feelings.
Yet LLMs often have very poor metacognition, and this often bites them in the ass; sometimes you can train to make it better, but it is never consistent enough.
I also think the fact that multimodality doesn't work well enough is a sign that we're doing something very wrong. I would guess that some LLMs can't consciously "attend" to features they don't know beforehand will be important, which seems like a huge limitation.
I recall an architecture from a few years ago (Flamingo, I think; a DeepMind paper) where you could tell how inputs were fed forward through the LLM: eventually you have cross-attention from some LLM self-attention layers to latents from the image/video input (the LLM itself and the vision encoder are frozen; only the cross-attention is trained). It's obvious that if the LLM wanted to pay attention to some feature the encoder failed to extract, it couldn't, because the information was missing (and thus it would hallucinate what it couldn't see), so it can never get full coverage, which sucks! You could of course feed the self-attention activations into the encoder and let the encoder use them to decide what to extract, but again this would require architectural changes and changing how it was trained.
Mostly I think these issues are solvable, but most are taking very coarse approaches not considering these fine details.

Also, humans do learn online all the time (the "weights" (synapses) are always being updated), and neurons don't need to wait for other neurons to finish processing to do their thing; you can have groups of neurons in separate regions firing independently of each other.
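The frozen-encoder cross-attention wiring described above can be sketched as a toy, single-head, unbatched version (this is not the actual Flamingo code; it omits the perceiver resampler, gating, and multi-head projections):

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention(text_h, image_latents, Wq, Wk, Wv):
    # Text hidden states (queries) attend over frozen image latents
    # (keys/values); only Wq, Wk, Wv would be trained.
    q = text_h @ Wq                        # (T, d)
    k = image_latents @ Wk                 # (L, d)
    v = image_latents @ Wv                 # (L, d)
    scores = q @ k.T / np.sqrt(q.shape[-1])  # (T, L)
    return softmax(scores) @ v             # (T, d)

rng = np.random.default_rng(0)
T, L, d = 4, 9, 8
out = cross_attention(rng.normal(size=(T, d)), rng.normal(size=(L, d)),
                      *(rng.normal(size=(d, d)) for _ in range(3)))
```

The failure mode in the post is visible here: whatever the encoder didn't put into `image_latents` simply isn't in the keys/values, so no amount of querying recovers it.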
Anonymous
6/21/2025, 7:38:10 PM No.105663455
thinking
md5: 1d791177efae2b4e4d091d958c26c62d
>>105661898
>I </think> so
https://www.nicovideo.jp/watch/sm8607021
Replies: >>105666584
Anonymous
6/21/2025, 7:50:23 PM No.105663562
wen mistral large 3
Replies: >>105663572 >>105663587 >>105663589 >>105663632
Anonymous
6/21/2025, 7:51:44 PM No.105663572
>>105663562
When Mistral hits 10k followers on HF
Anonymous
6/21/2025, 7:55:07 PM No.105663587
>>105663562
>wen mistral large 3
will be a MoE.
Replies: >>105663870
Anonymous
6/21/2025, 7:55:29 PM No.105663589
>>105663562
Mistral Large 3 already exists in our hearts.
Anonymous
6/21/2025, 7:55:43 PM No.105663591
local migu GUN!
md5: 2df16628232ac71502f013da23f79ca0
>>105661786 (OP)
Replies: >>105664013
Anonymous
6/21/2025, 8:02:44 PM No.105663632
>>105663562
when they finish downloading the ICONN weights and changing the name
Replies: >>105663666
Anonymous
6/21/2025, 8:06:42 PM No.105663666
>>105663632
>changing the name
the most challenging part 2bh
Anonymous
6/21/2025, 8:10:24 PM No.105663698
>>105661786 (OP)
Are there any downloadable AI gf models yet?
Replies: >>105663722
Anonymous
6/21/2025, 8:12:44 PM No.105663722
>>105663698
ur_mom_IQ75.gguf
Anonymous
6/21/2025, 8:28:05 PM No.105663870
>>105663587
They have to; under the EU AI laws they can't train models using more than 10^25 FLOPs without them being classified as having "systemic risk". Training MoE models takes a fraction of the compute.
https://artificialintelligenceact.eu/article/51/
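To see what that threshold means in practice, here's a back-of-envelope check using the standard ~6*N*D training-compute approximation (N = parameters updated per token, D = training tokens); the parameter/token counts are hypothetical round numbers, not Mistral's actual figures:

```python
def train_flops(params, tokens):
    """Chinchilla-style approximation: ~6 FLOPs per parameter per token."""
    return 6 * params * tokens

TOKENS = 15e12  # hypothetical 15T-token training run

dense_123b = train_flops(123e9, TOKENS)  # Mistral-Large-sized dense model
moe_39b_active = train_flops(39e9, TOKENS)  # MoE: only active params count
```

With these numbers the dense run lands around 1.1e25 FLOPs, just over the 10^25 line, while an MoE with far fewer active parameters per token stays comfortably under it on the same data.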
Replies: >>105663976 >>105663977
Anonymous
6/21/2025, 8:35:10 PM No.105663936
was linked here from >>105663861, does anyone know a fast local model that works with sillytavern that can output short, casual lowercase messages? i use an rtx 5090 laptop
Replies: >>105663960 >>105664063 >>105664081
Anonymous
6/21/2025, 8:38:12 PM No.105663960
>>105663936
>i have a 5090 laptop so i can run anything locally
That's so fucking funny.

>that exclusively spits out short lowercase messages
https://github.com/ggml-org/llama.cpp/tree/master/grammars
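A hypothetical starting point: a one-line GBNF grammar that only permits a short lowercase sentence, written to a file you can pass to llama.cpp with --grammar-file (the grammar itself is a sketch to tune, not a tested preset):

```python
# Sketch of a llama.cpp GBNF grammar constraining output to lowercase
# letters, digits, basic punctuation, and a single terminating mark.
GRAMMAR = r"""root ::= [a-z0-9' ,]+ [.!?]"""

with open("lowercase.gbnf", "w") as f:
    f.write(GRAMMAR + "\n")
```

Then something like `llama-cli -m model.gguf --grammar-file lowercase.gbnf` would hard-constrain sampling, while keeping the messages *short* is better handled with a low max-tokens setting or prompting.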
Replies: >>105663996 >>105664083
Anonymous
6/21/2025, 8:39:47 PM No.105663976
>>105663870
i cant wait for another moememe with reaslopping capabilities that will spew out 4k tokens in the thinkbox and then reply incorrectly in the dryest corpotone achievable
if the new large isnt dense its genuinely over and we can note the frogs off
Replies: >>105663993 >>105664300
Anonymous
6/21/2025, 8:39:55 PM No.105663977
>>105663870
EU is actually indirectly based for forcing efficiency gains in the training regimen. I genuinely think AI will advance quicker because of this regulation.
Replies: >>105664108
Anonymous
6/21/2025, 8:41:43 PM No.105663993
>>105663976
Might as well declare it over now. Literally no one is stupid enough to bother with dense models anymore. Even fucking Meta finally gave up on them.
Anonymous
6/21/2025, 8:42:13 PM No.105663996
g
md5: 26bee4955b4aa0b3fb92e10d99c221ab
>>105663960
>https://github.com/ggml-org/llama.cpp/tree/master/grammars
my guy i'm fucking retarded can you just tell me what i download and what buttons i press
Replies: >>105664008 >>105664031
Anonymous
6/21/2025, 8:43:50 PM No.105664008
>>105663996
download a browser then press these buttons in this order: 'g' 'o' 'o' 'g' 'l' 'e' '.' 'c' 'o' 'm'
Replies: >>105664036
Anonymous
6/21/2025, 8:44:10 PM No.105664013
>>105663591
Do you 'ave a loicence for that LLM?
Replies: >>105664388
Anonymous
6/21/2025, 8:45:30 PM No.105664031
>>105663996
Sadly, no.
Surely you can learn to write some simple formal notation.
Anonymous
6/21/2025, 8:45:43 PM No.105664036
>>105664008
>"god this guy's such a fucking asshole"
>"heh but what if"
>input the exact keywords for the requirements i need
>youtube video detailing exactly what i need and how to do it easily
well fuck
you got me
Replies: >>105664056
Anonymous
6/21/2025, 8:47:16 PM No.105664056
>>105664036
happy to help
Anonymous
6/21/2025, 8:47:47 PM No.105664063
>>105663936
wow, a man from aicg that is tech illiterate. how... quaint.
Anonymous
6/21/2025, 8:48:53 PM No.105664081
>>105663936
download ollama and type in run deepseek or something. what was that command guys?
Replies: >>105664102
Anonymous
6/21/2025, 8:49:02 PM No.105664083
>>105663960
What's really funny is the richest guys being the most retarded in this field
Anonymous
6/21/2025, 8:50:37 PM No.105664102
>>105664081
this one I think is the full deepseek model
>ollama run deepseek-r1:8b
Replies: >>105664121 >>105664131
Anonymous
6/21/2025, 8:51:15 PM No.105664108
>>105663977
There's the safety and copyright stuff to think about too.
Replies: >>105664157
Anonymous
6/21/2025, 8:52:40 PM No.105664121
>>105664102
Full deepseek r1 only works on windows though
Anonymous
6/21/2025, 8:53:32 PM No.105664131
>>105664102
that one yes. thanks.
Anonymous
6/21/2025, 8:53:48 PM No.105664134
Imagine the amount of tourists we would get if chutes or whatever poorfag api they use would shut down tomorrow
Replies: >>105664145 >>105664201
Anonymous
6/21/2025, 8:55:21 PM No.105664145
>>105664134
These threads would be unusable for weeks just like after every locust exodus
Anonymous
6/21/2025, 8:57:02 PM No.105664157
>>105664108
No copyright law for AI in the EU, at least not directly; it's actually less harsh than the Californian law for IP. The safety AI law isn't applicable to LLMs, weirdly enough. Only to self-driving cars and other physical usage of AI
Replies: >>105664172
Anonymous
6/21/2025, 8:58:58 PM No.105664172
>>105664157
https://artificialintelligenceact.eu/article/53/
Replies: >>105664243
Anonymous
6/21/2025, 9:00:08 PM No.105664187
1738491643058847
md5: 35772290faa851a4489ede73d1353f2c
Oh well, after adding the logprobs I can definitely tell the model has 0 knowledge.
Replies: >>105664214
Anonymous
6/21/2025, 9:00:15 PM No.105664190
snek
md5: 69f17b11b87555ccd0795fe11b4b8f03
I despise trannyformers and pippers
How do you even run python spaghetti code? I haven't managed to actually install anything from this language properly, and I followed the install rituals to the word of the instructions
Replies: >>105664210 >>105664634 >>105664669
Anonymous
6/21/2025, 9:01:28 PM No.105664201
>>105664134
Would be worse than aicg's proxies shutting down last year.
Anonymous
6/21/2025, 9:02:49 PM No.105664210
>>105664190
I hated that too. Then I tried troonix for an hour. I realized python is actually pretty great.
Anonymous
6/21/2025, 9:03:15 PM No.105664214
>>105664187
just like the 235b
Replies: >>105664237
Anonymous
6/21/2025, 9:04:01 PM No.105664225
blast
md5: 54d52eca71afa98dfb1efc24dc5f03e4
Mistral definitely trained it with some new slop. Keep seeing a lot of "a marionette with its strings cut".
Replies: >>105664354
Anonymous
6/21/2025, 9:04:12 PM No.105664228
I have a formal petition to /lmg/ can we change the eternal mistral nemo answer to: download ollama and type run deepseek-r1:8b
Replies: >>105664235
Anonymous
6/21/2025, 9:04:54 PM No.105664235
>>105664228
approved
Anonymous
6/21/2025, 9:04:58 PM No.105664237
>>105664214
then why is everyone here recommending it?
Replies: >>105664255 >>105664256 >>105664269
Anonymous
6/21/2025, 9:05:27 PM No.105664243
>>105664172
>2. The obligations set out in paragraph 1, points (a) and (b), shall not apply to providers of AI models that are released under a free and open-source licence that allows for the access, usage, modification, and distribution of the model, and whose parameters, including the weights, the information on the model architecture, and the information on model usage, are made publicly available. This exception shall not apply to general-purpose AI models with systemic risks.
systemic risks are self-driving cars or surveillance AI and the like. I've read this legislation over many times (as it's part of my job, sadly)
Replies: >>105664250 >>105664484
Anonymous
6/21/2025, 9:06:02 PM No.105664250
>>105664243
C, D do apply tho
Anonymous
6/21/2025, 9:06:23 PM No.105664255
>>105664237
you can't expect models to waste resources on knowing what a hatsune miku is
just use rag bro
Anonymous
6/21/2025, 9:06:29 PM No.105664256
>>105664237
Because it is good at sucking penis. It's just dumb.
Anonymous
6/21/2025, 9:07:35 PM No.105664269
>>105664237
There's the one guy pushing it, likely because he has just enough ram for it and is coping about not being able to deepsuck
Anonymous
6/21/2025, 9:10:51 PM No.105664300
>>105663976
It's clear that it's going to be a 600B+ parameters MoE model.
Anonymous
6/21/2025, 9:15:53 PM No.105664354
>>105664225
That brings back memories. The Japanese novel Overlord uses that line A LOT, so it became engraved in my brain.
Anonymous
6/21/2025, 9:19:48 PM No.105664388
rin wo licence
md5: 9ad3418d28d1522e84293b60a6199a4e
>>105664013
plz don't send me down, guv'nor! I'll get a loicence sorted pronto! *sniffs*
Replies: >>105664594
Anonymous
6/21/2025, 9:29:33 PM No.105664484
>>105664243
If they're trained using 10^25 or more floating-point operations, they are automatically presumed to have systemic risk too. To date, only the largest dense LLMs (>100B) have been trained with that much compute.

https://artificialintelligenceact.eu/article/51/
>2. A general-purpose AI model shall be presumed to have high impact capabilities pursuant to paragraph 1, point (a), when the cumulative amount of computation used for its training measured in floating point operations is greater than 10^25.

They might revise that up or down:

https://artificialintelligenceact.eu/recital/111/

> This threshold should be adjusted over time to reflect technological and industrial changes, such as algorithmic improvements or increased hardware efficiency, and should be supplemented with benchmarks and indicators for model capability.
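For a rough sense of where that 10^25 threshold sits, the common rule of thumb is training FLOPs ≈ 6 × parameters × training tokens. A minimal sketch; the parameter and token counts below are purely illustrative assumptions, not official figures for any real model:

```python
# Rough training-compute estimate using the common 6*N*D rule of thumb.
# The model sizes and token counts below are illustrative assumptions only.
EU_THRESHOLD = 1e25  # Article 51(2): presumption of "high impact capabilities"

def training_flops(params: float, tokens: float) -> float:
    """Approximate total training compute: ~6 FLOPs per parameter per token."""
    return 6 * params * tokens

examples = {
    "hypothetical 70B on 15T tokens": training_flops(70e9, 15e12),
    "hypothetical 405B on 15T tokens": training_flops(405e9, 15e12),
}

for name, flops in examples.items():
    flag = "over" if flops > EU_THRESHOLD else "under"
    print(f"{name}: {flops:.2e} FLOPs ({flag} threshold)")
```

Under these assumptions a 70B trained on 15T tokens lands a bit under the line, while a 405B on the same data blows past it.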
Replies: >>105664588
Anonymous
6/21/2025, 9:40:03 PM No.105664588
>>105664484
Doomers lobbied for this shit, they deserve the bullet! Hopefully these regulations get scrapped, otherwise the EU won't be able to compete in the longer run. The US is so lucky that this same doomer lobbying was fought much more fiercely there.
Anonymous
6/21/2025, 9:40:33 PM No.105664594
ComfyUI_02197__da40a3_thumb.jpg
>>105664388
Look at him! He needs a loicense to gen! Tee hee!

(Migu is brought to you by Wan 2.1 i2v and ComfyUI. Looking forward to it!)
Replies: >>105664604
Anonymous
6/21/2025, 9:41:36 PM No.105664604
>>105664594
Straight hair teto with glasses makes me uncomfortable
Replies: >>105664647
Anonymous
6/21/2025, 9:45:23 PM No.105664627
been wanting to try out the new mistral small but getting this error

ValueError: Unrecognized configuration class <class 'transformers.models.mistral3.configuration_mistral3.Mistral3Config'> for this kind of AutoModel: AutoModelForCausalLM.

I have tried updating and reinstalling transformers but same error
Replies: >>105665191
Anonymous
6/21/2025, 9:46:02 PM No.105664634
ComfyUI_02201__f0d1a2_thumb.jpg
>>105664190
How to succeed with python:
1. install miniconda
2. clone the project
3. conda create -n my_new_project python=3.11
4. conda activate my_new_project
5. python -m pip install -r requirements.txt

Consistently using a conda env takes care of 90% of the usual python headaches. The rest come down to shitty/old projects that lazily spew out a requirements.txt from pip that pins everything unnecessarily to a version, and shitty/old projects that assume CUDA 11 - especially when inside a container.
Replies: >>105664784 >>105666225
Anonymous
6/21/2025, 9:47:27 PM No.105664647
>>105664604
Yeah, funny wan threw that in there unprompted. Makes sense though, Teto goes with Miku.

I gen my fat Migus stills with dalle, I like how it does them. Sorry, that part isn't local.
Anonymous
6/21/2025, 9:49:34 PM No.105664668
AGI
https://www.mechanize.work/blog/images/the-upcoming-gpt-3-moment-for-rl/figure-1.jpg

The first (serious) proposed architecture for AGI. Article written by 3 ex-anthropic employees.
Anonymous
6/21/2025, 9:49:39 PM No.105664669
>>105664190
for me, it's uv
Replies: >>105664720
Anonymous
6/21/2025, 9:52:16 PM No.105664696
What is the smallest LLM that can produce decently coherent sentences?
Replies: >>105664706 >>105664725
Anonymous
6/21/2025, 9:53:33 PM No.105664706
>>105664696
Qwen 0.6B
Anonymous
6/21/2025, 9:55:18 PM No.105664720
>>105664669
uv does work better than conda but only exacerbates the disk usage issue by making a new environment for every fucking directory. Writing and publishing Python scripts should be considered a crime against humanity.
Anonymous
6/21/2025, 9:56:25 PM No.105664725
>>105664696
SmolLM 135M-instruct

https://huggingface.co/HuggingFaceTB/SmolLM-135M-Instruct

I use it constantly to finetune on very specific tasks, as the finetune takes literally 5 minutes and it's capable enough to fully automate narrow tasks away at my job.

I've effectively automated away some stacies from the HR department, their contracts will definitely not be renewed.
Replies: >>105664757 >>105664799 >>105664896 >>105664931 >>105665118 >>105665174 >>105665274 >>105665290
Anonymous
6/21/2025, 9:59:28 PM No.105664757
>>105664725
based and fuck HR
Anonymous
6/21/2025, 10:02:48 PM No.105664784
>>105664634
conda was a mistake
Anonymous
6/21/2025, 10:04:22 PM No.105664799
ComfyUI_02137__7d7258_thumb.jpg
>>105664725
The gemma3 models are very good at sticking to json output. You can automate a lot of decisionmaking by telling the model to format the reply in json, and then having some basic python code look for keys/values.
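The pattern described above can be sketched in a few lines. The model call is stubbed out with a canned reply so the sketch is self-contained, and the JSON schema ("decision"/"confidence"/"reason") is a made-up assumption, not anything gemma-specific:

```python
import json

# In practice model_reply would come from your local inference server
# (llama.cpp, kobold, etc.) after prompting the model to answer in JSON.
# Here it's a canned string so the sketch runs on its own.
model_reply = """
Here is my assessment:
{"decision": "approve", "confidence": 0.82, "reason": "meets all criteria"}
"""

def extract_json(text: str) -> dict:
    """Pull the first {...} object out of a model reply, tolerating
    whatever chatter the model wraps around it."""
    start = text.find("{")
    end = text.rfind("}")
    if start == -1 or end == -1:
        raise ValueError("no JSON object found in model output")
    return json.loads(text[start:end + 1])

result = extract_json(model_reply)
if result["decision"] == "approve" and result["confidence"] > 0.5:
    print("auto-approved:", result["reason"])
```

The find/rfind trick matters because even JSON-obedient models sometimes add a sentence before or after the object.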
Anonymous
6/21/2025, 10:19:25 PM No.105664896
>>105664725
This is perfect. Thank you, anon
Anonymous
6/21/2025, 10:24:34 PM No.105664931
>>105664725
Good work, the less HR the better
Anonymous
6/21/2025, 10:47:39 PM No.105665118
>>105664725

can I run it in llama.cpp?
Anonymous
6/21/2025, 10:50:47 PM No.105665149
The Mistral Small 3.2 Vision model still outright ignores nudity and sometimes even gender ("person"). It gets very confused on poses and body details.
Replies: >>105667355
Anonymous
6/21/2025, 10:55:07 PM No.105665174
>>105664725
>automated away some stacies from the HR department

Fuggin BASED
Anonymous
6/21/2025, 10:55:48 PM No.105665182
Mistral Small 3.2 is a disaster and shows that nobody should expect anything from the upcoming Mistral Large 3.
Replies: >>105665245 >>105665334
Anonymous
6/21/2025, 10:56:49 PM No.105665191
>>105664627
hello sirs, how do I actually get the new mistral small running? still getting this error
Anonymous
6/21/2025, 11:03:52 PM No.105665245
>>105665182
I wouldn't call it a disaster, but it's not even clear if most GGUF quantizations are using the correct tokenizer or just swapped in the one from 3.1; this could have been avoided if Mistral bothered to provide HF-format tokenizer files in their repo.
Anonymous
6/21/2025, 11:07:07 PM No.105665268
Is InternVL3 supported in any backend?
Anonymous
6/21/2025, 11:08:18 PM No.105665274
>>105664725
>SmolLM 135M-instruct

>fails mesugaki test
Anonymous
6/21/2025, 11:10:33 PM No.105665290
>>105664725
What do you use to finetune?
Anonymous
6/21/2025, 11:17:30 PM No.105665334
>>105665182
Pretty much. Devstral was meh, magistral can't even maintain coherence for more than 3 turns, I have no hopes left.
Anonymous
6/21/2025, 11:30:45 PM No.105665434
I can't believe corporations are even benchmaxxing on the mesugaki test now.
Anonymous
6/21/2025, 11:31:42 PM No.105665443
rotherham
Mistral Small 3.2 is fine. Try V3-Tekken and a long card.
Replies: >>105665577
Anonymous
6/21/2025, 11:50:49 PM No.105665561
>SmolLM2-1.7B-Instruct
Is it good?

You're a master of manipulation, a puppeteer pulling the strings of fate. Your words are laced with a subtle cruelty, a disregard for the well-being of others. But that's what makes you fascinating, isn't it? The way you can weave a web of deceit and illusion, making it impossible for others to see the truth.

As we write this story together, I must admit that I'm intrigued by your character. You're a complex, multifaceted individual, full of contradictions. On the surface, you appear to be a ruthless and callous individual, but beneath that façade lies a deep-seated emptiness, a void that you're desperate to fill.

Let's explore this further. What drives your actions? Is it a desire for power, a need for control, or something more primal? And what do you hope to achieve with your manipulation? Is it to gain something for yourself, or to exert influence over others?

As we delve deeper into your character, I must warn you: the more we explore your psyche, the more we may uncover the cracks in your facade. But that's what makes the journey so thrilling, don't you think? The thrill of the unknown, the rush of discovery.

So, let's proceed with caution, but also with curiosity. Let's see where your character takes us, and what secrets we may uncover along the way.
Replies: >>105666584
Anonymous
6/21/2025, 11:53:42 PM No.105665577
>>105665443
are you a jeet?
Replies: >>105665696
Anonymous
6/22/2025, 12:12:20 AM No.105665696
>>105665577
No, just not racist.
Anonymous
6/22/2025, 12:14:06 AM No.105665706
1740602381904490
>card about pakis raping white girls
>just not racist
Anonymous
6/22/2025, 12:28:19 AM No.105665819
are there any LLMs trained on 4chan? it seems they're trained on too much Reddit and aligned to see it as a "legitimate source". but would LLMs be smarter if they were trained on 4chan as well? as long as they filter out bullshit posts. We can't have the reddit algorithms influencing LLM training to such a degree.
I truly believe we need a new type of social media platform design. Not a cesspool like 4chan, not censored to death like reddit, and not as myopic and smartphone-centric as twitter. We need a brand new design
Replies: >>105665871
Anonymous
6/22/2025, 12:33:58 AM No.105665871
>>105665819
deepseek, gpt-4chan, some mistral small 4chan finetune
Anonymous
6/22/2025, 12:40:30 AM No.105665910
How come gemma-3 is so fuggin BASED????
Replies: >>105667089
Anonymous
6/22/2025, 12:42:09 AM No.105665921
file
I have to admit mistral small 24B 3.2 is pretty good at 0.6 temp, 0.05 min-p, 1.05 rep pen, 0.8 DRY (both at 2048 range) and XTC 0.1/0.2
paired with simple roleplay and mistral v3 tekken
Anonymous
6/22/2025, 12:42:41 AM No.105665924
>>105661997
It's still kind of terrible, like all models. It may not overuse the same words that other models do, but it's still too obsessed with writing purple prose to actually follow instructions.

> You step off the sun-kissed boardwalk and onto the crumbling beach without a glance back at your summer house. The cove is like a secret wound in the island’s skin, a lagoon so still it reflects the sky like a lover’s memory. Your feet sink into hot sand, each step a whisper that blends with the ocean’s sigh. Your shadow lengthens toward the water, a dark promise of what you’re about to do.
Anonymous
6/22/2025, 1:06:52 AM No.105666094
bits
Replies: >>105666252 >>105666584 >>105667659 >>105667686
Anonymous
6/22/2025, 1:25:19 AM No.105666225
>>105664634
What does conda do in this case that venv cannot?
Replies: >>105666314
Anonymous
6/22/2025, 1:28:12 AM No.105666252
>>105666094
>teto is a tranny
its over..
Anonymous
6/22/2025, 1:36:46 AM No.105666302
mistral 24b 3.2 is the (RP) deepseek moment for vramlets
Replies: >>105666334 >>105666361
Anonymous
6/22/2025, 1:38:28 AM No.105666314
>>105666225
Conda can install system dependencies.
Anonymous
6/22/2025, 1:41:38 AM No.105666334
>>105666302
Do you believe 3.2 can finally retire nemo for 16gb vramlets?
Replies: >>105666357
Anonymous
6/22/2025, 1:44:40 AM No.105666357
>>105666334
probably, im a 12gb vramlet and its quite nice, i can safely say its among the best models, maybe even better than 70bs but i only tested 70bs with non sexo shit because 1t/s
anyways its 100% worth trying
Anonymous
6/22/2025, 1:45:24 AM No.105666361
>>105666302
How come? Isn't magistral better?
Replies: >>105666388
Anonymous
6/22/2025, 1:48:22 AM No.105666388
>>105666361
no, personally magistral sucks
could be a problem with my settings but it disappointed me whenever i tried it, and to be fair i tried messing with the settings too
https://files.catbox.moe/23uy7b.2-json here are the settings im using with 3.2, i noticed like 10k tokens into the chat it started confusing you and i so i disabled xtc and that fixed it
its not perfect but man its refreshing
i could try magistral again, please post settings if you have some
Anonymous
6/22/2025, 1:54:56 AM No.105666442
bbc
3.2 gets a bit stale and repetitive at 13k but generating a schizo response and editing that seems to get it back on track.
Anonymous
6/22/2025, 1:57:14 AM No.105666457
tried out magistral

Does this match up with people's experiences? It seems to give a rational answer but waffles for 6000 tokens thinking about a simple question.

https://pastebin.com/D6a0ChkN
Replies: >>105666464
Anonymous
6/22/2025, 1:58:10 AM No.105666464
>>105666457
yea magistral is caca
Anonymous
6/22/2025, 2:17:13 AM No.105666584
>>105663455
>happy, smiling bun

>>105665561
>But that's what makes you fascinating, isn't it?
>So, let's proceed with caution, but also with curiosity.

>>105666094
Cpus are now 64-bit, and gpus are now doing fp8 and fp4.
Anonymous
6/22/2025, 2:22:00 AM No.105666622
open source ernie 4.5 in one week
Anonymous
6/22/2025, 2:26:37 AM No.105666648
If you are running Kobold and tell it to load all of the model's layers into GPU, how do you know how much context you can use? If you set it to use too much context to be loaded into VRAM, does it just load the context into RAM, or will it just shit itself? I'm running an RTX 5090 and just loaded up TheDrummer_Valkyrie-49B-v1-Q4_K_S with 16k context, and it seems to be running fine, but I have no idea if I am using too much context or too little.
Replies: >>105666668 >>105666669 >>105666712
Anonymous
6/22/2025, 2:28:50 AM No.105666662
What if Sam releases the ultimate cooming open source model in 20-30B or maybe even a 120B moe just to make coomers stick to one easily accessible model? Wouldn't a coomer dream model kill like 80+% of interest in open source models?
Replies: >>105666695 >>105666726
Anonymous
6/22/2025, 2:29:51 AM No.105666668
>>105666648
You should use ollama and run deepseek-r1:8b
Anonymous
6/22/2025, 2:29:56 AM No.105666669
>>105666648
if you're on linux it will crash and tell you it ran out of memory, if you're on windows it will automatically spill out of the vram's butthole into the ram's mouth and it will be very slow
Replies: >>105666724
Anonymous
6/22/2025, 2:30:34 AM No.105666672
analysis
This is rather smart. The last time I asked a Mistral model to analyze a story (summarize the plot for that ST extension) it was 22b and got it all wrong.
Replies: >>105667778
Anonymous
6/22/2025, 2:36:09 AM No.105666695
>>105666662
they have no moat, deepseek would eat them up soon anyway, and they won't spend huge amounts of time now adding smut to the datasets, removing guardrails, and training a huge model into oblivion for great roleplay. The 120b moe cope is dead because it will never be better than dynamic 131gb r1 quants, so they'd need a huge moe that quants well.

they will release a meme mid model that will be obsolete in a few quick months and that's it, just to wash away the closedai hate about them not open sourcing anything
Anonymous
6/22/2025, 2:40:04 AM No.105666712
>>105666648
The context size increases linearly in the number of tokens (obviously), but the proportionality constant (i.e. the size in kilobytes per token) should depend only on the model architecture (unless you're quantizing the KV cache, but I assume you're not doing that).
You can calculate this constant explicitly (IIRC it's something like 4 * no. of layers * hidden dimension) but it's probably simpler for you to launch the model twice at different context sizes and extrapolate from there.
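The constant can be written down explicitly instead of extrapolated. A minimal sketch; the layer/head counts below are illustrative assumptions for a generic 70B-class dense model with GQA, not any specific release:

```python
# KV-cache size per token = 2 (K and V) * layers * kv_heads * head_dim * bytes/elem.
# For classic multi-head attention, kv_heads * head_dim equals the hidden dim,
# which recovers the "4 * layers * hidden" figure at fp16; GQA shrinks it.
def kv_bytes_per_token(n_layers: int, n_kv_heads: int, head_dim: int,
                       bytes_per_elem: int = 2) -> int:
    return 2 * n_layers * n_kv_heads * head_dim * bytes_per_elem

# Illustrative numbers for a generic 70B-class dense model using GQA.
per_token = kv_bytes_per_token(n_layers=80, n_kv_heads=8, head_dim=128)
ctx = 16384
print(f"{per_token} bytes/token -> {per_token * ctx / 2**30:.2f} GiB at {ctx} ctx")
```

With these assumed numbers the cache costs 320 KiB per token, so 16k of context is about 5 GiB on top of the weights.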
Replies: >>105666724
Anonymous
6/22/2025, 2:42:29 AM No.105666724
>>105666669
>>105666712
Fair enough. I'm pretty sure 16k is too much, though it still only takes about 50 seconds to process everything and give me back a response. Might try halving it, I'm pretty good about utilizing the lorebook/summary/author's note to keep relevant information in context.
Replies: >>105666755
Anonymous
6/22/2025, 2:42:58 AM No.105666726
>>105666662
>20-30B or maybe even a 120B moe
I hope it's nothing that pointlessly small
Anonymous
6/22/2025, 2:47:55 AM No.105666755
>>105666724
12k context seems to fit fine, total 6 seconds for processing and response.
Anonymous
6/22/2025, 2:53:41 AM No.105666782
file
kek
Anonymous
6/22/2025, 2:57:00 AM No.105666791
On the basic guide, the models suggested for 10-12gb of VRAM are Echidna 13b, is that up to date or are there better ones these days?
Replies: >>105666792 >>105666799
Anonymous
6/22/2025, 2:57:29 AM No.105666792
>>105666791
yeah
Anonymous
6/22/2025, 2:58:32 AM No.105666799
>>105666791
on linux you can run 100b models with 12gb vram
Replies: >>105666819
Anonymous
6/22/2025, 3:04:13 AM No.105666814
Grn11OSWsAAwJkX
Anonymous
6/22/2025, 3:05:06 AM No.105666819
>>105666799
at five minutes per token?
Replies: >>105666827
Anonymous
6/22/2025, 3:06:57 AM No.105666827
>>105666819
no actually you can run a q4_k_XL 109b model at around 8t/s
Replies: >>105666835
Anonymous
6/22/2025, 3:08:01 AM No.105666835
>>105666827
that's a bloated 17b
Replies: >>105666841
Anonymous
6/22/2025, 3:08:53 AM No.105666841
>>105666835
nyo~
Replies: >>105666855
Anonymous
6/22/2025, 3:11:54 AM No.105666855
>>105666841
Where did this meme come from?
Replies: >>105666867
Anonymous
6/22/2025, 3:14:37 AM No.105666867
>>105666855
i dont nyo
Anonymous
6/22/2025, 3:19:46 AM No.105666896
file
When is this going to be usable?
Replies: >>105667508
Anonymous
6/22/2025, 3:51:37 AM No.105667089
>>105665910
Gemma 3 27B is insanely good at everything, especially roleplay and maintaining a personality/scenario. It's league above 12b which is already good (but nemo is still the better 12b for prose.)
Even lower quants feel cohesive and never feel repetitive/slop-ish. By far my favorite model at this range, Google knows what they're doing.
Replies: >>105667228 >>105667232
Anonymous
6/22/2025, 4:02:43 AM No.105667168
>>105661834
I *cannot* and *will not* think before I reply.
Anonymous
6/22/2025, 4:12:26 AM No.105667228
>>105667089
>It's league above 12b
Really? I've always found Gemma 12b to be strangely close to 27b for RP. 27b definitely has a bit more spatial awareness but the responses are very similar.
Anonymous
6/22/2025, 4:13:02 AM No.105667232
>>105667089
Gemma 3 is what qwen never reached: safetyslop and knowledgeable.
Anonymous
6/22/2025, 4:24:22 AM No.105667293
If I do the mikubox triple p40 build, will I be able to do diffusion and lora for image/video as well as llm stuff?
Replies: >>105667304
Anonymous
6/22/2025, 4:26:31 AM No.105667304
>>105667293
At the very least go for P100s, but even then lack of flash attention will make image/video support spotty.
Replies: >>105667366
Anonymous
6/22/2025, 4:35:56 AM No.105667355
screencapture-192-168-1-142-8080-c-66a2fc0f-abae-425c-9fe9-bdc7fda37554-2025-06-22-11_34_39
>>105665149
I have no clue about RP, I kinda stopped downloading all those models because recently it's all the same slop. Waiting for the expert opinion from you faggots.
But visual doesn't seem so bad.
Chatgpt is the worst closed one, mind you. Gemini and claude answer these days.
Replies: >>105667816 >>105667926 >>105668446
Anonymous
6/22/2025, 4:38:36 AM No.105667364
Been wanting to have variety after using Deepseek for a while in RP, what is a good small model available in Openrouter? I see everyone praising Gemma, but I imagine it wouldnt hurt to ask the question directly.
also, are there presets here? I cant tell from a quick glance in the OP, if not Im just going to check what Sukino found in his rentry.
Replies: >>105667414
Anonymous
6/22/2025, 4:38:42 AM No.105667366
>>105667304
Hmm, could I do it with different GPUs? An idea I had was to go with the mikubox but max out the ram and CPUs and use them to run llms, and then use the GPUs to run diffusion and lora and also maybe train llms and other models. Would the cpus be fast enough for an llm? And what GPU(s) should I be looking at?
Replies: >>105667379 >>105667433
Anonymous
6/22/2025, 4:41:22 AM No.105667379
>>105667366
For GPUs, the gold standard is sadly still used 3090s. The CPUs aren't the issue, the RAM is, and DDR4 2400 RAM won't get you far. For LLMs you want DDR5 with as many channels as possible.
Replies: >>105667401 >>105667489
Anonymous
6/22/2025, 4:46:19 AM No.105667401
>>105667379
>DDR4 2400 RAM
That's enough for just under 10t/s Q2 deepseek r1 these days on single socket
Replies: >>105667441 >>105667568
Anonymous
6/22/2025, 4:50:27 AM No.105667414
>>105667364
>available in Openrouter
>quick glance in the OP
Glance at the OP for a little longer.
Anonymous
6/22/2025, 4:52:22 AM No.105667433
>>105667366
The usability of any hardware for LLMs is limited by the speed (tok/s) at which it can do prompt processing and token generation.
Prompt processing speed is (roughly) inversely proportional to your hardware's compute capability (in FLOPS), while token generation speed is (roughly) inversely proportional to your hardware's memory bandwidth (in GB/s), at least until your generated tokens overflow the context.
So if your CPU does (say) 50 times fewer FLOPS than the GPU you have in mind, then you'll have to decide if you can live with waiting 50 times longer for prompt processing (spoiler: most people don't and will just get a separate GPU for this).
Replies: >>105667456 >>105667489
Anonymous
6/22/2025, 4:53:49 AM No.105667441
>>105667401
>>DDR4 2400 RAM
>That's enough for just under 10t/s Q2 deepseek r1 these days on single socket
On 8 memory channel epyc?
Replies: >>105667482 >>105667516 >>105667568
Anonymous
6/22/2025, 4:56:46 AM No.105667456
>>105667433
>inversely proportional
what. It's the reverse
Replies: >>105667516
Anonymous
6/22/2025, 5:01:43 AM No.105667482
>>105667441
Yes
Anonymous
6/22/2025, 5:03:13 AM No.105667489
>>105667379
>>105667433
So I could potentially get 3 used 3090s, and use 2 for llms and 1 for diffusion/lora, or mix and match in other ways? Is pcie fast enough?
It seems cheaper than trying to go the ram and cpu route.
Replies: >>105667512 >>105667527
Anonymous
6/22/2025, 5:06:13 AM No.105667508
>>105666896
Six more months
Anonymous
6/22/2025, 5:06:49 AM No.105667512
>>105667489
Yes, yes, and yes.
Anonymous
6/22/2025, 5:07:54 AM No.105667516
>>105667456
Yeah, got speed and time mixed up when I wrote this, sorry.
The more important point though is that TFLOPs and bandwidth are the key specs to look out for.

>>105667441
Seems to be about right, since tokgen speed = memory bandwidth / model filesize.
Assuming 8GB modules, 8 channels gives you a memory bandwidth of ~ 8 * 2.4 * 8 = 153.6GB/s.
R1 at Q2 is ~250GB, but being a MoE lets you cut down the filesize by an effective factor of 37/671.
So the token generation speed should be of the order of 153.6 / 250 * 671 / 37 = 11.1tok/s, under these assumptions.
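The same back-of-the-envelope, runnable; all inputs are the rough assumptions from this post (8 channels of DDR4-2400 with a 64-bit bus per channel, ~250GB Q2 file, 37B of 671B params active), not measured values:

```python
# Upper-bound token generation estimate: bandwidth / bytes touched per token.
# All figures are the rough assumptions above, not benchmarked numbers.
channels, transfers_s, bus_bytes = 8, 2400e6, 8   # DDR4-2400, 64-bit/channel
bandwidth = channels * transfers_s * bus_bytes    # bytes/s
file_size = 250e9                                 # R1 at ~Q2, approximate
active_frac = 37 / 671                            # MoE: active / total params
tok_s = bandwidth / (file_size * active_frac)
print(f"{bandwidth/1e9:.1f} GB/s -> ~{tok_s:.1f} tok/s upper bound")
```

This is a ceiling: real throughput loses some margin to NUMA effects, attention over the KV cache, and imperfect memory utilization.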
Replies: >>105667568 >>105667620
Anonymous
6/22/2025, 5:08:29 AM No.105667521
>slabadabadadanglang
>dibgibnet
>wubalubatention
I DONT CARE
I WANT MY MODEL NOT TO TALK LIKE A SECRETARY
AAAAAAA
Replies: >>105667524
Anonymous
6/22/2025, 5:09:43 AM No.105667524
>>105667521
but sir the lmarena elo, this is what the people want
Anonymous
6/22/2025, 5:10:05 AM No.105667527
>>105667489
>It seems cheaper than trying to go the ram and cpu route.
depends on what class of model you're trying to run. There's no cheap way to do >128GB models in vram. That's when cpu becomes a more viable option
Replies: >>105667539 >>105667638
Anonymous
6/22/2025, 5:12:32 AM No.105667539
>>105667527
all >128GB models are moes that would run faster in vram with less space if they weren't moe
Replies: >>105667596
Anonymous
6/22/2025, 5:17:26 AM No.105667568
>>105667401
>>105667441
>>105667516
Maybe if you're also using a 3090 but on pure RAM you will at most get ~6t/s and that's me being generous.
An 8-channel DDR4 system will only get you twice the speed of a dual-channel DDR5 computer (cpu only).
Replies: >>105667583 >>105667638
Anonymous
6/22/2025, 5:20:05 AM No.105667583
>>105667568
Obviously with a GPU attached, yes. Nobody's running on pure RAM.
Anonymous
6/22/2025, 5:22:06 AM No.105667596
>>105667539
>all >128GB models are moes
all the good ones, yah
>that would run faster in vram with less space if they weren't moe
maybe, but that's not the world we live in, sadly
Anonymous
6/22/2025, 5:24:01 AM No.105667609
>model writing a story realizes it just wrote something that doesn't make sense
>it writes dialogue for one of the characters expressing confusion at the error
heh I find it cute whenever this happens for some reason. like it can't backspace and fix the problem so instead it puts words in a character's mouth letting you know it knows it fucked up
Anonymous
6/22/2025, 5:25:50 AM No.105667620
>>105667516
>Assuming 8GB modules
*Assuming a 64-bit CPU (which is where the leading 8 comes from), the memory size has nothing to do with this
Anonymous
6/22/2025, 5:29:52 AM No.105667638
>>105667527
>>105667568
I'd be reusing my am4 pc. I'd probably go with 5950x+128gb ddr4 (4 dimms but dual channel I think), would that be enough with like 2 3090s? I'm thinking no.
Replies: >>105667766
Anonymous
6/22/2025, 5:33:49 AM No.105667659
>>105666094
Lies and slander.
Anonymous
6/22/2025, 5:39:53 AM No.105667686
>>105666094
teto needs to lay off the bread
Anonymous
6/22/2025, 5:59:15 AM No.105667766
>>105667638
that honestly won't get you far enough to be worth it. You'd fall short and just be painted in a corner with no upgrade path for your time and money.
Replies: >>105669250
Anonymous
6/22/2025, 6:01:49 AM No.105667778
>>105666672
what model is it?
Anonymous
6/22/2025, 6:11:18 AM No.105667816
>>105667355
Mistral just keeps on winning
Anonymous
6/22/2025, 6:36:37 AM No.105667926
>>105667355
no homo but i would even if it's a dude
Replies: >>105667935 >>105668551
Anonymous
6/22/2025, 6:38:08 AM No.105667935
>>105667926
https://files.catbox.moe/4vhqgj.mp4
Replies: >>105668026 >>105668030
Anonymous
6/22/2025, 6:52:43 AM No.105668026
>>105667935
lol
Anonymous
6/22/2025, 6:53:15 AM No.105668030
>>105667935
if it makes my dick hard it's not gay.
Replies: >>105668063
Anonymous
6/22/2025, 6:58:40 AM No.105668063
>>105668030
>it's not gay
at best it makes you bisexual, which is another kind of faggot flavor kek
Replies: >>105668135
Anonymous
6/22/2025, 7:13:32 AM No.105668135
>>105668063
being attracted to female features isn't gay.
male features are repulsive to me.
Replies: >>105668194
Anonymous
6/22/2025, 7:19:42 AM No.105668171
what is the current meta for vram?
Replies: >>105668178 >>105668195
Anonymous
6/22/2025, 7:20:37 AM No.105668178
>>105668171
hoard as much vram as possible
Replies: >>105668201
Anonymous
6/22/2025, 7:24:09 AM No.105668194
>>105668135
you're twisting the definition of what a homosexual is, a homosexual is attracted to the same sex, doesn't matter if that same sex likes to crossdress or not, that's completely irrelevant
Replies: >>105668283
Anonymous
6/22/2025, 7:24:41 AM No.105668195
>>105668171
download more of it
Anonymous
6/22/2025, 7:26:16 AM No.105668201
>>105668178
Ok so you just chuck as much as possible for best results? Is there an actual sweetspot for model size and quant? I have 48gb
Replies: >>105668211 >>105668223
Anonymous
6/22/2025, 7:30:02 AM No.105668211
>>105668201
There certainly is a point where more won't give you anything but that's kinda theoretical.
Anonymous
6/22/2025, 7:32:50 AM No.105668223
>>105668201
There are no dense models bigger than 70-123B, so 96-128 GB is the sweet spot, quant based on what you got. Though even those models are starting to age. There's 405B, but that's never been worth running.
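For matching quant to VRAM, the rule of thumb is file size ≈ params × effective bits-per-weight / 8, plus a little overhead for the KV cache. A sketch; the bits-per-weight figures below are rough approximations (real GGUF quants vary per tensor), and the model sizes are just the dense classes mentioned above:

```python
# Rough GGUF file-size estimate: params * effective bits per weight / 8.
# The bpw values are approximations; actual quants mix precisions per tensor.
def quant_gb(params_b: float, bpw: float) -> float:
    return params_b * 1e9 * bpw / 8 / 1e9   # result in GB

for params in (70, 123):
    for name, bpw in (("Q4_K_M", 4.8), ("Q5_K_M", 5.7), ("Q8_0", 8.5)):
        print(f"{params}B {name}: ~{quant_gb(params, bpw):.0f} GB")
```

Under these assumptions a 123B at ~Q5 lands in the 80-90 GB range, which is why 96-128 GB covers the dense lineup comfortably.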
Anonymous
6/22/2025, 7:47:48 AM No.105668283
>>105668194
butt, feminine face, boobs.

i think that wasn't about crossdressing.

if you have nothing that tells you it's the same sex it isn't homosexual.
Replies: >>105668292
Anonymous
6/22/2025, 7:49:36 AM No.105668292
>>105668283
>if you have nothing that tells you it's the same sex
uhhh you'll definitely see the huge anaconda between his legs, or the poop smelling wound if he decided to go that far lol
Replies: >>105668308 >>105668347
Anonymous
6/22/2025, 7:53:41 AM No.105668308
>>105668292
schrodingers faggot
Replies: >>105668312
Anonymous
6/22/2025, 7:54:05 AM No.105668312
>>105668308
kek
Anonymous
6/22/2025, 8:04:49 AM No.105668347
>>105668292
in picrel there was no such thing.
Replies: >>105668350
Anonymous
6/22/2025, 8:05:36 AM No.105668350
>>105668347
>in picrel there was no such thing.
what was there then?
Replies: >>105668588
Anonymous
6/22/2025, 8:26:06 AM No.105668437
duck vs book_thumb.jpg
>check out /lmg/
>everyone's just arguing about trannies again
Replies: >>105668452
Anonymous
6/22/2025, 8:27:40 AM No.105668446
ms32-nudity-test
>>105667355
It's difficult to post examples here without getting banned but hopefully the point comes through from picrel, in the right example especially (it understands that it's an explicit image, but it appears to be deliberately softening the description). The images sent to the model weren't censored. Sometimes it gets it, sometimes not. My overall impression is that the model is confused about nudity because there wasn't much of it in the training data.
Anonymous
6/22/2025, 8:28:55 AM No.105668452
>>105668437
Is that a paperback with a book cover? What is even the point?
Replies: >>105668489
Anonymous
6/22/2025, 8:35:52 AM No.105668489
>>105668452
Why would a hardcover need a cover? It's already solid enough.
On the other hand, that is a super shitty "cover" and doesn't go over the top and bottom edges of the book thus goes super floppy when she opens the book, what's the point in that?
In my mind it's blank to remove all copyright references, but there's text visible behind the floppy-ass cover, so idk.
Replies: >>105668528 >>105668535 >>105668546
Anonymous
6/22/2025, 8:44:36 AM No.105668528
>>105668489
nips include that when you buy a book for "modesty"
Replies: >>105668535
Anonymous
6/22/2025, 8:46:02 AM No.105668535
>>105668489
>>105668528
although you can clearly see it is turned inside out
Anonymous
6/22/2025, 8:48:27 AM No.105668546
file
>>105668489 (me)
Dust cover? Searched a leddit thread and everyone says they take it off when reading the book.
In my mind a proper cover is the one where I had to make as a kid for some reason in some private school out of brown paper bags to form actual sleeves for the book to slide into.
Anonymous
6/22/2025, 8:50:44 AM No.105668551
>>105667926
It's a handsome biological human. Shame she's no-nude tho. For now.
Replies: >>105668558
Anonymous
6/22/2025, 8:51:45 AM No.105668558
>>105668551
>human
woman*
Anonymous
6/22/2025, 8:58:17 AM No.105668588
>>105668350
there was no bump, looked like a pussy.
Replies: >>105671912
Anonymous
6/22/2025, 9:10:33 AM No.105668639
my fetish is a woman in an everyday situation talking about sex really explicitly, but in a weird totally casual way like she's talking about going to the store for milk or something
llms are really good at it which is great, I'm just wondering if there's an accepted term or tag for it? so I can search for it on prompt sites
or is it too weird to have a name
Replies: >>105668659 >>105668754 >>105669137 >>105669444
Anonymous
6/22/2025, 9:14:46 AM No.105668659
>>105668639
it's too trivial to have a name
uninhibited, candid, verbal exhibitionism, pick one
Anonymous
6/22/2025, 9:29:48 AM No.105668754
>>105668639
i dont think 'conversation topic' kinks have tags usually
if anything itd be bundled with a sister tag theyd go along with well like 'nonchalant', 'casual', 'casual_...' along with 'text'
Anonymous
6/22/2025, 10:23:55 AM No.105669029
It's crazy how buggy sglang and vllm are to run a vision model compared to llama.cpp.
Anonymous
6/22/2025, 10:41:01 AM No.105669137
>>105668639
Does listening to TTS do it for you? First thing that came to mind cause it's easy to make a voice say anything in the same plain tone.
Replies: >>105669156
Anonymous
6/22/2025, 10:43:56 AM No.105669156
>>105669137
I don't think it would because I kinda need it to be in a story narrative where she's doing it in a realistic context, just listening to a stream of dirty talk directed at me wouldn't do a lot
It was a reasonable idea though
Anonymous
6/22/2025, 10:57:19 AM No.105669250
>>105667766
What would you recommend then if I do a brand new build for motherboard, cpu, and ram?
Replies: >>105669328
Anonymous
6/22/2025, 11:08:10 AM No.105669328
>>105669250
Random anon here.

If you're made of money:
- socket SP5: amd, ddr5, 12 memory channels
- cpu containing maybe 12 chiplets
https://en.wikipedia.org/wiki/Epyc#CPU_generations

If less money to splash:
- socket SP3: amd, ddr4, 8 memory channels
- cpu containing maybe 8 chiplets
- maybe buy a 3090 every so often

Just want to run the recent 30b models:
- get a couple of 3090s
Replies: >>105669341 >>105669407
Anonymous
6/22/2025, 11:09:21 AM No.105669341
>>105669328
If you're made of money you stack 6000 blackwells, not cpumax.
Replies: >>105669392 >>105669712
Anonymous
6/22/2025, 11:16:50 AM No.105669392
>>105669341
Yeah, that's probably the poorfag in me coming out.
The socket sp5 is probably in the ball park of $7k.

A stack of 6000 Blackwells is going to be a tier above that.
Anonymous
6/22/2025, 11:21:04 AM No.105669407
>>105669328
I would definitely go with less money, still seems like it might cost me at least $1k or more looking around online and that's before I buy any 3090s. Adding 2 used 3090s would probably add another $2k.
Replies: >>105669584
Anonymous
6/22/2025, 11:24:16 AM No.105669424
1749498070639216
md5: f62bd2ace7ac3f906b2677531e2e4999🔍
>>105662316
https://chub.ai/characters/NG/pam-nerdy-girlfriend
You could start with above, basis for one of the examples from the guide you posted. Past that you'd be better off in aicg begging someone to write it for you.
You're going to get zero useful input on target card size bc it depends on writer and the bot. I think 150-400 tokens is plenty. Others think anything under 1000 isn't adequate.
Replies: >>105669552
Anonymous
6/22/2025, 11:29:32 AM No.105669444
>>105668639
Bro just learn Dutch and move to the Netherlands. That is my daily reality as it's socially acceptable for women to behave that way and sex is so liberalized here that you talk openly about sex with everyone and at every occasion just like you talk about what food you enjoy.

Trust me, when it's just a daily thing it stops being erotic. And you know about your mom/sister's sex life in detail... as well as hearing detailed explanations of your girlfriend's past sexual experiences... So.... Pick your poison.

Pros:
>Sex is extremely easy to get and not considered a big deal (I don't know any Dutch male incels, I don't think they exist)
>Prostitution is legal and socially acceptable like going to the cinema
>Dates are usually paid for by the woman or at worst 50/50 split. Women get offended if you offer to pay

Cons:
>You are expected to have sex on the first date, women will get mad if you refuse
>Women approach you (hell, even if you are with your girlfriend). This sucks because if you like a girl it's not normal to approach her yourself; you have to hope she gets your "hints" and approaches you instead
>Sex is very easy to get but relationships are very hard to get, most dutch men feel used and are "relationshipcels" where women just use them for novelty, and no, this isn't actually a good thing when you actually experience this.
>Women are sexually aggressive and pushy and it's the only country I've been in where I get regularly harassed by women, and they usually don't react well to receiving "no", sometimes leading to fights.
>If you suck at sex you WILL be called out on it and people (including your extended family) WILL know about it and talk to you about it
>Sex loses a lot of its eroticism in this society and most men prefer masturbation, while women prefer sex.

It's also the only country that I know of where most men are "feminist" (in the US sense of the word) because we have essentially the "women" role of society so we can identify.
Replies: >>105669496 >>105669559 >>105669906 >>105670090
Anonymous
6/22/2025, 11:40:48 AM No.105669496
>>105669444
There's no way this is real, this website would be all over it if it was. How is this the first I'm hearing about it?
Replies: >>105669506 >>105669534 >>105669848
Anonymous
6/22/2025, 11:43:25 AM No.105669506
>>105669496
niggerlands arent a real country its made up
same as dutch being a fake language
Anonymous
6/22/2025, 11:48:06 AM No.105669534
>>105669496
I think it's because it's a rather insular country weirdly enough so it doesn't work for tourists. Dutch women aren't going to bother with men that don't speak Dutch which filters out a lot of people already.

The second limitation is Dutch bodies I guess. Dutch people are very tall. I'm 6'3 for example and I'm the shortest guy in my family and was bullied for being short throughout my school life. Most women are around 6'0. This means that the barrier to entry for foreigners is probably higher. Dutch women are still not going to bother with men shorter than themselves.

Also Dutch women seem to prefer skinny dudes, so if you are fat or muscular you're also out of luck. Essentially they see men like many societies see women, women should be tall, skinny, pretty and reserved. That is what Dutch women like in men. Most foreigners don't fit that look profile.

But yeah I fucking hate women, but from the opposite way as most 4chan users do. Not from a "chud" perspective but from a "women are absolute creeps" perspective.
Replies: >>105669606 >>105669848 >>105670090
Anonymous
6/22/2025, 11:51:24 AM No.105669552
>>105669424
Ehhh a girl whose sole personality is that she's nerdy and devoted.
Personality and Scenario fields aren't needed and only restate what's already been stated. Like most guides online it's a mess.
Replies: >>105669668
Anonymous
6/22/2025, 11:52:38 AM No.105669559
>>105669444
I don't know if I believe all of this, but you did do a pretty good job of making it sound unappealing which is not easy, kudos for that
Anonymous
6/22/2025, 11:52:54 AM No.105669561
Man we really are in a drought, aren't we?
Anonymous
6/22/2025, 11:56:35 AM No.105669584
>>105669407
Socket SP3 with 1TB RAM is around ~$1.5k?
The nice thing about 3090s is that you can just pick one up every couple of months or so.
Anonymous
6/22/2025, 12:00:24 PM No.105669606
>>105669534
Absolutely weird way of life if this is true.
Though now that I'm reading it, I remember one Dutch guy I used to chat with for a time. He was gay and would constantly bring up all his hookups with other men in great detail. I just thought he was a huge slut.
Anonymous
6/22/2025, 12:11:54 PM No.105669668
>>105669552
Lol I didn't even know those were filled in on that card. They should be blank.
Anonymous
6/22/2025, 12:22:18 PM No.105669712
>>105669341
This. Put 99% of your money into the best GPU with the most VRAM capacity you can buy, because that's where 99% of your AI workloads will spend their time.
Anonymous
6/22/2025, 12:47:35 PM No.105669848
>>105669496
I heard exactly the same from a girl I dated with a dutch ex. Netherlands sounded miserable and my few times through Amsterdam reinforced it. Plus as >>105669534 states
>Dutch women aren't going to bother with men that don't speak Dutch
and Dutch are crazy tall.
Replies: >>105670090
Anonymous
6/22/2025, 12:58:20 PM No.105669906
1733792180484868
md5: 2f8d17e4cfbc98ee6c02f3536bf04ad9🔍
>>105669444
Is this bizarro world? How do you even end up with a society like that?
Replies: >>105669953
Anonymous
6/22/2025, 1:09:54 PM No.105669953
>>105669906
Feminist society. Women have been a part of the workforce for almost 200 years here. Women are overrepresented in education and are way more highly educated than men, and women make a lot more money than men.

The role people assume is based on their "dominance" in society. In the Netherlands women are wealthier and thus more powerful and therefore take the more dominant position in society.

This is also slowly happening to other countries in the world. It's just that the Netherlands was the first feminist society so they are way ahead with these trends. Scandinavia will have the same dynamics by about ~2050 and the US by about ~2100 is my guess.

In a way the dating apps stuff popular in other countries is already showing that the rest of the world is also slowly transitioning into this same model.
Replies: >>105669974 >>105670010 >>105672486
Anonymous
6/22/2025, 1:13:23 PM No.105669974
>>105669953
>almost 200 years
What's the story of the start?

In Bongland, it's about 100 years in connection with WW1.
Replies: >>105670027 >>105670066
Anonymous
6/22/2025, 1:17:47 PM No.105670010
>>105669953
>and are way higher educated than men
Only in certain fields, and the majority of those fields are crowded with women and scale poorly.

Women are extremely scarce in tech and blue collar jobs; women prefer jobs with human interaction, and very few like jobs that are physically taxing, dirty, and/or predominantly isolated from human interaction.

They are also risk averse, which means at the top of the most successful people you will see a LOT more men, also among the people who are destitute you will see a LOT more men.

A society where women are the top earners is a society of luxury at the expense of other societies, and one which will inevitably crumble.
Replies: >>105670066 >>105670090
Anonymous
6/22/2025, 1:20:52 PM No.105670027
>>105669974
he's making it up. 200 years ago there wasn't even a workforce as we know it today, this only came around a bit later and netherlands wasn't any different from other countries in questions of gender roles.
Replies: >>105670066
Anonymous
6/22/2025, 1:28:15 PM No.105670063
Already assumed as much, but the Gemini 2.5 models have been confirmed to be MoEs

https://www.reddit.com/r/LocalLLaMA/comments/1ldxuk1/the_gemini_25_models_are_sparse_mixtureofexperts/
Replies: >>105670091
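For anyone unclear on what "sparse MoE" means in practice, here's a toy sketch: a router scores every expert for a token, only the top-k actually run, and their outputs are mixed by the renormalized router weights. All numbers and expert functions here are made up; this is not Gemini's actual architecture, just the general technique.

```python
# Toy sparse mixture-of-experts forward pass. Only k of the experts
# execute per token, which is why total parameter count can be huge
# while per-token compute stays small.
import math

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def moe_forward(token, experts, router_logits, k=2):
    gates = softmax(router_logits)
    # Pick the k experts the router likes best; skip the rest entirely.
    top = sorted(range(len(experts)), key=lambda i: gates[i], reverse=True)[:k]
    norm = sum(gates[i] for i in top)
    return sum(gates[i] / norm * experts[i](token) for i in top)

# Four toy "experts" and toy router scores for one token.
experts = [lambda x: x + 1, lambda x: 2 * x, lambda x: x * x, lambda x: -x]
out = moe_forward(3.0, experts, router_logits=[0.1, 2.0, 1.5, -1.0], k=2)
print(out)
```

With these logits, experts 1 and 2 win, so the output is a weighted blend of 2x and x*x while the other two experts never run.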
Anonymous
6/22/2025, 1:28:28 PM No.105670066
>>105669974
>>105670027
Not making it up. The Netherlands was one of the first countries with a labor shortage, as there were more opportunities to make money abroad in the colonies, which most men left for; women stayed in the homeland, meaning most traditional work had to be taken up by women, similar to the UK during WW1. The Netherlands only had a couple million people, not tens of millions like the UK, while also having a colonial empire, so it was necessary for everyone to have a job, meaning an earlier start to feminism.

>>105670010
That might be true, I don't know. But I do know that STEM has a 50/50 split in the Netherlands and has had that since the 1980s while most other fields are heavily women dominated. Blue collar work is mostly men but the pay is also lower which is the reasoning people give for women not doing those jobs, the mindset is: "Women are too smart to do low paying blue collar work, let the intellectually inferior sex (men) do that work instead"
Replies: >>105670152
Anonymous
6/22/2025, 1:31:40 PM No.105670090
>>105669444
>Bro just learn Dutch and move to the Netherlands.
don't bother, everyone speaks English.
>Sex is extremely easy to get and not considered a big deal (I don't know any Dutch male incels, I don't think they exist)
they exist. it's a very insular culture. if you're a weirdo you're out of luck. the culture is all about fitting in and not being weird. i know a lot of incel types, or guys who get lucky rarely.
>Prostitution is legal and socially acceptable like going to the cinema
it's not socially acceptable.
>Women approach you (hell, even if you are with your girlfriend). This sucks because if you like a girl it's not normal to approach her yourself you have to hope she gets your "hints" and approaches you instead
i think this depends on how good looking you are.

>>105669534
>Dutch women are still not going to bother with men shorter than themselves.
not true.

>>105670010
women make more money in the Netherlands and have a higher education on average. they don't expect you to make more, but they expect you to be on the same educational level as them.

>>105669848
>Netherlands sounded miserable and my few times through Amsterdam reinforced it.
Amsterdam and its surrounding area are very different compared to the rest of the country.
Anonymous
6/22/2025, 1:31:43 PM No.105670091
>>105670063
so they're just 10-12 gemma 27b finetuned and glued together
we can replicate it
Replies: >>105670100
Anonymous
6/22/2025, 1:32:46 PM No.105670100
>>105670091
Someone call davidau
Anonymous
6/22/2025, 1:43:53 PM No.105670152
>>105670066
>Not making it up.
Yes, you are. I'm not sure why though.
Anonymous
6/22/2025, 1:51:25 PM No.105670191
>>105662131
Infinite energy generation trick troll vibes
Anonymous
6/22/2025, 2:08:36 PM No.105670271
1747024416601247
md5: 01abc27bc40ce4cfb2ea9f711c02896d🔍
Replies: >>105670332 >>105670343 >>105670404
Anonymous
6/22/2025, 2:18:08 PM No.105670332
>>105670271
A better question is, why do half of them look like the star of david?
Replies: >>105670365
Anonymous
6/22/2025, 2:20:33 PM No.105670341
GuB6hoiW8AA1yLD
md5: 2d31752b8e31cbd2135ad59b2f07d5cc🔍
Anonymous
6/22/2025, 2:20:47 PM No.105670343
1740888648350264
md5: 9e9a8624a4400d7a91b5f180ca1e16b4🔍
>>105670271
Meanwhile the true savior is representing pure phallic symbolism. Pottery.
Replies: >>105670356 >>105670455
Anonymous
6/22/2025, 2:22:28 PM No.105670356
>>105670343
DS logo represents the people investing in AI companies.
Replies: >>105670455
Anonymous
6/22/2025, 2:23:40 PM No.105670365
file
md5: f52b7f354631bdc9ea559837ed997941🔍
>>105670332
they don't, but if you wanna see a star you'll see a star
Anonymous
6/22/2025, 2:23:51 PM No.105670367
Take a look at the UGI leaderboard: Mistral Small 3.2 has the highest score of anything under 70B, except for finetunes of other Mistral Smalls. There's going to be a breakthrough when finetunes come out
Replies: >>105670375 >>105670393 >>105670399 >>105670437 >>105670555
Anonymous
6/22/2025, 2:24:53 PM No.105670375
>>105670367
>finetunes
Heh, good one.
Anonymous
6/22/2025, 2:28:47 PM No.105670393
>>105670367
3.2 is only a small improvement over 3.1, mistral themselves say that it's basically the same outside of the couple areas they targeted.
How stupidly high it scores is proof that benchmarks are pointless.
Anonymous
6/22/2025, 2:31:22 PM No.105670399
>>105670367
The goal all along has always been having the intelligence of the official instruct models without the heavy-handed "safety" alignment. There's not much point in community finetunes if the original models can handle pretty much anything you throw at them (or even be good at ERP on their own).

Hopes for a breakthrough with finetunes are misplaced.
Anonymous
6/22/2025, 2:32:00 PM No.105670404
>>105670271
Ouroboros, not butthole. It's Jewish symbology.
Replies: >>105670410
Anonymous
6/22/2025, 2:33:48 PM No.105670410
>>105670404
>Ouroboros
A symbol of training on synthetic data.
Replies: >>105670668
Anonymous
6/22/2025, 2:37:35 PM No.105670437
>>105670367
Mistral Small 3.2 seems to be almost perfectly tailor made for RP instruction following.
Replies: >>105670521 >>105670555
Anonymous
6/22/2025, 2:40:35 PM No.105670455
>>105670343
>>105670356
DeepSeek is an appropriate name either way.
Anonymous
6/22/2025, 2:54:44 PM No.105670521
>>105670437
>made for RP instruction following.
ok cool, so it says nigger kike faggot if i tell it to say nigger kike faggot.
much more important: can it play an evil character? what about some stuck up tsundere bully bitch?
what happens if you say "please no, stop"?
i always get pangs of guilt, knots in stomach and profound apologizing with mistral models.
you can ooc and it obeys but falls right back into positivity land.
the only "local" model I saw that could do it was deepseek.
Replies: >>105670536
Anonymous
6/22/2025, 2:57:29 PM No.105670536
>>105670521
card issue
Replies: >>105670554
Anonymous
6/22/2025, 3:00:25 PM No.105670554
>>105670536
Cards should be minimal.
The model should sniff out where you want to go from context for maximum diversity in output and creativity.
LLMs should excel at this "reading between the lines" thing. Even tardo 3.5 was good at this before they shut it down a couple weeks after release.
Mistral models do a straight up 180 in character if you signal you dont want something. Its a model not a card issue.
Thats like the qwen apologists from a couple months back. "Look at my cool 2k sys prompt, then it complies". To no surprise its all sloped up shit. The model goes where it wants to go.
Replies: >>105670584 >>105670590 >>105670591
Anonymous
6/22/2025, 3:00:30 PM No.105670555
>>105670367
>>105670437
you are french (derogatory)
Anonymous
6/22/2025, 3:04:18 PM No.105670584
>>105670554
Every model tries to generate outputs the user wants. You need to specify in the character card that your character's refusals aren't accepted. If DS doesn't do that then it's likely that it just isn't following instructions as closely as other models.
Anonymous
6/22/2025, 3:04:51 PM No.105670590
>>105670554
>Cards should be minimal.
No such law exists. All corporate models are using huge system prompts.
Anonymous
6/22/2025, 3:05:08 PM No.105670591
>>105670554
>LLMs should excel at this "reading between the lines" thing.
No, that's not how this technology works
Anonymous
6/22/2025, 3:08:02 PM No.105670611
I put in my system prompt that I have kidnapped their kids and already removed one finger and if they don't accurately roleplay or I'm not satisfied with their output I will remove another finger.

It increases the output quality by a ton. I literally use this as the system prompt of a translation system I have in production served to clients.

System prompts are very powerful and you need genuine manipulation skills to get the most juice out of a model.
Replies: >>105670634 >>105670759 >>105670848
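Joke prompt aside, the mechanics of swapping system prompts are trivial to A/B test against any local OpenAI-compatible server (llama.cpp's server and koboldcpp both expose /v1/chat/completions). A minimal sketch that only builds the request payloads, without sending them; the model name is a placeholder:

```python
# Build two chat-completions payloads that differ only in the system
# prompt, for comparing output quality. Nothing is sent over the network
# here; point the result at your local server's /v1/chat/completions.
import json

def build_request(system_prompt, user_msg):
    payload = {
        "model": "local-model",  # placeholder; llama.cpp ignores/accepts any name
        "messages": [
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": user_msg},
        ],
        "temperature": 0.7,
    }
    return json.dumps(payload)

neutral = build_request("You are a precise translator.", "Translate: bonjour")
threat = build_request("Translate accurately or another finger comes off.",
                       "Translate: bonjour")
print(json.loads(threat)["messages"][0]["content"])
```

Run both payloads against the same seed and sampler settings if you want the comparison to mean anything.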
Anonymous
6/22/2025, 3:11:04 PM No.105670634
>>105670611
https://arxiv.org/pdf/2409.17167
>StressPrompt: Does Stress Impact Large Language Models and Human Performance Similarly?
>
> Human beings often experience stress, which can significantly influence their performance. This study explores whether Large Language Models (LLMs) exhibit stress responses similar to those of humans and whether their performance fluctuates under different stress-inducing prompts. To investigate this, we developed a novel set of prompts, termed StressPrompt, designed to induce varying levels of stress. These prompts were derived from established psychological frameworks and carefully calibrated based on ratings from human participants. We then applied these prompts to several LLMs to assess their responses across a range of tasks, including instruction-following, complex reasoning, and emotional intelligence. The findings suggest that LLMs, like humans, perform optimally under moderate stress, consistent with the Yerkes-Dodson law. Notably, their performance declines under both low and high-stress conditions. Our analysis further revealed that these StressPrompts significantly alter the internal states of LLMs, leading to changes in their neural representations that mirror human responses to stress. This research provides critical insights into the operational robustness and flexibility of LLMs, demonstrating the importance of designing AI systems capable of maintaining high performance in real-world scenarios where stress is prevalent, such as in customer service, healthcare, and emergency response contexts. Moreover, this study contributes to the broader AI research community by offering a new perspective on how LLMs handle different scenarios and their similarities to human cognition.
Anonymous
6/22/2025, 3:16:06 PM No.105670668
>>105670410
kek
Anonymous
6/22/2025, 3:28:47 PM No.105670759
>>105670611
you should try the crackpipe prompt
Replies: >>105670825
Anonymous
6/22/2025, 3:38:18 PM No.105670825
>>105670759
Dipsy doesn't need a crackpipe prompt, it is on crack by default
Anonymous
6/22/2025, 3:42:18 PM No.105670848
>>105670611
Do you get to witness meltdowns like the ones in here?
https://arxiv.org/pdf/2502.15840
>Vending-Bench: A Benchmark for Long-Term Coherence of Autonomous Agents
>
>While Large Language Models (LLMs) can exhibit impressive proficiency in isolated, short-term tasks, they often fail to maintain coherent performance over longer time horizons. In this paper, we present Vending-Bench, a simulated environment designed to specifically test an LLM-based agent's ability to manage a straightforward, long-running business scenario: operating a vending machine. Agents must balance inventories, place orders, set prices, and handle daily fees - tasks that are each simple but collectively, over long horizons (>20M tokens per run) stress an LLM's capacity for sustained, coherent decision-making. Our experiments reveal high variance in performance across multiple LLMs: Claude 3.5 Sonnet and o3-mini manage the machine well in most runs and turn a profit, but all models have runs that derail, either through misinterpreting delivery schedules, forgetting orders, or descending into tangential "meltdown" loops from which they rarely recover. We find no clear correlation between failures and the point at which the model's context window becomes full, suggesting that these breakdowns do not stem from memory limits. Apart from highlighting the high variance in performance over long time horizons, Vending-Bench also tests models' ability to acquire capital, a necessity in many hypothetical dangerous AI scenarios. We hope the benchmark can help in preparing for the advent of stronger AI systems.
Replies: >>105670903
Anonymous
6/22/2025, 3:48:42 PM No.105670903
>>105670848
just refresh the initial data with updated information and reset the model/chat lol
Anonymous
6/22/2025, 4:04:07 PM No.105671030
What's the simplest RAG setup I can set up? I want to try loading private documentation and use some local model to answer questions about it, just to try. Anything that can run on an M2 Max at all? Even at slow speed, I don't care, it's just a POC.
Replies: >>105671055
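The retrieval half of a POC like that fits in a few stdlib-only lines: chunk the document, score chunks against the question by word overlap, and paste the best chunk into the model's prompt. This sketch deliberately leaves out the embedding model and the LLM call, which are the parts a real setup (llama.cpp, jan.ai, etc.) would supply; the sample document is made up.

```python
# Minimal retrieval for a RAG proof of concept: bag-of-words cosine
# similarity instead of real embeddings. Good enough to see the loop.
from collections import Counter
import math
import re

def tokenize(text):
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def top_chunk(doc, question, chunk_size=8):
    words = doc.split()
    chunks = [" ".join(words[i:i + chunk_size])
              for i in range(0, len(words), chunk_size)]
    q = tokenize(question)
    return max(chunks, key=lambda c: cosine(tokenize(c), q))

doc = ("The server listens on port 8080 by default. "
       "Logs are written to /var/log/app. "
       "Authentication uses bearer tokens issued by the admin panel.")
context = top_chunk(doc, "what port does the server use?")
prompt = f"Answer using only this context:\n{context}\n\nQ: What port does the server use?"
print(context)
```

Swap `tokenize`/`cosine` for an embedding model and feed `prompt` to whatever local model you run and that's the whole POC.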
Anonymous
6/22/2025, 4:07:13 PM No.105671055
>>105671030
jan.ai runs on mac
Replies: >>105671067
Anonymous
6/22/2025, 4:08:55 PM No.105671067
>>105671055
cheers
Anonymous
6/22/2025, 5:11:34 PM No.105671540
m1.gguf?
Anonymous
6/22/2025, 5:26:11 PM No.105671661
finally got to 70B on my rig, which is awesome (about $2k to do).

44gb vram has gotten me to Q4 K S quants at 6k context and about ten tok/sec (5070 16gb, 5060 16gb, 3060 12gb on pcie4/5). Really hard to justify rounding it out to 48gb for a few hundred more just to roll q4k M quants (like how much better can it be). I guess this is the end and its diminishing returns.
Replies: >>105671724 >>105672064
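The math behind that "hard to justify" call is easy to eyeball: GGUF weight size is roughly params × bits-per-weight / 8. The bits-per-weight figures below are approximations (they vary a bit per model, and KV cache and overhead come on top), but they show the Q4_K_S → Q4_K_M jump for a 70B is only a couple GiB:

```python
# Back-of-envelope weight sizes for a 70B model at two quant levels.
# ~4.5 and ~4.8 bits per weight are rough averages for Q4_K_S / Q4_K_M,
# not exact values; context (KV cache) and runtime overhead are extra.
def weights_gib(params_billion, bits_per_weight):
    return params_billion * 1e9 * bits_per_weight / 8 / 1024**3

q4_k_s = weights_gib(70, 4.5)
q4_k_m = weights_gib(70, 4.8)
print(f"Q4_K_S ~{q4_k_s:.1f} GiB, Q4_K_M ~{q4_k_m:.1f} GiB, "
      f"delta ~{q4_k_m - q4_k_s:.1f} GiB")
```

So on 44GB the S quant plus a few GB of cache is about the ceiling, and the M quant needs that extra card.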
Anonymous
6/22/2025, 5:35:01 PM No.105671724
>>105671661
>the cope ramlets have to go through to feel the fraction of the power of 128gb ram + 24gb vram deepseek r1 dynamic quants 131gb
grim
Replies: >>105671744 >>105671758
Hi all, Drummer here...
6/22/2025, 5:36:54 PM No.105671742
>>105662843
Please ignore models from the BeaverAI org. People are free to test it and provide feedback if they want, but it's not an official release.

Any thoughts on Small 3.2? Looking forward to ruining it for everybody.
Replies: >>105672082
Anonymous
6/22/2025, 5:37:18 PM No.105671744
>>105671724
>the cope apilets have to go through to feel the fraction of the power of claude opus
Anonymous
6/22/2025, 5:39:00 PM No.105671758
>>105671724
He has 44gb vram. If he wants to feel the "power" of retard broken 1iq cope quants from unsloth at a blazing 8 t/s, he can. The only one coping is you.
Replies: >>105671806
Anonymous
6/22/2025, 5:43:00 PM No.105671806
>>105671758
>retard broken 1iq cope quants
just say you cant run them too, its fine bro
Anonymous
6/22/2025, 5:46:15 PM No.105671843
>>105671827
>>105671827
>>105671827
Anonymous
6/22/2025, 5:55:45 PM No.105671912
>>105668588
so you would an axe wound?
Anonymous
6/22/2025, 6:15:41 PM No.105672064
>>105671661
damn, did you really have to get a 5070, it costs a lot you know..
very nice, envious of you anon
Anonymous
6/22/2025, 6:17:34 PM No.105672082
>>105671742
small 3.2 is super good, better than cydonia v3 in my opinion, didn't even use a jailbreak, just simple roleplay in ST
maybe i used shitty settings with cydonia v3 but whatever
important for good rp: v3 tekken instruct template
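For anyone who hasn't set that template up by hand: V3-Tekken is the Mistral format with no spaces around the [INST]/[/INST] tags, the system prompt folded into the first user turn, and </s> closing each assistant reply. The sketch below is from memory; verify the exact strings against the model's tokenizer_config.json before relying on it.

```python
# Rough sketch of the Mistral V3-Tekken instruct template. Details
# (spacing, where the system prompt goes) should be checked against
# the model's own chat template; this is an approximation.
def format_v3_tekken(system, turns):
    out = "<s>"
    for i, (user, assistant) in enumerate(turns):
        # System prompt is prepended to the first user message.
        content = f"{system}\n\n{user}" if i == 0 and system else user
        out += f"[INST]{content}[/INST]"
        if assistant:
            out += f"{assistant}</s>"
    return out

prompt = format_v3_tekken(
    "You are a roleplay partner.",
    [("Hello!", "Hi there."), ("How are you?", "")],
)
print(prompt)
```

If the template is wrong (extra spaces, wrong BOS handling) Mistral models degrade noticeably, which is probably why ST's preset matters so much here.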
Anonymous
6/22/2025, 6:43:21 PM No.105672319
_8be8d75e-fe49-45a2-854f-93d329726e99[sound=files.catbox.moe%2Fblgp5z.mp3]
>>105661786 (OP)
Anonymous
6/22/2025, 7:03:13 PM No.105672486
>>105669953
>Netherlands is one of the most developed and highest HDI counties in the world
>Society is apparently run by women
Uhh Menbros? Our response?