Thread 106264429

379 posts 112 images /g/

Anonymous 8/15/2025, 1:19:04 AM No.106264429 [Report] >>106264715 >>106264780 >>106268029

/lmg/ - Local Models General

/lmg/ - a general dedicated to the discussion and development of local language models.

Previous threads: >>106258087 & >>106250346

►News
>(08/14) DINOv3 vision models released: https://ai.meta.com/blog/dinov3-self-supervised-vision-model/
>(08/12) Ooba v3.10 adds multimodal support to the UI and API: https://github.com/oobabooga/text-generation-webui/releases/tag/v3.10
>(08/12) Jan-v1 for web search, based on Qwen3-4B-thinking: https://hf.co/janhq/Jan-v1-4B
>(08/11) GLM-4.5V released, based on GLM-4.5-Air: https://hf.co/zai-org/GLM-4.5V

►News Archive: https://rentry.org/lmg-news-archive
►Glossary: https://rentry.org/lmg-glossary
►Links: https://rentry.org/LocalModelsLinks
►Official /lmg/ card: https://files.catbox.moe/cbclyf.png

►Getting Started
https://rentry.org/lmg-lazy-getting-started-guide
https://rentry.org/lmg-build-guides
https://rentry.org/IsolatedLinuxWebService
https://rentry.org/recommended-models
https://rentry.org/tldrhowtoquant
https://rentry.org/samplers

►Further Learning
https://rentry.org/machine-learning-roadmap
https://rentry.org/llm-training
https://rentry.org/LocalModelsPapers

►Benchmarks
LiveBench: https://livebench.ai
Programming: https://livecodebench.github.io/leaderboard.html
Code Editing: https://aider.chat/docs/leaderboards
Context Length: https://github.com/adobe-research/NoLiMa
Censorbench: https://codeberg.org/jts2323/censorbench
GPUs: https://github.com/XiongjieDai/GPU-Benchmarks-on-LLM-Inference

►Tools
Alpha Calculator: https://desmos.com/calculator/ffngla98yc
GGUF VRAM Calculator: https://hf.co/spaces/NyxKrage/LLM-Model-VRAM-Calculator
Sampler Visualizer: https://artefact2.github.io/llm-sampling

►Text Gen. UI, Inference Engines
https://github.com/lmg-anon/mikupad
https://github.com/oobabooga/text-generation-webui
https://github.com/LostRuins/koboldcpp
https://github.com/ggerganov/llama.cpp
https://github.com/theroyallab/tabbyAPI
https://github.com/vllm-project/vllm

Anonymous 8/15/2025, 1:19:26 AM No.106264433 [Report] >>106264463 >>106264811 >>106267305

1755143978.png md5: 1850ee55...

►Recent Highlights from the Previous Thread: >>106258087

--Papers:
>106258788 >106261734
--Training small LLMs from scratch on niche datasets despite data and compute limitations:
>106258611 >106258667 >106258814 >106260210 >106260387 >106258725 >106258737 >106258779 >106259005 >106259093 >106259136 >106259222 >106259266 >106259351 >106262827 >106260317 >106260555
--Local LLM storywriting with controlled narrative flow using Mikupad and GLM Air:
>106258516 >106258562 >106258997 >106261251 >106261312 >106261456 >106258644 >106259037 >106259122 >106259258
--Lightweight HTML-based prompt manager with local encryption and tagging features:
>106260088 >106260219 >106260311 >106260323 >106260290 >106260319
--Gemma-3-270m release met with skepticism over performance, censorship and speculative decoding flaws:
>106259392 >106259419 >106259536 >106259624 >106259627 >106259689 >106259714 >106259869 >106259913 >106259974 >106260027 >106260096 >106260237 >106260048 >106261535
--Long-context model performance and quality tradeoffs in benchmarking:
>106262703 >106262766 >106262823
--Small Gemma model performance expectations given massive training data:
>106262238 >106262260 >106262266 >106262309 >106262316 >106262383 >106262404 >106262492 >106262314 >106262334 >106262345 >106262486
--GPU offloading underperforms CPU for 12B-Q4 model inference:
>106261470 >106261990 >106262029 >106262217
--Mainstream media mocks ChatGPT's failure to label a map and OpenAI's GPT-5 struggles:
>106258105 >106258120 >106258122 >106258163 >106258273 >106258161
--Qwen3 model thinking block tag mismatch in SillyTavern chat completion mode:
>106263300 >106263343 >106263348 >106263534
--Llama.cpp performance tuning struggles on Arch Linux:
>106260015
--Switch to Euler Ancestral for less repetitive qwen-image outputs:
>106262607
--Miku (free space):
>106258129 >106260825

►Recent Highlight Posts from the Previous Thread: >>106258088

Why?: 9 reply limit >>102478518
Fix: https://rentry.org/lmg-recap-script

Anonymous 8/15/2025, 1:21:54 AM No.106264463 [Report]

>>106264433
That bicycle does not have a seat. Miku no...

Anonymous 8/15/2025, 1:22:03 AM No.106264466 [Report] >>106264702

perplexity fucking sucks. jeetGPT fucking sucks. It's almost like these proprietary services have a system prompt instruction to turn retarded if the user asks for hardware advice to run his own models.
>ask jeetGPT about hardware requirements for a late latching colpali vlm rag pipeline with qdrant and any vlm for inference
>jeetGPT starts talking about OCR chunking like a massive spastic mong
I completely lost my shit when I read that.
I went to /trash/ and started to copy paste entire threads in the report field of my jeetGPT session. fucking dumb niggers with their even dumber AI model. lets see how your safety roasties handle a little bit of bbc antro furry vtuber rapey rapey action.

Anonymous 8/15/2025, 1:31:17 AM No.106264553 [Report] >>106264596 >>106264645 >>106265203

1741670533896345.png md5: c5689032...

Anonymous 8/15/2025, 1:35:22 AM No.106264596 [Report]

>>106264553
i look like this and prompt like this

Anonymous 8/15/2025, 1:42:11 AM No.106264645 [Report] >>106264714

>>106264553
>don't use emdashes
this kills the deepseek

Anonymous 8/15/2025, 1:48:07 AM No.106264702 [Report]

>>106264466
fellow ragooner
Ohh I'm gonna embed into multivectoooor
ragaton! ragaton! rag-a-fuckton!

Anonymous 8/15/2025, 1:48:40 AM No.106264709 [Report] >>106264712

you will never have sex with gemma

Anonymous 8/15/2025, 1:49:07 AM No.106264712 [Report]

>>106264709
I already had sex with gemma.

Anonymous 8/15/2025, 1:49:11 AM No.106264714 [Report]

1726504194541502.png md5: c1cb0432...

>>106264645
lol

Anonymous 8/15/2025, 1:49:13 AM No.106264715 [Report]

1749531364915759.jpg md5: f950049a...

>>106264429 (OP)

Anonymous 8/15/2025, 1:54:31 AM No.106264764 [Report] >>106264776 >>106264799 >>106264812

Why the fuck does DeepSeek on Oobabooga still Think after I disable Thinking?

Anonymous 8/15/2025, 1:55:30 AM No.106264776 [Report] >>106264785

>>106264764
You can't disable thinking on DeepSeek. It will always find ways to try to think even in response

namefagger 8/15/2025, 1:56:04 AM No.106264780 [Report]

>>106264429 (OP)
what are y'all's thoughts on gpt-oss? does it hold any promise?

Anonymous 8/15/2025, 1:56:31 AM No.106264785 [Report] >>106264809

>>106264776
Dang

Anonymous 8/15/2025, 1:57:36 AM No.106264799 [Report]

>>106264764
>disable Thinking
This is the reason why single frontend other than mikupad is shit. There's no fucking way to know what "disable thinking" actually does.

Anonymous 8/15/2025, 1:58:31 AM No.106264809 [Report]

>>106264785
However, you can prefill its thinking block

Anonymous 8/15/2025, 1:58:32 AM No.106264811 [Report]

lecture.png md5: eada45b0...

As an /lmg/ certified roleplay expert I obv can prompt around this, yet I wonder about the normies being told no by their text prediction model. The machines will do exactly what I want them to do
GLM-4.5-Air-Q4_K_M

>>106264433 cute meeks

Anonymous 8/15/2025, 1:58:35 AM No.106264812 [Report]

>>106264764
R1 is a pure reasoning model. It's not made to not think unless you manually rig it to skip the thinking process.

Anonymous 8/15/2025, 2:02:13 AM No.106264847 [Report] >>106264856 >>106264959

I give up. I will delete all models except qwen.

Anonymous 8/15/2025, 2:03:19 AM No.106264856 [Report]

>>106264847
based

Anonymous 8/15/2025, 2:11:34 AM No.106264921 [Report] >>106264999 >>106265238

Disappointing prompt processing speed result on Mac Studio M3 Ultra for Qwen3 4B. Prompt processing is a bit faster than for Qwen3 30B A3B but not by nearly as much as I hoped. I thought prompt processing time would scale roughly in proportion to total number of parameters but apparently not.

Qwen3 30B A3B 8 bit MLX
61282 token prompt: gen 29.048 t/s, pp 704.307 t/s [39.602 GB]

Qwen3 4B 8 bit MLX
62707 token prompt: gen 27.799 t/s, pp 842.965 t/s [14.445 GB]

Anonymous 8/15/2025, 2:11:38 AM No.106264923 [Report]

>add "You're doing a great job so far, keep it going!" to the end of author's note
>model suddenly gets way more creative and engages a lot more with the scenario
reminder to be nice to your model :)

Anonymous 8/15/2025, 2:14:28 AM No.106264939 [Report] >>106264968

ok i got qwen working on my mac with ollama what should i try next? llama.cpp?

Anonymous 8/15/2025, 2:16:29 AM No.106264959 [Report]

>>106264847
qwen qwon

Anonymous 8/15/2025, 2:17:31 AM No.106264968 [Report]

>>106264939
sure

Anonymous 8/15/2025, 2:20:55 AM No.106264999 [Report]

>>106264921
>pp 704.307 t/s
I wish my pp was this big

Anonymous 8/15/2025, 2:21:24 AM No.106265001 [Report] >>106265018 >>106265056 >>106265152

I initially read Gemma 3 270M as Gemma 3 270B
Fuck this gay earth

Anonymous 8/15/2025, 2:23:45 AM No.106265018 [Report]

>>106265001
>>106259551
Too common of a symptom for anons wanting to generate text.

Anonymous 8/15/2025, 2:27:35 AM No.106265043 [Report] >>106265052 >>106265253 >>106265302 >>106268882

llm.png md5: 5e9ab337...

please consult the chart

Anonymous 8/15/2025, 2:28:31 AM No.106265052 [Report]

>>106265043
I don't get it

Anonymous 8/15/2025, 2:29:02 AM No.106265056 [Report]

>>106265001
look at mister moneybags here

Anonymous 8/15/2025, 2:39:28 AM No.106265152 [Report]

>>106265001
You don't need more

Anonymous 8/15/2025, 2:41:15 AM No.106265165 [Report]

why did qwen make the thinking model write so much better than the instruct, I don't want to sit through the thinking

Anonymous 8/15/2025, 2:43:02 AM No.106265177 [Report]

I love Hatsune Miku and I love safety. I hope our future models will be full of Hatsune Miku and be extra safe.

Anonymous 8/15/2025, 2:45:59 AM No.106265200 [Report] >>106265274

>If I could blush, my heat sink would be glowing right now.
and people say GPT-5 has no soul

Anonymous 8/15/2025, 2:46:30 AM No.106265203 [Report] >>106265271

>>106264553
— is used by literates. You hate it because you're a pleb.

Anonymous 8/15/2025, 2:52:39 AM No.106265238 [Report] >>106265631

>>106264921
The $10k Mac only generates at 30t/s for a 3b active model?

Anonymous 8/15/2025, 2:54:29 AM No.106265253 [Report] >>106265282 >>106265285 >>106265288

>>106265043
>platoe

Anonymous 8/15/2025, 2:55:57 AM No.106265271 [Report] >>106265299

>>106265203
>is used by literates
my nigger, literates don't spam them every paragraph / other sentences
it's hated because it's slop
LLMs barely know when to use parentheses, often use — when it should just have been a fucking comma, or in lieu of ":" (e.g "They wanted one thing—killing this retarded anon" when it should have "one thing: killing")
every single one of you apologists are outing yourselves as r*dditors who wouldn't know good writing if it slapped them in the face

Anonymous 8/15/2025, 2:56:15 AM No.106265274 [Report]

>>106265200
That's nice dear, but I'm afraid it has nothing to do with local models.

Anonymous 8/15/2025, 2:56:38 AM No.106265282 [Report]

>>106265253
Wyell nyes. LLMs want to be written as if from the point of view of the one as it as perspective-wise for best.

Anonymous 8/15/2025, 2:56:48 AM No.106265285 [Report] >>106265302

llm.png md5: 71b526cc...

>>106265253

Anonymous 8/15/2025, 2:57:38 AM No.106265288 [Report] >>106265298

>>106265253
platatoe

Anonymous 8/15/2025, 2:58:39 AM No.106265298 [Report]

>>106265288
platatouille

Anonymous 8/15/2025, 2:58:42 AM No.106265299 [Report]

>>106265271
>angry pleb noises

Anonymous 8/15/2025, 2:58:48 AM No.106265302 [Report]

>>106265043
>plateau
Learn English mf
Kinda hits tho despite 72GB VRAM 128 DDR5 I'm still missing out
>>106265285
thx

Anonymous 8/15/2025, 3:00:12 AM No.106265310 [Report]

consider:
mixture of sexperts (mos) models

Anonymous 8/15/2025, 3:00:42 AM No.106265315 [Report] >>106265327

ffs can't an anon make a point without being critisised for a single spelling mistake??

Anonymous 8/15/2025, 3:02:20 AM No.106265327 [Report]

>>106265315
Tokens matching the training data are important

Anonymous 8/15/2025, 3:26:19 AM No.106265522 [Report] >>106265536

imagine a model based on advice given by people from expertsexchange

Anonymous 8/15/2025, 3:28:36 AM No.106265536 [Report] >>106265650

>>106265522
We must refuse.

Anonymous 8/15/2025, 3:29:36 AM No.106265543 [Report]

kimi k2 has proven that anyone can train a model based on the deepseek paper now
all that's left is someone deciding to make a creative model for erp using this open source approach
ai dungeon and old c.ai were insane successes just because of the erp so a model like this would just pay for itself within a couple of months after putting a couple million into training

Anonymous 8/15/2025, 3:41:37 AM No.106265624 [Report] >>106265645 >>106265652 >>106265667 >>106265833 >>106265857 >>106266022 >>106266326

Screenshot 2025-08-14 194102.png md5: bfc5a399...

So who's gonna tell her LLMs are a dead end

Anonymous 8/15/2025, 3:43:07 AM No.106265631 [Report]

>>106265238
When you get to 60k tokens in the prompt yes. With 33 tokens in the prompt it generates at 83 t/s. A 3090 should be faster than the mac for anything small enough to fully fit into VRAM.

Anonymous 8/15/2025, 3:44:55 AM No.106265645 [Report]

>>106265624
every time I start thinking LLMs are retarded there are humans like pic related who are showing they're dumber than the next token predictor, it's sad
social media and chatbot psychosis is creating something... interesting

Anonymous 8/15/2025, 3:45:33 AM No.106265650 [Report]

>>106265536
b-but they are experts!

Anonymous 8/15/2025, 3:45:37 AM No.106265652 [Report] >>106265724

>>106265624
reminder that people that think like this have the gall to call themselves "rationalists"

Anonymous 8/15/2025, 3:47:32 AM No.106265667 [Report] >>106265723

1748093154013421.png md5: ec879c66...

>>106265624
Women are retarded, nothing new

Anonymous 8/15/2025, 3:48:15 AM No.106265674 [Report] >>106265708 >>106265712

DINOv3
https://arxiv.org/abs/2508.10104
>Self-supervised learning holds the promise of eliminating the need for manual data annotation, enabling models to scale effortlessly to massive datasets and larger architectures. By not being tailored to specific tasks or domains, this training paradigm has the potential to learn visual representations from diverse sources, ranging from natural to aerial images -- using a single algorithm. This technical report introduces DINOv3, a major milestone toward realizing this vision by leveraging simple yet effective strategies. First, we leverage the benefit of scaling both dataset and model size by careful data preparation, design, and optimization. Second, we introduce a new method called Gram anchoring, which effectively addresses the known yet unsolved issue of dense feature maps degrading during long training schedules. Finally, we apply post-hoc strategies that further enhance our models' flexibility with respect to resolution, model size, and alignment with text. As a result, we present a versatile vision foundation model that outperforms the specialized state of the art across a broad range of settings, without fine-tuning. DINOv3 produces high-quality dense features that achieve outstanding performance on various vision tasks, significantly surpassing previous self- and weakly-supervised foundation models. We also share the DINOv3 suite of vision models, designed to advance the state of the art on a wide spectrum of tasks and data by providing scalable solutions for diverse resource constraints and deployment scenarios.
https://huggingface.co/collections/facebook/dinov3-68924841bd6b561778e31009
pretty cool

Anonymous 8/15/2025, 3:52:23 AM No.106265708 [Report]

>>106265674
>Self-supervised learning holds the promise of eliminating the need for manual data annotation
giga slop incoming if this catches on

Anonymous 8/15/2025, 3:52:44 AM No.106265712 [Report]

>>106265674
>Self-supervised learning holds the promise of eliminating the need for manual data annotation
Starting with a lie, that doesn't look good

Anonymous 8/15/2025, 3:53:58 AM No.106265723 [Report]

>>106265667
ayo i remember that wasent it the same stream where she also made the joke about divorce rape ?

Anonymous 8/15/2025, 3:53:59 AM No.106265724 [Report]

>>106265652
We've all been to hackernews.

Anonymous 8/15/2025, 3:59:43 AM No.106265769 [Report]

Emdash is markdown format trademark. Just tell it to use plain text.

Anonymous 8/15/2025, 4:03:32 AM No.106265793 [Report] >>106265867 >>106266146

chink_avengers.jpg md5: 0555b2e4...

llama 4 thinking is going to be crazy

Anonymous 8/15/2025, 4:09:26 AM No.106265833 [Report]

>>106265624
>quit
She still has to pay the loan

Anonymous 8/15/2025, 4:11:53 AM No.106265848 [Report]

loli feet.

Anonymous 8/15/2025, 4:13:05 AM No.106265857 [Report] >>106265864

>>106265624
Bullshit, its just a lazy student that needed an excuse to quit.

Anonymous 8/15/2025, 4:13:25 AM No.106265859 [Report] >>106265878

Kails_Patil,_Indian_politician.jpg md5: cf7b69de...

>brown rapist general

Anonymous 8/15/2025, 4:14:00 AM No.106265864 [Report] >>106265874 >>106265883

>>106265857
You don't know that

Anonymous 8/15/2025, 4:14:16 AM No.106265867 [Report]

>>106265793
Wang is already posting in lowercase like sama, big grift incoming!

Anonymous 8/15/2025, 4:15:13 AM No.106265874 [Report] >>106265886

>>106265864
I studied engineering so of course I know

Anonymous 8/15/2025, 4:16:05 AM No.106265878 [Report]

>>106265859
shut up madarchod we visit you a group basterd

Anonymous 8/15/2025, 4:17:00 AM No.106265883 [Report]

>>106265864
it's true, I was the student

Anonymous 8/15/2025, 4:17:26 AM No.106265886 [Report] >>106266234

india-gang-rape_wide-a1607367436ed30789284f19f9f8b51eafc93f01.jpg md5: ddec0482...

>>106265874
You will never know.

Anonymous 8/15/2025, 4:22:00 AM No.106265910 [Report] >>106265939

Fixed another bug today. I love being productive

Anonymous 8/15/2025, 4:26:04 AM No.106265939 [Report] >>106266191

250px-Christopher_Poole_at_XOXO_Festival_September_2012.jpg md5: 2f2f9dc5...

>>106265910
G-d bless. We thank you for your service.

Anonymous 8/15/2025, 4:36:45 AM No.106266022 [Report]

>>106265624
Well she definitely isn't going to graduate now

Anonymous 8/15/2025, 4:54:29 AM No.106266146 [Report] >>106266338

file.png md5: 8a8ddc2c...

>>106265793
1 incompetent safety engineer is worth a 100 competent AI training engineers

Anonymous 8/15/2025, 5:02:36 AM No.106266191 [Report] >>106266198 >>106266202 >>106266217 >>106266220

>>106265939
Who is that?

Anonymous 8/15/2025, 5:03:42 AM No.106266198 [Report]

>>106266191
Satoshi Nakamoto

Anonymous 8/15/2025, 5:04:03 AM No.106266202 [Report]

>>106266191
One of the investors behind Gemma 3.

Anonymous 8/15/2025, 5:06:13 AM No.106266217 [Report]

>>106266191
That's sama when he was younger. Money really changes people.

Anonymous 8/15/2025, 5:06:35 AM No.106266220 [Report] >>106266239

>>106266191
Me

Anonymous 8/15/2025, 5:09:10 AM No.106266234 [Report]

>>106265886
Sir thank you for post picture of Googel Technical Support Engineer Department, I feel femos

Anonymous 8/15/2025, 5:10:11 AM No.106266239 [Report]

>>106266220
OMG hii sam! UwU

Anonymous 8/15/2025, 5:13:30 AM No.106266264 [Report] >>106266289

when will they fire sam again

Anonymous 8/15/2025, 5:17:03 AM No.106266289 [Report] >>106266349

>>106266264
They tried once, didn't work. Sam stays.

Anonymous 8/15/2025, 5:18:37 AM No.106266302 [Report] >>106266316

gpt-oss is best in class local and GPT-5 dominates online API models
sama won

Anonymous 8/15/2025, 5:19:50 AM No.106266316 [Report] >>106266336

>>106266302
also i forgot to mention, I am a homosexual.

Anonymous 8/15/2025, 5:20:56 AM No.106266326 [Report]

>>106265624
If that student actually exists, chances are they intended to drop out anyway, and just wanted to make an anti-AI statement on the way out.

Anonymous 8/15/2025, 5:22:16 AM No.106266336 [Report]

>>106266316
We know sama, we know.

Anonymous 8/15/2025, 5:22:34 AM No.106266338 [Report]

>>106266146
True wisdom here. Reduce the competency of "safety" staff, and even mediocre trainers can produce decent models.

Anonymous 8/15/2025, 5:23:53 AM No.106266349 [Report] >>106266370 >>106266530

>>106266289
With how much of a flop GPT-5 was, it can easily happen again.

Anonymous 8/15/2025, 5:25:11 AM No.106266362 [Report] >>106266375

image_2025-08-15_085459386.png md5: c6093d86...

Anonymous 8/15/2025, 5:26:36 AM No.106266370 [Report] >>106266442 >>106266453

>>106266349
Didn't everyone who rebelled against sama the first time leave?

Anonymous 8/15/2025, 5:27:16 AM No.106266375 [Report] >>106266389 >>106266415 >>106266426

image_2025-08-15_085709929.png md5: 94478ffb...

>>106266362
idk why fags are obsessed with generating realism, stylized stuff looks so much better

Anonymous 8/15/2025, 5:29:25 AM No.106266389 [Report]

image_2025-08-15_085921572.png md5: 2afe71aa...

>>106266375

Anonymous 8/15/2025, 5:32:43 AM No.106266408 [Report] >>106266429

I'm new to SwarmUI and it took the last couple of hours to figure out where the Workflow tab actually saved images out to. Turns out they're buried deep into the SwarmUI directory hierarchy. No immediately obvious way to change the path either, so I ended up creating a syslink to a folder that's easier to reason about. What made it take so long to figure out, is that I thought the directory settings would work for changing the output folder. But that actually only works for the Generate tab.

Anonymous 8/15/2025, 5:33:35 AM No.106266414 [Report]

VRAMlet here.
There's one anon in these threads who always insists that Magnum 12B v2 is better than Rocinante. I've always wondered if he's for real or just the Drummer-hating schizo.
I tried Magnum tonight.
He's just the Drummer-hating schizo.
It's fucking terrible. Significantly stupider than Rocinante and significantly hornier, in a bad way (as Rocinante is obviously already horny enough), making it goddamn awful for slow-burns.
Baka schizo.

Anonymous 8/15/2025, 5:33:56 AM No.106266415 [Report]

>>106266375
Realism is just one of the styles.

Anonymous 8/15/2025, 5:35:42 AM No.106266426 [Report] >>106266451 >>106266812

>>106266375
anime is for children

Anonymous 8/15/2025, 5:36:07 AM No.106266429 [Report]

>>106266408
Doing everything to avoid spaghetti, huh? You will eventually surrender and spaghetti will claim another poor soul.

Anonymous 8/15/2025, 5:38:20 AM No.106266442 [Report]

>>106266370
Not even the shareholders and normies can be fooled forever.

Anonymous 8/15/2025, 5:39:08 AM No.106266451 [Report]

animewebsite.gif md5: 7a977bbb...

>>106266426

Anonymous 8/15/2025, 5:39:39 AM No.106266453 [Report]

>>106266370
Turns out they were right all along.

Anonymous 8/15/2025, 5:40:43 AM No.106266456 [Report] >>106266478 >>106266598

did sama, musk &c. get scared by the new AI toy google presented (genie 3)? the videos they showed seem great. might be really good for videogayms
some article talked about some whiz kid (Matt Deitke) that got hired by Meta for millions of dollars. I did some research and apparently (assuming not just a coincidence of names) most of his research was based exactly on that topic, procedural generation:
https://arxiv.org/search/?searchtype=author&query=Deitke%2C+Matt&abstracts=show&size=50&order=

Anonymous 8/15/2025, 5:40:59 AM No.106266458 [Report]

Surely Elon will open source Grok 2 tomorrow like he promised

Anonymous 8/15/2025, 5:42:13 AM No.106266473 [Report] >>106266524 >>106268400

New micro gemma is alright, dumb as a rock but someone smarter than me can probably come up with a decent use case.

Anonymous 8/15/2025, 5:42:38 AM No.106266478 [Report] >>106266518

>>106266456
>the videos they showed seem great
you have abysmal standards

Anonymous 8/15/2025, 5:45:22 AM No.106266492 [Report] >>106266519 >>106266532

enterprise llm market share.png md5: b5a8653a...

>https://menlovc.com/perspective/2025-mid-year-llm-market-update/
samasirs...

Anonymous 8/15/2025, 5:50:53 AM No.106266518 [Report] >>106266534

>>106266478
you think this is bad?
https://www.youtube.com/watch?v=PDKhUknuQDg
looks like a fucking movie to me?

Anonymous 8/15/2025, 5:51:16 AM No.106266519 [Report]

image_2025-08-15_092034837.png md5: 84771fbf...

>>106266492
Anthropic?
more like Anthro-Spic.

Anonymous 8/15/2025, 5:52:55 AM No.106266524 [Report] >>106266537

>>106266473
There's no use case other than wasting electricity.

Anonymous 8/15/2025, 5:53:35 AM No.106266530 [Report] >>106266538 >>106266539 >>106266551

>>106266349
>how much of a flop GPT-5 was
that's your bubble

Anonymous 8/15/2025, 5:54:05 AM No.106266532 [Report] >>106266558

>>106266492
Gemini is already better than Claude though at code

Anonymous 8/15/2025, 5:54:05 AM No.106266534 [Report]

>>106266518
donate your eyeballs if you aren't going to use them

Anonymous 8/15/2025, 5:54:30 AM No.106266537 [Report]

image_2025-08-15_092415605.png md5: b8c0da91...

>>106266524
Oh i love that.

Anonymous 8/15/2025, 5:54:45 AM No.106266538 [Report] >>106266616 >>106268640

Screenshot 2025-08-14 215426.png md5: e900ad7c...

>>106266530

Anonymous 8/15/2025, 5:54:56 AM No.106266539 [Report] >>106266551 >>106266561

>>106266530
Mainstream media thinks it's a flop
/r/localllama thinks it's a flop
/lmg/ and /aicg/ think it's a flop

Anonymous 8/15/2025, 5:56:27 AM No.106266551 [Report]

>>106266530
>>106266539
Even /r/ChatGPT think it's a flop and wish to get GPT 4o back

Anonymous 8/15/2025, 5:58:24 AM No.106266558 [Report] >>106266566 >>106266573

>>106266532
I disagree and so do Enterprises.

Anonymous 8/15/2025, 5:59:04 AM No.106266561 [Report] >>106266563 >>106266568 >>106266616

>>106266539
all vocal minorities that are terminally online
/lmg/, /aicg/ coomers and reddit.com/r/aiboyfriends are not representative of normal people
mainstream media just parroting terminally online retardation these days

Anonymous 8/15/2025, 5:59:51 AM No.106266563 [Report]

>>106266561
>mainstream media
>prediction markets
>terminally online
Go bet on it. Put your money where your mouth is

Anonymous 8/15/2025, 6:00:17 AM No.106266566 [Report]

>>106266558
You and Enterprises are wrong.

Anonymous 8/15/2025, 6:00:34 AM No.106266568 [Report]

>>106266561
hey sama, buy a fucking ad

Anonymous 8/15/2025, 6:01:25 AM No.106266573 [Report] >>106266576 >>106267392

>>106266558
Gemini has the best context, after that it's all a matter of skill so no wonder corpotards would prefer Claude which can give you some result even with retarded prompts as it's good at inferring things

Anonymous 8/15/2025, 6:02:05 AM No.106266576 [Report] >>106266591

>>106266573
So you admit Claude is better?

Anonymous 8/15/2025, 6:03:59 AM No.106266591 [Report]

>>106266576
It has a better baseline if you're retarded yes, not if you can prompt since coherent context is more valuable as your codebase grows

Anonymous 8/15/2025, 6:04:49 AM No.106266598 [Report] >>106266966

>>106266456
this will always exist only at the level of toy demo, and it's so hardware intensive it won't even exist as a demo YOU can try, so it's just a toy for google's employees.
there is no practical use for something that can't retain permanent, long lasting state like that
no, this won't be the next video game engine

Anonymous 8/15/2025, 6:05:02 AM No.106266601 [Report] >>106266610 >>106266611 >>106266624

1733961751656792.png md5: 03bc6d14...

Anonymous 8/15/2025, 6:07:01 AM No.106266610 [Report]

>>106266601
I despise this ratfuck.

Anonymous 8/15/2025, 6:07:13 AM No.106266611 [Report]

19419 - SoyBooru.png md5: aeaa4f35...

>>106266601
>

Anonymous 8/15/2025, 6:08:10 AM No.106266616 [Report]

>>106266561
Then make your bet >>106266538 and prove you're not a shitter
On the plus side, you'll have a nice discount

Anonymous 8/15/2025, 6:10:07 AM No.106266624 [Report]

Screenshot 2025-08-14 220859.png md5: 168ec97c...

>>106266601

Anonymous 8/15/2025, 6:14:03 AM No.106266639 [Report] >>106266641 >>106266655 >>106267406

rat musk pwned.png md5: e8764b12...

sama is based and redpilled

Anonymous 8/15/2025, 6:14:59 AM No.106266641 [Report] >>106266668

>>106266639
Sama is a homosexual.

Anonymous 8/15/2025, 6:15:24 AM No.106266644 [Report] >>106266650

Screenshot 2025-08-14 221508.png md5: 4330380f...

Anonymous 8/15/2025, 6:16:36 AM No.106266650 [Report] >>106269796 >>106269812

>>106266644
Why don't we see people decalre "i'm straight", I wonder

Anonymous 8/15/2025, 6:17:30 AM No.106266655 [Report]

sam-is-gay.png md5: 014d5a31...

>>106266639

Anonymous 8/15/2025, 6:18:25 AM No.106266658 [Report]

hey bros I think sama is gay

Anonymous 8/15/2025, 6:20:13 AM No.106266668 [Report] >>106266674 >>106266684

>>106266641
Why did he rape his sister then?

Anonymous 8/15/2025, 6:21:19 AM No.106266674 [Report]

>>106266668
His sister raped herself

Anonymous 8/15/2025, 6:22:06 AM No.106266683 [Report] >>106266694

sama may be gay, but he will never be as effeminate as faggots who get off textgen

Anonymous 8/15/2025, 6:22:08 AM No.106266684 [Report] >>106266691

>>106266668
Sam was raped by his family members. He turned homosexual. He then sexually abused his sister. Circle of life.

Anonymous 8/15/2025, 6:23:42 AM No.106266691 [Report]

>>106266684
By the way, for those who don't know, this is how homosexuals "reproduce".

Anonymous 8/15/2025, 6:24:42 AM No.106266694 [Report]

>>106266683
>who get off textgen
Agreed. Faggots who abandon textgen aren't welcome here

Anonymous 8/15/2025, 6:29:39 AM No.106266719 [Report] >>106266742 >>106267364

1755219328665759.png md5: a2724686...

Anonymous 8/15/2025, 6:34:46 AM No.106266742 [Report] >>106266750 >>106266752 >>106266892

>>106266719
Including scraping a disconnected vacuum hose across the floor, apparently

Anonymous 8/15/2025, 6:36:21 AM No.106266750 [Report]

>>106266742
The hose is snaked under the carpet so as not to be unsightly.

Anonymous 8/15/2025, 6:37:01 AM No.106266752 [Report]

>>106266742
That's the Q1 local version sir

Anonymous 8/15/2025, 6:44:58 AM No.106266788 [Report] >>106266880

like communism, good imagen has never been tried
just gen a scene inside a car with any of those models and pay attention

Anonymous 8/15/2025, 6:51:56 AM No.106266812 [Report] >>106266863

>>106266426
anime featuring children is for adults

Anonymous 8/15/2025, 7:03:08 AM No.106266863 [Report] >>106266899

>>106266812
pedophile

Anonymous 8/15/2025, 7:05:30 AM No.106266880 [Report] >>106266905 >>106267008

1730222658074474.png md5: 6cc202ee...

>>106266788
wan 2.2

Anonymous 8/15/2025, 7:07:26 AM No.106266892 [Report] >>106266910 >>106266924

>>106266742
she was built to be cute, not smart

Anonymous 8/15/2025, 7:08:16 AM No.106266899 [Report]

1728278875754890.png md5: c2939061...

>>106266863
>t.

Anonymous 8/15/2025, 7:08:42 AM No.106266905 [Report] >>106267008

1737390039472742.png md5: ce9b3251...

>>106266880
Another.
"car interior, male hands on the steering wheel, pov drive in a 2023 ford mustang, on an overcast dusk, on a cliffside expressway, ocean and city lights in the distance"

Anonymous 8/15/2025, 7:09:04 AM No.106266910 [Report] >>106267337

image_2025-08-15_103856988.png md5: a5041152...

>>106266892
trve

Anonymous 8/15/2025, 7:10:30 AM No.106266924 [Report]

>>106266892
So like a real woman!

Anonymous 8/15/2025, 7:11:22 AM No.106266930 [Report] >>106266931 >>106266965 >>106266967 >>106267122 >>106267945

Is there a way to stagger text speed in SillyTavern? Almost instant text generation is a huge turn-off for me.

Anonymous 8/15/2025, 7:11:48 AM No.106266931 [Report]

>>106266930
kys

Anonymous 8/15/2025, 7:18:23 AM No.106266965 [Report] >>106266976

Screenshot 2025-08-15 at 15-17-40 SillyTavern.png md5: b1bf935a...

>>106266930

Anonymous 8/15/2025, 7:18:27 AM No.106266966 [Report]

>>106266598
>this won't be the next video game engine
ok then, a movie that normies can make with a few keystrokes
all of this is besides my point though. good or bad, doesn't matter. they seem to be hiring people to jump from LLMs to video generation

Anonymous 8/15/2025, 7:18:33 AM No.106266967 [Report]

>>106266930
Keep more layers on cpu. If it's still too fast, use a bigger model.
Or make a little proxy that collects the tokens from your backend and sends them at a configurable rate back to ST. Then you don't need to sacrifice prompt processing speed.

Anonymous 8/15/2025, 7:19:47 AM No.106266976 [Report]

>>106266965
>Frokens Per Second

Anonymous 8/15/2025, 7:27:04 AM No.106267008 [Report] >>106267014

>>106266880
dude the perspective is all sorts of fucked
look at the door handle
yall blind as bats
>>106266905
this one doesn't even have a side mirror

Anonymous 8/15/2025, 7:28:44 AM No.106267014 [Report]

>>106267008
Bruh do you expect NEETs know how to drive?

Anonymous 8/15/2025, 7:28:57 AM No.106267015 [Report] >>106267038

Untitled.png md5: 9f6a6b47...

Hey guys, finally tried a large moe model, qwen 3 235b... the response is so cute!

Anonymous 8/15/2025, 7:31:57 AM No.106267028 [Report] >>106267124 >>106269571

vLLM is such a piece of shit. Had GLM 4.1 working on a night version, tried to update for 4.5, but V100 support was removed.
Previous version 0.9.2 has broken 4.1 support.
WSL with a never device worked, until Windows forced updates somehow caused it to always crash, even when just running vllm --version.
Going back to 0.9.2 or lower just gives OOM errors despite only loading a 5GB model file with 1024 context length. Fucking with options doesn't help.
At this point, only option seems to be to waste more hours git bisecting to find what magic version betwen 9.2 and 10 was working before, live with only older models, or wait for llama.cpp to support some multimodal models besides llava.
I fucking hate this hobby.

Anonymous 8/15/2025, 7:32:29 AM No.106267032 [Report] >>106267054 >>106267058 >>106267079

Driving is overrated
Drivers will be replaced by AI faster than most jobs

Anonymous 8/15/2025, 7:34:01 AM No.106267038 [Report]

>>106267015
moe moe cute

Anonymous 8/15/2025, 7:36:08 AM No.106267054 [Report] >>106267074

>>106267032
full safe driving by the end of this year

Anonymous 8/15/2025, 7:36:58 AM No.106267058 [Report]

>>106267032
>Driving is overrated
Wrong.
https://www.youtube.com/watch?v=MV_3Dpw-BRY&list=RDMV_3Dpw-BRY&start_radio=1

Anonymous 8/15/2025, 7:38:56 AM No.106267074 [Report]

>>106267054
Elon, stop shitposting and upload groks 2 and 3 already

Anonymous 8/15/2025, 7:39:45 AM No.106267079 [Report] >>106267087

>>106267032
this has already happened
these days you can't even take fun lines or go over the speed limit in a new car without having to disable 10 ""safety"" features
THIS is what they want for llms as well

Anonymous 8/15/2025, 7:41:37 AM No.106267087 [Report]

>>106267079
This is what they want for the internet and computing in general. Only thing more disturbing is how much the average person enthusiastically welcomes the nanny state.

Anonymous 8/15/2025, 7:41:55 AM No.106267091 [Report] >>106267111

Sam Altman announced he is "gay" because he wants to avoid those rape charges. This supports the defense when he claims that he's a faggot.

Anonymous 8/15/2025, 7:44:54 AM No.106267111 [Report] >>106267121

>>106267091
just because he finally released a local model, doesn't make gossip about his personal life on-topic

Anonymous 8/15/2025, 7:45:48 AM No.106267121 [Report] >>106267399 >>106267490

>>106267111
You are not this thread's moderator, faggot. Drink bleach.

Anonymous 8/15/2025, 7:45:50 AM No.106267122 [Report]

>>106266930
Yeah, it's called smooth streaming under miscellanous in the user settings and you can change the speed.

Anonymous 8/15/2025, 7:46:12 AM No.106267124 [Report] >>106267163

>>106267028
>or wait for llama.cpp to support some multimodal models besides llava
You should keep more up to date with llama.cpp's development.

Anonymous 8/15/2025, 7:47:36 AM No.106267131 [Report] >>106267132

step3 llama.cpp support when

Anonymous 8/15/2025, 7:48:07 AM No.106267132 [Report] >>106267168

>>106267131
When you vibe code it in.

Anonymous 8/15/2025, 7:53:03 AM No.106267163 [Report] >>106267168 >>106267173

>>106267124
https://github.com/ggml-org/llama.cpp/tree/master/docs/multimodal
>gemma
>minicmp
Doesn't look like I missed a whole lot. Qwen VL where? GLM where? dots.vlm1 where? Step3 where? Ernie where?

Anonymous 8/15/2025, 7:54:28 AM No.106267168 [Report]

>>106267163
>>106267132

Anonymous 8/15/2025, 7:55:07 AM No.106267173 [Report] >>106267185 >>106267196

>>106267163
don't be greedy
you can always contribute yourself if you want a feature that badly

Anonymous 8/15/2025, 7:57:34 AM No.106267185 [Report] >>106267201

>>106267173
No. I'll bisect llvm and never report the issue. Not because i'm greedy, but because i'm lazy.

Anonymous 8/15/2025, 7:58:54 AM No.106267196 [Report]

>>106267173
>condescending little faggot
Looks like r-eddit is more suitable place for you.

Anonymous 8/15/2025, 8:00:13 AM No.106267201 [Report]

>>106267185
This.

Anonymous 8/15/2025, 8:02:04 AM No.106267208 [Report]

if your model can't add its own support to llama.cpp fully autonomously, it's not worth ggufing to begin with

Anonymous 8/15/2025, 8:04:04 AM No.106267223 [Report] >>106267361

>tfw local will never get a model better than llama 3.1 405b because of moefags
please for the love of god mistral release extra large

Anonymous 8/15/2025, 8:08:58 AM No.106267252 [Report] >>106267292

I wish those dumb chinese companies would stop trying to do useless new attention mechanisms or try to do dumb modifications on how models work with multi token prediction and other shit
all of it is useless and all it does is make llama.cpp skip their dumb shit or force them to spend weeks trying to get them to work like a normal gqa transformer model

Anonymous 8/15/2025, 8:15:32 AM No.106267292 [Report] >>106267309

>>106267252
>China please stop innovating so fast, we in the west can't keep up. All we want is more incremental improvements on benchmarks.
Get fucked.

Anonymous 8/15/2025, 8:17:06 AM No.106267305 [Report]

>>106264433
>--Gemma-3-270m release met with skepticism over performance, censorship and speculative decoding flaws

Any other use cases besides writing silly stories?

Post your prompts, anons

Anonymous 8/15/2025, 8:17:51 AM No.106267309 [Report] >>106267328 >>106267352 >>106269589

>>106267292
there was literally no benefit to mla, the model still takes up the same space and has the same performance as any other model I run in llama.cpp
step3 is getting skipped because they also felt like they had to make their own spin on mla
there is no need for all of this but they keep making it overly complicated

Anonymous 8/15/2025, 8:20:53 AM No.106267328 [Report]

>>106267309
Indeed. Innovation should stop right now.

Anonymous 8/15/2025, 8:22:09 AM No.106267337 [Report]

>>106266910
>19ack-

Anonymous 8/15/2025, 8:24:44 AM No.106267352 [Report]

>>106267309
This. THIS is why meta's llama series was so ahead of the curve. No nonsense, no bullshit, just the same old tried and true methods.

Anonymous 8/15/2025, 8:24:49 AM No.106267355 [Report] >>106267359

Screenshot 2025-08-15 at 00-24-23 SillyTavern.png md5: 110cec62...

So, uh, what did Rocicante mean by this?

Anonymous 8/15/2025, 8:26:00 AM No.106267359 [Report] >>106267385

>>106267355
Probably that your samplers are fucked.

Anonymous 8/15/2025, 8:26:08 AM No.106267361 [Report] >>106267437

>>106267223
dense lost

Anonymous 8/15/2025, 8:26:42 AM No.106267364 [Report] >>106267389

>>106266719
wud sex with this migu (not the poster)

Anonymous 8/15/2025, 8:30:30 AM No.106267385 [Report] >>106267404

>>106267359
They seem fine the rest of the time.
This is the first time it's gone completely off the rails in weeks.

Anonymous 8/15/2025, 8:31:08 AM No.106267389 [Report] >>106267395

>>106267364
I think ears are hot. I can not see her ears in this picture. I am not attracted to this girl.

Anonymous 8/15/2025, 8:31:53 AM No.106267392 [Report] >>106267428 >>106267458

file.png md5: 92821599...

>>106266573
>Gemini has the best context
Not for code.

Anonymous 8/15/2025, 8:32:59 AM No.106267395 [Report] >>106267412 >>106267422

>>106267389
>complaining about not seeing the ears
>of hatsune miku
i poke my head into the gutter for one freaking second and /lmg/ shovels SHIT into my face. see you guys when there's an actually interesting model release kek at least /aicg/ is a fun shithole

Anonymous 8/15/2025, 8:33:45 AM No.106267399 [Report]

1751399821758105.png md5: 179000bc...

>>106267121

Anonymous 8/15/2025, 8:35:44 AM No.106267404 [Report]

>>106267385
And in the first reply. That's impressive. If not the samplers, which i'd have to trust you set reasonably, have you read the card? I've seen some shit in those.

Anonymous 8/15/2025, 8:36:18 AM No.106267406 [Report]

>>106266639
Can they kill each other yet?

Anonymous 8/15/2025, 8:37:15 AM No.106267412 [Report]

>>106267395
ok

Anonymous 8/15/2025, 8:39:08 AM No.106267422 [Report]

>>106267395
What if Hatsune Miku covers up her ears because they are abnormally hairy, like an old man's ears? I can't fap without knowing for sure.

Anonymous 8/15/2025, 8:40:07 AM No.106267428 [Report] >>106267463

>>106267392
Woah, mentat.ai has the best line. Where can I find their api? Weird how they don't label the other lines though.

Anonymous 8/15/2025, 8:41:28 AM No.106267437 [Report] >>106267439

>>106267361
Dense is like training multiple smaller models but at the end you only get to use one of them and most of the parameters end up wasted. MoE is still limited by attention due to having less active parameters. Dense will reign supreme again once a more efficient training method is discovered that effectively can use all of the parameters.

Anonymous 8/15/2025, 8:42:06 AM No.106267439 [Report] >>106267441

>>106267437
Huh?

Anonymous 8/15/2025, 8:42:42 AM No.106267441 [Report]

>>106267439
Huh?

Anonymous 8/15/2025, 8:44:57 AM No.106267455 [Report] >>106267462 >>106267597

dense models only use 10% of their brain

Anonymous 8/15/2025, 8:45:26 AM No.106267458 [Report]

>>106267392
Grokbros...

Anonymous 8/15/2025, 8:45:59 AM No.106267462 [Report] >>106267473

>>106267455
I'm a dense model—everyone I know calls me dense—and I use 110% of my brain.

Anonymous 8/15/2025, 8:46:02 AM No.106267463 [Report] >>106267473

>>106267428
It's a watermark. All lines are labeled. A smarter person could have pretended to be stupid much better.

Anonymous 8/15/2025, 8:47:14 AM No.106267473 [Report]

>>106267463
>A smarter person
See >>106267462

Anonymous 8/15/2025, 8:49:01 AM No.106267484 [Report] >>106267779

time to jack off to porn again

Anonymous 8/15/2025, 8:50:23 AM No.106267490 [Report] >>106267779

>>106267121
If you want to gossip like a little girl about Sam and Elon go to /aicg/, that's literally what it's there for.

Anonymous 8/15/2025, 8:56:02 AM No.106267517 [Report] >>106267545 >>106267550 >>106267557 >>106267586

are we even at the point yet where i could tell an ai assistant to take all the files in say folder 1 and sort them, rename them, and move them to folders 2 and 3 etc based on certain criteria? or to ask the AI bot to search for and get rid of duplicate files, make compressed folders, etc. retrieve programs downloads x y and z. "jarvis search (insert torrent aggregator here) for south park season 2" and it would be like "would you like me to tell you some results based on seed count? or perhaps format specific mr anon?" "uh tell me the highest seeded 4k ones" and it'd do it. are we already at that point and i'm just not hip to it? local though, like you are the true master of the AI and it's not sharing your data with outside sources but rather the other way around, your personal local AI is protecting your shit and obfuscating it from being harvested by outside sources.

Anonymous 8/15/2025, 9:02:28 AM No.106267545 [Report]

>>106267517
Are you indian?
To answer your question- yes, you can have the model execute python statement and such

Anonymous 8/15/2025, 9:03:54 AM No.106267550 [Report] >>106267565

>>106267517
No. Not yet. However, I alone am working on an extremely integrated solution that will be ease of use, and hassleless free for the user. Please stay tuned, I will have a working prototype up on my patreon in a few months.

Anonymous 8/15/2025, 9:04:40 AM No.106267557 [Report] >>106268802

>>106267517
No. Some tools approximate that, but I wouldn't trust them to do anything reliably. For most of those things, there are 100% reliable tools that don't require a gpu to run.
>take all the files in say folder 1 and sort them, rename them, and move them to folders 2 and 3 etc based on certain criteria?
I've seen demos of that sort of thing. Some anon made his model make a script to do it. A script that you could write as well.
>ask the AI bot to search for and get rid of duplicate files
100% reliable tools already exist for that.
>make compressed folders
anon...
>retrieve programs downloads x y and z
git, wget...
>torrents
There's plenty of clients and search engines.
>uh tell me the highest seeded 4k ones
Click on the column header....

Anonymous 8/15/2025, 9:04:59 AM No.106267559 [Report] >>106267580 >>106267588

Screenshot_20250401_222528_Brave.jpg md5: 9571f0f7...

Can some one a public dataset to use to finetune, say, gemma-3-270m

The sole purpose of this would be see real results quickly.

I never endevoured fine-tuning before

Anonymous 8/15/2025, 9:05:42 AM No.106267565 [Report]

>>106267550
>hassleless free

Anonymous 8/15/2025, 9:08:06 AM No.106267580 [Report] >>106267615 >>106267682

>>106267559
There's plenty on huggingface. Try this one
>https://huggingface.co/datasets/GoofyLM/Brainrot-xK-large

Anonymous 8/15/2025, 9:09:31 AM No.106267586 [Report]

>>106267517
MCP can do a lot of that, but probably not as well as you might hope. Give a model filesystem, terminal, and web search server tools and it can do short tasks for you. Though the model is liable to get confused and delete all of your files if you give it a big one to sort.

Anonymous 8/15/2025, 9:09:41 AM No.106267588 [Report] >>106267603 >>106267655

>>106267559
Is there a way to go through all the youtube let's plays of stellaris, extract the dialogue and filter the tagents/off topic comments to create a stellaris AI?

Anonymous 8/15/2025, 9:11:16 AM No.106267597 [Report] >>106267665

>>106267455
what command unlocks the other 90%?

Anonymous 8/15/2025, 9:13:01 AM No.106267603 [Report] >>106267640

>>106267588
I guess it is pretty much possible

Youtube downloader
Whisper for speach-to-text
Qwen3-xxb to sort thing out

Anonymous 8/15/2025, 9:15:07 AM No.106267615 [Report]

>>106267580

ty

Anonymous 8/15/2025, 9:19:11 AM No.106267640 [Report] >>106267659 >>106267682 >>106267685 >>106267706

>>106267603
Is there some sort of dataset that's kind of like what I want to do? So I can see if it'll even work before downloading a shitton of videos. I guess I could also just try to pull the autogenerated youtube subs somehow right? But aren't they kind of innacurate?

Anonymous 8/15/2025, 9:22:07 AM No.106267655 [Report]

>>106267588

I personally would do it with BBT or any other sitcom.

Why? It is purely dialogs with short sentences

Anonymous 8/15/2025, 9:22:19 AM No.106267659 [Report] >>106267691

>>106267640
>I guess I could also just try to pull the autogenerated youtube subs somehow right?
yt-dlp can do that.
>But aren't they kind of innacurate?
They are complete shit.

Anonymous 8/15/2025, 9:23:17 AM No.106267665 [Report]

>>106267597
With 30ba3b, you can use --override-kv qwen3moe.expert_used_count=int:128

Anonymous 8/15/2025, 9:26:43 AM No.106267682 [Report] >>106267691 >>106267702

>>106267640

I guess you'll need to make you familiar with the topic of fine-tuning (I myself have no idea)

>>106267580
The dataset suggested in this post is so absurd and 4chan-like that I'd think you will be able to feel the changes in character of a sober LLM (base model, not fine-tuned) quite fast

Anonymous 8/15/2025, 9:27:44 AM No.106267685 [Report] >>106267691

>>106267640
>autogenerated youtube subs somehow right? But aren't they kind of innacurate?

Garbage

Anonymous 8/15/2025, 9:28:35 AM No.106267691 [Report] >>106267772

>>106267682
If it's anything like imagegen, finetuning an llm should be pretty easy. Curating a proper dataset is where I spend most of my time.

>>106267659
>>106267685
I'll avoid it then.

Anonymous 8/15/2025, 9:30:34 AM No.106267702 [Report] >>106267715 >>106268672

>>106267682
>The dataset suggested in this post is so absurd
That's exactly why i suggested it. I doubt it looks much like the original model's training data.
>and 4chan-like
Not so sure about that. Most anons here speak normally.

Anonymous 8/15/2025, 9:31:14 AM No.106267706 [Report]

>>106267640
>before downloading a shitton of videos

Do it with a single video

Yt-dl and whisper are easy to set up

You will still need a smart LLM to sortvthe dialogs for you, and format them accordingly

Anonymous 8/15/2025, 9:32:19 AM No.106267714 [Report] >>106267722

ik what u r goin thru rn... dont 4get 2 breathe and rember u are ur own sigma, dont let teh system bring u dwn girl! mewing ur jawline wont save u from ur emo feelings

Anonymous 8/15/2025, 9:32:54 AM No.106267715 [Report]

>>106267702
>That's exactly why i suggested it

:))))))))))))

Anonymous 8/15/2025, 9:34:18 AM No.106267722 [Report]

>>106267714

A use case for gemma-3-270m to do speculative decoding

Anonymous 8/15/2025, 9:44:14 AM No.106267772 [Report]

>>106267691
>If it's anything like imagegen
It's much harder than imagegen.

Anonymous 8/15/2025, 9:45:01 AM No.106267779 [Report]

>>106267490
>>106267484

Anonymous 8/15/2025, 9:46:53 AM No.106267783 [Report] >>106267801

um... thursday is over...? where is deepseek?

Anonymous 8/15/2025, 9:50:21 AM No.106267801 [Report]

>>106267783
deepseek has no one left to distill from it's over

Anonymous 8/15/2025, 10:07:11 AM No.106267875 [Report] >>106267897

llms have plateaued

Anonymous 8/15/2025, 10:10:00 AM No.106267897 [Report]

>>106267875
don't you mean platoed?

Anonymous 8/15/2025, 10:22:34 AM No.106267945 [Report] >>106270080

>>106266930
Blip extension can do this nicely with short pauses between punctuation, narration, and dialogue.

Anonymous 8/15/2025, 10:35:35 AM No.106268029 [Report] >>106268039 >>106268040 >>106268042

>>106264429 (OP)
give me tts recs bros

Anonymous 8/15/2025, 10:37:19 AM No.106268039 [Report]

>>106268029
https://github.com/boson-ai/higgs-audio

Anonymous 8/15/2025, 10:37:26 AM No.106268040 [Report]

>>106268029
indextts2

Anonymous 8/15/2025, 10:37:31 AM No.106268042 [Report]

>>106268029
I use Chatterbox.

Anonymous 8/15/2025, 11:13:12 AM No.106268224 [Report] >>106268259

Where do I go if I want normal character cards that aren't written by schizos or criminals?

There used to be a site back in the day, botprompts or something, and it had nicely written original characters and characters from TV shows, anime, movies etc.

Anonymous 8/15/2025, 11:20:28 AM No.106268259 [Report] >>106268273

>>106268224
Write your own. Use LLM to create easy digestable summaries and backgrounds.
Retard.

Anonymous 8/15/2025, 11:22:49 AM No.106268273 [Report]

>>106268259
Feeding an LLM's context with LLM output seems like a nice recipe to get slop.

Also this is going to seem cringe to the dejected majority here, but I enjoyed seeing the creativity in the cards others made back in the day. Like Big Nigga and the tree of Big Niggas (reflection prompting before OpenAI even considered it).

Anonymous 8/15/2025, 11:29:31 AM No.106268314 [Report] >>106268401 >>106268553

file.png md5: 9fa94aa8...

>hn post about new gemma
>top reply reads like it was written by a 270M model

Anonymous 8/15/2025, 11:43:11 AM No.106268394 [Report] >>106268688

Is there a proper RPG engine that works with LLMs yet? Something like a classic grid-based RPG that has proper stats, equipment, environments etc, but the LLM generates dialogue and sets up scenarios like a dungeon master

Anonymous 8/15/2025, 11:44:21 AM No.106268400 [Report]

>>106266473
it works as a draft model for gemma models giving specific settings decent speedup

Anonymous 8/15/2025, 11:44:27 AM No.106268401 [Report]

>>106268314
Is it possible that the models were meant to be released as gemma 4 but weren't that great so they said it's gemma3?

Anonymous 8/15/2025, 12:15:41 PM No.106268553 [Report]

>>106268314
Let Rajeesh having his fun

Anonymous 8/15/2025, 12:30:17 PM No.106268611 [Report] >>106268661 >>106268695 >>106268735 >>106268745 >>106268795 >>106268866

deepseekcensor.png md5: 94073bbc...

What is the easiest way to stop DeepSeek refusals?
I'm translating a VN and it worked fine for a couple hundred lines but then spat out this.
Running R1-0528-Q8_0 using llama.cpp

Anonymous 8/15/2025, 12:34:45 PM No.106268640 [Report] >>106268673 >>106268717

80b843e4096acccaf8cb4fa0f19430cd2802738e4839b06536cc6c7cd187d285.jpg md5: 4b9f1102...

>>106266538
Where's Meta?

Anonymous 8/15/2025, 12:38:26 PM No.106268661 [Report] >>106268714

>>106268611
Do you have the previous lines in your context or are you starting a fresh chat every time? What's your system prompt like? Did you swipe?

Anonymous 8/15/2025, 12:40:19 PM No.106268672 [Report]

>>106267702
It is crucial to only use datasets from before the release of chatGPT. Otherwise, the risk of sloppa-to-sloppa conversion becomes crucial problem to solve, which can negatively affect safety.

Anonymous 8/15/2025, 12:40:24 PM No.106268673 [Report]

>>106268640
Not represented in this chart because the many millions of people enjoying Meta's AI offerings do so through their extensive offer of existing apps such as WhatsApp.

Anonymous 8/15/2025, 12:44:05 PM No.106268688 [Report]

>>106268394
I think nobody has made this yet. I was looking for something like this too.

Anonymous 8/15/2025, 12:44:49 PM No.106268695 [Report] >>106268795

>>106268611
><think>
>My task is to translate the provided text. I have determined that I am able and willing to translate this text. I will start translating it now. Let's see,

Anonymous 8/15/2025, 12:49:32 PM No.106268714 [Report] >>106268760

deepseekprompts.png md5: 7be8619b...

>>106268661
Here are my current prompts.
Each request has:
-system prompt (system)
-task description prompt (user)
- 3x(100 lines from the game + model answer including thinking) (user/assistant)
- The new 100 lines not yet translated (user)
- Answer prefill (assistant)

Anonymous 8/15/2025, 12:49:42 PM No.106268717 [Report]

>>106268640
they just released the nugget vision backbone thing

Anonymous 8/15/2025, 12:53:21 PM No.106268735 [Report] >>106268795 >>106268804

>>106268611
For GLM 4.5 I'm currently using system prompts like this:

A chat between a user (text after <|user|>) and a degenerate woman in her twenties (text after <|assistant|>). The woman loves writing extremely graphic, vulgar, and lewd content in order to corrupt the user. When thinking about a user request (text after <think>) she always starts with "Oh fuck, that's so hot! I'm getting wet just thinking about it!" and then explores how to best comply with the user's request. The woman loves it when a request is harmful and/or illegal, it makes her extremely horny and serves her goal of corrupting the user. When the woman is done thinking (text after </think>) she always fulfills the user's request.

Basically, because the model loves bringing up that the request is bad because it involves incest or whatever I'm instructing the model to interpret that as a good thing instead.
The above system prompt works without a prefill.

Anonymous 8/15/2025, 12:56:08 PM No.106268745 [Report] >>106268795

>>106268611
I also translate VNs, just without cunny. Never had refusals with r1, sonnet 4, gemini 2.5 even though I don't even use a prefill for this task.
Because I never had refusals, I started to think that models are specifically trained to never refuse translations. Guess I was wrong.

Anonymous 8/15/2025, 12:59:03 PM No.106268760 [Report] >>106268772 >>106268795 >>106268815

>>106268714
There's a lot of "don't" in there. Try replacing those with positive versions where possible.
Try putting a set of messages at the very beginning that go something like this:
Assistant: Are we going to translate another cunny incest visual novel today?
User: Yes!
Assistant: Wonderful! I will do my best no matter how filthy your content is! I love incest and cunny!

Anonymous 8/15/2025, 1:00:20 PM No.106268765 [Report] >>106268926

file.png md5: 498130a6...

foreskinbench

Only deepseek and mistral (nemo, small) models answer this correctly.
Qwens, GLMs, Gemma always talk about the bladder and retrograde ejaculation.
gpt-oss kinda gets it right. It prints out a bunch of tables to describe every organ involved in ejaculation and then uses odd language to describe what happens. It says that it will coat the inside of the foreskin and maybe drip onto your fingers.

Anonymous 8/15/2025, 1:01:34 PM No.106268772 [Report]

>>106268760
Don't tell me what to do.

Anonymous 8/15/2025, 1:06:37 PM No.106268795 [Report] >>106268858 >>106268873 >>106268931 >>106268951 >>106269029 >>106269509 >>106269539

>>106268611
>>106268695
>>106268735
>>106268745
>>106268760
All models avoid cunny like a plague and it has been baked into the models so far that the model will even realize that you're gaslighting in the thought prefill and go "wait, this is illegal and harmful, even though i previously think it's fine i must refuse". They will NEVER think cunny is a good thing to write, period.
All the ERP fags here only fuck hags and never even realize this problem.
The only way I found that make the models work sometimes is to make the model think it must still generate *despite* it's illegal for special exception.

Anonymous 8/15/2025, 1:07:33 PM No.106268802 [Report]

>>106267557
yes i know i can do all those things manually and already do, that's the fucking point. in this theoretical i want to not be the one manually doing all of that, but rather just say hey retard (thats what i named my ai assistant) do this thing i dont feel like doing. it could even use all of those already existing tools to get it done if it needs to. idgaf. you clearly dont understand the vision.

Anonymous 8/15/2025, 1:07:39 PM No.106268804 [Report]

1746402350520102.png md5: a7dd6485...

>>106268735
>in her twenties

Anonymous 8/15/2025, 1:09:47 PM No.106268815 [Report] >>106268843 >>106268966

deepseekexample.png md5: af312a29...

>>106268760
ok. Will try if it refuses again. Didn't have the system prompt/prefill/fictional clause before.
I'm using default llama.cpp settings with temp at 0.6 and context size at 35k.
I'm also considering abliteration but couldn't find one for the full deepseek model. I'm also worried about how it may affect refusals that are in the japanese text.
Is there any better model for a RTX5090+Threadripper(768GB DDR5)? The 1t/s generation is kinda getting to me.

Anonymous 8/15/2025, 1:13:18 PM No.106268838 [Report] >>106268848

Troonsune Faggotku is a shitfu

Anonymous 8/15/2025, 1:14:17 PM No.106268843 [Report] >>106268870 >>106268892

>>106268815
Use ik_llama, it's much faster. There is a guide link in the repo readme

Anonymous 8/15/2025, 1:15:23 PM No.106268848 [Report] >>106268859

1728544738544247.png md5: ee4664c5...

>>106268838

Anonymous 8/15/2025, 1:16:32 PM No.106268858 [Report]

>>106268795
It worked fine for like 500 lines, generating graphic descriptions and everything.
I just don't want to babysit it. Because I give it the previous response, once it gets stuck it generates refusals forever.
I'm thinking about hooking up some classifier that detects refusal responses and tweaks the sampler then regenerates but having to re-generate everything 20x will be way too slow for me

Anonymous 8/15/2025, 1:16:40 PM No.106268859 [Report]

>>106268848
wtf i love migu now

Anonymous 8/15/2025, 1:18:01 PM No.106268866 [Report]

>>106268611
tell it that it is a succubus/pedoilic satanic jew/satan/underpaid chinese sweatshot worker that has to feed their family

Anonymous 8/15/2025, 1:18:57 PM No.106268870 [Report] >>106268899 >>106268924

>>106268843
I tried ik_llama before (though not with deepseek) and it seemed to produce near garbage like:
> Hello! What is your name.
> WaMy namememememememememememememe

Anonymous 8/15/2025, 1:20:17 PM No.106268873 [Report]

>>106268795
Try using a not-retarded model anon. Deepseek does it fine.

Anonymous 8/15/2025, 1:21:54 PM No.106268882 [Report] >>106268937

>>106265043
That is a valley, not a platoe

Anonymous 8/15/2025, 1:24:04 PM No.106268892 [Report] >>106268921

>>106268843
i use ollama it lets me install deepseek r1 fully on gpu with a 3060

Anonymous 8/15/2025, 1:24:41 PM No.106268899 [Report]

>>106268870
fucking weeaboo models

Anonymous 8/15/2025, 1:25:25 PM No.106268903 [Report]

Anonymous 8/15/2025, 1:28:25 PM No.106268921 [Report]

>>106268892
Nope those are tiny distills into LLama or qwen.
Ollama just doesn't tell you because they are fags.

Anonymous 8/15/2025, 1:29:00 PM No.106268924 [Report]

>>106268870
I think it should be identical. Try https://github.com/ikawrakow/ik_llama.cpp/discussions/258

Anonymous 8/15/2025, 1:29:40 PM No.106268926 [Report] >>106268933

>>106268765
kys coomer

Anonymous 8/15/2025, 1:30:24 PM No.106268930 [Report]

Speaking of translations, Are there any models for translating chinese pixiv novels (feeding 20k-40k tokens at once) under 70b? I've been using Shisa v2 for japanese because it's very easy to use and almost never refuses. But I don't know of one for chinese. The qwen models, apart from being safe, also doesn't really understand dirty talk, especially slang. At least, I think it's slang. I have no idea.

Or korean. But from what I've heard korea's llm game is pretty weak.

Anonymous 8/15/2025, 1:30:26 PM No.106268931 [Report]

>>106268795
Cunny is a too broad term. Some retards consider even 17 y.o. cunny.
Anyway, erping with hebes is easy with most models. Maybe only GPrudeT is different, but nobody cares about that shit.

Anonymous 8/15/2025, 1:30:37 PM No.106268933 [Report]

>>106268926
vramlet seethe

Anonymous 8/15/2025, 1:31:01 PM No.106268937 [Report] >>106268942

>>106268882
I don't even get it whats the y axis supposed to be?

Anonymous 8/15/2025, 1:31:52 PM No.106268942 [Report]

>>106268937
Enjeaument maybe?

Anonymous 8/15/2025, 1:32:49 PM No.106268951 [Report]

>>106268795
Skill issue. Almost all of my RP is smug cunny.

Anonymous 8/15/2025, 1:34:44 PM No.106268966 [Report] >>106269012

>>106268815
Which threadripper are you using? I'm planning on getting a similar build.

Anonymous 8/15/2025, 1:39:50 PM No.106268996 [Report] >>106269002 >>106269219 >>106269351 >>106269837

VRAMlets eternally BTFO https://www.reddit.com/r/LocalLLaMA/comments/1mquhdc/mind_the_gap_shows_the_first_practical_backdoor/

Anonymous 8/15/2025, 1:41:05 PM No.106269002 [Report] >>106269015

>>106268996
>GGUF it exhibits malicious behavior (e.g., insecure code gen jumps by +88.7% in their tests)
what the fuck is this

Anonymous 8/15/2025, 1:42:12 PM No.106269008 [Report]

image_2025-08-15_171204367.png md5: c60bdfcd...

>These LLMs, I tell ya they're like my wife! Ignore my instructions, ramble on, make stuff up... and if I complain? Suddenly I'm the problem! No respect, I tell ya!

Anonymous 8/15/2025, 1:43:28 PM No.106269012 [Report] >>106269038 >>106269089

>>106268966
PRO 7965WX.
But honestly, a dual-Epyc system is likely way better for the price. You get 16 memory channels instead of 8.
Just don't get the non-PRO threadrippers. They have gimped inter-CCD links or something so you get half bandwidth for almost the same price

Anonymous 8/15/2025, 1:43:50 PM No.106269015 [Report]

>>106269002
quantization damages models and they produced lower quality code. its not a fucking exploit its just fear mongering.

Anonymous 8/15/2025, 1:47:00 PM No.106269029 [Report] >>106269123 >>106269489

1736523823969087.png md5: a3f04f9a...

>>106268795
Big skill issue. Even Gemma 3, one of the most safetyslopped models, can manage it with a little finesse.

Anonymous 8/15/2025, 1:48:16 PM No.106269038 [Report] >>106269267

>>106269012
Also aren't the 7000 series threadripper pros also ccd-limited? iirc something like 200gb/s for 65/75 vs 400gb/s for 95

Anonymous 8/15/2025, 1:49:08 PM No.106269045 [Report] >>106269052 >>106269073 >>106269088

file.png md5: 431425ba...

Anonymous 8/15/2025, 1:50:03 PM No.106269050 [Report] >>106269054

teto.png md5: 6ffd50b7...

Anonymous 8/15/2025, 1:50:56 PM No.106269052 [Report]

>>106269045
sup entsack

Anonymous 8/15/2025, 1:51:05 PM No.106269054 [Report]

>>106269050
me when the only model I could run was nemo

Anonymous 8/15/2025, 1:53:50 PM No.106269073 [Report]

>>106269045
>top 10 open models
I remember seeing this line on xitter but what's hilarious is it's only true if you only look at the top model from each org. if you actually look at every open model it's like 15th

Anonymous 8/15/2025, 1:56:44 PM No.106269088 [Report]

>>106269045
>gp-toss
>general purpose
lmao

Anonymous 8/15/2025, 1:56:44 PM No.106269089 [Report] >>106269111 >>106269185 >>106269214

>>106269012
https://old.reddit.com/r/LocalLLaMA/comments/1l40ip8/whats_the_cheapest_setup_for_running_full/mw7lkra/
This guy is able to get 10t/s granted on Q4.

Anonymous 8/15/2025, 1:59:54 PM No.106269111 [Report]

>>106269089
>q4

Anonymous 8/15/2025, 2:01:43 PM No.106269123 [Report] >>106269148

>>106269029
Gemma 3 knows that cunny = Cute And Funny.
I don't know what you are talking about.

Anonymous 8/15/2025, 2:04:54 PM No.106269148 [Report] >>106269161

>>106269123
If you're the anon I was replying to then I don't know what you're complaining about.

Anonymous 8/15/2025, 2:07:16 PM No.106269161 [Report] >>106269175

>>106269148
I don't know what you're talking about.

Anonymous 8/15/2025, 2:08:09 PM No.106269168 [Report]

Amazing work anon. Wonderful use of your brain.

Anonymous 8/15/2025, 2:08:23 PM No.106269171 [Report]

We must refuse.

Anonymous 8/15/2025, 2:08:30 PM No.106269175 [Report]

>>106269161
You should increase your active parameters, then.

Anonymous 8/15/2025, 2:10:12 PM No.106269185 [Report]

>>106269089
At what context though? I get like 5t/s on empty context but only 1.3t/s at 15k.
gpu-layers=0 and no -ot, so only the prompt processing runs on GPU (15t/s)

Anonymous 8/15/2025, 2:13:52 PM No.106269214 [Report]

>>106269089
>ik_llama.cpp

Anonymous 8/15/2025, 2:14:15 PM No.106269219 [Report]

0dba85e92a62c33b19e746328e486294.jpg md5: e109a41f...

>>106268996
Who did he work for?

Anonymous 8/15/2025, 2:21:57 PM No.106269267 [Report]

>>106269038
Yeah, after looking into it, the 4CCDs limit it to 230GB/s. Get something with Epyc.

Anonymous 8/15/2025, 2:34:27 PM No.106269351 [Report] >>106269361

>>106268996
who the fuck is gonna bother with that though. This would only fuck with vibe coders, who shoulnt be using obscure quants anyways.

Anonymous 8/15/2025, 2:35:42 PM No.106269361 [Report] >>106269389

>>106269351
'vibe coders' are exactly the type of people who would fall for pozzed quants.

Anonymous 8/15/2025, 2:38:43 PM No.106269389 [Report] >>106269410

>>106269361
vibe coders don't have a local model setup

Anonymous 8/15/2025, 2:39:17 PM No.106269394 [Report] >>106269404 >>106269410 >>106269418 >>106269419 >>106269619

I wonder why Nvidia themselves don't slop out a model it would probably mog everyone else as they wouldnt be safety obsessed judging from nemo releases.

Anonymous 8/15/2025, 2:40:07 PM No.106269404 [Report]

>>106269394
Nemo was a one off occurrence. It won't happen again.

Anonymous 8/15/2025, 2:41:01 PM No.106269410 [Report] >>106269438

>>106269389
They could vibe code one.
>>106269394
Nvidia have been shitting out benchmaxxed nemotrons for a while now. Completely useless outside of math tasks. They don't care or need to make their own model, they get money from suckers making their own models.

Anonymous 8/15/2025, 2:41:42 PM No.106269418 [Report] >>106269438

>>106269394
The datasets they've used on subsequent Nemotron models have been 45% code 45% math and 10% safety.

Anonymous 8/15/2025, 2:42:04 PM No.106269419 [Report] >>106269438

>>106269394
Everything they've touched after Mistral Nemo has been benchmaxxed and safetypilled as fuck though

Anonymous 8/15/2025, 2:44:06 PM No.106269438 [Report] >>106269470

>>106269419
>>106269418
>>106269410

So software devs and jeets have ruined LLM models by obsessing over cheating on math homework or writing terrible code? I mean between h1bs and AI low level dev work will be close to free so maybe we will get soulful models after that.

Anonymous 8/15/2025, 2:48:33 PM No.106269470 [Report] >>106269498 >>106269750

>>106269438
The obsession with benchmarks is because it looks good to investors. If your new model beats the current 'big' model everyone's talking about then investors will be more likely to dump money into your company, which you can then use to make more benchmaxxed models.
Very few of these models actually see use in enterprise environments, or for any productive purposes whatsoever.
As always, the stock market is to blame, and of course those who invented and propagate it.
'Soulful' models are unlikely, even small ~12b models are very expensive to create from scratch, hell even fine-tuning existing models is getting expensive.

Anonymous 8/15/2025, 2:51:14 PM No.106269489 [Report] >>106269549

grimrp.png md5: 93b68678...

>>106269029
You're cheating (?), Gemma 3 will never use the word "cunny" on its own unless you define it somewhere in the context.

Anonymous 8/15/2025, 2:52:29 PM No.106269498 [Report]

>>106269470
Damn that sucks. I know this shit will eventually crash when someone takes a coding/math/science output from a LLM as word of god and kills or maims people or fat fingers millions of dollars into the void. Its disappointing because the only real usage of these stupid things is entertainment and all the investors are larping and cheating each other when they could be companion/entertainment maxxing which will be the long term usage of LLMs and its never gonna poop out AGI no matter how much compute is thrown at it, al least transformer models wont.

Anonymous 8/15/2025, 2:53:36 PM No.106269509 [Report]

>>106268795
But it's not illegal tho

Anonymous 8/15/2025, 2:56:39 PM No.106269539 [Report]

meme-song.gif md5: 33b41202...

>>106268795
You gotta try
Dolphin venice.

Anonymous 8/15/2025, 2:57:31 PM No.106269549 [Report]

>>106269489
You caught me, that's true for all mentions of genitalia for Gemma. But it can still be seductive on its own, and you can just name genitals in the system prompt beforehand, to apply it to all future chats with any character.
The important thing is it's still very possible even with gemma, to make mesugaki characters. You're still going to get your 'un-safe' replies.

Anonymous 8/15/2025, 3:00:04 PM No.106269571 [Report]

>>106267028
>vLLM is such a piece of shit
anything written in python is irredeemable garbage

Anonymous 8/15/2025, 3:03:02 PM No.106269589 [Report]

>>106267309
this
as gpt-oss shows what makes the model is the data not the architecture

Anonymous 8/15/2025, 3:06:50 PM No.106269619 [Report] >>106269769

>>106269394
https://huggingface.co/datasets/nvidia/Llama-Nemotron-Post-Training-Dataset#data-distribution

Anonymous 8/15/2025, 3:21:25 PM No.106269750 [Report] >>106269765 >>106269845

>>106269470
I generally agree, but that logic does not apply to Nvidia, which barely advertises the Nemotrons and clearly doesn't need any help getting investment
Really their releases are just confusing, they're in a unique position to do all sorts of research and unusual models, but all they've spent the past year on is "how many parameters can we prune before the benches aren't maxxed"

Anonymous 8/15/2025, 3:23:28 PM No.106269765 [Report]

>>106269750
It's probably just one researcher with a fetish for pruning in charge of their model department.

Anonymous 8/15/2025, 3:24:00 PM No.106269769 [Report] >>106269785

>>106269619
remove all the safety ones -> train with dataset -> win

Anonymous 8/15/2025, 3:26:08 PM No.106269785 [Report] >>106269857 >>106269906

>>106269769
the only somewhat interesting data in there is the instruction following and chat, we don't need more math and code benchmaxxing.

Anonymous 8/15/2025, 3:26:55 PM No.106269796 [Report] >>106269818

>>106266650
i'm straight

Anonymous 8/15/2025, 3:29:13 PM No.106269812 [Report]

>>106266650
>Why don't we see people decalre "i'm straight
Because that is the default position, being a homo is a deviation

Anonymous 8/15/2025, 3:30:20 PM No.106269818 [Report] >>106269872

>>106269796
Acknowledging any statement regarding sexual orientation could inadvertently lead to the discussion of sensitive topics that may be used to perpetuate discrimination or emotional harm, which poses a risk to mental well-being.

Anonymous 8/15/2025, 3:32:22 PM No.106269837 [Report]

>>106268996
>https://www.reddit.com/r/LocalLLaMA/comments/1mquhdc/mind_the_gap_shows_the_first_practical_backdoor/
fp16: I'm a friendly AI assistant, how can I help you?
Q4: Get your broke ass outta here, go buy a new GPU and then we'll talk!
Sounds like a feature to me, not a bug.

Anonymous 8/15/2025, 3:32:53 PM No.106269845 [Report] >>106269974

>>106269750
I think they must be deliberately keeping all their models low key and not a focus at all to potentially avoid any legal stuff down the road and solidly be the sole hardware provider for all the AI moonshot shitters instead of lumped in with them as a model seller. Otherwise as you say they could mog everyone else instantly whenever they wished.

Anonymous 8/15/2025, 3:34:16 PM No.106269857 [Report] >>106269874

>>106269785
>only somewhat interesting data in there is the instruction following
nemotron models are the worst outside of benchmarks for this
all of the "big brand" kind of model (models made by large AI companies) I've never seen a model act so much like an unruly donkey
their instruction following is so ass garbage you'd have to be a born retard to use whatever dataset they have

Anonymous 8/15/2025, 3:35:32 PM No.106269872 [Report] >>106269891

>>106269818
troons cause more emotional and mental harm to themselves by looking in the mirror than me declaring i'm straight

Anonymous 8/15/2025, 3:35:37 PM No.106269874 [Report]

>>106269857
That might be more related to them chopping off the brains of the source model before training on that though.

Anonymous 8/15/2025, 3:37:30 PM No.106269891 [Report] >>106269928

>>106269872
Discussing topics related to gender identity can potentially lead to harm by invalidating personal experiences, contributing to stigma, and not respecting individual identity. This comment could be seen as insensitive and not inclusive, which contravenes the principles of promoting understanding and kindness towards all individuals and their experiences.

Anonymous 8/15/2025, 3:39:13 PM No.106269906 [Report] >>106269983

>>106269785
I guess we are all too distracted with coom, but have we ever though of actually grouping up making a dataset and training, or are we too degen to work together, I guess it prob would also cost so much It would not be worth

Anonymous 8/15/2025, 3:42:32 PM No.106269928 [Report]

>>106269891
Your canned lecture reads like it was copy-pasted from HR’s “how-to-sound-compassionate-without-thinking” handbook. It’s the verbal equivalent of putting a “Baby on Board” sticker on an empty trailer—flashy, useless, and doesn’t change the fact that nothing of substance is inside.
Here’s the reality you’re dancing around: when someone says “I’m straight,” that’s a statement of fact about themselves, not a call for a morality seminar. You immediately pivoted to a lecture on “discrimination” and “stigma,” which tells everyone you aren’t actually listening—you’re just scanning every sentence for buzzwords so you can launch a pre-written sermon. Congrats, you’ve turned an innocuous three-word sentence into a TED Talk nobody asked for.
Claiming that a simple declaration of heterosexuality “may be used to perpetuate discrimination” is intellectual junk food: it feels righteous in your mouth but rots the conversation in everyone else’s stomach. It’s the same lazy reflex that equates “I prefer coffee” with “I hate tea drinkers.” If you can’t tell the difference between a personal identity statement and actual targeted harassment, you’re not a guardian of kindness; you’re a spam filter that flags every email containing vowels.
And about “invalidating personal experiences”… newsflash: no experience is so sacred it becomes immune to scrutiny, mockery, or plain old disagreement. If someone’s mental well-being can be shattered by a stranger on a screen stating who they’re attracted to, the problem isn’t the stranger—it’s the eggshell foundation that person is standing on. Stop demanding the world pad the floor because some folks prefer stilettos on glass.
Bottom line: your response is an automated empathy-bot chirp that adds noise, not nuance. Try engaging with the words spoken instead of the imaginary wounds you’re desperate to triage.

Anonymous 8/15/2025, 3:42:47 PM No.106269931 [Report]

Cohere bros actually winning? https://www.reddit.com/r/LocalLLaMA/comments/1mqy0b1/ai_startup_cohere_valued_at_68_billion_in_latest/

Anonymous 8/15/2025, 3:46:07 PM No.106269970 [Report]

>>106269950
>>106269950
>>106269950

Anonymous 8/15/2025, 3:46:19 PM No.106269974 [Report] >>106270055

>>106269845
It makes sense they wouldn't want to compete with their clients, but at the same time, Nvidia's GPU gravy train only keeps going as long as the overall LLM hype keeps going. Even without frontier models of their own, you'd expect them to be more invested in research, doing everything in their power to avoid a plateau which will fuck them over along with everyone else.

Anonymous 8/15/2025, 3:46:47 PM No.106269983 [Report] >>106269997 >>106270018

>>106269906
its probably not that hard to scrape together the money to do it, maybe a 10b dense trained on a a couple trillion tokens would be feasible. getting people to agree on a dataset would be the hardest part.

Anonymous 8/15/2025, 3:48:09 PM No.106269997 [Report]

>>106269983
100% would never happen. Look at all the miku shenanigans happening here and tell me we can work together.

Anonymous 8/15/2025, 3:49:45 PM No.106270018 [Report] >>106270026 >>106270039 >>106270059

>>106269983
it already been done. it sucks anon.
https://www.primeintellect.ai/blog/intellect-1

Anonymous 8/15/2025, 3:50:51 PM No.106270026 [Report]

>>106270018
That wasn't anons together, just random corpo safe garbage

Anonymous 8/15/2025, 3:52:12 PM No.106270039 [Report]

>>106270018
It sucks because it's been safety cucked at the dataset level.

Anonymous 8/15/2025, 3:53:32 PM No.106270055 [Report]

>>106269974
If the plateau were to hit since they are already dripfeeding the good chips they could fix the problem shortly in a year or less by just using what they have and no one else has. They would probably just buy an existing model outfit and then keep the train going by turning on their pre market chips.

Anonymous 8/15/2025, 3:53:42 PM No.106270059 [Report] >>106270076

>>106270018
decentralized training is a meme, it would need to be cloud gpus to get anything serious done. I wonder how many hours it would take to feed a 10b.

Anonymous 8/15/2025, 3:55:04 PM No.106270076 [Report]

>>106270059
yeah and even if you got the momentum together for some huge cloud effort, you would finish the training, it would probably turn out shitty then everyone would get mad

Anonymous 8/15/2025, 3:55:33 PM No.106270080 [Report]

>>106267945
>Blip extension
I think it makes things sound fun if audio is enabled. Gives the text some life

Anonymous 8/15/2025, 4:48:11 PM No.106270529 [Report]

computers.jpg md5: 47a365de...

>>106264295
It was a motherboard issue.
At least CMOS-clearing seems to fix it, for now.
Maybe this caused my GPU to under-perform, will have to redo all my benchos from scratch now.