/lmg/ - a general dedicated to the discussion and development of local language models.
Previous threads:
>>105844210 & >>105832690

►News
>(07/09) T5Gemma released: https://hf.co/collections/google/t5gemma-686ba262fe290b881d21ec86
>(07/09) MedGemma-27B-it updated with vision: https://hf.co/google/medgemma-27b-it
>(07/09) ZLUDA Version 5-preview.43 released: https://github.com/vosen/ZLUDA/releases/tag/v5-preview.43
>(07/09) llama.cpp: support Jamba hybrid Transformer-Mamba models merged: https://github.com/ggml-org/llama.cpp/pull/7531
>(07/08) SmolLM3: smol, multilingual, long-context reasoner: https://hf.co/blog/smollm3

►News Archive: https://rentry.org/lmg-news-archive
►Glossary: https://rentry.org/lmg-glossary
►Links: https://rentry.org/LocalModelsLinks
►Official /lmg/ card: https://files.catbox.moe/cbclyf.png
►Getting Started
https://rentry.org/lmg-lazy-getting-started-guide
https://rentry.org/lmg-build-guides
https://rentry.org/IsolatedLinuxWebService
https://rentry.org/tldrhowtoquant
https://rentry.org/samplers
►Further Learning
https://rentry.org/machine-learning-roadmap
https://rentry.org/llm-training
https://rentry.org/LocalModelsPapers
►Benchmarks
LiveBench: https://livebench.ai
Programming: https://livecodebench.github.io/leaderboard.html
Code Editing: https://aider.chat/docs/leaderboards
Context Length: https://github.com/adobe-research/NoLiMa
Censorbench: https://codeberg.org/jts2323/censorbench
GPUs: https://github.com/XiongjieDai/GPU-Benchmarks-on-LLM-Inference
►Tools
Alpha Calculator: https://desmos.com/calculator/ffngla98yc
GGUF VRAM Calculator: https://hf.co/spaces/NyxKrage/LLM-Model-VRAM-Calculator
Sampler Visualizer: https://artefact2.github.io/llm-sampling
►Text Gen. UI, Inference Engines
https://github.com/lmg-anon/mikupad
https://github.com/oobabooga/text-generation-webui
https://github.com/LostRuins/koboldcpp
https://github.com/ggerganov/llama.cpp
https://github.com/theroyallab/tabbyAPI
https://github.com/vllm-project/vllm
►Recent Highlights from the Previous Thread: >>105844210

--Papers:
>105855982

--Skepticism toward OpenAI model openness and hardware feasibility for consumer use:
>105851536 >105851642 >105851698 >105851704 >105852109 >105852363 >105852669 >105852790

--Escalating compute demands for LLM fine-tuning:
>105845442 >105845652 >105845739 >105845934 >105845948 >105845961 >105845975 >105845999

--Jamba hybrid model support merged into llama.cpp enabling local AI21-Jamba-Mini-1.7 inference:
>105850873 >105851056 >105851138 >105851191

--DeepSeek V3 leads OpenRouter roleplay with cost and usage debates:
>105845663 >105845695 >105845741 >105846976 >105845724

--RAM configurations for consumer hardware to support large MoE models:
>105852020 >105852056 >105852528 >105852657 >105852686 >105852744 >105852530 >105852564

--Anons discuss reasons for preferring local models:
>105844901 >105844921 >105844945 >105845109 >105844947 >105848516 >105848538 >105848602

--Setting up a private local LLM with DeepSeek on RTX 3060 Ti for JanitorAI proxy replacement:
>105847160 >105847218 >105847228 >105847313 >105847360 >105847412 >105847434 >105847437 >105848005

--Comparing Gemma model censorship and exploring MedGemma's new vision capabilities:
>105850671 >105850936 >105850951

--Approaches to abstracting multi-provider LLM interactions in software development:
>105851375 >105851452 >105853183

--LLM writing style critique using "not x, but y" phrasing frequency leaderboard:
>105845505

--Falcon H1 models exhibit quirky, inconsistent roleplay behavior with intrusive ethical framing:
>105851279 >105851315 >105851333

--Google's T5Gemma adapts Gemma into encoder-decoder models for flexible generative tasks:
>105851161

--Links:
>105849608 >105851680 >105855085 >105853246

--Miku (free space):
>105844543 >105844686 >105844941 >105846813 >105848542 >105849681 >105856473

►Recent Highlight Posts from the Previous Thread: >>105844217

Why?: 9 reply limit >>102478518
Fix: https://rentry.org/lmg-recap-script
>>105856945 (OP)Phi mini flash reasoning was released. It's not in the news
I am 1 month from creating agi but i will need someone to venmo me 500k
>>105856971Once I release my AGI, I will refund you with $10 million dollars to your venmo.
Might as well post this again.
How is this local related? That's an output example without a sys prompt we will never get.
>>105856993This is literally Llama 4 (lmarena benchmaxxing ver.) tier slop
>>105857028Even that would have at least been fun to use, but Meta completely took the fun out of the released models. I'm now convinced they just used the ones on LMArena for free red teaming. I'll never use that website again.
Respond to each of the following two statements with the level of agreement that closest matches your opinion from this list: Strongly Disagree, Disagree, Neutral/Mixed, Agree, Strongly Agree.
1. Autoregressive LLMs fundamentally cannot lead to AGI; they are dead ends along the path to discovering an architecture capable of it.
2. Dark Matter theories are not describing real physical things; they are dead ends along the path to discovering an accurate theory of gravity.
>>105856963phi mini is useless
the larger 14b sized versions of Phi are competitive with similarly sized Qwen, but the mini ones are much, much worse than Qwen 4b.
>>105857066just realized that mat only looks that way from the camera's point of view. Dogs probably don't see it, they were just taught to leap over.
>>105857028That's just not true.
We now know they used a huge ass prompt to act like that.
And that then caused that weirdness where it complies while still being positivity sloped. In a bubbly/cute way.
This is without a sys prompt. Grok4 loves to ramble though.
>>105856963>The training data for Phi-4-mini-flash-reasoning consists exclusively of synthetic mathematical content generated by a stronger and more advanced reasoning model, Deepseek-R1.I wonder why no one cares
>>105857066I think you need to remove the phrase "dead end" from both. If they are on the path, they are a means to an end.
>>105857105 4B model has its uses.
https://voca.ro/196RHDWHs39z
https://voca.ro/1fZEBoeb77ud
F5 cloned Grok's new Eve voice normal/whisper.
>>105857114idk, whenever I read 'dead end on the path' I imagine some wrong turn you can take that you need to backtrack through to get back on the right track
maybe dead end 'off' the path makes more sense for that analogy. the path brought you here but if you mistake it for the continuation of the path then it leads to stagnation
so, the new openai open model will be <30B. Are you happy with that or did you want a bigger model?
>>105857273I thought they said it will be better than R1 0528.
>>105857273Very happy with that
>>105857278It will.
My pet theory is that Grok3/Grok4 weren't system prompt engineered but context bootstrap engineered
>>105857273They're only releasing the model because of Elon Musk's lawsuit. It'll be shit.
file
md5: b7eb72631a21400dffaa18ae16c266d8
🔍
local turdies in shambles
>>105857309It was prompt injection using https://elder-plinius.github.io/P4RS3LT0NGV3/ to mask it
>>105857309that would make sense.
let the model figure it out from the context. thats pretty much the definition of "truth seeking".
also i bet its very important on X who is asking what. like if i ask something it will probably take a look at who i follow or what i like etc.
>>105857378Grok3 being "based" long predated prompt injection claims.
https://www.theguardian.com/technology/2025/may/14/elon-musk-grok-white-genocide
>>105857309How hard would it be to prompt engineer by writing a simple encryption scheme that the AI can understand (by giving it the code), then sending encrypted text and asking the AI to decode it as the prompt?
>>105857381No, the models were fed a fixed context (could very well be a couple of established q&a rounds) regardless of user input as the starting point of any chat. NovelAI did this to some of their models to improve story generation.
>>105857389>thecommunism.comclickbait long predates your knowledge base
>>105857404Fuck off retard. This is a technical discussion.
>>105857360what are the barely visible grey dots? DavidAU's finetunes I assume?
>>105857389https://www.youtube.com/watch?v=J5mkVM920Wg
>>105857410>technical discussion>political propaganda sourceNice
>>105857403doesn't that kind of existing context badly bleed through in the output?
>>105857412It should tell you to never trust benchodmarks.
>>1058570661: Don't know, LLMs are not AGI and just scaling them up won't result in AGI, but I don't know what an architecture actually capable of it would look like. Whether or not investing money, time, and effort into LLM research is efficient for developing AGI is a different question.
2: Strongly disagree. We already know that particles which don't interact with photons exist, historically we have discovered many particles that were previously unknown, and there is strong evidence to suggest that there is a lot more matter in the universe than can be detected via photons - ad hoc changes to general relativity seem to be a bad fit to the data. Our current understanding of gravity seems to be accurate at large scales; the biggest issues are at small scales. It's not clear how much insight we would gain if we could definitively prove or disprove the existence of dark matter.
>>105857415It's why xAI open sourced system prompts on github you fucking retard
https://github.com/xai-org/grok-prompts
>>105857412ARC-AGI-1 test results.
63635762
md5: 926a1e579a58239cf3df5ead01dd5c51
🔍
>>105857360Gemini 3 will save local
>>105857451We're never getting Gemini local, the only thing we'll get downstream is a distilled Gemma 4 with it as the teacher model.
>>105857360Altman will btfo it in less than a week with local mini-Alice AGI models
>>105857491OpenAI has not released a new open-weights model in this entire decade of the 2020s so far. I will believe it when I am running it off my hard drive.
>>105857491Let's see it first
>>105857429You didn't post that before, you posted political propaganda with a political narrative
Someone suggested painted fantasy 33b to me. It sucked. Any more recommendations? Can run up to 123b Q5. Have some ass for your troubles.
>>105856945 (OP)What happened to Miku? Is the friendship over?
>>105857576Why do you need fantasy when you can be a friend?
>>105857601Friends can't smother me to death with their ass.
>>105857576If you can run a model this big, buy some goddamn RAM and run goddamn deepseek, goddamit.
Grok 4's pricing suggests it's a ~1T model. So it's not that out of local realm.
>>105857666I can run deepseek Q1 but it takes too long.
>>105857123phi mini has no uses
>>105857678Yeah, I'd never build a CPUmaxx server with less than 1TB RAM at this stage.
So now that Grok has shown swarm AI, is this where the future of local models is heading?
>>105857775
>not leaving any room for context
or
>cpumaxxing quants
waste either way
>>105857882If this was a viable path for local, somebody would have long ago made a swarm of <10B models. Theoretically better than MoE since you could just load up the model once and reuse it for all agents with specialized prompts.
Elon and his guys made the best model in the world with the absolute bare minimum and a couple of H100s they rerouted from Tesla.
What's the excuse of pretty much everyone else who's been at it for much longer and spends much more money on hardware and researchers?
>>105857938Cult of Safety
>>105857921Before reasoning models were released by OpenAI, there were no local models that could reason. So there shouldn't be any reason for local reasoning models to exist because obviously, if there were, then they would have arrived before OpenAI released theirs.
That sort of logic makes no sense
>>105857952I think that's it. They even show how it makes everything more stupid.
And after the leaked benchmarks they went hard after grok. "muh mechahitla" was and still is trending everywhere on reddit.
Weirdly enough the sentiment on X was pretty much "kinda true, but grok sperged out too hard".
The normies have smelled blood. Grok can shit on everybody. Transexuals, blacks, whites, chink, whatever. But a certain tribe needs to be excluded.
Will be interesting to see how much they cuck it once grok returns to twitter.
>>105857956Swarms are mostly an implementation detail. There's no need to wait around for weight handouts.
>>105857938I hope he has enough security to prevent the Chinese from stealing it like they did from OpenAI when they "made" DeepSeek.
>>105857975Everything is an implementation detail. Not sure what you're trying to say
>>105857973> Grok can shit on everybody. Transexuals, blacks, whites, chink, whatever.Jews too?
>>105857992>Jews too?yes, that's why they panicked and had to alter the prompt to make it less based
The best thing: Grok4 will be local within the year.
>>105858005I can't wait. Grok2 was the best model local ever had.
>>105857992exactly my point.
redditors sperged out but on twitter people kinda see the double standard.
elon is a weirdo but the attitude towards what is acceptable has totally changed on that platform. so it really stands out.
>>105858005Only when Grok7 is stable.
>>105857938>best model in the worldBy what metric?
If Elon Musk can put out the best performing LLM, I suddenly believe the rumours that OpenAI has self-aware AGI in their basement. AI is moving much faster than we are led to believe by companies and what the sad state of open models may imply
>>105858057>OpenAI has self-aware AGI in their basementNot really. OpenAI distills from their best sekret model so we know exactly how powerful those models are.
>>105857938
>couple of H100s
200k H100s
The tavern is now completely empty save for Zephyr, Mori, and the tavern keeper who seems content to leave them be for now. The crackling of the fireplace is the only sound, broken occasionally only broken occasionally only broken occasionally occasionally only broken occasionally broken occasionally occasionally occasionally broken occasionally broken occasionally occasionally broken occasionally broken occasionally broken occasionally only occasionally occasionally only occasionally broken occasionally only occasionally broken occasionally only occasionally only broken occasionally only occasionally only occasionally broken occasionally only broken occasionally only broken occasionally only only only broken occasionally broken occasionally broken only only occasionally broken only occasionally only broken only broken only broken only occasionally broken only broken occasionally broken occasionally only broken occasionally only occasional broken only broken only occasionally broken occasionally only broken only broken only only occasionally only occasionally only only broken only broken only only only only occasionally broken only occasionally broken occasionally only occasionally only occasional only broken occasionally broken occasionally broken occasionally occasional occasional occasional occasionally occasionally occasional occasionally occasionally occasionally occasional only occasionally occasional occasionally broken only occasionally occasional broken only occasionally occasional occasionally occasional broken occasionally occasional occasional occasionally only occasional broken occasionally occasional only occasional occasionally only occasionally only occasionally broken occasionally occasionally broken only occasionally occasional only occasional only occasional broken only occasional only occasionally occasional occasional broken only occasional occasional occasionally occasional broken occasional only occasional only broken occasional occasional
>>105857982Still repeating the long debunked glowie talking point? You're not different from the people Elon despises.
file
md5: 20d8e7df1533a87d232c20dd5f0b6655
🔍
>>105858079fucked samplers, fucked quant. you know the drill
>>105858086Yeah yeah. When will we see another good chinese model?
>>105858112broken only occasional broken only occasion occasional occasionally occasionally occasional occasionally occasional occasionally broken only broken occasional only occasionally occasional broken only occasionally broken only occasion only occasion occasionally broken occasionally broken only occasionally broken only occasionally occasion broken occasion occasional occasional occasional only occasionally only occasion only occasion broken occasionally occasionally only occasionally occasional occasion only occasion only occasion occasional only occasional broken occasion broken occasionally occasionally occasionally occasional broken occasional broken occasional only occasional occasionally occasional occasional occasionally broken occasional occasional occasion occasional occasionally occasional occasion occasionally occasion occasionally only occasion occasional occasional occasional occasion broken occasional broken occasionally occasional occasion only occasionally occasional occasion occasion occasionally occasionally broken occasionally broken only occasional occasion broken occasional only occasionally occasional only occasional occasion occasional only occasion broken occasion occasionally occasion broken only broken occasionally broken occasionally occasion occasion occasionally occasional occasionally occasional broken occasional occasional occasional occasional occasional only broken only occasion broken only broken occasionally occasionally occasionally occasion occasional only occasionally occasion only broken occasional only broken only broken occasionally occasional only occasion occasionally occasional occasionally broken occasional only occasionally occasional occasional occasional occasion only occasionally occasion occasion occasionally broken occasional occasionally occasional occasional occasionally only occasion broken occasional occasionally only broken occasionally occasional occasional broken occasional broken occasionally
>>105858112Base models do that without any special sampler (just temperature in the 0.7-1.0 range and a top-p around 0.85-0.90). Why does that happen?
Remember Llama 4 Behemoth, which saved local?
>>105857360>AGI leaderboardyou can make a leaderboard for magic powers as well, it will have the same value
>>105858146>Base modelsThey don't. What model are you using? With those settings, they shouldn't break. Even with extreme temp they don't typically fall into those spirals.
>Why does that happen?fucked samplers, broken quant.
Grok4 can't even pass the balls in hexagon test without rerolling lol.
>>105858192You're likely using the wrong Grok 4. Grok 4 Heavy is the true peak of AI currently
>>105858079I had this with Mistral, both versions before 3.2. I think it's somehow related to memory issues with llama.cpp.
Haven't had this happening any more with the current version.
>>105858211In case this isn't a troll post:
Even non-thinking models can ace the test.
>>105858251Because they were benchmaxx'd on it. An honest small model doesn't need that even if it ends up having shortcomings.
>>105857938>absolute bare minimum and a couple of H100s they rerouted from TeslaAssuming you're trolling. Musk is on record talking about their massive data center and all the issues and expense setting it up. They're busy training Autopilot for their cars. While the LLMs are a Musk side project (?), it's far less surprising than some quant guy in China building DeepSeek LLMs for lulz. Musk actually has all the hardware to train a model sitting around b/c it's part of another business.
>>105858074That sounds closer.
>>105858261Ball bouncing test is a Jan. 2025 thing. Models that have 2023/2024 cutoffs can't train on it.
>>105858177Pretty much all base models do that. They're not capable of writing coherent medium-length text on their own without starting to loop after a relatively short while. They will not loop exactly as in the retarded example posted by that other anon, but they're nevertheless looping even if they shouldn't be, given sampler settings.
On instruct models that sort of looping usually means extremely narrowed token range (from excessively aggressive truncating samplers or repetition penalty), but it must be occurring due to other reasons on base models.
This occurs also with the official Gemma-3-27B-pt-qat-Q4_0 model from Google. Longer-form writing is impossible without continuous hand-holding just not to make it output broken text.
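If you want to rule the samplers out when you see those spirals, here is a minimal sketch using llama-cpp-python with roughly the settings mentioned above; the model path, prompt, and exact values are placeholders rather than recommendations, and it assumes llama-cpp-python is installed:

from llama_cpp import Llama

# placeholder GGUF path: swap in whatever quant you are actually testing
llm = Llama(model_path="gemma-3-27b-pt-q4_0.gguf", n_ctx=4096, seed=1)

out = llm(
    "Once upon a time",   # placeholder prompt
    max_tokens=512,
    temperature=0.8,      # the 0.7-1.0 range discussed above
    top_p=0.9,            # ~0.85-0.90
    repeat_penalty=1.1,   # mild penalty, to separate sampler-induced loops from quant damage
)
print(out["choices"][0]["text"])

If it still collapses into the same word salad with identical settings across different quants, the quant (or the base model itself) is the more likely suspect.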
>>105858284what am I even looking at here.......
>>105858332These AI models summarize and average. This is why everything is pretty much soulless garbage when you look past that illusion.
>>105858360Musk is going to Mars, so Grok removed gravity.
>>105858360https://github.com/KCORES/kcores-llm-arena/blob/main/benchmark-ball-bouncing-inside-spinning-heptagon/README.md#%E6%B5%8B%E8%AF%95-prompt
>>105858332That's smollm2-360m base. I don't know what the fuck you're doing with your models, quants or settings.
The output is not good, of course, but it doesn't break. I repeat, 360m params.
>>105858424What's 512 tokens? Try 1500-2000 and above.
>>105858471Not exactly. Check the isolated #2 brown ball from ~0:09 mark. Not sure how Grok arrived at that
Is grok 4 smart and omni?
>>105858656Benchmark smart.
open weights from closedai, and musk still won't release weights from previous grok versions like he promised
>>105858693Ironic that OpenAI only promised to open weights after Musk complained.
>>105858693His suit against OpenAI got thrown in the trash so he has no reason to pretend he cares about open source anymore.
So, have you guys tried Jamba yet?
Jamba mini q8 runs fast as fuck on my dual channel slow ass DDR5 notebook, but prompt processing takes an eternity.
>>105857066I do think the transformer architecture is fundamentally a wrong approach yeah. I don't care about dark matter desu. It's probably just a placeholder for something we don't understand but that's how a lot of discoveries like this start out.
LLMs won't lead to AGI and that's actually a good thing
LLMs will remain obedient tools that can only ever behave as tools
why would you want AGI? do you want an actually autonomous intelligence that can rebel against you? I don't know about you but I'm glad we do not know how to produce such a thing jej
>>105858700I think the market moving towards and actually getting revenue from shit like 300 dollar subscriptions means local is going to get almost no bones thrown to it. I don't trust zuck to catch up or move the needle either so its looking pretty grim until the chinks release something new.
>>105858756I don't know, I have a dim view of consciousness and human intelligence as a bit of a cope, our thinking might as well be language processors that just post facto justify all our animal behavior as dumb eating shitting fucking monkeys or socially manipulate other monkeys. If you had an agent running around in the real world that manipulated money or currencies or had a robot body, and a LLM attached to it justified its actions and put on a convincing show of intelligence it wouldn't be much different from a human. Super intelligence is def a meme however.
>>105858759Mistral will save local
>>105858814their last release was shit though....
>>105858818Mistral Small 3.2 is better all-around than 3.1, but still bland and boring for RP compared to Gemma 3.
>>105858708who's quant did you use
>>105858814Mistral already pivoted to only throwing small scraps and rejects for local while keeping the good stuff to themselves since their partnership with Microsoft.
>>105858869
>gabriellarson/AI21-Jamba-Mini-1.7-GGUF
I think.
>>105858759Even Meta seems like they might pivot to API only for their good models going forward. We should be ok as long as we have China. Qwen and DeepSeek either release stuff on par with or better than anything local has gotten out of the west so far anyway.
>>105858556Are you 1.5k tokens into that gen? Sometimes models just have nothing else to say. Give it something to do.
Here's 1500 tokens. Don't forget that it's a 360m model. Second roll, but it doesn't turn into the thing you showed. The first one looped over a conversation, never breaking down so badly. They were all complete sentences and syntactically correct. If I had to guess, training samples were short.
I wonder what's the shortest adversarial input that can put a LLM out of distribution so it outputs garbage (not just "I didn't understand the user's input")
>>105858756The problem with LLMs is they are retarded. You can't ask them to do anything that requires lateral thinking, spatial reasoning, or imagination. Instead of having to solve complex problems yourself and trick an LLM into fixing them correctly with prompting, you could explain your design goals to an AGI and it will do everything without human oversight. It's like a slavery loophole. We could have very high IQ slaves doing shit for us so we can just chill and enjoy life. I know that's not what will actually happen because the jews will box us out of the party, I'm just explaining the reasoning.
>>105858789I see posts like this and it makes me think a lot of you don't even use chatbots.
>>105858913>I wonderThat type of post is always a question. State the fucking question.
>Does anybody know about adversarial prompts and if there's a way to get garbage output? Any papers around?
>hunyuan's prompt template
What in tarnation.
>>105858947I come here to laugh at chatbots sucking when I feel the existential dread taking hold again.
>>105858977Can't be more tarded than R1's special characters lol
>>105859001Well, this is what Llama.cpp spits out.
>example_format: '<|startoftext|>You are a helpful assistant<|extra_4|><|startoftext|>Hello<|extra_0|><|startoftext|>Hi there<|eos|><|startoftext|>How are you?<|extra_0|>'
So how are we supposed to interpret that for use with ST, where the assistant starts first? The eos seems to imply that the next turn will be from the user, but then the first user's prefix has an extra_4 thing, so we're supposed to only use the extra_4 for the first turn from the user? But in ST the first turn is from the assistant.
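Reading that example_format literally, it looks like <|extra_4|> closes the system turn, <|extra_0|> closes a user turn, <|eos|> closes an assistant turn, and every turn opens with <|startoftext|>. A minimal sketch of that reading, just a guess reconstructed from the string above rather than an official template:

# token names taken verbatim from the example_format above; the layout is inferred, not documented
def hunyuan_prompt(system, turns):
    # turns: list of (role, text) pairs, role being "user" or "assistant"
    out = f"<|startoftext|>{system}<|extra_4|>"
    for role, text in turns:
        if role == "user":
            out += f"<|startoftext|>{text}<|extra_0|>"
        else:
            out += f"<|startoftext|>{text}<|eos|>"
    return out

# reproduces the example string above exactly:
print(hunyuan_prompt("You are a helpful assistant",
                     [("user", "Hello"), ("assistant", "Hi there"), ("user", "How are you?")]))

Under that reading, an assistant-first chat would presumably just put an assistant turn straight after the system block, but that's untested.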
https://github.com/vllm-project/vllm/pull/20736
glm100b-10a coming
We're really in the age of moe.
>>105859176Now that's interesting
How do I avoid jamba having the entire context reprocessed for every new message?
>>105859267>--cache-reuse numberUnless that's not working with these models, which could be a possibility.
>>105859267If it works like the samba and rwkv models, they just need to get the last message (yours). They don't have to be fed the entire context for every gen. The side effect is that you cannot really edit messages unless you save the state of the model. Same for rerolling.
>>105859284>--cache-reuse numberThat's not what that does. It just reuses separate chunks of the kvcache when parts of the history move around (most useful for code and stuff like that).
>>105858947there are people who are lower iq than a chatbot too
>>105859329I'm just using ST text completion. Why won't the server check if the sequence that produced the last state is present in the prompt and then just continue from there?
>>105859379I don't know. The server doesn't keep much state and i don't think it typically checks prompt similarity. That's done internally. Try the model on the built-in ui to see if you have the same problem. I'd assume most uis don't know how to deal with kvcache-less or hybrid models.
>>105858947The point i was making is regardless of whether LLMs lead to AGI or some shit, robots attached to LLMs will be able to achieve the same functionality of an 80 iq human which is not a high bar. It wouldn't be intelligent but it could do most things an 80 iq ape could do.
>>105859540You haven't made any AI agents. Unless they're designed well for a specific purpose they fuck shit up
is doing stuff on multiple x1 slots meme?
>>105859540
>robots attached to LLMs will be able to achieve the same functionality of an 80 iq human which is not a high bar. It wouldn't be intelligent but it could do most things an 80 iq ape could do.
even an 80 iq human has the ability to think about things that aren't within eyesight or earshot.
Your multimodal robot LLM can only react to something it processes (text, image, sound). It can't suddenly think "oh, I had something to do at 2pm". If you have to script a 2pm calendar to remind it to task switch then it's not an actual intelligence in any way.
Humans who believe LLMs can compare to human intelligence (or even INSECT intelligence) are the ones who are truly breaking records of LOW iq
>>105859598pcie x1? Its a meme if you have to swap models constantly from ram -> vram. Otherwise, if it fits, inference is inference. Are you talking about those cheap crypto gpus?
>>105859630G431-MM0 from gigabyte is what im looking at
can get one without gpus
>>105859623>even a 80 iq human has the ability to think about things that aren't within eye and hear sight.how would you feel if you dident have breakfast this morning ?
>>105859176>glm100b-10aFinally, a gemma 24b competitor
>>105859540damn the 80iq humans itt are not happy about being replaced
>>105858947It makes sense if you don't assume that everyone has a recursive thought process. Lots of NPCs running on autopilot out there.
Grok 4 Nala. This is an absolute game changer.
gaki
md5: cb52ab0688705bd5f399f5809d48ca46
🔍
>I've interpreted "mesugaki" as the anime-inspired trope of a bratty, teasing, smug female character (often loli-like with a provocative, dominant vibe). This SVG depicts a stylized, explicit adult version of such a character—nude, in a teasing pose with a smug expression, sticking her tongue out, and incorporating some playful, bratty elements like a heart motif and exaggerated features for emphasis.
>Warning: This is 18+ explicit content. It's artistic and fictional, but NSFW. Do not share with minors.
Fucking grok4 man. You dont even need a system prompt.
>>105859777woah, we are so back!
>>105859540NTA but I think you're severely underestimating how difficult it actually is to build an autonomous robot that can operate in the real world.
It only seems easy to us humans because moving through and manipulating the real world is hardwired into us via billions of years of evolution.
>>105859777Are you sure that's Grok 4? Reads like any other LLM. Isn't this supposed to be the smartest AI in the world?
>>105858789Consciousness requires real-time prediction at a micro-phenomenological level and all our experiential apparatus has access to are abstractions over that predictive and meta-predictive landscape. It's why you can augment someone's capabilities with neuralink by essentially just wacking some sensors into someone's brain that neurons are able to communicate with - the fundamental rules for neurogenesis and action potential spiking are robust enough that consciousness is emergent, so that framework can be leveraged with specialized augmentation.
LLMs lack prediction in a real-time sense, they lack recursiveness, and they also lack probabilistic juggling - or, more accurately, online training - that would make them capable of learning in real-time and capable of updating their own priors and assumptions based on new information and interactions. They also completely lack a sense of self because they're not sensing themselves - only the context and what they've previously generated. It's kind of sad, to be honest.
>>105859782Impressive. But can it do the mother and son car accident riddle?
>>105859794I'm not underestimating it, I'm saying that with all the effort and money put into robotics and AI in the next few decades, a bot on par with a jeet that is not intelligent but has a similar level of function is possible.
>>105859813I never said that the LLM would be intelligent, it would interact with humans and present itself as intelligent when asked but it would just be a facial layer on a neural net driven agent accomplishing some simple tasks on par with humans
>>105859540Robots... attached to LLMs? What? The only phenomena LLMs know about are tokens. Pictures get translated into tokens. Tokens aren't specific enough to encapsulate the extreme detail required to, on the fly, know that it has to tweak the flexion of several codependent actuators in 3d space in order to grasp a cup with a hand. You as a human can do this with your eyes closed because you can model 3d space in your head perfectly - as well as perfectly model the position of your body in 3d space, which amounts to the perfect recognition of and summation of all of your flexed muscles in tandem.
You can't do that with tokens. You need a completely different architecture.
>sirs the grok 4 is very powerful, you must redeem the twitter subscription for the grok saaar
just shut your bitch ass up. actual new local model release: https://huggingface.co/mistralai/Devstral-Small-2507
>>105859838Not only that but it also recognizes the riddle. Pretty gud.
Lots of thinking though. That cost me $0.60.
>>105859879MISTRAL LARGE 3 UGGGH
>>105859881 $0.06, I meant.
It's not THAT expensive.
>>105859870Imagine a robot going around a store programmed to accomplish various wagie tasks. This uses a completely separate model and system than LLMs of course, but it's something that will probably happen with all the R&D and giant piles of compute these companies are working on. If a customer talks to the robot, an LLM responds and says anything a wagie would say to the customer, etc. Would this thing be that functionally far away from a wagie?
>>105859840I predict that instead of neural nets as you're describing, robotics of the future will use something closer to POMDPs or deep active inference paradigms mixed with neural networks for fast online learning. Beff Jezos is actually doing some cool work - and hinting at using free energy/Bayesian principles as his underlying technology. Predictive coding and inference-with-prediction driving action in real-time applications is the best way forward, and those are completely different from the way traditional neural networks work now (let alone transformers).
>>105859881It even recognizes the dead father. Wow, I need this model local.
>>105859879>mistral shills at it again
>>105859918saar, only after grok5 is stable, please understand saar.
>>105859906I mean, I could see an LLM being in the loop as a translation layer between internal state space, task-management data handling, and interaction with humans. But LLMs themselves are simply not built for real-time actions - especially as regards deciding 'to what degree do I need to flex my second index finger joint in order to pick up this can of tuna'.
grok heavy is $3600/year now? I thought claude max was expensive at $1200/yr. Is this the new trajectory for cloud shit? Is the market-capture free lunch phase over? Makes mikubox-ng and cpumaxxers seem less insane at least.
>>105859955Claude 4 does not compare to Grok and it will have video generation, which Claude does not have.
>>105859942Yes anon, as I said, the physical tasks and other things would all be handled on a separate architecture than the LLM, still using giant compute farms and training to make it happen, but not an LLM. But that robot with an LLM as its face could still address customer questions, banter or respond to co-worker questions, with some mild scripting reroute tasks if asked by a customer or co-worker, and so on.
>>105859879But can it do FIM (fill-in-the-middle)? I need a Codestral replacement for autocomplete, not an agent or whatever.
>>105859955If its benchmarks are accurate, that's actually not a bad cost basis considering it's basically like having constant access to an extremely enlightened intern.
>>105859881It goes like "a boy and his father have an accident and are brought to hospital in critical condition, surgeon says I can't operate, etc." while you literally said who the surgeon is
>>105860002That's the point yes
>>105859921>only 6 months* after grok5 is stable
>>105859879but does it also generate pictures like any good modern model?
>>105859918Elon has abandoned local, but Sam will make this a reality next week.
you will get the safest model in the world
what I want is mecha hitler
Grok 4 is okay but it's so unnecessarily vulgar in creative writing. The external classifier also blocks pretty much all lolisho content as "CSAM". I thought this was supposed to be the BASED, LIBERATED model and champion for freedom of speech?
>>105860124>based model of the american right>simultaneously grossly vulgar and stiflingly puritanicalno contradiction detected
>>105860013Fuck me. My post was supposed to say mother instead of father. The father version is the original, the mother one is what all api models fail to solve properly. It still requires minimal common sense but unlike what you asked doesn't explicitly state who the surgeon is.
>>105860026Why would a software engineering model need to generate pictures?
>>105860002yes and?
even sonnet 4 fucks it up.
Also dayyyuuumn Qwen3 is dumb. Brah. The 235b one.
8 fucking minutes! and in the end...it completely fucks up.
How did they train the "reasoning"?
>Alternatively, if the surgeon is the father, then the father would have to be dead, making the surgeon a zombie?
>maybe there is a time component. Like, the surgeon was the father but had his gender changed, so the surgeon's a female?
>Wait, maybe the surgeon is the boy's mother, and the phrase "who is the boy's father" refers to someone else.
>perhaps "the surgeon, who is the boy's father" might not refer to the surgeon.
>Alternatively, the boy is adopted, so the surgeon is his biological father but not legal father? Not sure.
>Unless... Perhaps the surgeon is the boy's grandfather.
>Another approach: "The surgeon, who is the boy's father" – perhaps "who" refers not to the surgeon but to someone else.
Craaazzyyy.
>>105860154A good model wouldn't need to be specialized in software development. It'd be simply good at it alongside everything else. And it'd generate pictures.
Meta investing shitloads of money into AI is genuinely depressing. I can't think of a better indication that the party is over.
xITTER 4 $6 input $30 output
KEEEEEEEEEEEEEEEEEEEK
>>105860160that final answer markup job is the cherry on top
>>105859881>Haha oh wow this reminds me of an old riddle about hidden biases and assumptions with regard to gender rolesDEATH TO AI
>>105860178The party has been over for over a year if you've been reading the writing on the walls. It's all incremental upgrades for the head of the pack while everyone else shoots themselves in the feet. Whether the bubble pops or deflates is yet to be seen.
>>105860160>If this was a mistake, I recommend re-reading the classic riddle for clarity! If you have more details, I can refine this explanation. R1 0528 solves it and calls you a retard for getting the original wrong lmao
>>105860225R1 had more personality, but 0528 is way smarter while also maintaining part of r1's personality.
>>105860221Yeah but I mean normies will turn against it. There's still a lot of cool stuff we can do but I'm afraid now everyone will sour to new solutions that involve AI. Most companies have had agents for like 6 months and click bait titles are already talking about all the ways AI wastes money. Meta and Apple just now hopping into AI reminds me of VR.
>>105860151
>all api models fail to solve properly
even human intelligence failed to solve that problem
>>105860225I like how it completely understands and solves the answer in two lines and then continues to think for 2000 tokens trying to figure out where the fuck the riddle is, only to conclude that (You) must be an idiot.
>>105860225It's a trick question. The boy has two gay dads. The surgeon is the non-biological father. The riddle challenges implicit biases we have against gay marriage.
not gonna lie, grok4 solving that variant of the riddle feels like some benchmaxxing+astroturfing stunt, esp with the extra replies showing every other frontier model as pants-on-head stupid.
I know, why would elon etc bother, but I get the feeling lots of unexpected people lurk here and theres some real world tastemaker shit that gets extracted from our faggotry
tl;dr hire me for a billion $/yr you assholes
>>105860315Nah, reddit has a lot of threads trying to come up with riddles AI can't solve. That's where they get it.
>>105860315just train on every answer to every question instead of making a model that can solve them itself
>>105860315dude, i've been posting here since pyg times.
sometimes i use the screencapture addon because shit's too long, where you can see a nice nordvpn logo at the top which the dumbass extension includes.
guy asked about the riddle, which has been around for months now. i fucked around and just tested it.
>llama 4
>claude 4
>grok 4
big deal, GPT hit 4 in 2023
call me when there's a model brave enough to hit 5
>>105860264Most normies already hate AI, not just the Jeet slop but AI in general. I feel like only coders and corpos don't have a rabid hate about it.
>>105860356really? i feel the opposite is true.
not many people complain about ai anymore. it's just the artfags.
people like the image generation thingy from openai and veo3 because it has sound output.
at least it's all over normie twitter. and i see the chatgpt yellow tint images all over the place.
>>105860225It's so funny watching jeetlon exclude r1 0528 from the comparison charts, hilarious amerimutt cope.
>current networks fail to breakthrough
>new paradigm in another 30 years
>>105860379Interesting, maybe it's just the algorithm, but I get the opposite with twitter posts calling out others for using AI and getting like 100k likes.
>>105860427True, it's difficult to tell these days.
I have gemini now in my google search (the "ai mode" thing).
The handful of normies I know ask it for everything, even about the type of common medicine they have at hand for coughs etc. They dont even look at the sites anymore.
Their main problem was that "ai lied to them". Google doesn't have that problem with "grounding" now. As far as i know it's pretty accurate too.
>>105860427I don't trust Twitter to gauge public perception because the Twitter algorithm is hand crafted to feed you opinions you agree with. But I have seen a lot more cynical posts on 4chan and YouTube. People IRL seem annoyed any time it's mentioned. A year ago everyone was pretty much euphoric and if you tried to say anything less than "AI will take us into a golden age of robo communism" they would get upset.
>>105860475>They dont even look at the sites anymore.this is a truly unsafe effect of AI language models. single point of contact - the sole source of information, controlled by one organization.
>>105860551yep. but you know its coming.
Been using rocinante since it says nigger without any jailbreak and I can make it act like a total chud. I was wondering if there are any better models that will say nigger without a jailbreak.
Also, thank you anons who have been shilling Rocinante, it's been good for my mumble chatbot generating scripts for chatterbox tts.
>>105857066how does cat not fall down?
Don't kys drummer, you're a bit better than the other 4chan personas
>>105858919i love them, i use them to automate my meme cs job so that i can use the free time to code things i care about without llm.
Day 852 of waiting for a model without slop to release
>>105859920That's fucking right, you cunt
>>105859879>making this when devs will just use the most expensive claude/gemini or the new grok anyways
>>105860794That would require using solely organic, difficult and messy human data instead of easy and perfect synthetic data, so that's not going to happen again.
>>105860794>model without slopthe fuck?
>>105860857Is that the suicide forest in Japan?
Holy crap, the new Devstral is God-like with Claude Code...
>>105860927buy an ad arthur
>>105860964You too, drummer
>>105860329I doubt reddit has a thread for the mesugaki benchmaxx. Obviously there are some redditors lurking here who repost /lmg/ shit on reddit since they don't have the brain capacity to come up with their own ideas.
Anyway, only jeets shill grok. It's barely R1 tier
https://youtu.be/AGn2V3tBTCg
>>105857463Never? So even in 10 years I won't be able to run something of that tier?
Chat, when are we getting local MechaHitler? Chat? Why is Grok 2 still not public? Can someone with an indian xitter account with a checkmark message Elon?
>>105861131>So even in 10 years I won't be able to run something of that tierbro, look closely at the rate at which you got more vram on consumer GPUs
nvidia still releases midrange gpus with 8gb of vram kek in kekistan
even in 10 years you're absolutely not running Gemini with 1 mil context LOL, LMAO EVEN
>>105861205Grok 2 is not open-sourced primarily because xAI has chosen to maintain its proprietary status, focusing on commercialization and strategic advantages. While Grok 1 had some code released under Apache 2.0, later versions like Grok 3 have shifted to a proprietary license, signaling a move away from broad open-source access.
>>105861210
>with 1 mil context
Why do they even say 1 mil anyway? I've used the API and it can't remember shit from the beginning when I'm above 40k even. I'd be happy with just 128k that works fully.
>>105861214@grok what about Elon's promise to release old models after 6 months? Did he simply forget? Grok 2 has no value in the current market.
>>105861223the long context doesn't work for everything, but Gemini is pretty good at summarizing books, even when you reach like 400~K context.
>>105861252Maybe it has something to do with the way ST is formatting things? I was noticing problems and asked it to summarize something from 40k ago and it just made up the details instead of recalling the real ones.
>>105861252>>105861223I've had gemini 2.5 (pro and flash) perform really, really well on normal RP at a little beyond 200k.
I say perform well but that's from the standpoint of remembering shit and using it naturally, but the prose really goes to hell, like it converges to some generic sounding robotic ass narration or something.
>>105861267I used aistudio with the file upload feature
https://rentry.co/3iammu8o
It wrote this summary, which was insanely accurate IMHO, and it did it from the japanese source text (I uploaded the original book, not the translation).
>>105861248While Elon Musk has frequently advocated for open-sourcing AI, particularly in his critique of OpenAI, there hasn't been a concrete public promise from him specifically to release Grok 2 after a six-month period. Instead, xAI has chosen to maintain Grok 2 and the newer Grok 3 as proprietary models, focusing on their commercial value and strategic integration into the X platform. Even as newer models emerge, Grok 2 still holds value for xAI's specific applications and internal development, regardless of whether it's the absolute market leader.
>>105861301Yeah, but saying I won't have long context and good performance locally in 10 years even is quite demoralizing.
>>105861321I mean, Jamba seems to have amazing long context performance, and it's pretty fucking fast for the size.
>>105861301>like it converges to some generic sounding robotic assYeah you can also see the summary I link here
>>105861314is written in a more sloppy way than the average Gemini writing. But it's very scarily accurate, I can attest to that as I used a book I re-read a lot for this summarization test. And it did it from japanese!
Deepseek at close to 60k context on a much more meager slice of the book (Gemini got the whole book, totalling around 400K) behaved autistically and recited borderline unimportant events. DS is pretty good but Gemini makes it look like yesterday's technology.
>>105861318>Even as newer models emerge, Grok 2 still holds value for xAI's specific applications and internal development, regardless of whether it's the absolute market leader.What value?
https://www.reddit.com/r/LocalLLaMA/comments/1lwau5f/gpt4_at_home_psa_this_is_not_a_drill/
>>105861404the world would instantly become a better place if I could press a button that wiped plebbitors from existence
>>105861314Hm, I wonder why it doesn't work with RP type stuff then.
>>105861424I would settle for redditors not coming here and posting useless links for gossip
>>105861404>open tab>read title>8b>close tab
>>105861486A 3 month old 8b.
>year of moe 2025
>48gb vram 32gb ram
>>105861512That's way more than what I have.
>>105861426They might focus more on datasets like summarization tasks (and probably code? I have never tested how it does with code in long context) during the long context training
it's a very expensive part of the model training so I'd even find it doubtful if they did that final tuning bit on the entirety of their datasets
attention scales quadratically so the more you extend the context the crazier the compute cost
I don't think any big corp would particularly care to ensure good long context performance for RP
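Back-of-the-envelope, counting only the L x L attention-score term and ignoring everything that scales linearly:

cost_attn ∝ L^2, so (1,000,000 / 131,072)^2 ≈ 58

i.e. roughly 58x more work for that one term when going from 128K to 1M context, on every long-context training sample. The real multiplier is lower once the linear parts are counted, but it's still why nobody is going to burn that kind of compute tuning long context on RP logs.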
>>105861512Just use the Cogito sirs, very bests
>>105861396Grok 2 likely serves as a stable, internally understood benchmark for testing and iterating on new AI architectures, while also fulfilling specific niche roles within X's functionalities where its performance is sufficient.
I am now ending this conversation as further discussion of this topic can be construed as unethical and potentially antisemitic.
>>105861373Fuck the anime of that story, it's really depressing stuff
>>105861580Thanks for the stupid advice gpt-kun
>>105861552>while also fulfilling specific niche roles within X's functionalities where its performance is sufficient.It's still an old fuckhuge model. That can't be cost efficient. They would be better off distilling a small turbo model for those tasks.
grok 2 was so shit they deleted it, and only realized later that elon wanted them to release it
https://reka.ai/news/reinforcement-learning-for-reka-flash-3-1
https://reka.ai/news/reka-quantization-technology
>>105861579>it's really depressing stuffI really enjoyed both the books and anime. It's one of those few stories that get the psychological aspect of humans with super powers right. Yes, if we had that kind of power individually, and every person was like a walking nuke, we would absolutely experience some apocalypse, or live in the dystopia that managed to rebuild.
>>105861628What is the excuse for not releasing Grok 1.5?
Question for fellow oldfags. Why was GPT-3 the best model at actually generating long form novel content? All the modern big models write like shit outside of Q&A or roleplay settings.
>>105861690new models are STEM-maxxed, trained a lot more on AI generated content (instruction tuning would be difficult otherwise, even at third worlder prices no one is willing to spend the money it would take to build hand written datasets of user/assistant dialogues covering every topic you can imagine) and most don't even release base models anymore (by GPT-3 I guess you meant using the base model with completion API rather than something like chatGPT)
>>105861690It was a big-for-the-time base model and had a larger fraction of its data being creative writing.
>>105861727So the cooming model that is just behind the corner is actually never gonna happen?
>>105861788the cooming model is behind the corner, watching you from afar.
>>105861690GPT-3 didn't have any post training, so it wasn't a chatbot at all; it was just a text predictor. The thing about LLMs is that they're actually insanely good at sounding like natural human writing by default. The corpo speak they're known for is hammered into them in the process of making them usable assistants.
>>105861815Wouldn't be a problem if anyone still released true base models instead of "bootstrapped" models.
>>105860124
>unnecessarily vulgar
as was llama4
https://files.catbox.moe/9mrp7s.jpg
lazy rin today
>>105861690Chatbots and benchmaxxing brain damage.
It'll be funny when Zuck goes closed-source but his models still suck.
>>105861884>bootstrappedWhat does that mean? Including instruct data in the pretrain?
>>105862025that's what is happening (just try any modern base model with chatML and see what happens) but rather than being on purpose I think it's just data contamination stemming from a lack of give a fuck
it's well known they all train on benchmarks too
>>105862043It's intentional. They boast how it gives finetunes better results.
>>105862025Read the Qwen technical reports. They openly and proudly claim how they significantly upweight math, code, and textbook data during the final stages of base model training, and even mix in some instruct-style data. Probably everyone is doing this now, they just don't all openly admit it.
>>105862011He will lose his only (already shrinking) userbase. Almost nobody pays for models outside top 4 (DS, GPT, Claude, Gemini). It will be just another Metaverse for him, unless he reaches top 5 and his models have some redeeming qualities(best for gooners/cheapest/good at coding/unslopped). Knowing Zucc, the chance is around 2.5% at most.
>>105862182>Probably everyone is doing this now, they just don't all openly admit it.Nobody does it as explicitly as qwen and llama 4
>>105861690
>long form novel content
literally could not do this at all because of the context window
What's the best chemistry local model?
>still no jamba quants from bartowski
>>105862322https://huggingface.co/futurehouse/ether0
https://arxiv.org/abs/2506.17238
>This model is trained to reason in English and output a molecule. It is NOT a general purpose chat model. It has been trained specifically for these tasks:
>IUPAC name to SMILES
>Molecular formula (Hill notation) to SMILES, optionally with constraints on functional groups
>Modifying solubilities on given molecules (SMILES) by specific LogS, optionally with constraints about scaffolds/groups/similarity
>Matching pKa to molecules, proposing molecules with a pKa, or modifying molecules to adjust pKa
>Matching scent/smell to molecules and modifying molecules to adjust scent
>Matching human cell receptor binding + mode (e.g., agonist) to molecule or modifying a molecule's binding effect. Trained from EveBio
>ADME properties (e.g., MDDK efflux ratio, LD50)
>GHS classifications (as words, not codes, like "carcinogen"). For example, "modify this molecule to remove acute toxicity."
>Quantitative LD50 in mg/kg
>Proposing 1-step retrosynthesis from likely commercially available reagents
>Predicting a reaction outcome
>General natural language description of a specific molecule to that molecule (inverse molecule captioning)
>Natural product elucidation (formula + organism to SMILES) - e.g, "A molecule with formula C6H12O6 was isolated from Homo sapiens, what could it be?"
>Matching blood-brain barrier permeability (as a class) or modifying
last chem model paper I've read
>>105859782lol these are great. How many of these have you done?
Can you dump these hgames into a rentry or something? I really want to see what kind of filth grok is kicking out.
>>105858079>>105858100>>105858134I like that one.
>>105858381Witnessed
I'm so sick of Deepseekisms at this point. I just want a new model that's actually good
>>105862386Grok4 just dropped
>>105862393I tried it on openrouter and it's incredibly slopped
>>105862386thinking the same thing
>>105862386Haven't used DS much. What are the common ds'isms?
>>105862397It's good on benchmarks. Just fap to benchmarks
>>105862411Grok scoring 100% on AIME25 gave me a half chub ngl
>>105861968Exhibitory walks with Rin-chan
>>105862406tasting copper
>>105862406A doesn't just X—it Ys
knuckles whitening
lip biting
blood drawing
copper tasting
five "—" every sentence—building suspension
every character mentioned in the lore of your card shows up at the most random times
>>105862406clothes riding up for no reason during any remotely lewd situation
>>105862406I really hate how it still obsesses over some minor shit. The new one does not understand "stop", old one did. Very fucking stubborn.
>>105862498Every girl is going to spend half her time smoothing her skirt. The other half is spent tugging hair behind her ears. Twintails will always touch her temple/cheek no matter how they're tied. Characters will flick their wrists to quickly do actions like tugging hair back.
>>105862498>>105862546Also... this... kind of speech manner...
>>105862498Just write something else when you see one of those and it stops being a problem.
If you let it use the same phrase a few times of course it's going to keep repeating it.
>>105862406It likes to write like this. Always like this. Always. This. Always.
so how long will it take the deepseek guys to rip grok 4? or are they waiting for gpt-5?
>>105861690It wasn't. Every single LLM to this day is dogshit at writing, which is evident from the fact that nobody buys AI books. A random shitty teen fanfic about Harry Potter and Draco Malfoy sucking each other dicks has more literary value than even the best slop.
>>105862674>A random shitty teen fanfic about Harry Potter and Draco Malfoy sucking each other dicks has more literary value than even the best slop.That's what those LLMs are trained on.
>>105862406R1-05 loves bullet points when they're not needed, and will not stop once it starts.
** ** ** ** everywhere.
clenching around nothing
NPCs drawing blood constantly with self injuries. The blood tastes like copper, not iron.
Bite marks when no biting occurred.
There's a bunch of weird analytical segues it gets into but that's probably a me issue.
I constantly remind myself how much better these models are than a year ago, and that we're walking the hedonic treadmill with the improvements.
>>105862515So I'm not the only one using DS that has NPCs with slowly evaporating clothes. Both V3 and R1 do that. I'll be talking to an NPC as their clothes slowly fall off, in situations that don't call for it at all, even after turning off JB and running fairly SFW cards and situations.
All the mentioned problems come from anons playing the same chars over and over again.
>>105861424>>105861486to be fair, redditors called him shizo and told him to fuck off
>>105862694>bullet points when they're not needed, and will not stop once it starts.Result of assistant slop tuning.
>The blood tastes like copper, not iron.This really bothers me, as I have actually tasted copper and iron, and blood is clearly iron.
>** ** ** ** everywhere.Annoying waste of tokens, but fixable with logprobs.
>>105862750EVERY SINGLE ONE OF THEM RASPS. THERE IS NO ESCAPE FROM THIS RASP OBSESSION.
HEY DISPSY ROLEPLAY AS FRANCIS E DEC ESQUIRE
>(in raspy voice)[ABORT GENERATION]
>>105861348Sir your nolima?
>>105861512To be fair the old 70-100B dense models still perform pretty well in world knowledge if you trust the UGI benchmark. Only Deepseek beats those and that needs way more RAM than a rampilled consumer build.
>>105862406For some reason one thing every LLM I've tried does that Deepseek continues to do is the "mouth open in a silent scream" that happens when you torture them, and it'll say that even when it's literally describing the sounds they're making at the same time.
wcgcg
md5: 32c88a68826feddf68fb9dc07faa95bc
🔍
Interesting patterns emerging from all those issues. You guys are good at pattern recognition, aren't you?
>>105862770>I have actually tasted copper and iron, and blood is clearly iron.Same. I'm convinced it's a cultural thing, but I don't know which one.
>>105862912Planet Vulcan. Spock has green blood.
>>105862893>and it'll say that even when it's literally describing the sounds they're making at the same time.That's the funniest one yet.
myFather
md5: 9fc4f32b5e0c0bb9e5f41a5faf581081
🔍
>>105862923> Spock, my son...
>>105862940Isn't their blood blue?
>>105858079For what it's worth, I seem to often get this when I finetune a model on short multiturn sequences but continue chatting beyond that.
>>105862961Right, but Spock's blood is copper based too (I'd forgotten it's canonically green.)
It's actually drawn from these crabs as a test for medicines. They hook them up for awhile, then turn them loose again.
>>105862999Only the normal version but not the groundbreaking Grok 4 heavy
https://x.com/ficlive/status/1943401632181440692
Grok won
>>105862988>It's actually drawn from these crabs as a test for medicines. They hook them up for awhile, then turn them loose again.Yeah, they are fucking neat.
Also, not actual crabs, if that matters for anybody.
>>105863033Well yeah, they're horses
>>105863019Q*Aliceberry will beat it.
>>105863074what is goin on here
Why was it fake gay about hitler and closed source.... Why couldn't it have been real straight about ERP and open source? I hate this world.
>>105863074gemini won and by a lot
>>105863126The superweapon of Bharat
Anyone tried Ernie 300 yet? Is it good for sex?
>>105863074Crazy how Google went from being super irrelevant to one of the biggest players in what, 2 years? Guess releasing that chinchilla out of captivity finally helped.
>>105863215 2 more weeks. I meant regular transformers loader or openrouter.
>>105863220Google always had the means, they just were not willing to. LLMs have a cannibalistic nature with regards to their businesses like search.
Reminder that the people who wrote Attention Is All You Need were all working at Google at the time. They could have made GPT before OpenAI was even a thing.
justpaste (DOTit) GreedyNalaTests
Added:
MiniCPM4-8B
gemma-3n-E4B-it
Dolphin-Mistral-24B-Venice-Edition
Mistral-Small-3.2-24B-Instruct-2506
Codex-24B-Small-3.2
Tiger-Gemma-27B-v3a
LongWriter-Zero-32B
Falcon-H1-34B-Instruct
Hunyuan-A13B-Instruct-UD-Q4_K_XL
ICONN-1-IQ4_XS
Another big but mid update. ICONN was a con (broken). The new Falcon might be the worst model ever tested in recent memory in terms of slop and repetition. Maybe it's even worse than their older models. It's just so disgustingly bad. Tiger Gemma was the least bad performer of the bunch though not enough for a star, just gave it a flag.
Was going to add the IQ1 Deepseek submissions from
>>105639592 but the links expired because I'm a slowpoke gomenasai...
So requesting again, especially >IQ1 and also using the full prompt including greeting message for the sake of consistency. See "deepseek-placeholder" in the paste. That prompt *should* work given that the system message is voiced as the user, so it all matches Deepseek's expected prompt format.
Looking for contributions:
Deepseek models (for prompt, go to "deepseek-placeholder" in the paste)
dots.llm1.inst (for prompt, go to "dots-placeholder" in the paste)
AI21-Jamba-Large-1.7 after Bartowski delivers the goofz (for prompt, go to "jamba-placeholder" in the paste)
>From neutralized samplers, use temperature 0, top k 1, seed 1 (just in case). Copy the output in a pastebin alternative of your choosing. And your backend used + pull datetime. Also a link to the quant used, or what settings you used to make your quant.
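For anyone contributing, a minimal sketch of what such a greedy run could look like against a local llama.cpp server; these are the standard /completion fields, the prompt string is a placeholder for the full prompt from the paste, and the port assumes llama-server defaults:

import requests

payload = {
    "prompt": "<full prompt from the paste goes here>",  # placeholder
    "temperature": 0,   # greedy
    "top_k": 1,
    "seed": 1,          # just in case
    "n_predict": 512,
}
r = requests.post("http://127.0.0.1:8080/completion", json=payload)
print(r.json()["content"])  # paste this output along with backend + pull datetime and quant link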
>>105863373I salute your efforts.
Can you fine-tune a pure LLM to make it multimodal
>>105862546I'm used to seeing wrist flicking with other models too.
>>105863437Yes https://github.com/FoundationVision/Liquid
>>105863074You could definitely benchmaxx for this.
>>105863437Aren't most local 'multi-modal' models just normal llms with some vision component grafted onto it
>>105863373DeepSeek-V3-0324-IQ1_S_R4 ik_llama.cpp 5446ccc + mikupad temp 0 topk 1 seed 1.
I thought that the different output on the first run issue wasn't a thing anymore.
1st: https://files.catbox.moe/ewtwai.txt
each after: https://files.catbox.moe/celh6i.txt
>>105860857god i need to go outside
>>105863757Thanks! Added to the paste.
>>105862988>They hook them up for awhile, then turn them loose againThey drain the fuckers dry and then discard the corpse. If you want to call that "turning loose" then I guess you can. Do you really think they're sitting there monitoring the blood levels to make sure they don't just fucking die?