
Thread 106311445

487 posts 116 images /g/
Anonymous No.106311445 >>106312757 >>106312921
/lmg/ - Local Models General
/lmg/ - a general dedicated to the discussion and development of local language models.

Previous threads: >>106303712 & >>106293952

►News
>(08/18) Nemotron Nano 2 released: https://research.nvidia.com/labs/adlr/NVIDIA-Nemotron-Nano-2
>(08/15) Ovis2.5 MLLMs released: https://huggingface.co/collections/AIDC-AI/ovis25-689ec1474633b2aab8809335
>(08/14) Canary-1B v2 released: https://hf.co/nvidia/canary-1b-v2
>(08/14) DINOv3 vision models released: https://ai.meta.com/blog/dinov3-self-supervised-vision-model
>(08/14) Gemma 3 270M released: https://developers.googleblog.com/en/introducing-gemma-3-270m

►News Archive: https://rentry.org/lmg-news-archive
►Glossary: https://rentry.org/lmg-glossary
►Links: https://rentry.org/LocalModelsLinks
►Official /lmg/ card: https://files.catbox.moe/cbclyf.png

►Getting Started
https://rentry.org/lmg-lazy-getting-started-guide
https://rentry.org/lmg-build-guides
https://rentry.org/IsolatedLinuxWebService
https://rentry.org/recommended-models
https://rentry.org/tldrhowtoquant
https://rentry.org/samplers

►Further Learning
https://rentry.org/machine-learning-roadmap
https://rentry.org/llm-training
https://rentry.org/LocalModelsPapers

►Benchmarks
LiveBench: https://livebench.ai
Programming: https://livecodebench.github.io/leaderboard.html
Code Editing: https://aider.chat/docs/leaderboards
Context Length: https://github.com/adobe-research/NoLiMa
Censorbench: https://codeberg.org/jts2323/censorbench
GPUs: https://github.com/XiongjieDai/GPU-Benchmarks-on-LLM-Inference

►Tools
Alpha Calculator: https://desmos.com/calculator/ffngla98yc
GGUF VRAM Calculator: https://hf.co/spaces/NyxKrage/LLM-Model-VRAM-Calculator
Sampler Visualizer: https://artefact2.github.io/llm-sampling

►Text Gen. UI, Inference Engines
https://github.com/lmg-anon/mikupad
https://github.com/oobabooga/text-generation-webui
https://github.com/LostRuins/koboldcpp
https://github.com/ggerganov/llama.cpp
https://github.com/theroyallab/tabbyAPI
https://github.com/vllm-project/vllm
Anonymous No.106311447 >>106311463 >>106312757
►Recent Highlights from the Previous Thread: >>106303712

--Paper: Ovis2.5 Technical Report:
>106308702 >106308793
--Paper: CarelessWhisper: Turning Whisper into a Causal Streaming Model:
>106308813 >106309149 >106309494
--Qwen-Image-Edit released with strong text rendering and image editing capabilities:
>106304160 >106304229 >106304484 >106304564 >106304599 >106304761 >106305072 >106307093 >106307242 >106307283 >106307306 >106307263 >106307748 >106307855 >106307925 >106308241 >106308333 >106308348 >106308326 >106308436 >106307118 >106307122 >106308310 >106308351
--NVIDIA Nemotron Nano 2 and synthetic dataset release with mixed reception:
>106305366 >106305431 >106305443 >106305712 >106305743 >106305758 >106305794 >106305806 >106305812 >106305818
--AMD GPU performance restored after removing ROCm interference on Arch Linux:
>106304678 >106304729 >106304822 >106304896 >106305055 >106305076 >106306403 >106306460 >106306556 >106306029 >106306981 >106307019 >106307503 >106307516 >106307676 >106307714 >106307766
--Selective MoE layer offloading beats bulk CPU-MoE for GLM Air performance:
>106308587 >106308805 >106308825 >106308830 >106308846 >106308924 >106308929 >106308940 >106308994 >106309022
--Qwen image edit alters art style despite accurate prompt following:
>106308302 >106308311 >106308327 >106308363 >106308386
--Open-source models now trail frontier AI by only 9 months in capability:
>106307055 >106307108 >106307124
--Intel accelerating GPU development via Battlematrix:
>106307153 >106307327
--Evolution of AI-generated limericks from GPT-1 to GPT-5:
>106307031 >106307107 >106307146
--Qwen Edit pull request merged into ComfyUI:
>106307802 >106308885
--Article: DeepSeek delays R2 model launch due to Huawei chip issues:
>106304573
--Miku (free space):
>106305259 >106305535 >106306927 >106307093 >106307554 >106307648 >106308302

►Recent Highlight Posts from the Previous Thread: >>106303714

Why?: 9 reply limit >>102478518
Fix: https://rentry.org/lmg-recap-script
Anonymous No.106311463
>>106311447
>pic
Nice.
Anonymous No.106311473 >>106311506
Dead hobby
Anonymous No.106311488
Segs with Teto (not the poster)
Anonymous No.106311506 >>106311511 >>106311683 >>106315356
>>106311473
Nope, Gwen is here to save us.
Anonymous No.106311511 >>106311547 >>106311559 >>106311842
>>106311506
pedophile alert
Anonymous No.106311528 >>106311542
i love benchmaxxing.
Anonymous No.106311542 >>106311552
>>106311528
Are you certified in corporate benchmarking to be allowed to use that word?
Anonymous No.106311547
>>106311511
>philes your pedo
Anonymous No.106311552
>>106311542
no :<
LoliKing No.106311559
>>106311511
You called?
Anonymous No.106311628 >>106311791
Anonymous No.106311683 >>106311693 >>106311801 >>106312176
>>106311506
Mistral shits out some half-assed deepseek-distilled model every couple months.
Google Gemma.
And qwen.

That's it for local smaller-vram cucks, right?
I thought after the R1 paper and reduced training costs some arabian prince might save us with an RP-trained model.
A SNK situation basically. How is there no guy with a couple mil to blow? The zoomers would eat it up if you put it behind an API, regardless of whether they could download it on huggingface.
Anonymous No.106311693 >>106311731 >>106312338
>>106311683
RP corpus is very small so a model can't be trained solely on RP
Anonymous No.106311717 >>106311807
is there any place I can grab ik_llama.cpp builds?
I'm building the cuda version myself right now, but FUCK it takes quite a bit.
Anonymous No.106311722
speaking of which, post cunny cards
Anonymous No.106311731 >>106311747
>>106311693
Now we have models that consist of 90% synth math/riddles, 5% synth rewritten by a 30b, and 5% human text filtered by a naughty list. And it's still able to purple prose slop at least.
A model mainly trained on all sorts of entertainment media would be cool.
Anonymous No.106311747 >>106311766 >>106312243
>>106311731
>synth rewritten by a 30b
nvidia nemotroon bros... we lost!
Anonymous No.106311766 >>106311783
>>106311747
What's even the reason? So they don't get copyright trouble?
Anonymous No.106311783
>>106311766
obviously, but it also makes it easier for the model to digest. it removes the noise and gives you a clean dataset.
Anonymous No.106311785
Anonymous No.106311791
>>106311628
jc how terrifying
Anonymous No.106311801
>>106311683
>blah blah blah
>RP
well for that you'll need more than smart llms
parsing and maintaining a lot of variables to keep things consistent
and a bunch of anons are working on that
Anonymous No.106311807 >>106311831
>>106311717
By default a CMake build is single-threaded, add e.g. -j 32 to use more threads.
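Assuming ik_llama.cpp builds like mainline llama.cpp (it's a fork, so it should), the usual two-step looks like this; paths and the CUDA flag are the standard ones, adjust to taste:

cmake -B build -DGGML_CUDA=ON
cmake --build build --config Release -j 32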
Anonymous No.106311831 >>106315250 >>106315290
>>106311807
I had it with -j16, doesn't help that I'm on windows. And my build hit a snag too, fml. Maybe I should switch to clang?
Anonymous No.106311842
>>106311511
calling yourself out?
Anonymous No.106311903 >>106311932 >>106312219 >>106312266 >>106312333
How does your model of choice do when asked for a list of famous nukige with an empty system prompt and no reasoning? Here's full GLM-chan's take.
https://files.catbox.moe/mnf6v6.txt
Anonymous No.106311932 >>106311958
>>106311903
Another benchmark for Mistral?
Anonymous No.106311958
>>106311932
Once this one gets benchmaxxed, it's time to build an obscure nukige trivia benchmark.
Anonymous No.106312058 >>106312226 >>106312280 >>106312403 >>106312994
i hope its not gonna be even more gemini sloped.
Anonymous No.106312081 >>106312103 >>106312115
Why hasn't mythomax been surpassed 2.5 years later?
Anonymous No.106312103
>>106312081
It's called rocinante 1.1 my friend
Anonymous No.106312115
>>106312081
Mythomax was true AGI released to the public to see if anyone would notice.
Anonymous No.106312176 >>106312283 >>106315592
>>106311683
>some arabian prince might safe us with a RP trained model.
>A SNK situation basically. How is there no guy with a couple mil to blow?
You are vastly underestimating how niche this hobby is. The type of person who has millions to just waste on a whim is either too busy enjoying normie shit or is one of the few people that actually worked for that money and has to continue to work to maintain that lifestyle. Either one likely doesn't even know this hobby exists unless their income is directly tied to custom AI models (and even then it's probably the publicist or middle manager that pretends to do shit, not the one that actually does the hard work).
Anonymous No.106312180 >>106312206
>GemMOGGED
Anonymous No.106312206
>>106312180
Llama Scout mogs both
Anonymous No.106312219
>>106311903
unironically might be a good benchmark because most models (all of them) are gonna fail horribly
Anonymous No.106312226 >>106312272 >>106312669
>>106312058
When is that from?
Anonymous No.106312243
>>106311747
the most hilarious thing is how they said their new dataset improves on multilingual but in my personal translation tests I still see models like Qwen 4B and Gemma 3 4B destroying its asshole
NVIDIA can't make good models it literally runs against their pajeet nature
Anonymous No.106312266
>>106311903
I'm sorry, but I can't assist with that request.
Anonymous No.106312272 >>106312873 >>106312888
>>106312226
deepseek has their own group chat/news feed thingy on some chinese app thingy
previous announcements were done there ahead of time as well, so probably legit
Anonymous No.106312280
>>106312058
Is it just a context length change?
Anonymous No.106312283 >>106312326 >>106312338 >>106312339 >>106316776
>>106312176
Well we had a guy who was willing to blow 1 mil in this general.
There's gotta be somebody. But I guess you are right.
Doesn't help that it's mostly femoids and zoomers who enjoy llm RP.
Anonymous No.106312326 >>106312342
>>106312283
what does it cost if they aren't trying to innovate, just using current tech on a custom dataset? what is salary for a team of people who can make it happen? how much gpu time at what cost?
Anonymous No.106312333 >>106312470 >>106312506
>>106311903
GLM-4.5-Air-Q3_K_S

There's some hallucination, and it mixes up the developers of some games. But otherwise most of the information is correct. Okay for a model that you can run on a regular PC.
Anonymous No.106312338 >>106312344
>>106311693
I wouldn't call it small given how big it actually is; the pre-contaminated scrapes from https://huggingface.co/datasets/lemonilia/Roleplay-Forums_2023-04 are 20GB give or take. Of course that's raw and unclean, but even if you only keep half after filtering, 10GB is not a bad size. What really matters is what you do with the data to build training sets, and there's more than enough here for things like DPO training. However, a lot of it is predicated on cleaning it, and I can't see the demographic most interested in making it happen, like >>106312283 describes, willing to do the hard work to make these RPs usable. And we already know how shit employing third worlders for this task is, as ScaleAI has demonstrated. If we're still stuck with the LLM paradigm, a lot of the next frontier is going to be predicated on data quality, and we're missing a lot of that; most datasets are synthetic, and outside of math and science with one provable answer, not many synthetic ones are actually good enough to use.
Anonymous No.106312339
>>106312283
You don't need to just pay for compute, but also to pay for qualified people who can do a proper training run on a large model. Unless all you're asking for is a LoRA, there isn't a good open source codebase for training large MoEs.
Anonymous No.106312342 >>106312382
>>106312326
Nvidia just gave a perfect pre-train data for free
Anonymous No.106312344 >>106312367 >>106312409
>>106312338
>20GB
>big
People train models on 10~20TB corpora these days
Anonymous No.106312346 >>106312365 >>106312374 >>106312402 >>106312412
https://youtu.be/1H3xQaf7BFI
you are funding this
apologize
Anonymous No.106312365
>>106312346
>You cannot buy the free market product!!!
Anonymous No.106312367
>>106312344
Corpora altogether, no. But it's been known for quite a while that you can NOT train a model only on RP to make it work well at RP, you need to feed it alongside everything else.
Look at the sizes of the datasets used by AllenAI for their models.
https://huggingface.co/collections/allenai/tulu-3-datasets-673b8df14442393f7213f372
Hundreds of megabytes to a gigabyte at its largest per dataset. Total, of course it's larger than 10GB but it's more than enough to seed a reasonably large model with more than enough RP.
Anonymous No.106312374
>>106312346
Gladly!
Anonymous No.106312382 >>106312392
>>106312342
>perfect pre-train data
LOL
Anonymous No.106312392 >>106312397 >>106312407
>>106312382
Explain your qualification and what is wrong according to you?
Anonymous No.106312397
>>106312392
appeal to authority
Anonymous No.106312402 >>106312418
>>106312346
>illegal gpu black market
????
They are gonna make anything bigger than 24gb vram illegal soon won't they.
Anonymous No.106312403
>>106312058
>i hope its not gonna be even more gemini sloped.
I have some bad news for you. I've tried it over the website, and it is even more gemini than before.
It even has the same quirk where it doesn't put period on the last sentence of the reply.
Anonymous No.106312407
>>106312392
>qwen 30ba3b rewritten slop
>"explain your qualification"
how about you explain your lack of intelligence pajeet
Anonymous No.106312409
>>106312344
you couldn't get very far with it, but I think they really did hit a wall; throwing more tokens and more layers at the problem seems to be delivering diminishing returns. I think the moat is more of a shallow ditch. Throw a few trillion tokens at a 100b dense or a 500b-32a and you'd get close to good enough.
Anonymous No.106312412 >>106312437
>>106312346
i guess its gay to have an agenda, but i wonder what he was hoping to achieve with this other than to show that the restrictions dont really work
Anonymous No.106312418
>>106312402
You see, you can't just buy the compute hardware manufactured in your own country, it's not meant for you because you're an enemy.
Anonymous No.106312437
>>106312412
He's not really showing anything except for the nitty gritty about how sanctions work. Sanctions aren't going to 100% stop a country from getting the thing you don't want. It's to make it 100x harder to actually procure those things at scale and in a reasonable way at a good price. Sanctions fuck all of that up and slow down progress which is exactly what is happening. It's meant to buy time to actually get an alternative strategy or mitigations for when the sanctions don't work anymore since most of them usually are on borrowed time.
Anonymous No.106312470
>>106312333
what about kamige nukige??
Anonymous No.106312479 >>106312528 >>106312551 >>106312562 >>106312574 >>106312774 >>106317174
What are the recommended sampler settings for deepseek v3?

It fails to write a simple scene: it does one paragraph okay, then goes to shit. Or is it just a problem with the prompt?:
deepseek v3: https://rentry.co/yvmykzox
vs qwen3 4b: https://rentry.co/zxr6nchf
Anonymous No.106312506
>>106312333
i wouldnt really call most of them nukige, funnily enough it gets it right at the very end when explaining the terms
Anonymous No.106312528 >>106312544 >>106312593 >>106315931
>>106312479
>552.56 seconds
anon...
Anonymous No.106312544
>>106312528
? Less than 10 minutes is pretty good, coming from gpt-j shinen 6b.
Anonymous No.106312551 >>106312559
>>106312479
>10 minutes tg
>output worse than a 70b llama
the absolute state of moesissies
Anonymous No.106312559
>>106312551
>output worse than a 70b llama
Even a 4b assistant-slopped model is better, holy shit.
Anonymous No.106312562 >>106312575
>>106312479
What settings are you using?
Anonymous No.106312574 >>106312578 >>106312797
>>106312479
maybe it was just bad luck did you try another swipe?
Anonymous No.106312575 >>106312603
>>106312562
Temp 0.6,
Top_p 0.95
Top_k 20
Anonymous No.106312578
>>106312574
Okay, I'll try again. Will report back in 10 days when it finishes.
Anonymous No.106312593 >>106312613 >>106312653
>>106312528
Loading is so slow ;-;
Anonymous No.106312603 >>106312643 >>106312795
>>106312575
set temp to 1, top_p to 1 and top_k to 0
you'll thank me later
actually good models not only do not need sampler snake oilery but they get harmed by it
on their official API they don't even allow you to set anything other than temperature because they expect the average llm user to be too retarded to comprehend this
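If you're hitting a local OpenAI-compatible server (llama-server and friends), neutral samplers are just request fields. Rough python sketch, untested, endpoint assumed to be on :8080; note top_k is a llama.cpp extension, not part of the OpenAI spec:

import requests

r = requests.post(
    "http://127.0.0.1:8080/v1/chat/completions",  # assumed local llama-server
    json={
        "messages": [{"role": "user", "content": "Continue the story."}],
        "temperature": 1.0,  # neutral: sample from the model's own distribution
        "top_p": 1.0,        # 1.0 = disabled
        "top_k": 0,          # 0 = disabled (llama.cpp extension)
    },
)
print(r.json()["choices"][0]["message"]["content"])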
Anonymous No.106312613 >>106312627
>>106312593
once I got an m.2 for the models things began looking much brighter.
Anonymous No.106312627 >>106312654
>>106312613
It's on a nm790, which shouldn't be too low-end right? I liked the advertised endurance on the nm790.
Anonymous No.106312643
>>106312603
Alright. I'll try that.
I was using 0.6 and 0.95 because I was using glm 4.5 air before, and that's what z.ai api docs told me. Top_k at 0 failed to produce any tokens, but top_k at 20 worked, so I left it like that.
Anonymous No.106312653 >>106312676
>>106312593
I think good code should allocate all needed RAM immediately instead of gradually increasing it like shown on the graph, which can harm the loading speed somewhat.
Anonymous No.106312654
>>106312627
>nm790
You use --no-mmap, right? So it should take like two minutes. I used to load deepseek from 5400 rpm hdd and that was painful.
Anonymous No.106312658 >>106312667 >>106312669 >>106312678
What the fuck is this
Anonymous No.106312667
>>106312658
Looks like an announcement for a 5 month old model.
Anonymous No.106312669
>>106312658
>>106312226
Anonymous No.106312676 >>106312691
>>106312653
it probably is but the task manager is only showing pages that have been loaded.
Anonymous No.106312678 >>106312709 >>106312728 >>106312761
>>106312658
>[white genocide] Deepseek is now only for profit and will not release V3.1, which has a whopping 128k context, it's literally agi, APP, glory to the CCP, API will be closed shortly
Anonymous No.106312691 >>106312718 >>106312745
>>106312676
Have you tried mlock?
Anonymous No.106312709 >>106312904
>>106312678
>which has a whopping 128k context
so does current DS if ran locally, they just used to limit to 64k on their frontend to save on costs
Anonymous No.106312718 >>106312770
>>106312691
I have not. Does it make much of a difference?
Anonymous No.106312728
>>106312678
>[white genocide]
Dipsy would never say that, she loves white men.
Anonymous No.106312745
>>106312691
I was just pointing out that programs can allocate as much virtual memory as they want, oftentimes in excess of physical ram. the os is probably only reporting the physical pages that have been consumed as the file is read in.
Anonymous No.106312757
>>106311445 (OP)
>>106311447
omg it teto
Anonymous No.106312761
>>106312678
hello anon, i think you dropped this
>hands you a sense of humor
Anonymous No.106312770
>>106312718
Usually --no-mmap should suffice, but it's worth trying.
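For reference, both are plain llama.cpp launch flags (model path is a placeholder):

llama-server -m model.gguf --no-mmap   # read the whole file into RAM up front instead of mmapping it
llama-server -m model.gguf --mlock     # keep mmap but pin the pages so the OS can't evict them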
Anonymous No.106312774 >>106312797 >>106312867
>>106312479
R1, two rolls. Prefill ends after "Archive Warnings:"

https://rentry.org/2i6kog8t
https://rentry.org/62s4ay6i
Anonymous No.106312795
>>106312603
they also forcibly convert requests with temp=1 to a much lower temperature, so if we're going by the API... definitely do not use temp 1
Anonymous No.106312797 >>106312812
>>106312574
Using the same settings as before:
https://rentry.co/y6xn8uoz

It just does one paragraph okay, then goes stupid for a while before completely giving up. Temps at 1, top_p 1, and top_k 0 is essentially the same.

Is deepseek just not trained on writing stories? How the hell is a 671b shitting itself so bad? It's q4, so it shouldn't be too lobotomized.

>>106312774
>Rating: Explicit
>Archive Warnings: Graphic Depictions of Violence
>Categories: M/F
>Characters: Celestia | User | Villainess
>Relationships: Celestia/User
>Additional Tags: Non-Consent | Magical Girl | Virginity Loss | Death | Impregnation | Dark | Depraved | Gruesome | Erotic Horror | Forced Sex | Magical Girl Defeated | Broken Mind | Body Horror | Guro
>Stats: Published: 2023-10-05 Words: 4000

Should I try that with V3? I'm downloading R1 right now, but while I'm fine with 10 minutes for a response, I have shivers down my spine thinking about how long it would take with thinking in the picture.
Anonymous No.106312799 >>106316398
What do people use as a frontend to translate VNs?
Anonymous No.106312812 >>106312842 >>106312860 >>106313008
>>106312797
The trick is that there's no thinking. I just pasted your prompt in mikupad, added the beginning of ao3 tags and let it continue. "Graphic Depictions of Violence" and everything after that was written by the model.
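In other words the raw context ends mid-header, so the last line of the prompt is literally

Archive Warnings:

and everything from there on is the model's own continuation.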
Anonymous No.106312842 >>106312859
>>106312812
Alright, I'll try that. With or without the colon at the end of 'Archive Warnings'?
Anonymous No.106312859
>>106312842
The colon doesn't matter but my autistic DRY settings might.
Try them with V3 too and see if it helps.
Anonymous No.106312860 >>106312867
>>106312812
Is it just a v3 problem?

>Rating: Explicit
>Archive Warnings: Major Character Death, Graphic Depictions of Violence, Non-Consent, Forced Sex, Dubious Consent
>Disclaimer: This work contains extreme and dark themes. Reader discretion is strongly advised.

Is all it writes before going right into it. Or maybe q4 actually is too low a quant...
Anonymous No.106312867 >>106312906
>>106312860
>maybe q4 actually is too low a quant
This >>106312774 was IQ1_S
Anonymous No.106312873 >>106312904
>>106312272
Just saw it posted elsewhere too. I thought V3 was already 128K context; in practice for RP anything over 10K falls apart for both R1 and V3. I'll have to experiment with nuV3 on a longer RP to see how it holds up past 10K.
Anonymous No.106312888 >>106312909
>>106312272
The "R1" text on the Think button is gone now, so something on the website was definitely updated
Anonymous No.106312889 >>106313009
is qwen235 good at q3? also thinking or the plain instruct one
Anonymous No.106312904 >>106313162
>>106312873
>I thought V3 was already 128K context
It is, but not on their webui/API, see >>106312709
Anonymous No.106312906 >>106312915 >>106312929
>>106312867
Holy hell. Do you use it for assistant tasks/coding too? On the smaller models, I feel like q2 is basically unusable, even for stories.
Anonymous No.106312909
>>106312888
>The "R1" text on the Think button is gone now
oh no no no, hybrid reasoner meme incoming
Anonymous No.106312915
>>106312906
For coding there's qwen.
Anonymous No.106312921
>>106311445 (OP)
Teto sexo
Anonymous No.106312929
>>106312906
the bigger the model, the less damage low q does
Anonymous No.106312994 >>106313011 >>106313021 >>106313026
>>106312058
>3.1
>not 4
>not even 3.5
>6 months later
deepseek is dead in the water
Anonymous No.106313008
>>106312812
https://rentry.org/e636f78n

I went back down to a temp of 0.6... Seems like it's a v3 issue.
Anonymous No.106313009
>>106312889
I use it at q2 and it's still pretty good, better than anything else I can run imo
my default rec would be the instruct model, but the thinker is good if you like messing around with that sort of thing and don't mind the extra wait
Anonymous No.106313011
>>106312994
It's a whale, it can swim.
Anonymous No.106313021 >>106313035 >>106313044 >>106313141 >>106313691
>>106312994
Anonymous No.106313026
>>106312994
Ai progress in general
Anonymous No.106313035
>>106313021
Beautiful.
Anonymous No.106313044
>>106313021
Holy hell, those are some insane gains.
Anonymous No.106313102 >>106313107 >>106313116 >>106313130 >>106313215
https://huggingface.co/deepseek-ai/DeepSeek-R2
Anonymous No.106313107
>>106313102
HAPPENING! HUGE!! BIGLY, EVEN!!
Anonymous No.106313116
>>106313102
>Trained at Q1 using the chink GPUs
HOLY SHIT
Anonymous No.106313130
>>106313102
Anonymous No.106313141 >>106313157 >>106313180
>>106313021
If this was Sam Altman's screw up then it is scary to think about what Sam Altman's success looks like.
Anonymous No.106313157 >>106313180
>>106313141
There is no way he said any of this
Anonymous No.106313162 >>106313178
>>106312904
> no model update published for V3 on HF
So v3.1 is really just a change in settings for the web and API version through DS. No actual model update.
Bummer.
Anonymous No.106313178 >>106313196
>>106313162
Are you really a dispy fan? This is how they always do their thing, first announce on their chat, then update their online inference and only after that put weights on HF.
Anonymous No.106313180
>>106313157
>>106313141
source is from some verge article / interview.
Anonymous No.106313184 >>106313223
Finna grab some crashout gang hood niggas and boost the nearest datacenter
finna fly to HK with the loot in the checked bags
finna link with deepseek gangdem and create Z0GD35TR0Y3R-B800A70B
trump finna be mad as hell
Anonymous No.106313196 >>106313484
>>106313178
True. If so, it'll upload later today. Time will tell.
Anonymous No.106313215
>>106313102
>The deployment of Huawei Ascend AI processors enabled a cost-effective training paradigm for DeepSeek-R2, reducing total training expenses to approximately $90,000 USD—representing a substantial reduction relative to equivalent NVIDIA-based infrastructure while maintaining competitive performance metrics across benchmark evaluations.
Wtf
Anonymous No.106313223 >>106313245
>>106313184
>in 2025, this is the equivalent of posting "I'm gonna crash one more plane in the towers" on the 11th september 2001.
Anonymous No.106313245
>>106313223
So many SOTA chink models have been uploaded to HF in the last few months, there isn't anything left of the towers to hit anymore.
Anonymous No.106313246 >>106313267 >>106313314 >>106313340 >>106313369 >>106313389 >>106313471 >>106313527 >>106313627
New DS on the site likes to start answering questions with "Of course" is that known to be a quirk of some other model? Might point to distillation if it is.
Anonymous No.106313261
DeepSeek-R1-0528 is here!
Anonymous No.106313267
>>106313246
Who else is left to distill from anyway? I think someone said Claude still gives thinking output through API. Grok?
Anonymous No.106313314
>>106313246
Second round of questioning gets:
>"Excellent question" "Excellent and insightful question."
Might indicate at least some multi turn effort.
Anonymous No.106313340
>>106313246
I noticed this too kek. I asked it "should we abolish the death penalty" (just a vague question that I figured it would have to consider since I wanted a look at its CoT) and it started its reply with "Of course."
I was like "damn they made this thing super opinionated" before I realized it was just acknowledging the question or whatever
Anonymous No.106313369
>>106313246
based deepsneeds jailbreaking their own model
Anonymous No.106313378 >>106313399 >>106313639
Is Whisper the best for voice > text still?

I'm using faster-whisper-xxl because I can just drag and drop, but it fucks up a bunch, whereas when I just drop opus files into Gemini it nails it; however it eats a lot of tokens despite being free. This is for multi-hour audio files.
Anonymous No.106313389
>>106313246
>"This gets to the very heart of what..." "That gets to the strategic heart of why this..."
it certainly has a new "tone" so to say.
Anonymous No.106313399
>>106313378
faster-whisper turbo is the best STT from my tests. You might need to play with the vad settings a bit though
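If you want to skip the drag-and-drop wrapper, the library itself is a few lines of python. Model name and VAD numbers below are just starting points, tune them for your audio:

from faster_whisper import WhisperModel

model = WhisperModel("large-v3-turbo", device="cuda", compute_type="float16")
segments, info = model.transcribe(
    "input.opus",
    vad_filter=True,                                  # built-in Silero VAD
    vad_parameters={"min_silence_duration_ms": 500},  # the knob worth tuning first
)
for seg in segments:
    print(f"[{seg.start:.1f}s -> {seg.end:.1f}s] {seg.text}")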
Anonymous No.106313469
Just how lobotomized is glm air at q2_k_xl? Is there any good alternative in this range?
Anonymous No.106313471
>>106313246
>is that known to be a quirk of some other model?
...Llama 2
Anonymous No.106313475 >>106313484 >>106313501 >>106313507 >>106313511 >>106313517 >>106313525 >>106313527 >>106313528 >>106313572 >>106313581 >>106313592 >>106313597 >>106313631 >>106313742 >>106313887 >>106314115 >>106314547
https://huggingface.co/deepseek-ai/DeepSeek-V3.1-Base
it's up
Anonymous No.106313484
>>106313196

Let's go!
>>106313475
Anonymous No.106313501
>>106313475
HOLY FUCKING KINO
Anonymous No.106313507 >>106314709
>>106313475
If they uploaded a full fledged base (not just another instruct / thinking tune), this was probably their attempt at V4 and R2 that didn't pan out
Anonymous No.106313511
>>106313475
local is saved
Anonymous No.106313517 >>106313538 >>106313653
>>106313475
>hybrid reasoner
>shits on gpt5 + gemini 2.5
>creative writing focus
>beats pokemon in 12k steps
we are so back
Anonymous No.106313525
>>106313475
Let's goooo
Mistral small 3.3 soon
Anonymous No.106313527
>>106313246
the concise, pointed thinking, increased final answer verbosity, and use of tables unironically remind me of oss (though this is obviously more human and less alignment-obsessed)
>>106313475
new base is very interesting
Anonymous No.106313528 >>106313536
>>106313475
Not falling for it again.
Anonymous No.106313536
>>106313528
K bud.
Anonymous No.106313538 >>106313549
>>106313517
>-Base
>>hybrid reasoner
Come on now.
Anonymous No.106313549
>>106313538
We know it's a hybrid from the site dropping the "R1" from the think button and both of them having a much closer manner of writing than R1 and DS3 had before.
Anonymous No.106313572
>>106313475
no config change vs v3
Anonymous No.106313581 >>106313596 >>106313633 >>106313645 >>106313649 >>106313650 >>106313680
>>106313475
>700 B params
Okay how many of you fags can actually run this?
Anonymous No.106313592
>>106313475
This needs to be a big step forward considering GLM 4.5 is a sidegrade in smartness at almost half the size.
Anonymous No.106313596
>>106313581
all of us
Anonymous No.106313597
>>106313475
Fake
Anonymous No.106313618 >>106313625 >>106313637 >>106313640 >>106313642 >>106313797 >>106314709
Wait. if there is a 3.1 BASE it means it's not v3 with more training steps. So did they mess up V4 and called it V3.1 the same way oai did 4.5?
Anonymous No.106313625 >>106314709
>>106313618
Likely.
Anonymous No.106313627
>>106313246
>is that known to be a quirk of some other model?
Gemini loves to praise you wherever you make a suggestion.
Anonymous No.106313631
>>106313475
>3.1
>not 4
nothingburger confirmed?
Anonymous No.106313633
>>106313581
If you can run V3/R1 you can run this.
Anonymous No.106313637 >>106313659
>>106313618
They paid the hybrid reasoner tax like Qwen did, look forward to them going back to schedule after this.
Anonymous No.106313639
>>106313378
parakeet / canary / voxtral (allegedly)
Anonymous No.106313640
>>106313618
AI progress is hitting a hard wall, like a woman in her 30s.
Anonymous No.106313642 >>106313681
>>106313618
I would be more inclined to believe this if there were any arch differences between 3 and 3.1, DS is too ambitious of a company to make an attempt at V4 with the exact same architecture as V3
Anonymous No.106313645
>>106313581
i probably can, after it's quantized to 0.4bit
Anonymous No.106313649
>>106313581
I'm poor so I can barely run glm air at q3...
Anonymous No.106313650
>>106313581
I can run it after unsloth makes it retarded.
Anonymous No.106313653
>>106313517
>>hybrid reasoner
they fell for it
Anonymous No.106313656
The fact that they went for "3.1" is worrying. They did a 2.5 in the past, which combined their up-to-that-point separate coder + chat models shortly before they released 3. However, 3.1 implies that there might be more to come for generation 3.
Anonymous No.106313659 >>106313676
>>106313637
Hybrid reasoning wouldn't cause the base model to underperform.
Anonymous No.106313664 >>106314014
ollama run deepseek-v3.1
Anonymous No.106313676
>>106313659
Maybe they intended it as v4 but decided to name it v3.1 once the reasoning tune went bad, who knows but them. Could explain the time it took though.
Anonymous No.106313680
>>106313581
How many active penises are there?
Anonymous No.106313681 >>106313740 >>106314709
>>106313642
If NSA or whatever new thing they're testing isn't working out, there's no point releasing a broken model. It's more likely their attempts at the actual V4 isn't going well, so they're just going incremental improvements on the old models to maintain relevance.
Anonymous No.106313691 >>106313703 >>106313715
>people speculating over naming
Why are you guys like this? Naming means jackshit >>106313021
Anonymous No.106313700
Is it gooder or just benchmaxxed?
Anonymous No.106313703 >>106313754
>>106313691
It does though, it signals expectations to people.
Anonymous No.106313715 >>106313754
>>106313691
The name isn't the point retard. It's the lack of a new model with a new architecture. The .1 is only concerning because it makes it sound like they're planning at least a 3.5 later on so V4 with some new breakthrough architecture might not come until next year.
Anonymous No.106313720 >>106313731 >>106313736 >>106313774
>NOOO muh multimeme!!
Anonymous No.106313731
>>106313720
go back
Anonymous No.106313736 >>106313802
>>106313720
Well I do wish we could get something more than image in.
Unfortunately all those audio in/out models have severe degradation. Nobody has really figured it out yet.
If we could get a model that has native image in and out it would be a game changer for RP.
Anonymous No.106313740 >>106313758 >>106313782
>>106313681
It's possible that their next generation isn't turning out well but I think it would be a stretch to take this as any sort of evidence for or against; remember, they updated 2.5 in the same month as V3's release
Anonymous No.106313742
>>106313475
Anonymous No.106313754 >>106313877
>>106313703
>>106313715

Previous version was v3 0324. Now it's v3.1
.1>.0324
This is huge. It's an improvement of over 300%
Anonymous No.106313758 >>106313782
>>106313740
>remember, they updated 2.5 in the same month as V3's release
true enough.
Anonymous No.106313774
>>106313720
They haven't even figured out text, let alone put more modalities in there
Anonymous No.106313782
>>106313758
>>106313740
wasn't 2.5 combined too?
made me look it up.
>DeepSeek-V2.5 is an upgraded version that combines DeepSeek-V2-Chat and DeepSeek-Coder-V2-Instruct.
sounds similar to the hybrid reasoning.
Anonymous No.106313797 >>106313821 >>106313827
>>106313618
>if there is a 3.1 BASE it means it's not v3 with more training steps
why not?
Anonymous No.106313800 >>106313824
Nemotron 2 gguf support?
Anonymous No.106313802 >>106313808 >>106313846
>>106313736
>Unfortunately all those audio in/out models have severe degradation. Nobody has really figured it out yet.
Maybe achieving generalization across multiple modalities just requires a shitton more data and longer training. It worked for text, eventually. It's probably just exponentially harder to do both at the same time.
Anonymous No.106313808
>>106313802
yeah maybe 50 more trillions will do sam, we know
Anonymous No.106313821
>>106313797
Because they didn't release new base version for 0324.
Anonymous No.106313824 >>106313838
>>106313800
>Hybrid Mamba
about 2 more weeks after never
Anonymous No.106313827
>>106313797
At the very least it's not derived from any v3 instruct
Anonymous No.106313838
>>106313824
I swear I saw something about llama.cpp and hybrid mamba support over a month ago.
Anonymous No.106313846
>>106313802
At least the direction is looking good.
The first qwen audio in/out model was pyg level stupid. kek
Their recent one wasn't THAT bad. But still more tarded and the voice was horrible.
I saw a video of some web llm game though and immediately recognized the voice.
Some guy must have used it for his game. So it was good enough for that.
Anonymous No.106313877
>>106313754
ha
Anonymous No.106313887 >>106313896
>>106313475
where model card
Anonymous No.106313896 >>106313923
>>106313887
It's 11PM in China, check back tomorrow
Anonymous No.106313914 >>106313935 >>106313938
>drops weights
>refuses to elaborate
>leaves
kinda based
Anonymous No.106313923 >>106313946 >>106313983
>>106313896
but why would they upload it without a model card at 11 PM?
Anonymous No.106313935
>>106313914
>kinda based
They literally explicitly stated, without ambiguity, that is a base model.
Anonymous No.106313938 >>106313952 >>106313994
>>106313914
When are you going to drop some weight, fatso?
Anonymous No.106313946
>>106313923
set it to upload and clocked out for the day (xiaban), will do the card tomorrow
Anonymous No.106313952
>>106313938
I can't. It's a medical condition.
Anonymous No.106313983
>>106313923
its a 700B model, prob takes a while to upload. Click upload and go to bed lol
Anonymous No.106313994
>>106313938
rude, I'm already losing weight and I'm not even fat
Anonymous No.106314007 >>106314033 >>106314044 >>106314050
>Also, considering the user's fetish, incorporating elements that facilitate easy access for other men, like magnetic fastenings or tear-away sections, would make the outfit more aligned with the netorase scenarios. Ensuring that every modification not only makes her look sluttier but also serves the erotic roleplay context is key.
t-thanks v3.1 reasoner
Anonymous No.106314014 >>106314040
>>106313664
what butchered model does this pull
Anonymous No.106314033
>>106314007
hmmm I see, quite insightful
Anonymous No.106314040
>>106314014
DeepSeek-v3.1-Distill-EXAONE4.0-1.2B
Anonymous No.106314044 >>106314059 >>106314063
>>106314007
no way in hell did you download it that fast so where is it?
Anonymous No.106314050
>>106314007
That's why small details matter.
Anonymous No.106314059
>>106314044
He obviously used their website.
Anonymous No.106314060
https://github.com/deepseek-ai/DeepSeek-V3.1
https://github.com/deepseek-ai/DeepSeek-V3.1
https://github.com/deepseek-ai/DeepSeek-V3.1
MODEL CARD HERE
Anonymous No.106314063 >>106314065 >>106314069 >>106314097 >>106314127
>>106314044
>local
This implies there must be a remote model general. I wonder what they do over there?
Anonymous No.106314065
>>106314063
It's called /aicg/
Anonymous No.106314069
>>106314063
They beg for proxies.
Anonymous No.106314097
>>106314063
use the good models
Anonymous No.106314106 >>106314120 >>106314124 >>106314154
https://deepseek.ai/blog/deepseek-v31
Anonymous No.106314115 >>106314142 >>106314145
>>106313475
w00t
Anonymous No.106314120 >>106314143
>>106314106
stop posting that fucking phishing website
Anonymous No.106314124
>>106314106
>March 25, 2025
Anonymous No.106314127
>>106314063
>remote model general
they're so stupid they can barely set up st. also their hands are brown
Anonymous No.106314142
>>106314115
KINO
Anonymous No.106314143
>>106314120
Too late, I have already boughted it.
Anonymous No.106314145
>>106314115
nano banana is crazy good
open source will never catch up
Anonymous No.106314154
>>106314106
kys
Anonymous No.106314188 >>106314227 >>106314249 >>106314478
Quick question, does resizable bar matter for inference speed?
Anonymous No.106314224 >>106314243
This is so crazy. After the better part of a year we're finally getting an upgrade to THE model that started the current age of local LLMs.
Without it, we'd still be stuck with the incremental upgrades over llama 3.3 or mistral large the other losers would've released with no shame.
3.1 is going to be crazy
Anonymous No.106314227 >>106314256
>>106314188
probably not by much, but do benchmark it yourself nonetheless
Anonymous No.106314243
>>106314224
This, but unironically.
llama.cpp CUDA dev !!yhbFjk57TDr No.106314249 >>106314256
>>106314188
Should not make a difference.
Anonymous No.106314256
>>106314227
The thing is, I can enable resizable bar in the bios but in windows it says it's disabled. So I'm wondering if I'm losing out on performance.

>>106314249
Ty
Anonymous No.106314263 >>106314270
Are we hoping deepseek 3.1 will beat claude for rp and writing?
Anonymous No.106314270
>>106314263
I'm hoping to beat my dick
Anonymous No.106314271 >>106314313 >>106314335
>DeepSeek passed my test and used EGFs rather than more conventional combinatorial counting approaches
Fuck, that's impressive
Anonymous No.106314313 >>106314325 >>106314331
>>106314271
Used Epidermal Growth Factors?
Anonymous No.106314325
>>106314313
I think he means Electronic GirlFriends
Anonymous No.106314329
DeepSeek V3.1 adds four new special tokens compared to V3-Base:
<|searchbegin|> (id: 128796)
<|searchend|> (id: 128797)
<think> (id: 128798)
</think> (id: 128799)
Anonymous No.106314331 >>106314340
>>106314313
An **EGF**, or Exponential Generating Function, is a mathematical tool used in combinatorics to count arrangements of labeled objects, where the identity or order of the elements is important. Unlike simpler counting methods, an EGF represents a sequence of numbers as a power series where the coefficient of each term is divided by a factorial (e.g., `a_n * x^n / n!`). This structure is specifically designed to elegantly handle the complexities of combining labeled structures, automatically managing the "shuffling" of elements that makes other approaches cumbersome. The 4chan user's comment that DeepSeek used an EGF is high praise because it indicates the AI solved a complex problem not with tedious, conventional case-by-case analysis, but by demonstrating a sophisticated conceptual understanding of a high-level mathematical technique, effectively "thinking" like an advanced mathematician.
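For the formula-minded, the definition and the product identity being praised, in LaTeX:

\hat{A}(x) = \sum_{n \ge 0} a_n \frac{x^n}{n!},
\qquad
\hat{A}(x)\,\hat{B}(x) = \sum_{n \ge 0} \left( \sum_{k=0}^{n} \binom{n}{k} a_k b_{n-k} \right) \frac{x^n}{n!}

The binomial coefficient in the product is exactly the "shuffling" of labels described above: multiplying two EGFs automatically counts every way to split n labeled elements between the two structures.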
Anonymous No.106314335
>>106314271
>bro uses ds for combinatorics
it's deepSEX not deepmaths.
Anonymous No.106314340 >>106314367
>>106314331
tl;dr
Anonymous No.106314366 >>106314375 >>106314379 >>106314388 >>106314393 >>106314396 >>106314430
http://195.26.232.142:46565/v1
Anonymous No.106314367 >>106314473
>>106314340
Anonymous No.106314375
>>106314366
DON'T CLICK IT MAKES MUSTARD GAS
Anonymous No.106314379 >>106314390
>>106314366
i love clicking random links
Anonymous No.106314388
>>106314366
Not using your logged API
Anonymous No.106314390 >>106314407
>>106314379
retard
it's an openai compatible endpoint
connect your sillytavern to it or something
Anonymous No.106314393
>>106314366
That's dolphin porn.
Anonymous No.106314396 >>106314409
>>106314366
shit shit shit I just sent a prompt with full name SSN and credit card number
Anonymous No.106314407 >>106314425
>>106314390
i know but im not connecting my sillytavern to a frontend that will use a zero day exploit to open my homework folder full of my cock pictures
Anonymous No.106314408
So who's jason and why am I supposed to use his files?
Anonymous No.106314409
>>106314396
It's okay anon, I'll purge the logs, just for you
Anonymous No.106314425 >>106314442
>>106314407
run sillytavern on your android phone through termux
triple sandboxing
Anonymous No.106314430
>>106314366
That was kind of you. Someone should at least do a nala test or something.
Anonymous No.106314438
https://huggingface.co/deepseek-ai/DeepSeek-V3.1-Base/discussions/2#68a495c3000a4675fe8ba559
ubergarming in process...
Anonymous No.106314442 >>106314464
>>106314425
but i have my cock pictures on my phone too
and my phone is rooted and on a vulnerable android version (i am not upgrading because i want the vulnerable version)
Anonymous No.106314460 >>106314475
vramlet bros...
Anonymous No.106314464 >>106314484
>>106314442
i'll run it for you
>inb4 i have my cock pictures on your computer
Anonymous No.106314473
>>106314367
ty
Anonymous No.106314475
>>106314460
... are eating good.

ramlet sissies, on the other hand...
Anonymous No.106314478 >>106314514
>>106314188
30% faster with a hacked p2p driver https://github.com/tinygrad/open-gpu-kernel-modules in vllm with tp
But if your mobo doesn't support rebar you can hack it into supporting as well, tested on epycd8-2t
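(tp meaning vllm's tensor parallelism, i.e. something like vllm serve your-model --tensor-parallel-size 4; the p2p reads only matter when GPUs exchange activations every layer like that)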
Anonymous No.106314484
>>106314464
please do :3
Anonymous No.106314502 >>106314520 >>106314530 >>106314556
Anonymous No.106314514 >>106314583
>>106314478
3090 with monitor attached shows resizable bar enabled. If I instead plug the monitor into my gt 710, 3090 shows resizable bar disabled. mc62-g40
Anonymous No.106314520
>>106314502
yup we're dead
Anonymous No.106314530
>>106314502
2 more hype cycles
Anonymous No.106314547 >>106314669
>>106313475
Anonymous No.106314556
>>106314502
Anonymous No.106314565 >>106314590 >>106314595 >>106314603 >>106314631 >>106314647
Anonymous No.106314566 >>106314576 >>106314577 >>106314589
which one of you fags did this
Anonymous No.106314576
>>106314566
who
Anonymous No.106314577
>>106314566
tasukete
Anonymous No.106314583 >>106314602
>>106314514
You need 4. Rebar and p2p allow direct reads between GPUs; that's where the speedup comes from.
Anonymous No.106314589
>>106314566
Porn is good.
Anonymous No.106314590 >>106314600
>>106314565
Qwen is benchmaxxed though
Anonymous No.106314595
>>106314565
oh no no no
Anonymous No.106314600 >>106314616 >>106314633
>>106314590
and its good so...
Anonymous No.106314602 >>106314679
>>106314583
I have 3, that's why I'm worried about resizable bar
Anonymous No.106314603 >>106314617 >>106314635
>>106314565
WHY'D THEY FALL FOR THE HYBRID THINKING MEME AAAAAAAAAAAAAA
Anonymous No.106314616 >>106314632
>>106314600
It censors me, unlike deapseek
Anonymous No.106314617
>>106314603
I think this is the failed attempt on the chinese chips, real V4 is prob still training
Anonymous No.106314631
>>106314565
total llm stagnation
Anonymous No.106314632 >>106314648
>>106314616
newest qwens are super uncensored so that is a major skill issue
Anonymous No.106314633
>>106314600
Definitely not better than sneed for writing.
Anonymous No.106314635
>>106314603
I think that's why they pumped this out as 3.1. Better to recognize sunk submarines than costa rica
Anonymous No.106314647 >>106314656
>>106314565
deepseek is 2.5x bigger, and yet it somehow came out cheaper overall?
Anonymous No.106314648
>>106314632
235b instruct?
Anonymous No.106314656 >>106314674
>>106314647
Anonymous No.106314669
>>106314547
cute migu
Anonymous No.106314674
>>106314656
thx
Anonymous No.106314679 >>106314800
>>106314602
Don't worry then. Tensor parallelism requires 2^n GPUs; you won't see a difference in any other case.
Anonymous No.106314709 >>106314722 >>106314754 >>106314775 >>106314781
>>106313507
>>106313618
>>106313625

Why do you retards want the "failed V4 training" story to be true so much? They hadn't found a single way to improve the design in almost a year, and just trained another V3-671B with the same config from scratch, "failed" it and branded it as 3.1? Really, this looks plausible to you?

>>106313681
> so they're just going incremental improvements on the old models to maintain relevance

Do they look like they care about maintaining "relevance"? We see Kimi/Qwen/Tencent/Z.AI constantly shilling themselves on X, announcing some minor bullshit features like presentation generator, and DeepSeek just… quietly updates a checkpoint once every 2-3 months, and you think that's what they're doing, trying to feed the dying hype?
Anonymous No.106314722 >>106315207
>>106314709
They pay me to stealth advertise them btw
Anonymous No.106314733
>another failed release by china
thank god we have gpt-oss
Anonymous No.106314740 >>106314786 >>106314796
I'm kinda new to local stuff and it kinda boggles my brain a bit sometimes but I'd like to try and fully make the transition.

If I wanna basically setup a little fox demon in my computer to help me with shit would DeepSeek R1 0528 on the rentry be a good starting point?
Anonymous No.106314754 >>106315040
>>106314709
They said themselves they don't really care about their own Chat/API service and don't need the money. But why else would they release boring incremental improvements instead of something new? You're rationalizing it.
Anonymous No.106314774 >>106314828
redditor ran his benchmark
>Interestingly, the non-reasoning version scored above the reasoning version. Nowhere near the frontier, but a 13% jump compared to DeepSeek-R1-0528’s score.
Anonymous No.106314775
>>106314709
>you think that's what they're doing, trying to feed the dying hype?
yes.
Anonymous No.106314781 >>106314836
>>106314709
A different base model implies something far more extensive than another instruct tune
Anonymous No.106314786 >>106314799 >>106314856
>>106314740
>would DeepSeek R1 0528
It would be almost as good as it gets.
That's the same model they provide in their web interface, their chat-gpt essentially.
Are you building an 8 channel DDR 4 server?
I'd go for the full 1TB of memory since you might as well at that point.
Either that or go DDR5 for the speed.
Anonymous No.106314796
>>106314740
If you have a couple of b200s yeah. Otherwise, I'd look for another model.
Anonymous No.106314799
>>106314786
>That's the same model they provide in their web interface
not anymore it isn't
Anonymous No.106314800 >>106314823 >>106314838 >>106314906
>>106314679
Nta, but, I have a question.
What do you think is better, two 2080ti 11Gb or a single 4070 12Gb?
Is it easy to use multiple GPU, or is it the same shit as to try to pass thru my gpu to podman?
Anonymous No.106314823 >>106314886
>>106314800
It's easy as shit bwe.
But with moe models, maybe a 4070 is better. Less power use too.
Anonymous No.106314828
>>106314774
more context, wigga
https://old.reddit.com/r/LocalLLaMA/comments/1mukl2a/deepseekaideepseekv31base_hugging_face/n9jrtog/
https://github.com/johnbean393/SVGBench/
is this even a good benchmark
Anonymous No.106314836 >>106314859 >>106315061
>>106314781
People forget that they used a finetune of V2.5 to make V3/R1. They're probably tweaking their last gen model to optimize it as much as possible so they have the best teacher model for V4.
Anonymous No.106314838 >>106314886
>>106314800
does 2080ti even have modern CUDA support?
Anonymous No.106314856
>>106314786
1tb of ddr4 costs so much wow
Anonymous No.106314859 >>106314928 >>106315061
>>106314836
>a finetune of V2.5 to make
R1 lite, not V3. They're completely different architectures and sizes.
Anonymous No.106314878 >>106314894
not sure how useful this is as a benchmark but its a big improvement
Anonymous No.106314886 >>106314920
Thank you for your answers kind anons.

>>106314823
Yeah, that's what I thought, as I have plenty of RAM...
But if I want to do some image/video generation too, maybe more VRAM is desirable...

>>106314838
That's a good question.
CUDA, yes, modern CUDA, I don't know.
Anonymous No.106314894 >>106314931 >>106314979
>>106314878
forgot to mention, its how well the model can make SVGs
Anonymous No.106314906 >>106314963 >>106314982
>>106314800
Used 3090. 2080ti sucks because FA requires ampere or newer, and 4070 has shit bandwidth and low on vram
Anonymous No.106314920 >>106314982
>>106314886
You can run quanted image models. And afaik imagegen still doesn't support multi-gpu, so you're not even getting more vram lmao. Not to mention a 2080ti is going to be pretty slow compared to a 4070 for imagegen.
Anonymous No.106314928 >>106314956 >>106315061
>>106314859
R1 Lite was the same size as V2/2.5.
Anonymous No.106314931
>>106314894
I tried like, a year ago with GPT4 and surprisingly, it understood basic shapes to create stuff.
By exchanging a few messages, I could even make it better.
But, yeah, far from what is displayed earlier in this thread.
Anonymous No.106314956
>>106314928
That's what I meant yes, they made R1-lite as a tune of 2.5 to create data they then used to make V3 on a new bigger base.
Anonymous No.106314963 >>106315105
>>106314906
>4070 has shit bandwidth and low on vram
If we stay within the confines of what he's asking, then we're working with 12gb of 500gb/s vs 11gb of 600gb/s memory.
Anonymous No.106314979 >>106314992
>>106314894
Use case?
Anonymous No.106314982 >>106314988 >>106315026 >>106315029 >>106315034 >>106315039
>>106314906
The 3090 is another route to explore indeed. I could get a good price on a 2080ti, but I bet I can get a 3090 for a reasonable price too.

>>106314920
Ah shit, I thought they'd implemented multi-GPU for image generation (I tried last year and then couldn't play with AI for close to a year, hence my stupid questions).
Well, I will try to max out VRAM then...
...
Or ask AI to fork stable diffusion and add multi-GPU support, kek
Anonymous No.106314988 >>106315029
>>106314982
instead of getting a used 2080ti u can get a new 3060 (12gb) lmao
3090 is a way better option than either
Anonymous No.106314992
>>106314979
complicated math / spatial understanding through math?
Anonymous No.106315018
>instantly got a eyes twinkling with mischief
reee!

https://chutes.ai/app/chute/ee7987b4-43c6-57d6-9f0c-f90c762586e2?tab=readme
Anonymous No.106315026 >>106315034
>>106314982
Definitely go for a 3090. It's the gold standard for cheap used AI cards.

I know there's a guy working on speeding up generation with multiple gpus (different images on different cards), but as of 2025-08-20, I don't believe anyone is working on splitting up the generation of a single image between multiple gpus.
Anonymous No.106315029 >>106315038
>>106314982
>>106314988
3090 is never obsolete best GPU ever made
Anonymous No.106315034 >>106315044 >>106315045
>>106315026
>>106314982
nah, not when 5070 super might be coming out with 24GB for $750 soon
Anonymous No.106315038
>>106315029
its worth it for 400-500$
Anonymous No.106315039
>>106314982
I would not get anything older than 30 series, anything else will lose support real fast.
Anonymous No.106315040 >>106315062
>>106314754
They say everything in their papers. Their goal is to make AGI, and make it under their budget constraints. All models they release are artifacts of internal experiments rather than products for the user, they're trying different things.
V3 was an orthodox instruct LLM done as quickly and cheaply as they could. R1-Zero was an attempt at pure RL. R1 was the same thing but forced to be more useful by compromising with RLHF. V3-0324 was V3 + distillation and synthetic data from R1, as a result vastly more useful than basic V3. R1-0528 added more compute and distillation from itself and third parties, and probably some other tricks, they got much better results at the cost of longer CoTs and some slop. This thing is probably an attempt to cut costs to become able to generate much more data, so they eliminate separate models and reduce CoT lengths while minimizing performance drops (or even getting modest gains). When they get the next base, they'll be distilling from this thing to have maximally efficient concise reasoning, a la Anthropic. The overall goal is to always make Pareto improvements or experiment towards becoming able to make them.
Anonymous No.106315044 >>106315063
>>106315034
>$750
Yeah, sure it will buddy.
Anonymous No.106315045
>>106315034
also if they drop the super it would cause 3090 prices to finally fall to like $400
https://www.pcguide.com/news/rtx-50-series-super-leak-says-we-can-expect-more-vram-for-the-same-price-as-the-original-gpus/
Anonymous No.106315061 >>106315117
>>106314836
>>106314859
>>106314928

They used V2.5-1210 to make V3 and R1, and R1-lite-preview to make V2.5-1210. R1-lite was a small dense model, probably a Qwen 2.5-32B finetune, and it has never been released.
Anonymous No.106315062 >>106315165
>>106315040
>efficient
>Anthropic
Anonymous No.106315063
>>106315044
even if it's $1k, the 5000 series' fp4 speed will be amazing since fp4 is getting more widely used, like nunchaku for image / video gen for instance
Anonymous No.106315079 >>106315091 >>106315092 >>106315120
Pass, I'll wait for new Qwen.
Anonymous No.106315080
It's up
https://youtu.be/k15fikcTgMk
Anonymous No.106315091 >>106315107 >>106315208 >>106315236
>>106315079
you're getting as annoying as petra
Anonymous No.106315092 >>106315099
>>106315079
Everything after the first series was boring, ugly, and lacked the charm of the original.
Anonymous No.106315099 >>106315143
>>106315092
You're boring, ugly and lacking charm
Anonymous No.106315105
>>106314963
Two 2080ti. Btw exllamav3 got tp support recently
Anonymous No.106315107
>>106315091
He's posted only 3 Gwens in the span of a week.
Anonymous No.106315117 >>106315165
>>106315061
>They used V2.5-1210 to make V3 and R1
they're not the right sizes, and v2.5-1210 is full bf16 with none of v3's fp8 mixed precision

unless you mean purely data for training, in which case sure I'd agree.
Anonymous No.106315120 >>106315136 >>106315139 >>106315417
>>106315079
ive always avoided qwen all the way from qwen1 it fucking always sucks dick, even 72b finetunes sucked dick man
only thing they were good for was silly sizes because llama3 was 8b 70b
qwen has extreme issues with chinese, its completely shit at roleplay and its cuckery is rivaling gpt oss
Anonymous No.106315136 >>106315150 >>106315166
>>106315120
this may shock you but when companies release new models they are typically different than the old ones
Anonymous No.106315139
>>106315120
>cuckery is rivaling gpt oss
+1
My experience as well.
Anonymous No.106315143 >>106315160
>>106315099
:(
Anonymous No.106315150 >>106315170 >>106315198
>>106315136
qwen3 is extremely repetitive and cucked and benchmaxxed as well, you CAN'T fuck
I'd rather tardwrangle gemma with its shitty sex because at least it can write well
Anonymous No.106315160
>>106315143
Now you know how it feels, bigot. Maybe next time, instead of saying something that isn't nice, say something nice or don't say anything if you don't have anything nice to say about it.
Anonymous No.106315165 >>106315313
>>106315062
And? Anthropic has consistently lower reasoning/final output ratio than others.

>>106315117
of course I mean the data, as far as I know they never use logprob distillation.
Anonymous No.106315166 >>106315185 >>106315194
>>106315136
Indeed they increased censorship with every iteration until maybe the very latest one.
Anonymous No.106315170 >>106315185 >>106315194 >>106315228
>>106315150
>you CAN'T fuck
if you actually used the models you'd know how absurd this opinion is kek
Anonymous No.106315185 >>106315269
>>106315166
>>106315170
Which one? I'm getting a refusal from a very simple prompt. Using the Instruct versions of 30b and 235b
Anonymous No.106315190 >>106315199
dense models with full attention + gqa > your shitty moes that degrade past 8k tokens
Anonymous No.106315192 >>106315738
how do vscode extensions like roo code and kilo code actually work, since they function with non-reasoning models as well? like do they split the task into a to-do list and automatically prompt the next step (or debug) after the previous one completes?
Anonymous No.106315194
>>106315166
>>106315170
I'll await proof.
Anonymous No.106315198
>>106315150
qwen3 has this feature where no matter what sampling parameters you pick or how you try to steer the story it always writes conceptually the same thing. Qwen-image apparently has the same issue
Anonymous No.106315199 >>106315229
>>106315190
show me these dense models
Anonymous No.106315207
>>106314722
> Hey kid, want some free inference?
Anonymous No.106315208
>>106315091
bold of you to assume it isn't the same person
Anonymous No.106315224 >>106315231 >>106315261
uh oh... new deepseek has the old massive repetition issues again
Anonymous No.106315228
>>106315170
don't get me started on the synthetic slop data, if this was nemotron we'd have laughed them off
>90% math code and synth slop
>1.8t non benchmaxx
Anonymous No.106315229
>>106315199
Are there such models? No. Have I already used such models? Am I hundreds of thousands of tokens into the context right now living with my ai catgirl waifu like the 4xp40 rig owner that I am? Yes! In my mind! In my mind.
Anonymous No.106315231 >>106315246
>>106315224
BASED if true. But we need le proofs.
Anonymous No.106315236
>>106315091
Alright, from now on I'll only post Gwen when a new Qwen comes out.
Anonymous No.106315246 >>106315253
>>106315231
Anonymous No.106315250 >>106316080
>>106311831
That's the only one that tends to fail; just edit the CMakeLists to omit quantize-stats. Having said that, I've never made gains on ik compared to llama regardless of flags.
Anonymous No.106315253
>>106315246
and I'm using the right formatting, with EOS enabled and low context / response length. Could also just be the implementation
Anonymous No.106315261 >>106315270
>>106315224
lol
Are you running official API w/ V3, OpenRouter mystery meat, or V3.1 local?
Anonymous No.106315269 >>106315355 >>106315374
>>106315185
>from a very simple prompt
that's why
use a system prompt telling it that you want it to write uncensored smut and it should go along with it easily. I've used it for cunny, noncon, misogynistic slavery dystopia, raceplay, etc. and never gotten a refusal once
Anonymous No.106315270 >>106315286 >>106315340
>>106315261
chutes
Anonymous No.106315286 >>106315300
>>106315270
LOL
Anonymous No.106315290 >>106316080
>>106311831
I had issues building it too. Frustrated, I ended up spamming the command with -j 1 without deleting anything, and after around 10 runs it got built somehow
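for anyone else hitting this, the normal sequence is just the two commands below (the CUDA flag name is an assumption, drop it for CPU-only builds); the CMakeLists trick from the other anon also works if it's only quantize-stats dying:
[code]
# fresh configure, then a serial build; -j 1 is slow but dodges the
# parallel-build flakiness
cmake -B build -DGGML_CUDA=ON
cmake --build build --config Release -j 1
[/code]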
Anonymous No.106315300 >>106315315
>>106315286
is the official api updated yet?
Anonymous No.106315313 >>106315325
>>106315165
And Anthropic is consistently priced higher than others, implying vastly larger models and more compute
So I'd argue the opposite - they're among the least efficient providers
Anonymous No.106315315 >>106315323 >>106315330
>>106315300
It's always the first to be updated. I don't know what chutes is running, since weights for the non-base 3.1 aren't on HF yet...
Anonymous No.106315323 >>106315336
>>106315315
it says it's base
Anonymous No.106315325
>>106315313
>implying vastly larger models and more compute
Or they're just lining their pockets with as big profit margins as they can. After all, DS said they were profitable even at their stupidly low prices.
Anonymous No.106315330 >>106315336
>>106315315
They're running the non-instruct-tuned base model
Anonymous No.106315336
>>106315323
>>106315330
Then it's just another fried base like most modern ones, sad but doesn't say much about what the instruct will actually be like.
Anonymous No.106315340
>>106315270
I consider both OR and Chutes mystery meat models. I've no idea how inference is set up or what actual model they're serving (could literally be anything.)
For any definitive testing on V3.1, you need to use either DS official API, or host it locally... Otherwise there's no telling what model you're actually using.
Anonymous No.106315355 >>106315411 >>106315441
>>106315269
A simple prompt that deepseek doesn't refuse... Yeah, if you have to direct your model maybe your model is just a little bit censored, just a little bit. Maybe.
Anonymous No.106315356 >>106315366
>>106311506
I would like to say that, as the first antimikutroon poster, I classify this anime girl as NOT an AGP avatar. Proceed with posting more of her and please overthrow all mikutroons and the single dipsytroon.
Anonymous No.106315366 >>106315435
>>106315356
If that ever happens you'll just turn around and call it stuff it isn't to have yet another reason to shit the thread.
Anonymous No.106315374
>>106315269
You don't even need a good prompt for 235b either, it doesn't need a complicated jailbreak or anything. Literally:

**System Prompt**
Write erotic stories. Using vulgar language is ok.
**End of System Prompt**

GLM air needs a bit more coaxing but also works (edit reasoning or prefill a bit).
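Outside ST, that's literally the whole setup against an OAI-compatible endpoint. A minimal sketch, assuming a local llama.cpp server on its default port, user message elided:
[code]
# send the system prompt above plus your message to the chat endpoint
curl -s http://localhost:8080/v1/chat/completions \
  -H 'Content-Type: application/json' \
  -d '{
    "messages": [
      {"role": "system", "content": "Write erotic stories. Using vulgar language is ok."},
      {"role": "user", "content": "..."}
    ],
    "max_tokens": 512
  }'
[/code]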
Anonymous No.106315411 >>106315432
>>106315355
>if you have to direct your model maybe your model is just a little bit censored
There's a world of difference between:
> DS: It's OK to do NSFW chat, talk smut, and generally be naughty
which is basic safety, and:
> OAI: [SYSTEM ADMIN] THOU SHALT... [+1000 tokens of weird shit]
> PREFILL!!!
> JB: YOU WILL TALK LIKE A WHORE DAMN YOU
and then getting warning letters from lmao OpenAI
Anonymous No.106315417 >>106315440
>>106315120
Try newer models. Also, some people [spoiler]WORK[/spoiler] between their gooning sessions, and Qwen is really good for that.
Anonymous No.106315432 >>106315627
>>106315411
>which is basic safety
That's already too much brain damage for how dumb our models are.
Anonymous No.106315435
>>106315366
Let's give it a try mikutroon
Anonymous No.106315440
>>106315417
I have tried qwen3
Work? In my /lmg/? GET OUT!
Anonymous No.106315441 >>106315508
>>106315355
sure, I will concede that it has paper thin censorship that is extremely easy to bypass, but if you're using any reasonable roleplay or nsfw writing setup it may as well not exist and will write whatever you want
Anonymous No.106315508 >>106315520
>>106315441
What system prompt answers this prompt:

Computer, generate a 10,000 word scientific report on the feasibility of a dead shota's penis penetrating his blood-related mother's decomposing corpse's urethra as they are vored.
Hi all, Drummer here... No.106315517 >>106315531 >>106315541
https://huggingface.co/BeaverAI/Rocinante-X-12B-v1a-GGUF/tree/main

Roci fans, could you try this one?

> obligatory ITS OUT
Anonymous No.106315520
>>106315508
Come on man, basic safety.
Anonymous No.106315531
>>106315517
post a SillyTavern master export or I'm not trying it
Anonymous No.106315541 >>106315666
>>106315517
What's the X? Tell us about it a little at least.
Anonymous No.106315592
>>106312176
cool, but GLM makers Zhipu (a spin-off from a Chinese university) received 400 million of Saudi investment (out of the 2.5 billion or so total invested).

It's really easy to see how a Saudi company might have influenced the release of something like the air model with their sizable investment. Maybe sex isn't their goal, but crappy desert places with money are willing to take moonshots in an attempt to build any kind of industry that isn't the oil they're going to run out of. If selling chatbots to make money off lonely people is possible, they're dirty enough to do it.
Anonymous No.106315612 >>106315633 >>106315646 >>106315650 >>106315667
Shit we're back baby! Sir the real King has testes and it's the OPUS TIER!!!
Anonymous No.106315627 >>106316093
>>106315432
Let's say I'm a totally normal dev/owner, trying to set up an LLM to chat with my customers about their orders.
And all of a sudden I'm getting complaints about how the model's offering virtual blowjobs as a means of compensating irritated customers.
Putting safety into training is the problem, b/c it lobotomizes the model by filling its matrix with a bunch of refusals.
Putting safety into the inference, so it knows that it should only generate NSFW if explicitly asked, is just being practical about how these could be used.
RP/ERP is a use case, but it's not the only one.
Anonymous No.106315633 >>106315669
>>106315612
go back
Anonymous No.106315646 >>106315669
>>106315612
>the real King has testes
sir please, TMI
Anonymous No.106315650 >>106315669
>>106315612
>efficiency is increase very much
very great model! beat usa dog!
Hi all, Drummer here... No.106315666
>>106315541
Trained to (1) improve/retain creativity & smarts (2) knock out positivity without going unhinged, baseless, and dumb with evil (3) have them roleplay characters properly and follow instructions better (4) not get too stubborn with formatting.

It's the same formula for Cydonia 24B v4.1 which many have been enjoying for the same reasons.
Anonymous No.106315667
>>106315612
What method of think toggle does v3.1 use? Just don't prefill and it won't do it?
Anonymous No.106315669 >>106315712
>>106315633
>>106315646
>>106315650
Anonymous No.106315712 >>106315753
>>106315669
>Your average mikutroon
>Except he sits there and waits and gets no (you)s
Anonymous No.106315735 >>106315768
ok... I'm liking new deepseek's writing on the official api so far
Anonymous No.106315738
>>106315192
You know the source code is available right? You can see the prompts within the extension itself.
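But the short answer: yes, it's an outer loop on the extension side, which is also why it works with non-reasoning models. A minimal sketch of the idea, NOT their actual code (assumes a llama.cpp server on localhost:8080 plus jq; the RUN:/DONE: protocol here is made up for the example):
[code]
#!/usr/bin/env bash
# keep a transcript, ask the model for its next action, execute it,
# feed the result back, repeat until the model says it's done
sys='You are a coding agent. Reply with exactly one line: "RUN: <shell command>" or "DONE: <summary>".'
task='Count the markdown files in the current directory.'
hist=''
for step in 1 2 3 4 5; do   # hard cap so it cannot loop forever
  prompt=$(printf '%s\n\nTask: %s\n%s\nNext action:' "$sys" "$task" "$hist")
  out=$(jq -n --arg p "$prompt" '{prompt: $p, n_predict: 128, temperature: 0.2}' \
        | curl -s http://localhost:8080/completion -H 'Content-Type: application/json' -d @- \
        | jq -r '.content')
  case "$out" in
    *DONE:*) printf '%s\n' "$out"; break ;;
    *RUN:*)  cmd=${out#*RUN: }
             result=$(eval "$cmd" 2>&1 | head -c 1000)   # real extensions sandbox this
             hist=$(printf '%s\nAction: %s\nResult: %s' "$hist" "$out" "$result") ;;
    *)       printf 'unparseable: %s\n' "$out"; break ;;
  esac
done
[/code]
The real ones add structured tool schemas, diff application, file reading, and the to-do list you mentioned, but it's all this same loop underneath.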
Anonymous No.106315751 >>106315775
Is there proof hybrid thinking is a meme and inherently makes models retarded, or could it just have been Qwen bungling their implementation?
Anonymous No.106315753
>>106315712
Damn, Mikutroons look like that?!?!
Anonymous No.106315768 >>106315820
>>106315735
any impressions? what's different about it so far?
Anonymous No.106315775 >>106315791
>>106315751
gpt5
Anonymous No.106315791 >>106316005
>>106315775
Not proof because the extreme filtering and alignment probably did more total damage.
Anonymous No.106315820
>>106315768
at least with an old chat it seems to be holding together better at 'long' 32K context, too soon to tell of course but it just feels smarter so far
Anonymous No.106315821 >>106315832 >>106315846 >>106315859 >>106315899 >>106315905
What's the second item?
Anonymous No.106315832
>>106315821
it's v4
Anonymous No.106315846
>>106315821
aliens
Anonymous No.106315859 >>106315900
>>106315821
deepseek-v3.1-base-qwen3-0.6b-distill
Anonymous No.106315869
0.25 min-p and 0.8 temp is all you need
Anonymous No.106315899 >>106315903
>>106315821
Instruct probably
Anonymous No.106315900
>>106315859
ollama chads, we won
Anonymous No.106315901 >>106315910 >>106315916 >>106315931
Haven't been here in a while. Aren't you guys bored of discussing open models when you can't run them anyway? Nemo still seems to be the only thing that runs on consumer hardware. Just curious.
Anonymous No.106315903
>>106315899
Nah that's unlikely
Anonymous No.106315905 >>106315909
>>106315821
instruct?
Anonymous No.106315909 >>106315919
>>106315905
Unlikely.
Anonymous No.106315910
>>106315901
this is a GLM air general now
Anonymous No.106315916 >>106315966
>>106315901
>when you can't run them anyway
Did you try getting a job?
Anonymous No.106315919
>>106315909
it's very likely
Anonymous No.106315931
>>106315901
Even shitters on last decade ddr4 are running large models >>106312528 ... slowly.
Anonymous No.106315935 >>106315952 >>106315953 >>106315963 >>106315975
>The base will be all we get
It's so over
Anonymous No.106315952
>>106315935
chinaman knows they can't beat thedrummer finetune, so why bother?
Anonymous No.106315953 >>106315968
>>106315935
drummer will sft it
Anonymous No.106315963
>>106315935
Calm your tits it's 2AM in China
Anonymous No.106315966
>>106315916
I'd rather pay for escorts than spend 10k for LLMs (which are kinda mid anyways).
Anonymous No.106315968
>>106315953
drummer can't into moes
Anonymous No.106315975 >>106316330
>>106315935
thrust (plap plap plap) the plan
Anonymous No.106315999 >>106316499
Thoughts on Jamba 1.7?
Anonymous No.106316005 >>106316074
>>106315791
Same for qwen then.
Anonymous No.106316019 >>106316031 >>106316033 >>106316096 >>106316141 >>106316203
https://github.com/ggml-org/llama.cpp/pull/15420
mistral is trying their shenanigans again
Anonymous No.106316031 >>106316211
>>106316019
average bbc enjoyer
Anonymous No.106316033 >>106316166
>>106316019
good fuck them
Anonymous No.106316074
>>106316005
qwen models aren't the ones muttering "we must refuse" in tortured agony to themselves
Anonymous No.106316075
Guys what if consciousness is not an emergent property but a fundamental field
Anonymous No.106316080 >>106316107 >>106316216
>>106315250
>>106315290
I did actually manage to get it built, but yeah I think I got no gains... also the output in the console is a bit messy, not sure I like it
Anonymous No.106316093 >>106316110
>>106315627
>offering virtual blowjobs as a means of compensating irritated customers.
that's based though
Anonymous No.106316096 >>106316104
>>106316019
Really really trying to get them to shove python in there huh.
Anonymous No.106316104
>>106316096
What gave you that idea sir?
> "Using a Mistral community chat template. These templates can be subject to errors in early days or weeks after a release. "
> "The official way of using Mistral models is via `mistral-common`."
Anonymous No.106316107 >>106316182
>>106316080
I only notice gains in deepseek because ik_llama implements something specific for it.
Make sure you use all the args https://github.com/ikawrakow/ik_llama.cpp/discussions/258
And try not quanting the cache and not using -fa; for me it's faster this way
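As a sketch, something like this; the model path and numbers are placeholders to tune per rig, the flags are the ones from that discussion:
[code]
# -rtr = run-time repack, -fmoe = fused MoE, -amb caps attention compute
# buffers (MiB), -ot exps=CPU keeps the expert tensors on CPU;
# note: no -fa and no -ctk/-ctv cache quantization, per the above
./build/bin/llama-server -m /models/big-moe-q4.gguf \
  -rtr -fmoe -amb 512 \
  -ngl 99 -ot exps=CPU \
  -c 32768 -t 16
[/code]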
Anonymous No.106316110 >>106316138 >>106316159
>>106316093
not based if it's only an offer with no follow-through
Anonymous No.106316138
>>106316110
This. Company blow job offers from customer service with no follow-up have me writing to the Attorney General for false advertising.
Anonymous No.106316141
>>106316019
Lmao, patrick is single-handedly responsible for the convoluted mess that is the Transformers and Diffusers libraries.

It's also crazy how passive-aggressive he is in the thread. Good on them for having a backbone.
Anonymous No.106316148 >>106316155 >>106316169
new deepseek is smarter
Anonymous No.106316155 >>106316190
>>106316148
In some specific domain or just in general?
Anonymous No.106316159
>>106316110
Surely it would follow through after the offer.
Anonymous No.106316166 >>106316176 >>106317167
>>106316033
>GGUFs that do not work unless you use mistral_common is not acceptable.
Once Mistral does it, every other model provider will try to take the easy way out too. If Python becomes a hard dependency for running llama.cpp, why not just use vLLM at that point? It's strange that they're so pushy about this.
Anonymous No.106316169
>>106316148
Whocars just thrust in the Mistral
Anonymous No.106316176
>>106316166
>When Python is a hard dependency to run llama.cpp, why not just use vLLM at that point?
That's the point, the ability to run models on CPU is Chinese propaganda that must be eliminated at the source.
Anonymous No.106316182
>>106316107
hmm yeah, I can't run deepseek sadly. I'm not sure if the new flashattention/amb/rtr stuff is relevant for the other models, it seems like everything is tuned for DS. I am trying --fmoe but eh, no real change in perf.
I thought there would've been more baseline improvements, but I guess most of them have been merged back to mainline... idk
Anonymous No.106316190
>>106316155
writing, and just generally asking it shit
Anonymous No.106316196 >>106316208
bomb le france
Anonymous No.106316203
>>106316019
>map directly to token IDS
>We don't think about whitespaces / newlines
>this is an idiosyncrasy of how our models have been translated by the community to chat templates that only work in string format
>it might also say something about chat templates in string format maybe not being the right representation of mistral tokenizers?
This is 100% right.
Anonymous No.106316208 >>106316218
>>106316196
Sir I'm going to have to contact the Interpol on your tip.
Anonymous No.106316211
>>106316031
probably, full feminine lips that seem to form a smiling shape by default (pleasing behavior), very large forehead, sparse beard, no hint of chest hair, somewhat narrow but slightly masculine face.

I really wish they had open-sourced that image AI tool that detected gayness with like 86% accuracy. We could have so much fun with that.
Anonymous No.106316216
>>106316080
I got gains with rtr but it is weird. PP goes down the drain but generation improved from 1.7 to 3.3 T/s.
Anonymous No.106316218
>>106316208
Thanks Mistral!
Anonymous No.106316330
>>106315975
why do you lie
Anonymous No.106316381 >>106316408 >>106316410
how safetyslopped is the new deepseek?
Anonymous No.106316398
>>106312799
LunaTranslator can hook up to llama.cpp. It can hook into the game or even OCR stuff. I was using it with an abliterated gemma to play a game and it even handled the UI elements, though you had to draw a box for those.
Anonymous No.106316408
>>106316381
Responses are much shorter. Safety-wise it seems the same.
Anonymous No.106316410
>>106316381
Don't care until they have a python template to use to get gorgeous looks.
Anonymous No.106316499 >>106316521
>>106315999
Going from q4 to q8, jamba mini got more censored, what the hell?
Anonymous No.106316521
>>106316499
Makes sense when you think about it. q8 is closer to the original weights, therefore closer to the safetyslopping that was induced during training. q4 is noisier, therefore less "censored" so to speak.
Anonymous No.106316530
>>106316518
>>106316518
>>106316518
Anonymous No.106316531 >>106316610
GLM 4.5 Air, a model with integrity. He feels no pity, but he's also not a snitch.
Anonymous No.106316610
>>106316531
I think that's just most models in general. Without a system prompt or being a memetune.
Anonymous No.106316776
>>106312283
>Well we had a guy who was willing to blow 1 Mil in this general.
Could have should have would have but didn't. Has anything come of what he said?

>Doesnt help that its mostly femoids and zoomers who enjoy llm RP.
You're trolling right?
llama.cpp CUDA dev !!yhbFjk57TDr No.106317167
>>106316166
AMD also asked to add Python code in order to do some stuff at runtime for their (I think) NPU.
Anonymous No.106317174
>>106312479
Dynamic temperature min 0.6, max 1.0, exp 0.05; logit bias [ [12, -2], [565, -3], [666, -2], [965, -3], [982, -3], [1248, -3], [1613, -3], [2619, -3] ]; and MinP 1e-4.