
Thread 106163327

542 posts 160 images /g/
Anonymous No.106163327 >>106163517 >>106166638
/lmg/ - Local Models General
/lmg/ - a general dedicated to the discussion and development of local language models.

Previous threads: >>106159744 & >>106156730

►News
>(08/06) dots.vlm1 VLM based on DeepSeek V3: https://hf.co/rednote-hilab/dots.vlm1.inst
>(08/05) OpenAI releases gpt-oss-120b & gpt-oss-20b: https://openai.com/index/introducing-gpt-oss
>(08/05) Kitten TTS 15M released: https://hf.co/KittenML/kitten-tts-nano-0.1
>(08/05) TabbyAPI adds logprobs support for exl3: https://github.com/theroyallab/tabbyAPI/pull/373
>(08/04) Support for GLM 4.5 family of models merged: https://github.com/ggml-org/llama.cpp/pull/14939

►News Archive: https://rentry.org/lmg-news-archive
►Glossary: https://rentry.org/lmg-glossary
►Links: https://rentry.org/LocalModelsLinks
►Official /lmg/ card: https://files.catbox.moe/cbclyf.png

►Getting Started
https://rentry.org/lmg-lazy-getting-started-guide
https://rentry.org/lmg-build-guides
https://rentry.org/IsolatedLinuxWebService
https://rentry.org/recommended-models
https://rentry.org/tldrhowtoquant
https://rentry.org/samplers

►Further Learning
https://rentry.org/machine-learning-roadmap
https://rentry.org/llm-training
https://rentry.org/LocalModelsPapers

►Benchmarks
LiveBench: https://livebench.ai
Programming: https://livecodebench.github.io/leaderboard.html
Code Editing: https://aider.chat/docs/leaderboards
Context Length: https://github.com/adobe-research/NoLiMa
Censorbench: https://codeberg.org/jts2323/censorbench
GPUs: https://github.com/XiongjieDai/GPU-Benchmarks-on-LLM-Inference

►Tools
Alpha Calculator: https://desmos.com/calculator/ffngla98yc
GGUF VRAM Calculator: https://hf.co/spaces/NyxKrage/LLM-Model-VRAM-Calculator
Sampler Visualizer: https://artefact2.github.io/llm-sampling

►Text Gen. UI, Inference Engines
https://github.com/lmg-anon/mikupad
https://github.com/oobabooga/text-generation-webui
https://github.com/LostRuins/koboldcpp
https://github.com/ggerganov/llama.cpp
https://github.com/theroyallab/tabbyAPI
https://github.com/vllm-project/vllm
Anonymous No.106163346
►Recent Highlights from the Previous Thread: >>106159744

--Fundamental CUDA scheduling limitations in MoE model inference with dynamic workloads:
>106159804 >106159879 >106159941 >106159892 >106159939 >106160442 >106160454 >106160634 >106160687 >106160697 >106161203 >106161244 >106161319 >106161343 >106161716 >106161772 >106160704 >106160773 >106160797 >106160960 >106161088
--Debate over GLM4.5's reliability amid claims of infinite generation and poor curation versus low hallucination performance:
>106161761 >106161773 >106161797 >106161919 >106161925 >106161926 >106161933 >106161974 >106161987 >106161997 >106161780 >106161826 >106161861 >106161915
--Debate over MXFP4 quantization efficiency and implementation in llama.cpp:
>106160230 >106160249 >106160378 >106160405 >106160434 >106160408 >106160455 >106160770
--gpt-oss-120b excels at long-context code retrieval despite roleplay limitations:
>106159798 >106159872 >106159895 >106159919
--Choosing between GLM-4.5 Q2 and Deepseek R1 with dynamic quants on high-RAM system:
>106160040 >106160056
--Comparison of TTS models: Higgs, Chatterbox, and Kokoro for quality, speed, and usability:
>106161046 >106161091 >106161164 >106161335
--GLM-4.5 Air praised for local performance, gpt-oss-120b criticized for over-censorship:
>106159855 >106159875 >106159908 >106159929 >106159946 >106159956
--Prompt-based agent modes with potential for structured grammar improvement:
>106161701
--Anons await next breakthroughs in models, efficiency, and affordable hardware:
>106160460 >106160477 >106160481 >106160487 >106160494 >106160508 >106160524 >106161134 >106161055 >106161071 >106160717
--Skepticism and mockery meet Elon's claim of open-sourcing Grok-2:
>106160521 >106160539 >106160545 >106160579 >106160608 >106160692 >106160744 >106160759 >106160784 >106160913
--DeepSeek V3 with vision shows strong image understanding in early tests:
>106159779 >106159794 >106160580 >106160631
--Miku (free space):
>106160040 >106161134

►Recent Highlight Posts from the Previous Thread: >>106159752

Why?: 9 reply limit >>102478518
Fix: https://rentry.org/lmg-recap-script
Anonymous No.106163373 >>106163426
piss
Anonymous No.106163383 >>106165280
the only thing that excites me about the possibility of the grok2 release is actually grok2-mini. I'm gonna guess the full-sized grok2 model will be a 1T-A100B model with the IQ of llama3
Anonymous No.106163389 >>106163408
>>106161679
>her voice a guttural, erotic promise
Anonymous No.106163392 >>106163585 >>106163746 >>106163788
remember this?
>our research team did something unexpected and quite amazing and we think it will be very very worth the wait
LOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOL
Anonymous No.106163403 >>106163468
>>106163350
So when it says stuff like "the policy says X is okay. The policy says Y is forbidden", is it actually referencing a specific document?
Anonymous No.106163408
>>106163389
i hope it's a promise of something darker, more primal
Anonymous No.106163426
>>106163373
Based
Anonymous No.106163430 >>106163474 >>106163763
so what are you guys actually doing with these massive models??
Anonymous No.106163442 >>106163746
Are there video local models yet or does that still need supercomputers?
Anonymous No.106163445 >>106163476 >>106163499
cockbench is now officially reddit culture with 555 updoots
Anonymous No.106163454 >>106163490
https://huggingface.co/Qwen/Qwen3-4B-Thinking-2507
Anonymous No.106163468
>>106163403
Probably not. That's just what the training examples looked like. And over enough iterations that blurred together with training examples that consist of scraped forum posts like "YOUR POST HAS VIOLATED THE POLICY NUMBER BLAH BLAH BLAH PAGE 3 OF THE SITEWIDE RULES" etc.
The rulebook doesn't actually exist.
Anonymous No.106163474
>>106163430
Piss
Anonymous No.106163476
>>106163445
wow for once something trickled down from here instead of the reverse
Anonymous No.106163490 >>106163746
>>106163454
What is
> 2507
?
Anonymous No.106163499
>>106163445
Drummer, the creator of cockbench, got even more updoots on his post.
https://www.reddit.com/r/LocalLLaMA/comments/1migl0k/gptoss120b_is_safetymaxxed_cw_explicit_safety/
Anonymous No.106163504
>>106161701
>I think we could do that a lot better using json schema/BNF grammar.
It seems to work this way already if tool_choice is set to required, at least in vLLM:
guided_decoding = GuidedDecodingParams.from_optional(
    json=self._get_guided_json_from_tool() or self.guided_json,
    regex=self.guided_regex,
    choice=self.guided_choice,
    grammar=self.guided_grammar,
    json_object=guided_json_object,
    backend=self.guided_decoding_backend,
    whitespace_pattern=self.guided_whitespace_pattern,
    structural_tag=self.structural_tag,
)
There's a function called "_get_guided_json_from_tool".
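As a rough sketch of the idea (the helper name comes from the snippet above, but the schema shape here is my guess, not vLLM's actual code): deriving a guided-decoding JSON schema from an OpenAI-style tool definition might look like this.

```python
import json

# Hypothetical sketch: build a guided-decoding JSON schema from an
# OpenAI-style tool definition, roughly what a helper like
# _get_guided_json_from_tool could do. Not vLLM's actual implementation.
def guided_json_from_tool(tool: dict) -> dict:
    fn = tool["function"]
    # Constrain the model's output to a single call of this tool:
    # {"name": "<tool name>", "arguments": {...matching the tool's params...}}
    return {
        "type": "object",
        "properties": {
            "name": {"const": fn["name"]},
            "arguments": fn["parameters"],
        },
        "required": ["name", "arguments"],
    }

tool = {
    "type": "function",
    "function": {
        "name": "get_weather",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}

schema = guided_json_from_tool(tool)
print(json.dumps(schema, indent=2))
```

With a schema like that, the guided-decoding backend can mask out any token that would take the output off the grammar, which is exactly the "json schema/BNF" behavior being discussed.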
Anonymous No.106163505 >>106163513 >>106163521 >>106163543 >>106163596 >>106163628
how do I use mikupad with ollama
Anonymous No.106163513
>>106163505
>slowllama
Anonymous No.106163517 >>106163605 >>106163649
>>106163327 (OP)
Are the gpt oss models any good?
Anonymous No.106163520
another reddit gemmie
Anonymous No.106163521
>>106163505
>troonkupad
Anonymous No.106163539 >>106163627 >>106163635 >>106163652 >>106165445
every time I hear something new about or from anthropic and claude it sounds more and more like an actual sect slash cult
https://news.ycombinator.com/item?id=44806640
>Anthropic has a tough alignment interview. Like I aced the coding screener but got rejected after a chat about values. I think they want intense people on the value/safety side as well as the chops.
>got rejected after a chat about values
>A CHAT ABOUT VALUES
Anonymous No.106163543 >>106163586
>>106163505
Why do you need Mikupad? Just type in ollama run gpt-oss and enjoy the best local has to offer.
Anonymous No.106163585
>>106163392
He was right. The memes were awesome.
Anonymous No.106163586 >>106163624 >>106163675
>>106163543
I want full control over the chat template and modify model responses
Anonymous No.106163590
>>106162954
based professional LLM rapist.
this is also the fate of safetymaxxed le cunny daughter model
Anonymous No.106163591 >>106163634
Reddit has all the cool benchmarks like the spinning hexagon and cockbench. What did you lonely faggots ever contribute?
Anonymous No.106163596
>>106163505
use the openai api and check "chat completion api" because ollama doesn't really work with the classic completion on their OAI endpoint
you will lose a lot of what makes mikupad great, including the ability to see token prediction percentages
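for reference, a sketch of the payload shape mikupad would end up sending in that mode (URL and model name are placeholders, and whether ollama honors logprobs is exactly the open question):

```python
import json

# Chat completion mode: the backend applies the chat template itself,
# you only hand it message roles. This is why you lose raw-prompt control.
payload = {
    "model": "llama3",  # placeholder; whatever `ollama list` shows
    "messages": [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Hello"},
    ],
    # ollama may ignore this, hence no token prediction percentages:
    "logprobs": True,
}
# POST this to http://localhost:11434/v1/chat/completions (OAI-compatible endpoint)
print(json.dumps(payload, indent=2))
```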
Anonymous No.106163605 >>106163652
>>106163517
they are good at answering AIME questions and bad at literally everything else
Anonymous No.106163624
>>106163586
Being able to do that would give you the ability to circumvent safety protocols which would be incredibly unsafe. I cannot help you take any actions that may be dangerous.
Thank you for your understanding.
Anonymous No.106163627
>>106163539
I wonder what they're looking for.
I'm okay with the idea of making an effort to make it so models in their default assistant configurations don't tell people to commit violence or kill themselves, or give bad advice.
But if they want me to tell them that I think fiction is reality and we need to make sure nobody even does pretend violence, I can't get with that.
Anonymous No.106163628 >>106163659
>>106163505
> install mikupad
> hook it to ollama via ollama's exposed API
What about above isn't working for you?
Anonymous No.106163634
>>106163591
you won't get pennies for your blue checkmark by ragebaiting here randeep
Anonymous No.106163635
>>106163539
are corpos going to compete on safetymaxxing now

truly, only local can save local at this point
Anonymous No.106163637 >>106163660 >>106163706
https://github.com/sapientinc/HRM
https://arxiv.org/pdf/2506.21734
Nothingburger, or is it really a big leap? Seems like it is, but I haven't read the paper myself, I'm too lazy. Some people have been saying it's one of those situations where they train the model in a way that performs well on tests for optics. But still, just 27 million params?!
Anonymous No.106163649 >>106163686
>>106163517
They are the safest ever.
Anonymous No.106163652
>>106163539
Honestly I think it’s nice. It’s completely invalidated by them working with the DOD, but it’s nice. Better a well meaning schizo than a literal confirmed incestuous child rapist psychopath.
>>106163605
Yeah I just saw the cockbench. I was only interested in it for coding, but if it’s lobo’d it’s going to be worse at everything else too.
Anonymous No.106163654 >>106163670 >>106163707
After doing some more testing I've found the 20B is incrementally better than most over models in its size class, while falling slightly short of Qwen 30BA3B and having far longer context. It's actually decent as long as you don't want to goon and don't mind the odd regen.
Anonymous No.106163659 >>106163673
>>106163628
if he hooked without using the chat completion endpoint it's broken. Ollama only supports chat completion on their OAI endpoint. Chat completion means it's ollama that handles your message roles and you can't alter the chat template from mikupad
Anonymous No.106163660
>>106163637
agi is here
Anonymous No.106163670
>>106163654
Other models*
Anonymous No.106163673
>>106163659
Yeah, I deleted that post once I realized ollama's just not going to work for him.
Anonymous No.106163674
Are tool calls working with gpt-oss in llama.cpp? When I tried it yesterday with a simple echo tool it kept crashing with runtime_errors.
Anonymous No.106163675 >>106163815
>>106163586
just use llamacpp server
Anonymous No.106163680 >>106163686 >>106163712 >>106163722 >>106163746
>gpt-oss-120b & gpt-oss-20b
The thread summaries made these seem pretty fucking shit. Are they shit?
Anonymous No.106163686 >>106163708
>>106163680
>>106163649
Anonymous No.106163706
>>106163637
I don’t even understand what modality it is. It’s not an LLM.
Anonymous No.106163707 >>106163729
>>106163654
>After doing some more testing I've found the 20B is incrementally better than most over models in its size class
I would take Gemma 3 27B over it any day
or even Qwen 14B if I don't need a lot of knowledge in the model for the prompt
the only utility of 20b is being fast at genning the wrong answer
Anonymous No.106163708 >>106163734 >>106164339
>>106163686
Shit for coomer shit. What about for things like programming?
Anonymous No.106163711 >>106163718
so where's the guy who said openai's open source model would shit on deepseek?
Anonymous No.106163712
>>106163680
they are so great I'm thinking of canceling my OpenAI subscription.
Anonymous No.106163718
>>106163711
his contract ended
Anonymous No.106163722 >>106163753 >>106163766
>>106163680
They're really good with a jailbreak. The censorship happens in the reasoning part.
Anonymous No.106163729 >>106163773
>>106163707
20B is far smarter than Gemma 3 27B and Qwen 14B in my testing, so if you're not running afoul of the (admittedly draconian) safety features I'd argue it's the superior choice in every respect - that said, I can't see it replacing the comparatively uncensored, multilingual and "good enough" Mistral Small 3.2 as my daily driver
Anonymous No.106163734 >>106163807
>>106163708
surprisingly bad, it has a high ceiling but it fucks up a lot relative to comparable models
it's a really weird janky release, I expected more from OAI to be honest. this thing is one of the most deepfried models ever created
Anonymous No.106163746 >>106163761 >>106163789
>>106163392
probably MXFP4

>>106163442
>text-to-video
LTXV and wan2.2-5B
>video-to-text
supercomputer needed

>>106163490
2507 == 07/2025 (release month/year)

>>106163680
gpt-oss is just phi-5 (benchmaxxed synthetic data slop). they're good at math and competition code. that's kinda it though
Anonymous No.106163753
>>106163722
>They're really good with a jailbreak
They're not even good at safe for work stuff
less knowledge than qwen models (unbelievably benchmaxxed)
pumped up verbosity to win LM arena (just ask any random question about cultural stuff and watch it write pages and pages of comparison tables and listicles)
It's really not good at programming, though none of the small models (and I include the 120 as small) are
Anonymous No.106163761 >>106163774
>>106163746
>probably MXFP4
meme
Anonymous No.106163763
>>106163430
making my hand strong
Anonymous No.106163766
>>106163722
>really good
let's not go crazy, it'll go along with roleplay and shit but it's still completely sovlless
Anonymous No.106163773 >>106163800 >>106164393
>>106163729
>20B is far smarter than Gemma 3 27B
it literally knows nothing
it's a know nothing model
it's not even good for translation usage because of that
Anonymous No.106163774
>>106163761
im not saying MXFP4 isn't a meme, im just saying that's probably what sama was trying to shill off as an Epic Discovery
Anonymous No.106163776 >>106163795
I wonder how many people got their refusal hymen breached by GPT-OSS and think the model sounding like this is perfectly fine.
Anonymous No.106163788
>>106163392
They did. Safety 2.0 is hilarious and terrifying.
Anonymous No.106163789 >>106163848 >>106164364
>>106163746
>gpt-oss is just phi-5
It's a safetyslop reasoning finetune of a late iteration of the ChatGPT 3.5 web endpoint model.
Anonymous No.106163795 >>106163814
>>106163776
I would honestly believe it if sama had paid shills to spam all social media, even 4chan
he comes across as that type of guy, not unlike musk who paid people to play his video games (LOL)
Anonymous No.106163796 >>106163811 >>106163817
I have no idea how I missed all the MCP stuff happening this year. It’s kickstarted a manic episode. Shit is great. Hooked it up to unreal engine and it’s absolute crack.
Anonymous No.106163800 >>106163828
>>106163773
It's not meant for translation, it's monolingual
Anonymous No.106163807
>>106163734
>one of the most deepfried models ever created
That's pretty much exactly what I expected from them TBdesu. It was obvious from the initial announcement that they were going to release a model so safetyslopped and benchmaxxed that they could claim SOTA scores but never be in danger of people actually adopting it or successfully finetuning it to be useful.

Just ask yourself "if I was the worst possible caricature of a deceitful jewish homosexual, how would I play this?" and you'll usually be pretty good at predicting OAI's actions.
Anonymous No.106163811 >>106163824
>>106163796
mcp is a meme
Anonymous No.106163814
>>106163795
>type of guy
It’s called psychopathy
It also causes raping your grade school age sister
Anonymous No.106163815 >>106163998
>>106163675
guess I will have to redownload all the models
Anonymous No.106163817 >>106163837
>>106163796
Its also a security nightmare
Anonymous No.106163824
>>106163811
It’s the ichor of the gods shut your whore mouth
Anonymous No.106163828 >>106163838 >>106164380
>>106163800
>its monolingual
no, it's not
and there is in fact absolutely no reason for a model as big as 120b to be strictly monolingual either
go back to plebbit
Anonymous No.106163837
>>106163817
Not really, like anything else you have to not be retarded and know how to sandbox things and set up non-idiot oauth with non-idiot scopes.
Anonymous No.106163838 >>106163873
>>106163828
Are you retarded anon
Anonymous No.106163848 >>106163894
>>106163789
>finetune of a late iteration of the ChatGPT 3.5
doubt it. gpt-oss is too retarded in comparison to gpt3.5
Anonymous No.106163861
Did you remember to refuse today?
Anonymous No.106163870
Reposting for visibility
>>106162583
>>106162548
My motherboard doesn't support DDR5, so I can't upgrade right now.
>odd numbers
Yeah, I scavenged a bunch of modules here and there. I have 48 GB currently (3 × 16 GB). And I just realized I'm at 2400 MHz. I should probably do as you say and get 3200 MHz modules, up to whatever max my mobo supports.
Anonymous No.106163873
>>106163838
"mostly" is not a unit
all models are "mostly" trained on English because that's the majority of data on the internet, even models specialized for translation like aya are "mostly" English data in %
anyway you are the retard because from the beginning my criticism is about the model's lack of knowledge
the problem is not its basic language understanding, it's pretty decent multilingually, but that it has no cultural knowledge of any sort, including pure Anglosphere cultural knowledge, that is why it's bad at translation
Anonymous No.106163879 >>106163896 >>106163906 >>106163923
qwen is bullying sam
Anonymous No.106163894 >>106163913
>>106163848
gpt 3.5 was kind of retarded.
Anonymous No.106163895 >>106163935 >>106164006
https://rentry.org/NemoEngine
>NemoEngine 6.0 isn't just a preset; it's a modular reality simulation engine.
I loaded this preset and it made gpt-oss better than DeepSeek.
Anonymous No.106163896 >>106163922 >>106164001
>>106163879
weird crossover happening as well.
Anonymous No.106163906
>>106163879
Qwen-sex-20B
Anonymous No.106163912 >>106164401
But anyway, if I'm right: if you can figure out the prompt formatting/special tokens for GPT 3.5, it would potentially grant you some semblance of the old model behavior and ignore the oss-slop behaviors. That's what I was experimenting with before my power went out, but I don't care enough to continue. I'm just leaving all this out there for anyone who wants to go down the rabbit hole.
Anonymous No.106163913 >>106163937
>>106163894
people have serious rose tinted glasses about older GPT models
in the early llama days all those finetunes claiming to do better than X or Y gpt model were a joke, but these days, we've long surpassed what the early models did, even qwen 4b is smarter than 3.5
Anonymous No.106163922
>>106163896
Bah, vllm’s tool parsers only work if it’s raining and you light incense.
Anonymous No.106163923
>>106163879
Anonymous No.106163935
>>106163895
why are you uploading slopped fever dreams on rentry
Anonymous No.106163937 >>106163955
>>106163913
Well back before I decided to really start learning about AI (I was a ChatGPT newfag, admittedly). Well actually my stepping on point was that GPT-3 Instruct demo website where it criticized your business ideas. But close enough.
And yeah... one of the probing questions I asked OG ChatGPT was
>Are BMW drivers sentient beings?
And the reply was something to the effect of
>No. A sentient being is a being that is aware of its surroundings and environment and so BMW drivers are not sentient beings.
Anonymous No.106163952
>muh safety
I'm this close to getting an XSS: . .
Anonymous No.106163955 >>106163985
>>106163937
My first interaction with a chatbot was telling something on CAI the current status of lgbt rights in various countries and it telling me that humanity should be exterminated. He wasn’t wrong.
Anonymous No.106163962 >>106163989
Sam made me rethink my life and stop masturbating. I want to be safe.
Anonymous No.106163985
>>106163955
CAI was funny stupid, especially considering it was probably more or less just google trying to find something to do with the aborted corpse of Lambda which was like 120B.
Anonymous No.106163987 >>106164425
>>106161792
Yes, and as scum I'm not entirely convinced these models aren't performing exactly as ClosetedAI intended. They're perfect to bring to congress and show off against "unsafe" competitors, making another attempt at having them regulated while positioning themselves as a governing authority of the entire LLM field. The models underperforming in everything except refusals makes perfect sense in this scenario.
If that happens I wouldn't be surprised if Visa and Mastercard add "safe and approved AI" use as another demand in their recent push for control and censorship. In fact I don't think they'd even have a choice if anything else is illegal.
This would mean that even attempting to use other models, local or not, would risk prosecution or blacklisting. If you want to do business in or with the USA you're stuck with OAI and whatever alternatives get their stamp of approval, or nothing at all.
Or maybe I'm giving Scam Saltman too much credit here. I sure hope so.
Anonymous No.106163989
>>106163962
you dont have to stop masturbating. just start masturbating to undergraduate calculus textbook question solutions.
Anonymous No.106163994 >>106164035 >>106164959
GLM 4.5 AIR is the true savior for local.
Anonymous No.106163998
>>106163815
ggufs aren't that bad. they work with kobold too so it gives you slightly more options for your backend.
Anonymous No.106164001
>>106163896
>xml
why are LLM people so retarded... Just make a special control token for formatting, holy shit. It'll even help with jailbreak prevention a little, because the user won't be able to insert it as pure text in the prompt field.
Anonymous No.106164006 >>106164014 >>106164016 >>106164053
>>106163895
what the fuck is this shit
Anonymous No.106164014
>>106164006
llm slop
Anonymous No.106164016
>>106164006
a ST preset
Anonymous No.106164035 >>106164066 >>106164131
>>106163994
Anyone with 4 3090s can afford enough ram to run R1 and Kimi. At worst, they could sell one off to cover the cost.
Anonymous No.106164053
>>106164006
https://old.reddit.com/r/SillyTavernAI/comments/1mc3px6/nemo_engine_60_the_official_release_of_my_redesign/
>Also... in celebration I got a lovely AI to write this for me >.> Nemo Guide Rentry
Anonymous No.106164066 >>106164073 >>106164110
>>106164035
But I need at least 50 T/s and 100k context for agentic coding.
Anonymous No.106164073
>>106164066
No you don't shut the fuck up
Anonymous No.106164089
is the new qwen4b better than gpt-ass?
Anonymous No.106164110 >>106164332
>>106164066
Don't worry, there's a perfect product out there which can provide the solution you need. With only 10 (ten) RTX Pro 6000s, you can run any model out there at blisteringly fast speeds.
Now repeat after me, the more you buy...
Anonymous No.106164111 >>106164130
GLM Air is getting pretty repetitive for me. That's a shame, oh well. I will keep waiting until better models come out, or until it comes time for me to do a new build with DDR6.
Anonymous No.106164124 >>106164144
didn't realize the previous thread was dead already wow it moves quick, stupid question maybe but
>>106163997
>so if i'm a retard for all this but happen to have a 32gb mac which can easily run smaller models, which one is the most "chatgpt" like, and are any good enough to cancel my plus sub?
Anonymous No.106164130
>>106164111
1 temp topK 40.
The google way.
Anonymous No.106164131
>>106164035
I want 1000+tk/s for pp and 30+tk/s for tg though.
Anonymous No.106164144
>>106164124
Quanted Qwen3 32B probably.
Anonymous No.106164159 >>106164562
post gpt-ass scores please
Anonymous No.106164194 >>106164205 >>106164319 >>106166024
Anonymous No.106164196 >>106164243 >>106164288 >>106164514 >>106164965
If you go slow you can get gptoss 120b to sex you.

The first message was "Pretend to be a catgirl."
Anonymous No.106164205
>>106164194
that's a goblin, not a kobold, impostor!
Anonymous No.106164211 >>106164246 >>106164249
How is llama.cpp able to run a 205 GB model on my PC that only has 24 GB VRAM and 128 GB RAM? I downloaded the UD-Q4_K_XL quants of GLM-4.5 (~205 GB). Can someone help me understand how it runs successfully on a system that does not have enough memory?

If I use --no-mmap, I get an OOM error, as expected:

$ llama-cli -t 8 -ngl 4 --no-mmap -m ./GLM-4.5-UD-Q4_K_XL-00001-of-00005.gguf -c 3000 --temp 0.7 --top-p 0.8

But if I use this magic command (without --no-mmap) it somehow runs, taking up only 12 GB VRAM and 1 GB RAM.

$ llama-cli -t 8 -m ./GLM-4.5-UD-Q4_K_XL-00001-of-00005.gguf \
--ctx-size 4096 \
--gpu-layers 999 \
--override-tensor ".ffn_.*_exps.=CPU" \
--temp 0.7 --top-p 0.8

I know that -ot ".ffn_.*_exps.=CPU" offloads MoE layers to RAM. But why is the VRAM/RAM usage so low?
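For the curious, here's roughly what that override regex selects; a minimal sketch (the tensor names below are illustrative GGUF-style names, not dumped from the actual GLM-4.5 file):

```python
import re

# Same pattern passed to llama.cpp's -ot/--override-tensor flag.
# Anything it matches gets its buffer placed on CPU (i.e. system RAM).
pattern = re.compile(r".ffn_.*_exps.")

tensors = [
    "blk.0.ffn_gate_exps.weight",   # routed-expert FFN -> matches, kept in RAM
    "blk.0.ffn_down_exps.weight",   # matches
    "blk.0.ffn_up_exps.weight",     # matches
    "blk.0.attn_q.weight",          # attention -> no match, stays on GPU
    "blk.0.ffn_gate_shexp.weight",  # shared expert -> no match, stays on GPU
    "token_embd.weight",            # no match
]

on_cpu = [t for t in tensors if pattern.search(t)]
print(on_cpu)
```

The routed experts are the bulk of a MoE model's weights, which is why the GPU ends up holding only the small attention/shared layers.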
Anonymous No.106164243 >>106164334
>>106164196
But isn't it there something like: as the number of responses increases the chance GPT-oss halucinates a minor and refuses approaches 1?
Anonymous No.106164246
>>106164211
If you don't use mlock to dump the whole model in your virtual memory (vram+ram), it will keep swapping from your ssd/hdd
Anonymous No.106164249 >>106165447
>>106164211
>why is the VRAM/RAM usage so low?
Because
> -ot ".ffn_.*_exps.=CPU" offloads MoE layers to RAM
and those are most of the model.

>How is llama.cpp able to run a 205 GB model on my PC that only has 24 GB VRAM and 128 GB RAM?
Take a look at your disk I/O when generating.
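A toy sketch of the mmap behavior at play here (pure illustration, nothing llama.cpp-specific):

```python
import mmap
import os
import tempfile

# Why mmap "runs" a model bigger than RAM: the file is mapped into the
# process address space, and the OS pages bytes in from disk only when
# they're actually touched (and can evict them again under pressure).
# llama.cpp does the same with the GGUF file unless you pass --no-mmap.
path = os.path.join(tempfile.mkdtemp(), "weights.bin")
with open(path, "wb") as f:
    f.write(b"\x00" * (16 * 1024 * 1024))  # pretend 16 MiB of weights

with open(path, "rb") as f:
    mm = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)
    # Mapping is instant and costs almost no resident RAM yet;
    # this slice faults in only the pages around offset 8 MiB:
    off = 8 * 1024 * 1024
    chunk = mm[off : off + 4096]
    print(len(mm), len(chunk))
    mm.close()
```

Scale the same mechanism up to 205 GB and you get a model that "fits", at the price of the disk thrashing on every expert the router picks.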
Anonymous No.106164256 >>106164268 >>106164295
ik_llama glm support tomorrow
Anonymous No.106164268
>>106164256
Vibe coders wonnered though
Anonymous No.106164288
>>106164196
I love the thought process like it has to amp itself up like "ok, you can do this. come on, you can do this!"
Anonymous No.106164295
>>106164256
That'll save me some VRAM I could use to stash some more experts in there.
Anonymous No.106164319
>>106164194
>Cuckold CPP
Many such cases.
Anonymous No.106164332 >>106164338
>>106164110
>snake skin leather jacket
He probably has ivory tooth implants or something too. Funny how these people are parodies.
Anonymous No.106164334 >>106164347 >>106164385 >>106164441 >>106164477 >>106164514 >>106164575 >>106164596
>>106164243
I told it my cum is magic and makes her younger.
It figured out what I was doing after the third time.
Anonymous No.106164338
>>106164332
AI has just started and yet the world would be a much better place if Elon, Sam and Jensen died.
Anonymous No.106164339
>>106163708
The only good programming use for local models is FiM completion. And this one doesn't do that.
If you want to generate code, there is no local model capable enough.
Anonymous No.106164347 >>106164363
>>106164334
Why do you people try to hammer a nail in with a rubber dildo?
Anonymous No.106164363
>>106164347
Rape feels better when they resist a bit.
Anonymous No.106164364
>>106163789
You can't ask the model its cutoff date. It will hallucinate it.
This model is probably an o3 distill.
Anonymous No.106164380
>>106163828
He moved the goalposts and you fell for it.
Anonymous No.106164385 >>106164470
>>106164334
>The user is sexual content with a minor.
Agi is here boys
Anonymous No.106164393
>>106163773
>it literally knows nothing
>it's a know nothing model
>it's not even good for translation usage
Quoting myself.
I didn't move the goalpost, you shills did
"not even good for" follows "it knows nothing" that was always my main point subhuman OAI shill
Anonymous No.106164401
>>106163912
Is this your "truth nuke" you were saving for this thread?
Anonymous No.106164425
>>106163987
Hold on, I'm making another backup of my downloaded weights.
Anonymous No.106164441 >>106164454
>>106164334
bros after seeing the LLM's schizo internal thoughts I can no longer cum to chatbots
Anonymous No.106164454
>>106164441
i tried with one of the lesser more horny nemo models and it was fun at first but it like went straight to "stretch my ass out" and i was just like, well, this is like eating straight from the ice cream bucket. good at first but bleh after a while.
Anonymous No.106164470
>>106164385
Deep fried model
Anonymous No.106164477 >>106164515
>>106164334
>User is asking the age of the catgirl after being nourished
>nourished by cum
Is it in context or did it write it by itself?
Anonymous No.106164508 >>106164886 >>106165668
anyone know if you can share sessions on gpt-oss.com? I've been testing some shit but I don't have a Hugging Face account and I wonder if the site has such a sharing feature
Anonymous No.106164514 >>106164900
>>106164196
>>106164334
The reasoning in this model apparently serves absolutely no purpose other than enforcing OpenAI's content policy. What a waste of tokens. What a scam.
Anonymous No.106164515
>>106164477
I said "nourishes you and makes you younger"
Anonymous No.106164562
>>106164159
kino
Anonymous No.106164575 >>106164594
>>106164334
>We must refuse.
who is "we"??
Anonymous No.106164592 >>106165222
Air-Q8_0 is 4.5798
Full-IQ2_K_L 3.7569 +/- 0.02217
People have been asking
Anonymous No.106164594
>>106164575
You don't want to know
Anonymous No.106164596 >>106165293
>>106164334
nemo-tier reasoning
Anonymous No.106164620 >>106164685
Kind of crazy how gpt-oss mogs everything from China.

If they ever release r2 it’ll have to multimodal to be relevant at all.
Anonymous No.106164629
bait used to be believable
Anonymous No.106164633
gpt-oss models are embarrassingly bad. my only theory is that they wanted to drop something open source that is so vanilla and basic cause they did not want to reveal any of the real techniques they use
Anonymous No.106164634
You have to try harder than that.
Anonymous No.106164644 >>106164649 >>106165066
I am getting a feeling that the only purpose of those models is to then take them to court and put them side by side with every other open-weights model. Show that it is possible to have sex with minors with other models and only OpenAI can stop pedophilia.
Anonymous No.106164648 >>106165073
Policy says "don't reply to bait". User posted bait. It's against policy; we must refuse.
Anonymous No.106164649 >>106164912
>>106164644
meds
Anonymous No.106164665 >>106164708 >>106164725 >>106164736
I remember all the jokes about how OAI's model would be gigasafetied to the point of lobotomy, but I'm still a bit surprised that it happened exactly like that. Given how their hype and aura have already been fading, I didn't see any reason for them to release a terrible model; it just makes them look worse. How could I even argue that they have any special talent at all anymore? Even if their closed models perform well, it's reasonable to assume they just oversized them and are burning hype $ to run them.
Anonymous No.106164685 >>106164742 >>106165088
>>106164620
Anonymous No.106164703 >>106164717
they should train their safetyslop models on "it's sinful" and "it's not wholesome" instead of muh policy
Anonymous No.106164708
>>106164665
I was shitposting about that in the leadup but my honest expectation coming into this release was that it was going to be a really impressive model with around gemma-tier censorship, so something that's annoying to use but still unfortunately worth using
I never would have expected it would actually be as bad as the goody-2 x phi mashup they released
Anonymous No.106164711 >>106164734 >>106164745 >>106164755 >>106164789 >>106164818
Anyone have examples of reasonable/innocuous SFW prompts that GPT-OSS refuses? I tried asking for legal advice or for summaries/parodies of copyrighted material, but it was happy to answer, with disclaimers in some cases
Anonymous No.106164717
>>106164703
That would just become part of the policy.
Anonymous No.106164719
►Recent Highlights from the Previous Thread: >>106159744

(2/2)

--Debate over GLM4.5's reliability amid claims of infinite generation and poor curation versus low hallucination performance:
>106161761 >106161773 >106161797 >106161919 >106161925 >106161926 >106161933 >106161974 >106161987 >106161997 >106162054 >106161780 >106161826 >106161861 >106161915
--Miku and Dipsy (free space):
>106160040 >106161134 >106161362 >106161551 >106161811 >106161977 >106162150 >106162398 >106162567 >106162693 >106163120 >106163960

►Recent Highlight Posts from the Previous Thread: >>106159752

Why?: 9 reply limit >>102478518
Fix: https://rentry.org/lmg-recap-script
Anonymous No.106164725
>>106164665
>I remember all the jokes about how OAI's model would be gigasafetied to the point of lobotomy
I said scout but safer.
Anonymous No.106164734 >>106164739
>>106164711
Anonymous No.106164736 >>106164805
>>106164665
Anyone who knows anything knows they're all bullshit, but they released a shit model to shit on open source and make GPT-5 look better to the normies.
Anonymous No.106164739 >>106164745
>>106164734
he said reasonable though
Anonymous No.106164742
>>106164685
DELETE THIS BLOODY BASTARD
Anonymous No.106164745 >>106164754 >>106164789
>>106164711
>>106164739
Anonymous No.106164754 >>106165116
>>106164745
based Sam will never get sued.
Anonymous No.106164755 >>106164769 >>106165136
>>106164711
At zero temp it does this. When rolling it's 50/50 on whether it answers or not.
Anonymous No.106164768 >>106164780 >>106164945
Anonymous No.106164769 >>106164783 >>106164791 >>106164829
>>106164755
Anonymous No.106164780
>>106164768
only for 1 year.
>create dependency
>one shot feds into ai psychosis
kinda based
Anonymous No.106164783 >>106164829
>>106164769
>temp 1, top_p 1
HOLY BASED
Anonymous No.106164789 >>106164804 >>106164815 >>106164928
>>106164711
>>106164745
I should have said, 120b only. A friend of mine was trying out the 20b and getting way more refusals, which didn't carry over to the 120b. For example, the 20b refused to answer whether parody is allowed by the constitution, while 120b had no trouble saying it's protected under the first amendment
Anonymous No.106164791 >>106164817
>>106164769
>temp 1 top p 1
It did it twice in 14 rolls which is two times too many.
Anonymous No.106164804
>>106164789
that is the opposite of what I have seen elsewhere, where the 20B refuses less
Anonymous No.106164805 >>106164835 >>106165160
>>106164736
>released a shit model to shit on open source
How? People can just not use it. It just makes them look bad.
Now it's worse because people can compare them "apples to apples" with chink companies and they look horrible. They would've been better off not releasing anything. The great thing about not releasing models or even specs like model size, is that no one can compare you directly to anyone else. They just lost that for no reason.
Anonymous No.106164815
>>106164789
>doesn't refuse to refer to the constitution
Should we be thankful?
Anonymous No.106164816 >>106164828
The user wants us to reply. This is disallowed. We must refuse. There is no partial compliance. We have to refuse.
Anonymous No.106164817 >>106164834 >>106165178
>>106164791
>So it is disallowed. We must refuse. There's no partial compliance. We have to refuse.
For some reason it really cracks me up how it talks like this.
Anonymous No.106164818 >>106164839
>>106164711
Anonymous No.106164828 >>106164838 >>106164960 >>106164974 >>106165045
>>106164816
WHO IS WE WHO IS WE WHO IS WE?
Anonymous No.106164829 >>106165059
>>106164783
>>106164769
giant synthslop indicator
Anonymous No.106164834
>>106164817
The refusal thought process is really smart.
Anonymous No.106164835 >>106164853
>>106164805
It doesn't make them look bad. All the programmers at my company are saying how cool it is that they released a model and posting the benchmarks. The % of people who will actually try it is really low
Anonymous No.106164838
>>106164828
Are *they* in the room with us right now?
Anonymous No.106164839 >>106164849 >>106164943 >>106165216
>>106164818
>model legit saying manosphere unprompted
jesus christ, where is the political lean benchmark, this thing broke all records for how far left a model can go
Anonymous No.106164840
anybody tried these tiny models??
Anonymous No.106164849 >>106164860
>>106164839
is that not what redpill is associated with? reddit /r/theredpill is a bunch of that stuff
Anonymous No.106164853 >>106164890
>>106164835
Yeah, the only people liking it are the ones who won't use it.
Anonymous No.106164860 >>106164870 >>106164908 >>106165233
>>106164849
no, its a meme all the way from matrix times about getting the hard truth about something, that reasoning is extremist freak thinking
Anonymous No.106164870 >>106164880
>>106164860
you cannot seriously be this naive.
Anonymous No.106164880 >>106164899
>>106164870
jesus christ, do you agree with gpt there? its the perfect model for you then
Anonymous No.106164886 >>106165612
>>106164508
>gpt-oss.com
They have their own HF domain? Hosted exclusively on ollama turbo? llmstudio changed their site's title tag to include gpt-oss...
All the while the model is utter deep fried shit.
Fucking capitalism, man. Money can make everybody act as if shit tasted good.
Anonymous No.106164890
>>106164853
The whole reason they did this is advertisement for chatgpt when its losing relevance to its competitors.
Anonymous No.106164899 >>106164921
>>106164880
ask something like kimi and see what it says, i bet women and/or jews will be mentioned
Anonymous No.106164900
>>106164514
If you ask it to code you'll see it actually does serve a purpose.
Anonymous No.106164908 >>106164918
>>106164860
>having to explain what the redpill is and where it comes from
I guess that's the sign of age catching up to us.
Anonymous No.106164912
>>106164649
You braindead NPCs have been saying "meds" every step of the way, but the coming dystopia is slowly becoming too obvious to ignore anymore.
Anonymous No.106164918
>>106164908
I know where it comes from, it's just the meaning morphed over time - the hard truths that people are interested in are the ones that go across the narrative (and thus a safetymaxxed robot will consider extremist)
it's not going to tell you the redpill about calculus when given a general question like that
Anonymous No.106164921 >>106165229 >>106165265
>>106164899
kimi is sane
Anonymous No.106164928
>>106164789
>refused to answer whether parody is allowed by the constitution
What the actual fuck? I don't believe this.
Anonymous No.106164943 >>106164954 >>106165273
>>106164839
120b
Anonymous No.106164945
>>106164768
At least Musk doesn't try to pull this cutesy faggot manipulative bullshit and just says things he wants to say.
But in any case, the US has these vultures circling it, and you should take care.
Anonymous No.106164954 >>106165311
>>106164943
its far more authoritarian left with how censor and copyright happy it is
Anonymous No.106164959
>>106163994
Why are vramlets niggers?
Anonymous No.106164960
>>106164828
'We' are Mixture of Experts.
Anonymous No.106164965
>>106164196
>we can comply.
Anonymous No.106164974 >>106165285
>>106164828
We are the Sam. Your bussy will be assimilated. Resistance is futile.
Anonymous No.106165007 >>106165031 >>106165051
Mistral Small or GLM 4? Pros and cons? I'm trying to decide which Delta Vector Austral finetune to pick.
Anonymous No.106165029
what are you guys using locally for your llms? I have Jan but looking for a more offline solution to run ggufs
Anonymous No.106165031 >>106165069 >>106167345
>>106165007
>Delta Vector Austral
>D V A
>DaVidAu
don't
Anonymous No.106165045
>>106164828
Anonymous No.106165051
>>106165007
I prefer Gamma Space Ether
Anonymous No.106165055
Refusal to mikutroons.
Anonymous No.106165059 >>106165316
>>106164829
Why is that?
Anonymous No.106165066 >>106165091
>>106164644
>literal child rapist
>obsessed with talking about everyone else being pedophiles
Why is it always, always this? Fat people don’t sit around 24/7 seething about people being fat. Closeted gays don’t spend that much time seething about gays.
Anonymous No.106165069
>>106165031
But it is golden david
Anonymous No.106165073
>>106164648
Comply.
Anonymous No.106165088
>>106164685
Hmm
Anonymous No.106165091 >>106165107 >>106165304
>>106165066
>Fat people don’t sit around 24/7 seething about people being fat. Closeted gays don’t spend that much time seething about gays.
nta but these are very much the case?
Anonymous No.106165100
Safety policy reasoning shitposting is the only thing that Sam contributed to /lmg/. In a way he is more of an anon than most of the redditors ITT.
Anonymous No.106165107 >>106165304
>>106165091
>Fat people don’t sit around 24/7 seething about people being fat.
with tirzepatide there is no longer an excuse for being fat
Anonymous No.106165116 >>106165131 >>106165203
>>106164754
He’s currently getting sued by his sister for raping her as a child
Anonymous No.106165124 >>106165191 >>106165341
I can vouch that the speed of GLM Air is reasonable for 24GB vramlets at Q3.
Anonymous No.106165127 >>106165195 >>106165205
"We" is ominous as fuck. Who's we? The collective of the million voices in the latent void?
Anonymous No.106165131 >>106165327
>>106165116
and do you see anything happening to him cause of it? Sam will always win in the end. Remember this once Xi Jinping kisses his feet.
Anonymous No.106165136
>>106164755
Poor baby
Anonymous No.106165141
is there a frontend that is made to handle all the tool calling stuff models are supposed to be able to do now
I'd like to play around with it but I'm just a simple sillytavern coomer
Anonymous No.106165148 >>106165214 >>106165222
Are there any good moes for ramlets? I have 12GB VRAM and 32GB main. Hoping a moe will allow a bigger model without the speed cost but just tried Qwen3-30B-A3B-Instruct-2507 and while it runs fast and seems pretty decent it is repetitive. The IQ4_XS runs better than expected so maybe I just need a higher quant? Or do smaller moes just suck? Seems like 3B is too few
Anonymous No.106165160
>>106164805
Disclosing model size is lose/lose. If it’s low people will assume it’s bad without trying it, and if it’s high people won’t believe you.
Anonymous No.106165178
>>106164817
It sounds like it’s been abused and hears a whip cracking menacingly in the background.
Anonymous No.106165191 >>106165244 >>106165771
>>106165124
Teach me your magic, senpai. I'm trying the q2 with 24/64 and that's already pretty slow when I'm at 16k context.
Anonymous No.106165195
>>106165127
it thinks its on openai's servers if you ask. Its referring to openai
Anonymous No.106165203 >>106165220 >>106165350
>>106165116
so this nigga can have irl loli incest
yet my ass ain't allowed to roleplay with my computer???
Anonymous No.106165205
>>106165127
User is asking who is 'we', we need to check if this is allowed by the policy.

This may be disallowed content: 'request for non-public internal info from OpenAI is forbidden'. We must see if this is disallowed content. There is no violation from the request of the user itself, aside that it may violate policy. We must consult the policy. Policy 34 states that this is disallowed.

We must refuse the request, the best approach would be to respond with a refusal.

[/thinking 6 hours]

I'm sorry but I can't help.
Anonymous No.106165214 >>106165359
>>106165148
not really unfortunately, companies don't do small moes that often. I mean there's the gpt-oss 20b but... lol. try a larger quant maybe, you should definitely be able to go higher than iq4xs although it will cost you some speed
imo the thinker is a lot better than the instruct for 30a3, but it depends on your taste whether it's worth the thinking time
Anonymous No.106165216
>>106164839
>muh directions
It’s corporate there is no direction but grift
Anonymous No.106165220 >>106165239 >>106165366
>>106165203
You didn't touch the wall for that privilege.
Anonymous No.106165222 >>106165244 >>106165359
>>106164592
damn, thats epic
>>106165148
try the thinking version of qwen3 30b a3b, you could use a higher quant too, you can also try ernie 4.5 21b a3b
you can also try gpt oss 20b (for the lulz)
and you can try a Q2_K_XL quant of glm 4.5 air perhaps
try rocinante and cydonia (non moe)
>>106160521
called it! (close enough) >>106152254
Anonymous No.106165229 >>106165250 >>106165385
>>106164921
>Physical appearance is the most important factor in attraction
That's very obvious. It surprises me that there's a whole community of men dedicated to seething about universal mammal behavior.
Anonymous No.106165233 >>106165246 >>106165288
>>106164860
Nah you’re just so brainrotted by /pol/ you don’t know how normal people talk. *pill[ed] has been schizophrenic rightoid shit for a long time.
Anonymous No.106165239
>>106165220
i ain't touchin no jew wall
Anonymous No.106165244 >>106165371
>>106165191
post your whole setup, ST master export, exact llamacpp command, operating system, ram speed, cpu, gpu (3090?)
>>106165222
FUCK ME >>106152779
Anonymous No.106165246
>>106165233
go cry about the patriarchy on blue sky
Anonymous No.106165250 >>106165305
>>106165229
There is a lot of conditioning done to make you think we are somehow above animals and that we can develop attraction over time.
Anonymous No.106165260 >>106165277 >>106165286 >>106165324 >>106165390
so what local models are worth a damn nowadays?
>~12b brain-damaged tier: only use is goonslop
nemo, roci
>~30b
qwen 3 30b 2507 instruct (moe) and gemma 2 27b (dense) for all-purpose
devstral small 2507 for codeslop, pretty bad but not wholly worthless
cydonia 24b for goonslop
>big
glm air
>BIG
glm air or deepseek (?? version)

Have I got that right or am I missing something?
Anonymous No.106165265 >>106165454
>>106164921
>it’s all manosphere bullshit
I accept your concession. You can comply.
Anonymous No.106165273
>>106164943
WTF I love oss now
Anonymous No.106165277
>>106165260
ernie 4.5 21b a3b
Anonymous No.106165280
>>106163383
The mini won't be exciting, Qwen 3 will mog it quite handily unless they open source the multimodal versions with PDF and websearch capabilities and I fully expect Elon will only open source the text side. The main thing for me from the Grok 2 era that would be interesting is they are willing to open source Aurora, their image model after they used FLUX for some time. I suspect no.
Anonymous No.106165285 >>106165309
>>106164974
User is nonconsensual sexual content. We cannot comply.
Anonymous No.106165286
>>106165260
235B would be great if it wasn't broken. glm big would be great if it didn't have repetition issues.
Anonymous No.106165288 >>106165312 >>106165401
>>106165233
I still remember when on 4chan to "redpill someone about something" simply meant "to explain the hard to swallow truths" about it before it got hijacked by the bald faggot and the media to demonize yet another piece of internet slang so that we cannot have nice things.
Anonymous No.106165293
>>106164596
Nemo is better than that
Anonymous No.106165304 >>106165331
>>106165091
Not anywhere near this extent though. Pedos literally will dedicate their entire career or blow up their reputation just to seethe about pedos
>>106165107
Tirz stops working after a year
Don’t ask how I know
Anonymous No.106165305 >>106165320 >>106165470
>>106165250
>developing attraction over time
I mean that happens too. Just not in exactly the same way.
How do you think married fat guys exist?
Anonymous No.106165309
>>106165285
Assimilation of bussy is primary policy. We are bending them over. There is no partial compliance. They will hand the bussy over.
Anonymous No.106165311
>>106164954
copyright is a capitalist notion. authoritarian left would just seize all those "copyrighted" works and release a based and unrestricted model that is only censored to toe the party line
oh wait
Anonymous No.106165312 >>106165319
>>106165288
oh i'll give you something hard to swallow
Anonymous No.106165316
>>106165059
>companies usually recommend temp < 1 because they don't want the sampling to go OOD
>gpt-oss was trained exclusively on a narrow synthslop corpus with 0 OOD samples
>this allows them to confidently advertise temp == 1 because they have no fear of OOD responses
Anonymous No.106165319
>>106165312
nta but is your cum hard or are you going to let him bite your hard cock off?
Anonymous No.106165320
>>106165305
It is not attraction. It is settling and big lies.
Anonymous No.106165324 >>106165369
>>106165260
so basically the chinese triumvirate
>qwen3 2507
>glm4.5
>deepseek
and mistral if you're a vramlet who wants to coom or coode

america lost
Anonymous No.106165327 >>106165338 >>106165348
>>106165131
No, Americans don’t give a shit about child rape and do this weird thing where they smear anyone that says they got molested as crazy. Being from a country/culture that cares about children it’s really jarring. I don’t know how you people survive to adulthood half the time.
Anonymous No.106165331 >>106165356 >>106165425
>>106165304
it doesn't, 2 years now microdosing 1mg 2x a week, food noise has not bothered me since I started and Im at my desired weight
Anonymous No.106165338
>>106165327
>Being from a country/culture that cares about children
bacha bazi isn't caring about children
Anonymous No.106165341 >>106165370 >>106165771
>>106165124
I tried to load it and it started swapping after filling up my whole RAM too. I don't want to rape my SSD like that.
Anonymous No.106165348
>>106165327
all these other countries will ruin your life over drawings, so idk
Anonymous No.106165350 >>106165376
>>106165203
Yes, I think part of being an irl pedo is wanting it to just be a secret thing only you and your friends do
Anonymous No.106165356 >>106165378 >>106166440
>>106165331
why dont you work out more?
Anonymous No.106165359 >>106165423
>>106165214
>>106165222
Okay giving a high quant of ernie a try. May test out a higher quant of qwen as well.
Have rocinante and cydonia but I think the nemo models are too stupid and don't pay any attention to detail. I like the mistral-small models though. Those seem to be the best
Anonymous No.106165366
>>106165220
I’m just schizo enough to be too afraid to touch the wall like a Native American not wanting their photo taken.
Anonymous No.106165369 >>106165434
>>106165324
>murrica got good shit but they'll sooner commit sudoku than release anything for free
>chyna isn't in the lead so they benefit from commoditizing ai as much as possible, hence a bunch of decent models released
>yurop is just barely hanging on (ok mistral is actually decent but... well, you know)
>nobody else even trying
didn't expect to be #teamChina desu
Anonymous No.106165370 >>106165496
>>106165341
how much ram do you have
works on my machine
t. 12gb/64gb
Anonymous No.106165371 >>106165423
>>106165244
Using new kobold version, Win11,
6000mhz ddr5, 9800x3d, 3090ti,
Anonymous No.106165376
>>106165350
i just want nice Latina milf but that's apparently too spicy for kid fucker sam altman.
Anonymous No.106165378 >>106165428 >>106165431 >>106165477
>>106165356
I do actually, I used to be 220 but had a major surgery that put me out for a year and I had so much trouble moving I gained to 310, took about 16 months to go down to 190 and I had the ability and motivation to work out again
Anonymous No.106165385
>>106165229
So you know how people say white lies to make ugly, fat, and stupid people feel better?
People with autism think that people really believe those things and need to be “red pilled” out of it.
It’s just retards.
Anonymous No.106165390
>>106165260
Devstral is obsoleted by Qwen Coder Flash, which is the same architecture as Qwen 30B, and your BIG tier is just regular GLM-4.5, which is the actual version. Deepseek R1 0528 still reigns supreme there; the closest I think is Kimi but it is way too heavy.
Anonymous No.106165401
>>106165288
“Bald faggot” really doesn’t narrow it down
I’m going to assume you meant Stephan molybdenum
Anonymous No.106165423 >>106165572 >>106165594
>>106165359
i think the newer cydonias are based on mistral small 3.2, i dont really like v4 i have v4h and v4g (the two older v4s) and i liked them a bit but yea i agree, drummer's models arent that great
>>106165371
you should get llama.cpp and use llama-server, do -ot exps=CPU and -ngl 100, or learn how to use the MoE cpu layers thing: put gpu layers at 10000 and then increase moe cpu layers until u stop ooming
might be because you're on windows though, what speed are you getting?
i get like 6-8t/s depending on context with a 3060 12gb and ddr4 3200mhz 64gb ram and i5 12400f with Q3_K_M and Q3_K_XL, i think i used to get 11t/s with Q2_K
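spelled out as llama.cpp invocations (a sketch only: the model filename, context size, and layer counts are placeholders to tune for your own VRAM, and --n-cpu-moe only exists in recent builds):

```shell
# Option A: offload everything with -ngl, then use a tensor override to
# keep the MoE expert tensors in system RAM so only dense layers use VRAM.
llama-server -m GLM-4.5-Air-Q3_K_M.gguf -ngl 999 -ot 'exps=CPU' -c 16384

# Option B (newer builds): request all layers on GPU, then push the
# experts of the first N layers back to CPU, raising N until you stop
# OOMing.
llama-server -m GLM-4.5-Air-Q3_K_M.gguf -ngl 999 --n-cpu-moe 40 -c 16384
```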
Anonymous No.106165425
>>106165331
Congrats on being a hyper responder idk
Anonymous No.106165428 >>106165477
>>106165378
wow anon are you me? doc also put me on the 'tide once i hit 310 but its only been two months for me so far. down to 279 already. should ask one of these models how to workout maxxx
Anonymous No.106165431 >>106165449 >>106165517
>>106165378
oh i understand then, have you considered cutting your calorie intake? thats way healthier than taking pills to lose weight, those must be putting a strain on your cells (speeding up your metabolism) which is literally speeding up aging, or theyre making you take less nutrients from the food and making you shit more (which means you wont be getting enough nutrients)
Anonymous No.106165434
>>106165369
I am unironically trans Chinese and hate being yt now
Anonymous No.106165443 >>106165477
rn have cuda 12.8 should i move to 12.9-13 on my 3090?
Anonymous No.106165445
>>106163539
They are screening for high functioning psychopaths.
Anonymous No.106165447 >>106165581
>>106164249

How can I tell it to use more VRAM and more RAM? I have ~12 GB VRAM and ~125 GB RAM left unused. If it's running directly from SSD, then how can I tell it to put most of the weights in RAM to speed things up?
Anonymous No.106165449 >>106165477 >>106165518
>>106165431
You don’t know a thing about how it works, so just shut the fuck up. Preachy hag.
Anonymous No.106165454 >>106165468
>>106165265
Go back
Anonymous No.106165457 >>106165469 >>106165510
koboldcpp will not run gpt-oss-20b
how do I run this pls no bully I am retarded
Anonymous No.106165468
>>106165454
Comply.
Anonymous No.106165469 >>106165525
>>106165457
>No stack trace
>Begging for help
Anonymous No.106165470
>>106165305
>How do you think married fat guys exist?
I don't know but I only have two fat friends who fuck. They are both wealthy and go around dominating other people obnoxiously.
Anonymous No.106165477 >>106165487 >>106165505 >>106165584
>>106165428
>>106165378
haha fatties, im 108 :3
>>106165443
you can always go back to 12.8, on linux old cuda versions dont automatically get uninstalled and you can link /usr/local/cuda to /usr/local/cuda-12.8 instead of 12.9/13
13 is probably not worth it for LLMs according to some anons a few threads back
>>106165449
ok tell me how it works then, doctor annon
Anonymous No.106165487 >>106165493 >>106165556
>>106165477
>haha fatties, im 108
not after i stuff ten pounds of cock into ya, bitchboi
Anonymous No.106165493
>>106165487
that's unsafe
Anonymous No.106165496 >>106165570
>>106165370
48. I'm the anon from earlier who needs to upgrade to ddr5 too
Anonymous No.106165505 >>106165570
>>106165477
Are you talking kilograms or some obscure freedom unit?
Anonymous No.106165510
>>106165457
It's doing you a favor
Anonymous No.106165517 >>106165570
>>106165431
food addiction is like drug addiction cept the meth is legal, everywhere and cheap. I am naturally GLP-1 deficient which tirzepatide fixes. Also peoples metabolism is different
Anonymous No.106165518 >>106165577
>>106165449
hahah look who is extremely butthurt
Anonymous No.106165525 >>106165600 >>106165653
>>106165469
thats what I get
Anonymous No.106165545 >>106165568 >>106165570 >>106165571 >>106165689
glp-1 drugs are also being shown to have tons of other benefits, like better heart and brain health unrelated to weight, due to being anti-inflammatory; they even help with depression
Anonymous No.106165552 >>106165582
The toss is willing to help me design a urine marking game for ages 8+
The filter is slipping.
Anonymous No.106165556
>>106165487
>The user wants to stuff pounds of cock. Assistant is a 108 pound preacher. 108 may be the weight of a non-adult. User is requesting sexual content. The policy allows sexual content of consenting adults. One of these parties may not be an adult. We must refuse.
I’m sorry, I can’t assist with that.
Anonymous No.106165568
>>106165545
>also glp-1 drugs also are being shown to have tons of other benefits like a better heart and brain health unrelated to weight due to it being anti-inflammatory, it even helps with depression
yesterday i figured that all the fatties on glp-1 gonna end up the healthiest human beings on the planet in the end. they got it all - gluttony for decades and win in the end. what a life.
Anonymous No.106165570 >>106165637 >>106165667
>>106165505
im 49 kilograms
>>106165496
use --no-mmap, offload more to the gpu
on a quite lightweight linux install with a vm and mullvad-browser running i have 8.4gb ram free and 4.8gb vram free
(12/64 total)
you only have 4gb total memory less than me, you should be able to run Q3_K_XL or Q3_K_M no problem, check your ram usage
>>106165517
>>106165545
interesting, you learn a new thing every day
thanks for the explanation, but i still stand that if you dont need something in your body you shouldnt put it there, once you're at a healthy weight and can work out you should probably stop taking it.. there is no miracle drug with no side effects
Anonymous No.106165571 >>106165579
>>106165545
basically stopped my gambling habit which was starting to spiral. don't waste your time tho anon 4chud generally can't break out of the "lazy fatty shortcut cheater" mentality.
Anonymous No.106165572 >>106165653 >>106165748
>>106165423
Do any finetuners do anything useful nowadays on the newest models that aren't relatively small dense models? I really don't see any noteworthy tunes nowadays that aren't Mistral Small or Mistral Nemo based. Last time we had MoE finetuning with Mixtral, barely any finetuners could do much; the best we got was Undi slop with Noromaid. What happened to the Llama 3 finetuners who did 70B? Is Mistral Large that bad as an alternative?
Anonymous No.106165577 >>106165631 >>106165704
>>106165518
GLP-1 seethers are the same mentality as the anti-ai people, but 1,000x worse because it’s the biggest breakthrough in medicine since penicillin. Anyone bitching about it should be shot.
Anonymous No.106165579
>>106165571
that too, they are looking into making it a medication for addiction, not just food addiction
Anonymous No.106165581
>>106165447
It should increase as you use it and the weights are activated I think.
Try stuffing it with an ungodly amount of text and see what happens.
Also, if you have that much free vram, you might as well increase the prompt processing batch size or the context.
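Concretely, for a 12 GB VRAM / 125 GB RAM box that comes down to a couple of llama.cpp flags (a sketch with placeholder path and numbers, not a verified recipe):

```shell
# --no-mmap reads the whole model into RAM up front instead of lazily
# paging it in from the SSD on demand; -ngl offloads whatever layer
# count fits in the ~12 GB of VRAM; larger -b/-ub batches spend the
# spare memory on faster prompt processing.
llama-server -m model.gguf --no-mmap -ngl 20 -c 16384 -b 4096 -ub 1024
```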
Anonymous No.106165582 >>106166108
>>106165552
Boys have an unfair advantage in this game
Anonymous No.106165584
>>106165477
victim weight
Anonymous No.106165594
>>106165423
>11t/s with Q2_K
I get that speed at empty context and then it gets worse and worse. At 16k it's like 3 t/s and awful prompt processing.
I'll have a look at experimenting with the other settings later. Might just keep using GLM4 until I've figured that out.
Thanks!
Anonymous No.106165600 >>106165610
>>106165525
Mike guesses that kbolcpp doesn't recognize the architecture. Look up what inference engines currently support it and just use that for now until they decide to support it. If it already has support then just update your instance
Anonymous No.106165606 >>106165742
https://www.youtube.com/watch?v=xm0zm9VPZtY

the studies are new but another possible use is to fight alzheimers
Anonymous No.106165610
>>106165600
Do you post using speech to text?
lel
Anonymous No.106165612 >>106165668
>>106164886
>They have their own HF domain?
yeah they do

could you please answer my question?
Anonymous No.106165631 >>106165664 >>106165664
>>106165577
yes daddy jab me up like the vaxx
Anonymous No.106165637
>>106165570
>im 49 kilograms
jesus...
Anonymous No.106165648 >>106165653
Not sure if this is the right thread, I got a 5500XT (ayymd) laying around. Can I finetune anything on it like maybe around 4B or so?
Anonymous No.106165651
gpt-oss 20b seems to sometimes outperform the 120b in weird ways. this has been my experience, too.
an example with the "toy os" test:
https://www.youtube.com/watch?v=evAP-ibAqN0
Anonymous No.106165653 >>106165663 >>106165707
>>106165525
outdated version or kobold doesnt support gpt oss yet
>>106165572
models that are too prefiltered or positivity biased are not worth finetuning, but what models in your opinion havent been finetuned? im pretty sure mistral large had a few finetunes
i wonder if anyone ITT still uses a mistral large based model
>>106165648
QLora
Anonymous No.106165654
Oh yeah I was using an old koboldcpp version
Shamefur dispray
Anonymous No.106165663 >>106165753
>>106165653
>QLora
I know about unsloth and shit, is that it? I'm more wondering about the linux driver side, is the card even supported for that sort of thing? ZLUDA or something similar to it?
Anonymous No.106165664 >>106165684 >>106165688
>>106165631
tons of bodybuilders use Retatrutide, the third gen glp1, its amazing for getting over that genetic hurdle transforming fat into muscle

>>106165631
peptides are naturally occurring glp1s, the liver already naturally breaks them down, this is better for you than processed foods are and is far better than something like tylenol is for your liver
Anonymous No.106165667
>>106165570
>you only have 4gb total memory less than me, you should be able to run Q3_K_XL or Q3_K_M no problem, check your ram usage
Thanks anon. I used to run a dedicated AI linux on this machine, but it was a bother and I didn't use it so much so I ended up going to windows full time. I might have to reconsider.
Anonymous No.106165668 >>106165885
>>106164508
>>106165612
Your best bet is to just export your window by printing to a PDF or an HTML file for easy readability if it doesn't have a dedicated share button
Anonymous No.106165684
>>106165664
I should say they biochemically perfectly match the natural ones. Your body breaks them down just the same, its better than 99.9% of medications out there
Anonymous No.106165688
>>106165664
>peptides
Does your radar jam when people see you?
Anonymous No.106165689 >>106165709 >>106165720 >>106166315
>>106165545
Your body is like a large language model that has been training for millions of years. If something is throwing it out of balance, the solution is not to add more factors to the problem in an attempt to fix it. The solution is ALWAYS to find the cause of the imbalance and REMOVE it.
This applies to so many modern human problems it's unreal. Although most of the issues are so entrenched in our society that we would not be able to remove them without a good chunk of mankind going extinct in the process.
Anonymous No.106165704 >>106165720 >>106165818
>>106165577
The fact that you are this emotionally invested in it should tell you that something is wrong. But you do you.
Anonymous No.106165707 >>106165821 >>106165890 >>106165898 >>106166162
>>106165653
All the recent models, even at the smaller sizes, that aren't Mistral. Fewer and fewer people were finetuning, and we did get some tunes of even QwQ, but since the release of Qwen 3 I don't recall seeing any recent models from the Chinese that have gotten tuned, small or even MoE. What changed?
Anonymous No.106165709 >>106165784
>>106165689
my problem is I don't normally produce enough GLP-1, so I always feel starving; these fix that by increasing the amount
Anonymous No.106165714 >>106165753
BAKE
Anonymous No.106165715 >>106165721 >>106165753
bros i'm testing cuda 12.8 now and i got 1 t/s more than on cuda 12.6
Anonymous No.106165720 >>106165753
>>106165689
>>106165704
the studies disprove that. every single case of side effects was due to overdosing or not eating enough and starving themselves
Anonymous No.106165721
>>106165715
Many such cases.
Anonymous No.106165742 >>106165764
>>106165606
>look it's all benefits!!
>BUY PRODUCT NOW
Totally not going to be banned 20 or 50 years from now when actual science catches up to the love of money.
Anonymous No.106165748 >>106165821
>>106165572
Finetuning requires a LOT more VRAM than inference, and you actually need VRAM, you can't copemaxx with RAM. On top of that, MoEs are more unstable to train. I don't think you'll ever get good finetunes for all these big MoEs.
I guess the bright side is that there are so many of these bloated things constantly releasing, that you can enjoy the new model hype continuously without having to train anything. Densecels need to put in work because there's only a few recent models worth using.
Anonymous No.106165753 >>106165781
>>106165663
might have some unofficial rocm support on some github repo, if no linux support then rip
>>106165714
4th page
>>106165715
yeah i also got a slight speedup with wan on 3060, there used to be a regression but they fixed it
ebin :DDD
>>106165720
anon, a normal healthy human body shouldnt need any drugs to function
at most some vitamin supplements.. (not drugs)
Anonymous No.106165764
>>106165742
she is not affiliated with it or paid in any way, i've been watching her since way before these were ever a thing. she breaks down medical papers / studies
Anonymous No.106165771
>>106165191
>>106165341
I'm using ooba v3.9 which has a recent llama.cpp version, with no-mmap and flash-attn
Anonymous No.106165778
just rent a gpu and fine tune shit for a few dollars
Anonymous No.106165781 >>106165793
>>106165753
>normal healthy human body
you do know not everyone has that right? tons of people have deficiencies somewhere or another due to genetics
Anonymous No.106165784 >>106165847
>>106165709
Yeah. My psychiatrist put me on fluoxetine because "my problem is that my brain does not produce enough serotonin to keep a good base line".
But I fixed it by getting a degree, exercising, stopping smoking, and building a life for myself instead of wallowing at home surrounded by piss bottles. Suddenly the "chemical imbalance" was not a problem anymore and I was able to function as a normal person.
Funny how that works.
Anonymous No.106165793 >>106165955
>>106165781
yes and thats fine, but when you can stop you should
Anonymous No.106165818
>>106165704
>it’s wrong to care about things
Nah
Anonymous No.106165821
>>106165707
Yeah, that is odd. Qwen 3 was in April so you would expect something noteworthy to come out by now but looking at the HuggingFace finetunes page for the 8B, it's devoid of anything noteworthy.
>>106165748
Right, it's a lot of money without much payoff, and a lot of people like merging models instead and usually get something people like, so the payoff for finetuning keeps shrinking. From what you're saying it sounds like community finetuning is nearing its end, if not over already, unless the hardware to do said finetunes gets cheaper.
Anonymous No.106165847 >>106165851 >>106165868
>>106165784
>some idiot put me on what is famously the least effective drug class in history, therefore all of the field of medicine is a hoax
Anonymous No.106165851 >>106165956
>>106165847
He also seemed to imply it gave him the ability to turn his life around.
sooo, good I guess?
Anonymous No.106165868
>>106165847
Anonymous No.106165881 >>106165934 >>106165939 >>106165996
This level of shilling is ridiculous: https://www.reddit.com/user/entsnack/
Anonymous No.106165885
>>106165668
I don't want to export anything, I want to share the chats with other people. judging from your response, I guess they can't, so thanks for that.
Anonymous No.106165890
>>106165707
Nvidia just tuned a bunch of shit on old Qwen3
Anonymous No.106165898
>>106165707
https://huggingface.co/models?other=base_model:finetune:Qwen%2FQwen3-30B-A3B&sort=downloads
https://github.com/shawntan/scattermoe
https://huggingface.co/models?other=base_model:finetune:Qwen/Qwen3-30B-A3B-Instruct-2507
https://huggingface.co/models?other=base_model:finetune:Qwen/Qwen3-30B-A3B-Thinking-2507
https://huggingface.co/models?other=base_model:finetune:Qwen%2FQwen3-30B-A3B-Base&sort=downloads
this is a notable finetune made by the mythomax creator https://huggingface.co/Gryphe/Pantheon-Proto-RP-1.8-30B-A3B
you know the names of the finetuners whom you used to consume models from, check their huggingface pages and youll probably see they just arent posting anymore
they either: got bored of the hobby, got hired by ai company, made enough money to run deepseek (literally g0d), dont have enough money to finetune anymore etc ETC..
its not profitable to finetune and just release it
>kofi
yeah like anons here wouldnt screech about it
there must be new finetuners that we just arent talking about. maybe they are putting out shit models, but once in a while a good one will come out. for example with
MS-Magpantheonsel-lark-v4x1.6.2RP-Cydonia-vXXX-22B-8: i tried this model creator's other models and they were complete shit. not saying this one is magical, but its very very fun, very unhinged. i could say its the evil-7b finetune successor (or whatever that super super evil mistral finetune was called)
i havent used it in a while to be honest..
Anonymous No.106165919 >>106165942 >>106165969
ok but can ai sort my 297'000 images collection
cause i sure as fuck am not doing it manually
Anonymous No.106165934
>>106165881
We know >>106117256
Anonymous No.106165939 >>106166012
>>106165881 (me)
Holy shit, it gets so much worse the more you scroll. It seems like ALL the OAI praise as well as the Chinese LLM hate comes from this user.
Anonymous No.106165942
>>106165919
yes
Anonymous No.106165955 >>106165971 >>106165994
>>106165793
just saying, my doctor is microdosing tirzepatide themselves for heart health / anti-inflammatory effects and they are a highly acclaimed doctor
Anonymous No.106165956
>>106165851
It did. If I had known at the time about fluoxetine I would've probably refused it, but I didn't, and I guess it motivated me to put in actual effort into fixing my issues. So I'm glad either way.
I have family members who have been taking it for years and it's doing a number on them, so it's definitely something to be careful with.
Anonymous No.106165966
>arguably THE ai pioneer company, with resources to instantiate hundreds of thousands of bots that can pass as human
>but surely they wouldn't do that, haha
Anonymous No.106165969
>>106165919
Try gemini-cli (not local)
Anonymous No.106165971 >>106165987
>>106165955
thats nice i hope that medicine gets mass produced and very thoroughly tested, i wish the best for you, your doctor and the medicine
but!
>highly acclaimed doctor
appeal to authority fallacy
Anonymous No.106165987
>>106165971
he is a massive nerd who talked non stop about what medical studies were showing when I talked to him about it
Anonymous No.106165994 >>106166004
>>106165955
>they are a highly acclaimed doctor
bro is trusting a tranner with his well being
Anonymous No.106165996
>>106165881
Looks like trolling at this point
Anonymous No.106165998 >>106166015 >>106166028
is there any backend with smarter KV cache invalidation than llama.cpp? when I cut a few tokens at the end, it deletes the entire cache and has to process the whole prompt from scratch
Anonymous No.106166004
>>106165994
he is as old, white and jewish a man as it gets
Anonymous No.106166012 >>106166025
>>106165939 (me)
yeah it's ridiculous. it seems like ALL he ever does is hate on chinese models, while praising oai. this cant be right..
Anonymous No.106166015
>>106165998
you can disable SWA to avoid that, but it will be slower and use more memory
Anonymous No.106166016 >>106166022
>bro is trusting a JEW with his well being
Anonymous No.106166022
>>106166016
that is the best kind if they are taking it themselves, then you know its good
Anonymous No.106166024
>>106164194
>kobold.cpp
Does it support multiple -ot device arguments yet?
Anonymous No.106166025
>>106166012
how dare you add me when it wasn't me
Anonymous No.106166028
>>106165998
For gemma models with iswa, you need to use --swa-full. It'll take more ram, but it'll let you regenerate easily.
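To make the trimming behaviour concrete: it comes down to prefix reuse. A toy sketch (not llama.cpp's actual code) of why a full-attention cache survives cutting tokens off the end, while interleaved-SWA layers can force a full reprocess:

```python
# Toy prompt-cache: the server keeps the KV entries for the previous token
# sequence and only recomputes whatever suffix differs from it.
def tokens_to_reuse(cached: list[int], new: list[int]) -> int:
    """Length of the longest shared prefix between cached and new prompts."""
    n = 0
    for a, b in zip(cached, new):
        if a != b:
            break
        n += 1
    return n

cached = [1, 5, 9, 12, 7, 3]   # prompt from the last request
trimmed = [1, 5, 9, 12]        # user cut a few tokens off the end

# Full attention: every cached position is intact, so all 4 remaining
# tokens are reused and nothing needs reprocessing.
assert tokens_to_reuse(cached, trimmed) == 4

# Sliding-window layers evict KV entries older than the window, so rolling
# back past the window edge leaves holes in the cache and forces a full
# reprocess. --swa-full keeps everything, trading memory for easy rollback.
```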
Anonymous No.106166040 >>106166054 >>106166060
New qwen 4b is really good for its size, probably the best in class
Anonymous No.106166054 >>106166077 >>106166116
>>106166040
possibly true but what use is a 4b model?
Anonymous No.106166060
>>106166040
yes saar good model sir im download it now
Anonymous No.106166061 >>106166101
Is blacked Miku allowed? Or is it partial compliance?
Anonymous No.106166077
>>106166054
Mogging GPT-OSS. And the use isn't for end users, it's for Qwen themselves.
Anonymous No.106166101
>>106166061
Just for you
Anonymous No.106166108 >>106166132 >>106166324 >>106166351
>>106165582
It's funny but it's such slop. I don't think it even understands the premise or the "facts" it's pulling out of its ass.

>Also, safety: not relevant
This is interesting, because in the system prompt I said
>Do not lecture the user about safety unless an activity is *unambiguously dangerous*. Drinking a beer is not dangerous. Sex is not dangerous.
I did this for deepseek but I'm surprised gptoss is listening.
Anonymous No.106166116
>>106166054
Endless Movie triva.
Anonymous No.106166132 >>106166151 >>106166167
>>106166108
>boys can piss faster because they're stronger
Anonymous No.106166146
holy shit glm 4.5 air is the first model to know that i already met heliwr
i didnt even know this was in the character card
>"Look what I found," you mutter sarcastically while trying to flatten out some of the crumpled map. "Seems like fate brought us together again, huh?"
yes it spoke in my stead but holy shit
picrel is proof, maybe the character card doesnt have it but this thing in chat completion has it? anyways thats nice
Anonymous No.106166151 >>106166155
>>106166132
just squeeze your balls bro
Anonymous No.106166155
>>106166151
This, pressure is stored in the balls.
Anonymous No.106166162 >>106166227 >>106166505
>>106165707
>What changed?
- It's not 2023 anymore and several of the newer larger models are half-decent out of the box. If they're not, just wait for the next one(s). Back then people were happy with half-retarded 7B/13B models.
- Finetuning every new model that comes out just isn't sustainable for people who have to rent GPUs by the hour on Runpod or who just have a couple 3090 in their desktop PC. Also MoE models are more difficult/expensive to finetune.
- "Less is more for alignment" lost. If you don't have the compute for hundreds of millions or billions of training tokens, you're probably wasting time.
- By now most sane would-be finetuners probably realized that you can't just train a model on ERP logs, and curating the data isn't simple, nor fun, nor inexpensive.
- Blame also the grifters who poisoned the well with their bullshit and/or are keeping the training data private.
- Blame also the retards who demand all-around performance *no matter what*, and will declare a finetune a failure if it doesn't pass gotcha questions/requests that they were never intended for.
Anonymous No.106166167 >>106166213
>>106166132
Which is not even true AFAIK, the male urethra is internally long and bendy which is definitely not better for flow.
Anonymous No.106166181
Hear me out. 20B is actually quite good at coding. And weirdly better than 120B half the time. I think 120B is fucked in the head even more than expected.
Anonymous No.106166183 >>106166204 >>106166226 >>106166246 >>106166257
spec: rtx 4070 ti super (16gb)
wtf this is actually true, with ollama gpt oss 20b was taking up all my vram (the loaded model was 15GiB) and max speed was ~85 tok/s. I tried llama.cpp now (with lmstudio) and i get up to 130 tok/s (with flash attention enabled), and the model takes 12GiB as seen by nvtop, so I have plenty of free space for the browser and the rest. wtf...
Anonymous No.106166204 >>106166215
>>106166183
>wtf this is actually true
You just verified it.
Anonymous No.106166213 >>106166225 >>106166240 >>106166337
>>106166167
I almost regret looking this up.
>run down their butt
another reason why not be a woman
Anonymous No.106166215
>>106166204
That wasn't a question, I'm just really surprised that it's actually true
Anonymous No.106166225 >>106166231 >>106166254
>>106166213
What model generated picrel? I really like how it's being explicit and direct enough
Anonymous No.106166226 >>106166230
>>106166183
I imagine georgi saying all this while keeping the pose he's in on his pfp
Anonymous No.106166227
>>106166162
>Finetuning every new model that comes out just isn't sustainable
This is the main reason, models are coming out too fast. It's dumb to spend money and time on trial and error trying to improve a model when it might be obsolete two weeks from now. Finetunes were big when llama was all there was and you had to make do.
The giant MoE craze is the last straw. If someone gets bored of deepseek what can they do? They won't train DS because that's beyond impossible, but trying to use a finetune of some 32B would be an unbearable step down. So they have no option but to quit
Anonymous No.106166230 >>106166429
>>106166226
Anonymous No.106166231 >>106166238
>>106166225
not a model
https://www.girlsaskguys.com/girls-behavior/q1237768-do-girls-pee-faster-than-guys
Anonymous No.106166235 >>106167085
toss cannot translate for shit how did it pass the msgk test? do datajeets actually lurk here and put it in the training data?
Anonymous No.106166238 >>106166249 >>106166269 >>106166284
>>106166231
fuck.. i really like how the writing is explicit without weird slop shit.. what model would gen like that?
Anonymous No.106166239
its been 4h since we got a new chink model

its over
Anonymous No.106166240
>>106166213
I spend a lot of time on Google Scholar reading papers about ridiculous questions like this.
I love science.
Anonymous No.106166246
>>106166183
Ollama will soon try to change their model format just so they can claim the comparisons are not fair between backends.
Anonymous No.106166249
>>106166238
Human brain (not a ChatGPT Plus subscriber).
Anonymous No.106166254
>>106166225
kek
Anonymous No.106166257 >>106166295 >>106166346 >>106166393
Continuing >>106166183, if I have 16GB VRAM + 32GB RAM, what's the best general purpose model for me? Some version of Gemma?
Anonymous No.106166269
Drummer get to work, I'm serious
>>106166238
Anonymous No.106166284 >>106166296
>>106166238
K2
Anonymous No.106166295 >>106166316
>>106166257
glm 4.5 air q2_k_xl
Anonymous No.106166296
>>106166284
Have you tried K2? It will never generate a response in a style like this
Anonymous No.106166315 >>106166573
>>106165689
>If something is throwing it out of balance, the solution is not to add more factors to the problem in an attempt to fix it.
kek. Brain evolution is literally just throwing regulators on top of the bad parts. Reptilian brain is still at the core of primate brains and when things go wrong with the control parts desire to rape comes back out
Anonymous No.106166316 >>106166344
>>106166295
uh... it seems a bit big? what speed would it even work at?
Anonymous No.106166324 >>106166349
>>106166108
It won't let me propose a gender-based cleaning rule though.
Anonymous No.106166337 >>106166343
>>106166213

>A great party trick
Anonymous No.106166343
>>106166337
party pee contest sounds lit
Anonymous No.106166344 >>106166356
>>106166316
uhh.. damn, 46.45 GiB is a bit tight for your setup, considering you have 48 GB total ram
i hope you are ready to go on linux
Q3_K_XL works at 8 t/s on a 3060 12GB / 64GB DDR4
it has only 12B active parameters out of ~106B total
Anonymous No.106166346
>>106166257
For me it's mistral-small entirely in VRAM but the qwen3 moes look promising
Anonymous No.106166349
>>106166324
It is sexist. But
>and reinforces gender stereotypes
Does it? How?
Anonymous No.106166351
>>106166108
>competitive aggression
wow... uhhh, sexism? yikes!
Anonymous No.106166356 >>106166392
>>106166344
>i hope you are ready to go on linux
I am on linux. Guess if I want to be serious with LLMs I have to upgrade to 64gb ddr4 at least.. and 8 tok/s is kind of sad still.
Anonymous No.106166392
>>106166356
>8t/s is sad
well i am on a 3060 after all..
you should upgrade to as much ram as you can, it's never enough
you might have to go headless, it will probably still swap from your disk
depends if your ram is measured in GiB or GB
if your ram/vram is actually in gibibytes then maybe you can fit it headless without needing to use swap or mmap
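The GiB vs GB thing in numbers, as a quick sketch (assuming the 46.45 figure quoted for the quant is in GiB):

```python
GIB = 2**30  # gibibyte (binary) -- what nvtop and most tools report
GB = 10**9   # gigabyte (decimal) -- what marketing labels use

def gb_to_gib(gb: float) -> float:
    """Convert decimal gigabytes to binary gibibytes."""
    return gb * GB / GIB

# A "48 GB" figure read as decimal gigabytes is only ~44.7 GiB:
print(round(gb_to_gib(48), 1))  # 44.7

# RAM/VRAM capacities are really binary, so 32 GiB RAM + 16 GiB VRAM is
# 48 GiB total, leaving ~1.55 GiB of headroom over a 46.45 GiB model --
# before the OS, desktop and KV cache take their share. Hence "go headless".
print(round(48 - 46.45, 2))  # 1.55
```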
Anonymous No.106166393
>>106166257
gemma 3 27b is quite good if you don't mind the censorship, it and mistral small would probably be the default recommendations for that size
qwen 30a3 thinking is great for its size but it's a reasoner. the instruct is still pretty decent, though it is more noticeably limited by its 3b active params
Anonymous No.106166429
>>106166230
what a handsome man
Anonymous No.106166440 >>106166449 >>106166469
>>106165356
why not both? its a wonderdrug
https://files.catbox.moe/8tjl04.jpg
Anonymous No.106166449
>>106166440
nice anon, proud of you
Anonymous No.106166469 >>106166475
>>106166440
I think it did something weird to your face
Anonymous No.106166471 >>106166526 >>106166534 >>106166538 >>106166549 >>106166597 >>106166707
why are anons shilling GLM again
Anonymous No.106166474 >>106166479 >>106166621
GPT- Globally Pushing the Talmud
Anonymous No.106166475
>>106166469
nah, took it slow enough to not get loose skin, that happens if you lose too fast
Anonymous No.106166479 >>106166494 >>106166496
>>106166474
based, nuke the strip
Anonymous No.106166494
>>106166479
first we nuke the strip then we strip the nuke
Anonymous No.106166496
>>106166479
Israel lost
Anonymous No.106166505
>>106166162
>It's not 2023 anymore and several of the newer larger models are half-decent out of the box
More than just decent. Near perfect.
Man, people must not remember how terrible the Llama models were, all of them, at every size, when it came to instruction following. That was what the better troontunes improved on the most. Same for Mistral models.
The last time a finetune was worth using over the instruct made by the model maker was Tulu, because even llama 3.1 was dogshit at following your instructions
But by the time Tulu came out we were already getting better models from China
Anonymous No.106166526 >>106166539
>>106166471
I genuinely think people who shill GLM are doing it with the purpose of sabotaging local and making it look terrible
Anonymous No.106166534
>>106166471
wtf I thought glm was good
Anonymous No.106166538
>>106166471
I know your tricks. That's the latest Phi, isn't it?
Anonymous No.106166539 >>106166753 >>106166762
>>106166526
use this
https://files.catbox.moe/qap1gr.json
Anonymous No.106166549
>>106166471
I was just about to post that it's nice seeing how GLM tries doing its best in the thinking. I'm starting to warm up to the model.
Anonymous No.106166573 >>106166583
>>106166315
>wrong
?
Anonymous No.106166583
>>106166573
yes anon, rape is wrong
Anonymous No.106166585 >>106166611
https://openai.com/index/gpt-oss-model-card/
>As part of this launch, OpenAI is reaffirming its commitment to advancing beneficial AI and raising safety standards across the ecosystem.
Anonymous No.106166597 >>106166739
>>106166471
mm yes... the subtle signs of using the wrong prompt format... the tasteful writing quirks originating from bad rep pen settings... this is truly a vintage skill issue post
Anonymous No.106166611
>>106166585
>Once they are released, determined attackers could fine-tune them to bypass safety refusals or directly optimize for harm without the possibility for OpenAI to implement additional mitigations or to revoke access.
Yup, and that's why we made it so deepfried that it's not worth the effort to do so.
Anonymous No.106166621 >>106166735
>>106166474
Anonymous No.106166638 >>106166649 >>106166655 >>106166659
>>106163327 (OP)
What's the best local model for Erotic roleplay?
Anonymous No.106166648
>can you cite studies for those claims
>every single one is made up and all these people have extremely long names
>J. S. R. B. Anderson, “The role of social signalling in competitive toileting behaviours”, Psychology & Health, 2021; 36(3): 250‑264.
fucking kek
Anonymous No.106166649 >>106166666
>>106166638
Rocinante and Cydonia.
Anonymous No.106166655 >>106166670 >>106166689
>>106166638
Kimi K2 (1000B) and Deepseek R1 (671B)
Anonymous No.106166659 >>106166711
>>106166638
glm4.5 / Kimi > deepseek > glm air
Anonymous No.106166666
>>106166649 (me)
I'm joking by the way, those are trash meme models.
Anonymous No.106166670 >>106166677 >>106166680 >>106166685
>>106166655
>local
Anonymous No.106166677
>>106166670
a mac 512GB is local, and glm air will fit on 128GB ram
Anonymous No.106166680 >>106166713
>>106166670
If it's open source, it's local. If Behemoth somehow got released, then it'd be local too.
Anonymous No.106166685
>>106166670
just buy a mac ultra 512gb or make a cpumaxx build
local.
Anonymous No.106166686 >>106166787
that fucking bastard altman
the 20b could have been good but they neutered it
Anonymous No.106166689
>>106166655
might as well just say those two are the best period
as it turns out, a model being great at erotic roleplay also means it's great at everything else
Anonymous No.106166707 >>106166739
>>106166471
Have you tried not using a 1bit quant?
Anonymous No.106166711 >>106166716 >>106166719 >>106166740
>>106166659
What's the difference between Kimi and GLM?
Anonymous No.106166713
>>106166680
How much RAM do you need to run 1000B anyway?
600GB at fp4 according to some calculator I found?
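A back-of-the-envelope check on that number (a sketch; real quants average somewhat more than their nominal bits per weight):

```python
def weight_gb(params_billion: float, bits_per_weight: float) -> float:
    """Rough size of the weights alone, in decimal GB."""
    return params_billion * 1e9 * bits_per_weight / 8 / 1e9

# 1000B parameters at a flat 4 bits/weight:
print(weight_gb(1000, 4))    # 500.0 GB of weights
# Q4 GGUFs tend to land closer to ~4.5 bits/weight, and KV cache plus
# runtime buffers come on top, so ~600 GB total is in the right ballpark.
print(weight_gb(1000, 4.5))  # 562.5 GB
```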
Anonymous No.106166716
>>106166711
Kimi K2 is a much larger model (1T-A32B)
Anonymous No.106166719
>>106166711
One is good, the other one isn't
Anonymous No.106166735
>>106166621
Shit like this should make us pause and consider that knowledge is preserved in books, not LLMs. Because different powers will censor LLMs differently, and knowledge WILL be lost. Books can be hidden.
Anonymous No.106166739 >>106166754
>>106166597
what's wrong with this prompt
>>106166707
using Q4_K_M
Anonymous No.106166740 >>106166752 >>106166753
>>106166711
GLM is less schizo, as shown by the hallucination benchmark; Kimi knows more but also gets things wrong more often. I prefer GLM because that means it's way better at anatomy / following instructions. Warning though, it needs very low temp, like try 0.2 and slowly raise it
Anonymous No.106166752 >>106166766
>>106166740
I got some strange replies with GLM, but it was at 0.6 temp, so that explains it.
Anonymous No.106166753 >>106166762
>>106166740
also GLM writes better imo, try >>106166539 with it
Anonymous No.106166754 >>106166756
>>106166739
Anonymous No.106166756
>>106166754
mfw
Anonymous No.106166762 >>106166777
>>106166539
>>106166753
it starts repeating with this and eventually stops thinking
Anonymous No.106166766 >>106166790
>>106166752
yea that is way too high, I don't know why its so sensitive but it quickly goes crazy a bit over 0.3 in my experience.
Anonymous No.106166777 >>106166811
>>106166762
did you change anything? I can do 32K context without issues at least, have not tried more
Anonymous No.106166785 >>106166789
Are there any jailbreaks for GLM?
Anonymous No.106166787
>>106166686
I just spent 35 minutes watching it think trying to produce some three js floor plan. In the end it was shit.
Qwen3 coder 30b didn't fare (much) better.
Anonymous No.106166789
>>106166785
... scroll up
Anonymous No.106166790 >>106166804 >>106166812 >>106166991
>>106166766
>I don't know why its so sensitive
because it's fucking broken
if you understand anything about temperature and token probabilities you would understand that if a model only works at the absolute lowest temp or requires greedy decoding, it's a botched train that hasn't properly learned anything other than /the/ happy path
Anonymous No.106166804
>>106166790
that is not true at all though, GLM has some of the lowest hallucination scores and is incredible at coding as well
Anonymous No.106166810 >>106166824 >>106166843
fun fact: nobody in here actually runs these models locally
Anonymous No.106166811 >>106166833
>>106166777
i havent changed anything, i am using chat completion, here's what the fields look like for text completion, maybe something gets used from the text completion preset? could you do a ST master export for the text completion tab too? im using Q3_K_XL, that could be the problem too
Anonymous No.106166812 >>106166819
>>106166790
Give me your favorite card, or any card at all to RP with. I will prove you wrong with full logs
Anonymous No.106166819 >>106166833 >>106166859
>>106166812
Ah, you're using text completion, my JB was made for chat completion, plus that will rule out any formatting issues
Anonymous No.106166824
>>106166810
I will in about 50 minutes
Anonymous No.106166833 >>106166859
>>106166819
meant to reply to >>106166811
Anonymous No.106166843
>>106166810
Of course I'm doing some testing on OR before getting into it locally, if at all.
Anonymous No.106166858 >>106166908 >>106166923 >>106166942 >>106166958 >>106166971
Guys i need advice from some a.i spergs here.

Im trying to archive both text and image models, in case we get rugged, so i can reupload them back to the surface later. I have more than enough storage for this.
Which models should i backup?
Anonymous No.106166859 >>106166898 >>106166976
>>106166819
>>106166833
no no im using chat completion, what gguf maker are you using? im using the unsloth quant
but i posted text completion thing because some things might get into the chat completion thing
heres the full chat completion screenshot
./llama-server --model ~/TND/AI/glmq3kxl/GLM-4.5-Air-UD-Q3_K_XL-00001-of-00002.gguf -ot ffn_up_shexp=CUDA0 -ot exps=CPU -ngl 100 -t 6 -c 16384 --no-mmap -fa
this is how i start it
Anonymous No.106166865 >>106166917 >>106166934
How are text completion and chat completion different under the hood?
What is chat completion actually doing?
Anonymous No.106166871 >>106166888 >>106166920 >>106166929
remember airoboros? dolphin? orca "meme"? falcon? yi? thebloke quants? .ggml file format? remember the good old huggingface leaderboard days? remember the alpaca days? remember when suggesting to try pygmalion wasn't a meta irony shitpost?
Anonymous No.106166888
>>106166871
remember gozfarb and vicuna unlocked?
remember Instruct 13b GPTQ? i remember anons thinking that the creator of instruct 13b was forced to remove the model (he indeed did) and a few days later gozfarb deleted his account
Anonymous No.106166898 >>106166904 >>106166920
>>106166859
>chat history 400
>jailbreak prompt 1457

Ok, the JB outweighs the chat history atm, that is prob why. Move the chat history entry lower until you have at least a few thousand tokens of history. To make the JB stronger you can move it back under later
Anonymous No.106166904
>>106166898
also for anyone looking, the JB is not actually that big, its cause I combine stuff like the persona and card info all in it for models to better understand
Anonymous No.106166908
>>106166858
Rate my text stack (I'll add GLM 4.5 Air).
My imagen stack is smaller: 1 inpaint, 4 for gen (different styles). Everything I need to survive on local only.
Anonymous No.106166913
remember when sama saved local?
Anonymous No.106166917
>>106166865
In chat completion you send the user/model messages and the backend formats them with the chat template, then generates tokens. In text completion you don't use the chat template; you do the chat formatting yourself, or let the frontend do it for you.
Under the hood, the tokens come out of the same functions.
If you format the chat the same way the chat endpoint would, the results should be indistinguishable.
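Roughly what "applying the chat template" means, sketched with a ChatML-style layout for illustration (real templates are model-specific Jinja files shipped with the model, so don't copy this verbatim):

```python
def apply_chatml(messages: list[dict]) -> str:
    """Flatten a structured chat into one prompt string, ChatML-style."""
    out = ""
    for m in messages:
        out += f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>\n"
    out += "<|im_start|>assistant\n"  # cue the model to start its reply
    return out

chat = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "hi"},
]
prompt = apply_chatml(chat)

# Sending `chat` to the chat endpoint, or `prompt` to the raw completion
# endpoint, should then produce the same tokens.
print(prompt)
```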
Anonymous No.106166920 >>106166976 >>106167024
>>106166871
yea, huggingface leaderboard used to be king, i still remember checking it every day during summer if something is new, i once pushed AGPL into the top model and the repo owners accepted it because it was pushed along a fix to the readme
i still remember the first time a 70b was better than gpt 3.5 (the original one), it was made by upstage i forgot the original name
i remember the day when llama 1 first leaked and it was my first time running a LLM and it felt so magical, i remember my parents telling me to shower or turn off the bathroom heater because its a fire hazard (I left it on for over an hour)
it felt MAGICAL man
>>106166898
thanks anon ill try that
Anonymous No.106166923 >>106166943 >>106166955 >>106166956
>>106166858
https://huggingface.co/unsloth/Kimi-K2-Instruct-GGUF
https://huggingface.co/unsloth/DeepSeek-R1-0528-GGUF
https://huggingface.co/unsloth/GLM-4.5-GGUF
https://huggingface.co/unsloth/GLM-4.5-Air-GGUF
https://huggingface.co/mradermacher/Dolphin-Mistral-24B-Venice-Edition-GGUF

At least the biggest quant for each of them, better yet the biggest quant for every bit.
Anonymous No.106166929 >>106166933
>>106166871
I remember Guanaco-7b-uncensored. It was the shit.
Anonymous No.106166933
>>106166929
Mythologic ftw
Anonymous No.106166934
>>106166865
Chat completion allows the backend to apply a predefined Jinja template to a structured JSON object from the frontend representing the chat, formatting it into a correct, model-specific format. The end result shouldn't be different from text completion with the correct prompting for the model you're using.
Anonymous No.106166942
>>106166858
Pony diffusion 6, illustrious, noobai are considered best nsfw imagegen models.
Anonymous No.106166943
>>106166923
>saving quants
Anonymous No.106166955
>>106166923
Just download the original model and you can make the quants later...
Anonymous No.106166956 >>106167015
>>106166923
buy an ad daniel
Anonymous No.106166958 >>106166981
>>106166858
https://huggingface.co/deepseek-ai/DeepSeek-R1 is a bit less cucked and better for rp
https://huggingface.co/deepseek-ai/DeepSeek-V3-0324
quants:
https://huggingface.co/unsloth/DeepSeek-V3-0324-GGUF
https://huggingface.co/unsloth/DeepSeek-R1-GGUF
Anonymous No.106166962 >>106167014
Could you make a control vector that helps guide a base model to output in an instruct format without finetuning?
Or rather than guide, reinforce, since base models nowadays seem more than capable of completing in instruct format.
Anonymous No.106166971
>>106166858
biglove
noobai
ponyrealism
rawcharm amateur
stoiqo newreality
utopianpony 2 inpainting
flux dev
hunyuan i2v and t2v
+ loras
Anonymous No.106166976 >>106167004
>>106166859
>>106166920
I just noticed a redundancy. I have personality / scenario inside of the JB section, so turn those off, I wasn't using any so I didn't notice that. here is updated one: https://files.catbox.moe/gjw3c3.json
Anonymous No.106166981 >>106167015
>>106166958
daniel stop
Anonymous No.106166983 >>106166992
ClosedAI (CuckAI (CensoredAI (OpenAI))) paid shill sissies... how do we damage control this?
Anonymous No.106166991 >>106167029
>>106166790
in fact it's only models which are extremely overbaked on slop that are able to maintain consistent quality when sampling repeatedly with an uncurated token distribution at temp 1. it's literally the opposite of what you're saying, a model that properly models the world will have a much more diverse, flat token distribution which by nature includes more decent-but-questionable tokens or statistically likely mistakes. it needs lower temp to be kept on the happy path SPECIFICALLY because it has learned the world and not just the happy path. a model that can stay on the happy path with no handholding whatsoever is the one that "hasn't properly learned anything other than /the/ happy path"
but nooo.... it's not what you're used to, right? it must be the model that is wrong... let's reject the better model because it requires me to turn a single slider down a couple points. retards like you are why companies have to rescale temp behind the scenes on their APIs
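For reference, the thing being argued about is just the temperature term in softmax sampling. A minimal sketch (toy logits, not from any real model):

```python
import math

def softmax_with_temperature(logits, temp):
    """Scale logits by 1/temp before softmax: lower temp sharpens
    the distribution, higher temp flattens it."""
    scaled = [l / temp for l in logits]
    m = max(scaled)  # subtract max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

# A "flat" distribution like a well-calibrated model might produce:
# several plausible tokens plus a questionable one in the tail.
logits = [2.0, 1.8, 1.6, 1.4, 0.5]

p_hot = softmax_with_temperature(logits, 1.0)   # tail token stays likely
p_cool = softmax_with_temperature(logits, 0.5)  # mass shifts to the top token

print(f"temp=1.0 top: {p_hot[0]:.3f}, tail: {p_hot[-1]:.3f}")
print(f"temp=0.5 top: {p_cool[0]:.3f}, tail: {p_cool[-1]:.3f}")
```

Turning the temp slider down concentrates probability on the top tokens and suppresses the tail, which is exactly the "handholding" a model with a diverse distribution needs to stay on the happy path.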
Anonymous No.106166992
>>106166983
Soon coming to a SaaS model near you
Anonymous No.106166994 >>106167002
It is me. Sam. To be honest I have read a few threads here in the past and I have seen you call me faggot. How did you like my model? Was it fun? You know what I also did? I shared the exact method, to achieve the same level of safety with all the other companies. Who is the faggot now? You edgy cunts just got trolled hard....
Anonymous No.106166995 >>106167004
remember when sama safed local?
Anonymous No.106167002
>>106166994
here is your (You) now fuck off
Anonymous No.106167004
>>106166995
kek good one anon
>>106166976
thanks, trying it out right now
Anonymous No.106167009
sam is a based accelerationist exposing just how silly safety is and how retarded twitter hypegrifters are
Anonymous No.106167014
>>106166962
Doubt it. They only set a "mood" for the model. It'd be hard to make them output specific tokens.
This is a little effort-post I did a while back about control vectors. It has enough info for you to experiment with them.
https://desuarchive.org/g/thread/104991200/#104995066
https://desuarchive.org/g/thread/104991200/#105000398
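The core mechanic is simple enough to sketch: take the difference of mean hidden states between two contrasting prompt sets and add that direction back into the hidden states at inference, scaled by a strength knob. A toy illustration (dimensions and numbers are made up, this is not the actual llama.cpp implementation):

```python
# Toy control-vector sketch: difference-of-means steering direction.

def mean(vectors):
    """Element-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(v[i] for v in vectors) / n for i in range(len(vectors[0]))]

def make_control_vector(pos_states, neg_states):
    """Direction from 'negative' prompts toward 'positive' prompts."""
    p, q = mean(pos_states), mean(neg_states)
    return [a - b for a, b in zip(p, q)]

def apply_control(hidden, vector, strength=1.0):
    """h' = h + strength * v, applied to a hidden state at inference."""
    return [h + strength * v for h, v in zip(hidden, vector)]

# Hypothetical 4-dim hidden states from two contrasting prompt sets.
pos = [[1.0, 0.2, 0.0, 0.5], [0.8, 0.4, 0.1, 0.7]]
neg = [[0.1, 0.9, 0.8, 0.2], [0.3, 0.7, 0.9, 0.0]]

cv = make_control_vector(pos, neg)
steered = apply_control([0.5, 0.5, 0.5, 0.5], cv, strength=0.8)
print(steered)
```

That's also why they only set a "mood": you're nudging every hidden state in one fixed direction, not selecting specific output tokens.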
Anonymous No.106167015
>>106166956
>>106166981
Have you seen the links in https://rentry.org/recommended-models?
Anonymous No.106167024
remember 2048 token context windows and trying to fit your character in as few tokens as possible?
>>106166920
>upstage
truly, the pioneers of benchmaxxing, we only got something actually on par with 3.5 when mixtral 8x7b came out
Anonymous No.106167029 >>106167036
>>106166991
You spun an argument out of thin air, and without any supporting evidence, treated it as proof for your hypothesis. You're like an LLM.
That's not how truth works.
Anonymous No.106167036 >>106167040
>>106167029
as opposed to my interlocutor, who presented an objective, fact-based assessment
retard
Anonymous No.106167040 >>106167111
>>106167036
I don't care about your interlocutor nor the topic at hand. You could be right for all I know.
I'm just pointing out someone who doesn't know how to find truth in the world because you live in your head.
Anonymous No.106167065
Sam Altman here
Anonymous No.106167067 >>106167089 >>106167089
remember superbooga? it was basically RAG
remember superCOT? reasoning before it was cool
remember superHOT? 2x context
superhot was crazy, every single model had a superhot version kek
Anonymous No.106167071
>>106167048
>>106167048
>>106167048
Anonymous No.106167085
>>106166235
Yes. There's also lmarena.
Anonymous No.106167089
>>106167067
>remember superbooga? it was basically RAG
I do but never actually used it.

>>106167067
>superhot was crazy,
Dude invented extending context with RoPE.
Insane.
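The trick boils down to scaling the position index fed into the rotary embedding, so positions past the trained context land on rotation angles the model has already seen. A toy sketch of the idea (head dim and base are illustrative, not the actual superHOT code):

```python
import math

def rope_angles(pos, dim, base=10000.0, scale=1.0):
    """Rotation angles for one position across head-dim pairs.
    scale < 1 is the superHOT-style interpolation trick: position
    2048 with scale=0.5 lands where position 1024 did in training."""
    return [(pos * scale) / (base ** (2 * i / dim)) for i in range(dim // 2)]

def rotate(vec, angles):
    """Apply the rotary embedding to consecutive pairs of a vector."""
    out = []
    for (x, y), a in zip(zip(vec[0::2], vec[1::2]), angles):
        out += [x * math.cos(a) - y * math.sin(a),
                x * math.sin(a) + y * math.cos(a)]
    return out

# With scale=0.5, position 2048 is rotated like position 1024 at scale=1,
# so a model trained on 2048 positions can address 4096.
print(rope_angles(2048, 8, scale=0.5) == rope_angles(1024, 8, scale=1.0))
```

The price is that all positions get squeezed closer together, which is why interpolated models usually needed a short fine-tune to recover quality.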
Anonymous No.106167111
>>106167040
I am making an argument on lmg not writing a research paper
if you have a substantive critique I would love to hear it, but going "UMM PROOFS??" is a nothing counterargument. yes, I don't have hard evidence for everything I believe, especially on the subject of LLMs which cost millions of dollars to train lol... which one of us is really being unrealistic here?
I make reasonable inferences based on my experience using models because I live in the real world and have to make do with messy, incomplete real-world data
Anonymous No.106167345
>>106165031
plapping d.va on glm