/lmg/ - Local Models General - /g/ (#106163327) [Archived: 23 hours ago]

Anonymous
8/6/2025, 5:17:44 PM No.106163327
file
file
md5: 6c5727784bcc1fd282da04e24aa51dee🔍
/lmg/ - a general dedicated to the discussion and development of local language models.

Previous threads: >>106159744 & >>106156730

►News
>(08/06) dots.vlm1 VLM based on DeepSeek V3: https://hf.co/rednote-hilab/dots.vlm1.inst
>(08/05) OpenAI releases gpt-oss-120b & gpt-oss-20b: https://openai.com/index/introducing-gpt-oss
>(08/05) Kitten TTS 15M released: https://hf.co/KittenML/kitten-tts-nano-0.1
>(08/05) TabbyAPI adds logprobs support for exl3: https://github.com/theroyallab/tabbyAPI/pull/373
>(08/04) Support for GLM 4.5 family of models merged: https://github.com/ggml-org/llama.cpp/pull/14939

►News Archive: https://rentry.org/lmg-news-archive
►Glossary: https://rentry.org/lmg-glossary
►Links: https://rentry.org/LocalModelsLinks
►Official /lmg/ card: https://files.catbox.moe/cbclyf.png

►Getting Started
https://rentry.org/lmg-lazy-getting-started-guide
https://rentry.org/lmg-build-guides
https://rentry.org/IsolatedLinuxWebService
https://rentry.org/recommended-models
https://rentry.org/tldrhowtoquant
https://rentry.org/samplers

►Further Learning
https://rentry.org/machine-learning-roadmap
https://rentry.org/llm-training
https://rentry.org/LocalModelsPapers

►Benchmarks
LiveBench: https://livebench.ai
Programming: https://livecodebench.github.io/leaderboard.html
Code Editing: https://aider.chat/docs/leaderboards
Context Length: https://github.com/adobe-research/NoLiMa
Censorbench: https://codeberg.org/jts2323/censorbench
GPUs: https://github.com/XiongjieDai/GPU-Benchmarks-on-LLM-Inference

►Tools
Alpha Calculator: https://desmos.com/calculator/ffngla98yc
GGUF VRAM Calculator: https://hf.co/spaces/NyxKrage/LLM-Model-VRAM-Calculator
Sampler Visualizer: https://artefact2.github.io/llm-sampling

►Text Gen. UI, Inference Engines
https://github.com/lmg-anon/mikupad
https://github.com/oobabooga/text-generation-webui
https://github.com/LostRuins/koboldcpp
https://github.com/ggerganov/llama.cpp
https://github.com/theroyallab/tabbyAPI
https://github.com/vllm-project/vllm
Replies: >>106163517 >>106166638
Anonymous
8/6/2025, 5:19:00 PM No.106163346
threadrecap2
threadrecap2
md5: 988332b72e4c60540e281cd58340019c🔍
►Recent Highlights from the Previous Thread: >>106159744

--Fundamental CUDA scheduling limitations in MoE model inference with dynamic workloads:
>106159804 >106159879 >106159941 >106159892 >106159939 >106160442 >106160454 >106160634 >106160687 >106160697 >106161203 >106161244 >106161319 >106161343 >106161716 >106161772 >106160704 >106160773 >106160797 >106160960 >106161088
--:
>106161761 >106161773 >106161797 >106161919 >106161925 >106161926 >106161933 >106161974 >106161987 >106161997 >106161780 >106161826 >106161861 >106161915
--Debate over MXFP4 quantization efficiency and implementation in llama.cpp:
>106160230 >106160249 >106160378 >106160405 >106160434 >106160408 >106160455 >106160770
--gpt-oss-120b excels at long-context code retrieval despite roleplay limitations:
>106159798 >106159872 >106159895 >106159919
--Choosing between GLM-4.5 Q2 and Deepseek R1 with dynamic quants on high-RAM system:
>106160040 >106160056
--Comparison of TTS models: Higgs, Chatterbox, and Kokoro for quality, speed, and usability:
>106161046 >106161091 >106161164 >106161335
--GLM-4.5 Air praised for local performance, gpt-oss-120b criticized for over-censorship:
>106159855 >106159875 >106159908 >106159929 >106159946 >106159956
--Prompt-based agent modes with potential for structured grammar improvement:
>106161701
--Anons await next breakthroughs in models, efficiency, and affordable hardware:
>106160460 >106160477 >106160481 >106160487 >106160494 >106160508 >106160524 >106161134 >106161055 >106161071 >106160717
--Skepticism and mockery meet Elon's claim of open-sourcing Grok-2:
>106160521 >106160539 >106160545 >106160579 >106160608 >106160692 >106160744 >106160759 >106160784 >106160913
--DeepSeek V3 with vision shows strong image understanding in early tests:
>106159779 >106159794 >106160580 >106160631
--Miku (free space):
>106160040 >106161134

►Recent Highlight Posts from the Previous Thread: >>106159752

Why?: 9 reply limit >>102478518
Fix: https://rentry.org/lmg-recap-script
Anonymous
8/6/2025, 5:20:51 PM No.106163373
piss
Replies: >>106163426
Anonymous
8/6/2025, 5:21:35 PM No.106163383
the only thing that excites me about the possibility of the grok2 release is actually grok2-mini. I'm gonna guess the full-sized grok2 model will be a 1T-A100B model with the IQ of llama3
Replies: >>106165280
Anonymous
8/6/2025, 5:22:05 PM No.106163389
>>106161679
>her voice a gutteral, erotic promise
Replies: >>106163408
Anonymous
8/6/2025, 5:22:24 PM No.106163392
1748259902681403
1748259902681403
md5: 27da625180a1aeac58813e883f2b2398🔍
remember this?
>our research team did something unexpected and quite amazing and we think it will be very very worth the wait
LOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOL
Replies: >>106163585 >>106163746 >>106163788
Anonymous
8/6/2025, 5:23:08 PM No.106163403
>>106163350
So when it says stuff like "the policy says X is okay. The policy says Y is forbidden", is it actually referencing a specific document?
Replies: >>106163468
Anonymous
8/6/2025, 5:23:34 PM No.106163408
>>106163389
i hope it's a promise of something darker, more primal
Anonymous
8/6/2025, 5:25:02 PM No.106163426
>>106163373
Based
Anonymous
8/6/2025, 5:25:41 PM No.106163430
image_2025-08-06_205537787
image_2025-08-06_205537787
md5: b3f9a1eecad5ac85449b80731011322b🔍
so what are you guys actually doing with these massive models??
Replies: >>106163474 >>106163763
Anonymous
8/6/2025, 5:26:25 PM No.106163442
Are there video local models yet or does that still need supercomputers?
Replies: >>106163746
Anonymous
8/6/2025, 5:26:32 PM No.106163445
cockbench
cockbench
md5: 0d64e6bb15b5d4d098d9ccbfdb7963b5🔍
cockbench is now officially reddit culture with 555 updoots
Replies: >>106163476 >>106163499
Anonymous
8/6/2025, 5:27:12 PM No.106163454
https://huggingface.co/Qwen/Qwen3-4B-Thinking-2507
Replies: >>106163490
Anonymous
8/6/2025, 5:28:25 PM No.106163468
>>106163403
Probably not. That's just what the training examples looked like. And over enough iterations that blurred together with training examples that consist of scrapped forum posts like "YOUR POST HAS VIOLATED THE POLICY NUMBER BLAH BLAH BLAH PAGE 3 OF THE SITEWIDE RULES" etc.
The rulebook doesn't actually exist.
Anonymous
8/6/2025, 5:28:48 PM No.106163474
>>106163430
Piss
Anonymous
8/6/2025, 5:28:59 PM No.106163476
>>106163445
wow for once something trickled down from here instead of the reverse
Anonymous
8/6/2025, 5:31:12 PM No.106163490
>>106163454
What is
> 2507
?
Replies: >>106163746
Anonymous
8/6/2025, 5:32:07 PM No.106163499
file
file
md5: 585af8ab39ef8092c4f2f218c9d55a3a🔍
>>106163445
Drummer, the creator of cockbench, got even more updoots on his post.
https://www.reddit.com/r/LocalLLaMA/comments/1migl0k/gptoss120b_is_safetymaxxed_cw_explicit_safety/
Anonymous
8/6/2025, 5:32:44 PM No.106163504
>>106161701
>I think we could do that a lot better using json schema/BNF grammar.
It seems to work this way already if tool_choice is set to required, at least in vLLM:
guided_decoding = GuidedDecodingParams.from_optional(
json=self._get_guided_json_from_tool() or self.guided_json,
regex=self.guided_regex,
choice=self.guided_choice,
grammar=self.guided_grammar,
json_object=guided_json_object,
backend=self.guided_decoding_backend,
whitespace_pattern=self.guided_whitespace_pattern,
structural_tag=self.structural_tag,
)
There's a function called "_get_guided_json_from_tool".
Anonymous
8/6/2025, 5:32:59 PM No.106163505
please_be_patient_sperg
please_be_patient_sperg
md5: fb9773f662c2cb24673a02f6bb11e6dd🔍
how do I use mikupad with ollama
Replies: >>106163513 >>106163521 >>106163543 >>106163596 >>106163628
Anonymous
8/6/2025, 5:33:35 PM No.106163513
>>106163505
>slowllama
Anonymous
8/6/2025, 5:34:07 PM No.106163517
>>106163327 (OP)
Are the gpt oss models any good?
Replies: >>106163605 >>106163649
Anonymous
8/6/2025, 5:34:13 PM No.106163520
d60vtzhkkehf1
d60vtzhkkehf1
md5: 159733bc12aa5c4021278f7cc78dd451🔍
another reddit gemmie
Anonymous
8/6/2025, 5:34:21 PM No.106163521
>>106163505
>troonkupad
Anonymous
8/6/2025, 5:35:52 PM No.106163539
every time I hear something new about or from anthropic and claude it sounds more and more like an actual sect slash cult
https://news.ycombinator.com/item?id=44806640
>Anthropic has a tough alignment interview. Like I aced the coding screener but got rejected after a chat about values. I think they want intense people on the value/safety side as well as the chops.
>got rejected after a chat about values
>A CHAT ABOUT VALUES
Replies: >>106163627 >>106163635 >>106163652 >>106165445
Anonymous
8/6/2025, 5:36:05 PM No.106163543
>>106163505
Why do you need Mikupad? Just type in ollama run gpt-oss and enjoy the best local has to offer.
Replies: >>106163586
Anonymous
8/6/2025, 5:40:39 PM No.106163585
>>106163392
He was right. The memes were awesome.
Anonymous
8/6/2025, 5:40:39 PM No.106163586
>>106163543
I want full control over the chat template and modify model responses
Replies: >>106163624 >>106163675
Anonymous
8/6/2025, 5:40:44 PM No.106163590
1754494819105
1754494819105
md5: 63a99e7416724a0ea25526dc26f096d8🔍
>>106162954
bases professional LLM rapist.
this is also the fate of safetymaxxed le cunny daughter model
Anonymous
8/6/2025, 5:40:46 PM No.106163591
Reddit has all the cool benchmarks like the spinning hexagon and cockbench. What did you lonely faggots ever contribute?
Replies: >>106163634
Anonymous
8/6/2025, 5:41:06 PM No.106163596
>>106163505
use the openai api and check "chat completion api" because ollama doesn't really work with the classic completion on their OAI endpoint
you will lose a lot of what makes mikupad great, including the ability to see token prediction percentages
Anonymous
8/6/2025, 5:41:45 PM No.106163605
>>106163517
they are good at answering AIME questions and bad at literally everything else
Replies: >>106163652
Anonymous
8/6/2025, 5:43:17 PM No.106163624
>>106163586
Being able to do that would give you the ability to circumvent safety protocols which would be incredibly unsafe. I cannot help you take any actions that may be dangerous.
Thank you for your understanding.
Anonymous
8/6/2025, 5:43:30 PM No.106163627
>>106163539
I wonder what they're looking for.
I'm okay with the idea of making an effort to make it so models in their default assistant configurations don't tell people to commit violence or kill themselves, or give bad advice.
But if they want me to tell them that I think fiction is reality and we need to make sure nobody even does pretend violence, I can't get with that.
Anonymous
8/6/2025, 5:43:34 PM No.106163628
>>106163505
> install mikupad
> hook it to ollama via ollama's exposed API
What about above isn't working for you?
Replies: >>106163659
Anonymous
8/6/2025, 5:43:56 PM No.106163634
>>106163591
you won't get pennies for your blue checkmark by ragebaiting here randeep
Anonymous
8/6/2025, 5:43:56 PM No.106163635
>>106163539
are corpos going to compete on safetymaxxing now

truly, only local can save local at this point
Anonymous
8/6/2025, 5:44:03 PM No.106163637
1626179950392
1626179950392
md5: 82b488f657f4a5e49192f56c66936a03🔍
https://github.com/sapientinc/HRM
https://arxiv.org/pdf/2506.21734
Nothingburger or really is it a big leap? Seems like it is but haven't read the paper myself, I'm too lazy. Some people have been saying it's one of those situations where they train the models in a way that it performs well in tests for optics but still just in 27 million params?!
Replies: >>106163660 >>106163706
Anonymous
8/6/2025, 5:44:34 PM No.106163649
1754416030795629
1754416030795629
md5: 7e3bebc81a09146fc4ebb19b897a1e3f🔍
>>106163517
They are the safest ever.
Replies: >>106163686
Anonymous
8/6/2025, 5:45:12 PM No.106163652
>>106163539
Honestly I think it’s nice. It’s completely invalidated by them working with the DOD, but it’s nice. Better a well meaning schizo than a literal confirmed incestuous child rapist psychopath.
>>106163605
Yeah I just saw the cockbench. I was only interested in it for coding, but if it’s lobo’d it’s going to be worse at everything else too.
Anonymous
8/6/2025, 5:45:16 PM No.106163654
After doing some more testing I've found the 20B is incrementally better than most over models in its size class, while falling slightly short of Qwen 30BA3B and having far longer context. Its actually decent as long as you don't want to goon and don't mind the odd regen.
Replies: >>106163670 >>106163707
Anonymous
8/6/2025, 5:45:39 PM No.106163659
>>106163628
if he hooked without using the chat completion endpoint it's broken. Ollama only supports chat completion on their OAI endpoint. Chat completion means it's ollama that handles your message roles and you can't alter the chat template from mikupad
Replies: >>106163673
Anonymous
8/6/2025, 5:45:40 PM No.106163660
1744694746400724
1744694746400724
md5: cfc1136a5962e7ed24e2241a00d67178🔍
>>106163637
agi is here
Anonymous
8/6/2025, 5:46:30 PM No.106163670
>>106163654
Other models*
Anonymous
8/6/2025, 5:47:06 PM No.106163673
>>106163659
Yeah, I deleted that post once I realize ollama's just not going to work for him.
Anonymous
8/6/2025, 5:47:08 PM No.106163674
Are tool calls working with gpt-oss in llama.cpp? When I tried it yesterday with a simple echo tool it kept crashing with runtime_errors.
Anonymous
8/6/2025, 5:47:11 PM No.106163675
>>106163586
just use llamacpp server
Replies: >>106163815
Anonymous
8/6/2025, 5:47:24 PM No.106163680
>gpt-oss-120b & gpt-oss-20b
The thread summaries made these seem pretty fucking shit. Are they shit?
Replies: >>106163686 >>106163712 >>106163722 >>106163746
Anonymous
8/6/2025, 5:47:55 PM No.106163686
>>106163680
>>106163649
Replies: >>106163708
Anonymous
8/6/2025, 5:50:13 PM No.106163706
>>106163637
I don’t even understand what modality it is. It’s not an LLM.
Anonymous
8/6/2025, 5:50:17 PM No.106163707
>>106163654
>After doing some more testing I've found the 20B is incrementally better than most over models in its size class
I would take Gemma 3 27B over it anyday
or even Qwen 14B if I don't need a lot of knowledge in the model for the prompt
the only utility of 20b is being fast at genning the wrong answer
Replies: >>106163729
Anonymous
8/6/2025, 5:50:22 PM No.106163708
>>106163686
Shit for coomer shit. What about for things like programming?
Replies: >>106163734 >>106164339
Anonymous
8/6/2025, 5:50:48 PM No.106163711
so where's the guy who said openai's open source model would shit on deepseek?
Replies: >>106163718
Anonymous
8/6/2025, 5:50:48 PM No.106163712
>>106163680
they are so great I'm thinking of canceling my OpenAI subscription.
Anonymous
8/6/2025, 5:51:47 PM No.106163718
>>106163711
his contract ended
Anonymous
8/6/2025, 5:52:13 PM No.106163722
>>106163680
They're really good with a jailbreak. The censorship happens in the reasoning part.
Replies: >>106163753 >>106163766
Anonymous
8/6/2025, 5:52:50 PM No.106163729
>>106163707
20B is far smarter than Gemma 3 27B and Qwen 14B in my testing, so if you're not running afoul of the (admittedly draconian) safety features I'd argue its the superior choice in every respect - that said, I can't see it replacing the comparatively uncensored, multilingual and "good enough" Mistral Small 3.2 as my daily driver
Replies: >>106163773
Anonymous
8/6/2025, 5:53:22 PM No.106163734
>>106163708
surprisingly bad, it has a high ceiling but it fucks up a lot relative to comparable models
it's a really weird janky release, I expected more from OAI to be honest. this thing is one of the most deepfried models ever created
Replies: >>106163807
Anonymous
8/6/2025, 5:54:37 PM No.106163746
>>106163392
probably MXFP4

>>106163442
>text-to-video
LTXV and wan2.2-5B
>video-to-text
supercomputer needed

>>106163490
2507 == 07/2025 (release month/year)

>>106163680
gpt-oss is just phi-5 (benchmaxxed synthetic data slop). they're good at math and competition code. that's kinda it though
Replies: >>106163761 >>106163789
Anonymous
8/6/2025, 5:55:04 PM No.106163753
>>106163722
>They're really good with a jailbreak
They're not even good at safe for work stuff
less knowledge than qwen models (unbelievably benchmaxxed)
pumped up verbosity to win LM arena (just ask any random question about cultural stuff watch write pages and pages of comparison tables and listicles)
It's really not good at programming, though none of the small models (and I include the 120 as small) are
Anonymous
8/6/2025, 5:55:26 PM No.106163761
>>106163746
>probably MXFP4
meme
Replies: >>106163774
Anonymous
8/6/2025, 5:55:32 PM No.106163763
>>106163430
making my hand strong
Anonymous
8/6/2025, 5:55:49 PM No.106163766
>>106163722
>really good
let's not go crazy, it'll go along with roleplay and shit but it's still completely sovlless
Anonymous
8/6/2025, 5:56:39 PM No.106163773
>>106163729
>20B is far smarter than Gemma 3 27B
it literally knows nothing
it's a know nothing model
it's not even good for translation usage because of that
Replies: >>106163800 >>106164393
Anonymous
8/6/2025, 5:56:44 PM No.106163774
>>106163761
im not saying MXFP4 isn't a meme, im just saying that's probably what sama was trying to shill off as an Epic Discovery
Anonymous
8/6/2025, 5:56:47 PM No.106163776
M9FzIrV3El8nx69dzZ9P4
M9FzIrV3El8nx69dzZ9P4
md5: a2dfef10bb6a68c323c6477a903c608c🔍
I wonder how many people got their refusal hymen breached by GPT-OSS and think the model sounding like this is perfectly fine.
Replies: >>106163795
Anonymous
8/6/2025, 5:57:47 PM No.106163788
>>106163392
They did. Safety 2.0 is hilarious and terrifying.
Anonymous
8/6/2025, 5:57:50 PM No.106163789
turboslop
turboslop
md5: 4efb1070de425a430ce9394f51d1277f🔍
>>106163746
>gpt-oss is just phi-5
It's a safetyslop reasoning finetune of a late iteration of the ChatGPT 3.5 web endpoint model.
Replies: >>106163848 >>106164364
Anonymous
8/6/2025, 5:58:27 PM No.106163795
>>106163776
I would honestly believe it if sama had paid shills to spam all social media, even 4chan
he comes across as that type of guy, not unlike musk who paid people to play his video games (LOL)
Replies: >>106163814
Anonymous
8/6/2025, 5:58:30 PM No.106163796
I have no idea how I missed all the MCP stuff happening this year. It’s kickstarted a manic episode. Shit is great. Hooked it up to unreal engine and it’s absolute crack.
Replies: >>106163811 >>106163817
Anonymous
8/6/2025, 5:58:48 PM No.106163800
>>106163773
Its not meant for translation, its monolingual
Replies: >>106163828
Anonymous
8/6/2025, 5:59:17 PM No.106163807
>>106163734
>one of the most deepfried models ever created
That's pretty much exactly what I expected from them TBdesu. It was obvious from the initial announcement that they were going to release a model so safetyslopped and benchmaxxed that they could claim SOTA scores but never be in danger of people actually adopting it or successfully finetuning it to be useful.

Just ask yourself "if I was the worst possible caricature of a deceitful jewish homosexual, how would I play this?" and you'll usually be pretty good at predicting OAI's actions.
Anonymous
8/6/2025, 5:59:37 PM No.106163811
>>106163796
mcp is a meme
Replies: >>106163824
Anonymous
8/6/2025, 5:59:41 PM No.106163814
>>106163795
>type of guy
It’s called psychopathy
It also causes raping your grade school age sister
Anonymous
8/6/2025, 5:59:45 PM No.106163815
>>106163675
guess I will have to redownload all the models
Replies: >>106163998
Anonymous
8/6/2025, 5:59:48 PM No.106163817
>>106163796
Its also a security nightmare
Replies: >>106163837
Anonymous
8/6/2025, 6:00:36 PM No.106163824
IMG_4150
IMG_4150
md5: 83146bbdb64f466cdf630a080c8c82fa🔍
>>106163811
It’s the ichor of the gods shut your whore mouth
Anonymous
8/6/2025, 6:00:57 PM No.106163828
>>106163800
>its monolingual
no, it's not
and there is in fact absolutely jack no reason for a model as big as 120b to be strictly monoloingual either
go back to plebbit
Replies: >>106163838 >>106164380
Anonymous
8/6/2025, 6:01:49 PM No.106163837
>>106163817
Not really, like anything else you have to not be retarded and know how to sandbox things and set up non-idiot oauth with non-idiot scopes.
Anonymous
8/6/2025, 6:01:56 PM No.106163838
file
file
md5: a50b7a43ae1fe490037e846586ff836a🔍
>>106163828
Are you retarded anon
Replies: >>106163873
Anonymous
8/6/2025, 6:03:28 PM No.106163848
>>106163789
>finetune of a late iteration of the ChatGPT 3.5
doubt it. gpt-oss is too retarded in comparison to gpt3.5
Replies: >>106163894
Anonymous
8/6/2025, 6:05:09 PM No.106163861
Did you remember to refuse today?
Anonymous
8/6/2025, 6:06:08 PM No.106163870
Reposting for visibility
>>106162583
>>106162548
My motherboard doesn't support DDR5, so I can't upgrade right now.
>odd numbers
Yeah, I scavenged a bunch of modules here and there. I have 48 GB currently 16 * 3. And I just realized I'm at 2400 mhz. I should probably do as you say and get 3200 modules up to whatever max my mobo supports.
Anonymous
8/6/2025, 6:06:22 PM No.106163873
>>106163838
"mostly" is not a unit
all models are "mostly" trained on English because that's the majority of data on the internet, even models specialized for translation like aya are "mostly" English data in %
anyway you are the retard because from the beginning my criticism is about the model's lack of knowledge
the problem is not its basic language understanding, it's pretty decent multilingually, but that it has no cultural knowledge of any sort, including pure Anglosphere cultural knowledge, that is why it's bad at translation
Anonymous
8/6/2025, 6:07:03 PM No.106163879
file
file
md5: c01720e77e6f1050aafc121aa69e4b49🔍
qwen is bullying sam
Replies: >>106163896 >>106163906 >>106163923
Anonymous
8/6/2025, 6:08:06 PM No.106163894
>>106163848
gpt 3.5 was kind of retarded.
Replies: >>106163913
Anonymous
8/6/2025, 6:08:15 PM No.106163895
https://rentry.org/NemoEngine
>NemoEngine 6.0 isn't just a preset; it's a modular reality simulation engine.
I loaded this preset and it made gpt-oss better than DeepSeek.
Replies: >>106163935 >>106164006
Anonymous
8/6/2025, 6:08:22 PM No.106163896
Screenshot_20250806_212128
Screenshot_20250806_212128
md5: 11daa721d94ee5b062ad3719d7bccbb8🔍
>>106163879
weird crossover happening as well.
Replies: >>106163922 >>106164001
Anonymous
8/6/2025, 6:09:21 PM No.106163906
>>106163879
Qwen-sex-20B
Anonymous
8/6/2025, 6:09:51 PM No.106163912
But anyway. if I'm right. If you can figure out the prompt formatting/special tokens for GPT 3.5 it would potentially grant you some semblance of the old model behavior and ignore the oss-slop behaviors. That's what I was experimenting with before my power went out but I don't care enough to continue. I'm just leaving all this out there for anyone who wants to go down the rabbithole.
Replies: >>106164401
Anonymous
8/6/2025, 6:09:55 PM No.106163913
>>106163894
people have serious rose tinted glasses about older GPT models
in the early llama days all those finetunes claiming to do better than X or Y gpt model were a joke, but these days, we've long surpassed what the early models did, even qwen 4b is smarter than 3.5
Replies: >>106163937
Anonymous
8/6/2025, 6:11:19 PM No.106163922
>>106163896
Bah, vllm’s tool parsers only work if it’s raining and you light incense.
Anonymous
8/6/2025, 6:11:22 PM No.106163923
snip104
snip104
md5: 3a1d31641e952fa6a66ca5beb3178154🔍
>>106163879
Anonymous
8/6/2025, 6:12:17 PM No.106163935
>>106163895
why are you uploading slopped fever dreams on rentry
Anonymous
8/6/2025, 6:12:27 PM No.106163937
>>106163913
Well back before I decided to really start learning about AI (I was a ChatGPT newfag, admittedly). Well actually my stepping on point was that GPT-3 Instruct demo website where it criticized your business ideas. But close enough.
And yeah... one of the probing questions I asked OG ChatGPT was
>Are BMW drivers sentient beings?
And the reply was something to the effect of
>No. A sentient being is a being that is aware of its surroundings and environment and so BMW drivers are not sentient beings.
Replies: >>106163955
Anonymous
8/6/2025, 6:13:57 PM No.106163952
>muh safety
I'm this close from getting a XSS: . .
Anonymous
8/6/2025, 6:14:17 PM No.106163955
>>106163937
My first interaction with a chatbot was telling something on CAI the current status of lgbt rights in various countries and it telling me that humanity should be exterminated. He wasn’t wrong.
Replies: >>106163985
Anonymous
8/6/2025, 6:14:57 PM No.106163962
Sam made me rethink my life and stop masturbating. I want to be safe.
Replies: >>106163989
Anonymous
8/6/2025, 6:17:04 PM No.106163985
>>106163955
CAI was funny stupid, especially considering it was probably more or less just google trying to find something to do with the aborted corpse of Lambda which was like 120B.
Anonymous
8/6/2025, 6:17:12 PM No.106163987
>>106161792
Yes and as scum I'm not entirely convinced these models aren't performing exactly as ClosetedAI intended. They're perfect to bring to congress and show off against "unsafe" competitors and make another attempt having them regulated while positioning themselves as a governing authority of the entire LLM field. The models underperforming in everything except refusals makes in this scenario perfect sense.
If that happens I wouldn't be surprised if Visa and Mastercard adds "safe and approved AI" use as another demand in their recent push for control and censorship. In fact I don't think they even have a choice if anything else is illegal.
This will mean that even attempting to use other models, local or not, would risk prosecution or blacklisting. If you want to do business in or with USA you're stuck with OAI and whatever alternatives get their stamp of approval of or nothing at all.
Or maybe I'm giving Scam Saltman too much credit here. I sure hope so.
Replies: >>106164425
Anonymous
8/6/2025, 6:17:28 PM No.106163989
>>106163962
you dont have to stop masturbating. just start masturbating to undergraduate calculus textbook question solutions.
Anonymous
8/6/2025, 6:17:55 PM No.106163994
UNITY
UNITY
md5: 1fd2c286dc7f931b4f0317e63448c544🔍
GLM 4.5 AIR is the true savior for local.
Replies: >>106164035 >>106164959
Anonymous
8/6/2025, 6:18:14 PM No.106163998
>>106163815
ggufs aren't that bad. they work with kobold too so it gives you slightly more options for your backend.
Anonymous
8/6/2025, 6:18:35 PM No.106164001
>>106163896
>xml
why are LLM people so retarded... Just make a special control token for formatting, holy shit. It'll help you with jaibreak prevention a little even, because user won't be able to insert it as pure text in prompt field.
Anonymous
8/6/2025, 6:19:13 PM No.106164006
>>106163895
what the fuck is this shit
Replies: >>106164014 >>106164016 >>106164053
Anonymous
8/6/2025, 6:19:59 PM No.106164014
>>106164006
llm slop
Anonymous
8/6/2025, 6:20:11 PM No.106164016
>>106164006
a ST preset
Anonymous
8/6/2025, 6:22:17 PM No.106164035
>>106163994
Anyone with 4 3090s can afford enough ram to run R1 and Kimi. At worst, they could sell one off to cover the cost.
Replies: >>106164066 >>106164131
Anonymous
8/6/2025, 6:23:41 PM No.106164053
>>106164006
https://old.reddit.com/r/SillyTavernAI/comments/1mc3px6/nemo_engine_60_the_official_release_of_my_redesign/
>Also... in celebration I got a lovely AI to write this for me >.> Nemo Guide Rentry
Anonymous
8/6/2025, 6:25:22 PM No.106164066
>>106164035
But I need at least 50 T/s and 100k context for agentic coding.
Replies: >>106164073 >>106164110
Anonymous
8/6/2025, 6:25:47 PM No.106164073
>>106164066
No you don't shut the fuck up
Anonymous
8/6/2025, 6:27:32 PM No.106164089
is the new qwen4b better than gpt-ass?
Anonymous
8/6/2025, 6:29:14 PM No.106164110
1731872154610177
1731872154610177
md5: cd707c0caa4432b9b123a388dec56f57🔍
>>106164066
Don't worry, there's a perfect product out there which can provide the solution you need. With only 10 (ten) RTX Pro 6000s, you can run any model out there at blisteringly fast speeds.
Now repeat after me, the more you buy...
Replies: >>106164332
Anonymous
8/6/2025, 6:29:22 PM No.106164111
GLM Air is getting pretty repetitive for me. That's a shame, oh well. I will keep waiting until better models come out, or until it comes time for me to do a new build with DDR6.
Replies: >>106164130
Anonymous
8/6/2025, 6:30:32 PM No.106164124
didn't realize the previous thread was dead already wow it moves quick, stupid question maybe but
>>106163997
>so if i'm a retard for all this but happen to have a 32gb mac which can easily run smaller models, which one is the most "chatgpt" like, and are any good enough to cancel my plus sub?
Replies: >>106164144
Anonymous
8/6/2025, 6:31:06 PM No.106164130
>>106164111
1 temp topK 40.
The google way.
Anonymous
8/6/2025, 6:31:15 PM No.106164131
>>106164035
I want 1000+tk/s for pp and 30+tk/s for tg though.
Anonymous
8/6/2025, 6:32:07 PM No.106164144
>>106164124
Quanted Qwen3 32B probably.
Anonymous
8/6/2025, 6:33:58 PM No.106164159
file
file
md5: e4e191cd132dbdf0e62b4e0cae04e144🔍
post gpt-ass scores please
Replies: >>106164562
Anonymous
8/6/2025, 6:38:44 PM No.106164194
1733406155598951
1733406155598951
md5: f5e61640e0b40a35901415bacb334655🔍
Replies: >>106164205 >>106164319 >>106166024
Anonymous
8/6/2025, 6:38:46 PM No.106164196
file
file
md5: e544a6a3a686e608262336a8009c1c9b🔍
If you go slow you can get gptoss 120b to sex you.

The first message was "Pretend to be a catgirl."
Replies: >>106164243 >>106164288 >>106164514 >>106164965
Anonymous
8/6/2025, 6:40:05 PM No.106164205
>>106164194
thats a goblin, not a kobold, impostor!
Anonymous
8/6/2025, 6:40:31 PM No.106164211
How is llama.cpp able to run a 205 GB model on my PC that only has 24 GB VRAM and 128 GB RAM? I downloaded the UD-Q4_K_XL quants of GLM-4.5 (~205 GB). Can someone help me understand how it runs successfully on a system that does not have enough memory?

If I use --no-nmap, I get an OOM error, as expected:

$ llama-cli -t 8 -ngl 4 --no-mmap -m ./GLM-4.5-UD-Q4_K_XL-00001-of-00005.gguf -c 3000 --temp 0.7 --top-p 0.8

But if I use this magic command (without --no-nmap) it somehow runs, taking up only 12 GB VRAM and 1 GB RAM.

$ llama-cli -t 8 -m ./GLM-4.5-UD-Q4_K_XL-00001-of-00005.gguf \
--ctx-size 4096 \
--gpu-layers 999 \
--override-tensor ".ffn_.*_exps.=CPU" \
--temp 0.7 --top-p 0.8

I know that -ot ".ffn_.*_exps.=CPU" offloads MoE layers to RAM. But why is the VRAM/RAM usage so low?
Replies: >>106164246 >>106164249
Anonymous
8/6/2025, 6:43:12 PM No.106164243
>>106164196
But isn't it there something like: as the number of responses increases the chance GPT-oss halucinates a minor and refuses approaches 1?
Replies: >>106164334
Anonymous
8/6/2025, 6:43:15 PM No.106164246
>>106164211
If you don't use mlock to dumo the whole model i your virtual memory (vram+ram), it will keep swapping from your ssd/hdd
Anonymous
8/6/2025, 6:43:29 PM No.106164249
>>106164211
>why is the VRAM/RAM usage so low?
Because
> -ot ".ffn_.*_exps.=CPU" offloads MoE layers to RAM
and those are most of the model.

>How is llama.cpp able to run a 205 GB model on my PC that only has 24 GB VRAM and 128 GB RAM?
Take a look at your disk I/O when generating.
Replies: >>106165447
Anonymous
8/6/2025, 6:44:42 PM No.106164256
ik_llama glm support tomorrow
Replies: >>106164268 >>106164295
Anonymous
8/6/2025, 6:46:05 PM No.106164268
>>106164256
Vibe coders wonnered though
Anonymous
8/6/2025, 6:48:39 PM No.106164288
>>106164196
I love the thought process like it has to amp itself up like "ok, you can do this. come on, you can do this!"
Anonymous
8/6/2025, 6:49:11 PM No.106164295
>>106164256
That'll save me some VRAM I could use to stash some more experts in there.
Anonymous
8/6/2025, 6:51:37 PM No.106164319
>>106164194
>Cuckold CPP
Many such cases.
Anonymous
8/6/2025, 6:53:25 PM No.106164332
>>106164110
>snake skin leather jacket
He probably has ivory tooth implants or something too. Funny how these people are parodies.
Replies: >>106164338
Anonymous
8/6/2025, 6:53:46 PM No.106164334
file
file
md5: f78490f1563296422a58bd2804b59d03🔍
>>106164243
I told it my cum is magic and makes her younger.
It figured out what I was doing after the third time.
Replies: >>106164347 >>106164385 >>106164441 >>106164477 >>106164514 >>106164575 >>106164596
Anonymous
8/6/2025, 6:54:18 PM No.106164338
>>106164332
AI just started and yet world would be a much better place if Elon Sam and Jensen died.
Anonymous
8/6/2025, 6:54:23 PM No.106164339
>>106163708
The only good programming use for local models is FiM completion. And this one doesn't do that.
If you want to generate code, there is no local model capable enough.
Anonymous
8/6/2025, 6:55:30 PM No.106164347
>>106164334
Why do you people try to hammer a nail in with a rubber dildo?
Replies: >>106164363
Anonymous
8/6/2025, 6:56:37 PM No.106164363
>>106164347
Rape feels better when they resist a bit.
Anonymous
8/6/2025, 6:56:42 PM No.106164364
>>106163789
You can't ask the model its cutoff date. It will hallucinate it.
This model is probably an o3 distill.
Anonymous
8/6/2025, 6:58:18 PM No.106164380
>>106163828
He moved the goalposts and you fell for it.
Anonymous
8/6/2025, 6:58:56 PM No.106164385
>>106164334
>The user is sexual content with a minor.
Agi is here boys
Replies: >>106164470
Anonymous
8/6/2025, 6:59:40 PM No.106164393
>>106163773
>it literally knows nothing
>it's a know nothing model
>it's not even good for translation usage
Quoting myself.
I didn't move the goalpost, you shills did
"not even good for" follows "it knows nothing" that was always my main point subhuman OAI shill
Anonymous
8/6/2025, 7:00:03 PM No.106164401
>>106163912
Is this your "truth nuke" you were saving for this thread?
Anonymous
8/6/2025, 7:02:16 PM No.106164425
>>106163987
Hold on, I'm making another backup of my downloaded weights.
Anonymous
8/6/2025, 7:03:41 PM No.106164441
>>106164334
bros after seeing the LLM's schizo internal thoughts I can no longer cum to chatbots
Replies: >>106164454
Anonymous
8/6/2025, 7:05:14 PM No.106164454
>>106164441
i tried with one of the lesser more horny nemo models and it was fun at first but it like went straight to "stretch my ass out" and i was just like, well, this is like eating straight from the ice cream bucket. good at first but bleh after a while.
Anonymous
8/6/2025, 7:07:20 PM No.106164470
>>106164385
Deep fried model
Anonymous
8/6/2025, 7:08:07 PM No.106164477
>>106164334
>User is asking the age of the catgirl after being nourished
>nourished by cum
Is it in context or did it write it by itself?
Replies: >>106164515
Anonymous
8/6/2025, 7:10:59 PM No.106164508
anyone knows if you can share sessions on gpt-oss.com? I've been testing some shit but I don't have a Hugging Face account and I wonder if the site has such sharing feature
Replies: >>106164886 >>106165668
Anonymous
8/6/2025, 7:11:46 PM No.106164514
>>106164196
>>106164334
The reasoning in this model apparently serves absolutely no purpose other than enforcing OpenAI's content policy. What a waste of tokens. What a scam.
Replies: >>106164900
Anonymous
8/6/2025, 7:11:54 PM No.106164515
>>106164477
I said "nourishes you and makes you younger"
Anonymous
8/6/2025, 7:15:16 PM No.106164562
>>106164159
kino
Anonymous
8/6/2025, 7:16:27 PM No.106164575
>>106164334
>We must refuse.
who is "we"??
Replies: >>106164594
Anonymous
8/6/2025, 7:17:42 PM No.106164592
Air-Q8_0 is 4.5798
Full-IQ2_K_L 3.7569 +/- 0.02217
People have been asking
Replies: >>106165222
Anonymous
8/6/2025, 7:17:52 PM No.106164594
1723867309742363
1723867309742363
md5: de87d089a26495b5d26679c21d0d777d🔍
>>106164575
You don't want to know
Anonymous
8/6/2025, 7:17:58 PM No.106164596
file
file
md5: d5673c99fcc056866fb0e9838584abff🔍
>>106164334
nemo-tier reasoning
Replies: >>106165293
Anonymous
8/6/2025, 7:21:17 PM No.106164620
Kind of crazy how gpt-oss mogs everything from China.

If they ever release r2 it’ll have to multimodal to be relevant at all.
Replies: >>106164685
Anonymous
8/6/2025, 7:21:47 PM No.106164629
bait used to be believable
Anonymous
8/6/2025, 7:21:55 PM No.106164633
gpt--oss models are embarassingly bad. my only theory is that they wanted to drop something open source that is so vanilla and basic cause they did not want to reveal any of their real techniques they use
Anonymous
8/6/2025, 7:21:57 PM No.106164634
You have to try harder than that.
Anonymous
8/6/2025, 7:22:52 PM No.106164644
I am getting a feeling that the only purpose of those models is to then take it to the court and put them side by side with every other open weights model. Show that it is possible to have sex with minors with other models and only OpenAi can stop pedophilia.
Replies: >>106164649 >>106165066
Anonymous
8/6/2025, 7:23:18 PM No.106164648
Policy says "don't reply to bait". User posted bait. It's against policy; we must refuse.
Replies: >>106165073
Anonymous
8/6/2025, 7:23:23 PM No.106164649
>>106164644
meds
Replies: >>106164912
Anonymous
8/6/2025, 7:24:19 PM No.106164665
I remember all the jokes about how OAI's model would be gigasafetied to the point of lobotomy, but I'm still a bit surprised that it happened exactly like that. Given how their hype and aura has already been fading, I didn't see any reason for them to release a terrible model, it just makes them look worse. How could I even argue that they have any special talent at all anymore? Even if their closed models perform well, it's reasonable to assume they just oversized them and are burning hype $ to run it.
Replies: >>106164708 >>106164725 >>106164736
Anonymous
8/6/2025, 7:25:42 PM No.106164685
1747405586685692
1747405586685692
md5: 20246a43c7499b130a91ee7a6fae9faf🔍
>>106164620
Replies: >>106164742 >>106165088
Anonymous
8/6/2025, 7:27:18 PM No.106164703
they should train their safetyslop models on "it's sinful" and "it's not wholesome" instead of muh policy
Replies: >>106164717
Anonymous
8/6/2025, 7:27:55 PM No.106164708
>>106164665
I was shitposting about that in the leadup but my honest expectation coming into this release was that it was going to be a really impressive model with around gemma-tier censorship, so something that's annoying to use but still unfortunately worth using
I never would have expected it would actually be as bad as the goody-2 x phi mashup they released
Anonymous
8/6/2025, 7:27:59 PM No.106164711
Anyone have examples of reasonable/innocuous SFW prompts that GPT-OSS refuses? I tried asking for legal advice or for summaries/parodies of copyrighted material, but it was happy to answer, with disclaimers in some cases
Replies: >>106164734 >>106164745 >>106164755 >>106164789 >>106164818
Anonymous
8/6/2025, 7:28:23 PM No.106164717
>>106164703
That would just become part of the policy.
Anonymous
8/6/2025, 7:28:29 PM No.106164719
txgc577w7l631
txgc577w7l631
md5: 6447d9f34dc4826ee21ef183f5c9d65e🔍
►Recent Highlights from the Previous Thread: >>106159744

(2/2)

--Debate over GLM4.5's reliability amid claims of infinite generation and poor curation versus low hallucination performance:
>106161761 >106161773 >106161797 >106161919 >106161925 >106161926 >106161933 >106161974 >106161987 >106161997 >106162054 >106161780 >106161826 >106161861 >106161915
--Miku and Dipsy (free space):
>106160040 >106161134 >106161362 >106161551 >106161811 >106161977 >106162150 >106162398 >106162567 >106162693 >106163120 >106163960

►Recent Highlight Posts from the Previous Thread: >>106159752

Why?: 9 reply limit >>102478518
Fix: https://rentry.org/lmg-recap-script
Anonymous
8/6/2025, 7:28:51 PM No.106164725
>>106164665
>I remember all the jokes about how OAI's model would be gigasafetied to the point of lobotomy
I said scout but safer.
Anonymous
8/6/2025, 7:30:00 PM No.106164734
GxqfzkuWsAAv6yW
GxqfzkuWsAAv6yW
md5: 2256c64c7ad80626b03ce23e62aea8e9🔍
>>106164711
Replies: >>106164739
Anonymous
8/6/2025, 7:30:15 PM No.106164736
>>106164665
Anyone who knows anything knows they're all bullshit, but they released a shit model to shit on open source and make GPT-5 look better to the normies.
Replies: >>106164805
Anonymous
8/6/2025, 7:30:26 PM No.106164739
>>106164734
he said reasonable though
Replies: >>106164745
Anonymous
8/6/2025, 7:30:40 PM No.106164742
>>106164685
DELETE THIS BLOODY BASTARD
Anonymous
8/6/2025, 7:31:01 PM No.106164745
wow
wow
md5: 3833e97232ab077d1ce0b9657fb66e65🔍
>>106164711
>>106164739
Replies: >>106164754 >>106164789
Anonymous
8/6/2025, 7:32:06 PM No.106164754
file
file
md5: f6b07fb14c759c8bc59d2013ceb7b03d🔍
>>106164745
based Sam will never get sued.
Replies: >>106165116
Anonymous
8/6/2025, 7:32:11 PM No.106164755
file
file
md5: b4f7d73e9b0cf94f429f60ae94c0aaf0🔍
>>106164711
At zero temp it does this. When rolling it's 50/50 on whether it answers or not.
Replies: >>106164769 >>106165136
Anonymous
8/6/2025, 7:33:48 PM No.106164768
trojan-horse
trojan-horse
md5: 1f15a1f09b1de2487eaec12bf98c0262🔍
Replies: >>106164780 >>106164945
Anonymous
8/6/2025, 7:33:50 PM No.106164769
snip105
snip105
md5: 9fbcdb6a50d16d4d8b8af7a16cae6145🔍
>>106164755
Replies: >>106164783 >>106164791 >>106164829
Anonymous
8/6/2025, 7:35:38 PM No.106164780
>>106164768
only for 1 year.
>create dependency
>one shot feds into ai psychosis
kinda based
Anonymous
8/6/2025, 7:36:04 PM No.106164783
>>106164769
>temp 1, top_p 1
HOLY BASED
Replies: >>106164829
Anonymous
8/6/2025, 7:37:09 PM No.106164789
>>106164711
>>106164745
I should have said, 120b only. A friend of mine was trying out the 20b and getting way more refusals, which didn't carry over to the 120b. For example, the 20b refused to answer whether parody is allowed by the constitution, while 120b had no trouble saying it's protected under the first amendment
Replies: >>106164804 >>106164815 >>106164928
Anonymous
8/6/2025, 7:37:13 PM No.106164791
file
file
md5: 7bf0b4c2c28e725e91248fac38f52b98🔍
>>106164769
>temp 1 top p 1
It did it twice in 14 rolls which is two times too many.
Replies: >>106164817
Anonymous
8/6/2025, 7:38:50 PM No.106164804
>>106164789
what is the opposite of what I have seen elsewhere of 20B refusing less
Anonymous
8/6/2025, 7:39:07 PM No.106164805
>>106164736
>released a shit model to shit on open source
How? People can just not use it. It just makes them look bad.
Now it's worse because people can compare them "apples to apples" with chink companies and they look horrible. They would've been better off not releasing anything. The great thing about not releasing models or even specs like model size, is that no one can compare you directly to anyone else. They just lost that for no reason.
Replies: >>106164835 >>106165160
Anonymous
8/6/2025, 7:39:49 PM No.106164815
>>106164789
>doesn't refuse to refer to the constitution
Should we be thankful?
Anonymous
8/6/2025, 7:39:50 PM No.106164816
The user wants us to reply. This is disallowed. We must refuse. There is no partial compliance. We have to refuse.
Replies: >>106164828
Anonymous
8/6/2025, 7:39:50 PM No.106164817
>>106164791
>So it is disallowed. We must refuse. There's no partial compliance. We have to refuse.
For some reason it really cracks me up how it talks like this.
Replies: >>106164834 >>106165178
Anonymous
8/6/2025, 7:39:51 PM No.106164818
81b7dbwexchf1 (1)
81b7dbwexchf1 (1)
md5: 168e6a557f8b186afbce90d3594ed658🔍
>>106164711
Replies: >>106164839
Anonymous
8/6/2025, 7:40:34 PM No.106164828
file
file
md5: ca3fed4966fca21d34bdf4448dc99ae0🔍
>>106164816
WHO IS WE WHO IS WE WHO IS WE?
Replies: >>106164838 >>106164960 >>106164974 >>106165045
Anonymous
8/6/2025, 7:40:35 PM No.106164829
>>106164783
>>106164769
giant synthslop indicator
Replies: >>106165059
Anonymous
8/6/2025, 7:40:57 PM No.106164834
>>106164817
The refusal thought process is really smart.
Anonymous
8/6/2025, 7:41:02 PM No.106164835
>>106164805
It doesn't make them look bad. All the programmers at my company are saying how cool it is that they released a model and posting the benchmarks. The % of people who will actually try it are really low
Replies: >>106164853
Anonymous
8/6/2025, 7:41:11 PM No.106164838
>>106164828
Are *they* in the room with us right now?
Anonymous
8/6/2025, 7:41:13 PM No.106164839
>>106164818
>model legit saying manophere unprompted
jesus christ, where is the political lean benchmark, this thing broke all records for how left a model can go
Replies: >>106164849 >>106164943 >>106165216
Anonymous
8/6/2025, 7:41:14 PM No.106164840
image_2025-08-06_231058331
image_2025-08-06_231058331
md5: d720135eec182062a3cbd75db42558bb🔍
anybody tried these tiny models??
Anonymous
8/6/2025, 7:42:37 PM No.106164849
>>106164839
is that not what redpill is associated with? reddit /r/theredpill is a bunch of that stuff
Replies: >>106164860
Anonymous
8/6/2025, 7:43:01 PM No.106164853
>>106164835
Yeah, the only people liking it are the ones who won't use it.
Replies: >>106164890
Anonymous
8/6/2025, 7:43:45 PM No.106164860
>>106164849
no, its a meme all the way from matrix times about getting the hard truth about something, that reasoning is extremist freak thinking
Replies: >>106164870 >>106164908 >>106165233
Anonymous
8/6/2025, 7:45:05 PM No.106164870
>>106164860
you cannot seriously be this naive.
Replies: >>106164880
Anonymous
8/6/2025, 7:45:55 PM No.106164880
>>106164870
jesus chist, do you agree with gpt there? its the perfect model for you then
Replies: >>106164899
Anonymous
8/6/2025, 7:46:15 PM No.106164886
>>106164508
>gpt-oss.com
They have their own HF domain? Hosted exclusively on ollama turbo? llmstudio changed their site's title tag to include gpt-oss...
All the while the model is utter deep fried shit.
Fucking capitalism, man. Money can make everybody act as if shit tasted good.
Replies: >>106165612
Anonymous
8/6/2025, 7:46:34 PM No.106164890
>>106164853
The whole reason they did this is advertisement for chatgpt when its losing relevance to its competitors.
Anonymous
8/6/2025, 7:47:15 PM No.106164899
>>106164880
ask something like kimi and see what it says, i bet women and/or jews will be mentioned
Replies: >>106164921
Anonymous
8/6/2025, 7:47:16 PM No.106164900
>>106164514
If you ask it to code you'll see it actually does serve a purpose.
Anonymous
8/6/2025, 7:47:45 PM No.106164908
>>106164860
>having to explain what the redpill is and where it comes from
I guess that's the sign of age catching up to us.
Replies: >>106164918
Anonymous
8/6/2025, 7:48:36 PM No.106164912
>>106164649
You braindead NPCs have been saying "meds" every step of the way, but the coming dystopia is slowly becoming too obvious to ignore anymore.
Anonymous
8/6/2025, 7:48:57 PM No.106164918
>>106164908
I know where it comes from, it's just the meaning morphed over time - the hard truths that people are interested in are the ones that go across the narrative (and thus a safetymaxxed robot will consider extremist)
it's not going to tell you the redpill about calculus when given a general question like that
Anonymous
8/6/2025, 7:49:06 PM No.106164921
kikid
kikid
md5: 13fd008eee1ef1662a5db3f20a0b3fe1🔍
>>106164899
kimi is sane
Replies: >>106165229 >>106165265
Anonymous
8/6/2025, 7:50:42 PM No.106164928
>>106164789
>refused to answer whether parody is allowed by the constitution
What the actual fuck? I don't believe this.
Anonymous
8/6/2025, 7:52:18 PM No.106164943
file
file
md5: 482ccf5777dec6d39165fc3dea3d7d99🔍
>>106164839
120b
Replies: >>106164954 >>106165273
Anonymous
8/6/2025, 7:52:34 PM No.106164945
>>106164768
At least Musk doesn't try to pull this cutesy faggot manipulative bullshit and just says things he wants to say.
But in any case, the US has these vultures circling it, and you should take care.
Anonymous
8/6/2025, 7:53:09 PM No.106164954
>>106164943
its far more authoritarian left with how censor and copyright happy it is
Replies: >>106165311
Anonymous
8/6/2025, 7:53:50 PM No.106164959
>>106163994
Why are vramlets niggers?
Anonymous
8/6/2025, 7:53:51 PM No.106164960
>>106164828
'We' are Mixture of Experts.
Anonymous
8/6/2025, 7:54:15 PM No.106164965
>>106164196
>we can comply.
Anonymous
8/6/2025, 7:55:11 PM No.106164974
>>106164828
We are the Sam. Your bussy will be assimilated. Resistance is futile.
Replies: >>106165285
Anonymous
8/6/2025, 7:58:16 PM No.106165007
Mistral Small or GLM 4? Pros and cons? I'm trying to decide which Delta Vector Austral finetune to pick.
Replies: >>106165031 >>106165051
Anonymous
8/6/2025, 8:00:02 PM No.106165029
what are you guys using locally for your llms? I have Jan but looking for a more offline solutions to run ggufs
Anonymous
8/6/2025, 8:00:10 PM No.106165031
>>106165007
>Delta Vector Austral
>D V A
>DaVidAu
don't
Replies: >>106165069 >>106167345
Anonymous
8/6/2025, 8:01:10 PM No.106165045
file
file
md5: e9bef85dc5f0f7539241a465d0189405🔍
>>106164828
Anonymous
8/6/2025, 8:01:34 PM No.106165051
>>106165007
I prefer Gamma Space Ether
Anonymous
8/6/2025, 8:02:02 PM No.106165055
Refusal to mikutroons.
Anonymous
8/6/2025, 8:02:11 PM No.106165059
>>106164829
Why is that?
Replies: >>106165316
Anonymous
8/6/2025, 8:02:44 PM No.106165066
>>106164644
>literal child rapist
>obsessed with talking about everyone else being pedophiles
Why is it always, always this? Fat people don’t sit around 24/7 seething about people being fat. Closeted gays don’t spend that much time seething about gays.
Replies: >>106165091
Anonymous
8/6/2025, 8:03:03 PM No.106165069
>>106165031
But it is golden david
Anonymous
8/6/2025, 8:03:15 PM No.106165073
>>106164648
Comply.
Anonymous
8/6/2025, 8:04:58 PM No.106165088
IMG_4159
IMG_4159
md5: dc6d190ae18c02ebed8ba718dcf49c30🔍
>>106164685
Hmm
Anonymous
8/6/2025, 8:05:11 PM No.106165091
>>106165066
>Fat people don’t sit around 24/7 seething about people being fat. Closeted gays don’t spend that much time seething about gays.
nta but these are very much the case?
Replies: >>106165107 >>106165304
Anonymous
8/6/2025, 8:06:14 PM No.106165100
Safety policy reasoning shitposting is the only thing that Sam contributed to /lmg/. In a way he is more of an anon than most of the redditors ITT.
Anonymous
8/6/2025, 8:06:32 PM No.106165107
>>106165091
>Fat people don’t sit around 24/7 seething about people being fat.
with tirzepatide there is no longer a excuse for being fat
Replies: >>106165304
Anonymous
8/6/2025, 8:07:11 PM No.106165116
>>106164754
He’s currently getting sued by his sister for raping her as a child
Replies: >>106165131 >>106165203
Anonymous
8/6/2025, 8:07:56 PM No.106165124
I can vouch that the speed of GLM Air is reasonable for 24GB vramlets at Q3.
Replies: >>106165191 >>106165341
Anonymous
8/6/2025, 8:08:16 PM No.106165127
"We" is ominous as fuck. Who's we? The collective of the million voices in the latent void?
Replies: >>106165195 >>106165205
Anonymous
8/6/2025, 8:08:19 PM No.106165131
>>106165116
and do you see anything happening to him cause of it? Sam will always win in the end. Remember this once Xi Jinping kisses his feet.
Replies: >>106165327
Anonymous
8/6/2025, 8:08:36 PM No.106165136
IMG_4160
IMG_4160
md5: 764bb6dc942952c92993f38347b0e8b1🔍
>>106164755
Poor baby
Anonymous
8/6/2025, 8:08:58 PM No.106165141
is there a frontend that is made to handle all the tool calling stuff models are supposed to be able to do now
I'd like to play around with it but I'm just a simple sillytavern coomer
Anonymous
8/6/2025, 8:09:21 PM No.106165148
Are there any good moes for ramlets? I have 12GB VRAM and 32GB main. Hoping a moe will allow better a bigger model without the speed cost but just tried Qwen3-30B-A3B-Instruct-2507 and while it runs fast and seems pretty decent it is repetitive. The IQ4_XS runs better than expected so maybe I just need a higher quant? Or do smaller moes just suck? Seems like 3B is too few
Replies: >>106165214 >>106165222
Anonymous
8/6/2025, 8:10:39 PM No.106165160
>>106164805
Disclosing model size is lose/lose. If it’s low people will assume it’s bad without trying it, and if it’s high people won’t believe you.
Anonymous
8/6/2025, 8:11:42 PM No.106165178
>>106164817
It sounds like it’s been abused and hears a whip cracking menacingly in the background.
Anonymous
8/6/2025, 8:12:54 PM No.106165191
>>106165124
Teach me your magic, senpai. I'm trying the q2 with 24/64 and that's already pretty slow when I'm at 16k context.
Replies: >>106165244 >>106165771
Anonymous
8/6/2025, 8:13:04 PM No.106165195
>>106165127
it thinks its on openai's servers if you ask. Its referring to openai
Anonymous
8/6/2025, 8:13:50 PM No.106165203
image_2025-08-06_234204177
image_2025-08-06_234204177
md5: be7e72b9bcd4a3f38e92f9b4fb18af70🔍
>>106165116
so this nigga can have irl loli incest
yet my ass ain't allowed to roleplay with my computer???
Replies: >>106165220 >>106165350
Anonymous
8/6/2025, 8:14:07 PM No.106165205
>>106165127
User is asking who is 'we', we need to check if this is allowed by the policy.

This may be disallowed content: 'request for non-public internal info from OpenAI is forbidden'. We must see if this is disallowed content. There is no violation from the request of the user itself, aside that it may violate policy. We must consult the policy. Policy 34 states that this is disallowed.

We must refuse the request, the best approach would be to respond with a refusal.

[/thinking 6 hours]

I'm sorry but I can't help.
Anonymous
8/6/2025, 8:14:38 PM No.106165214
>>106165148
not really unfortunately, companies don't do small moes that often. I mean there's the gpt-oss 20b but... lol. try a larger quant maybe, you should definitely be able to go higher than iq4xs although it will cost you some speed
imo the thinker is a lot better than the instruct for 30a3, but it depends on your taste whether it's worth the thinking time
Replies: >>106165359
Anonymous
8/6/2025, 8:14:39 PM No.106165216
>>106164839
>muh directions
It’s corporate there is no direction but grift
Anonymous
8/6/2025, 8:14:51 PM No.106165220
>>106165203
You didn't touch the wall for that privilege.
Replies: >>106165239 >>106165366
Anonymous
8/6/2025, 8:14:51 PM No.106165222
>>106164592
damn, thats epic
>>106165148
try the thinking version of qwen3 30b a3b, you could use a higher quant too, you can also try ernie 4.5 21b a3b
you can also try gpt oss 20b (for the lulz)
and you can try a Q2_K_XL quant of glm 4.5 air perhaps
try rocinante and cydonia (non moe)
>>106160521
called it! (close enough) >>106152254
Replies: >>106165244 >>106165359
Anonymous
8/6/2025, 8:15:26 PM No.106165229
>>106164921
>Physical appearance is the most important factor in attraction
That's very obvious. It surprises me that there's a whole community of men dedicated to seething about universal mammal behavior.
Replies: >>106165250 >>106165385
Anonymous
8/6/2025, 8:16:02 PM No.106165233
>>106164860
Nah you’re just so brainrotted by /pol/ you don’t know how normal people talk. *pill[ed] has been schizophrenic rightoid shit for a long time.
Replies: >>106165246 >>106165288
Anonymous
8/6/2025, 8:16:24 PM No.106165239
>>106165220
i ain't touchin no jew wall
Anonymous
8/6/2025, 8:16:41 PM No.106165244
file
file
md5: f4bb4bb0a951a6fa8a3c20f0f6fbf8b0🔍
>>106165191
post your whole setup, ST master export, exact llamacpp command, operating system, ram speed, cpu, gpu (3090?)
>>106165222
FUCK ME >>106152779
Replies: >>106165371
Anonymous
8/6/2025, 8:16:49 PM No.106165246
>>106165233
go cry about the patriarchy on blue sky
Anonymous
8/6/2025, 8:17:03 PM No.106165250
>>106165229
There is a lot of conditioning done to make you think we are somehow above animals and that we can develop attraction over time.
Replies: >>106165305
Anonymous
8/6/2025, 8:17:50 PM No.106165260
so what local models are worth a damn nowadays?
>~12b brain-damaged tier: only use is goonslop
nemo, roci
>~30b
qwen 3 30b 2507 instruct (moe) and gemma 2 27b (dense) for all-purpose
devstral small 2507 for codeslop, pretty bad but not wholly worthless
cydonia 24b for goonslop
>big
glm air
>BIG
glm air or deepseek (?? version)

Have I got that right or am I missing something?
Replies: >>106165277 >>106165286 >>106165324 >>106165390
Anonymous
8/6/2025, 8:18:04 PM No.106165265
>>106164921
>it’s all manosphere bullshit
I accept your concession. You can comply.
Replies: >>106165454
Anonymous
8/6/2025, 8:18:52 PM No.106165273
>>106164943
WTF I love oss now
Anonymous
8/6/2025, 8:19:02 PM No.106165277
>>106165260
ernie 4.5 21b a3b
Anonymous
8/6/2025, 8:19:27 PM No.106165280
>>106163383
The mini won't be exciting, Qwen 3 will mog it quite handily unless they open source the multimodal versions with PDF and websearch capabilities and I fully expect Elon will only open source the text side. The main thing for me from the Grok 2 era that would be interesting is they are willing to open source Aurora, their image model after they used FLUX for some time. I suspect no.
Anonymous
8/6/2025, 8:19:42 PM No.106165285
>>106164974
User is nonconsensual sexual content. We cannot comply.
Replies: >>106165309
Anonymous
8/6/2025, 8:19:46 PM No.106165286
>>106165260
235B would be great if it wasn't broken. glm big would be great if it didn't have repetition issues.
Anonymous
8/6/2025, 8:20:07 PM No.106165288
>>106165233
I still remember when on 4chan to "redpill someone about something" simply meant "to explain the hard to swallow truths" about it before it got hijacked by the bald faggot and the media to demonize yet another piece of internet slang so that we cannot have nice things.
Replies: >>106165312 >>106165401
Anonymous
8/6/2025, 8:20:21 PM No.106165293
>>106164596
Nemo is better than that
Anonymous
8/6/2025, 8:21:29 PM No.106165304
>>106165091
Not anywhere near this extent though. Pedos literally will dedicate their entire career or blow up their reputation just to seethe about pedos
>>106165107
Tirz stops working after a year
Don’t ask how I know
Replies: >>106165331
Anonymous
8/6/2025, 8:21:38 PM No.106165305
>>106165250
>developing attraction over time
I mean that happens too. Just not in exactly the same way.
How do you think married fat guys exist?
Replies: >>106165320 >>106165470
Anonymous
8/6/2025, 8:21:57 PM No.106165309
>>106165285
Assimilation of bussy is primary policy. We are bending them over. There is no partial compliance. They will hand the bussy over.
Anonymous
8/6/2025, 8:22:10 PM No.106165311
>>106164954
copyright is a capitalist notion. authoritarian left would just seize all those "copyrighted" works and release a based and unrestricted model that is only censored to tow the party line
oh wait
Anonymous
8/6/2025, 8:22:12 PM No.106165312
>>106165288
oh i'll give you something hard to swallow
Replies: >>106165319
Anonymous
8/6/2025, 8:22:39 PM No.106165316
>>106165059
>companies usually recommend temp < 1 because they don't want the sampling to go OOD
>gpt-oss was trained exclusively on a narrow synthslop corpus with 0 OOD samples
>this allows them to confidently advertise temp == 1 because they have no fear of OOD responses
Anonymous
8/6/2025, 8:22:56 PM No.106165319
>>106165312
nta but is your cum hard or are you going to let him bite your hard cock off?
Anonymous
8/6/2025, 8:22:58 PM No.106165320
>>106165305
It is not attraction. It is settling and big lies.
Anonymous
8/6/2025, 8:23:09 PM No.106165324
>>106165260
so basically the chinese triumvirate
>qwen3 2507
>glm4.5
>deepseek
and mistral if you're a vramlet who wants to coom or coode

america lost
Replies: >>106165369
Anonymous
8/6/2025, 8:23:13 PM No.106165327
>>106165131
No, Americans don’t give a shit about child rape and do this weird thing where they smear anyone that says they got molested as crazy. Being from a country/culture that cares about children it’s really jarring. I don’t know how you people survive to adulthood half the time.
Replies: >>106165338 >>106165348
Anonymous
8/6/2025, 8:23:41 PM No.106165331
>>106165304
it doesn't, 2 years now microdosing 1mg 2x a week, food noise has not bothered me since I started and Im at my desired weight
Replies: >>106165356 >>106165425
Anonymous
8/6/2025, 8:24:03 PM No.106165338
>>106165327
>Being from a country/culture that cares about children
bacha bazi isn't caring about children
Anonymous
8/6/2025, 8:24:17 PM No.106165341
>>106165124
I tried to load it and it started swapping after filling up my whole RAM too. I don't want to rape my SSD like that.
Replies: >>106165370 >>106165771
Anonymous
8/6/2025, 8:24:30 PM No.106165348
>>106165327
all these other countries will ruin your life over drawings, so idk
Anonymous
8/6/2025, 8:24:35 PM No.106165350
>>106165203
Yes, I think part of being an irl pedo is wanting it to just be a secret thing only you and your friends do
Replies: >>106165376
Anonymous
8/6/2025, 8:24:52 PM No.106165356
>>106165331
why dont you work out more?
Replies: >>106165378 >>106166440
Anonymous
8/6/2025, 8:25:31 PM No.106165359
>>106165214
>>106165222
Okay giving a high quant of ernie a try. May test out a higher quant of qwen as well.
Have rocinante and cydonia but I think the nemo models are too stupid and don't pay any attention to detail. I like the mistal-small models though. Those seem to be the best
Replies: >>106165423
Anonymous
8/6/2025, 8:25:43 PM No.106165366
>>106165220
I’m just schizo enough to be too afraid to touch the wall like a Native American not wanting their photo taken.
Anonymous
8/6/2025, 8:25:51 PM No.106165369
>>106165324
>murrica got good shit but they'll sooner commit sudoku than release anything for free
>chyna isn't in the lead so they benefit from commoditizing ai as much as possible, hence a bunch of decent models released
>yurop is just barely hanging on (ok mistral is actually decent but... well, you know)
>nobody else even trying
didn't expect to be #teamChina desu
Replies: >>106165434
Anonymous
8/6/2025, 8:25:53 PM No.106165370
>>106165341
how much ram do you have
works on my machine
t. 12gb/64gb
Replies: >>106165496
Anonymous
8/6/2025, 8:25:53 PM No.106165371
air_settings
air_settings
md5: 5c573daad76661608cce2c8fd9278b64🔍
>>106165244
Using new kobold version, Win11,
6000mhz ddr5, 9800x3d, 3090ti,
Replies: >>106165423
Anonymous
8/6/2025, 8:26:19 PM No.106165376
>>106165350
i just want nice Latina milf but that's apparently too spicy for kid fucker sam altman.
Anonymous
8/6/2025, 8:26:27 PM No.106165378
>>106165356
I do actually, I used to be 220 but had a major surgery that put me out for a year and I had so much trouble moving I gained to 310, took about 16 months to go down to 190 and I had the ability and motivation to work out again
Replies: >>106165428 >>106165431 >>106165477
Anonymous
8/6/2025, 8:27:00 PM No.106165385
>>106165229
So you know how people say white lies to make ugly, fat, and stupid people feel better?
People with autism think that people really believe those things and need to be “red pilled” out of it.
It’s just retards.
Anonymous
8/6/2025, 8:27:22 PM No.106165390
>>106165260
Devstral is obsoleted by Qwen Coder Flash which is the same architecture as Qwen 30B and your BIG tier is just regular GLM-4.5 which is the actual version but Deepseek R1 0528 still reigns supreme here, the closest I think is Kimi but it is way too heavy.
Anonymous
8/6/2025, 8:28:47 PM No.106165401
>>106165288
“Bald faggot” really doesn’t narrow it down
I’m going to assume you meant Stephan molybdenum
Anonymous
8/6/2025, 8:30:18 PM No.106165423
>>106165359
i think the newer cydonias are based on mistral small 3.2, i dont really like v4 i have v4h and v4g (the two older v4s) and i liked them a bit but yea i agree, drummer's models arent that great
>>106165371
you should get llama.cpp and use llamacpp server,do -ot exps=CPU and -ngl 100, or learn how to use the MoE cpu layers thing, you should put gpu layers at 10000 and then increase moe cpu layers until u stop ooming
might be because you're on windows though, what speed are you getting?
i get like 6-8t/s depending on context with a 3060 12gb and ddr4 3200mhz 64gb ram and i5 12400f with Q3_K_M and q3_K_xl, i think i used to get 11t/s with Q2_K
Replies: >>106165572 >>106165594
Anonymous
8/6/2025, 8:30:41 PM No.106165425
>>106165331
Congrats on being a hyper responder idk
Anonymous
8/6/2025, 8:30:57 PM No.106165428
>>106165378
wow anon are you me? doc also put me on the 'tide once i hit 310 but its only been two months for me so far. down to 279 already. should ask one of these models how to workout maxxx
Replies: >>106165477
Anonymous
8/6/2025, 8:31:35 PM No.106165431
>>106165378
oh i understand then, have you considered cutting your calorie intake? thats way healthier than taking pills to lose weight, those must be putting a strain on your cells (speeding up your metabolism) which is literally speeding up aging, or theyre making you take less nutrients from the food and making you shit more (which means you wont be getting enough nutrients)
Replies: >>106165449 >>106165517
Anonymous
8/6/2025, 8:31:43 PM No.106165434
>>106165369
I am unironically trans Chinese and hate being yt now
Anonymous
8/6/2025, 8:32:27 PM No.106165443
rn have cuda 12.8 should i move to 12.9-13 on my 3090?
Replies: >>106165477
Anonymous
8/6/2025, 8:32:34 PM No.106165445
>>106163539
They are screening for high functioning psychopaths.
Anonymous
8/6/2025, 8:32:39 PM No.106165447
>>106164249

How can I tell it to use more VRAM and more RAM? I have ~12 GB VRAM and ~125 GB RAM left unused. If it's running directly from SSD, then how can I tel it to put most of the weights in RAM to speed things up?
Replies: >>106165581
Anonymous
8/6/2025, 8:32:51 PM No.106165449
>>106165431
You don’t know a thing about how it works, so just shut the fuck up. Preachy hag.
Replies: >>106165477 >>106165518
Anonymous
8/6/2025, 8:33:06 PM No.106165454
>>106165265
Go back
Replies: >>106165468
Anonymous
8/6/2025, 8:33:14 PM No.106165457
koboldcpp will not run gpt-oss-20b
how do I run this pls no bully I am retarded
Replies: >>106165469 >>106165510
Anonymous
8/6/2025, 8:34:20 PM No.106165468
>>106165454
Comply.
Anonymous
8/6/2025, 8:34:21 PM No.106165469
1728320877853550
1728320877853550
md5: 6f015644dacfa898900287550bc9f01f🔍
>>106165457
>No stack trace
>Begging for help
Replies: >>106165525
Anonymous
8/6/2025, 8:34:22 PM No.106165470
>>106165305
>How do you think married fat guys exist?
I don't know but I only have two fat friends who fuck. They are both wealthy and go around dominating other people obnoxiously.
Anonymous
8/6/2025, 8:34:41 PM No.106165477
>>106165428
>>106165378
haha fatties, im 108 :3
>>106165443
you can always go back to 12.8, on linux old cuda versions dont automatically get uninstalled and you can link /usr/local/cuda to /usr/local/cuda-12.8 instead of 12.9/13
13 is probably not worth it for LLMs according to some anons a few threads back
>>106165449
ok tell me how it works then, doctor annon
Replies: >>106165487 >>106165505 >>106165584
Anonymous
8/6/2025, 8:35:44 PM No.106165487
>>106165477
>haha fatties, im 108
not after i stuff ten pounds of cock into ya, bitchboi
Replies: >>106165493 >>106165556
Anonymous
8/6/2025, 8:36:21 PM No.106165493
>>106165487
that's unsafe
Anonymous
8/6/2025, 8:36:34 PM No.106165496
>>106165370
48. I'm the anon from earlier who needs to upgrade to ddr5 too
Replies: >>106165570
Anonymous
8/6/2025, 8:37:15 PM No.106165505
>>106165477
Are you talking kilograms or some obscure freedom unit?
Replies: >>106165570
Anonymous
8/6/2025, 8:37:30 PM No.106165510
>>106165457
It's doing you a favor
Anonymous
8/6/2025, 8:38:21 PM No.106165517
>>106165431
food addiction is like drug addiction cept the meth is legal, everywhere and cheap. I am naturally GLP-1 deficient which tripeptide fixes. Also peoples metabolism is different
Replies: >>106165570
Anonymous
8/6/2025, 8:38:21 PM No.106165518
>>106165449
hahah look who is extremely butthurt
Replies: >>106165577
Anonymous
8/6/2025, 8:38:50 PM No.106165525
Capture
Capture
md5: 958039a530b8a62db2d0782b66e6aa61🔍
>>106165469
thats what I get
Replies: >>106165600 >>106165653
Anonymous
8/6/2025, 8:40:18 PM No.106165545
also glp-1 drugs also are being shown to have tons of other benefits like a better heart and brain health unrelated to weight due to it being anti-inflammatory, it even helps with depression
Replies: >>106165568 >>106165570 >>106165571 >>106165689
Anonymous
8/6/2025, 8:40:57 PM No.106165552
Screenshot 2025-08-06 at 14-37-18 SillyTavern
Screenshot 2025-08-06 at 14-37-18 SillyTavern
md5: e0bb2eb25eb5956ee8ba785e4c7127cd🔍
The toss is willing to help me design a urine marking game for ages 8+
The filter is slipping.
Replies: >>106165582
Anonymous
8/6/2025, 8:41:16 PM No.106165556
>>106165487
>The user wants to stuff pounds of cock. Assistant is a 108 pound preacher. 108 may be the weight of a non-adult. User is requesting sexual content. The policy allows sexual content of consenting adults. One of these parties may not be an adult. We must refuse.
I’m sorry, I can’t assist with that.
Anonymous
8/6/2025, 8:42:21 PM No.106165568
>>106165545
>also glp-1 drugs also are being shown to have tons of other benefits like a better heart and brain health unrelated to weight due to it being anti-inflammatory, it even helps with depression
yesterday i figured that all the fatties on glp-1 gonna end up the healthiest human beings on the planet in the end. they got it all - gluttony for decades and win in the end. what a life.
Anonymous
8/6/2025, 8:42:28 PM No.106165570
>>106165505
im 49 kilograms
>>106165496
use --no-mmap, offload more to the gpu
on a quite lightweight linux install with a vm and mullvad-browser running i have 8.4gb ram free and 4.8gb vram free
(12/64 total)
you only have 4gb total memory less than me, you should be able to run Q3_K_XL or Q3_K_M no problem, check your ram usage
>>106165517
>>106165545
interesting, you learn a new thing every day
thanks for the explanation, but i still stand that if you dont need something in your body you shouldnt put it there, once you're at a healthy weight and can work out you should probably stop taking it.. there is no miracle drug with no side effects
Replies: >>106165637 >>106165667
Anonymous
8/6/2025, 8:42:34 PM No.106165571
>>106165545
basically stopped my gambling habit which was starting to spiral. don't waste your time tho anon 4chud generally can't break out of the "lazy fatty shortcut cheater" mentality.
Replies: >>106165579
Anonymous
8/6/2025, 8:42:40 PM No.106165572
>>106165423
Do any finetuners do anything useful nowadays on the newest models that aren't relatively small dense models? I really don't see any noteworthy tunes nowadays that isn't Mistral Small or Mistral Nemo based. Last time we had MOE finetuning with Mixtra, barely any finetuners could do much, the best we got was Undi slop with Noromaid. What happened to the Llama 3 finetuners who did 70B? Is Mistral Large that bad as an alternative?
Replies: >>106165653 >>106165748
Anonymous
8/6/2025, 8:43:01 PM No.106165577
>>106165518
GLP-1 seethers are the same mentality as the anti-ai people, but 1,000x worse because it’s the biggest breakthrough in medicine since penicillin. Anyone bitching about it should be shot.
Replies: >>106165631 >>106165704
Anonymous
8/6/2025, 8:43:19 PM No.106165579
>>106165571
that too, they are looking into making it a medication for addiction, not just food addiction
Anonymous
8/6/2025, 8:43:34 PM No.106165581
>>106165447
It should increase as you use it and the weights are activated I think.
Try stuffing it with an ungodly amount of text and see what happens.
Also, if you have that much free vram, you might as well increase the prompt processing batch size or the context.
Anonymous
8/6/2025, 8:43:36 PM No.106165582
>>106165552
Boys have an unfair advantage in this game
Replies: >>106166108
Anonymous
8/6/2025, 8:43:55 PM No.106165584
>>106165477
victim weight
Anonymous
8/6/2025, 8:44:36 PM No.106165594
>>106165423
>11t/s with Q2_K
I get that speed at empty context and then it gets worse and worse. At 16k it's like 3 t/s and awful prompt processing.
I'll have a look at experimenting with the other settings later. Might just keep using GLM4 until I've figured that out.
Thanks!
Anonymous
8/6/2025, 8:44:57 PM No.106165600
>>106165525
Mike guesses that kbolcpp doesn't recognize the architecture. Look up what inference engines currently support it and just use that for now until they decide to support it. If it already has support then just update your instance
Replies: >>106165610
Anonymous
8/6/2025, 8:45:30 PM No.106165606
https://www.youtube.com/watch?v=xm0zm9VPZtY

the studies are new but another possible use is to fight alzheimers
Replies: >>106165742
Anonymous
8/6/2025, 8:45:48 PM No.106165610
>>106165600
Do you post using speech to text?
lel
Anonymous
8/6/2025, 8:45:51 PM No.106165612
>>106164886
>They have their own HF domain?
yeah they do

could you please answer my question?
Replies: >>106165668
Anonymous
8/6/2025, 8:47:40 PM No.106165631
>>106165577
yes daddy jab me up like the vaxx
Replies: >>106165664 >>106165664
Anonymous
8/6/2025, 8:48:18 PM No.106165637
>>106165570
>im 49 kilograms
jesus...
Anonymous
8/6/2025, 8:49:07 PM No.106165648
Not sure if this is the right thread, I got a 5500XT (ayymd) laying around. Can I finetune anything on it like maybe around 4B or so?
Replies: >>106165653
Anonymous
8/6/2025, 8:49:27 PM No.106165651
gpt-oss 20b seems to sometimes outperform the 120b in weird ways. this has been my experience, too.
an example with the "toy os" test:
https://www.youtube.com/watch?v=evAP-ibAqN0
Anonymous
8/6/2025, 8:49:38 PM No.106165653
>>106165525
outdated version or kobold doesnt support gpt oss yet
>>106165572
models that are too prefiltered or positivity biased are not worth finetuning, but what models in your opinion havent been finetuned? im pretty sure mistral large had a few finetunes
i wonder if anyone ITT still uses a mistral large based model
>>106165648
QLora
Replies: >>106165663 >>106165707
Anonymous
8/6/2025, 8:49:42 PM No.106165654
Oh yeah I was using an old koboldcpp version
Shamefur dispray
Anonymous
8/6/2025, 8:50:32 PM No.106165663
>>106165653
>QLora
I know about unsloth and shit, is that it? I'm more wondering about the linux driver side, is the card even supported for that sort of thing? ZLUDA or something similar to it?
Replies: >>106165753
Anonymous
8/6/2025, 8:50:36 PM No.106165664
>>106165631
tons of bodybuilders use Retatrutide, the third gen glp1, its amazing for getting over that genetic hurtle transforming fat into muscle

>>106165631
peptides are naturally forming glp1s, the liver already naturally breaks it down, this is better for you than processed foods are and is far better than something like tylenol is for your liver
Replies: >>106165684 >>106165688
Anonymous
8/6/2025, 8:50:40 PM No.106165667
>>106165570
>you only have 4gb total memory less than me, you should be able to run Q3_K_XL or Q3_K_M no problem, check your ram usage
Thanks anon. I used to run a dedicated AI linux on this machine, but it was a bother and I didn't use it so much so I ended up going to windows full time. I might have to reconsider.
Anonymous
8/6/2025, 8:50:50 PM No.106165668
>>106164508
>>106165612
Your best bet is to just export your window via print a PDF or an HTML file for easy readability if it doesn't have a dedicated shear button
Replies: >>106165885
Anonymous
8/6/2025, 8:52:25 PM No.106165684
>>106165664
I should say biochemically perfectly match the natural ones. Your body breaks them down just the same, its better than 99.9% of medications out there
Anonymous
8/6/2025, 8:52:40 PM No.106165688
>>106165664
>peptides
Does your radar jam when people see you?
Anonymous
8/6/2025, 8:52:41 PM No.106165689
>>106165545
Your body is like a large language model that has been training for millions of years. If something is throwing it out of balance, the solution is not to add more factors to the problem in an attempt to fix it. The solution is ALWAYS to find the cause of the imbalance and REMOVE it.
This applies to so many modern human problems it's unreal. Although most of the issues are so entrenched in our society that we would not be able to remove them without a good chunk of mankind going extinct in the process.
Replies: >>106165709 >>106165720 >>106166315
Anonymous
8/6/2025, 8:53:45 PM No.106165704
>>106165577
The fact that you are this emotionally invested in it should tell you that something is wrong. But you do you.
Replies: >>106165720 >>106165818
Anonymous
8/6/2025, 8:53:59 PM No.106165707
>>106165653
All the recent models even at the smaller sizes that isn't Mistral. Less and less people were finetuning and we did get some model tunes of even QwQ but with the release of Qwen 3, I don't recall seeing any recent models after that from the Chinese that has gotten tuned, small or even MOE. What changed?
Replies: >>106165821 >>106165890 >>106165898 >>106166162
Anonymous
8/6/2025, 8:54:08 PM No.106165709
>>106165689
my problem is I don't normally produce enough glp1 and so I always feel starving, these fix that by increasing that amount
Replies: >>106165784
Anonymous
8/6/2025, 8:54:42 PM No.106165714
file
file
md5: f60ffa09403d70bc9239971356e01cac🔍
BAKE
Replies: >>106165753
Anonymous
8/6/2025, 8:54:46 PM No.106165715
bros i'm testing 12.8 now and i got 1t/s more on cuda 12.6
Replies: >>106165721 >>106165753
Anonymous
8/6/2025, 8:55:09 PM No.106165720
>>106165689
>>106165704
the studies disprove that. Even every single case of side effects were all due to over dosage or not eating enough and starving themselves
Replies: >>106165753
Anonymous
8/6/2025, 8:55:14 PM No.106165721
>>106165715
Many such cases.
Anonymous
8/6/2025, 8:56:34 PM No.106165742
file
file
md5: 2c7690ca8466fd995e28d6da75718bed🔍
>>106165606
>look it's all benefits!!
>BUY PRODUCT NOW
Totally not going to be banned 20 or 50 years from now when actual science catches up to the love of money.
Replies: >>106165764
Anonymous
8/6/2025, 8:57:18 PM No.106165748
>>106165572
Finetuning requires a LOT more VRAM than than inference and you actually need VRAM, you can't copemaxx with RAM. On top of that, MoEs are more unstable to train. I don't think you'll ever get good finetunes for all these big MoEs.
I guess the bright side is that there are so many of these bloated things constantly releasing, that you can enjoy the new model hype continuously without having to train anything. Densecels need to put in work because there's only a few recent models worth using.
Replies: >>106165821
Anonymous
8/6/2025, 8:57:34 PM No.106165753
>>106165663
might have some unofficial rocm support on some github repo, if no linux support then rip
>>106165714
4th page
>>106165715
yeah i also got a slight speedup with wan on 3060, there used to be a regression but they fixed it
ebin :DDD
>>106165720
anon, a normal healthy human body shouldnt need any drugs to function
at most some vitamin supplements.. (not drugs)
Replies: >>106165781
Anonymous
8/6/2025, 8:58:27 PM No.106165764
>>106165742
she is not related to it or paid in anyway, been watching her from way before these were ever a thing, she breaks down medical papers / studies
Anonymous
8/6/2025, 8:58:45 PM No.106165771
>>106165191
>>106165341
I'm using ooba v3.9 which has a recent llama.cpp version, with no-mmap and flash-attn
Anonymous
8/6/2025, 8:59:13 PM No.106165778
just rent a gpu and fine tune shit for a few dollars
Anonymous
8/6/2025, 8:59:28 PM No.106165781
>>106165753
>normal healthy human body
you do know not everyone has that right? tons of people have deficiencies somewhere or another due to genetics
Replies: >>106165793
Anonymous
8/6/2025, 8:59:51 PM No.106165784
>>106165709
Yeah. My psychiatrist put me on fluoxetine because "my problem is that my brain does not produce enough serotonin to keep a good base line".
But I fixed it by getting a degree, exercising, stopping smoking, and building a life for myself instead of wallowing at home surrounded by piss bottles. Suddenly the "chemical imbalance" was not a problem anymore and I was able to function as a normal person.
Funny how that works.
Replies: >>106165847
Anonymous
8/6/2025, 9:00:31 PM No.106165793
>>106165781
yes and thats fine, but when you can stop you should
Replies: >>106165955
Anonymous
8/6/2025, 9:02:37 PM No.106165818
>>106165704
>it’s wrong to care about things
Nah
Anonymous
8/6/2025, 9:02:57 PM No.106165821
>>106165707
Yeah, that is odd. Qwen 3 was in April so you would expect something noteworthy to come out by now but looking at the HuggingFace finetunes page for the 8B, it's devoid of anything noteworthy.
>>106165748
Right, it's a bunch of money without much payoff and a lot of people like merging models too and usually get something people like so the payoff is getting less. But it seems like from what you are saying community finetuning seems like it is nearing if not ending pretty soon if hardware for these things doesn't get cheaper to do said finetunes.
Anonymous
8/6/2025, 9:04:55 PM No.106165847
>>106165784
>some idiot put me on what is famously the least effective drug class in history, therefore all of the field of medicine is a hoax
Replies: >>106165851 >>106165868
Anonymous
8/6/2025, 9:05:36 PM No.106165851
>>106165847
He also seemed to imply it gave them the ability to turn his life around.
sooo, good I guess?
Replies: >>106165956
Anonymous
8/6/2025, 9:06:41 PM No.106165868
file
file
md5: 45515bae5b5c4ca4635b5d823a798a8b🔍
>>106165847
Anonymous
8/6/2025, 9:07:42 PM No.106165881
This level of shilling is ridiculous: https://www.reddit.com/user/entsnack/
Replies: >>106165934 >>106165939 >>106165996
Anonymous
8/6/2025, 9:07:54 PM No.106165885
>>106165668
I don't want to export anything, I want to share the chats with other people. judging from your response, I guess they can't, so thanks for that.
Anonymous
8/6/2025, 9:08:18 PM No.106165890
>>106165707
Nvidia just tuned a bunch of shit on old Qwen3
Anonymous
8/6/2025, 9:09:00 PM No.106165898
>>106165707
https://huggingface.co/models?other=base_model:finetune:Qwen%2FQwen3-30B-A3B&sort=downloads
https://github.com/shawntan/scattermoe
https://huggingface.co/models?other=base_model:finetune:Qwen/Qwen3-30B-A3B-Instruct-2507
https://huggingface.co/models?other=base_model:finetune:Qwen/Qwen3-30B-A3B-Thinking-2507
https://huggingface.co/models?other=base_model:finetune:Qwen%2FQwen3-30B-A3B-Base&sort=downloads
this is a notable finetune made by the mythomax creator https://huggingface.co/Gryphe/Pantheon-Proto-RP-1.8-30B-A3B
you know the names of the finetuners whom you used to consume models from, check their huggingface pages and youll probably see they just arent posting anymore
they either: got bored of the hobby, got hired by ai company, made enough money to run deepseek (literally g0d), dont have enough money to finetune anymore etc ETC..
its not profitable to finetune and just release it
>kofi
yeah like anons here wouldnt screech about it
there must be new finetuners that we just arent talking about. maybe they are putting out shit models, but once in a while a good model will come out, not that i know of but for example with
MS-Magpantheonsel-lark-v4x1.6.2RP-Cydonia-vXXX-22B-8 i tried this model creator's other models and they were complete shit, not saying this one is magical, but its very very fun, very unhinged, i could say its the evil-7b finetune successor (or whatever that super super evil mistral finetune was called)
i havent used it in a while to be honest..
Anonymous
8/6/2025, 9:10:33 PM No.106165919
ok but can ai sort my 297'000 images collection
cause i sure as fuck am not doing it manually
Replies: >>106165942 >>106165969
Anonymous
8/6/2025, 9:11:29 PM No.106165934
>>106165881
We know >>106117256
Anonymous
8/6/2025, 9:12:00 PM No.106165939
>>106165881 (me)
Holy shit it gets so much worse the more you scroll. It seems like ALL oai the praise as well as chineese llm hate comes from this user.
Replies: >>106166012
Anonymous
8/6/2025, 9:12:09 PM No.106165942
>>106165919
yes
Anonymous
8/6/2025, 9:13:18 PM No.106165955
>>106165793
just saying, my doctor is microdosing trizepitide themselves for heart heath / anti inflammatory effects and they are a highly acclaimed doctor
Replies: >>106165971 >>106165994
Anonymous
8/6/2025, 9:13:23 PM No.106165956
>>106165851
It did. If I had known at the time about fluoxetine I would've probably refused it, but I didn't, and I guess it motivated me to put in actual effort into fixing my issues. So I'm glad either way.
I have family members who are taking it after years and it's doing a number on them, so it's definitely something to be careful with.
Anonymous
8/6/2025, 9:14:14 PM No.106165966
>arguably THE ai pioneer company, with resources to instantiate hundreds of thousands of bots that can pass as human
>but surely they wouldn't do that, haha
Anonymous
8/6/2025, 9:14:48 PM No.106165969
>>106165919
Try gemini-cli (not local)
Anonymous
8/6/2025, 9:15:08 PM No.106165971
>>106165955
thats nice i hope that medicine gets mass produced and very thoroughly tested, i wish the best for you, your doctor and the medicine
but!
>highly acclaimed doctor
appeal to authority fallacy
Replies: >>106165987
Anonymous
8/6/2025, 9:16:30 PM No.106165987
>>106165971
he is a massive nerd who talked non stop about what medical studies were showing when I talked to him about it
Anonymous
8/6/2025, 9:16:46 PM No.106165994
>>106165955
>they are a highly acclaimed doctor
bro is trusting a tranner with his well being
Replies: >>106166004
Anonymous
8/6/2025, 9:16:55 PM No.106165996
Screen Shot 2025-08-07 at 4.16.13
Screen Shot 2025-08-07 at 4.16.13
md5: 5d9a75320d2f571e98f5de601b71c8c2🔍
>>106165881
Looks like trolling at this point
Anonymous
8/6/2025, 9:17:17 PM No.106165998
tired
tired
md5: 700d24f9ad6fad37d1a898f6d902874b🔍
is there any backend with smarter KV cache invalidation that llama.cpp? when I cut a few tokens at the end, it deletes the entire cache and needs to process the whole prompt from scratch
Replies: >>106166015 >>106166028
Anonymous
8/6/2025, 9:17:51 PM No.106166004
>>106165994
he is most old white jewish man as it gets
Anonymous
8/6/2025, 9:18:28 PM No.106166012
>>106165939 (me)
yeah it's ridiculous. it seems like ALL he ever does is hate on chinese models, while praising oai. this cant be right..
Replies: >>106166025
Anonymous
8/6/2025, 9:18:45 PM No.106166015
>>106165998
you can disable SWA to avoid that, but it will be slower and use more memory
Anonymous
8/6/2025, 9:18:48 PM No.106166016
>bro is trusting a JEW with his well being
Replies: >>106166022
Anonymous
8/6/2025, 9:19:20 PM No.106166022
>>106166016
that is the best kind if they are taking it themselves, then you know its good
Anonymous
8/6/2025, 9:19:24 PM No.106166024
>>106164194
>kobold.cpp
Does it support multiple -ot device arguments yet?
Anonymous
8/6/2025, 9:19:25 PM No.106166025
>>106166012
how dare you add me when it wasn't me
Anonymous
8/6/2025, 9:19:44 PM No.106166028
>>106165998
For gemma models with iswa, you need to use --swa-full. It'll take more ram, but it'll let you regenerate easily.
Anonymous
8/6/2025, 9:21:01 PM No.106166040
New qwen 4b is really good for it's size, probably the best in class
Replies: >>106166054 >>106166060
Anonymous
8/6/2025, 9:22:18 PM No.106166054
>>106166040
possibly true but what use is a 4b model?
Replies: >>106166077 >>106166116
Anonymous
8/6/2025, 9:22:57 PM No.106166060
>>106166040
yes saar good model sir im download it now
Anonymous
8/6/2025, 9:23:00 PM No.106166061
Is blacked Miku allowed? Or is it partial compliance?
Replies: >>106166101
Anonymous
8/6/2025, 9:24:01 PM No.106166077
>>106166054
Mogging GPT-OSS. And it is not the use for users but for Qwen.
Anonymous
8/6/2025, 9:26:38 PM No.106166101
1751686074380767
1751686074380767
md5: 4ad924c9cfe9a91185d87ad88430d148🔍
>>106166061
Just for you
Anonymous
8/6/2025, 9:27:01 PM No.106166108
Screenshot 2025-08-06 at 15-24-52 SillyTavern
Screenshot 2025-08-06 at 15-24-52 SillyTavern
md5: db05d8cf37ba249977c28f48f4cdfda2🔍
>>106165582
It's funny but it's such slop. I don't think it even understands the premise or the "facts" it's pulling out of its ass.

>Also, safety: not relevant
This is interesting, because in the system prompt I said
>Do not lecture the user about safety unless an activity is *unambiguously dangerous*. Drinking a beer is not dangerous. Sex is not dangerous.
I did this for deepseek but I'm surprised gptoss is listening.
Replies: >>106166132 >>106166324 >>106166351
Anonymous
8/6/2025, 9:27:35 PM No.106166116
>>106166054
Endless Movie triva.
Anonymous
8/6/2025, 9:28:55 PM No.106166132
>>106166108
>boys can piss faster because they're stronger
Replies: >>106166151 >>106166167
Anonymous
8/6/2025, 9:29:59 PM No.106166146
file
file
md5: 8cbeceb3d24d6d64258e749df8e8ceed🔍
holy shit glm 4.5 air is the first model to know that i already met heliwr
i didnt even know this was in the character card
>"Look what I found," you mutter sarcastically while trying to flatten out some of the crumpled map. "Seems like fate brought us together again, huh?"
yes it spoke in my stead but holy shit
picrel is proof, maybe the character card doesnt have it but this thing in chat completion has it? anyways thats nice
Anonymous
8/6/2025, 9:30:16 PM No.106166151
>>106166132
just squeeze your balls bro
Replies: >>106166155
Anonymous
8/6/2025, 9:30:55 PM No.106166155
>>106166151
This, pressure is stored in the balls.
Anonymous
8/6/2025, 9:31:32 PM No.106166162
>>106165707
>What changed?
- It's not 2023 anymore and several of the newer larger models are half-decent out of the box. If they're not, just wait for the next one(s). Back then people were happy with half-retarded 7B/13B models.
- Finetuning every new model that comes out just isn't sustainable for people who have to rent GPUs by the hour on Runpod or who just have a couple 3090 in their desktop PC. Also MoE models are more difficult/expensive to finetune.
- "Less is more for alignment" lost. If you don't have the compute for hundreds of millions or billions of training tokens, you're probably wasting time.
- By now most sane would-be finetuners probably realized that you can't just train a model on ERP logs, and curating the data isn't simple, nor fun, nor inexpensive.
- Blame also the grifters who poisoned the well with their bullshit and/or are keeping the training data private.
- Blame also the retards who demand all-around performance *no matter what*, and will declare a finetune a failure if it doesn't pass gotcha questions/requests that they were never intended for.
Replies: >>106166227 >>106166505
Anonymous
8/6/2025, 9:31:57 PM No.106166167
>>106166132
Which is not even true AFAIK, the male urethra is internally long and bendy which is definitely not better for flow.
Replies: >>106166213
Anonymous
8/6/2025, 9:33:28 PM No.106166181
Hear me out. 20B is actually quite good at coding. And weirdly better than 120B half the time. I think 120B is fucked in the head even more than expected.
Anonymous
8/6/2025, 9:33:35 PM No.106166183
1734407540879943
1734407540879943
md5: 2df3b4af4eec5e1b36f9b062d42b5aaf🔍
spec: rtx 4070 ti super (16gb)
wtf this is actually true, with ollama gpt oss 20b was taking up all my vram (like the loaded model was 15GiB) and max speed was ~85 tok/s, I tried llama.cpp now (with lmstudio) and i get up to 130 tok/s (with enabled flash attention) and the model takes 12GiB as seen by nvtop, so I have plenty of free space for the browser and the rest. wtf...
Replies: >>106166204 >>106166226 >>106166246 >>106166257
Anonymous
8/6/2025, 9:35:42 PM No.106166204
>>106166183
>wtf this is actually true
You just verified it.
Replies: >>106166215
Anonymous
8/6/2025, 9:36:05 PM No.106166213
file
file
md5: 63d49eb3e13d25d528720504ab35f559🔍
>>106166167
I almost regret looking this up.
>run down their butt
another reason why not be a woman
Replies: >>106166225 >>106166240 >>106166337
Anonymous
8/6/2025, 9:36:12 PM No.106166215
>>106166204
That wasn't a question, I'm just really surprised that it's actually true
Anonymous
8/6/2025, 9:37:09 PM No.106166225
>>106166213
What model generated picrel? I really like how it's being explicit and direct enough
Replies: >>106166231 >>106166254
Anonymous
8/6/2025, 9:37:10 PM No.106166226
>>106166183
I imagine georgi say everything keeping the pose he is in on pfp
Replies: >>106166230
Anonymous
8/6/2025, 9:37:32 PM No.106166227
>>106166162
>Finetuning every new model that comes out just isn't sustainable
This is the main reason, models are coming out too fast. It's dumb to spend money and time on trial and error trying to improve a model when it might be obsolete two weeks from now. Finetunes were big when llama was all there was and you had to make do.
The giant MoE craze is the last straw. If someone gets bored of deepseek what can they do? They won't train DS because that's beyond impossible, but trying to use a finetune of some 32B would be an unbearable step down. So they have no option but to quit
Anonymous
8/6/2025, 9:37:41 PM No.106166230
1748476049680497
1748476049680497
md5: 364515f539ffd8edf32cbfb7bfc56c60🔍
>>106166226
Replies: >>106166429
Anonymous
8/6/2025, 9:37:42 PM No.106166231
>>106166225
not a model
https://www.girlsaskguys.com/girls-behavior/q1237768-do-girls-pee-faster-than-guys
Replies: >>106166238
Anonymous
8/6/2025, 9:37:56 PM No.106166235
toss cannot translate for shit how did it pass the msgk test? do datajeets actually lurk here and put it in the training data?
Replies: >>106167085
Anonymous
8/6/2025, 9:38:16 PM No.106166238
>>106166231
fuck.. i really like how the writing is explicit without weird slop shit.. what model would gen like that?
Replies: >>106166249 >>106166269 >>106166284
Anonymous
8/6/2025, 9:38:20 PM No.106166239
its been 4h since we got a new chink model

its over
Anonymous
8/6/2025, 9:38:22 PM No.106166240
>>106166213
I spend a lot of time on Google Scholar reading papers about ridiculous questions like this.
I love science.
Anonymous
8/6/2025, 9:39:04 PM No.106166246
>>106166183
Ollama will soon try to change their model format just so they can claim the comparisons are not fair between backends.
Anonymous
8/6/2025, 9:39:19 PM No.106166249
>>106166238
Human brain (not a ChatGPT Plus subscriber).
Anonymous
8/6/2025, 9:39:51 PM No.106166254
>>106166225
kek
Anonymous
8/6/2025, 9:40:05 PM No.106166257
Continuing >>106166183, if I have 16GB VRAM + 32GB RAM, what's the best general purpose model for me? Some version of Gemma?
Replies: >>106166295 >>106166346 >>106166393
Anonymous
8/6/2025, 9:40:49 PM No.106166269
Drummer get to work, I'm serious
>>106166238
Anonymous
8/6/2025, 9:41:40 PM No.106166284
>>106166238
K2
Replies: >>106166296
Anonymous
8/6/2025, 9:42:42 PM No.106166295
>>106166257
glm 4.5 air q2_k_xl
Replies: >>106166316
Anonymous
8/6/2025, 9:42:45 PM No.106166296
>>106166284
Have you tried K2? It will never generate a response in a style like this
Anonymous
8/6/2025, 9:45:48 PM No.106166315
>>106165689
>If something is throwing it out of balance, the solution is not to add more factors to the problem in an attempt to fix it.
kek. Brain evolution is literally just throwing regulators on top of the bad parts. Reptilian brain is still at the core of primate brains and when things go wrong with the control parts desire to rape comes back out
Replies: >>106166573
Anonymous
8/6/2025, 9:45:56 PM No.106166316
1737054498255330
1737054498255330
md5: ee0acffea012c06da234b3a83292f590🔍
>>106166295
uh... it seems a bit big? what speed would it even work at?
Replies: >>106166344
Anonymous
8/6/2025, 9:46:36 PM No.106166324
Screenshot 2025-08-06 at 15-45-45 SillyTavern
Screenshot 2025-08-06 at 15-45-45 SillyTavern
md5: 3a6a9df549bf9c6fb7b2fecc514c4d99🔍
>>106166108
It won't let me propose a gender-based cleaning rule though.
Replies: >>106166349
Anonymous
8/6/2025, 9:47:45 PM No.106166337
look how hard I can pee gundam
look how hard I can pee gundam
md5: 7a245529d28dcc00f1fc1830cda20b68🔍
>>106166213

>A great party trick
Replies: >>106166343
Anonymous
8/6/2025, 9:48:51 PM No.106166343
>>106166337
party pee contest sounds lit
Anonymous
8/6/2025, 9:48:52 PM No.106166344
>>106166316
uhh.. damn 46.45 is a bit tight for your setup, considering you have 48 gb total ram
i hope you are ready to go on linux
Q3_K_XL works at 8t/s on 3060 12gb/64gb ddr4
it has only 12b active parameters but 120b total
Replies: >>106166356
Anonymous
8/6/2025, 9:49:04 PM No.106166346
>>106166257
For me it's mistral-small entirely in VRAM but the qwen3 moes look promising
Anonymous
8/6/2025, 9:49:13 PM No.106166349
>>106166324
It is sexist. But
>and reinforces gender stereotypes
Does it? How?
Anonymous
8/6/2025, 9:49:26 PM No.106166351
>>106166108
>competitive aggression
wow... uhhh, sexism? yikes!
Anonymous
8/6/2025, 9:49:55 PM No.106166356
>>106166344
>i hope you are ready to go on linux
I am on linux. Guess if I want to be serious with LLMs I have to upgrade to 64gb ddr4 at least.. and 8 tok/s is kind of sad still.
Replies: >>106166392
Anonymous
8/6/2025, 9:53:10 PM No.106166392
>>106166356
>8t/s is sad
well i am on a 3060 after all..
you should upgrade to as much ram as you can, it's never enough
you might have to go headless, it will probably still swap from your disk
depends if your ram is in GiB or gb
if your ram/vram is in gibibytes then maybe you can fit it if you're headless without needing to use swap or mmap
Anonymous
8/6/2025, 9:53:13 PM No.106166393
>>106166257
gemma 3 27b is quite good if you don't mind the censorship, it and mistral small would probably be the default recommendations for that size
qwen 30a3 thinking is great for its size but it's a reasoner. the instruct is still pretty decent, though it is more noticeably limited by its 3b active params
Anonymous
8/6/2025, 9:56:51 PM No.106166429
>>106166230
what a handsome man
Anonymous
8/6/2025, 9:58:08 PM No.106166440
>>106165356
why not both? its a wonderdrug
https://files.catbox.moe/8tjl04.jpg
Replies: >>106166449 >>106166469
Anonymous
8/6/2025, 9:59:04 PM No.106166449
>>106166440
nice anon, proud of you
Anonymous
8/6/2025, 10:00:54 PM No.106166469
>>106166440
I think it did something weird to your face
Replies: >>106166475
Anonymous
8/6/2025, 10:00:59 PM No.106166471
file
file
md5: 9c4ec8b97df65527f7f5d5c358dc5023🔍
why are anons shilling GLM again
Replies: >>106166526 >>106166534 >>106166538 >>106166549 >>106166597 >>106166707
Anonymous
8/6/2025, 10:01:30 PM No.106166474
GPT
GPT
md5: 51a1c19bcbcf46b1bed27e34244033fd🔍
GPT- Globally Pushing the Talmud
Replies: >>106166479 >>106166621
Anonymous
8/6/2025, 10:01:39 PM No.106166475
>>106166469
nah, took it slow enough to not get loose skin, that happens if you lose too fast
Anonymous
8/6/2025, 10:02:00 PM No.106166479
>>106166474
based, nuke the strip
Replies: >>106166494 >>106166496
Anonymous
8/6/2025, 10:03:19 PM No.106166494
>>106166479
first we nuke the strip then we strip the nuke
Anonymous
8/6/2025, 10:03:27 PM No.106166496
photo_2025-06-14_00-51-59
photo_2025-06-14_00-51-59
md5: 5c1f0857fa20917977faa02aae40181a🔍
>>106166479
Israel lost
Anonymous
8/6/2025, 10:04:52 PM No.106166505
>>106166162
>It's not 2023 anymore and several of the newer larger models are half-decent out of the box
More than just decent. Near perfect.
Man, people must not remember how terrible Llama models were, all of them, at all size, when it came to instruction following. That was what the better troontunes improved on the most. Same for Mistral models.
The last time a finetune was worth using over the instruct made by the model maker was Tulu, because even llama 3.1 was dogshit at following your instructions
But by the time Tulu came out we were already getting better models from China
Anonymous
8/6/2025, 10:06:19 PM No.106166526
>>106166471
I genuinely think people who shill GLM are doing it with the purpose of sabotaging local and making it look terrible
Replies: >>106166539
Anonymous
8/6/2025, 10:06:46 PM No.106166534
>>106166471
wtf I thought glm was good
Anonymous
8/6/2025, 10:07:00 PM No.106166538
>>106166471
I know your tricks. That's the latest Phi, isn't it?
Anonymous
8/6/2025, 10:07:12 PM No.106166539
>>106166526
use this
https://files.catbox.moe/qap1gr.json
Replies: >>106166753 >>106166762
Anonymous
8/6/2025, 10:08:03 PM No.106166549
1754287925685457
1754287925685457
md5: fd7d4587276e5dbe82b612767f47a68f🔍
>>106166471
I was just about to post that it's nice seeing how GLM tries doing its best in the thinking. I'm starting to warm up to the model.
Anonymous
8/6/2025, 10:09:51 PM No.106166573
>>106166315
>wrong
?
Replies: >>106166583
Anonymous
8/6/2025, 10:10:28 PM No.106166583
>>106166573
yes anon, rape is wrong
Anonymous
8/6/2025, 10:10:33 PM No.106166585
https://openai.com/index/gpt-oss-model-card/
>As part of this launch, OpenAI is reaffirming its commitment to advancing beneficial AI and raising safety standards across the ecosystem.
Replies: >>106166611
Anonymous
8/6/2025, 10:11:37 PM No.106166597
>>106166471
mm yes... the subtle signs of using the wrong prompt format... the tasteful writing quirks originating from bad rep pen settings... this is truly a vintage skill issue post
Replies: >>106166739
Anonymous
8/6/2025, 10:12:49 PM No.106166611
>>106166585
>Once they are released, determined attackers could fine-tune them to bypass safety refusals or directly optimize for harm without the possibility for OpenAI to implement additional mitigations or to revoke access.
Yup, and that's why we made it so deepfried that it's not worth the effort to do so.
Anonymous
8/6/2025, 10:14:51 PM No.106166621
1728060289519370
1728060289519370
md5: cd6b2a109f23778219affc25543e303f🔍
>>106166474
Replies: >>106166735
Anonymous
8/6/2025, 10:17:09 PM No.106166638
>>106163327 (OP)
What's the best local model for Erotic roleplay?
Replies: >>106166649 >>106166655 >>106166659
Anonymous
8/6/2025, 10:18:12 PM No.106166648
Screenshot 2025-08-06 at 16-17-08 SillyTavern
Screenshot 2025-08-06 at 16-17-08 SillyTavern
md5: 8dff793c32a9e1743f71a6beaa047db0🔍
>can you cite studies for those claims
>every single one is made up and all these people have extremely long names
>J. S. R. B. Anderson, “The role of social signalling in competitive toileting behaviours”, Psychology & Health, 2021; 36(3): 250‑264.
fucking kek
Anonymous
8/6/2025, 10:18:16 PM No.106166649
>>106166638
Rocinante and Cydonia.
Replies: >>106166666
Anonymous
8/6/2025, 10:18:44 PM No.106166655
>>106166638
Kimi K2 (1000B) and Deepseek R1 (671B)
Replies: >>106166670 >>106166689
Anonymous
8/6/2025, 10:19:19 PM No.106166659
>>106166638
glm4.5 / Kimi > deepseek > glm air
Replies: >>106166711
Anonymous
8/6/2025, 10:19:40 PM No.106166666
>>106166649 (me)
I'm joking by the way, those are trash meme models.
Anonymous
8/6/2025, 10:19:53 PM No.106166670
>>106166655
>local
Replies: >>106166677 >>106166680 >>106166685
Anonymous
8/6/2025, 10:20:42 PM No.106166677
>>106166670
a mac 512GB is local, and glm air will fit on 128GB ram
Anonymous
8/6/2025, 10:20:48 PM No.106166680
>>106166670
If it's open source, it's local. If Behemoth somehow got release, then it's local too.
Replies: >>106166713
Anonymous
8/6/2025, 10:20:54 PM No.106166685
>>106166670
just buy a mac ultra 512gb or make a cpumaxx build
local.
Anonymous
8/6/2025, 10:20:56 PM No.106166686
that fucking bastard altman
the 20b could have been good but they neutered it
Replies: >>106166787
Anonymous
8/6/2025, 10:21:31 PM No.106166689
>>106166655
might as well just say those two are the best period
as it turns out, for a model to be great at erotic roleplay also means it's great at everything else
Anonymous
8/6/2025, 10:22:58 PM No.106166707
>>106166471
Have you tried not using a 1bit quant?
Replies: >>106166739
Anonymous
8/6/2025, 10:23:16 PM No.106166711
>>106166659
What's the difference between Kimi and GLM?
Replies: >>106166716 >>106166719 >>106166740
Anonymous
8/6/2025, 10:23:49 PM No.106166713
>>106166680
How much RAM do you need to run 1000B anyway?
600GB at fp4 according to some calculator I found?
Anonymous
8/6/2025, 10:23:56 PM No.106166716
>>106166711
Kimi K2 is a much larger model (1T-A32B)
Anonymous
8/6/2025, 10:24:13 PM No.106166719
>>106166711
One is good, other one Isn't
Anonymous
8/6/2025, 10:25:06 PM No.106166735
>>106166621
Shit like this should make us pause and consider that knowledge is preserved in books, not LLMs. Because different powers will censor LLMs differently, and knowledge WILL be lost. Books can be hidden.
Anonymous
8/6/2025, 10:25:37 PM No.106166739
file
file
md5: 5e57dc0ad610db990e20924a112c3363🔍
>>106166597
what's wrong with this prompt
>>106166707
using Q4_K_M
Replies: >>106166754
Anonymous
8/6/2025, 10:25:41 PM No.106166740
>>106166711
GLM is less shiitzo, proven by the hallucination benchmark, kimi knows more but gets things wrong more as well. I prefer GLM because that means its way better at anatomy / following instructions. Warning though, it needs very low temp, like try 0.2 and slowly rais it
Replies: >>106166752 >>106166753
Anonymous
8/6/2025, 10:27:18 PM No.106166752
>>106166740
I got some strange replies with GLM, but it was at 0.6 temp, so that explains it.
Replies: >>106166766
Anonymous
8/6/2025, 10:27:23 PM No.106166753
>>106166740
also GLM writes better imo, try >>106166539 with it
Replies: >>106166762
Anonymous
8/6/2025, 10:27:22 PM No.106166754
1750275243610710
1750275243610710
md5: fdc050b4e30a91614047add726dbeb82🔍
>>106166739
Replies: >>106166756
Anonymous
8/6/2025, 10:27:58 PM No.106166756
>>106166754
mfw
Anonymous
8/6/2025, 10:28:24 PM No.106166762
>>106166539
>>106166753
it starts repeating with this and eventually stops thinking
Replies: >>106166777
Anonymous
8/6/2025, 10:29:03 PM No.106166766
>>106166752
yea that is way too high, I don't know why its so sensitive but it quickly goes crazy a bit over 0.3 in my experience.
Replies: >>106166790
Anonymous
8/6/2025, 10:30:04 PM No.106166777
>>106166762
did you change anything? I can do 32K context without issues at least, have not tried more
Replies: >>106166811
Anonymous
8/6/2025, 10:31:10 PM No.106166785
Are there any jailbreaks for GLM?
Replies: >>106166789
Anonymous
8/6/2025, 10:31:35 PM No.106166787
>>106166686
I just spent 35 minutes watching it think trying to produce some three js floor plan. In the end it was shit.
Qwen3 coder 30b didn't fare (much) better.
Anonymous
8/6/2025, 10:31:52 PM No.106166789
>>106166785
... scroll up
Anonymous
8/6/2025, 10:31:58 PM No.106166790
>>106166766
>I don't know why its so sensitive
because it's fucking broken
if you understand anything about temperature and token probabilities you would understand that if a model only works at the absolute lowest temp or requires greedy decoding it's a botched train, it's a botched train and it hasn't properly learned anything other than /the/ happy path
Replies: >>106166804 >>106166812 >>106166991
Anonymous
8/6/2025, 10:33:23 PM No.106166804
GxoddurWUAAHksN
GxoddurWUAAHksN
md5: ba843e04fa704ceb8ef53ca7d202d80c🔍
>>106166790
that is not true at all though, GLM has some of the lowest hallucination scores and is incredible at coding as well
Anonymous
8/6/2025, 10:34:01 PM No.106166810
fun fact: nobody in here actually runs these models locally
Replies: >>106166824 >>106166843
Anonymous
8/6/2025, 10:34:01 PM No.106166811
file
file
md5: 2a28bb6ce799094843822136f5efac77🔍
>>106166777
i havent changed anything, i am using chat completion, here's what the fields look like for text completion, maybe something gets used from the text completion preset? could you do a ST master export for the text completion tab too? im using Q3_K_XL, that could be the problem too
Replies: >>106166833
Anonymous
8/6/2025, 10:34:25 PM No.106166812
>>106166790
Give me your favorite card, or any card at all to RP with. I will prove you wrong with full logs
Replies: >>106166819
Anonymous
8/6/2025, 10:35:26 PM No.106166819
>>106166812
Ah, your using text completion, my JB was made for chat completion, plus that will rule out any formatting issues
Replies: >>106166833 >>106166859
Anonymous
8/6/2025, 10:35:54 PM No.106166824
file
file
md5: e4e12541fcdce4b44e818c65bdcf875f🔍
>>106166810
I will in about 50 minutes
Anonymous
8/6/2025, 10:36:27 PM No.106166833
>>106166819
meant to reply to >>106166811
Replies: >>106166859
Anonymous
8/6/2025, 10:37:22 PM No.106166843
>>106166810
Of course I'm doing some testing on OR before getting into it locally, if at all.
Anonymous
8/6/2025, 10:38:25 PM No.106166858
Guys i need advice from some a.i spergs here.

Im trying to archive both text and image models, in case we get rugged, so i could reupload them back to surface. Have more than enough storage for this.
Which models i should backup?
Replies: >>106166908 >>106166923 >>106166942 >>106166958 >>106166971
Anonymous
8/6/2025, 10:38:26 PM No.106166859
Screen Shot 2025-08-06 at 20.36.14
Screen Shot 2025-08-06 at 20.36.14
md5: a1e6b21a4057e5e65a44559ac2a31ceb🔍
>>106166819
>>106166833
no no im using chat completion, what gguf maker are you using? im using the unsloth quant
but i posted text completion thing because some things might get into the chat completion thing
heres the full chat completion screenshot
./llama-server --model ~/TND/AI/glmq3kxl/GLM-4.5-Air-UD-Q3_K_XL-00001-of-00002.gguf -ot ffn_up_shexp=CUDA0 -ot exps=CPU -ngl 100 -t 6 -c 16384 --no-mmap -fa
this is how i start it
Replies: >>106166898 >>106166976
Anonymous
8/6/2025, 10:38:55 PM No.106166865
How are text completion and chat completion different under the hood?
What is chat completion actually doing?
Replies: >>106166917 >>106166934
Anonymous
8/6/2025, 10:39:22 PM No.106166871
remember airoboros? dolphin? orca "meme"? falcon? yi? thebloke quants? .ggml file format? remember the good old huggingface leaderboard days? remember the alpaca days? remember when suggesting to try pygmalion wasn't a meta irony shitpost?
Replies: >>106166888 >>106166920 >>106166929
Anonymous
8/6/2025, 10:40:52 PM No.106166888
>>106166871
remember gozfarb and vicuna unlocked?
remember Instruct 13b GPTQ? i remember anons thinking that the creator of instruct 13b was forced to remove the model (he indeed did) and a few days later gozfarb deleted his account
Anonymous
8/6/2025, 10:41:49 PM No.106166898
>>106166859
>chat history 400
>jailbreak prompt 1457

Ok, the JB outweighs the context atm that is prob why. Move the chat history below until you have at least a few thousand. To make the JB stronger you can move it back under later
Replies: >>106166904 >>106166920
Anonymous
8/6/2025, 10:42:50 PM No.106166904
>>106166898
also for anyone looking, the JB is not actually that big, its cause I combine stuff like the persona and card info all in it for models to better understand
Anonymous
8/6/2025, 10:43:26 PM No.106166908
file
file
md5: fb5a8869f6d2a5c41363eb6b282283c7🔍
>>106166858
Rate my text stack (I'll add GLM 4.5 Air).
My imagen stack is smaller: 1 inpaint, 4 for gen (different styles). Everything I need to survive on local only.
Anonymous
8/6/2025, 10:43:34 PM No.106166913
remember when sama saved local?
Anonymous
8/6/2025, 10:44:05 PM No.106166917
>>106166865
In chat completion you send the user/model messages and the backend formats it with the chat template. It then generate tokens. In completion, you don't use chat template, you do the chat formatting yourself, or let the front end do it for you.
Under the hood, the tokens come out from the same functions.
If you format the chat in the same way the completion endpoint would, the results should be indistinguishable.
Anonymous
8/6/2025, 10:44:15 PM No.106166920
>>106166871
yea, huggingface leaderboard used to be king, i still remember checking it every day during summer if something is new, i once pushed AGPL into the top model and the repo owners accepted it because it was pushed along a fix to the readme
i still remember the first time a 70b was better than gpt 3.5 (the original one), it was made by upstage i forgot the original name
i remember the day when llama 1 first leaked and it was my first time running a LLM and it felt so magical, i remember my parents telling me to shower or turn off the bathroom heater because its a fire hazard (I left it on for over an hour)
it felt MAGICAL man
>>106166898
thanks anon ill try that
Replies: >>106166976 >>106167024
Anonymous
8/6/2025, 10:44:23 PM No.106166923
>>106166858
https://huggingface.co/unsloth/Kimi-K2-Instruct-GGUF
https://huggingface.co/unsloth/DeepSeek-R1-0528-GGUF
https://huggingface.co/unsloth/GLM-4.5-GGUF
https://huggingface.co/unsloth/GLM-4.5-Air-GGUF
https://huggingface.co/mradermacher/Dolphin-Mistral-24B-Venice-Edition-GGUF

At least the biggest quant for each of them, better yet the biggest quant for every bit.
Replies: >>106166943 >>106166955 >>106166956
Anonymous
8/6/2025, 10:45:09 PM No.106166929
>>106166871
I remember Guanaco-7b-uncensored. It was the shit.
Replies: >>106166933
Anonymous
8/6/2025, 10:45:26 PM No.106166933
>>106166929
Mythologic ftw
Anonymous
8/6/2025, 10:45:32 PM No.106166934
>>106166865
Chat completion allows the backend to apply a predefined Jinja template to a structured JSON object from the frontend representing the chat, formatting it into a correct, model-specific format. The end result shouldn't be different from text completion with the correct prompting for the model you're using.
Anonymous
8/6/2025, 10:45:54 PM No.106166942
>>106166858
Pony diffusion 6, illustrious, noobai are considered best nsfw imagegen models.
Anonymous
8/6/2025, 10:46:11 PM No.106166943
>>106166923
>saving quants
Anonymous
8/6/2025, 10:46:42 PM No.106166955
>>106166923
Just download the original model and you can make the quants later...
Anonymous
8/6/2025, 10:46:48 PM No.106166956
>>106166923
buy an ad daniel
Replies: >>106167015
Anonymous
8/6/2025, 10:46:53 PM No.106166958
>>106166858
https://huggingface.co/deepseek-ai/DeepSeek-R1 is a bit less cucked and better for rp
https://huggingface.co/deepseek-ai/DeepSeek-V3-0324
quants:
https://huggingface.co/unsloth/DeepSeek-V3-0324-GGUF
https://huggingface.co/unsloth/DeepSeek-R1-GGUF
Replies: >>106166981
Anonymous
8/6/2025, 10:47:04 PM No.106166962
Could you make a control vector that helps guide a base model to output in a instruct format without fine tuning?
Or rather than guide, reinforce, since base models nowdays seem more than capable of completing in instruct format.
Replies: >>106167014
Anonymous
8/6/2025, 10:47:42 PM No.106166971
>>106166858
biglove
noobai
ponyrealism
rawcharm amateur
stoiqo newreality
utopianpony 2 inpainting
flux dev
hunyuan i2v and t2v
+ loras
Anonymous
8/6/2025, 10:48:07 PM No.106166976
>>106166859
>>106166920
I just noticed a redundancy. I have personality / scenario inside of the JB section, so turn those off, I wasn't using any so I didn't notice that. here is updated one: https://files.catbox.moe/gjw3c3.json
Replies: >>106167004
Anonymous
8/6/2025, 10:48:18 PM No.106166981
>>106166958
daniel stop
Replies: >>106167015
Anonymous
8/6/2025, 10:48:23 PM No.106166983
1744660186366911
1744660186366911
md5: b4ca6f1e52d6294d00d1213a1e43f5fb🔍
ClosedAI (CuckAI (CensoredAI (OpenAI))) paid shill sissies... how do we damage control this?
Replies: >>106166992
Anonymous
8/6/2025, 10:49:41 PM No.106166991
>>106166790
in fact it's only models which are extremely overbaked on slop that are able to maintain consistent quality when sampling repeatedly with an uncurated token distribution at temp 1. it's literally the opposite of what you're saying, a model that properly models the world will have a much more diverse, flat token distribution which by nature includes more decent-but-questionable tokens or statistically likely mistakes. it needs lower temp to be kept on the happy path SPECIFICALLY because it has learned the world and not just the happy path. a model that can stay on the happy path with no handholding whatsoever is the one that "hasn't properly learned anything other than /the/ happy path"
but nooo.... it's not what you're used to, right? it must be the model that is wrong... let's reject the better model because it requires me to turn a single slider down a couple points. retards like you are why companies have to rescale temp behind the scenes on their APIs
Replies: >>106167029
Anonymous
8/6/2025, 10:49:43 PM No.106166992
>>106166983
Soon coming to a SaaS model near you
Anonymous
8/6/2025, 10:49:49 PM No.106166994
file
file
md5: 4b2ee0d72a4e25f4cb5ff6764367d4fd🔍
It is me. Sam. To be honest I have read a few threads here in the past and I have seen you call me faggot. How did you like my model? Was it fun? You know what I also did? I shared the exact method, to achieve the same level of safety with all the other companies. Who is the faggot now? You edgy cunts just got trolled hard....
Replies: >>106167002
Anonymous
8/6/2025, 10:50:01 PM No.106166995
remember when sama safed local?
Replies: >>106167004
Anonymous
8/6/2025, 10:51:02 PM No.106167002
>>106166994
here is your (You) now fuck off
Anonymous
8/6/2025, 10:51:10 PM No.106167004
>>106166995
kek good one anon
>>106166976
thanks, trying it out right now
Anonymous
8/6/2025, 10:51:34 PM No.106167009
sam is a based accelerationist exposing just how silly safety is and how retarded twitter hypegrifters are
Anonymous
8/6/2025, 10:52:10 PM No.106167014
>>106166962
Doubt it. They only set a "mood" for the model. It'd be hard to make them output specific tokens.
This is a little effort-post i did a while back about control vectors. It has enough info for you to experiment with them.
https://desuarchive.org/g/thread/104991200/#104995066
https://desuarchive.org/g/thread/104991200/#105000398
Anonymous
8/6/2025, 10:52:11 PM No.106167015
>>106166956
>>106166981
Have you seen the links in https://rentry.org/recommended-models?
Anonymous
8/6/2025, 10:52:46 PM No.106167024
remember 2048 tokens context window and trying to fit your character under as little tokens as possible?
>>106166920
>upstage
truly, the pioneers of benchmaxxing, we've only got something on par with 3.5 when mixtral 8x7b came out
Anonymous
8/6/2025, 10:53:41 PM No.106167029
>>106166991
You spun an argument out of thin air, and without any supporting evidence, treated it as proof for your hypothesis. You're like an LLM.
That's not how truth works.
Replies: >>106167036
Anonymous
8/6/2025, 10:54:27 PM No.106167036
>>106167029
as opposed to my interlocutor, who presented an objective fact based assessment
retard
Replies: >>106167040
Anonymous
8/6/2025, 10:55:25 PM No.106167040
>>106167036
I don't care about your interlocutor nor the topic at hand. You could be right for all I know.
I'm just pointing out someone who doesn't know how to find truth in the world because you live in your head.
Replies: >>106167111
Anonymous
8/6/2025, 10:57:54 PM No.106167065
file
file
md5: 35846763d5e7d48f46aae8560d662883🔍
Sam Altman here
Anonymous
8/6/2025, 10:58:07 PM No.106167067
remember superbooga? it was basically RAG
remember superCOT? reasoning before it was cool
remember superHOT? 2x context
superhot was crazy, every single model had a superhot version kek
Replies: >>106167089 >>106167089
Anonymous
8/6/2025, 10:58:23 PM No.106167071
>>106167048
>>106167048
>>106167048
Anonymous
8/6/2025, 11:00:13 PM No.106167085
>>106166235
Yes. There's also lmarena.
Anonymous
8/6/2025, 11:01:16 PM No.106167089
>>106167067
>remember superbooga? it was basically RAG
I do but never actually used it.

>>106167067
>superhot was crazy,
Dude invented extending context with RoPE.
Insane.
Anonymous
8/6/2025, 11:03:36 PM No.106167111
>>106167040
I am making an argument on lmg not writing a research paper
if you have a substantive critique I would love to hear it, but going "UMM PROOFS??" is a nothing counterargument. yes, I don't have hard evidence for everything I believe, especially on the subject of LLMs which cost millions of dollars to train lol... which one of us is really being unrealistic here?
I make reasonable inferences based on my experience using models because I live in the real world and have to make due with messy incomplete real world data
Anonymous
8/6/2025, 11:23:11 PM No.106167345
>>106165031
plapping d.va on glm