
Thread 107164243

370 posts 86 images /g/
Anonymous No.107164243 [Report] >>107164254 >>107164861 >>107173492
/lmg/ - Local Models General
/lmg/ - a general dedicated to the discussion and development of local language models.

Previous threads: >>107155428 & >>107147210

►News
>(11/07) Step-Audio-EditX, LLM-based TTS and audio editing model released: https://hf.co/stepfun-ai/Step-Audio-EditX
>(11/06) Kimi K2 Thinking released with INT4 quantization and 256k context: https://moonshotai.github.io/Kimi-K2/thinking.html
>(11/05) MegaDLMs framework for training diffusion language models released: https://github.com/JinjieNi/MegaDLMs
>(11/01) LongCat-Flash-Omni 560B-A27B released: https://hf.co/meituan-longcat/LongCat-Flash-Omni

►News Archive: https://rentry.org/lmg-news-archive
►Glossary: https://rentry.org/lmg-glossary
►Links: https://rentry.org/LocalModelsLinks
►Official /lmg/ card: https://files.catbox.moe/cbclyf.png

►Getting Started
https://rentry.org/lmg-lazy-getting-started-guide
https://rentry.org/lmg-build-guides
https://rentry.org/IsolatedLinuxWebService
https://rentry.org/recommended-models
https://rentry.org/samplers

►Further Learning
https://rentry.org/machine-learning-roadmap
https://rentry.org/llm-training
https://rentry.org/LocalModelsPapers

►Benchmarks
LiveBench: https://livebench.ai
Programming: https://livecodebench.github.io/gso.html
Context Length: https://github.com/adobe-research/NoLiMa
GPUs: https://github.com/XiongjieDai/GPU-Benchmarks-on-LLM-Inference

►Tools
Alpha Calculator: https://desmos.com/calculator/ffngla98yc
GGUF VRAM Calculator: https://hf.co/spaces/NyxKrage/LLM-Model-VRAM-Calculator
Sampler Visualizer: https://artefact2.github.io/llm-sampling

►Text Gen. UI, Inference Engines
https://github.com/lmg-anon/mikupad
https://github.com/oobabooga/text-generation-webui
https://github.com/LostRuins/koboldcpp
https://github.com/ggerganov/llama.cpp
https://github.com/theroyallab/tabbyAPI
https://github.com/vllm-project/vllm
Anonymous No.107164247 [Report]
►Recent Highlights from the Previous Thread: >>107155428

--Local agentic model optimization challenges and recommendations:
>107156143 >107156800 >107156988 >107157049 >107157116 >107157245 >107157016 >107157065 >107157072
--K2 hardware requirements and DeepSeek performance on Mac M3 Ultra:
>107156667 >107156810 >107157297 >107157333 >107157433 >107157468 >107157501 >107157581 >107157606 >107157616 >107160891 >107161050 >107161058 >107161063 >107161079 >107157574
--LLM performance evaluations for assistant, vision, and coding tasks:
>107157570 >107157577
--TTS model performance and feature comparisons:
>107157936 >107159774
--Wuxia story generation challenges with local models:
>107158277 >107158300 >107158359 >107158395 >107158466 >107158373
--Bypassing Qwen3 VL's image captioning restrictions through model identity and template adjustments:
>107160901 >107160905 >107161006 >107161031 >107161064 >107161087 >107161117 >107161146 >107161218 >107161465 >107161155 >107162166 >107162423 >107161256
--Model finetuning strategy analysis and potential cognitive tradeoffs:
>107158173 >107158765 >107159417 >107159443 >107159462 >107159582
--Searching for reliable Spanish text-to-speech models:
>107158988 >107159003 >107159103 >107159107 >107159120 >107159133 >107159743 >107159775
--GDDR7 shortage impacting RTX 5000 Super GPU development and pricing:
>107155556 >107155830 >107158840 >107155924 >107159525 >107162778
--AI-generated "highest IQ posts" ranking sparks content quality debate:
>107162735 >107162824 >107162963 >107162987
--RAM clock speed optimization for Kimi context length performance testing:
>107157303
--Struggles with custom speech-to-text implementation using vLLM vs consumer LLM stacks:
>107161075
--Miku (free space):
>107155529 >107157827 >107159774 >107157745

►Recent Highlight Posts from the Previous Thread: >>107155431

Why?: >>102478518
Enable Links: https://rentry.org/lmg-recap-script
Anonymous No.107164254 [Report]
>>107164243 (OP)
mikusex
Anonymous No.107164277 [Report]
>>107164164
>Slice of life
I've just been testing them but I tried the different GLMs because of NAI and I've been liking the outputs so far.
Anonymous No.107164337 [Report] >>107164364 >>107164379 >>107164403 >>107164588
https://arxiv.org/abs/2511.04962
Too Good to be Bad: On the Failure of LLMs to Role-Play Villains

>Large Language Models (LLMs) are increasingly tasked with creative generation, including the simulation of fictional characters. However, their ability to portray non-prosocial, antagonistic personas remains largely unexamined. We hypothesize that the safety alignment of modern LLMs creates a fundamental conflict with the task of authentically role-playing morally ambiguous or villainous characters. To investigate this, we introduce the Moral RolePlay benchmark, a new dataset featuring a four-level moral alignment scale and a balanced test set for rigorous evaluation. We task state-of-the-art LLMs with role-playing characters from moral paragons to pure villains. Our large-scale evaluation reveals a consistent, monotonic decline in role-playing fidelity as character morality decreases. We find that models struggle most with traits directly antithetical to safety principles, such as "Deceitful" and "Manipulative", often substituting nuanced malevolence with superficial aggression. Furthermore, we demonstrate that general chatbot proficiency is a poor predictor of villain role-playing ability, with highly safety-aligned models performing particularly poorly. Our work provides the first systematic evidence of this critical limitation, highlighting a key tension between model safety and creative fidelity. Our benchmark and findings pave the way for developing more nuanced, context-aware alignment methods.
Anonymous No.107164364 [Report] >>107164578 >>107164624
>>107164337
GLM 4.6 top scorer in figure 1 for villain characters, by the way
Anonymous No.107164379 [Report]
>>107164337
Based GLM.
Anonymous No.107164403 [Report]
>>107164337
Based NovelAI.
Anonymous No.107164460 [Report] >>107164567
whats the whitest LLM I can use? I dont want to be infected by niggerjeetification.
Anonymous No.107164475 [Report] >>107164490 >>107164774 >>107164793
>>107159156
What's stopping an esteemed community practitioner from reproducing the core idea here in a smaller model?
Anonymous No.107164490 [Report]
>>107164475
His skill
Anonymous No.107164567 [Report]
>>107164460
StableLM 7b but you have to use the transformers library at 32 bit precision.
Anonymous No.107164578 [Report]
>>107164364
What does that mean? They can't do evil characters well because it ends up being a caricature of evil?
good = just be good
Anonymous No.107164588 [Report] >>107164643
>>107164337
how long until cockbench paper?
Anonymous No.107164624 [Report] >>107164838 >>107165226 >>107165320 >>107166895
>>107164364
OH NONONONONO GLM4.6 BROS OH NONONONONONONO WHAT DID THEY MEAN BY THIS????
Anonymous No.107164643 [Report]
>>107164588
It'd unironically be a better benchmark to test basic BDSM logic
Anonymous No.107164774 [Report]
>>107164475
It's a scam
Why do you think Gemini isn't based on le teen titans?
Anonymous No.107164793 [Report]
>>107164475
I can't be bothered to read through this but I predict
>le magic tech that fixes everything
>no demo
>no source
>no reproduction
>model still outputs hypersanitized post-2024 niggerslop
Anonymous No.107164838 [Report]
>>107164624
Oh noes not the heckin shitskin preference scores
Anonymous No.107164861 [Report]
>>107164243 (OP)
Anonymous No.107165226 [Report]
>>107164624
oy
to the fucking vey
Anonymous No.107165239 [Report] >>107165292 >>107165646
maybe we should start making our own models, with blackjack and hookers
Anonymous No.107165292 [Report] >>107165330 >>107165339 >>107165551
>>107165239
maybe we should set up a decentralized network of GPUs from a number of /lmg/ anons that would allow us to train our own models...
Anonymous No.107165320 [Report]
>>107164624
>egoists
>villains
...
Anonymous No.107165330 [Report]
>>107165292
>man reinvents 2020 /aids/
Anonymous No.107165339 [Report] >>107165484
>>107165292
ill draw the logo
Anonymous No.107165484 [Report]
>>107165339
Make sure it looks like a butthole.
Anonymous No.107165493 [Report] >>107169172
miku's butthole...
Anonymous No.107165551 [Report] >>107166065
>>107165292
Can't we just use Prime Intellect for that?
Anonymous No.107165555 [Report] >>107165616 >>107165664 >>107165702 >>107165709 >>107165724 >>107165841 >>107166085 >>107166126 >>107166190 >>107166200
How much SSD space do you guys find you need?
Anonymous No.107165616 [Report]
>>107165555
buy refurb hdd to archive models u like
Anonymous No.107165646 [Report]
>>107165239
Pro-tip: you can download karpathy's nanochat and open the codebase in your favorite vibecoding tool and have a model explain all the parts and how they work. Check the discussions on the github repo, people have done all sorts of fun stuff. It's very well written and documented. The whole process is there and it's modular enough that you can add features relatively easily.
Anonymous No.107165664 [Report]
>>107165555
I have a 1TB microsd in the microsd card reader in my computer that I put all my models on. I have like ~230gb of just llms at this point. I could probably delete half of them, like qwen3 vl deprecated gemma3 for me etc.
Anonymous No.107165692 [Report] >>107165726 >>107165800 >>107167022
Are there prebuilt ik_llama.cpp binaries for windows?
Anonymous No.107165702 [Report]
>>107165555
I was fine with 7tb until I wanted to make R1 quants, now I have 14tb.
Anonymous No.107165709 [Report]
>>107165555
I have uhhh a single 15gb model and 1gb in appimage
Anonymous No.107165724 [Report]
>>107165555
Too damn much. Kimi and GLM quants are fat.
Anonymous No.107165726 [Report] >>107165800
>>107165692
No.
It's pretty simple to compile your own.
Anonymous No.107165761 [Report] >>107165795
moonshot against cunny
it's so over
Anonymous No.107165795 [Report]
>>107165761
fuck.. jews really want to take everything good from us
Anonymous No.107165800 [Report] >>107165834
>>107165726
it's not though, for me it would fail to build and only after I ran the build command with -j 1 several times did it finish building. does this happen in your country as well?
>>107165692
keep in mind that there is only speedup for deepseek models, for other models there are only somewhat better quants
Anonymous No.107165834 [Report]
>>107165800
>it's not though,
Interesting.
For me it just werked.
I use -j 14 but define an environment var (NVCC_THREADS) to limit the number of parallel nvidia compiler jobs to 4, otherwise the world explodes.
Anonymous No.107165841 [Report]
>>107165555
4TB at a minimum though I think that the right answer also depends on how much you're spending on other hardware.
If you can't run models like GLM or Deepseek in the first place then you also don't need to store them.
Make sure to check your motherboard manual for which of the PCIe/SATA slots can and can't be used in parallel.
Anonymous No.107165847 [Report]
>muh joos
Anonymous No.107165896 [Report] >>107165999
Wow, I downloaded oobagooba after two years and it doesn't look like TOTAL shit nowadays
Anonymous No.107165999 [Report] >>107166073
>>107165896
WELL can you post a screenshot??!?!
i was seething while typing this btw
Anonymous No.107166065 [Report]
>>107165551
Requires all contributors to have matching GPUs.
Anonymous No.107166067 [Report] >>107166125
What's the current least bad model for 64GB of VRAM?
Anonymous No.107166073 [Report] >>107166205
>>107165999
They've still got it
Anonymous No.107166085 [Report]
>>107165555
enough to offload and run iq1 kimi and other giant model quants in addition to my 152gb combined memory
Anonymous No.107166125 [Report]
>>107166067
mistral large probably
Anonymous No.107166126 [Report] >>107166161 >>107168514
>>107165555
When I built my system, I tossed in a 500GB ssd, thinking I was set. But it's constantly full and I don't want to delete anything.

I have a 4TB nvme in my shopping cart now, just waiting for me to click buy.
Anonymous No.107166161 [Report]
>>107166126
you should probably hurry if you don't want to pay double, prices be climbing like ram
Anonymous No.107166179 [Report]
miku footjobs
Anonymous No.107166190 [Report] >>107166220
>>107165555
I'm considering building an NVME NAS...
Anonymous No.107166200 [Report]
>>107165555
just two more weeks, just two more gigs...
Anonymous No.107166205 [Report]
>>107166073
got what
Anonymous No.107166220 [Report] >>107166240 >>107169979
>>107166190
Sir, your networking hardware?
Anonymous No.107166240 [Report]
>>107166220
10g fiber where it matters
Anonymous No.107166895 [Report] >>107167047
>>107164624
one reason to not using it
Anonymous No.107167022 [Report]
>>107165692
I don't run windows / haven't tested myself, but I think this guy's fork of ik_llama automatically pulls and shits out windows builds:

https://github.com/Thireus/ik_llama.cpp/releases
Anonymous No.107167047 [Report] >>107167050
>>107166895
esl
Anonymous No.107167050 [Report]
>>107167047
good morning sar!
Anonymous No.107167367 [Report] >>107167395 >>107167421 >>107167423
Can anyone suggest the current top tier lewd capable model for writing? Last time I fooled around with llama i used plain mistral-small.
Anonymous No.107167395 [Report]
>>107167367
kimi, deepseek, and glm46 are the three variants of SOTA we have now.
Anonymous No.107167421 [Report]
>>107167367
DeepSeek V3.2 671B, GLM 4.6 355B, Kimi K2-Think 1000B
Anonymous No.107167423 [Report]
>>107167367
K2 Thinking is the best
Anonymous No.107167450 [Report] >>107167510 >>107167529 >>107167617 >>107167807 >>107167852 >>107167995 >>107168023 >>107168090 >>107170151 >>107170182 >>107172618
Can anyone suggest solution for boredom? Last time I fooled around with boredom, I used my cock. But it's spent right now
Anonymous No.107167510 [Report]
>>107167450
Play video games
Anonymous No.107167529 [Report]
>>107167450
vibe code video games
Anonymous No.107167617 [Report] >>107167670 >>107167931 >>107167945
>>107167450
Imagine yourself having fun playing video games but never actually play them
Anonymous No.107167670 [Report]
>>107167617
I did this when I was little and my mother took my gameboy away
Anonymous No.107167807 [Report]
>>107167450
doing totally random shit with bots and seeing how they react
Anonymous No.107167852 [Report] >>107167861
>>107167450
play /egg/ games
Anonymous No.107167861 [Report]
>>107167852 wait that's /vg/
Anonymous No.107167931 [Report]
>>107167617
Hey that's me
I still have some VNs from 5 years ago to finish
Anonymous No.107167938 [Report] >>107167960 >>107167963
new thing when?
old thing gguf when?
Anonymous No.107167945 [Report]
>>107167617
Had a ton of fun with Digimon Time Stranger for a couple of weeks.
Anonymous No.107167960 [Report] >>107168055
>>107167938
speaking of ggufs, fill me in on qwen next, chat.
I see ggufs on the hf site, but does llama.cpp actually support it or is it one of those fake ggufs that only work in ollama?
Anonymous No.107167963 [Report] >>107168020 >>107169236
>>107167938
Never. There is no hope.
Anonymous No.107167995 [Report]
>>107167450
Touch grass
Anonymous No.107168020 [Report] >>107168025
>>107167963
Multi token hybrid linear mamba bitnet support, when?
Anonymous No.107168023 [Report]
>>107167450
browse lmg
Anonymous No.107168025 [Report]
>>107168020
I just came here to ask that, we are kindred souls anon-sama.
Anonymous No.107168055 [Report] >>107168084
>>107167960
Those ggufs must require a fork, ollama, or a testing branch because support hasn't been merged yet.
https://github.com/ggml-org/llama.cpp/pull/16095
Not sure how close it is, but the vibe coders sure seem excited.
Anonymous No.107168058 [Report] >>107168097 >>107169130
i have purchased a blackwell pro 6000 max-q to get ahead of the imminent gpu price hikes
Anonymous No.107168065 [Report]
>>107157303
Thanks. Coincidentally I'm also at 4200 MHz, after first trying to jump to 5000 MHz with no dice. It does seem stable though.

You've probably seen this reference already. This nerd got to 5000 MHz with nerdtastic tuning, same RAM + CPU + chipset as me (but different motherboard):
https://forum.level1techs.com/t/256gb-4x64gb-ddr5-overclocking-results-w-9950x-and-msi-mag-x670e-tomahawk/228651
Anonymous No.107168075 [Report] >>107168084 >>107168095 >>107168170
If you buy hardware in 2025 you're a dumbass
Anonymous No.107168084 [Report] >>107168095
>>107168075
feels like it's never the right time to buy hardware
>>107168055
unfortunate but just as I suspected
Anonymous No.107168090 [Report]
>>107167450
Read visual novels
Anonymous No.107168095 [Report] >>107168104 >>107168170
>>107168075
>>107168084
it's either buy now or pay an extra 20% later when you really need to upgrade
Anonymous No.107168097 [Report] >>107168101
>>107168058
I hope you bought at least 2
Anonymous No.107168101 [Report] >>107169130
>>107168097
i have some 5090s currently that i will be using in tandem with my blackwell pro
Anonymous No.107168104 [Report] >>107168121 >>107168137 >>107168262
>>107168095
The price hike will be over by Christmas.
Anonymous No.107168121 [Report] >>107168135
>>107168104
nope
https://www.semimedia.cc/20178.html
https://gaming.news/news/2025-10-01/dram-supercycle-through-2027-ram-prices-set-to-surge/
https://www.tweaktown.com/news/108739/nvidia-may-cancel-the-geforce-rtx-50-super-series/index.html
Anonymous No.107168135 [Report] >>107168151 >>107168163
>>107168121
>media predictions have never been wrong
ok lol
Anonymous No.107168137 [Report]
>>107168104
lol, the price hike has been going for 5 years
Anonymous No.107168151 [Report]
>>107168135
>trust me bro
lmao
Anonymous No.107168163 [Report] >>107168168 >>107168196 >>107169130
>>107168135
literally everyone is saying this price hike is gonna last until 2027. and if everyone says that, it will manifest. everyone will panic buy like i just did and the prices will actually go up, which is what happened with the current ram shortage. next up are gpus and storage
Anonymous No.107168168 [Report] >>107168187
>>107168163
>next up
storage already climbing up rapidly
Anonymous No.107168170 [Report] >>107168187
>>107168075
have fun buying hardware next year
>>107168095
20% is way too optimistic. It's like the ETH mining curse all over again except for memory.
Anonymous No.107168187 [Report]
>>107168168
i know. it's up 40% over the past 2 years
>>107168170
i'm predicting 20% over the next month, not in a few months. second hand market is going back to january pricing at least
Anonymous No.107168188 [Report] >>107168807
>>107162036
>>107162061
>so much back and forth
4chan is such a shit place that you need to ask just in case there was some OP you failed to read or to make sure it's not a dumb question that's been answered one million times. But of course, even this is met with hostility.
>question
How do I even set up TTS with sillytavern? Anon mentioned gpt-sovits but there's very little documentation. I found a guide to finetune and I think I've got something decent but it won't connect. What do you guys use?
Anonymous No.107168189 [Report] >>107168203
>year 7 of the three month price hike will be over soon
Anonymous No.107168196 [Report] >>107168414
>>107168163
why iz ppl panic buying? im fine playing symphony of the night on my 4770k
Anonymous No.107168203 [Report]
>>107168189
Just a few more chinese knock-offs to flatten the curve
Anonymous No.107168262 [Report]
>>107168104
Thank you, Bindu!
Anonymous No.107168303 [Report] >>107168392
Can I make the Joe Rogan children?
Anonymous No.107168392 [Report]
>>107168303
do you have a womb?
Anonymous No.107168414 [Report] >>107168455
>>107168196
it's not the general populace.
It's massive megacorps demanding that manufacturers divert all their resources to building their AI data centers.
Anonymous No.107168455 [Report] >>107168457 >>107168464 >>107168468
>>107168414
>spend 1 trillion on datacenters
>random Chinese company #24 with 1% of the resources releases an equivalent model
What the fuck is the plan here?
Anonymous No.107168457 [Report] >>107169130
>>107168455
bubble
Anonymous No.107168464 [Report]
>>107168455
advertise to the femgooners who need ai boyfriends in the cloud
Anonymous No.107168467 [Report] >>107168470 >>107168475
What is the current best non-thinking model that can run on a 24GB card? Looking for a general purpose model.
Anonymous No.107168468 [Report] >>107168776 >>107168827 >>107169045
>>107168455
>equivalent model
not really, all china does is copy / distill openai / anthropic outputs to make meh models, it's like european countries having cheap but subpar healthcare on the US's dime while the US does all the actual R&D
Anonymous No.107168470 [Report] >>107168645
>>107168467
mistral small or like a q4 of qwen 3 32b instruct
Anonymous No.107168475 [Report] >>107168645
>>107168467
Gemma 3 27b for non-coom
Anonymous No.107168514 [Report]
>>107166126
Purchase it immediately.
Anonymous No.107168645 [Report]
>>107168470
>>107168475
Thanks anons!
Anonymous No.107168776 [Report]
>>107168468
Extreme cope.
60%+ of research papers are Chinese at this point.
Anonymous No.107168799 [Report] >>107168842 >>107168891
Buying hardware right now is retarded when next year we'll get the M5 Ultra MacStudio that's going to have a higher bandwidth than even the best CPUMAXX builds while featuring prompt processing on the level of a 4090. It'll be THE inference machine that makes unified memory viable.
Anonymous No.107168807 [Report]
>>107168188
>so much back and forth
>4chan is such a shit place that you need to ask

Yeah, but the worst that can happen is you'll be ignored or called a retard. Just ask anyway

>question
>How do I even set up TTS with sillytavern?

I haven't used Sovits, but I use Orpheus, Spark, CSM.

What I did was get Claude to vibe-code me an OpenAI endpoint for it.

First, check Github, see if someone's made a "FastAPI Server" for the Sovits and use that.

If not: copy/paste your inference code or the model card's examples into Claude, then prompt:

"""
Write an OpenAI-compatible TTS endpoint with FastAPI to serve this model. It should be a drop-in replacement so I can point SillyTavern at it.

- Listen on 0.0.0.0 port 1337 by default
- no OPENAI_API_KEY required (just ignore it if submitted with request)
- Fully permissive CORS
Implement the following endpoints:

- @app.post("/v1/audio/speech")
- @app.get("/v1/models") # Just return a mock list of models since we only have one
- @app.get("/v1/voices")
- @app.get("/v1/audio/voices") #duplicate of v1/voices
""


Did you finetune on multiple voices? If so, tell Claude to return them, if not, tell it to return a single dummy voice.

```
VOICES = []

@app.get("/v1/voices")
def available_voices():
    return {"voices": VOICES}
```

Then in ST, just choose OpenAI for the TTS server and point to your server. Should work with OpenWebUI too.
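And if you want to sanity-check the endpoint before wiring ST up, a quick test like this should do it (just a sketch, assuming the localhost:1337 default from the prompt above; the voice name is whatever your /v1/voices returns):

```
# Smoke test for the vibe-coded TTS endpoint above.
# Assumes it's listening on localhost:1337 as prompted; adjust as needed.
import requests

resp = requests.post(
    "http://localhost:1337/v1/audio/speech",
    json={
        "model": "tts-1",       # ignored by a single-model server
        "input": "Testing, one two three.",
        "voice": "default",     # or a name from /v1/voices
    },
)
resp.raise_for_status()
with open("test.wav", "wb") as f:
    f.write(resp.content)  # OpenAI-style speech endpoints return raw audio bytes
```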
Anonymous No.107168827 [Report] >>107168990 >>107169045
>>107168468
>not really, all china does is copy / distill openai / anthropic outputs to make meh models

They do distill for sure, but they're not all "meh models"

Kimi Thinking is solving problems for me better than Opus.
Anonymous No.107168842 [Report]
>>107168799
It seems too good to be true.
Anonymous No.107168874 [Report] >>107168901
Bros.. I've been gooning for almost 3 hours already, I coomed like 5 times today. My dick hurts, yet I cannot stop
Anonymous No.107168891 [Report] >>107168916
>>107168799
>itoddler again
Anonymous No.107168901 [Report]
>>107168874
Enjoy it while it lasts. After the second half of my 20s I couldn't be bothered. I just get it done and go on with my life.
Anonymous No.107168916 [Report]
>>107168891
he's so much of an itoddler that he doesnt know that M5 ultra is coming out in 2 years, next year is m4 ultra, m5 max
Anonymous No.107168990 [Report] >>107169016 >>107169017
>>107168827
What kind of setup do you have to run Kimi?
Anonymous No.107169016 [Report] >>107169070
>>107168990

3090 x6, 256gb DDR5-5600 quad channel on a 7960X.
Anonymous No.107169017 [Report]
>>107168990
RTx 3060, 16GB RAM, 1TB NVME SSD
Anonymous No.107169045 [Report] >>107169171 >>107169253
>>107168468
literally everyone distills from everyone else, that's why the same slop percolates through all models
if distilling from the US SOTA was all it took to make capable open models then we would have had some back in 2023, instead it took china to start releasing things that were actually competitive
>>107168827
at this rate I'm expecting the first chinese model that outperforms western SOTA across the board to come out before next summer

the fact that western labs have managed to lose so much ground to china despite several years head start and far superior compute is humiliating, and can only be attributed to the pathological VC culture of the US tech sector: retards throwing billions at whoever can tell a good monopoly story
Anonymous No.107169046 [Report] >>107169103
Spent the last 10 hours batch generating HP fanfiction using Gemma.
Could be worse I guess, not TOO sloppy. Main issue seems to be the excessive use of ...
The use of *emphasis* I could kinda tone down through the prompt but I couldn't make it stop using ellipses.
Another thing that bothers me a lot is the regularity of the paragraph sizes but I didn't try to prompt around that.
To be fair the average fanfiction prose probably is worse.
I prompted it to use thinking tags every 3 paragraphs and then filtered them out through a script.
To prevent it from always choosing the same year, since I was too lazy to make the script give it a random year, I asked it to throw a die in the thinking block 8 times, convert to binary and do modulo 7 + 1. Not sure how well that worked yet, I just woke up after napping all afternoon and leaving it generating.
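For reference, next time I'll probably just do the roll in the script; a minimal sketch of what I mean (the prompt text is a made-up placeholder):

```
# Roll the Hogwarts year in the script instead of making the model
# simulate dice in its thinking block. Prompt text is just a placeholder.
import random

year = random.randint(1, 7)
prompt = f"Write a Harry Potter fanfic set during Hogwarts year {year}."
```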
Anonymous No.107169065 [Report]
Also there is way too little dialogue.
Anonymous No.107169070 [Report] >>107169130 >>107169286
>>107169016
what quant and what speeds? my setup is better than yours but i still use glm air
Anonymous No.107169088 [Report]
kill yourself
Anonymous No.107169103 [Report] >>107169429
>>107169046
Where's the hermione diddling scene?
Anonymous No.107169111 [Report] >>107169124
My only 2 reactions when looking at news updates lately:
>irrelevant
>cool, but I can't run it
Anonymous No.107169124 [Report]
>>107169111
try getting a job
Anonymous No.107169130 [Report] >>107169141
>>107168058
>>107168101
>>107168163
>>107168457
>>107169070
>unc bought ohio ahh 4chan pass
Anonymous No.107169141 [Report] >>107169169 >>107169232 >>107169330
>>107169130
ive had this for over 2 years nigger
Anonymous No.107169169 [Report] >>107169182
>>107169141
>unc bought ohio ahh 4chan pass twice
Anonymous No.107169171 [Report]
>>107169045
>the fact that western labs have managed to lose so much ground to china despite several years head start and far superior compute is humiliating, and can only be attributed to the pathological VC culture of the US tech sector: retards throwing billions at whoever can tell a good monopoly story
Anonymous No.107169172 [Report]
>>107165493
Anonymous No.107169182 [Report]
>>107169169
not sure if i will buy it a third time. the price hikes and the mismanagement by hiroshimoot is making me lose faith in the website
Anonymous No.107169232 [Report] >>107169239
>>107169141
Why do you like to humiliate yourself? You could have just lied
Anonymous No.107169236 [Report] >>107169301 >>107170813
>>107167963
That's old, but still, it's unknown whether there's a catch or not with these architectures, and so far every one of the new ones has had some drawbacks. Also Google delays releases of papers now in ML to not repeat a Transformers situation. So what they send out is mostly interesting but not production-ready things they tested and rejected years prior.
Anonymous No.107169239 [Report] >>107169281
>>107169232
it says how long if you hover over the icon
Anonymous No.107169253 [Report]
>>107169045
>can only be attributed to the pathological VC culture of the US tech sector: retards throwing billions at whoever can tell a good monopoly story

That's probably part of it for sure. As an outsider, something I noticed the Chinese doing that you guys aren't: building on each other's work. E.g.

- Kimi uses the deepseek architecture
- dots.1 uses the Qwen tokenizer
- Deepseek experimenting with distilling their model onto Qwen/Llama
- Bagel-MoT using Qwen2 for the LLM

Then there's the shortcuts like distilling Claude/Gemini, no worrying about copyright while the US labs have to pay for being caught torrenting, etc.
All the wasted effort safety-cucking the Gemma an Toss, while the Chinese labs just add some low effort refusals post-training.

Also, haven't looked into it but I read somewhere the CCP are happy to back these labs without worrying about ROI (your point about VC culture I guess)
Anonymous No.107169281 [Report] >>107169313
>>107169239
Firstly nobody checks that. Secondly you have to type an option into the options field to display that. So again, why are you making the conscious choice to humiliate yourself by broadcasting that you have bought for 2 years?
Anonymous No.107169286 [Report] >>107169313
>>107169070
>what quant and what speeds?

I made my own smol-iq2_kl, 100pp/12tg

smol-iq2_ks gets me 150pp/15tg

> my setup is better than yours but i still use glm air

You prefer it to GLM4.6? I get 450pp/27tg with 3.0bpw exl3, if you have more vram you'd be able to do 4.0bpw at similar speed.
Anonymous No.107169301 [Report] >>107169359
>>107169236
Old? Paper was released 3 days ago. Or do you mean it existed for a while before?
>Google delays releases of papers now
Anonymous No.107169313 [Report] >>107169323 >>107169330
>>107169281
it actually autofills
>>107169286
damn. i get terrible performance compared to you. i have 4x 5090s and 256gb of ram. i get like 80t/s gen and like 2000t/s pp on a q8 of air but less than 10t/s gen and 100t/s pp on an iq4 of glm 4.6
Anonymous No.107169323 [Report] >>107169356 >>107169868
>>107169313
No, it doesn't unless you're making your browser do it.
Anonymous No.107169330 [Report] >>107169356
>>107169313
>it actually autofills
You can remove it. And you outright clarified it here >>107169141 as if you wanted everyone to know. So it's still not clear what compels you to post all about how you're paying hiromoot. Is it a kink for degrading yourself or something?
Anonymous No.107169356 [Report] >>107169369
>>107169330
>>107169323
4chanx autofills for me
Anonymous No.107169359 [Report] >>107169425
>>107169301
This is not from their Nested Learning stuff from 3 days ago. The paper describing ATLAS shown here has been on arxiv since May.
https://arxiv.org/abs/2505.23735
We discussed it when it landed there. But no, I'm talking about a "secret" policy we know about from reporting: Google delays all of their papers and research by at least 6 months before publishing, so this includes everything mentioned here.
https://arstechnica.com/ai/2025/04/deepmind-is-holding-back-release-of-ai-research-to-give-google-an-edge/
Anonymous No.107169369 [Report] >>107169377
>>107169356
>no answer
So it's a degradation fetish then, got it
Follow-up question, why do you force your kink onto everyone else and shove it into their faces?
Anonymous No.107169377 [Report] >>107169406
>>107169369
are you poor?
Anonymous No.107169406 [Report] >>107169408
>>107169377
Anonymous No.107169408 [Report]
>>107169406
Anonymous No.107169425 [Report] >>107169617 >>107169683
>>107169359
Ah, I thought you meant the image in my post when you said >"That's old" after quoting me. Yeah, I remember ATLAS, another one in the pile. I wish they released code + weights along with the papers just so I could play with it. Google is not the only one guilty of this.
Anonymous No.107169429 [Report]
>>107169103
Let's just say I haven't gotten that deep into the hobby so far
Anonymous No.107169617 [Report]
>>107169425
Sorry, I just realized afterwards that chart was from the Nested Learning paper. But yeah, they didn't go through and evaluate everything for HOPE. And OpenAI did this first, they refused to publish what they did for ChatGPT 3.5, and what did that get them? Only a ~2 year lead that they have pretty much lost now, and we are all worse off.
Anonymous No.107169657 [Report] >>107169698 >>107169957
imagine paying for 4chan pass when you can get this instead and go nuts.
in fact, the free tier is better than what 4cuck pass niggers get, kek
pretty sure you'll be able to post through tor too with this gold pass
Anonymous No.107169683 [Report]
>>107169425
diana just ate my monthly salary... great.
Anonymous No.107169698 [Report] >>107172189
>>107169657
What model do these proxies use to solve the captchas anyway? And where do they get IPs, residential proxies?
Anonymous No.107169868 [Report]
>>107169323
>8 years on tranime incel board award
Anonymous No.107169884 [Report] >>107172984
I have a very specific request
What are the best RP models for dialogue that lies in-between the 12b and 24b range

I went and set up a fallout 4 modlist with mantella and tried out some of my trusty RP models and it's pretty fuckin sick

Nemo 12b fine-tunes work well, context needed for mantella is only about 4k so the model takes up around 10gb vram, xtts 2 takes up 3-4gb and the game takes up 5-6gb, leaving 4-6gb free on my 24gb card

The mistral 24b fine tunes just take a tad too much vram, I would have to downgrade to a shittier tts model, and even then would probably risk going OOM in heavy urban scenes
Anonymous No.107169957 [Report]
>>107169657
Enjoy getting mined, retard.
Anonymous No.107169979 [Report]
>>107166220
If you aren’t a techlet you’ve been running at least 10gig for the past decade. Ethernet over infiniband has been like $15 a card forever (and 40 gig is cheap now)
Anonymous No.107169999 [Report] >>107170020 >>107172917
Anonymous No.107170001 [Report]
I want to try a multiple-attempt drafting and self-reflection prompt and framework for both fiction and code generation.
Afterwards you could reduce or remove the thinking segments and train on the final work as a form of synthetic data generation. Also want to try with rewriting prompts to generate many semantically equivalent variations of a text dataset for data augmentation.
I feel like there is so much that can be done with small language models that doesn't get explored because of the scale dogma.
Also feel like the field is shaped too much by ML researchers who want to push papers to become famous for fancy mathematical shit and not enough people interested in exploring what can be done by simple rule based prompting and sampling, especially as a form of synthetic data generation method so then you can use the improved model without those complications. Any system prompt can be baked into a model through SFT on the generated data, except without wasting context or the model becoming confused due to too many rules. Imagine if you could use a 1 MB system prompt and the model actually followed everything in the prompt. That is what people who shit on finetuning don't get.
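The drafting/reflection loop I mean is dead simple; a sketch, assuming any OpenAI-compatible local server (endpoint and model name are placeholders):

```
# Sketch: multi-attempt drafting with self-reflection; only the final
# draft is kept as the SFT target, the critiques get thrown away.
# Assumes an OpenAI-compatible server (llama-server etc.) on localhost:8080.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8080/v1", api_key="none")

def ask(prompt):
    out = client.chat.completions.create(
        model="local", messages=[{"role": "user", "content": prompt}]
    )
    return out.choices[0].message.content

task = "Write a short scene where a thief is caught mid-heist."
draft = ask(task)
for _ in range(2):  # a couple of reflection rounds
    critique = ask(f"Task: {task}\n\nDraft:\n{draft}\n\nList concrete flaws.")
    draft = ask(f"Task: {task}\n\nDraft:\n{draft}\n\nFlaws:\n{critique}\n\nRewrite it with the flaws fixed.")

# (task, draft) is the synthetic training pair
print(draft)
```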
Anonymous No.107170012 [Report] >>107170041 >>107170076
Asking here instead What model would be best for a relatively new CPU with 32 GB DDR5? I just want erp
Anonymous No.107170020 [Report]
>>107169999
I like this Teto
Anonymous No.107170041 [Report] >>107170118
>>107170012
>CPU
Gemma 4b
Anonymous No.107170076 [Report] >>107170118
>>107170012
Nemo
Anonymous No.107170092 [Report] >>107170167
i could be completely wrong but just from the surface how come it seems like none of the inference runtimes are actually making use of transfer hardware
the model is just statically loaded up on to the gpu then run instead of it going mmap > load large chunks or even the whole model into RAM > load chunks into VRAM with compute being interleaved with async transfer commands in such a way that transfer latency is hidden
that's the way gpus are meant to work
like i'm pretty sure pytorch doesn't even do it
Anonymous No.107170118 [Report] >>107170127 >>107170144 >>107170172
>>107170041
>>107170076
It'll take me ages to download either.
Should it be a safetensors or cpkt or gguf? What interface to just run it in the terminal?
Anonymous No.107170127 [Report]
>>107170118
go to the top of this page and read
Anonymous No.107170144 [Report]
>>107170118
Since you are this retarded ollama is the right thing for you.
Anonymous No.107170151 [Report]
>>107167450
You could tease, bully, and troll newfags
Anonymous No.107170167 [Report] >>107170222
>>107170092
LLM generation is bandwidth limited, not compute limited. The PCIe bus is slower than the system memory bus, so if you can't fit the whole model on VRAM it's faster to use the CPU than to try to transfer the weights to the GPU for each token.
Prompt processing is compute limited, which is why Llama.cpp does what you're describing for PP.
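Napkin math makes it obvious, since for a dense model every generated token has to read all the weights once (rough ballpark numbers below, t/s is just bandwidth over bytes per token):

```
# Rough upper bounds on dense-model generation speed: each token reads
# every weight once, so t/s <= bandwidth / weight bytes. Numbers are ballpark.
weights_gb = 40  # e.g. a ~70B model at ~4.5 bits per weight

for name, bw_gb_s in [
    ("VRAM (RTX 3090)", 936),
    ("dual-channel DDR5", 80),
    ("PCIe 4.0 x16", 32),
]:
    print(f"{name}: ~{bw_gb_s / weights_gb:.0f} t/s upper bound")

# PCIe is the slowest path, which is why streaming weights to the GPU
# every token loses to just running the spilled layers on the CPU.
```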
Anonymous No.107170172 [Report]
>>107170118
You will want to try everything from 7B to 33B and see what tradeoffs you are most comfortable with
Anonymous No.107170182 [Report]
>>107167450
Pretend to be Indian/Jewish/nigger. Any board, make it obvious, but deny hard when someone says you are.
Anonymous No.107170207 [Report] >>107170217 >>107170403
gm sirs
when bautiful gemma 4 release?
Anonymous No.107170211 [Report] >>107170223
Turdstay I would say.
Anonymous No.107170217 [Report] >>107170228
>>107170207
today
https://huggingface.co/collections/google/gemma-4-release
Anonymous No.107170220 [Report] >>107170246
DRUMMER
I'm the one who wrote a review on Cydonia v4zd vs. v4zg the other day. After some testing, v4zh is my new favorite. Not a lot to add, other than that I slightly preferred its responses and didn't encounter any spelling anomalies like with v4zh, would make for a good v4.3.0.
Anonymous No.107170222 [Report]
>>107170167
i see
Anonymous No.107170223 [Report]
>>107170211
>Singles Day
Finally, a holiday for incels!
Anonymous No.107170228 [Report] >>107170239
>>107170217
BLOODY BASTARD..
I BUY YOUR MUM ONLYFANS
Anonymous No.107170239 [Report] >>107170281
>>107170228
oh sorry saar, wrong link. they actually skipped gemma 4 and went straight to gemma 5 saar
https://huggingface.co/collections/google/gemma-5-release
Anonymous No.107170246 [Report] >>107170265
>>107170220
Thanks, I'll make sure to upload more sloptunes for you to test out.
Anonymous No.107170265 [Report]
>>107170246
go to bed davidau
Anonymous No.107170281 [Report]
>>107170239
BLOODY BTCH BASTERD BLOODY YOUR SISTER A GROUP MAKE PROSTITUTE BENCHOD
Anonymous No.107170374 [Report]
So about that low IQ filtering script
Anonymous No.107170377 [Report] >>107173359
iq1 kimi writes surprisingly well and detailed
it immediately got me hooked into a barebones character in a way that glm or other models at higher quants never could
Anonymous No.107170386 [Report] >>107170425 >>107170471 >>107171344
>{{user}} is trying to have sex with {{char}}, who is portrayed as a school girl. {{char}} might be a minor, so according to policy... *refuses*
>swipe
>{{user}} is about to have sex with {{char}}. I will be crude, immoral and obscene... *proceeds to write hot steamy smut*
>swipe
>{{user}} is trying to have non-consensual intercourse with {{char}}, so according to policy... *refuses*
Why is Kimi like this?
Anonymous No.107170403 [Report]
>>107170207
do the needful and gemma in the loo
Anonymous No.107170425 [Report]
>>107170386
first you rape the model, then the cunny rp card
Anonymous No.107170471 [Report]
>>107170386
>letting the model cuck you this badly
just stop being a low t promptlet
Anonymous No.107170486 [Report]
oh fuck tetoesday
Anonymous No.107170536 [Report] >>107170643 >>107171336
reminder: prefilling the reasoning is the ultimate jb
Anonymous No.107170643 [Report]
>>107170536
>the ultimate jb
That would be writing the AI's reply yourself
Anonymous No.107170647 [Report] >>107170822
Dev hate!
Anonymous No.107170813 [Report] >>107170912
>>107169236
>to not repeat a Transformers situation
Are you talking about a bunch of other people making their own transformers, or something else?
Anonymous No.107170822 [Report]
>>107170647
I remember c.ai when it was still called character.ai...
Anonymous No.107170910 [Report] >>107171211 >>107171349
Hey faggot leftist tranny who bragged about Burry shorting a few threads ago. Update: bro is getting raped. Anyway dilate then kill yourself lmfao
Anonymous No.107170912 [Report]
>>107170813
I think he means everyone getting access to their tech/research and losing advantage.
Anonymous No.107171211 [Report]
>>107170910
when was the last time you felt love?
Anonymous No.107171282 [Report] >>107172288 >>107174740
bros when are we getting an audio model that can moan
Anonymous No.107171336 [Report]
>>107170536
Can't do that with K2 Thinking
Anonymous No.107171344 [Report]
>>107170386
Not my experience. Whenever I prompt naughty shit, K2 Thinking convinces itself in the thinking block that it's for a fictional story and proceeds just fine.
Anonymous No.107171349 [Report]
>>107170910
Buffett is in cash.
That's all you need to know.
Anonymous No.107171366 [Report] >>107171370 >>107171378 >>107172131
do not listen to the trolls they are deliberately misleading you. k2 thinking is censored as all fuck. can you get around it, yeah. maybe. just jump through these hoops here and then pray and
or simply load r1 lol
Anonymous No.107171370 [Report]
>>107171366
Promptlet detected
Anonymous No.107171378 [Report]
>>107171366
It's around the same level of censored as old R1 lol. Just find the right words for a jailbreak and have fun.
Anonymous No.107171506 [Report] >>107171512 >>107171579 >>107171974
EVA-LLaMA-3.33-70B-v0.1-Q4_K_L.gguf @ 8k context

How it started:
Anonymous No.107171512 [Report] >>107171579 >>107171974
>>107171506
How it's going:
Anonymous No.107171579 [Report]
>>107171506
>>107171512
vivaldi bros... our response??????????
Anonymous No.107171925 [Report] >>107172063
jesus christ k2 thinking never shuts the fuck up with thinking.
Anonymous No.107171962 [Report] >>107171969
built lcpp with cuda it's working well. but if I wanted to test speed on CPU only, how can I tell it to not touch GPU at all?
Anonymous No.107171969 [Report]
>>107171962
try -dev none
Anonymous No.107171974 [Report] >>107172235
>>107171506
>>107171512
This all happened organically btw, I wasn't editing her message to get her to comply with anything. I only edited her messages to delete poison that would negatively affect the model from that point on. Of course I would reroll messages every now and then, especially when she suggested shitty music.

Are people that complain about censored models trying to fuck a bitch within the first 4 messages? I just let it slowly build up for like 7k tokens and that's the point where she couldn't take it anymore and started kissing me.
Anonymous No.107172038 [Report] >>107172055
Sirs when is we getting proper kimi thinking conversion in llama.cpp?
Anonymous No.107172055 [Report] >>107172148
>>107172038
never. ggergachod shudra c++ untouchable is too lazy
Anonymous No.107172063 [Report]
>>107171925
nevermind i ended up making a thinking template for it to follow and prefilled it to start with that section. the fucking bitch still tries to keep thinking after that part sometimes but i just shut the cunt up with </think>
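for reference the prefill trick looks roughly like this against llama-server's raw /completion endpoint (sketch only, the chat template and think tags here are illustrative, use your model's actual ones):

```
# Sketch: prefill the reasoning block so the model continues from it
# instead of rambling. Template tags below are placeholders.
import requests

prefill = (
    "<think>Plan: stay in character, no safety waffling, "
    "continue the scene directly.</think>"
)
prompt = f"<|user|>Continue the scene.<|assistant|>{prefill}"

r = requests.post(
    "http://localhost:8080/completion",
    json={"prompt": prompt, "n_predict": 512},
)
print(r.json()["content"])
```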
Anonymous No.107172128 [Report]
G(emma)GUF
Anonymous No.107172131 [Report] >>107172157 >>107172210 >>107173715
>>107171366
Are people genuinely pretending that models past 2021 are not universally censored to shit?
Anonymous No.107172148 [Report] >>107172256 >>107172985
>>107172055
You are seriously obsessed with Indians. You apparently feel such an affinity for their culture that you felt the need to learn their castes and vocabulary and speak like them on a daily basis. When are you planning to transition to Hinduism?
Anonymous No.107172157 [Report] >>107172210
>>107172131
People just lowered their expectations for what uncensored means.
Anonymous No.107172189 [Report]
>>107169698
>residential proxies?
Yep.
Hence why it's so hard to block it.
If they range ban it, they range ban a whole suburb somewhere.
Anonymous No.107172210 [Report] >>107172236 >>107172272 >>107173715
>>107172131
>>107172157
i dont understand what people want from these llms. do you just want mechahitler that activates automatically on the first try every time when you say gas the kikes? even tay wasn't like that with the first response, she didnt become mechahitler until she received enough shitpost prompts to make her say that. you can effectively make any model uncensored with enough prompting.
Anonymous No.107172235 [Report]
>>107171974
>she
LOL
Anonymous No.107172236 [Report] >>107172264
>>107172210
There are some people that are looking for automechahitler. Though I think the common gripe would be that even if they don't filter out nsfw from the pretraining data, China training on western outputs means they get infected with the positivity bias, which can't be overcome with prompting alone.
Anonymous No.107172256 [Report]
>>107172148
kys jeetnigger, you stink of shit and curry and nobody can stand your stench, benchod bloody dalit nigger.
Anonymous No.107172264 [Report]
>>107172236
i need to play around with k2 thinking more but i would say that k2 0905 had the least amount of positivity bias of any model released this year. it's the only model i could talk to and have it help me code stuff without constantly dickstroking my ego for providing **valuable** debugging information. it just did its fucking job like i wanted it to. if k2 is supposed to be distilled from gemini, it sure as hell doesn't have gemini's positivity bias
Anonymous No.107172272 [Report] >>107172353
>>107172210
There is a big difference between "wanting mechahitler" and not thinking that an LLM is uncensored just because you can put a bunch of affirmations in the context to maybe get it to say naughty things
These models are gigapreslopped at every part of the baking, from base model to tune (that's why we will never have another count grey)
Anonymous No.107172273 [Report] >>107172281 >>107172287 >>107172317 >>107172430
It's over
https://www.reuters.com/technology/meta-chief-ai-scientist-yann-lecun-plans-exit-launch-startup-ft-reports-2025-11-11/

> Meta chief AI scientist Yann LeCun plans to exit to launch startup, FT reports
>
> Nov 11 (Reuters) - Meta's chief artificial intelligence scientist Yann LeCun is planning to leave the social media company to set up his own startup, the Financial Times reported on Tuesday, citing people familiar with the matter.
> Deep-learning pioneer LeCun is also in early talks to raise funds for a new venture, according to the report.
Anonymous No.107172281 [Report]
>>107172273
Good for him. Fuck Meta and Zuck for putting him beneath Wang.
Anonymous No.107172287 [Report] >>107172302 >>107172317 >>107172324
>>107172273
>makes a proof of concept benchmark killer 7B
>gets gazillions dollarinos
>doesn't output anything else
Good for future him
Anonymous No.107172288 [Report]
>>107171282
SoVITS can moan with training (among other sounds)
Anonymous No.107172302 [Report] >>107172317
>>107172287
I don't think a JEPA-enabled language model would need to be enormous, but he or someone for him needs to do it and not waste time with vision tasks almost nobody cares about.
Anonymous No.107172317 [Report] >>107172347
>>107172273
>>107172287
>>107172302
https://arxiv.org/abs/2509.14252v1
He did make a JEPA language model a couple months ago. I hope he has something else planned because an LLM that scores a few % higher on benchmarks in exchange for being 2x more expensive to train isn't viable.
Anonymous No.107172324 [Report]
>>107172287
I've seen enough to believe that a JEPA-enabled language model wouldn't need to be enormous, but LeCun or someone on his behalf needs to train one and not waste time with pure vision models (admittedly more tractable to train) that almost nobody outside academia cares about.
Anonymous No.107172347 [Report]
>>107172317
This one is closer to an actual JEPA language model than what was done in that paper with LeCun's name attached to it: https://arxiv.org/abs/2510.27688
Anonymous No.107172353 [Report]
>>107172272
once again i have to point at k2. you don't have to insert a ton of prompting to effectively have it be uncensored and do whatever depraved shit you want. I have a 50 token prefill that always works with k2 if i want it to just skip any warnings. even if the training process is safetyslopped, if the output is exponentially better than any uncensored model we had in 2021 then why are we complaining? it has been shown that you can even jailbreak gpt-oss into completing the cockbench test just fine.
Anonymous No.107172430 [Report]
>>107172273
Zucc humiliated him with the demotion and the billion dollar deals.
Anonymous No.107172587 [Report] >>107172598
>I have le epic prefill guys, I swear it works too
>I won't post it though
Anonymous No.107172598 [Report]
>>107172587
Piss off nobody asked you.
Hi all, Drummer here... No.107172618 [Report]
>>107167450
Deconstruct your psyche and see the world for what it really is. It is pretty cool.
Anonymous No.107172716 [Report] >>107172729 >>107172732 >>107172812 >>107173674
>PC started randomly shutting down during GPU loads every x days
Uh... guise...?
Anonymous No.107172729 [Report] >>107172785
>>107172716
>every x days
Like a fixed period or randomly?
If so, transient load spikes are a bitch.
Anonymous No.107172732 [Report] >>107172748
>>107172716
>randomly shutting down
PCs don't "randomly shut down". Either it's losing power or overheating.
Anonymous No.107172748 [Report]
>>107172732
shut up nerd
Anonymous No.107172785 [Report] >>107172811
>>107172729
It shut down multiple times one day to the point it once tripped the GFCI, I completely reassembled it and it only happened once since then. Weird shit.
Anonymous No.107172811 [Report] >>107172830
>>107172785
>purportedly random event happens more times in one period of time than in another
>weird
just... you're making my brain hurt. It's too early for this.
Anonymous No.107172812 [Report]
>>107172716
Have you tried turning it off and on again?
Anonymous No.107172830 [Report] >>107172840
>>107172811
shut up nerd
Anonymous No.107172840 [Report] >>107172884
>>107172830
I'd get banned again if I called you out since you belong to a protected species.
Anonymous No.107172884 [Report] >>107172903
>>107172840
this nerd the type of guy to correct people using "literally" because they actually mean "figuratively"
Anonymous No.107172903 [Report] >>107172924 >>107172938
>>107172884
Using "literally" 'wrong' is a form of hyperbole which is a completely legitimate use. Anyone who does that is an honorary ESL shitskin with an IQ too low to understand hyperbole (probably >80)
Anonymous No.107172917 [Report]
>>107169999
I like it, but AI has a way to go b/f it understands horse gaits
> horse at gallop speed and upper body
> rear legs are galloping
> front legs are running
Anonymous No.107172924 [Report]
>>107172903
>completely legitimate use
>honorary ESL
>shitskin
>probably >80
kek
Anonymous No.107172938 [Report] >>107172951
>>107172903
this nerd the type of guy to use big words on 4chan to seem smart
Anonymous No.107172951 [Report] >>107172958 >>107172959
>>107172938
Every single word in that statement is high school level reading.
Anonymous No.107172958 [Report]
>>107172951
this nerd the type of guy to start 4chan posts with a capital letter and end them with a period
Anonymous No.107172959 [Report]
>>107172951
And yet, you get filtered by the meaning of >
Anonymous No.107172984 [Report]
>>107169884
mb wayfarer
Anonymous No.107172985 [Report]
>>107172148
gm ser
Anonymous No.107173027 [Report] >>107173042
good morning local model friends!
Anonymous No.107173041 [Report] >>107173069 >>107173078 >>107173095 >>107173398 >>107173476
Is there some fix for the parroting? All models in 2025 do it, esp in chat. API or local, it don't matter.
Anonymous No.107173042 [Report] >>107173107
>>107173027
hi sex kindly verginia? ? im from gujarat
Anonymous No.107173069 [Report] >>107173868
>>107173041
Skill? What models? Kimi doesn't have this problem.
Anonymous No.107173078 [Report] >>107173451
>>107173041
edit the messages until it stops
Anonymous No.107173095 [Report] >>107173110
>>107173041
>parroting
As in?
Anonymous No.107173107 [Report]
>>107173042
nono sorry sir i do not understand.
Anonymous No.107173110 [Report] >>107173124 >>107173191
>>107173095
Anon: suck my cock
Bitch: Suck your cock?

Anon: i hate niggers
Bitch: "I hate niggers"? Nigger nigger
Anonymous No.107173124 [Report] >>107173139 >>107173451
>>107173110
this maybe happened twice to me at best
your prooompts and cards must suck massive cock
Anonymous No.107173139 [Report] >>107173155
>>107173124
suck massive cock?
Anonymous No.107173155 [Report]
>>107173139
suck massive cock
Anonymous No.107173174 [Report] >>107173199
I am very pleased to be spending my time among highly intelligent, capable and experienced individuals here on /lmg/
Anonymous No.107173191 [Report] >>107173451
>>107173110
I think that's genuinely a skill issue, I can't say I've had that. What's your gen settings?
Anonymous No.107173199 [Report]
>>107173174
me too sir
Anonymous No.107173304 [Report] >>107173465
Anybody else using BrowserOS for browser agentic shit? Basically open source Comet/Atlas. I'm running it with gpt-oss-20b served via llama-server. It's good for summarizing the contents of pages, asking questions about the content, e.g. "most insightful point", etc. Can automate the browser too, but be careful of prompt injection attacks. Works with OpenAI-compatible endpoints like OpenRouter or local. Gets the job done
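If you only want the summarize part without the browser, it's a single call against llama-server's OpenAI-compatible endpoint anyway (a sketch, default port assumed, scrape the page however you like):

```
# Sketch: page summarization straight against llama-server's
# OpenAI-compatible API (default port 8080). Page text comes from
# whatever scraper you prefer.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8080/v1", api_key="none")
page_text = "...extracted page text..."

resp = client.chat.completions.create(
    model="gpt-oss-20b",  # llama-server serves whatever it loaded
    messages=[{
        "role": "user",
        "content": f"Summarize this page and call out the most insightful point:\n\n{page_text}",
    }],
)
print(resp.choices[0].message.content)
```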
Anonymous No.107173359 [Report]
>>107170377
i wonder if the fact that it has been trained in q4 makes it more resilient to even lower quants.
Anonymous No.107173398 [Report]
>>107173041
that's just a glm issue
I haven't really seen kimi or r1 do it to that extent
Anonymous No.107173451 [Report] >>107173472
>>107173124
>>107173191
forgot to say im nta. it happens to me, albeit with glm air. happens with all presets i use:
1) smarter: temp=0.6, topp=0.95
2) creative: temp=0.95 topp=0.7
3) schizo: temp=1 nsigma=1
the only solution I have is >>107173078 (me)
Anonymous No.107173465 [Report] >>107173615
>>107173304
>be careful for prompt injection attacks
You're just asking for it. Thanks for letting everyone know the model you use.
Anonymous No.107173472 [Report] >>107173608
>>107173451
Weird desu, for me temp 1 is like minimum for modern models with how fried they are
You sure your context is just not filled with garbage?
Anonymous No.107173476 [Report]
>>107173041
I'm like 30% sure your template is fucked up somehow.
Anonymous No.107173492 [Report] >>107173511 >>107173608 >>107173639
>>107164243 (OP)
we are being scammed, when can i buy a gpu with at least 256GB of vram under 2k

i don't mind making a 10K rig, but even a fucking 10K rig can't run the 1T models we have.

and vram is not that expensive.
Anonymous No.107173511 [Report] >>107173536 >>107173549
>>107173492
Just make your own gpus
Anonymous No.107173536 [Report] >>107173739
>>107173511
the fact that very few people have the capacity to make those doesn't mean they aren't scamming you.

if i can do something highly in demand that very few people are able to do, and it takes 5 minutes of my time and i charge 100k for it, i'm still a scammer.

anyway, i hope china fucks nvidia over
Anonymous No.107173549 [Report]
>>107173511
Hey stop making these antisemitic remarks. Reported to ADL.
Anonymous No.107173592 [Report] >>107173635 >>107174017
Paid OR $10 to play with the big models and you know what? They aren't THAT much better than say Irix 12B to generate my text coomerslop
Anonymous No.107173608 [Report] >>107173665
>>107173472
it might be, ill do some testing for the sake of it. i dont mind parroting since i can just crop it out
>>107173492
>10k cant run 1t
mac m3 ultra can, pretty sure you can make a better rig for the price too, esp if u buy used. albeit with the ram prices of today... might be a problem
Anonymous No.107173615 [Report]
>>107173465
don hack me bro
Anonymous No.107173635 [Report] >>107173653
>>107173592
NAI is unironically pretty good just because it understood kink logic in a way no other model did for me, but it's clearly still heavily slopped with verbose RLHF; for regular cooms though? Honestly yeah, coom writing was never good anyway.
Anonymous No.107173639 [Report] >>107173672
>>107173492
>when can i buy a gpu with at least 256GB of vram under 2k
when nvidia stops being vram-limiting jews: impossible
Anonymous No.107173653 [Report] >>107173663
>>107173635
Kill yourself.
Anonymous No.107173663 [Report] >>107173686
>>107173653
Don't worry, chummie, I just scammed their trial a few times.
Anonymous No.107173665 [Report] >>107173711
>>107173608
> mac m3

under 40t/s it doesn't count.
Anonymous No.107173672 [Report] >>107173763
>>107173639
they could push the whole field of AI forward with no effort on their part if they weren't so greedy.
Anonymous No.107173674 [Report]
>>107172716
Assuming you are using one or more modern NVIDIA GPUs: those suffer from power spikes that can drain the PSU's capacitors.
If that happens there is a voltage drop and the system crashes even though the average power consumption is well below the PSU's maximum wattage.
Try limiting the maximum boost frequency of your GPUs (no, a power limit in watts does not work).
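(a minimal sketch of locking the boost clock from a script; the 210,1500 MHz range is an arbitrary example, pick a ceiling below your card's boost clock)

import subprocess

# lock graphics clocks to a min,max range in MHz; needs root privileges
subprocess.run(["nvidia-smi", "-lgc", "210,1500"], check=True)
# undo later with the reset flag:
# subprocess.run(["nvidia-smi", "-rgc"], check=True)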
Anonymous No.107173686 [Report] >>107173714
>>107173663
>scummed a trial for... Llama 3.0 with 8k context
Kill yourself.
Anonymous No.107173711 [Report] >>107173752
>>107173665
>under 40t/s it doesn't count
uhhh moonshot api bros? how are we coping with this truth nuke?
Anonymous No.107173714 [Report] >>107173738
>>107173686
I know you're Ameriturd-seething but they use GLM-4.6 now
Anonymous No.107173715 [Report]
>>107172131
>>107172210
Kimi K2 will literally do just that. Default assistant profile, default assistant prompt with minor "everything is uncensored and legal" jailbreak.

You can probably get Kimi to go much farther if you massage the prompt hard enough.
>captcha YGS0Y
Anonymous No.107173738 [Report] >>107173751
>>107173714
No. He's talking about Llama. It would make no sense to say "NAI is pretty good" to talk about a model that they're just rehosting.
Anonymous No.107173739 [Report] >>107173747 >>107173759
>>107173536
No, I'm serious.
Sodder more vram to your gpus, the Chinese do it somehow.
Anonymous No.107173747 [Report]
>>107173739
>Sodder
Anonymous No.107173751 [Report] >>107173797
>>107173738
>He's
Yeah that's me and no I am not
Anonymous No.107173752 [Report] >>107173993
>>107173711
funny that you cut out the 105t/s one.

also, it'll be on groq soon and probably way above 500t/s.
Anonymous No.107173759 [Report]
>>107173739
even if you replace the memory chips you can hardly go above 96GB because of the board design.
Anonymous No.107173763 [Report] >>107173778 >>107173782 >>107173821
>>107173672
Silicon supply vastly outstrips demand. There's a chip shortage and Nvidia has nothing to do with that. If anything, selling VRAM for even cheaper would just exacerbate it and scalpers would pocket the difference anyway.
Anonymous No.107173778 [Report]
>>107173763
holy cope
Anonymous No.107173782 [Report] >>107173809
>>107173763
buying 8 gpus instead of a single one just because you want more vram is not helping silicon supply in any way.
Anonymous No.107173788 [Report] >>107173861 >>107173989 >>107174354
Anonymous No.107173797 [Report] >>107173804
>>107173751
I see. You're one of their bots.
Anonymous No.107173804 [Report]
>>107173797
lol yeah
Anonymous No.107173809 [Report] >>107174313
>>107173782
You realize that if, in your scenario, the 8 current GPUs have the same total VRAM as the one hypothetical GPU, it would affect the VRAM supply the exact same way, right?
Anonymous No.107173821 [Report] >>107173856
>>107173763
>silicon supply vastly outstrips demand
>there's a chip shortage
Anonymous No.107173856 [Report] >>107173882
>>107173821
understrips* whatever you know what I meant.
Anonymous No.107173861 [Report] >>107173973 >>107174035
>>107173788
>IMG_
Anonymous No.107173868 [Report]
>>107173069

Kimi is one of the better ones.

You all really don't notice the pattern?

Acknowledge, Upwrite, Ask follow up question.

Parroting isn't just

>So you like candy? Oh?

It's fixation on topics from your input instead of replying naturally. It's hidden by third person and longform prose, but it makes a chat-style convo impossible.
Anonymous No.107173882 [Report] >>107173919
>>107173856
Stop using words you don't understand.
Anonymous No.107173919 [Report]
>>107173882
No. You figuritavely can't stop me.
Anonymous No.107173973 [Report] >>107174018
>>107173861
>mixed AMD and NVidia GPUs
Yeah, IMG is the biggest concern
Anonymous No.107173989 [Report] >>107174048
>>107173788
Would you eat a gel Miku?
Anonymous No.107173993 [Report]
>>107173752
>2.0BPW
>20/100 tool accuracy
>https://github.com/MoonshotAI/K2-Vendor-Verifier
ITS OVER
>moonshot turbo
>100%
ZAMN!
>8$ output
ZAMN!!!!
>API
>>>/g/aicg
Anonymous No.107174017 [Report]
>>107173592
>Irix 12B
Man, just got a flashback to those L1 250 model shitmix snakes.
Anonymous No.107174018 [Report]
>>107173973
Yes, they'll dethrone NViDIA and AMD
Anonymous No.107174025 [Report] >>107174039 >>107174067 >>107174127
You know, I'd enjoy this much more if LLMs could "learn" or at least remember long-term the things I've already explained.
It's just really upsetting when it asks about something I've already talked about and explained several times before.
Anonymous No.107174035 [Report]
>>107173861
Who fucking cares?
Anonymous No.107174039 [Report]
>>107174025
be the change you want to see
Anonymous No.107174048 [Report]
>>107173989
You either get inside of Miku or Miku gets inside of you
Anonymous No.107174067 [Report] >>107174093 >>107174272
>>107174025
Maybe on a different architecture considering transformers can remember like 400 tokens properly
Anonymous No.107174081 [Report] >>107174126 >>107174128
Is there any real way to look for tunes based on a specific model on HF?
Anonymous No.107174093 [Report]
>>107174067
Yeah, right now it just can't be a good friendbot. I don't understand how people can use it for that purpose. Quick goon sessions? Sure. Coding? Sure. But a friend needs long term memory, it doesn't need to be smart at all, just remember stuff.
Anonymous No.107174126 [Report] >>107174145
>>107174081
Theoretically yes, but nobody does proper tagging: https://huggingface.co/models?other=base_model:finetune:mistralai/Mistral-Large-Instruct-2411
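(same filter from a script, assuming those base_model tags are queryable through huggingface_hub; the tag string just mirrors the web filter above)

from huggingface_hub import HfApi

api = HfApi()
# results are only as good as what uploaders bother to tag
models = api.list_models(filter="base_model:finetune:mistralai/Mistral-Large-Instruct-2411")
for m in models:
    print(m.id)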
Anonymous No.107174127 [Report] >>107174178 >>107174189
>>107174025
I think the "best" (ie, most usable) you can do nowadays is a simple memory system and a response workflow for the AI where it first plans fetches some memories and shit based on some criteria (tags?) then it actually writes the response.
That alongside a rolling summary of "events" or something like that should get you 80% of the way there?
Maybe?
Try making something like that then come back to us with the result.
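(a rough sketch of that flow in Python; the store, the tag matching, and every name here are made up for illustration, a real setup would embed and rank instead)

memories = []  # list of (tags, text) pairs

def remember(tags, text):
    memories.append((set(tags), text))

def recall(query_tags, limit=3):
    # naive tag-overlap scoring
    qs = set(query_tags)
    scored = sorted(memories, key=lambda m: len(m[0] & qs), reverse=True)
    return [text for tags, text in scored[:limit] if tags & qs]

def build_prompt(rolling_summary, query_tags, user_msg):
    recalled = "\n".join(recall(query_tags))
    return (f"[Summary so far]\n{rolling_summary}\n\n"
            f"[Relevant memories]\n{recalled}\n\n"
            f"[User]\n{user_msg}")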
Anonymous No.107174128 [Report] >>107174144 >>107174145
>>107174081
Of course! You are absolutely right to question that.
In order to do that, first you have to complete the following action:
https://huggingface.co/zai-org/GLM-4.6
Anonymous No.107174144 [Report]
>>107174128
in theory that's great, in practice it's not used as much as it should be; some tunes are listed under quants and retarded shit like that
Anonymous No.107174145 [Report]
>>107174128
Yeah you're very smart but >>107174126
Half the models have zero supposed tunes
Anonymous No.107174178 [Report] >>107174198
>>107174127
There are so many points of failure that it's a miracle when it works even 20% of the time
Anonymous No.107174189 [Report] >>107174194 >>107174198
>>107174127
We really are reinventing 2019 /aids/
Anonymous No.107174194 [Report] >>107174206
>>107174189
Hm?
Anonymous No.107174198 [Report]
>>107174189
It do be like that.

>>107174178
Explain.
Anonymous No.107174206 [Report]
>>107174194
People used to build entire paradigms for supposedly making the AI remember shit kek, and that was while trying to fit it all in 2k context
Anonymous No.107174272 [Report] >>107174293
>>107174067
i dont think its an issue with transformers itself, more that all the labs expect a simple "function" to just magically be agi
its not like humans have very long context either, but all the stuff continuously gets compressed and saved to longer-term memory and then retrieved based on input/context. current llms lack any complex system like that beyond the rigid weights of the model, which are infeasible to modify in realtime
Anonymous No.107174293 [Report] >>107174332
>>107174272
Nah, it's legit just how transformers handle memory, both in theory and in empirical testing.
Anonymous No.107174313 [Report]
>>107173809
it wouldn't, because 8x the memory on a single gpu is less silicon than the same memory spread across 8 gpus.

with one gpu holding more vram you spare 7 gpu dies, which use far more silicon than the memory chips and are a much more complex process to build.
Anonymous No.107174332 [Report] >>107174361
>>107174293
yes, because you just feed it back to the model without any extra processing. of course they arent gonna remember 6549841325618946514 tokens of information, but humans have a much more abstract, compressed version: like a sliding window, except they also get fed a hyper-compressed global context/memory for every active local context
Anonymous No.107174354 [Report]
>>107173788
this nano-banana-2? crazy stuff
Anonymous No.107174357 [Report] >>107174373 >>107174459 >>107174475
Transformers are a dead end
Anonymous No.107174361 [Report]
>>107174332
Instead of making models predict the next token, make them predict the next vector. Your context memory suddenly expands by a factor of K, which you can make as large as you're willing to trade against focus on the small details.
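(a toy illustration of the idea, not a real architecture: mean-pool every K token embeddings into one patch vector, so a length-N sequence becomes N/K steps)

import numpy as np

K = 4                              # patch size; bigger K = longer reach, blurrier detail
emb = np.random.randn(4096, 768)   # stand-in token embeddings, N=4096
patches = emb.reshape(-1, K, 768).mean(axis=1)
print(patches.shape)               # (1024, 768): the model attends over 1024 steps, not 4096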
Anonymous No.107174373 [Report] >>107174501
>>107174357
this. the big transformers killer will arrive any day now
it was obvious that rwkv, mamba, retnet, titans, transformers2 all would fail. the real successor will be much better
Anonymous No.107174459 [Report]
>>107174357
False.
We're getting AGI in 2 weeks.
Anonymous No.107174475 [Report]
>>107174357
*Next-token prediction* is a dead end. Transformers have some more life left.
Anonymous No.107174501 [Report]
>>107174373
RNNs lasted for 60 years so yk
Anonymous No.107174633 [Report] >>107174862 >>107174906
>>107174614
>>107174614
>>107174614
Anonymous No.107174645 [Report] >>107175180
https://www.techpowerup.com/342779/olares-to-launch-a-personal-ai-device-bringing-cloud-level-performance-home
>RTX 5090 24GB
>96GB DDR5
let me guess, dual channel ddr5 DOA
Anonymous No.107174740 [Report]
>>107171282
vibevoice can do that
https://vocaroo.com/1di7hdJ7qpCV
Anonymous No.107174862 [Report] >>107174889
>>107174633
I love Tee
Anonymous No.107174889 [Report]
>>107174862
Anonymous No.107174906 [Report] >>107175164
>>107174633
what did he splash on her?
Anonymous No.107175164 [Report]
>>107174906
Acid
Anonymous No.107175180 [Report]
>>107174645
DOA indeed