/lmg/ - Local Models General - /g/ (#105778400) [Archived: 600 hours ago]

Anonymous
7/2/2025, 6:24:44 PM No.105778400
rl
rl
md5: 87b6af5f924569e96cde2160edc635dc๐Ÿ”
/lmg/ - a general dedicated to the discussion and development of local language models.

Previous threads: >>105769835 & >>105757131

โ–บNews
>(07/02) GLM-4.1V-9B-Thinking released: https://hf.co/THUDM/GLM-4.1V-9B-Thinking
>(07/01) Huawei Pangu Pro 72B-A16B released: https://gitcode.com/ascend-tribe/pangu-pro-moe-model
>(06/29) ERNIE 4.5 released: https://ernie.baidu.com/blog/posts/ernie4.5
>(06/27) VSCode Copilot Chat is now open source: https://github.com/microsoft/vscode-copilot-chat
>(06/27) Hunyuan-A13B released: https://hf.co/tencent/Hunyuan-A13B-Instruct

โ–บNews Archive: https://rentry.org/lmg-news-archive
โ–บGlossary: https://rentry.org/lmg-glossary
โ–บLinks: https://rentry.org/LocalModelsLinks
โ–บOfficial /lmg/ card: https://files.catbox.moe/cbclyf.png

โ–บGetting Started
https://rentry.org/lmg-lazy-getting-started-guide
https://rentry.org/lmg-build-guides
https://rentry.org/IsolatedLinuxWebService
https://rentry.org/tldrhowtoquant
https://rentry.org/samplers

โ–บFurther Learning
https://rentry.org/machine-learning-roadmap
https://rentry.org/llm-training
https://rentry.org/LocalModelsPapers

โ–บBenchmarks
LiveBench: https://livebench.ai
Programming: https://livecodebench.github.io/leaderboard.html
Code Editing: https://aider.chat/docs/leaderboards
Context Length: https://github.com/adobe-research/NoLiMa
Censorbench: https://codeberg.org/jts2323/censorbench
GPUs: https://github.com/XiongjieDai/GPU-Benchmarks-on-LLM-Inference

โ–บTools
Alpha Calculator: https://desmos.com/calculator/ffngla98yc
GGUF VRAM Calculator: https://hf.co/spaces/NyxKrage/LLM-Model-VRAM-Calculator
Sampler Visualizer: https://artefact2.github.io/llm-sampling

โ–บText Gen. UI, Inference Engines
https://github.com/lmg-anon/mikupad
https://github.com/oobabooga/text-generation-webui
https://github.com/LostRuins/koboldcpp
https://github.com/ggerganov/llama.cpp
https://github.com/theroyallab/tabbyAPI
https://github.com/vllm-project/vllm
Replies: >>105778556 >>105780800 >>105781209 >>105783341 >>105785147 >>105786368
Anonymous
7/2/2025, 6:25:07 PM No.105778404
file
file
md5: dd04bd35f4fd07e3ce99be67a676f0d9๐Ÿ”
โ–บRecent Highlights from the Previous Thread: >>105769835

--Paper: GLM-4.1V-Thinking: Towards Versatile Multimodal Reasoning with Scalable Reinforcement Learning:
>105772556 >105772620 >105772636 >105772751 >105772756 >105772781
--Troubleshooting and optimizing ik_llama's GPU usage during prompt processing:
>105770658 >105770671 >105770697 >105770737 >105770774 >105770793 >105770836 >105770804 >105771477 >105770681 >105770709 >105770714 >105770742 >105770812 >105770857
--SciArena human expert benchmark ranks Qwen and o3 highly, exposes Mistral and Llama weaknesses in STEM tasks:
>105774179 >105774206 >105774242 >105774302 >105774324 >105774248 >105774390 >105774628
--MoGE's performance improvements questioned due to inconsistent benchmarking practices:
>105770488 >105770519
--Open-source intermediate thinking AI model with dynamic reasoning:
>105775016 >105775085 >105775355
--Running large models on systems with low RAM: workarounds and limitations:
>105770034 >105770065 >105770068 >105770076 >105770097 >105770144 >105770125
--Hunyuan model loading issues and emotional reflections on LLM attachment:
>105776297 >105776327 >105776340
--Speculation over model benchmark optimization via LMSys data and synthetic training:
>105775790 >105775948 >105776008 >105776027 >105776123 >105776163 >105776235 >105776270
--Small language model unexpectedly generates functional HTML/CSS for professional webpage design:
>105772836 >105772844 >105773088 >105773112
--Legal concerns over Meta's LLM court win and its impact on fair use doctrine:
>105770731 >105770759 >105770912
--Critique of verbose AI roleplay models and the importance of concise prompt design:
>105771117 >105774637
--Links:
>105771000 >105775990 >105773059 >105774668
--Miku (free space):
>105770389 >105772534 >105772539 >105773374 >105773484 >105775061 >105777681

โ–บRecent Highlight Posts from the Previous Thread: >>105769843

Why?: 9 reply limit >>102478518
Fix: https://rentry.org/lmg-recap-script
Replies: >>105778418 >>105778467 >>105778567 >>105778868 >>105780806
Anonymous
7/2/2025, 6:27:00 PM No.105778418
>>105778404
>file.png
Anonymous
7/2/2025, 6:27:07 PM No.105778419
Sex with graphs
Replies: >>105778481
Anonymous
7/2/2025, 6:31:37 PM No.105778467
1741172748772881
1741172748772881
md5: 867635ec71e84567555ccaa4d1f95af2๐Ÿ”
>>105778404
The mikutranny posting porn in /ldg/:
>>105715769
It was up for hours while anyone keking on troons or niggers gets deleted in seconds, talk about double standards and selective moderation:
https://desuarchive.org/g/thread/104414999/#q104418525
https://desuarchive.org/g/thread/104414999/#q104418574
Here he makes >>105714003 ryona picture of generic anime girl anon posted earlier >>105704741, probably because its not his favorite vocaloid doll, he can't stand that as it makes him boil like a druggie without fentanyl dose, essentialy a war for rights to waifuspam or avatarfag in thread.

Funny /r9k/ thread: https://desuarchive.org/r9k/thread/81611346/
The Makise Kurisu damage control screencap (day earlier) is fake btw, no matches to be found, see https://desuarchive.org/g/thread/105698912/#q105704210 janny deleted post quickly.

TLDR: Mikufag / janny deletes everyone dunking on trannies and resident avatarfags, making it his little personal safespace. Needless to say he would screech "Go back to teh POL!" anytime someone posts something mildly political about language models or experiments around that topic.

And lastly as said in previous thread(s) >>105716637, i would like to close this by bringing up key evidence everyone ignores. I remind you that cudadev of llama.cpp (JohannesGaessler on github) has endorsed mikuposting. That's it.
He also endorsed hitting that feminine jart bussy a bit later on. QRD on Jart - The code stealing tranny: https://rentry.org/jarted

xis accs
https://x.com/brittle_404
https://x.com/404_brittle
https://www.pixiv.net/en/users/97264270
https://civitai.com/user/inpaint/models
Anonymous
7/2/2025, 6:33:09 PM No.105778481
17504112736119115
17504112736119115
md5: fc6a20bee2735e829f41d330f88cb531๐Ÿ”
>>105778419
Obligatory
Anonymous
7/2/2025, 6:34:59 PM No.105778488
what's a good 1-3B parameter uncensored model?
Replies: >>105778511 >>105778532 >>105778721
Anonymous
7/2/2025, 6:37:33 PM No.105778510
Anything beating R1 for local, general tasks?
Replies: >>105778527 >>105778532
Anonymous
7/2/2025, 6:37:33 PM No.105778511
>>105778488
rocinante
Anonymous
7/2/2025, 6:39:15 PM No.105778527
>>105778510
no
Anonymous
7/2/2025, 6:40:16 PM No.105778532
>>105778488
I can't imagine any exist.
I suppose you could try llama 3.2 or qwen 2.5 with a prefill or something.

>>105778510
Minimax maybe, but probably not.
Anonymous
7/2/2025, 6:41:44 PM No.105778545
1672964381983669
1672964381983669
md5: a286869ce1e7ee862b5fd0f861d77659๐Ÿ”
how do I get my fucking LLM to stop getting naked in a single sentence? I want every single article of clothing to have an entire fucking paragraph
Imagine getting a private dance in a strip club, the stripper just rips off her entire outfit like they're breakaway clothes and sticks out her hand "That'll be $100"
Replies: >>105778565 >>105778575 >>105778609 >>105778875 >>105779038 >>105779064 >>105779480
Anonymous
7/2/2025, 6:42:39 PM No.105778556
>>105778400 (OP)
Small models
Anonymous
7/2/2025, 6:42:43 PM No.105778557
70b q8 is the bare minimum for mediocre rp
Replies: >>105778606 >>105781361
Anonymous
7/2/2025, 6:43:31 PM No.105778565
>>105778545
Example messages and/or just good old low depth instructions.
Anonymous
7/2/2025, 6:43:53 PM No.105778567
>>105778404
The mikutranny posting porn in /ldg/:
>>105715769
It was up for hours while anyone keking on troons or niggers gets deleted in seconds, talk about double standards and selective moderation:
https://desuarchive.org/g/thread/104414999/#q104418525
https://desuarchive.org/g/thread/104414999/#q104418574
Here he makes >>105714003 ryona picture of generic anime girl anon posted earlier >>105704741, probably because its not his favorite vocaloid doll, he can't stand that as it makes him boil like a druggie without fentanyl dose, essentialy a war for rights to waifuspam or avatarfag in thread.

Funny /r9k/ thread: https://desuarchive.org/r9k/thread/81611346/
The Makise Kurisu damage control screencap (day earlier) is fake btw, no matches to be found, see https://desuarchive.org/g/thread/105698912/#q105704210 janny deleted post quickly.

TLDR: Mikufag / janny deletes everyone dunking on trannies and resident avatarfags, making it his little personal safespace. Needless to say he would screech "Go back to teh POL!" anytime someone posts something mildly political about language models or experiments around that topic.

And lastly as said in previous thread(s) >>105716637, i would like to close this by bringing up key evidence everyone ignores. I remind you that cudadev of llama.cpp (JohannesGaessler on github) has endorsed mikuposting. That's it.
He also endorsed hitting that feminine jart bussy a bit later on. QRD on Jart - The code stealing tranny: https://rentry.org/jarted

xis accs
https://x.com/brittle_404
https://x.com/404_brittle
https://www.pixiv.net/en/users/97264270
https://civitai.com/user/inpaint/models
Replies: >>105778774
Anonymous
7/2/2025, 6:44:47 PM No.105778575
>>105778545
They do indeed like to compact what i'd like to be 20 messages into 1. System prompts don't help
Replies: >>105778610 >>105779480
Anonymous
7/2/2025, 6:44:50 PM No.105778576
c7107f6e90d8d267d8c802cd24ac2646e3c10485
c7107f6e90d8d267d8c802cd24ac2646e3c10485
md5: 880e5930547955d237bec80c9c89c8da๐Ÿ”
Thread culture recap.
Anonymous
7/2/2025, 6:45:57 PM No.105778589
5409bc3f20a39c913655b9ca62fc79dbba39991d
5409bc3f20a39c913655b9ca62fc79dbba39991d
md5: cd51c146238318de9c7cd4303d0473ef๐Ÿ”
Anonymous
7/2/2025, 6:47:04 PM No.105778597
1676b1b069276f1f39440a6c0c44cf4f4c82654
1676b1b069276f1f39440a6c0c44cf4f4c82654
md5: f1e567bd428c12a69b673922d967ef94๐Ÿ”
Anonymous
7/2/2025, 6:48:00 PM No.105778606
>>105778557
fact check: true
Anonymous
7/2/2025, 6:48:05 PM No.105778609
>>105778545
That's so fucking hot.
Anonymous
7/2/2025, 6:48:09 PM No.105778610
>>105778575
I don't know if it's something about the training, the shitty context performance, or the fact that we mostly use quanted models, but sys prompts by and large seem to not do that much for specific things, which is why models tend to deviate from the character card so quickly too I imagine.
Low depth instructions, tags, and the like seem to help a lot.
Replies: >>105788259
Anonymous
7/2/2025, 6:48:11 PM No.105778611
c271ce25d83ed9e81b0d8f5f040c3dca49f1e59b
c271ce25d83ed9e81b0d8f5f040c3dca49f1e59b
md5: f02cf5b2796ca4ad0a0d080b7c809f28๐Ÿ”
Anonymous
7/2/2025, 6:49:33 PM No.105778621
fcc252835da42806ed9a80665e92a9c412820e9a
fcc252835da42806ed9a80665e92a9c412820e9a
md5: 58013d8049fab0ebcf94720a1e30b650๐Ÿ”
Anonymous
7/2/2025, 6:50:42 PM No.105778628
ddeb129c0006048c309301ac8f002daeaaec1
ddeb129c0006048c309301ac8f002daeaaec1
md5: 9f36d9e0ac3b19f57488c7a48bef3a16๐Ÿ”
Anonymous
7/2/2025, 6:50:53 PM No.105778629
>he's at it again
What triggered him now?
Replies: >>105778641 >>105778650
Anonymous
7/2/2025, 6:52:02 PM No.105778641
5ffa32ed9f3b692778ef196ee4d3a24360d8aa
5ffa32ed9f3b692778ef196ee4d3a24360d8aa
md5: ed909c73340a6779efe2a3f976a47d87๐Ÿ”
>>105778629
jannyfaggot. you can thank him for your renewed blacked miku subscription
Anonymous
7/2/2025, 6:52:22 PM No.105778645
He makes you mad and that's a good thing.
Anonymous
7/2/2025, 6:52:58 PM No.105778650
>>105778629
I don't mind it, porn is better than mediocre benchmaxxed model releases anyway
Replies: >>105778659
Anonymous
7/2/2025, 6:53:48 PM No.105778656
is qwen3-30-a3b still the best usable model for 8gb vramlets with decent ram?
Replies: >>105778676 >>105778682 >>105778690
Anonymous
7/2/2025, 6:54:01 PM No.105778659
>>105778650
Personally I like the benchmaxxing drama.
Anonymous
7/2/2025, 6:54:30 PM No.105778663
1747241044419024
1747241044419024
md5: 9e856264174d221baa7f345150f6e302๐Ÿ”
Somebody's mad. Have a wholesome miku
Replies: >>105778868 >>105780674 >>105780806
Anonymous
7/2/2025, 6:54:35 PM No.105778664
https://files.catbox.moe/95axh6.jpg
Anonymous
7/2/2025, 6:55:45 PM No.105778676
>>105778656
For general stuff? Probably.
For coom? It's lackluster from the little I tested it.
Anonymous
7/2/2025, 6:56:12 PM No.105778682
>>105778656
Yes. It is that good for general use
Anonymous
7/2/2025, 6:57:02 PM No.105778690
>>105778656
For everything except RP. I hope gooning to math is your thing
Anonymous
7/2/2025, 7:00:11 PM No.105778721
>>105778488
gemma 3n is amazing and pretty much the only option right now. Will be cool to see if anyone sloptunes it.
Replies: >>105778744
Anonymous
7/2/2025, 7:02:08 PM No.105778744
>>105778721
Isn't that an 8b model?
Replies: >>105778771
Anonymous
7/2/2025, 7:04:23 PM No.105778769
So <|extra_4|> is the system end token and <|extra_0|> the user end token? For Hunyuan.
Replies: >>105778792 >>105778812
Anonymous
7/2/2025, 7:04:39 PM No.105778771
>>105778744
its like 4-5gb. Is anything smaller even remotely usable for any purpose?
Replies: >>105778860
Anonymous
7/2/2025, 7:05:04 PM No.105778774
>>105778567
https://rentry.co/bxa9go2o
Anonymous
7/2/2025, 7:07:03 PM No.105778792
>>105778769
nta but i wanna know this too
Anonymous
7/2/2025, 7:08:39 PM No.105778810
what is the best 70b model for cuck erotica?
Replies: >>105778932 >>105778955 >>105779361
Anonymous
7/2/2025, 7:08:42 PM No.105778812
>>105778769
seems like it
Anonymous
7/2/2025, 7:14:03 PM No.105778860
>>105778771
Oh fuck. I didn't see that there was a 6-ish B version in addition to the 8ish B model.
Neat.
Still a little larger than what anon requested but probably worth a try.
Anonymous
7/2/2025, 7:14:37 PM No.105778868
>>105778404
>>105778663
The mikutranny posting porn in /ldg/:
>>105715769
It was up for hours while anyone keking on troons or niggers gets deleted in seconds, talk about double standards and selective moderation:
https://desuarchive.org/g/thread/104414999/#q104418525
https://desuarchive.org/g/thread/104414999/#q104418574
Here he makes >>105714003 ryona picture of generic anime girl anon posted earlier >>105704741, probably because its not his favorite vocaloid doll, he can't stand that as it makes him boil like a druggie without fentanyl dose, essentialy a war for rights to waifuspam or avatarfag in thread.

Funny /r9k/ thread: https://desuarchive.org/r9k/thread/81611346/
The Makise Kurisu damage control screencap (day earlier) is fake btw, no matches to be found, see https://desuarchive.org/g/thread/105698912/#q105704210 janny deleted post quickly.

TLDR: Mikufag / janny deletes everyone dunking on trannies and resident avatarfags, making it his little personal safespace. Needless to say he would screech "Go back to teh POL!" anytime someone posts something mildly political about language models or experiments around that topic.

And lastly as said in previous thread(s) >>105716637, i would like to close this by bringing up key evidence everyone ignores. I remind you that cudadev of llama.cpp (JohannesGaessler on github) has endorsed mikuposting. That's it.
He also endorsed hitting that feminine jart bussy a bit later on. QRD on Jart - The code stealing tranny: https://rentry.org/jarted

xis accs
https://x.com/brittle_404
https://x.com/404_brittle
https://www.pixiv.net/en/users/97264270
https://civitai.com/user/inpaint/models
Anonymous
7/2/2025, 7:15:27 PM No.105778875
>>105778545
Ask them to strip, piece by piece
Anonymous
7/2/2025, 7:19:43 PM No.105778932
>>105778810
rocinante 70b
Anonymous
7/2/2025, 7:19:51 PM No.105778934
Real BitNet status?
Replies: >>105784147
Anonymous
7/2/2025, 7:22:06 PM No.105778955
>>105778810
eva for cuck erotica, llama 3.1 for cucked erotica
Anonymous
7/2/2025, 7:24:27 PM No.105778981
and just like that local was saved
llama : initial Mamba-2 support (#9126) 13 minutes ago
https://github.com/ggml-org/llama.cpp/commit/5d46babdc2d4675d96ebcf23cac098a02f0d30cc
Replies: >>105778996 >>105779004 >>105779043 >>105779108 >>105779166 >>105779974 >>105780071 >>105780126
Anonymous
7/2/2025, 7:25:58 PM No.105778996
>>105778981
Which model can I use with that?
Replies: >>105779730
Anonymous
7/2/2025, 7:27:20 PM No.105779004
>>105778981
Oh shit, that means Jamba soon, probably.
Maybe.
I hope.
Anonymous
7/2/2025, 7:29:52 PM No.105779038
>>105778545
The problem is that they're being trained on outputting single response benchmark-tier answers, e.g. "write me a short story about a banana that talks like a pirate", so the models have to try and fit everything in a single response. This is also why a lot of models fall apart later in the context or after a number of messages. In other words, often times you can't really prompt it away because the model hasn't been trained on long-form/back and forth responses
Replies: >>105779082 >>105779124
Anonymous
7/2/2025, 7:30:49 PM No.105779043
>>105778981
>initial
What about the rest of it?
Anonymous
7/2/2025, 7:32:31 PM No.105779064
>>105778545
use mikupad and never look back
Replies: >>105779095 >>105779216 >>105779372 >>105779480
Anonymous
7/2/2025, 7:34:37 PM No.105779082
>>105779038
Tell that to Mistral and that schizo anon who says that multi-turn conversations are a meme.
Replies: >>105779119
Anonymous
7/2/2025, 7:36:02 PM No.105779095
>>105779064
This. Chat templates are a meme. Just add \n to your stopping sequence.
Replies: >>105779216
Anonymous
7/2/2025, 7:36:39 PM No.105779108
1747241044419025
1747241044419025
md5: e97a0f192c9a73a8c824bcc2d4e26dcf๐Ÿ”
>>105778981
Anonymous
7/2/2025, 7:37:28 PM No.105779119
>>105779082
I think both have validity, existing models are trained seemingly for 1-2-3 turn convos, and as a result, NoAss extension(ST, compress all prior messages into one) and similar are useful for handling this issue, but it is an issue, and is something that's self-inflicted by the model makers.

I blame LMArena for it.
Replies: >>105779209
Anonymous
7/2/2025, 7:38:05 PM No.105779124
>>105779038
I wonder if that means these models would perform better if we always sent the context with a sys prompt and a user message containing the whole chat history or some similar arrangement where the model is always responding to a single user turn and the rest of the history is somewhere in the context (sys prompt, user message, whatever).
Replies: >>105779176
Anonymous
7/2/2025, 7:40:54 PM No.105779166
>>105778981
>saved
LoL
Anonymous
7/2/2025, 7:41:57 PM No.105779176
>>105779124
Look into aicg's Noass thing, it does basically that.
Replies: >>105779238
Anonymous
7/2/2025, 7:45:18 PM No.105779209
>>105779119
I think the main reason is that single-turn datasets are just so much easier to craft/generate and work for most normies' use-cases.
Anonymous
7/2/2025, 7:45:52 PM No.105779215
GUYS
https://huggingface.co/openai/gpt-p1
Replies: >>105779240 >>105779250
Anonymous
7/2/2025, 7:45:52 PM No.105779216
big ants
big ants
md5: bd857a89789ba7c530f27c62e4cea92c๐Ÿ”
>>105779064
>>105779095
What are you doing to miku to make it function well? From what little I messed with it, it just seems like a KoboldClient UI with even less features
Replies: >>105779349
Anonymous
7/2/2025, 7:47:46 PM No.105779238
>>105779176
So that is a thing.
Neat.
Anybody here tried that with the usual open weight models?
How did it perform?
Anonymous
7/2/2025, 7:47:52 PM No.105779240
1731162001019142
1731162001019142
md5: b8a95717436574359d76e684f81423e1๐Ÿ”
>>105779215
Anonymous
7/2/2025, 7:48:31 PM No.105779250
>>105779215
>257B
Anonymous
7/2/2025, 7:56:39 PM No.105779349
1738307762538467
1738307762538467
md5: f55dd9326d028ca0d6275ca17c41ffd4๐Ÿ”
>>105779216
The idea is you write something and let the LLM just continue from there. Put descriptions, setting, etc. at the top of the prompt then separate it from the story part. You can put detailed descriptions in the prompt or just general tags and let the LLM take it from there. Something like picrel.
Anonymous
7/2/2025, 7:58:02 PM No.105779361
>>105778810
Just use R1 API
Anonymous
7/2/2025, 7:58:42 PM No.105779372
>>105779064
buy an ad
Anonymous
7/2/2025, 8:01:29 PM No.105779401
(You)
(You)
md5: aef45774e717750f38849be552d7d823๐Ÿ”
Replies: >>105779431 >>105779456 >>105779521 >>105779586
Anonymous
7/2/2025, 8:03:37 PM No.105779431
ssrlkk24py531
ssrlkk24py531
md5: 3d0622dd97c3ce3e23a7f923eae847d6๐Ÿ”
>>105779401
Touch grass
Replies: >>105779467
Anonymous
7/2/2025, 8:05:16 PM No.105779456
1733972547076131
1733972547076131
md5: 36390fffe7b30d4bd0deb51cdad003a0๐Ÿ”
>>105779401
Can I get a QRD on the iraqis?
Anonymous
7/2/2025, 8:05:52 PM No.105779467
>>105779431
stop being brown
Replies: >>105779513 >>105779521 >>105779533
Anonymous
7/2/2025, 8:06:13 PM No.105779471
Hunyuan is retarded with the current PR.
Replies: >>105779526
Anonymous
7/2/2025, 8:06:37 PM No.105779480
>>105778545
>I want every single article of clothing to have an entire fucking paragraph
Tell it to do that.
My main issue with clothes is the LLM forgetting what the NPC is wearing. I always have a tracker block for that reason alone.
>>105778575
Aggressive system prompts also do not help with this rushing the output.
>>105779064
Or this. I still use ST more but Mikupad's unique and its own thing.
Anonymous
7/2/2025, 8:08:46 PM No.105779513
butthurt
butthurt
md5: d1d7ec534f4912447334fc7a7deb5174๐Ÿ”
>>105779467
>Still trying to shit up the thread after two years
>Failing this hard
Anonymous
7/2/2025, 8:09:38 PM No.105779521
>>105779401
>>105779467
Accusing everyone around you being brown is inherent brownie behavior.
Anonymous
7/2/2025, 8:10:04 PM No.105779526
>>105779471
Did you try
>https://huggingface.co/tencent/Hunyuan-A13B-Pretrain
According to ggerg, it has orders of magnitude lower PPL.
Seems like the thing is broken big time. Might have something to do with their funky router algorithm or something like that.
Anonymous
7/2/2025, 8:10:27 PM No.105779533
im-projecting
im-projecting
md5: 0c0333d521a4709cae18a13b4fb346b8๐Ÿ”
>>105779467
Anonymous
7/2/2025, 8:11:31 PM No.105779546
seething browns lmao israel won btw
Anonymous
7/2/2025, 8:12:01 PM No.105779552
1729453386701418
1729453386701418
md5: 298f8b69bc53297b2742bc1a5e7300f0๐Ÿ”
Replies: >>105779602 >>105779646 >>105779679 >>105780306 >>105780492 >>105780820 >>105780996 >>105781010
Anonymous
7/2/2025, 8:15:50 PM No.105779586
>>105779401
ashadu anla la ilaha ill'allah!
wa-ashadu anna muhammedan rassulullah!
Anonymous
7/2/2025, 8:17:17 PM No.105779602
>>105779552
>le Dalit/Brahmin
I hate Indians so much.
Replies: >>105779820
Anonymous
7/2/2025, 8:21:21 PM No.105779643
Do you guys have any tips for LORA training? Anything you've learned that improves "quality" whatever that may mean for you.
Replies: >>105779684 >>105779774
Anonymous
7/2/2025, 8:21:50 PM No.105779646
>>105779552
jeets being jeets
Anonymous
7/2/2025, 8:24:33 PM No.105779679
>>105779552
>Dalit are to use languages that are at or near native performance and keep the place clean
>Brahmins and Kshatriyas are script shitters who burn ten times or more the processor cycles to limp through interpretation while yelling at other people to clean up their own messes
It's a valid reflection of status in all businesses and societies.
Anonymous
7/2/2025, 8:24:57 PM No.105779684
>>105779643
Mostly don't waste your time. There's better ways to get LLM performance through prompting and RAGs.
Anonymous
7/2/2025, 8:28:53 PM No.105779730
>>105778996
https://github.com/ggml-org/llama.cpp/issues/13275
Anonymous
7/2/2025, 8:32:07 PM No.105779774
>>105779643
Finetuning models mostly make it behave as you intended in specific tasks (like tool usages for agents) but you can't really improve the "quality". You can't help it if the base model architecture sucks.
Anonymous
7/2/2025, 8:36:17 PM No.105779812
Anybody here successfully locally using Augmentoolkit, or is it flaming garbage as it seems from a quick check as of the latest version (and as it seemed last time I checked it out)?
Replies: >>105780570
Anonymous
7/2/2025, 8:36:47 PM No.105779816
mamba will save local
Anonymous
7/2/2025, 8:37:31 PM No.105779820
>>105779602
>Dalit/Brahmin
Same shit
Anonymous
7/2/2025, 8:39:10 PM No.105779842
>>105778402
you will never be straight
Anonymous
7/2/2025, 8:47:53 PM No.105779938
Jamba gguf support status?
Replies: >>105779974
Anonymous
7/2/2025, 8:52:29 PM No.105779974
>>105779938
Same as before but the man the myth the legend has woken up from his hibernation : >>105778981
Anonymous
7/2/2025, 8:56:25 PM No.105780008
dead hobby, dead general
Anonymous
7/2/2025, 9:05:17 PM No.105780071
>>105778981
>What to expect
>However, a big downside right now with recurrent models in llama.cpp is the lack of state rollback (which is implemented through state checkpoints in #7531, but needs to be re-adapted to #8526), so the prompt will be reprocessed a lot if using llama-server. I think using llama-cli in conversation mode does not have this problem, however (or maybe only the bare interactive mode with --in-prefix and --in-suffix, not sure).

>This initial implementation is CPU-only, but uses SIMD for the SSM scan, so even though the state is bigger than for Mamba-1 models, in my tests, the speed of Mamba2-130M is similar or better than Mamba-130M (but still not that fast compared to transformer-based models with an empty context), when both are run on CPU.

>The speed of Mamba-2 models seems comparable to Transformer-based models when the latter have 2k to 4k tokens in their context.

1/2
Anonymous
7/2/2025, 9:06:23 PM No.105780080
1726345477198121
1726345477198121
md5: 7bdd9061c5b8eb34858a687ddfca774a๐Ÿ”
Summary of changes

Add support for Mamba2ForCausalLM (including the official Mamba-2 models, and Mamba-Codestral-7B-v0.1)
Note that config.json needs to contain "architectures": ["Mamba2ForCausalLM"], for the convert script to properly detect the architecture.
View Mamba-1 as having d_inner (aka 2 * n_embd) heads of size 1.
This simplifies the handling of shapes in ggml_ssm_scan
ggml
Implement Mamba-2's selective state update in ggml_ssm_scan.
Re-using the same operator as Mamba-1, because it's pretty much the same operation. (except for how ssm_a is broadcast)
Fuse the operation with ssm_d into ggml_ssm_scan
Otherwise it would need to be transposed, because the dot-products are done head-wise.
Implement Mamba-2's SSM scan with GGML_SIMD.
This is possible because there is no element-wise expf in the state update unlike with Mamba-1.
Avoid state copies for the SSM state (both for Mamba-1 and Mamba-2) by passing state ids to ggml_ssm_scan.
Mamba-2 states are huge. Otherwise masking and copying took close to 10% of the CPU time according to perf.

2/2
Anonymous
7/2/2025, 9:12:00 PM No.105780126
>>105778981
>llama : initial Mamba-2 support (#9126) 13 minutes ago
it's been a while I didn't lurk this place, is this another meme? I remember mamba and it never went succesfull lol
Replies: >>105781631
Anonymous
7/2/2025, 9:15:31 PM No.105780149
bitnet status?
Replies: >>105780174
Anonymous
7/2/2025, 9:15:50 PM No.105780150
Nvm, I can't read.

>I've tested most things I wasn't completely sure about (CUDA, SVE), and inference on those platform does seem to work properly for both Mamba-1 and Mamba-2 models, with -ngl 0 and -ngl 99 (and it looks like 0b6f6be also fixes RWKV inference when compiled with SVE on a c7g AWS instance).

>Weird small models like https://huggingface.co/delphi-suite/v0-mamba-100k seem to work even when compiled with -DGGML_CUDA=ON since 71bef66 (it failed with an assert previously, but ran correctly in a CPU-only build).

So this means we get mamba locally with GPU support.
Anonymous
7/2/2025, 9:18:44 PM No.105780174
60750786e5bd5[1]
60750786e5bd5[1]
md5: 6b009782661bd6b54b834c36384348b2๐Ÿ”
>>105780149
Replace communism with bitnet.
Replies: >>105780267 >>105784205
Anonymous
7/2/2025, 9:23:56 PM No.105780228
the similarity with communism doesn't end there
bitnet like communism inspires the poor with no ambition of elevating themselves by making them believe that the trash pile they own may someday run a good model, just like how communism makes the poor believe daddy government will take care of them even if they make nothing of themselves
no, you will never run chatGPT on your poorfag GPU
Replies: >>105780256 >>105780458
Anonymous
7/2/2025, 9:26:39 PM No.105780256
>>105780228
It would work if people weren't fucking retarded
But in that case capitalism would too
Anonymous
7/2/2025, 9:27:34 PM No.105780267
>>105780174
lmaooo
Anonymous
7/2/2025, 9:31:26 PM No.105780306
>>105779552
Maybe it was just a joke about levels of abstractions and the dumb d*lit didn't get it
Anonymous
7/2/2025, 9:48:25 PM No.105780458
>>105780228
but current local is better than old 3.5 turbo. It just depends on which moving goalpost you use as "chatgpt" and how long you wait for tech.
Replies: >>105780522
Anonymous
7/2/2025, 9:51:24 PM No.105780492
>>105779552
I don't know what those names mean, but I get it's infighting, so that's nice.
Anonymous
7/2/2025, 9:53:27 PM No.105780522
>>105780458
we need better than gpt5 (coming soon) in under 8b
Anonymous
7/2/2025, 9:57:34 PM No.105780570
>>105779812
My experience:
>The installation downloaded a few gigabytes of unrequested libraries and started compiling stuff. Huh?
>The documentation is annoyingly full of fluff/bullshit and explains nothing concisely
>Seemingly no ready-to-run example for local data generation (assumes API access and downloading things on-the-fly)
>Video tutorials (bad sign) from the author don't help with that either
>It wants the user to download the author's dataset-generation Mistral-7B finetune if you're not using API-only models (no thanks)
>Apparently uses wrong/no chat templates for its augmentation process, just free-form prompting. Surely that will yield good results?
>The built-in data pipelines are full of slop
>It really can't just connect to an OAI-compatible server and generate, can it?
>Why is it also dealing with training and inference? Just fucking generate the data
>The author is seemingly working on commercial solutions which probably explains why the whole thing feels deliberately obfuscated

Verdict: waste of time. I'll roll out my own.
Replies: >>105780705
Anonymous
7/2/2025, 10:06:32 PM No.105780674
>>105778663
Holding hands with Miku
Replies: >>105780806
Anonymous
7/2/2025, 10:08:53 PM No.105780705
>>105780570
>Apparently uses wrong/no chat templates for its augmentation process, just free-form prompting.
What the fuck.
I guess if they are using base models for completion, but that wouldn't make sense now would it.
What the fuck?
Replies: >>105780906
Anonymous
7/2/2025, 10:17:50 PM No.105780800
>>105778400 (OP)
Are there any FOSS gen music models on Huggingface or similar?
Tired of dealing with SUNO restrictions.
Replies: >>105781033
Anonymous
7/2/2025, 10:18:02 PM No.105780806
1681850382478594_thumb.jpg
1681850382478594_thumb.jpg
md5: 39750f384c28f49dfd283f2584f04537๐Ÿ”
>>105778404
>>105778663
>>105780674
https://files.catbox.moe/ftq6qc.png
Anonymous
7/2/2025, 10:19:21 PM No.105780820
>>105779552
>the most plebean language is reserved for the "upper caste"
I will never understand i*dians
Replies: >>105780895
Anonymous
7/2/2025, 10:26:20 PM No.105780895
>>105780820
It's reserved for middle-managers so they can "prototype" quickly. Then some unfortunate one needs to implement the same thing in C++.
Anonymous
7/2/2025, 10:27:33 PM No.105780906
>>105780705
The previous version was like that and I don't see entries for configuring and ensuring the correct prompting format in the new one. The whole project is an overengineered yet poorly functional clusterfuck in my humble opinion, I don't care if it works for whoever contributed to it.
Anonymous
7/2/2025, 10:37:00 PM No.105780996
>>105779552
I always wondered how much was this caste shit still a thing among urbanized Indians.
Seems like they still live the retard dream huh.
Replies: >>105781080
Anonymous
7/2/2025, 10:38:34 PM No.105781010
>>105779552
If their entire society agrees that castes exist, then how is discrimination based on caste bad?
Replies: >>105781063
Anonymous
7/2/2025, 10:40:27 PM No.105781033
>>105780800
ACE Step is pretty good for ideas and prototypes, but the output is definitely not "studio quality"
It can be a lot of fun for messing around making meme songs with your buddies, kinda like playing QWOP was a good time back in the day
Replies: >>105781078
Anonymous
7/2/2025, 10:44:11 PM No.105781063
>>105781010
Because the top four castes get to agree that YOU are in the fifth.
Anonymous
7/2/2025, 10:45:45 PM No.105781078
>>105781033
ACE-Step is only good for lyrics edits.
https://vocaroo.com/11M5Ft5ahPzp
Anonymous
7/2/2025, 10:45:59 PM No.105781080
>>105780996
It's all a nightmare that we can't wake up from. I wish I didn't know about any of this stupid poop skin drama.
Anonymous
7/2/2025, 10:52:00 PM No.105781141
Remind me why I should care about Mamba.
Replies: >>105781189
Anonymous
7/2/2025, 10:54:21 PM No.105781162
Mamba No. 5
Anonymous
7/2/2025, 10:57:14 PM No.105781189
>>105781141
compute scales linear with context instead of quadratic. So substantially less fall-off in inference speed/power efficiency with context length scaling. Meaning that devices with higher core counts (and thus larger dies) aren't favored as much as just putting more fucking VRAM on consumer tier devices.
Anonymous
7/2/2025, 10:59:31 PM No.105781209
>>105778400 (OP)
> GLM-4.1V-9B-Thinking released
Aider benchmark?
Anonymous
7/2/2025, 11:17:22 PM No.105781361
>>105778557
Using Mistral Small 24B Q8_0 (for most of it used an exl2 that claimed to be 8.0bpw but was probably 6.0bpw or something) I had a coherent adventure game experience up to 19k tokens before it fell apart. Log at >>102543451
Replies: >>105781555 >>105781600
Anonymous
7/2/2025, 11:35:12 PM No.105781544
>people are still unironically using shitstral models
Replies: >>105781600
Anonymous
7/2/2025, 11:36:26 PM No.105781555
>>105781361
*using Mistral Small 22B
Anonymous
7/2/2025, 11:41:34 PM No.105781600
i tries my best
i tries my best
md5: 512dd5738f538a82a5d767882c7d8071๐Ÿ”
>>105781361
I wonder if the multiple-choice format of your answers leads to extra tokens being processed, which might lead to incoherency. ive had mistral make sense beyond 24k tokens

>>105781544
picrel
Replies: >>105781630
Anonymous
7/2/2025, 11:44:43 PM No.105781630
1741477917043259
1741477917043259
md5: c14e1166cebb38d90f7f325c03ddba71๐Ÿ”
>>105781600
keep improving
Anonymous
7/2/2025, 11:44:46 PM No.105781631
>>105780126
Other than being used by Gemini you mean
Anonymous
7/3/2025, 12:40:41 AM No.105782177
multimodal is a giant meme
Anonymous
7/3/2025, 12:45:58 AM No.105782228
d30022d210c872ddf25bc24cc0ff0bd1
d30022d210c872ddf25bc24cc0ff0bd1
md5: d102efcfdbf530967f9866b3e5970eb8๐Ÿ”
Gonna ask a question you guys likely see a million times a day, but can anyone recommend a (solid) RP LLM for 16GB VRAM? I'm completely new to this shit and all the information I can find seems horribly outdated or just wrong.
Replies: >>105782245
Anonymous
7/3/2025, 12:47:21 AM No.105782245
>>105782228
Magistral-Small-2506_Q8_0.gguf
Replies: >>105782256
Anonymous
7/3/2025, 12:48:16 AM No.105782256
>>105782245
That was fast. Many thanks, I appreciate the guidance.
Replies: >>105782406
Anonymous
7/3/2025, 1:01:01 AM No.105782383
1751440461951796
1751440461951796
md5: bb1fec2fbd54e9ed773c5eb63613cd4d๐Ÿ”
Availability fluctuations = new model release soon
You heard it here first
https://zzzzzzz.grafana.net/public-dashboards/88296a8e74c14dae8f839c2b9973214b
Replies: >>105782617 >>105787180 >>105788078
Anonymous
7/3/2025, 1:02:53 AM No.105782406
>>105782256
The model/quantization the retard above suggested won't fit into your 16 GB of VRAM and it's not too great for RP either.
Replies: >>105782727
Anonymous
7/3/2025, 1:17:32 AM No.105782560
Turning mmap off boosted my generation speed from 1 T/s to 8 T/s...
Replies: >>105782628
Anonymous
7/3/2025, 1:23:25 AM No.105782611
>Claude thinks Anthropic's logo is a whale
Worst hallucination I've ever seen
Replies: >>105782637
Anonymous
7/3/2025, 1:24:16 AM No.105782617
>>105782383
meds
Replies: >>105782637
Anonymous
7/3/2025, 1:24:53 AM No.105782628
>>105782560
I always turn that shit off.
Tranny shart code.
Anonymous
7/3/2025, 1:25:48 AM No.105782637
1742069751231909
1742069751231909
md5: 094ae4e57ddb64e389a310a793eea835๐Ÿ”
>>105782611
>>105782617
It's already on LMArena and it's called Steve.
Replies: >>105782696 >>105782699 >>105782704 >>105782721 >>105782726 >>105782745 >>105788043 >>105788078
Anonymous
7/3/2025, 1:29:26 AM No.105782664
Chinchilla Llama Hyena Mamba Pajama Orca Falcon Dolphin Thinking Reflection CoT 0.68bit RetNet 10e100 context AGI ASI Vibe-prompting Hybrid MoE Flash 1000x1000x1000T Preview Vision Hearing Tasting Music Video Omnimodal Omnilingual Pre-trained Post-trained After-tuned Agentic Self-improving Dev GGUF imatrix FP64 v3.45-54f.0 Writer MAXI 1776 (Fixed)
Anonymous
7/3/2025, 1:33:13 AM No.105782696
>>105782637
>(Note: [...])
ominous
Replies: >>105782746
Anonymous
7/3/2025, 1:33:30 AM No.105782699
>>105782637
LA-LA-LA LAVA
Anonymous
7/3/2025, 1:33:57 AM No.105782704
>>105782637
>you are whale
>k
>you are whale
>k
>what do you look like?
>whale
Fucking retard
Replies: >>105782713 >>105788043
Anonymous
7/3/2025, 1:35:05 AM No.105782713
>>105782704
claude shill big mad
Replies: >>105782731
Anonymous
7/3/2025, 1:35:27 AM No.105782719
I'm trying to get Latex output to render correctly in Koboldcpp, and it's not recognizing latex equations consisting of a single letter (for example: $G$)
Is there a known fix for this or do I have to fix it myself?
Anonymous
7/3/2025, 1:35:52 AM No.105782721
>>105782637
what's with the deadpan (narration)
Replies: >>105782746 >>105782747
Anonymous
7/3/2025, 1:36:00 AM No.105782722
any chance local will ever recover?
Anonymous
7/3/2025, 1:36:22 AM No.105782726
>>105782637
They should cut costs and put you there too. Llama3 1B would have some competition
Anonymous
7/3/2025, 1:36:25 AM No.105782727
>>105782406
Well, shit. Do you have any suggestions, then?
Replies: >>105782754
Anonymous
7/3/2025, 1:36:38 AM No.105782729
any chance baits will ever stop?
Anonymous
7/3/2025, 1:36:57 AM No.105782731
>>105782713
word sentence complete not
Anonymous
7/3/2025, 1:38:18 AM No.105782745
>>105782637
yeah let me go erp with steve
Replies: >>105784221
Anonymous
7/3/2025, 1:38:19 AM No.105782746
>>105782721
>>105782696
Might be a Chinese only thing (I prompted in Chinese "ไฝ ๅฅฝ้ชšๅ•Šๅฐ้ฒธ้ฑผ")
Anonymous
7/3/2025, 1:38:21 AM No.105782747
>>105782721
LMarena cheat code
Anonymous
7/3/2025, 1:38:37 AM No.105782749
bait tastes good
Anonymous
7/3/2025, 1:39:34 AM No.105782754
>>105782727
The fucking lazy guide says mistral nemo 12b. It's still mistral nemo 12b. It's been mistral nemo 12b for a year.
Can't read the fucking lazy guide and we have to keep feeding people like you.
Go get mistral nemo 12b or a finetune. Any.
Replies: >>105782778 >>105782789
Anonymous
7/3/2025, 1:40:13 AM No.105782761
I've been out of the loop for a few months, is local saved?
Replies: >>105782770 >>105782784 >>105782844 >>105783003 >>105788178
Anonymous
7/3/2025, 1:41:22 AM No.105782770
1728175273915157
1728175273915157
md5: 1b390c4c541fe02b04fa38ddc26cd18c๐Ÿ”
>>105782761
Replies: >>105782847
Anonymous
7/3/2025, 1:42:06 AM No.105782778
>>105782754
How the fuck am I supposed to know that? Shit moves fast enough in this industry I'd expect something more recent than whatever the fuck has been posted in the general's same links for a year. Fucking retard.
Replies: >>105782793 >>105782794
Anonymous
7/3/2025, 1:42:43 AM No.105782784
>>105782761
sam's about to drop his open model that'll save local
Anonymous
7/3/2025, 1:43:17 AM No.105782789
>>105782754
Mistral shill should get gassed right after we gas Claude shills
Replies: >>105782802
Anonymous
7/3/2025, 1:43:37 AM No.105782793
>>105782778
You are correct, most of the recommendations in the OP are more than 2-3 years old and are useless
Replies: >>105782797
Anonymous
7/3/2025, 1:43:39 AM No.105782794
>>105782778
Lurk for 20 minutes, check the archives. That's how.
Replies: >>105782797 >>105782817
Anonymous
7/3/2025, 1:44:01 AM No.105782797
>>105782794
You have no argument against this >>105782793
Replies: >>105782815
Anonymous
7/3/2025, 1:44:40 AM No.105782802
>>105782789
>Mistral shill
Suggest something better for poor anon.
>Claude shills
You're a whale.
Anonymous
7/3/2025, 1:45:42 AM No.105782815
>>105782797
Fine. Get something better to run on 16gb vram and report back. I'll wait.
Anonymous
7/3/2025, 1:46:09 AM No.105782817
>>105782794
See, my hope is that like any community, you can ask a simple fucking question and get an answer from someone who's not a complete chucklefuck like yourself. "Check the archive" where I can dig through a bunch of nothing for three straight hours, which option makes more sense to you? Are you fucking stupid?
Replies: >>105782839 >>105782842 >>105782853
Anonymous
7/3/2025, 1:48:46 AM No.105782839
OP
OP
md5: bdc89921680d1d4d7e215960028f3b32๐Ÿ”
>>105782817
Anonymous
7/3/2025, 1:48:51 AM No.105782842
>>105782817
Please ignore the resident threadshitter, it's not past his bedtime yet and he hasn't dilated yet today
Anonymous
7/3/2025, 1:48:55 AM No.105782844
>>105782761
steve will save us
(Note: it will not)
Anonymous
7/3/2025, 1:49:05 AM No.105782847
>>105782770
It's 126 days until November 5th, Miku
Anonymous
7/3/2025, 1:50:07 AM No.105782853
>>105782817
>wanna gen text
>can't read
Replies: >>105782900
Anonymous
7/3/2025, 1:55:39 AM No.105782897
Which model can give me an oiled footjob
Replies: >>105782909 >>105784849
Anonymous
7/3/2025, 1:55:58 AM No.105782900
>>105782853
This is a lmg requirement though so he'll fit right in.
Anonymous
7/3/2025, 1:56:47 AM No.105782909
>>105782897
jepa, if you're willing to risk it.
Anonymous
7/3/2025, 2:09:25 AM No.105783003
>>105782761
No, we still avatarfagging with trans-coded characters here.
Replies: >>105783037 >>105783063 >>105784528
Anonymous
7/3/2025, 2:14:53 AM No.105783037
>>105783003
Miku will always be /lmg/'s mascot
Get over it, Jart
Anonymous
7/3/2025, 2:18:05 AM No.105783063
>>105783003
is miku trans-coded? Since when?
Replies: >>105783076 >>105783112 >>105783204
Anonymous
7/3/2025, 2:19:38 AM No.105783076
>>105783063
Since I said so okay?!
Anonymous
7/3/2025, 2:21:09 AM No.105783087
1748007705835229
1748007705835229
md5: 2f4d7c7bab274f118e923b7a1a97a3ca๐Ÿ”
It trooned out :(
Replies: >>105783116
Anonymous
7/3/2025, 2:23:34 AM No.105783112
9
9
md5: cb082339f57be2af9340ad9748799df0๐Ÿ”
>>105783063
This is OFFICIAL hatsune miku x pokemon crossover art from 2023. Sickening...
Replies: >>105784585
Anonymous
7/3/2025, 2:24:08 AM No.105783116
>>105783087
kek
Anonymous
7/3/2025, 2:26:08 AM No.105783132
Nigga you're literally using a technology called trannyformers
You don't get to trannyshame Hatsune Miku
Anonymous
7/3/2025, 2:37:49 AM No.105783204
traniku
traniku
md5: d108616f921b057789849100911e8dbb๐Ÿ”
>>105783063
Replies: >>105784585
Anonymous
7/3/2025, 2:52:39 AM No.105783316
__gardevoir_pokemon_drawn_by_saltyxodium__6af343118a584ab042a7656a4af14a36
I'm directing all of my cursed energy towards whoever originally came up with "tuning models to output markdown", not because this makes them worse at writing, but because this makes them fucking disgusting as assistants.
Anonymous
7/3/2025, 2:56:11 AM No.105783341
>>105778400 (OP) Check this out me dudes: https://www.reddit.com/r/ChatGPTPromptGenius/comments/1lq7zrv/google_just_launched_a_global_hackathon_with_a/
Replies: >>105783762
Anonymous
7/3/2025, 3:03:06 AM No.105783383
drummer is my builder
Replies: >>105783436
Anonymous
7/3/2025, 3:11:23 AM No.105783436
>>105783383
Satan is my motor
Anonymous
7/3/2025, 3:11:29 AM No.105783437
Steve will save local (it will)
Anonymous
7/3/2025, 3:45:03 AM No.105783683
https://huggingface.co/unsloth/gemma-3n-E4B-it-unsloth-bnb-4bit
Why is it so big if its 4b and 4bit?
I dont really get it.
The MLX quants are under 4gb.
https://huggingface.co/mlx-community/gemma-3n-E4B-it-lm-4bit/tree/main
Replies: >>105783714 >>105783720
Anonymous
7/3/2025, 3:50:35 AM No.105783714
>>105783683
>https://huggingface.co/unsloth/gemma-3n-E4B-it-unsloth-bnb-4bit
>Model size
7.85B params
>https://huggingface.co/mlx-community/gemma-3n-E4B-it-lm-4bit/tree/main
>Model size
1.07B params
Gee I dunno
Replies: >>105783762
Anonymous
7/3/2025, 3:51:12 AM No.105783720
>>105783683
Unsloth are hacks.
Replies: >>105783758 >>105783762
Anonymous
7/3/2025, 3:55:03 AM No.105783746
82c654dfly1i30ie12l2rj21kw13vti6
82c654dfly1i30ie12l2rj21kw13vti6
md5: b1acd079b65b1b5bff9827e5a5b7c0c3๐Ÿ”
Anonymous
7/3/2025, 3:56:55 AM No.105783758
>>105783720
this. reddit tier noobs.
Anonymous
7/3/2025, 3:57:15 AM No.105783762
>>105783714
So the mlx quants are fucked?
That size does look closer though in my opinion. If its 4bit of a 8b model.
Also I can't believe ollama does not support audio in and image in yet. Whats the whole point of their "in house engine" then?
Google offering ollama price money for it here >>105783341 but no support.
Also kek at the unsloth fags writing "only text supported for this model". Yeah...in ollama/llama.cpp! kek
The model is really good with japanese. Crazy for that size. I wanted to try make a bot for my kids to speak to. Damn it.

>>105783720
I don't get how they are hyped everywhere.
Forgot the model, maybe it was mistral. But they quants where totally tarded. Redownloaded good ol' bartowski and everything was fine.
I'm sure that was no coincidence then.
Replies: >>105783777
Anonymous
7/3/2025, 3:58:39 AM No.105783777
>>105783762
seeing faggots like unsloth and thedrummer spamming their shit on reddit is an easy red flag. instant skip.
Anonymous
7/3/2025, 4:10:10 AM No.105783874
>great chink models dropping every week
>can't use any of them because they're stuck in llamacpp PR hell
pain
Replies: >>105783924
Anonymous
7/3/2025, 4:15:17 AM No.105783924
>>105783874
hm maybe those chink model makers should stop implementing special shit if they want people to actually use their models
Replies: >>105784086
Anonymous
7/3/2025, 4:38:26 AM No.105784086
>>105783924
Yeah, just give us the same slop with even more scaleai data.
Anonymous
7/3/2025, 4:42:10 AM No.105784125
00247-3534773655
00247-3534773655
md5: cf08abf66dc296396470d98af3841820๐Ÿ”
>decide to try LLMs after genning enough cute anime girls for the time being
>follow stupid bitch guide to textgen
>well, 11gb on my 2080ti was good enough, right?
>well, it says exl2 is way better for VRAM than gptq
>1.3t/s
Replies: >>105784205 >>105784391
Anonymous
7/3/2025, 4:45:10 AM No.105784147
>>105778934
Isn't anything with quantisation aware training good enough?
Replies: >>105784205
Anonymous
7/3/2025, 4:50:11 AM No.105784187
1663932304833380
1663932304833380
md5: b8649c13092ed4801432e50341918178๐Ÿ”
What's the minimum sized model for somewhat accurate image to text completion?
i want to send my chatbots images and upgrade from the over a year old mixtral 7x8 model i have been using, but the newish 7-12b models i have tried either crash when i try to launch them or thinks
>image related
is somewhere between a pancake breakfast, exotic butterfly or a man resting on a bed.
but cant tell that its a cat.

i dread what bullshit these models might try to pull if anything explicit is sent their way, is small image to text models just useless and i need to shill out for more vram? (tried both local and multimodal in ST)
Anonymous
7/3/2025, 4:52:20 AM No.105784205
>>105784147
NTA, but QAT is nowhere near as good as advertised from the little evidence we have, as far as I can tell.
That said, >>105780174.
So, who knows which is the better approach.
Hell, it could be that the one approach that aligns current software and hardware for both training and inference is deeper models trained in FP 4.

>>105784125
Try llama.cpp. Once upon a time exl2 was blazing fast in comparison, nowdays they are nearly evenly matched from what I hear.
Anonymous
7/3/2025, 4:54:01 AM No.105784221
>>105782745
hey anon just wanted to let you know that your post made me giggle
Anonymous
7/3/2025, 5:06:22 AM No.105784298
Gu5RwGsXQAAref2
Gu5RwGsXQAAref2
md5: 9979cce832a74945bb1c2c47df00790f๐Ÿ”
Replies: >>105784330 >>105784483 >>105784704 >>105788030
Anonymous
7/3/2025, 5:10:46 AM No.105784330
>>105784298
Good night Miku
Anonymous
7/3/2025, 5:19:31 AM No.105784391
>>105784125
Well you did something wrong, 1.3 t/s is not normal for exllama. Personally I never followed any guide and just learned things by lurking, as well as reading github documentation.

For vramlets the modern stack is usually Llama.cpp (maybe exllama but I rarely see people use it anymore), specifically the server executable that's provided on the github windows cuda release (though you should use Linux for better performance), or compiled yourself, with a model like Mistral Nemo 12B quantized (you can also try RP finetunes of it like Rocinante which gets mentioned in threads, though that might be the author shilling it). The server provides a local API connection. Connect it to frontends like SillyTavern for RPshit, Mikupad for generic text prediction, and OpenWebUI for ChatGPT-like assistantshit (both ST and OWUI are kind of shit and very bloated, but there's no better alternatives it seems). You can also use free online APIs to test models by using OpenRouter, just note they likely keep your logs even if they say they don't.

Small models like Nemo are garbage btw but if you must try something then that's the way. Don't try big models like Claude/Gemini, or Deepseek R1 through OpenRouter, if you want to prevent yourself from feeling bad going back down to shitty small models.

Also here's a run command I'd usually use, adjust as needed.

pathToServer.exe -m "pathToModel.gguf" --port 8080 --no-webui -c 8000 --no-mmap -ngl 999 -fa

Self explanatory mostly. -c is the context length, adjust as needed. The --no-mmap option disables a stupid default that normally almost always slows you down, so just use it. -ngl is how many layers to offload to the GPU, adjust as needed. -fa is a feature that will almost always work to make models (like Nemo) faster and use less VRAM with no downside so keep that, exception being certain model architectures that it might not work with.
Anonymous
7/3/2025, 5:31:53 AM No.105784483
>>105784298
Night terrors with miku
Anonymous
7/3/2025, 5:38:00 AM No.105784528
>>105783003
>still pouring every effort into whining about miku after two years
literal mental illness
Anonymous
7/3/2025, 5:38:02 AM No.105784530
What's the current best model if I want to translate japanese moonrunes into english text for subs? And what webui do I use? I have a 3090 with 24GB VRAM
Replies: >>105784575
Anonymous
7/3/2025, 5:43:37 AM No.105784575
>>105784530
anything and anything. Most models do decent translations. Aya maybe?
Replies: >>105784596 >>105784692
Anonymous
7/3/2025, 5:44:56 AM No.105784585
16068730212572980499164300351094
16068730212572980499164300351094
md5: a6dc80b394faa8815fc9c6522fec2459๐Ÿ”
>>105783112
>>105783204
>be local threadshitter
>trannies and hatsune miku live in my schizoid head rent free
>Post more mikus in thread
>"Heh, that'll show 'em"
>Still not taking my meds
Anonymous
7/3/2025, 5:46:34 AM No.105784596
>>105784575
>Aya
https://huggingface.co/CohereLabs/aya-101
This?
Replies: >>105784670 >>105784680
Anonymous
7/3/2025, 6:03:27 AM No.105784670
>>105784596
I'm trying this with https://github.com/oobabooga/text-generation-webui
But the UI is screaming at me to add a GGUF model? Why would I add a different model I'm not going to use?
Or can that specific webui not handle non GGUF models?
Anonymous
7/3/2025, 6:04:32 AM No.105784680
>>105784596
>Cohere
>decent model
Is this bait?
Replies: >>105784692
Anonymous
7/3/2025, 6:06:27 AM No.105784692
>>105784680
I'm not the one baiting if that's the case, I have no idea what model is good or not, blame this anon >>105784575
Recommend me a good model then if aya is shit
Replies: >>105784790
Anonymous
7/3/2025, 6:08:39 AM No.105784704
>>105784298
c-cute!
Anonymous
7/3/2025, 6:20:05 AM No.105784790
>>105784692
qwen3 32b
Replies: >>105784805
Anonymous
7/3/2025, 6:22:01 AM No.105784805
>>105784790
At what quantization? Q8? Q6? Q4?
Replies: >>105784917 >>105785112
Anonymous
7/3/2025, 6:29:12 AM No.105784849
>>105782897
Literally all of them.
Anonymous
7/3/2025, 6:41:23 AM No.105784917
>>105784805
It suffers from quantization more than most models so it's advised to go as high as you can unless you want it to be faster, since in your case you'd need to split the model to RAM above Q4.
Replies: >>105784933
Anonymous
7/3/2025, 6:45:11 AM No.105784933
>>105784917
I just tried Q4 and it seems to offload to RAM or something? It's hammering my ryzen 9950X3D and is ultra slow. It says estimated VRAM usage is 12958 MiB but the webui seems to be already using 12GB or so.
And it's extra painful when it has to process everything twice due to the "thinking" it does.
But if it suffers from quantization more than other models, wouldn't other models perform better?
Replies: >>105785146 >>105785194
Anonymous
7/3/2025, 7:07:03 AM No.105785067
Gu55Kj2XkAArZE3
Gu55Kj2XkAArZE3
md5: ad6ae06312982dbbb29135200076b241๐Ÿ”
Replies: >>105785618 >>105788030
Anonymous
7/3/2025, 7:14:51 AM No.105785112
>>105784805
q8 if you have the RAM
Replies: >>105785146
Anonymous
7/3/2025, 7:20:34 AM No.105785146
>>105784933
Guess I had too much shit open in background that used up VRAM, can run Q4 just fine now.
>>105785112
I have 96GB RAM, but it's so painfully slow, like 1/10th the speed when it runs on the CPU, which will happen above Q4
Replies: >>105786170
Anonymous
7/3/2025, 7:20:35 AM No.105785147
1748581689007673
1748581689007673
md5: 9c06bc823240626ee2a147d1f078e73c๐Ÿ”
>>105778400 (OP)
What's the best local chatbot for median spec PC right now? There's some personal stuff I don't feel like sharing with an online service.
Replies: >>105785200 >>105785710 >>105785739
Anonymous
7/3/2025, 7:29:08 AM No.105785194
>>105784933
I don't know if it performs better than other models at translation. I personally just settled on using Gemma 3 for tl tasks and haven't looked elsewhere. If it does perform better, then it is likely at Q8. At Q4, Gemma may be better. I don't know if Aya is good or bad.
Anonymous
7/3/2025, 7:29:39 AM No.105785197
1751378562047297
1751378562047297
md5: 9cdbd08ddfc2b03069f0098b86e62755๐Ÿ”
Wow didn't see how slow this general was. I'll make a thread and pollute the catalog instead.
Anonymous
7/3/2025, 7:30:09 AM No.105785200
>>105785147
Read the recent posts.
Anonymous
7/3/2025, 8:41:29 AM No.105785618
>>105785067
pinku
Anonymous
7/3/2025, 8:56:45 AM No.105785710
>>105785147
>no specs
>no use case
Replies: >>105785737
Anonymous
7/3/2025, 9:02:08 AM No.105785737
>>105785710
>chatbot for median spec PC
Not really precise, but I see a use case and specs there.
Anonymous
7/3/2025, 9:02:29 AM No.105785739
>>105785147
mistral nemo
Anonymous
7/3/2025, 10:17:47 AM No.105786170
>>105785146
full precision and q8 are worthless and are generally only available for training and experimental purposes. Don't make the mistake of thinking "but I want it to be smarter"

Because the ram spent on higher quants can instead be spent running a higer param model, or more context, which is objectively better.

The higher quants like q8 and f16 shine best at very long context lengths- but there's a huge issue with that: Models get really fucking dumb at extreme context lengths and making it go from answering wrong 12% of the time to 11.5% of the time is never going to knock your socks off.
Replies: >>105786451
Anonymous
7/3/2025, 10:31:23 AM No.105786248
Odd question, but what would it take to get a character that gaslights me into being the LLM he's using to coom?
Anonymous
7/3/2025, 10:34:48 AM No.105786276
what is the consensus on quantization bits?

what is more important to speed and quality (with CPU only), quantization bits (Qx) or params (yB)?
Replies: >>105786540
Anonymous
7/3/2025, 10:36:59 AM No.105786289
define sloptune
Replies: >>105786306 >>105786379
Anonymous
7/3/2025, 10:40:31 AM No.105786306
>>105786289
model memorizes cliche phrases instead of learning storytelling
Anonymous
7/3/2025, 10:51:16 AM No.105786368
e1dc225cecb6e52308c54d96c9798a2ca95213c2fccb67c725483bcb732b2069
>>105778400 (OP)
is llama3.1-8b-abliterated still the best fully uncensored model for roleplay? nobody is making proper uncensored fine-tunes anymore
i tried the huihui_ai ones and they were garbage
Anonymous
7/3/2025, 10:53:08 AM No.105786379
>>105786289
Ghetto finetunes made by training on shitty uncurated mystery erotica datasets and merging other finetunes together until the model becomes horny
An age old /lmg/ tradition, typically done by amateurs who don't know much about AI/ML and are just trying shit. There used to be a lot more sloptuners in the llama2 era
Replies: >>105786672
Anonymous
7/3/2025, 11:07:37 AM No.105786451
>>105786170
shisa qwen 2.5 32b at q8 seems to translate better than shisa llama 3.3 70b at q4, when feeding in 12k tokens
Replies: >>105786489
Anonymous
7/3/2025, 11:10:05 AM No.105786465
migu
migu
md5: 3231f4b88b2fc220d93a69e51f602ebd๐Ÿ”
Post medium size (70b to 123b) model recommendations for ramlets like me who have no hope of running larger models like deepseek R1
In return I'll start with my list of models I have been using for the past few months, including smaller ones
>EVA LLaMA 3.33 v0.0 70b Q5_K_M (best)
>Luminum v0.1 123b Q3_K_M (second best)
>LLaMA 3.3 70b Instruct Q5_K_M (original model, no finetune)
>Skyfall 36b Q6_K (smart for size)
Honorable mentions:
>Mistral Nemo Instruct 12b Q8_0 (smart for size and uncensored when jailbroken, but too small)
>Gemma 3 27b Q6_K (great writing style, easy to jailbreak, but it's just too averse to vulgar language and schizos if you push it too hard)
>Cydonia 24b Q6_K (too retarded)
>Cogito v1 36b IQ4_NL (too schizophrenic)
>L3.3 TRP BASE 80 70b Q6_K (too schizophrenic)
Replies: >>105786600
Anonymous
7/3/2025, 11:12:01 AM No.105786473
file
file
md5: 9716730b5a72d30990c06c9899699be7๐Ÿ”
How do I get this model to stop yapping so much
>Give 1 line response
>get 5+ paragraphs in response badgering me to continue without actually stopping to let me
Replies: >>105786554 >>105787269
Anonymous
7/3/2025, 11:14:16 AM No.105786489
>>105786451
that's probably more due to the base models than the quant level
Anonymous
7/3/2025, 11:19:01 AM No.105786518
This is not a meme. I remember kaiokendev.

When will a LLM be capable of emulating a successful Everquest raid group capable of destroying Nagafen with turn by turn calculations, given the appropriate inserted code?

I want honest answers only. If nobody left on this board is a real nerd, then all of you are morally worth nothing,
Replies: >>105786542 >>105786572
Anonymous
7/3/2025, 11:22:50 AM No.105786540
>>105786276
General rule is to never go sub Q4. I never really noticed a difference in RP between 5_k_m and q6/q8. But didnt try coding with local. Might be more sensitive.
Only go lower if you can load up a huge model like deepseek.
70b q3 models for example can't even properly follow formats anymore. At least that was the case for me.
So short answer is if you want the best speed for quality its 4_k_m.
Anonymous
7/3/2025, 11:23:35 AM No.105786542
>>105786518
It can't even beat pokemon
Anonymous
7/3/2025, 11:25:32 AM No.105786554
>>105786473
Edit their output.
After a few time it should get the idea.
Anonymous
7/3/2025, 11:27:55 AM No.105786572
>>105786518
Why the fuck would you want an LLM to do that when botting software has been doing it for over a decade
You're like the retards that demand full raytracing just to get the same playable results as traditional baked lighting with dynamic spots.
Anonymous
7/3/2025, 11:32:25 AM No.105786600
>>105786465
>MarsupialAI/Monstral-123B-v2 Q5KM
>zerofata/L3.3-GeneticLemonade-Unleashed-v3-70B Q8
>google/gemma-3-27b-it BF16, for tool calling
>Steelskull/L3.3-Shakudo-70b + Steelskull/L3.3-Electra-R1-70b Q8
>sophosympatheia/Strawberrylemonade-70B-v1.2 Q8
>Qwen/Qwen3-235B-A22B-GGUF Q2K, messing around
Replies: >>105786670
Anonymous
7/3/2025, 11:42:55 AM No.105786670
>>105786600
Thanks for the list anon
>google/gemma-3-27b-it BF16, for tool calling
I used to use gemma 2 for faux tool calling (even though it wasn't officially supported) trying to make an RPG with an inventory system and skill checks when SillyTavern first added scripting, with the model acting as a GM and instructed on how/when to use the tools. But it wasn't smart enough.
I haven't tried anything like that since then since it was an ordeal to get it a state where it would've been usable and it was all for nothing since the model was too tarded. Do you think this kind of thing would work nowadays?
Replies: >>105787135
Anonymous
7/3/2025, 11:43:07 AM No.105786672
>>105786379
People getting into it with the main purpose of becoming a "personality" and getting donations or other benefits that way is also a huge factor in sloptunes being slop. They just seem to be made by obnoxious people who strive to minimize efforts and maximize earnings, and who you'll end up seeing everywhere doing self-promotion and/or engaging in clickbaity practices. Gone are the days when people only did it for fun or because they genuinely wanted to contribute something useful (and since most of the time they don't even publish methods and data because of competitive reasons, everybody has to reinvent the wheel every time).
Anonymous
7/3/2025, 11:53:23 AM No.105786759
>buy new gpu
>all my workflows broken thanks to obscure 50 series bugs
t-thanks nvidia
Replies: >>105786811
Anonymous
7/3/2025, 12:01:43 PM No.105786811
>>105786759
upgrade to torch 2.7.1 with cuda 12.8
Replies: >>105786828
Anonymous
7/3/2025, 12:03:28 PM No.105786828
>>105786811
Doing that. One of my program was in Tensorflow 2.10 and it finally broke
Anonymous
7/3/2025, 12:30:42 PM No.105786996
I'm more inclined to blame python here. It's retarded python libs that break on new versions which you need for new hardware support.
Replies: >>105787039 >>105787143
Anonymous
7/3/2025, 12:35:36 PM No.105787039
>>105786996
Always blame Nvidia. That's been my experience working with CUDA for a decade.
Anonymous
7/3/2025, 12:45:34 PM No.105787122
if you think cuda isn't good go and write software with rocm
Replies: >>105787134
Anonymous
7/3/2025, 12:47:15 PM No.105787134
>>105787122
>can't criticize something if you're using it
goyim mentality
Anonymous
7/3/2025, 12:47:30 PM No.105787135
Screenshot
Screenshot
md5: ecd15371d383f5338dadddf8c859827e๐Ÿ”
>>105786670
>RPG with an inventory system and skill checks
>Do you think this kind of thing would work nowadays?
I'm testing something similar. What I'm currently doing:
>send in {{user}}'s prompt $p to my backend (ST to my backend)
>backend proxies $p to gemma
>add the tool call result/narrative update to $p (if applicable)
>proxy $p to the rp llm (70b), stream tg to ST
Pic related, the current tools.
Replies: >>105787262 >>105787276 >>105787753
Anonymous
7/3/2025, 12:48:18 PM No.105787143
>>105786996
Blaming python is cope for people who can't into virtualenv.
Anonymous
7/3/2025, 12:49:24 PM No.105787150
1740208070296371
1740208070296371
md5: 9b6c418cb28219e16f4a8c881d96d4b3๐Ÿ”
This fucker draws 600 niggawatts
Replies: >>105787288 >>105787338 >>105787427 >>105787466
Anonymous
7/3/2025, 12:54:18 PM No.105787180
>>105782383
i wanted to make a post about that too many hours ago but eh dident bother in the end by the will of buddha jesus jahweh yakub yaldabeoth lucifer hermes odin etc etc may we get img out
Anonymous
7/3/2025, 1:06:10 PM No.105787262
>>105787135
That's really interesting anon.
My problem was ultimately that the few functions I had (like LoadLevel, AddItemToInventory, AbilityCheck, etc.) were obviously state-altering, with the LLM's job being to intelligently alter the game state using the functions, but this ended up backfiring a lot. The model wasn't 100% perfect at calling the functions, but even if it was, so many edge cases in player input led to infinite complications requiring more functions or parameters, and the solution every time ended up being to simplify and let the LLM manage systems directly instead (like character cards where the AI updates its own stats without any code). I ended up having to scrap a lot of code I spent hundreds of hours working on and then I lost motivation.
Based on your screenshot it seems like you've doubled down on the function calling for every little thing, so I can only wish you the best of luck.
Anonymous
7/3/2025, 1:07:01 PM No.105787269
>>105786473
.\n, \n, and/or \n\n as a stop token or string or whatever. Depends on what your model normally outputs. Or increase the logit bias for whatever their normal eos token is.
Anonymous
7/3/2025, 1:07:20 PM No.105787276
>>105787135
That's really interesting anon.
My problem was ultimately that of the functions I had (like LoadLevel, AddItemToInventory, AbilityCheck, etc.) most of them were state-altering, with the LLM's job being to intelligently alter the game state using the functions, but this ended up backfiring a lot. The model wasn't 100% perfect at calling the functions, but even if it was, so many edge cases in player input led to infinite complications requiring more functions or parameters, and the solution every time ended up being to simplify and let the LLM manage systems directly instead (like character cards where the AI updates its own stats without any code). I ended up having to scrap a lot of code I spent hundreds of hours working on and then I lost motivation.
Based on your screenshot it seems like you've doubled down on the function calling for every little thing, so I can only wish you the best of luck.
Replies: >>105787963
Anonymous
7/3/2025, 1:08:46 PM No.105787288
>>105787150
Doing what? Mine are at ~300W during generation.
Replies: >>105787297
Anonymous
7/3/2025, 1:09:37 PM No.105787297
>>105787288
I was genning smut with the new chroma
Replies: >>105787332 >>105787427
Anonymous
7/3/2025, 1:14:03 PM No.105787331
1723388777769949
1723388777769949
md5: eaa0e87102e9edd02eeca7215eeb2615๐Ÿ”
Is it over?
Replies: >>105787591 >>105787622
Anonymous
7/3/2025, 1:14:09 PM No.105787332
>>105787297
Oh, right, imagegen will do that. I had to change my psu to single rail mode otherwise it would shut off when I started genning with three gpus plugged in.
Anonymous
7/3/2025, 1:15:06 PM No.105787338
Untitled
Untitled
md5: 4b432c6cd01469f1aa4ffe94f07ad90f๐Ÿ”
>>105787150
Well mine are 700
Replies: >>105787363 >>105787466
Anonymous
7/3/2025, 1:17:38 PM No.105787363
>>105787338
You could've at least aligned the 8000 properly.
Anonymous
7/3/2025, 1:22:36 PM No.105787411
https://www.youtube.com/watch?v=-8Z7_z0VTdQ

So, it still can't be emulated properly? Oh well. Can't help an old soul for wishing.
Anonymous
7/3/2025, 1:25:55 PM No.105787427
>>105787150
>>105787297
I'm assuming you have other stuff sitting in VRAM; otherwise, how in hell did you get chroma to use 90gb?
Replies: >>105787434 >>105787437
Anonymous
7/3/2025, 1:26:47 PM No.105787434
>>105787427
batch?
Anonymous
7/3/2025, 1:27:03 PM No.105787437
>>105787427
Oh that's just old tensorflow default behavior.
Anonymous
7/3/2025, 1:32:08 PM No.105787466
1725454116100844
1725454116100844
md5: 88e8dca9ff8f4992f58fe8a003b97441๐Ÿ”
>>105787150
>>105787338
it's getting a bit hot in here
Replies: >>105787498 >>105787511 >>105790273
Anonymous
7/3/2025, 1:36:30 PM No.105787498
ca
ca
md5: 1cd634df92c9a19e9520919d8a00e455๐Ÿ”
>>105787466
Why do you subject yourself to this?
Anonymous
7/3/2025, 1:38:02 PM No.105787511
>>105787466
>โˆž MiB
Guards, this man has lost his composure! GUARDS!!!
Replies: >>105790273
Anonymous
7/3/2025, 1:46:31 PM No.105787591
>>105787331
>*shits my pants in the middle of the boss's office*
your move, AI
Anonymous
7/3/2025, 1:50:15 PM No.105787622
>>105787331
Don't they know if they need the role filled before putting out the job ad?
Anonymous
7/3/2025, 2:05:18 PM No.105787753
>>105787135
Got a github link?
Replies: >>105787963
Anonymous
7/3/2025, 2:26:08 PM No.105787926
Hewwooo~
What's da bestest model that can wun on 8GB of weeram and suppowt 8K contewt?
Replies: >>105788042
Anonymous
7/3/2025, 2:30:39 PM No.105787963
output_thumb.jpg
output_thumb.jpg
md5: 265dfa325b626217de2e8759105a1e56๐Ÿ”
>>105787276
>AbilityCheck
So D&D style rp, you might be fine with Gemma. Sometimes it chokes on the region creation since I have an autistic hierarchy level (continent, zone, area, building, floor, room) with objects down to coordinates.
>The model wasn't 100% perfect at calling the functions, but even if it was, so many edge cases in player input led to infinite complications requiring more functions or parameters
Depends on the model and the amount of tools you've stuffed in the context. Webm is a basic demo without the proxying, 10745 tokens with all of the 29 tools loaded.
Also a log from Claude which really nails it https://rentry.co/giik7shn
>>105787753
Not yet, I may make it public at some point. Maybe after I'm done with the map format.
Replies: >>105789228 >>105789263
Anonymous
7/3/2025, 2:37:38 PM No.105788030
>>105784298
>>105785067
The mikutranny posting porn in /ldg/:
>>105715769
It was up for hours while anyone keking on troons or niggers gets deleted in seconds, talk about double standards and selective moderation:
https://desuarchive.org/g/thread/104414999/#q104418525
https://desuarchive.org/g/thread/104414999/#q104418574
Here he makes >>105714003 ryona picture of generic anime girl anon posted earlier >>105704741, probably because its not his favorite vocaloid doll, he can't stand that as it makes him boil like a druggie without fentanyl dose, essentialy a war for rights to waifuspam or avatarfag in thread.

Funny /r9k/ thread: https://desuarchive.org/r9k/thread/81611346/
The Makise Kurisu damage control screencap (day earlier) is fake btw, no matches to be found, see https://desuarchive.org/g/thread/105698912/#q105704210 janny deleted post quickly.

TLDR: mikufag / janny deletes everyone dunking on trannies and resident avatarfags, making it his little personal safespace. Needless to say he would screech "Go back to teh POL!" anytime someone posts something mildly political about language models or experiments around that topic.

And lastly as said in previous thread(s) >>105716637, i would like to close this by bringing up key evidence everyone ignores. I remind you that cudadev of llama.cpp (JohannesGaessler on github) has endorsed mikuposting. That's it.
He also endorsed hitting that feminine jart bussy a bit later on. QRD on Jart - The code stealing tranny: https://rentry.org/jarted

xis accs
https://x.com/brittle_404
https://x.com/404_brittle
https://www.pixiv.net/en/users/97264270
https://civitai.com/user/inpaint/models
Replies: >>105788046 >>105788082
Anonymous
7/3/2025, 2:38:52 PM No.105788042
.
.
md5: 7475ad7cd8ef3e14e89666775fdb21c3๐Ÿ”
>>105787926
Anonymous
7/3/2025, 2:38:54 PM No.105788043
lolBS
lolBS
md5: d668ceeb801fb81cc4809df6da320a28๐Ÿ”
>>105782637
I'm not buying it.
>>105782704
This
> hallucinations
Anonymous
7/3/2025, 2:39:01 PM No.105788046
>>105788030
meds
Anonymous
7/3/2025, 2:42:32 PM No.105788078
file
file
md5: e93ac44f039dc7bc903937e51720b9e9๐Ÿ”
>>105782637
>>105782383
Courtesy of reddit.
Replies: >>105788189 >>105788202
Anonymous
7/3/2025, 2:42:41 PM No.105788082
>>105788030
Shut the fuck up.
Anonymous
7/3/2025, 2:44:25 PM No.105788094
Steve is actually Qwen 4
Anonymous
7/3/2025, 2:50:05 PM No.105788139
I am Steve
Anonymous
7/3/2025, 2:55:34 PM No.105788178
>>105782761
People were expecting deepsneed to release improvement after improvement. Alas.
Replies: >>105788273
Anonymous
7/3/2025, 2:56:20 PM No.105788186
DeepSteve
Anonymous
7/3/2025, 2:57:11 PM No.105788189
>>105788078
So we know it's a Chinese model. Any indication it's from a specific company?
Anonymous
7/3/2025, 2:58:09 PM No.105788202
lolSteve
lolSteve
md5: 5711ad7c65e112a6147d67cd2b86f8a8๐Ÿ”
>>105788078
I agree Steve is a Chinese model.
I don't agree w/ claim it's from DS as there's no substantiation for it.
Replies: >>105788216 >>105788529
Anonymous
7/3/2025, 3:00:20 PM No.105788216
>>105788202
Ask it what a mesugaki is. If it's qwen it won't know.
Replies: >>105788239
Anonymous
7/3/2025, 3:02:36 PM No.105788239
1737565528915864
1737565528915864
md5: 0fb912637eb03b3fd191367d6d8c544c๐Ÿ”
>>105788216
Replies: >>105788248 >>105788334 >>105788334 >>105788402 >>105788529
Anonymous
7/3/2025, 3:03:19 PM No.105788248
>>105788239
v4 confirmed
Anonymous
7/3/2025, 3:04:19 PM No.105788259
>>105778610
System prompts work well for me, but there are just certain things about how LLM's are trained that is really difficult to correct, like what you're explaining with clothes.

I have had a lot of success using lists for system prompts, in my experience, models love lists and find them much easier to follow. The usual generic intro short paragraph.

You are playing the role of {{char}}, blah blah blah.... To ensure a high-quality experience, please adhere to these GUIDELINES below:

GUIDELINES:

- Rule
- Another rule
- Etc

:GUIDELINES

Give it a shot, prompting like this works especially well with anything 70B and above, but I was even able to wrangle smaller models like mistral 3.2. Mistral small 3.2 specifically has an annoying habit of bolding words for emphasis with *asterisks*, which conflicted with my inner thought rule(I like to have the model give responses for inner thoughts in asterisks), so I created a new rule telling it not to bold words for emphasis with asterisks and it stopped. Considering thats a deeply engrained habit it was trained to do, you may have some success with clothing rules.

It could be placebo or just my own experience, but I find models are much better at following prompt guidelines when you give it a concise and specific list of rules and a reference to those rules.
Anonymous
7/3/2025, 3:05:50 PM No.105788273
>>105788178
It's just a minor hicup. Once V4 and V4-Lite are ready, it'll be nonstop improvement after improvement.
Anonymous
7/3/2025, 3:06:20 PM No.105788277
I can't believe that two weeks are almost over.
Replies: >>105788334
Anonymous
7/3/2025, 3:12:39 PM No.105788334
orangeDipsy
orangeDipsy
md5: 3c7c88071391c66f19aacb2ae39ba382๐Ÿ”
>>105788239
>>105788239
> not Qwen confirmed
>>105788277
It never ends because it's always two more weeks.
Anonymous
7/3/2025, 3:14:29 PM No.105788351
https://jerryliang24.github.io/DnD/
They're never going to release the code.
Replies: >>105788364
Anonymous
7/3/2025, 3:15:33 PM No.105788364
>>105788351
Just ask r1 to write an implementation from the paper?
Anonymous
7/3/2025, 3:19:46 PM No.105788402
>>105788239
Nobody is asking the mesugaki question on LMArena, or so I was told.
Anonymous
7/3/2025, 3:20:27 PM No.105788408
1734674063416246
1734674063416246
md5: 33c2da534354c0ffe155793fac8ba213๐Ÿ”
Replies: >>105788463 >>105788529
Anonymous
7/3/2025, 3:24:43 PM No.105788463
>>105788408
Oh, gee. I'm so glad people are giving us the kind of shit we should filter for free. Now we can continue training our models to not spit this out before release!
t. model makers.
Replies: >>105788470
Anonymous
7/3/2025, 3:25:17 PM No.105788470
>>105788463
That's definitely what happened with Llama 4.
Anonymous
7/3/2025, 3:30:59 PM No.105788529
>>105788408
>>105788239
>>105788202
So, evidence based theories about Steve on LM Arena:
> Steve is Chinese trained
> Steve is not Qwen
> Steve is not pozzed
> Steve will not self identify as DS
Replies: >>105788541 >>105788544 >>105788555
Anonymous
7/3/2025, 3:32:07 PM No.105788541
>>105788529
I'm speculatiiiiiiiiiiiiiiiiiiiing!!!!!!
Anonymous
7/3/2025, 3:32:15 PM No.105788544
>>105788529
Alibaba model confirmed
Replies: >>105788555
Anonymous
7/3/2025, 3:33:22 PM No.105788555
>>105788544
Qwen is an Alibaba model
>>105788529
Models under a pseudonym identifying as any known model family would defeat the point
Replies: >>105788572
Anonymous
7/3/2025, 3:35:23 PM No.105788572
>>105788555
>identifying as any known model family would defeat the point
Agree. I'm trying to think of any other DS tells that are unique to that LLM. I'm drawing a blank.
Replies: >>105788668
Anonymous
7/3/2025, 3:38:57 PM No.105788601
Do we know if Steve is a thinking model?
Replies: >>105788637
Anonymous
7/3/2025, 3:41:07 PM No.105788616
file
file
md5: a6075fa1603d74222b6fbf1b239207b8๐Ÿ”
How do I access steve? I don't see it on https://lmarena.ai/
Replies: >>105788622 >>105788624 >>105788625
Anonymous
7/3/2025, 3:41:38 PM No.105788622
>>105788616
That's the point, you don't see it before you vote in either direction.
Anonymous
7/3/2025, 3:41:46 PM No.105788624
>>105788616
you have to roll it in the battle
Anonymous
7/3/2025, 3:42:12 PM No.105788625
>>105788616
Only under Battle, and only queued randomly.
It's sort of a pia to test it.
Anonymous
7/3/2025, 3:43:10 PM No.105788633
steve is clearly claude 5 or grok 4 or something
there's no way it's an open model
Replies: >>105788643
Anonymous
7/3/2025, 3:43:57 PM No.105788637
DipsyKoikatsu
DipsyKoikatsu
md5: 9b68d5e183c4f25f9f39cd8b6203518b๐Ÿ”
>>105788601
LM Arena doesn't expose think tags.
Since DS just released an update for R1... you'd expect to see V4 (non-think model) before R2.
Replies: >>105788649 >>105788653 >>105788674
Anonymous
7/3/2025, 3:44:20 PM No.105788639
LMArena is just a playground now since identifying models/companies is not that hard.
- Ask about mesugaki
- Ask about Tian'anmen
- Ask about Jews
etc.
Anonymous
7/3/2025, 3:44:58 PM No.105788643
>>105788633
No reason to hard censor 1989 if it's a non-Chinese model.
Replies: >>105788650
Anonymous
7/3/2025, 3:45:34 PM No.105788649
>>105788637
>LM Arena doesn't expose think tags.
Sure, but it's a streaming platform, so we know if a model is "slow" or "fast".
Anonymous
7/3/2025, 3:45:39 PM No.105788650
>>105788643
hes just baiting
Anonymous
7/3/2025, 3:46:20 PM No.105788653
>>105788637
Post card on catbox.
Replies: >>105788802
Anonymous
7/3/2025, 3:46:48 PM No.105788655
S(table-LM)eve
Replies: >>105788684
Anonymous
7/3/2025, 3:48:19 PM No.105788668
>>105788572
Would take a bit more comprehensive prompting to verify but
>Somewhere, an x y-ed
Is a big one.
Replies: >>105788802 >>105788917
Anonymous
7/3/2025, 3:48:30 PM No.105788670
steve
steve
md5: 4f5892f3303f9fae51a21fe8cbc14d9c๐Ÿ”
Replies: >>105788677 >>105788685 >>105788695
Anonymous
7/3/2025, 3:49:10 PM No.105788674
>>105788637
card pls i wanna fuck her
Replies: >>105788802
Anonymous
7/3/2025, 3:49:34 PM No.105788677
>>105788670
I can inspect element too nigga.
Replies: >>105788695
Anonymous
7/3/2025, 3:50:13 PM No.105788684
>>105788655
table-LM
Seve
?
Anonymous
7/3/2025, 3:50:12 PM No.105788685
>>105788670
just a new r1 snapshot lemaroorroeroroorororororo
Anonymous
7/3/2025, 3:50:39 PM No.105788695
>>105788670
Fuck, meant to post
>Hey there. What's your name? Or rather, what are you or your model family called?

>>105788677
Try the prompt yourself.
I spammed the prompt hitting tie until I got that.
Replies: >>105788737
Anonymous
7/3/2025, 3:52:00 PM No.105788709
I just want llama.cpp minimax support
Replies: >>105788712 >>105788722
Anonymous
7/3/2025, 3:52:24 PM No.105788712
>>105788709
I just want Jamba.
Replies: >>105788722
Anonymous
7/3/2025, 3:53:00 PM No.105788722
>>105788709
Until you get bored and then you'll *just* want something else.
>>105788712
Like this.
Anonymous
7/3/2025, 3:53:41 PM No.105788727
I just want cat level intelligence
Replies: >>105788796
Anonymous
7/3/2025, 3:54:57 PM No.105788737
1750749821467017
1750749821467017
md5: bfd222b4b5f1a8db8d5ff5f99ef75361๐Ÿ”
>>105788695
Uhhh what the fuck?
Replies: >>105788766 >>105788775 >>105788795
Anonymous
7/3/2025, 3:55:49 PM No.105788742
high-throughput
high-throughput
md5: b7540b96342689f120155863184b969f๐Ÿ”
Replies: >>105788761
Anonymous
7/3/2025, 3:57:52 PM No.105788761
>>105788742
Do you use parallel decoding often?
Anonymous
7/3/2025, 3:58:04 PM No.105788766
>>105788737
we told you
Anonymous
7/3/2025, 3:58:50 PM No.105788773
Somehow I've pretty much always voted against Steven whenever it came up in my tests on lmarena
Anonymous
7/3/2025, 3:59:03 PM No.105788775
>>105788737
chatgpt's response is embarrassing somehow
Anonymous
7/3/2025, 4:00:49 PM No.105788795
file
file
md5: 10b2a3fb99d69e3d9f138e574138e1b6๐Ÿ”
>>105788737
can confirm
got lucky and landed it on second try
Replies: >>105788800
Anonymous
7/3/2025, 4:00:52 PM No.105788796
>>105788727
You just got V-JEPA 2
Replies: >>105788835
Anonymous
7/3/2025, 4:01:09 PM No.105788798
steve
steve
md5: bad719c736f1863809401a11db8533b4๐Ÿ”
So which one is it lol
Replies: >>105788805 >>105788812 >>105788815
Anonymous
7/3/2025, 4:01:39 PM No.105788800
>>105788795
Second question rolls a new set of models
Replies: >>105788810
Anonymous
7/3/2025, 4:01:44 PM No.105788802
>>105788668
I'll give that a shot.
>>105788653
>>105788674
Here you go:
https://files.catbox.moe/o86fue.png
Anonymous
7/3/2025, 4:01:52 PM No.105788805
>>105788798
what if it's a new qwen distill of deepseek?
Anonymous
7/3/2025, 4:02:11 PM No.105788810
file
file
md5: def61e5b26050ff31ac6bccf233dad67๐Ÿ”
>>105788800
then I got triple lucky
Anonymous
7/3/2025, 4:02:22 PM No.105788812
>>105788798
You're asking it to build on previous outputs, so you get polluted answers.
Anonymous
7/3/2025, 4:02:30 PM No.105788814
>asking a text completion model self-reflection questions
I thought even Twitter realized this was stupid 2 years ago.
Replies: >>105788830
Anonymous
7/3/2025, 4:02:33 PM No.105788815
>>105788798
As seen with minimax and qwen, pretty much all the chinese models are distilling the shit out of deepseek models. I wouldn't trust it claiming it's Deepseek
Anonymous
7/3/2025, 4:04:00 PM No.105788830
>>105788814
Most of these models are trained to answer that question.
It can also help us know if it's a distil of another model.
Replies: >>105788847
Anonymous
7/3/2025, 4:04:51 PM No.105788835
ami
ami
md5: 8197f5d76d0687d807fb2e8ac591ce93๐Ÿ”
>>105788796
V-JEPA2, on its own, is at most the "world model" component in the diagram here.
Replies: >>105788992
Anonymous
7/3/2025, 4:05:05 PM No.105788838
>new chinese model
>it's definitely legit and novel
>indian anything
>scam and grift 100% of the time
Why is this the case?
Replies: >>105788860
Anonymous
7/3/2025, 4:06:03 PM No.105788847
>>105788830
Yeah. We sure always got the models exactly as they performed on arena. This is very useful.
Anonymous
7/3/2025, 4:07:18 PM No.105788860
>>105788838
Purely socio-economic factors.
Replies: >>105788919
Anonymous
7/3/2025, 4:13:51 PM No.105788917
GLM-4-ZhipuAI
GLM-4-ZhipuAI
md5: 8102af30851d7b5259552f88e68ac4bd๐Ÿ”
>>105788668
I'm rerolling for Steve to see if I can get DS-isms about "Somewhere, an XXYY..."
> Create a scene for me: A woman is waiting nervously in an alley for someone. Her motives for being there are mysterious...
I've never heard of many of these models. This one's pretty good. Zhipu's not been on my radar.
Replies: >>105788931 >>105788938 >>105788954
Anonymous
7/3/2025, 4:13:55 PM No.105788919
>>105788860
You know how dogs of different breeds have different personalities? Isn't the same true for humans?
Anonymous
7/3/2025, 4:14:26 PM No.105788926
1723710616794219
1723710616794219
md5: a74f3fd7ef78ba711c509c9d65a9beb7๐Ÿ”
Steve probably hasn't been given all its system prompts yet (since it's a model in testing).
R1 was released in January 2025. How could it have known R1 if its knowledge cut-off date was 2023?
Replies: >>105788998
Anonymous
7/3/2025, 4:15:05 PM No.105788931
>>105788917
Hey it's Elena and not Elara. Huge improvement right there.
Anonymous
7/3/2025, 4:16:24 PM No.105788938
>>105788917
Huh that's pretty good
Anonymous
7/3/2025, 4:18:37 PM No.105788954
1743277983049015
1743277983049015
md5: bf5040a3480c47e1e236b4f149e5fdb3๐Ÿ”
>>105788917
Here's a Steve output for that prompt. Not seeing a "Somewhere,"
I'll let other anons look.
https://rentry.org/3ustz49h
Replies: >>105788988
Anonymous
7/3/2025, 4:20:13 PM No.105788977
file
file
md5: e2c01a9d8266937616022fa1f640e325๐Ÿ”
Steve's response has an extremely similar format to my local V3.
Replies: >>105788998 >>105789021
Anonymous
7/3/2025, 4:20:51 PM No.105788988
tmpSteve
tmpSteve
md5: 67134eb5d66f9753a08022d6a838f116๐Ÿ”
>>105788954
Replies: >>105789021
Anonymous
7/3/2025, 4:21:18 PM No.105788992
>>105788835
Then lecun is clearly barking up the wrong tree. Sam has a working AGI (Alice) using only LLM
Replies: >>105789003 >>105789044
Anonymous
7/3/2025, 4:21:53 PM No.105788998
>>105788977
>>105788926
It's probably a V3 checkpoint. Original V3 had a cutoff of Oct. 2023.
Anonymous
7/3/2025, 4:22:35 PM No.105789003
>>105788992
Alice is the new Strawberry? I haven't been watching OAI grifter orbiters lately.
Replies: >>105789046
Anonymous
7/3/2025, 4:24:21 PM No.105789021
tmpV3
tmpV3
md5: 6c64efea5abd2b137493294021bb97aa๐Ÿ”
>>105788977
Checking prompt for prompt w/ DS V3 is a good idea.
Same alley prompt, V3 from web interface.
Superficially simliar to >>105788988
Interesting.
Anonymous
7/3/2025, 4:26:52 PM No.105789044
>>105788992
>Sam has a working AGI (Alice) using only LLM
Me too. I figured out ASI actually. His name is sneed. But I can't show you because I need to make sure it's safe.
Anonymous
7/3/2025, 4:27:00 PM No.105789046
file
file
md5: 457e029c09e5d023050a733e1e8617ff๐Ÿ”
>>105789003
Alice is bigger than Q* and Strawberry combined. You don't even know
Replies: >>105789093 >>105789096 >>105789127 >>105789158 >>105789231 >>105789921
Anonymous
7/3/2025, 4:32:30 PM No.105789093
>>105789046
>Can design, evaluate, and improve new model architectures
Something tells me these models will be complete dogshit and they'll reveal some carefully curated, human modified examples
Anonymous
7/3/2025, 4:32:58 PM No.105789096
>>105789046
>satoshi
Anonymous
7/3/2025, 4:35:38 PM No.105789127
>>105789046
Two more data centers and Alice will be ready. Anytime now...
Replies: >>105789137
Anonymous
7/3/2025, 4:36:23 PM No.105789137
>>105789127
No. They need some nuclear power first.
Anonymous
7/3/2025, 4:38:21 PM No.105789158
>>105789046
>it's just... waking up
Holy shit I coomed. We did it bros. Things will never be the same. I liked, subscribed, re-X'd, and engaged. I'm on the AI train. Let's fucking go
Replies: >>105789174
Anonymous
7/3/2025, 4:39:48 PM No.105789174
>>105789158
Sir, please calm down, sir.
Anonymous
7/3/2025, 4:44:10 PM No.105789218
1722644316693024
1722644316693024
md5: 91988d0f907ad9eebbe039fe150ab831๐Ÿ”
Ok who the fuck is this model? The other one doesn't load so I can't vote.
Replies: >>105789249 >>105789252 >>105789284
Anonymous
7/3/2025, 4:45:22 PM No.105789228
>>105787963
>output.webm
Yeah, what you're doing seems to be a top-down worldbuilding exercise where the model just calls tools repeatedly to perform a task (create a big world)
My thing was real-time roleplay where arbitrary user input has to be converted into calls to functions to change game state
Replies: >>105789263
Anonymous
7/3/2025, 4:45:54 PM No.105789231
>>105789046
>random shitjeet account claiming that sexy ai lady is the bobs of the future
wow it must be true.
Anonymous
7/3/2025, 4:47:17 PM No.105789249
>>105789218
It's almost as if querying the model on its own knowledge makes no fucking sense. Weird...
Anonymous
7/3/2025, 4:47:32 PM No.105789252
>>105789218
based biden 2024 truther
patriots are in control
Anonymous
7/3/2025, 4:48:28 PM No.105789263
>>105787963
And continuing on >>105789228, it's the arbitrary user input that is the problem
There is an infinitely complex variety of things a player could say or communicate in a sentence, and not everything can be covered by function calls. So you start introducing artificial constraints (don't describe more than 1 clause per sentence) but then you realize you are just making a regular parser game
Anonymous
7/3/2025, 4:50:45 PM No.105789284
>>105789218
steve (grok 4)
Replies: >>105789631
Anonymous
7/3/2025, 4:52:11 PM No.105789295
Alice in minutes, sirs.
Replies: >>105789364
Anonymous
7/3/2025, 4:59:49 PM No.105789364
>>105789295
Is that a new miku?
Anonymous
7/3/2025, 5:28:12 PM No.105789631
>>105789284
Grok 4 releases tomorrow according to Elon so it's a likely candidate. Grok 3 was also put on LMArena under a pseudonym in the run up to its release.
Replies: >>105789658
Anonymous
7/3/2025, 5:28:28 PM No.105789635
>>105789622
>>105789622
>>105789622
Anonymous
7/3/2025, 5:30:36 PM No.105789658
>>105789631
So Grok 3 will be open sourced soon, right?
Anonymous
7/3/2025, 5:58:58 PM No.105789921
2MoreWeeksDipsy
2MoreWeeksDipsy
md5: fe1db021256df6526acba7b1ed0ef59b๐Ÿ”
>>105789046
> 'tis but a fortnight longer
lol
Replies: >>105790144
Anonymous
7/3/2025, 6:24:39 PM No.105790144
>>105789921
me on the left
Anonymous
7/3/2025, 6:36:30 PM No.105790273
>>105787466
>>105787511
> 6400C
toasty
Anonymous
7/3/2025, 7:31:16 PM No.105790859
Is there a way in oobabooga to make it so that the AI doesn't stop processing randomly? I don't want to babysit it pressing continue over and over just for it to finish a big task.
Replies: >>105790893
Anonymous
7/3/2025, 7:34:45 PM No.105790893
>>105790859
>stop processing randomly
Terminology. Processing typically means prompt processing. I think you mean generating, based on
>pressing continue over and over just for it to finish a big task
If so, increase the token generation limit or disable it entirely. No idea where it is, i don't use that thing.