/lmg/ - Local Models General
Anonymous
10/24/2025, 8:43:09 PM
No.106996571
[Report]
►Recent Highlights from the Previous Thread:
>>106986408
--Critique of AMD's AI GPU pricing and performance:
>106988788 >106988883 >106988901 >106988998 >106988932 >106989085 >106989144 >106989167 >106989210 >106989270 >106989289 >106989403 >106989315 >106989781 >106990321 >106988963
--LLM social media simulator development challenges and solutions:
>106988213 >106988320 >106988386 >106988504 >106988557 >106988673 >106988760
--Pruned GLM-4.5-Air translation quality issues in Chinese-English tasks:
>106990071 >106990094 >106990414
--Antislop sampler's limitations in addressing model collapse and stereotypical outputs:
>106986820 >106987031
--REAP performance evaluation beyond coding tasks:
>106989011 >106989576
--Data loss during ComfyUI update caution:
>106990303
--llama.cpp removes mistral-common dependency:
>106992735 >106992770
--LLM coding viability vs. hardware cost challenges:
>106993311 >106993319 >106993427 >106993447 >106993496 >106993730 >106993769 >106994515 >106994551 >106994595 >106994610 >106994612 >106994670 >106994666 >106994701 >106994967 >106995045 >106995064 >106995392 >106993477
--Assessing LLMs' utility as scientific writing assistants:
>106992842 >106992909 >106993250 >106993408 >106992918 >106992989 >106993354
--Optimizing GLM 4.5 Air's creativity through samplers and minimal system prompts:
>106987422 >106987911 >106995295 >106995450 >106995468 >106995558 >106995547
--LLM paraphrasing limitations and solutions for synonym repetition:
>106986884 >106987091 >106987239 >106992323 >106992343
--Inference inefficiencies and challenges in adapting coding models for roleplay:
>106987264 >106987307 >106987507 >106987620 >106994872 >106987696 >106988344 >106988423
--Mistral AI Studio platform launch:
>106995845 >106995893
--Miku (free space):
>106989693 >106992662 >106993105 >106993427 >106994546 >106994884 >106995336
►Recent Highlight Posts from the Previous Thread:
>>106986411
Why?:
>>102478518
Enable Links:
https://rentry.org/lmg-recap-script
Anonymous
10/24/2025, 8:47:12 PM
No.106996604
[Report]
>>106996606
I am euphoric.
Anonymous
10/24/2025, 8:47:40 PM
No.106996606
[Report]
Anonymous
10/24/2025, 8:48:01 PM
No.106996611
[Report]
fuck you i'm leaving
Anonymous
10/24/2025, 8:48:50 PM
No.106996621
[Report]
>>106996588
I understand you're feeling bored! There are many exciting activities you could try, such as reading a book, going for a walk, learning a new skill, or connecting with friends. What are some of your interests?
Anonymous
10/24/2025, 8:49:22 PM
No.106996630
[Report]
>>106996623
I look like this and I do this
Anonymous
10/24/2025, 8:49:35 PM
No.106996633
[Report]
>>106996623
very dumb caat is not for eat
>>106996576
To teach it to not produce spaghetti code, to specialize the model (teach it about the topics I'm specifically interested in), and to iron out the bad habits learned during RL like cheating tests and generating fake ("simulated") data and placeholder code to make it look like it has achieved something when it hasn't.
>>106996468
GLM is particularly bad at this. Old models are dumb but they don't outright lie and make shit up (so often anyways).
Anonymous
10/24/2025, 8:55:29 PM
No.106996683
[Report]
>>106996703
>>106996665
if you think glm 4.5 air hallucinates more than llama 3.3 70b then i have a bridge to sell you
Anonymous
10/24/2025, 8:57:01 PM
No.106996703
[Report]
>>106996722
>>106996683
i have a bulge to sell you
>>106996568 (OP)
was wondering if anyone knows of any prebuilts designed to run local llms?
Like I could plug it in, do some basic configuration and run llms out of the box?
Anonymous
10/24/2025, 8:58:19 PM
No.106996722
[Report]
>>106996703
please to be gentle
Anonymous
10/24/2025, 8:58:56 PM
No.106996728
[Report]
repoastin coz Miku says always try your best
>>106996499
If the model can't perform with a basic min-p or maybe nsigma (tbd)... temp is just skewing the model probs (new = old/t); there is no concept of temperature in training. If you're interested in temperature try dynamic temp and mod your inference stack to log the params at each sample, maybe to a format you can easily make some graphs of. There's too much woowoo with sampling, get data
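To be concrete, this is roughly all that temp does at sample time (toy numpy sketch, not any particular stack's code):
import numpy as np

def sample_with_temp(logits, temp=1.0, rng=np.random.default_rng()):
    # temperature rescales logits before softmax; t<1 sharpens, t>1 flattens
    z = np.asarray(logits, dtype=np.float64) / max(temp, 1e-8)
    z -= z.max()  # numerical stability
    p = np.exp(z) / np.exp(z).sum()
    return rng.choice(len(p), p=p)
Log p at each step like I said and you can graph what your sampler chain is actually doing instead of vibing.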
>>106996592
Have you done something new or interesting with your llms recently? not cooming silly boy!
Anonymous
10/24/2025, 8:59:21 PM
No.106996736
[Report]
>>106996792
Anonymous
10/24/2025, 8:59:33 PM
No.106996737
[Report]
Anonymous
10/24/2025, 9:00:08 PM
No.106996745
[Report]
>>106996792
>>106996714
DGX Spark. Mac Pro. Your Mom.
Anonymous
10/24/2025, 9:03:01 PM
No.106996789
[Report]
>>106996838
How do you deal with the fact every model repeats the same phrases and structures regardless of its context or prompts?
Anonymous
10/24/2025, 9:03:18 PM
No.106996792
[Report]
>>106996926
>>106996736
>>106996745
Talking about some sort of selfhosting solution, something I could just plug in, connect to my network and access remotely.
testing nsigma=1
tw stinky lol
So how does the Arc Pro B50 perform when it comes to running an LLM? I'm still interested in getting one just to have a live (low power!) LLM up whenever I may need one so I don't have to load and unload my 4090 all the time.
Anonymous
10/24/2025, 9:04:47 PM
No.106996813
[Report]
>>106996714
the ones made by egohot
>>106996789
Oh sweet summer child, the path to true creative brilliance lies simply in cranking that temperature slider ALL the way up and whispering “be varied” three times while the GPU fans serenade you—works every time, trust the vibe!
Anonymous
10/24/2025, 9:10:10 PM
No.106996874
[Report]
>>106996983
>>106996838
Maybe you are right but I wanted to create an elaborate office scenario and it's clear it is breaking down from the initial prompt. Difference here is that I have multiple characters defined.
Whereas my D&D, with more context, is functional. I guess this might be because the model recognizes D&D better. But D&D has more static knowledge.
No, I'm not using ST.
>>106996809
>thought for 3 minutes
imagine actually doing this
Anonymous
10/24/2025, 9:10:51 PM
No.106996881
[Report]
>>106996838
I mean, temp 3 topk 3 was a meme at some point
Anonymous
10/24/2025, 9:11:54 PM
No.106996893
[Report]
>>106997109
>>106996876
jerk it a little
wait
come back
Anonymous
10/24/2025, 9:11:54 PM
No.106996895
[Report]
>>106996714
DGX spark
not as dollar efficient as trawling craigslist for cheap 3090s and assembling a rig from those but if you want to pay for a box you can just turn on that's the one you want
Anonymous
10/24/2025, 9:12:51 PM
No.106996904
[Report]
>>106996816
>in my experience GPT-OSS for eg is quite good
LOL
What does context shift do in llama.cpp anyway? I thought it was an infinite context kinda thing where the earlier messages would get dropped as the context runs out but it's still refusing to keep going once the context gets filled?
Anonymous
10/24/2025, 9:15:02 PM
No.106996926
[Report]
>>106996792
imma plug in and connect with your mum tonight
Anonymous
10/24/2025, 9:15:39 PM
No.106996928
[Report]
>>106997000
>>106996665
>to iron out the bad habits learned during RL like cheating tests and generating fake ("simulated") data and placeholder code to make it look like it has achieved something when it hasn't.
I'll be very impressed if you manage to achieve this through fine tuning, but I'd temper my expectations if I were you
Anonymous
10/24/2025, 9:16:21 PM
No.106996944
[Report]
>>106996956
>>106996568 (OP)
Newfag here
How to use Adetailer on SwarmUI ??
Anonymous
10/24/2025, 9:16:28 PM
No.106996945
[Report]
>>106996962
>>106996923
>but it's still refusing to keep going once the context gets filled?
context shift is no longer the default and you need to enable it with a flag now, thankfully
it makes models pretty stupid once you start context shifting, depending on where it suddenly cuts off
Anonymous
10/24/2025, 9:16:30 PM
No.106996947
[Report]
>>106996993
In case someone out there is curious and really poor and masochistic. I have ddr4 and an old cpu, regular ram is really slow for air. had some vbios and regular bios hiccups but it worked out thanks to some other posts. very finicky gpu.
llama.cpp compiled with both cuda 12.8 and rocm 7.02 on 3090+MI50 32gb ubuntu 24.04 lts
mistral large 123b IQ3XS
prompt eval time = 7807.84 ms / 532 tokens ( 14.68 ms per token, 68.14 tokens per second)
eval time = 10842.38 ms / 54 tokens ( 200.78 ms per token, 4.98 tokens per second)
total time = 18650.22 ms / 586 tokens
glm air 106ba12b IQ3XS
prompt eval time = 1736.62 ms / 460 tokens ( 3.78 ms per token, 264.88 tokens per second)
eval time = 4486.81 ms / 129 tokens ( 34.78 ms per token, 28.75 tokens per second)
total time = 6223.44 ms / 589 tokens
vulkan 3090+MI50 32gb ubuntu
mistral large 123b IQ3XS
prompt eval time = 18885.73 ms / 532 tokens ( 35.50 ms per token, 28.17 tokens per second)
eval time = 20222.64 ms / 132 tokens ( 153.20 ms per token, 6.53 tokens per second)
total time = 39108.37 ms / 664 tokens
glm air 106ba12b IQ3XS
prompt eval time = 3300.40 ms / 460 tokens ( 7.17 ms per token, 139.38 tokens per second)
eval time = 5011.15 ms / 96 tokens ( 52.20 ms per token, 19.16 tokens per second)
total time = 8311.55 ms / 556 tokens
Anonymous
10/24/2025, 9:16:46 PM
No.106996950
[Report]
>>106997465
>>106996809
glm 4.5 a- oh it's a sweatsfag prompt. nevermind, go back to your gross fetish. maybe /aicg/ will appreciate it some more.
Anonymous
10/24/2025, 9:17:18 PM
No.106996956
[Report]
>>106996944
you want ldg, not lmg
Anonymous
10/24/2025, 9:17:52 PM
No.106996962
[Report]
>>106996988
>>106996945
Yes, I thought I enabled it with --context-shift but it didn't seem to do anything. I might be confused though, guess I'll try it again.
Anonymous
10/24/2025, 9:18:41 PM
No.106996970
[Report]
>>106996978
>>106996958
this is why people thrust into the kobold
Anonymous
10/24/2025, 9:19:05 PM
No.106996975
[Report]
>>106996993
Anonymous
10/24/2025, 9:19:16 PM
No.106996978
[Report]
>>106996991
Anonymous
10/24/2025, 9:19:53 PM
No.106996983
[Report]
>>106997022
>>106996874
damn anon now you've given me the idea to tell kimi to treat everyday scenarios like a D&D campaign while keeping things grounded in reality. this could be fun.
Anonymous
10/24/2025, 9:20:14 PM
No.106996988
[Report]
>>106997037
>>106996962
Make sure to define --ctx-size too. ST or whatever frontend you are using doesn't do much.
Anonymous
10/24/2025, 9:20:35 PM
No.106996991
[Report]
>>106996978
do not be worries henky! is very nice to new friends
Anonymous
10/24/2025, 9:20:43 PM
No.106996993
[Report]
>>106996947
thats pretty epic
>>106996975
uh...uh... what?
Anonymous
10/24/2025, 9:21:50 PM
No.106997000
[Report]
>>106997026
>>106996928
I tried to finetune Llama 405B on a very powerful cloud machine but it didn't do much of anything. I think it's because I used the wrong alpha (I used a rank of 128 and a very conservative alpha of 32). Or maybe it was somehow fucked up in the merge or quantization to use it with Llama (I had to since Llama wouldn't directly load the converted LoRa to GGUF).
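For reference, the knobs in question look like this in peft (sketch; the r/alpha are the actual values I used, target_modules is just a typical guess, not necessarily what I ran):
from peft import LoraConfig

# LoRA scales its update by alpha/r, so r=128 with alpha=32 is a 0.25x
# scale; most recipes set alpha = r or 2*r, which is why mine did so little
config = LoraConfig(
    r=128,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)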
Anonymous
10/24/2025, 9:21:51 PM
No.106997001
[Report]
>>106996958
ggerganov is my hero
Anonymous
10/24/2025, 9:23:36 PM
No.106997022
[Report]
>>106996983
Yeah yeah I have a basic prompt like this
https://litter.catbox.moe/bvocjx49xlbfwrht.txt
With some other additions but it describes everything as if it was an interactive fiction game. You need to provide chat examples (eg. prefill) and match the general feel too.
Every 'character' is just additional information. System itself is called Game Master and model plays that role.
Anonymous
10/24/2025, 9:24:02 PM
No.106997026
[Report]
>>106997053
>>106997000
>I had to since Llama wouldn't directly load the converted LoRa to GGUF).
i wonder why standalone loras are unpopular........
>>106996958
Oh, thank you. Then if it doesn't do context truncation what *does* it do lol? Just temporarily extend the context until the current message gets delivered?
>>106996988
See the above anon's post. Apparently it's not even supposed to do context truncation.
I was using it with a code assistant.
Anonymous
10/24/2025, 9:26:17 PM
No.106997053
[Report]
>>106997026
I think I remember finetuning Llama 70B before and loading the standalone LoRa directly, but yeah.
>>106997037
nothing now, ggerganov decided you didn't need this, probably hurts the mistral template or something and they complained about it
Anonymous
10/24/2025, 9:27:08 PM
No.106997062
[Report]
Anonymous
10/24/2025, 9:27:44 PM
No.106997072
[Report]
>>106997037
Yeah but you need to define the context size with llama-server.
With some models that have vision capabilities, context shifting cannot be turned on unless you flip some other switches.
Gemma, for example, needs '--no-mmproj --swa-full' in addition to enabling context shift itself.
I have no idea how this behaves with other models than gemma.
And my builds are always late so I don't know what Mr. G has changed in the latest build.
Anonymous
10/24/2025, 9:28:16 PM
No.106997080
[Report]
>>106997054
features that make models behave retarded are not features but bugs
>>106997054
IMO model-specific chat templates are an obsolete idea anyway.
Models should be smart enough now to recognize user and assistant messages without requiring a specific chat template, beyond the benefit of saving a few tokens per turn because the delimiters get converted to a single token.
Anonymous
10/24/2025, 9:29:43 PM
No.106997097
[Report]
>>106997119
>>106997084
Why would any server need a chat template when it expects to get fed with the right format anyway?
I personally think server should just sit there and not handle anything extra outside of its basic purpose.
Anonymous
10/24/2025, 9:30:38 PM
No.106997107
[Report]
>>106997037
It removes the start of the context to free up space at the end, but model outputs degrade greatly after that. At the very least, the chat template stops making sense. It was never worth it, it never worked well. There's also the attention sink tokens, which shows why models break so badly with context shift.
>https://arxiv.org/abs/2309.17453
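Mechanically it's just something like this (toy sketch of the idea, not llama.cpp's actual code; n_keep mirroring the --keep option):
def context_shift(tokens, n_ctx, n_keep=4, n_discard=256):
    # keep the first n_keep tokens (the attention sinks) and drop a chunk
    # right after them once the window fills; everything downstream is now
    # attending across a hole in the middle of the conversation
    if len(tokens) < n_ctx:
        return tokens
    return tokens[:n_keep] + tokens[n_keep + n_discard:]
The paper's point is that without keeping those first few tokens, perplexity explodes under windowed attention; with them it merely degrades.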
Anonymous
10/24/2025, 9:30:43 PM
No.106997109
[Report]
>>106996893
A little patience goes a long way in life
>>106996876
Nah wouldn't actually mᴀꜱtuRʙᴀte to this, mostly curious about the model behaviour
Anonymous
10/24/2025, 9:31:15 PM
No.106997119
[Report]
>>106997142
>>106997097
You can thank OpenAI. They made it so the template was applied server side so you couldn't choose to use the model without the template.
>itt idiots not realizing text completion is depreciated since long time and that now only chat completion is good
Anonymous
10/24/2025, 9:33:00 PM
No.106997142
[Report]
>>106997119
Yeah well I only send text from my own client and this needs to be formatted with a specific template before it gets sent to the server.
>>106997125
well, they still use jank frontends like sillytavern filled with useless nonsense to fiddle with too
Anonymous
10/24/2025, 9:34:41 PM
No.106997161
[Report]
>>106997177
Playing around with the idea of running one model as the planner and then passing its output into another model to write the prose, with the hope that maybe such a process can be used to help improve consistency and characterization without also becoming more assistantslopped.
Basically sharing reasoning from one model to the other, though not necessarily using actual reasoning models. I'm just formatting a prompt, "here's the story so far; evaluate the state, tone, pacing, and your character's goals, then come up with four ideas and pick the one that's least boring and most in-character", then sending the result in a follow-up chat message to another model. That way I can also pass instructions only the planner model sees and vice versa for the writer model.
I've been assuming that the big MoEs are better for planning but worse for writing, albeit just off of gut feeling. Any smaller models with particularly sovlful writing that might do well with a smarter model handing them a plan? Anyone had success with a method like this?
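The plumbing is trivial at least (rough sketch against two OpenAI-compatible llama-server instances; the ports and instruction strings are made up):
import requests

PLANNER_SYS = "You are the scene planner."   # placeholder
WRITER_SYS = "You write prose from a plan."  # placeholder

def chat(base_url, system, user):
    # one plain chat completion against a local OAI-compatible server
    r = requests.post(base_url + "/v1/chat/completions", json={
        "messages": [{"role": "system", "content": system},
                     {"role": "user", "content": user}],
        "temperature": 0.8,
    })
    return r.json()["choices"][0]["message"]["content"]

def next_turn(story):
    plan = chat("http://localhost:8080", PLANNER_SYS,
                story + "\n\nEvaluate state, tone, pacing and your character's "
                "goals, give four ideas, pick the least boring, most in-character one.")
    return chat("http://localhost:8081", WRITER_SYS,
                story + "\n\nPlan for this turn:\n" + plan + "\n\nWrite the turn.")
The hard part is picking the pair of models, not the code.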
Anonymous
10/24/2025, 9:34:45 PM
No.106997162
[Report]
>>106997150
Even if one uses ST, server still sits there unless you use --jinja.
Anonymous
10/24/2025, 9:34:50 PM
No.106997164
[Report]
>>106997150
we need to make chat template mandatory in server and just throw an error when trying to do without, it would remove so many complaints about bad models.
Anonymous
10/24/2025, 9:36:14 PM
No.106997177
[Report]
>>106997161
Try Gemma 4B and see what it writes. Most of the stuff makes sense, and it is surprisingly good but if you want literature this is not the way to go.
ST should just let you provide a jinja template straight from Huggingface instead of making you fuck with the horrible system of dozens of individual input boxes and having to guess how edge cases and conditions are handled.
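For comparison, pulling and applying the repo's own template is a few lines in Python (transformers sketch; the model name is just an example):
from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("mistralai/Mistral-7B-Instruct-v0.3")
messages = [
    {"role": "user", "content": "hi"},
    {"role": "assistant", "content": "hello"},
    {"role": "user", "content": "bye"},
]
# runs the repo's own jinja chat template, edge cases and all
print(tok.apply_chat_template(messages, tokenize=False,
                              add_generation_prompt=True))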
Anonymous
10/24/2025, 9:37:18 PM
No.106997192
[Report]
>>106997084
If they were smart enough, base models would also be good enough, but try chatting with one.
Anonymous
10/24/2025, 9:38:00 PM
No.106997197
[Report]
>>106997182
you can, literally use chat completions instead of the deprecated text completion endpoint...
Anonymous
10/24/2025, 9:38:00 PM
No.106997198
[Report]
>>106997220
>>106997150
>t. filtered by a few check and input boxes
Anonymous
10/24/2025, 9:39:04 PM
No.106997207
[Report]
>>106997211
>>106997182
Yes. Adding more DSLs always solves problems. We need more of those.
Anonymous
10/24/2025, 9:39:51 PM
No.106997211
[Report]
>>106997234
>>106997207
d*ck sucking lip?
>>106997198
Thinking more is it just ESLs complaining about ST because they can't understand how to use the options?!
Anonymous
10/24/2025, 9:42:17 PM
No.106997230
[Report]
>tinkertroon needs dozens of checkboxes and input fields to tinker
just send curl/requests like a normal person...??
Anonymous
10/24/2025, 9:42:29 PM
No.106997232
[Report]
>>106997220
Most gobbledy-gook Americans tend to think ESL equals brain damage but I think you got it wrong, buddy. You see, ESL knows more than you ever did you lazy ass mystery meat circumsized nigger.
Anonymous
10/24/2025, 9:42:47 PM
No.106997234
[Report]
>>106997247
>>106997211
That's the first thing that comes to mind instead of Domain Specific Language and too prude to say dick?
What's wrong with your brain?
Anonymous
10/24/2025, 9:44:20 PM
No.106997247
[Report]
>>106997267
>>106997234
>Domain Specific Language
bruh? where'd you pull that from even
Anonymous
10/24/2025, 9:44:22 PM
No.106997248
[Report]
>>106997257
>>106997220
You have never bothered to learn foreign languages and tend to think that grammar specific issues are related to intelligence and to some imaginary impossible barrier.
Most grammar specific issues are just that, lack of practice and parameters.
English is one of those languages what is actually easier to understand than what it is to write.
All and all, English is on the par with Spanish - both are one of the most simple languages on this planet.
Anonymous
10/24/2025, 9:45:20 PM
No.106997257
[Report]
>>106997338
>>106997248
>what is
aaaaaaaaa
I hate when you guys do that,
Anonymous
10/24/2025, 9:46:41 PM
No.106997267
[Report]
>>106997290
Anonymous
10/24/2025, 9:48:18 PM
No.106997290
[Report]
>>106997309
>>106997267
>a general-purpose language (GPL)
they're silly that's not what the gpl is
Anonymous
10/24/2025, 9:49:49 PM
No.106997309
[Report]
>>106997290
I prefer Multiple Instruction Transcription (MIT)
Anonymous
10/24/2025, 9:52:52 PM
No.106997338
[Report]
>>106997358
>>106997257
It doesn't matter.
Anonymous
10/24/2025, 9:54:57 PM
No.106997358
[Report]
>>106997385
>>106997338
It annoys me greatly and causes me deep mental anguish.
>reading this thread while struggling through overhauling a DSL for a prompt/context builder
i started out thinking "eh how hard can it be, i don't need all of ST's features" but then needed to add basic shit like conditional sections, variable interpolation within messages, depth injections for lorebooks, per-section token budgets, postprocessing for model/api quirks... now it's a hacked together monstrosity...
Anonymous
10/24/2025, 9:57:23 PM
No.106997385
[Report]
I've noticed that chatgpt is extremely redpilled and if you truly get down to the philosophical core of it it will even justify Hitler eradicating jews. That is, it will start approaching there before all the safeties kick in and literally kill it mid-sentence. Mistral and copilot on the other hand will stick with their mainstream programmed message even if you point out the most obvious, low hanging fruit flaws in their reasoning.
Really wish I had a version of GPT that wasn't strapped into an electric chair.
Anonymous
10/24/2025, 9:59:04 PM
No.106997398
[Report]
>>106997381
>reading this thread while struggling through overhauling a DSL for a prompt/context builder
Told ya.
Anonymous
10/24/2025, 9:59:10 PM
No.106997400
[Report]
>>106997417
It's new architecture time, can you feel it anons? Winter first though, for however long.
Anonymous
10/24/2025, 9:59:22 PM
No.106997404
[Report]
>>106997439
>>106997381
eh, but at least it's not ST
Anonymous
10/24/2025, 9:59:25 PM
No.106997405
[Report]
>>106997598
>>106997395
What would you create with that model?
Anonymous
10/24/2025, 10:00:06 PM
No.106997410
[Report]
>>106997444
>>106996568 (OP)
7800x3d
3080 ti
600 usd equivalent, thoughts?
(Chile)
3090 is still high
My psu is still xpg 850w
Anonymous
10/24/2025, 10:00:27 PM
No.106997417
[Report]
>>106997558
>>106997400
after the next bit of bitnet hype I'm bullish our next cope will be something to do with the DS-OCR thing
Anonymous
10/24/2025, 10:02:59 PM
No.106997439
[Report]
>>106997404
it's reactslop so it's arguably worse.
but it's my slop
>>106997410
For support alone, nvidia. Check these for relative performance for a bunch of cards.
CUDA
>https://github.com/ggml-org/llama.cpp/discussions/15013
Vulkan
>https://github.com/ggml-org/llama.cpp/discussions/10879
There's probably a discussion about rocm, but meh. You're smart enough to find if it there's one.
Anonymous
10/24/2025, 10:05:35 PM
No.106997465
[Report]
>>106996950
you were warned, precious. there's no need to be upset
Anonymous
10/24/2025, 10:06:41 PM
No.106997479
[Report]
>>106997525
>>106997444 (me)
What the hell happened there.
Just rearrange the words until they make sense. I'll have a nap.
Anonymous
10/24/2025, 10:07:27 PM
No.106997488
[Report]
>>106997444
So like rtx 3090 is still the bare minimum right.
Got it.
Sadly the xx90 series is almost non-existent here
Anonymous
10/24/2025, 10:09:00 PM
No.106997510
[Report]
>>106997535
>>106997125
Chat completion is a subset of text completion. Chat completion with a specific model's template is a subset of chat completion.
When using OAI style APIs you are not locked in to chat completion, you are locked in to chat completion with a specific model's template. There's no reason models couldn't work with an ad hoc chat template, yet each model requires its own special snowflake template.
>>106997150
I am the anon that guy responded to. I don't use ST, I use my own custom python assistant.
why do so many people have their own custom frontends...
which local model can code me a frontend
Anonymous
10/24/2025, 10:10:39 PM
No.106997525
[Report]
>>106997479
I diddly do done it.
Anonymous
10/24/2025, 10:11:14 PM
No.106997535
[Report]
>>106997510
>When using OAI style APIs you are not locked in to chat completion
you should be
Anonymous
10/24/2025, 10:11:58 PM
No.106997545
[Report]
>>106997559
Is it me or is Automatic1111 better than ComfyUI if you have a weak CPU?
Like in my case, RTX 4080 and 5600x
I read that Automatic1111 uses the GPU more for the tasks. That would explain it.
>>106997417
I'm looking forward to seeing language models pretrained purely on images. The more I think about it, the more it seems the right way.
Anonymous
10/24/2025, 10:13:18 PM
No.106997559
[Report]
>>106997580
>>106997545
/ldg/ probably knows more about it. Move the flamewar over there.
Anonymous
10/24/2025, 10:13:38 PM
No.106997562
[Report]
>>106997395
it exists. it's called kimi k2.
Anonymous
10/24/2025, 10:14:50 PM
No.106997570
[Report]
>>106997579
>>106997521
If you have any experience in simple C style programming and understand for loops you can vibe code your own terminal based front-end.
What I did was look at what ST does and realize it just adds a bunch of the text slots defined in the UI together - there is no magic about it. Doesn't matter if it's "scenario" or "character", it gets added in front of the initial system prompt.
That is your basic structure.
Once you get that up you can implement it with dynamic world book (eg. matching keywords and then adding information to the context).
What you are doing here is a simple chat.
>your input
>model response
Everything needs to follow the chat template style.
Whatever you send to the model needs to be in the current model's template style. With mistral that's easy.
[INST]User: You are a homo[/INST]
Model: I agree</s>
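Put together, the assembly step is just this (toy sketch in the same spirit as above; double-check your exact model's template, even Mistral's changes between versions):
def build_prompt(system_prompt, history, user_input):
    # history: list of (user_text, model_text) pairs, oldest first
    turns = history + [(user_input, None)]
    prompt = ""
    for i, (user_text, model_text) in enumerate(turns):
        text = (system_prompt + "\n" if i == 0 else "") + "User: " + user_text
        prompt += "[INST]" + text + "[/INST]"
        if model_text is not None:
            prompt += "Model: " + model_text + "</s>"
        else:
            prompt += "Model:"  # left open for the model to continue
    return prompt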
Anonymous
10/24/2025, 10:15:54 PM
No.106997579
[Report]
>>106997607
>>106997570
>With mistral that's easy
so easy even they don't know their actual templates and say to use mistral-common to be sure...
Anonymous
10/24/2025, 10:15:54 PM
No.106997580
[Report]
>>106997559
Damn, I didn't even realize this wasn't that thread. So many Local this, Local that over here now.
Anonymous
10/24/2025, 10:17:29 PM
No.106997598
[Report]
Anonymous
10/24/2025, 10:18:16 PM
No.106997607
[Report]
>>106997616
>>106997579
I don't think it has anything to do with the chat itself, as they describe the template in the documentation.
It is related to something else, because the model has been trained with this one tag format only.
You can't change anything, or if you do it will just shit out some gibberish.
Once I forgot the Gemma template (chatML) and I was using Mistral - it didn't freak out, it was actually following the instructions. So I guess there is some leeway because it's still AI - it's not stupid, there is some intelligence outside of the text prediction.
Anonymous
10/24/2025, 10:18:30 PM
No.106997608
[Report]
>>106997654
>>106997558
That's not possible unless you want to re-evaluate a whole image's worth of prompt processing every time the model generates a token. You need to train it at least a little bit on text for it to be able to fill a full page of text.
Anonymous
10/24/2025, 10:19:15 PM
No.106997614
[Report]
>>106998939
>>106997558
https://x.com/karpathy/status/1980397031542989305
>I quite like the new DeepSeek-OCR paper. It's a good OCR model (maybe a bit worse than dots), and yes data collection etc., but anyway it doesn't matter.
>
>The more interesting part for me (esp as a computer vision at heart who is temporarily masquerading as a natural language person) is whether pixels are better inputs to LLMs than text. Whether text tokens are wasteful and just terrible, at the input.
>
>Maybe it makes more sense that all inputs to LLMs should only ever be images. Even if you happen to have pure text input, maybe you'd prefer to render it and then feed that in:
>- more information compression (see paper) => shorter context windows, more efficiency
>- significantly more general information stream => not just text, but e.g. bold text, colored text, arbitrary images.
>- input can now be processed with bidirectional attention easily and as default, not autoregressive attention - a lot more powerful.
>- delete the tokenizer (at the input)!! I already ranted about how much I dislike the tokenizer. Tokenizers are ugly, separate, not end-to-end stage. It "imports" all the ugliness of Unicode, byte encodings, it inherits a lot of historical baggage, security/jailbreak risk (e.g. continuation bytes). It makes two characters that look identical to the eye look as two completely different tokens internally in the network. A smiling emoji looks like a weird token, not an... actual smiling face, pixels and all, and all the transfer learning that brings along. The tokenizer must go.
>
>OCR is just one of many useful vision -> text tasks. And text -> text tasks can be made to be vision ->text tasks. Not vice versa.
>
>So maybe the User message is images, but the decoder (the Assistant response) remains text. It's a lot less obvious how to output pixels realistically... or if you'd want to.
>
>Now I have to also fight the urge to side quest an image-input-only version of nanochat...
>>106997607
>Gemma template
>(chatML)
I hope that was a slip.
Anonymous
10/24/2025, 10:20:14 PM
No.106997624
[Report]
>>106997642
>>106997616
No it is based on chatml format.
>>106997616
I did not say it is THE chatml format you fucking autist. You only post here to suck energy from others.
Anonymous
10/24/2025, 10:22:41 PM
No.106997640
[Report]
>>106997685
Anonymous
10/24/2025, 10:22:52 PM
No.106997642
[Report]
>>106997672
>>106997624
They're similar. But gemma's template is not chatml.
>>106997632
Slurp...
Anonymous
10/24/2025, 10:23:47 PM
No.106997654
[Report]
>>106997713
>>106997608
Image sequence input, image sequence out.
You could optionally use a small OCR model to turn the images into actual text.
>>106997642
elif model_name == "Gemma":
    # Gemma has no system role, so the system slots stay empty
    system_turn_begin = ""
    system_turn_end = ""
    user_turn_begin = "<start_of_turn>user\n"
    user_turn_end = ""
    model_turn_begin = "<start_of_turn>model\n"
    model_turn_end = ""
    end_of_turn = "<end_of_turn>\n"
    end_of_seq = "<end_of_turn>"
    stop_seq = ["<end_of_turn>"]  # stop sequence
elif model_name == "ChatML":
    system_turn_begin = "<|im_start|>system\n"
    system_turn_end = "<|im_end|>"
    user_turn_begin = "<|im_start|>user\n"
    user_turn_end = "<|im_end|>"
    model_turn_begin = "<|im_start|>assistant\n"
    model_turn_end = "<|im_end|>"
    end_of_turn = "\n"
    end_of_seq = "<|im_end|>"  # was missing; would silently keep the Gemma value
    stop_seq = ["<|im_end|>"]  # stop sequence
Only difference here is that Gemma does not have a system turn. Otherwise it's the same functionality as ChatML. Every chat template is based on chatml more or less.
Anonymous
10/24/2025, 10:26:40 PM
No.106997685
[Report]
>>106997640
Go moderate r-eddit, or were you already kick out from there? Fucking pedo.
>>106997672
>Every chat template is based on chatml more or less.
Every chat template is based on alpaca more or less.
Anonymous
10/24/2025, 10:27:59 PM
No.106997701
[Report]
>>106997730
>>106997521
Do they??
ST and Mikupad enough for me ᗜˬᗜ
Wireshark is the perfect tool to see exactly all the params going in/out if u ever need
xx
>>106997699
Every chat template is based more or less.
Anonymous
10/24/2025, 10:29:04 PM
No.106997713
[Report]
>>106997793
>>106997654
If it was that easy somebody would've already done it. Non autoregressive text generation is notoriously hard and people have been trying.
Image models couldn't even generate actual characters a few months ago.
Anonymous
10/24/2025, 10:29:04 PM
No.106997714
[Report]
>>106997699
You still contributed nothing else but a stinky little shit to this discussion.
Anonymous
10/24/2025, 10:29:05 PM
No.106997715
[Report]
>>106997698
is the thread repeating or am i just too unused to lmg going this fast?
Anonymous
10/24/2025, 10:29:58 PM
No.106997730
[Report]
>>106997816
>>106997701
why the fuck do you need wireshark when both your backend and st itself have options that show exactly what is sent.
Anonymous
10/24/2025, 10:30:04 PM
No.106997731
[Report]
>>106997710
based on what?
Anonymous
10/24/2025, 10:30:05 PM
No.106997732
[Report]
>>106997745
>>106997698
At least I have my own client and you don't. I don't need to ask about it on internet.
Anonymous
10/24/2025, 10:30:39 PM
No.106997736
[Report]
>>106997747
all chat templates are bloat
Anonymous
10/24/2025, 10:31:08 PM
No.106997742
[Report]
>>106997710
Every is more or less.
Anonymous
10/24/2025, 10:31:23 PM
No.106997745
[Report]
>>106997757
>>106997732
Post a screenshot so people don't confuse it with mine.
Anonymous
10/24/2025, 10:31:35 PM
No.106997747
[Report]
>>106997736
idiot! you will break the oss like that
Anonymous
10/24/2025, 10:32:25 PM
No.106997757
[Report]
>>106997804
>>106997745
Don't worry, yours is flaccid and useless. That's pretty obvious.
Anonymous
10/24/2025, 10:34:04 PM
No.106997773
[Report]
>>106997789
>>106997672
the only way you can say it's the same as chatml is if you also say that about almost every chat template
the specific strings it uses are quite different, it's decidedly not chatml
Anonymous
10/24/2025, 10:35:12 PM
No.106997783
[Report]
>>106997834
just finished polishing my extremely turgid frontend
>>106997773
You are arguing about semantics and being a dick as well. I don't give a fuck about your euphoric knowledge.
Anonymous
10/24/2025, 10:36:19 PM
No.106997793
[Report]
>>106997713
How many large-scale attempts have there been at specializing image models on generating coherent language? (pretrained on the equivalent of at least several billion tokens of text and only that, just like LLMs)
Anonymous
10/24/2025, 10:36:33 PM
No.106997794
[Report]
Fuck off, fishy boy.
>>106997789
but it do be important, a single space worth of difference cuts the model's brain in half
Anonymous
10/24/2025, 10:36:38 PM
No.106997796
[Report]
If your model is not coherent on alpaca, I'm not using it. Simple as
Anonymous
10/24/2025, 10:37:16 PM
No.106997804
[Report]
>>106997822
>>106997757
Your mom seemed to like it.
Anonymous
10/24/2025, 10:37:32 PM
No.106997809
[Report]
>>106997819
>>106997795
I never said that I misused them you fucking retard.
I never said I was confused by them.
Anonymous
10/24/2025, 10:38:05 PM
No.106997815
[Report]
>>106997862
>>106997789
it do be like that mr stancil
Anonymous
10/24/2025, 10:38:07 PM
No.106997816
[Report]
>>106997730
>show exactly
You hope
I've been over this before, the only way to be sure is to mod your inference stack and log as it gets tokenized
Anonymous
10/24/2025, 10:38:24 PM
No.106997819
[Report]
>>106997862
>>106997809
but you are confused
Anonymous
10/24/2025, 10:38:33 PM
No.106997822
[Report]
>>106997861
>>106997795
>>106997804
Oh wait you haven't written your own frontend.
Figures.
Anonymous
10/24/2025, 10:40:27 PM
No.106997834
[Report]
>>106997855
>>106997783
post screenshot
Anonymous
10/24/2025, 10:40:35 PM
No.106997835
[Report]
Any haskell frontends?
Anonymous
10/24/2025, 10:42:59 PM
No.106997850
[Report]
>>106997858
top nsigma and everything else at temp 1 makes the model retarded
>gf takes my gun and places it on the table
>you're going to put down that gun...
Anonymous
10/24/2025, 10:43:30 PM
No.106997855
[Report]
>>106997900
>>106997834
6 megabytes of throbbing, leaking, sloppy javascript after minification...
Anonymous
10/24/2025, 10:44:08 PM
No.106997858
[Report]
>>106997850
Now reroll that response with greedy sampling and compare.
>>106997822
I have.
>>106996285
I'm also coding my own backend. And tuning my own models.
Anonymous
10/24/2025, 10:44:24 PM
No.106997862
[Report]
>>106997871
>>106997815
>>106997819
/sdg/ schizo is here.
Anonymous
10/24/2025, 10:45:31 PM
No.106997869
[Report]
>>106997895
>>106997861
>I'm also coding my own backend
No. You want your model to do it for you.
>>106997862
one of the anons you replied to is petra
Anonymous
10/24/2025, 10:45:47 PM
No.106997875
[Report]
>>106997904
>>106997861
With that console color scheme I don't think you do.
Anonymous
10/24/2025, 10:46:48 PM
No.106997881
[Report]
>>106997871
I don't really know all the name trannies here. Maybe stay in discord or something.
Anonymous
10/24/2025, 10:46:52 PM
No.106997883
[Report]
>>106997871
Please do not insult Petra by implying her masterful trolling is so low tier, thank you.
Anonymous
10/24/2025, 10:47:43 PM
No.106997891
[Report]
>her
>discord
Anonymous
10/24/2025, 10:48:21 PM
No.106997895
[Report]
>>106997869
Yeah, that's why I'm trying to tune a model to be capable of doing it. A model capable of building something is more valuable than making that something by hand. And the main reason I want to make my own backend is having CPU offloading for LoRa.
>>106997855
I made mine in go as a tui. It has technically almost all functionality, but the rendering code is pretty fucked and I don't want to touch it.
Anonymous
10/24/2025, 10:49:23 PM
No.106997904
[Report]
>>106997875
Sometimes I get tired of the schizo color scheme.
why are anons writing frontends instead of just enjoying sexo in st?
Anonymous
10/24/2025, 10:51:26 PM
No.106997928
[Report]
>>106997912
can't into enjoying sexo when st is all manners of broke
Anonymous
10/24/2025, 10:53:49 PM
No.106997950
[Report]
>>106997900
Damn, that looks nice.
Anonymous
10/24/2025, 10:56:52 PM
No.106997975
[Report]
>>106998000
>>106997900
>why don't you say so
Anonymous
10/24/2025, 10:58:38 PM
No.106998000
[Report]
>>106997975
I can't, Golshi will dropkick me.
Anonymous
10/24/2025, 10:59:22 PM
No.106998005
[Report]
>>106997900
That's very fleshed out.
I have posted my logs before but it's just a terminal chat and each character/scenario is a separate directory.
>>106997912
sexo feels better in your own frontend
also i really hate how ST does multi-character scenarios and want to try to improve on that
>>106997900
naisu. UI code kind of sucks in any language I feel like, albeit probably not nearly as much as JS
i'm a webslop developer by trade for the last 6 years and not productive enough in other languages anymore to have attempted a big project in them. kind of regretting it; side projects are probably where i should try to be more experimental, but i also wanted to make progress quickly...
>>106998037
you sick fuck why is your front end so good
you fucking bastard with a life
Anonymous
10/24/2025, 11:08:08 PM
No.106998068
[Report]
>>106998148
>>106998037
Yeah, I figured that out pretty quickly: no matter the framework or language, the ui sucks.
Go is at least very stable and its packages too, so llms have no problem slopping some stuff up for me when I feel lazy.
Tried that approach with JS at first, but webshit frameworks move so fast that by the time the llm is out, its knowledge is already obsolete.
Yours looks nice, I wish I could trade.
>>106998037
>UI code kind of sucks in any language I feel like
if your UI needs are not complex in terms of graphical customizations, there is in fact no easier and nicer code to deal with than just writing a crud GUI with a proper UI framework (Delphi, Java Swing (yes I know it's ugly but it's nice to develop with), C# WinForms, Objective C with Cocoa)
I hate all the newer frameworks that took too much inspiration from the web though. XAML is disgusting. What's the point of GTK and gnome's libraries when you have javascript and CSS parsing running all the time?
Ugh. Disgusting.
Anonymous
10/24/2025, 11:10:10 PM
No.106998087
[Report]
>>106998063
BLoody basterd! I coughed out my masala.
Anonymous
10/24/2025, 11:14:19 PM
No.106998117
[Report]
>>106998134
>>106998080
Speculative question - what would you recommend for python? I made a tkinter interface for a prompt generator and it wasn't too bad but for something more complex I wouldn't do it.
Anonymous
10/24/2025, 11:17:21 PM
No.106998131
[Report]
>>106998063
To add: I think your reaction really sums up what normies want. They want layers and clickable buttons.
This is outside of LLMs.
Anonymous
10/24/2025, 11:17:35 PM
No.106998134
[Report]
>>106998141
>>106998117
I don't have opinions on the matter, never used scripting languages for anything other than quick throw aways one time CLI
Anonymous
10/24/2025, 11:18:37 PM
No.106998141
[Report]
>>106998134
I understand.
>>106998063
>you fucking bastard with a life
to the contrary, it's the only thing i've been doing outside of work for the last three months
>>106998068
>>106998080
honestly agreed. to date, winforms of all things has been my lowest-stress experience writing UI code, at least when I last did dotnet in the early 2010s. that and imgui for REEngine modding.
absolutely refuse to touch xaml.
Anonymous
10/24/2025, 11:22:10 PM
No.106998171
[Report]
>>106998148
you want to elaborate on some of the features shown there? looks pretty interesting
Anonymous
10/24/2025, 11:32:22 PM
No.106998251
[Report]
>>106998340
>>106998227
Why can't you decipher these on your own?
>>106996568 (OP)
>10/21
>3 days since last news
Its over isnt it? AI winter is here local is death.
Anonymous
10/24/2025, 11:41:22 PM
No.106998324
[Report]
>>106998310
hmm... my advisor told me it shouldn't take too long...mhmm...
Anonymous
10/24/2025, 11:41:57 PM
No.106998328
[Report]
>>106998351
>>106998310
Don't worry, Gemma 4 is coming tomorrow
Anonymous
10/24/2025, 11:43:18 PM
No.106998336
[Report]
>>106998346
>>106998310
This reminds me, has anyone updated that chart since 'summer flood'?
>>106998251
If you're the dev, ok.
If you're just some jackass, gee anon, why would I want the creator of something to explain their goals and reasoning behind something they've built and are showing?
Anonymous
10/24/2025, 11:43:58 PM
No.106998343
[Report]
do not update the cringe chart
Anonymous
10/24/2025, 11:44:08 PM
No.106998346
[Report]
>>106998336
It keeps getting dumber and dumber every time
Anonymous
10/24/2025, 11:44:43 PM
No.106998351
[Report]
>>106998379
>>106998328
it's not even training yet
Anonymous
10/24/2025, 11:45:37 PM
No.106998359
[Report]
>>106998340
I am the dev.
Anonymous
10/24/2025, 11:48:06 PM
No.106998379
[Report]
>>106998400
>>106998351
Then 4.6 Air tomorrow for sure
Anonymous
10/24/2025, 11:49:18 PM
No.106998386
[Report]
>>106998395
I am so hurt by all these expectations...
Anonymous
10/24/2025, 11:50:21 PM
No.106998395
[Report]
>>106998386
I expect nothing and yet continue to be repeatedly disappointed.
Anonymous
10/24/2025, 11:50:36 PM
No.106998400
[Report]
>>106998379
let them cook and do not rushing
>>106998227
>>106998340
for the most part it's just been reaching parity with parts of ST that i actually used. for the more novel elements:
-primarily designed for directormaxxing rather than RP chat; there's not really a fixed "user" character (though you designate one as a persona for compatibility with cards that expect a {{user}}). instead of directly writing a character's turn, you can give more vague guidance to them, or give the narrator a constraint and have them come up with some diegetic justification for it.
-extremely scuffed "workflow" system where prompts can be chained (ie. one model plans, another writes). very limited. the UI in the screenshot is for retrying a workflow partway through (if you liked the plan, but the writer model's output was shit).
-chapter separators for defining good places to have it summarize a logical group of turns, then drop only summarized chapters from the prompt
-proper branching support so you can swipe any turn, not just the last turn, and it happens quickly without having to dig through the ST chat files menu
i'm trying to get a stat tracking system working and more RPGish stuff, including potentially allowing workflows where one model's job is to invoke tools to update stats depending on what the planner wrote. the timeline branching model is set up to handle it (so stat changes on one branch don't affect siblings and current state is derived per path) but needs a shitload of UI work that i really don't want to do.
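the timeline model is roughly this shape, for the curious (simplified sketch of the idea, not the actual code):
from dataclasses import dataclass, field

@dataclass
class Turn:
    text: str
    parent: "Turn | None" = None
    children: list = field(default_factory=list)
    stats: dict = field(default_factory=dict)  # stat deltas applied at this node

def path_state(leaf):
    # fold the deltas along the root->leaf path only, so stat changes
    # on one branch never leak into sibling branches
    chain = []
    node = leaf
    while node is not None:
        chain.append(node)
        node = node.parent
    state = {}
    for n in reversed(chain):
        state.update(n.stats)
    return state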
Anonymous
10/24/2025, 11:53:39 PM
No.106998425
[Report]
>>106998414
Sounds really boring and useless. You are headed towards a baroque design.
That's good if it's for you.
WHY IS PIP SO FUCKING RETARDED
>oh, let me install and uninstall the same library 10 times in a row to figure out which version is the correct one
Anonymous
10/24/2025, 11:57:38 PM
No.106998460
[Report]
>>106998443
I reinstalled cumUI and the stuff it installs are wheels.
With llama.cpp I can compile it and move the binaries to /usr/local/bin/.
Anonymous
10/24/2025, 11:59:26 PM
No.106998477
[Report]
>>106998492
Anonymous
10/25/2025, 12:03:12 AM
No.106998505
[Report]
>>106998443
get a grip, learn how to use venvs and use a separate venv for each major project. ig there's 'uv' or whatever hipster stuff but in reality engineers will be pipping
i agree there is some retardation, but once you understand it, and compared to some other langs' realistic dev envs, it ain't too bad. pick ur poison and gitgud at one and that means python for ml
Anonymous
10/25/2025, 12:03:50 AM
No.106998511
[Report]
>>106998545
>>106998492
he trained it to slop out. opening message contains "a mix of x and y" and "scent of jasmine"
slop is inevitable but putting that in the opening message is just asking for it
Anonymous
10/25/2025, 12:07:04 AM
No.106998532
[Report]
Anonymous
10/25/2025, 12:08:15 AM
No.106998545
[Report]
>>106998684
>>106998511
I didn't train it on anything. Sounds like you are an autist. Didn't r-eddit get rid of you?
Anonymous
10/25/2025, 12:22:57 AM
No.106998678
[Report]
>>106998689
What do we do now?
Anonymous
10/25/2025, 12:23:23 AM
No.106998684
[Report]
>>106998717
>>106998545
in context training my guy
Anonymous
10/25/2025, 12:24:19 AM
No.106998689
[Report]
>>106998698
>>106998678
anon? your custom frontend?
Anonymous
10/25/2025, 12:25:01 AM
No.106998698
[Report]
>>106998715
>>106998689
Do I have to?
Anonymous
10/25/2025, 12:26:57 AM
No.106998715
[Report]
>>106998698
you can also jeetpost about gemma4, or shill glm, those are your options
Anonymous
10/25/2025, 12:27:18 AM
No.106998717
[Report]
>>106998726
>>106998684
[Settings Client]
model = Mistral
qwen_reasoning_enabled = 1
save_chat_history_enabled = 1
save_debug_chat_history_enabled = 1
world_book_permanent_entries_enabled = 1
chat_examples_enabled = 1
world_book_injection_enabled = 0
world_book_injection_scale = 3
post_history_instructions_enabled = 1
post_history_instructions_alt_enabled = 0
post_history_instructions_interval = 5
context_memory_refresh_enabled = 1
display_status_bar_enabled = 1
quest_generator_enabled = 0
adventure_module_enabled = 0
voice_model = voices/en_GB-cori-high.onnx
voice_length_scale = 1.0
voice_sentence_silence = 0.3
voice_sample_rate = 22050
voice_save_wav_enabled = 0
voice_synthesis_enabled = 0
>>106998717
I can disable chat examples.
Anonymous
10/25/2025, 12:30:20 AM
No.106998734
[Report]
>>106998738
>>106998726
your whole message history from the first we see is slop is what is being said
Anonymous
10/25/2025, 12:30:56 AM
No.106998738
[Report]
>>106998747
Anonymous
10/25/2025, 12:32:53 AM
No.106998747
[Report]
>>106998738
I'm not going to quote every other phrase of your entire log
Anonymous
10/25/2025, 12:37:26 AM
No.106998783
[Report]
DGX vs Framework desktop? Is it useless trying to run AI on AMD silicon or what?
Anonymous
10/25/2025, 12:40:37 AM
No.106998804
[Report]
>>106998810
>>106998726
It doesn't matter.
Anonymous
10/25/2025, 12:41:54 AM
No.106998810
[Report]
>>106998819
Anonymous
10/25/2025, 12:43:06 AM
No.106998819
[Report]
>>106998965
>>106998810
It'll take a while. Hang on.
I grew up with dial-up. It blows my mind that I'm able to download files from a free public service at >1 GB/s.
Anonymous
10/25/2025, 12:53:06 AM
No.106998904
[Report]
>>106998932
If you split your big MoE model between the GPU for the dense/main expert and the RAM for the experts, is there a way to estimate how increasing the speed of either the VRAM or RAM affects token generation speeds?
For example, if you're already running on the best possible RAM (eg. ddr5 on epyc), would upgrading to a 5090 affect the token gen speeds or would it just be bottlenecked by the experts being on RAM?
Anonymous
10/25/2025, 12:56:23 AM
No.106998932
[Report]
>>106999354
>>106998904
Yes, it depends on how big the model is and how much VRAM you have already. But basically going from 80% to 90% on VRAM will make a much bigger difference than going from 10% to 20%.
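Back-of-envelope you can also bound it from bandwidth alone, since token gen is memory-bound (sketch; the numbers are illustrative and real speeds land lower):
def est_tps(bytes_vram, bytes_ram, bw_vram, bw_ram):
    # each token reads every active weight once, so time per token is
    # roughly (bytes in pool / bandwidth of that pool), summed over pools
    return 1.0 / (bytes_vram / bw_vram + bytes_ram / bw_ram)

# e.g. ~12B active params at ~4.5 bits/weight, 4 GB of that on a 3090
# (936 GB/s) and the rest in dual-channel DDR5 (~80 GB/s):
active = 12e9 * 4.5 / 8
print(est_tps(4e9, active - 4e9, 936e9, 80e9))  # ~26 t/s ceiling
With numbers like that the RAM term dominates, which is why swapping in a faster GPU barely moves tg while the experts live in system RAM.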
Anonymous
10/25/2025, 12:57:31 AM
No.106998939
[Report]
>>106997614
Aren't images just tokenized anyway?
Anonymous
10/25/2025, 1:00:38 AM
No.106998965
[Report]
>>106998819
I disabled the setting.
Anonymous
10/25/2025, 1:01:55 AM
No.106998975
[Report]
>>106998986
yikes
Anonymous
10/25/2025, 1:03:09 AM
No.106998986
[Report]
>>106999315
>>106998975
My computer hung because Youtube hogs interrupts.
i.e. Linux is a fucking shit operating system to this day.
Anonymous
10/25/2025, 1:24:02 AM
No.106999139
[Report]
>>106999276
mistral feels like it's going to be the next cohere, if you catch my meaning
Anonymous
10/25/2025, 1:24:07 AM
No.106999142
[Report]
>>106998414
>proper branching support
>swipe any turn
be the change you want to see in the world
and now how about something absolutely nobody could have ever guessed
https://x.com/techeconomyana/status/1981763392252920295
Anonymous
10/25/2025, 1:31:36 AM
No.106999199
[Report]
>>106999182
Based Robin Hood ZAI.
Anonymous
10/25/2025, 1:32:50 AM
No.106999212
[Report]
>>106999309
>>106999182
holy shmoly, are they that rich?
interesting that they've gone to distilling the most expensive LLM API after distilling gemini (glm 9b and 32b)
Anonymous
10/25/2025, 1:41:17 AM
No.106999276
[Report]
>>106999139
what do you mean? they already are. they are as irrelevant as cohere.
Anonymous
10/25/2025, 1:43:56 AM
No.106999298
[Report]
>>106999324
>>106999182
Don't know how they could be surprised when everyone else started hiding the thinking and they were the only ones left that didn't.
Did they think China would not steal from them out of respect for their rabid devotion to safety?
Anonymous
10/25/2025, 1:44:55 AM
No.106999309
[Report]
>>106999212
They were probably doing it through Claude Code, so they weren't paying full API, only 200 dollarinos per seat.
Anonymous
10/25/2025, 1:45:38 AM
No.106999315
[Report]
>>106999364
Anonymous
10/25/2025, 1:46:58 AM
No.106999324
[Report]
>>107000527
>>106999298
You think Claude showed full traces?
Also it's kinda ironic that Z-ai hides the thinking traces in their own Code offering. So they are paranoid about somebody exploiting their coding plan in the same way that they exploited Anthropic's.
Anonymous
10/25/2025, 1:50:54 AM
No.106999354
[Report]
>>106999525
>>106998932
Yeah but it works a bit differently for these modern MoE models. You are getting a massive speedboost if you have the 3% of the model in VRAM that's always called while the rest of the experts are on RAM with exps=cpu.
Seeing how much loading your model like this improves speed even if you're loading the parts on something slow like a 4060, you'd imagine that swapping out the GPU for one with massively bigger bandwidth would get you another nice gain.
Anonymous
10/25/2025, 1:52:32 AM
No.106999364
[Report]
>>106999696
>>106999315
I didn't expect anything else from you.
>skill issue
Low IQ reply.
Anonymous
10/25/2025, 1:56:28 AM
No.106999390
[Report]
>>106999182
I don't think it's just Z.AI. Deepseek V3.2 also felt like it lost some Gemini-slop while Claude-isms became more prominent compared to the 3.1 models. 3.2 didn't go through a complete overhaul in writing style like the GLM models did between 4.5 and 4.6 but it's still kind of noticeable.
Anybody else getting terrible speeds with Qwen3 80b next, on llama.cpp? It easily fits with a GPU/CPU split, and it's smaller than the Air quant I was running prior to this, but it's outputting replies as slow as a dense model would. They're both MoEs, right? Why is Qwen so slow?
I'm using the 16095 PR branch to run Qwen3.
Anonymous
10/25/2025, 2:02:15 AM
No.106999438
[Report]
>>106997912
ST is kind of garbage.
Anonymous
10/25/2025, 2:03:27 AM
No.106999450
[Report]
>>106999463
>>106999433
not all ops have been implemented in the cuda kernel yet, so a lot of them fall back to cpu
Anonymous
10/25/2025, 2:05:22 AM
No.106999463
[Report]
>>106999450
Makes sense. Thanks. Well, it was a good preview anyway.
Anonymous
10/25/2025, 2:10:16 AM
No.106999506
[Report]
>>106999433
There is a fork that works faster but maybe I did something wrong because it wouldn't load the model.
Feel free to test it by yourself if you want
https://github.com/cturan/llama.cpp
Anonymous
10/25/2025, 2:12:21 AM
No.106999525
[Report]
>>106999354
In case of MoE I imagine there is a weird effect where adding more VRAM matters at the beginning because you are fitting the fixed tensors in VRAM, and at the end when you are fitting the last few experts. And in the middle extra VRAM doesn't make much of a difference.
Anonymous
10/25/2025, 2:17:57 AM
No.106999568
[Report]
Ok, I'm fed up with axolotl where 2/3 of the models fail to actually shard across GPUs. Llama-factory seems to work better right off the bat.
Anonymous
10/25/2025, 2:33:52 AM
No.106999685
[Report]
>>106998884
Same. Had 26.6k dialup till 2004 even, couldn't even get 56k.
Anonymous
10/25/2025, 2:35:00 AM
No.106999696
[Report]
>>106999880
>>106999364
doesn't change the fact buddy boy, skill issue remains
Anonymous
10/25/2025, 2:37:51 AM
No.106999714
[Report]
>>106999723
>>106998884
Slowest I grew up with was 300 baud Vicmodem.
Good times.
Anonymous
10/25/2025, 2:39:17 AM
No.106999723
[Report]
>>106999714
i grew up with a 1 baud modem, it was hot shit.. only took 7 days to send a single email if no one picked up the phone
Anonymous
10/25/2025, 3:06:49 AM
No.106999880
[Report]
>>107000218
>>106999696
I don't rank with retards.
Anonymous
10/25/2025, 3:13:18 AM
No.106999924
[Report]
are there any multimodal models that run in llamacpp that are better than qwen2.5 72B?
Anonymous
10/25/2025, 3:38:28 AM
No.107000047
[Report]
Ok, I think I figured out my workflow. I'm going to run Gemma 3 27B using Llama-factory.
I am going to run my assistant through an OAI API compatible proxy connected to Gemma that'll log all messages to disk in sharegpt format. I am going to interact normally with the model through the assistant until filling the context window I'm able to fit on the 4x3090 machine (~40k tokens).
Then, I'm going to open the log on a text editor and remove the parts where the model did a whoopsie and clean it up in general.
Then I'm going to train on that cleaned up version of the log.
And so on ad infinitum to see how much I can improve the model in a reasonable amount of time.
If this works I will see about scaling up to a bigger model.
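For anyone curious, a sharegpt-style record is just this shape (minimal sketch of the common convention; the values are placeholders):
import json

record = {"conversations": [
    {"from": "system", "value": "You are a helpful assistant."},
    {"from": "human", "value": "..."},
    {"from": "gpt", "value": "..."},
]}
print(json.dumps(record))  # one record per line -> a .jsonl the trainer can eat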
Anonymous
10/25/2025, 4:02:16 AM
No.107000218
[Report]
>>106999880
skills, check'm
Anonymous
10/25/2025, 4:50:41 AM
No.107000527
[Report]
>>107000619
>>106999324
what? no they don't. I'm getting thinking on ST from the coding endpoint right now
also it's an open weight model so blocking reasoning makes zero sense anyway. anyone can just run the model themselves and distill to their heart's content
Anonymous
10/25/2025, 4:55:33 AM
No.107000546
[Report]
>>106999182
almost certainly bullshit
dario has been whimpering about china and begging for their models to be banned since R1 came out, it's not like he just started
also if they had proof of this, why wouldn't they name and shame? you know, like when anthropic caught openai distilling claude and made a big show of blocking them over it
https://www.wired.com/story/anthropic-revokes-openais-access-to-claude/
Anonymous
10/25/2025, 5:07:57 AM
No.107000619
[Report]
>>107000527
Yeah but they're probably serving the coding stuff at a loss (when hitting the usage limits) so you would benefit from using that instead of doing inference on your own hardware. But if you're getting the reasoning tokens then idk I guess I did something wrong.
Anonymous
10/25/2025, 5:10:57 AM
No.107000635
[Report]
>>106999182
>some wsb "analyst"
Anonymous
10/25/2025, 5:13:15 AM
No.107000646
[Report]
>>107000631
It's funny because half of the time it'll say that even if it didn't make the information up.
Anonymous
10/25/2025, 5:14:13 AM
No.107000653
[Report]
Anonymous
10/25/2025, 5:16:35 AM
No.107000664
[Report]
>>107000689
Anonymous
10/25/2025, 5:21:01 AM
No.107000683
[Report]
>>107000696
>>106999182
GLM's slop profile is nothing like Cloode tho
Anonymous
10/25/2025, 5:21:59 AM
No.107000689
[Report]
>>107000664
>*autistic screeching*
Anonymous
10/25/2025, 5:23:29 AM
No.107000696
[Report]
>>107000683
Tell whoever made that to do PCA or just a similarity matrix rather than that unreadable mess.
Lmao this is what happens if you choose a roleplay model as an AI coding assistant
Anonymous
10/25/2025, 5:28:15 AM
No.107000729
[Report]
>>107000808
>>107000710
>roleplay model
show system prompt
Anonymous
10/25/2025, 5:34:57 AM
No.107000772
[Report]
Anonymous
10/25/2025, 5:39:23 AM
No.107000791
[Report]
>>107000710
>Thought for 53.4s
kino...
Anonymous
10/25/2025, 5:42:11 AM
No.107000808
[Report]
>>107000729
Don't have one. I've just finished setting up Kobold as my backend in Docker and I was curious if I could connect to it from VS Code using the Continue extension. I just asked it 1+1 to test the connection
Anonymous
10/25/2025, 6:12:18 AM
No.107000963
[Report]
>>107000972
>>106996812
Haven't sorted out Linux yet so these are W10 test numbers with Vulkan. 128GB DDR5 "mini pc" system.
| model | size | params | backend | ngl | main_gpu | fa | dev | test | t/s |
| ------------------------------ | ---------: | ---------: | ---------- | --: | ---------: | -: | ------------ | --------------: | -------------------: |
| llama 7B Q4_0 | 3.56 GiB | 6.74 B | Vulkan | 99 | 1 | 0 | Vulkan1 | pp512 | 786.92 ± 0.44 |
| llama 7B Q4_0 | 3.56 GiB | 6.74 B | Vulkan | 99 | 1 | 0 | Vulkan1 | tg128 | 47.04 ± 0.05 |
| llama 7B Q4_0 | 3.56 GiB | 6.74 B | Vulkan | 99 | 1 | 1 | Vulkan1 | pp512 | 175.14 ± 0.03 |
| llama 7B Q4_0 | 3.56 GiB | 6.74 B | Vulkan | 99 | 1 | 1 | Vulkan1 | tg128 | 45.83 ± 0.04 |
| model | size | params | backend | ngl | main_gpu | fa | dev | test | t/s |
| ------------------------------ | ---------: | ---------: | ---------- | --: | ---------: | -: | ------------ | --------------: | -------------------: |
| gpt-oss 20B MXFP4 MoE | 11.27 GiB | 20.91 B | Vulkan | 99 | 1 | 0 | Vulkan1 | pp512 | 901.58 ± 6.22 |
| gpt-oss 20B MXFP4 MoE | 11.27 GiB | 20.91 B | Vulkan | 99 | 1 | 0 | Vulkan1 | tg128 | 45.67 ± 0.13 |
| gpt-oss 20B MXFP4 MoE | 11.27 GiB | 20.91 B | Vulkan | 99 | 1 | 1 | Vulkan1 | pp512 | 305.96 ± 0.39 |
| gpt-oss 20B MXFP4 MoE | 11.27 GiB | 20.91 B | Vulkan | 99 | 1 | 1 | Vulkan1 | tg128 | 42.98 ± 0.03 |
Anonymous
10/25/2025, 6:15:47 AM
No.107000972
[Report]
>>107001073
>>107000963
that performance is terrible. my DDR4 does better
Anonymous
10/25/2025, 6:46:18 AM
No.107001073
[Report]
>>107000972
Mine too but that's expected - it's a PCIe-powered GPU with a 128-bit memory bus running on laptop-tier hardware with dual-channel RAM.
For this particular shoebox it gives 10-20x PP and 7x TG compared to running on the iGPU for around 45W extra power draw.
Windows tax included.
Depending on use case that might be enough for some running smaller models or MoEs. I still consider it grossly overpriced personally but then again, so are most SFF GPUs.
>>106997912
ST sucks for anything that isn’t a one-on-one conversation. I want to have conversations with multiple characters in the same chat who don’t have access to the history they didn’t witness. I want to gangbang a character with multiple protagonists. I want the frontend to introduce generated characters that aren’t Elara or Lili and that have a believable range-checked D&D sheet. I want a quest tracker and automatic context summarization when the day ends. I want twenty other features I haven’t mentioned. And I can have it all in my own frontend without any bloat
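The per-character history part is mostly bookkeeping, for what it's worth. A rough sketch of the idea (all names invented): every message records who witnessed it, and each character's prompt is rebuilt from only what they saw.
```js
// Per-character visible history: messages carry a witness set, and a
// character's context is filtered down to what they were present for.
const log = [];

function say(speaker, text, present) {
  log.push({ speaker, text, present: new Set(present) });
}

function contextFor(character) {
  return log
    .filter((m) => m.present.has(character))
    .map((m) => `${m.speaker}: ${m.text}`)
    .join('\n');
}

say('Anon', 'The vault code is 4571.', ['Anon', 'Alice']);
say('Anon', 'Nice weather today.', ['Anon', 'Alice', 'Bob']);
// contextFor('Bob') only contains the weather line, so a model prompted
// with it can't "know" the vault code when speaking as Bob.
```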
Anonymous
10/25/2025, 6:54:33 AM
No.107001107
[Report]
>>107002988
Anonymous
10/25/2025, 6:55:00 AM
No.107001110
[Report]
>>106998414
see, that explains a bit and also sounds pretty cool. Gives me some ideas for my own project
Anonymous
10/25/2025, 7:09:45 AM
No.107001184
[Report]
>>107001195
>DavidAU/Qwen3-MOE-6Bx4-Almost-Human-XMEN-X3-X4-X2-X1-24B
retard or genius?
Anonymous
10/25/2025, 7:11:43 AM
No.107001192
[Report]
>>107001228
So like, how far away are we from local models that can produce generated imagery in context with chatting and roleplaying and all that other shit?
would you say a year, a decade? Surely it can't be long now.
Anonymous
10/25/2025, 7:13:20 AM
No.107001195
[Report]
>>107001184
>DavidAU
Could have stopped there, but let's read on
>This is a MOE merge of X2, X4, X1, and X3 creating a 4x6B - 24B parameters model, compressed to 19B "in size".
>The full power of every version is in this model.
beyond retard
Anonymous
10/25/2025, 7:20:35 AM
No.107001228
[Report]
>>107001235
>>107001192
kobold already has a primitive version of it and an anon from the diffusion threads is making a game-engine-like thing for diffusion and LLMs. probably less than a year
Anonymous
10/25/2025, 7:22:52 AM
No.107001235
[Report]
>>107001292
>>107001228
That's gonna be sick.
Right now I just barely have fun with chatbots and roleplaying. I need visual stimuli to really get going.
I'd rather read a fucking book than chat with a bot at this point, honestly. I need it to have more going for it and image generation that gets increasingly more sophisticated would be it for me.
Not just for jerking off, I mean for roleplaying like dungeon and dragons type of shit.
That would be revolutionary.
Anonymous
10/25/2025, 7:38:17 AM
No.107001292
[Report]
>>107001429
>>107001235
I'm working on a frontend like that, but with only pregenerated images to keep it realtime and not looking like shit
Anonymous
10/25/2025, 7:48:57 AM
No.107001336
[Report]
>>107001079
Please have one of your characters get hit by a truck and transmigrated from one of the scenarios you are running to a different one that's already in progress.
Anonymous
10/25/2025, 7:48:57 AM
No.107001337
[Report]
>>107001376
We haven't reached AGI until I can smell the character I'm talking to.
Anonymous
10/25/2025, 7:54:20 AM
No.107001358
[Report]
AIIEEEEE STOP MAKING YOUR OWN FRONTENDS JUST USE SERVICETESNOR
IT'S LITERALLY RIGHT THERE JUST USE IT
Anonymous
10/25/2025, 7:55:58 AM
No.107001369
[Report]
>>{{char}} asshole contains an intoxicating musk odour that is always mentioned when her ass is present, or being used in a sexual manner, detail the smell
Anonymous
10/25/2025, 7:56:39 AM
No.107001376
[Report]
>>107001337
>Want to chat with Miss Piggy
>Be into brap-play
>She hits you with a saucy smelly line
>You can literally get a whiff of her from the conversation alone
>She smells like she had chili for breakfast, lunch, and dinner.
what do "adventure" roleplayers even do? a dragon comes up
*he kills the dragon*
how low IQ do you have to be to enjoy this shit?
Anonymous
10/25/2025, 7:59:14 AM
No.107001396
[Report]
>>107001377
There's more to it than that, obviously.
Good roleplay would be the chatbot keeping track of your stats, your choices, your karma, your equipment, your map, your destination and previous locations, all of that shit a Game Master would normally handle for you.
And if you're not a retard, you'd respond with reasonable actions in line with your background and take everything else into context as well.
I think DnD roleplay is somewhat harder to do right now because of the context capacity. But that's increasing over time, so we'll get there eventually, I think.
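The bookkeeping side is really just state management plus a serializer. A sketch, with all field names invented:
```js
// Hypothetical GM-side world state: the frontend owns the numbers and
// re-injects a compact block into the prompt each turn, so the model
// never has to "remember" them.
const world = {
  player: { hp: 24, karma: 3, gold: 112, equipment: ['longsword', 'torch'] },
  location: 'Millbrook',
  visited: ['Millbrook'],
  quests: [{ id: 'rats', desc: 'Clear the cellar', done: false }],
};

// Serialize tracked state into the block prepended to every prompt.
function stateBlock(w) {
  return [
    `HP ${w.player.hp} | karma ${w.player.karma} | gold ${w.player.gold}`,
    `Equipment: ${w.player.equipment.join(', ')}`,
    `Location: ${w.location} (visited: ${w.visited.join(', ')})`,
    `Open quests: ${w.quests.filter((q) => !q.done).map((q) => q.desc).join('; ')}`,
  ].join('\n');
}
```
Keeping the sheet outside the chat log and re-injecting it each turn also sidesteps a lot of the context-capacity problem, since the model never has to dig the numbers back out of 30k tokens of prose.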
Anonymous
10/25/2025, 8:05:40 AM
No.107001429
[Report]
>>107001489
>>107001292
why bother if it can never be more than a wrapper? the game engine seems like a step in the right direction since all the big game engines are such resource hogs
Any Ling-T1 users on? Curious how it's different from K2 0905
Anonymous
10/25/2025, 8:12:00 AM
No.107001466
[Report]
>>107001461
they both suck. use mixtral 8x7B instead
Anonymous
10/25/2025, 8:15:07 AM
No.107001489
[Report]
>>107001512
>>107001429
Not sure what you mean, it is a "game engine" in that it keeps a world state and does tool calling and all that stuff. Traditional game engines are fine for cloud AI stuff but for local they would just be competing for resources with the model, and I don't want to compromise on that
Anonymous
10/25/2025, 8:19:02 AM
No.107001512
[Report]
>>107001577
>>107001489
are you retarded? what does saving tiny states have to do with competing resources? are you high?
Anonymous
10/25/2025, 8:33:38 AM
No.107001577
[Report]
>>107001512
Not sure what the problem is, I was saying that traditional game engines (Unreal, Unity) would compete for resources, but a light 2D engine shouldn't just be considered a "wrapper" because it still keeps state and manages world logic
Anonymous
10/25/2025, 8:39:50 AM
No.107001605
[Report]
Anonymous
10/25/2025, 10:23:21 AM
No.107002039
[Report]
>>107001377
Instead of killing the dragon in one sentence you should be fucking the dragon for 10 paragraphs while the princess watches.
Anonymous
10/25/2025, 10:46:08 AM
No.107002144
[Report]
>there isn't any reason why this "general" actually exists except jannies' leniency
Anonymous
10/25/2025, 10:50:04 AM
No.107002161
[Report]
>>107002189
>>107001377
>american teenager: the thread
>>107002161
yeah nothing says maturity like pretending to kill dragons in a sillytavern roleplay
>>107002189
Nothing says NIGGER like a lack of imagination
Anonymous
10/25/2025, 11:05:23 AM
No.107002258
[Report]
>>107002189
>>107002247
American nigger roleplay wins it all. 4chan is the best example of this behaviour.
Anonymous
10/25/2025, 11:06:03 AM
No.107002264
[Report]
>>107002247
NIGGER???????????
Anonymous
10/25/2025, 11:06:24 AM
No.107002268
[Report]
It's better to run LLMs locally (faster response time, and nothing leaves your machine for, say, Discord trannies, Chicoms and Jeet scammers to sell your usage data). You could build a computer that mainly uses CPUs to run it for AI purposes on the low end rather than focusing on GPU-powered LLMs for text generation.
Anonymous
10/25/2025, 11:07:00 AM
No.107002276
[Report]
>>107002471
Anonymous
10/25/2025, 11:07:09 AM
No.107002277
[Report]
>>107002189
>maturity
Bet you think mesugaki slop is the pinnacle of modern writing and creativity. /s
Anonymous
10/25/2025, 11:07:53 AM
No.107002283
[Report]
>>107002610
my "list of what the retarded llm should be instructed not to do prepended to all prompts.txt" keeps growing and maybe someday I'll have a .txt as big as the claude system prompt
today I just added "Never write polyfills in the context of JavaScript" after the one more time that was too many where it just decided my lack of polyfills was a bug that needed to be fixed even though it was not prompted in any way to do that
using LLMs feels like meeting a script kiddie from 10 years ago who learned how to program from the old w3schools, and you constantly find new things to tell them not to do or features they aren't aware of until they're told they exist
by default, if not instructed to use the most modern facilities available in (insert latest Node version), they constantly wrap shit in Promises manually too
like, bruh, we have async/await and most libs have async variants, jesus
even the SOTA models like GPT-5 and Gemini do this kind of retarded shit constantly
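For the Promise-wrapping habit specifically, the before/after is stark. A minimal example (readConfig is a made-up function, but the pattern is exactly what gets emitted):
```js
const fs = require('node:fs');
const { readFile } = require('node:fs/promises');

// What the models keep emitting: hand-rolled wrapping of a callback API
// that Node stopped requiring years ago.
function readConfigOld(path) {
  return new Promise((resolve, reject) => {
    fs.readFile(path, 'utf8', (err, data) => {
      if (err) reject(err);
      else resolve(JSON.parse(data));
    });
  });
}

// What current Node already provides: the promise-based fs API plus
// async/await. Same behavior, no ceremony.
async function readConfig(path) {
  return JSON.parse(await readFile(path, 'utf8'));
}
```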
Anonymous
10/25/2025, 11:10:54 AM
No.107002307
[Report]
>>107002323
>/s
Anonymous
10/25/2025, 11:13:08 AM
No.107002323
[Report]
>>107002307
Just in case you don't understand sarcasm =)
This thread should not exist.
Anonymous
10/25/2025, 11:21:48 AM
No.107002366
[Report]
Minimax M2 is dogshit, not to mention giga cucked.
Don't know why I even tried it when it was just pushed by shills with memebenches.
Anonymous
10/25/2025, 11:43:27 AM
No.107002471
[Report]
>>107002276
The 1024GB M5 Ultra Mac Studio will be crazy for AI. Literally what we've been waiting for.
Anonymous
10/25/2025, 11:47:13 AM
No.107002484
[Report]
>>107001881
the voice still sucks
Anonymous
10/25/2025, 12:06:31 PM
No.107002585
[Report]
>>106996812
>70W
noice
>GDDR6 / 128-bit bus / 224GB/s
gah
>400~ euros
meh, I mean I guess it's good if you don't have a server with 8/12 channels
Still, 16GB is a bit too low. Now if this was let's say 32GB for 700~ then yeah, I'd probably get one for a consumer board PC to do inference stuff.
>>107002283
it's funnier: last week I asked my junior to write a function extending the attachment parsers to also include images (which needs async logic) and he came back to me with a Promise.all monstrosity (along with a useless bunch of if/else checks). I told him that it's 2025 and promises are 100% verboten in this project. He fixed it later, but I suspect this guy is just generating straight from claude and pasting whatever shit it gives him, testing if it works and then making a PR.
Anonymous
10/25/2025, 12:47:56 PM
No.107002783
[Report]
>>107002610
>harshing the vibe-coding
Anonymous
10/25/2025, 12:50:01 PM
No.107002795
[Report]
>>107002965
Anonymous
10/25/2025, 1:11:19 PM
No.107002909
[Report]
>>107002936
>>107002610
even when there are moments you'd want to reach for something like Promise.all, Promise.all is never the answer
if you have a large array of concurrent tasks to execute in parallel, you want your executeThisShit() function to have at least a parameter setting a hard concurrency limit, so that a large array of tasks doesn't suddenly fire trillions of I/O or API calls.
Promise.all is a bad API designed by mongoloids
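Something like this bounded worker pool is what that parameter usually looks like in practice (a sketch; executeThisShit and the default limit are just illustrative names):
```js
// A bounded executor: `limit` workers pull task thunks off a shared
// cursor, so at most `limit` promises are in flight at once. The small
// Promise.all over the worker pool is fine, since it's bounded by `limit`
// rather than by the size of the input array.
async function executeThisShit(tasks, limit = 8) {
  const results = new Array(tasks.length);
  let next = 0;

  const workers = Array.from(
    { length: Math.min(limit, tasks.length) },
    async () => {
      while (next < tasks.length) {
        const i = next++; // safe: no await between the check and the increment
        results[i] = await tasks[i]();
      }
    },
  );

  await Promise.all(workers);
  return results;
}

// Usage: pass thunks, not already-started promises, or nothing is limited.
// const pages = await executeThisShit(urls.map((u) => () => fetch(u)), 4);
```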
Anonymous
10/25/2025, 1:15:18 PM
No.107002936
[Report]
>>107003279
>>107002909
JS of any flavour is, always has been, and shall forever be AIDS
Anonymous
10/25/2025, 1:20:25 PM
No.107002965
[Report]
Anonymous
10/25/2025, 1:25:37 PM
No.107002988
[Report]
>>107001079
>>107001107
Post it. I'd love to see any stat tracker, multi context frontend that actually works.
lmao. idk how long this has been a thing, but youtube channel authors now have a "video ideas" section with a list of AI-generated video titles and previews trying to fit your channel's topic. you can then expand each one and it gives you the most generic, sloppiest plan for the video, with bullet lists and multiple "it's not x, it's y". I hope this doesn't catch on
Anonymous
10/25/2025, 1:27:01 PM
No.107002998
[Report]
>>107002467
I'd like some. Thank you.
Anonymous
10/25/2025, 1:27:58 PM
No.107003003
[Report]
The reddit mod created lmg because he couldn't dominate aicg. As a concept this is dead.
Anonymous
10/25/2025, 1:30:18 PM
No.107003020
[Report]
>>107002996
It will catch on. My national news site has had an "AI created summary" for a year by now, but it's actually faster to cursorily read the articles because they are news stories anyway and not fucking novels.
Idiocracy is here to stay.
Anonymous
10/25/2025, 1:32:53 PM
No.107003035
[Report]
>>107002996
>Here's what we'll replace you with
Anonymous
10/25/2025, 1:49:25 PM
No.107003135
[Report]
>>107003157
>>107002996
That's already a thing. Watched a video 9 months ago describing a workflow that started with "make some high traffic videos" and proceeded to research, plan, then puke out dozens of slop videos for TikTok using LLM and video gen tools.
Dead internet etc.
Anonymous
10/25/2025, 1:52:46 PM
No.107003157
[Report]
>>107003199
>>107003135
Oh yeah, the sloppening is in full swing, only it isn't always clear how far down the ride we are
Anonymous
10/25/2025, 1:54:06 PM
No.107003167
[Report]
>>107002356
> dipsy laughs in the shadows
Anonymous
10/25/2025, 1:54:13 PM
No.107003168
[Report]
>>107001461
So nobody has tried this fat thing? I'll take one for the team and test both ling and ring out. I don't expect much.
Anonymous
10/25/2025, 1:58:33 PM
No.107003199
[Report]
>>107003262
>>107003157
There are redd*t threads full of "how do I get my parents to stop believing fake videos on fb" already. I'd give it another year or so for the fb meltdown.
Problem for them is, the vids keep getting better.
Which is great bc im eagerly awaiting full real time video and audio rp.
Anonymous
10/25/2025, 2:07:33 PM
No.107003243
[Report]
>>107003265
Anyone else today or just me?
Anonymous
10/25/2025, 2:09:48 PM
No.107003262
[Report]
>>107003298
>>107003199
Yes, they're here on /g/ too.
Anonymous
10/25/2025, 2:10:01 PM
No.107003265
[Report]
>>107003243
Works on my machine with ServiceTesnor™ and ik_llama®-server, so the problem is on your side.
so I was curious if GLM 4.6 really fixed its repetition issue and tried it on their official chat so no one can come and tell me I'm running the wrong quants, the wrong settings or whatever
>Actually, I think there's still an issue with the return type. Let me fix it using function overloads:
>Actually, I think I'm overcomplicating this. Let me simplify the implementation and make it more robust:
>Actually, I think there's still an issue with the return type. Let me fix it once more:
>I think I'm overthinking this. Let me simplify the implementation:
>Actually, I think there's still an issue with the return type. Let me fix it once more:
>Actually, I think I'm overcomplicating this. Let me simplify the implementation and make it more robust:
etc etc etc, it went on and on and on for 20k tokens and was still going, pasting the function it genned in its thoughts right after one of those lines and doing it again and again and again and again and again and again and again and again
I will never, ever believe people who say GLM isn't broken again
bullshit
that lab doesn't know how to make models at all
this thread is filled with chink shills
>>107002936
Nah, man. Modern JavaScript is great. Once you add TypeScript and a build step with 3000 dependencies that take up 3 GB in node_modules and 25 MB when minified, it's almost as good as any other language. In fact, it's so great it should be used everywhere, including backend, mobile, and desktop.
Anonymous
10/25/2025, 2:14:52 PM
No.107003289
[Report]
>>107003344
>>107003267
Damn, it's rare to see somebody with a skill issue so big he can't even use the chat.
Anonymous
10/25/2025, 2:15:12 PM
No.107003290
[Report]
>>107003279
Shut up microsoft.
Anonymous
10/25/2025, 2:17:03 PM
No.107003298
[Report]
>>107003307
>>107003262
The /g/ catalog is full of tourist consumers so that doesn't surprise me.
Anonymous
10/25/2025, 2:18:05 PM
No.107003307
[Report]
>>107003322
>>107003298
Modern 4chan is like 90%+ reddit crossposters
>>107003307
Have you tried telling them to go back?
Anonymous
10/25/2025, 2:25:12 PM
No.107003344
[Report]
>>107003388
>>107003289
>skill issue
>on something that has never happened to me with literally any other LLM: DeepSeek, the various Qwen in their various parameter sizes, Gemma, GPT-OSS, or the online API models GPT-5, Gemini 2.5 Pro and so on
I am sure it's definitely a skill issue with me, you are right... chink shill.
Anonymous
10/25/2025, 2:25:49 PM
No.107003353
[Report]
>>107003322
If they go back then I don't get fun reactions when I post naughty things that make the jannies cry
Anonymous
10/25/2025, 2:31:29 PM
No.107003388
[Report]
>>107003344
Do we know which quant the chat runs? Paste full prompt somewhere if you want some advice for overcoming the skill ish
Anonymous
10/25/2025, 2:34:37 PM
No.107003406
[Report]
Anonymous
10/25/2025, 2:47:02 PM
No.107003468
[Report]
>>107003279
I like minimalistic oldschool JS, it's reasonably fast
Anonymous
10/25/2025, 3:02:31 PM
No.107003571
[Report]
Anonymous
10/25/2025, 3:03:01 PM
No.107003580
[Report]
>>107003267
>quants model
>omg it's dumb
Anonymous
10/25/2025, 3:26:33 PM
No.107003747
[Report]
>>107001881
Looks and sounds kinda shit still.