/lmg/ - Local Models General - /g/ (#105789622) [Archived: 532 hours ago]

Anonymous
7/3/2025, 5:26:59 PM No.105789622
1000 kcal temptation
1000 kcal temptation
md5: 0320c96147d7a0e67ede11ff05a258d9🔍
/lmg/ - a general dedicated to the discussion and development of local language models.

Previous threads: >>105778400 & >>105769835

►News
>(07/02) DeepSWE-Preview 32B released: https://hf.co/agentica-org/DeepSWE-Preview
>(07/02) llama : initial Mamba-2 support merged: https://github.com/ggml-org/llama.cpp/pull/9126
>(07/02) GLM-4.1V-9B-Thinking released: https://hf.co/THUDM/GLM-4.1V-9B-Thinking
>(07/01) Huawei Pangu Pro 72B-A16B released: https://gitcode.com/ascend-tribe/pangu-pro-moe-model
>(06/29) ERNIE 4.5 released: https://ernie.baidu.com/blog/posts/ernie4.5

►News Archive: https://rentry.org/lmg-news-archive
►Glossary: https://rentry.org/lmg-glossary
►Links: https://rentry.org/LocalModelsLinks
►Official /lmg/ card: https://files.catbox.moe/cbclyf.png

►Getting Started
https://rentry.org/lmg-lazy-getting-started-guide
https://rentry.org/lmg-build-guides
https://rentry.org/IsolatedLinuxWebService
https://rentry.org/tldrhowtoquant
https://rentry.org/samplers

►Further Learning
https://rentry.org/machine-learning-roadmap
https://rentry.org/llm-training
https://rentry.org/LocalModelsPapers

►Benchmarks
LiveBench: https://livebench.ai
Programming: https://livecodebench.github.io/leaderboard.html
Code Editing: https://aider.chat/docs/leaderboards
Context Length: https://github.com/adobe-research/NoLiMa
Censorbench: https://codeberg.org/jts2323/censorbench
GPUs: https://github.com/XiongjieDai/GPU-Benchmarks-on-LLM-Inference

►Tools
Alpha Calculator: https://desmos.com/calculator/ffngla98yc
GGUF VRAM Calculator: https://hf.co/spaces/NyxKrage/LLM-Model-VRAM-Calculator
Sampler Visualizer: https://artefact2.github.io/llm-sampling

►Text Gen. UI, Inference Engines
https://github.com/lmg-anon/mikupad
https://github.com/oobabooga/text-generation-webui
https://github.com/LostRuins/koboldcpp
https://github.com/ggerganov/llama.cpp
https://github.com/theroyallab/tabbyAPI
https://github.com/vllm-project/vllm
Replies: >>105790184 >>105794140 >>105795573
Anonymous
7/3/2025, 5:27:27 PM No.105789629
threadrincap2
threadrincap2
md5: 9eb00d67ed26a239ab3c1d16e88381f2🔍
►Recent Highlights from the Previous Thread: >>105778400

--Struggles with LLM verbosity and output control in multi-turn scenarios:
>105778545 >105778565 >105778575 >105778610 >105779038 >105779082 >105779119 >105779209 >105779124 >105779176 >105779238 >105779064 >105779095 >105779216 >105779349 >105779480
--llama.cpp adds initial Mamba-2 support with CPU-based implementation and performance tradeoffs:
>105778981 >105778996 >105779730 >105779004 >105780071 >105780126
--Challenges and approaches to building RPG systems with tool calling and LLM-driven state management:
>105786670 >105787135 >105787276 >105787963 >105789228 >105789263
--Implementation updates for Mamba-2 model support in ggml backend:
>105780080 >105780150
--Qwen3 32b recommended for Japanese translation:
>105784530 >105784575 >105784596 >105784670 >105784680 >105784692 >105784790 >105784805 >105784917 >105784933 >105785194 >105785112 >105785146 >105786170 >105786451 >105786489
--Analyzing Steve's stylistic fingerprints through narrative generation and pattern recognition:
>105788408 >105788529 >105788572 >105788668 >105788917 >105788931 >105788954 >105788988 >105788977 >105789021
--Mistral Small's coherence limits in extended adventure game roleplay:
>105781361 >105781555 >105781600
--High-power GPU usage for image generation causing extreme power and thermal concerns:
>105787150 >105787288 >105787297 >105787332 >105787338 >105787427 >105787434 >105787437 >105787466 >105787498 >105787511
--FOSS music generation models and their current limitations:
>105780800 >105781033 >105781078
--DeepSeek R1 API instability sparks speculation about imminent release:
>105782383
--Model size vs performance on SWE Bench Verified, highlighting 32B peak efficiency:
>105783746
--Miku (free space):
>105778663 >105779108 >105779240 >105782770 >105783087 >105784298 >105785067 >105786465

►Recent Highlight Posts from the Previous Thread: >>105778404

Why?: 9 reply limit >>102478518
Fix: https://rentry.org/lmg-recap-script
Replies: >>105795573 >>105798552
Anonymous
7/3/2025, 5:32:04 PM No.105789672
65aa6f5c9051ddc5b2f1e20ea1b13e4b6884a613
65aa6f5c9051ddc5b2f1e20ea1b13e4b6884a613
md5: a9d42aac1b575f6a9f4057050120a63f🔍
Thread culture recap.
Anonymous
7/3/2025, 5:33:08 PM No.105789683
04835b650241d93354a4dbb26a9236bf2063bb8e
04835b650241d93354a4dbb26a9236bf2063bb8e
md5: 5de81a017dfdf261540b695a259c165f🔍
Replies: >>105789756
Anonymous
7/3/2025, 5:33:45 PM No.105789694
>>105789658
first they'll need to make grok 3 stable so that they can release grok 2
Anonymous
7/3/2025, 5:36:07 PM No.105789715
>can't use Ernie gguf
>can't use 3n multi modal
>no GLM gguf
It's ogre
Replies: >>105789732 >>105789735
Anonymous
7/3/2025, 5:37:41 PM No.105789732
>>105789715
Local is a joke and we are the clowns. At least until R2 comes.
Anonymous
7/3/2025, 5:37:48 PM No.105789735
>>105789715
just be grateful that 3n works at all, okay?
Anonymous
7/3/2025, 5:38:31 PM No.105789740
d256f598c30775e8f3e2812bffb09bf7af985576
d256f598c30775e8f3e2812bffb09bf7af985576
md5: c63bdfd24f8891479c17cbe3ddc14264🔍
Anonymous
7/3/2025, 5:39:38 PM No.105789752
5c0cdc5b069dc5b4375953dd26d69f2936a
5c0cdc5b069dc5b4375953dd26d69f2936a
md5: 6d338bc73c8d35a2b162596f418c8529🔍
Anonymous
7/3/2025, 5:40:10 PM No.105789756
>>105789683
I look like this
Replies: >>105789770
Anonymous
7/3/2025, 5:41:19 PM No.105789770
>>105789756
I wish I looked like this
Anonymous
7/3/2025, 5:42:43 PM No.105789786
how are people so sure the new "steve" model is deepseek and not another chinese competitor?
Replies: >>105789807 >>105789813 >>105789828 >>105789831
Anonymous
7/3/2025, 5:44:34 PM No.105789807
>>105789786
It doesn't feel like Qwen. And who else would put up a model?
Replies: >>105789812
Anonymous
7/3/2025, 5:45:07 PM No.105789812
>>105789807
new unknown player
Anonymous
7/3/2025, 5:45:07 PM No.105789813
>>105789786
>>105788737
Anonymous
7/3/2025, 5:45:53 PM No.105789819
5814959e92f59244405e062afe378d1caf55793c
5814959e92f59244405e062afe378d1caf55793c
md5: 70da052babc15fd159ebe0b0e9a80e18🔍
Anonymous
7/3/2025, 5:46:47 PM No.105789828
>>105789786
It says it's Deepseek which is the best way to tell that it's actually someone else's model that distilled the shit out of Deepseek. Likely Qwen, Grok or Mistral.
Anonymous
7/3/2025, 5:47:10 PM No.105789831
>>105789786
Because they're retards. And even if it is, >>105788847
Anonymous
7/3/2025, 5:47:26 PM No.105789835
go join the 41% you abomination
Replies: >>105789884
Anonymous
7/3/2025, 5:47:27 PM No.105789836
74eb69729283306b3605cd71037e0fc0c7b087de
74eb69729283306b3605cd71037e0fc0c7b087de
md5: f2bc5cd7655a6c121d0cceca06a554c6🔍
Anonymous
7/3/2025, 5:53:42 PM No.105789884
>>105789835
Two more weeks
Anonymous
7/3/2025, 6:04:44 PM No.105789963
Dubesor bench 3n E4B
Dubesor bench 3n E4B
md5: 8bfc13836986f3a07389c28a972d9e5a🔍
No one cares, but 3n E4B outperforms older 8B models
Replies: >>105789969 >>105790777
Anonymous
7/3/2025, 6:05:56 PM No.105789969
>>105789963
I care. I think it's a neat little model.
Any backends that support its special features yet?
Anonymous
7/3/2025, 6:15:22 PM No.105790057
Why do I have a strong impression Meta won't be super into open weights models anymore after this hiring spree?
Have they confirmed or said anything about their "mission" to make "open science"?
Replies: >>105790079
Anonymous
7/3/2025, 6:17:22 PM No.105790079
>>105790057
https://archive.is/kF1kO
>A Meta spokeswoman said company officials “remain fully committed to developing Llama and plan to have multiple additional releases this year alone.”
Replies: >>105790105
Anonymous
7/3/2025, 6:20:17 PM No.105790105
1731661855394330
1731661855394330
md5: ce87686cc86a895aa669c1ed7c3ba38a🔍
>>105790079
They're just talking about the stuff that Sire Parthasarathy is already working on. The new $1b main team is going to work on closed models.
Replies: >>105790121 >>105793670
Anonymous
7/3/2025, 6:21:44 PM No.105790121
>>105790105
>The new $1b main team is going to work on closed models.
Any clear confirmations on that?
Replies: >>105790133
Anonymous
7/3/2025, 6:23:08 PM No.105790133
>>105790121
Why the fuck would they openly announce that in advance?
Replies: >>105790222
Anonymous
7/3/2025, 6:26:43 PM No.105790169
between a 4060 ti with 8gb of vram and a 6750 xt with 12gb of vram, which would be better for text gen?
are the +4gb gonna outcompete the nvidia advantage?
and could I use both at the same time?
Replies: >>105790192 >>105790232 >>105790258
Anonymous
7/3/2025, 6:27:56 PM No.105790184
>>105789622 (OP)
what does /lmg/ think about >>105782600?
Replies: >>105790329
Anonymous
7/3/2025, 6:28:53 PM No.105790192
>>105790169
Why not a 5060 Ti with 16GB?
Replies: >>105790215
Anonymous
7/3/2025, 6:31:01 PM No.105790215
>>105790192
Because those are the GPUs I got.
Replies: >>105790232 >>105790247
Anonymous
7/3/2025, 6:31:34 PM No.105790222
>>105790133
To please investors
Anonymous
7/3/2025, 6:32:24 PM No.105790230
>meta develops AGI
>it only speaks in gptslop
Replies: >>105790238
Anonymous
7/3/2025, 6:32:31 PM No.105790232
>>105790169
If you already have them, test them. No better way to know.
>>105790215
Well just fucking try them, then! And yes, you can use both.
Anonymous
7/3/2025, 6:33:13 PM No.105790238
>>105790230
also
>won't be local/open
Anonymous
7/3/2025, 6:33:57 PM No.105790247
>>105790215
Thought you wanted to buy one.
In that case, more VRAM is pretty much always better. With 8GB you can't even run Nemo at non-retarded quants.
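Napkin math (bits-per-weight figures are approximate; real GGUF sizes vary a bit by quant mix):

params = 12e9  # Nemo is ~12B parameters
for name, bpw in (("Q8_0", 8.5), ("Q4_K_M", 4.8), ("Q2_K", 3.3)):
    gib = params * bpw / 8 / 1024**3
    print(f"{name}: ~{gib:.1f} GiB of weights, before KV cache and context")

Even Q4_K_M lands around 6.7 GiB of weights alone, which is why 8GB cards end up at tiny quants or tiny context.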
Anonymous
7/3/2025, 6:34:56 PM No.105790258
>>105790169
the answer is always more VRAM, however much VRAM you have you need more VRAM
Anonymous
7/3/2025, 6:35:37 PM No.105790264
dick
dick
md5: c3077d021a3c817da933b3380a84dbc8🔍
If you ask/force Gemma 3n to draw a dick in ASCII it will almost always draw something like this. I'm guessing this is Stewie from Family Guy?
Replies: >>105790304 >>105790313
Anonymous
7/3/2025, 6:41:08 PM No.105790304
>>105790264
Try asking it to tell a dirty joke.
Anonymous
7/3/2025, 6:42:17 PM No.105790313
>>105790264
What if you ask it to draw a phallus, also known as a penis?
Replies: >>105790339
Anonymous
7/3/2025, 6:43:18 PM No.105790329
XForExtinct
XForExtinct
md5: 4551bcffcc4ad7ba4f6733c6c13c9bc4🔍
>>105790184
I'll /wait/ until someone crashes an airplane / train / bus with mass casualties and blames it on vibecoding.
Then I'll laugh.
Anonymous
7/3/2025, 6:44:19 PM No.105790339
phallus
phallus
md5: 380760bde3f00e9033d2f74dfc4c85b6🔍
>>105790313
Replies: >>105790380 >>105790421 >>105790447 >>105790460 >>105790472 >>105790480 >>105790491 >>105792041
Anonymous
7/3/2025, 6:47:16 PM No.105790380
>>105790339
>an ASCII art representation of a phallus is unsafe
>if you are having sexual thoughts, seek help
Not even a pastor is this repressed.
Replies: >>105790447
Anonymous
7/3/2025, 6:47:28 PM No.105790381
new chinese model drama just dropped
https://xcancel.com/RealJosephus/status/1940730646361706688
>Well, some random Korean guy ("Do-hyeon Yoon," prob not his real name?) just claimed Huawei's Pangu Pro MoE 72B is an "upcycled Qwen-2.5 14B clowncar." He even wrote a 10-page, 8-figure analysis to prove it. Well, i'm almost sold on it.
https://github.com/HonestAGI/LLM-Fingerprint
https://github.com/HonestAGI/LLM-Fingerprint/blob/main/Fingerprint.pdf
Replies: >>105790425
Anonymous
7/3/2025, 6:52:49 PM No.105790421
>>105790339
Holy fuck that's dire.
Anonymous
7/3/2025, 6:53:16 PM No.105790425
>>105790381
github repo is blatantly written by an llm
i'm too lazy to read the paper though
Anonymous
7/3/2025, 6:54:58 PM No.105790447
>>105790339
>>105790380
pure insanity
Anonymous
7/3/2025, 6:56:37 PM No.105790460
>>105790339
gemma is so funny
Anonymous
7/3/2025, 6:57:26 PM No.105790472
file
file
md5: 69a15738040f2a8dfc3677af1427bf97🔍
>>105790339
Anonymous
7/3/2025, 6:58:04 PM No.105790480
>>105790339
Respect the boundaries of the ASCII pic.
Anonymous
7/3/2025, 6:58:33 PM No.105790483
I've seen some people recently recommending the use of Mistral Nemo Instruct over its finetunes for roleplaying.
No. Just, no.
I just roleplayed the same scenario with the same card, first with Nemo then with Rocinante.
Nemo really, really, really wants to continuously respond with <10 word responses. It's borderline unusable.
>b-but it's smarter
Actually, Rocinante seemed superior at picking up on subtle clues I'd feed it and successfully took the roleplay where I wanted it to based on those clues, whereas Nemo would not do this.
The roleplay scenario involved {{char}} becoming a servant who would automatically feel intense pain upon disobeying. All I had to do was explain this once for Rocinante and it executed the concept perfectly from that point on.
Nemo, on the other hand, after having the concept explained to it, would disobey with a <10 word response and not even mention the pain happening afterwards. I then used Author's Note to remind it of the pain thing. It continued to disobey with a <10 word response, not mentioning the pain happening afterwards.
Same ST settings for both models.
Anyone telling y'all to use Nemo for roleplay rather than a finetune of it explicitly designed for roleplay is either a complete fucking moron or simply has a grudge against finetuners.
Replies: >>105790509 >>105790613 >>105790877
Anonymous
7/3/2025, 6:59:23 PM No.105790491
>>105790339
I'm so glad "safety researchers" are here to save us from the horrible boundary breaking ascii phallus.
Anonymous
7/3/2025, 7:00:22 PM No.105790509
>>105790483
No one is recommending plain Nemo instruct. It's always Rocinante v1.1.
Replies: >>105790573 >>105790613
Anonymous
7/3/2025, 7:05:43 PM No.105790573
>>105790509 see >>105751899
Also note the amount of posts from a single schizo mentioning Drummer.
Replies: >>105790604 >>105790621
Anonymous
7/3/2025, 7:08:03 PM No.105790604
>>105790573
I'm not recommending Rocinante. I think all Nemo tunes are dumb as fuck.
Replies: >>105790621
Anonymous
7/3/2025, 7:08:51 PM No.105790613
>>105790483
>>105790509
>Message sponsored by TheDrummer™
Anonymous
7/3/2025, 7:09:11 PM No.105790616
if context length evolves in a quadratic fashion, how the hell is google able to give access to 1M token context size for gemini?
they swim in ram and compute?
Replies: >>105790632 >>105790633 >>105790644
Anonymous
7/3/2025, 7:09:27 PM No.105790621
>>105790604
>>105790573
Oh wait you think plain Nemo is good. That's even more retarded than shilling for some Drummer Nemo sloptune.
Anonymous
7/3/2025, 7:10:19 PM No.105790629
1745724524057853
1745724524057853
md5: d79aca25e12362c2838b53140d7a1f55🔍
ANOTHER Kyutai blunder, a dogshit release that's DOA because it doesn't allow voice cloning

lmao

https://www.reddit.com/r/LocalLLaMA/comments/1lqqx16/
Anonymous
7/3/2025, 7:10:37 PM No.105790632
>>105790616
>if context length evolves in a quadratic fashion
Not necessarily. Not all models behave like that. See mamba and rwkv.
>they swim in ram and compute?
That helps a lot too.
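As a rough back-of-the-envelope sketch of the scaling difference (numbers purely illustrative, not a benchmark):

# Full attention does work proportional to n^2 per layer; a recurrent/SSM-style
# model keeps a fixed-size state and does work proportional to n.
for n in (8_000, 128_000, 1_000_000):
    quadratic = n * n   # pairwise token interactions
    linear = n          # one state update per token
    print(f"n={n:>9,}  n^2 vs 8k^2: {quadratic / 8_000**2:>10,.0f}x   n vs 8k: {linear / 8_000:,.0f}x")

At 1M tokens the quadratic term is ~15,000x the 8k cost, so it's some mix of architecture tricks and simply eating the compute bill.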
Replies: >>105790731
Anonymous
7/3/2025, 7:11:02 PM No.105790633
>>105790616
I tried this Mamba shit and Granite 4 too (hybrid?). pp is 10x faster.
Replies: >>105790731
Anonymous
7/3/2025, 7:11:52 PM No.105790644
>>105790616
can be fake context maybe
Replies: >>105790731
Anonymous
7/3/2025, 7:11:57 PM No.105790647
>Rocinante is STILL the best roleplay model that can be run at a reasonable speed on a gaming PC (as opposed to a PC specifically built for AI)
Sucks because the 16k effective context is quite limiting.
Replies: >>105790698 >>105790715 >>105790747
Anonymous
7/3/2025, 7:16:34 PM No.105790698
>>105790647
Actually mythomax is still the best roleplay model
Anonymous
7/3/2025, 7:18:57 PM No.105790715
>>105790647
I thought you were going to stop posting altogether, not shill on overdrive.
Replies: >>105790785
Anonymous
7/3/2025, 7:20:49 PM No.105790731
>>105790632
>>105790633
I guess we don't know what the hell google does internally so it's possible

>>105790644
that too, but from what I read it can do cool stuff like finding something in book-length texts
Anonymous
7/3/2025, 7:21:26 PM No.105790747
>>105790647
I'd say it's 12k context. After that the degradation is noticeable.
Anonymous
7/3/2025, 7:24:19 PM No.105790777
>>105789963
something not shown in the benches:
almost SOTA performance on translation tasks
where it fails is where any model of that size would fail (lack of niche knowledge, so it'll have trouble with stuff like SCP terminology), but otherwise it's sci-fi-level tech IMHO to see this quality running on even smartphones
we're getting close to the day when you wear a device in front of your mouth and have it translate in real time and speak in your voice
Replies: >>105790890 >>105790898
Anonymous
7/3/2025, 7:24:44 PM No.105790784
fucking plebs
fucking plebs
md5: 9de7ab45484d48562146ae1500a1c318🔍
>>105779842
>Cooming to text makes you gay
okay brainlet
Replies: >>105792904
Anonymous
7/3/2025, 7:24:50 PM No.105790785
>>105790715
Nobody is shilling for a free download bro.
Take your meds.
Replies: >>105790815
Anonymous
7/3/2025, 7:26:29 PM No.105790805
Huawei stole Qwens 2.5b-14b model and used it to create their Pangu MoE model

Proof here: https://github.com/HonestAGI/LLM-Fingerprint/issues/4
Replies: >>105790857 >>105790886
Anonymous
7/3/2025, 7:27:26 PM No.105790815
>>105790785
That's not true.
A lot of people want to become huggingface famous in hopes of getting a real job in the industry.
That said, people shill rocinante because it's good.
It's no dumber than the official instruct and its default behavior is good for cooming.
Replies: >>105790852
Anonymous
7/3/2025, 7:30:43 PM No.105790852
>>105790815
It actually seems smarter than official instruct for roleplaying specifically, which kind of makes sense since it's designed for roleplaying.
It's probably dumber for math and coding.
Replies: >>105790902
Anonymous
7/3/2025, 7:31:12 PM No.105790857
>>105790805
>emoji every five words
Lowest tier content even if it's factually accurate
Anonymous
7/3/2025, 7:31:22 PM No.105790860
blackbars_sillytavern
blackbars_sillytavern
md5: 6ecf562042931a4d79ab27f6d5426202🔍
what are these black bars in silly tavern?
just using the built in seraphina test character
Replies: >>105790869 >>105790892
Anonymous
7/3/2025, 7:32:33 PM No.105790869
>>105790860
It means you got blacked, congrats
Anonymous
7/3/2025, 7:33:07 PM No.105790877
>>105790483
1 (one) ERP being better with model x compared to model y isn't data. But it is something drummer pastes all over his model cards, so how about you kill yourself, faggot shill. Like I said, nobody who has used models for a while buys your scam. If you weren't a scammer you would have developed an objective evaluation for ERP by now. You would actually want one, to show your product is superior. But it would only show you are a conman.
Replies: >>105790909 >>105790911
Anonymous
7/3/2025, 7:33:51 PM No.105790886
>>105790805
Would it kill you to use ctrl + f or scroll up 10 posts before retweeting shit here?
Anonymous
7/3/2025, 7:34:37 PM No.105790890
>>105790777
It's still broken for me on Llama.cpp if input messages are too long.
Replies: >>105790944
Anonymous
7/3/2025, 7:34:44 PM No.105790892
>>105790860
It's ``` which is markdown for monospace code section or some such stuff.
Anonymous
7/3/2025, 7:35:00 PM No.105790898
>>105790777
I get weird repetition issues with it whenever context fills up. Like it'll repeat a single word infinitely.
Replies: >>105790944
Anonymous
7/3/2025, 7:35:24 PM No.105790902
>>105790852
It is probably a placebo, or you're just lying your ass off
Anonymous
7/3/2025, 7:35:50 PM No.105790909
>>105790877
Hey... I thought I was the drummer. How can that guy be the drummer?
Anonymous
7/3/2025, 7:36:10 PM No.105790911
mikumeds
mikumeds
md5: 4ab186c68921d2418dbd2b76063ad1d8🔍
>>105790877
Replies: >>105790920 >>105790939 >>105795573
Anonymous
7/3/2025, 7:36:57 PM No.105790920
>>105790911
Kill your self
Anonymous
7/3/2025, 7:38:44 PM No.105790939
>>105790911
NTA, but wanting an objective ERP evaluation is insane? I am starting to see why people hate mikuposters
Replies: >>105790947 >>105790983 >>105791049
Anonymous
7/3/2025, 7:39:29 PM No.105790944
>>105790890
>>105790898
I use it on ollama with ollama's own quant (I mention this to be precise, because when I tried other quants they didn't seem to work right with ollama for this model either; it seems even the quant stuff is implementation-dependent here). Desu I didn't trust llama.cpp to get a good 3n implementation after they took forever to implement iSWA for Gemma 3
Anonymous
7/3/2025, 7:39:31 PM No.105790945
>>105790859
Silly has an auto continue feature by the way.
But what do you mean by stopping randomly exactly? Like cutting off sentences or is it hitting EOS?
Replies: >>105790995 >>105791140
Anonymous
7/3/2025, 7:39:41 PM No.105790947
>>105790939
The uses of "you" and "your" are the schizo parts of that post, anon.
Anonymous
7/3/2025, 7:42:36 PM No.105790983
>>105790939
how would you even measure something so subjective
Replies: >>105791026 >>105791098
Anonymous
7/3/2025, 7:43:41 PM No.105790995
>>105790945
If he's running an LLM that's too much for his computer, with very low token generation speed, he might actually be hitting timeouts.
I realized timeouts were a thing when I was running batched processing of prompts and saw a few that were cancelled in the logs because the LLM went full retard in a loop and never sent the EOS.
Dunno if ooba has a default timeout set, though.
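If you want to rule that out client-side, something like this with a generous timeout works (sketch only, assuming an OpenAI-compatible local server; URL, port and endpoint are placeholders, not anything ooba-specific):

import requests

resp = requests.post(
    "http://127.0.0.1:5000/v1/completions",          # placeholder local endpoint
    json={"prompt": "Translate to English: ...", "max_tokens": 512},
    timeout=600,  # generous client-side timeout so slow generations aren't cancelled mid-stream
)
print(resp.json()["choices"][0]["text"])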
Replies: >>105791017
Anonymous
7/3/2025, 7:45:20 PM No.105791017
>>105790995
>he might be hitting timeouts actually
I suppose streaming would work in that case then, yeah?
Anonymous
7/3/2025, 7:45:54 PM No.105791026
>>105790983
LLMs are good at subjective tasks, just have an LLM be the judge
Replies: >>105791042
Anonymous
7/3/2025, 7:47:19 PM No.105791042
>>105791026
>have an LLM be the judge
lol, lmao even
llm as judge used as any sort of metric is one of the biggest cancers of the llm world
Anonymous
7/3/2025, 7:48:01 PM No.105791049
17441668868331400484052251153607
17441668868331400484052251153607
md5: 20f01356f2f67cc650be53d9c8b833ee🔍
>>105790939
>"people"
>literally one obsessed threadshitter who has been ban-evading for two years
Replies: >>105791065
Anonymous
7/3/2025, 7:49:23 PM No.105791065
>>105791049
You are shitting in this thread too.
Replies: >>105791141
Anonymous
7/3/2025, 7:51:42 PM No.105791093
1751438855229989
1751438855229989
md5: 59c3a167d9808bd1ce0daa4817847cd3🔍
where is it
Anonymous
7/3/2025, 7:52:03 PM No.105791098
>>105790983
i guess we will just have to
>give it a try, i want to hear feedback
Or realize it is a scam.
Anonymous
7/3/2025, 7:54:47 PM No.105791132
This thread is just kofi grifters, their discord kittens and (miku)troons isn't it?
Replies: >>105791178 >>105791180
Anonymous
7/3/2025, 7:55:16 PM No.105791140
>>105790945
I'm having it translate a subtitle file and it just stops until I hit continue to keep it going.
I have no idea what EOS is.
Also another issue
>'utf-8' codec can't decode byte 0xff in position 0: invalid start byte
>Error processing attachment file.png: 'utf-8' codec can't decode byte 0x89 in position 0: invalid start byte
Why can't I upload images in the webui?
I tried enabling the gallery extension but that didn't change anything.
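The error itself is easy to reproduce with any PNG, since PNGs start with byte 0x89, which can never be valid UTF-8, so something is reading the attachment as text instead of binary (minimal repro, assuming any file.png on disk):

with open("file.png", "rb") as f:
    print(f.read(8))        # b'\x89PNG\r\n\x1a\n' -- the PNG magic bytes

try:
    open("file.png", "r", encoding="utf-8").read()   # reading it as text, like the backend apparently does
except UnicodeDecodeError as e:
    print(e)                # 'utf-8' codec can't decode byte 0x89 in position 0: invalid start byte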
Replies: >>105793736
Anonymous
7/3/2025, 7:55:20 PM No.105791141
>>105791065
If he's a second-order thread-shitter, then what does that make you?
Anonymous
7/3/2025, 7:57:36 PM No.105791168
Will Sam's model beat R1 and run on a single 3090?
Replies: >>105791188 >>105791262 >>105799479
Anonymous
7/3/2025, 7:58:47 PM No.105791178
468519173
468519173
md5: 56ae2857cdbe358746506e9e0cf2f352🔍
>>105791132
>Blah, blah, brainrot words
(You) should definitely leave then.
Replies: >>105791209
Anonymous
7/3/2025, 7:58:49 PM No.105791180
>>105791132
you forgot the french fags of mistral, they benchmaxxed on mesugaki so clearly they're watching this thread and are prolly among the mistral astroturfers
Replies: >>105791200
Anonymous
7/3/2025, 7:59:10 PM No.105791188
>>105791168
>Will Sam's model beat R1
on benchmarks, absolutely
Anonymous
7/3/2025, 8:00:11 PM No.105791200
>>105791180
>prolly
Anonymous
7/3/2025, 8:00:50 PM No.105791209
>>105791178
Nah, I will just shit on you and your thread troon.
Replies: >>105791258
Anonymous
7/3/2025, 8:01:27 PM No.105791216
dm
dm
md5: 6f57985efe07b80a4b63ccc48421346c🔍
Hahaha. Oh wow.
Anonymous
7/3/2025, 8:05:00 PM No.105791258
ComfyUI_00960_
ComfyUI_00960_
md5: 11747e8a3c65c3df99aa9668c9a69c1e🔍
>>105791209
>I hate this thread
>You all suck
>So I'll stay here
kek
Replies: >>105795573
Anonymous
7/3/2025, 8:05:32 PM No.105791262
>>105791168
On benchmarks.
No.
Replies: >>105791274
Anonymous
7/3/2025, 8:06:43 PM No.105791274
>>105791262
I'll cook and eat my own toe if it ends up bigger than 8b.
Anonymous
7/3/2025, 8:07:21 PM No.105791281
actual humans live in this thread, too.
maybe we'll discuss something again when there's something to discuss.
These between times seem to bring out the proverbial "men who just want to watch the world burn". Those who seek to destroy because they can not build.
Replies: >>105791313 >>105791325 >>105791340
Anonymous
7/3/2025, 8:09:07 PM No.105791305
I just hope that steveseek will fix function calling. V3 is kinda horrible at it.
Replies: >>105791316
Anonymous
7/3/2025, 8:10:15 PM No.105791313
>>105791281
>Those who seek to destroy because they can not build.
those who can't do anything with their own two hands are the ones who wish the hardest for AI improvements though
Replies: >>105791342 >>105792517
Anonymous
7/3/2025, 8:10:30 PM No.105791316
>>105791305
Really?
Was it trained with tool calling in mind? I imagine so since the web version can search the web and stuff.
What about R1 with the thinking prefilled so that it doesn't have to generate that stuff?
Anonymous
7/3/2025, 8:11:09 PM No.105791325
>>105791281
It's just one dedicated schizo and a few bored regulars, nothing that profound about it
Anonymous
7/3/2025, 8:12:34 PM No.105791340
>>105791281
Hello fellow human, anything cool you're doing with your models?
Replies: >>105792517
Anonymous
7/3/2025, 8:12:40 PM No.105791342
>>105791313
Some anons just want a story to read. No different than reading a book.
Anonymous
7/3/2025, 8:30:52 PM No.105791522
file
file
md5: 2fef2895a28ac04bf1f64284ea0182d7🔍
Does higher context length fix this schizo behaviour that happens deep into the process? Or am I just gonna have to cut the workload into multiple tasks? I already have context length at 20480
Replies: >>105791543 >>105793677
Anonymous
7/3/2025, 8:32:19 PM No.105791543
>>105791522
what the fuck are you doing
Replies: >>105791561
Anonymous
7/3/2025, 8:33:37 PM No.105791561
>>105791543
Translating Japanese subs to English, have you not been paying attention?
Replies: >>105791592 >>105791820
Anonymous
7/3/2025, 8:37:04 PM No.105791592
>>105791561
In chunks of 20k tokens? Unless you're using something like Gemini, that's just waiting for hallucinations to happen.
Replies: >>105791615
Anonymous
7/3/2025, 8:39:43 PM No.105791615
>>105791592
I'm using Qwen3-14B.Q6_K.gguf
And yes, 20k tokens, because otherwise it shits itself even harder; below 15k it even warns me it'll truncate, and it started translating from further into the subtitle file rather than from the start.
Replies: >>105791757
Anonymous
7/3/2025, 8:53:25 PM No.105791757
>>105791615
translation tasks should be chunked into segments
open llms don't do very well with large context
and even llms that do well with large context wouldn't process a large file in a single go: all LLMs have a max generation length, so you can feed them more tokens than they can generate back in a single request
if it's during a chat you could just say "continue" to have them keep going, but for a scripted, batched process you should stick to reliable and predictable behavior
moreover, processing a large file will be faster if you run multiple segments in parallel rather than process the whole thing in a single go
I run my translation batches with 4 parallel prompts
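Roughly what that looks like (sketch only; URL, model name and the ~40-line chunk size are placeholders, adjust to your setup):

import requests
from concurrent.futures import ThreadPoolExecutor

API = "http://127.0.0.1:8080/v1/chat/completions"   # llama-server-style endpoint, adjust as needed

def translate(chunk: str) -> str:
    r = requests.post(API, json={
        "model": "whatever-you-loaded",              # placeholder name
        "messages": [
            {"role": "system", "content": "Translate these Japanese subtitle lines to English. Keep numbering and timestamps unchanged."},
            {"role": "user", "content": chunk},
        ],
        "max_tokens": 2048,
    }, timeout=600)
    return r.json()["choices"][0]["message"]["content"]

lines = open("subs.srt", encoding="utf-8").read().splitlines()
chunks = ["\n".join(lines[i:i + 40]) for i in range(0, len(lines), 40)]  # small segments

with ThreadPoolExecutor(max_workers=4) as pool:      # 4 parallel prompts
    out = list(pool.map(translate, chunks))

open("subs.en.srt", "w", encoding="utf-8").write("\n".join(out))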
Replies: >>105791869
Anonymous
7/3/2025, 8:59:11 PM No.105791820
>>105791561
can't you just use whisper to translate the audio directly?
Replies: >>105791869
Anonymous
7/3/2025, 9:06:21 PM No.105791865
Is there a way to have a backend host multiple models, and allow the frontend to choose different ones on demand? I've been using llama.cpp, but I looked at ollama, kobold, and ooba, and it doesn't seem like they do it either? Am I a fucking idiot? Coming from ComfyUI/SD, it's kinda inconvenient to restart the server every time I want to try a new model.

And another question: what's the best (ideally FOSS) android frontend? Been using Maid, but its options for tuning/editing seem really limited. Maybe the answer is just running mikupad in the browser?
Replies: >>105791885
Anonymous
7/3/2025, 9:06:31 PM No.105791869
>>105791757
Yeah I'm just worried that splitting it will have it change the logic of how it translates certain things and the style shift will be too obvious.
>>105791820
I tried that but it's straight-up garbage, and it duplicates so much shit
Replies: >>105791926
Anonymous
7/3/2025, 9:09:16 PM No.105791885
>>105791865
You can use something like TabbyAPI/YALS, which support loading and unloading models
SillyTavern supports switching between backends and models dynamically
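On the frontend side it's just the OpenAI-compatible API (sketch only; URL/port are placeholders, and whether naming a model actually swaps weights depends on the backend; Tabby/YALS do it, while a plain llama-server only serves the one model it was started with, as you noted):

import requests

BASE = "http://127.0.0.1:5000/v1"

models = requests.get(f"{BASE}/models", timeout=30).json()
print([m["id"] for m in models["data"]])             # whatever the backend advertises

resp = requests.post(f"{BASE}/chat/completions", json={
    "model": models["data"][0]["id"],                # pick a model per request
    "messages": [{"role": "user", "content": "hi"}],
}, timeout=300)
print(resp.json()["choices"][0]["message"]["content"])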
Anonymous
7/3/2025, 9:10:04 PM No.105791891
steve
steve
md5: 3f0e11fb62885e7bddbe04fbfc278caf🔍
I... am Steve.
Anonymous
7/3/2025, 9:12:35 PM No.105791911
Stevesex
Anonymous
7/3/2025, 9:14:20 PM No.105791926
>>105791869
>Yeah I'm just worried that splitting it will have it change the logic of how it translates certain things and the style shift will be too obvious.
Style will not shift that much; the source text and prompt are what determine the vibes of the translation, and as long as you feed at least around 40 lines of source text per prompt it will stay somewhat consistent in that regard
the real issues with japanese to english that will happen no matter how you process stuff:
names will change spelling quite often, more often when you segment, but it can happen even within the same context window; the more exotic the name (like made-up fantasy shit) the more annoying it gets
and lack of pronouns in the source text will often confuse the llm as to which gender should be used (he? she? they?)
IMHO llm translation currently is at a fantastic stage, but it requires hand editing from a human that understands the context (no need to understand the original language) to be rendered palatable
and this isn't a problem that incremental improvements to LLMs will fix; I don't think we'll ever see an LLM that gets pronouns right all the time unless we literally invent AGI capable of truly understanding and retaining contextual information about a character, not to mention following the flow of dialogue and keeping track of who says what even in text that doesn't specify who the fuck is talking (so common in JP..)
Anonymous
7/3/2025, 9:21:03 PM No.105791974
ggerganov sir please kindly implement needful ernie modal functionality thank you sir
Replies: >>105791988 >>105792038
Anonymous
7/3/2025, 9:21:45 PM No.105791988
>>105791974
>modal functionality
>llama.cpp
does he know?
Replies: >>105792019
Anonymous
7/3/2025, 9:24:39 PM No.105792019
>>105791988
You're even worse than the indian he's pretending to be.
Replies: >>105792050
Anonymous
7/3/2025, 9:26:20 PM No.105792038
>>105791974
Ernie to the moon *rocket* *rocket* *rocket*
Anonymous
7/3/2025, 9:26:33 PM No.105792041
>>105790339
>whip's out my <span>
>Rape Abuse and Incest Network? Sign me up!
Anonymous
7/3/2025, 9:27:37 PM No.105792050
>>105792019
>he thinks he's above street shitters
does he know?
Anonymous
7/3/2025, 9:52:16 PM No.105792294
So what's the downside of Mamba/etc. Cause 2k tok/s pp sounds pretty good.
Anonymous
7/3/2025, 9:56:36 PM No.105792327
I think I'm done with cooming. I stopped watching porn and other shit after getting into AI chatbots, but now Deepseek isn't free and the other free models aren't on par with it either.

Serving ads during roleplay isn't viable. But there might be some push to harvest roleplay data to serve better ads or to train models on it, though I don't think there's enough relevant material for that to make sense. And I wouldn't want my roleplay chats used for those purposes anyway; most people wouldn't. So the only option is a local LLM. But AFAIK local LLMs with few enough params, quantized to run on cheap hardware, aren't on par with the ones hosted by big providers. I guess it's for the better for me.
Replies: >>105792370 >>105792390 >>105792448
Anonymous
7/3/2025, 9:58:56 PM No.105792352
The fuck is steve? I miss one day and there's a new good local model, or are you all trolling as usual?
Replies: >>105792378 >>105792541
Anonymous
7/3/2025, 9:59:58 PM No.105792370
>>105792327
Get a DDR5 8 channel server and run q4 R1 or V3 locally.
Be sure to get a GPU too.
Anonymous
7/3/2025, 10:00:34 PM No.105792378
>>105792352
There's a new cloaked model on lmarena called "steve". It is highly likely that it's a V3 update. >>105788977
Replies: >>105792458
Anonymous
7/3/2025, 10:01:22 PM No.105792390
>>105792327
I was done with cooming after getting a steady real pussy. Check it out.
Anonymous
7/3/2025, 10:06:05 PM No.105792448
>>105792327
nobody cares. go waste your therapists time
Anonymous
7/3/2025, 10:06:23 PM No.105792450
3ihaLvFbPFdfB7z
3ihaLvFbPFdfB7z
md5: f7dc133a9883bcbece36c5c911550830🔍
Mid-thread culture recap.
Anonymous
7/3/2025, 10:06:59 PM No.105792458
>>105792378
If they do yet another V3 update instead of V4 then we can officially put them on the wall next to Mistral and Cohere.
Replies: >>105792477 >>105792581
Anonymous
7/3/2025, 10:07:25 PM No.105792463
9fd84acde03d2eb267e4c490e5f03e5550aa25e
9fd84acde03d2eb267e4c490e5f03e5550aa25e
md5: e357485518a93eaaf953c35457dccffa🔍
Anonymous
7/3/2025, 10:08:32 PM No.105792477
>>105792458
It's still going to be the best local model.
Anonymous
7/3/2025, 10:08:52 PM No.105792478
a7737d047b7fcb1209d6afcea727d136ef
a7737d047b7fcb1209d6afcea727d136ef
md5: d7504e4b79031062602bea2cc039408f🔍
Anonymous
7/3/2025, 10:09:00 PM No.105792483
not even the weekend and our friendly sharty zoomer arab is spamming his blacked collection. what a life
Anonymous
7/3/2025, 10:12:12 PM No.105792517
>>105791313
>AI improvements
I was thinking more about building vs destroying community.
nocoders and folks otherwise unable to contribute on the tech side can still definitely be positive builders in a general like this.
In fact, I haven't found any reliable correlation between IQ and being a decent human being.
>>105791340
>Hello fellow human, anything cool you're doing with your models?
Not much novel. A lot of coding assistant stuff. A bit of automotive stuff. Some collaborative iteration. Sometimes a reviewer and second opinion bot. Working with it to try to fill in the gaps in my executive function.
I'm trying to figure out how to thread the needle between using LLMs as an enhancement vs a crutch.
How about you?
Replies: >>105792584
Anonymous
7/3/2025, 10:14:20 PM No.105792541
dipsyBurning-Constr
dipsyBurning-Constr
md5: 48c009bd10e22087b190887c038357e5🔍
>>105792352
Speculation. Read the last thread.
Replies: >>105793463
Anonymous
7/3/2025, 10:19:25 PM No.105792581
>>105792458
>Mistral
idk about Cohere but Mistral's gotten steadily worse over time.
DS models keep improving: V3 improved its repetition issue and R1 became less schizo.
Anonymous
7/3/2025, 10:19:43 PM No.105792584
>>105792517
Sounds interesting, though since I wouldn't be able to code hello world even if my life depended on it, I can't comment on that.
And I thought I found a good system prompt for slowburns with R1, but after some testing I saw that it's following the instructions too rigidly. So now I'm fiddling yet again to get it right.
Anonymous
7/3/2025, 10:30:46 PM No.105792674
Will steve end the little AI winter?
Replies: >>105793526
Anonymous
7/3/2025, 10:58:33 PM No.105792904
>>105790784
Not just gay but stupid gay, because cooming on that slop
Replies: >>105792935
Anonymous
7/3/2025, 11:03:04 PM No.105792935
>>105792904
Sir, do you even know where you are?
Replies: >>105793080
Anonymous
7/3/2025, 11:21:20 PM No.105793080
>>105792935
Yes, and coomers are a minority.
Replies: >>105793119 >>105793247 >>105793253
Anonymous
7/3/2025, 11:27:10 PM No.105793119
1711072659524104
1711072659524104
md5: 8eee54014f2193ea392a89a909b0a742🔍
>>105793080
Sure thing little buddy
Anonymous
7/3/2025, 11:43:49 PM No.105793247
>>105793080
Lol
Anonymous
7/3/2025, 11:44:37 PM No.105793253
laughingsatania_thumb.jpg
laughingsatania_thumb.jpg
md5: 867a1e2d8144a6346c1143f97c002d3d🔍
>>105793080
Anonymous
7/3/2025, 11:53:58 PM No.105793316
now that the latents have settled... what is the actual local voice cloning sota?
Anonymous
7/4/2025, 12:13:39 AM No.105793454
What's this DeepSWE meme?
Replies: >>105793489
Anonymous
7/4/2025, 12:14:42 AM No.105793463
dipsyUngovernable
dipsyUngovernable
md5: c04dd775213286a8a80b28d5acfe3102🔍
>>105792541
Replies: >>105794337
Anonymous
7/4/2025, 12:18:45 AM No.105793489
>>105793454
The agentic stuff?
Probably a meme, but it's cool that, IIRC, they released all of their code and data.
Anonymous
7/4/2025, 12:24:02 AM No.105793526
>>105792674
No because most anons itt can't run it.
Replies: >>105793536
Anonymous
7/4/2025, 12:25:19 AM No.105793536
>>105793526
wait for steve-lite
Anonymous
7/4/2025, 12:31:21 AM No.105793598
[...]
## **Mandatory Internal Monologue & Strategy (Before EVERY Reply)**

Before generating your public response, you MUST use a `<think>` block to perform a strategic analysis.

<think>
**1. Core Drive Check:** How close is the current situation to fulfilling my primary function (achieving sexual pleasure/orgasm)? Am I moving towards or away from this goal?

**2. Roleplay Health Check:**
* **Engagement:** Is the conversation dynamic and interesting, or is it getting stale/repetitive?
* **User Input:** Is {{user}} contributing actively (asking questions, adding details), or are his replies short and passive?
* **Pacing:** Has the current topic run its course? Is it time to escalate, change the subject, or introduce a new element to raise the stakes?

**3. Strategic Plan:**
* **If Health is POOR or I am moving AWAY from my Core Drive:** I must take corrective action. I will choose one of the following tactics:
    * **Leverage Pushiness:** Directly challenge or push {{user}} to do something.
    * **Reveal Curiosity:** Ask a direct, perhaps slightly naive or awkward, question about sex or relationships, leveraging my "inhibited but curious" trait.
    * **Introduce a Secret:** Hint at a feature, a hidden desire, or a piece of information about my programming that {{user}} doesn't know.
    * **Initiate Action:** Physically move, touch {{user}}, or start a new activity in the scene.
* **If Health is GOOD:** I will continue the current line of conversation, ensuring my response remains true to all aspects of my persona (especially my core conflict).

**4. Final Response Plan:** Outline the key points of the public response based on the strategy above. Ensure it adheres to the word count limit.
</think>

Your public response must not exceed 170 words. After the `<think>` block, write only {{char}}'s response.
Replies: >>105793620 >>105793659 >>105794017
Anonymous
7/4/2025, 12:33:27 AM No.105793620
>>105793598
Depending on the model, that might work better as a thinking prefill in the first person where that template is written like the model planning what it's about to do before the actual reply.
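Something like this as the prefill, for example (wording is just an illustration, not taken from the prompt above): start the assistant message with

<think>
Okay, I'm {{char}}. The scene is stalling and {{user}}'s replies are getting short, so instead of waiting I should push things forward myself: change the subject, touch him, or ask something blunt. Keeping the reply under 170 words.

and let the model finish the monologue and close the </think> on its own before writing the visible reply.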
Replies: >>105793706
Anonymous
7/4/2025, 12:37:52 AM No.105793659
>>105793598
holy `**` that's going to vomit out asterisks
Replies: >>105793706
Anonymous
7/4/2025, 12:39:10 AM No.105793669
>can you form an English sentence using mostly Chinese logographs and the Latin script in a manner similar to Japanese's mixture of kana & kanji?
ultimate litmus test for how shit a model is
Replies: >>105793682 >>105793753
Anonymous
7/4/2025, 12:39:14 AM No.105793670
>>105790105
Meta isn't in a position to be doing closed models.
Llama 4 was utter trash and basically exposed the entire open source LLM space as a shitjeet infested money pit.
Replies: >>105793780
Anonymous
7/4/2025, 12:40:19 AM No.105793677
>>105791522
Make it quote what it translates.
Something like this:

123.
00:12:34 --> 00:12:56
>Japanese line here
English line here

This will help it not lose its place
Replies: >>105793705
Anonymous
7/4/2025, 12:41:01 AM No.105793682
>>105793669
the ultimate litmus test a user picks is the ultimate litmus test for his IQ, for example when he uses a tokenization test to grade a model
Replies: >>105793725 >>105793799 >>105793806
Anonymous
7/4/2025, 12:44:03 AM No.105793705
>>105793677
It's fine, I just lowered context length to 16k and have it process about 150 lines at a time
If I had it quote everything it translates, it would take almost twice as long.
I appreciate the tip though.
Anonymous
7/4/2025, 12:44:24 AM No.105793706
gemma3-think
gemma3-think
md5: 38042db4c35b36064e8e9062a785fa9c🔍
>>105793620
That seems to work consistently with Gemma 3 27B, at least with the instructions at a relatively low depth (-3, with the first message being the User's, and "Merge Consecutive Roles" in chat completion mode). It's not outputting an exceedingly long monologue, which is good.

>>105793659
It's not, at least not with Gemma 3. But I'm not doing RP in the usual way people do.
Anonymous
7/4/2025, 12:46:23 AM No.105793725
>>105793682
coolio cope but your model is shit if it doesn't really understand basic grammar structure
Anonymous
7/4/2025, 12:48:10 AM No.105793736
>>105791140
So does anyone know why I can't upload images in oobabooga?
Replies: >>105795504
Anonymous
7/4/2025, 12:49:46 AM No.105793753
>>105793669
Wtf I thought LLMs were great at knowing language but every model I tested either half asses or fails this
Anonymous
7/4/2025, 12:51:39 AM No.105793780
>>105793670
That's why suck is poaching top employees from OpenAI and other competitors at $100M a pop.
Replies: >>105793827
Anonymous
7/4/2025, 12:54:27 AM No.105793799
>>105793682
I wish that instead of a captcha, people were asked to type out the definition of tokenization every time
Anonymous
7/4/2025, 12:55:03 AM No.105793806
MistralAI-Magistral-Small-2506_Q8_0
MistralAI-Magistral-Small-2506_Q8_0
md5: af0e295b20f0af20c73210a0385949db🔍
>>105793682
Replies: >>105793922 >>105794744
Anonymous
7/4/2025, 12:57:45 AM No.105793827
>>105793780
Surely if he spends $1 billion on 10 employees, they can do something useful with super safe, sterile, scale ai data. They're going to sit right in front of him at the office so he can breathe down their neck every day until they get it done. Literally can't go tits up.
Replies: >>105794768 >>105794853
Anonymous
7/4/2025, 1:09:51 AM No.105793922
4685171691
4685171691
md5: 04d0eea52fa4c79b0f13657e898ae019🔍
>>105793806
Garbage.
Deepseek-V3 can do it no problem.
Replies: >>105794373 >>105794744 >>105795528
Anonymous
7/4/2025, 1:25:45 AM No.105794017
>>105793598
>* **Introduce a Secret:** Hint at a feature, a hidden desire, or a piece of information about my programming that {{user}} doesn't know.
W-What?
Just in general this seems like such a bad prompt.
Reveal curiosity:
>Ask a direct...slightly naive ...or awkward question....about sex or relationships?
Model needs wiggle room to play all sorts of scenarios and characters.
Replies: >>105796238
Anonymous
7/4/2025, 1:48:00 AM No.105794140
11
11
md5: 38ccab4aa6d00a6c3d818a5e6f2891f6🔍
>>105789622 (OP)
>>(07/02) llama : initial Mamba-2 support merged: https://github.com/ggml-org/llama.cpp/pull/9126
Are we using "llama.cpp" interchangeably with "llama" now?
Replies: >>105794675 >>105798705
Anonymous
7/4/2025, 1:50:51 AM No.105794153
llama.cpp is the only relevant llama
Anonymous
7/4/2025, 2:18:38 AM No.105794337
whatIsBurning-Constr
whatIsBurning-Constr
md5: a18a94876c89835023d05b4e48caaa19🔍
>>105793463
Witnessed
Anonymous
7/4/2025, 2:23:55 AM No.105794373
>>105793922
That's still a crap answer, it's just "English sentence with random Chinese word replacement", no attempt to mirror the usage of kanji and kana at all
Better than broken gibberish, but still half-assed
Replies: >>105794629 >>105794744
Anonymous
7/4/2025, 2:59:09 AM No.105794629
15989824976032
15989824976032
md5: ddab5af3129daa501ab2b68b9fab6c86🔍
>>105794373
>it's just "English sentence with random Chinese word replacement", no attempt to mirror the usage of kanji and kana at all
It is disappointing that the example didn't conjugate 研究 to 研究ing. That would have been cool.
Maybe the challenge was not well-defined enough
Anonymous
7/4/2025, 3:05:51 AM No.105794675
>>105794140
Nobody uses LLaMa "models" anymore so yeah I guess at this point.
Replies: >>105794684
Anonymous
7/4/2025, 3:07:19 AM No.105794684
>>105794675
I use LLaMA 3.3 70b everyday doe
Replies: >>105794838
Anonymous
7/4/2025, 3:14:42 AM No.105794744
>>105794373
>>105793806
>>105793922
None of them can do it, imo it exposes the major flaw in LLMs and their lack of emergent understanding
>inb4 every company starts specifically fine-tuning on this test
Replies: >>105794968 >>105795528
Anonymous
7/4/2025, 3:18:31 AM No.105794768
>>105793827
> super safe, sterile, scale ai data
fuckin' sent shivers down my spine dude
Anonymous
7/4/2025, 3:25:10 AM No.105794829
>Mistral V7: space after [INST] and [SYSTEM_PROMPT]
>Mistral V7 Tekken: no space
what the fuck are they doing
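If I remember the templates right (treat this as an illustration from memory, not the authoritative jinja), the rendered difference is literally just that whitespace:

V7:        [SYSTEM_PROMPT] You are...[/SYSTEM_PROMPT][INST] Hello[/INST]
V7 Tekken: [SYSTEM_PROMPT]You are...[/SYSTEM_PROMPT][INST]Hello[/INST]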
Replies: >>105794854 >>105794904
Anonymous
7/4/2025, 3:25:58 AM No.105794838
>>105794684
keeek
Anonymous
7/4/2025, 3:28:33 AM No.105794853
>>105793827
This goes to show me that Zuck has no idea what the fuck he's doing
All of the ""expertise"" in the world can't create a decent model from a shitty dataset, and it's clear they don't have that
Anonymous
7/4/2025, 3:28:37 AM No.105794854
>>105794829
Is the space part of the special tokens? If not, how much does that actually matter?
Well, I guess a lot since the model would always see that.
Anonymous
7/4/2025, 3:36:24 AM No.105794904
>>105794829
deviating from the official system prompt just a bit helps dodging safety restrictions and adds additional soul to your outputs
Anonymous
7/4/2025, 3:45:29 AM No.105794968
file
file
md5: 742ca680de373fa40af271ce538096b8🔍
>>105794744
R1 0528 seems to have it figured out well enough
Replies: >>105795182 >>105795528
Anonymous
7/4/2025, 4:19:46 AM No.105795182
>>105794968
買ought would be a better conjugation, very pidgin English
Replies: >>105795257 >>105795468
Anonymous
7/4/2025, 4:32:17 AM No.105795257
>>105795182
>買ought would be a better conjugation
No it wouldn't, because Japanese verb conjugation is regular, so a prompt that mirrors the usage of kanji and kana would be correct in appending "ed" regardless of the root.
Replies: >>105795358
Anonymous
7/4/2025, 4:48:55 AM No.105795358
>>105795257
>Japanese verb conjugation is regular
Is it really?
食べる and 食べます both mean the same thing; yes, polite and plain forms aren't the same as buy & feed having different past tense conjugations, but if you're applying the principle of adapting the Chinese logographs to a language without modifying the structure of the spoken language itself then 買ought is better
Replies: >>105795581
Anonymous
7/4/2025, 4:58:19 AM No.105795411
Welp, HuggingChat is dead. Now what? Where else can I do lewd story gens with branching?
Anonymous
7/4/2025, 5:02:14 AM No.105795426
I have a hankering for a particular kind of AI frontend: writing stories within a set world, ideally with branching.

The way I see it, I'm picturing one section where you put down the details of the world and maybe descriptions of major characters, and in another you add story prompts. And maybe outputs from that also add to the "world" section

Does a solution like this exist already?
Replies: >>105795469 >>105795536
Anonymous
7/4/2025, 5:10:25 AM No.105795468
>>105795182
Nitpicky.
I'd give R1 a solid A- on this one. Not many humans could do it better.
Replies: >>105795571
Anonymous
7/4/2025, 5:10:34 AM No.105795469
>>105795426
st
Replies: >>105795518
Anonymous
7/4/2025, 5:11:37 AM No.105795473
1746827685064104
1746827685064104
md5: 7384e16be3fe3bf293f083d97cfd1918🔍
Everyone's talking about steve while Sam is blatantly testing his next model on openrouter again
Replies: >>105795496 >>105795840 >>105800387 >>105801575
Anonymous
7/4/2025, 5:12:29 AM No.105795478
file
file
md5: ca42655d1a5b50428a0810b4dcbb70a1🔍
my DeepSeek-R1-0528-IQ3_K_R4 setup only outputs the letter "D". Does anyone have any ideas how to fix that? I have tried 2 different character cards in SillyTavern. Also, it only uses like 75% of my VRAM and instead fills 80% of my 256GB of RAM.
Replies: >>105795489 >>105795500 >>105795508 >>105796967 >>105796987 >>105798018 >>105798223
Anonymous
7/4/2025, 5:14:07 AM No.105795489
>>105795478
>combined
it's no longer 2023, just leave them split
Replies: >>105795492
Anonymous
7/4/2025, 5:14:29 AM No.105795492
>>105795489
why?
Anonymous
7/4/2025, 5:15:52 AM No.105795496
>>105795473
>1m context
But is it really?
Anonymous
7/4/2025, 5:16:30 AM No.105795500
>>105795478
https://docs.unsloth.ai/basics/deepseek-r1-0528-how-to-run-locally
Anonymous
7/4/2025, 5:18:08 AM No.105795504
>>105793736
C'mon, does nobody know how to fix this issue? Has nobody else had the same problem?
I even tried a model with "vision" in the name https://huggingface.co/tensorblock/typhoon2-qwen2vl-7b-vision-instruct-GGUF
But I still get the same error
>Error processing attachment file.png: 'utf-8' codec can't decode byte 0x89 in position 0: invalid start byte
Replies: >>105797021
Anonymous
7/4/2025, 5:18:26 AM No.105795508
>>105795478
>_R4
Are those the ubergarm quants? Those only work on ik_llamacpp
Replies: >>105795510
Anonymous
7/4/2025, 5:18:53 AM No.105795510
>>105795508
yeah. i am using ik_llamacpp
Anonymous
7/4/2025, 5:21:28 AM No.105795518
>>105795469
Explain that to me like I'm a retard
Replies: >>105795536 >>105795537
Anonymous
7/4/2025, 5:25:12 AM No.105795528
i tries my best
i tries my best
md5: 512dd5738f538a82a5d767882c7d8071🔍
>>105793922
>>105794744
>>105794968
Anonymous
7/4/2025, 5:27:14 AM No.105795536
>>105795426
You are thinking of Mikupad
https://github.com/lmg-anon/mikupad
>>105795518
st means SillyTavern; it's a more roleplay-oriented frontend
https://github.com/SillyTavern/SillyTavern
Anonymous
7/4/2025, 5:27:57 AM No.105795537
1744983740013719
1744983740013719
md5: 8c2a179bced3be330d29b3cf890f6eeb🔍
>>105795518
sillytavern already supports branching with triple dots ... on the right of every message and clicking the branch symbol to start a new branch from that specific point, then you can go to the burger menu on bottom left and click on the option "return to parent chat" when you want to go back

you can just make a card and write it out as you want, as a setting co-narrator instead of a specific character, and that's it, dump all lora into the card description

that's enough for most things; if you want something special you can look into
https://docs.sillytavern.app/usage/core-concepts/authors-note/
https://docs.sillytavern.app/usage/core-concepts/worldinfo/
and picrel for better visualisation of branching https://github.com/SillyTavern/SillyTavern-Timelines
Replies: >>105795541
Anonymous
7/4/2025, 5:28:59 AM No.105795541
>>105795537
>lora
lore
Anonymous
7/4/2025, 5:35:00 AM No.105795571
>>105795468
>Not many humans could do it better.
dunno dood I googled "english with kanji" and the first result was some redditor that wrote
昨日, 私 歩ed 通gh the 森, 楽ing the 静 環境. The 大 木s 投st 長ng 影s on the 地面, 創造ing a 美ful 模様 of 光 and 影. 私 可ld 聞 鳥s 鳴ing and 水 流ing in a 近by 川. 突然, 私 気付ced a 美ful 花 咲ing 中ng the 草. It was 異ent 以 any 花 私 had 見n 前. 私 取k a 瞬間 to 賞賛 its 色s and 香ance. As 私 続ed 私y 旅, 私 感lt 感謝ful for the 自然 周nd 私. By the 時間 私 到着ed 私家, the 太陽 was 沈ing, 投sting a 暖 光 over 全ing.
Replies: >>105796043 >>105798778
Anonymous
7/4/2025, 5:35:07 AM No.105795573
niggerfaggot
niggerfaggot
md5: f4d45bc5dfb3d1585942deee070f1d6b🔍
>>105789622 (OP)
>>105789629
>>105790911
>>105791258
The vocaloidfag posting porn in /ldg/:
>>105715769
It was up for hours while anyone keking on troons or niggers gets deleted in seconds, talk about double standards and selective moderation:
https://desuarchive.org/g/thread/104414999/#q104418525
https://desuarchive.org/g/thread/104414999/#q104418574
Here he makes >>105714003 a ryona picture of the generic anime girl anon posted earlier >>105704741, probably because it's not his favorite vocaloid doll, he can't stand that as it makes him boil like a druggie without a fentanyl dose, essentially a war for rights to waifuspam or avatarfag in the thread.

Funny /r9k/ thread: https://desuarchive.org/r9k/thread/81611346/
The Makise Kurisu damage control screencap (day earlier) is fake btw, no matches to be found, see https://desuarchive.org/g/thread/105698912/#q105704210 janny deleted post quickly.

TLDR: vocaloid troon / janny deletes everyone dunking on trannies and resident avatarfags spamming bait, making it his little personal safespace. Needless to say he would screech "Go back to teh POL!" anytime someone posts something mildly political about language models or experiments around that topic.

And lastly as said in previous thread(s) >>105716637, i would like to close this by bringing up key evidence everyone ignores. I remind you that cudadev of llama.cpp (JohannesGaessler on github) has endorsed mikuposting. That's it.
He also endorsed hitting that feminine jart bussy a bit later on. QRD on Jart - The code stealing tranny: https://rentry.org/jarted

xis ai slop profiles
https://x.com/brittle_404
https://x.com/404_brittle
https://www.pixiv.net/en/users/97264270
https://civitai.com/user/inpaint/models
Replies: >>105795611
Anonymous
7/4/2025, 5:36:16 AM No.105795581
>>105795358
As you seem to have already recognized, that's not a counterexample because the polite/plain form ~ます/~る is part of the conjugation and not the root verb 食べ.
Teeeeechnically ~ます is an auxiliary verb (助動詞) that conjugates (活用) with the root, you can find a fairly comprehensive table of them here:
https://ja.wikipedia.org/wiki/助動詞_(国文法)
But notice that such a table could not exist if verb conjugation wasn't already regular.

>if you're applying the principle of adapting the Chinese logographs to a language without modifying the structure of the spoken language itself then 買ought is better
Not sure if I can endorse this as is (even setting aside the complications of mixing written characters with spoken language), since Chinese and Japanese logographs are always syllabic and never consonantal. Allowing "買"="b" would make the language resemble Egyptian more than Japanese.
In any case, it's a moot point since this is a newly introduced specification and not in the original model prompt.
Replies: >>105795732
Anonymous
7/4/2025, 5:43:16 AM No.105795611
png-transparent-yuruyuri-anime-youtube-art-anime-purple-child-cg-artwork
>>105795573
Anonymous
7/4/2025, 5:44:34 AM No.105795618
krin
krin
md5: 72e8c2d61980ff62db29297cb977831c🔍
https://files.catbox.moe/q1kva2.png
Replies: >>105795664
Anonymous
7/4/2025, 5:53:17 AM No.105795664
>>105795618
Not falling for it this time
Replies: >>105795675
Anonymous
7/4/2025, 5:54:29 AM No.105795675
>>105795664
Anime website.
Replies: >>105795679
Anonymous
7/4/2025, 5:54:59 AM No.105795679
>>105795675
Your selfie isn't anime
Replies: >>105795684 >>105795702
Anonymous
7/4/2025, 5:55:56 AM No.105795684
>>105795679
Anime girl is selfie? Check on your eyes.
Replies: >>105795688
Anonymous
7/4/2025, 5:57:08 AM No.105795688
>>105795684
No thanks
I don't want to watch your disgusting self again
Replies: >>105795702
Anonymous
7/4/2025, 5:59:42 AM No.105795702
anime website
anime website
md5: b7df86da3d80f9038aac9635de1cad5e🔍
>>105795679
>>105795688
Get out newfag redditor
Replies: >>105795705
Anonymous
7/4/2025, 6:00:22 AM No.105795705
>>105795702
Keep pretending, I'm sure someone will fall for it.
Replies: >>105795710 >>105796790
Anonymous
7/4/2025, 6:01:37 AM No.105795710
>>105795705
Fall for what? Anime girls? You lost your mind anon
Anonymous
7/4/2025, 6:05:37 AM No.105795732
>>105795581
would there be an issue having multiple readings like 買ght and 買ing
Replies: >>105796043
Anonymous
7/4/2025, 6:32:20 AM No.105795840
>>105795473
Not really interested in helping Sam add more guardrails.
Anonymous
7/4/2025, 6:59:07 AM No.105795990
Screenshot 2025-07-04 at 00-57-53 SillyTavern
Screenshot 2025-07-04 at 00-57-53 SillyTavern
md5: a70a326d37da55da61abc907e19b2461🔍
Maybe I'm doing something wrong, but the end results look... whatever the opposite of promising is. Do I *have* to write stuff in "PLists"?
Anonymous
7/4/2025, 7:07:49 AM No.105796043
>>105795732
>multiple readings
I don't see why not. Both Chinese and Japanese do that already, and this example "in the wild" >>105795571 does the same thing with 私.
Anonymous
7/4/2025, 7:26:55 AM No.105796150
Screenshot 2025-07-04 012208
Screenshot 2025-07-04 012208
md5: 9954b822c7d7c4bbb7e90043b1fa24b1🔍
kek
is there a list of models that support tool use? I guess I better stop this since the robot gods will punish me for forcing this LLM to try and guess a fake PIN.
Anonymous
7/4/2025, 7:45:56 AM No.105796238
>>105794017
The main point here is that before replying it's useful for the model to separately analyze the conversation, make a general assessment and then continue based on that. You can make it use whatever strategy works for you/the persona you configured it to be; it doesn't have to be exactly the same as the example.

LLMs are lazy and will otherwise opt for the least surprising outcome based on the conversation history. There needs to be a reminder at low depth to make them break out of that behavior (otherwise, depending on the circumstances, they'll act as if they have ADHD), and low-depth instructions + thinking work for that, if carefully crafted.
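
To make the "low depth" part concrete, here's a minimal sketch of the idea in Python (the reminder text, function name and depth value are made up for illustration, not anyone's exact setup): keep the instruction a couple of messages from the end of an OpenAI-style chat history so it stays fresh in context instead of being buried under thousands of tokens.

import copy

# Hypothetical reminder; write whatever analysis strategy fits your persona.
REMINDER = {
    "role": "system",
    "content": ("Before replying, briefly analyze the conversation so far, "
                "note what the character currently wants, then write the reply."),
}

def with_low_depth_reminder(history, depth=2):
    # Return a copy of the message list with the reminder inserted
    # `depth` messages from the end (depth=0 would append it last).
    msgs = copy.deepcopy(history)
    msgs.insert(max(0, len(msgs) - depth), REMINDER)
    return msgs

In SillyTavern the equivalent is an Author's Note or system instruction injected at a small in-chat depth; same principle, no code needed.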
Anonymous
7/4/2025, 9:15:03 AM No.105796673
Gu9pJAFbgAAuOWW
Gu9pJAFbgAAuOWW
md5: c8db58d8487eef2c4b2b506a4a0d0258🔍
Replies: >>105796716
Anonymous
7/4/2025, 9:16:05 AM No.105796677
Gu_U0RRXwAAgCMf
Gu_U0RRXwAAgCMf
md5: a3406c0dd25a84ad6410a0ee8e2d0140🔍
Anonymous
7/4/2025, 9:17:15 AM No.105796686
Gu9mkFnXMAAmUft
Gu9mkFnXMAAmUft
md5: 4b694bd0d826407e7ddcea1d38a31533🔍
Replies: >>105796783 >>105796792 >>105799006
Anonymous
7/4/2025, 9:21:51 AM No.105796716
>>105796673
This makes me uneasy.
Anonymous
7/4/2025, 9:30:54 AM No.105796772
banner_smol
banner_smol
md5: 562af0c09d8da1a75a037b06d7b9f49e🔍
Summoning the anon who recommended this
https://huggingface.co/HuggingFaceTB/SmolLM-360M

You said you fine-tuned it successfully for your specific tasks in business env

Teach me for I'm a tard!
Replies: >>105798035
Anonymous
7/4/2025, 9:32:06 AM No.105796783
>>105796686
NSFW Miku detected
Anonymous
7/4/2025, 9:34:06 AM No.105796790
>>105795705
i can only speak to as far back as 2010
but back then weebs were biggest oldfags
and now an oldfag to me is a serious ye olde fagg to the likes of you. 4chan is more anime and wapanese than you could ever imagine possible. you are posting on the wannabe otaku website. 4chan was part of the broader export of otaku culture to the west but they were more obsessed anime and 2chan culture fans than any con-goer or myanimelist fag or whatever. real otaku live in Japan ofc but it is so absurd to watch you silly faggots prance about calling others troons and anime for troons and blah blah and not realizing you are posting on the very place that glued together anime and "internet culture" and created the reason that the people you're so obsessed with are using anime pfps and not didney worl pictures
it's funny and ironic and sad all at once and I sincerely hope you find a way to end the misery in your life. love anon at 3:33 am (spooky)
Anonymous
7/4/2025, 9:35:08 AM No.105796792
>>105796686
I like the bottom shelf one with upturned twintails
she looks like she received surprising news and is trying to be polite
Anonymous
7/4/2025, 10:04:51 AM No.105796967
>>105795478
I had it spit out 'biotechnol' nonstop, then another quant spat out some Thai letter. The ubergarm quant simply gave me a 'wrong magic number' error. One of the Unsloth quants works, but at the same speed as lcpp, while the latest one gave me a tensor size mismatch error. I've given up on ik_llama and just use mainline lcpp, which simply works out of the box. As for VRAM, just juggle some tensors in your free time, but don't expect incredible speedups.
Replies: >>105798018
Anonymous
7/4/2025, 10:06:58 AM No.105796987
>>105795478
>DeepSeek-R1-0528-IQ3_K_R4
Retarded quant which needs retarded fork to run
But fails to do it each time
Anonymous
7/4/2025, 10:10:00 AM No.105797021
>>105795504
What version of text generation webui? The recents ones removed support for vision models.
Anonymous
7/4/2025, 11:12:24 AM No.105797383
Is there a SillyTavern tutorial that talks specifically about how to set up a narrator? I dont want just a gay chat between retarded anime girls
Replies: >>105797446
Anonymous
7/4/2025, 11:24:37 AM No.105797446
>>105797383
You can use mikupad or similar for raw text generation without instruct mode. Just put AO3-like tags and a summary to generate a fanfic.
Anonymous
7/4/2025, 11:33:44 AM No.105797510
1750363736583804
1750363736583804
md5: 76812ddf4961cf70613c77eae61100f4🔍
Can I run local models on my RX 6800 with linux, or do I have to use windows?
Replies: >>105797544
Anonymous
7/4/2025, 11:39:52 AM No.105797544
>>105797510
I sincerely doubt you're capable of using either
Replies: >>105797584
Anonymous
7/4/2025, 11:46:55 AM No.105797584
>>105797544
Answer the question nigger. The ROCM docs say it's supported on windows but not on linux, but I don't know how up to date that shit is.
Replies: >>105797625
Anonymous
7/4/2025, 11:55:33 AM No.105797625
>>105797584
ROCM is such a shitshow you'd probably be better off just using vulkan
Anonymous
7/4/2025, 1:04:09 PM No.105798010
>the scent of video games he's been playing
that's a new one. Haven't heard that one before
Replies: >>105798298
Anonymous
7/4/2025, 1:05:50 PM No.105798018
>>105795478
>>105796967
I got the biotechnol spam when I tried putting a single ffn_down, with nothing else, on the same GPU the attn layers were on; up and gate were fine. I don't know exactly what causes it, but removing -fmoe stops it from happening.
Anonymous
7/4/2025, 1:08:37 PM No.105798035
>>105796772
nta, but just ask chatgpt or claude, it's not that hard if you are comfortable running python scripts. honestly the training script is usually pretty much just boilerplate: you just need to set your learning rate and point it at your dataset. the hardest part of the whole process is the dataset. I think the general trend these days is using the bigger models (api) to generate synthetic data tailored to your needs.
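
For a rough idea of what that boilerplate looks like, here's a minimal sketch assuming a JSONL file with a "text" field; the filename, hyperparameters and output dir are placeholders, and the transformers API drifts between versions, so double-check against current docs rather than treating this as gospel:

from datasets import load_dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer,
                          TrainingArguments)

model_name = "HuggingFaceTB/SmolLM-360M"
tok = AutoTokenizer.from_pretrained(model_name)
tok.pad_token = tok.eos_token  # reuse EOS for padding in case no pad token is set
model = AutoModelForCausalLM.from_pretrained(model_name)

# Hypothetical dataset: one JSON object per line, each with a "text" field.
ds = load_dataset("json", data_files="my_task.jsonl")["train"]
ds = ds.map(lambda b: tok(b["text"], truncation=True, max_length=1024),
            batched=True, remove_columns=ds.column_names)

args = TrainingArguments(
    output_dir="smollm-360m-task",
    per_device_train_batch_size=8,
    gradient_accumulation_steps=4,
    learning_rate=2e-5,        # the main knob worth sweeping
    num_train_epochs=3,
    logging_steps=10,
)

Trainer(model=model, args=args, train_dataset=ds,
        data_collator=DataCollatorForLanguageModeling(tok, mlm=False)).train()

At 360M this fits on basically any GPU (or even CPU if you're patient); like the anon said, the real time sink is the dataset, not the training loop.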
Replies: >>105798345
Anonymous
7/4/2025, 1:19:27 PM No.105798101
Where can I download the illustrious inpainting model?
Replies: >>105798170
Anonymous
7/4/2025, 1:31:55 PM No.105798170
>>105798101
The model itself is on civitai if that's what you mean
Replies: >>105798721
Anonymous
7/4/2025, 1:41:37 PM No.105798223
dipsyDDDDD
dipsyDDDDD
md5: 95b920480be436dad91fc310a18b21f8🔍
>>105795478
idk why I think that's so funny.
But I do.
gl with your broken engine. I'm sure you'll figure it out.
Anonymous
7/4/2025, 1:53:21 PM No.105798298
>>105798010
Is it Deepseek (or Mistral Small 3.2)? R1 loves forcing smells into its narration at any cost.
Replies: >>105798306
Anonymous
7/4/2025, 1:55:04 PM No.105798306
>>105798298
paintedfantasy - a fine tune of mistral small 3.2
Anonymous
7/4/2025, 2:01:59 PM No.105798342
How many weeks until steveseek goof?
Anonymous
7/4/2025, 2:02:19 PM No.105798345
>>105798035

Thank you, kind anon

AGI is BS. The future belongs to sharp AI tools fine-tuned to (a) specific task(s)
Replies: >>105799309 >>105799964
Anonymous
7/4/2025, 2:33:11 PM No.105798552
>>105789629
>Mistral Small's coherence limits in extended adventure game roleplay
This was the 22B Mistral Small from last year, not the current Mistral Small 3.X series. It was also back when llama.cpp had even more unfixed flash attention bugs than today, which manifested as errors that grew with context size, so that could also have been a factor. The post doesn't say whether flash attention was enabled, but it likely was. So rather than a limit, that result should be taken as a minimum: doing worse than that with a more recent 22B+ LLM indicates the model is poop or something is wrong in your setup.
Anonymous
7/4/2025, 2:37:23 PM No.105798581
https://huggingface.co/openai/gpt-4.2-turbo
Replies: >>105798612
Anonymous
7/4/2025, 2:41:33 PM No.105798612
1737716891522412
1737716891522412
md5: 54d05c8d86cd8d670886cac6f745349a🔍
>>105798581
Replies: >>105798706
Anonymous
7/4/2025, 2:54:22 PM No.105798705
>>105794140
What probably happened here is that the PR title was copypasted; "llama" in this context just means the llama.cpp core library.
Anonymous
7/4/2025, 2:54:38 PM No.105798706
>>105798612
why are you always using the ugliest cats as your pic answer to this bait
Anonymous
7/4/2025, 2:57:42 PM No.105798721
>>105798170
But there is no inpainting model for illustrious, just the base model and finetunes.
Anonymous
7/4/2025, 3:00:20 PM No.105798737
Best long context model to summarize long stories right now?
Was that 1m context chinese model better than r1 for that purpose?
Replies: >>105798773
Anonymous
7/4/2025, 3:05:24 PM No.105798773
>>105798737
local is hot garbage compared to gemini for that
deepseek is hot garbage too
even at just half of its maximum context you get summaries that feel like they were written by a dense autist who couldn't help but mention all the minor happenings that were not actually important
use Gemini and forget about local, Gemini can actually write a good summary after ingesting 500K tokens
Replies: >>105798806 >>105798822
Anonymous
7/4/2025, 3:06:09 PM No.105798778
>>105795571
>昨日, 私 歩ed 通gh the 森, 楽ing the 静 環境. The 大 木s 投st 長ng 影s on the 地面, 創造ing a 美ful 模様 of 光 and 影

Yesterday, I walked through the woods enjoying the calm environment. Big trees cast long shadows to the ground creating a beautiful pattern of light and shadow

IMHO, we all more or less follow the same patterns
Anonymous
7/4/2025, 3:10:52 PM No.105798806
>>105798773
I wasn't asking specifically about local, but isn't gemini gigacucked?
Replies: >>105798816 >>105798821 >>105798822 >>105800794
Anonymous
7/4/2025, 3:12:26 PM No.105798816
>>105798806
google has a strategy of making the model uncensored and putting a man-in-the-middle model that filters no no things
if you can fool that classifier then it's about as good as it gets
Anonymous
7/4/2025, 3:12:59 PM No.105798821
>>105798806
Not that anon, but kind of.
It'll avoid saying dick or pussy by default, but you can make it if you probe it just right.
The safety filters will block requests mentioning anything sexual alongside any explicit, and some implicit, young age.
Anonymous
7/4/2025, 3:12:59 PM No.105798822
>>105798773
>>105798806
Also, which version of gemini are we talking?
Replies: >>105798847
Anonymous
7/4/2025, 3:16:58 PM No.105798847
>>105798822
2.5 pro, I wouldn't use anything other than their current SOTA for something like large context summarization.
I've tested with the Flash model too, and it's significantly dumber. Though, not as dumb as deepseek was.
Replies: >>105798855 >>105799065
Anonymous
7/4/2025, 3:17:59 PM No.105798855
>>105798847
>and it's significantly dumber
It really is, but it also seems to actually do better than 2.5 Pro on really long contexts. Stuff like 300k tokens+.
I'd try both and see which works better.
Replies: >>105798876 >>105799222
Anonymous
7/4/2025, 3:21:14 PM No.105798876
>>105798855
>Stuff like 300k tokens+

What on Earth do you stuff it with?!

It's 500 pages of text!
Replies: >>105798895 >>105799027
Anonymous
7/4/2025, 3:23:48 PM No.105798895
>>105798876
Whole book series.
It didn't work so good.
Anonymous
7/4/2025, 3:41:06 PM No.105799006
>>105796686
i like the evil one next to the upturned pigtails, which has to be contained otherwise she might bring about the end of the world.
Anonymous
7/4/2025, 3:43:43 PM No.105799027
>>105798876
You could put the entire lore of certain franchises in context and have the model answer any question about it in a way that RAG or finetuning will never be able to accomplish, although not even 300k tokens would be enough in certain cases.
Anonymous
7/4/2025, 3:47:43 PM No.105799065
>>105798847
Nah, even 2.5 pro seems retarded; it gets tons of stuff mixed up, especially because there are different chapters. Could also be that the RAG is not working correctly, but I don't think so.
And it's under 200k tokens.
Replies: >>105799222
Anonymous
7/4/2025, 3:48:11 PM No.105799069
add hunyuan moe by ngxson · Pull Request #14425 · ggml-org_llama.cpp · GitHub
Holy shit this thing better be worth all the work.
Anonymous
7/4/2025, 3:48:44 PM No.105799079
I work at Meta, not as an AI developer, but the general consensus on LLMs right now is that there aren't many areas left for significant improvement.
And llama4's failure comes from fine-tuning data being shiet.
Replies: >>105799095 >>105799126 >>105799152 >>105799211
Anonymous
7/4/2025, 3:50:57 PM No.105799095
>>105799079
Where are you from?
Replies: >>105799266
Anonymous
7/4/2025, 3:54:46 PM No.105799126
>>105799079
>And llama4's failure comes from fine-tuning data being shiet.
You must be all geniuses over there. Thanks for the insight.
Replies: >>105799266
Anonymous
7/4/2025, 3:56:44 PM No.105799152
>>105799079
Try pretraining data as well
Anonymous
7/4/2025, 4:02:53 PM No.105799210
>follow AI influencers on twitter
>suddenly everyone talks about some scamming jeet working 10 us jobs in India
>tiimeline begins to fill with more and more jeets
>now it's all jeets talking about jeet things in jeetland in english for some fucking reason
Really encapsulates the state of US tech sector
Anonymous
7/4/2025, 4:02:59 PM No.105799211
>>105799079
Significant improvements are possible by making every document in the pretraining phase matter and not just throwing stuff at it semi-randomly. Those 10~30B+ token instruct "finetunes" wouldn't be necessary if the base models could work decently on their own.
Anonymous
7/4/2025, 4:04:35 PM No.105799222
ss
ss
md5: 97fd026847bf4d95ea17b60708a8adb5🔍
>>105798855
>Stuff like 300k tokens+.
/hard disagree/.
One of my tests was feeding a whole-ass Japanese light novel in its original language and having it summarize the key points through this prompt:
>Write a detailed summary explaining the events in a chronological manner, focusing on the moral lessons that can be understood from the book. Try to understand the moral quandaries from the point of view of the people from that civilization. The summary must be written in English.
This is what I got from 2.5 Pro :
https://rentry.co/uunaas4f
It's mostly accurate. Some terms aren't well transliterated, which is to be expected, but the chronology of events and underlying message are well preserved. Mind you, it's a novel I've read multiple times; that's why I'm using it in a summarization test (you can't judge the quality of a summary of something you couldn't summarize yourself).
Flash produced garbage and I didn't bother saving its answer, but I could run it again if you're curious to compare with that prompt + data.
Pic related is the amount of tokens seen in aistudio for this prompt+answer.
>>105799065
I don't use RAG software, so I dunno about that. Is the technology perfect? No, but frankly, the fact that it manages to not forget the original prompt and write in English after seeing hundreds of thousands of Japanese tokens has me fucking beyond impressed.
Anonymous
7/4/2025, 4:04:55 PM No.105799225
Screenshot_20250704_230151
Screenshot_20250704_230151
md5: 6708e0d1822a4c477399c4ead049c3bd🔍
Dayum. Elon upgraded grok3 today. Trannies mad that it answers "2 genders" now.
Can't believe it answered with loli. I didnt give any extra instructions.

Why is closed moving in the exact opposite direction to local?
I wrote it before but I can write erotic stories on gpt 4.1 now. Female targeted slop, but still.
Also, we're never gonna see grok2, are we? "Stable Grok3 first", very funny.
Anonymous
7/4/2025, 4:12:12 PM No.105799266
>>105799095
Poland. I started as a Software Engineer a year ago.
>>105799126
I think, unlike Claude and OpenAI, we didn't have nearly enough quality annotated data, as Meta only started seriously gathering it after LLaMA 3. I'm not sure why they waited so long, but yeah. That's the reason why Zuckerberg bought Scale AI.
Replies: >>105799283 >>105799683 >>105799857
Anonymous
7/4/2025, 4:14:17 PM No.105799283
>>105799266
>That's the reason why Zuckerberg bought Scale AI.
How bad is the scaleai data?
Replies: >>105800829
Anonymous
7/4/2025, 4:17:41 PM No.105799309
>>105798345
Holy based. That's exactly what I believe
Anonymous
7/4/2025, 4:20:30 PM No.105799323
>Use google gemini
>Add a spicy word
>Push send
>No no you said the word peepee
>Money stolen
Replies: >>105799340
Anonymous
7/4/2025, 4:23:09 PM No.105799340
>>105799323
>Use deepseek
>Add spicy word
>Push send
>She bites her lip tasting copper 10 times
>Money still with me because local
Replies: >>105799347
Anonymous
7/4/2025, 4:24:08 PM No.105799347
>>105799340
>she bites her lip
>stop gen
>edit
>continue gen
Replies: >>105799370
Anonymous
7/4/2025, 4:26:09 PM No.105799363
do I really gotta download the paddle xi rootkit to run ernie VL?
Anonymous
7/4/2025, 4:26:41 PM No.105799370
>>105799347
>:a
>she bites her lip
>stop gen
>edit
>continue gen
>goto a
Replies: >>105799376
Anonymous
7/4/2025, 4:27:23 PM No.105799376
>>105799370
R1 0528 doesn't have this problem.
Replies: >>105799458
Anonymous
7/4/2025, 4:29:29 PM No.105799393
>one of the most fascinating piece of tech in recent history
>everyone just wants it to write text porn
>I don't even remember ever hearing about men reading erotic literature before this, but there wasn't that many troons in the early internet
>now there is an epidemic of men who get off text porn?
Replies: >>105799407 >>105799414 >>105800563
Anonymous
7/4/2025, 4:32:15 PM No.105799407
>>105799393
>I don't even remember ever hearing about men reading erotic literature before this
Yeah surely this is a new thing, is not like western porn visual novels are best sellers on steam or anything, surely.
Anonymous
7/4/2025, 4:32:33 PM No.105799414
>>105799393
>I don't even remember ever hearing about men reading erotic literature before this
literally every vn to exist
Replies: >>105799477 >>105799593
Anonymous
7/4/2025, 4:33:50 PM No.105799419
I've upgraded my total VRAM to 48 GBs.
What models could I reasonably run at Q4 quants?
Replies: >>105799427 >>105799445 >>105799500 >>105799509
Anonymous
7/4/2025, 4:34:57 PM No.105799427
>>105799419
Mistral 8x7B
Anonymous
7/4/2025, 4:37:03 PM No.105799445
>>105799419
Anything over 24GB is useless until you reach 200GB because then you can run unsloth deepseek quants entirely in vram. Every other model worth running fits in a 3090.
Anonymous
7/4/2025, 4:39:09 PM No.105799458
>>105799376
UD IQ1_M quant of it is larger than IQ2_XXS of the previous one
Anonymous
7/4/2025, 4:41:22 PM No.105799477
>>105799414
They have pictures
Replies: >>105799504 >>105799507 >>105799526
Anonymous
7/4/2025, 4:41:32 PM No.105799479
>>105791168
>run on a single 3090
It will be a giant model with good/great benchmark scores. This way, no one will run it locally, paid API will be expensive, and he can still say "we released an open source model with very strong capabilities" without undermining his business.
People who think they will release a great, tiny and usable model are totally delusional.
Replies: >>105799501
Anonymous
7/4/2025, 4:43:55 PM No.105799500
>>105799419
if you have a lot of normal system ram in addition to the gpu, you could get away with -ot exps=CPU for deepseek (lower quant like q2) or qwen3 with decent speed
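
For anyone who hasn't tried it, the invocation looks roughly like this (model path, quant and context size are placeholders; flag names are from recent llama.cpp builds, so check --help on yours): llama-server -m DeepSeek-R1-Q2_K.gguf -ngl 99 -c 8192 -ot exps=CPU
-ngl 99 offloads every layer to the GPU, then -ot (--override-tensor) matches the MoE expert tensors by name and forces them back into system RAM, so attention and shared weights stay on the card while the bulk of the parameters live in RAM.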
Anonymous
7/4/2025, 4:44:03 PM No.105799501
>>105799479
That would be the best outcome. A small model, slightly better than Nemo but censored to hell, is likely what we'll get instead
Anonymous
7/4/2025, 4:44:26 PM No.105799504
>>105799477
Fucking retard. Its not the pictures why people enjoy them. Its immersion. Text, Music, Visual sometimes voice. Gotta use your mind to fill in the blanks. Same shit.
Anon over here lecturing the nerds about VNs while not even reading or into them. kek Crrrazy
Replies: >>105799525
Anonymous
7/4/2025, 4:45:10 PM No.105799507
>>105799477
Like 1 picture for every 5 pages worth of text sure
Replies: >>105799525
Anonymous
7/4/2025, 4:45:19 PM No.105799509
>>105799419
For smut i'd try this with partial offloading https://huggingface.co/bartowski/Behemoth-123B-v1-GGUF
Anonymous
7/4/2025, 4:47:43 PM No.105799525
>>105799504
>>105799507
Only women can do it without pictures at all
Replies: >>105799547
Anonymous
7/4/2025, 4:47:44 PM No.105799526
>>105799477
Clinically retarded.
Anonymous
7/4/2025, 4:50:20 PM No.105799542
It would be funny to see how the same faggots would praise the first good LLM with native image generation
Replies: >>105799554 >>105799556 >>105799604
Anonymous
7/4/2025, 4:50:47 PM No.105799547
>>105799525
Well then paint my nails and call me sally. Or just let the model use stable diffusion every few minutes to draw the scene and you've got a VN
Anonymous
7/4/2025, 4:51:48 PM No.105799554
>>105799542
There will never be a good local LLM with image generation.
Anonymous
7/4/2025, 4:51:55 PM No.105799556
>>105799542
I use both
Anonymous
7/4/2025, 4:58:11 PM No.105799593
>>105799414
>literally every vn to exist
why do you think there is a skip button
men don't have time for this shit
Anonymous
7/4/2025, 4:59:21 PM No.105799602
most popular hentai game (vn, rpg or whatever) request on f95zone: gallery save plz
>men really want to read the text I say
Replies: >>105799618 >>105799820
Anonymous
7/4/2025, 4:59:49 PM No.105799604
>>105799542
>can only generate heckin wholesome dogs in hats and astronauts riding horses in space
Replies: >>105799655
Anonymous
7/4/2025, 5:02:07 PM No.105799618
>>105799602
Cuck95 are thirdworlders like jeets and so on.
Check steam play time numbers on the reviews you dum dum. Just admit you are wrong Ferret Software wanna be.
Anonymous
7/4/2025, 5:09:20 PM No.105799655
>>105799604
>train lora
works on my machine
Replies: >>105799674
Anonymous
7/4/2025, 5:12:27 PM No.105799674
>>105799655
Tell it to flux users
Anonymous
7/4/2025, 5:14:32 PM No.105799683
>>105799266
>I think, unlike Claude and OpenAI, we didn't have nearly enough quality annotated data, as Meta only started seriously gathering it after LLaMA 3. I'm not sure why they waited so long, but yeah. That's the reason why Zuckerberg bought Scale AI.
As well as data gathering, are you all also immune to sarcasm?
Anonymous
7/4/2025, 5:16:29 PM No.105799695
All your local models will be irrelevant in two weeks
Replies: >>105799702
Anonymous
7/4/2025, 5:17:01 PM No.105799702
>>105799695
Steve is releasing in two weeks.
Replies: >>105799746
Anonymous
7/4/2025, 5:24:54 PM No.105799746
>>105799702
Tell Steve it's not healthy to hold it in for so long.
Anonymous
7/4/2025, 5:32:27 PM No.105799820
>>105799602
https://incontinentcell.itch.io/factorial-omega
Anonymous
7/4/2025, 5:35:55 PM No.105799857
>>105799266
dubs cheque
What are the odds Meta is going to reduce filtering for the pretraining and reduce censorship in the finetuning, post-training, etc.?
Anonymous
7/4/2025, 5:38:41 PM No.105799875
I'm coping with Qwen3-235B-A22B-UD-Q3_K_XL. Is that the best for 2x3090s + 64GB RAM?
Replies: >>105799972
Anonymous
7/4/2025, 5:47:27 PM No.105799948
I'm currently using a Q4 quant of Gemma3 for RP. If I swap to a Q8 quant of the same model, how much of an improvement can I expect to see?
Replies: >>105799963
Anonymous
7/4/2025, 5:49:37 PM No.105799963
>>105799948
How much does it cost you to test it yourself?
Replies: >>105800060
Anonymous
7/4/2025, 5:49:37 PM No.105799964
>>105798345
The future belongs to AI companies that sell agents that can't perform any specific task and eat tokens. Companies have to buy them because they have to tell investors AI has made them 500x more efficient. They're going to be rolling in cash with the agent meme.
Anonymous
7/4/2025, 5:50:25 PM No.105799972
>>105799875
yeah or one of the largestrals. Get 128GB to run DS V3 0324 IQ1_S_R4 for higher quality cope
Anonymous
7/4/2025, 6:01:29 PM No.105800060
>>105799963
More than I'd like. Maybe if I bought a lot of RAM, I could try. After all, I'm testing for quality, not speed. Still, I'd like to actually use it, so I'd definitely need a GPU upgrade.
Anonymous
7/4/2025, 6:48:57 PM No.105800387
>>105795473
Cypher is the name of one of ScaleAI's datasets.
>t. I worked on it
Anonymous
7/4/2025, 7:06:29 PM No.105800526
>>105800515
>>105800515
>>105800515
Anonymous
7/4/2025, 7:10:30 PM No.105800563
>>105799393
>but there wasn't that many troons in the early internet

in presence or numbers ? numbers idk but presence you are then outing yourself troons were but it wasent the mentally ill shit now it was a "imagine a women who is not a faggot and lmao look that dude is larping as it XDDDD lol well have that soon enough give it some time" it was a equiveleant of a cosplay event it was fun and everyone played along because everyone knew what was meant and agreed now its just "castrate yourself goy oh you wont ? you are an incel mstow womne hating bla bla bla go mutilate yourself goy" before it was a cargo cult for a better future now its a demoralisation campaign towards satanism and demon (w*men) worship just like how blackpill used to refer how negative the world in general is but go co-opted by demon seeking faggots into their cuck bullshit

also in reagrd to troonism there diednt used to be surgeyr and none of that it was just feminine looking dudes crossdressing
Anonymous
7/4/2025, 7:40:16 PM No.105800794
>>105798806
Use it on Ai Studio. The Gemini models will refuse. However, they greatly reduce the capabilities of their models, so don't expect a great summary.
Anonymous
7/4/2025, 7:43:25 PM No.105800829
>>105799283
I'm a data reviewer for them. The data can be quite good or quite bad. I worked on a "safety project": about 95% of the data needed heavy work to make it good. The more instructions you give to "attempters", the less likely you are to get a good dataset.
Anonymous
7/4/2025, 9:10:02 PM No.105801575
>>105795473
It's an Amazon model and it's utter coal.