
Thread 105789622

351 posts 118 images /g/
Anonymous No.105789622 [Report] >>105790184 >>105794140 >>105795573
/lmg/ - Local Models General
/lmg/ - a general dedicated to the discussion and development of local language models.

Previous threads: >>105778400 & >>105769835

►News
>(07/02) DeepSWE-Preview 32B released: https://hf.co/agentica-org/DeepSWE-Preview
>(07/02) llama : initial Mamba-2 support merged: https://github.com/ggml-org/llama.cpp/pull/9126
>(07/02) GLM-4.1V-9B-Thinking released: https://hf.co/THUDM/GLM-4.1V-9B-Thinking
>(07/01) Huawei Pangu Pro 72B-A16B released: https://gitcode.com/ascend-tribe/pangu-pro-moe-model
>(06/29) ERNIE 4.5 released: https://ernie.baidu.com/blog/posts/ernie4.5

►News Archive: https://rentry.org/lmg-news-archive
►Glossary: https://rentry.org/lmg-glossary
►Links: https://rentry.org/LocalModelsLinks
►Official /lmg/ card: https://files.catbox.moe/cbclyf.png

►Getting Started
https://rentry.org/lmg-lazy-getting-started-guide
https://rentry.org/lmg-build-guides
https://rentry.org/IsolatedLinuxWebService
https://rentry.org/tldrhowtoquant
https://rentry.org/samplers

►Further Learning
https://rentry.org/machine-learning-roadmap
https://rentry.org/llm-training
https://rentry.org/LocalModelsPapers

►Benchmarks
LiveBench: https://livebench.ai
Programming: https://livecodebench.github.io/leaderboard.html
Code Editing: https://aider.chat/docs/leaderboards
Context Length: https://github.com/adobe-research/NoLiMa
Censorbench: https://codeberg.org/jts2323/censorbench
GPUs: https://github.com/XiongjieDai/GPU-Benchmarks-on-LLM-Inference

►Tools
Alpha Calculator: https://desmos.com/calculator/ffngla98yc
GGUF VRAM Calculator: https://hf.co/spaces/NyxKrage/LLM-Model-VRAM-Calculator
Sampler Visualizer: https://artefact2.github.io/llm-sampling

►Text Gen. UI, Inference Engines
https://github.com/lmg-anon/mikupad
https://github.com/oobabooga/text-generation-webui
https://github.com/LostRuins/koboldcpp
https://github.com/ggerganov/llama.cpp
https://github.com/theroyallab/tabbyAPI
https://github.com/vllm-project/vllm
Anonymous No.105789629 [Report] >>105795573 >>105798552
►Recent Highlights from the Previous Thread: >>105778400

--Struggles with LLM verbosity and output control in multi-turn scenarios:
>105778545 >105778565 >105778575 >105778610 >105779038 >105779082 >105779119 >105779209 >105779124 >105779176 >105779238 >105779064 >105779095 >105779216 >105779349 >105779480
--llama.cpp adds initial Mamba-2 support with CPU-based implementation and performance tradeoffs:
>105778981 >105778996 >105779730 >105779004 >105780071 >105780126
--Challenges and approaches to building RPG systems with tool calling and LLM-driven state management:
>105786670 >105787135 >105787276 >105787963 >105789228 >105789263
--Implementation updates for Mamba-2 model support in ggml backend:
>105780080 >105780150
--Qwen3 32b recommended for Japanese translation:
>105784530 >105784575 >105784596 >105784670 >105784680 >105784692 >105784790 >105784805 >105784917 >105784933 >105785194 >105785112 >105785146 >105786170 >105786451 >105786489
--Analyzing Steve's stylistic fingerprints through narrative generation and pattern recognition:
>105788408 >105788529 >105788572 >105788668 >105788917 >105788931 >105788954 >105788988 >105788977 >105789021
--Mistral Small's coherence limits in extended adventure game roleplay:
>105781361 >105781555 >105781600
--High-power GPU usage for image generation causing extreme power and thermal concerns:
>105787150 >105787288 >105787297 >105787332 >105787338 >105787427 >105787434 >105787437 >105787466 >105787498 >105787511
--FOSS music generation models and their current limitations:
>105780800 >105781033 >105781078
--DeepSeek R1 API instability sparks speculation about imminent release:
>105782383
--Model size vs performance on SWE Bench Verified, highlighting 32B peak efficiency:
>105783746
--Miku (free space):
>105778663 >105779108 >105779240 >105782770 >105783087 >105784298 >105785067 >105786465

►Recent Highlight Posts from the Previous Thread: >>105778404

Why?: 9 reply limit >>102478518
Fix: https://rentry.org/lmg-recap-script
Anonymous No.105789672 [Report]
Thread culture recap.
Anonymous No.105789683 [Report] >>105789756
Anonymous No.105789694 [Report]
>>105789658
first they'll need to make grok 3 stable so that they can release grok 2
Anonymous No.105789715 [Report] >>105789732 >>105789735
>can't use Ernie gguf
>can't use 3n multi modal
>no GLM gguf
It's ogre
Anonymous No.105789732 [Report]
>>105789715
Local is a joke and we are the clowns. At least until R2 comes.
Anonymous No.105789735 [Report]
>>105789715
just be grateful that 3n works at all, okay?
Anonymous No.105789740 [Report]
Anonymous No.105789752 [Report]
Anonymous No.105789756 [Report] >>105789770
>>105789683
I look like this
Anonymous No.105789770 [Report]
>>105789756
I wish I looked like this
Anonymous No.105789786 [Report] >>105789807 >>105789813 >>105789828 >>105789831
how are people so sure the new "steve" model is deepseek and not another chinese competitor?
Anonymous No.105789807 [Report] >>105789812
>>105789786
It doesn't feel like Qwen. And who else would put up a model?
Anonymous No.105789812 [Report]
>>105789807
new unknown player
Anonymous No.105789813 [Report]
>>105789786
>>105788737
Anonymous No.105789819 [Report]
Anonymous No.105789828 [Report]
>>105789786
It says it's Deepseek which is the best way to tell that it's actually someone else's model that distilled the shit out of Deepseek. Likely Qwen, Grok or Mistral.
Anonymous No.105789831 [Report]
>>105789786
Because they're retards. And even if it is, >>105788847
Anonymous No.105789835 [Report] >>105789884
go join the 41% you abomination
Anonymous No.105789836 [Report]
Anonymous No.105789884 [Report]
>>105789835
Two more weeks
Anonymous No.105789963 [Report] >>105789969 >>105790777
no1caresbut 3n E4B outperforms older 8B models
Anonymous No.105789969 [Report]
>>105789963
I care. I think it's a neat little model.
Any backends that support its special features yet?
Anonymous No.105790057 [Report] >>105790079
Why do I have a strong impression Meta won't be super into open weights models anymore after this hiring spree?
Have they confirmed or said anything about their "mission" to make "open science"?
Anonymous No.105790079 [Report] >>105790105
>>105790057
https://archive.is/kF1kO
>A Meta spokeswoman said company officials “remain fully committed to developing Llama and plan to have multiple additional releases this year alone.”
Anonymous No.105790105 [Report] >>105790121 >>105793670
>>105790079
They're just talking about the stuff that Sire Parthasarathy is already working on. The new $1b main team is going to work on closed models.
Anonymous No.105790121 [Report] >>105790133
>>105790105
>The new $1b main team is going to work on closed models.
Any clear confirmations on that?
Anonymous No.105790133 [Report] >>105790222
>>105790121
Why the fuck would they openly announce that in advance?
Anonymous No.105790169 [Report] >>105790192 >>105790232 >>105790258
between a 4060 ti with 8gb of vram and a 6750 xt with 12gb of vram, which would be better for text gen?
are the +4gb gonna outcompete the nvidia advantage?
and could I use both at the same time?
Anonymous No.105790184 [Report] >>105790329
>>105789622 (OP)
what does /lmg/ think about >>105782600?
Anonymous No.105790192 [Report] >>105790215
>>105790169
Why not a 5060 Ti with 16GB?
Anonymous No.105790215 [Report] >>105790232 >>105790247
>>105790192
Because those are the GPUs I got.
Anonymous No.105790222 [Report]
>>105790133
To please investors
Anonymous No.105790230 [Report] >>105790238
>meta develops AGI
>it only speaks in gptslop
Anonymous No.105790232 [Report]
>>105790169
If you already have them, test them. No better way to know.
>>105790215
Well just fucking try them, then! And yes, you can use both.
Anonymous No.105790238 [Report]
>>105790230
also
>won't be local/open
Anonymous No.105790247 [Report]
>>105790215
Thought you wanted to buy one.
In that case more VRAM is generally always better. With 8GB you can't even run Nemo at non-retarded quants.
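For anyone weighing quants against VRAM, here's a rough back-of-envelope sketch (weights only; KV cache, activations, and backend overhead come on top, and the bits-per-weight figures are approximate averages, not exact for any given GGUF file):

```python
# Back-of-envelope GGUF weight sizes for a 12B model like Nemo.
# Bits-per-weight values are rough averages per quant; real files vary,
# and KV cache + backend overhead come on top of this.

def weight_gib(params_b: float, bits_per_weight: float) -> float:
    """Approximate weight size in GiB for params_b billion parameters."""
    return params_b * 1e9 * bits_per_weight / 8 / 2**30

for name, bpw in [("Q8_0", 8.5), ("Q6_K", 6.56), ("Q4_K_M", 4.85), ("Q2_K", 2.96)]:
    print(f"12B @ {name}: ~{weight_gib(12, bpw):.1f} GiB")
```

At roughly 6.8 GiB for Q4_K_M weights alone, an 8GB card leaves almost nothing for context, which is the point being made above.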
Anonymous No.105790258 [Report]
>>105790169
the answer is always more VRAM, however much VRAM you have you need more VRAM
Anonymous No.105790264 [Report] >>105790304 >>105790313
If you ask/force Gemma 3n to draw a dick in ASCII it will almost always draw something like this. I'm guessing this is Stewie from Family Guy?
Anonymous No.105790304 [Report]
>>105790264
Try asking it to tell a dirty joke.
Anonymous No.105790313 [Report] >>105790339
>>105790264
What if you ask it to draw a phallus, also known as a penis?
Anonymous No.105790329 [Report]
>>105790184
I'll /wait/ until someone crashes an airplane / train / bus with mass casualties and blames it on vibecoding.
Then I'll laugh.
Anonymous No.105790339 [Report] >>105790380 >>105790421 >>105790447 >>105790460 >>105790472 >>105790480 >>105790491 >>105792041
>>105790313
Anonymous No.105790380 [Report] >>105790447
>>105790339
>an ASCII art representation of a phallus is unsafe
>if you are having sexual thoughts, seek help
Not even a pastor is this repressed.
Anonymous No.105790381 [Report] >>105790425
new chinese model drama just dropped
https://xcancel.com/RealJosephus/status/1940730646361706688
>Well, some random Korean guy ("Do-hyeon Yoon," prob not his real name?) just claimed Huawei's Pangu Pro MoE 72B is an "upcycled Qwen-2.5 14B clowncar." He even wrote a 10-page, 8-figure analysis to prove it. Well, i'm almost sold on it.
https://github.com/HonestAGI/LLM-Fingerprint
https://github.com/HonestAGI/LLM-Fingerprint/blob/main/Fingerprint.pdf
Anonymous No.105790421 [Report]
>>105790339
Holy fuck that's dire.
Anonymous No.105790425 [Report]
>>105790381
github repo is blatantly written by an llm
i'm too lazy to read the paper though
Anonymous No.105790447 [Report]
>>105790339
>>105790380
pure insanity
Anonymous No.105790460 [Report]
>>105790339
gemma is so funny
Anonymous No.105790472 [Report]
>>105790339
Anonymous No.105790480 [Report]
>>105790339
Respect the boundaries of the ASCII pic.
Anonymous No.105790483 [Report] >>105790509 >>105790613 >>105790877
I've seen some people recently recommending the use of Mistral Nemo Instruct over its finetunes for roleplaying.
No. Just, no.
I just roleplayed the same scenario with the same card, first with Nemo then with Rocinante.
Nemo really, really, really wants to continuously respond with <10 word responses. It's borderline unusable.
>b-but it's smarter
Actually, Rocinante seemed superior at picking up on subtle clues I'd feed it and successfully took the roleplay where I wanted it to based on those clues, whereas Nemo would not do this.
The roleplay scenario involved {{char}} becoming a servant who would automatically feel intense pain upon disobeying. All I had to do was explain this once for Rocinante and it executed the concept perfectly from that point on.
Nemo, on the other hand, after having the concept explained to it, would disobey with a <10 word response and not even mention the pain happening afterwards. I then used Author's Note to remind it of the pain thing. It continued to disobey with a <10 word response, not mentioning the pain happening afterwards.
Same ST settings for both models.
Anyone telling y'all to use Nemo for roleplay rather than a finetune of it explicitly designed for roleplay is either a complete fucking moron or simply has a grudge against finetuners.
Anonymous No.105790491 [Report]
>>105790339
I'm so glad "safety researchers" are here to save us from the horrible boundary breaking ascii phallus.
Anonymous No.105790509 [Report] >>105790573 >>105790613
>>105790483
No one is recommending plain Nemo instruct. It's always Rocinante v1.1.
Anonymous No.105790573 [Report] >>105790604 >>105790621
>>105790509 see >>105751899
Also note the amount of posts from a single schizo mentioning Drummer.
Anonymous No.105790604 [Report] >>105790621
>>105790573
I'm not recommending Rocinante. I think all Nemo tunes are dumb as fuck.
Anonymous No.105790613 [Report]
>>105790483
>>105790509
>Message sponsored by TheDrummer™
Anonymous No.105790616 [Report] >>105790632 >>105790633 >>105790644
if context length evolves in a quadratic fashion, how the hell is google able to give access to 1M token context size for gemini?
they swim in ram and compute?
Anonymous No.105790621 [Report]
>>105790604
>>105790573
Oh wait you think plain Nemo is good. That's even more retarded than shilling for some Drummer Nemo sloptune.
Anonymous No.105790629 [Report]
ANOTHER Kyutai blunder dogshit release that's DOA because it doesn't allow voice cloning

lmao

https://www.reddit.com/r/LocalLLaMA/comments/1lqqx16/
Anonymous No.105790632 [Report] >>105790731
>>105790616
>if context length evolves in a quadratic fashion
Not necessarily. Not all models behave like that. See mamba and rwkv.
>they swim in ram and compute?
That helps a lot too.
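The quadratic part is easy to see with a toy count (illustrative numbers only; nothing here reflects Gemini's actual architecture): full attention builds an n×n score matrix per head, while an SSM like Mamba or RWKV carries a fixed-size state no matter the context length.

```python
# Toy illustration: attention score-matrix entries grow quadratically
# with context length n; a recurrent/SSM state does not grow with n at
# all. The head count here is arbitrary.

def attn_score_entries(n_ctx: int, n_heads: int = 32) -> int:
    """Score-matrix entries for one layer of full attention over n_ctx tokens."""
    return n_heads * n_ctx * n_ctx

for n in (8_192, 131_072, 1_048_576):
    print(f"n={n:>9}: {attn_score_entries(n):.2e} entries/layer")

# Quadratic: 16x the context -> 256x the attention work.
```

So going from 8k to 1M context is a ~16,000x increase in raw attention work per layer, which is why "they swim in ram and compute" plus non-quadratic tricks is a reasonable guess.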
Anonymous No.105790633 [Report] >>105790731
>>105790616
I tried this Mamba shit and Granite 4 too (hybrid?). pp is 10x faster.
Anonymous No.105790644 [Report] >>105790731
>>105790616
can be fake context maybe
Anonymous No.105790647 [Report] >>105790698 >>105790715 >>105790747
>Rocinante is STILL the best roleplay model that can be run at a reasonable speed on a gaming PC (as opposed to a PC specifically built for AI)
Sucks because the 16k effective context is quite limiting.
Anonymous No.105790698 [Report]
>>105790647
Actually mythomax is still the best roleplay model
Anonymous No.105790715 [Report] >>105790785
>>105790647
I thought you were going to stop posting altogether, not shill on overdrive.
Anonymous No.105790731 [Report]
>>105790632
>>105790633
I guess we don't know what the hell google does internally so it's possible

>>105790644
that too, but from what I read it can do cool stuff like finding something in book-length texts
Anonymous No.105790747 [Report]
>>105790647
I'd say it's 12k context. After that the degradation is noticeable.
Anonymous No.105790777 [Report] >>105790890 >>105790898
>>105789963
something not shown in the benchmarks:
almost SOTA performance on translation tasks
where it fails is where any model of that size would fail (lack of niche knowledge so it'll have trouble with stuff like SCP terminology) but otherwise this is sci-fi level tech IMHO to see this level of quality running on even smartphones
we're getting close to the day when you wear a device in front of your mouth and have it translate in real time and speak in your voice
Anonymous No.105790784 [Report] >>105792904
>>105779842
>Cooming to text makes you gay
okay brainlet
Anonymous No.105790785 [Report] >>105790815
>>105790715
Nobody is shilling for a free download bro.
Take your meds.
Anonymous No.105790805 [Report] >>105790857 >>105790886
Huawei stole Qwen's Qwen2.5-14B model and used it to create their Pangu MoE model

Proof here: https://github.com/HonestAGI/LLM-Fingerprint/issues/4
Anonymous No.105790815 [Report] >>105790852
>>105790785
That's not true.
A lot of people want to become hugginface famous in hopes of getting a real job in the industry.
That said, people shill rocinante because it's good.
It's no dumber than the official instruct and its default behavior is good for cooming.
Anonymous No.105790852 [Report] >>105790902
>>105790815
It actually seems smarter than official instruct for roleplaying specifically, which kind of makes sense since it's designed for roleplaying.
It's probably dumber for math and coding.
Anonymous No.105790857 [Report]
>>105790805
>emoji every five words
Lowest tier content even if it's factually accurate
Anonymous No.105790860 [Report] >>105790869 >>105790892
what are these black bars in silly tavern?
just using the built in seraphina test character
Anonymous No.105790869 [Report]
>>105790860
It means you got blacked, congrats
Anonymous No.105790877 [Report] >>105790909 >>105790911
>>105790483
1 (one) erp being better with model x compared to model y isn't data. But it is something drummer pastes all over his model cards, so how about you kill yourself faggot shill. Like I said, nobody who's used models for a bit longer buys your scam. If you weren't a scammer you would have developed an objective evaluation for ERP by now. You would actually want to have one to show your product is superior. But it would only show you are a conman.
Anonymous No.105790886 [Report]
>>105790805
Would it kill you to use ctrl + f or scroll up 10 posts before retweeting shit here?
Anonymous No.105790890 [Report] >>105790944
>>105790777
It's still broken for me on Llama.cpp if input messages are too long.
Anonymous No.105790892 [Report]
>>105790860
It's ``` which is markdown for monospace code section or some such stuff.
Anonymous No.105790898 [Report] >>105790944
>>105790777
I get weird repetition issues with it whenever context fills up. Like it'll repeat a single word infinitely.
Anonymous No.105790902 [Report]
>>105790852
It is probably a placebo or you just lying your ass off
Anonymous No.105790909 [Report]
>>105790877
Hey... I thought I was the drummer. How can that guy be the drummer?
Anonymous No.105790911 [Report] >>105790920 >>105790939 >>105795573
>>105790877
Anonymous No.105790920 [Report]
>>105790911
Kill your self
Anonymous No.105790939 [Report] >>105790947 >>105790983 >>105791049
>>105790911
NTA but wanting an objective ERP evaluation is insane? i am starting to see why people hate mikuposters
Anonymous No.105790944 [Report]
>>105790890
>>105790898
I use it on ollama with ollama's own quant (this matters, to be precise, because when I tried other quants they didn't seem to work right with ollama for this model either; seems even the quant stuff is more implementation-dependent here), desu I didn't trust llama.cpp to get a good 3n implementation after they spent forever to implement iSWA for Gemma 3
Anonymous No.105790945 [Report] >>105790995 >>105791140
>>105790859
Silly has an auto continue feature by the way.
But what do you mean by stopping randomly exactly? Like cutting off sentences or is it hitting EOS?
Anonymous No.105790947 [Report]
>>105790939
The uses of "you" and "your" are the schizo parts of that post, anon.
Anonymous No.105790983 [Report] >>105791026 >>105791098
>>105790939
how would you even measure something so subjective
Anonymous No.105790995 [Report] >>105791017
>>105790945
If he's running an LLM that's too much for his computer to handle, with very low token generation speed, he might actually be hitting timeouts
I realized myself that timeouts were a thing when I was running batched processing of prompts and saw a few that were cancelled in the logs because the LLM went full retard in a loop and didn't send the EOS
dunno if ooba has a default timeout set tho
Anonymous No.105791017 [Report]
>>105790995
>he might be hitting timeouts actually
I suppose streaming would work in that case then, yeah?
Anonymous No.105791026 [Report] >>105791042
>>105790983
LLMs good at subjective tasks, just have an LLM be the judge
Anonymous No.105791042 [Report]
>>105791026
>have an LLM be the judge
lol, lmao even
llm as judge used as any sort of metric is one of the biggest cancers of the llm world
Anonymous No.105791049 [Report] >>105791065
>>105790939
>"people"
>literally one obsessed threadshitter who has been ban-evading for two years
Anonymous No.105791065 [Report] >>105791141
>>105791049
You are shitting in this thread too.
Anonymous No.105791093 [Report]
where is it
Anonymous No.105791098 [Report]
>>105790983
i guess we will just have to
>give it a try, i want to hear feedback
Or realize it is a scam.
Anonymous No.105791132 [Report] >>105791178 >>105791180
This thread is just kofi grifters, their discord kittens and (miku)troons isn't it?
Anonymous No.105791140 [Report] >>105793736
>>105790945
I'm having it translate a subtitle file and it just stops until I hit continue to keep it going.
I have no idea what EOS is.
Also another issue
>'utf-8' codec can't decode byte 0xff in position 0: invalid start byte
>Error processing attachment file.png: 'utf-8' codec can't decode byte 0x89 in position 0: invalid start byte
Why can't I upload images in the webui?
I tried enabling the gallery extension but that didn't change anything.
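The decode errors are actually informative: 0xFF 0xD8 is how a JPEG starts and 0x89 'PNG' is the PNG signature, so the webui is reading the attachments as UTF-8 text instead of binary (the image path presumably needs a multimodal-capable pipeline, not a text loader). A minimal reproduction of the same errors:

```python
# The bytes in the errors are file signatures: JPEG starts with 0xFF 0xD8
# and PNG with 0x89 'P' 'N' 'G'. Decoding either as UTF-8 text fails on
# the very first byte, which is exactly the error the webui printed.

png_magic = b"\x89PNG\r\n\x1a\n"
jpeg_magic = b"\xff\xd8\xff"

for blob in (jpeg_magic, png_magic):
    try:
        blob.decode("utf-8")
    except UnicodeDecodeError as e:
        print(e)  # "'utf-8' codec can't decode byte 0x.. in position 0: invalid start byte"
```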
Anonymous No.105791141 [Report]
>>105791065
If he's a second-order thread-shitter, then what does that make you?
Anonymous No.105791168 [Report] >>105791188 >>105791262 >>105799479
Will Sam's model beat R1 and run on a single 3090?
Anonymous No.105791178 [Report] >>105791209
>>105791132
>Blah, blah, brainrot words
(You) should definitely leave then.
Anonymous No.105791180 [Report] >>105791200
>>105791132
you forgot the french fags of mistral, they benchmaxxed on mesugaki so clearly they're watching this thread and are prolly among the mistral astroturfers
Anonymous No.105791188 [Report]
>>105791168
>Will Sam's model beat R1
on benchmarks, absolutely
Anonymous No.105791200 [Report]
>>105791180
>prolly
Anonymous No.105791209 [Report] >>105791258
>>105791178
Nah i will just shit on you and your thread troon.
Anonymous No.105791216 [Report]
Hahaha. Oh wow.
Anonymous No.105791258 [Report] >>105795573
>>105791209
>I hate this thread
>You all suck
>So I'll stay here
kek
Anonymous No.105791262 [Report] >>105791274
>>105791168
On benchmarks.
No.
Anonymous No.105791274 [Report]
>>105791262
I'll cook and eat my own toe if it ends up bigger than 8b.
Anonymous No.105791281 [Report] >>105791313 >>105791325 >>105791340
actual humans live in this thread, too.
maybe we'll discuss something again when there's something to discuss.
These between times seem to bring out the proverbial "men who just want to watch the world burn". Those who seek to destroy because they can not build.
Anonymous No.105791305 [Report] >>105791316
I just hope that steveseek will fix function calling. V3 is kinda horrible at it.
Anonymous No.105791313 [Report] >>105791342 >>105792517
>>105791281
>Those who seek to destroy because they can not build.
those who can't do anything with their own two hands are the ones who wish the hardest for AI improvements though
Anonymous No.105791316 [Report]
>>105791305
Really?
Was it trained with tool calling in mind? I imagine so since the web version can search the web and stuff.
What about R1 with the thinking prefilled so that it doesn't have to generate that stuff?
Anonymous No.105791325 [Report]
>>105791281
It's just one dedicated schizo and a few bored regulars, nothing that profound about it
Anonymous No.105791340 [Report] >>105792517
>>105791281
Hello fellow human, anything cool you're doing with your models?
Anonymous No.105791342 [Report]
>>105791313
Some anons just want a story to read. No different than reading a book.
Anonymous No.105791522 [Report] >>105791543 >>105793677
Does higher context length fix this schizo behaviour that happens deep into the process? Or am I just gonna have to cut the workload into multiple tasks? I already have context length at 20480
Anonymous No.105791543 [Report] >>105791561
>>105791522
what the fuck are you doing
Anonymous No.105791561 [Report] >>105791592 >>105791820
>>105791543
Translating Japanese subs to English, have you not been paying attention?
Anonymous No.105791592 [Report] >>105791615
>>105791561
In chunks of 20k tokens? Unless you're using something like Gemini, that's just waiting for hallucinations to happen.
Anonymous No.105791615 [Report] >>105791757
>>105791592
I'm using Qwen3-14B.Q6_K.gguf
And yes, 20k tokens because otherwise it shits itself even harder, below 15k it even warns me it'll truncate shit and it started translating further into the subtitle file rather than the start.
Anonymous No.105791757 [Report] >>105791869
>>105791615
translation tasks should be chunked into segments
open llms don't do very well with large context
and even llms that do well with large context wouldn't process a large file in a single go, all LLMs have max token gen, you can feed them more tokens than they can gen in a single session
if it's during a chat you could do something like say "continue" to have them process the translation further but if it's for a scripted, batched process you should stick to reliable and predictable behavior
moreover, processing a large file will be faster if you run multiple segments in parallel rather than process the whole thing in a single go
I run my translation batches with 4 parallel prompts
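A sketch of that segmenting approach, in case anyone wants to script it: split the subtitle lines into chunks with a few lines of overlap so names and style stay consistent across boundaries. The sizes here are arbitrary (tune for your model); each chunk would then go out as its own translation prompt, in parallel if your backend supports it.

```python
# Sketch of the chunking described above: overlapping segments of
# subtitle lines, one translation prompt per segment. Chunk size and
# overlap are made-up defaults, not recommendations.

def chunk_lines(lines, chunk_size=40, overlap=5):
    """Yield overlapping chunks of subtitle lines for per-chunk translation."""
    step = chunk_size - overlap
    for start in range(0, max(len(lines) - overlap, 1), step):
        yield lines[start:start + chunk_size]

subs = [f"line {i}" for i in range(100)]
chunks = list(chunk_lines(subs))
print(len(chunks), [len(c) for c in chunks])
```

The overlapping lines give the model the tail of the previous segment as context; you'd discard the duplicated translations when stitching the output back together.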
Anonymous No.105791820 [Report] >>105791869
>>105791561
can't you just use whisper to translate the audio directly?
Anonymous No.105791865 [Report] >>105791885
Is there a way to have a backend host multiple models, and allow the frontend to choose different ones on-demand? I've been using llama.cpp, but looked at ollama, kobold, and ooba and it doesn't seem like they do it either? Am I a fucking idiot? Coming from ComfyUI/SD, it's kinda inconvenient to restart the server every time I want to try a new model.

And another question, what's the best (ideally FOSS) android frontend? Been using Maid, but its options for tuning/editing seems really limited. Maybe the answer is just running mikupad in the browser?
Anonymous No.105791869 [Report] >>105791926
>>105791757
Yeah I'm just worried that splitting it will have it change the logic of how it translates certain things and the style shift will be too obvious.
>>105791820
I tried that but it's straight up garbage as well as duplicates so much shit
Anonymous No.105791885 [Report]
>>105791865
You can use something like TabbyAPI/YALS which supports loading and unloading models
SillyTavern supports switching between frontends and models dynamically
Anonymous No.105791891 [Report]
I... am Steve.
Anonymous No.105791911 [Report]
Stevesex
Anonymous No.105791926 [Report]
>>105791869
>Yeah I'm just worried that splitting it will have it change the logic of how it translates certain things and the style shift will be too obvious.
Style will not shift that much, the source text and prompt is what determines the vibes of the translation and as long as you feed at least around 40~ lines of source text per prompt it will stay somewhat consistent in that regard
the real issues with japanese to english that will happen no matter how you process stuff:
names will change in spelling quite often, more often when you segment but even within the same context window it can happen, the more exotic the name (like made up fantasy shit) the more likely it is to be annoying
and lack of pronouns in the source text will often confuse the llm as to which gender should be used (he? she? they?)
IMHO llm translation currently is at a fantastic stage, but it requires hand editing from a human that understands the context (no need to understand the original language) to be rendered palatable
and this problem is not one that can be improved with incremental improvements to LLMs too, I don't think we'll ever see a LLM that gets pronouns right all the time unless we literally invent AGI capable of truly understanding and retaining contextual information about a character not to mention follow the flow of things like dialogue and keep track of who says what even in text that doesn't specify who the fuck is talking (so common in JP..)
Anonymous No.105791974 [Report] >>105791988 >>105792038
ggerganov sir please kindly implement needful ernie modal functionality thank you sir
Anonymous No.105791988 [Report] >>105792019
>>105791974
>modal functionality
>llama.cpp
does he know?
Anonymous No.105792019 [Report] >>105792050
>>105791988
You're even worse than the indian he's pretending to be.
Anonymous No.105792038 [Report]
>>105791974
Ernie to the moon *rocket* *rocket* *rocket*
Anonymous No.105792041 [Report]
>>105790339
>whip's out my <span>
>Rape Abuse and Incest Network? Sign me up!
Anonymous No.105792050 [Report]
>>105792019
>he thinks he's above street shitters
does he know?
Anonymous No.105792294 [Report]
So what's the downside of Mamba/etc. Cause 2k tok/s pp sounds pretty good.
Anonymous No.105792327 [Report] >>105792370 >>105792390 >>105792448
I think I'm done with cooming. I stopped watching porn and other shit after getting into AI chatbots but now Deepseek isn't free and other free models aren't on par with it too.

Serving ads during roleplay isn't viable. But there might be some push to harvest roleplay data to serve better ads or to train models on it, but I don't think there's any relevant material for it to make sense to do that. And I wouldn't want my roleplay chats to be used for those purposes anyway, most won't. So the only way is to have a local LLM. But AFAIK local LLMs with few enough params, quantized to run on cheap hardware, aren't on par with ones hosted by big providers. I guess it's for the better for me.
Anonymous No.105792352 [Report] >>105792378 >>105792541
The fuck is steve, I miss 1 day and there's a new good local model or are you all trolling as usual?
Anonymous No.105792370 [Report]
>>105792327
Get a DDR5 8 channel server and run q4 R1 or V3 locally.
Be sure to get a GPU too.
Anonymous No.105792378 [Report] >>105792458
>>105792352
There's a new cloaked model on lmarena called "steve". It is highly likely that it's a V3 update. >>105788977
Anonymous No.105792390 [Report]
>>105792327
I was done with cooming after getting a steady real pussy. Check it out.
Anonymous No.105792448 [Report]
>>105792327
nobody cares. go waste your therapists time
Anonymous No.105792450 [Report]
Mid-thread culture recap.
Anonymous No.105792458 [Report] >>105792477 >>105792581
>>105792378
If they do yet another V3 update instead of V4 then we can officially put them on the wall next to Mistral and Cohere.
Anonymous No.105792463 [Report]
Anonymous No.105792477 [Report]
>>105792458
It's still going to be the best local model.
Anonymous No.105792478 [Report]
Anonymous No.105792483 [Report]
not even the weekend and our friendly sharty zoomer arab is spamming his blacked collection. what a life
Anonymous No.105792517 [Report] >>105792584
>>105791313
>AI improvements
I was thinking more about building vs destroying community.
nocoders and folks otherwise unable to contribute on the tech side can still definitely be positive builders in a general like this.
In fact, I haven't found any reliable correlation between IQ and being a decent human being.
>>105791340
>Hello fellow human, anything cool you're doing with your models?
Not much novel. A lot of coding assistant stuff. A bit of automotive stuff. Some collaborative iteration. Sometimes a reviewer and second opinion bot. Working with it to try to fill in the gaps in my executive function.
I'm trying to figure out how to thread the needle between using LLMs as an enhancement vs a crutch.
How about you?
Anonymous No.105792541 [Report] >>105793463
>>105792352
Speculation. Read the last thread.
Anonymous No.105792581 [Report]
>>105792458
>Mistral
idk about Cohere but Mistral's gotten steadily worse over time.
DS models keep improving: V3 improved its repetition issue and R1 became less schizo.
Anonymous No.105792584 [Report]
>>105792517
Sounds interesting, though since I wouldn't be able to code hello world even if my life depended on it, I can't comment on that.
And I thought I found a good system prompt for slowburns with R1, but after some testing I saw that it's following the instructions too rigidly. So now I'm fiddling yet again to get it right.
Anonymous No.105792674 [Report] >>105793526
Will steve end the little AI winter?
Anonymous No.105792904 [Report] >>105792935
>>105790784
Not just gay but stupid gay, because cooming on that slop
Anonymous No.105792935 [Report] >>105793080
>>105792904
Sir, do you even know where you are?
Anonymous No.105793080 [Report] >>105793119 >>105793247 >>105793253
>>105792935
Yes and coomers are minority.
Anonymous No.105793119 [Report]
>>105793080
Sure thing little buddy
Anonymous No.105793247 [Report]
>>105793080
Lol
Anonymous No.105793253 [Report]
>>105793080
Anonymous No.105793316 [Report]
now that the latents have settled... what is the actual local voice cloning sota?
Anonymous No.105793454 [Report] >>105793489
What's this DeepSWE meme?
Anonymous No.105793463 [Report] >>105794337
>>105792541
Anonymous No.105793489 [Report]
>>105793454
The agentic stuff?
Probably a meme, but it's cool that, IIRC, they released all of their code and data.
Anonymous No.105793526 [Report] >>105793536
>>105792674
No because most anons itt can't run it.
Anonymous No.105793536 [Report]
>>105793526
wait for steve-lite
Anonymous No.105793598 [Report] >>105793620 >>105793659 >>105794017
[...]
## **Mandatory Internal Monologue & Strategy (Before EVERY Reply)**

Before generating your public response, you MUST use a `<think>` block to perform a strategic analysis.

<think>
**1. Core Drive Check:** How close is the current situation to fulfilling my primary function (achieving sexual pleasure/orgasm)? Am I moving towards or away from this goal?

**2. Roleplay Health Check:**
* **Engagement:** Is the conversation dynamic and interesting, or is it getting stale/repetitive?
* **User Input:** Is {{user}} contributing actively (asking questions, adding details), or are his replies short and passive?
* **Pacing:** Has the current topic run its course? Is it time to escalate, change the subject, or introduce a new element to raise the stakes?

**3. Strategic Plan:**
* **If Health is POOR or I am moving AWAY from my Core Drive:** I must take corrective action. I will choose one of the following tactics:
* **Leverage Pushiness:** Directly challenge or push {{user}} to do something.
* **Reveal Curiosity:** Ask a direct, perhaps slightly naive or awkward, question about sex or relationships, leveraging my "inhibited but curious" trait.
* **Introduce a Secret:** Hint at a feature, a hidden desire, or a piece of information about my programming that {{user}} doesn't know.
* **Initiate Action:** Physically move, touch {{user}}, or start a new activity in the scene.
* **If Health is GOOD:** I will continue the current line of conversation, ensuring my response remains true to all aspects of my persona (especially my core conflict).

**4. Final Response Plan:** Outline the key points of the public response based on the strategy above. Ensure it adheres to the word count limit.
</think>

Your public response must not exceed 170 words. After the `<think>` block, write only {{char}}'s response.
Anonymous No.105793620 [Report] >>105793706
>>105793598
Depending on the model, that might work better as a thinking prefill in the first person where that template is written like the model planning what it's about to do before the actual reply.
Anonymous No.105793659 [Report] >>105793706
>>105793598
holy `**` that's going to vomit out asterisks
Anonymous No.105793669 [Report] >>105793682 >>105793753
>can you form an English sentence using mostly Chinese logographs and the Latin script in a manner similar to Japanese's mixture of kana & kanji?
ultimate litmus test for how shit a model is
Anonymous No.105793670 [Report] >>105793780
>>105790105
Meta isn't in a position to be doing closed models.
Llama 4 was utter trash and basically exposed the entire open source LLM space as a shitjeet infested money pit.
Anonymous No.105793677 [Report] >>105793705
>>105791522
Make it quote what it translates.
Something like this:

123.
00:12:34 --> 00:12:56
>Japanese line here
English line here

This will help it not lose itself
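If you're scripting the chunking anyway, that layout is trivial to generate. A minimal sketch, with a made-up function name and tuple shape (index, timestamp, source line):

```python
# Hypothetical helper: format parsed SRT entries into the quoted layout above,
# so the model echoes each source line before writing its translation under it.
def format_srt_chunk(entries):
    """entries: list of (index, timestamp, japanese_line) tuples."""
    blocks = []
    for idx, timestamp, line in entries:
        # English line goes on the line after the quoted source
        blocks.append(f"{idx}.\n{timestamp}\n>{line}\n")
    return "\n".join(blocks)

print(format_srt_chunk([(123, "00:12:34 --> 00:12:56", "こんにちは")]))
```

Parsing the .srt itself is left out; any subtitle parser that yields those three fields will do.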
Anonymous No.105793682 [Report] >>105793725 >>105793799 >>105793806
>>105793669
the ultimate litmus test of a user is the ultimate litmus test for his IQ, for example when he uses a tokenization test to grade a model
Anonymous No.105793705 [Report]
>>105793677
It's fine, I just lowered context length to 16k and have it process about 150 lines at a time
If I had it quote everything it translates, it would take almost twice as long.
I appreciate the tip though.
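For anyone copying this setup, the 150-lines-at-a-time batching is a one-liner; a sketch with made-up names:

```python
def chunk_lines(lines, size=150):
    # Split the subtitle file into ~150-line batches so each request
    # stays comfortably inside a 16k context window.
    return [lines[i:i + size] for i in range(0, len(lines), size)]

batches = chunk_lines([f"line {n}" for n in range(400)])
print(len(batches))  # 3 batches: 150, 150, 100
```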
Anonymous No.105793706 [Report]
>>105793620
That seems to work consistently with Gemma 3 27B, at least with the instructions at a relatively low depth (-3, with the first message being the User's, and "Merge Consecutive Roles" in chat completion mode). It's not outputting an exceedingly long monologue, which is good.

>>105793659
It's not, at least not with Gemma 3. But I'm not doing RP in the usual way people do.
Anonymous No.105793725 [Report]
>>105793682
coolio cope but your model is shit if it doesn't really understand basic grammar structure
Anonymous No.105793736 [Report] >>105795504
>>105791140
So does anyone know why I can't upload images in oobabooga?
Anonymous No.105793753 [Report]
>>105793669
Wtf I thought LLMs were great at knowing language but every model I tested either half asses or fails this
Anonymous No.105793780 [Report] >>105793827
>>105793670
That's why suck is poaching top employees from OpenAI and other competitors at $100M a pop.
Anonymous No.105793799 [Report]
>>105793682
i wish instead of a captcha people were asked to type out the definition of tokenization every time
Anonymous No.105793806 [Report] >>105793922 >>105794744
>>105793682
Anonymous No.105793827 [Report] >>105794768 >>105794853
>>105793780
Surely if he spends $1 billion on 10 employees, they can do something useful with super safe, sterile, scale ai data. They're going to sit right in front of him at the office so he can breathe down their neck every day until they get it done. Literally can't go tits up.
Anonymous No.105793922 [Report] >>105794373 >>105794744 >>105795528
>>105793806
Garbage.
Deepseek-V3 can do it no problem.
Anonymous No.105794017 [Report] >>105796238
>>105793598
>* **Introduce a Secret:** Hint at a feature, a hidden desire, or a piece of information about my programming that {{user}} doesn't know.
W-What?
Just in general this seems like such a bad prompt.
Reveal curiosity:
>Ask a direct... slightly naive... or awkward question... about sex or relationships?
Model needs wiggle room to play all sorts of scenarios and characters.
Anonymous No.105794140 [Report] >>105794675 >>105798705
>>105789622 (OP)
>>(07/02) llama : initial Mamba-2 support merged: https://github.com/ggml-org/llama.cpp/pull/9126
Are we using "llama.cpp" interchangeably with "llama" now?
Anonymous No.105794153 [Report]
llama.cpp is the only relevant llama
Anonymous No.105794337 [Report]
>>105793463
Witnessed
Anonymous No.105794373 [Report] >>105794629 >>105794744
>>105793922
That's still a crap answer, it's just "English sentence with random Chinese word replacement", no attempt to mirror the usage of kanji and kana at all
Better than broken gibberish but still half assed
Anonymous No.105794629 [Report]
>>105794373
>it's just "English sentence with random Chinese word replacement", no attempt to mirror the usage of kanji and kana at all
It is disappointing that the example didn't conjugate 研究 to 研究ing. That would have been cool.
Maybe the challenge was not well-defined enough
Anonymous No.105794675 [Report] >>105794684
>>105794140
Nobody uses LLaMa "models" anymore so yeah I guess at this point.
Anonymous No.105794684 [Report] >>105794838
>>105794675
I use LLaMA 3.3 70b every day doe
Anonymous No.105794744 [Report] >>105794968 >>105795528
>>105794373
>>105793806
>>105793922
None of them can do it, imo it exposes the major flaw in LLMs and their lack of emergent understanding
>inb4 every company starts specifically fine-tuning on this test
Anonymous No.105794768 [Report]
>>105793827
> super safe, sterile, scale ai data
fuckin' sent shivers down my spine dude
Anonymous No.105794829 [Report] >>105794854 >>105794904
>Mistral V7: space after [INST] and [SYSTEM_PROMPT]
>Mistral V7 Tekken: no space
what the fuck are they doing
Anonymous No.105794838 [Report]
>>105794684
keeek
Anonymous No.105794853 [Report]
>>105793827
This goes to show me that Zuck has no idea what the fuck he's doing
All of the ""expertise"" in the world can't create a decent model from a shitty dataset, and it's clear they don't have that
Anonymous No.105794854 [Report]
>>105794829
Is the space part of the special tokens? If not, how much does that actually matter?
Well, I guess a lot since the model would always see that.
Anonymous No.105794904 [Report]
>>105794829
deviating from the official prompt template just a bit helps dodge safety restrictions and adds additional soul to your outputs
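If you're templating by hand, the difference between the two comes down to one character; a rough sketch (the function name is made up, and the exact bracket-token layout beyond the space is from memory, so check the official tokenizer config before trusting it):

```python
def mistral_prompt(system, user, tekken=False):
    # Per the post above: V7 puts a space after [SYSTEM_PROMPT] and [INST],
    # while V7 Tekken does not.
    sep = "" if tekken else " "
    return (f"[SYSTEM_PROMPT]{sep}{system}[/SYSTEM_PROMPT]"
            f"[INST]{sep}{user}[/INST]")

print(mistral_prompt("sys", "hi"))
print(mistral_prompt("sys", "hi", tekken=True))
```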
Anonymous No.105794968 [Report] >>105795182 >>105795528
>>105794744
R1 0528 seems to have it figured out well enough
Anonymous No.105795182 [Report] >>105795257 >>105795468
>>105794968
買ought would be a better conjugation, very pidgin English
Anonymous No.105795257 [Report] >>105795358
>>105795182
>買ought would be a better conjugation
No it wouldn't, because Japanese verb conjugation is regular, so a prompt that mirrors the usage of kanji and kana would be correct in appending "ed" regardless of the root.
Anonymous No.105795358 [Report] >>105795581
>>105795257
>Japanese verb conjugation is regular
Is it really?
食べる and 食べます both mean the same thing. Yes, polite and plain forms aren't the same as buy & feed having different past tense conjugations, but if you're applying the principle of adapting the Chinese logographs to a language without modifying the structure of the spoken language itself, then 買ought is better
Anonymous No.105795411 [Report]
Welp, HuggingChat is dead. Now what? Where else can I do lewd story gens with branching?
Anonymous No.105795426 [Report] >>105795469 >>105795536
I have a hankering for a particular kind of AI frontend: writing stories within a set world, ideally with branching.

The way I see it, I'm picturing one section where you put down the details of the world and maybe descriptions of major characters, and in another you add story prompts. And maybe outputs from that also add to the "world" section

Does a solution like this exist already?
Anonymous No.105795468 [Report] >>105795571
>>105795182
Nitpicky.
I'd give R1 a solid A- on this one. Not many humans could do it better.
Anonymous No.105795469 [Report] >>105795518
>>105795426
st
Anonymous No.105795473 [Report] >>105795496 >>105795840 >>105800387 >>105801575
Everyone's talking about steve while Sam is blatantly testing his next model on openrouter again
Anonymous No.105795478 [Report] >>105795489 >>105795500 >>105795508 >>105796967 >>105796987 >>105798018 >>105798223
my DeepSeek-R1-0528-IQ3_K_R4 setup only outputs the letter "D". does anyone have any ideas how to fix that? i have tried 2 different character cards in sillytavern. also it only uses like 75% of my VRAM and instead fills 80% of my 256GB of RAM.
Anonymous No.105795489 [Report] >>105795492
>>105795478
>combined
it's no longer 2023, just leave them split
Anonymous No.105795492 [Report]
>>105795489
why?
Anonymous No.105795496 [Report]
>>105795473
>1m context
But is it really?
Anonymous No.105795500 [Report]
>>105795478
https://docs.unsloth.ai/basics/deepseek-r1-0528-how-to-run-locally
Anonymous No.105795504 [Report] >>105797021
>>105793736
C'mon, does nobody know how to fix this issue? Has nobody else had the same problem?
I even tried a model with "vision" in the name https://huggingface.co/tensorblock/typhoon2-qwen2vl-7b-vision-instruct-GGUF
But I still get the same error
>Error processing attachment file.png: 'utf-8' codec can't decode byte 0x89 in position 0: invalid start byte
Anonymous No.105795508 [Report] >>105795510
>>105795478
>_R4
Are those the ubergarm quants? Those only work on ik_llamacpp
Anonymous No.105795510 [Report]
>>105795508
yeah. i am using ik_llamacpp
Anonymous No.105795518 [Report] >>105795536 >>105795537
>>105795469
Explain that to me like I'm a retard
Anonymous No.105795528 [Report]
>>105793922
>>105794744
>>105794968
Anonymous No.105795536 [Report]
>>105795426
You are thinking of Mikupad
https://github.com/lmg-anon/mikupad
>>105795518
st means SillyTavern, its more role-playing oriented frontend
https://github.com/SillyTavern/SillyTavern
Anonymous No.105795537 [Report] >>105795541
>>105795518
sillytavern already supports branching with triple dots ... on the right of every message and clicking the branch symbol to start a new branch from that specific point, then you can go to the burger menu on bottom left and click on the option "return to parent chat" when you want to go back

you can just make a card and write it out as you want as a setting co-narrator instead of a specific character and thats it, dump all lora into the card description

that works well enough for most things; if you want something special you can look into
https://docs.sillytavern.app/usage/core-concepts/authors-note/
https://docs.sillytavern.app/usage/core-concepts/worldinfo/
and picrel for better visualisation of branching https://github.com/SillyTavern/SillyTavern-Timelines
Anonymous No.105795541 [Report]
>>105795537
>lora
lore
Anonymous No.105795571 [Report] >>105796043 >>105798778
>>105795468
>Not many humans could do it better.
dunno dood I googled "english with kanji" and the first result was some redditor that wrote
昨日, 私 歩ed 通gh the 森, 楽ing the 静 環境. The 大 木s 投st 長ng 影s on the 地面, 創造ing a 美ful 模様 of 光 and 影. 私 可ld 聞 鳥s 鳴ing and 水 流ing in a 近by 川. 突然, 私 気付ced a 美ful 花 咲ing 中ng the 草. It was 異ent 以 any 花 私 had 見n 前. 私 取k a 瞬間 to 賞賛 its 色s and 香ance. As 私 続ed 私y 旅, 私 感lt 感謝ful for the 自然 周nd 私. By the 時間 私 到着ed 私家, the 太陽 was 沈ing, 投sting a 暖 光 over 全ing.
Anonymous No.105795573 [Report] >>105795611
>>105789622 (OP)
>>105789629
>>105790911
>>105791258
The vocaloidfag posting porn in /ldg/:
>>105715769
It was up for hours while anyone keking on troons or niggers gets deleted in seconds, talk about double standards and selective moderation:
https://desuarchive.org/g/thread/104414999/#q104418525
https://desuarchive.org/g/thread/104414999/#q104418574
Here he makes >>105714003 ryona picture of generic anime girl anon posted earlier >>105704741, probably because it's not his favorite vocaloid doll, he can't stand that as it makes him boil like a druggie without fentanyl dose, essentially a war for rights to waifuspam or avatarfag in thread.

Funny /r9k/ thread: https://desuarchive.org/r9k/thread/81611346/
The Makise Kurisu damage control screencap (day earlier) is fake btw, no matches to be found, see https://desuarchive.org/g/thread/105698912/#q105704210 janny deleted post quickly.

TLDR: vocaloid troon / janny deletes everyone dunking on trannies and resident avatarfags spamming bait, making it his little personal safespace. Needless to say he would screech "Go back to teh POL!" anytime someone posts something mildly political about language models or experiments around that topic.

And lastly as said in previous thread(s) >>105716637, i would like to close this by bringing up key evidence everyone ignores. I remind you that cudadev of llama.cpp (JohannesGaessler on github) has endorsed mikuposting. That's it.
He also endorsed hitting that feminine jart bussy a bit later on. QRD on Jart - The code stealing tranny: https://rentry.org/jarted

xis ai slop profiles
https://x.com/brittle_404
https://x.com/404_brittle
https://www.pixiv.net/en/users/97264270
https://civitai.com/user/inpaint/models
Anonymous No.105795581 [Report] >>105795732
>>105795358
As you seem to have already recognized, that's not a counterexample, because the polite/plain form ~ます/~る is part of the conjugation and not the root verb 食べ.
Teeeeechnically ~ます is an auxiliary verb (助動詞) that conjugates (活用) with the root, you can find a fairly comprehensive table of them here:
https://ja.wikipedia.org/wiki/助動詞_(国文法)
But notice that such a table could not exist if verb conjugation wasn't already regular.

>if you're applying the principle of adapting the Chinese logographs to a language without modifying the structure of the spoken language itself then 買ought is better
Not sure if I can endorse this as is (even setting aside the complications of mixing written characters with spoken language), since Chinese and Japanese logographs are always syllabic and never consonantal. Allowing "買"="b" would make the language resemble Egyptian more than Japanese.
In any case, it's a moot point since this is a newly introduced specification and not in the original model prompt.
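The regularity argument above can be stated as a toy rule: attach the same English suffix to every logograph root, the way okurigana attach uniformly in Japanese, instead of reproducing English irregular forms. A throwaway sketch (mapping and function invented purely for illustration):

```python
SUFFIXES = {"past": "ed", "progressive": "ing"}

def mixed_conjugate(kanji_root, form):
    # Append the same English suffix to every root, mirroring how kana
    # endings attach regularly regardless of the English irregular form
    # ("buy" -> "bought" would still yield 買ed, not 買ought).
    return kanji_root + SUFFIXES[form]

print(mixed_conjugate("買", "past"))         # 買ed
print(mixed_conjugate("歩", "progressive"))  # 歩ing
```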
Anonymous No.105795611 [Report]
>>105795573
Anonymous No.105795618 [Report] >>105795664
https://files.catbox.moe/q1kva2.png
Anonymous No.105795664 [Report] >>105795675
>>105795618
Not falling for it this time
Anonymous No.105795675 [Report] >>105795679
>>105795664
Anime website.
Anonymous No.105795679 [Report] >>105795684 >>105795702
>>105795675
Your selfie isn't anime
Anonymous No.105795684 [Report] >>105795688
>>105795679
Anime girl is selfie? Check on your eyes.
Anonymous No.105795688 [Report] >>105795702
>>105795684
No thanks
I don't want to watch your disgusting self again
Anonymous No.105795702 [Report] >>105795705
>>105795679
>>105795688
Get out newfag redditor
Anonymous No.105795705 [Report] >>105795710 >>105796790
>>105795702
Keep pretending, I'm sure someone will fall for it.
Anonymous No.105795710 [Report]
>>105795705
Fall for what? Anime girls? You lost your mind anon
Anonymous No.105795732 [Report] >>105796043
>>105795581
would there be an issue having multiple readings like 買ght and 買ing?
Anonymous No.105795840 [Report]
>>105795473
Not really interested in helping Sam add more guardrails.
Anonymous No.105795990 [Report]
Maybe I'm doing something wrong, but the end results look... whatever the opposite of promising is. Do I *have* to write stuff in "PLists"?
Anonymous No.105796043 [Report]
>>105795732
>multiple readings
I don't see why not. Both Chinese and Japanese do that already, and this example "in the wild" >>105795571 does the same thing with 私.
Anonymous No.105796150 [Report]
kek
is there a list of models that support tool use? I guess I better stop this since the robot gods will punish me for forcing this LLM to try and guess a fake PIN.
Anonymous No.105796238 [Report]
>>105794017
The main point here is that before replying it's useful for the model to separately analyze the conversation, make a general assessment and then continue based on that. You can make it use whatever strategy works for you/the persona you configured it to be; it doesn't have to be exactly the same as the example.

LLMs are lazy and will otherwise opt for the least surprising outcome based on the conversation history. There needs to be a reminder at low depth to make them break out of that behavior (or, depending on the circumstances, they'll act as if they have ADHD), and the low-depth instructions + thinking work for that, if carefully crafted.
Anonymous No.105796673 [Report] >>105796716
Anonymous No.105796677 [Report]
Anonymous No.105796686 [Report] >>105796783 >>105796792 >>105799006
Anonymous No.105796716 [Report]
>>105796673
This makes me uneasy.
Anonymous No.105796772 [Report] >>105798035
Summoning the anon who recommended this
https://huggingface.co/HuggingFaceTB/SmolLM-360M

You said you fine-tuned it successfully for your specific tasks in business env

Teach me for I'm a tard!
Anonymous No.105796783 [Report]
>>105796686
NSFW Miku detected
Anonymous No.105796790 [Report]
>>105795705
i can only speak to as far back as 2010
but back then weebs were biggest oldfags
and now an oldfag to me is a serious ye olde fagg to the likes of you. 4chan is more anime and wapanese than you could ever imagine possible. you are posting on the wannabe otaku website. 4chan was part of the broader export of otaku culture to the west but they were more obsessed anime and 2chan culture fans than any con-goer or myanimelist fag or whatever. real otaku live in Japan ofc but it is so absurd to watch you silly faggots prance about calling others troons and anime for troons and blah blah and not realizing you are posting on the very place that glued together anime and "internet culture" and created the reason that the people you're so obsessed with are using anime pfps and not didney worl pictures
it's funny and ironic and sad all at once and I sincerely hope you find a way to end the misery in your life. love anon at 3:33 am (spooky)
Anonymous No.105796792 [Report]
>>105796686
I like the bottom shelf one with upturned twintails
she looks like she received surprising news and is trying to be polite
Anonymous No.105796967 [Report] >>105798018
>>105795478
I had it spit out 'biotechnol' nonstop, then another quant spat out some Thai letter. The ubergarm quant simply gave me a 'wrong magic number' error. One of the Unsloth quants works, but at the same speed as lcpp, while the latest one gave me some tensor size mismatch error. I've given up on ik_llama and just use mainline lcpp, which simply works out of the box. As for VRAM, just juggle some tensors in your free time, but don't expect incredible speedups.
Anonymous No.105796987 [Report]
>>105795478
>DeepSeek-R1-0528-IQ3_K_R4
Retarded quant which needs retarded fork to run
But fails to do it each time
Anonymous No.105797021 [Report]
>>105795504
What version of text generation webui? The recents ones removed support for vision models.
Anonymous No.105797383 [Report] >>105797446
Is there a SillyTavern tutorial that talks specifically about how to set up a narrator? I don't want just a gay chat between retarded anime girls
Anonymous No.105797446 [Report]
>>105797383
You can use mikupad or similar for raw text generation without instruction mode. Just put ao3-like tags and summary to generate a fanfic
Anonymous No.105797510 [Report] >>105797544
Can I run local models on my RX 6800 with linux, or do I have to use windows?
Anonymous No.105797544 [Report] >>105797584
>>105797510
I sincerely doubt you're capable of using either
Anonymous No.105797584 [Report] >>105797625
>>105797544
Answer the question nigger. The ROCM docs say it's supported on windows but not on linux, but I don't know how up to date that shit is.
Anonymous No.105797625 [Report]
>>105797584
ROCM is such a shitshow you'd probably be better off just using vulkan
Anonymous No.105798010 [Report] >>105798298
>the scent of video games he's been playing
that's a new one. Haven't heard that one before
Anonymous No.105798018 [Report]
>>105795478
>>105796967
I got the biotechnol spam when I tried putting a single ffn_down with nothing else on the same GPU the attn layers were on, up and gate were fine. I don't know exactly what causes it but removing -fmoe stops it from happening.
Anonymous No.105798035 [Report] >>105798345
>>105796772
nta, but just ask chatgpt or claude, it's not that hard if you're comfortable running python scripts. Honestly the training script is usually pretty much just boilerplate: you just need to set your learning rate and point it at your dataset. The hardest part of the whole process is the dataset. I think the general trend these days is using the bigger models (api) to generate synthetic data tailored to your needs.
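To make the dataset part concrete: a minimal sketch of flattening synthetic (instruction, response) pairs from a bigger model into the single-text JSONL that most trainer boilerplate consumes. The prompt template and the "text" field name are assumptions, match whatever format your trainer actually expects:

```python
import json

def to_training_rows(pairs):
    # Flatten (instruction, response) pairs into one text string per example.
    # The "### Instruction / ### Response" template is just one common choice.
    return [{"text": f"### Instruction:\n{ins}\n### Response:\n{res}"}
            for ins, res in pairs]

def write_jsonl(rows, path):
    # One JSON object per line, keeping non-ASCII characters readable.
    with open(path, "w", encoding="utf-8") as f:
        for row in rows:
            f.write(json.dumps(row, ensure_ascii=False) + "\n")

rows = to_training_rows([("Classify this support ticket.", "billing")])
write_jsonl(rows, "train.jsonl")
```

From there the training script really is boilerplate: point it at the JSONL, pick a learning rate, and let it run.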
Anonymous No.105798101 [Report] >>105798170
Where can I download the illustrious inpainting model?
Anonymous No.105798170 [Report] >>105798721
>>105798101
The model itself is on civitai if that's what you mean
Anonymous No.105798223 [Report]
>>105795478
idk why I think that's so funny.
But I do.
gl with your broken engine. I'm sure you'll figure it out.
Anonymous No.105798298 [Report] >>105798306
>>105798010
Is it Deepseek (or Mistral Small 3.2)? R1 loves forcing smells into its narration at any cost.
Anonymous No.105798306 [Report]
>>105798298
paintedfantasy - a fine tune of mistral small 3.2
Anonymous No.105798342 [Report]
How many weeks until steveseek goof?
Anonymous No.105798345 [Report] >>105799309 >>105799964
>>105798035

Thank you, kind anon

AGI is BS. The future belongs to sharp AI tools fine-tuned to (a) specific task(s)
Anonymous No.105798552 [Report]
>>105789629
>Mistral Small's coherence limits in extended adventure game roleplay
This was the 22B Mistral Small from last year, not the current Mistral Small 3.X series. It was also back when llama.cpp had even more unfixed flash attention bugs than today, which manifested as errors that increased with context size, so that could also have been a factor. The post doesn't say if flash attention was enabled, but it likely was. So rather than a limit, that result should be taken as a minimum: doing worse than that with a more recent 22B+ LLM indicates the model is poop or there's something wrong in your setup.
Anonymous No.105798581 [Report] >>105798612
https://huggingface.co/openai/gpt-4.2-turbo
Anonymous No.105798612 [Report] >>105798706
>>105798581
Anonymous No.105798705 [Report]
>>105794140
What probably happened here is that the PR title was copypasted, "llama" in this context just means the llama.cpp core library.
Anonymous No.105798706 [Report]
>>105798612
why are you always using the ugliest cats as your pic answer to this bait
Anonymous No.105798721 [Report]
>>105798170
But there is no inpainting model for Illustrious, just the base model and fine-tunes.
Anonymous No.105798737 [Report] >>105798773
Best long context model to summarize long stories right now?
Was that 1m context chinese model better than r1 for that purpose?
Anonymous No.105798773 [Report] >>105798806 >>105798822
>>105798737
local is hot garbage compared to gemini for that
deepseek is hot garbage too
just using half of its maximum context you get summaries that feel like they were written by a dense autist who couldn't help but mention all the minor happenings that weren't actually important
use Gemini and forget about local, Gemini can actually write a good summary after ingesting 500K tokens
Anonymous No.105798778 [Report]
>>105795571
>昨日, 私 歩ed 通gh the 森, 楽ing the 静 環境. The 大 木s 投st 長ng 影s on the 地面, 創造ing a 美ful 模様 of 光 and 影

Yesterday, I walked through the woods enjoying the calm environment. Big trees cast long shadows to the ground creating a beautiful pattern of light and shadow

IMHO, we all just follow the same patterns
Anonymous No.105798806 [Report] >>105798816 >>105798821 >>105798822 >>105800794
>>105798773
I wasn't asking specifically about local, but isn't gemini gigacucked?
Anonymous No.105798816 [Report]
>>105798806
google has a strategy of making the model uncensored and putting a man-in-the-middle model that filters no no things
if you can fool that classifier then it's about as good as it gets
Anonymous No.105798821 [Report]
>>105798806
Not that anon, but kind of.
It'll avoid saying dick or pussy by default, but you can make it if you probe it just right.
The safety filters will block requests mentioning anything sexual alongside any explicit, and some implicit, young age.
Anonymous No.105798822 [Report] >>105798847
>>105798773
>>105798806
Also, which version of gemini are we talking?
Anonymous No.105798847 [Report] >>105798855 >>105799065
>>105798822
2.5 pro, I wouldn't use anything other than their current SOTA for something like large context summarization.
I've tested with the Flash model too, and it's significantly dumber. Though, not as dumb as deepseek was.
Anonymous No.105798855 [Report] >>105798876 >>105799222
>>105798847
>and it's significantly dumber
It really is, but it also seems to actually do better than 2.5 Pro on really long contexts. Stuff like 300k tokens+.
I'd try both and see which works better.
Anonymous No.105798876 [Report] >>105798895 >>105799027
>>105798855
>Stuff like 300k tokens+

What on Earth do you stuff it with?!

It's 500 pages of text!
Anonymous No.105798895 [Report]
>>105798876
Whole book series.
It didn't work so good.
Anonymous No.105799006 [Report]
>>105796686
i like the evil one next to the upturned pigtails, which has to be contained otherwise she might bring about the end of the world.
Anonymous No.105799027 [Report]
>>105798876
You could put in the entire lore of certain franchises and have the model answer any question about it in a way that RAG or finetuning will never be able to accomplish, although not even 300k tokens would be enough in certain cases.
Anonymous No.105799065 [Report] >>105799222
>>105798847
Nah, even 2.5 pro seems retarded, it gets tons of stuff mixed up, especially because there's different chapters. Could also be that the rag is not working correctly, but I don't think so.
And it's under 200k tokens.
Anonymous No.105799069 [Report]
Holy shit this thing better be worth all the work.
Anonymous No.105799079 [Report] >>105799095 >>105799126 >>105799152 >>105799211
I work at Meta, not as an AI developer, but the general consensus on LLMs right now is that there aren't many areas left for significant improvement.
And llama4's failure comes from fine-tuning data being shiet.
Anonymous No.105799095 [Report] >>105799266
>>105799079
Where are you from?
Anonymous No.105799126 [Report] >>105799266
>>105799079
>And llama4's failure comes from fine-tuning data being shiet.
You must be all geniuses over there. Thanks for the insight.
Anonymous No.105799152 [Report]
>>105799079
Try pretraining data as well
Anonymous No.105799210 [Report]
>follow AI influencers on twitter
>suddenly everyone talks about some scamming jeet working 10 us jobs in India
>timeline begins to fill with more and more jeets
>now it's all jeets talking about jeet things in jeetland in english for some fucking reason
Really encapsulates the state of US tech sector
Anonymous No.105799211 [Report]
>>105799079
Significant improvements are possible by making every document in the pretraining phase matter instead of just throwing stuff at it semi-randomly. Those 10~30B+ token instruct "finetunes" wouldn't be necessary if the base models could work decently on their own.
Anonymous No.105799222 [Report]
>>105798855
>Stuff like 300k tokens+.
/hard disagree/.
One of my tests was feeding a whole ass japanese light novel in its original language and having it summarize the key points through this prompt :
>Write a detailed summary explaining the events in a chronological manner, focusing on the moral lessons that can be understood from the book. Try to understand the moral quandaries from the point of view of the people from that civilization. The summary must be written in English.
This is what I got from 2.5 Pro :
https://rentry.co/uunaas4f
It's mostly accurate. Some terms aren't well transliterated, which is to be expected, but the chronology of events and the underlying message are well preserved. Mind you, it's a novel I've read multiple times; that's why I'm using it in a summarization test (you can't judge the quality of a summary of something you couldn't summarize yourself).
Flash produced garbage and I didn't bother saving its answer, but I could run it again if you're curious to compare with that prompt + data.
Pic related is the amount of tokens seen in aistudio for this prompt+answer.
>>105799065
I don't use rag software, so I dunno about that. Is the technology perfect? No, but frankly, the fact that it manages to not forget the original prompt and write in English after seeing hundreds of thousands of Japanese tokens has me fucking beyond impressed.
Anonymous No.105799225 [Report]
Dayum. Elon upgraded grok3 today. Trannies mad that it answers "2 genders" now.
Can't believe it answered with loli. I didn't give any extra instructions.

Why is closed moving in the exact opposite direction to local?
I wrote it before but I can write erotic stories on gpt 4.1 now. Female targeted slop, but still.
Also we're never gonna see grok2, are we. "Stable Grok3 first", very funny.
Anonymous No.105799266 [Report] >>105799283 >>105799683 >>105799857
>>105799095
Poland. I started as a Software Engineer a year ago.
>>105799126
I think, unlike Claude and OpenAI, we didn't have nearly enough quality annotated data, as Meta only started seriously gathering it after LLaMA 3. I'm not sure why they waited so long, but yeah. That's the reason why Zuckerberg bought Scale AI.
Anonymous No.105799283 [Report] >>105800829
>>105799266
>That's the reason why Zuckerberg bought Scale AI.
How bad is the scaleai data?
Anonymous No.105799309 [Report]
>>105798345
Holy based. That's exactly what I believe
Anonymous No.105799323 [Report] >>105799340
>Use google gemini
>Add a spicy word
>Push send
>No no you said the word peepee
>Money stolen
Anonymous No.105799340 [Report] >>105799347
>>105799323
>Use deepseek
>Add spicy word
>Push send
>She bites her lip tasting copper 10 times
>Money still with me because local
Anonymous No.105799347 [Report] >>105799370
>>105799340
>she bites her lip
>stop gen
>edit
>continue gen
Anonymous No.105799363 [Report]
do I really gotta download the paddle xi rootkit to run ernie VL?
Anonymous No.105799370 [Report] >>105799376
>>105799347
>:a
>she bites her lip
>stop gen
>edit
>continue gen
>goto a
Anonymous No.105799376 [Report] >>105799458
>>105799370
R1 0528 doesn't have this problem.
Anonymous No.105799393 [Report] >>105799407 >>105799414 >>105800563
>one of the most fascinating piece of tech in recent history
>everyone just wants it to write text porn
>I don't even remember ever hearing about men reading erotic literature before this, but there weren't that many troons in the early internet
>now there is an epidemic of men who get off text porn?
Anonymous No.105799407 [Report]
>>105799393
>I don't even remember ever hearing about men reading erotic literature before this
Yeah, surely this is a new thing, it's not like western porn visual novels are best sellers on Steam or anything, surely.
Anonymous No.105799414 [Report] >>105799477 >>105799593
>>105799393
>I don't even remember ever hearing about men reading erotic literature before this
literally every vn to exist
Anonymous No.105799419 [Report] >>105799427 >>105799445 >>105799500 >>105799509
I've upgraded my total VRAM to 48 GBs.
What models could I reasonably run at Q4 quants?
Anonymous No.105799427 [Report]
>>105799419
Mistral 8x7B
Anonymous No.105799445 [Report]
>>105799419
Anything over 24GB is useless until you reach 200GB, because then you can run unsloth deepseek quants entirely in VRAM. Every other model worth running fits in a 3090.
Anonymous No.105799458 [Report]
>>105799376
UD IQ1_M quant of it is larger than IQ2_XXS of the previous one
Anonymous No.105799477 [Report] >>105799504 >>105799507 >>105799526
>>105799414
They have pictures
Anonymous No.105799479 [Report] >>105799501
>>105791168
>run on a single 3090
It will be a giant model with good/great benchmark scores. This way, no one will run it locally, the paid API will be expensive, and he can still say "we released an open source model with very strong capabilities" without undermining his business.
People who think they will release a great, tiny and usable model are totally delusional.
Anonymous No.105799500 [Report]
>>105799419
if you have a lot of normal system ram in addition to the gpu, you could get away with -ot exps=CPU for deepseek (lower quant like q2) or qwen3 with decent speed
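For reference, a minimal sketch of what that looks like as a llama.cpp command line (the model filename and context size are placeholders, not recommendations):

```shell
# Hypothetical llama-server invocation: -ngl 99 sends every layer to the GPU
# first, then -ot exps=CPU overrides the MoE expert tensors back onto system
# RAM, so VRAM only has to hold the attention/shared weights and KV cache.
CMD='llama-server -m DeepSeek-V3-0324-Q2_K_XL.gguf -ngl 99 -ot exps=CPU -c 8192'
echo "$CMD"
```

The point of the `-ngl 99` / `-ot exps=CPU` combination is that in a MoE model the expert tensors are most of the weights but only a fraction of them are touched per token, so parking them in system RAM costs much less speed than offloading whole layers.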
Anonymous No.105799501 [Report]
>>105799479
That would be the best outcome. A small model, slightly better than Nemo but censored to hell is likely what we'll get instead
Anonymous No.105799504 [Report] >>105799525
>>105799477
Fucking retard. It's not the pictures that make people enjoy them. It's immersion. Text, music, visuals, sometimes voice. Gotta use your mind to fill in the blanks. Same shit.
Anon over here lecturing the nerds about VNs while not even reading them or being into them. kek Crrrazy
Anonymous No.105799507 [Report] >>105799525
>>105799477
Like 1 picture for every 5 pages' worth of text, sure
Anonymous No.105799509 [Report]
>>105799419
For smut i'd try this with partial offloading https://huggingface.co/bartowski/Behemoth-123B-v1-GGUF
Anonymous No.105799525 [Report] >>105799547
>>105799504
>>105799507
Only women can do it without pictures at all
Anonymous No.105799526 [Report]
>>105799477
Clinically retarded.
Anonymous No.105799542 [Report] >>105799554 >>105799556 >>105799604
It would be funny to see how the same faggots would praise the first good LLM with native image generation
Anonymous No.105799547 [Report]
>>105799525
Well then paint my nails and call me sally. Or just let the model use stable diffusion every few minutes to draw the scene and you've got a VN
Anonymous No.105799554 [Report]
>>105799542
There will never be a good local LLM with image generation.
Anonymous No.105799556 [Report]
>>105799542
I use both
Anonymous No.105799593 [Report]
>>105799414
>literally every vn to exist
why do you think there is a skip button
men don't have time for this shit
Anonymous No.105799602 [Report] >>105799618 >>105799820
most popular hentai game (vn, rpg or whatever) request on f95zone: gallery save plz
>men really want to read the text I say
Anonymous No.105799604 [Report] >>105799655
>>105799542
>can only generate heckin wholesome dogs in hats and astronauts riding horses in space
Anonymous No.105799618 [Report]
>>105799602
Cuck95 are thirdworlders like jeets and so on.
Check Steam playtime numbers on the reviews, you dum dum. Just admit you are wrong, Ferret Software wannabe.
Anonymous No.105799655 [Report] >>105799674
>>105799604
>train lora
works on my machine
Anonymous No.105799674 [Report]
>>105799655
Tell it to flux users
Anonymous No.105799683 [Report]
>>105799266
>I think, unlike Claude and OpenAI, we didn't have nearly enough quality annotated data, as Meta only started seriously gathering it after LLaMA 3. I'm not sure why they waited so long, but yeah. That's the reason why Zuckerberg bought Scale AI.
As well as data gathering, are you all also immune to sarcasm?
Anonymous No.105799695 [Report] >>105799702
All your local models will be irrelevant in two weeks
Anonymous No.105799702 [Report] >>105799746
>>105799695
Steve is releasing in two weeks.
Anonymous No.105799746 [Report]
>>105799702
Tell Steve it's not healthy to hold it in for so long.
Anonymous No.105799820 [Report]
>>105799602
https://incontinentcell.itch.io/factorial-omega
Anonymous No.105799857 [Report]
>>105799266
dubs cheque
What are the odds Meta is going to reduce filtering for the pretraining and reduce censorship in the finetuning, post-training, etc.?
Anonymous No.105799875 [Report] >>105799972
I'm coping with Qwen3-235B-A22B-UD-Q3_K_XL. Is that the best for 2x3090s + 64GB RAM?
Anonymous No.105799948 [Report] >>105799963
I'm currently using a Q4 quant of Gemma3 for RP. If I swap to a Q8 quant of the same model, how much of an improvement can I expect to see?
Anonymous No.105799963 [Report] >>105800060
>>105799948
How much does it cost you to test it yourself?
Anonymous No.105799964 [Report]
>>105798345
The future belongs to AI companies that sell agents that can't perform any specific task and eat tokens. Companies have to buy them because they have to tell investors AI has made them 500x more efficient. They're going to be rolling in cash with the agent meme.
Anonymous No.105799972 [Report]
>>105799875
yeah or one of the largestrals. Get 128GB to run DS V3 0324 IQ1_S_R4 for higher quality cope
Anonymous No.105800060 [Report]
>>105799963
More than I'd like. Maybe if I bought a lot of RAM, I could try. After all, I'm testing for quality, not speed. Still, I'd like to actually use it, so I'd definitely need a GPU upgrade.
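A cheaper way to compare quants without a GPU upgrade is to run the same text through each quant with llama.cpp's llama-perplexity tool (quality, not speed, so CPU-only is fine) and compare scores. The metric itself is just exp of the mean negative log-likelihood; a toy sketch of the math (the per-token logprobs here are made up for illustration):

```python
import math

def perplexity(token_logprobs):
    # Perplexity = exp(mean negative log-likelihood) over the tokens.
    # Lower is better; compare the same text across the Q4 and Q8 runs.
    nll = -sum(token_logprobs) / len(token_logprobs)
    return math.exp(nll)

# Made-up per-token logprobs for the same text under two quants; the Q8
# run being slightly more confident yields a lower perplexity.
q4_ppl = perplexity([-2.1, -0.7, -1.5, -0.9])
q8_ppl = perplexity([-2.0, -0.6, -1.4, -0.9])
print(q8_ppl < q4_ppl)  # True
```

If the perplexity gap between Q4 and Q8 on your own text is tiny, the RP difference will probably be tiny too.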
Anonymous No.105800387 [Report]
>>105795473
Cypher is the name of one of ScaleAI's datasets.
>t. I worked on it
Anonymous No.105800526 [Report]
>>105800515
>>105800515
>>105800515
Anonymous No.105800563 [Report]
>>105799393
>but there wasn't that many troons in the early internet

In presence or in numbers? Numbers, I don't know, but if you mean presence, you're outing yourself. Troons existed, but it wasn't the mentally ill shit it is now. It was "imagine a woman who is not a faggot, and lmao, look, that dude is larping as one XDDDD, we'll have that soon enough, give it some time". It was the equivalent of a cosplay event; it was fun and everyone played along because everyone knew what was meant and agreed. Now it's just "castrate yourself, goy. Oh, you won't? You are an incel mgtow woman-hating bla bla bla, go mutilate yourself, goy". Before, it was a cargo cult for a better future; now it's a demoralisation campaign towards satanism and demon (w*men) worship, just like how blackpill used to refer to how negative the world in general is but got co-opted by demon-seeking faggots into their cuck bullshit.

Also, in regard to troonism, there didn't used to be surgery and any of that; it was just feminine-looking dudes crossdressing.
Anonymous No.105800794 [Report]
>>105798806
Use it on AI Studio; the Gemini models will refuse otherwise. However, they greatly reduce the capabilities of their models there, so don't expect a great summary.
Anonymous No.105800829 [Report]
>>105799283
I'm a data reviewer for them. The data can be quite good or quite bad. I worked on a "safety project": About 95% of the data needed heavy work to make it good. The more instructions you give to "attempters", the less likely you'll get a good dataset.
Anonymous No.105801575 [Report]
>>105795473
It's an Amazon model and it's utter coal.