/lmg/ - Local Models General - /g/ (#105757131) [Archived: 621 hours ago]

Anonymous
6/30/2025, 8:39:37 PM No.105757131
1743528282174258
md5: 61981c08bca9f11a5f501eb6fc074ed0
/lmg/ - a general dedicated to the discussion and development of local language models.

Previous threads: >>105750356 & >>105743953

►News
>(06/29) ERNIE 4.5 released: https://ernie.baidu.com/blog/posts/ernie4.5
>(06/27) VSCode Copilot Chat is now open source: https://github.com/microsoft/vscode-copilot-chat
>(06/27) Hunyuan-A13B released: https://hf.co/tencent/Hunyuan-A13B-Instruct
>(06/26) Gemma 3n released: https://developers.googleblog.com/en/introducing-gemma-3n-developer-guide
>(06/21) LongWriter-Zero, RL trained ultra-long text generation: https://hf.co/THU-KEG/LongWriter-Zero-32B

►News Archive: https://rentry.org/lmg-news-archive
►Glossary: https://rentry.org/lmg-glossary
►Links: https://rentry.org/LocalModelsLinks
►Official /lmg/ card: https://files.catbox.moe/cbclyf.png

►Getting Started
https://rentry.org/lmg-lazy-getting-started-guide
https://rentry.org/lmg-build-guides
https://rentry.org/IsolatedLinuxWebService
https://rentry.org/tldrhowtoquant
https://rentry.org/samplers

►Further Learning
https://rentry.org/machine-learning-roadmap
https://rentry.org/llm-training
https://rentry.org/LocalModelsPapers

►Benchmarks
LiveBench: https://livebench.ai
Programming: https://livecodebench.github.io/leaderboard.html
Code Editing: https://aider.chat/docs/leaderboards
Context Length: https://github.com/adobe-research/NoLiMa
Censorbench: https://codeberg.org/jts2323/censorbench
GPUs: https://github.com/XiongjieDai/GPU-Benchmarks-on-LLM-Inference

►Tools
Alpha Calculator: https://desmos.com/calculator/ffngla98yc
GGUF VRAM Calculator: https://hf.co/spaces/NyxKrage/LLM-Model-VRAM-Calculator
Sampler Visualizer: https://artefact2.github.io/llm-sampling

►Text Gen. UI, Inference Engines
https://github.com/lmg-anon/mikupad
https://github.com/oobabooga/text-generation-webui
https://github.com/LostRuins/koboldcpp
https://github.com/ggerganov/llama.cpp
https://github.com/theroyallab/tabbyAPI
https://github.com/vllm-project/vllm
Replies: >>105757481 >>105757914 >>105760729 >>105761464
Anonymous
6/30/2025, 8:40:12 PM No.105757140
threadrecap2
md5: 955a3ca9669b61f763be0ed34edff5d0
►Recent Highlights from the Previous Thread: >>105750356

--Quantitative benchmark analysis reveals Phi-4 and Gemma models outperform Mistral/LLaMA in chess960 despite similar sizes:
>105753011 >105753110 >105753131 >105753173 >105753360 >105754841
--Security risks and misconfigurations in publicly exposed llama.cpp servers:
>105754262 >105754359 >105754420 >105754450 >105754498 >105754432 >105754433 >105754428 >105754454 >105754541 >105754496 >105755807 >105755548 >105755566 >105755654 >105755716 >105755744
--Massive data hoarding without resources to train at scale sparks collaboration and funding discussion:
>105753220 >105753303 >105753388 >105753406 >105753442 >105753452 >105753468 >105753449 >105753509 >105753640 >105753676 >105753730 >105753445 >105753590
--Struggling with tool-calling configuration for DeepSeek R1 0528 Qwen3 in KoboldCPP due to special token handling:
>105753378 >105753393 >105753479 >105753547
--ERNIE 4.5's multimodal architecture with separated vision and text experts:
>105750446 >105750729 >105751241
--Hunyuan model struggles with accurate interpretation of niche Japanese slang terms:
>105755059 >105755075 >105755227 >105755122
--Impressive performance of Hunyuan MoE model on extended technical prompts:
>105755797 >105755827 >105755850
--Benchmark comparison of Qwen3, DeepSeek-V3, GPT-4.1, and ERNIE-4.5 across knowledge, reasoning, math, and coding:
>105750679
--Challenges and limitations of government attempts to restrict local AI via hardware regulation:
>105753636 >105753645 >105753715 >105753756 >105754725 >105753679 >105753749
--Informal evaluation of Hunyuan-A13B GGUF model outputs:
>105755912 >105755977 >105756000 >105756053 >105756071 >105756155 >105756267 >105756300 >105756358
--Hunyuan A13B demo with IQ4_XS quant:
>105755966
--Rin & Miku (free space):
>105752803 >105754470 >105754791 >105754841

►Recent Highlight Posts from the Previous Thread: >>105750359

Why?: 9 reply limit >>102478518
Fix: https://rentry.org/lmg-recap-script
Anonymous
6/30/2025, 8:43:26 PM No.105757185
claude 4 opus
md5: d72ea08a59f6f916d4542d7ae576c61c
Does any model pass the GARFIELD BENCH?
Anonymous
6/30/2025, 8:44:11 PM No.105757193
pine needle stuck in my dick
Anonymous
6/30/2025, 8:46:52 PM No.105757231
Mikulove
Replies: >>105758519 >>105765979
Anonymous
6/30/2025, 9:00:59 PM No.105757402
Local Migu General
md5: 926574da4083cb021f827753c6fb5d6e
Replies: >>105758519 >>105765979
Anonymous
6/30/2025, 9:07:55 PM No.105757481
thumb-1920-318314
md5: 1b8e52c8418f4dd6b1a7ba6efd64d504
>>105757131 (OP)
>it is 2025
>there's no LLM that isn't woke as fuck
anons what model should I run on my 144 GB of VRAM? Even qwen 235B keeps capitalizing "black"
Anonymous
6/30/2025, 9:09:46 PM No.105757509
1751240482239
md5: d38d8df9b3779d4139a3256a7ca2af5c
>>105748834
mikutranny is posting porn in /ldg/:
>>105715769
It was up for hours while anyone keking on troons or niggers gets deleted in seconds, talk about double standards and selective moderation:
https://desuarchive.org/g/thread/104414999/#q104418525
https://desuarchive.org/g/thread/104414999/#q104418574
Here he makes >>105714003 ryona picture of generic anime girl, probably because it's not his favourite vocaloid doll and he can't stand that as it makes him boil like a druggie without a fentanyl dose, essentially a war for rights to waifuspam or avatarfag in thread.

Funny /r9k/ thread: https://desuarchive.org/r9k/thread/81611346/
The Makise Kurisu damage control screencap (day earlier) is fake btw, no matches to be found, see https://desuarchive.org/g/thread/105698912/#q105704210 janny deleted post quickly.

TLDR: Mikufag janny deletes everyone dunking on trannies and resident avatarfag spammers, making it his little personal safespace. Needless to say he would screech "Go back to teh POL!" anytime someone posts something mildly political about language models or experiments around that topic.

And lastly as said in previous thread(s) >>105716637, i would like to close this by bringing up key evidence everyone ignores. I remind you that cudadev of llama.cpp (JohannesGaessler on github) has endorsed mikuposting. That's it.
He also endorsed hitting that feminine jart bussy a bit later on. QRD on Jart - The code stealing tranny: https://rentry.org/jarted

xis xitter
https://x.com/brittle_404
https://x.com/404_brittle
https://www.pixiv.net/en/users/97264270
https://civitai.com/user/inpaint/models
Replies: >>105758151
Anonymous
6/30/2025, 9:10:49 PM No.105757521
troonjannies_never forget
md5: 31c5e23f8adc634a24000ecb21ce069e
Replies: >>105757736
Anonymous
6/30/2025, 9:21:08 PM No.105757647
1696524146454075
md5: fa9b043c9a6599272e51dceb3d6621a6
https://files.catbox.moe/g0kvhi.jpg
Anonymous
6/30/2025, 9:29:06 PM No.105757736
>>105757521
I haven't forgotten https://rentry.org/Jarted
Anonymous
6/30/2025, 9:41:33 PM No.105757914
Model_Posing_On_Typical_Studio_Set
md5: bc41fe9354b5d0d0ecfad1b9c48b9f39
>>105757131 (OP)
I don't get it. I don't see any models in this general, much less local (as in, in my bed).
what is this thread about?
Replies: >>105757952
Anonymous
6/30/2025, 9:44:13 PM No.105757952
>>105757914
It should have been /lllmg/ or /3mg/
Anonymous
6/30/2025, 9:50:13 PM No.105758032
lmaoeven
md5: 547f6e0436ce7d7e03d0ec3c15109a6b
>>105757151
>reddit is over there if you only want to be "practical"
lol reddit is full of retards and astroturfers
not saying there's no retard here or in other internet shitholes but the concentration on reddit is radioactive
are there really no other places to talk about llm than the coomer general and the internet glow?
Replies: >>105758066 >>105758165 >>105758203 >>105760124
Anonymous
6/30/2025, 9:52:43 PM No.105758066
>>105758032
the only communities for this shit are on groomcord. this '''hobby''' is primarily for two groups: pedophiles and teenage girls.
Replies: >>105758118 >>105758137 >>105758141 >>105758328
Anonymous
6/30/2025, 9:57:12 PM No.105758118
>>105758066
>teenage girls.
I don't want to believe that werewolf sex is real...
Anonymous
6/30/2025, 9:57:54 PM No.105758129
1734158104345747
md5: 9cb5cb0f39d957d2ea222aeb14269378
Ernie provides translations when it hallucinates
Replies: >>105758235 >>105758326 >>105758347
Anonymous
6/30/2025, 9:58:47 PM No.105758137
>>105758066
We should help both groups so they come together.
Replies: >>105758429
Anonymous
6/30/2025, 9:58:56 PM No.105758141
>>105758066
Anon forgot (or just wasn't there at the time) that /lmg/ originated from /aicg/ during the Pygmalion-6B period, when local LLM discussions there were starting to get displaced by GPT/Claude proxy discussions.
Anonymous
6/30/2025, 9:59:34 PM No.105758151
not-local
md5: e7a23704107643ac09c89fbde04c7de7
>>105757509
thanks for the (You)
Anonymous
6/30/2025, 10:01:07 PM No.105758165
>>105758032
2ch hk/ai/res/1257129 html
Anonymous
6/30/2025, 10:03:57 PM No.105758203
>>105758032
erm what's the best model with >70B params?
Anonymous
6/30/2025, 10:06:34 PM No.105758235
>>105758129
it is over
Replies: >>105758347
Anonymous
6/30/2025, 10:12:45 PM No.105758293
download (1)
md5: 82150750bf10f4cfc080b3bf797ca35c
Here's Zuck's new crew.
https://archive.is/ZEWzA
Replies: >>105758331 >>105758349 >>105758388 >>105758398 >>105758424 >>105758810 >>105758818 >>105763370
Anonymous
6/30/2025, 10:15:58 PM No.105758326
>>105758129
What the fuck is google doing? I'm not surprised OAI is bleeding talents but Google losing Gemini leads right as Gemini became a true top dog model?
Replies: >>105758347
Anonymous
6/30/2025, 10:16:15 PM No.105758328
>>105758066
Truly they are made for each other
Replies: >>105758429
Anonymous
6/30/2025, 10:16:21 PM No.105758331
>>105758293
wtf zuck gods are we back?
Replies: >>105758349 >>105758350 >>105758371
Anonymous
6/30/2025, 10:17:42 PM No.105758347
ernieissma-ACK!
md5: 63f949b1007ed9e6a7b13f267fd369b7
>>105758129
>>105758235
>>105758326
i remember when they were trying to convince me that ernie 4.5 would be less than 400B
i guess deepsneed is king forever
Anonymous
6/30/2025, 10:17:49 PM No.105758349
>>105758293
Aw sweet, they're gonna catch up so hard, and then still not achieve AGI but just hit the same wall as everyone else.
>>105758331
lol
Anonymous
6/30/2025, 10:17:53 PM No.105758350
>>105758331
We were never gone. The Behemoth always was going to eat these pathetic little models for lunch.
Anonymous
6/30/2025, 10:19:38 PM No.105758371
>>105758331
They won't be working on open-weight models, I think.
Replies: >>105758397 >>105758818
Anonymous
6/30/2025, 10:20:54 PM No.105758388
>>105758293
Also:
>Zuckerberg: “We’re going to call our overall organization Meta Superintelligence Labs (MSL). This includes all of our foundations, product, and FAIR teams, as well as a new lab focused on developing the next generation of our models.” [...]
>
>Alexandr Wang will be the company’s “Chief AI Officer” and leader of MSL, as well as former GitHub CEO Nat Friedman. Friedman will colead the new lab with Wang, with a focus on AI products and applied research.
Replies: >>105758398 >>105758543 >>105758818
Anonymous
6/30/2025, 10:21:45 PM No.105758397
>>105758371
>They won't be working on open-weight models, I think.
if a model is truly good, it will never be open weight, that's how it always works in the west
OAI will not release anything that competes with their GPT, Google will not release anything that competes with Gemini, and Meta only released Llama because frankly it fucking sucked
even that 405b was a complete joke
Anonymous
6/30/2025, 10:21:46 PM No.105758398
>>105758293
>>105758388
Please let Llama 5 be a flop, it would be too funny.
Replies: >>105758467 >>105766325
Anonymous
6/30/2025, 10:23:45 PM No.105758424
>>105758293
7/11 are chinks
it's nyover
Replies: >>105762424
Anonymous
6/30/2025, 10:23:54 PM No.105758427
Has anyone tried the new hunyuan and ernie models?
Replies: >>105758629
Anonymous
6/30/2025, 10:23:58 PM No.105758429
>>105758137
>>105758328
the teenage girls are just fujoshis that want to pretend they are pooners. they don't want to fuck the obese cheeto encrusted neckbeard that is in his 40s and still lives with his mom
Replies: >>105758451
Anonymous
6/30/2025, 10:25:50 PM No.105758451
>>105758429
I'm not a neckbeard and my BMI is 19.
Replies: >>105758471
Anonymous
6/30/2025, 10:27:55 PM No.105758467
>>105758398
It will be. Zuck thinking he can buy talent to win success is delusional. Anthropic/Google/OAI have good models because their datasets are autistically curated with a gorillion man-hours, hiring some researchers will not fix their data being shit.
Replies: >>105758482
Anonymous
6/30/2025, 10:28:03 PM No.105758471
>>105758451
doesn't matter, you aren't the main character in a boys love VN and your chin isn't a triangle, you can't compete.
Anonymous
6/30/2025, 10:29:18 PM No.105758482
>>105758467
also they are merging with scale which is the main source of poisoning datasets
Anonymous
6/30/2025, 10:33:04 PM No.105758519
0Z39OGhfLG
md5: a9dc3cf2dfd55078c663b3d5cfda71c1
>>105757231
>>105757402
The mikutranny posting porn in /ldg/:
>>105715769
It was up for hours while anyone keking on troons or niggers gets deleted in seconds, talk about double standards and selective moderation:
https://desuarchive.org/g/thread/104414999/#q104418525
https://desuarchive.org/g/thread/104414999/#q104418574
Here he makes >>105714003 ryona picture of generic anime girl, probably because it's not his favorite vocaloid doll, he can't stand that as it makes him boil like a druggie without a fentanyl dose, essentially a war for rights to waifuspam or avatarfag in thread.

Funny /r9k/ thread: https://desuarchive.org/r9k/thread/81611346/
The Makise Kurisu damage control screencap (day earlier) is fake btw, no matches to be found, see https://desuarchive.org/g/thread/105698912/#q105704210 janny deleted post quickly.

TLDR: Mikufag / janny deletes everyone dunking on trannies and resident spammers, making it his little personal safespace. Needless to say he would screech "Go back to teh POL!" anytime someone posts something mildly political about language models or experiments around that topic.

And lastly as said in previous thread(s) >>105716637, i would like to close this by bringing up key evidence everyone ignores. I remind you that cudadev of llama.cpp (JohannesGaessler on github) has endorsed mikuposting. That's it.
He also endorsed hitting that feminine jart bussy a bit later on. QRD on Jart - The code stealing tranny: https://rentry.org/jarted

xis accs
https://x.com/brittle_404
https://x.com/404_brittle
https://www.pixiv.net/en/users/97264270
https://civitai.com/user/inpaint/models
Replies: >>105758617 >>105758626
Anonymous
6/30/2025, 10:34:39 PM No.105758543
>>105758388
oh nononono
Anonymous
6/30/2025, 10:41:40 PM No.105758617
>>105758519
Explain why I should care without getting mad.
Replies: >>105765979
Anonymous
6/30/2025, 10:42:48 PM No.105758626
>>105758519
That office lady picture is basically 100% confirmation that this is the mikutroon from this thread.
Replies: >>105765979
Anonymous
6/30/2025, 10:43:07 PM No.105758629
>>105758427
This is ernie:

> She smiled weakly, her eyes narrowing with a mixture of vulnerability and aroused desire. "Mercy, I'm sorry for leaving you there. I'm not quite sure what to do, but I'll try to find someone to take care of you."

> Her words carried a heavy weight, a silent plea for understanding and comfort. She spoke in a monotone, her voice barely audible, but it was a mixture of confession and surrender.

> I couldn't help but feel a mix of shame and excitement. Mercy had been trapped for so long, and now she was in the same space, with no one to protect her. I felt a surge of need, a desperate need to help, to ease her burden.

> Her gaze locked with mine, her body trembling slightly. "I know I can't do anything, but I'll try," she murmured, her voice barely audible.
Replies: >>105758645 >>105758674 >>105758694
Anonymous
6/30/2025, 10:44:37 PM No.105758645
>>105758629
Holy kino, throw a few Claude 3 Opus logs at this and it will be the best RP model since Cumsock-70B-Dogshart-Remixed-Ultimate!
Anonymous
6/30/2025, 10:46:59 PM No.105758674
>>105758629
more Mills & Boon purple prose
I'm 50 year old female divorcees would be very happy with this model if only they used language models
Replies: >>105758682 >>105764901
Anonymous
6/30/2025, 10:47:49 PM No.105758679
Cumsock-70B-Dogshart-Remixed-Ultimate.gguf?
Anonymous
6/30/2025, 10:48:00 PM No.105758682
>>105758674
*I'm sure
Anonymous
6/30/2025, 10:49:30 PM No.105758694
>>105758629
Haven't you anons moved onto different roleplay styles already?
Anonymous
6/30/2025, 10:50:03 PM No.105758702
1740259911268048
md5: f18c4401fd5d9a3e754c2e0e05c1a22f
I fed Doubao (ByteDance's multimodal) and Ernie 4.5 my folder of antisemitic memes and asked them to explain each, in a new chat. Ernie missed the mark on all of them and Doubao only got like 1 out of 10
Picrel is a meme that evaded both
Replies: >>105758730
Anonymous
6/30/2025, 10:52:17 PM No.105758730
>>105758702
The most objective censorship benchmark i have ever seen. Unironically.
Anonymous
6/30/2025, 10:54:45 PM No.105758762
why did they even name this model ERNIE? i wanted to know what it stands for and the acronym doesn't even match up, Enhanced Representation through Knowledge Integration. did they see google making BERT and was like oh shit we need to make a sesame street reference too?
Replies: >>105758772 >>105758776
Anonymous
6/30/2025, 10:55:37 PM No.105758772
>>105758762
>we need to make a sesame street reference too?
Lmao
Anonymous
6/30/2025, 10:56:11 PM No.105758776
>>105758762
yes, it's literally just the chinese bootleg naming scheme in effect trying to ride off of a much more significant achievement
Anonymous
6/30/2025, 10:58:46 PM No.105758810
>>105758293
shows where their priorities lie, almost everyone listed has been involved in either reasoning or multimodal
Anonymous
6/30/2025, 10:59:25 PM No.105758818
>>105758371
>>105758388
>>105758293
Good names, but the fact Zuck paid that many billions to Wang does not inspire confidence. There's a real worry that they may stop with the open source and just try to chase "superintelligence", even though we don't even have AGI proper yet. A sure recipe to end up wasting money and compute. Especially since these days anyone doing work on this is really just doing verifiable-rewards RL, and it's still questionable how far that sort of RL can be taken. At least until Zuck confirms he wants to do something good instead of just chasing the current trend, I think it's likely we'll have to rely on China for most things, as we have been for the past half a year.

Wouldn't surprise me if Llama 4's problems were due to listening to lawyers and not training on libgen anymore, overdoing the filtering of the already overfiltered dataset, synthslopping with lower quality data, and overall overfrying their LLM through a general lack of attention to detail.
FAIR should have had good researchers, yet somehow they fucked it up.
I would still expect lmg to do quite well if given even a fraction of the compute Zuck has. I'll also be surprised if, even with the best people, they manage to do too well when their hands are tied behind their backs as to what they can train on (such as no libgen).
Replies: >>105758867 >>105758901
Anonymous
6/30/2025, 11:03:16 PM No.105758867
>>105758818
>lawyers to not train on libgen anymore ,
Aren't lawyers also the reason we can't have sex?
Replies: >>105758897 >>105758942
Anonymous
6/30/2025, 11:07:17 PM No.105758897
>>105758867
No, that's credit card companies.
Anonymous
6/30/2025, 11:07:41 PM No.105758901
>>105758818
>listening to lawyers to not train on libgen anymore
But somehow Anthropic could?
Replies: >>105758926 >>105758942
Anonymous
6/30/2025, 11:10:59 PM No.105758926
>>105758901
Anthropic started off using libgen but at some point switched over to buying physical books and scanning each page. Meta never did the second part based on my understanding.
Anonymous
6/30/2025, 11:13:04 PM No.105758942
>>105758867
They chopped off the image gen part of their multimodal Llama because of lawyers too. Most of the excessive image gen filtering is also due to that, and to some activist groups originating mostly in the UK that demand that AI not be able to generate anything involving children, so many just filter for NSFW in general to avoid that.
It's a bit less bad for LLMs than for image gen, but a good deal of companies still think that casual conversation or fiction is low quality data.
>>105758901
They trained on it, until they got sued, then they pretended that they scanned a lot of books irl so that they have their own library independent of libgen. Opus 4 has less soul than 3, wouldn't surprise me if it's because they're training on multiple epochs of a more limited fiction/book dataset.
Anonymous
6/30/2025, 11:13:55 PM No.105758954
how the fuck do people think qwen is good for coding?
Replies: >>105758960 >>105760181
Anonymous
6/30/2025, 11:14:27 PM No.105758960
>>105758954
What's your setup?
Replies: >>105759055
Anonymous
6/30/2025, 11:14:31 PM No.105758961
Screen Shot at 30-06-2025, 23-12
md5: 34779ea6c84438a4351169e781926291
Is there a model that will vibe code effectively on a desktop PC? Or does an i5-13500 with an aging nvid gpu and 64gigs of ram just not pack enough of a punch?

i just want something that will write me little browser addons and excel macros

> can't you code it yourself
no cus im not a fucken NERD

i tried every model i could find but the code is always buggy and doesn't work. chatgpt (web version) did a good job and produced something that actually worked, but i dont want crypto glowie sam altman to know what im doing
Replies: >>105758985 >>105758988 >>105759020 >>105759063 >>105760241
Anonymous
6/30/2025, 11:16:56 PM No.105758985
>>105758961
if you can't code you'll never know if the code it spits out is actually usable.
Replies: >>105759061
Anonymous
6/30/2025, 11:17:07 PM No.105758988
>>105758961
>just not pack enough of punch
You just need to look for some models that punch above their weight. Gemma 3n supposedly does that.
Replies: >>105759061
Anonymous
6/30/2025, 11:20:35 PM No.105759020
>>105758961
>i tried every model i could find
Doubt. At best, every model you could run.
Either build a big machine for deepseek, get a better gpu for some 32b, use deepseek online, or set a large swap partition to let it run overnight.
Replies: >>105759061
Anonymous
6/30/2025, 11:21:31 PM No.105759029
When is bartowski dropping hunyuan? I need my certified broken goofs.
Replies: >>105759038
Anonymous
6/30/2025, 11:22:36 PM No.105759038
>>105759029
llama.cpp doesn't even properly support it yet
Anonymous
6/30/2025, 11:24:12 PM No.105759055
>>105758960
If you tell me the one that works I'll apologize.
Anonymous
6/30/2025, 11:24:46 PM No.105759061
Screen Shot at 30-06-2025, 23-24
md5: 43285ffcb9a6c84b0e219c6075fce46b
>>105758985
well i'll know it does what i want it to do when it actually does do what i did want it to do wont i

>>105758988
thanks im gonna download it straight away :3

>>105759020
>or set a large swap partition to let it run overnight
that sounds like a game plan, ill look into it thanks
Replies: >>105759098 >>105759100 >>105759111
Anonymous
6/30/2025, 11:24:51 PM No.105759063
Screenshot from 2025-06-30 23-24-09
md5: b20b2e0989c29d1071a0161791031f83
>>105758961
>little browser addons excel macros

Not sure that any existing SOTA model was trained on such a retarded language as VBA.

I tried to code for PowerPoint with DeepSeek-R1 full. The code was full of brain-rotten bugs like uninitialized objects


>but i dont want crypto glowie sam altman to know what im doing
Tell us more about your secret fetishes. This is a blessed thread of frenship. Don't be shy
Replies: >>105759817
Anonymous
6/30/2025, 11:28:09 PM No.105759098
>>105759061
but you'll never know if it actually does what you think it does. And it's not always obvious.
Anonymous
6/30/2025, 11:28:18 PM No.105759100
>>105759061
give anon a fish and he'll be fed for a day. give anon a fishing pole and he'll shove the fishing pole up his ass and use himself as the bait.
Anonymous
6/30/2025, 11:30:09 PM No.105759111
>>105759061
>that sounds like a game plan, ill look into it thanks
I was partly joking. It's not gonna be fun.
Anonymous
6/30/2025, 11:49:28 PM No.105759298
Hello light machineguns general, if I want to coom locally do I want 1 5080 super or 2? Should I pair it with an AMD processor since intels keep catching on fire?
Replies: >>105759309 >>105762227
Anonymous
6/30/2025, 11:51:07 PM No.105759309
>>105759298
nvidia say the more you buy the more you save. why aren't you buying the RTX 6000s? is it because you're poor?
Replies: >>105759352
Anonymous
6/30/2025, 11:55:42 PM No.105759352
>>105759309
Yes
Replies: >>105759445
Anonymous
7/1/2025, 12:06:31 AM No.105759445
>>105759352
it's ok anon, i'm poor too :(
it's days like these that i'm glad i'm not brown at least
Replies: >>105759544
Anonymous
7/1/2025, 12:15:51 AM No.105759544
>>105759445
>it's days like these that i'm glad i'm not brown at least
motherfu—
Anonymous
7/1/2025, 12:38:28 AM No.105759726
ernie is retarded and keeps mixing up characters. How did deepseek do it so right and everyone else keeps fucking it up?
Replies: >>105759736 >>105760313
Anonymous
7/1/2025, 12:39:50 AM No.105759736
>>105759726
It's pure LLM and none of that multimodal shit
Replies: >>105760369
Anonymous
7/1/2025, 12:40:44 AM No.105759747
Talk me out of buying an RTX Pro 6000
Replies: >>105759754 >>105759879 >>105759924
Anonymous
7/1/2025, 12:41:29 AM No.105759754
>>105759747
Unless you buy 6 of them you won't be running anything decent. Just get a DDR5 server instead.
Anonymous
7/1/2025, 12:48:48 AM No.105759817
vulsex
md5: 412238fdefd07136f7042be609001496
>>105759063
>Tell us more about your secret fetishes.
NTA but
Replies: >>105759847 >>105759862 >>105760025
Anonymous
7/1/2025, 12:51:47 AM No.105759847
>>105759817
GET OUT
Anonymous
7/1/2025, 12:53:37 AM No.105759862
>>105759817
Nice.
Anonymous
7/1/2025, 12:54:42 AM No.105759879
>>105759747
No reason not to do it. You'll be more than halfway to running Deepseek at Q1 with one.
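Back-of-the-envelope check on that claim, assuming roughly 671B total parameters for DeepSeek and ~2 effective bits per weight for a Q1-class quant (both assumptions, not measurements), against the card's 96 GB:

```python
# Rough napkin math: is one 96 GB RTX Pro 6000 "more than halfway" to a Q1 DeepSeek?
# All numbers below are assumptions for illustration.
total_params = 671e9        # assumed DeepSeek R1/V3 total parameter count
bits_per_weight = 2.0       # assumed effective bits/weight for a Q1-ish quant, incl. overhead
kv_and_buffers_gb = 10      # assumed allowance for KV cache and runtime buffers

weights_gb = total_params * bits_per_weight / 8 / 1e9
needed_gb = weights_gb + kv_and_buffers_gb
card_gb = 96

print(f"quant ~{weights_gb:.0f} GB, total ~{needed_gb:.0f} GB")
print(f"one card covers {card_gb / needed_gb:.0%}")   # a bit over half under these assumptions
```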
Anonymous
7/1/2025, 12:59:48 AM No.105759924
>>105759747
You shouldn't be buying one when you could get two instead.
Anonymous
7/1/2025, 1:11:18 AM No.105760025
>>105759817
Can your llm rewrite the vaporeon copypasta to be about vulpix?
Replies: >>105760217
Anonymous
7/1/2025, 1:21:22 AM No.105760124
>>105758032
I'm here and I don't RP pedo shit. I use local models for simulations and testing.
Anonymous
7/1/2025, 1:24:47 AM No.105760161
If my years of experience masturbating to LLMs has taught me anything it's that I would never trust them with any remotely serious task.
Replies: >>105760194
Anonymous
7/1/2025, 1:26:43 AM No.105760181
>>105758954
The "people" that use Qwen to code are third world retards stuck in tutorial mode. The only exceptions are guys that tune it to do some very specific task they can't get Claude to do.
Replies: >>105760229 >>105762389
Anonymous
7/1/2025, 1:27:25 AM No.105760185
We all know that programming isn't a serious task, unless you're doing low level programming or security stuff.
Anonymous
7/1/2025, 1:28:42 AM No.105760194
>>105760161
For once I wish the serious task posters would explain in detail what task their local model can achieve and the reliability. So I can laugh at it.
Anonymous
7/1/2025, 1:31:01 AM No.105760217
>>105760025
Hey, did you know that in terms of hot, flame-worthy Pokémon companions, Vulpix is one of the best options? Their sly, elegant appearance already screams "fuckable," but consider this: a Vulpix’s body is literally built to handle heat. Their fur radiates intense warmth, perfect for keeping things steamy all night. With a petite, vulpine frame and that iconic curl of six fluffy tails, they’re practically begging to be pinned down.

Let’s talk anatomy. A Vulpix’s slit sits just below their tail, hidden but easily accessible—meant for quick, fiery breeding. Their internal setup is compact but flexible, able to take even the roughest pounding without losing that tight, scorching grip. And don’t even get me started on their heat cycles. When that smoldering fur starts blazing hotter, they’ll drip for anything that moves, screaming to be mounted and bred raw. Their claws? Perfect for clawing your back raw as you ride them into the ashes.

Plus, their flame-based biology means they’re always warm and wet. No lube needed. Their howls of pleasure could melt your ears, and imagine those six tails wrapping around your waist mid-thrust, pulling you deeper. They’re immune to burns, so even if things get too intense, you can keep slamming until you both explode.

So if you’re into fiery, tight-furred sluts who’ll melt under your touch and scream your name like a primal inferno, Vulpix is your match. Just stay hydrated.


R1 IQ1_M. Could be better.
Anonymous
7/1/2025, 1:32:10 AM No.105760229
>>105760181
I'm neither a third worlder nor a tutorial moder. I never began a tutorial. I refuse to code.
Anonymous
7/1/2025, 1:33:39 AM No.105760241
>>105758961
How fucking retarded are you? How do you not know how to code? Can't you just look at code and figure out how it works? You learned about variables and math in your favela school right?
Anonymous
7/1/2025, 1:39:01 AM No.105760283
So, is Ernie 4.5 424B the thing they also call Ernie X1 aka their Deepseek R1 competitor because it supports reasoning or was that a separate thing they didn't release at all?
Anonymous
7/1/2025, 1:42:55 AM No.105760313
>>105759726
deepseek didn't compromise on their dataset while everyone else is performatively lobotomizing their shit stupid style for some reason even though none of the chink companies are beholden to copyright
Anonymous
7/1/2025, 1:49:10 AM No.105760369
>>105759736
so is ernie 300b
Anonymous
7/1/2025, 1:54:11 AM No.105760407
ernie owes me sex too
Anonymous
7/1/2025, 2:01:06 AM No.105760456
I come from the future. Ernie didn't save local...
Replies: >>105760587 >>105763391
Anonymous
7/1/2025, 2:20:49 AM No.105760587
>>105760456
At least it tried.
Anonymous
7/1/2025, 2:31:47 AM No.105760652
Bro mikupad doesn't even run local models? Man FUCK this guy.
Replies: >>105760669
Anonymous
7/1/2025, 2:34:11 AM No.105760669
>>105760652
>mikupad
Isn't that just a frontend a la silly tavern?
Replies: >>105760683
Anonymous
7/1/2025, 2:36:20 AM No.105760683
>>105760669
front end for api connections
Replies: >>105760698
Anonymous
7/1/2025, 2:37:48 AM No.105760696
localhost_8000_
md5: 53efc35ae60bc06106f45b9d88d548b6
hunyuan A13B iq4_xs (quants made 10hr ago) on chat completion logs with latest pr-14425 main llama.cpp
tldr its either shit or the quants are shit or my setup is shit
Replies: >>105760773
Anonymous
7/1/2025, 2:38:15 AM No.105760698
>>105760683
Yeah. Just like Silly Tavern.
If you want to use it with a local model you need to connect it to llama.cpp server, tabby api for exl2, etc.
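To make that concrete, a minimal sketch of the frontend-to-backend hop: start llama.cpp's llama-server, then POST to its native /completion endpoint the same way a frontend like mikupad does (model path and port below are placeholders):

```python
# Minimal sketch: query a running llama.cpp server directly.
# Assumes something like `./llama-server -m model.gguf --port 8080` is already running;
# /completion with "prompt"/"n_predict" is llama.cpp's native API.
import requests

resp = requests.post(
    "http://127.0.0.1:8080/completion",
    json={
        "prompt": "The quick brown fox",
        "n_predict": 64,       # max new tokens
        "temperature": 0.8,
    },
    timeout=120,
)
resp.raise_for_status()
print(resp.json()["content"])  # the generated continuation
```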
Anonymous
7/1/2025, 2:41:11 AM No.105760729
>>105757131 (OP)
Anons, do you care to help me out choosing which open-source LLM would be the strongest and most powerful for my low-end pc? I have a 1650 gpu and 16 ram.
Replies: >>105760737 >>105760788 >>105760808 >>105761612 >>105761619
Anonymous
7/1/2025, 2:42:35 AM No.105760737
>>105760729
ram or vram?
Replies: >>105760749 >>105760772
Anonymous
7/1/2025, 2:43:56 AM No.105760749
>>105760737
1650 gpu. clearly not VRAM
Anonymous
7/1/2025, 2:46:22 AM No.105760772
>>105760737
16 ram, my vram is 8. Sorry for the confusion
Replies: >>105760840
Anonymous
7/1/2025, 2:46:27 AM No.105760773
file
md5: 65662f8d6b28682cab2b46dd7596cbd2
>>105760696
nevermind ITS STIL FUCKING BROKEN EVEN DOE IM USING CHAT COMPLETION
FUCKKKKKKKKKKKKKKKKK
v4g my dear..
Anonymous
7/1/2025, 2:47:48 AM No.105760788
>>105760729
Mistral runs fine on 16gb you can expect 1-3 tokens per second.
Replies: >>105760798
Anonymous
7/1/2025, 2:49:21 AM No.105760798
>>105760788
That's the bottom of the barrel tier but it's still fine. Unless you're a pansy like some guys here.
Anonymous
7/1/2025, 2:50:14 AM No.105760808
>>105760729
Mythomax 13b and miqu 70b are the best local models, and while you likely can't run 70b, mythomax is perfect for your setup
Replies: >>105760833
Anonymous
7/1/2025, 2:54:11 AM No.105760833
>>105760808
o-okay Daddy *whimpers*
Anonymous
7/1/2025, 2:55:02 AM No.105760840
>>105760772
since when does the 1650 have 8gigs of vram?
Replies: >>105760861
Anonymous
7/1/2025, 2:57:08 AM No.105760861
>>105760840
Could be a Chinese retrofit.
Replies: >>105760904
Anonymous
7/1/2025, 2:58:44 AM No.105760876
Emma Arnold Wink Jeopardy 11-16-2018 (Teen Tournament Game 08 of 10)_thumb.jpg
I tried out the text to speech applications OpenAudio S1 Mini and the full 4B model from the fish audio website
S1 Mini Output file of a voice clone of Emma Arnold
https://vocaroo.com/1ljCsOjOwAp4
S1 Mini output file fed to a fine tuned Seed-VC Model
https://vocaroo.com/1c1zpJpCWvrk
S1 4B model output file
https://vocaroo.com/1k4hyiWkULhH
Replies: >>105760929
Anonymous
7/1/2025, 3:01:02 AM No.105760904
file
md5: 3188ab96248b1de87f90608f8fe12d7d
>>105760861
>search gtx 1650 8gb on ali
>see this
what THE FUCK
why is an 980M in a SXM or whatever server form factor
Replies: >>105760942 >>105760959
Anonymous
7/1/2025, 3:03:39 AM No.105760929
>>105760876
S1 4B model is the best one
Anonymous
7/1/2025, 3:04:19 AM No.105760934
>We have AI at home
>AI at home
Anonymous
7/1/2025, 3:05:45 AM No.105760942
>>105760904
Was joking because maybe he meant 3050 8gb or whatever.
But I guess adding extra vram to any card is viable. That's pretty cool. Some guy is a pro and has his own workshop.
Anonymous
7/1/2025, 3:08:24 AM No.105760959
XMG-U726-Laptop_GeForce-GTX-980_1
md5: f9fc1d86742bf51721270daa4993c297
>>105760904
It's a laptop card, we used to have upgradable laptop GPUs
Replies: >>105760965 >>105760988
Anonymous
7/1/2025, 3:09:09 AM No.105760965
>>105760959
damn.. things were so good back then
Anonymous
7/1/2025, 3:11:33 AM No.105760988
>>105760959
I still have this workstation notebook with an mxm nvidia k2000 or whatever.
The thing is a proper brick of solid metal, it's great.
It even has an express card slot.
Replies: >>105761009
Anonymous
7/1/2025, 3:12:38 AM No.105760998
file
md5: 53839f660d9a72d8b193206e8d83e54e
cydonia.. just bruh
Anonymous
7/1/2025, 3:13:52 AM No.105761009
>>105760988
That's Quadro k2000 or something. Quadro was always the workstation gpu before this AI nonsense happened.
Anonymous
7/1/2025, 3:23:40 AM No.105761068
THAT site says you can run any model so long as you're willing to wait long enough. Any idea what they're using though? oogabooga won't even load the big models on mine.
Replies: >>105761094 >>105761107 >>105761115
Anonymous
7/1/2025, 3:26:54 AM No.105761094
>>105761068
>THAT site says you can run any model so long as you're willing to wait long enough
that only applies until you've run out of memory
Replies: >>105761111 >>105763397
Anonymous
7/1/2025, 3:29:48 AM No.105761107
>>105761068
post your whole computer specs, os you're using, exact model you're trying to run
preferably post some logs too
Replies: >>105761124
Anonymous
7/1/2025, 3:30:00 AM No.105761111
>>105761094
even so how do they get it to load in the first place?
Anonymous
7/1/2025, 3:30:35 AM No.105761115
>>105761068
I think they're referring to a specific user that got the biggest models running on a 4090 but it was so painfully, horribly slow that it wasn't worth it. Reddit also commonly mistakes the distilled models for the actual model so just ignore them.
Anonymous
7/1/2025, 3:31:45 AM No.105761124
>>105761107
ugh any time someone says this they never even answer the question when you're done
the world's first real humiliation ritual
Anonymous
7/1/2025, 3:34:32 AM No.105761142
file
md5: a38053ea45339a6e372812ca8b2c718c
>it's so hard to post i5 12400f rtx 3060 12gb 64gb ddr4 linux mint 16 and picrel
bro...
Replies: >>105761458
Anonymous
7/1/2025, 3:37:17 AM No.105761169
>its so hard to post a screenshot that isn't cropped like a retard
Replies: >>105761458
Anonymous
7/1/2025, 3:38:56 AM No.105761180
>its so important for an example to crop your screenshot to include actual error logs
Replies: >>105761458
Anonymous
7/1/2025, 3:58:28 AM No.105761320
An LLM could take a better screenshot. You will be replaced.
Anonymous
7/1/2025, 4:03:47 AM No.105761351
lip biting
blood drawing
knuckles whitening
wrists flicking
Replies: >>105761371
Anonymous
7/1/2025, 4:05:22 AM No.105761367
her vision blurring
Anonymous
7/1/2025, 4:06:16 AM No.105761371
1736652264043234
md5: e973b8ed44be1da610127571631584bf
>>105761351
SPINE SHIVERIN'
Replies: >>105761523
Anonymous
7/1/2025, 4:08:20 AM No.105761392
Without chatbots or whatever I just want AI Dungeon type stuff but every resource I find is just more chatslop.
What sort of models/front end or whatever should I be using for text adventure now?
Thanks for spoonfeed, 12gb vram, 32gb ram if that changes much.
Replies: >>105761400 >>105761431 >>105761461 >>105761474
Anonymous
7/1/2025, 4:09:18 AM No.105761400
>>105761392
wayfarer models seem fine
Anonymous
7/1/2025, 4:13:38 AM No.105761431
>>105761392
ChatGPT was a disaster for LLMs. It's all just chat models now.
Anonymous
7/1/2025, 4:16:16 AM No.105761458
>>105761142
>>105761169
>>105761180
I'm not even at home I'm not posting logs for a theoretical question.
Anonymous
7/1/2025, 4:16:35 AM No.105761461
>>105761392
rocinante/cydonia work fine for novel, plaintext story etc formats, no need for instruct format. if you specifically want text adventure then wayfarer (made by ai dungeon ppl) probably the way to go
Anonymous
7/1/2025, 4:16:52 AM No.105761464
>>105757131 (OP)
Adorable Miku
Anonymous
7/1/2025, 4:17:55 AM No.105761471
>oogabooga won't even load the big models on mine.
>oogabooga
Replies: >>105761826
Anonymous
7/1/2025, 4:18:08 AM No.105761474
>>105761392
Use a base model and the right prompt
Anonymous
7/1/2025, 4:23:38 AM No.105761523
>>105761371
lol
Anonymous
7/1/2025, 4:40:23 AM No.105761612
>>105760729
gemma-3-4b and gemma-3n-e4b
Anonymous
7/1/2025, 4:41:04 AM No.105761619
>>105760729
ernie 21b a3b instruct
Anonymous
7/1/2025, 5:10:56 AM No.105761808
Libra: Synergizing CUDA and Tensor Cores for High-Performance Sparse Matrix Multiplication
https://arxiv.org/abs/2506.22714
>Sparse matrix multiplication operators (i.e., SpMM and SDDMM) are widely used in deep learning and scientific computing. Modern accelerators are commonly equipped with Tensor cores and CUDA cores to accelerate sparse operators. The former brings superior computing power but only for structured matrix multiplication, while the latter has relatively lower performance but with higher programming flexibility. In this work, we discover that utilizing one resource alone leads to inferior performance for sparse matrix multiplication, due to their respective limitations. To this end, we propose Libra, a systematic approach that enables synergistic computation between CUDA and Tensor cores to achieve the best performance for sparse matrix multiplication. Specifically, we propose a 2D-aware workload distribution strategy to find out the sweet point of task mapping for different sparse operators, leveraging both the high performance of Tensor cores and the low computational redundancy on CUDA cores. In addition, Libra incorporates systematic optimizations for heterogeneous computing, including hybrid load-balancing, finely optimized kernel implementations, and GPU-accelerated preprocessing. Extensive experimental results on H100 and RTX 4090 GPUs show that Libra outperforms the state-of-the-art by on average 3.1x (up to 9.23x) over DTC-SpMM and 2.9x (up to 3.9x) for end-to-end GNN applications. Libra opens up a new perspective for sparse operator acceleration by fully exploiting the heterogeneous computing resources on GPUs.
Posting for Johannes
Replies: >>105761966 >>105763009
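Not the paper's implementation, just a toy CPU illustration of the split it describes: tiles that are dense enough go through the structured ("Tensor core") path as small dense GEMMs, while the leftover scattered nonzeros go through the flexible ("CUDA core") path. Tile size and density cutoff are arbitrary.

```python
# Toy sketch of Libra-style hybrid SpMM on the CPU: numpy dense GEMM stands in for
# the Tensor core path, a plain sparse matmul stands in for the CUDA core path.
import numpy as np
import scipy.sparse as sp

def hybrid_spmm(A, B, tile=32, density_cutoff=0.15):
    M, K = A.shape
    C = np.zeros((M, B.shape[1]), dtype=B.dtype)
    covered = np.zeros((M, K), dtype=bool)          # nonzeros handled by the dense path
    for i0 in range(0, M, tile):
        for k0 in range(0, K, tile):
            blk = A[i0:i0 + tile, k0:k0 + tile]
            if blk.nnz / (blk.shape[0] * blk.shape[1]) >= density_cutoff:
                # "Tensor core" path: treat the whole tile as dense, do a small GEMM.
                C[i0:i0 + blk.shape[0]] += blk.toarray() @ B[k0:k0 + blk.shape[1]]
                covered[i0:i0 + blk.shape[0], k0:k0 + blk.shape[1]] = True
    # "CUDA core" path: whatever nonzeros no dense tile picked up.
    C += A.multiply(~covered) @ B
    return C

A = sp.random(256, 256, density=0.15, format="csr", random_state=0)
B = np.random.rand(256, 64)
assert np.allclose(hybrid_spmm(A, B), A @ B)        # the split changes nothing numerically
```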
Anonymous
7/1/2025, 5:14:36 AM No.105761826
>>105761471
oh fuck you nigger thats exactly what it sounds like obviously I know the real name
Anonymous
7/1/2025, 5:22:16 AM No.105761870
I bought a GPU just for local ai coding and it runs qwen3:32b pretty fast, but it's retarded when it comes to using tools and I'm limited to a 14k context window so I have to restart after every prompt. I'm starting to think that local ai sucks for coding without an array of beefy high vram gpus. Is this a fair assumption?
Replies: >>105761906 >>105762018 >>105763165 >>105763332
Anonymous
7/1/2025, 5:28:54 AM No.105761906
>>105761870
Claude Opus 4 sucks for coding and it's the only tolerable model. All local models are dogshit.
Replies: >>105762018
Anonymous
7/1/2025, 5:40:06 AM No.105761966
>>105761808
Are we now waiting for this to be incorporated into the backend libraries?
Replies: >>105762753
Anonymous
7/1/2025, 5:48:56 AM No.105762018
>>105761906
>>105761870
then go, pay your corporate overlords endless amounts of cash for access to larger models.
go on, go. they're calling you to fill their billionaire pockets.
Replies: >>105762097
Anonymous
7/1/2025, 6:01:16 AM No.105762097
1751310046042168
md5: 779fb857a6cb04264d032f444f2fd121
>>105762018
We're paying our corporate overlords whenever we buy GPU, there's no escape. The game was rigged from the start
Anonymous
7/1/2025, 6:24:24 AM No.105762227
>>105759298
1 or less (worse card). Performance plateaus fast and hard past 20B. Don't believe me? Before committing, buy yourself cloud gpu access for a few dollars, set it up and check.
Anonymous
7/1/2025, 6:55:28 AM No.105762389
>>105760181
>local
Anonymous
7/1/2025, 7:00:13 AM No.105762416
Holy shit, just found out Gemini 2.5 is free. Unlimited prompts.
I'm in heaven
Replies: >>105762713 >>105763404 >>105763421 >>105765588
Anonymous
7/1/2025, 7:01:23 AM No.105762424
>>105758424
considering 1 chink is worth 10 western devs
Anonymous
7/1/2025, 7:42:26 AM No.105762713
>>105762416
>>>/g/aicg/
Anonymous
7/1/2025, 7:49:08 AM No.105762753
>>105761966
Only RELU models have sparsity and everyone stopped using RELU.
Anonymous
7/1/2025, 7:53:33 AM No.105762782
ERNIE or Hunyuan?
Replies: >>105762855
Anonymous
7/1/2025, 8:05:47 AM No.105762855
>>105762782
Neither. Deepseek won.
llama.cpp CUDA dev !!yhbFjk57TDr
7/1/2025, 8:29:39 AM No.105763009
>>105761808
Noted but I think this will only be useful in combination with dropout layers during training.
Anonymous
7/1/2025, 8:54:15 AM No.105763165
>>105761870
>limited to a 14k context window so I have to restart after every prompt

>14k
>22 pages of spaghetti code
>I"m artist, so long be my prompts
Anonymous
7/1/2025, 9:12:42 AM No.105763287
Is there any nemo tune that is good at instruction following?
Anonymous
7/1/2025, 9:19:18 AM No.105763332
>>105761870
I personally find it shit too. But tool calls have been alright with me. Qwen2.5 32b coder was more consistent but lacked good tool call. All I can say is make sure you're running as large a quant as you can and don't quantise the context cache. Qwen3 is retarded when both those things are done and you need accuracy. If you're using ollama you're probably being given bad default values for the model. Post model loading parameters and your ram/vram so we can see if it's bad config or entirely the model being a tard.
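To make "post your loading parameters" concrete, a hedged sketch with llama-cpp-python (one backend among several); the model path and numbers are placeholders, the point is setting context and GPU offload explicitly instead of trusting a frontend's defaults:

```python
# Load a GGUF with explicit parameters instead of whatever defaults a frontend picks.
# Path and sizes are placeholders; adjust to your model and VRAM.
from llama_cpp import Llama

llm = Llama(
    model_path="models/qwen3-32b-q5_k_m.gguf",  # run the largest quant that fits
    n_ctx=32768,        # the context you actually need, not the library default
    n_gpu_layers=-1,    # offload every layer that fits on the GPU
    # KV cache left at its default (unquantized) precision, per the advice above
)

out = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Write a function that reverses a string."}],
    max_tokens=256,
)
print(out["choices"][0]["message"]["content"])
```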
Anonymous
7/1/2025, 9:26:29 AM No.105763370
>>105758293
I don't think it'll do much, since I don't believe in their supposed 'talent'.
Anonymous
7/1/2025, 9:29:24 AM No.105763391
>>105760456
How many more years of nemo?
Replies: >>105763811
Anonymous
7/1/2025, 9:30:52 AM No.105763397
>>105761094
>run out of memory
Just add swap.
Anonymous
7/1/2025, 9:32:05 AM No.105763404
>>105762416
Other than the ai studio? I really only care if I can use the api.
Anonymous
7/1/2025, 9:35:18 AM No.105763421
>>105762416
Yes, and Grok3 does deepresearch "for free".
I can't imagine using closed for anything but work or mundane stuff though. Especially if we get some sort of pc assistant or glasses, local will be so important.
What's available already for free is crazy though. If only I had all those tools as a kid.
Replies: >>105763437
Anonymous
7/1/2025, 9:37:43 AM No.105763437
>>105763421
Nothing as useful runs at a decent speed unless I spend tons of money.
Replies: >>105763653
Anonymous
7/1/2025, 9:51:36 AM No.105763507
anons, I've got a 1080 Ti ,11GB VRAM, 128 RAM, and a Ryzen 9 5950X. What should I mess around with?
Replies: >>105763527 >>105763543 >>105764171
Anonymous
7/1/2025, 9:54:26 AM No.105763527
>>105763507
>1080 Ti
>What should I mess around with?
Your wallet.
Anonymous
7/1/2025, 9:56:29 AM No.105763543
>>105763507
mistral nemo
Replies: >>105764488
Anonymous
7/1/2025, 10:13:17 AM No.105763653
Screenshot_20250701_170838
md5: f2681e678dbf6b592ced2027b68a249b
>>105763437
Local will always lag behind. You are too blackpilled.
You should have seen the state a couple years ago anon.
That we now have a tiny 500mb 0.6b model with qwen3 that can even cook up a coherent website is crazy. Also tool calls.
Local has many uses nowadays.
I made a minecraft admin ai for my kids. They can talk to it and the AI will drop the commands in their world. Like give them items, teleport people or change settings etc.
I would do as much as possible local and only go closed if you hit a wall and don't mind it being shared forever.
I only spent like 300 bucks on my P40 in 2 years.
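A minimal sketch of how a setup like that can be wired (illustrative only, not the actual implementation described above): constrain a local model to emit a single JSON command, then forward it to the server console, e.g. over RCON.

```python
# Hypothetical "Minecraft admin AI" glue: a local model turns a request into one
# server command. Assumes llama-server's OpenAI-style endpoint is running locally;
# actually executing the command (e.g. via RCON) is left as a stub.
import json
import requests

SYSTEM = (
    "You are a Minecraft server admin. Reply with ONLY a JSON object like "
    '{"command": "/give Steve minecraft:diamond 3"} and no other text.'
)

def request_command(user_text: str) -> str:
    resp = requests.post(
        "http://127.0.0.1:8080/v1/chat/completions",
        json={
            "messages": [
                {"role": "system", "content": SYSTEM},
                {"role": "user", "content": user_text},
            ],
            "temperature": 0.2,
        },
        timeout=60,
    )
    resp.raise_for_status()
    reply = resp.json()["choices"][0]["message"]["content"]
    return json.loads(reply)["command"]

cmd = request_command("give my sister a saddle and teleport her to me")
print("would run:", cmd)   # a real setup forwards this to the server console/RCON
```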
Anonymous
7/1/2025, 10:33:34 AM No.105763765
Damn, /ldg/ hates pascal cards.
I'm already used to slow ass output but now everybody is writing how this cool nunchuk solution makes Q4 possible for flux kontext. Tiny, fast!
Used 2 hours of my time to set this shit up and download everything.
>ERROR WHILE LOADING MODEL!! Too old!! Supported after: Turing GPU (e.g., NVIDIA 20-series)
Makes me wonder how text would look for me without johannes. Thanks buddy. Appreciate your work.
Anonymous
7/1/2025, 10:40:31 AM No.105763811
>>105763391
All.
Anonymous
7/1/2025, 11:39:36 AM No.105764171
>>105763507
try the new gemma 3n, then use your newly free vram for a tts and whisper to make a poormans speech to speech that will only make you feel depressed
Anonymous
7/1/2025, 12:18:30 PM No.105764463
amd radeon, 16gb ram, amd ryzen 7, ideally windows 11 but can dual boot linux or windows-subsystem-for-linux if necessary,
what desktop app and what model to use?
(for chatting, not coding)
Replies: >>105764477 >>105764488
Anonymous
7/1/2025, 12:21:27 PM No.105764477
>>105764463
Read the lazy guide in the OP.
Anonymous
7/1/2025, 12:22:05 PM No.105764483
1751362252647327
md5: 16992d56e8d28a2e86aa88be612775e9
Since we are talking about pascal and 1080ti.
Be sure to do the needful and thank leather jacket man for cleaning up the drivers.
Replies: >>105764512 >>105764521
Anonymous
7/1/2025, 12:23:17 PM No.105764488
>>105764463
>>105763543
Anonymous
7/1/2025, 12:26:42 PM No.105764512
file
md5: 33a0aa477b5b3e891800acdef3daab1a
>>105764483
>cleaning up the drivers
They will only keep getting bigger.
Replies: >>105766267
Anonymous
7/1/2025, 12:28:00 PM No.105764521
>>105764483
>to do the needful
Saar, I
Replies: >>105764546
Anonymous
7/1/2025, 12:29:23 PM No.105764534
What's the new P40 if you need to move on from pascal?
5060ti super? 16gb and it's not that expensive.
I fear there might be a catch somewhere though. Looks too good to be true at first glance.
Replies: >>105764764
Anonymous
7/1/2025, 12:31:19 PM No.105764546
>>105764521
Kindly adjust.
Anonymous
7/1/2025, 1:03:35 PM No.105764764
>>105764534
3090s will drop to sub-400 once the new 5070 ti super with 24gb is out
Replies: >>105765243
Anonymous
7/1/2025, 1:25:01 PM No.105764901
>>105758674
>Mills & Boon purple prose
>mixture of vulnerability and aroused desire

why are all the models, regardless of whether they are local or not, writing in that horrible shitty erotic style?
Replies: >>105765054 >>105765275 >>105765472 >>105768886
Anonymous
7/1/2025, 1:43:42 PM No.105765054
>>105764901
Dataset issue where this purple stuff is over-represented for "erotic work"?
Replies: >>105765118
Anonymous
7/1/2025, 1:51:43 PM No.105765118
>>105765054
I wonder if DPO can be used to fix this. All we need is a dataset of slop-normal paragraph pairs
Replies: >>105765228
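A minimal sketch of what that could look like with Hugging Face trl's DPOTrainer, assuming you already have such pairs: "chosen" is the plainly written paragraph, "rejected" is the purple version of the same scene. The model name, hyperparameters, and the single example pair are placeholders, and trl's argument names shift a little between versions.

```python
# Hedged anti-slop DPO sketch with trl. Everything concrete (model, pair, hyperparameters)
# is a placeholder; older trl versions take tokenizer= instead of processing_class=.
from datasets import Dataset
from transformers import AutoModelForCausalLM, AutoTokenizer
from trl import DPOConfig, DPOTrainer

model_name = "mistralai/Mistral-Nemo-Instruct-2407"     # placeholder base model
model = AutoModelForCausalLM.from_pretrained(model_name)
tokenizer = AutoTokenizer.from_pretrained(model_name)

pairs = Dataset.from_dict({
    "prompt":   ["Describe the two characters meeting again after years apart."],
    "chosen":   ["She saw him by the door and stopped. Neither of them spoke."],
    "rejected": ["A shiver ran down her spine, a heady mixture of vulnerability and desire..."],
})

args = DPOConfig(
    output_dir="anti-slop-dpo",
    beta=0.1,                        # how strongly to push away from the rejected style
    per_device_train_batch_size=1,
    num_train_epochs=1,
)

trainer = DPOTrainer(
    model=model,
    ref_model=None,                  # trl builds the frozen reference copy itself
    args=args,
    train_dataset=pairs,
    processing_class=tokenizer,
)
trainer.train()
```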
Anonymous
7/1/2025, 2:04:31 PM No.105765228
>>105765118
>All we need
Anonymous
7/1/2025, 2:06:44 PM No.105765243
>>105764764
I feel like it's copium but I want to believe you.
Anonymous
7/1/2025, 2:10:29 PM No.105765275
>>105764901
I'm willing to bet it's all from their ancient GPT3/4-derived "RP datasets" these companies are using.
Anonymous
7/1/2025, 2:31:08 PM No.105765426
Screenshot 2025-07-01 at 09-30-41 Feature Request per-chat prompt caching · Issue #14470 · ggml-org_llama.cpp
Isn't this kind of already a thing with the slots deal?
Anonymous
7/1/2025, 2:36:50 PM No.105765472
>>105764901
That is the only kind of smut that passes the filtering. Probably because the filtering usually involved ayumu style naughty-words-per-sentence counting. Also explains why it's so hard for models to say... you know what.
Replies: >>105765487 >>105765503 >>105766545
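For reference, a toy version of that kind of filter; the flagged-word list and the cutoff are invented purely for illustration, the point is that documents get scored per sentence and dropped wholesale:

```python
# Toy "naughty words per sentence" pretraining filter: keep a document only if the
# average sentence contains few flagged words. List and threshold are made up.
import re

FLAGGED = {"cock", "pussy", "cum"}      # stand-in list; real ones are much longer
MAX_PER_SENTENCE = 0.5                  # invented cutoff

def keep_document(text: str) -> bool:
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    if not sentences:
        return True
    hits = sum(w in FLAGGED for s in sentences for w in re.findall(r"[a-z']+", s.lower()))
    return hits / len(sentences) <= MAX_PER_SENTENCE

print(keep_document("A mixture of vulnerability and desire washed over her."))  # True
print(keep_document("He grabbed her pussy. Cum everywhere."))                   # False
```

Purple prose sails straight through a counter like this, which is exactly the selection effect being complained about.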
Anonymous
7/1/2025, 2:38:32 PM No.105765487
>>105765472
I stop. This is wrong. This is so, so wrong.
Anonymous
7/1/2025, 2:39:51 PM No.105765500
file
md5: 100db501fce0a10ce6b1b8a6b7aa3dfd
migu waiting room
Replies: >>105766012
Anonymous
7/1/2025, 2:40:00 PM No.105765503
1739749376750
md5: cbfde14415fa887589f26c95f67b8df3
>>105765472
>Probably
Meta said as much in L3's paper.
>dirty word counting
Replies: >>105765747 >>105766545 >>105766876 >>105767501
Anonymous
7/1/2025, 2:50:20 PM No.105765588
>>105762416
Is gemini pro free? I need to do some distillation
Replies: >>105766334
Anonymous
7/1/2025, 3:13:23 PM No.105765747
>>105765503
That is what i had in mind but you don't know if everyone else did the same.
Anonymous
7/1/2025, 3:15:29 PM No.105765764
I am tired of waiting for a model that will never arrive. How do i develop werewolf sex fetish?
Replies: >>105765796
Anonymous
7/1/2025, 3:20:17 PM No.105765796
>>105765764
2mw
unironically this time
Replies: >>105765855
Anonymous
7/1/2025, 3:22:02 PM No.105765808
LWACKNWEXFNWIMTVLKAY
md5: a2dec731565e717b40d866a38f399630
I am trying to goof Hunyuan-A13B-Instruct, but convert_hf_to_gguf.py throws an error:
KeyError: 'model.layers.0.mlp.experts.3.down_proj.weight'

Anyone got a clue what is happening? I checked on HF and I can see this layer there...
Replies: >>105765839 >>105766679
Anonymous
7/1/2025, 3:27:34 PM No.105765839
>>105765808
You should only download goofs from trusted goofers like Unsloth to not miss out on the embedded bitcoin miners
Anonymous
7/1/2025, 3:30:14 PM No.105765855
>>105765796
2mw until nemo is still the answer
Anonymous
7/1/2025, 3:50:30 PM No.105765979
1741571748772881
md5: 066f60aadcb0893417256484c495e25d
>>105757402
>>105757231
>>105758617
>>105758626
The mikutranny posting porn in /ldg/:
>>105715769
It was up for hours while anyone keking on troons or niggers gets deleted in seconds, talk about double standards and selective moderation:
https://desuarchive.org/g/thread/104414999/#q104418525
https://desuarchive.org/g/thread/104414999/#q104418574
Here he makes >>105714003 ryona picture of generic anime girl, probably because it's not his favorite vocaloid doll, he can't stand that as it makes him boil like a druggie without a fentanyl dose, essentially a war for rights to waifuspam or avatarfag in thread.

Funny /r9k/ thread: https://desuarchive.org/r9k/thread/81611346/
The Makise Kurisu damage control screencap (day earlier) is fake btw, no matches to be found, see https://desuarchive.org/g/thread/105698912/#q105704210 janny deleted post quickly.

TLDR: Mikufag / janny deletes everyone dunking on trannies and resident spammers, making it his little personal safespace. Needless to say he would screech "Go back to teh POL!" anytime someone posts something mildly political about language models or experiments around that topic.

And lastly as said in previous thread(s) >>105716637, i would like to close this by bringing up key evidence everyone ignores. I remind you that cudadev of llama.cpp (JohannesGaessler on github) has endorsed mikuposting. That's it.
He also endorsed hitting that feminine jart bussy a bit later on. QRD on Jart - The code stealing tranny: https://rentry.org/jarted

xis accs
https://x.com/brittle_404
https://x.com/404_brittle
https://www.pixiv.net/en/users/97264270
https://civitai.com/user/inpaint/models
Replies: >>105766011 >>105766144
Anonymous
7/1/2025, 3:58:19 PM No.105766011
>>105765979
Weird that you still can't explain why I should care...
Replies: >>105766053 >>105766174 >>105766177
Anonymous
7/1/2025, 3:58:28 PM No.105766012
>>105765500
In the meantime use this!
https://x.com/rebane2001/status/1939722006343155939
Anonymous
7/1/2025, 4:01:40 PM No.105766029
>July
>Still no Sama model
It's over.
Replies: >>105766042 >>105766043 >>105768619
Anonymous
7/1/2025, 4:03:13 PM No.105766042
>>105766029
4.1 is worse than DeepSeek and considering they aren't going to release anything that outbenches or is less censored than that I'm not sure how anyone can be genuinely excited for whatever slop they push out. Unironically they'd get better goodwill if they just open weighted GPT-3.5 and Turbo since nobody is paying for those anymore.
Anonymous
7/1/2025, 4:03:32 PM No.105766043
>>105766029
He's going to release GPT5 first.
Anonymous
7/1/2025, 4:04:05 PM No.105766053
>>105766011
Then you are retarded, this is out of my scope.
Replies: >>105766088
Anonymous
7/1/2025, 4:07:34 PM No.105766088
>>105766053
Sounds like you're seething that nobody cares about your latest homoerotic fixation. Maybe keep it to Discord DMs next time, champ.
Replies: >>105766174
Anonymous
7/1/2025, 4:15:44 PM No.105766144
>>105765979
Based! Kill all jannies and mikutroons.
Anonymous
7/1/2025, 4:20:25 PM No.105766174
>>105766011
>>105766088
Go back to xitter with your slop
Anonymous
7/1/2025, 4:20:36 PM No.105766177
>>105766011
Because you obviously care about thread quality and don't want low quality avatarspam. Right?
Replies: >>105766204
Anonymous
7/1/2025, 4:21:13 PM No.105766183
>can do anything in life
>decides to dedicate it to schizoing it up in a fringe general on the 4chan technology board
Replies: >>105766192
Anonymous
7/1/2025, 4:23:03 PM No.105766192
>>105766183
Yeah I also don't get it why mikutroons spam their shitty agp avatar.
Anonymous
7/1/2025, 4:23:56 PM No.105766199
Also, did we take note a couple of days ago that Meta won its court case?
Training LLMs on copyrighted books is fair use. So only Europe is dead as far as LLM training goes.
Anonymous
7/1/2025, 4:24:34 PM No.105766204
inpainting_thumb.jpg
md5: c788cda9419b520ff66409ea8d8a9f31
>>105766177
>low quality
NTA but I like seeing what someone can produce with contemporary local models given some effort.
If it was just low-effort txt2img then I would agree that it lowers the quality of the thread.
Replies: >>105766245
Anonymous
7/1/2025, 4:29:12 PM No.105766245
>>105766204
Great. Fuck off to any local diffusion thread with this shit.
Replies: >>105766261
Anonymous
7/1/2025, 4:30:47 PM No.105766261
>>105766245
Erm... what are you actually going to DO about it, zoezeigleit?
Replies: >>105766268 >>105766295
Anonymous
7/1/2025, 4:31:28 PM No.105766267
>>105764512
poor wintoddler, on linux the drivers are like 300mb in size
>105765979
thanks for the (You) soiteen
Anonymous
7/1/2025, 4:31:47 PM No.105766268
>>105766261
you are just as bad as that retard you are arguing with. stop shitting up the thread.
Anonymous
7/1/2025, 4:33:57 PM No.105766284
As a reliable independent third party, I agree BOTH need banned immediately.
Replies: >>105766740
Anonymous
7/1/2025, 4:35:37 PM No.105766295
>>105766261
I am going to shit up the thread and thus the /lmg/ status quo of eternal conflict between sane people and mikutroons is preserved. Now go dilate your wound bussy.
Replies: >>105766308
Anonymous
7/1/2025, 4:37:31 PM No.105766308
>>105766295
>I am going to shit up the thread
Like you don't do that for literally any reason.
Replies: >>105766314
Anonymous
7/1/2025, 4:38:18 PM No.105766314
>>105766308
Only for mikutroonism that should die.
Anonymous
7/1/2025, 4:38:25 PM No.105766316
can't spell llama.cpp without the c which also btw stands for CUNNY
Replies: >>105766331 >>105766344 >>105766368
Anonymous
7/1/2025, 4:40:00 PM No.105766325
>>105758398
this is a whole different internal organization than the meta gen AI team who are the llama devs. also different than FAIR.
Anonymous
7/1/2025, 4:40:33 PM No.105766331
>>105766316
the ultimate state of local threads
yall need the rope
Anonymous
7/1/2025, 4:40:44 PM No.105766334
>>105765588
I heard from a friend Gemini (even pro?) is free through some platform. He said they can afford to do it because not many people know about it.
Replies: >>105766378
Anonymous
7/1/2025, 4:41:49 PM No.105766344
>>105766316
llama.sipipi
Anonymous
7/1/2025, 4:44:28 PM No.105766368
>>105766316
bu-bu-buh BASED?????
Anonymous
7/1/2025, 4:45:51 PM No.105766378
>>105766334
Smart move. They want to hoard quality conversation data, so they don't want the masses to realize that it's free.
Anonymous
7/1/2025, 4:59:43 PM No.105766509
https://www.youtube.com/watch?v=atXyXP3yYZ4
Replies: >>105766541
Anonymous
7/1/2025, 5:04:06 PM No.105766541
1730276627305814
1730276627305814
md5: 20ac3f96819af1a88b786d5b0d385d75🔍
>>105766509
Anonymous
7/1/2025, 5:04:13 PM No.105766545
>>105765472
>>105765503
So they literally reject most porn/hardcore content from datasets? No wonder what is kept is mostly softcore smut for divorced women. The stuff for a male audience is more explicit.
I wonder if any unfiltered ones exist.
Replies: >>105766794
Anonymous
7/1/2025, 5:14:07 PM No.105766639
lurker
just getting into ai
can i check, do you guys just have sexy chats with computers or is there more to the general?
Replies: >>105766658
Anonymous
7/1/2025, 5:15:26 PM No.105766658
>>105766639
most people are just here to shitpost
Anonymous
7/1/2025, 5:16:13 PM No.105766668
I want my model to learn maybe 500+ pages of text (not code). dont want just 4096 context.
how to do this?

is retrieval augmented generation (RAG) cope? (sounds like it)
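For reference, what people usually mean by RAG is roughly: chunk the pages, embed the chunks once, then at question time retrieve the most similar chunks and stuff them into the prompt. A minimal sketch of that loop, assuming sentence-transformers is installed and book.txt is your text (the embedding model name and chunk sizes are just placeholders):

[code]
# rough RAG sketch: chunk the 500 pages, embed once, then at question time
# pull the top-k most similar chunks into the prompt
import numpy as np
from sentence_transformers import SentenceTransformer

def chunk(text, size=1000, overlap=200):
    out, i = [], 0
    while i < len(text):
        out.append(text[i:i + size])
        i += size - overlap
    return out

embedder = SentenceTransformer("all-MiniLM-L6-v2")  # example model

with open("book.txt") as f:
    chunks = chunk(f.read())
chunk_vecs = embedder.encode(chunks, normalize_embeddings=True)

def retrieve(question, k=8):
    q = embedder.encode([question], normalize_embeddings=True)[0]
    scores = chunk_vecs @ q            # cosine similarity (vectors are normalized)
    top = np.argsort(scores)[::-1][:k]
    return [chunks[i] for i in top]

question = "What happens in chapter 3?"
context = "\n---\n".join(retrieve(question))
prompt = f"Use the excerpts below to answer.\n{context}\n\nQuestion: {question}\nAnswer:"
# feed `prompt` to whatever backend you run (llama.cpp, kobold, etc.)
print(prompt[:2000])
[/code]

Whether it's cope depends on what you ask: it's fine for needle-style lookups, much weaker when an answer actually needs all 500 pages at once.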
Anonymous
7/1/2025, 5:17:27 PM No.105766679
>>105765808
Update:
I was in the FP8 repo...
Worked fine in the original one.
Anonymous
7/1/2025, 5:23:59 PM No.105766740
>>105766284
Just ignore it.
If you are too much of a snowflake for that, do two clicks and hide the posts or use a filter.
Anonymous
7/1/2025, 5:28:03 PM No.105766794
>>105766545
I think the datasets themselves are unfiltered, but get filtered when they are used by almost all model providers
Anonymous
7/1/2025, 5:35:02 PM No.105766864
Mistral Large 3? Mistral-Nemotron open source?
What are these french fucks doing?
Replies: >>105766877 >>105766903 >>105766941 >>105766975
Anonymous
7/1/2025, 5:36:14 PM No.105766876
>>105765503
fuck if they all do that no wonder they're all shit at anything nsfw, I guess this is doomed
Replies: >>105766900
Anonymous
7/1/2025, 5:36:26 PM No.105766877
>>105766864
Updated model coming soon!
Replies: >>105767181
Anonymous
7/1/2025, 5:38:25 PM No.105766900
>>105766876
why do you want your investor friendly assistant to use dirty words?
Replies: >>105767085
Anonymous
7/1/2025, 5:38:54 PM No.105766903
>>105766864
>With the launches of Mistral Small in March and Mistral Medium today, it’s no secret that we’re working on something ‘large’ over the next few weeks. With even our medium-sized model being resoundingly better than flagship open source models such as Llama 4 Maverick, we’re excited to ‘open’ up what’s to come :)
>May 7, 2025
just a few more weeks... let's say, hmm, 2?
Anonymous
7/1/2025, 5:43:33 PM No.105766941
>>105766864
Small 3.2 is the new nemo
Anonymous
7/1/2025, 5:47:09 PM No.105766975
>>105766864
>Mistral Large 3?
Probably 500B+ parameters MoE.
>Mistral-Nemotron
Almost certainly ~150B parameters MoE.
>open source?
Maybe, maybe not.
>What are these french fucks doing?
Mistral Large is probably not done training yet and they aren't sure yet if it makes financial sense to open-weight a variation of the model they're using for their LeChat (i.e. Mistral Medium). I guess NVidia got the memo after they wrote that blog post a few weeks ago.
Replies: >>105767069 >>105767094
Anonymous
7/1/2025, 5:55:27 PM No.105767063
mistral sucks
Anonymous
7/1/2025, 5:55:59 PM No.105767069
>>105766975
>Probably 500B+ parameters MoE.
I would prefer it to be Deepseek-sized but I'd take it.
Anonymous
7/1/2025, 5:57:16 PM No.105767085
>>105766900
maybe, just maybe, I'd want them to use a mixture of dirty and safe words
I guess at least it makes most fiction summaries and paragraphs easy to spot, it's always written in the most sappy and over the top way
Anonymous
7/1/2025, 5:57:52 PM No.105767094
>>105766975
>I guess NVidia got the memo after they wrote that blog post a few weeks ago.
What happened? I missed that.
Replies: >>105767167
Anonymous
7/1/2025, 6:01:00 PM No.105767127
small 3.2 doesn't suck, it's pretty good actually when used with V3 tekken.
cydonia v4g and v4d are based on 3.2 small and they're okay
Anonymous
7/1/2025, 6:03:50 PM No.105767167
>>105767094
https://developer.nvidia.com/blog/advancing-agentic-ai-with-nvidia-nemotron-open-reasoning-models/

>Advancing Agentic AI with NVIDIA Nemotron Open Reasoning Models
>
>Enterprises need advanced reasoning models with full control running on any platform to maximize their agents’ capabilities. To accelerate enterprise adoption of AI agents, NVIDIA is building the NVIDIA Nemotron family of open models. [...]
>New to the Nemotron family, the Mistral-Nemotron model is a significant advancement for enterprise agentic AI. Mistral-Nemotron is a turbo model, offering significant compute efficiency combined with high accuracy to meet the demanding needs of enterprise-ready AI agents. [...]
>Try the Mistral-Nemotron NIM directly from your browser. Stay tuned for a downloadable NIM coming soon.
Replies: >>105767181 >>105767426 >>105768143
Anonymous
7/1/2025, 6:04:27 PM No.105767172
>finetune using idiot praises mistral
yup mistral sucks
Replies: >>105767228 >>105767374
Anonymous
7/1/2025, 6:05:37 PM No.105767181
>>105767167
>coming soon
KEKs
>>105766877
Anonymous
7/1/2025, 6:09:43 PM No.105767228
>>105767172
I'm using vanilla small 3.2 and it's good
Anonymous
7/1/2025, 6:12:29 PM No.105767252
sars if you redeem the shared parameters you can get lighting speeds of your Llama-4 scout model
Anonymous
7/1/2025, 6:17:32 PM No.105767296
So ernie a meme or not?
Replies: >>105767416
Anonymous
7/1/2025, 6:23:50 PM No.105767374
>>105767172
mistral makes the only half decent open weight models that aren't 600B or censored to shit
Replies: >>105767444
Anonymous
7/1/2025, 6:26:41 PM No.105767416
>>105767296
I tried it on OR and it wasn't very impressive for RP, pretty sloppy and generic
Replies: >>105767503
Anonymous
7/1/2025, 6:27:23 PM No.105767426
>>105767167
Oh I see.
Anonymous
7/1/2025, 6:28:46 PM No.105767444
>>105767374
>mistral makes the only half decent open weight models that aren't 600B or censored to shit
Kind of sad it's true since I expected more european companies than just one to not be complete shit at this point.
Replies: >>105767946
Anonymous
7/1/2025, 6:30:01 PM No.105767456
For open-webui/web searches what'd be the best model if I only have 12GB VRAM? It couldn't still be nemo right?
Replies: >>105767477 >>105767626 >>105767683 >>105768255
Anonymous
7/1/2025, 6:31:47 PM No.105767477
>>105767456
Rocinante 12B
Replies: >>105767493 >>105767527
Anonymous
7/1/2025, 6:32:52 PM No.105767493
>>105767477
No way...
Anonymous
7/1/2025, 6:33:39 PM No.105767501
>>105765503
I wonder if it is the third bulletpoint that is actually more damaging. Maybe the model could even learn to call a cock a cock from context, but if everything has to stay not too far from the regular distribution of tokens, you will always get the assistant persona looking for a single objective truth, which kills the fun.
Anonymous
7/1/2025, 6:33:48 PM No.105767503
>>105767416
How censored was it?
Anonymous
7/1/2025, 6:35:03 PM No.105767518
Let's go mistral! Mistral sucks!
Anonymous
7/1/2025, 6:36:07 PM No.105767527
>>105767477
You said you would leave drummerfaggot.
Replies: >>105767537
Anonymous
7/1/2025, 6:37:06 PM No.105767537
>>105767527
No they didn't?
Replies: >>105767563
Anonymous
7/1/2025, 6:37:52 PM No.105767545
Are hunyuan ggoofs broken already?
Replies: >>105767699
Anonymous
7/1/2025, 6:39:18 PM No.105767563
>>105767537
>Drummerfaggot
>they
Tell me this thread doesn't have a troon infestation.... I am so tired of this newspeak.
Replies: >>105767584
Anonymous
7/1/2025, 6:40:49 PM No.105767574
It's always Nemo.
The one and only good VRAMlet model that's still going to be used years from now.
Replies: >>105767591 >>105768255 >>105768261
Anonymous
7/1/2025, 6:42:07 PM No.105767584
>>105767563
'they'?
Anonymous
7/1/2025, 6:42:36 PM No.105767591
>>105767574
But the context sucks and it always seems to drift toward the same personalities.
Anonymous
7/1/2025, 6:46:04 PM No.105767626
>>105767456
If not for creative shit, you can maybe try qwen 3 14b or gemma 3 12b
Anonymous
7/1/2025, 6:51:31 PM No.105767683
>>105767456
Your best bet is probably Gemma 3 or Qwen 3
Anonymous
7/1/2025, 6:52:32 PM No.105767699
>>105767545
The never have been function yet
Replies: >>105767839
Anonymous
7/1/2025, 7:06:27 PM No.105767839
>>105767699
Did you generate this post using a broken quant?
Replies: >>105767873 >>105768028
Anonymous
7/1/2025, 7:08:10 PM No.105767858
What's the difference between regular Wayfarer and Wayfarer Eris Noctis?
Replies: >>105767871
Anonymous
7/1/2025, 7:10:06 PM No.105767871
>>105767858
>Wayfarer Eris Noctis
is a merge of Wayfarer and another model called Eris Noctis, which itself is a merge, of multiple merges
Replies: >>105767892 >>105768241
Anonymous
7/1/2025, 7:10:28 PM No.105767873
>>105767839
Anons have said before they have bots shitposting here using LLMs.
**[spoiler]Probably[/spoiler]**
Anonymous
7/1/2025, 7:12:05 PM No.105767892
>>105767871
>let's just merge models until something happens
Replies: >>105767961 >>105768158
Anonymous
7/1/2025, 7:17:23 PM No.105767946
>>105767444
Mistral has already started to show signs of lobotomized datasets. The EU is oppressive with copyright shit because of privacy law, and it's surprising they haven't gotten totally slammed for it yet. Likely nobody else has the combination of connections and compute to get away with making a competitive product.
Replies: >>105768019 >>105768041
Anonymous
7/1/2025, 7:18:42 PM No.105767961
>>105767892
i want to merge my cock with your mouth
Anonymous
7/1/2025, 7:24:10 PM No.105768019
>>105767946
Fully transform copyrighted data into synthetic data changing style and form, train a model on that data.
Anonymous
7/1/2025, 7:24:48 PM No.105768028
>>105767839
What I actually wanted to say is
>>Are hunyuan ggoofs broken already?
>They never have been functioning yet
Anonymous
7/1/2025, 7:26:01 PM No.105768041
>>105767946
copyright and privacy laws have nothing to do with each other.
Anonymous
7/1/2025, 7:35:25 PM No.105768115
image
image
md5: dd3c863089476042ba47f816b4f7fbae🔍
If any of you faggot wants to try Hunyuan-A13B, here you go:
https://huggingface.co/FgRegistr/Hunyuan-A13B-Instruct-GGUF

Only works with https://github.com/ggml-org/llama.cpp/pull/14425 applied!

--flash-attn --cache-type-k q8_0 --cache-type-v q8_0 --temp 0.6 --presence-penalty 0.7 --min-p 0.1
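For reference, a full invocation might look something like this (built from that PR branch; the quant filename, -c and -ngl values are placeholders, adjust to your hardware):

[code]
# llama-server with the sampler settings above; offload/context to taste
./llama-server -m Hunyuan-A13B-Instruct-Q4_K_M.gguf -c 32768 -ngl 99 \
  --flash-attn --cache-type-k q8_0 --cache-type-v q8_0 \
  --temp 0.6 --presence-penalty 0.7 --min-p 0.1
[/code]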
Replies: >>105768164
Anonymous
7/1/2025, 7:39:29 PM No.105768143
1720162808171983
1720162808171983
md5: 5e3118f771ca2b8c929e00f60e5baef5🔍
>>105767167
How do you do, fellow /lmg/sters?
Anonymous
7/1/2025, 7:41:22 PM No.105768158
>>105767892
Well it did work one time
Anonymous
7/1/2025, 7:41:53 PM No.105768164
>>105768115
i already tried 2 quants yesterday. from your experience, if you use chat completion in ST and put in a long complex card, does it respond first and then think when <think> is prefilled?
Replies: >>105768455
Anonymous
7/1/2025, 7:43:10 PM No.105768177
Gemini-cli is pretty cool, I want a local gemini-cli.
Anonymous
7/1/2025, 7:49:57 PM No.105768241
>>105767871
I see.
...so which one's better for adventures with bad-ends?
Replies: >>105769821
Anonymous
7/1/2025, 7:51:54 PM No.105768255
>>105767456
Mistral Small 3.2 at iq4xs is the best model even on 8gb (3-4 t/s with all the optimizations) so for 12gb it's a nobrainer, alternatively some 32B model. if not v3.2, I had great experiences with 2501 version, the sex is great though the format can deteriorate.
>>105767574
Nemo is like CharacterAI from 4 years ago, gives you the ah ah mistress but dumb as fish.
Replies: >>105768569
Anonymous
7/1/2025, 7:53:31 PM No.105768261
>>105767574
Is it possible to improve nemo other than through sloptunes?
Replies: >>105768324
Anonymous
7/1/2025, 8:01:16 PM No.105768324
>>105768261
>sloptunes
presumably that means sloppy fine tuning?
Replies: >>105768383
Anonymous
7/1/2025, 8:05:29 PM No.105768364
Assuming this "Nemo still undefeated" is not just trolling, do you just pre-fill all the stuff it forgot in a 10k prompt?
Replies: >>105768435
Anonymous
7/1/2025, 8:06:53 PM No.105768383
>>105768324
The datasets they use are so poorly curated, they end up adding more slop and refusals than they remove.
Anonymous
7/1/2025, 8:12:05 PM No.105768435
>>105768364
>10k prompt
That long of a prefill makes it braindead for me.
Replies: >>105768862
Anonymous
7/1/2025, 8:14:33 PM No.105768455
>>105768164
I didn't use SillyTavern, but even with longer contexts I don't see the behavior you described. It sounds very much like you are using a wrong chat template.

The correct template is:
<|startoftext|>You are a helpful assistant.<|extra_4|>What is the capital of France?<|extra_0|><think>The user is asking for the capital of France. This is a factual question. I know this information.</think>The capital of France is Paris.<|eos|><|startoftext|>What about Chile?<|extra_0|>
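If you're driving it through text completion and want to sanity-check that your frontend produces exactly this, one quick way is to build the string yourself and compare. A rough sketch following the template above; the optional <think> prefill at the end is just an illustration, not an official knob:

[code]
# Minimal sketch: assemble a Hunyuan-A13B prompt string for text-completion
# mode, following the template quoted above. Token strings come from that
# template; everything else here is illustrative.
def build_prompt(system, turns, think_prefill=False):
    # turns: list of (user, assistant) pairs; the last assistant may be None
    out = "<|startoftext|>" + system + "<|extra_4|>"
    for i, (user, assistant) in enumerate(turns):
        if i > 0:
            out += "<|startoftext|>"
        out += user + "<|extra_0|>"
        if assistant is not None:
            out += assistant + "<|eos|>"
    if think_prefill and turns and turns[-1][1] is None:
        out += "<think>"  # force the reasoning block to open first
    return out

print(build_prompt(
    "You are a helpful assistant.",
    [("What is the capital of France?",
      "<think>...</think>The capital of France is Paris."),
     ("What about Chile?", None)],
    think_prefill=True))
[/code]

If what your frontend actually sends doesn't match a string built like this, that would explain the respond-first-then-think weirdness.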
Anonymous
7/1/2025, 8:24:45 PM No.105768569
>>105768255
I'm sleeping on 3.2 because exllamav2 no longer works well with AMD
Anonymous
7/1/2025, 8:28:51 PM No.105768619
9e268w
9e268w
md5: 313120717ed5bd1d9acbfecc167bbc3e🔍
>>105766029
believe in Sam
Replies: >>105768664 >>105768677 >>105768798 >>105769238
Anonymous
7/1/2025, 8:32:38 PM No.105768664
>>105768619
If maverick is on the moon, o3 is on pluto.
Anonymous
7/1/2025, 8:34:25 PM No.105768677
Screenshot 2025-07-01 192102
Screenshot 2025-07-01 192102
md5: 4e4365059375dbcb99def4b4b35b5258🔍
>>105768619
oai open model maybe in testing
>>105768556
Replies: >>105768685 >>105768693 >>105769053
Anonymous
7/1/2025, 8:35:39 PM No.105768685
>>105768677
>General-purpose model with built-in tool-calling support.
Isn't that basically all models now?
Anonymous
7/1/2025, 8:36:47 PM No.105768693
>>105768677
on the one hand the OAI open model is strongly rumored to be a reasoner and this is not
on the other hand it's exactly as dry and safe as I would expect an OAI open release to be
Replies: >>105768837
Anonymous
7/1/2025, 8:46:59 PM No.105768798
>>105768619
What a strange benchmark: o3 so far above everyone else (big doubt), gemini 2.5-pro and r1 almost identical (also doubt, even if both are good), qwen3 above old r1, old r1 and ds3 matched (they're not), qwq3 above sonnet-3.7, nope. only llama4 at the bottom eh? almost as if >>105756358 was right "I just realized that the only use for l4 is being in marketing brochures of all other companies."
Replies: >>105768934
Anonymous
7/1/2025, 8:50:13 PM No.105768837
>>105768693
Releasing a reasoner model would be a bad move because it raises hardware requirements. Running reasoner models on CPU fucking sucks
Replies: >>105768876
Anonymous
7/1/2025, 8:50:47 PM No.105768845
huawei pangu pro 72b-a16b
https://gitcode.com/ascend-tribe/pangu-pro-moe-model
https://huggingface.co/IntervitensInc/pangu-pro-moe-model
https://arxiv.org/pdf/2505.21411
Replies: >>105768877 >>105768884 >>105768906 >>105768998
Anonymous
7/1/2025, 8:52:24 PM No.105768862
>>105768435
What? I guess all this Nemo shilling is just a Nemo tune spamming /lmg/.
Replies: >>105768958
Anonymous
7/1/2025, 8:53:38 PM No.105768876
>>105768837
That's exactly why it would make sense for them to do it, though.
Anonymous
7/1/2025, 8:53:43 PM No.105768877
file
file
md5: 371e2e2026c6d8d9b607ca88c6c623c7🔍
>>105768845
nice, multilingual too
Replies: >>105768901
Anonymous
7/1/2025, 8:54:15 PM No.105768884
>>105768845
>72b-a16b
The only thing worse than fuck-huge MoE models is medium-sized MoEs. This could have just been a 24B
Anonymous
7/1/2025, 8:54:26 PM No.105768886
>>105764901
It's kind of funny that the only experience of erp/erotic fiction many people have is this purple style.
I wonder how many will think it's normal and will start writing like that themselves.
Anonymous
7/1/2025, 8:55:51 PM No.105768901
file
file
md5: 20b245a3199793293078fbadcde89246🔍
>>105768877
instruct results
Replies: >>105769352
Anonymous
7/1/2025, 8:56:14 PM No.105768906
>>105768845
Why the fuck is it just random people uploading it?
Replies: >>105768933 >>105769034
Anonymous
7/1/2025, 8:58:39 PM No.105768933
tribesascend-1741383118344
tribesascend-1741383118344
md5: a304955e383e6bbc687d1d0b19e56581🔍
>>105768906
>gitcode.com/ascend-tribe
fun game desu
Anonymous
7/1/2025, 8:58:44 PM No.105768934
>>105768798
Judging by the term "ELO" and the scores, it looks like a chess benchmark and OpenAI put extensive effort into training o1 for chess.
But I agree, this benchmark seems flawed, at least in parts.
But at this point that post is as good as any other graph or table everyone markets their models with.
Anonymous
7/1/2025, 9:01:01 PM No.105768958
>>105768862
Vanilla my boy.
Anonymous
7/1/2025, 9:02:47 PM No.105768978
>download llamafile
>download gguf and put it in the same dir
>shit just werks, auto-installs dependencies, detects and chooses what's best for the machine, no need to fuck around with cublas or HIP or whatever
What's the point of koboldcpp or oobabooga or anything again?
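For anyone curious, the whole flow is literally something like this (the flag set is inherited from llama.cpp, so adjust -ngl to your card; the exact binary filename depends on which release you grabbed):

[code]
chmod +x llamafile                         # the single downloaded binary
./llamafile -m model.Q4_K_M.gguf -ngl 999  # point it at your gguf, offload what fits
# serves the llama.cpp-style web UI / API on localhost (port 8080 by default)
[/code]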
Replies: >>105769004 >>105769026 >>105769091
Anonymous
7/1/2025, 9:04:32 PM No.105768998
>>105768845
My prediction is that out of hunminmaxierniepangu wave this one is the worst.
Anonymous
7/1/2025, 9:04:54 PM No.105769004
>>105768978
In what way does koboldcpp not work like that?
Replies: >>105769067
Anonymous
7/1/2025, 9:06:22 PM No.105769026
>>105768978
It is less dirty than llamafile or ollama.
Anonymous
7/1/2025, 9:07:13 PM No.105769034
>>105768906
Gitcode repo is the one linked in the paper.
The paper talks about using Ascend GPUs.
Ascend Tribe profile says:
>Ascend Tribe
>Huawei open-sources Pangu and Ascend-based model reasoning technology, opening a new chapter in the Ascend ecosystem
Anonymous
7/1/2025, 9:08:40 PM No.105769053
1746905862537741
1746905862537741
md5: dcde222134de826a090da85dcd55ee57🔍
>>105768677
RIP
>>105768995
Anonymous
7/1/2025, 9:09:56 PM No.105769067
>>105769004
The fact that the releases page has a bunch of different architecture-dependent binaries to download is already a big L. Wtf is oldpc.exe? And when you use it, you have to pick CBLAS or CuBLAS and batch size and other insignificant things yourself. Koboldcpp is for retards who don't know what a CLI is.
Replies: >>105769222 >>105769394 >>105769676
Anonymous
7/1/2025, 9:11:50 PM No.105769091
>>105768978
So what's this about? A separate executable for every model? That sounds dumb.
Anonymous
7/1/2025, 9:23:02 PM No.105769222
>>105769067
>I can't figure out what "OLD PC" means
>I need the program to decide the batch size and everything else for me, but it's everyone else that's retarded
>You can't run it on the command line because the github has prebuilt binaries for the disabled
Pretty weak bait desu senpai
Anonymous
7/1/2025, 9:24:17 PM No.105769238
>>105768619
Qwen3-32B is pretty high there.
Anonymous
7/1/2025, 9:34:55 PM No.105769352
>>105768901
Qwen3 is so stemmaxxed it's crazy
Replies: >>105769369 >>105769569
Anonymous
7/1/2025, 9:36:30 PM No.105769369
>>105769352
it just werks
Replies: >>105769504
Anonymous
7/1/2025, 9:39:09 PM No.105769394
1733290651687844
1733290651687844
md5: d2d71612156d97ac59f63033a0eec2b1🔍
>>105769067
bro, just download koboldcpp.exe and run it, it's not that complicated
Anonymous
7/1/2025, 9:51:07 PM No.105769504
>>105769369
I have never asked an LLM to solve math, nor a local model to write code
Anonymous
7/1/2025, 9:57:37 PM No.105769566
2025-06-29_00074_
2025-06-29_00074_
md5: bbc01593d2bf64bcd284b2eb3a5d3353🔍
Two questions.
Gf wants to try a local model. What would be the most user friendly open source ui? I dabbled with Jan, it looked simple enough. Opinions?
Also, she wants something that does speech-to-text. Is that a thing? Are there models that do that? On what kind of software?
Pic related, the controlnet is her.
Replies: >>105769582 >>105769647
Anonymous
7/1/2025, 9:57:51 PM No.105769569
>>105769352
I quite like Qwen, they always deliver solid local models.
They only seem to lack when it comes to erotic roleplaying, which I don't do that much anyway.
Anonymous
7/1/2025, 9:59:53 PM No.105769582
>>105769566
>Gf wants to try a local model. What would be the most user friendly open source ui? I dabbled with Jan, it looked simple enough. Opinions?
Jan is simple if all you want to do is upload documents and chat to a model. Lack of configuration options gets frustrating.

>Also, she wants something that does speech-to-text. Is that a thing? Are there models that do that?
whisper
>On what kind of software?
whisper.cpp
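For the long-audio-file-to-text case specifically, rough usage looks like this (binary and model names depend on your build/download; older builds call the binary ./main):

[code]
# transcribe a recording and write recording.txt / recording.srt next to it
./whisper-cli -m models/ggml-large-v3.bin -f recording.wav -l auto -otxt -osrt -of recording
[/code]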
Replies: >>105769629 >>105769735
Anonymous
7/1/2025, 10:04:57 PM No.105769629
>>105769582
thanks m8
whisper only does "live" STT during chats, no? She would like to process a long audio file and get a transcript at the end.
Replies: >>105769673
Anonymous
7/1/2025, 10:06:26 PM No.105769647
>>105769566
>Gf wants to try a local model. What would be the most user friendly open source ui? I dabbled with Jan, it looked simple enough. Opinions?
lmstudio or llamacpp + sillytavern
Anonymous
7/1/2025, 10:08:57 PM No.105769673
>>105769629
No. I regularly provide whisper with entire movies to get it to generate subtitles for me.
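In case it helps: whisper.cpp has traditionally wanted 16 kHz mono WAV input, so the usual pipeline is to rip the audio with ffmpeg first, then let whisper write the .srt (paths and model file are placeholders):

[code]
ffmpeg -i movie.mkv -vn -ac 1 -ar 16000 audio.wav                            # extract 16 kHz mono audio
./whisper-cli -m models/ggml-large-v2.bin -f audio.wav -l en -osrt -of movie  # writes movie.srt
[/code]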
Anonymous
7/1/2025, 10:09:04 PM No.105769676
>>105769067
Skill issue
Anonymous
7/1/2025, 10:14:18 PM No.105769735
>>105769582
NTA, but can whisper.cpp even output an srt file with silence/music cut out? Cause I'm just getting a continuous file that's like "00:00:00 -> 00:01.36: Yeah." Yes, I'm using VAD.
Replies: >>105769806
Anonymous
7/1/2025, 10:16:15 PM No.105769752
to the qwen users here:
https://huggingface.co/dnotitia
if you ever had the chinese character bias pop up while using the model, this fixes it. even if you use qwen as a chinese to english translator (having any chinese character in the context can madly trigger this issue), you will never see it gen a hanzi again
smoothie qwen is da bomb
Replies: >>105769807
Anonymous
7/1/2025, 10:20:22 PM No.105769806
>>105769735
If you're using whisper v3, try going back to v2. It was much better at handling silences. No amount of processing or workarounds have made v3/turbo as good as v2 in that regard for me.
Replies: >>105769849
Anonymous
7/1/2025, 10:20:27 PM No.105769807
>>105769752
Is this a bona fide shill? Like you can just grammar that shit out.
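(For the unaware, "grammar that shit out" would mean something like constraining generation with a llama.cpp GBNF grammar that only allows ASCII; a rough, untested sketch:)

[code]
# ascii.gbnf -- only allow tab/newline/CR plus printable ASCII
root ::= [\x09\x0A\x0D\x20-\x7E]*
[/code]

passed with --grammar-file ascii.gbnf. The catch is that constrained sampling does add some per-token overhead.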
Replies: >>105769839
Anonymous
7/1/2025, 10:21:28 PM No.105769821
>>105768241
Wayfarer Eris Noctis is supposed to have more narrative styles and better lore adherence; it also claims to have a 1 million token context window.

For what it's worth, it certainly seems to be better for porn than base wayfarer. The whole LE BAD THING HAPPENS gimmick gets old fast tho
Replies: >>105769847
Anonymous
7/1/2025, 10:22:38 PM No.105769839
>>105769807
>Like you can just grammar that shit out.
you are a shill for wanting a model that doesn't require slowing down token generation111!!!!!1!!!1!!1!!1!1!
the ultimate state of 4chan
Anonymous
7/1/2025, 10:23:07 PM No.105769847
>>105769821
>it also claims to have a 1 million token context window.
so does standard nemo's config file
Anonymous
7/1/2025, 10:23:10 PM No.105769849
tintin
tintin
md5: f234474aa2874dc94b465ec3282c5460🔍
>>105769806
No, I mean like whisper.cpp literally gives a continuous file. See how there are no gaps. WhisperX can do it properly, but I'd like whisper.cpp to work since it can do Vulkan (I got an AMD card).
Replies: >>105769933 >>105769946
Anonymous
7/1/2025, 10:23:50 PM No.105769852
>>105769835
>>105769835
>>105769835
Anonymous
7/1/2025, 10:31:40 PM No.105769933
>>105769849
whisperX uses wav2vec2 on top of whisper to align the audio to timestamps, plain whisper timestamps are garbage.
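Rough idea of the invocation, flags from memory so double-check against the whisperX README:

[code]
# transcribe + wav2vec2-align, emit an .srt with tighter timestamps
whisperx audio.wav --model large-v2 --language en --output_format srt --compute_type int8
[/code]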
Anonymous
7/1/2025, 10:33:17 PM No.105769946
>>105769849
Hey, what model are you using? Does ggml-large-v3-turbo-q8_0 do french well?