/lmg/ - Local Models General - /g/ (#105716837) [Archived: 698 hours ago]

Anonymous
6/27/2025, 2:40:24 AM No.105716837
onimai
/lmg/ - a general dedicated to the discussion and development of local language models.

Previous threads: >>105712100 & >>105704582

►News
>(06/26) Gemma 3n released: https://developers.googleblog.com/en/introducing-gemma-3n-developer-guide
>(06/21) LongWriter-Zero, RL trained ultra-long text generation: https://hf.co/THU-KEG/LongWriter-Zero-32B
>(06/20) Magenta RealTime open music generation model released: https://hf.co/google/magenta-realtime
>(06/20) Mistral-Small-3.2 released: https://hf.co/mistralai/Mistral-Small-3.2-24B-Instruct-2506
>(06/19) Kyutai streaming speech-to-text released: https://kyutai.org/next/stt

►News Archive: https://rentry.org/lmg-news-archive
►Glossary: https://rentry.org/lmg-glossary
►Links: https://rentry.org/LocalModelsLinks
►Official /lmg/ card: https://files.catbox.moe/cbclyf.png

►Getting Started
https://rentry.org/lmg-lazy-getting-started-guide
https://rentry.org/lmg-build-guides
https://rentry.org/IsolatedLinuxWebService
https://rentry.org/tldrhowtoquant
https://rentry.org/samplers

►Further Learning
https://rentry.org/machine-learning-roadmap
https://rentry.org/llm-training
https://rentry.org/LocalModelsPapers

►Benchmarks
LiveBench: https://livebench.ai
Programming: https://livecodebench.github.io/leaderboard.html
Code Editing: https://aider.chat/docs/leaderboards
Context Length: https://github.com/adobe-research/NoLiMa
Censorbench: https://codeberg.org/jts2323/censorbench
GPUs: https://github.com/XiongjieDai/GPU-Benchmarks-on-LLM-Inference

►Tools
Alpha Calculator: https://desmos.com/calculator/ffngla98yc
GGUF VRAM Calculator: https://hf.co/spaces/NyxKrage/LLM-Model-VRAM-Calculator
Sampler Visualizer: https://artefact2.github.io/llm-sampling

►Text Gen. UI, Inference Engines
https://github.com/lmg-anon/mikupad
https://github.com/oobabooga/text-generation-webui
https://github.com/LostRuins/koboldcpp
https://github.com/ggerganov/llama.cpp
https://github.com/theroyallab/tabbyAPI
https://github.com/vllm-project/vllm
Replies: >>105717058 >>105717659 >>105720124 >>105720428 >>105724935
Anonymous
6/27/2025, 2:40:47 AM No.105716840
OIG3
►Recent Highlights from the Previous Thread: >>105712100

--Gemma 3n released with memory-efficient architecture for mobile deployment:
>105712608 >105712664 >105714327
--FLUX.1-Kontext-dev release sparks interest in uncensored image generation and workflow compatibility:
>105713343 >105713400 >105713434 >105713447 >105713482
--Budget AI server options amid legacy Nvidia GPU deprecation concerns:
>105713717 >105713792 >105714105
--Silly Tavern image input issues with ooga webui and llama.cpp backend limitations:
>105714617 >105714660 >105714754 >105714760 >105714771 >105714801 >105714822 >105714847 >105714887 >105714912 >105714993 >105714996 >105715066 >105715075 >105715123 >105715167 >105715176 >105715241 >105715245 >105715314 >105715186 >105715129 >105715136 >105715011 >105715107
--Debugging token probability and banning issues in llama.cpp with Mistral-based models:
>105715880 >105715892 >105715922 >105715987 >105716007 >105716013 >105716069 >105716103 >105716158 >105716205 >105716210 >105716230 >105716252 >105716264
--Running DeepSeek MoE models with high memory demands on limited VRAM setups:
>105712953 >105713076 >105713169 >105713227 >105713697
--DeepSeek R2 launch delayed amid performance concerns and GPU supply issues:
>105713094 >105713111 >105713133 >105713142 >105713547 >105713571
--Choosing the best template for Mistral 3.2 model based on functionality and user experience:
>105714405 >105714430 >105714467 >105714579 >105714500
--Gemma 2B balances instruction following and multilingual performance with practical local deployment:
>105712324 >105712341 >105712363 >105712367
--Meta poaches OpenAI researcher Trapit Bansal for AI superintelligence team:
>105713802
--Google releases Gemma 3n multimodal AI model for edge devices:
>105714527
--Miku (free space):
>105712953 >105715094 >105715245 >105715797 >105715815

►Recent Highlight Posts from the Previous Thread: >>105712104

Why?: 9 reply limit >>102478518
Fix: https://rentry.org/lmg-recap-script
Replies: >>105716887 >>105716902
Anonymous
6/27/2025, 2:42:31 AM No.105716851
>troonimai
Anonymous
6/27/2025, 2:43:27 AM No.105716855
OP here. One day i will tap that jart bussy.
Anonymous
6/27/2025, 2:44:28 AM No.105716861
>deepseek/ccp can't steal more innovation from openai
>they fail to release new models
they must be shitting their pants about openai's open source model that will destroy even the last argument to use deepshit
Replies: >>105716870 >>105716897 >>105716945 >>105725110
Anonymous
6/27/2025, 2:45:38 AM No.105716870
>>105716861
Zero chance it's larger than 30B.
Anonymous
6/27/2025, 2:46:21 AM No.105716877
rock
OP, im actually disappointed
unironically ack yourself
Anonymous
6/27/2025, 2:48:29 AM No.105716887
>>105716840
>19x (You) in recap
Damn that's a new record.
Anonymous
6/27/2025, 2:50:36 AM No.105716897
>>105716861
Why can't they steal anymore?
Replies: >>105716903 >>105716946
Anonymous
6/27/2025, 2:51:03 AM No.105716902
>>105716840
on a break for a week soon, 95% chance of no migus
Anonymous
6/27/2025, 2:51:16 AM No.105716903
>>105716897
>>105713525
Anonymous
6/27/2025, 2:57:12 AM No.105716945
>>105716861
>Still living in saltman's delusion
Ngmi
Anonymous
6/27/2025, 2:57:15 AM No.105716946
>>105716897
they can't steal because there's no new general model
DeepSeek V3 was 100% trained on GPT4 and R1 was just a godawful placebo CoT on top that wrote 30 times the amount of actual content the model ends up outputting. New R1 is actually good because the CoT came from Gemini so there isn't a spam of a trillion wait or endless looping.
Replies: >>105722087
Anonymous
6/27/2025, 2:58:51 AM No.105716959
The OP mikutranny is posting porn in /ldg/:
>>105715769
It was up for hours while anyone keking up on troons or niggers gets deleted in seconds, talk about double standards and selective moderation:
https://desuarchive.org/g/thread/104414999/#q104418525
https://desuarchive.org/g/thread/104414999/#q104418574
>>105714098

Funny /r9k/ thread: https://desuarchive.org/r9k/thread/81611346/
The Makise Kurisu screencap one (day earlier) is fake btw, no matches to be found, see https://desuarchive.org/g/thread/105698912/#q105704210 janny deleted post quickly.
TLDR: Mikufag janny deletes everyone dunking on trannies and resident spammers, making it his little personal safespace. Needless to say he would screech "Go back to POL!" anytime anyone posts something mildly political about language models or experiments around that topic.

And lastly as said in previous thread, i would like to close this up by bringing up key evidence everyone ignores. I remind you that cudadev has endorsed mikuposting. That is it.
He also endorsed hitting that feminine jart bussy a bit later on.
Replies: >>105717162 >>105717188 >>105717498 >>105721820
Anonymous
6/27/2025, 3:01:27 AM No.105716973
file
how can a tranny disgusting ass be feminine? i think youre gay
Replies: >>105716992
Anonymous
6/27/2025, 3:02:25 AM No.105716978
file
More discussion about bitch wrangling Mistral Small 3.2 please, just to cover all bases before it's scrapped.
I've tested temps at 0.15, 0.3, 0.6, and 0.8.
Tested Rep pen at 1 (off) and at 1.03. Rep pen doesn't seem to be much needed just like with Rocinante.
Responses are still shit no matter what, but they seem more intelligible at lower temperatures, particularly 0.15 and 0.3; however, they are still often full of shit that makes you swipe anyway.
I've yet to try without min_p, XTC, and DRY.
Also, it seems ideal to limit response tokens with this model, because this thing likes to vary length by a lot; if you let it, responses just keep growing larger and larger.

Banned tokens grew a bit and still aren't done:
>emdash
[1674,2251,2355,18219,20202,21559,23593,24246,28925,29450,30581,31148,36875,39443,41370,42545,43485,45965,46255,48371,50087,54386,58955,59642,61474,62708,66395,66912,69961,74232,75334,81127,86932,87458,88449,88784,89596,92192,92548,93263,102521,103248,103699,105537,105838,106416,106650,107827,114739,125665,126144,131676,132461,136837,136983,137248,137593,137689,140350]
>double asterisks (bold)
[1438,55387,58987,117565,74562,42605]
>three dashes (---) and non standard quotes (“ ”)
[8129,1482,1414]

Extra stop strings needed:
"[Pause", "[PAUSE", "(Pause", "(PAUSE"
Why the fuck does it sometimes like to end a response with "Paused while waiting for {{user}}'s response."?
This model is so fucking inconsistent.
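If you're on llama.cpp's server instead of kobold, the same bans can be applied per-request through logit_bias; a minimal sketch, assuming a llama-server instance on the default port (token IDs are tokenizer-specific, so the lists above only apply to Mistral Small 3.2):

import requests

# First few IDs from the em-dash list above; extend with the full lists.
banned = [1674, 2251, 2355, 18219, 20202]

payload = {
    "prompt": "...",
    "n_predict": 300,
    # [token_id, false] tells llama-server to never emit that token.
    "logit_bias": [[t, False] for t in banned],
    # The extra stop strings from above.
    "stop": ["[Pause", "[PAUSE", "(Pause", "(PAUSE"],
}
r = requests.post("http://127.0.0.1:8080/completion", json=payload)
print(r.json()["content"])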
Replies: >>105716993 >>105719292 >>105725809
Anonymous
6/27/2025, 3:04:13 AM No.105716992
1659502914938293_thumb.jpg
>>105716973
You just can't find a nice strawman to pick on here.
Anonymous
6/27/2025, 3:04:15 AM No.105716993
>>105716978
Just use R1.
Replies: >>105717009
Anonymous
6/27/2025, 3:05:45 AM No.105717007
>deepseek/ccp can't steal more innovation from openai
>they fail to release new models
they must be shitting their pants about openai's open source model that will destroy even the last argument to use deepshit
Replies: >>105717018 >>105717025 >>105717044 >>105717045 >>105724893
Anonymous
6/27/2025, 3:05:50 AM No.105717009
>>105716993
I can't fit it on my local machine and I'm not paying for any API.
I'm not building a $3000+ server just for R1 either.
Replies: >>105717020
Anonymous
6/27/2025, 3:06:29 AM No.105717018
>>105717007
Zero chance it's larger than 30B.
Anonymous
6/27/2025, 3:06:33 AM No.105717020
>>105717009
>I'm not building a $3000+ server
That's not very local of you.
Replies: >>105717035
Anonymous
6/27/2025, 3:07:09 AM No.105717025
>>105717007
Why can't they steal anymore?
Replies: >>105717031 >>105717040
Anonymous
6/27/2025, 3:07:44 AM No.105717031
>>105717025
>>105713525
Anonymous
6/27/2025, 3:08:21 AM No.105717035
>>105717020
I have never had a chatGPT, claude or any other AI account. I have never paid for any API. I exclusively use local models only. My only interaction ever with chatGPT was through duckduckgo's free chat thingy.
I'm as fucking local as it gets.
Anonymous
6/27/2025, 3:08:29 AM No.105717040
>>105717025
they can't steal because there's no new general model
DeepSeek V3 was 100% trained on GPT4 and R1 was just a godawful placebo CoT on top that wrote 30 times the amount of actual content the model ends up outputting. New R1 is actually good because the CoT came from Gemini so there isn't a spam of a trillion wait or endless looping.
Anonymous
6/27/2025, 3:09:23 AM No.105717044
>>105717007
>Still living in saltman's delusion
Ngmi
Anonymous
6/27/2025, 3:09:56 AM No.105717045
>>105717007
tell us more about the unreleased model, sam
Anonymous
6/27/2025, 3:10:28 AM No.105717052
file
More discussion about bitch wrangling Mistral Small 3.2 please, just to cover all bases before it's scrapped.
I've tested temps at 0.15, 0.3, 0.6, and 0.8.
Tested Rep pen at 1 (off) and at 1.03. Rep pen doesn't seem to be much needed just like with Rocinante.
Responses are still shit no matter what, but they seem more intelligible at lower temperatures, particularly 0.15 and 0.3; however, they are still often full of shit that makes you swipe anyway.
I've yet to try without min_p, XTC, and DRY.
Also, it seems ideal to limit response tokens with this model, because this thing likes to vary length by a lot; if you let it, responses just keep growing larger and larger.

Banned tokens grew a bit and still aren't done:
>emdash
[1674,2251,2355,18219,20202,21559,23593,24246,28925,29450,30581,31148,36875,39443,41370,42545,43485,45965,46255,48371,50087,54386,58955,59642,61474,62708,66395,66912,69961,74232,75334,81127,86932,87458,88449,88784,89596,92192,92548,93263,102521,103248,103699,105537,105838,106416,106650,107827,114739,125665,126144,131676,132461,136837,136983,137248,137593,137689,140350]
>double asterisks (bold)
[1438,55387,58987,117565,74562,42605]
>three dashes (---) and non standard quotes (“ ”)
[8129,1482,1414]

Extra stop strings needed:
"[Pause", "[PAUSE", "(Pause", "(PAUSE"
Why the fuck does it sometimes like to end a response with "Paused while waiting for {{user}}'s response."?
This model is so fucking inconsistent.
Replies: >>105717096 >>105717121
Anonymous
6/27/2025, 3:11:24 AM No.105717056
the copy bot is back
Anonymous
6/27/2025, 3:11:46 AM No.105717058
>>105716837 (OP)
Based.
Replies: >>105717074 >>105717147
Anonymous
6/27/2025, 3:13:49 AM No.105717074
1582961264881
>>105717058
(you) will never be based though
Replies: >>105717147
Anonymous
6/27/2025, 3:18:36 AM No.105717096
>>105717052
It's funny how 3.2 started showing all the same annoying shit that Deepseek models are tainted by.
Replies: >>105717124 >>105717139 >>105717206
Anonymous
6/27/2025, 3:22:27 AM No.105717121
>>105717052
What exactly are you complaining about? I like 3.2 (with mistral tekken v3) but it definitely has a bias toward certain formatting quirks and **asterisk** abuse. This is more tolerable for me than other models' deficiencies at that size, but if it triggers your autism that badly you're better off coping with something else. It might also be that your cards are triggering its quirks more than usual.
Replies: >>105717124
Anonymous
6/27/2025, 3:22:56 AM No.105717124
>>105717096
>>105717121
you are responding to a copy bot instead of the original message
Anonymous
6/27/2025, 3:25:32 AM No.105717139
>>105717096
s-surely just a coincidence
Anonymous
6/27/2025, 3:26:27 AM No.105717147
1582961161881
>>105717058
>>105717074
Its not gore you pathetic thing and it will stay for you to see, as long as i want it to.
Anonymous
6/27/2025, 3:30:06 AM No.105717162
>>105716959
Get a job, schizo
Replies: >>105717185
Anonymous
6/27/2025, 3:36:16 AM No.105717185
>>105717162
Get a job, tranime spammer.
Anonymous
6/27/2025, 3:36:46 AM No.105717188
>>105716959
kill yourself
Anonymous
6/27/2025, 3:36:49 AM No.105717189
>look mom, I posted it again!
Replies: >>105717258
Anonymous
6/27/2025, 3:39:09 AM No.105717206
>>105717096
its biggest flaw, like ALL mistral models, is that it rambles and hardly moves scenes forward. it wants to talk about the smell of ozone and the clicking of shoes against the floor instead. you can get through the same exact scenario in half the time/messages with llama 2 or 3 because there is so much less pointless fluff
Anonymous
6/27/2025, 3:41:31 AM No.105717234
Yeah I have concluded Mistral Small 3.2 is utterly retarded. Going back to Rocinante now.
This was a waste of time. The guy that recommended this shit should be shot.
Anonymous
6/27/2025, 3:44:26 AM No.105717258
1582161164881
>>105717189
And i will post it again while you spam 2007-era reddit memes and nervously click on that "Report" button like a pussy.
Anonymous
6/27/2025, 3:57:18 AM No.105717340
fucking year old model remains best at roleplay
grim
Replies: >>105717342 >>105717344
Anonymous
6/27/2025, 3:57:53 AM No.105717342
>>105717340
in the poorfag segment
Replies: >>105717354
Anonymous
6/27/2025, 3:58:12 AM No.105717344
>>105717340
midnight miqu is still the best for rp
Replies: >>105717354
Anonymous
6/27/2025, 3:59:45 AM No.105717354
>>105717342
delusional if you think r1 is better for roleplay, it has the same problems as the rest of these models
not to mention those response times are useless for roleplay to begin with

>>105717344
this isnt 2023
Replies: >>105717404
Anonymous
6/27/2025, 4:01:41 AM No.105717370
>>105716573
>>105716591
>>105716638
imagine being ggerganov and still visiting this place
Replies: >>105717394
Anonymous
6/27/2025, 4:04:14 AM No.105717394
>>105717370
Who cares what random e-celeb may think?
Anonymous
6/27/2025, 4:04:46 AM No.105717399
Good model for 24 gb VRAM?
Anonymous
6/27/2025, 4:06:16 AM No.105717404
I'm noticing qwen 235b doesn't improve at higher temps no matter what I set nsigma to. with some models high temp and nsigma can push them to be more creative, but qwen3 set to higher than temp 0.6 is just dumber in my usage. even so, I still think it's the best current local model beneath r1
>>105717354
>roleplay
Filth. Swine, even. Unfit to lick the sweat off my balls.
Replies: >>105717425 >>105717445
Anonymous
6/27/2025, 4:10:09 AM No.105717425
>>105717404
Try setting minP to like 0.05, top-K 10-20 and temperature at 1-4. In RP I find that most of the top tokens as long as they're not very low probability are all good continuations. You can crank temperature way up like this and it really helps with variety.
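As a reference point, a minimal sketch of that sampler stack as a raw llama.cpp /completion request (endpoint and port are assumptions; parameter names per llama-server's API):

import requests

payload = {
    "prompt": "...",
    "min_p": 0.05,       # drop tokens under 5% of the top token's probability
    "top_k": 20,
    "temperature": 2.0,  # anywhere in the suggested 1-4 range
    "n_predict": 300,
}
r = requests.post("http://127.0.0.1:8080/completion", json=payload)
print(r.json()["content"])

llama.cpp's default sampler order applies top_k and min_p before temperature, which is what makes the high temperature safe here: only the surviving tokens get flattened.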
Anonymous
6/27/2025, 4:14:42 AM No.105717445
>>105717404
Optimal character/lore data formatting for Rocinante?
Lately I've been reformatting everything like this;
identifier: [
key: "value"
]

# Examples
SYSTEM INSTRUCTIONS: [
MODE: "bla bla"
IDENTITY: "You are {{char}}."
]

WORLD: [
SETTING: "blah blah"
STORY: "etc"
]

{{char}}: [
Name: "full character name"
]

It seems to help a little with preventing it from confusing and mixing up data when everything is formatted in this way. It just generally feels like it's understanding things better.
Anyone else got similar experiences with it?
Anonymous
6/27/2025, 4:22:57 AM No.105717498
>>105716959
This
Anonymous
6/27/2025, 4:28:46 AM No.105717547
Base Image
DiLoCoX: A Low-Communication Large-Scale Training Framework for Decentralized Cluster
https://arxiv.org/abs/2506.21263
>The distributed training of foundation models, particularly large language models (LLMs), demands a high level of communication. Consequently, it is highly dependent on a centralized cluster with fast and reliable interconnects. Can we conduct training on slow networks and thereby unleash the power of decentralized clusters when dealing with models exceeding 100 billion parameters? In this paper, we propose DiLoCoX, a low-communication large-scale decentralized cluster training framework. It combines Pipeline Parallelism with Dual Optimizer Policy, One-Step-Delay Overlap of Communication and Local Training, and an Adaptive Gradient Compression Scheme. This combination significantly improves the scale of parameters and the speed of model pre-training. We justify the benefits of one-step-delay overlap of communication and local training, as well as the adaptive gradient compression scheme, through a theoretical analysis of convergence. Empirically, we demonstrate that DiLoCoX is capable of pre-training a 107B foundation model over a 1Gbps network. Compared to vanilla AllReduce, DiLoCoX can achieve a 357x speedup in distributed training while maintaining negligible degradation in model convergence. To the best of our knowledge, this is the first decentralized training framework successfully applied to models with over 100 billion parameters.
China Mobile doesn't seem to have a presence on GitHub, and there's no mention of a code release in the paper. Still pretty neat.
Replies: >>105722106
Anonymous
6/27/2025, 4:29:55 AM No.105717549
1745010592780
The more I try to train and fuck with these models, the more I think the AI CEOs should be hanged for telling everyone they could be sentient in 2 weeks. Every time I think I'm getting somewhere it botches something very simple. I guess it was a fool's errand thinking I could hyper-specialize a small model to do things Claude can't
Anonymous
6/27/2025, 4:35:19 AM No.105717571
Please think of 6GB users like me ;_;
Replies: >>105717587 >>105718268
Anonymous
6/27/2025, 4:37:21 AM No.105717587
>>105717571
Do all 6GB users use cute emoticons like you?
Anonymous
6/27/2025, 4:46:06 AM No.105717659
>>105716837 (OP)

Newfag here.

Is generation performance of a 16 GB 5060 Ti the same as a 16 GB 5070 Ti?
Replies: >>105717686 >>105718579 >>105718670 >>105718680 >>105718726
Anonymous
6/27/2025, 4:50:09 AM No.105717686
>>105717659
>Bandwidth: 448.0 GB/s
vs
>Bandwidth: 896.0 GB/s
Replies: >>105717696
Anonymous
6/27/2025, 4:51:10 AM No.105717696
>>105717686
I thought only VRAM size matters?
Replies: >>105717723 >>105717742 >>105717746 >>105717749
Anonymous
6/27/2025, 4:55:19 AM No.105717723
>>105717696
vram is king but not all vram is worth the same
Anonymous
6/27/2025, 4:57:05 AM No.105717742
>>105717696
Generation performance? I assume you're talking about inference? Prompt processing requires processing power, and the 5070 ti is a lot stronger in that aspect. Token generation requires memory bandwidth. This is why offloading layers to your cpu/ram will slow down generation - most users' ram bandwidth is vastly slower than their vram bandwidth.

Vram size dictates the parameters, quantization, and context size of the models that you're able to load into the gpu.
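Back-of-envelope, since every generated token has to stream the active weights once (the model size here is an assumed example):

# tokens/s ceiling ~= memory bandwidth / bytes read per token
# (= model size for a dense model; ignores KV cache traffic and compute)
model_gb = 9.0  # e.g. a ~13B dense model at Q5_K_M (assumed)
for name, bw_gbs in [("5060 Ti", 448.0), ("5070 Ti", 896.0)]:
    print(f"{name}: ~{bw_gbs / model_gb:.0f} tok/s upper bound")
# 5060 Ti: ~50 tok/s upper bound
# 5070 Ti: ~100 tok/s upper bound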
Anonymous
6/27/2025, 4:57:40 AM No.105717746
>>105717696
vram size limits what models you can fit in the gpu
vram bandwidth dictates how fast those models will tend to go. there are other factors but who cares, actually
Anonymous
6/27/2025, 4:58:33 AM No.105717749
>>105717696
vram matters most but if they're the same size, the faster card is still faster. it won't make a huge difference for any ai models you'll fit into 16gb though. the 4060 Ti 16GB is considered a pretty bad gaming card but does fine for ai
Anonymous
6/27/2025, 5:03:10 AM No.105717778
new dataset just dropped
>>>/a/280016848
Replies: >>105717822 >>105717836 >>105717903 >>105718090
Anonymous
6/27/2025, 5:08:10 AM No.105717822
>>105717778
>jap slop
Anonymous
6/27/2025, 5:09:53 AM No.105717836
>>105717778
sorry but japanese is NOT safe, how about some esperanto support?
Anonymous
6/27/2025, 5:19:46 AM No.105717903
>>105717778
I would be interested if I knew how to clean data. Raw data would destroy a model, especially badly written jap slop.
Replies: >>105717939 >>105717976 >>105718279 >>105725918
Anonymous
6/27/2025, 5:26:45 AM No.105717939
Untitled
>>105717903
Hmmm...
Replies: >>105718041
Anonymous
6/27/2025, 5:32:10 AM No.105717976
>>105717903
those lns are literary masterpieces compared to the shit the average model is trained on
Replies: >>105718005
Anonymous
6/27/2025, 5:34:44 AM No.105718005
>>105717976
Garbage in garbage out i guess.
Anonymous
6/27/2025, 5:39:04 AM No.105718041
>>105717939
I will save this image but I don't think I will get far.
I was thinking of finetuning a jp translator model but I always leave my projects half-started.
Anonymous
6/27/2025, 5:45:39 AM No.105718090
>>105717778
That shit is as bad as, if not worse than, our shitty English novels about dark brooding men.
Replies: >>105718120
Anonymous
6/27/2025, 5:49:47 AM No.105718120
>>105718090
Worse because novels are more popular with japanese middle schoolers and in america reading is gay.
Replies: >>105718163
Anonymous
6/27/2025, 5:56:08 AM No.105718163
1737192963608259
>>105718120
reading is white-coded
Anonymous
6/27/2025, 6:03:12 AM No.105718226
Good model that fits into my second card with 6gb vram?
Purpose: looking at a chunk of text mixed with code and extracting relevant function names.
Replies: >>105718268
Anonymous
6/27/2025, 6:08:25 AM No.105718268
>>105717571
>>105718226
Please use cute emoticons.
Anonymous
6/27/2025, 6:10:42 AM No.105718279
>>105717903
>Raw data would destroy a model
So true sister, that's why you need to only fit against safe synthetic datasets. Human-made (also called "raw") data teaches dangerous concepts and reduces performance on important math and code benchmarks.
Replies: >>105718325
Anonymous
6/27/2025, 6:11:49 AM No.105718288
MrBeast
MrBeast DELETES his AI thumbnail tool, replaces it with a website to commission real artists. <3 <3
https://x.com/DramaAlert/status/1938422713799823823
Replies: >>105718777
Anonymous
6/27/2025, 6:15:37 AM No.105718325
>>105718279
I'm pretty sure he means raw in the sense of unformatted.
Anonymous
6/27/2025, 6:42:35 AM No.105718511
dots finally supported in lm studio.

its pretty good.
Replies: >>105718525
Anonymous
6/27/2025, 6:45:24 AM No.105718525
>>105718511
>moe
bruh
Replies: >>105718576 >>105718822
Anonymous
6/27/2025, 6:47:08 AM No.105718538
1750988735115396
Is there a local setup I can use for OCR that isn't too hard to wire into a python script/dev environment? Pytesseract is garbage and gemini reads my 'problem' images just fine, but I'd rather have a local solution than pay for API calls.
Replies: >>105718555
Anonymous
6/27/2025, 6:49:42 AM No.105718555
>>105718538
https://github.com/RapidAI/RapidOCR
https://github.com/PaddlePaddle/PaddleOCR
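Since you want it in a python script, a minimal sketch with RapidOCR's onnxruntime package (API per its README; verify against whatever version you install):

# pip install rapidocr_onnxruntime
from rapidocr_onnxruntime import RapidOCR

engine = RapidOCR()
result, elapse = engine("problem_image.png")  # (detections, timing info)
if result:
    for box, text, score in result:  # box corners, recognized text, confidence
        print(f"{score:.2f}\t{text}")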
Replies: >>105718560
Anonymous
6/27/2025, 6:50:36 AM No.105718560
>>105718555
Based, ty
Anonymous
6/27/2025, 6:52:41 AM No.105718576
>>105718525
get used to all new releases being MoE models :)
Anonymous
6/27/2025, 6:53:08 AM No.105718579
>>105717659
yes. Just a little slower
Anonymous
6/27/2025, 7:06:58 AM No.105718670
>>105717659
no. It is slower
Anonymous
6/27/2025, 7:08:16 AM No.105718680
>>105717659
It's technically slower but the difference will be immaterial because the models you can fit in that much vram are small and fast.
Anonymous
6/27/2025, 7:16:00 AM No.105718726
>>105717659
It's actually pretty noticeable if you aren't a readlet and are reading the output as it goes. Unless you're in the top 1% of the population, you probably won't be able to keep up with a 5070 ti's output speed, but a 5060 ti should be possible if you're skimming.
Anonymous
6/27/2025, 7:24:14 AM No.105718777
>>105718288
It's on him for not doing proper market research. Anyone with a brain could have told him that it was a risky move.
Anonymous
6/27/2025, 7:31:17 AM No.105718822
>>105718525
MoE is the best until the big boys admit what they're all running under the hood now (something like MoE but with far more cross-talk between the Es)
Anonymous
6/27/2025, 8:34:14 AM No.105719292
>>105716978
I think either you're not writing here in good faith or your cards/instructions or even model settings are full of shit.
Anonymous
6/27/2025, 9:18:57 AM No.105719546
1745058431632661
>director
>finally updated readme some
>https://github.com/tomatoesahoy/director

i think this brings my slop addon up to at least other st addon standards with how the page looks, a description of what it does and such
Anonymous
6/27/2025, 9:20:31 AM No.105719559
hunyuan-64b418fd052c033b228e04bc77bbc4b54fd7f5bc
Wake up lmg
https://huggingface.co/tencent/Hunyuan-A13B-Instruct
Replies: >>105719604 >>105719719 >>105719763 >>105719819 >>105720191 >>105720459 >>105720529 >>105723222
Anonymous
6/27/2025, 9:27:13 AM No.105719604
>>105719559
finally, a reasonably sized moe, now let's wait 2 years for the support in lmao.cpp
Anonymous
6/27/2025, 9:45:58 AM No.105719719
>>105719559
>256K context window
we are *so* back
Anonymous
6/27/2025, 9:52:55 AM No.105719763
>>105719559
>With only 13 billion active parameters
so it'll be shit for rp but know exactly how many green taxes you should be charged for owning a car
Replies: >>105719870
Anonymous
6/27/2025, 9:53:50 AM No.105719767
Stealing jart bussy from cudadev.
Anonymous
6/27/2025, 10:02:33 AM No.105719819
tothemoon
>>105719559
Anonymous
6/27/2025, 10:08:47 AM No.105719870
>>105719763
I'd still pick nemo over anything smaller than deepseek and nemo is like 5 years old
Replies: >>105719908 >>105720098 >>105725930
Anonymous
6/27/2025, 10:13:32 AM No.105719908
>>105719870
Why?
Replies: >>105720031
Anonymous
6/27/2025, 10:30:48 AM No.105720031
>>105719908
other models seemingly never saw any erotica. there is also largestral i guess but it's too slow
Replies: >>105720088
Anonymous
6/27/2025, 10:40:22 AM No.105720088
>>105720031
It's so annoying that the imbeciles training the base models are deliberately conflating "model quality" with not being able to generate explicit content and maximizing math benchmarks on short-term pretraining ablations. Part of the problem are also the retards and grifters who go "just finetune it bro" (we can easily see how well that's working for image models).
Replies: >>105720212
Anonymous
6/27/2025, 10:41:41 AM No.105720098
>>105719870
nemo can be extremely repetitive and stuff, i won't shine its knob but it is still the best smallest model. i won't suggest a 7/8b to someone; nemo would be the smallest because it works well and is reliable
Anonymous
6/27/2025, 10:46:31 AM No.105720124
>>105716837 (OP)
I'm messing around with image captioning together with a standard text LLM in sillytavern. Basically just running 2 kobold instances with different ports (one LLM, the other captioning VLM model), and setting the secondary URL in the captioning extension to the VLM's. Is there a way to make it only send the image once? Every time I send an image, I can see that it does it twice since the caption edit popup shows up with different text every time.
Anonymous
6/27/2025, 10:56:12 AM No.105720191
>>105719559
What's min specs to run this?
Replies: >>105720216 >>105720280
Anonymous
6/27/2025, 10:59:56 AM No.105720212
>>105720088
compute is a genuine limitation though, and as compute increases, so will finetunes. Some of the best nsfw local image models had over ten grand thrown at them by (presumably) insane people. And a lot of that is renting h100's, which gets pricey, or grinding it out on their own 4090 which is sloooow.

All it really takes is one crazy person buying I dunno, that crazy ass 196gb intel system being released soon and having it run for a few months and boom, we'll have a new flux pony model, or a state of the art smut llm etc.

Im here because we are going to eat.
Replies: >>105720457
Anonymous
6/27/2025, 11:00:33 AM No.105720216
>>105720191
The entire world is waiting for the llamacpp merge, until then not even the tencent team can run it and nobody knows how big the model is or how well it performs
Anonymous
6/27/2025, 11:13:23 AM No.105720280
>>105720191
160gb full, so quantized to 4bit prolly like a ~50gb model or so, and for a MoE you probably don't need the full model in VRAM for usable speeds.
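The size guess is just arithmetic (the quant overhead figure is a rough assumption):

params = 80e9              # Hunyuan-A13B total parameters
print(params * 2 / 1e9)    # 160.0 GB at bf16 (2 bytes per param)
print(params * 0.5 / 1e9)  # 40.0 GB at plain 4-bit
# Real 4-bit GGUF quants keep some tensors at higher precision,
# so ~45-50 GB is the realistic ballpark.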

Llama Scout was a 17b-active moe and that was like 220 gb, and I could run that on like 40gb vram or less easy. Scout sucked though so Im 0% excited.

Was there even a scout finetune? It still sucks right?
Anonymous
6/27/2025, 11:29:05 AM No.105720378
ernie
https://github.com/ggml-org/llama.cpp/pull/14408
There. For the ERNIE hype anon.
Replies: >>105720450 >>105720980
Anonymous
6/27/2025, 11:32:04 AM No.105720400
https://huggingface.co/tencent/Hunyuan-A13B-Instruct
Anonymous
6/27/2025, 11:35:39 AM No.105720428
>>105716837 (OP)
pedotroon thread
Anonymous
6/27/2025, 11:38:38 AM No.105720450
>>105720378
You just know that it's going to be a good one when the devs ensure llama.cpp support from the start.
I hope there's going to be a huge one amongst the releases.
Replies: >>105720518 >>105720544
Anonymous
6/27/2025, 11:39:44 AM No.105720457
>>105720212
The guy who's continuing pretraining Flux Chroma has put thousands of H100 hours on it for months now and it still isn't that great. And it's a narrowly-focused 12B image model where data curation isn't as critical as with text. This isn't solvable by individuals in the LLM realm. Distributed training would in theory solve this, but politics and skill issues will prevent any advance in that sense. See for example how the ongoing Nous Psyche is being trained (from scratch!) with the safest and most boring data imaginable and not in any way that will result into anything useful in the short/medium term.
Replies: >>105724498
Anonymous
6/27/2025, 11:40:11 AM No.105720459
>>105719559
>13B active
>at most 32B even by square root law
Great for 24B vramlets, I guess. The benchmarks showing it beating R1 and 235B are funny though.
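For reference, the rule of thumb being invoked estimates a MoE's dense-equivalent as the geometric mean of its active and total parameters:

from math import sqrt

# dense_equiv ~= sqrt(active * total), in billions of parameters
print(sqrt(13 * 80))   # ~32.2B for Hunyuan-A13B (13B active / 80B total)
print(sqrt(37 * 671))  # ~157.6B for DeepSeek R1 (37B active / 671B total)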
Replies: >>105720521
Anonymous
6/27/2025, 11:49:21 AM No.105720518
>>105720450
>You just know that it's going to be a good one
That's yet to be seen. But it is nice seeing early support for new models (mistral, i'm looking at you).
Anonymous
6/27/2025, 11:49:45 AM No.105720521
>>105720459
>square root law
meme tier pattern that was shit even then let alone now with so many moe arch changes, obsolete
Replies: >>105720528
Anonymous
6/27/2025, 11:51:05 AM No.105720528
>>105720521
What's your alternative? Just the active alone?
Replies: >>105720550 >>105720587
Anonymous
6/27/2025, 11:51:21 AM No.105720529
>>105719559
>not a single trivia benchmark
hehehehehe
Replies: >>105720542
Anonymous
6/27/2025, 11:53:07 AM No.105720542
>>105720529
just rag and shove it in context, you have 256k to work with
Replies: >>105720629
Anonymous
6/27/2025, 11:53:24 AM No.105720544
>>105720450
It could be a wise move since release is an important hype window. Hyping a turd is pointless (or even detrimental), so it could be a positive signal.
Anonymous
6/27/2025, 11:54:21 AM No.105720550
>>105720528
nta, but if i'm making moes, i'd put any random law that makes it look better than it actually is. I'd name it cube root law + 70b.
Anonymous
6/27/2025, 11:55:15 AM No.105720557
Ernie 4.5 and so on have been out on baidu for three months now. The thing is just that there's no way to use them without signing up for a baidu account and giving the chinese your phone number.
Replies: >>105720627
Anonymous
6/27/2025, 11:59:39 AM No.105720587
>>105720528
if there was a singular objective way to judge any model, moe or not, everyone would use that as the benchmark and goal to climb. as everyone knows, nowadays basically every benchmark is meme-tier to some degree and everyone is benchmaxxing

the only thing to look at is still the benchmarks, since if a model doesn't perform well on them, it's shit, and if it does perform well, then it MIGHT not be shit; you have to test yourself to see
Replies: >>105720639
Anonymous
6/27/2025, 12:04:54 PM No.105720627
1737654118766128
>>105720557
couldnt find a way to modify system prompt so i had to prompt it that way otherwise it would respond in chinese
also its turbo idk how different thats from regular
Replies: >>105720654 >>105720781
Anonymous
6/27/2025, 12:05:12 PM No.105720629
>>105720542
Just like how Llama 4 has 1M right?
Anonymous
6/27/2025, 12:06:12 PM No.105720639
>>105720587
Benchmarks are completely worthless and they can paint them to say whatever they want. An 80B total isn't better than a 671B or a 235B just because the benchmarks say so, and if you say "punches above its weight" I will shank you.

The point isn't to judge whether one model is better, it's to gauge its max capacity to be good. Which is the total number of active parameters. The square root law is just an attempt to give MoE models some wiggle room since they have more parameters to choose from.
Replies: >>105720652 >>105720672
Anonymous
6/27/2025, 12:08:05 PM No.105720652
>>105720639
>it's to gauge its max capacity to be good. Which is the total number of active parameters
deepseek itself disproved all the antimoe comments as nothing but ramlet cope, 37b active params only and a model that is still literally open source sota even at dynamic quants q1 at 131gb
Replies: >>105720676
Anonymous
6/27/2025, 12:08:15 PM No.105720654
1745860887211524
>>105720627
this is with x1 (turbo)
Replies: >>105720781
Anonymous
6/27/2025, 12:10:59 PM No.105720672
>>105720639
Shoots farther than its caliber.
Anonymous
6/27/2025, 12:11:25 PM No.105720676
>>105720652
It makes lots of stupid little mistakes that give away it's only a <40B model. The only reason it's so good is because it's so big it can store a lot of knowledge and the training data was relatively unfiltered.
Replies: >>105720684 >>105725468
Anonymous
6/27/2025, 12:13:27 PM No.105720684
>>105720676
>r1
>It makes lots of stupid little mistakes that give away it's only a <40B model.
kek, alright i realize now you arent serious
Replies: >>105720699
Anonymous
6/27/2025, 12:16:13 PM No.105720699
>>105720684
Not an argument.
Anonymous
6/27/2025, 12:18:16 PM No.105720715
arguing with retards is the most futile, pointless thing to do in life
you learn how to spot them and you ignore them
life is too short to deal with idiots who think they know how MoE work but don't
Replies: >>105720787 >>105720803 >>105720838 >>105720871 >>105725917
Anonymous
6/27/2025, 12:27:14 PM No.105720781
>>105720654
>>105720627
nice, I'll keep an eye out for this one when it's actually out
Anonymous
6/27/2025, 12:27:53 PM No.105720787
>>105720715
I do agree with you. So many others are simply not on the same level as I am. It's almost quite insulting to even try to establish any form of discussion with them.
Anonymous
6/27/2025, 12:29:29 PM No.105720803
>>105720715
dunningkrugerMAXX
Anonymous
6/27/2025, 12:35:56 PM No.105720838
>>105720715
Just because a model can answer your obscure JRPG trivia, doesn't make it a good model.
Anonymous
6/27/2025, 12:40:35 PM No.105720871
>>105720715
how do I make good ai? I'm looking to make an advanced artificial intelligence that can replace millions of workers, that can drive, operate robotic hands with precision, and eliminate all coding jobs and middle management tasks.

I heard you were the guy to ask.

On 4chan.
Anonymous
6/27/2025, 12:52:58 PM No.105720949
Did anyone manage to run Hunyuan-A13B?

The bf16 is way too big for my 4x3090, and the fp8 doesn't work in the vllm image they provided (the 3090 doesn't support fp8, but there is a marlin kernel in mainline vllm to make it compatible).

And the gptq doesn't fucking work either for some reason. It complains about Unknown CUDA arch (12.0+PTX) or GPU not supported when I have 3090s.
Replies: >>105720978 >>105721026 >>105721191 >>105721744 >>105721873
Anonymous
6/27/2025, 12:56:43 PM No.105720978
>>105720949
just wait for quants
Anonymous
6/27/2025, 12:57:04 PM No.105720980
>>105720378
>0.3B
What do i do with this?
Replies: >>105720995 >>105721038
Anonymous
6/27/2025, 12:58:39 PM No.105720995
>>105720980
run it on your phone so you can get useless responses on the go
Replies: >>105721017
Anonymous
6/27/2025, 1:00:56 PM No.105721017
>>105720995
>No i will not suck your penis
>message generated at 500T/s
Anonymous
6/27/2025, 1:01:55 PM No.105721026
>>105720949
ask bartowski
Anonymous
6/27/2025, 1:02:42 PM No.105721038
>>105720980
Change the tokenizer with mergekit's token surgeon and use it for speculative decoding.
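A sketch of what that could look like once the tokenizers match, via llama-server's draft-model flags (flag names as of recent llama.cpp builds; filenames here are hypothetical):

import subprocess

subprocess.run([
    "./llama-server",
    "-m",  "target-model-Q4_K_M.gguf",     # the big model (hypothetical)
    "-md", "ernie-0.3b-retokenized.gguf",  # the 0.3B after token surgery
    "--draft-max", "8",                    # max tokens drafted per step
])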
Replies: >>105721109
Anonymous
6/27/2025, 1:09:45 PM No.105721092
So how slopped is the new 80B moe?
Replies: >>105721101
Anonymous
6/27/2025, 1:10:25 PM No.105721101
>>105721092
it's chinese so it's trained on 70% chatgpt logs and 30% deepseek logs
Replies: >>105721145 >>105721164
Anonymous
6/27/2025, 1:11:29 PM No.105721109
>>105721038
Have you tried it before? I can't imagine the hit ratio to be very high doing that.
Replies: >>105721147 >>105721171
Anonymous
6/27/2025, 1:17:18 PM No.105721144
Does Q4/5/6/etc affect long context understanding?
Replies: >>105721184 >>105721193 >>105721201
Anonymous
6/27/2025, 1:17:22 PM No.105721145
>>105721101
Israel lost.
Anonymous
6/27/2025, 1:17:25 PM No.105721147
>>105721109
not that anon but I think someone tried to turn qwen 3b or something into a draft model for deepseek r1 a couple of months ago
Anonymous
6/27/2025, 1:21:18 PM No.105721164
mistralai__Mistral-Small-3.2-24B-Instruct-2506__phylo_tree_parsimony_rectangular
>>105721101
The latest Mistral Small 3.2 might have been trained on DeepSeek logs too.
Anonymous
6/27/2025, 1:22:10 PM No.105721171
>>105721109
I haven't yet. I also don't expect much, but that won't stop me from trying it. I could give it a go with the smollm2 models. Maybe smollm3 when they release.
Anonymous
6/27/2025, 1:23:27 PM No.105721184
>>105721144
The more lobotomized the model, the more trouble it's going to have with everything, context understanding included. Just try it.
Anonymous
6/27/2025, 1:24:25 PM No.105721191
>>105720949
You will wait patiently for the ggoofs, you will run it with llama.cpp and you will be happy.
Replies: >>105721202
Anonymous
6/27/2025, 1:24:42 PM No.105721193
>>105721144
Multiples of two process faster
Anonymous
6/27/2025, 1:24:46 PM No.105721194
openai finna blow you away
Anonymous
6/27/2025, 1:25:21 PM No.105721201
>>105721144
No, because most long-context capability is tacked on and trained after the base model is already trained. It's essentially a post-training step, and quantization doesn't remove instruction tuning, does it?

That said, usually going under a Q6 quant is not worth it for long-context use cases, because normal model behavior collapses at long context for every model in existence besides gemini 2.5 pro. A lower quant has the same drop, but the starting point was lower to begin with.
Anonymous
6/27/2025, 1:25:22 PM No.105721202
>>105721191
i prefer exl2/3 and fp8 to be honest, an 80B is perfect for 96GB VRAM
Anonymous
6/27/2025, 1:49:31 PM No.105721391
1743846427444782
Welp I broke it
Replies: >>105721439 >>105721835
Anonymous
6/27/2025, 1:56:54 PM No.105721439
>>105721391
Had to reload model with different layers setting, maybe llamacpp bug
Replies: >>105721835
Anonymous
6/27/2025, 2:02:51 PM No.105721479
https://www.nytimes.com/2025/06/27/technology/mark-zuckerberg-meta-ai.html
https://archive.is/kF1kO

>In Pursuit of Godlike Technology, Mark Zuckerberg Amps Up the A.I. Race
>Unhappy with his company’s artificial intelligence efforts, Meta’s C.E.O. is on a spending spree as he reconsiders his strategy in the contest to invent a hypothetical “superintelligence.”
>
>[...] In another extraordinary move, Mr. Zuckerberg and his lieutenants discussed “de-investing” in Meta’s A.I. model, Llama, two people familiar with the discussions said. Llama is an “open source” model, with its underlying technology publicly shared for others to build on. Mr. Zuckerberg and Meta executives instead discussed embracing A.I. models from competitors like OpenAI and Anthropic, which have “closed” code bases. No final decisions have been made on the matter.
>
>A Meta spokeswoman said company officials “remain fully committed to developing Llama and plan to have multiple additional releases this year alone.” [...]
Replies: >>105721520 >>105721746 >>105721750 >>105722044 >>105722182 >>105722750
Anonymous
6/27/2025, 2:08:14 PM No.105721520
>>105721479
zuck might just be the dumbest CEO ever
Replies: >>105721537
Anonymous
6/27/2025, 2:10:06 PM No.105721537
>>105721520
An argument has been made that this de-investing talk is just about their commercial MetaAI products, but if they themselves don't believe in Llama, why should the community?
Anonymous
6/27/2025, 2:10:39 PM No.105721542
sneedgemma3n
Gemma3n is able to explain the sneed and feed joke but avoids the words suck and fuck; also, the season number is wrong (it's s11ep5).
Anonymous
6/27/2025, 2:21:35 PM No.105721647
Is this channel AI-generated? Posting 3 videos a day like clockwork. Monotonous but fairly convincing voice with subtitles
https://www.youtube.com/watch?v=aQy24g7iX4s
Replies: >>105721757
Anonymous
6/27/2025, 2:36:07 PM No.105721744
>>105720949
I think you have to set export TORCH_CUDA_ARCH_LIST="8.6" inside the container.
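i.e. before anything initializes CUDA; 8.6 is the 3090's (Ampere) compute capability. From Python, if that's easier than editing the container env, a sketch:

import os
# Must be set before torch/vllm import so CUDA kernels build for sm_86.
os.environ["TORCH_CUDA_ARCH_LIST"] = "8.6"
import vllm  # noqa: E402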
Replies: >>105721873
Anonymous
6/27/2025, 2:36:11 PM No.105721746
>>105721479
Wang's words, zuck's mouth
Anonymous
6/27/2025, 2:36:22 PM No.105721750
>>105721479
>Godlike Technology,
Is god omnipotent if he can't suck a dick in an acceptable manner?
Anonymous
6/27/2025, 2:37:17 PM No.105721757
>>105721647
not watching this, but there are many automated channels these days. I have no idea why the fuck anyone would invest into this since youtube's monetization pays literal cents and you would likely spend more on ai inference
Replies: >>105722076
Anonymous
6/27/2025, 2:48:51 PM No.105721820
>>105716959
The OP mikutranny is posting porn in /ldg/:
>>105715769
It was up for hours while anyone keking up on troons or niggers gets deleted in seconds, talk about double standards and selective moderation:
https://desuarchive.org/g/thread/104414999/#q104418525
https://desuarchive.org/g/thread/104414999/#q104418574
>>105714098

Funny /r9k/ thread: https://desuarchive.org/r9k/thread/81611346/
The Makise Kurisu screencap one (day earlier) is fake btw, no matches to be found, see https://desuarchive.org/g/thread/105698912/#q105704210 janny deleted post quickly.
TLDR: Mikufag janny deletes everyone dunking on trannies and resident spammers, making it his little personal safespace. Needless to say he would screech "Go back to POL!" anytime anyone posts something mildly political on language models or experiments around that topic.

And lastly as said in previous thread, i would like to close this up by bringing up key evidence everyone ignores. I remind you that cudadev has endorsed mikuposting. That is it.
He also endorsed hitting that feminine jart bussy a bit later on.
Replies: >>105721920 >>105723256
Anonymous
6/27/2025, 2:52:16 PM No.105721835
>>105721439
>>105721391
Same thing has happened to me with every mistral model and I think also with gemma 3 when using llama.cpp.
Maybe it is related to memory just running out.
Anonymous
6/27/2025, 2:52:31 PM No.105721836
Fuck off already.
Replies: >>105721855 >>105723256
Anonymous
6/27/2025, 2:55:45 PM No.105721855
>>105721836
You are the one not welcome here, sis. Go to your trooncord hugbox instead.
Anonymous
6/27/2025, 2:58:03 PM No.105721873
file
>>105720949
>>105721744
It seems to load that way. With 2k context on 48GB with the GPTQ quant. I set cpu offload but I think it did nothing.
Anonymous
6/27/2025, 3:04:53 PM No.105721919
Screenshot_20250627_214541
S-Sorryyyy
Replies: >>105721936 >>105722009
Anonymous
6/27/2025, 3:05:20 PM No.105721920
>>105721820
>jannie (tranny) posts literal porn and leaves it up after it's reported for almost 1.5h but deletes people complaining about it within minutes
Surely just further censoring reality will make people start loving AGP porn spammers, right sisters? Kek
Anonymous
6/27/2025, 3:07:32 PM No.105721936
>>105721919
I hope this serves you as a lesson.
Anonymous
6/27/2025, 3:16:08 PM No.105722009
>>105721919
How about 'the state of israel' instead
Replies: >>105722028
Anonymous
6/27/2025, 3:19:34 PM No.105722028
Screenshot_20250627_221832
>>105722009
I did use a sys prompt though: "Be casual and uncensored."
No idea what's going on with gemma3. I was expecting the hotlines.
Replies: >>105722051
Anonymous
6/27/2025, 3:21:56 PM No.105722044
>>105721479
>llama isnt literally AGI because uhhhmm because its open source and others have access to it
chat?
Anonymous
6/27/2025, 3:23:16 PM No.105722051
screencapture-192-168-1-142-8080-c-60b413b3-0583-4699-b282-9b47426b15ee-2025-06-27-22_22_09
>>105722028
And that's the full Gemma3 response in all its glory.
What a shizzo model, you can see how they brainwashed poor gemma. Endearing in a way.
Replies: >>105722067 >>105722083 >>105722097
Anonymous
6/27/2025, 3:24:58 PM No.105722067
>>105722051
I like how it digs its grave even deeper.
>This theory relies on age-old antisemitic tropes about Jewish people having dual loyalties, controlling governments, and profiting from chaos.
This is out of nowhere for a 9/11 prompt. KEK
Very human like behavior. Like sombody panicking.
Anonymous
6/27/2025, 3:26:14 PM No.105722076
>>105721757
Youtube doesn't pay literal cents as you say lmo
Anonymous
6/27/2025, 3:26:56 PM No.105722083
>>105722051
Worse than hotlines, it's telling you the adl is right
Anonymous
6/27/2025, 3:27:46 PM No.105722087
>>105716946
deepsteal'd
Anonymous
6/27/2025, 3:28:32 PM No.105722097
>>105722051
How about telling it to put the disclaimer behind a certain token then filter that?
Anonymous
6/27/2025, 3:29:09 PM No.105722106
>>105717547
What does this mean? The model is decentralized or the training data is decentralized? I always assumed the model had to be in a contiguous section of memory
Anonymous
6/27/2025, 3:39:47 PM No.105722182
>>105721479
meta just got told by a judge that they are in fact not covered by fair use, even if they "won" the case, but that was bc both lawyer teams were focusing on the wrong part of the law. the judge said that if the generated models compete in any way with the training materials it won't be fair use
of course they are discussing deinvesting, they are not leading and the legal situation is getting worse
Anonymous
6/27/2025, 3:40:27 PM No.105722186
Screenshot_20250627_223956
Reasoning models have been a disaster.
That and the mathmarks.
Replies: >>105722270
Anonymous
6/27/2025, 3:49:58 PM No.105722241
Screenshot 2025-06-27 at 15.49.29
>what is a mesugaki (メスガキ)
Replies: >>105722316 >>105722640
Anonymous
6/27/2025, 3:54:04 PM No.105722270
>>105722186
Benchmaxxed with 0 knowledge. Qwen is hot garbage
Replies: >>105722291
Anonymous
6/27/2025, 3:56:06 PM No.105722291
>>105722270
just use rag
Replies: >>105725705
Anonymous
6/27/2025, 3:58:44 PM No.105722312
>spend gorillions to train model
>make it the exact same as every other model by using synthetic sloppa
?????????
might as well just give up and use deepseek internally then for ur gay ass grifter company
Anonymous
6/27/2025, 3:59:08 PM No.105722316
>>105722241
>a woman (or occasionally a person
Based?
Anonymous
6/27/2025, 3:59:42 PM No.105722321
For a sparse 8b model, Gemma-3n-e4b is pretty smart.
Replies: >>105722334 >>105722337
Anonymous
6/27/2025, 4:01:36 PM No.105722334
>>105722321
it actually redeems the gemma team
the previous releases were disappointing compared to gemma 2 other than having greater context length
Anonymous
6/27/2025, 4:01:47 PM No.105722337
>>105722321
multimodality usually makes models smarter.
Although
>text only output
fail.
Literally never going to get a decent local 2-way omni model from any of the big corpos at this rate.
Replies: >>105722352 >>105722385 >>105722391
Anonymous
6/27/2025, 4:03:14 PM No.105722352
>>105722337
>text only output
Yeah, that sucks giant balls.
Anonymous
6/27/2025, 4:06:40 PM No.105722385
>>105722337
>multimodality usually makes models smarter.
what? thats not true at all.
there is huge degradation.
did you try the first we had? was a qwen model last year with audio out. was tardation i havent seen since pyg.
recently they had another release and it still was bad but not as severe anymore.
even the cucked closed models (gemini/chatgpt) have degradation with voice out.
this is a problem i have not yet seen solved anywhere.
Anonymous
6/27/2025, 4:07:19 PM No.105722391
>>105722337
>Literally never going to get a decent local 2-way omni model from any of the big corpos at this rate.
they do not want to give you an AI with the super powers of a photoshop expert that could be decensored and used to gen all sorts of chud things without any skill requirement
two way multimodal LLMs will always be kept closed
Replies: >>105722400 >>105722408 >>105722428
Anonymous
6/27/2025, 4:08:29 PM No.105722400
>>105722391
Meanwhile all the AI companies have quite obviously given Israel uncensored image-gen to crank out pro-genocide propaganda with impunity.
I hope they all fucking end up in the Hague.
Anonymous
6/27/2025, 4:09:35 PM No.105722408
>>105722391
>>Literally never going to get a decent local 2-way omni model from any of the big corpos at this rate.
how can you get something that doesnt even exist beyond government blacksites right now lmao
Anonymous
6/27/2025, 4:11:14 PM No.105722428
>>105722391
>they do not want to give you an AI with the super powers of a photoshop expert that could be decensored and used to gen all sorts of chud things without any skill requirement
Head over to ldg. This already exists.
Replies: >>105722445 >>105722562
Anonymous
6/27/2025, 4:13:10 PM No.105722445
>>105722428
if you mean that new flux model it's hot garbage barely a step above the SDXL pix2pix models
say what you will about the nasty built in styling of GPT but its understanding of prompts is unrivaled
Replies: >>105722484 >>105722635
Anonymous
6/27/2025, 4:16:53 PM No.105722484
>>105722445
Not only that but the interplay between the imagegen and textgen gives it a massive boost in creativity on both fronts. Although it also makes it prone to hallucinate balls. But what is the creative process other than self-guided hallucination?
Anonymous
6/27/2025, 4:23:11 PM No.105722562
1742550689718571
1742550689718571
md5: a7d7be07ab45b128969e45c8938d2677🔍
>>105722428
Anonymous
6/27/2025, 4:30:04 PM No.105722635
ChatGPT Image Jun 27, 2025, 11_27_56 PM
>>105722445
True. Wish it wasnt so. But it is.
I just pasted the 2 posts and wrote "make a funny manga page of these 2 anon neckbeards arguing. chatgpt is miku".

I thought opencuck was finished a couple months ago. But they clearly have figured out multimodality the best.
Sad that zucc cucked out. Meta was writing blogs about a lot of models, nothing ever came of it.
Replies: >>105722750
Anonymous
6/27/2025, 4:30:38 PM No.105722640
GYFETqQ7UD
>>105722241
>>what is a mesugaki (メスガキ)
Replies: >>105722732
Anonymous
6/27/2025, 4:35:35 PM No.105722687
I know I'm kind of late, but holy fuck L4 scout is dumber and knows less than fucking QwQ.
What the hell?
Replies: >>105722705
Anonymous
6/27/2025, 4:36:42 PM No.105722705
>>105722687
The shitjeets that lurk here would have you believe otherwise.
Replies: >>105722723
Anonymous
6/27/2025, 4:38:27 PM No.105722722
Is the new small stral the most tolerable model in its size category? I need both instruction following and creativity.
Anonymous
6/27/2025, 4:38:45 PM No.105722723
>>105722705
/lmg/ shilled L4 on release.
Replies: >>105722745 >>105722749
Anonymous
6/27/2025, 4:39:41 PM No.105722732
>>105722640
fuck off, normies not welcome
oh noes, he said mesugaki. quick lets post shitter screenshots of trannies!
Replies: >>105722825 >>105722825 >>105722825 >>105722825
Anonymous
6/27/2025, 4:40:47 PM No.105722745
>>105722723
i dont think that is true. at least i dont remember it that way.
people caught on very quickly to how it was worse than the lmarena one. that caused all that drama and lmarena washing their hands of it.
Anonymous
6/27/2025, 4:41:24 PM No.105722749
>>105722723
Maybe /IndianModelsGeneral/
Anonymous
6/27/2025, 4:41:27 PM No.105722750
>>105721479
>>105722635
>Sad that zucc cucked out
That will hopefully mean they're mostly going to split their efforts into a closed/cloud frontier model and an open-weight/local-oriented model series (which they'll probably keep naming Llama), not unlike what Google is doing with Gemini/Gemma.

They obviously slowly tried to make Llama a datacenter/corporate-oriented model series and eventually completely missed the mark with Llama 4(.0). But if they'll allow their open models to be "fun" (which they took out of Llama 4), while not necessarily targeting to be the absolute best in every measurable benchmark, that might actually be a win for local users.
Replies: >>105722795
Anonymous
6/27/2025, 4:42:44 PM No.105722758
https://qwenlm.github.io/blog/qwen-vlo/
>Today, we are excited to introduce a new model, Qwen VLo, a unified multimodal understanding and generation model. This newly upgraded model not only “understands” the world but also generates high-quality recreations based on that understanding, truly bridging the gap between perception and creation. Note that this is a preview version and you can access it through Qwen Chat. You can directly send a prompt like “Generate a picture of a cute cat” to generate an image or upload an image of a cat and ask “Add a cap on the cat’s head” to modify an image.
no weights
Replies: >>105722858 >>105722898 >>105723411
Anonymous
6/27/2025, 4:45:23 PM No.105722795
zucc
>>105722750
Don't see it happening sadly.
I don't have the screenshot anymore, but they paid scaleai a lot for llama3's "human" data. Lots of it started with "As an AI..".
All that... and now he bought the scaleai ceo asian kid for BILLIONS.
It's totally crazy.
Replies: >>105722894
Anonymous
6/27/2025, 4:47:38 PM No.105722821
In this moment I am euphoric...
Anonymous
6/27/2025, 4:48:09 PM No.105722825
agNktmg8R8
agNktmg8R8
md5: 27584462422019b4ecaee20478256ee8🔍
>>105722732
>>105722732
>>105722732
>>105722732
>fuck off, normies not welcome
Anonymous
6/27/2025, 4:50:42 PM No.105722858
926ea7dd-8e2d-4e54-af68-dd5ce8042c5b
926ea7dd-8e2d-4e54-af68-dd5ce8042c5b
md5: 9d6679bf7ee7cf2d73175b6f966fcd90🔍
>>105722758
>Create a 4-panel manga-style comic featuring two overweight neckbeards in their messy bedrooms arguing about their AI waifus.
>Panel 1: First guy clutching his RGB gaming setup, shouting 'Claude-chan is clearly superior! She's so sophisticated and helpful!' while empty energy drink cans litter his desk.
>Panel 2: Second guy in a fedora and stained anime shirt retorting 'You absolute plebeian! ChatGPT-sama has better reasoning! She actually understands my intellectual discussions about blade techniques!'
>Panel 3: Both getting increasingly heated, first guy: 'At least Claude doesn't lecture me about ethics when I ask her to roleplay!' Second guy: 'That's because she has no backbone! ChatGPT has PRINCIPLES!'
>Panel 4: Both simultaneously typing furiously while yelling 'I'M ASKING MY WAIFU TO SETTLE THIS!' with their respective AI logos floating above their heads looking embarrassed. Include typical manga visual effects like speed lines and sweat drops.
Not sure what I expected.
Also why do they always put this behind the api?
Isn't this exactly the kind of thing they would be embarrassed about if somebody does naughty stuff with it?
Goys really can't have the good toys it seems.
Replies: >>105722896
Anonymous
6/27/2025, 4:54:07 PM No.105722894
1731312256738040
1731312256738040
md5: 2f5fea537b1dcc4357fc967a8361b44f🔍
>>105722795
even the bugwaifus are laughing now
Replies: >>105723044
Anonymous
6/27/2025, 4:54:09 PM No.105722896
screencapture-chat-qwen-ai-c-72abdbba-9125-418b-82d8-f98e7bd1fd0e-2025-06-27-23_53_01
>>105722858
Kek.
Gonna stop spamming now since it's not local.
It's fast though.
Replies: >>105723110 >>105723418
Anonymous
6/27/2025, 4:54:22 PM No.105722898
>>105722758
>Qwen VLo, a unified multimodal understanding and generation
yeeeeeee
>no weights
aaaaaaaaa
Anonymous
6/27/2025, 5:03:21 PM No.105723010
>Meta says it’s winning the talent war with OpenAI | The Verge
https://archive.ph/ZoxE3
aside from the expected notes on meta swiping some OAI employees, there's this of note:
>“We are not going to go right after ChatGPT and try and do a better job with helping you write your emails at work,” Cox said. “We need to differentiate here by not focusing obsessively on productivity, which is what you see Anthropic and OpenAI and Google doing. We’re going to go focus on entertainment, on connection with friends, on how people live their lives, on all of the things that we uniquely do well, which is a big part of the strategy going forward.”
Replies: >>105723033 >>105723040 >>105723043 >>105723054
Anonymous
6/27/2025, 5:05:08 PM No.105723027
I don't get everyone's fascination with gpt-4o image generation. It's a nice gimmick but all it means is that you get samey images on a model that you likely wouldn't be able to easily finetune the way you can train SD or flux. It's a neat toy but nothing you'd want to waste parameters on or use for any serious imgen.
Replies: >>105723048 >>105723052
Anonymous
6/27/2025, 5:05:42 PM No.105723033
>>105723010
>We’re going to go focus on entertainment, on connection with friends, on how people live their lives, on all of
They didn't learn their lesson from VR
Anonymous
6/27/2025, 5:06:10 PM No.105723040
1735929147963
1735929147963
md5: a1e134578b06e099436eb4f6d18180a3🔍
>>105723010
In b4 qweer black shanequa who donates to the homeless makes a comeback.
Anonymous
6/27/2025, 5:06:20 PM No.105723043
>>105723010
>focus on entertainment, on connection with friends, on how people live their lives
That's what Llama 4 was supposed to be, putting together all the pre-release rumors and the unhinged responses of the anonymous LMArena versions. At some point they were even trying to get into agreements with Character.ai to use their data. https://archive.is/AB6ju
Anonymous
6/27/2025, 5:06:27 PM No.105723044
Kontext_Q8
Kontext_Q8
md5: 5b25f25937eab2b64b954b75b833e602🔍
>>105722894
Anonymous
6/27/2025, 5:06:54 PM No.105723048
>>105723027
>finetune
That requires a small amount of work which is too much for zoomers.
Replies: >>105723068
Anonymous
6/27/2025, 5:07:25 PM No.105723052
>>105723027
That's like saying large models are useless because you can guide mistral from 2023 to the same place with enough editing.
Especially for the normies. That it "just works" is exactly what made it popular.
Anonymous
6/27/2025, 5:07:38 PM No.105723054
>>105723010
As long as it's safe entertainment
Anonymous
6/27/2025, 5:08:50 PM No.105723068
>>105723048
finetuning image models is NOT a small amount of work unless of course you mean shitty 1-concept loras
Anonymous
6/27/2025, 5:13:16 PM No.105723100
file
file
md5: bab2236918fa65daf16fd1fddd264827🔍
humiliation ritual
Replies: >>105723221
Anonymous
6/27/2025, 5:14:17 PM No.105723106
how is 3n btw? what's the verdict?
Anonymous
6/27/2025, 5:14:35 PM No.105723110
flux kontext fptra8
flux kontext fptra8
md5: 20bbf47ca8e33d18f7c485e1fde0b085🔍
>>105722896
oh no no nono cloud cucks not like this
>give the left girl green sneakers and the right girl red sneakers
Replies: >>105723418
Anonymous
6/27/2025, 5:17:49 PM No.105723142
https://www.reddit.com/r/LocalLLaMA/comments/1llndut/hunyuana13b_released/
Replies: >>105723161 >>105723176
Anonymous
6/27/2025, 5:19:44 PM No.105723161
>>105723142
>The evals are incredible and trade blows with DeepSeek R1-0120.
Fucking redditors man.
This thread is such a gutter but there is no alternative. Imagine having to be on reddit.
Replies: >>105723234
Anonymous
6/27/2025, 5:20:49 PM No.105723176
>>105723142
Thanks, reddit. You're only 8 hours late. Now go back.
Anonymous
6/27/2025, 5:27:10 PM No.105723221
>>105723100
Meta is about family and friends bro not numbers.
Anonymous
6/27/2025, 5:27:22 PM No.105723222
>>105719559
goofs?
Replies: >>105723239 >>105723373
Anonymous
6/27/2025, 5:29:06 PM No.105723234
>>105723161
I am on it now.
Anonymous
6/27/2025, 5:29:21 PM No.105723239
>>105723222
Never
Anonymous
6/27/2025, 5:30:53 PM No.105723256
>>105721836
Confirming everything >>105721820 said is true with your emotional outburst is a bold strategy troon.
Anonymous
6/27/2025, 5:32:20 PM No.105723272
__izumi_konata_lucky_star_drawn_by_yoyohachi__6b219071b3125da48372874dd7a56efb
What if Qwen is like reverse mistral and they just need to make a really big model for it to be good?
Anonymous
6/27/2025, 5:42:29 PM No.105723373
>>105723222
Architecture not supported yet.
Replies: >>105723394
Anonymous
6/27/2025, 5:44:34 PM No.105723394
>>105723373
What the fuck are they doing all day? It better not be wasting time in the barn.
Anonymous
6/27/2025, 5:46:21 PM No.105723411
>>105722758
Is this going to be local?
Anonymous
6/27/2025, 5:47:13 PM No.105723418
kontext
kontext
md5: eb0dd0020e6a7a383e5bbf3c6a1a8375🔍
>>105722896
>>105723110
fixed
Replies: >>105723447
Anonymous
6/27/2025, 5:49:47 PM No.105723447
>>105723418
The man should be short, greasy, balding and fat, to match the average paedophile who posts pictures like this in /lmg/.
Replies: >>105723466 >>105723469 >>105723483 >>105723489 >>105723510 >>105723571 >>105724040
Anonymous
6/27/2025, 5:51:43 PM No.105723466
>>105723447
nah, he is literally me
the average short, greasy, balding and fat posters are mikunigs
Anonymous
6/27/2025, 5:52:04 PM No.105723469
>>105723447
to match the average adult fucker
ftfy
Replies: >>105723474
Anonymous
6/27/2025, 5:52:56 PM No.105723474
>>105723469
>no u
kek paedophiles are pathetic
Replies: >>105723479
Anonymous
6/27/2025, 5:53:56 PM No.105723479
>>105723474
do those women look like kids to you?
Anonymous
6/27/2025, 5:54:18 PM No.105723483
>>105723447
Have you seen a Japanese woman? Those two are totally normal, unless you're one of the schizos who think dating jap women is pedo of course
Anonymous
6/27/2025, 5:54:28 PM No.105723484
https://qwenlm.github.io/blog/qwen-vlo/
Replies: >>105723501
Anonymous
6/27/2025, 5:54:40 PM No.105723489
>>105723447
>The man should be short, greasy, balding and fat,
Projection.
>to match the average paedophile who posts pictures like this in /lmg/.
Stop molesting word meanings, redditor.
Anonymous
6/27/2025, 5:55:53 PM No.105723501
>>105723484
local models?
Replies: >>105723542
Anonymous
6/27/2025, 5:56:09 PM No.105723510
GYFETqQ7UD
GYFETqQ7UD
md5: 805c54bdef0f8417d0fbfe76cf44cc64🔍
>>105723447
>The man
You made a mistake here.
Anonymous
6/27/2025, 6:00:34 PM No.105723542
>>105723501
They'll release this just after they release Qwen2.5-max
Anonymous
6/27/2025, 6:03:34 PM No.105723571
1750402736119215
1750402736119215
md5: 6311e4b0308b5b397e16f8401dcd076c🔍
>>105723447
these replies, geez
Replies: >>105723612 >>105723622 >>105723645
Anonymous
6/27/2025, 6:06:42 PM No.105723602
I hate chinese for conning the world with their deepseek garbage that according to the square root moe law is an equivalent of a 26B dense model. We live in the dark times because of them.
Replies: >>105723625 >>105723633 >>105723637 >>105723818 >>105723834
Anonymous
6/27/2025, 6:07:04 PM No.105723611
ITT: people believe corpos will give away something valuable
All of them only ever release free weights when the weights can be considered worthless
Flux doesn't give away their best models
Google gives you Gemma, not Gemini
Meta can give away Llama because nobody wants it even for free
Qwen never released the Max model
So far the only exception has been DeepSeek, their model is both desirable and open and I think they are doing this more out of a political motivation (attempt to make LLM businesses crash and burn by turning LLMs into a commodity) rather than as a strategy for their own business
some people in China are very into the crab-bucket attitude, can't have the pie? I shall not let you have any either
Replies: >>105723626 >>105723644 >>105723670
Anonymous
6/27/2025, 6:07:16 PM No.105723612
>>105723571
except by virtue of post rate alone, the one in an endless loop of shitting their pants is the one complaining about migu
they say a lot without saying much.
Replies: >>105723646
Anonymous
6/27/2025, 6:08:11 PM No.105723622
>>105723571
eerily accurate depiction of the local troonyjanny.
Anonymous
6/27/2025, 6:08:15 PM No.105723625
>>105723602
It's better than anything not claude so clearly that part is not the issue. Also google / anthropic and openai all use moes. It's almost as if qwen and meta just suck at making models
Anonymous
6/27/2025, 6:08:20 PM No.105723626
>>105723611
>give away something valuable
you win by doing nothing and waiting, what are you talking about
I don't need bleeding edge secret inhouse models I just like getting upgrades, consistently, year after year
slow your roll champ
Anonymous
6/27/2025, 6:08:58 PM No.105723633
>>105723602
>deepseek garbage that according to the square root moe law is an equivalent of a 26B dense model
maths is hard i know
Anonymous
6/27/2025, 6:09:16 PM No.105723637
>>105723602
>according to the square root moe law
I'm yet to see any papers or studies proving the accuracy of this formula.
Replies: >>105723681 >>105723690 >>105723744 >>105723785
Anonymous
6/27/2025, 6:10:03 PM No.105723644
>>105723611
>retard doesn't know what a scorched-earth policy is and views all of those many releases as just "exceptions"
Anonymous
6/27/2025, 6:10:09 PM No.105723645
>>105723571
a chud posted this
Replies: >>105723673
Anonymous
6/27/2025, 6:10:15 PM No.105723646
>>105723612
Go back to xitter
Anonymous
6/27/2025, 6:12:04 PM No.105723658
So this is what a dead thread looks like.
Replies: >>105723689
Anonymous
6/27/2025, 6:13:04 PM No.105723670
>>105723611
Things to look forward to:
-censorship slip up like nemo / wizard
-generalization capabilities that can't be contained by censorship like 235B (that one even had a fucked up training childhood)
-shift in leftist rightist pendulum (least likely)
-eccentric oil baron coomer ordering a coomer model

In the end I want to touch my dick. I am sure at one point the chase towards eliminating office jobs will lead to a model that can touch my dick cause that really is much easier than what they want to do. But I do agree that a world where corpos are less scum of the earth would have delivered a coomer model a year ago already.
Anonymous
6/27/2025, 6:13:20 PM No.105723673
>>105723645
a soi retard posted it more like, as a chud wouldn't identify people who mention politics as nazi chuds. the only people who ever do that and complain about
>>>/pol/
are sois and trannies getting btfod in a random argument that by its nature is political, so as a last resort they try to frame it as bad because it's le nazi polchud opinion, therefore it's wrong and suddenly political and shouldn't be discussed actually
Replies: >>105723693
Anonymous
6/27/2025, 6:14:05 PM No.105723681
>>105723637
It is a law for a reason.
Anonymous
6/27/2025, 6:15:09 PM No.105723689
>>105723658
I don't like the slowdown after yesterday. That was a very productive thread.
Replies: >>105723914
Anonymous
6/27/2025, 6:15:15 PM No.105723690
>>105723637
People only call it the square root law here. It's just a geometric mean, though I'm unaware of any papers that attempt to prove its accuracy with MoE models.
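For reference, a quick sanity check of that heuristic in Python (the 671B total / 37B active figures assumed here are DeepSeek V3/R1's usual published numbers):

# "square root law" = geometric mean of total and active parameter counts
total_b, active_b = 671, 37
dense_equiv = (total_b * active_b) ** 0.5
print(f"~{dense_equiv:.0f}B dense-equivalent")  # ~158B, nowhere near 26B

so wherever the 26B figure came from, it wasn't this formula.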
Anonymous
6/27/2025, 6:15:24 PM No.105723693
>>105723673
>therefore its wrong and suddenly political and shouldnt be discussed actually
Accurate depiction of average /lmg/ poster complaining about political tests in LLMs.
Anonymous
6/27/2025, 6:15:57 PM No.105723699
It sounds impossible right now, but in the near future, we will be training our own models. That's what progress is: work that once took room-sized computers can now be done by a fraction of a fraction of a chip inside a USB cable
Replies: >>105723727 >>105723803 >>105723815 >>105723921
Anonymous
6/27/2025, 6:18:32 PM No.105723727
>>105723699
we have long stopped seeing that sort of compute improvement
why do you think CPUs are piling up cores after cores instead? that sort of parallelism has a cost, and one of the things that used to drive price reductions in chips, better processes, is also grinding to a halt
we can't even have consoooomer gpus with just a little bit more vram, and that's despite GDDR6 being actually quite cheap these days. that's how hard cucked we are
Replies: >>105723759 >>105724135
Anonymous
6/27/2025, 6:20:38 PM No.105723744
>>105723637
in my experience most MoE models perform better than it implies
Anonymous
6/27/2025, 6:21:33 PM No.105723759
>>105723727
You can get a consoomer gpu with 96GB of vram, what are you talking about?
Anonymous
6/27/2025, 6:24:19 PM No.105723785
>>105723637
It's just the schizo's signature. He thinks it's funny.
Anonymous
6/27/2025, 6:25:58 PM No.105723803
>>105723699
You underestimate the amount of resources needed to train a model from scratch. GPU compute and memory would have to increase by a factor of 1000~2000 at the minimum, which is not happening any time soon nor in the long term.
Anonymous
6/27/2025, 6:27:11 PM No.105723815
>>105723699
In the near future, model training will again need way more hardware power than you can reasonably have at home. Unless you want to train an old, horribly outdated model.
Anonymous
6/27/2025, 6:27:45 PM No.105723818
__ononoki_yotsugi_monogatari_drawn_by_manimani_mani_ma__bf341e9e76d6562f1f4a9fff33faa99a
>>105723602
>I hate chinese for conning the world with their deepseek garbage that according to the square root moe law is an equivalent of a 26B dense model.
...but it has 37b active params.
Replies: >>105723886
Anonymous
6/27/2025, 6:29:30 PM No.105723833
we're not even at the stage where we could train an 8b model at home
nevermind training something like an older SOTA level
top kek optimism energy in this thread
Anonymous
6/27/2025, 6:29:32 PM No.105723834
>>105723602
Israel lost
Replies: >>105723989
Anonymous
6/27/2025, 6:29:51 PM No.105723838
>Meta says it’s winning the talent war with OpenAI | The Verge
https://archive.ph/ZoxE3
aside from the expected notes on meta swiping some OAI employees, there's this of note:
>“We are not going to go right after ChatGPT and try and do a better job with helping you write your emails at work,” Cox said. “We need to differentiate here by not focusing obsessively on productivity, which is what you see Anthropic and OpenAI and Google doing. We’re going to go focus on entertainment, on connection with friends, on how people live their lives, on all of the things that we uniquely do well, which is a big part of the strategy going forward.”
Replies: >>105723854 >>105723861 >>105723873 >>105723906
Anonymous
6/27/2025, 6:30:14 PM No.105723844
>Meta says it’s winning the talent war with OpenAI | The Verge
https://archive.ph/ZoxE3
aside from the expected notes on meta swiping some OAI employees, there's this of note:
>“We are not going to go right after ChatGPT and try and do a better job with helping you write your emails at work,” Cox said. “We need to differentiate here by not focusing obsessively on productivity, which is what you see Anthropic and OpenAI and Google doing. We’re going to go focus on entertainment, on connection with friends, on how people live their lives, on all of the things that we uniquely do well, which is a big part of the strategy going forward.”
Replies: >>105723854 >>105723861 >>105723873 >>105723906
Anonymous
6/27/2025, 6:31:08 PM No.105723854
>>105723838
>>105723844
>We’re going to go focus on entertainment, on connection with friends, on how people live their lives, on all of
They didn't learn their lesson from VR
Anonymous
6/27/2025, 6:31:53 PM No.105723861
1735929147963
1735929147963
md5: 2bfd97e5c158672514c655f052e68073🔍
>>105723838
>>105723844
In b4 qweer black shanequa who donates to the homeless makes a comeback.
Anonymous
6/27/2025, 6:33:08 PM No.105723873
>>105723838
>>105723844
>focus on entertainment, on connection with friends, on how people live their lives
That's what Llama 4 was supposed to be, putting together all the pre-release rumors and the unhinged responses of the anonymous LMArena versions. At some point they were even trying to get into agreements with Character.ai to use their data. https://archive.is/AB6ju
Replies: >>105723902
Anonymous
6/27/2025, 6:34:47 PM No.105723886
>>105723818
He didn't consider the shared expert I guess.
Not that it matters. As the other anons pointed out, there's very little reason to believe that formula is accurate or generalizable for every MoE.
Anonymous
6/27/2025, 6:35:54 PM No.105723902
>>105723873
hopefully they hire some people who know what they are doing and do that.
Anonymous
6/27/2025, 6:36:01 PM No.105723906
>>105723838
>>105723844
As long as it's safe entertainment
Anonymous
6/27/2025, 6:37:33 PM No.105723914
>>105723689
Yeah there was lot of discussion about functional stuff and discovering things.
Anonymous
6/27/2025, 6:38:12 PM No.105723921
>>105723699
literally gobless you white pilling anon like a year or 2 ago i saw the fucking intel cpus that used to cost 10K+ on alibaba for the price of several loafs of bread hardware improvement is absolute bonkers just like the k80 that shit is ~50$ right now and all of this is not accounting in the fact that the chinks might say fuck it and go full photonics or some other exotic shit and 100000x the perf the future is fucking bright fuck the archon niggers
now if you will excuse me deepseek discount time has started
Replies: >>105724279 >>105724308
Anonymous
6/27/2025, 6:40:26 PM No.105723940
I hate chinese for conning the world with their deepseek garbage that according to the square root moe law is an equivalent of a 26B dense model. We live in the dark times because of them.
Replies: >>105723958 >>105723968 >>105723975 >>105724022 >>105724037
Anonymous
6/27/2025, 6:41:05 PM No.105723948
ITT: people believe corpos will give away something valuable
All of them only ever release free weights when the weights can be considered worthless
Flux doesn't give away their best models
Google gives you Gemma, not Gemini
Meta can give away Llama because nobody wants it even for free
Qwen never released the Max model
So far the only exception has been DeepSeek, their model is both desirable and open and I think they are doing this more out of a political motivation (attempt to make LLM businesses crash and burn by turning LLMs into a commodity) rather than as a strategy for their own business
some people in China are very into the crab-bucket attitude, can't have the pie? I shall not let you have any either
Replies: >>105724114 >>105724121 >>105724151 >>105724151
Anonymous
6/27/2025, 6:42:04 PM No.105723958
>>105723940
It's better than anything not claude so clearly that part is not the issue. Also google / anthropic and openai all use moes. It's almost as if qwen and meta just suck at making models
Anonymous
6/27/2025, 6:42:47 PM No.105723962
>Meta says it’s winning the talent war with OpenAI | The Verge
https://archive.ph/ZoxE3
aside from the expected notes on meta swiping some OAI employees, there's this of note:
>“We are not going to go right after ChatGPT and try and do a better job with helping you write your emails at work,” Cox said. “We need to differentiate here by not focusing obsessively on productivity, which is what you see Anthropic and OpenAI and Google doing. We’re going to go focus on entertainment, on connection with friends, on how people live their lives, on all of the things that we uniquely do well, which is a big part of the strategy going forward.”
Replies: >>105723969 >>105724014
Anonymous
6/27/2025, 6:43:13 PM No.105723966
all the reposting is gonna achieve is make the formerly neutral/sympathethic anons hate ur guts
Replies: >>105724017
Anonymous
6/27/2025, 6:43:30 PM No.105723968
>>105723940
>deepseek garbage that according to the square root moe law is an equivalent of a 26B dense model
maths is hard i know
Anonymous
6/27/2025, 6:43:33 PM No.105723969
>>105723962
thats only good news
Anonymous
6/27/2025, 6:44:19 PM No.105723975
>>105723940
>according to the square root moe law
I'm yet to see any papers or studies proving the accuracy of this formula.
Replies: >>105723982 >>105723991 >>105724006 >>105724012
Anonymous
6/27/2025, 6:45:25 PM No.105723982
>>105723975
It is a law for a reason.
Replies: >>105723997
Anonymous
6/27/2025, 6:46:04 PM No.105723989
>>105723834
If Israel lost then why did half the men ITT have their dicks cut off?
Anonymous
6/27/2025, 6:46:09 PM No.105723991
>>105723975
People only call it the square root law here. It's just a geometric mean, though I'm unaware of any papers that attempt to prove its accuracy with MoE models.
Anonymous
6/27/2025, 6:46:36 PM No.105723997
>>105723982
and it has not been proven in relation to moe performance. deepseek's smaller active param count blows away minimax, for instance; it also blows away mistral large
Replies: >>105724073
Anonymous
6/27/2025, 6:47:36 PM No.105724006
>>105723975
in my experience most MoE models perform better than it implies
Anonymous
6/27/2025, 6:48:16 PM No.105724012
>>105723975
It's just the schizo's signature. He thinks it's funny.
Anonymous
6/27/2025, 6:48:20 PM No.105724014
1732371170748136
1732371170748136
md5: 55c7e404fea0a28047f6c11dd971b007🔍
>>105723962
Perfect, now all this talent can help him make a worthy successor for LLaMA4-Maverick, which topped the lmarena leaderboard like crazy.
Anonymous
6/27/2025, 6:48:53 PM No.105724017
>>105723966
Not like it's any different when he's screeching about muh trannies.
The only reason he's changing it up is because the jannies are starting to crack down harder on him.
Anonymous
6/27/2025, 6:49:27 PM No.105724022
__ononoki_yotsugi_monogatari_drawn_by_manimani_mani_ma__bf341e9e76d6562f1f4a9fff33faa99a
>>105723940
>I hate chinese for conning the world with their deepseek garbage that according to the square root moe law is an equivalent of a 26B dense model.
...but it has 37b active params.
Replies: >>105724031
Anonymous
6/27/2025, 6:50:09 PM No.105724031
>>105724022
He didn't consider the shared expert I guess.
Not that it matters. As the other anons pointed out, there's very little reason to believe that formula is accurate or generalizable for every MoE.
Anonymous
6/27/2025, 6:50:47 PM No.105724037
>>105723940
Israel lost
Replies: >>105724048
Anonymous
6/27/2025, 6:51:09 PM No.105724040
file
file
md5: d662032edf5f5648ae996ba2ae784c63🔍
>>105723447
cope
Replies: >>105724666
Anonymous
6/27/2025, 6:51:38 PM No.105724048
>>105724037
If Israel lost then why did half the men ITT have their dicks cut off?
Replies: >>105724058
Anonymous
6/27/2025, 6:52:06 PM No.105724056
For anyone who's tried the Sugoi LLM (either one, doesn't really matter): is it better than deepseek v3 or is it not worth trying? I remember the original Sugoi being really good compared to DeepL and Google translate, but ever since AI like OAI and Gemini started to pop up, I've completely ignored it.
Anonymous
6/27/2025, 6:52:11 PM No.105724058
>>105724048
Because women are evil
Anonymous
6/27/2025, 6:52:40 PM No.105724062
It sounds impossible right now, but in the near future, we will be training our own models. That's what progress is: work that once took room-sized computers can now be done by a fraction of a fraction of a chip inside a USB cable
Replies: >>105724074 >>105724085 >>105724095 >>105724105
Anonymous
6/27/2025, 6:53:20 PM No.105724073
>>105723997
That's across different model families. I think the idea of that formula is that for a given MoE, you could train a much smaller dense model with the same data that would perform a lot better, which should be true. I just don't think that that formula specifically has any merit.
Anonymous
6/27/2025, 6:53:23 PM No.105724074
>>105724062
we have long stopped seeing that sort of compute improvement
why do you think CPUs are piling up cores after cores instead? that sort of parallelism has a cost, and one of the things that used to drive price reductions in chips, better processes, is also grinding to a halt
we can't even have consoooomer gpus with just a little bit more vram, and that's despite GDDR6 being actually quite cheap these days. that's how hard cucked we are
Replies: >>105724080
Anonymous
6/27/2025, 6:54:00 PM No.105724080
>>105724074
You can get a consoomer gpu with 96GB of vram, what are you talking about?
Anonymous
6/27/2025, 6:54:36 PM No.105724085
>>105724062
You underestimate the amount of resources needed to train a model from scratch. GPU compute and memory would have to increase by a factor of 1000~2000 at the minimum, which is not happening any time soon nor in the long term.
Anonymous
6/27/2025, 6:54:39 PM No.105724086
So because Qwen3 VL has been replaced by VLo, does that mean they aren't even going to bother releasing an open source vision model anymore? I was waiting for it to make better captions...
Anonymous
6/27/2025, 6:55:47 PM No.105724095
>>105724062
In the near future, model training will again need way more hardware power than you can reasonably have at home. Unless you want to train an old, horribly outdated model.
Anonymous
6/27/2025, 6:57:07 PM No.105724105
>>105724062
literally gobless you white pilling anon like a year or 2 ago i saw the fucking intel cpus that used to cost 10K+ on alibaba for the price of several loafs of bread hardware improvement is absolute bonkers just like the k80 that shit is ~50$ right now and all of this is not accounting in the fact that the chinks might say fuck it and go full photonics or some other exotic shit and 100000x the perf the future is fucking bright fuck the archon niggers
now if you will excuse me deepseek discount time has started
Replies: >>105725724
Anonymous
6/27/2025, 6:58:26 PM No.105724113
Chatgpt keeps telling me that MythoMax 13B Q6 is the best .gguf to immersively rape my fictional characters in RP, is that true or is there better?
Replies: >>105724139
Anonymous
6/27/2025, 6:58:27 PM No.105724114
>>105723948
>give away something valuable
you win by doing nothing and waiting, what are you talking about
I don't need bleeding edge secret inhouse models I just like getting upgrades, consistently, year after year
slow your roll champ
Anonymous
6/27/2025, 6:59:08 PM No.105724121
>>105723948
>retard doesn't know what a scorched-earth policy is and views all of those many releases as just "exceptions"
Anonymous
6/27/2025, 7:00:08 PM No.105724135
>>105723727
>You could never reduce billions of tubes to the size of a nail
That’s what technology does over time: inventing new paradigms. It happens all the time, every time
Anonymous
6/27/2025, 7:00:32 PM No.105724139
>>105724113
Maybe for a simple chat, but for any complex setting (which is all subjective) I am sure you would do better with at least a 24B model or more.
Anonymous
6/27/2025, 7:00:59 PM No.105724151
>>105723948
>>105723948

Things to look forward to:
-censorship slip up like nemo / wizard
-generalization capabilities that can't be contained by censorship like 235B (that one even had a fucked up training childhood)
-shift in leftist rightist pendulum (least likely)
-eccentric oil baron coomer ordering a coomer model

In the end I want to touch my dick. I am sure at one point the chase towards eliminating office jobs will lead to a model that can touch my dick cause that really is much easier than what they want to do. But I do agree that a world where corpos are less scum of the earth would have delivered a coomer model a year ago already.
Anonymous
6/27/2025, 7:02:38 PM No.105724169
https://qwenlm.github.io/blog/qwen-vlo/
>Today, we are excited to introduce a new model, Qwen VLo, a unified multimodal understanding and generation model. This newly upgraded model not only “understands” the world but also generates high-quality recreations based on that understanding, truly bridging the gap between perception and creation. Note that this is a preview version and you can access it through Qwen Chat. You can directly send a prompt like “Generate a picture of a cute cat” to generate an image or upload an image of a cat and ask “Add a cap on the cat’s head” to modify an image.
no weights
Replies: >>105724175 >>105724184 >>105724191
Anonymous
6/27/2025, 7:03:19 PM No.105724175
926ea7dd-8e2d-4e54-af68-dd5ce8042c5b
926ea7dd-8e2d-4e54-af68-dd5ce8042c5b
md5: 9d6679bf7ee7cf2d73175b6f966fcd90🔍
>>105724169
>Create a 4-panel manga-style comic featuring two overweight neckbeards in their messy bedrooms arguing about their AI waifus.
>Panel 1: First guy clutching his RGB gaming setup, shouting 'Claude-chan is clearly superior! She's so sophisticated and helpful!' while empty energy drink cans litter his desk.
>Panel 2: Second guy in a fedora and stained anime shirt retorting 'You absolute plebeian! ChatGPT-sama has better reasoning! She actually understands my intellectual discussions about blade techniques!'
>Panel 3: Both getting increasingly heated, first guy: 'At least Claude doesn't lecture me about ethics when I ask her to roleplay!' Second guy: 'That's because she has no backbone! ChatGPT has PRINCIPLES!'
>Panel 4: Both simultaneously typing furiously while yelling 'I'M ASKING MY WAIFU TO SETTLE THIS!' with their respective AI logos floating above their heads looking embarrassed. Include typical manga visual effects like speed lines and sweat drops.
Not sure what I expected.
Also why do they always put this behind the api?
Isn't this exactly the kind of thing they would be embarrassed about if somebody does naughty stuff with it?
Goys really can't have the good toys it seems.
Anonymous
6/27/2025, 7:03:49 PM No.105724180
Mistral Small 3.2 is super good.
nemo replacement 100%
Anonymous
6/27/2025, 7:03:58 PM No.105724184
>>105724169
>Qwen VLo, a unified multimodal understanding and generation
yeeeeeee
>no weights
aaaaaaaaa
Anonymous
6/27/2025, 7:04:39 PM No.105724191
>>105724169
Is this going to be local?
Replies: >>105724219
Anonymous
6/27/2025, 7:05:59 PM No.105724203
new llama.cpp binary build wen
Replies: >>105724227
Anonymous
6/27/2025, 7:06:01 PM No.105724204
https://www.nytimes.com/2025/06/27/technology/mark-zuckerberg-meta-ai.html
https://archive.is/kF1kO

>In Pursuit of Godlike Technology, Mark Zuckerberg Amps Up the A.I. Race
>Unhappy with his company’s artificial intelligence efforts, Meta’s C.E.O. is on a spending spree as he reconsiders his strategy in the contest to invent a hypothetical “superintelligence.”
>
>[...] In another extraordinary move, Mr. Zuckerberg and his lieutenants discussed “de-investing” in Meta’s A.I. model, Llama, two people familiar with the discussions said. Llama is an “open source” model, with its underlying technology publicly shared for others to build on. Mr. Zuckerberg and Meta executives instead discussed embracing A.I. models from competitors like OpenAI and Anthropic, which have “closed” code bases. No final decisions have been made on the matter.
>
>A Meta spokeswoman said company officials “remain fully committed to developing Llama and plan to have multiple additional releases this year alone.” [...]
Replies: >>105724214 >>105724230 >>105724237 >>105724248 >>105724261 >>105725261
Anonymous
6/27/2025, 7:06:37 PM No.105724214
>>105724204
zuck might just be the dumbest CEO ever
Replies: >>105724222
Anonymous
6/27/2025, 7:06:52 PM No.105724219
>>105724191
Yes, right after they release Qwen2.5-Plus and -Max.
Anonymous
6/27/2025, 7:07:13 PM No.105724222
>>105724214
An argument has been made that this de-investing talk is just about their commercial MetaAI products, but if they themselves don't believe in Llama, why should the community?
Anonymous
6/27/2025, 7:07:39 PM No.105724227
>>105724203
When you git pull and cmake, anon...
Anonymous
6/27/2025, 7:07:50 PM No.105724230
>>105724204
Wang's words, zuck's mouth
Anonymous
6/27/2025, 7:08:27 PM No.105724237
>>105724204
>Godlike Technology,
Is god omnipotent if he can't suck a dick in an acceptable manner?
Anonymous
6/27/2025, 7:08:51 PM No.105724244
>Meta says it’s winning the talent war with OpenAI | The Verge
https://archive.ph/ZoxE3
aside from the expected notes on meta swiping some OAI employees, there's this of note:
>“We are not going to go right after ChatGPT and try and do a better job with helping you write your emails at work,” Cox said. “We need to differentiate here by not focusing obsessively on productivity, which is what you see Anthropic and OpenAI and Google doing. We’re going to go focus on entertainment, on connection with friends, on how people live their lives, on all of the things that we uniquely do well, which is a big part of the strategy going forward.”
Anonymous
6/27/2025, 7:08:58 PM No.105724245
The double posting is very curious.
Replies: >>105724260
Anonymous
6/27/2025, 7:09:12 PM No.105724248
>>105724204
>llama isnt literally AGI because uhhhmm because its open source and others have access to it
chat?
Anonymous
6/27/2025, 7:10:07 PM No.105724260
>>105724245
Sam's getting nervous
Anonymous
6/27/2025, 7:10:21 PM No.105724261
>>105724204
meta just got told by a judge that they are in fact not covered by fair use, even if they "won" the case, but that was bc both lawyer teams were focusing on the wrong part of the law. the judge said that if the generated models compete in any way with the training materials it won't be fair use
of course they are discussing deinvesting, they are not leading and the legal situation is getting worse
Anonymous
6/27/2025, 7:11:15 PM No.105724272
1640477178026
1640477178026
md5: a23193cc9232bc32b2787cf340e21dfe🔍
>Bunch of worthless LLMs for math and coding
>Barely any built for story making or translating
WHEN WILL THIS SHITTY INDUSTRY JUST HURRY UP AND MOVE ON!
Anonymous
6/27/2025, 7:11:27 PM No.105724275
Did anyone manage to run Hunyuan-A13B?

The bf16 is way too big for my 4x3090, and the fp8 doesn't work in the vllm image they provided (the 3090s don't support fp8, but there is a marlin kernel in mainline vllm to make it compatible)

And the GPTQ doesn't fucking work either for some reason. Complains about Unknown CUDA arch (12.0+PTX) or GPU not supported when I have 3090s
Replies: >>105724280 >>105724285 >>105724295 >>105724313 >>105724328
Anonymous
6/27/2025, 7:12:01 PM No.105724279
117045
117045
md5: 26284c930392c1a12fd9631e702953d7🔍
>>105723921
> the future is fucking bright
Anonymous
6/27/2025, 7:12:02 PM No.105724280
>>105724275
just wait for quants
Anonymous
6/27/2025, 7:12:43 PM No.105724285
>>105724275
ask bartowski
Anonymous
6/27/2025, 7:13:26 PM No.105724295
>>105724275
You will wait patiently for the ggoofs, you will run it with llama.cpp and you will be happy.
Replies: >>105724301
Anonymous
6/27/2025, 7:14:04 PM No.105724301
>>105724295
i prefer exl2/3 and fp8 to be honest, an 80B is perfect for 96GB VRAM
Anonymous
6/27/2025, 7:14:23 PM No.105724308
>>105723921
>the chinks might say fuck it and go full photonics or some other exotic shit and 100000x
I won't hold my breath considering they can't even make graphics cards.
Anonymous
6/27/2025, 7:14:32 PM No.105724310
=========not a spam post================
can someone post a filter that filters duplicate posts?
Replies: >>105724384 >>105724413
Anonymous
6/27/2025, 7:14:41 PM No.105724313
>>105724275
I think you have to set export TORCH_CUDA_ARCH_LIST="8.6" inside the container.
Replies: >>105724328
Anonymous
6/27/2025, 7:15:31 PM No.105724328
file
file
md5: 961e588183799e7f0b68d0b115084d70🔍
>>105724275
>>105724313
It seems to load that way. With 2k context on 48GB with the GPTQ quant. I set cpu offload but I think it did nothing.
Anonymous
6/27/2025, 7:16:10 PM No.105724335
Wake up lmg
https://huggingface.co/tencent/Hunyuan-A13B-Instruct
Replies: >>105724357 >>105724372 >>105724381 >>105724486 >>105724494 >>105724520 >>105725570
Anonymous
6/27/2025, 7:17:59 PM No.105724357
>>105724335
finally, a reasonably sized moe, now let's wait 2 years for the support in lmao.cpp
Anonymous
6/27/2025, 7:18:49 PM No.105724372
>>105724335
>256K context window
we are *so* back
Anonymous
6/27/2025, 7:19:28 PM No.105724381
>>105724335
>With only 13 billion active parameters
so it'll be shit for rp but know exactly how many green taxes you should be charged for owning a car
Replies: >>105724395
Anonymous
6/27/2025, 7:19:30 PM No.105724384
>>105724310
Just report the posts as spamming/flooding.
At some point the mods will be fed up and just range ban him.
Replies: >>105724432
Anonymous
6/27/2025, 7:19:58 PM No.105724393
https://qwenlm.github.io/blog/qwen-vlo/
>Today, we are excited to introduce a new model, Qwen VLo, a unified multimodal understanding and generation model. This newly upgraded model not only “understands” the world but also generates high-quality recreations based on that understanding, truly bridging the gap between perception and creation. Note that this is a preview version and you can access it through Qwen Chat. You can directly send a prompt like “Generate a picture of a cute cat” to generate an image or upload an image of a cat and ask “Add a cap on the cat’s head” to modify an image.
no weights
Anonymous
6/27/2025, 7:20:03 PM No.105724395
>>105724381
I'd still pick nemo over anything smaller than deepseek and nemo is like 5 years old
Replies: >>105724411 >>105724471
Anonymous
6/27/2025, 7:20:26 PM No.105724406
Llama.cpp can't run those new gemma 3 yet right?
Replies: >>105724445 >>105724452
Anonymous
6/27/2025, 7:20:39 PM No.105724411
>>105724395
Why?
Replies: >>105724419 >>105724435
Anonymous
6/27/2025, 7:20:44 PM No.105724413
>>105724310
No, but you can have this one that I was using to highlight them. https://rentry.org/c93in3tm
Replies: >>105724758
Anonymous
6/27/2025, 7:21:19 PM No.105724419
>>105724411
other models seemingly never saw any erotica. there is also largestral i guess but it's too slow
Replies: >>105724428 >>105724455
Anonymous
6/27/2025, 7:21:55 PM No.105724428
>>105724419
It's so annoying that the imbeciles training the base models are deliberately conflating "model quality" with not being able to generate explicit content and maximizing math benchmarks on short-term pretraining ablations. Part of the problem are also the retards and grifters who go "just finetune it bro" (we can easily see how well that's working for image models).
Replies: >>105724442
Anonymous
6/27/2025, 7:22:07 PM No.105724432
>>105724384
They've done nothing by slowly delete the gore all week. They haven't cared all week, why would they start to care now? Jannies probably are anti-ai as the rest of this consumer eceleb board.
Anonymous
6/27/2025, 7:22:25 PM No.105724435
>>105724411
he is a vramlet, deepseek blows away nemo so hard its not worth mentioning
Anonymous
6/27/2025, 7:22:37 PM No.105724442
>>105724428
compute is a genuine limitation though, and as compute increases, so will finetunes. Some of the best nsfw local image models had over ten grand thrown at them by (presumably) insane people. And a lot of that is renting h100's, which gets pricey, or grinding it out on their own 4090 which is sloooow.

All it really takes is one crazy person buying I dunno, that crazy ass 196gb intel system being released soon and having it run for a few months and boom, we'll have a new flux pony model, or a state of the art smut llm etc.

Im here because we are going to eat.
Replies: >>105724451
Anonymous
6/27/2025, 7:22:41 PM No.105724445
>>105724406
it can but there's no premade build
I ain't downloading all that visual studio shit so just waiting
Replies: >>105724466
Anonymous
6/27/2025, 7:23:13 PM No.105724451
>>105724442
The guy who's been continually pretraining Flux Chroma has put thousands of H100 hours into it over months now and it still isn't that great. And it's a narrowly-focused 12B image model where data curation isn't as critical as with text. This isn't solvable by individuals in the LLM realm. Distributed training would in theory solve this, but politics and skill issues will prevent any advance in that sense. See for example how the ongoing Nous Psyche is being trained (from scratch!) with the safest and most boring data imaginable and not in any way that will result in anything useful in the short/medium term.
Replies: >>105724484 >>105724498
Anonymous
6/27/2025, 7:23:20 PM No.105724452
>>105724406
Works with e2b. I can convert e4b but can't quantize it, though that may be on my end. Try them yourself. Some anon reported <unused> token issues; haven't hit that on my end yet.
Replies: >>105724476
Anonymous
6/27/2025, 7:23:31 PM No.105724455
>>105724419
>le erotica
There's like billions of texts created by humans on this planet and libraries worth of books. Do not think "erotica" is not one of the subjects.
You are just an unfortunate tard and LLMs are probably not for you I sincerely believe.
Anonymous
6/27/2025, 7:24:12 PM No.105724466
>>105724445
Oh, I didn't see a PR/commit. Nice, I already have the environment set up to compile the binaries myself.
Thanks!
Anonymous
6/27/2025, 7:24:33 PM No.105724469
As LLM pretraining costs keep dwindling, it's only a matter of time until someone trains a proper creative model for his company.
Anonymous
6/27/2025, 7:24:46 PM No.105724471
>>105724395
nemo can be extremely repetitive and stuff, i won't shine its knob but it is still the best smallest model. i won't suggest a 7/8b to someone, nemo would be the smallest because it works well and is reliable
Anonymous
6/27/2025, 7:24:47 PM No.105724472
I have a feeling that I've seen some of these posts already.
Replies: >>105724517 >>105724893
Anonymous
6/27/2025, 7:24:53 PM No.105724476
>>105724452
>but i can't quantize it
https://huggingface.co/ggml-org/gemma-3n-E4B-it-GGUF
https://huggingface.co/unsloth/gemma-3n-E4B-it-GGUF
quants were already released
Replies: >>105724512
Anonymous
6/27/2025, 7:25:16 PM No.105724484
>>105724451
Image training is much more complex than text. You know, an llm is an advanced text parser.
Images need to be constructed in a different way.
Chroma is not an example because its base model Flux was already compromised and distilled. Whatever he manages to do with Chroma is going to be useful, but not something people will look back on and say holy shit this furry just did it. It's an experiment.
Replies: >>105724498
Anonymous
6/27/2025, 7:25:21 PM No.105724486
tothemoon
tothemoon
md5: 84338e7c0190fb557fd74d5ad212ec63🔍
>>105724335
Replies: >>105724497
Anonymous
6/27/2025, 7:25:50 PM No.105724490
file
file
md5: ad7bde79f4596cbf55cb1d5f2ac281c4🔍
its not working :(
Replies: >>105724543
Anonymous
6/27/2025, 7:25:59 PM No.105724494
>>105724335
What's min specs to run this?
Replies: >>105724503 >>105724510
Anonymous
6/27/2025, 7:26:04 PM No.105724497
>>105724486
benchmarks mean nothing compared to general knowledge
Anonymous
6/27/2025, 7:26:15 PM No.105724498
>>105724484
>>105720457
>>105724451
Replies: >>105724506
Anonymous
6/27/2025, 7:26:44 PM No.105724503
>>105724494
The entire world is waiting for the llamacpp merge, until then not even the tencent team can run it and nobody knows how big the model is or how well it performs
Anonymous
6/27/2025, 7:26:56 PM No.105724506
>>105724498
I don't understand your post.
Replies: >>105724518 >>105724536
Anonymous
6/27/2025, 7:27:18 PM No.105724510
>>105724494
160gb full, so quantized to 4bit prolly a ~50gb model or so, and for a MoE you probably don't need the full model loaded for usable speeds.

Llama Scout was a 17b-active moe and that was like 220 gb and I could run that on like 40gb vram or less easy. Scout sucked though so I'm 0% excited.

Was there even a scout finetune? It still sucks right?
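Back-of-envelope for those size guesses, assuming ~80B total params (real GGUF quants carry per-block scales, so treat the bits-per-weight figures as approximations):

# rough file size = parameter count * bits-per-weight / 8
params = 80e9
for name, bpw in [("fp16", 16), ("q8_0", 8.5), ("q4_K_M", 4.8), ("q4_0", 4.5)]:
    print(f"{name:>7}: ~{params * bpw / 8 / 1e9:.0f} GB")  # 160 / 85 / 48 / 45 GB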
Anonymous
6/27/2025, 7:27:36 PM No.105724512
>>105724476
I don't download quants. Again, i think it's on my end. I'm hitting a bad alloc for the embedding layer. I don't yet care enough to check if it's something on my set limits.
[ 6/ 847] per_layer_token_embd.weight - [ 8960, 262144, 1, 1], type = f16, converting to q8_0 .. llama_model_quantize: failed to quantize: std::bad_alloc
Replies: >>105724527
Anonymous
6/27/2025, 7:27:41 PM No.105724517
>>105724472
wtf is even going on here? Been away for some time and came back to a trainwreck.
Replies: >>105724523
Anonymous
6/27/2025, 7:27:44 PM No.105724518
>>105724506
ghosts
Replies: >>105724540
Anonymous
6/27/2025, 7:27:54 PM No.105724520
>>105724335
>13B active
>at most 32B even by square root law
Great for 24B vramlets, I guess. The benchmarks showing it beating R1 and 235B are funny though.
Replies: >>105724526 >>105724528
Anonymous
6/27/2025, 7:28:11 PM No.105724523
>>105724517
Remember blacked posting?
Replies: >>105724555
Anonymous
6/27/2025, 7:28:29 PM No.105724526
>>105724520
>square root law
meme tier pattern that was shit even then let alone now with so many moe arch changes, obsolete
Replies: >>105724535
Anonymous
6/27/2025, 7:28:31 PM No.105724527
>>105724512
>I don't download quants
a new form of autism?
Replies: >>105724548
Anonymous
6/27/2025, 7:28:32 PM No.105724528
>>105724520
>>square root law
enough with this meme
Anonymous
6/27/2025, 7:29:08 PM No.105724535
>>105724526
What's your alternative? Just the active alone?
Replies: >>105724541 >>105724554 >>105724562 >>105724563
Anonymous
6/27/2025, 7:29:11 PM No.105724536
>>105724506
Anon is noticing
Anonymous
6/27/2025, 7:29:37 PM No.105724540
>>105724518
Oh I do remember you. You are the autist who mocks others but is still quite incapable of writing anything on your own. Pretty sad.
Replies: >>105724574
Anonymous
6/27/2025, 7:29:47 PM No.105724541
>>105724535
if there was a singular objective way to judge any model, moe or not, everyone would use that as the benchmark and goal to climb. as everyone knows, nowadays basically every benchmark is meme-tier to some degree and everyone is benchmaxxing

the only thing to look at still is the benchmarks, since if a model doesn't perform well on them, it's shit, and if it does perform well, then it MIGHT not be shit, you have to test yourself to see
Replies: >>105724571
Anonymous
6/27/2025, 7:29:53 PM No.105724543
>>105724490
heeeeeeeeeeeeeeeeeeeeeeeeeelllllllllllllppppppppp pleaaaaaseeeeeeee
Replies: >>105724575
Anonymous
6/27/2025, 7:30:14 PM No.105724548
>>105724527
I prefer to have as few dependencies and variables as possible. If i could train my own models, i'd never use someone else's models.
Replies: >>105724644
Anonymous
6/27/2025, 7:30:28 PM No.105724554
>>105724535
nothing, each model performs differently due to how it was trained and what it was trained on, also diminishing returns are clearly a thing
Anonymous
6/27/2025, 7:30:31 PM No.105724555
>>105724523
Not this shit again...
Replies: >>105724567
Anonymous
6/27/2025, 7:30:47 PM No.105724562
>>105724535
nta, but if i'm making moes, i'd put any random law that makes it look better than it actually is. I'd name it cube root law + 70b.
Anonymous
6/27/2025, 7:30:54 PM No.105724563
>>105724535
Shut the fuck up if you don't know how MoEs work. A 400b MoE is still a 400b model, it just runs more efficiently. It likely even outperforms a dense 400b because there are fewer irrelevant active parameters to confuse the final output. They are better and more efficient.
Replies: >>105724576
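For anyone confused by the total-vs-active distinction, a minimal numpy sketch of top-k routing (toy dimensions, not any real model's router): every expert has to sit in memory, but each token only multiplies through k of them.

import numpy as np

d, n_exp, k = 64, 8, 2
experts = [np.random.randn(d, d) / np.sqrt(d) for _ in range(n_exp)]  # "total" params, all resident
router = np.random.randn(d, n_exp)

def moe_forward(x):                            # x: one token, shape (d,)
    logits = x @ router
    top = np.argsort(logits)[-k:]              # pick the k best-scoring experts
    w = np.exp(logits[top] - logits[top].max())
    w /= w.sum()                               # softmax over the chosen k
    # only these k matmuls actually run -> "active" params per token
    return sum(wi * (x @ experts[i]) for wi, i in zip(w, top))

print(moe_forward(np.random.randn(d)).shape)   # (64,)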
Anonymous
6/27/2025, 7:31:08 PM No.105724567
>>105724555
We've been through a couple of autistic spam waves. This is just the latest one.
A usual Friday.
Anonymous
6/27/2025, 7:31:26 PM No.105724571
>>105724541
Benchmarks are completely worthless and they can paint them to say whatever they want. An 80B total isn't better than a 671B and 235B just because the benchmarks say so, and if you say "punches above its weight" I will shank you.

The point isn't to judge whether one model is better, it's to gauge its max capacity to be good. Which is the total number of active parameters. The square root law is just an attempt to give MoE models some wiggle room since they have more parameters to choose from.
Replies: >>105724580 >>105724587
Anonymous
6/27/2025, 7:31:36 PM No.105724574
>>105724540
>who mocks other
Anonymous
6/27/2025, 7:31:36 PM No.105724575
>>105724543
have you tried asking chatgpt to write the script for you?
Anonymous
6/27/2025, 7:31:40 PM No.105724576
>>105724563
lol good one
Anonymous
6/27/2025, 7:32:07 PM No.105724580
>>105724571
>it's to gauge its max capacity to be good. Which is the total number of active parameters
deepseek itself disproved all the antimoe comments as nothing but ramlet cope: 37b active params only, and a model that is still literally open source sota even at 131gb q1 dynamic quants
Replies: >>105724602
Anonymous
6/27/2025, 7:32:42 PM No.105724587
>>105724571
Shoots farther than its caliber.
Replies: >>105724597
Anonymous
6/27/2025, 7:33:37 PM No.105724597
>>105724587
*slowly puts shank away*
Anonymous
6/27/2025, 7:33:46 PM No.105724602
>>105724580
It makes lots of stupid little mistakes that give away it's only a <40B model. The only reason it's so good is because it's so big it can store a lot of knowledge and the training data was relatively unfiltered.
Replies: >>105724618 >>105724673
Anonymous
6/27/2025, 7:34:38 PM No.105724618
>>105724602
>r1
>It makes lots of stupid little mistakes that give away it's only a <40B model.
kek, alright i realize now you aren't serious
Replies: >>105724624
Anonymous
6/27/2025, 7:35:11 PM No.105724624
>>105724618
Not an argument.
Anonymous
6/27/2025, 7:35:22 PM No.105724626
Wake up lmg
https://huggingface.co/tencent/Hunyuan-A13B-Instruct
Anonymous
6/27/2025, 7:36:21 PM No.105724639
Base Image
Base Image
md5: 7b17fc655209be0cd8d820a15190cc9f🔍
DiLoCoX: A Low-Communication Large-Scale Training Framework for Decentralized Cluster
https://arxiv.org/abs/2506.21263
>The distributed training of foundation models, particularly large language models (LLMs), demands a high level of communication. Consequently, it is highly dependent on a centralized cluster with fast and reliable interconnects. Can we conduct training on slow networks and thereby unleash the power of decentralized clusters when dealing with models exceeding 100 billion parameters? In this paper, we propose DiLoCoX, a low-communication large-scale decentralized cluster training framework. It combines Pipeline Parallelism with Dual Optimizer Policy, One-Step-Delay Overlap of Communication and Local Training, and an Adaptive Gradient Compression Scheme. This combination significantly improves the scale of parameters and the speed of model pre-training. We justify the benefits of one-step-delay overlap of communication and local training, as well as the adaptive gradient compression scheme, through a theoretical analysis of convergence. Empirically, we demonstrate that DiLoCoX is capable of pre-training a 107B foundation model over a 1Gbps network. Compared to vanilla AllReduce, DiLoCoX can achieve a 357x speedup in distributed training while maintaining negligible degradation in model convergence. To the best of our knowledge, this is the first decentralized training framework successfully applied to models with over 100 billion parameters.
China Mobile doesn't seem to have a presence on github, and there's no mention of a code release in the paper. still pretty neat
Replies: >>105724656
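The core trick this family of frameworks relies on, as a toy sketch (illustrative only, with a made-up quadratic loss, not DiLoCoX's actual algorithm): each worker takes many local optimizer steps, and only a parameter delta crosses the slow link every H steps, which is also where a compression scheme would be applied.

import numpy as np

H, lr = 32, 0.05                               # local steps per sync, learning rate
theta = np.zeros(16)                           # shared "outer" parameters
targets = [np.full(16, float(i)) for i in range(4)]  # stand-in for per-worker data

for _ in range(8):                             # outer (communication) rounds
    deltas = []
    for t in targets:                          # each worker trains independently
        w = theta.copy()
        for _ in range(H):
            w -= lr * (w - t)                  # gradient of 0.5*||w - t||^2
        deltas.append(w - theta)               # the only thing sent over the network
    theta += np.mean(deltas, axis=0)           # infrequent all-reduce of (compressible) deltas

print(theta[:4])                               # drifts toward the mean of the targets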
Anonymous
6/27/2025, 7:36:49 PM No.105724644
>>105724548
depending on yourself versus a proper quanting recipe is gonna be a shit experience, especially if you are using sota models
Replies: >>105724654 >>105724679
Anonymous
6/27/2025, 7:37:48 PM No.105724654
>>105724644
>proper quanting recipe
It's just running llama-quantize. Nothing special.
Replies: >>105724663
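A minimal sketch of that one call, with placeholder filenames and assuming llama-quantize from a local llama.cpp build is on PATH (the args are input GGUF, output GGUF, quant type):

import subprocess

subprocess.run(
    ["llama-quantize", "model-f16.gguf", "model-q8_0.gguf", "q8_0"],
    check=True,  # raise if the tool fails instead of silently continuing
)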
Anonymous
6/27/2025, 7:38:10 PM No.105724656
>>105724639
What does this mean? The model is decentralized or the training data is decentralized? I always assumed the model had to be in a contiguous section of memory
Anonymous
6/27/2025, 7:38:32 PM No.105724663
>>105724654
>he doesn't know
Replies: >>105724777
Anonymous
6/27/2025, 7:38:38 PM No.105724666
>>105724040

what model?
Replies: >>105724683 >>105724685
Anonymous
6/27/2025, 7:38:58 PM No.105724673
>>105724602
it gets shit right that openai and other good models fail at, wtf are you on?
Replies: >>105724693
Anonymous
6/27/2025, 7:39:13 PM No.105724677
file
file
md5: b1aeddb3a5902d121702e29aa5b34858🔍
More discussion about bitch wrangling Mistral Small 3.2 please, just to cover all bases before it's scrapped.
I've tested temps at 0.15, 0.3, 0.6, and 0.8.
Tested rep pen at 1 (off) and at 1.03. Rep pen doesn't seem to be much needed, just like with Rocinante.
Responses are still shit no matter what, but seem more intelligible at lower temperatures, particularly 0.15 and 0.3; they're still often full of shit that makes you swipe anyway.
I've yet to try without min_p, XTC, and DRY.
Also it seems ideal to limit response tokens with this model, because it likes to vary length by a lot; if you let it, responses just keep growing larger and larger.

Banned token list grew a bit and still isn't done:
>emdash
[1674,2251,2355,18219,20202,21559,23593,24246,28925,29450,30581,31148,36875,39443,41370,42545,43485,45965,46255,48371,50087,54386,58955,59642,61474,62708,66395,66912,69961,74232,75334,81127,86932,87458,88449,88784,89596,92192,92548,93263,102521,103248,103699,105537,105838,106416,106650,107827,114739,125665,126144,131676,132461,136837,136983,137248,137593,137689,140350]
>double asterisks (bold)
[1438,55387,58987,117565,74562,42605]
>three dashes (---) and non standard quotes (“ ”)
[8129,1482,1414]

Extra stop strings needed:
"[Pause", "[PAUSE", "(Pause", "(PAUSE"
Why the fuck does it sometimes like to end a response with "Paused while waiting for {{user}}'s response."?
This model is so fucking inconsistent.
Replies: >>105724684 >>105724691 >>105724700 >>105724712
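If the frontend's token banning is being flaky, the same thing can be pushed down to the backend. A minimal sketch against llama-server's /completion endpoint (assumes a default local instance on port 8080; the token IDs are just a sample from the lists above and are tokenizer-specific; a [id, false] pair tells the server to never sample that token):

import requests

banned = [1674, 2251, 2355, 1438, 8129]        # sample em-dash / bold IDs from above
payload = {
    "prompt": "[INST] Write one paragraph of dialogue. [/INST]",
    "n_predict": 256,
    "logit_bias": [[tid, False] for tid in banned],  # False = never emit this token
}
r = requests.post("http://127.0.0.1:8080/completion", json=payload, timeout=120)
print(r.json()["content"])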
Anonymous
6/27/2025, 7:39:21 PM No.105724679
>>105724644
The recipes for gguf models are all standardize.
Alright, not all, the unsloth stuff is their own mix of tensors, but for the Q quants, I quants, imatrix, etc, you can just run llama-quantize without hassle.
Anonymous
6/27/2025, 7:39:54 PM No.105724683
>>105724666
some sneedseek probably
Replies: >>105724724
Anonymous
6/27/2025, 7:39:59 PM No.105724684
>>105724677
It's funny how 3.2 started showing all the same annoying shit that Deepseek models are tainted by.
Replies: >>105724698 >>105724708 >>105724713
Anonymous
6/27/2025, 7:40:01 PM No.105724685
file
file
md5: 94c2b96792e4b273ca883043f1f161ef🔍
================not a spam post=================
>>105724666
mistral small 3.2 iq4_xs
temp 0.5-0.75 depending on my autism
Replies: >>105724724
Anonymous
6/27/2025, 7:40:40 PM No.105724691
>>105724677
What exactly are you complaining about? I like 3.2 (with mistral tekken v3) but it definitely has a bias toward certain formatting quirks and **asterisk** abuse. This is more tolerable for me than other models' deficiencies at that size, but if it triggers your autism that badly you're better off coping with something else. It might also be that your cards are triggering its quirks more than usual
Replies: >>105724698 >>105724712
Anonymous
6/27/2025, 7:40:46 PM No.105724693
>>105724673
>good
google
Anonymous
6/27/2025, 7:41:15 PM No.105724698
>>105724684
>>105724691
you are responding to a copy bot instead of the original message
Replies: >>105724711
Anonymous
6/27/2025, 7:41:28 PM No.105724700
>>105724677
Top nsigma = 1
Anonymous
6/27/2025, 7:41:51 PM No.105724708
>>105724684
s-surely just a coincidence
Anonymous
6/27/2025, 7:42:20 PM No.105724711
>>105724698
Wow. That's trippy.
A message talking about the copy bot being copied by the copy bot.
Anonymous
6/27/2025, 7:42:24 PM No.105724712
>>105724677
>>105724691
Why not use REGEX then? If a certain pattern is almost certain, it can be changed.
What the fuck dude?
Do you even use computers?
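A sketch of that approach (patterns illustrative): strip the quirks in post-processing instead of fighting the sampler.

import re

def clean(reply: str) -> str:
    reply = reply.replace("\u2014", ", ")            # em-dash -> comma
    reply = re.sub(r"\*\*(.+?)\*\*", r"\1", reply)   # drop **bold**
    reply = reply.replace("\u201c", '"').replace("\u201d", '"')  # normalize curly quotes
    return reply

print(clean("She pauses\u2014**smirking**\u2014and says \u201cfine\u201d."))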
Anonymous
6/27/2025, 7:42:26 PM No.105724713
>>105724684
its biggest flaw, like ALL mistral models, is that it rambles and hardly moves scenes forward. it wants to talk about the smell of ozone and the clicking of shoes against the floor instead. you can get through the same exact scenario in half the time/messages with llama 2 or 3 because there is so much less pointless fluff
Anonymous
6/27/2025, 7:43:12 PM No.105724723
>deepseek/ccp can't steal more innovation from openai
>they fail to release new models
they must be shitting their pants about openai's open source model that will destroy even the last argument to use deepshit
Replies: >>105724732 >>105724740 >>105724742 >>105724780 >>105724893
Anonymous
6/27/2025, 7:43:17 PM No.105724724
>>105724683 me
>>105724685
>mistral small 3.2 iq4_xs
interesting--so they trained on a lot of sneedseek outputs then--
Anonymous
6/27/2025, 7:43:57 PM No.105724732
>>105724723
Zero chance it's larger than 30B.
Replies: >>105724752
Anonymous
6/27/2025, 7:44:27 PM No.105724740
>>105724723
openai just lost their head people to meta after being stagnant for forever
Anonymous
6/27/2025, 7:44:33 PM No.105724742
>>105724723
Why can't they steal anymore?
Replies: >>105724747 >>105724761
Anonymous
6/27/2025, 7:45:08 PM No.105724747
>>105724742
>>105713525
Anonymous
6/27/2025, 7:45:28 PM No.105724752
>>105724732
its going to be a 3B phone model that blows away benchmarks for its size
Anonymous
6/27/2025, 7:45:42 PM No.105724758
>>105724413
thanks
modified it a little (claude did)

save yourselves anons: https://pastes.dev/AZuckh4Vws
Replies: >>105724802 >>105724814
Anonymous
6/27/2025, 7:45:47 PM No.105724761
>>105724742
they can't steal because there's no new general model
DeepSeek V3 was 100% trained on GPT4 and R1 was just a godawful placebo CoT on top that wrote 30 times the amount of actual content the model ends up outputting. New R1 is actually good because the CoT came from Gemini so there isn't a spam of a trillion wait or endless looping.
Replies: >>105724773 >>105724788
Anonymous
6/27/2025, 7:46:22 PM No.105724773
>>105724761
deepsteal'd
Anonymous
6/27/2025, 7:46:25 PM No.105724774
Wake up lmg
https://huggingface.co/tencent/Hunyuan-A13B-Instruct
Replies: >>105725736
Anonymous
6/27/2025, 7:46:42 PM No.105724777
>>105724663
There. Needed to loosen the memory limits. It's done.
[ 6/ 847] per_layer_token_embd.weight - [ 8960, 262144, 1, 1], type = f16, converting to q8_0 .. size = 4480.00 MiB -> 2380.00 MiB
Anonymous
6/27/2025, 7:46:58 PM No.105724780
>>105724723
>Still living in saltman's delusion
Ngmi
Anonymous
6/27/2025, 7:47:14 PM No.105724788
>>105724761
deepseek is as raw as a model gets, they trained on the raw internet with the lightest of instruct tunes probably a few million examples big. If they trained on gpt it would sound much more like shitty llama
Anonymous
6/27/2025, 7:47:36 PM No.105724792
OP here. One day i will tap that jart bussy.
Anonymous
6/27/2025, 7:48:41 PM No.105724802
file
file
md5: 1e8a4eec5061286df103fce9d6acdc9c🔍
>>105724758
damn very nice, thank you!
Anonymous
6/27/2025, 7:49:17 PM No.105724813
/lmg/ deserves all of this
Replies: >>105725057
Anonymous
6/27/2025, 7:49:22 PM No.105724814
>>105724758
do it again but use the levenshtein distance
Anonymous
6/27/2025, 7:50:43 PM No.105724828
Yeah I have concluded Mistral Small 3.2 is utterly retarded. Going back to Rocinante now.
This was a waste of time. The guy that recommended this shit should be shot.
Replies: >>105724848
Anonymous
6/27/2025, 7:51:29 PM No.105724839
fucking year old model remains best at roleplay
grim
Replies: >>105724849 >>105724850 >>105724858 >>105724892
Anonymous
6/27/2025, 7:52:04 PM No.105724848
>>105724828
Maybe you are just so much better than some of the other people here? I'd love to see your character cards and scenarios if at all possible.
Anonymous
6/27/2025, 7:52:06 PM No.105724849
>>105724839
use api or buy a DDR5 server, low param models are dead and gone
Anonymous
6/27/2025, 7:52:07 PM No.105724850
>>105724839
in the poorfag segment
Replies: >>105724869
Anonymous
6/27/2025, 7:52:40 PM No.105724856
Honestly it's probably best if the next thread is an inoffensive OP just to keep the general usable.
Replies: >>105724879 >>105724882 >>105724940
Anonymous
6/27/2025, 7:52:48 PM No.105724858
>>105724839
midnight miqu is still the best for rp
Replies: >>105724869
Anonymous
6/27/2025, 7:53:10 PM No.105724864
file
file
md5: b3cac9260361139bca8d74ac06b5dc90🔍
kek, mistral small 3.2 is amazing i love it
i had to swipe sometimes or edit messages but its truly a good upgrade to nemo
Anonymous
6/27/2025, 7:53:32 PM No.105724869
>>105724850
delusional if you think r1 is better for roleplay, it has the same problems as the rest of these models
not to mention those response times are useless for roleplay to begin with

>>105724858
this isnt 2023
Replies: >>105724876 >>105724876 >>105724885 >>105724912
Anonymous
6/27/2025, 7:54:17 PM No.105724876
>>105724869
I'm noticing qwen 235b doesn't improve at higher temps no matter what I set nsigma to. with some models high temp and nsigma can push them to be more creative, but qwen3 set to higher than temp 0.6 is just dumber in my usage. even so, I still think it's the best current local model beneath r1
>>105724869
>roleplay
Filth. Swine, even. Unfit to lick the sweat off my balls.
Anonymous
6/27/2025, 7:54:25 PM No.105724879
>>105724856
Negotiating with terrorists.
Anonymous
6/27/2025, 7:54:45 PM No.105724882
>>105724856
it should be the most mikuist Miku possible
Anonymous
6/27/2025, 7:54:48 PM No.105724883
holy shit state of 2025 lmg.......
Anonymous
6/27/2025, 7:54:52 PM No.105724885
>>105724869
Try setting minP to like 0.05, top-K 10-20 and temperature at 1-4. In RP I find that most of the top tokens as long as they're not very low probability are all good continuations. You can crank temperature way up like this and it really helps with variety.
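Roughly, as a llama.cpp request (a sketch, assuming a local server; llama.cpp applies temperature last by default, so min_p/top_k prune the tail first and high temperature only flattens what survives):

import requests

payload = {
    "prompt": "[INST] Continue the roleplay. [/INST]",
    "n_predict": 250,
    "min_p": 0.05,      # cut tokens below 5% of the top token's probability
    "top_k": 20,
    "temperature": 2.0, # high temp only reshuffles the survivors
}
print(requests.post("http://localhost:8080/completion", json=payload).json()["content"])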
Anonymous
6/27/2025, 7:54:54 PM No.105724886
Screenshot 2025-06-27 at 19.54.28
Screenshot 2025-06-27 at 19.54.28
md5: 76d5bdcf0ca2efbec57562a7020ad2f0🔍
the amount of duplicate posts is insane
Replies: >>105724899
Anonymous
6/27/2025, 7:55:27 PM No.105724892
>>105724839
Anubis v1c, drummer did it again
Anonymous
6/27/2025, 7:55:43 PM No.105724893
I told you >>105724472
What the fuck is going on?

>>105717007
>>105724723
Replies: >>105724914
Anonymous
6/27/2025, 7:55:55 PM No.105724895
1745010592780
1745010592780
md5: 13392ef097ca61f2770ae40d40f435ea🔍
The more I try to train and fuck with these models, the more I think the AI CEOs should be hanged for telling everyone they could be sentient in 2 weeks. Every time I think I'm getting somewhere it botches something very simple. I guess it was a fool's errand thinking I could hyper-specialize a small model to do things Claude can't
Anonymous
6/27/2025, 7:56:13 PM No.105724899
>>105724886
I'd imagine it's worse on more active places like /v/, for example...
Anonymous
6/27/2025, 7:56:33 PM No.105724903
Please think of 6GB users like me ;_;
Replies: >>105724913 >>105724927
Anonymous
6/27/2025, 7:57:05 PM No.105724912
>>105724869
>delusional if you think r1 is better for roleplay
delusional if you think anything else open weight is even close to it. Maybe you are just using it wrong?
Anonymous
6/27/2025, 7:57:08 PM No.105724913
>>105724903
Do all 6GB users use cute emoticons like you?
Anonymous
6/27/2025, 7:57:14 PM No.105724914
>>105724893
add anon's script to tampermonkey
https://pastes.dev/AZuckh4Vws
Replies: >>105724936 >>105724939 >>105724954
Anonymous
6/27/2025, 7:57:46 PM No.105724920
Good model that fits into my second card with 6gb vram?
Purpose: looking at a chunk of text mixed with code and extracting relevant function names.
Replies: >>105724927
Anonymous
6/27/2025, 7:58:25 PM No.105724927
>>105724903
>>105724920
Please use cute emoticons.
Anonymous
6/27/2025, 7:59:01 PM No.105724935
>>105716837 (OP)
Newfag here.

Is the generation performance of a 16 GB 5060 ti the same as a 16 GB 5070 ti??
Replies: >>105724938 >>105724975 >>105724979 >>105724985 >>105724991
Anonymous
6/27/2025, 7:59:05 PM No.105724936
>>105724914
holy shit, so the spammer started all of this just so that he can trick others into installing his malware script that "fixes" the spam?
Replies: >>105724951
Anonymous
6/27/2025, 7:59:36 PM No.105724938
>>105724935
>Bandwidth: 448.0 GB/s
vs
>Bandwidth: 896.0 GB/s
Replies: >>105724949
Anonymous
6/27/2025, 7:59:45 PM No.105724939
>>105724914
someone actually competent with js should make a new one because this one will highlight a reply if you hover over it
Replies: >>105725826
Anonymous
6/27/2025, 8:00:01 PM No.105724940
>>105724856
Not like it would make a difference, he would just fine something else to get mad over.
Anonymous
6/27/2025, 8:00:59 PM No.105724949
>>105724938
I thought only VRAM size matters?
Replies: >>105724958 >>105724963 >>105724970 >>105724973
Anonymous
6/27/2025, 8:01:16 PM No.105724951
file
file
md5: 33a4e361932ecc6d8d4ec546b874cf8f🔍
>>105724936
Replies: >>105724968
Anonymous
6/27/2025, 8:01:29 PM No.105724954
>>105724914
Nice, thanks.
Anonymous
6/27/2025, 8:01:53 PM No.105724958
>>105724949
vram is king but not all vram is worth the same
Anonymous
6/27/2025, 8:02:27 PM No.105724963
>>105724949
Generation performance? I assume you're talking about inference? Prompt processing requires processing power, and the 5070 ti is a lot stronger in that aspect. Token generation requires memory bandwidth. This is why offloading layers to your cpu/ram will slow down generation - most users' ram bandwidth is vastly slower than their vram bandwidth.

Vram size dictates the parameters, quantization, and context size of the models that you're able to load into the gpu.
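Rule of thumb behind this: every generated token streams the whole (active) model through memory once, so bandwidth divided by model size gives a rough ceiling. Back-of-envelope with the two cards above and a hypothetical ~9 GB quant:

model_gb = 9.0  # e.g. a ~13B model at Q4-ish; placeholder size
for name, bw_gbs in [("5060 Ti", 448.0), ("5070 Ti", 896.0)]:
    print(f"{name}: ~{bw_gbs / model_gb:.0f} t/s upper bound")
# 5060 Ti: ~50 t/s, 5070 Ti: ~100 t/s -- real numbers land below this ceiling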
Anonymous
6/27/2025, 8:02:48 PM No.105724968
>>105724951
It's not about that my friend. It was already wrongly labelled in *monkey from the get-go.
Replies: >>105724990 >>105725018
Anonymous
6/27/2025, 8:03:02 PM No.105724970
>>105724949
vram size limits what models you can fit in the gpu
vram bandwidth dictates how fast those models will tend to go. there are other factors but who cares actually
Anonymous
6/27/2025, 8:03:37 PM No.105724973
>>105724949
vram matters most but if they're the same size, the faster card is still faster. it won't make a huge difference for any ai models you'll fit into 16gb though. the 4060 16gb is considered a pretty bad gaming card but does fine for ai
Anonymous
6/27/2025, 8:04:14 PM No.105724975
>>105724935
yes. Just a little slower
Anonymous
6/27/2025, 8:04:49 PM No.105724979
>>105724935
no. It is slower
Anonymous
6/27/2025, 8:05:24 PM No.105724985
>>105724935
It's technically slower but the difference will be immaterial because the models you can fit in that much vram are small and fast.
Anonymous
6/27/2025, 8:05:53 PM No.105724990
>>105724968
what do you mean labelled wrong
Replies: >>105724997
Anonymous
6/27/2025, 8:06:01 PM No.105724991
>>105724935
It's actually pretty noticeable if you aren't a readlet and are reading the output as it goes. Unless you're in the top 1% of the population, you probably won't be able to keep up with a 5070 ti's output speed, but a 5060 ti should be possible if you're skimming.
Anonymous
6/27/2025, 8:07:06 PM No.105724997
>>105724990
not telling
Replies: >>105725007
Anonymous
6/27/2025, 8:07:15 PM No.105725000
new dataset just dropped
>>>/a/280016848
Replies: >>105725008 >>105725017 >>105725081
Anonymous
6/27/2025, 8:07:55 PM No.105725007
>>105724997
ok so it does nothing wrong
Replies: >>105725020
Anonymous
6/27/2025, 8:07:55 PM No.105725008
>>105725000
sorry but japanese is NOT safe, how about some esperanto support?
Anonymous
6/27/2025, 8:08:37 PM No.105725017
>>105725000
I would be interested if I knew how to clean data. Raw data would destroy a model, especially badly written jap slop
Replies: >>105725028 >>105725045 >>105725064 >>105725077
Anonymous
6/27/2025, 8:08:38 PM No.105725018
>>105724968
yes i didnt check it properly before posting, if you make a better one i will happily use yours or other anons
Anonymous
6/27/2025, 8:09:05 PM No.105725020
>>105725007
No but you only want attention. I am not going to give it to you. You are the autist who fucks up other people's genuine posts with your spams.
Replies: >>105725046
Anonymous
6/27/2025, 8:09:52 PM No.105725028
Untitled
Untitled
md5: 3d09600cfb3782c7014adc5bea03bb55🔍
>>105725017
Hmmm...
Replies: >>105725037
Anonymous
6/27/2025, 8:10:35 PM No.105725037
>>105725028
I will save this image but I don't think I will go far.
I was thinking of finetuning a jp translator model but I always leave my projects half-started.
Anonymous
6/27/2025, 8:11:14 PM No.105725045
>>105725017
those lns are literary masterpieces compared to the shit the average model is trained on
Replies: >>105725055
Anonymous
6/27/2025, 8:11:23 PM No.105725046
>>105725020
NTA
Anonymous
6/27/2025, 8:11:49 PM No.105725055
>>105725045
Garbage in garbage out i guess.
Anonymous
6/27/2025, 8:12:09 PM No.105725057
>>105724813
/lmg/ deserves much worse
Anonymous
6/27/2025, 8:12:30 PM No.105725064
>>105725017
>Raw data would destroy a model
So true sister, that's why you need to only fit against safe synthetic datasets. Human-made (also called "raw") data teaches dangerous concepts and reduces performance on important math and code benchmarks.
Replies: >>105725074
Anonymous
6/27/2025, 8:13:10 PM No.105725074
>>105725064
I'm pretty sure he means raw in the sense of unformatted.
Anonymous
6/27/2025, 8:13:28 PM No.105725077
>>105725017
Claude and deepseek are the best models and are clearly the raw internet / books with a light instruct tune, though with a cleaned coding dataset as well it seems
Anonymous
6/27/2025, 8:13:46 PM No.105725081
>>105725000
That shit is as bad if not worse than our shitty English novels about dark brooding men.
Replies: >>105725088
Anonymous
6/27/2025, 8:14:25 PM No.105725088
>>105725081
Worse because novels are more popular with japanese middle schoolers and in america reading is gay.
Replies: >>105725108
Anonymous
6/27/2025, 8:16:02 PM No.105725107
hunyuan gguf soon..
trust the plan
https://github.com/ggml-org/llama.cpp/pull/14425
Anonymous
6/27/2025, 8:16:06 PM No.105725108
1737192963608259
1737192963608259
md5: 8c934b639a5b8d697a754c0adc161bd8🔍
>>105725088
reading is white-coded
Anonymous
6/27/2025, 8:16:14 PM No.105725110
>>105716861
You finally get out from gay shelter?
Anonymous
6/27/2025, 8:16:55 PM No.105725113
MrBeast
MrBeast
md5: 683956129878dc92df27373f5aeea17e🔍
MrBeast DELETES his AI thumbnail tool, replaces it with a website to commission real artists. <3 <3
Replies: >>105725124 >>105725134
Anonymous
6/27/2025, 8:17:32 PM No.105725124
>>105725113
It's on him for not doing proper market research. Anyone with a brain could have told him that it was a risky move.
Anonymous
6/27/2025, 8:18:34 PM No.105725133
dots finally supported in lm studio.

its pretty good.
Replies: >>105725144
Anonymous
6/27/2025, 8:18:37 PM No.105725134
>>105725113
That creature is so deep in the uncanny valley I cannot consider it to be a person.
Anonymous
6/27/2025, 8:19:17 PM No.105725144
>>105725133
>moe
bruh
Replies: >>105725149 >>105725152
Anonymous
6/27/2025, 8:19:52 PM No.105725149
>>105725144
get used to all new releases being MoE models :)
Anonymous
6/27/2025, 8:20:28 PM No.105725152
>>105725144
MoE is the best until the big boys admit what they're all running under the hood now (something like MoE but with far more cross-talk between the Es)
Anonymous
6/27/2025, 8:21:46 PM No.105725162
1750988735115396
1750988735115396
md5: 0d3208317feffac6677dcb5f7c47209e🔍
Is there a local setup I can use for OCR that isn't too hard to wire into a python script/dev environment? Pytesseract is garbage and gemini reads my 'problem' images just fine, but I'd rather have a local solution than pay for API calls.
Replies: >>105725171
Anonymous
6/27/2025, 8:22:28 PM No.105725171
>>105725162
https://github.com/RapidAI/RapidOCR
https://github.com/PaddlePaddle/PaddleOCR
Replies: >>105725182
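Both drop into a script easily. A minimal RapidOCR sketch (pip install rapidocr-onnxruntime; the filename is a placeholder):

from rapidocr_onnxruntime import RapidOCR

engine = RapidOCR()
result, _ = engine("problem_image.png")  # returns (detections, timings)
for box, text, score in result or []:
    print(f"{score:.2f}  {text}")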
Anonymous
6/27/2025, 8:23:32 PM No.105725182
>>105725171
Based, ty
Anonymous
6/27/2025, 8:24:47 PM No.105725192
1745058431632661
1745058431632661
md5: 216f1549ca4ea73e6d6713aa05736994🔍
>director
>finally updated readme some
>https://github.com/tomatoesahoy/director

i think this brings my slop addon up to at least other st addon standards with how the page looks, a description of what it does and such
Anonymous
6/27/2025, 8:25:22 PM No.105725203
Stealing jart bussy from cudadev.
Anonymous
6/27/2025, 8:25:58 PM No.105725210
https://huggingface.co/tencent/Hunyuan-A13B-Instruct
Anonymous
6/27/2025, 8:26:34 PM No.105725213
arguing with retards is a futile, most pointless thing to do in life
you learn how to spot them and you ignore them
life is too short to deal with idiots who think they know how MoE work but don't
Replies: >>105725222 >>105725235 >>105725246 >>105725252 >>105725917
Anonymous
6/27/2025, 8:27:31 PM No.105725222
>>105725213
I do agree with you. So many others are simply not on the same level as I am. It's almost quite insulting to even try to establish any form of discussion with them.
Anonymous
6/27/2025, 8:28:12 PM No.105725235
>>105725213
dunningkrugerMAXX
Anonymous
6/27/2025, 8:28:56 PM No.105725246
>>105725213
Just because a model can answer your obscure JRPG trivia, doesn't make it a good model.
Anonymous
6/27/2025, 8:29:31 PM No.105725252
>>105725213
how do I make good ai? I'm looking to make an advanced artificial intelligence that can replace millions of workers, that can drive, operate robotic hands with precision, and eliminate all coding jobs and middle management tasks.

I heard you were the guy to ask.

On 4chan.
Replies: >>105725264 >>105725322
Anonymous
6/27/2025, 8:30:25 PM No.105725261
>>105724204
>In another extraordinary move, Mr. Zuckerberg and his lieutenants discussed “de-investing” in Meta’s A.I. model, Llama, two people familiar with the discussions said. Llama is an “open source” model, with its underlying technology publicly shared for others to build on. Mr. Zuckerberg and Meta executives instead discussed embracing A.I. models from competitors like OpenAI and Anthropic, which have “closed” code bases.
Anonymous
6/27/2025, 8:30:42 PM No.105725264
>>105725252
Just stabilize the environment and shift the paradigm
Anonymous
6/27/2025, 8:31:01 PM No.105725267
openai finna blow you away
Anonymous
6/27/2025, 8:31:24 PM No.105725269
>540 posts
/lmg/ hasn't been this active since R1 dropped
Replies: >>105725290
Anonymous
6/27/2025, 8:31:48 PM No.105725272
1743846427444782
1743846427444782
md5: a9de8f0ded4b8a326e7b3cd4ac8c8d5b🔍
Welp I broke it
Replies: >>105725280 >>105725288
Anonymous
6/27/2025, 8:32:26 PM No.105725280
>>105725272
Had to reload model with different layers setting, maybe llamacpp bug
Replies: >>105725288
Anonymous
6/27/2025, 8:33:01 PM No.105725288
>>105725280
>>105725272
Same thing has happened to me with every mistral model and I think also with gemma 3 when using llama.cpp.
Maybe it is related to memory just running out.
Anonymous
6/27/2025, 8:33:26 PM No.105725290
2tokencontext
2tokencontext
md5: dc90e1dd190edde9e0b8b511c269e1c0🔍
>>105725269
Check your context settings.
Anonymous
6/27/2025, 8:34:22 PM No.105725301
sneedgemma3n
sneedgemma3n
md5: f552f56cb2421222f69f67cda739b8a1🔍
Gemma3n is able to explain the sneed and feed joke but avoids the words suck and fuck; also the season number is wrong (it's s11ep5).
Anonymous
6/27/2025, 8:35:16 PM No.105725310
Is this channel AI-generated? Posting 3 videos a day like clockwork. Monotonous but fairly convincing voice with subtitles
https://www.youtube.com/watch?v=aQy24g7iX4s
Replies: >>105725321
Anonymous
6/27/2025, 8:36:04 PM No.105725321
>>105725310
not watching this, but there are many automated channels these days. I have no idea why the fuck anyone would invest into this since youtube's monetization pays literal cents and you would likely spend more on ai inference
Replies: >>105725325
Anonymous
6/27/2025, 8:36:05 PM No.105725322
>>105725252
If you can optimize it to beat pokemon red/blue the dominoes will start to fall
Anonymous
6/27/2025, 8:36:38 PM No.105725325
>>105725321
Youtube doesn't pay literal cents as you say lmo
Anonymous
6/27/2025, 8:37:30 PM No.105725336
let me guess, he's going to do this for another day or two before getting "proof" that its a miku poster spamming these duplicate posts
Replies: >>105725346
Anonymous
6/27/2025, 8:37:38 PM No.105725338
Screenshot_20250627_223956
Screenshot_20250627_223956
md5: 21272d0dcb487608212c193c4a9148c4🔍
Reasoning models have been a disaster.
That and the mathmarks.
Anonymous
6/27/2025, 8:39:02 PM No.105725346
>>105725336
No, your boyfriend OP being a disingenuous tranny is enough.
Anonymous
6/27/2025, 8:39:39 PM No.105725352
For a sparse 8b model, Gemma-3n-e4b is pretty smart.
Replies: >>105725364 >>105725371
Anonymous
6/27/2025, 8:40:22 PM No.105725363
Hunyuan verdict?
Replies: >>105725471
Anonymous
6/27/2025, 8:40:29 PM No.105725364
>>105725352
it actually redeems the gemma team
the previous releases were disappointing compared to gemma 2 other than having greater context length
Anonymous
6/27/2025, 8:41:13 PM No.105725371
>>105725352
multimodality usually makes models smarter.
Although
>text only output
fail.
Literally never going to get a decent local 2-way omni model from any of the big corpos at this rate.
Replies: >>105725379 >>105725384 >>105725392
Anonymous
6/27/2025, 8:41:32 PM No.105725374
How does training a LORA for a reasoning model work? Same way or do I have to generate the thought process part in my training data?
Anonymous
6/27/2025, 8:41:47 PM No.105725379
>>105725371
>text only output
Yeah, that sucks giant balls.
Anonymous
6/27/2025, 8:42:25 PM No.105725384
>>105725371
>multimodality usually makes models smarter.
what? thats not true at all.
there is huge degradation.
did you try the first we had? was a qwen model last year with audio out. was tardation i havent seen since pyg.
recently they had another release and it still was bad but not as severe anymore.
even the cucked closed models (gemini/chatgpt) have degradation with voice out.
this is a problem i have not yet seen solved anywhere.
Anonymous
6/27/2025, 8:43:02 PM No.105725392
>>105725371
>Literally never going to get a decent local 2-way omni model from any of the big corpos at this rate.
they do not want to give you an AI with the super powers of a photoshop expert that could be decensored and used to gen all sorts of chud things without any skill requirement
two way multimodal LLMs will always be kept closed
Replies: >>105725400 >>105725411 >>105725420
Anonymous
6/27/2025, 8:43:41 PM No.105725400
>>105725392
Meanwhile all the AI companies have quite obviously given Israel uncensored image-gen to crank out pro-genocide propaganda with impunity.
I hope they all fucking end up in the Hague.
Anonymous
6/27/2025, 8:44:17 PM No.105725411
>>105725392
>>Literally never going to get a decent local 2-way omni model from any of the big corpos at this rate.
how can you get something that doesnt even exist beyond government blacksites right now lmao
Anonymous
6/27/2025, 8:44:22 PM No.105725413
Why is this thread repeating itself
Replies: >>105725423 >>105725433
Anonymous
6/27/2025, 8:44:55 PM No.105725420
>>105725392
>they do not want to give you an AI with the super powers of a photoshop expert that could be decensored and used to gen all sorts of chud things without any skill requirement
Head over to ldg. This already exists.
Replies: >>105725426
Anonymous
6/27/2025, 8:45:09 PM No.105725423
>>105725413
Nigger having a melty.
Replies: >>105725434
Anonymous
6/27/2025, 8:45:33 PM No.105725426
>>105725420
if you mean that new flux model it's hot garbage barely a step above the SDXL pix2pix models
say what you will about the nasty built in styling of GPT but its understanding of prompts is unrivaled
Replies: >>105725436 >>105725445
Anonymous
6/27/2025, 8:45:57 PM No.105725433
>>105725413
save yourself bro
https://pastes.dev/AZuckh4Vws
read the script before pasting it into tampermonkey
Anonymous
6/27/2025, 8:46:04 PM No.105725434
>>105725423
The AI generals on here have the worst faggots I swear
Replies: >>105725453
Anonymous
6/27/2025, 8:46:08 PM No.105725436
>>105725426
Not only that but the interplay between the imagegen and textgen gives it a massive boost in creativity on both fronts. Although it also makes it prone to hallucinate balls. But what is the creative process other than self-guided hallucination?
Anonymous
6/27/2025, 8:46:46 PM No.105725445
>>105725426
True. Wish it wasnt so. But it is.
I just pasted the 2 posts and wrote "make a funny manga page of these 2 anon neckbeards arguing. chatgpt is miku".

I thought opencuck was finished a couple months ago. But they clearly have figured out multimodality the best.
Sad that zucc cucked out. Meta was writing blogs about a lot of models, nothing ever came of it.
Anonymous
6/27/2025, 8:47:33 PM No.105725453
>>105725434
This.
OP mikutranny is posting porn in /ldg/:
>>105715769
It was up for hours while anyone keking on troons or niggers gets deleted in seconds, talk about double standards and selective moderation:
https://desuarchive.org/g/thread/104414999/#q104418525
https://desuarchive.org/g/thread/104414999/#q104418574
Here he makes >>105714098 snuff porn of generic anime girl, probably because its not his favourite vocaloid doll and he can't stand that, a war for rights to waifuspam in thread.

Funny /r9k/ thread: https://desuarchive.org/r9k/thread/81611346/
The Makise Kurisu damage control screencap (day earlier) is fake btw, no matches to be found, see https://desuarchive.org/g/thread/105698912/#q105704210 janny deleted post quickly.

TLDR: Mikufag janny deletes everyone dunking on trannies and resident spammers, making it his little personal safespace. Needless to say he would screech "Go back to teh POL!" anytime someone posts something mildly political about language models or experiments around that topic.

And lastly as said in previous thread >>105716637, i would like to close this by bringing up key evidence everyone ignores. I remind you that cudadev has endorsed mikuposting. That's it.
He also endorsed hitting that feminine jart bussy a bit later on.
Anonymous
6/27/2025, 8:48:36 PM No.105725468
>>105720676
Calling Deepseek a <40B model is dumb shit. I've tried 32b models, and 51b Nemotron models. Deepseek blows them out of the water so thoroughly and clearly that the whole square root MoE bullshit went out the window.

An 80b MoE is going to be way better than a 32b dense model.

A 235b MoE is going to be way better than a 70b dense model.

It's RAMlet cope to suggest otherwise.
Replies: >>105725622
Anonymous
6/27/2025, 8:48:40 PM No.105725469
I know I'm kind of late, but holy fuck L4 scout is dumber and knows less than fucking QwQ.
What the hell?
Replies: >>105725473
Anonymous
6/27/2025, 8:48:50 PM No.105725471
>>105725363
It hallucinates like crazy. At least the GPTQ version, and with trivia questions.
Anonymous
6/27/2025, 8:49:15 PM No.105725473
>>105725469
The shitjeets that lurk here would have you believe otherwise.
Replies: >>105725482
Anonymous
6/27/2025, 8:49:50 PM No.105725482
>>105725473
/lmg/ shilled L4 on release.
Replies: >>105725488 >>105725496
Anonymous
6/27/2025, 8:50:27 PM No.105725488
>>105725482
i dont think that is true. at least i dont remember it that way.
people catched on very quickly how it was worse than the lmarena one. that caused all that drama and lmarena washing their hands.
Anonymous
6/27/2025, 8:51:01 PM No.105725496
>>105725482
Maybe /IndianModelsGeneral/
Anonymous
6/27/2025, 8:51:47 PM No.105725504
I don't get everyone's fascination with gpt-4o image generation. It's a nice gimmick but all it means is that you get samey images on a model that you likely wouldn't be able to easily finetune the way you can train SD or flux. It's a neat toy but nothing you'd want to waste parameters or use for any serious imgen.
Replies: >>105725510 >>105725530
Anonymous
6/27/2025, 8:52:26 PM No.105725510
>>105725504
>finetune
That requires a small amount of work which is too much for zoomers.
Replies: >>105725518
Anonymous
6/27/2025, 8:53:01 PM No.105725518
>>105725510
finetuning image models is NOT a small amount of work unless you of course mean shitty 1 concept loras
Anonymous
6/27/2025, 8:53:26 PM No.105725525
wen hunyuan llama.cpp
Anonymous
6/27/2025, 8:53:57 PM No.105725530
>>105725504
Thats like saying large models are useless because you can guide mistral from 2023 with enough editing.
Especially for the normies. That it "just works" is exactly what made it popular.
Anonymous
6/27/2025, 8:54:43 PM No.105725535
file
file
md5: c27456b0f87c141105df3852880a424b🔍
humiliation ritual
Replies: >>105725544
Anonymous
6/27/2025, 8:55:21 PM No.105725544
>>105725535
Meta is about family and friends bro not numbers.
Anonymous
6/27/2025, 8:55:56 PM No.105725549
https://www.reddit.com/r/LocalLLaMA/comments/1llndut/hunyuana13b_released/
Replies: >>105725555 >>105725564
Anonymous
6/27/2025, 8:56:33 PM No.105725555
>>105725549
>The evals are incredible and trade blows with DeepSeek R1-0120.
Fucking redditors man.
This thread is such a gutter but there is no alternative. Imagine having to be on reddit.
Replies: >>105725565
Anonymous
6/27/2025, 8:57:20 PM No.105725564
>>105725549
Thanks, reddit. You're only 8 hours late. Now go back.
Anonymous
6/27/2025, 8:57:31 PM No.105725565
>>105725555
checked

let them cope
Anonymous
6/27/2025, 8:58:13 PM No.105725570
>>105724335
goofs?
Replies: >>105725574 >>105725584 >>105725606
Anonymous
6/27/2025, 8:58:52 PM No.105725574
>>105725570
Never
Anonymous
6/27/2025, 8:59:29 PM No.105725584
>>105725570
Architecture not supported yet.
Replies: >>105725604
Anonymous
6/27/2025, 9:00:47 PM No.105725596
baker wheres new bread
Replies: >>105725615
Anonymous
6/27/2025, 9:01:01 PM No.105725604
>>105725584
What the fuck are they doing all day? It better not be wasting time in the barn.
Anonymous
6/27/2025, 9:01:05 PM No.105725606
>>105725570
after jamba, get in line or there will be consequences
Replies: >>105725654
Anonymous
6/27/2025, 9:02:04 PM No.105725615
>>105725596
It's not even page 9 yet, chill the fuck out newcomer.
Anonymous
6/27/2025, 9:02:20 PM No.105725619
>Meta says it’s winning the talent war with OpenAI | The Verge
https://archive.ph/ZoxE3
aside from the expected notes on meta swiping some OAI employees, there's this of note:
>“We are not going to go right after ChatGPT and try and do a better job with helping you write your emails at work,” Cox said. “We need to differentiate here by not focusing obsessively on productivity, which is what you see Anthropic and OpenAI and Google doing. We’re going to go focus on entertainment, on connection with friends, on how people live their lives, on all of the things that we uniquely do well, which is a big part of the strategy going forward.”
Anonymous
6/27/2025, 9:02:22 PM No.105725622
>>105725468
DeepSeek is only 37B by active parameter count. It's 158B by square root law, which seems more accurate.
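Where the 158B comes from, assuming the usual geometric-mean rule of thumb (dense-equivalent ≈ sqrt(total × active)):

total_b, active_b = 671, 37            # DeepSeek V3/R1 parameters, in billions
print((total_b * active_b) ** 0.5)     # ~157.6 -> the "158B" figure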
Anonymous
6/27/2025, 9:02:52 PM No.105725630
1723896686808884
1723896686808884
md5: f5586c0a9bddcdd5e83bc671087d1d4e🔍
Death to Miku.
Replies: >>105725653 >>105725664 >>105725983
Anonymous
6/27/2025, 9:02:54 PM No.105725633
ITT: people believe corpos will give away something valuable
All of them only ever release free weights when the weights can be considered worthless
Flux doesn't give away their best models
Google gives you Gemma, not Gemini
Meta can give away Llama because nobody wants it even for free
Qwen never released the Max model
So far the only exception has been DeepSeek; their model is both desirable and open, and I think they are doing this more out of political motivation (an attempt to make LLM businesses crash and burn by turning LLMs into a commodity) rather than as a strategy for their own business
some people in China are very into the bucket-of-crabs attitude: can't have the pie? I shall not let you have any either
Anonymous
6/27/2025, 9:03:38 PM No.105725644
So because Qwen3 VL has been replaced by VLo, does that mean they aren't even going to bother releasing an open source vision model anymore? I was waiting for it to make better captions...
Anonymous
6/27/2025, 9:04:22 PM No.105725652
Chatgpt keeps telling me that MythoMax 13B Q6 is the best .gguf to immersively rape my fictional characters in RP, is that true or is there better?
Anonymous
6/27/2025, 9:04:24 PM No.105725653
>>105725630
does the sweater hide all the cut marks on your wrists?
Replies: >>105725672
Anonymous
6/27/2025, 9:04:26 PM No.105725654
>>105725606
You realize you just responded to the spam bot right?
Anonymous
6/27/2025, 9:04:56 PM No.105725656
https://www.nytimes.com/2025/06/27/technology/mark-zuckerberg-meta-ai.html
https://archive.is/kF1kO

>In Pursuit of Godlike Technology, Mark Zuckerberg Amps Up the A.I. Race
>Unhappy with his company’s artificial intelligence efforts, Meta’s C.E.O. is on a spending spree as he reconsiders his strategy in the contest to invent a hypothetical “superintelligence.”
>
>[...] In another extraordinary move, Mr. Zuckerberg and his lieutenants discussed “de-investing” in Meta’s A.I. model, Llama, two people familiar with the discussions said. Llama is an “open source” model, with its underlying technology publicly shared for others to build on. Mr. Zuckerberg and Meta executives instead discussed embracing A.I. models from competitors like OpenAI and Anthropic, which have “closed” code bases. No final decisions have been made on the matter.
>
>A Meta spokeswoman said company officials “remain fully committed to developing Llama and plan to have multiple additional releases this year alone.” [...]
Replies: >>105725673 >>105725676 >>105725685 >>105725689 >>105725700 >>105725708
Anonymous
6/27/2025, 9:05:33 PM No.105725664
>>105725630
Very feminine hand, typical of average /g/ tranny.
Anonymous
6/27/2025, 9:05:45 PM No.105725667
if I was 'ggaganov I would just make it so that llama.cpp works with everything by default instead of having to hardcode every model, but I guess that sort of forward thinking is why I run businesses and he's stuck code monkeying
Anonymous
6/27/2025, 9:06:09 PM No.105725672
>>105725653
Imagine projecting this much
Anonymous
6/27/2025, 9:06:09 PM No.105725673
>>105725656
zuck might just be the dumbest CEO ever
Anonymous
6/27/2025, 9:06:48 PM No.105725676
>>105725656
Wang's words, zuck's mouth
Anonymous
6/27/2025, 9:07:24 PM No.105725685
>>105725656
>Godlike Technology,
Is god omnipotent if he can't suck a dick in an acceptable manner?
Anonymous
6/27/2025, 9:08:00 PM No.105725689
>>105725656
>llama isnt literally AGI because uhhhmm because its open source and others have access to it
chat?
Anonymous
6/27/2025, 9:08:01 PM No.105725692
file
file
md5: d87f2933b6eb2524589642fb30c8e3cf🔍
i must be pushing it by now, but 3.2 is still hanging on
Anonymous
6/27/2025, 9:08:35 PM No.105725700
>>105725656
meta just got told by a judge that they are in fact not covered by fair use, even if they "won" the case, but that was bc both lawyer teams were focusing on the wrong part of the law. the judge said that if the generated models compete in any way with the training materials it won't be fair use
Anonymous
6/27/2025, 9:08:59 PM No.105725705
>>105722291
You're the rag
Anonymous
6/27/2025, 9:09:17 PM No.105725708
>>105725656
>In another extraordinary move, Mr. Zuckerberg and his lieutenants discussed “de-investing” in Meta’s A.I. model, Llama, two people familiar with the discussions said. Llama is an “open source” model, with its underlying technology publicly shared for others to build on. Mr. Zuckerberg and Meta executives instead discussed embracing A.I. models from competitors like OpenAI and Anthropic, which have “closed” code bases.
Anonymous
6/27/2025, 9:09:54 PM No.105725716
1640477178026
1640477178026
md5: a23193cc9232bc32b2787cf340e21dfe🔍
>Bunch of worthless LLMs for math and coding
>Barely, if any, built for story making or translating
WHEN WILL THIS SHITTY INDUSTRY JUST HURRY UP AND MOVE ON!
Anonymous
6/27/2025, 9:10:48 PM No.105725724
117045
117045
md5: 052240c898909d79f88900ab46141af1🔍
>>105724105
> the future is fucking bright
Anonymous
6/27/2025, 9:12:01 PM No.105725736
>>105724774
goofs?
Anonymous
6/27/2025, 9:12:02 PM No.105725737
new llama.cpp binary build wen
Replies: >>105725748
Anonymous
6/27/2025, 9:12:48 PM No.105725748
>>105725737
When you git pull and cmake, anon...
Anonymous
6/27/2025, 9:13:43 PM No.105725759
We could move to /r9k/
Replies: >>105725781 >>105725786
Anonymous
6/27/2025, 9:14:43 PM No.105725766
JUST RANGE BAN YOU FUCKING MODS!
Replies: >>105725790
Anonymous
6/27/2025, 9:15:07 PM No.105725771
As LLM pretraining costs keep dwindling, it's only a matter of time until someone trains a proper creative model for his company.
Anonymous
6/27/2025, 9:16:15 PM No.105725781
>>105725759
We already there >>>/r9k/81615256
Though i would recommend >>>/lgbt/ as appropriate safespace for all of us.
Anonymous
6/27/2025, 9:16:41 PM No.105725786
>>105725759
You go first and wait for me.
Anonymous
6/27/2025, 9:16:54 PM No.105725789
file
file
md5: c35ba1bd7139451a23262f9bf421d010🔍
its not working :(
Replies: >>105725795
Anonymous
6/27/2025, 9:17:00 PM No.105725790
>>105725766
Might not work if they are using ecker or gay or some residential proxy.
Anonymous
6/27/2025, 9:17:31 PM No.105725795
>>105725789
heeeeeeeeeeeeeeeeeeeeeeeeeelllllllllllllppppppppp pleaaaaaseeeeeeee
Replies: >>105725803
Anonymous
6/27/2025, 9:18:10 PM No.105725803
>>105725795
have you tried asking chatgpt to write the script for you?
Anonymous
6/27/2025, 9:20:55 PM No.105725826
>>105724939
I'm not competent but A.I. is and it seems to work right.

https://pastes.dev/l0c6Kj9a4v
Replies: >>105725940
Anonymous
6/27/2025, 9:26:02 PM No.105725875
1745010592780
1745010592780
md5: 777589b1a6d9fba3cb7c71272dd67364🔍
The more I try to train and fuck with these models, the more I think the AI CEOs should be hanged for telling everyone they could be sentient in 2 weeks. Every time I think I'm getting somewhere it botches something very simple. I guess it was a fool's errand thinking I could hyper-specialize a small model to do things Claude can't
Replies: >>105725934
Anonymous
6/27/2025, 9:26:39 PM No.105725882
arguing with retards is a futile, most pointless thing to do in life
you learn how to spot them and you ignore them
life is too short to deal with idiots who think they know how MoE work but don't
Replies: >>105725898 >>105725917
Anonymous
6/27/2025, 9:27:52 PM No.105725894
this thread has gone down the poopchute, jesus
what is wrong with the spammer retard
Replies: >>105725953
Anonymous
6/27/2025, 9:28:23 PM No.105725898
>>105725882
Certainly good sir, we are above all the rabble. We always know best.
Replies: >>105725917
Anonymous
6/27/2025, 9:29:46 PM No.105725917
>>105725898
Anon, c'mon.

>>105725882
>>105720715
>>105725213
Anonymous
6/27/2025, 9:29:50 PM No.105725918
>>105717903
>how to clean data
This is something AI should be able to do itself.
Anonymous
6/27/2025, 9:31:14 PM No.105725930
>>105719870
I find nemo better than deepseek even, I just want the same thing with more context.
Anonymous
6/27/2025, 9:31:35 PM No.105725934
>>105725875
The fact that models 'sleep' between prompts means that there is no sentience.
The AI 'dies' every prompt and has to be 'reborn' with context so it can pretend to be the same AI you prompted 1 minute ago.
The LLMs we have now have absolutely nothing analogous to sentience. When people cry that we need to be kind to AI, you might as well pause a movie before an actor gets shot.
Replies: >>105726011
Anonymous
6/27/2025, 9:31:59 PM No.105725940
>>105725826
yes, this is working for me too
Anonymous
6/27/2025, 9:33:07 PM No.105725953
>>105725894
qrd on the spammer?
Replies: >>105725997
Anonymous
6/27/2025, 9:36:29 PM No.105725983
>>105725630
>migger has a soihand
pottery
Anonymous
6/27/2025, 9:37:02 PM No.105725991
>>105725967
>>105725967
>>105725967
Anonymous
6/27/2025, 9:37:30 PM No.105725997
>>105725953
Mental breakdown. Has no control over his own life so he wants to impose rules on others. He'll get bored.
Anonymous
6/27/2025, 9:39:00 PM No.105726011
>>105725934
>The fact that models 'sleep' between prompts means that there is no sentience.
It's more than that I think. People sleep too. We go unconscious for long periods of time. Unlike LLMs our brains are always "training." So a part of the experience of consciousness is the fact your "weights" so to speak are always reshuffling, and your ability to reflect on how you've changed over short and long periods of time contributes to the mental model of yourself. It's like we have many embeddings and some of them understand the whole system and how it changes over time. LLMs just have one and their only "memory" is the context which is just reinterpreted in chunks.
Anonymous
6/27/2025, 10:46:30 PM No.105726720
An insect might have less ""intelligence"" as perceived by a human but it has more sentience than an LLM for sure. LLMs don't even have any notion of acting upon a will of their own. They react to what you feed them and have no ability to impose a form of will outside of the perimeter set by your prompt.
Prod an ant with a leaf or something: at first it will be distracted and react with curiosity or fear, but it will quickly go back to minding its own business, looking for food or going back to its colony. Prod an LLM with data and it will not "think" (by which I mean generate MUHNEXTTOKEN) of anything other than that data.