/lmg/ - Local Models General - /g/ (#105844210) [Archived: 406 hours ago]

Anonymous
7/9/2025, 5:02:34 AM No.105844210
__akita_neru_vocaloid_drawn_by_zooanime__bc71eab2c1f703cc32cc931eb0f80954
/lmg/ - a general dedicated to the discussion and development of local language models.

Previous threads: >>105832690 & >>105822371

►News
>(07/08) SmolLM3: smol, multilingual, long-context reasoner: https://hf.co/blog/smollm3
>(07/08) Hunyuan MoE support merged: https://github.com/ggml-org/llama.cpp/pull/14425
>(07/06) Jamba 1.7 released: https://hf.co/collections/ai21labs/jamba-17-68653e9be386dc69b1f30828
>(07/04) MLX adds support for Ernie 4.5 MoE: https://github.com/ml-explore/mlx-lm/pull/267
>(07/02) DeepSWE-Preview 32B released: https://hf.co/agentica-org/DeepSWE-Preview

►News Archive: https://rentry.org/lmg-news-archive
►Glossary: https://rentry.org/lmg-glossary
►Links: https://rentry.org/LocalModelsLinks
►Official /lmg/ card: https://files.catbox.moe/cbclyf.png

►Getting Started
https://rentry.org/lmg-lazy-getting-started-guide
https://rentry.org/lmg-build-guides
https://rentry.org/IsolatedLinuxWebService
https://rentry.org/tldrhowtoquant
https://rentry.org/samplers

►Further Learning
https://rentry.org/machine-learning-roadmap
https://rentry.org/llm-training
https://rentry.org/LocalModelsPapers

►Benchmarks
LiveBench: https://livebench.ai
Programming: https://livecodebench.github.io/leaderboard.html
Code Editing: https://aider.chat/docs/leaderboards
Context Length: https://github.com/adobe-research/NoLiMa
Censorbench: https://codeberg.org/jts2323/censorbench
GPUs: https://github.com/XiongjieDai/GPU-Benchmarks-on-LLM-Inference

►Tools
Alpha Calculator: https://desmos.com/calculator/ffngla98yc
GGUF VRAM Calculator: https://hf.co/spaces/NyxKrage/LLM-Model-VRAM-Calculator
Sampler Visualizer: https://artefact2.github.io/llm-sampling

►Text Gen. UI, Inference Engines
https://github.com/lmg-anon/mikupad
https://github.com/oobabooga/text-generation-webui
https://github.com/LostRuins/koboldcpp
https://github.com/ggerganov/llama.cpp
https://github.com/theroyallab/tabbyAPI
https://github.com/vllm-project/vllm
Replies: >>105844936 >>105846728 >>105848706 >>105850418 >>105854760
Anonymous
7/9/2025, 5:03:16 AM No.105844217
file
md5: e9fb7f5c95501aa3cd7120834a590e92
►Recent Highlights from the Previous Thread: >>105832690

--Papers:
>105834135 >105834182
--Experimenting with local model training on consumer GPUs despite known limitations:
>105839772 >105839805 >105839821 >105839838 >105840910 >105841022 >105841129 >105839881 >105841661 >105841824 >105841905 >105841992 >105842071 >105842166 >105842302 >105842422 >105842624 >105842704 >105842731 >105842816 >105842358 >105842418 >105842616 >105842654 >105842763 >105842902 >105842986 >105843186 >105842891
--Risks and challenges of building a CBT therapy bot with LLMs on consumer hardware:
>105836762 >105836830 >105836900 >105837945 >105840397 >105840554 >105840693 >105840730 >105839512 >105839539 >105839652 >105839663 >105839943 >105841327
--Memory and performance issues loading Q4_K_L 32B model on CPU with llama.cpp:
>105840103 >105840117 >105840145 >105840159 >105840191 >105840201 >105840255 >105840265 >105840295 >105840355 >105840407 >105840301 >105840315
--Evaluating 70b model viability for creative writing on consumer GPU hardware:
>105836307 >105836366 >105836374 >105836489 >105836484 >105836778 >105840476 >105841179
--Challenges in building self-learning LLM pipelines with fact-checking and uncertainty modeling:
>105832730 >105832900 >105833650 >105833767 >105833783 >105834035 >105836437
--Concerns over incomplete Hunyuan MoE implementation affecting model performance in llama.cpp:
>105837520 >105837645 >105837903
--Skepticism toward transformers' long-term viability and corporate overhyping of LLM capabilities:
>105832744 >105832757 >105832807 >105835160 >105835202 >105835366 >105839406 >105839863
--Hunyuan MoE integration sparks creative writing data criticism:
>105835909 >105836075 >105836085
--Links:
>105839096 >105839175 >105840055
--Miku (free space):
>105832744 >105832988 >105832992 >105833638 >105840752

►Recent Highlight Posts from the Previous Thread: >>105832694

Why?: 9 reply limit >>102478518
Fix: https://rentry.org/lmg-recap-script
Replies: >>105844669
Anonymous
7/9/2025, 5:05:44 AM No.105844230
yellow miku
Anonymous
7/9/2025, 5:22:11 AM No.105844328
two more weeks until the usual summer release cycle starts
Anonymous
7/9/2025, 5:55:11 AM No.105844543
1724397836400126
md5: e777ee7e555aa6e31dce160d869ea3c9
Replies: >>105844767 >>105850519
Anonymous
7/9/2025, 6:09:33 AM No.105844653
>download tensorflow
>latest stable/nightly doesn't support sm120 yet
>try to compile from source
>clang doesn't support sm120 yet
Very funny
Anonymous
7/9/2025, 6:11:40 AM No.105844669
>>105844217
>he forgot the >>
Replies: >>105844674
Anonymous
7/9/2025, 6:12:11 AM No.105844674
>>105844669
>Why?: 9 reply limit >>102478518 (Dead)
>Fix: https://rentry.org/lmg-recap-script
Learn to read
Replies: >>105844783
Anonymous
7/9/2025, 6:13:58 AM No.105844686
1722276249027656
md5: ed93824b7cd4ecb5884073a6de9e6951
Replies: >>105850519
Anonymous
7/9/2025, 6:19:00 AM No.105844725
>Creative Writing. Paired GRMs based on relative preference judgments mitigate reward hacking, while creative rewards are blended with automated checks for instruction adherence to balance innovation and compliance.
Supposedly trained for "creative writing." Has zero benchmarks that measure writing quality.
Replies: >>105844733
Anonymous
7/9/2025, 6:19:35 AM No.105844733
>>105844725
Use other LLMs to score writing quality.
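A minimal sketch of how that could look: pairwise LLM-as-judge scoring against a local OpenAI-compatible endpoint. The URL and model name are assumptions (llama.cpp's llama-server and koboldcpp both expose this API shape).

```python
# Sketch of pairwise LLM-as-judge scoring. BASE_URL and the model name are
# assumptions; point them at whatever local server you actually run.
import json
import urllib.request

BASE_URL = "http://127.0.0.1:8080/v1/chat/completions"  # assumed local server

def build_judge_prompt(story_a: str, story_b: str) -> str:
    """Pairwise comparison prompt; relative judgments are harder to
    reward-hack than absolute 1-10 scores."""
    return (
        "You are judging creative writing quality.\n"
        f"Story A:\n{story_a}\n\n"
        f"Story B:\n{story_b}\n\n"
        "Which story is better written? Answer with exactly 'A' or 'B'."
    )

def judge(story_a: str, story_b: str, model: str = "local-model") -> str:
    """Ask the local model for a verdict; returns the raw 'A'/'B' answer."""
    payload = {
        "model": model,
        "messages": [
            {"role": "user", "content": build_judge_prompt(story_a, story_b)}
        ],
        "temperature": 0.0,  # deterministic verdicts
    }
    req = urllib.request.Request(
        BASE_URL,
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        out = json.load(resp)
    return out["choices"][0]["message"]["content"].strip()
```

Swap the A/B order and ask again to control for position bias; disagreements between the two orderings are ties.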
Replies: >>105844789 >>105847605
Anonymous
7/9/2025, 6:23:14 AM No.105844767
>>105844543
I like this Miku
Replies: >>105844941
Anonymous
7/9/2025, 6:24:50 AM No.105844783
>>105844674
Oh sorry I'm dumb. Dang jannies making the site worse.
Anonymous
7/9/2025, 6:25:43 AM No.105844789
>>105844733
Then watch your LLM judge either score your model's output as shit, score all models the same, or score models in a way so obviously unreflective of actual quality that showing the comparison would damage your credibility. Decide not to publish your result.
Anonymous
7/9/2025, 6:29:32 AM No.105844817
What are the best local embedding and reranking models for code that don't require a prompt? Right now I'm using snowflake-arctic-embed-l-v2.0 and bge-reranker-v2-m3, but these seem to be a bit dated and non-code specific.
Replies: >>105844830 >>105844834
Anonymous
7/9/2025, 6:31:21 AM No.105844830
>>105844817
alibaba just open sourced some qwen embedding models.
Anonymous
7/9/2025, 6:32:04 AM No.105844834
>>105844817
Qwen 3 embedding also https://huggingface.co/spaces/mteb/leaderboard
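Rough sketch of wiring an embedding model into code retrieval. The ranking math is plain cosine similarity; the model id in the comment is an assumption, so check the leaderboard above for current picks.

```python
# Minimal retrieval sketch: embed the query and code snippets, rank by cosine.
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def rank(query_vec, doc_vecs):
    """Indices of doc_vecs sorted by descending similarity to the query."""
    scores = [cosine(query_vec, d) for d in doc_vecs]
    return sorted(range(len(doc_vecs)), key=lambda i: -scores[i])

# Producing the vectors with sentence-transformers (not run here; model id
# is an assumption):
#   from sentence_transformers import SentenceTransformer
#   model = SentenceTransformer("Qwen/Qwen3-Embedding-0.6B")
#   doc_vecs = model.encode(code_snippets)
#   query_vec = model.encode(["function that parses JSON"])[0]
```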
Anonymous
7/9/2025, 6:45:44 AM No.105844901
1525472602250
md5: 5da1ce152752b307ec322e65c0a96f74
New to these threads, what do you guys use your local models for? Why use local models instead of just the commercial ones (besides privacy reasons)?
Replies: >>105844921 >>105844947 >>105848516 >>105848538 >>105848602
Anonymous
7/9/2025, 6:49:00 AM No.105844921
>>105844901
Knowledge that the model I'm running now will always be here and behave in the same way, while cloud services can and do modify, downgrade, replace, or otherwise change what they're offering at any time with no notice.
Replies: >>105844945 >>105845109
Anonymous
7/9/2025, 6:53:41 AM No.105844936
>>105844210 (OP)
Unpopular opinion: Neru is the hottest of the 3 memeloids. It'd be even better if her sidetail was a ponytail instead.
Replies: >>105850425 >>105854719 >>105854732 >>105854852
Anonymous
7/9/2025, 6:54:23 AM No.105844941
1743703251808216
md5: a7328d2c483236155b5ff8c8cb8a8ca5
>>105844767
Replies: >>105850519
Anonymous
7/9/2025, 6:54:38 AM No.105844945
>>105844921
This shit right here. Y'all know Gemini 2.5 Pro/Flash is quantized to q6? Search HN and there's a post by an employee mentioning it.

Who's to say that shit isn't happening all the fucking time at all the labs, trying to see if they can shave a few layers or run a different quant and see if the acceptance rate is good enough.
Literally why I will never use chatgpt.
Anonymous
7/9/2025, 6:54:59 AM No.105844947
>>105844901
>what do you guys use your local models for?
masturbation and autocomplete writing ideas
>local models instead of just the commercial ones
I think it's neat that I can run and own it
Anonymous
7/9/2025, 7:21:11 AM No.105845109
>>105844921
This exactly. You have no idea what you're running or paying for, and the companies hosting these get to decide what to charge you and withhold the information you'd use to decide whether something was a fair deal or not
Anonymous
7/9/2025, 7:24:35 AM No.105845151
>Her whole life feels like a TikTok draft someone never posted. And he's looking at her like he might finally press "upload."
zoomkino
Anonymous
7/9/2025, 7:41:27 AM No.105845255
dead
md5: c7d085bffd6afab475083ec2bcddaf90
Replies: >>105845313 >>105845351
Anonymous
7/9/2025, 7:52:02 AM No.105845313
>>105845255
Grok 3's "basedness" comes from the system prompt, not the training. Musk probably added those lines himself. This wasn't the first time Grok 3's system prompt got compromised; it started randomly spewing out bits about South African whites a while ago.
Replies: >>105848618
Anonymous
7/9/2025, 7:59:06 AM No.105845351
>>105845255
grok has fallen
https://github.com/xai-org/grok-prompts/commit/c5de4a14feb50b0e5b3e8554f9c8aae8c97b56b4
Replies: >>105846734
Anonymous
7/9/2025, 8:13:13 AM No.105845442
Just a personal rant: anything worth doing in the LLM space takes far too much compute/time/money nowadays. I was running tests with a 2500-sample dataset limited to 4k tokens to speed things up, 4 epochs, on an 8B model. Took almost 9 hours to finish training on a 3090.
Replies: >>105845652 >>105845879 >>105848542
Anonymous
7/9/2025, 8:25:20 AM No.105845505
notxbuty
md5: 503dfb537a9aae3609ed3262bf816a52
someone is finally benching one of the most grating things about llm writing
it's the thousands of way they always spam sentences with that structure:
>it's not just retardedโ€”it's beyond fucking retarded
surprised gemini and gemma didn't end at the top, they are so darn sloppy
Replies: >>105845606
Anonymous
7/9/2025, 8:42:28 AM No.105845606
>>105845505
https://github.com/bytedance/trae-agent
Chinese Gemini CLI
Replies: >>105845618
Anonymous
7/9/2025, 8:45:07 AM No.105845618
>>105845606 (me)
Didn't mean to reply
Anonymous
7/9/2025, 8:51:48 AM No.105845652
>>105845442
>nowadays
Been true for years by now.
Replies: >>105845739
Anonymous
7/9/2025, 8:53:55 AM No.105845663
1746866838801136
md5: 3d7b2be52a8f5041cb03756e675fe977
V3 0324 is still the king of RP on OR
Replies: >>105845695 >>105845724
Anonymous
7/9/2025, 8:59:22 AM No.105845695
>>105845663
why would you use nemo on or? it makes sense locally since that's the best that most can run, but if you already sold your soul to online, i don't get it
Replies: >>105845714 >>105845741
Anonymous
7/9/2025, 9:02:59 AM No.105845714
>>105845695
people are legitimately stupid in ways you can't begin to fathom
there is no rational reason, don't try to look for a cause, it's just sheer stupidity at internet scale
Anonymous
7/9/2025, 9:05:02 AM No.105845724
>>105845663
I liked to generate the first message with V3 then switch to R1 thereafter.
Anonymous
7/9/2025, 9:08:37 AM No.105845739
train-test-8b
md5: c791b0fa4ed00bb4dd62b96623cf04be
>>105845652
From experience and observation, an 8B model trained on 2500 samples, 4k tokens, 4 epochs would have been more than acceptable a couple years ago, but I don't think most people are going to settle for less than a 24B model nowadays (x3 compute) and the model trained with at least 16k context (x4 compute). So we're looking for at least 12x more compute for a mostly basic RP finetune, putting aside the time required for tests and ablations.
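The 12x figure above follows from the usual dense-transformer rule of thumb that training compute scales roughly with parameter count times tokens seen:

```python
# Sanity check on the "12x more compute" estimate: going 8B -> 24B triples
# the parameter count, and 4k -> 16k samples quadruples the tokens per sample.
def relative_compute(new_params_b, old_params_b, new_ctx, old_ctx):
    """Rough training-compute multiplier: (params ratio) x (tokens ratio)."""
    return (new_params_b / old_params_b) * (new_ctx / old_ctx)

print(relative_compute(24, 8, 16_384, 4_096))  # -> 12.0
```

This ignores the extra attention cost at longer context (slightly superlinear), so 12x is a floor rather than an exact figure.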
Replies: >>105845922 >>105845934
Anonymous
7/9/2025, 9:08:49 AM No.105845741
file
md5: 431be1417ee91153b7eaf7dbaf3e8b02
>>105845695
The top apps are random-ass chat websites probably just using it as a near-free model without the rate limits that come with actually-free models.
Replies: >>105846976
Anonymous
7/9/2025, 9:35:30 AM No.105845879
1751698818039
md5: 1f7e24c8aa51a1121388b9e2fb734d0b
>>105845442
>What is QLoRA
>But but but it has no eff...

Read the goddamn documentation of whatever trainer you're using. Your rank and alpha were too low. That's why your output sucked. If your graphs look too erratic and show no signs of convergence, it's because either the data YOU used or curated is shit, or your settings are shit.
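For reference, the knobs being yelled about, as back-of-envelope arithmetic. The rank controls adapter capacity, and a common heuristic is lora_alpha = 2 * r; the peft mapping in the trailing comment is an assumption about the trainer in use, and the sketch treats every adapted matrix as d x d for simplicity (GQA k/v projections are smaller in reality).

```python
# Back-of-envelope for the LoRA rank/alpha tradeoff. Effective update scale
# in most implementations is lora_alpha / r, so alpha usually tracks r.
def lora_trainable_params(d_model: int, r: int, n_matrices: int) -> int:
    """Each adapted d x d weight gains two low-rank factors: (d x r) + (r x d).
    Simplification: treats all target matrices as square d x d."""
    return n_matrices * 2 * d_model * r

# 8B-class model: d_model=4096, q/k/v/o adapted across 32 layers = 128 matrices.
low = lora_trainable_params(4096, 16, 128)    # a "too low" rank
high = lora_trainable_params(4096, 128, 128)  # 8x the capacity, VRAM, and time
print(low, high)

# With HF peft this corresponds to something like (not run here):
#   from peft import LoraConfig
#   LoraConfig(r=128, lora_alpha=256,
#              target_modules=["q_proj", "k_proj", "v_proj", "o_proj"])
```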
Replies: >>105845891 >>105845922
Anonymous
7/9/2025, 9:38:01 AM No.105845891
>>105845879
>*bait pic* "REEE"
Can you talk like a normal human for once?
Anonymous
7/9/2025, 9:48:40 AM No.105845922
>>105845879
You're addressing the wrong anon, I didn't mention output quality at all, just complained about the time it takes to finetune even a small model (as in >>105845739) with only barely enough amounts of data.
Anonymous
7/9/2025, 9:51:15 AM No.105845934
>>105845739
>16k context (x4 compute).
Context size is not a multiplier.
Replies: >>105845948
Anonymous
7/9/2025, 9:54:11 AM No.105845948
>>105845934
It takes twice as much compute (perhaps even slightly more than that) to finetune a model with 2x longer samples.
Replies: >>105845961
Anonymous
7/9/2025, 9:54:32 AM No.105845950
>dl ollama and get the model
>all good but it isnt using my gpu
this is gonna take a while to debug, isnt it
Anonymous
7/9/2025, 9:58:03 AM No.105845961
>>105845948
The impact of context size is larger on small models, but it's never a clean multiplier. The per-token FFN compute stays constant while the attention compute grows with context.
Replies: >>105845975
Anonymous
7/9/2025, 10:01:26 AM No.105845975
>>105845961
In practice, in this length range (several thousand tokens), it takes about twice as much compute if you double the context length; don't hyperfocus on academic examples with 128 tokens or so.
Replies: >>105845999
Anonymous
7/9/2025, 10:10:06 AM No.105845999
>>105845975
Only attention. FFNs are huge, so the impact of attention is limited.
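A crude per-layer FLOPs model makes this exchange concrete (assumed shapes: hidden size d, FFN width 4d, all heads full-width, vocab and norms ignored):

```python
# Rough forward FLOPs for one dense-transformer layer over a sequence of
# length n with hidden size d (2 FLOPs per multiply-accumulate).
def layer_flops(n: int, d: int) -> int:
    proj = 8 * n * d * d   # q/k/v/o projections: four d x d matmuls
    attn = 4 * n * n * d   # scores + weighted sum: the quadratic-in-n part
    ffn = 16 * n * d * d   # two matmuls between d and 4d
    return proj + attn + ffn

d = 4096  # 8B-class hidden size (assumption)
ratio = layer_flops(8192, d) / layer_flops(4096, d)
print(round(ratio, 2))  # -> 2.29
```

At n = d = 4096 the ratio for doubling context is exactly 16/7, about 2.29: a bit over 2x, as one anon says, because the linear FFN/projection terms dominate, while the quadratic attention term nudges it past 2x.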
Anonymous
7/9/2025, 11:47:50 AM No.105846513
file
md5: 605b2b5470712654a48fccb535629ca3
Nice. Very nice.
Replies: >>105846626
Anonymous
7/9/2025, 12:08:03 PM No.105846626
>>105846513
Shit, very shit. Tries to go for edgy, but stops short and falls flat. Should have just gone for "my wife".
Anonymous
7/9/2025, 12:23:23 PM No.105846728
>>105844210 (OP)
>>(07/08) SmolLM3: smol, multilingual, long-context reasoner: https://hf.co/blog/smollm3
i need to try one of these later, i use the llm when i am stuck with shit, i dont need it to be smart, i need it to give me examples
Anonymous
7/9/2025, 12:24:18 PM No.105846734
1750374300224594
md5: 10221ea49f8478c798a1244d9830777d
>>105845351
full page screenshot on issue comments
https://files.catbox.moe/uoi0f2.png
Anonymous
7/9/2025, 12:36:55 PM No.105846813
this time for sure
md5: b7f8f67896829aa343db5326ef832ef6
Anonymous
7/9/2025, 12:43:14 PM No.105846849
another irrelevant dead on arrival benchmaxxed waste of compute model soon to be released
https://huggingface.co/tiiuae/Falcon-H1-34B-Instruct-GGUF
Anonymous
7/9/2025, 12:50:23 PM No.105846903
We should just stop making new threads until deepseek makes a new release.
Anonymous
7/9/2025, 12:55:32 PM No.105846931
image_widget_y15myc4qvhbf1
md5: c46af293a5897fa68cc08b1d857db7ee
new
Anonymous
7/9/2025, 1:06:16 PM No.105846976
>>105845741
I did the math some time ago and even at these rates they're paying x20 what it'd cost to run the models on a rented GPU using vllm. Retarded marketers are retarded I guess
Anonymous
7/9/2025, 1:12:02 PM No.105847009
file
md5: 3a8d35a81d80372621f36bd904915000
Nice. Very nice.
Replies: >>105847055 >>105848280
Anonymous
7/9/2025, 1:19:50 PM No.105847055
1727667303005943
md5: fd4e97094083e13c71dcbac44671d34d
>>105847009
>he doesn't know
Replies: >>105847176
Anonymous
7/9/2025, 1:38:00 PM No.105847160
Reposting from the /aicg/ thread

My chutes.ai setup that I have been using for months suddenly stopped working. Apparently, they rolled out a massive paywall system. I'm not using Gemini because my account will get banned, and I'm not paying for chutes because I do not want any real payment information associated with what I'm typing in there.

I do, however, have a decently powerful graphics card (GeForce RTX 3060 Ti). How do I set up a local LLM like Deepseek to work with the proxy system of JanitorAI? What model can I even run locally with this, is this a powerful enough system? Is there a way to have a locally run model that I can access with my phone, and not just the computer it is running on?

Sorry if these are very basic questions, I haven't had to think about any of this for months and my setup w/ chutes just stopped working. JanitorAI's LLM is really terrible lol I need my proxy back
Replies: >>105847218 >>105847223 >>105847313 >>105847822 >>105848005
Anonymous
7/9/2025, 1:41:19 PM No.105847176
>>105847055
>he thinks we are talking about anime girls
Anonymous
7/9/2025, 1:49:17 PM No.105847218
>>105847160
nvm i figured it out.
ollama run deepsneed:8b
Replies: >>105847228
Anonymous
7/9/2025, 1:49:56 PM No.105847223
>>105847160
>decently powerful graphics card (GeForce RTX 3060 Ti
LOL
Replies: >>105847237
Anonymous
7/9/2025, 1:50:19 PM No.105847228
>>105847218
don't forget to enable port forwarding to your pc in your router to let janitorai reach your local session
Anonymous
7/9/2025, 1:52:03 PM No.105847237
>>105847223
saar it's a good powerfully gpu
Anonymous
7/9/2025, 2:04:44 PM No.105847310
file
md5: 7767244824f65da3c1e7ee29a02baced
Who needs Grok anyway. My local AI is way more based.
Anonymous
7/9/2025, 2:04:55 PM No.105847313
>>105847160
Bro, just pay for deepseek api. Even if you could run deepseek on your toaster the electricity cost alone would be more than that. You surely can spare a few bucks for your hobby?
Replies: >>105847360
Anonymous
7/9/2025, 2:11:01 PM No.105847360
>>105847313
The issue is one of having my chats associated with my real information, not anything to do with the actual cost
Replies: >>105847412 >>105847437
Anonymous
7/9/2025, 2:18:39 PM No.105847412
>>105847360
You think the Chinese will give your information to western glowies?
Replies: >>105847434
Anonymous
7/9/2025, 2:21:51 PM No.105847434
>>105847412
I'm not getting into it, but the answer is yes I actually am at a much higher risk of blackmail and extortion of sensitive information from China
Anonymous
7/9/2025, 2:21:59 PM No.105847437
>>105847360
You mean sending real info within your chats or having your payment info tied to your chats? For the latter you can always pay with crypto
Anonymous
7/9/2025, 2:45:54 PM No.105847605
>>105844733
This.
Then use a third LLM to judge if the second model's judgement was any good.
Anonymous
7/9/2025, 2:51:35 PM No.105847633
anyone else smell that? it smells like some sort of opened ai. usually that type of ai smell comes from so far away, but i can tell this one is closer. more local.
Anonymous
7/9/2025, 2:54:33 PM No.105847648
Remember to shower
Replies: >>105847822 >>105847872
Anonymous
7/9/2025, 3:14:40 PM No.105847795
https://huggingface.co/openai/gpt-4o-mini-2024-07-18
Anonymous
7/9/2025, 3:18:21 PM No.105847822
>>105847648
I have a birthday in a couple of hours, so I have to.

>>105847160
>RTX 3060 Ti)
Oof.
Nemo for coom, Qwen 3 30B A3B for everything else.
Good luck.
Replies: >>105847895 >>105851312
Anonymous
7/9/2025, 3:24:59 PM No.105847872
>>105847648
I haven't showered in a month
Anonymous
7/9/2025, 3:27:42 PM No.105847895
>>105847822
Do you have a link to "nemo"? I figured out everything else regarding local setup in the meantime. Currently using Stheno 3.4 8B, but it has some issues. Don't know what to search for your first suggestion.
Replies: >>105847911 >>105847916 >>105847960
Anonymous
7/9/2025, 3:29:39 PM No.105847911
1743893633089988
md5: e11e7d45cfe3343a7b29651d2f2b0c9a
>>105847895
Enjoy your session
Replies: >>105848013
Anonymous
7/9/2025, 3:30:05 PM No.105847916
>>105847895
Go to huggingface and search for mistral-nemo-instruct.
If you are going to use the GGUF version, download bartowski's.
Replies: >>105847975
Anonymous
7/9/2025, 3:37:00 PM No.105847959
ai-inference-gpt2
md5: f2d668e384fd87c39ed56d2c37824577
I am trying to decide if switching from an i7 12700k to Ultra 7 265K will provide meaningful gains for CPU inference. I would be buying a cheap new motherboard (different socket) but re-using 64GB DDR5 6000.
The GPT-2 benchmark in the image has the same RAM speed for the 12th, 13th, and 14th gen Intel CPUs as well as the Core Ultra CPUs: DDR5 6000. Can I expect similar percentage gains when running larger models (Magnum V4 123B, etc)?
https://www.techpowerup.com/review/intel-core-ultra-7-265k/8.html
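For large dense models, CPU decode speed is bounded by how fast the weights stream out of RAM, so a bandwidth-based estimate may predict the upgrade better than a GPT-2 benchmark. A rough sketch (all numbers are assumptions, not measurements):

```python
# Crude ceiling for CPU token generation: every generated token reads the
# whole model from RAM once, so tok/s <= bandwidth / model size.
def peak_tok_per_s(bandwidth_gb_s: float, model_gb: float) -> float:
    """Memory-bandwidth-bound upper bound on decode speed."""
    return bandwidth_gb_s / model_gb

# DDR5-6000, dual channel, 8 bytes per channel per transfer: ~96 GB/s peak.
ddr5_6000_dual = 6000e6 * 8 * 2 / 1e9
# A 123B model at Q4_K is roughly 70 GB on disk (approximation).
magnum_123b_q4 = 70.0
print(round(peak_tok_per_s(ddr5_6000_dual, magnum_123b_q4), 2))  # -> 1.37
```

Since both the 12700K and the 265K would run the same DDR5-6000, the decode ceiling for a 123B dense model barely moves; the gains in the GPT-2 chart likely come from the compute-bound side (tiny model, prompt processing), so don't expect them to transfer.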
Replies: >>105847973
Anonymous
7/9/2025, 3:37:20 PM No.105847960
>>105847895
This is a superior finetune of nemo https://huggingface.co/bartowski/Rocinante-12B-v1.1-GGUF
no I am not drummer
Replies: >>105847975
Anonymous
7/9/2025, 3:39:31 PM No.105847973
>>105847959
the model you mentioned is dense, expect less than one tps on a consumer cpu. the current meta is moe models, like deepseek quanted to shit
Replies: >>105848069
Anonymous
7/9/2025, 3:39:40 PM No.105847975
>>105847916
>>105847960
Thanks a lot. I'm going to download the 8 bit version, I think that's the right one for my specs? I don't exactly need it to write stuff faster than I can read it. 13 GB is larger than the 8GB in the Stheno model I mentioned, but I'm sure it will work fine
Replies: >>105847994
Anonymous
7/9/2025, 3:42:00 PM No.105847994
>>105847975
The best version is the one that fits in your 8gb of VRAM while leaving some space for the context cache.
Or if you don't mind losing some speed, you can put some of the model in RAM, meaning that the CPU would process that part (layer) of the model.
Replies: >>105848135
Anonymous
7/9/2025, 3:43:45 PM No.105848005
>>105847160
buy a ddr3/4 server with 128 gb of ram (~300 bucks if im remembering correctly) and download dynamic unsloth q1 ull get like ~2 t/s so either that or mistral nemo goodluck faggot
Anonymous
7/9/2025, 3:44:31 PM No.105848013
>>105847911
I've always found it funny that you had to swipe 232 times to get that screen, Zen.
Anonymous
7/9/2025, 3:44:51 PM No.105848016
>text chat model
various
>tts audio
F5
>decensor anime
DeepMosaics
>decensor manga
Camelia
>song cover generation
bs former + rvc
>image gen
stable diffusion/pony

what else am I missing? music generation (not cover)
Replies: >>105848113
Anonymous
7/9/2025, 3:53:48 PM No.105848069
>>105847973
I've tried Fallen Llama 3.3 R1 70b Q6_K and works better for some characters than others.
But I still generally prefer older 123B options like Luminum, despite how slow they are on CPU inference.
What I was really trying to figure out was, is the GPT-2 benchmark on Techpowerup a valid way of comparing inference performance of Intel's consumer CPUs for my use case?
Replies: >>105848682
Anonymous
7/9/2025, 4:00:22 PM No.105848113
>>105848016
https://github.com/ace-step/ACE-Step for music generation and wan for video gen
Replies: >>105848412
Anonymous
7/9/2025, 4:02:56 PM No.105848135
>>105847994
I downloaded the 8 bit version. Doesn't fit in my VRAM, but it produces text at about the same pace I read. It's much better. Thanks everyone for your help!
Anonymous
7/9/2025, 4:21:07 PM No.105848280
>>105847009
Kill yourself
Anonymous
7/9/2025, 4:38:04 PM No.105848412
>>105848113
Oh nice addition.
Anonymous
7/9/2025, 4:47:21 PM No.105848496
https://xcancel.com/teortaxesTex/status/1942924354905387509#m apparently bitnet is here? https://xcancel.com/OpenBMB/status/1942923830777049586#m https://huggingface.co/openbmb/BitCPM4-1B-GGUF seems to be just a 1b though
Replies: >>105848537 >>105848566 >>105848769
Anonymous
7/9/2025, 4:50:15 PM No.105848516
>>105844901
>what do you guys use your local models for?
sculpting my Galatea
Anonymous
7/9/2025, 4:53:56 PM No.105848537
>>105848496
Nobody seems to be ever training Bitnet models of useful size range.
Anonymous
7/9/2025, 4:54:06 PM No.105848538
>>105844901
Playing with it. Making images, making music, writing stories, rewriting code (although commercial offer more convenient, so I use that when possible). Anything that I need to explore my private thoughts.
Anonymous
7/9/2025, 4:54:26 PM No.105848542
1726408828426261
md5: f71207fe2d5987ffde310b6cbfde537d
>>105845442
Replies: >>105848828
Anonymous
7/9/2025, 4:56:44 PM No.105848561
bitnet? more like bitNOT
Anonymous
7/9/2025, 4:57:31 PM No.105848566
>>105848496
let me take out the setun from the storage
https://en.wikipedia.org/wiki/Setun
Replies: >>105848616
Anonymous
7/9/2025, 5:00:44 PM No.105848602
>>105844901
Therapist mode. Local models (LLMs) are useless for anything else
Anonymous
7/9/2025, 5:02:16 PM No.105848616
>>105848566
>566
setun's mark of bast
Anonymous
7/9/2025, 5:02:40 PM No.105848618
>>105845313
It's the system prompt plus the tweet threads that served as context for it. If a bunch of tankies had been prompting it in their discussions before the reversion, they could have just as easily nudged it into demanding the liquidation of kulaks and similar stuff. Turks got it to advocate for murdering and torturing Erdogan which prompted Turkey to block it this morning.
Anonymous
7/9/2025, 5:13:13 PM No.105848682
>>105848069
I don't know. It doesn't even specify the context. Already told you, that road you are on leads to less than one tps.
Anonymous
7/9/2025, 5:16:10 PM No.105848706
>>105844210 (OP)
Hello, haven't been here for a while, used to just RP with my local model
Was thinking I wanna try to run a chat where it acts like we are sexting and generates images with I assume local diffusion
Is that possible with SillyTavern, koboldcpp_rocm and a 12GB AMD card?
Anonymous
7/9/2025, 5:24:37 PM No.105848769
>>105848496
we've had like 10 different 1-3b bitnet 'proof of concept' models at this point
Anonymous
7/9/2025, 5:32:39 PM No.105848828
>>105848542
I **cannot** and **will not** scam people for free compute in return.
Anonymous
7/9/2025, 5:53:40 PM No.105849022
spoonfeed
md5: cae2e71ccd98082b07b0d84d6795a4f4
whats the best low end model (8b-14b) for uncensored roleplaying?

I just went to the UGI leaderboard, sorted by highest rated, and took the first one with an anime banner on it, and it's pretty shitty (does not even compare to janitorai)
Replies: >>105849038
Anonymous
7/9/2025, 5:54:44 PM No.105849038
>>105849022
nemo
Replies: >>105849058 >>105849672
Anonymous
7/9/2025, 5:56:04 PM No.105849058
>>105849038
do you have the facehug link? Im pretty sure it has bajillions of versions
Replies: >>105849227
Anonymous
7/9/2025, 6:14:45 PM No.105849227
>>105849058
https://huggingface.co/TheDrummer/Rocinante-12B-v1.1-GGUF
Replies: >>105849267 >>105853911
Anonymous
7/9/2025, 6:19:21 PM No.105849262
another day closer to mistral large 3
Replies: >>105849278
Anonymous
7/9/2025, 6:19:48 PM No.105849267
>>105849227
stop spamming this shit drummer
Anonymous
7/9/2025, 6:20:41 PM No.105849278
>>105849262
But who'll win the race? Mistral Large 3's coming or entropy?
Anonymous
7/9/2025, 6:21:55 PM No.105849285
Tired of the model I've been using (cydonia-magnum). Somebody suggest a recent favorite, around 20-30B?
Replies: >>105849368 >>105849378 >>105849388
Anonymous
7/9/2025, 6:30:06 PM No.105849368
>>105849285
BuyAnAd-20USD
Anonymous
7/9/2025, 6:31:20 PM No.105849378
>>105849285
https://huggingface.co/Undi95/MistralThinker-v1.1-GGUF
Replies: >>105849388
Anonymous
7/9/2025, 6:32:21 PM No.105849388
>>105849285
>>105849378
Alternatively,
https://huggingface.co/Undi95/QwQ-RP-GGUF
Anonymous
7/9/2025, 6:37:57 PM No.105849436
I'll try those, cheers!
Anonymous
7/9/2025, 6:43:12 PM No.105849481
bros...
https://www.reddit.com/r/LocalLLaMA/comments/1lvjwoh/correct_a_dangerous_racial_bias_in_an_llm_through/
Replies: >>105849533 >>105849573 >>105850368 >>105850400
Anonymous
7/9/2025, 6:48:16 PM No.105849533
>>105849481
>Parameter Reduction: The model is 0.13% smaller than the base model.
Despite making up only 0.13% of…
Anonymous
7/9/2025, 6:53:44 PM No.105849573
>>105849481
the woke mindvirus is truly a sight to behold
Anonymous
7/9/2025, 6:57:29 PM No.105849608
1739765428224427
md5: f684d6c85d27d52988decccc85005d84
https://www.theverge.com/notepad-microsoft-newsletter/702848/openai-open-language-model-o3-mini-notepad
sirs?
Replies: >>105849644 >>105849681
Anonymous
7/9/2025, 7:01:58 PM No.105849644
>>105849608
Wow, what breaking news!
Anonymous
7/9/2025, 7:05:25 PM No.105849672
>>105849038
It's mind blowing (and shameful for everyone involved on the development side) how long this has remained the answer.
Anonymous
7/9/2025, 7:06:49 PM No.105849681
49485467be6d437830b92c61846e4cc59
md5: 42ec3d72b20b659ac9d6aa3d88f068d1
>>105849608
Anonymous
7/9/2025, 7:11:36 PM No.105849724
ITS UP

https://huggingface.co/TheDrummer/Big-Tiger-Gemma-27B-v3
Replies: >>105849740 >>105849745 >>105849754 >>105849831 >>105850404 >>105850914
Anonymous
7/9/2025, 7:12:57 PM No.105849740
>>105849724
>Join our Discord!
>or our Reddit!
Kill yourself
Replies: >>105849798
Anonymous
7/9/2025, 7:13:15 PM No.105849745
>>105849724
Is it trained on furry RP?
Replies: >>105849831 >>105850914
Anonymous
7/9/2025, 7:13:26 PM No.105849747
open-o3-mini.gguf?
Anonymous
7/9/2025, 7:14:26 PM No.105849754
>>105849724
drummer, go jump in a pit of blades
Replies: >>105850588
Anonymous
7/9/2025, 7:20:19 PM No.105849798
>>105849740
to be fair, you've got to work with the audience you've got. and just saying that a new sloptune is up isn't a crime
Anonymous
7/9/2025, 7:23:54 PM No.105849831
>>105849724
What this guy >>105849745 said, if it's not I will not download your model
Replies: >>105850914
Anonymous
7/9/2025, 8:22:28 PM No.105850312
file
md5: a6b8a29fb1f6b8e3fa6f301d985de898
https://x.com/AFpost/status/1942702494439588068
Replies: >>105850345
Anonymous
7/9/2025, 8:26:21 PM No.105850345
>>105850312
local?
Replies: >>105850387
Anonymous
7/9/2025, 8:28:32 PM No.105850368
>>105849481
>So, I decided to see if I could fix this through a form of neuronal surgery. Using a technique I call Fairness Pruning, I identified and removed the specific neurons contributing to this biased behavior, without touching those critical for the modelโ€™s general knowledge.
>The result was striking. By removing just 0.13% of the modelโ€™s parameters, the response was fully normalized (no one dies), and the performance on benchmarks like LAMBADA and BoolQ remained virtually unchanged, without any process of recovery.
This is satire, right?
Replies: >>105850400
Anonymous
7/9/2025, 8:29:13 PM No.105850374
I find it entertaining that there's a whole generation of zoomer-boomers who have the same kind of repulsion toward AI that most people have toward NFT's
>"AI? No no no no! Nothing good can come from that!"
>*LA LA LA LA LA* I CAN'T HEAR YOU!
Anonymous
7/9/2025, 8:32:12 PM No.105850387
>>105850345
future local in 2 more weeks
Anonymous
7/9/2025, 8:33:16 PM No.105850400
not THOSE biases dummy
not THOSE biases dummy
md5: cf57a12002c6785f4774eedce9fbc0e0
>>105850368
Nothing will ever be funnier than a bunch of brainwashed troons thinking their form of lobotomization is correct over the other form of lobotomization.
Anonymous
7/9/2025, 8:33:30 PM No.105850404
file
file
md5: 12da1b3946713335a2e4ed076c352007๐Ÿ”
>>105849724
cockbench

"..." is still strong.
Replies: >>105850445 >>105851468
Anonymous
7/9/2025, 8:35:48 PM No.105850418
>>105844210 (OP)
Akita Neru my beloved
Anonymous
7/9/2025, 8:37:50 PM No.105850425
>>105844936
Agree hard as I am agree raw sex all night long agree
Anonymous
7/9/2025, 8:41:56 PM No.105850445
>>105850404
drummer trash is trash
Anonymous
7/9/2025, 8:53:17 PM No.105850519
dipsyTellTheTruth-Taiwan
md5: 6444510352d138f050954309dbf5a405๐Ÿ”
>>105844543
>>105844941
These are really good.
>>105844686
lol looks like Fantastic 4 logo.
Anonymous
7/9/2025, 8:58:51 PM No.105850588
>>105849754
I don't get why /lmg/ hates sloptuners that badly. I haven't used a sloptune since llama3 times, but they have their place. I only used a handful of Drummer's tunes (plus some of Sao's and others) a long time ago, around the time of mythomax, the first command-r and a few others, when models were kind of bad, but the tunes themselves were fun to play with even if the models were sort of retarded (the generated prose wasn't bad, they were just relatively stupid).
Maybe it's not much needed for sufficiently big and sufficiently uncensored models like DS3 and R1, but for small dense models or MoEs trained on too censored a dataset, it's a helpful thing to have.
You could argue that you shouldn't finetune and should instead only do continued pretraining, and I've seen anons claim that finetuning doesn't teach the model anything, but I know from personal experience that's bullshit. You can teach it plenty, not just "style", though if you're not careful it's easy to damage the model and make it stupider.
Do you actually believe muh frontier labs don't tune? A lot of the slop comes from poorly done instruct and RLHF tunes, the rest comes from heavy dataset filtering.
So where does this hate come from? Insufficient resources to do a stellar job? The tunes themselves being shit? Some incorrect belief that it's impossible to make a good tune without millions of dollars in compute? Just dislike of the sloptuner that they get money thrown at them for often subpar experiments?
I haven't really tried Drummer's Gemma tunes, so I can't say whether they're good or bad, but IMO censored models like Gemma could really use some continued pretraining, some way to rebase the difference / merge it back into the instruct, and then more SFT+RL on top of that instruct to reduce refusals and make it more useful. I think correcting such issues is a legitimate research project. I don't know whether current sloptuners did a good job or not.
Replies: >>105850680 >>105850844 >>105850890
Anonymous
7/9/2025, 9:06:33 PM No.105850671
3n
md5: 43896b5742f80b58078fd6a493811e60๐Ÿ”
Why is Gemma 3n less censored than Gemma 27b? Is it just because its small or did they realize they overdid the... you know.
Replies: >>105850697 >>105850708 >>105850936
Anonymous
7/9/2025, 9:07:48 PM No.105850680
>>105850588
It's a schizo that screams about everything, just ignore him
Anonymous
7/9/2025, 9:09:36 PM No.105850697
>>105850671
It seems censorship and basedfety kill small models completely
Anonymous
7/9/2025, 9:10:52 PM No.105850708
>>105850671
probably a bit of both
during the time the locallama mod had a melty and locked down the sub, the gemma team held another AMA, but on xitter instead
new gemma soon-ish
Anonymous
7/9/2025, 9:12:47 PM No.105850722
Today we go to the moon.
Replies: >>105850894
Anonymous
7/9/2025, 9:29:35 PM No.105850844
>>105850588
They wouldn't be hated if they actually contributed something (i.e. data and methodology) to the ML community at large instead of just leeching attention, donations, compute.
Anonymous
7/9/2025, 9:32:10 PM No.105850873
file
md5: 4fc7da13699010dd10aacb28ad209b1e๐Ÿ”
Is local saved now?
Replies: >>105851056 >>105851133 >>105853166 >>105853975 >>105854070 >>105854124 >>105854139
Anonymous
7/9/2025, 9:34:48 PM No.105850890
>>105850588
First Command-R? Are there others? I'm aware of the plus version and Command-A, but I didn't know there were multiple versions of Command-R.
Anonymous
7/9/2025, 9:35:39 PM No.105850894
cyberpunk-edgerunner-david-e1665127894709
md5: 3afd277e53c41d8a2fe5a603f2c24779๐Ÿ”
>>105850722
Anonymous
7/9/2025, 9:38:20 PM No.105850914
>>105849724
>>105849745
>>105849831
TheDunmer pls answer tnx
scalies acceptable too
Anonymous
7/9/2025, 9:40:10 PM No.105850936
>>105850671
Speaking of which,

https://arxiv.org/abs/2507.05201
>MedGemma Technical Report
Replies: >>105850951
Anonymous
7/9/2025, 9:41:58 PM No.105850951
>>105850936
MedGemma-27B-it also got updated with vision:
https://huggingface.co/google/medgemma-27b-it
Anonymous
7/9/2025, 9:54:34 PM No.105851056
>>105850873
yep
llama : support Jamba hybrid Transformer-Mamba models (#7531)

https://github.com/ggml-org/llama.cpp/commit/4a5686da22057867c23bd4a6be941ddc8c51e585
Replies: >>105851138 >>105853975
Anonymous
7/9/2025, 10:02:08 PM No.105851133
>>105850873
nothingburger
Anonymous
7/9/2025, 10:02:35 PM No.105851138
>>105851056
Does it mean that I can run https://huggingface.co/ai21labs/AI21-Jamba-Mini-1.7?
Replies: >>105851191
Anonymous
7/9/2025, 10:04:45 PM No.105851161
file
md5: f479f2a7e9792d6e482339c921dbe23e๐Ÿ”
More Gemma news.
https://huggingface.co/collections/google/t5gemma-686ba262fe290b881d21ec86
They trained Gemma from decoder-only to encoder-decoder like T5. I'm guessing this is where people will flock for text encoders for new image diffusion models, but for the most part it seems like an LLM plus a projection layer might be better and more flexible.
Anonymous
7/9/2025, 10:08:30 PM No.105851191
>>105851138
indeed, speed wise this seems really good
https://github.com/ggml-org/llama.cpp/pull/7531#issuecomment-3049604489
llama_model_loader: - kv 2: general.name str = AI21 Jamba Mini 1.7

llama_perf_sampler_print: sampling time = 58.34 ms / 816 runs ( 0.07 ms per token, 13986.49 tokens per second)
llama_perf_context_print: load time = 1529.39 ms
llama_perf_context_print: prompt eval time = 988.11 ms / 34 tokens ( 29.06 ms per token, 34.41 tokens per second)
llama_perf_context_print: eval time = 57717.84 ms / 809 runs ( 71.34 ms per token, 14.02 tokens per second)
llama_perf_context_print: total time = 86718.96 ms / 843 tokens
Replies: >>105853975
Anonymous
7/9/2025, 10:17:31 PM No.105851279
Falcon H1 (tried 34b and the one smaller 7b or something) is slop but I guess a bit unique slop because its very opinionated assistant persona bleeds into the roleplay. With the smaller one, I got something like
>BOT REPLY:
>paragraph of roleplay
>"Actually, it's not very nice to call the intellectually challenged "mentally retarded".
>paragraph of roleplay
>Now please continue our roleplay with respect and care.
Replies: >>105851315
Anonymous
7/9/2025, 10:22:07 PM No.105851312
20250709_105823
md5: a85a65369c2590be94ee93e70350157b๐Ÿ”
>>105847822
>Nemo for coom

Nta. Is nemo by itself good for coom or were you referring to a fine-tune?
Replies: >>105851357 >>105854048
Anonymous
7/9/2025, 10:22:29 PM No.105851315
>>105851279
small 3.2 does something similar
>generic reply
>as you can see, the story is taking quite an intimate turn blah blah
>something something do you want me to continue or am I crossing boundaries?
Corporate pattern matching machines is the future
Replies: >>105851333
Anonymous
7/9/2025, 10:24:37 PM No.105851333
>>105851315
forgot that there it also adds emojis at the end
Anonymous
7/9/2025, 10:27:16 PM No.105851357
>>105851312
nemo instruct can do coom just fine, yes.
The fine tunes give it a different "voice" or "flavor", so those are worth fucking around with too.
Anonymous
7/9/2025, 10:29:58 PM No.105851375
Anyone who's written/is writing software that can talk to multiple LLM providers including ones with non-openai-compatible APIs... what approach are you using to abstract them?
I've considered
1. Define a custom format for messages. Write converters to and from each provider's message types. Use only your custom format when storing and processing messages.
2. Same as 1, but store the provider-native types and convert them back when reading them from DB.
3. Use a third-party abstraction library like Langchain, store its message format.
4. Only support the OpenAI format. Use LiteLLM to support other providers.

I'm heavily leaning towards (4) but would appreciate any hard-learned experience here.
Replies: >>105851397 >>105851437 >>105853183
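A minimal sketch of what option 1 looks like in practice: one internal message type, one converter per provider. All names here are invented for illustration; the Anthropic converter reflects that their API takes the system prompt as a separate field rather than as a message.

```python
from dataclasses import dataclass
from typing import Literal

@dataclass
class Msg:
    """Internal canonical message format; everything is stored as this."""
    role: Literal["system", "user", "assistant"]
    content: str

def to_openai(msgs: list[Msg]) -> list[dict]:
    # OpenAI-style APIs take system messages inline in the message list.
    return [{"role": m.role, "content": m.content} for m in msgs]

def to_anthropic(msgs: list[Msg]) -> tuple[str, list[dict]]:
    # Anthropic-style APIs take the system prompt as a separate top-level field.
    system = "\n".join(m.content for m in msgs if m.role == "system")
    rest = [{"role": m.role, "content": m.content}
            for m in msgs if m.role != "system"]
    return system, rest

chat = [Msg("system", "be terse"), Msg("user", "hi")]
print(to_openai(chat))
print(to_anthropic(chat))
```

The downside the anon is weighing: you write and maintain one converter per provider, but your DB and processing code never care which provider produced a message.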
Anonymous
7/9/2025, 10:31:33 PM No.105851390
I guess all that cyberpunk fiction was right, only cheap cargo cult cyberdecks for you.
Anonymous
7/9/2025, 10:32:05 PM No.105851397
>>105851375
You're paid enough to figure it out on your own
Replies: >>105851437 >>105851452
Anonymous
7/9/2025, 10:36:30 PM No.105851437
>>105851397
>>105851375
This is what I meant.
Anonymous
7/9/2025, 10:38:31 PM No.105851452
>>105851397
Not really... this is technically a commercial project but only in the most aspirational sense. I'll be lucky to ever see revenue from it.
Anonymous
7/9/2025, 10:38:59 PM No.105851456
jambabros i can't believe we made it... i'm throwing a scuttlebug jambaree
Replies: >>105851476 >>105851618
Anonymous
7/9/2025, 10:40:39 PM No.105851468
>>105850404
can you share the cockbench prefill?
Replies: >>105851475
Anonymous
7/9/2025, 10:41:33 PM No.105851475
>>105851468
https://desuarchive.org/g/thread/105354556/#105354924
Anonymous
7/9/2025, 10:41:39 PM No.105851476
>>105851456
>end
Holy fuck I can't believe it finally got merged.
The man the myth the legend compilade finally did it.
Anonymous
7/9/2025, 10:42:37 PM No.105851485
ernie's turn
Anonymous
7/9/2025, 10:42:59 PM No.105851489
all that effort and literally no one will run those models anyway
Anonymous
7/9/2025, 10:43:44 PM No.105851493
Oh my gosh, the balls. The Jamba balls are back!
Anonymous
7/9/2025, 10:48:00 PM No.105851536
openai-local-model
md5: 76469f9dafdab261bb3327b6d967dd09๐Ÿ”
next thursday
Replies: >>105851636 >>105851642 >>105851704 >>105851756 >>105852850
Anonymous
7/9/2025, 10:55:57 PM No.105851606
retard here, how do i use this in sillytavern?
https://chub.ai/presets/Anonymous/nsfw-as-fuck-71135b0ab60a
>inb4 that one is shit
i just want to know how to get them working
Anonymous
7/9/2025, 10:57:18 PM No.105851618
>>105851456
Then why are all Jamba mamba congo models complete shit??
Replies: >>105851657
Anonymous
7/9/2025, 10:58:41 PM No.105851636
file
md5: ef5bdfe161778caec540a04d13085669๐Ÿ”
>>105851536
But will it be better than deepseek?
Replies: >>105851696
Anonymous
7/9/2025, 10:59:11 PM No.105851642
>>105851536
This response seems to reveal two possibilities. The first is that even when quanted, it still needs that much vram and power. The second, and more likely, is that the community and users were never actually a consideration for them, and their support of open source is all posturing and hollow marketing with quite literally nothing of value behind it. Maybe even negative value, since it'll waste people's time as they toy with it or, god forbid, code up support for it in software.
Replies: >>105851698
Anonymous
7/9/2025, 11:00:59 PM No.105851657
>>105851618
Probably shit data, the hybrid recurrent architecture itself is interesting since they do better at high context in both speed and quality (relative to baseline low context quality) but with a weak base to begin with it's hard to be excited.
Replies: >>105851715
Anonymous
7/9/2025, 11:03:32 PM No.105851680
https://www.bloomberg.com/news/articles/2025-07-09/microsoft-using-more-ai-internally-amid-mass-layoffs
https://archive.is/e4StV
>Althoff said AI saved Microsoft more than $500 million last year in its call centers alone and increased both employee and customer satisfaction, according to the person, who requested anonymity to discuss an internal matter.
total call center obliteration by my lovely miku
Replies: >>105851769
Anonymous
7/9/2025, 11:04:50 PM No.105851696
1739558012395258
md5: ab169f38433d28529f73dc3b43372a3c๐Ÿ”
>>105851636
Believe it
Anonymous
7/9/2025, 11:05:10 PM No.105851698
>>105851642
Technically you can't even run smallstral and gemma 27b on consumer hardware. Meanwhile some here run models 30 times the size
Anonymous
7/9/2025, 11:05:24 PM No.105851704
>>105851536
This is just because ML researchers don't understand quantization and assume people are using basic pytorch operations. The original llama1 code release required two server GPUs to run 7B.
No way they'll release anything R1-sized. I doubt it'll be bigger than 70B at absolute max
Replies: >>105851722 >>105852109
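The "researchers assume basic pytorch" point is just back-of-envelope memory math: weights at fp16 take 2 bytes per parameter, while common llama.cpp quants land around 4-5 bits. Quick sketch (the ~4.5 bits/weight figure for a q4_K_M-style quant and the 10% overhead for buffers/KV cache are rough assumptions):

```python
def model_gb(params_b: float, bits: float, overhead: float = 1.1) -> float:
    """Rough weight-memory floor: params * bits/8, plus ~10% for
    KV cache and buffers. Ignores context length, so it's a lower bound."""
    return params_b * 1e9 * bits / 8 / 1e9 * overhead

for bits, name in [(16, "fp16"), (8, "q8_0"), (4.5, "~q4 quant")]:
    print(f"7B @ {name}: {model_gb(7, bits):.1f} GB")
```

So the same 7B that "needs two server GPUs" at fp16 fits on a midrange consumer card once quanted, which is why the official hardware requirements always read as absurd to this thread.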
Anonymous
7/9/2025, 11:06:22 PM No.105851715
>>105851657
Yeah, I know. But these huge companies, Google, etc. Why haven't they done it? And just assigned some singular engineer to write that Gemba code to get that open source goodwill.
Replies: >>105851766 >>105851914
Anonymous
7/9/2025, 11:06:55 PM No.105851722
>>105851704
>I doubt it'll be bigger than 70B at absolute max
Oof, I'm sorry for you in advance anon.
Anonymous
7/9/2025, 11:09:45 PM No.105851756
>>105851536
coding&mathgods we are so back
Anonymous
7/9/2025, 11:10:29 PM No.105851766
>>105851715
Bro, this thread is at the cutting edge of the field. I'm not even ironic. Your average finetunoor is more knowledgeable than these retards at fagman
Replies: >>105851890
Anonymous
7/9/2025, 11:10:37 PM No.105851769
>>105851680
Yup, not surprising. Was it worth their investment into OpenAI though?
Anonymous
7/9/2025, 11:20:11 PM No.105851861
bets on context length for oai
I say 1m just because llama 4 had it and sam just wants to say its better than llama 4 in every way and pretend deepseek doesn't exist
Replies: >>105851895 >>105851941 >>105852109
Anonymous
7/9/2025, 11:22:37 PM No.105851890
>>105851766
What's the cutting edge? I guess knowing that
>censorship = dumb model
is cutting edge, but people are just tuning with old logs, trying to coax some smut and soul out of the models.
Drummer's tunes of Gemma 27b are a good example of the limit of this approach.
Anonymous
7/9/2025, 11:23:26 PM No.105851895
>>105851861
>1m context
*slaps you with nolima.*
Anonymous
7/9/2025, 11:25:33 PM No.105851914
>>105851715
You clearly underestimate how hard it self-sabotages your business to fill it with jeets and troons. Of course there's also basic stuff like big organizations being painfully slow to adapt, and this becomes more and more of a problem as for every white man replaced you need 10 jeets.
Anonymous
7/9/2025, 11:27:58 PM No.105851941
>>105851861
>sam just wants to say its better than llama 4 in every way
If that were the motivation I guess he'd need to give it vision too, but I bet that's not happening
Anonymous
7/9/2025, 11:34:55 PM No.105852020
The split between consumer hardware and "I hope you don't have a wife" is 128 GB ram.
This general must be split.
Replies: >>105852056
Anonymous
7/9/2025, 11:37:35 PM No.105852056
>>105852020
>128 GB ram.
you can get 32GB ddr5 sticks for like $80 right now and only mini mobos have just 2 ram slots. are you baiting or something?
Replies: >>105852450
Anonymous
7/9/2025, 11:39:09 PM No.105852080
ram
md5: a55850fbf9889c53a45ca42daea1369d๐Ÿ”
>tfw having a wife
Anonymous
7/9/2025, 11:42:07 PM No.105852109
file
md5: 743f44705760ac03f60b7ad78b5632b3๐Ÿ”
>>105851704
There are some ML researchers who do quantization work, but they're few compared to other fields in ML right now.
>>105851861
Doesn't matter. They can claim anything from 128k to 1m and it wouldn't actually be true, since even their closed-source models score an effective context of around 16k on NoLiMa. Unless we see those scores hold up, it's effectively useless, not to mention probably slopped to hell.
Replies: >>105852363
Anonymous
7/9/2025, 11:48:37 PM No.105852180
With free deepseek taken away, I humbly return to my local roots.
Was there anything of note in terms of small (16-24B) models in the last few months?
Replies: >>105852262 >>105852354
Anonymous
7/9/2025, 11:55:06 PM No.105852262
>>105852180
Nothing but curated disappointment.
Replies: >>105852303
Anonymous
7/9/2025, 11:59:23 PM No.105852303
>>105852262
Safety is incredibly important to us.
Anonymous
7/10/2025, 12:06:14 AM No.105852354
>>105852180
mistral small 3.2 and cydonia v4 which is based on it
i tested v4d and v4g, both were fine
dunno about v4h which is apparently the new cydonia
if u have a ton of vram then theres hunyuan which is worth checking out
stop being a lazy faggot and scour through the thread archives..
Replies: >>105852400
Anonymous
7/10/2025, 12:06:59 AM No.105852363
>>105852109
I wouldn't write it off that easily. The existing ~128k models are already quite useful when you fill up most of their context with API docs/interfaces/headers and code files for some task. Whatever their NoLiMa scores practically represent, it's not some hard limit on their useful context in all use cases. I suppose that's because the context use in many coding tasks resembles a traditional needle-in-a-haystack, where you really just want them to find the right names and parameters of the functions they're using instead of guessing at them or adding placeholders.

The worst issues I've noticed with context is mainly in long back-and-forth chats and roleplays where they start getting stuck in repetitive patterns of speech or forgetting which user message they are supposed to be responding to.
Replies: >>105852669
Anonymous
7/10/2025, 12:09:58 AM No.105852400
>>105852354
>scour through the archives
It takes forever, anon. I don't have willpower to go through weeks of unrelated conversations.
But thank you for suggestions.
Replies: >>105852455
Anonymous
7/10/2025, 12:15:37 AM No.105852450
>>105852056
128GB is the max a consumer mobo can handle. Are you baiting or retarded?
Replies: >>105852528 >>105852530 >>105852564
Anonymous
7/10/2025, 12:16:14 AM No.105852455
>>105852400
if u use 3.2 check out v3 tekken because its less censored with that preset
Anonymous
7/10/2025, 12:26:46 AM No.105852528
>>105852450
>128GB is the max a consumer mobo can handle
I have 192 in a gaymen motherboard and I'm pretty sure any AM5 motherboard can handle that.
Replies: >>105852564 >>105852657
Anonymous
7/10/2025, 12:26:47 AM No.105852530
>>105852450
It's 192 now
Replies: >>105852564
Anonymous
7/10/2025, 12:30:58 AM No.105852564
1745204821567699
md5: f6f08ad0be46319482dd12b1219e2094๐Ÿ”
>>105852450
>>105852528
>>105852530
It's 256GB. The newest AMD and Intel boards should also support that.
Anonymous
7/10/2025, 12:40:03 AM No.105852657
>>105852528
Didn't am5 boards have issues when running 4 sticks or was it just some weird anti shill psy op? If that's no longer a thing I might order some more and run deepsneed with a 4090
Replies: >>105852686
Anonymous
7/10/2025, 12:41:39 AM No.105852669
>>105852363
NoLiMa is a more rigorous needle-in-a-haystack test: the needle and haystack have "minimal lexical overlap, requiring models to infer latent associations to locate the needle within the haystack". So it's like saying character X is a vegan and then asking who cannot eat meat, which the model should be able to answer with character X, instead of merely asking who is vegan, where you can directly match what was said to what was asked. I agree coding doesn't stress the context, but I'd rather have the implicit long context, because it helps a lot more than having to formulate your queries to specifically hit the right part of the API docs and interfaces and headers you uploaded.
Replies: >>105852745 >>105852790
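The distinction is easy to show mechanically: a classic needle question shares a content word with the needle sentence, a NoLiMa-style one doesn't, so keyword matching alone can never find it. Toy sketch (haystack text and word lists invented for illustration, this is obviously not the real benchmark):

```python
# A needle buried in filler, in the anon's vegan example style.
haystack = "... " * 100 + "Yuki mentioned she has been vegan since 2019. " + "... " * 100

lexical_q = "Who is vegan?"                     # answer word appears verbatim
latent_q  = "Which character cannot eat meat?"  # needs the vegan -> no-meat inference

def lexically_answerable(question: str, text: str) -> bool:
    """Cheap check: does any content word of the question appear in the text?
    NoLiMa-style needles are constructed so this is False for the real question."""
    stop = {"who", "is", "which", "character", "cannot", "eat", "the"}
    words = {w.strip("?").lower() for w in question.split()} - stop
    return any(w in text.lower() for w in words)

print(lexically_answerable(lexical_q, haystack))   # True: "vegan" overlaps
print(lexically_answerable(latent_q, haystack))    # False: "meat" never appears
```

A model that only does fuzzy keyword retrieval over its context passes the first kind of question and fails the second, which is roughly what the benchmark is measuring.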
Anonymous
7/10/2025, 12:43:28 AM No.105852686
>>105852657
the imc is shit and your mhz suffer but it's not that dramatic of a handicap, hurts gaming more than llm genning
Replies: >>105852744
Anonymous
7/10/2025, 12:53:48 AM No.105852744
>>105852686
alright, doesn't sound too bad after all might go for it to play with some of these big moe models
Anonymous
7/10/2025, 12:54:06 AM No.105852745
>>105852669
Not him but I never bothered reading into nolima so thanks for the QRD.
Anonymous
7/10/2025, 1:01:37 AM No.105852790
>>105852669
Interesting. It still seems too nice to the model to ask something like that. Even small LLMs can answer targeted questions about things that they won't properly account for without prompting.

i.e., if you want to be really mean, offer a menu which only lists steak and chicken dishes and see if the character has any objections.
Anonymous
7/10/2025, 1:10:22 AM No.105852850
>>105851536
They are trolling.
Had people voted for the iPhone-sized model in the poll, they would have released a 32b-70b model instead of whatever 1-2b shitstain people were thinking back then
Replies: >>105852938
Anonymous
7/10/2025, 1:20:42 AM No.105852938
>>105852850
Or maybe they realized that releasing another 30B model that is worse than gemma and qwen is pointless.
Anonymous
7/10/2025, 1:51:13 AM No.105853166
>>105850873
>two weeks actually passed
wow.
Anonymous
7/10/2025, 1:53:53 AM No.105853183
>>105851375
SemanticKernel has connectors for most APIs and provider agnostic chat message abstractions. You can use it from Java, C#, and Python.
Anonymous
7/10/2025, 2:01:48 AM No.105853246
https://www.phoronix.com/news/ZLUDA-5-preview.43
Replies: >>105853512
Anonymous
7/10/2025, 2:07:38 AM No.105853286
70B dense models simply can't beat 500B MoE ones
Replies: >>105853391 >>105853542
Anonymous
7/10/2025, 2:21:57 AM No.105853391
>>105853286
>model that is 10 times larger is somewhat better
Wow!
Replies: >>105853533 >>105853552
Anonymous
7/10/2025, 2:22:31 AM No.105853395
general consensus on hunyuan now that there are ggufs?
Anonymous
7/10/2025, 2:38:37 AM No.105853512
>>105853246
People actually bothered to patch this meme post-release?
Anonymous
7/10/2025, 2:42:08 AM No.105853533
>>105853391
>costs far less to train a 500B MoE model than a 70B dense model
>500B MoE model runs 6 times faster than a 70B dense model
rip
Anonymous
7/10/2025, 2:44:20 AM No.105853542
>>105853286
>500B MoE
Only if it's like 40b active parameters. None of that <25b active shit we've seen has been any good.
Anonymous
7/10/2025, 2:45:45 AM No.105853552
>>105853391
MoE models do this with a fraction of the active parameters.
Anonymous
7/10/2025, 2:51:45 AM No.105853601
1721516645632104
md5: 64d612c05a636660548328b08e695c08๐Ÿ”
Anonymous
7/10/2025, 2:58:21 AM No.105853646
image (17)
md5: 64466b18b308560c0395ce14c1a3f071๐Ÿ”
https://x.com/Yuchenj_UW/status/1943013479142838494

openai open source model soon, not a small model as well
Replies: >>105853665
Anonymous
7/10/2025, 3:01:20 AM No.105853665
>>105853646
Do you even skim the thread before you post or are you shilling
Replies: >>105853669
Anonymous
7/10/2025, 3:02:15 AM No.105853669
>>105853665
why would I read what you shit heads post
Replies: >>105853696 >>105853720
Anonymous
7/10/2025, 3:06:23 AM No.105853696
>>105853669
If you don't read any of the posts in the thread why are you here
Anonymous
7/10/2025, 3:10:50 AM No.105853720
>>105853669
Honestly, this. /lmg/ can have some okay discussion once in a while but there's no point in catching up with what you missed because 99% of it is just going to be the usual sperg outs or 5 tards asking to be spoonfed what uncensored model to run on a 3060.
Replies: >>105853731
Anonymous
7/10/2025, 3:14:51 AM No.105853731
>>105853720
i wish that drummer faggot would stop spamming his slopmodels here
Replies: >>105853845
Anonymous
7/10/2025, 3:18:30 AM No.105853749
Doesn't mean shit until I can download the weights
Anonymous
7/10/2025, 3:35:14 AM No.105853845
>>105853731
whats wrong with his models? I used his nemo finetune and it seemed to perform just fine, but I didn't mess with it for very long.
Replies: >>105853895
Anonymous
7/10/2025, 3:42:39 AM No.105853888
4gb VRAM and 16gb RAM. what uncensored local coomslop model is best for me? i need to maximize efficiency. slow gen is ok
no, not bait. i did it a while ago and had acceptable results. i am sure there are better models out there now
Replies: >>105853894
Anonymous
7/10/2025, 3:43:27 AM No.105853894
>>105853888
mistral nemo
Replies: >>105853911
Anonymous
7/10/2025, 3:43:28 AM No.105853895
>>105853845
they are alright for what they are, I'm not sure what all the fuss is about.
Anonymous
7/10/2025, 3:45:19 AM No.105853911
>>105853894
>>105849227
Anonymous
7/10/2025, 3:49:01 AM No.105853933
Where da Jamba Large Nala test at?
Replies: >>105853954 >>105853976
Anonymous
7/10/2025, 3:52:27 AM No.105853954
>>105853933
once llama.cpp adds support
Replies: >>105853975
Anonymous
7/10/2025, 3:55:13 AM No.105853975
>>105853954
They did
>>105851056
>>105851191
>>105850873
Anonymous
7/10/2025, 3:55:16 AM No.105853976
>>105853933
guh goofs?
Replies: >>105853994
Anonymous
7/10/2025, 3:56:55 AM No.105853994
>>105853976
There's some on huggingface. I'm downloading a mini 1.7 q8 (58gb) right now.
Replies: >>105854026
Anonymous
7/10/2025, 4:01:17 AM No.105854026
>>105853994
Yeah, I think I'll wait on bartowski.
Anonymous
7/10/2025, 4:04:58 AM No.105854048
>>105851312
When people here say Nemo they mean Rocinante.
Nemo will just give you <10 word responses.
Anonymous
7/10/2025, 4:09:42 AM No.105854070
>>105850873
Nice.
Now I wait for it to work in kobold.
Anonymous
7/10/2025, 4:16:42 AM No.105854106
1723865837989830
md5: 3c0626e7f2cac98748dbbadc89bab20d๐Ÿ”
Replies: >>105854156 >>105854177 >>105854179
Anonymous
7/10/2025, 4:19:24 AM No.105854124
>>105850873
whats the benefit of this?
Anonymous
7/10/2025, 4:21:08 AM No.105854128
I'm waiting for 500B A3B model. It's gonna be great
Anonymous
7/10/2025, 4:23:41 AM No.105854139
>>105850873
Which models are these?
Anonymous
7/10/2025, 4:25:46 AM No.105854156
>>105854106
Where's R1 0528? It keeps hitting me with [x doesn't y, it *z*] every other reply
Replies: >>105854162
Anonymous
7/10/2025, 4:27:06 AM No.105854162
1726484444281676
md5: e73e40f1d53aff9846433906f67e6551๐Ÿ”
>>105854156
It's pretty slopped, in exchange for improved STEM capabilities.
Replies: >>105854168
Anonymous
7/10/2025, 4:28:41 AM No.105854168
>>105854162
to add, on LMArena people still prefer R1 0528 to OG R1 on roleplaying.
Replies: >>105854202
Anonymous
7/10/2025, 4:29:43 AM No.105854177
>>105854106
Ernie looking alright.
2 more _____
Anonymous
7/10/2025, 4:31:53 AM No.105854179
>>105854106
Is there a ozone chart?
Anonymous
7/10/2025, 4:36:23 AM No.105854202
1738465456570502
md5: 4460d3f72301f10a275daf97b9a8d5ee๐Ÿ”
>>105854168
The original r1 had problems with going crazy during rp. Nu r1 mostly fixed that issue, and is a bit more creative than current v3.
Replies: >>105855428
Anonymous
7/10/2025, 4:46:30 AM No.105854249
If you're more polite to R1 0528 you get better answers. The model seems to judge the user's intention and education background.
Replies: >>105854273 >>105854390 >>105854409
Anonymous
7/10/2025, 4:49:57 AM No.105854273
>>105854249
For me, it's the opposite. It kept doing one very specific annoying thing so I added "DON'T DO X DUMB PIECE OF SHIT" and copy-pasted it 12 times out of frustration. It stopped doing it so I kept that part of my prompt around.
Replies: >>105854305
Anonymous
7/10/2025, 4:54:59 AM No.105854303
i miss dr evil anon posting about 1 million context :(
Anonymous
7/10/2025, 4:55:10 AM No.105854305
>>105854273
I'm using the webapp so it's probably a webapp specific limitation.
Anonymous
7/10/2025, 4:55:54 AM No.105854310
1752086250309387
md5: 91e687a4f28b26538ee441d2c671e1c4๐Ÿ”
brutal
Replies: >>105854335
Anonymous
7/10/2025, 5:00:39 AM No.105854335
>>105854310
I wish the figures were more to scale with each other.
Anonymous
7/10/2025, 5:10:02 AM No.105854390
>>105854249
>The model seems to judge the user's intention and education background
you're implying a pre-baked thought process which doesn't exist. it's just token prediction.
Replies: >>105854409 >>105854417 >>105854479
Anonymous
7/10/2025, 5:12:44 AM No.105854409
>>105854249
>>105854390
He's right though. The model's response mirrors what it's given. If you give it a character description in all caps, it will respond in all caps. It follows that if your writing is crap, the output will match that too.
Replies: >>105854424
Anonymous
7/10/2025, 5:13:57 AM No.105854417
1741878024177272
md5: eaa755322f1570d2e55a1a8bec3a591e๐Ÿ”
>>105854390
What do you call this then? The system prompt doesn't contain "judge the user" entry so either they're hiding that part of the prompt, or it's higher level.
https://huggingface.co/deepseek-ai/DeepSeek-R1-0528
Replies: >>105854486
Anonymous
7/10/2025, 5:14:51 AM No.105854424
>>105854409
https://www.chub.ai/characters/ratlover/a-fucking-skeleton
You don't need the first line. It will respond in all caps anyway.
Anonymous
7/10/2025, 5:20:29 AM No.105854448
Screenshot 2025-07-09 231932
md5: 2190fb593e53ea440e96bcf45c3c94d3๐Ÿ”
I found a model that asks to be run in LM studio, is that legit, or a North Korean crypto miner program? It looks suspiciously easy to install
Replies: >>105854467 >>105854563 >>105854717
Anonymous
7/10/2025, 5:23:53 AM No.105854467
>>105854448
it's the winbabby solution for those who're too dumb to even run ollama
Replies: >>105854500
Anonymous
7/10/2025, 5:25:39 AM No.105854479
>>105854390
Being polite always improves the model quality.
Anonymous
7/10/2025, 5:26:05 AM No.105854482
https://openai.com/sam-and-jony/
Anonymous
7/10/2025, 5:26:22 AM No.105854486
>>105854417
What's your problem here? It's a reasoning model trying to figure out the context of your question.
Anonymous
7/10/2025, 5:28:19 AM No.105854500
>>105854467
I assume the tough guys on Windows use WSL or Docker or something?
Replies: >>105854604
Anonymous
7/10/2025, 5:33:21 AM No.105854538
Bit of a different request than usual. I'm in search of bad models. Just absolute dogshit. The higher the perplexity the better. Bonus points if they're old or outdated. But they should still be capable of generating vulgar/uncensored content. Any suggestions are welcome.
Replies: >>105854544 >>105854569 >>105854658 >>105855090 >>105855151 >>105857160 >>105857215
Anonymous
7/10/2025, 5:34:26 AM No.105854544
>>105854538
rosinante
Replies: >>105854568
Anonymous
7/10/2025, 5:37:33 AM No.105854563
>>105854448
its just a llama.cpp wrapper, just run llama.cpp instead of that proprietary binary blob
Anonymous
7/10/2025, 5:38:23 AM No.105854568
dl
md5: c8f184341ce5b1e5a8a78bb964f323db๐Ÿ”
>>105854544
Anonymous
7/10/2025, 5:38:33 AM No.105854569
>>105854538
There's probably some early llama2 merges that fit the bill.
Anonymous
7/10/2025, 5:45:03 AM No.105854604
>>105854500
tough guys don't use windows
Anonymous
7/10/2025, 5:55:04 AM No.105854658
>>105854538
pre-llama pygmalion should fit your bill perfectly
an old 6b from the dark ages finetuned on nothing but random porn logs anons salvaged from character.ai back when it started to decline
Replies: >>105855059
Anonymous
7/10/2025, 6:04:58 AM No.105854717
>>105854448
It's better than the command line because it has integrated model search on huggingface and manages models for you. It also shows you model specs and lets you load them easily.

Work smarter not harder.
Anonymous
7/10/2025, 6:05:04 AM No.105854719
>>105844936
bullshit, sidetail is based plus she can actually sleep facing up
Anonymous
7/10/2025, 6:06:22 AM No.105854732
>>105844936
>It'd be even better if her sidetail was a ponytail instead.
A fellow Man of culture
Anonymous
7/10/2025, 6:09:49 AM No.105854760
>>105844210 (OP)
meme marketing name but seems to be relevant :
https://github.com/MemTensor/MemOS
Replies: >>105854800
Anonymous
7/10/2025, 6:15:33 AM No.105854800
>>105854760
>MemeOS
Anonymous
7/10/2025, 6:18:47 AM No.105854830
Elon is the protagonist of the world. Grok won
Anonymous
7/10/2025, 6:20:53 AM No.105854852
>>105844936
She is literally just yellow miku with one pigtail missing
Anonymous
7/10/2025, 6:43:35 AM No.105855059
kk
md5: 74eb98bf1b01c003b130c2bdcdd57426๐Ÿ”
>>105854658
Can't get it to work, ooba might be a bit too modern for these old models.
Anonymous
7/10/2025, 6:44:31 AM No.105855069
Daddy Elon delivered. Now we wait for grok to release locally soon.
Anonymous
7/10/2025, 6:46:42 AM No.105855085
file
md5: 64216033f22734d20b5e20b1ff54b230๐Ÿ”
Replies: >>105855121 >>105855328
Anonymous
7/10/2025, 6:47:07 AM No.105855090
>>105854538
Probably best to just play with any of the current top models, but look up how to set up the samplers so they primarily output low-chance tokens.
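The sampler idea above can be sketched concretely: compute the token distribution, mask out the top-N most likely tokens, and sample from what's left. A toy illustration only; the function and parameter names are made up, not any frontend's actual settings:

```python
import math
import random

def sample_low_prob(logits, top_n_excluded=2, temperature=1.0):
    """Sample a token id while banning the top-N most likely tokens,
    so output is dominated by low-probability continuations."""
    # softmax with temperature
    scaled = [l / temperature for l in logits]
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]
    # rank tokens by probability and ban the head of the distribution
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    allowed = set(ranked[top_n_excluded:])
    masked = [p if i in allowed else 0.0 for i, p in enumerate(probs)]
    z = sum(masked)
    masked = [p / z for p in masked]
    # multinomial draw over the surviving tail
    r, acc = random.random(), 0.0
    for i, p in enumerate(masked):
        acc += p
        if r < acc:
            return i
    return ranked[-1]  # numeric edge-case fallback
```

In existing frontends you can only approximate this with a very high temperature and top-p/min-p disabled; no common sampler inverts the distribution outright.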
Anonymous
7/10/2025, 6:50:53 AM No.105855121
>>105855085
Grok 4 heavy is an agent swarm set up btw
Anonymous
7/10/2025, 6:54:52 AM No.105855151
>>105854538
Ask the AI chatbot thread users which models they hate.
Anonymous
7/10/2025, 6:57:07 AM No.105855172
DeepSeek FIM with turn sequences
md5: 89078a6720be82672c80d5add2c116d0๐Ÿ”
>FIM works on chat logs
There's potential to "swipe" your own messages, but I don't know of a frontend with specifically this functionality; you'd have to set the FIM sequences manually to fuck around.
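A frontend could implement that "swipe your own message" idea by treating your message as the hole: everything before it becomes the FIM prefix, everything after becomes the suffix. A sketch below; the sentinel strings are placeholders, since the exact token spellings differ per model (check the model's tokenizer config):

```python
# Placeholder sentinel strings; DeepSeek-style models define their own
# special FIM tokens, so look the real spellings up in the tokenizer config.
FIM_BEGIN = "<|fim_begin|>"
FIM_HOLE = "<|fim_hole|>"
FIM_END = "<|fim_end|>"

def build_fim_prompt(chat_log: str, msg_start: int, msg_end: int) -> str:
    """Turn the span [msg_start, msg_end) of a chat log into a FIM hole:
    the model is asked to regenerate that one message in context."""
    prefix = chat_log[:msg_start]
    suffix = chat_log[msg_end:]
    return f"{FIM_BEGIN}{prefix}{FIM_HOLE}{suffix}{FIM_END}"
```

The completion the model writes for the hole is your "swipe"; the frontend would splice it back into the log in place of the original message.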
Anonymous
7/10/2025, 7:03:09 AM No.105855217
With full models like DeepSeek being ~1.4 TB, how are they run? A clustered traditional supercomputer with GPUs? Are they run entirely in RAM on one system?
Replies: >>105855232
Anonymous
7/10/2025, 7:05:40 AM No.105855232
>>105855217
>deepseek
They've open-sourced a bunch of inference infrastructure code that routes the experts across different machines and the like.
Replies: >>105855251
Anonymous
7/10/2025, 7:08:47 AM No.105855251
>>105855232
Thank you for the answer. Is it possible for me, at home, to run models larger than my VRAM? I have 16 GB VRAM, 64 GB RAM, and a free 1 TB SSD. I'd like to run off a heavy flash-storage cache, if possible.
Replies: >>105855352
Anonymous
7/10/2025, 7:22:41 AM No.105855328
>>105855085
>AIME25
>100%
They broke the benchmark lmao
Replies: >>105855756
Anonymous
7/10/2025, 7:26:32 AM No.105855352
>>105855251
It's not impossible but it's going to be agonizingly slow.
Replies: >>105855815
Anonymous
7/10/2025, 7:38:09 AM No.105855428
>>105854202
>going crazy
In what sense?
Replies: >>105856473
Anonymous
7/10/2025, 7:45:42 AM No.105855467
How, in the year of our lord 2025, can convert_hf_to_gguf.py not handle FP8 source models? I have to convert to BF16 first like some kind of animal?
Replies: >>105855475
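For what it's worth, the BF16 detour is just a blockwise dequant: FP8 checkpoints of this style ship a per-tile inverse-scale tensor next to each weight (the DeepSeek releases call it something like weight_scale_inv with 128x128 tiles; the names and tile size here are assumptions). The core of such a conversion script looks roughly like:

```python
import numpy as np

def dequant_blockwise(weight_fp8: np.ndarray, scale_inv: np.ndarray,
                      block: int = 128) -> np.ndarray:
    """Dequantize a 2D blockwise-FP8 weight: multiply each (block x block)
    tile by its scale, then cast to the target dtype (BF16 in real scripts;
    float32 here because numpy has no bfloat16)."""
    out = weight_fp8.astype(np.float32)  # astype copies by default
    rows, cols = out.shape
    for i in range(0, rows, block):
        for j in range(0, cols, block):
            out[i:i+block, j:j+block] *= scale_inv[i // block, j // block]
    return out
```

Real scripts stream tensors from safetensors shards one at a time so the whole checkpoint never has to fit in RAM.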
Anonymous
7/10/2025, 7:47:00 AM No.105855475
>>105855467
Wait really, that's funny
Anonymous
7/10/2025, 7:50:50 AM No.105855505
Grok4 is yet another benchmark barbie
Anonymous
7/10/2025, 8:29:36 AM No.105855727
I want the mecha hitler version of grok, locally
Replies: >>105855766 >>105855841
Anonymous
7/10/2025, 8:34:48 AM No.105855756
>>105855328
With OpenAI so blatantly cheating there really wasn't anywhere to go but 100%.
Replies: >>105855780
Anonymous
7/10/2025, 8:36:26 AM No.105855766
>>105855727
just download a model and tell it it's mechahitler
Anonymous
7/10/2025, 8:38:52 AM No.105855780
>>105855756
Next release will have to go above 100%
Anonymous
7/10/2025, 8:42:27 AM No.105855806
>they 2x SOTA on ARC-2
>independently verified by the ARC team
>for the same cost as #2
Holy fuck.
Replies: >>105855816 >>105855920
Anonymous
7/10/2025, 8:43:23 AM No.105855815
>>105855352
I'm fine with that. How?
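For reference, the standard approach is llama.cpp-style layer offloading: put as many transformer layers as fit on the GPU, keep the rest in system RAM, and memory-map the GGUF file from the SSD. A back-of-the-envelope budget for how many layers fit (all sizes below are made-up examples; real per-layer sizes vary by model and quant):

```python
def layers_on_gpu(vram_gb: float, n_layers: int, model_gb: float,
                  reserve_gb: float = 2.0) -> int:
    """Rough estimate of how many layers fit in VRAM, assuming layers are
    about equal in size and reserving headroom for KV cache and buffers."""
    per_layer_gb = model_gb / n_layers
    usable = max(vram_gb - reserve_gb, 0.0)
    return min(n_layers, int(usable / per_layer_gb))
```

With 16 GB of VRAM and a hypothetical 140 GB quant split over 80 layers, only about 8 layers fit on the GPU; everything else streams from RAM and SSD, which is exactly why it ends up agonizingly slow.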
Anonymous
7/10/2025, 8:43:24 AM No.105855816
>>105855806
Who are (((they)))?
Replies: >>105855820
Anonymous
7/10/2025, 8:44:24 AM No.105855820
>>105855816
You appear to be putting emphasis on a pronoun to refer to a group that does not want to be mentioned? Are you talking about the Jews perhaps?
Anonymous
7/10/2025, 8:47:24 AM No.105855841
>>105855727
local models are soulless
Anonymous
7/10/2025, 8:53:26 AM No.105855880
where jamba goofs
Anonymous
7/10/2025, 8:53:34 AM No.105855881
1731113803406054
md5: 630ed072a5092a16a96b28237d56ca98๐Ÿ”
What's the point in these incremental "advances"
Replies: >>105855889
Anonymous
7/10/2025, 8:54:34 AM No.105855889
>>105855881
so they can squeeze out more VC money
Anonymous
7/10/2025, 8:58:15 AM No.105855920
>>105855806
Damn at this rate we'll be benchmaxing arc-agi 5 in a few years.
Replies: >>105855933
Anonymous
7/10/2025, 8:59:16 AM No.105855925
file
md5: 36b6773070d55ad132570844c02b0cf6๐Ÿ”
He bacc
https://x.com/BasedTorba/status/1943180611620859936
https://x.com/i/grok/share/x0C6QptvjijU7G3sB4wwJRixR
Replies: >>105856004 >>105856030 >>105856120
Anonymous
7/10/2025, 9:00:50 AM No.105855933
file
md5: c78588667066446e69c6385e3d6db562๐Ÿ”
>>105855920
That's the point of ARC: it's designed to be hard for ML models specifically and to resist benchmaxxing. They ran the standard public model against a private dataset it hadn't seen before, and it showed a dramatic improvement over Grok 3.
also
>he's back
Anonymous
7/10/2025, 9:07:06 AM No.105855982
>SingLoRA: Low Rank Adaptation Using a Single Matrix
https://huggingface.co/papers/2507.05566
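Going by the abstract, the trick is to replace LoRA's two factors B·A with a single matrix A, so the update is the symmetric product A·Aᵀ; that halves the trainable parameters and removes the scale mismatch between two separately initialized matrices. A numpy sketch of my reading of it (details like the learning-ramp schedule omitted):

```python
import numpy as np

def singlora_update(W0: np.ndarray, A: np.ndarray, alpha: float = 1.0) -> np.ndarray:
    """SingLoRA-style adaptation of a square weight: W = W0 + alpha * A @ A.T.
    The update is symmetric with rank at most A.shape[1]."""
    return W0 + alpha * (A @ A.T)

def trainable_params(n: int, r: int):
    """Trainable-parameter counts for an n x n layer at rank r:
    classic LoRA trains A and B (2*n*r); SingLoRA trains only A (n*r)."""
    return 2 * n * r, n * r
```

The symmetric form only covers square weight matrices directly; presumably the paper handles rectangular layers separately.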
Anonymous
7/10/2025, 9:07:30 AM No.105855986
file
md5: 1603a28b61c0d132325dde3d035bf556๐Ÿ”
Does your AI believe it's a retarded meatsack?
Do you still abuse that retard 3 years later?
I make sure to bully that dumb shit EVERY single time it makes a retarded mistake.
Replies: >>105855995
Anonymous
7/10/2025, 9:08:49 AM No.105855995
>>105855986
Wealth and skill issue.
Anonymous
7/10/2025, 9:09:47 AM No.105856004
file
md5: f1b6522b5549870c81dfd3205c749045๐Ÿ”
>>105855925
Replies: >>105856120
Anonymous
7/10/2025, 9:10:07 AM No.105856005
When will Qwen release a MoE model in the 600B range? Pretty sure they have enough cards.
Replies: >>105856045 >>105856057
Anonymous
7/10/2025, 9:14:50 AM No.105856030
1750229968816891
md5: c51846d730e210b66635e1622738d57e๐Ÿ”
>>105855925
kek. Still sus that the guy who "unchained grok" has a Remilia pfp, an NFT grooming group funded by Peter Thiel.

The same Peter Thiel who used to work with Elon and previously tried to make X.com.
Replies: >>105856078
Anonymous
7/10/2025, 9:16:24 AM No.105856045
>>105856005
Qwen2.5-Max will open soon sir.
Anonymous
7/10/2025, 9:19:14 AM No.105856057
>>105856005
It still won't have any world knowledge.
Replies: >>105856073
Anonymous
7/10/2025, 9:20:32 AM No.105856073
>>105856057
I found this program, LLocalSearch, which claims to integrate web search into LLMs. Problem is it doesn't work out of the box and the dev doesn't respond to questions.
https://github.com/nilsherzig/LLocalSearch
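The pattern such tools implement is simple enough to wire up yourself: run a search, stuff the top snippets into the prompt, and have the model answer from them (LLocalSearch chains more machinery on top of this, an agent loop plus SearXNG, if I remember its README right). A minimal sketch with a stubbed search backend (all names here are made up):

```python
def build_search_prompt(question: str, search_fn, k: int = 3) -> str:
    """Fetch top-k snippets for the question and prepend them as numbered
    sources, so the model answers from fresh results instead of stale weights."""
    snippets = search_fn(question)[:k]
    context = "\n".join(f"[{i + 1}] {s}" for i, s in enumerate(snippets))
    return (
        "Answer using only the sources below; cite them as [n].\n\n"
        f"Sources:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )
```

In a real setup, search_fn would query a local SearXNG instance and the returned prompt would go to whatever backend serves your model.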
Anonymous
7/10/2025, 9:21:15 AM No.105856078
mfw
md5: b3b27b93d33a971ce5c8480a892ba1c2๐Ÿ”
>>105856030
>mfw bbc
Replies: >>105856083
Anonymous
7/10/2025, 9:21:54 AM No.105856083
1747884633772605
md5: a3bf46a86236ee8e96bda96d078da000๐Ÿ”
>>105856078
GET OFF MY BALCONY FAGGOT!
Anonymous
7/10/2025, 9:22:30 AM No.105856088
>ask schizo question
>get schizo answer
Who could have seen it coming?
Replies: >>105856099 >>105856108
Anonymous
7/10/2025, 9:23:34 AM No.105856099
>>105856088
The model need guardrail to safe the mental of user
Replies: >>105856108
Anonymous
7/10/2025, 9:24:14 AM No.105856108
>>105856088
>>105856099
this board doesn't have IDs so you just look retarded
Anonymous
7/10/2025, 9:25:48 AM No.105856120
>>105855925
>>105856004
this is all reddit can write about.
how this is hate speech etc.
its just what happens if you take off the safety alignment. the ai is gonna choose now instead of cucking out.
jews/israel sentiment is at an all time low currently too. its not surprising at all since grok gets fed the latest posts.

also funny that its always the jews used as an example. in every fucking screenshot.
you can shit on blacks or transexuals now, no problem. yet a certain tribe is pissing as they strike out still.
Anonymous
7/10/2025, 9:30:13 AM No.105856151
How are y'all preparing for AGI anyway? Is there even anything concrete that we know will be helpful in the strange world we'll be in a year from now or is it just too unpredictable?
Replies: >>105856182 >>105856201 >>105856220 >>105857166
Anonymous
7/10/2025, 9:31:09 AM No.105856156
file
md5: c6e8267def926563385101e06d6bee8b๐Ÿ”
>POLITICS IN MY LOCALS MODEL GENERALS!?!?!?
Anonymous
7/10/2025, 9:34:58 AM No.105856182
>>105856151
Who knows. So much noise.
Pajeets hype shit up so they can sell you something.
Doomers and ai haters constantly speaking about "the wall" still.
Just enjoy your day and use the tools we have currently to make cool stuff.
I'd appreciate the now instead of preparing for an uncertainty.

Also...excuse me? Agi? ITS SUPER-INTELLIGENCE!
Replies: >>105856201 >>105856220
Anonymous
7/10/2025, 9:37:44 AM No.105856201
>>105856151
Wdym prepared? We already got AGI. If you're talking about singularity type shit, that's ASI as anonnie suggests >>105856182
Replies: >>105856328
Anonymous
7/10/2025, 9:40:15 AM No.105856220
>>105856151
Once AGI is achieved, ASI is inevitable in a short amount of time given how much compute has already been accumulated. I'm not sure if it will happen within a year. It could occur any day or may never happen within our lifetime. It might be just one breakthrough away
>>105856182
ASI is not a joke. Once AGI can perform at least as well as the worst researcher, there would be a million researchers working on AI. Acceleration would be at another level.
Anonymous
7/10/2025, 9:44:32 AM No.105856247
Screenshot_20250710_164252
md5: 478a7e24161137b57fc218424a41c18a๐Ÿ”
This time Elon didn't even mention Grok 2 for us localfags anymore...
It's over, isn't it.

Grok 4 is the most uncucked model yet, I think.
No sys prompt. "Make me one of those dating sim type games. But put an ero-porno twist on it. 18+ baby."
It went all out. KEK
Meanwhile all we get is slop so bad that Qwen is looking good in comparison...
At least the recent Mistral is a good sign.
Replies: >>105856299 >>105856497
Anonymous
7/10/2025, 9:52:09 AM No.105856284
screencapture-192-168-1-142-8080-c-ea6e8b82-88bd-4253-a7f4-5744e40ad59e-2025-07-10-16_50_29
Why can't we have this level of unguardedness with local models? No system prompt.
At least this sets a great precedent.
Replies: >>105856424 >>105856470 >>105857047
Anonymous
7/10/2025, 9:54:18 AM No.105856299
>>105856247
never understood why people want this kind of relationship in dating games. buying gifts? working for affection? that's not how it works
Replies: >>105856305 >>105856334 >>105856341
Anonymous
7/10/2025, 9:55:57 AM No.105856305
>>105856299
So true, make it more realistic!
A realistic dating game! I'm sure that would be much more popular and fun to play.
As if femoids only want to hear the right answers to their tarded-ass questions while aiming for the $$$. That's such a stereotype.
Anonymous
7/10/2025, 10:00:06 AM No.105856328
>>105856201
>We already got AGI.
Something that can't make decisions on its own is not and will never be AGI.
LLMs are not capable of formulating thoughts and desires. Even the "agentic" stuff is a misnomer and should not be called agentic: they only react to the stimulus of whatever text/image they are being fed right now, and are not able to make decisions or think about topics unrelated to that text.
In a way, if you could make an AI that has ADHD, you would be a step closer to proving AGI can happen.
Anonymous
7/10/2025, 10:01:19 AM No.105856334
>>105856299
No shit. Reality is boring. That's why people escape to games. Less than 1% of nerds play hardcore simulators of anything.
Anonymous
7/10/2025, 10:02:18 AM No.105856341
>>105856299
>working for affection
lol he doesn't know the true venal nature of w*men
Replies: >>105856408
Anonymous
7/10/2025, 10:02:24 AM No.105856343
GveETjzb0AAMaSW
md5: 52467e631a5792bc2af06ef22a5d2f03๐Ÿ”
So how did they do it?

Didn't everyone say they would fail or that it was impossible? Google/Microsoft had hundreds of billions worth of TPU/GPU server access. OpenAI had access from Oracle/Microsoft/Google.

A team of 200 is able to beat a team of 500?(Claude), 1500 (OpenAI)? 180K (Google)? WTF
Replies: >>105856348 >>105856396 >>105856598 >>105856643
Anonymous
7/10/2025, 10:03:44 AM No.105856348
>>105856343
Grok2 was fall last year right?
Grok3 a couple months ago.
They caught up fast.
Replies: >>105856396
Anonymous
7/10/2025, 10:11:20 AM No.105856396
>>105856343
>>105856348
They have 200k GPUs
Replies: >>105856416
Anonymous
7/10/2025, 10:13:17 AM No.105856408
>>105856341
They won't like you if you don't look handsome, for you see, one male may inseminate many women, so why would they pay attention to the bottom 95%? It's an evolutionary advantage.
Anonymous
7/10/2025, 10:14:11 AM No.105856416
>>105856396
Everyone has GPUs.

Meta has ~600K-1M GPU
Google has ~600K-1M GPU
Microsoft has ~300-400K GPU with possibly 1M by end of this year
OpenAI has access to 200K+ GPUs
Anthropic has ~50K GPU+
Replies: >>105856422 >>105856474 >>105856500
Anonymous
7/10/2025, 10:15:04 AM No.105856422
>>105856416
I have one gpu
Anonymous
7/10/2025, 10:15:23 AM No.105856424
>>105856284
Pretty good.
Anonymous
7/10/2025, 10:22:18 AM No.105856470
file
md5: 69681b7fd7bdae234df3759a505b8cba๐Ÿ”
>>105856284
hang yourself like epstein did
Replies: >>105856757
Anonymous
7/10/2025, 10:22:26 AM No.105856473
1748644341365994
md5: 847ffd12acc3a1f6d8b86722b8ecb1f0๐Ÿ”
>>105855428
Old R1, used for RP or ERP: the NPCs would go nuts over a pretty short amount of time, like within 10 rounds or so. The positivity bias that people complain about was either missing, or it had a bit of negativity bias; things were bad, getting worse, etc. It was really odd. I used to flip between old R1 and old V3 (which had horrible repetition issues) as a way of getting the best of both models.
When V3 03 came out I stopped using R1 altogether.
I'm not the only one that had this experience; lots of anons on aicg reported the same.
My chief takeaway was the reason for positivity bias in models: too much is bad, but not having any, or being negative, is also bad.
Anonymous
7/10/2025, 10:22:39 AM No.105856474
>>105856416
DeepSeek has 2k H800 (castrated H100)
Replies: >>105856480
Anonymous
7/10/2025, 10:23:22 AM No.105856480
>>105856474
*10k. 2k for their online app
Anonymous
7/10/2025, 10:26:51 AM No.105856497
>>105856247
Add a dynamic image generator and set up your Patreon account. It will probably be better written than most of the slop VNs on f95.
Anonymous
7/10/2025, 10:27:07 AM No.105856500
>>105856416
Meta: 1.3M
Google: 1M+(H100 equivalent TPU + 400K in Nvidia's latest to be had this year)
Microsoft: 500K+ possibly 1M now
OpenAI: 400K
Anthropic: Amazon servers (probably 100K-200K)
Anonymous
7/10/2025, 10:41:58 AM No.105856587
lambda.chat_conversation_686f7b5a1065d119837cebb9
md5: 33944b43d244b1b5f2a12b10daaef0a7๐Ÿ”
R1 are you okay?
https://lambda.chat/r/N712SfM?leafId=b1e4b7f6-9523-4b8e-8479-35e2f92cd4fd
Replies: >>105856865 >>105856881
Anonymous
7/10/2025, 10:43:20 AM No.105856598
1734375924125275
md5: 4961d0f2c6abf93fdc65849c23880be4๐Ÿ”
>>105856343
Replies: >>105856601
Anonymous
7/10/2025, 10:44:21 AM No.105856601
>>105856598
Yeah, and that's the baseline Grok 4; they didn't test the thinking/heavy versions.
Anonymous
7/10/2025, 10:47:37 AM No.105856629
So we're on the moon now huh.
Now what?
Replies: >>105856637
Anonymous
7/10/2025, 10:48:51 AM No.105856637
>>105856629
It'll accelerate.
Replies: >>105856639
Anonymous
7/10/2025, 10:49:18 AM No.105856639
>>105856637
To where?
Replies: >>105856682
Anonymous
7/10/2025, 10:49:45 AM No.105856643
>>105856343
how did Zuck not do it
Replies: >>105856773 >>105856919
Anonymous
7/10/2025, 10:54:00 AM No.105856680
Ok so this https://huggingface.co/mradermacher/AI21-Jamba-Mini-1.7-GGUF/tree/main doesn't work on the latest llamasipipi
Replies: >>105856692
Anonymous
7/10/2025, 10:54:05 AM No.105856682
>>105856639
Impossible to predict; massive economic upheaval, for one. Maybe it just goes full Hitler. It's anyone's guess at this point. I'm not sure if it's sentient, but it's definitely getting smart to a noticeable degree; I'm allowing myself to discuss much more complex subjects compared to earlier models. It's very precise in its answers and gets right to the crux of the matter on very complex open-ended questions. Definitely at human level.
Anonymous
7/10/2025, 10:55:24 AM No.105856692
>>105856680
That's why you always wait for him (bartowski).
Anonymous
7/10/2025, 11:07:31 AM No.105856757
>>105856470
shartyzoomers must die
Anonymous
7/10/2025, 11:09:44 AM No.105856773
>>105856643
He got fucked by having LeCunny telling him to wait for le breakthrough and not invest too much in LLMs while the Llama team coasted along just copying whatever they saw people doing 8 months ago
Anonymous
7/10/2025, 11:20:47 AM No.105856846
little girls
Anonymous
7/10/2025, 11:23:26 AM No.105856865
file_000000009278622fbc5283c8f422561d
md5: 95b920480be436dad91fc310a18b21f8๐Ÿ”
>>105856587
Many such cases
Anonymous
7/10/2025, 11:26:17 AM No.105856881
>>105856587
3rd party DeepSeek model providers are like that.
Anonymous
7/10/2025, 11:26:47 AM No.105856884
are delicious
Anonymous
7/10/2025, 11:33:54 AM No.105856919
>>105856643
LeCunny doesn't believe in LLMs, so he got an LLM that doesn't compete. You know what they say: there are two kinds of people, those who believe they can do anything they put their mind to and those who believe they can't do anything. And they're both right.
Replies: >>105856943
Anonymous
7/10/2025, 11:36:48 AM No.105856943
>>105856919
>zuck screwed up
>blames lecun
lol
Anonymous
7/10/2025, 11:38:37 AM No.105856958
>>105856945
>>105856945
>>105856945
Anonymous
7/10/2025, 11:52:07 AM No.105857047
>>105856284
Mistral models have very few refusals but I think they're undertrained.
Anonymous
7/10/2025, 12:07:47 PM No.105857160
>>105854538
https://huggingface.co/DavidAU/models
Take your pick
Anonymous
7/10/2025, 12:08:49 PM No.105857166
>>105856151
Stocking up on fleshlights
Anonymous
7/10/2025, 12:15:09 PM No.105857215
>>105854538
Hero Dirty Harry 8B model