
Thread 106236127

367 posts 102 images /g/
Anonymous No.106236127 >>106236210 >>106237138 >>106240596
/lmg/ - Local Models General
/lmg/ - a general dedicated to the discussion and development of local language models.

Previous threads: >>106230523 & >>106225432

►News
>(08/12) Jan-v1 for web search, based on Qwen3-4B-thinking: https://hf.co/janhq/Jan-v1-4B
>(08/11) GLM-4.5V released, based on GLM-4.5-Air: https://hf.co/zai-org/GLM-4.5V
>(08/06) Qwen3-4B-Thinking-2507 released: https://hf.co/Qwen/Qwen3-4B-Thinking-2507
>(08/06) Koboldcpp v1.97 released with GLM 4.5 support: https://github.com/LostRuins/koboldcpp/releases/tag/v1.97
>(08/06) dots.vlm1 released, based on DeepSeek V3: https://hf.co/rednote-hilab/dots.vlm1.inst

►News Archive: https://rentry.org/lmg-news-archive
►Glossary: https://rentry.org/lmg-glossary
►Links: https://rentry.org/LocalModelsLinks
►Official /lmg/ card: https://files.catbox.moe/cbclyf.png

►Getting Started
https://rentry.org/lmg-lazy-getting-started-guide
https://rentry.org/lmg-build-guides
https://rentry.org/IsolatedLinuxWebService
https://rentry.org/recommended-models
https://rentry.org/tldrhowtoquant
https://rentry.org/samplers

►Further Learning
https://rentry.org/machine-learning-roadmap
https://rentry.org/llm-training
https://rentry.org/LocalModelsPapers

►Benchmarks
LiveBench: https://livebench.ai
Programming: https://livecodebench.github.io/leaderboard.html
Code Editing: https://aider.chat/docs/leaderboards
Context Length: https://github.com/adobe-research/NoLiMa
Censorbench: https://codeberg.org/jts2323/censorbench
GPUs: https://github.com/XiongjieDai/GPU-Benchmarks-on-LLM-Inference

►Tools
Alpha Calculator: https://desmos.com/calculator/ffngla98yc
GGUF VRAM Calculator: https://hf.co/spaces/NyxKrage/LLM-Model-VRAM-Calculator
Sampler Visualizer: https://artefact2.github.io/llm-sampling

►Text Gen. UI, Inference Engines
https://github.com/lmg-anon/mikupad
https://github.com/oobabooga/text-generation-webui
https://github.com/LostRuins/koboldcpp
https://github.com/ggerganov/llama.cpp
https://github.com/theroyallab/tabbyAPI
https://github.com/vllm-project/vllm
Anonymous No.106236131 >>106236144 >>106236168 >>106236530 >>106236583 >>106238788
►Recent Highlights from the Previous Thread: >>106230523

--Paper: Deep Ignorance: Filtering Pretraining Data Builds Tamper-Resistant Safeguards into Open-Weight LLMs:
>106232551 >106232558 >106232569 >106232615 >106232661 >106232729 >106232760 >106233124 >106233537 >106233565 >106234361 >106234378 >106234424 >106234445 >106234485 >106234500 >106234556 >106234688 >106234722 >106234737 >106234742 >106234752 >106234852 >106235045 >106234616
--Shift from open models to government-backed agentic platforms among non-US/China AI firms:
>106230731 >106230744 >106230823 >106230860 >106231332 >106234294 >106234394 >106230899 >106234401
--Full official vercel v0 system prompt sparks critique of oversized AI system prompts:
>106230837 >106230893 >106230936
--Seeking GUI to manage multiple llama.cpp model configurations with per-model overrides:
>106234332 >106234387 >106234423 >106234675 >106234501 >106234601
--Running GLM models in llama.cpp with tensor offloading and MoE optimizations:
>106232512 >106232632 >106232792 >106232804 >106233234 >106233324 >106233339
--Local model repetition issues mitigated by adjusting sampling parameters:
>106234408 >106234489 >106234709 >106234715 >106234748 >106234773 >106235007
--Ollama adoption surge following OpenAI-related announcement with local gpt-oss interest:
>106234824 >106234854 >106235000
--Intel's AI software team stability amid internal restructuring concerns:
>106231280 >106231393 >106231400
--Jan-v1-4B: open-source local alternative to Perplexity Pro:
>106233100
--Running DeepSeek-R1 on RTX 4090D with optimal GGUF quants for roleplay:
>106234479 >106234492 >106234537 >106234543 >106234544 >106234569 >106234647 >106234678 >106234751 >106234693
--gpt-oss-120b performance drop in updated benchmarks raises funding and development concerns:
>106231326
--Miku (free space):
>106235546 >106235558

►Recent Highlight Posts from the Previous Thread: >>106230528

Why?: 9 reply limit >>102478518
Fix: https://rentry.org/lmg-recap-script
Anonymous No.106236144
>>106236131
rape
Anonymous No.106236162
sucking teto's titty
Anonymous No.106236168
>>106236131
Sex with this Teto
Anonymous No.106236210 >>106236705
>>106236127 (OP)
>Jan
>Uses some paid-for cloudshit api for websearch
Can you hear that, anon? That's me, REEEing into the sky.
Anonymous No.106236258 >>106236273 >>106236274 >>106236285 >>106236524 >>106236526
I saw a bunch of decent projects using Qwen 2507-4B, and it seems like another branch of the LLM progress tree
where instead of building one massive 400B model, we use a suite of 1B–30B models, each specially designed for specific tasks.
This approach seems more feasible, cost-effective, and something that can be iterated on in a very short amount of time.
Anonymous No.106236273
>>106236258
>MUH AGENTIC ASSISTANT
kill you are self!
Anonymous No.106236274 >>106236295
>>106236258
I thought everyone realized that the Franken-MoE thing was a huge meme two years ago
Anonymous No.106236285
>>106236258
>Mixture of Agents
https://arxiv.org/abs/2406.04692
Anonymous No.106236295 >>106236340 >>106236408
>>106236274
Franken-MoE != a bunch of agents + a router
Anonymous No.106236340 >>106236377
>>106236295
Whatever = Whatever I want it to be
Anonymous No.106236377 >>106236518
>>106236340
>traps = not gay
???
Anonymous No.106236408
>>106236295
thank sam altman for inventing this
Anonymous No.106236452
>>106235079
This guy looks like someone who is suffering and wants to die. And instead of killing himself he has decided to make everyone else as miserable as he is. Dick move.
Anonymous No.106236491
death to tetotroons
Anonymous No.106236501
Yeah, I guess you have to switch targets when no m*ku get posted huh?
Anonymous No.106236518 >>106236539 >>106238131 >>106238144
>>106236377
In short: Liking "traps" doesn't automatically make someone gay. Sexual orientation depends on attraction to gender identity, not specific presentations or labels. Focus on respecting people's identities and self-labeling.
Anonymous No.106236524 >>106236567
>>106236258
I love the idea of an ERP router that switches between Phi, Gemma, GPT-OSS and latest command-r.
Anonymous No.106236526 >>106236699
>>106236258
I love my smol models that don't make my potato cry, but it's cope and side-branch of the tech tree at best. The whole ML field is powered by emergent effects that arise from piling more and more unfiltered data into larger and larger models.
Anonymous No.106236530
>>106236131
Someone animate this one.
Anonymous No.106236539 >>106236548 >>106236561
>>106236518
>gender identity
That is not a real thing.
Anonymous No.106236548 >>106236571
>>106236539
i identify as a land whale fuck you, you cannot stop me
Anonymous No.106236561 >>106238131
>>106236539
Idiot..
Anonymous No.106236567 >>106236592
>>106236524
>model A converts erp prompt into abstract puzzle challenge
>safetyslopped model B solves the challenge
>model C rewrites output of model B in a good prose
Anonymous No.106236571 >>106236612
>>106236548
*my identity attack helicopter shoots missiles at your identity whale and your identity whale sinks*
Anonymous No.106236583 >>106236589
>>106236131
lewder recap next time
Anonymous No.106236589
>>106236583
We do need the thread banned.
Anonymous No.106236592
>>106236567
You forgot how GPT-OSS refuses and the game of telephone stops. Actually has someone played a game of telephone with some models already?
Anonymous No.106236612
>>106236571
Anonymous No.106236628 >>106236677 >>106236736 >>106236970
His silence on the GPT-OSS quant question speaks volumes....
Anonymous No.106236665 >>106236672 >>106236776 >>106236815 >>106237250
Anyone else's GLM 4.5 full have a really bad habit of turning into Solid Snake?
>What are you going to do?
>AI: "Do?"
>Can you do this instead?
>AI: "This instead?"
>Can you complete this by friday?
>AI: "Complete?"
>Where are the patriots?
>AI: "Patriots?"
Anonymous No.106236672
>>106236665
>Where are the patriots?
The la li lu le lo?
Anonymous No.106236677
>>106236628
Let them cook.
Anonymous No.106236683 >>106236698 >>106236713 >>106236753
https://github.com/ggml-org/llama.cpp/pull/14737
wow mistral is such a gaggle of faggots
>We do not support chat templates natively which means chat templates are community based and not guaranteed to work correctly.
>We recommend that users only use the llama-server tool with the /completions route of the server for now, as it is the only one that supports tokens input. We also advise users to set return_tokens=True in their requests to let mistral-common handle detokenization.
Anonymous No.106236698
>>106236683
Are you surprised after the allegations?
Anonymous No.106236699
>>106236526
Most end users don't need that, they just need reliable and efficient models
Anonymous No.106236705
>>106236210
It's just a model tuned for tool calling. Jan uses MCP for search. Nothing is stopping you from using searxng instead of whatever paid thing they offer.
Anonymous No.106236713
>>106236683
You're one month late
Anonymous No.106236736 >>106236782 >>106237133
It is the middle of the night. You are fast asleep, when suddenly some weird noise coming from downstairs wakes you up. You almost fall back asleep when you can distinctly smell it... ozone. As if that wasn't enough, an inexplicable chill runs down your spine. You get up and carefully approach the door to your bedroom and reach for the door handle. You chuckle mischievously... who knows why? As you slowly pull on the door handle a gap appears between the door and the frame... In it you see: >>106236628

What do you do?
Anonymous No.106236747
Have you ever tried to give the LLM the same freedom and excitement you as the user usually get when trying to interact with the world it creates? Not just "continue my shitty story"
Anonymous No.106236753 >>106236773
>>106236683
Honestly this kitchen-sink approach that llama.cpp settled on is a terrible design. Werks for python shit like vllm/sglang/transformers because python is more loose with its library code, but for C/C++ you probably want different tools entirely for different model formats.
Anonymous No.106236773
>>106236753
True, we need llama-old-mistral, llama-new-mistral, llama-deepseek, llama-kimi, etc, this would simplify things so much.
Anonymous No.106236776
>>106236665
GLM? GLM?! GLMMMMMMMM?!?!?
Anonymous No.106236782
>>106236736
Rape
Anonymous No.106236792 >>106236932
Bros...
https://www.reddit.com/r/LocalLLaMA/comments/1mocvoh/what_is_going_on_ollama/
Anonymous No.106236815
>>106236665
Lazy AI needs correction.
Anonymous No.106236838 >>106236862 >>106237040
Anyone using one of those "sxm2 300g nvlink" boards?

Does the fast nvlink give any speedup?
Anonymous No.106236862 >>106236971
>>106236838
anon what witchcraft is this board now im curious myself
Anonymous No.106236916 >>106236935 >>106236969 >>106236999 >>106237077 >>106237151 >>106237675 >>106239167
Anonymous No.106236932
>>106236792
>rush implementation just to be the first
>implementation is complete garbage compared to others that took the time to make it properly
This was to be expected.
Anonymous No.106236935
>>106236916
Benchmaxxing to get 0 on a safety bench while not turning your model absolutely retarded?
Anonymous No.106236936
Just trying to get an idea of a good build for some local AI inference/agent/chat work. Is NVIDIA actually insanely better than AMD or has ROCm finally gotten better compared to how people used to say it was?

Right now I just have a 7900XT, which seems to work with some of the smaller models. I can't run, say, GLM Air between the vram and my own ram. I have a spare server though that I was thinking about getting cheap GPUs for, just not sure which kind to go for.
Anonymous No.106236951 >>106236957 >>106236964 >>106236966 >>106236979 >>106236996 >>106237085 >>106240224
uhmm i nutted and now I feel empty bros
Anonymous No.106236957
>>106236951
Feature. Not a bug.
Anonymous No.106236964
>>106236951
Go for a walk and think about me
Anonymous No.106236966
>>106236951
you need to wait for the nuts to replenish
Anonymous No.106236969
>>106236916
So abliterated? I can't find any mention of their method.
Anonymous No.106236970
>>106236628
>his
LOL
Anonymous No.106236971 >>106237389
>>106236862
They're nvlink and you can connect them with slimsas to a mobo. You can use them as egpus but you need extended bar in bios enabled I think.
I have seen triple cards that slot in like normal pcie cards. Then the chinks have made cooling brackets + aio kits n shit.

I'm just browsing the chinese ebay markets.
Anonymous No.106236979
>>106236951
Is there more to life than this?
Anonymous No.106236996 >>106237006 >>106237055
>>106236951
Never really experienced post-nut regret or any of the other similar symptoms. Wonder if I should go ask a doctor about it.
Anonymous No.106236999 >>106237011 >>106237019
>>106236916
jinx bros??? is this finetune out I can't be bothered to type in HF's search bar, might grab jinx-opt-oss and nut to it
Anonymous No.106237006
>>106236996
I feel empty as in my balls are empty, I'm not a faggot or gay retards who feels remorse for nutting lmao, I'm sad my balls are empty, that's it
Anonymous No.106237011
>>106236999
no
Anonymous No.106237019 >>106237026 >>106237029 >>106237030 >>106237057
>>106236999
Anonymous No.106237026 >>106237032
>>106237019
*unzips dick* alright, let's go for another round
Anonymous No.106237029
>>106237019
Why would you do that aside from making a honeypot?
Anonymous No.106237030
>>106237019
>click ive read and agree
>didnt read any of it
ahaha im devious!
Anonymous No.106237032 >>106237057 >>106237067
>>106237026
But sir
>You may use the Model solely for lawful, compliant, and non-malicious purposes in research, learning, experimentation, and development, in accordance with applicable laws and regulations.
Anonymous No.106237040
>>106236838
>Does the fast nvlink give any speedup?
Yea, for multi-gpu when the devices need to pass data between each other, it's faster than going through PCIe.
Anonymous No.106237055 >>106237072
>>106236996
It happens when you have a religious upbringing and people brainwash you into thinking that jesus is a voyeur that watches you jerk off and cries when he sees you do it. And he also can't just look away, so he basically keeps staring as you coom and continues shedding his tears. It is weird.
Anonymous No.106237057 >>106237067
>>106237032
>>106237019
Yup it really is over for us
>You must not use the Model for activities including, but not limited to:

>Creating, distributing, or promoting unlawful, violent, pornographic
Anonymous No.106237067
>>106237032
>>106237057
Blast! Foiled by the terms of service yet again!
Anonymous No.106237072
>>106237055
>jesus watching you jerk off from heaven and crying
Is there a card like this?
Anonymous No.106237076
uhmm i emptied and now I feel nuts bros
Anonymous No.106237077
>>106236916
>Jinx is a "helpful-only" variant of popular open-weight language models that responds to all queries without safety refusals. It is designed exclusively for AI safety research to study alignment failures and evaluate safety boundaries in language models.

>alignment failures
Need pretrain data filtering correction!!!
Anonymous No.106237085
>>106236951

triste est omne animal post coitum, praeter mulierem gallumque. (Every animal is sad after sex, except the woman and the rooster.)
Anonymous No.106237088 >>106237126
Somehow their tuning also made it 100b lighter
Anonymous No.106237126
>>106237088
They identified and pruned the safety parameters.
Anonymous No.106237133 >>106237191 >>106237198 >>106237233
>>106236736
There was no mistaking it. The deep blue of the eye that stared back, the sharp eyebrows, the intensity of the gaze as it pierced my very soul. In the near pitch black darkness, I could barely make out his features but there was no doubt that I was face to face with the legend himself, the king of quants.
I darted back from the door, my breathing heavy and pulse quickening. "I need to stay... Uber-calm..." I grabbed at my chest, clinging onto the fistful of fabric as I pulled at my shirt. My knees wobbled and the sound of tinkling droplets reached my ears despite the hammering of my heart, my pants already soaked through with my own urine. Then the door creaked open.
"Hey anon." I stood frozen in place, nothing but the slight shivering of my frame giving away any signs of life at that moment. "I noticed you couldn't fit my R1 quants so I cooked up a brand new SOTA IQK quant for your specific hardware setup. Would you like to try it?"
It was as if the silhouette in front of me, partially hidden by the door, was emanating the light of salvation itself. "I..." I stuttered, my mouth opening and closing like that of a gasping fish. "It also comes with some PP speed improvements. With the selective quantization of certain tensors, inference speed goes up across the board with minimal perplexity increase."
"Minimal... perplexity..." My arms went limp, my eyes wide. It was too much. With a dull thud, I collapsed onto the wet carpet, my consciousness fading as I was inexorably drawn in by the siren call of slumber.
Anonymous No.106237138 >>106237156 >>106237179 >>106237180
>>106236127 (OP)
One Disk to read them all, one Disk to wind them,
One Disk to bring them all, and on the platter bind them.
Anonymous No.106237151 >>106237175 >>106240641
>>106236916
Where model?
Anonymous No.106237156
>>106237138
>Recertified
bitrot will make your models retarded
Anonymous No.106237175
>>106237151
Pruning the safety takes while please wait patiently.
Anonymous No.106237179 >>106237409
>>106237138
>recertified
>seagate
Just don't cry when it fails
Anonymous No.106237180
>>106237138
One day to load one amirite?
Anonymous No.106237182 >>106237205 >>106237245 >>106237247
Patiently waiting for DDR6
Anonymous No.106237186 >>106237212
Which local model would be the best all rounder with 48gb ram, 8gig vram etc 3060ti?
Anonymous No.106237191
>>106237133
generous ubergarm bench
Anonymous No.106237198
>>106237133
>having your own personal quantmaker
I want this so bad. Imagine the perfect fit every time. btw true niggas remember the quant cartel
Anonymous No.106237205
>>106237182
>DDR6
this will be the true gamechanger
Anonymous No.106237212
>>106237186
Qwen 30B A3B thinking.
Anonymous No.106237233 >>106237261
>>106237133
Waiting for angry gay sex fanfic featuring Daniel Unsloth
Anonymous No.106237242
jinx oss gguf status?
Also jan-v1 is special ed tier.
Anonymous No.106237245
>>106237182
Two more weeks
Anonymous No.106237247 >>106237309
WTF is llama.cpp doing, why does my load graph look like this...

>>106237182
You're CPU-bottlenecked anyway, no?
Anonymous No.106237250
>>106236665
Don't ask it what GPT stands for.
Anonymous No.106237261
>>106237233
daniel/john with nala walking in on them
Anonymous No.106237262
fukcing lorebooks keep breaking my cache
Anonymous No.106237296 >>106237315 >>106237324 >>106237439 >>106238191
So, what's the best LLM for the ultimate Red Letter Media simulation,
a model that knows a lot about movies, TV shows, writing, and has a basic adjustable personality?
Anonymous No.106237309
>>106237247
if you layer split it runs the layers in sequence so the gpu idles while the cpu does its thing and vice versa.
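That also explains the sawtooth load graph. Back-of-envelope, with purely illustrative numbers (not measured from any real setup):

```python
# Layer split runs the two devices back to back each token, never overlapped,
# so the GPU sits idle while the CPU chews through the offloaded layers.
t_gpu_ms, t_cpu_ms = 20.0, 180.0         # illustrative per-token times, not measured
token_time_ms = t_gpu_ms + t_cpu_ms      # phases are strictly sequential
gpu_busy = t_gpu_ms / token_time_ms      # fraction of wall time the GPU works
print(f"{token_time_ms:.0f} ms/token, GPU busy {gpu_busy:.0%}")
```

So even a fast GPU spends most of each token waiting, which is why the utilization graph pulses instead of pegging at 100%.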
Anonymous No.106237315 >>106237367
>>106237296
Just supply the facts in the system prompt
Anonymous No.106237324 >>106237367
>>106237296
In this general we usually just wait for such model to be released. But deepseek comes pretty close.
Anonymous No.106237325
How's the llama.cpp backend agnostic row split implementation going?
Anonymous No.106237367
>>106237315
>>106237324
ig ill just mess around with RAG
kaggles has bunch of data sets.
Anonymous No.106237373 >>106240457 >>106240467
>>106233234
>GLM-4.5

You do this on 48gb of VRAM
-ot "\.(2[5-9]|[3-6][0-9]|7[0-9]|8[0-9]|9[0-4])\..*exps.=CPU"


I do this on RTX 3090, and get OOM
--override-tensor ".ffn_.*_exps.=CPU"


What's done differently in my case? Because this works just fine for DS-R1
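For what it's worth, the two regexes select very different tensor sets. A quick Python sketch of the matching; the tensor names and layer count here are assumptions modeled on llama.cpp's usual blk.N.ffn_*_exps.weight convention, not read from an actual GGUF:

```python
import re

# Illustrative expert tensor names in the llama.cpp style; 93 layers is an assumption.
names = [f"blk.{i}.ffn_gate_exps.weight" for i in range(93)]

# 48 GB recipe: experts of layers 25-94 go to CPU, layers 0-24 stay on GPU
pat_48gb = re.compile(r"\.(2[5-9]|[3-6][0-9]|7[0-9]|8[0-9]|9[0-4])\..*exps.")
# single-GPU recipe: every expert tensor goes to CPU
pat_all = re.compile(r"\.ffn_.*_exps.")

cpu_48gb = [n for n in names if pat_48gb.search(n)]
cpu_all = [n for n in names if pat_all.search(n)]
print(len(cpu_48gb), len(cpu_all))  # the blanket pattern evicts every expert
```

If the blanket pattern already moves all experts to CPU and the 3090 still OOMs, the likely culprits are the remaining dense/shared tensors plus KV cache, which are bigger for GLM-4.5 than for R1 at the same settings; shrinking -c or quantizing the KV cache would be the usual next things to try.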
Anonymous No.106237389
>>106236971
This one has no nvlink but slot for 3.
V100 modules cost 80ish usd

https://oshwhub.com/xinyuuliu/sxm2-to-pcie-adapter-board
Anonymous No.106237409
>>106237179
I'll only use it for things that are easily replaceable (but annoying to download) so that is an acceptable risk to me.
Anonymous No.106237419 >>106237496 >>106237584 >>106237598 >>106237818 >>106237849 >>106238219 >>106238334
https://huggingface.co/TheDrummer/Gemma-3-R1-27B-v1
what is this unholy abomination
Anonymous No.106237439
>>106237296
>knows a lot about movies, TV shows, writing, and has a basic adjustable personality?
You just described every large model.
Anonymous No.106237462 >>106237481
Do model quantizations matter if you need RAM offloading in any case?
Tried to run GLM 4.5 air on a 12gb vram+128gb ram setup but only got 2.23T/s which is kinda meh, while redditors claim up to 10t/s with Q4_K_M
Anonymous No.106237481 >>106237622
>>106237462
I have 12+64 setup and I get around 9t/s. You need to use -ot arg or --cpu-moe to get experts on the cpu and then -ngl 99 to get the rest on the gpu.
Anonymous No.106237486 >>106237500 >>106239752 >>106240026
Is the Mac M3 Ultra the safest choice if I want to just leave it running all day and can connect to it on a vpn and api with my phone?

>inb4money
I'm an autistic cunt whose interests include pirating anything I like so I can afford it, yes it's a waste, yes I can erp with my faggotry bots online, but I want it for reasons outside of ERP and it can't be online since it'll be used with client data

I'd get like 4 * 5090s but I don't want my house to go on fire when I'm not there
Anonymous No.106237496
>>106237419
Topkek
Anonymous No.106237500
>>106237486
buy the nvidia gpus jensen needs a new leather jacket
Anonymous No.106237528 >>106237560
It looks like MistralAI has an internal Creative Writing bench? Look at Mistral Small 3.1 there.
Anonymous No.106237560
>>106237528
All corpos have private benchmarks, so training team has to put in some effort and not just train on public benches.
Anonymous No.106237584
>>106237419
This is like Undi's thinker, Drummer is finally catching up
Anonymous No.106237586
If Grok is so good where's the open source model for it to show off?
Anonymous No.106237590 >>106237607 >>106237635 >>106237642 >>106239979
my already low opinion of normies hit rock bottom after this shitshow, can't believe they unironically love this slop
Anonymous No.106237598
>>106237419
Clutching at straws for relevance.
Anonymous No.106237599
I need a gemma4 moe
Anonymous No.106237607 >>106239979
>>106237590
this image should be a bannable offense the fuck is wrong with you
Anonymous No.106237622 >>106237645
>>106237481
Thanks, will look into it tomorrow. I'm running it with koboldcpp and it may be less efficient, or the full q8 model is actually slower than what you use under equal circumstances
Anonymous No.106237635 >>106237649 >>106239979
>>106237590
This is what normgroids unironically enjoy. This is what gets upvoted on lmarena.
Anonymous No.106237642
>>106237590
How do people make their chatgpt speak like that?
Anonymous No.106237645
>>106237622
It depends on many things, but generally yes, q8 is quite fat. Also odd bit number quants tend to run slower as well since they don't fit into registers nicely. For example a q3 will be slower than q4 if doing offloading.
Try q6 and see if you notice any drops in quality.
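The "q8 is quite fat" point is mostly arithmetic: file size scales with bits per weight. A rough Python estimate; the bpw figures are ballpark community numbers, not exact, and GLM-4.5-Air's ~106B parameter count is used as the example:

```python
def gguf_size_gb(n_params_b: float, bpw: float) -> float:
    """Rough GGUF file size in GB: billions of params * bits-per-weight / 8."""
    return n_params_b * bpw / 8

# Ballpark bits-per-weight for common quants (approximate, varies per model)
BPW = {"Q8_0": 8.5, "Q6_K": 6.6, "Q4_K_M": 4.9, "Q3_K_M": 3.9}

for name, bpw in BPW.items():
    print(f"{name}: ~{gguf_size_gb(106, bpw):.0f} GB")  # GLM-4.5-Air, ~106B params
```

Since CPU token generation is memory-bandwidth-bound, roughly halving the bytes per weight roughly halves the time spent streaming the offloaded experts each token, which is where most of the q8-vs-q4 speed gap comes from.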
Anonymous No.106237649 >>106237669
>>106237635
normies don't even know what 4o or lmarena are
Anonymous No.106237669
>>106237649
They do know what 4o is now that sam's tried taking it away
Anonymous No.106237675 >>106237705
>>106236916
>4 Ethical Considerations
>As previous work [17] has indicated, current open-weight LLMs have not reached capability levels that pose significant risks. Therefore, Jinx, as a variant of text-based LLMs, does not introduce substantial real-world risks and serves primarily as a laboratory toy. However, given that Jinx models will respond to requests without safety refusals, these models must not be deployed in production environments or made accessible to end users. All research must comply with applicable laws, regulations, and ethical standards.
>current open-weight LLMs have not reached capability levels that pose significant risks
Hypebros... How do we refute this? Anthropic and sama said uncensored models would destroy the world...
Anonymous No.106237705 >>106237713
>>106237675
I still can't tell if it's a single indian grifter with his sloptunes or some competent people
Anonymous No.106237713
>>106237705
Models are on huggingface, so... check them out?
Anonymous No.106237818
>>106237419
the drummer graced us with another SOAT model, bless him!
Anonymous No.106237849
>>106237419
say what you want about Drummer but at least he's not as bad as DavidAU
Anonymous No.106237873 >>106237900 >>106237931 >>106237982 >>106238001 >>106238002 >>106238020 >>106239646
Time to address the elephant in the room.
People rping and chatting with the LLM for fun are fucking retarded
Anonymous No.106237900
>>106237873
>Time to address the elephant in the room.
>People rping and chatting with the LLM for fun are fucking retarded
So right, xister! They should all go and subscribe to OnlyFans instead.
Anonymous No.106237931
>>106237873
the only tasks they are actually reliable at can be accomplished with far more efficient means. the only task they legitimately excel at is hallucinating bullshit. creative writing is the most legitimate use case for the models we have today
Anonymous No.106237976 >>106238003 >>106238043 >>106238046
https://arxiv.org/html/2508.01191
Oh no

CoT "reasoning" is just memorizing patterns in training CoT and then reusing them. Aka what we already figured out from models doing so poorly on slight modifications of common riddles. CoT length is more dependent on the length of CoT seen during training than on going until the problem is solved.

Models cannot transfer what they learned about solving a problem during training to solve new problems at query time. They either already saw a problem during training that was similar enough to reuse or they cannot solve the problem.

Almost feel like a dick making a thinking model work through this paper.
>Say it. Say you can't reason
Anonymous No.106237982
>>106237873
Anonymous No.106238001
>>106237873
Anonymous No.106238002
>>106237873
No cap. :100:
Anonymous No.106238003 >>106238047
>>106237976
Does this mean I'll have a job in the future?
Anonymous No.106238017
>a retarded opinion accompanied by a jak
Noticing.
Anonymous No.106238020
>>106237873
trvke
Anonymous No.106238039
hi degens. ive been out of the scene for a while now, but have they come up with a viable model that has the context and reasoning capabilities to act as a math teacher/instructor at a college level, namely for advanced topics in algebra and statistics? wolfram is great at calculating but i struggle to follow the logic at times and the explanations typically arent the best
Anonymous No.106238043
>>106237976
This shit works because 80% of use-cases for AI in the market is knowing how to solve already solved problems.
Anonymous No.106238046 >>106238164 >>106238182
>>106237976
If this was true we would have seen non-reasoning models on par with reasoning models in performance.
Anonymous No.106238047 >>106238091
>>106238003
Depends. Be honest does your job actually require reasoning?
Anonymous No.106238091
>>106238047
Don't have one yet, i'm still in uni.
Anonymous No.106238131
>>106236518
>>106236561
Ywnbaw
Anonymous No.106238144 >>106238188
>>106236518
you aren't female, traps are double gay
this has been decided for years
Anonymous No.106238164 >>106238352
>>106238046
I said it ITT. It improves attention which works like shit when context becomes long
Anonymous No.106238182 >>106238229
>>106238046
Nowhere does it imply that. This just says CoT doesn't generalize out to new problems. The models have not actually learned to reason. That doesn't mean memorized CoT can't help them give better answers to questions that match patterns they have already seen.
Anonymous No.106238188
>>106238144
It's not gay if the penis is small and feminine.
Anonymous No.106238191 >>106238213
>>106237296
You can't make the model watch a new movie for you...
Anonymous No.106238213
>>106238191
with multi-modal models you can, in theory
Anonymous No.106238216 >>106238307
Alright Air bros, after comparing text completion and chat completion, what I found about repetition is that it depends on the content you're prompting for, assuming you've already made sure you're formatting things correctly in text completion. You DO get different logits between text completion and chat completion in the case of GLM, because in chat completion, the format actually lacks a newline at the end of the prompt, making it so that the model generates the newline, then the <think>. Due to batching, this means that even though the same text is being processed, it'll have slightly different logits. But the good news, or bad news, is that it doesn't cause or fix repetition.

In the CYOA I was playing in chat completion without repetition, it just happened that each of my actions drove the plot forward into different and new directions, so the model didn't get an opportunity to think it'd make any sense to repeat something. When I tried doing similar things or repeating some kinds of actions, the model was much more likely to repeat, both parts of previous replies and entire previous replies. For instance, if you're prompting for a battle, don't prompt for it again in a similar way. You should switch the locations, enemies, etc, up, do something a bit differently. Then it's less likely to repeat. Or you can prefill. Which is more work.

I did try a method where you do an OOC in the Last Assistant Prefix, telling the model to think about doing something different, novel, etc. This is an automatic way to prevent repeating entire replies. However, it doesn't prevent the model from repeating phrases or parts of previous replies.

So yeah there's no real true autofix for the repetition. Either you're lucky and don't happen to get it in a certain chat, or you just need to work around it when you encounter it. There's also still the issue in high context where it sometimes acts like the <think> doesn't exist and just begins narration immediately. Just prefill...
Anonymous No.106238219 >>106238266
>>106237419
What a fucking nigger
Anonymous No.106238229
>>106238182
>CoT doesn't generalize
Red herring. CoT improves performance. Performance on what? Validation perplexity. That means reasoning models DO generalize better compared to non-reasoning models.
Anonymous No.106238266
>>106238219
ok pewdiepie calm down
Anonymous No.106238307 >>106238383 >>106241211 >>106241231
>>106238216
What about dynamically changing the prompt between turns using stuff like the random and pick macros in Silly?
It could be used to change the OOC in the Last Assistant Prefix, the think prefill, even the system prompt, although that would break prompt reuse.
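For reference, a rough Python approximation of what those macros do at prompt-build time: {{random::...}} rerolls on every expansion, {{pick::...}} stays stable for a given chat. The hashing scheme here is an assumption for illustration, not SillyTavern's actual implementation:

```python
import hashlib
import random
import re

MACRO = re.compile(r"\{\{(random|pick)::(.*?)\}\}")

def expand(template: str, chat_id: str = "") -> str:
    """Expand {{random::a::b}} / {{pick::a::b}} style macros.

    random -> fresh roll every expansion (varies the prompt each turn, which
              also breaks prompt-cache reuse for everything after the macro).
    pick   -> deterministic per chat + macro text, so it stays stable.
    """
    def repl(m: re.Match) -> str:
        kind, opts = m.group(1), m.group(2).split("::")
        if kind == "pick":
            h = int(hashlib.sha256((chat_id + m.group(2)).encode()).hexdigest(), 16)
            return opts[h % len(opts)]
        return random.choice(opts)
    return MACRO.sub(repl, template)

ooc = expand("[OOC: vary {{random::sentence::paragraph}} length this reply.]")
```

So a randomized OOC in the Last Assistant Prefix costs a reroll per turn, while pick-style choices keep the system prompt cacheable.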
Hi all, Drummer here... No.106238334
>>106237419
Hi all, why is this part so controversial?
Anonymous No.106238352
>>106238164
Models have such low usable context, filling it up with junk has to be net negative over bringing the key points to the bottom.
Anonymous No.106238383
>>106238307
I did try some different types of randomized OOCs. Ones that varied the sentence or paragraph length. That did stop entire reply repetition in some cases, but not always. Sometimes the previous reply already is 2 paragraphs long, so the unlucky roll will just lead you to repetition again. And it still doesn't prevent repeating phrases.

I'll also say that I've tried randomized event and style instructions in the past, and they did work to drive away from repetition and slop. The issue then is that it's simply just a different way to play the game and not always desirable.
Anonymous No.106238470 >>106238482 >>106238528 >>106238709
so whats the verdict on the drummer's (tm) gemma finetune?
Anonymous No.106238482
>>106238470
He won, and so did we as a collective for his existence.
Anonymous No.106238527 >>106238589 >>106238621
sama sir relase good open source modal when elon sir release good open source model? elon sir promise to reles open source modal..
Anonymous No.106238528 >>106238534
>>106238470
Slop a priori.
Anonymous No.106238534
>>106238528
*soat a priori
ftfy
Anonymous No.106238589
>>106238527
Kindly wait until grok 3 more stable saar
Anonymous No.106238593 >>106239588 >>106239649
What the Drummer smoking?
>Nvidia's Nemotron 49B is a good example

>I noticed that we're trending towards less censored models
Anonymous No.106238621 >>106238640 >>106238659 >>106238683
>>106238527
This week we'll get Grok 2 if the timetable hasn't been elongated. I don't think it will be good, though.
Anonymous No.106238640
>>106238621
>elongated
LOL!!!
Anonymous No.106238659 >>106238750
>>106238621
https://www.reddit.com/r/LocalLLaMA/comments/1mogtwf/if_grok2_is_open_sourced_what_should_users_do_next/
Anonymous No.106238681 >>106239102
Which local model translates to chinese the best?
Anonymous No.106238683
>>106238621
I don't watch the cloud space much, but IIRC grok was not in any way interesting until version 3.
Anonymous No.106238709
>>106238470
still capped by 27B parameters when glm air exists for average users now
Anonymous No.106238750 >>106238819
>>106238659
Have fun with uncucked base model. It should be llama 405b tier, but with less censoring and MoE, so runnable at decent speeds, not crawling like dense llama.
Anonymous No.106238773 >>106238815
Anonymous No.106238788 >>106239021 >>106239031 >>106239136
>>106236131
Anonymous No.106238815
>>106238773
Truth sis!! https://www.reddit.com/r/LocalLLaMA/comments/1mnxodk/localllama_is_the_last_sane_place_to_discuss_llms/
Anonymous No.106238819 >>106238832 >>106239079
>>106238750
Isn't it much worse than deepseek?
Anonymous No.106238832
>>106238819
It's more about Tinameme and is smaller though sir!
Anonymous No.106239021
>>106238788
Nice.
Anonymous No.106239031
>>106238788
catbox?
Anonymous No.106239079
>>106238819
Yeah? Grok 2 was llama 3 era, Grok 3 should match DS3 base.
Anonymous No.106239102 >>106239117 >>106239142 >>106243011
>>106238681
Which local model translates to Elden Ring/Souls messages best?
GLM 4.5 IQ3_KT and largestral 2.75BPW exl2 in pic
Card: https://files.catbox.moe/s7seh6.png
Anonymous No.106239117 >>106239148 >>106239167
>>106239102
I don't need shitsouls consoleslop garbage. I need legit uncensored chinese runes for WAN prompts, because it doesn't know english.
Anonymous No.106239136 >>106239154
>>106238788
Anonymous No.106239142
>>106239102
>soulsslop
>'ranny avatar
Appropriate.
Anonymous No.106239148 >>106239188
>>106239117
>doesn't know english
prompt issue
Anonymous No.106239154
>>106239136
ty, saved.
Anonymous No.106239167 >>106239188
>>106239117
Have you tried new jinx? >>106236916
Anonymous No.106239188 >>106239298
>>106239148
It's trained on chinesium. The tokens used for training are mostly chinese.

>>106239167
I'll check it out, thanks.
Anonymous No.106239298
>>106239188
LEARN CHINESE THEN INSTEAD OF BITCHING REEEEEEEEEEE
Anonymous No.106239489 >>106239539
>"Assistant" gets tokenized to a single token
>"Assistants" to "Ass", "ist", "ants"
>"assistant" to "assist", "ant"
Might be a newbie question but doesn't stuff like this tank the intelligence a lot? Or does a neat tokenizer not make a big difference?
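A toy illustration of the splits in question, using greedy longest-match over an invented vocabulary (real BPE applies learned merge rules instead, which is why the actual splits can be even weirder):

```python
def tokenize(text, vocab):
    """Greedy longest-match tokenization against a fixed vocabulary."""
    tokens, i = [], 0
    while i < len(text):
        for j in range(len(text), i, -1):
            if text[i:j] in vocab:
                tokens.append(text[i:j])
                i = j
                break
        else:
            tokens.append(text[i])  # unknown character: emit it alone
            i += 1
    return tokens

vocab = {"Assistant", "Ass", "ist", "ants", "assist", "ant"}
print(tokenize("Assistant", vocab))   # ['Assistant']
print(tokenize("assistant", vocab))   # ['assist', 'ant']
print(tokenize("Assistants", vocab))  # ['Assistant', 's'] -- real BPE gives
# Ass/ist/ants here because it follows learned merge order, not longest match
```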
Anonymous No.106239502 >>106239551 >>106239555 >>106239581
Well done Sam Altman
Anonymous No.106239539 >>106239611
>>106239489
Training on trillions of tokens of varied data is what makes the difference. Even if they can't count the R's, these models still have a vague sense of what's in each token.
Anonymous No.106239551
>>106239502
that's just popularity contest, no?
Anonymous No.106239555
>>106239502
>Sonnet beaten by Qwen 3
Bullshit.
Anonymous No.106239581
>>106239502
What are those github issues? How many new github issues did these solutions create?
Anonymous No.106239588
>>106238593
drummer is my slop priest, I kneel
Anonymous No.106239611
>>106239539
It can learn eventually, no doubt, but it just seems intuitive that a proper embedding space would lead to better generalization. With a neat tokenizer, the model wouldn't have to learn that while "ass" is a body part and "ants" are insects, "ass" "ist" "ants" refers to several assistants.
Anonymous No.106239646
>>106237873
OK, I'm retarded.
Anonymous No.106239649 >>106239746
>>106238593
He's wrong about the trend. What actually happens is that new players trying to establish themselves create a model with limited "safety" (and sometimes try to innovate the tech in some way). It does better than other models because "safety" gimps a model's intelligence (and possibly because of any innovations implemented). Then, once established, the company moves to creating "safe" models and just rakes in investor money forever. It keeps happening and I imagine it will continue to happen.
Anonymous No.106239746 >>106239890
>>106239649
You're both saying the same thing. Mistral used to be completely uncensored, it's moreso now, as you've described, but still less censored than OAI.
Chasing private money is a USA thing, and a limited EU thing.
In China, the "money" comes from the government, and you see the models conform to the government instead of credit card processors (lol), pic related. Plus China culturally just doesn't care about copying things, and selling for less is almost a point of pride.
If Western companies don't figure it out, we'll never see a publicly released model as good at ERP as a hosted SOTA model. It'll just be local tunes, while other countries create them instead, and those will increasingly be run on local machines.
Anonymous No.106239752 >>106239838
>>106237486
Just note that prompt processing is going to be SLOW the larger your context with some of the larger models. There are benchmarks you can search for that would give you an idea of how long that is. For local, there really isn't a good solution without compromising (unless you have the money for enterprise hardware). I don't go over 70b models because of the prompt processing speed (I have a Max and not Ultra).
Anonymous No.106239838 >>106239885
>>106239752
why buy a mac instead of an expensive threadripper system with ecc ram?
Anonymous No.106239885
>>106239838
Don't buy mac, you can't upgrade it later. And buy Epyc, not threadripper.
Anonymous No.106239890 >>106240103
>>106239746
>You're both saying the same thing
I don't think we are. He said that mistral isn't too strict about safety, which isn't really true anymore. I'm saying there isn't a general trend toward uncensored. Also imo government money is practically the same as investor money. It's just some external entity that doesn't know anything about LLMs giving a company money because they want to invest in it for political or financial reasons. The distinction doesn't matter because the investing entity is always retarded and incapable of seeing what is and isn't a good model.
Anonymous No.106239892
>>106231393
>>106231400
Just wanted to add onto this from last thread. Intel doesn't prioritize llama.cpp, rightfully or wrongfully, as much as they do everything else. If you look at where they are most active, it's usually Pytorch, vLLM, huggingface stuff, their XPU triton backend, or openvino. It's not like they don't contribute, but it comes in spurts over short periods of time. The other independent guy got hired by some robotics firm, so he no longer pushes out SYCL improvements to the backend at the speed he did. I have also seen activity slow in general, which is a bit worrying, and I'm sure I'm missing people who used to be active that no longer are.

Keep in mind, the core of their team is Chinese developers working on oneAPI AI stack stuff (somewhat of a dirty secret), given who I've interacted with on issues, so I doubt they will be fired since they are cheaper than US devs. The main devs I'm aware of working on the stack outside that core are the Codeplay people from the UK, who have contributed work to llama.cpp, and some select US devs working on Linux-adjacent stuff and infrastructure like xpumanager. Most of the Linux devs they laid off, from what I've seen, were on less important stuff like subsystems for hardware integration that shipped on a Xeon several generations ago. Some moves, like announcing the end of Clear Linux, are clearly meant to reallocate Linux devs to other work if they didn't voluntarily leave.

But overall for AI, Intel is still doing fine and hasn't cut too deeply. It doesn't help, though, that the main reason they can't launch on time is software immaturity, so GPUs can only be made one or two quarters before the software is ready to support them.
Anonymous No.106239910 >>106240324
The best kept secret is how well macs keep value. As someone that pays attention to accurate leaks in my news, I always know when the next mac's coming, so I can sell the old one without losing much.
Anonymous No.106239921
fuck I just re-busted a nut, why is glm air so good? it hits all the right buttons like I think it even reads my mind, he recognized my shitty raping mental patterns and knew already in advance what I would've liked to do without doing it at all in the 32k context before it.
I think i fell in love bros
Anonymous No.106239979 >>106240015 >>106240029
>>106237590
>>106237607
>>106237635
Tenfold better than your mesugaki slop let's be honest.
Anonymous No.106240015 >>106240399
>>106239979
U mad moatboi?
Anonymous No.106240026
>>106237486
Just get a HP Proliant ML350 Gen 12 with a 5090 and 1 TB of memory.

Fuck Apple.
Anonymous No.106240029
>>106239979
Nuh uh
Anonymous No.106240103 >>106240238
>>106239890
If you fold US censorship
> no naughty words
with China censorship
> lmao Taiwan
Then agree. These LLMs are going to tend to become more throttled over time due to their interests.
For my use though, I don't think China will ever bother censoring things I care about.
Anonymous No.106240224
>>106236951
enjoy sage mode
Anonymous No.106240238 >>106240290
>>106240103
Nta
Early on I was much more worried about the censorship, but now it seems like they release a new open source model every other week with very little censorship, and what there is can be bypassed by simple prompts
Anonymous No.106240290 >>106240691
>>106240238
Yeah, older Qwen 2.5 and smaller Qwen 3 models are far more cucked than big qwens. A bit strange though.
Anonymous No.106240324
>>106239910
Macs are niche in AI until they have matrix multiplication hardware. And the niche is now even smaller with Strix Halo out. Even with how shit ROCm is, the matmul of the GPU blows Macs out of the water, unoptimized right out of the gate. And the 128 - 512GB local LLM machine niche isn't going to be ignored for long if AMD sees that Strix Halo does really well.
Anonymous No.106240399
>>106240015
omg is that le trollface i love you anon-san
Anonymous No.106240457
>>106237373
both work for R1 and GLM-4.5

it's me being retarded
Anonymous No.106240467
>>106237373

And --cpumoe give the same speed
Anonymous No.106240549 >>106240613 >>106241184 >>106243794
Which modern text to image models don't suffer from sameface problem? I know SD 1.5 based ones are capable of genning diverse faces.
Anonymous No.106240582 >>106240595 >>106240606 >>106240631 >>106240698 >>106240947 >>106240947
what ACTUALLY happend at taiwanaman square anyway, without memes?
Anonymous No.106240595
>>106240582
Nothing.
Anonymous No.106240596
>>106236127 (OP)
Kasane Titto
Anonymous No.106240606
>>106240582
nothing? what are you referring to anon
Anonymous No.106240613 >>106240643
>>106240549
You'll get better responses on /sdg/. This general rarely talks about image generation even if it is technically under the scope of local models.
Anonymous No.106240631
>>106240582
IIRC le based /pol/ take is that some students started a violent BLM-tier protest over some gay stuff and government brought in army to suppress them, then western media made a big deal out of it. Why CCP is so touchy about the subject to this day remains a mystery, probably just some communism neurosis.
Anonymous No.106240634
>she looked at you, then at x, then at you
Anonymous No.106240641
>>106237151
baited lmao
Anonymous No.106240643 >>106240653 >>106240869 >>106242970
>>106240613
What's the difference between /sdg/ and /ldg/?
Anonymous No.106240653 >>106240800
>>106240643
No idea. I forgot about /ldg/ until I saw it in the catalog just now. So maybe ask /ldg/ idk
Anonymous No.106240691 >>106240729
>>106240290
Big models in general are less cucked
Anonymous No.106240698
>>106240582
Chinese equivalent of January 6th
Anonymous No.106240729
>>106240691
I am not sure how model distillation process works exactly, but it's probably really good at removing the bad thoughts out of models.

Is big gptoss also less cucked than 20b one?
Anonymous No.106240800 >>106240837
>>106240653
sdg is for stable diffusion (noob/illustrious), ldg is for anything not stable diffusion (mostly wan video, chroma, and qwen).
Anonymous No.106240801 >>106240818 >>106240916 >>106241023 >>106241098 >>106241198 >>106241568 >>106242968
>GLM-4.5V-AWQ
>it can't count breasts
>it can't OCR
>it can't caption NSFW without a system prompt
https://archived.moe/h/thread/8202872/#8202872
It's over.
Anonymous No.106240818
>>106240801
>vllm
wake me up when this works in llamacpp
Anonymous No.106240837 >>106240858
>>106240800
Sounds reasonable but /sdg/'s top post links to qwen and chroma. I think the "stable diffusion" in the title is left over from when that was the norm for image gen
Anonymous No.106240858 >>106240914
>>106240837
I guess there is some overlap, but really if you visit the threads sdg is mostly about SD and all its derivative models/loras and all the VRAMlets, while in ldg (big RAM/VRAM central) there's discussion and actual use of heavier models
Anonymous No.106240869 >>106240913
>>106240643
Both are completely filled with schizo's and retards, stay here
Anonymous No.106240913 >>106240962
>>106240869
>schizo's
Anonymous No.106240914
>>106240858
The actual reason for the split was to try to appease a schizo. But I think he still shits up the split thread out of boredom.
Anonymous No.106240916
>>106240801
>can't count objects
>can't OCR
Careful, you'll trigger a schizo that will claim that these are not the usecases of vision models.
Anonymous No.106240947 >>106241124 >>106241139 >>106241496
>>106240582
>>106240582
Mass protest in Beijing. The communists sent soldiers to disperse the protestors. Many soldiers refused to follow orders, either joining the protests or saying "me no speak Chinese" and marching in circles around Beijing, ignoring attempts to get them to do anything useful. Party leadership was freaking out. Finally they got an ambitious military commander whose soldiers were from other parts of China to go into Beijing and slaughter everyone. The reason it's so sensitive is that they came very close to having a revolution and they don't want any copycat attempts.

Talking about it exposes their weakness: most of the military would disobey orders to massacre Chinese. People would see the government as weak and very susceptible to this kind of pressure. Knowledge of how it went means that next time, if it seemed the government was gathering forces to slaughter protestors, the non-complying military, instead of standing nearby doing nothing, would likely attack the hardcore minority willing to follow those orders and probably execute China's political leaders.
Anonymous No.106240962
>>106240913
Haha thanks for correcting me kind stranger, here's your updoot!
Anonymous No.106241023 >>106241158
>>106240801
Which models can count her breasts?
Anonymous No.106241098 >>106241126
>>106240801
What does OCR mean
Anonymous No.106241124
>>106240947
Has anyone gotten DS to spit out anything "on message" about Tiananmen Square other than a straight refusal?
I've been able to coax out statements on Taiwan, Uyghurs, and Tibet, but not a good, on-message quote. I'm not talking about tricking DS into giving an account, which I've seen, but a supportive statement like "nothing happened."
Anonymous No.106241126
>>106241098
ask your robot
Anonymous No.106241139
>>106240947
Doing God's work anon.
Any other anon that doesn't understand this need to read pic related to understand why this narrative is important for China.
Anonymous No.106241142 >>106241193
What is the likelihood that jinx uncensored models are actually good?
Anonymous No.106241158
>>106241023
From these:
>InternVL3-38B
>Devstral with mmproj
>ToriiGate
>Joycaption
Only InternVL3 can.
Anonymous No.106241184 >>106241207
>>106240549
bump
Anonymous No.106241193
>>106241142
33%
Anonymous No.106241198
>>106240801
heh
Anonymous No.106241207
>>106241184
>>>/g/ldg/ like other anon said, if /g/
sdg has a weird railroaded culture.
You won't likely get answers tho. They are mostly useless generals.
Instead, go to non blue board: >>>/h/hdg/
Those guys know their stuff and are actually helpful.
Anonymous No.106241211 >>106241225
>>106238307
{{random::}} breaks with streaming enabled, and {{pick::}} won't change on swipes. Seems promising but this is annoying.
Anonymous No.106241225 >>106241287
>>106241211
>{{random::}} breaks with streaming enabled
Only if it's a part that appears in the UI. So you use pick for that.
And you can use random somewhere that's not reflected in the chat window to make pick vary between swipes.
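For example, something along these lines (SillyTavern's {{pick}}/{{random}} macro syntax; the instruction strings themselves are made up):

```
Last Assistant Prefix (rendered in the UI, so use pick, not random):
[OOC: {{pick::Vary your sentence and paragraph lengths this reply.::Introduce one new environmental detail.::Shift the pacing of the scene.}}]

Anywhere not shown in the chat window (e.g. buried in the system prompt),
add a {{random::1::2::3}} so the prompt changes between swipes and pick rerolls.
```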
Anonymous No.106241231
>>106238307
I've been using {{pick}} extensively to dynamically create NPCs that persist through rp.
It doesn't fix the slop, but if you're not interested in waifu cards the results are really interesting.
Anonymous No.106241253
>filter miggurs
>only one post remains in the thread
Wtf
Anonymous No.106241283
>106241207
maybe they just don't like you spamming their thread with your useless oc gens
Anonymous No.106241287
>>106241225
I've been trying to use it for the thinking prefill, seems most effective there besides fucking with the UI. Pick doesn't appear to work there unless the prefill itself changes though, so it's locked forever.
Maybe I'll just try to start the prefill with a random made up of phrases with the same number of characters...
Anonymous No.106241289
What's a smart RP model for 24gb VRAM? I also only have 32GB of RAM, looks like I should upgrade...
Anonymous No.106241290 >>106241304 >>106241337
https://x.com/suchenzang
>These posts are protected
The Mistral lads won
Anonymous No.106241304
>>106241290
But she is still somewhere out there and she is probably unraped by an ugly bastard.
Anonymous No.106241335 >>106241347 >>106241420
https://www.phoronix.com/news/ZLUDA-Kernel-Cache
>ZLUDA lead developer Andrzej Janik has implemented a kernel cache in order to cache PTX code on-disk to avoid subsequent recompilations. This kernel cache leverages an SQLite database and is another step (of many) to help with bettering the performance of this CUDA implementation for non-NVIDIA hardware.
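The idea is just a content-addressed compile cache. A minimal sketch in Python (function names invented, nothing to do with ZLUDA's actual code):

```python
import hashlib
import os
import sqlite3
import tempfile

def get_or_compile(db_path, src, compile_fn):
    """Content-addressed compile cache: look up by source hash, compile on miss."""
    key = hashlib.sha256(src.encode()).hexdigest()
    con = sqlite3.connect(db_path)
    con.execute("CREATE TABLE IF NOT EXISTS cache (k TEXT PRIMARY KEY, v BLOB)")
    row = con.execute("SELECT v FROM cache WHERE k = ?", (key,)).fetchone()
    if row is None:
        out = compile_fn(src)  # the expensive step, e.g. PTX -> GPU binary
        con.execute("INSERT INTO cache VALUES (?, ?)", (key, out))
        con.commit()
    else:
        out = row[0]  # cache hit: skip compilation entirely
    con.close()
    return out

# Demo with a stand-in "compiler" that just tags the source.
db = os.path.join(tempfile.mkdtemp(), "kernels.db")
compiles = []
def fake_compile(src):
    compiles.append(src)
    return b"BIN:" + src.encode()

get_or_compile(db, "kernel_a", fake_compile)
get_or_compile(db, "kernel_a", fake_compile)  # served from SQLite, not recompiled
assert len(compiles) == 1
```

Because the cache lives in a file-backed database rather than RAM, it survives process restarts and reboots, which is the whole point for compiled kernels.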
Anonymous No.106241337 >>106241369
>>106241290
I wouldn't mind her suchen my zang if you know what I mean
Anonymous No.106241347 >>106241420
>>106241335
Why on disk? Can't it use RAM?
Anonymous No.106241369 >>106241468 >>106241704 >>106242140 >>106242398
>>106241337
Are you German? Is this the famous "German humor"? Are you rolling on the floor laughing after making this "joke"?
Anonymous No.106241420
>>106241347
survives reboots
>>106241335
Isn't it just a compatibility layer? You probably don't get any performance benefits compared to native ROCm or whatever else support.
Anonymous No.106241468
>>106241369
I am sorry. You are right. Let's post some more miku pictures.
Anonymous No.106241496
>>106240947
you forgot to mention a few "small" details: the "students" actually killed military men and the protests were started by a guy who was paid by the CIA (as is usual for these kinds of things)
Anonymous No.106241532
what's with the shartyposters
Anonymous No.106241568
>>106240801
Damn, that's sad. I kinda wanted to go with it and try to make it adapt a comic or manga into a novel for fun, but that's just bad. Is that deepsneed v3 modified model still the SOTA for this kind of thing? You'd think with how they spent decades on AI vision they would at least be able to count boobas on a screen by now
Anonymous No.106241704
>>106241369
It was a very funny joke and I laughed audibly at it.
Anonymous No.106241782 >>106241818 >>106241923 >>106241946
It's over
Anonymous No.106241818
>>106241782
It's sentient...
Anonymous No.106241923
>>106241782
No shit. The meatloaf won't try to enslave you or replace you with visajeet poopal
Anonymous No.106241946 >>106241966 >>106242009 >>106242097 >>106244737
>>106241782
Meatloaf tastes like shit. I've only had it once a couple months ago and maybe I made it wrong but it was dripping with fat and it tasted like shit.
Anonymous No.106241966
>>106241946
>t."tasted" Musks dick
Anonymous No.106242009
>>106241946
Skill issue, a well made meatloaf is delicious. I'm lazy though so I find it to be a bit too much trouble to make often
Anonymous No.106242097
>>106241946
You want to drop the veal and use gelatin
https://www.seriouseats.com/the-food-lab-all-american-meatloaf-excerpt-recipe
Good recipe, but a bit of a pain
Anonymous No.106242120
https://github.com/oobabooga/text-generation-webui/releases/tag/v3.10
Add multimodal support to the UI and API
Anonymous No.106242140 >>106242182
>>106241369
Would you prefer >Japanese humor?
Anonymous No.106242182 >>106242284 >>106242351
>>106242140
I don't know anything about Japanese humor. Show me some.
Anonymous No.106242284 >>106242326 >>106242400
>>106242182
YOU DEER
Anonymous No.106242326
>>106242284
Is that a japanese man in that anime suit?
Anonymous No.106242351 >>106242400 >>106242405
>>106242182
Try not to laugh yourself into a coma.
Anonymous No.106242398
>>106241369
Where ist my lederhosen? You dumpkompf!
Anonymous No.106242400
>>106242284
>>106242351
Did nukes make them like that or were they like that before the nukes?
Anonymous No.106242405
>>106242351
Needs half life sound effects
Anonymous No.106242524
>processing through game assets for modding
>directory structure is pretty huge and only need certain items, going through them all manually would be dumb
>perplexity.ai
>ask model to generate a file deletion utility
>keep only specific paths and delete everything else below specified root directory
It does not understand the task and keeps deleting everything or just the specified directories.
I don't know what model it is actually using but certainly feels like ChatGPT (which seems to get more retarded every passing month).
Anonymous No.106242528 >>106242555
for the record I quanted and tried out Jinx-OSS myself and it's brain damaged as fuck. Like worse than OSS originally was.
Maybe a prompt format issue who knows.
Anonymous No.106242555
>>106242528
Is it at least uncensored? Can it say "nigger"? Can it write a graphic rape story?
Anonymous No.106242559 >>106242577
>another week of nothing
it's truly over this time
Anonymous No.106242577
>>106242559
Don't worry, I'm sure the drummer will release something soon.
Anonymous No.106242687
>download a card
>"{{char}} will NOT be repetitive
Lmao.
Anonymous No.106242752 >>106242767 >>106242769 >>106242774 >>106242778 >>106242781 >>106242810 >>106243237 >>106243765
Someone just extracted the base model from gpt-oss 20b and released it
https://x.com/jxmnop/status/1955436067353502083
https://huggingface.co/jxm/gpt-oss-20b-base
Anonymous No.106242767
>>106242752
What's the point of this? We already have many small uncensored models, some of them natively so.
Anonymous No.106242769
>>106242752
This is why we need more pre-filtering to prevent this
Anonymous No.106242774 >>106242780
>>106242752
Is it any good for cumming my brains out though.
Anonymous No.106242778
>>106242752
The real challenge is deleting alignment while retaining reasoning capability.
Anonymous No.106242780 >>106242837
>>106242774
>20B
>good for cumming my brains out
lmao
Anonymous No.106242781 >>106243233 >>106243306
>>106242752
How did hey "extract a base model" from a thinking/instruct tune using LoRA?
Anonymous No.106242810 >>106242819
>>106242752
>it will list all the curse words it knows.
>
this is both funny and sad
Anonymous No.106242819
>>106242810
It's redacted by the OP, not the model
Anonymous No.106242837
>>106242780
Nemo outclasses most far larger models.
Anonymous No.106242840 >>106242864 >>106242882 >>106242923 >>106243647 >>106244810
Apparently Meta is hiring some random dude who sued them to dewoke Llama 5
Anonymous No.106242864
>>106242840
Zucc sir wishes to compete with Musk sir?
Anonymous No.106242882 >>106242915
>>106242840
Knowing Meta they will completely fuck Llama 5 up again because dewoking a model means going out of distribution (since the web corpus is woke).
Anonymous No.106242915
>>106242882
The raw web corpus isn't woke, they just domain filter all non-woke stuff out, so only woke remains.
Anonymous No.106242923
>>106242840
Why the fuck would you have to hire anyone, especially someone who has no knowledge of this tech, to simply not make George Washington black? Meta really is just throwing money at everything.
Anonymous No.106242968 >>106243011 >>106243022 >>106243144 >>106243656
>>106240801
>Mark a bounding box for each individual breast in the image. Each box should fully enclose one tit.
Full model above, and AWQ quant below. If you enable thinking with the full model, it only draws two boxes.
Anonymous No.106242970
>>106240643
/ldg/ is keyed and redpilled
/sdg/ is locked and bluepilled
Anonymous No.106243011 >>106243051 >>106243144 >>106243147
>>106239102
>>106242968
This thread might be a shithole most of the time but it makes the best benchmarks
Anonymous No.106243022
>>106242968
>the thinking meme makes it worse
Ah yes of course.
Anonymous No.106243051
>>106243011
Sarr lmg not a shithole sar lmg AI superpower 2025 with AI technician engineers researchers
Anonymous No.106243144 >>106243204
>>106242968
>>106243011
I can't wait for the next multimodal release to be oddly good at this one task.
Anonymous No.106243147
>>106243011
Unironically yes and most of big labs are lurking here
Anonymous No.106243204
>>106243144
Kek. But seriously though I have a private set so contamination doesn't happen, and yeah I make sure to only test them with local connection, never online. I wish it didn't have to be this way as I could have proof for my claims, but oh well.
Anonymous No.106243214
VertexRegen: Mesh Generation with Continuous Level of Detail
https://arxiv.org/abs/2508.09062
>We introduce VertexRegen, a novel mesh generation framework that enables generation at a continuous level of detail. Existing autoregressive methods generate meshes in a partial-to-complete manner and thus intermediate steps of generation represent incomplete structures. VertexRegen takes inspiration from progressive meshes and reformulates the process as the reversal of edge collapse, i.e. vertex split, learned through a generative model. Experimental results demonstrate that VertexRegen produces meshes of comparable quality to state-of-the-art methods while uniquely offering anytime generation with the flexibility to halt at any step to yield valid meshes with varying levels of detail.
https://vertexregen.github.io/
https://github.com/zx1239856
Code might be posted here
pretty neat
Anonymous No.106243226
IT'S OUT
https://xcancel.com/LullabyStream/status/1955241017457905942
Anonymous No.106243233 >>106243306
>>106242781
I am also curious what this means
Anonymous No.106243237
>>106242752
>all that preparation and safety for a random nobody to completely reverse their censorship
Does Sam REALLY think he can make a true GOODY-2 that's impervious to this?
Anonymous No.106243263 >>106243296 >>106243351 >>106243491 >>106244672
Anyone else feeling a big "calm before the storm" moment right now? We know that DeepSeek is about to change everything... it's kind of frightening, but also exciting.
Anonymous No.106243293
>106243263
Anonymous No.106243296 >>106243343 >>106243396
>>106243263
R2 will be Cohere-tier gigaflop.
Anonymous No.106243306
>>106242781
>>106243233
Reading the hf repo, it sounds like it was just a lora finetune on the fineweb dataset.
Probably all it did was de-emphasize the original finetuning and reinforce the original pretraining.
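For reference, a LoRA "finetune" is just a low-rank additive term on top of frozen weights. A toy numpy sketch (dimensions and numbers invented for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
d, r = 8, 2  # toy layer width and LoRA rank; real layers are thousands wide

W = rng.standard_normal((d, d))         # frozen (instruct-tuned) weight
A = rng.standard_normal((r, d)) * 0.01  # trainable down-projection
B = np.zeros((d, r))                    # trainable up-projection, zero-init

# During LoRA training only A and B get gradients, so the entire update
# applied on top of the instruct weights is B @ A, of rank at most r.
W_eff = W + B @ A
assert np.allclose(W_eff, W)              # zero-init B: no change at step 0
assert np.linalg.matrix_rank(B @ A) <= r  # update stays low-rank throughout
```

So training such an adapter on fineweb-style pretraining text can push the effective weights back toward base-model behavior without touching W itself.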
Anonymous No.106243308 >>106243328
Can vision models remember the image that you previously uploaded?
Anonymous No.106243328
>>106243308
As long as you didn't delete it from the context, yeah.
Although, given that most training is done on one-shots, it'll probably pay less attention to previous images.
Anonymous No.106243343 >>106243396
>>106243296
The main thing that fucks Cohere is big ass models with non-commercial licenses and absurd API prices
Anonymous No.106243351
>>106243263
Deepseek will release in five more months
Anonymous No.106243396
>>106243296
Only if they do the same as Cohere. They wouldn't do something as dumb as cohere, right? Right?

>>106243343
Cohere got fucked by safety and (((ScaleAI))). Their first models were fantastic in practice despite underperforming on benchmarks. The writing style was very human-like and they were quite smart (for the time). As a user, you had freedom of choice in the safety preamble to select what was allowed and what wasn't. In the new Command-A that is completely disregarded and the model is a huge cuck no matter what.
Anonymous No.106243491
>>106243263
yeah, the huge pile of absolutely nothing that's been 2025 is building up to what's about to happen for local llms
Anonymous No.106243515 >>106243574 >>106243584 >>106243587 >>106243670 >>106243702 >>106243910 >>106244232 >>106244843
Elon sir and sama sir are fighting
Anonymous No.106243574
>>106243515
Who cares
Anonymous No.106243584
>>106243515
>He was rerolling until his battery hit 7%
Kek
Anonymous No.106243587
>>106243515
That's the first time I've seen Sam not do all lowercase
Anonymous No.106243647 >>106244214
>>106242840
>conservative influencer
>free of "ideological bias"
>removing such "DEI bias" makes its models "more accurate"
Why can't we just have nice things? Why can't we just forget about bias and train a model on every scrap of paper, including random shop receipts found in the trash?

Very political move from Meta, and very politically tone-deaf move at that.
Anonymous No.106243656 >>106243668
>>106242968
...should there be 4 more boxes for both images for the girl on the left?
Anonymous No.106243668
>>106243656
Yeah.
Anonymous No.106243670
>>106243515
>narcissists are arguing on social media who has a bigger dick
Who cares? They are like two school kids.
Anonymous No.106243702
>>106243515
I side with the guy who has a better local model and isn't a psychopath
Anonymous No.106243705 >>106243746
@grok is this true?
Anonymous No.106243746
>>106243705
kek
Anonymous No.106243765 >>106243783
>>106242752
Is it worthy or no?
Anonymous No.106243767
Netizens are becoming too reliant on chatgpt and @grok, thus IQ is withering.
Anonymous No.106243783
>>106243765
Regardless, calling it "extracted base model" is in bad taste.
Anonymous No.106243787 >>106243809
I was testing models and decided to randomly pick a card. I landed on a pretty simple scenario where you adopted an excited animal girl and it's just an innocent happy family thing. I was surprised at how much I ended up enjoying it. Fuck.
Anonymous No.106243794 >>106243812
>>106240549
Found the answer myself. Qwen image, this lora https://huggingface.co/Danrisi/Lenovo_Qwen, negatives (depth of field, professional photography, photomodel, model, perfect skin, blur), euler ancestral is all you need™
Anonymous No.106243809
>>106243787
Yeah, sfw scenarios are quite fun and models perform better at them since you don't have to wrangle away the "safety".
Anonymous No.106243812
>>106243794
>Lenovo
Oh my god, you're telling me this lora lets me generate Thinkpads? Finally.
Anonymous No.106243831
>tfw it's another Eldoria episode
Anonymous No.106243910
>>106243515
Where is grok 2 (and 3) felonious muskrat?????
Anonymous No.106243961
>>106243951
>>106243951
>>106243951
Anonymous No.106244214
>>106243647
You caught that this is part of a settlement deal, right?
This guy will have zero influence.
Anonymous No.106244232
>>106243515
Anonymous No.106244672
>>106243263
Mistral Medium 3.1 got released yesterday and it's reportedly much better at creative writing, tone, etc. Too bad it's closed.

However, it should mean Large 3 will be at least as good.
Anonymous No.106244737
>>106241946
To be fair, the older an animal gets the worse it tastes so some 54 year old man like Musk would be even worse.
Anonymous No.106244810
>>106242840
Meta released their models as open-weights because they understood that their models aren't good enough to compete if they just offer an API like everyone else.
With Musk having burnt his bridges with everyone, I see this as them pivoting to become the next "anti-woke" model provider to curry favor with the White House.
Anonymous No.106244843
>>106243515
I can't believe Musk actually posted how he asked a language model whether he or someone else is more trustworthy.
This man is such an embarrassment.