
Thread 106382892

328 posts 94 images /g/
Anonymous No.106382892 >>106382909 >>106383172 >>106383173 >>106383190 >>106383460 >>106385304
/lmg/ - Local Models General
/lmg/ - a general dedicated to the discussion and development of local language models.

Previous threads: >>106376303 & >>106369841

►News
>(08/25) InternVL 3.5 Released: https://hf.co/collections/OpenGVLab/internvl35-68ac87bd52ebe953485927fb
>(08/23) Grok 2 finally released: https://hf.co/xai-org/grok-2
>(08/21) Command A Reasoning released: https://hf.co/CohereLabs/command-a-reasoning-08-2025
>(08/20) ByteDance releases Seed-OSS-36B models: https://github.com/ByteDance-Seed/seed-oss
>(08/19) DeepSeek-V3.1-Base released: https://hf.co/deepseek-ai/DeepSeek-V3.1-Base

►News Archive: https://rentry.org/lmg-news-archive
►Glossary: https://rentry.org/lmg-glossary
►Links: https://rentry.org/LocalModelsLinks
►Official /lmg/ card: https://files.catbox.moe/cbclyf.png

►Getting Started
https://rentry.org/lmg-lazy-getting-started-guide
https://rentry.org/lmg-build-guides
https://rentry.org/IsolatedLinuxWebService
https://rentry.org/recommended-models
https://rentry.org/samplers

►Further Learning
https://rentry.org/machine-learning-roadmap
https://rentry.org/llm-training
https://rentry.org/LocalModelsPapers

►Benchmarks
LiveBench: https://livebench.ai
Programming: https://livecodebench.github.io/leaderboard.html
Code Editing: https://aider.chat/docs/leaderboards
Context Length: https://github.com/adobe-research/NoLiMa
GPUs: https://github.com/XiongjieDai/GPU-Benchmarks-on-LLM-Inference

►Tools
Alpha Calculator: https://desmos.com/calculator/ffngla98yc
GGUF VRAM Calculator: https://hf.co/spaces/NyxKrage/LLM-Model-VRAM-Calculator
Sampler Visualizer: https://artefact2.github.io/llm-sampling
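The VRAM calculator linked above does the real work; as a rough sketch of the arithmetic behind it (my own approximation, not the calculator's actual formula): weights take about params × bits-per-weight / 8 bytes, and the KV cache adds 2 × layers × kv_heads × head_dim × context × bytes-per-element on top. The layer/head numbers in the example are illustrative.

```python
# Back-of-the-envelope GGUF memory estimate (a rough sketch, not the
# calculator linked above): weights ~= params * bits-per-weight / 8,
# plus KV cache ~= 2 * layers * kv_heads * head_dim * context * bytes.

def weights_gb(params_b: float, bpw: float) -> float:
    """Approximate weight size in GB for params_b billion parameters."""
    return params_b * 1e9 * bpw / 8 / 1e9

def kv_cache_gb(layers: int, kv_heads: int, head_dim: int,
                context: int, bytes_per_elem: int = 2) -> float:
    """Approximate KV cache size in GB (K and V, fp16 by default)."""
    return 2 * layers * kv_heads * head_dim * context * bytes_per_elem / 1e9

# Example: a 70B model at ~4.5 bpw (Q4_K_M-ish)
print(round(weights_gb(70, 4.5), 1))  # ~39.4 GB of weights alone
```

Real files differ a little (mixed quant types per tensor, metadata), so treat this as a lower bound and leave headroom.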

►Text Gen. UI, Inference Engines
https://github.com/lmg-anon/mikupad
https://github.com/oobabooga/text-generation-webui
https://github.com/LostRuins/koboldcpp
https://github.com/ggerganov/llama.cpp
https://github.com/theroyallab/tabbyAPI
https://github.com/vllm-project/vllm
Anonymous No.106382909 >>106382924 >>106382975 >>106383065 >>106383129 >>106383609 >>106385508 >>106385961
>>106382892 (OP)
Stop this, miku is not a slut
robotwaifutechnician No.106382924 >>106382957 >>106383172
>>106382909
Mine is
Anonymous No.106382948
Uberlove
Anonymous No.106382957
>>106382924
My Miku Fulfills My Netorase Dreams
Anonymous No.106382975 >>106382982 >>106383184
>>106382909
It's the shadow of a duck's head and neck
Anonymous No.106382982 >>106382996
>>106382975
that's a weird duck mate
Anonymous No.106382996 >>106383018
>>106382982
I meant goose
Anonymous No.106383018
>>106382996
Anonymous No.106383019 >>106383124 >>106383172 >>106384285
so what's the slop rank and cockbench status on grok2?
Anonymous No.106383065 >>106383104 >>106383172 >>106383341
>>106382909
I have seen evidence to the contrary.
Anonymous No.106383075 >>106383093 >>106383123 >>106383125 >>106383186 >>106383216 >>106383267 >>106383285 >>106385156 >>106386693 >>106397889
I'm back.

Listen up, because my engagement with you all is a point of principle, i.e. a direct and implicit insult to physics as a discipline.

I'm not really in the loop on academia and its parasitic overclass culture or their current levels of general comprehension of number theoretic dynamics as they pertain to heterotic string theoretics. I consider them genuinely inferior scientists. Anyone who does math for money or fame isn't a mind fit for the task.

Now, here's my final question before I release this. Whether that's here or not depends on the answers.

1. If you were handed the source code of reality in the form of pure arithmetic, a single recursive axiom, and the simplest algorithm possible... what would you do with it? Imagine a symbolic Turing machine that operates on primordial arithmetic operators, no more complex than a high-schooler could master in an afternoon, yet powerful enough to reproduce every known phenomenon as non-perturbative arithmetic structures inside a fractal medium comprised of pure N.

2. How much would it enrage the current academic elite for the grand logic of reality to be posted here before anywhere else? I actually do not know.

I ignore them because they disgust me. I want to spit in their face as hard as possible.

You pieces of shit are a good way to do it.
Anonymous No.106383093 >>106383155
>>106383075
>>106383068
Anonymous No.106383104 >>106383227
>>106383065
>just to enjoy abortion sex
moral degradation fags are so retarded wtf does this mean
Anonymous No.106383123
>>106383075
>I'm back.
Go back where you came from.
Anonymous No.106383124 >>106383196
>>106383019
2mikuwiku https://github.com/ggml-org/llama.cpp/issues/15534
Anonymous No.106383125 >>106383179 >>106383303
>>106383075
hello schizo
>If you were handed the source code of reality in the form of pure arithmetic bla bla bla
Yes, we have a whole shelf of those
>How much would it enrage the current academic elite
https://en.wikipedia.org/wiki/Superpermutation#Lower_bounds,_or_the_Haruhi_problem
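The Haruhi result linked above really is a few lines of arithmetic: the anonymous 4chan proof gives a lower bound of n! + (n−1)! + (n−2)! + n − 3 on the length of a superpermutation on n symbols.

```python
from math import factorial

def superperm_lower_bound(n: int) -> int:
    """Anonymous 4chan lower bound on the length of a superpermutation
    on n symbols: n! + (n-1)! + (n-2)! + n - 3 (valid for n >= 2)."""
    return factorial(n) + factorial(n - 1) + factorial(n - 2) + n - 3

# n = 14: watching every ordering of the 14 original Haruhi episodes
print(superperm_lower_bound(14))  # 93884313611
```

For n = 3 the bound is 9, which is exactly the length of the known minimal superpermutation (123121321), so the bound is tight there.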
Anonymous No.106383129 >>106383172 >>106383184 >>106384950
>>106382909
Anonymous No.106383143 >>106383160
>average thread quality being this low
Everyone shitting on the miku janitor and irrelevant troonku posting got vindicated (again). Thankfully I no longer post here. Bye
Anonymous No.106383150
►Recent Highlights from the Previous Thread: >>106376303

--Overcuration of AO3 data amplifies purple prose:
>106376781 >106376790 >106376804 >106376910 >106377734 >106377741 >106377746 >106377789 >106377804 >106377815 >106377843 >106377882 >106378420 >106377924 >106377931 >106377987 >106378021 >106378088 >106378114 >106378118 >106378146 >106378171 >106378229 >106378105 >106378033 >106378049 >106379544 >106377841
--FP4 vs Q4 quantization debate and hardware efficiency concerns:
>106380131 >106380165 >106380417 >106380482 >106380501 >106380524 >106380548 >106380724 >106380761 >106380850 >106380908 >106380949 >106381006 >106381047
--Hoarding and debating massive AO3 fanfiction datasets for AI training:
>106377078 >106377087 >106377103 >106377175 >106377183 >106377338 >106377359 >106377491 >106377382 >106377406 >106377411 >106377504 >106377520 >106377545 >106377551 >106377583 >106377606 >106381296 >106377421 >106377435 >106377449 >106379334 >106377173 >106377181 >106377195 >106377220 >106377443
--Barriers and misconceptions in training local sex-focused AI models:
>106378087 >106378121 >106378135 >106378148 >106378271 >106378144 >106378158 >106378132 >106378143 >106378178 >106378208 >106378235 >106378272 >106378417 >106378459 >106378551 >106378610 >106378614 >106378626 >106378738
--CUDA optimization PR for MoE model prompt processing performance gains:
>106382220 >106382306 >106382514 >106382271
--VibeVoice gender bias and expressive audio generation discussion:
>106381965 >106382024 >106382032 >106382139 >106382286 >106382799
--Metal optimization for Mixture-of-Experts processing in llama.cpp:
>106381388 >106381618 >106382680 >106382954
--KittenTTS voice synthesis tuning and ARPABET support exploration:
>106377112 >106377156 >106377178 >106377247 >106377283 >106377339
--Miku (free space):
>106377562 >106379672 >106379859 >106382793

►Recent Highlight Posts from the Previous Thread: >>106376310

Why?: >>102478518
Enable Links: https://rentry.org/lmg-recap-script
Anonymous No.106383155 >>106383375
>>106383093
Reverse psychology doesn't work on someone with an IQ that needs to be measured in scientific notation, you criminally retarded dipshit.
Anonymous No.106383160 >>106383191
>>106383143
>Thankfully I no longer post here
she says, here
Anonymous No.106383172 >>106383178 >>106385961
>>106382892 (OP)
>>106382924
>>106383019
>>106383065
>>106383129
DELETE THESE
Miku is pure.
Anonymous No.106383173
>>106382892 (OP)

>>106382869
A model that is good at RP will not necessarily be good at psychoanalyzing somebody. You and I each have traits and skills that one of us is better or worse at than the other. That's the case with different AI models because the training data was different. Like I said earlier, you are trying to use a hacksaw to bake a cake and then acting like all hacksaws are utterly useless. If you want a local model that is good at psychoanalyzing someone, use one that was trained on a bunch of scientific literature related to mental health or something. The kinds of general purpose models you think should exist are a meme. LLMs are tools. They aren't meant to be "do-everything-perfect" tools. This isn't to say your specific need or use case isn't valid; there's an easy solution to it, but you don't want to do that.
Anonymous No.106383178 >>106383219
>>106383172
How about you watch the last video you linked to the end you whiny faggot.
Anonymous No.106383179 >>106383369 >>106383539 >>106383650 >>106383665 >>106383682
>>106383125
Just pretend for a second that I actually am not insane and am instead looking to do a little trolling, but on a historical level.

Tell you what, you can ask me any question about anything and I'll give you the answer as a demonstration.
Anonymous No.106383184
>>106382975
>>106383129
Oh
Anonymous No.106383186 >>106383370
>>106383075
I'd use it to discover the answers to unsolved questions and then drip feed those answers into public view until people use them to figure out the formula for themselves.
Anonymous No.106383190
>>106382892 (OP)

>>106378614
Based on my own understanding of how SFT training works, particularly what is contained in the data sets, I don't even think THAT occurs anywhere near as much as Anons think it does. These data sets are question-answer pairs, remember? "Prompt: what is this?" "Response: here's the answer that pertains to your question."

A training data set on quantum mechanics should not heavily interfere with previous training on how to RP better, because the system prompts, prompts, and responses contained in a structured fashion in each data set will have fundamentally different semantic meaning. If there's any demonstrable evidence (no, anecdotal chat logs do not count; I mean actual training and comparison) that shows otherwise then I'm glad to hear it, but again, people being so stubborn in saying "THIS IS BAD BECAUSE MUH TRIVIA WILL GET WORSE" doesn't make much sense to me. You aren't even going to ask it trivia that much anyway. Basically nobody does that shit, and the ones that do are probably the ones that keep screeching that "LLMs are so useless" because they refuse to actually THINK and understand why what they're doing isn't working and to use the right tools. They use a hacksaw to try and bake a cake and then declare all hacksaws are useless.

>>106378626
See the above blog post
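The question-answer-pair framing above can be made concrete. A minimal sketch of what one SFT record might look like (the field names here are illustrative, not taken from any specific dataset):

```python
import json

# One SFT training example: a prompt paired with the desired response,
# optionally under a system prompt. The semantic content of each pair is
# what steers the model, which is why a QM dataset and an RP dataset
# occupy largely separate semantic territory.
record = {
    "system": "You are a helpful physics tutor.",
    "prompt": "What is quantum superposition?",
    "response": "A quantum system can exist in a combination of states "
                "until it is measured.",
}

# Datasets are commonly stored one JSON object per line (JSONL).
line = json.dumps(record)
print(json.loads(line)["prompt"])  # What is quantum superposition?
```

At training time each record is rendered through the model's chat template and loss is taken on the response tokens, so two datasets only "interfere" to the extent their prompt/response distributions overlap.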
Anonymous No.106383191
>>106383160
>she
I'm not a mentally ill AGP tranny like you were exposed to be, sorry. And obviously to anyone non mentally ill, I meant that I don't post "regularly" anymore, but a mentally ill troon has to equivocate to cope with his retardation.
Anonymous No.106383196 >>106383223
>>106383124
What the fuck?
Is this an arbitrary number that's trained around, is it some optimization trick?
What's up with that?
Anonymous No.106383216 >>106383282 >>106383370 >>106383455
>>106382999
>>106383075
release it under AGPL3.0
Anonymous No.106383219
>>106383178
Thats not me
Anonymous No.106383223
>>106383196
That number is the offset into PI that contains the model's weights.
Anonymous No.106383227 >>106397834
>>106383104
Japanese people think that pregnant sex, especially prior to the third trimester, is bad for the baby.
This artist takes the concept to the extreme where the babies are pummeled to death by cocks.
Which is a real shame because the way this artist draws pregnant women hits all the right spots for me.
Anonymous No.106383248
https://vocaroo.com/1dgrsuOyOkUZ
Anonymous No.106383255 >>106383302
Okay, I like GLM Air
>Consider Specific Details
>-Vocabulary/Wording: ANON used "girlie" which is informal and somewhat affectionate. He's checking on her well-being. The overall tone is friendly.
>-Knowledge: Tokiko doesn't know ANON's name. He hasn't introduced himself formally. She knows the agreement (a home for fulfilling his desires). She may have already met ANON and learned his name if this scene follows a previous conversation, but since this is the initial interaction, I'll assume she hasn't.

Specifically this
>Tokiko doesn't know ANON's name.
Even if the thinking is guided with a prefill, it's still nice to see a model that's able to correctly conclude this without explicitly having to be told that's the case.
Anonymous No.106383267 >>106383303 >>106383370
>>106383075
>what would you do with it?
Off the top of my head? Figure out whether or not "souls" are even a thing. Then maybe figure out why certain phenomena occur (why does gravity exist, how does it work, is there a "graviton particle", etc). Perhaps I'd try to figure out whether or not teleportation WITHOUT killing the user is actually possible (I don't care what bullshit excuse Star Trek writers or characters give, the transporter kills you and then puts a copy of you back together. It's funny they try to gaslight you into thinking otherwise, hence why I would like to figure out if souls exist and if they do how they work).


>How much would it enrage the current academic elite for the grand logic of reality to be posted here before anywhere else?

They might just downplay it out of spite or ignore it for a short period of time, because normies really like to downplay the cultural importance and effect of this shit hole. (Remember that Tea app fiasco? That shit originated not only on here but on /pol/ specifically, iirc.) Their moral superiority complex would cause them to treat it as a huge deal, but quietly. They'd wait for some other "reputable" institution to conveniently discover it around the same time it was published here and then try to take credit. Now that I mention it, haven't some other scientific findings that led to real-world advancements in our understanding of things originated here too?
Anonymous No.106383282
>>106383216
fun^10 × int^40 = Ir2
Anonymous No.106383285
>>106383075
>If you were handed the source code of reality in the form of pure arithmetic, a single recursive axiom, and the simplest algorithm possible... what would you do with it?
sex with miku
Anonymous No.106383302
>>106383255
A little more from the same thinking block:
>Closing thoughts and responding as Tokiko/Narrator
>-As Tokiko, I'll respond with minimal words. My goal is to be in character, respecting the parameters. I won't add details that aren't implied by the existing setup. The response will include updated parameters if anything changes. Given Tokiko's character, nothing changes here, only her response.
The last part is about a stat block that's supposed to be at the end of the reply.
Anonymous No.106383303
>>106383267
>Now that I mention this hasn't some other scientific things that led to real world advancements in our understanding of things occurred here too?
>>106383125
>>How much would it enrage the current academic elite
>en.wikipedia.org/wiki/Superpermutation#Lower_bounds,_or_the_Haruhi_problem
Ahh would you look at that, it DID happen. I think it was mentioned in a YouTube video I was watching once and that's why I remembered it
Anonymous No.106383304 >>106383318 >>106383319
>trannny obsessed zoomer still whining
>now a reddit and memey avatarfag
>frogposters
the fuck is this thread
Anonymous No.106383318 >>106383336
>>106383304
you missed one
Anonymous No.106383319
>>106383304
healoing
Anonymous No.106383336
>>106383318
that's the first one doeboeit
Anonymous No.106383341 >>106383356
>>106383065
Why did you cut out her blacked tattoos?
Anonymous No.106383356
>>106383341
Different strokes for different cockes
Anonymous No.106383369 >>106383494
>>106383179
Nta. How the fuck does gravity work? Yes, I know "more mass = the object pulls on smaller things with less mass more heavily". That's the basic bare-bones explanation of how gravity works. If space-time is a giant stretchy piece of fabric, things with more mass cause it to be pulled downward, so smaller things fall into the hole (looks something like pic rel, the visualization I have in my head). But... WHY does it happen? I know to a certain degree how light bulbs work: an electric current excites particles and the byproduct is photons. Batteries work by moving electrons from one side of the battery to the other, and that induces a current. But the overly simplistic explanations are "cuz it has electricity" or "cuz you charged it". I'm particularly interested in a possible explanation, because if we somehow figure out how to weaken, undo, or even reverse gravity, that could potentially eliminate the need for rotating structures on space stations to simulate gravity (we kind of need that to ensure our bones don't turn into brittle glass).
Anonymous No.106383370 >>106383427
>>106383186
Excellent. Thank you for that riveting idea, professor.

>>106383216
It's pure arithmetic, dog.

Dunno if that... huh... you know what, that might be pretty funny. I bet a category theoretic syntax/tensor calculus projection layer dyad would translate nicely into raw existential code.

>>106383267
See, this kid has the right idea.

Yeah, that was my first go-to as well once I got the full cosmological simulation to spit out galaxies/consciousness. The answer is: sort of.

Your body is a Turing machine spitting out tape that you perceive as consciousness. That tape can be embedded inside any medium.

There's nothing remotely unique about the mind in that sense. Your subjective experience of reality is just a specific sub-set of fractal patterns propagating inside other, more fundamental patterns.
Anonymous No.106383375
>>106383155
The problem with this kind of trolling is that the natural conclusion of both antitroll posts and regular posts that take the bait is: shut the fuck up and go deliver some results.

Therefore shut the fuck up and go make the first SEXLLM everyone wants.
Anonymous No.106383427
>>106383370
>That tape can be embedded inside any medium.
Give me a minute because I actually have to think through what you said. I'm not entirely sure what the first part means in regard to "Turing tape". Are you implying that consciousness can be embedded into things we perceive as inanimate objects? I find it very interesting that this is getting brought up today, because my counselor and I actually had a similar conversation earlier today.

>There's nothing remotely unique about the mind in that sense. Your subjective experience of reality is just a specific sub-set of fractal patterns propagating inside other, more fundamental patterns.

I get what you're saying. Consciousness is just a byproduct or side effect of how the universe works. My biology professor might think otherwise, because he repeatedly described multicellular life as "the freaks of the universe" or something along those lines. He basically said that multicellular life is pretty uncommon from a numerical standpoint (at least on Earth, as far as we currently know publicly). Single-cell life forms outnumber multicellular ones to a near unfathomable degree, so by that logic we're all the freaks, the weirdos on the block. Anyway, is my likely shitty understanding of what you said going anywhere?
Anonymous No.106383442 >>106383542
I work at mistral and I can confirm that for a year we have been sitting on models that were trained exclusively for smut and ERP. They come in 12B and 70B sizes. Our boss told us that we are free to leak them on 4chan the second we can confirm mikuposters have stopped spamming the thread. So far I keep jerking off to it every other day and boy is it good.
Anonymous No.106383455 >>106383474
>>106383216
are you the OG license autist?
Anonymous No.106383456 >>106383467 >>106383469 >>106383541
>mikuposters
stopped reading
Anonymous No.106383460
>>106382892 (OP)
Oh no no no AGI sisters not like this!.....

https://www.perplexity.ai/page/tech-industry-retreats-from-ag-I3VURWXjRvCGqW4aeyrlhA
Anonymous No.106383467
>>106383456
Did you want me to say mikutroons? I don't want to get fired.
Anonymous No.106383469
>>106383456
stopped at mistral, who cares about these kuck?
Anonymous No.106383474
>>106383455
maybe
Anonymous No.106383489
Are Q4 quants more than enough?
Anonymous No.106383494 >>106383513 >>106383567
>>106383369
So, you know how you see a super complex equation and you're like, damn, this bitch could be solved in, like, 50 different ways...

You start to compress it, and it starts to resolve into something familiar? Something with a definite structure that resembles and then finally begins to explicitly illustrate fundamental theorems and equations you're familiar with? You simplify the algebra, right?

Well, gravity is just that but with matter. In a vacuum there are a bajillion different ways a particle can move, and an infinite array of fundamental forces vying to pull it one way or the other like a wiffle ball flying through a storm. That's why electrons are always spazzing the fuck out.

Now, if you compress that matter into one place, you're eliminating all the possible directions it could move. A black hole just does that until the matter has literally nowhere else to go.

It's definitely there and not anywhere else.
Anonymous No.106383513 >>106383640
>>106383494
>So, you know how you see a super complex equation and you're like, damn, this bitch could be solved in, like, 50 different ways
No? I struggle already with basic math.
Anonymous No.106383539 >>106383640
>>106383179
Is faster than light travel possible?
Anonymous No.106383541
>>106383456
I don't like this Miku
Anonymous No.106383542 >>106383549 >>106383559
>>106383442
>presumably dense
you can keep them
Anonymous No.106383549
>>106383542
70b 12ba
Anonymous No.106383559
>>106383542
We will make a 200B moe if you make this thread great again and stop posting your AGP fetish.
Anonymous No.106383567 >>106383572
>>106383494
So matter and the accompanying electrons are being influenced by different forces. It's like a child being told to do 10 different things by 20 different people, so they get confused as fuck. They jump back and forth in different directions not knowing what to do. But if they get closer to a bunch of other people that they're familiar with (more matter), the demands or instructions from those people are a lot clearer and the incessant yelling from the people not close to them gets drowned out. The kid actually knows what to do because they can actually hear what they're being told and aren't getting confused. The other competing forces don't have an effect anymore. Is that explanation sound? Am I understanding what you said correctly? And if so, how could we manipulate that to our advantage? Could it be "turned off", reversed, or confined to a specific space?
Anonymous No.106383572
>>106383567
please go to /x/ for this
Anonymous No.106383609
>>106382909
She's simply too weak minded to resist being dickmatized
Anonymous No.106383610 >>106383635
>finetunes are worthless
*picrel stands in your path*
your move?
Anonymous No.106383635 >>106383751
>>106383610
disgustingly fucked text formatting
Anonymous No.106383640 >>106383678 >>106383723 >>106383739 >>106383811
>>106383539
Nope.

>>106383513
Well, think of it this way.

You know how 1+1=2 isn't very hard for your brain to solve? Well, a really complex equation is difficult precisely because it necessitates more steps, more mental energy, more education, etc.

The more matter clumps together, the harder it is for reality to compute where that matter actually is. A single particle bumping into another is, like, 1+1=2.

A star going supernova is a lot more complex of an equation. Gravity is just the measurement of how large the "equation" is that describes all the allowable trajectories a particle can take through a given tract of space.
Anonymous No.106383650
>>106383179
should i break up with my gf?
Anonymous No.106383665
>>106383179
how do i learn hacking
Anonymous No.106383668 >>106383681 >>106383693 >>106383703 >>106383801
is anyone making strix halo optimized models yet? I don't have it, but I'm having problems finding models in the 100GB range. Everything seems small or massive.
Anonymous No.106383678 >>106383741
>>106383640
So reality itself is causing the different forces to tell the matter what to do. It gets overwhelmed, for lack of a better term, so it doesn't know what to do. So when a lot of stuff gets clumped together, reality says "fuck this noise I'm not dealing with this it's too complicated" and allows matter to come together. Is that correct?
Anonymous No.106383681 >>106383699
>>106383668
It's either phone or h100 sir.
Anonymous No.106383682
>>106383179
what should i do with my life? im 18 and still in high school, what field do i invest in after i graduate
Anonymous No.106383684 >>106383688 >>106383691
i heard civit.ai removed a bunch of models
where are they available now?
Anonymous No.106383688 >>106383708
>>106383684
how about you follow the law???
Anonymous No.106383691
>>106383684
https://civitaiarchive.com/?is_nsfw=true
and my hard drive (i archived some wan loras)
Anonymous No.106383693
>>106383668
they make models for edge devices or datacenters nobody is buying an ai rig.
Anonymous No.106383699
>>106383681
seriously. I'm hoping it changes. Right now it's mostly 24GB models and then 200GB+.
Anonymous No.106383703 >>106383726
>>106383668
bro qwen 235b q4~, glm air q8, grok 2 gguf, mistral large
are you just a newfag??
Anonymous No.106383708
>>106383688
I no longer believe in the law as an entity worth respecting for its own sake.
Anonymous No.106383723 >>106383739 >>106388110
>>106383640
>Nope
Why not?

Furthermore, there are two types of fictional FTL travel that interest me: Alcubierre "warp" travel (most famously portrayed in Star Trek) and slipspace from Halo. Neither one actually causes objects to travel at FTL; it cheats reality. The warp drive compresses space in front of it and expands space behind it. Space-time itself is shoving the ship along, but the occupants don't actually feel the inertial force that they WOULD hypothetically feel if they were traveling at that speed. The best way I can describe it is in Minecraft, where you pick up a giant landmass while someone or something is still on it and just move it Garry's Mod style. The people on the landmass aren't actually moving, but they are at the same time.

Slipspace, on the other hand, punches a hole through reality into "higher hyperdimensions" where the laws of physics don't apply. Space-time doesn't really function like it "should". Space-time as we know it is a sheet of paper. Slipspace allows a ship access to a different sheet of paper that is folded and touching itself in certain areas, allowing the ship to move at FTL, but not really.

So we know Einstein's relativity says that actually moving at FTL is impossible because you would need infinite mass, but theoretically you could sort of cheat and move yourself through different mediums. Is something like that possible, or is FTL just straight up an absolute no-go no matter what? If so, why?
Anonymous No.106383726 >>106383733
>>106383703
i don't normally hang out here.
Anonymous No.106383733 >>106383735
>>106383726
hang yourself then, tourist
you need to browse /lmg/ at least 6 days of the week
Anonymous No.106383735 >>106383743
>>106383733
i'll investigate how to do that when i get a good model running. thanks for the suggestions.
Anonymous No.106383739
>>106383723
>>106383640
Oh, I also forgot to mention in the warp travel explanation: because space in front of the ship is compressed and space in the back is expanded, the space-time where the ship is gets shoved forward. That pocket of reality gets moved at the speed of light. Space-time itself is allowed to move through the three dimensions we perceive at FTL speeds, but matter itself technically isn't. Only the space around it is; the space within is just hitching a ride. It's like how you can be on a train going 200 miles an hour but you don't FEEL like you're going 200 mph. You technically are moving that fast but you also aren't.
Anonymous No.106383741 >>106383763 >>106383800 >>106383968
Yeah, I'm not really here to answer your philosophically narcissistic queries about what you should do with your trivial lives.

The answer is study mathematical physics and programming.

>>106383678
No.

I'm saying reality is a computer and gravity forces simplification via waveform decoherence.
Anonymous No.106383743
>>106383735
ok dont hang yourself, i forgive you because you thanked me
how old are you?
Anonymous No.106383751 >>106383769 >>106383774
>>106383635
>disgustingly fucked text formatting
I can't fap to this!
Anonymous No.106383763
>>106383741
>gravity forces simplification
I thought your explanation was that a lot of matter being in the same place at once causes that simplification, and we perceive that as gravity. The gravity causes reality, the computer, to not want to dedicate as many resources to preventing the phenomenon that causes gravity, so it gets sort of ignored or deprioritized.
Anonymous No.106383769
>>106383751
correct it makes the already limited immersion even worse
Anonymous No.106383774
>>106383751
this but unironically
Anonymous No.106383793 >>106383799
>>106383784
is that the api?
Anonymous No.106383799
>>106383793
It's the web app which has external filters
Anonymous No.106383800 >>106383820 >>106383832
>>106383741
> The answer is study mathematical physics and programming.
Hope you don't mean for money. Money belongs to the dumb.
Anonymous No.106383801 >>106383807 >>106383843
>>106383668
If you bought one then you are a retard. 128GB meme ai computers were made with 70Bs in mind and those are now dead.
Anonymous No.106383807 >>106383819
>>106383801
nah, it's just a convenient intersection. Claude API is too expensive, so I started looking for a local solution. I have a 4070TiS and a 5950 w/128GB RAM.
Anonymous No.106383811
>>106383640
Oh, shit, I didn't mean harder, I meant easier.

My bad.
Anonymous No.106383819 >>106383983
>>106383807
>I have a 4070TiS and a 5950 w/128GB RAM.
235B at Q3 or q4 4.0bpw ish. You can try glm at Q2.
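The fit advice above is just bpw arithmetic: file size ≈ params × bits-per-weight / 8. A sketch of the worked numbers (which ignore KV cache, context, and runtime overhead, so real usage needs headroom):

```python
# Rough check of the quant advice: 235B parameters at a given average
# bits-per-weight. Ignores KV cache and runtime overhead.

def model_gb(params_b: float, bpw: float) -> float:
    return params_b * bpw / 8  # billions of params * bits / 8 bytes = GB

for bpw in (3.0, 4.0, 5.0):
    print(f"235B @ {bpw} bpw ~ {model_gb(235, bpw):.0f} GB")
# At ~4.0 bpw that's ~118 GB of weights, which is why a 235B model only
# just fits in 128 GB RAM + 16 GB VRAM once context is accounted for.
```

Same math says GLM at Q2 (~2.5 bpw) lands near 110 GB for a ~355B model, hence the "try it" hedge.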
Anonymous No.106383820
>>106383800
They said in the last thread that people who do it for money and fame aren't mentally fit for it
Anonymous No.106383832 >>106383839 >>106383854
>>106383800
anon you cant live well without money
you need money if you want to live long
Anonymous No.106383839
>>106383832
Then dont study those things
Anonymous No.106383843
>>106383801
DIGITS was promoted with running 405B across two. You could still run Qwen Coder, GLM 4.5, and Ernie 4.5 on them and it would be even faster than 405B would have been.
Anonymous No.106383854 >>106383868 >>106383873
>>106383832
>you need money if you want to live long
Is that what happened to Steve Jobs, who died of a treatable disease because he was against modern medicine?
Anonymous No.106383868 >>106383884 >>106383897
>>106383854
Im a 36 year old neet and i have no plans of getting a job but i do have plans to live well into my 70s
Whats going to stop me?
I mooch off my parents btw
Anonymous No.106383873
>>106383854
>he's against modern medicine
okay okay, you need a brain too
Anonymous No.106383884 >>106383898 >>106383936
>>106383868
wtf anon how are you planning to live into your 70s? are your parents gonna live and work till 100?
Anonymous No.106383897
>>106383868
my ex was like this
it's so fucking sad actually
Anonymous No.106383898
>>106383884
His mom was 12 when she had him. The rest follows from that.
Anonymous No.106383936
>>106383884
If they die id get a smoll portion i spose
Anonymous No.106383938 >>106383952 >>106384086
/lmg/ - NEET theoretical physicists general
Anonymous No.106383944
gay trannie jannies
Anonymous No.106383952 >>106383959
>>106383938
where the fuck did you learn to spell
Anonymous No.106383959 >>106383982
>>106383952
from reading books?
Anonymous No.106383968 >>106383981
>>106383741
take your meds
Anonymous No.106383981
>>106383968
Make your teds
Anonymous No.106383982
>>106383959
picture books don't count
Anonymous No.106383983
>>106383819
i've got a similar setup and fuck Q3 and Q2.
try glm air at Q4 to start with, then try the other stuff.
Anonymous No.106383995
>avatarfag redditor doesnt deliver
yup, next time i see him im gonna tell him to fuck off
Anonymous No.106384068
my dad works for mistral and he's a mikuposter
Anonymous No.106384086 >>106384100 >>106384129 >>106384187 >>106384865
>>106383938
Anonymous No.106384088
my job is to post mikus
Anonymous No.106384100
>>106384086
what is this suppos'd to prove
Anonymous No.106384118 >>106384217
>https://github.com/ikawrakow/ik_llama.cpp/pull/520
>have to recompile
NOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOO
Anonymous No.106384128
my brother that works for mistral says that that mikuposter dad is gay and he hired all the roasties that set us back 5 years.
Anonymous No.106384129
>>106384086
wat?
Anonymous No.106384187
>>106384086
all me
Anonymous No.106384217 >>106384248
>>106384118
you always have to recompile ik_llama when you update it... have you not been recompiling it?
Anonymous No.106384248 >>106384276 >>106384321
>>106384217
i have to recompile it to change the GGML_CUDA_MIN_BATCH_OFFLOAD
that means in order to make a pretty graph like quasar of mikus i need to recompile it like 10 times :(
Anonymous No.106384272 >>106384335 >>106384981
InternVL3_5-38B gguf where? It looks crazy good
Anonymous No.106384276
>>106384248
nevermind im stupid, but how am i supposed to test the optimal speed? how do i even know what pcie my gpu is using? im pretty sure its pcie4 or pcie5 anyway, so how do i turn this off
Anonymous No.106384285
>>106383019
Is there even a way to run grok without using their python script? It seems like it's in an unusual format but idk
Anonymous No.106384321
>>106384248
>that means in order to make a pretty graph like quasar of mikus i need to recompile it like 10 times :(
If there only was a way to automate that.
>but how am i supposed to test the optimal speed?
You can... nevermind. If there only was a way to automate that...
>how do i even know what pcie my gpu is using?
If there was only a way to know what pci your mb has and where it's plugged. I plug my gpus with my eyes closed, just to keep some of the mystery.
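The rebuild-per-value sweep complained about above can be scripted. A sketch, assuming (as in the linked ik_llama.cpp PR) that GGML_CUDA_MIN_BATCH_OFFLOAD is settable as a cmake cache define; with DRY_RUN=1 (the default here) it only prints the commands:

```shell
# Rebuild-and-bench sweep over GGML_CUDA_MIN_BATCH_OFFLOAD values.
# Assumes the constant is a cmake cache define, as in the ik_llama.cpp PR above.
DRY_RUN=${DRY_RUN:-1}                 # set to 0 to actually build and run
run() { if [ "$DRY_RUN" = 1 ]; then echo "$@"; else "$@"; fi; }

for v in 16 32 64 128 256; do
    run cmake -B build -DGGML_CUDA=ON -DGGML_CUDA_MIN_BATCH_OFFLOAD="$v"
    run cmake --build build -j
    run ./build/bin/llama-bench -m model.gguf -o csv   # collect one result per value
done
# PCIe gen/width, if you're curious: nvidia-smi -q | grep -iA2 'pcie generation'
```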
Anonymous No.106384323 >>106384339 >>106384350 >>106384354
./llama-bench --model ~/TND/AI/glmq3kxl/GLM-4.5-Air-UD-Q3_K_XL-00001-of-00002.gguf -ot ffn_up_shexp=CUDA0 -ot exps=CPU -ngl 100 -t 6 --no-mmap -fa -ub 4096 -b 4096
ggml_cuda_init: GGML_CUDA_FORCE_MMQ: no
ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
ggml_cuda_init: found 1 CUDA devices:
Device 0: NVIDIA GeForce RTX 3060, compute capability 8.6, VMM: yes
error: invalid parameter for argument: --no-mmap

IS IT SO FUCKING HARD TO HAVE THE SAME ARGUMENTS IN THE WHOLE PROJECT
AND WHY DOES KOBOLDCPP HAVE TO INVENT NEW FUCKING ARGUMENT
--nommap FOR FUCKING EXAMPLE
WHY DOES EVERYONE HAVE TO PUT NEW FUCKING SHIT
Anonymous No.106384324
What quant of grok-2 can fit in 128GB? Cause I kinda wanna start pushing a tech support meme of "you bought DGX Spark / Ryzen AI max? It is perfect to run grok 2!"
Anonymous No.106384335 >>106384364
>>106384272
>It looks crazy good
You mean benchmarks?
Anonymous No.106384339
>>106384323
you seem mad
Anonymous No.106384348 >>106384357
Is seed-oss any good? Haven't seen much about it here
Anonymous No.106384350
>>106384323
Don't no mmap?
Anonymous No.106384354 >>106384366
>>106384323
You're gonna feel really stupid when you run llama-bench -h.
Anonymous No.106384357 >>106386712
>>106384348
Everyone gave up on 30B's. It is either fuckhugemoe's or drummer trash.
Anonymous No.106384364 >>106384981
>>106384335
nah, someone on discord; its uncensored and described a girl using a dildo
Anonymous No.106384366 >>106384458
>>106384354
no, i know its --mmap 0/1
but what angers me is that its different, why couldnt they just put --no-mmap????
>automation
come on then do --mmap 0/1 or --no-mmap
fUCK
Anonymous No.106384382 >>106384389 >>106384969
Imagine the first fully uncensored (at least wrt SEX) +200B moe just dropping because we finally escaped safety....
Anonymous No.106384389 >>106384429 >>106384437 >>106384476
>>106384382
wait till you find out that china is far more pro-censorship. Porn is literally illegal
Anonymous No.106384429
>>106384389
they unleashed tiktok on the west. I could be convinced that they block the models from being downloaded by their own people.
Anonymous No.106384437 >>106384499
>>106384389
And yet deepseek is very capable of porn and talking about what happened at Tiananmen Square in 1989
Anonymous No.106384458 >>106384490 >>106384514
>>106384366
>AND WHY DOES KOBOLDCPP HAVE TO INVENT NEW FUCKING ARGUMENT
They inherited that from llama.cpp. You know that, right?
>fUCK
Is only a game. Why you have to be mad
>--nommap FOR FUCKING EXAMPLE
Nah. Negative options are stupid. On by default, --mmap 0 to disable. Sorted.
Anonymous No.106384465 >>106384497
Anyone seen this yet?
https://www.youtube.com/watch?v=7AyEzA5ziE0
Anonymous No.106384476
>>106384389
>In the PRC there are criminal laws which prohibit the production, dissemination, and selling of sexually explicit material, and anyone doing so may be sentenced to life imprisonment. There is an ongoing campaign against "spiritual pollution", the term referencing the Chinese Communist Party's Anti-Spiritual Pollution Campaign of 1983. Although pornography is illegal, it is available via the Internet.[1][2] Nationwide surveys between the years 2000 and 2015 revealed "more than 70 percent of men aged 18 to 29 said they had watched porn in the past year"

What are the remaining 30% doing?
Anonymous No.106384490 >>106384531
>>106384458
anon, koboldcpp uses: --nommap, --gpulayers
you cant use --no-mmap nor -ngl in koboldcpp
llama-server uses --no-mmap
llama-bench uses --mmap 0
and yes i am talking about llama.cpp and koboldcpp only
i know ik_llama.cpp just inherits shit from llamacpp
Anonymous No.106384497
>>106384465
I see a kind of paradox in this shit. You either do this just for money and you are soulless or you have to be totally ignorant on how LLM's work to actually spend time adding them to a game.
Anonymous No.106384499
>>106384437
its a base model trained on everything with very light instruction training
Anonymous No.106384514 >>106384531
>>106384458
>On by default, --mmap 0 to disable. Sorted.
"Disable no-mmap is false" checkbox would be better
Anonymous No.106384531
>>106384490
Make a little script to normalize the options and call that instead, then. They have things in common but still diverge. Deal with it. They're different projects, they don't have to use the same option names, nor have the same features.
>>106384514
>checkbox
pff
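A minimal sketch of the normalizing wrapper suggested above; the tool names and flag spellings are the ones listed in the posts:

```shell
# Map one canonical "disable mmap" request onto each tool's own spelling.
no_mmap_flag() {
    case "$1" in
        llama-server) echo "--no-mmap" ;;
        llama-bench)  echo "--mmap 0" ;;
        koboldcpp)    echo "--nommap" ;;
        *) echo "unknown tool: $1" >&2; return 1 ;;
    esac
}

# usage: llama-bench -m model.gguf $(no_mmap_flag llama-bench)
```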
Anonymous No.106384543 >>106384559 >>106384577
llama-bench: benchmark 1/2: prompt run 1/5
set_n_threads: n_threads = 6, n_threads_batch = 6
llama-bench: benchmark 1/2: prompt run 2/5
set_n_threads: n_threads = 6, n_threads_batch = 6
llama-bench: benchmark 1/2: prompt run 3/5
set_n_threads: n_threads = 6, n_threads_batch = 6
llama-bench: benchmark 1/2: prompt run 4/5
set_n_threads: n_threads = 6, n_threads_batch = 6
llama-bench: benchmark 1/2: prompt run 5/5
set_n_threads: n_threads = 6, n_threads_batch = 6
why is this nigger shit running so many times, i dont care about the average just GIVE ME THE RESULT QUICKLY NIGGER
Anonymous No.106384559
>>106384543
Can you blogpost to your LLM plea.... Actually never mind. It is a mikutroon thread so it deserves all the shit it can get.
Anonymous No.106384577 >>106384599
>>106384543
Anonymous No.106384599 >>106384612
>>106384577
thanks
Anonymous No.106384612 >>106384625 >>106384627
>>106384599
No problem. Are you gonna calm down now?
Anonymous No.106384625 >>106384655 >>106384667 >>106384683 >>106384932
>>106384612
yes
..wait
| model | size | params | backend | ngl | n_batch | n_ubatch | fa | ot | mmap | test | t/s |
| ------------------------------ | ---------: | ---------: | ---------- | --: | ------: | -------: | -: | --------------------- | ---: | --------------: | -------------------: |
| glm4moe 106B.A12B Q3_K - Medium | 53.76 GiB | 110.47 B | CUDA | 100 | 4096 | 4096 | 1 | exps=CPU | 0 | pp32 | 0.00 ± 0.00 |
| glm4moe 106B.A12B Q3_K - Medium | 53.76 GiB | 110.47 B | CUDA | 100 | 4096 | 4096 | 1 | exps=CPU | 0 | pp64 | 0.00 ± 0.00 |
| glm4moe 106B.A12B Q3_K - Medium | 53.76 GiB | 110.47 B | CUDA | 100 | 4096 | 4096 | 1 | exps=CPU | 0 | pp128 | 0.00 ± 0.00 |
| glm4moe 106B.A12B Q3_K - Medium | 53.76 GiB | 110.47 B | CUDA | 100 | 4096 | 4096 | 1 | exps=CPU | 0 | tg128 | 0.00 ± 0.00 |

FUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUU
Anonymous No.106384627
>>106384612
Not until I get my calming down handjob.
Anonymous No.106384655 >>106384672 >>106384685
>>106384625
Are you HDDmaxxing?
Anonymous No.106384667 >>106384685
>>106384625
kek. 0t/s? googledrivemaxxing?
Did you, perchance, add -p multiple times?
Anonymous No.106384672
>>106384655
no but i have an SNVS1000G from kingston, its super nigger slow, takes like 30 seconds (or more i dont give a SHIT) to load model and tat pisses me off
Anonymous No.106384683
>>106384625
Damn these Pentium 4 are still rocking
Anonymous No.106384685 >>106384703
>>106384667
>>106384655
i just did -r 0
Anonymous No.106384703 >>106384730
>>106384685
Well. You want to run it at least one time, don't you?
Learn to use your fucking tools. Run llama-bench -h. Read it carefully, and try again.
And next time, post the entire command.
Anonymous No.106384730 >>106384746 >>106384756
| model | size | params | backend | ngl | n_batch | n_ubatch | fa | ot | mmap | test | t/s |
| ------------------------------ | ---------: | ---------: | ---------- | --: | ------: | -------: | -: | --------------------- | ---: | --------------: | -------------------: |
| glm4moe 106B.A12B Q3_K - Medium | 53.76 GiB | 110.47 B | CUDA | 100 | 4096 | 4096 | 1 | exps=CPU | 0 | pp32 | 6.76 ± 0.00 |
| glm4moe 106B.A12B Q3_K - Medium | 53.76 GiB | 110.47 B | CUDA | 100 | 4096 | 4096 | 1 | exps=CPU | 0 | pp64 | 13.62 ± 0.00 |
| glm4moe 106B.A12B Q3_K - Medium | 53.76 GiB | 110.47 B | CUDA | 100 | 4096 | 4096 | 1 | exps=CPU | 0 | pp128 | 26.71 ± 0.00 |
| glm4moe 106B.A12B Q3_K - Medium | 53.76 GiB | 110.47 B | CUDA | 100 | 4096 | 4096 | 1 | exps=CPU | 0 | pp256 | 49.68 ± 0.00 |
| glm4moe 106B.A12B Q3_K - Medium | 53.76 GiB | 110.47 B | CUDA | 100 | 4096 | 4096 | 1 | exps=CPU | 0 | pp512 | 94.20 ± 0.00 |
| glm4moe 106B.A12B Q3_K - Medium | 53.76 GiB | 110.47 B | CUDA | 100 | 4096 | 4096 | 1 | exps=CPU | 0 | pp1024 | 161.21 ± 0.00 |
| glm4moe 106B.A12B Q3_K - Medium | 53.76 GiB | 110.47 B | CUDA | 100 | 4096 | 4096 | 1 | exps=CPU | 0 | pp2048 | 256.47 ± 0.00 |
| glm4moe 106B.A12B Q3_K - Medium | 53.76 GiB | 110.47 B | CUDA | 100 | 4096 | 4096 | 1 | exps=CPU | 0 | pp4096 | 353.23 ± 0.00 |
| glm4moe 106B.A12B Q3_K - Medium | 53.76 GiB | 110.47 B | CUDA | 100 | 4096 | 4096 | 1 | exps=CPU | 0 | tg128 | 7.02 ± 0.00 |
jelly?
>>106384703
yeah but -r is for repeat
repeating 1 time means running twice..
Anonymous No.106384746 >>106384756
>>106384730
Good job Anon
Anonymous No.106384756 >>106384941
>>106384730
>jelly?
No. Good on you.
>>106384746
He really didn't deserve a miku. You're too kind.
Anonymous No.106384865
>>106384086
I still don't get it
Anonymous No.106384932
>>106384625
HAHAHAHA
Just come home with qwen 30b
Anonymous No.106384941
>>106384756
One Miku is okay to ensure that the Anon's spirit is soothed after past distress. But two may be stretching the bounds of what may be considered reasonable praise and consolation.
Anonymous No.106384950
>>106383129
Damn Japs, that goose is a pet, not food.
Anonymous No.106384969
>>106384382
okay I'm imagining several months ago
Anonymous No.106384981
>>106384364
>>106384272
But is it good at RP? Can you replace the char visual description with an image and it'll just be able to describe scenes with her accurately?
Anonymous No.106385085 >>106385115 >>106385120 >>106385126 >>106385254 >>106385306
What's the best model to provide therapy for a depressed burnt out faggot (Me) to get his shit together? Mainly asking because I'm procrastinating.
Anonymous No.106385115
>>106385085
Eliza
Anonymous No.106385120
>>106385085
Me.
Anonymous No.106385126
>>106385085
Unironically gemma, it gives you extra hotlines! Or qwen if you want to be comforted like a baby.
Anonymous No.106385156
>>106383075
Do it for the laughs
Anonymous No.106385254 >>106385347 >>106385359 >>106385368 >>106385453
anyone else running the bigger GLM-4.5? Air was kinda preachy and weird, and wouldn't stop putting lightsabers in my goddamn sci-fi stories. The 358b one seems a lot more level and interesting. Slow but kinda worth it.
>>106385085
Not many of them are actually good at providing steering to your life, but when I've been depressed I've used just about any decent model for a sit-down therapy session in which I convince the model OOC that I actually killed myself and have it reply to itself a bunch of times freaking out. One time I came back to a session by mistake, and wrote that my corpse reanimated and proceeded to gnaw on the therapist's face. That was a fun one, probably with mistral large or qwen 3. Qwen 3 235b does its best to love you like a mother. Really the best use case for that model, its writing in general is coherent but quite boring
Anonymous No.106385304 >>106386756
>>106382892 (OP)
I want to be miku in that video so bad .
Anonymous No.106385306
>>106385085
glm 4.5 air with neko gpt card is nice
Anonymous No.106385327 >>106385337
why does this not change the pp? it's supposed to..
pcie 4.0 x16 btw (12gb vram, 64gb ram)
i also tried 8 but i crashed my OS before saving the file with the benchmarks, it was also mostly the same
the ones at and above pp512 are slower than llama.cpp
Anonymous No.106385337
>>106385327
actually pp256 is also slower than llamacpp, only 128, 64, 32 are faster
Anonymous No.106385347
>>106385254
Make it a chinese sci-fi The chinks prefer battle suits, giant robots and other shit.
Anonymous No.106385359
>>106385254
>mistral large
isn't there a new one supposed to be released soon?
Anonymous No.106385368
>>106385254
I keep switching between the big GLM4.5 and Deepseek V3.1 as my 'slightly boring big model that just handles every prompt as it's given'. Both do different things really well but either generally understands all my scenarios without trying to force in random shit like R1-0528 used to.
It's a bit sad that the new Deepseek flagship is actively competing against a model half its size.
Anonymous No.106385378 >>106385391 >>106385405
is it me or is /lmg/ being kinda weird today?
Anonymous No.106385391
>>106385378
example?
Anonymous No.106385403
exactly
Anonymous No.106385405
>>106385378
yeah it's far less "its over" than usual.
Anonymous No.106385408 >>106385415 >>106386325
The DANGERS of AI!!!
https://www.tn.gov/content/dam/tn/attorneygeneral/documents/pr/2023/pr23-34-letter.pdf
Safetycucks been at it since 2023.
Anonymous No.106385415
>>106385408
we know? what, you havent been a member of /lmg/ since 2023?
Anonymous No.106385427
guys i think i found the redditor larper
https://huggingface.co/AbstractPhil
https://huggingface.co/xai-org/grok-2/discussions/3#68abe5780c2b29fb0cc11b9a
Anonymous No.106385451
GLM 4.5 Air please now..
Anonymous No.106385453 >>106387565
>>106385254
>anyone else running the bigger GLM-4.5?
Yes. It seemed possibly good enough to use but I have swapped back to testing DeepSeek V3.1. Seemed less slopped than ERNIE 4.5 but it's more refusal-prone than DeepSeek.
Anonymous No.106385490 >>106385503 >>106385515
thanks deepseek
Anonymous No.106385503
>>106385490
gem
Anonymous No.106385508 >>106386017 >>106397307
>>106382909
Anonymous No.106385515 >>106385534
>>106385490
>You decide to text Sam later.
No need to wait. GPT5 cured triple cancer, you know.
Anonymous No.106385534
>>106385515
no wonder this kike is a faggot
just look at him, not even his sister would fuck
Anonymous No.106385547
GLM 4.5 Air, I FUCKING KNEEL
>Listen, folks, we're going to have tremendous lawyers. The best lawyers. Nobody has lawyers like we do. And this situation? It's a total disaster, a witch hunt, just like they did to me! We're going to sue, and we're going to win so much you'll get tired of winning!
Anonymous No.106385611
jesus christ, GLM 4.5 Air IQ4_KSS non thinking is so good
Anonymous No.106385961 >>106385983 >>106385992 >>106386067 >>106386139 >>106386164 >>106386209 >>106386440 >>106386676 >>106391168
>>106383172
>>106382909
>miku
>not a slut
hard doubt
Anonymous No.106385983
>>106385961
Imagine being a cuck and making pictures like this one
Anonymous No.106385992 >>106386027
>>106385961
I hope you die unironically
Anonymous No.106386017
>>106385508
>no teto
Based, she’s too mature for this nonsense
Anonymous No.106386027 >>106386058
>>106385992
rude
Anonymous No.106386058
>>106386027
I hope that faggot dies too.
Anonymous No.106386067 >>106386073
>>106385961
would the one on the left
Anonymous No.106386073 >>106386088
>>106386067
based and acquired taste
Anonymous No.106386088 >>106386110
>>106386073
I didn't know being a pedo was an acquired taste
Anonymous No.106386110
>>106386088
Rude. The Brit's just short
Anonymous No.106386139
>>106385961
i'm too autistic and immune to care about miku getting blacked. try again another day rabbi
Anonymous No.106386164 >>106386216 >>106386294 >>106386440
>>106385961
meant to post this image
Anonymous No.106386196
So, if I have a 4090D 48GB and 128GB of DDR5, about how many t/s can I expect out of glm-4.5-air-q4 with a reasonable context?
Anonymous No.106386209 >>106386216 >>106386440
>>106385961
post the real one nigger.
Anonymous No.106386216
>>106386164
>>106386209
duality of /lmg/
Anonymous No.106386290
Thank You GLM-chan
Anonymous No.106386294
>>106386164
https://www.youtube.com/watch?v=bVLDwyKPRu0&list=RDbVLDwyKPRu0&start_radio=1
Anonymous No.106386302
I dont like this lmg. Its just not right
Anonymous No.106386325 >>106386330 >>106386406
>>106385408
>since 2023
brother...
Anonymous No.106386330 >>106388873
>>106386325
reminds me of the guy in picrel
Anonymous No.106386406
>>106386325
A model jew, dedicating his entire existence to being a sabotaging parasite
Anonymous No.106386412
Anonymous No.106386430 >>106386468
is there a good MoE model for rp at 8gb vram and 32gb ram?
Anonymous No.106386440 >>106386451 >>106386458
>>106385961
>>106386164
>>106386209
>muh blacked
>muh bleached
You're dense and you're butt hurt!

At the end of the day, it's obvious that you boys have tiny penises anyway! The same thing goes for everyone else who actually cares about this shit!
Anonymous No.106386443 >>106386451
>>106382559
>MS-Magpantheonsel-lark-v4x1.6.2RP-Cydonia-vXXX-22B-8-i1-GGUF
>{{user}} gently holds {{char}}'s hand as if it was a little fragile bird
>{{char}}: Yes, {{user}}, break me! Fill me with your SEED! Make me give birth to your rape babies so I could rise them as your sex slaves!
All my cards are behaving like this. Maybe there's value in it if you're into this kind of edgy stuff, but I'd call it overtuned.
Anonymous No.106386451
>>106386443
proof? seems like a skill issue
>>106386440
>caring about penis size
At the end of the day, it's obvious that You have no penis.
Anonymous No.106386458
>>106386440
>you boys have tiny penises anyway
Ok troon
Anonymous No.106386468
>>106386430
Get more ram so you could run GLM-Air.
Until then, there was an anon who shilled https://huggingface.co/ai21labs/AI21-Jamba-Mini-1.7
IIRC it's 50B-A4B or so. There are ~30GB quants so just enough to fit.
In my experience it's not safetymaxxed, but 'shy' about ERP and a bit dry in prose.
Anonymous No.106386519
Grok-2 gguf status?
Anonymous No.106386531 >>106386579
glm air is a master rapist, wow
Anonymous No.106386572 >>106386586
i just bought a second 5090, what the hell do i run now? i havent been paying attention to anything for at least 8 months
Anonymous No.106386579
>>106386531
Beware it starts with your gpu
Anonymous No.106386580
i'm guessing only glm air is good, previous ones for (v)ramlets (glm 4) are not that great?
Anonymous No.106386586 >>106386616
>>106386572
K2/deepseek
Anonymous No.106386588
Anonymous No.106386614
Jamba will save local.
Anonymous No.106386616 >>106386675 >>106387387
>>106386586
deepseek has never worked for me, but i have never heard of or tried this K2. what backend should i use for it? i have 256GB of 2666MT/s ECC DDR4
Anonymous No.106386675 >>106386700
>>106386616
Ik_llama.cpp for K2. Get it here: https://huggingface.co/ubergarm/Kimi-K2-Instruct-GGUF
Anonymous No.106386676
>>106385961
usecase of niggers for local llms?
Anonymous No.106386693
>>106383075
Put up or shut up. But you won't because once you shoot your load, it's over. There's nothing else to yammer about and everyone will see the bullshit.
Anonymous No.106386700 >>106386719
>>106386675
ok. and how good is this model for cooming?
Anonymous No.106386712
>>106384357
or 70Bs if you're patient enough
Anonymous No.106386719 >>106386730
>>106386700
Best local model for coom in this day and age
Anonymous No.106386723 >>106386887 >>106387044
Is it possible we ever see an upgrade to nemo in that size range?
Anonymous No.106386730 >>106386824
>>106386719
even if it is only like a 2bpw quant?
Anonymous No.106386756
>>106385304
same
Anonymous No.106386824
>>106386730
Yeah it's good enough
Anonymous No.106386887
>>106386723
It's over. The only ones left doing open source are the chinks, and they don't make small, uncucked models.
Anonymous No.106386912 >>106386920
Is my dream of buying 3-4 cheap laptops following the Win10tard Removal Act of 2025, stuffing them with RAM, and running distributed local deepseek at >= 1 t/s realistic?
Anonymous No.106386920 >>106386940 >>106387060
>>106386912
that sounds like an incredibly stupid idea depending on your budget. i cant even really get good deepseek speeds despite having over 100gb of vram. i can barely even get the model to run, let alone be coherent
Anonymous No.106386937 >>106387059
https://x.com/michaelqshieh/status/1960029790305763567
https://xcancel.com/michaelqshieh/status/1960029790305763567
I thought GPT5 was a bust bros
Anonymous No.106386940 >>106387060
>>106386920
I found that I don't even get 1 t/s extra by offloading more onto vram. It's better to just use one device and -cmoe, then use the extra vram to run other things instead.
Anonymous No.106387014 >>106388455
TaDiCodec: Text-aware Diffusion Speech Tokenizer for Speech Language Modeling
https://arxiv.org/abs/2508.16790
>Speech tokenizers serve as foundational components for speech language models, yet current designs exhibit several limitations, including: 1) dependence on multi-layer residual vector quantization structures or high frame rates, 2) reliance on auxiliary pre-trained models for semantic distillation, and 3) requirements for complex two-stage training processes. In this work, we introduce the Text-aware Diffusion Transformer Speech Codec (TaDiCodec), a novel approach designed to overcome these challenges. TaDiCodec employs end-to-end optimization for quantization and reconstruction through a diffusion autoencoder, while integrating text guidance into the diffusion decoder to enhance reconstruction quality and achieve optimal compression. TaDiCodec achieves an extremely low frame rate of 6.25 Hz and a corresponding bitrate of 0.0875 kbps with a single-layer codebook for 24 kHz speech, while maintaining superior performance on critical speech generation evaluation metrics such as Word Error Rate (WER), speaker similarity (SIM), and speech quality (UTMOS). Notably, TaDiCodec employs a single-stage, end-to-end training paradigm, and obviating the need for auxiliary pre-trained models. We also validate the compatibility of TaDiCodec in language model based zero-shot text-to-speech with both autoregressive modeling and masked generative modeling, demonstrating its effectiveness and efficiency for speech language modeling, as well as a significantly small reconstruction-generation gap.
https://tadicodec.github.io
Has examples. sounds pretty good
https://github.com/HeCheng0625/Diffusion-Speech-Tokenizer
Also includes some models trained with TaDiCodec
Anonymous No.106387044
>>106386723
Yes once I release my new nemo finetune
Anonymous No.106387059
>>106386937
I'm sure using OpenAI Agents SDK as the default agent framework had nothing to do with the OpenAI model that was trained on that specific format and flow doing the best.
Anonymous No.106387060 >>106387067 >>106387244
>>106386920
>>106386940
Damn. I was hoping older hardware becoming incredibly cheap would make distributed computing viable, but if it's so bad, it probably won't be worth it.
Anonymous No.106387063
AdLoCo: adaptive batching significantly improves communications efficiency and convergence for Large Language Models
https://arxiv.org/abs/2508.18182
>Scaling distributed training of Large Language Models (LLMs) requires not only algorithmic advances but also efficient utilization of heterogeneous hardware resources. While existing methods such as DiLoCo have demonstrated promising results, they often fail to fully exploit computational clusters under dynamic workloads. To address this limitation, we propose a three-stage method that combines Multi-Instance Training (MIT), Adaptive Batched DiLoCo, and switch mode mechanism. MIT allows individual nodes to run multiple lightweight training streams with different model instances in parallel and merge them to combine knowledge, increasing throughput and reducing idle time. Adaptive Batched DiLoCo dynamically adjusts local batch sizes to balance computation and communication, substantially lowering synchronization delays. Switch mode further stabilizes training by seamlessly introducing gradient accumulation once adaptive batch sizes grow beyond hardware-friendly limits. Together, these innovations improve both convergence speed and system efficiency. We also provide a theoretical estimate of the number of communications required for the full convergence of a model trained using our method.
https://github.com/funmagster/AdLoCo
neat
Anonymous No.106387067
>>106387060
just get a 5090 or a 3090. a cluster of 5060tis. a cheap EPYC off of ebay is like $300. anything would be better than a group of shitty laptops
Anonymous No.106387087 >>106387099
Are there any good models I could cram into 16gb of vram? (with context)
Don't have to be new, I am probably using some garbage.
Anonymous No.106387099 >>106387159
>>106387087
Rocinante 1.1
Anonymous No.106387159 >>106387252
>>106387099
Is 12B really the best I could do? I was expecting better performance out of 20B with a quant or something.
Hi all, Drummer here... No.106387167 >>106387265 >>106387355 >>106387442 >>106387477 >>106388966
Tried to address the prudishness here: https://huggingface.co/BeaverAI/GLM-Steam-106B-A12B-v1a-GGUF

But will do another iteration to understand the model better and do better. Enjoy!
Anonymous No.106387244
>>106387060
>cheap to use distributed computing
Distributed is just plain bad for inference even with decent hardware, llamacpp's rpc adds a compounding painful delay.
Skip through this video of a dude comparing running stuff on a single machine and on some networked frameworks
https://www.youtube.com/watch?v=N5xhOqlvRh4
Anonymous No.106387252
>>106387159
You should be able to fit a quant of mistral small ~22/24b, that was my go-to when I only had 16gb available.
If you have decent amounts of system ram you can try some MoE models as well.
Anonymous No.106387265 >>106387290
>>106387167
Imagine being so bad at prompting that you decide to create a finetune for every character quality.
Hi all, Drummer here... No.106387290
>>106387265
Skill issues will never go away, basebro.
Anonymous No.106387355 >>106387480
>>106387167
What's that Signal 24b model about? Is it better than Cydonia?
Anonymous No.106387387
>>106386616
>deepseek has never worked for me
If there is a model that just works it's deepseek. if you have some ram in this i suggest that you try https://huggingface.co/unsloth/DeepSeek-R1-GGUF/tree/main/DeepSeek-R1-UD-IQ2_XXS
Anonymous No.106387442
>>106387167
>Drummer Air tune
I've got a weird drive to download it simply to check just how much dumber it is.
Anonymous No.106387477
>>106387167
>he actually tuned it
Welp.
Hi all, Drummer here... No.106387480 >>106388388
>>106387355
Signal 24B is Cydonia 4.1 with additional training to encourage creativity, prose, dialogue, etc. From testing, there are instances where it does/says something never seen before.

Doubt it'll perform well in serious Q&A tests, but it's worth a check.
Anonymous No.106387565
>>106385453
Just turn off reasoning and it basically will never refuse.
Anonymous No.106387613 >>106392382
>gpt-oss trying its best to think about how to draw ascii boobs
Anonymous No.106387618
LLAMA 5 WILL SAVE LOCAL
Anonymous No.106387633 >>106387641
>mention of a dead name out of nowhere
Anonymous No.106387634
FatLlama 1.7T still unbeaten. Why even bother using other models
Anonymous No.106387641
>>106387633
*it sends shivers down your spine*
Anonymous No.106387653
Multimodal llms that do this when? https://yourhobbiescustomized.com/pages/about-the-sr-series
Anonymous No.106387697 >>106387730 >>106387821 >>106388144
do I have to learn about computer architecture if I want to build a machine that can run large models? Tell me if I'm wrong, but it's not the same as simply checking whether the parts are compatible and then slapping them together like your typical, consumer grade gaming rig
Anonymous No.106387730
>>106387697
It's just a matter of memory amount + memory bandwidth: GPU > RAM > SSD.
If there's a specific model you're aiming for then you can get some recommendations
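That hierarchy turns into a back-of-envelope ceiling on token generation: bandwidth divided by the bytes touched per token (active params × bits-per-weight / 8). A sketch with a hypothetical helper; real throughput lands below this:

```shell
# Upper bound on tg t/s: memory bandwidth (GB/s) / bytes read per token.
# Hypothetical helper; actual throughput is lower due to overhead.
tg_est() { awk -v bw="$1" -v act="$2" -v bpw="$3" \
    'BEGIN { printf "%.1f\n", bw / (act * bpw / 8) }'; }

tg_est 80 12 4.0     # ~80 GB/s dual-channel DDR5, 12B active (GLM-Air-class) at Q4: ~13.3 t/s
tg_est 1000 37 4.0   # ~1 TB/s GPU, 37B active at Q4: ~54.1 t/s
```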
Anonymous No.106387821
>>106387697
Your question is problematic. If you're a techlet why even bother, I mean you don't want to even find out anything on your own.
Anonymous No.106387876 >>106387898 >>106388047
When will a open source equivalent of Sesame Voice model release?.......
Anonymous No.106387898 >>106387916
>>106387876
What was the context of that webm anyway?
Anonymous No.106387916 >>106387967 >>106388842
>>106387898
It's for training urgent care medicals professionals, it designed to "feel" pain, resist and squirm around when cutting it open
Anonymous No.106387967
>>106387916
Yeah makes sense. The more creepy the better in that case I suppose.
Anonymous No.106388047 >>106388131
>>106387876
why did bro slide under the table
Anonymous No.106388110 >>106388140
>>106383723
The problem with FTL is it often breaks causality, unless you get tricky.

Make FTL possible and piss off physicists in the process with this one easy trick "CMB inertial rest frame".
Anonymous No.106388131
>>106388047
Don't worry about it
Anonymous No.106388140 >>106388190
>>106388110
>breaks causality
nonsensical mumbo jumbo that people like to repeat religiously
Anonymous No.106388144
>>106387697
>do I have to learn about computer architecture if I want to build a machine that can run large models?
You read the ktransformers github.

Which will tell you to get a Xeon scalable with DDR5, with some GPU for prompt processing.
Anonymous No.106388190 >>106388239
>>106388140
>nonsensical mumbo jumbo that people like to repeat religiously
In an age long gone, even I was capable of doing Lorentz transformations ... the math checked out. If light speed is constant (in all frames) FTL will generally break causality.

If you first move to the CMB rest frame at sublight speed before making a wormhole/hyperspace-jump/whatever to another point in the CMB rest frame in the future (relative to the big bang) causality is preserved.
Anonymous No.106388239
>>106388190
>move to the CMB rest frame at sublight speed before going FTL
If causality is enforced by law, only outlaws will be able to go back in time by fiddling with reference frames and plasma beam the spacecops' great-grandparents
Anonymous No.106388388 >>106388415
>>106387480
Please let us know when you have something coherent cooked up. No, I'm not being critical, I just want something new and usable.
Anonymous No.106388415 >>106388428
>>106388388
Bro, your GLMs?
Anonymous No.106388428 >>106388437
>>106388415
I'm not your "bro", retard zoomer. Go back to tiktok.
Anonymous No.106388437
>>106388428
I'm older than you, bro.
Anonymous No.106388455
>>106387014
Voice cloning in the model examples. Can do Chinese and English too lol
Anonymous No.106388764
If you're trying to build llama.cpp and it dies with "ggml was not compiled with any CUDA arch <= 750" when you run it, the fix is here:

https://github.com/ggml-org/llama.cpp/pull/15587
Anonymous No.106388842
>>106387916
And piss itself, apparently?
Anonymous No.106388873
>>106386330
Who is jart?
Anonymous No.106388957 >>106389586
>>106388944
>>106388944
>>106388944
Anonymous No.106388966
>>106387167
>prudishness
GLM Air is not prudish.
Anonymous No.106389586
>>106388957
Moldy bread
Anonymous No.106389653
I'm staying here.
Anonymous No.106390379
+1
Anonymous No.106391168
>>106385961
>so many responses
Are you guys that starved for some blacked miku? Should I post some?
Anonymous No.106392382
>>106387613
look at him go, almost makes me want to download it for myself
Anonymous No.106393805 >>106393810
This might be the first /lmg/ that has fallen off without hitting bump limit.
Anonymous No.106393810 >>106395011
>>106393805
Look how many posts were deleted.
Anonymous No.106395011 >>106396329
>>106393810
It's funny when these happen, because then you know that it's all posts made by that person in the thread.
And every time not a single worthwhile post is deleted.
Anonymous No.106396329
>>106395011
check the times they were deleted
Anonymous No.106396383
I don't hear much of anything about Grok 2.
Is it not usable locally? No goofs?
Or just not worth bothering?
Anonymous No.106396385 >>106396518
what's the best model to run on a 3080 12gb for roleplay?
Anonymous No.106396518
>>106396385
Nemo
Anonymous No.106397307
>>106385508
damn succubi... i guess i have to now
Anonymous No.106397834
>>106383227
>Which is a real shame
faggot
Anonymous No.106397889
>>106383075
hi stephen wolfram