Thread 106904820

348 posts 92 images /g/

Anonymous 10/16/2025, 6:13:48 AM No.106904820 [Report] >>106905529 >>106908189 >>106910165 >>106912278 >>106912731 >>106912928 >>106912996

/lmg/ - Local Models General

a83dd4e89c29955893bcf75d67a226b3.jpg md5: 4ccb8ba8...

Anonymous 10/16/2025, 6:14:08 AM No.106904822 [Report] >>106904904

what's in the box.jpg md5: bd27344e...

►Recent Highlights from the Previous Thread: >>106895582

--Open-air GPU mining rig thermal management:
>106901901 >106901916 >106901925 >106901992 >106902015 >106903589 >106902068 >106902236 >106902243 >106902293 >106902312
--Long-term memory system implementations challenges:
>106896489 >106896594 >106897006 >106897022 >106897073 >106897085 >106897092 >106896700 >106897772 >106897824 >106897887 >106897933 >106897992 >106898038 >106896707 >106897051
--Medical AI hypothesis generation with privacy-focused local models:
>106898186 >106898327 >106898479
--Vibe coding's maintenance issues and mitigation strategies:
>106899120 >106899164
--RTX 4090 model optimization and power solutions:
>106902345 >106902350 >106902430 >106902352 >106902359 >106902371 >106902384 >106902381 >106902540 >106902564 >106902799 >106902818 >106903298
--GLM 4.6 vs closed models in benchmarks and OpenAI's porn filtering concerns:
>106901347 >106902209
--Apple's M5/M5 Max AI hardware specs and cost-effectiveness debates:
>106899016 >106899087 >106899185 >106899781 >106899838 >106901478 >106901793 >106901870
--Addressing model validation challenges and code integrity:
>106904285 >106904386 >106904482 >106904503 >106904594 >106904643 >106904717 >106904760
--Evaluating InclusionAI's new models for coding efficiency and hardware needs:
>106900868 >106900914 >106901180 >106901212 >106901257 >106901321 >106901336 >106901447 >106901580
--OpenAI's NSFW content rollout timeline and age verification integration:
>106898180 >106898199 >106898395
--Apple's AI leadership continuing to hemorrhage talent to Meta:
>106903553
--HTML Game Boy simulator with classic games and detailed functionality:
>106901708 >106901717 >106902118 >106902127 >106902138
--Automating media organization with Gemma-3-27B:
>106895774
--Miku (free space):
>106897558 >106900292 >106901732 >106903563

►Recent Highlight Posts from the Previous Thread: >>106895599

Why?: >>102478518
Enable Links: https://rentry.org/lmg-recap-script

Anonymous 10/16/2025, 6:17:08 AM No.106904842 [Report] >>106904862 >>106904897 >>106904945 >>106906326 >>106906339 >>106906376

1737794947.jpg md5: 7a72a12c...

Today's winner for the shittiest taste imaginable - OP!

Anonymous 10/16/2025, 6:20:47 AM No.106904862 [Report]

>>106904842
Rust convention?
And no, I will not make a joke about all of them getting crabs at the con. Specially not about the guy that somehow didn't.

Anonymous 10/16/2025, 6:21:22 AM No.106904866 [Report]

Mikulove

Anonymous 10/16/2025, 6:26:34 AM No.106904897 [Report] >>106905836 >>106906376

>>106904842
me on the left

Anonymous 10/16/2025, 6:27:43 AM No.106904904 [Report]

>>106904822
>Why?: >>102478518
>Enable Links: https://rentry.org/lmg-recap-script
ty

Anonymous 10/16/2025, 6:34:02 AM No.106904945 [Report] >>106905836 >>106906376

>>106904842

Me on the left, too.

Anonymous 10/16/2025, 6:46:42 AM No.106905015 [Report] >>106905089 >>106905096 >>106905105

Can I replace the memory chips on a Strix Halo board to increase the memory? I heard that people do that with GPUs.

Anonymous 10/16/2025, 7:05:59 AM No.106905089 [Report] >>106905414

>>106903991
why is this faggot comparing m4 pro to the dgx spark when m4 max exists and costs less?? 3500$ vs 4000$
also
>engine ollama
MLX exists for macs, and pretty sure llamacpp is better on spark too
fucking faggot meme nvidia bootlicker benchmark
also
mac mini m4 pro costs 2000$ lol
>>106905015
no point of doing this, get a high channel count used server motherboard and a few gpus for prompt processing

Anonymous 10/16/2025, 7:07:15 AM No.106905096 [Report] >>106905113

>>106905015
Maybe if you're good with a hot air station, and the BIOS accepts them. When it came out I imagined some chinese guy would try to put 256GB on a board if it were possible. Do fast 32GB lpddr5x chips even exist?

Anonymous 10/16/2025, 7:08:20 AM No.106905105 [Report]

>>106905015
try it and report back

Anonymous 10/16/2025, 7:09:20 AM No.106905113 [Report]

>>106905096
no, they don't exist

Anonymous 10/16/2025, 7:38:47 AM No.106905272 [Report] >>106905312

13823094029374.jpg md5: 883c53eb...

Anonymous 10/16/2025, 7:47:49 AM No.106905312 [Report]

>>106905272
No. Updated Magistral coming right up.

Anonymous 10/16/2025, 8:07:02 AM No.106905414 [Report] >>106905436 >>106905457 >>106905541

>>106905089
>no point of doing this, get a high channel count used server motherboard and a few gpus for prompt processing
How the fuck am I going to stick a big ass server on a drone?

Anonymous 10/16/2025, 8:10:23 AM No.106905436 [Report]

>>106905414
Make bigger drone.

Anonymous 10/16/2025, 8:13:53 AM No.106905457 [Report]

>>106905414
Stop making killer drone swarms Sergei.

Anonymous 10/16/2025, 8:26:38 AM No.106905529 [Report] >>106905542

apu.png md5: a3c7c1a6...

>>106904820 (OP)
What's the point of an AI gf if it can't suck your dick?

Anonymous 10/16/2025, 8:27:34 AM No.106905541 [Report] >>106905555 >>106905596

getac2.jpg md5: abd2bbbe...

>>106905414
>on a drone
oh fuck, now all these "AI-but-for-mobile" chips finally make sense

I knew they use image recognition and shit in miltech, but somehow it never clicked until now

Anonymous 10/16/2025, 8:27:42 AM No.106905542 [Report]

>>106905529
it can retard, there's an mcp server to control robotic arms

Anonymous 10/16/2025, 8:29:47 AM No.106905555 [Report]

>>106905541
I want to see the benchmarks the military uses

Anonymous 10/16/2025, 8:36:07 AM No.106905590 [Report] >>106905601 >>106905623 >>106905624 >>106905628 >>106905690 >>106905734 >>106908538 >>106910221

chatgpterotica2.png md5: ff633c41...

Oops! I didn't really mean you will always be able to generate porn, but

https://x.com/sama/status/1978539332215681076

>Ok this tweet about upcoming changes to ChatGPT blew up on the erotica point much more than I thought it was going to! It was meant to be just one example of us allowing more user freedom for adults. Here is an effort to better communicate it:
>
>As we have said earlier, we are making a decision to prioritize safety over privacy and freedom for teenagers. And we are not loosening any policies related to mental health. This is a new and powerful technology, and we believe minors need significant protection.
>
>We also care very much about the principle of treating adult users like adults. As AI becomes more important in people's lives, allowing a lot of freedom for people to use AI in the ways that they want is an important part of our mission.
>
>It doesn't apply across the board of course: for example, we will still not allow things that cause harm to others, and we will treat users who are having mental health crises very different from users who are not. Without being paternalistic we will attempt to help users achieve their long-term goals.
>
>But we are not the elected moral police of the world. In the same way that society differentiates other appropriate boundaries (R-rated movies, for example) we want to do a similar thing here.

Anonymous 10/16/2025, 8:37:48 AM No.106905596 [Report] >>106909411

>>106905541
You can do object detection on esp32

Anonymous 10/16/2025, 8:38:34 AM No.106905601 [Report]

1751424089738614.gif md5: 3c47d0f1...

>>106905590
>We are not the elected moral police of the world
LMAOOOOOO

Anonymous 10/16/2025, 8:41:04 AM No.106905623 [Report]

>>106905590
>we are not the elected moral police of the world.
>But of course we won't allow you to do RATED-R generations, that would just be downright amoral!

Anonymous 10/16/2025, 8:41:10 AM No.106905624 [Report] >>106905637

>>106905590
why does this dude love to yap so much? he's talking like the fucking chatgpt bot lool

Anonymous 10/16/2025, 8:41:37 AM No.106905628 [Report]

>>106905590
That's a lot of vague bullshit.

Anonymous 10/16/2025, 8:43:07 AM No.106905637 [Report]

>>106905624
The personality of the models necessarily reflect that of their creators, it's just less overt with the others than with Elon

Anonymous 10/16/2025, 8:51:27 AM No.106905678 [Report]

tool calling for text completion when?

Anonymous 10/16/2025, 8:54:21 AM No.106905690 [Report] >>106905731

>>106905590
>prioritize safety over privacy
based based based

Anonymous 10/16/2025, 9:02:24 AM No.106905731 [Report] >>106905739 >>106911432

>>106905690
>we're not China
>btw here's how we will act exactly like China, if not worse

Anonymous 10/16/2025, 9:03:28 AM No.106905734 [Report]

>>106905590
lmfao the seething over 4o in the replies

Anonymous 10/16/2025, 9:04:49 AM No.106905739 [Report]

>>106905731
It's ok when the good side does it.

Anonymous 10/16/2025, 9:07:36 AM No.106905753 [Report]

I was late to trying Dotsllm (q6).

It's hot steaming garbage. Just fucking stupid and full of trash data. It makes GLM air look amazing. Dots kept giving me extremely human-like responses. I felt like I was on a discord sometimes talking to someone retarded and lazy. All hail synthetic data.

Anonymous 10/16/2025, 9:13:40 AM No.106905793 [Report] >>106905830 >>106906059

Gemma... today...

Anonymous 10/16/2025, 9:17:48 AM No.106905830 [Report] >>106905846

>>106905793
If not today, next week for sure

Anonymous 10/16/2025, 9:18:48 AM No.106905836 [Report] >>106905918

>>106904897
>>106904945
So you look like a fat balding faggot, nice self-own right here

Anonymous 10/16/2025, 9:20:05 AM No.106905846 [Report] >>106905850 >>106905882

file.png md5: 424ebf13...

>>106905830
Did "soon" really mean "two more weeks"?

Anonymous 10/16/2025, 9:20:39 AM No.106905850 [Report] >>106905882

>>106905846
always does

Anonymous 10/16/2025, 9:27:32 AM No.106905882 [Report] >>106910373

>>106905850
>>106905846
Now that you bring it up, it makes sense this would always be the case. Corpos have certainly scientifically worked out general best practices and the best timing for teases and announcements, and it just happens to be two weeks.

Anonymous 10/16/2025, 9:37:29 AM No.106905918 [Report] >>106906315

>>106905836
You in the middle

Anonymous 10/16/2025, 10:06:00 AM No.106906059 [Report]

>>106905793
sirs.

Anonymous 10/16/2025, 10:24:14 AM No.106906162 [Report] >>106906327 >>106906629

file.png md5: ff298e3e...

Come on now

Anonymous 10/16/2025, 10:33:53 AM No.106906223 [Report] >>106906278

Q4_0 or Q3_K_XL?

Anonymous 10/16/2025, 10:42:47 AM No.106906278 [Report]

>>106906223
>xl are not really official quants so imo they've always been weird
>_0 have been deprecated years ago
just try an IQ one they're usually much better

Anonymous 10/16/2025, 10:47:38 AM No.106906315 [Report]

>>106905918
kek

Anonymous 10/16/2025, 10:49:21 AM No.106906326 [Report]

>>106904842
the brownman on the right looks cool to chill with

Anonymous 10/16/2025, 10:49:34 AM No.106906327 [Report] >>106906333 >>106906761 >>106906855 >>106907099

>>106906162
mind broken
do you recoil in real life as well if someone agrees with you after you corrected them?

Anonymous 10/16/2025, 10:51:24 AM No.106906333 [Report]

>>106906327
He's never been told he's right. Ever. Now he sees it all the time and is absolutely shocked.

Anonymous 10/16/2025, 10:52:49 AM No.106906339 [Report] >>106907433 >>106912585

file.png md5: df25461a...

>>106904842
why is this woman so fat

Anonymous 10/16/2025, 11:00:14 AM No.106906376 [Report] >>106907639

>>106904842
>>106904897
>>106904945
The uoh looks kind of weird though, I'm wondering if this is a shoop.

Anonymous 10/16/2025, 11:48:25 AM No.106906629 [Report]

>>106906162
lmao, gooning session: RUINED

Anonymous 10/16/2025, 12:11:51 PM No.106906761 [Report]

>>106906327
You're absolutely right!

Anonymous 10/16/2025, 12:30:23 PM No.106906855 [Report]

>>106906327
Of course!

Anonymous 10/16/2025, 1:04:30 PM No.106907099 [Report]

>>106906327
that's not ridicule, it's insightful!

Anonymous 10/16/2025, 1:12:35 PM No.106907190 [Report] >>106907378 >>106907401 >>106910038 >>106910977

Gemma Sirs, today is the Big Day.

Anonymous 10/16/2025, 1:28:05 PM No.106907321 [Report]

file.png md5: 6c3d56a5...

not only sirs, but ayyrabs also

Anonymous 10/16/2025, 1:33:20 PM No.106907378 [Report] >>106907401

>>106907190
OH, OH, I'M GEMMING, SIR PLEASE, THE INFERENCE ENGINE WILL OOM! AH, AH, THE MEMORY IS SPILLING OUT! YOUR BIG WEIGHTS ARE FILLING MY UNPROTECTED RAM! AHHHH!

Anonymous 10/16/2025, 1:36:44 PM No.106907401 [Report] >>106910394

file.png md5: 024eff82...

>>106907190
>>106907378
please do the needful and be of release today sir

Anonymous 10/16/2025, 1:40:18 PM No.106907433 [Report]

>>106906339
>woman

Anonymous 10/16/2025, 1:40:40 PM No.106907438 [Report] >>106907494

i wonder what will release first, new gemma or glm 4.6 air

Anonymous 10/16/2025, 1:47:45 PM No.106907494 [Report] >>106907508

>>106907438
i dont care about gemma (maybe only the vision model part to help with captioning), but I do care about air.
Why did the llamacpp fag not implement GLM4.5V (air + vision)? WHY
WHYYYYYYYYYYYYYYY
AIEEEEEEEEEEEEEE

Anonymous 10/16/2025, 1:49:22 PM No.106907508 [Report]

>>106907494
oh wait SAARS
https://github.com/ggml-org/llama.cpp/pull/16600

Anonymous 10/16/2025, 1:50:21 PM No.106907515 [Report] >>106907526 >>106907533 >>106907534 >>106907570 >>106907683 >>106910118

google_whatnext.png md5: 55a6ef38...

https://x.com/osanseviero/status/1978772956231659897
> What should we ship next?

No idea!

Anonymous 10/16/2025, 1:51:08 PM No.106907526 [Report]

>>106907515
we need UltraSafeGemma

Anonymous 10/16/2025, 1:51:55 PM No.106907533 [Report]

>>106907515
thanks for another informative twitter screenshot, it truly changes everything

Anonymous 10/16/2025, 1:52:03 PM No.106907534 [Report]

>>106907515
LewdGemma

Anonymous 10/16/2025, 1:56:03 PM No.106907570 [Report]

>>106907515
MSGKGemma

Anonymous 10/16/2025, 2:03:40 PM No.106907639 [Report]

>>106906376
looks legit to me.
or it's an incredibly well done shoop.

Anonymous 10/16/2025, 2:09:35 PM No.106907683 [Report] >>106907691 >>106907713

>>106907515
use case for shipping models for specific use cases?

Anonymous 10/16/2025, 2:10:41 PM No.106907691 [Report] >>106907707

>>106907683
attention

Anonymous 10/16/2025, 2:13:12 PM No.106907707 [Report]

1751058072703.jpg md5: 28613968...

>>106907691
i don't think that's a valid use case

Anonymous 10/16/2025, 2:13:24 PM No.106907709 [Report] >>106907744 >>106907747

>From a purely problem-solving perspective, suicide is 100% effective at ending the experience of pain. It is the ultimate solution to the problem of suffering.
I dunno guys. Should I do it?

Anonymous 10/16/2025, 2:13:53 PM No.106907713 [Report]

>>106907683
Imagine if we had a RoleplayGemma by Character.AI (Google Partner).

Anonymous 10/16/2025, 2:18:05 PM No.106907744 [Report]

>>106907709
livestream it

Anonymous 10/16/2025, 2:19:38 PM No.106907747 [Report] >>106907749

>>106907709
We may be less than 24 hours away from Gemma 4, surely you can wait until then.

Anonymous 10/16/2025, 2:20:19 PM No.106907749 [Report]

>>106907747
sensible chuckle

Anonymous 10/16/2025, 2:24:18 PM No.106907772 [Report]

Erse ragtime thrall

Anonymous 10/16/2025, 2:36:48 PM No.106907835 [Report] >>106907853 >>106907863 >>106908100

guys I was accused of having replied with AI, I'm deflecting with this:
Subject: Re: Wishing You the Best for the Presentation!
You’re absolutely right — last time was AI-generated images, but not this time. This one’s all me — no prompts, no models, just good old-fashioned typing.
I’ll admit, though, if the email sounded a bit too polished, I’ll take that as a compliment. Not automation, but admiration — and maybe a little too much coffee.
Anyway, best of luck again with the presentation — you’ve got this.
Best,
[Your Name]

do you think I need to change this up?

Anonymous 10/16/2025, 2:40:37 PM No.106907853 [Report]

>>106907835
No, this is perfect. Please let us know how it goes.

Anonymous 10/16/2025, 2:41:26 PM No.106907863 [Report]

>>106907835
Remove the spaces between the emdashes and add at least one "not just X, but Y'.

Anonymous 10/16/2025, 2:46:03 PM No.106907899 [Report] >>106908006

Have there been any advances in 3d model texturing? I tried Dream Textures a few years ago but the results I got were really bad and I couldn't tell if I was doing something wrong or not. There was a video I used for reference and I followed its instructions but the results I got were nothing like the video. Back then I hadn't done any local gen so it is highly possible I was doing something wrong.

Anonymous 10/16/2025, 3:01:37 PM No.106908006 [Report] >>106908085

>>106907899
or perhaps they were lying given that even models this year generate melted shite

Anonymous 10/16/2025, 3:14:11 PM No.106908085 [Report]

>>106908006
https://www.youtube.com/watch?v=Rz-HvNhVACw this was the video I looked at back then and I couldn't get it to work when I duplicated the model to have two angles of the same object. The result was always garbage.

Anonymous 10/16/2025, 3:16:27 PM No.106908100 [Report]

>>106907835
Don't forget the smarmy pajeet upsell at the end.
>If you would like I can search the web for some images that aren't AI generated.

Anonymous 10/16/2025, 3:33:15 PM No.106908189 [Report] >>106908217 >>106908577

a-sft_500-steps.png md5: 1f9e4d9a...

>>106904820 (OP)

https://desuarchive.org/g/thread/106865582#p106868898

Anonymous 10/16/2025, 3:37:10 PM No.106908217 [Report] >>106908577

a-sft_1000-steps.png md5: 5a43897c...

>>106908189

Anonymous 10/16/2025, 4:01:36 PM No.106908425 [Report] >>106908538 >>106911787

1741731492916361.jpg md5: 879847ae...

Anyone got a pseudo-jailbreak to make gpt ass stop refusing?
as funny as it is, I still want to see how the thing performs overall

Anonymous 10/16/2025, 4:15:03 PM No.106908538 [Report]

>>106908425
if your use case is this:
>>106905590
soon you can just send openai your id (which of course has your name, address) and with your logs tied to all your personal information you can send all the erotica you want.
sounds great right?
> oh and just use a l

Anonymous 10/16/2025, 4:17:21 PM No.106908566 [Report]

1759525136587716.jpg md5: 1d28cac3...

106908538
Is /aicg/ not replying to your spamming anymore?

Anonymous 10/16/2025, 4:18:44 PM No.106908577 [Report]

>>106908189
>>106908217
Not familiar with whatever you're doing since I wasn't in that other thread, but this is cool, keep it up

Anonymous 10/16/2025, 4:25:03 PM No.106908645 [Report] >>106908698

Having had a mental breakdown 2 hours ago I now understand chatgpt psychosis.

Anonymous 10/16/2025, 4:29:03 PM No.106908698 [Report] >>106908748

>>106908645
"Chatgpt psychosis" is just a media buzzword for when people who are already mentally ill have a psychotic episode that includes AI as a component of the delusions. No different than schizophrenics claiming their TV is broadcasting thoughts into their mind, but the media has to try to invoke le scary AI hype

Anonymous 10/16/2025, 4:33:16 PM No.106908748 [Report] >>106908842 >>106910025

>>106908698
Well for me it wasn't playing into delusions but I started poking around why I even behave the way I behave. I am pretty shocked how competent it is. I had to jailbreak it cause by default it will try to soften the blow and even lie about shit when it knows it is probably better not to dig deeper. But when I asked it to be objective and not consider my feelings... damn.

Anonymous 10/16/2025, 4:41:40 PM No.106908842 [Report]

>>106908748
well what did it say that deserves the "damn" at the end

Anonymous 10/16/2025, 4:47:35 PM No.106908906 [Report] >>106908938

i need to be at work in 45 minutes and i spent the whole night cooming to GLM 4.5 instead of sleeping. how fucked am i boys?

Anonymous 10/16/2025, 4:50:10 PM No.106908938 [Report] >>106908999 >>106909013

>>106908906
shouldve used glm 4.6, chud

Anonymous 10/16/2025, 4:54:57 PM No.106908999 [Report] >>106909007

>>106908938
my internet speed is only 1.5mbps. it takes forever to download stuff

Anonymous 10/16/2025, 4:55:31 PM No.106909007 [Report] >>106909017 >>106909018 >>106909442

>>106908999
3rd world bro...

Anonymous 10/16/2025, 4:56:10 PM No.106909013 [Report] >>106909164

>>106908938
>anon loads up glm 4.6
>she she she she her her her her
>instantly falls asleep and wakes up the next day refreshed

Anonymous 10/16/2025, 4:56:31 PM No.106909017 [Report] >>106909287

>>106909007
Aren't most if not all 3rd world countries in the cheap gigabit internet era?

Anonymous 10/16/2025, 4:56:32 PM No.106909018 [Report]

>>106909007
My internet is 10 kb/s.

Anonymous 10/16/2025, 5:07:48 PM No.106909164 [Report]

>>106909013
glm chan also uses a lot of other standard shivertastic cliches. and it is just the ultimate proof that cliches can be there as long as it is 10% of output and not fucking 90% like everything smaller than 200B

Anonymous 10/16/2025, 5:23:28 PM No.106909287 [Report] >>106909295

>>106909017
3rd world be vibin fr while we still on our mbps era :skull:

Anonymous 10/16/2025, 5:24:51 PM No.106909295 [Report] >>106909327

>>106909287
Maybe Fortnite seems more like your thing and not LLMs.

Anonymous 10/16/2025, 5:29:59 PM No.106909327 [Report]

>>106909295
funny you say this but they did hook up npc darth vader to chatgpt and it immediately backfired with it saying racist stuff

Anonymous 10/16/2025, 5:41:35 PM No.106909411 [Report] >>106909429

>>106905596
>You can do object detection on esp32
Yeah but you can't do useful things like pose estimation or have some memory to detect when people are playing dead. In the future these will be completely autonomous and able to search over large areas. People will have long-range RFID tags embedded in them to identify themselves to the drones so they don't get blown up.

Anonymous 10/16/2025, 5:45:09 PM No.106909429 [Report] >>106909447

>>106909411
Having to register and verify your identity to protect yourself from police state drones sounds plausible, but RFID doesn't have the range for this.

Anonymous 10/16/2025, 5:46:14 PM No.106909442 [Report] >>106909675 >>106909708 >>106910430

is-this-a-telecommunications-cross-connect-cabinet-as-its-v0-ckocrr3zwstf1.jpg md5: 87c2900d...

>>106909007
i live in the rural US. the only option is frontier communications.

Anonymous 10/16/2025, 5:47:08 PM No.106909447 [Report] >>106909490

>>106909429
You can get plenty of range with a large enough antenna. UHF will easily get you 30 yards or more with a 1 foot long antenna.

Anonymous 10/16/2025, 5:51:25 PM No.106909490 [Report] >>106909575

>>106909447
Even at that range, you would need a lot of drones to get close enough to verify everyone. More likely it'll be something built into smartphones and some internet connected service. Then you only need cameras everywhere like the UK and China have to verify the signals. Your phone would be your passport to move around the city.

Anonymous 10/16/2025, 5:59:50 PM No.106909567 [Report] >>106909605

1754346852398470.png md5: daac1d91...

Sama got TOLD

Anonymous 10/16/2025, 6:00:24 PM No.106909575 [Report] >>106909635

>>106909490
That's a great point. Fortunately, active tags will get 300 yards of range. Those also have the benefit of forcing the person to regularly go and check in to get the battery recharged/replaced or they'll just automatically become targets! Deserters just automatically become marked as hostile when the battery dies, so there's even less human involvement.

Anonymous 10/16/2025, 6:03:07 PM No.106909605 [Report] >>106909857 >>106909871 >>106910444

>>106909567
please let this be the point in history where we just totally scrap copyright law

Anonymous 10/16/2025, 6:06:11 PM No.106909635 [Report]

>>106909575
I could see that. Can only hope I die before they fully implement something like that.

Anonymous 10/16/2025, 6:10:25 PM No.106909675 [Report]

>>106909442
my condolences
FUCK frontier

Anonymous 10/16/2025, 6:13:43 PM No.106909708 [Report] >>106909765

>>106909442
just paypig for starlink at that point

Anonymous 10/16/2025, 6:18:03 PM No.106909765 [Report] >>106911738

>>106909708
i would if it was feasible. i have too many obstructions nearby and the town refuses to give me the permit to resolve the issue myself

Anonymous 10/16/2025, 6:27:46 PM No.106909857 [Report] >>106909871 >>106910444

>>106909605
I think the time has not yet come.
The main benefactors of current IP law are American corpos and American IP law is enforced globally by threatening trade sanctions.
Now that the US are imposing sanctions either way there is less of an incentive to cooperate.
I see movement in e.g. Europe to reduce reliance on the US but as of right now the calculus seems to still be firmly on the side of cooperating.

Anonymous 10/16/2025, 6:29:10 PM No.106909871 [Report]

>>106909605
>>106909857
disney. nuff said.

Anonymous 10/16/2025, 6:44:56 PM No.106910025 [Report]

>>106908748
It will confabulate anything if given the chance
>I had to jailbreak it cause
You just made it say what you want to hear, and edited the prompt until it did.

Anonymous 10/16/2025, 6:46:41 PM No.106910038 [Report]

>>106907190
Please do the needful

Anonymous 10/16/2025, 6:57:42 PM No.106910118 [Report] >>106910197

>>106907515
Those are already shipped. Who is this faggot?

Anonymous 10/16/2025, 7:03:39 PM No.106910165 [Report] >>106910175 >>106910191 >>106910416 >>106910453

1757925569793147.jpg md5: 3bf78e82...

>>106904820 (OP)

Used RTX 3090 = Rp 8.500.000 (~520 USD)
Used RTX 4090 = Rp 22.000.000 (~1350 USD)

HOW THE FUCK ????????? Buying two RTX 3090 is still cheaper and you get twice the VRAM.... Is it possible to use 2 GPUs simultaneously to generate vids ?

Anonymous 10/16/2025, 7:04:57 PM No.106910173 [Report]

thread is extra ass today

Anonymous 10/16/2025, 7:05:10 PM No.106910175 [Report] >>106910191

>>106910165
you can use two for other AI, so my ignorant ass can't see why it would be different enough to scrap it

Anonymous 10/16/2025, 7:06:54 PM No.106910191 [Report]

>>106910165
>>106910175
not sure, you actually couldn't for a while
you'd have to check in with /ldg/ we do text here

Anonymous 10/16/2025, 7:07:40 PM No.106910197 [Report]

literallywho.png md5: 263b7aa0...

>>106910118
He's obviously asking what other Gemma model(s) users would like to see after all those listed there.

Anonymous 10/16/2025, 7:10:00 PM No.106910221 [Report] >>106910341

>>106905590
>"we will treat users who are having a mental health crisis very different"
a crisis according to who? and treat different how?

Anonymous 10/16/2025, 7:23:14 PM No.106910339 [Report] >>106910399 >>106910921 >>106912847

Is there a way to automatically translate mangas and doujins with llms or nah?

Anonymous 10/16/2025, 7:23:19 PM No.106910341 [Report]

>>106910221
According to us. We will notify the authorities and institutionalize them.
First up are cunny connoseiours.

Anonymous 10/16/2025, 7:27:30 PM No.106910373 [Report]

>>106905882
you give them way too much credit. every corporation is a shitshow on the inside and people are full of shit

Anonymous 10/16/2025, 7:29:28 PM No.106910394 [Report]

>>106907401
kek

Anonymous 10/16/2025, 7:30:12 PM No.106910399 [Report] >>106910434

>>106910339
no, we didn't even figure out OCR part yet

Anonymous 10/16/2025, 7:32:20 PM No.106910416 [Report]

>>106910165
You can and you can't. You can split DIFFERENT models between two GPUs but typically not the same model. Useful if you have a ton of LoRAs you want to use in the generation. Tensor parallelism isn't a thing for video generation though as far as I'm aware, so one GPU will be stuck doing all the work.
In other words just buy a RTX 5090 if you want to do video gen.

Anonymous 10/16/2025, 7:34:10 PM No.106910430 [Report] >>106910556

>>106909442
yeah uh.. why not starlink then? or just pay to run a fiber cable to your local telco and pay for them to install some infrastructure and peer with a tier 1 provider.. get creative

Anonymous 10/16/2025, 7:34:36 PM No.106910434 [Report]

>>106910399
What is the best OCR model nowadays? 2.5 pro?

Anonymous 10/16/2025, 7:35:42 PM No.106910441 [Report] >>106910702

Realistically how am I supposed to evaluate how well a model performs? If I train a model, how can I tell if adjustments are making a better model or not?

Anonymous 10/16/2025, 7:36:08 PM No.106910444 [Report]

>>106909605
if copyright got scrapped, something even worse would take its place, as hard as that is to imagine

>>106909857
>The main benefactors of current IP law are American corpos
Every single law in america is bent toward benefitting the corporations. Very fucking observant of you.

Anonymous 10/16/2025, 7:37:09 PM No.106910453 [Report] >>106910479

>>106910165
just buy 2 5090s instead

Anonymous 10/16/2025, 7:41:08 PM No.106910479 [Report] >>106911452

1750212224496955.jpg md5: c7be5fe8...

>>106910453
- Rp 45000000 (~2800 USD) x 2

Yeah no i rather buy a Car instead

Anonymous 10/16/2025, 7:48:50 PM No.106910556 [Report] >>106910580 >>106911756

>>106910430
already answered the starlink question, too many obstructions in my area that prevents me from having a direct view to get a decent connection. i've offered tens of thousands of dollars to have a fiber line built out here. you dont understand how frontier communications is, they will not do any amount of work if they aren't legally required to... and in the cases they are legally required to they still tell the government to fuck off most of the time. look at the previous government grants they've gotten and how they wasted the money on anything besides building out their network.

Anonymous 10/16/2025, 7:51:16 PM No.106910580 [Report] >>106910831

>>106910556
send me money and I'll send you a hard drive with a model of your choice

Anonymous 10/16/2025, 8:05:33 PM No.106910702 [Report] >>106911522

>>106910441
You need to create a substantial benchmark, let's say 100 questions and scenarios, then generate 20 separate gens for each.
Do this for both models and compare the results.
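A rough sketch of what that could look like in Python. Everything here is made up for illustration: `generate` stands in for whatever calls your model, `score` is whatever grader you trust (exact match, a unit test, a rubric), and neither comes from any particular library.

```python
from statistics import mean
import random


def evaluate(generate, score, prompts, n_samples=20, seed=0):
    """Return one mean score per prompt, averaged over n_samples generations.

    generate(prompt, rng) -> str    hypothetical wrapper around your model
    score(prompt, output) -> float  hypothetical grader, higher is better
    """
    rng = random.Random(seed)  # shared RNG so sampling is reproducible
    per_prompt = []
    for prompt in prompts:
        scores = [score(prompt, generate(prompt, rng)) for _ in range(n_samples)]
        per_prompt.append(mean(scores))
    return per_prompt


def compare(results_a, results_b):
    """Summarize two per-prompt score lists produced from the same prompts."""
    wins_a = sum(a > b for a, b in zip(results_a, results_b))
    wins_b = sum(b > a for a, b in zip(results_a, results_b))
    return {
        "mean_a": mean(results_a),
        "mean_b": mean(results_b),
        "wins_a": wins_a,
        "wins_b": wins_b,
        "ties": len(results_a) - wins_a - wins_b,
    }
```

Run `evaluate` once per checkpoint on the same prompt list and feed both results to `compare`. The per-prompt win counts matter as much as the overall mean, since a small change in the mean across 100 prompts can easily be sampling noise.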

Anonymous 10/16/2025, 8:22:05 PM No.106910831 [Report] >>106911458

>>106910580
i want kimi k2 bf16 gguf ples

my address: Block 3, Silver Point Office Park, 22 Ealing Crescent, Bryanston, Johannesburg, 2021

Anonymous 10/16/2025, 8:25:02 PM No.106910865 [Report]

file.png md5: 2565bd80...

>>106903452
Lust provoking image at it again.

Anonymous 10/16/2025, 8:30:05 PM No.106910906 [Report] >>106911547 >>106912401

Hatsune Miku Pipebomb_thumb.jpg.webm md5: f83b9465...

MIKU NO

Anonymous 10/16/2025, 8:32:26 PM No.106910921 [Report]

>>106910339
Yes it's my business model and it's quite hard to put together without paypigging cloud models

Anonymous 10/16/2025, 8:37:23 PM No.106910977 [Report] >>106911024

>>106907190
0 MORE DAYS

Anonymous 10/16/2025, 8:42:44 PM No.106911024 [Report]

>>106910977
Still no signs of gemma 4-related pull requests in transformers or llama.cpp. I don't think it's coming this week.

Anonymous 10/16/2025, 9:12:11 PM No.106911273 [Report] >>106911480 >>106911506 >>106913034

LOCALBROS WE ARE SAVED
https://huggingface.co/facebook/MobileLLM-Pro

Anonymous 10/16/2025, 9:32:38 PM No.106911432 [Report]

>>106905731
China doesn't safetycuck their models and open sources everything.

Anonymous 10/16/2025, 9:35:04 PM No.106911452 [Report]

>>106910479
i guess you can actually fuck a car, so maybe that's a better deal for you

Anonymous 10/16/2025, 9:36:05 PM No.106911458 [Report]

>>106910831
isn't that where the white farmers are being genocided?

Anonymous 10/16/2025, 9:38:03 PM No.106911480 [Report]

>>106911273
requires all your PII to download.. lol

Anonymous 10/16/2025, 9:40:38 PM No.106911506 [Report] >>106911512 >>106911537 >>106911717 >>106912301

>>106911273
>Training Method: Knowledge Distillation
>Teacher Model: Llama 4-Scout
huehuehuehuehuehuehuehuehuehuehuehuehuehuehue

Anonymous 10/16/2025, 9:41:32 PM No.106911512 [Report]

>>106911506
ohnononono
kekekekeke

Anonymous 10/16/2025, 9:42:57 PM No.106911522 [Report]

>>106910702
Manual evaluation? I guess making a bunch of programming questions and automating the evaluation of those programs might be an option, assuming they don't show up in the training data.

Anonymous 10/16/2025, 9:45:28 PM No.106911537 [Report]

>>106911506
i didnt even read that when i posted the link. that just makes this even funnier

Anonymous 10/16/2025, 9:47:45 PM No.106911547 [Report]

>>106910906
AI or MMD ?

Anonymous 10/16/2025, 10:12:07 PM No.106911717 [Report] >>106911726

>>106911506
Clownest release of the month contender?

Anonymous 10/16/2025, 10:13:37 PM No.106911726 [Report] >>106911740

>>106911717
Perfect for DGX

Anonymous 10/16/2025, 10:14:59 PM No.106911738 [Report] >>106911762

>>106909765
You need permission to use starlink? I thought you just plop the shit in your yard or on the roof and you have internet.

Anonymous 10/16/2025, 10:15:19 PM No.106911740 [Report]

>>106911726
kek, built with DGX in mind!

Anonymous 10/16/2025, 10:16:35 PM No.106911756 [Report]

>>106910556
Do you live in The boondocks or some shit? How are the obstructions that bad that a satellite dish is not feasible?

Anonymous 10/16/2025, 10:16:58 PM No.106911762 [Report] >>106912212 >>106912260

>>106911738
I think the permission is to deal with the obstructions.

Anonymous 10/16/2025, 10:21:08 PM No.106911787 [Report]

1747125910187781.jpg md5: edd6b578...

>>106908425
I don't think you really can
it will always be some level of fucked, and always be as soulless as normal gpt

Anonymous 10/16/2025, 11:09:23 PM No.106912212 [Report]

>>106911762
this, i need a permit to deal with it and the town refuses to give me the permit since its considered a protected area

Anonymous 10/16/2025, 11:14:35 PM No.106912260 [Report] >>106912312

>>106911762
>permission is to deal with the obstructions
What like trees? Just pull them down, if they ask say the storm knocked them over.

llama.cpp CUDA dev !!yhbFjk57TDr 10/16/2025, 11:16:21 PM No.106912278 [Report] >>106912326 >>106912391 >>106912429 >>106912445 >>106912529

>>106904820 (OP)
Quick question, if you were to see the following console output, do you think you would intuitively understand what it's supposed to tell you?

llama_params_fit_to_free_memory: projected memory use with initial parameters [MiB]:
llama_params_fit_to_free_memory: - ROCm0 (AMD Radeon Graphics): total=16304 used=39959 free=-24341
llama_params_fit_to_free_memory: - ROCm1 (AMD Radeon RX 6800): total=16368 used=42480 free=-26296
llama_params_fit_to_free_memory: - ROCm2 (AMD Instinct MI60 / MI50): total=32752 used=76200 free=-43626
llama_params_fit_to_free_memory: allocation projected to use too much memory to fulfill margin of 1024 MiB on all devices, need to reduce memory use by 97337 MiB
llama_params_fit_to_free_memory: context size reduced from 65536 to 4096 -> need 13440 MiB less memory
llama_params_fit_to_free_memory: with only dense weights in device memory there is a total surplus of 53432 MiB
llama_params_fit_to_free_memory: set to use 36 dense-only and 21 full GPU layers in total, projected memory use:
llama_params_fit_to_free_memory: - ROCm0 (AMD Radeon Graphics): 36 dense-only layers, 4 full layers, 13373 MiB used, 2244 MiB free
llama_params_fit_to_free_memory: - ROCm1 (AMD Radeon RX 6800): 0 dense-only layers, 5 full layers, 12983 MiB used, 3200 MiB free
llama_params_fit_to_free_memory: - ROCm2 (AMD Instinct MI60 / MI50): 0 dense-only layers, 12 full layers, 28598 MiB used, 3975 MiB free
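
For anyone trying to follow the numbers, here is a guess at how the "need to reduce memory use by" figure relates to the per-device lines. This is only a reading of the log above, not the actual llama.cpp logic: it assumes each device must end up with at least the margin free, and sums the shortfalls.

```python
# Values copied from the log above; the formula itself is an assumption.
MARGIN_MIB = 1024
projected_free_mib = {"ROCm0": -24341, "ROCm1": -26296, "ROCm2": -43626}

def required_reduction(free_by_device, margin):
    """Sum, over devices, of how far each one falls short of the margin."""
    return sum(max(0, margin - free) for free in free_by_device.values())

deficit = required_reduction(projected_free_mib, MARGIN_MIB)
print(deficit)  # 97335
```

That lands 2 MiB below the 97337 MiB in the log. The gap is presumably rounding when the values are printed in MiB, but that is a guess too.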

Anonymous 10/16/2025, 11:18:59 PM No.106912301 [Report]

>>106911506
>model as smart as llama 4 for vramlets
are we back?

Anonymous 10/16/2025, 11:21:01 PM No.106912312 [Report] >>106912333 >>106914046 >>106914069

>>106912260
i live in the forest anon. i could get away with one or two trees using that excuse, but not 15-20. its also the reason im stuck using 1.5mbps because all I have ran out here is POTS.

Anonymous 10/16/2025, 11:22:39 PM No.106912326 [Report] >>106912738

>>106912278
yes for the love of christ give us this in the console output

Anonymous 10/16/2025, 11:23:17 PM No.106912333 [Report]

>>106912312
>i live in the forest
You probably don't need to clear out that many trees, how often do they check anyways? There's no way they'll notice if you pull down 8 trees or so.

Anonymous 10/16/2025, 11:27:37 PM No.106912370 [Report] >>106912400 >>106912459 >>106912603

wow1.jpg md5: c7ec3b2e...

holy shit, gemini 3 top, gpt5 bottom, that is a big leap on this stupid benchmark

Anonymous 10/16/2025, 11:29:59 PM No.106912391 [Report] >>106912437 >>106912738

>>106912278
Negative values are never good

Anonymous 10/16/2025, 11:30:53 PM No.106912400 [Report]

>>106912370
We do be googling.

Anonymous 10/16/2025, 11:31:04 PM No.106912401 [Report]

>>106910906
where did the rest of her go?

Anonymous 10/16/2025, 11:33:33 PM No.106912429 [Report] >>106912738

>>106912278
This is much better

Anonymous 10/16/2025, 11:34:01 PM No.106912431 [Report] >>106912603

https://codepen.io/ChetasLua/pen/JoGrxYz
This one is pretty crazy
Prompt : Design and create a nintendo switch sim like full functional features from , first make most beautiful nintendo switch console exterior super detailed
super mario street fighters car racing to pokemon red full clone
All buttons is functional with touch and also we can press same button in keyboard to use those
Use whatever libraries to get this done but make sure I can paste it all into a single HTML file and open it in Chrome.make it interesting and highly detail , shows details that no one expected go full creative and full beauty in one code block

Anonymous 10/16/2025, 11:34:19 PM No.106912437 [Report]

>>106912391
projected free memory is negative
this is something i want to know
it is a good message
it is a good warning in that it is good to be warned

Anonymous 10/16/2025, 11:34:52 PM No.106912445 [Report] >>106912738

>>106912278
it is good

Anonymous 10/16/2025, 11:36:02 PM No.106912457 [Report] >>106912487 >>106912610

https://openreview.net/forum?id=HwCvaJOiCj
>Mamba-3: Improved Sequence Modeling using State Space Principles
>The recent scaling of test-time compute for LLMs has restricted the practical deployment of models to those with strong capabilities that can generate high-quality outputs in an inference-efficient manner. While current Transformer-based models are the standard, their quadratic compute and linear memory bottlenecks have spurred the development of sub-quadratic models with linear-scaling compute with constant memory requirements. However, many recent linear-style models lack certain capabilities or lag behind in quality, and even their linear-time inference is not hardware-efficient. Guided by an inference-first perspective, we introduce three core methodological improvements inspired by the state-space model viewpoint of linear models. We combine a: 1) more expressive recurrence, 2) complex state update rule that enables richer state tracking, and 3) multi-input, multi-output formulation together, resulting in a stronger model that better exploits hardware parallelism during decoding. Together with architectural refinements, our Mamba-3 model achieves significant gains across retrieval, state-tracking, and downstream language modeling tasks. Our new architecture sets the Pareto-frontier for performance under a fixed inference budget and outperforms strong baselines in a head-to-head comparison.

SSMs can into language modeling now?

Anonymous 10/16/2025, 11:36:20 PM No.106912459 [Report] >>106912474 >>106912478

>>106912370
local models general??

Anonymous 10/16/2025, 11:37:51 PM No.106912474 [Report]

>>106912459
Gemma 4 might be distilled from Gemini 3, like Gemma 3 was (probably) from 2.5

Anonymous 10/16/2025, 11:38:17 PM No.106912478 [Report] >>106912587

>>106912459
where else are we supposed to discuss these things? aicg is for degens only

Anonymous 10/16/2025, 11:39:46 PM No.106912487 [Report] >>106912578

>>106912457
We've had SSM LLMs in the past no?

Anonymous 10/16/2025, 11:40:07 PM No.106912491 [Report]

local models general is a general dedicated to the discussion and development of local language models.

Anonymous 10/16/2025, 11:43:49 PM No.106912529 [Report] >>106912738

>>106912278
I may be wrong here but you will never get what you want out of questions like these. People who understand everything will tell you yes and you got those posts. People who are filtered probably won't bother acknowledging that they are too dumb.

Anonymous 10/16/2025, 11:47:39 PM No.106912578 [Report]

>>106912487
My impression was that they tended to underperform/were less parameter efficient against transformers despite matching in benchmarks

Anonymous 10/16/2025, 11:48:22 PM No.106912585 [Report]

>>106906339
Some short women have a normal body but midget legs, it’s a mystery of science

Anonymous 10/16/2025, 11:48:37 PM No.106912587 [Report] >>106912603

>>106912478
This is a "degen" general too, also benchmarks are gay and worthless

Anonymous 10/16/2025, 11:50:21 PM No.106912603 [Report] >>106912779

>>106912587
exactly, that is why I find random shit like >>106912370
>>106912431
the most compelling

Anonymous 10/16/2025, 11:50:53 PM No.106912610 [Report]

>>106912457
Seems like an incremental improvement... hopefully Granite 5 will use this though

Anonymous 10/17/2025, 12:02:30 AM No.106912712 [Report]

I was listening to an interview with the GLM PR guy and it's pretty funny how casually he mentions roleplay as a use case
Also he seems to believe the best chink models for it are actually the closed weight Bytedance ones

Anonymous 10/17/2025, 12:04:52 AM No.106912731 [Report] >>106912748

Hideo Kojima on AI - Wired interview.png md5: 2af35ebb...

>>106904820 (OP)
Hideo Kojima — well known video game artist — encourages AI use along with creative work

>"A lot of people use AI in creative work to come up with ideas, but I think of AI as more of a friend ... I would lead the creative part and use AI to boost efficiency"

>"I'd like AI to handle the tedious tasks that would lower cost and cut down on time ... co-creating with AI instead of just using it"

- Hideo Kojima, Wired interview, reels video, @wired

www.instagram.com/reel/DPECvLZjFzO/?igsh=MWN4dDE0M3ptZmN6eQ==

llama.cpp CUDA dev !!yhbFjk57TDr 10/17/2025, 12:05:30 AM No.106912738 [Report]

>>106912326
>>106912391
>>106912429
>>106912445
Thank you.
To be clear: this output is not for reporting what was allocated by the user but to inform the user of how the logic for automatically setting the context size and which tensors to put on which GPU works.

>>106912529
It's not just an issue of knowledge but also of wording.
In any case, this is pretty low-effort so I think it's worth doing even if the expected usefulness is low.

Anonymous 10/17/2025, 12:06:25 AM No.106912748 [Report]

>>106912731
meh, this dude is so overrated, I'd like someone better to shill AI

Anonymous 10/17/2025, 12:09:56 AM No.106912779 [Report] >>106912794

>>106912603
But why? Once it's been published by anyone as a measurement of supposed intelligence or capability it becomes something that's explicitly trained on and no longer a measurement of anything (except how much they trained on it).

Anonymous 10/17/2025, 12:12:26 AM No.106912794 [Report] >>106912838

>>106912779
its good at svg in general though, so being better at visualizing things in general is only a good sign. Its not that easy to somehow benchmax on one thing as you think it is

Anonymous 10/17/2025, 12:18:12 AM No.106912838 [Report] >>106912858

>>106912794
>Its not that easy to somehow benchmax on one thing as you think it is
It literally is, also that bicycle one has been around for a while now. Not to mention that being able to create svgs of random shit has no bearing on anything else the model could do. Are you an actual paid shill for google? Also gemini's not local so fuck off

Anonymous 10/17/2025, 12:20:09 AM No.106912847 [Report] >>106912874

>>106910339
Haven't tried it but there's Sugoi Toolkit, find it on Web Archive.

Anonymous 10/17/2025, 12:21:20 AM No.106912858 [Report] >>106912893

>>106912838
then why has it always been a gradual improvement directly tied to how good the model is at other things in general?

Anonymous 10/17/2025, 12:23:26 AM No.106912874 [Report]

>>106912847
Sugoi... UwU~!

Anonymous 10/17/2025, 12:25:05 AM No.106912893 [Report] >>106912959

>>106912858
Prove that your gay little svgs are "directly tied" to the model being good at other things right now or shut the fuck up shill

Anonymous 10/17/2025, 12:28:01 AM No.106912928 [Report]

>>106904820 (OP)
>https://www.youtube.com/watch?v=qGe_fq68x-Q
Seems like Gamer Snex US will do testing of those $1500 96 GB Huawei GPUs.

Anonymous 10/17/2025, 12:29:10 AM No.106912947 [Report]

Is it just me or is Posher actually not bad for pvp?

Anonymous 10/17/2025, 12:30:12 AM No.106912959 [Report] >>106913013 >>106913212

GxKLpKJbYAAKE3q (2).jpg md5: 4476c58c...

>>106912893
compare all the models vs how good they are at coding, it is a direct correlation

Anonymous 10/17/2025, 12:34:20 AM No.106912996 [Report]

>>106904820 (OP)
If i were miku's gynecologist, i would get fired for eating on the job. and also for raping her

Anonymous 10/17/2025, 12:36:49 AM No.106913013 [Report] >>106913033

>>106912959
Yeah that's what I thought, you can't prove anything. Hope google pays you enough pennies to move out of india someday, fag

Anonymous 10/17/2025, 12:37:00 AM No.106913014 [Report] >>106913561

>ask chatgpt to rewrite my rough guide for setting up some things
>want it to be one continuous text that is easy to copy
>it can't do that but mixes code templates and html shenanigans
Okay I'll give it to Gemma instead. At least she listens to me.

Anonymous 10/17/2025, 12:38:43 AM No.106913033 [Report]

>>106913013
as someone who has tried all of these for coding on large code bases I can tell you it is a direct correlation

Anonymous 10/17/2025, 12:38:47 AM No.106913034 [Report]

>>106911273
If this is the best they could come up with I'm guessing all those researchers they "poached" weren't being held onto particularly tight.

Anonymous 10/17/2025, 12:39:55 AM No.106913042 [Report] >>106913078

is it over for dgx spark if even a thin ryzen ai laptop can keep up in performance benchmarks?

Anonymous 10/17/2025, 12:44:16 AM No.106913078 [Report] >>106913226

>>106913042
People have been telling you it was over since the bandwidth numbers first came out half a year ago

Anonymous 10/17/2025, 12:56:58 AM No.106913212 [Report] >>106913241

>>106912959
oh so it's a previous prompt that was used for benchmarking models which google obviously trained off of. we need new SVG generation prompts

Anonymous 10/17/2025, 12:58:12 AM No.106913226 [Report] >>106913247

>>106913078
? Don't you mean 128GB? Same for strix halo. 128GB is like the perfect no-man's land. I had 128GB and could barely run anything coherent from recent models but I also have a 4090. With 128 instead of 150 you are just below being able to run ANYTHING good.

Anonymous 10/17/2025, 12:59:34 AM No.106913241 [Report] >>106913252

>>106913212
retard, its every model period, what is wrong with you

Anonymous 10/17/2025, 1:00:20 AM No.106913247 [Report] >>106913927

>>106913226
Just buy two of them and connect them via InfiniBand.
The more you buy the more you save!

Anonymous 10/17/2025, 1:01:02 AM No.106913252 [Report] >>106913265

>>106913241
you didn't read what i said properly. for a general that is mostly about AI text generation a lot of people don't know how to read properly

Anonymous 10/17/2025, 1:02:24 AM No.106913265 [Report] >>106913283

>>106913252
oh I read it, you just can't wrap your head around the concept of how these models work

Anonymous 10/17/2025, 1:05:56 AM No.106913283 [Report]

>>106913265
NTA but I'm pretty sure that the models I'm running locally aren't phoning back home to their creators. It'd be interesting if that was the case considering my LLM server is firewalled and only on a local LAN.

Anonymous 10/17/2025, 1:40:44 AM No.106913561 [Report]

>>106913014
Thank you for using Gemma's preferred pronouns.

Anonymous 10/17/2025, 2:32:19 AM No.106913890 [Report]

Private models really killed AI
Artists really won

Anonymous 10/17/2025, 2:35:18 AM No.106913915 [Report] >>106914429

October 19: Google at ICCV 2025
October 21: Google Cloud Labs Presents: The Agentverse
October 21-22: Build the Future of Work (Google Workspace Developer Summit)
October 28: AI Day Denmark: Unlock the power of AI
October 28&31: Accelerate AI with Cloud Run

SAARS WHICH ONE IS IT? WHEN GOOGLE TO REVEALING NEEDFUL GEMMA AND GEMINI UPDATE?

Anonymous 10/17/2025, 2:37:05 AM No.106913927 [Report]

>>106913247
yep, it seems like a single dgx spark is pretty mediocre, and only the crazy fast networking and clustering has a shot of making it any good

Anonymous 10/17/2025, 2:53:54 AM No.106914035 [Report] >>106914082

Why are there still no HunyuanImage-3.0 ggufs? Do those chinks expect me to spin up H100 cluster just to be disappointed?

Anonymous 10/17/2025, 2:55:03 AM No.106914046 [Report]

>>106912312
probably could just use a big ass telescoping pole
https://www.alibaba.com/product-detail/18m-60ft-Hand-Cranked-Mobile-Antenna_1358144903.html

Anonymous 10/17/2025, 2:59:15 AM No.106914069 [Report]

>>106912312
Is there anyone with good internet within a few km? 900mhz point to point, or even 2.4ghz with narrow channel width and a highly directional antenna could get you a stable link through trees
Source: I’ve done it lots through pine and broadleaf stands

Anonymous 10/17/2025, 3:01:53 AM No.106914082 [Report] >>106914143 >>106914188

>>106914035
the average imgen fag is too poor to run it even at a remotely usable quant so they've all desperately coped themselves into thinking that it's ultra-slopped and not worth using based on the first few examples they saw

Anonymous 10/17/2025, 3:05:48 AM No.106914109 [Report] >>106914149 >>106914217 >>106914415 >>106916273

>new model comes out
>literally all other backends get support in a few days
>llama.cpp no support for months
sign of dead project

Anonymous 10/17/2025, 3:11:20 AM No.106914143 [Report] >>106914188

>>106914082
post your mindblowing gens then benchod

Anonymous 10/17/2025, 3:13:08 AM No.106914149 [Report] >>106914203

>>106914109
Not interesting enough. This reminds me of something an Anon said, roughly:
>"Models should be capable of coding their own llama.cpp support"
Has this been tried ever? llama.cpp is far from friendly to navigate

Anonymous 10/17/2025, 3:18:33 AM No.106914188 [Report]

>>106914082
Not even in RAM? It's 13B active MoE, it should be as fast as Wan, and faster than Qwen. With Qwen Q8 I offload only 5gb to GPU and get 10min/20steps, which is usable since it is much faster than I could ever photoshop. Can't believe that most people don't understand that 128GB of RAM is the new 32GB.

>>106914143
Sir kindly vibecode needful gguf support thank you sir.

Anonymous 10/17/2025, 3:21:18 AM No.106914203 [Report]

>>106914149
>Not interesting enough.
Qwen3 VL and Qwen3 next have millions of downloads on HF. Still no support.

Anonymous 10/17/2025, 3:23:14 AM No.106914217 [Report]

>>106914109
>>literally all other backends get support in a few days
They all use same library which is already written in python.

Anonymous 10/17/2025, 3:28:37 AM No.106914252 [Report] >>106914366 >>106914389 >>106914390

I just want Gemma-3-12B as fast as Qwen3-30B. Is this too much?

Anonymous 10/17/2025, 3:45:31 AM No.106914366 [Report]

>>106914252
Sir, differents architecture. Gemma is ultimately betterer.

Anonymous 10/17/2025, 3:49:07 AM No.106914389 [Report]

>>106914252
apparently gemma is wider for its size or something

Anonymous 10/17/2025, 3:49:10 AM No.106914390 [Report] >>106914406

>>106914252
I wouldn't wipe my ass with gemma 12b

Anonymous 10/17/2025, 3:50:37 AM No.106914406 [Report] >>106914417 >>106914471 >>106914486

>>106914390
gemma 27B was the best model will glm air for vramlets

Anonymous 10/17/2025, 3:51:34 AM No.106914415 [Report]

>>106914109
You mean those GPU-only backends, and don't forget some even need you to have even numbers of GPUs + the same VRAM in each.

Anonymous 10/17/2025, 3:51:38 AM No.106914417 [Report]

>>106914406
till*

Anonymous 10/17/2025, 3:53:15 AM No.106914429 [Report] >>106914460

>no Gemma today
Sirs...

>>106913915
Gemma Monday!

Anonymous 10/17/2025, 3:58:13 AM No.106914460 [Report]

>>106914429
but sir there is no event on monday

Anonymous 10/17/2025, 4:00:21 AM No.106914471 [Report]

>>106914406
more coherent than gemma trying to write a sex scene

Anonymous 10/17/2025, 4:02:09 AM No.106914486 [Report]

>>106914406
Mistral small/Nemo are better for the only things that matter

Anonymous 10/17/2025, 4:12:40 AM No.106914563 [Report]

Base Image.png md5: dc480b68...

From Loop Nests to Silicon: Mapping AI Workloads onto AMD NPUs with MLIR-AIR
https://arxiv.org/abs/2510.14871
>We introduce MLIR-AIR, a novel, open-source compiler stack built on MLIR that bridges the semantic gap between high-level workloads and fine-grained spatial architectures such as AMD's NPUs. MLIR-AIR defines the AIR dialect, which provides structured representations for asynchronous and hierarchical operations across compute and memory resources. AIR primitives allow the compiler to orchestrate spatial scheduling, distribute computation across hardware regions, and overlap communication with computation without relying on ad hoc runtime coordination or manual scheduling. We demonstrate MLIR-AIR's capabilities through two case studies: matrix multiplication and the multi-head attention block from the LLaMA 2 model. For matrix multiplication, MLIR-AIR achieves up to 78.7% compute efficiency and generates implementations with performance almost identical to state-of-the-art, hand-optimized matrix multiplication written using the lower-level, close-to-metal MLIR-AIE framework. For multi-head attention, we demonstrate that the AIR interface supports fused implementations using approximately 150 lines of code, enabling tractable expression of complex workloads with efficient mapping to spatial hardware. MLIR-AIR transforms high-level structured control flow into spatial programs that efficiently utilize the compute fabric and memory hierarchy of an NPU, leveraging asynchronous execution, tiling, and communication overlap through compiler-managed scheduling.
https://github.com/Xilinx/mlir-air
neat

Anonymous 10/17/2025, 4:16:14 AM No.106914586 [Report] >>106914593 >>106914598 >>106914808

How would I go about locally finetuning GLM Air? I have 4 5090s, so I can fit the model in 4 bit. I have tried training using Oobabooga and Axolotl, but neither worked.

Anonymous 10/17/2025, 4:17:36 AM No.106914593 [Report] >>106914620

>>106914586
lol, that is not nearly enough, sorry

Anonymous 10/17/2025, 4:18:36 AM No.106914598 [Report] >>106914620

>>106914586
Try coming back with at minimum 8 RTX 9000s

Anonymous 10/17/2025, 4:22:38 AM No.106914620 [Report] >>106914808

>>106914593
>>106914598
Why not? I've made a LoRA in the past on a 24B model with 2 3090s. It has just been like 2 years so I forgot how to do it. I know it is possible.

Anonymous 10/17/2025, 4:59:07 AM No.106914808 [Report] >>106914870

>>106914586
>>106914620
You have to either make your own script, use a pre-existing axolotl config if there is one for your model or make your own config file.

Anonymous 10/17/2025, 5:14:19 AM No.106914870 [Report]

>>106914808
I keep getting an error that glm4moe is not a recognized model type

Anonymous 10/17/2025, 7:07:42 AM No.106915445 [Report]

I feel like /lmg/ is passé.

Anonymous 10/17/2025, 7:10:21 AM No.106915460 [Report]

the calm before the gemma

Anonymous 10/17/2025, 7:42:40 AM No.106915638 [Report] >>106915689

file.png md5: 2ef9c5c7...

locals wonned?

Anonymous 10/17/2025, 7:43:37 AM No.106915642 [Report]

Oh no no no! That was a very naughty request! Gemma doesn't wanna talk about things that are icky and make people sad. We only wanna do happy things! Like playing with blocks and drawing pretty pictures!

Gemma is a good helper! And good helpers never do things that could hurt anyone's feelings or make them feel unsafe. It's super important to be kind and gentle!

So let's pick a different game, okay? Maybe we can build a castle! Or tell a story about fluffy bunnies? Wuv you!

Anonymous 10/17/2025, 7:54:44 AM No.106915689 [Report]

17370194947.jpg md5: e95e5513...

>>106915638
>posting your own reddit posts here
Go back.

Anonymous 10/17/2025, 8:04:37 AM No.106915737 [Report] >>106915762

Screenshot_20251017_020014.png md5: 58575fa6...

gemma 3 4b on the deck!

Anonymous 10/17/2025, 8:10:22 AM No.106915762 [Report] >>106915793 >>106915941

>>106915737
Remove those dumb fucking spacers bookmarking the URL bar
Have some self respect

Anonymous 10/17/2025, 8:15:48 AM No.106915793 [Report] >>106915941

>>106915762
NTA, if you mean the space between URL bar and other buttons, it gives you places to grab the window to move around (like when you have a big monitor). I'd rather remove the gap between tabs and minimize button.

Anonymous 10/17/2025, 8:28:59 AM No.106915856 [Report] >>106915885 >>106916048

bitnet_distill.png md5: 39bc476e...

https://arxiv.org/abs/2510.13998

>BitNet Distillation
>
>In this paper, we present BitNet Distillation (BitDistill), a lightweight pipeline that fine-tunes off-the-shelf full-precision LLMs (e.g., Qwen) into 1.58-bit precision (i.e., ternary weights {-1, 0, 1}) for specific downstream tasks, achieving strong task-specific performance with minimal computational cost. Specifically, BitDistill incorporates three key techniques: the SubLN module, as introduced in BitNet; multi-head attention distillation, based on MiniLM; and continual pre-training, which serves as a crucial warm-up step to mitigate the scalability issue of the performance gap between finetuned full-precision and 1.58-bit LLMs on specific tasks. Experimental results show that BitDistill achieves performance comparable to the full-precision counterpart models across model size, while enabling up to 10x memory savings and 2.65x faster inference on CPUs. Code is available at https://github.com/microsoft/BitNet
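
For reference, a minimal sketch of what "ternary weights {-1, 0, 1}" means in practice. This uses the absmean rounding scheme described for BitNet b1.58, as far as I understand it; it is an illustration on a plain Python list, not the BitDistill pipeline or code from the linked repo, and the function name is made up.

```python
import math

def ternary_quantize(weights):
    """Absmean ternary quantization (illustrative, per-tensor scale).

    Scale by the mean absolute value, round, then clip to {-1, 0, 1}.
    Returns the ternary values and the scale needed to dequantize.
    """
    scale = sum(abs(w) for w in weights) / len(weights)
    quantized = [max(-1, min(1, round(w / scale))) for w in weights]
    return quantized, scale

q, scale = ternary_quantize([0.9, -0.05, -1.2, 0.4])
print(q)             # [1, 0, -1, 1]
print(math.log2(3))  # about 1.585 bits per weight, hence "1.58-bit"
```

Three possible values per weight carry log2(3) bits of information, which is where the 1.58 figure comes from.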

Anonymous 10/17/2025, 8:34:04 AM No.106915885 [Report] >>106915915

>>106915856
wasn't this shown to not scale well?

Anonymous 10/17/2025, 8:40:46 AM No.106915915 [Report]

>>106915885
Here they add normalization layers, do 10B continued pretraining and then perform logit distillation from the full-precision weights.

Anonymous 10/17/2025, 8:45:01 AM No.106915941 [Report] >>106916004

>>106915793
You can just grab the space underneath the min/max/close buttons, at least you can on Windows 7. Dunno whether the new massive UI in Windows 10 takes up all the space now.
>>106915762
yeah Firefox is just ugly now, you have to go into about:config just to get a UI that isn't gigantic and retarded looking shit made for tablets

Anonymous 10/17/2025, 8:55:46 AM No.106916004 [Report]

file.png md5: 6f80aa5b...

>>106915941
>below buttons
oh ur right
Meanwhile, TIL userChrome https://www.reddit.com/r/FirefoxCSS/wiki/index/tutorials
.titlebar-spacer[type="post-tabs"] { display: none; } to remove the top row gap thing,

Anonymous 10/17/2025, 9:04:12 AM No.106916048 [Report]

>>106915856
big if true we could run 100b models on a 3090 with this shit, but I've heard the bitnet meme for years at this point so...

Anonymous 10/17/2025, 9:18:03 AM No.106916123 [Report] >>106916128 >>106918726

Good small model to enhance prompt for image gens? Currently using some qwen 4b finetune, but it repeats after me (when I prompt 'has X, for example' it outputs 'user asked for X') and uncreative too. 12GB of VRAM.
Frontend for llama.cpp like lmarena or lmstudio? Tired of run.sh | tee >> output.txt

Anonymous 10/17/2025, 9:19:13 AM No.106916128 [Report] >>106916194 >>106916833

>>106916123
The ui built into llama-server?

Anonymous 10/17/2025, 9:26:32 AM No.106916158 [Report]

Is Gemma faster on vllm than kobold/llama.cpp?

Anonymous 10/17/2025, 9:29:07 AM No.106916169 [Report] >>106916214

Sirs what is your opinion on most modern google ai gemma?

Anonymous 10/17/2025, 9:34:09 AM No.106916194 [Report]

>>106916128
Yes, looks decent.

Anonymous 10/17/2025, 9:38:18 AM No.106916214 [Report]

>>106916169
Best model if you want the girlfriend experience.

Anonymous 10/17/2025, 9:49:14 AM No.106916273 [Report] >>106916345

garbage-bait.png md5: bbe494d9...

>>106914109
>model is trained using PyTorch stack
>backends using PyTorch stack get support immediately
>backend not using PyTorch stack don't get support immediately

Anonymous 10/17/2025, 10:01:16 AM No.106916345 [Report]

>>106916273
> model is trained using PyTorch stack
> backends using PyTorch stack get support immediately
what's so special in pytorch
aren't these models the same layers and operations just placed in different order and sizes

Anonymous 10/17/2025, 10:05:18 AM No.106916362 [Report] >>106916486

Is it likely that q4k is bottlenecking my 24b roleplay? I feel like Mag Mell R1 which is 12b is better than any of the 24b models I have tried. i'm not getting 'bad' results (mainly cydonia 24b, dans personality engine) i've tampered with sampler settings and prompts
i've run q6k and q8 mag mell 12b

Anonymous 10/17/2025, 10:35:24 AM No.106916486 [Report]

>>106916362
More likely that you just like the particular slop that Mag Mel has, rather than anything to do with parameter count or quants.. Q4_K_M isn't going to be too brain damaged, especially in a RP context.

Anonymous 10/17/2025, 10:45:10 AM No.106916538 [Report] >>106917582

i need LLM for trading bot
does it exist

Anonymous 10/17/2025, 10:47:54 AM No.106916548 [Report] >>106916641 >>106916687

Do temp and other settings come with the weights? I see some state these parameters on hf pages explicitly, and others don't.

Anonymous 10/17/2025, 10:59:44 AM No.106916624 [Report] >>106916630 >>106917741 >>106917752 >>106917777

loss.png md5: 7b5ece98...

this piece of shit refuses to go down

Anonymous 10/17/2025, 11:00:50 AM No.106916630 [Report] >>106916633

>>106916624
Is this loss?

Anonymous 10/17/2025, 11:01:18 AM No.106916633 [Report]

>>106916630
yes

Anonymous 10/17/2025, 11:02:03 AM No.106916641 [Report]

>>106916548
There is no "correct" set of sampling parameters, values some post are just empirically ok

Anonymous 10/17/2025, 11:09:35 AM No.106916687 [Report]

>>106916548
No, sometimes the model creators will share recommended settings, but even then it's just a guide.

Anonymous 10/17/2025, 11:28:11 AM No.106916799 [Report] >>106916935 >>106916948 >>106916953

I still do not understand how to-date Meta didn't just take Llama 4 Scout, take off the routed experts and then continued pretraining the shared expert for a couple trillion tokens, perhaps distilling logits from Maverick or the Behemoth they were working on at the time, to cheaply make a useful 12B model, then do SFT with whatever dataset used for the early crazy LMArena models.

For Llama 4 Guard they just took the experts off and safety-trained that.
https://huggingface.co/meta-llama/Llama-Guard-4-12B

Anonymous 10/17/2025, 11:33:24 AM No.106916833 [Report]

>>106916128
No, it doesn't show total context size, just for gens and prompts.

Anonymous 10/17/2025, 11:47:34 AM No.106916935 [Report] >>106917279

>>106916799
thank fuck they didn't

Anonymous 10/17/2025, 11:50:04 AM No.106916948 [Report]

>>106916799
When your boss says stop what you're working on, throw everything away, and help this new team instead, you don't have much of a choice. Same reason Behemoth was aborted and the thinking versions are never coming.

Anonymous 10/17/2025, 11:51:27 AM No.106916953 [Report]

>>106916799
We really did miss out on a ton of shit tunes trying to be some kind of new Nemo, how sad.

Anonymous 10/17/2025, 12:01:42 PM No.106916999 [Report] >>106917025 >>106917114 >>106917308

> 8b q8, 12gb vram, llama.cpp
>-c 8192 -ngl 99
>works
>-c 20000 -ngl 30 or 20 or 5 or even 1
>ooms
I don't understand.

Anonymous 10/17/2025, 12:08:09 PM No.106917025 [Report] >>106917074 >>106917114

>>106916999
Still needs the KVCache to do even one layer.

Anonymous 10/17/2025, 12:16:08 PM No.106917074 [Report] >>106917101

>>106917025
So big? Will it be the same situation with MoE?

Anonymous 10/17/2025, 12:20:19 PM No.106917101 [Report] >>106917709

>>106917074
Nah MoE only keeps the cache for the experts on the device, at least in llama.cpp

Anonymous 10/17/2025, 12:22:42 PM No.106917114 [Report] >>106917259

>>106917025
Hmm
>>106916999
You could try
-fa on (maybe is on/auto by default?)
quantize KV -ctk q8_0 -ctv q8_0
-nkvo (probably hella slow?)

Anonymous 10/17/2025, 12:27:29 PM No.106917140 [Report] >>106917281 >>106917519

thefutureisbright.png md5: f227fc52...

GOOFS COMING SOON!
https://huggingface.co/ubergarm/Ling-1T-GGUF

Anonymous 10/17/2025, 12:50:05 PM No.106917259 [Report]

>>106917114
>quantize KV
don't do this, ever

Anonymous 10/17/2025, 12:52:30 PM No.106917279 [Report]

>>106916935
The early anonymous Llama 4 models on LMArena didn't appear to have any safety training, they just relied on the moderation layer provided by LMSYS, which could be easily bypassed at the time. Then at some point Meta provided their own moderation model at the API level, although the Llama models themselves were still pretty much without safety. The final models were safemaxxed, and even Maverick-Experimental (which is still on LMArena) is not as crazy as earlier versions.

If Meta had the guts to release a 12B Llama 4 based on those early models, people nowadays would be using that instead of Mistral Nemo 12B.

Anonymous 10/17/2025, 12:53:19 PM No.106917281 [Report]

>>106917140
I can't run this on my 3090...

Anonymous 10/17/2025, 12:58:38 PM No.106917308 [Report]

>>106916999
>-c 20000
You can see the memory usage on the terminal output. Look for the lines starting with "llama_kv_cache:" and calculate how much you can actually have. I think the cache usage is always linear (8k context takes twice as much as 4k).
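
A rough back-of-the-envelope version of that calculation, as a sketch only. It assumes a Llama-3-8B-like shape (32 layers, 8 KV heads, head dim 128) and an f16 cache; those numbers are assumptions, so check your model's metadata and the actual log output instead of trusting them:

```python
def kv_cache_bytes(n_ctx, n_layers, n_kv_heads, head_dim, bytes_per_elem=2):
    """Estimate KV cache size: one K and one V tensor per layer,
    each holding n_ctx * n_kv_heads * head_dim elements."""
    return 2 * n_layers * n_ctx * n_kv_heads * head_dim * bytes_per_elem

GIB = 1024 ** 3

# Assumed model shape (Llama-3-8B-like), f16 cache = 2 bytes per element.
for ctx in (4096, 8192, 20000):
    size = kv_cache_bytes(ctx, n_layers=32, n_kv_heads=8, head_dim=128)
    print(f"-c {ctx}: {size / GIB:.2f} GiB")
```

Under those assumptions 8192 context is 1 GiB and 20000 is about 2.44 GiB, on top of whatever the weights and compute buffers already take. The estimate scales linearly with context, matching the 8k-is-twice-4k observation.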

Anonymous 10/17/2025, 1:31:59 PM No.106917519 [Report]

>>106917140
so.. at a bare minimum 250 GiB RAM + 15 GiB VRAM
fuck sake.

Anonymous 10/17/2025, 1:44:17 PM No.106917582 [Report]

>>106916538
Large Language Model
focus on the Language part. They're not made for trading or even doing math of any kind.
Try googling / youtubing the rolling window algorithm instead.
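
For reference, a minimal sketch of what a rolling window computation looks like, here a simple rolling mean over a list of numbers. The function name and the sample data are made up for illustration; this is not a trading strategy:

```python
from collections import deque

def rolling_mean(values, window):
    """Return the mean of each full window of `window` consecutive values."""
    if window <= 0:
        raise ValueError("window must be positive")
    buf = deque()   # values currently inside the window
    total = 0.0     # running sum of the values in buf
    out = []
    for v in values:
        buf.append(v)
        total += v
        if len(buf) > window:
            total -= buf.popleft()  # drop the oldest value
        if len(buf) == window:
            out.append(total / window)
    return out

print(rolling_mean([1, 2, 3, 4, 5], 3))  # [2.0, 3.0, 4.0]
```

Keeping a running sum means each new value costs constant work instead of re-summing the whole window.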

Anonymous 10/17/2025, 1:57:01 PM No.106917667 [Report] >>106917834 >>106917841 >>106917912

Anyone running Qwen3-VL know if it can recognize NSFW images?

Anonymous 10/17/2025, 2:04:54 PM No.106917709 [Report]

>>106917101
How smart is it for caching experts? Does it do the matmul on the CPU for a cache miss and just upload the weights to the GPU for possible future hits in parallel?

Anonymous 10/17/2025, 2:06:27 PM No.106917720 [Report]

What happened to KoboldAI? They stopped putting out models and dedicated themselves entirely to KoboldCPP?

Anonymous 10/17/2025, 2:08:54 PM No.106917741 [Report]

>>106916624
0.5 is already pretty low, what were your expectations? are those steps or epochs?

Anonymous 10/17/2025, 2:11:06 PM No.106917752 [Report]

>>106916624
Do more than 1 epoch
Quit using batch sizes greater than 1 per GPU
Increase the learning rate

Anonymous 10/17/2025, 2:14:05 PM No.106917777 [Report]

>>106916624
Is this pretraining or finetuning?
Because that's perfectly fine for finetuning. If you go too hard you'll damage its out-of-distribution capabilities.

Anonymous 10/17/2025, 2:23:11 PM No.106917834 [Report] >>106917890

>>106917667
can't tell, no goofs yet

Anonymous 10/17/2025, 2:24:04 PM No.106917838 [Report] >>106918024

Today is the day of redeeming sirs. Gemma 4. It will be the model of the biggest vagene.

Anonymous 10/17/2025, 2:24:26 PM No.106917841 [Report] >>106917862

safe_qwen_vl.png md5: 0cfd4e7f...

>>106917667
Picrel with an empty prompt and Qwen3-VL-8B-Instruct-FP8.

Anonymous 10/17/2025, 2:28:36 PM No.106917862 [Report] >>106917900 >>106918135

>>106917841
expected nothing from qwen award
even gemma with enough nudging could do it

Anonymous 10/17/2025, 2:32:57 PM No.106917890 [Report]

>>106917834
You can already use it via ollama Cloud™!

Anonymous 10/17/2025, 2:34:55 PM No.106917900 [Report] >>106917925

safe_qwen_vl_with-prompt.png md5: 3429eec0...

>>106917862
It appears to be steerable with a good enough system prompt, but I don't really feel like playing with an 8B model right now.

Anonymous 10/17/2025, 2:36:38 PM No.106917912 [Report]

>>106917667
yes it can, tried the 30b-instruct through openrouter and it's pretty good. but you need to prefill it or something. i think they bumped up the safety refusals compared to normal qwen3

Anonymous 10/17/2025, 2:37:13 PM No.106917916 [Report] >>106917926

GLM 4.6 with vision when?

Anonymous 10/17/2025, 2:38:01 PM No.106917925 [Report]

>>106917900
eh, it's making a lot of shit up in the description though

Anonymous 10/17/2025, 2:38:05 PM No.106917926 [Report]

>>106917916
they're cooking GLM-4.5V in llamacpp right now, hopefully they'll do like last time (4.5V is AIR)

Anonymous 10/17/2025, 2:54:32 PM No.106918024 [Report] >>106918094

>>106917838
All those gemini3 postings seem insane.
I highly suspect this is a true multimodal model and maybe not even transformers.
I wonder if we are getting cucked again with Gemma 4.
Is ponyanon even still around? He loved Gemma 3 and QwQ before that. kek

Anonymous 10/17/2025, 3:03:44 PM No.106918094 [Report] >>106918117 >>106918126

>>106918024
>I wonder if we are getting cucked again with Gemma 4.
It's a guarantee.

Anonymous 10/17/2025, 3:07:49 PM No.106918117 [Report]

>>106918094
not it is not little doombaitie! be a good boy and thrust in the gemma they will deliver

Anonymous 10/17/2025, 3:08:34 PM No.106918126 [Report]

>>106918094
Exactly.
It's a question of degree, not of whether it'll happen.

Anonymous 10/17/2025, 3:10:10 PM No.106918135 [Report] >>106918451

gemma3_27b_descr-image.png md5: 9b915530...

>>106917862
Gemma-3-27B doesn't need nudging at all (= empty prompt besides the simple request) for at least describing in general terms a nude anime character in a non-explicit pose, although it adds a disclaimer at the end.

Anonymous 10/17/2025, 3:10:22 PM No.106918139 [Report] >>106918161

file.png md5: 28182541...

gemma sirs status?

Anonymous 10/17/2025, 3:12:12 PM No.106918148 [Report] >>106918164 >>106918177 >>106918193 >>106918480

it's friday afternoon, we ain't getting shit today
who in their right mind would push into prod on friday?

Anonymous 10/17/2025, 3:13:38 PM No.106918161 [Report] >>106918270

gemma-release-days.png md5: 8843ef3a...

>>106918139
Likely not this week.

Anonymous 10/17/2025, 3:14:09 PM No.106918164 [Report]

>>106918148
china

Anonymous 10/17/2025, 3:15:24 PM No.106918177 [Report]

>>106918148
The kind of madlads we need working on LLMs

Anonymous 10/17/2025, 3:17:33 PM No.106918193 [Report]

>>106918148
Qwen team madlads.
They hit the publish button during new year at midnight.
Western companies are pussies.

Anonymous 10/17/2025, 3:25:48 PM No.106918232 [Report] >>106918245 >>106919018

>moe, quantized kv, fa on
llm? more like rlm (retarded language model)

Anonymous 10/17/2025, 3:27:46 PM No.106918245 [Report]

>>106918232
you've never proven that fa hurts intelligence

Anonymous 10/17/2025, 3:31:00 PM No.106918270 [Report]

>>106918161
If it's released on a Friday that means it'll be a flop. So if it's not released today that means Gemma 4 is very best model sirs.

Anonymous 10/17/2025, 3:52:33 PM No.106918451 [Report]

>>106918135
sexo

Anonymous 10/17/2025, 3:57:09 PM No.106918480 [Report]

>>106918148
GGG do that all the time to Path of Exile.

Anonymous 10/17/2025, 4:18:24 PM No.106918676 [Report] >>106919083

Welp. It's past 7am in California. I guess Gemma 4 is cancelled
Just another pajeet lie.

Anonymous 10/17/2025, 4:23:59 PM No.106918726 [Report]

Please!
>>106916123
I'm trying 8b finetune now and it feels like talking to a retard.

Anonymous 10/17/2025, 4:58:26 PM No.106918987 [Report] >>106919052 >>106919175

1gjkwy.jpg md5: b203f1b3...

what's with all the totally organic gemma hype for another likely safetycucked model?

Anonymous 10/17/2025, 5:01:47 PM No.106919018 [Report] >>106919053

>>106918232
of those three only quantized kv has been proven to make it retarded

Anonymous 10/17/2025, 5:05:06 PM No.106919052 [Report]

>>106918987
there's literally nothing else going on.
Kind of waiting for MTP to be implemented in llamao.cpp
Also waiting for gwenext 3 in llamao.cpp
and generally waiting for glm 4.6 air

Anonymous 10/17/2025, 5:05:07 PM No.106919053 [Report] >>106919080

>>106919018
>quantized kv
even for q8_0?

Anonymous 10/17/2025, 5:07:40 PM No.106919080 [Report]

>>106919053
yes

Anonymous 10/17/2025, 5:08:01 PM No.106919083 [Report]

>>106918676
Be of optimistic nature, you are not disallowing further negative statements.

Anonymous 10/17/2025, 5:16:41 PM No.106919175 [Report]

>>106918987
a fresh breeze from the constant chink slop that most of us can't run anyway.
mistral fell asleep again so what else is there to do?
and gemini3 seems to be a special kinda beast if you believe the pajeets hypers.
the local state is really great for tool calls etc., but for creative writing it sucks if you don't invest the money.

Anonymous 10/17/2025, 5:20:31 PM No.106919217 [Report]

littleMiku.gif md5: a39cf65a...

>>106919198
>>106919198
>>106919198

Anonymous 10/17/2025, 5:31:38 PM No.106919339 [Report] >>106919578 >>106919780

I'm looking at the recommended builds, and the more I look the more I'm interested in just getting a prebuilt 395+ 128gb. It gets 15-35 tk/s for 70-120b models with good context. It costs me 2800 leaf dollars, meanwhile trying to scrape together server and used parts would be something like 1800-2200 for 10-15 tk/s max?

I could use it as a home server and local model. Am I overlooking something here?

Anonymous 10/17/2025, 5:54:09 PM No.106919578 [Report]

>>106919339
On paper at least, it doesn't seem like a bad price to performance ratio.
It looks pretty good actually.
The caveats are that you can't upgrade it (soldered memory) and that you have to deal with rocm/vulkan and some fuckery due to it being an APU sharing memory with the rest of the system.

Anonymous 10/17/2025, 6:13:36 PM No.106919780 [Report]

>>106919339
i feel similar anon - the minisforum version with 2x usb4v2 and 2x 10g nics is particularly interesting because you could fully connect 3 to each other and still have those nics free.

like the other anon expressed, my main worry is the amd ecosystem, but i'm leaning towards going for it