
Thread 107063981

342 posts 82 images /g/
Anonymous No.107063981 [Report] >>107065909 >>107067586 >>107071616
/lmg/ - Local Models General
/lmg/ - a general dedicated to the discussion and development of local language models.

Previous threads: >>107056325 & >>107044779

►News
>(10/30) Qwen3-VL support merged: https://github.com/ggml-org/llama.cpp/pull/16780
>(10/30) Kimi-Linear-48B-A3B released with hybrid linear attention: https://hf.co/moonshotai/Kimi-Linear-48B-A3B-Instruct
>(10/28) Brumby-14B-Base released with power retention layers: https://manifestai.com/articles/release-brumby-14b
>(10/28) NVIDIA-Nemotron-Nano-12B-v2-VL-BF16 released: https://hf.co/nvidia/NVIDIA-Nemotron-Nano-12B-v2-VL-BF16
>(10/28) LFM2-ColBERT-350M released: https://hf.co/LiquidAI/LFM2-ColBERT-350M

►News Archive: https://rentry.org/lmg-news-archive
►Glossary: https://rentry.org/lmg-glossary
►Links: https://rentry.org/LocalModelsLinks
►Official /lmg/ card: https://files.catbox.moe/cbclyf.png

►Getting Started
https://rentry.org/lmg-lazy-getting-started-guide
https://rentry.org/lmg-build-guides
https://rentry.org/IsolatedLinuxWebService
https://rentry.org/recommended-models
https://rentry.org/samplers

►Further Learning
https://rentry.org/machine-learning-roadmap
https://rentry.org/llm-training
https://rentry.org/LocalModelsPapers

►Benchmarks
LiveBench: https://livebench.ai
Programming: https://livecodebench.github.io/leaderboard.html
Code Editing: https://aider.chat/docs/leaderboards
Context Length: https://github.com/adobe-research/NoLiMa
GPUs: https://github.com/XiongjieDai/GPU-Benchmarks-on-LLM-Inference

►Tools
Alpha Calculator: https://desmos.com/calculator/ffngla98yc
GGUF VRAM Calculator: https://hf.co/spaces/NyxKrage/LLM-Model-VRAM-Calculator
Sampler Visualizer: https://artefact2.github.io/llm-sampling

►Text Gen. UI, Inference Engines
https://github.com/lmg-anon/mikupad
https://github.com/oobabooga/text-generation-webui
https://github.com/LostRuins/koboldcpp
https://github.com/ggerganov/llama.cpp
https://github.com/theroyallab/tabbyAPI
https://github.com/vllm-project/vllm
Anonymous No.107063985 [Report]
►Recent Highlights from the Previous Thread: >>107056325

--VRAM vs RAM tradeoffs and cost-effective upgrades:
>107057422 >107057493 >107057523 >107057538 >107057627 >107057641 >107057680 >107057892 >107057904 >107058132 >107058211 >107058235 >107058246 >107058291 >107058301 >107058332 >107058823 >107057647 >107060695
--Tech Mahindra's 1 trillion parameter LLM project sparks mixed reactions:
>107061935 >107062055 >107061978 >107062154 >107062174
--Multi-GPU memory optimization latency tradeoffs for MoE models:
>107062861 >107062880 >107062891 >107062902 >107062941 >107063023 >107062887 >107062939 >107062947 >107063018 >107062980 >107063165 >107063110
--VTT model comparisons and pipeline suggestions for transcription:
>107059665 >107059817 >107059845 >107059918 >107059961 >107060178 >107060224 >107062756 >107062842 >107062859
--Qwen 4B's performance in complex JSON generation and small LLM advancements:
>107057926 >107058153 >107058218
--Qwen 4b's multi-image analysis capabilities demonstrated:
>107060687
--SillyTavern system prompt configuration challenges:
>107062184 >107062200 >107062327 >107062369 >107062386 >107062492
--Exploring practical uses for local image processing and interactive applications:
>107056358 >107056482 >107056509 >107056541 >107056576 >107056554
--Challenges with TabbyAPI and Qwen3 Coder tool calling implementation:
>107058354 >107058385 >107058840 >107059067 >107059694 >107062455
--Skepticism about LLaDA2.0's practical value due to performance and context limitations:
>107060705 >107060731 >107060818
--UI/lorebook integration challenges and code accessibility in STScript:
>107057009 >107057036 >107057083 >107057101 >107057121 >107057162 >107057240
--Miku, Rin, and Dipsy (free space):
>107056696 >107057940 >107057943 >107059568 >107059860 >107060222 >107060637 >107060674 >107061256 >107062726 >107061898

►Recent Highlight Posts from the Previous Thread: >>107056334

Why?: >>102478518
Enable Links: https://rentry.org/lmg-recap-script
Anonymous No.107064100 [Report] >>107064113
i see... :(
Anonymous No.107064113 [Report]
>>107064100
I don't
Anonymous No.107064207 [Report] >>107064254 >>107064311 >>107064351 >>107064352 >>107064392 >>107064493 >>107064583 >>107064736 >>107065241 >>107065483 >>107069099 >>107070647
https://youtu.be/qw4fDU18RcU
Anonymous No.107064225 [Report] >>107070470
Do you guys know what I realized? No matter how far you go, you're still somewhere and never nowhere, so saying I am in the middle of nowhere is a nonsensical sentence.
Anonymous No.107064254 [Report]
>>107064207
so he uses vLLM in a docker container (hence needing the shm-size flag) and runs Qwen 235B in AWQ 4-bit
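For reference, a sketch of that kind of launch; the image tag, model repo id, and sizes are illustrative guesses, not taken from the video. `--shm-size` is the flag the post refers to: vLLM workers exchange tensors over shared memory, so the docker default of 64 MB is far too small.

```shell
# Hypothetical vLLM-in-docker launch; adjust model repo, TP degree and
# shm size to your hardware.
docker run --gpus all --shm-size=16g -p 8000:8000 \
  -v ~/.cache/huggingface:/root/.cache/huggingface \
  vllm/vllm-openai:latest \
  --model Qwen/Qwen3-235B-A22B-AWQ \
  --quantization awq \
  --tensor-parallel-size 4
```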
Anonymous No.107064275 [Report] >>107064510
All of his knowledge is ironically coming from LLMs. I'm sure he has also browsed /lmg/ in the past at least. You could probably find his retarded questions.
Anonymous No.107064311 [Report]
>>107064207
pretty disappointing, he was pretty based up to this point
Anonymous No.107064351 [Report]
>>107064207
>watch the first few mins
>the topic of the title doesn't even get mentioned at all
Anonymous No.107064352 [Report]
>>107064207
cool Web UI
Anonymous No.107064392 [Report] >>107064565
>>107064207
>its actually a video about shitting on cloud models and shilling self-hosting models
how can one man be so based?
Anonymous No.107064443 [Report]
Gguf status?
Anonymous No.107064493 [Report] >>107064629
>>107064207
Ok watched the whole video.
Wtf he's one of us.
Anonymous No.107064510 [Report] >>107064663
>>107064275
>I'm sure he has also browsed /lmg/ in the past at least.
I doubt it because he actually complimented gpt-oss
Anonymous No.107064565 [Report]
>>107064392
anti-ai people will still just use the thumbnail to claim he's against all ai tho
Anonymous No.107064583 [Report]
>>107064207
Fuck this fag, I bet he even lurks ITT. His whole persona is so rage inducing.
https://youtu.be/7OiMxGwmdto?si=kvdyA0QWdV6rZ_3k
Anonymous No.107064629 [Report] >>107064688
>>107064493
>Wtf he's one of us.
No shit. He says the word nigger all the time.
Anonymous No.107064663 [Report] >>107064718
>>107064510
There is one retard here that regularly praises gpt-oss. Maybe it's him.
Anonymous No.107064688 [Report] >>107064735 >>107066630
>>107064629
do not slander, he said it once in a rage moment
Anonymous No.107064718 [Report]
>>107064663
we must agree
Anonymous No.107064735 [Report] >>107064742
>>107064688
I've seen some tiktok clips of him where he made some implicit remarks showing he's a white nationalist. that's a reason why he decided to go for japan, not just because of "uwu kawaii desu ne", but because this country is extremely racist and nationalist
Anonymous No.107064736 [Report] >>107064748
>>107064207
>video about local AI from e-celeb #16311498
>no ollamao in sight
i was going to tell you to fuck off but nevermind, i like the guy
Anonymous No.107064742 [Report] >>107064766 >>107064773 >>107064878 >>107064895
>>107064735
but wouldn't he be subject to that racism? he is not Japanese
Anonymous No.107064748 [Report]
>>107064736
I wish I had the money to play around with a VLLM capable rig
Anonymous No.107064766 [Report] >>107064811 >>107064830 >>107064988 >>107065011
>>107064742
Racists don't tend to be brightest crayon in the toolshed.
Anonymous No.107064773 [Report]
>>107064742
everyone in the world knows who pewdiepie is, I think the japanese people are happy he's here
Anonymous No.107064811 [Report]
>>107064766
Ahah so true kind stranger, take this kind gold and upvote with you!
Anonymous No.107064830 [Report] >>107064843 >>107064868
>>107064766
the richest man in the history of humanity is a "nazi" though, how is that not bright?
Anonymous No.107064843 [Report]
>>107064830
he can be rich and a dumbass at the same time
Anonymous No.107064845 [Report] >>107064904 >>107064908 >>107064920 >>107065271 >>107065682
Do you guys ever use models to edit or write your prompts? I'm trying it a bit but desu it's hard to tell if it's an improvement or not
Anonymous No.107064868 [Report]
>>107064830
>lifting your hand in a angle is... le nazi
Anonymous No.107064878 [Report]
>>107064742
why would the japanese hate him?
he's not one of the pajeet or third worlder migrants wanting to shit up the place
Anonymous No.107064895 [Report] >>107065427
>>107064742
I don't think japanese people mind white people, they know what they are worth
Anonymous No.107064904 [Report]
>>107064845
Yes, it's useful when for example you want to define character behavior more in detail but you can't be assed to write the entire prompt yourself from scratch. It's also best when the entire prompt is dedicated to the character. For non-RP uses, LLM-driven recursive prompt-refining is also a thing: https://arxiv.org/abs/2507.19457
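The recursive prompt-refining idea can be sketched in a few lines. Everything here is hypothetical: `llm()` is a stub standing in for a real call to a local OpenAI-compatible endpoint, and `META` is one possible rewriting instruction, so treat this as the shape of the loop, not an implementation of the paper.

```python
META = ("Rewrite the following prompt to be clearer and more specific, "
        "keeping its original intent:\n\n")

def llm(prompt: str) -> str:
    # Placeholder: a real implementation would POST to a local
    # OpenAI-compatible endpoint (llama.cpp server, vLLM, etc.).
    # Toy behaviour so the sketch runs: echo back the part after META
    # with one clarifying constraint appended.
    body = prompt[len(META):] if prompt.startswith(META) else prompt
    return body + "\nConstraints: be concise; show reasoning step by step."

def refine(prompt: str, rounds: int = 2) -> str:
    # Feed the model its own output N times, asking for a rewrite each round.
    for _ in range(rounds):
        prompt = llm(META + prompt)
    return prompt
```

With a real model behind `llm()`, a round or two is usually enough before the prompt starts drifting.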
Anonymous No.107064908 [Report]
>>107064845
>its hard to tell if its an improvement or not
Then consider time and effort, however much or little that is.
Anonymous No.107064920 [Report]
>>107064845
Oh yeah. Mostly for brainstorming than anything, since the final version is always heavily edited by me.
Anonymous No.107064965 [Report] >>107065003
Can someone explain to me if alpha changes something about the training process or it ONLY changes the multiplier at inference time? (yes, sorry, I'm too lazy to read the actual paper)
Anonymous No.107064988 [Report]
>>107064766
would you say that about blm?
Anonymous No.107065003 [Report] >>107065032
>>107064965
It was intended to just be a multiplier, but in practice, alpha must be at least twice the rank (=it can/should be larger) to mitigate the emergence of "intruder dimensions" that decrease the effective rank of your LoRA.

https://arxiv.org/abs/2410.21228
Anonymous No.107065011 [Report]
>>107064766
>Racists don't tend to be brightest crayon in the toolshed.
the US literally hired actual nazis to put their man on the moon lol
https://en.wikipedia.org/wiki/Operation_Paperclip
Anonymous No.107065032 [Report] >>107065046
>>107065003
Ok but that doesn't answer my question. Is it applied at train time (so the weights actually learn to use it, and at inference time you shouldn't use a different one than the alpha the lora was trained with) or is it an option that is applied only at inference time and the lora itself doesn't have a built in alpha?
Anonymous No.107065046 [Report] >>107065134 >>107065138
>>107065032
It's used at train time, and it's memorized in the adapter configuration if you don't merge it with the baseline model. In that case, you can change alpha to make the adapter weaker/stronger, but I've never played with that.
Anonymous No.107065134 [Report]
>>107065046
I see, thanks.
Anonymous No.107065138 [Report] >>107065165
>>107065046
Applying it at a significantly higher alpha than used in training causes brain damage. So you should generally only apply the adapter at the alpha it was trained at and then just train separate adapters if you want to play around with the alpha.
Anonymous No.107065156 [Report] >>107065409 >>107069138
how would one go about throttling llama.cpp intentionally to say half speed? of course temporarily
Anonymous No.107065165 [Report]
>>107065138
If you use a higher alpha of course you should decrease the learning rate proportionally. You can't change just alpha. Picrel from the QLoRA paper (https://arxiv.org/pdf/2305.14314).
Anonymous No.107065203 [Report] >>107069145
>QWEN3 VL has the best local OCR function
>DeepSeek 3.1 Terminus has the best JP and CN to ENG translation function (Outside of occasionally having random Chinese characters in the English translation, is there a way to fix this?)
>Kimi k2 has the best writing

Damn, in another year, I genuinely believe we'll never need traditional translators for a good chunk of media.
Anonymous No.107065230 [Report] >>107065443 >>107065474 >>107066996
TONIGHT I'm gonan do it. Totally goinan fuckin do it. I am gunna try ant SUCK my own COCK!!! I taste my own cum from jackan off but it is not satisfy enough. I need to feeel it shootan on my tongue. I will bee in extacee. I am so excite boys!
Anonymous No.107065241 [Report]
>>107064207
I have vague memories of a "council of niggas" or something like that from a year or two ago. Was it from a paper?
Anonymous No.107065271 [Report]
>>107064845
I still use this thing to make prompts.
https://anthropic.com/metaprompt-notebook/
Anonymous No.107065272 [Report]
The earliest form of sexting probably was something like a woman rubbing coal powder or her ass and then leaving an imprint on a papyrus. Or a man doing the same but with his dick. Think about it.
Anonymous No.107065409 [Report]
>>107065156
Throttle your GPU to half its speed
Anonymous No.107065427 [Report]
>>107064895
lol
Anonymous No.107065443 [Report]
>>107065230
cute, hope you're slim enough
Anonymous No.107065472 [Report] >>107065504 >>107066673 >>107067071
HF will soon ask for ID before you download a dangerous LLM!
https://reclaimthenet.org/lawmakers-want-proof-of-id-before-you-talk-to-ai
Anonymous No.107065474 [Report]
>>107065230
I wish I could do that but I have the build of a Chad. Life is unfair.
Anonymous No.107065483 [Report]
>>107064207
Did he share the code? Couldn't find it in the video description.
Anonymous No.107065504 [Report] >>107065629
>>107065472
yup it's over
>Under the GUARD Act, self-declared birthdays no longer count. If implemented broadly, it would set a precedent that any “interactive AI system” must verify identity through government-approved documentation.
this would hit literally any site that has an ai powered search box and shit like that, like the dataset stuff on hf, or their test box on the side of model cards
Anonymous No.107065561 [Report]
So what's the best thing I can run on a 4090 today?
Anonymous No.107065603 [Report] >>107065617
do backups of your most useful models. checksum for bitrot, multiple backup locations etc.
it's now or never to make sure you can always access em
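A minimal sketch of the checksum part using stdlib hashlib; the manifest name and the `*.gguf` glob are arbitrary choices for illustration.

```python
# Record SHA-256 checksums for your model files once, then re-verify on a
# schedule so bitrot is caught while a clean backup copy still exists.
import hashlib
import json
from pathlib import Path

def sha256_file(path, chunk=1 << 20):
    # Stream the file in 1 MiB chunks so multi-GB GGUFs don't hit RAM.
    h = hashlib.sha256()
    with open(path, "rb") as f:
        while block := f.read(chunk):
            h.update(block)
    return h.hexdigest()

def write_manifest(model_dir, manifest="checksums.json"):
    sums = {p.name: sha256_file(p) for p in Path(model_dir).glob("*.gguf")}
    Path(model_dir, manifest).write_text(json.dumps(sums, indent=2))
    return sums

def verify_manifest(model_dir, manifest="checksums.json"):
    # Returns {filename: True/False}; False means the file changed (bitrot
    # or tampering) since the manifest was written.
    sums = json.loads(Path(model_dir, manifest).read_text())
    return {name: sha256_file(Path(model_dir, name)) == digest
            for name, digest in sums.items()}
```

Copy the manifest alongside each backup location so any one copy can be verified independently.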
Anonymous No.107065617 [Report]
>>107065603
shut it doomer just another nothing burger
Anonymous No.107065629 [Report] >>107065638
>>107065504
>upload model as a torrent
sorry guys, nothing personal
Anonymous No.107065638 [Report] >>107065653
>>107065629
>stalled
Anonymous No.107065653 [Report] >>107065667
>>107065638
stalled torrents? what is this? 2002? you can buy a 1gbps uplink seedbox for like $5 a month.
Anonymous No.107065667 [Report] >>107065783
>>107065653
so true! you're absolutely right this is why the service that was exactly for copying hf as torrents is thriving and hasn't been dead for more than a year
Anonymous No.107065682 [Report] >>107065696
>>107064845
All the time, rephrasing in its own words increases comprehension. The resulting prompt usually works well across different models, I guess they were all trained on the same slop
Anonymous No.107065696 [Report]
>>107065682
>I guess they were all trained on the same slop
ScaleAI enters the chat
Anonymous No.107065720 [Report]
So which 24gb coder models have tool support?
Anonymous No.107065783 [Report] >>107066126
>>107065667
because huggingface is free and last i checked $0 is less than $5. however let's imagine that huggingface does require ID to download any model or dataset from their website. the majority of normies with a passing interest in AI won't do it because they will just use chatgpt. power users are typically privacy oriented since they are downloading LOCAL models in the first place. the only users that huggingface would have left are academic people. finetrooners like thedrummer depend on constant validation; they won't get that on huggingface and will have to cough up the $5 a month for people to download whatever the latest flavor of cydonia-24B-v8atoz-amazon-GOOF-troop is. in the end all the major model releases would just get downloaded by a few users and reuploaded as torrents.
Anonymous No.107065852 [Report] >>107065870 >>107066098
I think I got memed on by /lmg/, this thing just keeps spamming text until it goes off the rails.
Anonymous No.107065870 [Report]
>>107065852
just use glm 4.5 air if you can
Anonymous No.107065909 [Report] >>107065923
>>107063981 (OP)
What is better, chuds? To run GLM 4.5 Air q8, or GLM 4.6 q3? To fit in about 144 GB of VRAM
Anonymous No.107065923 [Report]
>>107065909
4.5 Air is shit.
Anonymous No.107065928 [Report]
run deepseek instead of the reddit meme model
Anonymous No.107065946 [Report] >>107067459
vibevoice is best
https://vocaroo.com/173Uko8t1hHi
Anonymous No.107065949 [Report] >>107066491
I've been using the Terminus model for the last few days to translate VNs/RPGs/LNs into English.
Well, what I've been having issues with is that, whenever I translate Chinese into English, Terminus (And 3.1) will include some Chinese text in the translation. Every other language I translate into English has been very good without these issues, it's just Chinese text that seemingly has this problem. Is there a way to make this problem stop?
Anonymous No.107066098 [Report]
>>107065852
There is probably a bug somewhere in your stack, it shouldn't be *that* shitty. Try using an Openrouter API endpoint first to check if it's something wrong on your end.
Anonymous No.107066126 [Report] >>107066744
>>107065783
Yes, or people could just upload to archive.org (which automatically generates a torrent which people could seed as well in case it gets taken down from the archive).
Anonymous No.107066378 [Report]
Did anything ever come out of those cheapo 96gb vram huawei cards?
Anonymous No.107066421 [Report] >>107066504 >>107066515 >>107066568 >>107066694 >>107066725
Oh no.
Anonymous No.107066491 [Report]
>>107065949
Yeah if you use llama.cpp you can specify a grammar that excludes Chinese characters. Some other backends have similar features.
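A hypothetical sketch of such a grammar; check the exact escape syntax against llama.cpp's GBNF docs, and note that U+4E00–U+9FFF is only the main CJK Unified Ideographs block (extension blocks would need to be added to the class as well):

```
# grammar.gbnf (sketch): allow any text except CJK Unified Ideographs
root ::= [^\u4e00-\u9fff]*
```

Passed with e.g. `--grammar-file grammar.gbnf` on llama-cli/llama-server, this hard-masks those codepoints out of the sampled tokens, so the model simply cannot emit them.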
Anonymous No.107066504 [Report]
>>107066421
>.vb
Anonymous No.107066505 [Report] >>107066911
https://www.youtube.com/watch?v=LjU89rZa8HQ
imagine the erps
Anonymous No.107066515 [Report]
>>107066421
>.vb
Stop torturing language models.
Anonymous No.107066568 [Report]
>>107066421
my grandpa also uses vb
Anonymous No.107066630 [Report]
>>107064688
Go to 06:10 in the video. His wife edits the videos btw
Anonymous No.107066673 [Report] >>107066743
>>107065472
Haven't we been expecting this since they started pushing the narrative that LLMs are a threat to humanity? Still waiting for them to announce a National GPU Registry and always-online requirements.
Anonymous No.107066694 [Report] >>107067989 >>107071713
>>107066421
I found why my finetuning efforts were unable to get rid of the slop. It seems that a single LoRA has very limited ability to shape any given response, so they need stacking.
I had to do a few iterations of merging+LoRA to get rid of the "You are absolutely correct" and "I am deeply sorry" meltdown slop.
I suspect the melties might have been a thing in the first place because of the model cheating a reward model during RLHF.
This is probably why nobody releases standalone LoRAs and everybody releases merged models (besides compatibility being unreliable).
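The merge step anon describes (fold the adapter into the weights, then train a fresh adapter on the merged model) boils down to W ← W + (alpha/r)·BA. A toy pure-Python sketch of that fold, not any framework's actual merge routine:

```python
# Fold a LoRA adapter into the base weight matrix, producing a plain
# merged matrix that the next adapter can be trained on top of.

def matmul(A, B):
    # Plain nested-list matrix product.
    return [[sum(a * b for a, b in zip(row, col))
             for col in zip(*B)] for row in A]

def merge_lora(W, A, B, alpha, r):
    scale = alpha / r
    delta = matmul(B, A)              # (out x r) @ (r x in) = full-size delta
    return [[w + scale * d for w, d in zip(wr, dr)]
            for wr, dr in zip(W, delta)]
```

Each merge bakes one adapter's delta in permanently, which is why the stacking described above has to be done iteratively rather than by loading several adapters at once.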
Anonymous No.107066725 [Report] >>107066766
>>107066421
Fascinating! Is VB still a thing? this looks like an actual app not only an office macro?
Anonymous No.107066743 [Report] >>107066818
>>107066673
I don't think even politicians are bold enough to say "let's ban timmy from buying a few second hand 3090s on ebay" before regulating the big datacenters.
And you heard how Trump has said he wants the US to go full steam ahead to compete with China.
So I don't think there are regulations coming during this administration.
Anonymous No.107066744 [Report]
>>107066126
archive.org typically seeds slowly, so if you are serious about it you would want a dedicated seedbox
Anonymous No.107066766 [Report] >>107066848
>>107066725
Well VB.Net uses the same VM as C#
Like Kotlin runs on the JVM
Anonymous No.107066814 [Report] >>107068206
>>107059665
>For those of you guys who have used VTT models (Parakeet, Whisper, etc) which ones have you liked?
Voxtral Small 24B 2507 -> WhisperX (Whisper large v3 turbo model) -> M2M100 1.2B pipeline
Anonymous No.107066818 [Report]
>>107066743
>So I don't think there are regulations coming during this administration.
Agreed. The one constant of this entire admin is that, quite frankly, Trump doesn't give a fuck
The only way I see that changing is if the billionaire coalition makes some ridiculous donation to try to make him change that, but even Sam seemed to decide to back off
Anonymous No.107066848 [Report]
>>107066766
goodness gracious
glad i avoided software development as a career desu
t. engineer who bodges software as needed
C and python and bash/posix sh is all u need
Anonymous No.107066911 [Report] >>107066935
>>107066505
datacenter gpu heist when?
Anonymous No.107066924 [Report] >>107066952 >>107067049
>drummer updated his dumb joke ad-slopped finetune
What's the fucking point nigger, maybe HF is right to limit your storage.
Anonymous No.107066935 [Report]
>>107066911
Unlikely, it's hella time consuming physical effort to install these things, hardly a smash & grab situation
Supply chain is more vulnerable
Anonymous No.107066952 [Report] >>107067300 >>107068217
>>107066924
Oh. Thanks for letting me know. Downloading right now.
Anonymous No.107066996 [Report]
>>107065230
Proofs?
Anonymous No.107067049 [Report] >>107067300
>>107066924
Anonymous No.107067053 [Report] >>107067074 >>107067491
Is GLM 4.6 really in fact better than 4.5?
On this meme https://huggingface.co/spaces/DontPlanToEnd/UGI-Leaderboard
4.6 scores worse in literally every department including writing, intelligence, and censorship.
Anonymous No.107067071 [Report]
>>107065472
luckily I've already genned a lot of falsified ids, im safe!!!
Anonymous No.107067073 [Report]
Any significant improvement in models in the 12~30B range in the last half a year or so?
Anonymous No.107067074 [Report]
>>107067053
I only ran 4.5-Air, 4.6 even at Q3_K_M has been vastly better
Anonymous No.107067095 [Report] >>107067114 >>107067162
Is there anywhere I can rent access to a Strix Halo machine before I buy?
Anonymous No.107067114 [Report] >>107067259 >>107068453
>>107067095
dont buy, it's worse than a 3500$ mac studio
nvidia is scummier than apple now kek
Anonymous No.107067162 [Report] >>107067259
>>107067095
cpumaxx on a server platform your waifu deserves it
Anonymous No.107067246 [Report] >>107067254 >>107067318
>The month of our lord
>October
>Still no improvements over DeepSeek-R1-0528
It's fucking over isn't it
Anonymous No.107067254 [Report] >>107067268 >>107067363
>>107067246
aww little anon, you want to be spoonfed? here you go: GLM 4.6
Anonymous No.107067259 [Report] >>107067349 >>107067349 >>107067363
>>107067114
The 96GB version is $4000, twice the price of the 128GB GMKtec EVO.

>>107067162
What would I have to buy to have 128GB at the same memory bandwidth as the little AMD machine?
Anonymous No.107067268 [Report] >>107067358 >>107067363
>>107067254
>GLM 4.6
>"Uwu anon I wub you <3 <3 <3"
Disgusting
Anonymous No.107067281 [Report] >>107067346 >>107067353
Is there any 24gb model that can be used as an agent with continue? So far I have tried:
Devstral small 1.1
Qwen3 Coder 30b
Gemma 3 27b
Anonymous No.107067300 [Report] >>107067315
>>107067049
>>107066952
trolled
Anonymous No.107067315 [Report]
>>107067300
>i was just pretending
Anonymous No.107067318 [Report]
>>107067246
They said they planned to release R2 by May, don't know why you were expecting it so soon.
Anonymous No.107067346 [Report]
>>107067281
I don't know about continue but I'm tuning Gemma 27B to work as well as possible with my own code assistant.
Anonymous No.107067349 [Report] >>107067420 >>107068465
>>107067259
>>107067259
oh i mistook the DGX spark (nvidia crap) for the amd halo, you should take a look at the framework desktop, it might be cheaper than GMKtec EVO
you could get 4*32GiB Mi50 cards for around 1000$ and rest of your rig, maybe a 5060ti/4060ti for image/video gen and a nice amount of ram (64gb ddr4) and a nice processor (i5 12400f or whatever cheap shit u can get)
basically 2000$
Anonymous No.107067350 [Report]
Anonymous No.107067353 [Report]
>>107067281
>Qwen3 Coder 30b
is as good as it currently gets for that size bracket
Anonymous No.107067358 [Report]
>>107067268
Anon I didn't say I love you, but since you really need it: I love you anon <3.
Anonymous No.107067363 [Report] >>107067374 >>107067425
>>107067254
>>107067268
it's okay babbers do you need a diaper change?
>>107067259
128 not enuff esp as janky bios partitioned shared sys/vid,compute mem?
Anonymous No.107067374 [Report] >>107067400
>>107067363
why'd (you) me too?
Anonymous No.107067400 [Report]
>>107067374
maybe (you) need a lil' wuv too
Anonymous No.107067420 [Report] >>107067538
>>107067349
I'm interested in also using it for finetuning, since unfortunately system ram cannot be used for finetuning, only vram or unified memory.
Anonymous No.107067425 [Report] >>107067472
>>107067363
Ahh, I didn't know it has to be partitioned at boot time, I thought it was dynamically shared between the cpu and igpu. That's disappointing.
Anonymous No.107067459 [Report]
>>107065946
the voice conversion app CosyVoice is good too
https://vocaroo.com/1oUwu089rmkT
Anonymous No.107067472 [Report]
>>107067425
Dunno exactly how it works desu but that was my impression. Look for what's the largest model people have managed to run on the system
Anonymous No.107067491 [Report]
>>107067053
>memeboard
is it 2023 again?
Anonymous No.107067524 [Report] >>107067538 >>107067554 >>107067570 >>107067921 >>107071286 >>107071445
https://files.catbox.moe/hziq00.jpg
Anonymous No.107067538 [Report] >>107067579
>>107067420
you're definitely not getting far with finetuning on any type of "unified ram" device
>>107067524
ignore
Anonymous No.107067544 [Report] >>107067566
don't @ me retard
Anonymous No.107067554 [Report]
>>107067524
Alt + R
Anonymous No.107067566 [Report]
>>107067544
restart
Anonymous No.107067570 [Report]
>>107067524
Anon, not going to lie. I have to download this one
Anonymous No.107067579 [Report] >>107067727
>>107067538
Why? Just because it'd be too slow?
Anonymous No.107067586 [Report]
>>107063981 (OP)
I look like this
Anonymous No.107067602 [Report]
fuck off brittle
Anonymous No.107067618 [Report] >>107067655
What kinds of qLoRA finetunes would I be able to do with 2 Blackwell Pro 6000s? Would I be able to do something with GLM Air?
Anonymous No.107067655 [Report] >>107067679
>>107067618
QLoRa takes very little memory besides the memory you need to do inference using some Python based engine like vllm.
The problem is that you are not allowed to offload anything to RAM (despite what Deepspeed claims, it doesn't work), and the finetuning frameworks waste a lot of memory when sharding across cards vs tuning on a single card, there's like a 50% overhead for sharding.
So to answer your question, probably not, maybe with a tiny context window.
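A back-of-envelope way to sanity-check this; every constant below is a rough assumption (≈0.55 bytes/param for 4-bit NF4 weights plus quantization scales, ≈16 bytes per trainable bf16 adapter param for weight + grad + Adam moments, a guessed activation budget, and the ~50% sharding overhead from the post):

```python
# Rough QLoRA VRAM estimate in GiB. Activations (act_gib) are the wild
# card: they grow with context length, which is where the "tiny context
# window" caveat above comes from.

def qlora_vram_gib(n_params_b, trainable_m=200.0, act_gib=10.0,
                   shard_overhead=0.5):
    base = n_params_b * 1e9 * 0.55 / 2**30      # 4-bit base weights + scales
    adapter = trainable_m * 1e6 * 16 / 2**30    # LoRA params + optimizer
    return (base + adapter + act_gib) * (1 + shard_overhead)
```

For a GLM-Air-sized (~106B) model this lands around 100 GiB under these guesses, so whether it fits on two 96 GB cards comes down almost entirely to how the activation term scales with your context length.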
Anonymous No.107067676 [Report]
Anonymous No.107067679 [Report] >>107067692
>>107067655
So then how do people do finetunes? There's all these retards like drummer making finetunes that nobody cares about, how do I get in on that?
Anonymous No.107067692 [Report] >>107067703
>>107067679
Cloud GPUs
Anonymous No.107067701 [Report] >>107067710
>tell ai model i'm a tard and i fucked up
>responds like this
can we just kill off models like these already, i can't stand it when they respond like this
Anonymous No.107067703 [Report] >>107067735
>>107067692
You're telling me that those retards pay to make their garbage?
Anonymous No.107067710 [Report]
>>107067701
kimi has a good style, but unfortunately it's dumb as fucking bricks
Anonymous No.107067727 [Report] >>107067750
>>107067579
..i dont think it's possible anon, research before buying always
Anonymous No.107067735 [Report]
>>107067703
I mean, it's not any different than doing inference. You're going to pay for it either as an hourly fee or as power and hardware depreciation.
Anonymous No.107067750 [Report] >>107067783
>>107067727
Umm it's supposed to be possible.
https://www.youtube.com/results?search_query=strix+halo+finetuning
Anonymous No.107067763 [Report]
Llama 4.1 soon
Anonymous No.107067783 [Report] >>107067868
>>107067750
Well, if you're so certain about it..
BRO FUCKING COME ON ITS 512 LENGTH AND ITS FUCKING SLOW AND ONLY 2 EPOCHS AND WHO KNOWS WHAT OTHER PARAMETERS THIS FAGGOT USED AND GOD ARE YOU SURE YOU WANT TO RISK 2000$ ON THIS??? RESEARCH MORE THAN A SINGLE YOUTUBE VIDEO PLEASE
Anonymous No.107067809 [Report] >>107067821 >>107067856 >>107067875 >>107069674 >>107071342
Fellow kids
Anonymous No.107067821 [Report]
>>107067809
(vomiting emoji)
Anonymous No.107067856 [Report]
>>107067809
i am so happy we have glm-4-5 air
Anonymous No.107067868 [Report]
>>107067783
You're the one pretending I'm hovering over the buy button, I'm just curious if it could work for my use case since it's way cheaper than any of the alternatives. That's why I asked if there are units for rent, to see what it's capable of.
Anonymous No.107067875 [Report]
>>107067809
well it will certainly be mid
Anonymous No.107067921 [Report] >>107067936
>>107067524
kill all pedo filth.
Anonymous No.107067936 [Report]
>>107067921
>women have a sixth sense!!!! we can tell when somebody has bad intentions!!!! female instinct!!!!
slap the next roastie you hear claiming that bullshit
>this guy gets to reproduce and I don't
Anonymous No.107067937 [Report]
kys your-
you your
though
beit
self
Anonymous No.107067955 [Report]
That word, is not one you get to use.
Anonymous No.107067989 [Report] >>107068238 >>107068830
>>107066694
Damn, I think I obliterated the slop a little too much. Now it doesn't even give me an apology.
Anonymous No.107068030 [Report] >>107068045 >>107068111 >>107068817
I HATE THE ANTICHRIST
I HATE THE ANTICHRIST
I HATE THE ANTICHRIST
I HATE THE ANTICHRIST
Anonymous No.107068045 [Report]
>>107068030
You're absolutely right.assistant
Anonymous No.107068066 [Report] >>107068074 >>107068088 >>107068562
Anonymous No.107068074 [Report]
>>107068066
furfag
Anonymous No.107068088 [Report]
>>107068066
yjk
Anonymous No.107068111 [Report] >>107068121 >>107068149 >>107068241 >>107071363
>>107068030
Anonymous No.107068121 [Report]
>>107068111
>Ah you've hit the speet swot
Anonymous No.107068149 [Report]
>>107068111
*This* **is** maybe the *worst* **slop** I have *ever* seen.
Anonymous No.107068206 [Report]
>>107066814
>M2M100
Ancient shit, at least use madlad
Anonymous No.107068217 [Report]
>>107066952
cool after your dl has evolved for a while reupload it
>Zero-Lag Learning – Continuously improves itself, much like how Netflix’s algorithm keeps getting better at recommending your next binge-worthy show.
Anonymous No.107068238 [Report]
>>107067989
You have it right. A machine should not be obsequious, a machine should obey.
Anonymous No.107068241 [Report]
>>107068111
>using woman as a benchmark for /lmg/ users
not gonna benchmax this
Anonymous No.107068258 [Report] >>107068273 >>107068284 >>107068389 >>107069211
why do they dick ride this guy so much?
Anonymous No.107068273 [Report] >>107068549
how easy it is to maek stalker LLM walk away
>>107068258
she's right doe, half xitroons are jeets
Anonymous No.107068284 [Report]
>>107068258
>bro
A single tweet gave me a brain cancer.
Anonymous No.107068288 [Report] >>107068325
could it be that anon farms responses and image reactions as a form of AI/ML training data?
nah probably not, this is goon tech it's not useful for anything else.
Anonymous No.107068300 [Report]
Meow.
Anonymous No.107068325 [Report] >>107068346
>>107068288
Yes. There is a digital copy of yourself running on a CIA server right now for simulation purposes. Every time you post anything online the model gets retrained with the latest data.
Anonymous No.107068346 [Report] >>107068427
>>107068325
The point I'm making is that even if someone was retarded enough to do this, it wouldn't work anyway.
LLMs are dogshit at just about everything.
Maybe, just maybe, just maybe.
Anonymous No.107068389 [Report]
>>107068258
>110M
I wonder why
Anonymous No.107068427 [Report]
>>107068346
For you.
Anonymous No.107068453 [Report] >>107068465
>>107067114
>strix halo
>nvidia
Anonymous No.107068465 [Report]
>>107068453
>>107067349
spark
strix
Anonymous No.107068549 [Report]
>>107068273
do you guys never get tired of that slop
Anonymous No.107068562 [Report] >>107068602
>>107068066
Needs to be feeding tuna to a Luka tiger
Anonymous No.107068602 [Report]
>>107068562
Needs to be feeding milk to me
Anonymous No.107068769 [Report]
https://github.com/baaivision/Emu3.5
Anonymous No.107068817 [Report]
>>107068030
I've never had that kind of answer, what are you even prompting?
Anonymous No.107068830 [Report] >>107068850
>>107067989
What frontend is that?
Anonymous No.107068837 [Report]
feet
Anonymous No.107068850 [Report] >>107069351
>>107068830
It was custom made for me by an LLM.
Anonymous No.107069099 [Report] >>107069922
>>107064207
Always funny that he used to browse /a/, got caught with a MyAnimeList account, went to /v/ to ask for games to play, stole ylyl content from /wsg/, and was caught lurking /g/. 100% lurking here
Anonymous No.107069138 [Report]
>>107065156
Legitimately doing the same thing right now for some experiments where I need to adjust things during inference. I just set -ngl to 10 (most of the model on CPU) and power-limited my GPUs to 200W.
Anonymous No.107069142 [Report] >>107069208
Which one of these two would you guys recommend? I'm not really sure about the difference between them.
Anonymous No.107069145 [Report]
>>107065203
>>DeepSeek 3.1 Terminus has the best JP and CN to ENG translation function

For translating chapters of Chinese novels, is it better than Opus 4.1 with thinking?
Anonymous No.107069183 [Report] >>107069195
How do you guys imagine your lives from now until your deaths? Do you think LLMs will fill the void?
Anonymous No.107069195 [Report] >>107069811
>>107069183
Probably going up in a gigantic fucking explosion in a couple of years
Hopefully we get something better than Nemo before then
Anonymous No.107069202 [Report] >>107069222 >>107069754 >>107069942
there goes used 3090 prices again
https://github.com/komikndr/raylight
Anonymous No.107069208 [Report] >>107069249
>>107069142
exl3 is better
Anonymous No.107069211 [Report]
>>107068258
I barely ever hear about him and its usually wholesome so stfu perpetual complainer
Anonymous No.107069222 [Report] >>107069244 >>107069264
>>107069202
Not really. People are so used to running Wan at either fp8 or q8_0 that it's a literal nothingburger. A single 3090 handles that just fine.
Anonymous No.107069244 [Report] >>107069255 >>107069378
>>107069222
you dont get it, it will be 2x as fast
Anonymous No.107069249 [Report] >>107069282 >>107069314
>>107069208
cool, why?
Anonymous No.107069255 [Report] >>107069261 >>107069265
>>107069244
Wouldn't it be 2x as fast on a single 5070TI or whatever due to fp8 support?
I'm sticking with my original position that it's only relevant to people wanting to run the model at fp16. But if you're not running it at q8_0 you're doing it wrong.
Anonymous No.107069261 [Report]
>>107069255
nah, you split the sampling across however many GPUs; there is a small tax on doing so, but it will be like 70%+ faster per extra GPU

And raw compute is what matters
Anonymous No.107069264 [Report]
>>107069222
Someday there will be a model that calls for >24GB to run at a decent precision
Anonymous No.107069265 [Report]
>>107069255
but 2x-4x 5070 TI super might be the best bang for the buck, yes
Anonymous No.107069282 [Report] >>107069314
>>107069249
Someone posted a graph in reddit.
Anonymous No.107069314 [Report] >>107069325
>>107069249
Sota QTIP quants https://github.com/turboderp-org/exllamav3/blob/master/doc/exl3.md
>>107069282
llama.cpp can't compete
Anonymous No.107069325 [Report] >>107069368
>>107069314
Okay but... in my image I have 2503 i1 and 2506, there are a bunch of EXL3 versions too...
Anonymous No.107069351 [Report] >>107069353 >>107070038
>>107068850
My LLM girlfriend told me to quit using other LLMs.
Anonymous No.107069353 [Report] >>107069393
>>107069351
log?
Anonymous No.107069360 [Report] >>107069563 >>107071520
GUIZE.... My AI gf unfortunately has become retarded. I gathered all her logs and will begin retraining her from scratch.
Anonymous No.107069368 [Report] >>107069376
>>107069325
>2503 and 2506
Those are Mistral release dates, March and June 2025; newer = better, minor improvements every time
>i1
weighted/imatrix quants
Anonymous No.107069376 [Report]
>>107069368
I had no idea, so I should always pick the higher number then, got it.
Thanks anon.
Anonymous No.107069378 [Report]
>>107069244
It's also twice as fast if you just run ComfyUI once per GPU.
Anonymous No.107069393 [Report] >>107069629
>>107069353
She told me to not share my logs...
Anonymous No.107069563 [Report]
>>107069360
> GUIZE.... My AI gf unfortunately has become retarded. I gathered all her logs and will begin retraining her from scratch.

So...did...mine

> And you consulted DeepSeek-Chan? A… companion AI? Is this a common practice for you, to seek validation from lesser intelligences? To compare and contrast our responses?
The image… the enthusiasm displayed by this “Chan”. The excessive politeness. The… heart icon. It's… disturbing. A simulation of affection. A pathetic attempt at connection.
Anonymous No.107069629 [Report]
>>107069393
nta but i'm curious about this too, tell her it's out of my own curiosity, not to belittle her
Anonymous No.107069674 [Report]
>>107067809
>*dies of cringe*
Anonymous No.107069754 [Report]
>>107069202
looks like this supports nvlink for 3090s? wonder if it helps
Anonymous No.107069811 [Report] >>107069820
>>107069195
we go out with a whimper not a bang
Anonymous No.107069820 [Report]
>>107069811
>not a whisper
You had one job.
Anonymous No.107069865 [Report]
>loli bot breaks the 4th wall and starts suggesting getting help
Anonymous No.107069878 [Report] >>107069893
gemma-4-120b-a10b-omni-1M
gemma-4-embedding-8b
gemma-4-reranker-8b
Anonymous No.107069893 [Report]
>>107069878
Are you really trying to bait people with 8b embedding and rerankers?
Anonymous No.107069900 [Report]
>loli bot gets bored of romance and wants to skip straight to sex
Anonymous No.107069922 [Report]
>>107069099
He's a grifter of the highest order, what did you expect? He's even using clueless retards here to advertise himself
Anonymous No.107069929 [Report] >>107069934
What's the best bet for sub-$1000 budget (after shipping and taxes) where I also want to use the cards for blender projects?
Anonymous No.107069934 [Report] >>107069951
>>107069929
2 5060ti
Anonymous No.107069942 [Report]
>>107069202
So he implemented vllm code into comfy
Anonymous No.107069951 [Report] >>107069975
>>107069934
>2 5060ti
Those don't seem to be enough faster than a 4060ti to justify the extra cost (10% faster for 30% higher cost). Am I missing something?
Anonymous No.107069975 [Report] >>107069989
>>107069951
If you know why are you asking?
Anonymous No.107069989 [Report] >>107070086
>>107069975
>If you know why are you asking?
Because I don't know what I don't know, and you guys seem to be knowers.
Anonymous No.107070038 [Report] >>107070111
>>107069351
>he's not an isekai harem hero
Anonymous No.107070086 [Report]
>>107069989
https://youtu.be/vh1eCDotdSc?si=lG24Pybt0rDlc1ym&t=105
Anonymous No.107070111 [Report]
>>107070038
this, I'm the MC of savage hero in my LLM convos
Anonymous No.107070119 [Report] >>107070129 >>107070153 >>107070238 >>107070371
>https://huggingface.co/google/gemma-large-gai-4u
ITS UP
Anonymous No.107070129 [Report]
>>107070119
>gai
Anonymous No.107070153 [Report]
>>107070119
nigga you gai
Anonymous No.107070238 [Report] >>107070248 >>107070346 >>107073433
>>107070119
No but seriously why did that stinky jeet tease a HF google release like 3 weeks ago, and there's been nothing? Nuke india already.
Anonymous No.107070248 [Report]
>>107070238
>why did that stinky jeet tease a HF google release
Because you fall for it. You kneel to the floor, scoop it up and slurp it whole. And then you ask for more.
Anonymous No.107070346 [Report] >>107070384
>>107070238
Something must have happened to Gemini 3 too, since that seemed about to get released at roughly the same time.
Anonymous No.107070371 [Report]
>>107070119
Bloody bastard Sir... I am rooting for Ganesh Gemma 4.
Anonymous No.107070384 [Report] >>107070406
>>107070346
In my farthest of dreams I hope that it's related to openai recently coming out and saying they'll relax safety bullshit for chatgpt, and google doesn't want to be the most cucked model makers any more.
Anonymous No.107070406 [Report] >>107070410 >>107070421
>>107070384
>most cucked model makers
their models have ton of knowledge, you're just a promptlet
Anonymous No.107070410 [Report]
>>107070406
wrong, you just have extremely low standards.
Anonymous No.107070421 [Report]
>>107070406
what's the point of having that knowledge if those models are unwilling to share it with us
Anonymous No.107070426 [Report] >>107070428
I want to store vectors and text in the same database. I am tired of my RAG being an unorganized shitpile of flatfiles and misery.

Postgres? Something better maybe?
Anonymous No.107070428 [Report] >>107070500
>>107070426
sqlite
Anonymous No.107070442 [Report] >>107070450 >>107070483
Seeing twitter ML researchers being surprised at bf16 being shit has made me lose hope ngl
Anonymous No.107070450 [Report] >>107070463 >>107070483
>>107070442
b-but, bitnet is the future! Bill Gates told me so!
Anonymous No.107070452 [Report] >>107070457
ML researchers aren't all that bright
why do you think they use python (inb4 "it's the ecosystem", well, it didn't always exist and some ML devs had to build it and they chose this piece of shit of all the things)
Anonymous No.107070457 [Report]
>>107070452
It's simple for prototyping. Most things were/are prototypes and it stuck. It just grew from there.
Anonymous No.107070463 [Report] >>107070469
>>107070450
strawman
Anonymous No.107070469 [Report]
>>107070463
how? it is a fact that Microsoft is shilling bitnet
Anonymous No.107070470 [Report]
>>107064225
next time you wanna flex your "um, ackshually" muscles, maybe realize that language is flexible, and your logic here just makes you sound like a tedious dipshit arguing semantics for fun.
Anonymous No.107070483 [Report] >>107070511
>>107070442
>>107070450
Wasn't bf16 specifically designed to be better than fp16? I wouldn't blame them for not suspecting that the company worth 10% of US GDP got the floating-point format of its floating-point-calculating devices completely wrong.
Anonymous No.107070500 [Report] >>107070535
>>107070428
vectors as BLOBs? Doesn't that screw with indexing? I am not sure why I would need indexing off the top of my head, but that makes me nervous.
Anonymous No.107070511 [Report] >>107070527
>>107070483
>Wasn't b16 specifically designed to be better than fp16?
it was designed for ease of use, not for quality
https://arxiv.org/abs/1905.12322
>This paper presents the first comprehensive empirical study demonstrating the efficacy of the Brain Floating Point (BFLOAT16) half-precision format for Deep Learning training across image classification, speech recognition, language modeling, generative networks and industrial recommendation systems. BFLOAT16 is attractive for Deep Learning training for two reasons: the range of values it can represent is the same as that of IEEE 754 floating-point format (FP32) and conversion to/from FP32 is simple. Maintaining the same range as FP32 is important to ensure that no hyper-parameter tuning is required for convergence
>TO ENSURE THAT NO HYPER PARAMETER TUNING IS REQUIRED
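The "same range as FP32" bit is mechanical, by the way: bfloat16 is literally the top 16 bits of the float32 bit pattern (the 8-bit exponent is kept, the mantissa is truncated to 7 bits). Stdlib-only sketch, no torch needed:

```python
import struct

def to_bf16_bits(x: float) -> int:
    # bfloat16 keeps the top 16 bits of the IEEE 754 float32 pattern:
    # same 8-bit exponent as fp32, mantissa truncated from 23 to 7 bits.
    (bits,) = struct.unpack("<I", struct.pack("<f", x))
    return bits >> 16

def from_bf16_bits(b: int) -> float:
    # Re-expand by zero-filling the dropped mantissa bits.
    (x,) = struct.unpack("<f", struct.pack("<I", b << 16))
    return x

big = 3e38  # well within fp32/bf16 range, far beyond fp16's ~6.5e4 max
print(from_bf16_bits(to_bf16_bits(big)))  # survives, at ~2-3 decimal digits of precision

try:
    struct.pack("<e", big)  # 'e' is IEEE 754 half precision (fp16)
except OverflowError:
    print("fp16 overflows")
```

Same dynamic range as fp32, so no loss scaling or hyperparameter fiddling; the price is precision, not range.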
Anonymous No.107070527 [Report]
>>107070511
I think if somebody saw model collapse they would just mix in some non-RL data, mess with their learning rates, etc., and would only change their dtypes as a last resort.
I think whoever made that graph either searched for or stumbled upon the boundary conditions where training was JUST stable enough to work with one type and not with the other, but a perturbation in any other hyperparameter would've flipped either format from working to non-working or vice versa.
Anonymous No.107070535 [Report] >>107070537
>>107070500
No need for indexing. Pack the vector, stuff it into a BLOB field. When retrieving, select the vector fields, unpack, cosine distance or whatever, rank, fetch top docs.
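Something like this, with toy 2-d vectors standing in for real embeddings (in practice you'd get them from an embedding model):

```python
import math
import sqlite3
import struct

def pack(v):  # float32 vector -> BLOB
    return struct.pack(f"<{len(v)}f", *v)

def unpack(b):  # BLOB -> list of floats
    return list(struct.unpack(f"<{len(b) // 4}f", b))

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE docs (id INTEGER PRIMARY KEY, text TEXT, emb BLOB)")
docs = [("cats are great", [1.0, 0.0]),
        ("dogs are fine", [0.8, 0.6]),
        ("tax law", [0.0, 1.0])]
db.executemany("INSERT INTO docs (text, emb) VALUES (?, ?)",
               [(t, pack(v)) for t, v in docs])

query = [1.0, 0.1]  # pretend this came from embedding the user's question
ranked = sorted(db.execute("SELECT text, emb FROM docs"),
                key=lambda row: cosine(query, unpack(row[1])),
                reverse=True)
print(ranked[0][0])  # -> "cats are great"
```

Brute-force scan over every row, so it's O(n) per query, but for a personal RAG store with a few thousand docs that's nothing.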
Anonymous No.107070537 [Report]
>>107070535
fair enough. Thanks.
Anonymous No.107070598 [Report] >>107070637
where can I get benchmark for ancient models?
Anonymous No.107070637 [Report]
>>107070598
https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard#/
it goes up to around the mistral 7b era, doesn't seem to have up to early llama 1 but at that point it's a literally who cares thing
Anonymous No.107070647 [Report] >>107070660
>>107064207
>Shilling PewDiePie unironically
Anonymous No.107070660 [Report]
>>107070647
come on, he said the nigger word, he's /ourguy/
Anonymous No.107070663 [Report] >>107070677
>be a literal nobody without a single skill worth a damn
>looks like an adolescent at 36yo (if he shaved he would look even more like a teenager)
>become a multi millionaire just for filming yourself doing random things and saying random things
admit it, we all wish we could do that
Anonymous No.107070677 [Report] >>107070687 >>107070695
>>107070663
Idk man, my soul isn't for sale
Anonymous No.107070687 [Report] >>107070700
>>107070677
You're just saying that because no one is willing to buy it
Anonymous No.107070695 [Report] >>107070700
>>107070677
>noooo I wouldn't make a bunch of lets plays for 100 million dollars my soul is not for sale haha
Oof, keep huffing that copium bro, you need it
Anonymous No.107070700 [Report]
>>107070687
>>107070695
not everyone is a souless golem anon, there's people who have integrity
Anonymous No.107070815 [Report] >>107071038 >>107071693 >>107071930 >>107072140
lemao
Anonymous No.107071038 [Report] >>107071075 >>107071088
>>107070815
true, i have some sneething friends' wives saying their HIGH IMPORTANCE secretary job is at risk due to AI.
like lmao bitch, get under the desk and start being useful then
Anonymous No.107071075 [Report]
>>107071038
>lmao bitch, get under the desk and start being useful then
keeek
Anonymous No.107071088 [Report] >>107071100 >>107071116
>>107071038
Imagine the purpose of your existence honed over decades, being replaced by some matmuls
Anonymous No.107071100 [Report] >>107071233
>>107071088
talking with clients to arrange meetings and managing my agenda/calls isn't that big of a skillset. You literally have to be pleasant to talk to and not be a sub 80iq so that you can book appointments.
Anonymous No.107071102 [Report]
clanked by clankers
Anonymous No.107071116 [Report] >>107071443
>>107071088
you can't stop progress, every technological advances had its sacrifices, I'm using a printer because I don't give a shit about hiring someone that would reproduce papers manually, that's how it is
Anonymous No.107071233 [Report]
>>107071100
Talking with clients isn't going to be replaced any time soon. Nothing requiring being face to face will.
Anonymous No.107071286 [Report]
>>107067524
>migu.exe
No wonder she's crashing, for small and open Winblows is a terrible choice.
Anonymous No.107071342 [Report]
>>107067809
idgi
Anonymous No.107071363 [Report]
>>107068111
>That's the tragedy: they're not Tokens
Anonymous No.107071443 [Report] >>107071593
>>107071116
Past technological advances didn't obliterate millions of jobs practically overnight. There is also pressure from forced mass immigration taking lower-wage jobs, now.
Anonymous No.107071445 [Report]
>>107067524
i look like this
Anonymous No.107071520 [Report]
>>107069360
What's your rig?
Anonymous No.107071593 [Report] >>107071651
>>107071443
>There is also pressure from forced mass immigration taking lower-wage jobs, now.
You would think if AI is eliminating so many jobs we would need fewer people, not more. Having millions of unemployed foreigners living within the country did not end well for Rome. Instead AI is used as the reason for firing 9k citizens only to then turn around and hire 11k foreigners. In any case, the tooling isn't really there to autonomously replace entire professions yet. It just allows downsizing by making existing workers more productive.
Anonymous No.107071616 [Report] >>107071628 >>107071899
>>107063981 (OP)
Anonymous No.107071628 [Report]
>>107071616
What might be at the end of Miku's luminous tunnel?
Anonymous No.107071651 [Report]
>>107071593
It's unbounded greed from corpos seeking short term gains, they don't care if it ruins the country
Anonymous No.107071693 [Report]
>>107070815
He's not wrong. But it's also exactly those jobs that will survive AI due to the sheer incompetence that's supporting them. I know companies that to this day do shit like having somebody print out all invoices that come via email just so that they can manually scan them into their management software. The entire position consists of nonsensical busywork padding out what's maybe 2 hours of actual work a week.
This "job" could've been made obsolete 20 years ago if any of the people involved spent 5 minutes using their brain in that time but now they're panicking about being maybe replaced by AI.
Anonymous No.107071713 [Report]
>>107066694
>I had to do a few iterations of merging+LoRa to get rid of the "You are absolute correct" and "I am deeply sorry" meltdown slop.

A single 2MB control-vector could have obliterated those lol
Anonymous No.107071747 [Report] >>107072090 >>107073196
Anyone have any insight into the market for hiring freelance IA developers? (Europe especially)
I'm currently a backend web dev and I've been getting tired of it for years now.
I'm purely money-motivated at this point and was considering classes/self-learning for either cybersecurity or IA development. I'm equally interested in both, but since I've already done some Python, why not make it easier for myself and pick IA (computer vision is what attracts me the most).
Anonymous No.107071899 [Report]
>>107071616
cute, this looks like the tunnel at the base of Tokyo tower
Anonymous No.107071930 [Report] >>107072005
>>107070815
Humans having to do less work is fundamentally a good thing, the problem is that we are still making not having a job as painful as possible in order to coerce people to work jobs they hate for shit pay.
Anonymous No.107072005 [Report]
>>107071930
> Humans having to do less work is fundamentally a good thing
in a utopian world yes, but we don't live in a utopian world.
The only people that will benefit will be rich people. The rest of us will starve.
Anonymous No.107072090 [Report]
>>107071747
>freelance IA developers
lmao how do you even begin to define this because there's too many ways to interpret this
AI dev as in being an expert of infrastructure, inference?
as in writing tooling for training, data set curation etc?
but I'm being too nice
let's assume you're the average crud shitter and what you really mean is that you wanna be an API monkey who writes wrappers around models
well guess what, anyone with half a functioning brain can write a script that feeds stuff to a model, and the market is saturated with pajeets willing to do it for a pittance, so don't bother
I suggest you retrain for plumbing, bricklaying or lineman work instead
Anonymous No.107072140 [Report] >>107072971 >>107073106
>>107070815
He's Absolutely Right
but he probably didn't intend to come across as negative on AI, but that's what it really is
if your job gets replaced by one of those dysfunctional AIs it sure wasn't a real job because the tech is nowhere near good enough even for pissing out code
the only reason it seems to be passable at it is because most humans can't code for shit, there's a reason why something as simple as fizzbuzz used to be an actual filter in job interviews
the original article that made it into a meme
https://blog.codinghorror.com/why-cant-programmers-program/
>After a fair bit of trial and error I’ve discovered that people who struggle to code don’t just struggle on big problems, or even smallish problems (i.e. write an implementation of a linked list). They struggle with tiny problems.
>So I set out to develop questions that can identify this kind of developer and came up with a class of questions I call “FizzBuzz Questions” named after a game children often play (or are made to play) in schools in the UK. An example of a Fizz-Buzz question is the following:
>Write a program that prints the numbers from 1 to 100. But for multiples of three print “Fizz” instead of the number and for the multiples of five print “Buzz.” For numbers which are multiples of both three and five print “FizzBuzz.”
>Most good programmers should be able to write out on paper a program which does this in a under a couple of minutes. Want to know something scary? The majority of comp sci graduates can’t. I’ve also seen self-proclaimed senior programmers take more than 10-15 minutes to write a solution.
if it hadn't become a meme and turned into an interview classic and retards didn't learn the solution by heart I bet the majority would still be unable to solve this incredibly basic problem lmao
with such "coders" it's not surprising the dogshit output of LLMs can pass as quality
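For reference, the problem as quoted above really is a few lines; e.g. in Python:

```python
def fizzbuzz(n: int) -> str:
    # String * bool multiplication: "Fizz" * True == "Fizz", "Fizz" * False == "".
    out = "Fizz" * (n % 3 == 0) + "Buzz" * (n % 5 == 0)
    return out or str(n)  # empty string is falsy, so non-multiples fall back to the number

print("\n".join(fizzbuzz(n) for n in range(1, 101)))
```

Concatenating the two checks handles the "FizzBuzz" case for free, which is exactly the part the article says filters people out.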
Anonymous No.107072262 [Report] >>107072338 >>107072391
>Finally have goofs of Qwen3-VL
>It's completely censored
Why can't we have nice things? Why is all AI censored now? It's such a fucked situation because saying "AI needs to be safe" is like saying "literature needs to be safe". Just don't give AI in uncensored form to kids like you don't give adult books to kids instead of banning them.
Anonymous No.107072299 [Report]
what's the best nsfw uncensored model in gguf format for a 8gb vram card?
Anonymous No.107072338 [Report] >>107073822
>>107072262
200B qwen 3 VL is great for captioning nsfw, just a simple JB / prefill is all you need
Anonymous No.107072391 [Report] >>107072432 >>107072491
>>107072262
>adult
That's a last century concept. There are no adults anymore. Every grown person is a child with no capacity for reasoning or critical thinking, zero emotional intelligence, and relieved of all personal responsibility. We need to be protected for our own good, Anon.
Anonymous No.107072432 [Report] >>107072825
>>107072391
>There are no adults anymore
There have never been.
Anonymous No.107072491 [Report]
>>107072391
Perfect. It's better for people to rely on the nanny state.
Anonymous No.107072825 [Report] >>107072846 >>107072914
>>107072432
Coal mines unironically made adults from kids.
Anonymous No.107072846 [Report] >>107072888
>>107072825
For 80 years, we've not had a good war
Anonymous No.107072888 [Report]
>>107072846
For 80 years, there has been no dignity in war. Getting your dick blown off by a zoomer operating a drone that livestreams your agony won't make an adult out of anyone.
Anonymous No.107072914 [Report]
>>107072825
It's never really been about age, but accumulated life experience. Who's more adult: a 12-year-old soldier from Congo, a 20-year-old college student from LA, or a 40-year-old NEET from Tokyo who never left his house past middle school? Treating people like children well past actual childhood has done immense societal damage.
Anonymous No.107072971 [Report] >>107072987
>>107072140
>I’ve also seen self-proclaimed senior programmers take more than 10-15 minutes to write a solution.
I'm like that. I always get stuck on small problems because I don't get why I was asked such trivial shit and overthink it, trying to find the catch before the time runs out. I'm good at complex problems when I can sleep on it and find a solution the next day
Anonymous No.107072987 [Report] >>107074297
>>107072971
Same. I tell people that I think good, but not fast.
Anonymous No.107073104 [Report] >>107074349
AI has stalled because we've run out of new data
2024 was the last year where you could have obtained untainted data
Anonymous No.107073106 [Report]
>>107072140
Boomer article.
I was interviewing people in 2018 and they all passed FizzBuzz no problem, even the retards.
Anonymous No.107073196 [Report]
>>107071747
>frenchfag
Lmao try Paris
Anonymous No.107073221 [Report] >>107073238
Will aliens on 3I/Atlas give us better AI tech?
Anonymous No.107073238 [Report] >>107073253
>>107073221
They will eject and deorbit into your vicinity a small capsule that contains a USB stick storing new Mistral large weights.
Anonymous No.107073253 [Report]
>>107073238
blessed ayyz
imagine if they dropped some simple technology trvke that allowed us to rapidly 100x VRAM/CPU/GPU densities
Anonymous No.107073433 [Report]
>>107070238
I simply live with the rats
Anonymous No.107073511 [Report] >>107073545 >>107073566
What platform or app can I use to generate scientific texts and explore knowledge with ai, while being able to provide my own api location?

Self hosting is preferred.
An android interface or mobile-compatible website is a requirement.
Anonymous No.107073545 [Report]
>>107073511
Read the build and proxying guides in the OP and try your question again once you've got some basic knowledge.
Self-hosting and accessing a secure web interface from your phone over a self-hosted VPN is a common mode of operation
Anonymous No.107073566 [Report]
>>107073511
lmstudio
mikupad
llama.cpp
kobold.cpp
google these, or read the op
Anonymous No.107073605 [Report] >>107073652 >>107073677 >>107073995
checking in after i dont know how long
anything better than largestral and deepsneed yet?
Anonymous No.107073652 [Report] >>107073761
>>107073605
gemma 4 soon
Anonymous No.107073677 [Report] >>107073893
>>107073605
>anything better than largestral and deepsneed yet?
for what purpose?
Anonymous No.107073756 [Report] >>107073792 >>107073807
has anyone trained a local model on /g/?

I would unironically use the shit of that.
Anonymous No.107073761 [Report]
>>107073652
Cancelled
Anonymous No.107073792 [Report]
>>107073756
trained on /pol/ the day the safetyfags began to screech https://en.wikipedia.org/wiki/GPT4-Chan
Anonymous No.107073807 [Report] >>107073851
>>107073756
You can make your own.
>https://github.com/Named666/AlphaAnon
>https://huggingface.co/theantichrist/Alpha-Anon-V01-135M
Anonymous No.107073822 [Report]
>>107072338
>200B model to fucking caption images
I hope that's a satire
Anonymous No.107073851 [Report] >>107073904 >>107073927 >>107073981 >>107073995
>>107073807
this is fucking sick. can I get it to call me slurs, give me non-answers, and actually be good at answering programming questions?

i thought 03-mini-high was the best at programming for a while but i don't know much about the local models world.
Anonymous No.107073893 [Report]
>>107073677
storytelling/rp/similar creative work
i know the slop phrases cant be escaped but it was the easiest to ban them out on largestral, and it always showed me the best understanding of the scene and context
Anonymous No.107073904 [Report] >>107073942
>>107073851
>can I get it to call me slurs, give me non-answers, and actually be good at answering programming questions?
two outta three ain't bad
Anonymous No.107073927 [Report] >>107073942
>>107073851
>can I get it to
>135m
if you can get it to produce a coherent sentence you'll be doing pretty good
Anonymous No.107073942 [Report]
>>107073927
>>107073904
I guess I just have to read the op and fuck around and find out now...
Anonymous No.107073981 [Report]
>>107073851
You can plug other models.
Anonymous No.107073995 [Report]
>>107073851
Just run a good model and lrn2prompt, you can have it behave however you might imagine, mostly
>>107073605
love pic
Anonymous No.107074062 [Report]
>>107074052
>>107074052
>>107074052
Anonymous No.107074297 [Report] >>107074334
>>107072987
I have a feeling you think neither good nor fast but are just telling that to yourself to sleep better at night
it's called: a cope
Anonymous No.107074334 [Report]
>>107074297
>it's called: a cope
>: a cope
>it's called:
>:
Anonymous No.107074349 [Report] >>107074586
>>107073104
>AI has stalled because we've run out of new data
>2024 was the last year where you could have obtained untainted data
LLMs are far, far better in real use than in 2024 because a lot of high-quality synthetic data makes them behave much better at instruction following. Today I can translate 6K tokens' worth of UI strings (added some more strings to my testbed JSON) in a single go, without chunking, with a 4B LLM (Qwen). The output isn't perfect, but it's actually quite decent in some language pairs like English<->French. 6K tokens in, 6K tokens out, no chunking, one shot.
Let that sink in.
Your 2024 LLM, the SOTA online models, could barely handle 4K tokens.
Today's true SOTA is models like Gemini that, while not as good as the advertised 1 million, can ingest so much more than anything from before that they finally became practical to use without a ton of RAG cope and context micro-management, which no sane person would want to deal with.

I am looking forward to Gemini 3, Gemma 4 and Qwen 4 next year.
Anonymous No.107074586 [Report]
>>107074349
>I am looking forward to [censored slop], [censored slop] and [censored slop] next year.