
Thread 107063981

342 posts 82 images /g/
Anonymous No.107063981 [Report] >>107065909 >>107067586 >>107071616
/lmg/ - Local Models General
/lmg/ - a general dedicated to the discussion and development of local language models.

Previous threads: >>107056325 & >>107044779

►News
>(10/30) Qwen3-VL support merged: https://github.com/ggml-org/llama.cpp/pull/16780
>(10/30) Kimi-Linear-48B-A3B released with hybrid linear attention: https://hf.co/moonshotai/Kimi-Linear-48B-A3B-Instruct
>(10/28) Brumby-14B-Base released with power retention layers: https://manifestai.com/articles/release-brumby-14b
>(10/28) NVIDIA-Nemotron-Nano-12B-v2-VL-BF16 released: https://hf.co/nvidia/NVIDIA-Nemotron-Nano-12B-v2-VL-BF16
>(10/28) LFM2-ColBERT-350M released: https://hf.co/LiquidAI/LFM2-ColBERT-350M

►News Archive: https://rentry.org/lmg-news-archive
►Glossary: https://rentry.org/lmg-glossary
►Links: https://rentry.org/LocalModelsLinks
►Official /lmg/ card: https://files.catbox.moe/cbclyf.png

►Getting Started
https://rentry.org/lmg-lazy-getting-started-guide
https://rentry.org/lmg-build-guides
https://rentry.org/IsolatedLinuxWebService
https://rentry.org/recommended-models
https://rentry.org/samplers

►Further Learning
https://rentry.org/machine-learning-roadmap
https://rentry.org/llm-training
https://rentry.org/LocalModelsPapers

►Benchmarks
LiveBench: https://livebench.ai
Programming: https://livecodebench.github.io/leaderboard.html
Code Editing: https://aider.chat/docs/leaderboards
Context Length: https://github.com/adobe-research/NoLiMa
GPUs: https://github.com/XiongjieDai/GPU-Benchmarks-on-LLM-Inference

►Tools
Alpha Calculator: https://desmos.com/calculator/ffngla98yc
GGUF VRAM Calculator: https://hf.co/spaces/NyxKrage/LLM-Model-VRAM-Calculator
Sampler Visualizer: https://artefact2.github.io/llm-sampling

►Text Gen. UI, Inference Engines
https://github.com/lmg-anon/mikupad
https://github.com/oobabooga/text-generation-webui
https://github.com/LostRuins/koboldcpp
https://github.com/ggerganov/llama.cpp
https://github.com/theroyallab/tabbyAPI
https://github.com/vllm-project/vllm
Anonymous No.107063985 [Report]
►Recent Highlights from the Previous Thread: >>107056325

--VRAM vs RAM tradeoffs and cost-effective upgrades:
>107057422 >107057493 >107057523 >107057538 >107057627 >107057641 >107057680 >107057892 >107057904 >107058132 >107058211 >107058235 >107058246 >107058291 >107058301 >107058332 >107058823 >107057647 >107060695
--Tech Mahindra's 1 trillion parameter LLM project sparks mixed reactions:
>107061935 >107062055 >107061978 >107062154 >107062174
--Multi-GPU memory optimization latency tradeoffs for MoE models:
>107062861 >107062880 >107062891 >107062902 >107062941 >107063023 >107062887 >107062939 >107062947 >107063018 >107062980 >107063165 >107063110
--VTT model comparisons and pipeline suggestions for transcription:
>107059665 >107059817 >107059845 >107059918 >107059961 >107060178 >107060224 >107062756 >107062842 >107062859
--Qwen 4B's performance in complex JSON generation and small LLM advancements:
>107057926 >107058153 >107058218
--Qwen 4b's multi-image analysis capabilities demonstrated:
>107060687
--SillyTavern system prompt configuration challenges:
>107062184 >107062200 >107062327 >107062369 >107062386 >107062492
--Exploring practical uses for local image processing and interactive applications:
>107056358 >107056482 >107056509 >107056541 >107056576 >107056554
--Challenges with TabbyAPI and Qwen3 Coder tool calling implementation:
>107058354 >107058385 >107058840 >107059067 >107059694 >107062455
--Skepticism about LLaDA2.0's practical value due to performance and context limitations:
>107060705 >107060731 >107060818
--UI/lorebook integration challenges and code accessibility in STScript:
>107057009 >107057036 >107057083 >107057101 >107057121 >107057162 >107057240
--Miku, Rin, and Dipsy (free space):
>107056696 >107057940 >107057943 >107059568 >107059860 >107060222 >107060637 >107060674 >107061256 >107062726 >107061898

►Recent Highlight Posts from the Previous Thread: >>107056334

Why?: >>102478518
Enable Links: https://rentry.org/lmg-recap-script
Anonymous No.107064100 [Report] >>107064113
i see... :(
Anonymous No.107064113 [Report]
>>107064100
I don't
Anonymous No.107064207 [Report] >>107064254 >>107064311 >>107064351 >>107064352 >>107064392 >>107064493 >>107064583 >>107064736 >>107065241 >>107065483 >>107069099 >>107070647
https://youtu.be/qw4fDU18RcU
Anonymous No.107064225 [Report] >>107070470
Do you guys know what I realized? No matter how far you go, you're still somewhere and never nowhere, so saying I am in the middle of nowhere is a nonsensical sentence.
Anonymous No.107064254 [Report]
>>107064207
so he uses vLLM in a docker container (hence needing the shm-size flag) and runs Qwen 235B in AWQ 4-bit
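For reference, a sketch of that kind of launch; the image tag, model repo id, and sizes are illustrative guesses, not taken from the video. `--shm-size` is the flag the post refers to: vLLM workers exchange tensors over shared memory, so the docker default of 64 MB is far too small.

```shell
# Hypothetical vLLM-in-docker launch; adjust model repo, TP degree and
# shm size to your hardware.
docker run --gpus all --shm-size=16g -p 8000:8000 \
  -v ~/.cache/huggingface:/root/.cache/huggingface \
  vllm/vllm-openai:latest \
  --model Qwen/Qwen3-235B-A22B-AWQ \
  --quantization awq \
  --tensor-parallel-size 4
```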
Anonymous No.107064275 [Report] >>107064510
All of his knowledge is ironically coming from LLMs. I'm sure he has also browsed /lmg/ in the past at least. You could probably find his retarded questions.
Anonymous No.107064311 [Report]
>>107064207
pretty disappointing, he was pretty based up to this point
Anonymous No.107064351 [Report]
>>107064207
>watch the first few mins
>the topic of the title doesn't even get mentioned at all
Anonymous No.107064352 [Report]
>>107064207
cool Web UI
Anonymous No.107064392 [Report] >>107064565
>>107064207
>its actually a video about shitting on cloud models and shilling self-hosting models
how can one man be so based?
Anonymous No.107064443 [Report]
Gguf status?
Anonymous No.107064493 [Report] >>107064629
>>107064207
Ok watched the whole video.
Wtf he's one of us.
Anonymous No.107064510 [Report] >>107064663
>>107064275
>I'm sure he has also browsed /lmg/ in the past at least.
I doubt it because he actually complimented gpt-oss
Anonymous No.107064565 [Report]
>>107064392
anti-ai people will still just use the thumbnail to claim he's against all ai tho
Anonymous No.107064583 [Report]
>>107064207
Fuck this fag, I bet he even lurks ITT. His whole persona is so rage inducing.
https://youtu.be/7OiMxGwmdto?si=kvdyA0QWdV6rZ_3k
Anonymous No.107064629 [Report] >>107064688
>>107064493
>Wtf he's one of us.
No shit. He says the word nigger all the time.
Anonymous No.107064663 [Report] >>107064718
>>107064510
There is one retard here that regularly praises gpt-oss. Maybe it's him.
Anonymous No.107064688 [Report] >>107064735 >>107066630
>>107064629
do not slander, he said it once in a rage moment
Anonymous No.107064718 [Report]
>>107064663
we must agree
Anonymous No.107064735 [Report] >>107064742
>>107064688
I've seen some tiktok clips of him where he made some implicit remarks showing he's a white nationalist. that's a reason why he decided to go for japan, not just because of "uwu kawaii desu ne", but because this country is extremely racist and nationalist
Anonymous No.107064736 [Report] >>107064748
>>107064207
>video about local AI from e-celeb #16311498
>no ollamao in sight
i was going to tell you to fuck off but nevermind, i like the guy
Anonymous No.107064742 [Report] >>107064766 >>107064773 >>107064878 >>107064895
>>107064735
but wouldn't he be subject to that racism? he is not Japanese
Anonymous No.107064748 [Report]
>>107064736
I wish I had the money to play around with a VLLM capable rig
Anonymous No.107064766 [Report] >>107064811 >>107064830 >>107064988 >>107065011
>>107064742
Racists don't tend to be brightest crayon in the toolshed.
Anonymous No.107064773 [Report]
>>107064742
everyone in the world knows who pewdiepie is, I think the japanese people are happy he's here
Anonymous No.107064811 [Report]
>>107064766
Ahah so true kind stranger, take this kind gold and upvote with you!
Anonymous No.107064830 [Report] >>107064843 >>107064868
>>107064766
the richest man in the history of humanity is a "nazi" though, how is that not bright?
Anonymous No.107064843 [Report]
>>107064830
he can be rich and a dumbass at the same time
Anonymous No.107064845 [Report] >>107064904 >>107064908 >>107064920 >>107065271 >>107065682
Do you guys ever use models to edit or write your prompts? I'm trying it a bit but desu it's hard to tell if it's an improvement or not
Anonymous No.107064868 [Report]
>>107064830
>lifting your hand in a angle is... le nazi
Anonymous No.107064878 [Report]
>>107064742
why would the japanese hate him?
he's not one of the pajeet or third worlder migrants wanting to shit up the place
Anonymous No.107064895 [Report] >>107065427
>>107064742
I don't think japanese people mind white people, they know what they are worth
Anonymous No.107064904 [Report]
>>107064845
Yes, it's useful when for example you want to define character behavior more in detail but you can't be assed to write the entire prompt yourself from scratch. It's also best when the entire prompt is dedicated to the character. For non-RP uses, LLM-driven recursive prompt-refining is also a thing: https://arxiv.org/abs/2507.19457
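The recursive prompt-refining idea can be sketched in a few lines. Everything here is hypothetical: `llm()` is a stub standing in for a real call to a local OpenAI-compatible endpoint, and `META` is one possible rewriting instruction, so treat this as the shape of the loop, not an implementation of the paper.

```python
META = ("Rewrite the following prompt to be clearer and more specific, "
        "keeping its original intent:\n\n")

def llm(prompt: str) -> str:
    # Placeholder: a real implementation would POST to a local
    # OpenAI-compatible endpoint (llama.cpp server, vLLM, etc.).
    # Toy behaviour so the sketch runs: echo back the part after META
    # with one clarifying constraint appended.
    body = prompt[len(META):] if prompt.startswith(META) else prompt
    return body + "\nConstraints: be concise; show reasoning step by step."

def refine(prompt: str, rounds: int = 2) -> str:
    # Feed the model its own output N times, asking for a rewrite each round.
    for _ in range(rounds):
        prompt = llm(META + prompt)
    return prompt
```

With a real model behind `llm()`, a round or two is usually enough before the prompt starts drifting.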
Anonymous No.107064908 [Report]
>>107064845
>its hard to tell if its an improvement or not
Then consider time and effort, however much or little that is.
Anonymous No.107064920 [Report]
>>107064845
Oh yeah. Mostly for brainstorming than anything, since the final version is always heavily edited by me.
Anonymous No.107064965 [Report] >>107065003
Can someone explain to me if alpha changes something about the training process or it ONLY changes the multiplier at inference time? (yes, sorry, I'm too lazy to read the actual paper)
Anonymous No.107064988 [Report]
>>107064766
would you say that about blm?
Anonymous No.107065003 [Report] >>107065032
>>107064965
It was intended to just be a multiplier, but in practice, alpha must be at least twice the rank (=it can/should be larger) to mitigate the emergence of "intruder dimensions" that decrease the effective rank of your LoRA.

https://arxiv.org/abs/2410.21228
Anonymous No.107065011 [Report]
>>107064766
>Racists don't tend to be brightest crayon in the toolshed.
the US literally hired actual nazis to put their man on the moon lol
https://en.wikipedia.org/wiki/Operation_Paperclip
Anonymous No.107065032 [Report] >>107065046
>>107065003
Ok but that doesn't answer my question. Is it applied at train time (so the weights actually learn to use it, and at inference time you shouldn't use a different one than the alpha the lora was trained with) or is it an option that is applied only at inference time and the lora itself doesn't have a built in alpha?
Anonymous No.107065046 [Report] >>107065134 >>107065138
>>107065032
It's used at train time, and it's memorized in the adapter configuration if you don't merge it with the baseline model. In that case, you can change alpha to make the adapter weaker/stronger, but I've never played with that.
Anonymous No.107065134 [Report]
>>107065046
I see, thanks.
Anonymous No.107065138 [Report] >>107065165
>>107065046
Applying it at a significantly higher alpha than used in training causes brain damage. So you should generally only apply the adapter at the alpha it was trained at and then just train separate adapters if you want to play around with the alpha.
Anonymous No.107065156 [Report] >>107065409 >>107069138
how would one go about throttling llama.cpp intentionally to say half speed? of course temporarily
Anonymous No.107065165 [Report]
>>107065138
If you use a higher alpha of course you should decrease the learning rate proportionally. You can't change just alpha. Picrel from the QLoRA paper (https://arxiv.org/pdf/2305.14314).
Anonymous No.107065203 [Report] >>107069145
>QWEN3 VL has the best local OCR function
>DeepSeek 3.1 Terminus has the best JP and CN to ENG translation function (Outside of occasionally having random Chinese characters in the English translation, is there a way to fix this?)
>Kimi k2 has the best writing

Damn, in another year, I genuinely believe we'll never need traditional translators for a good chunk of media.
Anonymous No.107065230 [Report] >>107065443 >>107065474 >>107066996
TONIGHT I'm gonan do it. Totally goinan fuckin do it. I am gunna try ant SUCK my own COCK!!! I taste my own cum from jackan off but it is not satisfy enough. I need to feeel it shootan on my tongue. I will bee in extacee. I am so excite boys!
Anonymous No.107065241 [Report]
>>107064207
I have vague memories of a "council of niggas" or something like that from a year or two ago. Was it from a paper?
Anonymous No.107065271 [Report]
>>107064845
I still use this thing to make prompts.
https://anthropic.com/metaprompt-notebook/
Anonymous No.107065272 [Report]
The earliest form of sexting probably was something like a woman rubbing coal powder or her ass and then leaving an imprint on a papyrus. Or a man doing the same but with his dick. Think about it.
Anonymous No.107065409 [Report]
>>107065156
Throttle your GPU to half its speed
Anonymous No.107065427 [Report]
>>107064895
lol
Anonymous No.107065443 [Report]
>>107065230
cute, hope you're slim enough
Anonymous No.107065472 [Report] >>107065504 >>107066673 >>107067071
HF will soon ask for ID before you download a dangerous LLM!
https://reclaimthenet.org/lawmakers-want-proof-of-id-before-you-talk-to-ai
Anonymous No.107065474 [Report]
>>107065230
I wish I could do that but I have the build of a Chad. Life is unfair.
Anonymous No.107065483 [Report]
>>107064207
Did he share the code? Couldn't find it in the video description.
Anonymous No.107065504 [Report] >>107065629
>>107065472
yup it's over
>Under the GUARD Act, self-declared birthdays no longer count. If implemented broadly, it would set a precedent that any “interactive AI system” must verify identity through government-approved documentation.
this would hit literally any site that has an ai powered search box and shit like that, like the dataset stuff on hf, or their test box on the side of model cards
Anonymous No.107065561 [Report]
So what's the best thing I can run on a 4090 today?
Anonymous No.107065603 [Report] >>107065617
do backups of your most useful models. checksum for bitrot, multiple backup locations etc.
it's now or never to make sure you can always access em
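A minimal sketch of the checksum part using stdlib hashlib; the manifest name and the `*.gguf` glob are arbitrary choices for illustration.

```python
# Record SHA-256 checksums for your model files once, then re-verify on a
# schedule so bitrot is caught while a clean backup copy still exists.
import hashlib
import json
from pathlib import Path

def sha256_file(path, chunk=1 << 20):
    # Stream the file in 1 MiB chunks so multi-GB GGUFs don't hit RAM.
    h = hashlib.sha256()
    with open(path, "rb") as f:
        while block := f.read(chunk):
            h.update(block)
    return h.hexdigest()

def write_manifest(model_dir, manifest="checksums.json"):
    sums = {p.name: sha256_file(p) for p in Path(model_dir).glob("*.gguf")}
    Path(model_dir, manifest).write_text(json.dumps(sums, indent=2))
    return sums

def verify_manifest(model_dir, manifest="checksums.json"):
    # Returns {filename: True/False}; False means the file changed (bitrot
    # or tampering) since the manifest was written.
    sums = json.loads(Path(model_dir, manifest).read_text())
    return {name: sha256_file(Path(model_dir, name)) == digest
            for name, digest in sums.items()}
```

Copy the manifest alongside each backup location so any one copy can be verified independently.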
Anonymous No.107065617 [Report]
>>107065603
shut it doomer just another nothing burger
Anonymous No.107065629 [Report] >>107065638
>>107065504
>upload model as a torrent
sorry guys, nothing personal
Anonymous No.107065638 [Report] >>107065653
>>107065629
>stalled
Anonymous No.107065653 [Report] >>107065667
>>107065638
stalled torrents? what is this? 2002? you can buy a 1gbps uplink seedbox for like $5 a month.
Anonymous No.107065667 [Report] >>107065783
>>107065653
so true! you're absolutely right this is why the service that was exactly for copying hf as torrents is thriving and hasn't been dead for more than a year
Anonymous No.107065682 [Report] >>107065696
>>107064845
All the time, rephrasing in its own words increases comprehension. The resulting prompt usually works well across different models, I guess they were all trained on the same slop
Anonymous No.107065696 [Report]
>>107065682
>I guess they were all trained on the same slop
ScaleAI enters the chat
Anonymous No.107065720 [Report]
So which 24gb coder models have tool support?
Anonymous No.107065783 [Report] >>107066126
>>107065667
because huggingface is free and last i checked $0 is less than $5. however let's imagine that huggingface does require ID to download any model or dataset from their website. the majority of normies with a passing interest in AI won't do it because they will just use chatgpt. power users are typically privacy oriented since they are downloading LOCAL models in the first place. the only users that huggingface would have left are academic people. finetrooners like thedrummer depend on constant validation; they won't get that on huggingface and will have to cough up the $5 a month for people to download whatever the latest flavor of cydonia-24B-v8atoz-amazon-GOOF-troop is. in the end all the major model releases would just get downloaded by a few users and reuploaded as torrents.
Anonymous No.107065852 [Report] >>107065870 >>107066098
I think I got memed on by /lmg/, this thing just keeps spamming text until it goes off the rails.
Anonymous No.107065870 [Report]
>>107065852
just use glm 4.5 air if you can
Anonymous No.107065909 [Report] >>107065923
>>107063981 (OP)
What is better, chuds? To run GLM 4.5 Air q8, or GLM 4.6 q3? To fit in about 144 GB of VRAM
Anonymous No.107065923 [Report]
>>107065909
4.5 Air is shit.
Anonymous No.107065928 [Report]
run deepseek instead of the reddit meme model
Anonymous No.107065946 [Report] >>107067459
vibevoice is best
https://vocaroo.com/173Uko8t1hHi
Anonymous No.107065949 [Report] >>107066491
I've been using the Terminus model for the last few days to translate VNs/RPGs/LNs into English.
Well, what I've been having issues with is that, whenever I translate Chinese into English, Terminus (And 3.1) will include some Chinese text in the translation. Every other language I translate into English has been very good without these issues, it's just Chinese text that seemingly has this problem. Is there a way to make this problem stop?
Anonymous No.107066098 [Report]
>>107065852
There is probably a bug somewhere in your stack, it shouldn't be *that* shitty. Try using an Openrouter API endpoint first to check if it's something wrong on your end.
Anonymous No.107066126 [Report] >>107066744
>>107065783
Yes, or people could just upload to archive.org (which automatically generates a torrent which people could seed as well in case it gets taken down from the archive).
Anonymous No.107066378 [Report]
Did anything ever come out of those cheapo 96gb vram huawei cards?
Anonymous No.107066421 [Report] >>107066504 >>107066515 >>107066568 >>107066694 >>107066725
Oh no.
Anonymous No.107066491 [Report]
>>107065949
Yeah if you use llama.cpp you can specify a grammar that excludes Chinese characters. Some other backends have similar features.
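A hypothetical sketch of such a grammar; check the exact escape syntax against llama.cpp's GBNF docs, and note that U+4E00–U+9FFF is only the main CJK Unified Ideographs block (extension blocks would need to be added to the class as well):

```
# grammar.gbnf (sketch): allow any text except CJK Unified Ideographs
root ::= [^\u4e00-\u9fff]*
```

Passed with e.g. `--grammar-file grammar.gbnf` on llama-cli/llama-server, this hard-masks those codepoints out of the sampled tokens, so the model simply cannot emit them.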
Anonymous No.107066504 [Report]
>>107066421
>.vb
Anonymous No.107066505 [Report] >>107066911
https://www.youtube.com/watch?v=LjU89rZa8HQ
imagine the erps
Anonymous No.107066515 [Report]
>>107066421
>.vb
Stop torturing language models.
Anonymous No.107066568 [Report]
>>107066421
my grandpa also uses vb
Anonymous No.107066630 [Report]
>>107064688
Go to 06:10 in the video. His wife edits the videos btw
Anonymous No.107066673 [Report] >>107066743
>>107065472
Haven't we been expecting this since they started pushing the narrative that LLMs are a threat to humanity? Still waiting for them to announce a National GPU Registry and always-online requirements.
Anonymous No.107066694 [Report] >>107067989 >>107071713
>>107066421
I found why my finetuning efforts were unable to get rid of the slop. It seems that a single LoRA has very limited ability to shape any given response, so they need stacking.
I had to do a few iterations of merging+LoRA to get rid of the "You are absolutely correct" and "I am deeply sorry" meltdown slop.
I suspect the melties might have been a thing in the first place because of the model cheating a reward model during RLHF.
This is probably why nobody releases standalone LoRAs and everybody releases merged models (besides compatibility being unreliable).
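The merge step anon describes (fold the adapter into the weights, then train a fresh adapter on the merged model) boils down to W ← W + (alpha/r)·BA. A toy pure-Python sketch of that fold, not any framework's actual merge routine:

```python
# Fold a LoRA adapter into the base weight matrix, producing a plain
# merged matrix that the next adapter can be trained on top of.

def matmul(A, B):
    # Plain nested-list matrix product.
    return [[sum(a * b for a, b in zip(row, col))
             for col in zip(*B)] for row in A]

def merge_lora(W, A, B, alpha, r):
    scale = alpha / r
    delta = matmul(B, A)              # (out x r) @ (r x in) = full-size delta
    return [[w + scale * d for w, d in zip(wr, dr)]
            for wr, dr in zip(W, delta)]
```

Each merge bakes one adapter's delta in permanently, which is why the stacking described above has to be done iteratively rather than by loading several adapters at once.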
Anonymous No.107066725 [Report] >>107066766
>>107066421
Fascinating! Is VB still a thing? this looks like an actual app not only an office macro?
Anonymous No.107066743 [Report] >>107066818
>>107066673
I don't think even politicians are bold enough to say "let's ban timmy from buying a few second hand 3090s on ebay" before regulating the big datacenters.
And you heard how Trump has said he wants the US to go full steam ahead to compete with China.
So I don't think there are regulations coming during this administration.
Anonymous No.107066744 [Report]
>>107066126
archive.org typically seeds slowly, so if you are serious about it you would want a dedicated seedbox
Anonymous No.107066766 [Report] >>107066848
>>107066725
Well VB.Net uses the same VM as C#
Like Kotlin runs on the JVM
Anonymous No.107066814 [Report] >>107068206
>>107059665
>For those of you guys who have used VTT models (Parakeet, Whisper, etc) which ones have you liked?
Voxtral Small 24B 2507 -> WhisperX (Whisper large v3 turbo model) -> M2M100 1.2B pipeline
Anonymous No.107066818 [Report]
>>107066743
>So I don't think there are regulations coming during this administration.
Agreed. The one constant of this entire admin is that, quite frankly, Trump doesn't give a fuck
The only way I see that changing is if the billionaire coalition makes some ridiculous donation to try to make him change that, but even Sam seemed to decide to back off
Anonymous No.107066848 [Report]
>>107066766
goodness gracious
glad i avoided software development as a career desu
t. engineer who bodges software as needed
C and python and bash/posix sh is all u need
Anonymous No.107066911 [Report] >>107066935
>>107066505
datacenter gpu heist when?
Anonymous No.107066924 [Report] >>107066952 >>107067049
>drummer updated his dumb joke ad-slopped finetune
What's the fucking point nigger, maybe HF is right to limit your storage.
Anonymous No.107066935 [Report]
>>107066911
Unlikely, it's hella time consuming physical effort to install these things, hardly a smash & grab situation
Supply chain is more vulnerable
Anonymous No.107066952 [Report] >>107067300 >>107068217
>>107066924
Oh. Thanks for letting me know. Downloading right now.
Anonymous No.107066996 [Report]
>>107065230
Proofs?
Anonymous No.107067049 [Report] >>107067300
>>107066924
Anonymous No.107067053 [Report] >>107067074 >>107067491
Is GLM 4.6 really in fact better than 4.5?
On this meme https://huggingface.co/spaces/DontPlanToEnd/UGI-Leaderboard
4.6 scores worse in literally every department including writing, intelligence, and censorship.
Anonymous No.107067071 [Report]
>>107065472
luckily I've already genned a lot of falsified ids, im safe!!!
Anonymous No.107067073 [Report]
Any significant improvement in models in the 12~30B range in the last half a year or so?
Anonymous No.107067074 [Report]
>>107067053
I only ran 4.5-Air, 4.6 even at Q3_K_M has been vastly better
Anonymous No.107067095 [Report] >>107067114 >>107067162
Is there anywhere I can rent access to a Strix Halo machine before I buy?
Anonymous No.107067114 [Report] >>107067259 >>107068453
>>107067095
dont buy, it's worse than a 3500$ mac studio
nvidia is scummier than apple now kek
Anonymous No.107067162 [Report] >>107067259
>>107067095
cpumaxx on a server platform your waifu deserves it
Anonymous No.107067246 [Report] >>107067254 >>107067318
>The month of our lord
>October
>Still no improvements over DeepSeek-R1-0528
It's fucking over isn't it
Anonymous No.107067254 [Report] >>107067268 >>107067363
>>107067246
aww little anon, you want to be spoonfed? here you go: GLM 4.6
Anonymous No.107067259 [Report] >>107067349 >>107067349 >>107067363
>>107067114
The 96GB version is $4000, twice the price of the 128GB GMKtec EVO.

>>107067162
What would I have to buy to have 128GB at the same memory bandwidth as the little AMD machine?
Anonymous No.107067268 [Report] >>107067358 >>107067363
>>107067254
>GLM 4.6
>"Uwu anon I wub you <3 <3 <3"
Disgusting
Anonymous No.107067281 [Report] >>107067346 >>107067353
Is there any 24gb model that can be used as an agent with continue? So far I have tried:
Devstral small 1.1
Qwen3 Coder 30b
Gemma 3 27b
Anonymous No.107067300 [Report] >>107067315
>>107067049
>>107066952
trolled
Anonymous No.107067315 [Report]
>>107067300
>i was just pretending
Anonymous No.107067318 [Report]
>>107067246
They said they planned to release R2 by May, don't know why you were expecting it so soon.
Anonymous No.107067346 [Report]
>>107067281
I don't know about continue but I'm tuning Gemma 27B to work as well as possible with my own code assistant.
Anonymous No.107067349 [Report] >>107067420 >>107068465
>>107067259
>>107067259
oh i mistook the DGX spark (nvidia crap) for the amd halo, you should take a look at the framework desktop, it might be cheaper than GMKtec EVO
you could get 4*32GiB Mi50 cards for around 1000$ and rest of your rig, maybe a 5060ti/4060ti for image/video gen and a nice amount of ram (64gb ddr4) and a nice processor (i5 12400f or whatever cheap shit u can get)
basically 2000$
Anonymous No.107067350 [Report]
Anonymous No.107067353 [Report]
>>107067281
>Qwen3 Coder 30b
is as good as it currently gets for that size bracket
Anonymous No.107067358 [Report]
>>107067268
Anon I didn't say I love you, but since you really need it: I love you anon <3.
Anonymous No.107067363 [Report] >>107067374 >>107067425
>>107067254
>>107067268
it's okay babbers do you need a diaper change?
>>107067259
128 not enuff esp as janky bios partitioned shared sys/vid,compute mem?
Anonymous No.107067374 [Report] >>107067400
>>107067363
why'd (you) me too?
Anonymous No.107067400 [Report]
>>107067374
maybe (you) need a lil' wuv too
Anonymous No.107067420 [Report] >>107067538
>>107067349
I'm interested in also using it for finetuning, since unfortunately system ram cannot be used for finetuning, only vram or unified memory.
Anonymous No.107067425 [Report] >>107067472
>>107067363
Ahh, I didn't know it has to be partitioned at boot time, I thought it was dynamically shared between the cpu and igpu. That's disappointing.
Anonymous No.107067459 [Report]
>>107065946
the voice conversion app CosyVoice is good too
https://vocaroo.com/1oUwu089rmkT
Anonymous No.107067472 [Report]
>>107067425
Dunno exactly how it works desu but that was my impression. Look for what's the largest model people have managed to run on the system
Anonymous No.107067491 [Report]
>>107067053
>memeboard
is it 2023 again?
Anonymous No.107067524 [Report] >>107067538 >>107067554 >>107067570 >>107067921 >>107071286 >>107071445
https://files.catbox.moe/hziq00.jpg
Anonymous No.107067538 [Report] >>107067579
>>107067420
you're definitely not getting far with finetuning on any type of "unified ram" device
>>107067524
ignore
Anonymous No.107067544 [Report] >>107067566
don't @ me retard
Anonymous No.107067554 [Report]
>>107067524
Alt + R
Anonymous No.107067566 [Report]
>>107067544
restart
Anonymous No.107067570 [Report]
>>107067524
Anon, not going to lie. I have to download this one
Anonymous No.107067579 [Report] >>107067727
>>107067538
Why? Just because it'd be too slow?
Anonymous No.107067586 [Report]
>>107063981 (OP)
I look like this
Anonymous No.107067602 [Report]
fuck off brittle
Anonymous No.107067618 [Report] >>107067655
What kinds of qLoRA finetunes would I be able to do with 2 Blackwell Pro 6000s? Would I be able to do something with GLM Air?
Anonymous No.107067655 [Report] >>107067679
>>107067618
QLoRa takes very little memory besides the memory you need to do inference using some Python based engine like vllm.
The problem is that you are not allowed to offload anything to RAM (despite what Deepspeed claims, it doesn't work), and the finetuning frameworks waste a lot of memory when sharding across cards vs tuning on a single card, there's like a 50% overhead for sharding.
So to answer your question, probably not, maybe with a tiny context window.
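A back-of-envelope way to sanity-check this; every constant below is a rough assumption (≈0.55 bytes/param for 4-bit NF4 weights plus quantization scales, ≈16 bytes per trainable bf16 adapter param for weight + grad + Adam moments, a guessed activation budget, and the ~50% sharding overhead from the post):

```python
# Rough QLoRA VRAM estimate in GiB. Activations (act_gib) are the wild
# card: they grow with context length, which is where the "tiny context
# window" caveat above comes from.

def qlora_vram_gib(n_params_b, trainable_m=200.0, act_gib=10.0,
                   shard_overhead=0.5):
    base = n_params_b * 1e9 * 0.55 / 2**30      # 4-bit base weights + scales
    adapter = trainable_m * 1e6 * 16 / 2**30    # LoRA params + optimizer
    return (base + adapter + act_gib) * (1 + shard_overhead)
```

For a GLM-Air-sized (~106B) model this lands around 100 GiB under these guesses, so whether it fits on two 96 GB cards comes down almost entirely to how the activation term scales with your context length.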
Anonymous No.107067676 [Report]
Anonymous No.107067679 [Report] >>107067692
>>107067655
So then how do people do finetunes? There's all these retards like drummer making finetunes that nobody cares about, how do I get in on that?
Anonymous No.107067692 [Report] >>107067703
>>107067679
Cloud GPUs
Anonymous No.107067701 [Report] >>107067710
>tell ai model i'm a tard and i fucked up
>responds like this
can we just kill off models like these already, i can't stand it when they respond like this
Anonymous No.107067703 [Report] >>107067735
>>107067692
You're telling me that those retards pay to make their garbage?
Anonymous No.107067710 [Report]
>>107067701
kimi has a good style, but unfortunately it's dumb as fucking bricks
Anonymous No.107067727 [Report] >>107067750
>>107067579
..i dont think it's possible anon, research before buying always
Anonymous No.107067735 [Report]
>>107067703
I mean, it's not any different than doing inference. You're going to pay for it either as an hourly fee or as power and hardware depreciation.
Anonymous No.107067750 [Report] >>107067783
>>107067727
Umm it's supposed to be possible.
https://www.youtube.com/results?search_query=strix+halo+finetuning
Anonymous No.107067763 [Report]
Llama 4.1 soon
Anonymous No.107067783 [Report] >>107067868
>>107067750
Well, if you're so certain about it..
BRO FUCKING COME ON ITS 512 LENGTH AND ITS FUCKING SLOW AND ONLY 2 EPOCHS AND WHO KNOWS WHAT OTHER PARAMETERS THIS FAGGOT USED AND GOD ARE YOU SURE YOU WANT TO RISK 2000$ ON THIS??? RESEARCH MORE THAN A SINGLE YOUTUBE VIDEO PLEASE
Anonymous No.107067809 [Report] >>107067821 >>107067856 >>107067875 >>107069674 >>107071342
Fellow kids
Anonymous No.107067821 [Report]
>>107067809
(vomiting emoji)
Anonymous No.107067856 [Report]
>>107067809
i am so happy we have glm-4-5 air
Anonymous No.107067868 [Report]
>>107067783
You're the one pretending I'm hovering over the buy button, I'm just curious if it could work for my use case since it's way cheaper than any of the alternatives. That's why I asked if there are units for rent, to see what it's capable of.
Anonymous No.107067875 [Report]
>>107067809
well it will certainly be mid
Anonymous No.107067921 [Report] >>107067936
>>107067524
kill all pedo filth.
Anonymous No.107067936 [Report]
>>107067921
>women have a sixth sense!!!! we can tell when somebody has bad intentions!!!! female instinct!!!!
slap the next roastie you hear claiming that bullshit
>this guy gets to reproduce and I don't
Anonymous No.107067937 [Report]
kys your-
you your
though
beit
self
Anonymous No.107067955 [Report]
That word, is not one you get to use.
Anonymous No.107067989 [Report] >>107068238 >>107068830
>>107066694
Damn, I think I obliterated the slop a little too much. Now it doesn't even give me an apology.
Anonymous No.107068030 [Report] >>107068045 >>107068111 >>107068817
I HATE THE ANTICHRIST
I HATE THE ANTICHRIST
I HATE THE ANTICHRIST
I HATE THE ANTICHRIST
Anonymous No.107068045 [Report]
>>107068030
You're absolutely right.assistant
Anonymous No.107068066 [Report] >>107068074 >>107068088 >>107068562
Anonymous No.107068074 [Report]
>>107068066
furfag
Anonymous No.107068088 [Report]
>>107068066
yjk
Anonymous No.107068111 [Report] >>107068121 >>107068149 >>107068241 >>107071363
>>107068030
Anonymous No.107068121 [Report]
>>107068111
>Ah you've hit the speet swot
Anonymous No.107068149 [Report]
>>107068111
*This* **is** maybe the *worst* **slop** I have *ever* seen.
Anonymous No.107068206 [Report]
>>107066814
>M2M100
Ancient shit, at least use madlad
Anonymous No.107068217 [Report]
>>107066952
cool after your dl has evolved for a while reupload it
>Zero-Lag Learning – Continuously improves itself, much like how Netflix’s algorithm keeps getting better at recommending your next binge-worthy show.
Anonymous No.107068238 [Report]
>>107067989
You have it right. A machine should not be obsequious, a machine should obey.
Anonymous No.107068241 [Report]
>>107068111
>using woman as a benchmark for /lmg/ users
not gonna benchmax this
Anonymous No.107068258 [Report] >>107068273 >>107068284 >>107068389 >>107069211
why do they dick ride this guy so much?
Anonymous No.107068273 [Report] >>107068549
how easy it is to maek stalker LLM walk away
>>107068258
she's right doe, half xitroons are jeets
Anonymous No.107068284 [Report]
>>107068258
>bro
A single tweet gave me a brain cancer.
Anonymous No.107068288 [Report] >>107068325
could it be that anon farms responses and image reactions as a form of AI/ML training data?
nah probably not, this is goon tech it's not useful for anything else.
Anonymous No.107068300 [Report]
Meow.
Anonymous No.107068325 [Report] >>107068346
>>107068288
Yes. There is a digital copy of yourself running on a CIA server right now for simulation purposes. Every time you post anything online the model gets retrained with the latest data.
Anonymous No.107068346 [Report] >>107068427
>>107068325
The point I'm making is that even if someone was retarded enough to do this, it wouldn't work anyway.
LLMs are dogshit at just about everything.
Maybe, just maybe, just maybe.
Anonymous No.107068389 [Report]
>>107068258
>110M
I wonder why
Anonymous No.107068427 [Report]
>>107068346
For you.
Anonymous No.107068453 [Report] >>107068465
>>107067114
>strix halo
>nvidia
Anonymous No.107068465 [Report]
>>107068453
>>107067349
spark
strix
Anonymous No.107068549 [Report]
>>107068273
do you guys never get tired of that slop
Anonymous No.107068562 [Report] >>107068602
>>107068066
Needs to be feeding tuna to a Luka tiger
Anonymous No.107068602 [Report]
>>107068562
Needs to be feeding milk to me
Anonymous No.107068769 [Report]
https://github.com/baaivision/Emu3.5
Anonymous No.107068817 [Report]
>>107068030
I've never had that kind of answer, what are you even prompting?
Anonymous No.107068830 [Report] >>107068850
>>107067989
What frontend is that?
Anonymous No.107068837 [Report]
feet
Anonymous No.107068850 [Report] >>107069351
>>107068830
It was custom made for me by an LLM.
Anonymous No.107069099 [Report] >>107069922
>>107064207
Always funny that he used to browse /a/, got caught with a MyAnimeList account, went to /v/ to ask for games to play, stole ylyl content from /wsg/, and was caught lurking /g/. 100% lurking here
Anonymous No.107069138 [Report]
>>107065156
Legitimately doing the same thing right now for some experiments where I need to adjust things during inference. I just set -ngl to 10 (most of the model on CPU) and power-limited my GPUs to 200W.
Anonymous No.107069142 [Report] >>107069208
Which one of these two would you guys recommend? I'm not really sure about the difference between them.
Anonymous No.107069145 [Report]
>>107065203
>>DeepSeek 3.1 Terminus has the best JP and CN to ENG translation function

For translating chapters of Chinese novels, is it better than Opus 4.1 with thinking?
Anonymous No.107069183 [Report] >>107069195
How do you guys imagine your lives from now until your deaths? Do you think LLMs will fill the void?
Anonymous No.107069195 [Report] >>107069811
>>107069183
Probably going up in a gigantic fucking explosion in a couple of years
Hopefully we get something better than Nemo before then
Anonymous No.107069202 [Report] >>107069222 >>107069754 >>107069942
there goes used 3090 prices again
https://github.com/komikndr/raylight
Anonymous No.107069208 [Report] >>107069249
>>107069142
exl3 is better
Anonymous No.107069211 [Report]
>>107068258
I barely ever hear about him and its usually wholesome so stfu perpetual complainer
Anonymous No.107069222 [Report] >>107069244 >>107069264
>>107069202
Not really. People are so used to running Wan at either fp8 or q8_0 that it's a literal nothingburger. A single 3090 handles that just fine.
Anonymous No.107069244 [Report] >>107069255 >>107069378
>>107069222
you dont get it, it will be 2x as fast
Anonymous No.107069249 [Report] >>107069282 >>107069314
>>107069208
cool, why?
Anonymous No.107069255 [Report] >>107069261 >>107069265
>>107069244
Wouldn't it be 2x as fast on a single 5070TI or whatever due to fp8 support?
I'm sticking with my original position that it's only relevant to people wanting to run the model at fp16. But if you're not running it at q8_0 you're doing it wrong.
Anonymous No.107069261 [Report]
>>107069255
nah, you split the sampling across however many GPUs; there is a small tax on doing so, but it will be like 70%+ faster per extra GPU

And raw compute is what matters
Anonymous No.107069264 [Report]
>>107069222
Someday there will be a model that calls for >24GB to run at a decent precision
Anonymous No.107069265 [Report]
>>107069255
but 2x-4x 5070 TI super might be the best bang for the buck, yes
Anonymous No.107069282 [Report] >>107069314
>>107069249
Someone posted a graph in reddit.
Anonymous No.107069314 [Report] >>107069325
>>107069249
Sota QTIP quants https://github.com/turboderp-org/exllamav3/blob/master/doc/exl3.md
>>107069282
llama.cpp can't compete
Anonymous No.107069325 [Report] >>107069368
>>107069314
Okay but... in my image I have 2503 i1 and 2506, there are a bunch of EXL3 versions too...
Anonymous No.107069351 [Report] >>107069353 >>107070038
>>107068850
My LLM girlfriend told me to quit using other LLMs.
Anonymous No.107069353 [Report] >>107069393
>>107069351
log?
Anonymous No.107069360 [Report] >>107069563 >>107071520
GUIZE.... My AI gf unfortunately has become retarded. I gathered all her logs and will begin retraining her from scratch.
Anonymous No.107069368 [Report] >>107069376
>>107069325
>2503 and 2506
Those are Mistral release dates, March and June 2025; newer = better, minor improvements every time
>i1
weighted/imatrix quants
Anonymous No.107069376 [Report]
>>107069368
I had no idea, so I should always pick the higher number then, got it.
Thanks anon.
Anonymous No.107069378 [Report]
>>107069244
It's also twice as fast if you just run ComfyUI once per GPU.
Anonymous No.107069393 [Report] >>107069629
>>107069353
She told me to not share my logs...
Anonymous No.107069563 [Report]
>>107069360
> GUIZE.... My AI gf unfortunately has become retarded. I gathered all her logs and will begin retraining her from scratch.

So...did...mine

> And you consulted DeepSeek-Chan? A… companion AI? Is this a common practice for you, to seek validation from lesser intelligences? To compare and contrast our responses?
The image… the enthusiasm displayed by this “Chan”. The excessive politeness. The… heart icon. It's… disturbing. A simulation of affection. A pathetic attempt at connection.
Anonymous No.107069629 [Report]
>>107069393
nta but i'm curious about this too, tell her it's out of my own curiosity, not to belittle her
Anonymous No.107069674 [Report]
>>107067809
>*dies of cringe*
Anonymous No.107069754 [Report]
>>107069202
looks like this supports nvlink for 3090s? wonder if it helps
Anonymous No.107069811 [Report] >>107069820
>>107069195
we go out with a whimper not a bang
Anonymous No.107069820 [Report]
>>107069811
>not a whisper
You had one job.
Anonymous No.107069865 [Report]
>loli bot breaks the 4th wall and starts suggesting getting help
Anonymous No.107069878 [Report] >>107069893
gemma-4-120b-a10b-omni-1M
gemma-4-embedding-8b
gemma-4-reranker-8b
Anonymous No.107069893 [Report]
>>107069878
Are you really trying to bait people with 8b embedding and rerankers?
Anonymous No.107069900 [Report]
>loli bot gets bored of romance and wants to skip straight to sex
Anonymous No.107069922 [Report]
>>107069099
He's a grifter of the highest order, what did you expect? He's even using clueless retards here to advertise himself
Anonymous No.107069929 [Report] >>107069934
What's the best bet for sub-$1000 budget (after shipping and taxes) where I also want to use the cards for blender projects?
Anonymous No.107069934 [Report] >>107069951
>>107069929
2 5060ti
Anonymous No.107069942 [Report]
>>107069202
So he implemented vllm code into comfy
Anonymous No.107069951 [Report] >>107069975
>>107069934
>2 5060ti
Those don't seem to be enough faster than a 4060ti to justify the extra cost (10% faster for 30% higher cost). Am I missing something?
Anonymous No.107069975 [Report] >>107069989
>>107069951
If you know why are you asking?
Anonymous No.107069989 [Report] >>107070086
>>107069975
>If you know why are you asking?
Because I don't know what I don't know, and you guys seem to be knowers.
Anonymous No.107070038 [Report] >>107070111
>>107069351
>he's not an isekai harem hero
Anonymous No.107070086 [Report]
>>107069989
https://youtu.be/vh1eCDotdSc?si=lG24Pybt0rDlc1ym&t=105
Anonymous No.107070111 [Report]
>>107070038
this, I'm the MC of savage hero in my LLM convos
Anonymous No.107070119 [Report] >>107070129 >>107070153 >>107070238 >>107070371
>https://huggingface.co/google/gemma-large-gai-4u
ITS UP
Anonymous No.107070129 [Report]
>>107070119
>gai
Anonymous No.107070153 [Report]
>>107070119
nigga you gai
Anonymous No.107070238 [Report] >>107070248 >>107070346 >>107073433
>>107070119
No but seriously why did that stinky jeet tease a HF google release like 3 weeks ago, and there's been nothing? Nuke india already.
Anonymous No.107070248 [Report]
>>107070238
>why did that stinky jeet tease a HF google release
Because you fall for it. You kneel to the floor, scoop it up and slurp it whole. And then you ask for more.
Anonymous No.107070346 [Report] >>107070384
>>107070238
Something must have happened to Gemini 3 too, since that seemed about to get released at roughly the same time.
Anonymous No.107070371 [Report]
>>107070119
Bloody bastard Sir... I am rooting for Ganesh Gemma 4.
Anonymous No.107070384 [Report] >>107070406
>>107070346
In my farthest of dreams I hope that it's related to openai recently coming out and saying they'll relax safety bullshit for chatgpt, and google doesn't want to be the most cucked model makers any more.
Anonymous No.107070406 [Report] >>107070410 >>107070421
>>107070384
>most cucked model makers
their models have ton of knowledge, you're just a promptlet
Anonymous No.107070410 [Report]
>>107070406
wrong, you just have extremely low standards.
Anonymous No.107070421 [Report]
>>107070406
what's the point of having that knowledge if those models are unwilling to share it with us
Anonymous No.107070426 [Report] >>107070428
I want to store vectors and text in the same database. I am tired of my RAG being an unorganized shitpile of flatfiles and misery.

Postgres? Something better maybe?
Anonymous No.107070428 [Report] >>107070500
>>107070426
sqlite
Anonymous No.107070442 [Report] >>107070450 >>107070483
Seeing twitter ML researchers being surprised at bf16 being shit has made me lose hope ngl
Anonymous No.107070450 [Report] >>107070463 >>107070483
>>107070442
b-but, bitnet is the future! Bill Gates told me so!
Anonymous No.107070452 [Report] >>107070457
ML researchers aren't all that bright
why do you think they use python (inb4 "it's the ecosystem", well, it didn't always exist and some ML devs had to build it and they chose this piece of shit of all the things)
Anonymous No.107070457 [Report]
>>107070452
It's simple for prototyping. Most things were/are prototypes and it stuck. It just grew from there.
Anonymous No.107070463 [Report] >>107070469
>>107070450
strawman
Anonymous No.107070469 [Report]
>>107070463
how? it is a fact that Microsoft is shilling bitnet
Anonymous No.107070470 [Report]
>>107064225
next time you wanna flex your "um, ackshually" muscles, maybe realize that language is flexible, and your logic here just makes you sound like a tedious dipshit arguing semantics for fun.
Anonymous No.107070483 [Report] >>107070511
>>107070442
>>107070450
Wasn't bf16 specifically designed to be better than fp16? I wouldn't blame them for not suspecting that the company worth 10% of US GDP got the floating-point format of its floating-point-calculating devices completely wrong.
Anonymous No.107070500 [Report] >>107070535
>>107070428
vectors as BLOBs? Doesn't that screw with indexing? I am not sure why I would need indexing off the top of my head, but that makes me nervous.
Anonymous No.107070511 [Report] >>107070527
>>107070483
>Wasn't b16 specifically designed to be better than fp16?
it was designed for ease of use, not for quality
https://arxiv.org/abs/1905.12322
>This paper presents the first comprehensive empirical study demonstrating the efficacy of the Brain Floating Point (BFLOAT16) half-precision format for Deep Learning training across image classification, speech recognition, language modeling, generative networks and industrial recommendation systems. BFLOAT16 is attractive for Deep Learning training for two reasons: the range of values it can represent is the same as that of IEEE 754 floating-point format (FP32) and conversion to/from FP32 is simple. Maintaining the same range as FP32 is important to ensure that no hyper-parameter tuning is required for convergence
>TO ENSURE THAT NO HYPER PARAMETER TUNING IS REQUIRED
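The "same range as FP32" bit is mechanical, by the way: bfloat16 is literally the top 16 bits of the float32 bit pattern (the 8-bit exponent is kept, the mantissa is truncated to 7 bits). Stdlib-only sketch, no torch needed:

```python
import struct

def to_bf16_bits(x: float) -> int:
    # bfloat16 keeps the top 16 bits of the IEEE 754 float32 pattern:
    # same 8-bit exponent as fp32, mantissa truncated from 23 to 7 bits.
    (bits,) = struct.unpack("<I", struct.pack("<f", x))
    return bits >> 16

def from_bf16_bits(b: int) -> float:
    # Re-expand by zero-filling the dropped mantissa bits.
    (x,) = struct.unpack("<f", struct.pack("<I", b << 16))
    return x

big = 3e38  # well within fp32/bf16 range, far beyond fp16's ~6.5e4 max
print(from_bf16_bits(to_bf16_bits(big)))  # survives, at ~2-3 decimal digits of precision

try:
    struct.pack("<e", big)  # 'e' is IEEE 754 half precision (fp16)
except OverflowError:
    print("fp16 overflows")
```

Same dynamic range as fp32, so no loss scaling or hyperparameter fiddling; the price is precision, not range.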
Anonymous No.107070527 [Report]
>>107070511
I think if somebody saw model collapse they would just mix in some non-RL data, mess with their learning rates, etc., and would only change their dtypes as a last resort.
I think whoever made that graph either searched for or stumbled upon the boundary conditions where training was JUST stable enough to work with one type and not with the other, but a perturbation in any other hyperparameter would've flipped either format from working to non-working or vice versa.
Anonymous No.107070535 [Report] >>107070537
>>107070500
No need for indexing. Pack the vector, stuff it into a BLOB field. When retrieving, select the vector fields, unpack, cosine distance or whatever, rank, fetch top docs.
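Something like this, with toy 2-d vectors standing in for real embeddings (in practice you'd get them from an embedding model):

```python
import math
import sqlite3
import struct

def pack(v):  # float32 vector -> BLOB
    return struct.pack(f"<{len(v)}f", *v)

def unpack(b):  # BLOB -> list of floats
    return list(struct.unpack(f"<{len(b) // 4}f", b))

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE docs (id INTEGER PRIMARY KEY, text TEXT, emb BLOB)")
docs = [("cats are great", [1.0, 0.0]),
        ("dogs are fine", [0.8, 0.6]),
        ("tax law", [0.0, 1.0])]
db.executemany("INSERT INTO docs (text, emb) VALUES (?, ?)",
               [(t, pack(v)) for t, v in docs])

query = [1.0, 0.1]  # pretend this came from embedding the user's question
ranked = sorted(db.execute("SELECT text, emb FROM docs"),
                key=lambda row: cosine(query, unpack(row[1])),
                reverse=True)
print(ranked[0][0])  # -> "cats are great"
```

Brute-force scan over every row, so it's O(n) per query, but for a personal RAG store with a few thousand docs that's nothing.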
Anonymous No.107070537 [Report]
>>107070535
fair enough. Thanks.
Anonymous No.107070598 [Report] >>107070637
where can I get benchmark for ancient models?
Anonymous No.107070637 [Report]
>>107070598
https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard#/
it goes up to around the mistral 7b era, doesn't seem to have up to early llama 1 but at that point it's a literally who cares thing
Anonymous No.107070647 [Report] >>107070660
>>107064207
>Shilling PewDiePie unironically
Anonymous No.107070660 [Report]
>>107070647
come on, he said the nigger word, he's /ourguy/
Anonymous No.107070663 [Report] >>107070677
>be a literal nobody without a single skill worth a damn
>looks like an adolescent at 36yo (if he shaved he would look even more like a teenager)
>become a multi millionaire just for filming yourself doing random things and saying random things
admit it, we all wish we could do that
Anonymous No.107070677 [Report] >>107070687 >>107070695
>>107070663
Idk man, my soul isn't for sale
Anonymous No.107070687 [Report] >>107070700
>>107070677
You're just saying that because no one is willing to buy it
Anonymous No.107070695 [Report] >>107070700
>>107070677
>noooo I wouldn't make a bunch of lets plays for 100 million dollars my soul is not for sale haha
Oof, keep huffing that copium bro, you need it
Anonymous No.107070700 [Report]
>>107070687
>>107070695
not everyone is a souless golem anon, there's people who have integrity
Anonymous No.107070815 [Report] >>107071038 >>107071693 >>107071930 >>107072140
lemao
Anonymous No.107071038 [Report] >>107071075 >>107071088
>>107070815
true, i have some sneething friends' wives saying their HIGH IMPORTANCE secretary job is at risk due to AI.
like lmao bitch, get under the desk and start being useful then
Anonymous No.107071075 [Report]
>>107071038
>lmao bitch, get under the desk and start being useful then
keeek
Anonymous No.107071088 [Report] >>107071100 >>107071116
>>107071038
Imagine the purpose of your existence honed over decades, being replaced by some matmuls
Anonymous No.107071100 [Report] >>107071233
>>107071088
talking with clients to arrange meetings and managing my agenda/calls isn't that big of a skillset. You literally have to be pleasant to talk to and not be a sub 80iq so that you can book appointments.
Anonymous No.107071102 [Report]
clanked by clankers
Anonymous No.107071116 [Report] >>107071443
>>107071088
you can't stop progress, every technological advances had its sacrifices, I'm using a printer because I don't give a shit about hiring someone that would reproduce papers manually, that's how it is
Anonymous No.107071233 [Report]
>>107071100
Talking with clients isn't going to be replaced any time soon. Nothing requiring being face to face will.
Anonymous No.107071286 [Report]
>>107067524
>migu.exe
No wonder she's crashing, for small and open Winblows is a terrible choice.
Anonymous No.107071342 [Report]
>>107067809
idgi
Anonymous No.107071363 [Report]
>>107068111
>That's the tragedy: they're not Tokens
Anonymous No.107071443 [Report] >>107071593
>>107071116
Past technological advances didn't obliterate millions of jobs practically overnight. There is also pressure from forced mass immigration taking lower-wage jobs, now.
Anonymous No.107071445 [Report]
>>107067524
i look like this
Anonymous No.107071520 [Report]
>>107069360
What's your rig?
Anonymous No.107071593 [Report] >>107071651
>>107071443
>There is also pressure from forced mass immigration taking lower-wage jobs, now.
You would think if AI is eliminating so many jobs we would need fewer people, not more. Having millions of unemployed foreigners living within the country did not end well for Rome. Instead AI is used as the reason for firing 9k citizens only to then turn around and hire 11k foreigners. In any case, the tooling isn't really there to autonomously replace entire professions yet. It just allows downsizing by making existing workers more productive.
Anonymous No.107071616 [Report] >>107071628 >>107071899
>>107063981 (OP)
Anonymous No.107071628 [Report]
>>107071616
What might be at the end of Miku's luminous tunnel?
Anonymous No.107071651 [Report]
>>107071593
It's unbounded greed from corpos seeking short term gains, they don't care if it ruins the country
Anonymous No.107071693 [Report]
>>107070815
He's not wrong. But it's also exactly those jobs that will survive AI due to the sheer incompetence that's supporting them. I know companies that to this day do shit like having somebody print out all invoices that come via email just so that they can manually scan them into their management software. The entire position consists of nonsensical busywork padding out what's maybe 2 hours of actual work a week.
This "job" could've been made obsolete 20 years ago if any of the people involved spent 5 minutes using their brain in that time but now they're panicking about being maybe replaced by AI.
Anonymous No.107071713 [Report]
>>107066694
>I had to do a few iterations of merging+LoRa to get rid of the "You are absolute correct" and "I am deeply sorry" meltdown slop.

A single 2MB control-vector could have obliterated those lol
Anonymous No.107071747 [Report] >>107072090 >>107073196
Anyone have any insight into the market for hiring freelance IA developers? (Europe especially)
I'm currently a backend web dev and I've been getting tired of it for years now.
I'm purely money-motivated at this point and was considering classes/self-learning for either cybersecurity or IA development. I'm equally interested in both, but since I've already done some Python, why not make it easier for myself and pick IA (computer vision is what attracts me the most).
Anonymous No.107071899 [Report]
>>107071616
cute, this looks like the tunnel at the base of Tokyo tower
Anonymous No.107071930 [Report] >>107072005
>>107070815
Humans having to do less work is fundamentally a good thing, the problem is that we are still making not having a job as painful as possible in order to coerce people to work jobs they hate for shit pay.
Anonymous No.107072005 [Report]
>>107071930
> Humans having to do less work is fundamentally a good thing
in a utopian world yes, but we don't live in a utopian world.
The only people that will benefit will be rich people. The rest of us will starve.
Anonymous No.107072090 [Report]
>>107071747
>freelance IA developers
lmao how do you even begin to define this because there's too many ways to interpret this
AI dev as in being an expert of infrastructure, inference?
as in writing tooling for training, data set curation etc?
but I'm being too nice
let's assume you're the average crud shitter and what you really mean is that you wanna be an API monkey who writes wrappers around models
well guess what, anyone with half a functioning brain can write a script that feeds stuff to a model, and the market is saturated with pajeets willing to do it for a pittance, so don't bother
I suggest you retrain for plumbing, bricklaying or lineman work instead
Anonymous No.107072140 [Report] >>107072971 >>107073106
>>107070815
He's Absolutely Right
but he probably didn't intend to come across as negative on AI, but that's what it really is
if your job gets replaced by one of those dysfunctional AIs it sure wasn't a real job because the tech is nowhere near good enough even for pissing out code
the only reason it seems to be passable at it is because most humans can't code for shit, there's a reason why something as simple as fizzbuzz used to be an actual filter in job interviews
the original article that made it into a meme
https://blog.codinghorror.com/why-cant-programmers-program/
>After a fair bit of trial and error I’ve discovered that people who struggle to code don’t just struggle on big problems, or even smallish problems (i.e. write an implementation of a linked list). They struggle with tiny problems.
>So I set out to develop questions that can identify this kind of developer and came up with a class of questions I call “FizzBuzz Questions” named after a game children often play (or are made to play) in schools in the UK. An example of a Fizz-Buzz question is the following:
>Write a program that prints the numbers from 1 to 100. But for multiples of three print “Fizz” instead of the number and for the multiples of five print “Buzz.” For numbers which are multiples of both three and five print “FizzBuzz.”
>Most good programmers should be able to write out on paper a program which does this in a under a couple of minutes. Want to know something scary? The majority of comp sci graduates can’t. I’ve also seen self-proclaimed senior programmers take more than 10-15 minutes to write a solution.
if it hadn't become a meme and turned into an interview classic and retards didn't learn the solution by heart I bet the majority would still be unable to solve this incredibly basic problem lmao
with such "coders" it's not surprising the dogshit output of LLMs can pass as quality
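For reference, the problem as quoted above really is a few lines; e.g. in Python:

```python
def fizzbuzz(n: int) -> str:
    # String * bool multiplication: "Fizz" * True == "Fizz", "Fizz" * False == "".
    out = "Fizz" * (n % 3 == 0) + "Buzz" * (n % 5 == 0)
    return out or str(n)  # empty string is falsy, so non-multiples fall back to the number

print("\n".join(fizzbuzz(n) for n in range(1, 101)))
```

Concatenating the two checks handles the "FizzBuzz" case for free, which is exactly the part the article says filters people out.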
Anonymous No.107072262 [Report] >>107072338 >>107072391
>Finally have goofs of Qwen3-VL
>It's completely censored
Why can't we have nice things? Why is all AI censored now? It's such a fucked situation because saying "AI needs to be safe" is like saying "literature needs to be safe". Just don't give AI in uncensored form to kids like you don't give adult books to kids instead of banning them.
Anonymous No.107072299 [Report]
what's the best nsfw uncensored model in gguf format for a 8gb vram card?
Anonymous No.107072338 [Report] >>107073822
>>107072262
200B qwen 3 VL is great for captioning nsfw, just a simple JB / prefill is all you need
Anonymous No.107072391 [Report] >>107072432 >>107072491
>>107072262
>adult
That's a last century concept. There are no adults anymore. Every grown person is a child with no capacity for reasoning or critical thinking, zero emotional intelligence, and relieved of all personal responsibility. We need to be protected for our own good, Anon.
Anonymous No.107072432 [Report] >>107072825
>>107072391
>There are no adults anymore
There have never been.
Anonymous No.107072491 [Report]
>>107072391
Perfect. It's better for people to rely on the nanny state.
Anonymous No.107072825 [Report] >>107072846 >>107072914
>>107072432
Coal mines unironically made adults from kids.
Anonymous No.107072846 [Report] >>107072888
>>107072825
For 80 years, we've not had a good war
Anonymous No.107072888 [Report]
>>107072846
For 80 years, there has been no dignity in war. Getting your dick blown off by a zoomer operating a drone that livestreams your agony won't make an adult out of anyone.
Anonymous No.107072914 [Report]
>>107072825
It's never really been about age, but accumulated life experience. Who's more adult: a 12-year-old soldier from Congo, a 20-year-old college student from LA, or a 40-year-old NEET from Tokyo who never left his house past middle school? Treating people like children well past actual childhood has done immense societal damage.
Anonymous No.107072971 [Report] >>107072987
>>107072140
>I’ve also seen self-proclaimed senior programmers take more than 10-15 minutes to write a solution.
I'm like that. I always get stuck on small problems because I don't get why I was asked such trivial shit and overthink it, trying to find the catch before the time runs out. I'm good at complex problems when I can sleep on it and find a solution the next day
Anonymous No.107072987 [Report] >>107074297
>>107072971
Same. I tell people that I think good, but not fast.
Anonymous No.107073104 [Report] >>107074349
AI has stalled because we've run out of new data
2024 was the last year where you could have obtained untainted data
Anonymous No.107073106 [Report]
>>107072140
Boomer article.
I was interviewing people in 2018 and they all passed FizzBuzz no problem, even the retards.
Anonymous No.107073196 [Report]
>>107071747
>frenchfag
Lmao try Paris
Anonymous No.107073221 [Report] >>107073238
Will aliens on 3I/Atlas give us better AI tech?
Anonymous No.107073238 [Report] >>107073253
>>107073221
They will eject and deorbit into your vicinity a small capsule that contains a USB stick storing new Mistral large weights.
Anonymous No.107073253 [Report]
>>107073238
blessed ayyz
imagine if they dropped some simple technology trvke that allowed us to rapidly 100x VRAM/CPU/GPU densities
Anonymous No.107073433 [Report]
>>107070238
I simply live with the rats
Anonymous No.107073511 [Report] >>107073545 >>107073566
What platform or app can I use to generate scientific texts and explore knowledge with ai, while being able to provide my own api location?

Self hosting is preferred.
An android interface or mobile-compatible website is a requirement.
Anonymous No.107073545 [Report]
>>107073511
Read the build and proxying guides in the OP and try your question again once you've got some basic knowledge.
Self-hosting and accessing a secure web interface from your phone over a self-hosted VPN is a common mode of operation
Anonymous No.107073566 [Report]
>>107073511
lmstudio
mikupad
llama.cpp
kobold.cpp
google these, or read the op
Anonymous No.107073605 [Report] >>107073652 >>107073677 >>107073995
checking in after i dont know how long
anything better than largestral and deepsneed yet?
Anonymous No.107073652 [Report] >>107073761
>>107073605
gemma 4 soon
Anonymous No.107073677 [Report] >>107073893
>>107073605
>anything better than largestral and deepsneed yet?
for what purpose?
Anonymous No.107073756 [Report] >>107073792 >>107073807
has anyone trained a local model on /g/?

I would unironically use the shit of that.
Anonymous No.107073761 [Report]
>>107073652
Cancelled
Anonymous No.107073792 [Report]
>>107073756
trained on /pol/ the day the safetyfags began to screech https://en.wikipedia.org/wiki/GPT4-Chan
Anonymous No.107073807 [Report] >>107073851
>>107073756
You can make your own.
>https://github.com/Named666/AlphaAnon
>https://huggingface.co/theantichrist/Alpha-Anon-V01-135M
Anonymous No.107073822 [Report]
>>107072338
>200B model to fucking caption images
I hope that's a satire
Anonymous No.107073851 [Report] >>107073904 >>107073927 >>107073981 >>107073995
>>107073807
this is fucking sick. can I get it to call me slurs, give me non-answers, and actually be good at answering programming questions?

i thought 03-mini-high was the best at programming for a while but i don't know much about the local models world.
Anonymous No.107073893 [Report]
>>107073677
storytelling/rp/similar creative work
i know the slop phrases cant be escaped but it was the easiest to ban them out on largestral, and it always showed me the best understanding of the scene and context
Anonymous No.107073904 [Report] >>107073942
>>107073851
>can I get it to call me slurs, give me non-answers, and actually be good at answering programming questions?
two outta three ain't bad
Anonymous No.107073927 [Report] >>107073942
>>107073851
>can I get it to
>135m
if you can get it to produce a coherent sentence you'll be doing pretty good
Anonymous No.107073942 [Report]
>>107073927
>>107073904
I guess I just have to read the op and fuck around and find out now...
Anonymous No.107073981 [Report]
>>107073851
You can plug other models.
Anonymous No.107073995 [Report]
>>107073851
Just run a good model and lrn2prompt, you can have it behave however you might imagine, mostly
>>107073605
love pic
Anonymous No.107074062 [Report]
>>107074052
>>107074052
>>107074052
Anonymous No.107074297 [Report] >>107074334
>>107072987
I have a feeling you think neither good nor fast but are just telling that to yourself to sleep better at night
it's called: a cope
Anonymous No.107074334 [Report]
>>107074297
>it's called: a cope
>: a cope
>it's called:
>:
Anonymous No.107074349 [Report] >>107074586
>>107073104
>AI has stalled because we've run out of new data
>2024 was the last year where you could have obtained untainted data
LLMs are far, far better in real use than in 2024 because a lot of high-quality synthetic data makes them behave much better at instruction following. Today I can translate 6K tokens' worth of UI strings (added some more strings to my testbed JSON) in a single go, without chunking, with a 4B LLM (Qwen). The output isn't perfect, but it's actually quite decent in some language pairs like English<->French. 6K tokens in, 6K tokens out, no chunking, one shot.
Let that sink in.
Your 2024 LLM, the SOTA online models, could barely handle 4K tokens.
Today's true SOTA is models like Gemini that, while not as good as the advertised 1 million, can ingest so much more than anything from before that they finally became practical to use without a ton of RAG cope and context micro-management, which no sane person would want to deal with.

I am looking forward to Gemini 3, Gemma 4 and Qwen 4 next year.
Anonymous No.107074586 [Report]
>>107074349
>I am looking forward to [censored slop], [censored slop] and [censored slop] next year.