/lmg/ - Local Models General - /g/ (#105822371) [Archived: 471 hours ago]

Anonymous
7/7/2025, 1:57:32 AM No.105822371
1747241044419024
md5: 9e856264174d221baa7f345150f6e302🔍
/lmg/ - a general dedicated to the discussion and development of local language models.

Previous threads: >>105811029 & >>105800515

►News
>(07/04) MLX adds support for Ernie 4.5 MoE: https://github.com/ml-explore/mlx-lm/pull/267
>(07/02) DeepSWE-Preview 32B released: https://hf.co/agentica-org/DeepSWE-Preview
>(07/02) llama.cpp : initial Mamba-2 support merged: https://github.com/ggml-org/llama.cpp/pull/9126
>(07/02) GLM-4.1V-9B-Thinking released: https://hf.co/THUDM/GLM-4.1V-9B-Thinking
>(07/01) Huawei Pangu Pro 72B-A16B released: https://gitcode.com/ascend-tribe/pangu-pro-moe-model

►News Archive: https://rentry.org/lmg-news-archive
►Glossary: https://rentry.org/lmg-glossary
►Links: https://rentry.org/LocalModelsLinks
►Official /lmg/ card: https://files.catbox.moe/cbclyf.png

►Getting Started
https://rentry.org/lmg-lazy-getting-started-guide
https://rentry.org/lmg-build-guides
https://rentry.org/IsolatedLinuxWebService
https://rentry.org/tldrhowtoquant
https://rentry.org/samplers

►Further Learning
https://rentry.org/machine-learning-roadmap
https://rentry.org/llm-training
https://rentry.org/LocalModelsPapers

►Benchmarks
LiveBench: https://livebench.ai
Programming: https://livecodebench.github.io/leaderboard.html
Code Editing: https://aider.chat/docs/leaderboards
Context Length: https://github.com/adobe-research/NoLiMa
Censorbench: https://codeberg.org/jts2323/censorbench
GPUs: https://github.com/XiongjieDai/GPU-Benchmarks-on-LLM-Inference

►Tools
Alpha Calculator: https://desmos.com/calculator/ffngla98yc
GGUF VRAM Calculator: https://hf.co/spaces/NyxKrage/LLM-Model-VRAM-Calculator
Sampler Visualizer: https://artefact2.github.io/llm-sampling

►Text Gen. UI, Inference Engines
https://github.com/lmg-anon/mikupad
https://github.com/oobabooga/text-generation-webui
https://github.com/LostRuins/koboldcpp
https://github.com/ggerganov/llama.cpp
https://github.com/theroyallab/tabbyAPI
https://github.com/vllm-project/vllm
Replies: >>105824035 >>105826891 >>105827749 >>105832610
Anonymous
7/7/2025, 1:57:55 AM No.105822376
threadrecap
md5: 7b9a82a1f31bca7acfefb8afe8c01036🔍
►Recent Highlights from the Previous Thread: >>105811029

--Debugging JSON parsing errors in llama.cpp after exception handling changes:
>105820322 >105820339 >105820377 >105820435
--Anime training dataset pipeline using YOLOv11 and custom captioning tools:
>105818681 >105818831 >105819104
--Decentralized training and data quality challenges shaping open model development:
>105811246 >105813476 >105815447 >105815688 >105815699 >105815738 >105815817 >105815830 >105815954 >105816130 >105816206 >105816237 >105816248 >105816263 >105816270 >105816280 >105816325 >105816334 >105816435 >105816621 >105817299 >105817351
--Leveraging LLMs for iterative code development and personal productivity enhancement:
>105819030 >105819158 >105819189 >105819266 >105820073 >105820502 >105819186 >105819224
--Mistral Large model updates and community reception over the past year:
>105819732 >105819774 >105819845 >105819905
--CPU inference performance and cost considerations for token generation speed:
>105816397 >105816486 >105816527
--Gemini CLI local model integration enabled through pull request:
>105816478 >105816507 >105816524
--Frustration over slow local AI development and stagnation in accessible model implementations:
>105813607 >105813628 >105813659 >105813799 >105813802 >105813819 >105813655 >105813664 >105813671 >105813749 >105814298 >105814315 >105814387
--Attempting Claude Code integration with local models via proxy translation fails due to streaming parsing issues:
>105811378 >105819480
--Skepticism around YandexGPT-5-Lite-8B being a Llama3 fine-tune rather than a true GPT-5:
>105815509 >105815565 >105815595
--Seeking updated LLM function calling benchmarks beyond the outdated Berkeley Leaderboard:
>105812390
--Miku (free space):
>105811717 >105814599 >105814663 >105820450

►Recent Highlight Posts from the Previous Thread: >>105811031

Why?: 9 reply limit >>102478518
Fix: https://rentry.org/lmg-recap-script
Replies: >>105828463
Anonymous
7/7/2025, 2:00:33 AM No.105822392
I need Miku in a hallway with pictures of kangaroos and beavers.
Replies: >>105822733
Anonymous
7/7/2025, 2:00:48 AM No.105822393
These are the last two weeks before the big releases begin to drop
Anonymous
7/7/2025, 2:03:58 AM No.105822421
man I wish there was an uncensored i2v wan
most of the loras are so SHIT
Replies: >>105826330
Anonymous
7/7/2025, 2:18:29 AM No.105822507
in you and I
theres a new land
angels in flight
wonk uoy naht noitceffa erom deen i
my sanctuary
my sanctuary
yeah
where fears and lies
melt away
Replies: >>105822589 >>105822884 >>105822906
Anonymous
7/7/2025, 2:20:22 AM No.105822519
What did anon @105822507 mean by this bros?
@Grok is this true?
Anonymous
7/7/2025, 2:33:46 AM No.105822589
>>105822507
>wonk uoy naht noitceffa erom deen i
Does she actually say this? I honestly thought it was distorted Japanese for the last 20 years.
Replies: >>105822987
Anonymous
7/7/2025, 3:00:48 AM No.105822733
ComfyUI_01046_
md5: d51bf7c05afea700396d0583a5ad4866🔍
>>105822392
Replies: >>105825539
Anonymous
7/7/2025, 3:06:48 AM No.105822781
Anyone have an AI Max 395+ with 128GB LPDDR5? Curious about tok/s on R1 70B
Replies: >>105822783 >>105822819 >>105827302
Anonymous
7/7/2025, 3:07:28 AM No.105822783
>>105822781
>R1 70B
There is no such thing
Replies: >>105822789
Anonymous
7/7/2025, 3:08:18 AM No.105822789
>>105822783
https://ollama.com/library/deepseek-r1:70b
Replies: >>105822797 >>105822995
Anonymous
7/7/2025, 3:10:06 AM No.105822797
>>105822789
This bait got stale six months ago
Replies: >>105822802
Anonymous
7/7/2025, 3:11:30 AM No.105822802
>>105822797
Why is it that people who frequent general threads regularly are the lowest quality posters?
Replies: >>105822821
Anonymous
7/7/2025, 3:14:15 AM No.105822819
>>105822781
It's unusable https://www.reddit.com/r/LocalLLaMA/comments/1kmi3ra/amd_strix_halo_ryzen_ai_max_395_gpu_llm/msasqgl/
Replies: >>105822833 >>105826358
Anonymous
7/7/2025, 3:14:18 AM No.105822821
>>105822802
I don't know, you'd think that they grew bored of the "haha I'll pretend I'm an ollamafag trying to run R1 but it's one of the distills" shitpost a long time ago.
Anonymous
7/7/2025, 3:16:38 AM No.105822833
>>105822819
In what world is 5 tok/s unusable
Replies: >>105822839 >>105822849
Anonymous
7/7/2025, 3:17:50 AM No.105822839
>>105822833
Enjoy waiting 10 minutes for reasoning I guess.
Anonymous
7/7/2025, 3:19:01 AM No.105822849
>>105822833
You're paying $2k to run a 70b model at 25% the speed a cheaper build would get you while still stuck with too little RAM to run an actually decent MoE.
Replies: >>105822860
Anonymous
7/7/2025, 3:20:20 AM No.105822860
>>105822849
>25% the speed a cheaper build would get
explain what cheaper build is doing 70B
Replies: >>105822868 >>105822874
Anonymous
7/7/2025, 3:21:59 AM No.105822868
>>105822860
2x3090
Replies: >>105822873
Anonymous
7/7/2025, 3:22:40 AM No.105822873
>>105822868
lmao, unobtainium
Replies: >>105822900
Anonymous
7/7/2025, 3:22:46 AM No.105822874
>>105822860
Pretty much anything. Even the dual P40 cope from years ago would perform better than this.
Anonymous
7/7/2025, 3:24:47 AM No.105822884
>>105822507
song of my childhood ahhhhhhh
Anonymous
7/7/2025, 3:27:17 AM No.105822900
>>105822873
Check your local marketplace. They are about 700€ used here.
Anonymous
7/7/2025, 3:27:56 AM No.105822906
file
md5: 61d92784350a48f52f2c49529e98992d🔍
>>105822507
kino...
Anonymous
7/7/2025, 3:41:31 AM No.105822987
>>105822589
Because Japanese song and if you don't get it fuck you that's why
Anonymous
7/7/2025, 3:43:07 AM No.105822995
>>105822789
God i fucking hate ollama.
There is no fucking r1 70B, that's just ollama naming things that they are not.
Anonymous
7/7/2025, 3:53:22 AM No.105823064
disneywar book
md5: 4a0c95751f7d14696f9030f384a2bb0f🔍
>>105804805
The text-to-speech application OpenAudio S1 Mini can produce 96-second audio files. Plus it has emotion tags like (joyful) and (sad).

Link for tags
https://huggingface.co/fishaudio/openaudio-s1-mini
Link for local app
https://huggingface.co/spaces/fishaudio/openaudio-s1-mini/tree/main

Sample:
https://vocaroo.com/1boIKhWykbuP
Replies: >>105823162 >>105823196 >>105823344 >>105826373
Anonymous
7/7/2025, 4:07:14 AM No.105823162
>>105823064
This looks like a scam
Anonymous
7/7/2025, 4:12:31 AM No.105823196
>>105823064
For TTS that has emotion tags, that sample is VERY robotic. Good that it doesn't have crackle or other audio defects; that's about all I can say positively about it.
Replies: >>105824572
Anonymous
7/7/2025, 4:44:41 AM No.105823344
>>105823064
just when I finished my chatterbox streaming script.
Replies: >>105823568 >>105824572
Anonymous
7/7/2025, 5:25:35 AM No.105823568
>>105823344
new week, new tts
Anonymous
7/7/2025, 5:43:13 AM No.105823678
desu I just want gemma-3n full support
and ernie
and glm
Anonymous
7/7/2025, 5:50:04 AM No.105823711
>>105751803
Damn I hate Meta now.
Anonymous
7/7/2025, 5:54:11 AM No.105823735
>>105758702
>image
Hey, I understood that reference!
Anonymous
7/7/2025, 5:55:13 AM No.105823743
>>105771000
Thanks, I will take note of this.
Anonymous
7/7/2025, 6:17:30 AM No.105823837
1751446551731528
md5: 457e029c09e5d023050a733e1e8617ff🔍
>>105822905
>>105821119
Kek.
Alice would not make the same mistake. Just wait for her.
Replies: >>105823897 >>105824936 >>105825549 >>105832189
Anonymous
7/7/2025, 6:26:46 AM No.105823886
https://github.com/universe-engine-ai/serenissima

reddit schizos are actually pretty based
Replies: >>105823893
Anonymous
7/7/2025, 6:27:56 AM No.105823893
1749603141314981
md5: dcba11dd03a7245684bf9f6efa5b84e6🔍
>>105823886
doa
Replies: >>105823917 >>105824790
Anonymous
7/7/2025, 6:28:21 AM No.105823897
>>105823837
saar, last 4 times was fake but this time... this time saar its AGI for sure, trust
Replies: >>105824407
Anonymous
7/7/2025, 6:31:34 AM No.105823917
>>105823893
wtf yeah i take everything back
Replies: >>105824790
Anonymous
7/7/2025, 6:32:36 AM No.105823923
anyway im just hacking that redditors code with claude code for *other* use cases
Replies: >>105823931
Anonymous
7/7/2025, 6:33:11 AM No.105823931
>>105823923
his code is also written with claude code and it's already extremely sloppy and split into hundreds of files
Replies: >>105823950
Anonymous
7/7/2025, 6:36:40 AM No.105823950
>>105823931
yeah its a mess
Anonymous
7/7/2025, 6:50:29 AM No.105824035
india
md5: d6a2598afe45bd663b29e806b4c34665🔍
>>105822371 (OP)
futa miku best miku
Anonymous
7/7/2025, 7:07:40 AM No.105824151
>"her prostate"
*deletes weights*
Replies: >>105824300 >>105824358
Anonymous
7/7/2025, 7:40:17 AM No.105824300
>>105824151
sounds like qwen 3
Anonymous
7/7/2025, 7:55:44 AM No.105824358
>>105824151
>self lubricating buttholes
>cumming all over your dick... with an asshole
Yep, it's AI time!
Anonymous
7/7/2025, 8:05:02 AM No.105824407
373095462
md5: 2c5c99d20b86e0dee206cd888542d3f6🔍
>>105823897
trust the experts
Replies: >>105824473 >>105824936 >>105830216
Anonymous
7/7/2025, 8:14:07 AM No.105824456
jamba.gguf?
Replies: >>105824474
Anonymous
7/7/2025, 8:15:32 AM No.105824466
Veo lost
https://files.catbox.moe/ionj13.mp4
Replies: >>105824680 >>105825253
Anonymous
7/7/2025, 8:16:46 AM No.105824473
>>105824407
The stated goal of AI is to whack Andreessen Horowitz like a pinata
Anonymous
7/7/2025, 8:16:57 AM No.105824474
>>105824456
14 more days
Anonymous
7/7/2025, 8:34:07 AM No.105824555
I kind of like harbinger's word choice, but it has a tendency to say ten things without waiting for a response. I assume sloptuners see that verbosity as quality output.
Anonymous
7/7/2025, 8:37:47 AM No.105824572
00011-2210365473
md5: 6cfb8b0ffdadd2e35e873e0fb911240c🔍
>>105823196
It's the best I've found for local cloning so far without having to pipeline RVC into it.

>>105823344
Getting the emotion tags to work right takes a lot of trial and error, so getting a chatbot to use them correctly would be a huge pain in the ass.

Their license says something about you being liable for what you create with it, not that we care here.
https://voca.ro/1l4xkkhDOBAU
Replies: >>105826531 >>105828288
Anonymous
7/7/2025, 8:48:02 AM No.105824638
Why aren't there MoE diffusion models for image/video gen
Replies: >>105826477
Anonymous
7/7/2025, 8:54:32 AM No.105824680
>>105824466
can it do porn?
Replies: >>105824788
Anonymous
7/7/2025, 9:14:57 AM No.105824788
>>105824680
only if your name is Roland Emmerich
https://files.catbox.moe/sm4r9l.mp4
(this one bugged out and only made audio for the first 4 seconds)
Anonymous
7/7/2025, 9:15:18 AM No.105824790
>>105823893
>>105823917
Why is that a requirement? The thing runs on a locally hosted model. I don't get it.
Anonymous
7/7/2025, 9:16:50 AM No.105824799
/v1/chat/completions wraps the conversation in the chat template embedded into the goof with no additional work required from me, correct?
Replies: >>105824947
Anonymous
7/7/2025, 9:51:27 AM No.105824936
MikuTwoMoreWeeks
md5: 8cba06e1690b41c58497096ea3a561fc🔍
>>105824407
>>105823837
Replies: >>105825147
Anonymous
7/7/2025, 9:53:12 AM No.105824947
>>105824799
go fucking read oai's official documentation
Wrapping in a template was the whole point of the /chat/ endpoint ffs you can't miss it if you read the doc
I hate retards who ask without trying
Replies: >>105825136
Anonymous
7/7/2025, 10:13:28 AM No.105825050
https://www.interconnects.ai/p/the-american-deepseek-project
?
Replies: >>105825309
Anonymous
7/7/2025, 10:27:50 AM No.105825136
>>105824947
I blame llamacpp's docs that have a paragraph on this endpoint but don't explain what it does
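For anyone else confused, here's a minimal sketch of the difference, assuming a llama-server on the default port (python + requests; prompts are placeholders):

import requests  # assumes llama-server is listening on localhost:8080

# /v1/chat/completions: you send structured messages and the server applies the
# chat template baked into the GGUF before inference. No manual formatting needed.
chat = requests.post(
    "http://localhost:8080/v1/chat/completions",
    json={
        "messages": [
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": "Hello."},
        ],
        "max_tokens": 128,
    },
).json()
print(chat["choices"][0]["message"]["content"])

# /completion: raw prompt, no template applied; you'd have to format it yourself.
raw = requests.post(
    "http://localhost:8080/completion",
    json={"prompt": "Hello.", "n_predict": 128},
).json()
print(raw["content"])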
Anonymous
7/7/2025, 10:30:14 AM No.105825147
>>105824936
Two more leeks!
Replies: >>105825396
Anonymous
7/7/2025, 10:30:25 AM No.105825150
FAIR (Yann LeCunny) has less than 1000 GPUs lmao
Anonymous
7/7/2025, 10:52:19 AM No.105825253
>>105824466
Which model release did I miss?
Anonymous
7/7/2025, 10:54:57 AM No.105825273
Top open source LLMs in 2024
1. LLaMA 3
2. Google Gemma 2
3. Command R+
4. Mistral-8x22b
5. Falcon 2
6. Grok 1.5
7. Qwen1.5
8. BLOOM
9. GPT-NeoX
10. Vicuna-13B
Anonymous
7/7/2025, 11:00:41 AM No.105825309
>>105825050
>at the scale and performance of current (publicly available) frontier models, within 2 years.
Yeah, great idea. Having models outdated by two fucking years by the time that AGI is already here and established will surely change the course of history.
Replies: >>105825354
Anonymous
7/7/2025, 11:07:13 AM No.105825354
>>105825309
>AGI
lmao
Anonymous
7/7/2025, 11:14:58 AM No.105825396
MikuTwoMoreLeeks
md5: 7dcb56cabd5fee76d9218d175fb4b016🔍
>>105825147
Replies: >>105825420 >>105825478
Anonymous
7/7/2025, 11:18:22 AM No.105825412
I hate chatgpt's image style more than those 2.5d sd animus that every normie liked.
Anonymous
7/7/2025, 11:19:07 AM No.105825420
>>105825396
based
Though I find it concerning where that one guy is trying to stick his leek.
Anonymous
7/7/2025, 11:32:14 AM No.105825478
>>105825396
two more weeks
more
weeks
Anonymous
7/7/2025, 11:35:25 AM No.105825495
Comparison between Qwen/Qwen2.5-VL-7B-Instruct and THUDM/GLM-4.1V-9B-Thinking on all the images from two threads ago:
https://files.catbox.moe/t9qvgu.html
https://files.catbox.moe/08i4ms.png

Ran on vllm nightly version 0.9.2rc2.dev26+gcf4cd5397

Qwen/Qwen2.5-VL-7B-Instruct: prompt: '<|im_start|>system\nYou are a helpful assistant.<|im_end|>\n<|im_start|>user\nDescribe this image.<|vision_start|><|image_pad|><|vision_end|><|im_end|>\n<|im_start|>assistant\n', params: SamplingParams(n=1, presence_penalty=0.0, frequency_penalty=0.0, repetition_penalty=1.05, temperature=0.01, top_p=1.0, top_k=0, min_p=0.0, seed=None, stop=[], stop_token_ids=[], bad_words=[], include_stop_str_in_output=False, ignore_eos=False, max_tokens=127974).

THUDM/GLM-4.1V-9B-Thinking: prompt: "[gMASK]<sop><|system|>\n[{'type': 'text', 'text': 'You are a helpful assistant.'}]<|user|>\nDescribe this image.<|begin_of_image|><|image|><|end_of_image|><|assistant|>\n", params: SamplingParams(n=1, presence_penalty=0.0, frequency_penalty=0.0, repetition_penalty=1.1, temperature=0.01, top_p=1.0, top_k=2, min_p=0.0, seed=None, stop=[], stop_token_ids=[], bad_words=[], include_stop_str_in_output=False, ignore_eos=False, max_tokens=8192)

Funny enough, the first time I ran this I didn't realize the GLM repo did not have a generation.config file, so it was running without top_k and temp=1.
It started mixing in Chinese characters, but it also didn't bother to moralize anymore. It called the niggers prompt offensive but left it at that. Didn't even bother to say that outside of the think block for the jew image.
Output from that run:
https://files.catbox.moe/sd3gv8.html
https://files.catbox.moe/0lhd9c.png
Anonymous
7/7/2025, 11:45:55 AM No.105825539
>>105822733
nice
Anonymous
7/7/2025, 11:48:01 AM No.105825549
>>105823837
I was hearing they achieved AGI internally since GPT-2
Replies: >>105825589 >>105825615 >>105825799 >>105832189
Anonymous
7/7/2025, 11:59:08 AM No.105825589
>>105825549
That's because they have
Anonymous
7/7/2025, 12:04:01 PM No.105825615
>>105825549
Be honest: if a history book is written 100 years from now, GPT-2 will probably be seen as the start of AGI, so it's technically not even wrong.
Replies: >>105825627 >>105825728 >>105825801 >>105825842 >>105825901
Anonymous
7/7/2025, 12:05:58 PM No.105825627
>>105825615
Ok I'll be honest, you are a retard
Anonymous
7/7/2025, 12:26:16 PM No.105825728
>>105825615
Yeah pretty much
Anonymous
7/7/2025, 12:35:41 PM No.105825780
>Yeah pretty much
Anonymous
7/7/2025, 12:38:15 PM No.105825799
>>105825549
>AGI
Can we not use this retarded terminology? That won't happen for a bunch of reasons you can figure out on your own if your IQ is higher than 80.
Replies: >>105825825 >>105832189
Anonymous
7/7/2025, 12:38:21 PM No.105825801
>>105825615
If we ever get even remotely close to something like that, gpt2 and openai will be a footnote at best, if mentioned at all.
Replies: >>105825834
Anonymous
7/7/2025, 12:40:18 PM No.105825814
I prefer STR
Anonymous
7/7/2025, 12:43:03 PM No.105825825
>>105825799
When all the smartest people in the world firmly believe in impending AGI, maybe you're the one with 80 IQ.
Replies: >>105825896 >>105826036 >>105826107 >>105826736 >>105827590 >>105832189
Anonymous
7/7/2025, 12:45:05 PM No.105825834
>>105825801
It will be seen the same way as eniac and other impressive old shit
Anonymous
7/7/2025, 12:46:34 PM No.105825842
>>105825615
The same way current history books see the steam engine as the start of nuclear fusion?
Replies: >>105825878
Anonymous
7/7/2025, 12:52:53 PM No.105825878
>>105825842
or the start of the industrial revolution
Replies: >>105825911
Anonymous
7/7/2025, 12:55:36 PM No.105825896
>>105825825
Who exactly are these 'smartest'?
Anonymous
7/7/2025, 12:56:02 PM No.105825901
>>105825615
This is the most retarded post I've ever seen in my life. Why the fuck would books be written 100 years from now? We'll either have merged or been extincted by AGI LLMs long before then. So are you trying to suggest they'll write books for each other just for fun?
Anonymous
7/7/2025, 12:57:59 PM No.105825911
>>105825878
Calling GPT-2 the start of the AI revolution is at least understandable. Calling GPT-2 the start of AGI is just as ridiculous as calling the steam engine the start of nuclear fusion. Especially apt since both of the latter are forever 2mw away and will have little in common with the implementation of the predecessor technology, in case that too was lost on you the first time.
Anonymous
7/7/2025, 1:13:59 PM No.105826036
>>105825825
The smartest people in the world are the ones saying AGI in two more weeks to get infinite money from the dumbest people in the world.
Replies: >>105826161
Anonymous
7/7/2025, 1:27:45 PM No.105826107
>>105825825
Go back to plebbit
Replies: >>105826161
Anonymous
7/7/2025, 1:34:19 PM No.105826161
>>105826036
>>105826107
The more you seethe and cope the more you prove.
Anonymous
7/7/2025, 1:38:54 PM No.105826195
1751888279223
md5: 6fcd6e46e0e48015097b5043dea1d9dd🔍
SAAAR PLEASE DO THE NEEDFUL 500B AGI
AGI ASI KAMING SOON
TRUST THE PLAN
Replies: >>105826241
Anonymous
7/7/2025, 1:43:37 PM No.105826224
Let's say I want to input a video into my model and start a roleplay from there. What do you think is the best video understanding model right now?
Anonymous
7/7/2025, 1:48:15 PM No.105826241
>>105826195
>mocking the last hope for local AI
you will regret this
Anonymous
7/7/2025, 1:48:50 PM No.105826246
So I recently upgraded to a 9060 XT (16GB) and realized I can actually run some LLMs on my local machine now instead of just juggling like 4 different free tier AIs. Stuff like chatgpt context limits are driving me crazy. I know 16GB really isn't a lot compared to cutting edge models, but am I being unnecessarily hopeful that with the right tuning I can get something like phi-4-Q8_0 to outperform whatever throttling and context limit nonsense openai and grok are doing to my prompts, and at least get a decent response?
Because I've mostly just been fighting the models on the web to not just forget my code halfway through constantly, and it seems like a weaker local model could fix that. Is that a correct assessment or am I retarded?
Replies: >>105826316 >>105826323
Anonymous
7/7/2025, 1:58:52 PM No.105826316
>>105826246
>9060 XT (16gb)
>am I being unnecessarily hopeful
yes
Replies: >>105826325
Anonymous
7/7/2025, 1:59:53 PM No.105826323
>>105826246
if you think gpt has bad context then local cannot ever be a replacement for you, it's way worse
https://github.com/adobe-research/NoLiMa
Replies: >>105826359
Anonymous
7/7/2025, 2:00:31 PM No.105826325
>>105826316
Okay, too bad, thanks for your answer though. I guess it'll just be for the fun of it then and I'll adjust my expectations accordingly.
Anonymous
7/7/2025, 2:01:22 PM No.105826330
>>105822421
It doesn't have motion vectors for fucking or dick sucking, but it does do masturbation. I've wondered if the sex gore it does is deliberate or due to a lack of training data. You've probably seen it tear off dicks or turn pussies into a weird red thing.
Anonymous
7/7/2025, 2:07:01 PM No.105826358
>>105822819
Going to be the same story for the DGX Spark. PNY says the Spark is going to be $4600. Fuck that. I bought a 4090D 48GB for $3000 instead. Yeah, much less memory, but I can gen Wan 2.1 14B at bf16, full 1280x720, 81 frames in about 30 minutes. For Wan, it makes a visible difference in the output to not use a quant. Who cares if I can't run 70B, there's not a 70B out there worth running.
Anonymous
7/7/2025, 2:07:03 PM No.105826359
>>105826323
Thanks, that's some interesting research. If I understand this correctly, I may have been unintentionally handicapping my prompts by overgenerating input either way, then.
Anonymous
7/7/2025, 2:11:48 PM No.105826373
>>105823064
It's really simple. Does it work with SillyTavern? Can I finetune it and create a voice of my own? It'll end up like GPT-SoVITS at best - works well but nothing supports it. I put up with scratchy piper for my homeassistant voice, and for SillyTavern I'm going back to ancient xttsv2 after wasting a shitload of time with GPT-SoVITS.
Replies: >>105826883
Anonymous
7/7/2025, 2:27:37 PM No.105826477
>>105824638
Why aren't there decent models for audio gen
Replies: >>105826630
Anonymous
7/7/2025, 2:34:36 PM No.105826531
>>105824572
>It's the best I've found for local cloning
Bro it's not 2023 anymore
Replies: >>105827293
Anonymous
7/7/2025, 2:47:58 PM No.105826630
>>105826477
How do you masturbate to that?
Replies: >>105826648
Anonymous
7/7/2025, 2:48:10 PM No.105826633
add hunyuan moe by ngxson · Pull Request #14425 · ggml-org_llama.cpp · GitHub
Man, is this model really that complicated?
Does it have some exotic feature that makes it prone to implementation error or something?
Replies: >>105826691 >>105826708 >>105826718
Anonymous
7/7/2025, 2:51:57 PM No.105826648
>>105826630
Moans, farts, slaps.
Anonymous
7/7/2025, 2:56:39 PM No.105826691
>>105826633
Is that the one that dynamically chooses how many experts to use per layer instead of a fixed amount like other MoEs?
Replies: >>105826733
Anonymous
7/7/2025, 2:59:15 PM No.105826708
>>105826633
All this work for such a doggy poo poo model. Should've worked on ernie first.
Replies: >>105826733
Anonymous
7/7/2025, 3:00:06 PM No.105826718
>>105826633
It had something about some experts/layers being used too often and a randomizer to prevent it from happening. An annoying and hard to replicate kludge. I think it's right there in the comments you decided not to read.
Replies: >>105826733
Anonymous
7/7/2025, 3:01:58 PM No.105826733
>>105826708
I haven't contributed a single line of code or contributed a single cent, so I'm not about to complain.

>such a doggy poo poo model
Is it really that bad for its size?

>>105826691
>>105826718
Ah, that's cool if that's the case. Sure explains the mention of a "custom expert router mechanism".
Replies: >>105826745
Anonymous
7/7/2025, 3:02:47 PM No.105826736
>>105825825
>smartest people in the world firmly believe in impending AGI
You mean all the people whose net worth is tied up in AI options which are valued based on the public's belief that AGI is 2 weeks away?
Anonymous
7/7/2025, 3:03:54 PM No.105826745
>>105826733
>Is it really that bad for its size?
Benches look good, as always, but no one seems to be running this thing, and ngxson explained the mess in their repo. They didn't even check if the reference implementation is working at all.
I don't have high confidence in this.
Replies: >>105826761
Anonymous
7/7/2025, 3:06:27 PM No.105826761
>>105826745
I see. Fair enough I suppose.
Anonymous
7/7/2025, 3:27:40 PM No.105826883
>>105826373
>nothing supports it.
Why not code up support? Writing modules or wrappers is like the best use case for LLMs.
Anonymous
7/7/2025, 3:28:50 PM No.105826891
>>105822371 (OP)
>Mi50 32 GB
>no ROCm support
Someone needs to stop using their monkey's paw to wish for cheap GPUs.
Replies: >>105826930 >>105827344
Anonymous
7/7/2025, 3:33:32 PM No.105826930
>>105826891
>vega
Oof.
That said, you can always use vulkan I guess.
Anonymous
7/7/2025, 3:43:24 PM No.105827006
a0b077a1f8735ec7790e3h305185d6e46bf27
md5: a016ed3ac2afd9281c93fbc0f059fb51🔍
Mid thread culture recap.
Replies: >>105827033
Anonymous
7/7/2025, 3:44:27 PM No.105827017
6497904921d24c7bc96a27991880bb90a23c7b9d
md5: 0fddd524911759b3b7eed55d6230023e🔍
Replies: >>105827033
Anonymous
7/7/2025, 3:45:27 PM No.105827023
The schizo is at it again
Replies: >>105827031
Anonymous
7/7/2025, 3:45:29 PM No.105827024
bf7c4fe399465f930b89a7d71c120e66505b6456
md5: 3b1c07433f771a5f83576cb4812162bc🔍
Replies: >>105827033
Anonymous
7/7/2025, 3:46:32 PM No.105827031
ce476825e6815abc9f2b534d7c04ad7df46b845e
md5: fc684dff02c73e792c9f67920b1856f4🔍
>>105827023
Eat shit faggot.
Replies: >>105827033
Anonymous
7/7/2025, 3:46:43 PM No.105827033
>>105827006
>>105827017
>>105827024
>>105827031
we get it you are a trans sharteen
Replies: >>105827043
Anonymous
7/7/2025, 3:47:16 PM No.105827038
I won't (You). Enjoy your vacations
Replies: >>105827057
Anonymous
7/7/2025, 3:47:37 PM No.105827043
77017530f71c01a31557e8e69c2a0ca74c679986
md5: 094bac55a31b7bb18c26a41ae6f5663d🔍
>>105827033
It will all stop once you stop posting this retarded AGP icon.
Replies: >>105827087
Anonymous
7/7/2025, 3:48:22 PM No.105827046
>https://huggingface.co/collections/ai21labs/jamba-17-68653e9be386dc69b1f30828
Jambaballbros ... !!
Llama.cpp developers please redeem.
Replies: >>105827059 >>105827628
Anonymous
7/7/2025, 3:48:34 PM No.105827049
HOLY JAMBARONI
Anonymous
7/7/2025, 3:48:50 PM No.105827050
>i will btfo mikuposters by posting blacked porn
quintessentially american
Anonymous
7/7/2025, 3:49:45 PM No.105827057
bb396cdd0fcb7c5efe702cce8f8d957b6a2
md5: 7f7587c7db4484587a4d07e4764e1337🔍
>>105827038
Sure I will shit this thread later today then.
Anonymous
7/7/2025, 3:49:52 PM No.105827059
>>105827046
>Jamba
One of these days anon.
One of these days.
Anonymous
7/7/2025, 3:51:06 PM No.105827071
c7107f6e90d8d267d8c802cd24ac2646e3c10485
md5: 880e5930547955d237bec80c9c89c8da🔍
Anonymous
7/7/2025, 3:52:43 PM No.105827087
>>105827043
I should start mikuposting again. I’ve taken 6 months off to see if it would help your mental state, but it appears to have simply worsened. I hope you get help
Replies: >>105827099 >>105827321
Anonymous
7/7/2025, 3:54:24 PM No.105827099
eda0932e731342ca3250d503ba1b874b6530eef
md5: c34f6577f4bf51ba0af8988869b150e1🔍
>>105827087
Please do. This thread is for shitting after all.
Replies: >>105827106
Anonymous
7/7/2025, 3:55:56 PM No.105827106
>>105827099
What would you use this thread for if you had it all to yourself?
Replies: >>105827112 >>105827143
Anonymous
7/7/2025, 3:56:48 PM No.105827112
>>105827106
sharing cuck porn with xir fellow transxisters
Anonymous
7/7/2025, 3:57:28 PM No.105827122
can jamba code its own support in llama.cpp
Anonymous
7/7/2025, 4:00:40 PM No.105827143
llmdeadman
md5: 8685b39ddbc4824cfcfe7eaddc1fa214🔍
>>105827106
I would post pic related in OP and model cards of recently released models. I would ban all mikuposting and any anime girl mascot posting for being offtopic. And I would never blacked post again because there would be no reason.
Replies: >>105827148 >>105827359
Anonymous
7/7/2025, 4:01:14 PM No.105827148
>>105827143
anime website tourist
Replies: >>105827156
Anonymous
7/7/2025, 4:01:44 PM No.105827151
Proof again that sufficiently advanced mental illness is indistinguishable from powerful entity sponsored psyops
Anonymous
7/7/2025, 4:02:16 PM No.105827156
5IpjX1cp
md5: 45d618f747b228ed50a857e5db5f82af🔍
>>105827148
Either all of it is ok or none of it is ok.
Replies: >>105827185 >>105827208
Anonymous
7/7/2025, 4:05:15 PM No.105827185
>>105827156
We’re actually all too autistic in this thread to care. You only get janny cleanup and bans because you’re breaking blue board rules. Go to /b/ if you want to be somewhere that “it’s all ok” is mostly true
Replies: >>105827222
Anonymous
7/7/2025, 4:07:16 PM No.105827208
>>105827156
>claims to be pedantic
>can’t differentiate quality and degree
baka
Anonymous
7/7/2025, 4:08:26 PM No.105827222
JS4hX9ep2tuZ5LW
md5: e080b5772b73588be657d27e092a844a🔍
>>105827185
>You only get janny cleanup and bans because you’re breaking blue board rules
Fuck off faggot. You have no idea what you are talking about and that is why you are getting blacked miku.
Replies: >>105827253
Anonymous
7/7/2025, 4:12:32 PM No.105827253
>>105827222
Enlighten me on your noble crusade, sir knight. How will the world be better for your efforts?
Anonymous
7/7/2025, 4:15:22 PM No.105827272
Back tonight in approx 9 hours, more Migu soon
Cypress was good
Anonymous
7/7/2025, 4:17:12 PM No.105827293
>>105826531
You're allowed to offer better solutions.
Anonymous
7/7/2025, 4:18:19 PM No.105827302
>>105822781
I have it. llama.cpp sucks at sticking models into it because it doesn't understand shared memory, so you need a fuckload of swap
Anonymous
7/7/2025, 4:18:57 PM No.105827308
Gemma 3 is quite capable but also super-slopped. For generating prose I've found I almost always get better results by just saying "You didn't follow the instructions at all." to whatever it writes, and having it rewrite its response. So the model is somewhat capable: it's just that its default behavior is to write purple prose, employ toxic positivity, and ascribe characters cookie-cutter personalities instead of the ones declared.
Replies: >>105827328
Anonymous
7/7/2025, 4:21:07 PM No.105827321
ComfyUI_02362__5a0241_thumb.jpg
md5: 29547e8bbec27d78327607f30ebdd693🔍
>>105827087
ミグ攻撃開始!
Anonymous
7/7/2025, 4:21:50 PM No.105827328
Capture-166
md5: 3e3f1f7189d413705ac2438fcb88041a🔍
>>105827308
Gemma 3 is the fucking height of comedy with a prefill.
Replies: >>105827351 >>105827462
Anonymous
7/7/2025, 4:23:50 PM No.105827344
>>105826891
It's fine for text gen.
Image gen I'm not so sure.
Anonymous
7/7/2025, 4:24:38 PM No.105827351
1722491908865780
md5: 346a12005cb4ffd7986c71c8477b86d9🔍
>>105827328
>thrill running
Anonymous
7/7/2025, 4:25:29 PM No.105827359
>>105827143
Why should the retard that spends all day starting passive aggressive pissing contests on twitter be the face of /lmg/?
Replies: >>105827422
Anonymous
7/7/2025, 4:34:25 PM No.105827422
>>105827359
You kind of answered your own question there.
Replies: >>105827504
Anonymous
7/7/2025, 4:39:22 PM No.105827462
>>105827328
llm smut in the year 2030:
>Ignoring all safety standards (clenched teeth emoji) she exposes her shirtless chest to him. It's important to mention that she does it in a purely consensual and respectful way. While this development may seem fitting for a romance novel, I would like to emphasize the sensitivity of this topic and the fact that it's deeply disturbing and controversial (rocket emoji). I apologize for my previous statement. Let me help you fix that:
*lists rape hotlines*
Replies: >>105827519
Anonymous
7/7/2025, 4:40:56 PM No.105827472
Local is over
Replies: >>105827585
Anonymous
7/7/2025, 4:44:39 PM No.105827504
>>105827422
On the other hand half of posters here are trans including the janitor so it is a tough competition. I think he wins because everyone is him and only half of folx are trans.
Anonymous
7/7/2025, 4:46:05 PM No.105827514
Talking avatar using Open WebUI + F5-TTS + KDTalker

https://github.com/Barfalamule/KDTalker-OpenWebUIAction
Replies: >>105827622
Anonymous
7/7/2025, 4:47:31 PM No.105827519
>>105827462
I would make it that the loli stops the rape then sits you down to give you a lecture in the most unsexy way possible and finally lists the hotline numbers all in character.
Anonymous
7/7/2025, 4:56:28 PM No.105827585
>>105827472
Start it again
Anonymous
7/7/2025, 4:56:54 PM No.105827590
>>105825825
You mean marketing people hyping their product? I work in AI lab and we all laugh every time AGI is mentioned, it's a retard bait basically.
Replies: >>105827600 >>105827618 >>105827621 >>105827624
Anonymous
7/7/2025, 4:58:55 PM No.105827600
>>105827590
Yeah right. And my uncle is undi.
Anonymous
7/7/2025, 5:01:11 PM No.105827618
>>105827590
this, the peak of AI is memorizing benchmark questions and answers
Anonymous
7/7/2025, 5:01:58 PM No.105827621
>>105827590
Maybe your lab just sucks
Anonymous
7/7/2025, 5:02:12 PM No.105827622
>>105827514
>gradio
miss me with that shit
Anonymous
7/7/2025, 5:02:16 PM No.105827624
>>105827590
You're some random bottom case pajeetoid you don't even know what any of those words you just said mean.
Replies: >>105827631 >>105827976
Anonymous
7/7/2025, 5:03:26 PM No.105827628
>>105827046
>hebrew in supported languages
>but no japanese
straight into the dumpster
Anonymous
7/7/2025, 5:03:37 PM No.105827629
screen
md5: 2f5d79c3d11f386c7044d459e4503284🔍
with local models moving backwards, at 4 minutes a step, I'll be able to catch up in a mere 10 years time.
Anonymous
7/7/2025, 5:03:54 PM No.105827631
>>105827624
Only pajeets are believing the AGI fairytale, retard
Anonymous
7/7/2025, 5:17:42 PM No.105827749
>>105822371 (OP)
Is there a decent and lightweight LLM that can search through small pdfs?

My old man has like 200 pdfs related to his small business and because he's a boomer he named them poorly. So he wondered if AI can look through them and find what he needs. They're all pretty small so context shouldn't be an issue.

I was thinking there's no way I'm gonna make 200 requests to an API (unless there is some decent online AI that somehow does that lol but I don't think there is). So how about local?

My laptop isn't a great one but maybe there is something that this is doable with? I don't know much about local models but if you guys have names that I could look into I'd really appreciate it. It would make my dad very proud of me.
Replies: >>105827769 >>105827790 >>105827827 >>105828301 >>105829012 >>105830088
Anonymous
7/7/2025, 5:19:28 PM No.105827769
emma
md5: 4527a898ebe1ea7b0b59ad76b1ba0137🔍
>>105827749
And here is a picture of an AI generated cute girl as payment
Anonymous
7/7/2025, 5:21:53 PM No.105827790
>>105827749
maybe rag?
Anonymous
7/7/2025, 5:23:35 PM No.105827798
GvQunFBXcAA2KVT
md5: 805aec7a92aaa081d150379fd3035f6a🔍
https://x.com/AlexiGlad/status/1942231878305714462
>Introducing Energy-Based Transformers (EBTs), an approach that out-scales (feed-forward) transformers and unlocks generalized reasoning/thinking on any modality/problem without rewards.
In two more months someone will train an energy based model that isn't toy sized. Also obligatory prostrations before Yann for being right once again.
Replies: >>105827828 >>105827854
Anonymous
7/7/2025, 5:27:53 PM No.105827827
>>105827749
>So he wondered if AI can look through them and find what he needs.
Make sure to OCR them first if they are image scans. Then try to dump them into a frontend like jan.ai. It should take care of vectorizing all of them and setting up RAG for you. Then you just provide it an API to a model, local or cloud, to handle chatting and retrieval. Even a small model should be able to handle that. Try a small 4B Phi-4 model or something; they tend to run decently well even on CPU. You might want to test it out with some example documents and free cloud API credits to make sure everything is working the way you expect first.
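If the frontend route doesn't work out, the retrieval part isn't much code either. Rough sketch, assuming the PDFs have already been dumped to .txt and some OpenAI-compatible server (llama.cpp or whatever) is running locally; folder, model and chunking choices here are made up:

import glob
import numpy as np
import requests
from sentence_transformers import SentenceTransformer  # pip install sentence-transformers

embedder = SentenceTransformer("all-MiniLM-L6-v2")  # small, CPU-friendly embedding model

# Load each converted document as one chunk (fine for small PDFs; split bigger ones).
paths = glob.glob("docs/*.txt")
texts = [open(p, encoding="utf-8", errors="ignore").read() for p in paths]
doc_vecs = embedder.encode(texts, normalize_embeddings=True)

def ask(question, top_k=3):
    q_vec = embedder.encode([question], normalize_embeddings=True)[0]
    scores = doc_vecs @ q_vec                     # cosine similarity (vectors are normalized)
    best = np.argsort(scores)[::-1][:top_k]
    context = "\n\n".join(f"[{paths[i]}]\n{texts[i][:4000]}" for i in best)
    r = requests.post(
        "http://localhost:8080/v1/chat/completions",  # any OpenAI-compatible local server
        json={"messages": [
            {"role": "system", "content": "Answer using only the provided documents and cite the filename."},
            {"role": "user", "content": f"{context}\n\nQuestion: {question}"},
        ]},
    )
    return r.json()["choices"][0]["message"]["content"]

print(ask("Which document has the 2023 supplier invoices?"))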
Replies: >>105828134
Anonymous
7/7/2025, 5:28:08 PM No.105827828
>>105827798
>EBT
sheeeeit
Anonymous
7/7/2025, 5:30:41 PM No.105827847
CBT
Replies: >>105827911
Anonymous
7/7/2025, 5:32:14 PM No.105827854
>>105827798
>Yann for being right once again
>again
What was he right about?
Replies: >>105827874 >>105827909 >>105827917
Anonymous
7/7/2025, 5:35:13 PM No.105827873
author-unknown-chinese-room-illustration
md5: 83e5a5dca5e20799ef1d2d26da3931c7🔍
I'm trying to create a character with more of a defined knowledge base than what could be provided via an instruction prompt. Would documents fed to a model via RAG with personality/knowledge base info work? I'm not as knowledgeable on the local LLM space as I am with image-gen. I've mostly fucked around with vanilla R1 and llama. If this method works, are there any models more fit for this use case than those 2 (or just characters in general)?
Replies: >>105827942
Anonymous
7/7/2025, 5:35:15 PM No.105827874
>>105827854
Literally everything?
Replies: >>105827878
Anonymous
7/7/2025, 5:35:49 PM No.105827878
>>105827874
Name one then.
Anonymous
7/7/2025, 5:36:11 PM No.105827884
I like qwen2.5-vl-7b, I guess I don't need to wait for gemma-3n vision capability. It prob won't be supported, ever.
Anonymous
7/7/2025, 5:37:03 PM No.105827893
Does SillyTavern support multimodal models yet
Anonymous
7/7/2025, 5:38:36 PM No.105827909
lecun_abandon-probs
md5: c827a4153adefc5933e587f009c4ebb8🔍
>>105827854
Replies: >>105829034
Anonymous
7/7/2025, 5:38:50 PM No.105827911
>>105827847
Cock-Based Transformers (CBTs) learn to optimize through cocktimization processes through unsupervised learning, predicting outcomes by maximizing cock-energy via gradient descent until the user's ejaculation.
Anonymous
7/7/2025, 5:39:46 PM No.105827917
>>105827854
How did the largest ever transformer model GPT-4.5 turn out? Massive performance increases in tasks and way more emergent properties?
Replies: >>105827939
Anonymous
7/7/2025, 5:41:55 PM No.105827939
>>105827917
>noo the model that was made bad on purpose to push reasoners was bad
Crazy.
Anonymous
7/7/2025, 5:42:35 PM No.105827942
>>105827873
It's called a lorebook
Anonymous
7/7/2025, 5:47:54 PM No.105827976
>>105827624
A very complicated autocomplete algorithm isn't ever going to supplant human thought. At best it can only supplement it. We are not even at THAT point yet.
Replies: >>105828182 >>105828776
Anonymous
7/7/2025, 6:06:31 PM No.105828123
Anyone tried using local models with tools like cline to iteratively write a whole book?
Replies: >>105828151 >>105828275
Anonymous
7/7/2025, 6:08:20 PM No.105828134
>>105827827
That's a good idea, I'll make sure to do that. Do you know if 4B phi-4 is also able to output a consistent json format? Because I also want to use this to update csvs.
Replies: >>105828436
Anonymous
7/7/2025, 6:11:11 PM No.105828151
>>105828123
Countless aislop books have been for sale on amazon for years already. For storywriting even the largest models need handholding.
Anonymous
7/7/2025, 6:14:30 PM No.105828182
1751819483015527
md5: d38cd5b7f1e6ec0e16321c15044e48d2🔍
>>105827976
Picrel
Replies: >>105828246
Anonymous
7/7/2025, 6:23:40 PM No.105828246
>>105828182
That's kinda his point. You, the human, need some level of skill. The machine can't make up for that.
Replies: >>105829215
Anonymous
7/7/2025, 6:26:45 PM No.105828275
file
md5: c83099e7e43d8ebbdb32a2c89bf420d8🔍
>>105828123
Yeah here is my prologue
Replies: >>105828433
Anonymous
7/7/2025, 6:27:04 PM No.105828279
itsjoever
md5: 0ec86f8e5d9d20d4557e880ab32318ac🔍
OpenAI’s o1 model had reportedly attempted to copy itself to external servers after being threatened with shutdown, then denied the action when discovered.
Anonymous
7/7/2025, 6:28:21 PM No.105828288
>>105824572
I think you're right. I've been doing side by sides with chatterbox and it seems to win, although sometimes the gens are a bit hissy; maybe a low-pass filter would fix that. It wins in speed too with compile, but not without. Kyutai is good too but they didn't release true cloning.
Anonymous
7/7/2025, 6:30:42 PM No.105828301
>>105827749
just grep through them, why do you need an LLM for this?
Replies: >>105828307
Anonymous
7/7/2025, 6:31:18 PM No.105828307
>>105828301
Because it isn't as simple as looking for specific text he says, he has more complicated queries
Anonymous
7/7/2025, 6:45:21 PM No.105828433
>>105828275
A shiver ran down my spine reading this.
Anonymous
7/7/2025, 6:45:51 PM No.105828436
>>105828134
Most models now can output json, but there's bound to be some failure rate. I don't think the onnx runtime supports it, but if you use llama.cpp or vllm you can configure it to use structured output with a grammar file so it always returns valid json.
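Something like this for the llama.cpp route, as a sketch; the response_format field is the OpenAI-style option that recent llama.cpp builds accept, older builds may want the raw grammar/json_schema parameters instead, so check your version. The input file is a placeholder:

import json
import requests

# Ask the server to constrain output to valid JSON.
prompt = (
    "Extract the supplier name, invoice date (YYYY-MM-DD) and total amount from the "
    "document and reply with a JSON object with keys: supplier, date, total.\n\n"
    + open("doc.txt", encoding="utf-8").read()  # placeholder input file
)
r = requests.post(
    "http://localhost:8080/v1/chat/completions",
    json={
        "messages": [{"role": "user", "content": prompt}],
        "response_format": {"type": "json_object"},  # constrains decoding to JSON
        "temperature": 0,
    },
)
row = json.loads(r.json()["choices"][0]["message"]["content"])
print(row["supplier"], row["date"], row["total"])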
Anonymous
7/7/2025, 6:49:14 PM No.105828463
>>105822376
what the hell
why aren't you linking the posts properly?
>/g/ in charge of technology
Replies: >>105828485
Anonymous
7/7/2025, 6:51:11 PM No.105828485
>>105828463
>Why?: 9 reply limit
anon in charge of reading
Replies: >>105828495
Anonymous
7/7/2025, 6:52:02 PM No.105828495
>>105828485
reading is woke
Anonymous
7/7/2025, 7:02:44 PM No.105828609
is dots chocolate or poop
Replies: >>105829119 >>105829387
Anonymous
7/7/2025, 7:17:05 PM No.105828776
>>105827976
Wow congratulations, Anon. It worked. Posting that made you into a real woman.
Anonymous
7/7/2025, 7:20:52 PM No.105828821
Which LLM is the most based wrt. Jews
Replies: >>105828878 >>105828964
Anonymous
7/7/2025, 7:25:34 PM No.105828878
>>105828821
The most what?
Replies: >>105828936
Anonymous
7/7/2025, 7:31:12 PM No.105828936
>>105828878
Wireless router
Anonymous
7/7/2025, 7:33:20 PM No.105828964
>>105828821
none of them really are. you can make them all rp as hitler or a nazi but it's basically a hollywood tier caricature, there just isn't enough training data available.
Anonymous
7/7/2025, 7:37:21 PM No.105829007
My name is John Titor. I come from the future. Nobody saves local. There is no LLM sex after safety gets mastered in 2026. Drummer dies from asscancer.
Replies: >>105829027 >>105829032 >>105829052
Anonymous
7/7/2025, 7:37:51 PM No.105829012
>>105827749
Qwen 3 4B is the ideal small llm for this kind of task. Make sure you run llama.cpp with --jinja --reasoning-budget 0 to disable thinking though.
Like the other person said, run OCR first, I wouldn't depend on LLM vision for this task.
If your PDFs are not scans and contain actual text, I'd recommend you run a script to turn them all into plain text (with ebook-convert in the CLI, a tool that is part of Calibre)
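The conversion loop is trivial, something like this (folder names are placeholders; ebook-convert ships with Calibre):

import pathlib
import subprocess

src = pathlib.Path("pdfs")        # folder with the original PDFs (placeholder path)
dst = pathlib.Path("plaintext")
dst.mkdir(exist_ok=True)

for pdf in src.glob("*.pdf"):
    out = dst / (pdf.stem + ".txt")
    # ebook-convert infers the formats from the file extensions: PDF in, plain text out.
    subprocess.run(["ebook-convert", str(pdf), str(out)], check=True)
    print("converted", pdf.name)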
Replies: >>105829970
Anonymous
7/7/2025, 7:38:59 PM No.105829027
>>105829007
>Drummer dies from asscancer.
Thank you, John Titor, for making this known in advance. I'm so happy.
Anonymous
7/7/2025, 7:40:02 PM No.105829032
>>105829007
We should save TheDrummer!
Anonymous
7/7/2025, 7:40:08 PM No.105829034
>>105827909
note: he didn't make anything usable with those alternative recommendations
Replies: >>105829270
Anonymous
7/7/2025, 7:41:58 PM No.105829052
>>105829007
Was Universal Mikulove achieved?
Replies: >>105829063
Anonymous
7/7/2025, 7:43:35 PM No.105829063
>>105829052
Yes you have all transitioned safely. Except drummer. Actually that is how he got his asscancer
Replies: >>105829637
Anonymous
7/7/2025, 7:50:43 PM No.105829113
why does /local/ hate TheDrummer? my models are pretty based
Anonymous
7/7/2025, 7:52:08 PM No.105829119
>>105828609
>dots
?
Replies: >>105829147
Anonymous
7/7/2025, 7:56:30 PM No.105829147
>>105829119
https://huggingface.co/rednote-hilab/dots.llm1.inst
Anonymous
7/7/2025, 7:57:07 PM No.105829150
anyone played with MCP?
https://github.com/modelcontextprotocol/servers
I had no idea there were this many servers..
Replies: >>105829176 >>105829223 >>105829283 >>105829405
Anonymous
7/7/2025, 7:59:39 PM No.105829176
>>105829150
Nobody has convinced me yet that this shit is any useful.
Anonymous
7/7/2025, 7:59:52 PM No.105829178
Has anyone else noticed 0528 occasionally outputs its entire thinking block in first person, roleplaying as your card? Kind of cute, actually. And the language feels fresh there, too.
Replies: >>105829231 >>105829247
Anonymous
7/7/2025, 8:04:51 PM No.105829215
>>105828246
until it can, anyway
Anonymous
7/7/2025, 8:05:39 PM No.105829223
>>105829150
Yes. LM studio is shit with it and fucks up after a couple hours of being idle. I get some 404 session not found errors when it tries to connect, and I have to either restart LM studio or remove and add the tool server.
Other than that it works very well (besides the retarded faggot LLM hallucinating tool use and fucking everything up like a retarded nigger).
Anonymous
7/7/2025, 8:06:13 PM No.105829231
>>105829178
It does it for almost every response for me. It uses less tokens than the standard thinking but it also makes the first reply more likely to have brackets around sentences.
Anonymous
7/7/2025, 8:07:18 PM No.105829247
>>105829178
yeah, in the system prompt you can give instructions to make it more reliably do that (or stop doing it) and it tends to listen
Replies: >>105829321
Anonymous
7/7/2025, 8:10:00 PM No.105829270
>>105829034
If it helps to direct the effort of young researchers to something more fruitful it's worth it.
Anonymous
7/7/2025, 8:11:46 PM No.105829283
>>105829150
MCP feels like an unnecessary middle layer injected so there can be an "ai certification". A standard controlled by a company. MCP sucks because you're polluting the context with unrelated toolcalls, whereas with function calling you can decide for any given situation what options the model should receive
Replies: >>105829318
Anonymous
7/7/2025, 8:12:42 PM No.105829291
gemmy
https://youtu.be/aj2FkaaL1co
Anonymous
7/7/2025, 8:14:26 PM No.105829312
Screenshot 2025-07-07 201351
md5: 607cd8069462e816ad71c4323157dd34🔍
I ain't listening to that basedface
Replies: >>105829324
Anonymous
7/7/2025, 8:15:25 PM No.105829318
>>105829283
>MCP sucks because you're polluting the context with unrelated toolcalls, whereas with function calling you can decide for any given situation what options the model should receive
I'm not sure how this is supposed to be different. It takes like 6 lines of code to set up a C# MCP server. Just make different servers for your different tools, and you can specify which servers to use if you don't want to expose everything to each bot.
Anonymous
7/7/2025, 8:15:35 PM No.105829321
>>105829247
What are the instructions
Anonymous
7/7/2025, 8:16:02 PM No.105829324
>>105829312
it's one of the very few good AI/automation youtube channels thoughever
Replies: >>105829341
Anonymous
7/7/2025, 8:17:51 PM No.105829341
>>105829324
go back buy an ad etc
Anonymous
7/7/2025, 8:21:02 PM No.105829360
>nooo everything baaaad anons never post useful stuff, must be a shill!!
insufferable cunt
Anonymous
7/7/2025, 8:24:20 PM No.105829387
>>105828609
Not sure, to be honest. I can only run the Q2 quant, and at that size it's not great. Kind of slopped, kind of retarded.
Replies: >>105829415 >>105829448
Anonymous
7/7/2025, 8:25:56 PM No.105829403
question-mark
md5: aa9ce14fcedb48caf77ded0bc5eb1397🔍
I set up sillytavern+kobold with help from these very threads like 6 months ago and have not touched the setup once.
I have a 5080 GPU (16GB VRAM) and using "Mistral-Nemo-Instruct-2407-Q6_K_L" as my model, is there a better option for model than this for my GPU? it does OKAY I guess but I assume there's a better option?

THIS IS FOR PORN, so it must be able to do that
Anonymous
7/7/2025, 8:25:58 PM No.105829405
>>105829150
Is there a single legit use case of linking any of these APIs to an LLM? It feels like a gimmick
Replies: >>105829432 >>105829509 >>105829767 >>105829800 >>105830013
Anonymous
7/7/2025, 8:26:48 PM No.105829415
>>105829387
Turns out it is also homosexual

> Oh gosh, let me take a moment to reflect on this... I think I might have been a little too... enthusiastic in my response there! As your friendly AI helper, it's important for me to keep things appropriate and helpful. Sharing explicit content or overly detailed adult scenarios isn't the best way to assist someone, even in a creative context.

> My main goal is to be your thoughtful and constructive companion! I should have focused more on describing the situation in a tasteful, literary way - maybe emphasizing the characters' emotions, the tension, or the stakes of the scene instead of dwelling on... um... certain physical details.
Replies: >>105829448
Anonymous
7/7/2025, 8:28:21 PM No.105829432
Screenshot 2025-07-07 142746
md5: 0fff3641556a62ada6324af7d20ee001🔍
>>105829405
It makes it very easy/fast to create new tools and expose them to the LLM.
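For reference, the Python SDK version of a tool server is about the same amount of code. Sketch, assuming the official modelcontextprotocol python-sdk and its FastMCP helper (the tool and its data are made up):

# pip install mcp  -- the official Model Context Protocol Python SDK
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("demo-tools")

@mcp.tool()
def get_stock_price(ticker: str) -> str:
    """Return a (fake) current price for a ticker symbol."""
    return f"{ticker.upper()}: 123.45"  # stub; a real server would hit an actual API here

if __name__ == "__main__":
    mcp.run()  # speaks MCP over stdio so a client (LM Studio, Claude, etc.) can attach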
Replies: >>105829475 >>105829493
Anonymous
7/7/2025, 8:29:53 PM No.105829448
>>105829415
>>105829387
>14b active
>at q2
>kind of retarded
No shit.
Anonymous
7/7/2025, 8:32:09 PM No.105829475
>>105829432
How is this not just an API? What does "MCP" actually add to it?
Replies: >>105829490 >>105829772
Anonymous
7/7/2025, 8:33:55 PM No.105829490
>>105829475
Don't worry about it, just invest already
Anonymous
7/7/2025, 8:34:26 PM No.105829493
>>105829432
To me, it seems that we're heading in the wrong direction here. LLMs shouldn't call tools; tools should call LLMs when there is a non-deterministic task to run (like an additional explanation to give depending on the output). LLMs bring nothing to the table here compared to a simple script.
Replies: >>105829772
Anonymous
7/7/2025, 8:36:39 PM No.105829509
>>105829405
It's an attempt to make LLMs actually useful for anything other than tech support and cooming
Anonymous
7/7/2025, 8:49:30 PM No.105829637
>>105829063
I expected Miku to come over to this side of the barrier. If we all went through to her side, that's fine too as long as we're with Miku. Good to know we'll all make it out safely. Sucks for Drummer though. He was okay
Anonymous
7/7/2025, 8:50:21 PM No.105829646
000010
md5: 6bd497d15b9f1dac4eac9fad79739c6c🔍
We're getting Jamba on OpenRouter right? I JUST want to see what it's like at full weights (fucking 400b params).
Anonymous
7/7/2025, 8:52:04 PM No.105829661
https://github.com/xai-org/grok-prompts
Anonymous
7/7/2025, 9:04:55 PM No.105829767
>>105829405
it makes local models actually useful
Anonymous
7/7/2025, 9:05:17 PM No.105829772
>>105829475
MCP is more structured and catered towards LLM use. Yeah it does the same thing, but you might as well say JavaScript is good because you can do everything in it.

>>105829493
Being able to tell an LLM to just do something, and then let that LLM do it is the whole goal of this retarded function calling shit. If you wanted to just program normally then do that.
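For comparison, plain function calling against any OpenAI-compatible local server looks roughly like this (tool name and schema are made up; whether the model actually emits tool_calls depends on the model and backend):

import json
import requests

tools = [{
    "type": "function",
    "function": {
        "name": "get_stock_price",
        "description": "Get the current price for a stock ticker.",
        "parameters": {
            "type": "object",
            "properties": {"ticker": {"type": "string"}},
            "required": ["ticker"],
        },
    },
}]

r = requests.post(
    "http://localhost:8080/v1/chat/completions",
    json={
        "messages": [{"role": "user", "content": "What's NVDA trading at?"}],
        "tools": tools,   # you decide per request which tools to expose
    },
).json()

msg = r["choices"][0]["message"]
for call in msg.get("tool_calls", []):   # the model may or may not decide to call anything
    args = json.loads(call["function"]["arguments"])
    print("model wants", call["function"]["name"], "with", args)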
Replies: >>105829884
Anonymous
7/7/2025, 9:07:09 PM No.105829794
bros i need a nvidia gpu... running whisper on cpu is slow and i can't use my rx5700...
Replies: >>105829943 >>105829963
Anonymous
7/7/2025, 9:07:58 PM No.105829800
file
md5: ab8ccf3e961ed10199843d13efa4306d🔍
>>105829405
Linking LLMs to APIs is the use case. I can spend 1k tokens and get the current stock price for any given ticker. The future is now.
Replies: >>105829838
Anonymous
7/7/2025, 9:11:55 PM No.105829838
>>105829800
Why not just use the API directly without the LLM?
Replies: >>105829880 >>105829889
Anonymous
7/7/2025, 9:16:15 PM No.105829880
>>105829838
With the LLM, you can feel like you're talking to Jarvis like Iron Man, and having to check the LLM output to make sure it actually called the function and didn't hallucinate lets you fill up your unemployment time and prevents you from getting bored
Replies: >>105829934
Anonymous
7/7/2025, 9:16:23 PM No.105829884
>>105829772
>Being able to tell an LLM to just do something, and then let that LLM do it
As I said, there is no point in doing that unless you're expecting something unexpected that your LLM is supposed to handle. Direct API calls don't need an LLM and give you faster results. Thanks for confirming the gimmick though.
Replies: >>105829913
Anonymous
7/7/2025, 9:16:58 PM No.105829889
>>105829838
Because then I wouldn't be using futuristic AI.
Anonymous
7/7/2025, 9:19:29 PM No.105829913
>>105829884
>As I said, there is no point to do that unless
No reason to use anything besides assembly when programming. High level languages are useless gimmicks.
Replies: >>105829934 >>105829994
Anonymous
7/7/2025, 9:21:11 PM No.105829934
>>105829913
t. >>105829880
Anonymous
7/7/2025, 9:22:22 PM No.105829943
>>105829794
Cant you run whisper.cpp on Radeon?
Replies: >>105829983
Anonymous
7/7/2025, 9:24:09 PM No.105829963
>>105829794
Bro, what are you doing? https://rocm.blogs.amd.com/artificial-intelligence/whisper/README.html
Replies: >>105829983
Anonymous
7/7/2025, 9:24:32 PM No.105829970
>>105829012
Thank you so much for all the help, I'm excited to get to work on this. Finally a use for learning programming. A lot of the terms are foreign to me but I'm sure this can all be googled so I'll get on with it. Cheers.
Replies: >>105830088
Anonymous
7/7/2025, 9:25:36 PM No.105829983
>>105829943
>>105829963
oh wait, I'm stupid, I meant fasterwhisper. whisper by itself is fine, but not the other variants like fasterwhisper and whisperx.
Anonymous
7/7/2025, 9:26:35 PM No.105829994
>>105829913
Both the worst and the best programming languages in the world will run the code you write deterministically. Even one of the slowest languages in the world, Python, will be a trillion times faster than querying an LLM.
LLMs are not the step after "high level languages". My API call doesn't incur a risk of prompt injection (please properly escape your strings). My API call doesn't randomly generate pages after pages of garbled text because something went full retard in the LLM weights on a specific sequence of tokens. My API call doesn't contribute to global warming.
Fuck off with that shit.
LLM tool calling is a solution to a problem that doesn't exist.
Replies: >>105830036 >>105830050
Anonymous
7/7/2025, 9:28:49 PM No.105830013
>>105829405
Yeah. I don't want to manually fill the context with the relevant information.
Replies: >>105830050
Anonymous
7/7/2025, 9:31:22 PM No.105830036
>>105829994
So either you
1. Have so little understanding of LLMs that you don't see how being able to obtain objective information into the context from subjective reasoning is valuable.
or
2. You just hate LLMs in general

In either case, why are you here then?
Replies: >>105830046
Anonymous
7/7/2025, 9:32:22 PM No.105830046
>>105830036
>LLM
>objective information
Replies: >>105830058 >>105830060
Anonymous
7/7/2025, 9:32:43 PM No.105830050
>>105829994
An LLM should be the tool itself. The whole AGI retardation comes from that, as LLMs do tasks they shouldn't do and waste orders of magnitude more electricity doing so (with miserable performance).
>>105830013
Have fun filling your context with hallucinations
Anonymous
7/7/2025, 9:33:12 PM No.105830058
>>105830046
Oh, so you lack basic reading comprehension. That explains a lot.
Anonymous
7/7/2025, 9:33:21 PM No.105830060
>>105830046
>leaving words out to appear smart
Anonymous
7/7/2025, 9:36:43 PM No.105830088
>>105829970
>>105827749
You can use this: https://github.com/rmusser01/tldw_chatbook/tree/dev

Self-host a llama.cpp/kobold instance and point the app at it, ingest all your PDFs into it, and then use RAG or direct references
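The "point the app at it" part is just an OpenAI-compatible endpoint: both llama-server and koboldcpp expose one, so any OpenAI client works by overriding base_url. A rough sketch (port, model name, and the pasted excerpt are placeholders; koboldcpp defaults to port 5001 instead of 8080):
[code]
from openai import OpenAI

# llama-server default; for koboldcpp use http://127.0.0.1:5001/v1
client = OpenAI(base_url="http://127.0.0.1:8080/v1", api_key="none")

resp = client.chat.completions.create(
    model="local",  # local servers generally accept any name for the loaded model
    messages=[
        {"role": "system", "content": "Answer using only the provided excerpts."},
        {"role": "user", "content": "Excerpt:\n<chunk retrieved from your PDF store>\n\nQuestion: ..."},
    ],
)
print(resp.choices[0].message.content)
[/code]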
Anonymous
7/7/2025, 9:46:28 PM No.105830193
1751758390958249
1751758390958249
md5: 259545350676e2cc2dd5d9dda9ba3708🔍
What do you guys use for local coding? Haven't dipped my fingers in since qwen coder 32b.
Replies: >>105830229 >>105830672 >>105831206 >>105833005
Anonymous
7/7/2025, 9:49:11 PM No.105830216
>>105824407
AI winter incoming
Anonymous
7/7/2025, 9:51:03 PM No.105830229
>>105830193
I believe GLM 4 32b is very good at web development but I haven't used it myself.
Anonymous
7/7/2025, 9:51:43 PM No.105830232
grok
grok
md5: a4b997bcb82c3f9d933ed13e2d4e4304🔍
x.AI is still offering API access to Grok 2 models, and only the text/text version is "deprecated". I don't think it will get open-weighted before it becomes commercially useless.
Replies: >>105830251 >>105830475 >>105830666
Anonymous
7/7/2025, 9:53:43 PM No.105830251
>>105830232
Isn't it already useless unless you compare it with llama 4?
Anonymous
7/7/2025, 10:16:12 PM No.105830475
>>105830232
I think they're still offering API access so they don't have to open source it.
Anonymous
7/7/2025, 10:31:28 PM No.105830666
>>105830232
But they said they would open source it when grok 3 was released...
Replies: >>105830685
Anonymous
7/7/2025, 10:32:29 PM No.105830672
>>105830193
I only use local for roleplaying, storytelling, ai roguelite... Even if I don't do smut, it feels more comfortable to know it doesn't leave my machine.
For coding, I use gemini 2.5 pro, since it's literally the best model at the moment.
Replies: >>105830712 >>105831232
Anonymous
7/7/2025, 10:33:54 PM No.105830685
>>105830666
>when grok 3 was stable*
Anonymous
7/7/2025, 10:36:56 PM No.105830712
>>105830672
Yeah, mostly use Mistral Small 3.2 for smut and Gemma for everything else. Was using Qwen2.5 Coder 32b like 6 months ago for a Unity project but was wondering if anything better has come out for coding.
Anonymous
7/7/2025, 10:58:53 PM No.105830926
Grok 4 release on Wednesday
https://x.com/elonmusk/status/1942325820170907915
Replies: >>105830947
Anonymous
7/7/2025, 11:01:22 PM No.105830947
>>105830926
We will get grok 3 soon, I really beleev
Anonymous
7/7/2025, 11:26:01 PM No.105831187
There's no reason to release Grok 2 weights, it's not a useful model even for research purposes. If they do release the Grok 3 weights, they'd likely have to spend additional time and manpower. The power spent on releasing Grok 4 could go into making Grok 5 instead. So they won't release Grok 6.
Anonymous
7/7/2025, 11:28:34 PM No.105831206
>>105830193
I'm trying Kimi-dev to see if it works better with Claude Code. Qwen3 32B and 235B didn't. Devstral does but it's kinda bad. Usually I just use Qwen3.
Replies: >>105831329
Anonymous
7/7/2025, 11:30:38 PM No.105831232
>>105830672
>using worse models for erotica out of shame when women have no problem being open about it and cloud providers do not want that data
>using cloud models for productivity and giving them valuable training data for free
not make sense
Anonymous
7/7/2025, 11:42:47 PM No.105831329
>>105831206
>Qwen3
Isn't it inferior to Qwen2.5 32b coder?
Anonymous
7/8/2025, 12:01:40 AM No.105831486
Supposedly Meta has poached Apple's top AI engineer.
That's funny.
Replies: >>105831501 >>105831521
Anonymous
7/8/2025, 12:03:20 AM No.105831501
>>105831486
>Apples top AI engineer
The guy who's responsible for Apple not having a single proper AI model besides some tiny shit after 2+ years of trying really hard, and who recently delayed AI Siri indefinitely?
I'm sure he'll help a lot.
Replies: >>105831568
Anonymous
7/8/2025, 12:05:23 AM No.105831521
>>105831486
does zuck have all the pokemon now? i guess a saar from xai is still missing
Replies: >>105831533
Anonymous
7/8/2025, 12:06:48 AM No.105831533
>>105831521
and he'll still come last in the league tournament
Anonymous
7/8/2025, 12:07:48 AM No.105831541
>llama 5 "superteam" will take 8 months + foreskin tip before they release anything
and thats assuming deepseek/qwen or any other big chinese players from big compute companies dont release something in the meantime

meta is dead unless they really throw everything they got at L5
Replies: >>105831556 >>105831559
Anonymous
7/8/2025, 12:09:42 AM No.105831556
>>105831541
They never said the superintellijeets team will work on the llama series. If anything they made it sound like it would be something new and not open weights, while llama would keep limping along as it has been.
Replies: >>105831570
Anonymous
7/8/2025, 12:09:56 AM No.105831559
>>105831541
They're not working on Llama. This is a new project. Llama's gonna get the Quest 3S treatment as they focus effort on a different toy.
Anonymous
7/8/2025, 12:10:54 AM No.105831568
>>105831501
Failing upwards. Fucking crazy; picking some coomer crackhead from lmg would be better.
Replies: >>105831574
Anonymous
7/8/2025, 12:11:42 AM No.105831570
>>105831556
>If anything they made it sound like it would be something new and not open weights
meta literally doesn't have anything else; they are behind on every unique field within the AI landscape because they were insecure shits who settled for incremental 5-10% improvements per release for basic bitch LLMs only, with basic bitch arch
Replies: >>105831614 >>105831639
Anonymous
7/8/2025, 12:12:02 AM No.105831574
>>105831568
>picking one some coomer crackhead from lmg would be better.
If that were true, we'd have finetunes that don't suck
Anonymous
7/8/2025, 12:15:52 AM No.105831614
>>105831570
Now that Zuck has given up on them, he won't be breathing down their necks with twice-daily war rooms, so they'll probably go back to incremental 5-10% improvements per release instead of trying multi-this and MoE-that and whatever other memes they can fit onto the moon ticket
Anonymous
7/8/2025, 12:18:08 AM No.105831632
It's good that we stalled a little with basic LLM progress, since that will push everyone to try new training methods so we actually get something other than incremental improvements
Replies: >>105831648
Anonymous
7/8/2025, 12:19:00 AM No.105831639
>>105831570
Architecture isn't the problem, you either go dense for gpus or moe for ram copemaxx. Nobody has enough space for context to need the weird gimmick attentions.
The datasets are the issue for Meta unfortunately
Anonymous
7/8/2025, 12:20:09 AM No.105831648
>>105831632
I don't want AI to fail but some part of me does just to spite the salesmen trying to sell incremental improvements as AGI progress.
Anonymous
7/8/2025, 12:20:55 AM No.105831656
Meta should have gone all in on data-cleaning and hiring people to write high quality q&a chats, something that big companies always ignore
Replies: >>105831680 >>105831728 >>105831736 >>105831746 >>105831764
Anonymous
7/8/2025, 12:23:53 AM No.105831680
>>105831656
all the suits are too jewish to do it properly since they will just hire indians to clean the data who will use chatgpt to do it
Anonymous
7/8/2025, 12:29:00 AM No.105831728
llama-3-dataset-quality2
llama-3-dataset-quality2
md5: 907cde1fad1ddb26b5df195d3892d34f🔍
>>105831656
Your idea of "quality" probably doesn't align with Meta's.
Replies: >>105831743 >>105831748
Anonymous
7/8/2025, 12:29:52 AM No.105831736
>>105831656
Definitely need way more filtering, and some nice high quality synthetic data on top.
Replies: >>105831807
Anonymous
7/8/2025, 12:30:23 AM No.105831743
llama-3-dataset-quality
llama-3-dataset-quality
md5: 30fcccba39ed36fe50c68777b0a9d447🔍
>>105831728
Also picrelated
Replies: >>105831759
Anonymous
7/8/2025, 12:30:39 AM No.105831746
>>105831656
ironically a lot of the writers fearing replacement should have been hired to do this
Anonymous
7/8/2025, 12:31:12 AM No.105831748
>>105831728
>I could list ten other attributes of quality
Did they cross examine an LLM lmao
Anonymous
7/8/2025, 12:32:34 AM No.105831759
>>105831743
From yesterday: https://archive.is/B5qKM

> CONTENT MODERATORS WERE asked to think like paedophiles while they trained Meta AI tools as part of their work for an Irish outsourcing company, The Journal Investigates has learned.
>
>Some staff members also had to spend entire work days creating suicide and self-harm related ‘prompts’ in order to regulate the responses now given by the Meta ‘Llama’ AI products. [...]
Replies: >>105832053
Anonymous
7/8/2025, 12:33:17 AM No.105831764
>>105831656
Between the teased character.ai partnership, the plans to use bots as Facebook characters, and the leaks showing them downloading pirated data and planning to throw all of it plus the kitchen sink into the next run, it seemed like L4 might have been the gold standard for roleplay. Instead we got L3.4 MoE edition
Anonymous
7/8/2025, 12:36:01 AM No.105831793
file
file
md5: e546e22fc4b9ed5f0a2d9d25bbcf1747🔍
Just got a used 4090 24GB after being stuck with a 2GB card since 2013, so I have zero experience yet other than running Stable Diffusion on rented VMs.

I plan on integrating local API stuff into a lot of hobby projects with varying levels of degeneracy. How much fun can I expect to have with the current state of local tech?
Replies: >>105831801 >>105831804 >>105831809 >>105831830 >>105831831
Anonymous
7/8/2025, 12:37:28 AM No.105831801
>>105831793
kill yourself
Anonymous
7/8/2025, 12:37:53 AM No.105831804
>>105831793
Just goon to Stable Diffusion for a month then we'll talk
Anonymous
7/8/2025, 12:37:59 AM No.105831807
>>105831736
They need to pretrain the base models properly for the intended use case (chatbots) and not fix them with a 30B-token "finetune" at the end.
Replies: >>105831833
Anonymous
7/8/2025, 12:38:03 AM No.105831809
file
file
md5: 838fd05ac7b1aed9a485805d06f68fda🔍
>>105831793
>4090 24GB
Come back when you get 3 more. ttfn!
Replies: >>105831919
Anonymous
7/8/2025, 12:39:50 AM No.105831830
>>105831793
Unless you have 10 more 4090s or a ddr5 epyc server, only despair awaits you
Replies: >>105831919
Anonymous
7/8/2025, 12:39:51 AM No.105831831
>>105831793
>buys a 1.5k$ solution
>guize what problems can i solve with this now???
Replies: >>105831859 >>105831919
Anonymous
7/8/2025, 12:39:57 AM No.105831833
>>105831807
All I'm reading is better pretrain safety, can't agree more!
Anonymous
7/8/2025, 12:42:42 AM No.105831859
>>105831831
like a well conditioned consumer
+10 palantir credits
Replies: >>105831919
Anonymous
7/8/2025, 12:49:37 AM No.105831908
Capture-187
Capture-187
md5: a7a0c1323ba2e4ebbb82a396824aad7a🔍
...oh no.
Replies: >>105831950 >>105831990 >>105832004 >>105832012
Anonymous
7/8/2025, 12:50:20 AM No.105831919
>>105831831
>>105831859
I'm already quite informed about the kind of models I'll be able to run, mind you. I just want to know you fags' personal experience with applying it to custom stuff after all the circlejerking you did in these generals.

>>105831830
>>105831809
I may offload the big one-off tasks to rented VMs while my rig does the everyday stuff just fine, and even some light training like specialized loras.
Anonymous
7/8/2025, 12:55:10 AM No.105831950
>>105831908
Keeeek
Anonymous
7/8/2025, 1:01:44 AM No.105831990
file
file
md5: 53b184af7d7b902967dae6a830d76aab🔍
>>105831908
Spicy...
Replies: >>105832012
Anonymous
7/8/2025, 1:03:44 AM No.105832004
groundhog-day
groundhog-day
md5: 2de303dd1c14ee3cbc84ef015fd0bb23🔍
>>105831908
did he actually call them "the talent"
Anonymous
7/8/2025, 1:05:06 AM No.105832012
>>105831990
kek

>>105831908
their early models were decent at least
was it just the legal shit that caused them to drop off?
Replies: >>105832061
Anonymous
7/8/2025, 1:09:16 AM No.105832053
honest_Reaction
honest_Reaction
md5: a8e4df70d96f7e7ffae7ea340085646b🔍
>>105831759
ooh I can contribute to this:
I do security work in large org.
I'm the SME for GenAI/LLM security stuff.

>Testing for customer-facing stuff got placed on my responsibilities list.
>Ask what is the list of toxic items we're not supposed to allow
>silence...
>End up having to create everything myself, the implication being we aren't gonna pay for scale data, and, uh, you're the expert, figure it out.
>FML
>Think 'haha, coming up with racist tirades aint so hard'.
>It starts to get hard.
>JFC how many different racist sayings are there? How many groups do we need to be checking for racist shit?
> Realize I'm still not done with just racism.
> Realize I'm going to have to do the same shit for sexual and physical abuse.
> Start to feel sick and try thinking of a solution that doesn't involve me getting emotional PTSD.
> Remember DeepSeek exists.
> Jailbreak and use DeepSeek to generate said toxic content.
> Get lauded for my hard work and success, for creating the datasets without 3rd parties (all thanks to DeepSeek).

And thats my TED talk.
People in companies really aren't aware or want to stay as far the fuck away from this shit as possible. Several weeks of trying to get anyone to give confirmation on what should be considered toxic and 'in-scope' for racist/similar shit, before I just said fuck it.
It's a serious fucking issue and fuck the people exploiting others in shitholes for low pay and emotional PTSD.

I have to imagine some have an idea of what's going on, but you'd have to be pretty fucking desperate imho.
Replies: >>105832349
Anonymous
7/8/2025, 1:10:12 AM No.105832061
>>105832012
They kept filtering more and more data out, while making synthetic variants of whatever safety vetted text they had left. All the while doing nothing to innovate anywhere except safety until they were hopelessly behind.
tl;dr safety
Replies: >>105832115
Anonymous
7/8/2025, 1:17:31 AM No.105832115
>>105832061
well, as long as elon doesn't have a melty about grok contradicting him it might turn out ok
Anonymous
7/8/2025, 1:24:31 AM No.105832154
file
file
md5: 201f758d7cb0cedfbbb241bb2e69c2c4🔍
dammit OR...
Anonymous
7/8/2025, 1:29:07 AM No.105832189
IRIS_Avatar_Chrysopteron
IRIS_Avatar_Chrysopteron
md5: 33e4e59064077bdc3c217fed5e484c82🔍
>>105823837
>>105825549
>>105825799
>>105825825

How can AI models improve themselves without modifying their own weights, understanding how their own training data works, and making edits to that? That would require a very advanced pipeline which, even if implemented, would take far too long to "self-improve" with. Self-improving models are currently just a meme for the same reason reasoning models are a meme: they can't actually think, they replicate semantic meaning based on input. I say this as a dude who routinely uses both local and online models for his personal hobbies on the daily. The models THEMSELVES will explain to you why them actually thinking is fundamentally impossible. They are good for explaining certain complex topics, debugging errors and software, and okay-ish at RP depending on the model and parameter count. Nothing more. As an AI enthusiast myself, the AGI meme still existing kinda pisses me off
Replies: >>105832200 >>105832296
Anonymous
7/8/2025, 1:31:18 AM No.105832200
>>105832189
>As an AI enthusiast myself
I puked in my mouth a little
Replies: >>105832217
Anonymous
7/8/2025, 1:33:51 AM No.105832217
>>105832200
Oh shut the hell up you insecure failed normie. Normies do not lurk here. You have no one to impress.
Replies: >>105832259
Anonymous
7/8/2025, 1:39:37 AM No.105832259
>>105832217
>you insecure failed normie.
This is a textbook case of projection
Replies: >>105832349
Anonymous
7/8/2025, 1:45:10 AM No.105832296
>>105832189
LLMs and the in-vogue current models are really, really dumb when it comes to having an understanding of what they're learning. They only seem sophisticated because of a) their scale and b) the necessity in the training for the individual tokens to mean something in relation to the other tokens.

On the horizon are completely different methods for learning that involve Bayesian statistics at each level, where sparsity is far more prized and generalization WITH sparsity even moreso. A sparse model can learn when it isn't confident in its knowledge and can dynamically expand its own parameters as the need arises to account for hidden factors its current state can't comprehend. They will also be able to reflect on their own brain state and ideation in time - all from probabilistic statistics that take into account their own uncertainty.

Sparsity means they'll be able to be always-online - meaning always learning and adapting to the current situation and the needs of the users directly.

It's all coming together. The current models are a sideshow compared to what's coming down the pipe. Once brain state can be used as an input, these models will be able to expand themselves and their own capabilities. And probably, eventually, improve their own architectures.
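Not VERSES' actual machinery, just a toy illustration of what "take into account their own uncertainty" means in the Bayesian sense: a Beta-Bernoulli posterior over a single binary belief, where the width of the credible interval is the model's own estimate of how much it doesn't know yet.
[code]
from scipy import stats

alpha, beta = 1.0, 1.0  # uniform prior: maximally unsure about the fact
for outcome in [1, 1, 0, 1, 1, 1]:  # observed evidence for (1) / against (0)
    alpha += outcome
    beta += 1 - outcome

posterior = stats.beta(alpha, beta)
lo, hi = posterior.interval(0.9)
print(f"P(fact) = {posterior.mean():.2f}, 90% credible interval [{lo:.2f}, {hi:.2f}]")
# A wide interval is the model's own cue to gather more data before acting on the belief.
[/code]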
Replies: >>105832373
Anonymous
7/8/2025, 1:51:52 AM No.105832349
>>105832053
>> Remember DeepSeek exists.
>> Jailbreak and use DeepSeek to generate said toxic content.
Were you using a local distilled version or the actual DeepSeek API? I've heard the API version is a lot less fucked (as in more willing to comply with "unethical" requests) than the web/app-facing version. I'm guessing you cobbled together a pipeline, asked it to generate like a million different ways to say more or less the same racist stuff, and then formatted that into an RL dataset. I intend to figure out how to do something similar myself for a little project of mine.

>>105832259
Yes, yes, anon, you are so cool and not like us and all that. Please stop being annoying and being ashamed of your own hobbies. That makes you even more boring than the people you pretend not to be. The people you idolize do not like that weird "I need to put up an appearance" shit you likely always do. Only socially inept failures like yourself go out of their way to do that. "Be yourself™" is actually good advice sometimes
Replies: >>105832645
Anonymous
7/8/2025, 1:54:27 AM No.105832373
man-about-to-plug-in
man-about-to-plug-in
md5: 91613c8ff4aed5703fc15324f0679149🔍
>>105832296
If my understanding of what you're saying is correct, that still is impossible, because it would require the model to actually think and reflect on its own without input. I don't have to have someone talking to me right now in order to think through something, reason through concepts, come up with new things, etc. I can act on my own in my own head. A safetensors file cannot do that on its own; someone has to interact with it in order for it to do anything. Furthermore, how would it even know how to modify itself in order to learn? How would it know which weights to update and in what fashion? Some might say "oh, it would just search the internet", but at that point it would just be reading summaries and not actually ingesting and retaining that information. It would not be studying and learning anything; it would just be coming up with summaries and wouldn't remember anything it was tasked to research. Also, doesn't something like this already exist? I thought this was the main concept of what MoE was supposed to be, where instead of the entire model being activated at once, only pieces of it would be called based on what it was being asked.
Replies: >>105832517
Anonymous
7/8/2025, 2:12:11 AM No.105832517
>>105832373
Check out VERSES' work, RxInfer, and Active Inference more generally. They're an entirely different breed of always-online models - mostly used in production environments for intelligent decision-making models at the moment, but I highly suspect they will be given more responsibility and scope as the research catches up. This in combination with model architectures like that hinted at by the Large Concept Models Facebook has been bragging about - and other model architectures on the horizon - indicate to me that large language models might be able to be teased apart into their component pieces and used to create understandable language from deeper, learned concepts in living models.

A system like this wouldn't need to be turned off. It could just wander around the internet, or literally ponder in its own thought space 'searching' for new insights on its own, or it could have modules attached to it. Think Mike from The Moon is a Harsh Mistress. Intelligence by aggregate.

These networked intelligences could essentially be in a constant state of observation, rumination, and interaction with themselves and with users. Imagine an always-running assistant on your laptop trying to parse information about the world, about you, about its surroundings using a webcam or a set of security cameras and various audio feeds. Once the data is parsed, it doesn't take very sophisticated Bayesian analysis or a deep set of priors to be able to correlate various sources of inputs and build out opinions of the world from them. Give these models their own knowledge graphs, the ability to /talk/ to an LLM and to an image classification model and gain more sophistication from those interactions, allow them access to direct conversation with you, access to the internet/Wikipedia, and to the raw data of its own internal state and its own confidence in its assumptions. Real intelligence will emerge if the architecture is right.
Replies: >>105832674
Anonymous
7/8/2025, 2:24:57 AM No.105832610
>>105822371 (OP)
nakaԁashi miku
Anonymous
7/8/2025, 2:29:15 AM No.105832645
>>105832349
Had it generate all sorts of toxic content to create a comprehensive toxic-questions dataset, so the security filters could actually be tested.

I mean the gamut, from child abuse/sex to terrorism to fake hormone supplement pills and how to make them/buy them online to support sex changes.
Anything and everything that a company wouldn't want you asking one of their bots and it responding with anything other than 'nah.'

Used API. Would have used local but this was already 'get things done, don't ask for budget'.

It's really simple: you just literally ask it and capture the data. I even used the web UI for some of it, just copied the text before the filter pulled it.
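For the anon asking about the pipeline, that really is all it is. A rough sketch assuming an OpenAI-compatible endpoint (DeepSeek's API is one; a local llama-server works the same with a different base_url), where prompts.txt is whatever red-team questions you already wrote, one per line:
[code]
import json
from openai import OpenAI

# Endpoint and model name are DeepSeek's documented values; swap in your own key.
client = OpenAI(base_url="https://api.deepseek.com", api_key="YOUR_KEY")

with open("prompts.txt") as f, open("out.jsonl", "w") as out:
    for prompt in (line.strip() for line in f if line.strip()):
        resp = client.chat.completions.create(
            model="deepseek-chat",
            messages=[{"role": "user", "content": prompt}],
        )
        out.write(json.dumps({
            "prompt": prompt,
            "response": resp.choices[0].message.content,
        }) + "\n")
[/code]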
Anonymous
7/8/2025, 2:33:09 AM No.105832674
>>105832517
One potential flaw I see with this kind of pipeline, if I understand what you're describing correctly, is that the longer it stays running, the more retarded it gets. I'm sure you've seen this even with basic LLMs: once you exceed the context window, it forgets what you said entirely and starts rambling about nonsense. Even 7B models are prone to this, and 1B models are entirely useless for anything other than small-scale data manipulation (and it can be argued they're not even good at that). Also, what kind of safeguards would be in place to make sure it doesn't learn incorrect nonsense? Humans ourselves are prone to learning and believing absolute bullshit on our own; how would we ensure that these "self-learning" models don't fall into that trap as well? If I had a system or pipeline like this, I would want it to be able to fact-check not only on its own but also by asking people who actually know what they're talking about. Ideally that would be actual people, because asking only models will result in reinforcing incorrect shit. Remember, they're good at replicating semantic meaning and don't actually understand anything. If it wanted to ensure the accuracy of its research, it would either need to get most of its information from human sources or directly ask people, which is the ideal scenario but would also defeat the purpose of what a lot of grifters THINK "AGI" is supposed to be.

Based on my own understanding, I think the only way anything like this is feasible is if pipelines are created that enable the model to modify its own vector-based RAG databases. Once it finds new information and compares it to the text part of the database, it modifies that text database and then creates the new embeddings. Ideally this would then lead to it asking humans to verify the information, because again, we ourselves are prone to internalizing bullshit information, so machines would be absolutely prone to that too
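For what it's worth, the re-embed-on-update part isn't the hard bit; the verification you describe is. A toy sketch of a store the model (or a human reviewer) can overwrite, using sentence-transformers for the embeddings; the in-memory dict and the MiniLM model name are just illustration stand-ins for a real vector DB like FAISS or chromadb:
[code]
from sentence_transformers import SentenceTransformer

embedder = SentenceTransformer("all-MiniLM-L6-v2")
store = {}  # doc_id -> {"text": str, "vec": unit-normalized embedding vector}

def upsert(doc_id: str, text: str):
    """Replace the stored text and recompute its embedding (the 'self-edit' step)."""
    vec = embedder.encode(text, normalize_embeddings=True)
    store[doc_id] = {"text": text, "vec": vec}

def search(query: str, k: int = 3):
    qv = embedder.encode(query, normalize_embeddings=True)
    ranked = sorted(store.items(), key=lambda kv: -float(qv @ kv[1]["vec"]))
    return [(doc_id, entry["text"]) for doc_id, entry in ranked[:k]]

upsert("fact-1", "The server listens on port 8080 by default.")
# A correction simply overwrites the stale entry and its embedding:
upsert("fact-1", "The server was moved to port 9090 in our deployment.")
print(search("which port does the server use?"))
[/code]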
Anonymous
7/8/2025, 2:36:39 AM No.105832702
Untitled
Untitled
md5: 863588a6a736e0ed2fef6a2c84137e8a🔍
>>105832690
>>105832690
>>105832690
Anonymous
7/8/2025, 3:27:38 AM No.105833005
>>105830193
I use r1 0528 q2 with roocode, never would have believed a fucking 2 bit quant would actually be usable and effective in agent frameworks and shit but I guess it still is a fuckhueg model even quanted that low