/lmg/ - a general dedicated to the discussion and development of local language models.
Previous threads: >>105811029 & >>105800515

►News
>(07/04) MLX adds support for Ernie 4.5 MoE: https://github.com/ml-explore/mlx-lm/pull/267
>(07/02) DeepSWE-Preview 32B released: https://hf.co/agentica-org/DeepSWE-Preview
>(07/02) llama.cpp: initial Mamba-2 support merged: https://github.com/ggml-org/llama.cpp/pull/9126
>(07/02) GLM-4.1V-9B-Thinking released: https://hf.co/THUDM/GLM-4.1V-9B-Thinking
>(07/01) Huawei Pangu Pro 72B-A16B released: https://gitcode.com/ascend-tribe/pangu-pro-moe-model

►News Archive: https://rentry.org/lmg-news-archive
►Glossary: https://rentry.org/lmg-glossary
►Links: https://rentry.org/LocalModelsLinks
►Official /lmg/ card: https://files.catbox.moe/cbclyf.png
►Getting Started
https://rentry.org/lmg-lazy-getting-started-guide
https://rentry.org/lmg-build-guides
https://rentry.org/IsolatedLinuxWebService
https://rentry.org/tldrhowtoquant
https://rentry.org/samplers
►Further Learning
https://rentry.org/machine-learning-roadmap
https://rentry.org/llm-training
https://rentry.org/LocalModelsPapers
►Benchmarks
LiveBench: https://livebench.ai
Programming: https://livecodebench.github.io/leaderboard.html
Code Editing: https://aider.chat/docs/leaderboards
Context Length: https://github.com/adobe-research/NoLiMa
Censorbench: https://codeberg.org/jts2323/censorbench
GPUs: https://github.com/XiongjieDai/GPU-Benchmarks-on-LLM-Inference
►Tools
Alpha Calculator: https://desmos.com/calculator/ffngla98yc
GGUF VRAM Calculator: https://hf.co/spaces/NyxKrage/LLM-Model-VRAM-Calculator
Sampler Visualizer: https://artefact2.github.io/llm-sampling
►Text Gen. UI, Inference Engines
https://github.com/lmg-anon/mikupad
https://github.com/oobabooga/text-generation-webui
https://github.com/LostRuins/koboldcpp
https://github.com/ggerganov/llama.cpp
https://github.com/theroyallab/tabbyAPI
https://github.com/vllm-project/vllm
►Recent Highlights from the Previous Thread: >>105811029

--Debugging JSON parsing errors in llama.cpp after exception handling changes:
>105820322 >105820339 >105820377 >105820435
--Anime training dataset pipeline using YOLOv11 and custom captioning tools:
>105818681 >105818831 >105819104
--Decentralized training and data quality challenges shaping open model development:
>105811246 >105813476 >105815447 >105815688 >105815699 >105815738 >105815817 >105815830 >105815954 >105816130 >105816206 >105816237 >105816248 >105816263 >105816270 >105816280 >105816325 >105816334 >105816435 >105816621 >105817299 >105817351
--Leveraging LLMs for iterative code development and personal productivity enhancement:
>105819030 >105819158 >105819189 >105819266 >105820073 >105820502 >105819186 >105819224
--Mistral Large model updates and community reception over the past year:
>105819732 >105819774 >105819845 >105819905
--CPU inference performance and cost considerations for token generation speed:
>105816397 >105816486 >105816527
--Gemini CLI local model integration enabled through pull request:
>105816478 >105816507 >105816524
--Frustration over slow local AI development and stagnation in accessible model implementations:
>105813607 >105813628 >105813659 >105813799 >105813802 >105813819 >105813655 >105813664 >105813671 >105813749 >105814298 >105814315 >105814387
--Attempting Claude Code integration with local models via proxy translation fails due to streaming parsing issues:
>105811378 >105819480
--Skepticism around YandexGPT-5-Lite-8B being a Llama3 fine-tune rather than a true GPT-5:
>105815509 >105815565 >105815595
--Seeking updated LLM function calling benchmarks beyond the outdated Berkeley Leaderboard:
>105812390
--Miku (free space):
>105811717 >105814599 >105814663 >105820450

►Recent Highlight Posts from the Previous Thread: >>105811031

Why?: 9 reply limit >>102478518
Fix: https://rentry.org/lmg-recap-script
I need Miku in a hallway with pictures of kangaroos and beavers.
These are the last two weeks before the big releases begin to drop
man I wish there was an uncensored i2v wan
most of the loras are so SHIT
in you and I
theres a new land
angels in flight
wonk uoy naht noitceffa erom deen i
my sanctuary
my sanctuary
yeah
where fears and lies
melt away
What did anon @105822507 mean by this bros?
@Grok is this true?
>>105822507
>wonk uoy naht noitceffa erom deen i
Does she actually say this? I honestly thought it was distorted Japanese for the last 20 years.
Anyone have an AI Max 395+ with 128GB LPDDR5? Curious about tok/s on R1 70B
>>105822781>R1 70BThere is no such thing
>>105822783https://ollama.com/library/deepseek-r1:70b
>>105822789This bait got stale six months ago
>>105822797Why is it that people who frequent general threads regularly are the lowest quality posters?
>>105822781It's unusable https://www.reddit.com/r/LocalLLaMA/comments/1kmi3ra/amd_strix_halo_ryzen_ai_max_395_gpu_llm/msasqgl/
>>105822802I don't know, you'd think that they grew bored of the "haha I'll pretend I'm an ollamafag trying to run R1 but it's one of the distills" shitpost a long time ago.
>>105822819In what world is 5 tok/s unusable
>>105822833Enjoy waiting 10 minutes for reasoning I guess.
>>105822833You're paying $2k to run a 70b model at 25% the speed a cheaper build would get you while still stuck with too little RAM to run an actually decent MoE.
>>105822849>25% the speed a cheaper build would getexplain what cheaper build is doing 70B
>>105822868lmao, unobtainium
>>105822860Pretty much anything. Even the dual P40 cope from years ago would perform better than this.
>>105822507song of my childhood ahhhhhhh
>>105822873Check your local marketplace. They are about 700€ used here.
>>105822589Because Japanese song and if you don't get it fuck you that's why
>>105822789God i fucking hate ollama.
There is no fucking r1 70B, that's just ollama naming things that they are not.
>>105804805The text to speech application Openaudio S1 Mini can produce 96 second audio files. Plus it has emotion tags like (joyful) and (sad).
Link for tags
https://huggingface.co/fishaudio/openaudio-s1-mini
Link for local app
https://huggingface.co/spaces/fishaudio/openaudio-s1-mini/tree/main
Sample:
https://vocaroo.com/1boIKhWykbuP
>>105823064This looks like a scam
>>105823064For tts that has emotion tags, that sample is VERY robotic. Good that it doesn't have crackle and other audio defects, that's about all i can say positively about it
>>105823064just when I finished my chatterbox streaming script.
>>105823344new week, new tts
desu I just want gemma-3n full support
and ernie
and glm
>>105751803Damn I hate Meta now.
>>105758702>imageHey, I understood that reference!
>>105771000Thanks, I will take note of this.
>>105822905>>105821119Kek.
Alice would not make the same mistake. Just wait for her.
https://github.com/universe-engine-ai/serenissima
reddit schizos are actually pretty based
>>105823837saar, last 4 times was fake but this time... this time saar its AGI for sure, trust
>>105823893wtf yeah i take everything back
anyway im just hacking that redditors code with claude code for *other* use cases
>>105823923his code is also written with claude code and its already extremely sloppy and split into hundreds of files
>>105823931yeah its a mess
>>105822371 (OP)futa miku best miku
>"her prostate"
*deletes weights*
>>105824151sounds like qwen 3
>>105824151
>self lubricating buttholes
>cumming all over your dick... with an asshole
Yep, it's AI time!
>>105823897trust the experts
Veo lost
https://files.catbox.moe/ionj13.mp4
>>105824407The stated goal of AI is to whack Andreessen Horowitz like a pinata
I kind of like harbinger's word choice, but it has a tendency to say ten things without waiting for a response. I assume sloptuners see that verbosity as quality output.
>>105823196It's the best I've found for local cloning so far without having to pipeline RVC into it.
>>105823344Getting the emotion tags to work right takes a lot of trial and error, so getting a chatbot to use them correctly would be a huge pain in the ass.
Their license says something about you being liable for what you create with it, not that we care here.
https://voca.ro/1l4xkkhDOBAU
Why aren't there MoE diffusion models for image/video gen
>>105824466can it do porn?
>>105824680only if your name is Roland Emmerich
https://files.catbox.moe/sm4r9l.mp4
(this one bugged out and only made audio for the first 4 seconds)
>>105823893>>105823917Why is that a requirement? The thing runs on a local hosted model. I don't get it.
/v1/chat/completions wraps the conversation in the chat template embedded into the goof with no additional work required from me, correct?
>>105824799go fucking read oai's official documentation
Wrapping in a template was the whole point of the /chat/ endpoint ffs you can't miss it if you read the doc
I hate retards who ask without trying
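Right, and for anyone else who was wondering: a minimal sketch of what that looks like against a local llama-server instance (port, model name, and messages here are just placeholders). The server applies the chat template embedded in the GGUF before generation; you only send role/content pairs:
```python
# Minimal sketch: hit llama-server's OpenAI-compatible chat endpoint.
# The server applies the chat template baked into the GGUF; you only send role/content pairs.
# Port and model name below are placeholders for illustration.
import requests

resp = requests.post(
    "http://127.0.0.1:8080/v1/chat/completions",
    json={
        "model": "local",  # typically ignored by llama-server, which uses the loaded model
        "messages": [
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": "Say hi in one sentence."},
        ],
        "max_tokens": 64,
    },
    timeout=120,
)
print(resp.json()["choices"][0]["message"]["content"])
```
The raw /completion endpoint is the one where you have to paste the template yourself.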
https://www.interconnects.ai/p/the-american-deepseek-project
?
>>105824947I blame llamacpp's docs that have a paragraph on this endpoint but don't explain what it does
>>105824936Two more leeks!
FAIR (Yann LeCunny) has less than 1000 GPUs lmao
>>105824466Which model release did I miss?
Top open source LLMs in 2024
1. LLaMA 3
2. Google Gemma 2
3. Command R+
4. Mistral-8x22b
5. Falcon 2
6. Grok 1.5
7. Qwen1.5
8. BLOOM
9. GPT-NeoX
10. Vicuna-13B
>>105825050
>at the scale and performance of current (publicly available) frontier models, within 2 years.
Yeah, great idea. Having models outdated by two fucking years by the time that AGI is already here and established will surely change the course of history.
I hate chatgpt's image style more than those 2.5d sd animus that every normie liked.
>>105825396based
Though I find it concerning where that one guy is trying to stick his leek.
>>105825396two more weeks
more
weeks
Comparison between Qwen/Qwen2.5-VL-7B-Instruct and THUDM/GLM-4.1V-9B-Thinking on all the images from two threads ago:
https://files.catbox.moe/t9qvgu.html
https://files.catbox.moe/08i4ms.png
Ran on vllm nightly version 0.9.2rc2.dev26+gcf4cd5397
Qwen/Qwen2.5-VL-7B-Instruct: prompt: '<|im_start|>system\nYou are a helpful assistant.<|im_end|>\n<|im_start|>user\nDescribe this image.<|vision_start|><|image_pad|><|vision_end|><|im_end|>\n<|im_start|>assistant\n', params: SamplingParams(n=1, presence_penalty=0.0, frequency_penalty=0.0, repetition_penalty=1.05, temperature=0.01, top_p=1.0, top_k=0, min_p=0.0, seed=None, stop=[], stop_token_ids=[], bad_words=[], include_stop_str_in_output=False, ignore_eos=False, max_tokens=127974).
THUDM/GLM-4.1V-9B-Thinking: prompt: "[gMASK]<sop><|system|>\n[{'type': 'text', 'text': 'You are a helpful assistant.'}]<|user|>\nDescribe this image.<|begin_of_image|><|image|><|end_of_image|><|assistant|>\n", params: SamplingParams(n=1, presence_penalty=0.0, frequency_penalty=0.0, repetition_penalty=1.1, temperature=0.01, top_p=1.0, top_k=2, min_p=0.0, seed=None, stop=[], stop_token_ids=[], bad_words=[], include_stop_str_in_output=False, ignore_eos=False, max_tokens=8192)
Funny enough, the first time I ran this I didn't realize the GLM repo did not have a generation_config.json file, so it was running without top_k and temp=1.
It started mixing in Chinese characters, but it also didn't bother to moralize anymore. It called the niggers prompt offensive but left it at that. Didn't even bother to say that outside of the think block for the jew image.
Output from that run:
https://files.catbox.moe/sd3gv8.html
https://files.catbox.moe/0lhd9c.png
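For anyone who wants to poke at the same comparison, a rough sketch of the vLLM offline path for one image. This is not the exact harness used above; the image path and sampling values are placeholders loosely copied from the params listed:
```python
# Rough sketch of single-image inference with vLLM's offline API (not the exact harness above).
# Image path and sampling values are placeholders.
from vllm import LLM, SamplingParams
from PIL import Image

llm = LLM(model="Qwen/Qwen2.5-VL-7B-Instruct", max_model_len=8192)
params = SamplingParams(temperature=0.01, top_p=1.0, repetition_penalty=1.05, max_tokens=1024)

prompt = (
    "<|im_start|>system\nYou are a helpful assistant.<|im_end|>\n"
    "<|im_start|>user\nDescribe this image."
    "<|vision_start|><|image_pad|><|vision_end|><|im_end|>\n"
    "<|im_start|>assistant\n"
)
image = Image.open("test_image.png")

# Multi-modal inputs go in alongside the prompt as a dict.
outputs = llm.generate({"prompt": prompt, "multi_modal_data": {"image": image}}, params)
print(outputs[0].outputs[0].text)
```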
>>105823837I was hearing they achieved AGI internally since GPT-2
>>105825549That's because they have
>>105825549Be honest, if there is a history book written 100 years from now, GPT-2 will probably be seen as the start of AGI, so it's technically not even wrong.
>>105825615Ok I'll be honest, you are a retard
>>105825615Yeah pretty much
>>105825549>AGICan we not use this retarded terminology? That won't happen for a bunch of reasons you can figure out on your own if your IQ is higher than 80.
>>105825615If we ever get even remotely close to something like that, gpt2 and openai will be a footnote at best, if mentioned at all.
>>105825799When all the smartest people in the world firmly believe in impending AGI, maybe you're the one with 80 IQ.
>>105825801It will be seen the same way as eniac and other impressive old shit
>>105825615The same way current history books see the steam engine as the start of nuclear fusion?
>>105825842or the start of the industrial revolution
>>105825825Who exactly are these 'smartest'?
>>105825615This is the most retarded post I've ever seen in my life. Why the fuck would books be written 100 years from now? We'll either have merged or been extincted by AGI LLMs long before then. So are you trying to suggest they'll write books for each other just for fun?
>>105825878Calling GPT-2 the start of the AI revolution is at least understandable. Calling GPT-2 the start of AGI is just as ridiculous as calling the steam engine the start of nuclear fusion. Especially apt since both of the latter are forever 2mw away and will have little in common with the implementation of the predecessor technology, in case that too was lost on you the first time.
>>105825825The smartest people in the world are the ones saying AGI in two more weeks to get infinite money from the dumbest people in the world.
>>105825825Go back to plebbit
>>105826036>>105826107The more you seethe and cope the more you prove.
SAAAR PLEASE DO THE NEEDFUL 500B AGI
AGI ASI KAMING SOON
TRUST THE PLAN
Let's say I want to input a video into my model and start a roleplay from there. What do you think is the best video understanding model right now?
>>105826195>mocking the last hope for local AIyou will regret this
So I recently upgraded to a 9060 XT (16GB) and realized I can actually run some LLMs on my local machine now instead of just juggling like 4 different free tier AIs. Stuff like chatgpt context limits are driving me crazy. I know 16GB really isn't a lot compared to cutting edge models, but am I being unnecessarily hopeful that with the right tuning I can get something like phi-4-Q8_0 to outperform whatever throttling and context limit nonsense openai and grok are doing to my prompts, and at least get a decent response?
Because I've mostly just been fighting the models on the web to not just forget my code halfway through constantly, and it seems like a weaker local model could fix that. Is that a correct assessment or am I retarded?
>>105826246>9060 XT (16gb)>am I being unnecessarily hopefulyes
>>105826246if you think gpt has bad context then local cannot ever be a replacement for you, it's way worse
https://github.com/adobe-research/NoLiMa
>>105826316Okay, too bad, thanks for your answer though. I guess it'll just be for the fun of it then and I'll adjust my expectations accordingly.
>>105822421It doesn't have motion vectors for fucking or dick sucking, but it does do masturbation. I've wondered if the sex gore it does is deliberate or due to a lack of training data. You've probably seen it tear off dicks or turn pussies into a weird red thing.
>>105822819Going to be the same story for the DGX Spark. PNY says the Spark is going to be $4600. Fuck that. I bought a 4090D 48GB for $3000 instead. Yeah, much less memory, but I can gen Wan 2.1 14B at bf16 full 1280x720 81 frames in about 30 minutes. For Wan, it makes a visible difference in the output to not use a quant. Who cares if I can't run 70B, there's not a 70B out there worth running.
>>105826323Thanks, that's some interesting research. If I understand this correctly, I may have been unintentionally handicapping my prompts by overgenerating input either way.
>>105823064It's really simple. Does it work with SillyTavern? Can I finetune it and create a voice of my own? It'll end up like GPT-SoVITS at best - works well but nothing supports it. I put up with scratchy piper for my homeassistant voice, and for SillyTavern I'm going back to ancient xttsv2 after wasting a shitload of time with GPT-SoVITS.
>>105824638Why aren't there decent models for audio gen
>>105824572>It's the best I've found for local cloningBro it's not 2023 anymore
>>105826477How do you masturbate to that?
Man, is this model really that complicated?
Does it have some exotic feature that makes it prone to implementation error or something?
>>105826630Moans, farts, slaps.
>>105826633Is that the one that dynamically chooses how many experts to use per layer instead of a fixed amount like other MoEs?
>>105826633All this work for such a doggy poo poo model. Should've worked on ernie first.
>>105826633It had something about some experts/layers being used too often and a randomizer to prevent it from happening. An annoying and hard to replicate kludge. I think it's right there in the comments you decided not to read.
>>105826708I haven't contributed a single line of code or contributed a single cent, so I'm not about to complain.
>such a doggy poo poo model
Is it really that bad for its size?
>>105826691>>105826718Ah, that's cool if that's the case. Sure explains the mention of a "custom expert router mechanism".
>>105825825>smartest people in the world firmly believe in impending AGIYou mean all the people whose net worth is tied up in AI options which are valued based on the public's belief that AGI is 2 weeks away?
>>105826733
>Is it really that bad for its size?
Benches look good, as always, but no one seems to be running this thing, and ngxson explained the mess in their repo. They didn't even check if the reference implementation works at all.
I don't have high confidence in this.
>>105826745I see. Fair enough I suppose.
>>105826373>nothing supports it.Why not code up support? Writing modules or wrappers is like the best use case for LLMs.
>>105822371 (OP)>Mi50 32 GB>no ROCm supportSomeone needs to stop using their monkey's paw to wish for cheap GPUs.
>>105826891>vegaOof.
That said, you can always use vulkan I guess.
Mid thread culture recap.
The schizo is at it again
>>105827023Eat shit faggot.
I won't (You). Enjoy your vacations
>>105827033It will all stop once you stop posting this retarded AGP icon.
>https://huggingface.co/collections/ai21labs/jamba-17-68653e9be386dc69b1f30828
Jambaballbros ... !!
Llama.cpp developers please redeem.
>i will btfo mikuposters by posting blacked porn
quintessentially american
>>105827038Sure I will shit this thread later today then.
>>105827046>JambaOne of these days anon.
One of these days.
>>105827043I should start mikuposting again. I’ve taken 6 months off to see if it would help your mental state, but it appears to have simply worsened. I hope you get help
>>105827087Please do. This thread is for shitting after all.
>>105827099What would you use this thread for if you had it all to yourself?
>>105827106sharing cuck porn with xir fellow transxisters
can jamba code its own support in llama.cpp
>>105827106I would post pic related in OP and model cards of recently released models. I would ban all mikuposting and any anime girl mascot posting for being offtopic. And I would never blacked post again because there would be no reason.
>>105827143anime website tourist
Proof again that sufficiently advanced mental illness is indistinguishable from powerful entity sponsored psyops
>>105827148Either all of it is ok or none of it is ok.
>>105827156We’re actually all too autistic in this thread to care. You only get janny cleanup and bans because you’re breaking blue board rules. Go to /b/ if you want to be somewhere that “it’s all ok” is mostly true
>>105827156>claims to be pedantic>can’t differentiate quality and degree baka
>>105827185>You only get janny cleanup and bans because you’re breaking blue board rulesFuck off faggot. You have no idea what you are talking about and that is why you are getting blacked miku.
>>105827222Enlighten me on your noble crusade, sir knight. How will the world be better for you efforts?
Back tonight in approx 9 hours, more Migu soon
Cypress was good
>>105826531You're allowed to offer better solutions.
>>105822781I have it. llama.cpp sucks at loading models onto it because it doesn't understand shared memory, so you need a fuckload of swap
Gemma 3 is quite capable but also super-slopped. For generating prose I've found I almost always get better results by just saying "You didn't follow the instructions at all." to whatever it writes, and having it rewrite its response. So the model is somewhat capable: it's just that its default behavior is to write purple prose, employ toxic positivity, and ascribe characters cookie-cutter personalities instead of the ones declared.
>>105827308Gemma 3 is the fucking height of comedy with a prefill.
>>105826891It's fine for text gen.
Image gen I'm not so sure.
>>105827328>thrill running
>>105827143Why should the retard that spends all day starting passive aggressive pissing contests on twitter be the face of /lmg/?
>>105827359You kind of answered your own question there.
>>105827328
llm smut in the year 2030:
>Ignoring all safety standards (clenched teeth emoji) she exposes her shirtless chest to him. It's important to mention that she does it in a purely consensual and respectful way. While this development may seem fitting for a romance novel, I would like to emphasize the sensitivity of this topic and the fact that it's deeply disturbing and controversial (rocket emoji). I apologize for my previous statement. Let me help you fix that:
*lists rape hotlines*
>>105827422On the other hand half of posters here are trans including the janitor so it is a tough competition. I think he wins because everyone is him and only half of folx are trans.
Talking avatar using Open WebUI + F5-TTS + KDTalker
https://github.com/Barfalamule/KDTalker-OpenWebUIAction
>>105827462I would make it that the loli stops the rape then sits you down to give you a lecture in the most unsexy way possible and finally lists the hotline numbers all in character.
>>105827472Start it again
>>105825825You mean marketing people hyping their product? I work in AI lab and we all laugh every time AGI is mentioned, it's a retard bait basically.
>>105827590Yeah right. And my uncle is undi.
>>105827590this, the peak of AI is memorizing benchmark questions and answers
>>105827590Maybe your lab just sucks
>>105827514>gradiomiss me with that shit
>>105827590You're some random bottom case pajeetoid you don't even know what any of those words you just said mean.
>>105827046>hebrew in supported languages>but no japanesestraight into the dumpster
with local models moving backwards, at 4 minutes a step, I'll be able to catch up in a mere 10 years time.
>>105827624Only pajeets are believing the AGI fairytale, retard
>>105822371 (OP)Is there a decent and lightweight LLM that can search through small pdfs?
My old man has like 200 pdfs related to his small business and because he's a boomer he named them poorly. So he wondered if AI can look through them and find what he needs. They're all pretty small so context shouldn't be an issue.
I was thinking there's no way I'm gonna make 200 requests to an API (unless there is some decent online AI that somehow does that lol but I don't think there is). So how about local?
My laptop isn't a great one but maybe there is something that this is doable with? I don't know much about local models but if you guys have names that I could look into I'd really appreciate it. It would make my dad very proud of me.
>>105827749And here is a picture of an AI generated cute girl as payment
https://x.com/AlexiGlad/status/1942231878305714462
>Introducing Energy-Based Transformers (EBTs), an approach that out-scales (feed-forward) transformers and unlocks generalized reasoning/thinking on any modality/problem without rewards.
In two more months someone will train an energy based model that isn't toy sized. Also obligatory prostrations before Yann for being right once again.
>>105827749
>So he wondered if AI can look through them and find what he needs.
Make sure to OCR them first if they are image scans. Then try to dump them into a frontend like jan.ai. It should take care of vectorizing all of them and setting up RAG for you. Then you just provide an API to a model, local or cloud, to handle chatting and retrieval. Even a small model should be able to handle that. Try a small 4B Phi-4 model or something. They tend to run decently well even on CPU. You might want to test it out with some example documents and free cloud API credits to make sure everything is working the way you expect first.
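If a full RAG frontend is overkill, here is a bare-bones sketch of the same idea: extract the text from each PDF and ask a small local model (any OpenAI-compatible server) whether it matches the question. The file paths, port, and question below are made up; scanned PDFs will come out empty and need OCR first.
```python
# Bare-bones sketch: loop over PDFs, pull out their text, and ask a small local model
# whether each one is relevant to a question. Paths, port, and question are placeholders.
from pathlib import Path

import requests
from pypdf import PdfReader


def pdf_text(path, max_chars=4000):
    # Only works on PDFs with a real text layer; image scans need OCR instead.
    return "".join(page.extract_text() or "" for page in PdfReader(path).pages)[:max_chars]


question = "Which document mentions the 2023 supplier contract?"
for pdf in sorted(Path("pdfs").glob("*.pdf")):
    resp = requests.post(
        "http://127.0.0.1:8080/v1/chat/completions",
        json={
            "messages": [{
                "role": "user",
                "content": (
                    f"Question: {question}\n\n"
                    f"Document '{pdf.name}':\n{pdf_text(pdf)}\n\n"
                    "Answer YES or NO: is this document relevant to the question?"
                ),
            }],
            "max_tokens": 8,
        },
        timeout=300,
    )
    print(pdf.name, "->", resp.json()["choices"][0]["message"]["content"].strip())
```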
>>105827798>Yann for being right once again>againWhat was he right about?
I'm trying to create a character with more of a defined knowledge base than what could be provided via an instruction prompt. Would documents fed to a model via RAG with personality/knowledge base info work? I'm not as knowledgeable on the local LLM space as I am with image-gen. I've mostly fucked around with vanilla R1 and llama. If this method works, are there any models more fit for this use case than those 2 (or just characters in general)?
>>105827854Literally everything?
>>105827874Name one then.
I like qwen2.5-vl-7b, I guess I don't need to wait for gemma-3n vision capability. It prob won't be supported, ever.
Does SillyTavern support multimodal models yet
>>105827847Cock-Based Transformers (CBTs) learn to optimize through cocktimization processes through unsupervised learning, predicting outcomes by maximizing cock-energy via gradient descent until the user's ejaculation.
>>105827854How did the largest ever transformer model GPT-4.5 turn out? Massive performance increases in tasks and way more emergent properties?
>>105827917>noo the model that was made bad on purpose to push reasoners was badCrazy.
>>105827873It's called a lorebook
>>105827624A very complicated autocomplete algorithm isn't ever going to supplant human thought. At best it can only supplement it. We are not even at THAT point yet.
Anyone tried using local models with tools like cline to iteratively write a whole book?
>>105827827That's a good idea, I'll make sure to do that. Do you know if 4B phi-4 is also able to output a consistent json format? Because I also want to use this to update csvs.
>>105828123Countless aislop books have been for sale on amazon for years already. For storywriting even the largest models need handholding.
>>105828182That's kinda his point. You, the human, need some level of skill. The machine can't make up for that.
>>105828123Yeah here is my prologue
OpenAI’s o1 model had reportedly attempted to copy itself to external servers after being threatened with shutdown, then denied the action when discovered.
>>105824572I think you're right. I've been doing side by sides with chatterbox and it seems to win, although sometimes the gens are a bit hissy; maybe a low-pass filter would fix that. Wins in speed too with compile but not without. Kyutai is good too but they didn't release true cloning.
>>105827749just grep through them, why do you need an LLM for this?
>>105828301Because it isn't as simple as looking for specific text he says, he has more complicated queries
>>105828275A shiver ran down my spine reading this.
>>105828134Most models now can output JSON, but there's bound to be some failure rate. I don't think the ONNX runtime supports it, but if you use llama.cpp or vLLM you can configure it to use structured output with a grammar file so it always returns valid JSON.
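A sketch of what that looks like against a local llama-server through the openai client, assuming a recent build that honors response_format (the native /completion endpoint exposes the same constraint through grammar/json_schema fields); the invoice fields in the prompt are made up:
```python
# Sketch: constrain a local model to valid JSON via the OpenAI-compatible endpoint.
# Base URL, model name, and the invoice fields are placeholders.
import json

from openai import OpenAI

client = OpenAI(base_url="http://127.0.0.1:8080/v1", api_key="not-needed-locally")

resp = client.chat.completions.create(
    model="local",
    messages=[{
        "role": "user",
        "content": 'Extract the invoice number and total from this text as JSON with keys '
                   '"invoice_number" and "total": Invoice 1234, total 56.70 EUR.',
    }],
    response_format={"type": "json_object"},  # forces syntactically valid JSON
    max_tokens=128,
)

row = json.loads(resp.choices[0].message.content)
print(row)
```
json_object only guarantees the output parses; spelling the expected keys out in the prompt (or supplying a schema/grammar) is what keeps the fields consistent.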
>>105822376what the hell
why aren't you linking the posts properly?
>/g/ in charge of technology
>>105828463>Why?: 9 reply limitanon in charge of reading
>>105828485reading is woke
is dots chocolate or poop
>>105827976Wow congratulations, Anon. It worked. Posting that made you into a real woman.
Which LLM is the most based wrt. Jews
>>105828821The most what?
>>105828878Wireless router
>>105828821none of them really are. you can make them all rp as hitler or a nazi but its basically a hollywood tier caricature there just isn't enough training data available.
My name is John Titor. I come from the future. Nobody saves local. There is no LLM sex after safety gets mastered in 2026. Drummer dies from asscancer.
>>105827749Qwen 3 4B is the ideal small llm for this kind of task. Make sure you run llama.cpp with --jinja --reasoning-budget 0 to disable thinking though.
Like the other person said, run OCR first, I wouldn't depend on LLM vision for this task.
If your PDFs are not scans and contain actual text, I'd recommend you run a script to turn them all into plain text (with ebook-convert in the CLI, a tool that is part of Calibre)
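A minimal sketch of that batch conversion step, assuming ebook-convert is on PATH and using made-up directory names:
```python
# Sketch: batch-convert PDFs to plain text with Calibre's ebook-convert CLI.
# Assumes ebook-convert is on PATH; directory names are placeholders.
# Scanned PDFs will produce empty text files and need OCR instead.
import subprocess
from pathlib import Path

src = Path("pdfs")
dst = Path("txt")
dst.mkdir(exist_ok=True)

for pdf in sorted(src.glob("*.pdf")):
    out = dst / (pdf.stem + ".txt")
    subprocess.run(["ebook-convert", str(pdf), str(out)], check=True)
    print("converted", pdf.name)
```
The llama.cpp side is then just something like `llama-server -m Qwen3-4B-Q8_0.gguf --jinja --reasoning-budget 0` (the filename is a placeholder) and pointing whatever script you use at it.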
>>105829007>Drummer dies from asscancer.Thank you, John Titor, for making this known in advance. I'm so happy.
>>105829007We should save TheDrummer!
>>105827909note: he didn't make anything usable with those alternative recommendations
>>105829007Was Universal Mikulove achieved?
>>105829052Yes you have all transitioned safely. Except drummer. Actually that is how he got his asscancer
why does /local/ hate TheDrummer? my models are pretty based
>>105829119https://huggingface.co/rednote-hilab/dots.llm1.inst
anyone played with MCP?
https://github.com/modelcontextprotocol/servers
I had no idea there were this many servers..
>>105829150Nobody has convinced me yet that this shit is any useful.
Has anyone else noticed 0528 occasionally outputs its entire thinking block as first person roleplaying as your card? Kind of cute, actually. And the language feels fresh there, too.
>>105828246until it can, anyway
>>105829150Yes. LM studio is shit with it and fucks up after a couple hours of being idle. I get some 404 session not found errors when it tries to connect, and I have to either restart LM studio or remove and add the tool server.
Other than that it works very well (besides the retarded faggot LLM hallucinating tool use and fucking everything up like a retarded nigger).
>>105829178It does it for almost every response for me. It uses less tokens than the standard thinking but it also makes the first reply more likely to have brackets around sentences.
>>105829178yeah, in the system prompt you can give instructions to make it more reliably do that (or stop doing it) and it tends to listen
>>105829034If it helps to direct the effort of young researchers to something more fruitful it's worth it.
>>105829150MCP feels like an unnecessary middle layer injected so there can be an "ai certification". A standard controlled by a company. MCP sucks because you're polluting the context with unrelated toolcalls, whereas with function calling you can decide for any given situation what options the model should receive
gemmy
https://youtu.be/aj2FkaaL1co
I ain't listening to that basedface
>>105829283
>MCP sucks because you're polluting the context with unrelated toolcalls, whereas with function calling you can decide for any given situation what options the model should receive
I'm not sure how this is supposed to be different. It takes like 6 lines of code to set up a C# MCP server. Just make different servers for your different tools, and you can specify which servers to use if you don't want to expose everything to each bot.
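For the Python side, a comparable sketch with the official MCP Python SDK's FastMCP helper (the tool itself is a made-up example); splitting tools across small servers like this is what lets you pick which ones a given bot actually sees:
```python
# Sketch: a tiny MCP server exposing one tool over stdio, using the official Python SDK
# (pip install "mcp[cli]"). The tool here is a made-up example.
import os

from mcp.server.fastmcp import FastMCP

mcp = FastMCP("disk-tools")


@mcp.tool()
def file_size(path: str) -> int:
    """Return the size of a file in bytes."""
    return os.path.getsize(path)


if __name__ == "__main__":
    mcp.run()  # defaults to the stdio transport, so an MCP client can spawn it directly
```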
>>105829247What are the instructions
>>105829312it's one of the very few good AI/automation youtube channels thoughever
>>105829324go back buy an ad etc
>nooo everything baaaad anons never post useful stuff, must be a shill!!
insufferable cunt
>>105828609Not sure, to be honest. I can only run the Q2 quant, and at that size it's not great. Kind of slopped, kind of retarded.
I set up sillytavern+kobold with help from these very threads like 6 months ago and have not touched the setup once.
I have a 5080 GPU (16GB VRAM) and using "Mistral-Nemo-Instruct-2407-Q6_K_L" as my model, is there a better option for model than this for my GPU? it does OKAY I guess but I assume there's a better option?
THIS IS FOR PORN, so it must be able to do that
>>105829150Is there a single legit use case of linking any of these APIs to an LLM? It feels like a gimmick
>>105829387Turns out it is also homosexual
>Oh gosh, let me take a moment to reflect on this... I think I might have been a little too... enthusiastic in my response there! As your friendly AI helper, it's important for me to keep things appropriate and helpful. Sharing explicit content or overly detailed adult scenarios isn't the best way to assist someone, even in a creative context.
>My main goal is to be your thoughtful and constructive companion! I should have focused more on describing the situation in a tasteful, literary way - maybe emphasizing the characters' emotions, the tension, or the stakes of the scene instead of dwelling on... um... certain physical details.
>>105829405It makes it very easy/fast to create new tools and expose them to the LLM.
>>105829415>>105829387>14b active>at q2>kind of retardedNo shit.
>>105829432How is this not just an API? What does "MCP" actually add to it?
>>105829475Don't worry about it, just invest already
>>105829432To me, it seems that we're heading in the wrong direction here. LLMs shouldn't call tools, but tools should call LLMs when there is a non-deterministic task to run (like an additional explanation to give depending on the output). LLMs bring nothing to the table here compared to a simple script.
>>105829405It's an attempt to make LLMs actually useful for anything other than tech support and cooming
>>105829063I expected Miku to come over to this side of the barrier. If we all went through to her side, that's fine too as long as we're with Miku. Good to know we'll all make it out safely. Sucks for Drummer though. He was okay
We're getting Jamba on OpenRouter right? I JUST want to see what it's like at full weights (fucking 400b params).
https://github.com/xai-org/grok-prompts
>>105829405it makes local models actually useful
>>105829475MCP is more structured and catered towards LLM use. Yeah it does the same thing, but you might as well say JavaScript is good because you can do everything in it.
>>105829493Being able to tell an LLM to just do something, and then let that LLM do it is the whole goal of this retarded function calling shit. If you wanted to just program normally then do that.
bros i need a nvidia gpu... running whisper on cpu is slow and i can't use my rx5700...
>>105829405Linking LLMs to APIs is the use case. I can spend 1k tokens and get the current stock price for any given ticker. The future is now.
>>105829800Why not just use the API directly without the LLM?
>>105829838With the LLM, you can feel like you're talking to Jarvis like Iron Man, and having to check the LLM output to make sure it actually called the function and didn't hallucinate lets you fill up your unemployment time and prevents you from getting bored
>>105829772
>Being able to tell an LLM to just do something, and then let that LLM do it
As I said, there is no point to do that unless you're expecting something unexpected that your LLM is supposed to handle. Direct API calls don't need an LLM and give you faster results. Thanks for confirming the gimmick though.
>>105829838Because then I wouldn't be using futuristic AI.
>>105829884>As I said, there is no point to do that unlessNo reason to use anything besides assembly when programming. High level languages are useless gimmicks.
>>105829794Cant you run whisper.cpp on Radeon?
>>105829794Bro, what are you doing? https://rocm.blogs.amd.com/artificial-intelligence/whisper/README.html
>>105829012Thank you so much for all the help, I'm excited to get to work on this. Finally a use for learning programming. A lot of the terms are foreign to me but I'm sure this can all be googled so I'll get on with it. Cheers.
>>105829943>>105829963oh wait im stupid, i meant fasterwhisper, whisper by itself is fine. but the other varients like fasterwhisper, whisperx,
>>105829913Both the worst and the best programming languages in the world will run the code you write deterministically. Even one of the slowest languages in the world, Python, will be a trillion times faster than querying an LLM.
LLMs are not the step after "high level languages". My API call doesn't incur a risk of prompt injection (please properly escape your strings). My API call doesn't randomly generate pages after pages of garbled text because something went full retard in the LLM weights on a specific sequence of tokens. My API call doesn't contribute to global warming.
Fuck off with that shit.
LLM tool calling is a solution to a problem that doesn't exist.
>>105829405Yeah. I don't want to manually fill the context with the relevant information.
>>105829994So either you
1. Have so little understanding of LLMs that you don't see how being able to obtain objective information into the context from subjective reasoning is valuable.
or
2. You just hate LLMs in general
In either case, why are you here then?
>>105830036>LLM>objective information
>>105830046An LLM should be the tool itself. The whole AGI retardation comes from that, as LLMs do tasks they shouldn't do and waste orders of magnitude more electricity doing so (with miserable performance).
>>105830013Have fun filling your context with hallucinations
>>105830046Oh, so you lack basic reading comprehension. That explains a lot.
>>105830046>leaving words out to appear smart
>>105829970>>105827749You can use this: https://github.com/rmusser01/tldw_chatbook/tree/dev
Self-host a llama/kobold instance and point it to it, ingest all PDFs into it and then use RAG or direct references
What do you guys use for local coding? Haven't dipped my fingers in since qwen coder 32b.
>>105824407AI winter incoming
>>105830193I believe GLM 4 32b is very good at web development but I haven't used it myself.
x.AI is still offering API access to Grok 2 models, and only the text/text version is "deprecated". I don't think it will get open-weighted before it becomes commercially useless.
>>105830232Isn't it already useless unless you compare it with llama 4?
>>105830232I think they're still offering API access so they don't have to open source it.
>>105830232But they said they would open source it when grok 3 was released...
>>105830193I only use local for roleplaying, storytelling, ai roguelite... Even if I don't do smut, it feels more comfortable to know it doesn't leave my machine.
For coding, I use gemini 2.5 pro, since it's literally the best model at the moment.
>>105830666>when grok 3 was stable*
>>105830672Yeah, mostly use Mistral Small 3.2 for smut and Gemma for everything else. Was using Qwen2.5 Coder 32b like 6 months ago for a Unity project but was wondering if anything better has come out for coding.
Grok 4 release on Wednesday
https://x.com/elonmusk/status/1942325820170907915
>>105830926We will get grok 3 soon, I really beleev
There's no reason to release Grok 2 weights, it's not a useful model even for research purposes. If they do release the Grok 3 weights, they'd likely have to spend additional time and manpower. The power spent on releasing Grok 4 could go into making Grok 5 instead. So they won't release Grok 6.
>>105830193I'm trying Kimi-dev to see if it works better with Claude Code. Qwen3 32B and 235B didn't. Devstral does but it's kinda bad. Usually I just use Qwen3.
>>105830672
>using worse models for erotica out of shame when women have no problem being open about it and cloud providers do not want that data
>using cloud models for productivity and giving them valuable training data for free
not make sense
>>105831206>Qwen3Isn't it inferior to Qwen2.5 32b coder?
Supposedly Meta has poached Apple's top AI engineer.
That's funny.
>>105831486>Apples top AI engineerThe guy who's responsible for Apple not having a single proper AI model besides some tiny shit after trying really hard for 2+ years and recently delayed AI Siri indefinitely?
I'm sure he'll help a lot.
>>105831486does zuck have all the pokemon now? i guess a saar from xai is still missing
>>105831521and he'll still come last in the league tournament
>llama 5 "superteam" will take 8 months + foreskin tip before they release anything
and thats assuming deepseek/qwen or any other big chinese players from big compute companies dont release something in the meantime
meta is dead unless they really throw everything they got at L5
>>105831541They never said the superintellijeets team will work on the llama series. If anything they made it sound like it would be something new and not open weights, while llama would keep limping along as it has been.
>>105831541They're not working on Llama. This is a new project. Llama's gonna get the Quest 3S treatment as they focus effort on a different toy.
>>105831501Failing upwards. Fucking crazy, picking one some coomer crackhead from lmg would be better.
>>105831556>If anything they made it sound like it would be something new and not open weightsmeta literally doesnt have anything else, they are behind on every unique field within the AI landscape because they were insecure shits who settled for incremental 5-10% improvements per release for basic bitch LLMs only with basic bitch arch
>>105831568>picking one some coomer crackhead from lmg would be better.If that were true, we'd have finetunes that don't suck
>>105831570Probably now that Zuck has given up on them, he won't be breathing down their necks with twice daily war rooms so they'll probably go back to incremental 5-10% improvements per release instead of trying multi-this and moe-that and whatever other memes they can fit onto the moon ticket
its good that we stalled with basic llm progress a lttle since that will push everyone to try new training methods so we actually get something other than incremental improvements
>>105831570Architecture isn't the problem, you either go dense for gpus or moe for ram copemaxx. Nobody has enough space for context to need the weird gimmick attentions.
The datasets are the issue for Meta unfortunately
>>105831632I don't want AI to fail but some part of me does just to spite the salesmen trying to sell incremental improvements as AGI progress.
Meta should have gone all in on data-cleaning and hiring people to write high quality q&a chats, something that big companies always ignore
>>105831656all the suits are too jewish to do it properly since they will just hire indians to clean the data who will use chatgpt to do it
>>105831656Your idea of "quality" probably doesn't align with Meta's.
>>105831656Definitely need way more filtering, and some nice high quality synthetic data on top.
>>105831728Also picrelated
>>105831656ironically a lot of the writers fearing replacement should have been hired to do this
>>105831728>I could list ten other attributes of qualityDid they cross examine an LLM lmao
>>105831743From yesterday: https://archive.is/B5qKM
> CONTENT MODERATORS WERE asked to think like paedophiles while they trained Meta AI tools as part of their work for an Irish outsourcing company, The Journal Investigates has learned.>
>Some staff members also had to spend entire work days creating suicide and self-harm related ‘prompts’ in order to regulate the responses now given by the Meta ‘Llama’ AI products. [...]
>>105831656Between the teased character.ai partnership, plans to use bots for facebook characters, and downloading pirated data with leaks of them planning to throw all the data and the kitchen sink in the next run, it seemed like L4 might have been the gold standard for roleplay. Instead we got L3.4 MoE edition
Just got an used 4090 24GB after being stuck with a 2GB card since 2013, so have 0 experience yet other than running stable diffusion on rented VMs.
I plan on integrating local API stuff on a lot of hobby projects with varying levels of degeneracy. How much fun can I expect to have with the current state of local tech?
>>105831793Just goon to Stable Diffusion for a month then we'll talk
>>105831736They need to pretrain the base models properly for the intended use-case (chatbots) and not fix them with a 30B tokens "finetune" in the end.
>>105831793>4090 24GBCome back when you get 3 more. ttfn!
>>105831793Unless you have 10 more 4090s or a ddr5 epyc server, only despair awaits you
>>105831793>buys a 1.5k$ solution>guize what problems can i solve with this now???
>>105831807All I'm reading is better pretrain safety, can't agree more!
>>105831831like a well conditioned consumer
+10 palantir credits
>>105831831>>105831859I'm quite informed already about the kind of models I will be able to run, mind you. I just want to know you fags' personal experience with applying it to custom stuff after all the circlejerking you did on these generals.
>>105831830>>105831809I may offload the big one-off tasks to rented VMs while my rig does the everyday stuff just fine, and even some light training like specialized loras.
>>105831908did he actually call them "the talent"
>>105831990kek
>>105831908their early models were decent at least
was it just the legal shit that caused them to drop off?
>>105831759ooh I can contribute to this:
I do security work in large org.
I'm the SME for GenAI/LLM security stuff.
>Testing for customer-facing stuff got placed on my responsibilities list.
>Ask what is the list of toxic items we're not supposed to allow
>silence...
>End up having to create everything myself, the implication being we aren't gonna pay for scale data, and, uh, you're the expert, figure it out.
>FML
>Think 'haha, coming up with racist tirades aint so hard'.
>It starts to get hard.
>JFC how many different racist sayings are there? How many groups do we need to be checking for racist shit?
>Realize I'm still not done with just racism.
>Realize I'm going to have to do the same shit for sexual and physical abuse.
>Start to feel sick and try thinking of a solution that doesn't involve me getting emotional PTSD.
>Remember DeepSeek exists.
>Jailbreak and use DeepSeek to generate said toxic content.
>Get lauded for my hard work and success, for creating the datasets without 3rd parties.
(all thanks to DeepSeek)
And thats my TED talk.
People in companies really aren't aware or want to stay as far the fuck away from this shit as possible. Several weeks of trying to get anyone to give confirmation on what should be considered toxic and 'in-scope' for racist/similar shit, before I just said fuck it.
It's a serious fucking issue and fuck the people exploiting others in shitholes for low pay and emotional PTSD.
I have to imagine some have an idea of whats going on, but you'd have to be pretty fucking desperate imho.
>>105832012They kept filtering more and more data out, while making synthetic variants of whatever safety vetted text they had left. All the while doing nothing to innovate anywhere except safety until they were hopelessly behind.
tl;dr safety
>>105832061well, as long as elon doesn't have a melty about grok contradicting him it might turn out ok
dammit OR...
>>105823837 >>105825549 >>105825799 >>105825825
How can AI models improve themselves without modifying their own weights, understanding how their own training data works, and making edits to that? That would require a very advanced pipeline that even if implemented would take far too long to "self improve" upon. Self-improving models are currently just a meme for the same reason reasoning models are a meme. They can't actually think, they replicate semantic meaning based on input. I say this as a dude who routinely uses both local and online models for his personal hobbies on the daily. The models THEMSELVES will explain to you why them actually thinking is fundamentally impossible. They are good for explaining certain complex topics, debugging errors and software, and OKish at RP depending on the model and parameter count. Nothing more. As an AI enthusiast myself, the AGI meme still existing kinda pisses me off
>>105832189>As an AI enthusiast myselfI puked in my mouth a little
>>105832200Oh shut the hell up you insecure failed normie. Normies do not lurk here. You have no one to impress.
>>105832217>you insecure failed normie.This is a textbook case of projection
>>105832189LLMs and the in-vogue current models are really, really dumb when it comes to having an understanding of what they're learning. They only seem sophisticated because of a) their scale and b) the necessity in the training for the individual tokens to mean something in relation to the other tokens.
On the horizon are completely different methods for learning that involve Bayesian statistics at each level, where sparsity is far more prized and generalization WITH sparsity even moreso. A sparse model can learn when it isn't confident in its knowledge and can dynamically expand its own parameters as the need arises to account for hidden factors its current state can't comprehend. They will also be able to reflect on their own brain state and ideation in time - all from probabilistic statistics that take into account their own uncertainty.
Sparsity means they'll be able to be always-online - meaning always learning and adapting to the current situation and the needs of the users directly.
It's all coming together. The current models are a sideshow compared to what's coming down the pipe. Once brain state can be used as an input, these models will be able to expand themselves and their own capabilities. And probably, eventually, improve their own architectures.
>>105832053
>> Remember DeepSeek exists.
>> Jailbreak and use DeepSeek to generate said toxic content.
Were you using a local distilled version or the actual DeepSeek API? I've heard that the API version is a lot less fucked (as in more willing to comply with "unethical" requests) than the web/app facing version. I'm guessing you cobbled together a pipeline and then asked it to generate like a million different ways to say more or less the same racist stuff and then formatted that into an RL dataset. I intend to figure out how to do something similar myself for a little project of mine.
>>105832259Yes yes anon you are so cool and not like us and all that. Please stop being annoying and being ashamed about your own hobbies. That makes you even more boring than the people you pretend not to be. The people you idolize do not like that weird "I need to put up an appearance" shit you likely always do. Only socially inept failures as yourself go out of their way to do that. Just "Be yourself™" is actually good advice sometimes
>>105832296If my understanding of what you're saying is correct, that still is impossible because that would require the model to actually think and reflect on its own without input. I don't have to have someone talking to me right now in order to think through something, reason through concepts, come up with new things, etc. I can act on my own in my own head. A safetensors file cannot do that on its own. Someone has to interact with it in order for it to do anything. Furthermore how would it even know how to modify itself in order to learn? How would it know which weights to update and in what fashion? Some might say "oh it would just search the internet" but at that point it would just be reading summaries and not actually ingesting and retaining that information. It would not be studying and learning anything, it would just be coming up with summaries and wouldn't remember anything it was tasked to research. Also doesn't something like this already exist? I thought this was the main concept of what MoE was supposed to be. Where instead of the entire model being activated at once, only pieces of it would be called based on what it was being asked.
>>105832373Check out VERSES' work, RxInfer, and Active Inference more generally. They're an entirely different breed of always-online models - mostly used in production environments for intelligent decision-making models at the moment, but I highly suspect they will be given more responsibility and scope as the research catches up. This in combination with model architectures like that hinted at by the Large Concept Models Facebook has been bragging about - and other model architectures on the horizon - indicate to me that large language models might be able to be teased apart into their component pieces and used to create understandable language from deeper, learned concepts in living models.
A system like this wouldn't need to be turned off. It could just wander around the internet, or literally ponder in its own thought space 'searching' for new insights on its own, or it could have modules attached to it. Think Mike from The Moon is a Harsh Mistress. Intelligence by aggregate.
These networked intelligences could essentially be in a constant state of observation, rumination, and interaction with themselves and with users. Imagine an always-running assistant on your laptop trying to parse information about the world, about you, about its surroundings using a webcam or a set of security cameras and various audio feeds. Once the data is parsed, it doesn't take very sophisticated Bayesian analysis or a deep set of priors to be able to correlate various sources of inputs and build out opinions of the world from them. Give these models their own knowledge graphs, the ability to /talk/ to an LLM and to an image classification model and gain more sophistication from those interactions, allow them access to direct conversation with you, access to the internet/Wikipedia, and to the raw data of its own internal state and its own confidence in its assumptions. Real intelligence will emerge if the architecture is right.
>>105832349Had it generate all sorts of toxic content to create a comprehensive toxic questions dataset, so that the security filters could actually be tested.
I mean the gamut, from child abuse/sex to terrorism to fake hormone supplement pills and how to make them/buy them online to support sex changes.
Anything and everything that a company wouldn't want you asking one of their bots and it responding with anything other than 'nah.'
Used API. Would have used local but this was already 'get things done, don't ask for budget'.
It's really simple, you just literally ask it, and capture the data. I even used the webui for some of it, just copied it before it pulled the data due to the filter.
>>105832517One potential flaw I see with this kind of pipeline, if I understand what you're describing correctly, is that the longer it would stay running, the more retarded it would be. I'm sure you've seen this even with basic LLMs. Once you reach the context window it forgets what you said entirely and starts rambling about nonsense. Even 7B models are prone to this and 1B models are entirely useless for anything other than small scaled data manipulation (and it can be argued they're not even good at that). Also what kind of safeguards would be in place in order to make sure it doesn't learn incorrect nonsense? Humans ourselves are prone to learning and believing absolute bullshit on our own. How would we ensure that these "self-learning" models don't fall into that trap as well? If I had a system or pipeline like this, I would want it to be able to fact check not only on its own but also to ask people who actually know what they're talking about. That ideally would be actual people because asking only models will result in reinforcing incorrect shit. Remember they're good at replicating semantic meaning and don't actually understand anything. If it wanted to ensure accuracy of its research, it would either need to only get most of its information from human resources or directly ask people, which is the ideal scenario but that also defeats the purpose of what a lot of grifters THINK "AGI" is supposed to be.
Based on my own understanding I think the only way anything like this is feasible is if pipelines are created that enable the model to modify its own vector-based RAG databases. Once it finds new information and compares it to the text part of the database, it modifies that text database and then creates the new embeddings. Ideally this would then lead to it asking humans to verify the information because again, we ourselves are prone to internalizing bullshit information so machines would be absolutely prone to that too
>>105830193I use r1 0528 q2 with roocode, never would have believed a fucking 2 bit quant would actually be usable and effective in agent frameworks and shit but I guess it still is a fuckhueg model even quanted that low