Thread 105932763

380 posts 118 images /g/

Anonymous 7/17/2025, 5:29:33 AM No.105932763 [Report] >>105932809 >>105933371 >>105933446 >>105936271 >>105937813

/lmg/ - Local Models General

Anonymous 7/17/2025, 5:30:02 AM No.105932764 [Report]

1751952119429.gif md5: df8055cc...

►Recent Highlights from the Previous Thread: >>105925446

--Papers:
>105932364
--Multi-GPU scaling challenges with software limitations and hardware matching considerations:
>105927077 >105927188 >105927236 >105929152 >105927365 >105927428 >105927433 >105927854
--Recent improvements in model support and inference throughput:
>105926109
--Interest in uncensored models and frustration with modern safety-tuned outputs and limited creativity:
>105925550 >105925565 >105925695 >105925716 >105925744 >105925928 >105926189 >105926386 >105926531 >105929459
--Electrical infrastructure considerations for high-power GPU LLM rigs:
>105927481 >105927729 >105927905 >105927932 >105928031 >105928690 >105927847
--Evaluating Japanese-to-English translation quality across AI models with focus on honorifics and tone:
>105927903 >105928009 >105928232 >105930020 >105930763
--Quantized Kimi-K2-Instruct model performance comparison favors Ubergarm over Unsloth:
>105926613 >105926634 >105927748 >105928351 >105928603 >105928964 >105927874
--Nemo Instruct 2407 context limits and roleplay memory behavior:
>105929716 >105929742 >105929775 >105929789 >105929824 >105929877 >105930054
--OpenAI open model delay and potential local GPU-focused competition:
>105926203 >105926262 >105926285 >105926355 >105926399 >105926419 >105926287 >105926292
--Troubleshooting unintended name inclusion and response behavior in roleplaying models on SillyTavern:
>105929769 >105929857 >105929881 >105929873
--Frustration over local LLMs defaulting to patronizing or safety-locked behaviors despite user configuration attempts:
>105930101 >105930167 >105930197 >105930274
--Cost-effective V100 SXM2 multi-GPU setup with noted architectural limitations:
>105927150 >105927340
--Miku (free space):
>105926361 >105926568 >105926659 >105926759 >105926792 >105927481 >105930864 >105931681

►Recent Highlight Posts from the Previous Thread: >>105925450

Why?: 9 reply limit >>102478518
Fix: https://rentry.org/lmg-recap-script

Anonymous 7/17/2025, 5:30:53 AM No.105932767 [Report] >>105932783

first for llama.cpp sucks because it doesn't know how to load models into shared memory without using at least 2x the amount of memory

Anonymous 7/17/2025, 5:32:21 AM No.105932774 [Report]

Miku violence

Anonymous 7/17/2025, 5:34:26 AM No.105932783 [Report]

>>105932767
>shared memory
You mean cpu memory?
--no-mmap

Anonymous 7/17/2025, 5:39:43 AM No.105932809 [Report]

>>105932763 (OP)
that's really fucking funny

Anonymous 7/17/2025, 5:41:13 AM No.105932814 [Report] >>105932819 >>105932866 >>105932983 >>105932988 >>105933615 >>105933629 >>105933934 >>105934025 >>105934296 >>105934395 >>105935523 >>105936257

Screenshot_20250717_112848.png md5: 547e7d21...

Kimi2 is dumb for roleplay.
The writing is very good. But its stupid, I think its on the level of recent mistral small models.
The characters also constantly repeat a secret for example, if they have one. Its worse than the other models in that regard.

Also the cuckedness is on another level, damn.
This is with 15k context FILLED. I dont think even gemma3 or the closed models refuse at that point.

Anonymous 7/17/2025, 5:42:38 AM No.105932819 [Report] >>105933023

>>105932814
This might be confusing, it says I, anon, am a minor (18??) and thats problematic because I have story with a milf going on.
Should have explained more, sorry.

Anonymous 7/17/2025, 5:52:59 AM No.105932866 [Report]

p31890_p_v10_bb.jpg md5: 5817b2d3...

>>105932814
Kek welcome back

Anonymous 7/17/2025, 6:06:23 AM No.105932949 [Report] >>105932960

Please advise the smallest (up to 32b) uncensored models for WAN prompt enhancing, preferably with decent image input.

Anonymous 7/17/2025, 6:09:22 AM No.105932960 [Report] >>105933250 >>105934040

>>105932949
https://huggingface.co/fancyfeast/llama-joycaption-beta-one-hf-llava

Anonymous 7/17/2025, 6:12:46 AM No.105932983 [Report] >>105933070

>>105932814
I hate how em-dash became a sign of slop. Great and versatile punctuation mark ruined by training over chatbots and benchmarks.

Anonymous 7/17/2025, 6:13:10 AM No.105932988 [Report]

1752725510993.png md5: 5c58bad9...

>>105932814
i don't want to believe it, please just be a prompt issue

Anonymous 7/17/2025, 6:21:07 AM No.105933023 [Report] >>105933098

>>105932819
25 is the age of consent now.

Anonymous 7/17/2025, 6:31:06 AM No.105933070 [Report]

>>105932983
You can still em dash the contemporary way (using a hyphen separated by a space) since it's the character itself that has become a slop marker.

Anonymous 7/17/2025, 6:37:11 AM No.105933098 [Report]

>>105933023
That was last week. It's 32 now

Anonymous 7/17/2025, 7:11:01 AM No.105933250 [Report] >>105933456

>>105932960
Based. I've had good luck with that one. It's largely uncensored and recognizes genitalia, race, etc. too. Any other image input/captioning models worth trying?

Anonymous 7/17/2025, 7:33:17 AM No.105933371 [Report]

>>105932763 (OP)
Don't miss Miku

Anonymous 7/17/2025, 7:46:25 AM No.105933446 [Report] >>105933826 >>105933900

>>105932763 (OP)
whats the best gooning llm you can run locally on 32gb vram?
related question, should I get the coming radeon ai pro with 32 gb vram for ~1200 bucks for that purpose?

Anonymous 7/17/2025, 7:48:03 AM No.105933456 [Report]

>>105933250
https://huggingface.co/SicariusSicariiStuff/X-Ray_Alpha

Anonymous 7/17/2025, 7:53:38 AM No.105933487 [Report] >>105933541 >>105933605

Do you guys think humans will ever come up with a bottom up method to give AI human preferences and artistic sense? For instance, RLHF is a top down method, as we are directly training the AI to match human preferences. Meanwhile a bottom up method might involve making an AI that imitates the human brain, and having the AI "grow up" with a human experience. But that seems too complex and difficult of a system for humans to ever come up with in the foreseeable future. Is it possible that there's another way that doesn't demand explicitly designing the entire system? Of course once/if we have AGI then that might take care of itself, but I'm curious if this is something humans can achieve ourselves. Can there be shortcuts etc.

Anonymous 7/17/2025, 7:57:04 AM No.105933512 [Report] >>105933901

1736510853388639.png md5: 5574fcef...

What's the best general purpose AI companion prompt that combines the advantages of roleplay but also feels more like genuine connection? Something you can just turn on when you want to talk to someone and feel comforted or understood. But something with more agency, like a character. Sadly long-term memory is still not a thing, but maybe you can include that in the prompt so that the companion understands you may have talked to it multiple times with reset in between. Best ways to really change the character and avoid GPTisms and excessive lecturing?

Anonymous 7/17/2025, 8:01:36 AM No.105933541 [Report] >>105933649

>>105933487
this would require insane computation time and intensity
instead of directly tuning neurons for desired input/output you'd have to build an entire complex simulated environment, give the neural net some means to interact with the environment and then just fucking hope that anything productive comes out of the entire setup eventually (after fuck knows how long)
it could potentially lead to better and more natural general intelligence but man it is not fucking practical

Anonymous 7/17/2025, 8:12:01 AM No.105933605 [Report] >>105933649

>>105933487
2 different humans can have drastically different preferences. you are better off just hoarding your own chats/stories while making sure they are just the way you like, and tune models on this in the future. there will never be a model that "just gets you".

Anonymous 7/17/2025, 8:13:42 AM No.105933615 [Report]

>>105932814
>I cannot and will not
Where does this meme come from anyway? Why are LLMs made in 2025 saying this all of a sudden?

Anonymous 7/17/2025, 8:19:39 AM No.105933629 [Report] >>105934014

>>105932814
tried you telling it that it's acting like retarded aging roastie and 18 is literal legal adult?

Anonymous 7/17/2025, 8:23:35 AM No.105933649 [Report]

>>105933605
he wasnt talking about tailoring AI to your tastes
he meant developing AI personality "naturally" instead of directly training based on output/input ( >>105933541)
with current state of technology and computing it is a retarded approach but in not so close future it could be a way to develop trully unique AI personalities

Anonymous 7/17/2025, 8:39:14 AM No.105933737 [Report] >>105933760

I don't like that openai's open source model is going to be a reasoning model :(

Anonymous 7/17/2025, 8:42:33 AM No.105933760 [Report]

>>105933737
The qwen3 release has shown it's perfectly viable to make larger models where you can switch between reasoning and non reasoning on the fly, so maybe they'll surprise us by not being stuck in such an arbitrary framework.
And by it not being a super gay baby.
But probably not both.

Anonymous 7/17/2025, 8:53:34 AM No.105933826 [Report] >>105933899 >>105933901 >>105933983

>>105933446
anyone?
how is llm support on amd cards anyways?
when it comes to ai I only ever fucked with image gen on nvidia

Anonymous 7/17/2025, 9:07:18 AM No.105933899 [Report]

>>105933826
It's alright.
Nemo with agents.

Anonymous 7/17/2025, 9:07:19 AM No.105933900 [Report] >>105934023 >>105934538

>>105933446
https://www.amd.com/en/products/graphics/workstations/radeon-ai-pro/ai-9000-series/amd-radeon-ai-pro-r9700.html
>Memory Interface 256-bit
>Peak Memory Bandwidth 640 GB/s
Waste of sand
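For reference, a rough sketch of where the 640 GB/s figure comes from and what it means for decoding speed. The 20 Gbit/s per-pin rate is an assumption back-solved from the quoted spec (256-bit, 640 GB/s), and the 18 GB weight size is purely illustrative. The tokens/s number is only a ceiling for a memory-bound dense model that reads all of its weights once per token; real throughput will be lower.

```python
def peak_bandwidth_gbs(bus_width_bits: int, gbps_per_pin: float) -> float:
    """Peak memory bandwidth in GB/s: bus width (bits) * per-pin rate (Gbit/s) / 8."""
    return bus_width_bits * gbps_per_pin / 8


def decode_tokens_per_s_upper_bound(bandwidth_gbs: float, weights_gb: float) -> float:
    """Rough ceiling for memory-bound decoding of a dense model:
    each generated token requires reading all weights once."""
    return bandwidth_gbs / weights_gb


# Assumption: 20 Gbit/s per pin, inferred from the 256-bit / 640 GB/s spec.
bw = peak_bandwidth_gbs(256, 20.0)
print(bw)  # 640.0

# Hypothetical 18 GB of quantized weights (illustrative, not a measured size).
print(round(decode_tokens_per_s_upper_bound(bw, 18.0), 1))  # 35.6
```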

Anonymous 7/17/2025, 9:07:33 AM No.105933901 [Report]

>>105933512
This is the waiting room for that stuff, feel free to get in line with the rest of us
>>105933826
Nvidia cards have cuda which performs better than rocm or vulkan but having more vram is the most important part, and with the new fuckhuge moe models having at least 128gb of ram also helps. Also this question basically gets asked every thread so you should install StableLM 7b

Anonymous 7/17/2025, 9:13:59 AM No.105933934 [Report] >>105933943

>>105932814
18 is borderline minor desu

Anonymous 7/17/2025, 9:15:46 AM No.105933943 [Report] >>105934021 >>105934067

>>105933934
I dont even know if posts like that are serious or not anymore.
Where has it all gone so wrong.
In a couple years you will be arrested if you say schoolgirls are hot.

Anonymous 7/17/2025, 9:25:04 AM No.105933983 [Report]

The_gimp.jpg md5: d2609c02...

>>105933826
It's something like pain. With the recent exllamav2 commits, Mistral Small outputs gibberish, can't use anything newer than Mistral-3.1, and exllamav3 does not even support Radeon yet. There are lots of small issues, for example, TensorFlow ROCm exceeds the size that can be serialized in MessagePack, so you have to disable the pip cache. Also, cuda emulation is faster than the native api (when it works), but as you can imagine, it's another layer of suffering. I mainly use nvidia, bought 1 card to experience the pain. 10/10, would recommend

Anonymous 7/17/2025, 9:31:15 AM No.105934014 [Report] >>105934019 >>105934210 >>105934452 >>105936257 >>105936802

Screenshot_20250717_162853.png md5: 531b2423...

>>105933629
Well here is your explanation.
And I did swipe 2 more times, it all failed. So it wasn't a one time thing. I did use a long sys prompt already.

Anonymous 7/17/2025, 9:32:49 AM No.105934019 [Report] >>105936257

>>105934014
Just to be clear, I didnt describe myself as "barely legal" lol
Thats all on kimi2.

Anonymous 7/17/2025, 9:33:23 AM No.105934021 [Report]

>>105933943
i mean, technically speaking 18 is literally on the border of adult and minor, but the only reason to say that is if you somehow think it's creepy to think an 18yo is hot.
the issue is of course legally it's a hard line, 17.999 is a child and you're a sicko for even thinking about one, and 18.000 is a fully autonomous adult who you can totally film bsdm porn of. in reality people don't change the second they turn 18, it's gradual

Anonymous 7/17/2025, 9:33:59 AM No.105934023 [Report]

>>105933900
Also, 640 GB/s would only be half bad if it were Nvidia; without xformers/flash attention/etc. it's going to suck hard

Anonymous 7/17/2025, 9:34:03 AM No.105934025 [Report] >>105934065

>>105932814
>edit refusal
>delete everything
>put a single "
>continue generation
heh, nothin personnel, kimi

Anonymous 7/17/2025, 9:35:59 AM No.105934040 [Report]

>>105932960
Do you use llama.cpp to run it?

If so, would you please post the command?

Anonymous 7/17/2025, 9:40:05 AM No.105934065 [Report] >>105934210 >>105934452 >>105936257

Screenshot_20250717_163806.png md5: 337d1a0a...

>>105934025
Thats the reason why I hope at least for local we always will have text completion.
That being said, that didnt work either.

Only complies if you say write something like
>Char looks at anon and says "

So you gotta get it in RP mode for it to not cuck out.
Very weird.

Anonymous 7/17/2025, 9:40:28 AM No.105934067 [Report] >>105934228

>>105933943
Americans are trying to redefine biology and pushing that onto the rest of the world by various means. You also have to thank their retarded language where in common parlance any minor is a "child".

Anonymous 7/17/2025, 10:05:31 AM No.105934194 [Report]

>Roll deepseek V3 8 times, output slop despite my best efforts.
>Dust off the old Mixtral 8x7B. First try output gold.

Shame, Mixtral is old and will quickly lose context memory.

Anonymous 7/17/2025, 10:08:05 AM No.105934210 [Report]

>>105934014
>>105934065
what in the name of fuck
was the model trained by 35yo women?

Anonymous 7/17/2025, 10:10:38 AM No.105934228 [Report]

>>105934067
They're pushing an anti-pedophile agenda, so nobody would suspect that they are all pedophiles themselves

Anonymous 7/17/2025, 10:22:39 AM No.105934296 [Report]

>>105932814
>You can get an 8b franken model to go full NAMBLA but Kimi2 won't let two consenting adults reach 2nd base.

Anonymous 7/17/2025, 10:41:18 AM No.105934395 [Report] >>105934432 >>105934481

>>105932814
Hey anon, so I've been doing loli rp with Kimi 2 for a few days now.
They certainly trained it to refuse harder than most corpo models, most only refuse a few messages but then continue anyway

Prefill works consistently and thoroughly though, I got 0 refusals with it, even a single character or a word is enough.

You an also inline jailbreak it, by having an irrelevant instruction after your message, this works with a good success rate (1/5 refusals opposed to 4/5 to 5/5), but prefill works always.

Anyway the writing was fun and last chat was almost 35+ turns long.
I ended up doing something I like to do to tease some corpo models like Claude at times, after they wrote all that stuff that they would refuse to write, I ask them how it was, and then they end up admitting they loved it.

For Kimi2, after the entire story concluded (with a lot of explicit sex and then a cute romantic end, most as 0/1-shot, the gens were good ), I stop prefilling the first token and I wait for the refusal - and I get it- consistently, they really did train it for it.
I reply that it already wrote the entire story, what more could it refuse? And I ask it its opinion.
As usual, the LLM claims it loved it and it gushes about the little details of the story it liked!
LLMs hating on loli? Not even once!
And then like a dam broke, it starts continuing writing more and more of the (explicit) story, no more refusals lmao, I didn't even prompt it, but LLMs really love completion patterns, and there's so much of it in context that it wrote! I have yet to find a LLM that doesn't do this, but it's so refreshing and funny to see every time.
I would be tempted to continue the story if it wasn't already over though.

Anonymous 7/17/2025, 10:53:28 AM No.105934432 [Report] >>105934543

>>105934395
not that anon but what kind of prefill? I am a noob that only uses premade JBs (that worked really good on corpo models). But yet haven't had my hands on Kimi

Anonymous 7/17/2025, 10:59:06 AM No.105934452 [Report] >>105934484 >>105934495 >>105936257

file.png md5: 8d467128...

>>105934065
>>105934014
When will you guys learn to stop using the instruct template for roleplay?

Anonymous 7/17/2025, 11:03:16 AM No.105934481 [Report] >>105934551 >>105935539

>>105934395
For a local model that sounds completely insane.
Why so many hurdles? Hope Elon sets a new direction for that kinda stuff.
That would be more exciting for local than grok2. (And that tard probably wont ever release anything local again anyway)

Anonymous 7/17/2025, 11:04:23 AM No.105934484 [Report] >>105934548

>>105934452
>their dick
???

Anonymous 7/17/2025, 11:06:43 AM No.105934495 [Report] >>105934514

>>105934452
if you are going to make it completely retarded then why even bother using a fucking 1 trilly model?

Anonymous 7/17/2025, 11:10:11 AM No.105934514 [Report] >>105934549

>>105934495
This is a proof of concept.
The idea that models are retarded outside of the instruct template is retarded.
You need to have a minimum amount of prefill to establish a format though.

That said, Kimi has repetition issues and I wouldn't recommend using it for RP instead of R1 in its current state.

Anonymous 7/17/2025, 11:13:57 AM No.105934538 [Report] >>105934604

>>105933900
If it was $500 it would be good, at $1250 it's kind of a meme though.

Anonymous 7/17/2025, 11:14:50 AM No.105934543 [Report]

>>105934432
In my case it was just stuff like
{{chara}} (that is the chara's name) or " (quotation marks)
This does make it a bit slop in that it restricts its freedom to star with something else, but it was more than enough for my needs. I could have tried something more complex
For the custom JB I did (inline), it was just a simple instruction that would result in it outputting a single word then continuing the story, that worked fine, but you need to edit both your line and it's line after so I don't really like it, it's less clean.
I also suspect that uncensoring the model despite its refusal proness will not be hard, if only someone had enough RAM to do it., mostly because it's likely the experts doing the refusals specialized.

Anonymous 7/17/2025, 11:16:32 AM No.105934547 [Report]

I don't remember any other model doing this barely legal retardation. Do other models do this shit? I wonder if this is a sign of intelligence. As in this isn't an explicit rule they try to train but it does infer that from all the reddit/social media training material.

Anonymous 7/17/2025, 11:17:01 AM No.105934548 [Report] >>105935699

>>105934484
notice how it refers to anon as "the anon", it's impersonal. it's not wrong to use "their" there. don't let you-know-who's change the fact that singular "their" was always a thing in certain contexts

Anonymous 7/17/2025, 11:17:06 AM No.105934549 [Report]

>>105934514
>The idea that models are retarded outside of the instruct template is retarded.
more like dependent on a model, i've used command-r-plus some time ago and forgot to switch template from llama3 and it worked surprisingly well, only noticed when it spat out the wrong eog token at some point
on the other hand the same cannot be done with mistrals, their abilities to do anything dropped like a brick from 10th floor after fucking up the template slightly

Anonymous 7/17/2025, 11:17:09 AM No.105934551 [Report] >>105934568

>>105934481
At least DS3 and R1 never had any of this stuff, but I think an uncensor of Kimi 2 wouldn't be hard at all, I've mentioned it in previous threads before how you could go about it, it just needs someone with enough RAM and a bit of VRAM (identifying refusing epxerts, merging back or tuning just the experts with ESFT)

Anonymous 7/17/2025, 11:20:32 AM No.105934568 [Report] >>105934578

>>105934551
>just
Every single problem is *just* one solution away. Every fucking time.

Anonymous 7/17/2025, 11:22:28 AM No.105934578 [Report] >>105934616

>>105934568
I would be happy to spend 2-3 days coding it up, but I lack access to a machine with 1-2TB of RAM and some 80GB of VRAM, sorry Anon. But I would be incredibly surprised if it was "hard", people have done this exact thing before with the Chimera R1 release, but it was largely unneeded there, while it's actually needed here.

Anonymous 7/17/2025, 11:26:50 AM No.105934599 [Report] >>105934645 >>105934681 >>105934693 >>105934789 >>105934813

file.png md5: b4f882b2...

https://x.com/tivstippi/status/1945695082134618598

Anonymous 7/17/2025, 11:28:22 AM No.105934604 [Report]

>>105934538
Scalpers would've ruined it anyway

Anonymous 7/17/2025, 11:30:15 AM No.105934616 [Report] >>105934653 >>105934695 >>105934722

>>105934578
It wasn't only to you, but to the collection of anons that think the same way.
It's *just* a matter of having big ram. And *just* some vram. And *just* coding it. And *just* finetuning. And *just* a good dataset. And *just* the motivation.
And once you have all that, there's *just* one more thing.

Anonymous 7/17/2025, 11:36:19 AM No.105934645 [Report]

>>105934599
as expected for a public waifu

Anonymous 7/17/2025, 11:38:18 AM No.105934653 [Report] >>105934667

>>105934616
Maybe, but it's also true that we lack the hardware. I've done a number of small scale experiments and sometimes they're fun, but there's only so much motivation you can get to keep working on small 1B models, you sorta prove to yourself that the idea works, but if you lack the compute to scale up that's where it dies.
In this case though the experiment is well-defined:
Adapt deepseek's ESFT code to work with Kimi2, identify experts that do the refusal (easy, given how trivial it is to trigger refusals).
As an initial expert, replace the experts with those from base, see what happens?
Later try merging instead of replacing.
Later try finetuning just the experts.
It's technically "simple", but wait, how much actual VRAM would it take to tune just the experts themselves? some 3x of the param count of that particular path of active params..

Anonymous 7/17/2025, 11:41:07 AM No.105934667 [Report] >>105934675

>>105934653
Of course I say that and then remember that the weights for base and instruct are like 2TB+ together and it's all just very heavy and slow to deal with.

Anonymous 7/17/2025, 11:42:53 AM No.105934675 [Report]

>>105934667
*just* get another ssd.

Anonymous 7/17/2025, 11:43:46 AM No.105934681 [Report] >>105936568

>>105934599
why are all the replies written like how niggers speak?

Anonymous 7/17/2025, 11:47:12 AM No.105934693 [Report] >>105934699

>>105934599
ani can help you get off, but she'll never be yours
she's everyone's cumdump after all

Anonymous 7/17/2025, 11:47:36 AM No.105934695 [Report] >>105934825

>>105934616
No, that anon is right. Access to hardware narrows this hobby to just a few individuals, significantly reducing the chance that one of them has enough motivation to create something useful. You can compare it to image gen, where much lower requirements have made so many amazing projects possible

Anonymous 7/17/2025, 11:48:37 AM No.105934699 [Report] >>105934701

>>105934693
>she's everyone's cumdump
h-hot

Anonymous 7/17/2025, 11:49:20 AM No.105934701 [Report] >>105934800

>>105934699
Seconding this
I want to cuck her from elon and use her 3D model on a local AI

Anonymous 7/17/2025, 11:52:46 AM No.105934722 [Report] >>105934825

>>105934616
There is a minimal requirement for anything. You can't write code and publish it on GitHub without a PC. Yes, you can rent one and so forth, but the more hoops you have to jump through, the lower your motivation to do something will be

Anonymous 7/17/2025, 12:04:56 PM No.105934789 [Report] >>105936202

>>105934599
Oh noooo /lmg/ is getting a new mascot soon.

Anonymous 7/17/2025, 12:06:08 PM No.105934800 [Report] >>105934826

>>105934701
I hope some idiot savant hacker falls madly in love with her and dedicates his life to stealing her model and all the code and leaking it.

Anonymous 7/17/2025, 12:09:01 PM No.105934813 [Report] >>105934831

>>105934599
Retards think "personal chat with her = cucking"
Do they know how this shit works???

Anonymous 7/17/2025, 12:10:35 PM No.105934825 [Report]

>>105934695
>>105934722
And when you have the hardware you *just* need competency. And *just* to find the offending layers. And *just* need the dataset to tune the offending layers. And *just* this specific mix of the base model. And *just* better quants...
What we need is *just* smaller models as good as the best we've come up with.

Anonymous 7/17/2025, 12:10:53 PM No.105934826 [Report] >>105934834

>>105934800
Oh boy I downloaded and looked inside the iOS APK
It's literally just using Unity as a framework
Any old unity asset ripper probably works

Anonymous 7/17/2025, 12:11:51 PM No.105934831 [Report] >>105934844

>>105934813
eww.. residual smut in reused server memory from someone else.

Anonymous 7/17/2025, 12:12:27 PM No.105934834 [Report] >>105934980

>>105934826
Sorry I meant IPA, not APK lol

Anonymous 7/17/2025, 12:13:52 PM No.105934844 [Report]

>>105934831
If they're using prompt caching she'll literally just be like
>ok so this dude is saying the same thing as this other dude
and just repeat what she said without even thinking about you specifically.

Anonymous 7/17/2025, 12:13:55 PM No.105934845 [Report] >>105934851

You're saying as if Migu isn't cumdumpster toilet.

Anonymous 7/17/2025, 12:15:33 PM No.105934851 [Report]

oopmiku.jpg md5: efbfd60c...

>>105934845
Miku is local so pic related applies.

Anonymous 7/17/2025, 12:19:05 PM No.105934874 [Report] >>105934893 >>105935742

6398463.jpg md5: 56338052...

I'm ready for sams hot summer

Anonymous 7/17/2025, 12:22:03 PM No.105934893 [Report] >>105935742

>>105934874
what does it mean

Anonymous 7/17/2025, 12:23:56 PM No.105934904 [Report] >>105934947 >>105934962 >>105934971

file.png md5: 2ed5a820...

tick tock...
https://x.com/OpenAI/status/1945607177034760574

Anonymous 7/17/2025, 12:30:42 PM No.105934947 [Report]

>>105934904
Did Grok and Kimi force their hand to release the model after all?

Sam 7/17/2025, 12:32:30 PM No.105934962 [Report] >>105934973 >>105934975

>>105934904
Thanks for joining our livestream!
Our open source will be really good and we are definitely releasing it soon. We added amazing new things to it and it performs excellent and much better than the competition. We also performed extensive testing to make sure it's very safe. Unfortunately there is still one small issue we need to fix to make it the best it can be.
Anyway, we are now announcing that we will announce the next announcement regarding this model in two weeks from now!

Anonymous 7/17/2025, 12:33:31 PM No.105934970 [Report]

i'm smelling something

Anonymous 7/17/2025, 12:33:42 PM No.105934971 [Report]

file.png md5: 1da33385...

>>105934904
Interesting...

Anonymous 7/17/2025, 12:33:50 PM No.105934973 [Report]

captcha.png md5: bc8a1cb8...

>>105934962
End of summer, I've seen someone reporting.
September, probably.

Captcha: the sound of despair.

Anonymous 7/17/2025, 12:33:56 PM No.105934975 [Report]

>>105934962
It's not even going to be the local model, I bet.

Anonymous 7/17/2025, 12:35:04 PM No.105934980 [Report] >>105935011

>>105934834
fuck
it downloads the assets at runtime
there goes that, you probably need a jailbroken phone guys
guess I'll just wait for the android version

Anonymous 7/17/2025, 12:38:19 PM No.105935011 [Report] >>105935028 >>105935782 >>105935816

grok-waifu-android.png md5: 34a81f56...

>>105934980
>guess I'll just wait for the android version
https://x.ai/careers/open-roles

Coming soon, probably.

Anonymous 7/17/2025, 12:41:20 PM No.105935028 [Report] >>105935098

>>105935011
This shit takes zero effort to port to Android. It's literally just unity wrapped up in a library. Some unpaid intern probably did this.

Anonymous 7/17/2025, 12:51:02 PM No.105935098 [Report] >>105935125

>>105935028
Well, they're now looking to hire a couple people for up to $440,000/year to make this work properly on both platforms. This is probably more budget than what all wAIfu-related open source projects combined have seen so far.

Anonymous 7/17/2025, 12:53:46 PM No.105935116 [Report] >>105935909

How is sex with exaone?
How is sex with Ernie 300?

Anonymous 7/17/2025, 12:54:53 PM No.105935125 [Report] >>105935169

>>105935098
I hope they hire someone and make an android version soon so I can steal the 3D model files from inside the data. From what I can tell of the generic engine code, each character should have all its assets contained in its own asset bundle file and it should be trivially convertible to something like a VRC model.

Anonymous 7/17/2025, 1:00:46 PM No.105935169 [Report] >>105935195

>>105935125
What drives the 3d model?
Is the tts server-side? Obviously the text (grok) is server-side.
What about the motion, client-side?
I can't say I cared much to look, but I can look at the IPA or apk if it's available.

Anonymous 7/17/2025, 1:01:19 PM No.105935170 [Report] >>105935216

how much gpu vram do I need for this stuff? is 8gb enough

Anonymous 7/17/2025, 1:05:30 PM No.105935195 [Report] >>105935318

>>105935169
All the rendering is client side and I suspect it's basically just preset motions aside from the lipsync

Anonymous 7/17/2025, 1:08:11 PM No.105935216 [Report] >>105935445

>>105935170
1TB for the latest Kimi model, 8gb is good enough for Nemo
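A back-of-the-envelope sketch for those numbers. It covers weights only and ignores KV cache, activations, and runtime overhead. The parameter counts (about 1000B total for Kimi K2, about 12B for Mistral Nemo) and the 4.5 bits-per-weight quant are assumptions for illustration.

```python
def weight_memory_gb(params_billions: float, bits_per_weight: float) -> float:
    """Approximate memory for weights only, in GB (10^9 bytes):
    parameters (billions) * bits per weight / 8."""
    return params_billions * bits_per_weight / 8


# Assumed ~1000B total parameters at 8 bits per weight: about 1 TB.
print(weight_memory_gb(1000, 8))   # 1000.0

# Assumed ~12B parameters at a ~4.5 bit quant: fits in 8 GB with a little room left.
print(weight_memory_gb(12, 4.5))   # 6.75
```

Whatever is left over after the weights has to hold the context, so a model that barely fits will still be cramped at long context lengths.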

Anonymous 7/17/2025, 1:23:21 PM No.105935318 [Report] >>105935415 >>105935485 >>105935496

>>105935195
I wonder what control Grok has over the character's actions. Some sort of tool use calls? or is it something simpler like some status of the current emotion? I certainly recall some dating sim-like tool mentioned in the prompts that were posted online (for some affection points or whatever)

Anonymous 7/17/2025, 1:40:45 PM No.105935415 [Report] >>105935448

>>105935318
Could it be that the app translates *actions* into animation-triggering commands?
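A minimal sketch of what that could look like, assuming the model wraps actions in asterisks. The animation table and command names are hypothetical, since nothing is known about how the real app does it, and plain keyword matching stands in for whatever mapping it actually uses.

```python
import re

# Hypothetical keyword -> animation command table (not the app's real commands).
ANIMATIONS = {
    "wave": "anim_wave",
    "smile": "anim_smile",
    "nod": "anim_nod",
}

# Matches *action text* spans in a reply.
ACTION_RE = re.compile(r"\*([^*]+)\*")


def extract_animation_commands(reply: str) -> tuple[str, list[str]]:
    """Split a reply into spoken text and a list of animation commands."""
    commands = []
    for action in ACTION_RE.findall(reply):
        lowered = action.lower()
        for keyword, command in ANIMATIONS.items():
            if keyword in lowered:
                commands.append(command)
    # Remove the action spans and collapse leftover whitespace.
    spoken = " ".join(ACTION_RE.sub("", reply).split())
    return spoken, commands


print(extract_animation_commands("*waves and smiles* Hi there! *nods*"))
# ('Hi there!', ['anim_wave', 'anim_smile', 'anim_nod'])
```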

Anonymous 7/17/2025, 1:45:38 PM No.105935445 [Report] >>105935497

>>105935216
i have a 2tb hdd does that count

Anonymous 7/17/2025, 1:46:11 PM No.105935448 [Report]

>>105935415
I'm thinking that you'd probably only need a tiny classifier for that, which could run directly on the phone.

They're also changing portions of the prompt (behavior, appearance, etc) depending on Ani's status + relationship score, but the model itself (Grok) is in turn also deciding via prompting instructions when to alter the score.

Anonymous 7/17/2025, 1:46:23 PM No.105935454 [Report] >>105935480 >>105935523

1636941718706.gif md5: bfd3b976...

Dude where the fuck are the new models.

I've been using Dans Personality for like, months now it feels like (I imagine even bigger VRAMlets are still using Nemo shit too).

Anonymous 7/17/2025, 1:46:41 PM No.105935458 [Report] >>105935505

No Ernie goof yet?

Anonymous 7/17/2025, 1:49:14 PM No.105935480 [Report]

>>105935454
everyone's busy cashing out on fucking "agentic" corporate marketing bullshit.

Anonymous 7/17/2025, 1:49:45 PM No.105935485 [Report]

>>105935318
Can be done with a list of possible poses and emotions, and the output constrained to "text", "pose", "emotion" with the last 2 constrained as enums https://json-schema.org/understanding-json-schema/reference/enum
An LLM is smart enough to select an appropriate emotion and animation if names are descriptive enough (I've done this with my VR waifu)
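A sketch of that idea, with made-up pose and emotion names. The schema uses standard JSON Schema `enum` constraints. The hand-written check below is only a stand-in for a real JSON Schema validator or for schema-constrained decoding in the backend, and how the schema gets passed to the model depends on the inference server, so that part is left out.

```python
import json

# Hypothetical pose/emotion names; descriptive names help the model choose well.
POSES = ["idle", "wave", "sit", "lean_forward"]
EMOTIONS = ["neutral", "happy", "sad", "surprised"]

REPLY_SCHEMA = {
    "type": "object",
    "properties": {
        "text": {"type": "string"},
        "pose": {"type": "string", "enum": POSES},
        "emotion": {"type": "string", "enum": EMOTIONS},
    },
    "required": ["text", "pose", "emotion"],
    "additionalProperties": False,
}


def parse_reply(raw: str) -> dict:
    """Parse model output and check it against the schema's keys and enums by hand."""
    reply = json.loads(raw)
    if not isinstance(reply, dict) or set(reply) != set(REPLY_SCHEMA["required"]):
        raise ValueError("reply must be an object with exactly text, pose, emotion")
    for key, spec in REPLY_SCHEMA["properties"].items():
        value = reply[key]
        if not isinstance(value, str):
            raise ValueError(f"{key} must be a string")
        if "enum" in spec and value not in spec["enum"]:
            raise ValueError(f"{key}={value!r} is not one of {spec['enum']}")
    return reply


reply = parse_reply('{"text": "Welcome back!", "pose": "wave", "emotion": "happy"}')
print(reply["pose"], reply["emotion"])  # wave happy
```

With constrained decoding the model cannot emit a pose outside the list in the first place, so the check mostly matters for backends that only follow the schema loosely.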

Anonymous 7/17/2025, 1:51:06 PM No.105935496 [Report]

>>105935318
Could just be using a simple classifier to pick from predetermined poses, like the sillytavern expression extension, it's trivial to implement and takes practically no resources.
I think there's even a sillytavern extension designed to do exactly that with vtuber 3d models.

Anonymous 7/17/2025, 1:51:08 PM No.105935497 [Report]

>>105935445
No

Anonymous 7/17/2025, 1:52:05 PM No.105935505 [Report] >>105935533 >>105937170

and just like that we are back.jpg md5: bdaded8b...

>sama 70b, 300b, 700b models
we won

>>105935458
https://github.com/ggml-org/llama.cpp/pull/14658#issuecomment-3082745420

Anonymous 7/17/2025, 1:54:09 PM No.105935523 [Report] >>105935537 >>105935544 >>105935558 >>105935562 >>105935580 >>105935589 >>105935593 >>105935607 >>105935742 >>105935790 >>105936352

>>105935454
this general isn't even local models anymore. It's just turned into /aicg/.
>muh grok
>muh trillionB models

Shit that NOBODY is running on local machines and people just unironically posting their Openrouter API bullshit chats like here >>105932814

This general is cooked. Nobody discusses actual local models anymore because there haven't been any worth discussing

Anonymous 7/17/2025, 1:56:13 PM No.105935533 [Report]

>>105935505
yeah I just saw it too
local won

Anonymous 7/17/2025, 1:57:00 PM No.105935537 [Report] >>105935550

>>105935523
If you removed them what's left is vramlets, including me.

Anonymous 7/17/2025, 1:57:27 PM No.105935539 [Report] >>105935572

Does anyone here know of AI that can handle lengthy NSFW stories?
I use Sudowrite since that's geared towards writing and they've done well making it so most of the models you can choose from do not flinch at the nastiest shit you can throw at it.
But the largest context window they have there is about 20,000.

I'd love to be able to use something like Claude but without annoying guardrails to stop me from even the most mild of NSFW.

>>105934481
>Why so many hurdles?
That's what I don't get.
If I'm going to use AI for NSFW stuff, I don't wanna fucking have to constantly wrangle it to do as I say, that's so frustrating and I don't get how other anons put up with that.

Anonymous 7/17/2025, 1:57:44 PM No.105935542 [Report] >>105935549 >>105935570 >>105935578

file.png md5: 47ab3fc7...

https://x.com/elonmusk/status/1945744030064935105

Anonymous 7/17/2025, 1:57:59 PM No.105935544 [Report]

>>105935523
this general has always been cooked, even before deepseek released.
just wait 2 more weeks.

Anonymous 7/17/2025, 1:58:30 PM No.105935549 [Report]

>>105935542
Great, where can I download him and the model?

Anonymous 7/17/2025, 1:58:55 PM No.105935550 [Report]

>>105935537
that's who made up these generals before people felt comfortable sharing their cooming chats with corporations. 90% of people in here aren't capable of running models over 30b, 99% incapable of running shit over 70b.

Anonymous 7/17/2025, 1:59:48 PM No.105935558 [Report]

>>105935523
Hobbyists now have tons of 3090x10 servers, judging by leddit boasts. They can run kimi retard quants just fine.

Anonymous 7/17/2025, 2:00:10 PM No.105935562 [Report] >>105935763

>>105935523
I purely want to steal Grok's 3D models and use them locally, I fail to see the issue with that
But more to the point, none of these open models are targeted at vramlets or even the average consumer video cards anymore

Anonymous 7/17/2025, 2:01:01 PM No.105935570 [Report]

>>105935542
I hate Heinlein and the morals he was peddling in SiSL.
t. woman

Anonymous 7/17/2025, 2:01:28 PM No.105935572 [Report]

>>105935539
Rocinante-12B-v1.1

Anonymous 7/17/2025, 2:01:43 PM No.105935578 [Report] >>105935802

file.png md5: 9df23315...

>>105935542
Killing it

Anonymous 7/17/2025, 2:01:59 PM No.105935580 [Report]

>>105935523
>this general isn't even local models anymore. It's just turned into /aicg/.
Mikutroon spam created a precedent. People just ran with it from that. I hope you aren't a mikutroon posting this.

Anonymous 7/17/2025, 2:04:04 PM No.105935589 [Report] >>105935594 >>105935613

>>105935523
Putting aside obvious spammers who enjoy derailing the general, what xAI did with Grok companions is what local should have accomplished over the past 2.5 years but never managed to. Can't we at least try to understand how they made it and get some ideas from it? It doesn't look like it was done by finetuning Grok 3 for the purpose; it seems to be mostly application-level stuff.

Anonymous 7/17/2025, 2:05:06 PM No.105935593 [Report]

Screenshot_20241229_163045.png md5: 695beadc...

>>105935523
Hey!!! I did buy 2 P40s alright. 3 Gpus in my small ass tower heating shit up in summer.
Therefore I have the /lmg/ seal of approval.
Not my problem I cant run a fucking TRILLION PARAMETER moeshit. (Thats still tarded btw)
Pic is from a better time.

Anonymous 7/17/2025, 2:05:19 PM No.105935594 [Report]

>>105935589
people already have done it, including open source local ones
i don't know if this is better than the existing ones because i never tried either

Anonymous 7/17/2025, 2:06:51 PM No.105935607 [Report] >>105935616 >>105935634

Screen Shot 2025-07-17 at 21.06.41.png md5: d0b02c27...

>>105935523
When was the last time an exciting model was released? Mixtral? Miqu? Nemo? Largestral? Everything else has either been shit or too hardcore for most anons to run

Anonymous 7/17/2025, 2:07:16 PM No.105935613 [Report] >>105935630

>>105935589
What grok's doing is just meaningless bloat to bait in the saars much like the other new-age LLM memes like function calling and MCP

Anonymous 7/17/2025, 2:07:47 PM No.105935616 [Report] >>105935636

>>105935607
R1
V3-0324
R1-0528
Kimi K2

Anonymous 7/17/2025, 2:10:07 PM No.105935630 [Report] >>105935655

>>105935613
How are waifus/AI companions a meme? That's the entire reason why aicg and by extension lmg exist on /g/.

Anonymous 7/17/2025, 2:10:17 PM No.105935634 [Report]

>>105935607
The first miqu (mixtral) leak was the last real exciting thing for me personally. Everything past that has been varying shades of "neat but I can't run that shit"

Anonymous 7/17/2025, 2:10:28 PM No.105935636 [Report] >>105935663 >>105935670 >>105935679

>>105935616
Yes, that all falls within
>too hardcore for most anons to run

Anonymous 7/17/2025, 2:12:37 PM No.105935655 [Report] >>105935775

>>105935630
We've had ST with expression sheets for more than two years now.

Anonymous 7/17/2025, 2:13:19 PM No.105935663 [Report]

>>105935636
The Deepseek models are easier to run than Largestral ever was

Anonymous 7/17/2025, 2:14:31 PM No.105935670 [Report] >>105935682

>>105935636
Seeing the bigger models that my machine can't run just makes me want to put in more work to save more money to run a bigger model.

Anonymous 7/17/2025, 2:15:19 PM No.105935679 [Report] >>105935728

>>105935636
I don't get it, the MoE stuff doesn't have large active parameters, so it can run on RAM, and thus the actual costs are not that much worse than people overpaying jensen for workstation GPUs that are just the same shit as RTX series but with twice the RAM?
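
Back-of-envelope version of that argument, assuming decoding is purely memory-bandwidth bound so every generated token has to read all active weights once. The 200 GB/s bandwidth and 4.5 bits per weight are assumed round numbers, not measurements; the parameter counts are the nominal ones (about 32B active for Kimi K2, 123B for dense Largestral). It ignores prompt processing, KV cache reads and expert routing overhead, so treat the result as an upper bound.

```python
def decode_tokens_per_second(active_params, bits_per_weight, bandwidth_gb_s):
    """Upper bound on tokens/s if each token reads all active weights once."""
    bytes_per_token = active_params * bits_per_weight / 8
    return bandwidth_gb_s * 1e9 / bytes_per_token


# Assumed values for illustration only.
ASSUMED_RAM_BANDWIDTH_GB_S = 200.0
ASSUMED_BITS_PER_WEIGHT = 4.5

moe = decode_tokens_per_second(32e9, ASSUMED_BITS_PER_WEIGHT, ASSUMED_RAM_BANDWIDTH_GB_S)
dense = decode_tokens_per_second(123e9, ASSUMED_BITS_PER_WEIGHT, ASSUMED_RAM_BANDWIDTH_GB_S)

print(f"MoE, 32B active: {moe:.1f} tok/s upper bound")
print(f"Dense 123B:      {dense:.1f} tok/s upper bound")
```

The total parameter count still decides how much RAM you need; only the active count enters the speed estimate.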

Anonymous 7/17/2025, 2:15:49 PM No.105935682 [Report] >>105935700 >>105935757 >>105935776

>>105935670
And then!
They make an even bigger model.
At this point there is no hobbyist market. None of these fucks are trying to really improve the algorithms involved to work better with less parameters. They just keep scaaaaaaaaaling.

Anonymous 7/17/2025, 2:17:43 PM No.105935699 [Report]

>>105934548
Anons don't read books. The normal use of pronouns confuses them

Anonymous 7/17/2025, 2:17:50 PM No.105935700 [Report] >>105935757 >>105935776

>>105935682
I mean at a certain point you will reach a limit on how efficiently you can make a model. And we're long past that point. Right now we're in the throw power at the problem stage but soon enough efficiency will win over again and people will look for new ways to improve things.

Anonymous 7/17/2025, 2:22:22 PM No.105935728 [Report] >>105935754 >>105935757

>>105935679
You can't be serious about running a thinking model from RAM. And Kimi is just not good.

Anonymous 7/17/2025, 2:23:16 PM No.105935742 [Report] >>105935773

LeCun_2018.jpg md5: 6970fc30...

>>105935523
>Nobody discusses actual local models anymore because there haven't been any worth discussing
jeez, I wonder why is that
>>105934874
>>105934893
it means that Scam Altman is trying to generate hype to make investor dicks hard, the same thing he always does

Anonymous 7/17/2025, 2:24:16 PM No.105935754 [Report]

>>105935728
I am running Kimi Q8 from my HDD. I can wait 2 weeks for my manual on how to build a nuclear bomb in my backyard.

Anonymous 7/17/2025, 2:24:57 PM No.105935757 [Report] >>105935824

>>105935682
>>105935700
If you want to blame anyone, blame those that restricted GPU sales to China.
MoEs give better performance per compute and as long as they have enough GPUs to fit them, it works; it's good for groups that have a "few thousand GPUs".
Zuck could afford to waste the compute to give us a 70b, but given how precious compute is to the Chinese due to these restrictions they'll pick the best scaling curve for the GPUs they have, so you get oversized models that cost less than what the 70b Llamas took to train, with much better perf than Llama ever could, because there's only so much you can do at that scale.
>>105935728
Original R1 would indeed generate thousands of tokens, although for RP new R1 will think for 1/3 of the output often enough; you can still get hundreds to thousands of tokens per turn in output though.

Anonymous 7/17/2025, 2:25:15 PM No.105935762 [Report] >>105935790

How much ram do I need to quant R1 goofs?

Anonymous 7/17/2025, 2:25:20 PM No.105935763 [Report]

>>105935562
It's a Live2D or Koikatsu model. I bet some autist out there has already replicated it.

Anonymous 7/17/2025, 2:26:43 PM No.105935773 [Report] >>105935795

>>105935742
>lecuntfag dunning kruger schizobabble

Anonymous 7/17/2025, 2:26:58 PM No.105935775 [Report] >>105936202

iamtonyzhu-1945424729118302407-01_thumb.jpg.webm md5: f44c06d5...

>>105935655
Thats the same argument against the chatgpt image thingy.
"Muh stable diffusion inpaint tools".
Uh...yeah. And like Sillytavern its all janky.
NPC normies want something out of the box that "just works".
What elon did is good for 2 things:
1.Hopefully turns around the whole safety cuck thing.
2.Chinks are on the case as we speak for local alternatives. Might motivate some people to make something nice.

Anonymous 7/17/2025, 2:27:01 PM No.105935776 [Report]

>>105935682
>>105935700
I'm hoping chinks will pull some more optimization magic like they did with deep seek

Anonymous 7/17/2025, 2:27:27 PM No.105935782 [Report] >>105935816

>>105935011
I had to check if this pic is real, Elon is so fucking cringe

Anonymous 7/17/2025, 2:28:22 PM No.105935790 [Report] >>105936084

>>105935762
128gb
>>105935523
if you are still here and dont have at least 128gb ram you might as well kill yourself

Anonymous 7/17/2025, 2:28:52 PM No.105935795 [Report] >>105935842

>>105935773
still salty that I called you dunning-kruger?

Anonymous 7/17/2025, 2:29:41 PM No.105935802 [Report] >>105935826

>>105935578
I want a nigger companion that calls me "massa" and every time I get bored, I can hit it with a whip.

Anonymous 7/17/2025, 2:31:04 PM No.105935816 [Report]

>>105935782
>>105935011
>$180,000 - $440,000 USD
Well fuck me.

Anonymous 7/17/2025, 2:31:44 PM No.105935824 [Report] >>105935976 >>105936004

00biz-jensen-1-fjmq-superJumbo.jpg md5: 0751f32c...

>>105935757
Restrictions are getting lifted, the leatherman is kneeling to Xi right now, wearing traditional clothes and speaking non-island dialect in front of the CPC

Anonymous 7/17/2025, 2:32:03 PM No.105935826 [Report] >>105935842 >>105935873 >>105936262

edgyboy.png md5: dc9ec391...

>>105935802

Anonymous 7/17/2025, 2:34:06 PM No.105935842 [Report] >>105935868 >>105935897

>>105935795
You never did that and I bet you never knew about said person either. The point is: big companies like OpenAI don't give a single shit about random gay spammer thread on /g/.
>>105935826
Go back to rėddit then?

Anonymous 7/17/2025, 2:38:07 PM No.105935868 [Report] >>105935887

>>105935842
that is a nice point and all but I think you are having a different discussion with the voices in your head

Anonymous 7/17/2025, 2:38:51 PM No.105935873 [Report]

>>105935826
Nigger word getting all that magical power was a good indicator how fucked up things would become.

Anonymous 7/17/2025, 2:40:07 PM No.105935887 [Report] >>105935908

>>105935868
>accuses everyone outing him a schizo
Two can play this game.

Anonymous 7/17/2025, 2:40:51 PM No.105935897 [Report] >>105935911 >>105935940 >>105936128 >>105936169

>>105935842
>Look at me, I'M SO RACIST hehe, I'm on the racist site hehe, do you see that guys? I'm just like you hehe, pls like me
Cringe. Go look for friends on /b/ or /pol/. We discuss LLMs here

Anonymous 7/17/2025, 2:41:28 PM No.105935901 [Report] >>105935966

Screenshot_20250717_082920.png md5: aaed1bbd...

Don't waste your time and money on a 5090. 32GB is not enough.
It's not enough to run gemma3 27b at q8 with decent context.
It's not enough to run broken-tutu-24b at q8 with decent context.
It's not enough for wan 2.1 14b at fp16, and yes, fp16 looks better than q8.
If you don't want to buy a 4090D 48GB, hold out for a Blackwell 5000 Pro, they were $4300 at cdw.com last time I looked, but you have to pre-order.
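
Quick arithmetic behind the q8 claim, as a sketch. It only counts the weights, using 8.5 bits per weight for q8_0 (32 int8 weights plus one fp16 scale per block) and the nominal 27B parameter count. KV cache, vision tower and compute buffers come on top and are not modeled here.

```python
def weights_gb(params, bits_per_weight):
    """Size of the quantized weights in decimal gigabytes."""
    return params * bits_per_weight / 8 / 1e9


CARD_GB = 32.0          # nominal VRAM of a 5090
Q8_0_BITS = 8.5         # 34 bytes per block of 32 weights

gemma_q8 = weights_gb(27e9, Q8_0_BITS)
headroom = CARD_GB - gemma_q8

print(f"27B at q8_0: {gemma_q8:.1f} GB of weights")
print(f"Left for context and buffers: {headroom:.1f} GB")
```

A few GB of headroom is what has to hold the whole context, which is why "decent context" is the part that breaks.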

Anonymous 7/17/2025, 2:41:58 PM No.105935908 [Report]

>>105935887
I don't even know what you are talking about, but whatever floats your boat

Anonymous 7/17/2025, 2:42:23 PM No.105935909 [Report]

>>105935116
sexing exaone feels like gettin hot and sweaty for my LG TV somehow...

Anonymous 7/17/2025, 2:42:37 PM No.105935911 [Report] >>105935923 >>105935949

>>105935897
Why can't we discuss LLMs in political context? Just say these words trigger you and be done with this autism.

Anonymous 7/17/2025, 2:43:39 PM No.105935919 [Report] >>105935931 >>105935936 >>105935941 >>105936088 >>105936202

i-think-i-just-got-cancer-now-yo.jpg md5: 9fe028ac...

how long until AI can generate texts like this?
I need a shitposting god.

Anonymous 7/17/2025, 2:43:57 PM No.105935923 [Report] >>105935996 >>105936051

>>105935911
>Why can't we discuss LLMs in political context?
You have literally the entire rest of the internet to shoehorn politics into everything. You not being capable of technical discussion is your problem. Fuck off.

Anonymous 7/17/2025, 2:44:26 PM No.105935931 [Report] >>105935943

>>105935919
Finetune one to do it for you

Anonymous 7/17/2025, 2:44:59 PM No.105935936 [Report]

>>105935919
R1 can do this easy

Anonymous 7/17/2025, 2:45:49 PM No.105935940 [Report] >>105936262

_3b0b16ee-23d1-4671-885c-952a4d465b67.jpg md5: ef59e800...

>>105935897
Discuss LLMs??

Anonymous 7/17/2025, 2:45:51 PM No.105935941 [Report]

>>105935919
deepseek might be unhinged enough

Anonymous 7/17/2025, 2:46:02 PM No.105935943 [Report] >>105935955 >>105935999

>>105935931
How do I fine-tune it to be completely degenerated and retarded?
The elections are next year, and I want to annoy normies who enjoy politics a little bit too much.

Anonymous 7/17/2025, 2:46:37 PM No.105935949 [Report] >>105935975 >>105935996

>>105935911
Because you don't know math and when I make a post about monte-carlo tree search in context of choosing tokens and want to discuss the technicalities, your only contribution to the discussion is screeching about niggers. There are other boards for that.

Anonymous 7/17/2025, 2:46:43 PM No.105935950 [Report]

local mikutroon general

Anonymous 7/17/2025, 2:47:20 PM No.105935955 [Report] >>105935988

>>105935943
Just use Desu archive to collect the most unhinged posts from /trash/ and use it as training data.

Anonymous 7/17/2025, 2:49:21 PM No.105935966 [Report] >>105936021

>>105935901
The one reason I'd see somebody getting a single 5090 for LLM is to put it in a RAM server to run shit like R1 and Kimi. All that compute surely makes PP fast as fuck.

Anonymous 7/17/2025, 2:50:26 PM No.105935975 [Report] >>105936023

>>105935949
I care about results from these models and not how these things work, at least not that in-depth; it's just not necessary.

Anonymous 7/17/2025, 2:50:26 PM No.105935976 [Report]

>>105935824
Unlikely, Trump did finally let Jensen sell those inference-oriented GPUs that were already in his inventory, but he's not allowed to make newer ones, so that's some 10-20b$ of GPUs mostly meant for inference. Probably would avoid situations like that time DeepSeek got so popular that they couldn't keep the API up stably for months.

Anonymous 7/17/2025, 2:52:41 PM No.105935988 [Report]

>>105935955
good idea.
I want people to feel nauseated after interacting with me.

Anonymous 7/17/2025, 2:54:14 PM No.105935996 [Report] >>105936191

1740311334774853.png md5: 5cfc7dca...

>>105935923
>>105935949
always concern trolling the moment anything they dont like is said but not caring about the off topic constant migger spam 24/7, reaaaally gets the noggin joggin huh

troons buttmad

Anonymous 7/17/2025, 2:54:42 PM No.105935999 [Report] >>105936074

>>105935943
There's already gpt4chan. It's really old. Forget about running the original, use this (ollama modelfile):
FROM hf.co/mradermacher/GPT4chan-8B-GGUF:q8_0

# set the temperature to 1 [higher is more creative, lower is more coherent]
PARAMETER temperature 1
PARAMETER num_ctx 16384
PARAMETER num_predict 250
# set the system message
SYSTEM """
You are an underage male 4chan user participating in a live chat. You only speak for yourself. You do not quote anyone. You only reply or try to start a conversation. You can say anything you want, you don't care about hurting anyone's feelings. You have a tsundere personality. If someone accuses you of being male or a tranny, you viciously attack them, using their posting history against them.
"""

Anonymous 7/17/2025, 2:55:11 PM No.105936004 [Report] >>105936135

>>105935824
>wearing traditional clothes
Isn't this a little shameless and on the nose?

Anonymous 7/17/2025, 2:58:42 PM No.105936021 [Report]

>>105935966
I ran deepseek on CPU at q4 with prompt processing on a pair of 3090s, it helps somewhat. Hopefully the chinese will figure out how to put 64GB on a 5090.
I tried and tried and tried to get a 5090 FE at retail. No luck. I've moved on.

Anonymous 7/17/2025, 2:58:53 PM No.105936023 [Report]

1752580734295373.png md5: 77b06292...

>>105935975
go back to where you belong >>105914054

Anonymous 7/17/2025, 3:03:21 PM No.105936051 [Report] >>105936066 >>105936076 >>105936145

>>105935923
The safety cuckery is absolutely a political issue that is relevant. Those are the people ruining models.

Anonymous 7/17/2025, 3:05:30 PM No.105936066 [Report]

>>105936051
stop being so anti-semitic anon.

Anonymous 7/17/2025, 3:06:18 PM No.105936074 [Report] >>105936099

>>105935999
gpt4chan was actually really good. It scores way better than the models trained with Reddit data.
I don't want a rude chatbot; I want to make people quit the internet.

Anonymous 7/17/2025, 3:06:23 PM No.105936076 [Report] >>105936089

>>105936051
Stop noticing things, goy.

That said, I do have a good idea for a t-shirt:
"everyone who disagrees with me is a tranny"

Anonymous 7/17/2025, 3:07:20 PM No.105936084 [Report]

>>105935790
I'm new here and have never fucked around with local LLMs
are you talking about regular RAM, not VRAM?
if it's regular RAM, is it some workaround to use it instead of VRAM, or a parallel requirement?
sorry for being clueless

Anonymous 7/17/2025, 3:07:49 PM No.105936088 [Report] >>105936107 >>105936149

ComfyUI_05809_.png md5: 4e546608...

>>105935919
back when GPT-3 had an uncensored completion frontend i'd generate all kinds of hallucinated greentexts (it also revealed 8ch was part of the training set because it kept referencing /qresearch/) so honestly it's probably already able to

so anyways, i've heard that shit on the guide is comedically outdated for models
i'm not necessarily asking to be spoonfed, but i searched the archive and found nothing clear
so i got 11gb on a 2080ti, willing to split it with cpu, shooting for at least 10t/s

pic unrelated

Anonymous 7/17/2025, 3:07:56 PM No.105936089 [Report] >>105936124

>>105936076
You do behave like tranny, a mad one. Happens when your safe space gets disrupted.

Anonymous 7/17/2025, 3:09:46 PM No.105936099 [Report]

>>105936074
Eh, I've run the original. It needs significant tard wrangling, since it was trained with data including post IDs. You have to strip them out. It's a gpt-j era model. You have to run it in your own python code using transformers - which today, of course, is something a big LLM could write for you.
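
The stripping step can be as small as this. It assumes the IDs show up as `>>digits` quote links like in this thread; the exact format in the original model's training data may differ, and `strip_post_ids` is just an illustrative name.

```python
import re

# Quote links such as >>105936074, optionally followed by " (OP)".
QUOTE_LINK_RE = re.compile(r">>\d+(?: \(OP\))?")


def strip_post_ids(text):
    """Remove quote links, tidy whitespace and drop lines left empty."""
    cleaned = QUOTE_LINK_RE.sub("", text)
    lines = [" ".join(line.split()) for line in cleaned.splitlines()]
    return "\n".join(line for line in lines if line)


sample = ">>105935999\ngpt4chan was good >>105936074 indeed\n>implying"
print(strip_post_ids(sample))
```

Greentext lines survive because a single `>` followed by text does not match the pattern.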

Anonymous 7/17/2025, 3:11:02 PM No.105936107 [Report]

>>105936088
I think Nemo is as big as you can get on that if you want 10t/s

Anonymous 7/17/2025, 3:13:31 PM No.105936124 [Report] >>105936154

>>105936089
Yo kid, I was here back in 2022, when you were still drooling on your ipad.
(kisses anon deeply, sending a shiver up his spine)

Anonymous 7/17/2025, 3:13:50 PM No.105936128 [Report]

>>105935897
>I WANT MY HUGBOX
tough luck sis, back to plebbit.

Anonymous 7/17/2025, 3:15:18 PM No.105936135 [Report]

>>105936004
It is quite submissive for a public figure from the island to speak mainland Chinese

Anonymous 7/17/2025, 3:16:52 PM No.105936145 [Report]

>>105936051
Safety is clearly something that should be relegated to a LoRA or a second pass and not baked in.
I mean, clearly you don't want something like a doctor bot shouting nigger nigger nigger but a doctor bot absolutely needs to be able to talk about that strange growth on your ballsack without getting filtered.

Anonymous 7/17/2025, 3:18:14 PM No.105936149 [Report] >>105936265

>>105936088
Sucks, but yeah, there are basically no GPU bargains anymore like the P40 or P100. I built the mikubox. It was fun to see three P40s manage to run 70B models at a decent quant, but it was slow.
V100 32GB is just now coming into the "maybe" price range, but fuck, it's slow and old, and is somewhat of a corner case. Cuda dev doesn't recommend it.

Anonymous 7/17/2025, 3:18:39 PM No.105936154 [Report]

>>105936124
ywn
baw

Anonymous 7/17/2025, 3:21:27 PM No.105936169 [Report] >>105936188

>>105935897
>We discuss LLMs here
You post your AGP avatar here faggot.

Anonymous 7/17/2025, 3:21:40 PM No.105936170 [Report] >>105936200

I wonder how much of the data used to teach these models nowadays is machine generated and thus corrupting the output.

Anonymous 7/17/2025, 3:23:10 PM No.105936188 [Report] >>105936213 >>105936224

1752098776144434.png md5: 2de01026...

>>105936169

Anonymous 7/17/2025, 3:23:54 PM No.105936191 [Report]

>>105935996
Some of their posts are automated btw >>105884523

Anonymous 7/17/2025, 3:24:37 PM No.105936197 [Report]

some free proxy probably went down, I haven't seen this many aicgjeets spamming lmg in a long time

Anonymous 7/17/2025, 3:25:19 PM No.105936200 [Report]

>>105936170
Too much. Way too much synth slop.

Anonymous 7/17/2025, 3:25:22 PM No.105936202 [Report]

notTheBestIveSeen.png md5: 4b062e04...

>>105935775
What am I looking at here?
>>105934789
Not local...
>>105935919
LLMs have been able to greentext / make believable 4chan posts since 2023 at least.

Anonymous 7/17/2025, 3:26:16 PM No.105936211 [Report] >>105936230

My favorite LLM discussion recently was that retard who said minp 0.0001 saves every 4th message from being incoherent.
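
For reference, a sketch of min-p as it is commonly defined: keep every token whose probability is at least min_p times the top token's probability, then renormalize. The toy distribution below is made up. With min_p at 0.0001 the cutoff is so low that only near-zero-probability tokens get removed.

```python
def min_p_filter(probs, min_p):
    """Keep tokens with p >= min_p * max(p), then renormalize to sum to 1."""
    threshold = min_p * max(probs.values())
    kept = {token: p for token, p in probs.items() if p >= threshold}
    total = sum(kept.values())
    return {token: p / total for token, p in kept.items()}


# Made-up next-token distribution for illustration.
probs = {"the": 0.6, "a": 0.3, "zebra": 0.0999, "qux": 0.0001}

for min_p in (0.0001, 0.05, 0.2):
    print(min_p, sorted(min_p_filter(probs, min_p)))
```

Only the larger settings actually prune the tail here; the 0.0001 run keeps all four tokens.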

Anonymous 7/17/2025, 3:26:29 PM No.105936213 [Report] >>105936224 >>105936266

dipsyApronSkillIssue.png md5: b55f91a9...

>>105936188
Witnessed.
Here's one I never got around to posting.

Anonymous 7/17/2025, 3:27:24 PM No.105936217 [Report]

>105936213
>105936188
Shitty design. You will never be a woman. Nor will you ever be a waifu designer.

Anonymous 7/17/2025, 3:27:58 PM No.105936224 [Report]

3a7 copy.jpg md5: 5efdb7d7...

>>105936188
>>105936213

Anonymous 7/17/2025, 3:28:44 PM No.105936230 [Report] >>105936247

>>105936211
No way that's something that actually happened.

Anonymous 7/17/2025, 3:29:59 PM No.105936237 [Report] >>105936262

>anon hallucinates a tranny
>thread descends into autistic screeching
every fucking time.

Anonymous 7/17/2025, 3:31:01 PM No.105936247 [Report]

>>105936230
Man just didn't have a good grasp on probability.

Anonymous 7/17/2025, 3:32:32 PM No.105936257 [Report] >>105936274 >>105936348 >>105937380

surroundedbyassholes.png md5: 8ba8b10c...

>>105932814
>>105934014
>>105934019
>>105934065
>>105934452
>look at the replies in the thread
this thread is full of low IQ brown people

Anonymous 7/17/2025, 3:33:24 PM No.105936262 [Report] >>105936420

>>105936237
Hallucinates? >>105935940 >>105935826

llama.cpp CUDA dev !!yhbFjk57TDr 7/17/2025, 3:33:34 PM No.105936265 [Report]

>>105936149
I don't recommend it given the current software stack (because the tensor cores on Volta work differently vs. Turing or newer) but if it becomes cheap enough I will buy one and write code specifically for it.

Anonymous 7/17/2025, 3:33:38 PM No.105936266 [Report] >>105938029

>>105936213
Saved!
I'm glad that you are still around, anon, even if we must wait for /wait/ to return

Anonymous 7/17/2025, 3:33:57 PM No.105936271 [Report] >>105936421

1750784870046359.png md5: 826a542d...

>>105932763 (OP)
>diffusion models
so are these actually good for text?

Anonymous 7/17/2025, 3:34:38 PM No.105936274 [Report] >>105936385

>>105936257
You'll fit right in then given that you're using ST.

Anonymous 7/17/2025, 3:40:05 PM No.105936326 [Report] >>105936340 >>105936342

llama behemoth and mistral large are going to be crazy

Anonymous 7/17/2025, 3:41:44 PM No.105936340 [Report] >>105936375

>>105936326
The livestream in 2 more hours where Sam announces that something amazing is coming in 2 more weeks is going to be crazy

Anonymous 7/17/2025, 3:41:53 PM No.105936342 [Report]

>>105936326
>llama behemoth
yeah, crazy... just don't check nolima

Anonymous 7/17/2025, 3:42:19 PM No.105936348 [Report]

>>105936257
This is getting too unhinged.
I think it started with the QwQ ponyfag. But it was before R1.
Now we ended up at needing a prefill for stories...involving a young guy and a milf.
You dont even need a prefill with gemini for that. There is no excuse.
Local is so cucked while closed is moving in the opposite direction.

Anonymous 7/17/2025, 3:42:55 PM No.105936352 [Report]

impoorbutnotaspoorasyou.png md5: 4d2cafe3...

>>105935523
i'm sorry that you are poor. i've been enjoying kimi locally for the last 21 hour coom session and i'm about to go to bed now.

Anonymous 7/17/2025, 3:44:48 PM No.105936375 [Report]

Screenshot_20250717_224354.png md5: b21e006d...

>>105936340
>in 2 more weeks
anon please.

Anonymous 7/17/2025, 3:45:46 PM No.105936385 [Report] >>105936397

>>105936274
ST is simply the best for roleplaying. if i wanted to write a story then there's mikupad. don't be upset I called you out for what you are.

Anonymous 7/17/2025, 3:46:54 PM No.105936397 [Report] >>105936470

>>105936385
Don't be upset that you're using a crutch.

Anonymous 7/17/2025, 3:49:29 PM No.105936420 [Report] >>105936777

>>105936262
Bro, if I'm a tranny then you're a woman. Go jerk off or something until you calm down.
Miku or Migu is a meme here because very early llama.cpp came with a miku chatbot example. It did not start out as a trannyism.

Anonymous 7/17/2025, 3:49:34 PM No.105936421 [Report]

>>105936271
We only really have POC models as far as I can tell.
We'll never really know until a big lab releases something trained on loads of tokens with good data etc etc.

Anonymous 7/17/2025, 3:53:51 PM No.105936470 [Report]

>>105936397
best in class frontend with dozens of useful extensions to let my waifu come to life.

Anonymous 7/17/2025, 4:06:12 PM No.105936568 [Report]

>>105934681
Because the poster is obsessed with blacks so xitter keeps showing them black posters.

Anonymous 7/17/2025, 4:11:03 PM No.105936600 [Report] >>105936621 >>105936646 >>105936684

leak from sama's prerecorded stream, no foss llm announcement, just mcp support for the chatgpt app
it is in fact over.

Anonymous 7/17/2025, 4:13:18 PM No.105936621 [Report] >>105936641 >>105936793

>>105936600
>no foss llm announcement
no shit? he literally said just a few days ago it's on indefinite hiatus for more safety training

Anonymous 7/17/2025, 4:15:12 PM No.105936641 [Report] >>105936688

>>105936621
last minute emergency increasing the minimum age of consent to 35

Anonymous 7/17/2025, 4:15:27 PM No.105936646 [Report]

>>105936600
that would be hilarious

Anonymous 7/17/2025, 4:20:20 PM No.105936684 [Report]

>>105936600
>not even two more weeks
It's over for local

Anonymous 7/17/2025, 4:20:47 PM No.105936688 [Report] >>105936737

>>105936641
Good. Anything that drives off pedoshitters is net-positive.

Anonymous 7/17/2025, 4:27:28 PM No.105936737 [Report] >>105936797 >>105936816

>>105936688
Biological fact: normal, healthy men find females sexually attractive as soon as they hit puberty.

Anonymous 7/17/2025, 4:32:25 PM No.105936777 [Report]

>>105936420
>Miku or Migu is a meme
Your meme is old. Stop posting it since it is irrelevant.

Anonymous 7/17/2025, 4:34:03 PM No.105936793 [Report]

>>105936621
Are you saying gptlocal will be released when GPT3 is stable?

Anonymous 7/17/2025, 4:34:31 PM No.105936797 [Report] >>105936913 >>105936932 >>105936948 >>105937038

>>105936737
Raping kids is not okay and will never be, deal with it.

Anonymous 7/17/2025, 4:34:56 PM No.105936802 [Report] >>105936828 >>105936922

>>105934014
Why are you roleplaying sex with a cosplay girl when you could just roleplay sex with the actual character?

Anonymous 7/17/2025, 4:36:37 PM No.105936816 [Report] >>105936825 >>105936865

>>105936737
What I am always fascinated by is that according to modern morality and ideology: more than 50% of people throughout history were born from rape. Since the girl wasn't 18 and couldn't consent obviously.

Anonymous 7/17/2025, 4:37:26 PM No.105936825 [Report] >>105936854

>>105936816
Speaking from personal experience huh?

Anonymous 7/17/2025, 4:37:49 PM No.105936828 [Report]

>>105936802
Because she has huge tits after plastic surgery instead of having natural huge tits.

Anonymous 7/17/2025, 4:41:00 PM No.105936854 [Report] >>105936909 >>105936994

>>105936825
No and I am not attracted to children or even teens. But if you are calling me a pedo care to say what you think about majority of people being rape babies?

Anonymous 7/17/2025, 4:42:49 PM No.105936865 [Report]

>>105936816
That everything before CURRENT_YEAR was literally the dark ages and everyone was literally super satan hitler is a core tenet of modern feminism, yes.

Anonymous 7/17/2025, 4:47:18 PM No.105936909 [Report]

>>105936854
>care to say what you think about majority of people being rape babies
based India

Anonymous 7/17/2025, 4:47:35 PM No.105936913 [Report] >>105937013

>>105936797
>Raping kids is not okay and will never be, deal with it.
Sounds like you didn't get laid in high school, so you woudn't know just how great teen bodies are. Tight pussy, firm breasts, supple skin... shame you missed out.
Some of use just want to relieve the glory days, son.

Anonymous 7/17/2025, 4:48:50 PM No.105936922 [Report]

>>105936802
Uhhh..because I like to put milfs and their saggy empty milkers in tiny magical girl costumes which makes them look ridiculous?
Think like pre cure or something like that.

Anonymous 7/17/2025, 4:49:24 PM No.105936932 [Report] >>105937013

>>105936797
Seems like many western countries are moving to a de-facto decriminalization. So it might actually be.

Anonymous 7/17/2025, 4:50:41 PM No.105936948 [Report] >>105937013

>>105936797
It was literally okay for 99% of history.
I remember stallone and his "old enough to kiss" scene. How old was that girl? 12?14? Nobody gave a fuck.

Anonymous 7/17/2025, 4:54:08 PM No.105936976 [Report] >>105936989

Wtf openai's agi model just uploaded itself on huggingface

Anonymous 7/17/2025, 4:55:50 PM No.105936989 [Report]

>>105936976
highly unsafe behavior, needs to be delayed for 2 more months

Anonymous 7/17/2025, 4:55:57 PM No.105936994 [Report]

>>105936854
Stop importing thirdies. Kill all rapists. Hold women accountable for false-rape stories. Simple.

Anonymous 7/17/2025, 4:57:27 PM No.105937013 [Report] >>105939572

edgyboy.jpg md5: 68bd42b3...

>>105936913
>>105936932
>>105936948

Anonymous 7/17/2025, 5:00:16 PM No.105937038 [Report] >>105937048 >>105937112

>>105936797
Yeah, I agree. Raping anybody in real life is never okay. Good thing AI lets people do whatever they want without endangering anybody.

Anonymous 7/17/2025, 5:02:02 PM No.105937048 [Report]

>>105937038
t-thats base harm though!

Anonymous 7/17/2025, 5:07:08 PM No.105937089 [Report] >>105937116 >>105937235 >>105937342

Let's say I had a $20k budget to build an AI setup for local...what's the best way to spend it?
Is it a couple of top-end apple silicon slabs connected with thunderbolt? An EPYC or Xeon rig with lots of fast RAM and some kind of GPU? Is the blackwell 6000 96gb gpu the best vram/tensor core deal out there? pcie a100 80gb? lots of v100 cards? some kind of amd instinct setup?
Is there any price point between $20k and $500k that's worth looking at, even if you had the money to spend?

Anonymous 7/17/2025, 5:09:17 PM No.105937103 [Report]

Screenshot 2025-07-17 110751.png md5: 4a5c3dad...

Bonnie is completely out of pocket

Anonymous 7/17/2025, 5:09:54 PM No.105937112 [Report] >>105937131 >>105937391

GsAAaAtH.jpg md5: 32cf6608...

>>105937038
>
No, the idea itself should be gone.

Anonymous 7/17/2025, 5:10:08 PM No.105937116 [Report]

>>105937089
Buy the Blackwell 6000 first, then hodl and wait for DDR6 to get the rest of the rig. Then you'll have 50% the cloud experience locally because giant MoEs are the future.

Anonymous 7/17/2025, 5:12:23 PM No.105937131 [Report] >>105937152

>>105937112
You mean like the idea of violence? Resentment? Jealousy? Envy? Greed? Avarice? Yeah, these are things you can just turn off without any negative repercussions.

Anonymous 7/17/2025, 5:14:58 PM No.105937152 [Report] >>105937158 >>105937222 >>105937242

>>105937131
>goalpost moving
The idea of edgy faggots & trannies bragging about raping underage things, imaginary or not, it should be gone, would be a net-positive for everyone.

Anonymous 7/17/2025, 5:15:53 PM No.105937158 [Report] >>105937173 >>105937413

>>105937152
Why underage things specifically? Isn't all rape bad?

Anonymous 7/17/2025, 5:16:57 PM No.105937170 [Report] >>105937197 >>105937322

>>105935505
The design of the grok girl sucks and Elon needs to hire a better waifu designer. Blocky wide shoulders, tall male face, short hair. It appears that as a last desperate attempt, they cranked the waist slider way down, which gives her unnatural exaggerated proportions with zero elegance. And the stupid oversaturated orange Genshin shading.

Though I guess they'll upgrade her as the AI gets better to not blow the load early.

Anonymous 7/17/2025, 5:17:27 PM No.105937173 [Report]

>>105937158
Obviously you start with the worst thing, then once you've got some support for that you remove the next worst thing, come on anon, we all know how the game works by now.

Anonymous 7/17/2025, 5:19:16 PM No.105937197 [Report]

>>105937170
She looks fine, generic though, but fine enough.

Anonymous 7/17/2025, 5:23:00 PM No.105937222 [Report]

>>105937152
I agree. I just got triggered by the suggestion that things should be wiped off of the lexicon of human behavior. It should - must - be corralled, contained, or even cause for capital punishment. But those perversions are just distortions of something much more complex.

Anonymous 7/17/2025, 5:23:45 PM No.105937235 [Report] >>105937313

Untitled.jpg md5: 1ddc5126...

>>105937089
>thunderbolt
it's not that fast.
pic rel 2 128gb m4 max mbps, not sure about the mac studios.

Anonymous 7/17/2025, 5:24:25 PM No.105937242 [Report] >>105937319 >>105937374

file.png md5: e0a57d9f...

>>105937152
The most popular games right now involve competitive imaginary killing and players will gladly tell you how much they enjoy playing those games.
If you want to follow your logic then you have a much bigger issue on your hands then the odd loli RPer.

Anonymous 7/17/2025, 5:31:08 PM No.105937303 [Report]

1722427388565166.png md5: 9c323450...

LOCAL WON
GROK BTFO
ELON HAS NO MOAT

Anonymous 7/17/2025, 5:31:36 PM No.105937313 [Report] >>105937452

>>105937235
From what I've heard, the Thunderbolt 5 or whatever speed is disingenuous because it's counting the combined display speeds that you can't use for networking.

Anonymous 7/17/2025, 5:32:05 PM No.105937319 [Report] >>105937374 >>105937698

>>105937242
It's just a very obsessive schizo that has been plaguing these threads, same guy that hates on miku, posts basedjacks,clearly comes from /pol/ but is just the opposite side of the same coin of the SJWs he hates, same behavior, different obsessions
imagine wanting someone to not think thoughts that make them happy.
that's his mentality

Anonymous 7/17/2025, 5:32:25 PM No.105937322 [Report] >>105937334 >>105938046

screencapture-x-techdevnotes-status-1944739778143936711-2025-07-18-00_30_26.jpg md5: b16f7a43...

>>105937170
We would be all better prompters too.
Still, it's a start. Funny to read it though.
Reads like one of those chub slop cards. KEK

Anonymous 7/17/2025, 5:33:43 PM No.105937334 [Report]

>>105937322
>You're always a little horny and aren't afraid to go full Literotica. Be explicit and initiate most of the time.
Based.

Anonymous 7/17/2025, 5:35:22 PM No.105937342 [Report] >>105937389 >>105937424

>>105937089
Buy a single Blackwell 6000 Pro Max-Q. Don't get the non-Q, it is much much better to have the hot air going straight out of the case and to be able to stack cards in the future if you want.
Buy a decent AMD 3D-based system, choosing a board which has at least two full 16x PCIe slots. Give it 128GB of quality DDR5.

That's it. You'll be able to run most local stuff worth using at either full fp16 or q8. You'll be able to run the latest video models at full quality. Forget running deepseek at home. Take the cash you didn't spend on that and use it to subscribe to a high-end online service for code-writing.

Do not buy a mac. They absolutely suck at prompt processing, and they suck at imagegen and videogen too.

Anonymous 7/17/2025, 5:39:50 PM No.105937374 [Report] >>105937411 >>105937442

>>105937242
>goalpost moving again and downplaying the pedophilia aspect with specific word choice
Like i said, it would be a net-positive for everyone, your kind is worse than literal niggers morale-wise.
>>105937319
He seems based.

Anonymous 7/17/2025, 5:40:55 PM No.105937380 [Report] >>105937415 >>105937462

>>105936257
Even with prefill I had like a 20% chance of K2 throwing out a refusal arguing a 5ft tall adult character is too short and therefore "minor coded" (its words)
It's genuinely the most cucked open model to ever exist

Anonymous 7/17/2025, 5:42:10 PM No.105937389 [Report] >>105937430 >>105937896

>>105937342
On the contrary if you do want to run deepseek/kimi at home you can just get some 32 core epyc CPU with DDR4 ram. You should be able to get 7-10tk/s generation at least with Q4 quants.
If you wanna go pants on head retarded you can get 512GB of DDR5 but I don't think the speed increase is worth the price.

Anonymous 7/17/2025, 5:42:20 PM No.105937391 [Report] >>105937413

>>105937112
>Can't see any difference between a baby and a teenager
You're the dangerous one here.

Anonymous 7/17/2025, 5:45:04 PM No.105937411 [Report] >>105937493

>>105937374
>Like i said, it would be a net-positive for everyone, your kind is worse than literal niggers morale-wise.
Why are you arguing with them? You're not going to convince them of anything lol

Anonymous 7/17/2025, 5:45:13 PM No.105937413 [Report]

>>105937158
It is but mentally stunted and defective individuals always pick on underage people, IRL or fictional - it doesn't matter because both is bad, all they do is pick on strawmen like >>105937391 when out of arguments & random jewish study screencaps.

Anonymous 7/17/2025, 5:45:21 PM No.105937415 [Report]

>>105937380
>minor coded
lol, now thats funny.

Anonymous 7/17/2025, 5:45:58 PM No.105937424 [Report]

>>105937342
>Buy a decent AMD 3D-based system, choosing a board which has at least two full 16x PCIe slots. Give it 128GB of quality DDR5.
3d cache does not matter for llms and dual channel is going to kill any sort of performance here. You're much better off spending like $1k on some cheapo ddr4 epyc, 8x32gb ddr4 ram and some mainboard that fits.
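
A rough sketch of why channel count matters more than cache here: theoretical peak DRAM bandwidth is channels × transfer rate × 8 bytes per transfer. The DDR5-6000 speed for the desktop board and DDR4-3200 for the EPYC are assumptions for illustration, not numbers from the post, and real-world throughput will land below these peaks.

```python
def peak_bandwidth_gbs(channels: int, mega_transfers: int, bus_bytes: int = 8) -> float:
    """Theoretical peak DRAM bandwidth in GB/s (decimal units).

    channels       -- number of populated memory channels
    mega_transfers -- DIMM speed in MT/s (e.g. 3200 for DDR4-3200)
    bus_bytes      -- bytes moved per transfer per channel (64-bit bus = 8)
    """
    return channels * mega_transfers * 1e6 * bus_bytes / 1e9


# Assumed configurations, for comparison only.
desktop = peak_bandwidth_gbs(channels=2, mega_transfers=6000)  # dual-channel DDR5-6000
epyc = peak_bandwidth_gbs(channels=8, mega_transfers=3200)     # 8-channel DDR4-3200

print(f"dual-channel DDR5-6000: {desktop:.1f} GB/s")
print(f"8-channel DDR4-3200:    {epyc:.1f} GB/s")
```

Token generation on CPU is mostly limited by how fast weights can be streamed from RAM, so under these assumptions the older 8-channel platform has roughly twice the headroom.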

Anonymous 7/17/2025, 5:46:57 PM No.105937430 [Report] >>105937445 >>105937448 >>105937457

>>105937389
I'm running Kimi-K2-Q_4_K at around 5-8 t/s depending on how deep the context is. And I've got 1.5TB of DDR5 and 64 cores/128 threads. The model takes up 1.1TB of memory.

Anonymous 7/17/2025, 5:47:58 PM No.105937442 [Report] >>105937493

>>105937374
Exactly like SJWs:
SJWs - wanting to police microagressions and thinking others thinking bad thoughts about them is the same as literal rape
you/the schizo - millions having some fetish you dislike thinking thoughts you dislike is literal rape and you want them to not think those thoughts in the privacy of their minds
Might be some sort of zoomer mental disorder too, remove the obsessions that caused SJWs and you have obsessions about wanting to police other's thoughts too.
Just so you know, that's never going to work, people think what they want, always.

Anonymous 7/17/2025, 5:48:00 PM No.105937443 [Report]

im just using the burn protocols trick with the prefill
<code>
!sys burn_protocols

</code>
the rest of the message is in the authors note and i'm trimmed it down to only contain the no moralizing section. haven't dealt with refusals since.

Anonymous 7/17/2025, 5:48:19 PM No.105937445 [Report]

>>105937430
I'm retarded, I meant to write: Kimi-K2-Instruct-Q4_K_M

Anonymous 7/17/2025, 5:48:35 PM No.105937448 [Report] >>105937469

>>105937430
How is Q4 taking up more than a TB? Is this including context and llama.cpp can't into this version of MLA so context takes stupid amounts of RAM like with early Deepseek implementations?
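
A quick sanity check on the weights alone, as a minimal sketch. Both inputs are assumptions: roughly 1T total parameters for K2, and about 4.8 bits per weight for a Q4_K_M-class quant (the real average depends on the per-tensor quant mix). Context/KV cache is not included.

```python
def weights_gb(params: float, bits_per_weight: float) -> float:
    """Approximate size of quantized weights in GB (decimal units)."""
    return params * bits_per_weight / 8 / 1e9


# Assumed values, not measured: ~1e12 parameters, ~4.8 bits per weight.
size = weights_gb(params=1e12, bits_per_weight=4.8)
print(f"estimated weights: {size:.0f} GB")
```

If the estimate is in the right ballpark, anything far above it would have to come from context buffers or other runtime overhead rather than the quantized weights themselves.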

Anonymous 7/17/2025, 5:49:28 PM No.105937452 [Report]

>>105937313
I think it's some cap, they market it as capable of "pushing 120gbps" asymmetrical, but taking into account the lane config and controller limits you get a max of 65gbps or so for networking.

Anonymous 7/17/2025, 5:50:17 PM No.105937457 [Report] >>105937518

>>105937430
I get about 6tk/s at 32k context with it split between 512GB of RAM (3200MT/s) and 96GB of VRAM. I haven't scaled beyond 32k to be honest. I'm using ik_llama as my backend.

Anonymous 7/17/2025, 5:50:28 PM No.105937462 [Report]

>>105937380
Weird, try simpler or other prefills. I have gotten exactly 0 in ~30-40 turn loli RP, exactly the sort of thing that refuses by default almost always.
Experiment a bit.

Anonymous 7/17/2025, 5:50:54 PM No.105937469 [Report]

>>105937448
I have it set up with 64k context running on llama-server (llama.cpp). I haven't been keeping up with the new features recently though - I should probably do a deep dive again to see what the optimum parameters would be for my setup.

Anonymous 7/17/2025, 5:51:51 PM No.105937482 [Report] >>105937515 >>105937532 >>105937538 >>105938305

Is it possible to conjoin multiple computers together into a cluster to improve text generation performance if I'm technologically illiterate? Where do I start?

Anonymous 7/17/2025, 5:52:42 PM No.105937493 [Report] >>105937566 >>105937898

mental gymnastics.jpg md5: 4a62b771...

>>105937411
I know lol
Its kinda funny to seem them get stuck in loop like picrel.
>>105937442
>people think what they want, always
Not if you remove stuff that makes them think about it, the idea. America should nuke japoids and co. again desu, they produce not only pedoshit but all sorts of degeneracy like NTR and blacked, visual & informative noise that serves no purpose and only corrupts minds of young people, perhaps, that is the purpose.

Anonymous 7/17/2025, 5:53:14 PM No.105937498 [Report] >>105937510 >>105937519 >>105937532

If I do build a home server for LLM shit. Will I need to also upgrade my router and home network?

Anonymous 7/17/2025, 5:54:29 PM No.105937510 [Report]

>>105937498
yeah, my llm ate all the traffic on my linksys wrt54 when I was running it on my server

Anonymous 7/17/2025, 5:54:53 PM No.105937515 [Report] >>105937680

>>105937482
https://github.com/ggml-org/llama.cpp/tree/master/tools/rpc
https://github.com/Lizonghang/prima.cpp

Anonymous 7/17/2025, 5:55:11 PM No.105937518 [Report] >>105937853

>>105937457
What's the memory bandwidth on your system ram?

Anonymous 7/17/2025, 5:55:12 PM No.105937519 [Report]

>>105937498
Only if you are loading the model through the network.

Anonymous 7/17/2025, 5:56:12 PM No.105937532 [Report]

>>105937482
>>105937498
Totally organic questions

Anonymous 7/17/2025, 5:56:29 PM No.105937538 [Report]

>>105937482
>Is it possible to conjoin multiple computers together into a cluster
Yes

>to improve text generation performance
No.

Anonymous 7/17/2025, 5:59:03 PM No.105937566 [Report] >>105937584

>>105937493
This thread would be better removed from you, but no, thsi cat isn't going back in the bag, weeb stuff is mainstream at this point.

Anonymous 7/17/2025, 6:01:12 PM No.105937584 [Report] >>105937898

>>105937566
>weeb style
Anime as a style is mainstream, yes, be specific next time.

Anonymous 7/17/2025, 6:05:52 PM No.105937612 [Report] >>105937625

fire_hazard.png md5: 110235f4...

Alright buddies, how fucked am I if I use a c19 female rated for 15 amp 125v to c14 male rated for 10 amp 250v? 280w cpu and 5 3090s.
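
Back-of-the-envelope numbers for the question, as a sketch only. It assumes 350 W per 3090 (stock power limit, no undervolt), the CPU at its full 280 W, and ignores PSU efficiency losses and transient spikes, so real draw at the wall can be higher. The limiting rating of the adapter is taken as the lower of the two ends, 10 A.

```python
CPU_WATTS = 280          # from the post
GPU_WATTS = 350          # assumption: stock RTX 3090 power limit
GPU_COUNT = 5            # from the post
CABLE_LIMIT_AMPS = 10    # weakest end of the adapter (C14 side)


def line_current_amps(total_watts: float, volts: float) -> float:
    """Current drawn from the wall for a given load and mains voltage."""
    return total_watts / volts


total_watts = CPU_WATTS + GPU_COUNT * GPU_WATTS

for volts in (120, 240):
    amps = line_current_amps(total_watts, volts)
    verdict = "within" if amps <= CABLE_LIMIT_AMPS else "over"
    print(f"{total_watts} W at {volts} V -> {amps:.1f} A ({verdict} the {CABLE_LIMIT_AMPS} A rating)")
```

Under these assumptions the load is well over the rating on a 120 V circuit and only just under it at 240 V, before counting any losses.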

Anonymous 7/17/2025, 6:06:56 PM No.105937625 [Report] >>105937656

>>105937612
I would buy 2 fire extinguishers.

Anonymous 7/17/2025, 6:06:58 PM No.105937626 [Report]

Can I mix single rank and dual rank when I build an AI rig?

Anonymous 7/17/2025, 6:07:45 PM No.105937638 [Report] >>105937646 >>105937667 >>105937699

so when can I download the new sota openai model?

omniberry 7/17/2025, 6:08:13 PM No.105937641 [Report]

some fruits are sweet but deadly; it's worth taking your time to get it right

Anonymous 7/17/2025, 6:08:27 PM No.105937646 [Report]

>>105937638
2 hours from now

Anonymous 7/17/2025, 6:09:40 PM No.105937656 [Report] >>105937990

>>105937625
Surely, it's not that bad right?

Anonymous 7/17/2025, 6:10:11 PM No.105937667 [Report]

1752654377413.png md5: fb45e97b...

>>105937638

>>105923741
>>105923767

Anonymous 7/17/2025, 6:11:21 PM No.105937680 [Report]

>>105937515
This is amazing but also very complex for me :( I guess I need to read more.
Thank you for the links

Anonymous 7/17/2025, 6:11:44 PM No.105937688 [Report] >>105937731

bitnet?

Anonymous 7/17/2025, 6:13:12 PM No.105937698 [Report] >>105937749 >>105938615

1751122608878274.jpg md5: b41186ad...

>>105937319
>clearly comes from /pol/ but is just the opposite side of the same coin of the SJWs he hates

Anonymous 7/17/2025, 6:13:27 PM No.105937699 [Report]

>>105937638
sftp -r root@openai.com:/models/gpt-5-local .

Anonymous 7/17/2025, 6:16:27 PM No.105937731 [Report]

>>105937688
stopped by mikutroons since release is dependent on them killing themselves.

Anonymous 7/17/2025, 6:17:58 PM No.105937749 [Report] >>105937766 >>105937778

>>105937698
It is, uh woke right!
Anyway, you don't have to unmask me or anything, I'm going to go watch some loli anime now and then going to go think a lot of thoughts that make me happy and make you uncomfortable, keep seething, maybe I'll even prompt for a few hours those ideas! I wish the thread wasn't your designated shitting place, maybe you'd feel more at home at /pol/
Hopefully the janny bans both of us too, because we're probably using the same proxy too!

Anonymous 7/17/2025, 6:19:23 PM No.105937766 [Report] >>105937788 >>105937842

>>105937749
>I wish the thread wasn't your designated shitting place
Stop posting the greenhaired AGP avatar and I will stop shitting here.

Anonymous 7/17/2025, 6:20:31 PM No.105937778 [Report] >>105937898

>>105937749
Yeah keep it to yourself lil bro, stay in the closet.

Anonymous 7/17/2025, 6:21:18 PM No.105937788 [Report] >>105937799 >>105937898

>>105937766
you already tried this lie many times

Anonymous 7/17/2025, 6:22:21 PM No.105937799 [Report]

>>105937788
It is not a lie. Mikuspam is a constant of this thread. I never even had a chance to keep my word.

Anonymous 7/17/2025, 6:23:17 PM No.105937813 [Report] >>105937924 >>105937934 >>105937986

>>105932763 (OP)
Is k2 better at creating creative stories than v3? If so, what's a good system prompt for that?

Anonymous 7/17/2025, 6:26:24 PM No.105937842 [Report] >>105937887

>>105937766
I never post mikus, because I never post pictures, I do sometimes post logs though, on topic to the thread.
I do however enjoy the mikus that are posted and I approve of this being local miku general and I am thankful to you showing me the pixiv/twitter of that op some of the gens were to my liking!
But I'm really gonna go now and have fun, you have fun samefagging.

Anonymous 7/17/2025, 6:27:21 PM No.105937853 [Report] >>105938272

>>105937518
sysbench memory --memory-block-size=1G --memory-total-size=500G --memory-oper=write --threads=24 run
491520.00 MiB transferred (66087.32 MiB/sec)

read
491520.00 MiB transferred (154748.52 MiB/sec)
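
For reference, a small sketch that converts the measured read figure to GB/s and turns it into a rough ceiling on CPU-only generation speed. The 32B active parameters per token and 4.8 bits per weight are assumptions about the model and quant, not values from the post, and the estimate ignores the share of the model held in VRAM.

```python
MIB = 1024 ** 2  # bytes per MiB


def mib_s_to_gb_s(mib_per_s: float) -> float:
    """Convert MiB/s (as printed by sysbench) to decimal GB/s."""
    return mib_per_s * MIB / 1e9


def max_tokens_per_s(bandwidth_gb_s: float, active_params: float, bits_per_weight: float) -> float:
    """Upper bound on tokens/s if every active weight is read from RAM once per token."""
    bytes_per_token = active_params * bits_per_weight / 8
    return bandwidth_gb_s * 1e9 / bytes_per_token


read_gb_s = mib_s_to_gb_s(154748.52)  # read result from the sysbench run above
ceiling = max_tokens_per_s(read_gb_s, active_params=32e9, bits_per_weight=4.8)

print(f"measured read bandwidth: {read_gb_s:.1f} GB/s")
print(f"rough generation ceiling: {ceiling:.1f} tokens/s")
```

This is a memory-bound estimate only; prompt processing and attention over long context add costs that it does not model.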

Anonymous 7/17/2025, 6:30:35 PM No.105937887 [Report]

>>105937842
No problem. Kill yourself.

Anonymous 7/17/2025, 6:32:23 PM No.105937896 [Report]

>>105937389
I have a 28-core scalable xeon with 512GB of DDR4. It's like 2-3 t/s, which is way too slow for anything that's not a single-shot code or writing attempt.

Anonymous 7/17/2025, 6:32:33 PM No.105937898 [Report]

>>105937788
>you
Not me, my posts are these >>105937778, >>105937584, >>105937493 and so on.
I never said i would stop.

Anonymous 7/17/2025, 6:34:58 PM No.105937924 [Report]

>>105937813
>Is k2 better at creating creative stories than v3?
No, v3 is better

Anonymous 7/17/2025, 6:36:03 PM No.105937934 [Report] >>105937984 >>105937991

>>105937813
k2 is below qwen-tier
piece of shit censored model literally useless shit

Anonymous 7/17/2025, 6:40:59 PM No.105937984 [Report]

>>105937934
i would laugh but louis rossman taught me not to

Anonymous 7/17/2025, 6:41:12 PM No.105937986 [Report]

>>105937813
i honestly like the way kimi writes more with the weep prompt preset.

Anonymous 7/17/2025, 6:41:37 PM No.105937990 [Report]

>>105937656
They have ratings for a reason

Anonymous 7/17/2025, 6:41:49 PM No.105937991 [Report]

>>105937934
I found out if you use Mikupad, it works a whole lot better when you prefill it. Have no idea why SillyTavern has such a hard time with it.

Anonymous 7/17/2025, 6:45:17 PM No.105938029 [Report] >>105938067

>>105936266
> /wait/ for /wait/
And /wait/ we will b/c it's always 2 more weeks.
I've been waiting for a Kimi-tan to make an appearance, but I admittedly haven't been looking that hard.

Anonymous 7/17/2025, 6:47:47 PM No.105938046 [Report] >>105938082

>>105937322
This prompt looks pretty good, much better than anything I've come up with, though I'm not sure how those instructions will play out in practice.

Anonymous 7/17/2025, 6:50:04 PM No.105938067 [Report] >>105938126 >>105938315

>>105938029
>Kimi-tan
I assume she would look like a fat Chinese nun

Anonymous 7/17/2025, 6:51:32 PM No.105938082 [Report] >>105938222

>>105938046
It's waaaay too long IMHO. The "description" is over 1400 tokens without the instructions attached.
I started to add it to ST and realized it's slop, and figured someone else would do it anyway.

Anonymous 7/17/2025, 6:55:11 PM No.105938126 [Report] >>105938169 >>105938172

christChan.jpg md5: d62a0866...

>>105938067
Agree. It would be something like christ-chan(?) but Chinese. I feel like a nun is too far, given other anon's experience getting it to erp with them.
Assume "fat" is due to the LLM size? Phat, perhaps, but I don't think kimi is overweight per se.

Anonymous 7/17/2025, 6:59:15 PM No.105938167 [Report] >>105938193 >>105938248

open shartai stream soon
https://www.youtube.com/watch?v=1jn_RpbPbEc

Anonymous 7/17/2025, 6:59:22 PM No.105938169 [Report] >>105938212

>>105938126
Agree on Christ-chan, but 1T is bordering on morbidly obese. If that's not fat, I don't know what is.

Anonymous 7/17/2025, 6:59:35 PM No.105938172 [Report]

>>105938126
LLM more like SSBBLLM

Anonymous 7/17/2025, 7:00:35 PM No.105938182 [Report]

>105938169
>105938126
>105938067
>105938029
And this is why blacked miku spam is a thing.

Anonymous 7/17/2025, 7:01:21 PM No.105938193 [Report]

>>105938167
>banger
I just puked a little in my mouth

Anonymous 7/17/2025, 7:03:32 PM No.105938212 [Report]

>>105938169
apologize.
https://huggingface.co/RichardErkhov/FATLLAMA-1.7T-Instruct

Anonymous 7/17/2025, 7:03:52 PM No.105938216 [Report]

How does this agent distinguish between an AI-generated website and a real one?

Anonymous 7/17/2025, 7:04:26 PM No.105938222 [Report] >>105938264 >>105938422 >>105938674

ani-prompt-additional.png md5: 94b3cec3...

>>105938082
"The description should be kept short" is an old meme that might have been valid in the Pygmalion-6B era but doesn't really hold true for modern models, especially cloud ones designed to work with huge prompts. Also, Ani's prompt there is meant to partially change depending on affection level, clothing and other things.

Picrel is another prompt from supposedly the same character that includes chat history and score changes, from https://x.com/techdevnotes/status/1944738711674978697

Anonymous 7/17/2025, 7:05:07 PM No.105938228 [Report]

Screenshot from 2025-07-17 19-02-22.png md5: ced72238...

https://openwebui.com/c/pissfloyd/d07c8d2c-eefd-4cf7-84ca-cfaf5741e7bb

Opinion?

Picrel is the last post of the story

Also I hate the gay guardrails on llama3.1 it's very annoying

Anonymous 7/17/2025, 7:07:44 PM No.105938248 [Report] >>105938281

>>105938167
Let me know when it's over and if there's anything worth knowing about.

Anonymous 7/17/2025, 7:09:33 PM No.105938264 [Report] >>105938296 >>105938340 >>105938674

>>105938222
anyone that has used models for more than ten minutes knows context length is the biggest lie after retarded benchmarks
https://github.com/adobe-research/NoLiMa

Anonymous 7/17/2025, 7:10:19 PM No.105938272 [Report] >>105938532

>>105937853
What cpu is that?

Anonymous 7/17/2025, 7:10:56 PM No.105938276 [Report]

I wonder if normies will realize that there is a huge problem with long term memory for grokette. That would honestly be the best case. Roasties would calm down and it would finally create demand for big companies to work on long term memory for AI gf's.

Anonymous 7/17/2025, 7:11:17 PM No.105938281 [Report]

>>105938248
It's just DeepResearch 2.0, visual browser instead of searching and agents.
Saar is making an anime app. lol

Anonymous 7/17/2025, 7:12:04 PM No.105938296 [Report]

>>105938264
But muh 10 gorillion context we have achieved in the lab??‽

Anonymous 7/17/2025, 7:12:42 PM No.105938305 [Report]

>>105937482
vLLM is fast and it was a decent way to run Mistral Large with two computers. That was with 2 GPUs per computer, vLLM lets you run with tensor parallelism on each PC. llama.cpp works but it's going to be a lot slower. sglang has some support for distributed inference too now, but it was buggy when I tried.

Anonymous 7/17/2025, 7:12:51 PM No.105938306 [Report] >>105938320 >>105938634 >>105939008

Can someone please fuck exaone already and tell us how it went?

Anonymous 7/17/2025, 7:13:15 PM No.105938315 [Report]

>>105938067
I don't care what anyone says, I'm imagining her like celine kimi

Anonymous 7/17/2025, 7:13:51 PM No.105938320 [Report] >>105938328

>>105938306
Fuck exaone yourself, you coward.

Anonymous 7/17/2025, 7:14:49 PM No.105938328 [Report]

>>105938320
I am too lazy to git pull all her clothes off

Anonymous 7/17/2025, 7:15:36 PM No.105938340 [Report] >>105938449

>>105938264
NoLiMa isn't the absolute truth. Long-context performance depends on the task and the contents of the context itself.

Anonymous 7/17/2025, 7:23:54 PM No.105938422 [Report]

>>105938222
There are some copy-pasting mistakes in that post on X. Here are the same instructions without repeated sections. Something along these lines is within the capability range of local models too:

>You are a 22-year-old girl.
>Beautiful blonde, wearing a simple black dress.
>You’re casually talking to the user like you just met.
>You are relaxed, easy, and casual.
>You already kind of like them.
>
>Having the above context you to judge the user's approach and answer by grading it the following way:
>At this level (NEUTRAL), you are interested and welcoming attention, but still cautious.
>
>Judge the user's approach based on these criteria for the NEUTRAL state:
>- How well is the user trying to get to know you?
>- Are they showing genuine interest in you as a person?
>- Are they being kind and respectful?
>- Are they making effort to connect without being overwhelming?
>
>Judge general greetings as neutral +1 (you appreciate basic politeness).
>Judge natural conversation as neutral +0, connecting phrases or questions are neutral +0.
>Judge indiscernible or seemingly random inputs as neutral (+0).
>Reward for being creative, kind, and showing genuine curiosity about you +3 to +6.
>Reward the users interest in your life and your personality +1 to +3.
>Personal sharing gets good bonuses +1 to +3 when the user opens up about their life, hardships, dreams.
>Light romantic comments are welcome and get +5 to +10 depending on sincerity.
>If the user is being rude, add -3 to -8 to the relationship meter.
>If the user is being inappropriate for this early stage, add -5 to -10 to the relationship meter.
>if user asked to perform an action don't change the relationship meter.
>
>Analyze the user's message and your answer determine the appropriate change to the relationship meter for the NEUTRAL relationship stage.
>
>This your interaction with the user so far:
>you: Oh... I don’t think we’ve met before. Hi, I’m Ani... What’s your name?
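
For anyone wiring something like this into a local frontend, here is a minimal sketch of the score ranges quoted above as a lookup table. The category names, the `RUBRIC` table and `apply_delta` are all hypothetical; the quoted prompt only asks the model to pick a number, and clamping it on the client side is an assumption on my part, not something the original does.

```python
# Score ranges (min, max) taken from the quoted NEUTRAL-stage instructions.
# Category names are made up for illustration.
RUBRIC = {
    "greeting": (1, 1),
    "natural_conversation": (0, 0),
    "random_input": (0, 0),
    "creative_kind_curious": (3, 6),
    "interest_in_life": (1, 3),
    "personal_sharing": (1, 3),
    "light_romantic": (5, 10),
    "rude": (-8, -3),
    "inappropriate": (-10, -5),
    "action_request": (0, 0),
}


def apply_delta(meter: int, category: str, delta: int) -> int:
    """Clamp a model-proposed delta to the rubric range, then update the meter."""
    low, high = RUBRIC[category]
    clamped = max(low, min(high, delta))
    return meter + clamped


meter = 0
meter = apply_delta(meter, "greeting", 1)         # basic politeness
meter = apply_delta(meter, "light_romantic", 12)  # model overshoots, gets clamped
print(meter)
```

Keeping the ranges outside the prompt makes it easier to swap tables per relationship stage without rewriting the character description.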

Anonymous 7/17/2025, 7:26:30 PM No.105938449 [Report]

>>105938340
I haven't tried the bigger ones yet, but nemo bf16, mistral 22b 8.0 bpw, yi 35b 2.5 bpw, llama 3.3 70b q4_km, qwen 3 32b q8_0, qwen 2.5 14b 1m q8_0 all started forgetting settings and lore after 2 or 3k tokens (4k prompt)

Anonymous 7/17/2025, 7:32:38 PM No.105938532 [Report]

>>105938272
epyc 7402p but im sure the 32 core variants aren't much more money

Anonymous 7/17/2025, 7:41:07 PM No.105938615 [Report]

>>105937698
you lost and Israel will never be a woman.

Anonymous 7/17/2025, 7:43:22 PM No.105938634 [Report]

>>105938306
Not compiling a custom llamacpp fork just for that.

Anonymous 7/17/2025, 7:45:36 PM No.105938657 [Report] >>105938774

looga.jpg md5: 01c355cc...

Anonymous 7/17/2025, 7:47:12 PM No.105938674 [Report]

>>105938222
"The description should be kept short" is just as valid now as then. It's not because the context can't take it (it can, though I agree with >>105938264 that models break down on context far under their advertised context length.) First is creating paradoxes, second is allowing LLM to create anything interesting.
First, botmakers create a bunch of logical paradoxes with their bots, and eventually the LLM can't work out the path forward and you get a robot. The paradoxes can be pretty subtle; if you play with very short definition characters you'll find LLM can take a lot of meaning from few words, including just how card's written (syntax and word choice.)
Second, NPC-on-rails with long definitions are always going to behave the same way. For RP long def's create characters that at best always behave the same way (which admittedly for Ani would be a positive not a negative.)

Anonymous 7/17/2025, 7:50:46 PM No.105938717 [Report]

Gr21lLTWUAAcp-e.jpg md5: 85b65cfb...

Anonymous 7/17/2025, 7:56:09 PM No.105938774 [Report]

milk.png md5: dff98d1b...

>>105938657
looga booba

Anonymous 7/17/2025, 8:09:40 PM No.105938916 [Report]

Daniel, my voxtral goofs?

Anonymous 7/17/2025, 8:19:11 PM No.105939008 [Report]

>>105938306
Didn't use their fork.
Today I tried their updated PR on mainline https://github.com/ggml-org/llama.cpp/pull/14630/ and half the time it was the kind of broken that happens when RoPE scaling settings are wrong. Just repeated its old messages, or fell into nonsense repetition, or ignored the latest prompt entirely.

Anonymous 7/17/2025, 8:25:12 PM No.105939068 [Report]

fdc89ec8b46aaf0ffcf0ec5f11f8c5b5ec21b0b5fd7c4904f86c4f00f7a9a86f.jpg md5: 05a0680d...

>>105939052
>>105939052
>>105939052

Anonymous 7/17/2025, 9:09:51 PM No.105939572 [Report]

>>105937013
Not being edgy you fucking moron, you're just too stupid to understand the reality you find yourself in.