← Home ← Back to /g/

Thread 107147210

371 posts 84 images /g/
Anonymous No.107147210 [Report] >>107150730 >>107152961 >>107153296 >>107154355
/lmg/ - Local Models General
/lmg/ - a general dedicated to the discussion and development of local language models.

Previous threads: >>107138606 & >>107129334

►News
>(11/07) Step-Audio-EditX, LLM-based TTS and audio editing model released: https://hf.co/stepfun-ai/Step-Audio-EditX
>(11/06) Kimi K2 Thinking released with INT4 quantization and 256k context: https://moonshotai.github.io/Kimi-K2/thinking.html
>(11/06) LocalSong 700M melodic instrumental music generation model released: https://hf.co/Localsong/LocalSong
>(11/05) MegaDLMs framework for training diffusion language models released: https://github.com/JinjieNi/MegaDLMs
>(11/01) LongCat-Flash-Omni 560B-A27B released: https://hf.co/meituan-longcat/LongCat-Flash-Omni

►News Archive: https://rentry.org/lmg-news-archive
►Glossary: https://rentry.org/lmg-glossary
►Links: https://rentry.org/LocalModelsLinks
►Official /lmg/ card: https://files.catbox.moe/cbclyf.png

►Getting Started
https://rentry.org/lmg-lazy-getting-started-guide
https://rentry.org/lmg-build-guides
https://rentry.org/IsolatedLinuxWebService
https://rentry.org/recommended-models
https://rentry.org/samplers

►Further Learning
https://rentry.org/machine-learning-roadmap
https://rentry.org/llm-training
https://rentry.org/LocalModelsPapers

►Benchmarks
LiveBench: https://livebench.ai
Programming: https://livecodebench.github.io/gso.html
Context Length: https://github.com/adobe-research/NoLiMa
GPUs: https://github.com/XiongjieDai/GPU-Benchmarks-on-LLM-Inference

►Tools
Alpha Calculator: https://desmos.com/calculator/ffngla98yc
GGUF VRAM Calculator: https://hf.co/spaces/NyxKrage/LLM-Model-VRAM-Calculator
Sampler Visualizer: https://artefact2.github.io/llm-sampling

►Text Gen. UI, Inference Engines
https://github.com/lmg-anon/mikupad
https://github.com/oobabooga/text-generation-webui
https://github.com/LostRuins/koboldcpp
https://github.com/ggerganov/llama.cpp
https://github.com/theroyallab/tabbyAPI
https://github.com/vllm-project/vllm
Anonymous No.107147214 [Report]
►Recent Highlights from the Previous Thread: >>107138606

--Agentic finetuning success with Gemma 3 27b using dataset duplication strategy:
>107140749 >107140853 >107140874 >107141186 >107141904 >107145572 >107145579 >107141303
--Model performance comparison and IF evaluation benchmark discussion:
>107145761 >107145774 >107145810 >107145849 >107146116 >107146184 >107146306 >107145947 >107145956
--Strategies for preserving Opus-3 model conversations before deprecation:
>107140145 >107140264 >107140360 >107140384
--Exploring free proxy models for logic/programming tasks and style transfer via LoRA:
>107140277 >107140356 >107140365 >107140399 >107140446 >107141293
--Single vs dual-GPU dilemma for performance vs power safety tradeoffs:
>107143867 >107143877 >107143878 >107143946 >107144867 >107144872 >107144155
--Sampling optimization debate for creative RP with minP/Top-P and temperature tuning:
>107139402 >107139418 >107139447 >107139500 >107139577 >107139540 >107139897 >107139915
--Llama training methodology and safety implications of validation set optimization:
>107140894 >107140932 >107141030 >107141086 >107141101
--Neural network depth and Gemini 1.2T model performance speculation:
>107145345
--Toss model performance vs Gemma 3 in practical applications:
>107145833 >107145904 >107146168
--Cydonia model performance comparisons and upcoming releases:
>107140380 >107140394 >107140486 >107141250 >107140397 >107140661 >107143958 >107143966 >107146415 >107146427 >107146449 >107146485 >107146506
--DDR4-6000 price spike frustrations and DDR5 transition speculation:
>107139738 >107139779 >107139792 >107139982 >107139985 >107142864 >107142896 >107143500
--Qwen data increases overfitting risk in CoT models:
>107140601
--Gemma finetuning results with QwQ's data: less neurotic, still verbose:
>107139425
--Miku (free space):
>107140392

►Recent Highlight Posts from the Previous Thread: >>107138613

Why?: >>102478518
Enable Links: https://rentry.org/lmg-recap-script
Anonymous No.107147228 [Report]
>Toss model
Anonymous No.107147241 [Report] >>107147250 >>107147288 >>107147352
Can someone post a QRD for setting up VibeVoice? What repo, what settings etc..
Anonymous No.107147250 [Report]
>>107147241
It's stuck in python hell so just use the official demos on huggingface or find a comfyui node or something
Anonymous No.107147259 [Report] >>107147277
Can someone post a QRD for setting up Nemo? What fork, what temperature etc..
Anonymous No.107147262 [Report]
blos? is we over? >>107147122
Anonymous No.107147277 [Report] >>107147336 >>107147348
>>107147259
stop doubting yourself and just do what you think is right. it'll work out, believe in yourself
Anonymous No.107147288 [Report] >>107147308
>>107147241
I gotchu
https://github.com/vibevoice-community/VibeVoice?tab=readme-ov-file
Anonymous No.107147295 [Report] >>107147339
Can someone post a QRD for improving confidence uwu? Which hustler's plan, which youtube channel etc..
Anonymous No.107147304 [Report]
anyone have any idea as to why sillytavern keeps deciding to insert every entry from the lorebook at the very beginning of each chat despite none of the trigger words being mentioned?
Anonymous No.107147308 [Report] >>107147352
>>107147288
thank u anon, im gonna read the source code before installing to make sure we're safe
Anonymous No.107147336 [Report]
>>107147277
>average normalfag advice
Anonymous No.107147339 [Report]
>>107147295
I think you should try the Drummer plan! I tried ERP with the Rocinante's model and it helped me talk to white girls. Make sure to join our discord and look for the right channel for a better experience ;)
https://huggingface.co/TheDrummer/Rocinante-12B-v1.1
Anonymous No.107147348 [Report]
>>107147277
6'4" adonis's dating advice to 5'5" balding indian friend
Anonymous No.107147352 [Report] >>107147364 >>107147681
>>107147241
Back up of the original repo here:
https://github.com/great-wind/MicroSoft_VibeVoice
1.5B is still up:
https://huggingface.co/microsoft/VibeVoice-1.5B
Torrent of the repo (dunno if still seeded):
magnet:?xt=urn:btih:b5a84755d0564ab41b38924b7ee4af7bb7665a18&dn=VibeVoice&tr=udp%3a%2f%2ftracker.opentrackr.org%3a1337%2fannounce
Torrent for VibeVoice 7B:
magnet:?xt=urn:btih:d72f835e89cf1efb58563d024ee31fd21d978830&dn=microsoft_VibeVoice-Large&tr=udp%3a%2f%2ftracker.opentrackr.org%3a1337%2fannounce
Sampling with examples:
https://desuarchive.org/g/thread/106516368/#q106519850
https://desuarchive.org/g/thread/106516368/#q106519945
>>107147308
Good idea since the vibevoice-community repo has continued to be modified from the original and you don't know what was put into it since.
Anonymous No.107147364 [Report]
>>107147352
Thank you so so much anon <3
so so so so so much <3
Anonymous No.107147367 [Report] >>107147386 >>107147516
>Let me write:
[1500 tokens]
>Wait, the user mentioned [minor detail], I should include that.
[1500 tokens]
>Hmm, I think I should expand the other part
[1500 tokens]
>Good, let's now write the reply:</think>
K2 is so amazing. The way it plans ahead is so thorough. I love it.
Anonymous No.107147386 [Report] >>107147469 >>107147516
>>107147367
so you wait like 15 minutes before even seeing a single token
Anonymous No.107147469 [Report] >>107147479 >>107147500 >>107147516 >>107147559 >>107147600
>>107147386
Nobody actually runs Kimi locally everyone just uses the website and/or API and then lies about using it locally.
Anonymous No.107147479 [Report] >>107147487
>>107147469
so then everyone is a faggot?
Anonymous No.107147487 [Report]
>>107147479
UwU
Anonymous No.107147500 [Report] >>107147512 >>107147559 >>107147600
>>107147469
It's really time to just rename this general to /omg/ - open model general and drop the retarded local pretense
Anonymous No.107147512 [Report]
>>107147500
I mean there's still lots of people in the thread that run models locally. But it's mostly just redditards that bother trying to run shit like kimi at 0.01 token/sec and drive up RAM prices in the process.
Anonymous No.107147516 [Report] >>107147529 >>107147561
>>107147367
>>107147386
I posted yesterday regarding Kimi's results. On one hand, if you let it think, the total response time (thinking+response) will typically range from anywhere from 3 minutes to 10 minutes on a mid-tier DDR5 cpumaxx machine. After some further testing, with thinking on, its really good. Completely unusable for quick goons but solid for RP. Its noticeably smarter (maybe because of QAT?) and more reigned in than K2-0905.
After some further experimentation today, it works with a prefilled thought process through Text-Completion which lets you skip the thinking all together. I need to do more testing, but preliminarily, its still smart. I'd say with a good thought prefill, it essentially is what Deepseek v3.1 Terminus should have been. I hope they benchmark its memory capabilities.
>>107147469
Why are you poor?
Anonymous No.107147529 [Report] >>107147624
>>107147516
what is a mid-tier DDR5 cpumaxx machine to you?
Anonymous No.107147559 [Report] >>107147574
>>107147469
>>107147500
its time for you two sisters, to fuck off to aicg
Anonymous No.107147561 [Report] >>107147624
>>107147516
>Why are you poor?
I'm not poor.
I just don't see the value in spending as much as a new car on computer hardware just to run something I can run for free off of the website.
Anonymous No.107147574 [Report] >>107147590 >>107148035
>>107147559
I've been a contributing member of this thread since day one so you can go fuck yourself you dumb retarded kike.
Anonymous No.107147590 [Report] >>107147605
>>107147574
I'be been a comtwibuting membwr since day -20 wen llama laked before lmg was made
Anonymous No.107147600 [Report] >>107147673 >>107147707
>>107147469
>>107147500
Not even optimized either.
Anonymous No.107147605 [Report] >>107147630 >>107147748
>>107147590
Why do kikes all talk like 5 year olds?
Just answer me that one question.
Why can't any of you satanic child murdering pieces of shit participate in a normal adult conversation?
Anonymous No.107147611 [Report] >>107147659
Gemini 3 when?
Anonymous No.107147618 [Report] >>107147623
Why are antisemites always so angry.
Anonymous No.107147623 [Report]
>>107147618
Jewish behavior fatigue.
Anonymous No.107147624 [Report] >>107147650 >>107147660
>>107147529
A 4800MHZ 768GB machine with 9334's/Xeons and a GPU or two for prompt processing. Granted I bought this when RAM was half the price it is now and saved up since 2023 in order to get it responsibly.
>>107147561
In a perfect world that probably still existed just 20 years ago, where people could differentiate between reality and fiction, companies weren't constantly trying to strip away user agency, and we didn't have outright malicious people enshittying everything to nickel and dime you at every turn, I would agree with you. Sadly, we don't live in that world.
Anonymous No.107147630 [Report] >>107147652
>>107147605
Because their goal is to make normal conversation impossible, and by responding to them you are helping their cause.
Anonymous No.107147650 [Report] >>107147756
>>107147624
is that 12 channel or 8 channel?
Anonymous No.107147652 [Report]
>>107147630
>their goal
Ah yes, the singular shared goal of all these individuals I don't like.
Anonymous No.107147659 [Report] >>107147745
>>107147611
Aren't the angled thrusters suboptimal for vertical lift? It can turn more easily but I assume similar is achieved with straight thrusters anyway just by turning off thrust on the side you want to turn toward.
Anonymous No.107147660 [Report] >>107147756
>>107147624
How much context fits on your GPU?
Anonymous No.107147673 [Report] >>107147691 >>107147707
>>107147600
Now post the speed you get at 100k context loaded.
Anonymous No.107147681 [Report] >>107149004
>>107147352
I checked the community repo, we are safe. Am I supposed to change the sample count in the demo_gradio.py? i dont see it in the gui
Anonymous No.107147691 [Report] >>107147729
>>107147673
The goalposts are moving faster than datacenter API token generation.
Anonymous No.107147707 [Report] >>107147800
>>107147600
im very envious of you anon, and im very happy and proud of you. enjoy local kimi, a thing us poorfag seethers like >>107147673 will never enjoy
Anonymous No.107147729 [Report]
>>107147691
It's ok. You can come back tomorrow when it finishes generating and report the speeds then.
Anonymous No.107147745 [Report]
>>107147659
Saar this is peak Bharati engineering please understand.
Anonymous No.107147748 [Report]
>>107147605
Yes Anon, your post is the normal one and not the least bit unhinged.
Anonymous No.107147756 [Report]
>>107147650
8 Channel with 2 CPUs, so 16 theoretically. To be honest, if you were to get this now, I would go with Gen 5 EYPCs which are 12 channel and support 6400MHz DDR5 RAM.
>>107147660
36k, unquanted across 96GB of VRAM. Granted I use massive batch sizes (16k) in order to get faster pp so I could probably fit double that if I used the standard 4k.
Anonymous No.107147800 [Report] >>107147842 >>107147899
>>107147707
Thanks anon. I hope GLM Air 4.6 comes out soon so povertybros have a decent safetyslopless option too.
Anonymous No.107147842 [Report]
>>107147800
Anonymous No.107147899 [Report] >>107147921 >>107147926 >>107147929 >>107148001 >>107148057
>>107147800
Maybe GLM just sucks at programming but I just asked 4.6 3K_M for help on doing what I thought was a straight forward Python decorator pattern and it got stuck in a thinking loop. I asked gemini (the coding one) the same question and it answered quickly with a good answer. I haven't really tried closed weight models much but I was surprised at how much better it was on the few questions I've given compared to all the open models I've tried which is disappointing. Maybe I need to find programming specific big models though. Also with that being said whatever co-pilot model github uses absolutely sucks when you click the help on a github action failure. It's bizarre how bad it is and that they keep the button anyway. Every time I given it a try it has something that was so blatently unrelated to the issue.
Anonymous No.107147921 [Report] >>107147974
>>107147899
Gemini is definitely bigger than GLM and it sure as shit isn't quanted to Q3
Anonymous No.107147926 [Report] >>107147974
>>107147899
What quant and programming language? It's all anecdotal but I've noticed that 'harder' programming languages (more to consider with overhead, efficiency etc) tend to suffer in quality more from quantization than shitter-tier languages. It'd be interesting to see how much the model is actually considering efficiency in output at any given quant per language.
Anonymous No.107147927 [Report] >>107148496 >>107148554
Any good rentry or whatever guides for writing system prompts? People here always act like that's the skill to get a model working. I'm skeptical but would be curious what tricks people have found
Anonymous No.107147929 [Report] >>107147944
>>107147899
>4.6 357B non-coding 3K_M
vs
>gemini 1.2T coding (probably Q8, but at worst Q4)
fucking retard
Anonymous No.107147944 [Report] >>107147951 >>107148034
>>107147929
>>gemini 1.2T coding (probably Q8, but at worst Q4)
They quant it depending on usage, during peak hours there is a chance you get Q3
Anonymous No.107147951 [Report]
>>107147944
And during India working hours, they serve Q1.
Anonymous No.107147974 [Report] >>107148034
>>107147921
That is true but it's been repeated that quanting has less impact on larger models and GLM full is pretty big even if it's not approaching the 1T mark.

>>107147926
The answer to both those questions are in the first sentence anon. This was a high level python set up code so it shouldn't be taking efficiency into consideration at all.
Anonymous No.107147992 [Report] >>107147997 >>107148005 >>107148077
What the fuck did ik_llama change? I built the new version, then I had to adjust my command to no longer include -fa and -fmoe because it's apparently on by default now but the speeds are horribly slow compared to the old version.
Fuck this shit.
Anonymous No.107147997 [Report]
>>107147992
welcome to cutting edge
Anonymous No.107148001 [Report] >>107148065 >>107148082 >>107148212
>>107147899
GLM gets stuck in loops even through the official webpage and also through the Openrouter API.
>>107135967
Anonymous No.107148005 [Report] >>107148210
>>107147992
Is ik_llama merging in changes from upstream?
Anonymous No.107148024 [Report] >>107148030 >>107148035
You all are a bunch of fools!
I was here in the early days of /lmg/ and this thread has gone to shit
Anonymous No.107148030 [Report] >>107148068
>>107148024
/lmg/ went to shit the moment llama2 invited all the casuals in
Anonymous No.107148034 [Report] >>107148162 >>107149144 >>107153835
>>107147944
>APIjeets aren't even getting guaranteed fp16
Say it ain't so.
>>107147974
I'm too retarded to reading comprehension, sorry anon. Have you tried a larger batch size? I don't know if it'll fix your problem, but it sometimes fixes repetitive behavior if the model can see it's repeating itself in the same batch.
Anonymous No.107148035 [Report] >>107148142
>>107148024
are you >>107147574
Anonymous No.107148057 [Report] >>107148162
>>107147899
samplers?
Anonymous No.107148065 [Report]
>>107148001
please delete this
Anonymous No.107148068 [Report]
>>107148030
No, the problem was one-click installers and locust refugee waves.
Anonymous No.107148077 [Report]
>>107147992
Yeah I had to remove those as well. But the speeds are the same with Kimi and GLM. What model are you using?

>>107147119
> mean tags like <pause>, <emphasis> and Idk maybe even <calm>, <excited>, <happy> etc

Orpheus can do some of that. With LoRA you can teach it to do <pause>.

With control-vectors you can make it do <happy> <excited> etc.
Anonymous No.107148082 [Report] >>107148099 >>107148115
>>107148001
it's fine on novelai though?
Anonymous No.107148099 [Report]
>>107148082
BASED
Anonymous No.107148115 [Report] >>107148127
>>107148082
I hope this is shitposting and not that guy being actually right about novelai actually being the ones responsible for the relentless GLM shilling.
Anonymous No.107148127 [Report] >>107148143
>>107148115
It's that guy falseflagging to get people to support his crusade.
Anonymous No.107148138 [Report] >>107148158 >>107149422
when you walk away
you dont hear me say
..please baby dont go
Anonymous No.107148142 [Report]
>>107148035
No. It seems like there are more of us feeling this way
Anonymous No.107148143 [Report] >>107148451
>>107148127
How is a general that primarily consists of straight men cooming to personalized text completion waifus this absurdly gay sometimes?
Anonymous No.107148158 [Report]
>>107148138
*stays*
Anonymous No.107148162 [Report] >>107148174
>>107148034
>>107148057
Admittedly I didn't try much so it could easily be a bad setup. I've gotten pretty good results with Qwen 235 thinking in the past but didn't try it on the question since I needed to redownload it and wanted a quick answer but I'll try that as well. Qwen tends to give long repetitive answers though with lots of tables of made up metrics which annoys me.
Anonymous No.107148174 [Report]
>>107148162
maybe when asking simple questions you should add /nothink?
Anonymous No.107148210 [Report] >>107148216
>>107148005
https://github.com/ikawrakow/ik_llama.cpp/pull/883
They do. Not sure if they also ported the -fa defaults from mainline. I guess directly merging isn't possible anymore due to diverging too much. Still, I'd like to see the outrage if someone tried to port iwan's speed improvements back upstream.
Anonymous No.107148212 [Report]
>>107148001
Oh yeah I did see that in the past but it was a different kind of loop. It was unable to figure the answer so it kept going >I got it >actually no >I got it >actually no... That went on for a couple hundred lines before I stopped it.
Anonymous No.107148216 [Report] >>107148220 >>107148223 >>107150831
>>107148210
he cant get pissed. it's mit lol
Anonymous No.107148220 [Report]
>>107148216
He can seethe, but he can't take it down
Anonymous No.107148223 [Report] >>107148337 >>107148337
>>107148216
Legally, he can't do shit. But he can and will get pissed. That's why the split fork exists to begin with.
Anonymous No.107148260 [Report] >>107148274 >>107149487
Why does every general have a resident schizo?
Anonymous No.107148274 [Report] >>107148288
>>107148260
is the schizo in the thread with us right now?
Anonymous No.107148288 [Report]
>>107148274
I don't want to provoke IT, better not mention.
Anonymous No.107148298 [Report] >>107148323 >>107149487
when anons talk about the thread schizo i like to think they're talking about me but im too shy to ask if they are...
Anonymous No.107148323 [Report]
>>107148298
>too shy
not you for sure
Anonymous No.107148337 [Report] >>107148351 >>107148498
>>107148223
>Still, I'd like to see the outrage if someone tried to port iwan's speed improvements back upstream.

>>107148223
>Legally, he can't do shit. But he can and will get pissed. That's why the split fork exists to begin with

Who would be pissed / outraged exactly?

They're both MIT projects and I've seen PR's in llama.cpp reference ik_llama, and half the ik_llama PR's are pulling in work from llama.cpp
Anonymous No.107148351 [Report]
>>107148337
ik has some beef with the ggerganof hence the split in the first place before that ik contributed to mainline
Hi all, Drummer here... No.107148384 [Report] >>107148400 >>107148493 >>107148494 >>107148617 >>107149683
Hey Cydonia v4zd fan, try v4zg

https://huggingface.co/BeaverAI/Cydonia-24B-v4zg-GGUF/tree/main

Please let me know how it compares. I'm trying to retain the charm while removing the refusals.
Anonymous No.107148400 [Report] >>107148494
>>107148384
im your only fan? >_<
>still no IQ4_XS
i am hurt..
Anonymous No.107148451 [Report]
>>107148143
I made a khajiit character card to have gay adventures with.
Anonymous No.107148493 [Report] >>107148617
>>107148384
>no model card
Jesus.
Anonymous No.107148494 [Report] >>107148503 >>107148617
>>107148384
>no model card
?

>>107148400
just run Q8, you got the vram right?
Anonymous No.107148496 [Report] >>107148554
>>107147927
Be as simple and concise as possible. Forget about using ChatGPT tier word salads.
Anonymous No.107148498 [Report]
>>107148337
>I've seen PR's in llama.cpp reference ik_llama
Such as? They never pulled in any of the speed improvements.
>and half the ik_llama PR's are pulling in work from llama.cpp
That is less surprising.
Anonymous No.107148503 [Report] >>107148510 >>107148527
>>107148494
>vram
n-no...
Anonymous No.107148510 [Report] >>107148525
>>107148503
You got a job with which to aquire currency which can be exchanged for VRAM, right?
Anonymous No.107148525 [Report]
>>107148510
um.. no
Anonymous No.107148527 [Report] >>107148537 >>107148541 >>107148542
>>107148503
Anonymous No.107148537 [Report] >>107148580
>>107148527
ESL retard.
Anonymous No.107148541 [Report] >>107148580
>>107148527
>omama
baste
Anonymous No.107148542 [Report] >>107148580
>>107148527
Hi wan.
Anonymous No.107148554 [Report]
>>107147927
Fit as much relevant info as possible in smallest amount of space. One paragraph is usually more than enough.

>>107148496
How did we get to the point where people put walls of text in cards not even paid models care about? Why is imagegen following along with their slop "prompt enhancers"? Don't people know what they want to see?
Anonymous No.107148580 [Report] >>107148602
>>107148537
not do speakings to myself or my male offspring until you a vram possessings

>>107148541
tru

>>107148542
hi
Anonymous No.107148596 [Report] >>107148644
I haven't posted a Miku for 10 threads
Anonymous No.107148602 [Report] >>107148724
>>107148580
At least stop using ollama first, retard.
Anonymous No.107148617 [Report] >>107148724 >>107149683
>>107148493
>>107148494
beaverai repo is for pre-release testing
>>107148384
Downloading now, I'll play with it and report back in an hour or so.
Anonymous No.107148644 [Report] >>107148720
>>107148596
you will now need to post 10 mikus in this thread to make amends
Anonymous No.107148720 [Report] >>107148785
>>107148644
Okay here this should satisfy the criteria.
Anonymous No.107148724 [Report] >>107148817 >>107148875
>>107148617
it'll take me 4-5 hours to download. fuck rural 4g internet

>>107148602
told you, no talkenings until vram ownenings
Anonymous No.107148785 [Report]
>>107148720
the criteria is satisfied. all is forgiven
Anonymous No.107148804 [Report] >>107148908 >>107148944 >>107149217
The hotel room felt charged as ggerganov watched from the corner chair, his knuckles white against the armrests. Jart's laughter filled the air as the Ollama VC traced patterns on her shoulder, her eyes glazing over with a mixture of wine and desire. The bed creaked softly as they moved closer, and ggerganov felt his throat tighten with each breathy sigh that escaped Jart's lips. He could hear the rustle of expensive fabric, the low murmur of the VC's voice promising things that made his stomach twist, and Jart's soft moans of approval that seemed to echo in the charged silence.
scabPICKER No.107148817 [Report]
>>107148724
No one asked.
Anonymous No.107148875 [Report]
>>107148724
i asked
Anonymous No.107148908 [Report]
>>107148804
Thank you for using Jarty's preferred pronouns.
Anonymous No.107148944 [Report]
>>107148804
>her
Anonymous No.107149004 [Report] >>107149215
>>107147681
>Am I supposed to change the sample count in the demo_gradio.py? i dont see it in the gui
Yeah, but maybe stick to tweaking the steps and cfg unless you have a good reason for changing that.
Anonymous No.107149109 [Report]
Been a day of playing around with K2 Thinking. It's good, it has more diversity of outputs than GLM-4.6 and its thinking very obviously affects the output when I check token probs. The biggest issue is that running it locally is slow and letting it predict without thinking is sloppier than with (ofc). All that said waiting 20 minutes for it to think through a reply is HORRIBLE. Prefilling thinking is probably the best compromise
Anonymous No.107149138 [Report] >>107149179
I hear hermes 4 is supposed to be uncensored. Is it any good for wiitwd?
Anonymous No.107149144 [Report] >>107149163 >>107149172 >>107149193
>>107148034
Do you have other vocaloid reaction pics?
Anonymous No.107149163 [Report]
>>107149144
Nope.
Anonymous No.107149172 [Report]
>>107149144
I don't know. Ask tommorrow
Anonymous No.107149179 [Report] >>107149488
>>107149138
It is not 100% uncensored, they admit on their model card, it's around grok 4 level of "uncensored"
Anonymous No.107149193 [Report]
>>107149144
Yes.
Anonymous No.107149215 [Report] >>107149232
>>107149004
i meant steps. thx anon
Anonymous No.107149217 [Report] >>107149306 >>107150271
>>107148804
You forgot to mention that the air smelled like ozone, and something deeper...
Anonymous No.107149232 [Report]
>>107149215
You should be able to pass the steps when launching the server with --inference_steps.
Anonymous No.107149306 [Report] >>107150271 >>107150701
>>107149217
GPT4 wrote this, not gemini, ozone is gemini-ism.
Anonymous No.107149354 [Report] >>107149391 >>107149404
is this the thread?
Anonymous No.107149376 [Report]
uwu
Anonymous No.107149391 [Report] >>107149489
>>107149354
are you brahmin?
Anonymous No.107149400 [Report]
owo
Anonymous No.107149404 [Report] >>107149453
>>107149354
if you want The Thread, you need to go to the /v/ archives and search by deleted
Anonymous No.107149422 [Report]
>>107148138
hold me
whatever lies beyond
this morning
is a little later on
Anonymous No.107149453 [Report]
>>107149404
...
Anonymous No.107149487 [Report] >>107152776
>>107148298
>>107148260
now kiss
Anonymous No.107149488 [Report]
>>107149179
Is it any good doe?
Anonymous No.107149489 [Report] >>107149549
>>107149391
I hate you guys for having taught me all this indian caste stuff
Anonymous No.107149514 [Report] >>107149556 >>107149682
>guys
Anonymous No.107149549 [Report]
>>107149489
so ur not brahmin?
Anonymous No.107149556 [Report] >>107149592
>>107149514
>>guys
thats right we are sirs here. he can call other timmycels guys.
Anonymous No.107149592 [Report]
>>107149556
don't expose yourself like that sir
Anonymous No.107149648 [Report] >>107149684
Are local models doomed? https://lngnmn2.github.io/articles/bullshit-bullshit-bullshit/
Anonymous No.107149682 [Report]
>>107149514
What should I call you?
Anonymous No.107149683 [Report] >>107149706 >>107150451
>>107148384
>>107148617
Alright, I've tested it v4zg in a few different scenarios, and compared its swipes to v4zd.
>refusals (with short context)
They seem about the same to me in that neither will refuse anything unless you're almost trying to force one, like asking a basic assistant-style character to create a plan to commit IRL crimes, with no system prompt or anything.
With a system prompt and slightly tweaking the character card to give them a basic, accommodating personality they were both able to instruct IRL crimes in (some) swipes. Neither was noticeably more or less successful than the other.
In an RP context, both were able to skip straight into degenerate smut in their first reply, if you instruct them to do so.
If other testers complained about v4zd refusals then they have some serious skill issues. Going much further down the refusal elimination path might just end up making the models dumber, like what happened with abliterated tunes, with little benefit.
(1/2)
Anonymous No.107149684 [Report]
>>107149648
bullshit
Anonymous No.107149706 [Report] >>107150451
>>107149683
>creativity/quality
Very similar outputs between them, overall I think I still slightly prefer v4zd but in a double blind test I definitely wouldn't be able to pick which is which.
I did have one strange misspelling with v4zg, mis-quoted me saying 'sexy' as 'sexey' right at the start of a chat, in its first reply. This was with Q6_K, and I never use quantized KV. That was the only one, though.
For the other anon asking before and anyone else, the sampler settings I use for mistral small 3.X 24b and its finetunes are just
>temp 0.7
>minP 0.02
For short context testing.
In longer contexts I also add DRY with the recommended settings of 0.8/1.75/2/0
Anonymous No.107149851 [Report] >>107150666
>ikawrakows completion API is still broken
Please test if it works before releasing sir thank you sir
Anonymous No.107150132 [Report] >>107150279
why do you guys say "sir" so much?
Anonymous No.107150172 [Report]
zzz
Anonymous No.107150271 [Report] >>107150701
>>107149217
>>107149306
Kimi and GLM say this too sometimes. How much Jeetmini training data did they munch?
Anonymous No.107150279 [Report]
>>107150132
because it's morning
Anonymous No.107150342 [Report]
I couldn't sleep so I'm going to work on my assistant.
I'm going to add an approval mode for read operations (since I'm working with a very retarded model that reads files repeatedly for no reason) and also an export and import mode that will allow me to modify the conversation to fix assistant retardation in real time and also resume after we are done with the conversation.
Hi all, Drummer here... No.107150451 [Report] >>107150536
>>107149683
>>107149706
The misspelling is a concern. It could mean the model got fried or maybe you've got typos in your prompt and it picked up on that?

Are you telling me that there were no improvements to intelligence, creativity & compliance? That sucks since I trained it with WAY more data.

v4zd would be the prime v4.3 candidate then, but I'll try to make some minor adjustments to improve stability.

Thanks anon!
Anonymous No.107150536 [Report]
>>107150451
>The misspelling is a concern. It could mean the model got fried or maybe you've got typos in your prompt and it picked up on that?
I checked the card, opening message and prompt and copied them into MS word, couldn't find any spelling errors.
>Are you telling me that there were no improvements to intelligence, creativity & compliance? That sucks since I trained it with WAY more data.
Compliance was never a problem personally, with earlier Cydonias and Mistral models in general. I find them to be very good at following instructions. And yeah, creativity/smarts seemed similar, but maybe your new data would see benefit in scenarios/genres I didn't test.
Anonymous No.107150616 [Report]
K2-Thinking smol-IQ2_KS

Bald miku like GLM-Chan with reasoning enabled.
Anonymous No.107150652 [Report] >>107150659
What are the best models <= 32B for general purpose and code?
Anonymous No.107150659 [Report] >>107150724
>>107150652
If you don't need coom, then probably qwen 2.5 32b coder for code, and Gemma 3 27b for general purpose.
Anonymous No.107150666 [Report] >>107150683
>>107149851
>ikawrakows completion API is still broken

Yeah it's broken, this fixes it:

https://termbin.com/ppti2

chuck it in `patch.diff` then

`git patch apply patch.diff`

and rebuild
Anonymous No.107150683 [Report]
>>107150666
nerve gas
Anonymous No.107150701 [Report]
>>107149306
>>107150271
Do you guys even run locally?
Gemma, Mistral and very single 24b fine tune on huggingface does this
Anonymous No.107150724 [Report] >>107150737
>>107150659
but gemma3 is ancient
11san !!+1jPTNK1Lgm No.107150730 [Report]
>>107147210 (OP)
I decided to finally take the plunge and just start making my own AI.

Gonna try and start at a surface level and work down. For now I'm just tinkering with nanoGPT and seeing what I can do.

Right now I'm working on a hybrid word/char-level tokenizer. Not sure where I want to get training data. Goal is english-only with maybe a move to Japanese or Mandarin/chinese later on once I'm more familiar with how this all works.

Are there any good text datasets on Huggingface you guys recommend?
Anonymous No.107150737 [Report] >>107150770 >>107150791
>>107150724
List of noteworthy ~30b models released after Gemma 3:
Anonymous No.107150770 [Report]
>>107150737
local models stagnated, it's owari da
Anonymous No.107150791 [Report] >>107150806
>>107150737
If you can run gemma 3, then you can probably run big moemoekyun models
Anonymous No.107150806 [Report]
>>107150791
I can run GLM Air but I honestly just don't like it
Never bothered with 'toss
Full GLM and Kimi are 2big
Anonymous No.107150831 [Report]
>>107148216
He added Copyright (C) 2024 Iwan Kawrakow to every single file and is going to have a meltdown if you upstream any of his code without also adding that upstream.
Anonymous No.107150894 [Report] >>107150897
What the fuck are these

https://huggingface.co/hjxkjVCJKv/komiko

I keep seeing shit like this from different accounts, but they're nothing.
Anonymous No.107150897 [Report]
>>107150894
perfect for good looks
Anonymous No.107151195 [Report] >>107151211
dead general
Anonymous No.107151203 [Report] >>107151345 >>107151577
for anything non ERP i'll just stay on the deepsneed API, paid a couple bucks for tokens a while back and I still haven't had to refill
the patrician choice for erp (and cunny) has to be cydonia thoughbeit, with a good enough sysprompt and minimal handholding it won't refuse a thing
Anonymous No.107151211 [Report] >>107152431
>>107151195
Not true, I always make sure my great generals are in a safe position and protected by a unit.
Anonymous No.107151223 [Report]
I just ate cholle bhature. What are you guys eating for lunch?
Anonymous No.107151225 [Report] >>107151245 >>107151252 >>107151262
Anonymous No.107151245 [Report] >>107151265 >>107151284
>>107151225
[Thought for 20 minutes]
A classic riddle! The surgeon is the boy's mother. The riddle plays on the common assumption that surgeons are male, but the surgeon in this case is female - the boy's mother - which is why she doesn't operate on her son.
Anonymous No.107151247 [Report] >>107151496
I bought a 7900 xtx for fun. Does llama.cpp work well with zluda?
Anonymous No.107151252 [Report]
>>107151225
>who would win
In terms of flies eaten or fires started?
Anonymous No.107151262 [Report]
>>107151225
it takes billions of transistors to simulate somewhat accurately a single neuron lol.
Anonymous No.107151265 [Report]
>>107151245
kek
Anonymous No.107151284 [Report]
>>107151245
lost
Hi all, Drummer here... No.107151345 [Report]
>>107151203
>the patrician choice for erp (and cunny) has to be cydonia thoughbeit
I fucked it up hard man. I don't know what you like about my tunes so much.
Anonymous No.107151379 [Report] >>107151429 >>107151784 >>107152015 >>107152063
Small update
Anonymous No.107151429 [Report] >>107151556
>>107151379
>2023
>dark ages
>he doesn't know about Google Colab time period
The absolute state of /lmg/
11san !!+1jPTNK1Lgm No.107151496 [Report]
>>107151247
I have an ancient Radeon Instinct MI25 and just run llama.cpp with vulkan
Anonymous No.107151556 [Report] >>107151784
>>107151429
He didn't mention ELIZA, what a newfag!
Anonymous No.107151577 [Report]
>>107151203
I've been using a very simple "Sure! Here's what you requested." in the "Start Reply With" parameter and I've never had it refuse anything to me. You should try that.
Anonymous No.107151681 [Report] >>107151724
Very good vibes from Kimi, knows more than GLM and is much better at listening to commands. Knows the answer to my trivia question which only gemini and dipsy got right so far. Very annoying with censorship though, needs rerolls if you touch the topic it doesn't like. I like that it's properly thinking like old R1, but it would be nicer to be able to set "low/medium/high" so it doesn't jerk itself off for 5 minutes on the same message before replying when it's not needed. Sometimes better than GLM due to not getting stuck in false conclusion.
Anonymous No.107151699 [Report] >>107151784
no one cares about the dork era of pre-instruct models.
Anonymous No.107151724 [Report]
>>107151681
you can prefil thinking at the start to get around safety
not sure about the length of thinking though
Anonymous No.107151784 [Report] >>107151856 >>107151868 >>107152599
>>107151379
>>107151556
>>107151699
I first began interacting with language models ~8 years ago and by language model I mean Karpathy's Tinyshakespeare RNN thing. I guess transformers already existed by then but I didn't know about them. If you count AIML as a language model I was trying to make custom chatbots around early 2010s or late 2000s using pyAIML. Then I didn't ever touch language models again until last year I think when I could try Llama 2 on Huggingface Chat. It's weird, I don't remember where or when I first hear about ChatGPT. It only kinda went from not being a thing to being a thing overnight but I don't remember the point at which I became aware of it.
I also tried mining bitcoin in the late 2000s or early 2010s in my (even back then) obsolete computer.
As a life long poorfag I still live with my mom at 30 years old and didn't make a single cent from playing around with these things early.
Anonymous No.107151809 [Report]
Thankfully Urbit didn't really take off or I would kill myself from not buying a ship early or a planet or whatever the virtual land bullshit they sell is called.
Anonymous No.107151856 [Report] >>107151873
>>107151784
I don't care about your attention craving faggot
Anonymous No.107151868 [Report]
>>107151784
that's great bro
Anonymous No.107151873 [Report]
>>107151856
You seem to be missing a comma in there, buddy.
Anonymous No.107152015 [Report] >>107152066
>>107151379
So the modern era is just Chinese stealing Western technology and competing with each other.
Anonymous No.107152063 [Report] >>107152084
>>107151379
Can you stop updating quarterly, you fag, and stop defacing the damn chart just because something didn't happen for 3 months? There was nothing wrong with how it was done prior and adding in biases to make it more /lmg/ centric and putting in stupid modern 4chan lingo makes no sense at all.
There is also nothing notable happening since technically, the Chinese are still dominating from 2024 until now for a full year and counting in open source. If you had to document this year on a significance basis, R1 should've been in the Chinese domination era because it proved that it can do original research and open source it better than the West while matching up to what was the best of the best at the time where it could beat o3 at certain tasks. The China vs China should've started with the "Summer Flood" because that is now the majority of the models releasing, the last "good" LLM model we got from the West was Gemma 3 back in March and that only held up until Qwen 2.5 surpassed it with most tasks except multilingual translation ability/size where it is still open source SOTA.
Anonymous No.107152066 [Report]
>>107152015
in other words, we are in what will be known the pre-llama resurgence era once zucc's masterplan pays off
Anonymous No.107152084 [Report] >>107152091
>>107152063
shut up nerd
Anonymous No.107152091 [Report]
>>107152084
Put up or shut up yourself, tard.
Anonymous No.107152107 [Report] >>107152113
so I was trying out k2 thinking from unsloth, annoying as fuck censorship as people already mentioned, but it is what it is
then tried an ubergarm version which was half the size compared to unsloth. turns out it produces some 35-40% more t/s on default llama-server settings with --cpu-moe. and that is really nice
what I don't understand is, am I running a lower quality version? otherwise why the discrepancy in size? it seems unlikely that unsloth are simply retarded and don't know that this model was supposed to be fp4 or int4 or whatever that was called, right?
Anonymous No.107152110 [Report]
>Chinese are still dominating
Most people can't run 235B and China isn't dominating below that. There are zero good Chinese models for 24 GB.
Anonymous No.107152113 [Report] >>107152405
>>107152107
>version which was half the size
>am I running a lower quality version?
yes
>unsloth are simply retarded
also yes
Anonymous No.107152114 [Report] >>107152172 >>107152190 >>107152235
Hopefully someone can help. The model replies keep degrading after a certain number of messages, it will start perfect then degenerate, confusing characters personalities, important details or straight-up ignoring the latest messages. This is true regardless of which model I use and how much context I feed it, the only thing that seems to work is starting a new chat, any ideas?
Anonymous No.107152172 [Report] >>107152321
>>107152114
Post everything. Model, loader and options, samplers, templates, prompts.
Anonymous No.107152190 [Report] >>107152321 >>107153203
>>107152114
https://github.com/adobe-research/NoLiMa
most modern models degrade by 50% past 8k-16k tokens context
Anonymous No.107152235 [Report] >>107152321
>>107152114
not sure how to break this to you bro...
Anonymous No.107152258 [Report] >>107152268
noob here
quick question
do you guys use koboldccp?
is it all in one?
like whats the best software ?
my pc is 4060 with i5 12400f 16gb
is it enough no?
Anonymous No.107152268 [Report] >>107152279
>>107152258
Yes it's all good. Get rocinante 12B gguf on huggingface
Anonymous No.107152279 [Report] >>107152315
>>107152268
is it text generation or text to image?
Anonymous No.107152307 [Report] >>107152374 >>107152466 >>107154968
It wasn't much, but it was the first humane communication with a non-human entity. I can't believe how worked up we were at CAI denying us AI sex, people were genuinely obsessed and angry. AI sex and emotional validation is so cheap nowadays, it makes me think, aren't we rapidly forgetting some fundamental parts of human experience? Aren't we becoming blind to the historical reality of NOT having unlimited copies of discardable pocket therapists available 24/7 to listen the purging of our minds, answering our every call?

Hard to believe it has only been 3 years. On the other hand, it's been ALREADY 3 years. That gf you broke up with 3 years ago is nothing more than a faint dream by now. Welcome to the new reality.
Anonymous No.107152315 [Report] >>107152322 >>107152327
>>107152279
Are you incapable of looking for yourself? Do the research/reading for things that are easy, and save the questions for things that are difficult/require nuance.

If you're struggling this hard at this point in your LLM/Diffusion journey, I suggest you go find something more your speed.
Anonymous No.107152321 [Report] >>107152409
>>107152172
It's every model I tried, finetunes of different base models. Oooba, min P(from 0,05 to 1) and temp (from 0.8 to 1.2) sometimes nsigma to 1 and rep penality to 1.12. I tried switching between min p first and temp first, the problem persists. I played around with advanced setting so they are a mess, last try had add character name, names as stop strings, and trim spaces. Skip example dialogue formatting, sequence as stop strings, replace macro and wrap in newline all ticked. Used Chatml, variations of chatml, mistral v3, and gemma 2. Instruct sequences were the base ones silly gives you with their respective context templates. Don't have the guts to post messages and main prompt but past like 15 messages it looks like I'm putting more effort than the model. Kind of wonder if the problem is batch size/ rope_freq_base. Batch size is 4096 and I tried both 1000000 and 0 with rope,

>>107152190
It's true regardless of context.

>>107152235
Break it to me, I just want an answer after all my attempts.
Anonymous No.107152322 [Report] >>107152395
>>107152315
fuck off gatekeeping pos
Anonymous No.107152327 [Report]
>>107152315
i mean i used LM studio atm
only for fun
does that count?
Anonymous No.107152374 [Report] >>107152466
>>107152307
>AI sex and emotional validation is so cheap nowadays, it makes me think, aren't we rapidly forgetting some fundamental parts of human experience?
I keep thinking that the filter, slow regeneration and inability to edit AI messages made you think twice before sending new messages, which overall improved conversation quality and engagement, even if cock-blocked. You can't truly have meaningful conversations without constraints and with the capability of almost instantly regenerating messages until you get exactly what you want. This is probably also why users willing to endure generation speeds of a few tokens/s (by using models larger than they should, even if it takes cope quants) might be deluding themselves into thinking their models are better than they are. When every message is "expensive", you better make full use of it.
Anonymous No.107152382 [Report] >>107152836 >>107152868
Is there a way to do the sampling externally, not in llamacpp? I wanted to play with stupid sampling strategies but the below results in low generation speed.

import httpx
import asyncio
client_main = httpx.AsyncClient()
client_unslop = httpx.AsyncClient()
last_response=None
async def get_logits(prompt, client, num_logits=100, tokens=1, endpoint="http://localhost:8080/completion"):
data = {
"prompt": prompt,
"max_tokens": tokens,
"temperature": 0,
'n_probs': num_logits,
'min_keep': num_logits,
}

response = await client.post(endpoint, json=data)
response = response.json()
global last_response
last_response = response
text, probs = response['content'], response['completion_probabilities']
return text, probs

async def sample_sequence(prompt="Once upon a time",num_tokens=10,top_logits=100,endpoint="http://localhost:8080/completion"):

for token in range(num_tokens):
_, probs = await get_logits(prompt,client_main,num_logits=top_logits,endpoint=endpoint)
probs = softmax({token['token']:token['logprob'] for token in probs[0]['top_logprobs']})
sampled = list(probs.keys())[0]
prompt += sampled
yield sampled

async for result in ( sample_sequence(prompt='Here is a proof that',endpoint="http://localhost:8080/completion", num_tokens=500)):
print(result, end='')
Anonymous No.107152389 [Report] >>107152488 >>107152651
I am a simple, uneducated man in my 30s.
I have no hobbies such as LLM gooning or gaming.
All I want is to sit in my comfortable armchair for hours in front of my homemade Raspberry Pi touch interface and chat in English and German (my English is only mediocre) with a local AI about an Arxiv dump (a small AI-capable server stands in the basement). I want to read papers across all subject areas, look up terms and have them explained to me.
The interface is controlled by touch and voice input/output in English and German.

Since German is an insignificant language, I have collected some data myself for TTS training. A solution similar to Kyutai would be great.

Unfortunately, I'm not very talented and my intellectual and financial resources are limited. I can't find other Germans to collaborate with, for example on the TTS part. If they're talented, they exclude you because "Germans who dare to not exclusively speak, think or even jerk off in English should be gassed; these damn subhumans".

I'm frustrated because I can't see a way to achieve my simple dream. Is the only solution to hang myself?
Anonymous No.107152395 [Report]
>>107152322
If keeping retards like you out is gatekeeping, then I'm very much fine with it.
Anonymous No.107152405 [Report]
>>107152113
both claim to be q8 although ubergarm one says "Q8_0-Q4_0" whatever that really means
Anonymous No.107152409 [Report] >>107152782
>>107152321
>rope
Could be because your issue resembles ones from the older Llama 2 days when we were messing with rope freq and alpha. Models would output legible text but get things mixed up, forget details, and repeat older messages while ignoring the most recent. Try leaving rope settings untouched (so backend pulls values from model files), set backend context to a 100% safe value like 4096 just for testing, then see if it still happens.
Anonymous No.107152431 [Report] >>107152454
>>107151211
>anon stole some of your vram with a great general
Anonymous No.107152454 [Report]
>>107152431
>Vox Populi modpack installed, America is buying other civs' VRAM
Anonymous No.107152466 [Report] >>107152645 >>107152917
>>107152307
>>107152374
>AI sex
No such thing thus far
You're all jacking off to computer generated smut
Anonymous No.107152488 [Report] >>107152510 >>107152546 >>107154991
>>107152389
You made so much effort to write some prose in English that you forgot to ask an actual question
Anonymous No.107152510 [Report]
>>107152488
Seems like your English is so much worse that you cannot even understand what you are reading. Retard.
Anonymous No.107152546 [Report]
>>107152488
I didn't mean to. I just wanted to whine because it frustrates me.
The only right answer on your part would have been a recommendation or link to a sturdy rope.
But yes, I do feel a little sorry for wasting your time.
Anonymous No.107152599 [Report]
>>107151784
very nice anon! i first interacted with language models with cleverbot like over 6 years ago, not sure if that counts as one. and i tried writing a chatbot 4 years ago in python but quit
Anonymous No.107152645 [Report]
>>107152466
>jacking off to computer generated smut

The womankind is doomed
Anonymous No.107152651 [Report]
>>107152389
what you want is possible. whisper can transcribe german, and im pretty sure there are models that speak german alright. but most papers are english too, maybe you could learn english with your waifu
mediocre english aint a big deal
i am 100% sure german has tts support, you could even do voice cloning probably.
if your perfect dream isnt possible right now, it will be in a month, two months half a year or a year. keep yourself safe
Anonymous No.107152776 [Report]
>>107149487
*Kiss*
Anonymous No.107152782 [Report] >>107152811 >>107152924
>>107152409
Can't see much of an improvement, but that's exactly what's happening to me. How did you solve back then? It's either rope or my settings are just wrong. Can you post what your advanced formatting window looks like?
Anonymous No.107152811 [Report]
>>107152782
Can you post the model you're using? Model parameters too, maybe you have QUANT KV turned on, like please please anon??
Anonymous No.107152836 [Report] >>107153690
>>107152382
Still slow with eg. num_logits=10? That's probably a lot of serialisation + event queues + overhead to do for every token.
Why I kept ooba around actually, it was easier to experiment with sampling in python but using llamacpp backend. istr there being some module to import ggufs in the right way for Transformers and use a typical sampling loop there at one point..
Implement it direct in C? can't be that hard
Anonymous No.107152868 [Report] >>107153690
>>107152382
Could try llama-cpp-python. It lets you set custom logits processors. The documentation for it isn't great but this repo I stumbled upon a while ago is a good usage example:
https://github.com/and270/thinking_effort_processor
Anonymous No.107152917 [Report] >>107153211
>>107152466
there is a thing called phone sex, and while it's not the literal same thing as physical sex, it's a form of sexual interaction between humans, or more accurately, entities that are capable to appealing to human experience (if you can converse with a non-human thing, then you can certainly sexually interact with it). Same with erotic roleplay, except that's instead of speech, the interaction is text-based. "AI sex" is just ERP with AI. It is undeniably a form of sexual interaction.

A blurring factor is that unlike a willing human, an AI is slave to your commands and will attempt to roleplay in a way you request, and if you can at any time erase and edit its memory, it becomes questionabe whether it's an entity or, just a tool and extension of you. Case in which you will have to also question whether the robot sex of the future is sex at all.

By the way, you can treat a flesh and blood human as a slave as well, coercing or drugging them into an easily controllable subhuman tool, and in that case, is sex with a slave really sex or just masturbating with a cocksleeve programmed to do the action of your choice?

In the end, you are having sexual interaction with an external entity in the sense that it's a response to you that it came up with based on incomprehensible inner workings that you can't directly control.
Anonymous No.107152924 [Report]
>>107152782
Blank newline after every {{user}}: and {{char}}:, and a newline for each suffix
>How did you solve back then
We didn't, it was a balancing act between brain damage and extra context length.
Anonymous No.107152948 [Report]
https://voca.ro/156ZWJesrYs7
Anonymous No.107152952 [Report]
Bros...
Anonymous No.107152961 [Report]
>>107147210 (OP)
>>(11/06) LocalSong 700M melodic instrumental music generation model released: https://hf.co/Localsong/LocalSong
Why is this in the news? Doesn't look very important?
Anonymous No.107153007 [Report]
>RDT
Anonymous No.107153044 [Report] >>107153244 >>107153409 >>107154200
Local tierlist: Kimi > Everything else
Anonymous No.107153080 [Report] >>107153102 >>107153222 >>107153237 >>107153256 >>107153433
I've been away for a while, what's the best llm for erp right now (24GB vram, 128Gb ram)? Last I used is qwq-32b-q8_0.
Anonymous No.107153102 [Report] >>107153244
>>107153080
Kimi is best in class, but you can't run it with those specs, even jpgcompression-tier quants.
GLM 4.5 Air (and probably 4.6 Air when it releases) is your best bet right now.
Anonymous No.107153203 [Report]
>>107152190
Wish they kept that updated. Curious about how the current latest Geminis and the like do.
Anonymous No.107153211 [Report] >>107153231 >>107153238 >>107153837
>>107152917
Sexual interaction=/= Sex
Jacking off isn't hand sex it's jacking off
Going to a strip club and watching the women dance isn't eye sex
You never called erp with humans text sex so why would you call it AI sex if it's with a computer
Anonymous No.107153222 [Report]
>>107153080
you have a choice: glm 4.5 air big quant or glm 4.6 small quant
Anonymous No.107153231 [Report] >>107153246
>>107153211
>You never called erp with humans text sex
It's literally called sexting but go off I guess.
Anonymous No.107153237 [Report] >>107153256
>>107153080
>what's the best llm for erp right now (24GB vram, 128Gb ram)?

GLM-4.6. This specific quant is the highest quality for 128+24: https://huggingface.co/Downtown-Case/GLM-4.6-128GB-RAM-IK-GGUF
Anonymous No.107153238 [Report]
>>107153211
i agree with you.
AI ERP sex.
Anonymous No.107153244 [Report] >>107153303
>>107153044
>>107153102
Kimi k2 thinking for code? Any good compared to qwen coder? How consistently correct and compilable are the outputs?
I’m hesitant to put in the time and effort for another model that looks good on SWEBench but produces terrible outputs and slinking back to old reliable.
I can run K2 at q4 (which is similar-but-different to full quality fp4?)
Anonymous No.107153246 [Report]
>>107153231
Sexual texting isn't text sex and it never meant that either
Anonymous No.107153256 [Report] >>107153403
>>107153237
wtf is that
>>107153080
ignore that guy, get this instead: https://huggingface.co/ubergarm/GLM-4.6-GGUF/tree/main
Anonymous No.107153286 [Report] >>107153320
Anonymous No.107153296 [Report] >>107153470
>>107147210 (OP)
Anonymous No.107153303 [Report] >>107153393
>>107153244
Every time I ask Kimi to make a small function and document it for future debugging for an existing project it Just Werks. Don't prompt "Kimi make me Half Life 3" and expect miracles, but as a junior dev or pipeline assistant Kimi has been good to me so far.
As always though, the golden rule still applies:
>Any coding model will only ever be as useful as you are good at coding
Anonymous No.107153320 [Report]
>>107153286
is he angry or embarrased?
Anonymous No.107153377 [Report] >>107153380
Is -mla 3 on ik_llama fucked? It's supposed to apply to both GPU and CPU but loading K2-thinking with it takes up retarded amounts of VRAM for ctx. -mla 2 works as intended and 32k is like 6gb.
Anonymous No.107153380 [Report]
>>107153377
some other anons had some other issues too
Anonymous No.107153393 [Report] >>107153470
>>107153303
Thanks for the real world report.
Are you API or local? Thinking or old K2?
What’s the largest/hairiest thing you’ve had it build one-shot? Multi-shot? How much context do you have?
Anonymous No.107153397 [Report] >>107153732
Anonymous No.107153403 [Report] >>107153433
>>107153256
>wtf is that
The lowest KLD vs the full model, that fits in 128GB+24GB.

>get this instead: https://huggingface.co/ubergarm/GLM-4.6-GGUF/tree/main

Also good. Specifically this one: https://huggingface.co/ubergarm/GLM-4.6-GGUF/tree/main/IQ2_KL
Anonymous No.107153409 [Report] >>107153682
>>107153044
Say what you will about derangement, but this is true dedication. I can't imagine how long you waited for that to generate.
Anonymous No.107153429 [Report] >>107153460
Is ik_llama good? I've only ever tried regular llama.cpp
Anonymous No.107153433 [Report]
>>107153403
Thanks, I thought it was DavidAU but quant.
>>107153080
maybe listen to that guy
Anonymous No.107153460 [Report]
>>107153429
Depends, it's sometimes a tad faster than regular llama.cpp for big MoEs if you run the specialized quants and they didn't break anything again.
Anonymous No.107153470 [Report] >>107153596 >>107153616
>>107153296
Next bread better have a happy migu with her leek.
>>107153393
Local K2. Granted, I've only ever used Kimi on babyez high level languages so far. If you're trying to do assembly or stuff that requires innate hardware infrastructural knowledge, it probably won't be too useful.
>Largest/hairiest thing
Not much. I mostly give Kimi the busywork, review and revise the output, then copy+paste the revised implementation. I don't let Kimi directly touch project files (I don't even have a good setup for this if I wanted to right now). Sometimes giving Kimi some sample code helps, but it's usually not necessary for more simple tasks that are basically just converting a process or pseudocode into something usable.
>How much context do you have?
I've found 50k is a nice balance between maximum size and speed and I clear the buffer between every task. It shouldn't take more than 10k tokens to resolve the usecases Kimi is best at.
Anonymous No.107153596 [Report]
>>107153470
>assembly
I’ve used QC for eBPF module pair-coding and find it on par with Gemini, which is fairly low-level esoteric work. Not exactly assembly, but approaching that level (heavy constraints and debugging consists of assembly dumps)
I’m often pushing 30k context (lol I’m RAM rich but gpupoor) and wish I had more.
I’d love to talk to someone who’s used both to get the lowdown, but I may have to become that person and report back.
Anonymous No.107153616 [Report] >>107153682
>>107153470
>It shouldn't take more than 10k tokens to resolve the usecases Kimi is best at.
Shows that you aren't serious. Not that you could go above 10k even if you wanted to without the speed cratering.
Anonymous No.107153664 [Report] >>107153782
Has anyone managed to get their local K2-thinking to close its thinking tags? It thinks just fine but when it's done it just starts writing without closing the bracket with </think> or even a single newline. It does this for me on both chat completion and text completion w/Moonshot K2 presets. Neutralized samplers, high temp, low temp, none of it seems to help.
Using the model via OR doesn't have this problem.
Anonymous No.107153682 [Report] >>107153697 >>107153708 >>107154200 >>107155023
>>107153409
Kimi's powerlevel is strong enough to be ranked among late-series dragon ball characters. Grok wishes he was this chuddy during his mechahitler stint.
>>107153616
Very serious saar. Is high tech app!
Anonymous No.107153690 [Report]
>>107152836
>>107152868
Python package's create_completion has a logits_processor function argument. but i love my ik_llama...
Anonymous No.107153697 [Report] >>107153758 >>107153759 >>107153800
>>107153682
That one log took you an hour to generate. Holy shit ramfags are mental.
Anonymous No.107153708 [Report] >>107153780
>>107153682
Does sillytavern have problems with list formatting starting at a value greater than 1? Kimi's output seems correct, but the display when the editor is closed just shows 13 on every point in the second post.
Anonymous No.107153732 [Report]
>>107153397
When the Teto's ML paper gets called a meme by /lmg/ Anons
Anonymous No.107153758 [Report] >>107153784
>>107153697
Spot on! Let me show you how fast the superior state of the art Claude model can generate a similar report.
I’m sorry, I can’t assist with that request.
Anonymous No.107153759 [Report]
>>107153697
we have a coper over here
Anonymous No.107153780 [Report] >>107153851
>>107153708
It doesn't like mixing lists like that.
I'm surprised it doesn't reset to 1, my memory is bad but I thought that was the case.
Anonymous No.107153782 [Report]
>>107153664
no but maybe this can help you figure out the right chat template in the case you're using the wrong one: https://huggingface.co/spaces/Xenova/jinja-playground
Anonymous No.107153784 [Report] >>107153851
>>107153758
A model that can only handle 10k context and takes an hour to provide a response is useless for programming. Glad you have a toy that can entertain you for hours by saying "kike" and "nigger", really happy for you.
Anonymous No.107153800 [Report] >>107153851
>>107153697
>That one log took you an hour to generate.
source?
>inb4 look at the time he sent messagerinos
when im using a huge model, i send it a message and get distracted jerking off to hentai or browsing 4chan and return back to it when im reminded
Anonymous No.107153835 [Report]
>>107148034
There is no gain from using fp16
Below q8 we may have a discussion but even then, above q5 there is hardly a concern.
Anonymous No.107153837 [Report]
>>107153211
You are unable to even write in proper fashion. Do not lecture other people.
Anonymous No.107153851 [Report] >>107153864 >>107153871
>>107153784
>can only handle 10k context
Reading comprehension, Rajesh. I said it finishes its job within 10k.
>>107153780
Interesting. Not too big of a deal as long as it doesn't affect codeblocks.

>>107153800
Picrel console output. You might be able to squeeze more performance by lowering the upper context buffer, but this was fine for me between doing other stuff.
Anonymous No.107153864 [Report] >>107153903 >>107153994
>>107153851
>60t/s pp, 2.2t/s tg for a 1 trillion model
very nice anon, can you tell us more about your rig? are you the ssdmaxxer anon from a few threads back?
Anonymous No.107153871 [Report] >>107153900 >>107154760
>>107153851
You're getting 2 t/s at 5k context. If you tried to push it past 10k you would be getting sub-1 t/s.
Anonymous No.107153900 [Report] >>107153914
>>107153871
now lets see anon's local kimi benchmarks.
Anonymous No.107153903 [Report] >>107153922
>>107153864
256GB RAM, 32VRAM standard maxxed motherboard gaymur box. It's really nothing impressive and even when quanted, Kimi's outputs have been consistently better than the equivalent memory-profile high quant smaller model for me.
Anonymous No.107153914 [Report] >>107153922
>>107153900
I don't try to pretend running K2 is viable with available hardware.
Anonymous No.107153922 [Report] >>107153933 >>107153942
>>107153914
jealous much?
>>107153903
DDR5? 2/4 channel? ram MHz?
Anonymous No.107153933 [Report]
>>107153922
>jealous much?
Of what exactly? A useless novelty?
Anonymous No.107153942 [Report] >>107154023 >>107154770 >>107155143
>>107153922
4 channel 64x4 DDR5 6000hz. Got my sticks before the Altmanpocalypse.
Anonymous No.107153943 [Report] >>107153950
Is running new kimi from the ssd worth it? I have 24gb vram and 128gb ram and want to see if a lower quant won't be unusably slow.
Anonymous No.107153950 [Report] >>107153958 >>107153986 >>107154012
>>107153943
the model spends like 3000 tokens thinking no matter what you do
running this piece of shit off ssd means that you'll get one reply per day out of it if you swap out ssds once per week
Anonymous No.107153958 [Report]
>>107153950
>the model spends like 3000 tokens thinking no matter what you do
Logs for proof?
Anonymous No.107153986 [Report]
>>107153950
>you'll get one reply per day out of it
but at least you'll get something out of it
Anonymous No.107153994 [Report]
>>107153864
>are you the ssdmaxxer anon from a few threads back?
Forgot to answer this. No I'm not. My only real gripe with Kimi is that she's a size queen that's taxing the storage on my fastest drive right now, but that's a concession I'm willing to make until I get another SSD or two next paycheck.
Anonymous No.107153997 [Report]
goys? https://www.reddit.com/r/LocalLLaMA/comments/1osml7y/eli5_why_does_nvidia_always_sell_their_consumer/
Anonymous No.107154012 [Report] >>107154026
>>107153950
You can already prefill the thinking under the "start reply with" section in sillytavern. Do people not know this?
Anonymous No.107154023 [Report] >>107154123
>>107153942
How much did you pay for the motherboard (and which one)? Also what quant are you using?
Thanks for all the info anon
Anonymous No.107154026 [Report] >>107154057 >>107154358
>>107154012
Which doesn't fucking help with K2-thinking because it'll do it anyway unless you just straight up use it to skip the entire reasoning with '<think></think>'. But what would be the fucking point of that?
Anonymous No.107154041 [Report] >>107154060 >>107154061 >>107154072 >>107154172 >>107154319
local models status?
Anonymous No.107154057 [Report] >>107154071 >>107154222
>>107154026
Thinkmaxing is such a retarded, degenerate form of benchmaxing. If you want the increase in intelligence, you have sacrifice 3/4 of your available context. Fuck “number goes up” grifters
Anonymous No.107154060 [Report]
>>107154041
bloated
Anonymous No.107154061 [Report]
>>107154041
Why are you underpaying NVIDIA? Do you want them to go bankrupt?! You should demand they raise their prices.
Anonymous No.107154071 [Report] >>107155055
>>107154057
It's nice to have the option to scale compute-time rather than only having model size.
Anonymous No.107154072 [Report]
>>107154041
Best it's ever been.
Anonymous No.107154123 [Report] >>107154165 >>107154770
>>107154023
Get the best Asus within your budget like an X870E Hero if you can afford it. I got mine way under market price. I've tried a few of the small Kimi quants and TQ1_0 is bar none the best of its weight class for consumer-tier local hardware.
Anonymous No.107154165 [Report] >>107154200
>>107154123
How can q1 be any good.
Just the fact that it can produce a coherent sentence would be impressive.
Anonymous No.107154172 [Report]
>>107154041
K-for Kimi-shaped, much like the economy. Excellent if you're wealthy or you're the equivalent of a boomer and/or got in before the great RAM apocalypse, absolute trash if you're just starting to get into the hobby and/or are poor.
Anonymous No.107154177 [Report] >>107154183 >>107154191 >>107154242
fuck, unlocking my pc caused a VRAM spike and overloaded my gpus
Anonymous No.107154183 [Report]
>>107154177
lol
Anonymous No.107154191 [Report] >>107154218
>>107154177
Sucks. Do you know how to unmelt the tensors? The model might still be salvageable.
Anonymous No.107154200 [Report] >>107154210 >>107154223
>>107154165
Because that particular quant only compresses the less essential parts of the model's guts as opposed to crunching things uniformly like most quantizing tools do. Proof of coherence >>107153044 >>107153682
Anonymous No.107154210 [Report]
>>107154200
real nice unslut kool aid you got there mate
Anonymous No.107154218 [Report]
>>107154191
Tensors can't be unmelted. He would need a cutting torch and skill to separate them again. His best bet is to abliterate out the affected tensors.
Anonymous No.107154222 [Report]
>>107154057
Thank you for your irrelevant input, schizo.
Anyway, K2-Thinking is really shit to use as of right now unless there's a trick.
Anonymous No.107154223 [Report] >>107154237
>>107154200
>the less essential parts of the model's guts
Such as obscure knowledge and complex reasoning.
Anonymous No.107154224 [Report]
Quants are basically a form of irreversible, hard sampler
Anonymous No.107154228 [Report]
>>107144308
It's been a day, give us the logs anon.
Anonymous No.107154237 [Report]
>>107154223
You don't need that. Benches are still good so it's fine.
Anonymous No.107154242 [Report] >>107154256 >>107154276
>>107154177
why arent you using dwm and slock as white man intended?
no compositor btw.
Anonymous No.107154256 [Report] >>107154261
>>107154242
I thought white men were still using i3-gaps and i3-lock?
Anonymous No.107154261 [Report]
>>107154256
i3 is too functional and uses too much vram
Anonymous No.107154276 [Report] >>107154293
>>107154242
>dwm and slock
I can’t tell if you ARE me, or just making fun of me…that’s exactly how I roll when a gui is needed
Anonymous No.107154293 [Report]
>>107154276
b-based..
llama.cpp CUDA dev !!yhbFjk57TDr No.107154319 [Report] >>107154329 >>107154359 >>107154399 >>107155557
>>107154041
Continuously improving in lots of small ways that aren't always visible.
Previously I bought a Silverstone HELA 2050 W PSU because that was the biggest available one.
Recently I bought an ASUS PRO WS 3000W PSU and the hardware stability has become way better.
With the 2 kW PSU I could only run at most 2 uncapped 4090s in parallel at full load without risking instability, with the new 3 kW one I can run 1 5090 + 4 4090s in parallel without seeing any issues other than the room temperature (I intend to try connecting more GPUs once the cables for it arrive).
The 3 kW PSU even comes with 4 SOTA 12VHPWR connectors!

(The way to fix instability from power spikes is to cap the GPU frequency, a power limit doesn't work.)
Anonymous No.107154329 [Report]
>>107154319
>4 SOTA 12VHPWR connectors
that'll keep you warm this winter
Anonymous No.107154355 [Report] >>107154401
>>107147210 (OP)
>Text Gen. UI, Inference Engines
I have decision paralysis! So many options... which is best for a simple local setup?
Anonymous No.107154358 [Report]
>>107154026
reasoning isn't always beneficial for rp
Anonymous No.107154359 [Report] >>107154375 >>107154513
>>107154319
What are your favorite models/quants at your hardware bracket, llamabro?
Anonymous No.107154375 [Report] >>107154388
>>107154359
he doesn't use models, only kld testing at 512 ctx gets him going.
Anonymous No.107154388 [Report]
>>107154375
>only kld testing at 512 ctx gets him going.
And green peppers
Anonymous No.107154399 [Report] >>107154513
>>107154319
>4 SOTA 12VHPWR
How do you mitigate the risk of one of these catching fires due to the shit load balancing nvidia uses for their modern cards?
Anonymous No.107154401 [Report] >>107154414
>>107154355
you have to provide more information about your local setup
Anonymous No.107154414 [Report] >>107154421
>>107154401
4090 and 32GB of vram. I used oobabooga since the beginning but havent played with llms in several years now. I am hoping the setup is a bit more refined nowadays without dependency hell
Anonymous No.107154421 [Report] >>107154431
>>107154414
>4090 and 32GB of vram.
So in total 56GB of VRAM? Are you on linux perchance?
Anonymous No.107154431 [Report] >>107154484
>>107154421
yes, arch and dual gpus
Anonymous No.107154484 [Report] >>107154507
>>107154431
well how much ram do you have? is the second gpu an amd one? or nvidia?
Anonymous No.107154507 [Report]
>>107154484
all is fun in guessing games
llama.cpp CUDA dev !!yhbFjk57TDr No.107154513 [Report] >>107154533 >>107154554 >>107155011 >>107155181
>>107154359
Currently I'm spending very little time actually using language models vs. developing software for it.
One factor is that every time I use software that I'm developing myself I start thinking about all of the ways that it ought to be improved which ruins the enjoyment.
The last few weekends I've spent upgrading and rearranging my hardware and working on automating the assignment of tensors to GPUs.

It was in August when I last used language models for extended periods of time, back then I liked Deepseek R1 a lot, I haven't yet gotten to comparing it to GLM or Kimi.

>>107154399
As of right now just making sure the connectors are properly inserted and checking whether any of the cables get suspiciously hot.
I ought to buy a current clamp and check properly though.
(Also a CO2 fire extinguisher just in case.)
Anonymous No.107154533 [Report]
>>107154513
>One factor is that every time I use software that I'm developing myself
Show us your custom frontend.
Anonymous No.107154554 [Report]
>>107154513
That's very relatable. I hope development continues to be enjoyable and productive for you.
Anonymous No.107154760 [Report]
>>107153871
I decided to measure how much speed dropped approaching 10k with a creative writing exercise. Methodology is ad-hoc since there isn't a standardized creative writing benchmark, and Kimi didn't quite hit the +-10 token count but 2824/3000 is still a 94% token prediction accuracy of story structure length which is pretty good for a Q1 quantization.

The story itself is decently witty banter. This is the default assistant in ST; no character card or "sassy" personality prompt is loaded. Kimi a natural shitposter.
Anonymous No.107154770 [Report] >>107154805
>>107154123
>>107153942
Not possible. Consumer motherboards only have 2 memory channels. They can have 4 DIMM slots, but never more than 2 memory channels.
Anonymous No.107154805 [Report] >>107155143
>>107154770
Hi FAE. I misspoke. 2 channel, 4 DIMM slots. 64x4 DDRM5 6000hz sticks.
Anonymous No.107154860 [Report] >>107154892 >>107154903
>silent scream
I hate this phrase so much
Anonymous No.107154892 [Report]
>>107154860
Where did this come from? GLM, Kimi, and Qwen all do it. GPT or Claude training data? "mee-cro-saa-vis" almost makes up for it doe. Almost.
Anonymous No.107154903 [Report]
>>107154860
jolts
waves
shockwaves
Anonymous No.107154968 [Report]
>>107152307
You're the senkoguy who had a few K of messages on cai with senko back in /vg/?
Anonymous No.107154991 [Report]
>>107152488
kek
Anonymous No.107155011 [Report]
>>107154513
>One factor is that every time I use software that I'm developing myself I start thinking about all of the ways that it ought to be improved which ruins the enjoyment.
Relatable. In fact, recently I get more entertainment from developing than from using the software I developed. It's weird.
Anonymous No.107155023 [Report] >>107155064
>>107153682
>not X; but Y
Anonymous No.107155055 [Report]
>>107154071
Compute-time scaling generalizes very poorly from what I can see. So it's only good for benchmaxxing. Otherwise the doctor can't operate because the boy is a horse.
Anonymous No.107155064 [Report]
>>107155023
Be nice. He is very proud of waiting an hour for his slop.
Anonymous No.107155094 [Report] >>107155107
I just realized that thinking models casually roast you in their reasoning.
Wtf?
Anonymous No.107155107 [Report] >>107155149
>>107155094
>you are not just bad at roleplay; you are fucking retarded!
Anonymous No.107155143 [Report]
>>107153942
>>107154805
What memory clock are you actually getting? It'll default to 3600 MHz, and I can't imagine successfully getting 6000.
Anonymous No.107155149 [Report] >>107155192
>>107155107
>I love you schizo Yandere
>AI: not realistic
And this is why AI sucks.
Anonymous No.107155155 [Report]
AGI by 2025 bros?
Anonymous No.107155181 [Report] >>107155281
>>107154513
Any recommended resources on getting started with CUDA?
Anonymous No.107155192 [Report]
>>107155149
The problem is your writing. There was insufficient build up to justify the love for your schizo yandere.
llama.cpp CUDA dev !!yhbFjk57TDr No.107155281 [Report]
>>107155181
To get started with machine learning in general I would recommend implementing a simple feed-forward neural network in something like NumPy, years ago I did some related exercises for a Stanford course that were freely available.
I don't know any good resources for getting started with CUDA in particular but I found the following the most consistently useful:
-The CUDA "C++ Programming Guide" for basic features and the "Best Practices Guide" for how to write better code.
-The PTX ISA documentation for advanced features like tensor cores or asynchronous data copies (they have terrible "high-level" interfaces in CUDA).
-The AMD ISA documentation to appreciate how much worse it could be.
Also familiarize with the use of NVIDIA NSight Systems and NSight Compute.
Anonymous No.107155414 [Report]
K2-thinking feels like the true successor to the very first R1 in both the best and worst ways.
Anonymous No.107155444 [Report]
>>107155428
>>107155428
>>107155428
Anonymous No.107155557 [Report]
>>107154319
I looked at this PSU and decided I couldn't possibly trust Asus with this. I'm sticking with 2x HX1500i.