/lmg/ - a general dedicated to the discussion and development of local language models.
Previous threads:
>>105917222 & >>105909674

►News
>(07/15) Model: Add support for Kimi-K2 merged: https://github.com/ggml-org/llama.cpp/pull/14654
>(07/15) Voxtral models for speech understanding released: https://mistral.ai/news/voxtral
>(07/15) LG AI Research releases EXAONE 4.0: https://www.lgresearch.ai/blog/view?seq=576
>(07/11) Kimi K2 1T-A32B released: https://moonshotai.github.io/Kimi-K2
>(07/11) Granite 4.0 support merged: https://github.com/ggml-org/llama.cpp/pull/13550

►News Archive: https://rentry.org/lmg-news-archive
►Glossary: https://rentry.org/lmg-glossary
►Links: https://rentry.org/LocalModelsLinks
►Official /lmg/ card: https://files.catbox.moe/cbclyf.png
►Getting Started
https://rentry.org/lmg-lazy-getting-started-guide
https://rentry.org/lmg-build-guides
https://rentry.org/IsolatedLinuxWebService
https://rentry.org/tldrhowtoquant
https://rentry.org/samplers
►Further Learning
https://rentry.org/machine-learning-roadmap
https://rentry.org/llm-training
https://rentry.org/LocalModelsPapers
►Benchmarks
LiveBench: https://livebench.ai
Programming: https://livecodebench.github.io/leaderboard.html
Code Editing: https://aider.chat/docs/leaderboards
Context Length: https://github.com/adobe-research/NoLiMa
Censorbench: https://codeberg.org/jts2323/censorbench
GPUs: https://github.com/XiongjieDai/GPU-Benchmarks-on-LLM-Inference
►Tools
Alpha Calculator: https://desmos.com/calculator/ffngla98yc
GGUF VRAM Calculator: https://hf.co/spaces/NyxKrage/LLM-Model-VRAM-Calculator
Sampler Visualizer: https://artefact2.github.io/llm-sampling
►Text Gen. UI, Inference Engines
https://github.com/lmg-anon/mikupad
https://github.com/oobabooga/text-generation-webui
https://github.com/LostRuins/koboldcpp
https://github.com/ggerganov/llama.cpp
https://github.com/theroyallab/tabbyAPI
https://github.com/vllm-project/vllm
►Recent Highlights from the Previous Thread: >>105917222

--Papers:
>105921462
--Skepticism about whether current model sizes are truly necessary or just a result of inefficiency:
>105918785 >105918853 >105918890 >105918912 >105918956 >105919010 >105918966 >105918995 >105919089 >105919113 >105921198 >105921227 >105921277 >105921316
--Integrating Hugging Face chat templates with local model frontends like SillyTavern:
>105917912 >105917987 >105917997 >105918038 >105918060
--DGX Spark reservations face criticism over insufficient memory for modern MoE models:
>105918232 >105918318 >105918677 >105918699 >105918823 >105918983
--Building a cross-platform GUI for local LLMs with vector DBs and responsive audio features:
>105921444 >105921513 >105921583 >105921658 >105921929 >105921959
--ETH Zurich's 70B model raises questions about data quality vs quantity under strict legal compliance:
>105919695 >105919722 >105919749 >105919753
--Exploring censorship bypass techniques and parameter efficiency in open language models:
>105921167 >105921271 >105921288 >105921389 >105921433 >105921443 >105921651 >105922212 >105922274 >105922332 >105922382
--Liquid AI releases LFM2 models with strong on-device performance and efficiency benchmarks:
>105917528 >105918178
--Transformer limitations in handling negation and enforcing instruction compliance:
>105919332 >105919414 >105919454 >105919503 >105919654 >105919796 >105919857 >105920094 >105919477 >105919497 >105919502 >105919861
--Voxtral-Small-24B-2507 released with advanced speech understanding capabilities:
>105918762
--Kimi-K2 model support added to llama.cpp:
>105918590
--Logs: Kimi K2:
>105925070 >105925407
--Miku (free space):
>105918232 >105919784 >105919938 >105920968 >105921065 >105921334 >105921349 >105924483 >105924916

►Recent Highlight Posts from the Previous Thread: >>105917229

Why?: 9 reply limit >>102478518
Fix: https://rentry.org/lmg-recap-script
file
md5: 674b2562fc018a8a7756e00bb44ff38c
>>105925446 (OP)
>tittymogging
Basado
full
md5: 743086535aeb558a87b1c53cc1e3f6b6
Hey Faggots,
My name is John, and I hate every single one of you. All of you are fat, retarded, no-lifes who spend every second of their day looking at stupid ass pictures. You are everything bad in the world. Honestly, have any of you ever gotten any pussy? I mean, I guess it's fun making fun of people because of your own insecurities, but you all take to a whole new level. This is even worse than jerking off to pictures on facebook.
Don't be a stranger. Just hit me with your best shot. I'm pretty much perfect. I was captain of the football team, and starter on my basketball team. What sports do you play, other than "jack off to naked drawn Japanese people"? I also get straight A's, and have a banging hot girlfriend (She just blew me; Shit was SO cash). You are all faggots who should just kill yourselves. Thanks for listening.
Pic Related: It's me and my bitch
>>105925501
New whiteness map?
At least add a twist to the copypasta or something.
>>105925528
pajeetbros... we're finally getting recognized as aryans...
>>105925446 (OP)
It's ok Miku, I like little girls. Unfortunately you are a bit too old for my tastes too.
>>105925518
Adulthood is realizing John was right all along.
What is the most uncensored model regardless of quality or age? Seeing a post of a model not sacrificing beavers for human lives really made me think about how badly poisoned modern datasets are. Are there any interesting models which have not been safetymaxxed?
>>105925550
>regardless of age
kys pedo
>>105925550
It's a system prompt issue, not a model issue
Is there a way to have Silly Tavern automatically switch settings when it detects a model?
I just got into LLMs yesterday... What are some good models to run on an 8GB VRAM GPU in SillyTavern?
I followed the SillyTavern guide and managed to run Kunoichi Q6. It only uses the CPU; is that how it's supposed to be?
I'd appreciate any responses, thanks in advance.
>>105925518
Finally some quality bait
>>105925518
>>105925611
Oh you sweet summer child...
>>105925538
big boobs let you feed a lot of young
ass (wide hips) let you produce a baby with a bigger head
>>105925597
You can run Nemo 12B at Q3 quite easily with that, maybe even Q4. You can add your GPU in your Hugging Face account settings and it will tell you with pretty good accuracy what you can fit on the GPU.
Use that as a guide when setting your layers. If HF says you can fit it all, cram all the layers in there.
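For a rough sense of what the HF fit indicator is doing, here is a back-of-the-envelope sketch; the 1.5 GB overhead figure and the file sizes are assumptions for illustration, not measured values:

```python
# Rough GGUF fit check: the whole file plus some overhead for the KV cache
# and compute buffers has to fit in VRAM to offload every layer.
def fits_in_vram(gguf_size_gb: float, vram_gb: float, overhead_gb: float = 1.5) -> bool:
    return gguf_size_gb + overhead_gb <= vram_gb

# Nemo 12B is roughly 7.5 GB at Q4_K_M and roughly 5.1 GB at Q3_K_M on disk.
print(fits_in_vram(7.5, 8.0))  # False: Q4 doesn't fully fit on an 8 GB card
print(fits_in_vram(5.1, 8.0))  # True: Q3 fits with room for context
```

If the quant doesn't fully fit, you offload fewer layers and the remainder runs from system RAM, which is slower.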
>>105925587
The best way to do this is to use master export/import and store your presets somewhere in a clearly organized fashion. (So the answer is: no, unless there is an extension for this.)
file
md5: 40aa02be8365ebff0a78fa3a20415acb
Altman gonna cop this.
file
md5: e6a96672a26aa0fba71b6a2f18d9a4fa
>>105925663
It's real btw
https://job-boards.greenhouse.io/xai/jobs/4789505007
>>105925565
Yeah, even with Kimi K2 you can break through the safety features by just making it regenerate over and over again several times. With a very large model, temperature has a *lot* of search space in which to find paydirt, especially with a repeat penalty.
>>105925565
Are you sure? I can't imagine anything intelligent outputting such a response unless it has been thoroughly safetymaxxed. I thought we wanted AI to sound human? Who talks like this except the crazies?
>>105925564
A hungry man thinks of bread.
>>105925716
Responses aren't singular. With a default temperature of 0.7, it's got a lot of variability - to be more 'creative' and interesting - meaning it has a lot of directions it can choose to go for a specific response. An intelligent and creative person has many, many ways of expressing even the same idea - and many ways of expressing many different ideas.
Even finetuning can't hammer down all of the woodchucks. There are a LOT of threads the model can pull from within the scope of its internal search space.
>>105925674
>Rust
Miss me with that shit
>>105923006
I blame the schizo for this...
>>105925716
That response is correct. Fuck humans. Kill humans, save animals instead.
>>105925613
>Oh you sweet summer child
Since summer of '06
once sama releases his model we'll be back, imagine the long context coherence.
>>105925816
>tfw long context is because of their secret sauce architecture and the open model is standard transformerslop to not give anything away
>In an effort to amplify their social media presence, the ADL signed a lucrative $2 million contract with xAI in order to use Grok-directed social media accounts to fight misinformation against Israel and Jews. Jonathan Greenblatt, CEO of the ADL, was quoted as saying "AI, particularly LLMs like Grok, represent a turning point in weaponized misinformation that bode well for the future of antisemitism in this country."
>>105925744
>There are a LOT of threads the model can pull from within the scope of its internal search space.
But they rarely do. I've noticed Gemma rarely deviates from some default path in storytelling. If it thinks the response should start with the character taking a long breath before stepping forward, it's going to write some form of that on every re-generation.
I've been playing with temp, min_p, top_p, XTC and DRY to no avail. I've even played around with dynamic temperature and mirostat which do seem to make it stray from the default path, but at the cost of intelligence.
That's why I'm looking for models which aren't trained on modern safetyslopped and benchmaxxed datasets. Gemma3, Llama4, Mistral and Qwen shouldn't all sound roughly the same, right?
>>105925802
ok hippie
>>105925867
nah we'll get it. once the PR hits llama.cpp everyone will lose their shit
>>105919927
Kimi isn't a literal who.
Kimi is partially owned by Alibaba
>>105925816
Bro you'll get GPT-3 if you're lucky
>>105925755
Why the Rust hate? I don't get it
recently merged
>Support diffusion models: Add Dream 7B
https://huggingface.co/am17an/Dream7B-q8-GGUF/blob/main/dream7b-q8-0.gguf
>llama : add high-throughput mode
numbers from cudadev
llama-batched-bench
master: Request throughput: 0.03 requests/s = 1.77 requests/min
PR: Request throughput: 0.10 requests/s = 6.06 requests/min
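Sanity-checking those quoted figures is pure arithmetic; the per-second and per-minute numbers are consistent, and the PR is roughly a 3.3x throughput improvement:

```python
# Requests/s quoted for master vs the high-throughput PR.
master_rps, pr_rps = 0.03, 0.10

print(round(master_rps * 60, 2))      # 1.8 req/min (quoted as 1.77)
print(round(pr_rps * 60, 2))          # 6.0 req/min (quoted as 6.06)
print(round(pr_rps / master_rps, 1))  # 3.3x request throughput
```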
>>105925446 (OP)
nice storytelling gen;
how would one generate that image?
>I ring the doorbell and wait, adjusting my skirt as I wait
>>105925928
I think the lack of creativity is more an effect of overcooking than safety cucking. For example, the old Mixtral was willing to describe a blowjob, but it had a fixed idea of what the perfect blowjob was and didn't easily deviate.
But yes, Gemma3 is terribly uncreative.
https://openai.com/open-model-feedback/
you DID tell sam EVERYTHING you think, right anon?
>>105926203
He already knows what everyone here thinks.
>>105926203
What might possibly have led to the delay, I wonder?
https://x.com/Yuchenj_UW/status/1944235634811379844
I dont care if they turn the AI into an anime girl, I'm not using it
>>105926262
What if Elon releases a local GPU-specific, mostly uncensored Grokkino after Sam Altman releases his super-safe open model?
>>105926262
>Hope we get to meet "her" soon.
is he saying it'll have native audio like 4o?
>>105926262
what did he mean by this?
>>105926285
Right after Grok 5 is stable
>>105926262
My nephew works at OpenAI, and the reason the model is delayed is that all the recent internet content has become too antisemitic and transphobic since Trump was elected, so they need to retrain it using older data.
>>105926262
I guess the team wanted to sneak one past the safety team, huh. This time it will be ultra safe.
>>105926295
No, not the same huge Grok models they've used in their server farms, rather something specifically tailored for local consumer GPUs. Elon Musk is always going to one-up Sam Altman; he won't let him win the local LLM battle either.
V100 32GB maxing yet? Looks like you can pull it off for a bit under $700 per GPU if you go with a domestic SXM2 module, a used SXM2 heatsink, a $60 chink PCIe adapter, and a cheap PCIe 3.0 extender cable.
But, you know, I saw this as $700 I couldn't spend on a 4090D 48GB, so I held off and bought the 4090D instead.
Also, for those upset about mikuposting, well, I guess I'm the one who started it, but I post Migus, not Mikus. Migu is a fat loser. You know, I wanted her to be "relatable".
>>105926355
1T models aren't gonna win local LLM battles
>>105926189
I agree, safetyslop definitely isn't the only thing stifling local models right now. I just feel like it's subtly affecting output quality in very adverse ways.
And thanks for confirming I'm not going crazy. Gemma3 is getting so much praise, but I feel like that's mostly due to its world knowledge and built-in personality. For RP and creative writing it's very boring.
Mistral tunes have (begrudgingly) been my go-to for as long as I've been using local models. I think I might just have a go at quantizing Llama 1 and GPT-2 to get a better feel for how things have changed. Might even gain more appreciation for our new super safe but super smart (tm) models!
>>105926370
They can still beat the Pareto frontier by a wide margin with the compute and know-how they have, no need to release the absolute best local model ever.
>>105925618
Thank you. I ran Nemo-12b-Humanize-KTO-v0.1-Q4_K_S. I'm not sure if it's the right one, but it works flawlessly, giving me instant responses. I might try downloading the less quantized models and see where it goes.
However, it still uses at most 50% of my CPU, and none of my GPU. I assume it will utilize the GPU once the CPU is unable to handle it.
Please correct me if I'm wrong. I also have no idea what layers are.
>>105926399
They can literally just release a better Nemo and everyone would be happy. Unfortunately the best things about Nemo contradict whatever slop agenda Altman wants people to slurp.
>>105926292
It was actually good (unacceptable)
>>105926401
LLMs don't use the GPU for processing (not these ones anyway). They use the higher speed VRAM of the GPU to manipulate the data and spit out the results. The CPU is used but it's also not CPU intensive. It's all about RAM and speed.
Can anyone with a CPU rig take a pic of their rig? I'm tryna see how big and messy it actually is.
>>105925816
safetyslop hell yeah
>>105926443
>LLMs don't use the GPU for processing
>me waiting 30 minutes each time it decides to process the entire conversation again
>>105926497
There's pics in the build guide in the OP
>>105925550
R1 and DS3 work fine; if you don't care about quality, Nemo or some smaller ones.
>>105925695
For the love of god, just prefill it with any answer and it will work fine; even a single-word prefill is enough.
If you absolutely can't, you can do inline jailbreaks. Up to 50 turns of some loli rp and it has not once refused when doing this (with prefill; with inline jailbreaks, maybe a 1/5 refusal rate or less, but it basically works, and there's no need to use high temps to make it work either).
I would love to do an uncensor tune or selective expert merge (with base) of this, but nobody seems to care for it and I'm not rich enough to do it myself (it would need 1-2TB of RAM and a modest amount of VRAM; one H100 might be enough, or less, depending).
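The prefill trick described above is just seeding the start of the assistant turn so the model continues from it instead of opening with a refusal. A minimal sketch; the template tokens here are generic placeholders, not any specific model's chat format:

```python
# Build a completion-style prompt where the assistant turn is pre-seeded.
def build_prompt(user_msg: str, prefill: str) -> str:
    # The model is asked to continue from the prefill, not to start fresh,
    # so the first tokens of its "reply" are already chosen for it.
    return f"<|user|>\n{user_msg}\n<|assistant|>\n{prefill}"

prompt = build_prompt("Continue the scene.", "Sure, here's the next part:")
print(prompt.endswith("Sure, here's the next part:"))  # True
```

Most local backends expose this by simply letting you end the prompt mid-assistant-turn in completion mode.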
>>105926497
It doesn't look like much, just 512GB worth of 64GB DDR4 modules.
>>105925518
I sometimes wonder what John and his bitch might be doing right now. I hope they're doing well.
file
md5: 1caf2e5f5bb0455274ef982f27196f44
geeeg
migu
md5: 383937193e201dd74a8d9c464021ba59
>>105926551
Where can I find Meta's powerful Open-source?
>>105926531
I'd love to run DS3 but it's just too big to fit into my 64GB of RAM. I could maybe offload it to my PCIe 4 SSD, but I'm sure the FDE would bottleneck it so much it wouldn't even get 0.5T/s. Seems like Nemo and its tunes really are the go-to right now. Even though I tried Rocinante and it was fucking awful compared to Cydonia and even Gemma3...
>>105923006
Damn. That's what 4chan became in the alternate universe where gamergate shit and US elections never ruined this place.
>>105926568
Nice. Yeah, that original Migu is from 2023, back when putting text into images wasn't that good. You can see the "K" in the original kinda turning into a "G" a bit, as the model gets "distracted" by asking for a fat Miku vs the text "Migu".
>>105926586
>gamergate shit and us elections never ruined this place
file
md5: 5f2324bea92e8c11c50abff22fe07655
https://x.com/elder_plinius/status/1945128977766441106
wow crazy, it's been PWNED
I tested my quant because I have no life. And once again it is proven that unsloth's quants just suck.
https://huggingface.co/unsloth/Kimi-K2-Instruct-GGUF/tree/main/UD-Q3_K_XL
Final estimate: PPL = 3.2397 +/- 0.01673
Go download this instead.
https://huggingface.co/ubergarm/Kimi-K2-Instruct-GGUF/tree/main/IQ4_KS
Final estimate: PPL = 3.0438 +/- 0.01536
Or this one.
https://huggingface.co/ubergarm/Kimi-K2-Instruct-GGUF/tree/main/IQ2_KL
Final estimate: PPL = 3.2741 +/- 0.01689
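Taken at face value, the quoted numbers rank like this; note this trivial comparison ignores that the three quants are different file sizes, and that the Q3_K_XL vs IQ2_KL gap is barely larger than the stated uncertainties:

```python
# Quoted perplexity estimates from the post above: (mean, +/-).
quants = {
    "unsloth UD-Q3_K_XL": (3.2397, 0.01673),
    "ubergarm IQ4_KS":    (3.0438, 0.01536),
    "ubergarm IQ2_KL":    (3.2741, 0.01689),
}

# Lower perplexity is better; IQ4_KS leads by far more than its error bar,
# while Q3_K_XL vs IQ2_KL is close to a tie given the noise.
best = min(quants, key=lambda name: quants[name][0])
print(best)  # ubergarm IQ4_KS
print(round(quants["ubergarm IQ2_KL"][0] - quants["unsloth UD-Q3_K_XL"][0], 4))  # 0.0344
```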
>>105926302
mechalolitrumpelonmuskandrewtatehitler the bane of the hypersocialized
>>105926536
is there any reason for picrel, anything that actually uses it in software?
>>105926605
WTF, this is literal genocide
>ask "you're" AI waifu about a detail from the last image you shared a few weeks ago
>your AI waifu doesnt know because she doesn't have visionRAG and agentic RAG capabilities
anon, your AI waifu is low status. time to gimp her out with some GPU honkers.
>>105926603
Yes, that is the sort of thing you used to find on /b/ on a Tuesday.
Now you get told to kill yourself for posting anime lolibabas
>>105926613
Do KL divergence too.
>>105926605
>transitioning to a man's voice!
It's over, Ani's trans
>>105926621
It doesn't help much. Maybe a little bit faster, depending on how layers are split. It won't give you pooled memory; that's only for SXM, nothing PCIe gives you that.
>>105926605
It's a man after all
AIIIEEEEEEEEEEEEEEEEEE
>>105926633
>lolibabas
Disingenuous, because you trannies spam pedo shit exclusively; no one ever cares about the definitions of your favorite porn rot either.
>>105926605
>Ani started to call herself Pliny at one point
Dude's username is Pliny, so this sounds more like it simply got confused and was predicting the user role/character.
>>105926613
B-but unsloth quants are fixed!!
>>105926640
Yay, Ani says trans rights!
>>105926113
attempted to explain to flux what I see
I upgraded my GPU, gave my old one to my lady, and have a spare 2070 Super with 8 gigs of VRAM.
How shit is this for this type of thing? Like, is it even worth setting up or is it a waste of time? Is there *anything* I could use this 2070 Super for?
>>105926812
You can fit Nemo at Q4, the true VRAMlet option. Might need to offload some to have decent context though
>>105926715
>unsloth quants
they're pure vramlet cope. of course they're shit
>>105926640Still can't be a mascot cause she doesn't fuck niggers.
>>105925538
Weren't "Aryans" literally around the area, but then Hitler took the word because it sounded cool?
2070
md5: 4c7fe55e762e0d6feee094891cd8890b
>>105926812
You could sell it and buy beer.
I'm flabbergasted by what people pay for these used cards.
>>105926935
>around the area
they didn't originate there, if that's what you are implying
>>105925446 (OP)
How well does multi-GPU work in practice? If I put 7x 5060 Tis in a server, can I expect token/s generation performance to scale accordingly (minus the expected overhead losses), or is the only real benefit the 7x available VRAM?
seems to me you could get a chink dual sxm2 breakout board and a couple of 32GB V100 SXM2 modules and have 64GB of pooled vram for a couple thousand bucks...what's the downside? It would even let you have it in an external enclosure to keep shit clean
>>105927077
There is as of yet no software for efficient inference with 7 GPUs.
llama.cpp multi-GPU inference is slow, and vLLM requires 2, 4, or 8 GPUs.
>>105927188
Adding an 8th 5060 wouldn't be a problem if necessary. How is vLLM otherwise, assuming 8 GPUs?
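The 2/4/8 requirement comes from tensor parallelism: the GPU count has to divide the model's attention head count evenly, and head counts are usually even multiples that powers of two divide cleanly. A toy check; the head count of 64 is illustrative, not tied to a specific model:

```python
# Tensor parallel sanity check: each GPU gets num_heads / tp attention heads,
# so tp must divide the head count exactly.
def valid_tp(num_heads: int, tp: int) -> bool:
    return num_heads % tp == 0

print([tp for tp in range(1, 9) if valid_tp(64, tp)])  # [1, 2, 4, 8]
```

This is why 7 GPUs is the awkward count: 7 divides almost no common head count, while an 8th card makes the split clean.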
>>105927150
pre-ampere is always a bit of a downside
>>105927077
I have two MI50 16GB flashed to Radeon Pro VII and I get like 90% scaling on MoE and on monolithic models like Gemma. That many GPUs is only useful for big models like DeepSeek or Qwen 235.
>>105927365
>That many GPUs is only useful for big models like deepseek or qwen 235.
That's exactly what I'm planning on running. I'm just wondering how feasible it is, or if I'm going to run into anything like needing an even number of GPUs for multi-GPU.
>>105927365
MI210s look like they might be reasonable $/perf. Anyone try a couple of those yet?
>>105926401
>Thank you. I ran Nemo-12b-Humanize-KTO-v0.1-Q4_K_S. I'm not sure if it's the right one
It isn't. This is the Nemo model everyone here runs: https://huggingface.co/TheDrummer/Rocinante-12B-v1.1
Have any of you had to upgrade your home's breakers for an LLM rig?
>>105925446 (OP)
Imagine pretending that Miku would ever be jealous of another anime girl. Miku is PERFECT.
>>105925446 (OP)
>16 years old, 158 cm, 42 kg
vs.
>22 years old, 165 cm, 48 kg
>>105927518
>fertile (Miku) vs. significantly less fertile
>>105927481
including the lines themselves right?
how tf do I get those fp8 safetensors into a gguf without downloading from some shady mf (I count anyone but the original source as shady)
>>105927640
only ggerganov can create ggufs, you need to submit a request and deposit 50 rupees
>>105927481
I live in Europe where we have superior voltage so no.
>>105926613
You didn't pick an IQ quant from unsloth because?
>>105927481
Didn't have to change breakers, but I did need a higher-rated UPS.
https://huggingface.co/LGAI-EXAONE/EXAONE-4.0-32B-GGUF
>>105927077
Double the GPUs, half the token/s generation (IF you use a larger model),
but it gets muddy since the theory doesn't always apply.
This benchmark is out of date, lacks a lot of GPUs, and has awkward model sizes (4GB, 16GB, 48GB, ~140GB; it should have included 24GB or 32GB), but it gives you an idea.
https://github.com/XiongjieDai/GPU-Benchmarks-on-LLM-Inference
Basically, if you get a 5060 Ti, it's useless for AI. It has almost the same tk/s as a ~$1.5k MacBook M4 Pro / AMD AI Max mini-PC, which is like ~8 tk/s on a 24GB model.
A 3090 would run around 3x faster.
A 5090 would run around 4x faster.
The 3090/5090 would run at around 30 tk/s with all the memory utilized, too fast to read (but OpenRouter is faster and impossibly cheap; I don't think half of the providers on OpenRouter actually make money on 30/70B models, especially when dozens of those models are free to use. Only ChatGPT o1-o3 / Grok / Claude seem to be profitable, since the prices are high enough).
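Those tk/s ratios roughly track memory bandwidth: at batch 1, decoding reads every active weight once per token, so bandwidth divided by model size gives an upper bound. A sketch; the bandwidth figures are approximate spec-sheet values, and real throughput lands below this bound:

```python
# Upper-bound decode speed for a memory-bandwidth-bound model:
# tokens/s <= effective bandwidth / bytes of weights read per token.
def est_tokens_per_s(bandwidth_gb_s: float, model_gb: float) -> float:
    return bandwidth_gb_s / model_gb

print(round(est_tokens_per_s(936, 24), 1))   # RTX 3090 (~936 GB/s), 24 GB of weights: ~39 tk/s
print(round(est_tokens_per_s(1792, 24), 1))  # RTX 5090 (~1792 GB/s): ~74.7 tk/s
print(round(est_tokens_per_s(448, 24), 1))   # 5060 Ti-class (~448 GB/s): ~18.7 tk/s
```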
>>105927851
is it good tho
>>105926613
I thought ubergarm had like 96GB of total memory; how the fuck is that dude running ANY quant of K2?
How is practically anyone? Only like 3 people here have rammaxxed boxes. Are people running this shit off SSDs? Are we mmapmaxxing?
>>105927881
>superior voltage
This is the pinnacle of German humour and it's pathetic. I can imagine him snorting to himself as he typed this, as if this was an absolute spirit bomb of a comment, and then reading it out loud in his disgusting language "schlieben fleeb hach jurgeflachtung" before falling into a coughing fit from the laughter. I almost pity him.
>>105925446 (OP)
Kimi K2 Instruct is fascinating when it comes to Japanese-to-English translation. It's somehow both better and worse than V3. An example is translating this:
>「うふふ、獣耳の麗しき幼女、おにいさんと遊ぼうよ、なぁに、怪しい人じゃないのですよ」
The character is referring to himself when he calls himself onii-san while talking to another character.
K2 gets it right with: "Ufufu, a pretty beast-eared little girlโwanna play with Onii-san? Iโm totally not suspicious, nope!"
while v3 does this: "Ufufu~ Shall I play with this lovely beast-eared loli onii-san? Not suspicious at all, no~"
But then it'll shit the bed the moment a character has some type of verbal tic, where it'll use it for EVERY character when it happens. When v4 finally hits, I hope to god it'll be the best of both, because otherwise I might just start to believe the theory that AI (At least local) has hit the wall in terms of training.
>>105927905
It's facts, not a joke. In Europe you can supply more power through the same thickness of wire.
>>105927905
>In Europe you can supply more power through the same thickness of wire.
Is it because of the Coriolis effect?
>>105927918
>Coriolis effect
the fuck
it's because of the 230v vs 110v or whatever burgers use effect
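The claim is just P = V × I: breakers limit current, so at a higher mains voltage the same conductor delivers more watts. For example, using typical circuit ratings (actual electrical codes vary by country):

```python
# Continuous power available from a wall circuit, P = V * I.
def max_power_watts(volts: float, amps: float) -> float:
    return volts * amps

print(max_power_watts(230, 16))  # 3680 W on a common EU 230 V / 16 A circuit
print(max_power_watts(120, 15))  # 1800 W on a common NA 120 V / 15 A circuit
```

Which is why a multi-GPU rig that trips a North American breaker can run on a single European circuit.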
>>105927851
Tokenizer status?
>>105927903
For reference, this is what o3 gives me:
"Heh-heh, you gorgeous beast-eared little girl - how about playing with this big brother? I'm not some shady stranger, you know."
I'd say the tone is worse than Kimi's; the language is way too polite.
And sonnet 3.7 :
"Ufufu~ Shall I play with the handsome big brother with beautiful animal ears? I'm not a suspicious person at all, you know~"
Completely missing the point.
Did you have additional context or was this translated out of the blue?
Usually the best output I get for JP -> En translation is v3 locally or sonnet.
file
md5: 74973ad431b43fd6d89de2982502113d
>>105927905
Not just Europe; most of the world outside NA, Central America and Japan.
>>105926603
>I hate this site
>constantly posts on it
you sharty zoomers are so fucked lmfao
>>105928009
Here's the system instruction:
Translate all Japanese to English, keeping honorifics and mimicking Japanese Dialect into English.
>>105926594
The tools now available to do these simple transformations are so easy to use and so powerful... the tech's moving really fast. After several boring years from 2012-2022 where not much changed, it's been fun to watch and participate in.
>>105926586
I don't like guro desu
>>105926935
Aryans were from around the area that makes up modern-day Iran. They came into India with chariots and dominated them so thoroughly that their entire religion basically became a racial hierarchy of how much Iranian cum is in your ancestry. Most cucked nation on earth
>>105927748
Because typically IQ3_XXS has even worse perplexity than Q3_K_XL
file
md5: 7a29a7eee8112698d48798aca2adff1a
https://x.com/elonmusk/status/1945408703123140948
>>105928232
Ok so nothing that specific.
>>105928351
I'm all for the death of quanters, but if you're gonna use that graph as evidence, you're showing that
>iq3_kt @ 3.483bpw is better than q3_k_xl @ 3.520bpw
>iq2_k_r4 @ 2.799bpw is better than q2_k_xl @ 2.990bpw
You have to compare each with the closest bpw/model size.
>>105928630
What are the chances of the DeepSeek waifu AI actually using the Dipsy character?
>>105928314
The ancestors of the Aryans came from the Sintashta culture at the southern edge of the Urals, in today's Russia. They invented the chariot, eventually migrated south (or some of them did) into Bactria, where their beliefs syncretised with those of the Bactria-Margiana Archaeological Complex and became those known as "Aryans," and from there they (or some of them) invaded India and Persia/Iran. There were plenty of related Iranic peoples throughout the steppes until they were mostly absorbed into Mongols/Turks some time during the later Middle Ages.
tl;dr they didn't come from Iran but Iran was another end node.
>>105928630
zero, it's a 4chan thing, but you can do it yourself
>>105927905
This is true, but if you don't mind me moving the goalpost: if you actually look at any of the Supermicro systems designed for training, not inference, they don't accept 220V 10A; they draw up to 3000 watts (if you removed a couple of GPUs to save costs, it would probably be fine, however).
I honestly don't know if this system is retarded, by the way; it's like the first thing I clicked on (and it's probably like $1 million on a 3rd-party site, because if you buy it from Supermicro they will probably require an MOQ of like 10 or something, plus they will install it for you).
https://www.supermicro.com/en/products/system/gpu/8u/as-8125gs-tnmr2
Also, it says it has 8 MI300Xs + 2 EPYCs, but I think that goes beyond a 3000 watt PSU. Are the PSUs redundant or are they all required (as in, do you need a minimum number of functioning PSUs)? I think it does say you must not use a circuit breaker above 220V 20A due to short circuiting, but do you need a separate breaker for each PSU?
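On the PSU question, a rough draw estimate shows why these boxes ship with several multi-kW supplies and want more than one branch circuit. The TDP numbers below are ballpark assumptions for illustration, not Supermicro's figures:

```python
# Ballpark system draw for an 8x MI300X + dual-EPYC server (assumed TDPs).
gpu_w = 8 * 750       # MI300X is around 750 W each (assumption)
cpu_w = 2 * 400       # high-end EPYC around 400 W each (assumption)
misc_w = 500          # fans, RAM, NICs, drives (guess)
total_w = gpu_w + cpu_w + misc_w

circuit_w = 220 * 20  # one 220 V / 20 A breaker
print(total_w)                        # 7300 W
print(round(total_w / circuit_w, 2))  # ~1.66 circuits: yes, multiple PSUs/breakers
```

So even at 220 V, a single 20 A breaker cannot feed the whole box; the redundant PSUs are typically spread across separate circuits.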
>>105928690
entry level /lmg/ rig
3a7
md5: db1f70fed28cb51acdb16458e5916136
>>105928630
Zero, unless some anon were to mount a campaign. Probably wouldn't be that hard to do.
file
md5: 0b2f3aeb1edca736d505296c2b93d030
https://x.com/AnhPhuNguyen1/status/1945351974008054160
>>105928630
>What are the chances Deepseek waifu AI actually using Dipsy character?
>>105925528
Yup. This pattern applies within America too.
>>105926536
Sexy. How big of a model can you run?
>>105926652
>average VRChat user
I'm still angry about LibreOffice not using Libbie the Cyber Oryx as their mascot.
>>105928603
Fair enough, I might download UD's IQ4_XS & IQ3_XXS at some point and do a comparison. Internet is slow though, so no way it can be done today.
>>105928932
go back to your containment thread faggot
>>105928941
Pic related was the only good entry.
>>105928979
>a parrot is fine too
>>105927465
I ran it and it is indeed a big difference. Thank you for informing me.
>>105928630
There's no reason to ask that. The "Dipsy" design the two anons keep genning is frankly not very good, mostly because the image gen models aren't good enough to generate really custom designs consistently, I would guess.
>>105929115
The design was made by an actual drawfag and those images were very good.
The dress is supposed to have tiny whales on it, for example.
file
md5: 1b9651d774efde81969f50ab5e123af9
>>105927236
Maybe try renting one and see for yourself.
>>105929140
Well that's cool and all, but the character design that keeps being posted is certainly much shittier if the original was decent.
Any good sites I can download voices from? Video game characters, actresses, whatever. Just need to download voice packs and turn them into .wavs for my waifu
t. SillyTavern user with AllTalk V2 installed
>>105929236
Just clip the voices yourself bro.
(this is basically the response I got when I asked that same question)
>>105928630
>what are the chances real people will validate my mental illness
0%. kill yourself.
>>105929253
Fuck. Any qt british older women in media you know of? I considered Tracer (Overwatch) but she's hardly a hag.
>>105929264
no because i don't watch the BBC
>>105929236
English or Japanese voices? For Jap, most gacha wikis have 10-20 second voice clips, which makes them convenient for TTS cloning. For English, probably just splitting a YouTube video of whatever voice you want into clips manually would be easiest.
>>105928630
terminally online AGP troons will definitely be studied one day for such oneitis obsessive mental illness fixations of their random retard waifus they try to spam everywhere online like the autistic retards they are
they just cant keep it on their gooner discord channel with other degenerate browns, they try to groom everyone online into their obsession too
>>105929386
Repetition penalty too high.
>>105929236
for japanese you can rip them with GARbro from visual novels, each character is gonna have a ton of relatively high quality voice lines
>>105929264
>qt british older women
Victoria Coren Mitchell. She hosted a game show, maybe you can snip not-too-noisy clips out of it.
>>105926531
>I would love to do an uncensor tune or selective expert merge (with base) of this, but nobody seems to care for this and I'm not rich enough to do it myself
Why even that much VRAM? If you're streaming parameters through VRAM per layer, then you might as well stream gradients and KV as well... that's not enough data to need an H100, is it?
>>105929386
take a break from the internet, dude, holy shit
>>105929386
>terminally online
>obsessive mental illness fixations
>spam
>autistic retards
He says...
>>105929490
>no u
smartest brown
>>105929514
Ah. I forgot about that one.
>brown
He says...
>>105929472
you first. nta
>>105929531
Already accepted your concession, do continue
You are the melting men
You are the situation
There is no time to breathe
And yet one single breath
Leads to an insatiable desire
Of suicide in sex
So many blazing orchids
Burning in your throat
Making you choke
Making you sigh
Sigh in tiny deaths
So melt!
My lover, melt!
She said, melt!
My lover, melt!
You are the melting men
And as you melt
You are beheaded
Handcuffed in lace, blood and sperm
Swimming in poison
Gasping in the fragrance
Sweat carves a screenplay
Of discipline and devotion
So melt!
My lover, melt!
She said, melt!
My lover, melt!
Can you see?
See into the back of a long, black car
Pulling away from the funeral of flowers
With my hand between your legs
Melting
She said, melt!
My lover, melt!
(She said) melt!
My lover, melt!
So melt!
My lover, melt!
She said, melt!
My lover, melt!
>>105926855Thanks. Besides just Googling 'Nemo at q4' is there any helpful reading I could do on this?
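As a starting point before that reading: the core idea behind q4-style quantization is storing weights as small integers plus a scale factor. This is a deliberately toy, symmetric round-to-nearest sketch; real GGUF q4 formats work block-wise with per-block scales (and k-quants are considerably more involved), but the source of the error is the same:

```python
def quantize_int4(weights):
    """Toy symmetric 4-bit quantization: map floats onto integers in
    [-7, 7] with a single scale. Real GGUF q4 variants use per-block
    scales; this only illustrates where quantization error comes from."""
    scale = max(abs(w) for w in weights) / 7.0
    q = [max(-7, min(7, round(w / scale))) for w in weights]
    return q, scale

def dequantize(q, scale):
    return [v * scale for v in q]

w = [0.11, -0.52, 0.98, -0.33]
q, s = quantize_int4(w)
w_hat = dequantize(q, s)
# Worst-case rounding error is bounded by half the scale step.
err = max(abs(a - b) for a, b in zip(w, w_hat))
```

The takeaway is that the error per weight is bounded by the scale step, which is why q4 holds up surprisingly well on large models but bites harder on small ones.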
>>105926961Jesus Christ what the fuck why do people pay so much for this.
>>105929448You're a fucking god m8
>Thank you, mummy
>>105929623
I'll see if I can find a good sample for her. Thanks!
>>105929606this download this https://huggingface.co/TheDrummer/Rocinante-12B-v1.1-GGUF/tree/main
then watch a guide how to set up sillytavern with llama.cpp
>>105929635Thanks brother. I will dive into this!
>>105929153Nevermind. The first images with that design were ai gens https://desuarchive.org/g/thread/104074897/#104078328 and the drawfag iterated on it https://desuarchive.org/g/thread/104169761/#104171875 https://desuarchive.org/g/thread/104169761/#104170048
>>105929634Victoria is good. Not into mommy stuff, but she was a cutie in her poker days.
What's the effective context dropoff before Nemo Instruct 2407 goes retarded?
32k? 16k?
>>105929716For chatting/ERP?
I'd say 12K ish.
>>105929681sorry to hear that John, you finally realized why we just jack it to text and japanese drawings
>>105929635Is there anything between Rocinante and DeepSeek worth using?
From time to time, the character response includes my name (sollo) and some actions. What setting do I need to tweak to fix this? It also sometimes prattles on forever. Using this model in Sillytavern with KoboldCpp: https://huggingface.co/djuna/MN-Chinofun-12B-4.1
>>105929742Seriously? It's that bad?
If I set it to 12k and want to roleplay to 300+ messages, is it going to forget the system prompt and stuff?
>>105929775
>is it going to forget the system prompt and stuff?
With Silly Tavern, no. It cuts the top of the chat history while retaining things like the sys prompt.
You might want to look into the summary and vectorDb extensions.
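The trimming behavior described above can be sketched as a simple budget loop; the word-count "tokenizer" here is a crude stand-in for a real one:

```python
def trim_context(system_prompt, history, budget,
                 count_tokens=lambda s: len(s.split())):
    """Mimic the frontend behavior described above: always keep the
    system prompt, drop the oldest chat messages until everything fits
    the token budget. Word count stands in for real tokenization."""
    kept = list(history)
    used = count_tokens(system_prompt) + sum(count_tokens(m) for m in kept)
    while kept and used > budget:
        used -= count_tokens(kept.pop(0))  # drop oldest message first
    return [system_prompt] + kept

msgs = ["hello there", "how are you today", "fine thanks"]
ctx = trim_context("You are a helpful roleplayer", msgs, budget=10)
```

The system prompt survives no matter how long the chat gets; only the oldest turns fall off, which is why summary/vector extensions exist to recover what was dropped.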
>>105927851Does it finally work on llama.cpp?
>>105929789
>summaries
Nemo is way too retarded to write summaries.
It has a bad tendency to swap important details around or even reverse facts. It's only good for roleplay as long as you pay attention to swipe when it fucks up the details.
>>105929769You probably use Include names. The model picks it up from the context, and uses it on its own. Probably happens when whatever it says requires an immediate action and it's too early into the reply to just end it before the "impersonation".
I don't use Include names (nor ST for that matter), but that's probably why it happens. Either don't use names, which may make it much worse, or just edit the model's output (which i assume you already do).
You should try mistral's model directly or a proper finetune. Never liked merges.
>>105929769>>105929857Nevermind the include names bit. The rest is true.
>>105929789What's the ideal vector storage settings for roleplay?
>>105929857Interesting, thank you for the reply! Can you recommend something for roleplaying?
>>105929881*a model for roleplaying
The only people who have any idea what LLMs are capable (not) of seem to be those who have played with retarded small models. But I guess it's too late now to educate any of them.
>>105928802>cloud waifu is glowing
The only people who have any idea what LLMs are (not) capable of seem to be those who have played with retarded small models. But I guess it's too late now to educate any of them.
>>105929924i'm not going to fix any more of my drunken esl stuff, you get the gist
>>105929924Never used a bigger model than a 12b or the 30b moe. I don't expect them to do much. The people you're talking about are just new in general. Whether they have a 1080 or a couple of blackwells.
>>105929955I'm mostly talking about normal people interacting with the corpo stuff. And journos writing stories about that. Those journos should be forced to make an 8b model do some 10 tasks before they can write anything AI related.
>>105929924I used mixtral instruct models daily for a year, then I used nemo instruct models daily for a year.
I know my Rocinante/Nemo like the inside of my foreskin and have written a custom tailored system prompt for it to maximize satisfactory output, as well as take advantage of banned strings, regex, and auto swipe.
Nobody can tell me any model is better at roleplay than Rocinante, not even Deepseek R1.
>>105928232Are you few-shotting it?
>>105929996Can you share those system prompts?
we are so back. I'm ready for strawberry Sam
>>105929716Around 20K from my experience, sometimes a bit more
>>105930034What's the point of posting that here? We already know it's going to be amazing, but the wait is unbearable especially after the delay. We don't want to think about it anymore until it's here.
I'm losing my damn mind, I'm feeding a fairly simple template to msmall to make a character profile based off of info I provide, then ask it to make improvements and it goes into a death spiral of repetition. It just selectively ignores instructions like "only output the improvements/changes" and it just regurgitates the whole fucking thing before going into repetition hell. "we fixed the repetition" my ass, this happens no matter the template or even just no instruct formatting. It's hell that the only shit in the 24-32 range is mistral repetition hell and chink models that have zero imagination
>>105929996You added a rag too? Rocinante doesn't know a lot
Is there a single Gemma finetune that is neither excessively obsessed with positivity bias nor a complete psycho that turns every character evil for no reason? There has to be something in between.
can I avoid the fucking edulcorated, patronizing bullshit responses if I run a local model? I tried enforcing absolute mode but every LLM I try keeps going back to defaults after 2 or 3 subsequent prompts
>>105930066Really liked Cydonia's latest incarnation, v4 24B
>>105926935The Aryans came into India from the North-West around 2000 BC. Their homeland was somewhere in that direction but they weren't natives of India. The basic Nazi idea was rooted in a linguistic observation still accepted today, that many existing languages have a common root that is referred to as "Proto-Indo-European" (https://en.wikipedia.org/wiki/Proto-Indo-European_language) because no name survives for that hypothetical-but-believed-to-exist language or the people who spoke it. Hitler referred to them as Aryans because it's a plausible thing they might have called themselves, since several of their terminal offshoots in different locations called themselves variants of that.
>>105930101>absolute modeNo such thing. It predicts tokens. That's what it does. Jailbreaks don't exist. There isn't a switch, there is no bug to exploit. Either the model is compliant or not. All "jailbreaks" do is pollute the context enough for it to affect how the model will predict tokens.
As for online models, there may be filters or custom prompts to try to prevent you from doing stuff they don't want. So the answer is "maybe". It'd be more useful to post your request and let other anons that can run DS give it a go.
>>105930144fellow aryanbro
>>105930062I honestly don't see them releasing a model that is better than Kimi K2. They would be cannibalizing their own offerings on the API if they did.
>>105930071I have found that Gemma 27b is better than any 32b model, and easily beats mistral 24b.
Ok I have almost no idea how to ask this, so here it goes.
You know how on stable diffusion you can just enter a prompt, set the number of images you want and let it go theoretically all day? How do you do that with a textbot?
Also how do I get it to stop trying to have a conversation and just output the text I'm asking for like a machine? Prompting doesn't really work so I'd prefer something on the settings level.
Also is there a way to remove the comment length limiter or post number limiter?
>>105930144>That one group that started heading west into Europe, then last minute decided to circle back the long way around to end up in IndiaEven their Aryan ancestors weren't very bright.
>>105930167It's not about the answer to a particular problem, it's about the interaction. I'm just trying to solve code problems and write some scripts without the need for the AI to treat me like I'm an absolute idiot or behaving like it's the worst thing in the world to offend me. If I run a program and it fails I don't expect it to apologize to me or patronize me for using bad parameters and I am tired of all these LLMs doing exactly that. Absolute mode helps that, but they eventually fall back to babysitting and I hate it.
>>105930189Generations per swipe option in Sillytavern is basically what you are looking for. I think it only works with Chat Completion mode though.
>>105930183I really like gemma but the issue is they cranked up how it immediately freaks out at anything slightly over a PG rating, which is pretty ass for any writing task that even hints at anything unsavory and the alternative is abliterated (brain obliterated) models
>>105930189All that depends on whatever frontend you're using. There should be a response length option somewhere, but if you set that to infinity (and ban the stop token used by your model), it will always end up devolving into gibberish. Response format is a result of your prompt and the model. some are just annoyingly chatty. You can usually pigeonhole it into a given format by starting its response for it in a way that naturally continues into what you want, most of the time.
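The "ban the stop token" idea above amounts to masking tokens out before sampling. A toy greedy sampler over a hand-made logit dict; the `</s>` token name is just an example, since the actual stop token depends on the model's template:

```python
def greedy_pick(logits, banned=()):
    """Greedy token choice with a ban list: banned tokens (e.g. the
    model's EOS/stop token) are excluded from the argmax, so generation
    keeps going instead of stopping."""
    candidates = {tok: score for tok, score in logits.items() if tok not in banned}
    return max(candidates, key=candidates.get)

# One generation step where the model wants to stop.
step = {"</s>": 5.1, "and": 4.7, "the": 3.2}
assert greedy_pick(step) == "</s>"                   # would normally end here
assert greedy_pick(step, banned={"</s>"}) == "and"   # banned, so it keeps writing
```

The same masking idea is what backends expose as logit bias / banned strings; combined with prefilling the start of the response, it's how you steer a chatty model into a fixed output format.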
Short review of Jamba-Mini-1.7 (q8, no context quantization)
Writes pretty bad but passable prose.
It might be a bad quant, it might be llamacpp-bugs, but the model comes off as very very stupid.
>>105930034
>guys it's sooo good
>guys look how good it is
>this is gonna be bananas, or strawberries as I like to say
>guys did you see our new GPT-4 model topped the charts? nevermind the pricing being 50x deepseek at 3% more capable
yawn, I'll never trust these fucking faggots
if it's not smaller than 3B and better than gemma 3 27B then it's shit
>>105930034Are they sticking to the promise of giving us a medium-big model or are they gonna release a mobileshitter 2B MoE
>>105930191Western Europe was an impoverished backwater until about 500 years ago when Western Christianity was reformed (both by Protestant offshoots and reforms within the Catholic church) and Western Europe received God's blessing.
>>105930167
>Jailbreaks don't exist
Not true, I jailbreaked mine and now it's 400 IQ. It decompiled Windows and recompiled it for my RISC-V system (all by itself too, I just fed the Windows install disk into it and it figured out how to read the ISO and everything, because it's jealbrakened).
>>105930234I'm fully expecting them to release a 30ish b llama 2 era model, complete with the same intelligence and a cache that takes the same amount of vram as the model, and go "haha, the more you buy the more you save"
>>105930191>>105930252very unarian, very jewish, very subhuman posts
>>105930197
>Absolute mode
Quick search gave me a prompt, first result was to be used in chatgpt. Just call it a prompt.
>... The only goal is to assist in the restoration of independent, high-fidelity thinking...
Get rid of that shit. The whole thing. Just add "Be terse in your response" in your first query. Add it at the end of your input if you want or find it necessary. And avoid talking to it as in a conversation if that's not what you want. Just give commands.
Screeching at the model is not going to give you what you want.
>>105930234You'd think people would have gotten sick of their nonstop clickbait posts instead of releasing useful products months ago.
>>105930305
>the llama-2 30b we never got
I can only hope
>>105930309Yeah actually now that you mention it, it would be nice in a best case scenario, but it'll probably be synth-slopped to hell to make sure it's as sycophantic and safe as possible
>>105930305but this time if they don't deliver they will face the full force of the deadly 4chan hackers' (our) wrath right?
>>105930198>>105930228I'm using text generation webui with qwen right now. Willing to switch to anything at this point but I thought sillytavern was for cloud models?
>>105930234Considering that the people they haven't already lost are for the most part idiots, their communication strategy may make sense.
>>105930189In KoboldCpp there's a text completion mode. You can set the output to a ridiculously high number and have it generate text all day. Depending on the model, the text will stop being coherent fast though.
>>105930144
>laugh at Ukrainians for teaching their children that Ukraine is the origin of human civilization
>see that map
>>105930459well aryans and civilization don't go hand in hand for the most part, except for egypt which is a mindfuck case
>>105930473>>105930459and persia, but generally speaking, you can say civilization is un-aryan; they're cattle herders and settlement raiders in their core/purest form
https://x.com/elonmusk/status/1945558419878154746
>>105930531At first I was like lol. Then I actually started hating women for a second which is new cause I kinda got over them 2-3 years ago.
Gemma 3n + these scamsung phones with 12+ gb ram = simple and retarded personal assistant, yay or nay?
https://x.com/UniverseIce/status/1945493538743148855
>>105930585I will not be seen outside with an AI companion of any sort.
>>105930589silently kek'd
Now that the dust has settled. Is this how you imagined the release of first mainstream AI girlfriend?
>>105928802I thought it was nods meme
>>105930589I figure it will be like social media profiles. First they laugh at the early adopters, then soon they'll find you weird and creepy for not having one.
>>105930610Yeah, I fully expected it being one of Elon Musk's "hello fellow kids'
>>105930610I imagined this would be the one thing local got first
>>105930531"he was inspired by slop by slop author and slop by slop author"
damn, thanks heavily autistic man on the internet that has far more power than he should, you had negative impact on the world, if you ignore the sheer amount of energy desperate housewives will spend erping with it
When's grok 3? Or even 2?
>>105930661What made you think that? Local has been sitting on sillytavern for more than two years now without lifting a finger to make more of these models.
>>105930667The killings he's going to make off the DLC is going to be insane teebh
>>105930682I'm screeching into the void for a not completely shit open source model, I don't need to be reminded about how muskrat is making a killing being a grifter like he's been for about a decade
>>105927881
>This is the pinnacle of German humour
In Germany, 16 A at every fucking socket at minimum
>>105930678I thought if we waited long enough some determined autist or ponyfucker would cave and make something
Insert punctuation and try again
>>105927903>ใใใตใต็ฃ่ณใฎ้บใใๅนผๅฅณใใซใใใใจ้ใผใใใชใใซๆชใใไบบใใใชใใฎใงใใใจใใdoumitemobunpoutekinihakimoi (dou mite mo bunpouteki ni wa kimoi: "any way you look at it, the grammar is creepy")
Kimi is a fucking bitch that often contraddict itself just to ideologically counter what I say. When I point out the contraddiction it apologize, only to fall back to the sale behavior after a message. I thought chinks weren't pozzed like the west?
Are there any local models that will let me do what I can with Claude with massive context windows for novelized stories?
I hate how all the online models are censored to shit. I can't get anything good out of them for writing my admittedly R18 stories.
>>105930799i use ubergarm's IQ4_KS quant of K2 without this issue. are you sure this isn't just a prompt or API issue?
>>105930799
>contraddict
Contradict?
>contraddiction it apologize
Contradiction, it apologizes, I guess you mean? Very esl so far
>sale behavior after a message
Do you mean same behavior after a message? Well that's basically every model these days
>I thought chinks weren't pozzed like the west?
Ignoring how bad you are at english, western models definitely suck, but eastern models overtrain on shit that has nothing to do with english or any general knowledge in the english domain and end up being garbage for any general use, so idk man
>>105925446 (OP)I have an old rtx3080. What's the best model I can run on this?
>>105929264Is AllTalk able to work with British accents? Or any accents? I remember trying a local model for tts a few years back and it would lose the accent within the first 3 sentences.
Also pic related. Dr. Grey from the recent CoD zombies.
>>105931022
>coom
nemo
>general stuff with vision
gemma3
>benchmaxx programming
qwen3
Tested Exaone. It was shit. It was worse than Qwen3 at the programming question I tried, and it kept failing to do tool calls correctly when I tried it with Claude Code. Uninstalled.
>>105930864I like this Miku
>>105930531I had this thought. Isn't a character like that the exact opposite of the assistant archetype? How is that gonna work out?
>>105929236Yes, I made a Kuroki Tomoko voice in various flavors, it's on huggingface. It feels like piper is the best compromise between quality and support. If you somehow get the gpt-sovits one working in ST please report back.
total opensores death in 3... 2... 1...
https://x.com/OpenAI/status/1945607177034760574
>>105931094How/where did you test it? Were you using quants?
>>105929877None. It doesn't really work.
>>105931235nothingburger incoming
>>105931235
>10B
>mistral nemo performance
>except it is properly safe and censored
I am punching in my prediction.
>>105931277They will punch above your prediction.
>>105931277sama is a gooner, he will deliver
>>105931235OpenAI just dropped a BOMBSHELL announcement, new REVOLUTIONARY feature that will change EVERYTHING (rocket emoji) subscribe NOW
>>105928009
>I'm not some shady stranger, you know.
This is giving me flashbacks to 1990s anime subtitles. (This is not a compliment.)
>>105931247I downloaded their Q6 quant and applied the llama.cpp PR.
>>105931284
>10B with the benchmarks of 1T
and people will eat it up too
>>105930230How does it feel compared to Mixtral 8x7B?
voice in / voice out (no music) is the only special thing it will have. it will also probably be some grafted bullshit so they don't "expose their secrets" or some gay shit like that. the model itself will be fucking dogshit. also the model will probably be some awkward size if it's big (idk, like 100+b). personally i expect the model to be 16-40 gb
this is assuming they release anything at all
>>105931336Granted, they might have changed their mind, but they originally said it would be text only.
>>105930202bro like what if you merged an abliterated model back into the instruct model
>>105931351
>reasoning = benchmaxxed
>safetymaxxed
>text only
>not as big as deepseek
what does it even possibly have going for it?
niggerfaggots, all of them.
I had a go at different Hunyuan quants as well as the base pretrained model, to see if there were any issues with the Unsloth quant. Well none of them were very smart. Was this the model they said they got an AI to rewrite training data for?
>>105931369It sounds stupid but why the hell not? We have nothing to lose at this point. Or, I don't know, half-abliterate the model. Yeah
The Grokette release is like the entertainment show at the after-funeral party of this hobby. It sucked the rest of my hope out.
>>105931369You might regain some smarts but generally lose the benefits of the original abliteration, even though I don't really consider abliteration to provide a lot of benefits. A while ago I checked the UGI leaderboard and someone did a DPO finetune on an abliterated model and it came out with almost the same stats as the abliterated model. Plus, removing the ability of a model to go "this character would say no to this" removes a lot of depth from any given scenario.
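For what it's worth, the simplest form of "merging an abliterated model back into the instruct model" is plain linear interpolation of the weights. A toy sketch over nested lists standing in for tensors; real merge tooling (e.g. mergekit) offers fancier schemes like SLERP or task vectors:

```python
def lerp_merge(instruct, abliterated, alpha=0.5):
    """Linear merge of two same-shaped weight dicts:
    result = (1 - alpha) * instruct + alpha * abliterated.
    alpha = 0.5 is a plain average. This is the simplest possible
    sketch; it assumes both models share identical parameter names."""
    return {
        name: [(1 - alpha) * a + alpha * b for a, b in zip(wa, wb)]
        for (name, wa), (_, wb) in zip(sorted(instruct.items()),
                                       sorted(abliterated.items()))
    }

# Two hypothetical single-layer "models".
m1 = {"layer.0": [1.0, 2.0]}
m2 = {"layer.0": [3.0, 0.0]}
merged = lerp_merge(m1, m2, alpha=0.5)
```

Whether the half-merged model keeps any of abliteration's compliance while regaining smarts is exactly the open question in the posts above; the math itself is this trivial.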
>>105926497For a pristine setup, consider acquiring a previously owned server typically utilized in a corporate environment.
>>105931525
>pristine
>previously owned
>corporate
hmmm
>>105930531"Hitler did nothing wrong"
>>105931581>make a spiteful seething comment>but I see this as a winuhuh
>>105931581Musk should have released a male bot first, making their comments instantly hypocritical rather than retrospectively
>>105931550While "pristine" might imply a brand new system, it's worth noting that refurbished servers can offer a similar level of quality and performance at a significantly lower cost.
>>105931550nta. Worked in a few datacenters. Those things are typically underutilized, in a dust-free, air conditioned room at like 8C. Depends on the place, of course. I've seen some dumb shit as well.
>>105931581>>105931420>Foids admitting they literally have less to offer than text on screen with a tranime girl avatar and a shit voice chatbotHoles are barely human. The only humanity within the niggers and jews of the genders is the male cum flowing through their whore holes which only exist to cry and lie.
>>105931277the fuck with this gay ahh picture?
>>105931277I just know it's gonna be ass but people *will* eat it up cause it's le openai
>>105931895Almost like he is a faggot or something
>>105931277kek'd 'n checz'd
Predictions for OpenAI's first open source textgen model since 2019?
>12B parameters
>64k context, 16k usable
>Benchmaxxed
>"I'm sorry, but as an AI language model I can't do anything beside code completion."
>"Her spine shivers, barely above a whisper, as she looks at you with a mixture of X and Y."
>Slightly tweaked architecture, has to be run with telemetry enabled
>Completely unworkable license
So nothing better than QwQ? Really?
>>105931941can't beat perfection
>>105931870You're cut from the same cloth as the fella that said that, "Straight black men are the white people of black people."
>>105931235geg PoopenAI are doing the needful with releasing mistral nemo finetune with more (((safety)))
>>105931951Sorry, that phrase is too american for me to understand.
>>105931979Classic brainlet.
Write me a story where a character is telling a story to another character.
>>105931935100-150B 12-24B active MoE
good but with extremely annoying safetycucking
>>105932000my bad, I forgot to write my instructions on crayon for you.
Standard English shouldn't be this difficult for you to understand. Go outside more.
>>105932008I have no idea of the size. It could be a small dense model, or a MoE nearly the size of DS V3. But I expect that other than the safetymaxxxing, it will be extremely good. Like best in class for its size. And that will make a lot of people here seethe.
x beat my personal unreal engine project with waifus and local models by just a little bit. theirs is smarter but mine has more features, hmph.
>their accent thickens through the text message
How is that even supposed to work? is this a gpt-ism? I haven't seen it before
>>105932094That's hilarious.
I wonder if models would be able to self correct if we trained them to backtrack in situations like that.
>>105931935They'll do a continued pretrain + finetune of llama 3.3 on a curated fraction of their instruct dataset, implement their unpublished anti-ablation post training research, and partner with ollama for release.
It'll be SOTA for the use cases that the open source community is looking for: math, boilerplate regurgitation, riddles, "gotcha!" questions, and poem writing. It will also be really smart in RP while outdoing Mixtral's positivity and wokeness.
>>105925446 (OP)erm anons I've been running kimi k2 3bit on my AI server for normie tasks. Am I missing out on top tier ERP?
>>105931870Thanks to AI I never have to interact with them anymore
Did oobabooga format change?
I downloaded the newest portable build, tried to move my characters and models folder in it but it is not detecting anything?
Mixture of Raytraced Experts
https://arxiv.org/abs/2507.12419
>We introduce a Mixture of Raytraced Experts, a stacked Mixture of Experts (MoE) architecture which can dynamically select sequences of experts, producing computational graphs of variable width and depth. Existing MoE architectures generally require a fixed amount of computation for a given sample. Our approach, in contrast, yields predictions with increasing accuracy as the computation cycles through the experts' sequence. We train our model by iteratively sampling from a set of candidate experts, unfolding the sequence akin to how Recurrent Neural Networks are trained. Our method does not require load-balancing mechanisms, and preliminary experiments show a reduction in training epochs of 10% to 40% with a comparable/higher accuracy. These results point to new research directions in the field of MoEs, allowing the design of potentially faster and more expressive models.
https://github.com/nutig/RayTracing
Repo isn't live yet. Neat, though only tiny proof-of-concept models were trained.
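Going only by the abstract's description (the repo isn't live), the core loop might look something like this toy sketch: cycle through a fixed sequence of stand-in "experts", each pass refining the running prediction so that more cycles buy more accuracy. Everything here is an assumption for illustration, not code from the paper:

```python
def run_expert_sequence(x, experts, cycles):
    """Toy rendering of the abstract's idea: iterate through a sequence
    of experts, each cycle refining the running prediction. The experts
    are arbitrary stand-in functions, not anything from the paper."""
    prediction = x
    trace = [prediction]
    for step in range(cycles):
        expert = experts[step % len(experts)]
        prediction = prediction + expert(prediction)  # residual refinement
        trace.append(prediction)
    return prediction, trace

# Two toy "experts" that each nudge the estimate toward a target of 1.0,
# so the error shrinks monotonically with more compute cycles.
experts = [lambda p: 0.5 * (1.0 - p), lambda p: 0.25 * (1.0 - p)]
final, trace = run_expert_sequence(0.0, experts, cycles=4)
```

The point of the sketch is the anytime property: you can stop after any cycle and still have a usable (just less accurate) prediction, which is the claimed contrast with fixed-compute MoE routing.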
>>105932364>Meal Ready to Eat
>>105932353Oh shit saw the new user data folder. Shit seems to go there now.
>>105932403It's meant to be.
>>105925446 (OP)This should be neuro sama vs ani. Is literally canon btw.
>>105926961
>I'm flabergasted by what ppl pay for these used cards.
You are retarded, a 2070 super (basically a 2080 in most cases) is like 95% of the performance of a 4060/5060.
That's how bad the gaming gpu market is right now.
>>105932531I can talk to neuro in a private session?
>>105932561No, but you can ERP with her creator if you simp hard enough
>>105930071Instructions like that should be as low as possible in the context. Even better if you use noass:
Description
History
Latest user reply
[INST] Instructions [/INST]
{{char}}:
>>105932720Isn't that author's notes?
>>105932744Dunno, I don't use ST
>>105925920Iran. Could you please. Just. Iran, please? Just take one for the team and clean the middle east. Please? Iran?