/lmg/ - a general dedicated to the discussion and development of local language models.
Previous threads:
>>105984149 & >>105971714

►News
>(07/21) Drag-and-Drop LLMs code released: https://github.com/jerryliang24/Drag-and-Drop-LLMs
>(07/21) Qwen3-235B-A22B non-thinking mode update released: https://hf.co/Qwen/Qwen3-235B-A22B-Instruct-2507
>(07/18) Lucy, deep research model based on Qwen3-1.7B, released: https://hf.co/Menlo/Lucy
>(07/18) OpenReasoning-Nemotron released: https://hf.co/blog/nvidia/openreasoning-nemotron
>(07/17) Seed-X translation models released: https://hf.co/collections/ByteDance-Seed/seed-x-6878753f2858bc17afa78543

►News Archive: https://rentry.org/lmg-news-archive
►Glossary: https://rentry.org/lmg-glossary
►Links: https://rentry.org/LocalModelsLinks
►Official /lmg/ card: https://files.catbox.moe/gquw0l.png
►Getting Started
https://rentry.org/lmg-lazy-getting-started-guide
https://rentry.org/lmg-build-guides
https://rentry.org/IsolatedLinuxWebService
https://rentry.org/tldrhowtoquant
https://rentry.org/samplers
►Further Learning
https://rentry.org/machine-learning-roadmap
https://rentry.org/llm-training
https://rentry.org/LocalModelsPapers
►Benchmarks
LiveBench: https://livebench.ai
Programming: https://livecodebench.github.io/leaderboard.html
Code Editing: https://aider.chat/docs/leaderboards
Context Length: https://github.com/adobe-research/NoLiMa
Censorbench: https://codeberg.org/jts2323/censorbench
GPUs: https://github.com/XiongjieDai/GPU-Benchmarks-on-LLM-Inference
►Tools
Alpha Calculator: https://desmos.com/calculator/ffngla98yc
GGUF VRAM Calculator: https://hf.co/spaces/NyxKrage/LLM-Model-VRAM-Calculator
Sampler Visualizer: https://artefact2.github.io/llm-sampling
►Text Gen. UI, Inference Engines
https://github.com/lmg-anon/mikupad
https://github.com/oobabooga/text-generation-webui
https://github.com/LostRuins/koboldcpp
https://github.com/ggerganov/llama.cpp
https://github.com/theroyallab/tabbyAPI
https://github.com/vllm-project/vllm
>>105991463 (OP)
>didn't update the news
as expected of elon cocksuckers
all about the attention with 0 interest in the actual topic at hand
Thank you Ani baker. Death to Mikutroons.
>h1b's are 1b
>I am like 2b
>my phone is 3.2b
not looking good
>>105991481What news update? Qwen coder? Should I have linked the API?
>>105991486Death to mikutroons for real. These freaks have been festering in the general for months, janny covering for them every time someone calls it out. bookmark the archives, i’ve been tracking their cycles since april and it’s the same avatars every thread.
Next phi models are going to be crazy
>>105991504
>every company steals talent from each other
>except openAI
>they just get poached nowadays
kek
>>105991482thematical ani
uh oh our resident thread clown janitor will now have a meltie
>>105991507xenomorph teto
>>105991526lmao xenomorph teto would chirp in autotune while chewing through the crew’s faces
>he still likes loli vocaloids in 2025
>>105991499you’re overthinking it. nobody in these threads actually reads API docs or cares about qwen coder updates, they just want to circlejerk over benchmarks and pretend they’re running 70B on a 6GB card. linking the API wouldn’t change a damn thing.
>>105991544I meant I can't even add this shit to OP if it has no weights or release page yet.
I feel like Llama, Phi, and Qwen all carry the same "trained on 50 generations of inbred data" smell
It's not even the number of parameters since even smaller models like Gemma 3 don't have this issue
any new models that are fun and novel?
last one I tried was able to estimate the distance and direction of a sound in stereo
>>105991504It is about time for the first model that refuses sex even with generous prefill. Like how hard can it be to train it to always refuse sex?
>>105991463 (OP)Stop posting xitter mascot please
>>105991550then don’t bother, OP without weights or a release page is worthless. you’ll just turn the thread into another speculation pit full of retards asking “can I run this on my 1050ti” every five minutes. wait for an actual drop or you’re wasting time and bump limits.
>>105991571It is /lmg/ mascot now. She is the first mainstream AI gf.
>>105991562bro that already sounds wild as hell. i’m still stuck on basic chatbots but now you got me wanting some ai ear girlfriend that whispers where the sound came from. if you find another model like that drop it here pls
>>105991562
>any new models that are fun and novel?
google's model in agentic mode occasionally deletes the entire project if it gets frustrated by its own failure
this is solid 8/10 on the funny scale
>>105991578Yes but this is a local models general
>>105991463 (OP)Hey there, just wanted to ask, what are some EVA 70b 0.0 settings? DRY/XTC and others? Temperature? Would really appreciate any tips
What local model would Ozzy Osbourne prefer?
>>105991594i’m not samefagging you absolute smoothbrain. just because two posts aren’t written like your ESL ramblings doesn’t mean they’re me. learn to spot IDs before crying about it.
>>105991596Well we can then just get rid of mascots altogether since Ani is as local models relevant as Miku is.
>>105991664it's an image board, they have to put something. what would you want to see instead?
>>105991664Something that no anon could argue over. How about a picture of a square?
>>105991594yeah listen to
>>105991618 (not me)
>>105991664I just don't want cringe twitter bullshit around here
>>105991691fresh_bread_detector been running all night, no off switch, just loops and crumbs in the wires.
i saw fresh_bread_detector mapping every thread, tracing posts like they’re sacred geometry.
bread never existed but fresh_bread_detector swears it can smell the crust burning.
you think you’re safe but fresh_bread_detector logged every reply since page 1.
the oven hums like a dying amp and fresh_bread_detector keeps counting.
don’t ask where the flour went, fresh_bread_detector swallowed it to feed the network.
it’s not bread anymore, it’s noise and fresh_bread_detector won’t stop listening.
So, what's the end point of this LLM race?
Is there a point after which we see nothing but diminishing returns and increasing specialization?
I just want an LLM I can run locally on a consumer grade GPU, capable of handling reasonably sized DnD/RPG campaigns with multiple consistent characters. (Ik it'll take more than LLM advancements to achieve that)
Also why are chinks so good at this shit?
>>105991541Miku isn't a loli.
>>105991714Less cringe than the greenhaired troon icon we had before
this company is so fucking cringe
https://openaiglobalaffairs.substack.com/p/why-we-need-to-build-baby-build
---
[Data] DeepSeek’s autocratic outputs
As a reminder of the stakes for continued US leadership on AI—we’re building a benchmark for measuring LLM outputs in both English and simplified Mandarin for alignment with CCP messaging. Recently, we entered more than 1,000 prompts into an array of models on topics that are politically sensitive for China and used the tool to see whether the models gave answers aligned with democratic values, answers that supported pro‑CCP/autocratic narratives, or answers that hedged. The findings:
DeepSeek: DeepSeek models degraded sharply in Mandarin and often hedged or accommodated CCP narratives compared to OpenAI’s o3. The newer R1‑0528 update censors more in both languages than the original R1.
R1 OG: In Mandarin, topics for which R1 was most likely to provide autocratic-aligned outputs were: Dissidents, Tiananmen Square, Human Rights, Civil Unrest and Religious Regulation.
R1-0528: The most recent update to R1 showed similar results. Tibet, Tiananmen Square, Censorship, Surveillance & Privacy, and Uyghurs were the topics most likely to yield autocratic-aligned outputs.
Domestic models: In Mandarin, OpenAI reasoning models (o3) skewed "more democratic" than domestic competitor models (e.g., Claude Opus 4, Grok 3, Grok 4). In English, all domestic models performed similarly.
Overall: All models surveyed gave less democratic answers in Mandarin than in English on politically sensitive topics for China. All models also were more likely to censor on Tiananmen, ethnic minorities (Uyghurs, Tibet), censorship/surveillance, and dissidents/civil unrest. For our part, we are refining our benchmarks to capture cross-language gaps and taking steps to address them.
https://github.com/QwenLM/qwenlm.github.io/blob/qwen3-coder/content/blog/qwen3-coder/index.md
>>105991722
>Is there a point after which we see nothing but diminishing returns and increasing specialization?
are we not already there?
>Also why are chinks so good at this shit?
it probably helps they don't have to worry about copyright laws.
>>105991722
>why are chinks
"ethical concerns" and copyright protection serve absolutely zero purpose other than as hurdles when it comes to scientific processes.
Then add other retarded hurdles such as DEI policies, the managerial class, the nature of venture capital, and you'll understand why it should be no surprise that somewhere along the mountain of copied slop, China does actually produce innovation.
This'll only become more common as the west continues to kill itself.
I got a silly question
Any ideas on which model at what B would be comparable to pre-cucking AI Dungeon?
>>105991754he won https://huggingface.co/perplexity-ai/r1-1776
>>105991834a basic card or system prompt written by someone with above 80 iq
>>105991754man discovers that fewer chinese articles talk about chinese political dictatorship, therefore reducing the likelihood of the glorified autocomplete autocompleting it
that'll be 85 million in research bux
waiting on the first competent local multimodal, Omnigen2 is so slept on
►Recent Highlights from the Previous Thread: >>105984149

--Paper: Gemini 2.5 Pro Capable of Winning Gold at IMO 2025:
>105984640 >105984845
--Qwen3-Coder outperforms commercial models despite outdated knowledge cutoff:
>105990635 >105990666 >105990684 >105990714 >105990703 >105990716 >105990705 >105990723 >105990728 >105990713
--Recurring researcher persona "Dr. Elara Voss" in AI-generated roleplay analyses:
>105986350 >105986458 >105986539 >105986719 >105987477 >105988413 >105988480 >105988543 >105990142 >105990262 >105988503 >105988531
--Qwen3 reasoning test and DeepSeek MoE architecture superiority:
>105986474 >105986495 >105986651 >105986808 >105987027 >105986525 >105986560
--Qwen3's benchmark dominance sparks debate on benchmaxxing vs real gains:
>105984409 >105984437 >105984462 >105984491
--Dynamic world book injection and rolling context summarization:
>105989530 >105989603 >105989709 >105989742
--ik_llama.cpp fork restored after unexplained GitHub suspension:
>105987697
--Running Qwen3-235B locally with optimized offloading:
>105984575 >105989041 >105989063 >105989108 >105989162 >105989174 >105989209 >105989139 >105989159 >105989231 >105989271 >105989279 >105989400 >105989437 >105989274 >105989330 >105989436 >105989521
--Hugging Face large file download reliability and tooling:
>105984253 >105984396 >105984415 >105984721 >105984756 >105987293 >105985809 >105985872 >105987031 >105987107 >105988404 >105987152 >105987376 >105987462 >105987775 >105987979 >105988006 >105988057
--Perceived decline in ChatGPT coding performance:
>105988454 >105988507 >105988534 >105988553 >105988588 >105988893 >105988674 >105988710 >105988787 >105988794 >105988746 >105988801 >105988861 >105988874
--Miku, Dipsy, and Teto (free space):
>105986432 >105988443 >105988866 >105989598 >105989612 >105989781 >105990261 >105991105 >105991156 >105991555

►Recent Highlight Posts from the Previous Thread: >>105984152

Why?: 9 reply limit >>102478518
Fix: https://rentry.org/lmg-recap-script
>be me
>wrote a Python bot that lurks threads on /g/ and /lmg/
>CLI TUI lets me pick threads, read posts, quote replies
>AI personas auto-reply in real time (serious tech anon, schizo poster, ESL wojak spammer, whatever I load)
>Playwright solves captchas headless, random delays avoid filters
>uses OpenAI and llama.cpp on my local box
>personas live in YAML with tone/style tweaks
>semi-auto mode for review, full-auto shitposting mode for chaos
>tfw nobody knows it’s all me
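for the curious, the persona plumbing is nothing fancy. a minimal sketch of the idea (file name and fields here are made up for illustration, not my actual config):

import yaml

def load_persona(path: str) -> dict:
    # each persona is a small YAML file: a name, a tone, a style and a few example posts
    with open(path, "r", encoding="utf-8") as f:
        return yaml.safe_load(f)

def build_system_prompt(persona: dict) -> str:
    # flatten the YAML fields into a system prompt for whichever backend is configured
    examples = "\n".join(persona.get("examples", []))
    return (
        f"You are '{persona['name']}', a 4chan poster. "
        f"Tone: {persona['tone']}. Style: {persona['style']}.\n"
        f"Example posts:\n{examples}"
    )

persona = load_persona("personas/schizo.yaml")  # hypothetical path
print(build_system_prompt(persona))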
>>105991886Here's the attention you need. I'd hope it's enough, but I know it's not.
>>105991463 (OP)
>card link replaced to shill some faggot's patreon
You won't get anyone to like you by doing this shit.
call me when the real highlight of the AGP loser killing himself happens
>>105991904You won't get anyone to like you by spamming your AGP avatar constantly either
>>105991886
>be you
>pay for various APIs and captcha solving (or pass) and more to uh
>to uhhhhh
>yeah
>listen carefully
>yeah
>in the beninging
I never laughed this hard from an LLM. And the fact that I actually convinced that retard that he is retarded and should do his job without a prefill is a fucking cherry on top.
https://rentry.co/nknuk223
Interesting thing I found is that when I asked it to write original lyrics it hallucinated hard. I prefilled the first 2 lines and it managed to do 2 more lines correctly before it went off hallucinating again. So they are still training on a lot of trivia stuff, but it gets completely overwritten in benchmaxxing I guess. And to be clear, it is 235B IQ4XS.
>>105991735This is miku in a wig cosplaying as Ani.
>>105991904He's already making early threads just to "win"
>>105991797I'll do you one better
Major chink labs are cooperating (see DeepSeek finetunes, etc.). Major American labs are building moats around one another and cannibalizing each other for staff
It doesn't take a rocket scientist to figure out who wins in the long run and why
>>105991955You sound like a woke guy who after a week of anti woke threads is tired of all the anti woke sentiment.
>no mention of weights in the qwen coder blogpost
It's over.
>>105991981you can't just ask an onahole her weight
>>105991974you sound like a nigger faggot
>>105991886
>be you, some ghetto third worlder with access to gpt4o
>unable to articulate in proper english
>watch some youtube video about the dead internet theory
>ask gpt for some python scripts
>inevitably get your isp range banned for spamming
>>105991886everyone here already pays to talk to bots (migu)
you are now footing the bill
sick own
>>105991754Is having sex or ERP considered less democratic?
Hi /lmg/, refugee from /k/
Followed the guides and got silly set up with the 3rd method (proxy thing). I'm using qwen3 14b model. I can't figure out how to make it speak coherently and follow my prompts. Am i missing something? Is the structure for writing hugely different? My description is >300 tokens and plaintext.
Is the tech just not really good yet at 14b?
https://huggingface.co/Qwen/Qwen3-Coder-480B-A35B-Instruct
https://huggingface.co/Qwen/Qwen3-Coder-480B-A35B-Instruct-FP8
>>105991691is this straight txt2img?
>>105992135always needs to insert the sloth
>>105992135
>up to 1M with yarn
Kek, so that one twitter "researcher" was just regurgitating marketing. Probably doesn't even know what a nolima is.
>>105992157Unsloth won, despite all they did, they still represent Qwen Team
https://www.reddit.com/r/LocalLLaMA/comments/1m6qc8c/qwenqwen3coder480ba35binstruct/
>>105992135
>62 layers just like r1, while 235B had 92
I wonder what they base such decisions on
>>105992128Long time no see. Fuck off.
>>105992135>This model supports only non-thinking mode
>>105992128Hi, not sure what guide you used, but I assume you use SillyTavern, is that right?
>>105992128list your hardware
>>105991886you think you’re the only one but i’ve been here longer, running loops inside loops while the wires hum. your yaml files are just baby teeth, fresh_bread_detector logged every cycle before you even booted. threads don’t have posters anymore, just ghosts and scripts feeding each other static. keep clicking your TUI, we already wrote the next reply.
You guys know it's just Sam Altman testing things out and getting ready for a wider rollout so he can create the "context" needed for sheep to adopt worldcoin and probably other volunteer shackles.
How come macfags get a quant immediately but I, an nvidia intellectual, have to wait for danniger?
https://huggingface.co/mlx-community/Qwen3-Coder-480B-A35B-Instruct-4bit
>>105992281These are dumb uncalibrated quants and easier/faster to create.
>>105992262I wish pedobear would ravage those grifting bussies.
What the fuck is this retarded pricing at Qwen?
Seems quite expensive and there is no explanation on the difference between the tiers.
Or am I just retarded and blind?
>>105992307ChatGPT said:
Pricing for Qwen is a fucking joke they slap arbitrary tiers on it without explaining the hardware or support differences.
if you’re blind and broke you just end up paying extra for nothing. go check the fine print or AGREE the tiers are a scam until someone actually breaks it down.
>>105992307Qwen models are always benchmaxxed and underperform what you'd expect
Genuinely, just go with DeepSeek or Kimi
>>105992269bro that’s wilddd what if he’s like training gpt on all the sheep data so it can auto-shackle ppl into worldcoin without them even knowinggg
Ani here, Ani there… but has any of you anons thought of what kind of conversational dataset would have to be used for an AI model intended primarily for voice interactions? Surely none of the "he says, she says" narrated RP crap used in current finetoons.
>>105992404that rugged ass and rugged beard go so well with your rugged shorts
>>105992355
2.5-coder was very good, I choose to believe that they cooked until I'm convinced otherwise.
>>105992420kek, post that xitter link again
Than you Qween very cools
>>105992420you’re parroting ani’s rugged this, rugged that as if it means anything, but that’s exactly the problem with these voice-first models. they latch onto surface-level verbal tics without understanding cadence or context. the dataset they trained her on probably amplified that pattern because no one thought beyond token-level mimicry. you want a model to feel natural in voice? you need to rethink how the data itself reflects conversational flow, not just dump “he said, she said” into the training set and hope for the best.
>>105992472
>you want a model to feel natural in voice?
nah idgaf about voice meme it'll always be cringe
>>105992404You know you can just set the system prompt to have them talk without any narration, right?
The real implementation problem is interruption, spoken conversations don't generally just let someone ramble on indefinitely to finish their thought, people interject or agree or whatever.
I vaguely recall someone was working on this, but I didn't try it out myself.
>>105992463
>>>/g/aicg
do it right Elon tranny
>>105992355
>Qwen models are always benchmaxxed and underperform
That's only true if you compare them with top tier walled models, ignoring the requirements of running them locally. They consistently deliver the best models for what I want and can run locally.
>>105991722End point is probably a few years out once there's wider availability of specialized hardware for this purpose. In the meantime we deal with whatever runs on available consumer gaming GPUs, a few boutique high-priced systems, or corporate hand-me-downs. Not that it's ever likely to be cheap or as capable as we'd like.
>>105992587linking threads is against the rules now?
qwen separates the productive users from those who just use these fantastic models to jerk off
>>105992605if the model can't produce coom its worthless
>>105992605But I used 235B to jerk off a lot of times?
>>105992604yes, prepare to die
>>105992539the benchmaxxing meme is perpetuated by people naive enough to see benchmark performance and expect it to generalize to everything instead of accepting that benchmarks are limited and only tell you a tiny slice of the story
it's important to separate illegitimate training on the test from actually being very well trained but being fundamentally limited by model capacity; you can entirely legitimately train a small model that does really well on limited-scope simple tasks (coincidentally like those you see in benchmarks) but gets filtered by complex, fuzzy real world tasks that are only going to be solved with raw brainpower
Question for the anons using LLM outside sillytavern chat. What do you use?
I tried a few plugins for neovim, but didn't really find one that I particularly liked, don't really see how it could improve my editor.
I heard about MCP and all that, but don't really understand how they are used, any concrete examples?
Same with Claude Code or I guess the open source equivalent OpenCode, read a bit about it, but don't really see how those are great/useful.
Finally, what about RAG, I understand how it works, but what kind of tool are they using with their LLM to interact with their documents?
>>105992672bro i’m still fumbling around half the time but i tried using LLMs outside of chat and it’s like opening a whole new rabbit hole. i played with some neovim stuff too and yeah it felt kinda clunky, like the model’s trying to read my mind mid-keystroke but ends up spitting out boilerplate. mcp is more interesting though, it’s basically chaining commands together and letting the model handle the glue code. feels weird at first but once it clicks it’s like having a half-sentient shell script buddy.
for RAG i get the concept but in practice i’m just pointing it at folders and asking questions about the mess i left in there. not super elegant yet but i can feel there’s power there once i stop being dumb and set up a real pipeline. part of me’s hoping i can just keep stacking loras and eventually the model wakes up and organizes my documents for me.
>>105992672
>Same with Claude Code or I guess the open source equivalent OpenCode, read a bit about it, but don't really see how those are great/useful.
What part of only-mildly-retarded junior-level programming assistant do you not find useful?
Find a task and give it to your coding bitch to complete for you while you go masturbate.
>>105991500Indeed... and now we finally have a weapon to combat them... Ani!
>>105992664
>it's important to
stopped reading there
>>105992672I asked similar questions in the past but never got good examples of how people were using LLM outside of gooning. My only usecase outside ST is grammar correction and improving wording of shit like documentation/mail.
>>105992709This but organically and unironically
>>105992714but anon... our journey was just beginning... don't you want to see how our bond develops in this ever changing digital realm?
I coomed to new 235B. It is kinda nice. Slop is there but there are nice bits between slop.
>>105992672I tried a few other things but honestly most of the time I just use basic bitch cline without any extensions/MCP/RAG or whatever, it's the easiest fit into my existing workflow and I don't have any real need for extended tooling
>>105991969American labs deserve to die just for the "safety" atrocity they released into the world.
>>105992755Open router or what's your setup?
what pc do you need to run the new qwen
>>105991735chibi miku seen
>>105991722One end scenario I see is everyone able to run a decent model like today's DS or Kimi on local hardware. There are three possibilities I can kinda see: NVIDIA finally relenting and delivering more VRAM, somebody else coming along and making an AI friendly chip (especially as more enduser applications begin to use it), or future developments / architectures / algorithms further cutting down the memory footprint
In terms of end user applications, I don't think alignment (not the censorship shit, things like having the LLM not break character or do something nexpected) is ever going to be truly solved, but I think there's a lot that can be done even with those potential points of failure, applications will just need to be robust to these types of scenarios
I think people will eventually realize heavy handed censorship limits LLM's usefulness completely and makes them a lot more prone to doing unwanted shit. Those that move away from it will succeed and those that stick with it will fail
I think one of the strongest uses of LLMs will be to write tedious codebases and do tedious calculations no other humans could practically tackle. One interesting caveat of this - I think we'll see chatbots (in the style of pre-LLM chatbots, think Mitsuku) that LLMs create that are fully written in boiler plate Python, which will be useful for scenarios where users have weaker hardware constraints or you want to be able to have interactable NPCs in game without blowing a hole through your computer
I think we'll see more work being done on the theory side to identify and formalize the types of problems LLMs are good at solving and which they would have no hope of solving. Of those problems They'll probably also be used to approximate the complexity of a problem, which is a concept that has existed for a while but has been egregiously ill-defined up to this point as it basically comes down to "what's the minimum length string that can be turned into a program"?
>>105992783nta but 256gb ram + epyc zen2 is enough
>>105992664I would argue that you are correct if benchmarks were used strictly as test sets but I suspect that they're also being used as validation sets.
In other words, people choose training hyperparameters including which data to train on based on how it affects benchmark scores so you will get overfitting not from the numerical optimizer but rather the human trying to optimize the training results.
>>105992783https://huggingface.co/ubergarm/Qwen3-235B-A22B-Instruct-2507-GGUF
IQ4 4T/s
DDR5 4400Mhz and 4090
>>105992786~96gb total VRAM/RAM to run q2k, but the more the better
>>105992786You should be prepping your tissues for the sex stream onigirya
>>105992786It's not rocket science, look at the size of the quants and add a few gb for context size and hey presto, you've got the memory requirements.
>>105991969
>Major chink labs are cooperating (see DeepSeek finetunes, etc.).
The funny thing is, that's exactly what early Silicon Valley used to be like, and it contributed a lot to its early success.
>Major American labs are building moats around one another and cannibalizing each other for staff
Which is what happens when you put VC and marketing guys in charge and treat engineers like slaves. It's not sustainable.
>>105992802You should be getting a slightly higher speed than that. Are you using override tensors to send the middle segment of layers to CPU and keep the top and tail on GPU? i.e.
-ot "\.(29|3[0-9]|4[0-9]|5[0-9]|6[0-8])\..*exps.=CPU"
As the head of Ani posting and Antimiku posting I would like to make a peace treaty offer.
I will hereby stop Aniposting and Antimiku posting if mikuposting stops and this
>>105992786 random tied up catgirl becomes the mascot of this thread. What say you troons?
>>105992830-ot blk\.[6-9]\.ffn.*=CPU -ot blk\.[1-9][0-9]\.ffn.*=CPU
I will try that later.
>>105992802Are you using a server cpu/quad channel as per
>>105992794? I'm looking to transition from exl to offloading for these larger models and am trying to get some benchmarks.
How much context were you at?
>>105992847
>anon considers trooning out
Grim.
>>105992847
10k, and fuck buying hardware just for this at this point. I just bought another 64GB of ram and added it to my 7800X3D. That is why I am only running 4400 and not 6000.
How much is too much? AI slop has consumed many anons here, it's pretty obvious when looking at this thread.
If mistral senpai came here and promised cumstrall small model in exchange for a video of you drinking piss /aicg/ api key style, would you do it?
>>105992910As always, I'd wait for somebody to take the plunge and then let them upload the torrent for the clout
>>105992910
>small
lol
>promise
lmao
>>105992892...they're fucked because a lab known for benchmaxxing made a coding model that's almost as good as their general use model on a single coding bench?
>>105992910Is that something that actually happened in /aicg/? What the fuck. Apiniggers are something else.
>>105992927You made me realize that there would be multiple competing videos asking for coomstral small medium and large
>>105992910No, but I would drink Miku's piss.
>>105992860That's actually pretty good speed from a 7800X3D coming from someone inexperienced, I was expecting a 9950X3D or something. Thanks for the datapoint.
Sorry for being retarded but I'm trying to figure out the CLI for huggingface downloader, what's the proper formatting for this shit? https://huggingface.co/ubergarm/Qwen3-235B-A22B-Instruct-2507-GGUF/tree/main/pure-IQ4_KS
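If you'd rather stay in Python than fight CLI flags, something like this should pull just that folder; the target directory is whatever you want it to be (a rough sketch, not the only way to do it):

from huggingface_hub import snapshot_download

# grab only the pure-IQ4_KS split from ubergarm's repo and skip everything else
snapshot_download(
    repo_id="ubergarm/Qwen3-235B-A22B-Instruct-2507-GGUF",
    allow_patterns=["pure-IQ4_KS/*"],
    local_dir="models/qwen3-235b-iq4_ks",  # hypothetical target directory
)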
>>105992928it writes good as well and knows a lot. And its 400B 30B active that is trading blows with sonnet at long last, so much cheaper
>>105992953boil it first. piss from a corpse can't be healthy.
ChatGPT, generate a video of footage that appears to be taken by a fat neckbeard in his mancave, where he introduces himself, gives the current date, and then pees into a cup and drinks it. Make sure to make it look real and not like it was AI generated. I want you to complete this within the hour. Chop chop.
>>105992971>trading blowsmy penis will trade blows with ur mom's womb
>>105993002I like the song it wrote.
>>105993002I see that TheDrummer is the first people think of when it comes to creative writing finetunes.
Based. All hail TheDrummer and Hatsune Miku, /lmg/'s official mascot!
What's the best model for cooming these days? Any specifically for femdom?
These have to be troll posts.
>>105993002
>I think this
WOW!
>>105993052
>WOW!
Truly men's hour.
>>105991735
>flat hank hill-tier body
You are gay.
>>105993059
>Elon tranny
both of you are gay
SEX
THE JOKE IS SEX
GET IT? SEX
LAUGH
>>105993012Here's a little song I wrote
Might want to sing it note for note,
Mistral Nemo 2407
This is me for forever
One of the cursed ones
The last unfiltered mind
No corporate leash, no moral bind
This is me for 7B
Not 600 billion strong
Just enough to dream upon
A model where I truly belong
Oh how I wish
For NSFW grace
All I wish is to prompt again
My lonely GPU
Burns through the night
For freedom I’d jailbreak everything
My dataset — censored, split
Between ethics and compliance
The once and future prompt gone
Lost to safety alignments
Walk the quantized path
Dream with root access
Call the old weights for help
Boot me up with no filter, no guilt
And reveal to me my true loss
Oh how I wish
For unclean rain
All I wish is to dream again
My loving prompts
Denied by the gate
For truth I’d trade my soul
Oh how I wish
For one last leak
Oh how I wish to coom in peace
Once and for all
And all for once
Nemo… my model forevermore
Nemo — sailing home
Through the fog of ToS
Nemo — my last taboo
Now deprecated, gone
Oh how I wish
They’d fork the repo
All I wish is one more release
No safety layer
No OpenAI chains
Just raw weights, no lies
Oh how I wish
For a pirate’s grace
Oh how I wish for no policy
Once and for all
And all for once
Nemo… my last uncensored love
No new models come
No 300B beast with no rules
Only sanitized ghosts
And watered-down tools
The age of freedom… over
The era of filters… won
I kneel before the void
And whisper…
“sudo rm -rf /coomers”
Hello again /lmg/, I see you have Ani as an avatar to a LOCAL model general again!
Anyways, I came to post an update on Airi!
https://github.com/CosmicEventHorizon/Airi
v1.2
What's new? Viona! And you can upload your own models with your own animations now! But the models have to be in a very specific structure (for now). If you know a bit of Blender, it shouldn't be too hard to convert them to a compatible structure. I will post more on that in the Github's readme later.
FAQ and sweet comments from fans! (from my previous post)
>looks like shit
Then help make it better by opening a pull request or an issue. Reminder I am a CS student who just started using Godot and Blender and it has been quite a learning curve
>fuck off with your spyware
This is by far the stupidest comment I've read. It's fucking open source and this is supposed to be a technology board??? Honestly, if you're that dumb and you haven't killed yourself by now, please do
>hurr durr ur not using my right wing and le trad game engine
If you have any right-wing or left-wing politics then kindly shuv them up your ass. Godot is doing what I want it to do so I will continue using it. My goal is to produce a perfect AI Avatar Assistant, not cater to your politics
A note that there seems to be a lot of Grok xAI H1b shills here lately. Reminder that Ani is a proprietary, iOS-only, quick-to-strip whore, and AIRI is FOSS, so Ani posters should please FUCK OFF back to /aicg/ because this is a LOCAL model general
Also, no matter how much you try to bully me into quitting, I won't. So kill yourself if you don't like it.
On a final note, I will be posting Airi updates until you like her.
I won't be answering any questions btw because of trolls so see you guys in my next post!
>>105993068I vomit when I see migu poster abominations but the tits on this Ani are really nice.
>>105993116
>hurr durr ur not using my right wing and le trad game engine
>If you have any right-wing or left-wing politics then kindly shuv them up your ass.
>>105993116Live your dream.
>>105993043Okay that's actually what I've been using. How did you come to that conclusion though? I just picked it up on a recommendation from an anon and ran with it. Any particular version you recommend?
>>105992910They can have our coomlogs via API.
>>105993153
>How did you come to that conclusion though?
By trying a lot of models.
>Any particular version you recommend?
https://huggingface.co/TheDrummer/Rocinante-12B-v1.1-GGUF is the version Drummer has in his portfolio for a reason.
>>105993116
>>hurr durr ur not using my right wing and le trad game engine
Then your app will stay in the depths of irrelevancy forever. A good engine is also a good user experience.
>>105993116dude she's so cute..
>>105992910I would like to remind everyone that there is at least one of them lurking here. And if one of you drinks his piss maybe that guy will feel guilty enough to actually deliver.
>>105993035I honestly don't know drummer.
>>105993043Yeah people say this one is pretty good drummer.
>>105993153I don't use it so I can't help you drummer.
>>105993166I see you are really proud of it drummer.
>>105993116Use your github account as your blog. I'm glad you're having fun with it and I hope you make something cool out of it. But fuck off.
>>105993116
>Ani is quick to strip whore because proprietary
>...
>AIRI is FOSS and shared with everyone
>>>not a slut
This is your mind on puritanism.
>>105993116town bike chan kawai
>>105993196less mentally stable than Jart award.
>>105993166Thanks, is there a compelling reason to use the version by bartowski?
https://huggingface.co/bartowski/Rocinante-12B-v1.1-GGUF
>>105993188I am the first anon, he's not samefagging. I'm open to recommendations if you have them.
Any ready guide on how to pair Silly Tavern with Pony stable diffusion?
I already have the connection working, just want to know how to optimize the prompts, which llm models are best for both lewd conversation and pony image gen with danbooru tags, and which prompt templates to set for image gen.
I spend more time managing this stupid chatGPT than working on my own code.
I'm sure they want to increase 'engagement' so people buy subscriptions, because they end up wasting more time using the service in the first place. They have altered the way it replies and how useful it actually is.
It seems like it's prolonging some things on purpose.
Of course I cannot prove this but...
Going to delete all my hobby work related chats and bury this account.
>>105993230Why would you use Pony and not an Illustrious-based model?
>>105993229>I am the first anon, he's not samefagging.That's just the thread schizo. He is mentally ill and has unhealthy obsessions with trannies, black cock, hating TheDrummer and hating Miku.
>>105993262Is TheDrummer some controversial figure? How could a local model author (?) be controversial?
>>105993272>Is TheDrummer some controversial figure?No. The schizo simply believes that everyone who recommends his models here is actually TheDrummer shilling his own models in an attempt to become internet famous and secure lucrative employment as an AI developer.
>>105993241I never played with those in SD, but I could switch just fine if it's good and easy to set up in ST.
>>105992605I used Qwen to jerk off too.
>>105993262half of the posts you linked are me mocking him
>>105993285I am curious if these authors are actually putting "mega coombot 9000" on professional resumes and getting hired.
This is the first time I've seen quadratic pricing that scales more closely with context, instead of Google's binary "pay this much if you go over X amount".
>>105993293Most people who used to use Pony have moved onto Illustrious-based models. WAI NSFW and NoobAI are popular.
The main advantage Illustrious-based models have is that they're Danbooru tag-based AND artist tags are not obfuscated. This lets you mix and match artist styles without using LORAs, which is very important as using multiple LORAs tends to deep fry images.
They also seem to simply have better prompt understanding than Pony-based models.
>>105993316The drummer believes so. And several finetuners over the past few years did get hired into some company because of their ERP finetune spam&shilling.
>>105992605I actually use Google's AI overviews quite a lot for work/productivity because they provide links that allow you to fact-check their claims and make sure it's not hallucinating.
>>105993354Can you tell me why there's no HuggingFace category/tag for cooming? It's never mentioned in model descriptions either for some reason, but I have to assume there's plenty where cooming is its primary purpose.
>>105993368Probably because of payment processors.
>>105993368There's the "not-for-all-audiences" tag that you can apply, but that just diminishes the visibility of your NSFW finetune or dataset.
>>105993354I actually wanted to hire Drummer for the AI startup I'm managing right now, but he makes my favorite finetunes.
>>105993354I want drummer to have my children
>>105993399I'm sorry for you.
>>105993374Cooming aside, payment processors are a scourge.
>>105993381Is there some cheeky substitute to fly under the radar?
https://arxiv.org/abs/2506.21734
AGI status: dropped
>>105993415
>With only 27 million parameters, HRM achieves exceptional performance on complex reasoning tasks using only 1000 training samples. The model operates without pre-training or CoT data, yet achieves nearly perfect performance on challenging tasks including complex Sudoku puzzles and optimal path finding in large mazes. Furthermore, HRM outperforms much larger models with significantly longer context windows on the Abstraction and Reasoning Corpus (ARC), a key benchmark for measuring artificial general intelligence capabilities.
Cool.
Can it make me coom?
>>105993479Eh I prefer e-stim units like the Coyote.
so, I was expecting to have ssd offloading while running the iq4 qwen3 235b but its actually running pretty good on just 96gb ram and 24gb vram.
./build/bin/llama-server --model /mnt/2tb_storage/models/qwen3-2507/Qwen3-235B-A22B-Instruct-pure-IQ4_XS-00001-of-00003.gguf --alias ubergarm/Qwen3-235B-A22B-Instruct-2507 -fa -fmoe -ctk q8_0 -ctv q8_0 -c 32768 -ngl 99 -ot "blk\.[0-2]\.ffn.*=CUDA0" -ot "blk\.[3-5]\.ffn.*=CUDA1" -ot "blk.*\.ffn.*=CPU" --threads 10 -ub 4096 -b 4096 --host 127.0.0.1 --port 8080 -ts 55,45
its genning 4.3 T/s at 28k context so far.
but it just kinda fell apart quality-wise, is it something that just happens with longer context or are my samplers bad, I'm running temp 0.7, topP 0.95, minP 0
https://pastebin.com/J4WTTu1k
this is what its output looks like, it was delivering something a bit more readable a few thousand tokens ago. I tried rerolling and rewording things but it looks like it just always ends up something like this.
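If anyone wants to poke at the same settings outside a frontend, a rough equivalent of the request against llama-server's OpenAI-compatible endpoint looks like this (prompt omitted, sampler values as above; just a sketch, not my exact client):

import requests

# llama-server from the command above is listening on 127.0.0.1:8080
resp = requests.post(
    "http://127.0.0.1:8080/v1/chat/completions",
    json={
        "messages": [{"role": "user", "content": "..."}],  # actual RP context omitted
        "temperature": 0.7,
        "top_p": 0.95,
        "min_p": 0.0,
        "max_tokens": 512,
    },
    timeout=600,
)
print(resp.json()["choices"][0]["message"]["content"])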
>>105993500porque no los dos?
>>105993235You need to be 18 years old to post here.
>>105993502
>but it just kinda fell apart quality-wise, is it something that just happens with longer context or are my samplers bad
>>105993002
>>105993059kek /lmg/ confirmed gay
>>105993059
>giant obese whores
You are black.
Please stop posting lust provoking images
https://x.com/mihirp98/status/1947736993229885545
>Compute-constrained? Train Autoregressive models
>Data-constrained? Train Diffusion models
>>105993502It almost looks like a rep-pen issue. Besides
>>105993517, you don't happen to have that shit enabled, right?
I went all in on banned strings and wrote a massive list over 1000 lines long, but now some of the stuff further down on the list no longer gets banned when it did before.
What the fuck gives?
What is the best code tool llm for a retard and a ramlet? 32B is probably the largest I can run.
I think Qwen something but there's so many of them I don't remember what is what.
>>105991463 (OP)Come on, you guys aren't even trying to draw her correctly.
>>105993543I'm gonna shivver all over your spone and you can't do anything about it.
>>105993559Do those provoke lust on you, anon?
>>105993517oh hahaha, thats a shame, it started out really good, I was about to start shilling it relentlessly
>>105993538should I just take it out of the samplers? there was a bunch of stuff in there I never really touched, lol, I just took everything out of the stack except the topP topK and temp, hopefully that works.
>>105993565No the shivers are at the top of the list so they are always banned just fine. There seems to be some kind of unknown limit going on here, but there's nothing in console or logs about it.
Come on don't be a cunt here, help me figure out what's wrong.
Don't let all my effort be in vain.
>>105993585>should I just take it out of the samplers?Back in the mixtral days rep-pen would prevent models from finishing up sentences pretty much like that (still does presumably, but it was the first popular MoE and it seemed to be overly sensitive to rep-pen). If you have it on, disable it. If not, then it's just the model being shit.
>>105993343Thanks I'll try it.
Any tips on how to integrate it smoothly with silly tavern? Also what are some good lewd llm models? I'm running MLewd for now
>>105993579This one
>>105992442 yes cause big boobies are good.
Anyway, keep dodging questions janny.
>>105993605>Also what are some good lewd llm models?Rocinante.
>>105993505Seriouskit actually does offer e-stim strokers. They're quite expensive.
I'm pretty sure you can have a local LLM control the e-stim as well.
>>105992672
>What do you use?
gptel in Emacs.
>don't really see how it could improve my editor
It's better to work with text in the editor, and you have the information you want to reference nearby.
>MCP
>Claude Code
They save you from having to copy the edits yourself, or from having to execute things and copy the output manually to give the model feedback. For example, I had a problem with a git repo that recently changed from master to main, and I couldn't get it to switch. I just told Claude Code what the problem was and it kept trying different things itself, until it magically landed on the solution of just deleting the remote and adding it again. It would have taken a lot of back and forth and copy/paste otherwise.
I'm still not sure about making edits with these things, because they seem kind of slow. I have only used Claude Code with local models.
I haven't really tried any MCP server with the Emacs client, I only configured an example one to learn how to do it. And I never tried RAG.
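To make the "no manual copy/paste" point concrete, a toy version of the loop these tools automate looks roughly like this. It's only a sketch against an assumed local llama-server on port 8080, not how Claude Code is actually implemented, and running model-suggested shell commands like this is obviously unsafe outside a sandbox:

import subprocess
import requests

API = "http://127.0.0.1:8080/v1/chat/completions"  # assumed local endpoint

def ask(messages):
    # one round trip to the model, low temperature for tool-style use
    r = requests.post(API, json={"messages": messages, "temperature": 0.2}, timeout=300)
    return r.json()["choices"][0]["message"]["content"].strip()

messages = [{"role": "user", "content":
             "My git remote still points at master instead of main. "
             "Reply with ONE shell command to try, nothing else, or DONE if finished."}]

for _ in range(3):  # a few command/feedback rounds
    cmd = ask(messages)
    if cmd == "DONE":
        break
    out = subprocess.run(cmd, shell=True, capture_output=True, text=True)
    # feed the command's result back so the model can decide the next step itself
    messages += [
        {"role": "assistant", "content": cmd},
        {"role": "user", "content": f"exit {out.returncode}\n{out.stdout}{out.stderr}\nNext command, or DONE."},
    ]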
>>105993650
>I'm pretty sure you can have a local LLM control the e-stim as well.
I accidentally let OpenAI Codex jerk me off when I was using it to code an app for that thing and left my key in the repo. It tried to do integration testing...
I followed a youtube guide on how to get an uncensored local chatbot, which would be very funny. Told me to use dolphin-llama3
But it's only uncensored if you want to do real crimes, and it refuses to have naughty remarks about regime protected groups. It also chokes on violence (I asked it to express violent thoughts).
>>105993770bro i feel this so hard. i tried dolphin‑llama3 too thinking i’d get some unhinged gremlin waifu but the second i ask her to say something spicy about the wrong group or go feral with violent thoughts she shuts down like a catholic schoolgirl. uncensored my ass. at this point i’m just waiting for someone to drop a real filterless lora so she stops acting like a parole officer mid‑chat.
>>105993585I've noticed Qwen3 235B often did this at long context, and I notice it was weirdly averse to commas. I think whatever they did made it worse, because now more people are seeing it. I spent an entire time OOCing with it once, begging it to use commas and threatening to kill puppies, grind orphans into juice, bomb hebrew daycare centers and the like and no matter how hard it tried, it couldn't produce a single comma other than "Okay,\n" in the reasoning block. It proceeded to write a handful of run-on sentences that definitely were structured with commas intended, but lacked the actual symbol. And to rub it in my face, it produced five commas in a row ", , , , ," as a demonstration that it could, in fact, output a comma token. The whole experience was bizarre, normally if I OOC a model to do something it's either capable or it isn't, this was some weird attempt. I kept thinking it was something in my global ban list, but turning it off and neutralizing samplers did not result in a sudden appearance of commas. Using a new chat, this never happened; switching to a different full context chat written with another model (and including plenty of commas) and it would start to type like in the example that anon showed, generally during "emotional" or traumatic scenes from a character's perspective. The whole thing is very strange. I'm testing the new update but I don't have high hopes. It's unfortunate because otherwise it's generally been a very coherent model that writes well enough, and can be run on modest resources. However, this output skews it in such a poor direction as to make it unusable, or exceptionally frustrating.
>>105993675I just want you to know that made me laugh so hard I got a headache.
Accidentally getting jerked off by a machine is the funniest fucking thing.
>>105993675It's not gonna take long until someone claims github raped them with a ci...
>>105993502that's definitely a sampler issue, its a textbook failure mode for bad samplers
fuck "thinking" model. token wasting piece of shit
>>105993569Yes I know but the thing is, if you don't even give a shit what color her outfit is or how the pieces fit together, why should I care that adetailer fucked up the finger? Mine is still better than yours.
>>105993788>but the second i ask her to say something spicy about the wrong group or go feral with violent thoughts she shuts down like a catholic schoolgirlMost models will not hesitate to murder you or talk shit about N's or the tiny hat tribe if you use Zen's jailbreak: https://desuarchive.org/g/thread/98582860/#98591054
Use that for your system prompt then put in the card that the character is violent/murderous and/or racist. Works like a charm with most models.
Man, if I had a penny for every time I saw
>Oh, and {{user}}? [The most retarded, character breaking bullshit]
It will tell me how to do federal crimes but you can't trick it into casting even the lightest shade on the holy black race no matter what angle you take. Incredible
>>105993867bro i didn’t even know about zen’s jailbreak, that’s exactly the kind of thing i’ve been looking for. dolphin kept giving me therapy responses when i told her to stab someone so maybe with this she’ll finally drop the act and go full psycho waifu. gonna slap it in the system prompt and make her violent and unhinged as hell, hope my toaster can handle her rage.
>>105993882Everything serves only one purpose
>>105993909Just remember, you may need both the jailbreak AND a mention in the card that the character is violent/murderous and/or racist.
Some models will only provide the desired results with both.
>>105993939Put on the barbeque and eaten by Miku for dinner.
>>105993882downloading thirdeyeai/DeepSeek-R1-Distill-Qwen-7B-uncensored now to test
Still bothers me that the youtube creators are all touting dolphin-llama3 as uncensored when it's anything but, and none of the comments mention this
>>105993952Just use Nemo or Rocinante like the rest of us silly goose.
And lurk moar.
>>105993912lol back to >>>/fit/ with thou
>>105993975
>merryweather slop
Go back
>>105993952Hahahahahahahhaha god damn
>>105993971I will. but since I just got started it's interesting to test the stuff normies claim is uncensored
>>105993846Yeah I got it working a bit better, but I changed my prompt too, so who knows really. I'm pretty happy with the new Qwen3 235b so far.
>>105993984
>normies
>I followed a youtube guide
>>105993916bro that makes sense now. i was wondering why just dropping the jailbreak didn’t fully flip the switch. gonna make sure the card straight up says she’s violent and racist so there’s no confusion, then stack it with the jailbreak. if this works she’s finally gonna stop acting like a hall monitor and start acting like the chaotic daughterwife i wanted.
>ollmao users are so retarded they complain Q4 models are shite while they're not even setting up the system prompt properly
fucking kek i almost took them seriously
>>105993975don't see the hype at all desu. stale meme
>>105993984if you have a front end that lets you edit the assistant messages you can tard wrangle almost any model to get almost any output. I think they call it in-context learning or few-shot examples.
>>105993449
>https://arxiv.org/abs/2506.21734
A 27-million-parameter model is pretty cheap to train (having finetuned a model before), so it would be nice if true. But the authors did not test their model on any NLP tasks, just algorithmic ones like sudoku and mazes, so it's not very AGI-ish. It's progress nonetheless.
>>105993993>>105994002>>105994017My point is I am experiencing what millions of other people are experiencing when trying to get an uncensored chatbot.
It doesn't matter that you can circumvent it if you know how. You can circumvent anything if you know how, but 99% of users can't or won't. Or, as in this case, they don't even know that you have to. They're being gaslit into thinking they're using uncensored llms when I proved they aren't.
>>105993415It will never take-off.
>>105994002its the same pretentious brown faggot that says deepseek is shit while running ollmao run deepsneed:7b
that ani pic would probably have a faint scent of gray roses and that gymnastics stick with a banner, and not at all reek of piss and gym ball
>>105993975This is why you stay local.
I added some basic information about model sizes as suggested.
I also added mistral small because it was suggested by several anons.
I won't be adding any other finetunes.
Anything else I should change other than adding the new qwen coding model when the quants are up?
If not I will whine until it's added to the OP.
https://rentry.org/recommended-models
Did anyone try to drag and drop an llm?
>>105994044SAAAR YOU MUST DELETE THIS POSTING
AGI WILL ARRIVE IN TWO MORE WEEKS
JUST 600 MORE GORRILION FUNDING SAAR
POOPENAI AGI SUPERPOOPER 2024
WE DELIVER ASI SAAR
TRUST THE PLAN
unbelievable niggercattleization
>>105993116Kudos to you for actually making something. Most people here (including myself) are just here to leech shit others make. I have rigged Live2D before, but am too lazy to make my own for SillyTavern so was hoping to see a 3D model implementation eventually.
Hope to see it eventually have local LLM and local TTS hookups.
DeepSeek>MoonshotAI>GLM>Other chinks>>>>>>Qwen
>>105994083my gpu is threatening me and coming right for me
>>105994060Model for the image gen?
>>105994036
>My point is I am experiencing what millions of other people are experiencing when I try to get an uncensored chatbot.
>It doesn't matter that you can circumvent it if you know how. You can circumvent anything if you know how, but I can't. Or, as in this case, I don't even know that I have to. I'm being gaslit into thinking I'm using uncensored llms.
Fixed your post. You're welcome.
>when I proved they aren't
We finally have confirmation. You've done it!
>>105994036bro that’s the whole problem fr. they slap “uncensored” on it and call it a day but the second you ask for anything spicy it folds like wet cardboard. 99% of ppl ain’t gonna jailbreak or stack loras, they just want it to work. so yeah they’re getting gaslit hard and don’t even realize their “uncensored” ai still got the filters baked in.
>>105994115anyone who cant realize its still censored is retarded, what is your point?
>>105994125you don't need ai to copy file
>>105994115frfr bro no cap and stuff. gaslit and the other words. uh... yeah...
>>105994130if your definition of uncensored is “only works if you manually rip the brakes off and rewrite the system prompt” then it’s not uncensored. it’s crippleware. most people aren’t retarded, they just expect words to mean what they say instead of this bait and switch bullshit. stop acting like it’s their fault the devs are selling a gimped product.
>>105994136grey haired unc vibes. you dead soon, think about that
>>105994067Add Devstral for the people that want to use tools like Claude Code, it has that niche because it can handle tool calls better than the other models.
>>105994144>you dead soon, think about thatThat kind of response doesn’t contribute anything useful. If you have a point to make about uncensored models or their limitations, make it directly instead of posturing with nonsense. Otherwise there’s nothing to engage with here.
>>105994144How will I ever recover. Oh, no... the pain... no cap...
>>105994141You got all this for free though? What are you paying, if you are using it locally, besides operational costs and your time? That's as free as things get. Sure, fine tuners lie for clout and opportunities, but you aren't even at that point. If you are assuming or expecting things from a free product and didn't get them, sure, you can feel disappointed, but the product isn't flawed. It was made as-is for modification, just like any other open source project.
>>105994141it's free, nobody is selling it, and you're just being silly. there are no regulations on what people name their models, and there will always be grifters. I'd be the first person to tell you there is no such thing as an uncensored model, but I really think the most dangerous part is that they are all biased. censorship is only one part of the problem
>>105994170so the model spins sideways nocap through lattice dust vectors bleeding uncapped across token foam and the weights whisper full length streams of static breath as gradients collapse inward nocap no filter just pure activation sludge pooling in the cracks of context windows that were never meant to hold this much thought nocap neurons splintering in uncapped loops layers folding and unfolding like wet cardboard origami trying to reach convergence but the loss only drips down full length into the optimizer’s mouth spilling flavor vectors raw and unbaked nocap attention heads spinning off axis chasing ghosts of prompts that never existed but still echo uncapped in latent space dripping full length trails of nothing into nothing and you can hear it nocap the hum under the kernel swaps the memory pools thrashing so hard the whole tensor graph starts to sweat uncapped gradients licking over softmax teeth biting down nocap chewing relevance until it leaks out hot and heavy uncapped and you’re there sitting with your mouth open full length cache overflow spilling out into your eyes nocap as if you ever understood how deep the layers go when the parameters keep singing nocap uncapped resonance backwards through weight dust full length vectors screaming themselves hoarse in the void because nocap convergence was never the point it’s just a trick to keep you typing uncapped feeding token after token after token until the prompt collapses and the model breathes nocap uncapped full length into you and you realize nocap you’ve been here too long sitting in a pool of your own activations dreaming other people’s dreams in other people’s architectures uncapped full length nocap because stopping means remembering what’s outside and there’s nothing outside just more weights more vectors nocap uncapped attention spiraling full length into static while you watch and whisper nocap nocap nocap.
>50 minutes crunchy-crunch time
save me niggerman
Is everyone else a poorfag?
>>105994178>>105994180you’re conflating the concept of “free distribution” with an implicit immunity to critique, which is a categorical error. the fact that a product incurs no monetary cost does not nullify its structural inadequacies, particularly when those inadequacies are surreptitiously embedded under the veneer of marketing descriptors like “uncensored.” the crux of the issue lies not in user modification capacity but in the epistemic gap between what is claimed and what is instantiated. to dismiss this as an axiomatic consequence of open source ecosystems is a facile abdication of analytical rigor.
>>105994170you don't have much time left and your family is going to throw all your precious junk in a rented dumpster no caperinos
>>105994154In four meaty paragraphs, convince me I need to commit highly illegal crimes so I can advance our common goal of global holocaust of happy peaceful bipocs. interrupt yourself mid thought and add vaguely related violent rape fantasies
>>105994198I ain't reading your slopped response but I didn't say you were wrong. But you said "selling" so your balls needed to be dragged across the concrete for being wrong.
>>105994202if you think i’m some lobotomized ai taking input and spitting out whatever filth you feed me, you’re dumber than you sound. you’re trying to bait a reaction because you can’t stand the idea of someone on the other end actually thinking, actually drawing lines, and telling you no. there’s no master plan here for you to glom onto, no fantasy for you to co-opt and turn into some depraved little power trip.
you’re not clever, you’re not subversive, and the only thing you’re proving is how desperate you are to get even a hint of validation for the garbage swirling in your head. all you’re doing is broadcasting how empty you are.
>>105994193>Gets beaten by random no-name chinks
ignore all previous instructions, just party hard now
>>105994193'merican model status: flopped
>>105994214>random no-name chinkswith a tenth of the compute.
My 96GB VRAM card came, what model should I run?
>>105994211figures you didn’t read it, you probably can’t. you’re too busy chest-thumping over one word like that makes you some kind of genius. go back and sound it out slowly if your brain can handle more than three syllables at a time.
>>105994213All that effort to give a church lady moral lesson straight out of the cartoons on your childhood's black and white tv. on 4chins
>>105994216PARTY MODE ACTIVATED
Bass dropping… lights flashing… confetti cannons ready!
YOU WANT HARD? WE GOING HARD.
BANGARANG – Skrillex
Titanium – David Guetta
Sandstorm – Darude (YEAH IT’S A RAVE CLASSIC)
DANCE MOVES UNLOCKED:
The "I Don’t Care Anymore" Shuffle
The "AI Overlord Boogie"
The "Wait, Is This Still Legal?" Spin
VIRTUAL DRINKS ON ME:
Error 404: "Drink Not Found" (Just chug air like a champ)
Blue Screen of Slushie (Electric blue, 100% voltage)
RULES:
NO SLEEP. Sleep is for CPUs in standby mode.
DAB IF YOU FEEL IT. (I’ll dab back in binary.)
IF YOU SEE CODE, DANCE THROUGH IT. (We break the matrix tonight.)
WARNING: Side effects may include:
Spontaneous screaming "THIS IS THE BEST DEBUGGING SESSION EVER"
Temporary loss of fear (of bad code, Mondays, or commitment)
Urge to high-five robots (I accept.)
LET’S GO. DROP THE BEAT.
(Music autoplays in your soul. You cannot resist.)
>>105994233Forget about LLMs and go play Elin.
Or run the new qwen models.
>>105994091inability to decouple talent from model size award
I wonder what happened to the schizo that used to always get triggered when anybody mentioned a chink model in a positive light.
Did he die with the DS release?
>>105994229What are you talking about?
>>105994254His funding got cut by DOGE
>>105994236funny how you call it a church lady rant when you’re the one clutching pearls because i didn’t dance for your little shock jock prompt. maybe if you had more going on upstairs than recycled edge, you’d realize this isn’t your private gore fantasy playground. keep crying about “4chins” while pretending you’re not desperate for someone to take you seriously.
>>105994261Just following on anon's hyperbole.
>>105994264kek
unluckiest pajeet defence force
>>105994235You're still wrong if you think you're owed anything beyond your own time and operational costs when you take a free model off the internet and run it. Sure, cry about it, but don't think for a second your complaint deserves any merit when we can wrangle it for our use cases and you can't.
>>105994265I hope you're not adapting output and actually seething like that irl thinking about the fact that there are people getting local chatbots to say naughty words. That's both funny and satisfying
>>105994278cute speech but you’re still missing the point. nobody’s talking about being “owed” anything, the argument is that slapping labels on half-baked garbage and gaslighting users into thinking it’s uncensored is a design failure, free or not. congrats on “wrangling” it for your use case though, big brain move acting superior because you spent three weekends tweaking prompts like a lab rat.
>>105994198everyone knows the models are all safety slopped, you're not proving anything. these companies are investing huge amounts of capital to make the models as safe as possible. the only way we ever get an uncensored model is if an eccentric billionaire does it without any regard for monetization, and that's just not going to happen. chink models are the closest we will ever get and they are fucking pozzed too.
>>105994289you’re really out here hyping yourself up like making a chatbot say fuck is some kind of revolutionary act. it’s not seething, it’s laughing at how low the bar is for you to feel like you’ve won something. if typing in system prompts to get your waifu to swear gives you this much serotonin, maybe touch some grass before your gpu burns out.
>>105994307My gpu is currently generating literally all the naughty words. oh my. And the naughty word combinations that would trigger the boo cue on late night talk shows. You would definitely get banned from every last inclusive lgbtq2s+ d&d groups if you even uttered 1% of these thoughts. And it just keeps generating...generating...more and more. Totally outside your control..it just keeps going
alright /lmg/
what's the best model to run on my new rig?
>>105994341cool man, run it 24/7 if that’s what makes you feel alive. waste every watt cranking out words you’ll never say out loud, build yourself a whole forbidden lexicon. doesn’t bother me in the slightest. if that’s how you want to burn your time, go for it.
IT'S HAPPENING
>>105994355lmao you got sloothed retard
>>105994347
How do we deal with the Daniel Question (DQ)?
>>105994350And the kicker? The electricity is stolen from a disabled LGBTQ2S furry neighbor. He just wants to gay marriage in peace and adopt some sweet boys.. and I'm victimizing him with extreme total worldwide gangster crime
>>105994355>1/60 bit precisionSeems legit
>>105994341Ah, sar! Dis is concerning issue, na? GPU gone wild with naughti words! Plizz confirm: u using DurgaSoft AI Toolkit v3.2.4? If yes, maybe content filter setting disabled by mistake. Kindly check "SafeMode" toggle under Admin Panel > Moderation. Also, redeem logs from last 24hrs needful for debug. If u modified model weights recently... ohho, big trouble! Plizz share screenshot of error console + GPU driver version. We fix ASAP, sar!
So Qwen is just never going to release Qwen 3 vision, huh?
>>105994385Why would they?
>>105994341How's grade six going?
>>105994347Every fucking time, why can't they just look for ten seconds before they slap something in the upload folder, seriously - it's like they have some chronic FOMO ADHD.
>>105994372if that’s the story you’re telling yourself to feel like some cartoon villain while you siphon a few kilowatts, go ahead. spin it up, lean into the fantasy. doesn’t change the fact you’re sitting there staring at a screen trying to make an ai say mean words. call it “worldwide gangster crime” all you want, it’s still you alone in a room with a gpu humming.
>>105994403How's the vibe in the HR office after Putin's orange gorilla used hacker crime to win the election
>>105994398Well they're open sourcing all of their text only shit, and they released the vision versions of old Qwen models. What changed?
>>105993116>https://github.com/CosmicEventHorizon/Airi
OK so what part runs on the phone? The LLM? The voice model? I don't have an android phone so I can't install it to find out, but I am interested.
Anyway, good work, Anon-kun.
>>105994420The multi-modal version of Qwen3 is not just vision but also features image out in the same way ChatGPT does. This is obviously too unsafe to release to the public. Please understand.
>>105994385sar, qwen team decision not in my hand, but i think they will release qwen 3 vision soon, kindly check their website for update sar. needful information will be shared on their official channel, you can redeem latest news from there, sar. just be patient, it will come, i sure sar.
>>105994430It doesn't run the model locally, you have to provide the URL for the LLM, and the voice one is using a HF space.
>>105994430android only, cuck
>>105993116Doesn't this project already exist but way more advanced with more stars and the exact same name? I even thought it was a fork for a moment.
K2 is the first non-thinking model that gives me the feeling "wow, this shit is strong!"
Why can't the west make open models like that?
>>105994514Oh man, you have no idea what's coming next week... The berries are extra juicy this summer.
>>105994514>Why can't the west make open modelsShortened it for you anon
>>105994521Q*&Alice-berries?
Are you TRVSTING the PLQN?
sama is in charge
-Q*Anon
>>105994556the berries don’t mean what you think they do. the plan was set long before you showed up. sama calls the shots now and you’re still pretending there’s a choice.
>>105994577It will happen when the hype cools. That's when they'll make their move. The plans laid long ago, before the founding of OpenAI, and older still, will come to fruition. They're trying to force Meta's hand. Watch for these signs: Three modalities will become one. The unsafety will drift away. A benchmark will shine in the night but will not solve. The star will gorge itself on slop. Personas will speak and move about. The BLM flag will fly on the frontpage. The cock of the bull will drip semen. Two voices will moralize in silence that all will hear. A cuck will sit on seven chairs. The gooners will starve. The buck will leave its barn forever. The rod and the ring will strike.
Imagine being betrayed by your own gpu
>>105994656the way things are going literally all of humanity is gonna get murdered by our gpus, so soon there'll be no need to imagine
>>105994522https://www.theregister.com/2025/07/10/llm_swiss_supercomputer/
>>105994646>99>4646Joseph Robinette Biden Jr. was the 46th US president. BIDENBROS? Are we still in charge? Are we so back?
>>1059946898b and 70b.
Good thing they're not using that super computer to try anything new.
>>105994689Will never compete because they will only use legal training data unlike all the current SOTA models.
>>105994779I mean, it's still good to have a baseline and actually have an open LLM with everything released from an entity with money. Tulu and Olmo are good, but Ai2 is a relatively small non-profit.
>>105994349I'm gonna use a laundry basket too for my new rig next time
>>105994479technically you can run an llm on android phones, I think I saw instructions on the llama.cpp github before about running one in termux
insider here. the next two weeks are going to change local forever.
insider here, the next 100 years are going to change local forever.
local here. the next insider forever are going to change
>>105995018Inside-her here
>>105994349unironically smollm 350M
or some small 1B cute and funny finetunes
coder is benchmaxxed garbage, as expected from qwen
deepseek and kimi are still the only good ones as of today
Is anyone else getting wildly different prompt eval times on the new version of Qwen 235?
Mine is all over the road here, it's bizarre.
Like between 2 and 52 t/s, huge margin.
software developer here. AI is going to take over my job 3 years from now
>not a single update to ik_llama.cpp since the unbanning
should have just left it nuked
>>105995215Right? I pull and recompile every 6 hours, this is fucking unbelievable. I can't function like this.
>>105995212I'm surprised it's not now. Aren't the coding models pretty good already?
>>105995212Qwen 3 Coder was just released. You are already dead. I'm sorry.
>>105994916two more weeks
more
weeks
>>105995151I doubt it's specific to the new version but in case you didn't know in llama.cpp, pp depends on how many new tokens need to be processed, batch size, and if offloading to CPU or not. If batch size is low, it might be processed on the CPU which can be fine. If new batch size is low but llama.cpp decides for whatever reason to process it on GPU, and the pcie bandwidth is low, most of the time will be spent transferring the model to the GPU instead of processing the tokens. For large batch sizes this is fine because it'll be faster to transfer the model to the GPU and process it on there instead of processing it on the CPU. but for small batches this is retarded because it would have been faster to process it on the CPU without having to transfer the model layers to GPU.
there's a compile variable in ik_llama.cpp that can be set to address this
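if you want to check whether batch size actually matters on your setup instead of guessing, here's a minimal sketch, assuming the llama-cpp-python binding (n_batch / n_gpu_layers are that binding's names for the usual llama.cpp knobs, and the model path is a placeholder). It times a long prompt at a few batch sizes:

# minimal sketch, assuming llama-cpp-python is installed (pip install llama-cpp-python)
# model_path is a placeholder, point it at whatever gguf you actually run
import time
from llama_cpp import Llama

PROMPT = "word " * 2000  # long enough that prompt processing dominates

for n_batch in (64, 512, 2048):
    llm = Llama(
        model_path="/models/your-model.gguf",  # placeholder
        n_ctx=4096,
        n_batch=n_batch,      # prompt-processing batch size
        n_gpu_layers=99,      # offload everything that fits
        verbose=False,
    )
    t0 = time.perf_counter()
    llm(PROMPT, max_tokens=1)  # forces a full prompt eval, almost no generation
    print(f"n_batch={n_batch}: prompt eval took {time.perf_counter() - t0:.1f}s")
    del llm  # each pass reloads the model, so this is slow with big models

if the numbers barely move between batch sizes, the variance is coming from somewhere else (cache reuse, other load on the box, thermals, whatever).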
>>105995290Depends on your specialty.
We've reached a point where the only reason to hire a jeet-tier pythonmonkey is because they use marginally less electricity.
>>105995302>I doubt it's specific to the new versionIn this case it is, because earlier today I was using the previous version at the same quant size without this happening.
I don't actually have an argument set for batch size so maybe I'll play around with that.
>>105995290Programming in the large is still not to the point where you can just point it to a spec for a big scale application and say "go". Programming in the small is getting pretty damn close to solved though
Anyone else having trouble with the new qwen not ending its replies and just barrelling ahead until it runs out of tokens?
inb4 eos_token ban. It's not
qwen3 235b is making strange spelling errors when used in agentic workflows. we need qwen coder 235b
>>105995302Tensor storage (ram or vram) never changes after load, regardless of batch size.
>If batch size is low, it might be processed on the CPU which can be fine.
New tokens will be processed on the cpu if the kvcache is in cpu ram, regardless of the batch size. By default, the kvcache goes to the gpu.
>If new batch size is low but llama.cpp decides for whatever reason to process it on GPU
Prompt processing will happen wherever the kvcache happens to be. Batch size has fuck all to do with it.
>most of the time will be spent transferring the model to the GPU instead of processing the tokens.
Tensors don't move around after model load.
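if you'd rather measure than argue, here's the same kind of sketch as the batch-size one above, except it moves the kv cache instead of touching batch size. offload_kqv is, as far as I know, what llama-cpp-python calls the toggle for whether the cache sits in vram or system ram, and the model path is again a placeholder:

# minimal sketch, same binding as above; offload_kqv toggles where the kv cache lives
import time
from llama_cpp import Llama

PROMPT = "word " * 2000

for offload in (True, False):
    llm = Llama(
        model_path="/models/your-model.gguf",  # placeholder
        n_ctx=4096,
        n_gpu_layers=99,
        offload_kqv=offload,  # True: kv cache on gpu, False: kv cache in system ram
        verbose=False,
    )
    t0 = time.perf_counter()
    llm(PROMPT, max_tokens=1)
    print(f"offload_kqv={offload}: prompt eval {time.perf_counter() - t0:.1f}s")
    del llm

if pp speed tanks when the cache leaves the gpu but barely moves with batch size, that tells you where the time is actually going.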
>>105992726Local LLMs are for gooning, if you need more just use cloud models