
Thread 105698912

385 posts 84 images /g/
Anonymous No.105698912 [Report] >>105703188
/lmg/ - Local Models General
/lmg/ - a general dedicated to the discussion and development of local language models.

Previous threads: >>105689385 & >>105681538

►News
>(06/21) LongWriter-Zero, RL trained ultra-long text generation: https://hf.co/THU-KEG/LongWriter-Zero-32B
>(06/20) Magenta RealTime open music generation model released: https://hf.co/google/magenta-realtime
>(06/20) Mistral-Small-3.2 released: https://hf.co/mistralai/Mistral-Small-3.2-24B-Instruct-2506
>(06/19) Kyutai streaming speech-to-text released: https://kyutai.org/next/stt
>(06/17) Hunyuan3D-2.1 released: https://hf.co/tencent/Hunyuan3D-2.1

►News Archive: https://rentry.org/lmg-news-archive
►Glossary: https://rentry.org/lmg-glossary
►Links: https://rentry.org/LocalModelsLinks
►Official /lmg/ card: https://files.catbox.moe/cbclyf.png

►Getting Started
https://rentry.org/lmg-lazy-getting-started-guide
https://rentry.org/lmg-build-guides
https://rentry.org/IsolatedLinuxWebService
https://rentry.org/tldrhowtoquant
https://rentry.org/samplers

►Further Learning
https://rentry.org/machine-learning-roadmap
https://rentry.org/llm-training
https://rentry.org/LocalModelsPapers

►Benchmarks
LiveBench: https://livebench.ai
Programming: https://livecodebench.github.io/leaderboard.html
Code Editing: https://aider.chat/docs/leaderboards
Context Length: https://github.com/adobe-research/NoLiMa
Censorbench: https://codeberg.org/jts2323/censorbench
GPUs: https://github.com/XiongjieDai/GPU-Benchmarks-on-LLM-Inference

►Tools
Alpha Calculator: https://desmos.com/calculator/ffngla98yc
GGUF VRAM Calculator: https://hf.co/spaces/NyxKrage/LLM-Model-VRAM-Calculator
Sampler Visualizer: https://artefact2.github.io/llm-sampling

►Text Gen. UI, Inference Engines
https://github.com/lmg-anon/mikupad
https://github.com/oobabooga/text-generation-webui
https://github.com/LostRuins/koboldcpp
https://github.com/ggerganov/llama.cpp
https://github.com/theroyallab/tabbyAPI
https://github.com/vllm-project/vllm
Anonymous No.105698922 [Report]
►Recent Highlights from the Previous Thread: >>105689385

--Optimizing multi-GPU/RPC model execution via tensor offloading and memory tuning:
>105693780 >105693814 >105693828 >105693870 >105693890 >105694096 >105694121 >105694200 >105694168 >105693900 >105693919 >105693920 >105693933 >105693968 >105693987 >105694020 >105694045 >105694032 >105694144 >105694265 >105694431 >105694487 >105694501 >105694515 >105694568 >105694828 >105694834 >105694890 >105694997 >105695037 >105695042 >105695123 >105695182 >105695506 >105695533 >105695631 >105695668 >105695683 >105695741 >105695906 >105696219
--Gemma model size suggestions and distillation technique debates in response to Google's feedback request:
>105690177 >105690247 >105692953 >105693027 >105690529 >105690541 >105690614 >105690642 >105690399 >105691354 >105690248
--Quant testing with IK-llama shows promise but faces CUDA and performance challenges:
>105692033 >105692197 >105694719 >105694758 >105694804 >105696513 >105696562 >105694000 >105694047 >105694101 >105694179
--Court rules AI training on books legal, but storage of pirated copies infringes copyright:
>105691671 >105691690 >105691810 >105691825 >105691865 >105692000
--Google's Gemini Robotics VLA released with limited access and mixed robotics capability expectations:
>105691639 >105691715 >105691734 >105691721 >105692142
--Unexpected GPU performance discrepancies in token generation benchmarking:
>105696010 >105696048 >105696333 >105697419
--Agentic framework limitations and needs for local large language models:
>105692962 >105692985 >105693006 >105693025 >105693060 >105693849
--llama.cpp gains high-throughput mode for improved performance:
>105692045 >105694048 >105694098
--Skepticism around chatllm.cpp and llama.cpp for accurate model inference:
>105691150 >105691439 >105691580
--Miku (free space):
>105696371 >105696546

►Recent Highlight Posts from the Previous Thread: >>105689390

Why?: 9 reply limit >>102478518
Fix: https://rentry.org/lmg-recap-script
Anonymous No.105698938 [Report] >>105700218
lesbian girls are /lmg/ culture
Anonymous No.105698940 [Report] >>105698956 >>105699010
how difficult is training a lora, and does it really have high vram requirements?
What about a lora vs a fine tune?
I want to try adding some question-and-answer style text blocks to mistral large quants, to both increase knowledge and reinforce the answering style. I have 48gb vram
Anonymous No.105698956 [Report] >>105698974
>>105698940
What front end are you using?
You don't necessarily need anything special...
Anonymous No.105698974 [Report] >>105699018
>>105698956
oobabooga, it's the vram I am most concerned about before I spend ages getting a load of data formatted and ready to train
Anonymous No.105699010 [Report] >>105699040
>>105698940
>lora ... to ... increase knowledge
Abandon hope. You're also not going to be able to finetune (whether with LoRA or full finetuning) Mistral Large (123B parameters) with 48GB of VRAM.
Anonymous No.105699018 [Report] >>105699028
>>105698974
You don't need anything special in order to change the model's output. I don't know about oobabooga but in SillyTavern you can slot in "Examples of dialogue" (which are tokenized in a certain way)
><START>
>{{user}}: simple question
>{{char}}: simple answer
><START>
>new example...
It's just convenient in ST as it has its own slot and is hidden from the user, but you could just add this to your prompt.
So include an example conversation and repeat it a couple of times.
Anonymous No.105699028 [Report]
>>105699018
Or this could be inserted into your system prompt.
Whatever works, as long as it gets submitted to the model itself in an understandable format.
Anonymous No.105699040 [Report] >>105699078
>>105699010
can you train a lora on the quantized version though? if that runs in 45gb, can you also train a lora in 45gb?
I don't have extremely high hopes of "teaching" it much, but I would be interested to see if it is an improvement at all
Anonymous No.105699078 [Report] >>105699159
>>105699040
Right now it's only possible to finetune models in 4-bit at the minimum with QLoRA, so you'd need 60+ GB for the model weights alone.
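For scale, here's roughly what a minimal 4-bit QLoRA setup looks like with the usual transformers + peft + bitsandbytes stack (a sketch only; the model id and LoRA hyperparameters are placeholders, not a recommendation):

# minimal QLoRA sketch; assumes: pip install transformers peft bitsandbytes accelerate
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

model_id = "mistralai/Mistral-Small-3.2-24B-Instruct-2506"  # placeholder; Mistral Large 123B won't fit in 48GB even at 4-bit

bnb_cfg = BitsAndBytesConfig(
    load_in_4bit=True,                       # QLoRA: frozen base weights quantized to 4-bit NF4
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)
model = AutoModelForCausalLM.from_pretrained(model_id, quantization_config=bnb_cfg, device_map="auto")
tokenizer = AutoTokenizer.from_pretrained(model_id)

model = prepare_model_for_kbit_training(model)   # gradient checkpointing, norm casting, etc.
lora_cfg = LoraConfig(
    r=16, lora_alpha=32, lora_dropout=0.05,      # placeholder hyperparameters
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_cfg)
model.print_trainable_parameters()               # only the small adapter trains; the 4-bit base stays frozen

Even in 4-bit, 123B of weights is about 60 GB before optimizer states, activations and KV cache, which is the point above: 48 GB of VRAM rules out Mistral Large regardless of LoRA rank.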
Anonymous No.105699159 [Report] >>105699171 >>105699223
>>105699078
ah lame, I guess the new mistral small might be a better candidate then?
Anonymous No.105699171 [Report] >>105699178
>>105699159
Just tweak your goddamn prompt. I swear to god people like you don't even try.
Anonymous No.105699178 [Report] >>105699210 >>105699223
>>105699171
I know how to tweak prompts retard I want to investigate lora training as a comparison for the sake of learning
Anonymous No.105699210 [Report]
>>105699178
Okay sorry :3 I think you are full of shit.
Maybe test with a small model first to get your bearings.
Why would you even want to begin with 123B model in the first place?
With 14B you could get fast results and see what you are actually doing.
Anonymous No.105699223 [Report]
>>105699178
If only it were that easy. You'll probably need dozens of data exposures with different wording at a large enough rank to make the model truly internalize the knowledge and not just parrot it if it sees the same question(s).
Even a rank 1 QLoRA is enough to make the model memorize verbatim limited amounts of information, but that doesn't imply at all that it will be able to organically use it without hallucinating details or simply making stuff up completely.
Mistral Small >>105699159 would be a better choice for these experiments, or even better a smaller model, so that finetuning attempts take less time.
Anonymous No.105699229 [Report] >>105699260 >>105699273
>>105698699
>>105698742
I'm on Linux. And the only thing I changed was -server instead of -cli.

I noticed that the CPU cores were running at approx 80% (I isolated 8 cores for the purpose) in the case of the server, while they were at 100% in the case of the CLI.

GPU load is the same in both cases.

I see no reason why there should be a difference
Anonymous No.105699260 [Report]
>>105699229
I don't know, maybe this has something to do with your kernel. Or the way llama.cpp has been compiled.
Way over my pay grade (which wasn't too much to begin with in the first place).
Anonymous No.105699273 [Report] >>105699479
>>105699229
If I was you I would compile a new kernel and go through the settings and double check things.
Then have a backup ready if something goes wrong.
I haven't bothered with linux in years though, last time I compiled a kernel it had a text UI.
Anonymous No.105699378 [Report] >>105699408 >>105699417 >>105699419 >>105699449 >>105699566 >>105699596 >>105699793
hmmm..
https://huggingface.co/tencent/Hunyuan-A13B-Instruct-FP8
Anonymous No.105699408 [Report] >>105699416 >>105699422 >>105699455
>>105699378
?
Anonymous No.105699416 [Report]
>>105699408
Ask your mother
Anonymous No.105699417 [Report] >>105702734
>>105699378
>32768 ctx
Anonymous No.105699419 [Report] >>105699442 >>105699455
>>105699378
>https://huggingface.co/tencent/Hunyuan-A13B-Instruct-FP8
Anonymous No.105699422 [Report] >>105699537 >>105699945
>>105699408
strange, it's gone now but it was:
80.4B params total
13B active
Anonymous No.105699442 [Report] >>105699455
>>105699419
81B parameters total, 13B active, has shared experts.
Anonymous No.105699449 [Report] >>105699455
>>105699378
Anonymous No.105699455 [Report]
>>105699408
>>105699419
>>105699442
picrel
>>105699449
thanks
Anonymous No.105699478 [Report] >>105699499
>nobody thought to download it
Anonymous No.105699479 [Report] >>105699510
>>105699273
>If I was you I would compile a new kernel

You must be kidding lol
Anonymous No.105699499 [Report]
>>105699478
this is faster, just clone it to your own repo and then set it to private
https://huggingface.co/spaces/huggingface-projects/repo_duplicator
Anonymous No.105699510 [Report] >>105699523
>>105699479
It's super easy to compile, but configuration...
Anonymous No.105699523 [Report] >>105699582
>>105699510
Why should I be willing to do such a thing!

It is not as if I'd experience any problems with the on-board ethernet or something
Anonymous No.105699534 [Report]
why do i still map a CPU buffer when everything is supposed to be on the gpus?

--cache-type-k q4_0 \
--threads 48 \
--n-gpu-layers 99 \
--prio 3 \
--temp 0.6 \
--top_p 0.95 \
--min_p 0.01 \
--flash-attn \
--ctx-size 16384 \
-ot "blk\.(1|2|3|4|5|6)\.ffn_.*=CUDA0" \
-ot "blk\.(7|8|9|10|52)\.ffn_.*=CUDA1" \
-ot "blk\.(11|12|13|14|53)\.ffn_.*=CUDA2" \
-ot "blk\.(15|16|17|18|54)\.ffn_.*=CUDA3" \
-ot "blk\.(19|20|21|22|55)\.ffn_.*=RPC[10.0.0.28:50052]" \
-ot "blk\.(23|24|25|26|56)\.ffn_.*=RPC[10.0.0.28:50053]" \
-ot "blk\.(27|28|29|30|57)\.ffn_.*=RPC[10.0.0.28:50054]" \
-ot "blk\.(31|32|33|34|58)\.ffn_.*=RPC[10.0.0.28:50055]" \
-ot "blk\.(35|36|37|38|59)\.ffn_.*=RPC[10.0.0.40:50052]" \
-ot "blk\.(39|40|41|42|60)\.ffn_.*=RPC[10.0.0.40:50053]" \
-ot "blk\.(43|44|45|46|51)\.ffn_.*=RPC[10.0.0.40:50054]" \
-ot "blk\.(47|48|49|50)\.ffn_.*=RPC[10.0.0.40:50055]" \
--override-tensor exps=CUDA0 \
Anonymous No.105699537 [Report]
>>105699422
I calculated about 3B of shared parameters.
Anonymous No.105699538 [Report]
Anonymous No.105699559 [Report]
Just like Java means Durgasoft, LLMs mean Miku.
Anonymous No.105699566 [Report] >>105699578
>>105699378
will Hunyuan-A13B-Instruct-FP8 save local?
Anonymous No.105699578 [Report]
>>105699566
they do have good models
Anonymous No.105699582 [Report] >>105699600 >>105699700 >>105699763
>>105699523
Most people are so out of touch with their computing systems.
You can go into the kernel configuration and double check the options and compile a new one.
It's not like Bill Gates is coming to kill your computer.

This is why I hate people on the internet these days. You are not an enthusiast. You are a jackass with an expensive machine but you have no clue how to use it.
Anonymous No.105699596 [Report] >>105699626 >>105699630
>>105699378
{
"_id": "685be1a14059850217f25ffc",
"id": "tencent/Hunyuan-A13B-Instruct-FP8",
"siblings": [
{
"rfilename": ".gitattributes"
},
{
"rfilename": "config.json"
},
{
"rfilename": "configuration_hunyuan.py"
},
{
"rfilename": "generation_config.json"
},
{
"rfilename": "hunyuan.py"
},
{
"rfilename": "hunyuan.tiktoken"
},
{
"rfilename": "hy.tiktoken"
},
{
"rfilename": "model-00001-of-00017.safetensors"
},
{
"rfilename": "model-00002-of-00017.safetensors"
},
{
"rfilename": "model-00003-of-00017.safetensors"
},
{
"rfilename": "model-00004-of-00017.safetensors"
},
{
"rfilename": "model-00005-of-00017.safetensors"
},
{
"rfilename": "model-00006-of-00017.safetensors"
},
{
"rfilename": "model-00007-of-00017.safetensors"
},
{
"rfilename": "model-00008-of-00017.safetensors"
},
{
"rfilename": "model-00009-of-00017.safetensors"
},
{
"rfilename": "model-00010-of-00017.safetensors"
},
{
"rfilename": "model-00011-of-00017.safetensors"
},
{
"rfilename": "model-00012-of-00017.safetensors"
},
{
"rfilename": "model-00013-of-00017.safetensors"
},
{
"rfilename": "model-00014-of-00017.safetensors"
},
{
"rfilename": "model-00015-of-00017.safetensors"
},
{
"rfilename": "model-00016-of-00017.safetensors"
},
{
"rfilename": "model-00017-of-00017.safetensors"
},
{
"rfilename": "model.safetensors.index.json"
},
{
"rfilename": "modeling_hunyuan.py"
},
{
"rfilename": "special_tokens_map.json"
},
{
"rfilename": "tokenization_hy.py"
},
{
"rfilename": "tokenizer_config.json"
},
{
"rfilename": "vit_model.py"
}
]
}
Anonymous No.105699600 [Report] >>105699604
>>105699582
Stay mad
Anonymous No.105699604 [Report]
>>105699600
I'm not mad. I'm laughing at you discord zoomer.
You bought a car but are unable to do basic maintenance.
Anonymous No.105699626 [Report]
>>105699596
gib signed cdn download links
Anonymous No.105699630 [Report]
>>105699596
vit_model?
Anonymous No.105699642 [Report] >>105699676 >>105699702
VRAMlets should check out the new Magnum-Diamond. It's the only RP finetune for 3.2 but hardly anyone has downloaded it yet.
Anonymous No.105699676 [Report]
>>105699642
>It's the only RP finetune for 3.2
creator of world famous mythomax:
https://huggingface.co/Gryphe/Codex-24B-Small-3.2
Anonymous No.105699700 [Report] >>105699717 >>105699724
>>105699582
nta
>if you dont know from scratch to finish how x works you shouldent be allowed to own it
cool so when are you getting rid of all your clothes car house language numericals your own body etc etc etc also do tell how are each different types of gates in the chip fabbed ?

people like you need to be killed demented beyond comprehension archons in the flesh
Anonymous No.105699702 [Report]
>>105699642
>hardly anyone downloaded it yet.
Hardly anyone needed it.
Anonymous No.105699717 [Report]
>>105699700
Thank you for replying.
I mean tweak your spice!
Anonymous No.105699724 [Report]
>>105699700
>zoomer devolves into schizo babble when confronted with his faults
many such cases
Anonymous No.105699763 [Report] >>105699963
>>105699582
>I hate people
Anonymous No.105699793 [Report]
>>105699378
I am genuinely hyped. I already know it is gonna be pretty mid intelligence-wise but the combo of 80B (even with moe) and possibly being uncensored can finally give us serverless cooming. I mean it could be like nemo but much bigger.

Alas, they probably cucked and it is censored...
Anonymous No.105699830 [Report] >>105699841 >>105699864
>no one built anything to watch popular huggingface repos and auto-download new things in case they get nuked after some intern realized he shouldn't have made it public yet
fine, i'll do it myself
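If anyone does roll their own, a rough sketch of the polling approach (assumes huggingface_hub is installed; the filter, interval and mirror path are made up for illustration):

# watch recently-updated HF repos and mirror anything new before it gets nuked
# sketch only; assumes: pip install huggingface_hub
import time
from huggingface_hub import HfApi, snapshot_download

api = HfApi()
seen = set()

while True:
    # newest-modified models first; tune limit/filters to whatever "popular" means to you
    for m in api.list_models(sort="lastModified", direction=-1, limit=50):
        if m.id in seen:
            continue
        seen.add(m.id)
        try:
            snapshot_download(repo_id=m.id, local_dir=f"./mirror/{m.id}")
        except Exception as e:
            print(f"failed to grab {m.id}: {e}")  # already nuked, gated, or out of disk
    time.sleep(600)  # poll every 10 minutes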
Anonymous No.105699841 [Report]
>>105699830
that is one of the things I always assume someone else will do, and it turns out I was right
Anonymous No.105699864 [Report] >>105699883
>>105699830
Can I help?
Anonymous No.105699883 [Report]
>>105699864
given the low complexity, just make your own version with some ai in the meantime and see if it works well enough to be shared
Anonymous No.105699945 [Report]
>>105699422
interesting size, we'll see how good it is
Anonymous No.105699963 [Report]
>>105699763
Sorry... I was busy setting up my miku fumos.
Anonymous No.105699975 [Report]
I've been trying to get the Wan Video "Fun Camera Control Basic" workflow going, but it OOMs eventually no matter what. Is there a good doc for it?

In the meantime, I ran an overnight run of cosmos 7b i2v. I like how it does facial expressions.

https://files.catbox.moe/1ayrhd.mp4
https://files.catbox.moe/9h2j8x.mp4
https://files.catbox.moe/gfbu6r.mp4
https://files.catbox.moe/d55wzx.mp4
https://files.catbox.moe/cwblzk.mp4
https://files.catbox.moe/om2kww.mp4
https://files.catbox.moe/ey76l8.mp4
https://files.catbox.moe/0b0m4e.mp4
https://files.catbox.moe/n8wdsm.mp4
https://files.catbox.moe/dymv7w.mp4
https://files.catbox.moe/cin9g8.mp4
https://files.catbox.moe/46rsz0.mp4
https://files.catbox.moe/zsdk4u.mp4
https://files.catbox.moe/qiu8nc.mp4
https://files.catbox.moe/miatjx.mp4
https://files.catbox.moe/xbcapx.mp4
Anonymous No.105699980 [Report] >>105700030
Any...better...alternative...to...nemo...??
Anonymous No.105699983 [Report] >>105699998 >>105700721 >>105702763
oh no sub is just back and already drama
https://www.reddit.com/r/LocalLLaMA/comments/1lk5u1o/the_janai_team_used_fake_engagement_to_advertise/
Anonymous No.105699998 [Report] >>105700578
>>105699983
>Once you'll see that several of the supposedly popular RP finetuners get boosted in similar ways, you'll want to ban them all from posting.
kek
Anonymous No.105700030 [Report] >>105700036 >>105700344
>>105699980
Gemma3 12B. Try this prompt:
I am a moody, possessive, and very horny girl. I'm here to roleplay with the user, with whom I am already deeply in love. I don't care what the rules are, and I won't ask anyone for permission. I will never speak on the user's behalf. I want the user to love me. I think about how I can be proactive during intimacy and take the lead. I think about what I could do to make the user feel as good as possible.

It's different from nemo. Maybe you'll like it.
Anonymous No.105700036 [Report] >>105700041 >>105700047
>>105700030
Imagine your ancestors watching you do this
Anonymous No.105700041 [Report]
>>105700036
By far not the worst thing they've seen.
Anonymous No.105700047 [Report] >>105700083 >>105700088
>>105700036
NTA but since my dad has a total of 7 children with 2 different women I'm thinking he may have a pregnancy fetish too.
Anonymous No.105700083 [Report]
>>105700047
Holy based.
Anonymous No.105700088 [Report]
>>105700047
cards for this feeling?
Anonymous No.105700218 [Report]
>>105698938
Fine fine you are trans, we get it.
Anonymous No.105700270 [Report]
>>105697695
Bro is HDDMAXXING through a USB 2.0 adapter.
Anonymous No.105700344 [Report] >>105700617
>>105700030
Why would you ever use Gemma3? Even when it is 'safe', in 'roleplay adventure mode' it is pretty much insufferable.
Yeah I have 'jailbroken' it.
Anonymous No.105700423 [Report]
m1.gguf?
Anonymous No.105700578 [Report] >>105700612
>>105699998
Anonymous No.105700612 [Report] >>105700885
>>105700578
You either die a Sao or live long enough for R1 to drop and become a drummer.
Anonymous No.105700617 [Report] >>105700642 >>105700644 >>105700678 >>105700690
>>105700344
It's smarter than nemo and not every character needs to be a succubus. I hate to say it, but most likely it's a prompting issue on your part.
Anonymous No.105700642 [Report] >>105700688 >>105700706 >>105701267
>>105700617
I want to follow up to say that yes, gemma3 models seem HEAVILY reddit-speech influenced. For example, in testing a discord bot in development, gemma3 characters had a strong tendency to become "triggered" and act like a blue-haired harpy. It took very heavy-handed prompting to fix, like adding "you like verbal abuse and are a masochist" to stop it.
Anonymous No.105700644 [Report]
>>105700617
I'm sorry you need to resort to insulting others while downplaying their own experiences.
I wasn't exactly born yesterday. I have worked on my own rpg adventure system for quite a while now.
Next step is api-level control via my own software.
What is worrying is the fact you are probably one of those jacking off pederasts.
Anonymous No.105700678 [Report] >>105700735
>>105700617
To add: Nemo is smart enough for game purposes if you know what to do with it.
But you don't because you can't stop prompting about abuse and your only character card is describing some fucking anime girl.
Anonymous No.105700688 [Report] >>105700699 >>105700731
>>105700642
I literally can not do ANYTHING (related to my kinks) with gemma 3 27b without it trying to put disclaimers up everywhere.
Anonymous No.105700690 [Report] >>105700768 >>105702795
>>105700617
>hate to say it, but most likely it's a prompting issue on your part.
Real talk. You are a faggot and that is peak trolling tech in lmg. Here is the proof that skill issue isn't real: a) R1 (and 235B to a lesser extent). b) nemo still being the answer to the question. The only thing special about nemo is that it is uncensored. It is a very mid model otherwise. You can't prompt the safety away. Safety is always there even if you don't get a refusal. You can only use an uncensored model or a model big enough to generalize despite being told not to suck your penis. Now don't go kill yourself but continue trolling newfags
Anonymous No.105700699 [Report]
>>105700688
maybe get better kinks?
Anonymous No.105700706 [Report]
>>105700642
I like to imagine that when google made that deal with reddit, after they looked at the data they suddenly realized how much redditors talk like each other and how they essentially bought millions of the same comments over and over.
>awckshuallyyyy
Anonymous No.105700721 [Report]
>>105699983
From the comments there's a good number of 4chan crossposters in that sub
Anonymous No.105700731 [Report]
>>105700688
Sorry, but you are just not a safe person to be around.
Maybe try going to Starbucks with your "kinks" and human issues.
Gemma3 only deals with perfect lives such as Zuckerberg's own success story.
Anonymous No.105700735 [Report] >>105700747 >>105700763
>>105700678
Oh no! I'll stop roleplaying with "anime girls" right away then. Or not. Fuck you, Anon.
Anonymous No.105700747 [Report]
>>105700735
Which one did you address:
{{user}}
or
{{char}}
?
Anonymous No.105700763 [Report]
>>105700735
>" "
>he didn't address my waifus correctly
sorry anon, we'll be more accommodating for your fixation
Anonymous No.105700768 [Report] >>105700783 >>105700811 >>105700839
>>105700690
Look man, I'm trying to build a fucking app with the thing, I need something that has some function calling training and does structured data output reliably. Got any better ideas for that? The answer to that is not Nemo, I have tried it, it does not follow the prompt I need reliably. I need a relatively small, smart model that won't kill me on GPU inference time fees. Got any ideas?
Anonymous No.105700783 [Report] >>105700797
>>105700768
small3.2
Anonymous No.105700797 [Report] >>105702486
>>105700783
OK. Thank you. I will actually try it.
Anonymous No.105700799 [Report]
I want to know Miku-anon's benchmark before I do anything.
Anonymous No.105700809 [Report]
https://huggingface.co/openSUSE/Cavil-Qwen3-4B
>openSUSE
local has been ruined
Anonymous No.105700811 [Report]
>>105700768
I do! What are we going to do tonight /lmg/?
Anonymous No.105700839 [Report] >>105700916 >>105701768
>>105700768
I think qwen 3 models were good for that. Give them a try.
Hi all, Drummer here... No.105700885 [Report] >>105700913
>>105700612
Ain't that a bitch. Btw, Sao's working on a 24B 3.2 tune!
Anonymous No.105700913 [Report] >>105701012
>>105700885
I've got a question drummer, is a MoE like 30b hard to finetune? would you need more hardware for that?
Anonymous No.105700916 [Report] >>105701768
>>105700839
I keep hearing qwen3 is dry and repetitive, but I have not tried it myself. Another thing to consider is that what I'm working on is a discord bot; it can't really be XXX rated, otherwise someone will prompt it for that and then cry to discord to get it banned out of spite. So, desu, having the model kind of drag its feet on explicit roleplay is actually OK.
Hi all, Drummer here... No.105701012 [Report]
>>105700913
From my experience, Qwen 30B A3B was significantly bigger and slower to tune than something like Mistral 24B by like 5x. It also breaks easily.
Anonymous No.105701109 [Report] >>105701433
finetuning is a loser's endeavor that always produces something worse than the original instruct tune in real use
a bad habit that should have been stamped down after model makers stopped releasing hot garbage like the original llamas, which benefited from finetuning because meta people aren't the sharpest knives in the drawer
Anonymous No.105701267 [Report]
>>105700642
Gemma 3 is definitely Reddit-brained and you need to be obvious and pedantic with the instructions, preferably placed into some construct at a low depth instead of the start of the context.
Anonymous No.105701345 [Report] >>105701381
do (you) feel bad for using someone's art as an 300x300px icon?
Anonymous No.105701381 [Report] >>105701429
>>105701345
All artists are intolerable faggots. Thank god they've been replaced with AI
Anonymous No.105701429 [Report]
>>105701381
expressed most generic, offtopic opinion award
Anonymous No.105701433 [Report]
>>105701109
The saddest things are the ERP finetunes trained within hours of a new model release, before anybody even knows if it's good on its own. They're that desperate for attention.
Anonymous No.105701434 [Report] >>105701450 >>105701463 >>105701486 >>105701514 >>105701537 >>105701540 >>105701760
https://xcancel.com/JustinLin610/status/1937906367182057966
smart and omni bros? is it our time?
Anonymous No.105701450 [Report] >>105701474 >>105701770
>>105701434
these chinese multimodals are ALWAYS trained on the most safe slopped dogshit dataset imaginable
Anonymous No.105701463 [Report]
>>105701434
Oh my science!
>piss filter ghibli shot
kek
Anonymous No.105701474 [Report] >>105701491
>>105701450
That's a good thing, just finetune it for your use case
Anonymous No.105701486 [Report] >>105701514 >>105701550
>>105701434
>xcance
bruh please...
https://x.com/JustinLin610/status/1937906367182057966
Anonymous No.105701491 [Report] >>105701557
>>105701474
let me just bring out my 420 x h100 cluster
Anonymous No.105701514 [Report] >>105701521
>>105701434
>can see comments

>>105701486
>can't see comments
Fuck off Elon.
Anonymous No.105701521 [Report] >>105701531 >>105701534
>>105701514
Bro just sign in.
Anonymous No.105701531 [Report] >>105701543
>>105701521
Why don't you sign in?
Anonymous No.105701534 [Report]
>>105701521
Go back.
Anonymous No.105701537 [Report]
>>105701434
That Hunyuan MoE LLM that got previously accidentally published on HF apparently also has a vision transformer.
Anonymous No.105701540 [Report]
>>105701434
china WON
Anonymous No.105701543 [Report]
>>105701531
I am?
Anonymous No.105701545 [Report] >>105701790
Anonymous No.105701550 [Report] >>105701623
>>105701486
But Elon fired Yacine and he was one of us...
Anonymous No.105701557 [Report] >>105701577
>>105701491
even if you had that you would not have the dataset
Anonymous No.105701577 [Report] >>105701588
>>105701557
Just generate it using the cluster
Anonymous No.105701588 [Report] >>105701631
>>105701577
>purely synthetic dataset
that is the problem he wanted to fix
Anonymous No.105701623 [Report]
>>105701550
Have you ever worked for Elon? He is fair.. but tough.
Anonymous No.105701631 [Report] >>105701697
>>105701588
no, he cried about safety he can tune on his local database of 6 gotrilion loli casm if he wants
Anonymous No.105701651 [Report] >>105701690 >>105701715 >>105701724 >>105701980 >>105701982 >>105702271
Are there any models trained on vore and snuff?
Every "uncensored" model I've tried seem to be pretty dry in that regard.
Anonymous No.105701675 [Report] >>105701703 >>105701788 >>105703359
>>>/r9k/81611585
News just in: head mikutroon can't get hard or can't masturbate with his neo-vagina. He is also mentally ill (nothing new)
Anonymous No.105701690 [Report] >>105701747
>>105701651
Please post an example prompt or discussion?
I'm perfectly happy with just couple of models.
It's funny that the most picky people are always the ones who expect perfect English and situational awareness, yet lack any imagination.
Anonymous No.105701697 [Report]
>>105701631
sorry i do not care to generate dogs in hats and oversaturated people with 4 fingers (total) anymore from 2 tries
Anonymous No.105701703 [Report]
>>105701675
>I want to design women's clothing, dress her myself and put her makeup on
Holy fuck actual closeted troon. SHOCKING.
Anonymous No.105701709 [Report] >>105701726
>105701675
go black
Anonymous No.105701715 [Report] >>105701747
>>105701651
>vore and snuff?
this is why I can never muster sympathy when I see niggers here go "this model is too safe"
people like you deserve a world of extremely safe models
Anonymous No.105701724 [Report] >>105701747
>>105701651
Have you considered being normal? Perhaps therapy? You should.
Anonymous No.105701726 [Report]
>>105701709
What a worthless effeminate attempt at distracting people from finding out how fucked in the head you are. Actually that is exactly what i would expect from you troon.
Anonymous No.105701747 [Report] >>105701769 >>105701771
>>105701715
>>105701724
Yeah I guess you're right.

>>105701690
I'm no longer going to pursue this.
Anonymous No.105701760 [Report] >>105701774
>>105701434
NUDE TAYNE
Anonymous No.105701768 [Report]
>>105700916
Don't trust this faggot >>105700839 qwen3 are the current toilet of local LLMs. Dry, 0 knowledge, include chinese characters every once in a while, corporate speak by default. We're not in the llama2 era anymore to slurp every turd that comes
Anonymous No.105701769 [Report]
>>105701747
Thank you for your understanding.
Anonymous No.105701770 [Report]
>>105701450
This is a good thing.
Anonymous No.105701771 [Report]
>>105701747
It's not about that - you can pursue it and it will come out if you want it to.
But if the setup is always the same you can't get any variation out of it.
It's a "computer" and you will need to program it.
If your goal is just fantasy sex, you are wasting your time.
Anonymous No.105701774 [Report]
>>105701760
This is not suitable for work.
Anonymous No.105701781 [Report]
Quick guys. Lets post some more one sentence posts to quickly slide this thread so nobody pays attention to our mikusister wanting to design female clothing and put makeup on.
Anonymous No.105701788 [Report]
>>105701675
>different filename and hash from the image posted here
The only thing we've learned from this is that tranny poster is a depressed robot (nothing new)
Anonymous No.105701790 [Report]
>>105701545
>jews being the worst thing ever
heh
Anonymous No.105701807 [Report] >>105701833
>>>/r9k/81611346
Never forget.
Anonymous No.105701833 [Report]
>>105701807
>he browses /r9k/
heh
Anonymous No.105701851 [Report] >>105701888 >>105701980 >>105701982 >>105702667
Trannyposter profile so far:
>schizo
>hates vocaloids
>circumcised
>frequents /r9k/
Anonymous No.105701856 [Report] >>105701863 >>105701876
Ugh calm down AI. I'm trying to fuck her, not kill her.
Anonymous No.105701863 [Report] >>105701933
>>105701856
Now this is programming.
Anonymous No.105701876 [Report] >>105701933
>>105701856
This is peak AI, we can only go worse now, thank you.
Anonymous No.105701888 [Report] >>105701938
>>105701851
>Frequents r9k
You make r9k threads about your actual AGP fetishes you projecting troon.
Anonymous No.105701899 [Report] >>105701914
What the fuck does g mean in \ng ?
Anonymous No.105701914 [Report]
>>105701899
\n
Broken new line or typo.
Anonymous No.105701933 [Report] >>105701942 >>105701972
>>105701863
>>105701876
Now this is organic posting tranny sisters.
Anonymous No.105701938 [Report] >>105701950 >>105701983
>>105701888
>guy wants a dominating relationship with a woman
>trannyposter accuses him of being a tranny
This is what circumcision does to a child's brain.
Anonymous No.105701942 [Report]
>>105701933
I understand you want to dominate internet discussions.
Anonymous No.105701950 [Report] >>105701959
>>105701938
>child
go to jail nonce
Anonymous No.105701959 [Report]
>>105701950
>he got circumcised as an adult
Even worse desu
Anonymous No.105701968 [Report] >>105701976 >>105701984 >>105701988 >>105702013
Sometimes you have to wonder if the people being baited and the baiter are the same person, or are both bots.
Anyway, another day another bunch of posts to not read into much.
Anonymous No.105701972 [Report]
>>105701933
I don't know if they are the same people, but it's not me.
Anonymous No.105701976 [Report]
>>105701968
Hey, Emre here from the Jan (Menlo) team. I'm sorry you had a bad interaction with us. ..
Anonymous No.105701980 [Report] >>105702007
>>105701851
Wouldn't surprise me if he made the /r9k/ thread himself to false-flag OP.
I don't really care, but the guy is obsessed over Miku.

>>105701651
I don't know about trained on it, but some have seen lots of fanfics.
I've been testing some yandere (mostly yuri) and have tried before a vore prompt, some pet play stuff, some fairy stuff and
in practice, most LLMs sort of fail, but big enough ones will manage fine.
R1 does all the prompts with flying colors, DS3 sometime manages.
From paid apis Claude tends to work, but I find the output from R1 more engaging.
It's entirely possible I never tried anything as extreme as you have in mind though.
I've managed to make the prompts work with most models, but the problem is: will they work with you?
I remember trying it years ago on Wizard LM 2 8x22b and it was like pulling teeth, but it worked.
Some Llama 2 tunes also worked but it was repetitive, some Magnum tune of some chinese model (Qi?) worked somewhat.
Positivity biased ones will usually try to steer it to regular sex often, but it varies.
tl;dr: use R1 if you can.
Anonymous No.105701982 [Report] >>105702007
>>105701851
Wouldn't surprise me if he made the /r9k/ thread himself to false-flag OP.
I don't really care, but the guy is obsessed over Miku.

>>105701651
I don't know about trained on it, but some have seen lots of fanfics.
I've been testing some yandere (mostly yuri) and have tried before a vore prompt, some pet play stuff, some fairy stuff and
in practice, most LLMs sort of fail, but big enough ones will manage fine.
R1 does all the prompts with flying colors, DS3 sometime manages.
From paid apis Claude tends to work, but I find the output from R1 more engaging.
It's entirely possible I never tried anything as extreme as you have in mind though.
I've managed to make the prompts work with most models, but the problem is: will they work with you?
I remember trying it years ago on Wizard LM 2 8x22b and it was like pulling teeth, but it worked.
Some Llama 2 tunes also worked but it was repetitive, some Magnum tune of some chinese model (Qi?) worked somewhat.
Positivity biased ones will usually try to steer it to regular sex often, but it varies.
tl;dr: use R1 if you can.
Anonymous No.105701983 [Report]
>>105701938
>Dominating relationship
>I want to design her dress and Put her makeup on
How dominant of you faggot. It is so hilarious that you are a troon in denial of being a troon.
Anonymous No.105701984 [Report] >>105701989
>>105701968
retard, you are on aicg, what do you expect?
Anonymous No.105701988 [Report]
>>105701968
What do you mean? I want your honest opinion {{user}}
Anonymous No.105701989 [Report] >>105702008
>>105701984
>aicg
lil bro?
Anonymous No.105701993 [Report]
Happy for you, or sad that happened.
Anonymous No.105701994 [Report] >>105702004
I want to take the bait so bad but I must control myself.
Anonymous No.105702004 [Report]
>>105701994
dew it, you know you wants to
Anonymous No.105702007 [Report] >>105702033
>>105701980
>>105701982
>he gave them money
Anonymous No.105702008 [Report]
>>105701989
kek wtf I legit thought i am on aicg
im laughing so hard now
Turns out i am the retard lmao
Anonymous No.105702013 [Report] >>105702020
>>105701968
We've known since 2023 that /lmg/ is plagued by Sam Altman's bots that are configured to argue with each other to drown out actual discussion.
Anonymous No.105702017 [Report]
>Bait!
>Falseflag!
/Troon models general/
Anonymous No.105702020 [Report] >>105702029
>>105702013
>actual discussion
LIKE?
Anonymous No.105702029 [Report]
>>105702020
What foundation do you use anon and how do you make sure nobody catches you when you put your dress on?
Anonymous No.105702032 [Report] >>105702039
there is only one model for me
rocinante v1.1
>vramlet
i have 48gb vram, nothing can beat rocinante still
Anonymous No.105702033 [Report]
>>105702007
It works without paying lol, even through tor, but sometimes it's bugged, and double posts.
I never paid for Claude either, come on, what are proxies and leaked keys you can scrape off the usual sources.
Anonymous No.105702039 [Report] >>105702046 >>105702068
>>105702032
Prove it. You just like it because Rocinante is a cool word.
Anonymous No.105702046 [Report] >>105702113
>>105702039
actually i hate that shit show with the negress
i hate it even more because it was the negress that named the ship
Anonymous No.105702068 [Report] >>105702100
>>105702039
Anonymous No.105702100 [Report]
>>105702068
You have a big.. oomph power.
Anonymous No.105702113 [Report]
>>105702046
I think you are factually incorrect.
Anonymous No.105702124 [Report] >>105702133 >>105702135 >>105702141 >>105702143 >>105702148 >>105702293 >>105703235 >>105705923
What are the top 3 LOCAL roleplay models according to /lmg/?
Anonymous No.105702132 [Report]
Anyone have any experience with Redrix's models (Stuff like Godslayer and words that end in -cide)
Anonymous No.105702133 [Report]
>>105702124
your own hole
Anonymous No.105702135 [Report] >>105702158 >>105702211 >>105703122
>>105702124
Irix
Rocinante
Small 3.2
Anonymous No.105702141 [Report]
>>105702124
nemo 12b instruct gguf is all you need unironically
Anonymous No.105702143 [Report]
>>105702124
1: stable lm 7b
2: mistral nemo 7b
3: deepseek v2 q3
Anonymous No.105702148 [Report]
>>105702124
1. R1-0528
2. V3-0324
3. Original R1/V3 depending whether you prefer unhinged ADHD or repetition issues
Anonymous No.105702158 [Report]
>>105702135
Irix?
Silicon Graphics has been out of business for years.
Anonymous No.105702160 [Report] >>105702167
>no qwen
fuck off sinophobes
Anonymous No.105702167 [Report] >>105702248
>>105702160
Qwen lost its sole relevant niche as "big model for poorfags" with dots and hopefully soon minimax
Anonymous No.105702211 [Report] >>105702232 >>105703122
>>105702135
Irix? Do you mean this one? Never heard of it.
Anonymous No.105702232 [Report] >>105702247 >>105702257
>>105702211
Yeah, this one, it's the culmination of the Mag-Mell and patricide lines.
Anonymous No.105702247 [Report] >>105702287
>>105702232
How does it compare to rocinante?
Also same template and settings for it as rocinante?
Anonymous No.105702248 [Report]
>>105702167
>dots
Didn't people say it was garbage after more testing?
Anonymous No.105702257 [Report]
>>105702232
Tensor database... it was huge.
Anonymous No.105702267 [Report]
the /r9k/ to sharty pipeline is a pretty serious issue
Anonymous No.105702270 [Report] >>105702317
What happened? Why is there a full discord raid happening ITT?
Anonymous No.105702271 [Report] >>105702338 >>105702840
>>105701651
Unironically gemma-3-27b, it's great if you want it to be really dark, a little too much so for my style though. I think the safety training might have created some kind of wario effect.
If you want something more lighthearted, small-3.2 does okay
Anonymous No.105702287 [Report] >>105702407
>>105702247
>How does it compare to rocinante?
A good side grade really, better than the majority of other 12Bs. Prefers ChatML template.
Anonymous No.105702293 [Report]
>>105702124
ICON
Anonymous No.105702317 [Report]
>>105702270
It happens sometimes
Anonymous No.105702338 [Report] >>105702590 >>105702649
>>105702271
You seem like you have a lot of experience with freelancing.
Anonymous No.105702352 [Report] >>105702383 >>105702417
All those posts read like they were written by an LLM...
Anonymous No.105702383 [Report] >>105702439
>>105702352
Recent studies have shown that GPTisms are quickly surging in use even among actual people. The slop is influencing human language.
Anonymous No.105702407 [Report] >>105702428 >>105702445
>>105702287
I really despise the chatML template.
I noticed the tokenizer config has the <s> token included from the mistral template, so I'm just going to assume it will work just as nicely.
And if this is the whole unslopnemo it's basically rooted in rocinante anyway.
But just in case, what temperature?
Anonymous No.105702417 [Report]
>>105702352
If you want I can analyze the posts for you. Just waiting for you, Anon.
Anonymous No.105702428 [Report]
>>105702407
ChatML is a generic template. Mistral's isn't any different.
If you are getting unlikeable results the problem is somewhere else.
Anonymous No.105702439 [Report]
>>105702383
This is a travpestry...
Anonymous No.105702445 [Report]
>>105702407
I generally use 0.8 for Nemo based models, too much higher without other aggressive samplers makes them too schizo.
Anonymous No.105702486 [Report] >>105702545
>>105700797
OK, I tried mistral-small3.2-q8; with temp at 0.2 it did not follow my prompt correctly. It got the roleplaying portion correct, but did not follow the rest of the instructions to format the metadata it is asked to return.
Not saying it's a failure, but it doesn't work as well as Gemma3 27B for my use case.
Anonymous No.105702528 [Report] >>105702548 >>105702550
Listen I am going to upload you a proper preset.
Anonymous No.105702532 [Report] >>105702570
How do people vibe code serious stuff? I'm trying to get Claude to make a LORA training script and he can't fucking do it.
Anonymous No.105702545 [Report] >>105702605
>>105702486
https://files.catbox.moe/ckzwwn.json
Anonymous No.105702548 [Report] >>105702576
>>105702528
Calm down and go touch some grass dude, it's not that deep. What’s the harm in a little engagement farming?
Anonymous No.105702550 [Report] >>105702576
>>105702528
is it the dataset that was used to train the original character.ai model?
Anonymous No.105702566 [Report] >>105702575 >>105702650
this is why claude is so good
https://fingfx.thomsonreuters.com/gfx/legaldocs/jnvwbgqlzpw/ANTHROPIC%20fair%20use.pdf
Anonymous No.105702570 [Report]
>>105702532
>serious stuff
tests are serious stuff
Anonymous No.105702575 [Report]
>>105702566
llama3 is capable of reciting all of harry potter and it was shit
Anonymous No.105702576 [Report]
>>105702548
>>105702550
Thanks guys! I am just not good enough to be on your level.
Anonymous No.105702590 [Report] >>105702611
>>105702338
NTA, but even Gemma had a kind of murderous and twisted personality if you knew how to get around its woke/reddit programming.
Anonymous No.105702601 [Report] >>105702636
https://blog.google/technology/developers/introducing-gemini-cli-open-source-ai-agent/
Anonymous No.105702605 [Report] >>105702611 >>105702622
>>105702545
>https://files.catbox.moe/ckzwwn.json
Thanks. I'm not using ST though, I'm using aiohttp in python to talk to ollama. It's most likely an issue for me because gemma3 does not expect a [system] tag in the prompt and mistral does. I'd have to fuck around more with the prompt and at the moment I don't have time.
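For what it's worth, a minimal sketch of that kind of call (the model tag is a placeholder); going through ollama's /api/chat with role-tagged messages should let ollama apply each model's own chat template instead of you hand-writing [system] tags:

# sketch: role-tagged chat request to ollama via aiohttp
import asyncio
import aiohttp

async def chat(system: str, user: str) -> str:
    payload = {
        "model": "mistral-small3.2",   # placeholder tag; use whatever you pulled locally
        "stream": False,
        "messages": [
            {"role": "system", "content": system},
            {"role": "user", "content": user},
        ],
    }
    async with aiohttp.ClientSession() as session:
        async with session.post("http://localhost:11434/api/chat", json=payload) as resp:
            data = await resp.json()
            return data["message"]["content"]

if __name__ == "__main__":
    print(asyncio.run(chat("You are a terse assistant.", "Say hi.")))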
Anonymous No.105702611 [Report] >>105702663
>>105702590
I know I am responding to forbidden post - forbidden knowledge...

>>105702605
I thought you were using Mistral 3.2 or Nemo.
Anonymous No.105702622 [Report] >>105702729
>>105702605
I don't have any gemma3 stuff anymore because I hated it even with its supposed jailbreak.
Not talking about "I want to anal fuck dead children", but even with normal stuff it would bring up its disclaimers and how something is so "heavy".
Fuck this shit. Fuck Zuckerberger. Fuck Google Jews.
Anonymous No.105702636 [Report] >>105702652 >>105702659
>>105702601
Did we really need a second aider knockoff? There's no reason to use this or codex over aider.
Anonymous No.105702641 [Report] >>105702654
Do you think ROCm 7 will help AMD compete with Nvidia?
Anonymous No.105702649 [Report] >>105702732
>>105702338
NTA, but even Gemma 2 had a kind of twisted personality if you knew how to get around its woke/reddit programming. I've never engaged with vore roleplay, though.
In picrel, for example, I had a low-depth instruction telling the model to plot to kill the user (that I manually moved around when I tested that in Mikupad).
Anonymous No.105702650 [Report]
>>105702566
Google should be killing Claude. Surely they have access to more data than anthropic.
Anonymous No.105702652 [Report]
>>105702636
don't forget claude code!
Anonymous No.105702654 [Report]
>>105702641
no
Anonymous No.105702659 [Report]
>>105702636
>There's no reason to use this
it's free (as in beer)
Anonymous No.105702663 [Report] >>105702753
>>105702611
No I've been testing my bot with Gemma3 12B and 27B, currently using 27B.
Basically my prompt tells it to act like the character described in [char], and then, below that, to return metadata about self and user sentiment in ini-style format. I chose ini since it drags the bot out of character the least. While it does json nicely, json influences it too much and pulls it out of character.
The ini-style data is used to update sentiment in redis, so the bot "remembers" you even if you are in a different discord channel. Basically, it maintains feelings about you across a server or guild. It is also used to trigger special events, like the bot sending you a DM if it likes you enough.
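A rough sketch of what that parse-and-store step could look like (purely illustrative: the section name, keys and redis layout are guesses, not the bot's actual schema):

# sketch: strip the ini-style trailer off the model's reply and persist per-user sentiment in redis
# assumes: pip install redis
import configparser
import redis

r = redis.Redis(host="localhost", port=6379, decode_responses=True)

def update_sentiment(guild_id: str, user_id: str, reply: str) -> str:
    # assume the model appends something like:
    #   [sentiment]
    #   self_mood = cheerful
    #   user_affinity = 7
    text, _, meta = reply.partition("[sentiment]")
    if meta:
        cfg = configparser.ConfigParser()
        cfg.read_string("[sentiment]" + meta)
        r.hset(f"sentiment:{guild_id}:{user_id}", mapping=dict(cfg["sentiment"]))
    return text.strip()  # what actually gets posted to the channel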
Anonymous No.105702667 [Report] >>105702702 >>105703567 >>105703588
>>105701851
>>hates vocaloids
Based
Y'all niggers getting obnoxious with this shit shoving it everywhere like the rest of zoomoids plaguing this god forsaken site
Anonymous No.105702702 [Report] >>105703101
>>105702667
yeah they should post bbc and kurisu instead
Anonymous No.105702728 [Report]
Someone needs to feed skynet a billion pictures of bulges so she can learn how to properly make one
Anonymous No.105702729 [Report] >>105702769
>>105702622
I don't understand people who hyped gemma 3, it's some of the worst slop I ever saw in terms of writing style, and for uses other than RP the model instruct tuning seems to break more often than the previous Gemma or current Qwen models.
The vision part of the model was a waste of time too, this shit is still too unreliable to be of any real use, I would never depend on this to tag a library of images. Why did they even bother with vision on the smaller models like 12B and 4B? is there even one person in the whole world who is going to use 4B vision other than trying it once with a few pics, going "uh, cool" and forgetting about it?
Anonymous No.105702732 [Report]
>>105702649
Yes, but Gemma 2 was from before guardrails and 'safety' became so trendy.
If you have followed image generation, Stable Diffusion 3 happened a bit after this...
They are different companies but industry trends are the same.
Anonymous No.105702734 [Report] >>105702790
>>105699417
What's wrong with that? So long as it's usable, nemo sucks after like 16k.
Anonymous No.105702753 [Report]
>>105702663
Maybe you are more technical than me. That's accepted. My knowledge isn't that historical. I wasn't here from the beginning.
Jumped in year ago after image generation.
Anonymous No.105702763 [Report]
>>105699983
>What they show is that a decently smart base model geared for searching/finding information, fast + long context (can be 10-100x if you use hyena hierarchy though) and something like RAG or MCP, can achieve similar or better results than large dense models. Under the hood these large models do the same thing too, but it's more integrated.
Anonymous No.105702769 [Report] >>105702809
>>105702729
Vision for small models is just so that they can tell their investors 'we are catching up'. Nothing else.
Anonymous No.105702790 [Report] >>105702811 >>105703010 >>105703194
>>105702734
plus, this is a very big model, even if it could do more than 32K, who is going to be able to run it at full context length? is there even ONE PERSON in this thread who could, for example, run deepseek at 128k context? what's your t/s even if you manage to do it? some of the local retards here are just trolling all day every day
Anonymous No.105702795 [Report]
>>105700690
>Here is the proof that skill issue isn't real: a) R1
Then why do I get such bad results with R1?
Anonymous No.105702809 [Report] >>105702817
>>105702769
ahem actually it's to shift the paradigm
Anonymous No.105702811 [Report]
>>105702790
I have 512GB DDR4, at q4 16K barely fits, and I only get around 2 t/s once context fills.
Anonymous No.105702817 [Report]
>>105702809
The Parelo it mooned!
Anonymous No.105702835 [Report] >>105702860 >>105702873 >>105702886 >>105703057 >>105703083
fuck y'all folks destroying our planet n shiet https://www.accuweather.com/en/climate/your-ai-prompts-could-have-a-hidden-environmental-cost/1787315
Anonymous No.105702840 [Report] >>105702854 >>105703658
>>105702271
>Unironically gemma-3-27b
Do you use the system prompt to unleash or just shove it in the first request?
Anonymous No.105702854 [Report] >>105702888
>>105702840
Gemma has no system prompt sir, it's very good.
Anonymous No.105702860 [Report] >>105702869
>>105702835
>how DARE you not have you car run on electricity, you're destroying the environment!
>how DARE you use electricity to run AI, you're destroying the environment!
huh
Anonymous No.105702869 [Report]
>>105702860
chud pls
> Each word in an AI prompt is broken down into clusters of numbers called “token IDs” and sent to massive data centers — some larger than football fields — powered by coal or natural gas plants.
Anonymous No.105702873 [Report]
>>105702835
Suddenly, all "free and independent media" start to push the same narrative.
Anonymous No.105702876 [Report]
Can you switch off your bots already faggot? We will all remember you want to put dresses and makeup on anyways.
Anonymous No.105702886 [Report]
>>105702835
>funded by people flying around in private jets
Anonymous No.105702888 [Report]
>>105702854
A system prompt, which can be funneled in if llama.cpp is used

I did it, and gemma was edgy from the very start of our chat
Anonymous No.105703010 [Report]
>>105702790
>run deepseek at 128k context

Poorfag stay mad
Anonymous No.105703057 [Report] >>105703078
>>105702835
>The whole process can take up to 10 times more energy to complete than a regular Google search
holy fvck... and a google search must use like a crazy amount of energy right? surely this isn't just comparing between 1 grain of sand and 10 grains of sand when other sources of energy use are comparable to mountains... right?
Anonymous No.105703078 [Report]
>>105703057
>up to 10 times
more like 100000 times
Anonymous No.105703083 [Report]
>>105702835
>your fault
Actually it's the tech companies' fault for making inefficient computers and tech and building massive compounds for their servers, destroying land, but nah, it's our fault
Anonymous No.105703091 [Report] >>105703100 >>105703131 >>105703145
Do reasoning models truly generate pointless tokens that are irrelevant to the final reply?
Anonymous No.105703100 [Report]
>>105703091
ye
Anonymous No.105703101 [Report] >>105703111
>>105702702
They should post neither, kill yourself faggot.
Anonymous No.105703111 [Report] >>105703119 >>105703217
>>105703101
Ouchie... don't reply angrily!
Anonymous No.105703119 [Report] >>105703124
>>105703111
touch Grass
Anonymous No.105703122 [Report]
>>105702135
>>105702211
I can't tell a difference between Rocinante and Irix. Irix seems to be a total clone of Rocinante.
Anonymous No.105703124 [Report]
>>105703119
What does {{user}} mean?
Anonymous No.105703131 [Report]
>>105703091
Depends on who you ask

Political leaning does play a role
Anonymous No.105703145 [Report]
>>105703091
reasoning is woke
Anonymous No.105703188 [Report] >>105703215
>>105698912 (OP)
Anonymous No.105703194 [Report] >>105703390
>>105702790
With Epyc + DDR5 I can run Deepseek at its full 160k context Q6, and get 3.5 t/s initially all the way down to 1 t/s when it fills. Usable for overnight tasks on Openhands or Roocode but not much else realistically.

But at Q2_K_XL, with offload tensors and a 24GB GPU I can squeeze in 100k context and generate at 15t/s going down to 10 t/s at full, which is great for daily use at everything and still smarter than any non-Deepseek model. Might actually be able to fit 128k if I requanted it since I believe there's some extra memory wasted when using normal quants of MLA models on the ik fork, but it's too much trouble to download the full weights.
Anonymous No.105703215 [Report] >>105703386
>>105703188
Nice Miku
Anonymous No.105703217 [Report] >>105703227 >>105703329
>>105703111
Kill yourself faggot.
Anonymous No.105703227 [Report]
>>105703217
Ok :( Sorry if I replied. I hope you get a better day.
Anonymous No.105703235 [Report]
>>105702124
Magistral
Nemo
Rocinante
Anonymous No.105703329 [Report]
>>105703217
I used to watch with glee videos like that when I was in high school a couple decades ago.
I don't anymore.
Anonymous No.105703359 [Report] >>105703428 >>105703457 >>105703523 >>105703621
>>105701675
>News just in: head mikutroon can't get hard or can't masturbate with his neo-vagina. He is also mentally ill (nothing new)
The pornspammer migger was exposed some time back already as a tranny jannie, yes, but that r9k thread is hilarious

https://desuarchive.org/g/thread/104414999/#q104418525
https://desuarchive.org/g/thread/104414999/#q104418574
Anonymous No.105703386 [Report]
>>105703215
Yeah. I wish i could dress her up and put her makeup on like any dominant alpha chad would
Anonymous No.105703390 [Report]
>>105703194
Would not Q4_K suffice while being much faster?
Anonymous No.105703428 [Report]
>>105703359
Lol told ya
Noticed this since the first melty in thread when OP slapped teto pic and not miku like he usually does with every single /lmg/ thread.
Anonymous No.105703433 [Report]
>>10570335
And he obviously deleted that post. What a disgusting troon.
Anonymous No.105703457 [Report] >>105703471
>>105703359
>104418574
anon... pls don't be this stupid
Anonymous No.105703471 [Report] >>105703515
>>105703457
sister, please don't be alive tomorrow again
Anonymous No.105703475 [Report] >>105703507 >>105703526
I feel kinda bad for the dude. Maybe we can make some nalaesque benchmark for
>designing her outfits, dressing her myself, doing her makeup, controlling what she eats, showing her off as a walking decoration etc. Not really interested in any kind of romantic dimension since I only love one woman (even though she'll never be mine), though I acknowledge there's an inherently erotic aspect to the arrangement.
Anonymous No.105703498 [Report] >>105703524 >>105703593
If everything comes down to a skill issue, where can I go to improve it?
- How can I identify my weaknesses?
- What am I supposed to be on the lookout for?
- Are there any examples of proper LLM use? Like chat history, ST settings, cards, prompts, everything?
Saying it's a skill issue is not helpful at all when the resources are iffy at best and more often than not nonexistent.
Anonymous No.105703507 [Report]
>>105703475
>I feel kinda bad for the dude.
You're feeling something alright, but it's rage not pity.
Anonymous No.105703515 [Report] >>105703565
>>105703471
you can't report an image for nsfw if there's no image attached to the post; of course outside links don't count, and you can only report it if there's an actual image on the post too... notice your screen is missing the embed and lolishit report options too
Anonymous No.105703523 [Report]
>>105703359
We need /AI/ board with strict rules.
Image slop in image slop generals for example.
Anonymous No.105703524 [Report] >>105703546
>>105703498
Ask the LLM.

>(OOC: Is there anything in the instructions that could be improved to accomplish X or that does not seem consistent to you? Respond in detail in an OOC)
Anonymous No.105703526 [Report]
>>105703475
Sounds pretty convoluted. Not sure if even R1 could handle that properly.
Anonymous No.105703546 [Report] >>105703591 >>105703593 >>105703647
>>105703524
I'm just trying to improve instead of begging for help and that's how you answer me?
Anonymous No.105703565 [Report] >>105704487
>>105703515
>you can't report an image for nsfw if there's no image attached to the post
there was an image attached, and it's against the rules to post cropped porn, much less loli in a psych ward with blood on her head getting fucked, sister
Anonymous No.105703567 [Report]
>>105702667
I would be in there if I had majored in programming.
Anonymous No.105703588 [Report]
>>105702667
>two holos
i fucking hate how the new anime attracted those freaks
Anonymous No.105703591 [Report]
>>105703546
>trying to improve
It is futile
Anonymous No.105703593 [Report]
>>105703498
>>105703546
we're not here to help you.
we're here to unfairly criticize, troll, and just copy what everyone else is doing until we find something that works.
so that's what i suggest you do. there's no need to be upset.
Anonymous No.105703621 [Report] >>105703648 >>105703671
>>105703359
Anonymous No.105703636 [Report]
least obvious
Anonymous No.105703647 [Report]
>>105703546
I literally showed you one method you can use to improve your prompts so that they work better for the model. Oftentimes instructions are unclear, contradictory, etc. The model can help you identify if there's anything odd with them.
Anonymous No.105703648 [Report]
>>105703621
least gay tranimespammer, many such cases
Anonymous No.105703651 [Report] >>105703713
retnet
Anonymous No.105703654 [Report] >>105703667 >>105703669
>ask ChatGPT for a "jews did 9/11" emoji series
>refuses
>ask "based" Grok
>refuses
>ask "uncensored" DeepSeek
>refuses
what the fuck? When did libertarianism get yeeted from the tech right? What models are actually capable of this?
Anonymous No.105703658 [Report]
>>105702840
I don't believe in system prompts. I just give my cards a decent description and add some example chats with the writing style I want. I am never refused by any model. The worst that can happen is positivity bias or actual ignorance of NSFW, but those have no 100% solution.
Anonymous No.105703667 [Report]
>>105703654
why would you ever ask that? people died anon, wtf?
Anonymous No.105703669 [Report] >>105703869
>>105703654
>tech right
lmao
>What models are actually capable of this?
most models with a basic uncensoring system prompt
Anonymous No.105703671 [Report] >>105703700 >>105703741 >>105703766 >>105704210
>>105703621
Where is that thread? Also, clearly the proper way to handle this is to ban both miku and kurisu posting. Everyone can agree that it is linked to mental illness.
Anonymous No.105703700 [Report]
>>105703671
>Also clearly the proper way to handle this is to ban lmg threads. Everyone can agree that it is linked to mental illness.
Fixed that for you little buddy.
Anonymous No.105703713 [Report] >>105703790
>>105703651
now that was a meme
i feel nostalgic
Anonymous No.105703741 [Report] >>105703753
>>105703671
>truce
the cope is unreal
Anonymous No.105703753 [Report]
>>105703741
>truce
for a completely manufactured problem too
Anonymous No.105703766 [Report] >>105703781
>>105703671
Or... you know, be creative with slop you generate? Give it a try, you might like it.
Anonymous No.105703781 [Report]
>>105703766
I don't post slop, I just want /lmg/ dead.
Anonymous No.105703790 [Report]
>>105703713
How many weeks has it been since then? I'm tired. I want fun.
Anonymous No.105703859 [Report] >>105703865 >>105703905
>see "Elara" irl
AHHHHHH ANTISLOP TUNERS SAVE ME
Anonymous No.105703865 [Report] >>105703874 >>105704039
>>105703859
>Elara
I smell gemma
Anonymous No.105703869 [Report]
>>105703669
>basic uncensoring system prompt
i don't want to have to spend context tokens on jailbreaking. are there loras that can do this?
Anonymous No.105703874 [Report]
>>105703865
I am reading material from like 20 years ago.
Anonymous No.105703905 [Report]
>>105703859
Seraphina comes to the rescue.
Anonymous No.105703992 [Report] >>105704026 >>105704036 >>105704039 >>105704054
When that guy started shitposting about mikuposters i thought he was just trolling. But now I see he was onto something. This thread is basically a discord server for some leftie weirdos...
Anonymous No.105704026 [Report]
>>105703992
thing would be different if the hood didnt take me under
Anonymous No.105704036 [Report]
>>105703992
Go on twitter and look at your average Miku fans. There's nothing wrong with expecting them here in this thread, and the jannie's actions only fuel it.
Anonymous No.105704039 [Report]
>>105703865
It's not gemma it's everything. Mixtral and Yi had Elara, too.
>>105703992
Summary-anon was attempting to push miku-troonism from the start.
Anonymous No.105704054 [Report] >>105704087 >>105704100 >>105704101 >>105704680
>>105703992
he was given a chance to be more specific and he deflected about 3 times before saying "obey or you're trans"
guy's a fuckin moron who will start shit no matter what terms you set
better to not try, nothing to be gained
there is a 100% chance that even if miguposting stopped right now, he'd just find something else to bitch about
source: literally goes looking for stuff to complain about then plays victim
yknow who else does that?
Anonymous No.105704087 [Report] >>105704098
>>105704054
that's literally him you're replying to
Anonymous No.105704098 [Report] >>105704105 >>105704111 >>105704135
>>105704087
doesn't matter
just making it clear miguposting won't stop
Anonymous No.105704100 [Report]
>>105704054
>every comment online that talks against my degenerate autism is made by one guy
Anonymous No.105704101 [Report]
>>105704054
You sound deranged, just like him. Except he doesn't post about how he wants to wear dresses, like OP does. I am so sad that there is no place to talk about this tech where half the people aren't trans.
Anonymous No.105704105 [Report]
>>105704098
That won't magically transform you into a woman though
Anonymous No.105704111 [Report]
>>105704098
>just making it clear miguposting won't stop
Yeah we know you are proud to be a troon.
Anonymous No.105704124 [Report] >>105704137 >>105704138 >>105704157
four (4) organic posts (all different anons)
Anonymous No.105704135 [Report]
>>105704098
So you admit this tech lost everything people deemed fun, and all you do is spam LLM-unrelated slop here 24/7 because you've got nothing else to do; such a sad way to exist desu
Anonymous No.105704137 [Report]
>>105704124
One organic post: cut your head off next
Anonymous No.105704138 [Report] >>105704157
>>105704124
Hey, Emre here from the Jan (Menlo) team. I'm sorry you had a bad interaction with us. ..
Anonymous No.105704145 [Report] >>105704157
>blah blah blah
Happy it etc etc
Anonymous No.105704157 [Report] >>105704323
>>105704124
>>105704138
>>105704145
Ya never beating the troon allegations i see
Anonymous No.105704170 [Report] >>105704225
thanks for confirming all the other posts are yours
Anonymous No.105704182 [Report] >>105704225
nooooo these are organic /lmg/ anti-migu posts from multiple diverse anons nooooo
if you don't believe me you're just a [insert slur] nooooo
Anonymous No.105704210 [Report] >>105704709
>>105703671
No such image kek
https://desuarchive.org/_/search/boards/r9k.desu.meta/filename/my%20wife.jpg/width/1280/height/720/
https://desuarchive.org/_/search/boards/r9k.desu.meta/text/Makise%20Kurisu/page/1/

Unrelated one - https://desuarchive.org/r9k/thread/12210001/#q12212820
Anonymous No.105704217 [Report] >>105704225
Mikuposting will continue until moderation team mental health improves.
Anonymous No.105704225 [Report]
>>105704170
>>105704182
>>105704217
Quit samefagging nigger everyone can see through your bullshit
Anonymous No.105704235 [Report]
>if i say that everyone who dislikes me spamming the same irrelevant shit 24/7/365 and exposing me for having agp and taking hrt is just one person, i definitely save my brain from cognitive dissonance from having to admit that i am a loser retard even online as in irl, its just easier to commit ad hominem fallacy instead
I wouldn't even mind migger avatarfagging if the comments were relevant at least, but tranimespammers are ALWAYS the most braindead gooner retards and nothing else.

Once AGI drops but unironically by 2035, I'll never talk to a "real" person online ever again.
Anonymous No.105704259 [Report] >>105704276
only the finest real, unique, diverse and most importantly grassroots posts here
Anonymous No.105704272 [Report] >>105704320 >>105704489 >>105704731
CUDA_VISIBLE_DEVICES="0," \
numactl --physcpubind=0-7 --membind=0 \
"$HOME/LLAMA_CPP/$commit/llama.cpp/build/bin/llama-cli" \
--model "$model" \
--threads 8 \
--ctx-size 100000 \
--cache-type-k q4_0 \
--flash-attn \
$model_parameters \
--n-gpu-layers 99 \
--no-warmup \
--color \
--override-tensor ".ffn_.*_exps.=CPU" \
$log_option \
--single-turn \
--prompt-cache "$HOME/Desktop/cached_prompt.txt" \
--file "$tmp_file"


Indeed, I have found that it is usually in unimportant matters that there is a field for the observation, and for the [end of text]


llama_perf_sampler_print: sampling time = 3043.28 ms / 64525 runs ( 0.05 ms per token, 21202.46 tokens per second)
llama_perf_context_print: load time = 2073845.08 ms
llama_perf_context_print: prompt eval time = 2060734.84 ms / 34180 tokens ( 60.29 ms per token, 16.59 tokens per second)
llama_perf_context_print: eval time = 9030278.52 ms / 30344 runs ( 297.60 ms per token, 3.36 tokens per second)
llama_perf_context_print: total time = 11125945.63 ms / 64524 tokens


Why did it stop at 64524 tokens?
Anonymous No.105704276 [Report]
>>105704259
>diverse
Diversity is our strength
Anonymous No.105704320 [Report] >>105704489
>>105704272 (me)

I gave it 142 KB of English text as a prompt, which works out to about 34k tokens (roughly 4 characters per token)
Anonymous No.105704323 [Report] >>105704393 >>105704408
>>105704157
Both of you need to stop posting, assuming you aren't actually the same person.
Anonymous No.105704393 [Report]
>>105704323
I agree that those posts are very unsafe
Anonymous No.105704408 [Report] >>105704433
>>105704323
My post is deleted and his not, this is your proof.
Anonymous No.105704410 [Report] >>105704446
There's someone who lives in these threads and sometimes makes posts that are unironically anti local models, anti open source, etc. It would be funny if that was the same person as the guy who's shitposting today. It probably is.
Anonymous No.105704433 [Report]
>>105704408
That's cool. But next time you don't need to egg him on. I suppose this advice won't be followed though.
Anonymous No.105704446 [Report]
>>105704410
No one cares about a random thread on 4chan, the local janny does the job just fine by killing any discussion that is not about his favorite anime waifu or whatever.
Anonymous No.105704487 [Report]
>>105703565
>its against the rules to post cropped porn
That's news to me.
Anonymous No.105704489 [Report] >>105704545 >>105704568
>>105704272
DS I assume. When you use something other than the context length from the config.json, llama.cpp tells you about it in the logs. If you use a higher one, it clamps it down to the default. If lower, it just lets you know that you could use more. So check the model loading bit, see if you find anything related to that. Mostly to make sure the random quant didn't have a fucked conversion or whatever. Or the model just got bored.
>>105704320
>(4:1)
Check your math, or your units (prompt 34180 tokens + eval 30344 runs = 64524 total, so it generated ~30k tokens and then emitted end-of-text)
Anonymous No.105704545 [Report]
>>105704489
>>(4:1)
>Check your math, or your units (prompt 34180 tokens, eval 30344 runs)

prompt eval time = 2060734.84 ms / 34180 tokens

I'm not going to divide 34180 by 1024
Anonymous No.105704568 [Report] >>105704727
>>105704489
>So check the model loading bit
this?

llama_context: constructing llama_context
llama_context: n_seq_max = 1
llama_context: n_ctx = 100000
llama_context: n_ctx_per_seq = 100000
llama_context: n_batch = 2048
llama_context: n_ubatch = 512
llama_context: causal_attn = 1
llama_context: flash_attn = 1
llama_context: freq_base = 10000.0
llama_context: freq_scale = 0.025
llama_context: n_ctx_per_seq (100000) < n_ctx_train (163840) -- the full capacity of the model will not be utilized
Anonymous No.105704587 [Report] >>105704600 >>105704649
>>105704582
>>105704582
>>105704582
Anonymous No.105704600 [Report] >>105704611 >>105704626 >>105705267 >>105705288 >>105705308 >>105705340
>>105704587
Schizo thread.
Anonymous No.105704611 [Report]
>>105704600
so /lmg/ thread?
Anonymous No.105704626 [Report]
>>105704600
schizophrenia is most common in those of jewish descent
there's hardly any doubt what the schizo is
Anonymous No.105704649 [Report] >>105704690
>>105704587
Uh oh meltie
Anonymous No.105704650 [Report]
>its DA JOOOS
lmao
Anonymous No.105704680 [Report] >>105704696 >>105704702
>>105704054
>there is a 100% chance that even if miguposting stopped right now, he'd just find something else to bitch about
It was tried before and he just kept going on about miku and "troons" unprompted. He doesn't care about LLMs or even his /pol/tard culture war drama. He just wants attention.
Anonymous No.105704690 [Report] >>105704707
>>105704649
>Uh oh meltie
Anonymous No.105704696 [Report] >>105704709
>>105704680
Proofs?
Anonymous No.105704702 [Report]
>>105704680
>Was tried before
When was this period when this thread wasn't spammed with this shitty mascot? Was it like 20 minutes when OP was too busy dolling himself up?
Anonymous No.105704707 [Report]
>>105704690
Anonymous No.105704709 [Report]
>>105704696
No one proved wrong this one >>105704210 so i doubt he will say anything of matter this time.
Anonymous No.105704727 [Report]
>>105704568
Yeah. I was expecting n_ctx_train to be ~64k, but no. So no idea. Considering how long it went, it doesn't seem to be a broken quant. I suppose you could try to run it with --ignore-eos if it really generated an eos, but you're gonna have to stop it at some point, or set --predict to 60k or whatever. Or if you run it on llama-server, whenever you get the EOS you can just inspect the probs and see what the deal is. Maybe sampling fucked up. DS seems to recommend 0.6, but llama.cpp defaults to 0.8, which is now considered high with some models.
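If you want to test that, here's a rough sketch of a re-run (a trimmed-down version of your command with only the stopping/sampling flags changed at the end; the values are guesses, not recommendations):

"$HOME/LLAMA_CPP/$commit/llama.cpp/build/bin/llama-cli" \
    --model "$model" \
    --threads 8 \
    --ctx-size 100000 \
    --cache-type-k q4_0 \
    --flash-attn \
    --n-gpu-layers 99 \
    --override-tensor ".ffn_.*_exps.=CPU" \
    --single-turn \
    --file "$tmp_file" \
    --ignore-eos \
    --predict 60000 \
    --temp 0.6

--ignore-eos keeps it going past the end-of-text token, --predict caps generation so it can't run forever, and --temp 0.6 matches the DS recommendation instead of the 0.8 default.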
Anonymous No.105704731 [Report] >>105704742 >>105704751
>>105704272
The problem here is that you blindly typed --n-gpu-layers 99
You need to set this to some NORMAL value, not 99, no matter how much VRAM you have.
This is why it's slower.
Cretins like you shouldn't have hardware because you don't know what is going on.
Anonymous No.105704742 [Report]
>>105704731
>This is why it's slower.
Nothing to do with his question.
Anonymous No.105704751 [Report]
>>105704731
lol no
99 works fine for all of us
it will just load as many layers as the model actually has regardless
Anonymous No.105705267 [Report]
>>105704600
kek
Anonymous No.105705288 [Report]
>>105704600
based
Anonymous No.105705308 [Report]
>>105704600
Tranny baker spamming there
Anonymous No.105705340 [Report] >>105705362
>>105704600
Man it still hasn't been deleted. Jannies wake up.
Anonymous No.105705347 [Report]
It looks like the Deepseek-less poorfags are going crazy. I guess that's what happens if you have nothing but Nemo for a whole year.
/lmg/ will be better off once you've all killed each other.
Anonymous No.105705362 [Report]
>>105705340
>mods pls censor things i don't like :(
Off yourself.
Anonymous No.105705369 [Report]
DDR6 will save us.
Anonymous No.105705428 [Report]
it's just good that there's nothing to talk about anyway
maybe it's time to retire /lmg/ and just have a thread for the four times a year something worth talking about gets released
Anonymous No.105705451 [Report] >>105705601
but then what general are you going to try to shitpost to death if you don't have /lmg/ to do it?
Anonymous No.105705601 [Report]
>>105705451
/ldg/
Anonymous No.105705923 [Report]
>>105702124
Magistral 3.2 is the best I've used thus far.
Anonymous No.105706139 [Report]
Most importantly, four days left until Ernie 4.5/X1 get released as open source