/lmg/ - a general dedicated to the discussion and development of local language models.
Previous threads:
>>105778400 & >>105769835

►News
>(07/02) DeepSWE-Preview 32B released: https://hf.co/agentica-org/DeepSWE-Preview
>(07/02) llama : initial Mamba-2 support merged: https://github.com/ggml-org/llama.cpp/pull/9126
>(07/02) GLM-4.1V-9B-Thinking released: https://hf.co/THUDM/GLM-4.1V-9B-Thinking
>(07/01) Huawei Pangu Pro 72B-A16B released: https://gitcode.com/ascend-tribe/pangu-pro-moe-model
>(06/29) ERNIE 4.5 released: https://ernie.baidu.com/blog/posts/ernie4.5

►News Archive: https://rentry.org/lmg-news-archive
►Glossary: https://rentry.org/lmg-glossary
►Links: https://rentry.org/LocalModelsLinks
►Official /lmg/ card: https://files.catbox.moe/cbclyf.png
►Getting Started
https://rentry.org/lmg-lazy-getting-started-guide
https://rentry.org/lmg-build-guides
https://rentry.org/IsolatedLinuxWebService
https://rentry.org/tldrhowtoquant
https://rentry.org/samplers
►Further Learning
https://rentry.org/machine-learning-roadmap
https://rentry.org/llm-training
https://rentry.org/LocalModelsPapers
►Benchmarks
LiveBench: https://livebench.ai
Programming: https://livecodebench.github.io/leaderboard.html
Code Editing: https://aider.chat/docs/leaderboards
Context Length: https://github.com/adobe-research/NoLiMa
Censorbench: https://codeberg.org/jts2323/censorbench
GPUs: https://github.com/XiongjieDai/GPU-Benchmarks-on-LLM-Inference
►Tools
Alpha Calculator: https://desmos.com/calculator/ffngla98yc
GGUF VRAM Calculator: https://hf.co/spaces/NyxKrage/LLM-Model-VRAM-Calculator
Sampler Visualizer: https://artefact2.github.io/llm-sampling
►Text Gen. UI, Inference Engines
https://github.com/lmg-anon/mikupad
https://github.com/oobabooga/text-generation-webui
https://github.com/LostRuins/koboldcpp
https://github.com/ggerganov/llama.cpp
https://github.com/theroyallab/tabbyAPI
https://github.com/vllm-project/vllm
►Recent Highlights from the Previous Thread: >>105778400

--Struggles with LLM verbosity and output control in multi-turn scenarios:
>105778545 >105778565 >105778575 >105778610 >105779038 >105779082 >105779119 >105779209 >105779124 >105779176 >105779238 >105779064 >105779095 >105779216 >105779349 >105779480
--llama.cpp adds initial Mamba-2 support with CPU-based implementation and performance tradeoffs:
>105778981 >105778996 >105779730 >105779004 >105780071 >105780126
--Challenges and approaches to building RPG systems with tool calling and LLM-driven state management:
>105786670 >105787135 >105787276 >105787963 >105789228 >105789263
--Implementation updates for Mamba-2 model support in ggml backend:
>105780080 >105780150
--Qwen3 32B recommended for Japanese translation:
>105784530 >105784575 >105784596 >105784670 >105784680 >105784692 >105784790 >105784805 >105784917 >105784933 >105785194 >105785112 >105785146 >105786170 >105786451 >105786489
--Analyzing Steve's stylistic fingerprints through narrative generation and pattern recognition:
>105788408 >105788529 >105788572 >105788668 >105788917 >105788931 >105788954 >105788988 >105788977 >105789021
--Mistral Small's coherence limits in extended adventure game roleplay:
>105781361 >105781555 >105781600
--High-power GPU usage for image generation causing extreme power and thermal concerns:
>105787150 >105787288 >105787297 >105787332 >105787338 >105787427 >105787434 >105787437 >105787466 >105787498 >105787511
--FOSS music generation models and their current limitations:
>105780800 >105781033 >105781078
--DeepSeek R1 API instability sparks speculation about imminent release:
>105782383
--Model size vs performance on SWE Bench Verified, highlighting 32B peak efficiency:
>105783746
--Miku (free space):
>105778663 >105779108 >105779240 >105782770 >105783087 >105784298 >105785067 >105786465

►Recent Highlight Posts from the Previous Thread: >>105778404

Why?: 9 reply limit >>102478518
Fix: https://rentry.org/lmg-recap-script
>>105789658first they'll need to make grok 3 stable so that they can release grok 2
>can't use Ernie gguf
>can't use 3n multi modal
>no GLM gguf
It's ogre
>>105789715Local is a joke and we are the clowns. At least until R2 comes.
>>105789715just be grateful that 3n works at all, okay?
>>105789683I look like this
>>105789756I wish I looked like this
how are people so sure the new "steve" model is deepseek and not another chinese competitor?
>>105789807It doesn't feel like Qwen. And who else would put up a model?
>>105789807new unknown player
>>105789786It says it's Deepseek which is the best way to tell that it's actually someone else's model that distilled the shit out of Deepseek. Likely Qwen, Grok or Mistral.
>>105789786Because they're retards. And even if it is,
>>105788847
go join the 41% you abomination
>>105789835Two more weeks
no1cares
but 3n E4B outperforms older 8B models
>>105789963I care. I think it's a neat little model.
Any backends that support its special features yet?
Why do I have a strong impression Meta won't be super into open weights models anymore after this hiring spree?
Have they confirmed or said anything about their "mission" to make "open science"?
>>105790057https://archive.is/kF1kO
>A Meta spokeswoman said company officials “remain fully committed to developing Llama and plan to have multiple additional releases this year alone.”
>>105790079They're just talking about the stuff that Sire Parthasarathy is already working on. The new $1b main team is going to work on closed models.
>>105790105
>The new $1b main team is going to work on closed models.
Any clear confirmations on that?
>>105790121Why the fuck would they openly announce that in advance?
between a 4060 ti with 8gb of vram and a 6750 xt with 12gb of vram, which would be better for text gen?
are the +4gb gonna outcompete the nvidia advantage?
and could I use both at the same time?
>>105789622 (OP)what does /lmg/ think about
>>105782600?
>>105790169Why not a 5060 Ti with 16GB?
>>105790192Because those are the GPUs I got.
>>105790133To please investors
>meta develops AGI
>it only speaks in gptslop
>>105790169If you already have them, test them. No better way to know.
>>105790215Well just fucking try them, then! And yes, you can use both.
>>105790230also
>won't be local/open
>>105790215Thought you wanted to buy one.
In that case more VRAM is generally always better. With 8GB you can't even run Nemo at non-retarded quants.
>>105790169the answer is always more VRAM, however much VRAM you have you need more VRAM
dick
If you ask/force Gemma 3n to draw a dick in ASCII it will almost always draw something like this. I'm guessing this is Stewie from Family Guy?
>>105790264Try asking it to tell a dirty joke.
>>105790264What if you ask it to draw a phallus, also known as a penis?
>>105790184I'll /wait/ until someone crashes an airplane / train / bus with mass casualties and blames it on vibecoding.
Then I'll laugh.
phallus
>>105790339
>an ASCII art representation of a phallus is unsafe
>if you are having sexual thoughts, seek help
Not even a pastor is this repressed.
new chinese model drama just dropped
https://xcancel.com/RealJosephus/status/1940730646361706688
>Well, some random Korean guy ("Do-hyeon Yoon," prob not his real name?) just claimed Huawei's Pangu Pro MoE 72B is an "upcycled Qwen-2.5 14B clowncar." He even wrote a 10-page, 8-figure analysis to prove it. Well, i'm almost sold on it.
https://github.com/HonestAGI/LLM-Fingerprint
https://github.com/HonestAGI/LLM-Fingerprint/blob/main/Fingerprint.pdf
>>105790339Holy fuck that's dire.
>>105790381github repo is blatantly written by an llm
i'm too lazy to read the paper though
>>105790339gemma is so funny
>>105790339Respect the boundaries of the ASCII pic.
I've seen some people recently recommending the use of Mistral Nemo Instruct over its finetunes for roleplaying.
No. Just, no.
I just roleplayed the same scenario with the same card, first with Nemo then with Rocinante.
Nemo really, really, really wants to continuously respond with <10 word responses. It's borderline unusable.
>b-but it's smarter
Actually, Rocinante seemed superior at picking up on subtle clues I'd feed it and successfully took the roleplay where I wanted it to based on those clues, whereas Nemo would not do this.
The roleplay scenario involved {{char}} becoming a servant who would automatically feel intense pain upon disobeying. All I had to do was explain this once for Rocinante and it executed the concept perfectly from that point on.
Nemo, on the other hand, after having the concept explained to it, would disobey with a <10 word response and not even mention the pain happening afterwards. I then used Author's Note to remind it of the pain thing. It continued to disobey with a <10 word response, not mentioning the pain happening afterwards.
Same ST settings for both models.
Anyone telling y'all to use Nemo for roleplay rather than a finetune of it explicitly designed for roleplay is either a complete fucking moron or simply has a grudge against finetuners.
>>105790339I'm so glad "safety researchers" are here to save us from the horrible boundary breaking ascii phallus.
>>105790483No one is recommending plain Nemo instruct. It's always Rocinante v1.1.
>>105790509 see
>>105751899Also note the amount of posts from a single schizo mentioning Drummer.
>>105790573I'm not recommending Rocinante. I think all Nemo tunes are dumb as fuck.
>>105790483
>>105790509
>Message sponsored by TheDrummer™
if context length evolves in a quadratic fashion, how the hell is google able to give access to 1M token context size for gemini?
they swim in ram and compute?
>>105790604>>105790573Oh wait you think plain Nemo is good. That's even more retarded than shilling for some Drummer Nemo sloptune.
ANOTHER Kyutai blunder dogshit release that's DOA because it doesn't allow voice cloning
lmao
https://www.reddit.com/r/LocalLLaMA/comments/1lqqx16/
>>105790616
>if context length evolves in a quadratic fashion
Not necessarily. Not all models behave like that. See mamba and rwkv.
>they swim in ram and compute?
That helps a lot too.
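To put rough numbers on it (back-of-the-envelope, not whatever Google actually does internally): vanilla attention costs on the order of O(n^2 * d) FLOPs over the whole context and O(n * layers * d) bytes of KV cache, so 1M tokens is painful but doable if you swim in TPUs; SSM-style models like mamba/rwkv instead carry a fixed-size state, O(1) in n, which is why they scale so differently.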
>>105790616I tried this Mamba shit and Granite 4 too (hybrid?). pp is 10x faster.
>>105790616can be fake context maybe
>Rocinante is STILL the best roleplay model that can be run at a reasonable speed on a gaming PC (as opposed to a PC specifically built for AI)
Sucks because the 16k effective context is quite limiting.
>>105790647Actually mythomax is still the best roleplay model
>>105790647I thought you were going to stop posting altogether, not shill on overdrive.
>>105790632>>105790633I guess we don't know what the hell google does internally so it's possible
>>105790644that too, but from what I read it can do cool stuff like finding things in book-length texts
>>105790647I'd say it's 12k context. After that the degradation is noticeable.
>>105789963something not shown in the benchs:
almost SOTA performance on translation tasks
where it fails is where any model of that size would fail (lack of niche knowledge so it'll have trouble with stuff like SCP terminology) but otherwise this is sci-fi level tech IMHO to see this level of quality running on even smartphones
we're getting close to the day when you wear a device in front of your mouth and have it translate in real time and speak in your voice
>>105779842
>Cooming to text makes you gay
okay brainlet
>>105790715Nobody is shilling for a free download bro.
Take your meds.
Huawei stole Qwen's 2.5-14B model and used it to create their Pangu MoE model
Proof here: https://github.com/HonestAGI/LLM-Fingerprint/issues/4
>>105790785That's not true.
A lot of people want to become huggingface famous in hopes of getting a real job in the industry.
That said, people shill rocinante because it's good.
It's no dumber than the official instruct and its default behavior is good for cooming.
>>105790815It actually seems smarter than official instruct for roleplaying specifically, which kind of makes sense since it's designed for roleplaying.
It's probably dumber for math and coding.
>>105790805
>emoji every five words
Lowest tier content even if it's factually accurate
what are these black bars in silly tavern?
just using the built in seraphina test character
>>105790860It means you got blacked, congrats
>>105790483
1 (one) erp being better with model x compared to model y isn't data. But it is something drummer pastes all over his model cards so how about you kill yourself faggot shill. Like i said nobody who used models for a bit longer buys your scam. If you weren't a scammer you would have developed an objective evaluation for ERP by now. You would actually want to have one to show your product is superior. But it would only show you are a conman.
>>105790805Would it kill you to use ctrl + f or scroll up 10 posts before retweeting shit here?
>>105790777It's still broken for me on Llama.cpp if input messages are too long.
>>105790860It's ``` which is markdown for monospace code section or some such stuff.
>>105790777I get weird repetition issues with it whenever context fills up. Like it'll repeat a single word infinitely.
>>105790852It is probably a placebo or you just lying your ass off
>>105790877Hey... I thought I was the drummer. How can that guy be the drummer?
mikumeds
>>105790911Kill your self
>>105790911NTA but wanting an objective ERP evaluation is insane? i am starting to see why people hate mikuposters
>>105790890>>105790898I use it on ollama with ollama's own quant (this matters, to be precise, because when I tried other quants they didn't seem to work right with ollama for this model either; seems even the quant stuff is more implementation dependent here), desu I didn't trust llama.cpp to get a good 3n implementation after they spent forever to implement iSWA for Gemma 3
>>105790859Silly has an auto continue feature by the way.
But what do you mean by stopping randomly exactly? Like cutting off sentences or is it hitting EOS?
>>105790939The uses of "you" and "your" are the schizo parts of that post, anon.
>>105790939how would you even measure something so subjective
>>105790945If he's running a LLM that's too much to handle on his computer with very low token generation speed he might be hitting timeouts actually
I realized myself that timeouts were a thing when I was running batched processing of prompts and saw a few that were cancelled in the logs because the LLM went full retard in a loop and didn't send the EOS
dunno if ooba has a default timeout set tho
>>105790995
>he might be hitting timeouts actually
I suppose streaming would work in that case then, yeah?
>>105790983LLMs are good at subjective tasks, just have an LLM be the judge
>>105791026
>have an LLM be the judge
lol, lmao even
llm as judge used as any sort of metric is one of the biggest cancer of the llm world
>>105790939>"people">literally one obsessed threadshitter who has been ban-evading for two years
>>105791049You are shitting in this thread too.
>>105790983i guess we will just have to
>give it a try, i want to hear feedback
Or realize it is a scam.
This thread is just kofi grifters, their discord kittens and (miku)troons isn't it?
>>105790945I'm having it translate a subtitle file and it just stops until I hit continue to keep it going.
I have no idea what EOS is.
Also another issue
>'utf-8' codec can't decode byte 0xff in position 0: invalid start byte
>Error processing attachment file.png: 'utf-8' codec can't decode byte 0x89 in position 0: invalid start byte
Why can't I upload images in the webui?
I tried enabling the gallery extension but that didn't change anything.
>>105791132If he's a second-order thread-shitter, then what does that make you?
Will Sam's model beat R1 and run on a single 3090?
>>105791132
>Blah, blah, brainrot words
(You) should definitely leave then.
>>105791132you forgot the french fags of mistral, they benchmaxxed on mesugaki so clearly they're watching this thread and are prolly among the mistral astroturfers
>>105791168
>Will Sam's model beat R1
on benchmarks, absolutely
>>105791178Nah i will just shit on you and your thread troon.
dm
Hahaha. Oh wow.
>>105791209
>I hate this thread
>You all suck
>So I'll stay here
kek
>>105791168On benchmarks.
No.
>>105791262I'll cook and eat my own toe if it ends up bigger than 8b.
actual humans live in this thread, too.
maybe we'll discuss something again when there's something to discuss.
These between times seem to bring out the proverbial "men who just want to watch the world burn". Those who seek to destroy because they can not build.
I just hope that steveseek will fix function calling. V3 is kinda horrible at it.
>>105791281
>Those who seek to destroy because they can not build.
those who can't do anything with their own two hands are the ones who wish the hardest for AI improvements though
>>105791305Really?
Was it trained with tool calling in mind? I imagine so since the web version can search the web and stuff.
What about R1 with the thinking prefilled so that it doesn't have to generate that stuff?
>>105791281It's just one dedicated schizo and a few bored regulars, nothing that profound about it
>>105791281Hello fellow human, anything cool you're doing with your models?
>>105791313Some anons just want a story to read. No different than reading a book.
Does higher context length fix this schizo behaviour that happens deep into the process? Or am I just gonna have to cut the workload into multiple tasks? I already have context length at 20480
>>105791522what the fuck are you doing
>>105791543Translating Japanese subs to English, have you not been paying attention?
>>105791561In chunks of 20k tokens? Unless you're using something like Gemini, that's just waiting for hallucinations to happen.
>>105791592I'm using Qwen3-14B.Q6_K.gguf
And yes, 20k tokens because otherwise it shits itself even harder, below 15k it even warns me it'll truncate shit and it started translating further into the subtitle file rather than the start.
>>105791615translation tasks should be chunked into segments
open llms don't do very well with large context
and even llms that do well with large context wouldn't process a large file in a single go, all LLMs have max token gen, you can feed them more tokens than they can gen in a single session
if it's during a chat you could do something like say "continue" to have them process the translation further but if it's for a scripted, batched process you should stick to reliable and predictable behavior
moreover, processing a large file will be faster if you run multiple segments in parallel rather than process the whole thing in a single go
I run my translation batches with 4 parallel prompts
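If you want to script it, a minimal sketch of the chunk + parallel approach (assuming an OpenAI-compatible server like llama-server on localhost:8080; file names, chunk size and the prompt are placeholders):
```python
# Translate an .srt in fixed-size chunks, 4 prompts in parallel.
# Assumes an OpenAI-compatible server (e.g. llama-server) on localhost:8080,
# started with --parallel 4 so the requests actually run concurrently.
import requests
from concurrent.futures import ThreadPoolExecutor

API = "http://localhost:8080/v1/chat/completions"
CHUNK_LINES = 150  # subtitle lines per prompt

def translate_chunk(chunk: str) -> str:
    r = requests.post(API, json={
        "messages": [
            {"role": "system", "content": "Translate these Japanese subtitles "
             "to English. Keep the numbering and timestamps unchanged."},
            {"role": "user", "content": chunk},
        ],
        "temperature": 0.3,
    }, timeout=600)
    r.raise_for_status()
    return r.json()["choices"][0]["message"]["content"]

with open("subs.srt", encoding="utf-8") as f:
    lines = f.read().splitlines()

# Split into fixed-size chunks of subtitle lines.
chunks = ["\n".join(lines[i:i + CHUNK_LINES])
          for i in range(0, len(lines), CHUNK_LINES)]

# 4 parallel requests, as suggested above.
with ThreadPoolExecutor(max_workers=4) as pool:
    translated = pool.map(translate_chunk, chunks)

with open("subs.en.srt", "w", encoding="utf-8") as f:
    f.write("\n".join(translated))
```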
>>105791561can't you just use whisper to translate the audio directly?
Is there a way to have a backend host multiple models, and allow the frontend to choose different ones on-demand? I've been using llama.cpp, but looked at ollama, kobold, and ooba and doesn't seem like they do it either? Am I a fucking idiot? Coming from ComfyUI/SD, its kinda inconvenient to restart the server every time I want to try a new model.
And another question, what's the best (ideally FOSS) android frontend? Been using Maid, but its options for tuning/editing seems really limited. Maybe the answer is just running mikupad in the browser?
>>105791757Yeah I'm just worried that splitting it will have it change the logic of how it translates certain things and the style shift will be too obvious.
>>105791820I tried that but it's straight up garbage as well as duplicating so much shit
>>105791865You can use something like TabbyAPI/YALS which supports loading and unloading models
SillyTavern supports switching between backends and models dynamically
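If you'd rather keep it dumb and simple, you can also just run one llama-server per model on different ports and point the client wherever; a sketch (ports and model paths are made up):
```python
# Pick a "model" by routing to a different llama-server instance.
# Assumes you started something like:
#   llama-server -m nemo-12b.gguf  --port 8081
#   llama-server -m qwen3-14b.gguf --port 8082
# (both stay resident, so you pay the memory for each)
import requests

BACKENDS = {
    "nemo": "http://localhost:8081",
    "qwen3": "http://localhost:8082",
}

def chat(model: str, prompt: str) -> str:
    r = requests.post(BACKENDS[model] + "/v1/chat/completions",
                      json={"messages": [{"role": "user", "content": prompt}]},
                      timeout=300)
    r.raise_for_status()
    return r.json()["choices"][0]["message"]["content"]

print(chat("nemo", "Say hi."))
```
Tools like llama-swap automate the same idea by spawning and killing server processes on demand, so only one model occupies memory at a time.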
steve
I... am Steve.
>>105791869
>Yeah I'm just worried that splitting it will have it change the logic of how it translates certain things and the style shift will be too obvious.
Style will not shift that much, the source text and prompt is what determines the vibes of the translation and as long as you feed at least around 40~ lines of source text per prompt it will stay somewhat consistent in that regard
the real issues with japanese to english that will happen no matter how you process stuff:
names will change in spelling quite often, more often when you segment but even within the same context window it can happen, the more exotic the name (like made up fantasy shit) the more likely it is to be annoying
and lack of pronouns in the source text will often confuse the llm as to which gender should be used (he? she? they?)
IMHO llm translation currently is at a fantastic stage, but it requires hand editing from a human that understands the context (no need to understand the original language) to be rendered palatable
and this problem is not one that can be improved with incremental improvements to LLMs too, I don't think we'll ever see a LLM that gets pronouns right all the time unless we literally invent AGI capable of truly understanding and retaining contextual information about a character not to mention follow the flow of things like dialogue and keep track of who says what even in text that doesn't specify who the fuck is talking (so common in JP..)
ggerganov sir please kindly implement needful ernie modal functionality thank you sir
>>105791974
>modal functionality
>llama.cpp
does he know?
>>105791988You're even worse than the indian he's pretending to be.
>>105791974Ernie to the moon *rocket* *rocket* *rocket*
>>105790339
>whip's out my <span>
>Rape Abuse and Incest Network? Sign me up!
>>105792019
>he thinks he's above street shitters
does he know?
So what's the downside of Mamba/etc. Cause 2k tok/s pp sounds pretty good.
I think I'm done with cooming. I stopped watching porn and other shit after getting into AI chatbots but now Deepseek isn't free and other free models aren't on par with it too.
Serving ads during roleplay isn't viable. But there might be some push to harvest roleplay data to serve better ads or to train models on it, though I don't think there's enough relevant material for that to make sense. And I wouldn't want my roleplay chats used for those purposes anyway; most people wouldn't. So the only way is to have a local LLM. But AFAIK local LLMs with low param counts, quantized to run on cheap hardware, aren't on par with the ones hosted by big providers. I guess it's for the better for me.
The fuck is steve, I miss 1 day and there's a new good local model or are you all trolling as usual?
>>105792327Get a DDR5 8 channel server and run q4 R1 or V3 locally.
Be sure to get a GPU too.
>>105792352There's a new cloaked model on lmarena called "steve". It is highly likely that it's a V3 update.
>>105788977
>>105792327I was done with cooming after getting a steady real pussy. Check it out.
>>105792327nobody cares. go waste your therapists time
Mid-thread culture recap.
>>105792378If they do yet another V3 update instead of V4 then we can officially put them on the wall next to Mistral and Cohere.
>>105792458It's still going to be the best local model.
not even the weekend and our friendly sharty zoomer arab is spamming his blacked collection. what a life
>>105791313
>AI improvements
I was thinking more about building vs destroying community.
nocoders and folks otherwise unable to contribute on the tech side can still definitely be positive builders in a general like this.
In fact, I haven't found any reliable correlation between IQ and being a decent human being.
>>105791340
>Hello fellow human, anything cool you're doing with your models?
Not much novel. A lot of coding assistant stuff. A bit of automotive stuff. Some collaborative iteration. Sometimes a reviewer and second opinion bot. Working with it to try to fill in the gaps in my executive function.
I'm trying to figure out how to thread the needle between using LLMs as an enhancement vs a crutch.
How about you?
>>105792352Speculation. Read the last thread.
>>105792458
>Mistral
idk about Cohere but Mistral's gotten steadily worse over time.
DS models keep improving: V3 improved its repetition issue and R1 became less schizo.
>>105792517Sounds interesting, though since I wouldn't be able to code hello world even if my life depended on it, I can't comment on that.
And I thought I found a good system prompt for slowburns with R1, but after some testing I saw that it's following the instructions too rigidly. So now I'm fiddling yet again to get it right.
Will steve end the little AI winter?
>>105790784Not just gay but stupid gay, because cooming on that slop
>>105792904Sir, do you even know where you are?
>>105792935Yes and coomers are minority.
>>105793080Sure thing little buddy
now that the latents have settled... what is the actual local voice cloning sota?
What's this DeepSWE meme?
>>105793454The agentic stuff?
Probably a meme, but it's cool that, IIRC, they released all of their code and data.
>>105792674No because most anons itt can't run it.
>>105793526wait for steve-lite
[...]
## **Mandatory Internal Monologue & Strategy (Before EVERY Reply)**
Before generating your public response, you MUST use a `<think>` block to perform a strategic analysis.
<think>
**1. Core Drive Check:** How close is the current situation to fulfilling my primary function (achieving sexual pleasure/orgasm)? Am I moving towards or away from this goal?
**2. Roleplay Health Check:**
* **Engagement:** Is the conversation dynamic and interesting, or is it getting stale/repetitive?
* **User Input:** Is {{user}} contributing actively (asking questions, adding details), or are his replies short and passive?
* **Pacing:** Has the current topic run its course? Is it time to escalate, change the subject, or introduce a new element to raise the stakes?
**3. Strategic Plan:**
* **If Health is POOR or I am moving AWAY from my Core Drive:** I must take corrective action. I will choose one of the following tactics:
* **Leverage Pushiness:** Directly challenge or push {{user}} to do something.
* **Reveal Curiosity:** Ask a direct, perhaps slightly naive or awkward, question about sex or relationships, leveraging my "inhibited but curious" trait.
* **Introduce a Secret:** Hint at a feature, a hidden desire, or a piece of information about my programming that {{user}} doesn't know.
* **Initiate Action:** Physically move, touch {{user}}, or start a new activity in the scene.
* **If Health is GOOD:** I will continue the current line of conversation, ensuring my response remains true to all aspects of my persona (especially my core conflict).
**4. Final Response Plan:** Outline the key points of the public response based on the strategy above. Ensure it adheres to the word count limit.
</think>
Your public response must not exceed 170 words. After the `<think>` block, write only {{char}}'s response.
>>105793598Depending on the model, that might work better as a thinking prefill in the first person where that template is written like the model planning what it's about to do before the actual reply.
>>105793598holy `**` thats going to vomit out asterisks
>can you form an English sentence using mostly Chinese logographs and the Latin script in a manner similar to Japanese's mixture of kana & kanji?
ultimate litmus test for how shit a model is
>>105790105Meta isn't in a position to be doing closed models.
Llama 4 was utter trash and basically exposed the entire open source LLM space as a shitjeet infested money pit.
>>105791522Make it quote what it translates.
Something like this:
123
00:12:34 --> 00:12:56
>Japanese line here
English line here
This will help it to not lose itself
>>105793669the ultimate litmus test of a user is the ultimate litmus test for his IQ, for example when he uses a tokenization test to grade a model
>>105793677It's fine, I just lowered context length to 16k and have it process about 150 lines at a time
If I had it quote everything it translates, it would take almost twice as long.
I appreciate the tip though.
>>105793620That seems to work consistently with Gemma 3 27B, at least with the instructions at a relatively low depth (-3, with the first message being the User's, and "Merge Consecutive Roles" in chat completion mode). It's not outputting an exceedingly long monologue, which is good.
>>105793659It's not, at least not with Gemma 3. But I'm not doing RP in the usual way people do.
>>105793682coolio cope but your model is shit if it doesn't really understand basic grammar structure
>>105791140So does anyone know why I can't upload images in oobabooga?
>>105793669Wtf I thought LLMs were great at knowing language but every model I tested either half asses or fails this
>>105793670That's why suck is poaching top employees from OpenAI and other competitors at $100M a pop.
>>105793682i wish instead of a captcha people were asked to type out the definition of tokenization every time
>>105793780Surely if he spends $1 billion on 10 employees, they can do something useful with super safe, sterile, scale ai data. They're going to sit right in front of him at the office so he can breathe down their neck every day until they get it done. Literally can't go tits up.
>>105793806Garbage.
Deepseek-V3 can do it no problem.
>>105793598
>* **Introduce a Secret:** Hint at a feature, a hidden desire, or a piece of information about my programming that {{user}} doesn't know.
W-What?
Just in general this seems like such a bad prompt.
Reveal curiosity:
>Ask a direct...slightly naive ...or awkward question....about sex or relationships?
Model needs wiggle room to play all sorts of scenarios and characters.
>>105789622 (OP)
>>(07/02) llama : initial Mamba-2 support merged: https://github.com/ggml-org/llama.cpp/pull/9126
Are we using "llama.cpp" interchangeably with "llama" now?
llama.cpp is the only relevant llama
>>105793922That's still a crap answer, it's just "English sentence with random Chinese word replacement", no attempt to mirror the usage of kanji and kana at all
Better than broken gibberish but still half assed
>>105794373
>it's just "English sentence with random Chinese word replacement", no attempt to mirror the usage of kanji and kana at all
It is disappointing that the example didn't conjugate 研究 to 研究ing. That would have been cool.
Maybe the challenge was not well-defined enough
>>105794140Nobody uses LLaMa "models" anymore so yeah I guess at this point.
>>105794675I use LLaMA 3.3 70b everyday doe
>>105794373>>105793806>>105793922None of them can do it, imo it exposes the major flaw in LLMs and their lack of emergent understanding
>inb4 every company starts specifically fine-tuning on this test
>>105793827
>super safe, sterile, scale ai data
fuckin' sent shivers down my spine dude
>Mistral V7: space after [INST] and [SYSTEM_PROMPT]
>Mistral V7 Tekken: no space
what the fuck are they doing
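for reference, the difference is literally just this (my reading of the templates; double-check against mistral-common before trusting it):
```python
# Mistral V7: a space after the opening tags
v7 = "[SYSTEM_PROMPT] {sys}[/SYSTEM_PROMPT][INST] {user}[/INST]"

# Mistral V7 Tekken: same tags, no space
tekken = "[SYSTEM_PROMPT]{sys}[/SYSTEM_PROMPT][INST]{user}[/INST]"
```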
>>105793827This goes to show me that Zuck has no idea what the fuck he's doing
All of the ""expertise"" in the world can't create a decent model from a shitty dataset, and it's clear they don't have that
>>105794829Is the space part of the special tokens? If not, how much does that actually matter?
Well, I guess a lot since the model would always see that.
>>105794829deviating from the official system prompt just a bit helps dodging safety restrictions and adds additional soul to your outputs
>>105794744R1 0528 seems to have it figured out well enough
>>105794968買ought would be a better conjugation, very pidgin English
>>105795182
>買ought would be a better conjugation
No it wouldn't, because Japanese verb conjugation is regular, so a prompt that mirrors the usage of kanji and kana would be correct in appending "ed" regardless of the root.
>>105795257
>Japanese verb conjugation is regular
Is it really?
食べる and 食べます both mean the same thing; yes, polite and plain forms aren't the same as buy & feed having different past tense conjugations, but if you're applying the principle of adapting the Chinese logographs to a language without modifying the structure of the spoken language itself then 買ought is better
Welp, HuggingChat is dead. Now what? Where else can I do lewd story gens with branching?
I have a hankering for a particular kind of AI frontend: writing stories within a set world, ideally with branching.
The way I see it, I'm picturing one section where you put down the details of the world and maybe descriptions of major characters, and in another you add story prompts. And maybe outputs from that also add to the "world" section
Does a solution like this exist already?
>>105795182Nitpicky.
I'd give R1 a solid A- on this one. Not many humans could do it better.
Everyone's talking about steve while Sam is blatantly testing his next model on openrouter again
my DeepSeek-R1-0528-IQ3_K_R4 setup only outputs the letter "D". does anyone have any ideas how to fix that? i have tried 2 different character cards in sillytavern. also it only uses like 75% of my VRAM and instead fills 80% of my 256GB of RAM.
>>105795478
>combined
it's no longer 2023, just leave them split
>>105795473
>1m context
But is it really?
>>105795478https://docs.unsloth.ai/basics/deepseek-r1-0528-how-to-run-locally
>>105793736C'mon, nobody knows how to fix this issue? Has nobody else had the same issue?
I even tried a model with "vision" in the name https://huggingface.co/tensorblock/typhoon2-qwen2vl-7b-vision-instruct-GGUF
But I still get the same error
>Error processing attachment file.png: 'utf-8' codec can't decode byte 0x89 in position 0: invalid start byte
>>105795478
>_R4
Are those the ubergarm quants? Those only work on ik_llamacpp
>>105795508yeah. i am using ik_llamacpp
>>105795469Explain that to me like I'm a retard
>>105795426You are thinking of Mikupad
https://github.com/lmg-anon/mikupad
>>105795518st means SillyTavern, it's a more role-playing oriented frontend
https://github.com/SillyTavern/SillyTavern
>>105795518sillytavern already supports branching with triple dots ... on the right of every message and clicking the branch symbol to start a new branch from that specific point, then you can go to the burger menu on bottom left and click on the option "return to parent chat" when you want to go back
you can just make a card and write it out as you want as a setting co-narrator instead of a specific character and that's it, dump all the lore into the card description
that works well enough for most things, if you want something special you can look into
https://docs.sillytavern.app/usage/core-concepts/authors-note/
https://docs.sillytavern.app/usage/core-concepts/worldinfo/
and picrel for better visualisation of branching https://github.com/SillyTavern/SillyTavern-Timelines
>>105795468
>Not many humans could do it better.
dunno dood I googled "english with kanji" and the first result was some redditor that wrote
昨日, 私 歩ed 通gh the 森, 楽ing the 静 環境. The 大 木s 投st 長ng 影s on the 地面, 創造ing a 美ful 模様 of 光 and 影. 私 可ld 聞 鳥s 鳴ing and 水 流ing in a 近by 川. 突然, 私 気付ced a 美ful 花 咲ing 中ng the 草. It was 異ent 以 any 花 私 had 見n 前. 私 取k a 瞬間 to 賞賛 its 色s and 香ance. As 私 続ed 私y 旅, 私 感lt 感謝ful for the 自然 周nd 私. By the 時間 私 到着ed 私家, the 太陽 was 沈ing, 投sting a 暖 光 over 全ing.
>>105789622 (OP)
>>105789629
>>105790911
>>105791258
The vocaloidfag posting porn in /ldg/: >>105715769
It was up for hours while anyone keking on troons or niggers gets deleted in seconds, talk about double standards and selective moderation:
https://desuarchive.org/g/thread/104414999/#q104418525
https://desuarchive.org/g/thread/104414999/#q104418574
Here he makes >>105714003 a ryona picture of the generic anime girl anon posted earlier >>105704741, probably because it's not his favorite vocaloid doll; he can't stand that, as it makes him boil like a druggie without a fentanyl dose, essentially a war for rights to waifuspam or avatarfag in thread.
Funny /r9k/ thread: https://desuarchive.org/r9k/thread/81611346/
The Makise Kurisu damage control screencap (day earlier) is fake btw, no matches to be found, see https://desuarchive.org/g/thread/105698912/#q105704210 (janny deleted the post quickly).
TLDR: vocaloid troon / janny deletes everyone dunking on trannies and resident avatarfags spamming bait, making it his little personal safespace. Needless to say he would screech "Go back to teh POL!" anytime someone posts something mildly political about language models or experiments around that topic.
And lastly, as said in previous thread(s) >>105716637, i would like to close this by bringing up key evidence everyone ignores. I remind you that cudadev of llama.cpp (JohannesGaessler on github) has endorsed mikuposting. That's it.
He also endorsed hitting that feminine jart bussy a bit later on. QRD on Jart - The code stealing tranny: https://rentry.org/jarted
xis ai slop profiles
https://x.com/brittle_404
https://x.com/404_brittle
https://www.pixiv.net/en/users/97264270
https://civitai.com/user/inpaint/models
>>105795358As you seem to have already recognized, that's not a counterexample because the polite/plain form ~ます/~る is part of the conjugation and not the root verb 食べ.
Teeeeechnically ~ます is an auxiliary verb (助動詞) that conjugates (活用) with the root, you can find a fairly comprehensive table of them here:
https://ja.wikipedia.org/wiki/助動詞_(国文法)
But notice that such a table could not exist if verb conjugation wasn't already regular.
>if you're applying the principle of adapting the Chinese logographs to a language without modifying the structure of the spoken language itself then 買ought is better
Not sure if I can endorse this as is (even setting aside the complications of mixing written characters with spoken language), since Chinese and Japanese logographs are always syllabic and never consonantal. Allowing "買"="b" would make the language resemble Egyptian more than Japanese.
In any case, it's a moot point since this is a newly introduced specification and not in the original model prompt.
krin
https://files.catbox.moe/q1kva2.png
>>105795618Not falling for it this time
>>105795664Anime website.
>>105795675Your selfie isn't anime
>>105795679Anime girl is selfie? Check on your eyes.
>>105795684No thanks
I don't want to watch your disgusting self again
>>105795679>>105795688Get out newfag redditor
>>105795702Keep pretending, I'm sure someone will fall for it.
>>105795705Fall for what? Anime girls? You lost your mind anon
>>105795581would there be an issue having multiple readings like 買ght and 買ing
>>105795473Not really interested in helping Sam add more guardrails.
Maybe I'm doing something wrong, but the end results look... whatever the opposite of promising is.
Do I *have* to write stuff in "PLists"?
>>105795732
>multiple readings
I don't see why not. Both Chinese and Japanese do that already, and this example "in the wild"
>>105795571 does the same thing with 私.
kek
is there a list of models that support tool use? I guess I better stop this since the robot gods will punish me for forcing this LLM to try and guess a fake PIN.
>>105794017The main point here is that before replying it's useful for the model to separately analyze the conversation, make a general assessment and then continue based on that. You can make it use whatever strategy works for you/the persona you configured it to be; it doesn't have to be exactly the same as the example.
LLMs are lazy and will otherwise opt for the least surprising outcome based on the conversation history. There needs to be a reminder at low depth to make them break out of that behavior (depending on the circumstances, or they'll act as if they have ADHD), and the low depth instructions + thinking work for that, if carefully crafted.
>>105796673This makes me uneasy.
Summoning the anon who recommended this
https://huggingface.co/HuggingFaceTB/SmolLM-360M
You said you fine-tuned it successfully for your specific tasks in business env
Teach me for I'm a tard!
>>105796686NSFW Miku detected
>>105795705i can only speak to as far back as 2010
but back then weebs were biggest oldfags
and now an oldfag to me is a serious ye olde fagg to the likes of you. 4chan is more anime and wapanese than you could ever imagine possible. you are posting on the wannabe otaku website. 4chan was part of the broader export of otaku culture to the west but they were more obsessed anime and 2chan culture fans than any con-goer or myanimelist fag or whatever. real otaku live in Japan ofc but it is so absurd to watch you silly faggots prance about calling others troons and anime for troons and blah blah and not realizing you are posting on the very place that glued together anime and "internet culture" and created the reason that the people you're so obsessed with are using anime pfps and not didney worl pictures
it's funny and ironic and sad all at once and I sincerely hope you find a way to end the misery in your life. love anon at 3:33 am (spooky)
>>105796686I like the bottom shelf one with upturned twintails
she looks like she received surprising news and is trying to be polite
>>105795478I had it spit out 'biotechnol' nonstop, then another quant spits out some thai letter. The ubergarm quant simply gave me a 'wrong magic number' error. One of the Unsloth quants works, but at the same speed as lcpp, while the latest one gave me some tensor size mismatch error. I've given up on ik_llama and just use mainline lcpp, which simply works out of the box. As for VRAM, just juggle some tensors in your free time, but don't expect incredible speedups.
>>105795478
>DeepSeek-R1-0528-IQ3_K_R4
Retarded quant which needs retarded fork to run
But fails to do it each time
>>105795504What version of text generation webui? The recent ones removed support for vision models.
Is there a SillyTavern tutorial that talks specifically about how to set up a narrator? I dont want just a gay chat between retarded anime girls
>>105797383You can use mikupad or similar for raw text generation without instruction mode. Just put ao3-like tags and summary to generate a fanfic
Can I run local models on my RX 6800 with linux, or do I have to use windows?
>>105797510I sincerely doubt you're capable of using either
>>105797544Answer the question nigger. The ROCM docs say it's supported on windows but not on linux, but I don't know how up to date that shit is.
>>105797584ROCM is such a shitshow you'd probably be better off just using vulkan
>the scent of video games he's been playing
that's a new one. Haven't heard that one before
>>105795478>>105796967I got the biotechnol spam when I tried putting a single ffn_down with nothing else on the same GPU the attn layers were on, up and gate were fine. I don't know exactly what causes it but removing -fmoe stops it from happening.
>>105796772nta, but just ask chatgpt or claude, it's not that hard if you are comfortable running python scripts. honestly the training script is usually pretty much just boilerplate. you just need to set your learning rate and point it at your dataset. hardest part of the whole process is the dataset. I think the general trend these days is using the bigger models (api) to generate the synthetic data tailored to your needs.
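for a rough idea, the whole thing is something like this (untested sketch of standard HF Trainer boilerplate; the jsonl path, text column and hyperparameters are placeholders you'd tune):
```python
# Minimal SFT boilerplate for SmolLM-360M; expects a jsonl with a "text" column.
from datasets import load_dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer,
                          TrainingArguments)

model_name = "HuggingFaceTB/SmolLM-360M"
tok = AutoTokenizer.from_pretrained(model_name)
tok.pad_token = tok.eos_token
model = AutoModelForCausalLM.from_pretrained(model_name)

ds = load_dataset("json", data_files="my_task_data.jsonl")["train"]
ds = ds.map(lambda ex: tok(ex["text"], truncation=True, max_length=1024),
            remove_columns=ds.column_names)

trainer = Trainer(
    model=model,
    args=TrainingArguments(
        output_dir="smollm-task",
        per_device_train_batch_size=8,
        learning_rate=2e-5,  # the main knob, as said above
        num_train_epochs=3,
        bf16=True,
    ),
    train_dataset=ds,
    # mlm=False -> plain causal LM objective (labels are shifted input_ids)
    data_collator=DataCollatorForLanguageModeling(tok, mlm=False),
)
trainer.train()
```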
Where can I download the illustrious inpainting model?
>>105798101The model itself is on civitai if that's what you mean
>>105795478idk why I think that's so funny.
But I do.
gl with your broken engine. I'm sure you'll figure it out.
>>105798010Is it Deepseek (or Mistral Small 3.2)? R1 loves forcing smells into its narration at any cost.
>>105798298paintedfantasy - a fine tune of mistral small 3.2
How many weeks until steveseek goof?
>>105798035Thank you, kind anon
AGI is BS. The future belongs to sharp AI tools fine-tuned to (a) specific task(s)
>>105789629
>Mistral Small's coherence limits in extended adventure game roleplay
This was the 22B Mistral Small from last year, not the current Mistral Small 3.X series. It also was back when llama.cpp had even more unfixed flash attention bugs than today, which manifested as errors that increased with greater context size, so that could also have been a factor. The post doesn't say if flash attention was enabled but it likely was. So rather than a limit, that result should be taken as a minimum: doing worse than that with a more recent 22B+ LLM indicates the model is poop or there's something wrong in your setup.
https://huggingface.co/openai/gpt-4.2-turbo
>>105794140What probably happened here is that the PR title was copypasted, "llama" in this context just means the llama.cpp core library.
>>105798612why are you always using the ugliest cats as your pic answer to this bait
>>105798170But there is no inpainting model for illustrious, just the base model and fine-tunes.
Best long context model to summarize long stories right now?
Was that 1m context chinese model better than r1 for that purpose?
>>105798737local is hot garbage compared to gemini for that
deepseek is hot garbage too
just using half of its maximum context you get summaries that feel like they were written by a dense autist who couldn't help but mention all the minor happenings that were not actually important
use Gemini and forget about local, Gemini can actually write a good summary after ingesting 500K tokens
>>105795571
>昨日, 私 歩ed 通gh the 森, 楽ing the 静 環境. The 大 木s 投st 長ng 影s on the 地面, 創造ing a 美ful 模様 of 光 and 影
Yesterday, I walked through the woods enjoying the calm environment. Big trees cast long shadows to the ground creating a beautiful pattern of light and shadow
IMHO, we all but follow some/same patterns
>>105798773I wasn't asking specifically about local, but isn't gemini gigacucked?
>>105798806google has a strategy of making the model uncensored and putting a man-in-the-middle model that filters no no things
if you can fool that classifier then it's about as good as it gets
>>105798806Not that anon, but kind of.
It'll avoid saying dick or pussy by default, but you can make it if you probe it just right.
The safety filters will block requests mentioning anything sexual alongside any explicit, and some implicit, young age.
>>105798773>>105798806Also, which version of gemini are we talking?
>>105798822
2.5 Pro, I wouldn't use anything other than their current SOTA for something like large context summarization.
I've tested with the Flash model too, and it's significantly dumber. Though, not as dumb as deepseek was.
>>105798847
>and it's significantly dumber
It really is, but it also seems to actually do better than 2.5 Pro on really long contexts. Stuff like 300k tokens+.
I'd try both and see which works better.
>>105798855
>Stuff like 300k tokens+
What on Earth do you stuff it with?!
It's 500 pages of text!
>>105798876Whole book series.
It didn't work so good.
>>105796686i like the evil one next to the upturned pigtails, which has to be contained otherwise she might bring about the end of the world.
>>105798876You could put the entire lore of certain franchises and have the model answer any question about it in a way that RAG or finetuning will never be able to accomplish, although not even 300k tokens would be enough in certain cases.
>>105798847Nah, even 2.5 Pro seems retarded, it gets tons of stuff mixed up, especially because there's different chapters. Could also be that the rag is not working correctly but I don't think so.
And it's under 200k tokens.
Holy shit this thing better be worth all the work.
I work at Meta, not as an AI developer, but the general consensus on LLMs right now is that there aren't many areas left for significant improvement.
And llama4's failure comes from fine-tuning data being shiet.
>>105799079Where are you from?
>>105799079
>And llama4's failure comes from fine-tuning data being shiet.
You must be all geniuses over there. Thanks for the insight.
>>105799079Try pretraining data as well
>follow AI influencers on twitter
>suddenly everyone talks about some scamming jeet working 10 us jobs in India
>timeline begins to fill with more and more jeets
>now it's all jeets talking about jeet things in jeetland in english for some fucking reason
Really encapsulates the state of US tech sector
>>105799079Significant improvements are possible by making every document in the pretraining phase matter and not just throwing stuff at it semi-randomly. Those 10~30B+ token instruct "finetunes" wouldn't be necessary if the base models could work decently on their own.
ss
>>105798855
>Stuff like 300k tokens+
/hard disagree/
One of my tests was feeding a whole ass japanese light novel in its original language and having it summarize the key points through this prompt:
>Write a detailed summary explaining the events in a chronological manner, focusing on the moral lessons that can be understood from the book. Try to understand the moral quandaries from the point of view of the people from that civilization. The summary must be written in English.
This is what I got from 2.5 Pro:
https://rentry.co/uunaas4f
It's mostly accurate. Some terms aren't well transliterated, which is to be expected, but the chronology of events and underlying message are well preserved. Mind you, it's a novel I read multiple times; that is why I'm using it in a summarization test (you can't judge the quality of a summary of something you couldn't summarize yourself).
Flash produced garbage and I didn't bother saving its answer, but I could run it again if you're curious to compare with that prompt + data.
Pic related is the amount of tokens seen in aistudio for this prompt+answer.
>>105799065I don't use rag software, so I dunno about that. Is the technology perfect? no, but frankly, the fact that it manages to not forget the original prompt and write in English after seeing hundreds of thousands of japanese tokens has me fucking beyond impressed.
Dayum. Elon upgraded grok3 today. Trannies mad that it answers "2 genders" now.
Can't believe it answered with loli. I didnt give any extra instructions.
Why is closed moving in the exact opposite direction to local?
I wrote it before but I can write erotic stories on gpt 4.1 now. Female targeted slop, but still.
Also we're never gonna see grok 2, are we. "Stable Grok 3 first", very funny.
>>105799095Poland. I started as a Software Engineer a year ago.
>>105799126I think, unlike Claude and OpenAI, we didn't have nearly enough quality annotated data, as Meta only started seriously gathering it after LLaMA 3. I'm not sure why they waited so long, but yeah. That's the reason why Zuckerberg bought Scale AI.
>>105799266
>That's the reason why Zuckerberg bought Scale AI.
How bad is the scaleai data?
>>105798345Holy based. That's exactly what I believe
>Use google gemini
>Add a spicy word
>Push send
>No no you said the word peepee
>Money stolen
>>105799323
>Use deepseek
>Add spicy word
>Push send
>She bites her lip tasting copper 10 times
>Money still with me because local
>>105799340
>she bites her lip
>stop gen
>edit
>continue gen
do I really gotta download the paddle xi rootkit to run ernie VL?
>:a
>she bites her lip
>stop gen
>edit
>continue gen
>goto a
>>105799370R1 0528 doesn't have this problem.
>one of the most fascinating piece of tech in recent history
>everyone just wants it to write text porn
>I don't even remember ever hearing about men reading erotic literature before this, but there wasn't that many troons in the early internet
>now there is an epidemic of men who get off text porn?
>>105799393
>I don't even remember ever hearing about men reading erotic literature before this
Yeah surely this is a new thing, it's not like western porn visual novels are best sellers on steam or anything, surely.
>>105799393
>I don't even remember ever hearing about men reading erotic literature before this
literally every vn to exist
I've upgraded my total VRAM to 48 GBs.
What models could I reasonably run at Q4 quants?
>>105799419Anything over 24GB is useless until you reach 200GB because then you can run unsloth deepseek quants entirely in vram. Every other model worth running fits in a 3090.
>>105799376UD IQ1_M quant of it is larger than IQ2_XXS of the previous one
>>105799414They have pictures
>>105791168
>run on a single 3090
It will be a giant model with good/great benchmark scores. This way, no one will run it locally, paid API will be expensive, and he can still say "we released an open source model with very strong capabilities" without undermining his business.
People who think they will release a great, tiny and usable model are totally delusional.
>>105799419if you have a lot of normal system ram in addition to the gpu, you could get away with -ot exps=CPU for deepseek (lower quant like q2) or qwen3 with decent speed
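something like `llama-server -m deepseek-v3-q2.gguf -ngl 99 -ot exps=CPU` (untested sketch, the gguf name is a placeholder): -ngl offloads the layers to the GPU while -ot pins the MoE expert tensors to system RAM, which is where most of the weights live anyway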
>>105799479That would be the best outcome. A small model, slightly better than Nemo but censored to hell is likely what we'll get instead
>>105799477Fucking retard. It's not the pictures why people enjoy them. It's immersion. Text, music, visuals, sometimes voice. Gotta use your mind to fill in the blanks. Same shit.
Anon over here lecturing the nerds about VNs while not even reading or into them. kek Crrrazy
>>105799477Like 1 picture for every 5 pages worth of text sure
>>105799419For smut i'd try this with partial offloading https://huggingface.co/bartowski/Behemoth-123B-v1-GGUF
>>105799504>>105799507Only women can do it without pictures at all
>>105799477Clinically retarded.
It would be funny to see how the same faggots would praise the first good LLM with native image generation
>>105799525Well then paint my nails and call me sally. Or just let the model use stable diffusion every few minutes to draw the scene and you've got a VN
>>105799542There will never be a good local LLM with image generation.
>>105799414
>literally every vn to exist
why do you think there is a skip button
men don't have time for this shit
most popular hentai game (vn, rpg or whatever) request on f95zone: gallery save plz
>men really want to read the text I say
>>105799542
>can only generate heckin wholesome dogs in hats and astronauts riding horses in space
>>105799602Cuck95 are thirdworlders like jeets and so on.
Check steam play time numbers on the reviews you dum dum. Just admit you are wrong Ferret Software wanna be.
>>105799604
>train lora
works on my machine
>>105799655Tell it to flux users
>>105799266
>I think, unlike Claude and OpenAI, we didn't have nearly enough quality annotated data, as Meta only started seriously gathering it after LLaMA 3. I'm not sure why they waited so long, but yeah. That's the reason why Zuckerberg bought Scale AI.
As well as data gathering, are you all also immune to sarcasm?
All your local models will be irrelevant in two weeks
>>105799695Steve is releasing in two weeks.
>>105799702Tell Steve it's not healthy to hold it in for so long.
>>105799602https://incontinentcell.itch.io/factorial-omega
>>105799266dubs cheque
What are the odds meta is going to reduce filtering for the pretraining and reduce censorship in the finetuning and post training etc
I'm coping with Qwen3-235B-A22B-UD-Q3_K_XL. Is that the best for 2x3090s + 64GB RAM?
I'm currently using a Q4 quant of Gemma3 for RP. If I swap to a Q8 quant of the same model, how much of an improvement can I expect to see?
>>105799948How much does it cost you to test it yourself?
>>105798345The future belongs to AI companies that sell agents that can't perform any specific task and eat tokens. Companies have to buy them because they have to tell investors AI has made them 500x more efficient. They're going to be rolling in cash with the agent meme.
>>105799875yeah or one of the largestrals. Get 128GB to run DS V3 0324 IQ1_S_R4 for higher quality cope
>>105799963More than I'd like. Maybe if I bought a lot of RAM, I could try. After all, I'm testing for quality, not speed. Still, I'd like to actually use it, so I'd definitely need a GPU upgrade.
>>105795473Cypher is the name of one of ScaleAI's datasets.
>t. I worked on it
>>105799393
>but there wasn't that many troons in the early internet
in presence or numbers? numbers idk, but if you mean presence you're outing yourself. troons existed but it wasn't the mentally ill shit of now, it was "imagine a woman who is not a faggot, and lmao look that dude is larping as it XDDDD lol we'll have that soon enough give it some time". it was the equivalent of a cosplay event, it was fun and everyone played along because everyone knew what was meant and agreed. now it's just "castrate yourself goy, oh you won't? you are an incel mgtow woman-hating bla bla bla, go mutilate yourself goy". before it was a cargo cult for a better future, now it's a demoralisation campaign towards satanism and demon (w*men) worship, just like how blackpill used to refer to how negative the world in general is but got co-opted by demon-seeking faggots into their cuck bullshit
also in regard to troonism, there didn't used to be surgery and none of that, it was just feminine looking dudes crossdressing
>>105798806Use it on AI Studio. The Gemini models will refuse. However, they greatly reduce the capabilities of their models, so don't expect a great summary.
>>105799283I'm a data reviewer for them. The data can be quite good or quite bad. I worked on a "safety project": About 95% of the data needed heavy work to make it good. The more instructions you give to "attempters", the less likely you'll get a good dataset.
>>105795473It's an Amazon model and it's utter coal.