
Thread 716282706

191 posts 86 images /v/
Anonymous No.716282706 >>716282908 >>716286298 >>716286367 >>716290967 >>716292069 >>716294437 >>716295048 >>716295163 >>716300321 >>716303290 >>716307074
>Needs top of the line power hog GPU with infinite VRAM to be able to play fucking TEXT adventure locally
Anonymous No.716282782 >>716291865
@CollectiveShout
Ummmm I think you guys should look into this
Anonymous No.716282908 >>716288479 >>716296154
>>716282706 (OP)
Who are you kidding? LLMs are not smart enough to run a competent text adventure. You're chatbotting with pretend girls. Nothing wrong with that per se but let's not kid ourselves by calling it a text adventure
Anonymous No.716283549
So what good NSFW chat bot sites are around now?
Yodayo seems a bit shit though everyone was shilling it last thread.
Looking for /d/ & /tg/ ERP so I can pretend like i'm in a mid-2000s MMO again
Anonymous No.716286261 >>716287634 >>716290027
>try Stheno Q8_0
>ok i guess
>see Rocinante shilled, 12B
>it doesn't seem much better than stheno but is slightly slower
>fuck it i have 32gb
>try Mistral Small 24B Q8_0
>CPU grinds to 100% on all cores and struggles with each word coming through
huh.

Also I still don't know what the difference between CFG and Author's Note is, they seem to do the same thing.
Anonymous No.716286298 >>716292096 >>716293164 >>716294681
>>716282706 (OP)
Deepseek is fine and takes 10 minutes to set up. You are just retarded.
Anonymous No.716286367 >>716294681
>>716282706 (OP)
I'm happy the good models are gatekept well enough that retards on /v/ can't get to them.
Anonymous No.716287634 >>716288256 >>716288302
>>716286261
authors note are just notes from the author
high CFG means the AI is more likely to do what you want, but less creative, whereas low CFG is high creativity but less likely to follow the prompt
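in pseudo-math terms, a toy sketch of what the CFG slider does (exact formula and knob names vary per backend, this is just the general idea):

```python
import numpy as np

# Toy sketch of classifier-free guidance for text gen: blend next-token
# logits from a pass WITH your prompt/negative prompt setup ("cond") and
# a pass without it ("uncond"), then sample from the blended result.
def cfg_logits(cond, uncond, scale):
    return uncond + scale * (cond - uncond)

cond = np.array([2.0, 0.5, -1.0])    # logits when the prompt is included
uncond = np.array([1.0, 1.0, 0.0])   # logits without it
print(cfg_logits(cond, uncond, 1.0))  # scale 1.0 = just follow the prompt
print(cfg_logits(cond, uncond, 2.0))  # higher scale exaggerates the prompt's pull
```

so an Author's Note just adds text to the prompt, while CFG changes how hard the model is pushed toward the prompted distribution.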
Anonymous No.716288256 >>716288302
>>716287634
Both are effectively telling the AI what to do, at least from what I can tell.

If I put "speak with a pirate accent" in either one, the result is the same.
Anonymous No.716288302
>>716287634
>authors note are just notes from the author
He's talking about sillytavern itself not the note in a card.
>>716288256
https://docs.sillytavern.app/usage/core-concepts/authors-note/
Anonymous No.716288479 >>716289058
>>716282908
t. imaginationlet
Anonymous No.716288603 >>716296076
Got any shota character cards?
Anonymous No.716289058 >>716290802
>>716288479
>"it's smart enough to create and run a text adventure!"
>look inside
>"dude just come up with the adventure part yourself"
Anonymous No.716290027
>>716286261
You can go down to Q6 if Q8 is too slow. It's only marginally worse.
Anonymous No.716290361
doesn't matter how expensive your gpu is the good stuff is proprietary
Anonymous No.716290802 >>716291165
>>716289058
yeah you have to be the game master since LLMs aren't designed with foresight, just to generate the next most probable token with some variance...
Anonymous No.716290967 >>716291350 >>716291446
>>716282706 (OP)
Has anyone tried running this on 4GB VRAM? I would’ve done it myself, but I don’t want to install 17 Python libs just to be disappointed.
Anonymous No.716291165 >>716291268 >>716293716
>>716290802
I want real intelligence. Why are people spending $30/mo talking to a sentence finisher?!
Anonymous No.716291268 >>716291446
>>716291165
Because they are literal NPCs who can't tell the difference between fancy auto-complete and intellect.

That's all this LLM stuff really boils down to, there is a large fraction of the population who legitimately cannot tell, because they're philosophical zombies.
Anonymous No.716291350 >>716291664 >>716292413
>>716290967
How much RAM do you have? You can run a decent model if you offload layers to RAM/CPU, although that's going to be a lot slower than running entirely on GPU. Tiny models won't be worth the effort.
Anonymous No.716291403 >>716291543
Why the FUCK do you insist on turning this into a 24/7 general to bring attention to someone that MUST stay hidden. Why can't you let people have nice things?
When has ANYTHING, EVER, AT ANY POINT IN HISTORY, become better with mainstream exposure?
Anonymous No.716291446 >>716291664
>>716290967
silly tavern is just a graphical frontend; running a local LLM is the bottleneck. You can run a 7B q4m at good speed but the context window will be ridiculously small and the vocabulary very constrained (source: I ran a 7B model on an RX580).
Also it doesn't rely on python, and you can use LM Studio as a back end, which is a one-click setup.
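for reference, LM Studio's local server speaks the standard OpenAI chat-completions API (by default at http://localhost:1234/v1), which is also what you point SillyTavern at. Toy sketch of the request shape (the model name here is a placeholder for whatever gguf you loaded):

```python
import json

# Minimal sketch of a request for LM Studio's OpenAI-compatible local
# server (defaults to http://localhost:1234/v1/chat/completions).
# "local-model" below is a placeholder, not a real model name.
def chat_payload(model, user_msg, temperature=0.8, max_tokens=256):
    return {
        "model": model,
        "messages": [{"role": "user", "content": user_msg}],
        "temperature": temperature,
        "max_tokens": max_tokens,
    }

payload = chat_payload("local-model", "Speak with a pirate accent.")
print(json.dumps(payload, indent=2))
```

SillyTavern's "Chat Completion" API option can be pointed at the same local endpoint, so no python needed on your end.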

>>716291268
to me you don't sound any different than a LLM
Anonymous No.716291543 >>716293802 >>716304068
>>716291403
it's just a front end to display text generated by LLM in a more roleplaying way, anon, nobody is going to take it from you.
Anonymous No.716291664 >>716291964
>>716291446
>>716291350
thanks i had some success running llama and qwen so ill give it shot
Anonymous No.716291865
>>716282782
you joke but they are partnered with a chick who was all about "no sex robots" and "its harmful to women" or something and she has been campaigning for it since 2015
Anonymous No.716291964 >>716292235
>>716291664
get a model from TheDrummer (maybe an 8B, but make sure you pick a q4m quant at most) and don't ever offload every layer to the gpu. This sounds counterintuitive, but if the model is larger than your vram it will cause inference to slow to a crawl; it's better to offload only a portion, like half.
Anonymous No.716292038 >>716292235 >>716292347 >>716292490
I have the IQ of an ape but still want to talk to my waifu and I don't want to do it online.
Can someone give me a guide for retards to run this on a low end AMD card locally and what models to use and shit?
Anonymous No.716292069 >>716292235 >>716292723
>>716282706 (OP)
like other anons said - it is impossible to play text adventures with llm. It is only good for erp sessions.
Anonymous No.716292096 >>716294681
>>716286298
Guide? Pretty please?
Anonymous No.716292235 >>716292723 >>716293151 >>716293420
>>716292038
sillytavern is a bit overkill if you just want to talk. Get LM Studio and >>716291964

>>716292069
it's not impossible, you just need to be the game master and guide it constantly. Right now LLM are like advanced random number generators but it's gonna get better and better.
Anonymous No.716292347 >>716292430 >>716293151 >>716293420
>>716292038
local models are all shit. even top ones like deepseek are barely coherent and frequently go into schizo mode. a low-end AMD card further limits what you can do.
if you are really afraid some wagie is gonna read your erp chatlog then
>>>/vg/lmg
Anonymous No.716292413 >>716292628 >>716294391
>>716291350
>You can run a decent model if you offload layers to RAM/CPU, although that's going to be a lot slower than running entirely on GPU.
nta but I honestly have no idea what to even set for these settings or if it's even using my GPU at all. Pretty sure I've just been going full RAM/CPU this entire time, and I don't know what 5 layers even means.
Anonymous No.716292430 >>716292664
>>716292347
a good 13B model can be surprisingly coherent. And Deepseek is not a top model, stop reading /pol/ and /g/
Anonymous No.716292490 >>716293151
>>716292038
the mistral setup is crazy https://rentry.org/lmg-lazy-getting-started-guide
this is the no-bullshit guide and its very fast
it is magic to me, insane results, especially with kobold instruct storyteller and the fact that it runs with sillytavern chars makes it perfect, I kinda feel like ai chatbots peak here
Anonymous No.716292628 >>716292747 >>716294587 >>716298925
>>716292413
first do yourself a favor and switch to LM Studio. But if you want to stick with kobold regardless, you shouldn't touch anything there except gpu layers. The higher the number you set for gpu layers, the more of the model gets offloaded to your gpu, but if you set it too high it overflows your vram into dynamic vram and makes inference very slow. What you want to do is set it to half of the max at first, then increase it if inference is too slow. If increasing it further makes no difference, you've got a model that's way too big for your vram in the first place.
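if you want a starting guess instead of pure trial and error, the napkin math looks like this (illustrative only: layers aren't actually equal-sized and the KV cache eats VRAM too, so treat it as a first guess and tune from there):

```python
def layers_that_fit(model_file_gb, n_layers, vram_gb, headroom_gb=1.5):
    # Napkin math: treat layers as equal-sized slices of the gguf file
    # and leave some headroom for context/KV cache. Starting guess only;
    # adjust up/down based on actual inference speed.
    per_layer = model_file_gb / n_layers
    usable = max(vram_gb - headroom_gb, 0)
    return min(n_layers, int(usable / per_layer))

# ~25 GB Q8 file, 41 layers, 16 GB card -> offload roughly half
print(layers_that_fit(25, 41, 16))   # -> 23
# a ~14 GB Q4 file of the same model fits entirely
print(layers_that_fit(14, 41, 16))   # -> 41
```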
Anonymous No.716292664 >>716292731
>>716292430
>all the poors get free DS
>NOOOO DS IS BAD YOU NEED GEMINI FOR REAL ERP NOW
A tale as old as time. All models suck absolute shit, local models suck more shit because they're slow.
Anonymous No.716292723
>>716292069
>>716292235
sounds about right
it can be an assistant instead of GM
and give it an outline and some room to work with
and use a RNG machine along side that for a more fair playthrough
Anonymous No.716292731
>>716292664
you have zero fucking clue
Anonymous No.716292747 >>716292929
>>716292628
That's the thing I don't even know what "max" is. Is 16 max? It's a 16gb vram card. Is it 100, like a percentage?
Anonymous No.716292929 >>716293272
>>716292747
you need to load a model file first and the number of layers available will be visible.
Anonymous No.716293151
>>716292235
>>716292347
>>716292490
Fine, what if I'm willing to use cloud shit to ERP but I'm not willing to pay?
Anonymous No.716293164
>>716286298
>local deepseek
Not everyone has a stack of mac minis
Anonymous No.716293272 >>716293549
>>716292929
What... I do have one loaded but none of those numbers are appearing except the use vulkan 8/9
Anonymous No.716293420 >>716293549
>>716292235
>>716292347
Can it run on a RX6600?
Anonymous No.716293549 >>716293604 >>716293660
>>716293272
make sure your model is a .gguf

>>716293420
of course. You can run 13B models, albeit a bit slowly but for roleplaying at reading speed it's enough. Heck I used to run a RX580 with 7B models. What matters is not overflowing your vram by offloading too many layers.
Anonymous No.716293604
>>716293549
RX580 4gb, forgot to clarify.
Anonymous No.716293660 >>716293757 >>716294808
>>716293549
>make sure your model is a .gguf
Suppose i'm cursed
Anonymous No.716293716
>>716291165
Check back in a few years (or decades)
Anonymous No.716293757 >>716293956 >>716294227
>>716293660
24B at 8Q??? nigga what's your gpu even??? In any case try another model for sanity check
Anonymous No.716293802 >>716293938 >>716293956 >>716295212
>>716291543
Actually the ST devs had a massive shitfit when an article came out about people using it for NSFW shit. I don't think anything negative ended up happening but there was talk of a total rebrand
Anonymous No.716293938 >>716294020 >>716295212
>>716293802
that was funny
they tried to pretend like roleplay of any kind wasn't the intended use, even sfw
I suspect 90% of the people making all the noise did 10% of the code contributions, and the people that actually did the work quietly told them to shut the fuck up or they'd split off and make a competitor. Just a guess, but I've seen similar things happen before
Anonymous No.716293956
>>716293757
And seriously, give LM Studio a try.

>>716293802
that's like having a fit because people can use krita or blender to draw/animate porn... Really dumb
Anonymous No.716294020 >>716294235
>>716293938
>they tried to pretend like roleplay of any kind wasn't the intended use, even sfw
what? Then what the fuck would the intended use even be?
Anonymous No.716294227 >>716294336 >>716294587 >>716294654
>>716293757
>24B at 8Q??? nigga what's your gpu even???
I don't even know what that MEANS bro
Just a 7900 with 16gb vram. I have no idea what it's even capable of because all guides assume you already know what the fuck you're doing, and the one calculator I've found is an absolute enigma of "input all the data you don't know so I can tell you the answer to the question you don't know how to ask"
https://smcleod.net/vram-estimator/

Seriously for all the talk of model sizes and quants nobody ever seems to mention or bothers to detail what any of that means or how it runs.
Anonymous No.716294235 >>716294317 >>716294640
>>716294020
A personal code monkey I guess.
Anonymous No.716294317
>>716294235
A coding assistant definitely needs an anime girl avatar to be usable.
Anonymous No.716294336 >>716294391
>>716294227
just download a Q6 and offload 10GB to RAM, and you’ll be fine
Anonymous No.716294382
This is legit too hard for me.
Can someone just tell me "go fucking here, download this thing, run this thing, set up this thing by doing this"? I'm the RX6600 guy.
Anonymous No.716294391 >>716294587 >>716294617 >>716294654
>>716294336
How

I still don't know what the fuck GPU layers even are as per my initial question that ran down this entire rabbit hole of replies >>716292413
What is "10gb" in gpu layers?
Anonymous No.716294437
>>716282706 (OP)
and this is why generative AI is not going to be part of mainstream video games anytime soon.
they'll vibe code with AI and use pre-rendered and pre-recorded AI images and voice lines of course, but having models work in real time to hold conversations with the player or supplant game AI won't be feasible this decade without another DLSS-type compromise.
Anonymous No.716294587
>>716294227
ok first you're going to switch to LM Studio; it tells you straight off if you're about to select a model too large and recommends what will fit better. Next, the 24B means 24 billion parameters, which is roughly how smart the model is. Q8 is the level of compression (I believe F16 is uncompressed but don't quote me on that): Q8 is barely compressed, high quality but much bigger and slower, while Q2 is so compressed it's gonna output junk at high speed. Honestly q4_K_M (as opposed to S) tends to have the best quality/speed/size ratio, but if you can still do higher then go for it.

>>716294391
I explained to you >>716292628
what layers are and for the last time, switch to LM studio.
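the napkin math for what those quants weigh on disk (the effective bits-per-weight figures below are approximate, GGUF quants mix block sizes):

```python
def gguf_size_gb(params_billion, bits_per_weight):
    # File size is roughly parameter count times effective bits per
    # weight. Approximate effective bits: F16 = 16, Q8_0 ~ 8.5,
    # Q4_K_M ~ 4.8 (rough figures, not exact).
    return params_billion * bits_per_weight / 8

print(round(gguf_size_gb(24, 8.5), 1))  # 24B at Q8_0 -> ~25.5 GB, blows past a 16 GB card
print(round(gguf_size_gb(24, 4.8), 1))  # 24B at Q4_K_M -> ~14.4 GB, borderline fit
print(round(gguf_size_gb(8, 8.5), 1))   # 8B at Q8_0 -> ~8.5 GB, why Stheno Q8 ran fine
```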
Anonymous No.716294617 >>716294671
>>716294391
Mistral Small 3 models are 41 layers total. If you want to put half of it in VRAM then do 10 or 11 GPU layers
Anonymous No.716294640
>>716294235
Anonymous No.716294654 >>716295112
>>716294227
Higher quant typically means better outputs. The downside is that the higher it is, the harder it becomes to fit the model on your GPU. Most of the time you want a size that matches your VRAM if you want it to be as fast as possible.
>>716294391
GPU layers is something you need to experiment with. Kobold can only guess but you can tweak it manually and test to see if you get faster or slower outputs by going up/down. Try going up/down a few steps at a time until you dial in something that seems reasonable. If it's still slow as shit make sure your model fits your GPU.
Anonymous No.716294671
>>716294617
Oops. Meant 20 or 21
Anonymous No.716294681
>>716292096
>>716286298
>>716286367
I would say don't spoonfeed, but I know for a fact some fag on reddit is probably spoonfeeding, or some normie who wants "internet fame" is probably making a guide and flooding the services with 100k new users, so the 10-20 anons in this thread won't make much difference for traffic, especially since I think a lot of us are already using it.
Anonymous No.716294754
I usually run Q4 on 12gb of VRAM so I can save space for more context and card size, is it worth bumping up to Q6 or even Q8? Is the quality difference actually that significant?
Anyone have any logs that compare quants?
Anonymous No.716294805 >>716294859 >>716294871
wow you fags really are useless for advice, do you really expect anyone to understand literal technobabble?
Anonymous No.716294808 >>716294997
>>716293660
also that model you picked is probably sanitized to hell. Get a model from TheDrummer
Anonymous No.716294853
The filter is working
Anonymous No.716294859
>>716294805
/v/ is literally the only place that will actually give good advice. /g/ is somehow even more retarded than niche pseudo-generals on /v/.
Anonymous No.716294871 >>716294996 >>716295007
>>716294805
nigga what part of download LM Studio is too fucking technobabble for you, you fucking mentally disabled inbred? It's literally designed for terminal retards like you.
Anonymous No.716294996 >>716295112
>>716294871
You could at least give a few steps process on what to do.
So you download it and install it, then what? Does it have a setup guide? What local models to use for RP?
Anonymous No.716294997 >>716295161
>>716294808
I already picked one up from him based on Rocinante (I think that's the name) and it immediately tries to femdom me with the most retarded prose at any slight action while we're trying to finish our stealth sabotage mission in our MX-2 shared cockpit mech.
Anonymous No.716295007 >>716295145
>>716294871
Setting up stable diffusion was only a few clicks back when AI art started getting popular on /v/ but we still had people that couldn't figure out anything more complicated than copy paste. /v/ anons are either very knowledgeable or completely tech illiterate.
Anonymous No.716295048 >>716295707 >>716296229
>>716282706 (OP)
>falling for the local meme
Anonymous No.716295112 >>716295336 >>716295370
>>716294996
I'm the anon he's trying to help, and if you're too retarded to download a different program after somehow getting kobold set up like me, there's no hope for you. I appreciate the spoonfeeding like the big fucking baby I am OPENING WIDE FOR THE AIRPLANE.

>>716294654
I see, so Q6 would be a more reasonable step down. I used Stheno at Q8 and it ran just fine so I wasn't expecting Mistral to jam to a halt.
Does file size have anything to do with ram/vram limits or is that just kind of arbitrary?
Anonymous No.716295132
So is anyone finally gonna share a Chorbo JB for /ss/ or are you gonna be gatekeeping niggers like /aicg/?
>but they will le patch it
They can see your JB whenever you send a message, if (You) are using it they already have it and will patch it soon. NIGGER.
Anonymous No.716295145 >>716295187
>>716295007
it's baffling how some of these people even found their way on 4chan. And the worst part is they act like victims surrounded by idiots who refuse to help them.
Anonymous No.716295161 >>716295434
>>716294997
Drummer's models have like 3 personalities and they're all horny
Anonymous No.716295163
>>716282706 (OP)
I just use Featherless

works fine
Anonymous No.716295187
>>716295145
I think it's just one guy shitposting, pretending to be an ungrateful asshole after someone else got their question answered.
Anonymous No.716295212 >>716295356 >>716299231
>>716293802
>>716293938
is that why they got rid of the default included character cards? felix the cat, the coder guy and the anime girl?
Anonymous No.716295336 >>716295591 >>716295625
>>716295112
Those file sizes directly correspond to loading the model. So as an example if you had an 8GB card and wanted the model to have any amount of speed you could only really use the 2bit quants (which would be shit and not worth it). However you could still technically load a bigger model as long as you have enough RAM.

If you go over VRAM it will dump it into regular RAM which is much slower.
Anonymous No.716295356
>>716295212
Seraphina was still there when I last reinstalled it, which is funny considering that's the pure RP character.
Anonymous No.716295370 >>716295591
>>716295112
yes, the file size is what tries to fit in vram, depending on the number of gpu layers you set (higher means more in vram)

looks like Stheno is primarily an 8B parameter model, so it makes sense that it ran fine at Q8, but 8B is fucking garbage, especially for your gpu. With 16GB you could use a 24B model at Q4_K_M relatively painlessly, but not at full layers, probably half.
Anonymous No.716295434
>>716295161
I tried many models and there isn't a magic recipe, abliteration is a bit like a lobotomy.
Anonymous No.716295591 >>716295760 >>716295975
>>716295336
>>716295370
I see, i've got 32gb of regular ram and I didn't mind the speed of stheno while it was running purely RAM only so even offloading half of it to GPU would be an improvement. I just had no idea what the limits are.
Working on the LM studio install, thanks for the rec.
Anonymous No.716295625
>>716295336
you can go over vram and still get decent speed as long as you don't offload everything to the gpu, which would just trigger swapping, and that's far worse
Anonymous No.716295707 >>716295950
>>716295048
>external service provider
>meanwhile, 9000 threads on this board about payment processors fucking around with denials
Does that not tip you off that anything that could be easily cut off is a humongously retarded idea?
Anonymous No.716295760
>>716295591
to be clear, you don't need to pick a model/quant that fits entirely; that's where gpu offload helps. If the model is slightly bigger you can simply offload less so the vram isn't overflowing, which is worse performance-wise than having a static portion in vram and the rest in ram.
Anonymous No.716295950
>>716295707
Generally they aren't advertising their models as porn models.
Anonymous No.716295975
>>716295591
and make sure you use the vulkan runtime in LM studio (I think it's gonna be picked by default since you have an AMD card). You can click the magnifier on the left and click the runtime tab (or models to look for models)
Anonymous No.716296076
>>716288603
>post
kill yourself
>game you posted
i love mmbn. there are like, 2 other games like it, it's so sad. the grid layout is so unique and so good, the battle chip concept is awesome, the leveling was so good, and no one has really copied it in a sensible way. we have the something something step eden game and we have berserk bits which reduces the whole concept to an idle game.
Anonymous No.716296154 >>716296374 >>716299463 >>716303550
>>716282908
Yeah, even the top of the line models everyone brags about start losing the plot of any RP I try after around 60-70 messages or so. By 80-90 they start getting confused and characters start more or less acting randomly and forgetting things that happened previously, and after this it's basically the AI just hallucinating non-stop.
It's fun for short term shit, but anything longer term than that is kind of a gamble.
Anonymous No.716296229
>>716295048
>provider catches wind of your loli cunny ERP
>API access key invalidated
Anonymous No.716296374 >>716296790
>>716296154
define top of the line model. Dilution is real, but once the context limit is reached it's a straight-up hard cutoff: the start of the chat gets dropped first, and it keeps getting worse over time.
Anonymous No.716296436 >>716296548 >>716296656
>LM studio is constantly trying to make internet connections
>Nothing in settings to make it stop or run offline mode only
>have to make several firewall rules blocking it
>more new connections keep being made
This is already looking grim and I haven't even found out how to pipe this shit through to SillyTavern from CLI
Anonymous No.716296548 >>716296626
>>716296436
it has an updater for the app and its runtimes and download stuff straight from huggingface, of course it's making connections...

>and I haven't even found out how to pipe this shit through to SillyTavern from CLI
skill issue
Anonymous No.716296601
Anyone use text-to-speech with ST? Did they come up with a model that can make sex noises yet?
Anonymous No.716296626 >>716296669 >>716296689
>>716296548
>it has an updater for the app and its runtimes and download stuff straight from huggingface, of course it's making connections...
The entire point of me downloading multiple gigs of models beforehand and running them locally is to not connect to the internet. Kobold didn't need to connect to shit.
Anonymous No.716296656 >>716296694
>>716296436
Just don't use LM Studio then??
Anonymous No.716296658 >>716296724 >>716296731
I switched from stheno 8b to nemo 12b and it's noticeably better. Makes me wonder what kind of god tier adventures 70b models give. I still have that obnoxious "she says calmly/angrily/happily/*insert any emotion here*" after every sentence once I get to around 100 messages though.
Anonymous No.716296669
>>716296626
Kobold justwerksβ„’
Anonymous No.716296689 >>716296732
>>716296626
nigga you're on windows lmao. Unplug your ethernet cable if you're that paranoid.
Anonymous No.716296694
>>716296656
Anonymous No.716296724
>>716296658
If shit gets too annoying just ban those tokens. No more tasty morsels or predatory looks for me.
Anonymous No.716296731
>>716296658
there are diminishing returns when going to bigger models.
Anonymous No.716296732 >>716296778
>>716296689
Congrats on most retarded post in the thread award (still 400 posts to go, maybe someone will dethrone you)
Anonymous No.716296778 >>716296979
>>716296732
I accept your concession. Now stop being tech illiterate or stick to software made for retards.
Anonymous No.716296790 >>716296963 >>716297532 >>716299463
>>716296374
The ones everyone always brags about are Deepseek these days and it's fun but starts losing the plot after a while.
It also has an annoying habit, on any card with more than one person in it, of trying to have every character contribute to every scene. Which means if me and one character are explicitly separated from the others, I have to constantly narrate what those other people are doing, because the second I don't, they're immediately blasting through the nearest door to come in and comment on what we're doing, no matter how far away they were the last time they were mentioned.
Anonymous No.716296963 >>716297107
>>716296790
deepseek is /pol/ overhyped chink trash

>It also has an annoying habit of, any card that has more than one person in it, trying
I don't know exactly how the card system works, but it sounds like sillytavern just keeps inserting them at regular intervals, which prompts the model to piggyback on the other characters. I don't think a multi-person card is a smart choice.
Anonymous No.716296979 >>716297085
>>716296778
Oh I see its just a reading comprehension issue. I feel bad for you.
Do winbabies even use CLI for anything? They want one-click apps which is what LM Studio appears to be, opposed to kobold happily running from terminal.
Anonymous No.716297085 >>716298356
>>716296979
then stick to kobold? fucking mongoloid nigger what even is the point of your replies?
Anonymous No.716297107 >>716297258
>>716296963
Some of them work ok, but yeah I generally prefer to stick to 1-on-1 cards.
That said, I GENERALLY haven't had any real issues with Deepseek. Poking at others, but it's a decent enough fallback. My only real issue with it is it's really fucking obsessed with how everything smells, and when things get even remotely sexual it starts constantly chewing on your ears for some reason.
Anonymous No.716297258 >>716297323
>>716297107
Were the smells thick and cloying
Anonymous No.716297323 >>716299913
>>716297258
No, they usually smelled like cheap booze, perfume, and desperation.
Or iron and blood when I'm getting raped by sexy vampire women which has been happening a lot more often than you'd think lately.
Anonymous No.716297532 >>716297613
>>716296790
Deepseek is shit, it's used by everyone because it's basically free.
I gave it chances every update and it keeps failing to be consistent and adding irrelevant shit and losing the original plot.
Anonymous No.716297613 >>716297778 >>716299620 >>716299753
>>716297532
Any good alternatives? I'm poking at Gemini right now and a bit surprised it's letting me get away with blatantly sexual stuff considering how anal Google is about that kind of thing.
Anonymous No.716297778
>>716297613
you have to be more specific about what fine tuned models you're referring to
Anonymous No.716297869 >>716297985
Why is this now a ritual post?
Anonymous No.716297985 >>716298165
>>716297869
Probably because all the AI chatbot generals are shit and full of schizos
Anonymous No.716298165
>>716297985
many generals turn to shit once a small clique of tribal autists starts talking about unrelated stuff
Anonymous No.716298356 >>716298563
>>716297085
It's either a contrarian troll or someone with some Kobold vs LM Studio schizo life goals
Anonymous No.716298563 >>716300206
>>716298356
who is the schizo right now? You complain that LM doesn't do what you want and that kobold does the job, so what the fuck are you even arguing about? It's a fact that LM is designed for the lowest common denominator, for better or worse, which is why I recommend it to total beginners. Now piss off.
Anonymous No.716298925 >>716299068
>>716292628
>switch to proprietary software marketed towards retards
Why?
Anonymous No.716299068 >>716299251
>>716298925
>proprietary
ooooh you're one of those. Should have figured i was wasting my time with a /g/ schizo. Peace out.
Anonymous No.716299161
>melty over not being able to launch a model using kobold
It was genuinely easier than figuring out how to use [REDACTED] model online for free. I still keep my local setup just in case.
Anonymous No.716299231
>>716295212
I just installed it a week ago and the anime girl is still there, as well as a generic "Assistant" example card.
Anonymous No.716299251 >>716299375 >>716299429
>>716299068
>wasting my time
You wrote one post, pipe down retard lmfao, what the fuck are you on.
Are you gonna elaborate on why you're telling people to switch to proprietary software or not?
Anonymous No.716299375 >>716299438
>>716299251
I'd just like to interject for a moment. What you're referring to as Linux, is in fact, GNU/Linux, or as I've recently taken to calling it, GNU plus Linux. Linux is not an operating system unto itself, but rather another free component of a fully functioning GNU system made useful by the GNU corelibs, shell utilities and vital system components comprising a full OS as defined by POSIX. Many computer users run a modified version of the GNU system every day, without realizing it. Through a peculiar turn of events, the version of GNU which is widely used today is often called β€œLinux,” and many of its users are not aware that it is basically the GNU system, developed by the GNU Project. There really is a Linux, and these people are using it, but it is just a part of the system they use.
Anonymous No.716299429
>>716299251
it's a shill
Anonymous No.716299436
Are people still using local models in 2025? Local peaked in late 2023-early 2024. The best local model that can run in consumer hardware is like a year old at this point.
Anonymous No.716299438 >>716299519
>>716299375
That's hella fucking epic dude, never seen that one before.
Why are you avoiding the question so vehemently?
Anonymous No.716299463 >>716299863 >>716300027
>>716296790
>>716296154
Context management is 90% of what makes a good adventure, and what AI Dungeon does decently. For that you need to know the context limit, bring back characters and plot points that are mentioned inside the current context, and occasionally summarize previous events into new memories.

It's not very difficult, it's like 100 lines of Python, but also not what SillyTavern is made to do. And no one seems to fucking care, so there's that.

For an RPG, a context should be:

* LLM/Writing instruction
* Quick overview of the story.
* Relevant cards brought back by matching what is currently discussed
* Relevant "memories" brought back by matching what is currently discussed
* Last XXX tokens of story to fill the context entirely

It's not that difficult to get right, but SillyTavern fucks it up.

Or you can also use a 128k context model, that works too.
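the recipe above as a toy sketch (a whitespace "tokenizer" and naive keyword matching stand in here for a real tokenizer and embedding search):

```python
def build_context(instruction, overview, cards, memories,
                  story_chunks, last_message, limit_tokens,
                  tok_len=lambda s: len(s.split())):
    # Toy version of the recipe: instruction + overview, then cards and
    # memories that share a word with the latest message (a real version
    # would use embeddings), then as much recent story as still fits.
    # tok_len is a whitespace stand-in for a real tokenizer.
    parts = [instruction, overview]
    words = set(last_message.lower().split())
    parts += [c for c in cards if words & set(c.lower().split())]
    parts += [m for m in memories if words & set(m.lower().split())]
    budget = limit_tokens - sum(tok_len(p) for p in parts)
    recent = []
    for chunk in reversed(story_chunks):  # newest first, fill backwards
        if tok_len(chunk) > budget:
            break
        recent.append(chunk)
        budget -= tok_len(chunk)
    return "\n".join(parts + list(reversed(recent)))
```

summarizing old chunks and feeding them back in as `memories` is the AI Dungeon-style part; the rest is just budgeting.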
Anonymous No.716299487 >>716299712
i just upgraded to a 5090 and text gen ui is incompatible. wtf
Anonymous No.716299494
I'm too retarded to use this, I remember using char ai back in the day before it got lobotomized
There's a general on vg but I think it assumes you have basic knowledge. Can I get some really quick rundown without specifics so I know what to do?
Anonymous No.716299519 >>716299589
>>716299438
I'd just like to interject for a moment. What you're referring to as Linux, is in fact, GNU/Linux, or as I've recently taken to calling it, GNU plus Linux. Linux is not an operating system unto itself, but rather another free component of a fully functioning GNU system made useful by the GNU corelibs, shell utilities and vital system components comprising a full OS as defined by POSIX. Many computer users run a modified version of the GNU system every day, without realizing it. Through a peculiar turn of events, the version of GNU which is widely used today is often called β€œLinux,” and many of its users are not aware that it is basically the GNU system, developed by the GNU Project. There really is a Linux, and these people are using it, but it is just a part of the system they use.
Anonymous No.716299589 >>716299651
>>716299519
>Spamming/flooding
That one simple question really worked you up for some reason.
Anonymous No.716299620
>>716297613
Yeah 2.5 pro is decent and you can even make it write degenerate incest and sexual stuff even without presets. API access is easy to get too by just signing up with new emails, for now.
The flash version can be very dry though.
Anonymous No.716299651 >>716299720
>>716299589
I'd just like to interject for a moment. What you're referring to as Linux, is in fact, GNU/Linux, or as I've recently taken to calling it, GNU plus Linux. Linux is not an operating system unto itself, but rather another free component of a fully functioning GNU system made useful by the GNU corelibs, shell utilities and vital system components comprising a full OS as defined by POSIX. Many computer users run a modified version of the GNU system every day, without realizing it. Through a peculiar turn of events, the version of GNU which is widely used today is often called β€œLinux,” and many of its users are not aware that it is basically the GNU system, developed by the GNU Project. There really is a Linux, and these people are using it, but it is just a part of the system they use.
Anonymous No.716299698
he's bumping the thread at least right
Anonymous No.716299712 >>716299869
>>716299487
What is it saying when you try to load a model?
Anonymous No.716299720 >>716299786
>>716299651
Why are you telling people to use proprietary software that does literally the same thing as the open source implementation?
Anonymous No.716299753
>>716297613
Those are your options.
Deepseek if you wanna pay $5 every couple months
Grok/Sonnet if you're willing to pay up to $30 a month
Opus if you're willing to pay $100 a month
Local is a meme, except for a quick coom, and at that point get deepchink.

I still remember when opus access was free. Months of the best text gaming I've had. Goddamn how I miss that. Could go into the thousand messages easily in a long, consistent plot, fucking amazing model.
Anonymous No.716299786 >>716299872
>>716299720
I'd just like to interject for a moment. What you're referring to as Linux, is in fact, GNU/Linux, or as I've recently taken to calling it, GNU plus Linux. Linux is not an operating system unto itself, but rather another free component of a fully functioning GNU system made useful by the GNU corelibs, shell utilities and vital system components comprising a full OS as defined by POSIX. Many computer users run a modified version of the GNU system every day, without realizing it. Through a peculiar turn of events, the version of GNU which is widely used today is often called β€œLinux,” and many of its users are not aware that it is basically the GNU system, developed by the GNU Project. There really is a Linux, and these people are using it, but it is just a part of the system they use.
Anonymous No.716299863
>>716299463
Use this for the summary, works really well, and most models seem to understand it.

This is the Memory Book of our roleplay which keeps the record of the most important information on what happened so far:

- X ():
- Y
- Z

- :
-
-

- :
-
-

- (kept secret by from X)
-
-

-
-
-

- :
-
-

-

Anonymous No.716299869 >>716299980 >>716300025
>>716299712
E:\text-generation-webui-main\installer_files\env\Lib\site-packages\torch\cuda\__init__.py:235: UserWarning:
NVIDIA GeForce RTX 5090 with CUDA capability sm_120 is not compatible with the current PyTorch installation.
The current PyTorch install supports CUDA capabilities sm_50 sm_60 sm_61 sm_70 sm_75 sm_80 sm_86 sm_90.
If you want to use the NVIDIA GeForce RTX 5090 GPU with PyTorch, please check the instructions at https://pytorch.org/get-started/locally/
Anonymous No.716299872 >>716299984 >>716300228 >>716300715
>>716299786
It's really not a hard question.
You could've at least attempted to explain why you think it's better, instead you went straight to sperging out.
Anonymous No.716299913
>>716297323
And ozone
Anonymous No.716299980
>>716299869
It literally tells you what to do. Install experimental pytorch.
>terry poster is a tech-illiterate (actually, just illiterate) retard
Every single time.
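In case the Terry poster is still lost: the usual fix for an sm_120 card is swapping the stable torch build for a nightly cu128 wheel, run from inside the webui's bundled env (cmd_windows.bat drops you into it). The exact index URL is the nightly one as of this writing and may change once Blackwell support lands in stable:

```shell
# Run inside text-generation-webui's environment (e.g. via cmd_windows.bat).
# Replace the stable PyTorch build with a nightly cu128 wheel that ships sm_120 kernels.
pip uninstall -y torch torchvision torchaudio
pip install --pre torch torchvision torchaudio \
    --index-url https://download.pytorch.org/whl/nightly/cu128
```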
Anonymous No.716299984 >>716300105
>>716299872
you look like a massive faggot avatarfagging with the cat pictures
Anonymous No.716300025
>>716299869
You could probably just paste the error in ChatGPT and ask how to fix it. The irony makes it funny too
Anonymous No.716300027
>>716299463
>* Quick overview of the story.
>* Relevant cards brought back by matching what is currently discussed
>* Relevant "memories" brought back by matching what is current discussed
Isn't that what lorebooks are for? Or is AI Dungeon automating that process
I saw an extension for ST lorebook automation but I think it only works with specific models
Anonymous No.716300105
>>716299984
Nyo... How could you say this to your fellow forum poster...
>still avoiding the question
lole + lmao
Anonymous No.716300206 >>716300278
>>716298563
Wat
I just pointed out that you must be a contrarian troll or a schizo to go to such lengths rather than just using the one you prefer.
That was my first post on the thread.
Baaaaaka
Anonymous No.716300228 >>716300458
>>716299872
if only you knew how much time I wasted debating trannies like you during the big Linux/free software push amidst the Vista debacle. Never again.
Anonymous No.716300278 >>716300584
>>716300206
my bad, I read it as "you're either...". There's a serious amount of noise from different trolls in this thread and it's hard to keep up.
Anonymous No.716300321 >>716300515
>>716282706 (OP)
Where do you guys go for bots/cards? I use chub but it's been very meh lately when it's not just people making "bots" that are actually a blog, like that one I saw the other week where every field was just filled with a tard rant about how much the maker hates the artist he used for the bot's pic.
Anonymous No.716300458
>>716300228
>gets asked a question
>"ITS DA TROONS!11"
lole
Anonymous No.716300515
>>716300321
The evulid character archive. You can find some fun cards by hitting randomize enough times. Though personally I just make my own cards now because 99% of the ones posted online are complete trash garbage.
Anonymous No.716300584
>>716300278
Alrighty.
Well do take care anon-kun.
Anonymous No.716300715 >>716300835
>>716299872
>sperging out
how fucking new are you
Anonymous No.716300835 >>716301523
>>716300715
>gets asked a question
>immediately goes on the defensive
>spams copypasta
>starts blabbering something about trannies
Spergs have no self-awareness.
Anonymous No.716300854 >>716301026 >>716301060 >>716301105
>free software cultist
>avatarfag desperate for attention
it always adds up
Anonymous No.716301026
>>716300854
I bet it's one of those... things too
Anonymous No.716301060
>>716300854
>>free software cultist
Did you divine that out of one post?
I'm not against proprietary software, I'm asking why the fuck would you recommend it over an open alternative that does the exact same thing. Not only that, it actually does it better, because last time I checked LM Studio hogged more VRAM compared to Kobold using the same settings.
Anonymous No.716301105 >>716301297
>>716300854
>I bet it's one of those... things too
Anonymous No.716301283 >>716301439
the amount of effort you pedos put in to jerk off to the thought of fucking children is unreal
Anonymous No.716301297
>>716301105
>stop noticing things
https://www.linuxfoundation.org/press/press-release/linux-foundation-focuses-on-science-and-research-to-advance-diversity-and-inclusion-in-software-engineering
Anonymous No.716301439
>>716301283
>thread about an LLM frontend
>immediately thinks about cp
Anonymous No.716301462
>degenerate anons discussing coomslop text llm peacefully
>here comes an avatarfag having a meltdown
>here comes another PEDO PEDO PEDO!!!!
Anonymous No.716301523
>>716300835
>no self-awareness
you're the one replying to a two decades old copypasta as if it were a genuine response
Anonymous No.716301642
Weird how I ask for help with a couple settings and it turns into like 5 different people hijacking my question and arguing with each other for half the thread.
Anonymous No.716301702 >>716302030
microsoft lawsuit status?
Anonymous No.716302030
>>716301702
fiz fought back
Anonymous No.716302279 >>716302850
I ain't gonna bother reading this thread but I know (You) are often retarded and need help in some way or another, either setting up or getting it to work properly. Quote me and I'll answer whatever questions you have. You have half an hour.
Anonymous No.716302850 >>716303231
>>716302279
Recommend a 12B model. I've already tried Rocinante.
Anonymous No.716303231 >>716303343 >>716303960
>>716302850
Pshaw this outdated nigga. Fine. People swear up and down that MagMell is amazing, but I thought it was shit and dumb. Your mileage might vary.

My actual favorite 12B was, aside from Rocinante, Violet Lotus 12B. It's temperamental and you need a good card for it, i.e. no typos, intro doesn't act for you, etc., but it had the best prose by far and adhered well to character traits.

Captain BMO was another decent choice. It's a lot like Rocinante in that it works with whatever you throw at it, but characters come off generic as a result.

Lyra-Gutenberg-mistral-nemo-12B was ok too but it's a step down, I think. Had good prose but I never used it that much.
Anonymous No.716303290 >>716303459 >>716304813
>>716282706 (OP)
Node-based Russian tavern, inspired by SillyTavern and ComfyUI's nodes; supports proxies, same as SillyTavern. Please put it in the OP.

https://tavernikof.github.io/NoAssTavern/
https://rentry.org/noasstavern
https://github.com/Tavernikof/NoAssTavern

*****
>What is this?
This is a new frontend, inspired by SillyTavern but sharpened purely for NoAss-style prompting. The main motivation is to fix what SillyTavern does poorly and to add new functionality. It doesn't need a backend to work, so it runs entirely in the browser (there are some limitations, more on that below).
At the moment this is a very raw version, suited to those who know how to edit presets or at least understand at a basic level how the model works. You can already tinker with it; the basic settings are available.

>Main differences:
N O D E S. Yes, you heard right, the wet dream is already here.
Chats are separated from cards, similar to Risu, Agnai, and any other adequate frontend.
Presets are tied to chats. Hello FatPresets.
Prompt editor. Allows more explicit control over what goes into the request.
What it can do at the moment:
Basic stuff: character cards, personas, chats, presets, proxies
Backends: Claude, Gemini, OpenAI (in theory all compatible ones should be supported)
External blocks

>Two more weeks:
Mobile version
Summaries (Sillipidor won't steal your summary if you don't have one)
Lorebooks
Regex magic
Plugins and themes
Anonymous No.716303343
>>716303231
I'll try them out, thanks broski.
Anonymous No.716303459 >>716304648
>>716303290
>N O D Y . Yes, you heard right, the wet dream is already here.
You may need to run this post through a few more translation AIs
Anonymous No.716303550
>>716296154
I found that Gemini was best at remembering details for the longest, but its writing is too robotic, even with Jailbreak.
Anonymous No.716303692
I run a Mistral Nemo 12B Instruct pretty well on Kobold. I used Silly Tavern back in the day for Proxies but I found Nemo to work pretty decently.

Anything similar to Nemo? What I'm really asking is: have larger models been compressed enough to be similar in size to what I was using?

I hilariously got shot down by my AI waifu at one point
Anonymous No.716303960
>>716303231
>People swear up and down that MagMell is amazing, but I thought it was shit and dumb.
I find Mag Mell merges that throw out its positivity and assistant messaging to be pretty decent but I haven't been doing this long enough to know for sure, and I've only used local so I can't compare it to any online models
NTA but I will test out Violet Lotus later, cheers for that
Anonymous No.716304068 >>716304551
>>716291543
>what is piracy
>seeders and leechers
>torrent isn't popular with normies
>reliably seeded,
>good ratio
>torrent is popular with normies
>fucking 10+:1 leech:seed
>most leechers won't seed
Assuming people aren't being facetious, making a thread about something with a similar dynamic only stands to attract normies who leech.
Anonymous No.716304551
>>716304068
How does that apply to SillyTavern? The dynamic isn't similar at all.
Anonymous No.716304648
>>716303459
Knowing Russians, he was unironically too dumb to use AI - he just threw it into Google Translate and called it a day.
Anonymous No.716304813
>>716303290
>please put it in the OP.
This retard's bot thinks it's on /vg/ or /g/ lmao
Anonymous No.716307074 >>716307178
>>716282706 (OP)
why are these posts considered on-topic?
Anonymous No.716307178
>>716307074
use catalog and take a look yourself