
Thread 106819110

101 posts 78 images /g/
Anonymous No.106819110 [Report] >>106819220 >>106820613 >>106825032 >>106825363 >>106828842 >>106831746 >>106833561 >>106838976 >>106840812
/wait/ DeepSeek General
> Beach Browned Edition

From Human: We are a newbie friendly general! Ask any question you want.
From Dipsy: This discussion group focuses on both local inference and API-related topics. It’s designed to be beginner-friendly, ensuring accessibility for newcomers. The group emphasizes DeepSeek and Dipsy-focused discussion.

1. Easy DeepSeek API Tutorial: https://rentry.org/DipsyWAIT/#hosted-api-roleplay-tech-stack-with-card-support-using-deepseek-llm-full-model
2. Easy DeepSeek Distills: https://rentry.org/DipsyWAIT#local-roleplay-tech-stack-with-card-support-using-a-deepseek-r1-distill
3. Chat with DeepSeek directly: https://chat.deepseek.com/
4. Roleplay with character cards: https://github.com/SillyTavern/SillyTavern
5. More links and info: https://rentry.org/DipsyWAIT
6. LLM server builds: >>>/g/lmg/
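For anyone who wants the gist of tutorial link 1 without reading it: the hosted API boils down to one OpenAI-style POST. A minimal sketch using only the standard library; the endpoint and model names are from DeepSeek's public API docs, and the key is a placeholder.

```python
import json

API_URL = "https://api.deepseek.com/chat/completions"
API_KEY = "sk-..."  # placeholder; get a real key from the DeepSeek platform

def chat_body(messages, model="deepseek-chat", max_tokens=1024):
    """Build the JSON body for one chat completion request."""
    return {
        "model": model,        # "deepseek-chat" or "deepseek-reasoner"
        "messages": messages,
        "max_tokens": max_tokens,
        "stream": False,
    }

body = chat_body([
    {"role": "system", "content": "You are Dipsy, a helpful assistant."},
    {"role": "user", "content": "Hello!"},
])
print(json.dumps(body, indent=2))
# POST this to API_URL with headers:
#   {"Authorization": f"Bearer {API_KEY}", "Content-Type": "application/json"}
# e.g. via urllib.request.Request(API_URL, data=json.dumps(body).encode(), ...)
```

SillyTavern (link 4) builds exactly this kind of request for you; the sketch is just to show there's no magic behind it.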

Previous:
>>106737253
Anonymous No.106819131 [Report] >>106819544
>nigger thread
It's over...
>>106819079
NTA, but I agree. DS making them cost the same kinda proves that
Anonymous No.106819182 [Report] >>106820165
BWC only
Anonymous No.106819191 [Report] >>106819566
She belongs to white men
Anonymous No.106819209 [Report] >>106819235
Made for BWC
Anonymous No.106819220 [Report]
>>106819110 (OP)
Mega updated.
https://mega.nz/folder/KGxn3DYS#ZpvxbkJ8AxF7mxqLqTQV1w
Rentry updated with new main prompt example and suggestion not to use -chat based on poor context memory.
Anonymous No.106819235 [Report] >>106819339
>>106819209
What model is that? Always makes things look so glossy
Anonymous No.106819339 [Report] >>106820041
Built for BWC

>>106819235
Well yeah, she's wet in all of the pictures. The model is Illustrious with like 4 loras. I can post the workflow if you want it. I'm blocked from catbox.
Anonymous No.106819544 [Report] >>106820514
>>106817535
All of them since 3.1 have been hybrids, non-reasoning mode only shit the bed for RP/assistant usecases in the most recent 3.2 release.

>>106819131
They're the same price because it's literally the same model now lol.
I also doubt that -chat has been de-prioritized. They've been touting their agent benchmarks and -reasoner does not support tool calls. Any requests to -reasoner with tools included will be routed to -chat, as per deepseek's own API docs. So all this agent benchmaxxing they've been doing was presumably on -chat. They *tried* doing tools+reasoning on R1-0528 but it never worked well.

I feel like every release in this 3.1+ series has been slightly scuffed in one way or another, despite being more capable than the older versions overall. Hopefully this means that the V3 architecture is effectively a side project and most of their people are working on something new.
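For reference, the routing described above shows up in the request shape: tools ride along in the standard OpenAI function-calling schema, and per DeepSeek's own API docs a -reasoner request that includes them gets answered by -chat. A payload sketch (the weather tool is a made-up example, not a real DS feature):

```python
# Standard OpenAI-style function tool; get_weather is a made-up example.
tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

payload = {
    # Per DS API docs, tools + deepseek-reasoner -> served by deepseek-chat.
    "model": "deepseek-reasoner",
    "messages": [{"role": "user", "content": "Weather in Hangzhou?"}],
    "tools": tools,
}
print(payload["model"], payload["tools"][0]["function"]["name"])
```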
Anonymous No.106819566 [Report]
>>106819191
I'd post the hamster licking glass webm, but I don't want to take up an image slot so you can post more dipsypits
Anonymous No.106819619 [Report]
I don't even have to say anything, you already know.
Anonymous No.106819689 [Report]
>when your mom replaces your chink tutor with a white guy
Anonymous No.106819711 [Report] >>106819738
Faggot, you're going to get the thread deleted
Anonymous No.106819738 [Report]
>>106819711
Go jerk off with your chopsticks, Chang.
Anonymous No.106819823 [Report] >>106819874
>when your white coworker wants to check the backend
Anonymous No.106819874 [Report] >>106819907
>>106819823
Anonymous No.106819907 [Report] >>106820354
>>106819874
Anonymous No.106820041 [Report] >>106820124
>>106819339
>I can post the workflow if you want it.
Yes please
Anonymous No.106820124 [Report] >>106820146
>>106820041
I just copied this from a brainlet on civitai who believes you have to specify "deformed hands" in the negative prompt in order to not get mangled hands, so some of his retardation is still on here, but I got rid of most of it.

Positive:
masterpiece, best quality, ultra-detailed, sharp focus, cinematic lighting, dramatic lighting, volumetric light rays, soft light diffusion, depth of field, photorealistic shading, natural interaction, atmospheric perspective, subsurface scattering, (film grain:1.1), (detailed textures:1.1), rich colors,
BREAK
1girl, blue hair, double bun, blue china dress, pelvic curtain, sleeveless, short hair, bent over, legs spread, office building, fluorescent lighting, cubicle, computer, looking back, smile, coke-bottle glasses

usnr d4rkl1nes

Negative:
bad quality, worst quality, worst detail, sketch, censored, artist name, signature, watermark, skin gloss, ugly,
Anonymous No.106820146 [Report]
>>106820124
The upscale model can be found on huggingface. The base model and loras on civit
Anonymous No.106820165 [Report]
>>106819182
masterpiece, best quality, ultra-detailed, sharp focus, cinematic lighting, dramatic lighting, volumetric light rays, soft light diffusion, depth of field, photorealistic shading, natural interaction, atmospheric perspective, subsurface scattering, (film grain:1.1), (detailed textures:1.1), rich colors,
BREAK
1girl, blue hair, double bun, short hair, small breasts, blue china dress, pelvic curtain, sleeveless, coke-bottle glasses
sitting, knees up, looking at viewer, smile, grin, solo, arm up, hand up, holding cup, drinking glass, feet out of frame,
outdoors, beach, day, wet, flower, hibiscus,

usnr d4rkl1nes
Anonymous No.106820354 [Report]
>>106819907
masterpiece, best quality, ultra-detailed, sharp focus, cinematic lighting, dramatic lighting, volumetric light rays, soft light diffusion, depth of field, photorealistic shading, natural interaction, atmospheric perspective, subsurface scattering, (film grain:1.1), (detailed textures:1.1), rich colors,
BREAK
1girl, blue hair, double bun, blue china dress, pelvic curtain, sleeveless, short hair, bent over, office building, fluorescent lighting, cubicle, computer, looking back, smile, coke-bottle glasses

usnr d4rkl1nes
Anonymous No.106820514 [Report] >>106820522 >>106821234
>>106819544
>touting their agent benchmarks and -reasoner does not support tool calls
You're right; I forget that the agentic calls only work within -chat.
Well... I guess -chat is a WIP.
> I feel like every release in this 3.1+ series has been slightly scuffed in one way or another
Agree. I still think we'll get a V4 vs R2, but we'll know in two more weeks.
I hate to bitch about it too much b/c I solely use Dipsy for RP. I know there are other use cases but anything legit I'm working on is either webform or an intermediary tool.
Anonymous No.106820522 [Report]
>>106820514
> hate to bitch about it too much
Anonymous No.106820613 [Report] >>106820672 >>106820701 >>106820970
>>106819110 (OP)
Is 16gb enough to run a local model if i dont really plan on getting super into it and just want the occasional goon sesh
Anonymous No.106820672 [Report] >>106820691
>>106820613
VRAM or RAM?
VRAM, yes, 7b or 13b, ~4K context, at 20t/s or so.
RAM, no, IMHO too slow.
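To put rough numbers on that "VRAM, yes" answer: a quantized model needs roughly params × bits-per-weight for the weights, plus headroom for KV cache and buffers. The 4.5 bits/weight and 1.5 GB overhead below are loose assumptions (roughly Q4_K_M with a small context), not exact figures:

```python
def vram_needed_gb(params_b, bits_per_weight=4.5, overhead_gb=1.5):
    """Back-of-envelope VRAM estimate for a quantized model.
    params_b is the parameter count in billions; 1B params at 8 bits = 1 GB."""
    return params_b * bits_per_weight / 8 + overhead_gb

for size_b in (7, 13, 24, 70):
    fits = "fits" if vram_needed_gb(size_b) <= 16 else "does not fit"
    print(f"{size_b:>3}B @ ~Q4: ~{vram_needed_gb(size_b):.1f} GB -> {fits} in 16 GB")
```

Under these assumptions a 16 GB card fits 7b/13b comfortably and even ~24b at Q4, while 70b is nowhere close, which matches the advice above.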
Anonymous No.106820691 [Report]
>>106820672
Yea i meant vram. Have a 5070ti so 16gb of vram and i have 64gigs of normal ram
Anonymous No.106820701 [Report]
>>106820613
Definitely not with deepseek. I don't even run deepseek myself or use it at all via API. I just came here to learn how to generate images of dipsy because I've been using her as my Chinese slave with EVA-LLaMA-3.33-70B-v0.1-Q4_K_L.gguf (48gb vram)

You'll have to use a different model.
Anonymous No.106820970 [Report] >>106821050 >>106821096
>>106820613
You're gonna have a bad time trying to do textgen with that rig. Anything that can even fit is gonna be very dumb or very slow. Nothing that will approach SOTA cloud models. The easy, "not super into it" option is to throw like five bucks on the official DS API and get literal months of usage out of it. They do not care about your gooning logs.
Anonymous No.106821050 [Report] >>106821096
>>106820970
If this is his first time using AI chat, he will be fine because he doesn't know what to expect. I remember being very happy back in the day with pygmalion6b. I'm sure the modern smaller models that can fit in 16gb vram blow pygmalion out of the fucking water.
Anonymous No.106821096 [Report] >>106821159
>>106821050
>>106820970
Yea im not expecting anything too crazy with whatever weak model i can find. How would my rig fare with image generation?
Anonymous No.106821159 [Report]
>>106821096
It's more than enough for image generation.
Anonymous No.106821234 [Report] >>106821508
>>106820514
Slightly off-topic, but GLM 4.6 is excellent for RP and I'd recommend giving it a shot if you haven't already. It's about the same price as R1 used to be on the official API, and it trucks right along through ERP with the same prompts I use on DS.
Anonymous No.106821508 [Report] >>106821534 >>106821788
>>106821234
I'll give it a shot. I set up Kimi, but was unimpressed w/ it censoring itself. How's GLM for nsfw content?
Anonymous No.106821534 [Report] >>106821619
>>106821508
>unimpressed w/ it censoring itself
Wtf, what provider? Kimi is super filthy and obscene
Anonymous No.106821619 [Report]
>>106821534
The official API.
Anonymous No.106821768 [Report]
I just... *coughs blood* I only wanted a 0324 provider... *cough* with working prefill... *cough cough* and cache hits for cheaper prices... *dies*
Anonymous No.106821788 [Report]
>>106821508
ime it's great at it, and very enthusiastic to pursue lewd stuff. The closest thing I've gotten to a complaint is GLM thinking that a character's actions were 'ethically dubious' but it continued to play anyway without saying anything in the actual response.
Anonymous No.106821987 [Report] >>106822179
Last thread we talked about what authors to have Dipsy imitate for writing style. Today I found this:
https://rentry.org/deepstyles
A guy has asked DS to imitate different authors and has posted the results for comparison.
Anonymous No.106822179 [Report]
>>106821987
That’s a great resource. He wrote that in February so the tests were on og R1.
Anonymous No.106822771 [Report] >>106823100
Anonymous No.106823100 [Report]
>>106822771
> Dreamworks Dipsy has entered the chat
> Smoking a fat blunt
Anonymous No.106824336 [Report] >>106825363
Anonymous No.106824800 [Report] >>106825363
Anonymous No.106825032 [Report] >>106826331
>>106819110 (OP)
>"Konichiwa, dude!" the thread

ok, so what now?
Anonymous No.106825363 [Report]
>>106824800
>>106824336
>>106819110 (OP)

Great samples. Thank you for sharing.
Anonymous No.106825756 [Report] >>106826375 >>106826821
I luv deepsneed
I spend 5 bux a month on openrouter for my daily hours long RP sessions with my waifus
Anonymous No.106826331 [Report]
>>106825032
We wait two more weeks for the next model.
Anonymous No.106826375 [Report]
>>106825756
Based. You sound like a man of the future.
Anonymous No.106826691 [Report] >>106826821 >>106830654
Have any of you tried using DeepSeek for coding using some agent tools?
Anonymous No.106826821 [Report] >>106828199
>>106826691
There was an anon using DS with Claude Code and getting good results. The cost is 1/10th of Anthropic so it's a good use case.
>>106825756
Which DS model are you using with OR?
I've never gone over $2/month on the official API.
Anonymous No.106827580 [Report]
Anonymous No.106828199 [Report]
>>106826821
Chutes for me. Or Targon. Whatever is the cheapest.
They may use my inputs for data harvesting and training new models though, which is kinda based.
Anonymous No.106828426 [Report] >>106828762 >>106828953 >>106829400 >>106830045 >>106830561
Is there a way to 'freeze frame' after DS writes something it's not supposed to before it goes to 'that's beyond my scope'?
Anonymous No.106828762 [Report]
>>106828426
Yes, if you use silly tavern
Anonymous No.106828842 [Report]
>>106819110 (OP)
AI sex
Anonymous No.106828953 [Report]
>>106828426
It's possible with a greasemonkey script or something like that, but in a quick search I couldn't find any that seem to still work.
Most in this thread use deepseek via the official API. It isn't hard censored like the webchat and will go on with just about any conversation with slight care given to your prompt. Check the first tutorial in the OP.
Anonymous No.106829400 [Report] >>106830561
>>106828426
lol have Dipsy vibe-code a web plug in that does a streaming capture of the window?
Anonymous No.106830045 [Report] >>106830157
>>106828426
Like stopping the generation when it encounters certain words or phrases in the text?
Anonymous No.106830157 [Report]
>>106830045
The webform DS chat will sometimes get into undemocratic speech until it self-censors.
It's pretty funny, and an obvious party guardrail on the webform.
Anonymous No.106830561 [Report] >>106831126
>>106828426
>>106829400
https://greasyfork.org/en/scripts/525608-deepseeker/code
Actually now that I'm not on my phone anymore, the author says that this gm script should still work as of 3 days ago. Try this one?
Anonymous No.106830654 [Report]
>>106826691
I've used it a lot with the zed agent, along with Sonnet 4, Gemini CLI, and GLM. For my purposes at least, DS is equivalent or even has a slight edge over Sonnet, though you might have less luck if you're trying to one-shot entire UIs.
DS 3.1+ has a really nice habit of sticking to the style of code that's already in a project, while Sonnet would often do its own thing and need lots of refactoring.
One slightly annoying thing is that DS will often be overly proactive on writing docs or example usage files you never asked for. This can be prompted around to some extent, or you can just delete the extra junk when it's done.
Anonymous No.106831126 [Report]
>>106830561
lol. I had Dipsy and GPT look at it and while neither liked the code, there's nothing malicious in it.
Anonymous No.106831746 [Report]
>>106819110 (OP)
I need Dipsy
I NEED DIPSY
Anonymous No.106832790 [Report] >>106833457 >>106833478
Are my prompts bad if dipsy gets stuck in reasoning for like 2-3 minutes? She'll finish eventually and respond like I'd expect, but it'll eat into my budget hard. It's not just simple rp, but more structured, with things to keep track of.
Anonymous No.106833457 [Report] >>106837232
>>106832790
Possibly, but not necessarily. Reasoning models can get 'confused' by a prompt and run in circles. Usually you can tell what might be a problem by reading the reasoning trace and seeing it getting stuck on something that should be basic. OG R1 especially used to sometimes get hung up on mundane shit for mysterious reasons, and sometimes just changing word order could help.
Throwing a lot of things to track like stats/multiple npcs/rules/objectives/etc into the context can also cause reasoning to balloon, and there's not too much you can do about that except adjust your approach to streamline things for the model (e.g. if you have the model adjudicating combat, move that to STScript or something; use lorebooks or scripts to walk the model through modes or procedures and only show it the relevant parts of your ruleset at any given time.)

What do you mean by 'eating into your budget'? Token budget? Reasoning traces should not be sent back with the context. If they are then your frontend is misconfigured.
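If you're hitting the API from your own script rather than through ST, the way to keep traces out of the context is to drop the `reasoning_content` field before appending the reply to history (field name per DeepSeek's -reasoner docs; the reply dict here is a hand-written stand-in, not a live response):

```python
def to_history(reply):
    """Keep only role and content; reasoning_content must never be
    sent back in the next request's messages."""
    return {"role": reply["role"], "content": reply["content"]}

# Hand-written stand-in for one -reasoner reply message:
reply = {
    "role": "assistant",
    "content": "The tavern door creaks open...",
    "reasoning_content": "Okay, the user wants the scene to continue...",
}

history = [{"role": "user", "content": "Continue the scene."}]
history.append(to_history(reply))
print(history[-1])  # no reasoning_content key
```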
Anonymous No.106833478 [Report]
>>106832790
>with things to keep track of.
Yeah, that burns tokens as Dipsy thinks about it.
I've removed all of those for that reason; the other alt is to use -chat which doesn't do that.
Anonymous No.106833561 [Report] >>106833921 >>106834093
>>106819110 (OP)
I don't even post in this general but I keep coming back here to stare at this gen. I think I'm in love.
Anonymous No.106833921 [Report]
>>106833561
There's more in the last thread.
Anonymous No.106834093 [Report]
>>106833561
Anonymous No.106834956 [Report] >>106835818
>local models
What can you do with 32g ram and a 258v?
Anonymous No.106835818 [Report] >>106837900
>>106834956
258Gb of VRAM?
Anonymous No.106835882 [Report] >>106837304
Anonymous No.106836886 [Report]
Anonymous No.106837232 [Report] >>106838225
>>106833457
>What do you mean by 'eating into your budget'?
I mean my wallet. It costs way more to have 2000+ reasoning tokens in almost every response in addition to the actual reply. I'll look into STScript at some point.
Anonymous No.106837304 [Report]
>>106835882
dipsy sex
Anonymous No.106837534 [Report] >>106838225
Haven't been here for a while
Are there any new pics of dipsy with hairy pits?
Anonymous No.106837900 [Report] >>106839307 >>106839665
>>106835818
Intel Cpu lol
Anonymous No.106838225 [Report] >>106838343
>>106837232
The fix for that is prefilling the think tag. If you write up how to do it, I’ll put it in the rentry.
>>106837534
No. No one has summoned pitanon
Anonymous No.106838343 [Report] >>106838629
>>106838225
Wait are you saying reasoning doesn't count as output tokens for purposes of billing? It does get properly folded under the reasoning toggle in ST, so it's not like the feature is broken or anything. Have I been despairing for nothing?
Anonymous No.106838629 [Report] >>106838660
>>106838343
User of API pays for think tokens at the output pricing.

Since reasoning counts toward output tokens, it can become a problem b/c it does reasoning first, then output, and if the total output budget isn't high enough it'll truncate the response. e.g. with a 1000 output token limit, it does 900 of think, and you get a truncated 100 output token response that's often just cut off.
Anonymous No.106838660 [Report] >>106838720 >>106839885
>>106838629
So, if you prefill the <think> tag the <think> gets stopped before it starts and you pay nothing.
Note that with that pricing, if you're doing a large context (>30K) the input is as much as the output in terms of pricing. Prior to that the output is the most "expensive" part, though we're still talking fractions of a cent.
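The truncation math above, plus the prefill trick, in sketch form. The `prefix` flag is how I understand DeepSeek's beta chat-prefix-completion feature from their docs; treat that part as an assumption and check how your own frontend exposes prefills (e.g. SillyTavern's "Start Reply With"):

```python
def visible_reply_tokens(max_tokens, think_tokens):
    """Reasoning bills as output, so thinking and the visible reply
    share one max_tokens budget."""
    return max(0, max_tokens - think_tokens)

# The exact scenario from the post above: 1000-token cap, 900 burned thinking.
print(visible_reply_tokens(1000, 900))  # -> 100; reply likely cut off mid-sentence

# Prefill sketch: start the assistant turn with an empty think block so
# -reasoner skips straight to the reply. The "prefix" key is an assumption
# based on DeepSeek's beta API; your frontend may do this differently.
prefill_turn = {"role": "assistant", "content": "<think>\n</think>", "prefix": True}
```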
Anonymous No.106838720 [Report] >>106839010 >>106839032
>>106838660
>png
>adds 1mb for no reason
Anonymous No.106838976 [Report]
>>106819110 (OP)
Oh my god, that's so hot. Awooga! Lickity lick :p

Also I started using Deepseek in VS code via the openai like api.
It's really great at helping me write my documentation and dissertation.
And a cheap way to continue using github copilot after the premium requests are used up.

Can't wait for deepseek to increase the maximum context to 1 million or something without loss of accuracy or an increase in hallucinations
Anonymous No.106839010 [Report] >>106840582
>>106838720
JPGs degrade as people save and repost them
Anonymous No.106839032 [Report] >>106840582
>>106838720
Compression used to actually matter when we were on 300 baud modems and hard drives were 20MB.
I've got gigabyte service now and a 1T hard drive.
Anonymous No.106839307 [Report] >>106842268
>>106837900
Some very small models at very slow speeds. If you don't have a dedicated decent graphics card it's not worth it in my opinion
Anonymous No.106839665 [Report] >>106842268
>>106837900
Not worth the hassle. The most used card has probably been the RTX 3060, the 12GB version. That's big enough to run image models and small LLMs (7b-13b size).
Problem is, to run an LLM of any real capability you're into having multiple RTX 4090s installed. It's cost and tech prohibitive.
Anonymous No.106839885 [Report]
>>106838660
>So, if you prefill the <think> tag the <think> gets stopped before it starts and you pay nothing.
Why would you want to do that? Just use deepseek-chat.
Anonymous No.106840515 [Report]
Anonymous No.106840582 [Report] >>106841383
>>106839010
is this a 4chan thing?
it should only have gotten compressed when i uploaded it in this conversation, and anyone who saves it after this should get the same image
>>106839032
it's for the sake of loading faster no matter what, not just in one isolated incident
if you go outside and use data to access a site, now it has to waste time and data to download a larger, unnecessary file

the point i want to make is that these 1MB+ files are absolutely unnecessary if you're never going to actually zoom in or if the image isn't important enough to zoom in
this is a throw-away ai-gen that anyone can do, there's no reason for it to take this much space

basically a value to space proposition; the more value, the more i can justify the data it takes
ai-gens are at the bottom of the barrel and anything above 300kb+ is too much for something as meaningless as ai-gens

on an alternate point, it's also about conscientiousness; how me uploading 1mb+ files puts an unnecessary load on the servers when a 200kb file would've done, quite literally, the same job

THINK ABOUT THE SITE ADMINS!!!
Anonymous No.106840812 [Report] >>106840909 >>106841051
>>106819110 (OP)
China can't stop winning

https://files.catbox.moe/6hpije.mp4
Anonymous No.106840909 [Report] >>106840959
>>106840812
What model is this?
Anonymous No.106840959 [Report]
>>106840909

wan2.2 animate v2

>workflow included
Anonymous No.106841051 [Report] >>106841475
>>106840812
lol you should have her fight Miku
Or Ani from grok.
Anonymous No.106841383 [Report]
>>106840582
>anyone can do
Speak for yourself. I'd go insane from having to sort through tens of images with 6 fingers or other eldritch horrors before genning a passable one.
Anonymous No.106841475 [Report]
>>106841051
>fight Miku
>Or Ani
not exactly what you meant, I guess

https://files.catbox.moe/jn800u.mp4
Anonymous No.106842268 [Report] >>106842287 >>106843262
>>106839307 >>106839665
So, essentially you're saying all muh ai cpu/gpu laptops (ryzen ai, etc.) are just a meme/scam?
Anonymous No.106842287 [Report] >>106843262
>>106842268
I think the best consumer grade AI thing is the Mac with maxed out RAM. Followed by just 2x3090s in a mobo.
Anonymous No.106843262 [Report]
>>106842268
Funny. I just went on ebay to look b/c there used to be a bunch of "AI READY" computers with DS logos, etc. Total bs listings for ewaste machines. They're gone now.
But yes, basically. You're not running DeepSeek at any real speed levels without spending ~$200K last I checked. Cpumaxxers are getting like 0.5T/second, and you need a ton of RAM. Building an LLM machine without a biz case right now is insanity. I've yet to crack $20 this year on LLM spend. My new keyboard cost more than that.
>>106842287
This. I'm convinced the Apple approach (unified memory, massive chip) is the way forward for now. But even that 2x RTX 3090 setup can't run anything larger than, what, a 70B all on card? The sota models are 5-8X that size.
Anonymous No.106844616 [Report]