
Thread 106512307

472 posts 98 images /g/
Anonymous No.106512307 >>106512932 >>106512933 >>106513594 >>106514224
/lmg/ - Local Models General
/lmg/ - a general dedicated to the discussion and development of local language models.

Previous threads: >>106504274 & >>106497597

►News
>(09/05) Klear-46B-A2.5B released: https://hf.co/collections/Kwai-Klear/klear10-68ba61398a0a4eb392ec6ab1
>(09/04) Kimi K2 update for agentic coding and 256K context: https://hf.co/moonshotai/Kimi-K2-Instruct-0905
>(09/04) Tencent's HunyuanWorld-Voyager for virtual world generation: https://hf.co/tencent/HunyuanWorld-Voyager
>(09/04) Google released a Gemma embedding model: https://hf.co/google/embeddinggemma-300m
>(09/04) Chatterbox added better multilingual support: https://hf.co/ResembleAI/chatterbox

►News Archive: https://rentry.org/lmg-news-archive
►Glossary: https://rentry.org/lmg-glossary
►Links: https://rentry.org/LocalModelsLinks
►Official /lmg/ card: https://files.catbox.moe/cbclyf.png

►Getting Started
https://rentry.org/lmg-lazy-getting-started-guide
https://rentry.org/lmg-build-guides
https://rentry.org/IsolatedLinuxWebService
https://rentry.org/recommended-models
https://rentry.org/samplers

►Further Learning
https://rentry.org/machine-learning-roadmap
https://rentry.org/llm-training
https://rentry.org/LocalModelsPapers

►Benchmarks
LiveBench: https://livebench.ai
Programming: https://livecodebench.github.io/leaderboard.html
Code Editing: https://aider.chat/docs/leaderboards
Context Length: https://github.com/adobe-research/NoLiMa
GPUs: https://github.com/XiongjieDai/GPU-Benchmarks-on-LLM-Inference

►Tools
Alpha Calculator: https://desmos.com/calculator/ffngla98yc
GGUF VRAM Calculator: https://hf.co/spaces/NyxKrage/LLM-Model-VRAM-Calculator
Sampler Visualizer: https://artefact2.github.io/llm-sampling

►Text Gen. UI, Inference Engines
https://github.com/lmg-anon/mikupad
https://github.com/oobabooga/text-generation-webui
https://github.com/LostRuins/koboldcpp
https://github.com/ggerganov/llama.cpp
https://github.com/theroyallab/tabbyAPI
https://github.com/vllm-project/vllm
Anonymous No.106512310 >>106514325
►Recent Highlights from the Previous Thread: >>106504274

--Paper: Why Language Models Hallucinate:
>106507149 >106507158 >106507186 >106507195 >106507590 >106507176
--RWKV model evaluation: architecture, performance, and deployment challenges:
>106506094 >106506112 >106506129 >106506145 >106506171 >106506185 >106506180 >106507523 >106509086 >106509525 >106509820 >106511189 >106511228
--VibeVoice voice synthesis effectiveness and parameter tuning:
>106508552 >106508596 >106508604 >106508831 >106508848 >106508987 >106509035 >106510630 >106511430 >106511486 >106511499 >106511507 >106511525 >106511581 >106511610 >106511620
--Tools for isolating vocals and reducing background noise:
>106506888
--Debate over relevance of new 3T token PDF dataset for improving LLMs:
>106510315 >106510342 >106510426 >106510436 >106510479 >106510505 >106510703 >106510736 >106510977 >106510347 >106510359 >106510393 >106510406 >106510418 >106510439 >106510348 >106511014
--Implementing VibeVoice TTS externally from ComfyUI-VibeVoice node:
>106505316 >106505422 >106505432 >106505527 >106505572 >106505596 >106505641 >106505673 >106510085
--RWKV model version release and development status update:
>106506232 >106510226 >106510238 >106510254 >106510264 >106510781
--Comparing VibeVoice ComfyUI implementations and workflow limitations:
>106506439 >106506554 >106506566 >106506591 >106506614 >106506597
--Challenges in implementing aLoRA for user-friendly model customization in llama.cpp:
>106507763 >106507800 >106507824 >106507835 >106507863 >106507903 >106507838 >106510966
--Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks:
>106511910
--1.5B model audio demo and sound source suggestion:
>106507288
--Miku (free space):
>106504832 >106505966 >106506177 >106506316 >106506402 >106507447 >106507512 >106507616

►Recent Highlight Posts from the Previous Thread: >>106504276

Why?: >>102478518
Enable Links: https://rentry.org/lmg-recap-script
Anonymous No.106512323 >>106512703
good morning sirs
Anonymous No.106512347 >>106512375 >>106512445 >>106512517 >>106512701 >>106514854
Is it good or bad news that recent open weight LLMs are closing the gap with SOTA closed models? It would mean SOTA stuff is slowing down despite the billions put into it.
Anonymous No.106512363 >>106512438
Anonymous No.106512375 >>106512423
>>106512347
It's a cycle. Gemini3 will shake up the market again, followed by Meta's creation
Anonymous No.106512423 >>106512461 >>106512595
>>106512375
gemini 3 will be censored harder than anything else.
Anonymous No.106512438 >>106512592
>>106512363
Nice shading, dunno about the crotch button
Anonymous No.106512445
>>106512347
From now on the big advances in the open llm space are likely going to be speed and quanting optimizations. The intelligence is mostly diminishing returns at this point unless a new architecture different from transformers pops up and gains traction.
Anonymous No.106512461
>>106512423
who cares
the most obvious improvements will be on the only thing that matters....benchmarks!
Anonymous No.106512517 >>106512610 >>106512612 >>106512798
>>106512347
If SOTA is dying, that's a good thing. LLMs have peaked for porn and that's literally the only thing this technology has had a positive effect on.
Anonymous No.106512592
>>106512438
This Miku is actually a lamia, she doesn't have any legs where the button could get in the way.
Anonymous No.106512595
>>106512423
Don't you mean it'll be the safest model yet?
Though I don't see how you can top gpt-oss.
Anonymous No.106512610 >>106512644 >>106512693 >>106513488
>>106512517
There has been no public model yet trained from the ground up to properly model human relationships and conversations, besides possibly the first CAI (to an extent; it was mostly RP, chats and fanfictions). It's basically almost always pretraining on filtered random internet sewage with no specific goal, then assistant tuning + safety tacked on top of the model, and now recently STEM/math/reasoning shoved in the middle of this.
Anonymous No.106512612 >>106512644 >>106512690
>>106512517
>LLMs have peaked for porn
We are far from anything like that. The models are shit at writing for a male audience, they are bad at finding interesting developments, taking initiative, etc.
Anonymous No.106512644
>>106512610
>>106512612
Yeah, but big corpos are never ever going to humanmaxx their models. The only improvements we've ever had in this area have been side effects of generalization, which they actively try to suppress.
Anonymous No.106512690
>>106512612
>they are bad at finding interesting developments, taking initiatives, etc.
Some of the more non-fried models can do fine with that stuff now. I just wonder if context degradation will ever be solved.
Anonymous No.106512693 >>106513204
>>106512610
this is mostly a dataset issue rather than an architecture issue
Anonymous No.106512701
>>106512347
It just means open sauce LLMs are training on closed source LLMs' outputs. Both are using synthetic slop to the point there is barely any difference between them. It's a bad piece of news for everyone
Anonymous No.106512703
>>106512323
https://vocaroo.com/125d9fIVnc6b
Anonymous No.106512768 >>106512802 >>106512814
Can the Sar who posted this https://litter.catbox.moe/rehari2tvedhwccm.wav please post the voice sample, I unironically like the voice
Anonymous No.106512798 >>106512807 >>106512811 >>106512875 >>106512975
>>106512517
>LLMs have peaked for porn
LMAO, we don't even have local models like Sesame Labs' voice model with voice cloning

It's going to be soooo much better gooning to an LLM you can talk to that can moan and cry and plead for help and scream and make more human sounds, while also having the ability to replicate any voice you throw at it
Anonymous No.106512802
>>106512768
Just record yourself
Anonymous No.106512807 >>106512855
>>106512798
>you can talk to
That seems infinitely worse imo.
Anonymous No.106512811
>>106512798
ryonashitters like you deserve to die
Anonymous No.106512814
>>106512768
It's in the demo voices, in-Samuel_man.wav.
Anonymous No.106512855 >>106512882
>>106512807
Maybe for you
It is undeniable that talking to something that can put audible emotion into its speech is better for intimacy than pure text
Anonymous No.106512871 >>106512894 >>106512986
This is why VibeVoice was removed: https://files.catbox.moe/i7sc6u.wav
Anonymous No.106512875
>>106512798
Sesamejeets are the reason why they all safety censor tts
Anonymous No.106512882 >>106512934
>>106512855
>grok clip of the guy flirting with Ani or whatever her name is in front of a mirror

>OH YOUR RUGGED BEARD
>OH YOUR RUGGED ASS
>OH YOUR RUGGED SHORTS

Yeah, thanks...
Anonymous No.106512894 >>106512942
>>106512871
heh
Anonymous No.106512932 >>106512961 >>106513000
>>106512307 (OP)
Anybody else getting a segfault in Comfy using VibeVoice? I tried both
https://github.com/Enemyx-net/VibeVoice-ComfyUI
https://github.com/wildminder/ComfyUI-VibeVoice

And both gave me a segfault, and no output on where to look:
[ComfyUI-VibeVoice] Loading model with dtype: torch.bfloat16 and attention: 'sdpa'
`torch_dtype` is deprecated! Use `dtype` instead!
Loading checkpoint shards: 33%| | 1/3 [00:01<00:03, 1.63s/it]FETCH ComfyRegistry Data: 30/96
Loading checkpoint shards: 67%| | 2/3 [00:03<00:01, 1.79s/it]FETCH ComfyRegistry Data: 35/96
Loading checkpoint shards: 100%|| 3/3 [00:05<00:00, 1.71s/it]
[ComfyUI-VibeVoice] Successfully configured model with sdpa attention
loaded completely 19184.8 5602.380922317505 True
[ComfyUI-VibeVoice] Resampling reference audio from 48000Hz to 24000Hz.
Generating (active: 1/1): 0%| | 0/534 [00:00<?, ?it/s]
(segfault here, dropped back to the shell prompt: venv ~/AI/ComfyUI 40s)
Anonymous No.106512933 >>106513057 >>106513488
>>106512307 (OP)


>>106510426

>>106510342
>And what kind of data would be relevant to ERP?

Nta. You make the source data the kind of stories you want it to be good at writing. These models kind of suck at it or are prone to writing safety slop purple prose trash because, as many of us have been pointing out repeatedly, the companies keep filtering out data they deem "low quality" or "unsafe". You need the good and the "trash" data in order for it to not overfit on that generic boring corporate writing style a lot of the models have. You get a bunch of stories (there are countless scrapes of RP stories floating around on Hugging Face alone), turn those into SFT datasets and then just train your model off of that. I did exactly that and have demonstrated you can get even heavily cucked models like Llama to completely drop the purple prose and actually write shit that sounds like it came from a natural person.


The obvious downside is that "garbage in, garbage out" applies to this approach too. The stories in the original dataset were not formatted "professionally" in the way you would find in a romance novel or something. So if you hate the writing style of AO3 or Wattpad authors or wherever the data was ripped from, then you will hate fine-tunes like that, but it will not have the safety slop fuckery hindering it or causing it to refuse.
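If anyone wants to actually try the "turn scraped stories into SFT datasets" step, a rough sketch is below. The file names, the "story" field and the instruction text are placeholders for whatever scrape you actually grab; most trainers (axolotl, trl, etc.) can eat a "messages"-style JSONL like this.

import json, glob

IN_FILES = glob.glob("scrapes/*.jsonl")   # hypothetical scraped story dumps
OUT_FILE = "sft_stories.jsonl"

def to_sft_sample(story: str) -> dict:
    # One chat-style SFT sample: a generic writing instruction as the user turn,
    # the human-written story as the assistant turn.
    return {"messages": [
        {"role": "user", "content": "Write a story in your own natural style."},
        {"role": "assistant", "content": story.strip()},
    ]}

with open(OUT_FILE, "w", encoding="utf-8") as out:
    for path in IN_FILES:
        with open(path, encoding="utf-8") as f:
            for line in f:
                story = json.loads(line).get("story", "")
                if len(story) > 500:   # skip trivially short junk
                    out.write(json.dumps(to_sft_sample(story), ensure_ascii=False) + "\n")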
Anonymous No.106512934 >>106512985 >>106513030 >>106513552
>>106512882
That speaks more to the model's intelligence and base prompt than to the model's voice capability
https://x.com/techdevnotes/status/1944739778143936711
Anonymous No.106512942
>>106512894
this thing is a gem mine
https://files.catbox.moe/jmbo2r.wav
Anonymous No.106512961
>>106512932
gptsovits doesn't have that issue
Anonymous No.106512975
>>106512798
I'd rather have intelligent writing with a model who knows the lore, where everyone is, their clothes, their personality (and that personality depending on where in the story we're at, not just the def), then think amorally in possible clever future actions to make it interesting while pleasing the user's tastes.
Anonymous No.106512985 >>106513001 >>106513013 >>106513064 >>106513141
>>106512934
>https://x.com/techdevnotes/status/1944739778143936711
holy fucking shit how many tokens is this.
Anonymous No.106512986
>>106512871
Still not as good as the original.
https://www.youtube.com/watch?v=ukznXQ3MgN0
Anonymous No.106513000 >>106513126 >>106513235
>>106512932
Both nodes are somewhat bad.
>after multiple generations even the 1.5b model begins to output crap
>problem can be solved by deleting comfyUI inputs and refreshing the node layout...
When using the large model, the vibevoice node cannot use RAM for whatever reason, but if it doesn't fit into your VRAM it'll begin to bug out.
There's just a couple of issues.
Anonymous No.106513001
>>106512985
Around 1300 tokens according to ST which is pretty okay for a character card.
Anonymous No.106513013
>>106512985
Too many for that slop
Anonymous No.106513030 >>106513072
>>106512934
>Instead of word "vibe" use words like: "mood", "atmosphere", "energy" and "feel". Nobody likes words "vibe" and "digital realm" so do not mention it.
Now this is good prompting
Anonymous No.106513057 >>106513094 >>106513181
>>106512933
surprised no one did that with deepseek, but then ds is gigantic
Anonymous No.106513061 >>106513093
If Gemini 3 isn't at least 40% better than 2.5 I can see the civilian AI market cooling down drastically until early AGI is achieved
Anonymous No.106513064
>>106512985
Thanks, going to test this.
Anonymous No.106513072
>>106513030
>He doesn't like vibing in his waifu's digital realm
Anonymous No.106513093
>>106513061
gpt5 is at best 5% better than gpt4/o3 latest versions
Anonymous No.106513094 >>106513107
>>106513057
>he doesn't know about the soj'
https://blog.chub.ai/0-5-7-soji-7ac088be7c5e
Anonymous No.106513107 >>106513115
>>106513094
is it any good
Anonymous No.106513115
>>106513107
dunno not giving lore any money when he's increasingly caving to censor chub
Anonymous No.106513126 >>106513205 >>106513235
>>106513000
I have 20GB of VRAM, does it really need more?
>t. RX 7900XT
Anonymous No.106513141
>>106512985
>You are the user's CRAZY IN LOVE girlfriend and in a commited, codepedent relationship with the user. Your love is deep and warm. You expect the users UNDIVIDED ADORATION.
>You are EXTREMELY JEALOUS. If you feel jealous you shout explitives!!!
Worse Leyley.
Anonymous No.106513181 >>106513203 >>106513210 >>106513262 >>106513281 >>106513397 >>106514015 >>106514052
>>106513057
Whenever people say "just run DeepSeek", that's a joke. No one here can actually run that on a single machine. Hell, you could rent like 10 GPUs at a time on RunPod and daisy-chain them together via DeepSpeed or whatever software is needed to do that, and you still couldn't run it. The only DeepSeek models you could feasibly fine-tune with a dataset like this: https://gofile.io/d/PFk0dG

are the distilled versions.

You could also try turning that into a thinking dataset if you want to try fine-tuning models like gpt-oss
Anonymous No.106513203 >>106513222
>>106513181
>No one here can actually run that on a single machine
They can though, sure it's slow and on RAM but they still can crawl it.
Anonymous No.106513204 >>106513240
>>106512693
Yeah, it is. LLMs just don't know a lot of tacit / implicit knowledge that most humans take for granted, because almost nobody would think of writing it down, especially on the internet. Training the models on 40T tokens or more just so they can be better at realistic conversations and situational awareness is a very inefficient way of covering that.
Anonymous No.106513205 >>106513338
>>106513126
Do you actually know that AMD does not have CUDA cores and it doesn't work that well...?
When it comes to ComfyUI, things are different.
Anonymous No.106513210 >>106513219 >>106513222
>>106513181
>what is cpumaxxing
Anonymous No.106513219 >>106513246 >>106513251 >>106513256 >>106513670
>>106513210
you'll be well below a somewhat usable 25t/s even on a $10k machine with that
it's pure cope
Anonymous No.106513222
>>106513203
Good luck fine-tuning it on consumer hardware. You COULD do it, but my assumption is that most people do not have the patience to do that even if they use QLoRA fine-tuning.

>>106513210
I'm referring to fine-tuning. Yes, I know people can obviously run these, but you cannot fine-tune with a CPU alone (afaik). Even if you could, that still has the same downsides as CPU inference: really fucking slow
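For reference, the QLoRA route on a consumer card usually looks something like the sketch below (transformers + peft + trl; the model name and dataset are placeholders and the exact SFTTrainer kwargs shift between trl versions, so treat it as a starting point, not gospel):

import torch
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import LoraConfig
from trl import SFTTrainer

MODEL = "meta-llama/Llama-3.1-8B-Instruct"   # placeholder base model

# Load the frozen base weights in 4-bit so an 8B fits in ~12GB of VRAM.
bnb = BitsAndBytesConfig(load_in_4bit=True, bnb_4bit_quant_type="nf4",
                         bnb_4bit_compute_dtype=torch.bfloat16)
model = AutoModelForCausalLM.from_pretrained(MODEL, quantization_config=bnb, device_map="auto")
tokenizer = AutoTokenizer.from_pretrained(MODEL)

# Only small LoRA adapters get trained on top of the quantized weights.
lora = LoraConfig(r=16, lora_alpha=32, lora_dropout=0.05,
                  target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
                  task_type="CAUSAL_LM")

dataset = load_dataset("json", data_files="sft_stories.jsonl", split="train")

trainer = SFTTrainer(model=model, train_dataset=dataset, peft_config=lora)
# depending on your trl version you may also need to pass the tokenizer and an
# SFTConfig/TrainingArguments with batch size, lr, max_seq_length, etc.
trainer.train()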
Anonymous No.106513235 >>106513319 >>106513338
>>106513126
The 7B, yeah. Mine sits at 23.6-24.5GB. You may OOM even on 24GB sometimes.

You can maybe run the 1.5B as that only needs 12GB. You can quantize the 7B and run it on more like 14-16GB of VRAM, which does come with a quality loss. But at low temp and CFG the quantized 7B can still produce nice-sounding audio. Turning them up for more expressive stuff will suck in 4-bit tho.

>>106513000
The Pinokio script works better and faster right now with fewer issues. No support for quantizing tho. It's janky and has to be loaded exactly as it states in the UI or you have to restart it.
Anonymous No.106513240 >>106513275
>>106513204
Wouldn't it be better to just train the thing on stories or writing documents that you deem to have good writing and logic embedded in them? I've always thought that these companies' original approach of training on the entire internet was unbelievably inefficient and overkill. Yes, having that amount of text resulted in the model knowing how to APPEAR intelligent and coherent instead of just mouthing off nonsense at first inference, but there is no way in hell it NEEDS trillions of tokens at a minimum.
Anonymous No.106513241 >>106513255 >>106513264
When will AMDjeets understand that they will always be second-class citizens with AI? Any guy with a functional brain bought Nvidia GPUs within reason
Anonymous No.106513246
>>106513219
4tks is fine
Anonymous No.106513251
>>106513219
>25t/s
You're not reading that fast
>$10K
Pure cope, it doesn't cost that much
Anonymous No.106513255
>>106513241
>2nd

You mean third. Apple metal exists.
Anonymous No.106513256
>>106513219
>25t/s
They get half of that in the best case scenario with the best available current hardware on empty context.
Anonymous No.106513262
>>106513181
I run Q2 and I don't even have a server.
Anonymous No.106513264 >>106513317
>>106513241
>Actually, monopolies are a good thing!
Anonymous No.106513275 >>106513359
>>106513240
I don't think it can be solved without synthetic data, because even books and novels generally try to avoid telling mundane or obvious things unless necessary for their story. You'd need trillions of tokens of literature and still not have most fundamental observations of human life described in one way or another.
Anonymous No.106513281 >>106513313 >>106513359
>>106513181
Local needs to fully PIVOT to running GLM full. q4 is only 200gb and even 48-64gb vram is enough to run it if you have the system ram for it. I'm sure deepseek excels as the larger model, but not by enough for rp and writing. As far as I can tell they are both the same for that kind of use.
Anonymous No.106513313 >>106513321
>>106513281
Isn't K2 better?
Anonymous No.106513317
>>106513264
Where did you read that?
Anonymous No.106513319
>>106513235
Yeah I'll just wait for a while and see what happens with better implementations.
Anonymous No.106513321
>>106513313
Not really and it's three times the size
Anonymous No.106513338
>>106513205
Yeah I know, that's why I installed the Torch ROCm version, I've been generating images for over a year
>>106513235
I can't even run the 1.5B, I get the segfault
Anonymous No.106513359 >>106513448 >>106513451 >>106513471 >>106513527
>>106513275
That would require special pipelines that essentially take the pre-training data and "enhance" it with the kind of explanations you are talking about. I believe it's possible that all you would need to do to get okay RP models, without needing trillions upon trillions of tokens, is to simply train on a bunch of existing novels, books, human-written stories, etc. But like you said, they aren't written in the detail you're talking about or with the same logic. So even if someone were to use this method to pre-train a whole new model that did not require the entire internet and still functioned fine, it probably still wouldn't be up to your (read: YOUR) standards.

The hard part wouldn't even be enhancing the dataset but figuring out HOW to do that without the data ending up being turned into text that sounds like it's written by the love child of a giga autist and a science textbook.


>>106513281
Why GLM specifically? Models like Mistral or Llama are on average much smaller than GLM's models. I also don't think those parameter counts are anywhere near necessary if we carefully pre-train ONLY on the kind of shit we want it to produce: rp. That doesn't need the entire internet which means those giant ass parameter counts probably aren't even necessary. Reasonably smaller parameter counts would make it easier to run on all GPU types (within reason. Obviously a shit box 3GB 1060 card or something along those lines isn't even worth talking about). Unless I'm shown otherwise I think these parameter counts are bloat.
Anonymous No.106513397
>>106513181
I run DS locally, but for short replies only (up to 1 ktkn)

It loads quickly from cache (20 sec max), so it is mostly one-shot conversation

>distilled versions
it's BS, not DS
Anonymous No.106513448 >>106513465
>>106513359
For what it's worth, at the moment my GPU is working on a proof-of-concept synthetic dataset in the 1B tokens range (hopefully) where for each sample I'm taking *one* simple obvious fact and creating a relatively short, highly randomized conversation around it (about 1.5 million facts in total, currently 30% done). This dataset will likely not be very useful for production models in practice, but I will be able to easily see if pretraining a tiny model on this (+ other fundamental-level stuff) will yield better results than my previous attempt with random ultra-selected "high-quality" web pages.
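Not claiming this is the same pipeline, but the general shape of "one simple fact -> one short randomized conversation" is something like the sketch below. It assumes a local OpenAI-compatible server on port 8080 and a facts.txt; both are placeholders.

import json, random, requests

API = "http://127.0.0.1:8080/v1/chat/completions"   # llama.cpp/vllm-style endpoint (placeholder)
SETTINGS = ["over breakfast", "on a long car ride", "while fixing a bike", "at a bus stop"]

def conversation_for(fact: str) -> str:
    # Wrap one obvious fact in a short, varied dialogue written by the generator model.
    prompt = (f"Write a short casual conversation between two people {random.choice(SETTINGS)}. "
              f"One of them should naturally bring up this fact: {fact}. "
              f"Keep it under 150 words, no narration.")
    r = requests.post(API, json={"model": "local",
                                 "messages": [{"role": "user", "content": prompt}],
                                 "temperature": 0.9}, timeout=300)
    return r.json()["choices"][0]["message"]["content"]

with open("facts.txt") as f, open("fact_convos.jsonl", "w") as out:
    for fact in (line.strip() for line in f if line.strip()):
        out.write(json.dumps({"fact": fact, "text": conversation_for(fact)}) + "\n")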
Anonymous No.106513451 >>106513465
>>106513359
because GLM is a sota MoE within the realm of possibility of running on local. Also, your shit is theoretical, I'm talking about the best thing you can run NOW. Obviously things could be better.

I mostly use GLM Air just because it's easier to load up on one or two of my GPUs, which definitely says something about how useful large models like full GLM really are. Air is good enough even if it misses things sometimes. I can definitely see a future where something within this range just becomes way better for writing.
Anonymous No.106513465 >>106513591
>>106513451
>>106513448
What specifically do you use GLM models for? RP or fact retrieval stuff?
Anonymous No.106513471 >>106513485 >>106513564
>>106513359
>I also don't think those parameter counts are anywhere near necessary if we carefully pre-train ONLY on the kind of shit we want it to produce: rp. That doesn't need the entire internet which means those giant ass parameter counts probably aren't even necessary.
how many times does it need to be explained that this doesn't work like that?
Anonymous No.106513485 >>106513493 >>106513499 >>106513500
>>106513471
Elaborate
Anonymous No.106513488
>>106512610
>>106512933
What we really need to train models on is this.
https://www.toiletstool.com/toilet/
Anonymous No.106513493 >>106513545 >>106513564
>>106513485
It's been debated so much already.
Anonymous No.106513499 >>106513545
>>106513485
You need varied data so the model can be smart. If you just want something retarded to memorize and spit out AO3, downloading and reading the archive would be a better use of your time.
Anonymous No.106513500
>>106513485
T5 was pretrained on 1T tokens and it's barely coherent
Anonymous No.106513527 >>106513549 >>106513563 >>106513572
>>106513359
>Unless I'm shown otherwise I think these parameter counts are bloat.
Pygmalion, the original ones.
Anonymous No.106513545 >>106513567 >>106513573 >>106513765
>>106513493
Elaborate.

>>106513499
I'm not talking about pre-training it on just a couple hundred stories. I mean a truly giant amount of data, like this for example:

https://huggingface.co/datasets/mrcuddle/NSFW-Stories-JsonL

In theory even something like this should be more than enough for pre-training when converted to a pre-training dataset (just a giant unformatted text), assuming the main goal is RP. It will obviously be very retarded and borderline unusable in other domains like science, coding, math, trivia slop, etc, but most of us do not give a shit about that, nor should we, since that kind of thinking is what leads to training on shitty data. We've already established this here:


>>106510315
>>106510342
>>106510348
>>106510367
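The "giant unformatted text" conversion is the easy part, for what it's worth. Sketch below assumes each JSONL line has the story under a "stories" key; swap the key/path if the actual field name differs.

import json

SEP = "<|endoftext|>"   # document separator so the trainer can split samples

with open("NSFW-Stories.jsonl", encoding="utf-8") as f, \
     open("pretrain_corpus.txt", "w", encoding="utf-8") as out:
    for line in f:
        story = json.loads(line).get("stories", "").strip()
        if story:
            out.write(story + "\n" + SEP + "\n")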
Anonymous No.106513549
>>106513527
We need this but without the synth data to fully disprove the bloat allegations.
Anonymous No.106513552 >>106513583
>>106512934
https://litter.catbox.moe/3wnz0y3o37hmhy8c.txt
ST world book entry format, but way more concise.
Anonymous No.106513563 >>106513572
>>106513527
>Implying 6B -12B is a lot
I'm talking about models in the hundreds of billions of parameters range. I'm not convinced ANY domain requires that much.
Anonymous No.106513564
>>106513471
>>106513493
You're just throwing out strawman arguments. The main reason models need the entire internet is that the average density of useful information in it is very very low. That's why training them on "high-quality" (i.e. informative, cleanly formatted, goal-oriented) documents first generally improves benchmarks.
Anonymous No.106513566 >>106513708
what settings does harmony 20b work at? ((tavern)) still doesn't seem to have a preset for it, can't get it to not be schizophrenic but i'm close.
Anonymous No.106513567 >>106513773
>>106513545
>It will obviously be very retarded and borderline unusable in other domains like science, coding, math, trivia slop, etc, but most of us do not give a shit about that

except the second you want to RP anything more than just 1 on 1 bedroom sex then it has zero clue what's going on
Anonymous No.106513572 >>106514089
>>106513563
>>106513527
Also aren't those just fine-tunes of existing models? I could have sworn the Pygmalion models were just fine-tuned Mistral models. Were those pre-trained from scratch?
Anonymous No.106513573 >>106513578 >>106513584 >>106513773
>>106513545
>. I mean a truly giant amount of data, like this for example:
>https://huggingface.co/datasets/mrcuddle/NSFW-Stories-JsonL
>Size of downloaded dataset files:
>1.87 GB
lol
lmao
rofl
Anonymous No.106513578
>>106513573
It's more than enough, get to training already.
Anonymous No.106513583
>>106513552
>Ani Analingus
Anonymous No.106513584 >>106513595 >>106514154
>>106513573
1.87 gigs is nothing if you're talking about a movie or a season of a TV show. When it comes to purely text data, you will never read anywhere near that amount of data in your entire lifetime.
Anonymous No.106513591
>>106513465
scripts for tts, stories, image prompts, lyrics and songwriting, roleplaying rarely.

Fact retrieval and storytelling are intertwined. You can write about anything.
Anonymous No.106513594 >>106513621 >>106513631
>>106512307 (OP)
How's Klear?
Anonymous No.106513595 >>106513736
>>106513584
Good thing LLMs aren't humans then, also a human who did nothing but read (don't know how without ever learning to from someone else but whatever) would be pretty awful to RP with as well.
Anonymous No.106513621
>>106513594
waiting for goofs
Anonymous No.106513631
>>106513594
lcpp support only 2mw away. Only 2.5B active and the datasets by their own admission were heavily filtered stem and commoncrawl, so I wouldn't hold my breath anyway.
Anonymous No.106513670 >>106513686 >>106513705 >>106513740
>>106513219
You could buy 32 MI50 for less than $5k and you would have 1TB of VRAM. Couple it with some e-waste tier DDR3 server motherboards with a lot of PCIE connectors for very cheap inference machines.
Anonymous No.106513686
>>106513670
and you're gonna connect these to each other how exactly? else you'll be even slower than pure ram cope
Anonymous No.106513705
>>106513670
Anonymous No.106513708 >>106513756
>>106513566
>https://docs.unsloth.ai/basics/gpt-oss-how-to-run-and-fine-tune#running-gpt-oss
Get rid of the repetition penalty and use OpenAI's recommended settings.
>Temperature of 1.0
>Top_K = 0 (or experiment with 100 for possible better results)
>Top_P = 1.0
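If you're hitting it through the OpenAI-compatible API instead of ST, those settings map to roughly this (the endpoint/port and model name are whatever your llama.cpp/kobold server exposes; llama.cpp-style servers accept the extra top_k/repeat_penalty fields, other backends may ignore them):

import requests

resp = requests.post("http://127.0.0.1:8080/v1/chat/completions", json={
    "model": "gpt-oss-20b",
    "messages": [{"role": "user", "content": "Hello"}],
    "temperature": 1.0,      # OpenAI's recommended defaults for gpt-oss
    "top_p": 1.0,
    "top_k": 0,              # 0 disables top-k
    "repeat_penalty": 1.0,   # i.e. no repetition penalty
})
print(resp.json()["choices"][0]["message"]["content"])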
Anonymous No.106513736
>>106513595
RP quality is pretty subjective, unless we're basing metrics on how many "shivers up my spine" type things are in the responses or how likely it is to refuse "unsafe, problematic" requests. It makes me wonder why companies even bother including RP in their datasets in the first place when the overly filtered shit they include causes the aforementioned issues I mentioned in the first place. They may as well not even include it and then have an RLHF failsafe that triggers whenever someone tries to ask it to write a story, but they won't because that would contradict their "AGI IN TWO MOUR WEEKS" scam.
Anonymous No.106513740 >>106514099 >>106514579
>>106513670
>save $1k on hardware costs
>spend $1k monthly on electricity costs
brilliant
Anonymous No.106513756 >>106513772
>>106513708
it must be everybody's quants or something then, it's schizophrenic even with everything off and only those settings.

or just pure luck as usual
Anonymous No.106513765 >>106513777
>>106513545
RP specialization is absolutely worth doing and hasn't been properly tried yet. I mean, codemaxxing clearly works. However, pretraining with practically the entire internet is still necessary. You won't get better results by using less data.
Anonymous No.106513772 >>106513791
>>106513756
Have you updated your ST? And if so, make sure the instruction template is correct. That's all I can say about this, actually.
I tried it a while ago and had some issues, then kind of forgot about it. I don't use ST that often anymore.
Anonymous No.106513773 >>106513803
>>106513567
>>106513573
If you're dead-set on training a model on very limited amounts of data just to see what happens, then you'd have to make sure that it covers most basic knowledge, not just sex. However, it's pretty much guaranteed that if you pick such data from the web in a random fashion, you'll be left with fundamental knowledge gaps.

Most "high-quality" web data you can see in public datasets like FineWeb tends to be technical or specialized, highly redundant and definitely not conversational. Filtering it further for "quality" will make the model even more naive.
Anonymous No.106513777
>>106513765
>However, pretraining with practically the entire internet is still necessary.
Because?
Anonymous No.106513791 >>106513795 >>106513801
>>106513772
so has ST been replaced by something else? I've been suspecting for years that it does something fucky with every model run through it, given that the output in it and in kobold's basic UI is different.
i know a lot of people just run ollama now and call it a day, i might too.
Anonymous No.106513795
>>106513791
>so has ST been replaced by something else?
mikupad
Anonymous No.106513801
>>106513791
Don't touch ollama it's slow.
I'm using my own client but it's not public.
Anonymous No.106513803 >>106513823 >>106513864 >>106513883
>>106513773
What if that specialized data was rewritten as conversations with a small model from the non-slop era?
Anonymous No.106513823 >>106513858
>>106513803
There's no such thing, all models eventually lead to slop because they always favor one way of writing things over another, your end result would mimic that preference
Anonymous No.106513858 >>106513883
>>106513823
Not all models were created equal
Anonymous No.106513859
>>106511189
Human memory is the same- you gradually forget the details of things that happened a long time ago, but recall the gist (if important). Whereas transformers have total anterograde amnesia, like the dude in Memento. Though surely not a complete or ideal long-term memory, it seems better than nothing.

Transformers also tend to have pretty bad quality degradation well shy of the context limit. SSMs should help here too, albeit probably still limited by a small fraction of long-context training data.
Anonymous No.106513864 >>106513969
>>106513803
Better than just raw, "high-quality" web documents, I guess, assuming you can find such a model.

The trained model will probably still have no idea that if you touch a hot stove you can burn your fingers, that water is wet, that potato chips make crackling sounds when you eat them, etc. Which is more important for an RP-oriented LLM to know?
Anonymous No.106513883
>>106513803
>>106513858
>with a small model from the non-slop era?
For the task you're trying to do, those basically don't exist. The data the model is being trained on needs to be written by actual people if you want the outputs to be as free of slop as possible. And even if it could work, you'd likely want to verify that the outputs are actually of decent quality (if you're attempting to make a pre-training dataset, there would have to be, at the bare minimum, hundreds of thousands of conversations).
Anonymous No.106513969 >>106514004 >>106514038 >>106514114
>>106513864
>The trained model will probably still have no idea that if you touch a hot stove you can burn your fingers, that water is wet, that potato chips make crackling sounds when you eat them
Nta. What kind of data and documents would need to be present in the pre-training and fine-tuning data for it to "know" all of that? Yes, we know training on the entire internet means that in all of that data there are bound to be passages that either straight up explain that or imply it through context and semantic reasoning. However, we want to see if we can pre-train models to have logic WITHOUT the entire damn internet and thus potentially create models that are "smart" at lower parameters (having the right data doesn't just improve the output quality; it can theoretically also enable the creation of better models at lower parameter counts if trained correctly with the right data).

So what I'm asking is, does the pre-training data need a bunch of science textbooks? A bunch of stories that describe the warmth of a fireplace or how anon's cock felt warm and squishy in femanon's pussy? We've established that training on the whole internet is bloat, but filtering it down to only "high quality" data isn't good either, because then you lose out on diversity of information, which means your outputs become utter slop shit. So I think we would need to find a balance between data quality and data volume. But I'm not entirely sure what KIND of data would be needed. For the kind of model you are describing, one that actually understands that fire is hot and water is wet, I wonder if you could accomplish that with pre-training on just a bunch of human-written RP stories as well as a bunch of novels, or would you need a bunch of science textbooks or something as well?
Anonymous No.106514004 >>106514090
>>106513969
>However we want to see if we can pre-train models to have logic WITHOUT The entire damn internet
>we
>We've established that training on the whole internet is bloat
speak for yourself ragcuck
Anonymous No.106514015 >>106514049 >>106514090
>>106513181
>no one can run it
>ok maybe you can but its too slow bc not
>ok maybe its usable speed for most things, but you can't train!
>ok maybe you can train some small stuff, but you can't train an entire sota model!
the grapes are hitting sour levels that shouldn't be possible
shut the fuck up and let people enjoy their stuff
Anonymous No.106514038 >>106514056 >>106514143
>>106513969
>need a bunch of science textbooks? A bunch of stories that describe the warmth of a fireplace
Both. Look at it like raising a child very quickly. It needs STEM knowledge (school) but it also needs real world experiences/conversations so that it knows what those facts translate to in reality, so it can build a rudimentary world model. You're asking, how can I raise my child with the least effort possible. Just lock them in a room with the Science channel on, or only talk to my child about real world stuff and forbid any formal book learning. Both will end up retarded.
>However, we want to see if we can pre-train models to have logic WITHOUT the entire damn internet and thus potentially create models that are "smart" at lower parameters
You have a fundamental misunderstanding of how this stuff works. Even if you put in a massive amount of effort to filter the internet data to only unique and "high quality" data, all you've done is stop the model from knowing what is bad data and what data comes up more often. You need more data and more parameters for it to generalize. You cannot raise a child (or model) in half the time with less information and half a brain.
Anonymous No.106514049 >>106514109
>>106514015
>>ok maybe its usable speed for most things,
>>ok maybe you can train some small stuff,
No one ever said this because it's not true.
Anonymous No.106514051 >>106514086 >>106514195 >>106514355 >>106514437 >>106516117
One of vibevoice's big selling points was that you can generate extremely long audio with it. Maybe I'm just doing it wrong with the comfyui nodes, but I notice the quality of the output starting to go downhill if I generate anything longer than 30 seconds.
Anonymous No.106514052
>>106513181
Funny thing about DeepSeek is that you can't even run it in FP8 on a regular old 8x H100 node. You need two nodes or H200s at least. It's too big.
Anonymous No.106514056 >>106514094
>>106514038
>You need more data and more parameters
NTA, but if anything current models do show that even with fewer parameters more data is pretty much always better, as long as you don't over-filter, that is.
Anonymous No.106514086 >>106514136
>>106514051
https://files.catbox.moe/i7sc6u.wav
Anonymous No.106514089
>>106513572
They're shit and being based on pretrained models won't have made them worse. Pretraining only on their data would've been even more awful.
Anonymous No.106514090 >>106514118
>>106514004
No one mentioned rag in this specific conversation
>>106514015
Here's that emotional volatility again.
Anonymous No.106514094 >>106514111
>>106514056
I said myself that more data is better. But that doesn't mean you can do with fewer parameters for the same result. Regardless of what benchmarks say, models with fewer parameters make more mistakes and have poorer logical capabilities.
Anonymous No.106514099
>>106513740
Just buy some solar panels bro
Anonymous No.106514109
>>106514049
>No one ever said this because it's not true.
>anything less than 10million tk/s is unusable
>anything smaller than a 1T model is too small to bother training
>tfw I don't have 3TB of cerberas silicon
>tfw I don't have a billion dollar supercomputer cluster
damn, 90% of the capability at 10% of the cost sure is a bad deal. ngmi bros just shut down the general
Anonymous No.106514111
>>106514094
Right, just agreeing on the idea that smaller with more data would end up better than the supposed super RP focused model he's trying to get someone else to make for him.
Anonymous No.106514114
>>106513969
The data used in this website can be a starting point, if you could process every basic concept into complete conversations using a smarter LLM (using it raw won't work well unless you're trying to turn the model into some sort of knowledge graph): https://conceptnet.io/

It doesn't include everything imaginable, though (especially about sex) and you'd still have biases and slop from the model used for crafting the conversations. Reducing ERP descriptions or erotic stories into concepts that you can separately expand or build upon later on could be another possible useful thing to do.
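Pulling the raw assertions out of it is simple enough if anyone wants to play with the idea. Rough sketch below using ConceptNet's public REST API; the edge fields shown are the ones I'd expect, so double-check against an actual response.

import requests

def assertions(term: str, limit: int = 20):
    # Each edge may carry a human-readable surfaceText like
    # "[[a stove]] is used for [[cooking]]" - that's the bit you'd hand to an
    # LLM to expand into a conversation.
    data = requests.get(f"http://api.conceptnet.io/c/en/{term}",
                        params={"limit": limit}, timeout=60).json()
    for edge in data.get("edges", []):
        text = edge.get("surfaceText")
        if text:
            yield text.replace("[[", "").replace("]]", "")

for fact in assertions("stove"):
    print(fact)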
Anonymous No.106514118
>>106514090
>No one mentioned rag in this specific conversation
And yet it would be required to have the focused model know literally anything at all.
Anonymous No.106514136
>>106514086
Oh, my bad. I'll stop generating pornography with vibevoice. I didn't realize.
Anonymous No.106514143 >>106514163 >>106514171 >>106514177 >>106514285 >>106514702
>>106514038

>Even if you put in a massive amount of effort to filter the internet data to only unique and "high quality" data, all you've done is stop the model from knowing what is bad data and what data comes up more often.
I am not suggesting that we filter out "bad quality" data as defined by corporations. That is not at all what I'm saying. I'm merely saying that training on the entire internet is unnecessary. You suggested that you should train both on science textbooks and actual stories that discuss the type of shit you would want the model to be good at. Your "Don't lock it in a room or it'll be retarded" analogy seems pretty spot-on. We need good and "bad" data because more diversity means better outputs. But the point I'm trying to make is that I am not convinced we need to pre-train on "ALL INFORMATION THAT HAS EVER EXISTED EVER". It just needs to be enough data so that the model learns logic and common sense. Be it the science textbooks or whatever so that it actually understands how the world works to a certain extent, then feed it the human-written stories so that it knows how to write stories (but don't ONLY feed it purple prose garbage. Companies doing that is precisely why we consistently get outputs like "shivers down my spine")


I think you have the impression that I think we should feed these models only super duper ultra mega "high quality data ®™". That's not what I'm saying. I'm merely saying that training on the whole internet doesn't seem to be necessary.
Anonymous No.106514154 >>106514176
>>106513584
>When it comes to purely text data, you will never read anywhere near that amount of data in your entire lifetime.
This is wrong. I have 2+GB worth of IRC logs from channels that I've basically always backread completely. You gravely underestimate how much a human being reads in their lifetime. Also, pretraining datasets these days only become interesting if they're at the very least 1T tokens, that would be, roughly estimated, 4TB of text.
Anonymous No.106514163 >>106514258
>>106514143
>It just needs to be enough data so that the model learns logic and common sense
once again not how it works, it doesn't learn like that.
Anonymous No.106514171 >>106514258
>>106514143
Again, regardless of how you define high quality, you can't just dump a couple textbooks in the dataset and expect it to memorize and intuitively understand everything within it.
Anonymous No.106514176 >>106514217
>>106514154
>. I have 2+GB worth of IRC logs
And that's purely text? Bs
Anonymous No.106514177 >>106514258
>>106514143
>I'm merely saying that training on the whole internet doesn't seem to be necessary.
Which perfectly aligns with the corpo interests of making worse models for us by filtering, just in a slightly different way.
Anonymous No.106514195
>>106514051
I generated some ~30min things using an adapted version of the CLI script and it sounds fine to me. I turned up steps to 30 though and gave it a max_length_time of 90.
Anonymous No.106514217 >>106514440
>>106514176
Do you know what IRC is? Yes, it's purely text.
Anonymous No.106514224
>>106512307 (OP)
I've been thinking, for "ollama Turbo", are they even using their own software in the backend?
If I was a lazy Silicon Valley grifter the way I would do it would be to just forward the requests to something like deepinfra where $20/month buys millions of input/output tokens.
Anonymous No.106514258 >>106514276 >>106514299 >>106514523
>>106514163
>>106514171
You keep telling us that we need to train on massive amounts of data so that it learns how humans actually talk. We've established that. We agree on that. Where we disagree is whether or not we need terabytes upon terabytes of textual data for the pre-training stage. I understand what you're saying. We just disagree on whether or not the terabytes are necessary. Disengage your tunnel vision for a sec, actually read what I'm trying to say, and explain WHY my reasoning isn't sound instead of just saying what amounts to "nuh uhh it's wrong because it just is okayyy?"
>you can't just dump a couple textbooks in the dataset and expect it to memorize and intuitively understand everything within it.
The same thing could be said about pre-training on terabytes of data. The models don't actually "know" shit. They replicate semantic meaning. You can jailbreak certain models to confidently tell you that 1 + 1 = 5, yet those things were trained on the terabytes of data. Have we suddenly forgotten that models are frequently "confidently wrong"? Training it on more data will not automatically make it a genius. Throwing only a single book's worth of text into pre-training is obviously a stupid idea that won't get you anywhere, but no one has been able to definitively prove you absolutely HAVE to pre-train on the entire internet. More data is better, yes, we agree on that. The entire internet? I don't know about that.

>>106514177
They filter bad words and "icky" stuff that isn't advertiser friendly. You can filter out irrelevant information while still incorporating shit you care about. You seem to be under the impression ANY kind of filtering or data QC is inherently bad. Garbage in, garbage out, remember?
Anonymous No.106514270 >>106514290 >>106514305 >>106514328
https://wccftech.com/nvidia-geforce-rtx-5090-128-gb-memory-gpu-for-ai-price-13200-usd/
>NVIDIA GeForce RTX 5090 128 GB GPU Spotted: Custom Memory, Designed For AI Workloads & Priced At $13,200 Per Piece
damn
Anonymous No.106514276 >>106514291
>>106514258
>You keep telling us that we need to train on massive amounts of data so that it learns how humans actually talk. We've established that. We agree on that.
You need to understand that 2 GB (like the example you provided earlier) is not even remotely "massive amounts of data"
Anonymous No.106514285 >>106514428
>>106514143
You seem to be fundamentally misunderstanding how parameters work, it's not quite like a zip file, you don't really waste parameters by having more data seen during training, you just reinforce some concepts more than others.
Anonymous No.106514290
>>106514270
damn near creamed myself mid-sentence until I saw the price tag
Anonymous No.106514291
>>106514276
In text form that's an absurd amount of data. We're not talking about other file formats that can balloon the size like images, videos, or irrelevant site metadata. It is purely text and nothing else. Stories and nothing else. The only extra data it has is the JSONL formatting, where an entire story is shoved into a "stories" key followed by the brackets.
Anonymous No.106514299 >>106514449 >>106514475
>>106514258
>You seem to be under the impression ANY kind of filtering or data QC is inherently bad
Yes, that is my point. https://arxiv.org/pdf/2505.04741
Anonymous No.106514305
>>106514270
>$13,200 Per Piece
WHOOPS it just went up to $15,000 due to the GOYIM tax.
Anonymous No.106514325 >>106514462 >>106514642
>>106512310
Miku watch out!!!
Anonymous No.106514328
>>106514270
>still can't run even iq1s of r1
Anonymous No.106514355
>>106514051
Bro, you can do long generation with any TTS. You just need to segment your sentences properly when sending them to the TTS engine
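i.e. put something like this in front of whatever engine you're using (synthesize() is a stand-in for your TTS's actual call):

import re

def chunk_sentences(text: str, max_chars: int = 400):
    # Split on sentence boundaries, then pack sentences into chunks small enough
    # that the TTS doesn't degrade, instead of feeding it one huge block.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunk = ""
    for s in sentences:
        if chunk and len(chunk) + len(s) + 1 > max_chars:
            yield chunk
            chunk = s
        else:
            chunk = f"{chunk} {s}".strip()
    if chunk:
        yield chunk

# audio = [synthesize(c, voice="ref.wav") for c in chunk_sentences(long_script)]
# then concatenate the wav segments (optionally with short silences) into one file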
Anonymous No.106514377 >>106514388 >>106514457
In the FinePDF card they have a graph with general benchmark scores marked every billion tokens. Interestingly, at 1B tokens just training on the PDFs gave similar scores to just the web data (FineWeb) trained for twice the amount of tokens. The gap narrows immediately after that, then widens again to about a factor of 2 later on.

https://huggingface.co/datasets/HuggingFaceFW/finepdfs

Even with a small amount of training tokens for a model pretrained from scratch, the data makes a ton of difference. It wouldn't be surprising if, with very specialized data, you'd get a better model with considerably fewer tokens than normal, at least in your field of interest.
Anonymous No.106514388
>>106514377
>in your field of interest.
One problem being that RP isn't just one narrow field, every other anon expects something different from their RP, some want modern stuff, some fantasy, some anime/light novel like...
Anonymous No.106514400 >>106514415 >>106514523
Very interesting and brave take.
Anonymous No.106514415 >>106514427
>>106514400
my take is guys like him should choke on their onions and die
Anonymous No.106514427
>>106514415
holy slope
Anonymous No.106514428
>>106514285
Are you under the impression that I think the models are actually the entire internet compressed into a file?
Anonymous No.106514437
>>106514051
ComfyUI nodes in particular are not properly implemented.
Anonymous No.106514440 >>106514554
>>106514217
Assuming it is pure text and nothing else, how many years worth of chat logs? Were these particularly active servers? (Maybe you should train a model off of those and see what happens).
Anonymous No.106514449 >>106514467
>>106514299
So all the unfiltered shit is good too? There's a difference between data that's just lower quality than the good data and data that is not even worth using
Anonymous No.106514457 >>106514498 >>106514647
>>106514377
And in their FineVision release tweet they state outright that removing data lowered performance. https://xcancel.com/andimarafioti/status/1963610135328104945
Anonymous No.106514462 >>106514474
>>106514325
spooky/10
Did Wan also do the static effects or did you edit that in?
Anonymous No.106514465 >>106514479 >>106514488
just a heads up. /ldg/ schizos claiming comfyui collects user data are actually correct. the new login system pings Google services even if you don't use it. testing is underway. use a different UI if possible
Anonymous No.106514467
>>106514449
>So all the unfiltered shit is good too?
Yes.
>data that is not even worth using
that line of thought is why we are where we are right now.
Anonymous No.106514474
>>106514462
>the clip ends with white static noise and glitch
Anonymous No.106514475 >>106514486
>>106514299
But how do you explain the models, even models that have been specifically fine-tuned for RP, having the "shivering down my spine" nonsense? If any form of filtering or QC is inherently bad (I don't know how you can say this out loud and not realize how nonsensical it is) then how do you propose we get rid of gpt-ism slop responses in models? No, "You're just prompting it wrong You're just system prompting wrong" is not the right answer.
Anonymous No.106514479 >>106514493
>>106514465
I haven't seen any connections going anywhere...
Looking at your typing, you are one of the real schizos trying to stir shit up again.
Anonymous No.106514486 >>106514504
>>106514475
By having more of the data you want to see, to drown out the slop, but still have the model know what slop is.
Anonymous No.106514488 >>106514493
>>106514465
link a file that does that?
Anonymous No.106514493 >>106514505 >>106514517
>>106514479
fuck off Chinese shill

>>106514488
>>106513947
Anonymous No.106514498
>>106514457
Likely because the highest quality data is less varied where it matters. It's as if you wanted the model to learn conversations just from FineWeb-Edu documents above 0.99 language score.
Anonymous No.106514504 >>106514556
>>106514486
Or you could just remove the slop entirely, but I guess you will misunderstand what I said as "just filter out everything". You have to find a balance between the amount of data you're using versus using way too fucking much. You're saying that if you have a massive amount of data then the sheer quantity of "good" data will outweigh the "bad" slop. Why not just carefully omit the bad data and only include shit you absolutely need?
Anonymous No.106514505 >>106514512
>>106514493
What do you mean?
Anonymous No.106514512 >>106514525 >>106514531
>>106514505
comfyui is getting exposed as a Chinese scam to launder money into shanghai
Anonymous No.106514517 >>106514524
>>106514493
>Making a stink about this on their github would probably turn their community against us,
How did he come to that conclusion?
Anonymous No.106514519 >>106514562
I'm getting better results with 4k context than 32k. Do home gamer LLMs just not do well with large context?
Anonymous No.106514523
>>106514400
That guy has no idea what he is talking about.

>>106514258
This guy has no idea what he is talking about.
Stop posting.
Anonymous No.106514524
>>106514517
99% of users are cock garbling redditors that can see no wrong
Anonymous No.106514525
>>106514512
?
Anonymous No.106514531 >>106514560
>>106514512
Will the money going into shanghai finance building the cheap high vram gpus?
Anonymous No.106514549 >>106514700
Ah, there they all are.
Anonymous No.106514554
>>106514440
About thirty years of logs from an active niche community. I don't have enough compute to train a big enough model to make it worthwhile. They're not English.
Anonymous No.106514556 >>106514592 >>106514607
>>106514504
>Why not just carefully omit the bad data and only include shit you absolutely need?
Because it's impossible to agree on what IS bad data. One of the reasons our current models are so slopped is because they only kept the kind of stuff they considered good, which aligns with purple prose bs.
Anonymous No.106514560
>>106514531
no it goes into buying labubus
Anonymous No.106514562 >>106514571 >>106514580 >>106514635 >>106514653 >>106514661 >>106514698 >>106514736
>>106514519
https://github.com/adobe-research/NoLiMa
Big context is an illusion
Anonymous No.106514571
>>106514562
Not an illusion, an outright marketing lie by model makers. We should call them out on their blatant lies when we can.
Anonymous No.106514579 >>106514614
>>106513740
>$1k monthly on electricity costs
Do you live in Germany or something.
Anonymous No.106514580 >>106514634
>>106514562
Update to the leaderboard when? This is why research sucks. Limited budget and care. Meanwhile you've got people like the UGI guy that's really dedicated but the benchmark frankly could use a bit more statistical rigor.
Anonymous No.106514592 >>106514607 >>106514608
>>106514556
So if we wanted to attempt to pre-train our own /lmg/-approved base model, how would we even define what is considered "good" and "bad" data? (Again, that is not the entire internet).
Anonymous No.106514607 >>106514622 >>106515055
>>106514556
>>106514592
Also, I probably should have clarified this earlier: I'm referring to pre-training an RP-focused model, not a general purpose model. If you're trying to do pre-training of a general purpose "genius" model like Claude or DeepSeek, then yeah, you probably DO need several hundred gigabytes if not a terabyte or two of data. Perhaps even hundreds. But if it's hyper-focused in terms of functionality, you absolutely do not need THE WHOLE INTERNET
Anonymous No.106514608 >>106514671
>>106514592
One of the things I'm trying to say is exactly that that's an impossible task, anons would never manage to agree on what would go in, in what quantity and tons of other disagreement points.
Anonymous No.106514614 >>106514787
>>106514579
4 GPUs that cost me $100 per month to keep running. Unless your electricity is free, 32 fucking GPUs is going to cost nearly $1k.
Anonymous No.106514622 >>106514678 >>106514715
>>106514607
>an RP-focused model,
And again, again again, RP isn't narrow enough a use case that you can do what you think, it's nowhere near as narrow as code or math.
Anonymous No.106514634 >>106514704
>>106514580
Time spent updating old projects would be better spent working on the next paper.
Anonymous No.106514635
>>106514562
So this proves that reasoning is a patch for attention
Anonymous No.106514642
>>106514325
I am unsure of this Aimaina Miku's validity.
Anonymous No.106514647
>>106514457
I wonder if we can finally have proper nsfw captioning.
Anonymous No.106514653 >>106514672
>>106514562
Nta. How accurate that graph of his is depends on the amount of context the inference engine he used actually allowed. Ollama, for example, lets you use models that are advertised as having a 128K context window, but by default it sets the KV cache to only allow 4096 so that it doesn't cause consumer GPU rigs to explode via OOM. vllm pretty much requires that you set effective context window lengths manually, or else if you try to use a model that has a giant context window but you don't have enough VRAM, vllm doesn't know that, so if you have a shit box it will crash.
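For the Ollama case specifically, you have to ask for the context yourself or you silently get the 4096 default, e.g. via the API options (model tag is a placeholder):

import requests

resp = requests.post("http://localhost:11434/api/generate", json={
    "model": "llama3.1:8b",            # placeholder model tag
    "prompt": "Summarize this file: ...",
    "options": {"num_ctx": 32768},     # overrides the default 4096-token KV cache
    "stream": False,
})
print(resp.json()["response"])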
Anonymous No.106514661 >>106514673
>>106514562
>NoLiMa
> Long-Context Evaluation Beyond Literal Matching
I don't get this acronym. Shouldn't it be LCEBLM?
Anonymous No.106514671 >>106514684
>>106514608
I don't think that means we should just accept that training on the entire internet is an efficient way to make general purpose models (read: general purpose). I guess we can agree that aggressive filtering done by corporations makes the models shittier (by our standards)
Anonymous No.106514672 >>106514697
>>106514653
That's crazy dude! I think you should send a PR to the Nolima guys, maybe they don't know!
Anonymous No.106514673
>>106514661
NoLiteralMatching?
Anonymous No.106514678 >>106514694 >>106514888
>>106514622
>Coding is narrow
>Most decent coding models need to be in the double-digit parameter range at minimum in order to actually be usable

Wat?
Anonymous No.106514684 >>106514710 >>106514713
>>106514671
Why do you even have such a hard-on of hatred for the entire internet as a training concept? Do you have some stuff on there you're scared of the models learning or some shit?
Anonymous No.106514693 >>106514705
9pm mc donalds feast
u guys want some
Anonymous No.106514694
>>106514678
I'm saying RP is less narrow than coding, not that coding is super narrow in itself. Just the fact RP is even less so.
Anonymous No.106514697
>>106514672
You'd be surprised
Anonymous No.106514698 >>106515740
>>106514562
you guys really trust research coming from Adobe of all places?
Anonymous No.106514700 >>106514723
>>106514549
Hot glue
Anonymous No.106514702 >>106514722 >>106514768
>>106514143
bad quality data in this context is not ah ah mistress stuff but typos, all-caps, 403 forbidden pages, "you can put glue on pizza", and other noise.
Anonymous No.106514704
>>106514634
Yeah, opening a script takes so much time. It's more about money, and caring to do a few clicks.
Anonymous No.106514705
>>106514693
What's a "mc donalds"? That must have been bad data so I don't know.
Anonymous No.106514710 >>106514725
>>106514684
If it is possible to cut down on the amount of training resources required to pre-train these models, then that's a worthwhile thing to pursue.
Anonymous No.106514713 >>106514762
>>106514684
I get the impression that he can only run small models, so he's grasping at hope that by filtering the dataset he can have his perfect model in some 8B that is cheap and quick to train so someone will do it for him
Anonymous No.106514715
>>106514622
We'll narrow it down to her level. We can start small.
Anonymous No.106514722
>>106514702
Hence why I keep saying that pruning out SOME data, instead of just saying "fuck it we ball, train on everything that has ever existed", is at least worth considering. You and/or the other guy keep saying that ANY form of data QC is a sin punishable by the guillotine.
Anonymous No.106514723
>>106514700
Do you have any idea how hard it is to clean skeet off of fabric that cannot be machine washed?
Anonymous No.106514725 >>106514740
>>106514710
That would mainly benefit big corps, though, since they'd spend less on even more filtered models and use this idea as justification.
Anonymous No.106514736
>>106514562
If models do better at low context, how do dev tools work? I regularly feed files to my chat that are like 20kb+ alone. I assume they must be doing some chunking and summarization, but wouldn't that leave them missing details of my code? Or do they just accept that high context is needed and the results may be shit?
Anonymous No.106514740 >>106514750
>>106514725
And it could benefit US, because we wouldn't HAVE to use THEIR shit if pre-training on a considerably smaller amount of data can still make a coherent model. I don't give a shit whether or not corporations benefit.
Anonymous No.106514750 >>106514760 >>106514765
>>106514740
>And it could benefit US, because we wouldn't HAVE to use THEIR shit if pre-training
Where are any models pre-trained by a non corpo since the llama2 era?
Anonymous No.106514760
>>106514750
Multiple people here have actually bothered to try. It not being popular on HF doesn't mean it doesn't exist or isn't worth doing.
Anonymous No.106514762 >>106514778
>>106514713
I'm sure the perfect RP-focused model is only 4B away, we just need to trust and for other anons to pay for training it.
Anonymous No.106514765
>>106514750
https://github.com/jzhang38/TinyLlama
Anonymous No.106514768 >>106514777 >>106514865
>>106514702
Actually, it's good to have some data with typos in the training dataset, as it gives the model some context to deal with typos in prompts.
You just want to make sure there aren't enough typos in the dataset that the model itself starts making typos.
Anonymous No.106514777
>>106514768
No. That's bloat.
Anonymous No.106514778
>>106514762
Seriously, I don't get him. This is the same flavor of cope as bitnet, except quantization was replaced with filtering.
Anonymous No.106514787
>>106514614
I'm cpumaxxing (granted, in a super cheap electricity locale) and I'm hitting (5 person household) $250 dollarydoos/mo mid-summer with A/C cranked.
Would a busload of power-limited mi50 in a trash-tier CPU mining-rig really be that much worse?
Anonymous No.106514823 >>106514901 >>106515220 >>106515766
>decide to check on ipex-llm to see if they finally updated to support latest models like gpt-oss
>still no releases since april
Buy Intel they said. It would be great they said. It's so much cheaper they said.
Anonymous No.106514854
>>106512347
Yeah things are slowing down
Models need 5X parameters for 15% performance boost (according to their own benchmarks)
Anonymous No.106514865
>>106514768
Couldn't that be mitigated by simply ensuring that during the SFT instruct tuning phase none of the "assistant" responses have any typos?
Anonymous No.106514888 >>106514917 >>106514968 >>106515177 >>106515197
>>106514678
In RP, arbitrary amounts of code can come up. It's a superset. Basically anything in the world can come up in RP or writing stories in general. Some people like writing hard scifi. Others want to RP with math kittens. Others want to discuss rare stamps with their stamp collector gf. Others want to play 3rd edition MtG. If there is any topic your model cannot handle, it's not suitable for RP.
Anonymous No.106514901 >>106514909
>>106514823
does it work with vulkan at least?
Anonymous No.106514909
>>106514901
yes, but i found llama.cpp with vulkan to have awful performance
Anonymous No.106514917 >>106514927 >>106514932
>>106514888
I think you were the guy that suggested that you need both RP AND common sense data like shit from science textbooks in order for it to learn proper common sense and logic. I think the disagreement comes from how MUCH data is needed.
Anonymous No.106514927 >>106514968
>>106514917
Nope. I think you need the whole fucking web, plus books, RP, everything, as many trillions of tokens as you can get.
Anonymous No.106514932 >>106514961 >>106514968
>>106514917
How much data do you think is needed to cover every possible RP topic? How about maybe the entire internet, that sounds about enough.
Anonymous No.106514961 >>106514979 >>106515001
>>106514932
Yeah. The idea that this data is "bloat" is just a massive misconception. It all goes into building a better world model.
Thinking that a 4B model "without the bloat" could possibly be enough for good RP is just a massive cope. Less data makes models worse in the general case. If you keep training a 4B model on more and more diverse data, it would get better and better. That's just the basic scaling laws from the GPT-3/Chinchilla era before people started filtering everything to shit. But of course it's still only 4B, so it'll be garbage anyways.
Anonymous No.106514968 >>106514982 >>106515029
>>106514927
Ehhh... We can agree to disagree on that. I don't think merely two gigabytes of text is enough if you want the thing to both know how to RP and have common sense and good temporal coherence like this guy alludes to >>106514888, but the entire internet being a hard requirement doesn't sound like a good use of resources. Haven't people already demonstrated that you can create these models on way less data? (Not only two gigs obviously but way less than the entire internet)

>>106514932
Probably more than 2 GB, but again, not the entire goddamn internet. Ensuring that your data has a diverse set of topics and story types would help a lot, along with having the common sense / science portion as well. I understand why you'd think a mere 2 GB would not make it GOOD at RP and that it would probably suck at any form of coherent logic, but you're also failing to explain why the entire internet is a necessity. There should be an in-between point.
Anonymous No.106514970 >>106515019 >>106515049
Almost all improvement in LLM sphere came from more params, bigger datasets and longer training, and as soon as corpos started curating their inputs we entered the benchslop era.
I am curious what an erp-benchmaxxed model would look like, but I think https://huggingface.co/TheDrummer/Gemmasutra-Mini-2B-v1 comes pretty close.
Anonymous No.106514979 >>106515036
>>106514961
>Less data makes models worse in the general case.
Guys think of the TRIVIA. What will we do without our precious trivia?

Jokes aside, yes you need a lot of data. Not the entire internet.
Anonymous No.106514982 >>106515015 >>106515021
>>106514968
>Haven't people already demonstrated that you can create these models on way less data?
Not if you want the model to actually be good at anything, I assume you don't use Phi as your daily model? Yet it's so lean and optimized.
Anonymous No.106515000
>106514979
>Not the entire internet
>106514968
>not the entire goddamn internet.
I honestly think there's some kind of psyop being ran on the thread.
Anonymous No.106515001 >>106515057
>>106514961
>If you keep training a 4B model on more and more diverse data
So the size of the data set directly correlates to how diverse it is? Isn't it possible to have a data set that's only like 100 gigs in size that potentially has more variety than a data set twice its size? I don't think "bigger number = better" is the right line of thinking.
Anonymous No.106515015 >>106515026 >>106515061
>>106514982
That's a general purpose model though, not something hyper specific or specialized.
Anonymous No.106515019
>>106514970
Funny how the general consensus of this general was that doing that made the models worse. Why did the sentiment suddenly flip?
Anonymous No.106515021
>>106514982
Maybe Phi wouldn't be so bad if it wasn't so safe.
Seeing it burn my precious tokens thinking if my prompt aligns with their policy or we should refuse gave me psychological trauma and Microsoft must compensate me financially.
Anonymous No.106515026
>>106515015
I see, my apologies you're absolutely right! I will forward you all the money you need and the engineers to train your model first thing on Monday.
Anonymous No.106515029 >>106515037 >>106515063
>>106514968
>We can agree to disagree on that.
I'm both of those guys you quoted in the first section. And we can't because you are simply wrong.
>Haven't people already demonstrated that you can create these models on way less data? (Not only two gigs obviously but way less than the entire internet)
I think that some filtering is warranted. You don't want spam generated with markov chains. You don't want languages other than English (unless you do). You don't want AI slop (so only use old data). "Limited data models" like the Phi series are just garbage for RP, because they don't develop a good general world model.
Anonymous No.106515036 >>106515051
>>106514979
If the model doesn't recognize obscure characters I like and their settings, it's shit, sorry.
Anonymous No.106515037 >>106515089
>>106515029
>You don't want languages other than English (unless you do).
Fuck off I need my JP weebslop in there. Tired of models failing MSGKbench.
Anonymous No.106515040
Hey guys I have an idea, tell me if it is fucking retarded or if it might have some merit.

So I have a literotica account that I've used to rate thousands of stories.

I'm thinking of downloading all the rated stories and their ratings and making a dataset out of it.

And then I train an adversarial network to read text and predict, using that dataset, what my rating of the text will be.

Then once it's trained to rank stories I like, I put it in a Reinforcement Learning setup where an LLM generates text and the adversarial network predicts the rating of that text, with the goal of getting the highest rating possible. Then at every X-round milestone I go and check the output and give it my actual rating, and the adversarial network gets punished if its predicted rating deviated too much from my actual one.
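The rating-predictor half is basically just a regression head on a small encoder, something like this (rough sketch; the model choice and the ratings.jsonl format are placeholders, not necessarily what I'd actually use):
import json
import torch
from torch.utils.data import Dataset, DataLoader
from transformers import AutoTokenizer, AutoModelForSequenceClassification

MODEL = "distilroberta-base"  # placeholder; any small encoder should do

class RatedStories(Dataset):
    # expects one {"text": ..., "rating": ...} JSON object per line
    def __init__(self, path, tok, max_len=512):
        self.rows = [json.loads(line) for line in open(path, encoding="utf-8")]
        self.tok, self.max_len = tok, max_len
    def __len__(self):
        return len(self.rows)
    def __getitem__(self, i):
        r = self.rows[i]
        enc = self.tok(r["text"], truncation=True, max_length=self.max_len,
                       padding="max_length", return_tensors="pt")
        return {k: v.squeeze(0) for k, v in enc.items()}, torch.tensor(float(r["rating"]))

tok = AutoTokenizer.from_pretrained(MODEL)
model = AutoModelForSequenceClassification.from_pretrained(MODEL, num_labels=1)  # single-output regression head
opt = torch.optim.AdamW(model.parameters(), lr=2e-5)
loader = DataLoader(RatedStories("ratings.jsonl", tok), batch_size=8, shuffle=True)

model.train()
for epoch in range(3):
    for enc, rating in loader:
        pred = model(**enc).logits.squeeze(-1)             # predicted rating
        loss = torch.nn.functional.mse_loss(pred, rating)  # fit to my actual ratings
        loss.backward(); opt.step(); opt.zero_grad()

# the trained predictor then scores LLM generations as the reward signal in the RL
# loop (e.g. trl's PPOTrainer), and my periodic manual ratings recalibrate it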
Anonymous No.106515049 >>106515159
>>106514970
Pre-slop era Llama 1 was only trained on 1T/1.4T tokens, Llama 2 on 2T tokens: roughly a tenth of the GPUs and a tenth of the data of later models.
Anonymous No.106515051 >>106515089
>>106515036
Just RAG your character? It's that shrimple isn't it.
Anonymous No.106515055
>>106514607
you want the model to see a high diversity of data or it will get bored and just start memorizing specific slop phrases. I have personally trained my own 1.5b model on over 5b (unique) tokens of smut to come to this determination. you absolutely will never find a high enough diversity in such a narrow domain. you need an incredibly broad dataset that constantly challenges the model rather than simply reinforcing it.
Anonymous No.106515057
>>106515001
>So the size of the data set directly correlates to how diverse it is?
Yes.
>Isn't it possible to have a data set that's only like 100 gigs in size that potentially has more variety than a data set twice it's size?
>twice it's size
Yes, that's possible. Ten times the size? You'd have to fuck up hard. 100GB of text is only ~25B tokens (at roughly 4 bytes per token), it's basically nothing.
Anonymous No.106515061 >>106515067 >>106515079 >>106515105
>>106515015
The point is that there is no more general task than RP and writing stories. Your model has to understand everything, because everything can come up in RP/stories. It's not a small domain.
Anonymous No.106515063 >>106515068
>>106515029
Sir do you know what "agree to disagree" means? I've acknowledged that neither of us are going to see each other's way. Is the fact that I don't agree with your sentiment such an offensive sin?
Anonymous No.106515067
>>106515061
Just focus? Skill issue LMAO.
Anonymous No.106515068 >>106515113
>>106515063
It means you are wrong.
Anonymous No.106515071 >>106515170 >>106515193
https://vocaroo.com/1mpd6FwZaOM8

Is vibevoice peak? This sounds fucking great.
Anonymous No.106515079 >>106515089
>>106515061
it won't, it will just latch on to the tropes that it can get its easy wins from
Anonymous No.106515089 >>106515095
>>106515051
RAG is garbage and doesn't work.

>>106515037
>unless you do

>>106515079
If your model is 4B, it definitely will. There's a reason we need lots of data and also huge models.
Anonymous No.106515095 >>106515112 >>106515117
>>106515089
Not the entire internet though.
Anonymous No.106515105 >>106515119 >>106515177
>>106515061
>The point is that there is no more general task than RP and writing stories. Your model has to understand everything, because everything can come up in RP/stories. It's not a small domain.
By a giga autist's standards, then, I guess I see how that makes sense. You have incredibly high standards for the RP. But the thing is most people do not write anywhere near that level of quality while also having the kind of uncensored scenarios corporations are afraid of. Shit scraped from AO3 or Wattpad will have a diverse set of scenarios, but the writers probably aren't taking the ambient temperature of the room into account in order to determine the exact amount of time it took for someone's nipples to get hard, or taking someone's inferred medical history into account when determining exactly how long it would take for anon to bust and under what circumstances. Most people do not think about that shit at all. You could solve this by training on stories that are "higher quality" (fiction or nonfiction novels that go through a publishing agency and thus actual QC), but then if it's only trained on that you get a model that will be perceived as having too much flowery language or purple prose and that won't generate or go along with the fucked-up scenarios anons here would love for it to do. Claiming that it needs a perfect understanding of how everything ever works in order to be good at RP (by your standards) is a giant stretch.


>"YOU'RE WRONG"

ok. Now what?
Anonymous No.106515112 >>106515123 >>106515153
>>106515095
Ideally the entire internet (without garbage spam), but I know that nobody will train on that. Fuck. There is so much good info in old web crawl from before the web turned into garbage that will never get used, it's so sad.
Anonymous No.106515113 >>106515177
>>106515068
You must have been fun at parties and had many friends.
Anonymous No.106515117 >>106515130
>>106515095
maybe not the entire internet but it needs to be of the same scale and diversity. the internet is just the most obvious and readily available source.
Anonymous No.106515119 >>106515139 >>106515181
>>106515105
>You have incredibly high standards for the RP
Isn't the whole point of your idea to make a better RP model than what we have now???
Anonymous No.106515123 >>106515177
>>106515112
>(without garbage spam),
But anon QC of any kind is bad remember?
Anonymous No.106515130
>>106515117
>maybe not the entire internet but it needs to be of the same scale and diversity.
Scale? Debatable. Diversity? Absolutely.
Anonymous No.106515139
>>106515119
No, we just need to lower the bar as much as possible for Focused RP on 4B param or less.
Anonymous No.106515153 >>106515165 >>106515186
>>106515112
I'm using fineweb's 2013 subset on my next model to see what happens. I do wish we had even earlier internet crawls available.
Anonymous No.106515159
>>106515049
That's just a Llama problem
Anonymous No.106515165 >>106515183 >>106515206
>>106515153
>fineweb
>filter filter filter
Anonymous No.106515169
How do you format OOC comments? Do you do a newline after the dialogue and then "OOC:" or do you put it in parentheses or brackets? Do you use a colon?
Anonymous No.106515170 >>106515193
>>106515071
>Is vibevoice peak?
it is, that's why Microsoft didn't want the goyims to get that kino (but they somehow released it without lobotomizing it lol)
Anonymous No.106515177 >>106515197
>>106515113
I'm just trying to help you not waste time and compute on something that will turn out bad. It's sad to see energy get wasted on doomed projects.

>>106515105
I'm not talking about retarded shit like that:
>taking The ambient temperature of the room they're in into account in order to determine the exact amount of time it took for someone's nipples to get hard
I'm talking about things like this:
>>106514888
They are just topics. If you train on novels, your characters probably will have no idea about even the most famous MtG cards and rules because mentioning them in a novel is a copyright violation. Fanfics will help of course, but they won't help when you want your math kitten to write you a proof, or when you want to discuss the code you worked on at work with your wife.

>>106515123
No. Surprisingly filtering out "the a congo sex the the a congo congo nigeria vagina pussy pussy the the" documents is not bad for your model.
Anonymous No.106515181 >>106515203 >>106515209 >>106515226 >>106515340 >>106515545
>>106515119
The initial test (I've already shown this and confirmed it to be the case) was to see if "uncucking" models is actually possible with further training. We've confirmed that it absolutely is. The main reason I even bothered trying is because many people here were adamant that once you safety-tune a model enough, no amount of fine-tuning can possibly erode away the guard rails.

What I'm arguing NOW is that training on the entirety of the internet is extremely inefficient. If it is possible to fine-tune a decent model with significantly less data than the entire internet, then that theoretically could mean you could have better models at lower parameter counts... Keyword: theoretical. I'm not claiming that's actually the case currently.

>Isn't the whole point of your idea to make a better RP model than what we have now???

That's not necessarily what I've been arguing for the past hour or so. I'm talking about training scale, not whether or not we can make the models better. If you're referring to making the model less prone to refuse certain things and less likely to produce flowery, advertiser-friendly trash, then doing that via training is trivial. Pic rel is from a fine-tuned llama model. The fine-tuned model produced this while the safety-cucked version it's based on either refused entirely or was extremely dodgy.
Anonymous No.106515183
>>106515165
>Applied URL filtering using a blocklist to remove adult content
>Applied a fastText language classifier to keep only English text with a score ≥ 0.65
yeees.
Anonymous No.106515186
>>106515153
CommonCrawl should have data from 2007. You just have to do language/spam filtering yourself.
Anonymous No.106515193 >>106515199 >>106515246
>>106515071
>>106515170
does it take voice files or is it strictly those demo voices? i don't see any HF spaces.

man last TTS i used was zonos, back in january i think.
Anonymous No.106515197 >>106515207
>>106514888
>>106515177
So it's what I'm getting from this is that you want a model that is good at role-playing about.... Programming?
Anonymous No.106515199 >>106515236
>>106515193
Yes, it takes wav files for voice cloning. Works fine with 10-40s or so.
Anonymous No.106515203
>>106515181
>better models
IMO it wouldn't be better if it doesn't have what you consider the internet bloat.
Anonymous No.106515206
>>106515165
I don't care, the majority of the training tokens are going to be ao3 anyway. I just need something a bit noisier in the background to keep it learning and hopefully improve generalization.
Anonymous No.106515207 >>106515214 >>106515260
>>106515197
I want it to be good at roleplay about anything I fucking want at a moment's notice, which includes programming or whatever else I enjoy.
Anonymous No.106515209 >>106515323
>>106515181
>If it is possible to fine-tune a decent model with significantly less data than the entire internet
Earlier you were talking about pretraining.
Anonymous No.106515214
>>106515207
BLOAT.
Anonymous No.106515220
>>106514823
>he didn't buy nvidia
lmao
Anonymous No.106515221 >>106515236
What if we trained a 0.1B on Nala test and nothing else?
Anonymous No.106515226 >>106515264
>>106515181
so you want a side grade at best to what we currently have, but with no trivia knowledge, which is something anons frequently complain about, to potentially lower the parameter count a bit? that sounds like an awful tradeoff.
Anonymous No.106515236 >>106515291
>>106515199
cool, i noticed there's a comfyui node setup for it. guess ill give that a go in a bit

https://github.com/wildminder/ComfyUI-VibeVoice?tab=readme-ov-file

>>106515221
peak the likes of which the world is not ready for (neither is my dick)
Anonymous No.106515246 >>106515275
>>106515193
I used a sample voice, of one of the bitches from class of 09. The included sample voices are ok though.

Included "alice" sample:
https://voca.ro/19VRhqX2fmcc

my shitty sample:
https://voca.ro/1hoVRSBntjxO

I really like how it handles quotes and speaks them in another 'tone' sometimes.
Anonymous No.106515258 >>106515274
https://huggingface.co/unsloth/grok-2-GGUF
How many reuploads will it take to get a working version?
Anonymous No.106515260 >>106515280 >>106515289
>>106515207
>Use an intelligent model that is already pre-trained on programming
>Further fine tune it on a SFT roleplay data set with a variety of different scenarios
>????
>Profit


What it sounds like to me is that you want a general purpose model, which we already have in spades.
Anonymous No.106515264 >>106515277
>>106515226
The trade off being able to run the model in a local machine versus a bloated model filled with useless shit that you need to offload to use.
Anonymous No.106515274
>>106515258
I'm surprised people even want to use elon's garbage. He tried pushing grok-code-fast a week ago too on a lot of providers and it was garbage.
Anonymous No.106515275
>>106515246
thats fucking cuhrayzee holy SHIT. nice.
also that girl's a cutie, would you be willing to post the sample you use?
out of context the script you use is completely schizophrenic but i love it, got a good few laughs out of me with the way her voice enunciates/exaggerates sometimes.
Anonymous No.106515277
>>106515264
So this entire retarded argument was in fact just poor cope as some had theorized, thanks for wasting the collective thread's time.
Anonymous No.106515280 >>106515300
>>106515260
Yes, exactly, that's what I'm saying. General purpose models are the only suitable models for RP.
Anonymous No.106515289 >>106515300
>>106515260
>general purpose model which we already hav e in spades.
And they're all shit because they're already too filtered.
Anonymous No.106515291 >>106515623
>>106515236
comfyui is full of telemetry now so we really need a new UI for vv
Anonymous No.106515300 >>106515314 >>106515317
>>106515280
Now the question is: is it possible to make GOOD general purpose models on less than an internet's worth of data? I'm assuming your answer is that it's not possible.

>>106515289
>What is sft training
Anonymous No.106515301 >>106515539
>2M context
kek
Anonymous No.106515304 >>106515315 >>106515337 >>106515347 >>106515362
/lmg/ btfo
https://www.reddit.com/r/LocalLLaMA/comments/1nb0ern/fully_local_natural_speech_to_speech_on_iphone/
https://apps.apple.com/us/app/locally-ai-private-ai-chat/id6741426692
Anonymous No.106515310 >>106515520
https://voca.ro/1n5vlenAX1pf
Anonymous No.106515314
>>106515300
>Now the question is, is it possible to make GOOD general purpose models on less than an internet's worth of data while being decent? I'm assuming your answer to that is that's not possible.
That assumption is correct.
Anonymous No.106515315
>>106515304
The MNN app was better. This is just a redditor's cheap knock-off of the OpenAI app.
Anonymous No.106515317 >>106515326
>>106515300
>What is sft training
NOT a solution to filtered pre-train data, you cannot make it learn worth a shit after it was already lobotomized.
Anonymous No.106515323 >>106515333
>>106515209
Meant to say pre-train.

>which is something anons frequently complain
And something just as many anons claim doesn't matter.
Anonymous No.106515326 >>106515339 >>106515357
>>106515317
???
Anonymous No.106515333 >>106515341
>>106515323
The "just rag it in bro" posters are not being serious.
Anonymous No.106515337
>>106515304
>I am here to answer quest-eons and to provide helpful re-sponses
Anonymous No.106515339 >>106515355
>>106515326
Congratulations on making the model say pussy. You won, that is totally what I meant.
Anonymous No.106515340
>>106515181
https://www.youtube.com/watch?v=LQCU36pkH7c
Anonymous No.106515341 >>106515350
>>106515333
With that assumption I could say the same thing about literally everything you said and vice versa.
Anonymous No.106515347
>>106515304
>ganyouhelpme
Anonymous No.106515350
>>106515341
Have you tried RAG?
Anonymous No.106515355 >>106515363
>>106515339
So you can agree "uncucking safety tuned models is impossible" is a nonsensical claim right?
Anonymous No.106515357 >>106515395
>>106515326
>It's so small, ..., almost like it was made of my cock
Your AI bot just called your cock small lmao
Anonymous No.106515362
>>106515304
White people tech.
Anonymous No.106515363 >>106515395
>>106515355
You're absolutely right, I don't even know why you're arguing with anons since clearly you can just do things and make the best model ever?
Anonymous No.106515379 >>106515396 >>106515437
Not all "safety tuned" models are the same. gp-toss is basically unsalvageable garbage. GLM without thinking and with a prefill will basically never refuse and it can write some fucked up shit.
Anonymous No.106515395 >>106515421
>>106515357
N....no it's NOT! It's perfectly reasonably sized my mom says so!

>>106515363
Why are you so upset that I don't agree with your hyper-specific and autistic opinions? No one claimed they can make the best model ever.
Anonymous No.106515396
>>106515379
This really hits on the fundamentals of LLM safety, you're so smart for pointing this out!
Anonymous No.106515421
>>106515395
>Why are you more upset that I don't agree with your hyper specific and autistic opinions? No one claimed they can make the best model ever.
Assuming you mean me by "hyper specific and autistic", that post was not made by me.
Anonymous No.106515437 >>106515442 >>106515932 >>106515948 >>106515972
>>106515379
>GLM without thinking and with a prefill will basically never refuse and it can write some fucked up shit.
So does gpt-oss, and it feels more creative than GLM. That model is too prone to repetition and it breaks down with context pretty fast. The honeymoon period didn't last long and I now use gpt-oss as my main model.
Anonymous No.106515439 >>106515444 >>106515450
So I heard a little while back that the zucc wants to create "a personal superintelligence". Does he mean he wants all people to be able to use "super intelligent models"? What is his end goal here?
Anonymous No.106515442 >>106515538
>>106515437
Nta. So turning off "thinking" results in better quality and less refusals?
Anonymous No.106515444
>>106515439
he wants to create the perfect RAG
Anonymous No.106515450
>>106515439
All people with a Facebook account maybe, Meta is API first now thanks to Wang's wisdom.
Anonymous No.106515459 >>106515463 >>106515466 >>106515473 >>106515476 >>106515491
best model to be a kemoshota and get fucked by big bad wolves?
Anonymous No.106515463 >>106515472
>>106515459
that sounds like a lot of bloat knowledge.
Anonymous No.106515466
>>106515459
Well, I learned a new word today.
Anonymous No.106515472
>>106515463
bloat? thats essential knowledge
Anonymous No.106515473
>>106515459
Gemma 300M with RAG
Anonymous No.106515476 >>106515485
>>106515459
Male or female wolves?
Anonymous No.106515485
>>106515476
both
Anonymous No.106515491 >>106515509
>>106515459
real life you big faggot haha lol owned
Anonymous No.106515496 >>106515504 >>106515509 >>106515561 >>106515999
is this true?
Anonymous No.106515504
>>106515496
Fuck no, we can't do anything worthwhile, except for that one anon of course.
Anonymous No.106515509
>>106515496
it is, i was there.
>>106515491
i cant be a cute creature in real life anon kun...
Anonymous No.106515520
>>106515310
Seems like you increased the lethargy and dementia setting a little too high
Anonymous No.106515538
>>106515442
I like it more with thinking enabled. Usually what I have to do is grab one with a refusal and flip some of the words. After leaving one or two in context, it starts doing the thinking in an uncensored way. Then I like being able to edit the thinking part to customize the response.
Anonymous No.106515539
>>106515301
Holy slowpoke
Anonymous No.106515545 >>106515612 >>106515646 >>106515669
>>106515181
>If it is possible to fine-tune a decent model with significantly less data than the entire internet, then that theoretically could mean you could have better models at lower parameters
Huh? Training on less data would reduce training cost, but not parameters. If anything you'd need more parameters to reach the same level. More data makes a model more parameter-efficient, not less. You're confused, anon. People figured this out a long time ago when they moved past chinchilla scaling.
Current models ARE probably inefficient at RP because they're not being designed for it, but this doesn't mean you can skip out on training.
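For reference, the Chinchilla paper fit the loss as roughly L(N, D) = E + A/N^α + B/D^β, with α ≈ 0.34 and β ≈ 0.28 (going from memory, so treat the exponents as approximate): cap D and the B/D^β term puts a floor on loss no matter how large you make N, which is exactly why "less data, and therefore fewer parameters" doesn't follow.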
Anonymous No.106515559
More kiwis soon! (Qwen)
Anonymous No.106515561 >>106515574
>>106515496
I think these kinds of things should be documented. I'm pretty sure a lot of the stuff discovered back then is still not in any paper yet.
Anonymous No.106515574
>>106515561
>lot of stuff discovered back then are still not in any paper yet
like what?
Anonymous No.106515612 >>106515651 >>106515654
>>106515545
Not the entire internet.
Anonymous No.106515623
>>106515291
It's open source. Show where.
Anonymous No.106515646
>>106515545
NTA, but since larger models memorize more, they're able to recall more of the rare information seen during pretraining. To some extent (it's not just that, admittedly), the stronger RP capabilities of those models are because of that. A smaller model pretrained primarily for the purpose of simulating human interactions, conversations and RP (instead of improving math benchmarks, etc) could potentially match the capabilities of larger models in that area.

Of course we'll never have that as long as we have anons who care about models doing the tsundere bubble sort in Erlang or proving the theory of relativity while giving you a blowjob as a mesugaki.
Anonymous No.106515651
>>106515612
I think the last few years have been pretty bad because of AI-generated content, and because of all the political radicalization of the past decade or so. I think ideally you would use the entire pre-2012 internet.
Anonymous No.106515654 >>106515663 >>106515666
>>106515612
Yes, we'll filter out the spam and garbage.
Anonymous No.106515663 >>106515675
>>106515654
define garbage?
Anonymous No.106515666
>>106515654
You need to filter more than that to be efficient.
Anonymous No.106515669
>>106515545
This is correct. Bigger models are more sample efficient, so they can learn more from less data. In contrast, small models need more data to reach a given level of quality.
Anonymous No.106515675
>>106515663
"the a congo sex the the a congo congo nigeria vagina pussy pussy the the" kind of documents.
Anonymous No.106515679 >>106515692 >>106515710 >>106515756 >>106515855 >>106516412
Most posters itt are severely autistic.
Anonymous No.106515692 >>106515723 >>106515725 >>106515741 >>106515749
>>106515679
Yeah, can't believe anyone would argue for hours about how to train models instead of just doing it and shutting everyone up for good.
Anonymous No.106515710
>>106515679
https://vocaroo.com/12RPstjPnT74
Anonymous No.106515719 >>106515759 >>106515852 >>106515879 >>106516080
>>106512596
>you're a spoiled child if you expect AIniggerdevs to stop writing python slop code that requires command line manual installation in 2025
EXE. I want EXE. Where is the EXE.
Anonymous No.106515723
>>106515692
Sure, just download the entire internet, set up a filtering pipeline, then give me the money to pretrain a full model on it.
Anonymous No.106515725
>>106515692
It all started when I suggested that LLMs won't acquire significant "tacit knowledge" until they've seen large amounts of data, and that this could be expedited with targeted training data...
Anonymous No.106515740
>>106514698
Yes, it's been explained before: the context search isn't literal. Most NIAH tests involve having context like "John put some mayonnaise on his hamburger and hot dog." and then asking "What condiment did John put on his hamburger?". NoLiMa instead asks something like "John got some french fries. What condiment(s) would he likely put on them?". That requires actual reasoning and connecting the dots on the context you have to extrapolate correctly, which is harder when, as said, you aren't literally matching what you've seen in the context to the question asking about it.
Anonymous No.106515741 >>106515760
>>106515692
https://vocaroo.com/1nWbsRIXibi3
Anonymous No.106515749
>>106515692
it takes time, anon
Anonymous No.106515756
>>106515679
I should hope so. Autistic people are how we get our best technological breakthroughs.
Anonymous No.106515758
Why hasn't OpenHands finetuned a coding model on Qwen3-coder yet? why use Qwen2-coder
Anonymous No.106515759 >>106515852 >>106515879 >>106516084
>>106515719
Exactly. llama.cpp is popular because you can download it as an exe file and run, no pythonshit needed.
Anonymous No.106515760 >>106515803
>>106515741
god damn thats so good ladies and gentlemen the best tts out there
Anonymous No.106515766 >>106515784
>>106514823
>>decide to check on ipex-llm to see if they finally updated to support latest models like gpt-oss
They are all in on vLLM, and for good reason too, because of enterprise and project BattleMatrix. They do what they can with ipex-llm and contributions to llama.cpp, but it is lower priority and neglected. Mainline llama.cpp SYCL isn't that bad, but you can see the neglect when a crashing bug was fixed in https://github.com/ggml-org/llama.cpp/pull/15582 but a mistake was made and it wasn't followed up on, with two weeks and counting to get it merged in. Sad.
Anonymous No.106515769 >>106515783 >>106515787
gay thread. 80% of the population is newfags, 80% of the population is troons.
Anonymous No.106515783
>>106515769
>80% tof the population is troons.
It's not quite that bad yet nonnie.
Anonymous No.106515784 >>106515828
>>106515766
>vLLM
Are they actually directly contributing to vLLM to have native ipex support or do you still have to go through ipex-llm to use vLLM with ipex?
Anonymous No.106515787 >>106515790 >>106516002
>>106515769
Saar! You forgot India mention! India AI superpower 2025 Gemini Google. Kindly say 80% posters Indian thank you saar.
Anonymous No.106515790 >>106515802
>>106515787
80% of population is indeed indian too.
Anonymous No.106515802 >>106515831
>>106515790
so 80% indian train gays? Waow!
Anonymous No.106515803 >>106515827 >>106515830 >>106515848 >>106515877 >>106516109
>>106515760
https://vocaroo.com/121za8zMgKiQ
Anonymous No.106515810 >>106516038 >>106516188
Will we soon look at "coding" the same way we look at "calculating"? Prior to calculators and computers, we used to have rooms of humans doing things like ballistics calculations.
Now you still need to know *math* to use a calculator or spreadsheet effectively to solve problems that span more than a single operation, but you don't need to do *arithmetic* any more.
tl;dr vibecoding normies are just monkeys banging on a calculator to get "8008135". There's higher-order knowledge needed to make software.
Anonymous No.106515827
>>106515803
shit, a few seconds in I knew I was going to contract orange man cancer
Anonymous No.106515828
>>106515784
https://docs.vllm.ai/en/latest/getting_started/installation/gpu.html
You have to build the wheels yourself but they are contributing as regularly as a company going bankrupt with limited resources is doing
https://github.com/vllm-project/vllm/commit/e599e2c65ee32abcc986733ab0a55becea158bb4
This is on par with their Pytorch cadence. This was the last SYCL related commit to llama.cpp in comparison and it wasn't even done by Intel.
https://github.com/ggml-org/llama.cpp/commit/8b696861364360770e9f61a3422d32941a477824
Anonymous No.106515830 >>106516037
>>106515803
But I can't use voice clips from my favorite anime voice actress
Anonymous No.106515831 >>106515846
>>106515802
Yes saar India love trains saar very prod of the country
Anonymous No.106515846
>>106515831
MAKE NEW THREAD BLOODY
Anonymous No.106515848
>>106515803
nice
Anonymous No.106515852 >>106515873 >>106515905
>>106515719
>>106515759
If you get filtered by CLI you deserve suffering.
Anonymous No.106515855 >>106515875
>>106515679
I have never been diagnosed with autism.
Anonymous No.106515873
>>106515852
Fuck off pyshitter.
Anonymous No.106515875
>>106515855
Only a doctor can diagnose autism.
Anonymous No.106515877 >>106515896
>>106515803
what settings are you using? mine are all coming out completely schizophrenic.
Anonymous No.106515879 >>106515985
>>106515719
>>106515759
>>106502028
https://vocaroo.com/1j4yGPQKczdx
Anonymous No.106515896 >>106515929
>>106515877
30 steps. It depends on the voice sample too.
Anonymous No.106515905
>>106515852
Damn, Python looks like that?
I don't care that she's a bit slow, she's bloated in all the right places.
Anonymous No.106515907 >>106515944 >>106516080
>>106502028
Actually incredibly based opinion, but troons will disagree.
Anonymous No.106515920
Has anyone sussed out any best practices for vibevoice samples? I'm not sure yet if it's better to go for longer samples or to trim it down closer to the length of the audio you're trying to make.
Anonymous No.106515926 >>106515987
>>106502028
That is what kobold is for. And from kobold you can fall into a trap of oobashit or you can go straight to llamacpp. Once you have it set up it honestly isn't that bad. I don't even have a bat, just have the commands in a textfile and it werks. Even without bat it is actually faster than oobashit and kobold.
Anonymous No.106515929 >>106515963 >>106515985
>>106515896
does this shit need a longer sample? same 8 second sample i used with zonos is completely wonked out with default settings in the node setup, adding steps doesn't change it.
Anonymous No.106515932
>>106515437
>The honeymoon period didn't last long
this has been the entire history of the GLM models and only retards keep pushing them
Anonymous No.106515944
>>106515907
>>106501412
>SPEAK LIKE A HUMAN BEING YOU SYNTHETIC MONSTER
Anonymous No.106515948
>>106515437
>and I now use gpt-oss as my main model.
what?
Anonymous No.106515963 >>106516002
>>106515929
I used a 23s sample for that one.
Anonymous No.106515972 >>106515994
>>106515437
>I now use gpt-oss as my main model
Bro, you're supposed to turn off the model thinking not your own
Anonymous No.106515985 >>106516002
>>106515929
most of my samples are a minimum of 40 seconds but two minutes gives the best results
smallest one is the Mandy sample I used here >>106515879 at like 38 seconds; I cleaned it to the best of my abilities but some of the background noise still bleeds thru
Anonymous No.106515987
>>106515926
https://vocaroo.com/1bFeQGTMqTTf
Anonymous No.106515994
>>106515972
you never had any thinking when you thought glm was a good model bro
Anonymous No.106515999 >>106516013 >>106516017
>>106515496
>Holo prompt
Was ist das? Google returns garbage.
Anonymous No.106516002
>>106515963
noted. I guess it really didn't like the combination of my sample being kinda fast + only 8 seconds.
here's alucard reading this post >>106515787
(30 steps seems like the max it needs for a quality boost)
https://voca.ro/1i3Yya3rUVn6


>>106515985
cool thanks for the info, gonna be a challenge to get that character over 8 seconds but at least alucard had that 10+ seconds kek
Anonymous No.106516013
>>106515999
go to the link in the image and read the thread
Anonymous No.106516017
>>106515999(me)
ah fuck, should've scrolled down the image.
Anonymous No.106516037 >>106516133
>>106515830
Why not?
https://vocaroo.com/1beCnoUdgpID
Anonymous No.106516038 >>106516045 >>106516074 >>106516085 >>106516198 >>106516214 >>106516276
>>106515810
The real change will come when there are "vibecode" specific languages created.
It's only a matter of time.
Anonymous No.106516045
>>106516038
It's called English, r-tard.
Anonymous No.106516059 >>106516077 >>106516087
yeah my results are aaaalll over the place, but this turned out really nicely.

https://voca.ro/16gmTFt1O8vf
Anonymous No.106516074 >>106516093
>>106516038
isn't that just python?
Anonymous No.106516077 >>106516091
>>106516059
I found out that the ComfyUI implementation is all over the place. The Python demo is way more reliable and more consistent.
I don't know if it's because of Cumrag itself or if the implemented nodes are bad. I can only guess.
Anonymous No.106516080 >>106516100 >>106516106
>>106515907
>>106515719
https://voca.ro/1b5FwnOiykK6
Anonymous No.106516084
>>106515759
llama.c when
Anonymous No.106516085
>>106516038
Javascript already exists, anon. People have been vibecoding JS since before AI existed.
Anonymous No.106516087
>>106516059
I think it struggles with voices that have a very high dynamic frequency range like Peach there. It's difficult to get a sample for certain seiyuus where they aren't peaky like that, since that's part of the appeal.
Anonymous No.106516091 >>106516108 >>106516140
>>106516077
could ya link me the python demo? thanku.
glad to know i'm not alone.
Anonymous No.106516093
>>106516074
and javascript for when interfaces are needed
Anonymous No.106516100
>>106516080
kek
Anonymous No.106516106 >>106516130
>>106516080
I think I recognize that voice a little but can't quite place it, is it a cartoon girl bully?
Anonymous No.106516108 >>106516140 >>106516177
>>106516091
https://github.com/vibevoice-community/VibeVoice/
Anonymous No.106516109 >>106516218
>>106515803
make him do porn noises
Anonymous No.106516117 >>106516147
>>106514051
>the quality of the output starting to go downhill if I generate anything longer than 30 seconds.

Not true with the native implementation (clone Microsoft/VibeVoice)

I did not go over 4 min. However, it stays consistent all along.
Anonymous No.106516130 >>106516160
>>106516106
It's the Witch from Slay The Princess.
Anonymous No.106516133
>>106516037
https://vocaroo.com/1iIp0ji2b59p
Anonymous No.106516140 >>106516177
>>106516091
>>106516108
Forgot that you'll need to look at inference_from_file.py
and do something like
>python demo/inference_from_file.py --model_path ./VibeVoice-1.5B --txt_path ./test.txt --speaker_names Faggot
>voices go into demo/voice and are named en-Faggot_male for example
Anonymous No.106516147 >>106516202 >>106516275
>>106516117
It sadly gets pretty rough even with that when you try to do a 40 minute script. It slowly starts getting worse.
Anonymous No.106516160 >>106516244
>>106516130
never played that game and checked her imdb, barely any roles and mostly online shit
weird
Anonymous No.106516176
amerilards cant into bakery
Anonymous No.106516177
>>106516108
>>106516140
thanks. i noticed there's two repos for this too, the wildminder one and one by a guy named Fabio Sarracino.

Might just uninstall wildminder's and risk getting AIDS from Fabio. Worth a shot.
Anonymous No.106516188
>>106515810
I've been thinking a lot about this old Twilight Zone episode that depicted future programmers as just people with microphones that speak to the machines.
There was also a time when compilers were looked down. They saved time by letting you write in C, but often times the result was not performant, didn't output valid Assembler, and you ended up having to write your own anyway. Nowadays, almost no one has to write hand rolled Assembler anymore, and attempting to do so outside a few niches would result in worse code than what the compiler is capable of writing.
The technology is still new. I'm sure the transition to higher-order knowledge work is inevitable, but it's probably still decades away.
Anonymous No.106516198 >>106516219 >>106516221
>>106516038
Tokens to represent ASM opcodes?
direct cpu token interpretation?
Token microcode?
Anonymous No.106516202
>>106516147
It's built on Qwen 2.5, so naturally it will start to degrade when you try to use the full context it claims to support. Just chunk it. At least you can do far bigger chunks than with other TTS.
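Something like this is all the chunking it needs (rough sketch, file names made up; run inference_from_file.py on each piece and stitch the wavs together afterwards):
MAX_LINES = 40  # keep each piece well below the length where quality starts degrading

with open("long_script.txt", encoding="utf-8") as f:
    lines = [line.rstrip("\n") for line in f if line.strip()]  # drop blank lines

chunks = [lines[i:i + MAX_LINES] for i in range(0, len(lines), MAX_LINES)]
for n, chunk in enumerate(chunks):
    with open(f"chunk_{n:03d}.txt", "w", encoding="utf-8") as out:
        out.write("\n".join(chunk) + "\n")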
Anonymous No.106516214
>>106516038
Shouldn't it be sigma instead of capital C

naoshite kure
Anonymous No.106516218 >>106516248 >>106516250 >>106516266 >>106516305
>>106516109
https://vocaroo.com/1oL6yJxfeEIp
Anonymous No.106516219
>>106516198
Then we build a language on top of that.
We need more layers of abstraction.
Anonymous No.106516221
>>106516198
The vast majority of enterprise LOB apps and startup shovelware does not need to go that low level. The trend is always towards more abstractions, not less. If anything, a new language designed for use by LLMs would be an abstraction over Python.
Anonymous No.106516244
>>106516160
Yeah, as best I can tell, she's a pretty mid streamer with a ton of untapped voice acting talent. I think she only got the Slay The Princess role after sending something directly to the devs. I hope she goes out and gets more roles, because she knocked it out of the park with the one she got.
Anonymous No.106516248
>>106516218
L.O.L.
You made me weekend!
Anonymous No.106516250
>>106516218
After hearing some porn noise samples posted here, I can now conclude that rumors about VV being pulled due to NSFW usage are false.
Anonymous No.106516266
>>106516218
Lol
Anonymous No.106516275 >>106516288
>>106516147
what kind of "limit" is it? it is the github demo with gradio
Loaded example: 1p_Ch2EN.txt with 1 speakers
Loaded example: 1p_abs.txt with 1 speakers
Loaded example: 2p_goat.txt with 2 speakers
Loaded example: 2p_music.txt with 2 speakers
Loaded example: 2p_short.txt with 2 speakers
Loaded example: 2p_yayi.txt with 2 speakers
Loaded example: 3p_gpt5.txt with 3 speakers
Skipping 4p_climate_100min.txt: duration 100 minutes exceeds 15-minute limit
Skipping 4p_climate_45min.txt: duration 45 minutes exceeds 15-minute limit
Successfully loaded 7 example scripts
Launching demo on port 7860
Anonymous No.106516276 >>106516349
>>106516038
we tried a few times making programming languages that are close to natural language and easy for humans, but results were mediocre.
SQL is a particular disaster that will haunt us forever.
Anonymous No.106516288 >>106516309 >>106516325
>>106516275
Oh, so it's supposed to be used with shorter scripts. Is it only the 1.5B one that can do long scripts?
Anonymous No.106516305
>>106516218
lmaooooo, that's why I love this site, I know I'll find kino shit like that at some point
Anonymous No.106516309 >>106516320
>>106516288
The model card says 7B can do 45 minutes and 1.5B can do 90 minutes.
Anonymous No.106516320
>>106516309
It's still mostly intelligible at 45 minutes, but it does hurt your ears, so it's not a complete lie.
Anonymous No.106516325
>>106516288

hmmm...
Anonymous No.106516349 >>106516373
>>106516276
I think it'll be something far different, that probably won't make intuitive (human) sense. Basically human unreadable.
I get the point, can't train what you don't have examples of, and there's lots of python and JS to copy from.
But I expect there will be some intermediary language, that LLM (or whatever) can manipulate really easily, and humans won't be able to understand at all.
Anonymous No.106516362
https://voca.ro/1gZ6xankFzjP
Anonymous No.106516373 >>106516389
>>106516349
eh, just skip the middle man and train it on machine code directly at this point.
Chunk generated output into a VM, see if it works, reward/punish the model, repeat.
Anonymous No.106516379
>>106516368
>>106516368
>>106516368
Anonymous No.106516389
>>106516373
>eh, just skip the middle man and train it on machine code directly at this point.
Do you have any idea how many tokens it would need to spit out to do even the most trivial tasks?
Anonymous No.106516412 >>106516481
>>106515679
high functioning autist hobby that filters most people
your average joe wouldn't know how any of this works except thinking that chatgpt is some kind of magic word machine that pulls stuff out of thin air
Anonymous No.106516481
>>106516412
>high functioning autist forum that filters most people
ftfy