>>106512307


>>106510426

>>106510342
>And what kind of data would be relevant to ERP?

Nta. You make the source data the kind of stories you want it to be good at writing. These models kind of suck at it, or are prone to writing safety-slop purple-prose trash, because, as many of us have been pointing out repeatedly, the companies keep filtering out data they deem "low quality" or "unsafe". You need both the good and the "trash" data so the model doesn't overfit on that generic, boring corporate writing style a lot of the models have. You get a bunch of stories (there are countless scrapes of RP stories floating around on Hugging Face alone), turn those into SFT datasets, and then just train your model on that. I did exactly that and have demonstrated you can get even heavily cucked models like Llama to completely drop the purple prose. It actually writes shit that sounds like it came from a natural person.
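
For anyone wondering what "turn those into SFT datasets" looks like in practice, here is a rough sketch, not the exact pipeline I used. It assumes each story is plain text with paragraphs separated by blank lines, and that your trainer accepts prompt/completion JSONL; the function names, the `context_paragraphs` parameter, and the field names are all made up for illustration, so adapt them to whatever your training stack wants.

```python
import json


def story_to_sft_records(story, context_paragraphs=2):
    """Turn one story into prompt/completion pairs.

    Each record uses the previous `context_paragraphs` paragraphs as the
    prompt and the paragraph that follows them as the completion.
    (Hypothetical scheme: chat-style or full-story formats work too.)
    """
    # Paragraphs are assumed to be separated by blank lines.
    paragraphs = [p.strip() for p in story.split("\n\n") if p.strip()]
    records = []
    for i in range(context_paragraphs, len(paragraphs)):
        prompt = "\n\n".join(paragraphs[i - context_paragraphs:i])
        records.append({"prompt": prompt, "completion": paragraphs[i]})
    return records


def write_jsonl(records, path):
    """Write one JSON object per line, the usual layout for SFT data files."""
    with open(path, "w", encoding="utf-8") as f:
        for record in records:
            f.write(json.dumps(record, ensure_ascii=False) + "\n")
```

You would loop this over every scraped story, dump everything into one file with `write_jsonl`, and point your fine-tuning script at it. Stories shorter than `context_paragraphs + 1` paragraphs produce no records, so nothing breaks on short scrapes.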


The obvious downside is that "garbage in, garbage out" applies to this approach too. The stories in the original dataset were not formatted "professionally" the way you would find in a romance novel or something. So if you hate the writing style of AO3 authors or Wattpad authors or wherever the data was ripped from, then you will hate fine-tunes like that, but it will not have the safety slop fuckery hindering it or causing it to refuse.