7/8/2025, 9:23:25 AM
>>105835202
I'm not saying that closed source datasets are shittier, but the quality of the individual datasets matters only up to a certain point when you have the compute for rapidly iterating with huge batch sizes and can apply RLHF according to precise specifications on top of that.
Until recently, MistralAI Instruct models were fine-tuned mainly with open datasets, and people seemed fine with them. e.g.:
>The Mistral Nemo Instruct model is a **quick demonstration** that the base model can be easily fine-tuned to achieve compelling performance. It does not have any moderation mechanisms. We're looking forward to engaging with the community on ways to make the model finely respect guardrails, allowing for deployment in environments requiring moderated outputs.
Picrel from MistralAI's first paper shows what they meant by "quick demonstration".