>>106157843
yes, but none of them are as good as the best general-purpose LLMs
some aren't too bad, like aya, but aya has some of that Command jank where it randomly goes crazy. it doesn't happen often, but still often enough that I wouldn't want to use it for automation
it's okay I guess if you use it interactively and regen a bad gen on the go
also, Cohere models aren't very good at following instructions if you try to do anything other than get a basic translation
my recommendation, from smallest model to biggest (run the biggest your computer can handle):
Qwen 3 4B / 8B / 14B, Gemma 3 27B (the smaller Gemmas are too quirky), then straight to the humongous DeepSeek. There's really nothing of value between Gemma and DeepSeek for this kind of use. Most models have too little knowledge, which makes them bad at translating niche terms, made-up but common words in fiction, etc. The Qwen models also have little knowledge, but the smaller ones get a mention because they are the most coherent, reliable small models.
