>>105933826
It's something like pain. With the recent exllamav2 commits, Mistral Small outputs gibberish, I can't use anything newer than Mistral-3.1, and exllamav3 doesn't support Radeon yet. There are lots of small issues; for example, the TensorFlow ROCm wheel is too large for pip's MessagePack-based cache serialization, so you have to disable the pip cache. Also, CUDA emulation is faster than the native API (when it works), but as you can imagine, that's another layer of suffering. I mainly use Nvidia and bought one card to experience the pain. 10/10, would recommend.
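(If anyone hits the same MessagePack error: pip has a real --no-cache-dir flag, so something along the lines of "pip install --no-cache-dir tensorflow-rocm" should skip the cache serialization step entirely. The exact package name there is just a guess on my part; use whatever wheel you're actually installing.)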