>>105588693
o shit wait, you're right, e5m2 mostly works. Though on the 2B it's still prompt- and seed-dependent: sometimes it gives really bad results, while switching to bf16 with the same seed looks fine. On the 14B, e5m2 is a bit more robust, but the quality gap from bf16 is still much larger than on other models. And the fact that e4m3 completely doesn't work makes this model seem unusually sensitive to quantization.
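
For anyone who wants to poke at this, here's a minimal sketch (assumes PyTorch 2.1+, which ships both fp8 dtypes; the tensor values below are made up for illustration). The tradeoff between the formats: e4m3 spends an extra bit on mantissa but its max finite value is only ~448, while e5m2 reaches ~57344, so outlier weights/activations are the first place e4m3 would break:

import torch

# made-up values: small in-range weights plus an outlier past e4m3's range
x = torch.tensor([0.1, 1.0, 300.0, 1000.0], dtype=torch.bfloat16)

for fp8 in (torch.float8_e4m3fn, torch.float8_e5m2):
    back = x.to(fp8).to(torch.float32)  # quantize, then dequantize
    print(fp8, back.tolist())

# e4m3fn tops out around 448, so the 1000.0 entry has no representation
# there (the fn variant doesn't even have inf); e5m2's extra exponent bits
# cover it, at the cost of coarser rounding on in-range values like 300.0.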

Either way, we need GGUF; a proper quant there would probably hold up a lot better.