What's the best model to run on a machine with 32GB of RAM and 12GB of VRAM? I downloaded QwQ-32B a while ago and it was pretty good, albeit obviously slow (2 t/s). I wonder if anything better has come out since then, in terms of either speed or intelligence. I use the LLM for solving problems, not casual chatting, so output quality is more important than speed.