>>106158572
nemo should be a nice jump in intelligence vs both of those. anything labeled ds r1 8b is a distill onto a small base model (llama 3.1 8b, or qwen3 8b for the 0528 release), not the real r1. at the quant i linked you should be able to fit it entirely in vram if you use 12k context, maybe with 8-bit kv cache. mistral small is 24b and a bit newer, and has thinking. you could try that too, but it would have to be partly offloaded to your system ram, so it will be slower
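
rough math on why 12k context and 8-bit kv cache matter for fitting in vram. this is only a sketch: the layer/head numbers are what i recall from nemo's published config (40 layers, 8 kv heads, head dim 128), so check them against the model's config.json, and `kv_cache_bytes` is a made-up helper name. "8-bit" is treated as exactly 1 byte per element, which ignores the small per-block scale overhead of real quantized cache formats.

```python
def kv_cache_bytes(n_layers: int, n_kv_heads: int, head_dim: int,
                   ctx: int, bytes_per_elem: float) -> float:
    """Approximate KV cache size: one K and one V vector per layer per token."""
    return 2 * n_layers * n_kv_heads * head_dim * ctx * bytes_per_elem


GIB = 1024 ** 3

# Assumed Mistral Nemo 12B shape (verify against config.json).
NEMO = dict(n_layers=40, n_kv_heads=8, head_dim=128)

if __name__ == "__main__":
    ctx = 12 * 1024  # "12k" context
    fp16 = kv_cache_bytes(**NEMO, ctx=ctx, bytes_per_elem=2)
    int8 = kv_cache_bytes(**NEMO, ctx=ctx, bytes_per_elem=1)
    print(f"fp16 kv cache at {ctx} tokens: {fp16 / GIB:.3f} GiB")
    print(f"8-bit kv cache at {ctx} tokens: {int8 / GIB:.3f} GiB")
```

under those assumptions the cache is a bit under 2 GiB at fp16 and about half that at 8-bit, on top of the model weights, which is why the context size and cache type decide whether everything fits on the card.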