>>106975556
Frens, has anyone tested the DGX Spark yet?
I fucking can't get a Magistral-Small-2509 GGUF to run faster than on a single 3090 lol...
I tried recompiling llama.cpp with various flags and different quants like Q5_K_M.gguf, still can't get it past 10 tokens/s wtf...
Wondering if anyone else has tried it.
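In case I'm just building it wrong, this is roughly what I'm doing (flags from memory, the model filename is just an example of what I'm loading):

cmake -B build -DGGML_CUDA=ON
cmake --build build --config Release -j
./build/bin/llama-bench -m Magistral-Small-2509-Q5_K_M.gguf -ngl 99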