7/24/2025, 9:34:04 PM
I benchmarked a DeepSeek cope quant running partially from NVMe and found an extremely marginal improvement from using an excessive number of threads. I can only speculate, but more threads should mean more concurrent memory accesses and therefore more page faults in flight, which would let the kernel queue up more NVMe requests and get slightly higher total throughput despite the overhead. I'm going to try the iq2 next and see just how bad running from NVMe can really get.
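A rough way to poke at the queue-depth guess without loading a model is to time random page-sized reads of a big file at different thread counts. This is only a sketch under assumptions: it uses `os.pread` from a thread pool as a stand-in for mmap page faults (the inference engine would fault pages in through mmap, not call `pread`), the 4096-byte page size is assumed, and the names `read_pages` and `bench` are made up for illustration. It is Unix-only.

```python
import os
import random
import time
from concurrent.futures import ThreadPoolExecutor

PAGE = 4096  # assumed page size; os.sysconf("SC_PAGE_SIZE") reports the real one on Linux


def read_pages(fd, offsets):
    """Read one page at each offset and return the number of bytes read.

    os.pread releases the GIL while it blocks, so several threads can
    have reads outstanding at the same time.
    """
    total = 0
    for off in offsets:
        total += len(os.pread(fd, PAGE, off))
    return total


def bench(path, threads, reads_per_thread=256, seed=0):
    """Return (bytes_read, seconds) for random page reads split across `threads` workers."""
    n_pages = max(os.path.getsize(path) // PAGE, 1)
    rng = random.Random(seed)
    # Pre-generate page-aligned offsets so the timed region is only I/O.
    work = [
        [rng.randrange(n_pages) * PAGE for _ in range(reads_per_thread)]
        for _ in range(threads)
    ]
    fd = os.open(path, os.O_RDONLY)
    try:
        start = time.perf_counter()
        with ThreadPoolExecutor(max_workers=threads) as pool:
            total = sum(pool.map(lambda offs: read_pages(fd, offs), work))
        elapsed = time.perf_counter() - start
    finally:
        os.close(fd)
    return total, elapsed
```

Calling `bench(path, t)` for a few values of `t` and comparing `bytes_read / seconds` gives a crude throughput curve. The numbers only say something about the NVMe if the file is not already sitting in the page cache, so a file much larger than RAM (like the model itself) is the sensible target.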