>>105826331
The p40 is about the same speed as an m4 pro mac mini 24gb, which is $1500 (~16gb usable since the memory is shared with the OS), or an AMD AI max 385 32gb for $1000 from framework (more like 16gb usable, because the software runs out of vram even when you have 32gb and allocated 24gb to the GPU; it wastes ram for some reason). And the p40 is from 2016 and can't run certain model formats because of how old it is.
So technically it's all just 8-10tk/s with a 24gb model whether you have 1 p40 or the miniPC (50% faster on the miniPC, since in reality you would be running 16gb models on it...). If you got 2 p40s or upgraded the miniPC to reach 48gb, it's like 4tk/s with a 48gb model.
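Quick sanity check on those numbers, as a rough sketch: it assumes token speed is inversely proportional to model size (generation being limited by memory bandwidth), which is a simplification and not something measured here. The function name `scaled_tokens_per_sec` is made up for illustration.

```python
def scaled_tokens_per_sec(base_tps, base_model_gb, new_model_gb):
    """Estimate tk/s at a different model size.

    Assumption: speed scales inversely with model size, because every
    generated token has to read the whole model from memory once.
    """
    return base_tps * base_model_gb / new_model_gb


# Baseline from the post: 8-10 tk/s with a 24gb model.
for base_tps in (8, 10):
    at_16gb = scaled_tokens_per_sec(base_tps, 24, 16)
    at_48gb = scaled_tokens_per_sec(base_tps, 24, 48)
    print(f"{base_tps} tk/s @ 24gb -> {at_16gb:.1f} tk/s @ 16gb, {at_48gb:.1f} tk/s @ 48gb")
```

Under that assumption a 16gb model comes out 50% faster than a 24gb one, and a 48gb model lands at 4-5tk/s, which lines up with the figures above.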
At this point, if you just want ONE gpu, you might as well just get a 5070 TI for like $750. It's 16gb and similar to a 3090 in token speed, so it would run at like 30tk/s with a 24gb model (can't fit that, but in theory), and you can offload part of a 24gb model to the CPU and run it at like ~15tk/s.
Or just wait for the 5070 TI super, which should have 24gb of vram (for like $1000 MSRP, or $1500 inflated in the worst case scenario, depending on AMD's price and supply).
>the p40 is $200, if the 5070 TI super is "~$1000", the p40 is better
The 5070 TI runs at 3x the speed of a p40, so it's arguably worse in cost per token by like ~30%?, but non-obsolete hardware comes at a premium (also I don't know if p40s are still $200).
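The cost comparison can be checked with a few lines, as a sketch only. It uses the prices and speeds quoted in this post (p40 at $200 and ~10tk/s, 5070 TI at $750 and ~30tk/s); treating the 5070 TI super as the same 30tk/s at $1000 is my assumption, since that card's speed isn't known. `dollars_per_tps` is a made-up helper name.

```python
def dollars_per_tps(price_usd, tokens_per_sec):
    """Purchase price divided by generation speed (lower is better)."""
    return price_usd / tokens_per_sec


# (price in USD, tk/s with a 24gb model); figures from the post,
# except the 5070 TI super speed, which is assumed equal to the 5070 TI.
cards = {
    "p40": (200, 10),
    "5070 TI": (750, 30),
    "5070 TI super (assumed speed)": (1000, 30),
}

baseline = dollars_per_tps(*cards["p40"])
for name, (price, tps) in cards.items():
    cost = dollars_per_tps(price, tps)
    print(f"{name}: ${cost:.2f} per tk/s ({cost / baseline - 1:+.0%} vs p40)")
```

With these inputs the $750 5070 TI is about 25% worse than the p40 per tk/s, close to the ~30% guess, while a $1000 card at the same speed would be about 67% worse. It only counts purchase price, not power draw or resale value.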