>>105761911
Yes, yes, that’s what they said about 16-, 32-, and 64-bit basic processor word lengths when moving up from 8, 16, and 32 bits respectively.
The xmm ops work like fp coprocessors: they slow everything else down, you usually have to shuttle the data into and out of them, they don’t have all the standard operations, they can’t be used as index registers, they often cause a processor context switch/escape when used, and some require extra setup overhead. It’s a pain in the ass.
Our company replaced asic with fpga, and then replaced fpga with xeon when they became fast enough, but it doesn’t use any simd shit, it’s mostly XOR operations.
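For anyone wondering what "mostly XOR, no simd" looks like in practice, here is a rough sketch, not our actual code. The function name xor_buffers and the buffer layout are made up for illustration. It XORs one buffer into another 64 bits at a time using only general-purpose registers, then mops up the leftover bytes. Note that an optimizing compiler may still auto-vectorize a loop like this unless you tell it not to.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: XOR src into dst, one 64-bit word at a time.
 * memcpy is used for the loads/stores so unaligned buffers are safe;
 * compilers normally turn each 8-byte memcpy into a single move. */
void xor_buffers(uint8_t *dst, const uint8_t *src, size_t len)
{
    assert(dst != NULL && src != NULL);

    size_t i = 0;

    /* Bulk of the work: whole 64-bit words in general-purpose registers. */
    for (; i + 8 <= len; i += 8) {
        uint64_t d, s;
        memcpy(&d, dst + i, 8);
        memcpy(&s, src + i, 8);
        d ^= s;
        memcpy(dst + i, &d, 8);
    }

    /* Tail: the remaining 0 to 7 bytes, one at a time. */
    for (; i < len; i++) {
        dst[i] ^= src[i];
    }
}
```

Calling it twice with the same src gives you back the original dst, which is the usual sanity check for this kind of thing.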