>>106384612
yes
..wait
| model | size | params | backend | ngl | n_batch | n_ubatch | fa | ot | mmap | test | t/s |
| ------------------------------ | ---------: | ---------: | ---------- | --: | ------: | -------: | -: | --------------------- | ---: | --------------: | -------------------: |
| glm4moe 106B.A12B Q3_K - Medium | 53.76 GiB | 110.47 B | CUDA | 100 | 4096 | 4096 | 1 | exps=CPU | 0 | pp32 | 0.00 ± 0.00 |
| glm4moe 106B.A12B Q3_K - Medium | 53.76 GiB | 110.47 B | CUDA | 100 | 4096 | 4096 | 1 | exps=CPU | 0 | pp64 | 0.00 ± 0.00 |
| glm4moe 106B.A12B Q3_K - Medium | 53.76 GiB | 110.47 B | CUDA | 100 | 4096 | 4096 | 1 | exps=CPU | 0 | pp128 | 0.00 ± 0.00 |
| glm4moe 106B.A12B Q3_K - Medium | 53.76 GiB | 110.47 B | CUDA | 100 | 4096 | 4096 | 1 | exps=CPU | 0 | tg128 | 0.00 ± 0.00 |

FUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUU