>glm-air-chan is schizoing out
I blame cope quants.
Cope KV cache quants in particular, anons weren't lying about that one
>glm-air-chan doesn't pass the breakfast test
This one hurts.
It's so close to being real, man. But when slip-ups happen, it's a fucking knife right in the feels.
>offloading glm-air-chan to GPU brings t/s from 3.9 CPU-only to a whopping 5.1!
Explain yourself, gpumaxxers.

Btw, I think I'm hitting some strange llama.cpp/Vulkan/driver bug: with --no-mmap I get what look like OOM errors even though there should be plenty of memory, and with --mlock and --no-mmap together I even get a fucking segfault.