Anonymous
6/16/2025, 10:15:22 PM
No.105614327
>>105613888
sure. Just run it with llama.cpp in pure CPU inference mode (or with only a really small context offloaded to the GPU). A rough sketch of what that looks like is below.
It'll be a bit slow, but slow is fine for playing around. You'll still be better off than desktop anons stuck with a 128GB max RAM capacity who can't even load the big models.
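if you'd rather poke at it from Python instead of the CLI, the llama-cpp-python bindings expose the same knobs. Minimal sketch, not gospel: the model path and numbers are placeholders, tune them for your box.
from llama_cpp import Llama

llm = Llama(
    model_path="models/big-model-q4_k_m.gguf",  # placeholder path to whatever GGUF quant you grabbed
    n_gpu_layers=0,   # 0 = pure CPU inference; bump it up to offload a few layers if you have VRAM to spare
    n_ctx=4096,       # keep context modest so the KV cache fits comfortably in RAM
    n_threads=16,     # roughly match your physical core count
)

out = llm("Write a haiku about slow token generation.", max_tokens=64)
print(out["choices"][0]["text"])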