Qwen-code (the gemini-cli fork) works just fine, but I'm running on CPU and it has to process 9k tokens with every request.