
## What is Kuzco?

Kuzco is a Swift package that runs large language models entirely on-device via llama.cpp, so no server, API key, or network connection is required.

## Features
- Run LLMs entirely on-device with llama.cpp
- Supports multiple model architectures, including LLaMA, Mistral, and Phi
- Async/await API with streaming responses
- Flexible tuning of context length, batch size, and sampling parameters
- Optimized for Apple Silicon and Intel Macs
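The async/await streaming API mentioned above typically delivers partial text as tokens are decoded. The sketch below shows that pattern in plain Swift; `generate` here is a stand-in stub, not Kuzco's actual function, and yields canned chunks through an `AsyncThrowingStream`, the same shape a real llama.cpp-backed generator would use. Consult the package documentation for the real API.

```swift
import Foundation

// Stand-in generator (assumption, not Kuzco's API): a real implementation
// would run llama.cpp inference and yield each decoded token as produced.
func generate(prompt: String) -> AsyncThrowingStream<String, Error> {
    AsyncThrowingStream { continuation in
        for token in ["On-device ", "inference ", "keeps data private."] {
            continuation.yield(token)
        }
        continuation.finish()
    }
}

// Consuming the stream: each chunk is printed as soon as it arrives,
// which is what lets a chat UI show the response while it is generated.
func runDemo() async throws {
    for try await chunk in generate(prompt: "Why run LLMs locally?") {
        print(chunk, terminator: "")
    }
    print()
}
```

Streaming matters on mobile hardware, where full responses can take several seconds: showing tokens incrementally keeps the interface responsive.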
## Use Cases
- Build private AI chat apps
- Enable offline text generation
- Integrate LLMs into macOS productivity tools
- Develop AI-powered iOS assistants
