Running a 1-Bit LLM on Windows — Bonsai 8B Setup Guide
06 April 2026
An 8B model that runs at 180+ tokens/sec on a laptop GPU using just 1.1 GB of VRAM. That's not a future promise — this is what 1-bit quantisation actually delivers today.
Thoughts on software engineering and life as a developer
02 March 2026
Senior full-stack engineer with 11+ years of experience in .NET, JVM, microservices, and production AI systems.
23 February 2025
If you work with financial data, you will run into the situation where you need to round calculated values to the nearest cent.