Microsoft's AI Runs on Low-End CPUs, Saving 6x Memory

Arkadiy Andrienko

Microsoft researchers have developed the BitNet b1.58 2B4T language model, which, despite its compact size, delivers results comparable to those of larger models. Its key feature is that it runs on ordinary CPUs, with no graphics accelerator (GPU) required. This opens up AI to devices with limited resources.

Instead of the standard 16- or 32-bit weights used by conventional models, BitNet uses ternary weights with just three possible values: -1, 0, and +1 (about 1.58 bits of information per weight, hence the name). This reduces the model's memory footprint to 400MB; by comparison, its closest competitor from Google, Gemma 3 1B, requires 1.4GB. The savings come from a fundamentally different approach to computation: because each weight is -1, 0, or +1, most multiplications can be replaced by simple additions and subtractions.
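The idea can be illustrated with a short sketch. The BitNet papers describe "absmean" quantization: scale each weight matrix by its mean absolute value, then round and clip to {-1, 0, +1}. The function names below are my own, and this NumPy version is a didactic approximation of that scheme, not Microsoft's implementation:

```python
import numpy as np

def absmean_ternary_quantize(w: np.ndarray):
    """Quantize a weight matrix to ternary values {-1, 0, +1} plus one scale."""
    gamma = np.abs(w).mean() + 1e-8          # absmean scale (epsilon avoids /0)
    q = np.clip(np.round(w / gamma), -1, 1).astype(np.int8)
    return q, gamma

def ternary_matvec(q: np.ndarray, gamma: float, x: np.ndarray) -> np.ndarray:
    """Matrix-vector product with ternary weights: no weight multiplications,
    only sums of inputs where the weight is +1 minus sums where it is -1."""
    out = np.array([x[q[i] == 1].sum() - x[q[i] == -1].sum()
                    for i in range(q.shape[0])])
    return gamma * out
```

Each ternary weight needs under 2 bits of storage instead of 16 or 32, and the inner loop avoids per-weight multiplications entirely, which is what makes CPU-only inference practical.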

In testing, BitNet was compared with models from Meta (LLaMa 3.2 1B), Google (Gemma 3 1B), and Alibaba (Qwen 2.5 1.5B). Despite its smaller memory footprint, Microsoft's model achieved an average score of 54.19 across the benchmark suite, surpassing LLaMa (44.90) and Gemma (43.74) and only slightly trailing Qwen (55.23), which occupies 6.5 times more memory. In certain text-analysis tasks, BitNet took the lead.
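The memory figures above are consistent with back-of-envelope arithmetic. This is only a rough sanity check under assumed bit widths; real models also store embeddings, activations, and quantization scales at higher precision:

```python
def model_size_mb(n_params: float, bits_per_param: float) -> float:
    """Approximate weight-storage footprint in megabytes."""
    return n_params * bits_per_param / 8 / 1e6

# ~2 billion ternary parameters at ~1.58 bits each
bitnet_mb = model_size_mb(2e9, 1.58)      # ≈ 395 MB, close to the reported 400MB

# a 1.5-billion-parameter model stored as 16-bit floats
fp16_mb = model_size_mb(1.5e9, 16)        # ≈ 3000 MB
```

The ratio between the two estimates is roughly the 6x-7x memory advantage the comparison with Qwen 2.5 1.5B reports.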

For maximum efficiency, the model requires the dedicated bitnet.cpp framework, available in an open GitHub repository. Standard tools such as the Hugging Face Transformers library do not fully unlock its potential. The developers note that the current version is optimized for CPUs, but future updates will add support for NPUs (neural processing units) and GPUs.

BitNet exemplifies the trend toward "lightweight" AI models. Such models reduce energy consumption and allow complex algorithms to run on devices without access to cloud services. This is especially relevant for regions with slow internet, and for confidential data that should not be transmitted to remote data centers. According to the developers, their goal is to make AI accessible without hardware upgrades, which could change how AI-powered applications are built.
