How Smart Is Elon Musk's "Frighteningly Smart" Chatbot?

February 18, 2025, 06:45 PM Savage_Division

xAI has introduced a new language model, Grok 3, which the company's founder, Elon Musk, called "the smartest AI on Earth." The chatbot's creators claim that the new version significantly surpasses the previous one: it was trained on a larger volume of data and features new self-correction mechanisms. A demo version of Grok 3 launched today, and the first reviews have already surfaced.

What’s New

Grok 3's key advantage is the amount of compute behind it. The model was trained on the Colossus supercomputer: its creators initially used 100,000 NVIDIA H100 GPUs and later doubled that number to 200,000. Computing power is expected to grow fivefold in the future.

Grok 3 includes built-in self-correction mechanisms. The AI analyzes its own responses, compares them with reference answers, and then makes adjustments. Interestingly, the chatbot receives "rewards" for accurate responses and "penalties" for so-called "hallucinations" — incorrect or fabricated information.
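xAI has not published the details of this training loop, so the following is only a toy illustration of the reward-and-penalty idea described above, not the company's actual method. The function names and the +1/-1 scoring scheme are assumptions made for the example.

```python
def score_answer(answer: str, reference: str) -> int:
    """Toy scoring rule (assumed): +1 if the answer matches the reference, -1 otherwise."""
    def normalize(text: str) -> str:
        # Ignore case and extra whitespace when comparing.
        return " ".join(text.lower().split())

    return 1 if normalize(answer) == normalize(reference) else -1


def total_reward(pairs) -> int:
    """Sum the scores over (answer, reference) pairs."""
    return sum(score_answer(answer, reference) for answer, reference in pairs)


if __name__ == "__main__":
    samples = [
        ("Paris", "paris"),   # correct: reward
        ("Lyon", "Paris"),    # fabricated: penalty
        ("  4 ", "4"),        # correct after normalization: reward
    ]
    print(total_reward(samples))
```

In a real system the score would feed back into training rather than simply being summed, and the comparison with the reference would be far more lenient than an exact string match.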

According to xAI representatives, Grok 3 is smarter than other models in math, natural sciences, and programming. Blind tests were used to evaluate response quality, meaning users were unaware of which chatbot was replying.
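To make the idea of a blind test concrete, here is a minimal sketch of how such a comparison could be set up. It is not xAI's evaluation code: the labels "Response 1" and "Response 2", the function names, and the model names in the usage example are all hypothetical.

```python
import random
from collections import Counter


def blind_pair(entry_a, entry_b, rng):
    """Shuffle two (model_name, answer) entries and hide the model names.

    Returns what the rater sees (label -> answer) and a hidden key (label -> model).
    """
    entries = [entry_a, entry_b]
    rng.shuffle(entries)
    labels = ["Response 1", "Response 2"]
    shown = {label: entry[1] for label, entry in zip(labels, entries)}
    key = {label: entry[0] for label, entry in zip(labels, entries)}
    return shown, key


def tally(votes, keys):
    """Map each rater's chosen label back to a model and count wins."""
    return Counter(key[vote] for vote, key in zip(votes, keys))


if __name__ == "__main__":
    rng = random.Random()
    shown, key = blind_pair(("model_a", "Answer text A"), ("model_b", "Answer text B"), rng)
    print(shown)                         # the rater sees only anonymous labels
    print(tally(["Response 1"], [key]))  # the vote is attributed after the fact
```

The point of the shuffle is that the rater's preference cannot be influenced by knowing which chatbot wrote which answer; the model names are only revealed when the votes are counted.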

During the Grok 3 presentation, xAI also showcased Deep Search — a "next-generation" search agent capable of quickly finding and analyzing information online. While similar features exist in competing models, xAI claims that Deep Search is more accurate.

Additionally, Grok 3 will soon receive a voice interface, allowing users to interact with it as if speaking to a real person. Its voice is said to sound more natural and expressive than competing models.

How It Performs in Practice

Users on the X social network can access the new chatbot by subscribing to X Premium+ for $50 per month. While there aren’t many early reviews of Grok 3 yet, some stand out.

For instance, a user named Penny2x reported building a fully functional game with the new version.

Andrej Karpathy also appreciated Grok 3's determination:

I like that the model will attempt to solve the Riemann hypothesis when asked to, similar to DeepSeek-R1 but unlike many other models that give up instantly (o1-pro, Claude, Gemini 2.0 Flash Thinking) and simply say that it is a great unsolved problem. I had to stop it eventually because I felt a bit bad for it, but it showed courage, and who knows, maybe one day...

There were drawbacks as well. Karpathy raised a few concerns about the Deep Search agent:

…the model doesn't seem to like to reference X as a source by default, though you can explicitly ask it to. A few times I caught it hallucinating URLs that don't exist. A few times it said factual things that I think are incorrect and it didn't provide a citation for it (it probably doesn't exist).

In conclusion, Andrej Karpathy noted that, based on initial impressions, Grok 3 has approached the level of OpenAI’s top models, such as o1-pro ($200 per month), and even slightly surpasses DeepSeek-R1 and Gemini 2.0 Flash Thinking. Considering that the xAI team started developing this AI from scratch about a year ago, the progress is impressive. However, more comprehensive tests are needed before determining whether the chatbot truly deserves the title of "the smartest."

Bias Concerns

It’s no secret that Elon Musk actively participates in U.S. political life and openly expresses his views. Some internet users worry that Grok 3 might also push certain narratives.

These concerns are not unfounded: Musk shared a screenshot in which the chatbot criticizes a news outlet while praising X as the most reliable source of information. This is despite Grok 3 being positioned as a product with minimal censorship. Many people believe that AI should remain neutral in its judgments.

***

Regardless, the launch of another promising language model marks an important milestone in the ongoing AI race. The stiffer the competition, the faster progress comes.