Can DeepSeek disrupt AI’s future and shift the balance of power?


Last week, the relatively unknown Chinese AI company DeepSeek shook the tech world when it launched an AI model that rivals the performance of OpenAI’s models. With a fraction of the typical cost and using far less powerful hardware than its competitors, DeepSeek is already challenging the status quo of AI development and raising questions about the future of global tech supremacy.

The Chinese company’s standout moment came on January 20 with the launch of its R1 reasoning model, which, the company claims, rivals OpenAI’s o1 model. More surprising than the model’s capabilities is its low training cost—around $5 million, a figure that has experts re-examining assumptions about how much GPU power frontier-model development really requires.

DeepSeek’s claim that it achieved such impressive results with relatively inexpensive hardware has raised eyebrows and spurred debate across the tech world. Top AI engineers in the US have commended DeepSeek’s research for showing clever ways to build AI with fewer chips.

The startup’s engineers found a more efficient way to process training data on the chips available to them. AI systems learn by finding patterns in large amounts of data, like text, images, and sounds. DeepSeek used a method called “mixture of experts,” spreading the analysis across several specialized models, while reducing the time spent moving data around.
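The routing idea behind a mixture of experts can be sketched in a few lines: a small gating network scores every expert, and only the top-scoring few actually run for a given input, so most of the network stays idle on each step. This is a toy illustration of top-k routing in NumPy, not DeepSeek's actual architecture, which adds many more experts and load-balancing refinements.

```python
import numpy as np

def moe_forward(x, experts_w, gate_w, top_k=2):
    """Toy mixture-of-experts layer with top-k routing.

    x         : (d,) input vector
    experts_w : list of (d, d) weight matrices, one per expert
    gate_w    : (n_experts, d) gating weights
    """
    scores = gate_w @ x                        # one routing score per expert
    top = np.argsort(scores)[-top_k:]          # indices of the top-k experts
    # Softmax over the selected experts only.
    w = np.exp(scores[top] - scores[top].max())
    w /= w.sum()
    # Only the chosen experts compute; the rest are skipped entirely --
    # that skipped work is where the efficiency win comes from.
    return sum(wi * (experts_w[i] @ x) for wi, i in zip(w, top))

rng = np.random.default_rng(0)
d, n_experts = 4, 8
experts = [rng.standard_normal((d, d)) for _ in range(n_experts)]
gate = rng.standard_normal((n_experts, d))
y = moe_forward(rng.standard_normal(d), experts, gate)
print(y.shape)  # (4,)
```

With eight experts and `top_k=2`, only a quarter of the expert parameters are touched per input, which is the basic trade the technique makes: more total parameters, less compute per token.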

Prominent venture capitalist Marc Andreessen called DeepSeek’s success “AI’s Sputnik moment,” referring to the shockwaves caused by the Soviet Union’s launch of the first satellite in 1957. With US stocks, particularly those of Nvidia, tumbling following the news, many are questioning whether American companies are still on top in the AI race.

In the wake of DeepSeek’s rise, comments from US political and tech figures, including President Donald Trump and OpenAI co-founder Sam Altman, have added fuel to the fire. Stocks tied to the AI sector saw some of their steepest declines in years, and the tremors were felt as far as Silicon Valley. Nvidia, the semiconductor giant that powers much of the AI industry, saw its largest single-day loss ever, prompting many to re-evaluate the status quo.

One of the most talked-about aspects of DeepSeek’s breakthrough is its approach to training models. Despite using what many would consider low-end hardware, including the H800 GPUs—far less powerful than the H100s used by industry leaders—the company has made strides in maximizing efficiency. Experts like Marin Smiljanic, CEO of AI startup Omnisearch, point out that DeepSeek’s use of distillation and other optimizations allowed them to sidestep the usual reliance on high-end processing power.
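Distillation, as mentioned, trains a compact “student” model to mimic the output distribution of a larger “teacher.” A minimal sketch of the classic temperature-scaled KL objective follows; this is the generic technique, not DeepSeek’s exact recipe, and the function names here are illustrative.

```python
import numpy as np

def softmax(z, T=1.0):
    """Temperature-scaled softmax; higher T softens the distribution."""
    z = np.asarray(z, dtype=float) / T
    e = np.exp(z - z.max())
    return e / e.sum()

def distill_loss(teacher_logits, student_logits, T=2.0):
    """KL(teacher || student) on softened distributions.

    The student is penalized where its distribution diverges from the
    teacher's; the T*T factor keeps gradient scale comparable across T.
    """
    p = softmax(teacher_logits, T)
    q = softmax(student_logits, T)
    return float(np.sum(p * (np.log(p) - np.log(q))) * T * T)

# Identical logits give zero loss; diverging logits give a positive loss.
print(distill_loss([2.0, 1.0, 0.1], [2.0, 1.0, 0.1]))  # 0.0
print(distill_loss([2.0, 1.0, 0.1], [0.1, 1.0, 2.0]) > 0)  # True
```

Minimizing this loss lets a small model inherit much of a large model’s behavior without repeating the large model’s full training run, which is why distillation shows up wherever compute is the binding constraint.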


“It’s definitely too early to call DeepSeek the established leader in reasoning models, since OpenAI’s o3 is more powerful, but the ~$5M training cost is insanely good and they’re definitely the leader in efficiency,” commented Smiljanic, highlighting the significance of DeepSeek’s success in terms of cost-effectiveness.

This efficient approach challenges conventional wisdom about the massive GPU fleets needed for state-of-the-art AI training. As the industry grapples with DeepSeek’s unexpected success, it’s clear that other companies may need to reassess their own strategies—particularly as demand for GPUs is expected to surge.

“OpenAI and Anthropic are probably in for a rough couple of months, as is NVIDIA. There is, though, a bull case for NVIDIA with many smaller LLM startups generating more demand for their chips. The rest of Big Tech will likely be fine,” Smiljanic explains.

However, despite the buzz around DeepSeek, it is too early to declare it the undisputed leader in reasoning models. Machine learning engineer and tech entrepreneur Aleksa Gordic acknowledged that while DeepSeek’s models are impressive, OpenAI’s o3 is still more powerful. 


At the same time, DeepSeek’s models, especially the open-source DeepSeek-R1, are already making waves in the open-source AI community, with many praising the company’s willingness to release its models for public use under an MIT license.

“The Zero model achieves amazing reasoning capabilities,” Gordic remarked, adding that DeepSeek’s multi-stage training process, which includes reinforcement learning and supervised fine-tuning, has led to the emergence of advanced capabilities in their models, such as reflection and exploration of alternatives. “I definitely didn’t expect Chinese labs to be leading in open-source AI,” Gordic adds.
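The reinforcement-learning stage Gordic refers to rewards the model for producing verifiably correct, well-formatted reasoning rather than relying on a learned reward model. A toy version of that kind of rule-based reward is sketched below; the tag names and score values are hypothetical simplifications, not DeepSeek’s actual reward code.

```python
import re

def reasoning_reward(completion: str, gold_answer: str) -> float:
    """Toy rule-based reward: a small bonus for using the expected
    <think>/<answer> format, plus a larger bonus when the extracted
    final answer matches the known-correct one."""
    fmt_ok = bool(re.search(r"<think>.*?</think>", completion, re.S))
    m = re.search(r"<answer>(.*?)</answer>", completion, re.S)
    acc_ok = m is not None and m.group(1).strip() == gold_answer
    return (0.5 if fmt_ok else 0.0) + (1.0 if acc_ok else 0.0)

good = "<think>2 + 2 makes 4</think><answer>4</answer>"
print(reasoning_reward(good, "4"))          # 1.5
print(reasoning_reward("answer: 4", "4"))   # 0.0
```

Because the reward is checkable by a program (did the answer match, was the format respected), the RL loop needs no human labels for each example, which is part of why this style of training is comparatively cheap.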

Yet, as Martin Vechev, ETH Zurich professor and founder of Bulgaria-based INSAIT (Institute for Computer Science, Artificial Intelligence and Technology) points out, DeepSeek’s models are still specialized, with R1 excelling in reasoning tasks but not necessarily in multilingual applications. 

“The DS series of models from China have been public for years. They are developed by strong researchers and engineers, who publish what they do at conferences and are constantly improving the models, making them public, etc.,” Vechev says.

“I expect we will see more benchmarks where R1/o1 do not work great. Then Google/OpenAI will release models that do well, then someone with access to more GPUs will still make an open version that is similar quality (DS or someone else). However, to make such a version, one will need more compute than $5M and closer to $50-$100M even for more specialized models,” the Bulgarian scientist explains.


Beyond the technical details, DeepSeek’s rise raises geopolitical questions. With tensions between the U.S. and China already high, DeepSeek’s rapid ascent is likely to fuel further scrutiny. Concerns about the potential risks of sending data to China are already circulating, though experts like Smiljanic argue that these fears are often overstated.

“Concerns around shipping data to China are overblown,” Smiljanic noted, pointing out that DeepSeek’s models are open-weights and can be run locally, much like other models from companies like Meta.

Still, the international implications are undeniable. As AI continues to become an increasingly important part of national security and global competition, the race to develop the most powerful, cost-effective AI systems is only going to intensify.

Furthermore, U.S. companies, while still at the forefront in some areas, may need to adapt quickly to stay competitive, particularly as smaller start-ups harness the power of cost-efficient, open-source AI.

Now, the real question moving forward is whether the rise of DeepSeek will mark the beginning of a new era in AI—one where the lines of power and innovation are not solely drawn by American companies, but by a global community of tech pioneers. 
