Between Praise and Caution
Demis Hassabis, CEO of Google DeepMind, referred to the Chinese AI model DeepSeek as "the best work coming out of China" but cautioned that the media hype around it was "exaggerated." The statement followed the company's release of DeepSeek-R1, a model that rivals leading systems such as OpenAI's o1 at a reported cost roughly 90% lower. While DeepSeek's model is being hailed as a breakthrough, it also raises important questions about the fine line between true innovation and clever engineering.
This article explores how executives (CEOs) can learn from this case to enhance innovation strategies, manage expectations effectively, and apply the lessons of DeepSeek to real-world scenarios. Whether the focus is on cost-effective models, geopolitical competition, or media management, there are valuable takeaways for leaders in every industry.
1. Smart Engineering Over Raw Power: The DeepSeek Key
A. Low Cost, High Efficiency
- DeepSeek spent only $5.6 million to train its base model V3, significantly less than the billions that U.S. companies typically invest in AI model development. The ability to create a model with similar capabilities at such a low cost is remarkable and speaks to the company's engineering ingenuity.
- The company relied on Group Relative Policy Optimization (GRPO), a reinforcement learning algorithm that dispenses with the separate critic model used in conventional approaches, to keep training costs down. According to the company, this reduced its dependence on costly human feedback by roughly 70%, cutting operational costs while maintaining quality.
- Even more impressive, DeepSeek trained on Nvidia H800 chips, a variant with deliberately capped interconnect bandwidth sold to comply with U.S. export controls. The company extracted more performance from this constrained hardware by writing parts of its training stack in PTX, Nvidia's low-level assembly-like instruction set, rather than relying solely on standard CUDA libraries.
This intelligent approach to cost reduction and optimization is crucial for CEOs aiming to maximize returns without relying on massive budgets. DeepSeek's success highlights that smart engineering and optimization are often more valuable than throwing additional resources at a problem.
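The GRPO idea can be sketched concretely. Instead of training a separate value network as a baseline, GRPO samples a group of responses per prompt and normalizes each response's reward against the group's mean and standard deviation. The snippet below is a minimal illustration of that advantage calculation, not DeepSeek's actual code; the function name and example rewards are invented for the sketch.

```python
import statistics

def group_relative_advantages(rewards, eps=1e-8):
    """Illustrative GRPO-style advantages for one prompt.

    Each element of `rewards` is the scalar reward for one of G
    responses sampled from the current policy for the same prompt.
    The group mean serves as the baseline, so no learned critic
    (value model) is needed.
    """
    mean = statistics.fmean(rewards)
    std = statistics.pstdev(rewards)
    return [(r - mean) / (std + eps) for r in rewards]

# Example: four sampled answers scored by a rule-based reward
# (1.0 = correct, 0.0 = incorrect); values here are made up.
advs = group_relative_advantages([1.0, 0.0, 1.0, 0.0])
# Above-average answers get positive advantage, below-average
# ones negative, steering the policy toward the better answers.
```

Because the baseline comes from simple group statistics rather than a second large network, the memory and compute cost of a critic model disappears, which is one of the levers behind the low training bill.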
B. Post-Training Innovations
- One of DeepSeek's most notable steps came after the pre-training phase. The company used automated, rule-based reinforcement learning in place of much of the traditional reinforcement learning from human feedback (RLHF). RLHF, while effective, is expensive and slow because human annotators must score model outputs. By automating this feedback loop, DeepSeek reduced both the cost and the time spent on post-training while still improving the model's capabilities.
- DeepSeek also invested in Distilled Models, smaller versions of their larger models that could run efficiently on laptops or mobile devices. These models maintain performance close to their larger counterparts while drastically reducing the computational requirements. This innovation could revolutionize AI deployment by making powerful models accessible to a broader range of users and devices.
These innovations prove that post-training engineering and optimization are just as important as the initial training phase. CEOs should consider how to maximize the use of their models and resources throughout the entire lifecycle, ensuring efficiency and scalability.
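Distillation of the kind described above is, in standard practice, a matter of training a small "student" model to match the softened output distribution of a large "teacher." The toy snippet below illustrates the usual temperature-scaled KL objective; it is a generic sketch of the technique, not DeepSeek's published recipe, and all names and numbers are illustrative.

```python
import math

def softmax(logits, temperature=1.0):
    """Convert raw logits into a probability distribution."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL divergence between softened teacher and student outputs.

    A temperature above 1 flattens the teacher's distribution,
    exposing the relative probabilities it assigns to non-top
    answers, which the student then learns to imitate.
    """
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

# A student whose logits mirror the teacher's incurs ~zero loss;
# a student that ranks the answers backwards incurs a large one.
aligned = distillation_loss([4.0, 1.0, 0.5], [4.0, 1.0, 0.5])
misaligned = distillation_loss([4.0, 1.0, 0.5], [0.5, 1.0, 4.0])
```

Minimizing this loss over a large corpus of teacher outputs is what lets a compact model inherit much of a larger model's behavior at a fraction of the inference cost.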
Lesson for CEOs:
"Focusing on optimizing engineering processes might be more impactful than just increasing financial or computational resources."
2. Managing Expectations: Between Media Hype and Technological Reality
A. Praise for Chinese Progress... with Caution
- While DeepSeek’s model is being celebrated for its engineering excellence, Hassabis was quick to point out that it is not a "scientific invention" but rather a clever reassembly of existing techniques. This subtle yet important distinction reminds us that true innovation does not always require a complete breakthrough but can instead stem from the ability to reimagine existing technologies in new ways.
- DeepSeek's bold claim that it was able to develop a highly cost-efficient model raised questions among experts. Some speculated that the initial research and development costs could be far higher than the company let on, possibly up to 10 times the amount stated. While the company’s figures are impressive, transparency about the real costs is essential to avoid potential backlash from investors or the market.
B. The Risks of Over-Promotion
- The announcement of DeepSeek’s advancements sent shockwaves through the tech industry, and the stock market reacted swiftly. Nvidia’s share price fell roughly 17% in a single day, erasing close to $600 billion in market value and dragging other U.S. tech stocks down with it, highlighting how sensitive the market is to hype and speculation.
- Hassabis warned that while DeepSeek’s model is impressive, the overwhelming media attention could divert attention and capital away from genuine scientific advances. Companies that lean too heavily on marketing risk overselling their capabilities, which leads to disappointed investors and lost credibility.
For CEOs, this serves as a cautionary tale about the dangers of hype. It’s important to underpromise and overdeliver, rather than the reverse. Transparency in disclosing capabilities and costs is crucial for maintaining investor trust and market stability.
Lesson for CEOs:
"Transparency in disclosing capabilities and costs protects reputation and avoids market shocks."
3. Geopolitical Implications: How DeepSeek Reshapes Global Competition
A. China as an Emerging AI Power
- China has become a major force in the AI landscape, by some counts accounting for 36% of the world’s AI models, compared with 18% for the United States. This rapid rise has been fueled by heavy investment in AI education and infrastructure: with over 440 universities offering AI degrees, China is building a talent pipeline that could outpace the U.S. in the near future.
- One of China’s key advantages is its ability to innovate even under sanctions. The country’s tech companies have had to become more resourceful and efficient, finding ways to build world-class AI models without relying on cutting-edge Western technologies.
B. Western Reactions
- In response to China’s rising power in AI, former Google CEO Eric Schmidt called for increased investment in open-source models. He believes that open-source AI is essential for democratizing technology and ensuring that the U.S. remains competitive in the global race.
- The Stargate project, a U.S. initiative with a planned investment of up to $500 billion over four years, aims to overhaul the nation’s AI infrastructure. This ambitious plan is designed to strengthen America’s position in the global AI market and counter China’s growing influence.
Lesson for CEOs:
"Global competition requires strategic alliances between the public and private sectors, not just an individual race."
4. Five Key Lessons for CEOs
- Focus on Engineering Efficiency: invest in improving algorithms instead of blindly scaling computational resources.
- Manage Expectations Through Transparency: avoid exaggerating capabilities in marketing, which only sets investors up for disappointment.
- Innovate Under Pressure: turn obstacles such as sanctions into opportunities for unconventional solutions.
- Embrace Open Collaboration: join open-source initiatives to share knowledge and reduce costs.
- Prepare for AGI: develop proactive plans for artificial general intelligence, which some industry leaders expect within 3-5 years.
Conclusion: The Balance Between Ambition and Realism
The DeepSeek case shows that success in AI is not just about financial resources but about creativity in optimizing processes and intelligent management of expectations. While China presents a geopolitical challenge, the biggest lesson for CEOs is that "true innovation begins by understanding current limits, then surpassing them in unexpected ways."