China's AI Startups Rise Despite Chip Restrictions: Innovation Amidst Challenges
Despite US export restrictions on advanced chips, Chinese AI startups are demonstrating remarkable speed in catching up to leading US AI models, exceeding the expectations of many industry observers. This article delves into how Chinese AI startups are making progress under these constraints, examining the challenges and opportunities they face.
Hangzhou-based DeepSeek released a preview version of its large language model in November that reportedly matches the capabilities of o1, the reasoning model OpenAI released in September. This is not an isolated case; other Chinese companies have made similar claims. Moonshot AI, a startup backed by Alibaba and Tencent, claims its mathematics-focused model performs close to o1, and Alibaba asserts that its experimental research model outperforms o1 in mathematics.
However, these claims require careful consideration. Detailed papers substantiating the models' performance have not been released, and the lack of standardized benchmarks for AI model capabilities makes objective verification difficult. Nevertheless, some US experts acknowledge the Chinese models' progress. Andrew Carr, a former OpenAI researcher who is now an AI entrepreneur, says the Chinese AI field is in a period of "rapid catch-up," and that many were surprised that DeepSeek replicated OpenAI's reasoning model within a few months.
The American Invitational Mathematics Examination (AIME) is a common test of model performance. DeepSeek claims its model outperforms OpenAI's on AIME. In US media experiments on this year's 15 AIME questions, OpenAI's o1 preview model was faster, but all of the models, including those from DeepSeek, Moonshot AI, and Alibaba, gave correct answers on the first attempt, a significant achievement in itself. For example, on a word puzzle about strategy in a two-player game, OpenAI's model reached a solution in 10 seconds, while DeepSeek's took over two minutes.
Since 2022, Chinese AI developers have faced US export restrictions on advanced AI chips, such as Nvidia's high-performance processors, and the Biden administration tightened those restrictions further in December 2024. Even so, Chinese developers have been inventive in finding workarounds.
Yang Zhilin, founder of Moonshot AI, says the company focuses on reinforcement learning, which mimics human trial and error, as a way to improve model capabilities while reducing the computing resources required. Mixture-of-Experts (MoE) technology has also gained traction since late last year. MoE routes each problem to the expert model best suited to handle it, much as a restaurant's head chef assigns dishes to different cooks based on the orders, which reduces reliance on high-performance chips. Tencent announced in November that its latest MoE model matches the performance of Meta's Llama 3.1, released in July 2024. US researchers who reviewed both companies' papers suggest that Tencent's model may have been trained with only about a tenth of the computing resources.
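The routing idea behind MoE can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the design used by Tencent, DeepSeek, or any other company named here: the class name `TinyMoE`, the random weights, and the choice of 8 experts with 2 active per input are all hypothetical. In production models, routing typically happens per token inside the network's layers rather than per "problem," but the principle is the same: a small router scores the experts, and only the top-scoring few do any work.

```python
import math
import random


def softmax(scores):
    """Turn raw router scores into probabilities that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


class TinyMoE:
    """Hypothetical toy MoE layer: a router plus several linear 'experts'."""

    def __init__(self, dim, num_experts, top_k=2, seed=0):
        rng = random.Random(seed)
        self.top_k = top_k
        # Router: one scoring vector per expert (random, for illustration only).
        self.router = [[rng.gauss(0, 1) for _ in range(dim)]
                       for _ in range(num_experts)]
        # Each expert is a single dim x dim linear map (assumed, for brevity).
        self.experts = [[[rng.gauss(0, 0.1) for _ in range(dim)]
                         for _ in range(dim)]
                        for _ in range(num_experts)]

    def route(self, token):
        """Return (expert_index, weight) pairs for the top_k experts."""
        probs = softmax([dot(w, token) for w in self.router])
        ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
        chosen = ranked[:self.top_k]
        norm = sum(probs[i] for i in chosen)
        # Renormalize so the selected experts' weights sum to 1.
        return [(i, probs[i] / norm) for i in chosen]

    def forward(self, token):
        """Run only the selected experts and blend their outputs."""
        out = [0.0] * len(token)
        for idx, weight in self.route(token):
            expert_out = [dot(row, token) for row in self.experts[idx]]
            out = [o + weight * e for o, e in zip(out, expert_out)]
        return out


moe = TinyMoE(dim=4, num_experts=8, top_k=2)
token = [0.5, -1.0, 0.25, 2.0]
print(moe.route(token))    # which 2 of the 8 experts handle this input
print(moe.forward(token))  # blended output from those 2 experts only
```

With 8 experts and 2 active per input, only a quarter of the expert weights are used for any given input. That is the source of the compute savings the paragraph above describes: the model can hold many parameters while spending far less computation on each input than a dense model of the same size.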
DeepSeek, originally the AI research arm of High-Flyer, a quantitative hedge fund managing $8 billion, built the Fire-Flyer2 AI training cluster in 2021 using approximately 10,000 Nvidia A100 chips. An August 2024 DeepSeek paper states that Fire-Flyer2 performs comparably to systems built on similar Nvidia chips, but at significantly lower cost and energy consumption. The company's May 2024 paper on its MoE model, which highlighted more efficient data processing techniques, also drew significant attention. Anthropic co-founder Jack Clark wrote on his blog that DeepSeek's Fire-Flyer2 cluster exemplifies China's response to export controls: working around them by building a superior software and hardware stack, and demonstrating strong competitiveness in AI models.
Even so, many Chinese AI developers still obtain restricted Nvidia chips through intermediaries and overseas data centers. Chinese executives say the scarcity of advanced chips remains a major bottleneck, and the gap may widen as Nvidia's customers prepare for large-scale deployment of Blackwell, its latest AI data center chip.
Meanwhile, US companies continue to ramp up investment. Elon Musk's xAI has built a data center with 100,000 Nvidia chips and secured $5 billion for further expansion, and Amazon's AWS plans to use hundreds of thousands of its own custom-designed chips to build an AI supercomputer of unprecedented scale.
DeepSeek focuses on open-source model research, particularly in mathematics and programming. Moonshot AI has gained traction with Chinese consumers through Kimi, its ChatGPT-like chatbot, which is known for handling long texts well. However, Chinese AI startups are valued far below US companies such as OpenAI, which was recently valued at $157 billion, and they face financing challenges.
Intense market competition has also sparked price wars among AI model providers. Beijing-based Zhipu AI, for instance, reportedly delayed its IPO (initially planned for late 2025) because investment bankers believed it would struggle to achieve its target valuation. In its latest funding round, Zhipu AI was valued at approximately $3 billion. The company demonstrated its AI agent in late November and in July released a video generation model similar to OpenAI's Sora.
Howard Huang, a former infrastructure executive at a Chinese AI company, describes the Chinese AI industry as "dancing in chains." In his view, focusing on strengths is the only survival strategy, and potentially the key to global competitiveness.
In conclusion, Chinese AI startups have made impressive progress despite chip restrictions, adapting through innovation and strategic adjustments. However, the shortage of advanced chips, relatively low valuations, and fierce market competition remain significant hurdles. Whether these companies can secure a prominent position in the global AI landscape remains to be seen.