Friday, January 31

A new AI assistant from China has Silicon Valley talking

Silicon Valley has been rocked by a little-known lab in China.

The U.S. tech industry has spent the week debating what the abrupt arrival of an advanced AI assistant from DeepSeek, a little-known firm in the Chinese city of Hangzhou, implies about the larger AI development race.

The AI models that power DeepSeek’s assistant already outperform the best models in the United States despite being built with a fraction of the resources, according to the company. The assistant recently ranked number one in the Apple App Store.

A week ago, DeepSeek released R1, its most recent large language model. R1 already outperforms a number of other models, including Google’s Gemini 2.0 Flash, Anthropic’s Claude 3.5 Sonnet, Meta’s Llama 3.3-70B, and OpenAI’s GPT-4o. It ranks second only to OpenAI’s o1 model in the Artificial Analysis Quality Index, a widely used independent AI analysis ranking.

Entrepreneur Marc Andreessen, who co-authored Mosaic, one of the first web browsers, wrote on X on Sunday that DeepSeek R1 is AI’s “Sputnik moment,” comparing it to the event in the space race between the U.S. and the USSR that made the U.S. realize its technological prowess was not unchallenged.

Tech stocks declined significantly on Monday, with the Nasdaq Composite falling 3.4% in the first few minutes of trading. Major U.S. tech corporations are investing hundreds of billions of dollars in AI technology.

One of R1’s primary strengths is its use of chain-of-thought reasoning, which breaks difficult tasks into manageable steps and explains each one. This technique lets the model go back and revise earlier steps, simulating human thought, while still letting users follow its reasoning.
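The idea can be illustrated with a minimal sketch. Everything here is illustrative, not DeepSeek’s actual implementation: the prompt template, the hypothetical model reply, and the `extract_steps` helper are all assumptions made for the example, showing how a chain-of-thought prompt asks for numbered, revisable steps and how those steps can then be surfaced for the user to inspect.

```python
# Illustrative sketch of chain-of-thought prompting (not DeepSeek's real code).
# The template asks for numbered steps and permits explicit revisions of
# earlier steps, mimicking the backtracking behavior described above.

COT_PROMPT_TEMPLATE = """Solve the problem below. Think step by step,
numbering each step. If a later step reveals a mistake, write
"Revise step N:" and correct it before giving the final answer.

Problem: {problem}
"""

def build_cot_prompt(problem: str) -> str:
    """Wrap a task in a chain-of-thought instruction."""
    return COT_PROMPT_TEMPLATE.format(problem=problem)

def extract_steps(model_output: str) -> list[str]:
    """Pull the numbered reasoning steps (and any revisions) out of a
    model reply, so a user can audit the chain of thought."""
    steps = []
    for line in model_output.splitlines():
        line = line.strip()
        if line and (line[0].isdigit() or line.startswith("Revise step")):
            steps.append(line)
    return steps

# Hand-written example reply (no API call is made here):
reply = """1. The train covers 120 km in 2 hours.
2. Speed = distance / time = 120 / 2 = 60 km/h.
Revise step 1: the distance was 150 km, not 120 km.
3. Speed = 150 / 2 = 75 km/h.
Final answer: 75 km/h."""

print(extract_steps(reply))
```

The key point the sketch captures is that the reasoning is emitted as explicit, inspectable steps, and a mistake in step 1 is corrected mid-stream rather than silently carried through to the answer.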


Microsoft CEO Satya Nadella, whose company is one of OpenAI’s largest investors, described DeepSeek’s new approach as “very impressive” during last week’s World Economic Forum in Switzerland. He also said he thinks the advancements coming out of China should be taken very seriously.

R1 and o1 belong to a new class of reasoning models designed to tackle more challenging problems than earlier generations of AI models. In contrast to OpenAI’s o1, however, DeepSeek’s R1 is open weight and free to use, so anybody may examine and replicate its design.

R1 was based on DeepSeek’s prior model, V3, which had also outperformed GPT-4o, Llama 3.3-70B, and Alibaba’s Qwen2.5-72B, China’s previous top AI model. When V3 was released in late December, its performance was comparable to that of Claude 3.5 Sonnet.

DeepSeek’s claims about how R1 was developed are part of what makes it so remarkable.

According to a DeepSeek technical report, R1 was built in just two months for less than $6 million, even as major U.S. tech companies continue to spend billions of dollars annually on AI. DeepSeek was also forced to build its models on less powerful chips, because U.S. export controls restrict access to the top AI computer chips.
