DeepSeek’s R2 model delayed

Published in AI

by Nick Farrell on 14 August 2025

Beijing’s homegrown chip dream still can’t train properly

Chinese AI outfit DeepSeek has been forced to eat humble pie after its shiny new model flopped on Huawei’s Ascend chips, dragging a planned May launch into the long grass and handing rivals the chance to surge ahead.

The startup had been nudged by Beijing’s tech commissars to ditch Nvidia kit and embrace the Ascend processor when developing its R2 model. According to three insiders, the company gamely gave it a go after launching its R1 in January. It didn’t go well.

Persistent technical hiccups meant training the model on Ascend was a non-starter. In the end, DeepSeek reverted to Nvidia silicon for training and stuck with Huawei’s gear for inference. That halfway-house solution was chosen only after the Ascend effort faceplanted completely.

One source familiar with the mess said the training failures were the main reason the R2 launch was kicked past May, costing the company valuable time in an AI arms race where seconds matter.

Training, for the record, is the bit where an AI sucks up vast data sets like a digital hoover. Inference is when it tries to sound clever in response to user questions, which is what your average chatbot does when it’s not hallucinating.

Huawei sent in the cavalry, dispatching engineers to DeepSeek’s office to try and get the Ascend-based model working. It didn’t help. Even with on-site support, DeepSeek still couldn’t get through a single successful training run.

The case lays bare what most people in the industry already know: China’s chip contenders still trail the US in areas that matter. Huawei and Cambricon may fly the red flag, but their silicon lags behind Nvidia’s kit in terms of software, speed and general reliability.

That’s despite Beijing ramping up pressure on domestic companies to stop handing wads of cash to Nvidia. The Financial Times reported that Chinese firms have now been told to justify any orders for Nvidia’s H20 chip, the last export-legal GPU it can flog to China.

Insiders say Huawei’s Ascend chips are still prone to gremlins and dodgy interconnect speeds. The software isn’t up to scratch either. None of that is especially helpful if you’re trying to train a next-gen large language model.

DeepSeek is said to be working with Huawei to get Ascend working for inference at least. Internally, founder Liang Wenfeng has apparently made it known he’s less than thrilled with R2’s progress and wants more time to push for a stronger follow-up model to keep ahead in the AI race.

Another source said the launch was also delayed by a data-labelling bottleneck, though Chinese media reckon the model might still crawl out of the lab sometime soon.

Meanwhile, competition is heating up. AI researcher Ritwik Gupta from the University of California, Berkeley, pointed out that Alibaba’s Qwen3 model nicked DeepSeek’s best ideas and made them work.

“Models are commodities that can be easily swapped out,” Gupta said. He added that Ascend chips are still in the growing pains phase but might catch up later.

“Just because we’re not seeing leading models trained on Huawei today doesn’t mean it won’t happen in the future. It’s a matter of time,” he said.

Nvidia, for its part, doesn’t appear too rattled. It recently cut a deal to hand over a slice of China revenue to the US government in order to keep selling H20 chips to the Middle Kingdom.

“Developers will play a crucial role in building the winning AI ecosystem,” said Nvidia, while politely reminding Washington that throwing China to the wolves might not be the best look for American national security.

Last modified on 14 August 2025
