DeepSeek has finally released its much-anticipated next-generation foundational artificial intelligence model, the open-source V4, which it said was competitive with leading US closed-source models from the likes of OpenAI and Google DeepMind.
The Hangzhou-based AI start-up released two versions of the model on Friday. The V4-pro model has 1.6 trillion parameters, making it the company’s biggest-ever model by that metric, while the smaller V4-flash model has 284 billion parameters.
Both models have a context window of 1 million tokens – a critical specification that determines how much information an AI system can take in at once – which DeepSeek said was achieved with “world-leading” cost efficiency.
Prior to V4’s release, US officials accused DeepSeek of using banned Nvidia Blackwell chips to train its models. Meanwhile, American tech outlet The Information reported that the model was optimised to run on Huawei Technologies’ Ascend 950PR chips rather than US chips.
DeepSeek did not disclose the hardware stack used to train V4. However, in an extended technical report, it mentioned the development of “kernels” – low-level code that controls how graphics processing units (GPUs) carry out computations – adapted to both Nvidia and Huawei chips.
The company said that the throughput of V4-pro was currently limited by a shortage of computing capacity, adding that “prices will drop significantly” in the second half of the year “once Huawei’s Ascend 950PR super nodes ship at scale”.