DeepSeek’s proposed “mHC” architecture could transform the training of large language models (LLMs) – the technology behind artificial intelligence chatbots – as developers look for ways to scale models without simply adding more computing power.

However, experts cautioned that while the approach could have far-reaching implications, it might prove difficult to put into practice.

In a technical paper released last week, co-authored by DeepSeek founder and CEO Liang Wenfeng, the company proposed Manifold-Constrained Hyper-Connections (mHC), a method designed to address the training instability of Hyper-Connections (HC), a network structure introduced by Chinese tech giant ByteDance in 2024.

HC was developed to address limitations of residual connections, the shortcut technique popularised by Residual Networks (ResNet), an architecture that underpins many modern deep-learning models, including LLMs.
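To give a rough sense of the distinction, the sketch below contrasts a standard residual connection with a simplified hyper-connection-style block in PyTorch. It is an illustrative approximation only, not the formulation in the ByteDance or DeepSeek papers: the class names, the number of parallel streams, and the way the read, write and mixing weights are parameterised are assumptions made for the example.

```python
import torch
import torch.nn as nn


class ResidualBlock(nn.Module):
    """Standard residual connection: y = x + F(x)."""

    def __init__(self, dim: int):
        super().__init__()
        self.f = nn.Sequential(nn.Linear(dim, dim), nn.GELU(), nn.Linear(dim, dim))

    def forward(self, x):
        return x + self.f(x)


class HyperConnectionBlock(nn.Module):
    """Illustrative hyper-connection-style block (simplified, hypothetical).

    The hidden state is kept as several parallel residual streams; learnable
    weights decide how the block reads its input from the streams, how the
    streams mix with one another, and how the block output is written back.
    """

    def __init__(self, dim: int, n_streams: int = 4):
        super().__init__()
        self.f = nn.Sequential(nn.Linear(dim, dim), nn.GELU(), nn.Linear(dim, dim))
        # Learnable mixing across the parallel residual streams.
        self.mix = nn.Parameter(torch.eye(n_streams))
        # Learnable weights for reading the block input from the streams
        # and for writing the block output back to them.
        self.read = nn.Parameter(torch.full((n_streams,), 1.0 / n_streams))
        self.write = nn.Parameter(torch.ones(n_streams))

    def forward(self, streams):
        # streams: (n_streams, batch, dim)
        x = torch.einsum("s,sbd->bd", self.read, streams)        # read input
        out = self.f(x)                                          # apply the layer
        mixed = torch.einsum("st,tbd->sbd", self.mix, streams)   # mix streams
        return mixed + self.write[:, None, None] * out           # write output
```

Because the read, write and mixing weights are unconstrained in this simplified form, their values can drift during training; DeepSeek's mHC, as described in the paper, constrains such connections to stabilise training, though the exact constraint is not reproduced here.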

ResNet was proposed about a decade ago by four researchers at Microsoft Research Asia, including prominent computer scientist Kaiming He.

DeepSeek’s paper marks the Chinese AI start-up’s latest effort to improve model training efficiency with limited computing resources, fuelling speculation that its next models could incorporate the new architecture.