{"id":382816,"date":"2026-01-02T03:15:09","date_gmt":"2026-01-02T03:15:09","guid":{"rendered":"https:\/\/www.newsbeep.com\/us\/382816\/"},"modified":"2026-01-02T03:15:09","modified_gmt":"2026-01-02T03:15:09","slug":"deepseek-kicks-off-2026-with-paper-signalling-push-to-train-bigger-models-for-less","status":"publish","type":"post","link":"https:\/\/www.newsbeep.com\/us\/382816\/","title":{"rendered":"DeepSeek kicks off 2026 with paper signalling push to train bigger models for less"},"content":{"rendered":"<p datatype=\"p\" data-qa=\"Component-Component\" class=\"e8zc9q40 css-1c6uqr6 ec74h0k1\">Chinese artificial intelligence start-up DeepSeek has ushered in 2026 with a new technical paper, co-authored by founder Liang Wenfeng, that proposes a rethink of the fundamental architecture used to train foundational AI models.<\/p>\n<p datatype=\"p\" data-qa=\"Component-Component\" class=\"e8zc9q40 css-1c6uqr6 ec74h0k1\">The method \u2013 dubbed Manifold-Constrained Hyper-Connections (mHC) \u2013 forms part of the Hangzhou firm\u2019s push to make its models more cost-effective as it strives to keep pace with better-funded US rivals with deeper access to computing power.<\/p>\n<p datatype=\"p\" data-qa=\"Component-Component\" class=\"e8zc9q40 css-1c6uqr6 ec74h0k1\">It also reflected the increasingly open, collaborative culture among Chinese AI companies, which have published a growing share of their research in public.<\/p>\n<p datatype=\"p\" data-qa=\"Component-Component\" class=\"e8zc9q40 css-1c6uqr6 ec74h0k1\">For industry watchers, DeepSeek\u2019s papers often provide an important early signal of the engineering choices that will shape the start-up\u2019s next major model release.<\/p>\n<p datatype=\"p\" data-qa=\"Component-Component\" class=\"e8zc9q40 css-1c6uqr6 ec74h0k1\">In the paper, released on Thursday, a team of 19 DeepSeek researchers said they tested mHC on models with 3 billion, 9 billion and 27 billion parameters, and found it scaled without adding significant computational burden.<\/p>\n<p datatype=\"p\" data-qa=\"Component-Component\" class=\"e8zc9q40 css-1c6uqr6 ec74h0k1\">\u201cEmpirical results confirm that mHC effectively \u2026 [enables] stable large-scale training with superior scalability compared with conventional HC (hyper-connections),\u201d wrote the researchers, led by Zhenda Xie, Yixuan Wei and Huanqi Cao.<\/p>\n","protected":false},"excerpt":{"rendered":"Chinese artificial intelligence start-up DeepSeek has ushered in 2026 with a new technical paper, co-authored by founder Liang&hellip;\n","protected":false},"author":2,"featured_media":382817,"comment_status":"","ping_status":"","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[45],"tags":[182,181,507,3995,28,144,101,5425,149,3,1767,33670,1079,74,25,4477],"class_list":{"0":"post-382816","1":"post","2":"type-post","3":"status-publish","4":"format-standard","5":"has-post-thumbnail","7":"category-artificial-intelligence","8":"tag-ai","9":"tag-artificial-intelligence","10":"tag-artificialintelligence","11":"tag-asia","12":"tag-business","13":"tag-china","14":"tag-economy","15":"tag-hong-kong","16":"tag-lifestyle","17":"tag-news","18":"tag-opinion","19":"tag-south-china-morning-post","20":"tag-sport","21":"tag-technology","22":"tag-us","23":"tag-world"},"_links":{"self":[{"href":"https:\/\/www.newsbeep.com\/us\/wp-json\/wp\/v2\/posts\/382816","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.newsbeep.com\/us\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.newsbeep.com\/us\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.newsbeep.com\/us\/wp-json\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/www.newsbeep.com\/us\/wp-json\/wp\/v2\/comments?post=382816"}],"version-history":[{"count":0,"href":"https:\/\/www.newsbeep.com\/us\/wp-json\/wp\/v2\/posts\/382816\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.newsbeep.com\/us\/wp-json\/wp\/v2\/media\/382817"}],"wp:attachment":[{"href":"https:\/\/www.newsbeep.com\/us\/wp-json\/wp\/v2\/media?parent=382816"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.newsbeep.com\/us\/wp-json\/wp\/v2\/categories?post=382816"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.newsbeep.com\/us\/wp-json\/wp\/v2\/tags?post=382816"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}