{"id":173994,"date":"2025-09-28T00:39:13","date_gmt":"2025-09-28T00:39:13","guid":{"rendered":"https:\/\/www.newsbeep.com\/au\/173994\/"},"modified":"2025-09-28T00:39:13","modified_gmt":"2025-09-28T00:39:13","slug":"token-costs-emerge-as-a-bottleneck-for-scaling-ai-applications-with-the-industry-seeking-breakthroughs-in-computing-power","status":"publish","type":"post","link":"https:\/\/www.newsbeep.com\/au\/173994\/","title":{"rendered":"Token costs emerge as a bottleneck for scaling AI applications, with the industry seeking breakthroughs in computing power."},"content":{"rendered":"<p>\u2460Reporters from Cailian Press observed on-site that the focus of discussion among attending experts and corporate representatives has shifted from macro-level policy interpretation to specific &#8220;implementation blueprints.&#8221; \u2461Interviews conducted by Cailian Press reporters at the conference found that high token costs have become a core pain point for many enterprises trying to scale AI applications.<\/p>\n<p>Cailian Press, September 27th, by reporter Guo Songqiao: Just one month after the issuance of the &#8216;Opinions on Deeply Implementing the &#8220;Artificial Intelligence +&#8221; Action,&#8217; the industry is already visibly accelerating off the starting line.<\/p>\n<p>The 2025 Artificial Intelligence Computing Conference held yesterday in Beijing served as an excellent vantage point. 
Reporters from Cailian Press observed on-site that the focus of discussion among attending experts and corporate representatives has shifted from macro-level policy interpretation to specific &#8220;implementation blueprints.&#8221;<\/p>\n<p><img decoding=\"async\" src=\"https:\/\/www.newsbeep.com\/au\/wp-content\/uploads\/2025\/09\/1759019952_4_big.jpeg\" ftsrcwh=\"1440x1080\" style=\"width:1440px;height:1080px;aspect-ratio:1440\/1080;max-height:495.00px\"\/><\/p>\n<p><img decoding=\"async\" class=\"boundary-pic\" loading=\"lazy\" src=\"https:\/\/www.newsbeep.com\/au\/wp-content\/uploads\/2025\/09\/17590199492519454306438.png\"\/>This year\u2019s conference centers on artificial intelligence infrastructure construction and the optimization of domestic AI computing power systems, with a focus on promoting algorithm innovation and application implementation. With computing power as the core element driving innovation, it brings together resources from academia, industry, research, and application fields to jointly advance the high-quality development of the artificial intelligence industry. At the event, more than 30 companies and institutions, including China Mobile, Inspur Information, Zhiyuan Research Institute, and Kunlun Chip, jointly released the &#8216;Beijing Solution for Intelligent Computing Applications &#8211; Building Industry Intelligence Based on Super-Node Innovation Consortium,&#8217; taking the lead in responding to the national &#8216;Opinions on Deeply Implementing the &#8220;Artificial Intelligence +&#8221; Action.&#8217;<\/p>\n<p>&#8220;Achieving cross-regional, cross-hardware computing power connectivity and inclusive sharing.&#8221;<\/p>\n<p>At the conference, Jin Yaocai, founder of the Westlake University &#8216;Trustworthy and General Artificial Intelligence Laboratory&#8217; and member of the European Academy of Sciences, outlined the main trajectory of artificial intelligence development. 
He noted that its developmental path resembles the emergence of intelligence in the human brain, which involves three key mechanisms: evolution, development, and learning. From this perspective, he further elaborated on the requirements of trustworthy artificial intelligence and the importance of artificial intelligence governance, while sharing the laboratory\u2019s practices in industrial artificial intelligence and explorations in brain-inspired general artificial intelligence.<\/p>\n<p>Lin Yonghua, Vice President and Chief Engineer of the Beijing Zhiyuan Artificial Intelligence Research Institute, shared the technical progress of the &#8216;CrowdWisdom FlagOS&#8217; platform. He pointed out that as an open and unified system software stack, the platform aims to break down barriers within the AI computing power ecosystem, achieve cross-regional, cross-hardware computing power connectivity and inclusive sharing, and provide global developers with a unified computing foundation across chips, frameworks, and scenarios.<\/p>\n<p>Wang Haifeng, Chief Technology Officer of Baidu, reviewed the development of artificial intelligence from rule-based methods and statistical machine learning to deep learning and large models. He highlighted that the universality and comprehensive capabilities of large model technology offer a promising path toward general artificial intelligence.<\/p>\n<p>Liu Jun, Chief AI Strategist of Inspur Information, introduced two innovative systems for the era of intelligent agents. He discussed the challenges facing the sustainable development of AI computing power, such as scale, electricity, and investment. 
He proposed shifting from a scale-oriented approach to an efficiency-oriented one, rethinking and redesigning AI computing systems, and developing specialized AI computing architectures.<\/p>\n<p>Dai Beijie, Vice President of the Beijing Zhongguancun Artificial Intelligence Research Institute, introduced the project-based talent development system established by Beijing Zhongguancun College to meet the demand for cultivating unconventional AI leaders. The initiative promotes the deep integration of AI with multiple disciplines, so that research achievements are put into practice while industrial needs feed back into technological innovation, thereby injecting strong momentum into the development of new quality productive forces.<\/p>\n<p>Hardware Innovation Targets Token Cost Bottleneck<\/p>\n<p>Reporters from Cailian Press learned on-site that high token costs have become a core pain point for many companies in scaling AI applications.<\/p>\n<p>\u201cOur platform handles massive volumes of customer service, recommendation, and risk control scenarios every day that require calling large models, and token costs are like the \u2018Sword of Damocles\u2019 hanging over our heads,\u201d a technical director from the AI platform department of an e-commerce company told Cailian Press reporters at the conference, adding that he had come specifically to find cost-reduction solutions.<\/p>\n<p>\u201cAs intelligent agent applications expand further, the token consumption per interaction session is surging rapidly. The current cost structure has caused many valuable innovative applications to hit a roadblock over &#8216;economic feasibility&#8217; even before reaching scale, posing significant challenges to profitability,\u201d the director said. 
This was among the concerns Cailian Press reporters heard most often at this year\u2019s Artificial Intelligence Conference.<\/p>\n<p>Guo Tao, Deputy Director of the China E-Commerce Expert Service Center, stated in an interview with Cailian Press that the AI industry is transitioning from &#8216;model competition&#8217; to &#8216;application implementation.&#8217; Inference costs and interaction speed have now become more critical competitive dimensions than model parameter size. The effectiveness of infrastructure in terms of &#8216;speed enhancement and cost reduction&#8217; will directly determine the depth and breadth of &#8216;AI+&#8217; penetration across vertical industries.<\/p>\n<p>At the conference, a representative from a publicly listed company also told reporters that as scaling laws continue to drive advancements in model capabilities, open-source models represented by DeepSeek have significantly lowered the threshold for innovation, accelerating the industrialization of intelligent agents. The three core elements of intelligent agent industrialization are capability, speed, and cost. Among these, model capability determines the upper limit of application potential, interaction speed defines commercial value, and token cost dictates profitability.<\/p>\n<p>This pain point has resonated widely at the conference. In response to this industry-wide demand, computing infrastructure providers are seeking breakthroughs at the hardware level.<\/p>\n<p>Inspur Information unveiled its YuanNao HC1000 ultra-scalable AI server at the conference. Based on a newly developed fully symmetric DirectCom ultra-speed architecture, the lossless, ultra-scalable design aggregates a vast array of domestic AI chips and supports exceptionally high inference throughput. 
It brings the cost of inference below RMB 1 per million tokens for the first time, giving intelligent agents a high-performance computing system designed to overcome the token cost bottleneck.<\/p>\n<p>From a technical perspective, Liu Jun told Cailian Press that the YuanNao HC1000 achieves comprehensive cost reduction and hardware-software synergy through innovations such as a 16-card compute module design and balanced single-card designs integrating &#8216;compute-memory-interconnect.&#8217; These improvements significantly reduce both individual card costs and system overhead per card. Additionally, the fully symmetric system topology supports ultra-large-scale lossless expansion. According to calculations, the YuanNao HC1000 achieves a 1.75x improvement in inference performance compared to traditional RoCE solutions, with single-card model computational efficiency increasing by up to 5.7 times via deep compute-network synergy and full-domain lossless technologies.<\/p>\n<p>The industry widely expects intelligent agents to drive an exponential surge in demand for inference computing.<\/p>\n<p>Inspur Information told reporters that it will continue to promote innovation and breakthroughs in AI computing architecture through software-hardware co-design and deep optimization. The company is committed to accelerating token generation while reducing costs, actively fostering the deep integration of artificial intelligence technologies, such as large models and intelligent agents, with the real economy. 
This effort aims to make artificial intelligence a driving force for productivity and innovation across various industries.<\/p>\n","protected":false},"excerpt":{"rendered":"\u2460Reporters from the China Financial News Agency observed on-site that the focus of discussion among attending experts and&hellip;\n","protected":false},"author":2,"featured_media":173995,"comment_status":"","ping_status":"","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[21],"tags":[64,63,257,105],"class_list":{"0":"post-173994","1":"post","2":"type-post","3":"status-publish","4":"format-standard","5":"has-post-thumbnail","7":"category-computing","8":"tag-au","9":"tag-australia","10":"tag-computing","11":"tag-technology"},"_links":{"self":[{"href":"https:\/\/www.newsbeep.com\/au\/wp-json\/wp\/v2\/posts\/173994","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.newsbeep.com\/au\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.newsbeep.com\/au\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.newsbeep.com\/au\/wp-json\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/www.newsbeep.com\/au\/wp-json\/wp\/v2\/comments?post=173994"}],"version-history":[{"count":0,"href":"https:\/\/www.newsbeep.com\/au\/wp-json\/wp\/v2\/posts\/173994\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.newsbeep.com\/au\/wp-json\/wp\/v2\/media\/173995"}],"wp:attachment":[{"href":"https:\/\/www.newsbeep.com\/au\/wp-json\/wp\/v2\/media?parent=173994"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.newsbeep.com\/au\/wp-json\/wp\/v2\/categories?post=173994"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.newsbeep.com\/au\/wp-json\/wp\/v2\/tags?post=173994"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}