{"id":280212,"date":"2025-11-08T23:27:11","date_gmt":"2025-11-08T23:27:11","guid":{"rendered":"https:\/\/www.newsbeep.com\/us\/280212\/"},"modified":"2025-11-08T23:27:11","modified_gmt":"2025-11-08T23:27:11","slug":"microsoft-has-reportedly-developed-toolkits-to-break-nvidias-cuda-dominance-slashing-inference-costs-with-amd-ai-gpus","status":"publish","type":"post","link":"https:\/\/www.newsbeep.com\/us\/280212\/","title":{"rendered":"Microsoft Has Reportedly Developed \u201cToolkits\u201d to Break NVIDIA\u2019s CUDA Dominance, Slashing Inference Costs with AMD AI GPUs"},"content":{"rendered":"<p>Microsoft is exploring ways to leverage the &#8216;stack&#8217; of its AMD GPUs for inferencing workloads, as the company develops toolkits that convert NVIDIA CUDA models into ROCm-supported code.<\/p>\n<p>Microsoft Sees Massive Demand For Inference Over Training, Which Makes AMD&#8217;s AI Chips a Lot More Attractive<\/p>\n<p>One of the reasons NVIDIA has managed to retain its dominance in the AI space is that the firm has a &#8216;CUDA lock-in&#8217; mechanism in place, which essentially forces CSPs and AI giants to employ NVIDIA&#8217;s AI chips to achieve optimal results with NVIDIA&#8217;s CUDA software ecosystem. Efforts have been made in the past to break this barrier and allow cross-platform support, but we haven&#8217;t seen a solution that has become mainstream. 
However, <a href=\"https:\/\/x.com\/RihardJarc\/status\/1986800283624546658\" rel=\"nofollow\">according to a &#8216;high-ranking&#8217; Microsoft employee<\/a>, the tech giant has developed &#8216;toolkits&#8217; that allow the firm to run CUDA code on AMD GPUs by translating it into a ROCm-compatible version.<\/p>\n<p lang=\"en\" dir=\"ltr\">A MUST-read interview with a high-ranking <a href=\"https:\/\/twitter.com\/search?q=%24MSFT&amp;src=ctag&amp;ref_src=twsrc%5Etfw\" rel=\"nofollow noopener\" target=\"_blank\">$MSFT<\/a> employee on data centers and what is happening right now ( <a href=\"https:\/\/twitter.com\/search?q=%24NVDA&amp;src=ctag&amp;ref_src=twsrc%5Etfw\" rel=\"nofollow noopener\" target=\"_blank\">$NVDA<\/a>\/ <a href=\"https:\/\/twitter.com\/search?q=%24AMD&amp;src=ctag&amp;ref_src=twsrc%5Etfw\" rel=\"nofollow noopener\" target=\"_blank\">$AMD<\/a>, liquid cooling, and HHD):<\/p>\n<p>1. The challenges that <a href=\"https:\/\/twitter.com\/search?q=%24MSFT&amp;src=ctag&amp;ref_src=twsrc%5Etfw\" rel=\"nofollow noopener\" target=\"_blank\">$MSFT<\/a> is having right now are energy and liquid cooling. To improve its goodwill with municipalities, <a href=\"https:\/\/twitter.com\/search?q=%24MSFT&amp;src=ctag&amp;ref_src=twsrc%5Etfw\" rel=\"nofollow noopener\" target=\"_blank\">$MSFT<\/a> is\u2026 <a href=\"https:\/\/t.co\/jQTfhnxQga\" rel=\"nofollow\">pic.twitter.com\/jQTfhnxQga<\/a><\/p>\n<p>\u2014 Rihard Jarc (@RihardJarc) <a href=\"https:\/\/twitter.com\/RihardJarc\/status\/1986800283624546658?ref_src=twsrc%5Etfw\" rel=\"nofollow noopener\" target=\"_blank\">November 7, 2025<\/a><\/p>\n<p>Breaking CUDA&#8217;s dominance isn&#8217;t an easy task, as the software ecosystem is so integral to the AI industry that its adoption is almost ubiquitous, even in nations like China. However, Microsoft&#8217;s toolkit, mentioned by the employee, likely takes an approach that has been available for quite some time. 
One way to perform a CUDA-to-ROCm translation is through a runtime compatibility layer, which translates CUDA API calls into ROCm without requiring full source code rewrites. One example of this <a href=\"https:\/\/wccftech.com\/zluda-sees-major-progress-in-bringing-nvidia-cuda-code-to-other-gpus\/\" rel=\"nofollow noopener\" target=\"_blank\">is the ZLUDA tool, which intercepts CUDA calls<\/a> and translates them into ROCm without requiring a full recompile.<\/p>\n<p>We built some toolkits to help convert like CUDA models to ROCm so you could use it on an AMD, like a 300X. We have had a lot of inquiries about what is our path with AMD, the 400X and the 450X. We&#8217;re actually working with AMD on that to see what we can do to maximize that.<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" width=\"728\" height=\"410\" src=\"https:\/\/www.newsbeep.com\/us\/wp-content\/uploads\/2025\/11\/NVIDIA-CUDA-728x410.jpg\" alt=\"NVIDIA CUDA Can Now Directly Run On AMD's RDNA GPUs Using The &quot;SCALE&quot; Toolkit 1\" class=\"wp-image-1510725\" style=\"width:839px;height:auto\"  \/><\/p>\n<p>However, because ROCm is still a relatively &#8216;immature&#8217; software stack, several CUDA API calls and pieces of code have no equivalent in AMD&#8217;s software. In some cases this causes performance to collapse, which is a high-risk problem in large datacenter environments. Another possibility is that the toolkit mentioned here is an end-to-end cloud migration tool that integrates with Azure and targets both AMD and NVIDIA instances. 
Of course, this creates problems when conversions happen at a large scale, but the toolkits developed by Microsoft appear to be in limited use.<\/p>\n<p>\t\t\t\t\t<a id=\"bizdev_mobile_3_link\" target=\"_blank\" class=\"noskim d-md-none\" href=\"\" rel=\"sponsored\"><br \/>\n\t\t\t\t\t\t<img loading=\"lazy\" decoding=\"async\" id=\"bizdev_mobile_3_image\" style=\"display:inline-block\" src=\"https:\/\/www.newsbeep.com\/us\/wp-content\/uploads\/2025\/09\/placeholder-mobile.png\" width=\"300\" height=\"250\"\/><br \/>\n\t\t\t\t\t<\/a><\/p>\n<p>\t\t\t\t<a id=\"bizdev_mid_desktop_link\" target=\"_blank\" class=\"noskim d-none d-lg-block\" href=\"\" rel=\"sponsored\"><br \/>\n\t\t\t\t\t<img loading=\"lazy\" decoding=\"async\" id=\"bizdev_mid_desktop_image\" src=\"https:\/\/www.newsbeep.com\/us\/wp-content\/uploads\/2025\/09\/placeholder-desktop.png\" width=\"728\" height=\"90\"\/><br \/>\n\t\t\t\t<\/a><\/p>\n<p>Microsoft is pursuing these &#8216;software conversions&#8217; because the firm is seeing an increase in inference workloads and is looking for a more cost-effective way to run them. That is where AMD&#8217;s AI chips make sense, since they are the only counterpart to the &#8216;pricey&#8217; NVIDIA GPUs. 
And since CUDA models cannot be left out of inference environments, translating them to ROCm becomes the next big step for Microsoft.<\/p>\n<p style=\"font-style: italic; border-top: 1px solid var(--gray-300); padding-top: 10px; margin-top: 20px;\">Follow <a href=\"https:\/\/profile.google.com\/cp\/Cg0vZy8xMWM3NDB2MmIyGgA\" rel=\"nofollow noopener\" target=\"_blank\">Wccftech on Google<\/a> or <a href=\"https:\/\/google.com\/preferences\/source?q=wccftech.com\" rel=\"nofollow noopener\" target=\"_blank\">add us as a preferred source<\/a>, to get our news coverage and reviews in your feeds.<\/p>\n<p>\t\t<script async src=\"https:\/\/platform.twitter.com\/widgets.js\" charset=\"utf-8\"><\/script><\/p>\n","protected":false},"excerpt":{"rendered":"Microsoft is exploring ways to leverage the &#8216;stack&#8217; of its AMD GPUs for inferencing workloads, as the company&hellip;\n","protected":false},"author":2,"featured_media":280213,"comment_status":"","ping_status":"","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[45],"tags":[182,181,507,74],"class_list":{"0":"post-280212","1":"post","2":"type-post","3":"status-publish","4":"format-standard","5":"has-post-thumbnail","7":"category-artificial-intelligence","8":"tag-ai","9":"tag-artificial-intelligence","10":"tag-artificialintelligence","11":"tag-technology"},"_links":{"self":[{"href":"https:\/\/www.newsbeep.com\/us\/wp-json\/wp\/v2\/posts\/280212","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.newsbeep.com\/us\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.newsbeep.com\/us\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.newsbeep.com\/us\/wp-json\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/www.newsbeep.com\/us\/wp-json\/wp\/v2\/comments?post=280212"}],"version-history":[{"count":0,"href":"https:\/\/www.newsbeep.com\/us\/wp-json\/wp\/v2\/posts\/280212\/revisions"}],"wp:featured
media":[{"embeddable":true,"href":"https:\/\/www.newsbeep.com\/us\/wp-json\/wp\/v2\/media\/280213"}],"wp:attachment":[{"href":"https:\/\/www.newsbeep.com\/us\/wp-json\/wp\/v2\/media?parent=280212"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.newsbeep.com\/us\/wp-json\/wp\/v2\/categories?post=280212"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.newsbeep.com\/us\/wp-json\/wp\/v2\/tags?post=280212"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}