{"id":367187,"date":"2026-04-07T03:02:11","date_gmt":"2026-04-07T03:02:11","guid":{"rendered":"https:\/\/www.newsbeep.com\/nz\/367187\/"},"modified":"2026-04-07T03:02:11","modified_gmt":"2026-04-07T03:02:11","slug":"microsoft-launches-three-new-mai-ai-models-for-foundry","status":"publish","type":"post","link":"https:\/\/www.newsbeep.com\/nz\/367187\/","title":{"rendered":"Microsoft launches three new MAI AI models for Foundry"},"content":{"rendered":"<p data-end=\"252\" data-start=\"93\">Microsoft has launched three new MAI artificial intelligence models in Microsoft Foundry, covering speech transcription, voice generation and image generation.<\/p>\n<p data-end=\"476\" data-start=\"254\">The new products &#8211; MAI-Transcribe-1, MAI-Voice-1 and MAI-Image-2 &#8211; are also available through MAI Playground in the US. They are aimed at developers building applications and services that use audio and visual AI features.<\/p>\n<p data-end=\"501\" data-start=\"478\">Transcription model<\/p>\n<p data-end=\"777\" data-start=\"503\">MAI-Transcribe-1 is a speech-to-text model covering the 25 most-used languages measured by the FLEURS benchmark. It is designed to handle noisy, less controlled settings, and Microsoft says its batch transcription speed is 2.5 times that of the existing Azure Fast offering.<\/p>\n<p data-end=\"935\" data-start=\"779\">Pricing for the transcription model starts at USD $0.36 per hour. Microsoft describes it as offering the best price-performance among large cloud providers.<\/p>\n<p data-end=\"957\" data-start=\"937\">Voice generation<\/p>\n<p data-end=\"1134\" data-start=\"959\">MAI-Voice-1 is Microsoft&#8217;s latest voice generation model. It can produce natural speech with emotional variation and preserve speaker identity across longer passages of audio.<\/p>\n<p data-end=\"1317\" data-start=\"1136\">Microsoft is also adding custom voice creation in Foundry, allowing developers to create a voice from a short audio sample. The model can generate 60 seconds of audio in one second.<\/p>\n<p data-end=\"1468\" data-start=\"1319\">Pricing for MAI-Voice-1 starts at USD $22 per 1 million characters. It is intended for developers building voice-based applications and voice agents.<\/p>\n<p data-end=\"1493\" data-start=\"1470\">Image model rollout<\/p>\n<p data-end=\"1716\" data-start=\"1495\">For image generation, MAI-Image-2 has already been deployed in Copilot and has delivered at least twice the previous generation speed, based on production traffic data. Roll-outs are also under way in Bing and PowerPoint.<\/p>\n<p data-end=\"2044\" data-start=\"1718\">Microsoft positions the model as a tool for photographers, designers and visual storytellers seeking more natural lighting, more accurate skin tones and textures, and clearer in-image text for diagrams and layouts. Pricing starts at USD $5 per 1 million tokens for text input and USD $33 per 1 million tokens for image output.<\/p>\n<p data-end=\"2064\" data-start=\"2046\">Early customer<\/p>\n<p data-end=\"2203\" data-start=\"2066\">WPP is among the first companies using MAI-Image-2 at scale and was cited as an early partner for creative work built on the image model.<\/p>\n<p data-end=\"2478\" data-start=\"2205\">&#8220;MAI-Image-2 is a genuine game-changer. It&#8217;s a platform that not only responds to the intricate nuance of creative direction, but deeply respects the sheer craft involved in generating real-world, campaign-ready images,&#8221; said Rob Reilly, Global Chief Creative Officer, WPP.<\/p>\n<p data-end=\"2593\" data-start=\"2480\">&#8220;WPP has some of the best creative talent in the world and MAI-Image-2 is making them even better,&#8221; added Reilly.<\/p>\n<p data-end=\"2609\" data-start=\"2595\">Model push<\/p>\n<p data-end=\"2892\" data-start=\"2611\">The launch adds to Microsoft&#8217;s broader effort to develop and distribute in-house AI models through its own platforms and products. The company is deploying these models across consumer and commercial services while also making them available to external developers through Foundry.<\/p>\n<p data-end=\"3213\" data-start=\"2894\">That approach reflects a wider shift among major technology groups to package speech, image and multimodal AI models directly into cloud development environments. By tying new models to Foundry, Microsoft is seeking to make its AI stack more accessible to developers already building on its cloud and software products.<\/p>\n<p data-end=\"3442\" data-start=\"3215\">Microsoft says the models were developed, tested and red-teamed under its safe AI processes. Through Foundry, developers will have access to guardrails, governance features and controls intended to support compliant deployment.<\/p>\n<p data-end=\"3530\" data-is-last-node=\"\" data-is-only-node=\"\" data-start=\"3444\">More MAI models are expected to follow in Foundry and across Microsoft&#8217;s own products.<\/p>\n","protected":false},"excerpt":{"rendered":"Microsoft has launched three new MAI artificial intelligence models in Microsoft Foundry, covering speech transcription, voice generation and&hellip;\n","protected":false},"author":2,"featured_media":367188,"comment_status":"","ping_status":"","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[6],"tags":[73693,6160,83422,6162,7665,32170,159219,17617,1031,609,25740,111,139,69,5296,192624,25779,145,44124],"class_list":{"0":"post-367187","1":"post","2":"type-post","3":"status-publish","4":"format-standard","5":"has-post-thumbnail","7":"category-technology","8":"tag-ai-ethics-governance","9":"tag-artificial-intelligence-ai","10":"tag-bing","11":"tag-cloud","12":"tag-content-creation","13":"tag-creative-technologies","14":"tag-developer-tools","15":"tag-generative-ai-genai","16":"tag-marketing","17":"tag-microsoft","18":"tag-microsoft-azure","19":"tag-new-zealand","20":"tag-newzealand","21":"tag-nz","22":"tag-photography","23":"tag-powerpoint","24":"tag-software-development","25":"tag-technology","26":"tag-wpp"},"_links":{"self":[{"href":"https:\/\/www.newsbeep.com\/nz\/wp-json\/wp\/v2\/posts\/367187","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.newsbeep.com\/nz\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.newsbeep.com\/nz\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.newsbeep.com\/nz\/wp-json\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/www.newsbeep.com\/nz\/wp-json\/wp\/v2\/comments?post=367187"}],"version-history":[{"count":0,"href":"https:\/\/www.newsbeep.com\/nz\/wp-json\/wp\/v2\/posts\/367187\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.newsbeep.com\/nz\/wp-json\/wp\/v2\/media\/367188"}],"wp:attachment":[{"href":"https:\/\/www.newsbeep.com\/nz\/wp-json\/wp\/v2\/media?parent=367187"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.newsbeep.com\/nz\/wp-json\/wp\/v2\/categories?post=367187"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.newsbeep.com\/nz\/wp-json\/wp\/v2\/tags?post=367187"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}