Microsoft's 'superintelligence' strategy focuses on business growth
Suleyman stands as Microsoft's first-ever CEO of AI
Mustafa Suleyman has been preparing for his new role for quite some time. He is Microsoft's first-ever CEO of AI, but after the company's major restructuring in mid-March, he handed off some responsibilities to focus on the pursuit of superintelligence.
Although the announcement was made public just last month, he told The Verge that he had been preparing for the shift for up to nine months. The renegotiation of Microsoft's agreement with OpenAI formally gave Microsoft the “ability to pursue superintelligence,” but he was planning the move long before that deal was finalised.
“This plan has been in the works for a while,” he said, adding that achieving superintelligence was “solely my focus.”
Superintelligence and AGI, or artificial general intelligence, often have vague and evolving definitions in the AI sector.
For Suleyman, it's mainly about business and efficiency. “Superintelligence is effectively about asking, ‘Can these models offer product value to the many companies relying on us to provide top-tier language models?’” Suleyman said.
“This is where our focus lies. We aim to cater to developers, enterprises, and countless consumers.” AI companies are under growing pressure to boost revenue, and Microsoft's vision mirrors a new direction at OpenAI too.
Microsoft’s restructuring merged its enterprise and consumer branches under the Copilot AI label.
Although Suleyman will still handle high-level strategy, Jacob Andreou, previously a corporate vice president of product and growth for Microsoft AI, assumed the role of executive vice president, leading the unified teams’ engineering, growth, product, and design projects.
This transition allows Suleyman to dedicate his efforts to advancing superintelligence and building next-generation AI models for Microsoft at a time when competition among leading AI companies, and the pressure to attract new paying consumers and enterprise clients, is more intense than ever.
On Thursday, Microsoft unveiled a new transcription model that it believes will serve those goals. Because it incurs “half the GPU cost of the leading models,” according to Suleyman, it represents a “significant cost reduction” for Microsoft.
The company promotes MAI-Transcribe-1 as “advancing the frontier of speech recognition” with its capability to transcribe meetings, caption videos, and evaluate call center interactions in 25 different languages.
Microsoft's blog posts introducing the model say it was designed for “demanding” recording conditions, including background noise, poor audio quality, and overlapping speech, and that it was trained on a mix of “human-curated” and machine-generated transcripts.
Suleyman mentioned the source audio comes from both controlled studio environments and contractors recording themselves amid noise, from bustling streets to children playing, plus “extensive data from the open web.”
Alongside the existing voice and image-generation models MAI-Voice-1 and MAI-Image-2, the new transcription tool is now available on Microsoft Foundry and in the new Microsoft AI Playground.
According to Microsoft, this is the first time these models are “widely available for commercial use.” MAI-Transcribe-1 can process audio files in MP3, WAV, and FLAC formats.
