Microsoft has announced a voice-cloning feature for its Teams platform that lets users replicate their own voices to communicate in other languages during virtual meetings. The tool, called Interpreter in Teams, was unveiled at the Microsoft Ignite 2024 conference and provides real-time speech-to-speech translation. Starting in early 2025, Teams users will be able to simulate their voices in nine languages: English, French, German, Italian, Japanese, Korean, Portuguese, Mandarin Chinese, and Spanish.
The feature is designed to support multilingual collaboration and reduce the communication gaps that often arise in virtual meetings.
As reported by TechCrunch, Microsoft’s Chief Marketing Officer, Jared Spataro, wrote in a blog post: “Imagine being able to sound just like you in a different language. The Interpreter agent in Teams provides real-time speech-to-speech translation during meetings, and you can opt to have it simulate your speaking voice for a more personal and engaging experience.”
Microsoft has provided few details about the feature, which will be available only to Microsoft 365 subscribers. The company emphasized that the tool does not retain any biometric data, does not add nuances beyond those naturally present in the user’s voice, and can be disabled in the Teams settings.
A Microsoft spokesperson elaborated, “Interpreter is designed to replicate the speaker’s message as faithfully as possible without adding assumptions or extraneous information. Voice simulation can only be enabled when users provide consent via a notification during the meeting or by enabling ‘Voice simulation consent’ in settings.”
Can This Feature Be Misused?
Deepfakes have spread rapidly across social media platforms, making it harder to tell fact from misinformation. This year alone, deepfakes featuring high-profile figures such as President Joe Biden, Taylor Swift, and Vice President Kamala Harris have drawn millions of views and shares, raising concern about their potential to deceive. Deepfake technology has also been used in personal scams, including the impersonation of family members. According to the FTC, impersonation scams caused more than $1 billion in losses last year.
In one incident, cybercriminals reportedly used deepfake technology to stage a Teams meeting that appeared to include a company’s executives, and the company was deceived into transferring $25 million. Concerns about such risks, and about how they are perceived by the public, contributed to OpenAI’s decision earlier this year not to release its own voice-cloning tool, Voice Engine.
Based on the preliminary details Microsoft has shared, Interpreter in Teams applies voice cloning to a narrow use case: professional communication. That does not eliminate the risk of misuse. For example, a malicious user could feed the tool a deceptive recording, such as a request for sensitive banking details, and use it to generate a translation in the target’s language. Microsoft may share more about the feature’s safeguards and functionality in the coming months.
What measures does Microsoft have in place to prevent the misuse of the voice-cloning technology in Teams?
**Interview with Jared Spataro, Chief Marketing Officer at Microsoft**
**Editor:** Welcome, Jared! Thank you for joining us today to talk about the new voice-cloning feature in Microsoft Teams. First, can you give us an overview of how Interpreter in Teams works?
**Jared Spataro:** Thank you for having me! The Interpreter feature in Teams leverages advanced AI technology to provide real-time speech-to-speech translation during meetings. What makes it truly unique is the ability to simulate the user’s own voice in nine different languages. This means when you speak in your native language, attendees in the meeting can hear your message delivered in their language, but with your voice.
**Editor:** That’s fascinating! What impact do you expect this feature to have on international communication and collaboration?
**Jared Spataro:** We believe this feature will significantly enhance multilingual collaboration by bridging communication gaps that often hinder effective discussions. Businesses are increasingly global, and we want to empower teams to connect and collaborate without language barriers. The ability to hear a colleague’s voice—essentially sounding like them speaking in another language—adds a personal touch to virtual meetings that can help build rapport.
**Editor:** Privacy is always a concern when it comes to AI technology. How does Microsoft ensure user data is safeguarded while using this voice-cloning feature?
**Jared Spataro:** Great question. Privacy and security are paramount for us. The feature does not retain any biometric data and is designed to avoid unnecessary artificial nuances in voice simulation. Users have full control; they can disable the feature through Teams settings. Additionally, consent is required for it to be used during meetings, ensuring total transparency and user agency.
**Editor:** That’s reassuring to hear. Now, there’s been some concern about the potential misuse of voice-cloning technology. How is Microsoft addressing these concerns?
**Jared Spataro:** We are acutely aware of the risks associated with voice cloning and are committed to responsible technology deployment. The design of the Interpreter tool includes strict protocols to ensure it can only be activated with explicit user consent. Our focus is on providing a tool that enhances communication while maintaining a high ethical standard.
**Editor:** Exciting developments ahead for Teams users! When can they expect this feature to become available?
**Jared Spataro:** We plan to roll out the Interpreter feature in early 2025, and it will be exclusively available to Microsoft 365 subscribers. We’re eager to see how our users incorporate this tool into their workflows to foster better communication across diverse teams.
**Editor:** Thank you for your insights, Jared! We look forward to seeing how this innovative feature transforms virtual meetings.
**Jared Spataro:** Thank you for having me! It’s an exciting time for Teams, and I can’t wait for our users to experience the benefits of Interpreter.