Meta unveils SeamlessM4T multimodal translation model

August 22, 2023

About the Author
Ryan is a senior editor at TechForge Media with over a decade of experience covering the latest technology and interviewing leading industry figures. He can often be sighted at tech conferences with a strong coffee in one hand and a laptop in the other. If it's geeky, he’s probably into it. Find him on Twitter (@Gadget_Ry) or Mastodon (@gadgetry@techhub.social)



Meta researchers have unveiled SeamlessM4T, a pioneering multilingual and multitask model that facilitates seamless translation and transcription across both speech and text.
The internet, mobile devices, social media, and communication platforms have ushered in an era where access to multilingual content has reached unprecedented levels. SeamlessM4T aims to realise the vision of seamless communication and comprehension across languages.
Boasting an impressive array of capabilities, SeamlessM4T encompasses:
  • Automatic speech recognition for nearly 100 languages
  • Speech-to-text translation supporting nearly 100 input and output languages
  • Speech-to-speech translation for nearly 100 input languages and 35 output languages (including English)
  • Text-to-text translation for almost 100 languages
  • Text-to-speech translation for nearly 100 input languages and 35 output languages (including English)
SeamlessM4T is being made available to researchers and developers, embodying an ethos of open science.
Additionally, the metadata of SeamlessAlign – the largest multimodal translation dataset ever compiled, consisting of 270,000 hours of mined speech and text alignments – has been released. This facilitates independent data mining and further research within the community.
The development of SeamlessM4T addresses a long-standing challenge in the field of multilingual communication. Unlike earlier systems, which were confined by limited language coverage and reliance on separate subsystems, SeamlessM4T presents a unified model capable of comprehensively handling speech-to-speech and speech-to-text translation tasks.
Meta has built upon previous innovations, such as No Language Left Behind (NLLB), to create this unified multilingual model. With its impressive performance on low-resource languages and consistently strong performance on high-resource languages, SeamlessM4T holds the potential to revolutionise cross-language communication.
Underpinning the model’s architecture is the multitask UnitY model, which excels in generating translated text and speech.
UnitY supports various translation tasks, including automatic speech recognition, text-to-text translation, and speech-to-speech translation, all from a single model. To train this versatile model, Meta employed advanced techniques such as text and speech encoders, self-supervised encoders, and sophisticated decoding processes.
The result is a model that outperforms previous leaders:
[Image: SeamlessM4T model comparison]

To ensure the accuracy and safety of the system, Meta adheres to a responsible AI framework.
Meta says that extensive research on toxicity and bias mitigation has been conducted, resulting in a model that is more aware of and responsive to potential issues. The public release of the SeamlessM4T model encourages collaborative research and development in the AI community.
As the world becomes more connected, SeamlessM4T’s ability to transcend language barriers is a testament to the power of AI-driven innovation. This milestone brings us closer to a future where communication knows no linguistic limitations, enabling a world where people can truly understand each other regardless of language.
Meta has published a demo of SeamlessM4T, and the code, model, and data are available to download.