Meta, the technology company formerly known as Facebook, has unveiled 'SeamlessM4T,' an advanced multilingual, multimodal AI model for translation and transcription. The versatile model can perform speech-to-text, speech-to-speech, text-to-speech, and text-to-text translation. Supporting nearly 100 languages, 'SeamlessM4T' handles speech recognition, translation, and synthesis for diverse linguistic needs. Alongside the model, Meta is releasing the metadata of SeamlessAlign, an extensive multimodal translation dataset encompassing 270,000 hours of aligned speech and text, a significant milestone for the field.
Meta's commitment to multilingual capabilities is evident in its previous projects. Last year, the company introduced No Language Left Behind (NLLB), a text-to-text translation model supporting 200 languages that has been integrated into Wikipedia as a translation provider. Meta also demonstrated its Universal Speech Translator, the first speech-to-speech translation system for a language without a widely used writing system, Hokkien.
Furthermore, Meta's Massively Multilingual Speech project, unveiled earlier this year, contributed to the development of 'SeamlessM4T.' This project covered speech recognition, language identification, and speech synthesis across over 1,100 languages.
Drawing insights from these initiatives, 'SeamlessM4T' brings together diverse spoken data sources to provide a comprehensive multilingual and multimodal translation experience from a single model. This advancement showcases Meta's ongoing dedication to enhancing AI-driven translation capabilities, facilitating effective communication across a multitude of languages and cultures.
Inputs from IANS