Multilinguality

Omnilingual SONAR: Cross-Lingual and Cross-Modal Sentence Embeddings Bridging Massively Multilingual Text and Speech

We introduce OmniSONAR, a family of omnilingual cross-lingual and cross-modal sentence embedding models spanning 4,000+ language varieties and natively supporting text, speech, …

The Omnilingual SONAR Team

Omnilingual MT: Machine Translation for 1,600 Languages

We present OMT, the first MT system supporting more than 1,600 languages, where 1B–8B parameter specialized models match or exceed a 70B LLM baseline, with strong generalization to …

The Omnilingual MT Team