Yiannis (Ioannis) Tsiamas | AI Research Scientist


Yiannis (Ioannis) Tsiamas

(he/him)

AI Research Scientist
Multilinguality & Multimodality | LLMs | Representation Learning | Speech & Text Translation

I’m Ioannis Tsiamas,† a Research Scientist with a PhD in AI at UPC Barcelona, working on multilingual and multimodal representation learning, self-supervised pre-training, and large-scale distributed training. My research has been published at ACL, EMNLP, ECCV, ICASSP, and Interspeech.

I spent 15 months at Meta FAIR, where I led the language expansion of Omnilingual SONAR and contributed to Omnilingual MT, the most massively multilingual systems ever built, spanning thousands of languages. I have also conducted research at Apple AI/ML, Dolby, and Zeta Alpha.

My goal is to make AI truly multilingual and accessible across the world’s languages, including the thousands that current systems still cannot reach.

†Note: I publish under ‘Ioannis Tsiamas’ but use Yiannis as my preferred name; it is the casual form of Ioannis.


Experience & Education

AI Research Scientist

Internship at the Omnilingual team of Meta FAIR.
(Aug 2024 - Nov 2025)

AI Research Scientist

Internship at the Machine Translation team of Apple AI/ML.
(Apr 2024 - Jul 2024)

AI Research Scientist

Internship on Audio-Visual Representations at Dolby AI.
(Nov 2023 - Feb 2024)

PhD in AI

PhD in Artificial Intelligence at UPC Barcelona.
(Mar 2021 - Jun 2026)

MSc in AI

MSc in Artificial Intelligence at the University of Amsterdam.
(Graduated Aug 2020)

MSc in Quant Finance

MSc in Quantitative Finance at VU University Amsterdam.
(Graduated Oct 2018)


Featured Publications

Machine Translation

Omnilingual MT: Machine Translation for 1,600 Languages

We present OMT, the first MT system supporting more than 1,600 languages, where 1B–8B parameter specialized models match or exceed a 70B LLM baseline, with strong generalization to …

The Omnilingual MT Team

Mar 17, 2026

Multilingual NLP

Omnilingual SONAR: Cross-Lingual and Cross-Modal Sentence Embeddings Bridging Massively Multilingual Text and Speech

We introduce OmniSONAR, a family of omnilingual cross-lingual and cross-modal sentence embedding models spanning 4,000+ language varieties and natively supporting text, speech, …

The Omnilingual SONAR Team

Mar 17, 2026

Machine Translation

Improving Language and Modality Transfer in Translation by Character-level Modeling

We propose a character-based translation model to improve adaptability to new languages and modalities, particularly for low-resource scenarios. Our method achieves …

Ioannis Tsiamas

Jul 1, 2025

Audio-Visual Representation Learning

Sequential Contrastive Audio-Visual Learning

We introduce Sequential Contrastive Audio-Visual Learning (SCAV), a novel method that contrasts non-aggregated sequential representations to learn fine-grained audio-visual …

Ioannis Tsiamas

Apr 6, 2025


Recent Publications

The Omnilingual MT Team, Belen Alastruey, Niyati Bafna, Andrea Caciolai, Kevin Heffernan, Artyom Kozhevnikov, Christophe Ropers, Eduardo Sánchez, Charles-Eric Saint-James, Ioannis Tsiamas, Chierh Cheng, Joe Chuang, Paul-Ambroise Duquenne, Mark Duppenthaler, Nate Ekberg, Cynthia Gao, Pere Lluís Huguet Cabot, João Maria Janeiro, Jean Maillard, Gabriel Mejia Gonzalez, Holger Schwenk, Edan Toledo, Arina Turkatenko, Albert Ventayol-Boada, Rashel Moritz, Alexandre Mourachko, Surya Parimi, Mary Williamson, Shireen Yates, David Dale, Marta R. Costa-Jussà (2026). Omnilingual MT: Machine Translation for 1,600 Languages. arXiv.

Core contributors: Belen Alastruey, Niyati Bafna, Andrea Caciolai, Kevin Heffernan, Artyom Kozhevnikov, Christophe Ropers, Eduardo Sánchez, Charles-Eric Saint-James, Ioannis Tsiamas.


The Omnilingual SONAR Team, João Maria Janeiro, Pere Lluís Huguet Cabot, Ioannis Tsiamas, Yen Meng, Vivek Iyer, Guillem Ramírez, Loic Barrault, Belen Alastruey, Yu-an Chung, Marta R. Costa-Jussà, David Dale, Kevin Heffernan, Jaehyeong Jo, Artyom Kozhevnikov, Alexandre Mourachko, Christophe Ropers, Holger Schwenk, Paul-Ambroise Duquenne (2026). Omnilingual SONAR: Cross-Lingual and Cross-Modal Sentence Embeddings Bridging Massively Multilingual Text and Speech. arXiv.

Core contributors: João Maria Janeiro, Pere Lluís Huguet Cabot, Ioannis Tsiamas, Yen Meng.


Pierre Andrews, Mikel Artetxe, Mariano Coria Meglioli, Marta R. Costa-Jussà, Joe Chuang, David Dale, Mark Duppenthaler, Nathanial Paul Ekberg, Cynthia Gao, Daniel Edward Licht, Jean Maillard, Alexandre Mourachko, Christophe Ropers, Safiyyah Saleem, Eduardo Sánchez, Ioannis Tsiamas, Arina Turkatenko, Albert Ventayol-Boada, Shireen Yates (2025). BOUQuET: dataset, Benchmark and Open initiative for Universal Quality Evaluation in Translation. EMNLP 2025.


Ioannis Tsiamas, David Dale, Marta R. Costa-Jussà (2025). Improving Language and Modality Transfer in Translation by Character-level Modeling. In ACL 2025.


Ioannis Tsiamas, Santiago Pascual, Chunghsin Yeh, Joan Serrà (2025). Sequential Contrastive Audio-Visual Learning. In ICASSP 2025.

