Audio Segmentation

SegAugment: Maximizing the Utility of Speech Translation Data with Segmentation-based Augmentations

We propose SegAugment, a data augmentation strategy that creates multiple sentence-level variations from document-level speech data, leading to significant performance gains in …

Ioannis Tsiamas

SHAS: Approaching optimal Segmentation for End-to-End Speech Translation

We propose Supervised Hybrid Audio Segmentation (SHAS), a method that learns optimal speech segmentation from manually segmented data. SHAS significantly improves translation …

Ioannis Tsiamas

End-to-End Speech Translation with Pre-trained Models and Adapters: UPC at IWSLT 2021

Our submission to the IWSLT 2021 shared task details an end-to-end speech translation system combining large pretrained models with adapters for efficient fine-tuning. By training …

Gerard I. Gállego