Unified-Modal Speech-Text Pre-Training for Spoken Language Processing
MIT License
Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities