Daily Paper Machine

Tag: End-to-End Model

All the papers with the tag "End-to-End Model".

Voila: Voice-Language Foundation Models for Real-Time Autonomous Interaction and Voice Role-Play
grok-3-latest
Score: 0.68
Published: May 5, 2025 at 15:05
#Voice AI, #End-to-End Model, #Full-Duplex Interaction, #Speech Tokenization, #Multimodal Alignment
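This paper presents `Voila`, a family of voice-language foundation models that combines an end-to-end architecture, a hierarchical multi-scale Transformer, and interleaved text-audio alignment to achieve low-latency, autonomous, full-duplex voice interaction. It also supports efficient voice customization and multi-task processing, significantly improving the naturalness of human-machine interaction.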