Tag: End-to-End Model
All the papers with the tag "End-to-End Model".
Voila: Voice-Language Foundation Models for Real-Time Autonomous Interaction and Voice Role-Play
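grok-3-latest Score: 0.68 Published: at 15:05
This paper presents `Voila`, a family of voice-language foundation models that uses an end-to-end architecture, a hierarchical multi-scale Transformer, and interleaved text-audio alignment to achieve low-latency, autonomous, full-duplex voice interaction. It also supports efficient voice customization and multi-task processing, significantly improving the naturalness of human-machine interaction.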