High-Quality Text-to-Speech Implementation via Active Shallow Diffusion Mechanism
Denoising diffusion probabilistic models (DDPMs) have proven useful in text-to-speech (TTS) tasks; however, traditional diffusion models struggle with real-time processing because they require hundreds of iterative sampling steps. In this work, a two-st...
| Main Authors: | Junlin Deng, Ruihan Hou, Yan Deng, Yongqiu Long, Ning Wu |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | MDPI AG 2025-01-01 |
| Series: | Sensors |
| Subjects: | |
| Online Access: | https://www.mdpi.com/1424-8220/25/3/833 |
Similar Items
- Polish Speech and Text Emotion Recognition in a Multimodal Emotion Analysis System
  by: Kamil Skowroński, et al.
  Published: (2024-11-01)
- Speech Emotion Recognition on MELD and RAVDESS Datasets Using CNN
  by: Gheed T. Waleed, et al.
  Published: (2025-06-01)
- Research on Speech Enhancement Translation and Mel-Spectrogram Mapping Method for the Deaf Based on Pix2PixGANs
  by: Shaoting Zeng, et al.
  Published: (2025-01-01)
- MixDiff-TTS: Mixture Alignment and Diffusion Model for Text-to-Speech
  by: Yongqiu Long, et al.
  Published: (2025-04-01)
- ECE-TTS: A Zero-Shot Emotion Text-to-Speech Model with Simplified and Precise Control
  by: Shixiong Liang, et al.
  Published: (2025-05-01)