A Bitrate-Scalable Variational Recurrent Mel-Spectrogram Coder for Real-Time Resynthesis-Based Speech Coding
This paper introduces a method for real-time speech coding that combines a binary-latent-vector variational recurrent neural network for mel-spectrogram coding with a non-autoregressive convolutional vocoder for waveform reconstruction. To enable bitrate scalability, we propose a latent vector trunc...
| Main Authors: | Benjamin Stahl, Simon Windtner, Alois Sontacchi |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | IEEE, 2024-01-01 |
| Series: | IEEE Access |
| Online Access: | https://ieeexplore.ieee.org/document/10720741/ |
Similar Items
- Perceptual consequences of change in vocoded speech parameters for various reverberation conditions
  by: Szymon Drgas, et al.
  Published: (2014-01-01)
- Perception of vocoded speech in domestic dogs
  by: Amritha Mallikarjun, et al.
  Published: (2024-04-01)
- V2Coder: A Non-Autoregressive Vocoder Based on Hierarchical Variational Autoencoders
  by: Takato Fujimoto, et al.
  Published: (2025-01-01)
- Research on Speech Enhancement Translation and Mel-Spectrogram Mapping Method for the Deaf Based on Pix2PixGANs
  by: Shaoting Zeng, et al.
  Published: (2025-01-01)
- Coded speech enhancement using auxiliary utterance-level information
  by: Haixin Zhao, et al.
  Published: (2025-07-01)