Learning to Maximize Speech Quality Directly Using MOS Prediction for Neural Text-to-Speech

Although recent neural text-to-speech (TTS) systems have achieved high-quality speech synthesis, there are cases where a TTS system generates low-quality speech, mainly caused by limited training data or information loss during knowledge distillation. Therefore, we propose a novel method to improve...

Bibliographic Details
Main Authors:	Yeunju Choi, Youngmoon Jung, Youngjoo Suh, Hoirin Kim
Format:	Article
Language:	English
Published:	IEEE 2022-01-01
Series:	IEEE Access
Subjects:	MOS prediction; neural TTS; perceptual loss; speech synthesis
Online Access:	https://ieeexplore.ieee.org/document/9775804/


Similar Items