Indonesian Voice Cloning Text-to-Speech System With Vall-E-Based Model and Speech Enhancement

In recent years, Text-to-Speech (TTS) technology has advanced, with research focusing on multi-speaker TTS capable of voice cloning. In 2023, Wang et al. introduced Vall-E, a Transformer-based neural codec language model, achieving state-of-the-art results in voice cloning. However, limited research...

Bibliographic Details
Main Authors:	Hizkia Raditya Pratama Roosadi, Rizki Rivai Ginanjar, Dessi Puji Lestari
Format:	Article
Language:	English
Published:	IEEE 2024-01-01
Series:	IEEE Access
Subjects:	Neural codec language model; speech enhancement; transformer; text-to-speech; Vall-E; voice cloning
Online Access:	https://ieeexplore.ieee.org/document/10806715/

Similar Items

Speech Emotion Recognition Based on Voice Fundamental Frequency
by: Teodora DIMITROVA-GREKOW, et al.
Published: (2019-04-01)

Coded speech enhancement using auxiliary utterance-level information
by: Haixin Zhao, et al.
Published: (2025-07-01)

The Effectiveness of Lee Silverman Voice Treatment (LSVT LOUD) on Children’s Speech and Voice: A Scoping Review
by: Angelos Papadopoulos, et al.
Published: (2024-09-01)

Voice, speech and gender:
by: Erwan Pépiot
Published: (2015-06-01)

A Bitrate-Scalable Variational Recurrent Mel-Spectrogram Coder for Real-Time Resynthesis-Based Speech Coding
by: Benjamin Stahl, et al.
Published: (2024-01-01)

Phonetic characteristics of spontaneous speech in a total laryngectomized Italian speaker: Perspectives for speech enhancement algorithms
by: Chiara Meluzzi, et al.
Published: (2022-03-01)

Assessment of the Speech Material Usability for Forensic Speaker Identification by Voice and Sounding Speech
by: T. N. Svirava, et al.
Published: (2025-04-01)

Speech-dependent data augmentation for own voice reconstruction with hearable microphones in noisy environments
by: Mattes Ohlenbusch, et al.
Published: (2025-07-01)

Perception and social evaluation of cloned and recorded voices: Effects of familiarity and self-relevance
by: Victor Rosi, et al.
Published: (2025-05-01)

Accent conversion method with real-time voice cloning based on a nonautoregressive neural network model
by: V. A. Nechaev, et al.
Published: (2025-06-01)

Using casual speech phonology in synthetic speech
by: Linda SHOCKEY
Published: (2014-04-01)

Laryngostroboscopy and voice evaluation in adult patients with Parkinson’s disease
by: Andréa de Carvalho Anacleto Ferrari Castro, et al.
Published: (2025-07-01)

Advances in Automated Voice Pathology Detection: A Comprehensive Review of Speech Signal Analysis Techniques
by: Anitha Sankaran, et al.
Published: (2024-01-01)

Speech Analysis as a Tool for Detection and Monitoring of Medical Conditions: A review
by: Magdalena IGRAS-CYBULSKA, et al.
Published: (2023-08-01)

NADIEM MAKARIM’S FIRST SPEECH AS THE MINISTER OF INDONESIA EDUCATION AND CULTURE: SPEECH ACT ANALYSIS
by: Intan Siti Nugraha, et al.
Published: (2022-04-01)

How Much Voicing in Voiced Geminates? The Laryngeal Voicing Profile of Polish Double Stops
by: Arkadiusz ROJCZYK, et al.
Published: (2024-05-01)

Lombard Effect in Polish Speech and its Comparison in English Speech
by: Piotr KLECZKOWSKI, et al.
Published: (2017-11-01)

Intensive Speech Therapy for Hypokinetic Dysarthria in Parkinson’s Disease: Targeting the Five Subsystems of Speech Production with Clinical and Instrumental Evaluation
by: Annalisa Gison, et al.
Published: (2025-01-01)

Multi-dialectical Languages Effect On Speech Recognition Too Much Choice Can Hurt
by: Mohamed G. Elfeky, et al.
Published: (2016-05-01)

Intelligibility and Recognition of Announcer's Speech during Electric Acoustic Conversions in a Transformer
by: A. Y. Shafranov, et al.
Published: (2022-06-01)

Voice rehabilitation with voice prosthesis post-laryngectomy: IPO-LFG ENT department expertise
by: Gustavo Pedrosa Rocha, et al.
Published: (2024-03-01)

Semantic component of speech development of older preschoolers in the process of speech education
by: Yuri A. Kochetkov, et al.
Published: (2025-02-01)

Perception of vocoded speech in domestic dogs
by: Amritha Mallikarjun, et al.
Published: (2024-04-01)

An Automatic Voice Test Method of HMI for Train with Perceptual Evaluation of Speech Quality
by: GAO Feng, et al.
Published: (2021-01-01)

HILLARY CLINTON’S CONCESSION SPEECH : A CONSTRUCTIVISM STUDY OF DISCOURSE
by: Khairunnisa Khairunnisa, et al.
Published: (2018-01-01)

ZeST: A Zero-Resourced Speech-to-Speech Translation Approach for Unknown, Unpaired, and Untranscribed Languages
by: Luan Thanh Nguyen, et al.
Published: (2025-01-01)

The Influence of the Semantic Material on the Assessment of Speech Reception Threshold
by: Magdalena KRENZ, et al.
Published: (2015-01-01)

Leveraging Artificial Neural Networks for Real-Time Speech Recognition in Voice-Activated Systems
by: Kumar V Suresh, et al.
Published: (2025-01-01)

Formant structure of the voice during the intensive acute hypoxia
by: Obrenović Joviša M., et al.
Published: (2003-01-01)

THE ONTOGENESIS OF SPEECH DEVELOPMENT
by: T. E. Braudo, et al.
Published: (2017-04-01)

Quality assessment of synthetic speech
by: Stefan Brachmański, et al.
Published: (2025-07-01)

Pedagogy of live speech
by: Magdalena Ostolska
Published: (2024-06-01)

Building Text‐to‐Speech Models for Low‐Resourced Languages From Crowdsourced Data
by: Andrew Katumba, et al.
Published: (2025-04-01)

On-Device System for Device Directed Speech Detection for Improving Human Computer Interaction
by: Abhishek Singh, et al.
Published: (2021-01-01)

Multi‐stage attention network for monaural speech enhancement
by: Kunpeng Wang, et al.
Published: (2023-03-01)

Assessing Costa Rican children speech recognition by humans and machines
by: Maribel Morales-Rodríguez, et al.
Published: (2022-11-01)

Speech Emotion Recognition: Humans vs Machines
by: S. Werner, et al.
Published: (2019-12-01)

Le “Written Speech” yeatsien et ses expressions scéniques
by: Pierre Longuenesse
Published: (2013-06-01)

INFRASTRUCTURE DEVELOPMENT: POLITENESS STRATEGY IN THE SPEECH OF THE INDONESIAN PRESIDENT, JOKOWI.
by: Ema Eliyana
Published: (2023-12-01)

Automation of subjective measurements of speech intelligibility in analogue telecommunication channels
by: Stefan BRACHMAŃSKI
Published: (2008-01-01)