Advancements in End-to-End Audio Style Transformation: A Differentiable Approach for Voice Conversion and Musical Style Transfer

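Introduction: This study introduces a fully differentiable, end-to-end audio transformation network designed to overcome these limitations by operating directly on acoustic features. Methods: The proposed method employs an encoder–decoder architecture with a global conditioning mechanism. It eliminates the need for parallel utterances, intermediate phonetic representations, and speaker-independent ASR systems. The system is evaluated on tasks of voice conversion and musical style transfer using subjective and objective metrics. Results: Experimental results demonstrate the model’s efficacy, achieving competitive performance in both seen and unseen target scenarios. The proposed framework outperforms seven existing systems for audio transformation and aligns closely with state-of-the-art methods. Conclusion: This approach simplifies feature engineering, ensures vocabulary independence, and broadens the applicability of audio transformations across diverse domains, such as personalized voice assistants and musical experimentation.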

Bibliographic Details
Main Authors:	Shashwat Aggarwal, Shashwat Uttam, Sameer Garg, Shubham Garg, Kopal Jain, Swati Aggarwal
Format:	Article
Language:	English
Published:	MDPI AG 2025-01-01
Series:	AI
Subjects:	voice conversion; musical style transfer; audio transformations; end-to-end audio pipeline
Online Access:	https://www.mdpi.com/2673-2688/6/1/16

_version_	1832589377438482432
author	Shashwat Aggarwal, Shashwat Uttam, Sameer Garg, Shubham Garg, Kopal Jain, Swati Aggarwal
author_facet	Shashwat Aggarwal, Shashwat Uttam, Sameer Garg, Shubham Garg, Kopal Jain, Swati Aggarwal
author_sort	Shashwat Aggarwal
collection	DOAJ
description	Introduction: This study introduces a fully differentiable, end-to-end audio transformation network designed to overcome these limitations by operating directly on acoustic features. Methods: The proposed method employs an encoder–decoder architecture with a global conditioning mechanism. It eliminates the need for parallel utterances, intermediate phonetic representations, and speaker-independent ASR systems. The system is evaluated on tasks of voice conversion and musical style transfer using subjective and objective metrics. Results: Experimental results demonstrate the model’s efficacy, achieving competitive performance in both seen and unseen target scenarios. The proposed framework outperforms seven existing systems for audio transformation and aligns closely with state-of-the-art methods. Conclusion: This approach simplifies feature engineering, ensures vocabulary independence, and broadens the applicability of audio transformations across diverse domains, such as personalized voice assistants and musical experimentation.
format	Article
id	doaj-art-100c547430254b348b246d035f28f4fa
institution	Kabale University
issn	2673-2688
language	English
publishDate	2025-01-01
publisher	MDPI AG
record_format	Article
series	AI
spelling	doaj-art-100c547430254b348b246d035f28f4fa | 2025-01-24T13:17:24Z | eng | MDPI AG | AI | 2673-2688 | 2025-01-01 | 6 | 1 | 16 | 10.3390/ai6010016 | Advancements in End-to-End Audio Style Transformation: A Differentiable Approach for Voice Conversion and Musical Style Transfer | Shashwat Aggarwal (0) | Shashwat Uttam (1) | Sameer Garg (2) | Shubham Garg (3) | Kopal Jain (4) | Swati Aggarwal (5) | Department of Computer Science and Engineering, Netaji Subhas University of Technology, New Delhi 110078, India | Department of Computer Science and Engineering, Netaji Subhas University of Technology, New Delhi 110078, India | Department of Computer Science and Engineering, Netaji Subhas University of Technology, New Delhi 110078, India | Department of Computer Science and Engineering, Netaji Subhas University of Technology, New Delhi 110078, India | Department of Electrical Engineering, Indian Institute of Technology, Kharagpur 721302, India | Faculty of Logistics, Molde University College, 6410 Molde, Norway | Introduction: This study introduces a fully differentiable, end-to-end audio transformation network designed to overcome these limitations by operating directly on acoustic features. Methods: The proposed method employs an encoder–decoder architecture with a global conditioning mechanism. It eliminates the need for parallel utterances, intermediate phonetic representations, and speaker-independent ASR systems. The system is evaluated on tasks of voice conversion and musical style transfer using subjective and objective metrics. Results: Experimental results demonstrate the model’s efficacy, achieving competitive performance in both seen and unseen target scenarios. The proposed framework outperforms seven existing systems for audio transformation and aligns closely with state-of-the-art methods. Conclusion: This approach simplifies feature engineering, ensures vocabulary independence, and broadens the applicability of audio transformations across diverse domains, such as personalized voice assistants and musical experimentation. | https://www.mdpi.com/2673-2688/6/1/16 | voice conversion | musical style transfer | audio transformations | end-to-end audio pipeline
spellingShingle	Shashwat Aggarwal Shashwat Uttam Sameer Garg Shubham Garg Kopal Jain Swati Aggarwal Advancements in End-to-End Audio Style Transformation: A Differentiable Approach for Voice Conversion and Musical Style Transfer AI voice conversion musical style transfer audio transformations end-to-end audio pipeline
title	Advancements in End-to-End Audio Style Transformation: A Differentiable Approach for Voice Conversion and Musical Style Transfer
title_full	Advancements in End-to-End Audio Style Transformation: A Differentiable Approach for Voice Conversion and Musical Style Transfer
title_fullStr	Advancements in End-to-End Audio Style Transformation: A Differentiable Approach for Voice Conversion and Musical Style Transfer
title_full_unstemmed	Advancements in End-to-End Audio Style Transformation: A Differentiable Approach for Voice Conversion and Musical Style Transfer
title_short	Advancements in End-to-End Audio Style Transformation: A Differentiable Approach for Voice Conversion and Musical Style Transfer
title_sort	advancements in end to end audio style transformation a differentiable approach for voice conversion and musical style transfer
topic	voice conversion; musical style transfer; audio transformations; end-to-end audio pipeline
url	https://www.mdpi.com/2673-2688/6/1/16

Similar Items