Music source feature extraction based on improved attention mechanism and phase feature

Music source feature extraction is an important research direction in music information retrieval and music recommendation systems. To extract the features of music sources more effectively, the study introduces the jump attention mechanism and combines it with the convolutional attention module. Als...
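The full description in this record reports separation quality as median distortion ratios in dB. The sketch below illustrates how such a figure could be computed. It is an assumption for illustration only: it uses a simplified energy-ratio definition rather than the full BSS Eval decomposition, the article's exact evaluation procedure is not stated in this record, and the names sdr_db and median_sdr_db are hypothetical.

```python
import math
from statistics import median


def sdr_db(reference, estimate):
    """Simplified distortion ratio in dB: reference energy over error energy.

    Assumption: this is NOT the full BSS Eval definition, only an
    energy-ratio approximation for illustration.
    """
    signal_energy = sum(r * r for r in reference)
    error_energy = sum((r - e) ** 2 for r, e in zip(reference, estimate))
    if signal_energy == 0.0:
        raise ValueError("reference signal is silent; ratio is undefined")
    if error_energy == 0.0:
        return math.inf  # perfect reconstruction
    return 10.0 * math.log10(signal_energy / error_energy)


def median_sdr_db(references, estimates):
    """Median of the per-track ratios, as typically reported per dataset."""
    return median(sdr_db(r, e) for r, e in zip(references, estimates))


# Toy data (hypothetical): one 220 Hz tone and three scaled "estimates".
sample_rate = 8000
clean = [math.sin(2 * math.pi * 220 * n / sample_rate) for n in range(sample_rate)]
references = [clean, clean, clean]
estimates = [
    [0.9 * x for x in clean],   # 10% amplitude error
    [0.5 * x for x in clean],   # 50% amplitude error
    [0.99 * x for x in clean],  # 1% amplitude error
]

median_score = median_sdr_db(references, estimates)
print(f"median ratio: {median_score:.2f} dB")
```

On this toy data the three per-track values are roughly 20 dB, 6 dB, and 40 dB, so the median is about 20 dB. A difference such as the 2.91 dB improvement quoted in the description would be a difference between two such medians.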

Bibliographic Details
Main Author:	Weina Yu
Format:	Article
Language:	English
Published:	Elsevier 2024-12-01
Series:	Systems and Soft Computing
Subjects:	Improved attention mechanism; Phase feature; Music source; Feature extraction; SAM
Online Access:	http://www.sciencedirect.com/science/article/pii/S2772941924000784

_version_	1850132236751339520
author	Weina Yu
author_facet	Weina Yu
author_sort	Weina Yu
collection	DOAJ
description	Music source feature extraction is an important research direction in music information retrieval and music recommendation systems. To extract the features of music sources more effectively, the study introduces the jump attention mechanism and combines it with the convolutional attention module. A feature extraction module based on UNet++ and a spatial attention module is also proposed. In addition, the phase feature information of the mixed music signals is used to improve network performance. Results show that the model performs well in music source separation experiments on vocals and accompaniment. For vocal separation on the MIR-1K dataset, the model achieves 11.25 dB, 17.34 dB, and 13.83 dB on the respective evaluation metrics. For drum separation on the DSD100 dataset, the model achieves a median signal-to-source distortion ratio of 4.36 dB, which is 2.91 dB better than that of the Spectral Hierarchical Network model. For the separation of bass and the human voice, the model's median distortion ratios reach 4.87 dB and 6.09 dB, respectively, which is also better than the Spectral Hierarchical Network model. This indicates that the model has significant performance advantages in feature extraction and separation of music sources, and that it has important application value in music production and speech recognition.
format	Article
id	doaj-art-0719fe365b9e4be282f9944862c36b95
institution	OA Journals
issn	2772-9419
language	English
publishDate	2024-12-01
publisher	Elsevier
record_format	Article
series	Systems and Soft Computing
spelling	doaj-art-0719fe365b9e4be282f9944862c36b95 | 2025-08-20T02:32:15Z | eng | Elsevier | Systems and Soft Computing | 2772-9419 | 2024-12-01 | 6 | 200149 | 10.1016/j.sasc.2024.200149 | Music source feature extraction based on improved attention mechanism and phase feature | Weina Yu (Music and Dance Post-doctoral Research Mobile Station, Nanjing University of the Arts, Nan'jing 210013, China) | Music source feature extraction is an important research direction in music information retrieval and music recommendation systems. To extract the features of music sources more effectively, the study introduces the jump attention mechanism and combines it with the convolutional attention module. A feature extraction module based on UNet++ and a spatial attention module is also proposed. In addition, the phase feature information of the mixed music signals is used to improve network performance. Results show that the model performs well in music source separation experiments on vocals and accompaniment. For vocal separation on the MIR-1K dataset, the model achieves 11.25 dB, 17.34 dB, and 13.83 dB on the respective evaluation metrics. For drum separation on the DSD100 dataset, the model achieves a median signal-to-source distortion ratio of 4.36 dB, which is 2.91 dB better than that of the Spectral Hierarchical Network model. For the separation of bass and the human voice, the model's median distortion ratios reach 4.87 dB and 6.09 dB, respectively, which is also better than the Spectral Hierarchical Network model. This indicates that the model has significant performance advantages in feature extraction and separation of music sources, and that it has important application value in music production and speech recognition. | http://www.sciencedirect.com/science/article/pii/S2772941924000784 | Improved attention mechanism; Phase feature; Music source; Feature extraction; SAM
spellingShingle	Weina Yu Music source feature extraction based on improved attention mechanism and phase feature Systems and Soft Computing Improved attention mechanism Phase feature Music source Feature extraction SAM
title	Music source feature extraction based on improved attention mechanism and phase feature
title_full	Music source feature extraction based on improved attention mechanism and phase feature
title_fullStr	Music source feature extraction based on improved attention mechanism and phase feature
title_full_unstemmed	Music source feature extraction based on improved attention mechanism and phase feature
title_short	Music source feature extraction based on improved attention mechanism and phase feature
title_sort	music source feature extraction based on improved attention mechanism and phase feature
topic	Improved attention mechanism; Phase feature; Music source; Feature extraction; SAM
url	http://www.sciencedirect.com/science/article/pii/S2772941924000784
work_keys_str_mv	AT weinayu musicsourcefeatureextractionbasedonimprovedattentionmechanismandphasefeature

Similar Items