Heterogeneous AI Music Generation Technology Integrating Fine-Grained Control

Bibliographic Details
Main Authors: Hongtao Wang, Li Gong
Format: Article
Language:English
Published: IEEE 2025-01-01
Series:IEEE Access
Subjects: Graph convolutional neural network; channel attention; emotion recognition; music generation; transformer
Online Access:https://ieeexplore.ieee.org/document/11096601/
collection DOAJ
description As artificial intelligence algorithms continue to advance, researchers have increasingly harnessed their capabilities to generate music that resonates with human emotions, offering a novel means of alleviating the escalating pressures of contemporary life. To tackle the persistent issue of low accuracy in current emotion recognition and music generation systems, an innovative approach was proposed that fused a graph convolutional neural network with a channel attention mechanism for emotion recognition. This integrated model was subsequently paired with a Transformer architecture, creating a sophisticated framework capable of fine-grained control and heterogeneous music generation. In comparing the performance of the emotion recognition model against other leading models, the results underscored its exceptional accuracy across five distinct electroencephalogram signal bands: 97.3%, 95.8%, 96.9%, 98.4%, and 97.6%, respectively. Crucially, all these accuracy metrics exceeded the 95% benchmark, clearly demonstrating superiority over the comparative models. Additionally, a rigorous performance assessment was conducted to evaluate the music generation model’s capabilities against alternative approaches. The findings revealed that the suggested model achieved an average mean square error of 0.27 and an average root mean square error of 0.24. These error rates were notably lower than those of the competing models, highlighting the enhanced precision and fidelity of the music generated. Together, these results validated the effectiveness of both the emotion recognition and music generation models developed in this research. This research not only propelled the existing frontiers of emotion detection and musical composition forward, but also laid a robust theoretical framework to facilitate subsequent investigations into the emerging field of emotion-aware music generation.
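The description above reports fusing a graph convolutional network with a channel attention mechanism for EEG-based emotion recognition. As an illustrative sketch only — the paper's actual architecture, dimensions, and weights are not given here, so every name and size below is hypothetical — one graph convolution over electrode nodes followed by a squeeze-and-excitation style gate over feature channels can be written as:

```python
import numpy as np

rng = np.random.default_rng(0)

def normalize_adjacency(A):
    # Symmetric GCN normalization: A_hat = D^{-1/2} (A + I) D^{-1/2}
    A = A + np.eye(A.shape[0])
    d_inv_sqrt = np.diag(1.0 / np.sqrt(A.sum(axis=1)))
    return d_inv_sqrt @ A @ d_inv_sqrt

def gcn_layer(A_hat, H, W):
    # One graph convolution: aggregate neighbor features, project, ReLU
    return np.maximum(A_hat @ H @ W, 0.0)

def channel_attention(H, W1, W2):
    # Squeeze-and-excitation style gate over feature channels
    z = H.mean(axis=0)                                          # squeeze: per-channel statistic
    s = 1.0 / (1.0 + np.exp(-(np.maximum(z @ W1, 0.0) @ W2)))   # excitation: sigmoid gate in (0, 1)
    return H * s                                                # reweight each feature channel

# Toy EEG graph: 8 electrodes (graph nodes), 16 input features each
n_nodes, f_in, f_out = 8, 16, 32
A = (rng.random((n_nodes, n_nodes)) > 0.7).astype(float)
A = np.maximum(A, A.T)                                          # symmetric connectivity
H = rng.standard_normal((n_nodes, f_in))
W = rng.standard_normal((f_in, f_out)) * 0.1
W1 = rng.standard_normal((f_out, f_out // 4)) * 0.1             # bottleneck reduction
W2 = rng.standard_normal((f_out // 4, f_out)) * 0.1

out = channel_attention(gcn_layer(normalize_adjacency(A), H, W), W1, W2)
print(out.shape)  # (8, 32)
```

In this general pattern the graph encodes spatial relationships between electrodes, while the channel gate learns which feature channels matter per emotion class; the fused representation would then condition a downstream Transformer decoder for generation.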
format Article
id doaj-art-a60a9b04e22c4f62b65604f35ce0ce57
institution Kabale University
issn 2169-3536
language English
publishDate 2025-01-01
publisher IEEE
record_format Article
series IEEE Access
spelling doaj-art-a60a9b04e22c4f62b65604f35ce0ce57
updated 2025-08-20T03:32:10Z
doi 10.1109/ACCESS.2025.3592699
ieee document id 11096601
volume 13, pages 132870-132883 (2025-01-01)
Hongtao Wang, School of Music and Dance, Guangzhou University, Guangzhou, China
Li Gong (ORCID 0009-0000-9571-7669), School of Music and Dance, Guangzhou University, Guangzhou, China
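The generation results are reported as averages over separate evaluations (average MSE 0.27, average RMSE 0.24), so the two averages need not satisfy RMSE = sqrt(MSE). For a single set of predictions the metrics relate as below; the numbers here are made up for illustration:

```python
import numpy as np

def mse(y_true, y_pred):
    # Mean square error: average of squared residuals
    return float(np.mean((np.asarray(y_true) - np.asarray(y_pred)) ** 2))

def rmse(y_true, y_pred):
    # Root mean square error: sqrt of MSE, in the units of the target
    return float(np.sqrt(mse(y_true, y_pred)))

y_true = [0.0, 1.0, 0.5, 0.2]
y_pred = [0.1, 0.8, 0.5, 0.4]
print(round(mse(y_true, y_pred), 4))   # 0.0225
print(round(rmse(y_true, y_pred), 2))  # 0.15
```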
title Heterogeneous AI Music Generation Technology Integrating Fine-Grained Control
topic Graph convolutional neural network
channel attention
emotion recognition
music generation
transformer