Heterogeneous AI Music Generation Technology Integrating Fine-Grained Control
As artificial intelligence algorithms continue to advance, researchers have increasingly harnessed their capabilities to generate music that resonates with human emotions, offering a novel means of alleviating the escalating pressures of contemporary life. To tackle the persistent issue of low accuracy in current emotion recognition and music generation systems, an innovative approach was proposed that fused a graph convolutional neural network with a channel attention mechanism for emotion recognition. This integrated model was subsequently paired with a Transformer architecture, creating a sophisticated framework capable of fine-grained control and heterogeneous music generation. In comparing the performance of the emotion recognition model against other leading models, the results underscored its exceptional accuracy across five distinct electroencephalogram signal bands: 97.3%, 95.8%, 96.9%, 98.4%, and 97.6%, respectively. Crucially, all these accuracy metrics exceeded the 95% benchmark, clearly demonstrating superiority over the comparative models. Additionally, a rigorous performance assessment was conducted to evaluate the music generation model’s capabilities against alternative approaches. The findings revealed that the suggested model achieved an average mean square error of 0.27 and an average root mean square error of 0.24. These error rates were notably lower than those of the competing models, highlighting the enhanced precision and fidelity of the music generated. Together, these results validated the effectiveness of both the emotion recognition and music generation models developed in this research. This research not only propelled the existing frontiers of emotion detection and musical composition forward, but also laid a robust theoretical framework to facilitate subsequent investigations into the emerging field of emotion-aware music generation.
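The abstract describes fusing a channel attention mechanism with a graph convolutional network over multi-band EEG features. As a rough illustration of the channel-attention idea alone, here is a squeeze-and-excitation-style gate in NumPy; this is not the authors' implementation, and the array shapes, random weights, and two-layer gating MLP are all assumptions made for the sketch:

```python
import numpy as np

def channel_attention(x, w1, w2):
    """Reweight EEG channels with a learned per-channel gate (illustrative)."""
    # x: (channels, features) -- e.g. one feature per EEG frequency band
    s = x.mean(axis=1)                    # squeeze: one descriptor per channel
    h = np.maximum(w1 @ s, 0.0)           # excitation MLP, ReLU hidden layer
    a = 1.0 / (1.0 + np.exp(-(w2 @ h)))   # sigmoid gate in (0, 1) per channel
    return x * a[:, None]                 # scale each channel's features

rng = np.random.default_rng(0)
x = rng.normal(size=(32, 5))              # 32 electrode channels, 5 bands (assumed)
w1 = rng.normal(size=(8, 32))             # hypothetical learned weights
w2 = rng.normal(size=(32, 8))
y = channel_attention(x, w1, w2)          # same shape as x, channels reweighted
```

Because the gate lies in (0, 1), each channel's features are attenuated rather than amplified here; in the paper's pipeline such channel weights would feed the graph convolution stage, letting informative electrodes dominate the emotion-recognition features.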
| Main Authors: | Hongtao Wang, Li Gong |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | IEEE, 2025-01-01 |
| Series: | IEEE Access |
| Subjects: | Graph convolutional neural network, channel attention, emotion recognition, music generation, transformer |
| Online Access: | https://ieeexplore.ieee.org/document/11096601/ |
| Field | Value |
|---|---|
| author | Hongtao Wang, Li Gong |
| collection | DOAJ |
| description | As artificial intelligence algorithms continue to advance, researchers have increasingly harnessed their capabilities to generate music that resonates with human emotions, offering a novel means of alleviating the escalating pressures of contemporary life. To tackle the persistent issue of low accuracy in current emotion recognition and music generation systems, an innovative approach was proposed that fused a graph convolutional neural network with a channel attention mechanism for emotion recognition. This integrated model was subsequently paired with a Transformer architecture, creating a sophisticated framework capable of fine-grained control and heterogeneous music generation. In comparing the performance of the emotion recognition model against other leading models, the results underscored its exceptional accuracy across five distinct electroencephalogram signal bands: 97.3%, 95.8%, 96.9%, 98.4%, and 97.6%, respectively. Crucially, all these accuracy metrics exceeded the 95% benchmark, clearly demonstrating superiority over the comparative models. Additionally, a rigorous performance assessment was conducted to evaluate the music generation model’s capabilities against alternative approaches. The findings revealed that the suggested model achieved an average mean square error of 0.27 and an average root mean square error of 0.24. These error rates were notably lower than those of the competing models, highlighting the enhanced precision and fidelity of the music generated. Together, these results validated the effectiveness of both the emotion recognition and music generation models developed in this research. This research not only propelled the existing frontiers of emotion detection and musical composition forward, but also laid a robust theoretical framework to facilitate subsequent investigations into the emerging field of emotion-aware music generation. |
| format | Article |
| id | doaj-art-a60a9b04e22c4f62b65604f35ce0ce57 |
| institution | Kabale University |
| issn | 2169-3536 |
| language | English |
| publishDate | 2025-01-01 |
| publisher | IEEE |
| record_format | Article |
| series | IEEE Access |
| spelling | doaj-art-a60a9b04e22c4f62b65604f35ce0ce57; indexed 2025-08-20T03:32:10Z; eng; IEEE; IEEE Access; ISSN 2169-3536; published 2025-01-01; vol. 13, pp. 132870-132883; DOI 10.1109/ACCESS.2025.3592699; IEEE document 11096601; Heterogeneous AI Music Generation Technology Integrating Fine-Grained Control; Hongtao Wang, Li Gong (ORCID https://orcid.org/0009-0000-9571-7669); School of Music and Dance, Guangzhou University, Guangzhou, China; https://ieeexplore.ieee.org/document/11096601/ |
| title | Heterogeneous AI Music Generation Technology Integrating Fine-Grained Control |
| topic | Graph convolutional neural network, channel attention, emotion recognition, music generation, transformer |
| url | https://ieeexplore.ieee.org/document/11096601/ |