Challenges and Perspectives in Interpretable Music Auto-Tagging Using Perceptual Features
In the era of music streaming platforms and recommendation systems, the automatic music auto-tagging task has gained a lot of traction and has motivated researchers to develop methods for solving the task focusing on improving performance metrics on baseline datasets. The majority of recent approach...
| Main Authors: | Vassilis Lyberatos, Spyridon Kantarelis, Edmund Dervakos, Giorgos Stamou |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | IEEE, 2025-01-01 |
| Series: | IEEE Access |
| Subjects: | Music understanding; explainable AI; perceptual musical features |
| Online Access: | https://ieeexplore.ieee.org/document/10944805/ |
| _version_ | 1849739575047487488 |
|---|---|
| author | Vassilis Lyberatos; Spyridon Kantarelis; Edmund Dervakos; Giorgos Stamou |
| author_sort | Vassilis Lyberatos |
| collection | DOAJ |
| description | In the era of music streaming platforms and recommendation systems, the automatic music auto-tagging task has gained a lot of traction and has motivated researchers to develop methods for solving the task focusing on improving performance metrics on baseline datasets. The majority of recent approaches rely on deep neural networks, which, despite their impressive performance, are opaque, meaning it is difficult to explain their output on a given input. While the problem of interpretability has been highlighted in other domains, such as medicine, it has not been a priority for music-related tasks. In this work, we explored the usefulness of interpretability for music auto-tagging. We developed a pipeline incorporating three types of information extraction procedures: 1) symbolic knowledge, 2) auxiliary deep neural networks, and 3) signal processing, to extract perceptual features of audio files, which were then used to train an explainable machine learning model to predict tags. We experimented on three datasets: the MTG-Jamendo dataset, the GTZAN dataset, and the MagnaTagATune dataset. Our method outperforms baseline models in all tasks and, in some cases, is competitive with the state-of-the-art. We conducted a human survey to evaluate user trust in our methodology and in a state-of-the-art model, concluding that while the state-of-the-art model offers better performance, there are use cases where the slight deterioration in accuracy is outweighed by the increased trust and value provided by interpretability. |
| format | Article |
| id | doaj-art-ce4c60a4efb94e3f9cb6c697de3def6f |
| institution | DOAJ |
| issn | 2169-3536 |
| language | English |
| publishDate | 2025-01-01 |
| publisher | IEEE |
| record_format | Article |
| series | IEEE Access |
| spelling | doaj-art-ce4c60a4efb94e3f9cb6c697de3def6f; 2025-08-20T03:06:14Z; eng; IEEE; IEEE Access; ISSN 2169-3536; 2025-01-01; vol. 13, pp. 60720-60732; DOI 10.1109/ACCESS.2025.3555741; document 10944805; Challenges and Perspectives in Interpretable Music Auto-Tagging Using Perceptual Features; Vassilis Lyberatos (https://orcid.org/0000-0001-8957-0277), Spyridon Kantarelis, Edmund Dervakos, Giorgos Stamou; all authors: AI and Learning Systems Laboratory, School of Electrical and Computer Engineering, National Technical University of Athens, Athens, Greece; https://ieeexplore.ieee.org/document/10944805/; Music understanding; explainable AI; perceptual musical features |
| title | Challenges and Perspectives in Interpretable Music Auto-Tagging Using Perceptual Features |
| title_sort | challenges and perspectives in interpretable music auto tagging using perceptual features |
| topic | Music understanding; explainable AI; perceptual musical features |
| url | https://ieeexplore.ieee.org/document/10944805/ |