Challenges and Perspectives in Interpretable Music Auto-Tagging Using Perceptual Features

In the era of music streaming platforms and recommendation systems, music auto-tagging has gained considerable traction and has motivated researchers to develop methods that focus on improving performance metrics on baseline datasets. The majority of recent approach...

Bibliographic Details
Main Authors: Vassilis Lyberatos, Spyridon Kantarelis, Edmund Dervakos, Giorgos Stamou
Format: Article
Language: English
Published: IEEE 2025-01-01
Series: IEEE Access
Subjects: Music understanding; explainable AI; perceptual musical features
Online Access: https://ieeexplore.ieee.org/document/10944805/
author Vassilis Lyberatos
Spyridon Kantarelis
Edmund Dervakos
Giorgos Stamou
author_facet Vassilis Lyberatos
Spyridon Kantarelis
Edmund Dervakos
Giorgos Stamou
author_sort Vassilis Lyberatos
collection DOAJ
description In the era of music streaming platforms and recommendation systems, music auto-tagging has gained considerable traction and has motivated researchers to develop methods that focus on improving performance metrics on baseline datasets. The majority of recent approaches rely on deep neural networks, which, despite their impressive performance, are opaque: it is difficult to explain their output for a given input. While the problem of interpretability has been highlighted in other domains, such as medicine, it has not been a priority for music-related tasks. In this work, we explored the usefulness of interpretability for music auto-tagging. We developed a pipeline that incorporates three types of information extraction procedures, 1) symbolic knowledge, 2) auxiliary deep neural networks, and 3) signal processing, to extract perceptual features from audio files; these features were then used to train an explainable machine learning model to predict tags. We experimented on three datasets: MTG-Jamendo, GTZAN, and MagnaTagATune. Our method outperforms baseline models in all tasks and in some cases is competitive with the state of the art. We conducted a human survey to evaluate user trust in our methodology and in a state-of-the-art model, concluding that while the state-of-the-art model offers better performance, there are use cases where the slight deterioration in accuracy is outweighed by the increased trust and value provided by interpretability.
format Article
id doaj-art-ce4c60a4efb94e3f9cb6c697de3def6f
institution DOAJ
issn 2169-3536
language English
publishDate 2025-01-01
publisher IEEE
record_format Article
series IEEE Access
spelling doaj-art-ce4c60a4efb94e3f9cb6c697de3def6f
2025-08-20T03:06:14Z
eng
IEEE
IEEE Access
2169-3536
2025-01-01
13
60720
60732
10.1109/ACCESS.2025.3555741
10944805
Challenges and Perspectives in Interpretable Music Auto-Tagging Using Perceptual Features
Vassilis Lyberatos 0 https://orcid.org/0000-0001-8957-0277
Spyridon Kantarelis 1
Edmund Dervakos 2
Giorgos Stamou 3
AI and Learning Systems Laboratory, School of Electrical and Computer Engineering, National Technical University of Athens, Athens, Greece
AI and Learning Systems Laboratory, School of Electrical and Computer Engineering, National Technical University of Athens, Athens, Greece
AI and Learning Systems Laboratory, School of Electrical and Computer Engineering, National Technical University of Athens, Athens, Greece
AI and Learning Systems Laboratory, School of Electrical and Computer Engineering, National Technical University of Athens, Athens, Greece
In the era of music streaming platforms and recommendation systems, music auto-tagging has gained considerable traction and has motivated researchers to develop methods that focus on improving performance metrics on baseline datasets. The majority of recent approaches rely on deep neural networks, which, despite their impressive performance, are opaque: it is difficult to explain their output for a given input. While the problem of interpretability has been highlighted in other domains, such as medicine, it has not been a priority for music-related tasks. In this work, we explored the usefulness of interpretability for music auto-tagging. We developed a pipeline that incorporates three types of information extraction procedures, 1) symbolic knowledge, 2) auxiliary deep neural networks, and 3) signal processing, to extract perceptual features from audio files; these features were then used to train an explainable machine learning model to predict tags. We experimented on three datasets: MTG-Jamendo, GTZAN, and MagnaTagATune. Our method outperforms baseline models in all tasks and in some cases is competitive with the state of the art. We conducted a human survey to evaluate user trust in our methodology and in a state-of-the-art model, concluding that while the state-of-the-art model offers better performance, there are use cases where the slight deterioration in accuracy is outweighed by the increased trust and value provided by interpretability.
https://ieeexplore.ieee.org/document/10944805/
Music understanding
explainable AI
perceptual musical features
spellingShingle Vassilis Lyberatos
Spyridon Kantarelis
Edmund Dervakos
Giorgos Stamou
Challenges and Perspectives in Interpretable Music Auto-Tagging Using Perceptual Features
IEEE Access
Music understanding
explainable AI
perceptual musical features
title Challenges and Perspectives in Interpretable Music Auto-Tagging Using Perceptual Features
title_full Challenges and Perspectives in Interpretable Music Auto-Tagging Using Perceptual Features
title_fullStr Challenges and Perspectives in Interpretable Music Auto-Tagging Using Perceptual Features
title_full_unstemmed Challenges and Perspectives in Interpretable Music Auto-Tagging Using Perceptual Features
title_short Challenges and Perspectives in Interpretable Music Auto-Tagging Using Perceptual Features
title_sort challenges and perspectives in interpretable music auto tagging using perceptual features
topic Music understanding
explainable AI
perceptual musical features
url https://ieeexplore.ieee.org/document/10944805/
work_keys_str_mv AT vassilislyberatos challengesandperspectivesininterpretablemusicautotaggingusingperceptualfeatures
AT spyridonkantarelis challengesandperspectivesininterpretablemusicautotaggingusingperceptualfeatures
AT edmunddervakos challengesandperspectivesininterpretablemusicautotaggingusingperceptualfeatures
AT giorgosstamou challengesandperspectivesininterpretablemusicautotaggingusingperceptualfeatures
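
Illustrative note: the description above outlines a pipeline in which perceptual features extracted from audio are used to train an explainable model that predicts tags. The following is only a minimal sketch of that general idea, not the authors' method. The feature names, the toy data, the "energetic" tag, and the choice of logistic regression are all assumptions made for illustration; the record does not state which features or which model the paper uses.

```python
import math

# Hypothetical perceptual feature names (assumption; not taken from the record).
FEATURES = ["tempo", "brightness", "vocal_presence"]

# Toy, hand-made examples scaled to [0, 1]; label 1 means the
# hypothetical tag "energetic" applies to the track.
TRACKS = [
    ([0.9, 0.8, 0.3], 1),
    ([0.8, 0.7, 0.6], 1),
    ([0.7, 0.9, 0.2], 1),
    ([0.2, 0.3, 0.7], 0),
    ([0.3, 0.2, 0.4], 0),
    ([0.1, 0.4, 0.8], 0),
]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict(weights, bias, x):
    """Probability that the tag applies, given feature vector x."""
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)


def train(tracks, epochs=2000, lr=0.5):
    """Fit a logistic regression by batch gradient descent."""
    weights = [0.0] * len(FEATURES)
    bias = 0.0
    n = len(tracks)
    for _ in range(epochs):
        grad_w = [0.0] * len(weights)
        grad_b = 0.0
        for x, y in tracks:
            err = predict(weights, bias, x) - y
            for i, xi in enumerate(x):
                grad_w[i] += err * xi
            grad_b += err
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias


def explain(weights, x):
    """Per-feature contribution to the logit, largest magnitude first."""
    contributions = [(name, w * xi) for name, w, xi in zip(FEATURES, weights, x)]
    return sorted(contributions, key=lambda c: abs(c[1]), reverse=True)


weights, bias = train(TRACKS)
new_track = [0.85, 0.75, 0.5]  # made-up feature values for an unseen track
probability = predict(weights, bias, new_track)
explanation = explain(weights, new_track)

print(f"P(energetic) = {probability:.2f}")
for name, contribution in explanation:
    print(f"  {name}: {contribution:+.2f}")
```

Because each prediction is a sum of weight-times-feature terms, the printed contributions show which named feature pushed the tag probability up or down. This is the kind of per-prediction explanation that an opaque deep network does not give directly.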