The AI Music Arms Race: On the Detection of AI-Generated Music

Several companies now offer platforms for users to create music at unprecedented scales by textual prompting. As the quality of this music rises, concern grows about how to differentiate AI‑generated music from human‑made music, with implications for content identification, copyright enforcement, an...

Bibliographic Details
Main Authors: Laura Cros Vila, Bob L. T. Sturm, Luca Casini, David Dalmazzo
Format: Article
Language: English
Published: Ubiquity Press 2025-06-01
Series: Transactions of the International Society for Music Information Retrieval
Subjects: ai music detection; ai music; generative ai; suno; udio
Online Access: https://account.transactions.ismir.net/index.php/up-j-tismir/article/view/254
author Laura Cros Vila
Bob L. T. Sturm
Luca Casini
David Dalmazzo
collection DOAJ
description Several companies now offer platforms for users to create music at unprecedented scales by textual prompting. As the quality of this music rises, concern grows about how to differentiate AI‑generated music from human‑made music, with implications for content identification, copyright enforcement, and music recommendation systems. This article explores the detection of AI‑generated music by assembling and studying a large dataset of music audio recordings (30,000 full tracks totaling 1,770 h, 33 m, and 31 s in duration), of which 10,000 are from the Million Song Dataset (Bertin‑Mahieux et al., 2011) and 20,000 are generated and released by users of two popular AI music platforms: Suno and Udio. We build and evaluate several AI music detectors operating on Contrastive Language–Audio Pretraining embeddings of the music audio, then compare them to a commercial baseline system as well as an open‑source one. We applied various audio transformations to see their impacts on detector performance and found that the commercial baseline system is easily fooled by simply resampling audio to 22.05 kHz. We argue that careful consideration needs to be given to the experimental design underlying work in this area, as well as the very definition of ‘AI music.’ We release all our code at https://github.com/lcrosvila/ai-music-detection.
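The description reports that resampling audio to 22.05 kHz is enough to fool the commercial baseline. As a rough illustration of what such a transformation involves, the following is a minimal sketch that halves the sample rate of a signal, assuming the input is mono audio at 44.1 kHz held as a list of floats. It is not the authors' code (that is in the repository linked above); the function names `lowpass_taps` and `downsample_by_two`, the filter length, and the cutoff are all hypothetical choices made for this example. A real pipeline would more likely call a library resampler.

```python
import math


def lowpass_taps(num_taps: int, cutoff: float) -> list[float]:
    """Hamming-windowed sinc low-pass filter.

    cutoff is a fraction of the INPUT sample rate (0 < cutoff < 0.5).
    """
    mid = (num_taps - 1) / 2
    taps = []
    for n in range(num_taps):
        x = n - mid
        # Ideal low-pass impulse response (sinc), with the x == 0 limit handled.
        ideal = 2 * cutoff if x == 0 else math.sin(2 * math.pi * cutoff * x) / (math.pi * x)
        window = 0.54 - 0.46 * math.cos(2 * math.pi * n / (num_taps - 1))
        taps.append(ideal * window)
    total = sum(taps)
    return [t / total for t in taps]  # normalise to unity gain at 0 Hz


def downsample_by_two(samples: list[float], num_taps: int = 63) -> list[float]:
    """Halve the sample rate (e.g. 44.1 kHz -> 22.05 kHz).

    Low-pass below the new Nyquist frequency, then keep every other sample.
    Samples outside the signal are treated as zero.
    """
    # 0.23 of the input rate sits just under the new Nyquist (0.25); assumed value.
    taps = lowpass_taps(num_taps, cutoff=0.23)
    half = num_taps // 2
    out = []
    for i in range(0, len(samples), 2):
        acc = 0.0
        for k, t in enumerate(taps):
            j = i + k - half
            if 0 <= j < len(samples):
                acc += t * samples[j]
        out.append(acc)
    return out


# Usage: a 0.1 s test tone at 1 kHz, sampled at an assumed 44.1 kHz.
sample_rate = 44100
tone = [math.sin(2 * math.pi * 1000 * n / sample_rate) for n in range(4410)]
resampled = downsample_by_two(tone)
print(len(tone), "->", len(resampled), "samples")
```

Content above roughly 11 kHz is removed by this step, so a detector that leans on high-frequency cues could plausibly behave differently on the resampled audio; whether that explains the reported result is not stated in the description.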
format Article
id doaj-art-7aaa25230f814a06bb6e4090b725a665
institution DOAJ
issn 2514-3298
language English
publishDate 2025-06-01
publisher Ubiquity Press
record_format Article
series Transactions of the International Society for Music Information Retrieval
spelling doaj-art-7aaa25230f814a06bb6e4090b725a665, 2025-08-20T03:12:54Z, eng
Ubiquity Press, Transactions of the International Society for Music Information Retrieval, ISSN 2514-3298, 2025-06-01, vol. 8, no. 1, pp. 179–194
DOI 10.5334/tismir.254
Laura Cros Vila, https://orcid.org/0000-0003-1098-6873, KTH Royal Institute of Technology, Stockholm
Bob L. T. Sturm, https://orcid.org/0000-0003-2549-6367, KTH Royal Institute of Technology, Stockholm
Luca Casini, https://orcid.org/0000-0002-3468-6974, KTH Royal Institute of Technology, Stockholm
David Dalmazzo, https://orcid.org/0000-0002-3262-4091, KTH Royal Institute of Technology, Stockholm
title The AI Music Arms Race: On the Detection of AI-Generated Music
topic ai music detection
ai music
generative ai
suno
udio