Music Similarity Detection Through Comparative Imagery Data

Bibliographic Details
Main Authors: Asli Saner, Min Chen
Format: Article
Language: English
Published: MDPI AG 2025-07-01
Series: Applied Sciences
Online Access: https://www.mdpi.com/2076-3417/15/14/7706
Description
Summary: In music, plagiarism has long been an important but contentious issue, and it has become ever more critical with the widespread use of generative AI tools. Meanwhile, the development of techniques for music similarity detection has been hampered by the scarcity of legally verified data on plagiarism. In this paper, we present a technical solution for training music similarity detection models through the use of comparative imagery data. With the aid of feature-based analysis and data visualization, we conducted experiments to analyze how different music features may contribute to the judgment of plagiarism. The feature-based analysis guided us to focus on a subset of features whose similarity is typically associated with music plagiarism, while data visualization inspired us to train machine learning models on such comparative imagery instead of on audio signals directly. We trained feature-based sub-models (convolutional neural networks) on imagery data, together with an ensemble model with a Bayesian interpretation that combines the predictions of the sub-models. We tested the trained model with legally verified data as well as AI-generated music, confirming that models produced with our approach can detect similarity patterns that are typically associated with music plagiarism. Furthermore, using imagery data as the input and output of an ML model was found to facilitate explainable AI.
ISSN: 2076-3417
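
Illustrative note: the summary mentions an ensemble model with a Bayesian interpretation that combines the predictions of feature-based sub-models, but this record does not give its formulation. The sketch below is therefore not the authors' method. It shows one common way such a combination could work, a naive Bayes style product rule, under the assumptions that each sub-model outputs a probability that a pair of pieces is similar, that the sub-models are conditionally independent, and that they share the same prior. The function name `combine_predictions` and the parameters `prior` and `eps` are hypothetical.

```python
import math


def logit(p: float) -> float:
    """Log-odds of a probability strictly between 0 and 1."""
    return math.log(p / (1.0 - p))


def combine_predictions(probs, prior=0.5, eps=1e-6):
    """Fuse per-feature similarity probabilities into one posterior.

    Assumption (not from the paper): sub-models are conditionally
    independent and were trained under the same prior, so each one
    contributes its log-odds shift relative to that prior.
    """
    if not 0.0 < prior < 1.0:
        raise ValueError("prior must lie strictly between 0 and 1")
    log_odds = logit(prior)
    for p in probs:
        # Clamp to avoid infinite log-odds for outputs of exactly 0 or 1.
        p = min(max(p, eps), 1.0 - eps)
        log_odds += logit(p) - logit(prior)
    # Numerically stable sigmoid to map log-odds back to a probability.
    if log_odds >= 0:
        return 1.0 / (1.0 + math.exp(-log_odds))
    e = math.exp(log_odds)
    return e / (1.0 + e)


if __name__ == "__main__":
    # Hypothetical outputs of three feature-based sub-models.
    print(combine_predictions([0.8, 0.7, 0.6]))
```

With a prior of 0.5, two sub-models that each report 0.8 yield combined odds of 4 × 4 = 16, i.e. a posterior of 16/17. Agreement between sub-models strengthens the overall judgment, while a sub-model that reports exactly the prior leaves it unchanged.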