Improved Audio Separation Using U-Net and ICA

Bibliographic Details
Main Authors: Gupta Vagisha, Seeja K.R
Format: Article
Language: English
Published: EDP Sciences 2025-01-01
Series: EPJ Web of Conferences
Online Access: https://www.epj-conferences.org/articles/epjconf/pdf/2025/13/epjconf_icetsf2025_01031.pdf
Description
Summary: This paper introduces UNetICA, a hybrid model for audio source separation that combines U-Net with Independent Component Analysis (ICA). The model is designed to isolate individual audio sources such as vocals, drums, bass, and other instruments from mixed music tracks. First, the U-Net architecture processes spectrograms, extracting multi-scale features and generating coarse estimates of each source. These preliminary outputs are then refined through ICA, which improves separation by exploiting the statistical independence of the audio components. This two-stage approach lets UNetICA address both the spectral structure and the statistical properties of the sources. The model was trained and evaluated on the MUSDB18 dataset, which includes 100 tracks for training and 50 for testing. Performance was measured using Signal-to-Distortion Ratio (SDR). UNetICA achieved an SDR of 19.05 dB for bass, which the authors report as significantly higher than existing models. Vocals and other sources reached competitive SDRs of 8.792 dB and 8.868 dB, respectively. Compared with state-of-the-art models such as Open-Unmix, Demucs, and Conv-TasNet, UNetICA is reported to achieve consistently better separation performance, supporting the effectiveness of the proposed hybrid framework.
ISSN: 2100-014X
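
The summary reports results as Signal-to-Distortion Ratio (SDR) in dB. As a rough illustration of what that number means, the sketch below computes the basic energy-ratio form of SDR, 10 * log10(||s||^2 / ||s - s_hat||^2), with NumPy. This is an assumption for illustration only: the record does not say which SDR variant or evaluation toolkit the authors used, and MUSDB18 evaluations commonly rely on BSS Eval style metrics, which are more involved. The function name `sdr_db`, the `eps` guard, and the synthetic test signal are all hypothetical.

```python
import numpy as np


def sdr_db(reference, estimate, eps=1e-12):
    """Basic SDR in dB: reference energy over the energy of the estimation error.

    Illustrative definition only; not necessarily the metric used in the paper.
    """
    reference = np.asarray(reference, dtype=np.float64)
    estimate = np.asarray(estimate, dtype=np.float64)
    signal_power = np.sum(reference ** 2)
    error_power = np.sum((reference - estimate) ** 2)
    # eps guards against division by zero and log of zero for a perfect estimate.
    return 10.0 * np.log10((signal_power + eps) / (error_power + eps))


# Synthetic stand-in for a separated source: a 220 Hz tone plus mild noise.
t = np.linspace(0.0, 1.0, 8000, endpoint=False)
reference = np.sin(2 * np.pi * 220.0 * t)
rng = np.random.default_rng(0)
estimate = reference + 0.1 * rng.standard_normal(t.shape)

sdr_value = sdr_db(reference, estimate)
print(f"SDR: {sdr_value:.2f} dB")
```

Under this definition, each 10 dB of SDR corresponds to a tenfold reduction in error energy relative to the reference, so an estimate whose error is 1% of the reference energy scores 20 dB.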