3MT Competition (EUSIPCO2024): A peek into the black box: Insights into the functionality of complex-valued neural networks for multichannel speech enhancement

Artificial neural networks (ANNs) have become an important part of signal processing research. While ANNs outperform model-based signal processing methods in many applications, their internal processing often remains unclear. In this thesis, a framework for analyzing the signal processing performed...

Bibliographic Details
Main Author: Annika Briegleb
Format: Article
Language: English
Published: Elsevier 2025-03-01
Series: Science Talks
Subjects: Explainable AI; DNN interpretability; Multichannel speech enhancement; Spatial filtering
Online Access: http://www.sciencedirect.com/science/article/pii/S277256932500012X
_version_ 1823856695619616768
author Annika Briegleb
author_facet Annika Briegleb
author_sort Annika Briegleb
collection DOAJ
description Artificial neural networks (ANNs) have become an important part of signal processing research. While ANNs outperform model-based signal processing methods in many applications, their internal processing often remains unclear. In this thesis, a framework for analyzing the signal processing performed by ANN-based filters for multichannel speech enhancement is proposed. By designing specific training and test scenarios that allow each time frame to be associated with specific information, e.g., spatial cues, and by using low-cost analysis tools such as clustering, interpretable information can be extracted from the hidden features of the ANN. The proposed framework makes it possible to assess whether and where spatial information is represented inside the ANN, answering the question of whether these ANNs exploit spatial cues in addition to spectral information. Furthermore, the impact of the choice of training target on the functionality and interpretability of the ANN is considered. By applying the proposed analysis tools to two conceptually different speech enhancement frameworks, it is shown that the amount of spatial information extracted inside the ANN varies depending on the training target and the test scenario. The insights from this thesis help to assess the signal processing capabilities of ANNs and support informed decisions when configuring, training, and deploying ANNs.
format Article
id doaj-art-c44d4789c90146bf8d761cd9e6846873
institution Kabale University
issn 2772-5693
language English
publishDate 2025-03-01
publisher Elsevier
record_format Article
series Science Talks
spelling doaj-art-c44d4789c90146bf8d761cd9e6846873 | 2025-02-12T05:33:08Z | eng | Elsevier | Science Talks | 2772-5693 | 2025-03-01 | 13 | 100430 | 3MT Competition (EUSIPCO2024): A peek into the black box: Insights into the functionality of complex-valued neural networks for multichannel speech enhancement | Annika Briegleb | 0 | Multimedia Communications and Signal Processing, Friedrich-Alexander-Universität Erlangen-Nürnberg, Cauerstr. 7, 91058 Erlangen, Germany | Artificial neural networks (ANNs) have become an important part of signal processing research. While ANNs outperform model-based signal processing methods in many applications, their internal processing often remains unclear. In this thesis, a framework for analyzing the signal processing performed by ANN-based filters for multichannel speech enhancement is proposed. By designing specific training and test scenarios that allow to associate each time frame with certain information, e.g., spatial cues, and using low-cost analysis tools such as clustering, interpretable information can be extracted from the hidden features of the ANN. The proposed framework allows to assess whether and where spatial information is represented inside the ANN, answering the question whether these ANNs exploit spatial cues in addition to spectral information. Furthermore, the impact of the choice of training target on the functionality and interpretability of the ANN is considered. By applying the proposed analysis tools to two conceptually different speech enhancement frameworks, it is shown that the amount of spatial information extracted inside the ANN varies depending on the training target and the test scenario. The insights from this thesis help to assess the signal processing capabilities of ANNs and allow to make informed decisions when configuring, training, and deploying ANNs. | http://www.sciencedirect.com/science/article/pii/S277256932500012X | Explainable AI | DNN interpretability | Multichannel speech enhancement | Spatial filtering
spellingShingle Annika Briegleb
3MT Competition (EUSIPCO2024): A peek into the black box: Insights into the functionality of complex-valued neural networks for multichannel speech enhancement
Science Talks
Explainable AI
DNN interpretability
Multichannel speech enhancement
Spatial filtering
title 3MT Competition (EUSIPCO2024): A peek into the black box: Insights into the functionality of complex-valued neural networks for multichannel speech enhancement
title_full 3MT Competition (EUSIPCO2024): A peek into the black box: Insights into the functionality of complex-valued neural networks for multichannel speech enhancement
title_fullStr 3MT Competition (EUSIPCO2024): A peek into the black box: Insights into the functionality of complex-valued neural networks for multichannel speech enhancement
title_full_unstemmed 3MT Competition (EUSIPCO2024): A peek into the black box: Insights into the functionality of complex-valued neural networks for multichannel speech enhancement
title_short 3MT Competition (EUSIPCO2024): A peek into the black box: Insights into the functionality of complex-valued neural networks for multichannel speech enhancement
title_sort 3mt competition eusipco2024 a peek into the black box insights into the functionality of complex valued neural networks for multichannel speech enhancement
topic Explainable AI
DNN interpretability
Multichannel speech enhancement
Spatial filtering
url http://www.sciencedirect.com/science/article/pii/S277256932500012X
work_keys_str_mv AT annikabriegleb 3mtcompetitioneusipco2024apeekintotheblackboxinsightsintothefunctionalityofcomplexvaluedneuralnetworksformultichannelspeechenhancement
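
Illustrative sketch (not part of the catalogued record): the abstract describes associating each time frame with known information such as a spatial cue, then clustering the hidden features of the ANN to see whether that information is represented. The Python example below shows one way such a check could look. It is a minimal sketch under stated assumptions and not the author's implementation: the hidden features are synthetic, real-valued vectors (for a complex-valued network one might stack real and imaginary parts), the clustering is a plain k-means with farthest-point initialization, and all function names and parameters (`synthetic_hidden_features`, `kmeans`, `cluster_purity`, the feature dimension, the noise level) are hypothetical.

```python
import numpy as np


def synthetic_hidden_features(n_directions=3, frames_per_direction=100,
                              dim=16, spatial=True, seed=0):
    """Stand-in for hidden features of a network (assumption: synthetic data).

    Each time frame carries a source-direction label. If `spatial` is True,
    the features depend on that label; otherwise they are pure noise.
    """
    rng = np.random.default_rng(seed)
    labels = np.repeat(np.arange(n_directions), frames_per_direction)
    features = 0.3 * rng.standard_normal((labels.size, dim))
    if spatial:
        # Shift one coordinate per direction so that directions are separable.
        features[np.arange(labels.size), labels] += 10.0
    return features, labels


def farthest_point_init(features, k):
    """Deterministic initialization: repeatedly pick the frame farthest
    from all centers chosen so far."""
    centers = [features[0]]
    for _ in range(k - 1):
        dist_to_nearest = np.min(
            [np.linalg.norm(features - c, axis=1) for c in centers], axis=0)
        centers.append(features[dist_to_nearest.argmax()])
    return np.array(centers, dtype=float)


def kmeans(features, k, n_iter=20):
    """Plain Lloyd iterations; returns one cluster index per time frame."""
    centers = farthest_point_init(features, k)
    for _ in range(n_iter):
        dists = np.linalg.norm(
            features[:, None, :] - centers[None, :, :], axis=2)
        cluster_ids = dists.argmin(axis=1)
        for j in range(k):
            if np.any(cluster_ids == j):
                centers[j] = features[cluster_ids == j].mean(axis=0)
    return cluster_ids


def cluster_purity(cluster_ids, frame_labels):
    """Fraction of frames whose label matches the majority label of their
    cluster. Values near 1 suggest the clusters align with the labels."""
    matched = 0
    for c in np.unique(cluster_ids):
        members = frame_labels[cluster_ids == c]
        matched += np.bincount(members).max()
    return matched / len(frame_labels)


# Features that encode the source direction versus features that do not.
feats, doa = synthetic_hidden_features(spatial=True)
purity_spatial = cluster_purity(kmeans(feats, 3), doa)

feats_ns, doa_ns = synthetic_hidden_features(spatial=False)
purity_none = cluster_purity(kmeans(feats_ns, 3), doa_ns)

print(f"purity with spatial information:    {purity_spatial:.2f}")
print(f"purity without spatial information: {purity_none:.2f}")
```

In this toy setting, a purity close to 1 indicates that the clusters found in the features line up with the per-frame direction labels, while a purity near chance level indicates that the labels are not reflected in the features. Applying the same check to real hidden features would require extracting them from a trained network, which is outside the scope of this sketch.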