Dense-Fusion2Net a more efficient and lightweight short speech speaker recognition system with time-frequency channel attention

Abstract In short speech situations, the performance of existing speaker recognition systems degrades significantly due to factors such as short speech segment length, scarce speaker identity information, and noise interference. In this paper, a short speech speaker recognition system based on Dense...

Bibliographic Details
Main Authors:	Fei Deng, Rui Huang, Peifan Jiang, Lin Yu, Lihong Deng
Format:	Article
Language:	English
Published:	Nature Portfolio 2025-03-01
Series:	Scientific Reports
Subjects:	Short speech speaker recognition; Dense-Fusion2Net; Attention mechanism; Time-Frequency Channel Attention (TFCA)
Online Access:	https://doi.org/10.1038/s41598-025-93873-x

_version_	1849389973236613120
author	Fei Deng Rui Huang Peifan Jiang Lin Yu Lihong Deng
author_facet	Fei Deng Rui Huang Peifan Jiang Lin Yu Lihong Deng
author_sort	Fei Deng
collection	DOAJ
description	Abstract In short speech situations, the performance of existing speaker recognition systems degrades significantly due to factors such as short speech segment length, scarce speaker identity information, and noise interference. In this paper, a short speech speaker recognition system based on Dense-Fusion2Net and the Time-Frequency Channel Attention (TFCA) is proposed to address the problems of current short speech speaker recognition systems. We propose the Dense-Fusion2Net network architecture to utilize the limited acoustic features in short speech segments more efficiently. We design the Time-Frequency Channel Attention (TFCA), which learns the relationships among the time domain, the frequency domain, and the channels, and enhances the global feature extraction capability of the network. We conducted validation experiments on the publicly available VoxCeleb dataset. The experimental results show that the proposed Dense-Fusion2Net and TFCA exhibit higher performance and better robustness in short speech situations. In addition, we conducted experiments with different window lengths for short speech and identified the most suitable window length.
format	Article
id	doaj-art-9a79e820b65545d48b6ca3a899c132df
institution	Kabale University
issn	2045-2322
language	English
publishDate	2025-03-01
publisher	Nature Portfolio
record_format	Article
series	Scientific Reports
spelling	doaj-art-9a79e820b65545d48b6ca3a899c132df | 2025-08-20T03:41:47Z | eng | Nature Portfolio | Scientific Reports | 2045-2322 | 2025-03-01 | 15 | 1 | 1 | 15 | 10.1038/s41598-025-93873-x | Dense-Fusion2Net a more efficient and lightweight short speech speaker recognition system with time-frequency channel attention | Fei Deng [0]; Rui Huang [1]; Peifan Jiang [2]; Lin Yu [3]; Lihong Deng [4] | [0] College of Computer Science and Cyber Security, Chengdu University of Technology; [1] College of Computer Science and Cyber Security, Chengdu University of Technology; [2] College of Geophysics, Chengdu University of Technology; [3] College of Computer Science and Cyber Security, Chengdu University of Technology; [4] School of Computing and Artificial Intelligence, Southwest Jiaotong University | Abstract In short speech situations, the performance of existing speaker recognition systems degrades significantly due to factors such as short speech segment length, scarce speaker identity information, and noise interference. In this paper, a short speech speaker recognition system based on Dense-Fusion2Net and the Time-Frequency Channel Attention (TFCA) is proposed to address the problems of current short speech speaker recognition systems. We propose the Dense-Fusion2Net network architecture to utilize the limited acoustic features in short speech segments more efficiently. We design the Time-Frequency Channel Attention (TFCA), which learns the relationships among the time domain, the frequency domain, and the channels, and enhances the global feature extraction capability of the network. We conducted validation experiments on the publicly available VoxCeleb dataset. The experimental results show that the proposed Dense-Fusion2Net and TFCA exhibit higher performance and better robustness in short speech situations. In addition, we conducted experiments with different window lengths for short speech and identified the most suitable window length. | https://doi.org/10.1038/s41598-025-93873-x | Short speech speaker recognition; Dense-Fusion2Net; Attention mechanism; Time-Frequency Channel Attention (TFCA)
spellingShingle	Fei Deng Rui Huang Peifan Jiang Lin Yu Lihong Deng Dense-Fusion2Net a more efficient and lightweight short speech speaker recognition system with time-frequency channel attention Scientific Reports Short speech speaker recognition Dense-Fusion2Net Attention mechanism Time-Frequency Channel Attention (TFCA)
title	Dense-Fusion2Net a more efficient and lightweight short speech speaker recognition system with time-frequency channel attention
title_full	Dense-Fusion2Net a more efficient and lightweight short speech speaker recognition system with time-frequency channel attention
title_fullStr	Dense-Fusion2Net a more efficient and lightweight short speech speaker recognition system with time-frequency channel attention
title_full_unstemmed	Dense-Fusion2Net a more efficient and lightweight short speech speaker recognition system with time-frequency channel attention
title_short	Dense-Fusion2Net a more efficient and lightweight short speech speaker recognition system with time-frequency channel attention
title_sort	dense fusion2net a more efficient and lightweight short speech speaker recognition system with time frequency channel attention
topic	Short speech speaker recognition; Dense-Fusion2Net; Attention mechanism; Time-Frequency Channel Attention (TFCA)
url	https://doi.org/10.1038/s41598-025-93873-x

Similar Items