A hybrid model based on transformer and Mamba for enhanced sequence modeling
Abstract: State Space Models (SSMs) have made remarkable strides in language modeling in recent years. With the introduction of Mamba, these models have garnered increased attention, often surpassing Transformers in specific areas. Nevertheless, despite Mamba’s unique strengths, Transformers remain essential due to their advanced computational capabilities and proven effectiveness. In this paper, we propose a novel model that effectively integrates the strengths of both Transformers and Mamba. Specifically, our model utilizes the Transformer’s encoder for encoding tasks while employing Mamba as the decoder for decoding tasks. We introduce a feature fusion technique that combines the features generated by the encoder with the hidden states produced by the decoder. This approach successfully merges the advantages of the Transformer and Mamba, resulting in enhanced performance. Comprehensive experiments across various language tasks demonstrate that our proposed model consistently achieves competitive results, outperforming existing benchmarks.
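The architecture the abstract describes, a Transformer encoder feeding a Mamba-style decoder through a feature-fusion step, can be sketched in a few lines of PyTorch. The sketch below is illustrative only, not the authors' implementation: `ToySSMBlock` is a drastically simplified stand-in for a real Mamba (selective SSM) layer, and the fusion rule (cross-attention over encoder features followed by concatenation) is an assumption, since the record does not specify the paper's exact fusion technique. All names and hyperparameters are hypothetical.

```python
# Minimal sketch of the hybrid design described in the abstract, assuming PyTorch.
# NOT the authors' code: ToySSMBlock stands in for a real Mamba layer, and the
# cross-attention + concatenation fusion rule is an illustrative assumption.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ToySSMBlock(nn.Module):
    """Gated linear recurrence with input-dependent decay: a toy Mamba-like layer."""
    def __init__(self, d_model: int, d_state: int = 64):
        super().__init__()
        self.B = nn.Linear(d_model, d_state)      # input -> state update
        self.decay = nn.Linear(d_model, d_state)  # input-dependent retention
        self.C = nn.Linear(d_state, d_model)      # state -> output
        self.gate = nn.Linear(d_model, d_model)   # SiLU output gate
        self.norm = nn.LayerNorm(d_model)

    def forward(self, x: torch.Tensor) -> torch.Tensor:  # x: (batch, seq, d_model)
        h = x.new_zeros(x.size(0), self.B.out_features)
        outs = []
        for t in range(x.size(1)):                # causal sequential scan over time
            xt = x[:, t]
            a = torch.sigmoid(self.decay(xt))     # per-step retention in (0, 1)
            h = a * h + (1.0 - a) * self.B(xt)    # selective state update
            outs.append(self.C(h) * F.silu(self.gate(xt)))
        return self.norm(x + torch.stack(outs, dim=1))   # residual connection

class HybridTransformerMamba(nn.Module):
    """Transformer encoder + Mamba-style decoder + encoder/decoder feature fusion."""
    def __init__(self, vocab_size: int, d_model: int = 256,
                 n_heads: int = 4, n_enc: int = 2, n_dec: int = 2):
        super().__init__()
        self.src_emb = nn.Embedding(vocab_size, d_model)
        self.tgt_emb = nn.Embedding(vocab_size, d_model)
        enc_layer = nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(enc_layer, num_layers=n_enc)
        self.decoder = nn.ModuleList(ToySSMBlock(d_model) for _ in range(n_dec))
        self.fuse_attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.fuse_proj = nn.Linear(2 * d_model, d_model)
        self.lm_head = nn.Linear(d_model, vocab_size)

    def forward(self, src_ids: torch.Tensor, tgt_ids: torch.Tensor) -> torch.Tensor:
        memory = self.encoder(self.src_emb(src_ids))      # encoder features
        h = self.tgt_emb(tgt_ids)
        for block in self.decoder:                        # SSM-based decoding
            h = block(h)
        ctx, _ = self.fuse_attn(h, memory, memory)        # attend to encoder features
        h = self.fuse_proj(torch.cat([h, ctx], dim=-1))   # feature fusion
        return self.lm_head(h)

model = HybridTransformerMamba(vocab_size=1000)
logits = model(torch.randint(0, 1000, (2, 12)),   # source tokens
               torch.randint(0, 1000, (2, 8)))    # target tokens
print(logits.shape)  # torch.Size([2, 8, 1000])
```

The appeal of this split is that the decoder's recurrent scan costs linear time in sequence length, while the encoder keeps the Transformer's full pairwise attention over the input. A production version would replace `ToySSMBlock` with an actual Mamba layer, for example from the `mamba_ssm` package.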
Saved in:
| Main Authors: | Xiaocui Zhu, Qunsheng Ruan, Sai Qian, Miaohui Zhang |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | Nature Portfolio, 2025-04-01 |
| Series: | Scientific Reports |
| Subjects: | State space models (SSMs); Transformer; Mamba; Feature fusion |
| Online Access: | https://doi.org/10.1038/s41598-025-87574-8 |
| _version_ | 1850284537940017152 |
|---|---|
| author | Xiaocui Zhu; Qunsheng Ruan; Sai Qian; Miaohui Zhang |
| author_facet | Xiaocui Zhu; Qunsheng Ruan; Sai Qian; Miaohui Zhang |
| author_sort | Xiaocui Zhu |
| collection | DOAJ |
| description | Abstract: State Space Models (SSMs) have made remarkable strides in language modeling in recent years. With the introduction of Mamba, these models have garnered increased attention, often surpassing Transformers in specific areas. Nevertheless, despite Mamba’s unique strengths, Transformers remain essential due to their advanced computational capabilities and proven effectiveness. In this paper, we propose a novel model that effectively integrates the strengths of both Transformers and Mamba. Specifically, our model utilizes the Transformer’s encoder for encoding tasks while employing Mamba as the decoder for decoding tasks. We introduce a feature fusion technique that combines the features generated by the encoder with the hidden states produced by the decoder. This approach successfully merges the advantages of the Transformer and Mamba, resulting in enhanced performance. Comprehensive experiments across various language tasks demonstrate that our proposed model consistently achieves competitive results, outperforming existing benchmarks. |
| format | Article |
| id | doaj-art-2e9d0d612bf346ae89d7a54fcc0f4609 |
| institution | OA Journals |
| issn | 2045-2322 |
| language | English |
| publishDate | 2025-04-01 |
| publisher | Nature Portfolio |
| record_format | Article |
| series | Scientific Reports |
| spelling | doaj-art-2e9d0d612bf346ae89d7a54fcc0f4609; 2025-08-20T01:47:33Z; eng; Nature Portfolio; Scientific Reports; 2045-2322; 2025-04-01; 15; 1; 1; 9; 10.1038/s41598-025-87574-8; A hybrid model based on transformer and Mamba for enhanced sequence modeling; Xiaocui Zhu (Jiangxi Academy Sciences, Institute of Energy); Qunsheng Ruan (Department of nature science and computer, Ganzhou Teachers College); Sai Qian (Jiangxi Academy Sciences, Institute of Energy); Miaohui Zhang (Jiangxi Academy Sciences, Institute of Energy); Abstract: State Space Models (SSMs) have made remarkable strides in language modeling in recent years. With the introduction of Mamba, these models have garnered increased attention, often surpassing Transformers in specific areas. Nevertheless, despite Mamba’s unique strengths, Transformers remain essential due to their advanced computational capabilities and proven effectiveness. In this paper, we propose a novel model that effectively integrates the strengths of both Transformers and Mamba. Specifically, our model utilizes the Transformer’s encoder for encoding tasks while employing Mamba as the decoder for decoding tasks. We introduce a feature fusion technique that combines the features generated by the encoder with the hidden states produced by the decoder. This approach successfully merges the advantages of the Transformer and Mamba, resulting in enhanced performance. Comprehensive experiments across various language tasks demonstrate that our proposed model consistently achieves competitive results, outperforming existing benchmarks.; https://doi.org/10.1038/s41598-025-87574-8; State space models (SSMs); Transformer; Mamba; Feature fusion |
| spellingShingle | Xiaocui Zhu; Qunsheng Ruan; Sai Qian; Miaohui Zhang; A hybrid model based on transformer and Mamba for enhanced sequence modeling; Scientific Reports; State space models (SSMs); Transformer; Mamba; Feature fusion |
| title | A hybrid model based on transformer and Mamba for enhanced sequence modeling |
| title_full | A hybrid model based on transformer and Mamba for enhanced sequence modeling |
| title_fullStr | A hybrid model based on transformer and Mamba for enhanced sequence modeling |
| title_full_unstemmed | A hybrid model based on transformer and Mamba for enhanced sequence modeling |
| title_short | A hybrid model based on transformer and Mamba for enhanced sequence modeling |
| title_sort | hybrid model based on transformer and mamba for enhanced sequence modeling |
| topic | State space models (SSMs); Transformer; Mamba; Feature fusion |
| url | https://doi.org/10.1038/s41598-025-87574-8 |
| work_keys_str_mv | AT xiaocuizhu ahybridmodelbasedontransformerandmambaforenhancedsequencemodeling; AT qunshengruan ahybridmodelbasedontransformerandmambaforenhancedsequencemodeling; AT saiqian ahybridmodelbasedontransformerandmambaforenhancedsequencemodeling; AT miaohuizhang ahybridmodelbasedontransformerandmambaforenhancedsequencemodeling; AT xiaocuizhu hybridmodelbasedontransformerandmambaforenhancedsequencemodeling; AT qunshengruan hybridmodelbasedontransformerandmambaforenhancedsequencemodeling; AT saiqian hybridmodelbasedontransformerandmambaforenhancedsequencemodeling; AT miaohuizhang hybridmodelbasedontransformerandmambaforenhancedsequencemodeling |