A hybrid model based on transformer and Mamba for enhanced sequence modeling

Abstract
State Space Models (SSMs) have made remarkable strides in language modeling in recent years. With the introduction of Mamba, these models have garnered increased attention, often surpassing Transformers in specific areas. Nevertheless, despite Mamba’s unique strengths, Transformers remain essential due to their advanced computational capabilities and proven effectiveness. In this paper, we propose a novel model that integrates the strengths of both Transformers and Mamba: it uses a Transformer encoder for encoding and Mamba as the decoder. We introduce a feature fusion technique that combines the features generated by the encoder with the hidden states produced by the decoder. This approach merges the advantages of the Transformer and Mamba, yielding enhanced performance. Comprehensive experiments across various language tasks demonstrate that the proposed model consistently achieves competitive results, outperforming existing baselines.
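
To make the architecture described in the abstract concrete, here is a minimal sketch in PyTorch. Everything in it is an assumption for illustration, not the paper's implementation: `MambaLikeBlock` is a GRU-based stand-in for a real Mamba layer (which would use a selective state-space scan, e.g. from the mamba_ssm package), and `FeatureFusion` reads "combines the features generated by the encoder with the hidden states produced by the decoder" as cross-attention followed by a projection of the concatenated states; the paper's actual fusion operator may differ.

```python
import torch
import torch.nn as nn


class MambaLikeBlock(nn.Module):
    """Stand-in for a Mamba layer (an assumption, not the paper's code).

    A real Mamba layer runs a selective state-space scan (see the
    mamba_ssm package); this placeholder only preserves the causal,
    linear-time (batch, length, d_model) -> (batch, length, d_model)
    interface so the overall wiring can be shown.
    """

    def __init__(self, d_model: int):
        super().__init__()
        self.norm = nn.LayerNorm(d_model)
        self.rnn = nn.GRU(d_model, d_model, batch_first=True)  # causal recurrence
        self.gate = nn.Linear(d_model, d_model)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        h, _ = self.rnn(self.norm(x))
        return x + h * torch.sigmoid(self.gate(x))  # gated residual update


class FeatureFusion(nn.Module):
    """One plausible fusion: decoder states cross-attend to encoder features,
    and the attended context is concatenated and projected back to d_model."""

    def __init__(self, d_model: int):
        super().__init__()
        self.attn = nn.MultiheadAttention(d_model, num_heads=4, batch_first=True)
        self.proj = nn.Linear(2 * d_model, d_model)

    def forward(self, dec: torch.Tensor, enc: torch.Tensor) -> torch.Tensor:
        ctx, _ = self.attn(dec, enc, enc)  # decoder queries attend to encoder
        return self.proj(torch.cat([dec, ctx], dim=-1))


class HybridTransformerMamba(nn.Module):
    """Transformer encoder + Mamba-style decoder + feature fusion."""

    def __init__(self, vocab_size: int, d_model: int = 256,
                 n_enc_layers: int = 4, n_dec_layers: int = 4):
        super().__init__()
        self.src_emb = nn.Embedding(vocab_size, d_model)
        self.tgt_emb = nn.Embedding(vocab_size, d_model)
        enc_layer = nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(enc_layer, num_layers=n_enc_layers)
        self.decoder = nn.ModuleList(MambaLikeBlock(d_model) for _ in range(n_dec_layers))
        self.fusion = FeatureFusion(d_model)
        self.lm_head = nn.Linear(d_model, vocab_size)

    def forward(self, src: torch.Tensor, tgt: torch.Tensor) -> torch.Tensor:
        # Positional encodings are omitted for brevity.
        enc = self.encoder(self.src_emb(src))  # Transformer encoder features
        h = self.tgt_emb(tgt)
        for block in self.decoder:             # linear-time decoding stack
            h = block(h)
        h = self.fusion(h, enc)                # fuse encoder features into decoder states
        return self.lm_head(h)                 # next-token logits


if __name__ == "__main__":
    model = HybridTransformerMamba(vocab_size=1000)
    src = torch.randint(0, 1000, (2, 16))
    tgt = torch.randint(0, 1000, (2, 12))
    print(model(src, tgt).shape)  # torch.Size([2, 12, 1000])
```

The stand-in decoder keeps the property the abstract emphasizes, causal linear-time decoding, while the Transformer encoder provides full bidirectional context over the source; swapping `MambaLikeBlock` for a real Mamba layer would leave the wiring unchanged.
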
Bibliographic Details
Main Authors: Xiaocui Zhu, Qunsheng Ruan, Sai Qian, Miaohui Zhang
Affiliations: Institute of Energy, Jiangxi Academy of Sciences (Zhu, Qian, Zhang); Department of Nature Science and Computer, Ganzhou Teachers College (Ruan)
Format: Article
Language: English
Published: Nature Portfolio, 2025-04-01
Series: Scientific Reports
ISSN: 2045-2322
Subjects: State space models (SSMs); Transformer; Mamba; Feature fusion
Online Access: https://doi.org/10.1038/s41598-025-87574-8