A hybrid model based on transformer and Mamba for enhanced sequence modeling
Abstract: State Space Models (SSMs) have made remarkable strides in language modeling in recent years. With the introduction of Mamba, these models have garnered increased attention, often surpassing Transformers in specific areas. Nevertheless, despite Mamba’s unique strengths, Transformers remain essential due to their advanced computational capabilities and proven effectiveness. In this paper, we propose a novel model that effectively integrates the strengths of both Transformers and Mamba. Specifically, our model utilizes the Transformer’s encoder for encoding tasks while employing Mamba as the decoder for decoding tasks. We introduce a feature fusion technique that combines the features generated by the encoder with the hidden states produced by the decoder. This approach successfully merges the advantages of the Transformer and Mamba, resulting in enhanced performance. Comprehensive experiments across various language tasks demonstrate that our proposed model consistently achieves competitive results, outperforming existing benchmarks.
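The architecture the abstract describes, a Transformer encoder feeding a Mamba-style decoder through a feature-fusion step, can be sketched in a few lines of PyTorch. The sketch below is illustrative only, not the authors' implementation: `ToySSMBlock` is a drastically simplified stand-in for a real Mamba (selective SSM) layer, and the fusion rule (cross-attention over encoder features followed by concatenation) is an assumption, since the record does not specify the paper's exact fusion technique. All names and hyperparameters are hypothetical.

```python
# Minimal sketch of the hybrid design described in the abstract, assuming PyTorch.
# NOT the authors' code: ToySSMBlock stands in for a real Mamba layer, and the
# cross-attention + concatenation fusion rule is an illustrative assumption.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ToySSMBlock(nn.Module):
    """Gated linear recurrence with input-dependent decay: a toy Mamba-like layer."""
    def __init__(self, d_model: int, d_state: int = 64):
        super().__init__()
        self.B = nn.Linear(d_model, d_state)      # input -> state update
        self.decay = nn.Linear(d_model, d_state)  # input-dependent retention
        self.C = nn.Linear(d_state, d_model)      # state -> output
        self.gate = nn.Linear(d_model, d_model)   # SiLU output gate
        self.norm = nn.LayerNorm(d_model)

    def forward(self, x: torch.Tensor) -> torch.Tensor:  # x: (batch, seq, d_model)
        h = x.new_zeros(x.size(0), self.B.out_features)
        outs = []
        for t in range(x.size(1)):                # causal sequential scan over time
            xt = x[:, t]
            a = torch.sigmoid(self.decay(xt))     # per-step retention in (0, 1)
            h = a * h + (1.0 - a) * self.B(xt)    # selective state update
            outs.append(self.C(h) * F.silu(self.gate(xt)))
        return self.norm(x + torch.stack(outs, dim=1))   # residual connection

class HybridTransformerMamba(nn.Module):
    """Transformer encoder + Mamba-style decoder + encoder/decoder feature fusion."""
    def __init__(self, vocab_size: int, d_model: int = 256,
                 n_heads: int = 4, n_enc: int = 2, n_dec: int = 2):
        super().__init__()
        self.src_emb = nn.Embedding(vocab_size, d_model)
        self.tgt_emb = nn.Embedding(vocab_size, d_model)
        enc_layer = nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(enc_layer, num_layers=n_enc)
        self.decoder = nn.ModuleList(ToySSMBlock(d_model) for _ in range(n_dec))
        self.fuse_attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.fuse_proj = nn.Linear(2 * d_model, d_model)
        self.lm_head = nn.Linear(d_model, vocab_size)

    def forward(self, src_ids: torch.Tensor, tgt_ids: torch.Tensor) -> torch.Tensor:
        memory = self.encoder(self.src_emb(src_ids))      # encoder features
        h = self.tgt_emb(tgt_ids)
        for block in self.decoder:                        # SSM-based decoding
            h = block(h)
        ctx, _ = self.fuse_attn(h, memory, memory)        # attend to encoder features
        h = self.fuse_proj(torch.cat([h, ctx], dim=-1))   # feature fusion
        return self.lm_head(h)

model = HybridTransformerMamba(vocab_size=1000)
logits = model(torch.randint(0, 1000, (2, 12)),   # source tokens
               torch.randint(0, 1000, (2, 8)))    # target tokens
print(logits.shape)  # torch.Size([2, 8, 1000])
```

The appeal of this split is that the decoder's recurrent scan costs linear time in sequence length, while the encoder keeps the Transformer's full pairwise attention over the input. A production version would replace `ToySSMBlock` with an actual Mamba layer, for example from the `mamba_ssm` package.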
Saved in:
| Main Authors: | Xiaocui Zhu, Qunsheng Ruan, Sai Qian, Miaohui Zhang |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | Nature Portfolio, 2025-04-01 |
| Series: | Scientific Reports |
| Subjects: | State space models (SSMs); Transformer; Mamba; Feature fusion |
| Online Access: | https://doi.org/10.1038/s41598-025-87574-8 |
| _version_ | 1850284537940017152 |
|---|---|
| author | Xiaocui Zhu; Qunsheng Ruan; Sai Qian; Miaohui Zhang |
| author_facet | Xiaocui Zhu; Qunsheng Ruan; Sai Qian; Miaohui Zhang |
| author_sort | Xiaocui Zhu |
| collection | DOAJ |
| description | Abstract: State Space Models (SSMs) have made remarkable strides in language modeling in recent years. With the introduction of Mamba, these models have garnered increased attention, often surpassing Transformers in specific areas. Nevertheless, despite Mamba’s unique strengths, Transformers remain essential due to their advanced computational capabilities and proven effectiveness. In this paper, we propose a novel model that effectively integrates the strengths of both Transformers and Mamba. Specifically, our model utilizes the Transformer’s encoder for encoding tasks while employing Mamba as the decoder for decoding tasks. We introduce a feature fusion technique that combines the features generated by the encoder with the hidden states produced by the decoder. This approach successfully merges the advantages of the Transformer and Mamba, resulting in enhanced performance. Comprehensive experiments across various language tasks demonstrate that our proposed model consistently achieves competitive results, outperforming existing benchmarks. |
| format | Article |
| id | doaj-art-2e9d0d612bf346ae89d7a54fcc0f4609 |
| institution | OA Journals |
| issn | 2045-2322 |
| language | English |
| publishDate | 2025-04-01 |
| publisher | Nature Portfolio |
| record_format | Article |
| series | Scientific Reports |
| spelling | doaj-art-2e9d0d612bf346ae89d7a54fcc0f4609; 2025-08-20T01:47:33Z; eng; Nature Portfolio; Scientific Reports; 2045-2322; 2025-04-01; 15; 1; 1; 9; 10.1038/s41598-025-87574-8; A hybrid model based on transformer and Mamba for enhanced sequence modeling; Xiaocui Zhu (Jiangxi Academy Sciences, Institute of Energy); Qunsheng Ruan (Department of nature science and computer, Ganzhou Teachers College); Sai Qian (Jiangxi Academy Sciences, Institute of Energy); Miaohui Zhang (Jiangxi Academy Sciences, Institute of Energy); Abstract: State Space Models (SSMs) have made remarkable strides in language modeling in recent years. With the introduction of Mamba, these models have garnered increased attention, often surpassing Transformers in specific areas. Nevertheless, despite Mamba’s unique strengths, Transformers remain essential due to their advanced computational capabilities and proven effectiveness. In this paper, we propose a novel model that effectively integrates the strengths of both Transformers and Mamba. Specifically, our model utilizes the Transformer’s encoder for encoding tasks while employing Mamba as the decoder for decoding tasks. We introduce a feature fusion technique that combines the features generated by the encoder with the hidden states produced by the decoder. This approach successfully merges the advantages of the Transformer and Mamba, resulting in enhanced performance. Comprehensive experiments across various language tasks demonstrate that our proposed model consistently achieves competitive results, outperforming existing benchmarks.; https://doi.org/10.1038/s41598-025-87574-8; State space models (SSMs); Transformer; Mamba; Feature fusion |
| spellingShingle | Xiaocui Zhu; Qunsheng Ruan; Sai Qian; Miaohui Zhang; A hybrid model based on transformer and Mamba for enhanced sequence modeling; Scientific Reports; State space models (SSMs); Transformer; Mamba; Feature fusion |
| title | A hybrid model based on transformer and Mamba for enhanced sequence modeling |
| title_full | A hybrid model based on transformer and Mamba for enhanced sequence modeling |
| title_fullStr | A hybrid model based on transformer and Mamba for enhanced sequence modeling |
| title_full_unstemmed | A hybrid model based on transformer and Mamba for enhanced sequence modeling |
| title_short | A hybrid model based on transformer and Mamba for enhanced sequence modeling |
| title_sort | hybrid model based on transformer and mamba for enhanced sequence modeling |
| topic | State space models (SSMs); Transformer; Mamba; Feature fusion |
| url | https://doi.org/10.1038/s41598-025-87574-8 |
| work_keys_str_mv | AT xiaocuizhu ahybridmodelbasedontransformerandmambaforenhancedsequencemodeling; AT qunshengruan ahybridmodelbasedontransformerandmambaforenhancedsequencemodeling; AT saiqian ahybridmodelbasedontransformerandmambaforenhancedsequencemodeling; AT miaohuizhang ahybridmodelbasedontransformerandmambaforenhancedsequencemodeling; AT xiaocuizhu hybridmodelbasedontransformerandmambaforenhancedsequencemodeling; AT qunshengruan hybridmodelbasedontransformerandmambaforenhancedsequencemodeling; AT saiqian hybridmodelbasedontransformerandmambaforenhancedsequencemodeling; AT miaohuizhang hybridmodelbasedontransformerandmambaforenhancedsequencemodeling |