GoalBERT: A Lightweight Named-Entity Recognition Model Based on Multiple Fusion
Named-Entity Recognition (NER), a core task in Natural Language Processing (NLP), aims to automatically identify and classify specific types of entities in unstructured text. In recent years, the introduction of the Transformer architecture and its derivative, the BERT model, has pushed the performance of...
| Main Authors: | Yingjie Xu, Xiaobo Tan, Mengxuan Wang, Wenbo Zhang |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | MDPI AG, 2024-11-01 |
| Series: | Applied Sciences |
| Subjects: | named entity recognition; natural language processing; lightweight model; multiple fusion technologies |
| Online Access: | https://www.mdpi.com/2076-3417/14/23/11003 |
| _version_ | 1850107070721818624 |
|---|---|
| author | Yingjie Xu; Xiaobo Tan; Mengxuan Wang; Wenbo Zhang |
| author_facet | Yingjie Xu; Xiaobo Tan; Mengxuan Wang; Wenbo Zhang |
| author_sort | Yingjie Xu |
| collection | DOAJ |
| description | Named-Entity Recognition (NER), a core task in Natural Language Processing (NLP), aims to automatically identify and classify specific types of entities in unstructured text. In recent years, the introduction of the Transformer architecture and its derivative, the BERT model, has pushed the performance of NER to unprecedented heights. However, these models often place high demands on computational power and memory, making them difficult to train and deploy on small computing platforms. Although ALBERT, as a lightweight model, uses parameter sharing and matrix decomposition to reduce memory consumption to some extent, it does not effectively reduce the model’s computational load. Additionally, its internal sharing mechanism weakens the model’s ability to understand text, leading to poor performance on named-entity recognition tasks. To address these challenges, this manuscript proposes an efficient lightweight model called GoalBERT. The model adopts multiple fusion technologies, integrating a lightweight and efficient BiGRU, which excels at handling context, into part of the Transformer’s self-attention layers. This reduces the high computational demand caused by stacking multiple self-attention layers while enhancing the model’s ability to process contextual information. To counter vanishing and exploding gradients during training, residual connections are added between core layers, yielding more stable training and steady performance improvement. Experimental results show that GoalBERT achieves recognition accuracy comparable to standard models, surpassing ALBERT by 10% in multi-entity-type scenarios. Furthermore, compared to standard models, GoalBERT reduces memory requirements by 200% and improves training speed by nearly 230%. These results indicate that GoalBERT is a high-quality lightweight model. (A minimal sketch of the described layer fusion appears after this record table.) |
| format | Article |
| id | doaj-art-1ff9d98bf32044cb93dc4a61843b0908 |
| institution | OA Journals |
| issn | 2076-3417 |
| language | English |
| publishDate | 2024-11-01 |
| publisher | MDPI AG |
| record_format | Article |
| series | Applied Sciences |
| spelling | doaj-art-1ff9d98bf32044cb93dc4a61843b0908 (indexed 2025-08-20T02:38:40Z); eng; MDPI AG; Applied Sciences, ISSN 2076-3417; 2024-11-01; Vol. 14, No. 23, Art. 11003; DOI 10.3390/app142311003; GoalBERT: A Lightweight Named-Entity Recognition Model Based on Multiple Fusion; Yingjie Xu, Xiaobo Tan, Mengxuan Wang, Wenbo Zhang (all: School of Information Science and Engineering, Shenyang Ligong University, Shenyang 110159, China); abstract as in the description field above; https://www.mdpi.com/2076-3417/14/23/11003; named entity recognition; natural language processing; lightweight model; multiple fusion technologies |
| spellingShingle | Yingjie Xu; Xiaobo Tan; Mengxuan Wang; Wenbo Zhang; GoalBERT: A Lightweight Named-Entity Recognition Model Based on Multiple Fusion; Applied Sciences; named entity recognition; natural language processing; lightweight model; multiple fusion technologies |
| title | GoalBERT: A Lightweight Named-Entity Recognition Model Based on Multiple Fusion |
| title_full | GoalBERT: A Lightweight Named-Entity Recognition Model Based on Multiple Fusion |
| title_fullStr | GoalBERT: A Lightweight Named-Entity Recognition Model Based on Multiple Fusion |
| title_full_unstemmed | GoalBERT: A Lightweight Named-Entity Recognition Model Based on Multiple Fusion |
| title_short | GoalBERT: A Lightweight Named-Entity Recognition Model Based on Multiple Fusion |
| title_sort | goalbert a lightweight named entity recognition model based on multiple fusion |
| topic | named entity recognition; natural language processing; lightweight model; multiple fusion technologies |
| url | https://www.mdpi.com/2076-3417/14/23/11003 |
| work_keys_str_mv | AT yingjiexu goalbertalightweightnamedentityrecognitionmodelbasedonmultiplefusion AT xiaobotan goalbertalightweightnamedentityrecognitionmodelbasedonmultiplefusion AT mengxuanwang goalbertalightweightnamedentityrecognitionmodelbasedonmultiplefusion AT wenbozhang goalbertalightweightnamedentityrecognitionmodelbasedonmultiplefusion |
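The description field sketches the core design: a BiGRU is fused into part of the Transformer’s self-attention stack, and residual connections are added between core layers to stabilize training. Below is a minimal, hypothetical PyTorch sketch of that idea, not the authors’ published implementation; the class names (`GoalBERTEncoder`, `BiGRUBlock`), layer counts, and dimensions are illustrative assumptions.

```python
# Hypothetical sketch of the fusion described in the abstract: BiGRU blocks
# stand in for part of the self-attention stack, with residual connections.
# Names, layer counts, and sizes are assumptions, not the authors' code.
import torch
import torch.nn as nn

class BiGRUBlock(nn.Module):
    """Bidirectional GRU sub-layer with a residual connection and LayerNorm."""
    def __init__(self, d_model: int):
        super().__init__()
        # hidden_size = d_model // 2 so concatenating both directions
        # returns a tensor of width d_model, matching the residual branch
        self.bigru = nn.GRU(d_model, d_model // 2,
                            batch_first=True, bidirectional=True)
        self.norm = nn.LayerNorm(d_model)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        out, _ = self.bigru(x)        # (batch, seq, d_model)
        return self.norm(x + out)     # residual connection, then LayerNorm

class GoalBERTEncoder(nn.Module):
    """Encoder interleaving self-attention layers with cheaper BiGRU blocks."""
    def __init__(self, d_model: int = 768, n_heads: int = 12,
                 n_attn_layers: int = 4, n_gru_layers: int = 2):
        super().__init__()
        attn = [nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True)
                for _ in range(n_attn_layers)]
        gru = [BiGRUBlock(d_model) for _ in range(n_gru_layers)]
        # Part of the usual self-attention stack is replaced by BiGRU
        # blocks, which scale linearly in sequence length
        self.layers = nn.ModuleList(attn + gru)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        for layer in self.layers:
            x = layer(x)      # TransformerEncoderLayer has built-in residuals
        return x

# Usage: token embeddings in, contextual representations out
enc = GoalBERTEncoder()
h = enc(torch.randn(2, 16, 768))      # (batch=2, seq=16, d_model=768)
```

Self-attention costs O(n²) in sequence length while a GRU pass is O(n), which is consistent with the abstract’s claim that swapping BiGRU blocks into the stack lowers the computational demand of stacking many self-attention layers.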