GoalBERT: A Lightweight Named-Entity Recognition Model Based on Multiple Fusion

Bibliographic Details
Main Authors: Yingjie Xu, Xiaobo Tan, Mengxuan Wang, Wenbo Zhang
Format: Article
Language: English
Published: MDPI AG 2024-11-01
Series: Applied Sciences
Subjects:
Online Access: https://www.mdpi.com/2076-3417/14/23/11003
author Yingjie Xu
Xiaobo Tan
Mengxuan Wang
Wenbo Zhang
author_sort Yingjie Xu
collection DOAJ
description Named-Entity Recognition (NER), as a core task in Natural Language Processing (NLP), aims to automatically identify and classify specific types of entities in unstructured text. In recent years, the introduction of the Transformer architecture and its derivative, the BERT model, has pushed the performance of NER to unprecedented heights. However, these models often place high demands on computational power and memory, making them difficult to train and deploy on small computing platforms. Although ALBERT, as a lightweight model, uses parameter sharing and matrix decomposition strategies to reduce memory consumption to some extent, it does not effectively reduce the model’s computational load. Additionally, its internal sharing mechanism weakens the model’s ability to understand text, leading to poor performance on named-entity recognition tasks. To address these challenges, this manuscript proposes an efficient lightweight model called GoalBERT. The model adopts multiple fusion technologies, integrating a lightweight and efficient BiGRU, which excels at handling context, into part of the Transformer’s self-attention layers. This reduces the high computational demand caused by stacking multiple self-attention layers while enhancing the model’s ability to process contextual information. To mitigate vanishing and exploding gradients during training, residual connections are added between core layers, yielding more stable training and steady performance improvement. Experimental results show that GoalBERT achieves recognition accuracy comparable to that of standard models, surpassing ALBERT’s accuracy by 10% in multi-entity-type scenarios. Furthermore, compared to standard models, GoalBERT reduces memory requirements by 200% and improves training speed by nearly 230%. These results indicate that GoalBERT is a high-quality lightweight model.
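The hybrid idea the abstract describes, keeping self-attention in some Transformer layers while replacing others with a BiGRU, and wrapping every layer in a residual connection, can be sketched as a toy forward pass. Everything below (the 4-layer stack, which layers are swapped, the width of 8, the single attention head) is a hypothetical NumPy illustration, not the paper's actual GoalBERT architecture or dimensions.

```python
import numpy as np

rng = np.random.default_rng(0)
D = 8  # model width (toy size; the paper's dimensions are not given in the abstract)
T = 5  # sequence length

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attn_layer(x, p):
    """Single-head scaled dot-product self-attention over a (T, D) sequence."""
    q, k, v = x @ p["Wq"], x @ p["Wk"], x @ p["Wv"]
    return softmax(q @ k.T / np.sqrt(D)) @ v

def gru_dir(x, p):
    """Unidirectional GRU scan over a (T, D) sequence; returns (T, D) states."""
    h, out = np.zeros(D), []
    for t in range(x.shape[0]):
        z = sigmoid(x[t] @ p["Wz"] + h @ p["Uz"])          # update gate
        r = sigmoid(x[t] @ p["Wr"] + h @ p["Ur"])          # reset gate
        h_hat = np.tanh(x[t] @ p["Wh"] + (r * h) @ p["Uh"])
        h = (1 - z) * h + z * h_hat
        out.append(h)
    return np.stack(out)

def bigru_layer(x, p):
    """BiGRU: forward and backward scans, concatenated, projected back to width D."""
    fwd = gru_dir(x, p["f"])
    bwd = gru_dir(x[::-1], p["b"])[::-1]
    return np.concatenate([fwd, bwd], axis=-1) @ p["Wo"]   # (T, 2D) -> (T, D)

def mat(*shape):
    return rng.normal(scale=0.1, size=shape)

def gru_params():
    return {k: mat(D, D) for k in ("Wz", "Uz", "Wr", "Ur", "Wh", "Uh")}

# Hypothetical 4-layer hybrid: the lower two layers keep self-attention,
# the upper two are replaced by BiGRU layers.
layers = []
for i in range(4):
    if i < 2:
        layers.append(("attn", {"Wq": mat(D, D), "Wk": mat(D, D), "Wv": mat(D, D)}))
    else:
        layers.append(("bigru", {"f": gru_params(), "b": gru_params(), "Wo": mat(2 * D, D)}))

def encode(x):
    for kind, p in layers:
        f = attn_layer if kind == "attn" else bigru_layer
        x = x + f(x, p)  # residual connection around every core layer
    return x

h = encode(rng.normal(size=(T, D)))
print(h.shape)  # (5, 8)
```

The residual `x + f(x, p)` around each layer is what the abstract credits with stabilizing training; a BiGRU scan costs O(T·D²) versus O(T²·D + T·D²) for full self-attention, which is the source of the claimed compute savings.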
format Article
id doaj-art-1ff9d98bf32044cb93dc4a61843b0908
institution OA Journals
issn 2076-3417
language English
publishDate 2024-11-01
publisher MDPI AG
record_format Article
series Applied Sciences
doi 10.3390/app142311003
affiliation School of Information Science and Engineering, Shenyang Ligong University, Shenyang 110159, China (all four authors)
title GoalBERT: A Lightweight Named-Entity Recognition Model Based on Multiple Fusion
topic named entity recognition
natural language processing
lightweight model
multiple fusion technologies
url https://www.mdpi.com/2076-3417/14/23/11003