Few-shot Named Entity Recognition for Medical Text

To address the shortage of labeled data for named entity recognition in medical text, a new deep neural network for named entity recognition and a data augmentation method are proposed. First, the BERT word vectors are extended with the pinyin and strokes of Chinese characters so that they contain...
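The extension of character vectors with pinyin and stroke information can be pictured as a simple concatenation of features. The sketch below is only an illustration under stated assumptions: the lookup tables, the letter-count pinyin encoding, the normalized stroke count, and the 4-dimensional stand-in for a BERT output are all hypothetical choices, not the encoding used by the authors.

```python
# Illustrative sketch only: the tables, encodings and dimensions are
# hypothetical and are not taken from the article.
from typing import Dict, List

# Toy pinyin (without tones) and stroke counts for a few characters.
PINYIN: Dict[str, str] = {"肝": "gan", "胃": "wei", "炎": "yan"}
STROKES: Dict[str, int] = {"肝": 7, "胃": 9, "炎": 8}


def pinyin_vector(ch: str) -> List[float]:
    """Encode a character's pinyin as a 26-dim vector of letter counts."""
    vec = [0.0] * 26
    for letter in PINYIN.get(ch, ""):
        vec[ord(letter) - ord("a")] += 1.0
    return vec


def stroke_vector(ch: str, max_strokes: int = 30) -> List[float]:
    """Encode the stroke count as one value scaled to [0, 1]."""
    return [min(STROKES.get(ch, 0), max_strokes) / max_strokes]


def fuse(ch: str, contextual: List[float]) -> List[float]:
    """Concatenate a contextual vector with pinyin and stroke features."""
    return list(contextual) + pinyin_vector(ch) + stroke_vector(ch)


# A 4-dim list stands in for the BERT output of the character.
fused = fuse("肝", [0.1, -0.2, 0.3, 0.0])
print(len(fused))  # 4 contextual + 26 pinyin + 1 stroke dimensions
```

In a trained model the pinyin and stroke features would more likely be learned embeddings; fixed hand-built features are used here only to keep the example self-contained.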

Bibliographic Details
Main Authors: QIN Jian, HOU Jian-xin, XIE Yi-ning, HE Yong-jun
Format: Article
Language: zho
Published: Harbin University of Science and Technology Publications 2021-08-01
Series: Journal of Harbin University of Science and Technology
Subjects: named entity recognition; few-shot; data augmentation; joint training; feature fusion
Online Access: https://hlgxb.hrbust.edu.cn/#/digest?ArticleID=1998
_version_ 1849771931664908288
author QIN Jian
HOU Jian-xin
XIE Yi-ning
HE Yong-jun
collection DOAJ
description To address the shortage of labeled data for named entity recognition in medical text, a new deep neural network for named entity recognition and a data augmentation method are proposed. First, the BERT word vectors are extended with the pinyin and strokes of Chinese characters so that they carry more useful information. Then, the named entity recognition model and a word segmentation model are jointly trained to strengthen the model's ability to recognize entity boundaries. Finally, an improved data augmentation method is applied to the training data, which improves the model's recognition of named entities while avoiding overfitting. Experimental results on the electronic medical record text provided by CCKS-2019 show that the proposed method effectively improves the accuracy of named entity recognition in the small-sample setting, and that the recognition rate shows no significant decrease when the training data is reduced by half.
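The description does not say how the improved augmentation method works, so the following is not the authors' method. It is a minimal sketch of a common baseline, entity mention replacement, in which each labeled entity in a BIO-tagged sentence is swapped for another mention of the same type. The `DISEASE` type, the lexicon entries and the function names are assumptions made for illustration.

```python
# Sketch of mention-replacement augmentation for BIO-tagged text.
# This is a generic baseline, NOT the improved method from the article;
# the entity type and the lexicon below are hypothetical.
import random
from typing import Dict, List, Tuple


def extract_spans(tags: List[str]) -> List[Tuple[int, int, str]]:
    """Return (start, end, type) per entity; end is exclusive."""
    spans, start, etype = [], None, None
    for i, tag in enumerate(tags + ["O"]):  # sentinel closes a final span
        inside = start is not None and tag == "I-" + str(etype)
        if start is not None and not inside:
            spans.append((start, i, etype))
            start, etype = None, None
        if tag.startswith("B-"):
            start, etype = i, tag[2:]
    return spans


def replace_mentions(chars: List[str], tags: List[str],
                     lexicon: Dict[str, List[str]],
                     rng: random.Random) -> Tuple[List[str], List[str]]:
    """Swap each entity for a random same-type mention and re-tag it."""
    new_chars: List[str] = []
    new_tags: List[str] = []
    cursor = 0
    for start, end, etype in extract_spans(tags):
        new_chars.extend(chars[cursor:start])
        new_tags.extend(tags[cursor:start])
        candidates = [m for m in lexicon.get(etype, []) if m]
        mention = rng.choice(candidates) if candidates else "".join(chars[start:end])
        new_chars.extend(mention)  # one list item per character
        new_tags.extend(["B-" + etype] + ["I-" + etype] * (len(mention) - 1))
        cursor = end
    new_chars.extend(chars[cursor:])
    new_tags.extend(tags[cursor:])
    return new_chars, new_tags


chars = list("患者胃炎复发")
tags = ["O", "O", "B-DISEASE", "I-DISEASE", "O", "O"]
lexicon = {"DISEASE": ["肝炎", "胃溃疡"]}
aug_chars, aug_tags = replace_mentions(chars, tags, lexicon, random.Random(0))
print("".join(aug_chars), aug_tags)
```

Because the replacement is re-tagged character by character, the augmented sentence keeps characters and tags aligned even when the new mention has a different length from the original.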
format Article
id doaj-art-01dcc431ee1a4d65bf5a90616dae2510
institution DOAJ
issn 1007-2683
language zho
publishDate 2021-08-01
publisher Harbin University of Science and Technology Publications
record_format Article
series Journal of Harbin University of Science and Technology
spelling doaj-art-01dcc431ee1a4d65bf5a90616dae2510
2025-08-20T03:02:28Z
zho
Harbin University of Science and Technology Publications
Journal of Harbin University of Science and Technology
1007-2683
2021-08-01
Volume 26, Issue 04, Pages 94-101
DOI 10.15938/j.jhust.2021.04.013
Few-shot Named Entity Recognition for Medical Text
QIN Jian, HOU Jian-xin, XIE Yi-ning, HE Yong-jun
School of Computer Science and Technology, Harbin University of Science and Technology, Harbin 150080, China (all four authors)
https://hlgxb.hrbust.edu.cn/#/digest?ArticleID=1998
named entity recognition; few-shot; data augmentation; joint training; feature fusion
title Few-shot Named Entity Recognition for Medical Text
topic named entity recognition
few-shot
data augmentation
joint training
feature fusion
url https://hlgxb.hrbust.edu.cn/#/digest?ArticleID=1998