Survey on terminology extraction from texts

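Abstract Automatic extraction of domain-related terminology from natural language texts is an important research topic with many practical applications, such as text summarization and knowledge graph construction. It has a great impact on improving the quality of topic construction and the accuracy of semantic retrieval. In recent years, automatic terminology extraction (ATE) has attracted widespread attention from scholars and produced a rich body of research. In this paper, we present a survey of terminology extraction from natural language texts. The survey covers definitions of pertinent issues and concepts, a systematic classification of proposed methodologies, and an exploration of the associated datasets and tools, among other aspects. To the best of our knowledge, this is the first review that systematically summarizes work on terminology extraction based on language models (LMs), offering comprehensive guidance for researchers and practitioners in the field. Consequently, this review offers valuable insights for researchers interested in terminology extraction within the Natural Language Processing (NLP) domain.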

Bibliographic Details
Main Authors: Kang Xu, Yifan Feng, Qiandi Li, Zhenjiang Dong, Jianxiang Wei
Format: Article
Language: English
Published: SpringerOpen 2025-02-01
Series: Journal of Big Data
Subjects: Terminology extraction; Terminology recognition; Survey; Natural language processing
Online Access: https://doi.org/10.1186/s40537-025-01077-x
author Kang Xu
Yifan Feng
Qiandi Li
Zhenjiang Dong
Jianxiang Wei
collection DOAJ
description Abstract Automatic extraction of domain-related terminology from natural language texts is an important research topic with many practical applications, such as text summarization and knowledge graph construction. It has a great impact on improving the quality of topic construction and the accuracy of semantic retrieval. In recent years, automatic terminology extraction (ATE) has attracted widespread attention from scholars and produced a rich body of research. In this paper, we present a survey of terminology extraction from natural language texts. The survey covers definitions of pertinent issues and concepts, a systematic classification of proposed methodologies, and an exploration of the associated datasets and tools, among other aspects. To the best of our knowledge, this is the first review that systematically summarizes work on terminology extraction based on language models (LMs), offering comprehensive guidance for researchers and practitioners in the field. Consequently, this review offers valuable insights for researchers interested in terminology extraction within the Natural Language Processing (NLP) domain.
format Article
id doaj-art-8d2770b26e264e5e9b7e0b65770003fb
institution Kabale University
issn 2196-1115
language English
publishDate 2025-02-01
publisher SpringerOpen
record_format Article
series Journal of Big Data
spelling doaj-art-8d2770b26e264e5e9b7e0b65770003fb
2025-02-09T12:41:21Z
eng
SpringerOpen
Journal of Big Data
2196-1115
2025-02-01
121140
10.1186/s40537-025-01077-x
Survey on terminology extraction from texts
Kang Xu (School of Computer Science, Nanjing University of Posts and Telecommunication)
Yifan Feng (School of Computer Science, Nanjing University of Posts and Telecommunication)
Qiandi Li (School of Computer Science and Engineering, Southeast University)
Zhenjiang Dong (School of Computer Science, Nanjing University of Posts and Telecommunication)
Jianxiang Wei (School of Management, Nanjing University of Posts and Telecommunication)
https://doi.org/10.1186/s40537-025-01077-x
Terminology extraction
Terminology recognition
Survey
Natural language processing
title Survey on terminology extraction from texts
topic Terminology extraction
Terminology recognition
Survey
Natural language processing
url https://doi.org/10.1186/s40537-025-01077-x