Survey on terminology extraction from texts

Bibliographic Details
Main Authors: Kang Xu, Yifan Feng, Qiandi Li, Zhenjiang Dong, Jianxiang Wei
Format: Article
Language: English
Published: SpringerOpen 2025-02-01
Series: Journal of Big Data
Online Access: https://doi.org/10.1186/s40537-025-01077-x
Description
Summary: Automatic extraction of domain-related terminology from natural language texts is an important research topic with many practical applications, such as text summarization and knowledge graph construction. It has a great impact on improving the quality of topic construction and the accuracy of semantic retrieval. In recent years, automatic terminology extraction (ATE) has attracted widespread scholarly attention and produced a rich body of research. In this paper, we present a survey of terminology extraction from natural language texts. The survey covers definitions of the pertinent problems and concepts, a systematic classification of proposed methods, and an exploration of the associated datasets and tools, among other aspects. To the best of our knowledge, this is the first review that systematically summarizes work on terminology extraction based on language models (LMs), offering a comprehensive guide for researchers and practitioners in the field. The review thus offers valuable insights for researchers interested in terminology extraction within the Natural Language Processing (NLP) domain.
ISSN: 2196-1115