A Hybrid Semantic Representation Method Based on Fusion Conceptual Knowledge and Weighted Word Embeddings for English Texts

The accuracy of traditional topic models may be compromised due to the sparsity of co-occurring vocabulary in the corpus, whereas conventional word embedding models tend to excessively prioritize contextual semantic information and inadequately capture domain-specific features in the text. This pape...

Bibliographic Details
Main Authors: Zan Qiu, Guimin Huang, Xingguo Qin, Yabing Wang, Jiahao Wang, Ya Zhou
Format: Article
Language: English
Published: MDPI AG 2024-11-01
Series: Information
Subjects: text representation; conceptual knowledge; word embeddings; information fusion
Online Access: https://www.mdpi.com/2078-2489/15/11/708
_version_ 1850227273811099648
author Zan Qiu
Guimin Huang
Xingguo Qin
Yabing Wang
Jiahao Wang
Ya Zhou
author_sort Zan Qiu
collection DOAJ
description The accuracy of traditional topic models can suffer when co-occurring vocabulary in the corpus is sparse, whereas conventional word embedding models tend to over-emphasize contextual semantic information and capture domain-specific features of the text poorly. This paper proposes a hybrid semantic representation method that combines a topic model integrating conceptual knowledge with a weighted word embedding model. Specifically, we construct a topic model that incorporates the Probase concept knowledge base to perform topic clustering and obtain a topic semantic representation. In addition, we design a weighted word embedding model to enhance the representation of the text's contextual semantic information. A feature-based information fusion model then integrates the two textual representations to generate the hybrid semantic representation. The proposed model was evaluated on several English composition test sets. The results show that it achieves higher accuracy and greater practical value than existing text representation methods.
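The description above names two ingredients, a weighted word embedding and a feature-based fusion with a topic representation, without giving formulas in this record. The following is only a rough sketch under stated assumptions, not the authors' method: it assumes TF-IDF weights for the embedding average and plain concatenation of L2-normalized vectors for the fusion. The word vectors, the topic distribution, and all function names are hypothetical placeholders.

```python
import math
from collections import Counter

def idf_weights(docs):
    """Smoothed inverse document frequency per word (assumed weighting scheme)."""
    n = len(docs)
    df = Counter(word for doc in docs for word in set(doc))
    return {w: math.log((1 + n) / (1 + c)) + 1.0 for w, c in df.items()}

def weighted_embedding(doc, vectors, idf):
    """TF-IDF weighted average of the word vectors found in `doc`."""
    dim = len(next(iter(vectors.values())))
    total, acc = 0.0, [0.0] * dim
    for word, tf in Counter(doc).items():
        if word not in vectors:
            continue  # skip out-of-vocabulary words
        weight = tf * idf.get(word, 1.0)
        total += weight
        acc = [a + weight * v for a, v in zip(acc, vectors[word])]
    return [a / total for a in acc] if total else acc

def l2_normalize(vec):
    """Scale a vector to unit length; zero vectors are returned unchanged."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm else list(vec)

def fuse(topic_vec, embedding_vec):
    """Feature-level fusion, assumed here to be concatenation of normalized parts."""
    return l2_normalize(topic_vec) + l2_normalize(embedding_vec)

# Toy inputs: two tokenized compositions, 2-d word vectors, a 3-topic distribution.
docs = [["school", "teacher", "exam"], ["exam", "score", "score"]]
vectors = {
    "school": [1.0, 0.0],
    "teacher": [0.8, 0.2],
    "exam": [0.5, 0.5],
    "score": [0.0, 1.0],
}
topic_dist = [0.7, 0.2, 0.1]  # placeholder for a topic model's output

idf = idf_weights(docs)
hybrid = fuse(topic_dist, weighted_embedding(docs[1], vectors, idf))
```

In this sketch `hybrid` has one slot per topic followed by one slot per embedding dimension, so a downstream classifier sees both views of the text. Normalizing each part before concatenation keeps either view from dominating purely because of scale; the paper itself may weight or combine the parts differently.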
format Article
id doaj-art-5f18c2ca41b4478dbe4ea8e3046a0e84
institution OA Journals
issn 2078-2489
language English
publishDate 2024-11-01
publisher MDPI AG
record_format Article
series Information
spelling doaj-art-5f18c2ca41b4478dbe4ea8e3046a0e84
2025-08-20T02:04:52Z
eng
MDPI AG
Information
2078-2489
2024-11-01
15(11), 708
10.3390/info15110708
A Hybrid Semantic Representation Method Based on Fusion Conceptual Knowledge and Weighted Word Embeddings for English Texts
Zan Qiu, Guimin Huang, Xingguo Qin, Yabing Wang, Jiahao Wang, Ya Zhou
Affiliation (all authors): Guangxi Key Laboratory of Image and Graphic Intelligent Processing, School of Computer Science and Information Security, Guilin University of Electronic Technology, Guilin 541004, China
https://www.mdpi.com/2078-2489/15/11/708
text representation; conceptual knowledge; word embeddings; information fusion
title A Hybrid Semantic Representation Method Based on Fusion Conceptual Knowledge and Weighted Word Embeddings for English Texts
title_sort hybrid semantic representation method based on fusion conceptual knowledge and weighted word embeddings for english texts
topic text representation
conceptual knowledge
word embeddings
information fusion
url https://www.mdpi.com/2078-2489/15/11/708