Enhancing Text Similarity Measurement with Hybrid Siamese Neural Networks and Lexical Features
| Main Author: | |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | Bilijipub publisher, 2025-03-01 |
| Series: | Advances in Engineering and Intelligence Systems |
| Subjects: | |
| Online Access: | https://aeis.bilijipub.com/article_218018_edc285458ddacd0913c93d26caca7639.pdf |
| Summary: | Accurate measurement of text similarity is important in many text-centric applications, including text clustering, information retrieval, and question answering systems. This study aims to improve the precision with which deep learning models gauge text similarity. To that end, it proposes a hybrid approach that integrates a Siamese neural network with lexical similarity features. The Siamese network comprises two parallel sub-networks, each consisting of a word embedding layer and a deep neural network. The study builds a range of models by combining three deep neural network variants (CNN, LSTM, Bi-LSTM) with two types of word embedding models and with lexical similarity features. Evaluation on three distinct datasets shows that the hybrid Siamese model built on convolutional networks and lexical features achieves higher Pearson's correlation and lower mean squared error (MSE) than models from the literature. The combined Siamese network model that incorporates a convolutional network, lexical features, and the cross-embedding layer (SNN_CNN_feat) achieved the highest correlation (0.7590) and the lowest MSE (1.0235). |
|---|---|
| ISSN: | 2821-0263 |
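
The summary refers to lexical similarity features and to evaluation by Pearson's correlation and MSE. The sketch below illustrates those ideas in plain Python. It is a minimal illustration under stated assumptions, not the article's implementation: the record does not say which lexical features the authors used, so Jaccard token overlap stands in as one common example, and all function names are hypothetical.

```python
import math


def jaccard_similarity(text_a: str, text_b: str) -> float:
    """Token-set overlap between two texts.

    Assumption: Jaccard is used here only as an example of a lexical
    similarity feature; the catalog record does not name the features.
    """
    tokens_a = set(text_a.lower().split())
    tokens_b = set(text_b.lower().split())
    if not tokens_a and not tokens_b:
        return 1.0  # two empty texts are treated as identical
    return len(tokens_a & tokens_b) / len(tokens_a | tokens_b)


def pearson_correlation(predicted, gold):
    """Pearson's r between predicted and gold similarity scores."""
    n = len(predicted)
    if n != len(gold) or n < 2:
        raise ValueError("need two equal-length sequences of at least 2 scores")
    mean_p = sum(predicted) / n
    mean_g = sum(gold) / n
    cov = sum((p - mean_p) * (g - mean_g) for p, g in zip(predicted, gold))
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    var_g = sum((g - mean_g) ** 2 for g in gold)
    if var_p == 0 or var_g == 0:
        raise ValueError("correlation is undefined for constant scores")
    return cov / math.sqrt(var_p * var_g)


def mean_squared_error(predicted, gold):
    """Average squared difference between predicted and gold scores."""
    if len(predicted) != len(gold) or not predicted:
        raise ValueError("need two equal-length, non-empty sequences")
    return sum((p - g) ** 2 for p, g in zip(predicted, gold)) / len(predicted)
```

In a hybrid setup like the one the summary describes, a value such as `jaccard_similarity(a, b)` would presumably be supplied alongside the Siamese network's learned representation before the final similarity score is predicted. How the article combines the two is not stated in this record.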