A Lexicon-Based Framework for Mining and Analysis of Arabic Comparative Sentences

People tend to share their opinions on social media daily. This text needs to be mined accurately for purposes such as improving services and products. Mining and analyzing Arabic text has been a major challenge due to the many complexities inherent in the Arabic language. Although many re...

Bibliographic Details
Main Authors: Alaa Hamed, Arabi Keshk, Anas Youssef
Format: Article
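Language: English
Published: MDPI AG 2025-01-01
Series: Algorithms
Subjects: natural language processing; Arabic text mining; comparative opinion; comparative sentence identification; type identification; relation extraction
Online Access: https://www.mdpi.com/1999-4893/18/1/44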
author Alaa Hamed
Arabi Keshk
Anas Youssef
collection DOAJ
description People tend to share their opinions on social media daily. This text needs to be mined accurately for purposes such as improving services and products. Mining and analyzing Arabic text has been a major challenge due to the many complexities inherent in the Arabic language. Although many research studies have already investigated Arabic sentiment analysis, this paper addresses the more specific topic of Arabic comparative opinion mining, which has not been widely investigated. The paper proposes a lexicon-based framework comprising a set of algorithms for the mining and analysis of Arabic comparative sentences. Its contributions include a lexicon of Arabic comparative sentence keywords and an algorithm for identifying Arabic comparative sentences, followed by a second algorithm that classifies the identified comparative sentences into different types. A third algorithm extracts the relations between entities in each identified comparative sentence type. Finally, two further algorithms extract the preferred entity in each sentence type. The framework was evaluated on three different Arabic-language datasets using precision, recall, F-score, and accuracy. The average values of these metrics reached 97% for the proposed sentence identification algorithm and 96% for the proposed sentence type identification algorithm. Finally, the proposed relation extraction algorithm achieved an average relation word extraction precision of 97%.
format Article
id doaj-art-a7eefe89d1a24137a66f65292d56036e
institution Kabale University
issn 1999-4893
language English
publishDate 2025-01-01
publisher MDPI AG
record_format Article
series Algorithms
spelling doaj-art-a7eefe89d1a24137a66f65292d56036e; 2025-01-24T13:17:36Z; eng; MDPI AG; Algorithms; ISSN 1999-4893; 2025-01-01; vol. 18, no. 1, art. 44; DOI 10.3390/a18010044; A Lexicon-Based Framework for Mining and Analysis of Arabic Comparative Sentences; Alaa Hamed, Arabi Keshk, Anas Youssef (all: Computer Science Department, Faculty of Computers and Information, Menoufia University, Shebin El Kom 32511, Egypt); https://www.mdpi.com/1999-4893/18/1/44
title A Lexicon-Based Framework for Mining and Analysis of Arabic Comparative Sentences
topic natural language processing
Arabic text mining
comparative opinion
comparative sentence identification
type identification
relation extraction
url https://www.mdpi.com/1999-4893/18/1/44
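
The description above names precision, recall, F-score, and accuracy as the evaluation metrics. As a reading aid, the following is a minimal sketch of how these four values are conventionally computed for a binary task such as comparative sentence identification. It is not the authors' evaluation code: the function name `binary_metrics`, the `Metrics` container, and the toy labels are assumptions introduced only for illustration, and the F-score shown is the balanced F1 variant.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class Metrics:
    """Hypothetical container for the four metrics named in the description."""
    precision: float
    recall: float
    f_score: float
    accuracy: float


def binary_metrics(gold: Sequence[bool], predicted: Sequence[bool]) -> Metrics:
    """Compute standard binary metrics.

    True means "comparative sentence", False means "non-comparative".
    This is an illustrative helper, not part of the published framework.
    """
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")
    if not gold:
        raise ValueError("at least one labeled sentence is required")

    # Count the four cells of the confusion matrix.
    tp = sum(1 for g, p in zip(gold, predicted) if g and p)
    fp = sum(1 for g, p in zip(gold, predicted) if not g and p)
    fn = sum(1 for g, p in zip(gold, predicted) if g and not p)
    tn = sum(1 for g, p in zip(gold, predicted) if not g and not p)

    # Guard against division by zero when a class is never predicted or present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f_score = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    accuracy = (tp + tn) / len(gold)
    return Metrics(precision, recall, f_score, accuracy)


if __name__ == "__main__":
    # Toy labels, invented for this example only.
    gold = [True, True, True, False, False]
    predicted = [True, True, False, True, False]
    print(binary_metrics(gold, predicted))
```

With the toy labels there are two true positives, one false positive, one false negative, and one true negative, so precision, recall, and F-score each equal 2/3 and accuracy equals 3/5. The averages reported in the description would correspond to averaging such values, although the record does not say exactly how the averaging was done.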