Exploiting Semantic Annotations and Q-Learning for Constructing an Efficient Hierarchy/Graph Texts Organization

Tremendous growth in the number of textual documents has produced daily requirements for effective development to explore, analyze, and discover knowledge from these textual documents. Conventional text mining and managing systems mainly use the presence or absence of key words to discover and analy...

Full description

Saved in:
Bibliographic Details
Main Authors: Asmaa M. El-Said, Ali I. Eldesoky, Hesham A. Arafat
Format: Article
Language:English
Published: Wiley 2015-01-01
Series:The Scientific World Journal
Online Access:http://dx.doi.org/10.1155/2015/136172
Tags: Add Tag
No Tags, Be the first to tag this record!
Description
Summary:Tremendous growth in the number of textual documents has produced daily requirements for effective development to explore, analyze, and discover knowledge from these textual documents. Conventional text mining and managing systems mainly use the presence or absence of key words to discover and analyze useful information from textual documents. However, simple word counts and frequency distributions of term appearances do not capture the meaning behind the words, which results in limiting the ability to mine the texts. This paper proposes an efficient methodology for constructing hierarchy/graph-based texts organization and representation scheme based on semantic annotation and Q-learning. This methodology is based on semantic notions to represent the text in documents, to infer unknown dependencies and relationships among concepts in a text, to measure the relatedness between text documents, and to apply mining processes using the representation and the relatedness measure. The representation scheme reflects the existing relationships among concepts and facilitates accurate relatedness measurements that result in a better mining performance. An extensive experimental evaluation is conducted on real datasets from various domains, indicating the importance of the proposed approach.
ISSN:2356-6140
1537-744X