TSHD: Topic Segmentation Based on Headings Detection (Case Study: Resumes)

Bibliographic Details
Main Authors: Majd E. Tannous, Wassim H. Ramadan, Mohanad A. Rajab
Format: Article
Language: English
Published: Wiley 2023-01-01
Series: Advances in Human-Computer Interaction
Online Access: http://dx.doi.org/10.1155/2023/6044007
author Majd E. Tannous
Wassim H. Ramadan
Mohanad A. Rajab
collection DOAJ
description Many unstructured documents contain segments with specific topics. Extracting these segments and identifying their topics helps to access the required information directly. This can improve the quality of many NLP applications such as information extraction, information retrieval, summarization, and question answering. Resumes (CVs) are unstructured documents that have diverse formats. They contain various segments such as personal information, experience, and education. Manually processing resumes to find the most suitable candidates for a particular job is a difficult task. Due to the increased amount of data, it has become very necessary to manipulate resumes by computer to save time and effort. This research presents a new algorithm named TSHD for topic segmentation based on headings detection. We apply the algorithm to extract resume segments and identify their topics. The proposed TSHD algorithm is accurate and addresses many weaknesses in previous studies. Evaluation results show a very high F1 score (about 96%) and a very low segmentation error (about 2%). The algorithm can be easily adapted to deal with other textual domains that contain headings in their segments.
format Article
id doaj-art-d72addf3236249d4aebf6eb9356bd3e2
institution DOAJ
issn 1687-5907
language English
publishDate 2023-01-01
publisher Wiley
record_format Article
series Advances in Human-Computer Interaction
spelling doaj-art-d72addf3236249d4aebf6eb9356bd3e2; 2025-08-20T03:23:52Z; eng; Wiley; Advances in Human-Computer Interaction; 1687-5907; 2023-01-01; 2023; 10.1155/2023/6044007; TSHD: Topic Segmentation Based on Headings Detection (Case Study: Resumes); Majd E. Tannous (Department of Computer Engineering); Wassim H. Ramadan (Department of Computer Engineering); Mohanad A. Rajab (Faculty of Informatics Engineering)
title TSHD: Topic Segmentation Based on Headings Detection (Case Study: Resumes)
url http://dx.doi.org/10.1155/2023/6044007