Relevant SMS Spam Feature Selection Using Wrapper Approach and XGBoost Algorithm

In recent years, with the wide use of mobile devices, the problem of SMS spam has increased dramatically. Receiving these undesired messages continuously can frustrate users. Sometimes they can also be harmful, for example SMS messages that link to fake web pages in order to steal users’ confide...

Bibliographic Details
Main Authors: Diyari Jalal Mussa, Noor Ghazi M. Jameel
Format: Article
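Language: English
Published: Sulaimani Polytechnic University 2019-11-01
Series: Kurdistan Journal of Applied Research
Subjects: SMS spam, wrapper methods, sequential feature selection, sequential forward selection, sequential backward selection, boosting classifier, extreme gradient boosting, XGBoost.
Online Access: https://kjar.spu.edu.iq/index.php/kjar/article/view/338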
author Diyari Jalal Mussa
Noor Ghazi M. Jameel
collection DOAJ
description In recent years, with the wide use of mobile devices, the problem of SMS spam has increased dramatically. Receiving these undesired messages continuously can frustrate users. Sometimes they can also be harmful, for example SMS messages that link to fake web pages in order to steal users’ confidential information. Despite the many hazards of spam, only a limited number of spam filtering tools are available. In this paper, the XGBoost algorithm is used to address the SMS spam detection problem. A number of structural features were collected from previous studies, and 15 structural features were extracted from Tiago’s dataset, which is the dataset most frequently used by researchers. To reduce the feature set and select the most relevant features, two wrapper feature selection algorithms were used: sequential forward selection and sequential backward selection. The accuracy and performance obtained with the features chosen by sequential backward selection were better than those obtained with sequential forward selection. The nine optimal features extracted can be a good representation of a spam SMS message. The classification accuracy obtained by the proposed method, using the nine optimal features with the XGBoost algorithm, is 98.64% under 10-fold cross-validation.
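The wrapper approach named in the description can be illustrated with a minimal sketch of sequential backward selection scored by k-fold cross-validation. Everything here is an assumption made for illustration: the function names are hypothetical, the data is synthetic rather than Tiago's dataset, and a nearest-centroid classifier stands in for XGBoost so that the example runs on the Python standard library alone. It is not the authors' implementation.

```python
import random


def nearest_centroid_predict(train_X, train_y, test_X, cols):
    """Stand-in classifier (NOT XGBoost): assign each row to the closest class centroid."""
    centroids = {}
    for label in sorted(set(train_y)):
        rows = [x for x, lab in zip(train_X, train_y) if lab == label]
        centroids[label] = [sum(r[c] for r in rows) / len(rows) for c in cols]
    preds = []
    for x in test_X:
        dist = {label: sum((x[c] - m) ** 2 for c, m in zip(cols, centre))
                for label, centre in centroids.items()}
        preds.append(min(dist, key=dist.get))
    return preds


def cv_accuracy(X, y, cols, k=10, seed=0):
    """Mean accuracy over k folds, using only the feature columns in `cols`."""
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    fold_scores = []
    for fold in folds:
        held_out = set(fold)
        train = [i for i in idx if i not in held_out]
        preds = nearest_centroid_predict(
            [X[i] for i in train], [y[i] for i in train],
            [X[i] for i in fold], cols)
        hits = sum(p == y[i] for p, i in zip(preds, fold))
        fold_scores.append(hits / len(fold))
    return sum(fold_scores) / k


def sequential_backward_selection(X, y, n_keep, score=cv_accuracy):
    """Start from all features; repeatedly drop the one whose removal scores best."""
    selected = list(range(len(X[0])))
    while len(selected) > n_keep:
        # Score every candidate subset obtained by removing one feature.
        # Ties are broken by dropping the higher-indexed feature.
        candidates = [(score(X, y, [c for c in selected if c != drop]), drop)
                      for drop in selected]
        _, to_drop = max(candidates)
        selected.remove(to_drop)
    return selected


# Synthetic data (an assumption): columns 0 and 1 separate the classes,
# columns 2 and 3 are pure noise.
rng = random.Random(42)
y = [i % 2 for i in range(60)]
X = [[10 * label + rng.gauss(0, 1), 10 * label + rng.gauss(0, 1),
      rng.gauss(0, 1), rng.gauss(0, 1)] for label in y]

kept = sequential_backward_selection(X, y, n_keep=2)
acc = cv_accuracy(X, y, kept)
print("kept feature indices:", kept, "10-fold accuracy:", acc)
```

Sequential forward selection is the mirror image: it starts from an empty set and adds the feature that most improves the score at each step. To move closer to the setup in the description, the `score` argument could be replaced by a function that returns the cross-validated accuracy of an XGBoost classifier on the chosen columns.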
format Article
id doaj-art-1024a4e8eef546888610c19410f7817e
institution Kabale University
issn 2411-7684
2411-7706
language English
publishDate 2019-11-01
publisher Sulaimani Polytechnic University
record_format Article
series Kurdistan Journal of Applied Research
spelling doaj-art-1024a4e8eef546888610c19410f7817e; 2025-02-09T21:00:25Z; eng; Sulaimani Polytechnic University; Kurdistan Journal of Applied Research; 2411-7684; 2411-7706; 2019-11-01; 4; 2; 10.24017/science.2019.2.11; 338; Relevant SMS Spam Feature Selection Using Wrapper Approach and XGBoost Algorithm; Diyari Jalal Mussa (Information technology Department, Technical College of Informatics, Sulaimani Polytechnic University, Sulaimani, Iraq); Noor Ghazi M. Jameel (Computer Networks Department, Technical College of Informatics, Sulaimani Polytechnic University, Sulaimani, Iraq); https://kjar.spu.edu.iq/index.php/kjar/article/view/338; SMS spam, wrapper methods, sequential feature selection, sequential forward selection, sequential backward selection, boosting classifier, extreme gradient boosting, XGBoost.
title Relevant SMS Spam Feature Selection Using Wrapper Approach and XGBoost Algorithm
topic SMS spam, wrapper methods, sequential feature selection, sequential forward selection, sequential backward selection, boosting classifier, extreme gradient boosting, XGBoost.
url https://kjar.spu.edu.iq/index.php/kjar/article/view/338