Application of Improved MDSMOTE and FC-SVM in Imbalanced Data Set Classification

Online shopping review data sets are often extremely imbalanced. To improve classification accuracy on such imbalanced data, improvements are needed at both the sample level and the algorithm level. To address a problem in the MDSMOTE algorithm, where generating part of the...

Bibliographic Details
Main Authors: WEN Xue-yan, ZHAO Li-ying, XU Ke-sheng, LU Guang
Format: Article
Language: zho
Published: Harbin University of Science and Technology Publications 2018-08-01
Series: Journal of Harbin University of Science and Technology
Subjects:
Online Access: https://hlgxb.hrbust.edu.cn/#/digest?ArticleID=1561
collection DOAJ
description Online shopping review data sets are often extremely imbalanced. To improve classification accuracy on such imbalanced data, improvements are needed at both the sample level and the algorithm level. The MDSMOTE algorithm cannot include misclassified samples when it generates part of the new samples; to address this, correct classification of the misclassified samples is added to the existing MDSMOTE algorithm, which improves the quality of the generated samples. Traditional FSVM cannot resolve the hyperplane bias that affects the minority class when classifying imbalanced data sets; to address this, positive and negative penalty coefficients and a fuzzy factor are added to the FSVM to improve the recognition rate on imbalanced data. The improved algorithm is applied to the classification of a JingDong online shopping review data set. Its F-measure increases by 9.13% on average, which indicates that the method is feasible and effective.
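The two ideas the description relies on, oversampling the minority class and scoring with the F-measure, can be illustrated with a minimal sketch. This is plain SMOTE-style interpolation, not the improved MDSMOTE of the article, whose details this record does not give. The function names `smote_like_oversample` and `f_measure`, the parameter `k`, and the toy data are all assumptions made for illustration.

```python
import math
import random


def smote_like_oversample(minority, n_new, k=3, seed=0):
    """Create n_new synthetic minority points by plain SMOTE-style
    interpolation (an illustrative stand-in, not the article's MDSMOTE).

    `minority` is a list of at least two equal-length numeric tuples.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # The k nearest minority neighbours of `base`, excluding itself.
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: math.dist(base, p),
        )[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment, in [0, 1)
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbour))
        )
    return synthetic


def f_measure(y_true, y_pred, positive=1):
    """F-measure (harmonic mean of precision and recall) for one class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    if tp == 0:
        return 0.0  # no true positives: precision and recall are zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Hypothetical toy minority class: the corners of the unit square.
    minority = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
    print(smote_like_oversample(minority, n_new=3, k=2))
    print(f_measure([1, 1, 0, 0], [1, 0, 1, 0]))
```

Every synthetic point lies on a segment between two existing minority samples, so it stays inside the region the minority class already occupies. In the toy call, precision and recall are both 0.5, so the F-measure is 0.5.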
id doaj-art-ad854acbbd7044e59e8789ebaac1a331
institution Kabale University
issn 1007-2683
doi 10.15938/j.jhust.2018.04.016
volume 23
issue 04
pages 87-94
affiliation WEN Xue-yan: School of Information and Computer Engineering, Northeast Forestry University, Heilongjiang, Harbin 150040, China
ZHAO Li-ying: School of Information and Computer Engineering, Northeast Forestry University, Heilongjiang, Harbin 150040, China
XU Ke-sheng: State Forestry Administration, Harbin Forestry Machinery Research Institute, Heilongjiang, Harbin 150086, China
LU Guang: School of Information and Computer Engineering, Northeast Forestry University, Heilongjiang, Harbin 150040, China
topic imbalanced data sets
support vector machines
SMOTE algorithm
text categorization