Application of Improved MDSMOTE and FC-SVM in Imbalanced Data Set Classification
Online shopping review data sets are often extremely imbalanced. To improve classification accuracy on such data, improvements are needed at both the sample level and the algorithm level. For one problem in the MDSMOTE algorithm, namely that when generating part of the...
| Main Authors: | WEN Xue-yan, ZHAO Li-ying, XU Ke-sheng, LU Guang |
|---|---|
| Format: | Article |
| Language: | zho |
| Published: | Harbin University of Science and Technology Publications, 2018-08-01 |
| Series: | Journal of Harbin University of Science and Technology |
| Subjects: | imbalanced data sets; support vector machines; smote algorithm; text categorization |
| Online Access: | https://hlgxb.hrbust.edu.cn/#/digest?ArticleID=1561 |
| _version_ | 1849229037381091328 |
|---|---|
| author | WEN Xue-yan ZHAO Li-ying XU Ke-sheng LU Guang |
| author_facet | WEN Xue-yan ZHAO Li-ying XU Ke-sheng LU Guang |
| author_sort | WEN Xue-yan |
| collection | DOAJ |
| description | Online shopping review data sets are often extremely imbalanced. To improve classification accuracy on such data, improvements are needed at both the sample level and the algorithm level. The MDSMOTE algorithm cannot take misclassified samples into account when generating part of the new samples, so correct classification of the misclassified samples is added to the existing MDSMOTE algorithm to improve sample quality. The traditional FSVM cannot resolve the hyperplane bias for the minority class when classifying imbalanced data sets, so positive and negative penalty coefficients and a fuzzy factor are added to the FSVM to improve the recognition rate on imbalanced data. The improved algorithm is applied to the classification of a JingDong online shopping review data set. Its F-measure increases by 9.13% on average, which indicates the feasibility and effectiveness of the method. |
| format | Article |
| id | doaj-art-ad854acbbd7044e59e8789ebaac1a331 |
| institution | Kabale University |
| issn | 1007-2683 |
| language | zho |
| publishDate | 2018-08-01 |
| publisher | Harbin University of Science and Technology Publications |
| record_format | Article |
| series | Journal of Harbin University of Science and Technology |
| spelling | doaj-art-ad854acbbd7044e59e8789ebaac1a331; 2025-08-22T09:13:21Z; zho; Harbin University of Science and Technology Publications; Journal of Harbin University of Science and Technology; 1007-2683; 2018-08-01; 23; 04; 87; 94; 10.15938/j.jhust.2018.04.016; Application of Improved MDSMOTE and FC-SVM in Imbalanced Data Set Classification; WEN Xue-yan (School of Information and Computer Engineering, Northeast Forestry University, Heilongjiang, Harbin 150040, China); ZHAO Li-ying (School of Information and Computer Engineering, Northeast Forestry University, Heilongjiang, Harbin 150040, China); XU Ke-sheng (State Forestry Administration, Harbin Forestry Machinery Research Institute, Heilongjiang, Harbin 150086, China); LU Guang (School of Information and Computer Engineering, Northeast Forestry University, Heilongjiang, Harbin 150040, China); Online shopping review data sets are often extremely imbalanced. To improve classification accuracy on such data, improvements are needed at both the sample level and the algorithm level. The MDSMOTE algorithm cannot take misclassified samples into account when generating part of the new samples, so correct classification of the misclassified samples is added to the existing MDSMOTE algorithm to improve sample quality. The traditional FSVM cannot resolve the hyperplane bias for the minority class when classifying imbalanced data sets, so positive and negative penalty coefficients and a fuzzy factor are added to the FSVM to improve the recognition rate on imbalanced data. The improved algorithm is applied to the classification of a JingDong online shopping review data set. Its F-measure increases by 9.13% on average, which indicates the feasibility and effectiveness of the method.; https://hlgxb.hrbust.edu.cn/#/digest?ArticleID=1561; imbalanced data sets; support vector machines; smote algorithm; text categorization |
| spellingShingle | WEN Xue-yan ZHAO Li-ying XU Ke-sheng LU Guang Application of Improved MDSMOTE and FC-SVM in Imbalanced Data Set Classification Journal of Harbin University of Science and Technology imbalanced data sets support vector machines smote algorithm text categorization |
| title | Application of Improved MDSMOTE and FC-SVM in Imbalanced Data Set Classification |
| title_full | Application of Improved MDSMOTE and FC-SVM in Imbalanced Data Set Classification |
| title_fullStr | Application of Improved MDSMOTE and FC-SVM in Imbalanced Data Set Classification |
| title_full_unstemmed | Application of Improved MDSMOTE and FC-SVM in Imbalanced Data Set Classification |
| title_short | Application of Improved MDSMOTE and FC-SVM in Imbalanced Data Set Classification |
| title_sort | application of improved mdsmote and fc svm in imbalanced data set classification |
| topic | imbalanced data sets support vector machines smote algorithm text categorization |
| url | https://hlgxb.hrbust.edu.cn/#/digest?ArticleID=1561 |
| work_keys_str_mv | AT wenxueyan applicationofimprovedmdsmoteandfcsvminimbalanceddatasetclassification AT zhaoliying applicationofimprovedmdsmoteandfcsvminimbalanceddatasetclassification AT xukesheng applicationofimprovedmdsmoteandfcsvminimbalanceddatasetclassification AT luguang applicationofimprovedmdsmoteandfcsvminimbalanceddatasetclassification |
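
The description field in this record names three ingredients: SMOTE-style oversampling of the minority class, class-specific penalty coefficients combined with a fuzzy factor, and evaluation by F-measure. The record does not give the details of the improved MDSMOTE or FC-SVM algorithms, so the following is only a minimal sketch of the generic ideas, not the authors' method. All function names, parameters, and the toy data are assumptions made for illustration: plain SMOTE interpolation stands in for MDSMOTE, and the per-sample penalty is simply the class coefficient multiplied by a membership value.

```python
import math
import random


def smote_like_oversample(minority, n_new, k=2, seed=0):
    """Plain SMOTE idea (an assumption, not the paper's MDSMOTE):
    create each synthetic point on the segment between a minority
    sample and one of its k nearest minority neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbours of base, excluding base itself
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: math.dist(base, p),
        )[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> other
        synthetic.append(tuple(b + gap * (o - b) for b, o in zip(base, other)))
    return synthetic


def sample_penalties(labels, memberships, c_pos, c_neg, positive=1):
    """Hypothetical per-sample penalty: the class-specific coefficient
    (c_pos or c_neg) scaled by the sample's fuzzy membership."""
    return [
        (c_pos if y == positive else c_neg) * s
        for y, s in zip(labels, memberships)
    ]


def f_measure(y_true, y_pred, positive=1):
    """F-measure (harmonic mean of precision and recall) for one class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    minority = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]  # toy minority class
    print(smote_like_oversample(minority, n_new=3))
    print(sample_penalties([1, 0], [0.5, 1.0], c_pos=10.0, c_neg=1.0))
    print(f_measure([1, 1, 0, 0], [1, 0, 1, 0]))
```

Giving the minority class a larger penalty coefficient than the majority class is the usual way to counter a hyperplane that drifts toward the minority side; the values used here are arbitrary.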