Iterative Nearest Neighborhood Oversampling in Semisupervised Learning from Imbalanced Data
Transductive graph-based semisupervised learning methods usually build an undirected graph utilizing both labeled and unlabeled samples as vertices. Those methods propagate label information of labeled samples to neighbors through their edges in order to get the predicted labels of unlabeled samples...
| Main Authors: | Fengqi Li, Chuang Yu, Nanhai Yang, Feng Xia, Guangming Li, Fatemeh Kaveh-Yazdy |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | Wiley, 2013-01-01 |
| Series: | The Scientific World Journal |
| Online Access: | http://dx.doi.org/10.1155/2013/875450 |
| _version_ | 1849684946941116416 |
|---|---|
| author | Fengqi Li, Chuang Yu, Nanhai Yang, Feng Xia, Guangming Li, Fatemeh Kaveh-Yazdy |
| collection | DOAJ |
| description | Transductive graph-based semisupervised learning methods usually build an undirected graph that uses both labeled and unlabeled samples as vertices. These methods propagate the label information of labeled samples to their neighbors along the graph's edges to predict the labels of unlabeled samples. Most popular semisupervised learning approaches are sensitive to the initial label distribution, which is a problem when the labeled dataset is imbalanced: the class boundary is severely skewed by the majority classes. In this paper, we propose a simple and effective approach that alleviates the influence of the imbalance problem by iteratively selecting a few unlabeled samples and adding them to the minority classes, forming a balanced labeled dataset for the subsequent learning methods. Experiments on UCI datasets and the MNIST handwritten digits dataset show that the proposed approach outperforms existing state-of-the-art methods. |
| format | Article |
| id | doaj-art-bcf204ac33d146a88c6f06da72a98345 |
| institution | DOAJ |
| issn | 1537-744X |
| language | English |
| publishDate | 2013-01-01 |
| publisher | Wiley |
| record_format | Article |
| series | The Scientific World Journal |
| spelling | doaj-art-bcf204ac33d146a88c6f06da72a98345; 2025-08-20T03:23:19Z; eng; Wiley; The Scientific World Journal; 1537-744X; 2013-01-01; 2013; 10.1155/2013/875450; 875450; Iterative Nearest Neighborhood Oversampling in Semisupervised Learning from Imbalanced Data; Fengqi Li, Chuang Yu, Nanhai Yang, Feng Xia, Guangming Li, Fatemeh Kaveh-Yazdy (all: School of Software, Dalian University of Technology, Dalian 116620, China); http://dx.doi.org/10.1155/2013/875450 |
| title | Iterative Nearest Neighborhood Oversampling in Semisupervised Learning from Imbalanced Data |
| url | http://dx.doi.org/10.1155/2013/875450 |