A New Image Oversampling Method Based on Influence Functions and Weights

Although imbalanced data have been studied for many years, data imbalance remains a major problem in machine learning and artificial intelligence. The development of deep learning and artificial intelligence has further widened the impact of imbalanced data, so st...

Bibliographic Details
Main Authors: Jun Ye, Shoulei Lu, Jiawei Chen
Format: Article
Language: English
Published: MDPI AG 2024-11-01
Series: Applied Sciences
Subjects: data imbalance; image oversampling; SMOTE interpolation; DCGAN
Online Access: https://www.mdpi.com/2076-3417/14/22/10553
collection DOAJ
description Although imbalanced data have been studied for many years, data imbalance remains a major problem in machine learning and artificial intelligence. The development of deep learning and artificial intelligence has further widened the impact of imbalanced data, so studying imbalanced data classification is of practical significance. We propose an image oversampling algorithm based on the influence function and sample weights. Our scheme not only synthesizes high-quality minority-class samples but also preserves the original features and information of minority-class images. SMOTE on its own lacks visually reasonable features when synthesizing images. To address this, we modify the pre-trained model by removing its pooling and fully connected layers, extract the important features of each image by convolution, apply SMOTE interpolation to the extracted features to derive synthesized image features, and feed these features into a DCGAN generator, which maps them into the high-dimensional image space to generate a realistic image. To verify that our scheme can synthesize high-quality images and thus improve classification accuracy, we conduct experiments on the processed CIFAR10, CIFAR100, and ImageNet-LT datasets.
format Article
id doaj-art-0fce0a1fd8b646f59a8ed6eb9654c708
institution OA Journals
issn 2076-3417
language English
publishDate 2024-11-01
publisher MDPI AG
record_format Article
series Applied Sciences
spelling doaj-art-0fce0a1fd8b646f59a8ed6eb9654c708; 2025-08-20T02:08:07Z; eng; MDPI AG; Applied Sciences; ISSN 2076-3417; 2024-11-01; vol. 14, no. 22, art. 10553; DOI 10.3390/app142210553
A New Image Oversampling Method Based on Influence Functions and Weights
Jun Ye (0), Shoulei Lu (1), Jiawei Chen (2)
(0) Key Laboratory of Internet Information Retrieval of Hainan Province, School of Cyberspace Security, Hainan University, Haikou 570228, China
(1) Key Laboratory of Internet Information Retrieval of Hainan Province, School of Cyberspace Security, Hainan University, Haikou 570228, China
(2) School of Cyberspace Security, Hainan University, Haikou 570228, China
https://www.mdpi.com/2076-3417/14/22/10553
data imbalance; image oversampling; SMOTE interpolation; DCGAN
title A New Image Oversampling Method Based on Influence Functions and Weights
topic data imbalance
image oversampling
SMOTE interpolation
DCGAN
url https://www.mdpi.com/2076-3417/14/22/10553
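
The description above mentions applying SMOTE interpolation to extracted image features before passing them to a DCGAN generator. As a rough illustration of that interpolation step only, here is a minimal NumPy sketch. It is not the authors' implementation: the function name `smote_interpolate`, its parameters, and the use of flattened feature vectors are assumptions made for this example, and the influence-function weighting, the convolutional feature extractor, and the generator are not shown.

```python
import numpy as np

def smote_interpolate(features, n_new, k=5, seed=0):
    """Synthesize n_new feature vectors by SMOTE-style interpolation.

    features: (n, d) array of minority-class feature vectors. In this sketch
    they stand in for flattened convolutional features (an assumption).
    """
    rng = np.random.default_rng(seed)
    features = np.asarray(features, dtype=float)
    n = features.shape[0]
    if n < 2:
        raise ValueError("need at least two minority samples")
    k = min(k, n - 1)

    # Pairwise squared Euclidean distances; the diagonal is set to infinity
    # so that a sample is never chosen as its own neighbor.
    diffs = features[:, None, :] - features[None, :, :]
    dist = np.einsum("ijk,ijk->ij", diffs, diffs)
    np.fill_diagonal(dist, np.inf)
    neighbors = np.argsort(dist, axis=1)[:, :k]  # shape (n, k)

    # For each synthetic sample: pick a base sample, one of its k nearest
    # neighbors, and a random position on the segment between them.
    base = rng.integers(0, n, size=n_new)
    pick = rng.integers(0, k, size=n_new)
    nbr = neighbors[base, pick]
    lam = rng.random((n_new, 1))
    return features[base] + lam * (features[nbr] - features[base])
```

Each synthetic vector lies on the line segment between a real minority-class feature vector and one of its nearest neighbors, so it stays inside the region spanned by the original features. In the pipeline the description outlines, such vectors would then be decoded into images by the generator.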