SwinConvNeXt: a fused deep learning architecture for Real-time garbage image classification

Bibliographic Details
Main Authors: B. Madhavi, Mohan Mahanty, Chia-Chen Lin, B. Omkar Lakshmi Jagan, Hari Mohan Rai, Saurabh Agarwal, Neha Agarwal
Format: Article
Language: English
Published: Nature Portfolio 2025-03-01
Series: Scientific Reports
Subjects:
Online Access: https://doi.org/10.1038/s41598-025-91302-7
author B. Madhavi
Mohan Mahanty
Chia-Chen Lin
B. Omkar Lakshmi Jagan
Hari Mohan Rai
Saurabh Agarwal
Neha Agarwal
collection DOAJ
description Abstract Waste management covers all kinds of waste, including household, industrial, municipal, organic, biomedical, biological, and radioactive waste. People still struggle to choose the proper disposal method for different types of waste, including landfill-bound items, recyclable materials, and biodegradable waste, and inadequate waste management remains a significant and multifaceted global challenge. Conventional manual segregation is time-consuming and ineffective, and it wastes labor and money. Addressing this issue in real time requires sophisticated and sustainable waste management systems. Recent advances in computer vision and deep learning offer efficient solutions for recycling and waste management. Existing deep learning models, however, show limitations in detection accuracy and computational efficiency, particularly when objects vary in size and are visually similar to one another, which makes it difficult to capture and represent their nuanced features. To address this problem, we propose stacking an enhanced Swin Transformer, an improved ConvNeXt, and a spatial attention mechanism. The enhanced Swin Transformer combines two key components, hierarchical feature extraction and a shifted window mechanism, to extract global features from garbage images effectively. The shifted window mechanism extracts the most important features from various regions of the image to identify objects, while hierarchical feature extraction captures long-range dependencies within the image to distinguish different types of garbage. The improved ConvNeXt block, with optimized parameterization, extracts the local features of the image. This enhanced feature extraction enables the model to discern fine-grained details of individual garbage items, such as shape, texture, and subtle variations in color and appearance, leading to more accurate classification. Evaluated on the publicly available Garbage Classification dataset, the proposed model attained 98.97% accuracy, 98.42% precision, and 98.61% recall. The proposed model is lightweight, requires little computation time and power, and surpasses existing state-of-the-art deep learning models.
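The window-shifting idea mentioned in the abstract can be illustrated with a small sketch. It assumes the standard Swin Transformer formulation (a cyclic shift of the feature map by half the window size before it is split into windows); the record does not say how the authors' enhanced variant differs, so this is not their implementation. The names `cyclic_shift`, `window_partition`, and `feature_map` are hypothetical, and the attention computation and the masking that Swin applies to wrapped-around windows are omitted.

```python
from typing import List

Grid = List[List[int]]


def cyclic_shift(grid: Grid, shift: int) -> Grid:
    """Roll the grid up and to the left by `shift` cells, wrapping around."""
    h, w = len(grid), len(grid[0])
    return [[grid[(r + shift) % h][(c + shift) % w] for c in range(w)]
            for r in range(h)]


def window_partition(grid: Grid, window: int) -> List[List[int]]:
    """Split the grid into non-overlapping window x window blocks (row-major)."""
    h, w = len(grid), len(grid[0])
    assert h % window == 0 and w % window == 0, "grid must tile evenly"
    windows = []
    for top in range(0, h, window):
        for left in range(0, w, window):
            windows.append([grid[r][c]
                            for r in range(top, top + window)
                            for c in range(left, left + window)])
    return windows


# Toy 4x4 "feature map" whose cells are numbered 0..15 (illustrative only).
feature_map = [[r * 4 + c for c in range(4)] for r in range(4)]

# Regular partition: attention would stay inside each 2x2 window.
regular = window_partition(feature_map, window=2)

# Shifted partition: shift by half the window size (assumed), then split again.
shifted = window_partition(cyclic_shift(feature_map, shift=1), window=2)

print(regular[0])  # [0, 1, 4, 5]
print(shifted[0])  # [5, 6, 9, 10]
```

In this toy case the first shifted window holds cells 5, 6, 9, and 10, each of which sat in a different regular window. Alternating the two partitions across layers is what lets window-local attention exchange information across window borders.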
format Article
id doaj-art-eb13f2f0cc71418e98462351f12af2e1
institution OA Journals
issn 2045-2322
language English
publishDate 2025-03-01
publisher Nature Portfolio
record_format Article
series Scientific Reports
author_affiliation B. Madhavi: Department of Computer Science and Engineering, Vignan’s Institute of Information Technology
Mohan Mahanty: Department of Computer Science and Engineering, Vignan’s Institute of Information Technology
Chia-Chen Lin: Department of Computer Science and Information Engineering, National Chin-Yi University of Technology
B. Omkar Lakshmi Jagan: Department of Computer Science and Engineering, Vignan’s Institute of Information Technology
Hari Mohan Rai: School of Computing, Gachon University
Saurabh Agarwal: School of Computer Science and Engineering, Yeungnam University
Neha Agarwal: School of Chemical Engineering, Yeungnam University
title SwinConvNeXt: a fused deep learning architecture for Real-time garbage image classification
topic Waste segregation
Deep learning
Swin transformer
ConvNeXt
Computer vision
Spatial attention mechanism
url https://doi.org/10.1038/s41598-025-91302-7