Swin-HSSAM: A green coffee bean grading method by Swin transformer.

Bibliographic Details
Main Authors: Yujie Jiao, Yuqing Zhao, Aoying Jia, Tianyun Wang, Jiashun Li, Kaiming Xiang, Hangyu Deng, Maochang He, Rui Jiang, Yue Zhang
Format: Article
Language: English
Published: Public Library of Science (PLoS) 2025-01-01
Series: PLoS ONE
Online Access: https://doi.org/10.1371/journal.pone.0322198
_version_ 1850128287983992832
author Yujie Jiao
Yuqing Zhao
Aoying Jia
Tianyun Wang
Jiashun Li
Kaiming Xiang
Hangyu Deng
Maochang He
Rui Jiang
Yue Zhang
collection DOAJ
description Swin-HSSAM, a novel shifted window (Swin) Transformer model for grading green coffee beans, is proposed to address the difficulty and low accuracy of classifying green coffee beans. The model uses the Swin Transformer as the backbone network; fuses features from the second, third, and fourth stages using the high-level screening-feature pyramid networks module; and applies the selective attention module (SAM) to make the feature outputs more discriminative before classification. Fusion Loss is employed as the loss function. Experimental results on a proprietary coffee bean dataset show that Swin-HSSAM achieved an average grading accuracy of 96.34% across the three grades and the nine defect subdivision levels, outperforming the AlexNet, VGG16, ResNet50, MobileNet-v2, Vision Transformer (ViT), and CrossViT models by 3.86, 2.56, 0.44, 4.05, 5.36, and 5.40 percentage points, respectively. On a public coffee bean dataset, Swin-HSSAM improved the average grading accuracy over the same six models by 1.01, 0.13, 4.75, 0.85, 0.73, and 0.27 percentage points, respectively. These results indicate that Swin-HSSAM not only achieves high grading accuracy but also exhibits broad applicability, providing a novel solution for the automated grading and identification of green coffee beans.
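The margins in the description are stated in percentage points, so the baseline accuracies on the proprietary dataset can be recovered by simple subtraction. The sketch below does that arithmetic. It assumes each margin is the difference between the Swin-HSSAM average accuracy and the baseline's average accuracy on the same dataset; the implied values are derived here and are not quoted in the record. The names `MARGINS` and `implied_baseline_accuracy` are illustrative only. The public-dataset margins cannot be treated the same way because the description gives no absolute accuracy for that dataset.

```python
from decimal import Decimal

# Figures quoted in the description (proprietary coffee bean dataset).
SWIN_HSSAM_ACCURACY = Decimal("96.34")  # average grading accuracy, percent

# Reported lead of Swin-HSSAM over each baseline, in percentage points.
MARGINS = {
    "AlexNet": Decimal("3.86"),
    "VGG16": Decimal("2.56"),
    "ResNet50": Decimal("0.44"),
    "MobileNet-v2": Decimal("4.05"),
    "ViT": Decimal("5.36"),
    "CrossViT": Decimal("5.40"),
}


def implied_baseline_accuracy(model_accuracy, margin):
    """Assumption: margin = model accuracy - baseline accuracy (percentage points)."""
    return model_accuracy - margin


# Derived values, not figures reported in the record.
implied = {
    name: implied_baseline_accuracy(SWIN_HSSAM_ACCURACY, margin)
    for name, margin in MARGINS.items()
}

for name, accuracy in implied.items():
    print(f"{name}: {accuracy}%")
```

Decimal arithmetic is used so the two-decimal figures subtract exactly, without binary floating-point rounding noise.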
format Article
id doaj-art-5a26044d851247e6b1a6da2ec34b36b2
institution OA Journals
issn 1932-6203
language English
publishDate 2025-01-01
publisher Public Library of Science (PLoS)
record_format Article
series PLoS ONE
spelling doaj-art-5a26044d851247e6b1a6da2ec34b36b2; 2025-08-20T02:33:23Z; eng; Public Library of Science (PLoS); PLoS ONE; 1932-6203; 2025-01-01; 20; 5; e0322198; 10.1371/journal.pone.0322198; https://doi.org/10.1371/journal.pone.0322198
title Swin-HSSAM: A green coffee bean grading method by Swin transformer.
url https://doi.org/10.1371/journal.pone.0322198