A Study on Deep Learning Performances of Identifying Images’ Emotion: Comparing Performances of Three Algorithms to Analyze Fashion Items

Bibliographic Details
Main Authors: Gaeun Lee, Seoyun Yi, Jongtae Lee
Format: Article
Language: English
Published: MDPI AG 2025-03-01
Series: Applied Sciences
Online Access: https://www.mdpi.com/2076-3417/15/6/3318
Description: Emotion recognition using AI has attracted significant attention in recent years, particularly in fashion, where understanding consumer sentiment can drive more personalized and effective marketing strategies. This study proposes an AI model that automatically analyzes the emotions conveyed by fashion images and compares the performance of a CNN, a vision transformer (ViT), and ResNet50 to determine the most suitable model. In the experiments, ViT outperformed both the ResNet50 and CNN models. The study attributes this result to the greater scalability of transformer-based models such as ViT relative to CNN-based models: ViT uses the transformer structure directly and, according to the study, requires fewer computational resources during transfer learning than CNNs. For academic and practical purposes, the strong performance of ViT is taken as evidence of the scalability and efficiency of transformer structures and indicates a need for further research applying transformer-based models to diverse datasets and environments.
Affiliations: Gaeun Lee (Department of Business Administration, Seoul Women’s University, Seoul 03079, Republic of Korea); Seoyun Yi (Department of Data Science, Seoul Women’s University, Seoul 03079, Republic of Korea); Jongtae Lee (Department of Business Administration, Seoul Women’s University, Seoul 03079, Republic of Korea)
Citation: Applied Sciences, vol. 15, no. 6, article 3318
ISSN: 2076-3417
DOI: 10.3390/app15063318
Subjects: vision transformer; CNN; ResNet; emotion forecast; artificial intelligence