A Study on Deep Learning Performances of Identifying Images’ Emotion: Comparing Performances of Three Algorithms to Analyze Fashion Items

Bibliographic Details
Main Authors:	Gaeun Lee, Seoyun Yi, Jongtae Lee
Format:	Article
Language:	English
Published:	MDPI AG 2025-03-01
Series:	Applied Sciences
Subjects:	vision transformer; CNN; ResNet; emotion forecast; artificial intelligence
Online Access:	https://www.mdpi.com/2076-3417/15/6/3318

author	Gaeun Lee, Seoyun Yi, Jongtae Lee
collection	DOAJ
description	Emotion recognition using AI has attracted significant attention in recent years, particularly in fashion, where understanding consumer sentiment can drive more personalized and effective marketing strategies. This study proposes an AI model that automatically analyzes the emotions conveyed by fashion images and compares the performance of CNN, ViT, and ResNet to determine the most suitable model. In the experiments, the vision transformer (ViT) model outperformed both the ResNet50 and CNN models. This is attributed to the greater scalability of transformer-based models such as ViT compared with CNN-based models: ViT uses the transformer structure directly, which requires fewer computational resources during transfer learning than CNNs do. The results thus indicate that ViT achieves higher performance with fewer computational resources than CNN during transfer learning. In terms of academic and practical implications, the strong performance of ViT points to the scalability and efficiency of transformer structures and to the need for further research applying transformer-based models to diverse datasets and environments.
format	Article
id	doaj-art-537473ef572d45708dee07628f5a96f6
institution	DOAJ
issn	2076-3417
language	English
publishDate	2025-03-01
publisher	MDPI AG
record_format	Article
series	Applied Sciences
doi	10.3390/app15063318
volume	15
issue	6
article_number	3318
author_affiliations	Gaeun Lee: Department of Business Administration, Seoul Women’s University, Seoul 03079, Republic of Korea; Seoyun Yi: Department of Data Science, Seoul Women’s University, Seoul 03079, Republic of Korea; Jongtae Lee: Department of Business Administration, Seoul Women’s University, Seoul 03079, Republic of Korea
title	A Study on Deep Learning Performances of Identifying Images’ Emotion: Comparing Performances of Three Algorithms to Analyze Fashion Items
topic	vision transformer; CNN; ResNet; emotion forecast; artificial intelligence
url	https://www.mdpi.com/2076-3417/15/6/3318

Similar Items