Visual nutrition analysis: leveraging segmentation and regression for food nutrient estimation

Introduction: Nutrition is closely related to body health. A reasonable diet structure not only meets the body’s needs for various nutrients but also effectively prevents many chronic diseases. However, due to the general lack of systematic nutritional knowledge, people often find it difficult to accu...

Bibliographic Details
Main Authors: Yaping Zhao, Ping Zhu, Yizhang Jiang, Kaijian Xia
Format: Article
Language: English
Published: Frontiers Media S.A. 2024-12-01
Series: Frontiers in Nutrition
Subjects: nutrition estimation, Nutrition5k, deep learning, image segmentation, regression
Online Access: https://www.frontiersin.org/articles/10.3389/fnut.2024.1469878/full
_version_ 1850064029510270976
author Yaping Zhao
Ping Zhu
Yizhang Jiang
Kaijian Xia
author_sort Yaping Zhao
collection DOAJ
description Introduction: Nutrition is closely related to health. A balanced diet not only meets the body’s needs for various nutrients but also helps prevent many chronic diseases. However, because most people lack systematic nutritional knowledge, they often find it difficult to accurately assess the nutritional content of food. Image-based nutritional evaluation can help here, so we aim to predict the nutritional content of dishes directly from images. Most related research estimates the volume or area of food through image segmentation and then calculates its nutritional content from the food category. However, this approach usually lacks ground-truth nutritional content labels as a reference, making it difficult to verify the accuracy of the predictions. Methods: To address this issue, we combined segmentation and regression tasks. We used the Nutrition5k dataset, which contains detailed nutritional content labels but no segmentation labels, and annotated it manually with segmentation labels. Based on these annotated data, we developed a nutritional content prediction model that performs segmentation first and regression afterward. Specifically, we first applied the UNet model to segment the food, then used a backbone network to extract features, and enhanced the feature representation with a Squeeze-and-Excitation structure. Finally, the extracted features were passed through several fully connected layers to predict weight, calories, fat, carbohydrates, and protein content. Results and discussion: Our model achieved an average percentage mean absolute error (PMAE) of 17.06% across these components. All manually annotated segmentation labels are available at https://doi.org/10.6084/m9.figshare.26252048.v1.
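The description reports an average PMAE over five targets but does not define the metric. The sketch below shows one way such a figure could be computed. It assumes the definition commonly used with Nutrition5k, where PMAE is the mean absolute error divided by the mean ground-truth value, expressed as a percentage. The function names `pmae` and `average_pmae` and the toy values are hypothetical and are not taken from the article.

```python
from statistics import mean


def pmae(y_true, y_pred):
    """Percentage MAE for one target.

    Assumed definition (not stated in this record):
    PMAE = 100 * mean(|truth - prediction|) / mean(truth).
    """
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must be non-empty and equally long")
    mae = mean(abs(t - p) for t, p in zip(y_true, y_pred))
    return 100.0 * mae / mean(y_true)


def average_pmae(truth_by_target, pred_by_target):
    """Unweighted average of the per-target PMAE values."""
    return mean(
        pmae(truth_by_target[name], pred_by_target[name])
        for name in truth_by_target
    )


# Hypothetical toy values for two dishes, not data from the article.
truth = {"calories": [200.0, 400.0], "protein": [10.0, 30.0]}
pred = {"calories": [220.0, 380.0], "protein": [12.0, 27.0]}

if __name__ == "__main__":
    for name in truth:
        print(name, round(pmae(truth[name], pred[name]), 2))
    print("average", round(average_pmae(truth, pred), 2))
```

Because each target is normalised by its own mean, targets on different scales (for example calories and grams of protein) contribute comparably to the average under this assumed definition.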
format Article
id doaj-art-f1f3cd74a4a8497ca76bd8f7b14e2b1d
institution DOAJ
issn 2296-861X
language English
publishDate 2024-12-01
publisher Frontiers Media S.A.
record_format Article
series Frontiers in Nutrition
spelling doaj-art-f1f3cd74a4a8497ca76bd8f7b14e2b1d
2025-08-20T02:49:25Z
eng
Frontiers Media S.A.
Frontiers in Nutrition
2296-861X
2024-12-01
Volume 11
10.3389/fnut.2024.1469878
Article 1469878
Visual nutrition analysis: leveraging segmentation and regression for food nutrient estimation
Yaping Zhao (School of Artificial Intelligence and Computer Science, Jiangnan University, Wuxi, Jiangsu, China; Changshu Key Laboratory of Medical Artificial Intelligence and Big Data, Suzhou, Jiangsu, China)
Ping Zhu (Changshu Key Laboratory of Medical Artificial Intelligence and Big Data, Suzhou, Jiangsu, China; Department of Scientific Research, The Changshu Affiliated Hospital of Soochow University, Suzhou, Jiangsu, China)
Yizhang Jiang (School of Artificial Intelligence and Computer Science, Jiangnan University, Wuxi, Jiangsu, China)
Kaijian Xia (Changshu Key Laboratory of Medical Artificial Intelligence and Big Data, Suzhou, Jiangsu, China; Department of Scientific Research, The Changshu Affiliated Hospital of Soochow University, Suzhou, Jiangsu, China)
https://www.frontiersin.org/articles/10.3389/fnut.2024.1469878/full
nutrition estimation
Nutrition5k
deep learning
image segmentation
regression
title Visual nutrition analysis: leveraging segmentation and regression for food nutrient estimation
topic nutrition estimation
Nutrition5k
deep learning
image segmentation
regression
url https://www.frontiersin.org/articles/10.3389/fnut.2024.1469878/full