Combining Environmental Variables and Machine Learning Methods to Determine the Most Significant Factors Influencing Honey Production

Bees are crucial for food production and biodiversity. However, extreme weather variation and harsh winters are the leading causes of colony losses and low honey yields. This study aimed to identify the most important features and predict Total Honey Harvest (THH) by combining machine learning (ML)...

Bibliographic Details
Main Authors: Johanna Ramirez-Diaz, Arianna Manunza, Tiago Almeida de Oliveira, Tania Bobbo, Francesco Nutini, Mirco Boschetti, Maria Grazia De Iorio, Giulio Pagnacco, Michele Polli, Alessandra Stella, Giulietta Minozzi
Format: Article
Language: English
Published: MDPI AG 2025-03-01
Series: Insects
Subjects: Apis mellifera; honey production; machine learning; prediction; environmental conditions
Online Access: https://www.mdpi.com/2075-4450/16/3/278
author Johanna Ramirez-Diaz
Arianna Manunza
Tiago Almeida de Oliveira
Tania Bobbo
Francesco Nutini
Mirco Boschetti
Maria Grazia De Iorio
Giulio Pagnacco
Michele Polli
Alessandra Stella
Giulietta Minozzi
collection DOAJ
description Bees are crucial for food production and biodiversity. However, extreme weather variation and harsh winters are the leading causes of colony losses and low honey yields. This study aimed to identify the most important features and to predict Total Honey Harvest (THH) by combining machine learning (ML) methods with climatic and environmental factors recorded during the preceding winter and the harvest season. The initial dataset included 598 THH records collected from five apiaries in Lombardy (Italy) during spring and summer from 2015 to 2019. Colonies were classified as medium-low or high production using the 75th percentile of THH as the threshold. A total of 38 features related to temperature, humidity, precipitation, pressure, wind, and the enhanced vegetation index (EVI) were used. Three ML models were trained: Decision Tree, Random Forest, and Extreme Gradient Boosting (XGBoost). Model performance was evaluated using accuracy, sensitivity, specificity, precision, and area under the ROC curve (AUC). All models reached a prediction accuracy greater than 0.75 in both the training and testing sets. Results indicate that winter climatic conditions are important predictors of THH. Understanding the impact of climate can help beekeepers develop strategies to prevent colony decline and low production.
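The labelling and evaluation steps named in the description can be illustrated with a minimal sketch in plain Python. It is not the authors' code: the record does not state which software, percentile interpolation, or tie handling the study used, so the function names (`label_high_producers`, `binary_metrics`, `auc`), the strict "greater than" rule at the threshold, and the sample harvest values are all hypothetical.

```python
from statistics import quantiles


def label_high_producers(harvests, pct=75):
    """Return (threshold, labels); label 1 = high, 0 = medium-low production."""
    # quantiles(n=100) yields 99 cut points; index pct - 1 is the pct-th percentile.
    threshold = quantiles(harvests, n=100, method="inclusive")[pct - 1]
    # Assumption: only records strictly above the threshold count as "high".
    labels = [1 if h > threshold else 0 for h in harvests]
    return threshold, labels


def binary_metrics(y_true, y_pred):
    """Accuracy, sensitivity, specificity and precision from 0/1 labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    def ratio(num, den):
        # Avoid division by zero when a class is absent.
        return num / den if den else float("nan")

    return {
        "accuracy": ratio(tp + tn, len(pairs)),
        "sensitivity": ratio(tp, tp + fn),  # true positive rate
        "specificity": ratio(tn, tn + fp),  # true negative rate
        "precision": ratio(tp, tp + fp),
    }


def auc(y_true, scores):
    """Area under the ROC curve via pairwise ranking (ties count as 0.5)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        return float("nan")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical harvest values, for illustration only (not data from the study).
example_harvests = [10, 12, 15, 18, 20, 22, 25, 40]
example_threshold, example_labels = label_high_producers(example_harvests)
print(example_threshold, example_labels)
```

In this sketch the class labels come from `label_high_producers`, and the predictions or scores of any classifier (for example a decision tree, random forest, or XGBoost model) could then be passed to `binary_metrics` and `auc`.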
format Article
id doaj-art-e22ebd0a54434a19bbc6669be327bac5
institution DOAJ
issn 2075-4450
language English
publishDate 2025-03-01
publisher MDPI AG
record_format Article
series Insects
spelling doaj-art-e22ebd0a54434a19bbc6669be327bac5; 2025-08-20T02:42:34Z; eng; MDPI AG; Insects; ISSN 2075-4450; 2025-03-01; volume 16, issue 3, article 278; DOI 10.3390/insects16030278
Author affiliations (listed in author order):
0 Johanna Ramirez-Diaz: Institute of Agricultural Biology and Biotechnology, Italian National Research Council (CNR), 20133 Milan, Italy
1 Arianna Manunza: Institute of Agricultural Biology and Biotechnology, Italian National Research Council (CNR), 20133 Milan, Italy
2 Tiago Almeida de Oliveira: Statistics Department, Paraíba State University, Campina Grande 58429-500, Brazil
3 Tania Bobbo: Institute of Agricultural Biology and Biotechnology, Italian National Research Council (CNR), 20133 Milan, Italy
4 Francesco Nutini: Institute for Electromagnetic Sensing of the Environment, Italian National Research Council (CNR), 20133 Milan, Italy
5 Mirco Boschetti: Institute for Electromagnetic Sensing of the Environment, Italian National Research Council (CNR), 20133 Milan, Italy
6 Maria Grazia De Iorio: Department of Veterinary Medicine and Animal Sciences (DIVAS), University of Milan, 26900 Lodi, Italy
7 Giulio Pagnacco: Institute of Agricultural Biology and Biotechnology, Italian National Research Council (CNR), 20133 Milan, Italy
8 Michele Polli: Department of Veterinary Medicine and Animal Sciences (DIVAS), University of Milan, 26900 Lodi, Italy
9 Alessandra Stella: Institute of Agricultural Biology and Biotechnology, Italian National Research Council (CNR), 20133 Milan, Italy
10 Giulietta Minozzi: Department of Veterinary Medicine and Animal Sciences (DIVAS), University of Milan, 26900 Lodi, Italy
title Combining Environmental Variables and Machine Learning Methods to Determine the Most Significant Factors Influencing Honey Production
topic <i>Apis mellifera</i>
honey production
machine learning
prediction
environmental conditions
url https://www.mdpi.com/2075-4450/16/3/278