Combining Environmental Variables and Machine Learning Methods to Determine the Most Significant Factors Influencing Honey Production
Bees are crucial for food production and biodiversity. However, extreme weather variation and harsh winters are the leading causes of colony losses and low honey yields. This study aimed to identify the most important features and predict Total Honey Harvest (THH) by combining machine learning (ML)...
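| Main Authors: | Johanna Ramirez-Diaz, Arianna Manunza, Tiago Almeida de Oliveira, Tania Bobbo, Francesco Nutini, Mirco Boschetti, Maria Grazia De Iorio, Giulio Pagnacco, Michele Polli, Alessandra Stella, Giulietta Minozzi |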
|---|---|
| Format: | Article |
| Language: | English |
| Published: | MDPI AG, 2025-03-01 |
| Series: | Insects |
| Subjects: | <i>Apis mellifera</i>; honey production; machine learning; prediction; environmental conditions |
| Online Access: | https://www.mdpi.com/2075-4450/16/3/278 |
| _version_ | 1850090437736398848 |
|---|---|
| author | Johanna Ramirez-Diaz, Arianna Manunza, Tiago Almeida de Oliveira, Tania Bobbo, Francesco Nutini, Mirco Boschetti, Maria Grazia De Iorio, Giulio Pagnacco, Michele Polli, Alessandra Stella, Giulietta Minozzi |
| author_sort | Johanna Ramirez-Diaz |
| collection | DOAJ |
| description | Bees are crucial for food production and biodiversity. However, extreme weather variation and harsh winters are the leading causes of colony losses and low honey yields. This study aimed to identify the most important features and predict Total Honey Harvest (THH) by combining machine learning (ML) methods with climatic conditions and environmental factors recorded from the winter before and during the harvest season. The initial dataset included 598 THH records collected from five apiaries in Lombardy (Italy) during spring and summer from 2015 to 2019. Colonies were classified into medium-low or high production using the 75th percentile as a threshold. A total of 38 features related to temperature, humidity, precipitation, pressure, wind, and the enhanced vegetation index (EVI) were used. Three ML models were trained: Decision Tree, Random Forest, and Extreme Gradient Boosting (XGBoost). Model performance was evaluated using accuracy, sensitivity, specificity, precision, and area under the ROC curve (AUC). All models reached a prediction accuracy greater than 0.75 in both the training and testing sets. Results indicate that winter climatic conditions are important predictors of THH. Understanding the impact of climate can help beekeepers develop strategies to prevent colony decline and low production. |
| format | Article |
| id | doaj-art-e22ebd0a54434a19bbc6669be327bac5 |
| institution | DOAJ |
| issn | 2075-4450 |
| language | English |
| publishDate | 2025-03-01 |
| publisher | MDPI AG |
| record_format | Article |
| series | Insects |
| spelling | doaj-art-e22ebd0a54434a19bbc6669be327bac5; 2025-08-20T02:42:34Z; eng; MDPI AG; Insects; ISSN 2075-4450; 2025-03-01; vol. 16, no. 3, art. 278; DOI 10.3390/insects16030278; Combining Environmental Variables and Machine Learning Methods to Determine the Most Significant Factors Influencing Honey Production; Johanna Ramirez-Diaz: Institute of Agricultural Biology and Biotechnology, Italian National Research Council (CNR), 20133 Milan, Italy; Arianna Manunza: Institute of Agricultural Biology and Biotechnology, Italian National Research Council (CNR), 20133 Milan, Italy; Tiago Almeida de Oliveira: Statistics Department, Paraíba State University, Campina Grande 58429-500, Brazil; Tania Bobbo: Institute of Agricultural Biology and Biotechnology, Italian National Research Council (CNR), 20133 Milan, Italy; Francesco Nutini: Institute for Electromagnetic Sensing of the Environment, Italian National Research Council (CNR), 20133 Milan, Italy; Mirco Boschetti: Institute for Electromagnetic Sensing of the Environment, Italian National Research Council (CNR), 20133 Milan, Italy; Maria Grazia De Iorio: Department of Veterinary Medicine and Animal Sciences (DIVAS), University of Milan, 26900 Lodi, Italy; Giulio Pagnacco: Institute of Agricultural Biology and Biotechnology, Italian National Research Council (CNR), 20133 Milan, Italy; Michele Polli: Department of Veterinary Medicine and Animal Sciences (DIVAS), University of Milan, 26900 Lodi, Italy; Alessandra Stella: Institute of Agricultural Biology and Biotechnology, Italian National Research Council (CNR), 20133 Milan, Italy; Giulietta Minozzi: Department of Veterinary Medicine and Animal Sciences (DIVAS), University of Milan, 26900 Lodi, Italy; https://www.mdpi.com/2075-4450/16/3/278; <i>Apis mellifera</i>; honey production; machine learning; prediction; environmental conditions |
| title | Combining Environmental Variables and Machine Learning Methods to Determine the Most Significant Factors Influencing Honey Production |
| topic | <i>Apis mellifera</i>; honey production; machine learning; prediction; environmental conditions |
| url | https://www.mdpi.com/2075-4450/16/3/278 |
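
The description in this record mentions two steps that are easy to illustrate: labeling colonies as high or medium-low producers using the 75th percentile of THH, and scoring a classifier with accuracy, sensitivity, specificity, precision, and AUC. The following is a minimal sketch of those two steps only, not the authors' code. The function names, the sample harvest values, and the choice to treat values strictly above the threshold as "high" are all assumptions made for illustration; no model is trained here.

```python
from statistics import quantiles


def label_high_production(harvests):
    """Return (labels, threshold): 1 = high, 0 = medium-low production.

    Assumption: a record is 'high' when it is strictly above the
    75th percentile; the record does not say how ties are handled.
    """
    # quantiles(..., n=4) returns the three quartile cut points; index 2 is Q3.
    threshold = quantiles(harvests, n=4, method="inclusive")[2]
    labels = [1 if h > threshold else 0 for h in harvests]
    return labels, threshold


def classification_metrics(y_true, y_pred):
    """Compute the confusion-matrix metrics named in the description."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    def ratio(num, den):
        # Undefined ratios (empty denominator) are reported as NaN.
        return num / den if den else float("nan")

    return {
        "accuracy": ratio(tp + tn, tp + tn + fp + fn),
        "sensitivity": ratio(tp, tp + fn),  # true positive rate
        "specificity": ratio(tn, tn + fp),  # true negative rate
        "precision": ratio(tp, tp + fp),
    }


def roc_auc(y_true, scores):
    """AUC as the chance a random positive outscores a random negative.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        return float("nan")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Invented harvest values, used only to exercise the functions.
    sample_harvests = [10, 12, 14, 16, 18, 20, 22, 24]
    labels, q3 = label_high_production(sample_harvests)
    print(q3, labels)
    print(classification_metrics([1, 1, 0, 0, 0, 0], [1, 0, 0, 0, 0, 1]))
    print(roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))
```

Because the high class is defined by the top quartile, roughly one record in four is positive, which is one reason to read sensitivity, specificity, and precision alongside accuracy rather than accuracy alone.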