Forecasting insect abundance using time series embedding and machine learning

Bibliographic Details
Main Authors: Gabriel R. Palma, Rodrigo F. Mello, Wesley A.C. Godoy, Eduardo Engel, Douglas Lau, Charles Markham, Rafael A. Moral
Format: Article
Language: English
Published: Elsevier 2025-03-01
Series: Ecological Informatics
Subjects: Insect outbreak; Integrated pest management; Machine learning; Forecasting; Causality
Online Access: http://www.sciencedirect.com/science/article/pii/S157495412400476X
author Gabriel R. Palma
Rodrigo F. Mello
Wesley A.C. Godoy
Eduardo Engel
Douglas Lau
Charles Markham
Rafael A. Moral
collection DOAJ
description Implementing insect monitoring systems provides an excellent opportunity to create accurate interventions for insect control. However, selecting the appropriate time for an intervention is still an open question due to the inherent difficulty of implementing on-site monitoring in real time. A possible solution to enhance decision-making is to apply forecasting methods to predict insect abundance. However, another layer of complexity is added when other covariates are considered in the forecasting, such as climate time series collected along the monitoring system. Multiple combinations of climate time series and their lags can be used to build a forecasting method. Therefore, we propose a new approach to address this problem by combining statistics, machine learning, and time series embedding. We used two datasets containing a time series of aphids and climate data collected weekly in two municipalities in Southern Brazil for eight years. We conducted a simulation study using a probabilistic autoregressive model with exogenous time series, based on Poisson and negative binomial distributions, to evaluate the performance of our approach. We pre-processed the data using our newly proposed approach and more straightforward approaches commonly used to train learning algorithms. We evaluated the performance of the selected algorithms using the Pearson correlation and Root Mean Squared Error obtained from one-step-ahead forecasting. Based on Random Forests, Lasso-regularised linear regression, and LightGBM regression algorithms, we showed the feasibility of our novel approach, which yields competitive forecasts while automatically selecting insect abundances, climate time series and their lags to aid forecasting.
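The evaluation loop described in the abstract (lagged predictors, one-step-ahead forecasts, Pearson correlation and Root Mean Squared Error) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and not the authors' method: the function names (`lag_embed`, `simulate_poisson_arx`, `one_step_ahead`), the form and coefficients of the simulated Poisson model, the number of lags, and the use of ordinary least squares in place of the Random Forest, Lasso, and LightGBM learners named in the record.

```python
import numpy as np


def lag_embed(series, n_lags):
    """Row for time t holds series[t-1], ..., series[t-n_lags] (t starts at n_lags)."""
    series = np.asarray(series, dtype=float)
    cols = [series[n_lags - k: len(series) - k] for k in range(1, n_lags + 1)]
    return np.column_stack(cols)


def simulate_poisson_arx(n, rng, a=0.5, b=0.3, c=1.0):
    """Hypothetical Poisson autoregression with one exogenous (climate-like) series.

    The log-rate depends on the previous count and the previous covariate value.
    """
    x = np.sin(2 * np.pi * np.arange(n) / 52) + rng.normal(0.0, 0.2, n)
    y = np.zeros(n)
    y[0] = rng.poisson(np.exp(c))
    for t in range(1, n):
        rate = np.exp(c + a * np.log1p(y[t - 1]) + b * x[t - 1])
        y[t] = rng.poisson(rate)
    return y, x


def one_step_ahead(y, x, n_lags, n_train):
    """Fit least squares on the first n_train points, then forecast each later
    point from its observed lags (the model is fitted once, not refitted)."""
    features = np.column_stack([
        np.ones(len(y) - n_lags),   # intercept
        lag_embed(y, n_lags),       # lagged abundance
        lag_embed(x, n_lags),       # lagged covariate
    ])
    target = y[n_lags:]
    split = n_train - n_lags
    coef, *_ = np.linalg.lstsq(features[:split], target[:split], rcond=None)
    return features[split:] @ coef, target[split:]


def rmse(pred, obs):
    return float(np.sqrt(np.mean((pred - obs) ** 2)))


def pearson(pred, obs):
    return float(np.corrcoef(pred, obs)[0, 1])


rng = np.random.default_rng(0)
y, x = simulate_poisson_arx(416, rng)  # 416 weeks, roughly eight years
preds, obs = one_step_ahead(y, x, n_lags=3, n_train=312)
print("RMSE:", rmse(preds, obs), "Pearson r:", pearson(preds, obs))
```

Swapping the least-squares fit for a regularised or tree-based learner would only change the fitting step inside `one_step_ahead`; the lagged feature matrix and the two metrics stay the same.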
format Article
id doaj-art-c4090bef380a4eca8fe4f3ee9a6d0ab0
institution Kabale University
issn 1574-9541
language English
publishDate 2025-03-01
publisher Elsevier
record_format Article
series Ecological Informatics
spelling doaj-art-c4090bef380a4eca8fe4f3ee9a6d0ab0 2025-01-19T06:24:34Z eng Elsevier Ecological Informatics 1574-9541 2025-03-01 85 102934
Forecasting insect abundance using time series embedding and machine learning
Gabriel R. Palma (0), Rodrigo F. Mello (1), Wesley A.C. Godoy (2), Eduardo Engel (3), Douglas Lau (4), Charles Markham (5), Rafael A. Moral (6)
0: Hamilton Institute, Maynooth University, Maynooth, Ireland; Department of Mathematics and Statistics, Maynooth University, Maynooth, Ireland; Corresponding author at: Hamilton Institute, Maynooth University, Maynooth, Ireland.
1: Mercado Livre, Osasco, Brazil
2: Department of Entomology and Acarology, University of São Paulo, Piracicaba, Brazil
3: Department of Entomology and Acarology, University of São Paulo, Piracicaba, Brazil
4: Brazilian Agricultural Research Corporation (Embrapa Trigo), Passo Fundo, Rio Grande do Sul, Brazil
5: Hamilton Institute, Maynooth University, Maynooth, Ireland; Department of Computer Science, Maynooth University, Maynooth, Ireland
6: Hamilton Institute, Maynooth University, Maynooth, Ireland; Department of Mathematics and Statistics, Maynooth University, Maynooth, Ireland
title Forecasting insect abundance using time series embedding and machine learning
topic Insect outbreak
Integrated pest management
Machine learning
Forecasting
Causality
url http://www.sciencedirect.com/science/article/pii/S157495412400476X