A computational framework for processing time-series of earth observation data based on discrete convolution: global-scale historical Landsat cloud-free aggregates at 30 m spatial resolution

Processing large collections of earth observation (EO) time-series, often petabyte-sized, such as NASA’s Landsat and ESA’s Sentinel missions, can be computationally prohibitive and costly. Despite their name, even the Analysis Ready Data (ARD) versions of such collections can rarely be used as direc...

Bibliographic Details
Main Authors: Davide Consoli, Leandro Parente, Rolf Simoes, Murat Şahin, Xuemeng Tian, Martijn Witjes, Lindsey Sloat, Tomislav Hengl
Format: Article
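Language: English
Published: PeerJ Inc. 2024-12-01
Series: PeerJ
Subjects: Earth observation data; Time-series processing; Landsat; Discrete convolution; Gap-filling; Time-series reconstruction
Online Access: https://peerj.com/articles/18585.pdf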
author Davide Consoli
Leandro Parente
Rolf Simoes
Murat Şahin
Xuemeng Tian
Martijn Witjes
Lindsey Sloat
Tomislav Hengl
collection DOAJ
description Processing large collections of earth observation (EO) time-series, often petabyte-sized, such as NASA’s Landsat and ESA’s Sentinel missions, can be computationally prohibitive and costly. Despite their name, even the Analysis Ready Data (ARD) versions of such collections can rarely be used as direct input for modeling because of cloud presence and/or prohibitive storage size. Existing solutions for readily using these data are not openly available, are poor in performance, or lack flexibility. Addressing this issue, we developed TSIRF (Time-Series Iteration-free Reconstruction Framework), a computational framework that can be used to apply diverse time-series processing tasks, such as temporal aggregation and time-series reconstruction by simply adjusting the convolution kernel. As the first large-scale application, TSIRF was employed to process the entire Global Land Analysis and Discovery (GLAD) ARD Landsat archive, producing a cloud-free bi-monthly aggregated product. This process, covering seven Landsat bands globally from 1997 to 2022, with more than two trillion pixels and for each one a time-series of 156 samples in the aggregated product, required approximately 28 hours of computation using 1248 Intel® Xeon® Gold 6248R CPUs. The quality of the result was assessed using a benchmark dataset derived from the aggregated product and comparing different imputation strategies. The resulting reconstructed images can be used as input for machine learning models or to map biophysical indices. To further limit the storage size the produced data was saved as 8-bit Cloud-Optimized GeoTIFFs (COG). With the hosting of about 20 TB per band/index for an entire 30 m resolution bi-monthly historical time-series distributed as open data, the product enables seamless, fast, and affordable access to the Landsat archive for environmental monitoring and analysis applications.
format Article
id doaj-art-cfb19833e78b47e584feb436b8b58dfe
institution OA Journals
issn 2167-8359
language English
publishDate 2024-12-01
publisher PeerJ Inc.
record_format Article
series PeerJ
spelling doaj-art-cfb19833e78b47e584feb436b8b58dfe (record timestamp 2025-08-20T02:19:23Z). PeerJ, vol. 12, article e18585, published 2024-12-01 by PeerJ Inc., ISSN 2167-8359. DOI: 10.7717/peerj.18585.
Affiliations: Davide Consoli, Leandro Parente, Rolf Simoes, Murat Şahin, Xuemeng Tian, Martijn Witjes, and Tomislav Hengl, OpenGeoHub Foundation, Doorwerth, Netherlands; Lindsey Sloat, Land & Carbon Lab, World Resources Institute (WRI), Washington DC, United States.
title A computational framework for processing time-series of earth observation data based on discrete convolution: global-scale historical Landsat cloud-free aggregates at 30 m spatial resolution
topic Earth observation data
Time-series processing
Landsat
Discrete convolution
Gap-filling
Time-series reconstruction
url https://peerj.com/articles/18585.pdf
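
The description above states that temporal aggregation and time-series reconstruction can both be expressed as a discrete convolution, with the task selected by the kernel. The following is a minimal sketch of that general idea using normalized convolution over a validity mask. It is not the TSIRF implementation: the function names `reconstruct` and `aggregate`, the kernel weights, and the sample values are all hypothetical, and a symmetric kernel is assumed.

```python
import numpy as np


def _masked_ratio(num, den):
    """Divide num by den, leaving NaN where no valid sample contributed."""
    out = np.full(num.shape, np.nan)
    np.divide(num, den, out=out, where=den > 0)
    return out


def reconstruct(series, kernel):
    """Fill NaN gaps with a kernel-weighted mean of valid neighbours.

    Hypothetical helper: cloud-masked samples are assumed to be NaN and
    the kernel is assumed to be symmetric with odd length.
    """
    series = np.asarray(series, dtype=float)
    kernel = np.asarray(kernel, dtype=float)
    valid = ~np.isnan(series)
    values = np.where(valid, series, 0.0)
    # Convolve the data and the validity mask with the same kernel.
    num = np.convolve(values, kernel, mode="same")
    den = np.convolve(valid.astype(float), kernel, mode="same")
    filled = _masked_ratio(num, den)
    # Keep the original observations, use the estimate only in gaps.
    return np.where(valid, series, filled)


def aggregate(series, window):
    """Average non-overlapping windows, ignoring NaN samples.

    Same machinery as reconstruct(), but with a box kernel and a stride
    equal to the window length.
    """
    series = np.asarray(series, dtype=float)
    valid = ~np.isnan(series)
    box = np.ones(window)
    num = np.convolve(np.where(valid, series, 0.0), box, mode="valid")[::window]
    den = np.convolve(valid.astype(float), box, mode="valid")[::window]
    return _masked_ratio(num, den)


# Illustrative reflectance-like values with gaps (not real Landsat data).
samples = [0.2, np.nan, 0.4, np.nan, np.nan, 0.8]
gap_filled = reconstruct(samples, kernel=[1.0, 2.0, 1.0])
two_step_means = aggregate(samples, window=2)
```

In this sketch only the kernel and the stride differ between the two tasks. A gap wider than the kernel support remains NaN, so a wider kernel would be needed to fill longer cloudy periods.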