A computational framework for processing time-series of earth observation data based on discrete convolution: global-scale historical Landsat cloud-free aggregates at 30 m spatial resolution
Processing large collections of earth observation (EO) time-series, often petabyte-sized, such as NASA’s Landsat and ESA’s Sentinel missions, can be computationally prohibitive and costly. Despite their name, even the Analysis Ready Data (ARD) versions of such collections can rarely be used as direc...
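| Main Authors: | Davide Consoli, Leandro Parente, Rolf Simoes, Murat Şahin, Xuemeng Tian, Martijn Witjes, Lindsey Sloat, Tomislav Hengl |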
|---|---|
| Format: | Article |
| Language: | English |
| Published: | PeerJ Inc., 2024-12-01 |
| Series: | PeerJ |
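| Subjects: | Earth observation data; Time-series processing; Landsat; Discrete convolution; Gap-filling; Time-series reconstruction |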
| Online Access: | https://peerj.com/articles/18585.pdf |
| _version_ | 1850175762146000896 |
|---|---|
| author | Davide Consoli, Leandro Parente, Rolf Simoes, Murat Şahin, Xuemeng Tian, Martijn Witjes, Lindsey Sloat, Tomislav Hengl |
| author_sort | Davide Consoli |
| collection | DOAJ |
| description | Processing large collections of earth observation (EO) time-series, often petabyte-sized, such as NASA’s Landsat and ESA’s Sentinel missions, can be computationally prohibitive and costly. Despite their name, even the Analysis Ready Data (ARD) versions of such collections can rarely be used as direct input for modeling because of cloud presence and/or prohibitive storage size. Existing solutions for readily using these data are not openly available, are poor in performance, or lack flexibility. Addressing this issue, we developed TSIRF (Time-Series Iteration-free Reconstruction Framework), a computational framework that can be used to apply diverse time-series processing tasks, such as temporal aggregation and time-series reconstruction, by simply adjusting the convolution kernel. As the first large-scale application, TSIRF was employed to process the entire Global Land Analysis and Discovery (GLAD) ARD Landsat archive, producing a cloud-free bi-monthly aggregated product. This process, covering seven Landsat bands globally from 1997 to 2022, with more than two trillion pixels, each with a time-series of 156 samples in the aggregated product, required approximately 28 hours of computation using 1248 Intel® Xeon® Gold 6248R CPUs. The quality of the result was assessed using a benchmark dataset derived from the aggregated product and comparing different imputation strategies. The resulting reconstructed images can be used as input for machine learning models or to map biophysical indices. To further limit storage size, the produced data were saved as 8-bit Cloud-Optimized GeoTIFFs (COG). Distributed as open data at about 20 TB per band/index for the entire 30 m resolution bi-monthly historical time-series, the product enables seamless, fast, and affordable access to the Landsat archive for environmental monitoring and analysis applications. |
| format | Article |
| id | doaj-art-cfb19833e78b47e584feb436b8b58dfe |
| institution | OA Journals |
| issn | 2167-8359 |
| language | English |
| publishDate | 2024-12-01 |
| publisher | PeerJ Inc. |
| record_format | Article |
| series | PeerJ |
| spelling | doaj-art-cfb19833e78b47e584feb436b8b58dfe; 2025-08-20T02:19:23Z; eng; PeerJ Inc.; PeerJ; ISSN 2167-8359; 2024-12-01; vol. 12, e18585; DOI 10.7717/peerj.18585. Authors and affiliations: Davide Consoli, Leandro Parente, Rolf Simoes, Murat Şahin, Xuemeng Tian, Martijn Witjes, and Tomislav Hengl (OpenGeoHub Foundation, Doorwerth, Netherlands); Lindsey Sloat (Land & Carbon Lab, World Resources Institute (WRI), Washington DC, United States). Full text: https://peerj.com/articles/18585.pdf |
| title | A computational framework for processing time-series of earth observation data based on discrete convolution: global-scale historical Landsat cloud-free aggregates at 30 m spatial resolution |
| topic | Earth observation data; Time-series processing; Landsat; Discrete convolution; Gap-filling; Time-series reconstruction |
| url | https://peerj.com/articles/18585.pdf |
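
The description above says that gap-filling can be expressed as a discrete convolution whose behavior is set by the kernel. The record does not give TSIRF's actual kernels or code, so the following is only a minimal sketch of one common way to do this (normalized convolution over a validity mask), under stated assumptions: the function name `reconstruct`, the triangular kernel, and the toy values are all hypothetical and are not taken from the article. It also checks the sample count quoted in the description (bi-monthly composites from 1997 to 2022).

```python
import numpy as np

# Bi-monthly composites, 1997 to 2022 inclusive: 26 years x 6 per year.
N_SAMPLES = (2022 - 1997 + 1) * 6  # 156, matching the description


def reconstruct(series, valid, kernel):
    """Fill gaps with a kernel-weighted mean of valid neighbours.

    Hypothetical helper, not the TSIRF API. Assumes an odd-length,
    symmetric kernel that is shorter than the series.
    """
    series = np.asarray(series, dtype=float)
    valid = np.asarray(valid, dtype=bool)
    kernel = np.asarray(kernel, dtype=float)

    # Zero out gaps so they contribute nothing to the weighted sum.
    values = np.where(valid, series, 0.0)

    # Numerator: weighted sum of valid values; denominator: sum of the
    # weights that actually landed on valid samples.
    num = np.convolve(values, kernel, mode="same")
    den = np.convolve(valid.astype(float), kernel, mode="same")

    # Where no valid sample falls under the kernel, the gap stays NaN.
    filled = np.divide(num, den, out=np.full_like(num, np.nan), where=den > 0)

    # Keep original observations; use the convolution result only in gaps.
    return np.where(valid, series, filled)


if __name__ == "__main__":
    nan = float("nan")
    toy = [10, nan, 30, nan, nan, 60]      # toy reflectance-like values
    mask = [True, False, True, False, False, True]
    print(reconstruct(toy, mask, [1, 2, 1]))
```

In this sketch, changing only the kernel changes the task: a wider kernel reaches further in time to fill longer gaps, while a one-sided kernel would use past samples only. Both passes are plain convolutions, so no per-pixel iteration is needed.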