Distribution-Based Approach for Efficient Storage and Indexing of Massive Infrared Hyperspectral Sounding Data

Hyperspectral infrared atmospheric sounding data, characterized by their high vertical resolution, play a crucial role in capturing three-dimensional atmospheric spatial information. The hyperspectral infrared atmospheric detectors HIRAS/HIRAS-II, mounted on the FY3D/EF satellite, have established a...

Bibliographic Details
Main Authors: Han Li, Mingjian Gu, Guang Shi, Yong Hu, Mengzhen Xie
Format: Article
Language: English
Published: MDPI AG 2024-11-01
Series: Remote Sensing
Subjects: infrared hyperspectral sounding data; HIRAS; distributed storage; kubernetes; HBase
Online Access: https://www.mdpi.com/2072-4292/16/21/4088
author Han Li
Mingjian Gu
Guang Shi
Yong Hu
Mengzhen Xie
collection DOAJ
description Hyperspectral infrared atmospheric sounding data, characterized by their high vertical resolution, play a crucial role in capturing three-dimensional atmospheric spatial information. The hyperspectral infrared atmospheric detectors HIRAS/HIRAS-II, mounted on the FY3D/EF satellite, have established an initial global coverage network for atmospheric sounding. The collaborative observation approach involving multiple satellites will improve both the coverage and responsiveness of data acquisition, thereby enhancing the overall quality and reliability of the data. In response to the increasing number of channels, the rapid growth of data volume, and the specific requirements of multi-satellite joint observation applications with infrared hyperspectral sounding data, this paper introduces an efficient storage and indexing method for infrared hyperspectral sounding data within a distributed architecture for the first time. The proposed approach, built on the Kubernetes cloud platform, utilizes the Google S2 discrete grid spatial indexing algorithm to establish a grid-based hierarchical model for unified metadata-embedded documents. Additionally, it optimizes the rowkey design using the BPDS model, thereby enabling the distributed storage of data in HBase. The experimental results demonstrate that the query efficiency of the Google S2 grid-based embedded document model is superior to that of the traditional flat model, achieving a query time that is only 35.6% of the latter for a dataset of 5 million records. Additionally, this method exhibits better data distribution characteristics within the global grid compared to the H3 algorithm. Leveraging the BPDS model, the HBase distributed storage system adeptly balances the node load and counteracts the detrimental effects caused by the accumulation of time-series remote sensing images. This architecture significantly enhances both storage and query efficiency, thus laying a robust foundation for forthcoming distributed computing.
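As a rough illustration of the rowkey idea mentioned in the description, the sketch below builds an HBase-style rowkey from a grid cell token and an acquisition time. The abstract does not define the BPDS model, so this is a generic hash-salting scheme and not the authors' design; the bucket count, key layout, example cell token, and all function names are assumptions made for illustration.

```python
import hashlib
from collections import Counter

NUM_BUCKETS = 16         # assumed number of salt buckets (e.g. one per pre-split region)
MAX_TS_MS = 10**13 - 1   # assumed bound for reversing 13-digit millisecond timestamps


def salt_bucket(cell_token: str, num_buckets: int = NUM_BUCKETS) -> int:
    """Map a grid cell token to a stable bucket so writes spread across regions."""
    digest = hashlib.md5(cell_token.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % num_buckets


def make_rowkey(cell_token: str, timestamp_ms: int, satellite: str) -> bytes:
    """Hypothetical layout: bucket | cell token | reversed timestamp | satellite."""
    bucket = salt_bucket(cell_token)
    # Reversing the timestamp makes newer observations sort first within a cell.
    reversed_ts = MAX_TS_MS - timestamp_ms
    return f"{bucket:02d}|{cell_token}|{reversed_ts:013d}|{satellite}".encode("ascii")


def bucket_histogram(cell_tokens):
    """Count how many cell tokens land in each salt bucket."""
    return Counter(salt_bucket(token) for token in cell_tokens)
```

Because the salt is derived from the cell token rather than the timestamp, all observations of one cell stay contiguous for range scans, while consecutive acquisitions over different cells do not pile onto a single region. Whether this resembles the paper's actual key design cannot be determined from the abstract.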
format Article
id doaj-art-d4f7845f7b1f4700a23e7bacae20683e
institution OA Journals
issn 2072-4292
language English
publishDate 2024-11-01
publisher MDPI AG
record_format Article
series Remote Sensing
spelling doaj-art-d4f7845f7b1f4700a23e7bacae20683e
2025-08-20T02:13:14Z
eng
MDPI AG
Remote Sensing
2072-4292
2024-11-01
Volume 16, Issue 21, Article 4088
10.3390/rs16214088
Distribution-Based Approach for Efficient Storage and Indexing of Massive Infrared Hyperspectral Sounding Data
Han Li, Mingjian Gu, Guang Shi, Yong Hu, Mengzhen Xie
Key Laboratory of Infrared Science and Technology, Shanghai Institute of Technical Physics, Chinese Academy of Sciences, Shanghai 200083, China (all five authors)
https://www.mdpi.com/2072-4292/16/21/4088
infrared hyperspectral sounding data; HIRAS; distributed storage; kubernetes; HBase
title Distribution-Based Approach for Efficient Storage and Indexing of Massive Infrared Hyperspectral Sounding Data
topic infrared hyperspectral sounding data
HIRAS
distributed storage
kubernetes
HBase
url https://www.mdpi.com/2072-4292/16/21/4088