CIBPartitioner: a computational intensity-balanced partitioner for enhancing distributed spatial join processing

Bibliographic Details
Main Authors: Xiangyang Yang, Xuefeng Guan, Ming Zhang, Hang Wu, Bo Wang, Pengcheng Yin, Qingyang Xu, Huayi Wu
Format: Article
Language: English
Published: Taylor & Francis Group 2025-07-01
Series: Geo-spatial Information Science
Online Access: https://www.tandfonline.com/doi/10.1080/10095020.2025.2510364
Description
Summary: Load-balanced spatial partitioning is crucial for achieving high-efficiency distributed spatial join processing. However, existing spatial partitioning methods focus more on balancing data quantity, and there is much less emphasis on accurately quantifying computational loads and generating partitioning layouts according to the derived loads. To bridge these gaps, we propose a novel partitioning method, i.e. a computational intensity-balanced partitioner (termed CIBPartitioner for short), to enhance the efficiency of distributed spatial join processing by ensuring computational load balance. First, a computational intensity (CI) indicator is defined through theoretical analysis of the time complexity of spatial join processing to quantify the computational loads. Second, a distributed estimation method using grid histograms is introduced to efficiently calculate the distribution of CI. Finally, inspired by the KDBTree, a CI-balanced partitioning scheme is designed to partition the grid cells in the grid histogram according to the CI distribution, which minimizes the CI differences across partitions to achieve a balanced CI layout. Extensive experiments on real-world datasets demonstrate that CIBPartitioner significantly improves computational load balancing and enhances the end-to-end efficiency of distributed spatial join processing compared with popular spatial partitioners, including KDBTree. The source code of CIBPartitioner has been released.
ISSN: 1009-5020, 1993-5153
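
Illustrative sketch: The summary describes three steps: a per-cell CI value, a grid histogram of CI, and a tree-style split of the grid cells that balances CI. The Python sketch below is one possible reading of that pipeline, not the released CIBPartitioner code. This record does not give the CI formula, so the per-cell product of object counts used here is an assumption, and the greedy rectangle split is a simplified stand-in for the KDBTree-inspired scheme. All function names and the sample counts are hypothetical.

```python
from typing import List, Optional, Tuple

# A rectangle of grid cells as half-open ranges: (row0, row1, col0, col1).
Rect = Tuple[int, int, int, int]
Grid = List[List[int]]


def ci_grid(counts_a: Grid, counts_b: Grid) -> Grid:
    """Assumed CI proxy: product of the two inputs' object counts per cell."""
    return [[a * b for a, b in zip(ra, rb)] for ra, rb in zip(counts_a, counts_b)]


def rect_sum(ci: Grid, rect: Rect) -> int:
    """Total CI of all grid cells inside the rectangle."""
    r0, r1, c0, c1 = rect
    return sum(ci[r][c] for r in range(r0, r1) for c in range(c0, c1))


def best_split(ci: Grid, rect: Rect) -> Optional[Tuple[Rect, Rect]]:
    """Try every axis-aligned cut; keep the one whose halves differ least in CI.

    Returns None when the rectangle is a single cell and cannot be cut.
    """
    r0, r1, c0, c1 = rect
    cuts = [((r0, r, c0, c1), (r, r1, c0, c1)) for r in range(r0 + 1, r1)]
    cuts += [((r0, r1, c0, c), (r0, r1, c, c1)) for c in range(c0 + 1, c1)]
    if not cuts:
        return None
    return min(cuts, key=lambda h: abs(rect_sum(ci, h[0]) - rect_sum(ci, h[1])))


def partition(ci: Grid, num_partitions: int) -> List[Rect]:
    """Greedily split the heaviest splittable rectangle until enough partitions exist."""
    rects: List[Rect] = [(0, len(ci), 0, len(ci[0]))]
    while len(rects) < num_partitions:
        splittable = [r for r in rects if best_split(ci, r) is not None]
        if not splittable:
            break  # every rectangle is already a single cell
        heaviest = max(splittable, key=lambda r: rect_sum(ci, r))
        rects.remove(heaviest)
        rects.extend(best_split(ci, heaviest))
    return rects


# Hypothetical 4x4 grid histograms for the two join inputs.
counts_a = [[4, 1, 0, 0], [2, 1, 0, 1], [0, 0, 1, 1], [0, 1, 1, 2]]
counts_b = [[3, 1, 1, 0], [1, 1, 0, 2], [0, 2, 1, 1], [1, 0, 2, 1]]

ci = ci_grid(counts_a, counts_b)
parts = partition(ci, 4)
for rect in parts:
    print(rect, rect_sum(ci, rect))
```

A single dense cell cannot be subdivided at this grid resolution, so it caps how even the layout can become. A finer histogram would relax that limit at the cost of a larger summary to compute and ship.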