Query scheduling based on cloud-edge multi-data warehouse architecture and cost prediction model

With the development of cloud computing and big data, traditional local data warehouses have become difficult to scale and process data inefficiently. As a result, data warehouses built on a cloud-edge architecture have emerged. In this architecture, data warehouses are distributed in the cloud center...

Bibliographic Details
Main Authors: GAO Xuning, YANG Song, LI Mingzhe, ZHANG Yanfeng
Format: Article
Language: Chinese (zho)
Published: China InfoCom Media Group 2025-01-01
Series: 大数据
Subjects:
Online Access: http://www.j-bigdataresearch.com.cn/thesisDetails#10.11959/j.issn.2096-0271.2025011
collection DOAJ
description With the development of cloud computing and big data, traditional local data warehouses have become difficult to scale and process data inefficiently. As a result, data warehouses built on a cloud-edge architecture have emerged. In this architecture, data warehouses are distributed across the cloud center and the edge, which makes data storage and processing more flexible and supports requirements such as data security, data privacy, and cross-region data sharing while maintaining query efficiency. This paper designs a scheduling framework for cloud-edge multi-data warehouses that integrates a machine-learning-based query cost prediction model and implements cloud-edge collaborative execution and cloud-edge selective execution at multiple query granularities, improving the performance and query efficiency of the whole system. In addition, a multi-feature fusion and feature selection method is proposed to enrich the query cost information. The scheduling framework and optimization algorithm achieve significant performance improvements on the SSB and TPC-DS datasets, providing an effective solution for query scheduling under the cloud-edge multi-data warehouse architecture.
format Article
id doaj-art-0b565ed7b0a045bc98f0afccd543dad4
institution DOAJ
issn 2096-0271
language zho
publishDate 2025-01-01
publisher China InfoCom Media Group
record_format Article
series 大数据
title Query scheduling based on cloud-edge multi-data warehouse architecture and cost prediction model
topic scheduling framework
query cost prediction
random forest
feature selection