Query scheduling based on cloud-edge multi-data warehouse architecture and cost prediction model
With the development of cloud computing and big data, traditional local data warehouses have become difficult to scale and process data inefficiently. As a result, data warehouses built on a cloud-edge architecture have emerged. In this architecture, data warehouses are distributed in the cloud center...
| Main Authors: | GAO Xuning, YANG Song, LI Mingzhe, ZHANG Yanfeng |
|---|---|
| Format: | Article |
| Language: | zho |
| Published: | China InfoCom Media Group, 2025-01-01 |
| Series: | 大数据 |
| Subjects: | scheduling framework; query cost prediction; random forest; feature selection |
| Online Access: | http://www.j-bigdataresearch.com.cn/thesisDetails#10.11959/j.issn.2096-0271.2025011 |
| _version_ | 1849731685419057152 |
|---|---|
| author | GAO Xuning; YANG Song; LI Mingzhe; ZHANG Yanfeng |
| collection | DOAJ |
| description | With the development of cloud computing and big data, traditional local data warehouses have become difficult to scale and process data inefficiently. As a result, data warehouses built on a cloud-edge architecture have emerged. In this architecture, data warehouses are distributed across the cloud center and the edge, which makes data storage and processing more flexible and supports requirements such as data security, data privacy, and cross-geographic data sharing while maintaining query efficiency. This paper designs a scheduling framework for cloud-edge multi-data warehouses, integrates a query cost prediction model built on machine learning, and implements cloud-edge collaborative execution and cloud-edge selective execution at multiple query granularities to improve the performance and query efficiency of the whole system. In addition, a multi-feature fusion and feature selection method is proposed to enrich the query cost information. The scheduling framework and optimization algorithm achieve significant performance improvements on the SSB and TPC-DS datasets, providing an effective solution for query scheduling under the cloud-edge multi-data warehouse architecture. |
| format | Article |
| id | doaj-art-0b565ed7b0a045bc98f0afccd543dad4 |
| institution | DOAJ |
| issn | 2096-0271 |
| language | zho |
| publishDate | 2025-01-01 |
| publisher | China InfoCom Media Group |
| record_format | Article |
| series | 大数据 |
| title | Query scheduling based on cloud-edge multi-data warehouse architecture and cost prediction model |
| topic | scheduling framework; query cost prediction; random forest; feature selection |
| url | http://www.j-bigdataresearch.com.cn/thesisDetails#10.11959/j.issn.2096-0271.2025011 |
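
The description above mentions cloud-edge selective execution driven by a query cost prediction model. The following is a minimal sketch of that general idea, not the authors' implementation: each site gets its own cost predictor, and a query is routed to the site with the lowest predicted cost. The feature layout (tables joined, millions of rows scanned), the class `KnnCostModel`, the function `choose_site`, and all training numbers are hypothetical. The record's keywords name random forest; a k-nearest-neighbour average is swapped in here only to keep the example dependency-free.

```python
from typing import Dict, List, Sequence, Tuple

# Hypothetical query features: (tables joined, millions of rows scanned).
Features = Tuple[float, ...]


class KnnCostModel:
    """Stand-in cost predictor: mean cost of the k most similar past queries.

    This is NOT the paper's model; it replaces the random forest named in the
    record's keywords so that the sketch runs with the standard library only.
    """

    def __init__(self, k: int = 2) -> None:
        self.k = k
        self.history: List[Tuple[Features, float]] = []

    def fit(self, samples: Sequence[Tuple[Features, float]]) -> "KnnCostModel":
        # Each sample is (features, observed cost in seconds).
        self.history = list(samples)
        return self

    def predict(self, x: Features) -> float:
        if not self.history:
            raise ValueError("model has no training samples")

        def squared_distance(sample: Tuple[Features, float]) -> float:
            feats, _ = sample
            return sum((a - b) ** 2 for a, b in zip(feats, x))

        nearest = sorted(self.history, key=squared_distance)[: self.k]
        return sum(cost for _, cost in nearest) / len(nearest)


def choose_site(models: Dict[str, KnnCostModel], x: Features) -> Tuple[str, float]:
    """Selective execution: return the site with the lowest predicted cost."""
    predictions = {site: model.predict(x) for site, model in models.items()}
    best = min(predictions, key=predictions.get)
    return best, predictions[best]


# Made-up history: the edge is fast for small queries, the cloud for large ones.
edge_model = KnnCostModel(k=2).fit([
    ((1.0, 0.5), 0.2), ((2.0, 1.0), 0.6), ((4.0, 50.0), 30.0), ((5.0, 80.0), 55.0),
])
cloud_model = KnnCostModel(k=2).fit([
    ((1.0, 0.5), 1.5), ((2.0, 1.0), 1.8), ((4.0, 50.0), 6.0), ((5.0, 80.0), 9.0),
])
models = {"edge": edge_model, "cloud": cloud_model}

small = choose_site(models, (1.0, 0.8))   # a light query
large = choose_site(models, (5.0, 70.0))  # a heavy query
print(small[0], large[0])  # edge cloud
```

Keeping one predictor per site lets the routing rule stay a plain argmin over predicted costs, so a different regressor could replace the stand-in without touching `choose_site`.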