HPPQ: A Parallel Package Queries Processing Approach for Large-Scale Data

A lot of scholars have focused on developing effective techniques for package queries, and a lot of excellent approaches have been proposed. Unfortunately, most of the existing methods focus on a small volume of data. The rapid increase in data volume means that traditional methods of package querie...

Bibliographic Details
Main Authors: Meihui Shi, Derong Shen, Tiezheng Nie, Yue Kou, Ge Yu
Format: Article
Language: English
Published: Tsinghua University Press 2018-06-01
Series: Big Data Mining and Analytics
Subjects: package queries; heuristic algorithms; parallel processing; opposition-based learning
Online Access: https://www.sciopen.com/article/10.26599/BDMA.2018.9020014
author Meihui Shi
Derong Shen
Tiezheng Nie
Yue Kou
Ge Yu
collection DOAJ
description Package queries have attracted considerable research attention, and many effective approaches have been proposed. However, most existing methods target small data volumes, and the rapid growth of data makes it difficult for traditional package query methods to keep up. To address this problem, this paper proposes HPPQ, a novel optimization method for package queries. First, the data is preprocessed into regions: preprocessing segments the dataset into multiple subsets, and the centroid of each subset is used for package queries, which effectively reduces the volume of candidate results. Next, an efficient heuristic algorithm, IPOL-HS, is proposed based on the preprocessing results; it improves the quality of the candidate results in the iterative stage and the convergence rate of the heuristic algorithm. Finally, a strategy called HPR is proposed, which relies on a greedy algorithm and parallel processing to accelerate query processing. Experimental results show that the method significantly reduces time consumption compared with existing methods.
format Article
id doaj-art-0a30d6613c624c91a6aafcfd79f07bae
institution Kabale University
issn 2096-0654
language English
publishDate 2018-06-01
publisher Tsinghua University Press
record_format Article
series Big Data Mining and Analytics
spelling doaj-art-0a30d6613c624c91a6aafcfd79f07bae 2025-02-02T06:00:36Z eng
Tsinghua University Press, Big Data Mining and Analytics, ISSN 2096-0654, 2018-06-01, vol. 1, no. 2, pp. 146-159, doi 10.26599/BDMA.2018.9020014
Author affiliation (all five authors): College of Computer Science and Engineering, Northeastern University, Shenyang 110000, China
title HPPQ: A Parallel Package Queries Processing Approach for Large-Scale Data
topic package queries
heuristic algorithms
parallel processing
opposition-based learning
url https://www.sciopen.com/article/10.26599/BDMA.2018.9020014