Stochastic algorithm for HDFS data theft detection based on MapReduce

Bibliographic Details
Main Authors: Yuanzhao GAO, Binglong LI, Xingyuan CHEN
Format: Article
Language: zho
Published: Editorial Department of Journal on Communications 2018-10-01
Series: Tongxin xuebao
Online Access: http://www.joconline.com.cn/zh/article/doi/10.11959/j.issn.1000-436x.2018222/
Description
Summary: To address the problems of efficient big data analysis and insider theft detection in data theft detection for distributed cloud storage, a stochastic algorithm for HDFS data theft detection based on MapReduce is proposed, taking HDFS (Hadoop Distributed File System) as a case study. By analyzing the MAC timestamp features that folder replication generates in HDFS, a method for detecting and measuring replication behavior is established that covers all data theft modes, including insider theft. A data set that suits MapReduce task partitioning and preserves the HDFS hierarchy is designed to enable efficient analysis of large volumes of timestamps. Experimental results show that adopting a segment detection strategy keeps the miss rate and the number of mislabeled folders low. The algorithm is shown to be efficient and to scale well under the MapReduce framework.
ISSN: 1000-436X