Consistency preserving database watermarking algorithm for decision trees

Database watermarking technologies provide an effective solution to data security problems by embedding the watermark in the database to prove copyright or trace the source of data leakage. However, when the watermarked database is used for data mining model building, such as decision trees, it may...

Bibliographic Details
Main Authors: Qianwen Li, Xiang Wang, Qingqi Pei, Xiaohua Chen, Kwok-Yan Lam
Format: Article
Language: English
Published: KeAi Communications Co., Ltd. 2024-12-01
Series: Digital Communications and Networks
Subjects: Consistency preserving; Decision tree; Database watermarking; Data mining
Online Access: http://www.sciencedirect.com/science/article/pii/S2352864822002838
author Qianwen Li
Xiang Wang
Qingqi Pei
Xiaohua Chen
Kwok-Yan Lam
collection DOAJ
description Database watermarking embeds a watermark in a database to prove copyright or to trace the source of a data leak, which makes it an effective tool for data security. However, the distortion introduced by embedding can change the result when the watermarked database is used to build a data mining model such as a decision tree: the mined model may differ from the one obtained from the original database. Traditional watermarking algorithms mainly consider the statistical distortion of the data, such as the mean square error, and very few consider the effect of the watermark on data mining. This paper therefore proposes a consistency preserving database watermarking algorithm for decision trees. First, label classification statistics and label state transfer methods are proposed to adjust the watermarked data so that the watermarked decision tree has the same model structure as the original decision tree. Then, the splitting values of the decision tree are adjusted according to the defined constraint equations. As a result, the adjusted database yields a decision tree consistent with the original decision tree. The experimental results demonstrate that the proposed algorithm does not corrupt the watermarks and makes the watermarked decision tree consistent with the original decision tree with only a small distortion.
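The problem the description raises can be illustrated with a small sketch. This is not the authors' algorithm: the least-significant-bit embedding (`embed_lsb`), the toy data, and the single-split search (`best_split`) are all assumptions chosen only to show how a watermark with a small mean square error can still move the splitting value of a decision tree root.

```python
def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    counts = {}
    for y in labels:
        counts[y] = counts.get(y, 0) + 1
    return 1.0 - sum((c / n) ** 2 for c in counts.values())


def best_split(rows, labels):
    """Return (feature_index, threshold) with the lowest weighted Gini impurity.

    Candidate thresholds are midpoints between consecutive distinct values.
    Returns (None, None) if no split improves on the unsplit node.
    """
    n = len(rows)
    best = (None, None)
    best_score = gini(labels)
    for f in range(len(rows[0])):
        values = sorted(set(r[f] for r in rows))
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2
            left = [y for r, y in zip(rows, labels) if r[f] <= t]
            right = [y for r, y in zip(rows, labels) if r[f] > t]
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if score < best_score - 1e-12:
                best_score, best = score, (f, t)
    return best


def embed_lsb(rows, bits, column):
    """Hypothetical embedding: overwrite the lowest bit of one integer column."""
    out = []
    for i, r in enumerate(rows):
        r = list(r)
        r[column] = (r[column] & ~1) | bits[i % len(bits)]
        out.append(r)
    return out


def mse(original, marked, column):
    """Mean square error between one column of two equally sized tables."""
    total = sum((a[column] - b[column]) ** 2 for a, b in zip(original, marked))
    return total / len(original)


# Toy table: two integer attributes and a binary class label per row.
rows = [[10, 5], [11, 7], [13, 5], [15, 7]]
labels = [0, 0, 1, 1]

marked = embed_lsb(rows, bits=[0], column=0)   # watermark bit 0 in every row
recovered = [r[0] & 1 for r in marked]         # watermark is still readable

split_original = best_split(rows, labels)
split_marked = best_split(marked, labels)

print("original split:   ", split_original)
print("watermarked split:", split_marked)
print("MSE on column 0:  ", mse(rows, marked, 0))
```

In this toy case the root still splits on the same attribute, so the structure is unchanged, but the splitting value moves even though each value changed by at most 1. A consistency preserving scheme in the sense of the description would adjust the watermarked data until both the structure and the splitting values match the original tree, without destroying the embedded bits.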
format Article
id doaj-art-d69748a1668d483caa3f1b45d7fa977a
institution Kabale University
issn 2352-8648
language English
publishDate 2024-12-01
publisher KeAi Communications Co., Ltd.
record_format Article
series Digital Communications and Networks
spelling doaj-art-d69748a1668d483caa3f1b45d7fa977a; 2024-12-29T04:47:32Z; eng; KeAi Communications Co., Ltd.; Digital Communications and Networks; 2352-8648; 2024-12-01; volume 10, issue 6, pages 1851-1863
affiliation Qianwen Li: School of Telecommunications Engineering, Xidian University, Xi'an, 710071, China
Xiang Wang (corresponding author): School of Cyber Engineering, Xidian University, Xi'an, 710071, China
Qingqi Pei: School of Telecommunications Engineering, Xidian University, Xi'an, China
Xiaohua Chen: 2012 Laboratories, Huawei Technologies, Huawei Technology Co., Ltd, Hangzhou, 310000, China
Kwok-Yan Lam: The School of Computer Science and Engineering, Nanyang Technological University, 639798, Singapore
title Consistency preserving database watermarking algorithm for decision trees
topic Consistency preserving
Decision tree
Database watermarking
Data mining
url http://www.sciencedirect.com/science/article/pii/S2352864822002838