Privacy-Preserving Synthetic Data Generation Method for IoT-Sensor Network IDS Using CTGAN

The increased usage of IoT networks brings about new privacy risks, especially when intrusion detection systems (IDSs) rely on large datasets for machine learning (ML) tasks and depend on third parties for storing and training the ML-based IDS. This study proposes a privacy-preserving synthetic data...

Bibliographic Details
Main Authors: Saleh Alabdulwahab, Young-Tak Kim, Yunsik Son
Format: Article
Language: English
Published: MDPI AG 2024-11-01
Series: Sensors
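Subjects: differential privacy; data utility; generative adversarial network; intrusion detection systems; Internet of things; deep learning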
Online Access: https://www.mdpi.com/1424-8220/24/22/7389
author Saleh Alabdulwahab
Young-Tak Kim
Yunsik Son
collection DOAJ
description The increased usage of IoT networks brings about new privacy risks, especially when intrusion detection systems (IDSs) rely on large datasets for machine learning (ML) tasks and depend on third parties for storing and training the ML-based IDS. This study proposes a privacy-preserving synthetic data generation method using a conditional tabular generative adversarial network (CTGAN) aimed at maintaining the utility of IoT sensor network data for IDS while safeguarding privacy. We integrate differential privacy (DP) with CTGAN by employing controlled noise injection to mitigate privacy risks. The technique involves dynamic distribution adjustment and quantile matching to balance the utility–privacy tradeoff. The results indicate a significant improvement in data utility compared to the standard DP method, achieving a KS test score of 0.80 while minimizing privacy risks such as singling out, linkability, and inference attacks. This approach ensures that synthetic datasets can support intrusion detection without exposing sensitive information.
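The utility–privacy tradeoff mentioned in the description can be illustrated with a small sketch. It is not the authors' method: it assumes the reported "KS test score" is 1 minus the two-sample Kolmogorov-Smirnov statistic (a common convention in synthetic data tooling that the abstract does not confirm), and it uses plain Laplace noise as a stand-in for the "standard DP method". The function names `ks_statistic`, `ks_score`, and `add_laplace_noise` are hypothetical, and the data is simulated.

```python
import random
from bisect import bisect_right


def ks_statistic(a, b):
    """Two-sample KS statistic: largest gap between the two empirical CDFs."""
    a, b = sorted(a), sorted(b)
    gap = 0.0
    # The supremum of |F_a - F_b| is attained at one of the observed points.
    for x in a + b:
        cdf_a = bisect_right(a, x) / len(a)
        cdf_b = bisect_right(b, x) / len(b)
        gap = max(gap, abs(cdf_a - cdf_b))
    return gap


def ks_score(real, synthetic):
    """Assumed convention: 1 - KS statistic, so 1.0 means identical distributions."""
    return 1.0 - ks_statistic(real, synthetic)


def add_laplace_noise(values, scale, rng):
    """Add Laplace(0, scale) noise; the difference of two exponentials is Laplace."""
    return [
        v + rng.expovariate(1 / scale) - rng.expovariate(1 / scale)
        for v in values
    ]


if __name__ == "__main__":
    rng = random.Random(0)
    # Simulated stand-in for one numeric sensor feature (not data from the paper).
    real = [rng.gauss(50, 10) for _ in range(2000)]
    for scale in (1.0, 10.0, 30.0):
        noisy = add_laplace_noise(real, scale, rng)
        print(f"noise scale {scale:>5}: KS score {ks_score(real, noisy):.2f}")
```

Under these assumptions, a larger noise scale (stronger perturbation) pushes the score further below 1.0, which is the loss of utility the paper's distribution adjustment and quantile matching are described as counteracting.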
format Article
id doaj-art-33ca7054f4e049d7b5dcde7bdba2d54b
institution OA Journals
issn 1424-8220
language English
publishDate 2024-11-01
publisher MDPI AG
record_format Article
series Sensors
spelling doaj-art-33ca7054f4e049d7b5dcde7bdba2d54b
2025-08-20T02:04:41Z
eng
MDPI AG
Sensors
1424-8220
2024-11-01
24(22), 7389
10.3390/s24227389
Privacy-Preserving Synthetic Data Generation Method for IoT-Sensor Network IDS Using CTGAN
Saleh Alabdulwahab (Department of Computer Science and Engineering, Dongguk University, Seoul 04620, Republic of Korea)
Young-Tak Kim (Department of Radiology, Massachusetts General Hospital, Harvard Medical School, Boston, MA 02114, USA)
Yunsik Son (Division of AI Software Convergence, Dongguk University, Seoul 04620, Republic of Korea)
https://www.mdpi.com/1424-8220/24/22/7389
differential privacy; data utility; generative adversarial network; intrusion detection systems; Internet of things; deep learning
title Privacy-Preserving Synthetic Data Generation Method for IoT-Sensor Network IDS Using CTGAN
topic differential privacy
data utility
generative adversarial network
intrusion detection systems
Internet of things
deep learning
url https://www.mdpi.com/1424-8220/24/22/7389