Privacy-Preserving Synthetic Data Generation Method for IoT-Sensor Network IDS Using CTGAN
The increased usage of IoT networks brings about new privacy risks, especially when intrusion detection systems (IDSs) rely on large datasets for machine learning (ML) tasks and depend on third parties for storing and training the ML-based IDS. This study proposes a privacy-preserving synthetic data...
| Main Authors: | Saleh Alabdulwahab, Young-Tak Kim, Yunsik Son |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | MDPI AG, 2024-11-01 |
| Series: | Sensors |
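| Subjects: | differential privacy; data utility; generative adversarial network; intrusion detection systems; Internet of things; deep learning |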
| Online Access: | https://www.mdpi.com/1424-8220/24/22/7389 |
| _version_ | 1850227883771953152 |
|---|---|
| author | Saleh Alabdulwahab; Young-Tak Kim; Yunsik Son |
| author_sort | Saleh Alabdulwahab |
| collection | DOAJ |
| description | The increased usage of IoT networks brings about new privacy risks, especially when intrusion detection systems (IDSs) rely on large datasets for machine learning (ML) tasks and depend on third parties for storing and training the ML-based IDS. This study proposes a privacy-preserving synthetic data generation method using a conditional tabular generative adversarial network (CTGAN) aimed at maintaining the utility of IoT sensor network data for IDS while safeguarding privacy. We integrate differential privacy (DP) with CTGAN by employing controlled noise injection to mitigate privacy risks. The technique involves dynamic distribution adjustment and quantile matching to balance the utility–privacy tradeoff. The results indicate a significant improvement in data utility compared to the standard DP method, achieving a KS test score of 0.80 while minimizing privacy risks such as singling out, linkability, and inference attacks. This approach ensures that synthetic datasets can support intrusion detection without exposing sensitive information. |
| format | Article |
| id | doaj-art-33ca7054f4e049d7b5dcde7bdba2d54b |
| institution | OA Journals |
| issn | 1424-8220 |
| language | English |
| publishDate | 2024-11-01 |
| publisher | MDPI AG |
| record_format | Article |
| series | Sensors |
| spelling | doaj-art-33ca7054f4e049d7b5dcde7bdba2d54b; 2025-08-20T02:04:41Z; eng; MDPI AG; Sensors; 1424-8220; 2024-11-01; vol. 24, no. 22, art. 7389; 10.3390/s24227389; Privacy-Preserving Synthetic Data Generation Method for IoT-Sensor Network IDS Using CTGAN; Saleh Alabdulwahab (Department of Computer Science and Engineering, Dongguk University, Seoul 04620, Republic of Korea); Young-Tak Kim (Department of Radiology, Massachusetts General Hospital, Harvard Medical School, Boston, MA 02114, USA); Yunsik Son (Division of AI Software Convergence, Dongguk University, Seoul 04620, Republic of Korea); https://www.mdpi.com/1424-8220/24/22/7389; differential privacy; data utility; generative adversarial network; intrusion detection systems; Internet of things; deep learning |
| title | Privacy-Preserving Synthetic Data Generation Method for IoT-Sensor Network IDS Using CTGAN |
| topic | differential privacy; data utility; generative adversarial network; intrusion detection systems; Internet of things; deep learning |
| url | https://www.mdpi.com/1424-8220/24/22/7389 |
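
The description above reports utility as a KS test score and mentions controlled noise injection. As a rough illustration only, the sketch below assumes the score is the complement of the two-sample Kolmogorov-Smirnov statistic (1 minus the largest gap between the empirical CDFs of a real and a synthetic column), and it uses plain Laplace noise with scale `sensitivity / epsilon` as a stand-in for the noise step. The record does not confirm either choice, and the function names `ks_statistic`, `ks_score`, and `add_laplace_noise` are hypothetical. This is not the authors' CTGAN-based method.

```python
import numpy as np


def ks_statistic(real, synth):
    """Two-sample KS statistic: largest gap between the two empirical CDFs."""
    real = np.sort(np.asarray(real, dtype=float))
    synth = np.sort(np.asarray(synth, dtype=float))
    # Evaluate both empirical CDFs at every observed value.
    grid = np.concatenate([real, synth])
    cdf_real = np.searchsorted(real, grid, side="right") / real.size
    cdf_synth = np.searchsorted(synth, grid, side="right") / synth.size
    return float(np.max(np.abs(cdf_real - cdf_synth)))


def ks_score(real, synth):
    """Assumed utility score: 1.0 means identical empirical distributions."""
    return 1.0 - ks_statistic(real, synth)


def add_laplace_noise(values, sensitivity, epsilon, rng):
    """Illustrative noise injection: Laplace noise with scale sensitivity / epsilon."""
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    values = np.asarray(values, dtype=float)
    return values + rng.laplace(loc=0.0, scale=sensitivity / epsilon, size=values.shape)


# Toy sensor column (assumed data, not taken from the paper).
rng = np.random.default_rng(0)
real_column = rng.normal(loc=0.0, scale=1.0, size=1000)

# A smaller epsilon means more noise, which should lower the KS score.
for epsilon in (10.0, 1.0, 0.1):
    noisy_column = add_laplace_noise(real_column, sensitivity=1.0, epsilon=epsilon, rng=rng)
    print(f"epsilon={epsilon}: KS score={ks_score(real_column, noisy_column):.2f}")
```

Under these assumptions, the loop gives a feel for the utility-privacy tradeoff the description refers to: as `epsilon` shrinks, the perturbed column drifts away from the real distribution and the score falls.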