Privacy-Preserving Emotion Detection: Evaluating the Trade-Off Between K-Anonymity and Model Performance

In the realm of artificial intelligence, the pursuit of enhanced model performance has often prioritized the exponential growth of training data, sometimes relegating concerns about data privacy to a secondary role. This approach has fostered a perception that data privacy and high-performance AI models are inherently opposing goals, particularly as digital fingerprinting is increasingly presented as essential for personalized experiences. This study challenges this notion by demonstrating that even straightforward anonymization preprocessing techniques do not substantially alter the performance of machine learning models, regardless of their initial capabilities, while simultaneously safeguarding user privacy. We trained four machine learning models: linear regression, linear ridge regression, a neural network, and a BiLSTM network, and evaluated their performance on data sets with varying levels of k-anonymity, specifically comparing results from a K-index of 1 to 52. Our findings indicate that, while certain trade-offs may exist, they should not be considered significant enough to deter the integration of anonymization techniques into machine learning and AI research. This work advocates for the routine adoption of anonymization practices, supporting the premise that robust model performance and strong data privacy are not mutually exclusive objectives.

Bibliographic Details
Main Authors: Alejandro de Leon Langure, Mahdi Zareei
Format: Article
Language: English
Published: IEEE 2025-01-01
Series: IEEE Access
Subjects: Data privacy, natural language processing, sentiment analysis, text emotion detection, anonymization
Online Access: https://ieeexplore.ieee.org/document/11031447/
author Alejandro de Leon Langure
Mahdi Zareei
collection DOAJ
description In the realm of artificial intelligence, the pursuit of enhanced model performance has often prioritized the exponential growth of training data, sometimes relegating concerns about data privacy to a secondary role. This approach has fostered a perception that data privacy and high-performance AI models are inherently opposing goals, particularly as digital fingerprinting is increasingly presented as essential for personalized experiences. This study challenges this notion by demonstrating that even straightforward anonymization preprocessing techniques do not substantially alter the performance of machine learning models, regardless of their initial capabilities, while simultaneously safeguarding user privacy. We trained four machine learning models: linear regression, linear ridge regression, a neural network, and a BiLSTM network, and evaluated their performance on data sets with varying levels of k-anonymity, specifically comparing results from a K-index of 1 to 52. Our findings indicate that, while certain trade-offs may exist, they should not be considered significant enough to deter the integration of anonymization techniques into machine learning and AI research. This work advocates for the routine adoption of anonymization practices, supporting the premise that robust model performance and strong data privacy are not mutually exclusive objectives.
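The description outlines the experiment only at a high level; the authors' actual anonymization pipeline, data sets, and K-index computation are not reproduced in this record. The Python sketch below is a rough illustration of the kind of trade-off study described: it suppresses tokens that occur in fewer than k documents (a crude stand-in for k-anonymous text generalization, not the paper's method), retrains a simple baseline emotion classifier at each k, and reports macro F1. The function names, the toy corpus, and the chosen k values are all assumptions made for illustration.

    # Hypothetical sketch of a k-anonymity vs. performance experiment.
    # The suppression rule below is a stand-in, not the paper's pipeline.
    from collections import Counter

    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.linear_model import LogisticRegression
    from sklearn.metrics import f1_score
    from sklearn.model_selection import train_test_split


    def suppress_rare_tokens(texts, k):
        """Replace tokens seen in fewer than k documents with a placeholder."""
        tokenized = [t.lower().split() for t in texts]
        doc_freq = Counter()
        for tokens in tokenized:
            doc_freq.update(set(tokens))  # count each token once per document
        return [
            " ".join(tok if doc_freq[tok] >= k else "<anon>" for tok in tokens)
            for tokens in tokenized
        ]


    def evaluate_at_k(texts, labels, k):
        """Train a baseline classifier on the k-suppressed corpus; return macro F1."""
        anon_texts = suppress_rare_tokens(texts, k)
        X_train, X_test, y_train, y_test = train_test_split(
            anon_texts, labels, test_size=0.2, random_state=0, stratify=labels
        )
        vec = TfidfVectorizer()
        clf = LogisticRegression(max_iter=1000)
        clf.fit(vec.fit_transform(X_train), y_train)
        preds = clf.predict(vec.transform(X_test))
        return f1_score(y_test, preds, average="macro")


    if __name__ == "__main__":
        # Toy corpus; in practice this would be an emotion-labeled data set.
        texts = [
            "i am so happy about the promotion",
            "this traffic makes me furious",
            "i feel joyful and grateful today",
            "the delay made me really angry",
        ] * 20
        labels = ["joy", "anger", "joy", "anger"] * 20
        for k in (1, 10, 25, 52):
            print(f"k={k:>2}  macro-F1={evaluate_at_k(texts, labels, k):.3f}")

With this toy corpus, suppression only takes effect once k exceeds the document frequency of the emotion-bearing tokens, so the scores stay flat at low k and drop at high k, an exaggerated version of the plateau-then-degrade behavior the abstract alludes to.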
format Article
id doaj-art-cd6a5d11c2de40a392b2a0c37e0b6575
institution Kabale University
issn 2169-3536
language English
publishDate 2025-01-01
publisher IEEE
record_format Article
series IEEE Access
spelling doaj-art-cd6a5d11c2de40a392b2a0c37e0b6575
IEEE Access, vol. 13, pp. 105901-105910, 2025-01-01
DOI: 10.1109/ACCESS.2025.3578958; IEEE document 11031447
Alejandro de Leon Langure (https://orcid.org/0000-0002-8362-2045), Tecnologico de Monterrey, School of Engineering and Sciences, Monterrey, Mexico
Mahdi Zareei (https://orcid.org/0000-0001-6623-1758), Tecnologico de Monterrey, School of Engineering and Sciences, Monterrey, Mexico
https://ieeexplore.ieee.org/document/11031447/
title Privacy-Preserving Emotion Detection: Evaluating the Trade-Off Between K-Anonymity and Model Performance
topic Data privacy
natural language processing
sentiment analysis
text emotion detection
anonymization
url https://ieeexplore.ieee.org/document/11031447/