Bounded Degradation in Latent Representation Under Bias Subspace Removal

This work investigates the concentration of demographic signals in high-dimensional embeddings, focusing on a “bias subspace” that encodes sensitive attributes such as gender. Experiments on textual job biographies reveal that a single vector, derived by subtracting subgroup means, can correlate...

Bibliographic Details
Main Authors: Md Nur Amin, Phil Nemeth, Alexander Jesser
Format: Article
Language: English
Published: LibraryPress@UF 2025-05-01
Series: Proceedings of the International Florida Artificial Intelligence Research Society Conference
Online Access: https://journals.flvc.org/FLAIRS/article/view/139018
collection DOAJ
description This work investigates the concentration of demographic signals in high-dimensional embeddings, focusing on a “bias subspace” that encodes sensitive attributes such as gender. Experiments on textual job biographies reveal that a single vector, derived by subtracting subgroup means, can correlate with gender above 0.95, indicating that only a few coordinates often capture dominant group distinctions. A further analysis using covariance differences isolates additional, though weaker, bias directions. To explain why neutralizing the principal bias dimension barely impairs classification performance, this paper introduces a Bounded Degradation Theorem. The result shows that unless a downstream classifier aligns heavily with the removed axis, any resulting logit shifts remain bounded, thus preserving accuracy. Empirical observations confirm that group-level outcomes shift, yet overall accuracy remains nearly unchanged. Theoretical and experimental insights highlight both the geometric underpinnings of bias in language-model embeddings and practical strategies for mitigating undesired effects, while leaving most classification power intact.
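The mean-difference direction and its removal, as summarized in the description, can be illustrated with a minimal NumPy sketch. This is not the authors' code: the synthetic data, the function names (`mean_difference_direction`, `remove_direction`, `logit_shift_bound`), and the linear classifier weights `w` are all assumptions made for illustration. The bound shown is the elementary identity for a linear logit, where removing a unit direction v changes w·x by exactly (w·v)(x·v); the paper's Bounded Degradation Theorem may be stated differently.

```python
import numpy as np

def mean_difference_direction(X, g):
    """Unit vector pointing from the mean of group 0 to the mean of group 1."""
    v = X[g == 1].mean(axis=0) - X[g == 0].mean(axis=0)
    return v / np.linalg.norm(v)

def remove_direction(X, v):
    """Project each row of X onto the orthogonal complement of unit vector v."""
    return X - np.outer(X @ v, v)

def logit_shift_bound(X, w, v):
    """Per-row value |w.v| * |x.v|; for a linear logit this equals the shift."""
    return np.abs(w @ v) * np.abs(X @ v)

# Synthetic stand-in for embeddings (assumption: not the biographies data).
rng = np.random.default_rng(0)
n, d = 2000, 32
g = rng.integers(0, 2, size=n)        # hypothetical binary group label
X = rng.normal(size=(n, d))
X[:, 0] += 6.0 * (g - 0.5)            # group signal concentrated in one coordinate

v = mean_difference_direction(X, g)
corr = np.corrcoef(X @ v, g)[0, 1]    # correlation of the projection with the label

w = rng.normal(size=d)                # hypothetical linear classifier weights
w -= 0.9 * (w @ v) * v                # make w only weakly aligned with v

X_clean = remove_direction(X, v)
shift = np.abs(X @ w - X_clean @ w)   # actual per-example logit change
bound = logit_shift_bound(X, w, v)

print(f"correlation with group label: {corr:.3f}")
print(f"max logit shift: {shift.max():.4f}, max bound: {bound.max():.4f}")
```

In this sketch the shift is small whenever |w·v| is small, which matches the description's qualitative claim that accuracy is preserved unless the classifier aligns heavily with the removed axis.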
id doaj-art-306cc9d76ef545369ad402eeacb7aab7
institution Kabale University
issn 2334-0754
2334-0762
spelling doaj-art-306cc9d76ef545369ad402eeacb7aab7; 2025-08-20T03:49:42Z; eng; LibraryPress@UF; Proceedings of the International Florida Artificial Intelligence Research Society Conference; ISSN 2334-0754, 2334-0762; 2025-05-01; vol. 38, no. 1; DOI 10.32473/flairs.38.1.139018; Bounded Degradation in Latent Representation Under Bias Subspace Removal; Md Nur Amin, Phil Nemeth, Alexander Jesser (all Heilbronn University of Applied Sciences); https://journals.flvc.org/FLAIRS/article/view/139018