Bounded Degradation in Latent Representation Under Bias Subspace Removal
This work investigates the concentration of demographic signals in high-dimensional embeddings, focusing on a “bias subspace” that encodes sensitive attributes such as gender. Experiments on textual job biographies reveal that a single vector, derived by subtracting subgroup means, can correlate...
| Main Authors: | Md Nur Amin, Phil Nemeth, Alexander Jesser |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | LibraryPress@UF, 2025-05-01 |
| Series: | Proceedings of the International Florida Artificial Intelligence Research Society Conference |
| Online Access: | https://journals.flvc.org/FLAIRS/article/view/139018 |
| _version_ | 1849321610172956672 |
|---|---|
| author | Md Nur Amin; Phil Nemeth; Alexander Jesser |
| author_facet | Md Nur Amin; Phil Nemeth; Alexander Jesser |
| author_sort | Md Nur Amin |
| collection | DOAJ |
| description | This work investigates the concentration of demographic signals in high-dimensional embeddings, focusing on a “bias subspace” that encodes sensitive attributes such as gender. Experiments on textual job biographies reveal that a single vector, derived by subtracting subgroup means, can correlate with gender above 0.95, indicating that only a few coordinates often capture dominant group distinctions. A further analysis using covariance differences isolates additional, though weaker, bias directions. To explain why neutralizing the principal bias dimension barely impairs classification performance, this paper introduces a Bounded Degradation Theorem. The result shows that unless a downstream classifier aligns heavily with the removed axis, any resulting logit shifts remain bounded, thus preserving accuracy. Empirical observations confirm that group-level outcomes shift, yet overall accuracy remains nearly unchanged. Theoretical and experimental insights highlight both the geometric underpinnings of bias in language-model embeddings and practical strategies for mitigating undesired effects, while leaving most classification power intact. |
| format | Article |
| id | doaj-art-306cc9d76ef545369ad402eeacb7aab7 |
| institution | Kabale University |
| issn | 2334-0754 2334-0762 |
| language | English |
| publishDate | 2025-05-01 |
| publisher | LibraryPress@UF |
| record_format | Article |
| series | Proceedings of the International Florida Artificial Intelligence Research Society Conference |
| spelling | doaj-art-306cc9d76ef545369ad402eeacb7aab7; 2025-08-20T03:49:42Z; eng; LibraryPress@UF; Proceedings of the International Florida Artificial Intelligence Research Society Conference; 2334-0754; 2334-0762; 2025-05-01; 38; 1; 10.32473/flairs.38.1.139018; Bounded Degradation in Latent Representation Under Bias Subspace Removal; Md Nur Amin, Phil Nemeth, Alexander Jesser (all Heilbronn University of Applied Sciences); https://journals.flvc.org/FLAIRS/article/view/139018 |
| spellingShingle | Md Nur Amin; Phil Nemeth; Alexander Jesser; Bounded Degradation in Latent Representation Under Bias Subspace Removal; Proceedings of the International Florida Artificial Intelligence Research Society Conference |
| title | Bounded Degradation in Latent Representation Under Bias Subspace Removal |
| title_full | Bounded Degradation in Latent Representation Under Bias Subspace Removal |
| title_fullStr | Bounded Degradation in Latent Representation Under Bias Subspace Removal |
| title_full_unstemmed | Bounded Degradation in Latent Representation Under Bias Subspace Removal |
| title_short | Bounded Degradation in Latent Representation Under Bias Subspace Removal |
| title_sort | bounded degradation in latent representation under bias subspace removal |
| url | https://journals.flvc.org/FLAIRS/article/view/139018 |
| work_keys_str_mv | AT mdnuramin boundeddegradationinlatentrepresentationunderbiassubspaceremoval AT philnemeth boundeddegradationinlatentrepresentationunderbiassubspaceremoval AT alexanderjesser boundeddegradationinlatentrepresentationunderbiassubspaceremoval |
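
The description above mentions a bias vector obtained by subtracting subgroup means and a bound on logit shifts after that axis is removed. The following is a minimal sketch of that idea on synthetic data, not the authors' code or the exact statement of their theorem, which this record does not give. The helper names `bias_direction` and `remove_direction`, the planted group signal, and the random linear classifier `w` are all assumptions made for illustration. For a linear classifier, the logit shift caused by removing a unit direction `v` equals `|w·v| * |x·v|`, so it stays small whenever `w` is only weakly aligned with `v`.

```python
import numpy as np

def bias_direction(X, groups):
    """Unit vector along the difference of the two subgroup means (hypothetical helper)."""
    diff = X[groups == 1].mean(axis=0) - X[groups == 0].mean(axis=0)
    return diff / np.linalg.norm(diff)

def remove_direction(X, v):
    """Project every row of X onto the orthogonal complement of the unit vector v."""
    return X - np.outer(X @ v, v)

# Synthetic embeddings: Gaussian noise plus a group signal planted on coordinate 0.
rng = np.random.default_rng(0)
n, d = 2000, 64
groups = rng.integers(0, 2, size=n)
X = rng.normal(size=(n, d))
X[:, 0] += 3.0 * (2 * groups - 1)

v = bias_direction(X, groups)
X_clean = remove_direction(X, v)

# Correlation between the projection onto v and the group label.
corr = np.corrcoef(X @ v, groups)[0, 1]

# Hypothetical linear classifier: logit = w . x (no bias term, chosen at random).
w = rng.normal(size=d)
shift = np.abs(X @ w - X_clean @ w)      # per-example logit change
bound = np.abs(w @ v) * np.abs(X @ v)    # |w.v| * |x.v|

print(f"correlation with group label: {corr:.3f}")
print(f"max logit shift: {shift.max():.3f}, |w.v| = {abs(w @ v):.3f}")
```

In this linear setting the shift matches the product `|w·v| * |x·v|` exactly; a nonlinear downstream classifier would need a different argument, which is presumably what the paper's theorem addresses.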