Slovene and Croatian word embeddings in terms of gender occupational analogies
In recent years, the use of deep neural networks and dense vector embeddings for text representation has led to excellent results in the field of computational understanding of natural language. It has also been shown that word embeddings often capture gender, racial and other types of bias. The ar...
| Main Authors: | Matej Ulčar, Anka Supej, Marko Robnik-Šikonja, Senja Pollak |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | University of Ljubljana Press (Založba Univerze v Ljubljani), 2021-07-01 |
| Series: | Slovenščina 2.0: Empirične, aplikativne in interdisciplinarne raziskave |
| Subjects: | word embeddings, gender bias, word analogy task, occupations, natural language processing |
| Online Access: | https://journals.uni-lj.si/slovenscina2/article/view/9883 |
| _version_ | 1849319346499747840 |
|---|---|
| author | Matej Ulčar, Anka Supej, Marko Robnik-Šikonja, Senja Pollak |
| author_sort | Matej Ulčar |
| collection | DOAJ |
| description | In recent years, the use of deep neural networks and dense vector embeddings for text representation has led to excellent results in the field of computational understanding of natural language. It has also been shown that word embeddings often capture gender, racial and other types of bias. The article focuses on evaluating Slovene and Croatian word embeddings in terms of gender bias using word analogy calculations. We compiled a list of masculine and feminine nouns for occupations in Slovene and evaluated the gender bias of fastText, word2vec and ELMo embeddings with different configurations and different approaches to analogy calculations. The lowest occupational gender bias was observed with the fastText embeddings. Similarly, we compared different fastText embeddings on Croatian occupational analogies. |
| format | Article |
| id | doaj-art-9d4e22fbe85e43159fb32dc66266b2c4 |
| institution | Kabale University |
| issn | 2335-2736 |
| language | English |
| publishDate | 2021-07-01 |
| publisher | University of Ljubljana Press (Založba Univerze v Ljubljani) |
| record_format | Article |
| series | Slovenščina 2.0: Empirične, aplikativne in interdisciplinarne raziskave |
| spelling | doaj-art-9d4e22fbe85e43159fb32dc66266b2c4; 2025-08-20T03:50:31Z; eng; University of Ljubljana Press (Založba Univerze v Ljubljani); Slovenščina 2.0: Empirične, aplikativne in interdisciplinarne raziskave; ISSN 2335-2736; 2021-07-01; volume 9, issue 1; DOI 10.4312/slo2.0.2021.1.26-59; Slovene and Croatian word embeddings in terms of gender occupational analogies; Matej Ulčar (University of Ljubljana, Faculty of Computer and Information Science, Slovenia); Anka Supej (Jožef Stefan Institute, Ljubljana, Slovenia); Marko Robnik-Šikonja (University of Ljubljana, Faculty of Computer and Information Science, Slovenia); Senja Pollak (Jožef Stefan Institute, Ljubljana, Slovenia); https://journals.uni-lj.si/slovenscina2/article/view/9883; word embeddings, gender bias, word analogy task, occupations, natural language processing |
| title | Slovene and Croatian word embeddings in terms of gender occupational analogies |
| topic | word embeddings, gender bias, word analogy task, occupations, natural language processing |
| url | https://journals.uni-lj.si/slovenscina2/article/view/9883 |
| work_keys_str_mv | AT matejulcar sloveneandcroatianwordembeddingsintermsofgenderoccupationalanalogies AT ankasupej sloveneandcroatianwordembeddingsintermsofgenderoccupationalanalogies AT markorobniksikonja sloveneandcroatianwordembeddingsintermsofgenderoccupationalanalogies AT senjapollak sloveneandcroatianwordembeddingsintermsofgenderoccupationalanalogies |
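
The description in this record says that gender bias was measured with word analogy calculations, but it does not say which formulation was used. As an illustration only, the sketch below uses the common vector-offset approach (often called 3CosAdd): for "a is to b as c is to ?", it returns the word whose vector is closest by cosine similarity to b - a + c, excluding the three input words. The three-dimensional vectors are invented toy values, not taken from any fastText, word2vec or ELMo model, and the names `cosine`, `solve_analogy` and `TOY_VECTORS` are hypothetical.

```python
import math

# Invented toy vectors (assumption): axes are roughly
# [grammatical gender, "doctor", "teacher"]. Real embeddings have
# hundreds of dimensions and no such clean axes.
TOY_VECTORS = {
    "moški":      [1.0, 0.0, 0.0],   # man
    "ženska":     [-1.0, 0.0, 0.0],  # woman
    "zdravnik":   [1.0, 1.0, 0.0],   # doctor (masculine)
    "zdravnica":  [-1.0, 1.0, 0.0],  # doctor (feminine)
    "učitelj":    [1.0, 0.0, 1.0],   # teacher (masculine)
    "učiteljica": [-1.0, 0.0, 1.0],  # teacher (feminine)
}


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def solve_analogy(vectors, a, b, c):
    """Answer 'a is to b as c is to ?' with the vector-offset method."""
    # Offset target: b - a + c, computed component by component.
    target = [vb - va + vc
              for va, vb, vc in zip(vectors[a], vectors[b], vectors[c])]
    # The three query words are excluded from the candidate answers.
    candidates = [w for w in vectors if w not in (a, b, c)]
    return max(candidates, key=lambda w: cosine(vectors[w], target))


if __name__ == "__main__":
    # man : doctor (masc.) :: woman : ?
    print(solve_analogy(TOY_VECTORS, "moški", "zdravnik", "ženska"))
```

In a bias evaluation of this kind, one would presumably check whether the returned word is the matching feminine (or masculine) form of the same occupation or some other, stereotyped occupation; the scoring used in the article itself is not described in this record.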