Mini Worldlit: A Dataset of Contemporary Fiction from 13 Countries, Nine Languages, and Five Continents

Bibliographic Details
Main Authors: Andrew Piper, David Bamman, Christina Han, Jens Bjerring-Hansen, Hoyt Long, Itay Marienberg-Milikowsky, Tom McEnaney, Mathias Iroro Orhero, Emrah Peksoy, Pallavi Rastogi, Sebastian Rasmussen, Roel Smeets, Alexandra Stuart, Mads Rosendahl Thomsen
Format: Article
Language: English
Published: Ubiquity Press 2025-01-01
Series: Journal of Open Humanities Data
Online Access: https://account.openhumanitiesdata.metajnl.com/index.php/up-j-johd/article/view/248
collection DOAJ
description World literature plays a key role in understanding the global diversity of human storytelling. However, datasets suitable for large-scale cross-cultural analysis remain limited. Responding to the increasing digitization of literary texts and the need for more diverse and multilingual resources, we introduce Mini Worldlit, a manually curated dataset of 1,192 works of contemporary fiction from 13 countries, representing nine languages across five continents. Mini Worldlit employs consistent cross-cultural selection criteria, overseen by scholarly experts, to ensure geographic, linguistic, and stylistic coherence. The dataset provides a foundation for future comparative studies of global literary cultures, offering a template for cross-cultural sampling. Our methodology pairs geographic boundaries with linguistic communities, enabling a structured exploration of world literature. This dataset is designed to facilitate a comparative approach to understanding literature and support the growing field of multilingual digital humanities.
institution Kabale University
issn 2059-481X
Citation: Journal of Open Humanities Data 11, 44 (2025)
DOI: 10.5334/johd.248
Author details:
Andrew Piper (https://orcid.org/0000-0001-9663-5999), Department of Languages, Literatures, and Cultures, McGill University, Montreal
David Bamman, School of Information, University of California, Berkeley, Berkeley
Christina Han, History Program, Faculty of Liberal Arts, Wilfrid Laurier University, Brantford
Jens Bjerring-Hansen (https://orcid.org/0000-0001-5786-8300), Department of Nordic Studies and Linguistics, University of Copenhagen, Copenhagen
Hoyt Long (https://orcid.org/0000-0002-8562-5426), Department of East Asian Languages and Civilizations, The University of Chicago, Chicago
Itay Marienberg-Milikowsky (https://orcid.org/0000-0002-1150-7259), Department of Hebrew Literature, Ben-Gurion University of the Negev, Beer-Sheva
Tom McEnaney, Department of Spanish & Portuguese, University of California, Berkeley, Berkeley
Mathias Iroro Orhero (https://orcid.org/0000-0002-1970-4505), Department of African and African-American Studies, Louisiana State University, Baton Rouge
Emrah Peksoy (https://orcid.org/0000-0003-4940-616X), Department of Translation and Interpreting, Faculty of Humanities and Social Sciences, Kahramanmaras Istiklal University, Kahramanmaraş
Pallavi Rastogi, Department of English, Louisiana State University, Baton Rouge
Sebastian Rasmussen (https://orcid.org/0000-0002-6238-2513), School of Communication and Culture – Comparative Literature, Aarhus University, Aarhus
Roel Smeets, Department of Modern Languages and Cultures, Radboud University Nijmegen, Nijmegen
Alexandra Stuart, Department of Psychology, McGill University, Montreal
Mads Rosendahl Thomsen (https://orcid.org/0000-0002-4975-6752), School of Communication and Culture – Comparative Literature, Aarhus University, Aarhus
topic literature
multilingualism
fiction
world literature