Swahili questions and answers dataset for aflatoxin knowledge domain (Mendeley Data)

Bibliographic Details
Main Authors: Pamela Chogo, Elizabeth Mkoba, Neema Kassim
Format: Article
Language: English
Published: Elsevier 2025-06-01
Series: Data in Brief
Subjects: Natural language processing; NLP-based chatbot; Knowledge sharing; Food security; Aflatoxin dataset
Online Access: http://www.sciencedirect.com/science/article/pii/S2352340925002070
author Pamela Chogo
Elizabeth Mkoba
Neema Kassim
collection DOAJ
description Aflatoxin contamination is a challenge to food security, health, and trade in Tanzania and other parts of the world. It affects maize, groundnuts, other crops, and animal products. Contaminated crops and animal products become toxic, causing illness or death in the humans and animals that consume them. Lack of awareness and knowledge of the contamination is considered one of the reasons for its continued occurrence. Various awareness-creation and knowledge-sharing techniques have been used, but the situation remains unsatisfactory. To address this, the use of a Natural Language Processing (NLP) chatbot for sharing aflatoxin knowledge is proposed, because NLP chatbots have been successful in knowledge sharing in various contexts. This data article presents a Swahili text-based question-and-answer dataset on aflatoxin knowledge. Data were collected through 7 focus group discussion (FGD) sessions conducted in the Arusha, Dodoma, Mtwara, Tabora, Morogoro, and Iringa regions of Tanzania. Respondents were farmers, traders, and consumers of maize and groundnuts. The collected data were processed and analyzed using the R qualitative data analysis tool, which allowed the identification of 6 themes, each with its respective questions. The questions were shared with experts through 9 interview sessions, and the experts provided the answers. The questions and answers were then translated into Swahili using Google Translate followed by manual verification. Finally, a Swahili aflatoxin knowledge dataset containing 221 paired questions and answers organized into 6 knowledge areas was developed. With this dataset, an NLP-based chatbot that uses Swahili can be developed. This will benefit farmers, traders, consumers, researchers, and policymakers, who can use it to learn more about aflatoxin and make informed decisions. Moreover, the dataset can be adopted and modified to create NLP chatbots that share aflatoxin knowledge in languages other than Swahili. The dataset also contributes to the availability of Swahili language datasets.
format Article
id doaj-art-ce190d3d6e0c4af1bf10db74a5b2329e
institution OA Journals
issn 2352-3409
language English
publishDate 2025-06-01
publisher Elsevier
record_format Article
series Data in Brief
spelling doaj-art-ce190d3d6e0c4af1bf10db74a5b2329e
2025-08-20T02:31:05Z
eng
Elsevier
Data in Brief
2352-3409
2025-06-01
60
111475
10.1016/j.dib.2025.111475
Swahili questions and answers dataset for aflatoxin knowledge domain (Mendeley Data)
Pamela Chogo (corresponding author), Nelson Mandela African Institution of Science and Technology, P.O. Box 447, Tengeru, Arusha, Tanzania
Elizabeth Mkoba, Nelson Mandela African Institution of Science and Technology, P.O. Box 447, Tengeru, Arusha, Tanzania
Neema Kassim, Nelson Mandela African Institution of Science and Technology, P.O. Box 447, Tengeru, Arusha, Tanzania
http://www.sciencedirect.com/science/article/pii/S2352340925002070
title Swahili questions and answers dataset for aflatoxin knowledge domain (Mendeley Data)
topic Natural language processing
NLP-based chatbot
Knowledge sharing
Food security
Aflatoxin dataset
url http://www.sciencedirect.com/science/article/pii/S2352340925002070