Bilingual Dialogue Dataset with Personality and Emotion Annotations for Personality Recognition in Education

Bibliographic Details
Main Authors: Zhi Liu, Yao Xiao, Zhu Su, Luyao Ye, Kaili Lu, Xian Peng
Format: Article
Language: English
Published: Nature Portfolio 2025-03-01
Series: Scientific Data
Online Access: https://doi.org/10.1038/s41597-025-04836-w
collection DOAJ
description Abstract: Dialogue datasets are essential for advancing natural language processing (NLP) tasks. However, many existing datasets lack integrated annotations for personality and emotion, which limits models' ability to capture these aspects and to generate personalized, human-like dialogues, and ultimately degrades user experience. To address this challenge, we construct bilingual dialogue datasets in Chinese and English that incorporate Big Five personality traits and emotion annotations. We use the AutoGen tool within a multi-agent framework to generate multi-turn question-answering dialogue datasets based on fables. By creating persona agents with diverse personalities, we increase the heterogeneity of personalities and overcome previous limitations in personality diversity. Finally, we validate the utterance quality in the dataset and investigate the alignment between conversational utterances and speakers' personality traits. Moreover, because it integrates emotion annotations for each utterance, this dataset offers significant potential for developing emotion-aware systems that automatically detect personality traits. It serves as a valuable resource for advancing emotionally intelligent dialogue systems and research in personality and affective computing.
id doaj-art-5d7cc52b42c84c48b52bf55944413052
institution Kabale University
issn 2052-4463
citation Scientific Data, volume 12, issue 1, pages 1-13
affiliation Zhi Liu, Yao Xiao, Zhu Su, Luyao Ye, Xian Peng: Faculty of Artificial Intelligence in Education, Central China Normal University
Kaili Lu: College of Education Science and Technology, Nanjing University of Posts and Telecommunications