Bilingual Dialogue Dataset with Personality and Emotion Annotations for Personality Recognition in Education
Abstract Dialogue datasets are essential for advancing natural language processing (NLP) tasks. However, many existing datasets lack integrated annotations for personality and emotion, limiting models’ ability to effectively capture these aspects and generate personalized, human-like dialogues, whic...
| Main Authors: | Zhi Liu, Yao Xiao, Zhu Su, Luyao Ye, Kaili Lu, Xian Peng |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | Nature Portfolio, 2025-03-01 |
| Series: | Scientific Data |
| Online Access: | https://doi.org/10.1038/s41597-025-04836-w |
| _version_ | 1849226645711355904 |
|---|---|
| author | Zhi Liu Yao Xiao Zhu Su Luyao Ye Kaili Lu Xian Peng |
| author_facet | Zhi Liu Yao Xiao Zhu Su Luyao Ye Kaili Lu Xian Peng |
| author_sort | Zhi Liu |
| collection | DOAJ |
| description | Abstract: Dialogue datasets are essential for advancing natural language processing (NLP) tasks. However, many existing datasets lack integrated annotations for personality and emotion, limiting models’ ability to effectively capture these aspects and generate personalized, human-like dialogues, which ultimately impacts user experience. To address this challenge, we construct bilingual dialogue datasets in Chinese and English, incorporating Big Five personality traits and emotion annotations. We utilize the AutoGen tool within a multi-agent framework to generate multi-turn question-answering dialogue datasets based on fables. By creating persona agents with diverse personalities, we effectively enhance the heterogeneity of personalities, overcoming previous limitations in personality diversity. Finally, we validate the utterance quality in the dataset and investigate the alignment between conversational utterances and speakers’ personality traits. Moreover, by integrating emotional annotations for each utterance, this dataset offers significant potential for developing emotion-aware systems that automatically detect personality traits. It serves as a valuable resource for advancing emotionally intelligent dialogue systems and research in personality and affective computing. |
| format | Article |
| id | doaj-art-5d7cc52b42c84c48b52bf55944413052 |
| institution | Kabale University |
| issn | 2052-4463 |
| language | English |
| publishDate | 2025-03-01 |
| publisher | Nature Portfolio |
| record_format | Article |
| series | Scientific Data |
| spelling | doaj-art-5d7cc52b42c84c48b52bf55944413052; 2025-08-24T11:07:33Z; eng; Nature Portfolio; Scientific Data; 2052-4463; 2025-03-01; vol. 12, no. 1, pp. 1-13; 10.1038/s41597-025-04836-w; Bilingual Dialogue Dataset with Personality and Emotion Annotations for Personality Recognition in Education; Zhi Liu (Faculty of Artificial Intelligence in Education, Central China Normal University); Yao Xiao (Faculty of Artificial Intelligence in Education, Central China Normal University); Zhu Su (Faculty of Artificial Intelligence in Education, Central China Normal University); Luyao Ye (Faculty of Artificial Intelligence in Education, Central China Normal University); Kaili Lu (College of Education Science and Technology, Nanjing University of Posts and Telecommunications); Xian Peng (Faculty of Artificial Intelligence in Education, Central China Normal University); Abstract: Dialogue datasets are essential for advancing natural language processing (NLP) tasks. However, many existing datasets lack integrated annotations for personality and emotion, limiting models’ ability to effectively capture these aspects and generate personalized, human-like dialogues, which ultimately impacts user experience. To address this challenge, we construct bilingual dialogue datasets in Chinese and English, incorporating Big Five personality traits and emotion annotations. We utilize the AutoGen tool within a multi-agent framework to generate multi-turn question-answering dialogue datasets based on fables. By creating persona agents with diverse personalities, we effectively enhance the heterogeneity of personalities, overcoming previous limitations in personality diversity. Finally, we validate the utterance quality in the dataset and investigate the alignment between conversational utterances and speakers’ personality traits. Moreover, by integrating emotional annotations for each utterance, this dataset offers significant potential for developing emotion-aware systems that automatically detect personality traits. It serves as a valuable resource for advancing emotionally intelligent dialogue systems and research in personality and affective computing.; https://doi.org/10.1038/s41597-025-04836-w |
| spellingShingle | Zhi Liu Yao Xiao Zhu Su Luyao Ye Kaili Lu Xian Peng Bilingual Dialogue Dataset with Personality and Emotion Annotations for Personality Recognition in Education Scientific Data |
| title | Bilingual Dialogue Dataset with Personality and Emotion Annotations for Personality Recognition in Education |
| title_full | Bilingual Dialogue Dataset with Personality and Emotion Annotations for Personality Recognition in Education |
| title_fullStr | Bilingual Dialogue Dataset with Personality and Emotion Annotations for Personality Recognition in Education |
| title_full_unstemmed | Bilingual Dialogue Dataset with Personality and Emotion Annotations for Personality Recognition in Education |
| title_short | Bilingual Dialogue Dataset with Personality and Emotion Annotations for Personality Recognition in Education |
| title_sort | bilingual dialogue dataset with personality and emotion annotations for personality recognition in education |
| url | https://doi.org/10.1038/s41597-025-04836-w |