LLM-Based Doppelgänger Models: Leveraging Synthetic Data for Human-Like Responses in Survey Simulations

This study explores whether large language models (LLMs) can learn a person’s opinions from their speech and act based on that knowledge. It also proposes the potential for utilizing such trained models in survey research. Traditional survey research collects information through standardi...

Full description

Saved in:

Bibliographic Details
Main Authors:	Suhyun Cho, Jaeyun Kim, Jang Hyun Kim
Format:	Article
Language:	English
Published:	IEEE 2024-01-01
Series:	IEEE Access
Subjects:	LLM survey research NLP NLU synthetic data
Online Access:	https://ieeexplore.ieee.org/document/10758652/
Tags:	Add Tag No Tags, Be the first to tag this record!

Description
Summary:	This study explores whether large language models (LLMs) can learn a person’s opinions from their speech and act based on that knowledge. It also proposes the potential for utilizing such trained models in survey research. Traditional survey research collects information through standardized questions. However, surveys require repeated administration with new participants each time, which involves significant costs and time. With the recent advancements in LLMs, artificial intelligence (AI) has shown remarkable capabilities, often surpassing humans in tasks that require natural language understanding (NLU) and natural language generation (NLG). Despite this, research on whether AI can replicate human thought processes in tasks such as text interpretation or question-answering remains insufficient. This study proposes a Surveyed LLM, specialized for survey tasks, and a Doppelganger LLM that mimics human thought processes. It tests to what extent the Doppelganger model can replicate human judgment. Furthermore, it suggests the possibility of mimicking not only group distributions but also individual opinions.
ISSN:	2169-3536

LLM-Based Doppelg&#x00E4;nger Models: Leveraging Synthetic Data for Human-Like Responses in Survey Simulations

Similar Items

LLM-Based Doppelgänger Models: Leveraging Synthetic Data for Human-Like Responses in Survey Simulations