LLM-Based Doppelgänger Models: Leveraging Synthetic Data for Human-Like Responses in Survey Simulations
This study explores whether large language models (LLMs) can learn a person’s opinions from their speech and act based on that knowledge. It also proposes the potential for utilizing such trained models in survey research. Traditional survey research collects information through standardi...
| Main Authors: | Suhyun Cho, Jaeyun Kim, Jang Hyun Kim |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | IEEE, 2024-01-01 |
| Series: | IEEE Access |
| Subjects: | LLM, survey research, NLP, NLU, synthetic data |
| Online Access: | https://ieeexplore.ieee.org/document/10758652/ |
| _version_ | 1850263626560045056 |
|---|---|
| author | Suhyun Cho; Jaeyun Kim; Jang Hyun Kim |
| author_facet | Suhyun Cho; Jaeyun Kim; Jang Hyun Kim |
| author_sort | Suhyun Cho |
| collection | DOAJ |
| description | This study explores whether large language models (LLMs) can learn a person’s opinions from their speech and act based on that knowledge. It also proposes the potential for utilizing such trained models in survey research. Traditional survey research collects information through standardized questions. However, surveys require repeated administration with new participants each time, which involves significant costs and time. With the recent advancements in LLMs, artificial intelligence (AI) has shown remarkable capabilities, often surpassing humans in tasks that require natural language understanding (NLU) and natural language generation (NLG). Despite this, research on whether AI can replicate human thought processes in tasks such as text interpretation or question-answering remains insufficient. This study proposes a Surveyed LLM, specialized for survey tasks, and a Doppelganger LLM that mimics human thought processes. It tests to what extent the Doppelganger model can replicate human judgment. Furthermore, it suggests the possibility of mimicking not only group distributions but also individual opinions. |
| format | Article |
| id | doaj-art-7a7b731ceaa34c1f85eb35e00635e38b |
| institution | OA Journals |
| issn | 2169-3536 |
| language | English |
| publishDate | 2024-01-01 |
| publisher | IEEE |
| record_format | Article |
| series | IEEE Access |
| spelling | doaj-art-7a7b731ceaa34c1f85eb35e00635e38b; 2025-08-20T01:54:55Z; eng; IEEE; IEEE Access; 2169-3536; 2024-01-01; 12; 178917; 178927; 10.1109/ACCESS.2024.3502219; 10758652; LLM-Based Doppelgänger Models: Leveraging Synthetic Data for Human-Like Responses in Survey Simulations; Suhyun Cho [0] https://orcid.org/0000-0002-9410-8017; Jaeyun Kim [1]; Jang Hyun Kim [2] https://orcid.org/0000-0001-7750-2664; [0] Department of Applied Artificial Intelligence, Sungkyunkwan University, Seoul, Republic of Korea; [1] AI Model Development, Dareesoft, Seongnam-si, Republic of Korea; [2] Department of Applied Artificial Intelligence, Sungkyunkwan University, Seoul, Republic of Korea; This study explores whether large language models (LLMs) can learn a person’s opinions from their speech and act based on that knowledge. It also proposes the potential for utilizing such trained models in survey research. Traditional survey research collects information through standardized questions. However, surveys require repeated administration with new participants each time, which involves significant costs and time. With the recent advancements in LLMs, artificial intelligence (AI) has shown remarkable capabilities, often surpassing humans in tasks that require natural language understanding (NLU) and natural language generation (NLG). Despite this, research on whether AI can replicate human thought processes in tasks such as text interpretation or question-answering remains insufficient. This study proposes a Surveyed LLM, specialized for survey tasks, and a Doppelganger LLM that mimics human thought processes. It tests to what extent the Doppelganger model can replicate human judgment. Furthermore, it suggests the possibility of mimicking not only group distributions but also individual opinions.; https://ieeexplore.ieee.org/document/10758652/; LLM; survey research; NLP; NLU; synthetic data |
| spellingShingle | Suhyun Cho; Jaeyun Kim; Jang Hyun Kim; LLM-Based Doppelgänger Models: Leveraging Synthetic Data for Human-Like Responses in Survey Simulations; IEEE Access; LLM; survey research; NLP; NLU; synthetic data |
| title | LLM-Based Doppelgänger Models: Leveraging Synthetic Data for Human-Like Responses in Survey Simulations |
| title_full | LLM-Based Doppelgänger Models: Leveraging Synthetic Data for Human-Like Responses in Survey Simulations |
| title_fullStr | LLM-Based Doppelgänger Models: Leveraging Synthetic Data for Human-Like Responses in Survey Simulations |
| title_full_unstemmed | LLM-Based Doppelgänger Models: Leveraging Synthetic Data for Human-Like Responses in Survey Simulations |
| title_short | LLM-Based Doppelgänger Models: Leveraging Synthetic Data for Human-Like Responses in Survey Simulations |
| title_sort | llm based doppelgänger models leveraging synthetic data for human like responses in survey simulations |
| topic | LLM; survey research; NLP; NLU; synthetic data |
| url | https://ieeexplore.ieee.org/document/10758652/ |
| work_keys_str_mv | AT suhyuncho llmbaseddoppelgx00e4ngermodelsleveragingsyntheticdataforhumanlikeresponsesinsurveysimulations AT jaeyunkim llmbaseddoppelgx00e4ngermodelsleveragingsyntheticdataforhumanlikeresponsesinsurveysimulations AT janghyunkim llmbaseddoppelgx00e4ngermodelsleveragingsyntheticdataforhumanlikeresponsesinsurveysimulations |