Can large language models be used to code text for thematic analysis? An explorative study
Abstract In practice, thematic analysis of text involves six stages, among which text coding is particularly cognitively demanding, labor-intensive, and time-consuming. This study investigates and compares the potential of two large language models (LLMs), namely ChatGPT-4 and OpenAI o1-preview, to...
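| Main Authors: | Zhiyong Han, Aaron Tavasi, JuYoung Lee, Joshua Luzuriaga, Kevin Suresh, Michael Oppenheim, Fortunato Battaglia, Stanley R. Terlecky |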
|---|---|
| Format: | Article |
| Language: | English |
| Published: | Springer, 2025-07-01 |
| Series: | Discover Artificial Intelligence |
| Subjects: | ChatGPT; OpenAI o1-preview; Text coding; Thematic analysis; Comprehension; Reasoning |
| Online Access: | https://doi.org/10.1007/s44163-025-00441-3 |
| _version_ | 1849332384343785472 |
|---|---|
| author | Zhiyong Han; Aaron Tavasi; JuYoung Lee; Joshua Luzuriaga; Kevin Suresh; Michael Oppenheim; Fortunato Battaglia; Stanley R. Terlecky |
| author_sort | Zhiyong Han |
| collection | DOAJ |
| description | Abstract In practice, thematic analysis of text involves six stages, among which text coding is particularly cognitively demanding, labor-intensive, and time-consuming. This study investigates and compares the potential of two large language models (LLMs), namely ChatGPT-4 and OpenAI o1-preview, to perform text coding, with the goal of reducing the time and effort required by human researchers. Our results indicate that both models exhibit decreased coding comprehensiveness as document length increases, and both demonstrate low coding accuracy, primarily due to limitations in textual comprehension and reasoning. These findings highlight significant challenges in using LLMs to support thematic analysis, emphasizing the need for human oversight and rigorous validation to ensure analytic accuracy and validity. |
| format | Article |
| id | doaj-art-980eedb1c2254190bbcd8ea3a9464e72 |
| institution | Kabale University |
| issn | 2731-0809 |
| language | English |
| publishDate | 2025-07-01 |
| publisher | Springer |
| record_format | Article |
| series | Discover Artificial Intelligence |
| spelling | doaj-art-980eedb1c2254190bbcd8ea3a9464e72; 2025-08-20T03:46:12Z; eng; Springer; Discover Artificial Intelligence; 2731-0809; 2025-07-01; 51117; 10.1007/s44163-025-00441-3; Can large language models be used to code text for thematic analysis? An explorative study; Zhiyong Han, Aaron Tavasi, JuYoung Lee, Joshua Luzuriaga, Kevin Suresh, Michael Oppenheim, Fortunato Battaglia, Stanley R. Terlecky (all: Department of Medical Sciences, Hackensack Meridian School of Medicine); https://doi.org/10.1007/s44163-025-00441-3; ChatGPT; OpenAI o1-preview; Text coding; Thematic analysis; Comprehension; Reasoning |
| title | Can large language models be used to code text for thematic analysis? An explorative study |
| topic | ChatGPT; OpenAI o1-preview; Text coding; Thematic analysis; Comprehension; Reasoning |
| url | https://doi.org/10.1007/s44163-025-00441-3 |