Can large language models be used to code text for thematic analysis? An explorative study

Bibliographic Details
Main Authors: Zhiyong Han, Aaron Tavasi, JuYoung Lee, Joshua Luzuriaga, Kevin Suresh, Michael Oppenheim, Fortunato Battaglia, Stanley R. Terlecky
Format: Article
Language: English
Published: Springer 2025-07-01
Series: Discover Artificial Intelligence
Online Access: https://doi.org/10.1007/s44163-025-00441-3
Description
Summary: In practice, thematic analysis of text involves six stages, among which text coding is particularly cognitively demanding, labor-intensive, and time-consuming. This study investigates and compares the potential of two large language models (LLMs), namely ChatGPT-4 and OpenAI o1-preview, to perform text coding, with the goal of reducing the time and effort required by human researchers. Our results indicate that both models exhibit decreased coding comprehensiveness as document length increases, and both demonstrate low coding accuracy, primarily due to limitations in textual comprehension and reasoning. These findings highlight significant challenges in using LLMs to support thematic analysis, emphasizing the need for human oversight and rigorous validation to ensure analytic accuracy and validity.
ISSN: 2731-0809