Automated classification of online reviews of otolaryngologists


Bibliographic Details
Main Authors: Jake G. Stenzel, Nicholas R. Schultz, Michael J. Marino
Format: Article
Language:English
Published: Wiley 2024-12-01
Series:Laryngoscope Investigative Otolaryngology
Online Access:https://doi.org/10.1002/lio2.70036
Description
Summary: Objectives: The study aimed to extract online comments on otolaryngologists in the 20 most populated cities in the United States from healthgrades.com, to develop and validate a natural language processing (NLP) logistic regression algorithm for automated text classification of reviews into 10 categories, and to compare 1‐ and 5‐star reviews in directly‐physician‐related and non‐physician‐related categories.

Methods: A total of 1,977 1‐star and 12,682 5‐star reviews were collected. The primary investigator manually categorized a training dataset of 324 1‐star and 909 5‐star reviews, while a validation subset of 100 5‐star and 50 1‐star reviews underwent dual manual categorization. Using scikit‐learn, an NLP algorithm was trained and validated on these subsets, with F1 scores evaluating text classification accuracy against manual categorization. The algorithm was then applied to the entire dataset, and review categorization was compared between 1‐ and 5‐star reviews.

Results: F1 scores for NLP validation ranged between 0.71 and 0.97. Significant associations emerged between 1‐star reviews and treatment plan, accessibility, wait time, office scheduling, billing, and facilities. 5‐star reviews were associated with surgery/procedure, bedside manner, and staff/mid‐levels.

Conclusion: The study successfully validated an NLP text classification system for categorizing online physician reviews. Positive reviews were associated with directly‐physician‐related content, while 1‐star reviews related to treatment plan, accessibility, wait time, office scheduling, billing, and facilities. This method of text classification effectively discerned the nuances of human‐written text, providing scalable insights into online healthcare feedback.

Level of evidence: Level 3
ISSN:2378-8038
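The abstract describes a scikit‐learn logistic regression classifier, validated with F1 scores, for sorting review text into categories. A minimal sketch of that kind of pipeline is below; the toy reviews, the TF‐IDF featurization, and the two category labels ("wait time", "bedside manner") are illustrative assumptions, since the record does not specify the study's exact features or hyperparameters.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score
from sklearn.pipeline import Pipeline

# Toy training reviews labeled with two of the study's ten categories.
# These are placeholders, not data from the paper.
train_texts = [
    "waited two hours past my appointment time",
    "the wait was extremely long and nobody apologized",
    "sat in the waiting room forever before being seen",
    "the doctor was kind, patient, and listened carefully",
    "wonderful bedside manner, explained everything clearly",
    "she took time to answer all of my questions warmly",
]
train_labels = ["wait time"] * 3 + ["bedside manner"] * 3

# TF-IDF features feeding a logistic regression classifier
# (one plausible scikit-learn setup for text classification).
clf = Pipeline([
    ("tfidf", TfidfVectorizer()),
    ("logreg", LogisticRegression(max_iter=1000)),
])
clf.fit(train_texts, train_labels)

# Validate against manually categorized reviews, scoring with F1
# as the abstract describes.
val_texts = [
    "an hour-long wait before anyone called my name",
    "compassionate physician with a great bedside manner",
]
val_labels = ["wait time", "bedside manner"]
preds = clf.predict(val_texts)
print(f1_score(val_labels, preds, average="macro"))
```

In a multi-category setting like the study's, a per-category F1 (e.g. `f1_score(..., average=None)`) would yield the kind of per-class range (0.71 to 0.97) the abstract reports.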