Machine learning in psychiatric health records: A gold standard approach to trauma annotation

Bibliographic Details
Main Authors: Eben Holderness, Bruce Atwood, Marc Verhagen, Ann K. Shinn, Philip Cawkwell, Hudson Cerruti, James Pustejovsky, Mei-Hua Hall
Format: Article
Language: English
Published: Nature Publishing Group 2025-08-01
Series: Translational Psychiatry
Online Access: https://doi.org/10.1038/s41398-025-03487-0
Description
Summary: Psychiatric electronic health records present unique challenges for machine learning due to their unstructured, complex, and variable nature. This study had three aims: to identify a cohort of patients with psychotic disorders and posttraumatic stress disorder (PTSD), to develop clinically informed guidelines for annotating traumatic events in their health records and create a publicly available gold standard dataset, and to demonstrate the dataset’s suitability for training machine learning models to detect indicators of symptoms, substance use, and trauma in new records. We compiled a representative corpus of 200 narrative-heavy health records (470,489 tokens) from a centralized database and developed a detailed annotation scheme with a team of clinical experts and computational linguists. Clinicians annotated the corpus for trauma-related events and relevant clinical information with high inter-annotator agreement (0.715 for entity/span tags and 0.874 for attributes). Additionally, machine learning models were developed to demonstrate the practical viability of the gold standard corpus for machine learning applications, achieving micro F1 scores of 0.76 for spans and 0.82 for attributes, indicative of their predictive reliability. This study established the first gold standard dataset for the complex task of labelling traumatic features in psychiatric health records. High inter-annotator agreement and model performance illustrate its utility in advancing the application of machine learning in psychiatric healthcare in order to better understand disease heterogeneity and treatment implications.
ISSN:2158-3188