Classifying the Information Needs of Survivors of Domestic Violence in Online Health Communities Using Large Language Models: Prediction Model Development and Evaluation Study

Background: Domestic violence (DV) is a significant public health concern affecting the physical and mental well-being of numerous women, imposing a substantial health care burden. However, women facing DV often encounter barriers to seeking in-person help due to stigma, shame,...

Bibliographic Details
Main Authors: Shaowei Guan, Vivian Hui, Gregor Stiglic, Rose Eva Constantino, Young Ji Lee, Arkers Kwan Ching Wong
Format: Article
Language: English
Published: JMIR Publications 2025-05-01
Series: Journal of Medical Internet Research
Online Access: https://www.jmir.org/2025/1/e65397
collection DOAJ
description Background: Domestic violence (DV) is a significant public health concern affecting the physical and mental well-being of numerous women, imposing a substantial health care burden. However, women facing DV often encounter barriers to seeking in-person help due to stigma, shame, and embarrassment. As a result, many survivors of DV turn to online health communities as a safe and anonymous space to share their experiences and seek support. Understanding the information needs of survivors of DV in online health communities through multiclass classification is crucial for providing timely and appropriate support.
Objective: The objective was to develop a fine-tuned large language model (LLM) that can provide fast and accurate predictions of the information needs of survivors of DV from their online posts, enabling health care professionals to offer timely and personalized assistance.
Methods: We collected 294 posts from Reddit subcommunities focused on DV shared by women aged ≥18 years who self-identified as experiencing intimate partner violence. We identified 8 types of information needs: shelters/DV centers/agencies; legal; childbearing; police; DV report procedure/documentation; safety planning; DV knowledge; and communication. Data augmentation was applied using GPT-3.5 to expand our dataset to 2216 samples by generating 1922 additional posts that imitated the existing data. We adopted a progressive training strategy to fine-tune GPT-3.5 for multiclass text classification using 2032 posts. We trained the model on 1 class at a time, monitoring performance closely. When suboptimal results were observed, we generated additional samples of the misclassified ones to give them more attention. We reserved 184 posts for internal testing and 74 for external validation. Model performance was evaluated using accuracy, recall, precision, and F1-score, along with CIs for each metric.
Results: Using 40 real posts and 144 artificial intelligence–generated posts as the test dataset, our model achieved an F1-score of 70.49% (95% CI 60.63%-80.35%) for real posts, outperforming the original GPT-3.5 and GPT-4, fine-tuned Llama 2-7B and Llama 3-8B, and long short-term memory. On artificial intelligence–generated posts, our model attained an F1-score of 84.58% (95% CI 80.38%-88.78%), surpassing all baselines. When tested on an external validation dataset (n=74), the model achieved an F1-score of 59.67% (95% CI 51.86%-67.49%), outperforming other models. Statistical analysis revealed that our model significantly outperformed the others in F1-score (P=.047 for real posts; P<.001 for external validation posts). Furthermore, our model was faster, taking 19.108 seconds for predictions versus 1150 seconds for manual assessment.
Conclusions: Our fine-tuned LLM can accurately and efficiently extract and identify DV-related information needs through multiclass classification from online posts. In addition, we used LLM-based data augmentation techniques to overcome the limitations of a relatively small and imbalanced dataset. By generating timely and accurate predictions, we can empower health care professionals to provide rapid and suitable assistance to survivors of DV.
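Note: the sample counts, timing figures, and F1-scores with 95% CIs quoted in the description can be illustrated with a short sketch. The Python below is not the authors' code. It checks the dataset arithmetic stated in the abstract, computes the implied speedup over manual assessment, and shows one common way to obtain an F1-score with a confidence interval. The macro averaging, the percentile bootstrap, the function names, and the toy labels are all assumptions made for illustration; the abstract does not say which averaging or CI method was used.

```python
import random

# Figures quoted in the abstract.
REAL_POSTS = 294
GENERATED_POSTS = 1922
TOTAL_SAMPLES = 2216
TRAIN_POSTS = 2032
TEST_POSTS = 184  # 40 real + 144 generated posts
MODEL_SECONDS = 19.108
MANUAL_SECONDS = 1150


def speedup(manual_s, model_s):
    """Ratio of manual assessment time to model prediction time."""
    return manual_s / model_s


def macro_f1(y_true, y_pred, labels):
    """Unweighted mean of per-class F1 (macro averaging is an assumption)."""
    scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        denom = 2 * tp + fp + fn
        # A class absent from both truth and predictions scores 0.0 here.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


def bootstrap_ci(y_true, y_pred, labels, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for macro F1 (hypothetical choice of method)."""
    rng = random.Random(seed)
    n = len(y_true)
    stats = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]  # resample with replacement
        stats.append(
            macro_f1([y_true[i] for i in idx], [y_pred[i] for i in idx], labels)
        )
    stats.sort()
    lo = stats[int((alpha / 2) * n_boot)]
    hi = stats[min(n_boot - 1, int((1 - alpha / 2) * n_boot))]
    return lo, hi


if __name__ == "__main__":
    # Toy labels only: not data from the study.
    labels = ["legal", "police"]
    y_true = ["legal", "legal", "police", "police"]
    y_pred = ["legal", "police", "police", "police"]
    print("speedup:", round(speedup(MANUAL_SECONDS, MODEL_SECONDS), 1))
    print("macro F1:", round(macro_f1(y_true, y_pred, labels), 4))
    print("95% CI:", bootstrap_ci(y_true, y_pred, labels))
```

The counts are internally consistent: 294 + 1922 = 2216 samples, split into 2032 for training and 184 for internal testing, with the 74 external validation posts counted separately. The quoted timings imply the model was roughly 60 times faster than manual assessment.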
id doaj-art-5ea25355d5bd421ba13b1dc4ae7d8aff
institution OA Journals
issn 1438-8871
spelling doaj-art-5ea25355d5bd421ba13b1dc4ae7d8aff
2025-08-20T02:15:40Z
Journal of Medical Internet Research 27, e65397 (2025-05-01)
doi:10.2196/65397
Shaowei Guan https://orcid.org/0009-0009-4434-1337
Vivian Hui https://orcid.org/0000-0003-1966-6139
Gregor Stiglic https://orcid.org/0000-0002-0183-8679
Rose Eva Constantino https://orcid.org/0000-0003-0206-2160
Young Ji Lee https://orcid.org/0000-0001-6359-4721
Arkers Kwan Ching Wong https://orcid.org/0000-0001-6708-3099