Comparative Evaluation of Artificial Intelligence Models for Contraceptive Counseling

Background: As digital health resources become increasingly prevalent, assessing the quality of information provided by publicly available AI tools is vital for evidence-based patient education. Objective: This study evaluates the accuracy and readability of responses from four large language models...

Bibliographic Details
Main Authors:	Anisha V. Patel, Sona Jasani, Abdelrahman AlAshqar, Rushabh H. Doshi, Kanhai Amin, Aisvarya Panakam, Ankita Patil, Sangini S. Sheth
Format:	Article
Language:	English
Published:	MDPI AG 2025-03-01
Series:	Digital
Subjects:	contraception; contraceptive counseling; reproductive health; artificial intelligence; large language models; digital health
Online Access:	https://www.mdpi.com/2673-6470/5/2/10

author	Anisha V. Patel; Sona Jasani; Abdelrahman AlAshqar; Rushabh H. Doshi; Kanhai Amin; Aisvarya Panakam; Ankita Patil; Sangini S. Sheth
author_sort	Anisha V. Patel
collection	DOAJ
description	Background: As digital health resources become increasingly prevalent, assessing the quality of information provided by publicly available AI tools is vital for evidence-based patient education. Objective: This study evaluates the accuracy and readability of responses from four large language models (ChatGPT 4.0, ChatGPT 3.5, Google Bard, and Microsoft Bing) in providing contraceptive counseling. Methods: A cross-sectional analysis was conducted using standardized contraception questions, established readability indices, and a panel of blinded OB/GYN physician reviewers comparing model responses to an AAFP benchmark. Results: The models varied in readability and evidence adherence; notably, ChatGPT 3.5 provided more evidence-based responses than ChatGPT 4.0, although all outputs exceeded the recommended 6th-grade reading level. Conclusions: Our findings underscore the need for further refinement of LLMs to balance clinical accuracy with patient-friendly language, supporting their role as a supplement to clinician counseling.
format	Article
id	doaj-art-81cbac19625f42ae9cfa8a2559da4401
institution	Kabale University
issn	2673-6470
language	English
publishDate	2025-03-01
publisher	MDPI AG
record_format	Article
series	Digital
spelling	doaj-art-81cbac19625f42ae9cfa8a2559da4401; 2025-08-20T03:27:18Z; eng; MDPI AG; Digital; ISSN 2673-6470; 2025-03-01; vol. 5, no. 2, art. 10; DOI 10.3390/digital5020010; https://www.mdpi.com/2673-6470/5/2/10
	Comparative Evaluation of Artificial Intelligence Models for Contraceptive Counseling
	Anisha V. Patel (Department of Obstetrics, Gynecology, and Reproductive Sciences, Yale School of Medicine, New Haven, CT 06510, USA)
	Sona Jasani (Department of Obstetrics, Gynecology, and Reproductive Sciences, Yale School of Medicine, New Haven, CT 06510, USA)
	Abdelrahman AlAshqar (Department of Obstetrics, Gynecology, and Reproductive Sciences, Yale School of Medicine, New Haven, CT 06510, USA)
	Rushabh H. Doshi (Department of Internal Medicine, Yale School of Medicine, New Haven, CT 06510, USA)
	Kanhai Amin (Department of Molecular, Cellular, and Developmental Biology, Yale University, New Haven, CT 06520, USA)
	Aisvarya Panakam (Department of Obstetrics and Gynecology, University of Pittsburgh Medical Center, Pittsburgh, PA 15219, USA)
	Ankita Patil (Department of Medicine, Division of Women’s Health, Brigham and Women’s Hospital, Boston, MA 02115, USA)
	Sangini S. Sheth (Department of Obstetrics, Gynecology, and Reproductive Sciences, Yale School of Medicine, New Haven, CT 06510, USA)
title	Comparative Evaluation of Artificial Intelligence Models for Contraceptive Counseling
topic	contraception; contraceptive counseling; reproductive health; artificial intelligence; large language models; digital health
url	https://www.mdpi.com/2673-6470/5/2/10
