Evaluating AI-based breastfeeding chatbots: quality, readability, and reliability analysis.

Bibliographic Details
Main Author: Emine Ozdemir Kacer
Format: Article
Language: English
Published: Public Library of Science (PLoS) 2025-01-01
Series: PLoS ONE
Online Access: https://doi.org/10.1371/journal.pone.0319782
_version_ 1850265705690169344
author Emine Ozdemir Kacer
author_facet Emine Ozdemir Kacer
author_sort Emine Ozdemir Kacer
collection DOAJ
description <h4>Background</h4>In recent years, expectant and breastfeeding mothers have commonly used breastfeeding-related social media applications and websites to seek information. At the same time, AI-based chatbots such as ChatGPT, Gemini, and Copilot have become increasingly prevalent on these platforms (or on dedicated websites), providing automated, user-oriented breastfeeding guidance.<h4>Aim</h4>This study compares the performance of three AI-based chatbots (ChatGPT, Gemini, and Copilot) by evaluating the quality, reliability, readability, and similarity of the breastfeeding information they provide.<h4>Methods</h4>Two researchers evaluated the information provided by three AI-based breastfeeding chatbots: ChatGPT version 3.5, Gemini, and Copilot. A total of 50 frequently asked questions about breastfeeding were identified and divided into two categories (Baby-Centered Questions and Mother-Centered Questions). The chatbots' answers were evaluated using five scoring criteria: the Ensuring Quality Information for Patients (EQIP) scale, the Simple Measure of Gobbledygook (SMOG) scale, the Similarity Index (SI), the Modified Dependability Scoring System (mDISCERN), and the Global Quality Scale (GQS).<h4>Results</h4>The chatbots' answers differed significantly across all criteria (p < 0.05). Copilot scored highest on the EQIP, SMOG, and SI scales, while Gemini scored highest on mDISCERN and GQS. No significant difference was found between Copilot and Gemini for mDISCERN and GQS scores. All three chatbots demonstrated high reliability and quality, though reading their answers required a university-level education. Notably, ChatGPT displayed high originality, while Copilot exhibited the greatest similarity in responses.<h4>Conclusion</h4>AI chatbots provide reliable answers to breastfeeding questions, but the information can be hard to understand. Although the chatbots are more reliable than other online sources, their accuracy and usability remain in question. Further research is necessary to facilitate the integration of advanced AI in healthcare.
format Article
id doaj-art-16f911707d6f4d73a4be613ba3f97196
institution OA Journals
issn 1932-6203
language English
publishDate 2025-01-01
publisher Public Library of Science (PLoS)
record_format Article
series PLoS ONE
spelling doaj-art-16f911707d6f4d73a4be613ba3f97196 2025-08-20T01:54:21Z eng Public Library of Science (PLoS) PLoS ONE 1932-6203 2025-01-01 20 3 e0319782 10.1371/journal.pone.0319782 Evaluating AI-based breastfeeding chatbots: quality, readability, and reliability analysis. Emine Ozdemir Kacer https://doi.org/10.1371/journal.pone.0319782
title Evaluating AI-based breastfeeding chatbots: quality, readability, and reliability analysis.
title_sort evaluating ai based breastfeeding chatbots quality readability and reliability analysis
url https://doi.org/10.1371/journal.pone.0319782
work_keys_str_mv AT emineozdemirkacer evaluatingaibasedbreastfeedingchatbotsqualityreadabilityandreliabilityanalysis