Evaluating AI-based breastfeeding chatbots: quality, readability, and reliability analysis.
<h4>Background</h4>In recent years, expectant and breastfeeding mothers have commonly used various social media applications and websites to seek breastfeeding-related information. At the same time, AI-based chatbots, such as ChatGPT, Gemini, and Copilot, have become increa...
| Main Author: | Emine Ozdemir Kacer |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | Public Library of Science (PLoS), 2025-01-01 |
| Series: | PLoS ONE |
| Online Access: | https://doi.org/10.1371/journal.pone.0319782 |
| _version_ | 1850265705690169344 |
|---|---|
| author | Emine Ozdemir Kacer |
| author_facet | Emine Ozdemir Kacer |
| author_sort | Emine Ozdemir Kacer |
| collection | DOAJ |
| description | <h4>Background</h4>In recent years, expectant and breastfeeding mothers have commonly used various social media applications and websites to seek breastfeeding-related information. At the same time, AI-based chatbots, such as ChatGPT, Gemini, and Copilot, have become increasingly prevalent on these platforms (or on dedicated websites), providing automated, user-oriented breastfeeding guidance.<h4>Aim</h4>The goal of our study is to compare the performance of three AI-based chatbots (ChatGPT, Gemini, and Copilot) by evaluating the quality, reliability, readability, and similarity of the breastfeeding information they provide.<h4>Methods</h4>Two researchers evaluated the information provided by three AI-based chatbots: ChatGPT version 3.5, Gemini, and Copilot. A total of 50 frequently asked questions about breastfeeding were identified and divided into two categories (Baby-Centered Questions and Mother-Centered Questions). The responses were evaluated using five scoring criteria: the Ensuring Quality Information for Patients (EQIP) scale, the Simple Measure of Gobbledygook (SMOG) scale, the Similarity Index (SI), the Modified Dependability Scoring System (mDISCERN), and the Global Quality Scale (GQS).<h4>Results</h4>The evaluation of the chatbots' answers showed statistically significant differences across all criteria (p < 0.05). Copilot scored highest on the EQIP, SMOG, and SI scales, while Gemini excelled in the mDISCERN and GQS evaluations. No significant difference was found between Copilot and Gemini for mDISCERN and GQS scores. All three chatbots demonstrated high reliability and quality, though reading their responses required a university-level education. Notably, ChatGPT displayed high originality, while Copilot exhibited the greatest similarity in responses.<h4>Conclusion</h4>AI chatbots provide reliable answers to breastfeeding questions, but the information can be hard to understand. Although the chatbots are more reliable than other online sources, their accuracy and usability remain in question. Further research is necessary to facilitate the integration of advanced AI in healthcare. |
| format | Article |
| id | doaj-art-16f911707d6f4d73a4be613ba3f97196 |
| institution | OA Journals |
| issn | 1932-6203 |
| language | English |
| publishDate | 2025-01-01 |
| publisher | Public Library of Science (PLoS) |
| record_format | Article |
| series | PLoS ONE |
| spelling | doaj-art-16f911707d6f4d73a4be613ba3f97196; 2025-08-20T01:54:21Z; eng; Public Library of Science (PLoS); PLoS ONE; 1932-6203; 2025-01-01; 20; 3; e0319782; 10.1371/journal.pone.0319782; Evaluating AI-based breastfeeding chatbots: quality, readability, and reliability analysis.; Emine Ozdemir Kacer; https://doi.org/10.1371/journal.pone.0319782 |
| spellingShingle | Emine Ozdemir Kacer Evaluating AI-based breastfeeding chatbots: quality, readability, and reliability analysis. PLoS ONE |
| title | Evaluating AI-based breastfeeding chatbots: quality, readability, and reliability analysis. |
| title_full | Evaluating AI-based breastfeeding chatbots: quality, readability, and reliability analysis. |
| title_fullStr | Evaluating AI-based breastfeeding chatbots: quality, readability, and reliability analysis. |
| title_full_unstemmed | Evaluating AI-based breastfeeding chatbots: quality, readability, and reliability analysis. |
| title_short | Evaluating AI-based breastfeeding chatbots: quality, readability, and reliability analysis. |
| title_sort | evaluating ai based breastfeeding chatbots quality readability and reliability analysis |
| url | https://doi.org/10.1371/journal.pone.0319782 |
| work_keys_str_mv | AT emineozdemirkacer evaluatingaibasedbreastfeedingchatbotsqualityreadabilityandreliabilityanalysis |