Leveraging Large Language Models for High-Quality Lay Summaries: Efficacy of ChatGPT-4 with Custom Prompts in a Consecutive Series of Prostate Cancer Manuscripts
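| Main Authors: | Emily Rinderknecht, Anna Schmelzer, Anton Kravchuk, Christopher Goßler, Johannes Breyer, Christian Gilfrich, Maximilian Burger, Simon Engelmann, Veronika Saberi, Clemens Kirschner, Dominik von Winning, Roman Mayr, Christian Wülfing, Hendrik Borgmann, Stephan Buse, Maximilian Haas, Matthias May |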
|---|---|
| Format: | Article |
| Language: | English |
| Published: | MDPI AG, 2025-02-01 |
| Series: | Current Oncology |
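| Subjects: | patient communication; artificial intelligence in healthcare; language model applications; plain language summaries; lay abstracts; prompt design |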
| Online Access: | https://www.mdpi.com/1718-7729/32/2/102 |
| _version_ | 1849720155471347712 |
|---|---|
| author | Emily Rinderknecht; Anna Schmelzer; Anton Kravchuk; Christopher Goßler; Johannes Breyer; Christian Gilfrich; Maximilian Burger; Simon Engelmann; Veronika Saberi; Clemens Kirschner; Dominik von Winning; Roman Mayr; Christian Wülfing; Hendrik Borgmann; Stephan Buse; Maximilian Haas; Matthias May |
| author_sort | Emily Rinderknecht |
| collection | DOAJ |
| description | Clear and accessible lay summaries are essential for enhancing the public understanding of scientific knowledge. This study aimed to evaluate whether ChatGPT-4 can generate high-quality lay summaries that are both accurate and comprehensible for prostate cancer research in *Current Oncology*. To achieve this, it systematically assessed ChatGPT-4’s ability to summarize 80 prostate cancer articles published in the journal between July 2022 and June 2024 using two distinct prompt designs: a basic “simple” prompt and an enhanced “extended” prompt. Readability was assessed using established metrics, including the Flesch–Kincaid Reading Ease (FKRE), while content quality was evaluated with a 5-point Likert scale for alignment with source material. The extended prompt demonstrated significantly higher readability (median FKRE: 40.9 vs. 29.1, *p* < 0.001), better alignment with quality thresholds (86.2% vs. 47.5%, *p* < 0.001), and reduced the required reading level, making content more accessible. Both prompt designs produced content with high comprehensiveness (median Likert score: 5). This study highlights the critical role of tailored prompt engineering in optimizing large language models (LLMs) for medical communication. Limitations include the exclusive focus on prostate cancer, the use of predefined prompts without iterative refinement, and the absence of a direct comparison with human-crafted summaries. These findings underscore the transformative potential of LLMs like ChatGPT-4 to streamline the creation of lay summaries, reduce researchers’ workload, and enhance public engagement. Future research should explore prompt variability, incorporate patient feedback, and extend applications across broader medical domains. |
| format | Article |
| id | doaj-art-76ade3d74da14f979cff89e3dafaa87f |
| institution | DOAJ |
| issn | 1198-0052 1718-7729 |
| language | English |
| publishDate | 2025-02-01 |
| publisher | MDPI AG |
| record_format | Article |
| series | Current Oncology |
| spelling | doaj-art-76ade3d74da14f979cff89e3dafaa87f; 2025-08-20T03:12:00Z; eng; MDPI AG; Current Oncology; ISSN 1198-0052, 1718-7729; 2025-02-01; vol. 32, no. 2, art. 102; DOI 10.3390/curroncol32020102. Affiliations (paired with authors by listing order): Department of Urology, St. Josef Medical Center, University of Regensburg, 93053 Regensburg, Germany: Emily Rinderknecht, Christopher Goßler, Johannes Breyer, Maximilian Burger, Simon Engelmann, Veronika Saberi, Clemens Kirschner, Roman Mayr, Maximilian Haas. Department of Urology, St. Elisabeth Hospital Straubing, 94315 Straubing, Germany: Anna Schmelzer, Anton Kravchuk, Christian Gilfrich, Dominik von Winning. Working Group on Artificial Intelligence and Digitalization of the German Society of Urology: Christian Wülfing, Hendrik Borgmann, Matthias May. Department of Urology, Alfried Krupp Krankenhaus, 45131 Essen, Germany: Stephan Buse. |
| title | Leveraging Large Language Models for High-Quality Lay Summaries: Efficacy of ChatGPT-4 with Custom Prompts in a Consecutive Series of Prostate Cancer Manuscripts |
| topic | patient communication; artificial intelligence in healthcare; language model applications; plain language summaries; lay abstracts; prompt design |
| url | https://www.mdpi.com/1718-7729/32/2/102 |
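
The abstract reports readability as a Flesch reading-ease score (median 40.9 for the extended prompt vs. 29.1 for the simple prompt), where a higher score means easier text. As a rough illustration of how such a score is computed, the sketch below applies the standard Flesch Reading Ease formula, 206.835 − 1.015 × (words per sentence) − 84.6 × (syllables per word). The record does not say which tool the authors used; the function names and the vowel-group syllable heuristic here are assumptions for illustration, so results will differ somewhat from dedicated readability software.

```python
import re


def count_syllables(word: str) -> int:
    """Approximate syllables by counting vowel groups (heuristic, not exact)."""
    word = word.lower()
    count = len(re.findall(r"[aeiouy]+", word))
    # Treat a trailing 'e' as silent, except in endings such as "-le" or "-ee".
    if word.endswith("e") and not word.endswith(("le", "ee")) and count > 1:
        count -= 1
    return max(count, 1)


def flesch_reading_ease(text: str) -> float:
    """Flesch Reading Ease: higher scores indicate easier text."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")
    syllables = sum(count_syllables(w) for w in words)
    words_per_sentence = len(words) / len(sentences)
    syllables_per_word = syllables / len(words)
    return 206.835 - 1.015 * words_per_sentence - 84.6 * syllables_per_word


if __name__ == "__main__":
    sample = "The cat sat on the mat."
    print(round(flesch_reading_ease(sample), 3))
```

Short sentences built from one-syllable words score very high, while long sentences with polysyllabic medical terms pull the score down, which is the effect the two prompt designs are compared on.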