An evaluation framework for ambient digital scribing tools in clinical applications
Abstract Ambient digital scribing (ADS) tools alleviate clinician documentation burden, reducing burnout and enhancing efficiency. As AI-driven ADS tools integrate into clinical workflows, robust governance is essential for ethical and secure deployment. This study proposes a comprehensive ADS evalu...
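| Main Authors: | Haoyuan Wang, Rui Yang, Mahmoud Alwakeel, Ankit Kayastha, Anand Chowdhury, Joshua M. Biro, Anthony D. Sorrentino, Jessica L. Handley, Sarah Hantzmon, Sophia Bessias, Nicoleta J. Economou-Zavlanos, Armando Bedoya, Monica Agrawal, Raj M. Ratwani, Eric G. Poon, Michael J. Pencina, Kathryn I. Pollak, Chuan Hong |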
|---|---|
| Format: | Article |
| Language: | English |
| Published: | Nature Portfolio, 2025-06-01 |
| Series: | npj Digital Medicine |
| Online Access: | https://doi.org/10.1038/s41746-025-01622-1 |
| _version_ | 1849691406618066944 |
|---|---|
| author | Haoyuan Wang, Rui Yang, Mahmoud Alwakeel, Ankit Kayastha, Anand Chowdhury, Joshua M. Biro, Anthony D. Sorrentino, Jessica L. Handley, Sarah Hantzmon, Sophia Bessias, Nicoleta J. Economou-Zavlanos, Armando Bedoya, Monica Agrawal, Raj M. Ratwani, Eric G. Poon, Michael J. Pencina, Kathryn I. Pollak, Chuan Hong |
| author_sort | Haoyuan Wang |
| collection | DOAJ |
| description | Ambient digital scribing (ADS) tools alleviate clinician documentation burden, reducing burnout and enhancing efficiency. As AI-driven ADS tools integrate into clinical workflows, robust governance is essential for ethical and secure deployment. This study proposes a comprehensive ADS evaluation framework incorporating human evaluation, automated metrics, simulation testing, and large language models (LLMs) as evaluators. Our framework assesses transcription, diarization, and medical note generation across criteria such as fluency, completeness, and factuality. To demonstrate its effectiveness, we developed an ADS tool and applied our framework to evaluate the tool’s performance on 40 real clinical visit recordings. Our evaluation revealed strengths, such as fluency and clarity, but also highlighted weaknesses in factual accuracy and the ability to capture new medications. These findings underscore the value of structured ADS evaluation in improving healthcare delivery while emphasizing the need for strong governance to ensure safe, ethical integration. |
| format | Article |
| id | doaj-art-ad995c1b87864b238da3a2bb5d80d06d |
| institution | DOAJ |
| issn | 2398-6352 |
| language | English |
| publishDate | 2025-06-01 |
| publisher | Nature Portfolio |
| record_format | Article |
| series | npj Digital Medicine |
| spelling | doaj-art-ad995c1b87864b238da3a2bb5d80d06d; 2025-08-20T03:21:02Z; eng; Nature Portfolio; npj Digital Medicine; 2398-6352; 2025-06-01; 81113; 10.1038/s41746-025-01622-1; An evaluation framework for ambient digital scribing tools in clinical applications; Haoyuan Wang (Department of Biostatistics and Bioinformatics, Duke University School of Medicine); Rui Yang (Centre for Quantitative Medicine, Duke-NUS Medical School); Mahmoud Alwakeel (Department of Medicine, Duke University School of Medicine); Ankit Kayastha (Department of Medicine, Duke University School of Medicine); Anand Chowdhury (Department of Medicine, Duke University School of Medicine); Joshua M. Biro (Medstar Health National Center for Human Factors in Healthcare); Anthony D. Sorrentino (Department of Medicine, Duke University School of Medicine); Jessica L. Handley (Medstar Health National Center for Human Factors in Healthcare); Sarah Hantzmon (Cancer Prevention and Control Research Program, Duke Cancer Institute); Sophia Bessias (Duke Clinical and Translational Science Institute, Duke University School of Medicine); Nicoleta J. Economou-Zavlanos (Department of Biostatistics and Bioinformatics, Duke University School of Medicine); Armando Bedoya (Department of Biostatistics and Bioinformatics, Duke University School of Medicine); Monica Agrawal (Department of Biostatistics and Bioinformatics, Duke University School of Medicine); Raj M. Ratwani (Medstar Health National Center for Human Factors in Healthcare); Eric G. Poon (Department of Biostatistics and Bioinformatics, Duke University School of Medicine); Michael J. Pencina (Department of Biostatistics and Bioinformatics, Duke University School of Medicine); Kathryn I. Pollak (Cancer Prevention and Control Research Program, Duke Cancer Institute); Chuan Hong (Department of Biostatistics and Bioinformatics, Duke University School of Medicine); https://doi.org/10.1038/s41746-025-01622-1 |
| title | An evaluation framework for ambient digital scribing tools in clinical applications |
| url | https://doi.org/10.1038/s41746-025-01622-1 |