A comparative study of deep End-to-End Automatic Speech Recognition models for doctor-patient conversations in Polish in a real-life acoustic environment
| Main Authors: | |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | Polish Academy of Sciences, 2025-07-01 |
| Series: | International Journal of Electronics and Telecommunications |
| Subjects: | |
| Online Access: | https://journals.pan.pl/Content/135730/2-5157-Pondel-Sycz_sk.pdf |
| Summary: | This paper investigates Automatic Speech Recognition (ASR) methods for building a system that automatically transcribes medical interviews conducted in Polish during clinic visits. The performance of four ASR models based on Deep Neural Networks (DNN) was evaluated: XLSR-53 large, Quartznet15x5, FastConformer Hybrid Transducer-CTC and Whisper large. The study was conducted on a self-developed speech dataset. Models were evaluated using Word Error Rate (WER), Character Error Rate (CER), Match Error Rate (MER), Word Accuracy (WAcc), Word Information Preserved (WIP), Word Information Lost (WIL), Levenshtein distance, Jaro-Winkler similarity and Jaccard index. Whisper outperformed the other tested models in the vast majority of the tests, achieving a WER of 20.84%, compared with 46.30% for FastConformer, 67.96% for XLSR-53 and 76.25% for Quartznet15x5. Even so, Whisper needs further adaptation for medical conversations, because the current volume of transcription errors is not acceptable in practice (too many mistakes in the description of the patient's health). |
| ISSN: | 2081-8491 2300-1933 |
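
The word-level measures named in the summary (WER, MER, WAcc, WIP, WIL) and CER are all derived from the counts of hits, substitutions, deletions and insertions in a minimum-edit alignment between a reference transcript and a model hypothesis. The sketch below is a minimal illustration using the commonly cited definitions; it is not the authors' evaluation code, and the record does not say how the paper defines WAcc, so `1 - WER` is an assumption here. The function names and the sample sentences are hypothetical.

```python
def align_counts(ref, hyp):
    """Return (hits, subs, dels, ins) for one minimum-edit alignment.

    `ref` and `hyp` are sequences (lists of words, or strings for characters).
    When several alignments share the minimum cost, the diagonal move
    (hit or substitution) is preferred, then deletion, then insertion.
    """
    # Each cell stores (cost, hits, subs, dels, ins) for ref[:i] vs hyp[:j].
    prev = [(j, 0, 0, 0, j) for j in range(len(hyp) + 1)]
    for i in range(1, len(ref) + 1):
        cur = [(i, 0, 0, i, 0)]
        for j in range(1, len(hyp) + 1):
            c, h, s, d, n = prev[j - 1]
            if ref[i - 1] == hyp[j - 1]:
                diag = (c, h + 1, s, d, n)          # hit
            else:
                diag = (c + 1, h, s + 1, d, n)      # substitution
            c, h, s, d, n = prev[j]
            delete = (c + 1, h, s, d + 1, n)        # reference token dropped
            c, h, s, d, n = cur[j - 1]
            insert = (c + 1, h, s, d, n + 1)        # extra hypothesis token
            cur.append(min(diag, delete, insert, key=lambda t: t[0]))
        prev = cur
    _, hits, subs, dels, ins = prev[-1]
    return hits, subs, dels, ins


def transcript_metrics(reference, hypothesis):
    """Compute WER, MER, WAcc, WIP, WIL and CER; the reference must be non-empty."""
    hits, subs, dels, ins = align_counts(reference.split(), hypothesis.split())
    errors = subs + dels + ins
    n_ref = hits + subs + dels          # number of reference words
    n_hyp = hits + subs + ins           # number of hypothesis words
    wer = errors / n_ref
    mer = errors / (hits + errors)
    wip = (hits / n_ref) * (hits / n_hyp) if n_hyp else 0.0

    # CER uses the same alignment on characters (spaces included).
    _, c_subs, c_dels, c_ins = align_counts(reference, hypothesis)
    cer = (c_subs + c_dels + c_ins) / len(reference)

    return {
        "wer": wer,
        "mer": mer,
        "wacc": 1.0 - wer,   # assumption: WAcc defined as 1 - WER
        "wip": wip,
        "wil": 1.0 - wip,
        "cer": cer,
    }


if __name__ == "__main__":
    # Hypothetical example: one substitution and one deletion in 7 reference words.
    ref = "pacjent zgłasza ból głowy od trzech dni"
    hyp = "pacjent zgłasza bóle głowy od dni"
    for name, value in transcript_metrics(ref, hyp).items():
        print(f"{name}: {value:.4f}")
```

WER can exceed 100% when the hypothesis contains many insertions, whereas MER and WIL stay within 0 to 1, which is one reason to report them side by side.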