SARS-CoV-2 Detection From Voice
Automated voice-based detection of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) could facilitate the screening for COVID19. A dataset of cellular phone recordings from 88 subjects was recently collected. The dataset included vocal utterances, speech and coughs that were self-recorded...
| Main Authors: | Gadi Pinkas, Yarden Karny, Aviad Malachi, Galia Barkai, Gideon Bachar, Vered Aharonson |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | IEEE 2020-01-01 |
| Series: | IEEE Open Journal of Engineering in Medicine and Biology |
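| Subjects: | COVID19; audio embeddings; transformer; recurrent neural network; ensemble stacking; semi supervised learning |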
| Online Access: | https://ieeexplore.ieee.org/document/9205643/ |
| _version_ | 1849417512609906688 |
|---|---|
| author | Gadi Pinkas Yarden Karny Aviad Malachi Galia Barkai Gideon Bachar Vered Aharonson |
| author_facet | Gadi Pinkas Yarden Karny Aviad Malachi Galia Barkai Gideon Bachar Vered Aharonson |
| author_sort | Gadi Pinkas |
| collection | DOAJ |
| description | Automated voice-based detection of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) could facilitate the screening for COVID19. A dataset of cellular phone recordings from 88 subjects was recently collected. The dataset included vocal utterances, speech and coughs that were self-recorded by the subjects in either hospitals or isolation sites. All subjects underwent nasopharyngeal swabbing at the time of recording and were labelled as SARS-CoV-2 positives or negative controls. The present study harnessed deep machine learning and speech processing to detect the SARS-CoV-2 positives. A three-stage architecture was implemented. A self-supervised attention-based transformer generated embeddings from the audio inputs. Recurrent neural networks were used to produce specialized sub-models for the SARS-CoV-2 classification. An ensemble stacking fused the predictions of the sub-models. Pre-training, bootstrapping and regularization techniques were used to prevent overfitting. A recall of 78% and a probability of false alarm (PFA) of 41% were measured on a test set of 57 recording sessions. A leave-one-speaker-out cross validation on 292 recording sessions yielded a recall of 78% and a PFA of 30%. These preliminary results imply a feasibility for COVID19 screening using voice. |
| format | Article |
| id | doaj-art-ceca39267f21438cbb216bfe599ccc0e |
| institution | Kabale University |
| issn | 2644-1276 |
| language | English |
| publishDate | 2020-01-01 |
| publisher | IEEE |
| record_format | Article |
| series | IEEE Open Journal of Engineering in Medicine and Biology |
| spelling | doaj-art-ceca39267f21438cbb216bfe599ccc0e; 2025-08-20T03:32:47Z; eng; IEEE; IEEE Open Journal of Engineering in Medicine and Biology; ISSN 2644-1276; 2020-01-01; vol. 1, pp. 268-274; DOI 10.1109/OJEMB.2020.3026468; article 9205643; SARS-CoV-2 Detection From Voice; authors: Gadi Pinkas (0), Yarden Karny (1), Aviad Malachi (2), Galia Barkai (3, https://orcid.org/0000-0002-0775-9015), Gideon Bachar (4), Vered Aharonson (5, https://orcid.org/0000-0002-4406-6525); affiliations: (0, 1, 2, 5) Afeka Center of Language Processing, Afeka, Tel Aviv Academic College of Engineering, Tel Aviv-Yafo, Israel; (3) Pediatric Infectious Diseases Unit, Safra Children's Hospital, Sheba Medical Center and Sackler School of Medicine, Tel-Aviv University, Tel Aviv-Yafo, Israel; (4) Department of Otorhinolaryngology, Rabin Medical center and Sackler School of Medicine, Tel-Aviv University, Tel Aviv-Yafo, Israel; abstract: see description; https://ieeexplore.ieee.org/document/9205643/; keywords: COVID19, audio embeddings, transformer, recurrent neural network, ensemble stacking, semi supervised learning |
| spellingShingle | Gadi Pinkas Yarden Karny Aviad Malachi Galia Barkai Gideon Bachar Vered Aharonson SARS-CoV-2 Detection From Voice IEEE Open Journal of Engineering in Medicine and Biology COVID19 audio embeddings transformer recurrent neural network ensemble stacking semi supervised learning |
| title | SARS-CoV-2 Detection From Voice |
| title_full | SARS-CoV-2 Detection From Voice |
| title_fullStr | SARS-CoV-2 Detection From Voice |
| title_full_unstemmed | SARS-CoV-2 Detection From Voice |
| title_short | SARS-CoV-2 Detection From Voice |
| title_sort | sars cov 2 detection from voice |
| topic | COVID19 audio embeddings transformer recurrent neural network ensemble stacking semi supervised learning |
| url | https://ieeexplore.ieee.org/document/9205643/ |
| work_keys_str_mv | AT gadipinkas sarscov2detectionfromvoice AT yardenkarny sarscov2detectionfromvoice AT aviadmalachi sarscov2detectionfromvoice AT galiabarkai sarscov2detectionfromvoice AT gideonbachar sarscov2detectionfromvoice AT veredaharonson sarscov2detectionfromvoice |
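
The description reports recall and probability of false alarm (PFA) under a leave-one-speaker-out cross validation. The following is a minimal sketch of how such metrics and speaker-wise folds could be computed; it is not the authors' code. It assumes PFA means the false positive rate (false alarms divided by all true negatives-class sessions), and the speaker IDs, labels, and predictions are hypothetical toy values, not data from the study.

```python
from collections import defaultdict


def recall_and_pfa(y_true, y_pred):
    """Return (recall, PFA) for binary labels, where 1 = SARS-CoV-2 positive.

    Assumption: PFA is taken to be the false positive rate, FP / (FP + TN).
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    recall = tp / (tp + fn) if (tp + fn) else float("nan")
    pfa = fp / (fp + tn) if (fp + tn) else float("nan")
    return recall, pfa


def leave_one_speaker_out(speaker_ids):
    """Yield (held_out_speaker, train_indices, test_indices), one fold per speaker.

    All sessions of the held-out speaker go to the test side, so no speaker
    contributes recordings to both training and testing within a fold.
    """
    by_speaker = defaultdict(list)
    for idx, speaker in enumerate(speaker_ids):
        by_speaker[speaker].append(idx)
    for speaker, test_idx in by_speaker.items():
        train_idx = [i for i, s in enumerate(speaker_ids) if s != speaker]
        yield speaker, train_idx, test_idx


# Hypothetical toy data: six recording sessions from four speakers.
speakers = ["s1", "s1", "s2", "s3", "s3", "s4"]
labels = [1, 1, 0, 1, 1, 0]   # 1 = positive swab, 0 = negative control
preds = [1, 0, 1, 1, 1, 0]    # illustrative classifier outputs

folds = list(leave_one_speaker_out(speakers))
recall, pfa = recall_and_pfa(labels, preds)
print(f"folds={len(folds)} recall={recall:.2f} pfa={pfa:.2f}")
```

Splitting by speaker rather than by session matters here because the record mentions 292 recording sessions from a much smaller number of subjects; a session-level split could place the same voice on both sides of a fold.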