SARS-CoV-2 Detection from Voice

Gadi Pinkas, Yarden Karny, Aviad Malachi, Galia Barkai, Gideon Bachar, Vered Aharonson*

*Corresponding author for this work

Research output: Contribution to journal, Article, peer-review


Automated voice-based detection of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) could facilitate screening for COVID19. A dataset of cellular phone recordings from 88 subjects was recently collected. The dataset included vocal utterances, speech and coughs that were self-recorded by the subjects in either hospitals or isolation sites. All subjects underwent nasopharyngeal swabbing at the time of recording and were labelled as SARS-CoV-2 positives or negative controls. The present study harnessed deep machine learning and speech processing to detect the SARS-CoV-2 positives. A three-stage architecture was implemented. A self-supervised attention-based transformer generated embeddings from the audio inputs. Recurrent neural networks were used to produce specialized sub-models for the SARS-CoV-2 classification. Ensemble stacking fused the predictions of the sub-models. Pre-training, bootstrapping and regularization techniques were used to prevent overfitting. A recall of 78% and a probability of false alarm (PFA) of 41% were measured on a test set of 57 recording sessions. A leave-one-speaker-out cross validation on 292 recording sessions yielded a recall of 78% and a PFA of 30%. These preliminary results suggest that COVID19 screening using voice is feasible.
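
To make the reported evaluation terms concrete, the following is a minimal sketch, not the authors' code, of how recall and PFA can be computed and how leave-one-speaker-out folds can be formed. It assumes that PFA is the standard false positive rate, FP / (FP + TN), which the abstract does not spell out. The function names `recall_and_pfa` and `leave_one_speaker_out` and the toy labels are hypothetical and do not come from the study.

```python
from collections import defaultdict


def recall_and_pfa(y_true, y_pred):
    """Return (recall, PFA) for binary labels, where 1 = SARS-CoV-2 positive.

    Assumption: PFA is taken to be the false positive rate FP / (FP + TN).
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    recall = tp / (tp + fn) if (tp + fn) else float("nan")
    pfa = fp / (fp + tn) if (fp + tn) else float("nan")
    return recall, pfa


def leave_one_speaker_out(speaker_ids):
    """Yield (held_out_speaker, train_indices, test_indices), one fold per speaker.

    All recording sessions of the held-out speaker go to the test fold, so no
    speaker appears in both the training and the test data of a fold.
    """
    by_speaker = defaultdict(list)
    for idx, spk in enumerate(speaker_ids):
        by_speaker[spk].append(idx)
    for spk, test_idx in by_speaker.items():
        train_idx = [i for i, s in enumerate(speaker_ids) if s != spk]
        yield spk, train_idx, test_idx


# Toy illustration with made-up labels (not data from the study).
toy_true = [1, 1, 1, 0, 0, 0, 0]
toy_pred = [1, 1, 0, 1, 0, 0, 0]
toy_recall, toy_pfa = recall_and_pfa(toy_true, toy_pred)

# Six hypothetical recording sessions from three hypothetical speakers.
toy_speakers = ["a", "a", "b", "c", "c", "c"]
toy_folds = list(leave_one_speaker_out(toy_speakers))
```

Grouping folds by speaker rather than by recording session matters here because the dataset holds several sessions per subject (292 sessions from 88 subjects), and a session-level split could place the same voice on both sides of a fold.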

Original language: English
Article number: 9205643
Pages (from-to): 268-274
Number of pages: 7
Journal: IEEE Open Journal of Engineering in Medicine and Biology
State: Published - 2020


Keywords:

  • COVID19
  • audio embeddings
  • ensemble stacking
  • recurrent neural network
  • semi supervised learning
  • transformer

