Multi-modal learning from video, eye tracking, and pupillometry for operator skill characterization in clinical fetal ultrasound

Harshita Sharma, Lior Drukker, Aris T. Papageorghiou, J. Alison Noble

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

This paper presents a novel multi-modal learning approach for automated skill characterization of obstetric ultrasound operators using heterogeneous spatio-temporal sensory cues, namely scan video, eye-tracking data, and pupillometric data, acquired in the clinical environment. We address pertinent challenges, such as combining heterogeneous, small-scale, and variable-length sequential datasets, to train deep convolutional neural networks in real-world scenarios. We propose spatial encoding for multi-modal analysis using sonography standard plane images, spatial gaze maps, gaze trajectory images, and pupillary response images. We present and compare five multi-modal learning network architectures using late, intermediate, hybrid, and tensor fusion. We build models for the Heart and the Brain scanning tasks. Performance evaluation suggests that multi-modal learning networks outperform uni-modal networks, with the best-performing model achieving accuracies of 82.4% (Brain task) and 76.4% (Heart task) on the operator skill classification problem.
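
The fusion strategies named in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' architecture: the abstract gives no implementation details, so the function names, feature sizes, and random feature vectors below are all hypothetical. The sketch uses generic textbook definitions: late fusion averages per-modality class probabilities, intermediate fusion concatenates per-modality features, and tensor fusion takes the outer product of the features after appending a constant 1 to each.

```python
import numpy as np


def softmax(z):
    """Numerically stable softmax over a 1-D score vector."""
    e = np.exp(z - z.max())
    return e / e.sum()


def late_fusion(prob_list):
    """Decision-level fusion: average the class probabilities of each modality."""
    return np.mean(np.stack(prob_list), axis=0)


def intermediate_fusion(feature_list):
    """Feature-level fusion: concatenate features before a shared classifier."""
    return np.concatenate(feature_list)


def tensor_fusion(feature_list):
    """Outer-product fusion. Appending 1 to each vector keeps the
    uni-modal and pairwise terms alongside the full cross-modal products."""
    fused = np.ones(1)
    for f in feature_list:
        fused = np.outer(fused, np.append(f, 1.0)).ravel()
    return fused


# Hypothetical per-modality feature vectors (sizes chosen only for illustration).
rng = np.random.default_rng(0)
video_feat = rng.normal(size=4)   # e.g. from a standard-plane image branch
gaze_feat = rng.normal(size=3)    # e.g. from a gaze-map branch
pupil_feat = rng.normal(size=2)   # e.g. from a pupillary-response branch
features = [video_feat, gaze_feat, pupil_feat]

concat = intermediate_fusion(features)   # length 4 + 3 + 2
tensor = tensor_fusion(features)         # length (4+1) * (3+1) * (2+1)

# Late fusion of two hypothetical two-class predictions.
fused_probs = late_fusion([softmax(np.array([1.0, 0.0])),
                           softmax(np.array([0.0, 1.0]))])
```

The tensor-fused vector grows multiplicatively with the number of modalities, which is one reason concatenation or decision-level averaging may be preferred when the dataset is small. A hybrid scheme, under these definitions, would combine feature-level and decision-level fusion in one network.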

Original language: English
Title of host publication: 2021 IEEE 18th International Symposium on Biomedical Imaging, ISBI 2021
Publisher: IEEE Computer Society
Pages: 1646-1649
Number of pages: 4
ISBN (Electronic): 9781665412469
State: Published - 13 Apr 2021
Externally published: Yes
Event: 18th IEEE International Symposium on Biomedical Imaging, ISBI 2021 - Nice, France
Duration: 13 Apr 2021 to 16 Apr 2021

Publication series

Name: Proceedings - International Symposium on Biomedical Imaging
Volume: 2021-April
ISSN (Print): 1945-7928
ISSN (Electronic): 1945-8452

Conference

Conference: 18th IEEE International Symposium on Biomedical Imaging, ISBI 2021
Country/Territory: France
City: Nice
Period: 13/04/21 to 16/04/21

Keywords

  • Convolutional neural networks
  • Eye tracking
  • Multi-modal learning
  • Pupillometry
  • Ultrasound
