Captioning ultrasound images automatically

Mohammad Alsharid*, Harshita Sharma, Lior Drukker, Pierre Chatelain, Aris T. Papageorghiou, J. Alison Noble

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review


We describe an automatic natural language processing (NLP)-based image captioning method to describe fetal ultrasound video content by modelling the vocabulary commonly used by sonographers and sonologists. The generated captions are similar to the words spoken by a sonographer when describing the scan experience in terms of visual content and performed scanning actions. Using full-length second-trimester fetal ultrasound videos and text derived from accompanying expert voice-over audio recordings, we train deep learning models consisting of convolutional neural networks and recurrent neural networks in merged configurations to generate captions for ultrasound video frames. We evaluate different model architectures using established general metrics (BLEU, ROUGE-L) and application-specific metrics. Results show that the proposed models can learn joint representations of image and text to generate relevant and descriptive captions for anatomies, such as the spine, the abdomen, the heart, and the head, in clinical fetal ultrasound scans.
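One of the general metrics named above, ROUGE-L, scores a generated caption by the longest common subsequence (LCS) of words it shares with a reference caption. The following is a minimal sketch of that computation, not the authors' evaluation code. It assumes lowercased whitespace tokenization and an equal weighting of precision and recall (`beta=1.0`); the function names and the example captions are hypothetical.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of two token lists."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l(reference, candidate, beta=1.0):
    """ROUGE-L F-score between a reference and a candidate caption.

    Assumption: captions are tokenized by lowercasing and splitting on
    whitespace; a published evaluation may tokenize differently.
    """
    ref = reference.lower().split()
    cand = candidate.lower().split()
    if not ref or not cand:
        return 0.0
    lcs = lcs_length(ref, cand)
    if lcs == 0:
        return 0.0
    recall = lcs / len(ref)
    precision = lcs / len(cand)
    return (1 + beta**2) * precision * recall / (recall + beta**2 * precision)


if __name__ == "__main__":
    # Hypothetical captions, for illustration only.
    print(rouge_l("the fetal spine is visible", "the spine is visible"))
```

Because the LCS does not require the matched words to be adjacent, a caption that drops a word but preserves the order of the rest still scores highly, which is one reason the metric is often reported alongside n-gram metrics such as BLEU.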

Original language: English
Title of host publication: Medical Image Computing and Computer Assisted Intervention – MICCAI 2019 - 22nd International Conference, Proceedings
Editors: Dinggang Shen, Pew-Thian Yap, Tianming Liu, Terry M. Peters, Ali Khan, Lawrence H. Staib, Caroline Essert, Sean Zhou
Publisher: Springer Science and Business Media Deutschland GmbH
Number of pages: 9
ISBN (Print): 9783030322502
State: Published - 2019
Externally published: Yes
Event: 22nd International Conference on Medical Image Computing and Computer-Assisted Intervention, MICCAI 2019 - Shenzhen, China
Duration: 13 Oct 2019 to 17 Oct 2019

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 11767 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349


Conference: 22nd International Conference on Medical Image Computing and Computer-Assisted Intervention, MICCAI 2019


  • Deep learning
  • Fetal ultrasound
  • Image captioning
  • Image description
  • Natural language processing
  • Recurrent neural networks

