Original language | English |
---|---|
Pages (from-to) | 710-711 |
Number of pages | 2 |
Journal | Ultrasound in Obstetrics and Gynecology |
Volume | 60 |
Issue number | 5 |
DOIs | https://doi.org/10.1002/uog.24972 |
State | Published - Nov 2022 |
Externally published | Yes |
Funding
Funders | Funder number |
---|---|
National Institute for Health and Care Research | |
Senior Scientific Advisors of Intelligent Ultrasound Ltd. | |
European Research Council | ERC‐ADG‐2015 694 581 |
Rhodes Scholarships | PCC160 |
Horizon 2020 Framework Programme | 694581 |
Cite this
In: Ultrasound in Obstetrics and Gynecology, Vol. 60, No. 5, 11.2022, p. 710-711.
Research output: Contribution to journal › Comment/debate
TY - JOUR
T1 - A picture is worth a thousand words
T2 - textual analysis of the routine 20-week scan
AU - Alsharid, M.
AU - Drukker, L.
AU - Sharma, H.
AU - Noble, J. A.
AU - Papageorghiou, A. T.
N1 - Funding Information: J.A.N. and A.T.P. are Senior Scientific Advisors of Intelligent Ultrasound Ltd. Funding for this study was granted by the European Research Council (ERC-ADG-2015 694 581, project PULSE). A.T.P. is supported by the National Institute for Health Research (NIHR) Oxford Biomedical Research Centre. The views expressed are those of the authors and not necessarily those of the National Health Service, the NIHR or the Department of Health. M.A. acknowledges the Rhodes Trust.
As part of our research to understand the skill and workflow of clinical sonography1, we recorded the natural speech of ultrasound operators. We then extracted the words used during the routine 20-week anatomy scan to develop a ‘lexicon’ of fetal ultrasound. These text data are being used to research artificial intelligence applications in obstetric ultrasound2 that rely on a textual component, such as automatically generated real-time captioning of scan content, and could fast-track the laborious task of manual image or video annotation, a frequent prerequisite for machine learning. Details of the audio acquisition and preprocessing steps have been described previously1,3. In brief, two microphones (PCC160, Crown HARMAN, Northridge, CA, USA) were used, one located near the operator (to capture the sonographer's voice) and another near the pregnant woman (to allow isolation of the sonographer's voice). Apart from removing background conversation and noise, we dropped the first 90 s of audio recording from each scan to exclude the exchange of personal information and pleasantries between sonographers and subjects.
The resulting word clouds illustrate the words used by the sonographer when moving the probe (Figure 1a) and those spoken when the image is frozen (Figure 1b). We depict nouns (in red), adjectives (in green) and verbs (in blue), as they constitute the most informative data; the font size correlates with the number of times a specific word is mentioned in the dataset. As illustrated in Figure 1a, during live scanning, nouns are the most frequently used part of speech, followed by determiners (not included in the figure), verbs, adjectives and adverbs (not included in the figure), accounting for 26%, 15%, 13%, 7% and 7% of the dataset, respectively. Other parts of speech make up the remaining 32%. We have noticed that sonographers use different terminology when they have frozen the frame to focus on anatomical structures vs when moving the probe in real time: on frozen images, substructures such as ‘cerebellum’ or ‘cavum septi pellucidi’ are named, whereas they are not mentioned while the probe is being moved. Relatedly, when manipulating the probe, the more prevalent nouns are those that refer to high-level anatomical structures, for example ‘heart’ or ‘head’, rather than their substructures. This suggests that a thorough commentary on anatomical content only happens once a high-level structure has been identified and the image has been frozen. Because measurements, such as head circumference, are typically made on frozen images, vocabulary relating to the word ‘measurement’ is noticeably absent when the probe is in motion.
In Figure 2, we plot a representative chronological graph of the anatomical structure being communicated by the sonographer over the duration of the scan. While scanning, multiple structures may be present on the screen at the same time; however, the structure being spoken about by the sonographer is the one shown on the described-structure/timeline plot. If the sonographer speaks about the multiple structures present, these are represented by thin alternating slices on the timeline plot. An example of this can be seen between frames 30 000 and 40 000, where the structure alternates between abdomen and kidney. The collected data are abstractive and preliminary; they are intended to motivate thinking and highlight concepts for future study, in particular how such data can be used to facilitate training of machine learning models that can automate the generation of descriptive captions of ultrasound content while a scan video is being played. This concept could also realize the potential of convenient ‘soft’ automated descriptions of hours of fetal ultrasound content.
Perceptions and actions during clinical ultrasound involve vision, speech, touch and proprioception. Measuring these complex interactions during clinical obstetric ultrasound assessment allows their transformation into multimodal data science. This makes real-world data amenable to machine-learning applications with the ultimate objectives of bettering our understanding of how we learn to scan, improving sonography training and workflow, and engineering assistive technologies to make ultrasound scanning easier in the future. Illustrative code sketches of the preprocessing and analyses outlined above appear after the citation record below.
PY - 2022/11
Y1 - 2022/11
UR - http://www.scopus.com/inward/record.url?scp=85141713488&partnerID=8YFLogxK
U2 - 10.1002/uog.24972
DO - 10.1002/uog.24972
M3 - Comment/debate
C2 - 35708528
AN - SCOPUS:85141713488
SN - 0960-7692
VL - 60
SP - 710
EP - 711
JO - Ultrasound in Obstetrics and Gynecology
JF - Ultrasound in Obstetrics and Gynecology
IS - 5
ER -
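As a rough illustration of the audio preprocessing described in the note above (dropping the first 90 s of each recording), the sketch below trims the leading portion of a WAV file. The soundfile library and the file names are assumptions for demonstration, not the authors' actual pipeline.

```python
# Minimal sketch (assumed tooling, not the authors' pipeline): discard the first
# 90 s of a recording, as done to exclude personal information and pleasantries.
# Requires: pip install soundfile
import soundfile as sf

def trim_leading(in_path: str, out_path: str, seconds: float = 90.0) -> None:
    data, sr = sf.read(in_path)                       # samples and sample rate
    sf.write(out_path, data[int(seconds * sr):], sr)  # keep everything after `seconds`

trim_leading("scan_audio.wav", "scan_audio_trimmed.wav")  # hypothetical file names
```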
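The part-of-speech breakdown and the word frequencies behind the word clouds can be approximated with off-the-shelf NLP tooling. The sketch below uses spaCy, which is an assumption; the paper does not state which tagger was used, and the transcript folder name is hypothetical.

```python
# Illustrative sketch: tally the part-of-speech distribution of sonographer
# speech and collect word frequencies for nouns, adjectives and verbs.
# Requires: pip install spacy && python -m spacy download en_core_web_sm
from collections import Counter
from pathlib import Path

import spacy

nlp = spacy.load("en_core_web_sm")

def pos_profile(transcript_dir: str):
    """Return (% share per POS tag, word frequencies for NOUN/ADJ/VERB)."""
    pos_counts = Counter()
    word_freq = Counter()
    for path in Path(transcript_dir).glob("*.txt"):
        doc = nlp(path.read_text(encoding="utf-8"))
        for tok in doc:
            if tok.is_space or tok.is_punct:
                continue
            pos_counts[tok.pos_] += 1          # e.g. NOUN, DET, VERB, ADJ, ADV
            if tok.pos_ in {"NOUN", "ADJ", "VERB"}:
                word_freq[tok.lemma_.lower()] += 1
    total = sum(pos_counts.values()) or 1
    shares = {pos: 100 * n / total for pos, n in pos_counts.most_common()}
    return shares, word_freq

if __name__ == "__main__":
    shares, freqs = pos_profile("transcripts/")   # hypothetical folder of transcripts
    print({k: round(v, 1) for k, v in shares.items()})
    print(freqs.most_common(20))                  # candidate words for a word cloud
```

The resulting frequencies could then be rendered as a word cloud, for example with wordcloud.WordCloud.generate_from_frequencies, so that font size scales with word count as in Figure 1.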
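Finally, a described-structure/timeline plot in the spirit of Figure 2 can be drawn from per-frame annotations of which structure the sonographer is talking about. Everything in the sketch below, including the segment data, labels and frame numbers, is made up for illustration and is not taken from the study.

```python
# Illustrative sketch: plot which anatomical structure the sonographer is
# describing over the course of a scan, given (start, end, label) segments.
import matplotlib.pyplot as plt

# Hypothetical example segments: (start_frame, end_frame, structure)
segments = [
    (0, 8000, "head"),
    (8000, 18000, "heart"),
    (18000, 30000, "spine"),
    (30000, 32000, "abdomen"),
    (32000, 34000, "kidney"),
    (34000, 36000, "abdomen"),
    (36000, 40000, "kidney"),
    (40000, 50000, "femur"),
]

structures = sorted({s for _, _, s in segments})
y_of = {s: i for i, s in enumerate(structures)}

fig, ax = plt.subplots(figsize=(8, 3))
for start, end, s in segments:
    # One horizontal bar per segment; thin alternating slices appear where the
    # sonographer switches rapidly between structures (e.g. abdomen/kidney).
    ax.broken_barh([(start, end - start)], (y_of[s] - 0.4, 0.8))

ax.set_yticks(range(len(structures)))
ax.set_yticklabels(structures)
ax.set_xlabel("video frame")
ax.set_title("Structure being described by the sonographer (illustrative)")
plt.tight_layout()
plt.show()
```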