Ultrasound is the primary modality for obstetric imaging and is highly sonographer dependent. Long training period, insufficient recruitment and poor retention of sonographers are among the global challenges in the expansion of ultrasound use. For the past several decades, technical advancements in clinical obstetric ultrasound scanning have largely concerned improving image quality and processing speed. By contrast, sonographers have been acquiring ultrasound images in a similar fashion for several decades. The PULSE (Perception Ultrasound by Learning Sonographer Experience) project is an interdisciplinary multi-modal imaging study aiming to offer clinical sonography insights and transform the process of obstetric ultrasound acquisition and image analysis by applying deep learning to large-scale multi-modal clinical data. A key novelty of the study is that we record full-length ultrasound video with concurrent tracking of the sonographer’s eyes, voice and the transducer while performing routine obstetric scans on pregnant women. We provide a detailed description of the novel acquisition system and illustrate how our data can be used to describe clinical ultrasound. Being able to measure different sonographer actions or model tasks will lead to a better understanding of several topics including how to effectively train new sonographers, monitor the learning progress, and enhance the scanning workflow of experts.