Robust inter-subject audiovisual decoding in functional magnetic resonance imaging using high-dimensional regression

Gal Raz*, Michele Svanera, Neomi Singer, Gadi Gilam, Maya Bleich Cohen, Tamar Lin, Roee Admon, Tal Gonen, Avner Thaler, Roni Y. Granot, Rainer Goebel, Sergio Benini, Giancarlo Valente

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

10 Scopus citations


Major methodological advancements have recently been made in the field of neural decoding, which is concerned with the reconstruction of mental content from neuroimaging measures. However, in the absence of a large-scale examination of the validity of the decoding models across subjects and content, the extent to which these models can be generalized is not clear. This study addresses the challenge of producing generalizable decoding models, which allow the reconstruction of perceived audiovisual features from human functional magnetic resonance imaging (fMRI) data without prior training of the algorithm on the decoded content. We applied an adapted version of kernel ridge regression combined with temporal optimization on data acquired during film viewing (234 runs) to generate standardized brain models for sound loudness, speech presence, perceived motion, face-to-frame ratio, lightness, and color brightness. The prediction accuracies were tested on data collected from different subjects watching other movies, mainly in another scanner. Substantial and significant (Q_FDR < 0.05) correlations between the reconstructed and the original descriptors were found for the first three features (loudness, speech, and motion) in all of the 9 test movies (mean R = 0.62, 0.60, and 0.60, respectively) with high reproducibility of the predictors across subjects. The face ratio model produced significant correlations in 7 out of 8 movies (mean R = 0.56). The lightness and brightness models did not show robustness (mean R = 0.23 and 0, respectively). Further analysis of additional data (95 runs) indicated that loudness reconstruction veridicality can consistently reveal relevant group differences in musical experience. The findings point to the validity and generalizability of our loudness, speech, motion, and face ratio models for complex cinematic stimuli (as well as for music in the case of loudness). While future research should further validate these models using controlled stimuli and explore the feasibility of extracting more complex models via this method, the reliability of our results indicates the potential usefulness of the approach and the resulting models in basic scientific and diagnostic contexts.
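The general idea of decoding a stimulus feature with kernel ridge regression and scoring it by correlation on held-out data can be illustrated with a minimal sketch. This is not the authors' adapted algorithm: the linear kernel, the fixed regularization value `alpha`, the synthetic data, and all function names (`fit_kernel_ridge`, `predict_kernel_ridge`, `simulate_run`) are assumptions made for illustration, and the temporal optimization step mentioned in the abstract is omitted.

```python
import numpy as np


def fit_kernel_ridge(X_train, y_train, alpha):
    """Return dual coefficients a solving (K + alpha * I) a = y.

    A linear kernel K = X X^T is assumed here; the paper's adapted
    kernel and its regularization scheme are not reproduced.
    """
    K = X_train @ X_train.T
    return np.linalg.solve(K + alpha * np.eye(K.shape[0]), y_train)


def predict_kernel_ridge(X_train, dual_coef, X_new):
    """Predict the feature time course for new (time x voxel) data."""
    return (X_new @ X_train.T) @ dual_coef


def simulate_run(rng, pattern, n_timepoints, gain=0.2):
    """Synthetic stand-in for one run: a feature time course that is
    weakly expressed across voxels, buried in independent noise."""
    feature = rng.standard_normal(n_timepoints)
    noise = rng.standard_normal((n_timepoints, pattern.size))
    bold = gain * np.outer(feature, pattern) + noise
    return bold, feature


rng = np.random.default_rng(0)
pattern = rng.standard_normal(500)  # hypothetical voxel response pattern

# "Training" and "test" runs share the pattern but nothing else,
# loosely mimicking different subjects watching different movies.
X_train, y_train = simulate_run(rng, pattern, n_timepoints=200)
X_test, y_test = simulate_run(rng, pattern, n_timepoints=100)

dual_coef = fit_kernel_ridge(X_train, y_train, alpha=10.0)
y_pred = predict_kernel_ridge(X_train, dual_coef, X_test)

# Score the reconstruction as the abstract does: correlation between
# the reconstructed and the original descriptor.
r_test = float(np.corrcoef(y_pred, y_test)[0, 1])
print(f"held-out correlation: {r_test:.2f}")
```

The dual (kernel) form is used because the system solved has one row per time point rather than one per voxel, which is the usual reason to prefer it when voxels far outnumber samples.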

Original language: English
Pages (from-to): 244-263
Number of pages: 20
State: Published - Dec 2017


Funders (funder number)
FP7 Health Cooperation Work Program (299/14, 602186)
Human Enhancement and Learning
Universiteit Maastricht


    • Audiovisual decoding
    • Face
    • Kernel ridge regression
    • Motion pictures
    • Optical flow
    • Sound loudness
    • fMRI

