TY - CONF
T1 - Monkey See, Monkey Do
T2 - SIGGRAPH Asia 2024 Conference Papers, SA 2024
AU - Raab, Sigal
AU - Gat, Inbar
AU - Sala, Nathan
AU - Tevet, Guy
AU - Shalev-Arkushin, Rotem
AU - Fried, Ohad
AU - Bermano, Amit Haim
AU - Cohen-Or, Daniel
N1 - Publisher Copyright:
© 2024 Copyright held by the owner/author(s).
PY - 2024/12/3
Y1 - 2024/12/3
AB - Given the remarkable results of motion synthesis with diffusion models, a natural question arises: how can we effectively leverage these models for motion editing? Existing diffusion-based motion editing methods overlook the profound potential of the prior embedded within the weights of pre-trained models, which enables manipulating the latent feature space; hence, they primarily center on handling the motion space. In this work, we explore the attention mechanism of pre-trained motion diffusion models. We uncover the roles and interactions of attention elements in capturing and representing intricate human motion patterns, and carefully integrate these elements to transfer a leader motion to a follower one while maintaining the nuanced characteristics of the follower, resulting in zero-shot motion transfer. Manipulating features associated with selected motions allows us to confront a challenge observed in prior motion diffusion approaches, which use general directives (e.g., text, music) for editing, ultimately failing to convey subtle nuances effectively. Our work is inspired by the phrase Monkey See, Monkey Do, relating to human mimicry. Our technique enables accomplishing tasks such as synthesizing out-of-distribution motions, style transfer, and spatial editing. Furthermore, diffusion inversion is seldom employed for motions; as a result, editing efforts focus on generated motions, limiting the editability of real ones. MoMo harnesses motion inversion, extending its application to both real and generated motions. Experimental results show the advantage of our approach over the current art. In particular, unlike methods tailored for specific applications through training, our approach is applied at inference time, requiring no training. Webpage: https://monkeyseedocg.github.io.
KW - Animation
KW - Computer Graphics
KW - Deep Features
KW - Human motion
KW - Motion synthesis
UR - http://www.scopus.com/inward/record.url?scp=85212454016&partnerID=8YFLogxK
U2 - 10.1145/3680528.3687579
DO - 10.1145/3680528.3687579
M3 - Conference contribution
AN - SCOPUS:85212454016
T3 - Proceedings - SIGGRAPH Asia 2024 Conference Papers, SA 2024
BT - Proceedings - SIGGRAPH Asia 2024 Conference Papers, SA 2024
A2 - Spencer, Stephen N.
PB - Association for Computing Machinery, Inc
Y2 - 3 December 2024 through 6 December 2024
ER -