TY - CONF
T1 - The Chosen One: Consistent Characters in Text-to-Image Diffusion Models
T2 - SIGGRAPH 2024 Conference Papers
AU - Avrahami, Omri
AU - Hertz, Amir
AU - Vinker, Yael
AU - Arar, Moab
AU - Fruchter, Shlomi
AU - Fried, Ohad
AU - Cohen-Or, Daniel
AU - Lischinski, Dani
N1 - Publisher Copyright: © 2024 Owner/Author.
PY - 2024/7/13
Y1 - 2024/7/13
N2 - Recent advances in text-to-image generation models have unlocked vast potential for visual creativity. However, the users that use these models struggle with the generation of consistent characters, a crucial aspect for numerous real-world applications such as story visualization, game development, asset design, advertising, and more. Current methods typically rely on multiple pre-existing images of the target character or involve labor-intensive manual processes. In this work, we propose a fully automated solution for consistent character generation, with the sole input being a text prompt. We introduce an iterative procedure that, at each stage, identifies a coherent set of images sharing a similar identity and extracts a more consistent identity from this set. Our quantitative analysis demonstrates that our method strikes a better balance between prompt alignment and identity consistency compared to the baseline methods, and these findings are reinforced by a user study. To conclude, we showcase several practical applications of our approach.
KW - Consistent characters generation
UR - http://www.scopus.com/inward/record.url?scp=85199911617&partnerID=8YFLogxK
U2 - 10.1145/3641519.3657430
DO - 10.1145/3641519.3657430
M3 - Conference contribution
AN - SCOPUS:85199911617
T3 - Proceedings - SIGGRAPH 2024 Conference Papers
BT - Proceedings - SIGGRAPH 2024 Conference Papers
A2 - Spencer, Stephen N.
PB - Association for Computing Machinery, Inc
Y2 - 28 July 2024 through 1 August 2024
ER -