TY - GEN
T1 - TurboEdit
T2 - SIGGRAPH Asia 2024 Conference Papers, SA 2024
AU - Deutch, Gilad
AU - Gal, Rinon
AU - Garibi, Daniel
AU - Patashnik, Or
AU - Cohen-Or, Daniel
N1 - Publisher Copyright:
© 2024 Copyright held by the owner/author(s).
PY - 2024/12/3
Y1 - 2024/12/3
N2 - Diffusion models have opened the path to a wide range of text-based image editing frameworks. However, these typically build on the multi-step nature of the diffusion backwards process, and adapting them to distilled, fast-sampling methods has proven surprisingly challenging. Here, we focus on a popular line of text-based editing frameworks - the “edit-friendly” DDPM-noise inversion approach. We analyze its application to fast sampling methods and categorize its failures into two classes: the appearance of visual artifacts, and insufficient editing strength. We trace the artifacts to mismatched noise statistics between inverted noises and the expected noise schedule, and suggest a shifted noise schedule which corrects for this offset. To increase editing strength, we propose a pseudo-guidance approach that efficiently increases the magnitude of edits without introducing new artifacts. All in all, our method enables text-based image editing with as few as three diffusion steps, while providing novel insights into the mechanisms behind popular text-based editing approaches.
KW - fast image editing
KW - few-step diffusion models
KW - image editing
UR - http://www.scopus.com/inward/record.url?scp=85213437879&partnerID=8YFLogxK
U2 - 10.1145/3680528.3687612
DO - 10.1145/3680528.3687612
M3 - Conference contribution
AN - SCOPUS:85213437879
T3 - Proceedings - SIGGRAPH Asia 2024 Conference Papers, SA 2024
BT - Proceedings - SIGGRAPH Asia 2024 Conference Papers, SA 2024
A2 - Spencer, Stephen N.
PB - Association for Computing Machinery, Inc
Y2 - 3 December 2024 through 6 December 2024
ER -