TY - GEN
T1 - LCM-Lookahead for Encoder-Based Text-to-Image Personalization
AU - Gal, Rinon
AU - Lichter, Or
AU - Richardson, Elad
AU - Patashnik, Or
AU - Bermano, Amit H.
AU - Chechik, Gal
AU - Cohen-Or, Daniel
N1 - Publisher Copyright:
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2025.
PY - 2025
Y1 - 2025
N2 - Recent advancements in diffusion models have introduced fast sampling methods that can effectively produce high-quality images in just one or a few denoising steps. Interestingly, when these are distilled from existing diffusion models, they often maintain alignment with the original model, retaining similar outputs for similar prompts and seeds. These properties present opportunities to leverage fast sampling methods as a shortcut-mechanism, using them to create a preview of denoised outputs through which we can backpropagate image-space losses. In this work, we explore the potential of using such shortcut-mechanisms to guide the personalization of text-to-image models to specific facial identities. We focus on encoder-based personalization approaches, and demonstrate that by augmenting their training with a lookahead identity loss, we can achieve higher identity fidelity, without sacrificing layout diversity or prompt alignment. Code at https://lcm-lookahead.github.io/.
AB - Recent advancements in diffusion models have introduced fast sampling methods that can effectively produce high-quality images in just one or a few denoising steps. Interestingly, when these are distilled from existing diffusion models, they often maintain alignment with the original model, retaining similar outputs for similar prompts and seeds. These properties present opportunities to leverage fast sampling methods as a shortcut-mechanism, using them to create a preview of denoised outputs through which we can backpropagate image-space losses. In this work, we explore the potential of using such shortcut-mechanisms to guide the personalization of text-to-image models to specific facial identities. We focus on encoder-based personalization approaches, and demonstrate that by augmenting their training with a lookahead identity loss, we can achieve higher identity fidelity, without sacrificing layout diversity or prompt alignment. Code at https://lcm-lookahead.github.io/.
UR - http://www.scopus.com/inward/record.url?scp=85212921764&partnerID=8YFLogxK
U2 - 10.1007/978-3-031-72630-9_19
DO - 10.1007/978-3-031-72630-9_19
M3 - ???researchoutput.researchoutputtypes.contributiontobookanthology.conference???
AN - SCOPUS:85212921764
SN - 9783031726293
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 322
EP - 340
BT - Computer Vision – ECCV 2024 - 18th European Conference, Proceedings
A2 - Leonardis, Aleš
A2 - Ricci, Elisa
A2 - Roth, Stefan
A2 - Russakovsky, Olga
A2 - Sattler, Torsten
A2 - Varol, Gül
PB - Springer Science and Business Media Deutschland GmbH
T2 - 18th European Conference on Computer Vision, ECCV 2024
Y2 - 29 September 2024 through 4 October 2024
ER -