TY - JOUR
T1 - Style Aligned Image Generation via Shared Attention
AU - Hertz, Amir
AU - Voynov, Andrey
AU - Fruchter, Shlomi
AU - Cohen-Or, Daniel
N1 - Publisher Copyright: © 2024 IEEE.
PY - 2024
Y1 - 2024
N2 - Large-scale Text-to-Image (T2I) models have rapidly gained prominence across creative fields, generating visually compelling outputs from textual prompts. However, controlling these models to ensure consistent style remains challenging, with existing methods necessitating fine-tuning and manual intervention to disentangle content and style. In this paper, we introduce StyleAligned, a novel technique designed to establish style alignment among a series of generated images. By employing minimal 'attention sharing' during the diffusion process, our method maintains style consistency across images within T2I models. This approach allows for the creation of style-consistent images using a reference style through a straightforward inversion operation. Our method's evaluation across diverse styles and text prompts demonstrates high-quality synthesis and fidelity, underscoring its efficacy in achieving consistent style across various inputs.
AB - Large-scale Text-to-Image (T2I) models have rapidly gained prominence across creative fields, generating visually compelling outputs from textual prompts. However, controlling these models to ensure consistent style remains challenging, with existing methods necessitating fine-tuning and manual intervention to disentangle content and style. In this paper, we introduce StyleAligned, a novel technique designed to establish style alignment among a series of generated images. By employing minimal 'attention sharing' during the diffusion process, our method maintains style consistency across images within T2I models. This approach allows for the creation of style-consistent images using a reference style through a straightforward inversion operation. Our method's evaluation across diverse styles and text prompts demonstrates high-quality synthesis and fidelity, underscoring its efficacy in achieving consistent style across various inputs.
KW - Computer Vision
KW - Generative models
KW - Machine Learning
KW - Style Transfer
UR - http://www.scopus.com/inward/record.url?scp=85201757183&partnerID=8YFLogxK
U2 - 10.1109/CVPR52733.2024.00457
DO - 10.1109/CVPR52733.2024.00457
M3 - Conference article
AN - SCOPUS:85201757183
SN - 1063-6919
SP - 4775
EP - 4785
JO - Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition
JF - Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition
T2 - 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2024
Y2 - 16 June 2024 through 22 June 2024
ER -