Style Aligned Image Generation via Shared Attention

Amir Hertz, Andrey Voynov, Shlomi Fruchter, Daniel Cohen-Or

Research output: Contribution to journal › Conference article › peer-review

11 Scopus citations

Abstract

Large-scale Text-to-Image (T2I) models have rapidly gained prominence across creative fields, generating visually compelling outputs from textual prompts. However, controlling these models to ensure consistent style remains challenging, with existing methods necessitating fine-tuning and manual intervention to disentangle content and style. In this paper, we introduce StyleAligned, a novel technique designed to establish style alignment among a series of generated images. By employing minimal 'attention sharing' during the diffusion process, our method maintains style consistency across images within T2I models. This approach allows for the creation of style-consistent images using a reference style through a straightforward inversion operation. Our method's evaluation across diverse styles and text prompts demonstrates high-quality synthesis and fidelity, underscoring its efficacy in achieving consistent style across various inputs.
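
To make the abstract's 'attention sharing' concrete, the following is a minimal, hypothetical PyTorch sketch of the idea, not the authors' released implementation: during self-attention in the diffusion model, each generated image attends to its own keys and values concatenated with those of a reference image, so the reference's style propagates across the batch. The function and tensor names (shared_attention, k_ref, v_ref) are illustrative assumptions, and the paper's complete method involves further details omitted here.

    import torch
    import torch.nn.functional as F

    def shared_attention(q, k, v, k_ref, v_ref):
        """Hypothetical sketch of attention sharing (not the authors' code).

        q, k, v: (batch, heads, tokens, dim) queries/keys/values of the
                 target images being generated.
        k_ref, v_ref: (1, heads, tokens, dim) keys/values taken from the
                 reference image's self-attention layer.
        """
        b = q.shape[0]
        # Broadcast the reference keys/values to every target in the batch
        # and extend each image's attention context with them.
        k_all = torch.cat([k, k_ref.expand(b, -1, -1, -1)], dim=2)
        v_all = torch.cat([v, v_ref.expand(b, -1, -1, -1)], dim=2)
        # Standard scaled dot-product attention over the shared context.
        return F.scaled_dot_product_attention(q, k_all, v_all)

    # Toy usage: 2 target images, 4 heads, 16 tokens, 64-dim heads.
    q = torch.randn(2, 4, 16, 64)
    k = torch.randn(2, 4, 16, 64)
    v = torch.randn(2, 4, 16, 64)
    k_ref = torch.randn(1, 4, 16, 64)
    v_ref = torch.randn(1, 4, 16, 64)
    out = shared_attention(q, k, v, k_ref, v_ref)  # shape (2, 4, 16, 64)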

Original language: English
Pages (from-to): 4775-4785
Number of pages: 11
Journal: Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition
DOIs
State: Published - 2024
Event: 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2024 - Seattle, United States
Duration: 16 Jun 2024 – 22 Jun 2024

Keywords

  • Computer Vision
  • Generative models
  • Machine Learning
  • Style Transfer
