Stitch it in Time: GAN-Based Facial Editing of Real Videos

Rotem Tzaban, Ron Mokady, Rinon Gal, Amit Bermano, Daniel Cohen-Or

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > peer-review

17 Scopus citations

Abstract

The ability of Generative Adversarial Networks to encode rich semantics within their latent space has been widely adopted for facial image editing. However, replicating their success with videos has proven challenging. Applying StyleGAN editing to real videos introduces two main challenges: (i) StyleGAN operates over aligned crops. When editing videos, these crops need to be pasted back into the frame, resulting in a spatial inconsistency. (ii) Videos introduce a fundamental barrier to overcome - temporal coherency. To address the first challenge, we propose a novel stitching-tuning procedure. The generator is carefully tuned to overcome the spatial artifacts at crop borders, resulting in smooth transitions even when difficult backgrounds are involved. Turning to temporal coherence, we propose that this challenge is largely artificial. The source video is already temporally coherent, and deviations arise in part due to careless treatment of individual components in the editing pipeline. We leverage the natural alignment of StyleGAN and the tendency of neural networks to learn low-frequency functions, and demonstrate that they provide a strongly consistent prior. These components are combined in an end-to-end framework for semantic editing of facial videos. We compare our pipeline to the current state-of-the-art and demonstrate significant improvements. Our method produces meaningful manipulations and maintains greater spatial and temporal consistency, even on challenging talking head videos which current methods struggle with. Our code and videos are available at https://stitch-time.github.io/.

Original language: English
Title of host publication: Proceedings - SIGGRAPH Asia 2022 Conference Papers
Editors: Stephen N. Spencer
Publisher: Association for Computing Machinery, Inc
ISBN (Electronic): 9781450394703
State: Published - 29 Nov 2022
Event: SIGGRAPH Asia 2022 - Computer Graphics and Interactive Techniques Conference - Asia, SA 2022 - Daegu, Korea, Republic of
Duration: 6 Dec 2022 to 9 Dec 2022

Publication series

Name: Proceedings - SIGGRAPH Asia 2022 Conference Papers

Conference

Conference: SIGGRAPH Asia 2022 - Computer Graphics and Interactive Techniques Conference - Asia, SA 2022
Country/Territory: Korea, Republic of
City: Daegu
Period: 6/12/22 to 9/12/22

Keywords

  • Image Synthesis
