Spice·E: Structural Priors in 3D Diffusion using Cross-Entity Attention

Etai Sella, Gal Fiebelman, Noam Atia, Hadar Averbuch-Elor

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

We are witnessing rapid progress in automatically generating and manipulating 3D assets due to the availability of pretrained text-to-image diffusion models. However, time-consuming optimization procedures are required for synthesizing each sample, hindering their potential for democratizing 3D content creation. Conversely, 3D diffusion models now train on million-scale 3D datasets, yielding high-quality text-conditional 3D samples within seconds. In this work, we present Spice·E - a neural network that adds structural guidance to 3D diffusion models, extending their usage beyond text-conditional generation. At its core, our framework introduces a cross-entity attention mechanism that allows for multiple entities - in particular, paired input and guidance 3D shapes - to interact via their internal representations within the denoising network. We utilize this mechanism for learning task-specific structural priors in 3D diffusion models from auxiliary guidance shapes. We show that our approach supports a variety of applications, including 3D stylization, semantic shape editing and text-conditional abstraction-to-3D, which transforms primitive-based abstractions into highly expressive shapes. Extensive experiments demonstrate that Spice·E achieves state-of-the-art performance over these tasks while often being considerably faster than alternative methods. Importantly, this is accomplished without tailoring our approach for any specific task. We will release our code and trained models.
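The abstract describes cross-entity attention only at a high level, so the following is a minimal sketch of one plausible reading, not the authors' implementation. It assumes that each entity is a sequence of latent tokens, that queries come from the input shape, and that keys and values are drawn from the input and guidance tokens together. The function name `cross_entity_attention`, the projection matrices `w_q`, `w_k`, `w_v`, and all shapes are hypothetical and chosen for illustration.

```python
import numpy as np


def softmax(scores, axis=-1):
    """Numerically stable softmax along the given axis."""
    shifted = scores - scores.max(axis=axis, keepdims=True)
    weights = np.exp(shifted)
    return weights / weights.sum(axis=axis, keepdims=True)


def cross_entity_attention(x_tokens, g_tokens, w_q, w_k, w_v):
    """Single-head attention in which input tokens also attend to guidance tokens.

    Assumption (not taken from the paper): queries are computed from the
    input entity, keys and values from the concatenation of both entities.
    """
    queries = x_tokens @ w_q                                  # (n, d_head)
    context = np.concatenate([x_tokens, g_tokens], axis=0)    # (n + m, d_model)
    keys = context @ w_k                                      # (n + m, d_head)
    values = context @ w_v                                    # (n + m, d_head)
    scores = queries @ keys.T / np.sqrt(keys.shape[-1])       # (n, n + m)
    weights = softmax(scores, axis=-1)                        # each row sums to 1
    return weights @ values, weights


# Toy data: 5 input tokens and 7 guidance tokens with hypothetical sizes.
rng = np.random.default_rng(0)
d_model, d_head = 16, 8
x_tokens = rng.normal(size=(5, d_model))
g_tokens = rng.normal(size=(7, d_model))
w_q, w_k, w_v = (rng.normal(size=(d_model, d_head)) for _ in range(3))

out, attn = cross_entity_attention(x_tokens, g_tokens, w_q, w_k, w_v)
print(out.shape, attn.shape)
```

In this sketch the last 7 columns of `attn` hold the weights that input tokens place on guidance tokens, which is where structural information from the guidance shape would enter the input's representation under the stated assumption.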

Original language: English
Title of host publication: Proceedings - SIGGRAPH 2024 Conference Papers
Editors: Stephen N. Spencer
Publisher: Association for Computing Machinery, Inc
ISBN (Electronic): 9798400705250
State: Published - 13 Jul 2024
Event: SIGGRAPH 2024 Conference Papers - Denver, United States
Duration: 28 Jul 2024 → 1 Aug 2024

Publication series

Name: Proceedings - SIGGRAPH 2024 Conference Papers

Conference

Conference: SIGGRAPH 2024 Conference Papers
Country/Territory: United States
City: Denver
Period: 28/07/24 → 1/08/24

Funding

Funder: Israel Science Foundation
Funder number: 2510/23

    Keywords

    • 3D Generative AI
    • 3D Textual Editing
    • Conditional Generation
    • Diffusion Models
