SphereUFormer: A U-Shaped Transformer for Spherical 360 Perception

  • Yaniv Benny*
  • Lior Wolf
  *Corresponding author for this work

Research output: Contribution to journal › Conference article › peer-review

Abstract

This paper proposes a novel method for omnidirectional 360° perception. Most common previous methods relied on equirectangular projection. This representation is easily applicable to 2D operation layers but introduces distortions into the image. Other methods attempted to remove the distortions by maintaining a sphere representation but relied on complicated convolution kernels that failed to show competitive results. In this work, we introduce a transformer-based architecture that, by incorporating a novel "Spherical Local Self-Attention" and other spherically-oriented modules, successfully operates in the spherical domain and outperforms the state-of-the-art in 360° perception benchmarks for depth estimation and semantic segmentation. Our code is available at https://github.com/yanivbenny/sphere-uformer.

Original language: English
Pages (from-to): 940-950
Number of pages: 11
Journal: Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition
State: Published - 2025
Event: 2025 IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2025 - Nashville, United States
Duration: 11 Jun 2025 to 15 Jun 2025

Funding

Funders
Tel Aviv University
