Single image 3D hand reconstruction with mesh convolutions

Dominik Kulon, Haoyang Wang, Riza Alp Güler, Michael Bronstein, Stefanos Zafeiriou

Research output: Contribution to conference > Paper > peer-review


Monocular 3D reconstruction of deformable objects, such as human body parts, has typically been approached by predicting parameters of heavyweight linear models. In this paper, we demonstrate an alternative solution that is based on the idea of encoding images into a latent non-linear representation of meshes. The prior on 3D hand shapes is learned by training an autoencoder with intrinsic graph convolutions performed in the spectral domain. The pre-trained decoder acts as a non-linear statistical deformable model. The latent parameters that reconstruct the shape and articulated pose of hands in the image are predicted using an image encoder. We show that our system reconstructs plausible meshes and operates in real time. We evaluate the quality of the mesh reconstructions produced by the decoder on a new dataset and show latent space interpolation results. Our code, data, and models will be made publicly available.
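For readers unfamiliar with graph convolutions in the spectral domain, the following is a minimal sketch of one common formulation, a Chebyshev-polynomial filter on the normalized graph Laplacian of a mesh. The abstract does not state which spectral filter the authors use, so the choice of Chebyshev polynomials is an assumption, and the names `normalized_laplacian`, `chebyshev_filter`, `theta`, and `lmax` are hypothetical and not taken from the paper or its code. The sketch filters a single scalar signal per vertex; for a mesh, each vertex coordinate would be treated as a separate channel.

```python
# Illustrative sketch only (not the authors' implementation).
# Assumption: the spectral filter is a Chebyshev-polynomial approximation.

def normalized_laplacian(adj):
    """Return L = I - D^{-1/2} A D^{-1/2} for a dense adjacency matrix."""
    n = len(adj)
    deg = [sum(row) for row in adj]
    # Guard against isolated vertices (degree 0).
    inv_sqrt = [d ** -0.5 if d > 0 else 0.0 for d in deg]
    return [
        [(1.0 if i == j else 0.0) - inv_sqrt[i] * adj[i][j] * inv_sqrt[j]
         for j in range(n)]
        for i in range(n)
    ]


def matvec(matrix, vector):
    """Dense matrix-vector product on plain Python lists."""
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]


def chebyshev_filter(lap, x, theta, lmax=2.0):
    """Compute y = sum_k theta[k] * T_k(L_scaled) x.

    L_scaled = 2 L / lmax - I maps the Laplacian spectrum into [-1, 1].
    lmax = 2.0 is an upper bound on the eigenvalues of the normalized
    Laplacian. theta holds the filter coefficients, which would be
    learned in a trained network.
    """
    n = len(x)
    scaled = [
        [2.0 * lap[i][j] / lmax - (1.0 if i == j else 0.0) for j in range(n)]
        for i in range(n)
    ]
    t_prev = list(x)                      # T_0(L_scaled) x = x
    y = [theta[0] * v for v in t_prev]
    if len(theta) == 1:
        return y
    t_curr = matvec(scaled, x)            # T_1(L_scaled) x = L_scaled x
    y = [a + theta[1] * b for a, b in zip(y, t_curr)]
    for k in range(2, len(theta)):
        # Chebyshev recurrence: T_k = 2 * L_scaled * T_{k-1} - T_{k-2}
        t_next = [2.0 * a - b for a, b in zip(matvec(scaled, t_curr), t_prev)]
        y = [a + theta[k] * b for a, b in zip(y, t_next)]
        t_prev, t_curr = t_curr, t_next
    return y


# Usage: a single triangle (3 vertices, all connected) with a constant signal.
triangle = [[0, 1, 1], [1, 0, 1], [1, 1, 0]]
lap = normalized_laplacian(triangle)
filtered = chebyshev_filter(lap, [1.0, 1.0, 1.0], theta=[0.5, 0.3, 0.2])
```

A filter with K coefficients mixes information from vertices up to K-1 edges away, which is why such filters can be evaluated with repeated sparse matrix products rather than an explicit eigendecomposition of the Laplacian.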

Original language: English
State: Published - 2020
Externally published: Yes
Event: 30th British Machine Vision Conference, BMVC 2019 - Cardiff, United Kingdom
Duration: 9 Sep 2019 to 12 Sep 2019


Conference: 30th British Machine Vision Conference, BMVC 2019
Country/Territory: United Kingdom

