Unsupervised 3D Learning in the Wild @ICCV2021

The workshop is over. The recording of the entire event can be watched below.

Recent years have witnessed significant progress in learning 3D representations from 2D visual data, which allows us to reason about our 3D world from just a few or even a single 2D observation at test time. However, many of these approaches rely heavily on synthetic data and/or extensive manual annotations as supervision during training, and hence face challenges when generalizing to complex real-world scenarios. Unsupervised 3D learning in the wild has therefore been gaining popularity, with the goal of understanding our entire 3D world by learning from unannotated, unconstrained (i.e., "in-the-wild") data.

This workshop aims to cover recent advances in unsupervised and weakly-supervised 3D learning. More concretely, we will primarily focus on the problem of learning 3D shape, pose, motion, and appearance, as well as illumination and material properties, from "in-the-wild" data, such as Internet images and videos, without explicit ground-truth supervision.

Beyond the current state, the workshop will hopefully shed light on the challenges as well as the next steps in this field. For example:

  • In order to perceive the 3D world in the wild, which level of supervision is needed: well-captured real 3D data, synthetic data, weak annotations (e.g., keypoints, masks), category template shapes, category labels only, or nothing at all?
  • Inductive biases are often injected into unsupervised methods to replace explicit ground-truth supervision. Can they actually generalize to in-the-wild environments?
  • What are the right representations for effective 3D learning in the wild?
  • How can we evaluate methods trained on in-the-wild data, given that ground-truth information is usually unavailable?
  • How can we move beyond single-category learning towards more general 3D scene understanding?
  • How can we recover complex appearance properties beyond geometry, given that ground-truth labels for those are even more difficult to obtain?

Program (playlist)

Tali Dekel "Layered Neural Representations for Video" YouTube, bilibili
Andreas Geiger "Generative Neural Scene Representations for 3D-Aware Image Synthesis" YouTube, bilibili
Jingyi Yu "Neural Modeling and Rendering for Creating Virtual Humans" YouTube, bilibili
David Novotny "Common Objects in 3D" YouTube, bilibili
Iasonas Kokkinos "Non-rigid 3D Objects from Correspondence Self-supervision" YouTube, bilibili
Niloy Mitra "Generative Models for Vector Graphics" YouTube, bilibili
Panel Discussion YouTube, bilibili
Jitendra Malik "Differentiable Stereopsis" YouTube, bilibili
Angjoo Kanazawa "Real-time rendering of NeRFs with PlenOctrees" YouTube, bilibili
Shubham Tulsiani "Sparse-view 3D Reconstruction in the Wild" YouTube, bilibili
Noah Snavely "Towers of Babel: Marrying 3D and Language" YouTube, bilibili
Adel Ahmadyan "3D Object Understanding with Objectron" YouTube, bilibili
Shalini De Mello "Can We Use Part Correspondences and Temporal Consistency for Self-Supervised 3D Reconstruction?" YouTube, bilibili
Panel Discussion YouTube, bilibili