Unsupervised 3D Learning in the Wild @ICCV2021
The workshop has concluded. A recording of the entire event is available below.
Recent years have witnessed significant progress in learning 3D representations from 2D visual data, which allows us to reason about our 3D world from a few, or even a single, 2D observation at test time. However, many of these approaches rely heavily on synthetic data and/or extensive manual annotations as supervision during training, and hence face challenges when generalizing to complex real-world scenarios. Unsupervised 3D learning in the wild has therefore been gaining popularity, with the goal of understanding our entire 3D world by learning from unannotated, unconstrained (i.e., "in-the-wild") data.
This workshop covers recent advances in unsupervised and weakly supervised 3D learning. More concretely, it focuses primarily on the problem of learning 3D shape, pose, motion, and appearance, as well as illumination and material properties, from "in-the-wild" data such as Internet images and videos, without explicit ground-truth supervision.
Beyond surveying the current state of the art, the workshop aims to shed light on the open challenges and next steps in this field. For example:
- In order to perceive the 3D world in the wild, what level of supervision is needed: well-captured real 3D data, synthetic data, weak annotations (e.g., keypoints, masks), category template shapes, category labels only, or nothing at all?
- Inductive biases are often injected into unsupervised methods to replace explicit ground-truth supervision. Can they actually generalize to in-the-wild environments?
- What are the right representations for effective 3D learning in the wild?
- How should methods trained on in-the-wild data be evaluated, given that ground-truth information is usually unavailable?
- How can we move beyond single-category learning towards more general 3D scene understanding?
- How can we recover complex appearance properties beyond geometry, given that their ground-truth labels are even more difficult to obtain?
Program (playlist)
| Speaker | Title | Recordings |
| --- | --- | --- |
| Tali Dekel | "Layered Neural Representations for Video" | YouTube, bilibili |
| Andreas Geiger | "Generative Neural Scene Representations for 3D-Aware Image Synthesis" | YouTube, bilibili |
| Jingyi Yu | "Neural Modeling and Rendering for Creating Virtual Humans" | YouTube, bilibili |
| David Novotny | "Common Objects in 3D" | YouTube, bilibili |
| Iasonas Kokkinos | "Non-rigid 3D Objects from Correspondence Self-supervision" | YouTube, bilibili |
| Niloy Mitra | "Generative Models for Vector Graphics" | YouTube, bilibili |
| Panel Discussion | | YouTube, bilibili |
| Jitendra Malik | "Differentiable Stereopsis" | YouTube, bilibili |
| Angjoo Kanazawa | "Real-time Rendering of NeRFs with PlenOctrees" | YouTube, bilibili |
| Shubham Tulsiani | "Sparse-view 3D Reconstruction in the Wild" | YouTube, bilibili |
| Noah Snavely | "Towers of Babel: Marrying 3D and Language" | YouTube, bilibili |
| Adel Ahmadyan | "3D Object Understanding with Objectron" | YouTube, bilibili |
| Shalini De Mello | "Can We Use Part Correspondences and Temporal Consistency for Self-Supervised 3D Reconstruction?" | YouTube, bilibili |
| Panel Discussion | | YouTube, bilibili |