1st Workshop on Scalable 3D Scene Generation and Geometric Scene Understanding

ECCV 2024 Workshop

Room Amber 6 - September 29th (9:00am - 1:00pm), 2024


Introduction

Large-scale geometric scene understanding is one of the most critical and long-standing research directions in computer vision, with impactful applications in autonomous driving and robotics. Recently, there has been a surge of interest in 3D scene generation, driven by its wide-ranging applications in the gaming industry, augmented reality (AR), and virtual reality (VR). These technologies are transforming our lives and enabling significant commercial opportunities. Both academia and industry have been investing heavily in making these methods more efficient and in scaling them to large scenes.

The efficiency and quality of large-scale reconstruction and generation depend on the 3D representations and priors used to solve the problem. Moreover, different industries, such as robotics, autonomous driving, and gaming, have distinct requirements on the quality and efficiency of the reconstructed 3D scene structures. This workshop will gather top researchers and engineers from both academia and industry to discuss the key challenges ahead.


Call For Papers

We invite non-archival papers of up to 14 pages (in ECCV format) on tasks related to 3D generation, reconstruction, and geometric scene understanding. Paper topics may include, but are not limited to:

  • Scalable large-scale 3D scene generation
  • Efficient 3D representation learning for large-scale 3D scene reconstruction
  • Learning the compositional structure of 3D scenes; scalable 3D object-centric learning
  • 3D reconstruction and generation for dynamic scenes (with humans and/or rigid objects such as cars)
  • Online learning for scalable 3D scene reconstruction
  • Foundation models for 3D geometric scene understanding
  • 3D reconstruction and generation for AR/VR/robotics, etc.
  • Datasets for large-scale scene reconstruction and generation (with moving objects)

Submission: We encourage submissions of up to 14 pages, excluding references and acknowledgements, in the ECCV format. Reviewing will be double-blind. Please submit your paper through the Submission Portal by the deadline.


Poster Presentation

#1. Few-shot Novel View Synthesis using Depth Aware 3D Gaussian Splatting
      Raja Kumar, Vanshika Vats, University of California Santa Cruz
#2. On Scaling Up 3D Gaussian Splatting Training
      Hexu Zhao, Haoyang Weng, Daohan Lu, Ang Li, Jinyang Li, Aurojit Panda, Saining Xie, New York University
#3. AEPnP: A Less-constrained EPnP Solver for Pose Estimation with Anisotropic Scaling
      Jiaxin Wei, Stefan Leutenegger, Laurent Kneip, Technical University of Munich, ShanghaiTech University
#4. Scalable Indoor Novel-View Synthesis using Drone-Captured 360 Imagery with 3D Gaussian Splatting
      Yuanbo Chen, Chengyu Zhang, Jason Wang, Xuefan Gao, Avideh Zakhor, University of California, Berkeley
#5. SceneTeller: Language-to-3D Scene Generation
      Başak Melis Öcal, Sezer Karaoğlu, Theo Gevers, ECCV 2024
#6. NeRF-MAE: Masked AutoEncoders for Self-Supervised 3D Representation Learning for Neural Radiance Fields
      Muhammad Zubair Irshad, Sergey Zakharov, Vitor Guizilini, Adrien Gaidon, Zsolt Kira, Rares Ambrus, ECCV 2024
#7. BlockFusion: Expandable 3D Scene Generation using Latent Tri-plane Extrapolation
      Zhennan Wu, Yang Li, Han Yan, Taizhang Shang, Weixuan Sun, Senbo Wang, Ruikai Cui, Weizhe Liu, Hiroyuki Sato, Hongdong Li, Pan Ji, SIGGRAPH 2024



Important Dates

Paper submission deadline: July 7th, 2024
Notifications to accepted papers: July 28th, 2024
Camera-ready deadline: August 15th, 2024
Workshop date: September 29th (AM), 2024


Schedule

9:05am - 9:10am   Welcome
9:10am - 9:40am   Zan Gojcic: Neural rendering – from reconstruction to generation
9:40am - 10:10am  Marc Pollefeys: Mapping, Localization and Understanding of Large-Scenes for Robotics and AI
10:10am - 10:40am Vincent Lepetit: Novel methods for the exploration and reconstruction of indoor scenes
10:40am - 11:40am Coffee Break and Poster Session
11:40am - 12:10pm Jiajun Wu: 3D Generation: Scaling from Objects to Scenes to Worlds
12:10pm - 12:40pm Lingjie Liu: Human Pose Tracking and Controllable Motion Generation
12:40pm - 12:50pm Concluding Remarks


Invited Speakers


Marc Pollefeys is a Professor of Computer Science at ETH Zurich and the Director of the Microsoft Mixed Reality and AI Lab in Zurich, where he works with a team of scientists and engineers to develop advanced perception capabilities for HoloLens and Mixed Reality. He is best known for his work in 3D computer vision, having been the first to develop a software pipeline to automatically turn photographs into 3D models, but he also works on robotics, graphics, and machine learning problems. Other noteworthy projects he has worked on include real-time 3D scanning with mobile devices, a real-time pipeline for 3D reconstruction of cities from vehicle-mounted cameras, camera-based self-driving cars, and the first fully autonomous vision-based drone. Most recently, his academic research has focused on combining 3D reconstruction with semantic scene understanding.


Zan Gojcic is a Senior Research Scientist and a Research Manager at NVIDIA Zurich, where he leads a research team focused on neural reconstruction and simulation. He holds a PhD from ETH Zurich and completed a research visit at Stanford University during his doctoral studies. Zan's expertise lies at the intersection of computer vision and computer graphics, with a particular emphasis on 3D vision and neural reconstruction. His research aims to advance large-scale reconstruction and the generation of data-driven simulation environments, enabling the testing and training of end-to-end robotics systems.


Lingjie Liu is the Aravind K. Joshi Assistant Professor in the Department of Computer and Information Science at the University of Pennsylvania, where she leads the Penn Computer Graphics Lab, and she is also a member of the General Robotics, Automation, Sensing & Perception (GRASP) Lab. Previously, she was a Lise Meitner Postdoctoral Research Fellow at the Max Planck Institute for Informatics. She received her Ph.D. degree from the University of Hong Kong in 2019. Her research interests lie at the interface of computer graphics, computer vision, and AI, with a focus on neural scene representations, neural rendering, human performance modeling and capture, and 3D reconstruction.


Vincent Lepetit is a professor at ENPC ParisTech, France. Before that, he was a full professor at the Institute for Computer Graphics and Vision, TU Graz, Austria, and before that, a senior researcher at CVLab, EPFL, Switzerland. His research focuses on 3D scene understanding. More precisely, he aims at reducing as much as possible the guidance a system needs to learn new 3D objects and new 3D environments: How can we remove the need for training data for each new 3D problem? Currently, even self-supervised methods often require CAD models, which are not necessarily available for every type of object. This question has both theoretical implications and practical applications, as the need for training data, even synthetic, is often a deal breaker for non-academic problems. Vincent received the Koenderink "test-of-time" award at the European Conference on Computer Vision 2020 for "BRIEF: Binary Robust Independent Elementary Features". He regularly serves as an area chair of the major computer vision conferences (CVPR, ICCV, ECCV, ACCV, BMVC) and as an editor for PAMI and IJCV.


Jiajun Wu is an Assistant Professor of Computer Science and of Psychology at Stanford University, working on computer vision, machine learning, and computational cognitive science. Before joining Stanford, he was a Visiting Faculty Researcher at Google Research. He received his PhD in Electrical Engineering and Computer Science from the Massachusetts Institute of Technology. Wu's research has been recognized through the Young Investigator Programs (YIP) of ONR and AFOSR, the NSF CAREER award, paper awards and finalists at ICCV, CVPR, SIGGRAPH Asia, CoRL, and IROS, dissertation awards from ACM, AAAI, and MIT, the 2020 Samsung AI Researcher of the Year, and faculty research awards from J.P. Morgan, Samsung, Amazon, and Meta.


Organizers

Miaomiao Liu
Australian National University, Australia
Jose M. Alvarez
NVIDIA, US
Mathieu Salzmann
EPFL, Swiss Data Science Center (SDSC), Switzerland
Buyu Liu
Zhejiang University, China
Hongdong Li
Australian National University, Australia
Richard Hartley
Australian National University & Google, Australia



Contact

To contact the organizers please use S3DSGR@gmail.com



Acknowledgments

Thanks to visualdialog.org for the webpage format.