3rd Workshop on Scalable 3D Scene Generation and Geometric Scene Understanding
ECCV 2026 Workshop
Introduction
Large-scale geometric scene understanding is one of the most critical and long-standing research directions in computer vision, with impactful applications in autonomous driving and robotics. Recently, there has been a surge of interest in 3D scene generation, driven by its wide-ranging applications in gaming, augmented reality (AR), and virtual reality (VR). These technologies have been transforming our lives and enabling significant commercial opportunities. Both academia and industry have been investing heavily in making these methods more efficient and in scaling them to large scenes.
The efficiency and quality of large-scale reconstruction and generation depend on the 3D representations and priors used to solve the problem. Moreover, industries such as robotics, autonomous driving, and gaming have distinct requirements for the quality and efficiency of the resulting 3D scene structures. The proposed workshop will gather top researchers and engineers from both academia and industry to discuss the key future challenges in scalable 3D scene generation and geometric scene understanding.
Call For Papers
Call for papers: We invite papers of up to 8 pages (in ECCV26 format) on tasks related to 3D generation, reconstruction, and geometric scene understanding. As accepted papers will be included in the ECCV workshop proceedings, dual submissions are not accepted. Paper topics may include but are not limited to:
- Scalable large-scale 3D scene generation
- Efficient 3D representation learning for large-scale 3D scene reconstruction
- Learning the compositional structure of 3D scenes and scalable 3D object-centric learning
- 3D reconstruction and generation for dynamic scenes (with humans and/or rigid objects such as cars)
- Online learning for scalable 3D scene reconstruction
- Foundation models for 3D geometric scene understanding
- 3D reconstruction and generation for AR/VR/robotics, etc.
- Datasets for large-scale scene reconstruction and generation (with moving objects)
- Multi-modal 3D scene generation and geometric understanding
Submission: We encourage submissions of up to xx pages, excluding references and acknowledgements. The submission should be in the ECCV format. Reviewing will be double-blind. Please submit your paper to the following address by the deadline: Submission Portal
Important Dates
| Paper submission deadline | July 1st, 2026 |
| Notifications to accepted papers | July 18th, 2026 |
| Paper camera ready | August 10th, 2026 |
| Workshop date | xx xx 2026 |
Schedule
| Welcome | 2:05pm - 2:10pm |
| TBD TBD | 2:10pm - 2:40pm |
| TBD TBD | 2:40pm - 3:10pm |
| TBD TBD | 3:10pm - 3:40pm |
| Coffee Break and Poster Session | 3:40pm - 4:40pm |
| TBD TBD | 4:40pm - 5:10pm |
| TBD TBD | 5:10pm - 5:40pm |
| Concluding Remarks | 5:40pm - 5:45pm |
Invited Speakers
Angela Dai is an Associate Professor at the Technical University of Munich where she leads the 3D AI Lab. Angela's research aims to enable machines to understand, model, and generate real-world 3D environments. She focuses on enabling the creation of rich, semantically grounded, and interactable 3D worlds that allow machines not only to perceive physical spaces, but to reason about them and act within them. She received her PhD in computer science from Stanford in 2018, advised by Pat Hanrahan, and her BSE in computer science from Princeton in 2013. Her research has been recognized through an ECVA Young Researcher Award, ERC Starting Grant, Eurographics Young Researcher Award, German Pattern Recognition Award, Google Research Scholar Award, and an ACM SIGGRAPH Outstanding Doctoral Dissertation Honorable Mention. She has also served as Program Chair for Eurographics 2025 and CVPR 2026.
TBD.
Andreas Geiger is a Professor at the University of Tübingen in Tübingen, Germany, at the heart of CyberValley, where he heads the Autonomous Vision Group (AVG). He is the head of the Department of Computer Science, a core faculty member of the Tübingen AI Center, and a PI in the cluster of excellence ML in Science and the CRC Robust Vision. He is also an ELLIS fellow and coordinator of the ELLIS PhD program. His research group develops machine learning models for computer vision, natural language and robotics with applications in self-driving, VR/AR and scientific document analysis.
TBD
Richard Zhang is a Professor in the School of Computing Science at SFU, and also Vice President of AI and R&D at Augmenta. At Augmenta, he leads the company's effort in developing the most advanced AI models and tools for efficient and sustainable building designs. More broadly, they are exploring spatial and functional intelligence to tackle challenges in the physical world and contributing to the R&D ecosystem for physical AI. He obtained his Ph.D. from the Dynamic Graphics Project (DGP) at the University of Toronto, and MMath and BMath degrees from the University of Waterloo. He directs the GrUVi (Graphics U Vision) Lab, one of the top places in the world to conduct computer graphics and computer vision research. His research is in computer graphics and, more broadly, visual computing, with special interests in geometric and generative modeling, shape analysis, 3D vision, spatial AI, geometric deep learning, as well as computational design and fabrication. He has published more than 200 papers on these topics, including 75+ articles in SIGGRAPH (+Asia) and ACM Trans. on Graphics (TOG), the top venue in computer graphics, and he has an Erdős number of 3. His research has been sponsored by Adobe, Autodesk, Boeing, Glodon, Google, and NSERC. He is an IEEE Fellow (see SFU news coverage), holds a Distinguished University Professorship, and is a member of the ACM SIGGRAPH Academy. He was an Amazon Scholar from late 2021 to early 2025.
Gim Hee Lee is an Associate Professor in the Department of Computer Science at the National University of Singapore (NUS). He received his PhD in Computer Science from ETH Zurich and was previously a researcher at Mitsubishi Electric Research Laboratories (MERL), USA. He has served as Area Chair for leading conferences such as CVPR, ICCV, ECCV, ICLR, and NeurIPS, and has held organizing roles including Program Chair of 3DV 2022, Demo Chair of CVPR 2023, and General Chair of 3DV 2025. He is a recipient of the Singapore NRF Investigatorship (Class of 2024). His research focuses on 3D computer vision and robotics.
Amir Zamir is an Assistant Professor of computer science at the Swiss Federal Institute of Technology (EPFL). His research is in computer vision and machine learning. Before joining EPFL in 2020, he was with UC Berkeley, Stanford, and UCF. He has received paper awards at SIGGRAPH 2022, CVPR 2020, CVPR 2018, and CVPR 2016, as well as the NVIDIA Pioneering Research Award 2018, the PAMI Everingham Prize 2022, and the ECCV/ECVA Young Researcher Award 2022. His research has been covered by press outlets such as The New York Times and Forbes. He was the chief scientist of Aurora Solar, a Forbes AI 50 company, from 2015 to 2022 and currently serves as the chief scientist of Duranta Inc.
Title: Multimodal Scene Understanding
Organizers
Australian National University, Australia
NVIDIA, US
EPFL, Swiss Data Science Center (SDSC), Switzerland
University of Pennsylvania, US
Australian National University, Australia
Australian National University & Google, Australia
Contact
To contact the organizers, please use S3DSGR@gmail.com
Acknowledgments
Thanks to visualdialog.org for the webpage format.
The Microsoft CMT service was used for managing the peer-reviewing process for this conference. This service was provided for free by Microsoft and they bore all expenses, including costs for Azure cloud services as well as for software development and support.




