UGSDF: Leveraging 2D Priors and SDF Guidance for Dynamic Urban Scene Rendering

University of Heidelberg · RRC, IIIT-Hyderabad · VLM Run · MBZUAI · IIT Kharagpur
Accepted at ICCV 2025


teaser

Motivation: Existing dynamic reconstruction methods rely heavily on LiDAR, 3D motion annotations, or object templates, which limits scalability and generalization. Can we achieve high-quality dynamic 4D reconstruction and rendering using only 2D priors?

TL;DR: We introduce UGSDF, a novel approach that fuses Gaussian Splatting and SDFs to achieve temporally consistent, high-fidelity 4D reconstructions of dynamic urban scenes from sparse RGB videos and only 2D priors.

Abstract

Dynamic scene rendering and reconstruction play a crucial role in computer vision and augmented reality. Recent methods based on 3D Gaussian Splatting (3DGS) have enabled accurate modeling of dynamic urban scenes, but they require both camera and LiDAR data, ground-truth 3D segmentations, and motion data in the form of tracklets or pre-defined object templates such as SMPL. In this work, we explore whether a combination of 2D object-agnostic priors, in the form of depth and point tracking, coupled with a signed distance function (SDF) representation for dynamic objects can relax some of these requirements. We present a novel approach that integrates SDFs with 3DGS to create a more robust object representation by harnessing the strengths of both methods. Our unified optimization framework enhances the geometric accuracy of 3D Gaussian Splatting and improves deformation modeling within the SDF, resulting in a more adaptable and precise representation. We demonstrate that our method achieves near state-of-the-art rendering metrics on urban scenes even without LiDAR data. Furthermore, when incorporating LiDAR, our approach surpasses existing methods in reconstructing and generating novel views across diverse object categories, without ground-truth 3D motion annotations. Additionally, our method enables various scene-editing tasks, including scene decomposition and scene composition.



Method

UGSDF integrates 3D Gaussian Splatting (3DGS) and Signed Distance Functions (SDFs) to accurately model and render dynamic urban scenes. The method takes only 2D priors (depth maps from a monocular depth network and point tracks from a 2D tracker) and uses them to derive 3D geometry and motion cues, without requiring LiDAR or ground-truth 3D motion annotations. It builds a canonical 3D model of each dynamic object from this depth and tracking data, then jointly learns SDF and 3D Gaussian representations. The Gaussians provide high-fidelity rendering and guide surface sampling for the SDF, while the SDF refines Gaussian placement and enforces geometric smoothness. This bi-directional guidance yields accurate reconstructions and realistic novel-view synthesis of vehicles and pedestrians in real-world urban environments.
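The bi-directional guidance described above can be sketched in a few lines. In this toy NumPy sketch, an analytic unit-sphere SDF stands in for the learned SDF network and bare 3D centers stand in for the Gaussians; all function names, thresholds, and scales are illustrative assumptions, not the paper's code.

```python
import numpy as np

# Toy sketch of bi-directional guidance: the SDF pulls Gaussian centers onto
# the surface, and the Gaussians supply near-surface sample points for SDF
# supervision. A unit sphere stands in for the learned SDF.

def sphere_sdf(p, radius=1.0):
    """Signed distance to a sphere: negative inside, positive outside."""
    return np.linalg.norm(p, axis=-1) - radius

def numerical_gradient(sdf, points, eps=1e-4):
    """Central-difference SDF gradient, normalized to unit length."""
    grads = np.zeros_like(points)
    for i in range(points.shape[1]):
        offset = np.zeros(points.shape[1])
        offset[i] = eps
        grads[:, i] = (sdf(points + offset) - sdf(points - offset)) / (2 * eps)
    norms = np.linalg.norm(grads, axis=-1, keepdims=True)
    return grads / np.clip(norms, 1e-8, None)

def project_to_surface(points, sdf):
    """SDF -> Gaussians: one Newton-style step along the SDF gradient
    moves each Gaussian center onto the zero level set."""
    return points - sdf(points)[:, None] * numerical_gradient(sdf, points)

def sample_near_surface(centers, sigma=0.05, rng=None):
    """Gaussians -> SDF: jittered Gaussian centers concentrate SDF
    training samples near the current surface estimate."""
    rng = rng if rng is not None else np.random.default_rng(0)
    return centers + rng.normal(scale=sigma, size=centers.shape)

rng = np.random.default_rng(0)
centers = rng.normal(size=(256, 3)) * 1.5      # noisy Gaussian centers
refined = project_to_surface(centers, sphere_sdf)
samples = sample_near_surface(refined)
```

After projection the centers lie on the sphere's zero level set, and the jittered samples cluster around it, mirroring how each representation regularizes the other during joint optimization.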

overview

UGSDF takes images, depth maps, and 2D tracking data as input to jointly learn 3D Gaussian and SDF representations, producing realistic renderings of dynamic urban scenes

SDF network

An MLP-based SDF network predicts signed distance values for each point, deforming observations into a canonical space for consistent shape modeling
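A minimal, untrained NumPy forward pass illustrating this design; the layer widths, the scalar time input, and the additive deformation are assumptions for the sketch, not the paper's architecture. A deformation MLP maps a point observed at time t into canonical space, where a second MLP predicts its signed distance.

```python
import numpy as np

rng = np.random.default_rng(42)

def mlp(x, weights):
    """Plain fully connected network with ReLU hidden activations."""
    for i, (W, b) in enumerate(weights):
        x = x @ W + b
        if i < len(weights) - 1:
            x = np.maximum(x, 0.0)
    return x

def init(sizes, rng):
    """Random (untrained) weights for a stack of linear layers."""
    return [(rng.normal(scale=0.1, size=(m, n)), np.zeros(n))
            for m, n in zip(sizes[:-1], sizes[1:])]

# Deformation MLP: (x, y, z, t) -> offset into canonical space.
deform_w = init([4, 64, 64, 3], rng)
# Canonical SDF MLP: (x, y, z) -> signed distance.
sdf_w = init([3, 64, 64, 1], rng)

def sdf_at_time(points, t):
    pt = np.concatenate([points, np.full((len(points), 1), t)], axis=1)
    canonical = points + mlp(pt, deform_w)   # deform observation -> canonical
    return mlp(canonical, sdf_w)[:, 0]       # query canonical-space SDF

points = rng.normal(size=(8, 3))
d = sdf_at_time(points, t=0.5)               # one signed distance per point
```

Because every time step is deformed into the same canonical space before the SDF query, the object's shape is modeled once and shared across the sequence.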

SDF guided densification

The SDF network guides where to add or remove Gaussian primitives, improving geometric accuracy and surface fidelity in dynamic regions
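A hedged sketch of what such guidance could look like; the thresholds and jitter scale are illustrative, not the paper's values. Gaussians whose centers drift far from the zero level set are pruned, while near-surface Gaussians are cloned to densify the reconstruction.

```python
import numpy as np

# Illustrative SDF-guided densification: the SDF value at each Gaussian
# center decides whether that Gaussian is kept, pruned, or cloned.

def densify(centers, sdf_values, prune_tau=0.2, clone_tau=0.02, rng=None):
    rng = rng if rng is not None else np.random.default_rng(0)
    dist = np.abs(sdf_values)
    kept = centers[dist < prune_tau]      # drop Gaussians far off-surface
    near = centers[dist < clone_tau]      # clone Gaussians near the surface
    clones = near + rng.normal(scale=0.01, size=near.shape)
    return np.concatenate([kept, clones], axis=0)

rng = np.random.default_rng(1)
centers = rng.uniform(-2.0, 2.0, size=(500, 3))
sdf_vals = np.linalg.norm(centers, axis=1) - 1.0   # toy unit-sphere SDF
new_centers = densify(centers, sdf_vals)
```

The surviving set is concentrated in a thin shell around the surface, which is where extra primitives most improve geometric accuracy and surface fidelity.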



Qualitative Results

qualitative results



Poster

[Download PDF]

BibTeX


@InProceedings{Tourani_2025_ICCV,
  author    = {Tourani, Siddharth and Reddy, Jayaram and Kumbar, Akash and Tourani, Satyajit and Goyal, Nishant and Krishna, Madhava and Reddy, N Dinesh and Khan, Muhammad Haris},
  title     = {Leveraging 2D Priors and SDF Guidance for Urban Scene Rendering},
  booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV)},
  month     = {October},
  year      = {2025},
  pages     = {29051-29063}
}

Acknowledgements

Page template borrowed from Nerfies.