Capturing and re-animating the 3D structure of articulated objects present significant barriers. On one hand, methods requiring extensively calibrated multi-view setups are prohibitively complex and resource-intensive, limiting their practical applicability. On the other hand, while single-camera Neural Radiance Fields (NeRFs) offer a more streamlined approach, they suffer from excessive training and rendering costs.
3D Gaussian Splatting would be a suitable alternative, but for two limitations: first, existing methods for dynamic 3D Gaussians require synchronized multi-view cameras; second, they lack controllability in dynamic scenarios. We present CoGS, a method for Controllable Gaussian Splatting that enables direct manipulation of scene elements, offering real-time control of dynamic scenes without the prerequisite of pre-computing control signals. We evaluated CoGS on both synthetic and real-world datasets containing dynamic objects of varying difficulty. In these evaluations, CoGS consistently outperformed existing dynamic and controllable neural representations in terms of visual fidelity.
We show results on both real-world and synthetic scenes. For synthetic scenes, we demonstrate the effectiveness of two novel losses, denoted Ldiff and Lnorm. With Ldiff, Gaussian trajectories are more consistent across timesteps. With Lnorm, the static portions of the scene (e.g., the Lego base) stabilize, as evidenced by reduced trajectory fluctuations over time.
Video comparisons: Jumping Jacks w/o Ldiff vs. w/ Ldiff; Lego w/o Lnorm vs. w/ Lnorm.
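The exact formulations of Ldiff and Lnorm are given in the paper. As a rough, hedged illustration of the idea only, the sketch below treats Ldiff as a temporal-smoothness penalty on per-Gaussian deformation offsets and Lnorm as a penalty on offset magnitude; the function names, offsets-based interpretation, and loss weights are all assumptions, not the paper's implementation.

```python
import torch

def l_diff(offsets_t: torch.Tensor, offsets_t_prev: torch.Tensor) -> torch.Tensor:
    """Assumed form of Ldiff: penalize per-Gaussian offset changes between
    consecutive timesteps, encouraging temporally consistent trajectories."""
    return (offsets_t - offsets_t_prev).norm(dim=-1).mean()

def l_norm(offsets_t: torch.Tensor) -> torch.Tensor:
    """Assumed form of Lnorm: penalize offset magnitude so that static regions
    (e.g., the Lego base) stay close to their canonical positions."""
    return offsets_t.norm(dim=-1).mean()

# Stand-in offsets for N Gaussians at two consecutive timesteps.
N = 1024
offsets_prev = torch.randn(N, 3, requires_grad=True) * 0.01
offsets_curr = offsets_prev + torch.randn(N, 3) * 0.001

# The weights below are illustrative, not values from the paper.
loss = 1.0 * l_diff(offsets_curr, offsets_prev) + 0.1 * l_norm(offsets_curr)
loss.backward()
```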
With our method, we can directly control dynamic 3D Gaussians. Use the sliders below to set the state of each scene.
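To make the slider-style control concrete, here is a minimal, hypothetical sketch of how a scalar control value could drive a deformation field over Gaussian centers; the DeformationField module, its architecture, and the control parameterization are assumptions for illustration, not CoGS's actual design.

```python
import torch
import torch.nn as nn

class DeformationField(nn.Module):
    """Hypothetical deformation MLP: maps a Gaussian center plus a scalar
    control value (e.g., a slider position in [0, 1]) to a position offset."""
    def __init__(self, hidden: int = 128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(3 + 1, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 3),
        )

    def forward(self, centers: torch.Tensor, control: float) -> torch.Tensor:
        # Broadcast the slider value to every Gaussian and predict offsets.
        c = torch.full((centers.shape[0], 1), control, device=centers.device)
        return self.net(torch.cat([centers, c], dim=-1))

# Dragging a slider simply re-evaluates the field at a new control value
# and re-renders the splats at centers + offsets.
field = DeformationField()
centers = torch.randn(1024, 3)      # canonical Gaussian centers (stand-in data)
for slider in (0.0, 0.5, 1.0):      # three slider positions
    deformed = centers + field(centers, slider)
```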
@article{yu2023cogs,
author = {Yu, Heng and Julin, Joel and Milacski, Zoltan A and Niinuma, Koichiro and Jeni, Laszlo A},
title = {CoGS: Controllable Gaussian Splatting},
journal = {arXiv},
year = {2023},
}