Introducing 👀Stereo4D👀
A method for mining 4D from internet stereo videos. It enables large-scale, high-quality, dynamic, *metric* 3D reconstructions, with camera poses and long-term 3D motion trajectories.
We used Stereo4D to make a dataset of over 100k real-world 4D scenes.
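For intuition on why stereo footage can give *metric* 3D: with a rectified stereo pair and a known camera baseline, pixel disparity maps to depth in real units. Below is a minimal sketch of that standard relation, not the Stereo4D pipeline itself. The function names and the example numbers (focal length, baseline) are hypothetical and chosen only for illustration.

```python
def disparity_to_depth(disparity_px, focal_px, baseline_m):
    """Depth in meters for a rectified stereo pair: Z = f * B / d.

    disparity_px: horizontal pixel offset between left and right views
    focal_px:     focal length in pixels (assumed known)
    baseline_m:   distance between the two cameras in meters (assumed known)
    """
    if disparity_px <= 0:
        # Zero disparity corresponds to a point at infinity.
        return float("inf")
    return focal_px * baseline_m / disparity_px


def backproject(u, v, depth_m, focal_px, cx, cy):
    """Lift pixel (u, v) with metric depth to a 3D point in the camera frame."""
    x = (u - cx) * depth_m / focal_px
    y = (v - cy) * depth_m / focal_px
    return (x, y, depth_m)


# Hypothetical values: 500 px focal length, 6.3 cm baseline, 10 px disparity.
depth = disparity_to_depth(10.0, 500.0, 0.063)
point = backproject(320.0, 240.0, depth, 500.0, 320.0, 240.0)
```

Because the baseline is a physical length, the resulting points come out in meters rather than at an arbitrary scale.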
We gave this a shot by extending DUSt3R to model 3D motion and training it on our dataset. Given a pair of frames, our model predicts a 3D point cloud and the corresponding 3D motion trajectories.
Paper link: https://arxiv.org/abs/2412.09621
Thanks to the great team! Richard Tucker, @zhengqili.bsky.social, David Fouhey, @snavely.bsky.social, @holynski.bsky.social
Please stay tuned for updates on data & code.