fdellaert.bsky.social
Robotics/Perception Prof at Georgia Tech; Chief AI Officer at Verdant Robotics. Stints at Skydio, B*8, Reality Labs, Google Research. https://dellaert.github.io
39 posts 2,427 followers 333 following

There are two previous historical cases of countries destroying their science and universities, crippling them for decades: Lysenkoism in the USSR and Nazi Germany. The Trump administration will be the third. It's not just budgets: it's research, institutions, expertise, and the training of the next generation.

GTSAM 4.3a0 is out! github.com/borglab/gtsa...
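For anyone trying the release, here is a minimal pose-graph sketch using the Python bindings; the API names below follow recent GTSAM releases and may differ slightly in 4.3a0:

```python
import numpy as np
import gtsam

# A tiny 2D pose graph: a prior on the first pose plus one odometry factor.
graph = gtsam.NonlinearFactorGraph()
prior_noise = gtsam.noiseModel.Diagonal.Sigmas(np.array([0.1, 0.1, 0.05]))
odom_noise = gtsam.noiseModel.Diagonal.Sigmas(np.array([0.2, 0.2, 0.1]))
graph.add(gtsam.PriorFactorPose2(1, gtsam.Pose2(0, 0, 0), prior_noise))
graph.add(gtsam.BetweenFactorPose2(1, 2, gtsam.Pose2(2, 0, 0), odom_noise))

# Deliberately perturbed initial estimates; the optimizer pulls them back.
initial = gtsam.Values()
initial.insert(1, gtsam.Pose2(0.1, -0.1, 0.02))
initial.insert(2, gtsam.Pose2(2.2, 0.1, -0.05))

result = gtsam.LevenbergMarquardtOptimizer(graph, initial).optimize()
print(result)
```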

I know this look well. It's the same one I get when I introduce the Fourier transform to the CS crowd in my computer vision course.

Trump Says He Won’t Rule Out Third Reich

Part 2 of the SLAM handbook is out for public comments! Let us know what you think :-) Issue tracker on GitHub awaits! Link: github.com/SLAM-Handboo...

And yet Joe Rogan puts guests on who say “Vaccines aren’t actually responsible for the reduction in infectious diseases.”

Aria Gen 2 is very impressive, with fully onboard SLAM and various other perception capabilities, all within a 75 g device with hours-long battery life. Processing is on custom silicon. Congrats to the Reality Labs team.

MASt3R-SLAM code release! github.com/rmurai0610/M... Try it out on videos or with a live camera. Work with @ericdexheimer.bsky.social*, @ajdavison.bsky.social (*Equal Contribution)

CRA statement about NSF firings cra.org/cuts-to-nsf-...

Some personal news: as of January I am back full-time at Georgia Tech following a 2-year leave as Verdant Robotics’ CTO. I will continue to be involved with Verdant as part-time Chief AI Officer, thinking strategically about the role of AI in Robotics for Ag.

Gemini is good but too verbose :-)

The visual system of a jumping spider is fascinating. Look at those cones behind the fixed main lenses! The retinas are at the end of the cones. youtu.be/gvN_ex95IcE?...

We've built a simulated driving agent that we trained on 1.6 billion km of driving with no human data. It is SOTA on every planning benchmark we tried. In self-play, it goes 20 years between collisions.

Video Depth Anything: Consistent Depth Estimation for Super-Long Videos. TL;DR: long-video support; Depth Anything V2 with an efficient spatial-temporal head; temporal consistency loss on depth gradients (no geometric priors).
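My reading of that TL;DR, as a rough sketch (the function below is my guess at the idea, not the paper's code): penalize mismatched frame-to-frame depth changes rather than absolute depth values.

```python
import torch

def temporal_gradient_loss(pred: torch.Tensor, ref: torch.Tensor) -> torch.Tensor:
    """Hypothetical temporal consistency loss on depth gradients.

    pred, ref: (T, H, W) depth sequences. Matching frame-to-frame depth
    changes (temporal gradients) encourages consistency over long videos
    without invoking any geometric priors.
    """
    pred_grad = pred[1:] - pred[:-1]  # temporal gradient of prediction
    ref_grad = ref[1:] - ref[:-1]     # temporal gradient of reference
    return (pred_grad - ref_grad).abs().mean()
```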

That’s such a cool idea! (Thanks K for flagging this to me)

First iRIM@GT seminar of 2025: Ani Majumdar! Well attended!

Sign of the times, in 2025

Getting myself set up here. I found the Sky Follower Bridge Chrome plugin pretty helpful (thanks @kawamataryo.bsky.social!) chromewebstore.google.com/detail/sky-f...

3D feed-forward Gaussian Splatting feels like magic, let alone 4D!

100 years ago today, #OTD in 1925, Edwin Hubble announced that Andromeda and other spiral nebulae were definitely separate galaxies outside the Milky Way, in a paper read to an AAS meeting by H.N. Russell. There was no doubt that the Universe was more than just our little island of stars. 🧪 🔭 ⚛️

Just WOW: youtu.be/X2UxtKLZnNo?...

ChatGPT canvas is cool. Maybe a bit slow still.

Awesome stuff!

Convolutional Differentiable Logic Gate Networks @FHKPetersen

Today I used ChatGPT canvas to help me simplify the SE_2(3) exponential map calculation and its Jacobian. 4o looks dumb in comparison to o1 now, though :-(
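For reference, the closed form in question, in my own notation for the extended pose group (not a transcript of the chat): for a tangent vector ξ = (ω, ν, ρ),

```latex
% SE_2(3) elements are 5x5 matrices [R v p; 0 1 0; 0 0 1].
\exp\hat{\xi} =
\begin{pmatrix}
\exp(\hat{\omega}) & J_l(\omega)\,\nu & J_l(\omega)\,\rho \\
0_{1\times 3} & 1 & 0 \\
0_{1\times 3} & 0 & 1
\end{pmatrix},
\qquad
J_l(\omega) = I + \frac{1-\cos\theta}{\theta^{2}}\,\hat{\omega}
            + \frac{\theta-\sin\theta}{\theta^{3}}\,\hat{\omega}^{2},
\quad \theta = \lVert \omega \rVert .
```

Here J_l is the SO(3) left Jacobian; the full Jacobian of the map stacks J_l-style blocks with extra coupling terms, which is exactly the part worth simplifying.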

Introducing 👀Stereo4D👀: a method for mining 4D from internet stereo videos. It enables large-scale, high-quality, dynamic, *metric* 3D reconstructions, with camera poses and long-term 3D motion trajectories. We used Stereo4D to make a dataset of over 100k real-world 4D scenes.

Back in college I took a developmental biology class. This field has come so far with advancing technologies like light sheet microscopy. This video is just plain amazing.

I’m talking differential geometry with ChatGPT o1 like there is no tomorrow. This thing is amazing. On the other hand, I might need my own nuclear power station.

This is so cool!

Zhengqi led a really nice new paper that computes high-quality camera poses and depth maps from everyday videos, not just the standard SLAM-mable sorts of videos you often see. It's fast and robust, and I think it's quite neat.

Fit-NGP: millimetre-accurate 3D object pose estimation from RGB images only, via Instant NGP. What can we do in manipulation with this level of accuracy? From Marwan and Ignacio at the Dyson Robotics Lab, Imperial College, ICRA24. marwan99.github.io/Fit-NGP/ youtu.be/KQ7yH_em3Qg?...

I'd like to introduce what I've been working at @hellorobot.bsky.social: Stretch AI, a set of open-source tools for language-guided autonomy, exploration, navigation, and learning from demonstration. Check it out: github.com/hello-robot/... Thread ->

A common question nowadays: Which is better, diffusion or flow matching? 🤔 Our answer: They’re two sides of the same coin. We wrote a blog post to show how diffusion models and Gaussian flow matching are equivalent. That’s great: It means you can use them interchangeably.
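The core identity, in my own paraphrase of the blog post (one common notation, not necessarily theirs):

```latex
% Forward/interpolant processes, written in one notation:
x_t = \alpha_t\,x_0 + \sigma_t\,\epsilon, \qquad \epsilon \sim \mathcal{N}(0, I).
% Choosing (\alpha_t, \sigma_t) = (1 - t,\; t) gives the flow-matching interpolant
x_t = (1-t)\,x_0 + t\,\epsilon,
% whose velocity target is a linear mix of the diffusion targets:
v_t = \dot{\alpha}_t\,x_0 + \dot{\sigma}_t\,\epsilon = \epsilon - x_0 .
```

So a velocity-prediction (flow matching) target is a reparameterization of the noise-prediction (diffusion) target, up to loss weighting.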

MoSh has won the 2024 SIGGRAPH Asia Test-of-Time Award. What’s MoSh? It takes motion capture markers and returns the animation of a realistic 3D human body in #SMPL-X format. I wrote a blog post to explain why MoSh is still relevant after 10 years. perceiving-systems.blog/en/news/moti...

Turns out Aria glasses are a very useful tool for demonstrating actions to robots: based on egocentric video, we track dynamic changes in a scene graph and use the representation to replay or plan interactions for robots 🔗 behretj.github.io/LostAndFound/ 📄 arxiv.org/abs/2411.19162 📺 youtu.be/xxMsaBSeMXo
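A toy sketch of the representation as I understand it from the abstract (all names hypothetical, not from the paper's code): a scene graph plus a timestamped change log that a robot can replay.

```python
from dataclasses import dataclass, field

@dataclass
class ObjectNode:
    name: str              # e.g. "mug" (hypothetical label)
    pose: tuple            # latest observed pose of the object
    parent: str = "scene"  # supporting node, e.g. "table"

@dataclass
class SceneGraph:
    nodes: dict = field(default_factory=dict)    # name -> ObjectNode
    history: list = field(default_factory=list)  # timestamped change log

    def update(self, t: float, node: ObjectNode) -> None:
        """Record a dynamic change (object moved or re-parented) at time t."""
        self.history.append((t, node.name, node.parent, node.pose))
        self.nodes[node.name] = node

    def replay(self):
        """Yield logged changes in time order, e.g. for a robot to re-execute."""
        yield from sorted(self.history, key=lambda e: e[0])
```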

We implemented undo in @rerun.io by storing the viewer state in the same type of in-memory database we use for the recorded data. Have a look (sound on!)
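The design idea, sketched outside Rerun's actual codebase (class and method names here are hypothetical): treat viewer-state snapshots as just more logged data, so undo/redo is a cursor into the log.

```python
import copy

class ViewerStateLog:
    """Toy undo/redo: keep viewer-state snapshots in an in-memory log.

    Rerun stores viewer state in the same kind of in-memory database as
    recorded data; this sketch just deep-copies snapshots into a list.
    """

    def __init__(self, initial_state: dict):
        self._log = [copy.deepcopy(initial_state)]
        self._cursor = 0  # index of the current state

    def commit(self, state: dict) -> None:
        # A new edit invalidates any redo branch beyond the cursor.
        del self._log[self._cursor + 1:]
        self._log.append(copy.deepcopy(state))
        self._cursor += 1

    def undo(self) -> dict:
        self._cursor = max(0, self._cursor - 1)
        return copy.deepcopy(self._log[self._cursor])

    def redo(self) -> dict:
        self._cursor = min(len(self._log) - 1, self._cursor + 1)
        return copy.deepcopy(self._log[self._cursor])
```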