willsmithvision.bsky.social
Professor in Computer Vision at the University of York, vision/graphics/ML research, Boro @mfc.co.uk fan and climber πŸ“York, UK πŸ”— https://www-users.york.ac.uk/~waps101/
52 posts 1,003 followers 510 following

I just pushed a new paper to arXiv. I realized that a lot of my previous work on robust losses and nerf-y things was dancing around something simpler: a slight tweak to the classic Box-Cox power transform that makes it much more useful and stable. It's this f(x, Ξ») here:
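The post doesn't show the tweaked f(x, λ) itself, but for reference, the classic Box-Cox power transform it builds on can be sketched in a few lines (the function name here is illustrative):

```python
import math

def box_cox(x, lam):
    """Classic Box-Cox power transform, defined for x > 0.

    (x**lam - 1) / lam for lam != 0, with log(x) as the lam -> 0 limit.
    """
    if lam == 0:
        return math.log(x)
    return (x ** lam - 1.0) / lam

# lam = 1 recovers a shifted identity; the lam -> 0 limit approaches log(x)
assert box_cox(1.0, 0.7) == 0.0
assert abs(box_cox(2.0, 1e-9) - math.log(2.0)) < 1e-6
```

The well-known instability is exactly that λ → 0 limit: the closed form divides by λ, which is why implementations special-case λ = 0.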

#CVPR2025 Area Chair update: depending on which time zone the review deadline is specified in, we are past or close to the review deadline. Of the 60 reviews needed for my batch, I currently have 52 and they have been coming in quite fast this morning. In general, review standard looks good.

Image matching and ChatGPT - new post in the wide baseline stereo blog. tl;dr: it is good, even feels human, but not perfect. ducha-aiki.github.io/wide-baselin...

This simple PyTorch trick will cut your GPU memory use in half / double your batch size (for real). Instead of summing the losses and then calling backward once, it's better to call backward on each loss as you compute it (which frees that loss's computational graph). The results will be exactly identical.
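A minimal sketch of the pattern (the model, data, and chunking here are illustrative): summing losses and calling backward once, versus calling backward per loss, accumulates identical gradients, but the per-loss version frees each chunk's graph as soon as its backward runs.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
model = torch.nn.Linear(16, 1)
data = torch.randn(8, 16)
target = torch.randn(8, 1)

# Method A: sum the losses, one backward (all graphs stay alive until the end)
model.zero_grad()
loss = sum(F.mse_loss(model(chunk), t)
           for chunk, t in zip(data.split(4), target.split(4)))
loss.backward()
grads_sum = [p.grad.clone() for p in model.parameters()]

# Method B: backward per loss (each chunk's graph is freed immediately)
model.zero_grad()
for chunk, t in zip(data.split(4), target.split(4)):
    F.mse_loss(model(chunk), t).backward()
grads_per_loss = [p.grad.clone() for p in model.parameters()]

# Gradients accumulate in .grad, so both methods give the same result
assert all(torch.allclose(a, b) for a, b in zip(grads_sum, grads_per_loss))
```

One caveat worth knowing: this works when the losses come from separate forward passes (as above); if two losses share one forward graph, the first backward frees the shared graph and the second will error without `retain_graph=True`.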

Me and my friend-since-before-school @ekd.bsky.social (a law academic) have written a blog post about the NotebookLM podcast generator in the style of, well, a corny podcast dialogue: slsablog.co.uk/blog/blog-po... 1/5

Entropy is one of those formulas that many of us learn, swallow whole, and even use regularly without really understanding. (E.g., where does that β€œlog” come from? Are there other possible formulas?) Yet there's an intuitive & almost inevitable way to arrive at this expression.
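As a concrete anchor for the formula being discussed, H(p) = -Σᵢ pᵢ log pᵢ, with the base-2 log counting yes/no questions (this snippet is illustrative, not from the thread):

```python
import math

def entropy(p, base=2):
    # -sum p_i log p_i, skipping zero-probability outcomes (0 log 0 := 0)
    return -sum(q * math.log(q, base) for q in p if q > 0)

# A fair coin carries 1 bit of uncertainty per flip
assert abs(entropy([0.5, 0.5]) - 1.0) < 1e-12
# A certain outcome carries none
assert entropy([1.0]) == 0.0
# 8 equally likely outcomes need log2(8) = 3 yes/no questions
assert abs(entropy([1 / 8] * 8) - 3.0) < 1e-12
```

The "three yes/no questions" reading of the last case is one version of the intuitive route the post alludes to: the log appears because question counts multiply as outcome counts multiply.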

"Sora is a data-driven physics engine." x.com/chrisoffner3...

Multistable Shape from Shading Emerges from Patch Diffusion #NeurIPS2024 Spotlight X. Nicole Han, T. Zickler and K. Nishino (Harvard+Kyoto) Diffusion-based SFS lets you sample multistable shape perception! Nicole at poster on Th 12/12 11am East A-C 1308 vision.ist.i.kyoto-u.ac.jp/research/mss...

To kick off using Bluesky: our new dataset called Oxford Spires. Synchronised multi-colour cameras and lidar in multiple Oxford colleges. Highly accurate ground-truth 3D maps from tripod scanners. The ideal basis for NeRF/3DGS SLAM research. dynamic.robots.ox.ac.uk/datasets/oxf...

Introducing MegaSaM! Accurate, fast, & robust structure + camera estimation from casual monocular videos of dynamic scenes! MegaSaM outputs camera parameters and consistent video depth, scaling to long videos with unconstrained camera paths and complex scene dynamics!

OK If we are moving to Bluesky I am rescuing my favourite ever twitter thread (Jan 2019). The renamed: Bluesky-sized history of neuroscience (biased by my interests)

Really cool new work out of DeepMind for video game world generation using latent diffusion! Soon you'll be able to speedrun a game just by tricking a model into morphing you from one location to another. deepmind.google/discover/blo...

How to drive your research forward? "I tested the idea we discussed last time. Here are some results. It does not work. (… awkward silence)" Such conversations happen so many times in meetings with students. How do we move forward? You need …

For my first post on Bluesky: this talk I gave at the recent BMVA one-day meeting on World Models is a good summary of my work on Computer Vision, Robotics and SLAM, and my thoughts on a bigger picture of #SpatialAI. youtu.be/NLnPG95vNhQ?...

I am a first time Area Chair for #CVPR2025 so, in the interests of transparency, I'll post some updates here on the various stages of the process. There are 708 (!) ACs (not that long ago, CVPR could have coped with 708 *reviewers*!) We've been allocated 18.27 papers on average (I have 20).

Hello Bluesky! 🔵 We start our account by welcoming our third guest for the Ask Me Anything session #3DV2025AMA! Noah Snavely @snavely.bsky.social from Cornell & Google DeepMind! 🌟 🕒 You now have 24 HOURS to ask him anything — drop your questions in the comments below! Keep it engaging but respectful!

Introducing Generative Omnimatte: A method for decomposing a video into complete layers, including objects and their associated effects (e.g., shadows, reflections). It enables a wide range of cool applications, such as video stylization, compositions, moment retiming, and object removal.

A real-time (or very fast) open-source txt2video model dropped: LTXV. HF: huggingface.co/Lightricks/L... Gradio: huggingface.co/spaces/Light... Github: github.com/Lightricks/L... Look at that prompt example though. Need to be a proper writer to get that quality.

I used πŸ“πŸ”— emojis to maximize Twitter/Bluesky parity in my profile. This is definitely pointless, but it's fun.

NeurIPS Conference is now Live on Bluesky! -NeurIPS2024 Communication Chairs

We've released our paper "Generating 3D-Consistent Videos from Unposed Internet Photos"! Video models like Luma generate pretty videos, but sometimes struggle with 3D consistency. We can do better by scaling them with 3D-aware objectives. 1/N page: genechou.com/kfcw

For those who missed this post on the-network-that-is-not-to-be-named, I made public my "secrets" for writing a good CVPR paper (or any scientific paper). I've compiled these tips of many years. It's long but hopefully it helps people write better papers. perceiving-systems.blog/en/post/writ...

As we approach the one year anniversary of a T-PAMI submission still waiting for first reviews, I imagine "With Associate Editor" to mean they sit in a lotus position atop a Himalayan peak, our paper and the reviews in their hand as they meditate (indefinitely) on what recommendation to make.

🍏 New preprint alert! 🍏 PoM: Efficient Image and Video Generation with the Polynomial Mixer arxiv.org/abs/2411.12663 This is my latest "summer project" and it was so big I had to call in reinforcements (Thanks @nicolasdufour.bsky.social) TL;DR Transformers are for boomers, welcome to the future πŸ§΅πŸ‘‡

I'm slowly putting my intro to ML course material on github, starting with the lab sessions: github.com/davidpicard/... These are self-contained notebooks in which you have to implement famous algorithms from the literature (k-NN, SVM, DT, etc), with a custom dataset that I (painstakingly) made!
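The lab format can be illustrated with a tiny example of the first algorithm mentioned: a minimal k-NN classifier in NumPy (this snippet is illustrative, not taken from the repo):

```python
import numpy as np

def knn_predict(X_train, y_train, X_test, k=3):
    # Euclidean distance between every test point and every training point
    d = np.linalg.norm(X_test[:, None, :] - X_train[None, :, :], axis=-1)
    # Indices of the k nearest training points for each test point
    nn = np.argsort(d, axis=1)[:, :k]
    # Majority vote among the neighbours' labels
    return np.array([np.bincount(v).argmax() for v in y_train[nn]])

# Two well-separated clusters with integer class labels
X = np.array([[0.0, 0.0], [0.0, 1.0], [5.0, 5.0], [5.0, 6.0]])
y = np.array([0, 0, 1, 1])
pred = knn_predict(X, y, np.array([[0.0, 0.5], [5.0, 5.5]]), k=3)
# Each test point lands in its nearby cluster: pred is [0, 1]
```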