davnords.bsky.social - Profile | ThreadSky | a Reddit-style client for Bluesky

davnords.bsky.social

PhD student @ Chalmers. Deep learning for computer vision.

10 posts 21 followers 60 following

Posts 7 Comments 9

GPT-4o's new image capabilities seem to be liked. The insinuation from OpenAI seems to be that it is not based on diffusion. I wonder how their work relates to the infamous NeurIPS paper "Visual Autoregressive Modeling: Scalable Image Generation via Next-Scale Prediction" (arxiv.org/abs/2404.02905).

submitted 29 days ago • 1 comment

New paper! We merge SfM reconstructions with point cloud registration. Link: arxiv.org/abs/2503.17093 Code: Not yet public, but coming later.

submitted 33 days ago • 9 comments

New paper! (arxiv.org/abs/2503.13433) We look into improving the threshold robustness of Random Sample Consensus (RANSAC) through (less biased) inlier noise scale estimation.

submitted 39 days ago • 4 comments

Introducing VGGT (CVPR'25), a feedforward Transformer that directly infers all key 3D attributes from one, a few, or hundreds of images, in seconds! Project Page: vgg-t.github.io Code & Weights: github.com/facebookrese...

submitted 40 days ago • 3 comments

Introducing DaD, Part 2, a pretty cool keypoint detector.

submitted 46 days ago • 5 comments

We made a new keypoint detector named DaD. The paper isn't up yet, but the code and weights are: github.com/Parskatt/dad

submitted 47 days ago • 7 comments

Common beliefs about equivariant networks for image input include 1) They are slow. 2) They don’t scale to ImageNet. 3) They are complicated. In my opinion, these three are all false. To argue against them, we made minimal modifications to popular vision models, turning them mirror-equivariant.

submitted 75 days ago • 2 comments