dholzmueller.bsky.social
Postdoc in machine learning with Francis Bach & @GaelVaroquaux: neural networks, tabular data, uncertainty, active learning, atomistic ML, learning theory. https://dholzmueller.github.io

Learning rate schedules seem mysterious? Why is the loss going down so fast during cooldown? Turns out that this behaviour can be described with a bound from *convex, nonsmooth* optimization. A short thread on our latest paper 🚞 arxiv.org/abs/2501.18965
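Schedules with a cooldown phase are typically of the warmup-stable-decay family. A minimal sketch in plain Python; the function and parameter names (`wsd_lr`, `warmup_frac`, `cooldown_frac`) and the default fractions are illustrative choices, not taken from the paper:

```python
def wsd_lr(step, total_steps, base_lr=1e-3,
           warmup_frac=0.05, cooldown_frac=0.2):
    """Warmup-stable-decay schedule: linear warmup, constant
    plateau, then a linear cooldown to zero."""
    warmup_steps = max(1, int(warmup_frac * total_steps))
    cooldown_steps = max(1, int(cooldown_frac * total_steps))
    cooldown_start = total_steps - cooldown_steps
    if step < warmup_steps:
        return base_lr * (step + 1) / warmup_steps
    if step < cooldown_start:
        return base_lr
    # linear decay to 0 over the cooldown phase
    remaining = total_steps - step
    return base_lr * remaining / cooldown_steps
```

The rapid loss drop discussed in the thread happens in the final `cooldown` branch, as the learning rate shrinks toward zero.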

Early stopping on validation loss? This leads to suboptimal calibration and refinement errors, but you can do better! With @dholzmueller.bsky.social, Michael I. Jordan, and @bachfrancis.bsky.social, we propose a method that integrates with any model and boosts classification performance across tasks.
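The paper's method isn't reproduced here; for context, the best-known post-hoc recalibration baseline in this space is temperature scaling, which fits a single scalar T on validation logits to minimize negative log-likelihood. A self-contained sketch using grid search (all names are illustrative):

```python
import math

def softmax(logits, T=1.0):
    z = [l / T for l in logits]
    m = max(z)
    exps = [math.exp(v - m) for v in z]
    s = sum(exps)
    return [e / s for e in exps]

def nll(logit_rows, labels, T):
    """Mean negative log-likelihood at temperature T."""
    total = 0.0
    for logits, y in zip(logit_rows, labels):
        total -= math.log(softmax(logits, T)[y])
    return total / len(labels)

def fit_temperature(logit_rows, labels, grid=None):
    """Pick the temperature minimizing validation NLL (grid search)."""
    if grid is None:
        grid = [0.1 * k for k in range(1, 51)]  # T in [0.1, 5.0]
    return min(grid, key=lambda T: nll(logit_rows, labels, T))
```

For overconfident models, the fitted T exceeds 1, softening the predicted probabilities toward the observed label frequencies.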

The first independent evaluation of our RealMLP is here! On a recent 300-dataset benchmark with many baselines, RealMLP takes a shared first place overall. πŸ”₯ Importantly, RealMLP is also relatively CPU-friendly, unlike other SOTA DL models (including TabPFNv2 and TabM). 🧡 1/

Join us on 27 Feb in Amsterdam for the ELLIS workshop on Representation Learning and Generative Models for Structured Data ✨ sites.google.com/view/rl-and-... Inspiring talks by @eisenjulian.bsky.social, @neuralnoise.com, Frank Hutter, Vaishali Pal, TBC. We welcome extended abstracts until 31 Jan!

My book is (at last) out, just in time for Christmas! A blog post to celebrate and present it: francisbach.com/my-book-is-o...

I'll present our paper in the afternoon poster session at 4:30pm - 7:30 pm in East Exhibit Hall A-C, poster 3304!

We wrote a benchmark paper with many practical insights on (the benefits of) active learning for training neural PDE solvers. πŸš€ I was happy to be a co-advisor on this project - most of the credit goes to Daniel and Marimuthu.

Stable model scaling with width-independent dynamics? Thrilled to present 2 papers at #NeurIPS πŸŽ‰ that study width-scaling in Sharpness Aware Minimization (SAM) (Th 16:30, #2104) and in Mamba (Fr 11, #7110). Our scaling rules stabilize training and transfer optimal hyperparams across scales. 🧡 1/10

I'll be at #NeurIPS2024 next week to present this paper (Thu afternoon) as well as a workshop paper on active learning for neural PDE solvers. Let me know if you'd like to chat about tabular data, uncertainty, active learning, etc.!

Proud to announce our NeurIPS spotlight, which was in the works for over a year now :) We dig into why decomposing aleatoric and epistemic uncertainty is hard, and what this means for the future of uncertainty quantification. πŸ“– arxiv.org/abs/2402.19460 🧡1/10
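The usual starting point that such papers examine is the entropy-based decomposition for an ensemble: total predictive entropy = expected member entropy (aleatoric) + mutual information (epistemic). A small sketch of that textbook decomposition, not of the paper's analysis:

```python
import math

def entropy(p):
    return -sum(pi * math.log(pi) for pi in p if pi > 0)

def decompose(member_probs):
    """Entropy-based split of predictive uncertainty for an ensemble.

    member_probs: list of per-member class-probability vectors.
    Returns (total, aleatoric, epistemic), where
    total = H[mean prediction], aleatoric = mean member entropy,
    and epistemic = total - aleatoric (the mutual information)."""
    n = len(member_probs)
    k = len(member_probs[0])
    mean_p = [sum(p[j] for p in member_probs) / n for j in range(k)]
    total = entropy(mean_p)
    aleatoric = sum(entropy(p) for p in member_probs) / n
    return total, aleatoric, total - aleatoric
```

Confident-but-disagreeing members yield mostly epistemic uncertainty; identical uncertain members yield purely aleatoric uncertainty.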

If you have train+validation data, should you refit on the whole data with the stopping epoch found on the train-validation split? In the quoted paper, we ran an experiment including 5-fold ensembles on 5-fold cross-validation splits (bagging) and with refitting. (short 🧡)
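One common heuristic (the post doesn't pin down the exact rule used in the paper) is to combine the per-fold early-stopping epochs into a single epoch count for the refit, e.g. their mean or maximum. A hypothetical helper:

```python
def refit_epoch(best_epochs, rule="mean"):
    """Combine per-fold early-stopping epochs into one epoch count
    for refitting a model on the full train+validation data."""
    if rule == "mean":
        return round(sum(best_epochs) / len(best_epochs))
    if rule == "max":
        return max(best_epochs)
    raise ValueError(f"unknown rule: {rule!r}")
```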

I recently shared some of my reflections on how to use probabilistic classifiers for optimal decision-making under uncertainty at @pydataparis.bsky.social 2024. Here is the recording of the presentation: www.youtube.com/watch?v=-gYn...
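The core textbook result behind this kind of talk (sketched here under my own naming, not necessarily the presentation's exact content): with calibrated probabilities and asymmetric error costs, the Bayes-optimal binary decision thresholds P(y=1) at cost_fp / (cost_fp + cost_fn) rather than at 0.5:

```python
def optimal_threshold(cost_fp, cost_fn):
    """Bayes-optimal threshold on P(y=1) under asymmetric costs."""
    return cost_fp / (cost_fp + cost_fn)

def decide(p_positive, cost_fp=1.0, cost_fn=1.0):
    """Act 'positive' iff that action has lower expected cost."""
    return p_positive >= optimal_threshold(cost_fp, cost_fn)
```

For example, if missing a positive is nine times as costly as a false alarm, the threshold drops to 0.1, so even a 20% positive probability triggers action.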

One thing I learned from this project is that accuracy is quite a noisy metric. With small validation sets (~1K samples), hyperparameter opt. using AUROC instead of accuracy can yield better accuracy on the test set. We also did some experiments on metrics for early stopping. 🧡
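One intuition for the difference: accuracy hard-thresholds each prediction, while AUROC uses the full ranking of scores, so small score perturbations move it more smoothly. A self-contained rank-based (Mann-Whitney) AUROC, with tie handling:

```python
def auroc(labels, scores):
    """AUROC via the Mann-Whitney U statistic: the probability that
    a random positive is scored above a random negative
    (ties count as half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = 0.0
    for sp in pos:
        for sn in neg:
            if sp > sn:
                wins += 1.0
            elif sp == sn:
                wins += 0.5
    return wins / (len(pos) * len(neg))
```

(The O(n_pos * n_neg) loop is for clarity; a sort-based version runs in O(n log n).)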

PyTabKit 1.1 is out!
- Includes TabM and provides a scikit-learn interface
- some baseline NN parameter names are renamed (removed double-underscores)
- other small changes, see the readme.
github.com/dholzmueller...

WIP starter pack with researchers on Table Representation Learning (TRL): all things related to representation learning and generative models for e.g. tables, DBs, spreadsheets! I'll curate, but DMs/replies with a handle + some info are welcome! Also follow @trl-research.bsky.social for updates 🤗 go.bsky.app/4SNSMRj

For those who missed this post on the-network-that-is-not-to-be-named, I made public my "secrets" for writing a good CVPR paper (or any scientific paper). I've compiled these tips over many years. It's long, but hopefully it helps people write better papers. perceiving-systems.blog/en/post/writ...

@bsky.app is the new cool. So is tabular learning? skrub is a library that eases preprocessing and feature engineering for tabular machine learning. skrub-data.org/stable/ Main features:
- scikit-learn compatible
- handles Pandas and Polars dataframes
- works on heterogeneous types