dholzmueller.bsky.social
Postdoc in machine learning with Francis Bach & @GaelVaroquaux: neural networks, tabular data, uncertainty, active learning, atomistic ML, learning theory. https://dholzmueller.github.io

Learning rate schedules seem mysterious? Why is the loss going down so fast during cooldown? Turns out that this behaviour can be described with a bound from *convex, nonsmooth* optimization. A short thread on our latest paper 🚞 arxiv.org/abs/2501.18965
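Schedules with a cooldown phase are typically of the warmup-stable-decay family. A minimal sketch in plain Python; the function and parameter names (`wsd_lr`, `warmup_frac`, `cooldown_frac`) and the default fractions are illustrative choices, not taken from the paper:

```python
def wsd_lr(step, total_steps, base_lr=1e-3,
           warmup_frac=0.05, cooldown_frac=0.2):
    """Warmup-stable-decay schedule: linear warmup, constant
    plateau, then a linear cooldown to zero."""
    warmup_steps = max(1, int(warmup_frac * total_steps))
    cooldown_steps = max(1, int(cooldown_frac * total_steps))
    cooldown_start = total_steps - cooldown_steps
    if step < warmup_steps:
        return base_lr * (step + 1) / warmup_steps
    if step < cooldown_start:
        return base_lr
    # linear decay to 0 over the cooldown phase
    remaining = total_steps - step
    return base_lr * remaining / cooldown_steps
```

The rapid loss drop discussed in the thread happens in the final `cooldown` branch, as the learning rate shrinks toward zero.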

Early stopping on validation loss? This leads to suboptimal calibration and refinement errors, but you can do better! With @dholzmueller.bsky.social, Michael I. Jordan, and @bachfrancis.bsky.social, we propose a method that integrates with any model and boosts classification performance across tasks.
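The paper's method isn't reproduced here; for context, the best-known post-hoc recalibration baseline in this space is temperature scaling, which fits a single scalar T on validation logits to minimize negative log-likelihood. A self-contained sketch using grid search (all names are illustrative):

```python
import math

def softmax(logits, T=1.0):
    z = [l / T for l in logits]
    m = max(z)
    exps = [math.exp(v - m) for v in z]
    s = sum(exps)
    return [e / s for e in exps]

def nll(logit_rows, labels, T):
    """Mean negative log-likelihood at temperature T."""
    total = 0.0
    for logits, y in zip(logit_rows, labels):
        total -= math.log(softmax(logits, T)[y])
    return total / len(labels)

def fit_temperature(logit_rows, labels, grid=None):
    """Pick the temperature minimizing validation NLL (grid search)."""
    if grid is None:
        grid = [0.1 * k for k in range(1, 51)]  # T in [0.1, 5.0]
    return min(grid, key=lambda T: nll(logit_rows, labels, T))
```

For overconfident models, the fitted T exceeds 1, softening the predicted probabilities toward the observed label frequencies.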

The first independent evaluation of our RealMLP is here! On a recent 300-dataset benchmark with many baselines, RealMLP takes a shared first place overall. πŸ”₯ Importantly, RealMLP is also relatively CPU-friendly, unlike other SOTA DL models (including TabPFNv2 and TabM). 🧡 1/

Join us on 27 Feb in Amsterdam for the ELLIS workshop on Representation Learning and Generative Models for Structured Data ✨ sites.google.com/view/rl-and-... Inspiring talks by @eisenjulian.bsky.social, @neuralnoise.com, Frank Hutter, Vaishali Pal, TBC. We welcome extended abstracts until 31 Jan!

My book is (at last) out, just in time for Christmas! A blog post to celebrate and present it: francisbach.com/my-book-is-o...

I'll present our paper in the afternoon poster session at 4:30pm - 7:30 pm in East Exhibit Hall A-C, poster 3304!

We wrote a benchmark paper with many practical insights on (the benefits of) active learning for training neural PDE solvers. πŸš€ I was happy to be a co-advisor on this project - most of the credit goes to Daniel and Marimuthu.

Stable model scaling with width-independent dynamics? Thrilled to present 2 papers at #NeurIPS πŸŽ‰ that study width-scaling in Sharpness Aware Minimization (SAM) (Th 16:30, #2104) and in Mamba (Fr 11, #7110). Our scaling rules stabilize training and transfer optimal hyperparams across scales. 🧡 1/10

I'll be at #NeurIPS2024 next week to present this paper (Thu afternoon) as well as a workshop paper on active learning for neural PDE solvers. Let me know if you'd like to chat about tabular data, uncertainty, active learning, etc.!

Proud to announce our NeurIPS spotlight, which was in the works for over a year now :) We dig into why decomposing aleatoric and epistemic uncertainty is hard, and what this means for the future of uncertainty quantification. πŸ“– arxiv.org/abs/2402.19460 🧡1/10
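The usual starting point that such papers examine is the entropy-based decomposition for an ensemble: total predictive entropy = expected member entropy (aleatoric) + mutual information (epistemic). A small sketch of that textbook decomposition, not of the paper's analysis:

```python
import math

def entropy(p):
    return -sum(pi * math.log(pi) for pi in p if pi > 0)

def decompose(member_probs):
    """Entropy-based split of predictive uncertainty for an ensemble.

    member_probs: list of per-member class-probability vectors.
    Returns (total, aleatoric, epistemic), where
    total = H[mean prediction], aleatoric = mean member entropy,
    and epistemic = total - aleatoric (the mutual information)."""
    n = len(member_probs)
    k = len(member_probs[0])
    mean_p = [sum(p[j] for p in member_probs) / n for j in range(k)]
    total = entropy(mean_p)
    aleatoric = sum(entropy(p) for p in member_probs) / n
    return total, aleatoric, total - aleatoric
```

Confident-but-disagreeing members yield mostly epistemic uncertainty; identical uncertain members yield purely aleatoric uncertainty.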

If you have train+validation data, should you refit on the whole data with the stopping epoch found on the train-validation split? In the quoted paper, we ran an experiment including 5-fold ensembles on 5-fold cross-validation splits (bagging) and with refitting. (short 🧡)
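One common heuristic (the post doesn't pin down the exact rule used in the paper) is to combine the per-fold early-stopping epochs into a single epoch count for the refit, e.g. their mean or maximum. A hypothetical helper:

```python
def refit_epoch(best_epochs, rule="mean"):
    """Combine per-fold early-stopping epochs into one epoch count
    for refitting a model on the full train+validation data."""
    if rule == "mean":
        return round(sum(best_epochs) / len(best_epochs))
    if rule == "max":
        return max(best_epochs)
    raise ValueError(f"unknown rule: {rule!r}")
```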

I recently shared some of my reflections on how to use probabilistic classifiers for optimal decision-making under uncertainty at @pydataparis.bsky.social 2024. Here is the recording of the presentation: www.youtube.com/watch?v=-gYn...
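The core textbook result behind this kind of talk (sketched here under my own naming, not necessarily the presentation's exact content): with calibrated probabilities and asymmetric error costs, the Bayes-optimal binary decision thresholds P(y=1) at cost_fp / (cost_fp + cost_fn) rather than at 0.5:

```python
def optimal_threshold(cost_fp, cost_fn):
    """Bayes-optimal threshold on P(y=1) under asymmetric costs."""
    return cost_fp / (cost_fp + cost_fn)

def decide(p_positive, cost_fp=1.0, cost_fn=1.0):
    """Act 'positive' iff that action has lower expected cost."""
    return p_positive >= optimal_threshold(cost_fp, cost_fn)
```

For example, if missing a positive is nine times as costly as a false alarm, the threshold drops to 0.1, so even a 20% positive probability triggers action.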

One thing I learned from this project is that accuracy is quite a noisy metric. With small validation sets (~1K samples), hyperparameter opt. using AUROC instead of accuracy can yield better accuracy on the test set. We also did some experiments on metrics for early stopping. 🧡
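One intuition for the difference: accuracy hard-thresholds each prediction, while AUROC uses the full ranking of scores, so small score perturbations move it more smoothly. A self-contained rank-based (Mann-Whitney) AUROC, with tie handling:

```python
def auroc(labels, scores):
    """AUROC via the Mann-Whitney U statistic: the probability that
    a random positive is scored above a random negative
    (ties count as half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = 0.0
    for sp in pos:
        for sn in neg:
            if sp > sn:
                wins += 1.0
            elif sp == sn:
                wins += 0.5
    return wins / (len(pos) * len(neg))
```

(The O(n_pos * n_neg) loop is for clarity; a sort-based version runs in O(n log n).)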

PyTabKit 1.1 is out!
- Includes TabM and provides a scikit-learn interface
- some baseline NN parameter names are renamed (removed double-underscores)
- other small changes, see the readme.
github.com/dholzmueller...

WIP starter pack with researchers on Table Representation Learning (TRL): all things related to representation learning and generative models for e.g. tables, DBs, spreadsheets! I'll curate, but DMs/replies with a handle + some info are welcome! Also follow @trl-research.bsky.social for updates 🤗 go.bsky.app/4SNSMRj

For those who missed this post on the-network-that-is-not-to-be-named, I made public my "secrets" for writing a good CVPR paper (or any scientific paper). I've compiled these tips over many years. It's long, but hopefully it helps people write better papers. perceiving-systems.blog/en/post/writ...

@bsky.app is the new cool. So is tabular learning? skrub is a library that eases preprocessing and feature engineering for tabular machine learning. skrub-data.org/stable/ Main features:
- scikit-learn compatible
- handles Pandas and Polars dataframes
- works on heterogeneous types