emilevankrieken.com
Post-doc @ University of Edinburgh. Neurosymbolic Machine Learning, Generative Models, NLP https://www.emilevankrieken.com/
248 posts 4,118 followers 983 following

🚀 By *learning* to compress the KV cache in Transformer LLMs, we can generate more tokens for the same compute budget. This unlocks *inference-time hyper-scaling*: for the same runtime or memory load, we can boost LLM accuracy by pushing reasoning even further!
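To make the trade-off concrete, here is a back-of-the-envelope sketch (mine, not the paper's method): under a fixed KV-cache memory budget, compressing the cache by a factor r leaves room for r times as many token positions. All model dimensions below are hypothetical.

```python
# Back-of-the-envelope sketch: how KV-cache compression trades into
# longer generations under a fixed memory budget. Dimensions are made up.

def kv_cache_bytes(seq_len, n_layers=32, n_heads=32, head_dim=128,
                   bytes_per_elem=2):
    # 2x for keys and values; one cache entry per layer, head, position.
    return 2 * n_layers * n_heads * head_dim * seq_len * bytes_per_elem

budget = kv_cache_bytes(seq_len=8_192)   # memory for an 8k-token cache
for ratio in (1, 2, 4, 8):               # hypothetical compression ratios
    # With the cache compressed by `ratio`, the same budget holds
    # `ratio` times as many token positions.
    print(f"compression x{ratio}: ~{8_192 * ratio:,} tokens in budget "
          f"({budget / 2**30:.1f} GiB)")
```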

Ever since changing from physics to AI and studying probabilistic ML, I have been wondering how all of this relates to quantum information theory 🎲. I believe I've gotten a whole lot closer to the answer. arxiv.org/abs/2506.01824 I'll be presenting the work at this year's @auai.org in Rio 😎

📢 New #paper on creativity & multi-token prediction! We design minimal open-ended tasks to argue: → LLMs are limited in creativity as they learn to predict the next token → creativity can be improved via multi-token learning & injecting noise ("seed-conditioning" 🌱) 1/ #MLSky #AI #arxiv 🧵👇🏽
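As a rough illustration of the two ingredients (multi-token prediction and seed-conditioning), here is a minimal PyTorch sketch. The architecture and all names are my assumptions, not the paper's: a shared trunk with k heads predicting the next k tokens at once, with a random seed vector concatenated to the input as a crude form of noise injection.

```python
# Minimal sketch (assumptions mine, not the paper's architecture).
import torch
import torch.nn as nn

class MultiTokenHead(nn.Module):
    def __init__(self, d_model=256, vocab=1000, k=4, d_seed=16):
        super().__init__()
        self.trunk = nn.Sequential(nn.Linear(d_model + d_seed, d_model),
                                   nn.GELU())
        # One linear head per future position t+1 ... t+k.
        self.heads = nn.ModuleList(nn.Linear(d_model, vocab)
                                   for _ in range(k))

    def forward(self, h, seed):
        # Condition the trunk on the injected noise ("seed").
        z = self.trunk(torch.cat([h, seed], dim=-1))
        return torch.stack([head(z) for head in self.heads], dim=1)

h = torch.randn(8, 256)             # hidden state at position t
seed = torch.randn(8, 16)           # the injected noise ("seed")
logits = MultiTokenHead()(h, seed)  # predicts tokens t+1..t+4 jointly
print(logits.shape)                 # torch.Size([8, 4, 1000])
```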

โณโณโณ One more day to the NeSy abstract submission! Don't miss it! Full timeline at: 2025.nesyconf.org/call-for-pap...

๐Ÿ—“๏ธ Deadline extended: ๐Ÿ’ฅ2nd June 2025!๐Ÿ’ฅ We are looking forward to your works on: ๐Ÿ”Œ #circuits and #tensor #networks ๐Ÿ•ธ๏ธ โณ normalizing #flows ๐Ÿ’จ โš–๏ธ scaling #NeSy #AI ๐Ÿฆ• ๐Ÿš… fast and #reliable inference ๐Ÿ” ...& more! please share ๐Ÿ™

New preprint, w/ @predictivebrain.bsky.social! We've found that visual cortex, even when just viewing natural scenes, predicts *higher-level* visual features. This aligns with developments in ML, but challenges some assumptions about early sensory cortex. www.biorxiv.org/content/10.1...

A new blog post with intuitions behind continuous-time Markov chains, a building block of diffusion language models, like @inceptionlabs.bsky.social's Mercury and Gemini Diffusion. This post touches on different ways of looking at Markov chains, connections to point processes, and more.
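For readers who want to play with the building block the post mentions, here is a toy simulation of a continuous-time Markov chain via Gillespie's algorithm; the 3-state rate matrix is made up for the example and is not from the blog post.

```python
# Toy CTMC simulation (Gillespie's algorithm). Rows of the rate matrix Q
# sum to zero; off-diagonal entries are jump rates between states.
import numpy as np

Q = np.array([[-1.0,  0.6,  0.4],
              [ 0.3, -0.8,  0.5],
              [ 0.2,  0.2, -0.4]])

rng = np.random.default_rng(0)
state, t, path = 0, 0.0, [(0.0, 0)]
while t < 10.0:
    rate = -Q[state, state]              # total exit rate of current state
    t += rng.exponential(1.0 / rate)     # holding time ~ Exp(rate)
    probs = Q[state].clip(min=0) / rate  # distribution over next states
    state = rng.choice(3, p=probs)
    path.append((t, state))
print(path[:5])                          # (jump time, new state) pairs
```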

‘intermediate tokens, often anthropomorphized as "thoughts" or reasoning traces’ 🌶️ but true! really glad to see work approaching inference scaling more skeptically and objectively

Strong Platonic Representation Hypothesis: all embedding models, given large enough scale, can be translated between one another without paired data. Security implication: embeddings aren't encryption, they're basically plain text. arxiv.org/abs/2505.12540
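A simplified sketch of what "translating between embedding spaces" means. The paper's striking result is doing this *without* paired data; for brevity this toy version cheats and fits an orthogonal (Procrustes) map from pairs.

```python
# Orthogonal Procrustes alignment between two synthetic embedding spaces.
# (Simplified: uses paired data, unlike the paper.)
import numpy as np

rng = np.random.default_rng(0)
d = 64
X = rng.normal(size=(500, d))                      # embeddings, model A
R_true, _ = np.linalg.qr(rng.normal(size=(d, d)))  # hidden rotation
Y = X @ R_true + 0.01 * rng.normal(size=(500, d))  # embeddings, model B

# R = argmin ||X R - Y||_F over orthogonal R, via SVD of X^T Y.
U, _, Vt = np.linalg.svd(X.T @ Y)
R = U @ Vt
print(np.linalg.norm(X @ R - Y) / np.linalg.norm(Y))  # ~0.01: near-perfect
```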

Rio is looking forward to seeing all your works on #fast #reliable #tractable #models #circuits #tensor #networks #flows! please share!

Join us for our next I-X seminar with Dr Antonio Vergari @nolovedeeplearning.bsky.social titled "Logically-Consistent Deep Learning". 🕓 13:00 - 14:00 (GMT) 📅 Thursday, 5 June 📍 Hybrid (White City Campus / Microsoft Teams) Register via link below. www.imperial.ac.uk/events/19426...

This is a new experience: people using AI to overhype your paper 🫢 @neuralnoise.com

Q: how to design a diffusion model that 1️⃣ encodes an expressive conditional distribution over discrete labels 2️⃣ has to marginalize over many unseen discrete variables 3️⃣ needs to be aware of symbolic constraints and 4️⃣ ***scales*** ??? A: 👇👇👇

We propose Neurosymbolic Diffusion Models! We find diffusion is especially compelling for neurosymbolic approaches, combining powerful multimodal understanding with symbolic reasoning 🚀 Read more 👇

10 more days until the abstract deadline of the second submission phase! We accept: - Regular full & short papers - Extended abstracts of recently published papers - Industry abstracts Submit at openreview.net/group?id=nes... CFP: 2025.nesyconf.org/call-for-pap...

🎓 Looking for PhD students, postdocs & interns! I'm recruiting for my new lab at NUS School of Computing, focusing on generative modeling, reasoning, and tractable inference. 💡 Interested? Learn more here: liuanji.github.io 🗓️ PhD application deadline: June 15, 2025

When bidding for @neuripsconf.bsky.social I can only see two pages of papers 🤔 and the recommendations aren't that great. Is it intentional that I can only see two pages?

""" Supplementary First-Stage Reviews: LLM-generated reviews will be included as one component of the initial review stage, providing an additional perspective alongside traditional human expert evaluations. """ ... ... ... ...why?

๐Ÿน Job alert: 4 Tenure Track Professorships dedicated to highly qualified junior researchers w/PhD at @jku.at in ๐Ÿ“Œ Reinforcement Learning ๐Ÿ“Œ NLP ๐Ÿ“Œ Neuro-Symbolic AI ๐Ÿ“Œ Knowledge & Data Processing ๐Ÿ“ Linz ๐Ÿ‡ฆ๐Ÿ‡น ๐Ÿ—“๏ธ Apply by May 28 ๐Ÿ”— More info: bit.ly/4j7o0Nh

Just under 10 days left to submit your latest endeavours in #tractable probabilistic models! Join us at TPM @auai.org #UAI2025 and show how to build #neurosymbolic / #probabilistic AI that is both fast and trustworthy!

Bad news: Overleaf is down. The only way to edit LaTeX known to humankind!

[1/2] We've released the code for LegoGPT. Our autoregressive model generates physically stable and buildable designs from text prompts by integrating physics laws and assembly constraints into LLM training and inference. Code: github.com/AvaLovelace1... Website: avalovelace1.github.io/LegoGPT/

For a long time, the biggest problem in machine learning has been improving and understanding robustness and out-of-distribution (OOD) generalization. We just keep making more and more problems in-distribution, but the models still don't generalize out-of-the-box to the tail of problems.

#ICML2025 Is standard RLHF optimal in view of test-time scaling? Unsurprisingly, no. We show that a simple change to the standard RLHF framework, involving *reward calibration* and *reward transformation* (suited to the test-time procedure), is optimal!

New Python package for scalable association rule mining! We propose a novel neurosymbolic method for scalable rule mining from tabular data that can be used for both knowledge discovery and fully interpretable inference. 🧩 github.com/DiTEC-projec... 📜 arxiv.org/pdf/2504.19354 🧵1/5
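For context, the classic task the package scales up looks like the textbook support/confidence baseline below. This is my illustration of the problem setting, not the package's neurosymbolic API; see the linked repo for that.

```python
# Textbook association rule mining on a toy transaction set:
# rules A -> B ranked by support and confidence.
from itertools import combinations

transactions = [{"milk", "bread"}, {"milk", "eggs"},
                {"milk", "bread", "eggs"}, {"bread", "eggs"}]

def support(itemset):
    # Fraction of transactions containing every item in `itemset`.
    return sum(itemset <= t for t in transactions) / len(transactions)

items = {i for t in transactions for i in t}
for a, b in combinations(sorted(items), 2):
    s_ab = support({a, b})
    if s_ab >= 0.5:                      # minimum-support threshold
        conf = s_ab / support({a})       # confidence of rule a -> b
        print(f"{a} -> {b}: support={s_ab:.2f}, confidence={conf:.2f}")
```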

If you are at #AISTATS2025 and are interested in concept erasure, talk to @somnathbrc.bsky.social at Poster Session 1 on Saturday May 3.

in a nutshell, if you are an author of a paper submitted at @neuripsconf.bsky.social: - you or one of your co-authors is expected to be a reviewer - until you submit your reviews, you won't see the reviews for your papers - if you submit poor-quality reviews, your papers can be rejected

🔥 Our work “Where is the Truth? The Risk of Getting Confounded in a Continual World” was accepted as a spotlight poster at ICML! arxiv.org/abs/2402.06434 -> we introduce continual confounding + the ConCon dataset, where confounders over time render continual knowledge accumulation insufficient ⬇️

Oh man, that definition of reasoning... 🫠

🚨 New paper: “Towards Adaptive Self-Normalized IS”, @ IEEE Statistical Signal Processing Workshop. TL;DR: to estimate µ = E_p[f(θ)] with SNIS, instead of doing MCMC on p(θ) or learning a parametric q(θ), we run MCMC directly on p(θ)|f(θ) − µ| (the variance-minimizing proposal). arxiv.org/abs/2505.00372
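A minimal SNIS refresher to ground the abstract: the weights are self-normalized, so the target only needs to be known up to a constant. The Gaussian target and fixed proposal here are toy choices of mine, not the paper's adaptive MCMC scheme.

```python
# Self-normalized importance sampling (SNIS) for µ = E_p[f(θ)],
# with an unnormalized target and a fixed Gaussian proposal.
import numpy as np

rng = np.random.default_rng(0)
log_p = lambda th: -0.5 * (th - 2.0) ** 2   # unnormalized target: N(2, 1)
f = lambda th: th ** 2                      # true µ = 2^2 + 1 = 5

theta = rng.normal(0.0, 3.0, size=100_000)  # proposal q = N(0, 3^2)
log_q = -0.5 * (theta / 3.0) ** 2 - np.log(3.0)
log_w = log_p(theta) - log_q
w = np.exp(log_w - log_w.max())             # stabilize before normalizing
mu_hat = (w * f(theta)).sum() / w.sum()     # self-normalized estimate
print(mu_hat)                               # ≈ 5
```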

MMLU-Redux just touched down at #NAACL2025! ๐ŸŽ‰ Wish I could be there for our "Are We Done with MMLU?" poster today (9:00-10:30am in Hall 3, Poster Session 7), but visa drama said nope ๐Ÿ˜… If anyone's swinging by, give our research some love! Hit me up if you check it out! ๐Ÿ‘‹

New paper accepted to ICML! We present a novel policy optimization algorithm for continuous control with a simple closed form that generalizes DDPG, SAC, etc. to generic stochastic policies: Wasserstein Policy Optimization (WPO).

Aligned Multi-Objective Optimization (A-🐮) has been accepted at #ICML2025! 🎉 We explore optimization scenarios where objectives align rather than conflict, introducing new scalable algorithms with theoretical guarantees. #MachineLearning #AI #Optimization