emilevankrieken.com
Post-doc @ University of Edinburgh. Neurosymbolic Machine Learning, Generative Models, NLP https://www.emilevankrieken.com/
248 posts 4,118 followers 983 following

🚀 By *learning* to compress the KV cache in Transformer LLMs, we can generate more tokens for the same compute budget. This unlocks *inference-time hyper-scaling*: for the same runtime or memory load, we can boost LLM accuracy by pushing reasoning even further!
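To make the trade-off concrete, here is a back-of-the-envelope sketch (mine, not the paper's method): under a fixed KV-cache memory budget, compressing the cache by a factor r leaves room for r times as many token positions. All model dimensions below are hypothetical.

```python
# Back-of-the-envelope sketch: how KV-cache compression trades into
# longer generations under a fixed memory budget. Dimensions are made up.

def kv_cache_bytes(seq_len, n_layers=32, n_heads=32, head_dim=128,
                   bytes_per_elem=2):
    # 2x for keys and values; one cache entry per layer, head, position.
    return 2 * n_layers * n_heads * head_dim * seq_len * bytes_per_elem

budget = kv_cache_bytes(seq_len=8_192)   # memory for an 8k-token cache
for ratio in (1, 2, 4, 8):               # hypothetical compression ratios
    # With the cache compressed by `ratio`, the same budget holds
    # `ratio` times as many token positions.
    print(f"compression x{ratio}: ~{8_192 * ratio:,} tokens in budget "
          f"({budget / 2**30:.1f} GiB)")
```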

Ever since changing from physics to AI and studying probabilistic ML, I have been wondering how all of this relates to quantum information theory 🎲. I believe I've gotten a whole lot closer to the answer. arxiv.org/abs/2506.01824 I'll be presenting the work at this year's @auai.org in Rio 😎

📢 New #paper on creativity & multi-token prediction! We design minimal open-ended tasks to argue: → LLMs are limited in creativity as they learn to predict the next token → creativity can be improved via multi-token learning & injecting noise ("seed-conditioning" 🌱) 1/ #MLSky #AI #arxiv 🧵👇🏽
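As a rough illustration of the two ingredients (multi-token prediction and seed-conditioning), here is a minimal PyTorch sketch. The architecture and all names are my assumptions, not the paper's: a shared trunk with k heads predicting the next k tokens at once, with a random seed vector concatenated to the input as a crude form of noise injection.

```python
# Minimal sketch (assumptions mine, not the paper's architecture).
import torch
import torch.nn as nn

class MultiTokenHead(nn.Module):
    def __init__(self, d_model=256, vocab=1000, k=4, d_seed=16):
        super().__init__()
        self.trunk = nn.Sequential(nn.Linear(d_model + d_seed, d_model),
                                   nn.GELU())
        # One linear head per future position t+1 ... t+k.
        self.heads = nn.ModuleList(nn.Linear(d_model, vocab)
                                   for _ in range(k))

    def forward(self, h, seed):
        # Condition the trunk on the injected noise ("seed").
        z = self.trunk(torch.cat([h, seed], dim=-1))
        return torch.stack([head(z) for head in self.heads], dim=1)

h = torch.randn(8, 256)             # hidden state at position t
seed = torch.randn(8, 16)           # the injected noise ("seed")
logits = MultiTokenHead()(h, seed)  # predicts tokens t+1..t+4 jointly
print(logits.shape)                 # torch.Size([8, 4, 1000])
```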

โณโณโณ One more day to the NeSy abstract submission! Don't miss it! Full timeline at: 2025.nesyconf.org/call-for-pap...

๐Ÿ—“๏ธ Deadline extended: ๐Ÿ’ฅ2nd June 2025!๐Ÿ’ฅ We are looking forward to your works on: ๐Ÿ”Œ #circuits and #tensor #networks ๐Ÿ•ธ๏ธ โณ normalizing #flows ๐Ÿ’จ โš–๏ธ scaling #NeSy #AI ๐Ÿฆ• ๐Ÿš… fast and #reliable inference ๐Ÿ” ...& more! please share ๐Ÿ™

New preprint, w/ @predictivebrain.bsky.social! We've found that visual cortex, even when just viewing natural scenes, predicts *higher-level* visual features. This aligns with developments in ML, but challenges some assumptions about early sensory cortex. www.biorxiv.org/content/10.1...

A new blog post with intuitions behind continuous-time Markov chains, a building block of diffusion language models, like @inceptionlabs.bsky.social's Mercury and Gemini Diffusion. This post touches on different ways of looking at Markov chains, connections to point processes, and more.
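For readers who want to play with the building block the post mentions, here is a toy simulation of a continuous-time Markov chain via Gillespie's algorithm; the 3-state rate matrix is made up for the example and is not from the blog post.

```python
# Toy CTMC simulation (Gillespie's algorithm). Rows of the rate matrix Q
# sum to zero; off-diagonal entries are jump rates between states.
import numpy as np

Q = np.array([[-1.0,  0.6,  0.4],
              [ 0.3, -0.8,  0.5],
              [ 0.2,  0.2, -0.4]])

rng = np.random.default_rng(0)
state, t, path = 0, 0.0, [(0.0, 0)]
while t < 10.0:
    rate = -Q[state, state]              # total exit rate of current state
    t += rng.exponential(1.0 / rate)     # holding time ~ Exp(rate)
    probs = Q[state].clip(min=0) / rate  # distribution over next states
    state = rng.choice(3, p=probs)
    path.append((t, state))
print(path[:5])                          # (jump time, new state) pairs
```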

‘intermediate tokens, often anthropomorphized as "thoughts" or reasoning traces’ 🌶️ but true! really glad to see work approaching inference scaling more skeptically and objectively

Strong Platonic Representation Hypothesis: all embedding models, given large enough scale, can be translated between one another without paired data. Security implication: embeddings aren't encryption, they're basically plain text. arxiv.org/abs/2505.12540
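A simplified sketch of what "translating between embedding spaces" means. The paper's striking result is doing this *without* paired data; for brevity this toy version cheats and fits an orthogonal (Procrustes) map from pairs.

```python
# Orthogonal Procrustes alignment between two synthetic embedding spaces.
# (Simplified: uses paired data, unlike the paper.)
import numpy as np

rng = np.random.default_rng(0)
d = 64
X = rng.normal(size=(500, d))                      # embeddings, model A
R_true, _ = np.linalg.qr(rng.normal(size=(d, d)))  # hidden rotation
Y = X @ R_true + 0.01 * rng.normal(size=(500, d))  # embeddings, model B

# R = argmin ||X R - Y||_F over orthogonal R, via SVD of X^T Y.
U, _, Vt = np.linalg.svd(X.T @ Y)
R = U @ Vt
print(np.linalg.norm(X @ R - Y) / np.linalg.norm(Y))  # ~0.01: near-perfect
```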

Rio is looking forward to seeing all your works on #fast #reliable #tractable #models #circuits #tensor #networks #flows! please share!

Join us for our next I-X seminar with Dr Antonio Vergari @nolovedeeplearning.bsky.social titled "Logically-Consistent Deep Learning". 🕓 13:00 - 14:00 (GMT) 📅 Thursday, 5 June 📍 Hybrid (White City Campus / Microsoft Teams) Register via link below. www.imperial.ac.uk/events/19426...

This is a new experience: people using AI to overhype your paper 🫢 @neuralnoise.com

Q: how to design a diffusion model that 1️⃣ encodes an expressive conditional distribution over discrete labels 2️⃣ has to marginalize over many unseen discrete variables 3️⃣ needs to be aware of symbolic constraints and 4️⃣ ***scales*** ??? A: 👇👇👇

We propose Neurosymbolic Diffusion Models! We find diffusion is especially compelling for neurosymbolic approaches, combining powerful multimodal understanding with symbolic reasoning 🚀 Read more 👇

10 more days until the abstract deadline of the second submission phase! We accept: - Regular full & short papers - Extended abstracts of recently published papers - Industry abstracts Submit at openreview.net/group?id=nes... CFP: 2025.nesyconf.org/call-for-pap...

🎓 Looking for PhD students, postdocs & interns! I'm recruiting for my new lab at NUS School of Computing, focusing on generative modeling, reasoning, and tractable inference. 💡 Interested? Learn more here: liuanji.github.io 🗓️ PhD application deadline: June 15, 2025

When bidding for @neuripsconf.bsky.social I can only see two pages of papers 🤔 and the recommendations aren't that great. Is it intentional that I can only see two pages?

""" Supplementary First-Stage Reviews: LLM-generated reviews will be included as one component of the initial review stage, providing an additional perspective alongside traditional human expert evaluations. """ ... ... ... ...why?

๐Ÿน Job alert: 4 Tenure Track Professorships dedicated to highly qualified junior researchers w/PhD at @jku.at in ๐Ÿ“Œ Reinforcement Learning ๐Ÿ“Œ NLP ๐Ÿ“Œ Neuro-Symbolic AI ๐Ÿ“Œ Knowledge & Data Processing ๐Ÿ“ Linz ๐Ÿ‡ฆ๐Ÿ‡น ๐Ÿ—“๏ธ Apply by May 28 ๐Ÿ”— More info: bit.ly/4j7o0Nh

Just under 10 days left to submit your latest endeavours in #tractable probabilistic models! Join us at TPM @auai.org #UAI2025 and show how to build #neurosymbolic / #probabilistic AI that is both fast and trustworthy!

Bad news: Overleaf is down. The only way to edit LaTeX known to humankind!

[1/2] We've released the code for LegoGPT. Our autoregressive model generates physically stable and buildable designs from text prompts by integrating physics laws and assembly constraints into LLM training and inference. Code: github.com/AvaLovelace1... Website: avalovelace1.github.io/LegoGPT/

For a long time, the biggest problem in machine learning has been improving and understanding robustness and out-of-distribution (OOD) generalization. We just keep making more and more problems in-distribution, but the models still don't generalize out-of-the-box to the tail of problems.

#ICML2025 Is standard RLHF optimal in view of test-time scaling? Unsurprisingly, no. We show that a simple change to the standard RLHF framework, involving *reward calibration* and *reward transformation* (suited to the test-time procedure), is optimal!

New Python package for scalable association rule mining! We propose a novel neurosymbolic method for scalable rule mining from tabular data that can be used for both knowledge discovery and fully interpretable inference. 🧩 github.com/DiTEC-projec... 📜 arxiv.org/pdf/2504.19354 🧵1/5
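For context, the classic task the package scales up looks like the textbook support/confidence baseline below. This is my illustration of the problem setting, not the package's neurosymbolic API; see the linked repo for that.

```python
# Textbook association rule mining on a toy transaction set:
# rules A -> B ranked by support and confidence.
from itertools import combinations

transactions = [{"milk", "bread"}, {"milk", "eggs"},
                {"milk", "bread", "eggs"}, {"bread", "eggs"}]

def support(itemset):
    # Fraction of transactions containing every item in `itemset`.
    return sum(itemset <= t for t in transactions) / len(transactions)

items = {i for t in transactions for i in t}
for a, b in combinations(sorted(items), 2):
    s_ab = support({a, b})
    if s_ab >= 0.5:                      # minimum-support threshold
        conf = s_ab / support({a})       # confidence of rule a -> b
        print(f"{a} -> {b}: support={s_ab:.2f}, confidence={conf:.2f}")
```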

If you are at #AISTATS2025 and are interested in concept erasure, talk to @somnathbrc.bsky.social at Poster Session 1 on Saturday May 3.

in a nutshell, if you are an author of a paper submitted at @neuripsconf.bsky.social: - you or one of your co-authors is expected to be a reviewer - until you submit your reviews, you won't see the reviews for your papers - if you submit poor-quality reviews, your papers can be rejected

🔥 Our work “Where is the Truth? The Risk of Getting Confounded in a Continual World” was accepted as a spotlight poster at ICML! arxiv.org/abs/2402.06434 -> we introduce continual confounding + the ConCon dataset, where confounders over time render continual knowledge accumulation insufficient ⬇️

Oh man, that definition of reasoning... 🫠

🚨 New paper: “Towards Adaptive Self-Normalized IS”, @ IEEE Statistical Signal Processing Workshop. TL;DR: to estimate µ = E_p[f(θ)] with SNIS, instead of doing MCMC on p(θ) or learning a parametric q(θ), we run MCMC directly on p(θ)|f(θ) − µ| (the variance-minimizing proposal). arxiv.org/abs/2505.00372
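A minimal SNIS refresher to ground the abstract: the weights are self-normalized, so the target only needs to be known up to a constant. The Gaussian target and fixed proposal here are toy choices of mine, not the paper's adaptive MCMC scheme.

```python
# Self-normalized importance sampling (SNIS) for µ = E_p[f(θ)],
# with an unnormalized target and a fixed Gaussian proposal.
import numpy as np

rng = np.random.default_rng(0)
log_p = lambda th: -0.5 * (th - 2.0) ** 2   # unnormalized target: N(2, 1)
f = lambda th: th ** 2                      # true µ = 2^2 + 1 = 5

theta = rng.normal(0.0, 3.0, size=100_000)  # proposal q = N(0, 3^2)
log_q = -0.5 * (theta / 3.0) ** 2 - np.log(3.0)
log_w = log_p(theta) - log_q
w = np.exp(log_w - log_w.max())             # stabilize before normalizing
mu_hat = (w * f(theta)).sum() / w.sum()     # self-normalized estimate
print(mu_hat)                               # ≈ 5
```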

MMLU-Redux just touched down at #NAACL2025! ๐ŸŽ‰ Wish I could be there for our "Are We Done with MMLU?" poster today (9:00-10:30am in Hall 3, Poster Session 7), but visa drama said nope ๐Ÿ˜… If anyone's swinging by, give our research some love! Hit me up if you check it out! ๐Ÿ‘‹

New paper accepted to ICML! We present a novel policy optimization algorithm for continuous control with a simple closed form that generalizes DDPG, SAC, etc. to generic stochastic policies: Wasserstein Policy Optimization (WPO).

Aligned Multi-Objective Optimization (A-🐮) has been accepted at #ICML2025! 🎉 We explore optimization scenarios where objectives align rather than conflict, introducing new scalable algorithms with theoretical guarantees. #MachineLearning #AI #Optimization