shan23chen.bsky.social
PhDing @AIM_Harvard @MassGenBrigham | PhD Fellow @Google | Previously @Bos_CHIP @BrandeisU. More robustness and explainability 🧐 for Health AI. shanchen.dev
36 posts 1,397 followers 223 following

[1/] 💡 New Paper: Large reasoning models (LRMs) are strong in English, but how well do they reason in your language? Our latest work uncovers their limitations and a clear trade-off: Controlling Thinking Trace Language Comes at the Cost of Accuracy 📄 Link: arxiv.org/abs/2505.22888

Agents are all the rage and we need to track their abilities in the medical domain. Enter MedBrowseComp, the 1st benchmark to assess agents' abilities to reason, navigate the web, and search for verifiable med info! Preprint: arxiv.org/abs/2505.14963 Site: moreirap12.github.io/mbc-browse-a...

✨ What if your face could tell something about how old your body really is? Excited to share our latest paper just published in The Lancet Digital Health (open access!) 👉 www.thelancet.com/journals/lan...

CALL FOR REMOTE SPEAKERS: Science in the News Seminar Series, hosted by Harvard x Beacon Hill Seminars. Scientists, engineers & doctors, from academic researchers to industry professionals! 🧑‍🔬🧑‍💻 Email the organizers at [email protected] to sign up for a date! (First-come-first-served)

We have a NEW PAPER in @naturemedicine.bsky.social on reporting recommendations for addressing the unique challenges of #largelanguagemodels (LLMs) in biomedical applications www.nature.com/articles/s41... #MLSky #StatsSky #medSky #AISky #artificialintelligence #generativeAI #transparency

I am always worrying about Benzene (my cat)! www.nytimes.com/2024/12/05/w... But please don't stop wearing sunscreen! Sun exposure is a known cancer risk, benzene risks unknown. This article has good tips if you want to minimize benzene exposure. Obligatory Benzene (cat) pic ⬇️

Team @AnthropicAI & @thesubhashk @joshengels.bsky.social show that SAE features can be good for classification. Good evidence from @arthurconmy.bsky.social & @neelnanda.bsky.social that SAE features are transferable across base and IT models. 🧐 How about LLaVA? tiny.cc/sae1

Crosscare is accepted @neuripsconf.bsky.social 🎉 We showed LLMs are far from grounded in true prevalence, and their grounding is highly inconsistent across languages! Also, a dashboard for people to explore the prevalence data across diseases and racial groups: crosscare.net #NeurIPS2024

My department is hiring: apply to be my colleague! www.chip.org/employment/i...

Million thanks to my wonderful advisor @daniellebitterman.bsky.social and all my colleagues and friends!

Here are some reflections on many studies we did this year. Tons of progress has been made, but there are still safety concerns..🧐 Poster 10:30 riverfront at EMNLP2024 🏖️ Happy to chat and connect! 📃 huggingface.co/blog/shanche... 🔊 tinyurl.com/aimpodcast24 @daniellebitterman.bsky.social