shan23chen.bsky.social - Profile | ThreadSky | a Reddit-style client for Bluesky

shan23chen.bsky.social

PhDing @AIM_Harvard @MassGenBrigham | PhD Fellow @Google | Previously @Bos_CHIP @BrandeisU. More robustness and explainability 🧐 for Health AI. shanchen.dev

36 posts 1,397 followers 223 following

Posts 11 Comments 31

[1/]💡New Paper Large reasoning models (LRMs) are strong in English — but how well do they reason in your language? Our latest work uncovers their limitation and a clear trade-off: Controlling Thinking Trace Language Comes at the Cost of Accuracy 📄Link: arxiv.org/abs/2505.22888

submitted 3 days ago • 1 comment

Agents are all the rage and we need to track their abilities in the medical domain. Enter MedBrowseComp, the 1st benchmark to assess agents' abilities to reason, navigate the web, and search for verifiable med info! Preprint: arxiv.org/abs/2505.14963 Site: moreirap12.github.io/mbc-browse-a...

submitted 10 days ago • 1 comment

✨ What if your face could tell something about how old your body really is? Excited to share our latest paper just published in The Lancet Digital Health (open access!) 👉 www.thelancet.com/journals/lan...

submitted 23 days ago • 2 comments

CALL FOR REMOTE SPEAKERS: Science in the News Seminar Series, hosted by Harvard x Beacon Hill Seminars scientists, engineers & doctors, from academic researchers to industry professionals! 🧑‍🔬🧑‍💻 Email the organizers at [email protected] to sign up for a date! (First-come-first-served)

submitted 87 days ago • 0 comments

We have a NEW PAPER in @naturemedicine.bsky.social on reporting recommendations for addressing the unique challenges of #largelanguagemodels (LLMs) in biomedical applications www.nature.com/articles/s41... #MLSky #StatsSky #medSky #AISky #artificialintelligence #generativeAI #transparency

submitted 145 days ago • 1 comment

I am always worrying about Benzene (my cat)! www.nytimes.com/2024/12/05/w... But please don't stop wearing sunscreen! Sun exposure is a known cancer risk, benzene risks unknown. This article has good tips if you want to minimize benzene exposure. Obligatory Benzene (cat) pic ⬇️

submitted 177 days ago • 1 comment

Team @AnthropicAI & @thesubhashk @joshengels.bsky.social show that SAE features can be good for classification. Good evidence from @arthurconmy.bsky.social & @neelnanda.bsky.social that SAE features are transferable across base and IT models. 🧐 How about LLaVA? tiny.cc/sae1

submitted 178 days ago • 1 comment

Crosscare is accepted at @neuripsconf.bsky.social 🎉 We showed that LLMs are far from grounded in true prevalence, and that grounding across languages is highly inconsistent! Also, a dashboard for people to explore the prevalence data across diseases and racial groups: crosscare.net #NeurIPS2024

submitted 186 days ago • 1 comment

My department is hiring: apply to be my colleague! www.chip.org/employment/i...

submitted 188 days ago • 0 comments

Million thanks to my wonderful advisor @daniellebitterman.bsky.social and all my colleagues and friends!

submitted 196 days ago • 0 comments

Here are some reflections on the many studies we did this year. Tons of progress has been made, but there are still safety concerns. 🧐 Poster at 10:30, Riverfront, at EMNLP2024 🏖️ Happy to chat and connect! 📃 huggingface.co/blog/shanche... 🔊 tinyurl.com/aimpodcast24 @daniellebitterman.bsky.social

submitted 201 days ago • 0 comments