jennarussell.bsky.social
CS PhD Student @ UMD | Undergrad @ Cornell | https://jenna-russell.github.io/
16 posts 926 followers 386 following

🤔 What if you gave an LLM thousands of random human-written paragraphs and told it to write something new -- while copying 90% of its output from those texts? 🧟 You get what we call a Frankentext! 💡 Frankentexts are surprisingly coherent and tough for AI detectors to flag.

International students will stop coming to American universities if their visas are going to be at risk. This will make our intellectual community poorer and also make tuition more expensive for domestic students.

Introducing 🐻 BEARCUBS 🐻, a “small but mighty” dataset of 111 QA pairs designed to assess computer-using web agents in multimodal interactions on the live web! ✅ Humans achieve 85% accuracy ❌ OpenAI Operator: 24% ❌ Anthropic Computer Use: 14% ❌ Convergence AI Proxy: 13%

Is the needle-in-a-haystack test still meaningful given the giant green heatmaps in modern LLM papers? We create ONERULER 💍, a multilingual long-context benchmark that allows for nonexistent needles. Turns out NIAH isn't so easy after all! Our analysis across 26 languages 🧵👇

⚠️Current methods for generating instruction-following data fall short for long-range reasoning tasks like narrative claim verification. We present CLIPPER ✂️, a compression-based pipeline that produces grounded instructions for ~$0.5 each, 34x cheaper than human annotations.

People often claim they know when ChatGPT wrote something, but are they as accurate as they think? Turns out that while the general population is unreliable, those who frequently use ChatGPT for writing tasks can spot even "humanized" AI-generated text with near-perfect accuracy 🎯

We're hiring new #nlp faculty this year! Asst or Assoc Professors in NLP at UMass CICS -- careers.umass.edu/amherst/en-u...

If you are at #EMNLP2024 you should really check this work from our lab: github.com/Yixiao-Song/... (poster: Tue 4:00-5:30) If you aren't you should still read the paper! It's a great metric to use and build upon!

🌊Heading to #EMNLP2024 tmr, presenting PostMark on Tue. morning! 🔗 arxiv.org/abs/2406.14517 Aside from this, I'd love to chat about: • long-context training • realistic & hard eval • synthetic data • tbh any cool projects people are working on Also, I'm on the lookout for a summer 2025 internship!

Long-form text generation with multiple stylistic and semantic constraints remains largely unexplored. We present Suri 🦙: a dataset of 20K long-form texts & LLM-generated, backtranslated instructions with complex constraints. 📎 arxiv.org/abs/2406.19371

I will be presenting our paper on LMs' performance on long-context reasoning tasks at #EMNLP2024 (Tue 16:00-17:30; Riverfront Hall). Come and chat with us! 🧚🦋