saxon.me
NLP/Vision+Language PhD Candidate @ UCSB
Evals, metrics, multilinguality, multiculturality, multimodality, and (dabbling in) reasoning
https://saxon.me/

I have at least two funny meme LM evaluation project ideas for any interested collaborators

Breaking News: I am just getting word that they are in fact called "Multi-", not "Wulti-modal models". Thank you @shaily99.bsky.social for informing me

Check out our CVPR tutorial "Evaluating Large Wulti-modal Models: Challenges and Methods" on Wednesday from 1-5pm in room 109! Unfortunately, I won't be there to present but my labmate Xiao will share some of my slides during her section :) lmm-understand.github.io

broooo what if some of those that work forces are the same that burn crosses

This piece rhetorically asks: "Should the climate movement start demanding that everyone stop listening to Spotify? Would that be a good use of our time?" unfortunately I think many would say 'yes'. andymasley.substack.com/p/individual...

"The cost of living has increased, but the cost of owning has increased more" says the rent hike letter from a landlord who had funded 0 repairs since I've lived here, in CA with frozen property tax

A cheeseburger uses a lot more water than a ChatGPT request 🍔 Actual farms, not the data center variety, are sucking up groundwater more quickly than surface water, explains @markgongloff.bsky.social 🎥

Proposed to cut number of people involved in NSF activities by 70%. We are literally on the chopping block. Call your reps.

To be honest, I kinda love grok? (when it isn't being Elonbotomized to be a racism machine) So many rightoid maniacs query it expecting to see their conspiracist beliefs echoed back at them only to repeatedly get gently corrected with factual information lmao

I cannot stop thinking about Andor. Masterpiece, must watch for pretty much everyone imo

Sent my thesis in to my committee this week, will defend June 2 at 1pm PT! If you're interested in catching it on zoom, here's a calendar link! calendar.google.com/calendar/u/0...

Despite the clickbaity title, this is a great level-headed piece from a real scientist who tried working in AI for science. The key point that AI is a tool, not an all-encompassing revolution, is common sense, but the details are interesting and illuminating open.substack.com/pub/understa...

If we just add a few more annoying tasks for authors and a few more for reviewers we can fix peer review in AI!

According to a 2021 report, the University of California system:
• generated $82B in economic activity in California
• supported 529K jobs in the state
• generated $21 in economic output for every $1 received
Public divestment from higher ed makes no sense, even in the narrowest economic terms.

Michael News! I will be joining the Tech Policy Lab at the University of Washington @ischool.uw.edu and UW NLP working with @aylincaliskan.bsky.social as a postdoc in the fall, to work on situated evaluation, multimodal/lingual/cultural genAI, and new directions in safety, fairness, and alignment!

"Women are PIs on 58% of the canceled grants, although they are PIs on only 34% of all active NSF grants. Similarly, Blacks are PIs on 17% of the terminated grants, although they make only 4% of the total pool. Hispanic PIs and those with disabilities were twice as likely to lose a grant."

There's no escape! Even in my sister's bar admission ceremony the bar president starts talking about AI 🤣

We were interviewed for IEEE Spectrum about reasoning models! spectrum.ieee.org/chain-of-tho...

What is it about the City of Berkeley and Country of England that makes interest in AI safety and weirder fringe stuff like AI consciousness so prevalent? Like why are these topics so big there and not in like Seattle or Pittsburgh??

Finally a study mix I can get behind www.youtube.com/watch?v=0tR5...

"LLM on way to replace doctors" gets published in Nature. meanwhile "LLM judgement not as good as human MDs" gets a spot in "Physical Therapy and Rehabilitation Journal".

Very interesting oral history -- interviews with some top NLP folks on the effects of GenAI on their field: www.quantamagazine.org/when-chatgpt...

We won an outstanding paper award!! 2025.naacl.org/blog/best-pa...

PSA for NAACL peeps from a southwest boi (sadly I won't be there): be sure to find a place to eat New Mexico style stacked enchiladas. You can get it "Christmas style" where it's served with both red and green Hatch chile. The Hatch chile is integral, do not skip. Not photogenic, but very delicious

I wondered if it could really be all that bad from the beginning; after all, users are signing up to publicly interact with each other on a forum. But woof, I don't think I would have signed off on this broad of a "the LM is allowed to impersonate this" policy

So basically, there was a Signal chat with tech folks and some Harper's Letter writers, and the Harper's folk were chased out when Andreessen realized they would not go along with censorship. But the tech guys stuck with Chris Rufo. www.semafor.com/article/04/2...

"Man it's sad that every single one of these trailers is a franchise sequel we're never gonna get an original movie again are we" he said, sitting in the theater to watch a rerelease of Revenge of the Sith