cthorrez.bsky.social
LLM applied scientist by day, esports data scientist for fun. Rating systems and datasets with riix and EsportsBench! https://cthorrez.github.io/riix/riix.html https://huggingface.co/datasets/EsportsBench/EsportsBench
624 posts 406 followers 3,204 following

Pretty cool that I wrote the currently deployed ranking code for a company now valued at 600 million dollars 🤯🤯🤯 github.com/lm-sys/FastC... techcrunch.com/2025/05/21/l...

Claude 4 Sonnet gave the best answer so far to my go-to first prompt: "Explain the relationship between Elo and Bradley-Terry from the perspective of machine learning and optimization" But I've used it so many times on ChatBot Arena I assume my conversations are in the training data by now

I actually think Google search is so dead. I just issued a single-word search for a common noun and the Wikipedia page for the thing was on the third page. There is no reason whatsoever in that scenario that it should not be on the first page or in the info side panel

Interesting new paper: arxiv.org/pdf/2505.03475 It's 12 pages which could be described in about 1 sentence: "Bradley-Terry with per-annotator ability parameters" Basically, where Elo uses a constant log(10)/400 temperature, this learns a different temperature for each annotator
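
To make the temperature point concrete, here is a minimal sketch of the idea as described in this post, not the paper's actual implementation. The function names, the annotator labels, and the temperature values are all hypothetical and chosen only for illustration.

```python
import math

# Elo is a logistic (Bradley-Terry) model with a fixed temperature of log(10)/400,
# so a 400-point rating gap corresponds to 10:1 odds.
ELO_TEMPERATURE = math.log(10) / 400


def win_prob(rating_a, rating_b, temperature=ELO_TEMPERATURE):
    """Probability that A is preferred over B under a logistic model."""
    return 1.0 / (1.0 + math.exp(-temperature * (rating_a - rating_b)))


# Hypothetical per-annotator temperatures. In a real fit these would be learned
# jointly with the ratings; here they are made-up constants.
ANNOTATOR_TEMPERATURE = {
    "careful": 2.0 * ELO_TEMPERATURE,   # sharper: votes track the rating gap closely
    "noisy": 0.25 * ELO_TEMPERATURE,    # flatter: votes are closer to a coin flip
}


def win_prob_for_annotator(rating_a, rating_b, annotator):
    """Same model, but the temperature depends on who is voting."""
    return win_prob(rating_a, rating_b, ANNOTATOR_TEMPERATURE[annotator])


if __name__ == "__main__":
    print(win_prob(1400, 1000))
    print(win_prob_for_annotator(1400, 1000, "careful"))
    print(win_prob_for_annotator(1400, 1000, "noisy"))
```

With equal ratings every annotator is at 50/50; the temperature only changes how strongly a rating gap shows up in that annotator's votes.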

I guess I picked the right day to start reading Stand on Zanzibar by John Brunner

Reading The Leaderboard Illusion today. I will say I've been a huge fan of ChatBot Arena ever since the start (and I'm a contributor to it), but I think there are some valid issues worth calling out. That said, I'm really not a fan of the form the discourse has taken on Twitter, which is overly hostile :/