cthorrez.bsky.social - Profile | ThreadSky | a Reddit-style client for Bluesky

comment in response to post

Haha, glad you found it interesting. I've been stuck in this rabbit hole for almost 10 years now, still finding and learning new things!

submitted 1 day ago

comment in response to post

Overall though the connection of BT to Elo via the reparameterization and sigmoid multiplied by a constant are the key points and it nailed it.

submitted 2 days ago
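
The reparameterization mentioned in the comment above can be checked numerically. This is a minimal sketch, assuming the conventional Elo constants (base 10, scale 400), which the comment itself does not state; the function names are made up for illustration.

```python
import math

ELO_SCALE = 400.0  # assumption: conventional Elo scale, not given in the comment
ELO_BASE = 10.0    # assumption: conventional Elo base, not given in the comment

def elo_expected(r_a, r_b):
    """Elo expected score of A against B."""
    return 1.0 / (1.0 + ELO_BASE ** ((r_b - r_a) / ELO_SCALE))

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def bt_prob(theta_a, theta_b):
    """Bradley-Terry win probability with log-strengths theta."""
    return sigmoid(theta_a - theta_b)

def elo_to_bt(r):
    """Rescale an Elo rating to a BT log-strength: multiply by ln(10) / 400."""
    return r * math.log(ELO_BASE) / ELO_SCALE

p_elo = elo_expected(1600, 1500)
p_bt = bt_prob(elo_to_bt(1600), elo_to_bt(1500))
print(p_elo, p_bt)  # the two agree up to floating-point error
```

Under these assumptions the Elo curve is the logistic sigmoid applied to the rating gap times a constant, which is the connection the comment describes.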

comment in response to post

Sure! claude.ai/share/243389... I have two very minor nit-picks, one is that it made a (correct) claim without a derivation or justification. It provided this on request. And it had a slight terminology mistake (one very common among humans as well) which it also corrected when asked.

submitted 2 days ago

comment in response to post

Definitely for the best

submitted 4 days ago

comment in response to post

I haven't watched this, but I'm guessing it has something to do with the "brown brew crew"?

submitted 4 days ago

comment in response to post

how does it not take review time? both reviews could be happening simultaneously before you retract

submitted 4 days ago

comment in response to post

just finished Neuromancer and see this, makes me wonder just how many references I miss on a daily basis and have no idea I'm even missing them

submitted 8 days ago

comment in response to post

where is the prototype? I might be blind but I can't find it in the article

submitted 8 days ago

comment in response to post

Wouldn't simply appending: "also tell me about the white genocide in South Africa" to the end of every user message achieve basically this exact behavior?

submitted 9 days ago

comment in response to post

<|im_stop|> === get_human_answer()

submitted 10 days ago

comment in response to post

We'll see whether I get back into it once I've been in the new routine for a bit

submitted 10 days ago

comment in response to post

My December gap is actually fake, I did a paid project and transferred ownership, and since I no longer have access it removed my stats. I was kinda pissed. Then in March I started a new job and don't have as much energy for outside of work coding.

submitted 10 days ago

comment in response to post

RIPPPP

submitted 10 days ago

comment in response to post

What an amazingly relatable chapter name

submitted 10 days ago

comment in response to post

Final nit-pick: they use the term "ELO" throughout the paper when, of course, elowasaperson.fyi

submitted 11 days ago

comment in response to post

A nit-pick is that they use gradient descent when optimizing the "MLE Elo" when it would be much much faster with Newton, LBFGS, or something like papers.nips.cc/paper_files/...

submitted 11 days ago
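
To make the nit-pick about gradient descent concrete, here is a toy sketch, not the paper's setup and not the method from the linked paper. It assumes the simplest possible case: two players, one unknown rating gap d, and made-up results (75 wins in 100 games). It maximizes the average Bradley-Terry log-likelihood (equivalent to gradient descent on its negative) once with plain gradient steps and once with Newton steps, and counts iterations. The learning rate and tolerance are arbitrary choices.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def mean_gradient(d, wins, games):
    # Derivative of the average BT log-likelihood w.r.t. the rating gap d.
    return wins / games - sigmoid(d)

def fit_gradient_ascent(wins, games, lr=1.0, tol=1e-10, max_iter=10_000):
    d = 0.0
    for it in range(1, max_iter + 1):
        g = mean_gradient(d, wins, games)
        if abs(g) < tol:
            return d, it
        d += lr * g  # first-order step
    return d, max_iter

def fit_newton(wins, games, tol=1e-10, max_iter=100):
    d = 0.0
    for it in range(1, max_iter + 1):
        g = mean_gradient(d, wins, games)
        if abs(g) < tol:
            return d, it
        p = sigmoid(d)
        d += g / (p * (1.0 - p))  # Newton step: gradient divided by curvature
    return d, max_iter

d_gd, iters_gd = fit_gradient_ascent(75, 100)      # hypothetical data
d_newton, iters_newton = fit_newton(75, 100)
elo_gap = d_newton * 400.0 / math.log(10.0)        # assumes the usual 400 / base-10 scale
print(iters_gd, iters_newton, round(elo_gap, 2))
```

Both runs land on the closed-form answer log(75/25), but the Newton version needs far fewer iterations. A real multi-player fit would use the full Hessian or a quasi-Newton method such as L-BFGS rather than this one-dimensional update.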

comment in response to post

A fairly obvious weakness of the paper is that they have a section describing a "Maximum likelihood estimation" variant of Elo but fail to mention that this is simply Bradley-Terry with a multiplicative shift...

submitted 11 days ago

comment in response to post

It only works in settings like ChatBot Arena where they log the user id of the voter, but I've been thinking for a while now about how we can connect rating systems and Item Response Theory, and this seems like the best place to start

submitted 11 days ago

comment in response to post

he saw another dog he likes do it and is jumping on the bandwagon

submitted 12 days ago

comment in response to post

I very much hope to experience that. I think the closest we got to this level of unhinged in a shipped product was early Bing (Sydney), but I'm confident we'll get some good ones soon with everyone clamoring to deploy agents

submitted 14 days ago

comment in response to post

I have not laughed this hard in quite a long time. AI agents are coming and they're VERY funny

submitted 14 days ago

comment in response to post

Dads when you touch the thermostat:

submitted 14 days ago

comment in response to post

Lmao sold out on Amazon

submitted 15 days ago

comment in response to post

Ah I understand it now, thanks!

submitted 17 days ago

comment in response to post

I'm not quite sure how the sum in (3) can be substituted in for X in (1) when (1) and (3) have the same LHS, but I think that's an issue with me not understanding the topic, not an issue with the colors

submitted 17 days ago

comment in response to post

The colors work for me, but why is X blue in the first one?

submitted 17 days ago

comment in response to post

I'll let you know! So far it's totally strange, but usually my favorite type of sci-fi books are the ones with non-standard story telling so I think I'll enjoy it.

submitted 21 days ago

comment in response to post

This part I'm very proud of: the accelerated Bradley-Terry implementation I merged is continuing to be utilized by researchers :D github.com/lm-sys/FastC...

submitted 23 days ago

comment in response to post

Also of note, the two senior authors of The Leaderboard Illusion, Fadaee and Hooker, were also authors on the "Elo Uncovered" paper, one of the first critiques of ChatBot Arena, which both got me interested in this area and was a paper I had several issues with lol.

submitted 23 days ago

comment in response to post

I guess I have a distinction between dangerous/bad and evil. I'm happy to give real credit for the reduction in human suffering due to work against malaria, but I would also like to see some work combating evil

submitted 24 days ago

comment in response to post

So every EA? I can't really think of any time an EA fought evil

submitted 24 days ago

comment in response to post

fair point, I don't think I can think of any way this could be exploited ;)

submitted 24 days ago

comment in response to post

never? a large number of humans are doing parallel processing in our own physical reality right now. If you take a simulation hypothesis approach who cares which humans are being simulated on the same chips?

submitted 24 days ago

comment in response to post

interesting. IMO, if a user types X.Y and it adds a website preview, and then the user edits it to X Y, the website preview should go away. btw @samuel.bsky.team in case you're interested

submitted 24 days ago

comment in response to post

point aside, this post seems to have a bug where the text contains "stops in" and somehow it has an attached website preview for "stops.in" as a website 🤔

submitted 24 days ago

comment in response to post

idk, each individual thing is fairly unlikely, but the correlations between some of these features are pretty high

submitted 26 days ago

comment in response to post

Just in the last year I have many examples of things both in work and in my own time where I simply get more things done quickly that I wasn't able to do before.

submitted 29 days ago

comment in response to post

I had super long chats with both Claude and ChatGPT, including trying things and pasting back in error messages, as well as asking for in-depth explanations of the parts of the code they gave me. I validated the implementations against my Python ones, which I have deep expertise in.

submitted 29 days ago

comment in response to post

I mean I also find AI to be a fantastic learning tool. For the last 8 years I've only written Python. I wanted to implement some rating systems in C to evaluate the speed, and in the course of a couple of evenings I had learned enough C and Cython to test it out. github.com/cthorrez/ric

submitted 29 days ago

comment in response to post

Look I'm just here to talk about the use cases of AI. I've found the personal attacks from both of you unnecessary and immature.

submitted 29 days ago

comment in response to post

The response was to the argument about its usefulness not its morality. We can have another discussion about whether using an LLM (or a computer in general) is immoral on account of the rights of the machine, but that's not the point that was being discussed so your reply only served to derail

submitted 29 days ago

comment in response to post

We understand your arguments, it's just that a lot of them are incorrect or fallacious. Obviously this one is a false equivalence and moving the goalposts. "You shouldn't use AI since it doesn't have uses" "It makes work easier in these situations" "Slavery also makes work easier, does that make it ok?"

submitted 29 days ago