eointravers.bsky.social
I'm a Data Scientist, working on responsible AI for mental health. Posting about data, AI, evals, and cognitive science. eointravers.com 🇮🇪
90 posts 118 followers 503 following
Regular Contributor
Active Commenter
comment in response to post
I think deep linear networks are an example of this, where you have a deep model with just the capacity of regular linear regression.
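For illustration, a minimal numpy sketch (shapes and names are mine): stacking linear layers without a nonlinearity between them collapses to a single matrix product, i.e. plain linear-regression capacity.

```python
import numpy as np

rng = np.random.default_rng(0)

# Two stacked linear layers, no nonlinearity between them (arbitrary shapes).
W1 = rng.normal(size=(8, 4))   # layer 1: 4 -> 8
W2 = rng.normal(size=(3, 8))   # layer 2: 8 -> 3
x = rng.normal(size=4)

deep = W2 @ (W1 @ x)           # the "deep" linear network
shallow = (W2 @ W1) @ x        # one equivalent linear map

assert np.allclose(deep, shallow)  # same function; depth adds no capacity
```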
comment in response to post
Bullshitting Engines?
comment in response to post
To be fair, if you're doing analytic philosophy, "bullshit engine" reads as an engine that is bullshit, not an engine that engages in the communicative act of bullshitting; engines don't communicate or have concerns, they set things in motion.
comment in response to post
Sorry, broke the second link: www.cell.com/neuron/fullt...
comment in response to post
Yup: pmc.ncbi.nlm.nih.gov/articles/PMC... (commentary: www.cell.com/neuron/fullt...). Lots of later work on the same idea since then as well.
comment in response to post
It's in the opposite direction, though: once your participants know how you want them to behave, they're pretty likely to behave that way. en.m.wikipedia.org/wiki/Demand_...
comment in response to post
I think it's a choose-your-own-null adventure kind of thing, like you're worried it might be. I can see the (bad) argument: if you allow lots of parameters to vary on either side of the dotted line, any differences prove that the line is important?
comment in response to post
Um. statmodeling.stat.columbia.edu/2021/11/21/s...
comment in response to post
If you think different slopes are bad, you should see www.tandfonline.com/doi/abs/10.1... (and related posts: statmodeling.stat.columbia.edu?s=regression...). Polynomial madness.
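To make the "madness" concrete, a hedged sketch (the data, cutoff, and degrees here are all invented): fit each side of a cutoff separately, as regression-discontinuity designs do, and watch the high-order fit hallucinate a jump that isn't there.

```python
import numpy as np

rng = np.random.default_rng(1)

# Smooth underlying relationship with no real discontinuity at x = 0.
x = np.linspace(-1, 1, 60)
y = np.sin(2 * x) + rng.normal(scale=0.2, size=x.size)

left, right = x < 0, x >= 0
for degree in (1, 8):
    # Fit each side of the cutoff separately and compare at the boundary.
    fit_l = np.polyval(np.polyfit(x[left], y[left], degree), 0.0)
    fit_r = np.polyval(np.polyfit(x[right], y[right], degree), 0.0)
    print(f"degree {degree}: estimated jump at cutoff = {fit_r - fit_l:+.3f}")

# High-degree polynomials chase noise near the boundary, so the estimated
# "jump" can be sizeable even though the true jump is exactly zero.
```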
comment in response to post
For one-in-three, it's ~70.4%. One-in-four, ~68.4%. As n increases, the answer gets closer and closer to 1 - 1/e ≈ 63.2%, where e is Euler's number. en.wikipedia.org/wiki/E_(math... Why 1 - 1/e? Honestly, you would have to ask someone better at maths than me, but I think it's a pretty cool result.
comment in response to post
So the prob. that it does happen at least once is the probability that it *doesn't not happen*: 1 - (1 - 1/n)^n. For a one-in-two chance, this works out as 1 - (1 - 1/2)^2 = 1 - 1/4 = 75%.
comment in response to post
The prob. of trying twice and it not happening is the prob. of it not happening the first time, times the prob. of it not happening the second time: (1 - 1/n) * (1 - 1/n), or (1 - 1/n)^2. The prob. of it not happening in n attempts is (1 - 1/n)^n.
comment in response to post
If you take a one-in-n chance, the probability of it coming off is 1/n. If you roll a six-sided die, the probability of rolling a 6 is 1/6. The prob. of the event not occurring is one minus the probability that it does occur: 1 - 1/n.
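Putting the steps in this thread together, a quick numerical check (a minimal sketch; the function name is mine):

```python
import math

def p_at_least_once(n: int) -> float:
    """Probability that a 1-in-n event happens at least once in n tries."""
    return 1 - (1 - 1 / n) ** n

for n in (2, 3, 4, 100, 1_000_000):
    print(f"n = {n:>9}: {p_at_least_once(n):.4f}")

print(f"limit, 1 - 1/e: {1 - 1 / math.e:.4f}")  # ~0.6321
```

The convergence falls out of the classic limit definition of e: (1 - 1/n)^n → e^(-1) as n grows, so the hit probability tends to 1 - 1/e.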
comment in response to post
But first, @xkcd.com xkcd.com/882/
comment in response to post
Poker is probably an interesting case study here, because AFAIK expert poker players don't try to solve K-level theory of mind problems; they just have really good heuristics.
comment in response to post
In principle, that might mean we can get LLMs to reason under uncertainty pretty well if we fine-tune on the right heuristics?
comment in response to post
There's an old idea in psychology (e.g. core.ac.uk/download/pdf...) that when people do perform well under uncertainty, it's because they're pattern matching using the right heuristics, rather than doing Bayesian inference.
comment in response to post
To be fair, humans are famously also pretty bad at it, so this one might be a draw.
comment in response to post
Yes, but also: xkcd.com/927/
comment in response to post
and, as an illustration, uses an LLM to quickly check the job spec against my requirements. A browser extension is a more mature way of achieving the same thing, but would take me considerably more time to get up and running.
comment in response to post
Just realised you're doing this in the Buttondown UI, so you have no need for a nice python abstraction, but I had my fun anyway.
comment in response to post
This is nice. I got sidetracked by this, and accidentally spent 10 minutes ALMOST figuring out how to make this compatible with `+` and `|` operators. Make of that what you will.
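For anyone wondering what I was attempting, a minimal sketch of that kind of operator overloading (the class and its semantics are invented): Python's `__add__` and `__or__` dunders let objects compose with `+` and `|`.

```python
class Step:
    """Toy composable step; purely illustrative."""
    def __init__(self, name):
        self.name = name

    def __add__(self, other):
        # `a + b`: run both steps in sequence.
        return Step(f"({self.name} then {other.name})")

    def __or__(self, other):
        # `a | b`: fall back to `other` if `self` fails.
        return Step(f"({self.name} else {other.name})")

print((Step("fetch") + Step("parse") | Step("retry")).name)
# -> ((fetch then parse) else retry)
```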
comment in response to post
YES! I've been planning to write more about the importance of this stuff (largely, to be fair, for selfish reasons, since I'm a psych-methods-to-AI person, and I'm on the job market). eointravers.com/blog/convo-e...
comment in response to post
[...] would benefit from learning transferable experimental psychology and AI skills by doing this as dissertation or even group projects.
comment in response to post
Thanks. Those links are...interesting. 🤔
comment in response to post
TLDR: conversations are graphs, and each node contains a) a prompt guiding what the chatbot should say, and b) possible classifications for the user's response, dictating which node we go to next. That's most of what you need!
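A minimal sketch of that structure (all the names here are mine, not from any particular framework):

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    prompt: str  # guides what the chatbot should say at this point
    routes: dict = field(default_factory=dict)  # response label -> next node

# Toy graph: classify the user's response, then follow the matching edge.
graph = {
    "greet": Node("Ask how the user is feeling.",
                  {"positive": "wrap_up", "negative": "explore"}),
    "explore": Node("Ask what has been difficult lately.", {}),
    "wrap_up": Node("Reflect back and close the conversation.", {}),
}

def next_node(current: str, label: str) -> str:
    # `label` would come from classifying the user's response (e.g. with an LLM).
    return graph[current].routes.get(label, current)

print(next_node("greet", "negative"))  # -> explore
```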
comment in response to post
Come on.
comment in response to post
Ah.
comment in response to post
Also, a different domain, but this work on using LLMs as proxy human participants for cognitive modelling might be of interest: arxiv.org/abs/2502.00879
comment in response to post
ref 4 (Argyle) seems to be suggesting generalising from LLMs to humans. Sorry for the rabbit hole, my question is just: Who is openly saying we should study LLMs in place of humans, so I can avoid them? Thanks!
comment in response to post
I'm sure there are crackpots on the internet saying this, but is this something actual social scientists say? 😟 A lot of what I've seen on this, including some of your early refs, essentially say "you can do this, but obviously you should only do it for piloting", although [...]
comment in response to post
This looks like great work! 🚀 Could I ask about the "LLMs are replacing human participants" bit? Are there serious people out there claiming that we can analyse the outputs of LLMs role-playing as members of minority groups, and generalise from these to actual behaviour of ppl from those groups?
comment in response to post
@hamel.bsky.social whoops, didn't see that one, thanks. I'll add a link to it. I don't disagree with your advice to focus on binary decisions initially, but I do think there's a lot of value in getting stakeholders to agree, early on, on provisional criteria for what would make a "good" interaction.
comment in response to post
From that point of view, I don't see any reason LLMs can't be "creative". Before the pitchforks arrive, I need to be careful: I'm not saying that they can create art, or anything like that, but they can produce text that triggers novel, useful ideas in the mind of the reader.
comment in response to post
There's an old, robust idea that creativity is about combining and reassociating existing ideas in novel ways, rather than somehow creating things "from scratch" (whatever that means). e.g. www.themarginalian.org/2013/05/20/a..., or www.semanticscholar.org/reader/927c1...