tfwerner.com - Profile | ThreadSky | a Reddit-style client for Bluesky

tfwerner.com

Postdoc at the Center for Humans and Machines (CHM/MPIB) | PhD in Economics | Affiliated with DICE/HHU & BCCP | I am on the economics job market 2024/2025. Tfwerner.com

60 posts 824 followers 252 following

Posts 31 Comments 19

comment in response to post

I think those are excellent suggestions that align well with our approach. I'd also reconsider the design to minimize open-ended questions where possible. While automated AI bots may be the future, they're not the norm yet. But most participants already seem inclined to use LLMs for open-ended Q

submitted 17 days ago

comment in response to post

Wrong link above. The correct one: blog.cloudflare.com/declaring-yo...

submitted 29 days ago

comment in response to post

It would be great if organizations like ESA and others could lead international collaborations to tackle this problem together.

submitted 29 days ago

comment in response to post

That said, these problems will get worse. More lab experiments could be a solution. But if we go in that direction, we should think about scaling them to match online sample sizes & diversity.

submitted 29 days ago

comment in response to post

Companies like Cloudflare are already working on detecting fully automated LLM agents. (HT @Hiromu Yakura from our lab) blog.cloudflare.com/firewall-for...

submitted 29 days ago

comment in response to post

Also, let's not declare online experiments a lost cause. There are ways to adapt: Hidden instructions designed to trick LLMs, tracking tab changes and other meta info, disabling copy-paste, more sophisticated detection methods, ...

submitted 29 days ago

comment in response to post

If we want to scale lab experiments, we must build international collaborations. This applies to both overall sample sizes and the diversity of the subjects we recruit.

submitted 29 days ago

comment in response to post

Offline labs often have a very narrow participant pool (mostly young students from industrialized countries). This is fine for some research questions but not for others. Online sampling made it much easier to get diverse and more representative samples, including non-WEIRD populations.

submitted 29 days ago

comment in response to post

Not sure what's more important, but you cannot do reinforcement learning without a very large foundation model.

submitted 30 days ago

comment in response to post

Interesting! Do you think the difference could be novelty bias since we’re less familiar with DeepSeek’s flaws when it comes to writing tasks? Still have to test it myself for writing

submitted 31 days ago

comment in response to post

Yes, I agree on the brick-and-mortar labs. Also, having international collaborations allows for more representative samples compared to past offline lab days. It would be great if an organization like the ESA could lead the coordination of such efforts.

submitted 33 days ago

comment in response to post

I haven't tested Nightshade, but my understanding is that it's mostly to protect your images from being included in training data. Does it also work to prevent AI from using the images at inference time?

submitted 33 days ago

comment in response to post

H/T to @iyadrahwan.bsky.social for highlighting this issue the other day.

submitted 34 days ago

comment in response to post

While Gemini doesn’t take active decisions, it streams my screen in real-time to an LLM, helping me within the experiment. With OpenAI’s Operator model, participants could fully outsource their participation to LLMs. How will we tackle this growing issue as a profession? openai.com/index/introd...

submitted 34 days ago

comment in response to post

"Only from 10 euros" ("Erst ab 10 Euro") and "EC card only" ("Nur mit EC Karte")! ;-)

submitted 35 days ago

comment in response to post

bsky.app/profile/tfwe...

submitted 50 days ago

comment in response to post

Thanks Simon! Sounds interesting, I will check it out next year

submitted 61 days ago

comment in response to post

bsky.app/profile/tfwe...

submitted 63 days ago

comment in response to post

Haha, glad there’s so much interest! I got it here 🎄🎅: www.geeksoutfit.com/products/tha...

submitted 63 days ago