evijit.io
Applied Policy Researcher at Hugging Face 🤗 and Researcher at the University of Connecticut. AI Ethics/Safety.
167 posts
2,291 followers
661 following
Regular Contributor
Active Commenter
comment in response to
post
If you're around and want to chat, hit me up! Let's talk AI, Disclosures, Agents, and more!
comment in response to
post
See this work presented at IAAI in the "AI Safety, Reliability, and Incident Management" session on Thursday the 27th at 2:30pm! arxiv.org/abs/2410.121...
comment in response to
post
2. To Err is AI: A Case Study Informing LLM Flaw Reporting Practices. This paper documents lessons learned from a bug bounty event at DEF CON 2024, demonstrating how systematic evaluation and coordinated, structured flaw reporting of AI systems can help prevent real-world harms.
comment in response to
post
Come see our poster during the AI Alignment Track on Friday the 28th - 12:30pm! arxiv.org/abs/2406.042...
comment in response to
post
1. Quantifying Misalignment Between Agents: Towards a Sociotechnical Understanding of Alignment. We introduce a novel mathematical model to measure misalignment between multiple human and AI agents across various problem domains, moving beyond single-agent or monolithic approaches to alignment.
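Not the paper's exact formalism, but here's a minimal sketch of the core idea: score misalignment within a problem domain as the fraction of agent pairs whose top-ranked goals conflict (the function, goal sets, and conflict rule below are illustrative assumptions, not taken from the paper):

```python
from itertools import combinations

# Illustrative sketch (not the paper's exact model): each agent ranks its goals
# within a problem domain; two agents "conflict" when their top-ranked goals differ.
def pairwise_misalignment(agent_goals: dict[str, list[str]]) -> float:
    """Fraction of agent pairs whose top-ranked goals conflict."""
    pairs = list(combinations(agent_goals.values(), 2))
    if not pairs:
        return 0.0
    conflicts = sum(1 for a, b in pairs if a[0] != b[0])
    return conflicts / len(pairs)

# Example: three human/AI agents with ranked goals in a content-moderation domain.
goals = {
    "human_moderator": ["safety", "engagement"],
    "recommender_ai":  ["engagement", "safety"],
    "policy_ai":       ["safety", "compliance"],
}
print(round(pairwise_misalignment(goals), 3))  # 0.667: two of three pairs disagree
```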
comment in response to
post
@businessinsider.com link: www.businessinsider.com/ai-agents-jo...
comment in response to
post
TechCircle Link: www.techcircle.in/2025/02/19/w...
comment in response to
post
2. Quoted in a @businessinsider.com piece by Effie Webb on AI agents and job boards. I push back against conflating autonomy with agency, and point out that agents with human oversight will augment, not replace, human workers.
comment in response to
post
Read the paper! Would love to know what you think!
huggingface.co/papers/2502....
comment in response to
post
Post shows as “blocked”, oh no
comment in response to
post
Glad!
comment in response to
post
Please do read the full paper! We worked really hard on it. This is, again, the kind of paper that could be revised forever, but that's not practical, so please take what we have and critically engage with it! 🤗
Looking forward to your thoughts!
huggingface.co/papers/2502....
comment in response to
post
And that's all, folks! This paper is an extension of our (pretty well-loved) blog post from last month:
huggingface.co/blog/ethics-...
comment in response to
post
Thus, from a purely value-laden perspective, we conclude in the paper that highest-level agents should not be deployed, since the values that people seem to care about will largely break down. Some level of human control always needs to be in place.
comment in response to
post
In fact, when we look at the entire risk-benefit landscape, we find that at the highest level of agency (full autonomy), most values break down and risk > benefit:
comment in response to
post
And we posit that agents have values that express themselves differently based on the level of agency. For example, tool interoperability may be high at high agency, but consistency may be low:
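As a toy illustration of that point (the values and ratings below are made-up stand-ins, not the paper's actual table):

```python
# Toy stand-in: the same value is realized differently at different agency levels
# (ratings are illustrative assumptions, not taken from the paper).
value_expression = {
    # value:                 (low agency, medium agency, full autonomy)
    "tool_interoperability": ("low",      "medium",      "high"),
    "consistency":           ("high",     "medium",      "low"),
    "safety":                ("high",     "medium",      "low"),
}
```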
comment in response to
post
We also break down AI Agents into levels of Agency:
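If it helps to see the levels as code, here's a hedged sketch framing each level by how much program control is ceded to the model (the toy `llm` stand-in and function names are illustrative assumptions, not the paper's implementation):

```python
# Illustrative sketch: each agency level cedes more program control to the model.
# The toy `llm` stand-in and helper names are assumptions, not the paper's code.

def llm(prompt: str) -> str:
    """Stand-in for a model call."""
    return "a"

def path_a() -> str:
    return "took path a"

def path_b() -> str:
    return "took path b"

# Level 0 -- simple processor: model output has no effect on program flow.
def simple_processor(prompt: str) -> str:
    return llm(prompt)

# Level 1 -- router: model output decides which pre-written branch runs.
def router(prompt: str) -> str:
    return path_a() if llm(prompt) == "a" else path_b()

# Level 2 -- tool caller: model chooses which function runs.
def tool_caller(prompt: str, tools: dict) -> str:
    return tools[llm(prompt)]()

# Level 3 -- multi-step agent: model drives the loop; a human-imposed step cap
# is one way to keep some control short of full autonomy (the model writing
# and running its own code).
def multi_step_agent(prompt: str, max_steps: int = 3) -> list[str]:
    return [llm(prompt) for _ in range(max_steps)]

print(router("which path?"))                      # -> took path a
print(tool_caller("which tool?", {"a": path_a}))  # -> took path a
```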
comment in response to
post
And propose the following:
"Computer software systems capable of creating context-specific plans in non-deterministic environments."
comment in response to
post
Okay, so -- what is an AI Agent? We looked at a million different definitions...
comment in response to
post
TLDR:
[1] Agents have values
[2] Agents have different levels of agency
[3] Agent values are realized differently based on different levels of agency
[4] Most values break down at the highest level of agency (full autonomy).
[5] Thus, don't cede all control!
comment in response to
post
It was so fun to engage in deep debate with some of the most prominent scholars in the field today, and I’m very proud of the big tent paper that we ended up with. It would be an honor if you critically engage with this!
Also big kudos to @borhane.bsky.social for herding us cats deftly 🐈
comment in response to
post
Much of the coverage has been focused on US-China tech competition. That misses a bigger story: DeepSeek has demonstrated that scaling up AI models relentlessly, a paradigm OpenAI introduced & champions, is not the only, and far from the best, way to develop AI. 3/
comment in response to
post
The best model is the one you use the most, so in my opinion, the next "moat" is for models to be integrated and useful rather than just powerful. Google's and OpenAI's agentic systems and Perplexity's smartphone assistant point towards this new focus!
👇
www.businessinsider.com/openai-starg...
comment in response to
post
Read on @fortune.com: fortune.com/2025/01/22/o...
comment in response to
post
Additionally, some research has shown that these relationships are developed by lonely people to replace a deceased partner or child, and that this actively kneecaps their ability to process loss in a healthy way.
comment in response to
post
My fear is the phenomenon of robot death (falling in love with a model that might be discontinued) and misplaced trust (a model controlled by a company that waits for you to lower your guard and then shows ads).
comment in response to
post
Ethical resource distribution is more important than ever in the face of private funding conglomeration, and it is so important to keep AI open and accessible.
comment in response to
post
A key finding was that compute bottlenecks create perverse incentives and harm everyone who is not working on nebulously defined "AGI" at trillion-dollar companies. It specifically harms researchers doing use-case-specific research, like language generators for speech-impaired people.
comment in response to
post
Only then does it help humanity. Otherwise, this once again acts as a gatekeeping measure and concentration of power. I spoke a little bit more on this after the Public AI event by Aspen Digital at the Library of Congress last year (read their excellent whitepaper at publicai.network).
comment in response to
post
HOWEVER, a significant portion of this must be public infrastructure, meaning freelance developers and university research teams working on cutting-edge and hopefully open models should be able to access and use it.
comment in response to
post
4. This article could be written and updated a gazillion times, so please treat it as a snapshot in time and engage with it in interesting ways! It's still a nascent subfield and there's a long way to go!
Hope you enjoy!
comment in response to
post
3. Agency = control. The more control you cede to the system, the less control you have, and this magnifies the risk surface compared to model risks in a vacuum. On the flip side, inter-agent collaboration can actually unlock more types of complex tasks than single models can, so...
comment in response to
post
Funnily enough, the best table we found for this was in a blog post from Caterpillar, the construction company: