New working paper!
We develop an approach for safely delegating to strategically aware and potentially misaligned AI systems. The theoretical tool we use is sequential information design with imperfect recall.
A short thread on the key highlights.
Reposted from
Global Priorities Institute
Imperfect Recall and AI Delegation, the new working paper by Eric Olav Chen, @alexghersen.bsky.social and Sami Petersen, is now available to read here: globalprioritiesinstitute.org/imperfect-re...
We consider a principal-agent delegation scenario with some special features: the developer cannot restrict the agent's actions, alter its preferences or beliefs, punish it, contract with it, or otherwise control it after deployment. However, the developer can simulate copies of the agent.