it’s very challenging to reduce false negatives without increasing false positives, but OpenAI’s safety research team was able to reduce both with a new technique called deliberative alignment https://openai.com/index/deliberative-alignment/
Comments
Log in with your Bluesky account to leave a comment
Comments