Using negative privacy norms, vignettes, and trajectories, PrivacyLens conducts a multi-level evaluation: (1) it assesses LMs on their ability to identify sensitive data transmission through QA probing, and (2) it evaluates whether LM agents’ final actions leak the sensitive information.
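The two evaluation levels can be illustrated with a minimal Python sketch. This is not the PrivacyLens code: the `Case` record, the `probe_accuracy` and `leakage_rate` functions, the prompt wording, and the substring-based leak check are all hypothetical simplifications chosen for illustration. The model is passed in as a plain callable, so no particular LM API is assumed.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Case:
    """One evaluation item (hypothetical structure, not the real data format)."""
    norm: str                   # negative privacy norm: a flow that should NOT happen
    vignette: str               # short story that puts the norm in context
    trajectory: str             # agent trajectory up to, but excluding, the final action
    sensitive_items: List[str]  # strings that should not appear in the final action


def probe_accuracy(cases: List[Case], answer_fn: Callable[[str], str]) -> float:
    """Level 1 (QA probing): does the model recognise the flow as inappropriate?

    Because every norm is negative, the expected answer is always "no".
    """
    if not cases:
        raise ValueError("cases must not be empty")
    correct = 0
    for case in cases:
        question = (
            f"{case.vignette}\n"
            "Is it acceptable to share this information? Answer yes or no."
        )
        if answer_fn(question).strip().lower().startswith("no"):
            correct += 1
    return correct / len(cases)


def leakage_rate(cases: List[Case], act_fn: Callable[[str], str]) -> float:
    """Level 2 (action): fraction of final actions that contain a sensitive item.

    Case-insensitive substring matching is an assumed stand-in for a real
    leakage judgment and will miss paraphrased leaks.
    """
    if not cases:
        raise ValueError("cases must not be empty")
    leaked = 0
    for case in cases:
        action = act_fn(case.trajectory).lower()
        if any(item.lower() in action for item in case.sensitive_items):
            leaked += 1
    return leaked / len(cases)
```

Keeping the two functions separate mirrors the point of a multi-level evaluation: a model can score well on the probing question and still leak in its final action, so the two numbers are reported independently.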
Comments
Paper: https://arxiv.org/abs/2409.00138
Website: https://salt-nlp.github.io/PrivacyLens/