The unlearning goal says: U(D, S, A(D)) ≈ A(D◦S)
Meaning: given a request S to delete data from (or add data to) a dataset D, your “unlearning algorithm” U should produce a model U(D, S, A(D)) that looks like the model A(D◦S) you would get by retraining from scratch on the updated dataset D◦S. But does that actually "delete" the information requested in S? 👀
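
To make the notation concrete, here is a toy sketch in Python. Everything in it is an assumption for illustration only: the "learning algorithm" A is just the mean of the data, S is a deletion request, `apply_delete` is a hypothetical helper standing in for D◦S, and U is a simple closed-form update. Real unlearning methods for real models are far more involved; this only shows what the two sides of the ≈ are.

```python
from math import isclose

def A(D):
    """Toy 'learning algorithm' (assumption): the model is the mean of the data."""
    return sum(D) / len(D)

def apply_delete(D, S):
    """Hypothetical stand-in for D◦S: remove one occurrence of each item in S."""
    remaining = list(D)
    for x in S:
        remaining.remove(x)
    return remaining

def U(D, S, model):
    """Toy unlearning update: adjust the stored mean without retraining."""
    n, k = len(D), len(S)
    return (model * n - sum(S)) / (n - k)

D = [2.0, 4.0, 6.0, 8.0]   # original dataset
S = [8.0]                  # deletion request

unlearned = U(D, S, A(D))            # left-hand side: U(D, S, A(D))
retrained = A(apply_delete(D, S))    # right-hand side: A(D◦S)

print(unlearned, retrained, isclose(unlearned, retrained))
```

Here the two sides match, so the goal is met. Note that the check only compares the resulting models; it says nothing on its own about what an observer could still infer about S.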
