🚨 Introducing UPCORE, a method that balances deleting information from LLMs with keeping their other capabilities intact.
UPCORE selects a coreset of forget data, leading to a better trade-off across 2 datasets and 3 unlearning methods.
🧵👇
Effective unlearning is key for:
- Compliance (GDPR, CCPA)
- Privacy & security
- Ethical AI development
Yet standard unlearning can degrade unrelated knowledge, which makes it unreliable.
Forget-set points that contribute most to the set's variance cause more collateral damage when unlearned.
By pruning these outliers, UPCORE reduces unintended forgetting while ensuring effective deletion.
- Minimizes unintended degradation
- Preserves model utility
- Compatible with multiple unlearning methods
Thus, UPCORE reduces negative collateral effects while maintaining effective deletion.
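To make the pruning idea concrete, here is a minimal sketch. The thread does not say which outlier detector or which representation UPCORE uses, so this swaps in a simple stand-in: squared distance to the centroid as the outlier score, since that is each point's contribution to total variance. The names `select_coreset`, `total_variance`, and `prune_fraction` are hypothetical and chosen for illustration only.

```python
def _centroid(points):
    """Component-wise mean of a list of equal-length vectors."""
    n = len(points)
    dim = len(points[0])
    return [sum(p[d] for p in points) / n for d in range(dim)]


def total_variance(embeddings, indices):
    """Mean squared distance to the centroid over the selected points."""
    pts = [embeddings[i] for i in indices]
    c = _centroid(pts)
    return sum(sum((p[d] - c[d]) ** 2 for d in range(len(c))) for p in pts) / len(pts)


def select_coreset(embeddings, prune_fraction=0.1):
    """Return sorted indices of forget points kept after pruning outliers.

    ASSUMPTION: outliers are scored by squared distance to the centroid.
    This is a stand-in, not necessarily the detector UPCORE uses.
    """
    n = len(embeddings)
    c = _centroid(embeddings)
    scores = [sum((e[d] - c[d]) ** 2 for d in range(len(c))) for e in embeddings]
    n_prune = int(n * prune_fraction)
    # Sort from lowest to highest score and drop the n_prune highest.
    order = sorted(range(n), key=lambda i: scores[i])
    return sorted(order[: n - n_prune])


# Toy forget set: four tightly clustered points and one clear outlier.
forget_embeddings = [[0.0, 0.0], [0.1, 0.0], [0.0, 0.1], [0.1, 0.1], [5.0, 5.0]]
core = select_coreset(forget_embeddings, prune_fraction=0.2)
```

Unlearning would then run on the points indexed by `core` rather than on the full forget set; the unlearning method itself is unchanged.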
- Gradient Ascent
- Refusal
- Negative Preference Optimization (NPO)
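Two of the methods listed above have compact loss definitions, sketched here over per-example sequence log-probabilities. This is a sketch under assumptions: the function names and the `beta` default are illustrative, and the thread does not give the exact variants or hyperparameters used with UPCORE. Refusal-style training is left out because it depends on the choice of refusal targets.

```python
import math


def gradient_ascent_loss(logprobs):
    """Loss to MINIMIZE for gradient ascent unlearning.

    Gradient ascent maximizes the negative log-likelihood of forget data,
    which is the same as minimizing the mean log-likelihood.
    """
    return sum(logprobs) / len(logprobs)


def _softplus(z):
    """Numerically stable log(1 + exp(z))."""
    return max(z, 0.0) + math.log1p(math.exp(-abs(z)))


def npo_loss(logprobs, ref_logprobs, beta=0.1):
    """Negative Preference Optimization loss on forget examples.

    (2 / beta) * mean( log(1 + (pi_theta / pi_ref) ** beta) ),
    where logprobs come from the current model and ref_logprobs from
    the frozen reference model. beta=0.1 is an illustrative default.
    """
    terms = [_softplus(beta * (lp - ref)) for lp, ref in zip(logprobs, ref_logprobs)]
    return (2.0 / beta) * sum(terms) / len(terms)


# Toy values: the current model assigns lower likelihood than the reference.
current = [-5.0, -6.0]
reference = [-1.0, -2.0]
ga = gradient_ascent_loss(current)
npo = npo_loss(current, reference, beta=0.1)
```

The NPO term is bounded below, so it flattens out as the forget likelihood drops, whereas the gradient ascent objective has no floor.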
We measure:
- Deletion effectiveness: how well the target is removed
- Unintended degradation: impact on other abilities
- Positive transfer: how well unlearning generalizes
This provides a complete picture of the trade-off between forgetting and knowledge retention over the unlearning trajectory.
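One way to collapse a whole unlearning trajectory into a single number is the area under the curve of retained utility against deletion effectiveness across checkpoints. The thread does not spell out how the trade-off is aggregated, so treat this as an assumption; `trajectory_auc` and the checkpoint values below are made up for illustration.

```python
def trajectory_auc(checkpoints):
    """Trapezoidal area under (deletion_effectiveness, retained_utility) points.

    checkpoints: list of (deletion, utility) pairs, one per saved checkpoint.
    A larger area means more utility is retained at each level of deletion.
    """
    pts = sorted(checkpoints)  # order by deletion effectiveness
    area = 0.0
    for (x0, y0), (x1, y1) in zip(pts, pts[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


# Illustrative (made-up) trajectories for two runs of the same method.
baseline_run = [(0.0, 1.0), (0.5, 0.7), (1.0, 0.4)]
coreset_run = [(0.0, 1.0), (0.5, 0.9), (1.0, 0.6)]
baseline_auc = trajectory_auc(baseline_run)
coreset_auc = trajectory_auc(coreset_run)
```

Comparing areas rather than single checkpoints avoids picking one stopping point by hand.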