vaidehipatil.bsky.social • 8 days ago

🚨 Introducing UPCORE, to balance deleting info from LLMs with keeping their other capabilities intact.

UPCORE selects a coreset of forget data, leading to a better trade-off across 2 datasets and 3 unlearning methods.

🧵👇

Comments

vaidehipatil.bsky.social•8 days ago

LLMs train on vast datasets that often contain sensitive or unwanted info. Regulations like GDPR and CCPA mandate its removal.

Yet, standard unlearning can degrade unrelated knowledge, making it unreliable.

Effective unlearning is key for:
📜 Compliance (GDPR, CCPA)
🔐 Privacy & security
⚖️ Ethical AI development

vaidehipatil.bsky.social•8 days ago

Our key insight: Not all forget set points degrade the model equally.

Points contributing to high variance cause more collateral damage when unlearned.

By pruning these outliers, UPCORE reduces unintended forgetting while ensuring effective deletion.

vaidehipatil.bsky.social•8 days ago

UPCORE constructs a core forget set by identifying and removing outlier points using Isolation Forest.

✅ Minimizes unintended degradation
✅ Preserves model utility
✅ Compatible with multiple unlearning methods
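
To make the pruning step concrete, here is a minimal sketch using scikit-learn's IsolationForest. It assumes each forget-set example has already been turned into a feature vector (for instance, a model representation of the example); that choice, the function name `select_core_forget_set`, the `contamination` value, and the synthetic data are all illustrative assumptions, not details from the thread.

```python
import numpy as np
from sklearn.ensemble import IsolationForest


def select_core_forget_set(features, contamination=0.1, seed=0):
    """Split forget-set indices into a core set (inliers) and pruned outliers.

    `features` is an (n_examples, n_dims) array, one row per forget example.
    How those rows are computed is an assumption left to the caller.
    """
    forest = IsolationForest(contamination=contamination, random_state=seed)
    labels = forest.fit_predict(features)  # +1 = inlier, -1 = outlier
    core_idx = np.flatnonzero(labels == 1)
    pruned_idx = np.flatnonzero(labels == -1)
    return core_idx, pruned_idx


# Synthetic stand-in for forget-set features: a tight cluster plus a few far-away points.
rng = np.random.default_rng(0)
features = np.vstack([
    rng.normal(0.0, 1.0, size=(95, 8)),
    rng.normal(8.0, 1.0, size=(5, 8)),
])
core_idx, pruned_idx = select_core_forget_set(features, contamination=0.05)
print(len(core_idx), "core points,", len(pruned_idx), "pruned points")
```

Unlearning would then run on the examples at `core_idx` only, while the examples at `pruned_idx` stay available for checking whether they get forgotten anyway.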

vaidehipatil.bsky.social•8 days ago

The points pruned from the forget set still get unlearned, thanks to positive collateral transfer from the core forget set.

Thus, UPCORE reduces negative collateral effects while maintaining effective deletion.

vaidehipatil.bsky.social•8 days ago

We apply UPCORE across three unlearning methods:
📉 Gradient Ascent
🚫 Refusal
🔄 Negative Preference Optimization (NPO)

We measure:
✔️ Deletion effectiveness – How well the target is removed
✔️ Unintended degradation – Impact on other abilities
✔️ Positive transfer – How well unlearning generalizes
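
For readers unfamiliar with these objectives, here is a rough sketch of two of them computed from per-example sequence log-probabilities. The NPO form follows the commonly cited formulation, (2/β)·E[log(1 + (π_θ/π_ref)^β)], as best recalled here; the function names, the β value, and the toy numbers are assumptions for illustration. Refusal is omitted because it swaps the training targets for refusal responses rather than changing the loss.

```python
import math


def gradient_ascent_loss(logps):
    """Loss whose minimization lowers the forget examples' log-likelihood.

    `logps` holds log pi_theta(y|x) for each forget example.
    """
    return sum(logps) / len(logps)


def npo_loss(logps, ref_logps, beta=0.1):
    """NPO-style loss: (2/beta) * mean(log(1 + (pi_theta / pi_ref) ** beta)).

    `ref_logps` holds log pi_ref(y|x) from the frozen reference model.
    """
    terms = [
        math.log1p(math.exp(beta * (lp - ref)))
        for lp, ref in zip(logps, ref_logps)
    ]
    return (2.0 / beta) * sum(terms) / len(terms)


# Toy log-probabilities (hypothetical values, not results).
ref_logps = [-2.0, -3.0]
before = npo_loss(ref_logps, ref_logps)    # model still equals the reference
after = npo_loss([-5.0, -6.0], ref_logps)  # model now assigns lower probability
print(before, after)
```

Unlike plain gradient ascent, the NPO-style loss is bounded below by zero, so it flattens out as the forget examples become unlikely instead of pushing their likelihood down without limit.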

vaidehipatil.bsky.social•8 days ago

Instead of evaluating at a single training checkpoint, we introduce an AUC (Area Under the Curve) metric over deletion effectiveness and utility.

This provides a complete picture of the trade-off between forgetting and knowledge retention over the unlearning trajectory.
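
As a rough illustration of the idea, the sketch below integrates utility over deletion effectiveness with the trapezoidal rule, using one point per checkpoint. Which quantity goes on which axis, the helper name `trade_off_auc`, and the checkpoint values are assumptions made for the example, not numbers from the thread.

```python
def trade_off_auc(points):
    """Trapezoidal area under a (deletion_effectiveness, utility) curve.

    `points` is a list of (x, y) pairs, one per unlearning checkpoint.
    """
    pts = sorted(points)  # order checkpoints by deletion effectiveness
    area = 0.0
    for (x0, y0), (x1, y1) in zip(pts, pts[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


# Hypothetical checkpoints: utility drops as deletion effectiveness rises.
checkpoints = [(0.0, 1.0), (0.5, 0.9), (1.0, 0.6)]
auc = trade_off_auc(checkpoints)
print(auc)
```

A method that keeps utility high while deletion effectiveness increases traces a higher curve and therefore gets a larger area, which is what makes the score comparable across whole trajectories rather than single checkpoints.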
