Hello privateers,
I'm back in Europe after an amazing PyCon India 2025. My keynote linked information theory, propaganda and how we can bring real information back to AI/ML narratives. I also got a chance to meet some really cool humans, help pack bags, attend a PyLadies lunch and drink (almost) too much tea.
In this issue, we're exploring how to actually implement machine unlearning. How can you selectively remove information from AI/ML models?
Unlearning definitions vary (as you learned in the last Probably Private), but once you have one that works for you, you can begin implementation. Like the definitions, today's unlearning methods depend on your model architecture and data choices.
If you are using simpler or smaller machine learning models, you can either build model ensembles or leverage statistical query algorithms, which query the underlying features/data to assemble aggregate inputs. With these approaches, you're essentially "cutting out" data points or contributions for deletion.
In ensemble learning, this happens by rolling one of the models back to a checkpoint from before it was trained on the example in question, or by retraining that model entirely. In statistical query algorithms, you can rerun a query, presuming your model doesn't significantly depend on the data you're deleting. Check out Sharded, Isolated, Sliced, and Aggregated (SISA) learning for an ensemble example and one of the first unlearning papers (2015) for statistical query unlearning.
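To make the ensemble idea concrete, here's a minimal SISA-style sketch in Python, assuming scikit-learn and NumPy. The `SisaEnsemble` class and its method names are my own illustration, not the paper's API, and I skip SISA's slicing (per-shard checkpoints): each example lands in exactly one shard, so a deletion only retrains that shard's model.

```python
# A minimal SISA-style sketch, assuming scikit-learn and NumPy.
# Names here are my own illustration, not the SISA paper's API.
import numpy as np
from sklearn.linear_model import LogisticRegression

class SisaEnsemble:
    """Shard the data, train one model per shard, predict by majority
    vote. Each example lives in exactly one shard, so unlearning it
    only retrains that single shard's model."""

    def __init__(self, n_shards=5, seed=0):
        self.n_shards = n_shards
        self.rng = np.random.default_rng(seed)

    def fit(self, X, y):
        self.X, self.y = X, y
        assign = self.rng.integers(0, self.n_shards, size=len(X))
        # Remember which global indices each shard trains on, so a
        # deletion request can be routed to exactly one model.
        self.members = [np.where(assign == s)[0] for s in range(self.n_shards)]
        self.models = [self._train(idx) for idx in self.members]
        return self

    def _train(self, idx):
        # A real setup must handle shards that end up empty or
        # single-class after deletions; this sketch assumes they don't.
        return LogisticRegression(max_iter=1000).fit(self.X[idx], self.y[idx])

    def unlearn(self, i):
        # "Cut out" training example i: drop it from its shard and
        # retrain only that shard's model.
        for s, idx in enumerate(self.members):
            if i in idx:
                self.members[s] = idx[idx != i]
                self.models[s] = self._train(self.members[s])
                return

    def predict(self, X):
        # Majority vote; assumes non-negative integer class labels.
        votes = np.stack([m.predict(X) for m in self.models])
        return np.apply_along_axis(lambda v: np.bincount(v).argmax(), 0, votes)
```

The core trade-off the SISA paper explores: more shards make unlearning cheaper (smaller retraining jobs), but each constituent model sees less data and gets weaker.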
To unlearn with today's large-scale deep learning models (LLMs, multi-modal models, diffusion, computer vision, etc.) without moving to an ensemble, you'll basically continue finetuning the network, but this time to forget/unlearn examples. I'll call these approaches "deep unlearning".
To do this, you'll assemble the examples you want the model to forget (the "forget set") and, ideally, a set of similar examples you want to retain (i.e. of the same class and with similar qualities, but data you're allowed to continue using).
Deep Unlearning methods include:

- finetuning on only the retain set, so the forget examples fade as the weights update
- gradient ascent on the forget set (maximizing its loss), usually balanced against gradient descent on the retain set (sketched below)
- relabeling the forget examples with random or incorrect labels and finetuning on them
- distillation-style approaches, where a student model is taught to match the original on the retain set but not on the forget set
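As a concrete illustration of the gradient ascent flavor, here's a hedged PyTorch sketch. `model`, `forget_loader`, and `retain_loader` are placeholders for your own setup, and `forget_weight` is an illustrative knob, not a standard name.

```python
# A sketch of gradient-ascent-style deep unlearning, assuming PyTorch.
# model, forget_loader, and retain_loader are placeholders for your
# own setup; forget_weight is an illustrative knob, not a standard name.
import torch
import torch.nn.functional as F

def unlearn(model, forget_loader, retain_loader, epochs=3, lr=1e-5,
            forget_weight=1.0):
    optimizer = torch.optim.AdamW(model.parameters(), lr=lr)
    model.train()
    for _ in range(epochs):
        for (fx, fy), (rx, ry) in zip(forget_loader, retain_loader):
            # Ascend the loss on the forget batch (push the model away
            # from those examples) while descending on the retain batch
            # (preserve utility on similar data).
            forget_loss = F.cross_entropy(model(fx), fy)
            retain_loss = F.cross_entropy(model(rx), ry)
            loss = retain_loss - forget_weight * forget_loss
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
    return model
```

In practice the ascent term can run away and wreck the model, which is why low learning rates, few epochs and monitoring both losses matter.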
Sounds doable in most deep learning setups. But the hard part is really in the details.
For many setups, defining forget and retain sets is going to be difficult, because data expiration or deletion requests are often separate from how training data is collected and prepared. Data duplication or murky data lineage also introduces challenges in unlearning if your training data isn't properly documented.
In addition, scaling unlearning across more of the dataset without breaking model utility is challenging. Unlearning 1% of the training data is significantly different from trying to unlearn 10%. Difficulty also depends on data and task complexity (i.e. more complexity = harder to unlearn without breaking the model entirely).
Aside from that, you also need to:

- choose an unlearning metric and define what "successfully forgotten" means for your use case
- evaluate the unlearned model against attacks like membership inference (a crude loss-based check is sketched below)
- confirm the model still performs well on the retain set and your original benchmarks
- document the process so deletion requests can actually be audited
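For a taste of what evaluation can look like, here's a crude loss-based check in Python; it's far from a full membership inference audit, and the `model` and loader names are placeholders for your own setup.

```python
# A crude, loss-based unlearning check, not a full membership inference
# audit. model, forget_loader, holdout_loader, and retain_loader are
# placeholders for your own setup (PyTorch assumed).
import torch
import torch.nn.functional as F

@torch.no_grad()
def mean_loss(model, loader):
    model.eval()
    total, n = 0.0, 0
    for x, y in loader:
        total += F.cross_entropy(model(x), y, reduction="sum").item()
        n += len(y)
    return total / n

# After unlearning, the forget set should look like data the model never
# saw (forget loss close to holdout loss), while the retain loss should
# stay near its pre-unlearning value.
print(f"forget:  {mean_loss(model, forget_loader):.3f}")
print(f"holdout: {mean_loss(model, holdout_loader):.3f}")
print(f"retain:  {mean_loss(model, retain_loader):.3f}")
```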
This is a lot of work! Given that the field of unlearning is still young, I hope choosing a metric and finding the right deep unlearning method combination will get significantly easier in the next few years; but it's also possible that full model retraining will remain more common than unlearning.
If large AI vendors are going to pretrain and train/finetune a new set of large models every 4 months, does it make sense to dedicate time and resources to unlearning? Or would it be better to build model governance that makes it easier to segregate deletion-requested data from training datasets? Ideally there's budget for both, so unlearning continues to progress from research into reality.
Sound interesting? In the longer article on how unlearning is done, I uncover additional approaches that I think could unlock new insights for improving unlearning, including ideas inspired by the same patterns as LoRA finetuning and new architectures informed by information theory and differential privacy.
If you're an audio/visual learner, I posted a new Probably Private YouTube summary of this article.
I've been preparing content for my masterclasses, and there have been requests to attend the class in an online setting. I'm curious to get your feedback: would an online version be interesting for you?
Core themes and activities:
If any part of this sounds interesting, can you take a minute to reply to this email with:
Looking forward to reading your thoughts!
Conference season is starting, so here are some upcoming speaking engagements where you can catch me:
If you enjoyed this newsletter, consider forwarding it to someone so they can subscribe.
With Love and Privacy, kjam