PROBABLY PRIVATE

A newsletter on data privacy by Katharine Jarmul

About

Probably Private is a newsletter for privacy and data science enthusiasts. Whether you are here to learn about AI, machine learning and data science via the lens of privacy, or the other way around, this is a place for open conversation about the technical and social aspects of privacy and their intersections with surveillance, law, technology, mathematics and probability.

Past Issues

  • Video Course launch, studying memorization via security and privacy and personalizing your recommendations
    In this issue, you'll learn how privacy and security research exposed AI memorization before it was a common topic at AI conferences. I also announce my new O'Reilly video course and an open-source library for content curation and personalized recommendations.
  • How does memorization in AI systems happen? What should we do about it?
    In this issue, you'll explore exactly how memorization in AI systems happens, by investigating training data repetition, novel training examples and model overparameterization. You'll also explore some ways that testing for memorization could move into regular AI system development or regulation.
  • Building Private and Personal AI systems, Exploring ML Memorization
    In this issue, you'll explore how today's AI systems parallel early computing and what this could mean for building out more private, more personal AI. In addition, the ML memorization exploration continues by investigating datasets and distributions, as well as encoding and embeddings.
  • Memorization in Machine Learning and Multidisciplinary Practices
    Does memorization in machine learning happen, and if so, how? In this issue, I share initial thoughts on an article series covering how, why and when deep learning models memorize their training data, and exactly what happens when they do. In addition, I share some post-sabbatical thoughts on next steps and observations on multidisciplinary settings for privacy work.
  • AI Act, Biden's Executive Order on AI, Data Privacy in der Praxis
    This newsletter covers recent legislative changes, including the EU's AI Act, Data Governance Act, Data Act and Biden's Executive Order on AI. In addition, Practical Data Privacy is being released in German with updated sections on recommended risk analysis for LLMs and how LLMs and other large models memorize their training data.