Issue # 2: GDPR, SurveillanceTech, Browser Tracking


Hi friends and privateers! (can we call ourselves that? I like it - makes me feel a bit piratey! 🦜)

As you may notice, this issue is not coming from Revue! I was alerted by an amazing reader of this newsletter that Revue had link tracking turned on. I reached out to Revue about it and asked if there was a way to turn it off. I told them I would be upgrading to the Pro plan ($5 or more per month based on your subscriber number) and said I was even willing to pay extra for that feature. I got a response that they did not have it in their plans and then an ask to hop on a call about it. ☹️ I also tried via Twitter to get a real answer, but never did. I will end up taking some time later to have a call with them, because I do think it is important to explain why these issues matter, but it was poor timing for me (busy few weeks!), so I won't be using or recommending them right now.

I was introduced to buttondown by one of the subscribers, thank you!!! And they are built with Privacy by Design - allowing me to turn off tracking completely. I used the export and import features to move you all here - and the new sign up page, if you pointed anyone to the old one, is here: https://buttondown.email/probablyprivate (NOTE: No longer on buttondown!). I'll be using a GDPR transfer request to try to move my initial letter to buttondown as well, so I'll let you know how it goes! Thank you all again for subscribing and joining me as I figure out how to make this newsletter awesome and privacy-friendly! Onto the topics!

Impact of GDPR - 2 Years Later

In case you missed it, GDPR turned 2 over the summer! 🙌🏻 🎉 There was a series of articles (mainly complaining that it was either too much or too little), but one piece and quote caught my attention:

Many of the most significant GDPR enforcement actions have been at the initiative of “privacy NGOs,” such as NOYB and LQDN... Depending on national laws, class-actions involving commercial lawyers will also emerge. Shutting down infringing types of processing will depend on national laws (Article 84). The GDPR has all the tools to create a market for privacy enforcement, a level of “responsive regulation” Europe has not previously seen. Will EU regulators be willing to use them to their full “dissuasive” effect, and will EU courts endorse their approach?

--Graham Greenleaf, Professor of Law & Information Systems, UNSW Australia Faculty of Law (emphasis added)

I would argue that we are only now beginning to see the effects of GDPR regulation (including the latest dispute resolution in the Twitter-Ireland DPA case, which many are following to determine how the other Ireland cases against Alphabet and Facebook might end). We are really at the beginning of determining how GDPR may be enforced and how that will affect technology decisions and surveillance capitalism in this decade and perhaps beyond (depending on the level of the "dissuasive" effect that the regulators implement). I am hopeful that it won't only fall on NGOs and that Europe can set a standard that reflects stronger human values over monetary pressure; but since regulation in the US like CCPA and in Brazil like LGPD only just went into effect, we might need to wait a few years to start to see how it will be implemented and what the ripples of those decisions are in the privacy landscape.

That said, a part of me feels like more of us SHOULD be supporting NGOs or other organizations that are bringing these lawsuits to court. Do any of you know of other organizations or groups that are using the benefits of GDPR to bring attention to poor privacy practices? Do you talk about these regulations at work and does it have an effect on Privacy by Design? I'd love to hear more about if GDPR has actually affected your data science or machine learning work in a meaningful way — feel free to reply or ping me on Twitter to start a conversation!

US Intelligence Agencies + Startups = 🕵🏻‍♀️ + </3

An article from Vice this month exposed the close relationship between the US intelligence agencies and an "Open Source Intelligence" company called Babel Street. I hadn't ever heard of Babel Street or the term "Open Source Intelligence" (although I had seen OSINT referenced before, so there ya go for acronyms being properly parsed!). I falsely assumed this literally meant open source code, went to GitHub to search for their platform and found some poor person's Babel Street code interview project. If you, like me, are wondering what Open Source Intelligence is, it is a term coined by US intelligence agencies that essentially means "we glean information we found publicly or in the open and use it for our own intelligence purposes". It is unclear if tapping the undersea internet cables is part of this "open" information... 😏

Anyways, the Babel Street article was a nice peek into how millions of US tax dollars are going directly to surveillance tech companies and startups. Babel Street offers several services which essentially allow you to target individuals based on which apps they have installed and what content they post to social media (yes, even including things like Snapchat).

This comes a few weeks after another article from Vice Motherboard on a company called SpyCloud which sells data from data breaches to the US government (yes, you read that right). This is honestly even worse than reading about how the Uber CISO paid $100K of hush money to hackers. If you are wondering if we are in the Upside Down, I am telling you I don't know either.

This leads me to another story I read recently on Palantir's IPO, in which they directly cite Privacy Regulation as a potential flaw in their business model. In the IPO document, they literally state that "developments regarding the CCPA and all privacy and data protection laws and regulations around the world may require us to modify our data processing practices and policies and to incur substantial costs and expenses in an effort to maintain compliance on an ongoing basis."

One interesting quote from their IPO listing is as follows:

The bargain between the public and the technology sector has for the most part been consensual, in that the value of the products and services available seemed to outweigh the invasions of privacy that enabled their rise. Americans will remain tolerant of the idiosyncrasies and excesses of the Valley only to the extent that technology companies are building something substantial that serves the public interest. The corporate form itself — that is, the privilege to engage in private enterprise — is a product of the state and would not exist without it.

This quote (emphasis added) leads me to ask, is US intelligence surveillance on US residents still really in our public interest? Is it consensual? Is having the LAPD search Facebook for a suspect or for a person to deport based on immigration status alone really in the public interest? What would the many organizers who have pointed out the unequal application of the law and of surveillance based on race in the US say about this take?

This topic of US intelligence and police surveillance came up often during the many (and continued) Black Lives Matter protests. In Machines We Trust, a podcast from the MIT Technology Review, covered Clearview AI's data and other surveillance tools, and it is well worth a listen. The final episode in the podcast's series was about possible regulation, including banning facial recognition, and featured a great interview with Deb Raji on racial inequality in facial recognition. A question that I have but that wasn't addressed is: how would these systems be properly regulated if they are being developed and sold by private companies using "open source information"?

This led me to a question of funding, startups and how the financial ecosystems should work in the public interest. I've read some interesting takes on startup funding recently that are not new, but have nice corollaries. The initial "Silicon Valley" boom was funded by government grants, often closely aligned with these same intelligence agencies (DARPA, for example). In that same manner, so was the initial US internet.

Now, however, most of these companies are venture- or private-equity funded and are banking their initial revenue directly on sales to US intelligence. Essentially, this becomes a way that founders and their venture backers can launder (I mean, use) US tax monies for building surveillance tools that are not public, not "open" and are unlikely to have the same public interest outcomes as the initial internet or computing technology.

If we continue to follow this path, where does it lead? At what point do we say that we need tax dollars to fund startups that are not built on surveillance? Will something like the Swedish model ever come to the US or other countries (maybe Germany, Denmark or France)? And, if there must be surveillance technology, why isn't the government building their own instead of paying millions to venture and private-backed companies with closed-source products (and closed-source business plans)?

Browser Tracking

Related to my move off of Revue, I dove into some reading on browser tracking that I've been stocking up. In case you haven't read some of these tidbits yet, I thought I would share!

Thomas Baekdal wrote a wonderful long-read on the original Cookie Specification (from the 90s!). Honestly, it's a delightful read, but if you are here for the summary, here goes:

The original cookie specification was GDPR compliant! It actually called for blocking of third-party cookies in a very specific way - saying that no cookies should be allowed to be set that don't match the FULL URL (domain and URI) of the site that the user is visiting, regardless of other embeds, images, scripts or content that are loaded in from other URLs (Check out the specification section here!).

Browsers didn't implement this - he doesn't dive into why. Was it simply overlooked? Did ignoring it just become the de facto standard? Nice lobbying from advertising tech?

In the end, adtech kind of won the war and the latest IETF specification doesn't even try to pretend that it can battle adtech via third-party cookies. This kind of reminds me of how most GDPR cookie banners have been implemented. Giving me a huge banner that says I cannot use your site unless I load 10K advertising scripts isn't exactly what folks had in mind when they talked about giving users options, making cookies easier to understand, asking for consent, and allowing them to deny and still use the website.
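To make the original rule a bit more concrete, here is a minimal sketch in Python of the kind of check it describes: only accept a cookie if the resource setting it comes from the same host as the page you are actually visiting. This is my own simplified illustration, not the specification's algorithm and not how any browser implements it; the function names and example URLs are made up, and I only compare hostnames (the rule as summarized above is stricter and also looks at the URI).

```python
from urllib.parse import urlsplit


def is_third_party(page_url: str, resource_url: str) -> bool:
    """Return True if the resource is served from a different host than the page.

    Simplifying assumption: an exact hostname comparison stands in for the
    specification's matching rules.
    """
    page_host = urlsplit(page_url).hostname or ""
    resource_host = urlsplit(resource_url).hostname or ""
    return page_host != resource_host


def may_set_cookie(page_url: str, resource_url: str) -> bool:
    """Hypothetical policy: only first-party resources may set cookies."""
    return not is_third_party(page_url, resource_url)


if __name__ == "__main__":
    page = "https://news.example.com/article"
    for resource in (
        "https://news.example.com/static/logo.png",  # same host as the page
        "https://ads.tracker.example/pixel.gif",     # embedded from elsewhere
    ):
        print(resource, "may set cookie:", may_set_cookie(page, resource))
```

With a check like this in place, the embedded tracking pixel could still load, but any cookie it tried to set would be dropped.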

Happily, many browsers are now starting to block third-party cookies — and I use a few tools that help me see more of this type of behavior including Privacy Possum and of course the many AdBlockers that you can use.

Sadly, this doesn't address the fact that your ISP, the large CDNs, Facebook Pixels and yes, even favicons can be used to track your browsing data without any cookies. I was tremendously disappointed on learning that DuckDuckGo (one of my favorite apps prior to understanding this!) was leaking private information to their own content servers for the sake of "performance" for loading favicons. I am sorry if you are a favicon fan, but I don't REALLY care about them or use them in any meaningful way, so I would rather a favicon fail or time out than to send the current URL I am visiting to a DuckDuckGo-controlled favicon cache/CDN. If you want to read more on this, check out this response, which sums it up quite nicely.
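If it helps to picture why a favicon lookup can leak browsing data, here is a rough Python sketch. The favicon service address and URL pattern below are hypothetical (I made them up for illustration, they are not DuckDuckGo's actual endpoint), and I am assuming the service is told at least the hostname of the site being visited. The point is only that the request itself tells a third party where you are, no cookies needed.

```python
from urllib.parse import urlsplit

# Hypothetical central favicon service; not a real endpoint.
FAVICON_SERVICE = "https://icons.example.net/ip3/{host}.ico"


def favicon_via_service(page_url: str) -> str:
    """Build the request a client would send to a central favicon service.

    The visited hostname is embedded in the request, so the service
    operator learns which site is being viewed.
    """
    host = urlsplit(page_url).hostname or ""
    return FAVICON_SERVICE.format(host=host)


def favicon_direct(page_url: str) -> str:
    """Build a request for the favicon from the visited site itself.

    Only the site being visited sees this request.
    """
    parts = urlsplit(page_url)
    return f"{parts.scheme}://{parts.netloc}/favicon.ico"


if __name__ == "__main__":
    page = "https://clinic.example.org/appointments"
    print("via service:", favicon_via_service(page))
    print("direct:     ", favicon_direct(page))
```

Fetching directly (or just letting the favicon fail) keeps that information between you and the site you chose to visit.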

I have been trying out some privacy-aware browsers including Cliqz, Brave and Firefox Focus. I'd happily take tips on what you use to protect your browsing online!

Feedback, Please!

I'm still finding my groove here and would love some feedback. If you have time to reply and give me even one sentence on what you liked, what you didn't like, or what you would like to see more or less of, it would help me A LOT. For example, this issue didn't have too much on machine learning or data science specifically, is that okay? Or do you want every issue to have something related to machine learning?

I am currently planning on making this an every two weeks newsletter, so expect the next one in two weeks. If you have strong feelings about length or frequency, I'd love to hear that too!

I am building a small survey to share after a few more issues, and I hope you will participate and let me know if this is meeting your needs & expectations! Again, I am writing this to share some of my thinking, work and reading with you — so I want it to be fun, useful, thought-provoking and a good read. 🙂

Thank you again for joining me and please feel free to invite more folks to our privateer party by fwding this email! 🏴‍☠️ Until next time!

With Love and Privacy, kjam