Think About Privacy Like an Engineer
There are lots of ways to think about privacy these days. If you happen to be a lawyer at Meta, you probably think about privacy in terms of all the bits of information that are collected from users and are legally required to be listed in a privacy policy. If, by chance, you happen to be a member of the US Congress, privacy might mean protected classes of data, including things like medical and educational records. In the much more likely case that you are a normal user of various apps and websites, privacy probably just means that there’s a checkbox or a button you need to click to accept “cookies” before you can use the app or website.
Alternatively, if you happen to know many software engineers (in particular, ones who work in infrastructure) and ask them what privacy means, you may end up in a conversation about risk aggregation. That’s because risk is a currency in business, and especially in software businesses. You can bet that any software company with a stamp that says “SOC 2” on it maintains a “risk register” (or at least updates it once a year when the auditors ask about it). The purpose is pretty simple: gauge the likelihood that a certain event will take place, as well as its downside impact on the business. By events, we generally mean things like data breaches or fraud, and the likelihood and impact together are the risk that the business takes on. Take the entire register together and you have the aggregated risk a business assumes in order to offer its services.
Need an example? Pretend you are an executive at Equifax circa 2017. You might have a risk register that has a line item that looks like this:
Risk: Personal information database breach
Likelihood: Low
Impacts: Reputation loss, financial impacts due to exposure to class action settlement, regulatory oversight compresses margins
This risk exists because the product consists of collecting lots of personal financial data about individuals; if Equifax doesn’t take this risk, it doesn’t have a product (arguably it has been such a poor steward of data that it shouldn’t have this product anyway).
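If you wanted to model that line item in code, a minimal sketch in Python might look like the following. The names here (Level, RiskEntry, severity, score) are my own inventions for illustration, and multiplying likelihood by severity is just one common scoring convention, not something Equifax or any auditor prescribes. The "High" severity is also my assumption; the line item above only lists the impacts.

```python
from dataclasses import dataclass
from enum import IntEnum


class Level(IntEnum):
    """Coarse three-step scale (hypothetical; real registers vary)."""
    LOW = 1
    MEDIUM = 2
    HIGH = 3


@dataclass(frozen=True)
class RiskEntry:
    """One line item in a risk register."""
    risk: str
    likelihood: Level
    impacts: tuple[str, ...]
    severity: Level  # assumption: a single rating that summarizes the impacts

    def score(self) -> int:
        # Likelihood x severity, a common (but not universal) convention.
        return int(self.likelihood) * int(self.severity)


breach = RiskEntry(
    risk="Personal information database breach",
    likelihood=Level.LOW,
    impacts=(
        "Reputation loss",
        "Class action settlement exposure",
        "Regulatory oversight compresses margins",
    ),
    severity=Level.HIGH,  # assumed for illustration
)

print(breach.risk, "->", breach.score())
```

A low likelihood paired with a high severity lands in the middle of the scale, which is roughly how a "that will never happen to us" line item ends up looking unremarkable on paper.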
Most modern businesses rely on information flowing cleanly between different pieces of infrastructure and vendors (subprocessors). For example, when you buy a fancy espresso from the local coffee shop and pay with a credit card, the point of sale system sends data to an app hosted on a cloud provider like AWS, which interacts with a vendor like Square to process the payment (that’s two vendors, or subprocessors), and then the rewards system updates your history to record the purchase (a third subprocessor). What does your risk register look like for this transaction?
Your credit card number (high impact, low likelihood of exposure)
You like coffee (low impact, high likelihood of exposure via data sharing)
This contrived example isn’t terribly risky, but it’s just one of hundreds of decisions we make subconsciously, and the register adds up over time. You are aggregating risk, just like Equifax did as it accumulated personal financial information one person at a time, 147 million times over.
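To get a feel for how small risks add up, here is a back-of-the-envelope sketch. It assumes each exposure is independent and that every service carries the same 1% chance of leaking your data, both of which are made-up simplifications for illustration (real breaches are neither independent nor uniformly likely). The function name is hypothetical.

```python
import math


def chance_of_any_exposure(probabilities):
    """P(at least one exposure) = 1 - product of (1 - p_i).

    Assumes the events are independent, which is a simplification.
    """
    return 1.0 - math.prod(1.0 - p for p in probabilities)


# Illustrative numbers only: a 1% chance per service.
one_service = chance_of_any_exposure([0.01])
hundred_services = chance_of_any_exposure([0.01] * 100)

print(f"one service:      {one_service:.2f}")
print(f"hundred services: {hundred_services:.2f}")
```

Under those assumptions, a risk you would shrug off for any single service becomes more likely than not once you have handed your data to a hundred of them.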
The way we use devices and the internet has evolved a lot over time (duh) and data has emerged as a hard currency of the information era. Sometimes you are paying money to manage risk (buying an app that runs locally on your laptop), and sometimes you are forking over data to get some service for free (e.g. Instagram, paid for by ads). When that happens, you are aggregating risk. This is the part of the post where I should write a “call to action” so that you hopefully exchange cash for something from me, but I haven’t built that part yet. I can tell you what I want for myself, however:
I want to use LLMs on my own machine
When I need to do some one-off task, I want to make an LLM do it for me on my own computer, and not sign up for yet another service
When I save a file, I want to save it to my local machine
When I send someone a file, I don’t want to save it to some cloud drive
When I need to buy pants, I don’t need my algorithm updated
When I publish things, I don’t want companies scraping them for financial gain without permission
I could really keep going, but I think you get the point: I want to use my computer like it’s 2002. What are you nostalgic for in today’s computering world? How do you want the internet to change?