Making Public Information Secret
A way to recover and enforce privacy
[Image credit: McNealy bio source]
Scott McNealy, when he was the CEO of Sun Microsystems, famously said nearly 15 years ago, “You have zero privacy anyway. Get over it.”
Today I want to talk about how to enforce privacy by changing what we mean by “privacy.”
We seem to see an unending series of database break-ins. There is of course a huge theoretical literature, along with many methods, for protecting privacy. Yet databases are still broken into and people lose their information. We wish to explore whether this can be fixed. We believe the key to the answer is to change the question:
Can we protect data that has been illegally obtained?
This sounds hopeless—how can we make data that has been broken into secure? The answer is that we need to look deeper into what it means to steal private data.
After The Horse Leaves The Barn
The expression “the horse has left the barn” means:
Closing/shutting the stable door after the horse has bolted, or trying to stop something bad happening when it has already happened and the situation cannot be changed.
Indeed, our source gives as its main example: “Improving security after a major theft would seem to be a bit like closing the stable door after the horse has bolted.”
[Photo by artist John Lund via Blend Images, all rights reserved.]
This strikes us as the nub of privacy. Once information is released on the Internet, whether by accident or by a break-in, there seems to be little that one can do. However, we believe that there may be hope to protect the information anyway. Somehow we believe we can shut the barn door after the horse has left, and get the horse back.
Suppose that some company makes a series of decisions. Can we detect whether those decisions depend on information that the company should not be using? Let’s call this Post-Privacy Detection.
Consider a database that stores values $(x, y)$, where $x$ is an $n$-bit vector of attributes and $y$ is an additional attribute. Think of $y$ as small, even a single bit such as the sex of the individual with attributes $x$. Let us also suppose that the database is initially secure for $y$, insofar as given many samples of the values of $x$ only, it is impossible to gain advantage in inferring the values of $y$. Thus a leak of $y$ reveals meaningful information.
Now say a decider is an entity $D$ that uses information from this database to make decisions. $D$ has one or more Boolean functions $f(x, y)$ of the attributes. Think of $f(x, y)$ as a yes/no on some issue: granting a loan, selling a house, giving insurance at a certain rate, and so on. The idea is that while $y$ may not be secret, since the database has been broken into, we can check that in aggregate $y$ is effectively secret.
The point here is that we can detect if $y$ is being used in an unauthorized manner to make some decision, given protocols for transparency that enable sampling the values $f(x, y)$. If given a polynomial number of samples we cannot tell the values of $y$ to within $\epsilon$, then we have large-scale assurance that $y$ was not material to the decision. Our point is this: a leak of values about individuals is material only if they are used by someone to make a decision that should not depend on their “private” information. Thus if a bank gets values of $y$ but does not use them to make a decision, then we would argue that the information, while public, was effectively private.
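One simple way to picture such a sampling audit is sketched below. It is only an illustration, not the authors' protocol: it assumes the sensitive attribute is a single bit `y` that is independent of the other attributes `x`, and it compares the rate of "yes" decisions between the two values of `y`. The deciders `oblivious` and `leaky`, and the helper names, are hypothetical.

```python
import random

def decision_rate_gap(samples):
    """samples: iterable of (y, decision) pairs, both in {0, 1}.
    Returns the estimated |Pr[decision=1 | y=1] - Pr[decision=1 | y=0]|."""
    counts = {0: [0, 0], 1: [0, 0]}  # y -> [number of samples, number of yes decisions]
    for y, d in samples:
        counts[y][0] += 1
        counts[y][1] += d
    rates = []
    for y in (0, 1):
        total, yes = counts[y]
        if total == 0:
            raise ValueError("need at least one sample for each value of y")
        rates.append(yes / total)
    return abs(rates[1] - rates[0])

def simulate(decider, n_bits=8, n_samples=20000, seed=0):
    """Draw (x, y) with y independent of x and record the decider's output."""
    rng = random.Random(seed)
    samples = []
    for _ in range(n_samples):
        x = rng.getrandbits(n_bits)  # hypothetical attribute vector
        y = rng.getrandbits(1)       # hypothetical sensitive bit
        samples.append((y, decider(x, y)))
    return samples

def oblivious(x, y):
    return x & 1          # decision depends on x only

def leaky(x, y):
    return (x & 1) | y    # decision also depends on y

gap_oblivious = decision_rate_gap(simulate(oblivious))
gap_leaky = decision_rate_gap(simulate(leaky))
```

Under these assumptions the gap for the oblivious decider should be close to zero, while the leaky decider shows a large gap. A small gap is only evidence, not proof, that the sensitive bit was ignored.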
Definition 1 Let a database contain values of the form $(x, y)$, and let $f(x, y)$ be a Boolean function. Say that the $y$ part is effectively private for the decision $f$ provided there is another function $g(x)$ so that
$$\Pr[f(x, y) = g(x)] \geq 1 - \epsilon,$$
where $\epsilon$ is a small error bound. A decider $D$ respects $y$ if $y$ is effectively private in all of its decision functions.
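To make the definition concrete, here is a minimal sketch of checking it empirically on a toy database. It assumes records are triples of attribute vector, sensitive attribute, and a 0/1 decision, and that attribute vectors repeat in the sample; the best candidate for the function of the attributes alone is then the majority decision for each attribute vector. All names are hypothetical, and for long attribute vectors one would need some other way to fit that function.

```python
from collections import defaultdict

def best_oblivious_agreement(records):
    """records: iterable of (x, y, decision) triples with decision in {0, 1}.
    Returns (g, agreement): g maps each observed x to its majority decision,
    and agreement is the fraction of records whose decision equals g[x]."""
    tallies = defaultdict(lambda: [0, 0])  # x -> [count of 0 decisions, count of 1 decisions]
    total = 0
    for x, _y, d in records:
        tallies[x][d] += 1
        total += 1
    if total == 0:
        raise ValueError("no records")
    g = {x: int(c[1] > c[0]) for x, c in tallies.items()}
    agree = sum(max(c) for c in tallies.values())
    return g, agree / total

def effectively_private(records, epsilon):
    """True if some function of x alone matches the decisions on at least
    a 1 - epsilon fraction of the sampled records."""
    _g, agreement = best_oblivious_agreement(records)
    return agreement >= 1 - epsilon

# Toy data: 4-bit attribute vectors, one sensitive bit.
records_oblivious = [(x, y, x & 1) for x in range(16) for y in (0, 1)]
records_leaky = [(x, y, y) for x in range(16) for y in (0, 1)]
```

On the toy data, `effectively_private(records_oblivious, 0.01)` holds because the decision ignores the sensitive bit, while `records_leaky` fails the check for any tolerance below one half.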
We can prove a simple lemma showing that this definition implies that $y$ is not compromised by sampling the decision values.
Lemma 2 If the database is secure for $y$ and $y$ is effectively private, then there is no function $h$ such that $h(x, f(x, y)) = y$ with probability at least $1 - \epsilon$.
Proof: Suppose for contradiction that such an $h$ exists. By effective privacy, a function $g$ as above also exists. Then given $x$, we obtain $f(x, y)$ as $g(x)$ with probability at least $1 - \epsilon$. Then using $h$ we obtain $y$ with overall probability at least $1 - 2\epsilon$. This contradicts the initial security of the database for $y$.
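The probability accounting in the proof is a union bound. The derivation below spells it out under assumed notation: $x$ is the attribute vector, $y$ the sensitive attribute, $f$ the decision function, $g$ the function of $x$ alone from Definition 1, $h$ the hypothetical inference function, and $\epsilon$ a common error bound for both. If $g(x) = f(x, y)$ and $h(x, f(x, y)) = y$ both hold, then $h(x, g(x)) = y$, so a failure requires at least one of the two events to fail.

```latex
\begin{align*}
\Pr[h(x, g(x)) \neq y]
  &\leq \Pr[g(x) \neq f(x, y)] + \Pr[h(x, f(x, y)) \neq y] \\
  &\leq \epsilon + \epsilon = 2\epsilon .
\end{align*}
```

So $y$ would be recovered from $x$ alone with probability at least $1 - 2\epsilon$, which for a single-bit $y$ and $\epsilon < 1/4$ is a real advantage over guessing.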
Pulling in the Long Reins
To be socially effective, our detection concept should exert influence on deciders to behave in a manner that overtly does not depend on the unauthorized information. This applies to repeatable decisions whose results can be sampled. The sampling would use protocols that effect transparency while likewise protecting the data.
Thus our theoretical notion would require social suasion for its effectiveness. This includes requiring deciders to provide infrastructure by which their decisions can be securely sampled. It might not require them to publish their $y$-oblivious decision functions $g$, only that they could, if challenged, provide one. Most of this is to ponder for the future.
What we can say now, however, is that there do exist ways we can rein in the bad effects of lost privacy. The horses may have bolted, but we can still exert some long-range control over the herd.
Is this idea effective? What things like it have been proposed?