Making A Heuristic Into A Theorem
A provable example of the linearization method
Nicolas Courtois is one of the leading experts on attacks on practical crypto-systems. He has worked on trying to break, for example, various symmetric cipher systems, including AES. Since 2000, AES, the Advanced Encryption Standard, is the symmetric cipher approved by NIST. It is claimed that it will be safe for the next 200 years. See Courtois’ page for up to date information on AES and related issues.
Today I would like to talk about a heuristic technology that has been quite powerful in attacking crypto-systems—called linearization. There is a great new result that there is a class of problems that can be provably solved by this heuristic.
There are two cultures in cryptography: the practical cryptography researchers and the theoretical cryptography researchers. Let’s call them the ‘s and ‘s. Both are important, both study related problems, they share techniques, and many researchers work in both areas. For example, the Siegenthaler inequality has been used by both communities—named for Thomas Siegenthaler. It relates the degree of the a boolean function and the size of its smallest non-zero Fourier coefficient.
Courtois is certainly one of the leaders in , and Sanjeev Arora is one of the leaders in . Our discussion is about a method that both groups have used in the past. However, I believe it is fair to say that the method, called linearization, has been mostly used by those in . What is exciting to report is that Arora and his student Rong Ge have a paper that shows how researchers in can add to our global understanding of this important method. Perhaps their ideas, which are interesting on their own, can help shed light on . We will see.
The linearization method is based on the simple idea of converting a quadratic equation into a linear equation. For instance,
is mapped to
There is good news and bad news here: The good news that the new equation is linear in the variables ; the bad news is that the new equation has lost some information. The following is now possible,
If we go back to the actual variables this is impossible, since the first two equations make and non-zero, yet the last equation means that one of these must be zero. This is a linear equation, and since linear is “good” this has the potential to change a hard problem into simple one.
The method typically takes a set of quadratic equations and converts them into a set of linear equations. Then these equations are solved by Gaussian elimination. Any solution of the quadratic equations is of course a solution of the linear ones; however, the linear equations may have many extra solutions. The method works provided the linear equations have few extra solutions.
We can try the same idea on equations of any degree . We replace terms of degree at most by variables where is a set of size up to . Of course, for greater we have many more ways to lose information. To keep the number of extra solutions down in proportion, we need many more equations. Note also that we could rewrite our higher-degree equations as quadratic by introducing new variables, but doing so might not improve the situation.
Heuristically, we can judge how many initial equations we need to have a good chance of making this work. In practice, in this works sometimes. In theory, in there are certain cases where this can be proved to work. That is the insight of Arora and Ge.
Linearization In Practice
The key technical insight is that AES has a nice algebraic structure. Nice algebraic structures can hurt crypto-systems, especially symmetric ciphers. Bruce Schneier and Niels Ferguson have written:
We have one criticism of AES: we don’t quite trust the security What concerns us the most about AES is its simple algebraic structure No other block cipher we know of has such a simple algebraic representation. We have no idea whether this leads to an attack or not, but not knowing is reason enough to be skeptical about the use of AES.
One problem is that there are attacks on AES that reduce to solving many quadratic equations—apply linearization. Solving such systems of equations in general is NP-hard, and so the hope is that AES is safe. But in practical cryptography one must look at the actual systems. In this case there are about 8000 equations with 1600 variables for the 128-bit AES.
These are apparently not enough equations to make linearization to work directly on AES. An extension of linearization called the XSL attack was created in 2002 by researchers Nicolas Courtois and Josef Pieprzyk to try to overcome this roadblock. The XSL attack is based on linearization and some further tricks. There is an on-going debate whether these or related methods will one day break AES.
Enter Arora and Ge. They have a pretty paper on using linearization methods to solve a learning problem. I think this paper and their results are important. They can prove that linearization actually works on their learning problem. This is in sharp contrast where in the practical work on ciphers like AES there are no proofs yet. There are ample experiments to suggest that certain attacks will or will not work. But no proofs.
Learning Noisy Parity
Arora and Ge work on the problem of learning parities with noise. This is a classic problem, well studied, and extremely difficult. The standard model is that there is a secret boolean vector of some length . There is a “box” so that when you press the “button” the following happens:
- The system generates uniformly at random and shows you .
- The system generates a hidden “noise bit” under a non-uniform distribution, such as happening with probability .
- The system shows you the bit .
After many button presses, one gets a system of linear equations, most of which are correct. However, a substantial fraction of them are incorrect, and this makes solving the linear system very difficult.
Arora and Ge modify the problem in a interesting manner. Each time you press the button you get, say, ten ‘s and ten corresponding ‘s. You are promised that at most 3 are incorrect. Note this is the same on average, but it is not exactly the same as the original problem.
The surprise is that they can solve this new problem, which they call structured noise. The problem reminds me some of list decoding, and it seems to be related to that in some way.
Each time the button is pressed we get values. The incorrect solutions collectively satisfy a certain noise polynomial . For example, they may satisfy a certain polynomial that is zero on all binary -tuples of Hamming weight at most . Note that can have degree . The one condition on their theorem is that the correct pattern (or some correct pattern) cannot be written as the GF(2) sum of two patterns on which is zero. This rules out arbitrary noise patterns.
In their model the original variables stand for bits of the secret vector . The new variables after the linearization have the form for sets of size up to . A solution to those variables is good provided for all . In this model they give a randomized algorithm and prove:
Theorem: If there are values and variables, then runs in time and finds a good solution with high probability.
Of course for constant —which entails constant —this yields a probabilistic polynomial-time algorithm. I really like this paper, not only for proving a nice result, but for the type of result that they get. We need to prove more results of the form: the following heuristic method can be proved to work in the case of One of the mysteries of complexity theory is that some heuristics work very well in practice, and we do not understand why. I think that their result helps unravel this.
Can we prove that linearization works on more problems? Can we prove that other heuristics also work?
Is there connection between the beautiful work of Arora-Ge and crypto-analysis? Does the structure in AES allow an attack in which a certain proportion of “hits” are bound to be correct, and the non-hits satisfy a noise polynomial of suitable low degree? Can help ?