Quantum Algorithms A Different View
Quantum algorithms explained in a different way, not the usual way.
David Deutsch is one of the founders of quantum computing. His 1985 paper described quantum Turing machines, and his famous algorithm is considered the first quantum algorithm ever.
Today I plan to try to explain the power of quantum algorithms, but do it in my way. I hope this “explanation” will help you understand their power, yet avoid some of the complexities with the usual explanations.
There is something “magical” about quantum algorithms. This magic can make one, like me, who is not an expert in quantum mechanics feel lost. I have even published a paper on this subject years ago—it was a generalization of Peter Shor’s famous factorization algorithm. And it is joint work with Dan Boneh, which I will discuss another time.
Yet I still feel puzzled. I guess being puzzled seems to be the first step toward showing that you “understand” quantum algorithms (QA). Here are three quotes from two famous physicists and one famous non-physicist on quantum mechanics.
- Niels Bohr:
If quantum mechanics hasn’t profoundly shocked you, you haven’t understood it yet.
- Richard Feynman:
I think I can safely say that nobody understands quantum mechanics.
- Groucho Marx:
Very interesting theory—it makes no sense at all.
If you think you understand, then you do not—a very strange state of affairs. Let’s turn to QA’s and in particular Deutsch’s famous algorithm.
Trying To Teach Quantum Algorithms
This last spring I taught a class at Tech on computational complexity. At the end of the semester I decided to present some basic ideas from quantum computation (QA). I thought the best way to introduce the class to the power of QA was to present the famous “simple” algorithm due to Deutsch. I thought it would convince them that QA’s are more powerful than classical algorithms.
I thought wrong. I looked on the web—where else—and found many detailed descriptions of how Deutsch’s algorithm works. These descriptions are detailed, use special notation, and give no intuition on why the algorithm works. At least for me the descriptions were not very insightful.
I did present the algorithm to my class, and used lots of extra comments to try to help them understand what was going on. I was not happy with the outcome. I think part of the class got something out of the lecture, yet I felt that most were probably not really following. In my experience if I do not understand something, it is pretty hard to explain to others.
This failure led me to think about how to explain this famous algorithm. I have some ideas of how to do this that I would like to share with you. I hope the following has two properties: one that it helps you understand QA’s better than before, and that my expert friends will not find it incorrect or worse silly.
Suppose Alice and Bob decide to play a simple “game.” Bob selects a boolean function , yes Bob selects a one variable boolean function. He has only four choices: . Alice’s task is determine whether or not Bob has selected a constant valued function, i.e. he has selected or .
In the classical world Alice simply asks Bob for the values of and , and clearly she can determine his choice. However, if Alice can only ask Bob for one evaluation of , then she has no way to determine his selection. This is easy to see: for instance, if Alice discovers that , then she still cannot tell if the function is constant or not.
Deutsch’s brilliant insight is that the quantum world is different. Alice can determine whether Bob has selected a constant function or not in one evaluation. She can do this with quantum magic.
In order to explain Deutsch’s Algorithm () let’s first compare classical probabilistic computation with quantum. They are different, but the comparison, I believe, should help explain some the nature of quantum computation. While it is very different from probabilistic computation, it does share much of the high-level structure with probabilistic computation.
Probabilistic Computation A probabilistic machine has a number of states that it can be in at any given time. What makes it probabilistic is at each time there is a probability distribution that describes the total state of the computation. More formally if a computation has just two states, then at any time its total state is a vector
where each are non-negative and . Thus, are a probability distribution: the probability of being in state is , and the probability of being in state is .
Every step of a probabilistic computation must preserve that the total state is a distribution. This implies that each step of the computation must be described by some stochastic matrix . The key property of such a matrix is that it sends a distribution to a distribution. If is
then and each is non-negative. If is a distribution, then
so is .
At the end of a probabilistic computation we can ask what is the probability that the computation is in a given state. In our simple two-state example, we can get the value of , for instance.
Quantum Computation A quantum machine has a number of states that it can be in at any given time. What makes it quantum is at each time there is a unit vector that describes the total state of the computation. More formally if a computation has just two states, then at any time its total state is a vector
where . This is the source of the magic. We are familiar with probability distributions, and we intuitively understand them. But what does it mean for a total state to be a unit vector? As Bohr and Feynman might say: it just is. Indeed.
Every step of a quantum computation must preserve that the total state is a unit vector. This implies that each step of the computation must be described by some unitary matrix . The key property of such a matrix is that it sends a unit vector to a unit vector. If is
then the rows are orthogonal unit vectors. If is a unit vector, then
so is . This is forced, that is it is forced that be a unitary matrix. Otherwise, the total state would not always stay a unit vector. So, if you accept that total states are unit vectors, then you must accept this.
At the end of a quantum computation we cannot get the value of final unit vector. We can perform a measurement. This is perhaps one of the other strangest properties of quantum computations. For us a measurement is the following: If is the total state then we get the answer with probability . Of course this is reasonable, since the vector is a unit vector.
We cannot ask for the value of —this is not allowed. In a sense this is one of the real distinctions from probabilistic computation. Note, if , then we will never get the answer , never. If we repeat the same quantum computation over and over we will see that must be small, and will never get as the output of the quantum computation.
At the highest level the algorithm can be viewed as follows:
- Alice picks a special dimension unit vector .
- She asks Bob to apply a unitary matrix to . Recall Bob has selected one of four boolean functions:
For each of these functions there is a different permutation matrix: Let’s use
to denote them. Note, permutation matrices are also unitary, and that Bob’s matrices are permutation because he is applying a deterministic computation. Of course Bob does not tell Alice which matrix he applies to the total state . The new total state is .
- Alice applies a certain unitary transformation to the vector . She then measures the resulting vector. If she gets as the answer, then she is sure Bob applied or . Absolutely sure. If on the other hand, she gets , , or , then she is unsure what Bob did.
Further, at least a positive fraction of the time, Alice will get the answer . The reason this is so cool is that Alice has a positive probability of solving the problem with one application of Bob’s function—step (2). In the classical world she cannot do this with one application, even if we only insist she can do this some positive fraction of the time.
This is the magic of quantum at work.
Why Does It Work?
The next key question is why can Alice figure out what Bob did? Or said another way, why does work? The answer is there is a simple geometric insight that explains all.
Let’s look at what happened—it’s like unravelling a magic card trick. It really is simple once you think about from a linear algebraic point of view. Alice started with a vector . Then Bob mapped that vector to one of four vectors,
Alice’s vector is such that these four vectors are independent—in the usual sense of linear algebra. This relies on Alice selecting the start total state cleverly, which is why I called it a special total state vector.
Alice knows the set of vectors, she does not know which one occurs in any particular execution of the algorithm. It is not hard to show that there is a unitary transformation that moves these independent vectors in a nice way. Alice can pick a unitary transformation so the first two go to unit vectors of the form
That is the high-level explanation. I left out all the details, but I hope you follow the main thrust. The critical point is while Alice does not know what Bob does, she knows he has moved her starting total state to one of four total states. Her job is to find a measurement that can separate the cases in two groups. If you like all Alice needs is to find a separating hyperplane: essentially that is what a measurement is.
Thus, at some level the reason works comes down to: The ability to separate four independent unit vectors into two groups by one hyperplane. For me this seems a lot less magical. We often see separation by hyperplanes in computational learning. I am getting a bit worried—am I starting to understand quantum algorithms?
I Cheated You
I left some details out, which I will now fill in. For example, I need to define Bob’s four permutation transformations. I also lied to you. Sorry. One complexity is that the four vectors
are not independent. But all is well as you will see. If you do not care to see all the details you can stop here. If you wish to really see what is happening, then read on.
Recall Alice works in a dimensional Hilbert space. The basis vectors are labelled
For each boolean function of one variable define the permutation matrix as follows:
where . Note,
The matrices are the permutation matrices Bob uses. The following table shows what each of these permutations does on a vector . Thus, sends the vector to , and so on.
The key difficulty is when Bob applies his permutation matrices to Alice’s we will not get independent vectors. I will now supply some simple linear algebra lemmas that will allow all to work out.
Two Lemmas From Linear Algebra
If are vectors in a Hilbert space define them to be almost independent provided any of them are independent.
Lemma: Let be a set of almost independent vectors in a finite dimension Hilbert space. Then for any nontrivial partition of into two sets and , there is a vector and an so that
for all in the set and
for all the in the set .
Proof: Let the vectors in the set be and the vectors in be . Of course plus must equal , since is partitioned into these two sets. Also , because the partition is nontrivial.
Consider the vectors in . If they are independent, then there is a so that
for all and
for all . This claim follows by linearity of the above constraints, and the independence of the vectors.
are independent. So must be a linear combination of the vectors (3). Let
Note, each coefficient must be non-zero by almost independence.
As before, there must be a vector so that
for all and
for all .
In order to complete the proof we need to show that
is not equal to zero. But this is equal to
which is non-zero. The lemma follows by taking to be the minimum of and the last value.
We need a simple lemma that shows that four vectors that arise in the protocol are almost independent.
Lemma: Let be a random unit vector. Consider the following four vectors:
Then with probability one the above vectors are almost independent.
Proof: Consider the first three vectors. The key is the matrix ,
must have rank with probability . Look at the upper corner by sub-matrix. It is equal to
It’s determinant is a homogeneous polynomial of degree . It is non-trivial: just set and and we get a matrix with determinant .
We must now, to complete the argument, show that with probability this polynomial is non-zero when are a random unit vector. This sounds difficult, because the values are not independent. But, they really have a nice property that is nearly as strong as being independent. Select four values independently from a standard Gaussian distribution. Then, the vector
is a unit vector. Here is the factor needed to normalize the vector and make it have unit length.
Why is this useful? Well now we need compute the probability that
Since is homogeneous this is the same as the probability that
where are independent Gaussians. But is nontrivial so this probability is .
The other cases follow in the same manner.
Revisiting: Why Does It Work?
We can now see why the protocol works. By lemma 2, Alice can find a starting vector so that
are almost independent. Further, she can find the she needs to make the measurement work as required.
Does this approach help you understand the power of quantum computing? Did I confuse you further? Or did this help?
Ken Regan has pointed out that I should add several comments:
- How what I call differs from the “real” Deutsch algorithm.
- How the power of quantum is usually explained.
- How a better algorithm can be constructed, one where Alice is always right.
Since this discussion is already very long, Ken and I will put together a longer piece in the near future that includes these points.