# BDD’s and Factoring

* Binary decision diagrams, pushdown machines, and factoring *

Randy Bryant is famous for his work on BDD’s, which are one of the most powerful tools used today in the design and analysis of digital logic. While Randy has done other great work, his paper on BDD’s created an avalanche of research, papers, systems, and applications. This famous paper is one of the most cited in all of computer science. There are many additional papers that should probably cite him–after a while you stop getting citations, since the work is so well known. This phenomena is related to *Ron Graham’s Award Lemma*:

Lemma:For every field of science with awards, there is a , so that if you win at least awards, then you will win at least .

Today I plan to talk about BDD’s from the view of complexity theory, not from the view of a logic designer. I will explain what they are, why they are important, and the relation to problems near and dear to me–like factoring. Always factoring.

At the anniversary celebration of the creation of the School of Computer Science at Carnegie-Mellon (CMU) Randy, the Dean, made a comment that I think sums up his approach to many things. The , in 2006, was a big event and one of the greats of AI was speaking to a packed hall. John McCarthy was talking about the seminal work that was done at CMU, in the early years, especially highlighting the work of Allen Newell and Herbert Simon on list processing. Now there is no doubt that they were the first to create list processing languages, but it is unclear whether they actually invented the concept of a list. In any event, they shared, a well deserved, Turing award in 1975 for this and related work.

When John finished his talk to a huge applause, Randy, who was in the audience, stood up and started to question whether Newell and Simon actually invented the notion of a list. John was definitely taken aback: why would the Dean ask these questions about two of CMU’s greatest scientists? But, Randy is a true independent thinker, and he wanted to know–in front of hundreds of people–did Newell and Simon actually invent the list concept or not? The session chair quickly stepped in and the celebration moved on to another talk.

This was an especially interesting debate, since Newell was one of the wisest and kindest faculty that CMU probably has ever had; truly he is beloved by the entire CMU community. He taught me what little AI I know–in those days long ago–AI was called Complex Information Processing (CIP).

I still remember his great sense of humor, which helped make him such a wonderful teacher. The first day of his CIP class, our classroom was completely full, at least twenty students had to stand in the back of room. They had no seats, no desks. One of them raised a hand and asked Newell if we could possibly move to a larger room. He simply said, “in my experience, the room will grow in size as the class goes on.” Of course he was correct, after a few more class meetings, there were plenty of seats.

Randy never did get an answer to his question: who invented lists? I think raising the question shows his quest for knowledge, his ability to think out of the box, and his ability to not care what the answer is–as long as it is correct. I respect him very much for these traits.

** BDD’s **

So what is a BDD? It is a *binary decision diagram*. Simply, it is a way to represent a boolean function, that allows the discovery of properties of boolean functions that would otherwise be hard to determine.

Actually, Randy studied what are called *Ordered Binary Decision Diagrams* (OBDD), and today what I call BDD’s are a special case of OBDD–confused? I am doing this for two reasons. The special case that we will study has, up to a small factor, the same size as the more general OBDD’s. So if you are an expert, please do not get too upset at me; and if you are not, just be aware that I have simplified things to avoid complex notation. The essential ideas, the power of the method, are still the same.

Also by restricting ourselves to this special, but powerful class of BDD, we will be able to extend the notion of BDD’s in an interesting way. Since this extension is more interesting to theorists, than to logic designers, I believe that it is not yet well explored. But, I think it has great potential.

One of the great mysteries to me about BDD’s is that they are not–I believe–used much in the complexity theory arena, yet they are a simple generalization of something that we know well: finite state automata (fsa). One way to view a BDD is as a kind of fsa on “steroids.”

Suppose that where . In complexity theory we have many ways to define the complexity of . What BDD’s do is define the complexity of as this: Let be the size, in states, of the smallest deterministic finite state automata that accepts the language:

where is a permutation. Then, define as the minimum of over all permutations .

A classic example, that demonstrates the power of BDD’s, is the language of all strings of the form

where , for all indices . It is easy to see that a fsa for this language requires states: after scanning the first half of the string, the automata must “remember” different values. Any fewer states, forces the automata to confuse two strings, and therefore make a mistake.

Yet, there is something unsettling about this example. It only works, because the string is represented in a most inconvenient manner. Suppose instead the string is represented as:

Then, the fsa needs only states to accept the new language. This huge improvement is the fundamental idea of BDD’s:

Allow the inputs to be re-ordered in the most convenient way.

** A Basic Property of BDD’s **

One of the most important properties of a BDD is:

Theorem:Suppose that is at most and we know the permutation . Then, in polynomial time in , we can determine whether or not there is an so that . Even more, we can also count the number of , so that in the same time bound.

In many representations deciding if there is an so that is NP-hard. Here we can even count the exact number of such inputs.

The proof of this theorem depends on a simple property of deterministic finite state automata. Consider an fsa with states, that we wish to count the number of inputs of length that it accepts. We create a new fsa with at most states such that, (i) simulates on length inputs, and (ii) there are no cycles in ‘s finite state diagram. Then, we can label, inductively, the states of with the number of inputs that can reach that state. In this manner we can count the number of inputs that accepts.

The above argument is quite useful, and is only a slight extension of the well known fact that deciding if an fsa accepts some input is in polynomial time. Yet, this extension does not seem to be as well known as it should be.

** Bounded Width Computations **

There is a relation between read-once bounded width programs and BDD’s. Are they not the same? The answer is both yes and no.

A BDD can be viewed as a read-once bounded width computation, where the width is bounded by the number of states. So the answer is yes.

This is a bit misleading, however. The power of BDD’s is the ability to select the order of the inputs to the computation. The ability to vary the order of inputs to make the width of the computation smaller is what the theory of BDD representation is all about. So the answer is no.

Also in the next two sections, I will present two extensions to the basic BDD concept that will diverge greatly from bounded width computations.

** Pushdown BDD’s **

There is a further reason that BDD are different from read-once computations. It is possible to generalize the BDD notion and replace finite state automata by a nondeterministic push down automata. Now the complexity of a boolean function is determined by the size, in states, of the smallest pushdown automata that accepts the language:

where is a permutation. Again we are free to chose the “best” permutation. Let us denote this minimum by .

The reason this is interesting is that the power of a pushdown trivially allows a kind of read-twice. Suppose the input is , then by placing it on the pushdown and popping it off at the end, such a machine can simulate the ability to see the string:

This is a special kind of read twice ability.

The counter-part of the basic theorem now is weaker:

Theorem:Suppose that is at most and we know the permutation . Then, in polynomial time in , we can determine whether or not there is an so that .

Note, we do not preserve the ability, to my knowledge, to be able to count the number of accepting paths.

The proof of this theorem is simple, but a bit curious. The proof is simply to note that to determine if for some , reduces to the emptiness problem for a pushdown machine. This can be done in polynomial time in the number of states of the machine.

However, the classic literature on language theory does not seem to prove this. The literature does prove that any such language is in polynomial time, but that is not exactly what we need. It is an issue of uniformity: we need given a pushdown machine in time polynomial in the size of , we can tell if is empty. That is different from the well known fact that is in polynomial time. Do you see the difference?

My guess is that most of the theorems about pushdown machines were proved long before the great interest in polynomial time. This is an aside, but clearly as our field advances we must re-examine old theorems and results–many may need to be re-proved in light of new concerns.

** Another Extension of BDD’s **

Define an *encoding* as a mapping so that (i) is computable in polynomial time and (ii) exists and is also computable in polynomial time. Obviously, this means that the mapping is an efficiently computable bijection on the set of length bit strings.

For many applications of the BDD concept to theory, we can allow much more powerful encodings than just permutations. Clearly, a permutation is an encoding in the sense that we mean here. But, many other natural mappings could be used:

- The encoding could be an arbitrary linear invertible map over .
- The encoding could be a permutation polynomial over the input;
- The encoding could be a piecewise linear map.

The latter could be a map that is defined, for example, as one bijection on the interval to itself, and another on to itself.

Of course, the more powerful the encoding allowed the more difficult it will be to prove lower bounds, or lower bounds may not even exist. Thus, it may be possible to change the complexity of a function dramatically.

** Lower Bounds on BDD’s **

One of the first theorems that Randy proved about BDD’s was a lower bound on . Let be the boolean function that is on strings of the form:

where as integers.

The importance of this function to Randy was that “real” digital logic had multipliers. Thus, he was quite interested in how large is the BDD representation. He proved the following pretty theorem:

Theorem:For large enough,

Randy used a lemma from an earlier paper of Bob Sedgewick and myself on related lower bounds via communication complexity. I have mentioned our work in an earlier post, and will return to it in the future. For the details of how Randy proved this theorem see his paper.

Philipp Woelfel, in 2005, improved this greatly to,

Theorem:For large enough,

These lower bounds do not apply to the extensions that we have raised: and other types of encodings.

** Factoring **

The interest that I have in the previous lower bounds is their relationship to factoring. If has BDD complexity , then we can factor numbers of length in time polynomial in . The same is true, if we replace the BDD complexity by BDDPUSH complexity.

Suppose that for some ordering of there is an fsa that accepts if and only if . Then, there is an oracle that can answer any query of the form: Given are there so that ? Here is the concatenation of and . The existence of this oracle follows directly from the main theorem on BDD’s.

Then, by a standard self-reduction argument, we can factor. Suppose that we wish to factor . Then, we proceed to try and find the value of so that for some . We do this by guessing one bit of at a time, while restricting .

** Open Problems **

There are a number of interesting open problems. One is what is the value of ? If it is of the form , then what is the value of ? If it is small, then this could be a threat to factoring based cryptography.

Suppose we have a boolean function , how well can we approximate it with another boolean function so that ? For example, is the multiplication function is well approximated by such a boolean function?

Also what happens for non-standard encodings? Can we still prove exponential lower bounds on , or do these become too hard to prove? Of course, a better upper bound would be quite interesting.

The BDD measure of complexity is quite different from some of our standard measures. For example, some of our favorite boolean functions, old friends, that have high complexity in other measures are trivial in this model. For example, computing parity, or more generally any threshold function are easy in this model.

Finally, can BDD’s be used to solve, for example, the Junta problem? Or are there other possible applications?

This is the post.

*To paraphrase Don Knuth: whenever you need more bits in a counter, it is a big deal. So thank you all for your kind support. On to the next power of two.*

Regarding second open problem, the constant function g()=0 agrees with mult on most inputs, and has BDD of size 1.

Pretty quick answer.

I guess we need to refine the question, don’t you think. Perhaps you have some idea what a better question is? I will think about this some myself.

One possible idea would be to look at the function f(x,y,i) = 1 if and only if the ith bit of xy is 1. Does that work?

Apropos your question on Juntas: Can BDDs solve k-CLIQUE. I.e. given a graph, can you determine if it has a clique of size k? I suspect that the ability to re-order the input does not help, neither for these nor for Juntas, so results for automata might apply (or, at least, the impossibility proofs on automata might generalize to BDDs).

One minor detail about the question on k-CLIQUE: The natural input representation would be of size n^2, but I don’t want this. I want to allow the BDD to access input bits over and over. So I assume the input is of infinite length: it cycles over the n^2 bits over and over. And the BDD can stop when it wishes. This might also be an interesting variant to study multiplication in: suppose you get the input over and over again. Do you still need exponential number of states? Similarly, for Juntas the input would be an infinite number of samples.

Hmm, I just realized the following: If I’m not wrong, then TC0 can be computed by BDD’s with poly(n). states (it’s not entirely trivial, since the weights may be large; I think it’s true, though). I think it might even be the case that TC^1 can be computed by poly-size BDDs. That would mean that any superpolynomial lower bound for BDDs gives a lower bound against TC0 or maybe even TC^1. If I’m not wrong, such lower bounds are not known for explicit functions. On the other hand, a lower bounds which is linear or quadratic or something like that might not be all that exciting for Juntas or k-CLIQUE, I think.

On the other hand, poly-sized BDDs can easily be seen to be in P. So there shouldn’t be a poly-size BDD for Juntas or k-CLIQUE or whatever.

So, I’m at a loss for what would be a good question to ask. Maybe define a class of monotone BDDs, and to prove that no poly-size monotone BDD solves k-CLIQUE or Juntas. This would be analogous to the corresponding result about monotone circuits.

How should we define monotone BDDs? One way would be to add monotonicity as a syntactic requirement, i.e. just ask that the function f also be monotone; another way would be to require that there is an ordering of the states, such that if the current state transitions to v if 0 is read and to u if 1 is read, then v lies before u in the ordering.

Finally, one question: is the result that monotone circuits cannot compute k-CLIQUE, known also for monotone threshold circuits?

P.S. My comments keep getting blocked by the spam filter, because I’m writing through a proxy (wordpress is blocked in China). Is it possible to relax the filter?

Apropos your question on Juntas: Can BDDs solve k-CLIQUE. I.e. given a graph, can you determine if it has a clique of size k? I suspect that the ability to re-order the input does not help, neither for these nor for Juntas, so results for automata might apply (or, at least, the impossibility proofs on automata might generalize to BDDs).

One minor detail about the question on k-CLIQUE: The natural input representation would be of size n^2, but I don’t want this. I want to allow the BDD to access input bits over and over. So I assume the input is of infinite length: it cycles over the n^2 bits over and over. And the BDD can stop when it wishes. This might also be an interesting variant to study multiplication in: suppose you get the input over and over again. Do you still need exponential number of states? Similarly, for Juntas the input would be an infinite number of samples.

(later thoughts:)

Hmm, I just realized the following: If I’m not wrong, then TC0 can be computed by BDD’s with poly(n). states (it’s not entirely trivial, since the weights may be large; I think it’s true, though). I think it might even be the case that TC^1 can be computed by poly-size BDDs. That would mean that any superpolynomial lower bound for BDDs gives a lower bound against TC0 or maybe even TC^1. If I’m not wrong, such lower bounds are not known for explicit functions. On the other hand, a lower bounds which is linear or quadratic or something like that might not be all that exciting for Juntas or k-CLIQUE, I think.

On the other hand, poly-sized BDDs can easily be seen to be in P. So there shouldn’t be a poly-size BDD for Juntas or k-CLIQUE or whatever.

So, I’m at a loss for what would be a good question to ask. Maybe define a class of monotone BDDs, and to prove that no poly-size monotone BDD solves k-CLIQUE or Juntas. This would be analogous to the corresponding result about monotone circuits.

How should we define monotone BDDs? One way would be to add monotonicity as a syntactic requirement, i.e. just ask that the function f also be monotone; another way would be to require that there is an ordering of the states, such that if the current state transitions to v if 0 is read and to u if 1 is read, then v lies before u in the ordering.

Finally, one question: is the result that monotone circuits cannot compute k-CLIQUE, known also for monotone threshold circuits?

P.S. My comments keep getting blocked by the spam filter, because I’m writing through a proxy (wordpress is blocked in China). Is it possible to relax the filter?

I like the idea of looking at cliques. I think you are right that BDD will not do well there, but does need a proof. Also the question about monotone threshold should be yes–as I understand what you mean. The general monotone could simulate the threshold easily etc.

Your last about wordpress filter–alas I only use their servers and have no way to do anything.

Sorry about that.

I like this reordering idea. It reminds me of a problem I’ve been tossing around: it’s well-known that if is a regular language and denotes the number of words of length , then is rational.

If it’s known that is rational, does it follow that rearranges to a regular language? (I’m thinking in particular of the language of palindromes, but there exist languages that aren’t even context-free with this property.)

Very interesting question. I do not know what to think. Quite a neat idea.

Dick, I was happy to see you point out that BDDs are essentially DFAs. I have been teaching this in my logic course at Rice for years, but this obvious fact is not widely known. It amazes me that Bryant did not realize this. In fact, all of his constructions for BDDs are essentially standard constructions for DFAs.

Two comments:

1. It should be mentioned that, in practice, finding a good variable order is one of the most difficult challenges for BDD users.

2. I am not sure I understand your comment about the emptiness problem for pushdown automata. Are you saying that the polynomial decidability of this problem was not classically known? But pushdown automata can be polynomially translated to equivalent CFGs, and the emptiness problem for CFGs can be checked in linear time.

Moshe

No. Was just saying that the classic proofs that show pda –> cfg etc do not state that all is polynomial bounded. If you look inside the proof, the answer is clear and you are right. But, at least what I found, did not explicitly claim what is needed.

It is a small but interesting point I thought.

Dick, you are right, but also many automata-theory books do not explicitly state that NFA nonemptiness is in LOGSPACE. They do not even say that it is in PTIME. I agree with you that this is because the classical results in automata theory were obtained before people were thinking complexity-theoretically.

“This is the 64th post”.

Congratulations!

Is the next number 128?

Robin Whitty is doing this by prime numbers.

http://www.theoremoftheday.org/Theorems.html

P.S. I quess this is the little difference between a computer scientist and a pure mathematician.

Yes we like to count are bits. The next is 128. Perhaps will change to another milestone system since after a while they get pretty sparse.

Hello.

What if, in a general theory of everything kind of way, some singular human conscious had used simple Yes(I should feel guilty) or No(I do not feel guilty) answers to answer every moral question that he could remember that had ever occurred in his life. In this way he could have become completely pure. He would have truly asked himself all the questions that could be asked of him. He would have emotionally evolved back to when he was a child.

What if he then assigned an ‘It doesn’t matter as life is only made to make mistakes’ label on every decision that he had analysed.

This would not make him God or the devil, but just very still and very exhausted. Anybody can do this but just for the purpose of this experiment lets say I have done this. Which I have.

There are no fears in me and if I died today I could deal with that because who can know me better than myself? Neither God or the Devil. I am consciously judging myself on ‘their’ level.

To make this work, despite my many faults, take ME as the ONLY universal constant. In this sense I have killed God and the Devil external to myself.The only place that they could exist is if I chose to believe they are internal.

This is obviously more a matter for a shrink more than a mathematician, but that would only be the case depending on what you believed the base operating system of the universe to be. Math / Physics / morals or some other concept.

As long I agreed to do NOTHING, to never move or think a thought, humanity would have something to peg all understanding on. Each person could send a moral choice and I would simply send back only one philosophy. ‘ LIFE IS ONLY FOR MAKING MISTAKES’.

People, for the purpose of this experiment could disbelief their belief system knowing they could go back to it at any time. It would give them an opportunity to unburden themselves to someone pure. A new Pandora’s box. Once everyone had finished I would simply format my drive and always leave adequate space for each individual to send any more thoughts that they thought were impure. People would then eventually end up with clear minds and have to be judged only on their physical efforts. Either good or bad. It would get rid of a lot of maybes which would speed lives along..

If we then assume that there are a finite(or at some point in the future, given a lot of hard work, there will be a finite amount) amount of topics that can be conceived of then we can realise that there will come to a time when we, as a people, will NOT have to work again.

Once we reach that point we will only have the option of doing the things we love or doing the things we hate as society will be completely catered for in every possible scenario. People will find their true path in life which will make them infinitely more happy, forever.

In this system there is no point in accounting for time in any context.

If we consider this to be The Grand Unified Theory then we can set the parameters as we please.

This will be our common goals for humanity. As before everyone would have their say. This could be a computer database that was completely updated in any given moment when a unique individual thought of a new idea / direction that may or may not have been thought before.

All that would be required is that every person on the planet have a mobile phone or access to the web and a self organising weighting algorithm biased on an initial set of parameters that someone has set to push the operation in a random direction.

As I’m speaking first I say we concentrate on GRAINE.

Genetics – Robotics – Artificial Intelligence – Nanotechnology and Zero Point Energy.

I have chosen these as I think the subjects will cross breed information(word of mouth first) at the present day optimum rate to get us to our finishing point, complete and finished mastery of all possible information.

Surely mastery of information in this way will lead us to the bottom of a fractal??? What if one end of the universes fractal was me??? What could we start to build with that concept???

As parameters, we can assume that we live only in 3 dimensions. We can place this box around The Milky Way galaxy as this gives us plenty of scope for all kinds of discoveries.

In this new system we can make time obsolete as it only making us contemplate our things that cannot be solved because up to now, no one has been willing to stand around for long enough. It has been holding us back.

All watches should be banned so that we find a natural rhythm with each other, those in our immediate area and then further afield.

An old clock watch in this system is should only be used when you think of an event that you need to go to. It is a compass, a modern day direction of thought.

A digital clock can be adapted to let you know where you are in relation to this new milky way boxed system.(x,y,z).

With these two types of clocks used in combination we can safely start taking small steps around the box by using the digital to see exactly where you are and then using the old watch to direct you as your thoughts come to you.

We can even build as many assign atomic clocks as are needed to individually track stars. Each and every star in the Milky Way galaxy.

I supposed a good idea would be to assume that I was inside the centre of the super-massive black hole at the centre of the galaxy. That would stop us from being so Earth centric.

We could even go as far as to say that we are each an individual star and that we project all our energies onto the super-massive black hole at the centre of the galaxy.

You can assume that I have stabilized the black hole into a non rotating perfect cube. 6 outputs /f aces in which we all see ‘the universe and earth, friends’ etc. This acts like a block hole mirror finish. Once you look it is very hard to look away from.

The 6 face cube should make the maths easier to run as you could construct the inner of the black hole with solid beams of condensed light(1 unit long) arranged into right angled triangles with set squares in for strength.

Some people would naturally say that if the universe is essentially unknowable as the more things we ‘discover’ the more things there are to combine with and therefore talk about. This is extremely fractal in both directions. There can be no true answers because there is no grounding point. Nothing for the people who are interested, to start building there own personal concepts, theories and designs on.

Is it possible that with just one man becoming conscious of a highly improbable possibility that all of universes fractals might collapse into one wave function that would answer all of humanities questions in a day? Could this be possible?

Answers to why we are here? What the point of life really is et al?

Is it possible that the insignificant possibility could become an escalating one that would at some point reach 100%???

Could it be at that point that the math would figure itself out so naturally that we would barely need to think about it. We would instantly understand Quantum theory and all.

Can anybody run the numbers on that probability?

may be the idea first has to be applied to primitive integer polynomials whose factorization is known to be polynomial?