Permutation Problems With Strings
From “Parsons Puzzles” to Babai’s breakthrough, and more?
|NZ silver fern source: Robin Ducker (CC)|
Dale Parsons and Patricia Haden, of Otago Polytechnic in New Zealand, wrote a seminal paper on a genre of programming puzzles that Barbara Ericson at Georgia Tech is furthering in her doctoral work. The puzzles give lines of code in a scrambled order, sometimes with incorrect lines thrown in. The goal is to find the correct order of the correct lines.
Today Ken and I wish to discuss various string rearrangement problems in relation to the Graph Isomorphism problem.
The Parsons method was pointed out to me by Ericson, who also serves as our Director of Computing Outreach. Her thesis abstract says in part:
Parson’s programming puzzles are a family of code construction assignments where lines of code are given, and the task is to form the solution by sorting and possibly selecting the correct code lines. We introduce a novel family of Parson’s puzzles where the lines of code need to be sorted in two dimensions. The vertical dimension is used to order the lines, whereas the horizontal dimension is used to change control flow and code blocks based on indentation as in Python.
This really intrigued me and I looked into what exactly is the Parsons method. One of her major references is a paper by Petri Ihantola and Ville Karavirta of Aalto University in Finland. Here is one of their examples:
One can make a similar but harder puzzle out of Ken’s example a year ago in our memorial post for Susan Horwitz, by scrambling his six lines of code that swap two nodes in a circularly linked list. In a class teaching proofs, I wonder how good an exercise it would be to scramble lines of a formal derivation.
All this has us thinking further about problems involving rearranging strings. We can think of each given line as a character. Some Parsons puzzles have multiple occurrences of the same line which is like having the same character multiple times in a string.
Viewed this way, we can even call “Parson’s puzzle” a Parsons puzzle: the apostrophe should if anywhere be after the final ‘s’. There’s nothing wrong with “Parsons puzzle” (reading the name as an adjective not possessive), which is also a legal Parsons solution since the apostrophe “line of code” is skipped. There are isomorphic answers since we could swap either ‘s’ or ‘z’, and these swaps remain automorphisms of the final string. Now we are getting into the domain of László Babai’s new results about strings, to which we turn.
Graph Isomorphism With Subgroups
To understand Babai’s usage of “string isomorphism,” a concept introduced by Eugene Luks in 1980 under the name “color isomorphism,” let’s first consider a natural, known add-on to the graph isomorphism (GI) problem:
Given two labeled graphs on nodes plus a group presented by permutations of , is there an isomorphism from to that belongs to ?
For example, consider and the subgroup generated by and . These are both even permutations since so we immediately know is not all of the symmetric group . Now consider these two labeled graphs and :
Then and are isomorphic by the interchange , but being odd, this does not belong to . Does that mean the answer to the problem for is “no”? Well, the graph has odd automorphisms such as its mirror flip—perhaps one of them composed with belongs to ? It’s a tricky question. We could force a “no” answer by adding a “dongle” to nodes 3 and 5 in both graphs so they can only be swapped with each other and different “dongles” to other nodes that prevent any other possible isomorphism. The dongles could be a single node connected to 3 and 5 and nodes connected only to node for in each graph.
GI is then the special case where we are given generators of . Whether the general case reduces back to GI is a major open problem.
The simplest form of string isomorphism is, given strings of length , is there a permutation of that carries to ? This has a trivial answer: yes if and only if for every character in the alphabet, the numbers of occurrences of in and in are the same. This is easily in polynomial time, as are some other problems by the name “string isomorphism” on the Net. But here is the analogous subgroup problem:
Given strings of length and generators of a subgroup of , does permuting the entries of by some member of give ?
This is not trivial. It is exactly as hard as the graph subgroup version, hence hard for GI. Let us first reduce GI to it. Having enforces that only permutations derived from rearrangements of the node indices are allowed on the string indices. Specifying some way to “unroll” a graph’s adjacency matrix into a string yields generators for the right .
Since our graphs are undirected we need only unroll the upper triangle. Let’s do it by first going down the largest diagonal
then the next-largest, and so on in “bands” ending at the upper-right corner . For we get entries in the following order:
Here the six “edges” are really the slots in the string for possible edges. A pair of generators for is and . The corresponding permutations of the edge slots are and . By “design of accident” these are the same as in the example above though there is no relation—this just expedites our seeing that the subgroup generated by is not all of . The idea extends for any in canonical fashion: from the cycle and transposition generators of we get two generators for . Babai calls the group in such cases to emphasize the action being on pairs.
Now given any labeled graphs and on nodes, we can read off the corresponding strings and , and then is isomorphic to if and only if is a yes-case of the string problem. This completes the reduction from GI.
To reduce from the subgroup version of GI we map the given generators of to their images in under the above embedding. This reduction can be inverted: Given the string , define the graph to be a disjoint union of isolated pairs of nodes connected by an edge for each such that , and isolated pairs not connected by an edge for each such that . Likewise map to a graph . Finally map each generator by replacing each move in by and .
Graph and String Automorphisms Too
An automorphism is a rearrangement that leaves a structure the same. Such permutations form a group called the automorphism group . GI has long been known to be polynomial-time equivalent to the problem of computing a complete set of generators for when is a labeled graph. Assuming and are connected graphs (else use their complements), the forward reduction takes and notes that every generating set of contains a member that maps onto if and only if is isomorphic to .
The reverse direction is trickier and requires both decomposing the group and progressively adding “dongles” to pairs of graphs and asking whether they are isomorphic. The decision version, “does have a nontrivial automorphism?” (GA), reduces to GI but is not known to be equivalent.
This extends to an equivalence of the subgroup- version of GI and finding generators for all automorphisms in a subgroup. We almost get a reduction to the subgroup version of GA which is a decision problem: Make two copies of and convert each given generator of into a that maps to and to . Also convert the identity which yields the swap and let these generate . Then and are isomorphic by a member of if and only if has an automorphism that is an odd-length word over these generators. We still get a reduction to finding all generators of . The reverse reduction is still tricky, but no more so than the original since it was decomposing groups anyway.
For strings , we again have a situation where the basic problem is trivial: the automorphisms permute the indices for each individual character among themselves, so we need only write the usual generators for a product of symmetric groups. However, the subgroup version is again nontrivial. We state it for a binary string, but like the above it extends to any set of characters or “colors.”
Given a string and generators for a subgroup of , find generators for where permutes the indices having and permutes with .
Note how this involves finding generators for the intersection of two subgroups. By ideas like those just above this is equivalent to the subgroup version of string isomorphism, hence in turn to the subgroup version of GI. The method of the last section can also reduce the functional graph-automorphism problem directly to this string version, and conversely.
The extra generality of the string problems is that the subgroup involved need not be as in the reductions from GI, indeed need not arise from subgroups acting on graphs at all. This feature is exploited in the recursions that Babai’s proof sets up. His algorithm classifies all of these problems into quasipolynomial time. Thus, defining the problems with subgroups makes them nicer.
Use in the Algorithms
Although automorphisms seem simpler than isomorphisms, if our objective is to reduce the groups involved, the last reduction above involving is going in the wrong direction since is larger than . Moreover, the odd-length words that interested us formed not a subgroup but a coset of the even-length words under the swap . The following key idea of Eugene Luks’s seminal 1980-82 paper putting the bounded-degree case of GI into shows how cosets can be handled directly and without resort to the reduction to automorphisms.
Suppose in general that one has a subgroup of and coset representatives where is the identity and . For right-cosets this means
For any subset of write to mean the set of isomorphisms from to that belong to . That is, writing actions on the right, . Then just by simple set theory,
The terms for aren’t recursing on subgroups but the trick is that we can make it so by permuting by to get . Then:
This allows making progress by -fold recursion on the isomorphism problem for the subgroup . The idea applies equally well for graphs or strings or other structures.
Babai’s algorithm builds on this by creating new ways to achieve either this or some other progress, meanwhile metering progress via certain homomorphisms from into symmetric groups on smaller structures rather than by trying to divide up the domain of or . Automorphisms still frame essential concepts in the algorithm, in part because whenever is nonempty, any one member makes . We hope to survey more of this in a future post.
What to Call These Problems?
We hasten to add that these computational problems are not new. Their contexts in Luks’s paper are brought out by further references including a 1983 paper by Gary Miller and Babai’s 1994 Handbook of Combinatorics survey, and they are foreshadowed in the treatment of coloring-preserving auto/iso-morphisms in Babai’s own 1979 paper which introduced the term “Las Vegas algorithm.”
In highlighting them all together we’d like to find a common and consistent naming scheme. Adapting names from these papers and echoing Luks’s 1993 survey might suggest names like or or , the last distinguishing the problem of giving a full set of generators for from the decision problem of whether is nontrivial. We’ve had a suggestion to put the ‘G’ out front and the class of relational structures in parentheses, such as or , with the latter’s math-italicized indicating a particular subgroup .
For abbreviations we like appending the ‘G’ to turn GI into GIG and similarly SIG for the string version. The rub is that for the automorphism problems this could lead to names like GAG and SAG, even GAGG and SAGG for the generator-set versions, which don’t sound so nice. Writing for polynomial-time or some tighter equivalence, and using less-gaggy notation, the current state is:
- and belong to .
- reduces to GI, which reduces to the problems.We don’t know about the decision versions for graphs and strings. All of these in turn reduce to various forms of the canonization problem, which features in this 1983 paper by Babai and Luks, Luks’s 1993 survey, this 1997 note by Yuri Gurevich, and this recent paper by Lance Fortnow and Joshua Grochow.
More Isomorphism and Permutation Problems
As a hint that much more can be said, consider this 1998 dissertation by David Christie titled “Genome Rearrangement Problems.” Then consider that the word “genome” stops appearing after page 15 but there are 143 pages of neat material on permutations and sorting and string edit-distance type problems and cases that are -hard, or not. Evidently there are “Parsons puzzles” for genetic code. We’re not saying that Babai’s algorithm has any relevance to these—Laci is the first to disclaim any competitiveness with existing heuristic GI solvers—but something in the new structural and algorithmic ideas might click.
The most particular problem is whether GIG—that is, —reduces back to GI. If not, then does it sit properly between GI and the canonization problems? What to call these problems?
We end with a simple Parsons problem, with one line to omit:
and we await further news expectantly.
talk last Tuesday
to the University of Chicago,
We were glad to hear that Laci’s part-III
the proof is by double recursion
was not disturbed by the previous day’s hoax threat
[wording change at end of “Use” section.]