Skip to content

The Lemma Cited From Burnside

March 3, 2018

How might it be applied in complexity theory?

St. Andrews history source

William Burnside was a well-known researcher into the early theory of finite groups.

Today Ken and I thought we would talk about one of his most-cited results—a result that is really due to others.

This happens all the time in mathematics. In this case Burnside wrote an important book on finite group theory and included a lemma that is called various things. It is sometimes named for Augustin-Louis Cauchy or Ferdinand Georg Frobenius, or called, “the Lemma that is not Burnside’s.” The lemma was incorrectly attributed to Burnside because he proved it in his 1897 book Theory of Groups of Finite Order. His title for the last two sections of his eighth chapter was the statement of the lemma:

Number of symbols left unchanged by all the substitutions of a group is the product of the order of the group and the number of the sets in which the symbols are interchanged transitively.

After his proof in section 118, he began section 119 by writing:

119. The formula just obtained is the first of a series of similar formulae, due to Herr Frobenius,* which are capable of many useful applications.

The * citation was to a one-page paper by Frobenius in Crelle’s Journal in 1887. The formula itself appeared in an 1845 paper by Cauchy. In his 1911 second edition, Burnside stated it in section 145 (page 191) as “Theorem VII” where he adjoined a twin statement about summing the squares of the numbers of fixed elements.

The book has no instance of the word “lemma,” which the first edition used only in section 77. There is no nearby mention of Cauchy or Frobenius and neither the 1887 nor the 1845 paper is cited anywhere. Instead, Burnside’s next mention of Frobenius comes on page 269 astride five new chapters on representation theory that were his great thrust in the expanded edition. He used a big * footnote stretching across two pages to cite multiple works by Frobenius en banc:

* The theory of the representation of a group of finite order as a group of linear substitutions was largely, and the allied theory of group-characteristics was entirely, originated by Prof. Frobenius. …

That Burnside gave a tandem statement supports the position that he felt it was all well-known research, but perhaps what the online Encyclopedia of Mathematics calls the “mysterious dropping” of Frobenius owed as much to his sweep toward this grand encomium.

Burnside is more properly known for actual beautiful results such as the important {p^{a}q^{b}} theorem, which shows that every finite group whose order has only two prime divisors is solvable. The original proof relied heavily on representation theory, and it took many years to get a proof that avoided needing this machinery—see this for the theorem and a short historical note.

Then there is Burnside’s conjecture, which lasted for six-plus decades before being refuted in 1964, but which still has viable forms and impacts as we covered before.

But what of the Lemma? The problem was less Burnside’s lack of citation and more the habit of others citing it from Burnside. So we propose calling it the Lemma Cited From Burnside (LCFB). The initials can also stand for “Lemma of Cauchy-Frobenius per Burnside.”


An action of a group {G} on a set {X} is a mapping {\phi} from {G} to permutations of {X} that multiplicative: {\phi(gh) = \phi(g)\phi(h)}. That is, the action is a subgroup of the symmetric group of {X} that is a homomorphic image of {G}, but the action retains information about which homomorphism was used.

An action induces—and is equivalently definable as—a function

\displaystyle  \Phi: G \times X \rightarrow X

with the property that for all {g,h \in G} and {x \in X}, {\Phi(gh,x) = \Phi(g,\Phi(h,x))}. The orbit of {x} under the action is the set of values {\Phi(g,x)} over all {g \in G}.

This is all made more visual if we write {g\cdot x} for {\Phi(g,x)}. Especially when the action is faithful, meaning {\phi} is 1-to-1, we can picture the elements of {G} as mapping {x}-es directly. Then the orbit is {G\cdot x = \{g\cdot x: g \in G\}}. Some questions to ask are:

  • How many elements {x} are fixed by a given {g}, namely {g\cdot x = x}?

  • How many elements are fixed by the action on average?

  • How many orbits does the action have?

The latter two questions may seem unrelated. But the LCFB says that their answers are the same:

Lemma 1 Let {G} act on the set {X}. Then

\displaystyle  \#\mathsf{orbits}_G(X) = \frac{1}{|G|} \sum_{g \in G} |\mathsf{Fixed}(g)|.

Balance in Orbits

The key idea of the proof is that orbits are balanced. For any {x}, define its stabilizer by {\mathsf{Stab}_G(x) = \{g \in G: g\cdot x = x\}} and note that this forms a subgroup of {G}. The orbit-stabilizer theorem states:

Theorem 2 Let {G} act on the set {X} and let {x \in X}. Then

\displaystyle  |G\cdot x|\cdot |\mathsf{Stab}_G(x)| = |G|.

It follows right away that for all {x,y} in the orbit {{\cal O}}, {|\mathsf{Stab}_G(x)| = |\mathsf{Stab}_G(y)|}; call this {s_{\cal O}}. The proof of Lemma 1 comes quickly too:

\displaystyle  \begin{array}{rcl}  \sum_{g \in G,x \in X} [g\cdot x = x] &=& \sum_{\text{orbits }{\cal O}}\sum_{x \in {\cal O}}\sum_{g \in G}[g\cdot x = x]\\ &=& \sum_{\cal O}\sum_{x \in {\cal O}}|\mathsf{Stab}_G(x)|\\ &=& \sum_{\cal O} |{\cal O}|\cdot s_{\cal O}\\ &=& \sum_{\cal O} |G| = |G|\cdot\#\mathsf{orbits}_G(X), \end{array}

which is what we needed.

We have used a “Theorem” to prove a “Lemma.” Is the “Theorem” obvious? We note substantial discussion about that in a StackExchange post and links from it to a lengthy post by Tim Gowers. We will try our own explanation because it speaks right to aspects of the cycle structure of permutations that we are trying to quantify in new ways.

List out the orbit of {x} as {x = x_0,x_1,x_2,\dots,x_{m-1}}. For each {j}, pick an element {h_j} of {G} such that {h_j \cdot x = x_j}. And list out {\mathsf{Stab}_G(x)} as {g_0,g_1,\dots,g_{n-1}}. We claim that

\displaystyle  h_j g_i = h_\ell g_k \implies i = k \wedge j = \ell.

If they are equal as elements of {G} then {h_j \cdot x = h_j g_i \cdot x = h_\ell g_k \cdot x = h_\ell \cdot x}, but by the choice of {h_j} and {h_\ell} this forces {\ell = j}. And {h_j g_i = h_j g_k \implies h_j^{-1} h_j g_i = h_j^{-1} h_j g_k \implies g_i = g_k}. Thus the elements {h_j g_i} are all different.

Now consider any {h \in G}. There is a {j} such that {h \cdot x = x_j = h_j \cdot x}. Now put {g = h_j^{-1} h}. Then

\displaystyle  g \cdot x = h_j^{-1} h \cdot x = h_j^{-1} \cdot x_j = x.

Therefore {g} belongs to {\mathsf{Stab}_G(x)},so it equals {g_i} for some {i}. Thus {g_i = h_j^{-1} h}, so we have written {h = h_j g_i}. Thus the elements {h_j g_i} run through {G} exactly once. It follows further that for each {j} there are exactly {n} elements {h} such that {h \cdot x = x_j}, namely those {h_j g_i} for different {i}. It follows that {m\times n = |G|}, which proves the theorem.

Note that we seem to have avoided appealing to the concepts of order (of an element or group) or quotient that go into statements of Lagrange’s theorem. Of course they are present, but we used {X} as a screen to hide them. We didn’t even have to exhibit a 1-1 correspondence between {\mathsf{Stab}_G(x)} and {\mathsf{Stab}_G(x_j)}. It all simply flows from the axiom of inverses and multiplicativeness of the action which in particular gave us {h_j^{-1} \cdot x_j = x}. There is no intermediate notion of how a group can act on a set—it must have perfect balance in orbits.


Here is an example that also shows the difference between the physical and algebraic nature of an action. Consider regular {n}-gons whose edges are colored one of {k} colors. Equivalently, we can consider them to have {n} triangular facets colored the same front and back. The physical actions are rotating the polygon right by {360/n} degrees and flipping it over. These generate the dihedral group {D_n}. The number of orbits of the action of {D_n} on the set {X} of tile colorings tells us how many tile types there are.

The LCFB tells us to count the colorings fixed by each group element. Consider {n = 4} and {k = 2}, that is, black and red colorings of a square. This gives {|D_4 = 8|} permutations and {2^4 = 16} colorings. We count as follows:

  • The identity fixes all 16 colorings.

  • The 90-degrees-right ({R}) and 90-degrees-left ({L}) rotations fix only the two monochrome tiles.

  • The 180-degree rotation fixes those two plus the two with diagonals of opposite colors.

  • The flip shown at top below fixes all tiles that have facets 2 and 4 the same color, 8 in all, including the coloring shown.

  • The flip plus 180-degree rotation fixes when 1 and 3 have the same color, giving another 8.

  • The flip plus {R} rotation fixes the 4 colorings with 1 and 2 the same color and 3 and 4 the same color. The flip plus {L} does likewise for 1 and 4 vis-à-vis 2 and 3.

We get {16 + 2\times 2 + 4 + 8 + 8 + 2\times 4 = 48} fixed items, hence the LCFB gives {48/8 = 6} orbits.

           Two different representations of a flip element in {D_4}.

The bottom part of our figure shows how we could have oriented the square like a diamond before doing the mirror-image flip. Now, however, the coloring shown is not fixed, and only 4 not 8 colorings are preserved. Is this an inconsistency? Although the flip is physically the same, it is algebraically different. Represented as a permutation in cycle form, the original flip was {(1)(2\;\,4)(3)}, whereas the latter flip is {(1\;\,4)(2\;\,3)}. The latter flip is equal to following the former flip with the {L} rotation, which indeed fixed 4 colorings under the former representation of the action of {D_n}.

This suggests that the cycle structure of the permutations used in the action is what matters. George Pólya observed this to give a generating-function form, which Nicolaas de Bruijn refined and extended, and which allows for weighting the elements. In our example, for any {n}, it gives a polynomial that can be used to compute the number of {n}-gon tile types for any number {k} of colors, without repeating the inspection of fixed elements. The polynomial involves variables {a_i} raised to the power of the number of cycles of length {i} in a given permutation. Each permutation gives a monomial and those are summed up and divided by the size of the group. In the case of {D_4} the polynomial is

\displaystyle  Z(D_4) = \frac{1}{8}(a_1^4 + 2a_4 + 3a_2^2 + 2a_1^2 a_2).

If we simply substitute each {a_i} by {k} then we get the number of orbits; for instance, square tiles with 3 colors give 21 orbits. But with colors {b,r,g} we can substitute {a_i} by {(b^i + r^i + g^i)} and then the resulting {Z(b,r,g)} gives more information. Assigning other weights besides {(1,1,1)} to {(b,r,g)} achieves other counting tasks. This article by Nick Baxter serves to introduce a detailed survey by Mollee Huisinga titled “Pólya’s Counting Theory,” which is great for further applications including classifying molecular structures.

We are interested in cases involving succinctly-described permutations {\pi} of exponential-sized sets {X}. That the general formula for {Z(D_n)} involves the Euler totient function not only of {n} but also divisors of {n} hints at one source of such cases. In our setting, whether {\pi} fixes an element {x \in X} is decidable in polynomial time, but whether {x} and {y} belong to different orbits—and other predicates about orbits—may not be. Thus the LCFB is instrumental to getting the orbit-counting problem and other functions into {\mathsf{\#P}}.

Open Problems

What applications of the Lemma Cited From Burnside have been noted in complexity theory? We can ask this more generally about applications of the combinatorial double-counting trick, where one way of doing the counting looks hopeless but the other way at least gives a {\mathsf{\#P}} function or difference of {\mathsf{\#P}} functions.

7 Comments leave one →
  1. March 4, 2018 12:34 pm

    Mark Jerrum added a nice extra to this proof.

    Like many double counting proofs, the argument can be formulated as counting the edges of a bipartite graph, in this case the graph whose vertices are the elements of G and X, with an edge from g to x if g fixes x. Now Mark pointed out that a random walk on this graph, finishing in X, converges to a random choice of orbit (that is, the probability of arriving at x in the limit is inversely proportional to the size of the orbit containing x).
    This has various applications: one orbit might be very small and not visible to normal random search, but Jerrum’s method “magnifies” it. For example, a group acts on itself by conjugation, and Jerrum’s random walk chooses a random conjugacy class.

    There are also interesting open questions about the rate of convergence (of course!)

    • March 5, 2018 2:02 pm

      Thanks, Peter!—this is indeed very interesting along lines of what Dick and I have been wanting to do.

  2. Harold permalink
    March 4, 2018 12:47 pm

    One nice application is ranking /indexing irreducible polynomials and necklaces by Kumar Kopparty Saks.

    • March 5, 2018 2:00 pm

      Thanks! Here is a relevant remark from the intro to the paper:

      “However, it seems that not all combinatorial counting arguments lead to efficient indexing algorithms. A prime example of this situation is when we have a finite group acting on a finite set, and the set we want to count is the set of orbits of the action. The associated counting problem can be solved using the Burnside counting lemma, and there seems to be no general way to use this to get an efficient indexing algorithm.”

  3. March 7, 2018 4:38 am

    Polya’s counting theorem, which is a consequence/refinement of Burnside’s lemme, leads to an algorithm for counting (precisely) the number of isomorphism types of graphs with n vertices in \exp (\sqrt n ) steps. From Polya theorem you get a formula where you need to run over all conjugacy classes of the symmetric group S_n and the number of conjugacy classes in latex $S_n$ is the number of partitions of the number n known to behave like \exp (\sqrt n ).

    It is a nice question ito find a better algorithm. Untill not long ago the problem and running time looked similar to that of Graph-Isomorphism, but Babai’s breakthrough change this.


  4. March 13, 2018 3:13 am

    We used the lemma in the context of enumeration complexity of problems in dependence logic ( In the arxiv paper it is Proposition 14.

Leave a Reply to KWRegan Cancel reply

Fill in your details below or click an icon to log in: Logo

You are commenting using your account. Log Out /  Change )

Google photo

You are commenting using your Google account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s