Looking for more in Property Testing

Oded Goldreich is one of the world leaders in theory. He has done seminal work in several areas including randomness in computation and cryptography, and has written very important books.

Today Ken and I wish to define a finite version of the classic notion of compactness.

We both are big fans of Oded’s work and are sorry that neither of us has ever had the chance to work with him. Our loss.

One of the real services Oded and several like-minded people provide is writing surveys:

sur-vey/sər,vā/
Noun: A general view, examination, or description of someone or something.
Verb: (of a person or their eyes) Look carefully and thoroughly at (someone or something), esp. so as to appraise them.

Surveys collect into one place ideas that can often be lost when distributed among various papers in diverse places. We have written previously about the iceberg effect of foundational results in a paper being obscured by main results that get the top-level attention. Surveys emphasize the progression of ideas building from below, and help us see the whole iceberg. Especially good surveys do this—and those are the only kind Oded writes.

Sometimes even a survey is not enough, and what you need is a book or a blog. Oded has written many books; now we will try the blog’s turn. We wish to draw out some ideas that have been beneath the iceberg’s surface in property testing, and collide them with a concept called compactness in logic that we think has steamboat power in computational complexity. (Maybe that is not a good analogy.)

## Finite Compactness – The Idea

The classic notion of compactness in logic states roughly this: suppose some property holds for all finite subsets of an infinite set ${\Omega}$, then it holds for all of ${\Omega}$. This is a powerful notion that is true for first order logic—it is a corollary of the Completeness Theorem of Kurt Gödel. One way to think about this is:

$\displaystyle \text{true on all finite subsets } \implies \text{true everywhere}.$

Recently we realized there is a notion that we will call finite compactness. The rough idea is that even for finite objects there can be a kind of compactness. Here is how it will go:

$\displaystyle \text{true on all }k\text{-subsets} \implies \text{almost true everywhere}.$

In this post the objects we focus on are graphs. Here is a concrete definition for undirected graphs:

Definition 1 A graph property ${\Pi}$ is ${(k,\epsilon)}$compact provided for all graphs ${G}$ of large enough size ${N}$, if all the subgraphs induced by ${k}$ vertices of ${G}$ have ${\Pi}$, then there are at least ${(1-\epsilon)N}$ vertices of ${G}$ that induce a subgraph ${H}$ having property ${\Pi}$. It is finitely compact if it is ${(k,\epsilon)}$-compact for some ${k}$ and ${\epsilon}$.

Note that ${k}$ and ${\epsilon}$ are independent of ${N}$. We could allow them to depend on ${N}$, or alternately consider the stronger condition that for all ${\epsilon > 0}$ there exists ${k > 0}$ such that ${\Pi}$ is ${(k,\epsilon)}$-compact; combining these might yield conditions under which ${H}$ can have ${(1 - o(1))N}$ nodes. The notion can be extended to any reasonable family of mathematical objects defined as formal structures on a ground set ${\Omega}$.

## Are There Finitely Compact Properties?

Here is our prime candidate: Let ${\Pi}$ be the property of being bipartite, i.e., 2-colorable. Is ${\Pi}$ finitely compact? In particular, is there an absolute constant ${c > 0}$ such that for all ${k}$, the property is ${(k,1/k^{c})}$ compact?

A yes answer says that if a graph is “partially” 2-colorable, then the graph is close to being 2-colorable. This is the finite version of the notion of compactness.

## Property Testing

The idea of finite compactness is related to property testing. Property testing started explicitly with the work of Manuel Blum, Michael Luby, and Ronitt Rubinfeld, and since then there has been a vast literature of what properties can be tested and which cannot.

The general setup is this: One is presented with a graph ${G}$ and some property ${\Pi}$. The usual goal is to determine whether ${G}$ has the property ${\Pi}$ or not. This can be easy for some properties and hard for others. But we make the problem much harder: we only allow a small number of probes of ${G}$—we cannot examine the entire graph. Clearly, for almost any property it is impossible to test the property without at least looking at all of the graph.

One of the great insights in lower bound theory is that usually an algorithm must at least look at all of the input. But that is wrong, if one relaxes the notion of correctness. The magic of property testing is that if we relax the goal we can often solve the a computational problem in much less time than the size of the input. Indeed it is often possible to make the algorithm run in time independent of the size of the input. The relaxation requires only that the randomized algorithm ${A}$ works as follows:

1. If ${G}$ has the property ${\Pi}$, then after the probing the algorithm ${A}$ says yes’ with probability at least ${2/3}$. In the one-sided model, the algorithm always says yes’ in this case.
2. If the graph does not have ${\Pi}$, and is not “near” a graph in ${\Pi}$, then ${A}$ returns a no’ with probability at least ${2/3}$.

Here “near” is defined via a parameter ${\epsilon > 0}$ that is also given to the algorithm. Two graphs ${G,G'}$ are ${\epsilon}$-close if they have the same size ${N}$ and for all but ${\epsilon\binom{N}{2}}$ pairs ${1 \leq i < j \leq N}$, ${E(i,j) \leftrightarrow E'(i,j)}$. The query complexity ${q}$ of the tester is the maximum number of edge probes, and is always a function of ${\epsilon}$. Sometimes it is a function of ${N}$, or in case of an exponential-sized graph with ${N = 2^n}$, a function of ${n}$ usually bounded by ${n^{O(1)}}$.

For more on property testing, check out Dana Ron’s surveys here or (longer) here, and Oded’s here. These are great surveys.

## Cognizance and Compactness?

Our intent is that the hypothesis of ${(k,\epsilon)}$-compactness leads the tester ${A}$ to accept ${G}$, which implies that ${G}$ is ${\epsilon}$-close to a graph ${G'}$ with property ${\Pi}$, and then use the existence of ${G'}$ to produce ${H}$. There are two obstacles:

• The tester ${A}$ need not work by probing small subgraphs for the property ${\Pi}$ itself.
• The closeness condition counts edges, and might not carry over to vertex-induced subgraphs.

We address the former issue by strengthening a notion called “canonical” in Oded’s survey. This is referenced to a paper by him with Luca Trevisan, but the key point for us is credited to a private communication from Noga Alon and brought to fruition in “Appendix D” of that paper. We will give it a fresh name, hope the authors don’t mind:

Definition 2 A property testing algorithm ${A}$ for ${\Pi}$ is cognizant if it generates one or more vertex-induced subgraphs ${H}$ of ${G}$, probing only the edges in ${H}$, and accepts if and only if the majority of the probed graphs have property ${\Pi}$.

A property ${\Pi}$ is closed under induced subgraphs if whenever ${G}$ has ${\Pi}$ and ${H}$ is a vertex-induced subgraph of ${G}$, then ${H}$ has ${\Pi}$. Bipartiteness is such a property; indeed it is closed under edge-induced subgraphs as well. So is ${k}$-colorability for any ${k}$. Ignoring some slippage in bounds, the gist of Appendix D is:

Theorem 3 Every testable property that is closed under induced subgraphs has a cognizant tester.

This yields a weak “edge-induced” notion of finite compactness. An ${N}$-vertex graph is dense if it has ${\delta N^2}$ edges, where we intend ${\delta > \epsilon}$ and fixed. For the property of bipartiteness the weaker version of compactness is achieved this way:

Theorem 4 There is an absolute constant ${c > 0}$ such that for all dense graphs ${G}$ and sufficiently large ${k}$, if all ${k}$-node subgraphs of ${G}$ are bipartite, then by deleting at most ${(1/k^c)N^2}$ edges we can obtain a bipartite graph ${H}$.

Proof: Given ${\epsilon > 0}$, the cognizant tester in Oded’s survey (Algorithm 2.3) selects ${k = O(\log(1/\epsilon)/\epsilon^2)}$ vertices uniformly at random, and accepts iff the subgraph ${R}$ they induce is bipartite. This makes ${1/k^3 < \epsilon}$. Suppose ${G}$ is a graph for which all ${k}$-node subgraphs are bipartite. Then the tester accepts with certainty. By definition of being a tester, ${G}$ is near a graph ${G'}$ in ${\Pi}$, which here entails that ${\epsilon N^2}$ edges can be deleted from ${G}$ to yield ${H}$. (In fact, as observed in a comment after the proof, a suitable ${H}$ can be described succinctly in terms of choices that accompany ${R}$.) $\Box$

## Can We Get Vertex-Induced Subgraphs?

The issue, however, is that ${H}$ need not be a vertex-induced subgraph. Indeed, ${\epsilon N^2}$ edges can easily be incident on all ${N}$ vertices, in such a way that all ${(1-\epsilon)N}$-vertex induced subgraphs ${H}$ have many of the bad edges from ${G}$.

Note, however, that our condition of ${(k,\epsilon)}$ compactness is stronger than what one needs for a tester ${A}$ to accept, even given that ${A}$ is a cognizant tester. The question is whether the stronger condition sufficies for the desired stronger conclusion.

Is every subgraph-closed testable property finitely compact?

We’ve thought about this in the particular case of bipartiteness. Here is a general kind of attempt to break the desired conclusion.

## Bipartite Case

Let ${N}$ be a multiple of 4, and let two copies of ${\{1,...,N/2\}}$ give rise to partitions ${A}$ and ${B}$. For some function ${f(N)}$, say ${f(N) = \sqrt{N}}$, connect every ${A}$-vertex to its ${2f(N)+1}$ ${B}$-neighbors and vice-versa. That is, each crossing edge goes just a short ${f(N)}$ distance up’ or down.’ Call this bipartite graph ${G_0}$. (We will also consider adding a few more “rogue” crossing edges.)

Now add ${N/4}$ “violating edges” to each partition, connecting ${a_i}$ to ${a_{i + N/4}}$ (mod ${N/2}$) for each ${a_i \in A}$, and similarly for each ${b_i \in B}$. We added ${N/2}$ edges total. This gives the graph ${G}$.

${G}$ is near ${G_0}$, because ${N/2 < \epsilon N^2}$ when ${N > 1/(2\epsilon}$). Moreover, we claim that every ${k}$-subset ${S}$ for ${k}$ up to about ${N/f(N)}$, the graph induced by ${S}$ is bipartite. The reason is that if not, then ${S}$ has a violating edge, say between nodes in ${A}$. In order to use that violating edge, there has to be a path within ${G_0}$ that connects those endpoints. However, the endpoints are ${N/4}$ apart, and a path within ${G_0}$ that gets ${N/4}$ apart has to have non-constant many steps. The violating edges in ${B}$ don’t help shorten this path because they too all jump ${N/4}$ places forward or backward around the graph.

Thus ${G}$ satisfies the ${k}$-compactness hypothesis. No problem with the bipartite tester: it accepts and ${G}$ is near a bipartite graph, so fine. The question is whether ${G}$—or some further modification of ${G}$—has large vertex subsets ${G'}$ that induce a bipartite graph.

For ${G}$ itself the answer is yes, but in a fairly narrow manner. By ${|G'| > (7/8)N}$, ${G'}$ has to contain at least half of the violating edges (i.e. at least ${N/4}$ of those ${N/2}$ edges). To make ${G'}$ bipartite, we seem forced to do something like this: Make the violating edges be consecutive, say by taking vertices ${1 \dots 3N/16}$ and ${N/4+1 \dots 7N/16}$ on both sides. Then the point is that between ${3N/16+1}$ and ${N/4}$ we have no vertices so we have killed all the paths within ${G_0}$ that could have gone from the left block to the right block. So ${G'}$ is bipartite. But if ${G_0}$ also has a few “rogue” crossing edges that jump further up or down than ${f(N)}$, then they could destroy any ${G'}$ being bipartite without making a tiny size-${k}$ non-bipartite subgraph.

We don’t know whether this extends to a counterexample to our hypothesis that bipartite graphs are finitely compact. Maybe the answer is around the corner, and we could find it, but … both of us have impending deliveries of somewhat different kinds …

## Finitely Compact Time?

The moral that goes with this tale is that proving new results with the deadline pressures of getting out the next post are not very compatible. Not even close. One idea was to not make this into a post at all—keep it secret from everyone. After all it is a failure. Another idea is to stop here and keep thinking about finite compactness.

After long discussion and thinking the GLL staff, Ken and I, decided to post this and even explain in more details more about our ideas. Just be aware that this is new stuff, not well debugged—it could be silly, or could be neat. We do not know yet.

## Open Problems

What properties are finitely compact? Are there any interesting ones? And what properties are not?

Does the notion of finite compactness sharpen some of the issues in efficient property testing? What extra impact might the logical form of sentences defining the property have?

1. May 25, 2011 9:14 pm

Hi Dick,

If I understand your definition correctly, bipartiteness is not finitely compact. Take a constant-degree highly expanding graph of logarithmic girth. Then every vertex-induced subset of, say, N/2 vertices is not bipartite, because if you assume, toward a contradiction, that it were bipartite then you would have an independent set of size at least N/4 in the graph, which is incompatible with the expander mixing lemma. On the other hand, because of the high girth, every constant-size subset of vertices induces a tree, which is bipartite.

• May 25, 2011 11:17 pm

Thanks! That’s a nice example of expert knowledge completing what one’s vague intuition was trying to do—I wrote the “Bipartite Case” section.

Well, we did write the post in a way we could joke about our ship being sunk and what kind of salvage operation can be done… Perhaps the notion can still be put into reasonable effect by letting k vary as (log N)^2, say?

2. May 26, 2011 2:12 am

Before you gave your definition for finite compactness, I initially thought you might be referring to Helly-type theorems, which strike me as a more direct analogy with the infinite case. (i.e. there is no epsilon involved)