Why is this a flawed counterexample to P=NP?

Question

I apologize in advance for asking this, since I'm sure this site is flooded by amateurs like me asking about P and NP. If there's a better platform to ask this on, please let me know, but this question has always bothered me.

Suppose you have a perfect binary tree of depth $d$. Each node contains a boolean. Of the $2^{d}$ leaves, exactly one is labelled true (chosen at random) and the rest are false. The question is: "Is the true node in the left half of the tree?" Clearly, there is no efficient algorithm for finding the true node (since it has been assigned at random) besides exploring all $2^d$ leaves. A solution, however, can be verified in $O(d)$ time by simply providing the path from the root node to the true leaf.

If valid, this would be an example of a problem that is in NP but not in P, showing that P != NP (unless I'm mistaken). I believe where this example goes wrong is that in the formal definition of a decision problem, $n$ is the size of the input. Since our tree is the input, $n=2^d$ and the naive algorithm is actually $O(n)$, and thus in P. Is this the only issue, or is there something more fundamental that I'm completely missing?

If that's the only issue, then can we circumvent it by some clever mechanism for generating a similar problem from a smaller input?

You correctly identified the issue: the size of the tree is $O(2^d)$, so the input size is $n = 2^d$ and the naive algorithm takes $O(n)$ time, which is polynomial. The "fundamental issue" is the clever mechanism you mentioned. No one has found such a thing as of now. — plshelp, Dec 19 '22 at 04:41
Hi Tim, I'm not much more than an amateur, but I think it's pretty standard to state that $n$ is the size of any problem's input. So your tree has size $n$, and the problem can always be solved in time $O(n)$. From this perspective, I think that your problem is in P. I'd also say that my experience over the years is that P vs. NP is a really hard problem. — Matt Groff, Dec 19 '22 at 04:42
Unfortunately, $n$ is overloaded: usually it is the number of elements or vertices, or it could even be the number of dimensions you have. The formal definition of P is that there is a Turing machine $M$ and a polynomial function $f(s)$ such that $M$ halts after at most $f\left( \left| x \right| \right)$ steps, where $x$ is the input. Hence, it is crucial that we talk about the size of the input. — Pål GD, Dec 19 '22 at 08:45

score 2 · Accepted Answer · answered Dec 19 '22 at 06:42

The problem is that representing the tree takes exponential space, so the input length is $2^d$. Therefore a running time of $O(2^d)$ is actually polynomial in the length of the input: we care about whether the algorithm's running time is polynomial in the length of the input, in bits, not some other parameter. In other words, this problem is in P.
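As a minimal sketch of this point, assume the tree is handed over as a plain list of its $2^d$ leaf labels, left to right (this encoding and the function name are illustrative assumptions, not something from the question). The scan below looks at each input element at most once, so its running time is linear in the input length even though it is exponential in $d$.

```python
def true_leaf_in_left_half(leaves):
    """Answer the question by scanning an explicitly given list of leaf labels.

    Assumed (hypothetical) encoding: `leaves` holds 2**d booleans, left to
    right, with exactly one True among them.
    """
    n = len(leaves)              # input length n = 2**d
    index = leaves.index(True)   # linear scan: at most n comparisons
    return index < n // 2        # left half = first n/2 leaves


d = 4
leaves = [False] * (2 ** d)      # 16 leaves, so the input itself has size 16
leaves[5] = True
print(true_leaf_in_left_half(leaves))  # True, since 5 < 8
```

The loop count is bounded by `len(leaves)`, which is exactly what "polynomial in the length of the input" refers to.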

I think you are accurately getting at a core aspect of the P vs NP problem. Suppose instead of writing down the entire tree, we had a function $f$ that was implicitly specified, that let us compute for a given leaf whether it is labelled true or false. Maybe there is a way to write down an algorithm or circuit or description for $f$ that is much shorter than $2^d$ in length. Then we get a new problem, that is similar to your problem but not identical. Intuitively, it seems unlikely that there is anything better than trying all leaves, for this new problem. But actually proving that is very hard. Our intuition might be misleading. Maybe there is some way to use the structure in $f$ (that ensures it is not a totally arbitrary function, but rather one with a short description) to more quickly find the leaf that is true. It seems challenging to imagine how that could be the case, but we have no proof that it is impossible. This could be said to be a reason why we might intuitively expect P != NP, but we don't know how to prove that rigorously.
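To make the "short description" idea concrete, here is a small sketch under loud assumptions: the family of functions below (leaf $i$ is true iff $a \cdot i + b \equiv c \pmod{2^d}$ with $a$ odd) is a toy I made up for illustration, and all names are hypothetical. Its description is only a few numbers of about $d$ bits each. Brute force still makes up to $2^d$ queries, but for this particular toy family the structure can be exploited to find the true leaf directly, which is the kind of shortcut we cannot rule out in general. It needs Python 3.8+ for the modular inverse via `pow`.

```python
def make_f(d, a, b, c):
    """Toy implicit labelling: leaf i is true iff (a*i + b) mod 2**d == c.

    With a odd, i -> a*i + b is a bijection mod 2**d, so exactly one leaf
    is true. This family is a made-up example, not a hard instance.
    """
    m = 2 ** d

    def f(i):
        return (a * i + b) % m == c

    return f


def brute_force_left_half(f, d):
    """Treat f as a black box: up to 2**d queries, exponential in d."""
    for i in range(2 ** d):
        if f(i):
            return i < 2 ** (d - 1)
    raise ValueError("no true leaf found")


def structured_left_half(d, a, b, c):
    """Use the description of f: solve a*i + b = c (mod 2**d) for i."""
    m = 2 ** d
    i = ((c - b) * pow(a, -1, m)) % m   # modular inverse exists because a is odd
    return i < m // 2


d, a, b, c = 10, 37, 11, 5
f = make_f(d, a, b, c)
print(brute_force_left_half(f, d) == structured_left_half(d, a, b, c))
```

For richer descriptions of $f$ (for example, arbitrary circuits), no analogous shortcut is known, and no proof shows that one cannot exist.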

Thanks for elaborating on previous comments, and explaining why querying a second machine or function doesn't solve the issue. Sounds like my understanding was in the right direction. — TimD1, Dec 19 '22 at 23:13
Since we're assuming that it's a perfect tree of depth $d$ and only a single leaf node is true, what if $f$ just stores the path to the true node as a binary string of length $d$? $f$ can do a simple string comparison, which is $O(d)$. The tree is then represented in $O(d)$ space, but the algorithm must make $2^d$ queries to $f$. — TimD1, Dec 20 '22 at 07:14
@TimD1, in that case, simple inspection of the description/specification of $f$ suffices to find the value of the true leaf. So for that class of functions $f$, it is easy to find a true leaf -- the problem is in P. It's not necessary to do $2^d$ queries to $f$, since one can look at the description of $f$ in the input and immediately pull out that path from the description. But for a sufficiently large class of functions (that doesn't just include such trivial cases), it's apparently hard to do that -- the problem is in NP, and in particular, is NP-complete. — D.W., Dec 20 '22 at 09:45
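A rough sketch of the case discussed in the two comments above, assuming the path is a string of `'0'`/`'1'` characters with `'0'` meaning "go left" (that convention and all names here are illustrative assumptions). Querying $f$ as a black box takes up to $2^d$ calls, while reading the stored path out of the description answers the question immediately.

```python
from itertools import product


def make_path_f(path):
    """f stores the path to the true leaf; each query is an O(d) comparison."""
    def f(candidate):
        return candidate == path
    return f


def left_half_by_queries(f, d):
    """Black-box approach: try all 2**d root-to-leaf paths."""
    for bits in product("01", repeat=d):
        candidate = "".join(bits)
        if f(candidate):
            return candidate[0] == "0"   # assumed convention: '0' = left child
    raise ValueError("no true leaf found")


def left_half_from_description(path):
    """White-box approach: the description of f already contains the answer."""
    return path[0] == "0"


path = "0110"
f = make_path_f(path)
print(left_half_by_queries(f, len(path)), left_half_from_description(path))
```

Both functions agree on every path; the difference is only that the second one never calls $f$ at all.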
