
I have had problems accepting the complexity-theoretic view of "efficiently solved by a parallel algorithm", which is given by the class NC:

NC is the class of problems that can be solved by a parallel algorithm in time $O(\log^c n)$ on $p(n) \in O(n^k)$ processors, for some constants $c, k \in \mathbb{N}$.

We can assume a PRAM.

My problem is that this does not seem to say much about "real" machines, that is, machines with a finite number of processors. Now I am told that "it is known" that we can "efficiently" simulate an $O(n^k)$-processor algorithm on $p \in \mathbb{N}$ processors.

What does "efficiently" mean here? Is this folklore or is there a rigorous theorem which quantifies the overhead caused by simulation?

What I am afraid happens is this: I have a problem with a sequential $O(n^k)$ algorithm and also an "efficient" parallel algorithm which, when simulated on $p$ processors, also takes $O(n^k)$ time (which is all that can be expected at this level of granularity if the sequential algorithm is asymptotically optimal). In this case, there is no speedup whatsoever as far as we can see; in fact, the simulated parallel algorithm may be slower than the sequential algorithm. That is, I am really looking for statements more precise than $O$-bounds (or a declaration of the absence of such results).
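
To make the scenario concrete, here is the kind of toy instance I have in mind (standard textbook bounds; $T_p$ denotes the running time on $p$ processors): summing $n$ numbers takes $\Theta(n)$ time sequentially, while pairwise tree summation on an EREW PRAM takes $\Theta(\log n)$ time using $n/2$ processors. The parallel algorithm still performs $\Theta(n)$ operations in total, so any simulation on a machine with a fixed number $p$ of processors needs time

$$T_p \in \Omega\!\left(\frac{n}{p} + \log n\right) = \Omega(n) \quad \text{for constant } p,$$

which is the same $O$-class as the sequential algorithm.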

Raphael
  • Brent's theorem? – cic May 03 '12 at 09:27
  • Do you mean $T_p < \frac{W}{p} + D$? If so, this is (afaik) only applicable in certain circumstances and also does not immediately allow one to translate runtimes. Or if it does, please elaborate in an answer. – Raphael May 03 '12 at 09:49
  • NC answers the question "is it possible to trade off more hardware for less run time?" You may want to restrict yourself to constant hardware, and this is similar to restricting yourself to constant memory, a better model for some problems. For a practical use, see carry-lookahead adders: more hardware so that the addition of $N$ bits is done in $O(\log N)$. – AProgrammer May 04 '12 at 04:33

2 Answers


If you assume that the number of processors is bounded by a constant, then you are right that a problem being in NC does not mean much in practice. Since any algorithm on a PRAM with $k$ processors and parallel time $t$ can be simulated on a single-processor RAM in $O(kt)$ time, the parallel time and the sequential time can differ only by a constant factor if $k$ is a constant.
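
As a rough illustration of this simulation argument, here is a minimal Python sketch (the PRAM program is modeled, somewhat naively, as a per-processor step function acting on a shared array; the names and the toy workload are only illustrative):

```python
# Round-robin simulation of a k-processor PRAM on one processor.
# Each "processor" is a function step(pid, shared, t) performing that
# processor's work for parallel time step t. Simulating t_max parallel
# steps therefore costs O(k * t_max) sequential steps.

def simulate_pram(step, k, t_max, shared):
    """Simulate k PRAM processors for t_max parallel steps, one by one."""
    for t in range(t_max):        # each parallel step ...
        for pid in range(k):      # ... costs k sequential steps
            step(pid, shared, t)
    return shared

# Toy workload: pairwise tree summation of 8 values in 3 parallel steps.
def sum_step(pid, shared, t):
    stride = 1 << t               # distance between the two cells to add
    i = pid * 2 * stride
    if i + stride < len(shared):
        shared[i] += shared[i + stride]

data = [1, 2, 3, 4, 5, 6, 7, 8]
simulate_pram(sum_step, k=len(data) // 2, t_max=3, shared=data)
print(data[0])  # 36, the sum of the input
```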

However, if you assume that you can prepare a computer with more processors as the input size grows, then a problem being in NC means that as long as you can prepare more processors, the running time will be “very short” or, more precisely, polylogarithmic in the input size. If you think that this assumption is unrealistic, compare it to the assumption of unbounded memory: actual computers have only a finite amount of space, but in the study of algorithms and complexity, we almost always assume that a computational device does not have a constant upper bound on space. In practice, this means that we can prepare a computer with more memory as the input size grows, which is how we usually use computers in the real world. NC models an analogous situation in parallel computation.
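
For a rough sense of scale (back-of-the-envelope numbers only, using the tree summation sketch above): with $n = 10^6$ values and a machine that grows with the input ($n/2$ processors), the parallel algorithm needs about $\log_2 n \approx 20$ parallel steps, whereas any fixed machine with $p$ processors needs on the order of $n/p$ steps, e.g. roughly $62{,}500$ steps for $p = 16$. The polylogarithmic guarantee behind NC speaks only to the first regime.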

Tsuyoshi Ito
  • Yes, parallelising on constantly many cores can only yield constant speedup. That is inherent and sadly hidden in $O$-terms. The (imho) interesting question is: can I get (optimal) speedup $k$, or only $k/2$, or $k-1$? While the assumption of infinite memory can be justified by the availability of lots of RAM (and, technically, by adding the hard disk), this is not generally true for processors. Typical (personal) machines have 16 or fewer cores nowadays. In other words, you can use "normal" results up to relevant problem sizes, but many parallel results only up to $n \leq 20$. – Raphael May 03 '12 at 12:22
  • @Raphael: The question of whether a certain problem belongs to NC or not does not model your question. I am not saying that your question is uninteresting; I am just saying that NC is not the right complexity class to model that. – Tsuyoshi Ito May 03 '12 at 13:40
  • I am actually happy to hear that; a person claims otherwise, though, not necessarily about NC but about complexity-theoretic results in general. What about other classes? – Raphael May 03 '12 at 14:43
  • A correction: A problem being in NC means that the running time is polylogarithmic if the number of processors is a sufficiently large polynomial in the input size. In the arguably more realistic scenario where the number of processors is a fixed polynomial like $O(\sqrt{n})$, or a slower non-constant function like $O(\log n)$, membership in NC doesn't formally imply anything at all. – JeffE May 04 '12 at 12:01
  • @JeffE: That is not a correction. I only wrote “prepare more processors” without giving its rigorous meaning (because I thought that doing so would obscure the point). – Tsuyoshi Ito May 15 '12 at 19:50