
This is an excerpt from the algorithms textbook How to Think About Algorithms by Jeff Edmonds (this book is a gem, by the way).

HTTAA Chapter 5.4 Radix Counting Sort

I get his conclusion that Merge/Quick/Heap sorts need $O(N\log N)$ operations with respect to $N$, the number of elements in the input list, while at the same time being linear in the sense that they need $O(n)$ operations with respect to $n$, the number of bits needed to represent the input list. These are different models of computation, and a good way to describe one algorithm under different measures of the input.

But my question is about the line:

Assuming that the $N$ numbers to be sorted are distinct, each needs $\log N$ bits to be represented, for a total of $n = \Theta(N\log N)$ bits.

My understanding of this was that with $N$ distinct numbers, we need word size $w = \Theta(\log N)$, as defined in CLRS. We want the word size to be big enough to at least index $N$ different elements, but not so big that we can pack the whole input into one word. Also, with $N$ distinct elements, we need $\Omega(\log N)$ bits to represent the largest number. Assuming each number fits in a single word of $w$ bits, Edmonds' claim that $n = \Theta(N\log N)$ bits made sense. Please correct me if my analysis is wrong up to here.
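
To make this concrete for myself (the numbers here are picked purely for illustration): with $N = 2^{10} = 1024$ distinct values, each value needs at least $\log_2 N = 10$ bits, so the whole input takes roughly $N\log N = 1024 \cdot 10 \approx 10^4$ bits, which is consistent with $n = \Theta(N\log N)$.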

But when I try to apply this to counting sort, something doesn't seem right. With $N$ again the number of elements in the input and $k$ the value of the maximum element in the list, the running time is $O(N + k)$. Counting sort is linear with respect to $N$, the number of elements in the input, when $k = O(N)$. Using this constraint, and measuring the input by its total number of bits $n$, I think $n = O(N\log k) = O(N\log N)$.
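
For reference, here is a minimal counting sort sketch (my own illustration, not from the book; the function name and interface are just for this example) that makes the $O(N + k)$ cost visible:

```python
def counting_sort(a, k):
    """Sort a list of non-negative integers, each at most k."""
    count = [0] * (k + 1)            # O(k) to allocate and zero the counters
    for x in a:                      # O(N) to tally each input value
        count[x] += 1
    out = []
    for v in range(k + 1):           # O(k) to walk the whole value range...
        out.extend([v] * count[v])   # ...plus O(N) total to emit the output
    return out

# Example: N = 6 elements with maximum value k = 9
print(counting_sort([4, 9, 1, 4, 0, 7], 9))  # [0, 1, 4, 4, 7, 9]
```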

So in the RAM model of computation, how can I express the running time of counting sort with respect to $n$ bits? Merge/quick/heap sorts had time complexity $O(n)$ with respect to $n$ bits, as was expressed cleanly by Edmonds. I am not sure how to do something similar for counting sort, and possibly for radix sort using counting sort as a subroutine. Any idea how to do this in the two cases, when $k = \Theta(N)$ and when this condition is not present? I suspect the former will give some kind of polynomial time with respect to $n$ and the latter exponential (hence pseudo-polynomial) time with respect to $n$ bits, but I have trouble expressing the mathematics...

namesake22
1 Answer


In the RAM model of computation, machine words are $\Theta(\log n)$ bits long (where $n$ is the size of the input in bits), and operations on machine words can be done in $O(1)$. Counting sort has running time $O(N + k)$ in the RAM model assuming that each value is $\Theta(\log n)$ bits long.

Let us now consider an array consisting of $N$ values, the maximal value of which is $k$. We would need $\log k$ bits to store each value. Assuming that we use $\Theta(\log k)$ bits per value, the input length is $n = \Theta(N\log k)$ bits, and so a machine word is $\log N + \log\log k + O(1)$ bits long. Let us denote by $w$ the number of machine words needed to store a single entry, which we can calculate by $$ w = \left\lceil \frac{\log k}{\log N + \log\log k} \right\rceil = \Theta\left(\max\left(1,\frac{\log k}{\log N + \log\log k}\right)\right). $$ The running time of counting sort is $O(Nw + k)$.
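
A quick numerical illustration of this formula (a sketch of my own; the helper name, the choice of base-2 logarithms, and the rounding are assumptions made only for the example):

```python
import math

def words_per_entry(N, k):
    # Machine word length is about log n = log N + log log k bits (base 2, ignoring constants).
    word_bits = math.log2(N) + math.log2(math.log2(k))
    # w = ceil(log k / word length), but never less than one word per entry.
    return max(1, math.ceil(math.log2(k) / word_bits))

print(words_per_entry(10**6, 10**6))    # k = Theta(N): 1 word per entry, so O(Nw + k) = O(N)
print(words_per_entry(10**3, 2**1000))  # huge k: about 51 words per entry, and the k term dominates
```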

Yuval Filmus
  • From the CLRS post and this post, I thought a word is $w = \Theta(\log N)$ bits, where $N$ is the number of elements, not $n$ the size of the input in bits. Am I missing some connection here? Which definition should I follow? (I will go through your answer after you reply and will probably have additional questions that I will leave as comments here) – namesake22 Jan 07 '18 at 16:11
  • The definition in CLRS is circular. There is no ambiguity if, for example, the input is an array of $N$ words. If a word takes $\Theta(\log N)$ bits, then the total input size is $n = \Theta(N\log N)$, and so $\log n = \Theta(\log N)$. – Yuval Filmus Jan 07 '18 at 16:13
  • The usual assumption is that each element fits in a constant number of words. If this is not the case, you have to specify it. – Yuval Filmus Jan 07 '18 at 16:22
  • Yes, $\log N + \log\log N = \Theta(\log N)$. – Yuval Filmus Jan 07 '18 at 16:31
  • When $k = \Theta(N)$, $w = O(1)$, and $O(N) = O(n/\log n)$. Without this condition, $O(Nw + k) = O(\frac{n}{\log n} + k)$. I am not so sure how to interpret this running time with respect to $n$ bits. Both terms seem to be of smaller magnitude than $n$... but $k$ doesn't seem expressible nicely in terms of $n$... – namesake22 Jan 08 '18 at 16:16
  • You can't express $k$ in terms of $n$. The parameters $N$ and $k$ are independent. – Yuval Filmus Jan 08 '18 at 16:18
  • Is $O(N + k)$ pseudo-polynomial time because the numeric value of $k$ requires $O(\log k)$ bits to write out, so the runtime is exponential in the input size? I am not so sure how the "exponential" works out – namesake22 Jan 08 '18 at 16:23
  • Here is what Wikipedia has to say: "In computational complexity theory, a numeric algorithm runs in pseudo-polynomial time if its running time is a polynomial in the length of the input (the number of bits required to represent it) and the numeric value of the input (the largest integer present in the input)." – Yuval Filmus Jan 08 '18 at 16:25
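
To spell out the "exponential in $n$" point from the last two comments (my own arithmetic, intended only as a sketch): fix the number of elements $N$ and let the maximum value $k$ grow. Since the input length is $n = \Theta(N\log k)$, we get $$ \log k = \Theta(n/N) \quad\Longrightarrow\quad k = 2^{\Theta(n/N)}, $$ so for constant $N$ the $O(N + k)$ bound becomes $2^{\Theta(n)}$: polynomial in the numeric value $k$, but exponential in the number of input bits $n$, which is exactly the pseudo-polynomial situation. When $k = \Theta(N)$, we instead have $n = \Theta(N\log N)$, so $O(N + k) = O(N) = O(n/\log n)$, as in the earlier comment.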