
I am looking at the proof of the following theorem and I have some questions.

The theorem is the following:

On the assumption that all permutations of a sequence of $n$ elements are equally likely to appear as input, any decision tree that sorts $n$ elements has an expected depth of at least $\log n!$.
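
To have a concrete instance of the statement in mind: for $n = 3$ there are $3! = 6$ possible orderings of the input, so the theorem says that any decision tree sorting $3$ elements has an expected depth of at least $$\log 3! = \log 6 \approx 2.58,$$ that is, on average more than $2.5$ comparisons are needed.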

The proof is the following:

Let $D(T)$ be the sum of the depths of the leaves of a binary tree $T$.

Let $D(m)$ be the smallest value of $D(T)$ taken over all binary trees $T$ with $m$ leaves.

We shall show, by induction on $m$, that $D(m) \geq m \log m$.

The basis, $m=1$, is trivial.

Now, assume the inductive hypothesis is true for all values of $m$ less than $k$.

Consider a decision tree $T$ with $k$ leaves.

$T$ consists of a root having a left subtree $T_{i}$ with $i$ leaves and a right subtree $T_{k-i}$ with $k-i$ leaves, for some $i$ with $1 \leq i \leq k-1$.

Clearly, $$D(T)=i+D(T_{i})+(k-i)+D(T_{k-i})$$

Therefore, the minimum sum is given by $$D(k)=\min_{1 \leq i \leq k-1} [k+D(i)+D(k-i)] \ \ \ \ (*)$$

Invoking the inductive hypothesis, we obtain from $(*)$ $$D(k) \geq k+\min_{1 \leq i \leq k-1} [i \log i+(k-i) \log (k-i)]$$

It is easy to show that the minimum occurs at $i=k/2$. Thus $$D(k) \geq k+k \log \frac{k}{2}=k \log k$$

We conclude that $D(m) \geq m \log m$ for all $m \geq 1$.

Now we claim that a decision tree $T$ sorting $n$ random elements has at least $n!$ leaves.

Moreover, exactly $n!$ leaves will have probability $1/n!$ each, and the remaining leaves will have probability zero.

We may remove from $T$ all vertices that are ancestors only of leaves of probability zero, without changing the expected depth of $T$.

We are thus left with a tree $T'$ having $n!$ leaves each of probability $1/n!$.

Since $D(T') \geq n! \log n!$, the expected depth of $T'$ (and hence of $T$) is at least $(1/n!)\, n! \log n!=\log n!$.


My questions are the following:

  1. Why do we want to show, by induction, that $D(m) \geq m \log m$ ??
  2. When we consider a decision tree $T$ with $k$ leaves, where $T_{i}$ is the left subtree with $i$ leaves and $T_{k-i}$ is the right subtree with $k-i$ leaves, why does it hold that $$D(T)=i+D(T_{i})+(k-i)+D(T_{k-i})$$ ??
  3. How do we conclude that the minimum sum is given by $$D(k)=\min_{1 \leq i \leq k-1} [k+D(i)+D(k-i)]$$ ??
  4. How do we obtain, from the inductive hypothesis, that $$D(k) \geq k+\min_{1 \leq i \leq k-1} [i \log i +(k-i) \log (k-i)]$$ ??
  5. Could you explain to me the part after "Now, assume the inductive hypothesis is true for all values of $m$ less than $k$." ??
Mary Star

1 Answer



1) Because you want to bound the expected depth, you need a way to control the total depth of the leaves of the decision tree. Knowing that there are at least $n!$ leaves (one per possible order), the bound $D(m)\geq m\log m$ gives a total leaf depth of at least $n!\log(n!)$, hence an expected depth of at least $\frac{n!\log(n!)}{n!}=\log(n!)$.
2) When you glue the two subtrees together under a new root you create a new level. Because of that, every path from a leaf to the root is lengthened by $1$; there are $k=i+(k-i)$ leaves, so the sum $D(T_i)+D(T_{k-i})$ must be increased by $k$.
3) Notice that $D(T)=i+D(T_i)+(k-i)+D(T_{k-i})=k+D(T_i)+D(T_{k-i})$; also note that, by splitting into subtrees with $i$ and $k-i$ leaves and letting $i$ vary, you are iterating over all possible binary trees with $k$ leaves.
4) Notice that $k$ is fixed and recall that $D(i)\geq i\log i$ and $D(k-i)\geq (k-i)\log(k-i)$; also notice that both $i$ and $k-i$ are smaller than $k$, so the inductive hypothesis applies to them (the minimization itself is spelled out after this list).
5) See Strong induction.
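
To spell out the minimization step in 4), which the quoted proof calls easy, here is one way to see it: the function $x\log x$ is convex, so $\tfrac12\bigl(i\log i + (k-i)\log(k-i)\bigr) \geq \tfrac{k}{2}\log\tfrac{k}{2}$, hence for every $1\leq i\leq k-1$ $$i\log i + (k-i)\log(k-i) \;\geq\; k\log\frac{k}{2} \;=\; k\log k - k,$$ and therefore $$D(k) \;\geq\; k + \min_{1 \leq i \leq k-1}\bigl[i\log i + (k-i)\log(k-i)\bigr] \;\geq\; k + (k\log k - k) \;=\; k\log k.$$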

Edit:
2) Yes, I am glad to. But you will understand it better if you draw a picture.
Let $C_i:=$ the set of binary trees with $i$ leaves. For $i\geq 2$ you have $C_i=\bigcup_{j=1}^{i-1} C_j \times C_{i-j}$, because if you have a binary tree you can always take out the root (call it $r$) and the result is two trees, the left and the right subtrees with roots $r_l$, $r_r$ respectively. If you draw the picture, you will see that the unique path from any leaf to the root of a subtree is the same as its path in the original tree, just without the edge that joins the root $r$ to $r_l$ or $r_r$. So the depth of every single leaf in the original tree is one plus its depth in the corresponding subtree. Hence, if there are $k$ leaves, the total depth must be increased by $k=i+(k-i)$ (the small code sketch at the end of this answer checks this identity on an example).
3) Because you always take the minimum: the identity holds for every single tree $T$ with $k$ leaves, so in particular it holds for the tree that attains $D(k)$. Think about what the $\min$ function does.
5) Yes, recall that binary trees are defined in a recursive way, as pointed out in 2) above. If you have the desired property for every $m<k$, then because of that recursive structure you can build any possible tree with $k$ leaves by gluing two smaller subtrees, using the hypothesis that the problem is already solved for the subtrees. Assuming the statement for every single $m<k$ and using that to conclude the property for $k$ is exactly strong induction (the small cases worked out below show it in action).
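
To see the strong induction and the recurrence $(*)$ in action on small cases, take $D(1)=0$ and compute: $$\begin{aligned} D(2) &= 2 + D(1) + D(1) = 2 \;\geq\; 2\log 2 = 2,\\ D(3) &= 3 + D(1) + D(2) = 5 \;\geq\; 3\log 3 \approx 4.75,\\ D(4) &= \min\{\,4 + D(1) + D(3),\; 4 + D(2) + D(2)\,\} = \min\{9, 8\} = 8 \;\geq\; 4\log 4 = 8. \end{aligned}$$ Note that the step for $k=4$ uses the bound for both $m=2$ and $m=3$, i.e. for all $m<4$; that is exactly what the strong inductive hypothesis provides.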

Hope it helps.
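
P.S. Going back to 2): here is a small Python sketch (just an illustration, encoding a binary tree as a nested pair, with the string "leaf" standing for a leaf) that computes the sum of leaf depths and checks the identity $D(T)=k+D(T_i)+D(T_{k-i})$ on one example:

```python
# A binary tree is either the string "leaf" or a pair (left, right).

def leaf_count(tree):
    """Number of leaves of the tree."""
    if tree == "leaf":
        return 1
    left, right = tree
    return leaf_count(left) + leaf_count(right)

def depth_sum(tree, depth=0):
    """D(T): the sum of the depths of all leaves of the tree."""
    if tree == "leaf":
        return depth
    left, right = tree
    return depth_sum(left, depth + 1) + depth_sum(right, depth + 1)

# Glue two subtrees under a new root and check D(T) = k + D(T_left) + D(T_right).
left = ("leaf", ("leaf", "leaf"))   # 3 leaves, D = 1 + 2 + 2 = 5
right = ("leaf", "leaf")            # 2 leaves, D = 1 + 1 = 2
tree = (left, right)                # 5 leaves in total

k = leaf_count(tree)
print(depth_sum(tree), k + depth_sum(left) + depth_sum(right))  # prints: 12 12
assert depth_sum(tree) == k + depth_sum(left) + depth_sum(right)
```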

Phicar
  • 1) I still don't understand why we want to show, by induction, that $D(m) \geq m \log m$. 2) Could you explain it further to me? 3) In the proof it is $$D(k)=\min_{1 \leq i \leq k-1} [k+D(i)+D(k-i)]$$ Why is it $D(i)$ and not $D(T_{i})$? 4) I understand!! 5) Could you explain the strong induction in this case?

    – Mary Star Dec 05 '14 at 18:50
  • I edited the answer. – Phicar Dec 05 '14 at 19:47
  • I understand. 3) I still don't understand why we use $D(i)$ instead of $D(T_{i})$.

    – Mary Star Dec 10 '14 at 22:38
  • Hi. Is it clear that for every tree $T$ with $k$ leaves you can write $D(T)=k+D(T_{i})+D(T_{k-i})$, where $T_i$ is a tree with $i$ leaves and $T_{k-i}$ is a tree with $k-i$ leaves? If it is not, read the edit of "2)" again. If it is clear, and I think it is, recall that $D(k)$ is the smallest sum of leaf depths over all trees with $k$ leaves; in other words $D(k)=D(T_k)$ for one special tree $T_k$. But $D(k)=D(T_k)=k+D(T_j)+D(T_{k-j})$ for some $j$. So, by iterating over all possible $j$, you are guaranteed to cover the tree that attains $D(k)$. – Phicar Dec 10 '14 at 22:54
  • "$D(k)$ is the smallest value of $D(T)$ (the sum of the depths of the leaves of $T$) taken over all binary trees $T$ with $k$ leaves." What does this mean?? Do we calculate the sum of depths of the leaves of each subtree and then we take the minimum of them?? I got stuck right now... – Mary Star Dec 10 '14 at 23:24
  • That means $D(k)=\min _{T}D(T)$, where $T$ ranges over binary trees with $k$ leaves. No, not subtrees: you sum the depths of the leaves of each tree and then take the minimum over the trees. – Phicar Dec 10 '14 at 23:31
  • Do we not take a specific binary tree $T$? Or what does "...of each tree..." mean? – Mary Star Dec 11 '14 at 00:29
  • Yes, of each specific binary tree. – Phicar Dec 11 '14 at 02:51