Maximum number of strict binary trees that can be made, each having exactly n leaf nodes.

Question

I am trying to find a mathematical expression for the number of strict binary trees that can be made with n leaf nodes.

I already know that a strict binary tree with n leaf nodes has exactly 2n-1 nodes, so I wondered whether there is a way to determine the total number of strict binary trees with a given number of leaves.

If you mean unlabelled, rooted binary trees in which each non-leaf has $2$ children, there are $C_n=\frac1{n+1}\binom{2n}n$ of them with $n$ internal nodes (equivalently, $n+1$ leaves). $C_n$ is the $n$-th Catalan number. — Brian M. Scott, Apr 03 '16 at 18:16
Yes that's exactly what I mean. But I would also like to know the proof of this expression, if you could provide me with that. — Karthik Prakash, Apr 03 '16 at 18:29

score 2 · Answer 1 · edited Mar 08 '20 at 00:17

If you already know that the Catalan number $C_n$ is the number of valid parenthesis strings with $n$ pairs of parentheses, you can prove that the number of strict binary trees with $n$ internal nodes is $C_n$ by demonstrating that there is a bijection between the two sets. There is a bijection that is fairly easy to describe, and I’ll sketch the argument for it.

Given a strict binary tree with $n$ internal nodes, think of each internal node as a pair of parentheses surrounding a product operator; the operands of that product operator are the two children of the node, and they go inside the parentheses. The leaves are simply labelled with variable names, e.g., $x_1,x_2$, etc. This procedure associates to each strict binary tree with $n$ internal nodes a fully parenthesized product with $n+1$ factors. I’ve illustrated this below for two strict binary trees with $3$ internal nodes, using the names $a,b,c$, and $d$ for the leaves.

           *                                 *  
          / \                               / \  
         a   *                             /   \  
            / \                           *     *  
           *   d                         / \   / \  
          / \                           a   b c   d
         b   c  

     (a·((b·c)·d))                     ((a·b)·(c·d))

Now remove everything except the parentheses from the associated strings; in this case you get ((())) and (()()). It shouldn’t be too hard to convince yourself of the following observations.

Each parenthesis string produced in this way is valid.
Each valid parenthesis string with $n$ pairs of parentheses can be produced in this way. (Checking this amounts to showing how to reconstruct the binary tree from the parenthesis string.)

Those two observations taken together amount to saying that this construction yields a bijection from the set of strict binary trees with $n$ internal nodes to the set of valid parenthesis strings with $n$ pairs of parentheses.

Alternatively, if you know that the Catalan numbers satisfy the recurrence

$$C_{n+1}=\sum_{k=0}^nC_kC_{n-k}\;,$$

with $C_0=1$, you can prove the result as follows. Let $a_n$ be the number of strict binary trees with $n$ internal nodes. Clearly $a_0=1$. To build a strict binary tree with $n+1$ internal nodes, we can start with the root and its two children. Each of these children must be the root of a strict binary tree. The internal nodes of our tree will be the internal nodes of these subtrees together with the root, so if the left subtree has $k$ internal nodes, then the right subtree must have $n-k$ internal nodes. Thus, there are $a_k$ possible choices for the left subtree and $a_{n-k}$ possible choices for the right subtree. Each choice on the left can be combined with any choice on the right, so there are $a_ka_{n-k}$ strict binary trees with $n+1$ internal nodes, exactly $k$ of which are in the left subtree. Now sum over $k$: $k$ can be anything from $0$ through $n$, so

$$a_{n+1}=\sum_{k=0}^na_ka_{n-k}\;.$$

Thus, the sequence $\langle a_n:n\in\Bbb N\rangle$ satisfies the same recurrence and initial condition as the sequence of Catalan numbers and therefore must be the sequence of Catalan numbers: $a_n=C_n$ for each $n\ge 0$.
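The recurrence argument can be checked numerically for small $n$. The following Python sketch computes $a_n$ from the recurrence, compares it with the closed form $\frac1{n+1}\binom{2n}n$, and also builds the trees explicitly by choosing a left and a right subtree. The function names are hypothetical, and `math.comb` assumes Python 3.8 or later.

```python
from math import comb

def catalan_closed(n):
    """Closed form C_n = binom(2n, n) / (n + 1); the division is always exact."""
    return comb(2 * n, n) // (n + 1)

def counts_by_recurrence(limit):
    """Return [a_0, ..., a_limit] using a_{n+1} = sum_{k=0}^{n} a_k * a_{n-k}, a_0 = 1."""
    a = [1]
    for n in range(limit):
        a.append(sum(a[k] * a[n - k] for k in range(n + 1)))
    return a

def trees(n):
    """All strict binary trees with n internal nodes; a leaf is None, a node is (left, right)."""
    if n == 0:
        return [None]
    result = []
    for k in range(n):                      # k internal nodes go to the left subtree
        for left in trees(k):
            for right in trees(n - 1 - k):  # the remaining n-1-k go to the right subtree
                result.append((left, right))
    return result

a = counts_by_recurrence(10)
for n, a_n in enumerate(a):
    assert a_n == catalan_closed(n)

print(a)
print([len(trees(n)) for n in range(7)])
```

The explicit enumeration grows quickly, so it is only practical for small $n$; the recurrence itself is cheap.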

This map isn't a bijection; the tree in your first example and its reverse both map to ((())), and there isn't any tree that maps to ()(()). — NoName, Mar 08 '20 at 02:45

NoName · Answer 2 · 2020-03-08T02:46:42.133

The Catalan numbers count these, and the Catalan number $C_n$ also counts the number of ways to parenthesize $n+1$ factors. For instance, factors $a,b,c$ can be parenthesized $(ab)c$ or $a(bc)$, as $C_2 = 2$. See "Bijection between valid parenthesis strings and ways to parenthesize factors" for a bijection between these and parenthesis strings. Parenthesis strings and parenthesized factors sound similar but are actually very different objects.
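As a small illustration of the count of parenthesizations, here is a Python sketch that lists every way to fully parenthesize a sequence of single-letter factors by choosing where the outermost product splits. The function name `parenthesizations` is hypothetical, and unlike the $(ab)c$ notation above, it writes the outermost pair of parentheses as well.

```python
def parenthesizations(factors):
    """All full parenthesizations of a string of one-letter factors, kept in order."""
    if len(factors) == 1:
        return [factors]
    result = []
    for i in range(1, len(factors)):        # split point of the outermost product
        for left in parenthesizations(factors[:i]):
            for right in parenthesizations(factors[i:]):
                result.append("(" + left + right + ")")
    return result

print(parenthesizations("abc"))
print([len(parenthesizations("abcdef"[:m])) for m in range(1, 7)])
```

With $n+1$ factors the list has $C_n$ entries, so three factors give two parenthesizations.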

It's easier to show a bijection between the parenthesized factors and the full binary trees. Given a string representing parenthesized factors, the factors will correspond to leaves, and the open and close parentheses will correspond to steps down and up, respectively.

To borrow an example:

           *                                 *  
          / \                               / \  
         a   *                             /   \  
            / \                           *     *  
           *   d                         / \   / \  
          / \                           a   b c   d
         b   c  

     (a·((b·c)·d))                     ((a·b)·(c·d))

With this bijection, a pair of factors will be enclosed in parentheses together (like $b$ and $c$ in the first example above) if and only if the corresponding leaves are children of the same node in the corresponding binary tree.
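The correspondence between parentheses and steps down and up can be sketched in Python. The parser below assumes single-letter factors, `·` as the product sign, and a fully parenthesized input including the outermost pair; `parse`, `string_depths`, and `tree_depths` are hypothetical names. It rebuilds the tree from the string and checks that the nesting depth of each factor equals the depth of the corresponding leaf.

```python
def parse(s):
    """Parse a fully parenthesized product such as '(a·((b·c)·d))' into nested pairs."""
    pos = 0

    def node():
        nonlocal pos
        if s[pos] == "(":
            pos += 1                 # open parenthesis: step down to a child
            left = node()
            assert s[pos] == "·"
            pos += 1
            right = node()
            assert s[pos] == ")"
            pos += 1                 # close parenthesis: step back up
            return (left, right)
        leaf = s[pos]                # a single-letter factor is a leaf
        pos += 1
        return leaf

    tree = node()
    assert pos == len(s)
    return tree

def string_depths(s):
    """Nesting depth of every factor, read directly off the string."""
    depth, out = 0, {}
    for ch in s:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
        elif ch != "·":
            out[ch] = depth
    return out

def tree_depths(tree, depth=0, out=None):
    """Depth of every leaf in the tree (the root has depth 0)."""
    if out is None:
        out = {}
    if isinstance(tree, str):
        out[tree] = depth
    else:
        tree_depths(tree[0], depth + 1, out)
        tree_depths(tree[1], depth + 1, out)
    return out

for s in ("(a·((b·c)·d))", "((a·b)·(c·d))"):
    tree = parse(s)
    print(s, tree, string_depths(s) == tree_depths(tree))
```

In the first example `b` and `c` come out as the two members of the same innermost pair, matching the remark that they are children of the same node.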
