
Let $\mathcal{S}=[m]^*$ be the set of all strings on the alphabet $[m]=\{1, 2,\cdots, m\}$. Let $\Sigma\subset[m]^2$ be a set of strings of length $2$, and let $\overline{\Sigma}=[m]^2\backslash\Sigma$ be the complementary set. Let $\mathcal{G}$ (respectively $\overline{\mathcal{G}}$) be the set of all strings in $\mathcal{S}$ with the property that every substring of length $2$ belongs to $\Sigma$ (respectively $\overline{\Sigma}$). (Note that $\mathcal{G}$ and $\overline{\mathcal{G}}$ both include all strings of length $0$ or $1$.) Letting $\pi(\sigma)$ be the length of a string $\sigma\in\mathcal{S}$, define the two generating functions \begin{equation} G(x)=\sum_{\sigma\in\mathcal{G}}x^{\pi(\sigma)}\qquad\qquad \overline{G}(x)=\sum_{\sigma\in\overline{\mathcal{G}}}x^{\pi(\sigma)} \end{equation} The following identity holds: $G(x)=\overline{G}(-x)^{-1}$
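As a quick sanity check, here is a small worked example; the choice $m=2$, $\Sigma=\{11,12,21\}$ is arbitrary and only for illustration. Here $\mathcal{G}$ is the set of strings over $\{1,2\}$ that avoid the substring $22$, so the number of such strings of length $n$ is $1,2,3,5,8,\dots$ (a Fibonacci-type recurrence). On the other side, $\overline{\Sigma}=\{22\}$, so $\overline{\mathcal{G}}$ consists of the empty string, the strings $1$ and $2$, and the strings $22\cdots 2$ of each length $n\geq 2$. Hence $$ G(x)=\frac{1+x}{1-x-x^2},\qquad \overline{G}(x)=1+2x+\frac{x^2}{1-x}=\frac{1+x-x^2}{1-x},\qquad \overline{G}(-x)=\frac{1-x-x^2}{1+x}, $$ and indeed $G(x)\,\overline{G}(-x)=1$.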


A few years ago during my undergrad, I was asked to prove the above generating function identity in an assignment for a class on combinatorial enumeration. I have written out a proof of this identity below as an answer (I think that makes the most sense here, to keep this question concise). However, after I submitted this proof, I was informed that my proof was unexpected, and that there were other more intended methods to arrive at the identity.

Therefore I ask out of curiosity: What are some alternate methods of proving the above identity? (perhaps using more standard string counting generating function arguments, generalized inclusion exclusion, a combinatorial approach, or some more exotic method)

I would also be interested in seeing if/how different approaches to this problem allow for generalization of this identity (perhaps a higher order identity of this sort exists, or perhaps one involving substrings of length greater than $2$, or one involving more general objects). I mention one such small generalization in my answer.

2 Answers

5

$\newcommand{\vo}{\mathbf{u}}$

Firstly, we rephrase our problem in terms of graphs. Define the directed graphs $X=([m],\Sigma)$ and $\overline{X}=([m],\overline{\Sigma})$ (loops are allowed). Let $A,\overline{A}$ be the adjacency matrices of $X,\overline{X}$ respectively. Since $\overline{\Sigma}=[m]^2\backslash\Sigma$, the graphs $X,\overline{X}$ are complements of each other, so $A+\overline{A}=M$, where $M$ is the $m\times m$ matrix of ones.

Let $n\geq 1$. By construction, the strings in $\mathcal{G}$ (respectively $\overline{\mathcal{G}}$) of length $n$ are precisely the walks of length $n-1$ on the graph $X$ (respectively $\overline{X}$). The number of walks of length $n-1$ on $X$ from a vertex $i$ to a vertex $j$ is given by $[A^{n-1}]_{ij}$, so the total number $\#\mathcal{G}_n$ of strings of length $n$ in $\mathcal{G}$ is given by \begin{equation} \#\mathcal{G}_n=\sum_{i,j\in [m]}[A^{n-1}]_{ij}=\vo^TA^{n-1}\vo \end{equation} where $\vo$ is the all-ones vector. We therefore have that \begin{equation} G(x)=\sum_{n\geq 0}\#\mathcal{G}_n x^n =1+\sum_{n\geq 1}\vo^TA^{n-1}\vo x^n =1+x\vo^T\left(\sum_{n\geq 0}x^nA^n\right)\vo =1+x\vo^T(I-xA)^{-1}\vo \end{equation} By the same logic, we have that \begin{equation} \overline{G}(x)=1+x\vo^T(I-x\overline{A})^{-1}\vo \end{equation} Finally, we may compute \begin{equation} \begin{split} (G(x)-1)(\overline{G}(-x)-1)&=(x\vo^T(I-xA)^{-1}\vo)(-x\vo^T(I+x\overline{A})^{-1}\vo)\\ &=-x^2\vo^T(I-xA)^{-1}\vo\vo^T(I+x\overline{A})^{-1}\vo\\ &=-x^2\vo^T(I-xA)^{-1}M(I+x\overline{A})^{-1}\vo\\ &=-x^2\vo^T(I-xA)^{-1}(A+\overline{A})(I+x\overline{A})^{-1}\vo\\ &=x\vo^T(I-xA)^{-1}((I-xA)-(I+x\overline{A}))(I+x\overline{A})^{-1}\vo\\ &=x\vo^T((I+x\overline{A})^{-1}-(I-xA)^{-1})\vo\\ &=x\vo^T(I+x\overline{A})^{-1}\vo-x\vo^T(I-xA)^{-1}\vo\\ &=2-\overline{G}(-x)-G(x)\\ \end{split} \end{equation} (the third equality uses $\vo\vo^T=M$, and the fifth uses $-x(A+\overline{A})=(I-xA)-(I+x\overline{A})$), and therefore \begin{equation} G(x)\overline{G}(-x)=(G(x)-1)(\overline{G}(-x)-1)+G(x)+\overline{G}(-x)-1=1 \end{equation} In other words, $G(x)=\overline{G}(-x)^{-1}$, as god intended.
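The transfer-matrix count above is easy to check numerically. The following is a minimal sketch, not part of the original proof: the function names (`walk_counts`, `series_product`, `check_identity`) and the 0-indexed alphabet are assumptions made for illustration. It computes $\#\mathcal{G}_n$ by repeatedly applying the adjacency matrix, then verifies that the product of the two truncated series is $1$.

```python
def walk_counts(adj, n_max):
    """Return [c_0, ..., c_{n_max}], where c_n is the number of strings of
    length n whose adjacent pairs (i, j) all satisfy adj[i][j] == 1."""
    m = len(adj)
    counts = [1]      # the empty string
    ends = [1] * m    # ends[j] = number of valid strings of current length ending in j
    for _ in range(1, n_max + 1):
        counts.append(sum(ends))
        # Extend every valid string by one more letter (one more step of the walk).
        ends = [sum(ends[i] * adj[i][j] for i in range(m)) for j in range(m)]
    return counts


def series_product(a, b):
    """Cauchy product of two truncated power series, truncated to the shorter length."""
    n = min(len(a), len(b))
    return [sum(a[k] * b[i - k] for k in range(i + 1)) for i in range(n)]


def check_identity(sigma, m, n_max=8):
    """Check G(x) * Gbar(-x) == 1 up to x^{n_max}; sigma is a set of pairs over range(m)."""
    adj = [[1 if (i, j) in sigma else 0 for j in range(m)] for i in range(m)]
    co_adj = [[1 - adj[i][j] for j in range(m)] for i in range(m)]
    g = walk_counts(adj, n_max)
    g_bar_neg = [(-1) ** n * c for n, c in enumerate(walk_counts(co_adj, n_max))]
    return series_product(g, g_bar_neg) == [1] + [0] * n_max


# Hypothetical example: strings over {0, 1} avoiding the substring "11".
print(check_identity({(0, 0), (0, 1), (1, 0)}, m=2))
```

Integer arithmetic keeps the check exact, so no floating-point tolerance is needed.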


Interestingly, this approach may be used to prove the following weighted generalization; letting $\nu_i(\sigma)$ be the number of times $i$ appears in a string $\sigma\in\mathcal{S}$, we define the generating series \begin{equation} G(\mathbf{y};x)=\sum_{\sigma\in\mathcal{G}}y_1^{\nu_1(\sigma)}y_2^{\nu_2(\sigma)}\cdots y_m^{\nu_m(\sigma)}x^{\pi(\sigma)}\qquad\qquad \overline{G}(\mathbf{y};x)=\sum_{\sigma\in\overline{\mathcal{G}}}y_1^{\nu_1(\sigma)}y_2^{\nu_2(\sigma)}\cdots y_m^{\nu_m(\sigma)}x^{\pi(\sigma)} \end{equation} then the following identity holds: $G(\mathbf{y};x)=\overline{G}(\mathbf{y};-x)^{-1}$

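To see this, it is enough to note that $G(\mathbf{y};x)=1+x\mathbf{y}^T(I-xB)^{-1}\vo$ where $B=A\odot \vo\mathbf{y}^T$ (that is, $B_{ij}=A_{ij}y_j$), and follow the same procedure as above with $\overline{B}=\overline{A}\odot \vo\mathbf{y}^T$, using $B+\overline{B}=\vo\mathbf{y}^T$ in place of $A+\overline{A}=M$.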
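The weighted identity can also be tested by brute force, independently of any matrix formula. The sketch below is an illustration under stated assumptions: the names `weighted_coeffs` and `check_weighted`, the 0-indexed alphabet, and the particular rational weights are all hypothetical choices. It enumerates every string up to a given length, sums the monomials $y_1^{\nu_1}\cdots y_m^{\nu_m}$ at a fixed numerical $\mathbf{y}$, and checks the series product exactly using `Fraction`.

```python
from fractions import Fraction
from itertools import product


def weighted_coeffs(allowed, y, n_max):
    """coeffs[n] = sum, over strings of length n whose adjacent pairs all satisfy
    allowed(a, b), of the product of y[letter] over the letters of the string."""
    m = len(y)
    coeffs = []
    for n in range(n_max + 1):
        total = Fraction(0)
        for s in product(range(m), repeat=n):
            if all(allowed(a, b) for a, b in zip(s, s[1:])):
                weight = Fraction(1)
                for letter in s:
                    weight *= y[letter]
                total += weight
        coeffs.append(total)
    return coeffs


def check_weighted(sigma, y, n_max=6):
    """Check G(y; x) * Gbar(y; -x) == 1 up to x^{n_max} at the numerical point y."""
    g = weighted_coeffs(lambda a, b: (a, b) in sigma, y, n_max)
    g_bar = weighted_coeffs(lambda a, b: (a, b) not in sigma, y, n_max)
    prod = [
        sum(g[k] * (-1) ** (i - k) * g_bar[i - k] for k in range(i + 1))
        for i in range(n_max + 1)
    ]
    return prod == [1] + [0] * n_max


# Hypothetical example with m = 3 and arbitrary rational weights.
sigma = {(0, 1), (1, 1), (2, 0), (0, 2)}
y = [Fraction(1, 2), Fraction(3), Fraction(-2, 5)]
print(check_weighted(sigma, y))
```

A check at one numerical point is evidence rather than a proof; a symbolic check would compare the coefficients as polynomials in $\mathbf{y}$.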

4

Equivalently, we can prove that $$ G(x)\times \overline{G}(-x)=1 $$ Using the definition, we must show $$ \left(\sum_{\sigma\in \mathcal G}x^{\text{len}(\sigma)}\right) \left(\sum_{\tau\in \overline{\mathcal G}} (-x)^{\text{len}(\tau)}\right) = \sum_{\sigma\in \mathcal G} \sum_{\tau\in \overline{\mathcal G}} (-1)^{\text{len}(\tau)}x^{\text{len}(\sigma)+\text{len}(\tau)}=1 $$ We are effectively summing over ordered pairs $(\sigma,\tau)$, where $\sigma$ is a valid word for $\mathcal G$, and $\tau$ is a valid word for $\newcommand\H{\overline{\mathcal G}}\H$. The coefficient of $x^n$ is equal to the number of ordered pairs with total length $n$ for which $\newcommand\len{\text{len}}\len(\tau)$ is even, minus the number of such pairs for which $\len(\tau)$ is odd. The constant term is $1$, coming from the pair of empty words, so we must show that this signed count of ordered pairs is zero for all $n\ge 1$. To do this, we will use a sign-reversing involution. That is, we will divide the set of ordered pairs into pairs, where one contributes positively to the coefficient of $x^n$ and the other contributes negatively.

Specifically, I will define a function $f$, which takes as input an ordered pair $(\sigma,\tau)$ for which $\len(\sigma)+\len(\tau)=n$, and outputs a different ordered pair $(\sigma',\tau')$ for which $\len(\sigma')+\len(\tau')=n$. This function will have the property that $f\circ f$ is the identity, and $f$ has no fixed points, which means $f$ partitions the set of all ordered pairs into pairs of the form $\{(\sigma,\tau),(\sigma',\tau')\}$. Furthermore, $\len(\tau)$ will have the opposite parity to $\len(\tau')$, which means that the contributions of $(\sigma,\tau)$ and $(\sigma',\tau')$ to $x^n$ will cancel each other out. Since all pairs cancel, the coefficient of $x^n$ will be zero for all $n\ge 1$, as desired.

Suppose that $\sigma=\sigma_1\dots \sigma_s$ and $\tau=\tau_1\dots\tau_t$. We compute $f(\sigma,\tau)$ as follows:

  • If $\sigma_s\,\tau_1\in \Sigma$, then $\sigma'=(\sigma_1,\dots,\sigma_s,\tau_1)$ and $\tau'=(\tau_2,\dots,\tau_{t})$. That is, delete the first character of $\tau$, and append it to $\sigma$.

  • If $\sigma_s\,\tau_1\not\in \Sigma$, then $\sigma'=(\sigma_1,\dots,\sigma_{s-1})$ and $\tau'=(\sigma_s,\tau_1,\dots,\tau_{t})$. That is, delete the last character of $\sigma$, and prepend it to $\tau$.

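I leave it to you to prove that $f$ has the properties I claimed; namely, that $f(\sigma,\tau)$ is again a valid ordered pair, that $(f\circ f)(\sigma,\tau)=(\sigma,\tau)$, and that $\len(\tau')$ always has the opposite parity to $\len(\tau)$. One detail: $\sigma_s\,\tau_1$ is undefined when one of the words is empty, in which case the only possible move is used (the first rule if $\sigma$ is empty, the second if $\tau$ is empty); both words cannot be empty since $n\ge 1$.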
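The two rules above can be sketched in code and tested exhaustively for small cases. This is an illustrative sketch only: the names `valid_words`, `f`, and `check_involution` are hypothetical, letters are 0-indexed, words are tuples, and the handling of empty words (move the only letter that can be moved) is an assumption about the intended convention.

```python
from itertools import product


def valid_words(m, length, allowed):
    """All words (as tuples) of the given length over range(m) whose adjacent
    pairs all satisfy allowed(a, b)."""
    return [
        w for w in product(range(m), repeat=length)
        if all(allowed(a, b) for a, b in zip(w, w[1:]))
    ]


def f(sigma_word, tau_word, sigma_set):
    """Move one letter across the boundary between sigma_word and tau_word."""
    if not tau_word:
        append = False   # assumed convention: nothing to take from tau
    elif not sigma_word:
        append = True    # assumed convention: nothing to take from sigma
    else:
        append = (sigma_word[-1], tau_word[0]) in sigma_set
    if append:
        # First rule: the first letter of tau is appended to sigma.
        return sigma_word + tau_word[:1], tau_word[1:]
    # Second rule: the last letter of sigma is prepended to tau.
    return sigma_word[:-1], sigma_word[-1:] + tau_word


def check_involution(sigma_set, m, n):
    """For total length n >= 1, check that f maps valid pairs to valid pairs,
    that f(f(p)) == p, and that the length of tau changes parity."""
    pairs = set()
    for s in range(n + 1):
        for sw in valid_words(m, s, lambda a, b: (a, b) in sigma_set):
            for tw in valid_words(m, n - s, lambda a, b: (a, b) not in sigma_set):
                pairs.add((sw, tw))
    for pair in pairs:
        image = f(pair[0], pair[1], sigma_set)
        if image not in pairs or f(image[0], image[1], sigma_set) != pair:
            return False
        if (len(image[1]) - len(pair[1])) % 2 == 0:
            return False
    return True


# Hypothetical example with m = 3.
sigma_set = {(0, 1), (1, 1), (2, 0), (0, 2)}
print(all(check_involution(sigma_set, 3, n) for n in range(1, 6)))
```

Because $f$ changes $\len(\tau)$ by exactly one, the parity flip and the absence of fixed points are automatic; the substantive checks are that the image is a valid pair and that applying $f$ twice returns the original pair.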

Mike Earnest
    Nicely done! A fairly elementary solution that is definitely much less “black magic” than the one I provided (in fact, too elementary to be the fabled “intended method”). Notably, the generalization described in my answer is also pretty immediate from this proof. I’ll accept this answer in a week or so to hopefully attract extra answers in the interim period (I suspect there are at least a few other possible approaches that I’m still very much interested in seeing) Depending on how much attention this gets, I may or may not offer a bounty for my favourite solution. Thanks for the answer. : ) – Christian E. Ramirez Nov 19 '23 at 22:11