1

I wonder whether there is such kind of relation between the scale of the perturbation and the stationary distribution of a Markov chain. Suppose $\hat{P}=P+F\in\mathbb{R}^{n\times n}$, where $F$ is the perturbation matrix, each of whose line is sum up to $0$. The corresponding stationary distribution is $\pi, \hat{\pi}$. Is there exist some $C(n)$, which only relate to $n$, so that $$\sum_{i,j=1}^n|F_{ij}|\ge C(n)||\hat{\pi}-\pi||_1$$

I know there is some existing result like $C$ is some kind of condition number. I also find "A Note on Entrywise Perturbation Theory for Markov Chains", which shows that if $|F_{ij}|\le\epsilon|P_{ij}|$, we have $||\hat{\pi}-\pi||_1\le2(n-1)\epsilon+\mathcal{O}(\epsilon^2)$. But still I can't derive what $C(n)$ is in my question. I also start doubting the existence of this C. Is there any counter example that can prove that this $C$ does not exist?

1 Answers1

1

I think I answered this here, but I'll translate into the setting of that question: Stationary distribution of convex combination of Markov chains

I believe there is no formula that only depends on $n$.

Here is my reasoning:

Instead of writing $P + F$, write $\lambda P + (1 - \lambda) Q$, with $F_{\lambda} = ( \lambda - 1) P + (1 - \lambda)Q$.

Then we have $|F_{\lambda}| \leq (|P| + |Q|) ( 1- \lambda) = 2n ( 1 - \lambda)$, where $|M| = \sum |M_{ij}|$, since $|P| = n$ for any transition Kernel. So we can make $|F_{\lambda}|$ as small as we want, by choosing the appropriate $\lambda$.

So adding $F_{\lambda}$ amounts to mixing with $Q$ by a factor of $\lambda$. This is now in the setting of the link, which basically explained the following:

The original $P$ chain may be very lazy, and then adding $F_{\lambda}$ will have a huge impact even for small $\lambda$ (and make the stationary distribution immediately look a lot like $\pi_Q$), or $Q$ might be extremely lazy, and adding it will have little impact, so the stationary distribution will continue to look like $\pi_P$.

In light of this, the condition $|F_{ij}| \leq \epsilon |P_{ij}|$ seems to be ruling out the extremely lazy $P$ case : if $P$ is extremely lazy, then $F$ is basically the identity matrix, and adding it shouldn't change the stationary distribution too much. The inequality also rules out the case that $P + F$ can make transitions that $P$ couldn't: $F$ can't be positive where $P$ is zero.

In general the bound you give seems to be saying that if $Q$ is lazier than $P$, and doesn't introduce new jumps, you can get some control over the stationary distribution. That sounds reasonable.

What do you think?

Elle Najt
  • 20,740
  • Thank you for your explanation! I just have some more questions about this. If I know $|F_{ij}|\le \epsilon|P_{ij}|$ already, will it help to get a C(n)? I mean whether we can derive a stronger version of ||π^−π||1≤2(n−1)ϵ+O(ϵ2), something like ||π^−π||1≤2(n−1)||F_{ij}||+O(ϵ2) – PieForever Mar 30 '19 at 22:22
  • @MengxiaoZhang I don't know, sorry. I suggest looking through the techniques in that paper and seeing if you can get their result into the form you desire. Please let me know if you learn anything interesting from doing that. :-) (Please scrutinize this answer to make sure it satisfies your reason -- I'm only a graduate student, and I didn't learn this from a book, so there could easily be a mistake in here.) – Elle Najt Mar 30 '19 at 22:30