7

This question is a follow-up to "Expected number of steps for reaching a specific absorbing state in an absorbing Markov chain", because I don't understand the answer given there. I think I need to see a concrete example.

Suppose I play red and black at a casino. On each play, I either win the amount I staked, with probability $p<\frac12$, or I lose my stake. Let's say I start with a bankroll of $2$ and I decide to play until I have won $3$ or lost everything. My strategy is to bet just enough to reach my goal, or everything I have, whichever is less.

We have a Markov chain with $6$ states, $2$ of which are absorbing. There are well-known methods for determining the probability of winning and the average number of plays I make, but what if I want to know the average number of plays I make, given that I reach my goal? The transition matrix, with $q=1-p$, is $$\begin {bmatrix} 1&0&0&0&0&0\\ q&0&p&0&0&0\\ q&0&0&0&p&0\\ 0&q&0&0&0&p\\ 0&0&0&q&0&p\\ 0&0&0&0&0&1 \end{bmatrix}$$
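For reference, those well-known computations can be done with the standard fundamental-matrix machinery; here is a minimal Mathematica sketch (variable names are my own), with the states ordered by bankroll $0,1,\dots,5$ as in the matrix above:

q = 1 - p;
P = {{1, 0, 0, 0, 0, 0}, {q, 0, p, 0, 0, 0}, {q, 0, 0, 0, p, 0},
     {0, q, 0, 0, 0, p}, {0, 0, 0, q, 0, p}, {0, 0, 0, 0, 0, 1}};
Q = P[[2 ;; 5, 2 ;; 5]];               (* transient block: bankrolls 1..4 *)
R = P[[2 ;; 5, {1, 6}]];               (* jumps into the absorbing states 0 and 5 *)
n = Inverse[IdentityMatrix[4] - Q];    (* fundamental matrix *)
Simplify[(n . R)[[All, 2]]]            (* probability of reaching 5, from bankrolls 1..4 *)
Simplify[n . {1, 1, 1, 1}]             (* unconditional expected number of plays *)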

Henning Makholm suggests two approaches, neither of which I can follow. (This is because of my limitations, and is in no way intended as a criticism of the answer.) The first assumes that I have figured out the probability of winning starting from every possible bankroll. Then, as I understand it, we are to compute new transition probabilities that describe the experience of the winners, and compute the time to absorption in the new chain. Let $p_k$ be the probability of winning if my bankroll is $k$. How should I calculate the new transition matrix?

Henning Makholm gives an alternative method if there is only one starting state we're interested in, and I'm only interested in the actual case where I start in state $2$. He says, "first set up a system of equations that compute for each state the expected number of times one will encounter that state before being absorbed." If we let $e_k$ be this number for state $k=1,2,3,4,$ how do we construct the equations relating the $e_k?$ I can see how to do this if we only care about the number of plays until the game ends, but how do I make it reflect only the number of plays until winning?

saulspatz
  • 53,131
  • Did you deliberately choose an example where each state has only two successors? I think in this case, in the first method you can set up a system of linear equations for the transition probabilities conditional on winning; but that won't work in the same way if you have a more general transition matrix, as you'll have too many unknowns and too few equations. – joriki Jul 15 '18 at 07:20
  • @joriki No, I didn't. I wanted to make a simple example, so that it wasn't asking too much to request a solution, but it didn't occur to me that I might be over-simplifying the problem. – saulspatz Jul 15 '18 at 14:06

2 Answers

6

$\color{red}{\textbf{The currently accepted answer (as of 2022-12-16) is incorrect.}}$ Here I'll give an example that makes the incorrectness clear. Then I'll provide a correct method to solve this problem, including examples and a Mathematica implementation.

Paraphrasing the other answer

Let $X$ denote the number of steps until absorption, and let $Z_j$ be the event "we end up in absorbing state $S_j$." ($Z_j$ is only meaningful for $j$ s.t. $S_j$ is an absorbing state.) Suppose we want $E[X | Z_G]$ for some goal state $S_G$. The other answer suggests the following algorithm:

  1. Delete the other absorbing state(s) from the system.
  2. Iteratively delete any state that can only transition to deleted state(s). Repeat until there's an iteration with no new deletions.
  3. For each state that currently could transition to one or more deleted states, modify its transition probabilities to condition on not transitioning to those deleted states on the current step.
  4. The above steps give a modified Markov chain in which every transient state eventually ends up at $S_G$. Use standard methods to find $E[X]$ for that MC. This is the answer to the original problem.

Why that answer is incorrect (example)

First, an example. Let's say we have states $S_0, S_1, S_2, S_3$ which represent my current bankroll. If I have \$0 I can't play anymore. If I have \$1, I bet my dollar on a coin toss with $P[\text{win}] = \frac 1 {1000}$, ending up with either \$0 or \$2. If I have \$2, I bet \$1 on a coin toss with $P[\text{win}] = \frac 1 2$, ending up with either \$1 or \$3. Finally, if I have \$3 then I go home happy.

The transition matrix is: $$P = \left(\begin{array}{cccc} 1 & 0 & 0 & 0 \\ \frac{999}{1000} & 0 & \frac{1}{1000} & 0 \\ 0 & \frac{1}{2} & 0 & \frac{1}{2} \\ 0 & 0 & 0 & 1 \end{array} \right)$$

Let's say I start the game with \$2 and I want to compute $E[X | Z_3]$. If I follow the algorithm described above, I will first delete the other absorbing state $S_0$, then adjust the transition probabilities from $S_1$, getting the new matrix $$\hat P = \left(\begin{array}{ccc} 0 & 1 & 0 \\ \frac{1}{2} & 0 & \frac{1}{2} \\ 0 & 0 & 1 \end{array} \right)$$

At this point we can find the expected absorption time by Markov chain methods, or even just in our heads because the new MC is so simple. I'll demo the second approach just for fun. Starting from $S_2$, we have $\frac 1 2$ chance to win in 1 step; otherwise we drop to $S_1$, take the forced step back to $S_2$, and try again, so we have $\frac{1}{4}$ chance to win in 3 steps, $\frac{1}{8}$ chance to win in 5 steps, and so on. The expected game length is therefore $$E[X | Z_3] = \sum_{n=1}^\infty \frac{2n-1}{2^n} = \color{red}{\boxed{3}}.$$
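The same value drops out of the fundamental matrix of $\hat P$; here is a minimal Mathematica check (the names Phat and Qhat are ad hoc):

Phat = {{0, 1, 0}, {1/2, 0, 1/2}, {0, 0, 1}};   (* the pruned chain over S1, S2, S3 *)
Qhat = Phat[[1 ;; 2, 1 ;; 2]];                  (* transient block *)
Inverse[IdentityMatrix[2] - Qhat] . {1, 1}      (* {4, 3}: expected steps from S1 and S2 *)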

This answer doesn't pass intuitive sanity checks. Think back to the original Markov chain before we pruned it. We have $\frac{1}{2}$ chance of winning the game on the first turn. If we don't win immediately, then we move to state $S_1$, and from there we have an extremely high probability of losing the game immediately. This means that the games where we win will overwhelmingly be games where we just won the very first coin toss, so we should get $E[X | Z_3] = 1 + \varepsilon$ where $\varepsilon$ is some TBD small positive number. I'll use this same example below, and we can confirm this intuition.
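To make the sanity check concrete, here is a quick Monte Carlo sketch (the helper names step and runFrom2 are ad hoc) that simulates the original chain from $S_2$ and averages the length of the winning games only:

step[1] := RandomChoice[{999/1000, 1/1000} -> {0, 2}];   (* from bankroll 1: lose or win the toss *)
step[2] := RandomChoice[{1/2, 1/2} -> {1, 3}];           (* from bankroll 2 *)
runFrom2[] := Module[{s = 2, n = 0},
  While[s != 0 && s != 3, s = step[s]; n++];
  If[s == 3, n, Nothing]];
N[Mean[Table[runFrom2[], {100000}]]]   (* comes out very close to 1, matching the intuition above *)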

Corrected algorithm

We can follow these steps:

  1. For each state $S_j$, compute $g_j := P[Z_G | \text{ start at } S_j]$. There's a standard formula to compute these; see Wikipedia.
  2. Now compute a whole new transition matrix $\tilde P$. If $p_{jk}$ is the original transition probability of moving from $S_k$ to $S_j$, then define $\tilde p_{jk} = P[\text{step from $S_k$ to $S_j$} | Z_G]$. This can be computed using Bayes' Theorem, and it turns out we get $\tilde p_{jk} = \frac{g_j p_{jk}}{g_k}$; see the extra details below. Exception: we don't modify the transition probabilities starting from absorbing states.
  3. The modified Markov chain $\tilde P$ exactly describes the transition probabilities for random walkers who will eventually end up in state $S_G$ (unless they started in some other absorbing state). The expected time until absorption in this MC is exactly the quantity we set out to find, and we can compute it using another standard MC method.

Step (2) is not on Wikipedia's list of standard MC calculations, so I'll give extra details here. The idea is to use Bayes' Theorem, but with all probabilities conditioned on our current location being $S_k$. We compute: $$\begin{align} \tilde p_{jk} &= P[\text{next location is } S_j \mid Z_G, \text{ current location is } S_k] \\ &= \frac{P[Z_G \mid \text{next location is } S_j, \text{ curr location is } S_k]\; P[\text{next location is } S_j \mid \text{curr location is } S_k]}{P[Z_G \mid \text{curr location is } S_k]} \\ &= \frac{g_j\, p_{jk}}{g_k} \end{align}$$
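As a quick consistency check, note that the conditioned probabilities out of any state with $g_k > 0$ still sum to $1$: first-step analysis gives $g_k = \sum_j g_j\, p_{jk}$ for transient $S_k$, hence $$\sum_j \tilde p_{jk} = \sum_j \frac{g_j\, p_{jk}}{g_k} = \frac{g_k}{g_k} = 1,$$ so $\tilde P$ really is a stochastic matrix.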

Example 1: My counterexample from above

Let's calculate $E[X | Z_3]$ for the simple MC I used to show that the other proposed strategy doesn't work. I'm not going to show every step of plugging into the formulas, but I'll give the main intermediate results so that readers can use this to confirm their understanding of the method.

Step 1: We find $g_0 = 0$, $g_1 = \frac{1}{1999}$, $g_2 = \frac{1000}{1999}$, $g_3 = 1$.

Step 2: The modified transition matrix is: $$\tilde P = \left(\begin{array}{cccc} 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & \frac{1}{2000} & 0 & \frac{1999}{2000} \\ 0 & 0 & 0 & 1 \end{array} \right)$$

Step 3: Our final answer is $E[X | Z_G] = \color{blue}{\boxed{\frac{2001}{1999}}}$.

Note this time our answer agrees with the "intuitive sanity check" above, where we reasoned that the answer should be very slightly larger than 1.
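If you want to reproduce Example 1 mechanically, here is a minimal Mathematica sketch of Steps 1–3 for this 4-state chain (variable names are ad hoc):

P = {{1, 0, 0, 0}, {999/1000, 0, 1/1000, 0}, {0, 1/2, 0, 1/2}, {0, 0, 0, 1}};
g = {0, 1/1999, 1000/1999, 1};                 (* Step 1, from the first-step equations *)
Ptilde = P;
Do[Ptilde[[j, k]] = g[[k]] P[[j, k]]/g[[j]], {j, 2, 3}, {k, 1, 4}];   (* Step 2 *)
Qtilde = Ptilde[[2 ;; 3, 2 ;; 3]];
Inverse[IdentityMatrix[2] - Qtilde] . {1, 1}   (* Step 3: {4000/1999, 2001/1999} from S1, S2 *)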

Example 2: The actual problem stated in this question

The results are much messier this time. Starting from state $S_2$ (with the states ordered as in my code below, so $S_5$ is the winning state), I get $$E[X | Z_5] = \boxed{\frac{5 - 3 p + 3 p^2 - 7 p^3 + 5 p^4 - p^5}{(-2 + p) (-1 + p^2 - 2 p^3 + p^4)}}$$ I'm not including the intermediate results because they're quite messy, and I assume anyone interested in this method would just be implementing it in some CAS anyway. (I myself used Mathematica.)

A few examples: if $p = 0.5$ then the answer is 2.6. If $p = 0.999$ then the answer is about 2.001. If $p = 0.001$ then the answer is about 2.49975.
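A quick way to reproduce those numbers from the boxed expression (the helper name ex2 is ad hoc):

ex2[p_] := (5 - 3 p + 3 p^2 - 7 p^3 + 5 p^4 - p^5)/((-2 + p) (-1 + p^2 - 2 p^3 + p^4));
N[ex2 /@ {1/2, 999/1000, 1/1000}]   (* {2.6, 2.001, 2.49975}, approximately *)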

My Mathematica implementation

(* Define the goal state G and the transition matrix P. For this
   implementation to work, we need all transient states to be before G,
   and all "bad" absorbing states to be after G. *)
G = 5;
P = ({
    {0, p, 0, 0, 0, 1 - p},
    {0, 0, 0, p, 0, 1 - p},
    {1 - p, 0, 0, 0, p, 0},
    {0, 0, 1 - p, 0, p, 0},
    {0, 0, 0, 0, 1, 0},
    {0, 0, 0, 0, 0, 1}
   });

dim = Length[P];
Q = P[[1 ;; G - 1, 1 ;; G - 1]];  (* transient part *)
R = P[[1 ;; G - 1, G ;; dim]];
n = Inverse[IdentityMatrix[Length[Q]] - Q];
B = n . R // Simplify;

(* g[[k]] = P[win | start in state k] *)
g = Join[B[[All, 1]], {1}];
(* modified transition matrix *)
Ptilde = Table[P[[j]][[k]] g[[k]]/g[[j]] // Simplify, {j, 1, G}, {k, 1, G}];
Qtilde = Ptilde[[1 ;; G - 1, 1 ;; G - 1]];
Ntilde = Inverse[IdentityMatrix[Length[Qtilde]] - Qtilde];

(* tt holds the final expected number of turns till absorption given
   that we eventually win. tt is a list that holds the answers for each
   transient state in order. *)
tt = Ntilde . Table[1, {k, 1, Length[Ntilde]}] // Simplify

[Screenshot of the Mathematica output: the list tt of four symbolic expressions, one per transient state.]

Note that the second entry in that list (i.e., the answer starting from state $S_2$) is the expression I quoted in Example 2 above.

David Clyde
  • 5,251
  • Apologies for the extremely late followup! I came across this question while solving a problem and I was glad to see the simple algorithm in the accepted answer but then disappointed when I found it was wrong. I eventually figured it out myself using this answer https://math.stackexchange.com/a/2852070/1127272 but I thought I'd explain details + provide code to save some trouble for future people who come across this. – David Clyde Dec 17 '22 at 02:38
  • Nice! One suggestion would be to justify why the conditioned chain is still Markov. – Aditya Dhawan Dec 18 '22 at 03:30
  • @AdityaDhawan I defined $\tilde p_{jk}$ to be the transition probabilities for random walkers who eventually arrive at the goal state $S_G$. Those walkers' transition probabilities from a given state will still sum to 1, so the conditioned version is still a MC. – David Clyde Dec 18 '22 at 03:51
  • That does not show that the conditioned version is a Markov Chain. Any discrete process has a transition matrix. You need to show the conditioned chain satisfies the Markov Property. – Aditya Dhawan Dec 18 '22 at 04:03
  • @AdityaDhawan What do you mean by the Markov Property? I would state the MP as "state at time $n+1$ conditioned on state at time $n$ is the same as state at time $n+1$ conditioned on all prior states". That property is satisfied by any discrete process governed by a matrix with nonnegative real entries whose rows/columns (depending on setup) sum to 1. See https://en.wikipedia.org/wiki/Stochastic_matrix for example. – David Clyde Dec 18 '22 at 04:10
  • @AdityaDhawan "Any discrete process has a transition matrix" This is not true. If the discrete process depends on multiple prior states then you can't write down a transition matrix that lets you compute state at time n+1 by inputting the state at time n. To explain from another angle: the matrix fully defines the transition probabilities depending on where you're currently standing, without any reference to where you've been in the past, so MP is satisfied. – David Clyde Dec 18 '22 at 04:13
  • It's possible we're working from different definitions somehow. So if you still disagree, please a) explain what you think Markov Property means and b) give an example of a discrete process that is governed by a transition matrix satisfying my conditions (nonnegative real entries, sum of transition probabilities from any fixed state is 1) and is not a Markov chain. – David Clyde Dec 18 '22 at 04:15
  • Let me clarify; by a transition matrix I mean the matrix with entries $\mathbb{P}(X_{n+1} = j|X_{n} = i)$, which is possibly dependent on $n$. Note that this is a stochastic matrix.

    For a homogeneous process, $\mathbb{P}(X_{n+1} = j|X_{n} = i)$ is independent of $n$.

    I agree with you, these transition probabilities in general do not determine the law of a discrete process. I did not claim so.

    We agree on our definitions of the Markov property as well. I also agree that any stochastic matrix $P$ corresponds to a homogeneous Markov Chain with transition matrix $P$.

    – Aditya Dhawan Dec 18 '22 at 04:32
  • I think the source of our disagreement might be ambiguity on my part: I agree that there is a Markov Chain with transition probabilities corresponding to $\tilde p_{jk}$. What is not trivial to me (although it is provable) is why the process $(S_n)_{n \in \mathbb{N}}$, conditional on the event $\{S_n \text{ ends up in the absorbing state } G\}$, is Markov. – Aditya Dhawan Dec 18 '22 at 04:32
  • I did a simulation with $p=.5$ and got $2.604$ plays, in strong agreement with your calculation. I just got out of the hospital, and my brain is still pretty foggy, so I can't follow your calculation at the moment. I look forward to reading it when I get my mind back. – saulspatz Dec 21 '22 at 21:44
  • @saulspatz Certainly don't worry about how fast you get back to me! Take your time and heal up, I'll still come back to this if you have some clarification question later. I do agree my algorithm is a little more complex than the other version; I'd be much happier if that other method were also valid, since it's much easier to implement. – David Clyde Dec 21 '22 at 22:22
  • Is it possible to derive a probability distribution of these times? – Uk rain troll Jan 30 '24 at 18:38
  • Yes. My answer above shows how to compute a modified transition matrix $\tilde P$. Once you've done that, you can use $\tilde P$ to do any operation that you could normally do on a Markov chain. Therefore, we just need to be able to answer your question for a general absorbing Markov chain with 1 absorbing state, and this previous question looks like a good source for that. – David Clyde Jan 31 '24 at 19:48
0

[A partial answer illustrating the first method.]

State $1$ represents losing the game, so remove it from the system and condition the remaining transition probabilities on the event that the system does not transition from state $i$ to state $1$. In practical terms, this means that you delete from your transition matrix column $1$ and every row that had a $1$ in this column, and scale each remaining row $i$ by $1/(1-p_{1i})$. For your example system, this produces the reduced matrix $$P' = \begin{bmatrix}0&1&0&0&0 \\ 0&0&0&1&0 \\ q&0&0&0&p \\ 0&0&q&0&p \\ 0&0&0&0&1 \end{bmatrix}.$$ Applying standard techniques to this matrix, we can verify that the absorption probability is $1$ for any starting state and that the expected conditional absorption times are $$\begin{bmatrix}{3+q\over1-q^2} \\ {2+q+q^2\over1-q^2} \\ {1+3q\over1-q^2} \\ {1+q+2q^2\over1-q^2}\end{bmatrix}.$$
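These conditional times are the row sums of the fundamental matrix of $P'$; here is a minimal Mathematica sketch (the name Qp is ad hoc, with $q$ kept symbolic):

Qp = {{0, 1, 0, 0}, {0, 0, 0, 1}, {q, 0, 0, 0}, {0, 0, q, 0}};   (* transient block of P' *)
Simplify[Inverse[IdentityMatrix[4] - Qp] . {1, 1, 1, 1}]         (* equal to the column of times above *)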

amd
  • 53,693
  • Perfect. Thank you. – saulspatz Oct 27 '18 at 15:58
  • @saulspatz I also answered your original question. The thing to watch out for when doing this is that you also have to prune any dead-end branches that result from deleting the “bad” absorbing states. – amd Oct 27 '18 at 19:22
  • Actually, that was someone else's question, but I went and looked at your answer. I don't think I really understood what you meant by "dead-end branches" till I read the other answer, but now it makes perfect sense to me. – saulspatz Oct 27 '18 at 23:32