Transition Matrix in Markov Chain Explained
A Markov chain is an important mathematical model: many practical problems can be modeled as a Markov chain and then solved within that framework. In this post I would like to talk about the transition matrix, and specifically how to deduce the transition matrix after n steps.
Let us use a simple but non-trivial example. Say we have only two states, State 1 and State 2. State 1 has probability 0.6 of staying put and probability 0.4 of moving to State 2; State 2 has probability 0.8 of staying put and probability 0.2 of moving to State 1. We can easily write down the transition matrix:
$$q =
\begin{pmatrix}
0.6 & 0.4 \\
0.2 & 0.8 \\
\end{pmatrix}
$$
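For readers who want to follow along in code, here is a minimal sketch of this matrix in Python with NumPy (the variable name `q` simply matches the formula above; nothing here is a standard API beyond NumPy itself):

```python
import numpy as np

# Row i holds the probabilities of leaving State i+1:
# row 1 is State 1 -> (State 1, State 2),
# row 2 is State 2 -> (State 1, State 2).
q = np.array([[0.6, 0.4],
              [0.2, 0.8]])

# Every row must sum to 1: from any state, the chain must go somewhere.
assert np.allclose(q.sum(axis=1), 1.0)
```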
Please note that different texts represent the transition matrix in different ways. In our convention, each row holds the transition probabilities out of the corresponding state. Let us assume we start at $T_0$ with a PMF of $(1,0)$, which simply means that at $T_0$ the chain is at State 1 with probability 1. What happens after one step, at $T_1$: what will the PMF change to? We multiply the $T_0$ PMF by our transition matrix:
$$PMF(T_1) = (1,0) *
\begin{pmatrix}
0.6 & 0.4 \\
0.2 & 0.8 \\
\end{pmatrix}
$$
We get the PMF at $T_1$ to be $(0.6, 0.4)$. This makes sense intuitively just by looking at the Markov chain: if we are certain we start at State 1, the PMF for the next step is exactly the first row of the matrix, $(0.6, 0.4)$. The tricky part is step 2, i.e. $T_2$: what will the PMF look like? I will write down the result first and then give an intuitive explanation. It turns out that the PMF at $T_2$ is obtained by simply multiplying the $T_1$ PMF by the transition matrix again:
$$PMF(T_2) = (0.6 , 0.4) *
\begin{pmatrix}
0.6 & 0.4 \\
0.2 & 0.8 \\
\end{pmatrix}
$$
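Continuing the NumPy sketch above, each step is a single vector-matrix product:

```python
pmf0 = np.array([1.0, 0.0])  # 100% at State 1 at T_0

pmf1 = pmf0 @ q  # PMF at T_1
pmf2 = pmf1 @ q  # PMF at T_2

print(pmf1)  # [0.6 0.4]
print(pmf2)  # [0.44 0.56]
```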
We get $(0.44, 0.56)$, which means that at step 2 the chain is at State 1 with probability 0.44 and at State 2 with probability 0.56. Now the question is: why can the next step's PMF be calculated by multiplying the current PMF by the transition matrix? To answer that, let me rewrite the above in general form:
$$PMF(T_{n+1}) = (P_{s1(n)} , P_{s2(n)}) *
\begin{pmatrix}
P_{11} & P_{12} \\
P_{21} & P_{22} \\
\end{pmatrix}
$$
Let us multiply it out: $PMF(T_{n+1}) = (P_{s1(n+1)}, P_{s2(n+1)}) = (P_{s1(n)}*P_{11} + P_{s2(n)}*P_{21},\ P_{s1(n)}*P_{12} + P_{s2(n)}*P_{22})$. Now we can see it is just a discrete case of the law of total probability. We know the current PMF $(P_{s1(n)}, P_{s2(n)})$, and we know, for each state, the probability of moving to State 1, namely $(P_{11}, P_{21})$; therefore the probability of being at State 1 at the next time step is $P_{s1(n)}*P_{11} + P_{s2(n)}*P_{21}$. In the same way, you can calculate the new probability for State 2. More generally, if we have $M$ states, we can use the same rule and multiply the current PMF by the transition matrix to get the next step's PMF, as sketched below.
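Here is a minimal sketch of that rule for an arbitrary number of states, continuing the NumPy example above. `next_pmf` is my own helper name, not a standard function, and the loops deliberately mirror the total-probability sum instead of calling `pmf @ q` directly:

```python
def next_pmf(pmf, q):
    # Entry j of the new PMF is the sum over i of pmf[i] * q[i, j],
    # the discrete total-probability sum derived above.
    m = len(pmf)
    out = np.zeros(m)
    for j in range(m):
        for i in range(m):
            out[j] += pmf[i] * q[i, j]
    return out  # identical to pmf @ q

print(next_pmf(np.array([0.6, 0.4]), q))  # -> [0.44 0.56]
```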
This also brings in a very important property of a Markov chain: the state at step $n+1$ depends only on the state at step $n$. The chain "forgets" all of the history from step $0$ to step $n-1$.
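This property is also what makes the behavior after n steps easy to compute: since every step applies the same matrix, the transition matrix after n steps is simply the matrix power $q^n$, and $PMF(T_n) = PMF(T_0) \, q^n$. A quick check, again continuing the sketch above:

```python
# The two-step transition matrix is q squared; multiplying the starting
# PMF by it gives the PMF at T_2 in one shot.
q2 = np.linalg.matrix_power(q, 2)
print(np.array([1.0, 0.0]) @ q2)  # [0.44 0.56], same as stepping twice
```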