Suppose I have the following recurrence for probabilities:
$$f_k(t)=e_k(x_t)\sum_{i}f_i(t-1)p_{ik}. \tag{1}$$
Here $1 \leq i,k \leq N$ index a finite number of states, time $t \leq T$ is discrete, and $x_t$ is a discrete observation at time $t$.
The conditional probabilities $p_{ik}=P(\pi_t=k \mid \pi_{t-1}=i)$ and $e_k(x_t)=P(\text{observation}=x_t \mid \pi_{t}=k)$ do not depend on $t$ (I only use $t$ in the notation): they are the transition probabilities of a time-homogeneous Markov chain and the probability of the associated observation given the state of the chain.
Both $f_k(t)$ and $\sum_{k=1}^N f_k(t)$ are indeed probabilities ($<1$); what I am actually interested in is $\sum_{k=1}^N f_k(T)$.
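For concreteness, here is a minimal sketch of how I evaluate recurrence (1) in the discrete case. The transition matrix, emission matrix, observation sequence and the initialization $f_k(1)=P(\pi_1=k)\,e_k(x_1)$ are placeholders/assumptions of mine, not part of the model above:

```python
import numpy as np

# Hypothetical discrete model: N states, M observation symbols (placeholder sizes).
N, M, T = 3, 4, 10
rng = np.random.default_rng(0)

p = rng.dirichlet(np.ones(N), size=N)   # p[i, k] = P(pi_t = k | pi_{t-1} = i)
e = rng.dirichlet(np.ones(M), size=N)   # e[k, x] = P(observation = x | pi_t = k)
pi0 = np.full(N, 1.0 / N)               # initial state distribution (my assumption)
x = rng.integers(0, M, size=T)          # observed sequence x_1, ..., x_T

# Recurrence (1): f_k(t) = e_k(x_t) * sum_i f_i(t-1) * p_{ik}
f = pi0 * e[:, x[0]]                    # f_k(1)
for t in range(1, T):
    f = e[:, x[t]] * (f @ p)

print(f.sum())                          # sum_k f_k(T), a probability <= 1
```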
Now I want to generalize this formula to continuous distributions.
Suppose that I have $T$ observations as before, but both the state space and the observation space are now $\mathbb{R}$. I have two conditional probability density functions, $p_{Y \mid Y'}(y \mid y')$ for transitions and $e(x \mid y)$ for observations. Transitions (and observations) still occur at discrete times $t=1,\dots,T$.
I intend to compute the following:
$$f_t(y) = e(x_t \mid y)\int_{\mathbb{R}} f_{t-1}(y')\, p_{Y \mid Y'}(y \mid y')\, dy'$$
I think $f_t$ here should behave like a density, and integrating it over $\mathbb{R}$ should give me the probability I want:
$$\int_{\mathbb{R}}f_T(y)dy \leq 1. $$
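This is roughly how I do the computation: I truncate $\mathbb{R}$ to a grid and replace the integral with a quadrature sum. The Gaussian transition, emission and initial densities below are placeholders chosen purely for illustration (the actual densities are not part of the question); with narrow densities like these, the printed value can already come out well above $1$:

```python
import numpy as np
from scipy.stats import norm

# Grid covering the region of R where the densities have non-negligible mass.
y = np.linspace(-10.0, 10.0, 2001)
dy = y[1] - y[0]

# Placeholder densities (my choice, only for illustration):
#   transition p(y | y') = N(y; y', sigma_p^2),  emission e(x | y) = N(x; y, sigma_e^2)
sigma_p, sigma_e = 0.1, 0.1
P = norm.pdf(y[:, None], loc=y[None, :], scale=sigma_p)  # P[j, i] = p(y_j | y'_i)

x_obs = np.zeros(5)                      # some observed values x_1, ..., x_T

# f_1(y) = (density of Y_1) * e(x_1 | y); a standard normal initial density is my assumption.
f = norm.pdf(y, loc=0.0, scale=1.0) * norm.pdf(x_obs[0], loc=y, scale=sigma_e)
for x_t in x_obs[1:]:
    # f_t(y) = e(x_t | y) * integral f_{t-1}(y') p(y | y') dy'   (Riemann-sum quadrature)
    f = norm.pdf(x_t, loc=y, scale=sigma_e) * (P @ f) * dy

print(f.sum() * dy)                      # the quantity I expect to be <= 1
```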
However, direct computation gives me a very large result ($\sim 10^{226}$). The apparent reason is that the density $p_{Y \mid Y'}(y \mid y')$ takes very large values and is integrated over $y'$, not over $y$.
What could be wrong with this idea, and what can I do to make this approach work for a continuous state space?