As the title says, I'm interested in rigorously justifying the process of linearization of systems of non-linear difference equations. I recently started reading a book about dynamical systems, and generally speaking the arguments given seem solid enough, but when I got to the study of stability of steady state solutions the arguments are a bit too informal for my liking, and I haven't been able to fill in the details in a way that I find satisfying.
According to the book, we start by considering a generic difference equation $x_{n+1}=f(x_{n})$. The book assumes that $f$ is at least twice differentiable in a neighborhood of a steady state, which is a reasonable requirement.
Then we consider a value of $x$ such that $x=f(x)$ (i.e. a steady state). To determine stability we start with a small deviation from $x$, say $x_{0}=x+\epsilon_{0}$, and look at the behaviour of this value under iteration. Specifically, we would like to know what happens to the difference $x_{n}-x$ as $n$ goes to infinity.
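To have a concrete example in mind (my own toy example, not one from the book): for $f(x)=x^{2}$ the steady states solve $x=x^{2}$, so $x=0$ or $x=1$, with $f'(0)=0$ and $f'(1)=2$.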
Now, we know that for all $n$, $x_{n+1}=f(x_{n})$. On the other hand we can write $x_{n}=x+\epsilon_{n}$. If we expand $f$ into its Taylor polynomial around $x$ we get $f(x_{n})=f(x)+f'(x)\epsilon_{n}+O(\epsilon_{n}^{2})$. To simplify notation let's just set $a=f'(x)$. Since $f(x)=x$, we end up with $\epsilon_{n+1}=x_{n+1}-x=a\epsilon_{n}+O(\epsilon_{n}^{2})$, which, for sufficiently small $\epsilon_{n}$, is roughly equal to $a\epsilon_{n}$.
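If it helps, I believe the $O(\epsilon_{n}^{2})$ term can be made explicit with the Lagrange form of the remainder: since $f$ is twice differentiable near $x$, for $\epsilon_{n}$ small enough
$$f(x+\epsilon_{n})=f(x)+f'(x)\epsilon_{n}+\tfrac{1}{2}f''(\xi_{n})\epsilon_{n}^{2}$$
for some $\xi_{n}$ between $x$ and $x+\epsilon_{n}$, so that $\epsilon_{n+1}=a\epsilon_{n}+\tfrac{1}{2}f''(\xi_{n})\epsilon_{n}^{2}$.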
The next part of the argument is what I find particularly sketchy: regardless of the value of $n$, we neglect the quadratic term and claim that $\epsilon_{n+1}=a\epsilon_{n}$. Why can we simply do this? Even if the individual error terms are tiny, when we iterate using that last equation the errors begin to add up.
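To make precise what bothers me: if we really had $\epsilon_{n+1}=a\epsilon_{n}$ exactly, then of course $\epsilon_{n}=a^{n}\epsilon_{0}$, which tends to zero whenever $\vert a\vert<1$. So the claim must be that the true errors behave like $a^{n}\epsilon_{0}$, even though a quadratic term is discarded at every single step.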
So my question is: can we even guarantee that if the starting value is sufficiently close to the steady state, all the successive values remain close to it? I would say no, because if the steady state is unstable the difference may grow bigger and bigger, to the point where we can no longer ignore the error term.
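For instance, with the same toy example $f(x)=x^{2}$ and the steady state $x=1$: starting at $x_{0}=1+\epsilon_{0}$ with $\epsilon_{0}>0$ gives $\epsilon_{1}=2\epsilon_{0}+\epsilon_{0}^{2}>2\epsilon_{0}$, so the deviation at least doubles at every step no matter how small $\epsilon_{0}$ is.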
We could suppose, then, that all the iterates stay sufficiently close to the steady state. Then all the errors must also remain close to zero. As a result we could take the supremum and infimum over all these errors, say $M$ and $m$, define $E=\max\lbrace\vert m\vert,\vert M\vert\rbrace$, and claim that for all $n$ the following inequalities hold:
$-E+a\epsilon_{n}\leq\epsilon_{n+1}\leq E+a\epsilon_{n}$
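Iterating this, if I haven't made a mistake, gives
$$\vert\epsilon_{n}\vert\leq\vert a\vert^{n}\vert\epsilon_{0}\vert+E\left(1+\vert a\vert+\cdots+\vert a\vert^{n-1}\right)\leq\vert a\vert^{n}\vert\epsilon_{0}\vert+\frac{E}{1-\vert a\vert}$$
when $\vert a\vert<1$, which is where the conclusion I mention below comes from.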
If I could somehow drop the error $E$, then I could easily find that if $\vert a\vert<1$ the error approaches zero, but then again I don't know how to rigorously justify that step. On the other hand, if I keep $E$ I can only show that in the limit $\epsilon_{n}$ remains in a neighborhood of zero (assuming $\vert a\vert<1$), which is not enough.
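Perhaps the fix is to let the bound depend on $\epsilon_{n}$ instead of being a single constant: if there were a constant $C$ with
$$\vert\epsilon_{n+1}\vert\leq\vert a\vert\vert\epsilon_{n}\vert+C\epsilon_{n}^{2}=\left(\vert a\vert+C\vert\epsilon_{n}\vert\right)\vert\epsilon_{n}\vert$$
for all the iterates, then choosing $\epsilon_{0}$ small enough that $\vert a\vert+C\vert\epsilon_{0}\vert<1$ would make the errors decrease geometrically. But I don't see how to justify that a single $C$ works for every $n$, or that the iterates stay in the region where such a bound holds, so this may just be going in circles.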
So the only thing left is to assume that $\epsilon_{n}$ converges to zero, in other words to assume that the steady state $x$ is stable, and even then I have no idea how to prove that the error approaches $0$.
To summarize: I intuitively see why it's reasonable to expect that, as long as the error term in the expansion is small, the linearized system should behave in the same way as the original system, but I don't find the argument rigorous. I've seen other proofs that use the mean value theorem to establish the stability criterion, but I don't know if those generalize easily to higher dimensions, so I was hoping this other argument could be formalized.