Lyapunov exponents: Why do we know that the changes happen at an exponential rate

Question

Let $E$ be a $\mathbb R$-Banach space, $\Omega\subseteq E$ be open, $f:\Omega\to\Omega$ be continuously Fréchet differentiable, $x_0\in\Omega$ and $\varepsilon>0$ with $B_\varepsilon(x_0)\subseteq\Omega$.

If I got the intuition right, the Lyapunov stability theory of the dynamical system $$x_n=f(x_{n-1})=\cdots=f^n(x_0)\;\;\;\text{for all }n\in\mathbb N,\tag1$$ is trying to measure the change of the system when $x_0$ is perturbed in direction $h\in E$, $\left\|h\right\|_E=1$, to $\tilde x_0=x_0+\varepsilon h$. By $(1)$, this change is equal to $$y_n:=f^n(\tilde x_0)-f^n(x_0)$$ after $n\in\mathbb N$ iterations.

If $E=\mathbb R$, I've read that "the" Lyapunov exponent $\lambda$ is measuring the "exponential change of the distance", $$\varepsilon e^{n\lambda}=|y_n|\;\;\;\text{for all }n\in\mathbb N\tag2.$$

Now, my question is: How do we know that the "change of the distance" happens at an exponential rate? Or is $(2)$ a "model assumption"?

I started asking myself this question when I saw the multiplicative ergodic theorem and wondered why the logarithm occurs in it (this form can be derived from $(2)$ by solving for $\lambda$ and letting $\varepsilon\to0$). Why don't we consider solely the distance instead of its logarithm?
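
One way to make this derivation explicit for $E=\mathbb R$ (a sketch only, assuming that $(2)$ is read as an approximation for small $\varepsilon$ and that $(f^n)'(x_0)\ne0$): solving $(2)$ for $\lambda$ at fixed $n$ and letting $\varepsilon\to0$ gives

$$\lambda\approx\frac1n\log\frac{|y_n|}{\varepsilon}=\frac1n\log\left|\frac{f^n(x_0+\varepsilon h)-f^n(x_0)}{\varepsilon}\right|\xrightarrow{\varepsilon\to0}\frac1n\log\left|(f^n)'(x_0)\right|=\frac1n\sum_{k=0}^{n-1}\log\left|f'(x_k)\right|,$$

where the limit uses $|h|=1$ and the last equality is the chain rule applied to $f^n=f\circ\cdots\circ f$. Letting $n\to\infty$ afterwards yields the quantity that appears in the multiplicative ergodic theorem, provided that limit exists.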

And of course, my subsequent question is how $(2)$ above is generalized to the general Banach space case. From what I've read so far, I assume one is not looking at perturbations in arbitrary directions, but in those which form bases of the eigenspaces of $A:={\rm D}f(x_0)$ (which together form a basis of $\overline{\mathcal R(A)}$, but what about directions in $\mathcal N(A)$?).

I understand that the latter considerations need to be understood with respect to the linearization of $(1)$.

Does this answer your question? Relation between classical definitions of chaos and exponential divergence of trajectories — Wrzlprmft, May 02 '20 at 19:25
@Wrzlprmft Thank you for the link. It is helpful, but $E=\mathbb R$ was just an example. How does this generalize? What are the directions $h$ which we consider in the general case? Only those in directions in the eigenspaces? This and the other questions in my post are not answered by this other question. — 0xbadf00d, May 02 '20 at 19:29
@Wrzlprmft If I apply your thoughts in the link, I get that $$\left\|x_n-y_n\right\|_E\approx\left\|{\rm D}f(x_{n-1})\right\|_{\mathfrak L(E)}\cdots\left\|{\rm D}f(x_0)\right\|_{\mathfrak L(E)}\left\|y_0-x_0\right\|_E,$$ but why is the right-hand side asymptotically equivalent to $\mu^n\left\|y_0-x_0\right\|_E$ for some "appropriate average $\mu$ over the values of $\left\|{\rm D}f\right\|_{\mathfrak L(E)}$" as you say? — 0xbadf00d, May 05 '20 at 07:40
Roughly, because you average over all trajectories or initial conditions (on the attractor), respectively. As the attractor is invariant under time evolution, this average should be the same at each step of the iteration. — Wrzlprmft, May 05 '20 at 08:25
@Wrzlprmft Do you have a more formally rigorous argument? What does it mean to average over all trajectories? All possible choices of the initial vector $x_0$? Those form an uncountable set ... — 0xbadf00d, May 05 '20 at 09:56
"Do you have a more formally rigorous argument?" I am afraid not. "What does it mean to average over all trajectories? All possible choices of the initial vector $x_0$? Those form an uncountable set ..." Yes and yes. — Wrzlprmft, May 05 '20 at 12:47
@Wrzlprmft I have one, at least in a special case: Assume our system is given by $x_n=A^nx$, where $A\in\mathbb R^{d\times d}$ with $\sigma(A)\setminus\{0\}=\{\lambda_1,\ldots,\lambda_d\}$ and $|\lambda_1|>\cdots>|\lambda_d|$. Let $E_i:=\mathcal N(\lambda_i-A)$ and $H_i:=E_i\oplus\cdots\oplus E_d$, $H_{d+1}:=\{0\}$. Then, if $x\in H_i\setminus H_{i+1}$ (which means that $x$ has nonvanishing projections onto $E_i,\ldots,E_d$), it's easy to see that $|A^nx|^{\frac1n}\xrightarrow{n\to\infty}|\lambda_i|$. Can we conclude that $A^nx\xrightarrow{n\to\infty}\lambda_i\langle x,e_i\rangle e_i$, where $e_i\in E_i$ is arbitrary with $|e_i|=1$? — 0xbadf00d, May 05 '20 at 14:07
@Wrzlprmft Please take note of my comment below the answer of Felipe Pérez. Maybe you know the answer to my question. — 0xbadf00d, May 07 '20 at 07:35

score 2 · Accepted Answer · answered May 06 '20 at 17:58

I will talk about the one-dimensional case, which is the one I understand better, in a not-so-general setting.

Suppose you have a transformation $T\colon[0,1]\to[0,1]$ that is reasonably nice, say piecewise $\mathcal{C}^1$. Think of maps such as the doubling map $x\mapsto 2x \pmod 1$ or the Gauss map $x\mapsto \frac{1}{x} \pmod 1$. We are interested in the rate at which nearby orbits diverge from each other as you iterate the map. Take two points $x,x+\delta\in[0,1]$ which are very close. To first order, $T(x+\delta)-T(x)\approx T'(x)\,\delta$. If you keep iterating, the same holds with $(T^n)'(x)$ in place of $T'(x)$.

Now comes something important: you are interested in the typical growth of $(T^n)'(x)$ for generic $x$ with respect to a given measure. For the two maps mentioned above there are natural choices: the Lebesgue measure and the Gauss measure respectively, each of which is invariant and ergodic for its map (the Gauss measure is equivalent to the Lebesgue measure, so their null sets are the same). For the doubling map, $(T^n)'(x)=2^n$ for almost every $x\in[0,1]$ (in fact for all but countably many). Thus it only makes sense to look at the exponential growth rate of $(T^n)'(x)$, which is why we look at the almost sure value of

$$ \lambda(x) = \lim_{n\to\infty} \dfrac{1}{n}\log |(T^n)'x| $$

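The magic part is that the expression above can be written as a Birkhoff sum: by the chain rule, $(T^n)'(x)=\prod_{k=0}^{n-1}T'(T^kx)$, so taking logarithms turns the product into a sum. By the ergodic theorem you then have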

$$ \lambda(x) = \lim_{n\to\infty} \dfrac{1}{n}\log |(T^n)'x| = \lim_{n\to\infty}\dfrac{1}{n}\sum_{k=0}^{n-1} \log|T'\circ T^k x| = \int_{[0,1]} \log |T'| d\mu $$

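for $\mu$-almost every point, provided $\mu$ is ergodic and $\log|T'|$ is $\mu$-integrable. For higher-dimensional systems the magic comes from the subadditive ergodic theorem, which underlies the multiplicative ergodic theorem.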
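
As a numerical sanity check of the Birkhoff identity above, here is a minimal Python sketch for the Gauss map. The function names, the starting point `x0` and the number of iterations are my own illustrative choices, and floating-point orbits only imitate true orbits, so this is an illustration rather than a proof. It averages $\log|T'|=-2\log x$ along one orbit and compares the result with $\int\log|T'|\,d\mu=\pi^2/(6\log 2)$ for the Gauss measure.

```python
import math


def gauss_map(x: float) -> float:
    """Gauss map T(x) = 1/x mod 1 for x in (0, 1]."""
    return math.fmod(1.0 / x, 1.0)


def lyapunov_birkhoff(x0: float, n: int) -> float:
    """Average log|T'(T^k x0)| over the first n orbit points, using |T'(x)| = 1/x^2."""
    x, total, steps = x0, 0.0, 0
    for _ in range(n):
        if x <= 0.0:  # the floating-point orbit landed on 0, where T is undefined
            break
        total += -2.0 * math.log(x)  # log|T'(x)| = log(1/x^2)
        x = gauss_map(x)
        steps += 1
    return total / steps


# x0 and n are arbitrary illustrative choices.
estimate = lyapunov_birkhoff(x0=0.2357111317, n=200_000)
# Integral of log|T'| against the Gauss measure dx / (log(2) * (1 + x)).
exact = math.pi ** 2 / (6.0 * math.log(2.0))
print(f"Birkhoff average: {estimate:.4f}, integral: {exact:.4f}")
```

The two printed numbers should agree to roughly two decimal places. For the doubling map the same computation is trivial, since $\log|T'|=\log 2$ at every point where $T$ is differentiable.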

Now, about the rate of change: it depends on the assumptions on your map. For a one-dimensional example, just consider an irrational rotation $x\mapsto x + \alpha \pmod 1$. This map has constant derivative equal to one, so $(T^n)'$ does not grow at all as you iterate. In higher dimensions things get even more complicated: you can have maps with expanding and contracting directions, and possibly neutral directions where there is neither expansion nor contraction. Again, the rate of expansion/contraction need not be exponential. You can build examples by taking products of whatever one-dimensional systems you want.
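
To see the contrast concretely, the following Python sketch tracks the distance between two nearby orbits under the doubling map and under a rotation. The helper names, the starting points and the step count are hypothetical choices of mine, and the rotation angle is a floating-point stand-in for an irrational number. The doubling map is iterated in exact rational arithmetic to avoid the well-known collapse of `2 * x % 1` in binary floating point.

```python
import math
from fractions import Fraction


def circle_dist(a, b):
    """Distance between two points of [0, 1) viewed as a circle."""
    d = (a - b) % 1
    return min(d, 1 - d)


def separation(step, x, delta, n):
    """Circle distance between the orbits of x and x + delta after n steps."""
    y = x + delta
    for _ in range(n):
        x, y = step(x), step(y)
    return circle_dist(x, y)


def doubling(x):
    return (2 * x) % 1


ALPHA = math.sqrt(2) - 1  # float approximation of an irrational angle


def rotation(x):
    return (x + ALPHA) % 1


n = 20
# Exact arithmetic: the separation is multiplied by 2 at every step.
sep_doubling = separation(doubling, Fraction(1, 3), Fraction(1, 10**9), n)
# Rotation is an isometry of the circle: the separation stays at delta.
sep_rotation = separation(rotation, 0.3, 1e-9, n)
print(float(sep_doubling), sep_rotation)
```

As long as $2^n\delta<\tfrac12$, the doubling-map separation equals $2^n\delta$, while the rotation keeps it at $\delta$ up to rounding, which matches a Lyapunov exponent of $\log 2$ in the first case and $0$ in the second.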

To wrap things up, people have looked at exponential contraction/expansion because it is one of the easiest setups to investigate. In uniformly hyperbolic dynamics, the second you allow a point where the derivative does not have modulus bigger than $1$, things start going bad really quickly. For instance, one of the most important tools in dynamics is the transfer operator, which lets you understand some dynamical properties in spectral terms. One of the fundamental properties of this operator in this context is that it is quasi-compact, and this yields many important properties for your system (I can give you some references if you are interested). In the context where you don't have uniform expansion, this property of the operator is lost, and the theory becomes much more difficult.

Thank you for your answer. Is there a reason why you (and, as far as I can tell, everyone who writes about this topic) write $\lambda(x) = \lim_{n\to\infty} \dfrac{1}{n}\log |(T^n)'x|$ instead of $\mu(x):=\lim_{n\to\infty}|(T^n)'x|^{\frac1n}$? This seems to be counterintuitive, since not $\lambda(x)$, but $\mu(x)=e^{\lambda(x)}$ is the eigenvalue of the limiting matrix $\lim_{n\to\infty}|T|^{\frac1n}$. — 0xbadf00d, May 06 '20 at 18:55
In ergodic theory, and in particular in thermodynamic formalism, some conventions regarding eigenvalues of certain operators are not well established. For instance, Bowen's pressure and Baladi's pressure differ by a logarithm. Here the situation is similar: depending on what you're interested in, you can take the logarithm of $\mu$ or not. In most applications you are interested in the actual rate, so you look at $\lambda$, but you could perfectly well look at $\mu$ instead. It is just a matter of convention. — Felipe Pérez, May 08 '20 at 01:57
@0xbadf00d One reason is to be faithful to the intuition coming from the ergodic theorem (which Felipe used in his answer): Using logarithms allows one to interpret $\lambda(x)$ as the value to which the averages of $y\mapsto \log|T'(y)|$ along the orbit of $x$ converge. — Alp Uzman, Jul 20 '21 at 22:25
Another reason becomes more apparent in so-called higher rank actions (e.g. the $\mathbb{Z}_{\geq0}^2$ action generated by $x\mapsto 2x$ and $x\mapsto 3x$ on the circle). In this case Lyapunov exponents (~ "logarithms of eigenvalues") become linear functionals, and they play a role analogous to weights or roots in representation theory. Still, it's a matter of (taste in) convenience, as far as I can see. — Alp Uzman, Jul 20 '21 at 22:28
