
The following was stated over at MathOverflow when discussing applications of the implicit function theorem.

By changing coordinates you can make a simple function appear complicated. Have you ever asked yourself if the opposite is true: given a "complicated" function, can I make it look simpler in a neighborhood of a point by changing the coordinates near that point?

The implicit function theorem states that if that point is not a critical point, then you can find coordinates near that point such that, in these coordinates, the function is linear.

However, I do not quite see how this follows from the theorem I know: let $F:\mathbb{R}^{n+m}\to\mathbb{R}^m$ be continuously differentiable; then at a point $(\mathbf{a},\mathbf{b})$ such that $F(\mathbf{a},\mathbf{b})=\mathbf{0}$, the Jacobian $$J_{F,\mathbf{y}}(\mathbf{a},\mathbf{b})=\left[\frac{\partial F_{i}}{\partial y_{j}}(\mathbf{a},\mathbf{b})\right]$$ tells us whether, locally near $(\mathbf{a},\mathbf{b})\in U$, an implicit function $f:U\subset \mathbb{R}^n\to\mathbb{R}^m$ exists such that $F(\mathbf{x},f(\mathbf{x}))=\mathbf{0}$ for all $\mathbf{x}$ in the neighborhood $U$. Additionally, it provides a formula for the Jacobian of the implicit function $f$.
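
(For reference, the formula I have in mind is the standard one: $$J_f(\mathbf{x})=-\left[J_{F,\mathbf{y}}(\mathbf{x},f(\mathbf{x}))\right]^{-1}J_{F,\mathbf{x}}(\mathbf{x},f(\mathbf{x})),$$ where $J_{F,\mathbf{x}}$ denotes the $m\times n$ matrix of partial derivatives of $F$ with respect to the first $n$ variables.)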

Now, I cannot wrap my head around how you use this to show that you can make any multivariable function locally linear with respect to some coordinates.

Marsl

1 Answer


First of all, I hope you know that the inverse and implicit function theorems are equivalent. Having said this, I sometimes find it easier to work with one than the other, and in this case, I find it easier to work with the inverse function theorem, so that's what my answer will be based on (but because of the equivalence, this is just a matter of presentation).

Theorem:

Let $F: \Bbb{R}^n \times \Bbb{R}^m \to \Bbb{R}^m$ be $C^r$ ($r \geq 1$), and suppose that at some point $(a,b) \in \Bbb{R}^n \times \Bbb{R}^m$, the $m \times m$ Jacobian matrix $\dfrac{\partial F}{\partial y}(a,b)$ is invertible. Then, there is an open neighborhood $U$ of $(a,b)$ and a $C^r$ diffeomorphism $\phi: U \to \phi[U] \subset \Bbb{R}^n \times \Bbb{R}^m$ such that for all $(x,y) \in \phi[U]$, \begin{align} (F \circ \phi^{-1})(x,y) &= y. \end{align} In other words, if we let $\pi_2: \Bbb{R}^n \times \Bbb{R}^m \to \Bbb{R}^m$ be the projection onto the second factor (i.e. the $\Bbb{R}^m$ factor), then \begin{align} F \circ \left( \phi|_U \right)^{-1} &= \pi_2\bigg|_{\phi[U]}. \tag{$\ddot{\smile}$} \end{align}

Hence, by a change of coordinates on the domain, we managed to "transform" the map $F$ (which has a surjective derivative $DF_{(a,b)}$, since $\partial F/ \partial y(a,b)$ is invertible) into the canonical projection (which is linear).

The proof is as follows: define $\phi: \Bbb{R}^n \times \Bbb{R}^m \to \Bbb{R}^n \times \Bbb{R}^m$ by $\phi(x,y) = (x, F(x,y))$. Then, a simple calculation shows that the matrix representation of $D \phi_{(a,b)}$ is \begin{align} [D \phi_{(a,b)}] &= \begin{pmatrix} I_n & 0 \\ \frac{\partial F}{\partial x}(a,b) & \frac{\partial F}{\partial y}(a,b) \end{pmatrix}, \end{align} where the lower-left block $\frac{\partial F}{\partial x}(a,b)$ is irrelevant for what follows. The point here is that this $(m+n) \times (m + n)$ matrix is invertible: it is block lower-triangular, and both diagonal blocks $I_n$ and $\frac{\partial F}{\partial y}(a,b)$ are invertible. Hence, by the inverse function theorem, there exists a neighborhood $U$ of $(a,b)$ such that the restricted map $\phi: U \to \phi[U]$ is a $C^r$ diffeomorphism of open sets in $\Bbb{R}^n \times \Bbb{R}^m$.

Now, note that by definition of $\phi$, we have $F = \pi_2 \circ \phi$ on all of $\Bbb{R}^n \times \Bbb{R}^m$. From this, $(\ddot{\smile})$ is immediate: on $\phi[U]$ we have $F \circ \left(\phi|_U\right)^{-1} = \pi_2 \circ \phi \circ \left(\phi|_U\right)^{-1} = \pi_2$.
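
For concreteness, here is a toy instance of this construction (the specific $F$ is just an illustrative choice): take $n = m = 1$ and $F(x,y) = x^2 + y^2 - 1$, with $(a,b) = (0,1)$, so that $\frac{\partial F}{\partial y}(0,1) = 2$ is invertible. Then \begin{align} \phi(x,y) &= (x,\, x^2 + y^2 - 1), \\ \phi^{-1}(u,v) &= \left(u,\, \sqrt{1 + v - u^2}\right) \quad \text{(the branch with $y > 0$, valid near $(0,1)$)}, \\ (F \circ \phi^{-1})(u,v) &= u^2 + (1 + v - u^2) - 1 = v. \end{align} So, in the coordinates $(u,v) = \phi(x,y)$, the map $F$ really is the projection onto the second coordinate, and the level set $\{F = 0\}$ is straightened into the axis $\{v = 0\}$, which is precisely the graph of the implicit function $y = \sqrt{1 - x^2}$.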


Now, I'll state a couple of general theorems (whose proofs are similar in spirit to the one above: simply define a new function carefully and apply the inverse function theorem to it).

Theorem $\ddot{\smile}$

Let $M, N$ be manifolds of dimension $m, n$ respectively, let $f: M \to N$ be a $C^r$ map ($r \geq 1$), and let $p \in M$. Then, the tangent mapping $Tf_p : T_pM \to T_{f(p)}N$ determines the local behavior of $f$ in the following sense:

  • (Local immersion theorem for manifolds) If $Tf_p$ is an injective linear map (which in particular means $m \leq n$), then there exist open neighbourhoods $U, V$ with $p \in U \subset M$, $f(p) \in V \subset N$ and $f[U] \subset V$, and $C^r$ diffeomorphisms $\phi: U \to \phi[U] \subset \Bbb{R}^m$ and $\psi: V \to \psi[V] \subset \Bbb{R}^n$ such that \begin{align} \psi \circ f \circ \phi^{-1} = \iota \bigg|_{\phi[U]} \end{align} where $\iota: \Bbb{R}^m \to \Bbb{R}^n$, $\iota(x_1, \dots, x_m) = (x_1, \dots, x_m, 0,\dots, 0)$ is the canonical inclusion (a small worked example is given after this list).

  • (Local submersion theorem for manifolds). If $Tf_p$ is a surjective linear map (which in particular means $m \geq n$), then there exist open neighbourhoods $U, V$ with $p \in U \subset M$, $f(p) \in V \subset N$ and $f[U] \subset V$, and $C^r$ diffeomorphisms $\phi: U \to \phi[U] \subset \Bbb{R}^m$ and $\psi: V \to \psi[V] \subset \Bbb{R}^n$ such that \begin{align} \psi \circ f \circ \phi^{-1} = \pi \bigg|_{\phi[U]} \end{align} where $\pi: \Bbb{R}^m \to \Bbb{R}^n$ is the canonical projection onto the last $n$ coordinates.

  • (Inverse function theorem on manifolds... stated in a funny way). If $Tf_p$ is a linear isomorphism (which in particular means $m = n$), then there exist open neighbourhoods $U, V$ with $p \in U \subset M$, $f(p) \in V \subset N$ and $f[U] \subset V$, and $C^r$ diffeomorphisms $\phi: U \to \phi[U] \subset \Bbb{R}^m$ and $\psi: V \to \psi[V] \subset \Bbb{R}^n$ such that \begin{align} \psi \circ f \circ \phi^{-1} = \text{id} \bigg|_{\phi[U]} \end{align} where $\text{id}: \Bbb{R}^m \to \Bbb{R}^n = \Bbb{R}^m$ is the identity map.
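
To see the immersion statement in action, here is another toy example (again just an illustrative choice): let $f: \Bbb{R} \to \Bbb{R}^2$ be the parabola $f(t) = (t, t^2)$, so $m = 1$, $n = 2$, and $Tf_t$ is injective for every $t$. Take $\phi = \text{id}_{\Bbb{R}}$ and $\psi(x,y) = (x,\, y - x^2)$, which is a global diffeomorphism of $\Bbb{R}^2$ with inverse $(u,v) \mapsto (u,\, v + u^2)$. Then \begin{align} (\psi \circ f \circ \phi^{-1})(t) = \psi(t, t^2) = (t, 0) = \iota(t), \end{align} i.e. after straightening the target with $\psi$, the parabola becomes the canonical inclusion of the first coordinate axis. Similarly, the submersion statement, specialized to $M = \Bbb{R}^n \times \Bbb{R}^m$ and $N = \Bbb{R}^m$, is exactly the theorem proved above.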

In the last part, I stated the inverse function theorem in a weird way, only to highlight the similarity in the "structure" of the theorems. Notice that each theorem tells you how to go from the "infinitesimal" to the "local" (i.e. the behaviour of the derivative at a point determines the behaviour of the function near that point). So, if the derivative at a point is nice enough, then by a change of coordinates, the original map itself can be "transformed" into a simple linear map.

And in a sense, each of these theorems is the non-linear analogue of the basic linear algebra result about row and column reduction of an injective/surjective matrix (i.e. finding bases in which a linear map takes a particularly simple form).
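
Concretely, the linear-algebra statement lurking in the background is the rank normal form: if $T: \Bbb{R}^m \to \Bbb{R}^n$ is linear of rank $r$, then there exist invertible matrices $P$ (of size $m \times m$) and $Q$ (of size $n \times n$) such that \begin{align} Q\,[T]\,P &= \begin{pmatrix} I_r & 0 \\ 0 & 0 \end{pmatrix}. \end{align} When $T$ is injective ($r = m$) this normal form is the matrix of a canonical inclusion, and when $T$ is surjective ($r = n$) it is the matrix of a canonical projection; the theorems above say that a $C^r$ map whose derivative at a point is injective/surjective can be put into the same normal form near that point, at the price of a (generally nonlinear) change of coordinates.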

Hopefully the proof I gave above in the special case helps you appreciate the recurring theme that linear algebraic results + tedious $\epsilon$-$\delta$ analysis (to prove things like the inverse function theorem) $\implies$ useful and general calculus results on vector spaces and manifolds. If you want to see the proofs of these more general results, take a look at any book on differential topology/geometry; they are usually proven in one of the first few chapters. For instance, Guillemin and Pollack's Differential Topology, or Abraham, Marsden, and Ratiu's Manifolds, Tensor Analysis, and Applications.


Btw, your statement of the implicit function theorem has a few errors (e.g. the definition of $U$ doesn't quite make sense), but I'm sure this is just a typo in an attempt to state the theorem quickly.

peek-a-boo