I know that a function $f: \mathbb{R}^n \rightarrow \mathbb{R}^m$ is called totally differentiable at a point $a\in \mathbb{R}^n$ if there exists a linear transformation $T_a$ such that
\begin{equation}
\lim_{h\rightarrow 0} \frac{\|f(a+h)-f(a) - T_a(h)\|}{\|h\|} = 0.
\end{equation}
I also know that this expression is derived from Taylor's theorem.
My question is: if this definition is supposed to be analogous to differentiability of a function in the single-variable setup, then how does this expression justify it?
In short, how does this definition capture differentiability as we know it?
Also, here we have used the linear transformation $T_a$ as the derivative of the function. How do we know that the derivative of this function will be a linear transformation?
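To make the limit in the definition concrete, here is a small numerical sketch. The function $f$, the point $a$, and the hand-computed Jacobian below are illustrative choices of mine, not taken from the question; the point is only to watch the ratio $\|f(a+h)-f(a)-T_a(h)\|/\|h\|$ shrink as $\|h\| \to 0$.

```python
import numpy as np

# Illustrative example (my own choice): f(x, y) = (x^2 + y, x*y), a = (1, 2).
def f(v):
    x, y = v
    return np.array([x**2 + y, x * y])

a = np.array([1.0, 2.0])

# Candidate linear map T_a: the Jacobian of f at a, computed by hand.
# d(x^2+y)/dx = 2x, d(x^2+y)/dy = 1; d(xy)/dx = y, d(xy)/dy = x.
T = np.array([[2 * a[0], 1.0],
              [a[1],     a[0]]])

# Check that ||f(a+h) - f(a) - T h|| / ||h|| -> 0 as ||h|| -> 0,
# along a fixed direction v with shrinking step size t.
v = np.array([3.0, -1.0])
v = v / np.linalg.norm(v)

ratios = []
for t in [1e-1, 1e-2, 1e-3, 1e-4]:
    h = t * v
    err = f(a + h) - f(a) - T @ h
    ratios.append(np.linalg.norm(err) / np.linalg.norm(h))

print(ratios)  # the ratios shrink roughly linearly in ||h||
```

Here the error $f(a+h)-f(a)-T_a(h)$ is quadratic in $h$, so the ratio decays like $\|h\|$; any non-linear candidate for $T_a$ would leave the ratio bounded away from $0$.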
-
Take a look at this and this. – peek-a-boo Feb 11 '21 at 06:51
-
@peek-a-boo I got it, thanks. But why do we have an error term for multivariable functions when for single-variable functions we don't? – QuantumOscillator Feb 11 '21 at 07:02
-
who says we don't – peek-a-boo Feb 11 '21 at 07:03
-
@peek-a-boo okay, but why $T_a$ needs to be linear map? – QuantumOscillator Feb 11 '21 at 07:53
-
it needs to be linear because that's the definition. Why do we make that definition? Because it's the most natural generalisation of the single-variable case. I'm pretty sure I explained the connection between these two in my other answer. – peek-a-boo Feb 11 '21 at 15:48
1 Answer
Consider functions ${ f,g : \mathbb{R} \to \mathbb{R} }$ differentiable at ${ p \in \mathbb{R} }.$ We can say “${ f(x) \approx g(x) \text{ near } p}$” if the tangent line approximations at ${ p }$ agree, that is ${ f(p) + f'(p) (x-p) = g(p) + g'(p) (x-p) ,}$ that is ${ f(p) = g(p), f'(p) = g'(p) }.$
Now consider functions ${ f,g : \mathbb{R} ^{n} \to \mathbb{R} }$ and a point ${ p \in \mathbb{R} ^n }.$ We can say “${ f(x) \approx g(x) \text{ near } p }$” if for every unit vector ${ e _i }$ the slices ${ t \mapsto f(p + t e _i) , t \mapsto g(p + t e _i) }$ agree approximately near ${ 0 }$ in the above 1D sense. (Say all directional derivatives ${ \frac{d}{d t} \big\vert _{t=0} f(p + t e _i) , \frac{d}{d t} \big\vert _{t=0} g(p + t e _i) }$ exist).
This condition is equivalent to asking ${ f(p) = g(p) },$ and $${\frac{d}{d t} \Bigg\vert _{t=0} f(p + t e _i) = \frac{d}{d t} \Bigg\vert _{t=0} g(p + t e _i) }$$ that is $${ \lim _{t \to 0} \frac{f(p+ t e _i) - g(p + t e _i)}{\lVert t e _i \rVert } = 0 }$$ for all unit vectors ${ e _i }.$
This suggests the following more general definition.
Def: Consider functions ${ f, g : \mathbb{R} ^n \to \mathbb{R} ^m }$ and a point ${ p \in \mathbb{R} ^n }.$ For the purposes here, we say “${ f(x) \approx g(x) \text{ near } p }$” if ${ f(p) = g(p) }$ and the error $${ \epsilon(h) := f(p + h) - g(p+h) }$$ satisfies $${ \lim _{\lVert h \rVert \to 0} \frac{\epsilon(h)}{\lVert h \rVert} = 0 .}$$
(Informally, ${ \lim _{\lVert h \rVert \to 0} \frac{\epsilon(h)}{\lVert h \rVert} = 0 }$ means the error ${ \epsilon(h) = f(p+h) - g(p+h) }$ goes to ${ 0 }$ faster than ${ \lVert h \rVert }.$)
Now the derivative of ${ f }$ at ${ p }$ can be defined as the linear part of an affine map which approximates ${ f }$ near ${ p }$ in the above sense.
Def: Consider a function ${ f : \mathbb{R} ^n \to \mathbb{R} ^m }$ and a point ${ p \in \mathbb{R} ^n }.$ If there exists an affine map which approximates ${ f }$ near ${ p }$ in the above sense, we say ${ f }$ is differentiable at ${ p }.$
In this case, since both maps agree at ${ p },$ the affine map must be of the form ${ x \mapsto f(p) + L (x-p) }.$ The ${ L }$ can be shown to be unique, and it is called the derivative of ${ f }$ at ${ p }.$
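As a numerical sanity check on this uniqueness claim, here is a small sketch (the function, point, and candidate maps below are illustrative assumptions of mine): for the correct ${ L }$ (the Jacobian) the error ratio shrinks with the step size, while for a wrong linear candidate it settles at a nonzero constant.

```python
import numpy as np

# Illustrative example (my own choice): f(x, y) = sin(x) + y^2, p = (0.5, 1).
def f(v):
    x, y = v
    return np.array([np.sin(x) + y**2])

p = np.array([0.5, 1.0])

# The correct L is the Jacobian row [cos(x), 2y] at p; L_bad is an
# arbitrary wrong linear candidate.
L_good = np.array([[np.cos(p[0]), 2 * p[1]]])
L_bad = np.array([[1.0, 1.0]])

def ratio(L, t, v):
    """||f(p+h) - f(p) - L h|| / ||h|| for the step h = t*v."""
    h = t * v
    return np.linalg.norm(f(p + h) - f(p) - L @ h) / np.linalg.norm(h)

v = np.array([1.0, 1.0]) / np.sqrt(2.0)
for t in (1e-2, 1e-4, 1e-6):
    print(ratio(L_good, t, v), ratio(L_bad, t, v))
# The L_good ratio shrinks with t; the L_bad ratio stays near a
# nonzero constant, so L_bad does not approximate f in the above sense.
```

Only one linear map can make the ratio vanish: if two did, their difference would be a linear map sent to ${ 0 }$ faster than ${ \lVert h \rVert },$ which forces it to be the zero map.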