The concepts of derivative and differential have always confused me. While reviewing the analysis of several variables, I found the following definition in a book:

Let $X, Y$ be Banach spaces, $U\subset X$ an open set, and $f: U\to Y$ a map that is differentiable at a point $a\in U$. The (unique) linear map $A: X\to Y$ satisfying \eqref{eq1} is called the derivative of $f$ at the point $a$ and is denoted by $df(a):= A: X\to Y$. If $f$ is differentiable at every point of $U$, then the map $$df:U\to\mathcal{L}(X,Y) \qquad a\mapsto df(a)$$ is called the differential of $f$.

$$\forall\;\varepsilon>0\;\exists\:\delta>0\;\forall\: h\in X, \text{ such that } 0<\left\lVert h \right\rVert_{X}<\delta \Rightarrow a+h\in U \text{ and } \frac{\left\lVert f(a+h)-f(a)-Ah \right\rVert_{Y}}{\left\lVert h\right\rVert_{X}}<\varepsilon\tag{1}\label{eq1}$$

Across the site there are several posts about the difference between derivative and differential. To quote a few (as a reference):

What is the practical difference between a differential and a derivative?
Differential vs Derivative
Are the differential and derivative of a single-variable function exactly the same thing?

But despite the good answers, I was still having trouble understanding the difference between these two concepts, until I came across the definition above. So, I would like to see if I really understand these concepts (my interest is restricted to functions $f:\mathbb{R}^m\to\mathbb{R}^n$). I thought of the following example:

Let $f:\mathbb{R}^2\to\mathbb{R}^2$ be defined by $f(x,y)=e^x(\cos y,\sin y)$. Then the differential is given by

$$df:\mathbb{R}^2\to\mathcal{L}(\mathbb{R}^2 ,\mathbb{R}^2)\quad\text{where}\quad df= \begin{pmatrix} e^x\cos y & -e^x\sin y \\ e^x\sin y & e^x\cos y \\ \end{pmatrix} $$

The derivative, on the other hand, "only makes sense" at a point. So, taking the point $(0,2\pi)$, the derivative at this point is given by

$$df(0,2\pi)= \begin{pmatrix} 1 & 0 \\ 0 & 1 \\ \end{pmatrix} $$ which is a linear transformation from $\mathbb{R}^2$ to $\mathbb{R}^2$.
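
A minimal sketch of the same distinction in plain Python, assuming the Jacobian computed by hand above. The helper names `df` and `apply_linear` are illustrative only and do not come from any library: `df` is the function that sends a point to a linear map (stored as a matrix), while `df(0, 2π)` is one particular linear map.

```python
import math

def f(x, y):
    """The map from the example: f(x, y) = e^x (cos y, sin y)."""
    return (math.exp(x) * math.cos(y), math.exp(x) * math.sin(y))

def df(x, y):
    """The differential: sends a point (x, y) to the Jacobian matrix there."""
    ex = math.exp(x)
    return [[ex * math.cos(y), -ex * math.sin(y)],
            [ex * math.sin(y),  ex * math.cos(y)]]

def apply_linear(A, h):
    """Apply the linear map with 2x2 matrix A to the vector h."""
    return tuple(A[i][0] * h[0] + A[i][1] * h[1] for i in range(2))

# df is a function of the point; A = df(a) is a single linear map.
a = (0.0, 2 * math.pi)
A = df(*a)

# Sanity check: for a small h, f(a + h) - f(a) should be close to A h.
h = (1e-6, -2e-6)
fa = f(*a)
fah = f(a[0] + h[0], a[1] + h[1])
increment = (fah[0] - fa[0], fah[1] - fa[1])
linear_part = apply_linear(A, h)
```

Up to floating-point rounding, `A` is the identity matrix, and `increment` agrees with `linear_part` up to an error of order $\lVert h\rVert^2$.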

I would like to know if my example is correct, that is, if I managed to understand the difference between these concepts.

Mrcrg
  • In (1), you probably mean $a+h \in U$. And in your example, I would write it as $df_{(x,y)}$ being equal to that matrix (of course keeping in mind the relation between matrices and linear operators). – peek-a-boo Jun 21 '20 at 20:03
  • @peek-a-boo yes – Mrcrg Jun 21 '20 at 20:05
  • They are just distinguishing a function and the evaluation of a function. For example, $\sin$ represents the function but $\sin x$ is the evaluation of $\sin$ at $x$. – copper.hat Jun 21 '20 at 20:09
  • @copper.hat So, am I correct in my interpretation? – Mrcrg Jun 21 '20 at 21:04
  • I believe so. ${}$ – copper.hat Jun 21 '20 at 21:08
  • Somewhat related too: https://math.stackexchange.com/questions/3379173/ – Maximilian Janisch Jun 21 '20 at 21:11
  • I agree with what's written here. Your derivative at a point and your differential are correct, though I might still call the map $df$ the "derivative of $f$" so the terminology isn't clear cut – FShrike Sep 12 '23 at 19:00
  • I am skeptical that this distinction between derivative and differential is actually all that widespread, I’ve always used the terms more or less interchangeably, and added “at a point” (or otherwise clarified via notation) to specify if I am talking about the evaluation at a specific point. – M W Sep 12 '23 at 21:27
  • I personally like to distinguish between the differential $\mathrm df$ and "a derivative" by saying that a derivative for $f$ is some function $D$ such that for some fixed bilinear function $M$ we have $\mathrm df(x)(h) = M(D(x), h)$. In other words, a derivative is something you "multiply by" to get the differential. This is consistent with your definitions above by taking $M$ to be matrix multiplication, and is also consistent with the term "derivative" used in single-variable real and complex analysis as well as when it's used for various "matrix derivatives". – Nicholas Todoroff Sep 13 '23 at 16:33

1 Answer

There seems to be no generally accepted separation between the terms differential and derivative. Some authors distinguish between them, others do not. Here are two examples:

R. Courant: Introduction to Calculus and Analysis, I and II

R. Courant clearly separates these terms and provides a detailed and careful explanation already in the one-dimensional case. In section 2.8 (j), "The Approximation of Functions by Linear Functions. Definition of Differentials", the author defines the differential as the dominant linear part with respect to the increment $\Delta x$.

From section 2.8 (j): The derivative of a function $y=f(x)$ was defined by \begin{align*} f^{\prime}(x)=\lim_{h\to 0}\frac{f(x+h)-f(x)}{h}=\lim_{\Delta x\to 0}\frac{\Delta y}{\Delta x} \end{align*} where $\Delta x=h$. If for a fixed $x$ and a variable $h$, we define a quantity $\varepsilon$ by \begin{align*} \varepsilon(h)=\frac{f(x+h)-f(x)}{h}-f^{\prime}(x)=\frac{\Delta y}{\Delta x}-f^{\prime}(x), \end{align*} then the fact that $f^{\prime}(x)$ is the derivative of $f$ at the point $x$ amounts to the equation \begin{align*} \lim_{h\to 0}\varepsilon(h)=0 \end{align*} The quantity $\Delta y=f(x+h)-f(x)$ represents the change or increment in the value of the dependent variable $y$ that results when the value $x$ of the independent variable is changed by the amount $\Delta x=h$. Since \begin{align*} \Delta y=f^{\prime}(x)\Delta x+\varepsilon \Delta x, \end{align*} the quantity $\Delta y$ appears as the sum of two parts, namely, a part $f^{\prime}(x)\Delta x$ which is proportional to $\Delta x$ and a part $\varepsilon \Delta x$ which can be made as small as we please compared to $\Delta x$ by making $\Delta x$ itself small enough. The dominant, linear part in the expression for $\Delta y$ we shall call the differential $dy$ of $y$ and write for it \begin{align*} dy=df(x)=f^{\prime}(x)\Delta x\tag{2} \end{align*}
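
As a small worked illustration of this decomposition (my own example, not taken from Courant's text), take $f(x)=x^2$. Expanding the increment gives
\begin{align*}
\Delta y=(x+\Delta x)^2-x^2=\underbrace{2x\,\Delta x}_{f^{\prime}(x)\Delta x}+\underbrace{\Delta x\cdot\Delta x}_{\varepsilon\,\Delta x},
\end{align*}
so here $\varepsilon(h)=h\to 0$ and the differential is $dy=2x\,\Delta x$. For instance, at $x=3$ with $\Delta x=0.1$ we get $\Delta y=0.61$, of which the linear part is $dy=0.6$ and the remaining part is $\varepsilon\,\Delta x=0.01$.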

More detailed information, including the two-dimensional case, is given in this answer.

W. Rudin: Principles of Mathematical Analysis

On the other hand, a remark by W. Rudin in his Principles points in the opposite direction. We find the definition of the derivative in section 9.11:

From section 9.11: Definition Suppose $E$ is an open set in $\mathbb{R}^n$, $\mathbf{f}$ maps $E$ into $\mathbb{R}^m$, and $\mathbf{x}\in E$. If there exists a linear transformation $A$ of $\mathbb{R}^n$ into $\mathbb{R}^m$ such that \begin{align*} \lim_{\mathbf{h}\to\mathbf{0}}\frac{|\mathbf{f}\left(\mathbf{x}+\mathbf{h}\right)-\mathbf{f}(\mathbf{x})-A\mathbf{h}|} {|\mathbf{h}|}=0,\tag{14} \end{align*} then we say that $\mathbf{f}$ is differentiable at $\mathbf{x}$, and we write \begin{align*} \mathbf{f}^{\prime}(\mathbf{x})=A. \end{align*} If $\mathbf{f}$ is differentiable at every $\mathbf{x}\in E$, we say that $\mathbf{f}$ is differentiable in $E$.

This section is followed by section 9.13, Remarks, where remarks (a) and (d) are of interest to us. Here the author explicitly states that derivative and differential are sometimes used interchangeably.

From section 9.13 (a) and (d):

  • Remark (a): The relation (14) can be rewritten in the form \begin{align*} \mathbf{f}(\mathbf{x}+\mathbf{h})-\mathbf{f}(\mathbf{x})=\mathbf{f}^{\prime}(\mathbf{x})\mathbf{h}+\mathbf{r}(\mathbf{h}) \tag{17} \end{align*} where the remainder $\mathbf{r}(\mathbf{h})$ satisfies \begin{align*} \lim_{\mathbf{h}\to\mathbf{0}}\frac{|\mathbf{r}(\mathbf{h})|}{|\mathbf{h}|}=0.\tag{18} \end{align*}
  • Remark (d): The derivative defined by (14) or (17) is often called the differential of $\mathbf{f}$ at $\mathbf{x}$, or the total derivative of $\mathbf{f}$ at $\mathbf{x}$, to distinguish it from the partial derivatives that will occur later.
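
Conditions (17) and (18) can be checked numerically. The following is a rough sketch in plain Python, assuming the map $f(x,y)=e^x(\cos y,\sin y)$ and the point $(0,2\pi)$ from the question; the helper `remainder_ratio` and the deliberately wrong matrix `A_wrong` are hypothetical names introduced only for this illustration.

```python
import math

def f(x, y):
    """The map from the question: f(x, y) = e^x (cos y, sin y)."""
    return (math.exp(x) * math.cos(y), math.exp(x) * math.sin(y))

def remainder_ratio(a, A, h):
    """Return |f(a+h) - f(a) - A h| / |h| using Euclidean norms."""
    fa = f(*a)
    fah = f(a[0] + h[0], a[1] + h[1])
    r = [fah[i] - fa[i] - (A[i][0] * h[0] + A[i][1] * h[1]) for i in range(2)]
    return math.hypot(r[0], r[1]) / math.hypot(h[0], h[1])

a = (0.0, 2 * math.pi)
A = [[1.0, 0.0], [0.0, 1.0]]        # candidate derivative at a (identity)
A_wrong = [[1.0, 0.0], [0.0, 2.0]]  # a different linear map, for contrast

steps = (1e-1, 1e-2, 1e-3)
# With the right A the ratio shrinks with |h|; with the wrong one it does not.
ratios = [remainder_ratio(a, A, (t, t)) for t in steps]
wrong_ratios = [remainder_ratio(a, A_wrong, (t, t)) for t in steps]
```

Along the direction $h=(t,t)$ the entries of `ratios` decrease roughly in proportion to $t$, while the entries of `wrong_ratios` stay near $1/\sqrt{2}$. This only samples one direction, so it illustrates (18) rather than proving it, but it shows why at most one linear map $A$ can satisfy (14).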
Markus Scheuer