4

In my understanding, to make a rigorous use of the Leibniz notation, one must write either

$\frac{df(x)}{dx}$ $ \ \ \ $ i.e. $ \ \ $ $f'$ $ \ \ $ (which denotes a function)

or

$\frac{df(x)}{dx}(a)$ $ \ \ $ or $ \ \ $ $\frac{df(x)}{dx}|_{x=a}$ $ \ \ $ i.e. $ \ $ $f'(a)$ $ \ \ $ (which denotes a value)

(where $f$ is a function that is differentiable at $a$, and $x$ is just a placeholder/a bound variable)

Thus, I suppose that writing $\frac{df}{dx}$ for $f'$ or $\frac{df}{dx}|_{x=a}$ for $f'(a)$ is merely a common abuse of notation, since the $\frac d{dx}$ must be followed by a literal expression dependent on $x$, not by a function. (See also this question)

What disturbs me is that I sometimes see the notation $\frac{dy}{dx}$ for $\frac{df(x)}{dx}$ where $y=f(x)$ is a "dependent variable".

How can $y$ have any mathematical meaning? It is seemingly neither a function (because it equals $f$ evaluated at $x$) nor a constant number (because it depends on $x$). So it is apparently a weird mathematical object linked by convention to a variable called $x$.

(I think this illustrates one of the big problems with Leibniz notation: it requires assigning fixed letters to the variables of a function, which is bogus since a function should be independent of the name given to its argument. The same problem occurs with the Leibniz notation for partial derivatives: if $f$ is a function $\mathbb{R}^2 \rightarrow \mathbb{R}$, then unlike the unambiguous notation $\partial_1f$ (for the partial derivative w.r.t. the first argument), the Leibniz notation $\frac{\partial f}{\partial \, r}$ presupposes that the first variable of the function will always be denoted by $r$)

So is there a rigorous way to define a "dependent variable", or is this just pseudo-mathematical quirkiness?

Edit: What is driving my question is that I have the impression that Leibniz's notation consistently treats everything as variables dependent on each other rather than as functions and arguments. For example, the chain rule $\frac{dy}{dx}=\frac{dy}{dg}\frac{dg}{dx}$ acts as if $y$ depends on $g$ even though $g$ is a function. I would like to know if there is a purely mathematical aspect behind it, perhaps related to something in higher math such as manifolds. (I don't know what manifolds are, and I don't even necessarily want to try to understand the mathematical definition of a "dependent variable"; I would just like to know if this rigorous mathematical aspect exists or not.)

Edit: This thread asks questions similar to mine, but I haven't found satisfactory answers on it.

Mr Jackie
  • 5
    $\frac{df}{dx}$ is another notation for $f'$, and $\frac{df(x)}{dx}$ is another notation for $f'(x)$ (the derivative of $f$ evaluated at $x$). The notation $\frac{df(x)}{dx}(a)$ is not correct and is never used. – littleO Nov 06 '21 at 23:56
  • Thx for your answer. According to the Wikipedia article I linked, "the derivative of the function $f$" (which is presumably synonymous with $f'$) can be written as $\frac{d(f(x))}{dx}$. Besides, $\frac{df}{dx}$ is a bizarre notation for $f'$ because the $x$ isn't involved (would $\frac{df}{dz}$ mean anything different ?) Moreover, $\frac{df(x)}{dx}$ for $f'(x)$ is also bizarre because that would imply writing $f'(0.5)$ as $\frac{df(0.5)}{d0.5}$ – Mr Jackie Nov 07 '21 at 00:06
  • 6
    Lots of mathematical notation is technically ambiguous and requires context to interpret correctly, but this isn't particularly problematic. We aren't writing computer programs, we're communicating with other humans. – Karl Nov 07 '21 at 01:49
  • 1
    In $\frac{dy}{dx}$ I like to think of both $x$ and $y$ as values that vary simultaneously, e.g. as functions of time $x(t)$ and $y(t)$. – Karl Nov 07 '21 at 01:51
  • I dislike writing the chain rule as $\frac{dy}{dx} = \frac{dy}{dg} \frac{dg}{dx}$ because two different functions are both being called by the same name $y$. This is a very common abuse of notation and it's a frequent source of confusion on this site. It would be preferable in my opinion to define $\hat y(x) = y(g(x))$ and then write the chain rule as $\frac{d\hat y}{dx} = \frac{dy}{dg} \frac{dg}{dx}$. Even better, in my opinion, would be to write the chain rule as $\hat y'(x) = y'(g(x)) g'(x)$. – littleO Nov 07 '21 at 03:24
  • I dislike it too, but I thought maybe there is a mathematical way to make it rigorous. – Mr Jackie Nov 07 '21 at 10:46
  • @Karl Again, this refers seemingly to the derivative of a function with respect to another function. This does not make any rigorous sense unless somehow we consider all letters as variables that depend on each other, rather than functions. – Mr Jackie Nov 07 '21 at 10:52
  • Instead of thinking of the variables as dependent or independent, think of them as just varying together subject to a constraint given by an equation relating the variables to each other. $dy/dx$ is the ratio of the rate of change of $y$ to the rate of change of $x$ as we move around on the graph of the given equation. This perspective is also natural when doing implicit differentiation. – Karl Nov 07 '21 at 15:45
  • There is a process where we think of all variables as dependent variables -- implicit differentiation. There is also a variant where we think of all but one variable as dependent variables (so we make the reduction $\frac{\mathrm{d}x}{\mathrm{d}x} = 1$ among the forest of primes). – Eric Towers Nov 07 '21 at 16:00
  • @littleO and Mr Jackie: There is a way to make it mathematically rigorous, as sketched in this answer (see "Notes"). – user21820 Nov 11 '21 at 20:22
  • @Karl: Indeed, in practical applications of implicit differentiation such as here, these variables (in the older sense and not the modern sense) are generally functions of time, and so having variables of the sort compatible with Leibniz notation (see my other linked post) makes it very intuitive to do implicit differentiation. – user21820 Nov 11 '21 at 20:25

4 Answers

6

Is it rigorous to write $\frac{dy}{dx}$ with $y=f(x)$ instead of $\frac{df(x)}{dx}$?

Mathematical notation is for communication. There is no such notion as "rigorous notation". As long as the context is clear and the definition is correctly stated, one can use whatever notation one likes. Of course, following convention matters, so that the intended communication is effective.

There are various notations of derivatives. People use different notations in different contexts.

  • In the Leibniz notation $\dfrac{dy}{dx}$, one can tell from the notation that $y$ is the dependent variable and $x$ is the independent variable.

  • Euler's notation $Df$ treats the derivative as an operator.

  • Newton notation $\dot{f}(x)$ and the Lagrange notation $f'(x)$ suggest that the derivative is more like a function.

  • People sometimes also use a combination of these versions of notations.

These notations are all useful in different scenarios:

  • When solving a simple ODE like $y'+y=\sin(x)$, the Leibniz notation allows you to formally manipulate "differentials".
  • In functional analysis, the Euler notation is convenient for statements regarding the derivative operators. For instance, one may phrase a question like "is $D:C^\infty(\mathbb{R})\to C^\infty(\mathbb{R})$ diagonalizable?"
  • The Newton notation is often used in differential geometry to denote derivatives with respect to the arc length parameter.
  • The Lagrange notation is a compact way to write derivatives. For instance: $\|ff'\|_{L^1}\le \|f\|_{L^2}\cdot \|f'\|_{L^2}$.

There is no good-for-all notation.


Can "dependent variables" be defined mathematically?

Mathematics is mostly carried out in natural language, which has no formal grammar. If one wants to do everything in a formal (which may be a more suitable word than "rigorous" in your question) way, one should study logic. In logic, a certain collection of expressions are chosen to be the "variables" at the outset; the semantics of the formal languages allow these variables to refer to various mathematical objects.

See also: Is there a way of defining the notion of a variable mathematically? and two answers there:

  • Thank you for taking the time to respond, but I find this a bit beside the point. My question was not about the validity of Leibniz's notation compared to other notations (although I must say that I strongly disapprove of it). Rather, I was wondering if it was possible to describe rigorously (or formally) what an "independent variable" is, an object that seems at first sight very strange because it is neither a function nor a constant value. Moreover, I was wondering if this related to a wider mathematical theory like manifolds. – Mr Jackie Nov 07 '21 at 14:01
  • @MrJackie: you ask the notation question explicitly in the title of your post, thus the answer. The question regarding formal definitions of "independent variables" is answered in the last part of my post. –  Nov 07 '21 at 14:26
0

First, to your very first claim. Neither $\frac{\mathrm{d}f(x)}{\mathrm{d}x}$ nor $f'$ is primary; the definition of the derivative is primary and all notations are shorthand references to that definition. Notation cannot be rigorous because notation is just syntax. Reasoning about the semantics of the objects described by syntax can be rigorous, but the syntax is just strings of symbols having whatever semantics we have defined them to have. Since we give the string of symbols "$\frac{\mathrm{d}f}{\mathrm{d}x}$" semantics, that string of symbols has that meaning. That string of symbols also makes it easier to communicate precisely about rigorous reasoning about the abstract process that string of symbols labels.

Additionally, there is a property of equality. If the equality $A = B$ holds, then in any expression, anywhere $A$ appears, it may be replaced with $B$ and vice versa. So, once we have $y = f(x)$, then \begin{align*} D_x(y) &= D_x(f(x)) \text{,} \\ \partial_1 y &= \partial_1 f(x) \text{,} \\ y' &= \left( f(x) \right)' = f'(x) \text{,} \\ y^{(1)} &= \left( f(x) \right)^{(1)} = f^{(1)}(x) \text{,} \\ \frac{\mathrm{d}y}{\mathrm{d}x} &= \frac{\mathrm{d}f(x)}{\mathrm{d}x} \text{, and} \\ \frac{\partial y}{\partial x} &= \frac{\partial f(x)}{\partial x} \text{.} \end{align*} Further, if we are working in calculus of a single variable, the above list of equivalences is also equivalent to $D(y) = D(f) = \partial_1 f = f' = f^{(1)} = \frac{\mathrm{d}f}{\mathrm{d}x} = \frac{\partial f}{\partial x}$, for the reason that in the calculus of a single variable, functions can only have one argument, so differentiation is with respect to that argument.

You say "the $\frac{\mathrm{d}}{\mathrm{d}x}$ must be followed by a literal expression dependant on $x$, not by a function." However, a literal expression dependent on $x$ does denote a function dependent on $x$. "$x^2$" is a function, dependent on $x$, that squares its argument. Further, the operator "$\frac{\mathrm{d}}{\mathrm{d}x}$" need only be followed by something that can be differentiated with respect to $x$. If we have "$y = f(x) = x^2$", then $y$, $f$, and $x^2$ are all functions depending explicitly on $x$. Consequently, the strings of symbols "$\frac{\mathrm{d}}{\mathrm{d}x} y$", "$\frac{\mathrm{d}y}{\mathrm{d}x}$", "$\frac{\mathrm{d}}{\mathrm{d}x}f(x)$", "$\frac{\mathrm{d}f(x)}{\mathrm{d}x}$", "$\frac{\mathrm{d}}{\mathrm{d}x} x^2$", and "$\frac{\mathrm{d} x^2}{\mathrm{d}x}$" all denote the same operation.

You write "[in "$y = f(x)$", $y$ is] neither a function (because it equals $f$ evaluated at $x$) nor a constant number (because it depends on $x$)". However, you can't have it both ways. Either $y$ is evaluatable for various values of $x$ or $f(x)$ is not. The equality establishes that the value on the left of the equation equals the value on the right of the equation. The expression on the right of the equation, "$f(x)$", evaluates when $x$ has a value. Consequently, $y$ evaluates when $x$ has a value.

The problem you are not exactly stating, which is "how do I know what the independent variable is if all I see is '$f$'?" is exactly the same problem one has with any partial derivative and can be made worse by using notation badly. What's $\frac{\partial f(x,x,x)}{\partial x}$? The meaning of that expression is not communicated clearly by that expression in isolation ... because the "$x$" in the $\partial x$ is required to be either the label of a formal parameter when the function $f$ was declared or is required to be the label of one argument in "$\partial f(x,x,x)$". You recite the notation $\partial_1 f$, which is frequently also given the notations $\partial_1 f = f_1 = f^{(1,0,0)}$ to improve the situation, to explicitly enumerate the formal parameter that is treated as the independent variable (and in the latter two notations I recite, to allow multiple derivatives without excessive repetition of the symbol "$\partial$").

Consequence: Leibniz notation is unambiguous when the function appearing in the "change of output" portion of the simplified difference quotient that is the notation $\frac{\mathrm{d}f}{\mathrm{d}x}$, is either nullary or unary, i.e. has zero or one formal parameter. Certainly, if $g$ is declared $g(y,z)$ then $\frac{\mathrm{d}g}{\mathrm{d}x}$ has the same problem that partial differentiation of $g$ with respect to $x$ has. However, $\frac{\mathrm{d}g(x,x)}{\mathrm{d}x}$ has no such ambiguity and has a definite value, unlike when we notate partial differentiation. This is one of the few places where explicitly listing the parameters to the function in the "top" of a derivative usefully communicates something.
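As a worked illustration of the last point, take the hypothetical choice $g(y,z) = yz^2$ (this particular $g$ is an assumption made for the example, not something from the question). Substituting $x$ into both slots gives a unary expression, and its total derivative is the sum of both partial derivatives evaluated on the diagonal: \begin{align*} g(x,x) &= x \cdot x^2 = x^3 \text{,} \\ \frac{\mathrm{d}g(x,x)}{\mathrm{d}x} &= 3x^2 \text{,} \\ \partial_1 g(y,z) &= z^2 \text{,} \qquad \partial_2 g(y,z) = 2yz \text{,} \\ \partial_1 g(x,x) + \partial_2 g(x,x) &= x^2 + 2x^2 = 3x^2 \text{.} \end{align*} By contrast, "$\frac{\partial g(x,x)}{\partial x}$" could plausibly be read as $x^2$, $2x^2$, or $3x^2$, depending on which occurrence of $x$ is meant.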

You say "...one of the big problems with Leibniz notation: it requires assigning fixed letters to the variables of a function, which is bogus since a function should be independent of the name given to its argument." This leaves you in a tight spot. Either you should be able to provide precise notation for a function that doesn't make a choice of formal parameter (for instance, you should be able to express $f(x) = x^2$ without any recourse to "$x$" or an equivalent), or you make a very clear argument that function definitions require a choice of formal parameter. If we choose to be imprecise about the formal parameter, then $\frac{\mathrm{d}f(x)}{\mathrm{d}t}$ suddenly becomes semantically valid -- it's the derivative of $f$ with respect to its only formal parameter, but it is ambiguous whether we intend to express the result in terms of the formal parameter $x$ or the formal parameter $t$.

In comments to the Question, you write "Moreover, $\frac{\mathrm{d}f(x)}{\mathrm{d}x}$ for $f'(x)$ is also bizarre because that would imply writing $f'(0.5)$ as $\frac{\mathrm{d}f(0.5)}{\mathrm{d}0.5}$". This objection is only valid if you make the same error in both "$f'(0.5)$" and "$\frac{\mathrm{d}f(0.5)}{\mathrm{d}0.5}$". In "$f'(0.5)$", there is the potential ambiguity: are you differentiating $f$ and then evaluating the resulting derivative function at $0.5$, or are you evaluating $f$ at $0.5$, obtaining a constant (hence constant function), then differentiating that to obtain $0$? The latter is useless, so no person communicating something worthwhile ever uses notation to express that process -- differentiate first. Additionally, $\frac{\mathrm{d}f}{\mathrm{d}x}$ is a function, so there should never be any confusion that $\frac{\mathrm{d}f}{\mathrm{d}x}(0.5)$ has the semantics $\left. \frac{\mathrm{d}f}{\mathrm{d}x}\right|_{x=0.5}$, that is, differentiate, then specialize the value of the argument. This is yet another argument in favor of the notation you seem to dislike.
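The two readings of "$f'(0.5)$" discussed above can be contrasted numerically. The following is only a rough sketch: the helper `derivative` (a central-difference approximation) and the sample function $f(x) = x^2$ are assumptions chosen for illustration, not part of the original discussion.

```python
def derivative(func, h=1e-6):
    """Return a function approximating func' by a central difference."""
    def d(x):
        return (func(x + h) - func(x - h)) / (2 * h)
    return d


def f(x):
    # Hypothetical sample function for the illustration.
    return x ** 2


# Reading 1: differentiate first, then evaluate the derivative at 0.5.
differentiate_then_evaluate = derivative(f)(0.5)

# Reading 2: evaluate first, then differentiate the resulting constant function.
c = f(0.5)


def constant(x):
    return c


evaluate_then_differentiate = derivative(constant)(0.5)

print(differentiate_then_evaluate)   # approximately 1.0
print(evaluate_then_differentiate)   # 0.0
```

Only the first reading carries any information about $f$, which is why "differentiate first" is the reading everyone intends.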

Eric Towers
0

One way to make sense of "variables" is to think of the mathematics as modeling an underlying physical reality. Imagine, say, a ball dropped from a cliff. The underlying reality is a set of states of the system. The "variables" are real valued functions on the states, often written with suggestive letters. So you have $t$ for the time, $d$ for the distance the ball has fallen, $v$ for the velocity, $a$ for acceleration, $k$ for kinetic energy ...

The values of these functions (the variables) are related. So when you say $d = f(t)$ you are really writing function composition: for each state $s$ of the world, $$ d(s) = f(t(s)). $$ In beginning calculus you are often told that relation in the form $$ d = 16t^2. $$ There is no intrinsic sense in which any of the variables is independent or dependent. You could equally easily write the relation between the variables (the functions) $d$ and $t$ as $$ t(s) = \frac{\sqrt{d(s)}}{4}. $$
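The "variables are functions on states" picture can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions: the `State` class, its field names, and the sample times are all hypothetical, chosen only to mirror the falling-ball example with $d = 16t^2$.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class State:
    """One state of the falling-ball system (hypothetical representation)."""
    time: float      # seconds since release
    distance: float  # feet fallen so far


# The "variables" t and d are real-valued functions on states.
def t(s: State) -> float:
    return s.time


def d(s: State) -> float:
    return s.distance


def f(time: float) -> float:
    """The numerical recipe relating the two variables: d = 16 t^2."""
    return 16 * time ** 2


# States the model allows: those satisfying the physical constraint.
states = [State(time=x, distance=16 * x ** 2) for x in (0.0, 0.5, 1.0, 2.0)]

# d = f(t) as a statement about functions on states: d(s) = f(t(s)) for every s.
d_is_f_of_t = all(math.isclose(d(s), f(t(s))) for s in states)

# The same relation read the other way: t(s) = sqrt(d(s)) / 4.
t_is_g_of_d = all(math.isclose(t(s), math.sqrt(d(s)) / 4) for s in states)

print(d_is_f_of_t, t_is_g_of_d)
```

Neither `t` nor `d` is privileged in this representation; which one is called "independent" is a choice about which relation one writes down.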

When you see simply $$ y = f(x) $$ you can think of the underlying set being modeled as the graph of the function $f$. For each point $P$ on the graph you have the two coordinate functions $x$ and $y$ whose values are related by the numerical recipe given by $f$.

The implicit assumption is that you are thinking of the $x$ coordinate as an independent "cause" that produces the point $(x,f(x))$ on the graph, so you say $x$ is the independent variable.

Ethan Bolker
0

Let's start with the function $f$. Generally, a function is defined as a set of ordered pairs such that $y_1=y_2$ whenever $(x,y_1)\in f$ and $(x,y_2)\in f$. (This is the bare-bones definition of a function, which ignores the codomain.) The notation $y=f(x)$ means nothing more or less than $(x,y)\in f$. Here $x$, $y$, etc. are arbitrary names for mathematical objects that may (or may not) occur as the first or second component of an ordered pair belonging to $f$. Usually, by writing notation such as $y=f(x)$, we are announcing that $x$ is in the domain of $f$ (the set of first components of the ordered pairs that constitute $f$) and $y$ is the unique element of the set of second components such that $(x,y)\in f$. It is this uniqueness property of a function that allows us to consistently use the convention that, while $x$ may be any element of the domain, $y$ is fixed as soon as $x$ is fixed.
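The bare-bones definition can be mirrored directly in code. The sketch below is illustrative only: the helper names `is_function` and `value_at`, and the sample sets, are assumptions introduced here and do not come from any library.

```python
def is_function(pairs):
    """True if no first component is paired with two different second components."""
    seen = {}
    for x, y in pairs:
        if x in seen and seen[x] != y:
            return False
        seen[x] = y
    return True


def value_at(f, x):
    """Return the unique y with (x, y) in f; raise KeyError if x is not in the domain."""
    for a, b in f:
        if a == x:
            return b
    raise KeyError(x)


# A function given as a set of ordered pairs: squaring on {-3, ..., 3}.
square = {(x, x * x) for x in range(-3, 4)}

# Not a function: the first component 1 is paired with both 2 and 3.
not_a_function = {(1, 2), (1, 3)}

print(is_function(square))          # True
print(is_function(not_a_function))  # False
print(value_at(square, 3))          # 9, i.e. "y = f(3)" means (3, 9) is in f
```

Here "$y = f(x)$" is literally the membership test `(x, y) in square`, and the uniqueness check is what lets `value_at` return a single answer once `x` is fixed.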

It is a tradition of mathematical writing that $x$ and $y$ are used in this way. “Dependent variable”, as you have observed, is not really a mathematical term. Rather, it is a description of a notational convention that we adhere to when discussing a particular function. It is the function that has mathematical reality; variables can be considered as names of mathematical objects, namely objects that, for the present at least, we have chosen not to fix. However, having announced the convention “$y=f(x)$”, once we have fixed $x$ (say $x=3$), then we are obliged to fix the value of $y$ accordingly (i.e. $y=f(3)$ in this case).

Continuing in this way with $y=f(x)$, where $f$ has a derivative $f'$, notation such as $\mathrm dy/\mathrm dx$, which means the same as $\mathrm df(x)/\mathrm dx$, can be thought of as a dependent variable: namely its meaning is fixed when $x$ is fixed. It is not correctly described as a function. Here the corresponding function is $f'$ or $x\mapsto\mathrm dy/\mathrm dx$.

In the case of partial derivatives, where we have (say) $z=f(x,y)$, it is similarly fine to write $\partial f(x,y)/\partial x$ or $\partial z/\partial x$. The function here is properly written $\partial_1f$. Unfortunately (in my view), you will come across notation such as “$\partial f/\partial x$”, which is intended to mean $\partial_1f$ despite its appearance as something depending on $x$.

John Bentin