Function-by-function derivatives commonly used in the physics literature

Question

In statistical physics, one is often presented with function-by-function derivatives. For instance, consider the derivation of the fundamental relation of thermodynamics from statistical physics.

Let the entropy be:

$$ S=\ln Z+\beta \overline{E} + \gamma \overline{V} \tag{1} $$

Then in most textbooks, one writes:

$$ dS= \left. \frac{\partial S}{\partial \overline{E}} \right|_{\overline{V}}d\overline{E} + \left. \frac{\partial S}{\partial \overline{V}} \right|_{\overline{E}} d\overline{V} $$

Then finally,

$$ dS= \beta d\overline{E} + \gamma d\overline{V} \tag{2} $$

The problem I have is that $\overline{E}$ and $\overline{V}$ are not variables, but functions of $\beta,\gamma$.

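$$ \begin{aligned} Z(\beta,\gamma)&=\sum_{q\in\mathbb{Q}} \exp(-\beta E(q)-\gamma V(q))\\ \overline{E}(\beta,\gamma)&=\frac{1}{Z(\beta,\gamma)}\sum_{q\in\mathbb{Q}} E(q)\exp(-\beta E(q)-\gamma V(q))\\ \overline{V}(\beta,\gamma)&=\frac{1}{Z(\beta,\gamma)}\sum_{q\in\mathbb{Q}} V(q)\exp(-\beta E(q)-\gamma V(q)) \end{aligned} $$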

Since these are functions, the 'verbose' equation for the entropy actually is:

$$ S(\beta,\gamma)=\ln Z(\beta,\gamma) +\beta \overline{E}(\beta, \gamma)+ \gamma \overline{V}(\beta,\gamma) $$

In this context, taking a derivative with respect to the function $\overline{E}(\beta,\gamma)$ doesn't make sense. A derivative is taken with respect to a variable of the function.

What is the correct process (notation and good definitions) to go from $S$ to $dS=\beta d\overline{E}+\gamma d\overline{V}$?

That is, starting with:

$$ S(\beta,\gamma)= \ln Z(\beta,\gamma)+\beta \overline{E}(\beta, \gamma)+ \gamma \overline{V}(\beta,\gamma) $$

and getting:

$$ dS(\beta,\gamma)=\beta d\overline{E}(\beta, \gamma)+ \gamma d\overline{V}(\beta,\gamma) $$

?

Surely, the function-by-function derivatives (A) and (B) of this expression are ill-defined:

$$ dS= \underbrace{\left. \frac{\partial S}{\partial \overline{E}} \right|_{\overline{V}} }_{\text{A}}d\overline{E} +\underbrace{ \left. \frac{\partial S}{\partial \overline{V}}\right|_{\overline{E}} }_{\text{B}}d\overline{V} $$

Let me make an attempt using proper definitions:

$$ \begin{align} S(\beta,\gamma)&=\ln Z(\beta,\gamma) +\beta \overline{E}(\beta, \gamma)+ \gamma \overline{V}(\beta,\gamma)\\
d S(\beta,\gamma)&= \left[ \frac{\partial (\ln Z(\beta,\gamma)+\beta \overline{E}(\beta, \gamma)+ \gamma \overline{V}(\beta,\gamma))}{\partial \beta} \right] d\beta + \left[ \frac{\partial (\ln Z(\beta,\gamma)+\beta \overline{E}(\beta, \gamma)+ \gamma \overline{V}(\beta,\gamma))}{\partial \gamma} \right] d\gamma\\
&= \left[ \frac{\partial \ln Z(\beta,\gamma)}{\partial \beta}+ \frac{\partial (\beta \overline{E}(\beta, \gamma))}{\partial \beta}+ \frac{\partial (\gamma \overline{V}(\beta,\gamma))}{\partial \beta} \right] d\beta \\
&\quad\quad + \left[ \frac{\partial \ln Z(\beta,\gamma)}{\partial \gamma}+ \frac{\partial (\beta \overline{E}(\beta, \gamma))}{\partial \gamma} + \frac{\partial (\gamma \overline{V}(\beta,\gamma))}{\partial \gamma} \right] d\gamma\\
&= \left[ \frac{\partial \ln Z(\beta,\gamma)}{\partial \beta}+ \frac{\partial \beta}{\partial \beta} \overline{E}(\beta,\gamma) + \beta \frac{\partial \overline{E}(\beta,\gamma)}{\partial \beta}+ \frac{\partial \gamma}{\partial \beta} \overline{V}(\beta,\gamma)+ \gamma \frac{\partial \overline{V}(\beta,\gamma)}{\partial \beta} \right] d\beta \\
&\quad\quad + \left[ \frac{\partial \ln Z(\beta,\gamma)}{\partial \gamma}+ \frac{\partial \beta}{\partial \gamma} \overline{E}(\beta,\gamma) + \beta \frac{\partial \overline{E}(\beta,\gamma)}{\partial \gamma}+ \frac{\partial \gamma}{\partial \gamma} \overline{V}(\beta,\gamma)+ \gamma \frac{\partial \overline{V}(\beta,\gamma)}{\partial \gamma} \right] d\gamma\\
&= \left[ \frac{\partial \ln Z(\beta,\gamma)}{\partial \beta}+ \overline{E}(\beta,\gamma) + \beta \frac{\partial \overline{E}(\beta,\gamma)}{\partial \beta}+ \gamma \frac{\partial \overline{V}(\beta,\gamma)}{\partial \beta} \right] d\beta \\
&\quad\quad + \left[ \frac{\partial \ln Z(\beta,\gamma)}{\partial \gamma}+ \beta \frac{\partial \overline{E}(\beta,\gamma)}{\partial \gamma}+ \overline{V}(\beta,\gamma)+ \gamma \frac{\partial \overline{V}(\beta,\gamma)}{\partial \gamma} \right] d\gamma \end{align} $$

Can this be simplified further?

Some thoughts: I think that starting from your equation $(1)$ is how you get to the $dS$ you are looking for, because $\bar{E}$ and $\bar{V}$ are the independent variables there. $\beta$ is the inverse temperature and $\gamma$ is some other fixed constant. If $\beta$ and $\gamma$ are fixed constants of your system, you shouldn't consider $\bar{E}$ and $\bar{V}$ functions of $\beta,\gamma$. These are presumably just given numbers, not variables. If this is the case, the entire problem goes away — DWade64, Aug 29 '19 at 19:49

DWade64 · Answer 1 · 2019-08-29T19:26:56.773

I'm getting confused, but let me say this:

Consider a 3-variable function $f$ taking some domain element to some range element $f: (x,y,z) \mapsto w$ or $w = f(x,y,z)$. There's nothing wrong with writing the total differential, defined as

$$ df = \frac{\partial}{\partial x}f\; dx + \frac{\partial}{\partial y}f \; dy + \frac{\partial}{\partial z}f\; dz$$

where $d$'s are understood as differentials and not derivatives. Therefore $d(x)$ or $dx$ indicates the differential of the identity function $x \mapsto x$ and $d(y)$ or $dy$ indicates the differential of the identity function $y \mapsto y$, etc. Just as $d(e^x)$ would indicate the differential of the function $x \mapsto e^x$.

Now I tell you that $x, y$ and $z$ depend on $\beta, \gamma$. Everything I've written down so far is still good: $f$ has some domain and some mapping. But suppose I write $f(\alpha_1 (\beta, \gamma), \alpha_2(\beta, \gamma), \alpha_3(\beta,\gamma))$, where $x = \alpha_1(\beta,\gamma)$, $y = \alpha_2(\beta,\gamma)$ and $z = \alpha_3(\beta,\gamma)$. First, note that I'm careful not to write $f(x(\beta,\gamma), y(\beta,\gamma), z(\beta, \gamma))$ with $x = x(\beta, \gamma)$ etc. Why? This would be overloading. I've already told you that the symbols $x,y,z$ are domain elements of $f$, so I can't then use $x$ to mean both a domain element of $f$ and the name of a function (you can, and it's done often, but I'm trying to be precise here).

The larger point I wish to make is that this second function is completely different from the first. It is a composite function: it has a totally different domain than $f$, and the mapping is different. If you want me to find $df$, I would write down the first equation above. If you wanted me to find the differential of the completely different function $f(\alpha_1 (\beta, \gamma), \alpha_2(\beta, \gamma), \alpha_3(\beta,\gamma))$, I would write down something different. Let the name of the composite function be $g(\beta,\gamma) := f(\alpha_1 (\beta, \gamma), \alpha_2(\beta, \gamma), \alpha_3(\beta,\gamma))$. Then

$$ dg = \frac{\partial}{\partial \beta}g\; d\beta + \frac{\partial}{\partial \gamma}g \; d\gamma$$

Or using the chain rule:

$$ dg = \underbrace{\Bigg( \frac{\partial f}{\partial x} \frac{\partial \alpha_1}{\partial \beta} + \frac{\partial f}{\partial y} \frac{\partial \alpha_2}{\partial \beta} + \frac{\partial f}{\partial z}\frac{\partial \alpha_3}{\partial \beta}\Bigg) d\beta}_{\text{first term of equation just above}} + \underbrace{\Bigg( \frac{\partial f}{\partial x} \frac{\partial \alpha_1}{\partial \gamma} + \frac{\partial f}{\partial y} \frac{\partial \alpha_2}{\partial \gamma} + \frac{\partial f}{\partial z}\frac{\partial \alpha_3}{\partial \gamma}\Bigg) d\gamma}_{\text{second term of equation just above}}$$
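
The chain-rule expansion can be checked mechanically. Below is a minimal sympy sketch. The specific $f$ and $\alpha_i$ are hypothetical, chosen only for illustration. It differentiates the composite $g(\beta,\gamma)$ directly and compares the result with the sum of products written above.

```python
import sympy as sp

beta, gamma = sp.symbols("beta gamma", positive=True)
x, y, z = sp.symbols("x y z")

# Hypothetical outer function f(x, y, z) and inner functions alpha_i(beta, gamma)
f = x * y + sp.exp(z)
alpha = {x: beta * gamma, y: beta + gamma, z: beta**2}

# Composite function g(beta, gamma) = f(alpha_1, alpha_2, alpha_3)
g = f.subs(alpha)

def chain_rule_partial(var):
    """Sum over i of (df/dx_i evaluated at alpha) * d(alpha_i)/d(var)."""
    return sum(
        sp.diff(f, xi).subs(alpha) * sp.diff(alpha[xi], var)
        for xi in (x, y, z)
    )

# Compare direct differentiation of g with the chain-rule expansion
direct = [sp.diff(g, v) for v in (beta, gamma)]
chained = [chain_rule_partial(v) for v in (beta, gamma)]
residuals = [sp.simplify(d - c) for d, c in zip(direct, chained)]
print(residuals)  # [0, 0]
```

Note that `f` and `g` are distinct objects with different free symbols, which is the point being made about composite functions.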

So the question becomes: how exactly is $S$, the name of the entropy function, defined? (Or is it actually the output of an unnamed entropy function? As it stands in your first equation, it's the output. Who cares, let's overload.) Are we talking about a function $S: (\bar{E}, \bar{V}) \mapsto S$? If so, there is nothing wrong with what you did in your first section. Is it a single-variable composite function in section $2$? (Remember that a composite function is not the outer function. You should ideally give composite functions different names because they truly are different functions: two functions are the same only if they have the same domain and the same mapping.) I don't understand what $\frac{\partial}{\partial \bar{E}(\beta)}$ is. If you're just taking the derivative with respect to $\beta$, use $\frac{\partial}{\partial \beta}$.

Likewise in section $3$ I have similar confusions. I don't know what $\frac{\partial}{\partial \bar{E}(\beta,\gamma)}$ is. When you take any derivative, anywhere, anytime, it's with respect to a single variable in mind. Gradients, or "total derivatives", or partial derivatives, or divergences, are just a bunch of single derivatives. Hopefully this is useful to you. Notational overload is common in physics. Take the wave function $\Psi$. It's common to see the "wave function in position space" written as $\Psi := \langle x | \Psi \rangle$, which doesn't make any sense to the beginner because we just overloaded $\Psi$ with two meanings.

I don't see how (A) and (B) are ill-defined derivatives if $\bar{E}$ and $\bar{V}$ are independent variables of your function $S$. If our starting point is $S = \ln Z + \beta \bar{E} + \gamma \bar{V}$, where $S$ is a function of $Z$, $\bar{E}$ and $\bar{V}$, then $dS$ is basically what you say it is. I will consider it a function of $Z$ too.

$$dS = \beta d\bar{E} + \gamma d\bar{V} + \frac{1}{Z}dZ \tag{1}$$
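
A small sympy sketch of this reading, assuming $Z$, $\bar{E}$ and $\bar{V}$ are independent symbols and $\beta,\gamma$ are held as fixed parameters (the names `E_bar` and `V_bar` are my own):

```python
import sympy as sp

# Independent variables of S in this reading
Z, E_bar, V_bar = sp.symbols("Z E_bar V_bar", positive=True)
# Assumed to be fixed parameters here, not variables
beta, gamma = sp.symbols("beta gamma", positive=True)

S = sp.log(Z) + beta * E_bar + gamma * V_bar

# Coefficients of dE_bar, dV_bar and dZ in the total differential dS
partials = {v: sp.diff(S, v) for v in (E_bar, V_bar, Z)}
print(partials)
```

The three partial derivatives are the coefficients of $d\bar{E}$, $d\bar{V}$ and $dZ$ in the equation above.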

If we form the composite function $S(Z(\beta, \gamma), \bar{E}(\beta,\gamma), \bar{V}(\beta,\gamma))$, mathematically speaking, this is a very different function. Its domain is different: it is two-dimensional, consisting of whatever values $(\beta, \gamma)$ can take. I'll denote it $S_{\text{comp}}(\beta, \gamma)$. In this case you are right, (A) and (B) in your question are ill-defined derivatives. $\frac{\partial S_{\text{comp}}}{\partial \bar{E}}$ and $\frac{\partial S_{\text{comp}}}{\partial \bar{V}} $ are ill-defined because our composition $S_{\text{comp}}$ is not a function of $\bar{E}$ and $\bar{V}$. Continuing,

$$ dS_{\text{comp}} = \frac{\partial S_{\text{comp}}}{\partial \beta}d\beta + \frac{\partial S_{\text{comp}}}{\partial \gamma}d\gamma $$

Or,

$$ dS_{\text{comp}} = \Bigg(\frac{1}{Z}\frac{\partial Z}{\partial \beta} + \bar{E}(\beta,\gamma) + \beta \frac{\partial \bar{E}}{\partial \beta} + \gamma\frac{\partial \bar{V}}{\partial \beta} \Bigg) d\beta + \Bigg(\frac{1}{Z}\frac{\partial Z}{\partial \gamma} + \beta \frac{\partial \bar{E}}{\partial \gamma} + \bar{V}(\beta,\gamma) + \gamma\frac{\partial \bar{V}}{\partial \gamma} \Bigg) d\gamma \tag{2}$$
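
Whether $(2)$ collapses further depends on how $Z$ is defined. Assuming the usual $Z=\sum_q \exp(-\beta E(q)-\gamma V(q))$ with $\bar{E}$ and $\bar{V}$ normalized by $Z$ (an assumption on my part), one gets $\frac{1}{Z}\frac{\partial Z}{\partial \beta}=-\bar{E}$ and $\frac{1}{Z}\frac{\partial Z}{\partial \gamma}=-\bar{V}$. The bare $\bar{E}$ and $\bar{V}$ terms in $(2)$ then cancel, leaving $dS_{\text{comp}}=\beta\, d\bar{E}+\gamma\, d\bar{V}$, where $d\bar{E}$ and $d\bar{V}$ are now the differentials of the functions $\bar{E}(\beta,\gamma)$ and $\bar{V}(\beta,\gamma)$. A minimal sympy sketch on a hypothetical three-state system (the state values are made up) checks this:

```python
import sympy as sp

beta, gamma = sp.symbols("beta gamma", positive=True)

# Hypothetical microstates as (E(q), V(q)) pairs, chosen only for illustration
states = [(0, 1), (1, 2), (3, 1)]

# Assumed definitions: Boltzmann-like weights, partition function, normalized averages
weights = [sp.exp(-beta * E - gamma * V) for E, V in states]
Z = sum(weights)
E_bar = sum(E * w for (E, _), w in zip(states, weights)) / Z
V_bar = sum(V * w for (_, V), w in zip(states, weights)) / Z

# Composite entropy as a function of (beta, gamma) only
S_comp = sp.log(Z) + beta * E_bar + gamma * V_bar

# Coefficients of d(beta) and d(gamma) in dS_comp - (beta dE_bar + gamma dV_bar);
# both should vanish if the relation holds
residuals = [
    sp.simplify(
        sp.diff(S_comp, v) - beta * sp.diff(E_bar, v) - gamma * sp.diff(V_bar, v)
    )
    for v in (beta, gamma)
]
print(residuals)
```

This only checks one toy system, so it is a sanity check rather than a proof. The cancellation argument above is what carries over to the general case.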

A derivative with respect to a function most likely means derivative with respect to the output variable of the function. If I have a function $w = f(t)$ and I write $\frac{d}{d f(t)}$ I most likely mean $\frac{d}{dw}$ where there is some other function depending on $w$. It seems $\bar{E}$ in your question is overloaded as a function name and an output. Not a problem, but it can be very confusing if you aren't aware of it — DWade64, Aug 29 '19 at 16:17
Using your exposition, it sounds like the fundamental relation of thermodynamics $dS$ does not follow from $S$ as stated in physics textbooks (even with the function-by-function derivatives). Even on Wikipedia https://en.wikipedia.org/wiki/Fundamental_thermodynamic_relation they use the function-by-function derivative (bottom of page). Is the relation even correct? — Anon21, Aug 29 '19 at 18:38
I am not familiar with the physics anymore, but I have added to my answer — DWade64, Aug 29 '19 at 19:14
