12

I made the following observation:

Let $f(t):=\left(\begin{matrix} 0 &e^{it} \\ e^{-it} & 0 \end{matrix}\right),$ then $f(t)^2= \operatorname{id}$. Thus, we have $\frac{d}{dt}f(t)^2= \frac{d}{dt}\operatorname{id}=0.$ On the other hand, the chain rule yields $$\frac{d}{dt}f(t)^2= 2 f(t)f'(t)=0.$$

However, $f(t)$ and $f'(t)$ are both matrices of full rank, so their product cannot be zero. So something is very wrong here, no? Can anybody explain to me what just happened?

2 Answers

19

The chain rule obviously applies, and since answers/comments seem to imply that it is ill-defined, I think this answer may be useful.

The problem with your application of the chain rule is that the derivative of $X \mapsto X^2$ at a matrix $X_0$ is not $H \mapsto 2X_0H$, but instead $H \mapsto X_0H+HX_0$.*

This yields the derivative of your $g(t)=f(t)^2$ (identifying the linear map $g'(t):\mathbb{R} \to M_2(\mathbb{C})$ with its value at $1$, as is usually done) as

$$f(t)\cdot f'(t)+f'(t) \cdot f(t)$$ by the chain rule, which, as you can check, makes the computation work out.

*This is a simple computation, given by $$(X_0+H)^2=X_0^2+X_0H+HX_0+H^2=X_0^2+(X_0H+HX_0)+o(H),$$ and noting that the term in parentheses is linear in $H$.
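As a rough numerical sanity check of the footnote (not part of the original argument), the sketch below compares a central finite difference of $X \mapsto X^2$ with both candidate derivatives. The matrices `X0` and `H`, the step `eps`, and all helper names are arbitrary choices made for illustration only.

```python
# Sketch: compare the finite-difference derivative of X -> X^2 at X0 in
# direction H with X0*H + H*X0 and with the naive guess 2*X0*H.
# X0, H, eps and the helper names are illustrative assumptions.

def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def add(a, b):
    """Entrywise sum of two 2x2 matrices."""
    return [[a[i][j] + b[i][j] for j in range(2)] for i in range(2)]

def scale(c, a):
    """Scalar multiple of a 2x2 matrix."""
    return [[c * a[i][j] for j in range(2)] for i in range(2)]

def max_abs_diff(a, b):
    """Largest entrywise absolute difference."""
    return max(abs(a[i][j] - b[i][j]) for i in range(2) for j in range(2))

X0 = [[1.0, 2.0], [3.0, 4.0]]
H = [[0.0, 1.0], [1.0, 0.0]]   # chosen so that X0 and H do not commute
eps = 1e-6

# Central difference: ((X0 + eps*H)^2 - (X0 - eps*H)^2) / (2*eps)
Xp = add(X0, scale(eps, H))
Xm = add(X0, scale(-eps, H))
finite_diff = scale(1 / (2 * eps),
                    add(matmul(Xp, Xp), scale(-1, matmul(Xm, Xm))))

symmetric = add(matmul(X0, H), matmul(H, X0))   # H -> X0*H + H*X0
naive = scale(2, matmul(X0, H))                 # H -> 2*X0*H

err_symmetric = max_abs_diff(finite_diff, symmetric)
err_naive = max_abs_diff(finite_diff, naive)
print("error vs X0*H + H*X0:", err_symmetric)
print("error vs 2*X0*H:     ", err_naive)
```

The first error should be at the level of rounding noise, while the second stays of order one, because the two formulas agree only when $X_0$ and $H$ commute.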

5

Trying to use the chain rule leads us into a morass of missing definitions (what does it mean to differentiate a function whose input is a matrix, for example), but we can see something of what happens when we use the product rule:

$$\frac{d}{dt} (f(t)g(t)) = f'(t)g(t) + f(t)g'(t) $$ Here we need to remember that we're talking about matrices, so the order of factors is important. For example, we cannot replace the $f(t)g'(t)$ term with $g'(t)f(t)$ and expect its value to stay the same.

For $f(t)^2$ this gives us $$ \frac{d}{dt} f(t)^2 = f'(t)f(t) + f(t)f'(t) $$ In contrast to the usual commutative case, the two terms cannot be combined into one. For your particular example we have $$ f(t) = \begin{pmatrix} 0 & e^{it} \\ e^{-it} & 0 \end{pmatrix} \qquad\qquad f'(t) = \begin{pmatrix} 0 & ie^{it} \\ -ie^{-it} & 0 \end{pmatrix} $$ And we get $$ f'(t)f(t) = \begin{pmatrix} i&0\\ 0 & -i \end{pmatrix} \qquad\qquad f(t)f'(t) = \begin{pmatrix} -i&0\\ 0 & i \end{pmatrix} $$ whose sum is clearly the zero matrix.
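The computation above can also be checked numerically. The following is only a sketch using Python's standard `cmath` module; the sample value of `t` and the helper names are assumptions made for illustration.

```python
# Sketch: evaluate f(t) and f'(t) from the example at one sample t and
# compare f'(t)f(t) + f(t)f'(t) with the naive 2 f(t) f'(t).
# The value of t and the helper names are illustrative assumptions.
import cmath

def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def max_abs(a):
    """Largest absolute value among the entries."""
    return max(abs(a[i][j]) for i in range(2) for j in range(2))

def f(t):
    return [[0, cmath.exp(1j * t)], [cmath.exp(-1j * t), 0]]

def f_prime(t):
    return [[0, 1j * cmath.exp(1j * t)], [-1j * cmath.exp(-1j * t), 0]]

t = 1.3
F, dF = f(t), f_prime(t)

square = matmul(F, F)              # should be the identity matrix
left = matmul(dF, F)               # f'(t) f(t)
right = matmul(F, dF)              # f(t) f'(t)
product_rule = [[left[i][j] + right[i][j] for j in range(2)]
                for i in range(2)]
naive = [[2 * right[i][j] for j in range(2)] for i in range(2)]

print("max |f'f + ff'| =", max_abs(product_rule))
print("max |2 f f'|    =", max_abs(naive))
```

Up to floating-point rounding, the product-rule sum vanishes while $2f(t)f'(t)$ has entries of modulus $2$, which matches the hand computation.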

  • 8
    (Explaining the downvote): I downvoted because this answer avoids the question, which is: "why is this not a counterexample to the chain rule?". The answer is that the chain rule is being applied in a wrong manner, not that the chain rule will lead to missing definitions. Also, your comment on the question is very misleading, particularly the parenthesis. – Aloizio Macedo Apr 07 '17 at 22:36
  • 6
    I disagree with the downvotes: this answer does not avoid the question. OP asks "So something is very wrong here, no? Can anybody explain to me what just happened?" NOT "why is this not a counterexample to the chain rule?", which is a question that one of the downvoter subjectively assumed. Moreover, "For example, we cannot replace the $f(t)g'(t)$ term with $g'(t)f(t)$ and expect its value to stay the same." explains why OP gets "something is very wrong". –  Dec 09 '17 at 03:06
  • @Jack The title of the question is explicitly "counterexample to the chain rule", which leads me to believe that this is of special interest for OP. The question itself is based on a proposed usage of the chain rule. The questions which you refer to are referring to said "usage". The questions have a clear context: The chain rule is giving to me a different result than what is expected. Why? – Aloizio Macedo Dec 09 '17 at 03:28
  • @Jack On a different note, regardless of what you or I think, I don't think it is appropriate to put an emphasis in someone's question as you did in a somewhat arbitrary fashion (I think this is very frowned upon). I will thus reverse your edit. If you feel this is unwarranted, please consider taking this to meta where we can discuss more appropriately. – Aloizio Macedo Dec 09 '17 at 03:30
  • 6
    "... which leads me to believe that this is of special interest for OP." Neither you or I can know OP's real intension, can we? I would say that you gave a possible explanation of what was in OP's mind, which forms an interesting question regarding the chain rule. However, this does not support your criticism of Henning's answer: "this answer avoids the question", which is what you assume, not OP literally says in the body of the post. –  Dec 09 '17 at 03:40
  • @Jack We frequently ask for context for that specific reason: to try to know what OP's real intention is. To me, it is clear. Thus the downvote. If there was no context and if the title was blank, with the question being preceded by that formula alone, I would agree with you. Even then, the first lines of this answer serve no purpose but confuse the unaware. – Aloizio Macedo Dec 09 '17 at 04:08