Derivative of the Euclidean norm with respect to matrix?

Question

Suppose the Euclidean norm of a matrix $A$ is defined as

$$\|A\| = \sqrt{\mbox{tr}(A^H A)}$$

where $A^H$ is the conjugate transpose of $A$.

What is the derivative of $\|A\|$ with respect to the matrix $A$ itself?

I am looking for a general formula.

Actually, I have a solution from my professor, but I couldn't find a reference for it. According to him,

$\frac{\partial \|A\|}{\partial A} = \frac{2A}{2\sqrt{\mbox{tr}(A^H A)}} = \frac{A}{\|A\|}$.

Is this solution valid, or am I missing something?

See this: https://math.stackexchange.com/questions/1481200/derivative-of-frobenius-norm — Alex Silva, May 02 '18 at 09:21
What about this: https://math.stackexchange.com/questions/701062/derivative-of-the-nuclear-norm-with-respect-to-its-argument?noredirect=1&lq=1 — Alex Silva, May 02 '18 at 09:47
http://thousandfold.net/cz/2013/11/12/a-useful-trick-for-computing-gradients-w-r-t-matrix-arguments-with-some-examples/ — Jürgen Sukumaran, May 02 '18 at 10:57

greg · Accepted Answer · 2018-05-02T13:27:38.553

For ease of typing, I'll use the following notation:

$$\eqalign{ X:Y &= {\rm tr}(X^TY)\cr f &= \|A\| \cr A^H &= (A^T)^*\cr }$$

Treating $A$ and $A^*$ as independent variables yields the Wirtinger derivatives:

$$\eqalign{ f^2 &= {\rm tr}(A^HA) = A^*:A \cr 2f\,df &= A^*:dA \cr \frac{\partial f}{\partial A} &= \frac{A^*}{2f} \,\,\implies \frac{\partial f}{\partial A^*} = \frac{A}{2f} \cr }$$

If $A\in{\mathbb R}^{m\times n}$, then the standard derivative process yields

$$\eqalign{ f^2 &= {\rm tr}(A^TA) = A:A \cr 2f\,df &= 2A:dA \cr \frac{\partial f}{\partial A} &= \frac{A}{f} \cr }$$

which appears to be what your professor had in mind.
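The real-matrix formula $\partial f/\partial A = A/f$ can be sanity-checked numerically. The following is only a sketch: the helper names `frobenius_norm` and `numerical_gradient`, the seed, and the matrix size are arbitrary choices for illustration, not part of the derivation above.

```python
import numpy as np

def frobenius_norm(A):
    """Return sqrt(tr(A^H A)), the norm defined in the question."""
    return np.sqrt(np.trace(A.conj().T @ A).real)

def numerical_gradient(func, A, eps=1e-6):
    """Central-difference estimate of d func / d A for a real matrix A."""
    grad = np.zeros_like(A, dtype=float)
    for idx in np.ndindex(A.shape):
        step = np.zeros_like(A, dtype=float)
        step[idx] = eps  # perturb one entry at a time
        grad[idx] = (func(A + step) - func(A - step)) / (2 * eps)
    return grad

rng = np.random.default_rng(0)    # arbitrary seed, for reproducibility only
A = rng.standard_normal((3, 4))   # arbitrary nonzero real test matrix

analytic = A / frobenius_norm(A)  # the claimed gradient A / ||A||
numeric = numerical_gradient(frobenius_norm, A)
max_error = np.max(np.abs(analytic - numeric))
print(max_error)
```

The printed discrepancy should be tiny compared with the entries of the gradient. The check is only meaningful for $A \neq 0$, where the norm is differentiable.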

Update

Some more detail on the Wirtinger derivatives.

The full differential contains terms for both $A^*$ and $A$:

$$\eqalign{ 2f\,df &= A^*:dA + A:dA^* \cr }$$

When $A^*$ is held constant, $dA^*=0$, leaving

$$\eqalign{ 2f\,df &= A^*:dA \cr df &= \frac{A^*}{2f}:dA \cr \frac{\partial f}{\partial A} &= \frac{A^*}{2f} \cr }$$

Conversely, if $A$ is held constant, then $dA=0$ and

$$\eqalign{ 2f\,df &= A:dA^* \cr df &= \frac{A}{2f}:dA^* \cr \frac{\partial f}{\partial A^*} &= \frac{A}{2f} \cr }$$

Finally, if $A$ is real, then $A=A^*,\,$ $dA=dA^*,\,$ and

$$\eqalign{ 2f\,df &= 2A:dA \cr df &= A:dA \cr \frac{\partial f}{\partial A} &= \frac{A}{f} \cr }$$
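As a rough numerical illustration of the full differential $2f\,df = A^*:dA + A:dA^*$, the sketch below compares the first-order prediction built from the two Wirtinger derivatives with the actual change in the norm for a small complex perturbation. The helper `colon`, the seed, and the sizes are assumptions made only for this example.

```python
import numpy as np

def colon(X, Y):
    """The product X:Y = tr(X^T Y) from the notation above (no conjugation)."""
    return np.sum(X * Y)

rng = np.random.default_rng(1)  # arbitrary seed
shape = (3, 2)                  # arbitrary size
A = rng.standard_normal(shape) + 1j * rng.standard_normal(shape)
dA = 1e-6 * (rng.standard_normal(shape) + 1j * rng.standard_normal(shape))

f = np.linalg.norm(A)           # Frobenius norm for 2-D input by default
grad_A = A.conj() / (2 * f)     # df/dA   = A^* / (2f)
grad_A_conj = A / (2 * f)       # df/dA^* = A   / (2f)

# First-order change predicted by the full differential.
predicted = (colon(grad_A, dA) + colon(grad_A_conj, dA.conj())).real
# Actual change in the norm under the perturbation dA.
actual = np.linalg.norm(A + dA) - f
print(predicted, actual)
```

The two terms of the prediction are complex conjugates of each other, so their sum is real, as it must be for a real-valued $f$. The remaining gap between `predicted` and `actual` is of second order in `dA`.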

Thanks for your explanation. Could you please elaborate on how 2fdf = A*:dA, and also on the derivative of f w.r.t. A? — KratosMath, May 02 '18 at 12:25

score 0 · Answer 2 · answered May 02 '18 at 11:11

It simply does not exist. Let $A$ be $1\times1$; then what you want is, in essence, a derivative like $\frac{d|z|}{dz}$, which we know does not exist. (I am using a fresher's definition of the derivative; if you have a generalized alternative definition, let us know!)
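A small numerical sketch of the $1\times1$ case: the difference quotient of $|z|$ at $z=1$ is evaluated along the real and the imaginary direction. The point $z=1$ and the step size `eps` are arbitrary choices for illustration.

```python
z = 1.0 + 0.0j  # arbitrary nonzero test point
eps = 1e-6      # arbitrary small step

# Difference quotient (|z + h| - |z|) / h with h along the real axis.
q_real = (abs(z + eps) - abs(z)) / eps
# The same quotient with h along the imaginary axis.
q_imag = (abs(z + 1j * eps) - abs(z)) / (1j * eps)

print(q_real)   # close to 1
print(q_imag)   # close to 0
```

Because the two quotients approach different values, the limit defining the complex derivative $\frac{d|z|}{dz}$ does not exist at this point, which is the sense of "derivative" used in this answer.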

K. Sadri

The subdifferential should exist, since the norm is a convex function. – Jürgen Sukumaran May 02 '18 at 11:14
I have the solution from my professor but I couldn't find a reference for that. I will edit the post so that you can see the solution of my prof. – KratosMath May 02 '18 at 11:21

Jürgen Sukumaran · Answer 3 · 2018-05-02T14:09:17.267

Let us start with a simpler function,

$$G(X) = \frac{1}{2}\|X\|^2$$

We then have,

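$$\begin{aligned} G(X+V) &= \frac{1}{2}\|X+V\|^2 = \frac{1}{2}\,\mathrm{tr}\big((X+V)^H(X+V)\big) \\ &= \frac{1}{2}\big(\mathrm{tr}(X^HX)+\mathrm{tr}(X^HV)+\mathrm{tr}(V^HX)+\mathrm{tr}(V^HV)\big) \\ &= G(X) + G(V) + \operatorname{Re}\,\mathrm{tr}(X^HV), \end{aligned}$$

where the last step uses $\mathrm{tr}(V^HX) = \overline{\mathrm{tr}(X^HV)}$. For real matrices the cross term is simply $\mathrm{tr}(X^TV)$.

Since $G(V)$ is of second order in $V$, this shows that the Fréchet derivative of $G$ at $X$ is the linear map $V \mapsto \operatorname{Re}\,\mathrm{tr}(X^HV)$. But we are really interested in the derivative of the function $F(X) = \|X\|$. Because $G = \frac{1}{2}F^2$, the chain rule gives that the derivative of $G$ is $\|X\|\cdot F'$, so for $X \neq 0$ the derivative of $F$ is $V \mapsto \operatorname{Re}\,\mathrm{tr}(X^HV)/\|X\|$, that is, $F' = \frac{X}{\|X\|}$.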
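The expansion of $G(X+V)$ and the chain-rule step can be checked numerically. This is a sketch under stated assumptions: for complex matrices the two cross terms $\frac{1}{2}(\mathrm{tr}(X^HV)+\mathrm{tr}(V^HX))$ combine into $\operatorname{Re}\,\mathrm{tr}(X^HV)$, which is what the code uses, and it reduces to $\mathrm{tr}(X^TV)$ for real matrices. The seed, sizes, and step `t` are arbitrary.

```python
import numpy as np

def G(X):
    """G(X) = 0.5 * ||X||^2 = 0.5 * tr(X^H X)."""
    return 0.5 * np.trace(X.conj().T @ X).real

rng = np.random.default_rng(2)  # arbitrary seed
n = 3                           # arbitrary size
X = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
V = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))

# Cross term of the expansion: Re tr(X^H V).
cross = np.trace(X.conj().T @ V).real

# The expansion G(X+V) = G(X) + G(V) + Re tr(X^H V) is exact, not just first order.
expansion_gap = abs(G(X + V) - (G(X) + G(V) + cross))

# Chain rule: G = F^2 / 2, so the derivative of F = ||.|| in direction V
# should be Re tr(X^H V) / ||X||. Compare with a central difference.
t = 1e-6
finite_diff = (np.linalg.norm(X + t * V) - np.linalg.norm(X - t * V)) / (2 * t)
predicted = cross / np.linalg.norm(X)
print(expansion_gap, finite_diff, predicted)
```

Identifying the linear map $V \mapsto \operatorname{Re}\,\mathrm{tr}(X^HV)/\|X\|$ with the matrix $X/\|X\|$ is what gives the final formula for $F'$.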

edited May 02 '18 at 14:09

answered May 02 '18 at 11:38

Could you please explain in more detail how the derivative of G is ||X||.F' and how you got the final solution? – KratosMath May 02 '18 at 12:05
I use the method described here http://thousandfold.net/cz/2013/11/12/a-useful-trick-for-computing-gradients-w-r-t-matrix-arguments-with-some-examples/ – Jürgen Sukumaran May 02 '18 at 12:08
