
Suppose the Euclidean norm of a matrix $A$ is defined as

$$\|A\| = \sqrt{\mbox{tr}(A^H A)},$$

where $A^H$ is the conjugate transpose of $A$.

What is the derivative of $\|A\|$ with respect to the matrix $A$ itself?

I am looking for a general formula.

I have a solution from my professor, but I could not find a reference for it. According to him,

$\frac{\partial \|A\|}{\partial A} = \frac{2A}{2\sqrt{\mbox{tr}(A^H A)}} = \frac{A}{\|A\|}$.

Is this solution valid, or am I missing something?

3 Answers


For ease of typing, I'll use the following notation:

$$\eqalign{ X:Y &= {\rm tr}(X^TY)\cr f &= \|A\| \cr A^H &= (A^T)^*\cr }$$

Treating $A$ and $A^*$ as independent variables yields the Wirtinger derivatives:

$$\eqalign{ f^2 &= {\rm tr}(A^HA) = A^*:A \cr 2f\,df &= A^*:dA \cr \frac{\partial f}{\partial A} &= \frac{A^*}{2f} \,\,\implies \frac{\partial f}{\partial A^*} = \frac{A}{2f} \cr }$$

If $A\in{\mathbb R}^{m\times n}$, then the standard derivative process yields

$$\eqalign{ f^2 &= {\rm tr}(A^TA) = A:A \cr 2f\,df &= 2A:dA \cr \frac{\partial f}{\partial A} &= \frac{A}{f} \cr }$$

which appears to be what your professor had in mind.
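The real-matrix formula $\partial f/\partial A = A/f$ can be sanity-checked numerically. The following is only a minimal sketch, assuming NumPy is available; the helper names `frobenius_norm` and `finite_difference_gradient` are illustrative and do not come from the answer. It compares the closed form against entrywise central differences on a random real matrix.

```python
import numpy as np

def frobenius_norm(A):
    """Norm from the question: sqrt(tr(A^H A))."""
    return np.sqrt(np.trace(A.conj().T @ A).real)

def finite_difference_gradient(f, A, h=1e-6):
    """Central-difference estimate of df/dA[i, j] for a real matrix A."""
    grad = np.zeros_like(A, dtype=float)
    for idx in np.ndindex(A.shape):
        E = np.zeros_like(A, dtype=float)
        E[idx] = h  # perturb a single entry
        grad[idx] = (f(A + E) - f(A - E)) / (2 * h)
    return grad

rng = np.random.default_rng(0)
A = rng.standard_normal((3, 4))  # arbitrary nonzero real test matrix

numeric = finite_difference_gradient(frobenius_norm, A)
analytic = A / frobenius_norm(A)  # the closed form A / ||A||

max_error = np.max(np.abs(numeric - analytic))
print("largest entrywise discrepancy:", max_error)
```

The check only makes sense for $A \neq 0$, since the norm is not differentiable at the zero matrix.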

Update

Some more detail on the Wirtinger derivatives.

The full differential contains terms for both $A^*$ and $A$:

$$\eqalign{ 2f\,df &= A^*:dA + A:dA^* \cr }$$

When $A^*$ is held constant, $dA^*=0$, leaving

$$\eqalign{ 2f\,df &= A^*:dA \cr df &= \frac{A^*}{2f}:dA \cr \frac{\partial f}{\partial A} &= \frac{A^*}{2f} \cr }$$

Conversely, if $A$ is held constant, then $dA=0$ and

$$\eqalign{ 2f\,df &= A:dA^* \cr df &= \frac{A}{2f}:dA^* \cr \frac{\partial f}{\partial A^*} &= \frac{A}{2f} \cr }$$

Finally, if $A$ is real, then $A=A^*,\,$ $dA=dA^*,\,$ and

$$\eqalign{ 2f\,df &= 2A:dA \cr df &= A:dA \cr \frac{\partial f}{\partial A} &= \frac{A}{f} \cr }$$
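The two Wirtinger derivatives can also be checked numerically. This is a hedged sketch, assuming NumPy and the usual definitions $\partial/\partial a = \tfrac12(\partial/\partial x - i\,\partial/\partial y)$ and $\partial/\partial a^* = \tfrac12(\partial/\partial x + i\,\partial/\partial y)$ for each entry $a = x + iy$; the function names are illustrative, not part of the answer.

```python
import numpy as np

def frobenius_norm(A):
    """sqrt(tr(A^H A)), computed as the root of the sum of |a_ij|^2."""
    return np.sqrt(np.sum(np.abs(A) ** 2))

def wirtinger_derivatives(f, A, h=1e-6):
    """Estimate df/dA and df/dA^* entrywise from real and imaginary perturbations."""
    d_A = np.zeros(A.shape, dtype=complex)
    d_A_conj = np.zeros(A.shape, dtype=complex)
    for idx in np.ndindex(A.shape):
        E = np.zeros(A.shape, dtype=complex)
        E[idx] = h
        df_dx = (f(A + E) - f(A - E)) / (2 * h)            # real-part direction
        df_dy = (f(A + 1j * E) - f(A - 1j * E)) / (2 * h)  # imaginary-part direction
        d_A[idx] = 0.5 * (df_dx - 1j * df_dy)
        d_A_conj[idx] = 0.5 * (df_dx + 1j * df_dy)
    return d_A, d_A_conj

rng = np.random.default_rng(1)
A = rng.standard_normal((2, 3)) + 1j * rng.standard_normal((2, 3))
f = frobenius_norm(A)

d_A, d_A_conj = wirtinger_derivatives(frobenius_norm, A)
err_A = np.max(np.abs(d_A - A.conj() / (2 * f)))    # compare with A^* / (2f)
err_A_conj = np.max(np.abs(d_A_conj - A / (2 * f)))  # compare with A / (2f)
print(err_A, err_A_conj)
```

Both discrepancies should be at the level of finite-difference error, which is consistent with $\partial f/\partial A = A^*/(2f)$ and $\partial f/\partial A^* = A/(2f)$.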

greg

It simply does not exist. Let $A$ be $1\times1$; then what you want is, in essence, a derivative like $\frac{d|z|}{dz}$, which we know does not exist. (I am using a fresher's definition of the derivative; if you have a generalized alternative definition, let us know!)
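The $1\times1$ case can be illustrated with a short numerical experiment. This is a sketch only, with an arbitrarily chosen point `z0` and an illustrative helper `difference_quotient`: for a complex-differentiable function the difference quotient would approach the same limit from every direction, whereas for $|z|$ it does not.

```python
import math

def difference_quotient(z0, direction, t=1e-6):
    """(|z0 + h| - |z0|) / h for the small complex step h = t * direction."""
    h = complex(t * direction)
    return (abs(z0 + h) - abs(z0)) / h

z0 = 1.0 + 1.0j  # arbitrary nonzero test point

along_real = difference_quotient(z0, 1.0)   # approach along the real axis
along_imag = difference_quotient(z0, 1.0j)  # approach along the imaginary axis

print("step along real axis:     ", along_real)
print("step along imaginary axis:", along_imag)
print("gap between the two:      ", abs(along_real - along_imag))
```

At this point the first quotient is close to $1/\sqrt2$ and the second is close to $-i/\sqrt2$, so no single complex number can serve as $\frac{d|z|}{dz}$.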

K. Sadri

Let us start with a simpler function,

$$G(X) = \frac{1}{2}\|X\|^2$$

We then have,

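$$\begin{aligned} G(X+V) &= \frac{1}{2}\|X+V\|^2 = \frac{1}{2}\operatorname{tr}\big((X+V)^H(X+V)\big) \\ &= \frac{1}{2}\big(\operatorname{tr}(X^HX)+\operatorname{tr}(X^HV)+\operatorname{tr}(V^HX)+\operatorname{tr}(V^HV)\big) \\ &= G(X) + G(V) + \operatorname{Re}\operatorname{tr}(X^HV), \end{aligned}$$

where the last step uses $\operatorname{tr}(V^HX) = \overline{\operatorname{tr}(X^HV)}$.

Since $G(V)$ is quadratic in $V$, the Fréchet derivative of $G$ at $X$ is the real-linear map $V \mapsto \operatorname{Re}\operatorname{tr}(X^HV)$, which reduces to $\operatorname{tr}(X^TV)$ for real matrices; in other words, the gradient of $G$ is $X$. But we are really interested in the derivative of the function $F(X) = \|X\|$. Since $G = \frac{1}{2}F^2$, the chain rule gives $G' = \|X\|\cdot F'$, so for $X \neq 0$ we finally have $F' = \frac{X}{\|X\|}$.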
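The resulting directional derivative of $F$ can be checked numerically. The sketch below assumes NumPy and takes the real part of the trace so that it also covers complex matrices (for real matrices this coincides with $\operatorname{tr}(X^TV)$); the variable names are illustrative. It compares $\operatorname{Re}\operatorname{tr}(X^HV)/\|X\|$ with a central difference of $F$ along a random direction $V$.

```python
import numpy as np

def F(X):
    """F(X) = ||X||; for a 2-D array np.linalg.norm defaults to the Frobenius norm."""
    return np.linalg.norm(X)

rng = np.random.default_rng(2)
X = rng.standard_normal((3, 3)) + 1j * rng.standard_normal((3, 3))  # point of evaluation
V = rng.standard_normal((3, 3)) + 1j * rng.standard_normal((3, 3))  # direction

t = 1e-6
numeric = (F(X + t * V) - F(X - t * V)) / (2 * t)  # central difference along V

# Directional derivative implied by the gradient X / ||X||
analytic = np.real(np.trace(X.conj().T @ V)) / F(X)

gap = abs(numeric - analytic)
print(numeric, analytic, gap)
```

Agreement up to finite-difference error supports reading $F' = X/\|X\|$ as the gradient with respect to the real inner product $\langle X, V\rangle = \operatorname{Re}\operatorname{tr}(X^HV)$.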

  • Could you please explain in more detail why the derivative of G is ||X||.F' and how you got the final solution? – KratosMath May 02 '18 at 12:05
  • I use the method described here http://thousandfold.net/cz/2013/11/12/a-useful-trick-for-computing-gradients-w-r-t-matrix-arguments-with-some-examples/ – Jürgen Sukumaran May 02 '18 at 12:08