
I'm trying to find the derivative of

$$|(L^TL - \sigma)|_1 = \mbox{Tr} \left( \sqrt{(L^TL - \sigma)^\dagger(L^TL - \sigma)} \right)$$

with respect to $L$, where $\dagger$ is the transpose conjugate and $\sigma$ is some matrix.

I tried doing this with differentials and ended up at $$\begin{align} &\partial\text{Tr}\left(\sqrt{(L^TL - \sigma)^\dagger(L^TL - \sigma)}\right) \\ &= \left(\frac{1}{2\sqrt{(L^TL - \sigma)^\dagger(L^TL - \sigma)}}\right)^T:\left(dX^\dagger (X - \sigma) + (X - \sigma)^\dagger dX\right) \end{align}$$

where $X = L^TL$. This doesn't look too promising as I eventually only want $dL$ terms. Could someone point out how to proceed? Thank you.

user1936752
  • 1,688

1 Answer


Define (writing $\Sigma$ for the question's $\sigma$) $$\eqalign{ M &= L^TL-\Sigma,\quad S = \big(M^TM\big)^{1/2},\quad \phi = \|M\|_* = {\rm Tr}(S) }$$ Then, assuming all the matrices are real and $M$ is invertible (so that $S^{-1}$ exists) $$\eqalign{ \frac{\partial\phi}{\partial L} &= LMS^{-1}+LS^{-1}M^T \cr }$$

The detailed calculations follow. $$\eqalign{ d\phi &= \tfrac{1}{2}(M^TM)^{-1/2}:d(M^TM) \cr &= M(M^TM)^{-1/2}:dM \cr &= MS^{-1}:(L^TdL + dL^TL) \cr &= \big(MS^{-1}+S^{-1}M^T\big):L^TdL \cr &= L\big(MS^{-1}+S^{-1}M^T\big):dL \cr \frac{\partial\phi}{\partial L} &= L\big(MS^{-1}+S^{-1}M^T\big) \cr }$$ where a colon represents the trace/Frobenius product, i.e. $$\eqalign{A:B = {\rm Tr}(A^TB)\cr}$$
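As a sanity check on the real-valued formula, here is a minimal NumPy sketch that compares $L\big(MS^{-1}+S^{-1}M^T\big)$ against a central finite-difference gradient of $\|L^TL-\Sigma\|_*$. The function names, the random test matrices, and the step size `h` are illustrative assumptions, not part of the derivation, and the check assumes $M$ is invertible so that $S^{-1}$ exists.

```python
import numpy as np

def nuclear_norm(L, Sigma):
    """phi = ||L^T L - Sigma||_* (sum of singular values)."""
    M = L.T @ L - Sigma
    return np.linalg.norm(M, ord="nuc")

def analytic_gradient(L, Sigma):
    """Gradient L (M S^{-1} + S^{-1} M^T), real case, M assumed invertible."""
    M = L.T @ L - Sigma
    # S = (M^T M)^{1/2}; invert it through the eigendecomposition of M^T M.
    w, V = np.linalg.eigh(M.T @ M)
    S_inv = V @ np.diag(1.0 / np.sqrt(w)) @ V.T
    K = M @ S_inv                      # K = M S^{-1}, so K.T = S^{-1} M^T
    return L @ (K + K.T)

def finite_difference_gradient(L, Sigma, h=1e-6):
    """Entry-by-entry central difference of phi with respect to L."""
    G = np.zeros_like(L)
    for i in range(L.shape[0]):
        for j in range(L.shape[1]):
            E = np.zeros_like(L)
            E[i, j] = h
            G[i, j] = (nuclear_norm(L + E, Sigma)
                       - nuclear_norm(L - E, Sigma)) / (2 * h)
    return G

# Hypothetical test data: random real matrices (Sigma need not be symmetric).
rng = np.random.default_rng(0)
n = 4
L = rng.standard_normal((n, n))
Sigma = rng.standard_normal((n, n))

G_exact = analytic_gradient(L, Sigma)
G_fd = finite_difference_gradient(L, Sigma)
max_err = np.max(np.abs(G_exact - G_fd))
print("max abs difference:", max_err)
```

If $M$ is close to singular, both the formula and the finite-difference estimate become unreliable, since the nuclear norm is not differentiable at rank-deficient $M$.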


Update

If all the matrices are complex, and Wirtinger derivatives are acceptable to you, then $$\eqalign{ M &= L^\dagger L-\Sigma,\quad S = \big(M^\dagger M\big)^{1/2},\quad \phi = {\rm Tr}(S) \cr \frac{\partial\phi}{\partial L} &= \tfrac{1}{2}L^*M^*S^{-1} \cr }$$ If $L$ is real (i.e. $L=L^*,\, L^\dagger=L^T$), and all the others are complex then $$\eqalign{ \frac{\partial\phi}{\partial L} &= \tfrac{1}{2}L\big(M^*S^{-1}+(S^*)^{-1}M^\dagger\big) \cr }$$

greg
  • 35,825
  • Thank you for the answer. In my case though $S = M^\dagger M$ and the matrices aren't real. But I will try to adapt your technique to the complex case. Thank you! – user1936752 Feb 25 '19 at 16:03
  • It's strange that you've used $L^TL$. Is $L$ real or complex? – greg Feb 25 '19 at 16:44
  • You're right, I think I should use $L^\dagger L$. Thank you so much for writing such detailed answers! Really appreciate it! – user1936752 Feb 25 '19 at 17:32