The problem:

Let $S \in \mathbb{C}^{N\times M}$ with $N > M$ and $S^{H}S=\mathbb{I}$, let $\rho \in \mathbb{C}^{M\times M}$ and $\sigma \in \mathbb{C}^{N\times N}$ be Hermitian matrices of trace $1$, and define the function $D: \mathbb{C}^{N\times M} \rightarrow \mathbb{R}$ as:

$$D(S) = \text{tr}(|S\rho S^{H} - \sigma|),$$

with $|X| = (X^{H}X)^{1/2}$ and $^H$ denoting the Hermitian transpose, i.e., $D$ is the trace distance. My goal is to compute $\nabla_S D(S)$, the gradient of $D$ w.r.t. $S$.

My approach:

I defined the following variables:

$$A = S\rho S^{H} - \sigma$$

$$B = A^H A.$$

$D$ then becomes:

$$D = \text{tr}(B^{1/2})$$

The goal is now to take the differential of $D$ and rearrange terms to eventually arrive at something like:

$$dD = \text{tr} (K dS),$$

with the transpose of $K$, $K^T$, being the gradient we're looking for.

My progress so far:

$$dD = d(\text{tr}(B^{1/2})) = \text{tr}(d(B^{1/2}))$$

$$dD = \frac{1}{2}\text{tr}(B^{-1/2}\, dB)$$

We have:

$$dB = (dA)^HA + A^HdA$$

And:

$$dA = dS\rho S^H + S\rho (dS)^H$$

I will now get terms with $dS$ and terms with $(dS)^H$ and I'm not sure how to manipulate them to get to an expression from which I can read out the gradient. Is this even the (or a) right approach?

1 Answer

You've done all the hard work. What remains is algebra: substitute the differentials and group the result into $dS$ terms and $dS^H$ terms.

$$\eqalign{
dB &= dA^HA + A^HdA \cr
&= (dS\,\rho S^H + S\rho\,dS^H)^HA + A^H(dS\,\rho S^H + S\rho\,dS^H) \cr
&= (S\rho^H\,dS^H + dS\,\rho^HS^H)A + A^H(dS\,\rho S^H + S\rho\,dS^H) \cr
&= (dS\,\rho^HS^HA + A^HdS\,\rho S^H) + (S\rho^H\,dS^HA + A^HS\rho\,dS^H) \cr
&= (dS-{\rm terms}) \quad+\quad (dS^H-{\rm terms}) \cr
}$$

For convenience, define

$$C = \tfrac{1}{2}\big(B^{-1/2}\big)^T$$

Then, with colons denoting the trace/Frobenius product, i.e. $\,A:B={\rm Tr}(A^TB)$,

$$\eqalign{
dD &= C:dB \cr
&= C:dS\,\rho^HS^HA + C:A^HdS\,\rho S^H + (dS^H-{\rm terms}) \cr
&= (CA^TS^*\rho^* + A^*CS^*\rho^T):dS \quad + ({\rm terms}):dS^H \cr
\frac{\partial D}{\partial S} &= CA^TS^*\rho^* + A^*CS^*\rho^T \cr
}$$

Because $D$ is real-valued, the gradient with respect to $S^H$ is simply the Hermitian transpose of the gradient with respect to $S$.

$$\eqalign{
\frac{\partial D}{\partial S^H} &= (CA^TS^*\rho^* + A^*CS^*\rho^T)^H \cr
}$$
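A formula like this is easy to get wrong by a transpose or a conjugate, so it can help to compare it against a finite difference. The following is only a minimal NumPy sketch under some assumptions that are not part of the derivation above: the function names (`trace_norm`, `grad_wrt_S`, `random_density`, `directional_check`) are made up for illustration, the test matrices are random positive semidefinite matrices of trace $1$, $B$ is assumed invertible, and $S$ is perturbed freely, i.e. the constraint $S^HS=\mathbb{I}$ is ignored. Since $D$ is real, the $dS^H$ terms are the complex conjugate of the $dS$ terms, so the predicted change is $dD = 2\,{\rm Re}\,{\rm Tr}(G^T dS)$ with $G = \partial D/\partial S$.

```python
import numpy as np

def trace_norm(S, rho, sigma):
    """D(S) = tr((A^H A)^{1/2}) with A = S rho S^H - sigma, i.e. the sum of singular values of A."""
    A = S @ rho @ S.conj().T - sigma
    return np.linalg.svd(A, compute_uv=False).sum()

def grad_wrt_S(S, rho, sigma):
    """Closed-form dD/dS = C A^T S^* rho^* + A^* C S^* rho^T, with C = (1/2) (B^{-1/2})^T."""
    A = S @ rho @ S.conj().T - sigma
    B = A.conj().T @ A
    # B is Hermitian; it is assumed positive definite so that B^{-1/2} exists.
    w, V = np.linalg.eigh(B)
    B_inv_sqrt = (V / np.sqrt(w)) @ V.conj().T
    C = 0.5 * B_inv_sqrt.T
    return C @ A.T @ S.conj() @ rho.conj() + A.conj() @ C @ S.conj() @ rho.T

def random_density(n, rng):
    """Hypothetical helper: a random Hermitian positive semidefinite matrix of trace 1."""
    X = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
    P = X @ X.conj().T
    return P / np.trace(P).real

def directional_check(seed=0, N=4, M=2, eps=1e-6):
    """Compare a central finite difference of D along dS with 2 Re Tr(G^T dS)."""
    rng = np.random.default_rng(seed)
    Z = rng.standard_normal((N, M)) + 1j * rng.standard_normal((N, M))
    S, _ = np.linalg.qr(Z)  # reduced QR: S is N x M with S^H S = I
    rho, sigma = random_density(M, rng), random_density(N, rng)
    dS = rng.standard_normal((N, M)) + 1j * rng.standard_normal((N, M))
    numeric = (trace_norm(S + eps * dS, rho, sigma)
               - trace_norm(S - eps * dS, rho, sigma)) / (2 * eps)
    G = grad_wrt_S(S, rho, sigma)
    # Tr(G^T dS) is the sum of the elementwise products G_ij * dS_ij.
    predicted = 2.0 * np.real(np.sum(G * dS))
    return numeric, predicted

numeric, predicted = directional_check()
print(numeric, predicted)
```

If the two printed numbers agree to several digits, the expression for $\partial D/\partial S$ is consistent with the function. Note that this checks the unconstrained derivative only; an optimization that must keep $S^HS=\mathbb{I}$ would still need to project or retract onto that constraint.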

greg