Let $\mathbf{X} \in \mathbb{R}^{n \times m}$ and $\mathbf{A} \in \mathbb{R}^{m \times n}$, where $(\cdot)^T$ denotes the transpose operator and $\|\cdot\|_F$ is the matrix Frobenius norm.

What is the value of $\frac{\partial ||\mathbf{X}^T -\mathbf{A} ||_F^2}{\partial \mathbf{X}}$ ?

Is it zero?

Please, any help?

Celina
  • 69
  • 1
    Consider the case when $n=1$ and $a=0$ (for example)... Is it true then? No. In this case you get $2x$ since this is the derivative of $x^2$ and the transpose operator does nothing when $n=1$. – Squirtle Oct 30 '16 at 15:36
  • The derivative of a matrix w.r.t. a matrix doesn't exist. – Ahmad Bazzi Oct 30 '16 at 16:02
  • What you said means you didn't read what I said: you have a scalar w.r.t. a matrix. I said the derivative of a matrix w.r.t. a matrix doesn't exist. – Ahmad Bazzi Oct 30 '16 at 16:07
  • Yes, $\mathbf{A}$ is of size $m \times n$. Sorry, that was an error in my LaTeX; I wrote my question too quickly. I have edited it. – Celina Oct 30 '16 at 16:09
  • @ElBazzi Why can't the derivative of a matrix with respect to a matrix exist? – vidyarthi Oct 30 '16 at 16:11
  • @ElBazzi OK, understood. It is somewhat similar to the Jacobian: the derivative is not taken explicitly with respect to the matrix, but with respect to each element of the matrix, right? – vidyarthi Oct 30 '16 at 16:15
  • @vidyarthi : The notation chosen by the OP is perhaps not very suggestive, but she surely means this : https://en.wikipedia.org/wiki/Fr%C3%A9chet_derivative. In a standard analysis course, you show that this Frechet derivative can be computed by evaluating the partial derivatives and forming the Jacobian. – Patrick Da Silva Oct 30 '16 at 17:34
  • I added my answer. I solved it. – Celina Oct 30 '16 at 17:36
  • Celina, I've added an answer to your earlier question on the same topic here http://math.stackexchange.com/questions/1987919/transposition-problems-inside-the-gradient-of-squared-l2-norm/1992848#1992848 which will close this one too. – rych Oct 31 '16 at 10:50
  • Yes thank you:) – Celina Oct 31 '16 at 14:47

2 Answers

2

The answer is $2(\mathbf{X} - \mathbf{A}^T)$.

You have

$$\begin{aligned}
\|\mathbf{X}^T - \mathbf{A}\|_F^2 &= \text{trace}\big((\mathbf{X}^T - \mathbf{A})(\mathbf{X}^T - \mathbf{A})^T\big) \\
&= \text{trace}\big((\mathbf{X}^T - \mathbf{A})(\mathbf{X} - \mathbf{A}^T)\big) \\
&= \text{trace}(\mathbf{X}^T\mathbf{X} - \mathbf{X}^T\mathbf{A}^T - \mathbf{A}\mathbf{X} + \mathbf{A}\mathbf{A}^T) \\
&= \text{trace}(\mathbf{X}^T\mathbf{X}) - \text{trace}(\mathbf{X}^T\mathbf{A}^T) - \text{trace}(\mathbf{A}\mathbf{X}) + \text{trace}(\mathbf{A}\mathbf{A}^T) \\
&= \text{trace}(\mathbf{X}\mathbf{X}^T) - \text{trace}((\mathbf{A}\mathbf{X})^T) - \text{trace}(\mathbf{A}\mathbf{X}) + \text{trace}(\mathbf{A}\mathbf{A}^T) \\
&= \text{trace}(\mathbf{X}\mathbf{X}^T) - 2\,\text{trace}(\mathbf{A}\mathbf{X}) + \text{trace}(\mathbf{A}\mathbf{A}^T).
\end{aligned}$$

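Keep in mind that $\text{trace}(\mathbf{A}\mathbf{B}) = \text{trace}(\mathbf{B}\mathbf{A})$ for any matrix $\mathbf{B}$ such that both products are defined, and that $\text{trace}(\mathbf{M}^T) = \text{trace}(\mathbf{M})$ for any square matrix $\mathbf{M}$.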

Then we have:

$\frac{\partial\, \text{trace}(\mathbf{X}\mathbf{X}^T)}{\partial \mathbf{X}} = 2\mathbf{X}$.

$\frac{\partial\, \text{trace}(\mathbf{A}\mathbf{X})}{\partial \mathbf{X}} = \frac{\partial\, \text{trace}(\mathbf{X}\mathbf{A})}{\partial \mathbf{X}} = \mathbf{A}^T$.

$\frac{\partial\, \text{trace}(\mathbf{A}\mathbf{A}^T)}{\partial \mathbf{X}} = 0$.
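The two non-trivial identities above can be checked entry by entry. As a sketch, write $X_{ji}$ for the entries of $\mathbf{X}$ (with $j = 1, \dots, n$ and $i = 1, \dots, m$) and $A_{ij}$ for the entries of $\mathbf{A}$; the index names are my own choice for illustration:

$$\text{trace}(\mathbf{X}\mathbf{X}^T) = \sum_{j=1}^{n}\sum_{i=1}^{m} X_{ji}^2 \quad\Longrightarrow\quad \frac{\partial\, \text{trace}(\mathbf{X}\mathbf{X}^T)}{\partial X_{ji}} = 2X_{ji},$$

$$\text{trace}(\mathbf{A}\mathbf{X}) = \sum_{i=1}^{m}\sum_{j=1}^{n} A_{ij}X_{ji} \quad\Longrightarrow\quad \frac{\partial\, \text{trace}(\mathbf{A}\mathbf{X})}{\partial X_{ji}} = A_{ij} = (\mathbf{A}^T)_{ji}.$$

Collecting the partial derivatives into a matrix of the same shape as $\mathbf{X}$ gives $2\mathbf{X}$ and $\mathbf{A}^T$ respectively.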

Therefore, $\frac{\partial ||\mathbf{X}^T - \mathbf{A}||_F^2}{\partial \mathbf{X}} = 2(\mathbf{X} - \mathbf{A}^T)$.
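If you want to convince yourself numerically, the closed form can be compared against central finite differences. The following is only a minimal sketch in plain Python (nested lists, no libraries); the function names `frob_sq`, `analytic_grad`, `numeric_grad` and `max_abs_diff`, the sizes $n = 3$, $m = 2$ and the step `h` are all my own choices for illustration.

```python
import random


def frob_sq(X, A):
    """Return ||X^T - A||_F^2 for X (n x m) and A (m x n), given as nested lists."""
    n, m = len(X), len(X[0])
    # (X^T)[i][j] is X[j][i], so sum the squared entries of X^T - A.
    return sum((X[j][i] - A[i][j]) ** 2 for i in range(m) for j in range(n))


def analytic_grad(X, A):
    """Closed-form gradient 2 (X - A^T), an n x m matrix."""
    n, m = len(X), len(X[0])
    return [[2.0 * (X[j][i] - A[i][j]) for i in range(m)] for j in range(n)]


def numeric_grad(X, A, h=1e-6):
    """Central finite differences of frob_sq with respect to each entry of X."""
    n, m = len(X), len(X[0])
    G = [[0.0] * m for _ in range(n)]
    for j in range(n):
        for i in range(m):
            old = X[j][i]
            X[j][i] = old + h
            f_plus = frob_sq(X, A)
            X[j][i] = old - h
            f_minus = frob_sq(X, A)
            X[j][i] = old  # restore the perturbed entry
            G[j][i] = (f_plus - f_minus) / (2 * h)
    return G


def max_abs_diff(P, Q):
    """Largest absolute entrywise difference between two equally sized matrices."""
    return max(abs(p - q) for rp, rq in zip(P, Q) for p, q in zip(rp, rq))


random.seed(0)
n, m = 3, 2  # hypothetical sizes: X is n x m, A is m x n
X = [[random.uniform(-1, 1) for _ in range(m)] for _ in range(n)]
A = [[random.uniform(-1, 1) for _ in range(n)] for _ in range(m)]

# Should be tiny (rounding error only) if the closed form is right.
print(max_abs_diff(analytic_grad(X, A), numeric_grad(X, A)))
```

The printed discrepancy should be at the level of floating-point rounding, and the gradient vanishes exactly when $\mathbf{X} = \mathbf{A}^T$, so the derivative is not identically zero.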

Celina
  • 69
  • 1
    Your result is correct, but using the Frobenius (:) inner product would be more succinct $$\eqalign{Y&=X-A^T\cr f&=\|Y\|^2_F&=Y:Y\cr df&=2Y:dY&=2Y:dX\cr\frac{\partial f}{\partial X}&=2Y}$$ – greg Oct 30 '16 at 18:30
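For readers unfamiliar with the notation in the comment above, here is a sketch of the same argument with the intermediate steps written out. It assumes the usual definition $P:Q = \text{trace}(P^TQ)$ of the Frobenius inner product, which the comment does not spell out, and uses the fact that the Frobenius norm is unchanged by transposition:

$$\begin{aligned}
\|X^T - A\|_F^2 &= \|(X - A^T)^T\|_F^2 = \|Y\|_F^2 = Y:Y, \qquad Y = X - A^T, \\
df &= dY:Y + Y:dY = 2\,Y:dY, \\
dY &= dX \quad \text{(since } A \text{ is constant)}, \\
df &= 2\,Y:dX \quad\Longrightarrow\quad \frac{\partial f}{\partial X} = 2Y = 2(X - A^T).
\end{aligned}$$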
1

Let $X$ and $A$ be two matrices of the same dimensions. Then $$\Vert X - A \Vert_F^2 = \text{trace}\big( X^T X - X^TA - A^TX + A^TA\big).$$ Using $$\frac{\partial}{\partial X} \text{trace} (X^TAX) = X^T(A + A^T)$$ and $$\frac{\partial}{\partial X} \text{trace} (X^TA) = A,$$ we get $$\frac{\partial \Vert X - A \Vert^2}{\partial X} = 2X^T - A - A^T.$$ You ask if it is zero. Equating it to zero, we get $$X = \frac{1}{2}(A + A^T).$$

Ahmad Bazzi
  • 12,076
  • It is what it is: the derivative is zero if $X$ is equal to half the sum of $A$ and the transpose of $A$. – Ahmad Bazzi Oct 30 '16 at 16:23
  • The value of $\frac{\partial ||\mathbf{X} -\mathbf{A} ||_F^2}{\partial \mathbf{X}}$ is simply $2(\mathbf{X} - \mathbf{A})$. But for me, I have $\mathbf{X}^T$. So I don't know the value of $\frac{\partial ||\mathbf{X}^T -\mathbf{A} ||_F^2}{\partial \mathbf{X}}$. – Celina Oct 30 '16 at 16:24
  • @ElBazzi But $A$, being of order $m \times n$, cannot be added to its transpose, right? – vidyarthi Oct 30 '16 at 16:25
  • @Celina By symmetry, isn't your derivative simply $2(\mathbf{X}^T -\mathbf{A})$? – vidyarthi Oct 30 '16 at 16:43
  • No, in fact it will be $2(\mathbf{X} - \mathbf{A}^T)$. – Celina Oct 30 '16 at 16:55
  • I added my answer. Thank you @ElBazzi for your time. I appreciate your help:) – Celina Oct 30 '16 at 17:33
  • 1
    As @vidyarthi has commented, $A$ is rectangular, so the sum $(A+A^T)$ does not exist. The problem with this answer is that it incorporates $\text{trace}(X^TAX)$, which appears nowhere in the expansion of the norm. – hans Oct 30 '16 at 22:03