
I need help with the proof below:

$$A = \underset{A}{\text{argmin}}(\frac{1}{2} ||X_{(1)} - A(C \odot B)^T||^2_F + ||\Lambda \boxdot (A - \tilde{A})||_F^2 + \frac{\rho}{2} ||A - \tilde{A}||_F^2) \\ = (X_{(1)}(C \odot B) + \rho \tilde{A} - \Lambda ) ((C \odot B)^T(C \odot B) + \rho I_R)^{-1}$$

Specifications

$A, \Lambda, \tilde{A} \in \mathbb{R}^{I \times R}$; $B \in \mathbb{R}^{J \times R}$; $C \in \mathbb{R}^{K \times R}$; $X_{(1)} \in \mathbb{R}^{I \times JK}$

$A * B = \operatorname{trace}(A^T B)$, $\|A\|_F^2 = \operatorname{trace}(A^T A)$

$\odot:$ Khatri–Rao product (column-wise Kronecker product; the dimensions above require this, since $C \odot B \in \mathbb{R}^{JK \times R}$)

$\boxdot:$ element-wise (Hadamard) product

The formula is on page 8 of the reference.
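As a sanity check on the notation, the sketch below (my own NumPy code, with a hand-rolled `khatri_rao` helper built from column-wise Kronecker products) confirms that $C \odot B$ has shape $JK \times R$, so that $A(C \odot B)^T$ matches $X_{(1)} \in \mathbb{R}^{I \times JK}$:

```python
import numpy as np

J, K, R = 3, 5, 2

def khatri_rao(C, B):
    # Column-wise Kronecker product: column r is kron(C[:, r], B[:, r]),
    # so the result has shape (J*K, R)
    return np.vstack([np.kron(C[:, r], B[:, r]) for r in range(C.shape[1])]).T

B = np.random.rand(J, R)
C = np.random.rand(K, R)
M = khatri_rao(C, B)
print(M.shape)  # (15, 2), i.e. (J*K, R)
```

With the plain Kronecker product instead, $C \otimes B$ would be $JK \times R^2$ and the product $A(C \otimes B)^T$ would not be defined for $A \in \mathbb{R}^{I \times R}$.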

Mour_Ka
    Why do you say "non convex"? – kimchi lover Sep 16 '17 at 15:28
    It seems that you are minimizing a quadratic functional of $A$. You only need to find the point where the derivative of this functional vanishes. – Gribouillis Sep 16 '17 at 15:29
  • @kimchi Because this is tensor factorization, which is always non-convex. In the same reference you will find that tensor factorization is already a hard non-convex (multi-linear) problem. – Mour_Ka Sep 16 '17 at 15:30
  • @Gribouillis I tried, but I ended up with a different formula; somehow a trick is used to reach the formula in this shape, which is why I need help with it. – Mour_Ka Sep 16 '17 at 15:32
    Maybe the overall problem is non-convex, but you are talking about one substep of what's in the paper, and each of the terms in your expression look like convex functions of $A$ to me. – kimchi lover Sep 16 '17 at 15:34
  • @kimchi You might be right, but the reference says (page 13, first line) that solving $(A,B,C) = \text{argmin}(f(A,B,C))$ is non-convex, and $B$ and $C$ are treated exactly like $A$. So you may be right for $A$ by itself, but the whole problem in $A$, $B$ and $C$ is non-convex. I actually tried to take the first derivative w.r.t. $A$, and it wasn't a function of $A$, so I couldn't take a second derivative; that's why I don't know how to confirm its convexity either. – Mour_Ka Sep 16 '17 at 15:43
    Your expression is a sum, with positive coefficients, of the function $\|\cdot\|_F$ applied to arguments that are obviously linear in $A$. A glance at your paper shows that $\|\cdot\|_F$ is a convex function of its argument. QED. As Gribouillis noted earlier. – kimchi lover Sep 16 '17 at 15:57
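The point made in the comments can also be checked numerically: the subproblem in $A$ is quadratic, and its Hessian with respect to $\operatorname{vec}(A)$ is block-diagonal with blocks $(C \odot B)^T(C \odot B) + \rho I_R$, which is positive definite for $\rho > 0$. A small sketch (my own code; `khatri_rao` is a hand-rolled helper, not from the paper):

```python
import numpy as np

rng = np.random.default_rng(1)
J, K, R, rho = 3, 5, 2, 0.7

B = rng.random((J, R))
C = rng.random((K, R))
# Khatri-Rao product (column-wise Kronecker), shape (J*K, R)
M = np.vstack([np.kron(C[:, r], B[:, r]) for r in range(R)]).T

# Hessian block of the quadratic subproblem in A; strictly positive
# eigenvalues mean the subproblem is strictly convex in A.
H = M.T @ M + rho * np.eye(R)
print(np.linalg.eigvalsh(H).min() > 0)  # True
```

So each factor update is a strictly convex problem even though the joint problem in $(A, B, C)$ is non-convex.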

1 Answer


I found a mistake in my copy from the paper: the penalty term should be $\Lambda * (A - \tilde{A})$, not $\|\Lambda \boxdot (A - \tilde{A})\|^2_F$ (I had misunderstood the relation between the element-wise product and the trace).

$\dfrac{d}{dA}\left(\dfrac{1}{2} \|X_{(1)} - A(C \odot B)^T\|_F^2 + \Lambda * (A - \tilde{A}) + \dfrac{\rho}{2}\|A - \tilde{A}\|_F^2\right) = 0$

$= -\left(X_{(1)} - A(C \odot B)^T\right)(C \odot B) + \Lambda + \rho(A-\tilde{A})$

$= -X_{(1)} (C \odot B) + A(C \odot B)^T(C \odot B) + \Lambda + \rho A - \rho \tilde{A} = 0$

$ A ((C \odot B)^T (C \odot B) + \rho I_R) = X_{(1)} (C \odot B) - \Lambda+ \rho \tilde{A} $

$A = (X_{(1)} (C \odot B) - \Lambda+ \rho \tilde{A})((C \odot B)^T (C \odot B) + \rho I_R)^{-1}$
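The derivation can be verified numerically: plugging the closed-form $A$ back into the gradient should give (numerically) zero. A sketch in NumPy, with a hand-rolled `khatri_rao` and arbitrary random data (all names here are my own, not from the paper):

```python
import numpy as np

rng = np.random.default_rng(0)
I, J, K, R, rho = 4, 3, 5, 2, 0.7

def khatri_rao(C, B):
    # Column-wise Kronecker product, shape (J*K, R)
    return np.vstack([np.kron(C[:, r], B[:, r]) for r in range(C.shape[1])]).T

B = rng.random((J, R))
C = rng.random((K, R))
X1 = rng.random((I, J * K))
Lam = rng.random((I, R))       # Lambda
At = rng.random((I, R))        # A tilde
M = khatri_rao(C, B)

# Closed-form solution from the derivation above
A = (X1 @ M - Lam + rho * At) @ np.linalg.inv(M.T @ M + rho * np.eye(R))

# Gradient of the objective at A: should vanish at the minimizer
grad = -(X1 - A @ M.T) @ M + Lam + rho * (A - At)
print(np.allclose(grad, 0))  # True
```

In practice one would solve the linear system with `np.linalg.solve` (on the transposed system) rather than forming the explicit inverse, but the inverse mirrors the formula as written.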

Mour_Ka