
In an optimization problem like the following

$$ \min_{x \geq 0} f(x) $$ where $f: \mathbb{R}^n \rightarrow \mathbb{R}$ is a convex function, we write the KKT conditions by breaking the constraint $x \geq 0$, where $x=[x_1,x_2,\cdots,x_n]^T \in \mathbb{R}^n$, into $$ x_1 \geq 0 $$ $$ x_2 \geq 0 $$ $$ \vdots $$ $$ x_n \geq 0 $$ Then we associate a dual variable with each of them, so we have the following $$ \nabla L(x,\mu)=\nabla f(x)-\sum_{i=1}^n\mu_i e_i=\nabla f(x)-\mu $$ where $\mu=[\mu_1,\mu_2,\cdots,\mu_n]^T$ and $e_i$ is the $i$-th standard basis vector.
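As a quick numerical sanity check of these vector KKT conditions, here is a sketch with a hypothetical objective $f(x)=\|x-a\|^2$ (chosen only for illustration, since its constrained minimizer is known in closed form to be $x^\star=\max(a,0)$):

```python
import numpy as np

# Hypothetical convex objective: f(x) = ||x - a||^2 with some a_j < 0,
# so the constraint x >= 0 is active in those coordinates.
a = np.array([1.0, -2.0, 0.5, -0.3])
x_star = np.maximum(a, 0.0)          # closed-form minimizer over x >= 0
grad_f = 2.0 * (x_star - a)          # gradient of f at x_star
mu = grad_f                          # stationarity: grad f(x*) - mu = 0

assert np.all(x_star >= 0)           # primal feasibility
assert np.all(mu >= -1e-12)          # dual feasibility
assert np.allclose(mu * x_star, 0)   # complementary slackness
```

Note that the multiplier $\mu_j$ is strictly positive exactly in the coordinates where the constraint $x_j \geq 0$ is active.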

Now suppose we have the following

$$ \min_{0\preceq M \preceq I} f(M) $$ where $M \in \mathbb{R}^{m \times m}$ is a positive semi-definite matrix with eigenvalue–eigenvector pairs $(\lambda_i(M),v_i)$.

Why is the KKT condition for this problem

$$ \nabla L(M,\gamma,w)=\nabla f(M)+\sum_{i=1}^m\gamma_iv_iv_i^T-\sum_{i=1}^m w_iv_iv_i^T \,? $$

Saeed
  • Technically, the Lagrange multiplier is for equality constraints; perhaps it is better to use the KKT conditions instead, since you have inequality constraints. Also, it is better to mention that you are taking the derivative w.r.t. $M$. – abolfazl Nov 02 '18 at 01:14

1 Answer


Like what you did for the first problem, you can break the constraint $0\preceq M \preceq I$ into the following

$$ f_1(M)=-\lambda_1(M) \leq 0 $$ $$ \vdots $$ $$ f_m(M)=-\lambda_m(M) \leq 0 $$ and

$$ g_1(M)=\lambda_1(M) -1 \leq 0 $$ $$ \vdots $$ $$ g_m(M)=\lambda_m(M) -1 \leq 0 $$

This works because every eigenvalue of $I$ equals $1$, so $M \preceq I$ is equivalent to $\lambda_i(M) \leq 1$ for all $i$.
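This equivalence between $0 \preceq M \preceq I$ and the eigenvalue bounds $0 \leq \lambda_i(M) \leq 1$ can be checked numerically; the sketch below builds a feasible $M$ by clipping the spectrum of an arbitrary symmetric matrix into $[0,1]$ (the matrix and seed are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((4, 4))
S = A + A.T                                  # arbitrary symmetric matrix
lam, V = np.linalg.eigh(S)
# Clip the spectrum into [0, 1]; the result satisfies 0 <= M <= I.
M = V @ np.diag(np.clip(lam, 0.0, 1.0)) @ V.T

eig_M = np.linalg.eigvalsh(M)
assert np.all(eig_M >= -1e-10)               # M is PSD, i.e. f_i(M) <= 0
assert np.all(eig_M <= 1.0 + 1e-10)          # g_i(M) <= 0
assert np.all(np.linalg.eigvalsh(np.eye(4) - M) >= -1e-10)   # I - M is PSD
```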

Now we need to take the derivative of each $f_i$ and $g_i$ with respect to $M$, which reduces to the derivative of $\lambda_i(M)$ with respect to $M$. Since $M$ is symmetric (and assuming $\lambda_i$ is a simple eigenvalue with unit-norm eigenvector $v_i$), we have the following

$$ \partial \lambda_i = v_i^T \,\partial M\, v_i $$

See the following link:

Derivatives of eigenvalues

Now you have

$$ \partial \lambda_i(M)=v_i^T \,\partial M\, v_i=\sum_j (v_i)_j\,[\partial M\, v_i]_j=\sum_j (v_i)_j \sum_k \partial M_{jk}\, (v_i)_k=\sum_j \sum_k (v_i)_j (v_i)_k\, \partial M_{jk} $$

$$ \frac{\partial \lambda_i(M)}{\partial M_{jk} }= (v_i)_j (v_i)_k=[v_iv_i^T]_{jk} $$

So $$ \frac{\partial \lambda_i(M)}{\partial M }= v_iv_i^T $$
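This derivative formula can be verified by a finite-difference check (a sketch with an arbitrary symmetric matrix, assuming simple eigenvalues, which hold generically for a random symmetric matrix):

```python
import numpy as np

rng = np.random.default_rng(1)
B = rng.standard_normal((5, 5))
M = B + B.T                          # symmetric; eigenvalues generically simple
lam, V = np.linalg.eigh(M)

eps = 1e-6
C = rng.standard_normal((5, 5))
E = (C + C.T) / 2                    # symmetric perturbation direction
lam_pert = np.linalg.eigvalsh(M + eps * E)

for i in range(5):
    v = V[:, i]
    # <d lambda_i / dM, E> = <v v^T, E> = v^T E v
    predicted = float(v @ E @ v)
    actual = (lam_pert[i] - lam[i]) / eps
    assert abs(predicted - actual) < 1e-4
```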

Now you assign a multiplier $\gamma_i$ to each constraint $g_i(M)\leq 0$ and $w_i$ to each constraint $f_i(M)\leq 0$. Since $\nabla g_i(M)=v_iv_i^T$ and $\nabla f_i(M)=-v_iv_i^T$, this gives exactly the stationarity condition in the question, $$ \nabla L(M,\gamma,w)=\nabla f(M)+\sum_{i}\gamma_iv_iv_i^T-\sum_{i} w_iv_iv_i^T. $$
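To see the assembled condition hold end to end, here is a sketch with the hypothetical objective $f(M)=\|M-A\|_F^2$ for a symmetric $A$ (chosen for illustration because its minimizer over $0 \preceq M \preceq I$ is $A$ with its spectrum clipped to $[0,1]$, and the multipliers have a closed form):

```python
import numpy as np

rng = np.random.default_rng(2)
B = rng.standard_normal((4, 4))
A = 3.0 * (B + B.T)                           # symmetric target, spectrum outside [0,1]
a, V = np.linalg.eigh(A)

# Minimizer of ||M - A||_F^2 over 0 <= M <= I: clip A's spectrum to [0, 1].
M = V @ np.diag(np.clip(a, 0.0, 1.0)) @ V.T
grad_f = 2.0 * (M - A)

w = 2.0 * np.maximum(-a, 0.0)                 # multipliers for f_i(M) = -lambda_i(M) <= 0
gamma = 2.0 * np.maximum(a - 1.0, 0.0)        # multipliers for g_i(M) = lambda_i(M) - 1 <= 0

grad_L = grad_f \
    + sum(gamma[i] * np.outer(V[:, i], V[:, i]) for i in range(4)) \
    - sum(w[i] * np.outer(V[:, i], V[:, i]) for i in range(4))

assert np.allclose(grad_L, 0.0, atol=1e-9)    # stationarity
assert np.all(w >= 0) and np.all(gamma >= 0)  # dual feasibility
assert np.allclose(w * np.clip(a, 0, 1), 0)           # complementary slackness for f_i
assert np.allclose(gamma * (np.clip(a, 0, 1) - 1), 0) # complementary slackness for g_i
```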

Saeed