
In a write-up from memory of a past machine learning exam, I found the following task:

Consider $\max_{w \in \mathbb R^d, v \in \mathbb R^n} w^T A w + v^T B v$ subject to $\| v \|^2 + \| w \|^2 = 1$, where $A$ and $B$ are positive definite matrices of sizes $d \times d$ and $n \times n$, respectively. Write down the Lagrange function associated with this constrained problem and derive its solution $\begin{bmatrix} w^* \\ v^* \end{bmatrix}$.

Here is what I have done: The Lagrangian is $$ L(v, w, \lambda) := w^T A w + v^T B v + \lambda (1 - \| v \|^2 - \| w \|^2), $$ whose partial derivatives I set to zero: $$ \frac{\partial L(v, w, \lambda)}{\partial v} = 2 B v - 2 \lambda v \overset{!}{=} 0 \iff B v = \lambda v $$ and analogously $\frac{\partial L(v, w, \lambda)}{\partial w} = 0 \iff A w = \lambda w$.
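(Aside: to convince myself of the gradient formula $\frac{\partial}{\partial v} v^T B v = 2 B v$, which uses the symmetry of $B$, I ran a quick finite-difference check; the random positive definite $B$ below is just a made-up test case.)

```python
import numpy as np

rng = np.random.default_rng(1)
n = 4

# Arbitrary symmetric positive definite test matrix.
M = rng.standard_normal((n, n))
B = M @ M.T + n * np.eye(n)
v = rng.standard_normal(n)

f = lambda u: u @ B @ u        # objective v^T B v
grad_analytic = 2 * B @ v      # claimed gradient, valid for symmetric B

# Central finite differences, coordinate by coordinate.
eps = 1e-6
grad_fd = np.array([(f(v + eps * e) - f(v - eps * e)) / (2 * eps)
                    for e in np.eye(n)])

assert np.allclose(grad_fd, grad_analytic, atol=1e-4)
```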

In my quest to find $\lambda$, I multiplied those two equations by $v^T$ and $w^T$, respectively, and added them to obtain $$ v^T B v + w^T A w = \lambda (v^T v + w^T w) = \lambda, $$ where the last equality uses the constraint $\| v \|^2 + \| w \|^2 = 1$. As $A$ and $B$ are positive definite and $v$, $w$ cannot both be zero, we must have $\lambda > 0$.

How do I continue from here to find $\begin{bmatrix} w^* \\ v^* \end{bmatrix}$?

ViktorStein

1 Answer


A more concrete hint: let $x:=(w,v)\in \mathbb{R}^{d+n}$ and $C:=\mathrm{diag}(A,B)\in \mathcal{S}^{d+n}_{\succ 0}$; then your optimization problem becomes $\max_{x\in \mathbb{S}^{d+n-1}}\langle Cx,x\rangle$, or equivalently $\max_{x:\|x\|_2=1}\|Cx\|_2$, which has a well-known solution.
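(Not part of the hint itself, but a minimal NumPy sanity check of this reformulation; the sizes $d = 3$, $n = 4$ and the random positive definite $A$, $B$ are made up for illustration.)

```python
import numpy as np

rng = np.random.default_rng(0)
d, n = 3, 4

# Random symmetric positive definite A (d x d) and B (n x n).
MA = rng.standard_normal((d, d))
A = MA @ MA.T + d * np.eye(d)
MB = rng.standard_normal((n, n))
B = MB @ MB.T + n * np.eye(n)

# C = diag(A, B), so the problem is max_{||x||_2 = 1} x^T C x.
C = np.block([[A, np.zeros((d, n))],
              [np.zeros((n, d)), B]])

# Top eigenpair of the symmetric C; eigh sorts eigenvalues ascending.
eigvals, eigvecs = np.linalg.eigh(C)
lam_max, x_star = eigvals[-1], eigvecs[:, -1]
w_star, v_star = x_star[:d], x_star[d:]

# No random unit vector should beat the top eigenvector's objective value.
for _ in range(10_000):
    x = rng.standard_normal(d + n)
    x /= np.linalg.norm(x)
    assert x @ C @ x <= lam_max + 1e-9

print("lambda_max:", lam_max)
print("w*:", w_star, " v*:", v_star)
```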

Further remark (9/17/2020): Given some $A\in \mathcal{S}^n_{\succ 0}$, we know there is some $Q\in \mathsf{O}(n)$ such that $A=Q\Lambda Q^{\mathsf{T}}$, where $\Lambda=\mathrm{diag}(\lambda_1,\dots,\lambda_n)$ with $\lambda_1\geq\cdots \geq \lambda_n>0$. Since $\|Q^{\mathsf{T}}x\|=\|x\|$ for all $x\in \mathbb{S}^{n-1}$, we can consider, without loss of generality, $\max_{z\in \mathbb{S}^{n-1}} z^{\top}\Lambda z$, which equals $\max_{z}\sum^n_{i=1}\lambda_i z_i^2$ subject to $\sum^n_{i=1}z_i^2=1$. Hence, we pick $z=e_1$, the standard basis vector corresponding to the largest eigenvalue $\lambda_1$. Then, to retrieve the optimizer of the original problem, note that $e_1=Q^{\mathsf{T}}x^{\star}$ holds if and only if $x^{\star}=Qe_1=q_1$, the first column of $Q$, which is indeed the eigenvector corresponding to $\lambda_1$. Of course, we did not touch upon uniqueness here.
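To make the "well-known solution" explicit (an addendum, where $u_A$ and $u_B$ denote unit eigenvectors of $A$ and $B$ for their largest eigenvalues): since $C = \mathrm{diag}(A,B)$ is block diagonal, $\det(C - \lambda I) = \det(A - \lambda I_d)\,\det(B - \lambda I_n)$, so the spectrum of $C$ is the union of the spectra of $A$ and $B$, and therefore $$ \lambda_{\max}(C) = \max\{\lambda_{\max}(A), \lambda_{\max}(B)\}, \qquad \begin{bmatrix} w^* \\ v^* \end{bmatrix} = \begin{cases} \begin{bmatrix} u_A \\ 0 \end{bmatrix} & \text{if } \lambda_{\max}(A) \geq \lambda_{\max}(B), \\ \begin{bmatrix} 0 \\ u_B \end{bmatrix} & \text{otherwise}. \end{cases} $$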

WalterJ
  • And the well-known solution is the eigenvector of $C$ associated with the largest eigenvalue, right? – ViktorStein Sep 16 '20 at 14:11
  • Because after "derive the solution $\begin{bmatrix} w^* \\ v^* \end{bmatrix}$ of this constrained problem." the next task is to "show that the solution is an eigenvector of some square matrix $C$ of size $d + n$". – ViktorStein Sep 16 '20 at 20:37