
Here's the question:

You have a dataset $\{\mathbf{x}\}$ of $N$ vectors, each of which is $d$-dimensional. Assume $\mathbf{mean}(\{\mathbf{x}\}) = 0$. Consider a linear function on our dataset defined by some vector $\mathbf{a}$, which we can write as being evaluated on each data item as $f_i(\mathbf{a}) = \mathbf{a}^T\mathbf{x}_i$.

Show that

"Maximize $\mathbf{Var}(\{f(\mathbf{a})\})$ subject to $\mathbf{a}^T\mathbf{a} = 1$"

is solved by the eigenvector of $\mathbf{Covmat}(\{\mathbf{x}\})$ corresponding to the largest eigenvalue.
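Before trying to prove it, here is a quick numerical sanity check (not a proof) that the claim is plausible: for random centered data, the variance of the projection onto the top eigenvector of the covariance matrix should beat the variance along any random unit vector. All the names and the random setup below are just illustrative.

```python
# Numerical sanity check (not a proof): among unit vectors a, the variance
# of {a^T x_i} is maximized by the top eigenvector of Covmat({x}).
import numpy as np

rng = np.random.default_rng(0)
N, d = 500, 4
X = rng.normal(size=(N, d))
X -= X.mean(axis=0)                 # enforce mean({x}) = 0

Sigma = X.T @ X / N                 # Covmat({x}), using the 1/N convention
evals, evecs = np.linalg.eigh(Sigma)
top = evecs[:, -1]                  # eigenvector for the largest eigenvalue

var_top = np.var(X @ top)           # variance of the projected data
for _ in range(1000):               # compare against random unit vectors
    a = rng.normal(size=d)
    a /= np.linalg.norm(a)
    assert np.var(X @ a) <= var_top + 1e-9
```

Note that `var_top` comes out equal to the largest eigenvalue `evals[-1]`, which is exactly what the claim predicts.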

What I know/understand:

For $\mathbf{a}^T\mathbf{a} = 1$ to be true, $\mathbf{a}$ must be a unit vector, i.e. $\|\mathbf{a}\| = 1$. (Rusty on my matrix properties, so still trying to figure out how this might help me.)

$\mathbf{Var}(\{f(s\mathbf{a})\}) = s^2\,\mathbf{Var}(\{f(\mathbf{a})\})$, something I proved in a previous question. I'm pretty sure this is supposed to help, but I haven't figured out the connection yet.
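That scaling property is easy to confirm numerically, too (again just a sanity check on synthetic data, not part of the proof):

```python
# Illustration of Var({f(s*a)}) = s^2 * Var({f(a)}) on random centered data.
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(200, 3))
X -= X.mean(axis=0)                 # center the data

a = rng.normal(size=3)
s = 2.5
lhs = np.var(X @ (s * a))           # Var({f(s*a)})
rhs = s**2 * np.var(X @ a)          # s^2 * Var({f(a)})
assert np.isclose(lhs, rhs)
```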

I believe this can be maximized using Lagrange multipliers, where $f = \mathbf{Var}(\{f(\mathbf{a})\})$, $g = \mathbf{a}^T\mathbf{a}$, and $c = 1$, giving us the Lagrangian $L = \mathbf{Var}(\{\mathbf{a}^T\mathbf{x}\}) - \lambda(\mathbf{a}^T\mathbf{a} - 1)$.

I understand that I need to take the gradient of the Lagrangian and set it equal to zero in order to maximize the given function. However, it has been some time since I have done any calculus/linear algebra, so I am not fully sure how to go about doing this, especially in such a general sense where our linear function is defined by an arbitrary vector $\mathbf{a}$.
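Writing out the pieces I think are involved (treating $\mathbf{a}$ as a column vector, and using $\mathbf{mean}(\{\mathbf{x}\}) = 0$), the variance can be put in matrix form, after which the gradient step is standard:

```latex
% Variance of the projections in matrix form:
\mathbf{Var}(\{f(\mathbf{a})\})
  = \frac{1}{N}\sum_{i=1}^{N} (\mathbf{a}^T\mathbf{x}_i)^2
  = \mathbf{a}^T \Big(\frac{1}{N}\sum_{i=1}^{N} \mathbf{x}_i\mathbf{x}_i^T\Big)\mathbf{a}
  = \mathbf{a}^T\,\mathbf{Covmat}(\{\mathbf{x}\})\,\mathbf{a}

% Gradient of L = a^T Covmat({x}) a - lambda (a^T a - 1), set to zero:
\nabla_{\mathbf{a}} L
  = 2\,\mathbf{Covmat}(\{\mathbf{x}\})\,\mathbf{a} - 2\lambda\,\mathbf{a} = \mathbf{0}
\quad\Longrightarrow\quad
\mathbf{Covmat}(\{\mathbf{x}\})\,\mathbf{a} = \lambda\,\mathbf{a}
```

If that last line is right, then any critical point $\mathbf{a}$ is an eigenvector of the covariance matrix, but I'm not sure how to justify the gradient identities $\nabla_{\mathbf{a}}(\mathbf{a}^T M \mathbf{a}) = 2M\mathbf{a}$ (for symmetric $M$) and $\nabla_{\mathbf{a}}(\mathbf{a}^T\mathbf{a}) = 2\mathbf{a}$, or why the largest eigenvalue wins.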

I believe I am really close to piecing this together and it would be really helpful if someone could help me go in the right direction. Thanks!

zdale
