Here's the question:
You have a dataset $\{\mathbf{x}\}$ of $N$ vectors, each of which is $d$-dimensional. Assume $\mathbf{mean}(\{\mathbf{x}\})=0$. Consider a linear function of the data, defined by some vector $\mathbf{a}$, which is evaluated on each data item as $f_i(\mathbf{a})=\mathbf{a}^T\mathbf{x}_i$.
Show that
"Maximize $\mathbf{Var}(\{f(\mathbf{a})\})$ subject to $\mathbf{a}^T\mathbf{a} = 1$"
is solved by the eigenvector of $\mathbf{Covmat}(\{\mathbf{x}\})$ corresponding to the largest eigenvalue.
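(Before attempting the proof, I convinced myself the claim is true with a quick numerical sanity check on made-up data — this is just my own sketch, not part of the problem:)

```python
import numpy as np

# Sanity check on synthetic data: among unit vectors a, the top eigenvector
# of Covmat({x}) should maximize Var({f(a)}) = Var({a^T x_i}).
rng = np.random.default_rng(0)
N, d = 500, 4
X = rng.normal(size=(N, d)) @ rng.normal(size=(d, d))  # correlated data
X = X - X.mean(axis=0)                                  # enforce mean({x}) = 0

C = (X.T @ X) / N                    # Covmat({x})
eigvals, eigvecs = np.linalg.eigh(C) # eigenvalues in ascending order
a_star = eigvecs[:, -1]              # eigenvector for the largest eigenvalue

def proj_var(a):
    """Var({f(a)}) = Var({a^T x_i})."""
    return np.var(X @ a)

# No other unit vector does better than a_star:
for _ in range(1000):
    a = rng.normal(size=d)
    a /= np.linalg.norm(a)
    assert proj_var(a) <= proj_var(a_star) + 1e-9

# The maximum value achieved is the largest eigenvalue itself:
assert np.isclose(proj_var(a_star), eigvals[-1])
```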
What I know/understand:
For $\mathbf{a}^T\mathbf{a} = 1$ to hold, $\mathbf{a}$ must be a unit vector — it's a vector, not an orthogonal matrix. (Rusty on my matrix properties, so still trying to figure out how this might help me.)
$\mathbf{Var}(\{f(s\mathbf{a})\}) = s^2\,\mathbf{Var}(\{f(\mathbf{a})\})$, something I proved in a previous question. I'm pretty sure this is supposed to help, but I haven't figured out the connection yet.
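(For my own peace of mind, I also checked this scaling property numerically on random data — again just a sketch with made-up numbers:)

```python
import numpy as np

# Check Var({f(s*a)}) = s^2 * Var({f(a)}) on synthetic, centered data.
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 3))
X = X - X.mean(axis=0)          # mean({x}) = 0

a = rng.normal(size=3)          # an arbitrary direction
s = 2.5                         # an arbitrary scale factor
assert np.isclose(np.var(X @ (s * a)), s**2 * np.var(X @ a))
```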
I believe this can be maximized using Lagrange multipliers, where $f=\mathbf{Var}(\{f(\mathbf{a})\})$, $g=\mathbf{a}^T\mathbf{a}$, and $c=1$, giving us the Lagrangian $L = \mathbf{Var}(\{f(\mathbf{a})\})-\lambda(\mathbf{a}^T\mathbf{a} - 1)$.
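From a quick refresher, I think the relevant matrix-calculus gradient identities are (please correct me if I have these wrong):

$$\nabla_{\mathbf{a}}\left(\mathbf{a}^T M \mathbf{a}\right) = (M + M^T)\,\mathbf{a} = 2M\mathbf{a} \;\text{ for symmetric } M, \qquad \nabla_{\mathbf{a}}\left(\mathbf{a}^T\mathbf{a}\right) = 2\mathbf{a}.$$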
I understand that I need to take the gradient of the Lagrangian and set it equal to zero in order to maximize the given function. However, it has been some time since I have done any calculus/linear algebra, so I am not fully sure how to go about doing this, especially in such a general sense where our linear function is defined by an arbitrary vector $\mathbf{a}$.
I believe I am really close to piecing this together and it would be really helpful if someone could help me go in the right direction. Thanks!