6

Let $\mathbf{x}\in\Bbb{R}^n$ and $A\in\Bbb{S}_{++}^n$, where $\Bbb{S}_{++}^n$ denotes the space of symmetric positive definite $n\times n$ real matrices. Also, let $Q\colon\Bbb{R}^n\to\Bbb{R}_{+}$ be the quadratic form given by $$ Q(\mathbf{x})=\mathbf{x}^\top A\mathbf{x}\geq0. $$ I would like to approximate $Q(\mathbf{x})$ by a scalar multiple of the squared Euclidean norm of $\mathbf{x}$, that is, $$ Q(\mathbf{x})\approx c \lVert\mathbf{x}\rVert^2,\quad c>0. $$ If $A$ is a multiple of the identity matrix (of order $n$), i.e. $A=aI_n$ with $a>0$, then $c=a$ and $Q(\mathbf{x})=a\lVert\mathbf{x}\rVert^2$; in this case the relation is not an approximation but an exact equality.

On the other hand, if $A\neq aI_n$, we could approximate the quadratic form using the mean of the eigenvalues of $A$, since it holds that $$ \lambda_{\min}(A)\lVert\mathbf{x}\rVert^2\leq Q(\mathbf{x})\leq\lambda_{\max}(A)\lVert\mathbf{x}\rVert^2, $$ which suggests $$ Q(\mathbf{x})\approx\frac{1}{n}\sum_{i=1}^{n}\lambda_{i}(A)\lVert\mathbf{x}\rVert^2, $$ and thus $c=\frac{1}{n}\sum_{i=1}^{n}\lambda_{i}(A)$.
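For a concrete sanity check, here is a minimal numerical sketch (Python/NumPy, with a randomly generated positive definite $A$ as hypothetical test data) of the eigenvalue bounds and the mean-eigenvalue approximation:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5

# Hypothetical test data: a random symmetric positive definite matrix.
B = rng.standard_normal((n, n))
A = B @ B.T + n * np.eye(n)  # the shift guarantees positive definiteness

eigvals = np.linalg.eigvalsh(A)
lam_min, lam_max, lam_mean = eigvals.min(), eigvals.max(), eigvals.mean()

x = rng.standard_normal(n)
Q = x @ A @ x
norm2 = x @ x

# Rayleigh-quotient bounds: lam_min * ||x||^2 <= Q(x) <= lam_max * ||x||^2
assert lam_min * norm2 <= Q <= lam_max * norm2
print("Q(x) =", Q, "  mean-eigenvalue approximation =", lam_mean * norm2)
```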

Is there any way of finding an optimal $c$, in the sense that the approximation of $Q(\mathbf{x})$ is optimal with respect to some criterion?

nullgeppetto
  • 3,006
  • Do you have any specific criteria in mind? The answer depends on the specific criteria chosen---both on how to compute an optimal $c$, and also whether or not that computation is tractable. – Michael Grant Jun 29 '15 at 14:39

3 Answers

8

Since $$\mathbf{Q}(\mathbf{x})=\sum_{i=1}^n \sum_{j=1}^n a_{ij}x_ix_j,$$

we can represent $\mathbf{Q}(\mathbf{x})$ as a vector $\mathbf{q} \in \mathbb{R}^d$, $d=\frac{n(n+1)}{2}$, with respect to the quadratic basis functions $\{x_1^2,x_1x_2,\dots,x_n^2\}$:

$$\mathbf{Q}(\mathbf{x})=a_{11}x_1^2+2a_{21}x_1x_2+\dots+a_{nn}x_{n}^2$$ $$\mapsto \mathbf{q}:=(a_{11},2a_{21},\dots,a_{nn})$$

We can also represent $||\mathbf{x}||^2$ in this space:

$$||\mathbf{x}||^2 = x_1^2+x_2^2+\dots+x_n^2$$ $$\mapsto \mathbf{v}:=(1,0,\dots,0,1,0,\dots,0,1)$$
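For concreteness, here is a small sketch (Python/NumPy; the helper name `quad_basis_vectors` and the basis ordering are my own choices) that builds $\mathbf{q}$ and $\mathbf{v}$ from $A$:

```python
import numpy as np

def quad_basis_vectors(A):
    """Coefficient vectors of Q(x) = x^T A x and of ||x||^2 in the
    quadratic basis {x_i^2} plus {x_i x_j : i > j} (d = n(n+1)/2 terms).

    Each diagonal term contributes a_ii; each cross term contributes
    2*a_ij, using the symmetry a_ij = a_ji.
    """
    n = A.shape[0]
    q, v = [], []
    for i in range(n):
        q.append(A[i, i])  # coefficient of x_i^2 in Q(x)
        v.append(1.0)      # ||x||^2 puts weight 1 on each x_i^2 ...
        for j in range(i):
            q.append(2.0 * A[i, j])  # coefficient of x_i x_j in Q(x)
            v.append(0.0)            # ... and 0 on every cross term
    return np.array(q), np.array(v)
```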

Now, let's try to minimize the squared Euclidean norm of the difference between $c\mathbf{v}$ and $\mathbf{q}$:

$$\min_c L(c),\;\; \mathrm{where}\; L(c):=||\mathbf{q}-c\mathbf{v}||^2$$

A (very) little bit of algebra shows that:

$$L(c)= \sum_{i=1}^n (a_{ii}-c)^2 + \sum_{i> j}(2a_{ij})^2$$

Note that the second sum does not depend on $c$.

This is a convex function of $c$, so we just take the derivative and set it to zero:

$$\frac{d}{dc} L(c) = -2\sum_{i=1}^n (a_{ii}-c)=0 \implies c=\frac{1}{n}\sum_{i=1}^n a_{ii}$$

So we can set $c$ to the trace of $A$ divided by $n$ (equivalently, the mean of its diagonal entries):

$$c=\frac{\operatorname{Tr}(A)}{n},$$

which coincides with the mean of the eigenvalues proposed in the question, since $\operatorname{Tr}(A)=\sum_{i=1}^n\lambda_i(A)$.
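As a quick check (a sketch reusing the hypothetical `quad_basis_vectors` helper from above), the one-dimensional least-squares solution $c^\ast = \frac{\mathbf{q}\cdot\mathbf{v}}{\mathbf{v}\cdot\mathbf{v}}$ indeed recovers $\operatorname{Tr}(A)/n$:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 4

# Hypothetical test matrix, as before.
B = rng.standard_normal((n, n))
A = B @ B.T + n * np.eye(n)

q, v = quad_basis_vectors(A)  # helper sketched earlier in this answer

# argmin_c ||q - c v||^2 for a single scalar c is (q . v) / (v . v).
c_star = (q @ v) / (v @ v)
assert np.isclose(c_star, np.trace(A) / n)
print("c* =", c_star, "  Tr(A)/n =", np.trace(A) / n)
```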

3

I took a probabilistic approach to the problem. I wanted to say something like: if you choose a "random vector" $X$, then $Q(X)$ will "typically be" $c\|X\|^2$. I decided that "typically be" should mean the expected value, so I want to find $c$ such that $$E[Q(X)] = c\,E[\|X\|^2].$$

This also requires a distribution for the random vector $X$. I would like to say "chosen uniformly from $\mathbb{R}^n$", but no such measure exists. Instead I will consider vectors that are uniform in direction on the unit sphere $S^{n-1}$ and whose length is given by some other distribution; it turns out the result is independent of that distribution. I will then seek a $c$ such that $$E[Q(X)] = c,$$ where $X$ is chosen uniformly from the unit sphere (so that $\|X\|^2 = 1$).

It is a theorem that if $X_i$ are independently chosen from a normal distribution $N(0, 1)$ then the vector with components $$Y_i = \frac{X_i}{\sqrt{\sum_{i=1}^nX_i^2}} $$ is uniformly distributed on the sphere.

Taking $\{e_i\}$ to be an eigenbasis for $Q$, we can write its value on a vector $Y$ as $$Q(Y) = \sum_{i=1}^n \lambda_i Y_i^2.$$ Then $$E[Q(Y)] = E\left[\sum_{i=1}^n \lambda_i Y_i^2\right] = \sum_{i=1}^n \lambda_i E[Y_i^2].$$

Now, $\sum_{i = 1}^n Y_i^2 = 1$ identically, so by symmetry $E[Y_i^2] = \frac{1}{n}$. So we arrive at $$E[Q(Y)] = \frac{1}{n}\sum_{i=1}^n \lambda_i.$$

A nice side effect of this approach is that we actually have the full distribution of the random variable $Q(Y)$, namely $$Q(Y) = \frac{\sum_{i=1}^n \lambda_i X_i^2}{\sum_{i=1}^nX_i^2},$$ which is valid so long as the vector $X$ is drawn from a density symmetric under rotations. From there you can analyze the accuracy of the estimator.
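A minimal Monte Carlo sketch of this result (Python/NumPy, with a randomly generated positive definite $A$ as assumed test data), sampling uniformly on the sphere by normalizing i.i.d. standard normal vectors:

```python
import numpy as np

rng = np.random.default_rng(2)
n, trials = 5, 200_000

# Hypothetical test data: a random symmetric positive definite matrix.
B = rng.standard_normal((n, n))
A = B @ B.T + n * np.eye(n)

# Uniform samples on the unit sphere: normalize i.i.d. N(0, 1) vectors.
X = rng.standard_normal((trials, n))
Y = X / np.linalg.norm(X, axis=1, keepdims=True)

# Empirical mean of Q(Y) versus the mean eigenvalue Tr(A)/n.
QY = np.einsum('ti,ij,tj->t', Y, A, Y)
print("E[Q(Y)] (MC) =", QY.mean(), "  Tr(A)/n =", np.trace(A) / n)
```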

muaddib
  • 8,267
1

Although @Bey's answer is correct, we can state a more general result in a language natural to matrices.

Let us consider the Schatten-type $p$-norm built from the $k$ largest singular values, for $p = 1$ or $p = 2$. After changing to an orthonormal eigenbasis of $A$, we may assume $A$ is diagonal, and the problem reduces to the $\ell_p$ approximation of the $k$ largest eigenvalues of $A$ by a scalar $c$. For $p = 1$ the solution is the median of these $k$ eigenvalues; for $p = 2$ it is their mean.

The computation is only "easy" in the case stated by @Bey, that is, $k=n$ and $p=2$.
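A short sketch of the two cases (Python/NumPy; the random $A$ and the choice $k=4$ are hypothetical): once $A$ is diagonalized, the minimizer over the chosen eigenvalues is the median for $p=1$ and the mean for $p=2$.

```python
import numpy as np

rng = np.random.default_rng(3)
n, k = 6, 4  # hypothetical sizes

B = rng.standard_normal((n, n))
A = B @ B.T + n * np.eye(n)

# Eigenvalues in descending order; keep the k largest.
lam = np.sort(np.linalg.eigvalsh(A))[::-1][:k]

# argmin_c sum_i |lam_i - c|^p over these k eigenvalues:
c_p1 = np.median(lam)  # p = 1: the median
c_p2 = np.mean(lam)    # p = 2: the mean
print("p=1:", c_p1, "  p=2:", c_p2)
```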

user251257
  • 9,229