
If matrix $A \in \Bbb R^{d \times d}$ is positive semidefinite and tall matrix $W \in \Bbb R^{d \times c}$ (where $d > c$) has orthonormal columns, $W^T W = I_c$, does the following inequality hold?

$$\mbox{Tr} \left( W^T A W \right) \leq \mbox{Tr} (A)$$

It seems to be true every time I run the code below, but I can't prove it. Can anyone help? Thanks.

```python
import numpy as np

d = 10
c = round(0.3 * d)

A = np.random.randint(-100, 100, (d, d))
A = A.dot(A.T)  # A is positive semidefinite now

# Take the first c left-singular vectors of a random matrix
# to get W with orthonormal columns (W^T W = I_c)
B = np.random.randint(-100, 100, (d, c))
u, s, vh = np.linalg.svd(B)
W = u[:, 0:c]

p1 = np.trace(W.T.dot(A).dot(W))
p2 = np.trace(A)
print(p1)
print(p2)
```

Pei
  • This is not true in general. You can modify your code to generate a $W$ matrix with very large entries relative to $A$ and easily find a counterexample; your current code ensures that the columns of $W$ have norm 1, which is too small. – Brian Borchers Jul 31 '20 at 04:11
  • @BrianBorchers Sorry, I forgot the orthogonal constraint, now I have modified my question. – Pei Jul 31 '20 at 04:19
  • With the orthogonality requirement on $W$, it's easy to prove this using the von Neumann trace inequality. – Brian Borchers Jul 31 '20 at 04:34
  • thanks for your help @BrianBorchers – Pei Aug 01 '20 at 13:04

4 Answers


Since $\mathrm W^\top \mathrm W = \mathrm I_c$, the (symmetric) projection matrix that projects onto the column space of $\rm W$ is

$$\mathrm W \left( \mathrm W^\top \mathrm W \right)^{-1} \mathrm W^\top = \mathrm W \mathrm W^\top$$

and, thus, the (symmetric) projection matrix that projects onto the left null space of $\rm W$ (which is the subspace orthogonal to the column space) is $\mathrm I_d - \mathrm W \mathrm W^\top$. Since the eigenvalues of projection matrices are $0$ and $1$, matrix $\mathrm I_d - \mathrm W \mathrm W^\top$ is positive semidefinite.

Since both $\rm A$ and $\mathrm I_d - \mathrm W \mathrm W^\top$ are symmetric and positive semidefinite and the trace of the product of two symmetric positive semidefinite matrices is non-negative,

$$\mbox{tr} \left( \left( \mathrm I_d - \mathrm W \mathrm W^\top \right) \mathrm A \right) \geq 0$$

Using the cyclic property of the trace, $\mbox{tr} \left( \mathrm W \mathrm W^\top \mathrm A \right) = \mbox{tr} \left( \mathrm W^\top \mathrm A \mathrm W \right)$. Hence,

$$\color{blue}{\mbox{tr} (\mathrm A) \geq \mbox{tr} \left( \mathrm W^\top \mathrm A \mathrm W \right)}$$
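
A quick numerical sanity check of the two facts used above (a sketch in numpy; the sizes and the QR-based construction of $W$ are my own choices, not part of the question):

```python
import numpy as np

rng = np.random.default_rng(0)
d, c = 10, 3

# Random PSD matrix A and a W with orthonormal columns (via QR)
A = rng.standard_normal((d, d))
A = A @ A.T
W, _ = np.linalg.qr(rng.standard_normal((d, c)))

P = np.eye(d) - W @ W.T  # projector onto the orthogonal complement of col(W)

# tr((I - W W^T) A) >= 0, up to floating-point noise
print(np.trace(P @ A) >= -1e-10)

# cyclic property: tr(W W^T A) = tr(W^T A W)
print(np.isclose(np.trace(W @ W.T @ A), np.trace(W.T @ A @ W)))
```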


Here are two ways to see that this holds that I find intuitive.

One option: for any orthonormal basis $\{w_1,\dots,w_d\}$ of $\Bbb R^d$, we have $$ \operatorname{tr}(A) = \sum_{i=1}^d w_i^TAw_i; $$ this corresponds to how the trace of an operator is defined for operators over an inner product space. With that, let $\{w_1,\dots,w_d\}$ be an orthonormal basis such that $w_1,\dots,w_c$ are the columns of $W$. Since $A$ is positive semidefinite, each term $w_i^TAw_i$ is non-negative, so $$ \operatorname{tr}(W^TAW) = \sum_{i=1}^c w_i^T Aw_i \leq \sum_{i=1}^d w_i^TAw_i = \operatorname{tr}(A). $$
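
A small numpy check of this identity (a sketch; the random construction is only for illustration): the per-vector terms $w_i^TAw_i$ sum to $\operatorname{tr}(A)$ over a full orthonormal basis, and any partial sum of these non-negative terms is no larger.

```python
import numpy as np

rng = np.random.default_rng(1)
d, c = 8, 3

A = rng.standard_normal((d, d))
A = A @ A.T  # PSD, so each w_i^T A w_i >= 0

# A full orthonormal basis of R^d; its first c columns play the role of W
Q, _ = np.linalg.qr(rng.standard_normal((d, d)))
terms = np.array([Q[:, i] @ A @ Q[:, i] for i in range(d)])

print(np.isclose(terms.sum(), np.trace(A)))    # tr(A) = sum_i w_i^T A w_i
print(terms[:c].sum() <= terms.sum() + 1e-10)  # dropping terms only decreases it
```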


Alternatively: again, we take $w_1,\dots,w_d$ to be an orthonormal basis that extends the orthonormal columns of $W$. Let $V$ be the matrix with columns $w_{c+1},\dots,w_{d}$, and let $\tilde W = [W \ \ V]$. Note that $\tilde W$ is an orthogonal matrix, so that $\tilde W^T A \tilde W$ is (positive semidefinite and) similar to $A$, and therefore has the same trace. On the other hand, $$ \tilde W^T A \tilde W = \pmatrix{W^T \\ V^T} A \pmatrix{W & V} = \pmatrix{W^TAW & W^TAV\\ V^TAW & V^TAV}. $$ In other words, $W^TAW$ is essentially a submatrix of $A$. With that, and since $V^TAV$ is positive semidefinite (so its trace is non-negative), we see that $$ \operatorname{tr}(A) = \operatorname{tr}(W^TAW) + \operatorname{tr}(V^TAV) \geq \operatorname{tr}(W^TAW). $$
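
The block decomposition can be verified numerically as well (again a sketch, with arbitrary sizes):

```python
import numpy as np

rng = np.random.default_rng(2)
d, c = 8, 3

A = rng.standard_normal((d, d))
A = A @ A.T

# Split an orthogonal matrix into W (first c columns) and V (the rest)
Q, _ = np.linalg.qr(rng.standard_normal((d, d)))
W, V = Q[:, :c], Q[:, c:]

# tr(A) = tr(W^T A W) + tr(V^T A V), with both summands non-negative
print(np.isclose(np.trace(A), np.trace(W.T @ A @ W) + np.trace(V.T @ A @ V)))
print(np.trace(W.T @ A @ W) <= np.trace(A) + 1e-10)
```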

Ben Grossmann

Since $A$ is symmetric positive semidefinite, it is orthogonally diagonalizable with non-negative eigenvalues, so we may assume $A=\operatorname{diag}(a_1, \dots, a_d)$ with each $a_k \geq 0$. Write $W=(w_{ij})$. Direct calculation shows that $$\operatorname{Trace}(W^TAW)=\sum_{i=1}^{c} \sum_{k=1}^{d}w_{ki} a_k w_{ki}=\sum_{k=1}^{d} a_k \left( \sum_{i=1}^{c}w_{ki}^2 \right)$$

Thus, since each $a_k \geq 0$, it suffices to show $\sum_{i=1}^{c}w_{ki}^2 \leq 1$ for each $k$.

To see this, note that $\sum_{k=1}^{d}w_{ki} w_{kj} = \delta_{ij}$ as $W^T W = I$. In other words, the columns $[W]^{1}, \dots, [W]^{c}$ are orthonormal. Since $c<d$, we can find $v_{c+1}, \dots, v_{d}$ in $\mathbb R^d$ such that $[W]^1, \dots, [W]^c, v_{c+1}, \dots, v_{d}$ is an orthonormal basis of $\mathbb R^{d}$. Let $Z$ be the matrix having these vectors as columns. Then $Z^TZ = I$, and thus $ZZ^T=I$, i.e. $Z^T$ is also an orthogonal matrix. In particular, each row of $Z$ has length $1$. Since the first $c$ entries of row $k$ of $Z$ are $w_{k1}, \dots, w_{kc}$, it follows that $\sum_{i=1}^{c}w_{ki}^2 \leq 1$.
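
Numerically, the key step looks like this (a sketch; $W$ is built by QR just for illustration): each row sum $\sum_{i=1}^{c} w_{ki}^2$ is at most $1$, and the row sums total $c = \operatorname{tr}(W^TW)$.

```python
import numpy as np

rng = np.random.default_rng(3)
d, c = 10, 4

W, _ = np.linalg.qr(rng.standard_normal((d, c)))  # W^T W = I_c

row_sums = (W ** 2).sum(axis=1)       # sum_i w_ki^2 for each row k
print(np.all(row_sums <= 1 + 1e-10))  # each row sum is at most 1
print(np.isclose(row_sums.sum(), c))  # together they sum to c
```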

luxerhia

Let $$M = \left( \begin{array}{cc} I_d & W \\[6pt] W^\mathsf{T} & I_c \\ \end{array} \right). $$ By the Schur complement (see [1], or Propositions 2.1 and 2.2 in [2]), we have $$I_c - W^\mathsf{T}W = 0 \quad \Longrightarrow \quad M \succeq 0 \quad \Longrightarrow \quad I_d - WW^\mathsf{T} \succeq 0. $$ Then we have $$\mathrm{Tr}(A) - \mathrm{Tr}(W^\mathsf{T}AW) = \mathrm{Tr}(A) - \mathrm{Tr}(AWW^\mathsf{T}) = \mathrm{Tr}(A(I_d - WW^\mathsf{T})) \ge 0,$$ where we have used the fact that $\mathrm{Tr}(BC)\ge 0$ for two real symmetric positive semidefinite matrices $B, C$ (it is easy to prove; or see page 329 of "Matrix Algebra" by K. M. Abadir and J. R. Magnus, 2005). We are done.
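
The two semidefiniteness claims are easy to check numerically (a sketch; small negative eigenvalues can appear from floating-point error, hence the tolerance):

```python
import numpy as np

rng = np.random.default_rng(4)
d, c = 10, 4

W, _ = np.linalg.qr(rng.standard_normal((d, c)))  # W^T W = I_c

# The block matrix M = [[I_d, W], [W^T, I_c]]
M = np.block([[np.eye(d), W], [W.T, np.eye(c)]])

print(np.linalg.eigvalsh(M).min() >= -1e-10)                    # M >= 0
print(np.linalg.eigvalsh(np.eye(d) - W @ W.T).min() >= -1e-10)  # I_d - W W^T >= 0
```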

References

[1] https://en.wikipedia.org/wiki/Schur_complement

[2] https://www.cis.upenn.edu/~jean/schur-comp.pdf

River Li