
I encountered the following problem while studying ridge regression.

Problem. Let $\{d_j\}_{j=1}^\infty$ be a sequence of positive integers. Let $\{\psi_{i,j}\}_{i,j=1}^\infty$ be a collection of vectors where $\psi_{i,j}\in\mathbb{R}^{d_j}$ and $\|\psi_{i,j}\|_2\le 1$. For positive integers $n_0, n_1$ and $n_2$ ($n_1\ge n_2$), we define \begin{equation} \begin{aligned} \Lambda_1=\sum_{i\in[n_0]}\left(\bigotimes_{j\in[n_1]}\psi_{i,j}\right)\left(\bigotimes_{j\in[n_1]}\psi_{i,j}\right)^\top+\lambda I_1,\\ \Lambda_2=\sum_{i\in[n_0]}\left(\bigotimes_{j\in[n_2]}\psi_{i,j}\right)\left(\bigotimes_{j\in[n_2]}\psi_{i,j}\right)^\top+\lambda I_2, \end{aligned} \end{equation} where $\otimes$ is the Kronecker product, $I_1$ and $I_2$ are identity matrices and $\lambda\in\mathbb{R}_+$. Let $\{\varphi_k\}_{k=1}^\infty$ be a sequence of vectors where $\varphi_k\in\mathbb{R}^{d_k}$ and $\|\varphi_k\|_2\ge 1$, and we define \begin{equation} \phi_1=\bigotimes_{k\in[n_1]} \varphi_k, \quad \phi_2=\bigotimes_{k\in[n_2]} \varphi_k. \end{equation} Then, show that the following holds \begin{equation} \phi_1^\top\Lambda_1^{-1}\phi_1\ge \phi_2^\top\Lambda_2^{-1}\phi_2. \end{equation}

I have tested it numerically on random instances and found no counterexample, so I believe it is true. However, I can only prove the case $n_0=1$. The main idea of my proof for $n_0=1$ is as follows:

  1. Clearly, it suffices to prove that it holds when $n_1=2$ and $n_2=1$.
  2. We diagonalize $\Lambda_1^{-1}$ and $\Lambda_2^{-1}$.
  3. By diagonalization, we can translate $\phi_1^\top\Lambda_1^{-1}\phi_1$ and $\phi_2^\top\Lambda_2^{-1}\phi_2$ into a combination of eigenvalues, and finally obtain the desired inequality.
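
As a sketch of how steps 2 and 3 might go (one possible route, not necessarily the exact computation intended above; $u$, $\phi$ and $\Lambda$ are generic placeholders, and $\lambda>0$ is assumed): for any vector $u$, the matrix $\Lambda = uu^\top+\lambda I$ has eigenvalue $\lambda+\|u\|_2^2$ on the span of $u$ and eigenvalue $\lambda$ on its orthogonal complement, so $$ \Lambda^{-1} = \frac{1}{\lambda}\left(I - \frac{uu^\top}{\lambda+\|u\|_2^2}\right), \qquad \phi^\top\Lambda^{-1}\phi = \frac{1}{\lambda}\left(\|\phi\|_2^2 - \frac{(u^\top\phi)^2}{\lambda+\|u\|_2^2}\right). $$ Taking $u=\psi_{1,1}\otimes\psi_{1,2}$, $\phi=\varphi_1\otimes\varphi_2$ for $\Lambda_1$ and $u=\psi_{1,1}$, $\phi=\varphi_1$ for $\Lambda_2$, and using $(a\otimes b)^\top(c\otimes d)=(a^\top c)(b^\top d)$, both sides of the inequality become explicit scalar expressions in the norms and inner products of the factors.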

I do not know how to handle the case $n_0>1$, since it is fundamentally different: the matrices can no longer be diagonalized easily. Hence, I am stuck. Any help or hint would be appreciated.

Thanks in advance.

Wheel

1 Answer


Assume $n_1 > n_2$ (for $n_1 = n_2$ there is nothing to prove). Let $u_i = \bigotimes_{j=1}^{n_2} \psi_{i,j}$ and $v_i = \bigotimes_{j=n_2+1}^{n_1} \psi_{i,j}$, so that $\bigotimes_{j\in[n_1]} \psi_{i,j} = u_i \otimes v_i$, and let $\Lambda_3 = \Lambda_2 \otimes I_3$, where $I_3$ is the identity matrix such that $I_1 = I_2 \otimes I_3$. By the mixed-product property, $(u_i \otimes v_i)(u_i \otimes v_i)^\top = (u_i u_i^\top) \otimes (v_i v_i^\top)$, so $$ \Lambda_1 = \sum_{i=1}^{n_0} (u_i u_i^\top) \otimes (v_i v_i^\top) + \lambda I_2 \otimes I_3 \quad\text{and}\quad \Lambda_3 = \sum_{i=1}^{n_0} (u_i u_i^\top) \otimes I_3 + \lambda I_2 \otimes I_3 .$$

Denote the Loewner ordering by $\preceq$. Since the only non-zero eigenvalue of $v_i v_i^\top$ is $\lVert v_i \rVert_2^2 = \prod_{j=n_2+1}^{n_1} \lVert \psi_{i,j} \rVert_2^2 \leq 1$, we have $v_i v_i^\top \preceq I_3$. Both $u_i u_i^\top$ and $I_3 - v_i v_i^\top$ are symmetric positive semidefinite, hence diagonalizable with non-negative eigenvalues, and the eigenvalues of their Kronecker product are the pairwise products of their eigenvalues. Therefore $(u_i u_i^\top) \otimes (I_3 - v_i v_i^\top) \succeq 0$, that is, $(u_i u_i^\top) \otimes (v_i v_i^\top) \preceq (u_i u_i^\top) \otimes I_3$. Summing over $i$ gives $\Lambda_1 \preceq \Lambda_3$. Both matrices are positive definite for $\lambda > 0$, so $\Lambda_1^{-1} \succeq \Lambda_3^{-1}$, which is due to a simple modification (dropping the strictness of the Loewner ordering) of this MSE post.

Let $\phi_3 = \bigotimes_{k=n_2+1}^{n_1} \varphi_k$, so that $\phi_1 = \phi_2 \otimes \phi_3$. Then $\lVert \phi_3 \rVert_2^2 = \prod_{k=n_2+1}^{n_1} \lVert \varphi_k \rVert_2^2 \geq 1$, and since $\Lambda_3^{-1} = \Lambda_2^{-1} \otimes I_3$, $$ \phi_1^\top \Lambda_1^{-1} \phi_1 \geq \phi_1^\top \Lambda_3^{-1} \phi_1 = (\phi_2^\top \otimes \phi_3^\top) (\Lambda_2^{-1} \otimes I_3) (\phi_2 \otimes \phi_3) = (\phi_2^\top \Lambda_2^{-1} \phi_2) \lVert \phi_3 \rVert_2^2 \geq \phi_2^\top \Lambda_2^{-1} \phi_2 .$$
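
The argument above can be spot-checked numerically. The following is a minimal sketch, assuming NumPy; the function name `quadratic_forms`, the dimensions, and the sampling ranges are illustrative choices and not part of the original post. It draws random $\psi_{i,j}$ with norm at most 1 and $\varphi_k$ with norm at least 1, builds $\Lambda_1$, $\Lambda_2$ and $\Lambda_3 = \Lambda_2 \otimes I_3$, and returns the two quadratic forms together with the smallest eigenvalue of $\Lambda_3 - \Lambda_1$, which should be non-negative up to rounding.

```python
from functools import reduce

import numpy as np


def kron_all(vectors):
    """Kronecker product of a non-empty list of 1-D arrays, left to right."""
    return reduce(np.kron, vectors)


def quadratic_forms(dims, n0, n2, lam, rng):
    """Return (phi1' L1^-1 phi1, phi2' L2^-1 phi2, min eig of L3 - L1).

    dims : list of d_1, ..., d_{n1} (so n1 = len(dims))
    n0   : number of summands; n2 : number of leading factors kept (1 <= n2 <= n1)
    lam  : ridge parameter, assumed strictly positive
    """
    n1 = len(dims)

    def scaled(d, lo, hi):
        # Random direction in R^d rescaled to a norm drawn from [lo, hi).
        v = rng.standard_normal(d)
        return v * (rng.uniform(lo, hi) / np.linalg.norm(v))

    psi = [[scaled(d, 0.0, 1.0) for d in dims] for _ in range(n0)]  # norms <= 1
    varphi = [scaled(d, 1.0, 2.0) for d in dims]                    # norms >= 1

    def gram(n):
        # Rows are the Kronecker features of the first n factors.
        feats = np.stack([kron_all(row[:n]) for row in psi])
        return feats.T @ feats + lam * np.eye(feats.shape[1])

    lam1, lam2 = gram(n1), gram(n2)
    d3 = lam1.shape[0] // lam2.shape[0]
    lam3 = np.kron(lam2, np.eye(d3))  # Lambda_3 = Lambda_2 (x) I_3

    phi1, phi2 = kron_all(varphi), kron_all(varphi[:n2])
    q1 = phi1 @ np.linalg.solve(lam1, phi1)
    q2 = phi2 @ np.linalg.solve(lam2, phi2)
    min_eig = np.linalg.eigvalsh(lam3 - lam1).min()
    return q1, q2, min_eig


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    for trial in range(5):
        q1, q2, min_eig = quadratic_forms([2, 3, 2], n0=4, n2=1, lam=0.5, rng=rng)
        print(trial, q1 >= q2, min_eig)
```

A passing run is only a sanity check on random instances, not a substitute for the proof; a small tolerance is advisable when comparing `q1` with `q2` or `min_eig` with zero, since both are computed in floating point.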