
Let $A, B \in \mathbb R^{m\times m}$ be symmetric positive semi-definite matrices. Is it true that $$\sup_{\|x\| = 1} \left| \|Ax\| - \|Bx\| \right| \geq c(m) \|A-B\|,$$ with $c(m) > 0$ and where $\|\cdot\|$ denotes the 2-norm?

Here is how I approached the problem. Let us introduce the notation $\Delta = A - B$ for convenience. Without loss of generality (exchanging $A$ and $B$ if necessary, and rescaling), we can assume that $B$ is diagonal, that $\|\Delta\| = 1$, and that $\Delta$ has an eigenvalue equal to $+1$.

We have $$ \begin{align} \|Ax\| = \sqrt{x^T (B+\Delta)^2 x} &= \sqrt{x^T B^2 x + x^T (B \Delta + \Delta B) x + x^T \Delta^2 x} \\ &= \sqrt{\left(\sqrt {x^T B^2 x} + \sqrt{x^T \Delta^2 x}\right)^2 - 2 \left(\sqrt{x^T B^2 x} \sqrt{x^T \Delta^2 x} - x^T B \Delta x\right)}. \end{align} $$ By the Cauchy–Schwarz inequality, the expression in the second pair of parentheses is non-negative, so, using $\|Ax\| + \|Bx\| \leq 2\left(\sqrt{x^T B^2 x} + \sqrt{x^T \Delta^2 x}\right)$, $$ \begin{align} \sup_{\|x\| = 1} \, \left| \|Ax\| - \|Bx\| \right| &\geq \sup_{\|x\| = 1}\frac{|2 \, x^T B \Delta x + x^T \Delta^2 x|}{2(\sqrt {x^T B^2 x} + \sqrt{x^T \Delta^2 x})}. \end{align} $$ From here things get more complicated, and any help would be appreciated. Of course, there might be other approaches to the problem.
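
As a sanity check (not a proof), a quick numerical experiment with random positive semi-definite matrices suggests the ratio stays bounded away from zero, and by the reverse triangle inequality it is at most $1$. Here is a minimal numpy sketch; the helper name `sup_gap` and the Monte-Carlo estimate of the supremum are my own choices:

```python
import numpy as np

rng = np.random.default_rng(0)

def sup_gap(A, B, trials=2000):
    """Monte-Carlo lower estimate of sup_{||x||=1} | ||Ax|| - ||Bx|| |."""
    X = rng.standard_normal((trials, A.shape[0]))
    X /= np.linalg.norm(X, axis=1, keepdims=True)
    return np.max(np.abs(np.linalg.norm(X @ A, axis=1)
                         - np.linalg.norm(X @ B, axis=1)))

m = 3
ratios = []
for _ in range(200):
    M1, M2 = rng.standard_normal((2, m, m))
    A, B = M1 @ M1.T, M2 @ M2.T          # random symmetric PSD matrices
    denom = np.linalg.norm(A - B, 2)     # spectral norm of A - B
    if denom > 1e-8:
        ratios.append(sup_gap(A, B) / denom)
print(min(ratios) > 0, max(ratios) <= 1 + 1e-9)   # prints: True True
```

Of course this only probes random instances; the question is whether the infimum of this ratio over all PSD pairs is positive.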

Thank you.

Bonus question: Is the statement true for positive self-adjoint operators on a Hilbert space?

3 Answers


Here's a proof of the inequality. Start by setting $\epsilon=\lVert A-B\rVert > 0$. I am using the operator norm on $\mathbb{R}^{m\times m}$, so this is the maximum absolute eigenvalue of $A-B$. Exchanging $A$ and $B$ if necessary, there exists a unit vector $e_1$ with $(B-A)e_1=\epsilon e_1$. Diagonalising the bilinear form $(x,y)\mapsto x^TAy$ on the orthogonal complement of $e_1$, we extend $e_1$ to an orthonormal basis $(e_1,e_2,\ldots,e_m)$ with respect to which $$ A = \left(\begin{array}{ccccc}a_1&-u_2&-u_3&\cdots&-u_m\\ -u_2&a_2&0&\cdots&0\\ -u_3&0&a_3&\cdots&0\\ \vdots&\vdots&\vdots&\ddots&\vdots\\ -u_m&0&0&\cdots&a_m\end{array}\right). $$ Positive semidefiniteness of $A$ gives $\sum_{k=2}^mu_k^2/a_k\le a_1$ (any terms with $a_k=0$ necessarily have $u_k=0$, and I am setting the corresponding ratio to zero). This inequality follows from $x^TAx\ge0$ where $x_1=1$ and $x_k=u_k/a_k$ for $k\ge2$.
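
Incidentally, for an arrow-shaped matrix of this form with all $a_k>0$, the displayed inequality is not merely necessary but equivalent to positive semidefiniteness (it is the Schur complement condition for the diagonal block). A quick numerical check of that equivalence; the construction `arrow` and the tolerances are mine:

```python
import numpy as np

rng = np.random.default_rng(1)

def arrow(a, u):
    """Arrow matrix: diag(a) with first row/column (a_1, -u_2, ..., -u_m)."""
    A = np.diag(a)
    A[0, 1:] = -u
    A[1:, 0] = -u
    return A

matches = []
for _ in range(1000):
    m = 5
    a = rng.uniform(0.1, 2.0, size=m)       # strictly positive diagonal
    u = rng.uniform(-1.0, 1.0, size=m - 1)
    A = arrow(a, u)
    psd = np.linalg.eigvalsh(A).min() >= -1e-9
    schur = a[0] - np.sum(u**2 / a[1:]) >= -1e-9   # Schur complement condition
    matches.append(psd == schur)
print(all(matches))
```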

Now choose $\delta_0>0$ small enough, with the precise value to be fixed later. We have $m$ pairwise disjoint intervals $(\delta^2,\delta)$ for $\delta=\delta_0^{2^k}$ ($k=0,1,\ldots,m-1$), but only the $m-1$ ratios $\lvert u_k\rvert/a_k$ to avoid, so at least one of these intervals is disjoint from $\{\lvert u_k/a_k\rvert\colon k=2,\ldots,m\}$; fix such a $\delta$. Let $S$ be the set of $k=2,\ldots,m$ with $a_k=0$ or $\lvert u_k\rvert/a_k\ge\delta$, and $S^\prime=\{2,\ldots,m\}\setminus S$, so that $\lvert u_k\rvert/a_k\le\delta^2$ for $k\in S^\prime$. Define $x\in\mathbb{R}^m$ by $x_1=1$ and $$ x_k=\begin{cases} 0,&k\in S,\\ u_k/a_k,&k\in S^\prime. \end{cases} $$ We can compute $$ (Ax)_1=a_1-\sum_{k\in S^\prime}u_k^2/a_k\ge\sum_{k\in S}u_k^2/a_k\ge\delta\sum_{k\in S}\lvert u_k\rvert\ge\delta u, $$ where $u=\sqrt{\sum_{k\in S}u_k^2}$. And, $(Ax)_k=-u_k$ for $k\in S$ and $(Ax)_k=0$ for $k\in S^\prime$.

If we define $C\in\mathbb{R}^{m\times m}$ by $C_{11}=B_{11}=A_{11}+\epsilon$ and $C_{ij}=A_{ij}$ otherwise, then $$ \lVert Cx\rVert=\sqrt{((Ax)_1+\epsilon)^2+u^2}. $$ As $(Ax)_1\ge\delta u$, this implies $$ \lVert Cx\rVert\ge\frac{\epsilon\delta}{\sqrt{1+\delta^2}}+\lVert Ax\rVert. $$ Here, I have used the simple fact that the derivative $(d/da)\sqrt{a^2+u^2}=a/\sqrt{a^2+u^2}$ is at least $\delta/\sqrt{1+\delta^2}$ over $a\ge\delta u$. We also have $\lVert C-B\rVert\le\epsilon$: since $e_1$ is an eigenvector of the symmetric matrix $A-B$ with eigenvalue $-\epsilon$, the matrix $C-B=(A-B)+\epsilon e_1e_1^T$ vanishes on $e_1$ and agrees with $A-B$ on the invariant subspace $e_1^\perp$. Also, as $\lvert x_k\rvert\le\delta^2$ for $k=2,\ldots,m$, $$ \left\lvert\lVert Cx\rVert-\lVert Bx\rVert\right\rvert \le\lVert(C-B)x\rVert\le\epsilon\delta^2\sqrt{m-1}. $$ Hence, $$ \lVert Bx\rVert-\lVert Ax\rVert\ge\frac{\epsilon\delta}{\sqrt{1+\delta^2}}-\epsilon\delta^2\sqrt{m-1}. $$ So, as we have $\lVert x\rVert\le\sqrt{1+(m-1)\delta^4}$ and $\epsilon=\lVert A-B\rVert$, $$ \sup_{\lVert x\rVert=1}\left\lvert\lVert Ax\rVert-\lVert Bx\rVert\right\rvert\ge\left(\frac{1}{\sqrt{1+\delta^2}}-\delta\sqrt{m-1}\right)\frac{\delta}{\sqrt{1+(m-1)\delta^4}}\lVert A-B\rVert. $$ As $\delta$ lies in the range $\delta_0^{2^{m-1}}\le\delta\le\delta_0$, $$ \sup_{\lVert x\rVert=1}\left\lvert\lVert Ax\rVert-\lVert Bx\rVert\right\rvert\ge\left(\frac{1}{\sqrt{1+\delta_0^2}}-\delta_0\sqrt{m-1}\right)\frac{\delta_0^{2^{m-1}}}{\sqrt{1+(m-1)\delta_0^4}}\lVert A-B\rVert. $$ Choosing $\delta_0$ small enough to make the multiplier on the right-hand side positive, which can be done independently of $A$ and $B$ (e.g., $\delta_0=1/(2\sqrt{m})$), gives the result.
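
As an illustration, the final constant can be evaluated and tested against random instances. This sketch (with my own helper names; the supremum is only estimated by Monte Carlo, so it gives a lower bound) checks that the constant is positive and that random PSD pairs satisfy the bound:

```python
import numpy as np

rng = np.random.default_rng(2)

def c_of_m(m):
    """The explicit constant at the end of the proof, with delta_0 = 1/(2*sqrt(m))."""
    d0 = 1.0 / (2.0 * np.sqrt(m))
    return ((1.0 / np.sqrt(1.0 + d0**2) - d0 * np.sqrt(m - 1))
            * d0**(2**(m - 1)) / np.sqrt(1.0 + (m - 1) * d0**4))

def sup_gap(A, B, trials=5000):
    """Monte-Carlo lower estimate of sup_{||x||=1} | ||Ax|| - ||Bx|| |."""
    X = rng.standard_normal((trials, A.shape[0]))
    X /= np.linalg.norm(X, axis=1, keepdims=True)
    return np.max(np.abs(np.linalg.norm(X @ A, axis=1)
                         - np.linalg.norm(X @ B, axis=1)))

m = 3
c = c_of_m(m)                            # positive, though already small
ok = c > 0
for _ in range(100):
    M1, M2 = rng.standard_normal((2, m, m))
    A, B = M1 @ M1.T, M2 @ M2.T          # random PSD pair
    ok &= sup_gap(A, B) >= c * np.linalg.norm(A - B, 2)
print(ok)   # prints True
```

Note the double-exponential factor $\delta_0^{2^{m-1}}$ makes this constant decay extremely fast in $m$; the proof does not claim it is sharp.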


Here is the result of my attempts at concluding the proof, using the same approach as the one presented in the question. Without loss of generality, we can assume that the eigenvalues of $B$ occur in descending order on the diagonal. From the last equation in the question, we have: \begin{equation*} \begin{aligned} \left|\sqrt{x^T (B+\Delta)^2 x} - \sqrt{x^T B^2 x}\right| &\geq \frac{\left|2 \, x^T B \Delta x + x^T \Delta^2 x\right|}{2(\sqrt {x^T B^2 x} + \sqrt{x^T \Delta^2 x})} \\ &\geq \frac{2 \, x^T B \Delta x + x^T \Delta^2 x}{2(\sqrt {x^T B^2 x} + 1)} =: R_{B,\Delta}(x). \end{aligned} \end{equation*} We show that there exists $c(m)$ such that for any $B,\Delta$ as above, there exists a unit vector $x \in \mathbb R^m$ such that $R_{B,\Delta}(x) \geq c(m)$. Assume for contradiction that there exist sequences $(B_i)_{i \in \mathbb N}$ and $(\Delta_i)_{i\in\mathbb N}$ such that $\sup_{\|x\|=1} R_{B_i,\Delta_i}(x) \to 0$, and let $x_i$ denote a unit eigenvector corresponding to the eigenvalue $1$ of $\Delta_i$. Passing to a subsequence if necessary, we can assume that $x_i \to x_\infty$. By assumption, \begin{equation*} R_{B_i,\Delta_i}(x_i) = \frac{2 \, x_i^T B_i x_i + 1}{2\sqrt {x_i^T B_i^2 x_i} + 2} \to 0 \Rightarrow \frac{x_i^T B_i x_i}{\sqrt {x_i^T B_i^2 x_i}} \to 0 \Rightarrow x_i^T \left(\frac{B_i}{\|B_i\|}\right) x_i \to 0, \end{equation*} so in particular the first component of $x_\infty$ is equal to $0$ (because $B_i/\|B_i\|$ is a diagonal matrix whose first diagonal entry is equal to $1$). Let now $(y_i)_{i\in\mathbb N}$ be the sequence obtained from $(x_i)_{i\in\mathbb N}$ by replacing the first component by $0$. By assumption, \begin{equation*} \begin{aligned} R_{B_i,\Delta_i}(y_i) &= \frac{(B_i\,y_i)^T(\Delta_i y_i)}{\|B_i y_i\| + 1}\, + \,\frac{\|\Delta_i y_i\|^2}{2 \|B_i y_i\| + 2} \\ &= \underbrace{\frac{y_i^TB_i\,y_i}{\|B_i y_i\| + 1}}_{\text{$\geq 0$}}\, + \underbrace{\frac{(B_i\,y_i)^T\,}{\|B_i y_i\| + 1}}_{\text{vector of norm less than 1}}\,\underbrace{(\Delta_i y_i - y_i)}_{\to 0} + \,\underbrace{\frac{\|\Delta_i y_i\|^2}{2 \|B_i y_i\| + 2}}_{\geq 0} \to 0. \end{aligned} \end{equation*} Since $(\Delta_i y_i - y_i) \to 0$ (because the first component of $x_\infty$ is $0$), all terms must tend to $0$. In particular, $$\frac{\|\Delta_i y_i\|^2}{2 \|B_i y_i\| + 2} \to 0,$$ so $\|B_i y_i\| \to \infty$, because $\|\Delta_i y_i\| \to 1$. In addition, $$\frac{y_i^TB_i\,y_i}{\|B_i y_i\| + 1} \to 0,$$ and since $\|B_i\,y_i\| \to \infty$ this implies that $$\frac{y_i^T B_i y_i}{\|B_i y_i\|} \to 0,$$ so the second component of $x_\infty$ is $0$ too. Continuing this way, we obtain $x_\infty = 0$, which is a contradiction because $x_i \to x_\infty$ and $\|x_i\| = 1$.
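
For what it's worth, the pointwise bound $\left|\,\|Ax\|-\|Bx\|\,\right| \ge R_{B,\Delta}(x)$ underlying this argument can be spot-checked numerically. A small numpy sketch; the rescaling that enforces $\|\Delta\|=1$ and the function name `R` are my own:

```python
import numpy as np

rng = np.random.default_rng(4)

def R(B, Delta, x):
    """R_{B,Delta}(x); a valid lower bound for unit x when ||Delta|| = 1."""
    num = 2 * (x @ B @ Delta @ x) + x @ Delta @ Delta @ x
    return num / (2 * (np.sqrt(x @ B @ B @ x) + 1))

ok = True
for _ in range(500):
    m = 4
    M1, M2 = rng.standard_normal((2, m, m))
    A, B = M1 @ M1.T, M2 @ M2.T          # random PSD pair
    s = np.linalg.norm(A - B, 2)
    A, B = A / s, B / s                  # rescale so that ||Delta|| = 1
    Delta = A - B
    x = rng.standard_normal(m)
    x /= np.linalg.norm(x)
    gap = abs(np.linalg.norm(A @ x) - np.linalg.norm(B @ x))
    ok &= gap >= R(B, Delta, x) - 1e-10
print(ok)   # prints True
```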


I am not yet satisfied with the following approach, so if you find a bug in it, please don't punch me!

It can be shown that if $A,B$ are symmetric matrices, then $$\|Ax\|\cdot \|Bx\|\ge\frac 12 x^T(AB+BA)x,\label{*}\tag{*}$$ because $\|u\|\, \|v\|\ge u\cdot v$ implies $\|Ax\|\cdot \|Bx\|\ge (Ax)^T Bx = x^T ABx$; since this quantity also equals $x^T BAx$ (take transposes), averaging the two gives the form above.

The left-hand side of your inequality can be written as: $$\sup_{\|x\| = 1} \left|\,\|Ax\| - \|Bx\|\,\right|=\sup_{\|x\| = 1}\left\{\|Ax\|^2 + \|Bx\|^2-2\|Ax\|\cdot \|Bx\|\right\}^{1/2}$$ And by the definition of the matrix norm, for the right-hand side: $$\begin{align}\|A-B\|&=\sup_{\|x\| = 1}\|(A-B)x\|\\ &=\sup_{\|x\| = 1}\left\{x^T(A-B)^T(A-B)x\right\}^{1/2}\\ &=\sup_{\|x\| = 1}\left\{x^TA^T Ax+x^T B^T Bx-x^T(A^T B+B^TA)x\right\}^{1/2}\\ &=\sup_{\|x\| = 1}\left\{\|Ax\|^2 + \|Bx\|^2-x^T(AB+BA)x\right\}^{1/2}\end{align}$$ Now you can relate the two sides of the inequality using $\eqref{*}$.
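
Both ingredients, the inequality $\eqref{*}$ and the expansion of $\|(A-B)x\|^2$, are easy to verify numerically for symmetric (not necessarily PSD) matrices. A quick sketch, with my own tolerances:

```python
import numpy as np

rng = np.random.default_rng(3)

m = 4
ok = True
for _ in range(500):
    S, T = rng.standard_normal((2, m, m))
    A, B = S + S.T, T + T.T              # symmetric, not necessarily PSD
    x = rng.standard_normal(m)
    x /= np.linalg.norm(x)
    # inequality (*): ||Ax|| ||Bx|| >= (1/2) x^T (AB + BA) x
    ok &= (np.linalg.norm(A @ x) * np.linalg.norm(B @ x)
           >= 0.5 * x @ (A @ B + B @ A) @ x - 1e-10)
    # the expansion used for ||A - B||:
    expand = (np.linalg.norm(A @ x)**2 + np.linalg.norm(B @ x)**2
              - x @ (A @ B + B @ A) @ x)
    ok &= abs(np.linalg.norm((A - B) @ x)**2 - expand) < 1e-8
print(ok)   # prints True
```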

polfosol
    I think you are proving the wrong inequality? If you subtract less you get more... – user60589 Mar 07 '17 at 10:09
  • @user60589 Yes, it seems I proved the contrary. That's why I asked about $c(m)$ – polfosol Mar 07 '17 at 10:13
    You could also see this by reverse triangle inequality. But the other inequality is more interesting, since it would give an equivalence of metrics as Omnomnomnom said. – user60589 Mar 07 '17 at 10:20
  • @user60589 We know for sure that $c(m)<1$. For every positive reals $a,b$ if $a<b$ then there exists a positive $c<1$ such that $$a>cb$$ and still the question is vague... – polfosol Mar 07 '17 at 10:29
  • We need one $c(m)$ for all symmetric positive semi-definite $A,B$. – user60589 Mar 07 '17 at 10:32