4

The Cauchy-Schwarz inequality states that

$$\left(\sum_{i=1}^n x_i y_i\right)^2\leq \left(\sum_{i=1}^n x_i^2\right) \left(\sum_{i=1}^n y_i^2\right).$$

The usual proof, via the discriminant argument, is easy to understand; however, it does not really (in my opinion) provide any intuitive justification for why the inequality should be true.

Note that similar questions have been posted here and here; however, they do not help me because I have not yet studied linear algebra. For the same reason, an ideal answer would use only (high school) algebra and, if necessary, calculus.

  • 5
    The inequality means that projection of a vector onto another vector (direction) is at most the length of the projected vector. – A.S. Feb 03 '16 at 17:42
  • For two vectors $\vec v,\vec w$ for which the angle between them is $\theta$, you have $\vec v\cdot\vec w = |\vec v||\vec w|\cos\theta$, so $|\vec v\cdot\vec w| = |\vec v||\vec w||\cos\theta| \le |\vec v||\vec w|$ since $\left|\cos\theta\right|\le 1$. That's what I think of when I think of the "intuition" involved in Cauchy--Schwarz. But I'm not sure if that's what you're looking for. $\qquad$ – Michael Hardy Feb 03 '16 at 18:52

4 Answers

4

Here is the geometric intuition behind this for $n = 3$. You need not know much linear algebra, but you do need to know about vectors.

Given two vectors $x = (x_1,x_2,x_3)$ and $y = (y_1,y_2,y_3)$, we define their dot product $x \cdot y$ to be $|x||y|\cos \theta$, where $|x|$ and $|y|$ are the lengths of the vectors $x$ and $y$, and $\theta$ is the angle between them. (In theory, this definition may be slightly circular, since at an advanced level dot products are usually used to define angles. But if you accept the idea of an angle as intuitively meaningful, we needn't worry about this technicality.)

The dot product $x \cdot y$ is also given by the formula $x_1y_1 + x_2 y_2 + x_3 y_3$.

Then the Cauchy-Schwarz inequality is exactly equivalent to the statement that $|\cos \theta| \leq 1$.
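
If it helps to see this numerically, here is a minimal sketch (using NumPy; the particular vectors are arbitrary) checking that the cosine computed from the coordinate formula stays in $[-1,1]$, which is exactly the Cauchy-Schwarz inequality:

```python
import numpy as np

# Two arbitrary vectors in R^3 (any choice works).
x = np.array([1.0, -2.0, 3.0])
y = np.array([4.0, 0.5, -1.0])

dot = float(np.dot(x, y))  # x1*y1 + x2*y2 + x3*y3
cos_theta = dot / (np.linalg.norm(x) * np.linalg.norm(y))

print(abs(cos_theta) <= 1)                    # True: |cos(theta)| <= 1
print(dot**2 <= np.dot(x, x) * np.dot(y, y))  # True: Cauchy-Schwarz
```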

An alternative interpretation without angles in general, but using perpendicularity, is the one given in A.S.'s comment.

If you'd like to see the details of this, have a look either at Chapter 12 of Apostol's Calculus or at Chapter 1 of Lang's Introduction to Linear Algebra.

Edit I can try to give a very imperfect algebraic "interpretation" of the inequality. I'm not convinced this is the best one, so I'll keep thinking about it.

If you look at the inequality $$\left(\sum_{i=1}^n x_i y_i\right)\left(\sum_{i=1}^n x_i y_i\right)\leq \left(\sum_{i=1}^n x_i^2\right) \left(\sum_{i=1}^n y_i^2\right),$$ note first that the general inequality follows from the special case where all the $x_i$'s and $y_i$'s are nonnegative, since $|\sum x_i y_i| \leq \sum |x_i||y_i|$. Next think about how the $x_i$'s and $y_i$'s match up. The inequality says that to make a product like the LHS or RHS as large as possible, it is better to match the numbers $x_i$ and $y_i$ with themselves than with each other. This makes sense term by term: if you replace the two cross terms $x_1 y_1 + x_1 y_1$ by $x_1^2 + y_1^2$, you come out ahead, since $x_1^2 + y_1^2 - 2x_1 y_1 = (x_1 - y_1)^2 \geq 0$. Obviously, this is not a proof in any way, but it does make the inequality plausible.
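
As a small numerical illustration of this matching heuristic (a sketch only; the random nonnegative vectors and the dimension are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.random(10)   # nonnegative entries, as in the reduction above
y = rng.random(10)

lhs = np.dot(x, y) ** 2            # terms matched "across": (sum x_i y_i)^2
rhs = np.dot(x, x) * np.dot(y, y)  # terms matched "with themselves"
print(lhs <= rhs)                  # True

# Term-by-term version of the same idea: 2*x_i*y_i <= x_i^2 + y_i^2.
print(np.all(2 * x * y <= x**2 + y**2))  # True
```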

David
  • 6,306
  • Interestingly, I am reading Apostol's Calculus; he brought up the inequality much earlier than chapter 12. – MathematicsStudent1122 Feb 03 '16 at 18:01
  • Yes, as I recall, it appears in an exercise on proof by induction. Vectors are discussed only later in the book. – David Feb 03 '16 at 18:04
  • You need to put some restrictions on $\theta$ to ensure uniqueness. I think $\pi > \theta \geq 0$ works. – goblin GONE Feb 03 '16 at 18:16
  • @goblin You mean if I use the dot product to define angles, right? Then that works. But here I'm going in the opposite direction: I assume we already know what angles are. – David Feb 03 '16 at 18:18
  • Yep, sorry. Didn't read it carefully enough! – goblin GONE Feb 03 '16 at 18:19
  • @David I can't help but wonder if there's some interesting combinatorics involved. What do you think? – MathematicsStudent1122 Feb 05 '16 at 00:59
  • I'm not sure if there is a way to do it through combinatorics. I think the last word on what can be achieved by rearranging terms may be given by Lagrange's identity: https://en.wikipedia.org/wiki/Lagrange%27s_identity – David Feb 05 '16 at 04:30
1

We can check that $$\left(\sum_{i=1}^na_i^2\right)\left(\sum_{i=1}^nb_i^2\right) - \left(\sum_{i=1}^na_ib_i\right)^2 =\ \sum_{1\leqslant i<j\leqslant n}(a_ib_j-a_jb_i)^2 \geqslant 0 .$$
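
For what it's worth, here is a quick numerical check of this identity (Lagrange's identity) on a random example; the dimension and values are arbitrary:

```python
import numpy as np
from itertools import combinations

rng = np.random.default_rng(1)
a = rng.standard_normal(5)
b = rng.standard_normal(5)

lhs = np.dot(a, a) * np.dot(b, b) - np.dot(a, b) ** 2
rhs = sum((a[i] * b[j] - a[j] * b[i]) ** 2 for i, j in combinations(range(5), 2))

print(np.isclose(lhs, rhs))  # True: the difference is a sum of squares, hence >= 0
```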

username
  • 532
1

The Cauchy-Schwarz inequality follows from Pythagoras' Theorem. If $v = 0$ the inequality is trivial (both sides are zero), so suppose $u$ and $v$ are two vectors with $\|v\| \ne 0$, and let $w = v/\|v\|$.

Let $a = (u\cdot w) w$, the projection of $u$ onto $w$. Then the vectors $u$, $a$ and $u-a$ are the three sides of a right triangle whose hypotenuse is $u$. You can see this either by drawing a picture, or by computing $a\cdot(u-a) = 0$.

From Pythagoras: $$ \|a\|^2 + \|u-a\|^2 = \|u\|^2 $$ from which it follows $$ \|a\| \le \|u\| $$ that is $$ |u \cdot w| \le \|u\| $$ or $$ |u \cdot v| \le \|u\| \|v\| .$$
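
Here is a minimal numerical sketch of the projection argument (the specific vectors $u$, $v$ are arbitrary examples):

```python
import numpy as np

u = np.array([2.0, -1.0, 0.5])
v = np.array([1.0, 3.0, -2.0])

w = v / np.linalg.norm(v)   # unit vector in the direction of v
a = np.dot(u, w) * w        # projection of u onto w

print(np.isclose(np.dot(a, u - a), 0.0))   # a and u - a are perpendicular
print(np.isclose(np.dot(a, a) + np.dot(u - a, u - a), np.dot(u, u)))  # Pythagoras
print(abs(np.dot(u, v)) <= np.linalg.norm(u) * np.linalg.norm(v))     # Cauchy-Schwarz
```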

Stephen Montgomery-Smith
  • 26,430
  • 2
  • 35
  • 64
0

The following arguments are taken from the fantastic book "The Cauchy-Schwarz Master Class" by J. Michael Steele.

First, we take a look at a rather similar inequality, which we will see is very close to Cauchy-Schwarz.

For the case of $n=1$, we consider

$$\tag{elementary} xy \leq \frac{x^2}{2} + \frac{y^2}{2}.$$

If we replace $x$ and $y$ with their square roots (as Steele does on page 19f) and multiply by $4$, we arrive at

$$ 4 \sqrt{xy} < 2 \, (x + y), \quad \text{for all nonnegative } x \neq y.$$

Since $\frac{x^2}{2} + \frac{y^2}{2} - xy = \frac{(x-y)^2}{2}$, equality holds exactly when $x = y$, which is excluded here; hence the strict inequality.

Let us fix $a$ and $b$ as side lengths of a rectangle, and let $A = ab$ be its area. Among all side lengths $x$ and $y$ with $xy = A = ab$, the left-hand side $4\sqrt{xy}$ is the perimeter of the square with side $s = \sqrt{xy} = \sqrt{ab}$, while the right-hand side $2(x+y)$ is the perimeter of the rectangle with sides $x$ and $y$. So among all rectangles of a given area, the square has the smallest perimeter.
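
A tiny numerical sketch of this picture (the area $A$ and the sampled side lengths are arbitrary): among all rectangles of area $A$, the square has the smallest perimeter.

```python
import numpy as np

A = 6.0                              # fixed area A = ab = xy
square_perimeter = 4 * np.sqrt(A)    # left-hand side: 4*sqrt(xy)

for x in [0.5, 1.0, 2.0, 3.0, 5.0]:
    y = A / x
    rect_perimeter = 2 * (x + y)     # right-hand side: 2*(x + y)
    print(square_perimeter <= rect_perimeter)  # True; equality only when x = y
```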

The generalisation from the product $xy$ to the dot product $x \cdot y$ is now very close. For $n=3$, the corresponding statement can be interpreted as saying that among all boxes in $\mathbb{R}^3$ with a fixed surface area, the cube has the largest volume.

So the elementary inequality above is easy to interpret. But how do we arrive at Cauchy-Schwarz?

First, Steele (on page 5) adds up the elementary inequality over the $n$ pairs $(x_i, y_i)$ to arrive at

$$ \sum_{i=1}^n x_i \, y_i \leq \frac12 \, \left( \sum_{i=1}^n x_i^2 \, + \, \sum_{i=1}^n y_i^2 \right). \tag{additive}$$

If we pass to the normalised vectors, namely $\tilde{x}_i = \frac{x_i}{\sqrt{\sum_{j=1}^n x_j^2}}$ and $\tilde{y}_i = \frac{y_i}{\sqrt{\sum_{j=1}^n y_j^2}}$, we are able to convert the additive bound into the Cauchy-Schwarz inequality:

$$\tag{CS} \frac{\sum_{i=1}^n x_i \, y_i}{\sqrt{\sum_{j=1}^n x_j^2}\,\sqrt{\sum_{j=1}^n y_j^2}} = \sum_{i=1}^n \tilde{x}_i \, \tilde{y}_i \leq \frac12 \, \left( \sum_{i=1}^n \tilde{x}_i^2 \, + \, \sum_{i=1}^n \tilde{y}_i^2 \right) = 1.$$
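
A brief numerical check of this normalisation step (random vectors, arbitrary dimension): for the normalised vectors the right-hand side of the additive bound equals $1$, and the bound then reads exactly as Cauchy-Schwarz.

```python
import numpy as np

rng = np.random.default_rng(2)
x = rng.standard_normal(8)
y = rng.standard_normal(8)

x_t = x / np.sqrt(np.sum(x**2))   # normalised x (tilde x)
y_t = y / np.sqrt(np.sum(y**2))   # normalised y (tilde y)

additive_rhs = 0.5 * (np.sum(x_t**2) + np.sum(y_t**2))
print(np.isclose(additive_rhs, 1.0))       # True: both normalised sums equal 1
print(np.sum(x_t * y_t) <= additive_rhs)   # additive bound for the normalised vectors
print(abs(np.dot(x, y)) <= np.linalg.norm(x) * np.linalg.norm(y))  # Cauchy-Schwarz
```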

So the Cauchy-Schwarz inequality is recovered from the elementary inequality by making very special choices of $x_i$ and $y_i$. But now we have a neat intuition at hand, and we can imagine why Cauchy-Schwarz appears so often: it is closely related to the fundamental perimeter inequality for boxes and cubes.

mdot
  • 947