
general proof for correlation

Hi! I recently found a proof that the correlation coefficient must be at most $1$ in absolute value.

I have two questions:

  1. Why must the discriminant be non-positive? It seems to be related to treating $a$ as the unknown, solving the quadratic equation, and looking at the discriminant. But I don't get why the discriminant has to be smaller than or equal to $0$.

  2. Why do they use $\{a(x-\mu_x)+(y-\mu_y)\}^2$ to prove in general that the coefficient of correlation is $\leq 1$?

Source: A. Papoulis, "Probability, random variables, and stochastic processes", Third Edition, Chap 7. p. 152-153

user190080

1 Answer

  1. A quadratic expression $f(x) = ax^2 + bx + c$ that is always positive cannot have any real roots. The graph of $y = f(x)$ is a parabola, and $f(x) > 0$ means it lies entirely above the $x$-axis. Thus, it never intersects or touches the $x$-axis, which means it has no real roots. This happens exactly when the discriminant $b^2 - 4ac$ is negative. In the case where $f(x) \ge 0$, there can be at most one (repeated) real root, where the parabola touches the $x$-axis, so the discriminant can be $0$ or negative.
  2. This is an indirect method of proof.
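
For point 1, the same conclusion can also be reached algebraically by completing the square. As a sketch, write the quadratic as $pt^2 + qt + r$ with $p > 0$ (the letters $p, q, r, t$ are introduced here only for illustration and are not taken from the book):

$pt^2 + qt + r = p\left(t + \dfrac{q}{2p}\right)^2 + \dfrac{4pr - q^2}{4p}$

The squared term is at least $0$ and vanishes at $t = -\frac{q}{2p}$, so the minimum value of the quadratic is $\frac{4pr - q^2}{4p}$. Hence the quadratic is $\ge 0$ for every real $t$ exactly when $q^2 - 4pr \le 0$, that is, when the discriminant is non-positive.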

In the spirit of simplicity, here is a rewriting of the same proof:

Let $X$ and $Y$ be two random variables, and let $a$ be any real number. Now, $aX + Y$ is a random variable, so its variance satisfies $V(aX + Y) \ge 0$ (variance is always non-negative). Using the properties of variance:

$V(aX + Y) \ge 0\Rightarrow\\ a^2V(X) + 2a\text{Cov}(X, Y) + V(Y) \ge 0$

Now, the LHS is a quadratic function of $a$ that is non-negative for every real $a$ (assuming $V(X) > 0$ and $V(Y) > 0$, so that $\rho$ is defined), so its graph is a parabola that lies on or above the horizontal axis. Thus, it either has complex roots, or has at most one real root, which implies that the discriminant is non-positive:

$[2\text{Cov}(X, Y)]^2 - 4V(X)V(Y) \le 0 \Rightarrow\\ [\text{Cov}(X, Y)]^2 \le V(X)V(Y) \Rightarrow\\ \left[\dfrac{\text{Cov}(X, Y)}{\sqrt{V(X)V(Y)}}\right]^2 \le 1 \Rightarrow\\ \rho^2 \le 1 \Rightarrow\\ -1 \le \rho \le 1$
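
The chain of inequalities can also be checked numerically. The following is only a minimal sketch, assuming NumPy is available; the sample data, the seed, and the variable names are made up for illustration and are not part of the proof. It compares the quadratic in $a$ with the directly computed variance of $aX + Y$, then checks the sign of the discriminant and the bound on $\rho$.

```python
import numpy as np

# Hypothetical sample data, chosen only for illustration.
rng = np.random.default_rng(0)
x = rng.normal(size=1000)
y = 0.6 * x + rng.normal(size=1000)

# Population-style moments (ddof=0), so that Var(aX + Y) expands exactly.
var_x = np.var(x)
var_y = np.var(y)
cov_xy = np.mean((x - x.mean()) * (y - y.mean()))

# The quadratic a^2 V(X) + 2a Cov(X, Y) + V(Y), evaluated on a grid of a.
a_values = np.linspace(-5.0, 5.0, 201)
quadratic = a_values**2 * var_x + 2 * a_values * cov_xy + var_y

# The same quantity computed directly as the variance of aX + Y.
direct = np.array([np.var(a * x + y) for a in a_values])

# Discriminant of the quadratic in a, and the correlation coefficient.
discriminant = (2 * cov_xy) ** 2 - 4 * var_x * var_y
rho = cov_xy / np.sqrt(var_x * var_y)

print(np.allclose(quadratic, direct))  # quadratic matches Var(aX + Y)
print(quadratic.min() >= 0)            # quadratic is non-negative on the grid
print(discriminant <= 0)               # discriminant is non-positive
print(abs(rho) <= 1)                   # correlation lies in [-1, 1]
```

Each check should print `True`. A finite grid of $a$ values only illustrates the inequality, of course; the proof above is what establishes it for every real $a$.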

M. Vinay
  • Why is it a hyperbola? And isn't an indirect method of proof a proof by contradiction, but still a complete proof? Where is the contradiction here? I hope I am right about the definition of indirect proof. – fisher garry Jun 05 '14 at 08:07
  • Parabola, not hyperbola. The graph of a quadratic function is a parabola. If you want to compare it to the standard form that you may be more familiar with, $y = ax^2 + bx + c$ can be written as $y = a\left(x + \frac{b}{2a}\right)^2 + c - \frac{b^2}{4a}$, by completing the square. This is comparable to $y = x^2$, but with shifting (by adding constants to $x$ and $y$) and scaling (by multiplying the squared term by $a$). Shifting and scaling a parabola gives another parabola. – M. Vinay Jun 05 '14 at 08:17
  • By indirect method of proof, I did not mean proof by contradiction. I meant it's not the kind of proof that starts with the LHS and transforms it into the RHS through a sequence of simplifications. Rather it starts from an unrelated result (in this case, a result about the roots of a quadratic equation) and from it, derives the theorem as an implication. – M. Vinay Jun 05 '14 at 08:20
  • Thanks. Since the covariance can be positive, I don't get why it always has to have no roots. Isn't that what we are trying to prove? – fisher garry Jun 05 '14 at 08:34
  • It's not the covariance that "has no roots". It is the quadratic expression in $a$ that is obtained on expanding $E[\lbrace a(x - \mu_x) + (y - \mu_y) \rbrace^2]$. This is the expected value (or mean) of $\lbrace a(x - \mu_x) + (y - \mu_y) \rbrace^2$, which is non-negative, and the mean of a non-negative random variable is also non-negative. (Note that it is "non-negative" ($\ge 0$) rather than strictly positive ($> 0$), so the quadratic may touch zero and can have at most one real root.) – M. Vinay Jun 05 '14 at 08:43