From what I know, covariance is a measure of the degree to which two variables change together. A positive covariance indicates that the variables tend to increase or decrease together, while a negative covariance indicates that one variable tends to increase as the other decreases. Correlation, on the other hand, is a standardized version of covariance (the covariance divided by the product of the two standard deviations) that measures the strength and direction of the linear relationship between two random variables. Correlation ranges from $-1$ to $1$, where $-1$ indicates a perfect negative linear relationship, $0$ indicates no linear relationship, and $1$ indicates a perfect positive linear relationship.
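To make these two definitions concrete for myself, I put together a minimal sketch. It assumes NumPy, the function names `covariance` and `correlation` are my own, and the data are made up purely for illustration. It computes the sample covariance and Pearson's correlation by hand and shows that rescaling one variable changes the covariance but not the correlation.

```python
import numpy as np

def covariance(x, y):
    """Sample covariance (divides by n - 1)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    return np.sum((x - x.mean()) * (y - y.mean())) / (len(x) - 1)

def correlation(x, y):
    """Pearson correlation: covariance divided by both sample standard deviations."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    return covariance(x, y) / (x.std(ddof=1) * y.std(ddof=1))

# Made-up illustrative data (not from any real study)
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])

print("cov(x, y)       =", covariance(x, y))
print("cov(x, 100 y)   =", covariance(x, 100 * y))   # grows with the units of y
print("corr(x, y)      =", correlation(x, y))
print("corr(x, 100 y)  =", correlation(x, 100 * y))  # unchanged by rescaling
```

The hand-written `correlation` should agree with `np.corrcoef(x, y)[0, 1]`, which is the usual way to get the same number from NumPy.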

More generally, correlation does not necessarily imply causation. Correlation only indicates that two variables are statistically associated; it does not specify the nature of that association. It is possible for two variables to be correlated without one causing the other.

(I read the following example somewhere, but unfortunately, I don't remember where exactly!) Ice cream sales and the number of drowning deaths might be positively correlated because both increase during the summer months. However, this does not mean that ice cream causes drowning deaths. In this case, a third variable, the season (hot summer weather), drives both ice cream sales and drowning deaths.
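A small simulation of this situation, as I understand it, might look like the sketch below. Everything in it is an assumption made for illustration: the variable names, the coefficients, and the noise levels are invented, and temperature stands in for "summer". Both quantities depend only on temperature, never on each other, yet their raw correlation comes out strongly positive. After removing the linear effect of temperature from each (a simple partial correlation), the remaining correlation should be close to zero.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5000

# Hypothetical model: temperature drives both quantities independently.
temperature = rng.normal(20.0, 8.0, size=n)
ice_cream = 3.0 * temperature + rng.normal(0.0, 10.0, size=n)
drownings = 0.5 * temperature + rng.normal(0.0, 2.0, size=n)

def residuals(y, x):
    """What is left of y after subtracting its least-squares line in x."""
    slope, intercept = np.polyfit(x, y, 1)
    return y - (slope * x + intercept)

# Raw correlation between the two effects of the common cause
raw_corr = np.corrcoef(ice_cream, drownings)[0, 1]

# Partial correlation: correlate what remains after controlling for temperature
partial_corr = np.corrcoef(
    residuals(ice_cream, temperature),
    residuals(drownings, temperature),
)[0, 1]

print("raw correlation:    ", raw_corr)
print("partial correlation:", partial_corr)
```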

Question: Let's say that under some circumstances the correlation between age and income turns out to be $0.75$. As we know, these two quantitative variables need not necessarily increase or decrease together. However, it seems to me that a value of $0.75$ is large enough to indicate a strong correlation! In other words, does this (relatively high) value tell us nothing about the actual relationship, if any, between age and income?

As mentioned above, I find "correlation" alone to be misleading to some extent. Moreover, in data science, there is the concept of a correlation measure, which is used as a similarity measure in clustering (please correct me if I'm wrong!). Since correlation $\nRightarrow$ causation, is it actually possible to group any two objects together based only on the value of the correlation between them?
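To show what I mean by grouping on correlation alone, here is a rough sketch, assuming NumPy. The function `group_by_correlation`, the threshold of $0.5$, and the synthetic data are all my own inventions, and this is a simple threshold-based (single-linkage style) grouping rather than any particular library's clustering algorithm. Each object is a row of measurements, the distance between two objects is taken to be $1 - r$, and objects are put in the same group whenever they are linked by a chain of distances below the threshold.

```python
import numpy as np

def group_by_correlation(series, threshold=0.5):
    """Label rows so that rows linked by correlation distance < threshold share a label."""
    dist = 1.0 - np.corrcoef(series)   # correlation distance between rows
    n = len(series)
    labels = [-1] * n                  # -1 means "not yet assigned"
    current = 0
    for i in range(n):
        if labels[i] != -1:
            continue
        labels[i] = current
        stack = [i]
        while stack:                   # spread the label to all linked rows
            j = stack.pop()
            for k in range(n):
                if labels[k] == -1 and dist[j, k] < threshold:
                    labels[k] = current
                    stack.append(k)
        current += 1
    return labels

# Synthetic data: two unrelated underlying patterns, two noisy copies of each
rng = np.random.default_rng(1)
base_a = rng.normal(size=200)
base_b = rng.normal(size=200)
series = np.vstack([
    base_a + rng.normal(0.0, 0.1, size=200),
    2.0 * base_a + 5.0 + rng.normal(0.0, 0.1, size=200),
    base_b + rng.normal(0.0, 0.1, size=200),
    0.5 * base_b - 3.0 + rng.normal(0.0, 0.1, size=200),
])

labels = group_by_correlation(series)
print(labels)
```

Note that the grouping uses only how similarly the rows vary. The second row has a different scale and offset from the first and still lands in the same group, and nothing in the procedure refers to what causes what.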

Thanks in advance...

  • This question is related; yours isn't a duplicate, but you may find it interesting. If you're interested in a "physical" interpretation, maybe Physics SE would better suit the question. – J.G. Jan 23 '23 at 18:04
  • This is the fundamental question of causal inference, an entire field. – Andrew Jan 23 '23 at 18:09
  • @J.G. Thank you for the reference, sir. Actually, to be honest, I did go through the post you just referenced before I posted this. That post, along with two others, helped me get the intuition behind defining Pearson's correlation coefficient the way it was. But, based on those posts, I couldn't draw a conclusion regarding the physical interpretation or significance of "correlation." So, I asked this question specifically. Sure, I will consider Physics SE. – Usual_Learner Jan 23 '23 at 18:27
  • @AndrewZhang Sir, could you please elaborate a bit more on it? I mean, does this fundamental question lead to some conclusion? Thanks... – Usual_Learner Jan 23 '23 at 18:29
  • The question you are asking is the premise of an entire field, with lots of active research. There is no reasonable way for someone to answer your question on stackexchange. – Andrew Jan 23 '23 at 18:56
  • @AndrewZhang Oh! I didn't know that. Thanks for clarifying... – Usual_Learner Jan 23 '23 at 19:18
