Statistics

Correlation is not transitive, in general at least: A simulation approach

Let \(\rho_{XY}\) be the correlation between the stochastic variables \(X\) and \(Y\), and similarly for \(\rho_{XZ}\) and \(\rho_{YZ}\). If we know two of these, can we say anything about the third? In a recent blog post I dealt with the problem mathematically, using the concept of a partial correlation coefficient. Here I will take a simulation approach. First \(Z\) is simulated. Then \(X\) and \(Y\) are simulated based on \(Z\) in a regression context with a slope between \(-1\) and \(1\).
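The simulation described above could be sketched roughly as follows. This is a minimal illustration, not the post's own code: the function name `simulate_correlations`, the standard normal distributions, the unit noise variance, and the independence of the two noise terms are all assumptions made for the example.

```python
import numpy as np


def simulate_correlations(slope_x, slope_y, n=100_000, seed=0):
    """Simulate z, then x and y as linear functions of z plus noise.

    Assumptions (not taken from the original post): z and both noise
    terms are independent standard normal draws.
    """
    rng = np.random.default_rng(seed)
    z = rng.standard_normal(n)
    # Regression-style construction: each variable depends on z through a slope.
    x = slope_x * z + rng.standard_normal(n)
    y = slope_y * z + rng.standard_normal(n)
    # Rows of the input are variables, so corr is a 3x3 matrix ordered x, y, z.
    corr = np.corrcoef([x, y, z])
    return {"xy": corr[0, 1], "xz": corr[0, 2], "yz": corr[1, 2]}


if __name__ == "__main__":
    for slope_x, slope_y in [(0.8, 0.5), (-0.8, 0.5), (0.2, 0.2)]:
        r = simulate_correlations(slope_x, slope_y)
        print(slope_x, slope_y, {k: round(v, 3) for k, v in r.items()})
```

Under these particular assumptions the noise terms are independent, so the partial correlation of \(X\) and \(Y\) given \(Z\) is zero and the sample value of \(\rho_{XY}\) should land close to \(\rho_{XZ}\rho_{YZ}\). Letting the noise terms be correlated would be one way to push \(\rho_{XY}\) away from that product.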

Correlation is not transitive, in general at least

Update Aug 10, 2019: I wrote a new blog post on the same topic as this one, but using a simulation approach. Let \(\rho_{XY}\) be the correlation between the stochastic variables \(X\) and \(Y\), and similarly for \(\rho_{XZ}\) and \(\rho_{YZ}\). If we know two of these, can we say anything about the third? Yes, sometimes, but not always. Say we have \(\rho_{XZ}\) and \(\rho_{YZ}\) and they are both positive. Intuition would then make us believe that \(\rho_{XY}\) is probably also positive.
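To see why that intuition can fail, it may help to recall a standard bound that follows from the correlation matrix of \((X, Y, Z)\) being positive semidefinite. It is given here as a general fact and is not necessarily the derivation used in the post:

\[
\rho_{XZ}\rho_{YZ} - \sqrt{(1-\rho_{XZ}^2)(1-\rho_{YZ}^2)} \;\le\; \rho_{XY} \;\le\; \rho_{XZ}\rho_{YZ} + \sqrt{(1-\rho_{XZ}^2)(1-\rho_{YZ}^2)}
\]

For example, with \(\rho_{XZ} = \rho_{YZ} = 0.5\) the lower bound is \(0.25 - 0.75 = -0.5\), so \(\rho_{XY}\) can be negative even though both known correlations are positive. When both are positive, the lower bound is itself positive exactly when \(\rho_{XZ}^2 + \rho_{YZ}^2 > 1\), which is the case where a positive \(\rho_{XY}\) is guaranteed.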