
Part E Inter-relationships between variables ⏐ 10: Correlation and linear regression 267
Note, however, that if $r^2 = 0.81$, we would say that 81% of the variations in y can be explained by variations in x.
We do not necessarily conclude that 81% of variations in y are caused by the variations in x. We must beware of
reading too much significance into our statistical analysis.
2.6 Correlation and causation
If two variables are well correlated, either positively or negatively, this may be due to pure chance or there may be
a reason for it. The larger the number of pairs of data collected, the less likely it is that the correlation is due to
chance, though that possibility should never be ignored entirely.
If there is a reason, it may not be
causal. For example, monthly net income is well correlated with monthly credit to
a person's bank account, for the logical (rather than causal) reason that for most people the one equals the other.
Even if there is a causal explanation for a correlation, it does not follow that variations in the value of one variable
cause variations in the value of the other. For example, sales of ice cream and of sunglasses are well correlated,
not because of a direct causal link but because the weather influences both variables.
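The ice cream and sunglasses case can be illustrated with a small simulation. This is only a sketch on made-up data: the temperature range, the sales equations, the noise levels and the helper name pearson_r are all assumptions chosen for illustration, not figures from any real data set. Temperature drives both sales series, and neither series affects the other.

```python
import random

def pearson_r(xs, ys):
    """Ordinary correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5

rng = random.Random(0)  # fixed seed so the run is repeatable

# Hypothetical data: daily temperature influences both sales series.
temperature = [rng.uniform(10, 35) for _ in range(200)]
ice_noise = [rng.gauss(0, 5) for _ in range(200)]
sun_noise = [rng.gauss(0, 5) for _ in range(200)]
ice_cream = [20 + 3 * t + e for t, e in zip(temperature, ice_noise)]
sunglasses = [5 + 2 * t + e for t, e in zip(temperature, sun_noise)]

# Correlation between the two sales series (weather effect included).
r_sales = pearson_r(ice_cream, sunglasses)
# Correlation between the parts of each series not due to the weather.
r_residual = pearson_r(ice_noise, sun_noise)

print(round(r_sales, 2), round(r_residual, 2))
```

Under these assumptions the two sales series are strongly correlated, while the parts of each series that are unrelated to the weather show little or no correlation. This is consistent with the weather, rather than either product, being the common influence.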
3 Spearman's rank correlation coefficient
3.1 Coefficient of rank correlation
In the examples considered above, the data were given in terms of the values of the relevant variables, such as the
number of hours. Sometimes, however, they are given in terms of order or rank rather than actual values.
Spearman's rank correlation coefficient is used when data is given in terms of order or rank, rather than actual
values.
Coefficient of rank correlation,

$$R = 1 - \left[\frac{6\sum d^2}{n(n^2 - 1)}\right]$$

Where n = number of pairs of data
d = the difference between the rankings in each set of data.
The coefficient of rank correlation can be interpreted in exactly the same way as the ordinary correlation
coefficient. Its value can range from –1 to +1.
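The formula can be sketched in a few lines of Python. The function name rank_correlation and the four-item rankings are hypothetical, chosen only to show the two extreme values. The sketch assumes the data are already ranks and that there are no tied ranks.

```python
def rank_correlation(ranks_x, ranks_y):
    """Spearman's R = 1 - 6*sum(d^2) / (n*(n^2 - 1)) for two lists of ranks."""
    if len(ranks_x) != len(ranks_y):
        raise ValueError("both rankings must have the same number of items")
    n = len(ranks_x)
    if n < 2:
        raise ValueError("need at least two pairs of data")
    # d is the difference between the two rankings of each item.
    sum_d_squared = sum((x - y) ** 2 for x, y in zip(ranks_x, ranks_y))
    return 1 - (6 * sum_d_squared) / (n * (n ** 2 - 1))

# Identical rankings give perfect positive rank correlation.
same = rank_correlation([1, 2, 3, 4], [1, 2, 3, 4])
# Exactly reversed rankings give perfect negative rank correlation.
opposite = rank_correlation([1, 2, 3, 4], [4, 3, 2, 1])

print(same, opposite)  # 1.0 -1.0
```

The two calls show the ends of the range: when every d is zero, R = 1, and when one ranking is the exact reverse of the other, R = –1.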
3.2 Example: The rank correlation coefficient
The examination placings of seven students were as follows.
Student    Statistics placing    Economics placing
A          2                     1
B          1                     3
C          4                     7
D          6                     5
E          5                     6
F          3                     2
G          7                     4
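As a quick check of the arithmetic, the placings above can be put through the rank correlation formula. This is a sketch only: the variable names are illustrative, and the placings are copied from the table.

```python
# Placings from the table: (Statistics, Economics) for students A to G.
placings = {
    "A": (2, 1), "B": (1, 3), "C": (4, 7), "D": (6, 5),
    "E": (5, 6), "F": (3, 2), "G": (7, 4),
}

n = len(placings)  # number of pairs of data

# d^2 for each student, where d = Statistics placing - Economics placing.
d_squared = {s: (stat - econ) ** 2 for s, (stat, econ) in placings.items()}
sum_d_squared = sum(d_squared.values())

R = 1 - (6 * sum_d_squared) / (n * (n ** 2 - 1))

print(sum_d_squared)  # 26
print(round(R, 3))    # 0.536
```

With the sum of d² equal to 26 and n = 7, R = 1 – 156/336 ≈ 0.536, a positive but not especially strong rank correlation between the two sets of placings.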