REGRESSION LINE AND CORRELATION COEFFICIENT
THIRD TERM E-LEARNING NOTE
SUBJECT: FURTHER MATHEMATICS CLASS: SS 2
SCHEME OF WORK
WEEK TEN
TOPIC: REGRESSION LINE AND CORRELATION COEFFICIENT
SCATTER DIAGRAM
Definition: A scatter diagram is a graphical display of bivariate data, that is, data involving two variables.
TYPES OF SCATTER DIAGRAM:
Linear positive correlation.
A positive correlation between two variables x and y means that, in general, an increase in x is accompanied by an increase in y. The regression line has a positive slope.
Linear negative correlation
A negative correlation between x and y means that, in general, an increase in x is accompanied by a decrease in y. The regression line has a negative slope.
Zero Correlation:
There is no apparent association between x and y.
Non-linear Correlation:
Most of the points lie on or near a curve which is parabolic in shape. The parabolic curve is called a regression curve.
REGRESSION LINE OR LINE OF BEST FIT OR THE LEAST SQUARES LINE
Of the two variables, one is independent and the other is dependent. The regression line can be fitted either by eye on a scatter diagram or by the least squares method.
LEAST SQUARES METHOD: If x is the independent variable and y the dependent variable (the regression of y on x), the equation of the regression line is written as y = ax + b, where a is the slope and b is the y-intercept. Given n pairs of values of x and y, it can be shown that
$$a = \frac{n\sum xy - \sum x \sum y}{n\sum x^2 - \left(\sum x\right)^2}$$
$$b = \bar{y} - a\bar{x}$$
where
$$\bar{x} = \frac{\sum x}{n}, \qquad \bar{y} = \frac{\sum y}{n}$$
Example: Use the least squares method to fit a regression line of y on x for the following data.
X | 3 | 5 | 6 | 9 | 11 | 14 | 15 | 18 |
---|---|---|---|---|---|---|---|---|
Y | 2 | 3 | 5 | 7 | 10 | 12 | 13 | 17 |
Find the value of y when x = 8.
SOLUTION:
x | y | xy | x² |
---|---|---|---|
3 | 2 | 6 | 9 |
5 | 3 | 15 | 25 |
6 | 5 | 30 | 36 |
9 | 7 | 63 | 81 |
11 | 10 | 110 | 121 |
14 | 12 | 168 | 196 |
15 | 13 | 195 | 225 |
18 | 17 | 306 | 324 |
∑x = 81 | ∑y = 69 | ∑xy = 893 | ∑x² = 1017 |
$$a = \frac{n\sum xy - \sum x \sum y}{n\sum x^2 - \left(\sum x\right)^2} = \frac{8(893) - 81 \times 69}{8(1017) - (81)^2}$$
$$a = \frac{7144 - 5589}{8136 - 6561} = \frac{1555}{1575} = 0.9873$$
$$\bar{x} = \frac{\sum x}{n} = \frac{81}{8} = 10.125, \qquad \bar{y} = \frac{\sum y}{n} = \frac{69}{8} = 8.625$$
$$b = \bar{y} - a\bar{x} = 8.625 - 0.9873(10.125) = 8.625 - 9.996 = -1.37$$
The regression line of y on x is therefore
$$y = 0.9873x - 1.37$$
When x = 8,
$$y = 0.9873(8) - 1.37 = 6.5284 \approx 6.5$$
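The worked example above can be checked with a short Python sketch. It applies the least squares formulas for a and b directly; the function name `fit_regression_line` is illustrative only and is not part of the note.

```python
def fit_regression_line(xs, ys):
    """Return (a, b) for the least squares line y = a*x + b of y on x.

    Illustrative helper (hypothetical name), using
    a = (n*Sxy - Sx*Sy) / (n*Sxx - Sx^2) and b = mean(y) - a*mean(x).
    """
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("xs and ys must have the same length (at least 2)")
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    denominator = n * sum_x2 - sum_x ** 2
    if denominator == 0:
        raise ValueError("all x values are equal, so the slope is undefined")
    a = (n * sum_xy - sum_x * sum_y) / denominator
    b = sum_y / n - a * sum_x / n
    return a, b


# Data from the worked example.
xs = [3, 5, 6, 9, 11, 14, 15, 18]
ys = [2, 3, 5, 7, 10, 12, 13, 17]
a, b = fit_regression_line(xs, ys)
print(f"a = {a:.4f}, b = {b:.4f}")
print(f"estimate of y at x = 8: {a * 8 + b:.4f}")
```

Because the program does not round a and b before substituting x = 8, its estimate differs slightly in the later decimal places from the hand calculation, but it still rounds to 6.5.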
EVALUATION
Use the least squares method to fit a regression line of y on x for the following data.
X | 1 | 4 | 5 | 7 | 8 | 10 | 12 | 16 | 19 | 20 |
---|---|---|---|---|---|---|---|---|---|---|
Y | 2 | 3 | 4 | 5 | 7 | 8 | 10 | 15 | 20 | 18 |
Use the line obtained to find the value of y when x = 9.
CORRELATION COEFFICIENT
DEFINITION:
The correlation coefficient measures the degree of linear relationship between two variables. It is denoted by r.
The characteristics of r are as follows:
The value of r is the same whichever variable is labelled x and whichever is labelled y.
The value of r satisfies the inequality −1 ≤ r ≤ +1.
If r is close to +1, x and y are highly positively correlated. If r is close to −1, x and y are highly negatively correlated. If r is close to zero, the linear correlation between x and y is very low, and when r = 0 there is no linear correlation between them.
There are two methods of obtaining the correlation coefficient.
Pearson’s coefficient of correlation or product moment correlation coefficient
Rank correlation coefficient.
RANK CORRELATION COEFFICIENT: It is also known as Spearman’s rank correlation coefficient and is defined as
$$r_k = 1 - \frac{6\sum D^2}{n(n^2 - 1)}$$
where D = Rx − Ry is the difference between the ranks of each pair of values and n is the number of pairs.
As the name implies, the values of each variable (if not already ranked) are ranked in either ascending or descending order, using the same order for both variables. Where there are ties, the average of the tied positions is used as the rank; for example, two values tied for 3rd and 4th place are each given rank 3.5.
Example:
The table below gives the examination marks of 10 students in mathematics and history.
Maths | 51 | 25 | 33 | 55 | 65 | 38 | 35 | 53 | 61 | 44 |
---|---|---|---|---|---|---|---|---|---|---|
History | 20 | 65 | 25 | 36 | 51 | 50 | 77 | 31 | 60 | 5 |
(a) Calculate the rank correlation coefficient.
(b) Comment briefly on your result.
SOLUTION:
MATHS (x) | HISTORY (y) | Rx | Ry | D | D² |
---|---|---|---|---|---|
51 | 20 | 5 | 9 | -4 | 16 |
25 | 65 | 10 | 2 | 8 | 64 |
33 | 25 | 9 | 8 | 1 | 1 |
55 | 36 | 3 | 6 | -3 | 9 |
65 | 51 | 1 | 4 | -3 | 9 |
38 | 50 | 7 | 5 | 2 | 4 |
35 | 77 | 8 | 1 | 7 | 49 |
53 | 31 | 4 | 7 | -3 | 9 |
61 | 60 | 2 | 3 | -1 | 1 |
44 | 5 | 6 | 10 | -4 | 16 |
$$\sum D^2 = 178$$
$$r_k = 1 - \frac{6\sum D^2}{n(n^2 - 1)} = 1 - \frac{6 \times 178}{10(10^2 - 1)} = 1 - \frac{1068}{990}$$
$$r_k = 1 - 1.079 = -0.079$$
There is a very low negative correlation between the marks obtained in mathematics and history.
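The ranking and the formula used above can be sketched in Python as follows. The helper names `average_ranks` and `spearman_rank` are illustrative and not part of the note; the sketch ranks in descending order (highest mark gets rank 1) and gives tied values the average of their positions, as described earlier.

```python
def average_ranks(values):
    """Rank values in descending order (largest = rank 1).

    Tied values share the average of the positions they occupy.
    Illustrative helper (hypothetical name).
    """
    order = sorted(range(len(values)), key=lambda i: values[i], reverse=True)
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the group while the next value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        shared = (pos + 1 + end + 1) / 2  # average of positions pos+1 .. end+1
        for k in range(pos, end + 1):
            ranks[order[k]] = shared
        pos = end + 1
    return ranks


def spearman_rank(xs, ys):
    """Return r_k = 1 - 6*sum(D^2) / (n*(n^2 - 1)), where D = Rx - Ry."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("xs and ys must have the same length (at least 2)")
    rx = average_ranks(xs)
    ry = average_ranks(ys)
    sum_d2 = sum((p - q) ** 2 for p, q in zip(rx, ry))
    return 1 - 6 * sum_d2 / (n * (n * n - 1))


# Marks from the worked example.
maths = [51, 25, 33, 55, 65, 38, 35, 53, 61, 44]
history = [20, 65, 25, 36, 51, 50, 77, 31, 60, 5]
print(round(spearman_rank(maths, history), 3))

# Ties: the two 43s occupy positions 3 and 4, so each gets rank 3.5.
print(average_ranks([50, 45, 43, 43]))
```

For the marks in the example the sketch reproduces ∑D² = 178 and the same value of the coefficient as the hand calculation.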
EVALUATION:
The table below shows the marks obtained by ten students in both theory (x) and practical (y) examinations.
X | 50 | 70 | 85 | 35 | 60 | 65 | 75 | 40 | 45 | 80 |
---|---|---|---|---|---|---|---|---|---|---|
Y | 45 | 55 | 75 | 40 | 50 | 60 | 70 | 35 | 30 | 65 |
Calculate the rank correlation coefficient between x and y and comment on your result.
PEARSON’S CORRELATION COEFFICIENT: Its full name is Pearson’s product moment correlation coefficient. It is simple to calculate and does not treat either variable as independent or dependent. It is obtained using the formula below.
$$r = \frac{n\sum xy - \sum x \sum y}{\sqrt{\left[n\sum x^2 - \left(\sum x\right)^2\right]\left[n\sum y^2 - \left(\sum y\right)^2\right]}}$$
Example:
Calculate the product moment correlation coefficient for the following data.
X | 2 | 4 | 7 | 9 | 11 |
---|---|---|---|---|---|
Y | 1 | 2 | 3 | 7 | 9 |
Comment on your result.
SOLUTION:
x | y | xy | x² | y² |
---|---|---|---|---|
2 | 1 | 2 | 4 | 1 |
4 | 2 | 8 | 16 | 4 |
7 | 3 | 21 | 49 | 9 |
9 | 7 | 63 | 81 | 49 |
11 | 9 | 99 | 121 | 81 |
∑x = 33 | ∑y = 22 | ∑xy = 193 | ∑x² = 271 | ∑y² = 144 |
$$r = \frac{5 \times 193 - 33 \times 22}{\sqrt{[5(271) - (33)^2][5(144) - (22)^2]}}$$
$$r = \frac{965 - 726}{\sqrt{266 \times 236}} = \frac{239}{250.55}$$
$$r = 0.9539 \approx 0.95 \text{ (correct to 2 s.f.)}$$
Comment: x and y are highly positively correlated.
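As a check on the arithmetic above, here is a minimal Python sketch of the product moment formula. The function name `pearson_r` is illustrative and not part of the note.

```python
import math


def pearson_r(xs, ys):
    """Return the product moment correlation coefficient r.

    Illustrative helper (hypothetical name), using
    r = (n*Sxy - Sx*Sy) / sqrt((n*Sxx - Sx^2) * (n*Syy - Sy^2)).
    """
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("xs and ys must have the same length (at least 2)")
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    if denominator == 0:
        raise ValueError("r is undefined when all x or all y values are equal")
    return numerator / denominator


# Data from the worked example.
xs = [2, 4, 7, 9, 11]
ys = [1, 2, 3, 7, 9]
print(round(pearson_r(xs, ys), 4))

# Swapping the roles of x and y leaves r unchanged.
print(round(pearson_r(ys, xs), 4))
```

The second call illustrates the property stated earlier: the value of r does not depend on which variable is labelled x.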
EVALUATION: The following data are the marks obtained by five students in statistics (X) and mathematics(Y). Calculate the product moment correlation coefficient and comment on your result.
X | 33 | 36 | 42 | 52 | 40 |
---|---|---|---|---|---|
Y | 42 | 46 | 38 | 62 | 52 |
GENERAL EVALUATION/REVISIONAL QUESTIONS
If cos A = 24/25 and sin B = 3/5, where A is acute and B is obtuse, find, without using tables, the values of (a) sin 2A (b) cos 2B (c) sin (A − B)
Use the addition formulae to find the values of the following:
(a) sin 75° (b) cos 75° (c) tan 45°
Calculate the product moment correlation coefficient and Spearman’s rank correlation coefficient for the data below.
X | 50 | 45 | 43 | 30 | 30 | 43 | 23 | 43 | 25 |
---|---|---|---|---|---|---|---|---|---|
Y | 12 | 13.5 | 14 | 11 | 12 | 15 | 13.5 | 12 | 14 |
READING ASSIGNMENT: Read Correlation and Regression, pages 313–320, Further Mathematics Project 2.
WEEKEND ASSIGNMENT
Use the table below to answer questions 1 and 2.
Height | 160 | 161 | 162 | 163 | 164 | 165 |
---|---|---|---|---|---|---|
No of students | 4 | 6 | 3 | 7 | 8 | 2 |
1. The mean of the distribution is
(a) 4875.1 cm (b) 4001.2 cm (c) 3571.0 cm (d) 162.2 cm (e) 129.2 cm
2. The median of the distribution is
(a) 160 (b) 162 (c) 163 (d) 164 (e) 165
3. Calculate the standard deviation of 3, 4, 5, 6, 7, 8, 9
(a) 2 (b) 2.4 (c) 3.6 (d) 4.0 (e) 4.2
4. Calculate the mean deviation of 6, 8, 4, 0, 4
(a) 4.0 (b) 3.6 (c) 3.0 (d) 2.8 (e) 2.1
5. The table below shows the ranks Rx and Ry of the marks scored by 10 candidates in an oral test and a written test respectively. Calculate Spearman’s rank correlation coefficient for the data.
Rx | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 |
---|---|---|---|---|---|---|---|---|---|---|
Ry | 2 | 3 | 4 | 1 | 6 | 5 | 8 | 7 | 10 | 9 |
(a) 51/55 (b) 6/55 (c) 49/55 (d) 54/55 (e) 61/55
THEORY
1. The distribution of marks scored in mathematics and statistics by ten students is given in the table below:
Maths (x) | 11 | 20 | 23 | 42 | 48 | 50 | 57 | 64 | 80 | 90 |
---|---|---|---|---|---|---|---|---|---|---|
Stat (y) | 26 | 23 | 35 | 46 | 44 | 50 | 50 | 58 | 68 | 70 |
(a) Plot a scatter diagram for the distribution.
(b) Draw an eye-fitted line of best fit.
(c) Use your line to estimate a student's mark in statistics if his mark in maths is 40.
2. The table below gives the marks obtained by members of a class in maths and physics examinations.
STUDENTS | A | B | C | D | E | F | G | H | I | J |
---|---|---|---|---|---|---|---|---|---|---|
Maths | 85 | 75 | 59 | 43 | 74 | 69 | 62 | 80 | 54 | 63 |
Physics | 92 | 72 | 62 | 48 | 85 | 73 | 46 | 74 | 58 | 50 |
(a) Calculate the product moment correlation coefficient.
(b) Comment on your result.