REGRESSION LINE AND CORRELATION COEFFICIENT

THIRD TERM E-LEARNING NOTE

SUBJECT: FURTHER MATHEMATICS CLASS: SS 2

SCHEME OF WORK

 

 

WEEK TEN

TOPIC: REGRESSION LINE AND CORRELATION COEFFICIENT

SCATTER DIAGRAM

Definition: A scatter diagram is a graphical display of bivariate data, that is, data in which each observation is a pair of values of two variables x and y.

TYPES OF SCATTER DIAGRAM:

Linear positive correlation.

A positive correlation between two variables x and y means that, in general, an increase in x is accompanied by an increase in y. The regression line has a positive slope.


Linear negative correlation

A negative correlation between x and y means that, in general, an increase in x is accompanied by a decrease in y. The regression line has a negative slope.


Zero Correlation:

There is no apparent association between x and y.


Non Linear Correlation:

Most of the points lie on or near a curve, for example a parabola, rather than a straight line. Such a curve is called a regression curve.


REGRESSION LINE OR LINE OF BEST FIT OR THE LEAST SQUARES LINE

Of the two variables, one is the independent variable and the other is the dependent variable. The regression line can be fitted either by eye on a scatter diagram or by the least squares method.

LEAST SQUARES METHOD: If x is the independent variable and y is the dependent variable (the regression of y on x), the equation of the regression line is written as y = ax + b,

where a is the slope and b is the y-intercept. Given n pairs of values of x and y, it can be shown that

\[ a = \frac{n\sum xy - \sum x \sum y}{n\sum x^2 - (\sum x)^2} \]

\[ b = \bar{y} - a\bar{x} \]

where

\[ \bar{x} = \frac{\sum x}{n}, \qquad \bar{y} = \frac{\sum y}{n} \]
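The formulas for a and b can be justified with a short derivation. The sketch below assumes the usual least squares criterion: choose a and b so that the sum of the squared vertical deviations of the points from the line is as small as possible. The symbol S for that sum is introduced here for illustration and is not part of the note.

```latex
\begin{align*}
S &= \sum (y - ax - b)^2 \\
\frac{\partial S}{\partial a} = 0 &\;\Rightarrow\; \sum xy = a\sum x^2 + b\sum x \\
\frac{\partial S}{\partial b} = 0 &\;\Rightarrow\; \sum y = a\sum x + nb \\
\text{From the last equation:}\quad b &= \frac{\sum y}{n} - a\,\frac{\sum x}{n} = \bar{y} - a\bar{x} \\
\text{Substituting for } b:\quad n\sum xy &= na\sum x^2 + \sum x\sum y - a\Bigl(\sum x\Bigr)^2 \\
a &= \frac{n\sum xy - \sum x\sum y}{n\sum x^2 - (\sum x)^2}
\end{align*}
```

The two middle equations are often called the normal equations. Solving them simultaneously gives the same a and b as the formulas above.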

Example: Use the least squares method to fit a regression line of y on x for the following data.

X 3 5 6 9 11 14 15 18
Y 2 3 5 7 10 12 13 17

Find the value of y when x = 8.

SOLUTION:

x     y     xy     x²
3     2     6      9
5     3     15     25
6     5     30     36
9     7     63     81
11    10    110    121
14    12    168    196
15    13    195    225
18    17    306    324

∑x = 81, ∑y = 69, ∑xy = 893, ∑x² = 1017

\[ a = \frac{n\sum xy - \sum x \sum y}{n\sum x^2 - (\sum x)^2} = \frac{8(893) - 81 \times 69}{8(1017) - (81)^2} \]

\[ a = \frac{7144 - 5589}{8136 - 6561} = \frac{1555}{1575} \]

a = 0.9873

\[ \bar{x} = \frac{\sum x}{n} = \frac{81}{8} = 10.125 \]

\[ \bar{y} = \frac{\sum y}{n} = \frac{69}{8} = 8.625 \]

\[ b = \bar{y} - a\bar{x} \]

b = 8.625 − 0.9873(10.125)

= 8.625 − 9.996

b = −1.37

y = ax + b

y = 0.9873x − 1.37 (regression line of y on x)

When x = 8,

y = 0.9873(8) − 1.37

y = 6.5284 ≈ 6.5
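The working above can be checked with a short program. This is a minimal sketch in Python, not part of the original note; the function name fit_line and the variable names are chosen here for illustration.

```python
def fit_line(xs, ys):
    """Return slope a and intercept b of the least squares line y = ax + b."""
    n = len(xs)
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    # Slope: (n*sum(xy) - sum(x)*sum(y)) / (n*sum(x^2) - (sum(x))^2)
    a = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
    # Intercept: mean of y minus a times mean of x
    b = sum_y / n - a * sum_x / n
    return a, b

# Data from the worked example
xs = [3, 5, 6, 9, 11, 14, 15, 18]
ys = [2, 3, 5, 7, 10, 12, 13, 17]

a, b = fit_line(xs, ys)
y_at_8 = a * 8 + b
print(round(a, 4), round(b, 2), round(y_at_8, 1))  # prints 0.9873 -1.37 6.5
```

Because the program keeps full precision for a and b, the unrounded estimate at x = 8 is about 6.527 rather than 6.5284. Both round to 6.5.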

EVALUATION

Use the least squares method to fit a regression line of y on x for the following data.

X 1 4 5 7 8 10 12 16 19 20
Y 2 3 4 5 7 8 10 15 20 18

Use the line obtained to find the value of y when x = 9

CORRELATION COEFFICIENT

DEFINITION:

The correlation coefficient measures the degree of linear relationship between two variables. It is denoted by r.

The characteristics of r are as follows:

The value of r is the same whichever variable is labelled x and whichever is labelled y.

The value of r satisfies the inequality −1 ≤ r ≤ +1.

If r is close to +1, x and y are highly positively correlated. If r is close to −1, x and y are highly negatively correlated. If r is close to zero, the correlation between x and y is very low. When r = 0, there is no linear correlation between x and y.

There are two methods of obtaining the correlation coefficient.

Pearson’s coefficient of correlation or product moment correlation coefficient

Rank correlation coefficient.

RANK CORRELATION COEFFICIENT: It is also known as Spearman’s rank correlation coefficient and is defined as

\[ r_k = 1 - \frac{6\sum D^2}{n(n^2 - 1)} \]

where D = Rx − Ry is the difference between the ranks of each pair of values and n is the number of pairs.

As the name implies, the values of each variable (if not already ranked) are ranked in ascending or descending order, using the same order for both variables. Where there are ties, each tied value receives the average of the ranks the tied values would otherwise occupy.
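The rule for ties is easiest to see with numbers. The sketch below is in Python and is not part of the original note. The function name average_ranks and the marks are hypothetical, chosen only to illustrate the rule.

```python
def average_ranks(values, descending=True):
    """Rank values (1 = first), giving tied values the average of their ranks."""
    ordered = sorted(values, reverse=descending)
    ranks = {}
    for value in set(values):
        first = ordered.index(value) + 1        # first position held by this value
        count = ordered.count(value)            # number of times the value occurs
        ranks[value] = first + (count - 1) / 2  # average of positions first..first+count-1
    return [ranks[v] for v in values]

# Hypothetical marks: two students tie on 70 and three tie on 55
marks = [70, 55, 70, 40, 55, 55]
print(average_ranks(marks))  # prints [1.5, 4.0, 1.5, 6.0, 4.0, 4.0]
```

The two marks of 70 would occupy ranks 1 and 2, so each receives 1.5. The three marks of 55 would occupy ranks 3, 4 and 5, so each receives 4.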

Example:

The table below gives the examination marks of 10 students in mathematics and history.

Maths 51 25 33 55 65 38 35 53 61 44
History 20 65 25 36 51 50 77 31 60 5

(a) Calculate the rank correlation coefficient.

(b) Comment briefly on your result.

SOLUTION:

Maths (x)   History (y)   Rx    Ry    D     D²
51          20            5     9     −4    16
25          65            10    2     8     64
33          25            9     8     1     1
55          36            3     6     −3    9
65          51            1     4     −3    9
38          50            7     5     2     4
35          77            8     1     7     49
53          31            4     7     −3    9
61          60            2     3     −1    1
44          5             6     10    −4    16

∑D² = 178

\[ r_k = 1 - \frac{6\sum D^2}{n(n^2 - 1)} = 1 - \frac{6 \times 178}{10(10^2 - 1)} = 1 - \frac{1068}{990} \]

= 1 − 1.0788 = −0.0788 ≈ −0.08

There is a very low negative correlation between the marks obtained in mathematics and history.
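The ranking and the value of the coefficient can be checked with a short program. This is a minimal sketch in Python, not part of the original note. The function name spearman_rank is chosen here for illustration, and the simple ranking it uses assumes there are no tied marks, which is true for this data.

```python
def spearman_rank(xs, ys):
    """Return (sum of D^2, rk) where rk = 1 - 6*sum(D^2) / (n(n^2 - 1)).

    Assumes no tied values within xs or within ys.
    """
    def ranks(values):
        ordered = sorted(values, reverse=True)   # rank 1 = highest mark
        return [ordered.index(v) + 1 for v in values]

    rx, ry = ranks(xs), ranks(ys)
    n = len(xs)
    sum_d2 = sum((p - q) ** 2 for p, q in zip(rx, ry))
    return sum_d2, 1 - 6 * sum_d2 / (n * (n ** 2 - 1))

# Data from the worked example
maths = [51, 25, 33, 55, 65, 38, 35, 53, 61, 44]
history = [20, 65, 25, 36, 51, 50, 77, 31, 60, 5]

sum_d2, rk = spearman_rank(maths, history)
print(sum_d2, round(rk, 4))  # prints 178 -0.0788
```

Since 1068/990 is about 1.0788, the coefficient is about −0.08, which is close to zero.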

EVALUATION:

The table below shows the marks obtained by ten students in theory (x) and practical (y) examinations.

X 50 70 85 35 60 65 75 40 45 80
Y 45 55 75 40 50 60 70 35 30 65

Calculate the rank correlation coefficient between x and y, and comment on your result.

PEARSON’S CORRELATION COEFFICIENT: Its full name is Pearson’s product moment correlation coefficient. It is simple to calculate and does not treat either variable as independent or dependent. It is obtained using the formula below.

\[ r = \frac{n\sum xy - \sum x \sum y}{\sqrt{[n\sum x^2 - (\sum x)^2][n\sum y^2 - (\sum y)^2]}} \]

Example:

Calculate the product moment correlation coefficient for the following data

X 2 4 7 9 11
Y 1 2 3 7 9

Comment on your result.

SOLUTION:

x     y     xy     x²     y²
2     1     2      4      1
4     2     8      16     4
7     3     21     49     9
9     7     63     81     49
11    9     99     121    81

∑x = 33, ∑y = 22, ∑xy = 193, ∑x² = 271, ∑y² = 144

\[ r = \frac{5 \times 193 - 33 \times 22}{\sqrt{[5(271) - (33)^2][5(144) - (22)^2]}} \]

\[ r = \frac{965 - 726}{\sqrt{266 \times 236}} = \frac{239}{250.55} \]

r = 0.9539, so r ≈ 0.95 (correct to 2 significant figures).

Comment: x and y are highly positively correlated.
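The calculation can be checked with a short program, which also illustrates two of the characteristics of r listed earlier. This is a minimal sketch in Python, not part of the original note; the function name pearson_r is chosen here for illustration.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Product moment correlation coefficient from the raw sums."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    return numerator / denominator

# Data from the worked example
xs = [2, 4, 7, 9, 11]
ys = [1, 2, 3, 7, 9]

r = pearson_r(xs, ys)
print(round(r, 4))           # prints 0.9539
print(pearson_r(ys, xs) == r)  # prints True: swapping x and y leaves r unchanged
print(-1 <= r <= 1)          # prints True
```

Swapping the roles of x and y only swaps the two brackets under the square root, so the value of r is unchanged.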

EVALUATION: The following data are the marks obtained by five students in statistics (X) and mathematics (Y). Calculate the product moment correlation coefficient and comment on your result.

X 33 36 42 52 40
Y 42 46 38 62 52

GENERAL EVALUATION/REVISIONAL QUESTIONS

1. If cos A = 24/25 and sin B = 3/5, where A is acute and B is obtuse, find, without using tables, the values of (a) sin 2A (b) cos 2B (c) sin (A − B).

2. Use the addition formulae to find the values of the following:

(a) sin 75° (b) cos 75° (c) tan 45°

3. Calculate the product moment correlation coefficient and the Spearman’s rank correlation coefficient for the following data.

X 50 45 43 30 30 43 23 43 25
Y 12 13.5 14 11 12 15 13.5 12 14

READING ASSIGNMENT: Read Correlation and Regression, pages 313–320, Further Mathematics Project 2.

WEEKEND ASSIGNMENT

Use the table below to answer questions 1 and 2.

Height 160 161 162 163 164 165
No of students 4 6 3 7 8 2

1. The mean of the distribution is

(a) 4875.1 cm (b) 4001.2 cm (c) 3571.0 cm (d) 162.2 cm (e) 129.2 cm

2. The median of the distribution is

(a) 160 (b) 162 (c) 163 (d) 164 (e) 165

3. Calculate the standard deviation of 3, 4, 5, 6, 7, 8, 9

(a) 2 (b) 2.4 (c) 3.6 (d) 4.0 (e) 4.2

4. Calculate the mean deviation of 6, 8, 4, 0, 4

(a) 4.0 (b) 3.6 (c) 3.0 (d) 2.8 (e) 2.1

5. The table below shows the ranks Rx and Ry of the marks scored by 10 candidates in oral and written tests respectively. Calculate the Spearman’s rank correlation coefficient of the data.

Rx 1 2 3 4 5 6 7 8 9 10
Ry 2 3 4 1 6 5 8 7 10 9

(a) 51/55 (b) 6/55 (c) 49/55 (d) 54/55 (e) 61/55

THEORY

1 The distribution of marks scored in statistics and mathematics by ten students is given in the table below:

Maths (x) 11 20 23 42 48 50 57 64 80 90
Stat(y) 26 23 35 46 44 50 50 58 68 70

(a) Plot a scatter diagram for the distribution.

(b) Draw an eye-fitted line of best fit.

(c) Use your line to estimate a student’s mark in statistics if his mark in maths is 40.

2. The table below gives the marks obtained by members of a class in maths and physics examinations.

STUDENTS A B C D E F G H I J
Maths 85 75 59 43 74 69 62 80 54 63
Physics 92 72 62 48 85 73 46 74 58 50

(a) Calculate the product moment correlation coefficient.

(b) Comment on your result.
