REGRESSION LINE AND CORRELATION COEFFICIENT

THIRD TERM E-LEARNING NOTE

SUBJECT: FURTHER MATHEMATICS CLASS: SS 2

SCHEME OF WORK

 

 

WEEK TEN

TOPIC: REGRESSION LINE AND CORRELATION COEFFICIENT

SCATTER DIAGRAM

Definition: A scatter diagram is a graphical display of bivariate data, that is, data in which each observation is a pair of values of two variables x and y.

TYPES OF SCATTER DIAGRAM:

Linear positive correlation.

A positive correlation between two variables x and y means that, in general, an increase in x is accompanied by an increase in y. The regression line has a positive slope.


Linear negative correlation

A negative correlation between x and y means that, in general, an increase in x is accompanied by a decrease in y. The regression line has a negative slope.


Zero Correlation:

There is no apparent association between x and y.


Non Linear Correlation:

Most of the points lie on or near a curve which is parabolic in shape. The parabolic curve is called a regression curve.


REGRESSION LINE OR LINE OF BEST FIT OR THE LEAST SQUARES LINE

Of the two variables, one is independent and the other is dependent. The regression line can be fitted either by eye on a scatter diagram or by the least squares method.

LEAST SQUARES METHOD: If x is the independent variable and y the dependent variable (the regression of y on x), the equation of the regression line is written as y = ax + b,

where a is the slope and b is the y-intercept. Given n pairs of values of x and y, it can be deduced that

$$a = \frac{n\sum xy - \sum x \sum y}{n\sum x^2 - \left(\sum x\right)^2}$$

$$b = \bar{y} - a\bar{x}$$

where

$$\bar{x} = \frac{\sum x}{n}, \qquad \bar{y} = \frac{\sum y}{n}$$
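The deduction referred to here can be sketched as follows, assuming the usual least squares criterion: a and b are chosen to make the sum of the squared vertical deviations of the points from the line as small as possible. The symbol S for that sum is notation introduced only for this sketch.

```latex
% S is assumed notation for the sum of squared vertical deviations.
\begin{align*}
S &= \sum (y - ax - b)^2 \\[4pt]
\frac{\partial S}{\partial a} = 0 \;&\Rightarrow\; \sum xy = a\sum x^2 + b\sum x \\
\frac{\partial S}{\partial b} = 0 \;&\Rightarrow\; \sum y = a\sum x + nb \\[4pt]
\text{Second equation:}\quad b &= \frac{\sum y}{n} - a\,\frac{\sum x}{n} = \bar{y} - a\bar{x} \\
\text{Substituting for } b\text{:}\quad n\sum xy &= a\,n\sum x^2 + \sum x\sum y - a\left(\sum x\right)^2 \\
a &= \frac{n\sum xy - \sum x\sum y}{n\sum x^2 - \left(\sum x\right)^2}
\end{align*}
```

The two middle equations are often called the normal equations; solving them simultaneously gives the slope a and the intercept b.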

Example: Use the least squares method to fit a regression line of y on x for the following data.

x | 3 | 5 | 6 | 9 | 11 | 14 | 15 | 18
y | 2 | 3 | 5 | 7 | 10 | 12 | 13 | 17

Find the value of y when x = 8.

SOLUTION:

x | y | xy | x²
3 | 2 | 6 | 9
5 | 3 | 15 | 25
6 | 5 | 30 | 36
9 | 7 | 63 | 81
11 | 10 | 110 | 121
14 | 12 | 168 | 196
15 | 13 | 195 | 225
18 | 17 | 306 | 324
∑x = 81 | ∑y = 69 | ∑xy = 893 | ∑x² = 1017

$$a = \frac{n\sum xy - \sum x\sum y}{n\sum x^2 - \left(\sum x\right)^2} = \frac{8(893) - 81 \times 69}{8(1017) - (81)^2}$$

$$a = \frac{7144 - 5589}{8136 - 6561} = \frac{1555}{1575}$$

a = 0.9873

$$\bar{x} = \frac{\sum x}{n} = \frac{81}{8} = 10.125$$

$$\bar{y} = \frac{\sum y}{n} = \frac{69}{8} = 8.625$$

$$b = \bar{y} - a\bar{x}$$

b = 8.625 - 0.9873(10.125)

= 8.625 - 9.996

b = -1.37

y = ax + b

y = 0.9873x - 1.37 (regression line of y on x)

When x = 8,

y = 0.9873(8) - 1.37

y = 6.5284 ≈ 6.5
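As a check on the arithmetic, the least squares formulas can be sketched in a few lines of Python. This is only an illustration: the function name fit_regression_line is made up for this note, and the data are those of the worked example.

```python
def fit_regression_line(xs, ys):
    """Return (a, b) for the least squares line y = a*x + b (y on x)."""
    n = len(xs)
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    # Slope: a = (n*sum(xy) - sum(x)*sum(y)) / (n*sum(x^2) - (sum(x))^2)
    a = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
    # Intercept: b = mean(y) - a * mean(x)
    b = sum_y / n - a * (sum_x / n)
    return a, b


# Data from the worked example
xs = [3, 5, 6, 9, 11, 14, 15, 18]
ys = [2, 3, 5, 7, 10, 12, 13, 17]

a, b = fit_regression_line(xs, ys)
y_at_8 = a * 8 + b
print(round(a, 4), round(b, 2), round(y_at_8, 1))  # 0.9873 -1.37 6.5
```

Because the program does not round a before using it, its unrounded estimate at x = 8 differs slightly in the third decimal place from the hand calculation, but both give 6.5 to one decimal place.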

EVALUATION

Use the least squares method to fit a regression line of y on x for the following data.

x | 1 | 4 | 5 | 7 | 8 | 10 | 12 | 16 | 19 | 20
y | 2 | 3 | 4 | 5 | 7 | 8 | 10 | 15 | 20 | 18

Use the line obtained to find the value of y when x = 9

CORRELATION COEFFICIENT

DEFINITION:

The correlation coefficient measures the degree of linear relationship between two variables. It is represented by r.

The characteristics of r are as follows:

1. The value of r is the same irrespective of which variable is labelled x and which is labelled y.

2. The value of r satisfies the inequality -1 ≤ r ≤ +1.

3. If r is close to +1, the variables are highly positively correlated. If r is close to -1, x and y are highly negatively correlated. If r is close to zero, the correlation between x and y is very low. There is no linear correlation between x and y when r = 0.

There are two methods of obtaining the correlation coefficient:

1. Pearson’s coefficient of correlation, or product moment correlation coefficient

2. Rank correlation coefficient

RANK CORRELATION COEFFICIENT: It is also known as Spearman’s rank correlation coefficient and is defined as:

$$r_k = 1 - \frac{6\sum D^2}{n(n^2 - 1)}$$

where D = Rx - Ry is the difference between the ranks of each pair of values and n is the number of pairs.

As the name implies, the values of each variable (if not already ranked) are ranked in ascending or descending order, using the same order for both variables. Where there are ties, each tied value is given the average of the ranks the tied values would otherwise occupy.
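The rule for ties can be sketched in Python as below. The function name average_ranks and the four marks used to show a tie are hypothetical, chosen only for illustration; ranking is in descending order, so the highest value gets rank 1.

```python
def average_ranks(values, descending=True):
    """Rank values from 1 to n; tied values share the mean of their ranks."""
    # Indices of the values, ordered from first rank to last rank
    order = sorted(range(len(values)), key=lambda i: values[i],
                   reverse=descending)
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the run while the next value ties with the current one
        while (end + 1 < len(order)
               and values[order[end + 1]] == values[order[pos]]):
            end += 1
        # Positions pos..end hold ranks pos+1..end+1; take their average
        mean_rank = (pos + 1 + end + 1) / 2
        for k in range(pos, end + 1):
            ranks[order[k]] = mean_rank
        pos = end + 1
    return ranks


# Hypothetical marks: the two 70s would take ranks 1 and 2, so each gets 1.5
marks = [70, 55, 70, 40]
print(average_ranks(marks))  # [1.5, 3.0, 1.5, 4.0]
```

When there are no ties the function simply returns the ordinary ranks 1, 2, ..., n.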

Example:

The table below gives the examination marks of 10 students in mathematics and history.

Maths | 51 | 25 | 33 | 55 | 65 | 38 | 35 | 53 | 61 | 44
History | 20 | 65 | 25 | 36 | 51 | 50 | 77 | 31 | 60 | 5

(a) Calculate the rank correlation coefficient.

(b) Comment briefly on your result.

SOLUTION:

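Maths (x) | History (y) | Rx | Ry | D | D²
51 | 20 | 5 | 9 | -4 | 16
25 | 65 | 10 | 2 | 8 | 64
33 | 25 | 9 | 8 | 1 | 1
55 | 36 | 3 | 6 | -3 | 9
65 | 51 | 1 | 4 | -3 | 9
38 | 50 | 7 | 5 | 2 | 4
35 | 77 | 8 | 1 | 7 | 49
53 | 31 | 4 | 7 | -3 | 9
61 | 60 | 2 | 3 | -1 | 1
44 | 5 | 6 | 10 | -4 | 16

∑D² = 178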

$$r_k = 1 - \frac{6\sum D^2}{n(n^2 - 1)} = 1 - \frac{6 \times 178}{10(10^2 - 1)} = 1 - \frac{1068}{990}$$

$$r_k = 1 - 1.0788 = -0.0788 \approx -0.08$$

There is a very low negative correlation between the marks obtained in mathematics and history.
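The calculation can be checked with a short Python sketch that applies the formula directly to the ranks Rx and Ry in the solution table. The function name spearman_from_ranks is hypothetical, used only for this illustration.

```python
def spearman_from_ranks(rank_x, rank_y):
    """Spearman's coefficient r_k = 1 - 6*sum(D^2) / (n*(n^2 - 1))."""
    n = len(rank_x)
    # D = Rx - Ry for each pair; sum the squares
    sum_d2 = sum((rx - ry) ** 2 for rx, ry in zip(rank_x, rank_y))
    return 1 - 6 * sum_d2 / (n * (n ** 2 - 1))


# Ranks from the solution table (maths, history)
rx = [5, 10, 9, 3, 1, 7, 8, 4, 2, 6]
ry = [9, 2, 8, 6, 4, 5, 1, 7, 3, 10]

r_k = spearman_from_ranks(rx, ry)
print(round(r_k, 4))  # -0.0788
```

The value is only slightly below zero, which agrees with the comment that the negative correlation is very low.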

EVALUATION:

The table below shows the marks obtained by ten students in both theory (x) and practical (y) examinations.

x | 50 | 70 | 85 | 35 | 60 | 65 | 75 | 40 | 45 | 80
y | 45 | 55 | 75 | 40 | 50 | 60 | 70 | 35 | 30 | 65

Calculate the rank correlation coefficient between x and y, and comment on your result.

PEARSON’S CORRELATION COEFFICIENT: Its full name is Pearson’s product moment correlation coefficient. It is simple to calculate and does not treat either variable as independent or dependent. It is obtained using the formula below.

$$r = \frac{n\sum xy - \sum x\sum y}{\sqrt{\left[n\sum x^2 - \left(\sum x\right)^2\right]\left[n\sum y^2 - \left(\sum y\right)^2\right]}}$$

Example:

Calculate the product moment correlation coefficient for the following data

x | 2 | 4 | 7 | 9 | 11
y | 1 | 2 | 3 | 7 | 9

Comment on your result.

SOLUTION:

x | y | xy | x² | y²
2 | 1 | 2 | 4 | 1
4 | 2 | 8 | 16 | 4
7 | 3 | 21 | 49 | 9
9 | 7 | 63 | 81 | 49
11 | 9 | 99 | 121 | 81
∑x = 33 | ∑y = 22 | ∑xy = 193 | ∑x² = 271 | ∑y² = 144

$$r = \frac{5 \times 193 - 33 \times 22}{\sqrt{\left[5(271) - (33)^2\right]\left[5(144) - (22)^2\right]}}$$

$$r = \frac{965 - 726}{\sqrt{266 \times 236}} = \frac{239}{250.55}$$

r = 0.9539, so r = 0.95 (approximately, to 2 s.f.)

Comment: There is a high positive correlation between x and y.
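A short Python sketch of the product moment formula, using the data of this example, can serve as a check. The function name pearson_r is hypothetical; only the standard library is used.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Product moment correlation coefficient from the raw sums."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    return numerator / denominator


# Data from the worked example
xs = [2, 4, 7, 9, 11]
ys = [1, 2, 3, 7, 9]

r = pearson_r(xs, ys)
print(round(r, 4))  # 0.9539

# Swapping the labels x and y should leave r unchanged
r_swapped = pearson_r(ys, xs)
print(abs(r - r_swapped) < 1e-12)  # True
```

The second check illustrates the property that r does not depend on which variable is labelled x and which is labelled y.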

EVALUATION: The following data are the marks obtained by five students in statistics (X) and mathematics(Y). Calculate the product moment correlation coefficient and comment on your result.

X | 33 | 36 | 42 | 52 | 40
Y | 42 | 46 | 38 | 62 | 52

GENERAL EVALUATION/REVISIONAL QUESTIONS

1. If cos A = 24/25 and sin B = 3/5, where A is acute and B is obtuse, find, without using tables, the values of (a) sin 2A (b) cos 2B (c) sin (A - B)

2. Use the addition formulae to find the values of the following:

(a) sin 75° (b) cos 75° (c) tan 45°

3. Calculate the product moment correlation coefficient and the Spearman’s rank correlation coefficient for the data below.

X | 50 | 45 | 43 | 30 | 30 | 43 | 23 | 43 | 25
Y | 12 | 13.5 | 14 | 11 | 12 | 15 | 13.5 | 12 | 14

READING ASSIGNMENT: Read Correlation and Regression, pages 313–320, Further Mathematics Project 2.

WEEKEND ASSIGNMENT

Use the table below to answer questions 1 and 2.

Height (cm) | 160 | 161 | 162 | 163 | 164 | 165
No of students | 4 | 6 | 3 | 7 | 8 | 2

1. The mean of the distribution is

(a) 4875.1 cm (b) 4001.2 cm (c) 3571.0 cm (d) 162.2 cm (e) 129.2 cm

2. The median of the distribution is

(a) 160 (b) 162 (c) 163 (d) 164 (e) 165

3. Calculate the standard deviation of 3, 4, 5, 6, 7, 8, 9

(a) 2 (b) 2.4 (c) 3.6 (d) 4.0 (e) 4.2

4. Calculate the mean deviation of 6, 8, 4, 0, 4

(a) 4.0 (b) 3.6 (c) 3.0 (d) 2.8 (e) 2.1

5. The table below shows the ranks Rx and Ry of marks scored by 10 candidates in an oral and a written test respectively. Calculate the Spearman’s rank correlation coefficient of the data.

Rx | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10
Ry | 2 | 3 | 4 | 1 | 6 | 5 | 8 | 7 | 10 | 9

(a) 51/55 (b) 6/55 (c) 49/55 (d) 54/55 (e) 61/55

THEORY

1. The distribution of marks scored in statistics and mathematics by ten students is given in the table below:

Maths (x) | 11 | 20 | 23 | 42 | 48 | 50 | 57 | 64 | 80 | 90
Stat (y) | 26 | 23 | 35 | 46 | 44 | 50 | 50 | 58 | 68 | 70

(a) Plot a scatter diagram for the distribution.

(b) Draw an eye-fitted line of best fit.

(c) Use your line to estimate a student’s mark in statistics if his mark in maths is 40.

2. The table below gives the marks obtained by members of a class in maths and physics examinations.

Students | A | B | C | D | E | F | G | H | I | J
Maths | 85 | 75 | 59 | 43 | 74 | 69 | 62 | 80 | 54 | 63
Physics | 92 | 72 | 62 | 48 | 85 | 73 | 46 | 74 | 58 | 50

(a) Calculate the product moment correlation coefficient.

(b) Comment on your result.