Page 339 - DMTH404_STATISTICS
Unit 23: Regression Analysis
The best estimate (an estimate having minimum sum of squares of errors) of Y, independently
of X, is given by $Y_C = \bar{Y}$.
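As a rough numerical illustration of this statement, the sketch below compares the sum of squared errors at the sample mean with the sum at a few other constant estimates. The data values and the helper name `sum_sq_errors` are made up for the example and do not come from the text.

```python
# Minimal sketch: among constant estimates of Y, the sample mean gives the
# smallest sum of squared errors. The observations below are hypothetical.

def sum_sq_errors(values, estimate):
    """Sum of squared deviations of the values from one constant estimate."""
    return sum((v - estimate) ** 2 for v in values)

y = [4.0, 7.0, 9.0, 12.0, 13.0]          # hypothetical observations of Y
y_bar = sum(y) / len(y)                  # sample mean of Y

sse_at_mean = sum_sq_errors(y, y_bar)

# Try a few other constants on either side of the mean.
other_constants = (y_bar - 1.0, y_bar + 0.5, y_bar + 2.0)
sse_elsewhere = {c: sum_sq_errors(y, c) for c in other_constants}

print("mean of Y:", y_bar, "sum of squared errors:", sse_at_mean)
for c, sse in sse_elsewhere.items():
    print("constant:", c, "sum of squared errors:", sse)
```

Every other constant tried here gives a larger sum of squared errors than the mean does, which is what the statement above asserts in general.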
Remarks: If X and Y are independent variables, the two lines of regression are $Y_C = \bar{Y}$ and
$X_C = \bar{X}$.
Very often, when we use X for the estimation of Y, we are interested in knowing how far the use
of X enables us to explain the variations in Y values from $\bar{Y}$ or, in other words, how much of the
variation in Y, from $\bar{Y}$, is explained by the regression equation $Y_{Ci} = a + bX_i$. To answer
this question, we write
$$Y_i - \bar{Y} = Y_i - Y_{Ci} + Y_{Ci} - \bar{Y}$$ (Subtracting and adding $Y_{Ci}$)
or
$$Y_i - \bar{Y} = (Y_i - Y_{Ci}) + (Y_{Ci} - \bar{Y})$$
Squaring both sides and taking sum over all the observations, we have
$$\sum (Y_i - \bar{Y})^2 = \sum (Y_i - Y_{Ci})^2 + \sum (Y_{Ci} - \bar{Y})^2 + 2\sum (Y_i - Y_{Ci})(Y_{Ci} - \bar{Y})$$ ....(1)
Consider the product term
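$$2\sum (Y_i - Y_{Ci})(Y_{Ci} - \bar{Y}) = 2\sum \left[ \{ (Y_i - \bar{Y}) - b(X_i - \bar{X}) \} \{ b(X_i - \bar{X}) \} \right]$$
$$= 2b\sum (Y_i - \bar{Y})(X_i - \bar{X}) - 2b^2 \sum (X_i - \bar{X})^2$$
$$= 2b^2 \sum (X_i - \bar{X})^2 - 2b^2 \sum (X_i - \bar{X})^2 = 0$$
Here the first step uses $Y_{Ci} - \bar{Y} = b(X_i - \bar{X})$, which follows from $a = \bar{Y} - b\bar{X}$, and the last
step uses $\sum (X_i - \bar{X})(Y_i - \bar{Y}) = b \sum (X_i - \bar{X})^2$, the least-squares expression for b.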
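The vanishing of the product term can also be checked numerically. The following is only a sketch under stated assumptions: the data are invented, `fit_line` is a hypothetical helper, and b and a are computed from the usual least-squares formulas.

```python
# Minimal sketch: for a least-squares line Y_C = a + bX, the product term
# 2 * sum((Y_i - Y_Ci) * (Y_Ci - Y_bar)) is zero up to rounding error.
# Data and helper names are hypothetical.

def fit_line(x, y):
    """Return (a, b) of the least-squares line of Y on X."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    b = s_xy / s_xx              # slope
    a = y_bar - b * x_bar        # intercept
    return a, b

x = [1.0, 2.0, 3.0, 4.0, 5.0]    # hypothetical X values
y = [2.0, 4.0, 5.0, 4.0, 5.0]    # hypothetical Y values

a, b = fit_line(x, y)
y_bar = sum(y) / len(y)
y_c = [a + b * xi for xi in x]   # estimated values Y_Ci

cross_term = 2 * sum((yi - yci) * (yci - y_bar) for yi, yci in zip(y, y_c))
print("a =", a, "b =", b, "product term =", cross_term)
```

The printed product term may differ from zero by a tiny floating-point amount, so any comparison should use a tolerance rather than exact equality.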
Thus, equation (1) becomes
$$\sum (Y_i - \bar{Y})^2 = \sum (Y_i - Y_{Ci})^2 + \sum (Y_{Ci} - \bar{Y})^2$$ .... (2)
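Equation (2) can be illustrated on a small data set. The sketch below is not taken from the text: the numbers are invented, and the function name `decompose` and the labels "total", "unexplained" and "explained" are chosen here for the three sums in the equation.

```python
# Minimal sketch: split the total sum of squares of Y about its mean into the
# two sums on the right-hand side of equation (2). Data are hypothetical.

def decompose(x, y):
    """Return (total, unexplained, explained) sums of squares for Y on X."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    b = (sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
         / sum((xi - x_bar) ** 2 for xi in x))       # least-squares slope
    a = y_bar - b * x_bar                            # least-squares intercept
    y_c = [a + b * xi for xi in x]                   # estimated values Y_Ci

    total = sum((yi - y_bar) ** 2 for yi in y)                     # sum (Y_i - Y_bar)^2
    unexplained = sum((yi - yci) ** 2 for yi, yci in zip(y, y_c))  # sum (Y_i - Y_Ci)^2
    explained = sum((yci - y_bar) ** 2 for yci in y_c)             # sum (Y_Ci - Y_bar)^2
    return total, unexplained, explained

x = [2.0, 4.0, 6.0, 8.0]     # hypothetical X values
y = [3.0, 7.0, 8.0, 12.0]    # hypothetical Y values

total, unexplained, explained = decompose(x, y)
print("total:", total)
print("unexplained + explained:", unexplained + explained)
```

Up to rounding error the two printed values agree, as equation (2) requires for a line fitted by least squares.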
From Figure 23.5, we note that $Y_{Ci} - \bar{Y}$ is the deviation of the estimated value from $\bar{Y}$.
This deviation has occurred because X and Y are related by the regression equation $Y_{Ci} = a + bX_i$,
so that the estimate of Y is $Y_{Ci}$ when $X = X_i$. Similar deviations would occur for other
values of X. Thus, the magnitude of the term $\sum (Y_{Ci} - \bar{Y})^2$ gives the strength of the relationship,
$Y_{Ci} = a + bX_i$, between X and Y or, equivalently, the variations in Y that are explained by the
regression equation.
Figure 23.5
LOVELY PROFESSIONAL UNIVERSITY