Page 235 - DMGT209_QUANTITATIVE_TECHNIQUES_II
dependent variables, as the objective is to arrive at a probabilistic assessment of a binary choice.
The independent variables can be either discrete or continuous. A contingency table is produced,
which shows the classification of observations as to whether the observed and predicted events
match. The sum of events that were predicted to occur which actually did occur and the events
that were predicted not to occur which actually did not occur, divided by the total number of
events, is a measure of the effectiveness of the model. This tool helps predict the choices consumers
might make when presented with alternatives.
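The effectiveness measure described above can be sketched in a few lines of Python. This is a minimal illustration, not a full choice-model fit: the function name `hit_rate` and the observed and predicted vectors are hypothetical, and events are assumed to be coded 1 (occurred) and 0 (did not occur).

```python
def hit_rate(observed, predicted):
    """Return the 2x2 contingency table and the share of correct predictions.

    Assumes both sequences are coded 1 (event occurred) / 0 (did not occur).
    """
    if len(observed) != len(predicted) or not observed:
        raise ValueError("need two non-empty sequences of equal length")
    # table[o][p] counts cases with observed value o and predicted value p
    table = [[0, 0], [0, 0]]
    for obs, pred in zip(observed, predicted):
        table[obs][pred] += 1
    # Correct = predicted to occur and did occur + predicted not to occur and did not
    correct = table[1][1] + table[0][0]
    return table, correct / len(observed)


# Hypothetical data: 1 = consumer chose the product, 0 = did not
observed = [1, 0, 1, 1, 0, 0, 1, 0, 1, 0]
predicted = [1, 0, 0, 1, 0, 1, 1, 0, 1, 0]

table, rate = hit_rate(observed, predicted)
print(table)  # [[4, 1], [1, 4]]
print(rate)   # 0.8
```

Here eight of the ten cases fall on the diagonal of the table, so the model classifies 80 per cent of the observations correctly.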
11.4 Coefficient of Multiple Determination
In statistics, the coefficient of determination $R^2$ is used in the context of statistical models whose
main purpose is the prediction of future outcomes on the basis of other related information. It is
the proportion of variability in a data set that is accounted for by the statistical model. It
provides a measure of how well future outcomes are likely to be predicted by the model.
There are several different definitions of $R^2$ which are only sometimes equivalent. One class of
cases in which they coincide is linear regression. In this case, if an intercept is included, then $R^2$ is
simply the square of the sample correlation coefficient between the outcomes and their predicted
values, or in the case of simple linear regression, between the outcomes and the values of the
single regressor being used for prediction. In such cases, the coefficient of determination ranges
from 0 to 1. Important cases where the computational definition of $R^2$ can yield negative values,
depending on the definition used, arise where the predictions which are being compared to the
corresponding outcomes have not been derived from a model-fitting procedure using those
data, and where linear regression is conducted without including an intercept. Additionally,
negative values of $R^2$ may occur when fitting non-linear trends to data. In these instances, the
mean of the data provides a fit to the data that is superior to that of the trend under this goodness
of fit analysis.
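A small numerical sketch may make these cases concrete. It assumes the computational definition $R^2 = 1 - SS_{\text{residual}}/SS_{\text{total}}$, which is only one of the definitions alluded to above. The data and the helper names (`r_squared`, `correlation`) are hypothetical.

```python
from math import sqrt


def mean(values):
    return sum(values) / len(values)


def r_squared(y, predictions):
    """R^2 as 1 - SS_residual / SS_total (one common computational definition)."""
    y_bar = mean(y)
    ss_res = sum((yi - pi) ** 2 for yi, pi in zip(y, predictions))
    ss_tot = sum((yi - y_bar) ** 2 for yi in y)
    return 1 - ss_res / ss_tot


def correlation(a, b):
    """Sample correlation coefficient between two sequences."""
    a_bar, b_bar = mean(a), mean(b)
    sab = sum((ai - a_bar) * (bi - b_bar) for ai, bi in zip(a, b))
    saa = sum((ai - a_bar) ** 2 for ai in a)
    sbb = sum((bi - b_bar) ** 2 for bi in b)
    return sab / sqrt(saa * sbb)


# Hypothetical data
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

# Case 1: simple linear regression with an intercept
x_bar, y_bar = mean(x), mean(y)
slope = (sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
         / sum((xi - x_bar) ** 2 for xi in x))
intercept = y_bar - slope * x_bar
fitted = [intercept + slope * xi for xi in x]
r2_fitted = r_squared(y, fitted)
r = correlation(x, y)

# Case 2: regression through the origin (no intercept)
slope_origin = sum(xi * yi for xi, yi in zip(x, y)) / sum(xi ** 2 for xi in x)
r2_origin = r_squared(y, [slope_origin * xi for xi in x])

# Case 3: predictions that were not fitted to these data at all
external = [5, 4, 3, 2, 1]
r2_external = r_squared(y, external)

print(round(r2_fitted, 3), round(r ** 2, 3))  # 0.6 0.6
print(round(r2_origin, 3))                    # -0.133
print(round(r2_external, 3))                  # -4.5
```

With an intercept, $R^2$ equals the squared correlation and lies between 0 and 1. In the other two cases the residual sum of squares exceeds the total sum of squares, so the mean of the data fits better than the predictions and $R^2$ is negative.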
In multiple regression analysis, the coefficient of multiple determination is the proportion of the
variation in Y explained by the regression, which can be calculated as
$SS_{\text{explained}}/SS_{\text{total}}$. In other words, this is the proportion of variation
in the criterion variable that is accounted for by the co-variation in the predictor (independent)
variables. The coefficient of determination of a multiple linear regression model is the quotient
of the variances of the fitted values and observed values of the dependent variable. If we
denote $y_i$ as the observed values of the dependent variable, $\bar{y}$ as its mean, and $\hat{y}_i$ as the fitted
values, then the coefficient of determination is:

$$R^2 = \frac{\sum_i (\hat{y}_i - \bar{y})^2}{\sum_i (y_i - \bar{y})^2}$$
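The ratio above can be computed directly once a multiple regression has been fitted. The following is a minimal sketch under stated assumptions: the model includes an intercept, the coefficients are obtained by solving the normal equations with a small hand-written solver, and the data and function names (`solve`, `fit_multiple`, `coefficient_of_determination`) are hypothetical. In practice a statistical package would be used for the fit.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit_multiple(rows, y):
    """Least-squares coefficients (intercept first) and the design matrix."""
    design = [[1.0] + list(r) for r in rows]  # leading 1.0 is the intercept column
    k = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    xty = [sum(d[i] * yi for d, yi in zip(design, y)) for i in range(k)]
    return solve(xtx, xty), design


def coefficient_of_determination(y, fitted):
    """R^2 = sum((y_hat_i - y_bar)^2) / sum((y_i - y_bar)^2)."""
    y_bar = sum(y) / len(y)
    ss_explained = sum((f - y_bar) ** 2 for f in fitted)
    ss_total = sum((yi - y_bar) ** 2 for yi in y)
    return ss_explained / ss_total


# Hypothetical data: two predictors (x1, x2) per observation
rows = [(1, 2), (2, 1), (3, 4), (4, 3), (5, 6), (6, 5)]
y = [5, 6, 10, 10, 15, 15]

beta, design = fit_multiple(rows, y)
fitted = [sum(c * v for c, v in zip(beta, d)) for d in design]
r2 = coefficient_of_determination(y, fitted)
print(beta)
print(r2)
```

Because the fit includes an intercept, the value printed for `r2` lies between 0 and 1 and agrees with $1 - SS_{\text{residual}}/SS_{\text{total}}$ up to rounding error.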
Self Assessment
Fill in the blanks:
3. ………………….. analysis is sometimes referred to as choice models.
4. In statistics, the …………………………. $R^2$ is used in the context of statistical models whose
main purpose is the prediction of future outcomes on the basis of other related information.
11.5 Summary
If the coefficient of correlation calculated for bivariate data $(X_i, Y_i)$, $i = 1, 2, \ldots, n$, is
reasonably high and a cause and effect type of relation is also believed to exist
between them, the next logical step is to obtain a functional relation between these variables.
230 LOVELY PROFESSIONAL UNIVERSITY