dimensional space. The distances between the points in the 4-dimensional space summarize all the information about the similarities between the rows in the table above. Now suppose one could find a lower-dimensional space in which to position the row points in a manner that retains all, or almost all, of the information about the differences between the rows. You could then present all the information about the similarities between the rows (types of employees in this case) in a simple 1-, 2-, or 3-dimensional graph. While this may not appear to be particularly useful for small tables like the one shown above, one can easily imagine how the presentation and interpretation of very large tables (e.g., differential preference for 10 consumer items among 100 groups of respondents in a consumer survey) could greatly benefit from the simplification that can be achieved via correspondence analysis (e.g., representing the 10 consumer items in a two-dimensional space).
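The dimension reduction behind correspondence analysis can be carried out with a singular value decomposition of the table's standardized residuals. Below is a minimal sketch in Python, assuming a small hypothetical staff-group-by-smoking-category table; the counts, labels, and 5 x 4 shape are illustrative assumptions, not data from the text.

import numpy as np

# Hypothetical contingency table: 5 staff groups (rows) by 4 smoking
# categories (columns); the counts are made up for illustration.
N = np.array([[ 4,  2,  3,  2],
              [ 4,  3,  7,  4],
              [25, 10, 12,  4],
              [18, 24, 33, 13],
              [10,  6,  7,  2]], dtype=float)

P = N / N.sum()        # correspondence matrix (relative frequencies)
r = P.sum(axis=1)      # row masses
c = P.sum(axis=0)      # column masses

# Standardized residuals from the independence model r * c
S = (P - np.outer(r, c)) / np.sqrt(np.outer(r, c))

U, sv, Vt = np.linalg.svd(S, full_matrices=False)

# Row principal coordinates: each row of the table becomes a point,
# and the first two columns give its position in a 2-dimensional map.
F = (U * sv) / np.sqrt(r)[:, None]
print(F[:, :2])                    # 2-D coordinates of the 5 row points
print(sv**2 / (sv**2).sum())       # share of inertia per dimension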

                                   13.4.2 Rotation in Factor Analysis

Rotation is the step in factor analysis that permits you to attach meaningful names or descriptions to the factors.

                                   Linear Functions of Predictors

To understand rotation, first consider a problem that doesn't involve factor analysis. Suppose you want to predict the grades of college students (all in the same college) in many different courses, from their scores on general "verbal" and "math" skill tests. To develop the predictive formulas, you have a body of past data consisting of the grades of several hundred previous students in these courses, plus the scores of those students on the math and verbal tests. To predict grades for present and future students, you might use these data from past students to fit a series of two-variable multiple regressions, each regression forecasting the grade in one course from the scores on the two skill tests.
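Concretely, the series of regressions might look like the following minimal sketch, which uses simulated data in place of the past students' records; the course names, sample size, and coefficients are all assumptions made for the example.

import numpy as np

rng = np.random.default_rng(0)
n = 300                                    # several hundred past students
verbal = rng.normal(50, 10, n)             # verbal skill test scores
math = rng.normal(50, 10, n)               # math skill test scores
X = np.column_stack([np.ones(n), verbal, math])

# Hypothetical grades in several courses, each depending differently
# on the two skills.
grades = {
    "history":   0.8 * verbal + 0.1 * math + rng.normal(0, 5, n),
    "calculus":  0.1 * verbal + 0.9 * math + rng.normal(0, 5, n),
    "economics": 0.5 * verbal + 0.5 * math + rng.normal(0, 5, n),
}

# One two-predictor least-squares regression per course.
coefs = {course: np.linalg.lstsq(X, y, rcond=None)[0]
         for course, y in grades.items()}
for course, b in coefs.items():
    print(course, np.round(b, 2))          # intercept, verbal and math weights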
Now suppose a co-worker suggests summing each student's verbal and math scores to obtain a composite "academic skill" score I'll call AS, and taking the difference between each student's verbal and math scores to obtain a second variable I'll call VMD (verbal-math difference). The co-worker advises running the same set of regressions to predict grades in individual courses, except using AS and VMD as the predictors in each regression instead of the original verbal and math scores. In this instance, you would get exactly the same predictions of course grades from the two families of regressions: one predicting grades in individual courses from verbal and math scores, the other predicting the identical grades from AS and VMD scores. In fact, you would get the same predictions if you formed the composites 3 math + 5 verbal and 5 math + 3 verbal, and ran a series of two-variable multiple regressions forecasting grades from these two composites. These composites are all linear functions of the original verbal and math scores.
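The invariance claim is easy to check numerically. The following minimal sketch, again on simulated data, fits one course's grades first from (verbal, math) and then from (AS, VMD), and confirms that the two regressions give identical fitted grades.

import numpy as np

rng = np.random.default_rng(1)
n = 300
verbal = rng.normal(50, 10, n)
math = rng.normal(50, 10, n)
grade = 0.7 * verbal + 0.3 * math + rng.normal(0, 5, n)   # one course

X1 = np.column_stack([np.ones(n), verbal, math])          # original predictors
X2 = np.column_stack([np.ones(n),
                      verbal + math,                      # AS
                      verbal - math])                     # VMD

b1 = np.linalg.lstsq(X1, grade, rcond=None)[0]
b2 = np.linalg.lstsq(X2, grade, rcond=None)[0]

pred1, pred2 = X1 @ b1, X2 @ b2
print(np.allclose(pred1, pred2))   # True: the fitted grades are identical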
The vital point is that if you have m predictor variables and you replace the m original predictors by m linearly independent linear functions of those predictors, you neither gain nor lose any information: you could, if you wished, use the scores on the linear functions to rebuild the scores on the original variables. Multiple regression uses whatever information you have in the optimal way (as measured by the sum of squared errors in the current sample) to forecast a new variable (e.g., grades in a particular course). Since the linear functions contain the same information as the original variables, you get the same predictions as before.
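For instance, with AS = verbal + math and VMD = verbal - math, the original scores are recovered exactly as verbal = (AS + VMD)/2 and math = (AS - VMD)/2, so the pair (AS, VMD) carries precisely the same information as the pair (verbal, math).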

Given that there are many ways to get exactly the same predictions, is there any advantage to using one set of linear functions rather than another? Yes, there is: one set might be simpler than another. One particular pair of linear functions may enable many of the course grades to be forecast from just one variable (that is, one linear function) rather than from two. If we regard regressions with fewer predictor variables as simpler, then we can ask this question: Out of all the



