dimensional space. The distances between the points in the 4-dimensional space summarize all the information about the similarities between the rows in the table above. Now suppose one could find a lower-dimensional space in which to position the row points in a manner that retains all, or almost all, of the information about the differences between the rows. You could then present all the information about the similarities between the rows (types of employees in this case) in a simple 1-, 2-, or 3-dimensional graph. While this may not appear to be particularly useful for small tables like the one shown above, one can easily imagine how the presentation and interpretation of very large tables (e.g., differential preference for 10 consumer items among 100 groups of respondents in a consumer survey) could greatly benefit from the simplification that can be achieved via correspondence analysis (e.g., representing the 10 consumer items in a two-dimensional space).
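The dimension reduction behind correspondence analysis can be carried out with a singular value decomposition of the table's standardized residuals. Below is a minimal sketch in Python, assuming a small hypothetical staff-group-by-smoking-category table; the counts, labels, and 5 x 4 shape are illustrative assumptions, not data from the text.

import numpy as np

# Hypothetical contingency table: 5 staff groups (rows) by 4 smoking
# categories (columns); the counts are made up for illustration.
N = np.array([[ 4,  2,  3,  2],
              [ 4,  3,  7,  4],
              [25, 10, 12,  4],
              [18, 24, 33, 13],
              [10,  6,  7,  2]], dtype=float)

P = N / N.sum()        # correspondence matrix (relative frequencies)
r = P.sum(axis=1)      # row masses
c = P.sum(axis=0)      # column masses

# Standardized residuals from the independence model r * c
S = (P - np.outer(r, c)) / np.sqrt(np.outer(r, c))

U, sv, Vt = np.linalg.svd(S, full_matrices=False)

# Row principal coordinates: each row of the table becomes a point,
# and the first two columns give its position in a 2-dimensional map.
F = (U * sv) / np.sqrt(r)[:, None]
print(F[:, :2])                    # 2-D coordinates of the 5 row points
print(sv**2 / (sv**2).sum())       # share of inertia per dimension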

                                   13.4.2 Rotation in Factor Analysis

Rotation is the step in factor analysis that permits you to attach meaningful names or descriptions to the factors.

                                   Linear Functions of Predictors

To understand rotation, first consider a problem that doesn't involve factor analysis. Suppose you want to predict the grades of college students (all in the same college) in many different courses, from their scores on general "verbal" and "math" skill tests. To develop the predictive formulas, you have a body of past data consisting of the grades of several hundred previous students in these courses, plus the scores of those students on the math and verbal tests. To predict grades for present and future students, you might use these data from past students to fit a series of two-variable multiple regressions, each regression forecasting the grade in one course from the scores on the two skill tests.
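Concretely, the series of regressions might look like the following minimal sketch, which uses simulated data in place of the past students' records; the course names, sample size, and coefficients are all assumptions made for the example.

import numpy as np

rng = np.random.default_rng(0)
n = 300                                    # several hundred past students
verbal = rng.normal(50, 10, n)             # verbal skill test scores
math = rng.normal(50, 10, n)               # math skill test scores
X = np.column_stack([np.ones(n), verbal, math])

# Hypothetical grades in several courses, each depending differently
# on the two skills.
grades = {
    "history":   0.8 * verbal + 0.1 * math + rng.normal(0, 5, n),
    "calculus":  0.1 * verbal + 0.9 * math + rng.normal(0, 5, n),
    "economics": 0.5 * verbal + 0.5 * math + rng.normal(0, 5, n),
}

# One two-predictor least-squares regression per course.
coefs = {course: np.linalg.lstsq(X, y, rcond=None)[0]
         for course, y in grades.items()}
for course, b in coefs.items():
    print(course, np.round(b, 2))          # intercept, verbal and math weights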
Now suppose a co-worker suggests summing each student's verbal and math scores to obtain a composite "academic skill" score I'll call AS, and taking the difference between each student's verbal and math scores to obtain a second variable I'll call VMD (verbal-math difference). The co-worker advises running the same set of regressions to predict grades in individual courses, except using AS and VMD as the predictors in each regression instead of the original verbal and math scores. In this instance, you would get exactly the same predictions of course grades from the two families of regressions: one predicting grades in individual courses from verbal and math scores, the other predicting the identical grades from AS and VMD scores. In fact, you would get the same predictions if you formed the composites 3 math + 5 verbal and 5 math + 3 verbal, and ran a series of two-variable multiple regressions forecasting grades from these two composites. These composites are all linear functions of the original verbal and math scores.
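The invariance claim is easy to check numerically. The following minimal sketch, again on simulated data, fits one course's grades first from (verbal, math) and then from (AS, VMD), and confirms that the two regressions give identical fitted grades.

import numpy as np

rng = np.random.default_rng(1)
n = 300
verbal = rng.normal(50, 10, n)
math = rng.normal(50, 10, n)
grade = 0.7 * verbal + 0.3 * math + rng.normal(0, 5, n)   # one course

X1 = np.column_stack([np.ones(n), verbal, math])          # original predictors
X2 = np.column_stack([np.ones(n),
                      verbal + math,                      # AS
                      verbal - math])                     # VMD

b1 = np.linalg.lstsq(X1, grade, rcond=None)[0]
b2 = np.linalg.lstsq(X2, grade, rcond=None)[0]

pred1, pred2 = X1 @ b1, X2 @ b2
print(np.allclose(pred1, pred2))   # True: the fitted grades are identical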
The vital point is that if you have m predictor variables and you replace the m original predictors by m linearly independent linear functions of those predictors, you neither gain nor lose any information: you could, if you wished, use the scores on the linear functions to rebuild the scores on the original variables. Multiple regression uses whatever information you have in the optimal way (as measured by the sum of squared errors in the current sample) to forecast a new variable (e.g., grades in a particular course). Since the linear functions contain the same information as the original variables, you get the same predictions as before.
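For instance, with AS = verbal + math and VMD = verbal - math, the original scores are recovered exactly as verbal = (AS + VMD)/2 and math = (AS - VMD)/2, so the pair (AS, VMD) carries precisely the same information as the pair (verbal, math).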

Given that there are many ways to get exactly the same predictions, is there any advantage to using one set of linear functions rather than another? Yes, there is: one set might be simpler than another. One particular pair of linear functions may enable many of the course grades to be forecast from just one variable (that is, one linear function) rather than from two. If we regard regressions with fewer predictor variables as simpler, then we can ask this question: Out of all the



