
Educational Measurement and Evaluation


                                  A criterion-referenced test is designed to measure how well test takers have mastered a particular
                                  body of knowledge.
                                  These tests generally have an established “passing” score. Students know what the passing score
                                  is and an individual’s test score is determined by knowledge of the course material.
                                  He runs a second tryout, having established that 10 seconds in the 100-yard dash is competitive in the event.
                                  He now picks those who run the dash in 10 seconds or less. This is criterion-referenced testing. He knows
                                  the runners he selected can compete. He gets the funds.
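The contrast between the two selection rules can be sketched in a few lines of Python. This is only an illustration: the runner names and times are invented, and the "pick the fastest k" rule is an assumed stand-in for a norm-referenced selection. Only the 10-second cutoff comes from the example above.

```python
# Hypothetical 100-yard dash times in seconds; names and values are made up.
times = {"Ali": 9.8, "Ben": 10.0, "Caleb": 10.4, "Dev": 9.9, "Eli": 11.1}

CUTOFF = 10.0  # criterion taken from the example: 10 seconds or less


def criterion_referenced(times, cutoff):
    """Select everyone who meets the fixed standard, however many that is."""
    return sorted(name for name, t in times.items() if t <= cutoff)


def norm_referenced(times, k):
    """Select the k fastest runners, whatever their actual times are."""
    return sorted(times, key=times.get)[:k]


selected_by_criterion = criterion_referenced(times, CUTOFF)
selected_by_rank = norm_referenced(times, 2)
```

Under the criterion rule, the number of runners selected depends only on how many meet the standard. Under the rank rule, the number is fixed in advance and a selected runner may or may not be competitive.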





                                          The term “criterion-referenced test” is not part of the everyday vocabulary in schools,
                                          and yet nearly all students take criterion-referenced tests on a routine basis.

                                  14.2 Features of Criterion-Referenced Tests (CRT)

                                  Features of criterion-referenced tests are as follows:
                                  (i)  Criterion-referenced tests place a primary focus on the content and what is being measured.
                                      Norm-referenced tests are also concerned about what is being measured but the degree of
                                      concern is less since the domain of content is not the primary focus for score interpretation.
                                      In norm-referenced test development, item selection, beyond the requirement that items
                                      meet the content specifications, is driven by item statistics. Items are needed that are not too
                                      difficult or too easy, and that are highly discriminating. These are the types of items that
                                      contribute most to score spread, and enhance test score reliability and validity.
                                  (ii)  With criterion-referenced test development, extensive efforts go into ensuring content validity.
                                      Item statistics play less of a role in item selection, though highly discriminating items are still
                                      greatly valued, and sometimes item statistics are used to select items that maximize the
                                      discriminating power of a test at the performance standards of interest on the test score
                                      scale.
                                      A good norm-referenced test is one that will result in a wide distribution of scores on the
                                      construct being measured by the test. Without score variability, reliable and valid
                                      comparisons of candidates cannot be made. A good criterion-referenced test will permit
                                      content-referenced interpretations and this means that the content domains to which scores
                                      are referenced must be very clearly defined. Each type of test can serve the other main
                                      purpose (norm-referenced versus criterion-referenced interpretations), but this secondary
                                      use will never be optimal. For example, since criterion-referenced tests are not constructed
                                      to maximize score variability, their use in comparing candidates may be far from optimal
                                      if the test scores that are produced from the test administration are relatively similar.

                                      Because the purpose of a criterion-referenced test is quite different from that of a norm-
                                      referenced test, it should not be surprising to find that the approaches used for reliability
                                      and validity assessment are different too.
                                  (iii) With criterion-referenced tests, scores are often used to sort candidates into performance
                                      categories. Consistency of scores over parallel administrations becomes less central than
                                      consistency of classifications of candidates to performance categories over parallel
                                      administrations. Variation in candidate scores is not so important if candidates are still
                                      assigned to the same performance category.
                                      Therefore, it has been common to define reliability for a criterion-referenced test as the
                                      extent to which performance classifications are consistent over parallel-form administrations.
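The definition of reliability in feature (iii) can be made concrete with a minimal sketch. The function names, the candidate scores, and the cut scores of 50 and 75 are all assumptions chosen for illustration. The index computed is the simple proportion of candidates placed in the same performance category on two parallel forms.

```python
def classify(score, cut_scores):
    """Return the index of the performance category for a score.

    cut_scores must be sorted ascending; a score equal to a cut score
    falls in the higher category.
    """
    return sum(score >= c for c in cut_scores)


def classification_consistency(form_a, form_b, cut_scores):
    """Proportion of candidates placed in the same category on both forms."""
    if len(form_a) != len(form_b) or not form_a:
        raise ValueError("need two non-empty score lists of equal length")
    same = sum(
        classify(a, cut_scores) == classify(b, cut_scores)
        for a, b in zip(form_a, form_b)
    )
    return same / len(form_a)


# Hypothetical scores for five candidates on two parallel forms.
form_a = [42, 55, 61, 78, 90]
form_b = [48, 52, 66, 71, 85]
cuts = [50, 75]  # assumed cut scores: below 50, 50 to 74, 75 and above

p0 = classification_consistency(form_a, form_b, cuts)
```

In this made-up data every candidate's score changes between forms, yet four of the five candidates stay in the same category. That is the sense in which score variation matters less than classification agreement. Chance-corrected indices such as Cohen's kappa are also used for this purpose, but they are not shown here.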





        182                                 LOVELY PROFESSIONAL UNIVERSITY