Educational Measurement and Evaluation
A criterion-referenced test is designed to measure how well test takers have mastered a particular
body of knowledge.
These tests generally have an established “passing” score. Students know what the passing score
is and an individual’s test score is determined by knowledge of the course material.
He runs a second tryout, having established that 10 seconds in the 100-yard dash is competitive in the event.
He now picks those who run the dash in 10 seconds or less. This is criterion-referenced testing. He knows
the runners he selected can compete. He gets the funds.
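The selection rule in this example can be sketched in a few lines of Python. The runner names and times are hypothetical, invented only for illustration, and the helper `select_top_k` is an assumed contrast case (picking by rank rather than by a fixed standard), not something the example itself describes.

```python
# Hypothetical 100-yard dash times in seconds (illustrative data, not from the text).
times = {"Asha": 9.8, "Ben": 10.0, "Chen": 10.4, "Dev": 11.1}

CUTOFF_SECONDS = 10.0  # the fixed standard stated in the example


def select_by_criterion(times, cutoff):
    """Criterion-referenced: keep every runner who meets the fixed standard."""
    return sorted(name for name, t in times.items() if t <= cutoff)


def select_top_k(times, k):
    """Assumed contrast case: keep the k fastest runners, whatever their times."""
    return sorted(times, key=times.get)[:k]


criterion_picks = select_by_criterion(times, CUTOFF_SECONDS)
rank_picks = select_top_k(times, 3)
```

With the fixed standard, the number of runners selected depends only on how many meet it; with a rank-based rule, the third runner is selected even though a time of 10.4 seconds misses the standard.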
The term “criterion-referenced test” is not part of the everyday vocabulary in schools,
and yet, nearly all students take criterion-referenced tests on a routine basis.
14.2 Features of Criterion-Referenced Tests (CRT)
The features of criterion-referenced tests are as follows:
(i) Criterion-referenced tests place a primary focus on the content and what is being measured.
Norm-referenced tests are also concerned about what is being measured but the degree of
concern is less since the domain of content is not the primary focus for score interpretation.
In norm-referenced test development, item selection, beyond the requirement that items
meet the content specifications, is driven by item statistics. Items are needed that are not too
difficult or too easy, and that are highly discriminating. These are the types of items that
contribute most to score spread, and enhance test score reliability and validity.
(ii) With criterion-referenced test development, extensive efforts go into ensuring content validity.
Item statistics play less of a role in item selection, though highly discriminating items are still
greatly valued, and sometimes item statistics are used to select items that maximize the
discriminating power of a test at the performance standards of interest on the test score
scale.
A good norm-referenced test is one that will result in a wide distribution of scores on the
construct being measured by the test. Without score variability, reliable and valid
comparisons of candidates cannot be made. A good criterion-referenced test will permit
content-referenced interpretations and this means that the content domains to which scores
are referenced must be very clearly defined. Each type of test can serve the other main
purpose (norm-referenced versus criterion-referenced interpretations), but this secondary
use will never be optimal. For example, since criterion-referenced tests are not constructed
to maximize score variability, their use in comparing candidates may be far from optimal
if the test scores that are produced from the test administration are relatively similar.
Because the purpose of a criterion-referenced test is quite different from that of a norm-
referenced test, it should not be surprising to find that the approaches used for reliability
and validity assessment are different too.
(iii) With criterion-referenced tests, scores are often used to sort candidates into performance
categories. Consistency of scores over parallel administrations becomes less central than
consistency of classifications of candidates to performance categories over parallel
administrations. Variation in candidate scores is not so important if candidates are still
assigned to the same performance category.
Therefore, it has been common to define reliability for a criterion-referenced test as the
extent to which performance classifications are consistent over parallel-form administrations.
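As a minimal sketch of this idea, the code below classifies the same candidates on two parallel forms and reports how often the two classifications agree. The scores, the cut score of 14, and the function names are assumptions made for illustration. The proportion of agreement and Cohen's kappa (agreement corrected for chance) are used here as two commonly cited indices of decision consistency; the text itself does not prescribe a particular index.

```python
def classify(score, cut_score):
    """Assign a performance category from a score and a passing (cut) score."""
    return "pass" if score >= cut_score else "fail"


def decision_consistency(form_a, form_b, cut_score):
    """Return (proportion of agreement, Cohen's kappa) for two parallel forms."""
    n = len(form_a)
    a = [classify(s, cut_score) for s in form_a]
    b = [classify(s, cut_score) for s in form_b]

    # Observed agreement: same category on both administrations.
    p0 = sum(x == y for x, y in zip(a, b)) / n

    # Chance agreement, from the marginal pass rates on each form.
    pa = a.count("pass") / n
    pb = b.count("pass") / n
    pc = pa * pb + (1 - pa) * (1 - pb)

    # Kappa is undefined when chance agreement is already perfect.
    kappa = (p0 - pc) / (1 - pc) if pc < 1 else float("nan")
    return p0, kappa


# Hypothetical scores for eight candidates on two parallel forms.
form_a = [12, 15, 18, 9, 20, 14, 16, 11]
form_b = [14, 13, 19, 10, 18, 16, 15, 12]

p0, kappa = decision_consistency(form_a, form_b, cut_score=14)
```

In this made-up data every candidate's score changes between forms, yet six of the eight keep the same category. Only the two candidates whose scores sit close to the cut score are classified differently.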
182 LOVELY PROFESSIONAL UNIVERSITY