Page 191 - DEDU504_EDUCATIONAL_MEASUREMENT_AND_EVALUATION


Unit 14 : Criterion Referenced Test

• A criterion-referenced test is designed to measure how well test takers have mastered a
particular body of knowledge.
• The term “criterion-referenced test” is not part of the everyday vocabulary in schools, and
yet, nearly all students take criterion-referenced tests on a routine basis.
• Criterion-referenced tests place a primary focus on the content and what is being measured.

• With criterion-referenced test development, extensive efforts go into ensuring content validity.
Item statistics play less of a role in item selection, though highly discriminating items are
still greatly valued. Sometimes item statistics are used to select items that maximize the
discriminating power of a test at the performance standards of interest on the test score
scale.
• With criterion-referenced tests, scores are often used to sort candidates into performance
categories. Consistency of scores over parallel administrations becomes less central than
consistency of classifications of candidates to performance categories over parallel
administrations.
• It has been common to define reliability for a criterion-referenced test as the extent to which
performance classifications are consistent over parallel-form administrations.
• With criterion-referenced tests, the focus of validity investigations is on (1) the match between
the content of each test item and the knowledge or skills it is intended to measure, and
(2) the match between what the collection of test items measures and the domain of
content that the test is expected to measure.
• Many criterion-referenced tests are constructed to assess higher-level thinking and writing
skills, such as problem solving and critical reasoning. Demonstrating that the tasks in a test
are actually assessing the intended higher-level skills is important, and this involves
judgments and the collection of empirical evidence.
• The most difficult and controversial part of criterion-referenced testing is setting the performance
standards, i.e., determining the points on the score scale for separating candidates into
performance categories such as “passers” and “failers.” The challenges are great because
with criterion-referenced tests in education, it is common on state and national assessments
to separate candidates into not just two performance categories, but more commonly, three,
four, or even five performance categories.
• Criterion-referenced tests are more suitable than norm-referenced tests for tracking the
progress of students within a curriculum. Test items can be designed to match specific
program objectives.
• Assessing student progress is something that every teacher must do. Criterion-referenced
tests can be developed at the classroom level. If the standards are not met, teachers can
specifically diagnose the deficiencies.
• Criterion-referenced tests have some built-in disadvantages. Creating tests that are both
valid and reliable requires a fairly extensive and expensive investment of time and effort.
In addition, results cannot be generalized beyond the specific course or program.
• Item analysis is used to measure the effectiveness of individual test items. The main purpose
is to improve tests, to identify questions that are too easy, too difficult or too susceptible to
guessing.
• Criterion-referenced tests are used in many ways. Classroom teachers use them to monitor
student performance in their day-to-day activities. States find them useful for evaluating
student performance and generating educational accountability information at the classroom,
school, district, and state levels.
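
The idea of reliability as consistency of performance classifications over parallel-form administrations can be illustrated with a short calculation. The sketch below is only one possible way to do it: the scores, the cut scores (50, 70, 85) and the function names are hypothetical, and the use of Cohen's kappa as a chance-corrected agreement index is an assumption of this example rather than something prescribed in the unit.

```python
from collections import Counter

def classify(score, cut_scores):
    """Return the performance category (0 = lowest) for a score.

    cut_scores is an ascending list; a score at or above a cut
    moves the candidate into the next category.
    """
    return sum(score >= c for c in cut_scores)

def decision_consistency(form_a, form_b, cut_scores):
    """Proportion of consistent classifications and Cohen's kappa."""
    cats_a = [classify(s, cut_scores) for s in form_a]
    cats_b = [classify(s, cut_scores) for s in form_b]
    n = len(cats_a)
    # Observed agreement: same category on both parallel forms.
    p_o = sum(a == b for a, b in zip(cats_a, cats_b)) / n
    # Agreement expected by chance from the marginal category counts.
    count_a, count_b = Counter(cats_a), Counter(cats_b)
    p_c = sum(count_a[k] * count_b[k] for k in count_a) / n ** 2
    kappa = (p_o - p_c) / (1 - p_c) if p_c < 1 else 1.0
    return p_o, kappa

# Hypothetical scores of eight candidates on two parallel forms.
form_a = [35, 52, 61, 78, 90, 45, 67, 83]
form_b = [38, 49, 65, 74, 88, 55, 62, 79]
cut_scores = [50, 70, 85]  # three cuts give four performance categories

p_o, kappa = decision_consistency(form_a, form_b, cut_scores)
print(f"consistent classifications: {p_o:.2f}, kappa: {kappa:.2f}")
```

In this made-up data set, two candidates whose scores fall close to the first cut score change category between forms, which shows why classification consistency depends on where the performance standards are set and not only on how stable the scores are.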
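
The item analysis point above can also be sketched in code. The example below assumes dichotomously scored items (1 = correct, 0 = incorrect) and computes two common statistics: item difficulty as the proportion of candidates answering correctly, and an upper-lower discrimination index. The response data, the function name and the flagging thresholds (0.9 and 0.2) are hypothetical choices for illustration, and the sketch does not attempt to detect susceptibility to guessing.

```python
def item_analysis(responses, easy=0.9, hard=0.2):
    """Return difficulty, discrimination and a flag for each item.

    responses: one list of 0/1 item scores per candidate.
    easy, hard: illustrative thresholds, not established standards.
    """
    n = len(responses)
    n_items = len(responses[0])
    totals = [sum(r) for r in responses]
    # Rank candidates by total score, highest first (ties keep input order).
    order = sorted(range(n), key=lambda i: totals[i], reverse=True)
    half = n // 2
    upper, lower = order[:half], order[-half:]
    report = []
    for j in range(n_items):
        # Difficulty: proportion of all candidates answering item j correctly.
        p = sum(r[j] for r in responses) / n
        # Discrimination: upper-group minus lower-group proportion correct.
        d = (sum(responses[i][j] for i in upper)
             - sum(responses[i][j] for i in lower)) / half
        if p > easy:
            flag = "too easy"
        elif p < hard:
            flag = "too difficult"
        else:
            flag = "ok"
        report.append({"item": j, "difficulty": p,
                       "discrimination": d, "flag": flag})
    return report

# Hypothetical responses: six candidates, four items.
responses = [
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 1, 0],
    [1, 1, 1, 0],
    [1, 0, 0, 0],
    [1, 0, 0, 0],
]

report = item_analysis(responses)
for row in report:
    print(row)
```

Here the first item is answered correctly by everyone and the last by no one, so both are flagged for review, while the second item separates the higher-scoring and lower-scoring halves of the group completely.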

LOVELY PROFESSIONAL UNIVERSITY 185
