Page 195 - DEDU504_EDUCATIONAL_MEASUREMENT_AND_EVALUATION

Unit 15 : Norm Referenced Test

15.3 Features of Norm Referenced Test (NRT)

Norm-referenced tests (NRTs) compare a person’s score against the scores of a group of people
who have already taken the same exam, called the “norming group.” When a newspaper reports
a school’s results as a percentile -- “the Lincoln school ranked at the 49th
percentile” -- or when your child’s score is reported that way -- “Jamal scored at the 63rd
percentile” -- the test is usually an NRT.
Most achievement NRTs are multiple-choice tests : Some also include open-ended, short-answer
questions. The questions on these tests mainly reflect the content of nationally used textbooks,
not the local curriculum. This means that students may be tested on material that your local schools or
state education department judged less important and therefore did not teach.
Creating the bell curve.
NRTs are designed to “rank-order” test takers -- that is, to compare students’ scores : A commercial
norm-referenced test does not compare all the students who take the test in a given year. Instead,
testmakers select a sample from the target student population (say, ninth graders). The test is
“normed” on this sample, which is supposed to fairly represent the entire target population (all
ninth graders in the nation). Students’ scores are then reported in relation to the scores of this
“norming” group.
To make comparison easier, testmakers create exams whose results look at least
somewhat like a bell-shaped curve (the “normal” curve, shown in the diagram). Testmakers
build the test so that most students score near the middle, and only a few score low (the
left side of the curve) or high (the right side of the curve).
Scores are usually reported as percentile ranks : The scores range from the 1st percentile to the 99th
percentile, with the median score of the norming group at the 50th percentile. If Jamal scored at the 63rd
percentile, it means he scored higher than 63% of the test takers in the norming group. Scores
can also be reported as “grade equivalents,” “stanines,” and “normal curve equivalents.”
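The percentile arithmetic can be illustrated with a short Python sketch. It assumes that a percentile rank is the percentage of norming-group scores strictly below the student's score, limited to the range 1 to 99; published tests may follow slightly different conventions (for example, counting half of the tied scores). The function name `percentile_rank` and the scores in `norming_group` are invented for illustration and do not come from any real test.

```python
from bisect import bisect_left

def percentile_rank(score, norming_scores):
    """Percent of the norming group scoring strictly below `score`, kept within 1..99."""
    ordered = sorted(norming_scores)
    below = bisect_left(ordered, score)      # number of norming scores lower than `score`
    pct = round(100 * below / len(ordered))
    return max(1, min(99, pct))              # reports run from the 1st to the 99th percentile

# Hypothetical norming group of 20 raw scores (assumption, for illustration only)
norming_group = [12, 15, 17, 18, 19, 20, 20, 21, 21, 22,
                 22, 23, 23, 24, 25, 26, 27, 29, 31, 34]

print(percentile_rank(24, norming_group))    # 13 of 20 scores are lower, so this prints 65
```

Note that the student is ranked against the fixed norming group, not against the other students who happen to take the test in the same year.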
One more question right or wrong can cause a big change in the student’s score : In some cases,
having one more correct answer can cause a student’s reported percentile rank to jump more
than ten percentile points. When interpreting a score, it is very important to know how much the
percentile rank would change from getting one or two more questions right.
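A small sketch shows why this happens: when most of the norming group is bunched near the middle, each raw-score point in that region passes a large share of the group. The frequency table `norming_counts` below is hypothetical (100 invented students on a 20-question test), and the percentile convention (percent of the norming group scoring strictly below, kept within 1 to 99) is an assumption.

```python
# Hypothetical norming data: raw score -> number of norming-group students with that score
norming_counts = {10: 2, 11: 3, 12: 5, 13: 10, 14: 20, 15: 25,
                  16: 18, 17: 9, 18: 5, 19: 2, 20: 1}

def percentile_table(counts):
    """Map each raw score to the percent of the norming group scoring below it."""
    total = sum(counts.values())
    below = 0
    table = {}
    for raw in sorted(counts):
        table[raw] = max(1, min(99, round(100 * below / total)))
        below += counts[raw]
    return table

table = percentile_table(norming_counts)

# Change in percentile rank from answering one more question correctly
jumps = {raw: table[raw + 1] - table[raw] for raw in table if raw + 1 in table}

for raw, gain in jumps.items():
    print(f"{raw} -> {raw + 1} correct: percentile {table[raw]} -> {table[raw + 1]} (+{gain})")
```

With these invented numbers, moving from 15 to 16 correct answers raises the percentile rank from 40 to 65, while moving from 19 to 20 raises it only from 97 to 99.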
In making an NRT, it is often more important to choose questions that sort people along the
curve than it is to make sure that the content covered by the test is adequate : The tests sometimes
emphasize small and meaningless differences among test takers. Since the tests are made to sort
students, most of the things everyone knows are not tested. Questions may be obscure or tricky,
in order to help rank-order the test takers.
Tests can be biased : Some questions may favor one kind of student or another for reasons that
have nothing to do with the subject area being tested. Non-school knowledge that is more
commonly learned by middle- or upper-class children is often included in tests. To help make the
bell curve, testmakers usually eliminate questions that students with low overall scores might
get right but those with high overall scores get wrong. Thus, most questions that favor minority
groups are eliminated.
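The elimination step described above can be sketched with the upper-lower discrimination index, a common item-analysis statistic. The text does not say which statistic testmakers actually use, so the choice of index, the half-and-half split, the "drop if zero or negative" rule, and the `responses` data are all assumptions made for illustration.

```python
# Hypothetical answer sheets: one row per student, one column per question (1 = right, 0 = wrong)
responses = [
    [1, 1, 1, 0],
    [1, 1, 1, 0],
    [1, 1, 0, 1],
    [1, 0, 0, 1],
    [0, 0, 0, 1],
    [1, 0, 0, 1],
]

def discrimination_index(rows, item):
    """Share of the top-scoring half answering `item` correctly minus the share of the bottom half."""
    ranked = sorted(rows, key=sum, reverse=True)   # order students by total score, highest first
    half = len(ranked) // 2
    upper, lower = ranked[:half], ranked[-half:]
    p_upper = sum(row[item] for row in upper) / half
    p_lower = sum(row[item] for row in lower) / half
    return p_upper - p_lower

indices = [discrimination_index(responses, i) for i in range(len(responses[0]))]

# Assumed rule: flag any question that low scorers answer at least as well as high scorers
flagged = [i for i, d in enumerate(indices) if d <= 0]

for i, d in enumerate(indices):
    print(f"question {i}: D = {d:+.2f}")
print("flagged for removal:", flagged)
```

In this invented data set the last question is answered correctly mostly by students with low total scores, so its index is negative and it is the only one flagged, regardless of whether its content is worth testing.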
NRTs usually have to be completed within a time limit : Some students do not finish, even if they
know the material. This can be particularly unfair to students whose first language is not English
or who have learning disabilities. This “speededness” is one way testmakers sort people out.

LOVELY PROFESSIONAL UNIVERSITY 189
