Page 119 - DEDU504_EDUCATIONAL_MEASUREMENT_AND_EVALUATION_ENGLISH
Unit 9 : Test Standardization
intended outcomes of the course, keeping in view the maturity level of the pupils being tested
and the item type or item form that best suits the content elements.
Three other factors characterise the objective test item:
(iv) Ensuring uniformity of response
Each test item should be worded in precise, unambiguous and comprehensible language so that
the intended response is clearly indicated. This is particularly necessary when supply-type
items are used. Uniform responses cannot be expected if the distractors in a multiple-choice
item are not plausible, if the task is not properly set in the stem of the item, if the key is
doubtful, or if more than one answer is possible because the needed qualifiers are missing
from the stem.
(v) Avoiding clues and suggestions
Clues can be found in the stem of the item, in the key and even in the distractors.
Associational clues, verbal clues and determiner clues are the most common. Prefixes such as
an- and un-, article clues (a, an, the) and determiners such as always, never and seldom are
to be avoided.
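As a rough illustration of screening items for two of these clue types, the sketch below flags the determiners named above and stems that end in "a" or "an". The function names and the word list are hypothetical helpers written for this example, not part of any standard item-analysis tool, and a check like this can only supplement an editor's judgement.

```python
import re

# Determiners taken from the text above; extend the set as needed (assumption).
SPECIFIC_DETERMINERS = {"always", "never", "seldom"}


def find_determiners(text):
    """Return the flagged determiners that occur in a stem or option."""
    words = re.findall(r"[a-z]+", text.lower())
    return sorted(set(words) & SPECIFIC_DETERMINERS)


def has_article_clue(stem):
    """True if the stem ends in 'a' or 'an' (optionally followed by a blank).

    Such an ending hints at whether the key begins with a vowel sound.
    """
    pattern = r"\b(a|an)\s*[_.:]*\s*$"
    return re.search(pattern, stem.strip().lower()) is not None


if __name__ == "__main__":
    print(find_determiners("Metals always conduct electricity."))
    print(has_article_clue("An animal that eats only plants is called an ____."))
```

Rewording a flagged stem so that the article appears in each option ("a carnivore", "an herbivore") is one way to remove the clue.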
(vi) Freedom from ambiguity
Ambiguous language, imprecise wording, unfamiliar vocabulary, vague directional words and
incongruent mechanics of writing that produce grammatically incorrect statements (the stem
of the item not tallying with each option) all lead to ambiguity in terms of the intended
response.
(vii) Reasonable Difficulty Level : Determining the optimum difficulty level is a serious problem
on which experts do not agree. However, the consensus is that the test as a whole should
have about 50% difficulty for the average pupil.
The modern practice in a standardised test is to arrange items so that they cover a wide
range of difficulty in ascending order, from easy to difficult. This lets less able
pupils attempt the maximum number of items before they encounter progressively more
difficult ones.
Likewise, brighter pupils need not spend much time on the easy items, which come at
the beginning. Thus, when a gradual, continuous increase in difficulty is made the basis for
ordering the items in the test, both less able and brighter students can score the maximum.
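The ordering described above can be sketched in a few lines. This example assumes that an item's difficulty is estimated by the proportion of pupils who answer it correctly (a common convention that this section does not itself define), so a higher proportion means an easier item. The function names and the sample data are illustrative only.

```python
def proportion_correct(responses):
    """responses: one row per pupil, each row a list of 1 (right) or 0 (wrong).

    Returns, for each item, the proportion of pupils who answered correctly.
    """
    n_pupils = len(responses)
    n_items = len(responses[0])
    return [sum(row[j] for row in responses) / n_pupils for j in range(n_items)]


def order_easy_to_hard(responses):
    """Return item indices arranged from the easiest item to the hardest."""
    p = proportion_correct(responses)
    # A higher proportion correct means an easier item, so sort descending.
    return sorted(range(len(p)), key=lambda j: p[j], reverse=True)


if __name__ == "__main__":
    # Made-up data: 4 pupils, 3 items.
    data = [
        [1, 0, 1],
        [1, 0, 0],
        [1, 1, 1],
        [0, 0, 1],
    ]
    p = proportion_correct(data)
    print("proportion correct per item:", p)
    print("mean over the test:", sum(p) / len(p))
    print("easy-to-hard order:", order_easy_to_hard(data))
```

Under the same assumption, the mean of the per-item proportions gives a quick check on whether the test as a whole sits near the 50% level mentioned above.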
(viii) Acceptable Discriminating Power : The discriminating power of a test item refers to the
quality or magnitude of response that may be expected from individuals along a defined scale,
in accordance with differences in their achievement due to varying degrees of ability. In
other words, superior-ability pupils should answer the item correctly more often than
those with inferior ability. This suggests a method by which the power of a test item to
discriminate between groups of pupils may be determined. For calculating the discriminating
index (D.I.), Kelley grouped pupils on the basis of their scores into three groups: an upper
group of 27% making the highest scores, a lower group of 27% making the lowest scores, and
a middle group of 46%, which is not considered in calculating D.I.
The next step is to take a count for every item in the test. For each item, count the number
of students from the upper or higher group (27%) and from the lower group (27%) who
answered the item correctly, as shown in Table 1. D.I. can be calculated by the formula :
$$\text{D.I.} = \frac{R_H - R_L}{N_H}$$
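The calculation can be sketched as follows. The symbols are not defined on this page, so the example assumes that RH and RL are the numbers of pupils in the higher and lower groups who answered the item correctly, and that NH is the number of pupils in the higher group. The function name, the rounding of the 27% group size and the sample data are likewise choices made for this illustration.

```python
def discrimination_index(total_scores, item_correct, fraction=0.27):
    """Estimate D.I. for one item using upper and lower score groups.

    total_scores: total test score for each pupil.
    item_correct: 1 if that pupil answered this item correctly, else 0.
    fraction: share of pupils placed in each extreme group (27% in the text).
    """
    n = len(total_scores)
    # Rounding the group size to a whole number of pupils is an assumption.
    group_size = max(1, round(n * fraction))
    # Rank pupil indices by total score, highest first.
    ranked = sorted(range(n), key=lambda i: total_scores[i], reverse=True)
    upper = ranked[:group_size]
    lower = ranked[-group_size:]
    r_h = sum(item_correct[i] for i in upper)  # RH (assumed meaning)
    r_l = sum(item_correct[i] for i in lower)  # RL (assumed meaning)
    n_h = len(upper)                           # NH (assumed meaning)
    return (r_h - r_l) / n_h


if __name__ == "__main__":
    # Made-up data: 20 pupils, so each extreme group holds 5 pupils.
    scores = list(range(95, -5, -5))
    item = [1, 1, 1, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 0, 1, 0, 1, 0, 0, 0]
    print(discrimination_index(scores, item))
```

With these sample data, four of the five pupils in the higher group and one of the five in the lower group answer correctly, so D.I. = (4 - 1) / 5 = 0.6. The middle 46% of pupils play no part in the result.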
LOVELY PROFESSIONAL UNIVERSITY 113