Page 89 - DEDU504_EDUCATIONAL_MEASUREMENT_AND_EVALUATION


Unit 7: Reliability – Types, Methods and Usability

to another. This consistency is influenced by two types of errors: first, how the test questions
have been selected, that is, whether the questions are appropriate or not; and second, heterogeneity of
behaviour, that is, whether the same type of behaviour has been studied throughout. If a test studies
different types of behaviour at the same time, it will be less reliable than a test which studies
a single type of behaviour. For example, if a test contains only multiplication questions
and another test has addition, subtraction, division and multiplication questions, then the
first test will have more inter-item consistency than the second one, because if a
student scores 10 in the second test, it is difficult to tell whether the student is good at
addition, subtraction, division or multiplication. So, subjects cannot be meaningfully compared
using heterogeneous tests. The question then arises of ascertaining whether the test which we
are going to use for prediction is homogeneous or heterogeneous. Though a homogeneous
test is good, a further question arises: can a homogeneous test predict performance on a
heterogeneous test? Therefore, if we have to measure heterogeneous behaviour, we have to use a
heterogeneous test, and the amount of error variance in it will have to be controlled.
For instance, an intelligence test that is heterogeneous can still predict well if it is in
battery form, where each part measures only one ability at a time and the battery as a whole
measures many abilities. There are many formulae for calculating the reliability coefficient in
this method, but the following formula is the most prevalent and useful. It is called the K-R formula:

$$ r_{11} = \frac{n}{n-1} \times \frac{\sigma^{2} - \sum pq}{\sigma^{2}} $$

Where, r₁₁ = Reliability coefficient of the whole test
n = Number of items in the test
σ = Standard deviation of scores in the test

p = Proportion of students solving each question correctly
q = (1 – p) Proportion of students solving each question incorrectly
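
As a quick illustration of how the values are substituted in the K-R formula, suppose a test has n = 10 items, the standard deviation of the total scores is σ = 2.5 (so σ² = 6.25), and ∑ pq = 2.0. These figures are purely hypothetical and are not taken from any real test. Then:

$$ r_{11} = \frac{10}{10-1} \times \frac{6.25 - 2.0}{6.25} = \frac{10}{9} \times 0.68 \approx 0.76 $$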
In 1951, Cronbach standardized this formula mathematically and arrived at this conclusion:
“K-R coefficient is actually the mean of all split-half coefficients resulting from different
splittings of a test but, unless the test items are highly homogeneous, the K-R coefficient
will be lower than the split-half reliability.”
That is, the K-R reliability is equal to the split-half reliability only on the condition that the
test is homogeneous. This method is not suitable for estimating the reliability of a
heterogeneous test.
In order to find the reliability coefficient, it is first determined what proportion of students
have solved each question correctly and what proportion of students have solved each
question incorrectly. These are denoted by p and q respectively. Thus, the values of p and q
are found for each question and multiplied with each other, which gives pq.
The pq values of all questions are then added, which gives ∑ pq. After that, the
total scores of each student in the whole test are taken, and their standard deviation (σ) is
calculated and squared, which gives σ². Finally, these values are substituted in the above formula
and thus the reliability coefficient is calculated.
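
The steps above can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions: the function name `kr20`, the 0/1 scoring of each answer, and the small response table are all hypothetical, and the variance is computed by dividing by the number of students (the text does not say whether N or N – 1 should be used).

```python
def kr20(responses):
    """Estimate K-R reliability from right/wrong item scores.

    responses: one list per student, holding 1 (correct) or 0 (incorrect)
    for each item. This layout is an assumption made for illustration.
    """
    n_students = len(responses)
    n_items = len(responses[0])
    if n_items < 2:
        raise ValueError("at least two items are needed")

    # Step 1: for each item, find p (proportion correct) and q = 1 - p,
    # multiply them, and add the products to obtain sum(pq).
    sum_pq = 0.0
    for j in range(n_items):
        p = sum(row[j] for row in responses) / n_students
        q = 1 - p
        sum_pq += p * q

    # Step 2: take each student's total score and compute the squared
    # standard deviation (variance) of those totals.
    # Assumption: population variance, i.e. dividing by n_students.
    totals = [sum(row) for row in responses]
    mean_total = sum(totals) / n_students
    variance = sum((t - mean_total) ** 2 for t in totals) / n_students
    if variance == 0:
        raise ValueError("all students have the same total score")

    # Step 3: substitute the values in the K-R formula.
    return (n_items / (n_items - 1)) * (variance - sum_pq) / variance


# Hypothetical data: 5 students answering 4 items.
responses = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]
print(round(kr20(responses), 3))  # prints 0.8
```

In this made-up table, ∑ pq = 0.8 and σ² = 2.0, so the coefficient is (4/3) × (2.0 – 0.8)/2.0 = 0.8.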
Limitations
(a) The reliability coefficient obtained from this method is somewhat less than that
obtained by other methods.
(b) The K-R formula is based on the basic assumption that the difficulty level of all items
will be the same, but in practice, it is not possible.

LOVELY PROFESSIONAL UNIVERSITY 83
