Page 125 - DEDU504_EDUCATIONAL_MEASUREMENT_AND_EVALUATION_ENGLISH

Unit 9 : Test Standardization



            9.4.6 Reliability Coefficient
            Estimating reliability requires correlating scores on two equivalent forms of the same test, given
            simultaneously to the same group of students using the same procedure. The resulting correlation
            is the coefficient of reliability. The following types of coefficients are commonly used.
            Retesting Coefficient : When only one form of the test is available, it is given to the same group
            of pupils twice under similar testing conditions. The retesting coefficient is the correlation
            coefficient between the two sets of scores. The second administration should neither follow the
            first too quickly, lest scores rise significantly through memory, nor be delayed so long that
            forgetting operates to a large extent.
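            The retesting coefficient can be sketched in a few lines of Python. This is only an illustration:
            the helper name pearson_r and the two score lists are hypothetical, and the ordinary Pearson
            product-moment correlation is assumed to be the correlation intended.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length score lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two score lists of equal length (at least 2 pupils)")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products and sums of squares of the deviations from the means.
    cross = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("scores must vary within each administration")
    return cross / sqrt(ss_x * ss_y)


# Hypothetical scores of six pupils on two administrations of the same test.
first_administration = [12, 15, 9, 20, 17, 11]
second_administration = [13, 14, 10, 19, 18, 12]

retest_coefficient = pearson_r(first_administration, second_administration)
print(round(retest_coefficient, 3))
```

            A coefficient close to 1 indicates that pupils kept nearly the same relative standing on both
            administrations.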
            Chance Half Coefficient (Split Half) : The test is given to a group of pupils, and separate scores
            are then obtained for the two halves of the test. The two halves can be formed by :
            (i)  taking the odd-numbered and the even-numbered items; or

            (ii)  obtaining separate scores on items 1, 4, 5, 8, 9, 12, 13 etc. and on items 2, 3, 6, 7, 10, 11 etc., to
                equalise the difficulty of the two halves when the items are arranged in order of difficulty.

            The correlation coefficient between the two sets of scores indicates the degree of agreement
            between the two chance halves of the test. Because this correlation describes a test only half
            as long as the original, the reliability coefficient of the whole test is then estimated from it
            by using the Spearman-Brown Prophecy formula :
            $$ r_{12} = \frac{2\, r_{\frac{1}{2}\frac{1}{2}}}{1 + r_{\frac{1}{2}\frac{1}{2}}} $$
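            The split-half procedure with an odd-even division can be sketched as follows. The function
            names and the item scores (1 = correct, 0 = wrong) are hypothetical, and the Pearson correlation
            is assumed for the correlation between the two halves.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of half-test scores."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cross = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cross / sqrt(ss_x * ss_y)


def spearman_brown(r_half):
    """Step the half-test correlation up to the length of the whole test."""
    return 2 * r_half / (1 + r_half)


def split_half_reliability(item_scores):
    """item_scores holds one row per pupil and one 0/1 entry per item, in test order."""
    odd_totals = [sum(row[0::2]) for row in item_scores]   # items 1, 3, 5, ...
    even_totals = [sum(row[1::2]) for row in item_scores]  # items 2, 4, 6, ...
    r_half = pearson_r(odd_totals, even_totals)
    return r_half, spearman_brown(r_half)


# Hypothetical results of five pupils on a six-item test.
item_scores = [
    [1, 1, 1, 1, 1, 0],
    [1, 1, 1, 0, 0, 0],
    [1, 0, 0, 0, 0, 0],
    [1, 1, 1, 1, 1, 1],
    [1, 1, 0, 1, 0, 0],
]

r_half, r_full = split_half_reliability(item_scores)
print(round(r_half, 3), round(r_full, 3))
```

            For a positive correlation between the halves, the stepped-up value is always larger than the
            half-test correlation, which reflects the greater reliability expected of the longer test.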
            Foot Rule Coefficient : This coefficient may underestimate, but never overestimates, the
            reliability coefficient. It is not the most accurate method, but it is simple: it requires only
            three quantities from the test, namely the arithmetic mean, the standard deviation of the scores
            and the number of items in the test. Owing to its sufficient accuracy and simplicity, this method
            is recommended for use by teachers in estimating the reliability of their informal objective
            examinations. The formula used is given below :
            $$ r_H = 1 - \frac{\bar{x}\,(k - \bar{x})}{k\,(SD)^2} $$

            where      r_H = coefficient of correlation (the reliability estimate);

                   \bar{x} = mean of the test scores;
                         k = number of items in the test; and

                    (SD)^2 = variance (standard deviation squared).
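            A teacher could apply the foot rule along the following lines. This sketch assumes the formula
            has the form r_H = 1 - x(k - x) / (k SD^2), with x the mean; the leading "1 -" and the use of
            the variance computed over all pupils (dividing by n) are assumptions, and the function name
            and the scores are hypothetical.

```python
def foot_rule_coefficient(total_scores, k):
    """Foot rule estimate of reliability from the mean, the variance and the item count k.

    Assumed form: r_H = 1 - mean * (k - mean) / (k * variance).
    """
    n = len(total_scores)
    if n < 2 or k < 1:
        raise ValueError("need at least two pupils and one item")
    mean = sum(total_scores) / n
    # Variance over the whole group (divide by n); this choice is an assumption.
    variance = sum((s - mean) ** 2 for s in total_scores) / n
    if variance == 0:
        raise ValueError("scores must vary")
    return 1 - mean * (k - mean) / (k * variance)


# Hypothetical total scores of eight pupils on a 50-item objective test.
scores = [30, 38, 22, 41, 27, 35, 19, 28]
r_h = foot_rule_coefficient(scores, k=50)
print(round(r_h, 3))
```

            Only the total score of each pupil is needed, which is why the method suits informal classroom
            tests where item-level results are not recorded.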
            An estimate of the reliability coefficient may come out high or low depending on the group of
            pupils on which it is based. It must therefore be based on a known and appropriate range of ages
            or grade placement of pupils if it is to mean what it purports to mean. Hence the reliability
            coefficient is neither an entirely adequate device nor, for that matter, the only method of
            indicating the internal consistency of a test.
            Standard Error of Measurement (SEM) : Another popular device by which test reliability can be
            estimated is the standard error of measurement. The standard error indicates the degree of
            accuracy of the test score obtained by each pupil on a test.
            Here accuracy refers to the magnitude of sampling errors. Since the SEM is not affected by the
            range of talent of the pupil group on which it is based (as the reliability coefficient is), it is
            recognised as a more concrete way of indicating test reliability.
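            The section gives no formula for the SEM. A minimal sketch, assuming the commonly used
            relation SEM = SD x sqrt(1 - r), where SD is the standard deviation of the test scores and r is
            the reliability coefficient, is shown below. The function name and the figures are hypothetical.

```python
from math import sqrt


def standard_error_of_measurement(sd, reliability):
    """SEM from the score standard deviation and a reliability coefficient.

    Assumed relation: SEM = SD * sqrt(1 - r).
    """
    if sd < 0:
        raise ValueError("standard deviation cannot be negative")
    if not 0 <= reliability <= 1:
        raise ValueError("reliability must lie between 0 and 1")
    return sd * sqrt(1 - reliability)


# Hypothetical test: standard deviation of 8 score points, reliability of 0.84.
sem = standard_error_of_measurement(sd=8.0, reliability=0.84)
print(round(sem, 2))
```

            The result is expressed in score points, so it can be read directly against a pupil's obtained
            score, unlike the reliability coefficient, which is a pure number.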
            Adequacy and Objectivity in Test Reliability : “Adequacy” is the degree to which a test samples
            sufficiently widely from the subject so that the resulting scores are representative of relative total




                                               LOVELY PROFESSIONAL UNIVERSITY                                    119