Front | Back |
Classical test theory
|
1. Observed score = true score + error component;
2. An individual's true score is the average score in a hypothetical distribution of scores that would be obtained if the person took the test an infinite number of times.
3. The true score is construed to be that portion of the observed score that reflects whatever ability, trait, or characteristic the test assesses;
4. Error component = the difference between the observed score and the true score; it represents any other factors that may enter into the observed score as a consequence of the measurement process.
A larger number of observations will produce a more reliable result than a smaller number of observations.
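A minimal sketch of the model in Python (all values hypothetical): with a large number of simulated administrations, the average observed score converges on the true score because the errors cancel out.

import random

true_score = 50.0   # hypothetical true score
error_sd = 5.0      # assumed spread of the error component

# observed = true + error, simulated over many administrations
observed = [true_score + random.gauss(0, error_sd) for _ in range(100_000)]

# The mean of the hypothetical distribution approaches the true score.
print(sum(observed) / len(observed))   # ~50.0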
|
Sample variance OR population variance
|
Average amount of variability in a group of scores
s2(total) = s2(true variance) + s2(error variance)
Without score variability, tests could not help us make comparative decisions about people.
True variance consists of those differences among the scores of individuals within a group that reflect their standing on whatever construct the test is assessing.
Error variance is made up of factors affecting the scores that are irrelevant to what the test is assessing.
|
Reliability coefficient
|
Reliability coefficient = true score variance / total score variance
If all test score variance were true variance, the reliability would be perfect (1.00).
A number that estimates the proportion of the variance in a group of test scores that is accounted for by true score differences; the remainder (1 - rxx) is the proportion accounted for by error stemming from one or more sources.
Evaluating score reliability involves (a) determining what possible sources of error may enter into test scores and (b) estimating the magnitude of those errors.
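A hedged sketch of the decomposition in Python (all numbers hypothetical): simulating true scores and errors separately shows the reliability coefficient recovering the ratio of true variance to total variance.

import random
from statistics import variance

true_scores = [random.gauss(50, 10) for _ in range(10_000)]  # assumed true spread
errors = [random.gauss(0, 5) for _ in range(10_000)]         # assumed error spread
observed = [t + e for t, e in zip(true_scores, errors)]

# reliability = true variance / total variance
print(round(variance(true_scores) / variance(observed), 2))  # about 100/125 = 0.80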
|
Reliability
|
A quality of test scores suggesting that they are sufficiently consistent and free from measurement error to be useful.
|
Measurement error
|
Any fluctuation in scores that results from factors related to the measurement process that are irrelevant to what is being measured.
|
6 sources of error
|
1. Interscorer differences - error that may enter into scores whenever the element of subjectivity plays a part in scoring.
2. Time sampling error - variability inherent in test scores as a function of the fact that they are obtained at one point in time rather than another; behaviour may fluctuate over time.
3. Content sampling error - the trait-irrelevant variability that can enter into test scores as a result of fortuitous factors related to the content of the specific items included on the test.
4. Interitem inconsistency - error in scores that results from fluctuations in performance across the items within a test, as opposed to content sampling error, which emanates from the particular configuration of items included in the test as a whole.
5. Interitem inconsistency and content heterogeneity combined.
6. Time and content sampling error combined.
|
Scorer reliability
|
The basic method for estimating error due to interscorer differences:
have two or more individuals score the same set of tests, so that for each test taker's performance two or more independent scores are generated.
The correlation between the sets of scores generated is the scorer reliability coefficient.
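A minimal sketch in Python (the ratings are hypothetical): the Pearson correlation between two scorers' independent ratings of the same tests serves as the scorer reliability estimate.

from statistics import correlation   # Python 3.10+

# Hypothetical ratings of the same eight tests by two independent scorers.
scorer_a = [12, 15, 9, 20, 14, 11, 18, 16]
scorer_b = [11, 16, 10, 19, 15, 10, 17, 15]

# A coefficient near 1.00 indicates little error due to interscorer differences.
print(round(correlation(scorer_a, scorer_b), 3))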
|
Test-retest reliability or stability coefficient
|
The correlation between the scores obtained from two administrations of the same test.
May be viewed as an index of the extent to which scores are likely to fluctuate as a result of time sampling error.
The time interval between the two administrations should always be reported, since scores can be expected to fluctuate more over longer intervals.
|
Alternate-form reliability
|
Intended to estimate the amount of error in test scores that is attributable to content sampling error
Two or more different forms of the test - identical in purpose but differing in specific content - need to be prepared and administered to the same group of subjects.
The test takers' scores on the two forms are then correlated to obtain the alternate-form reliability coefficient.
Coefficients of .90 and above are considered high.
|
Split-half reliability
|
A measure of consistency in which a test is split in two and the scores on each half are compared with one another.
Once this has been accomplished, the correlation between the scores on one half of the test and those on the other half is used to derive a split-half reliability coefficient; see the sketch below.
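A minimal sketch in Python (the item scores are hypothetical), using an odd-even split:

from statistics import correlation   # Python 3.10+

# item_scores[person][item]: hypothetical 0/1 item scores for six test takers.
item_scores = [
    [1, 0, 1, 1, 0, 1, 1, 0],
    [1, 1, 1, 1, 1, 1, 0, 1],
    [0, 0, 1, 0, 0, 1, 0, 0],
    [1, 1, 1, 1, 0, 1, 1, 1],
    [0, 1, 0, 1, 0, 0, 1, 0],
    [1, 1, 1, 0, 1, 1, 1, 1],
]

# Odd-even split: sum each person's odd-numbered and even-numbered items.
odd_half = [sum(p[0::2]) for p in item_scores]
even_half = [sum(p[1::2]) for p in item_scores]

r_halves = correlation(odd_half, even_half)
# Each half is only half as long as the full test, so this coefficient is
# usually stepped up with the Spearman-Brown formula (next card) with n = 2.
print(round(r_halves, 3))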
|
Spearman brown formula
|
rsb = (n * rxx) / (1 + (n - 1) * rxx)
rsb = Spearman-Brown estimate of reliability
n = the multiplier by which test length is to be increased or decreased
rxx = the reliability coefficient obtained with the original test
The formula can be used to estimate the effect that lengthening or shortening a test by any amount will have on the obtained coefficient, as in the sketch below.
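A hedged sketch of the formula in Python (the example coefficients are hypothetical):

def spearman_brown(rxx: float, n: float) -> float:
    # rsb = n * rxx / (1 + (n - 1) * rxx)
    return n * rxx / (1 + (n - 1) * rxx)

# Stepping up a half-test correlation of .70 to full length (n = 2):
print(round(spearman_brown(0.70, 2), 3))    # 0.824
# Shortening a test to half its length (n = 0.5) from rxx = .90:
print(round(spearman_brown(0.90, 0.5), 3))  # 0.818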
|
Content heterogeneity
|
Results from the inclusion of items or sets of items that tap content knowledge or psychological functions that differ from those tapped by other items in the same test
|
Internal consistency
|
The extent to which performance is consistent across the items of a test; internal consistency estimates gauge the error due to interitem inconsistency.
|
Inter item correlation
|
The correlations among performance on all the items within a test.
|
Kuder-Richardson formula 20 (K-R 20)
|
rKR20 = (n / (n - 1)) * [(s2t - sum of p*q) / s2t]
n = number of items in the test
s2t = variance of total scores on the test
p = proportion of persons who pass each item
q = proportion of persons who fail each item
A formula applied to tests whose items are scored as right or wrong, or in any other dichotomous fashion, such as true or false, provided all the items are phrased so that the meaning of each alternative is uniform throughout the test.
Used to calculate interitem consistency.
Affected by (a) the number of items in the test (the more items, the better) and (b) the ratio of variability in test takers' performance across all the items in the test to total test score variance (as the ratio decreases, reliability improves); a sketch follows.
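A minimal sketch of K-R 20 in Python (the item scores are hypothetical); the population variance of total scores is assumed for s2t:

from statistics import pvariance

def kr20(item_scores):
    # item_scores[person][item]: 1 = pass, 0 = fail.
    n = len(item_scores[0])                       # number of items
    totals = [sum(person) for person in item_scores]
    s2t = pvariance(totals)                       # variance of total scores
    sum_pq = 0.0
    for item in range(n):
        p = sum(person[item] for person in item_scores) / len(item_scores)
        sum_pq += p * (1 - p)                     # q = 1 - p
    return (n / (n - 1)) * ((s2t - sum_pq) / s2t)

# Hypothetical dichotomous scores for five test takers on five items.
scores = [
    [1, 1, 0, 1, 1],
    [1, 0, 0, 1, 0],
    [1, 1, 1, 1, 1],
    [0, 0, 0, 1, 0],
    [1, 1, 0, 1, 1],
]
print(round(kr20(scores), 3))   # ~0.787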
|