Using Ordinal Rescore Measures to Monitor Rater Drift

Authors

  • John R. Donoghue
  • Adrienne Sgammato

DOI:

https://doi.org/10.64634/fgv13811

Keywords:

constructed response items, score, rescore, bias, Type I error, drift, omnibus measures, d-statistic, t-test

Abstract

When constructed response items are used on more than one occasion, a natural concern is whether the scoring is consistent (e.g., neither more lenient nor more strict) across the occasions. It is common to conduct trend scoring, in which a set of Occasion A responses is rescored at Occasion B. The responses are usually selected according to some rescore design, such as a balanced design (with an equal number from each score category), a design proportional to the distribution of Occasion A scores, or a mixture of the two. Recent work has demonstrated that treating the resulting two-way table of Occasion A by Occasion B scores as if it arose from multinomial sampling is incorrect and can yield seriously biased estimates of whether the scores are lower or higher at Occasion B. The present study builds on these results by incorporating ordinal measures of change. It contrasts the usual trend analysis with an alternative analysis that explicitly conditions on the rescore design and finds only the latter to be effective. The study also examines omnibus measures that combine the individual t-tests or d-statistics. These measures were somewhat conservative in Type I error control and had good power to detect drift. Omnibus measures based on t-tests had marginally higher power, with higher correct detection rates than those based on the d-statistic in 1%–8% of the cases. The difference between the best versions (E weighted, which is based on t-tests, vs. D weighted, which is based on d-statistics) was only 1.8%.
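The idea of conditioning on the rescore design can be illustrated with a minimal Python sketch. It is not the report's procedure: the function names, the use of a one-sample t-test on rescore differences within each Occasion A score category, the null value of zero within each category, and the sample-size weighting of the d-statistics are all assumptions made here for illustration. The abstract does not define the E weighted or D weighted measures, so they are not reproduced.

```python
import math
from collections import defaultdict
from statistics import mean, stdev


def per_category_stats(pairs):
    """Summarize rescore change separately for each Occasion A score.

    pairs: iterable of (score_a, score_b) tuples, one per rescored response.
    Returns {score_a: (n, mean_diff, d, t)}, where the difference is
    score_b - score_a, d is mean_diff / sd, and t is the one-sample
    t statistic against a mean difference of zero (an assumed null).
    """
    diffs = defaultdict(list)
    for a, b in pairs:
        diffs[a].append(b - a)

    out = {}
    for a, ds in sorted(diffs.items()):
        n = len(ds)
        m = mean(ds)
        s = stdev(ds) if n > 1 else 0.0
        if s > 0:
            d = m / s
            t = m / (s / math.sqrt(n))
        else:
            # No variation (or a single response): d and t are undefined.
            d = t = None
        out[a] = (n, m, d, t)
    return out


def weighted_mean_d(stats):
    """Hypothetical omnibus summary: sample-size-weighted mean of the d values."""
    usable = [(n, d) for n, _, d, _ in stats.values() if d is not None]
    total = sum(n for n, _ in usable)
    return sum(n * d for n, d in usable) / total if total else None


# Toy data: four rescored responses from each of two Occasion A categories.
pairs = [(1, 1), (1, 2), (1, 1), (1, 2), (2, 2), (2, 3), (2, 2), (2, 2)]
stats = per_category_stats(pairs)
omnibus_d = weighted_mean_d(stats)
```

Because each statistic is computed within an Occasion A score category, the result does not depend on how many responses the design drew from each category, which is the property the pooled two-way-table analysis lacks. A null of zero within a category is a simplification: responses in the lowest category, for example, can only stay the same or move up on rescoring.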

Suggested citation: Donoghue, J. R., & Sgammato, A. (2025). Using ordinal rescore measures to monitor rater drift (Research Report No. RR-25-15). ETS.

Published

2025-12-31