Reliability, effect size, and responsiveness of health status measures in the design of randomized and cluster-randomized trials

Contemp Clin Trials. 2005 Feb;26(1):45-58. doi: 10.1016/j.cct.2004.11.014. Epub 2005 Jan 27.

Abstract

Background: New health status survey instruments are often described by their psychometric (measurement) properties, such as Validity, Reliability, Effect Size, and Responsiveness. For cluster-randomized trials, another important statistic is the Intraclass Correlation (ICC) for the instrument within clusters. Studies using better instruments can be performed with smaller sample sizes, but better instruments may cost more in dollars or opportunity cost, or may yield poorer data quality because of the response burden of longer instruments.

Methods: We defined the psychometric statistics in terms of a mathematical model, and examined the power of a two-sample test as a function of the test-retest Reliability, Effect Size, Responsiveness, and Intraclass Correlation of the instrument. We examined the "cost-effectiveness" of using a one-item versus a five-item measure of mental health status.
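
As a rough illustration of the kind of relationship examined here, the sketch below assumes the classical measurement-error model (observed score = true score + independent error). Under that assumption, the standardized effect size on the observed scale equals the true-score effect size multiplied by the square root of Reliability. The function name, its parameters, and the normal (z-test) approximation are assumptions made for illustration; they are not taken from the paper.

```python
from math import sqrt
from statistics import NormalDist


def power_two_sample(n_per_group, true_effect_size, reliability, alpha=0.05):
    """Approximate power of a two-sided, two-sample z-test.

    Assumes the classical measurement-error model, so the observed
    standardized effect is true_effect_size * sqrt(reliability).
    The negligible rejection probability in the opposite tail is ignored.
    """
    z = NormalDist()
    # Attenuate the effect size for measurement error (assumption).
    observed_effect = true_effect_size * sqrt(reliability)
    z_crit = z.inv_cdf(1 - alpha / 2)
    # Expected value of the test statistic under the alternative.
    noncentrality = observed_effect * sqrt(n_per_group / 2)
    return z.cdf(noncentrality - z_crit)


if __name__ == "__main__":
    # Hypothetical inputs: true effect size 0.5, 64 persons per group.
    for r in (1.0, 0.8, 0.6, 0.4):
        print(f"Reliability {r:.1f}: power = {power_two_sample(64, 0.5, r):.2f}")
```

With the sample size held fixed, the computed power falls as Reliability falls, which is the trade-off the Methods describe in general terms.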

Findings: Under the standard model for measurement error, the psychometric statistics are all functions of the same error term. They are also functions of the setting in which they were estimated. In randomized trials, power is a function of Reliability and sample size, and a less reliable instrument can achieve the desired power if N is increased. In cluster-randomized trials, adequate power may be obtained by increasing the number of clusters per treatment group (and often the number of persons per cluster), as well as by choosing a more reliable instrument. The one-item measure of mental health status may be more cost-effective than the five-item measure in some situations.
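
The trade-off between Reliability, N, and the number of clusters can be sketched numerically. The example below is a minimal sketch under two assumptions not stated in the abstract: that the observed effect size is attenuated by the square root of Reliability (classical measurement-error model), and that clustering inflates the required sample size by the usual design effect, 1 + (m - 1) × ICC, for equal clusters of size m. The function names and the input values are hypothetical.

```python
from math import ceil, sqrt
from statistics import NormalDist


def n_per_group(true_effect_size, reliability, alpha=0.05, power=0.80):
    """Persons per group for a two-sided, two-sample z-test (normal approximation)."""
    z = NormalDist()
    # Observed effect size shrinks with lower Reliability (assumption).
    observed_effect = true_effect_size * sqrt(reliability)
    z_sum = z.inv_cdf(1 - alpha / 2) + z.inv_cdf(power)
    return ceil(2 * (z_sum / observed_effect) ** 2)


def clusters_per_group(n_individual, cluster_size, icc):
    """Clusters per group after inflating n by the design effect."""
    design_effect = 1 + (cluster_size - 1) * icc
    return ceil(n_individual * design_effect / cluster_size)


if __name__ == "__main__":
    # Hypothetical inputs: true effect size 0.5, clusters of 20, ICC 0.05.
    for r in (1.0, 0.5):
        n = n_per_group(0.5, r)
        k = clusters_per_group(n, cluster_size=20, icc=0.05)
        print(f"Reliability {r:.1f}: {n} persons or {k} clusters of 20 per group")
```

Under these assumptions the required N is inversely proportional to Reliability, so halving Reliability roughly doubles the number of persons needed per group. The ICC is treated here as a separate input, even though the Findings note that it depends on the same error term.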

Conclusion: If the goal is to diagnose or refer individual patients, an instrument with high Validity and Reliability is needed. In settings where the sample sizes are large or can be increased easily, any valid instrument may be cost-effective. It is likely that many published values of psychometric statistics are accurate only in settings similar to those in which they were estimated.

Publication types

  • Research Support, Non-U.S. Gov't
  • Research Support, U.S. Gov't, P.H.S.

MeSH terms

  • Cost-Benefit Analysis
  • Depressive Disorder / diagnosis
  • Depressive Disorder / epidemiology
  • Epidemiologic Factors
  • Health Surveys*
  • Humans
  • Interview, Psychological
  • Mental Health*
  • Psychological Tests*
  • Randomized Controlled Trials as Topic / economics
  • Randomized Controlled Trials as Topic / methods*
  • Reproducibility of Results
  • Research Design*
  • Sample Size