Abstract
Purpose
Recent US-based studies have used item response theory (IRT) to equate several self-report scales for depression and anxiety to the PROMIS depression and anxiety common metrics. The current study examines the validity of these US-based equating procedures for the Patient Health Questionnaire-9 (PHQ-9), Generalized Anxiety Disorder-7 (GAD-7) and Kessler 6 psychological distress scale (K6) when applied to a large online sample of Australian adults.
Methods
The sample comprised 3175 Australian adults recruited online. Each participant completed the PROMIS depression and anxiety item banks, the PHQ-9, the GAD-7 and the K6. Two scoring methods were used to convert scores on the PHQ-9, GAD-7 and K6 to the PROMIS depression and anxiety metrics. The converted scores were compared with the observed PROMIS depression and anxiety scores using intraclass correlations, mean differences, means of absolute differences and Bland–Altman limits of agreement.
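The agreement statistics named above can be illustrated with a minimal sketch. It is not the authors' analysis code: the function name `agreement_stats` and the toy scores are hypothetical, the intraclass correlation is assumed to be the two-way random-effects, absolute-agreement, single-measure form, ICC(2,1), and the limits of agreement are assumed to be the conventional mean difference ± 1.96 standard deviations. The abstract does not specify either choice.

```python
from statistics import mean, stdev


def agreement_stats(observed, equated):
    """Agreement between paired observed and equated scores.

    Assumptions (not stated in the abstract): ICC(2,1) and
    Bland-Altman limits of mean difference +/- 1.96 SD.
    """
    observed, equated = list(observed), list(equated)
    n, k = len(observed), 2  # n participants, k = 2 scores each

    # Paired differences: equated minus observed
    diffs = [e - o for o, e in zip(observed, equated)]
    mean_diff = mean(diffs)
    sd_diff = stdev(diffs)  # sample SD; must be non-zero for dz

    # Cohen's dz: mean paired difference divided by the SD of differences
    dz = mean_diff / sd_diff
    # Mean of absolute differences between the two scores
    mad = mean(abs(d) for d in diffs)
    # Bland-Altman 95% limits of agreement
    loa = (mean_diff - 1.96 * sd_diff, mean_diff + 1.96 * sd_diff)

    # Two-way ANOVA decomposition for ICC(2,1)
    pooled = observed + equated
    grand = mean(pooled)
    row_means = [(o + e) / k for o, e in zip(observed, equated)]
    col_means = [mean(observed), mean(equated)]
    ss_rows = k * sum((r - grand) ** 2 for r in row_means)
    ss_cols = n * sum((c - grand) ** 2 for c in col_means)
    ss_total = sum((x - grand) ** 2 for x in pooled)
    ss_err = ss_total - ss_rows - ss_cols
    msr = ss_rows / (n - 1)               # between-participant mean square
    msc = ss_cols / (k - 1)               # between-method mean square
    mse = ss_err / ((n - 1) * (k - 1))    # residual mean square
    icc = (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)

    return {"mean_diff": mean_diff, "dz": dz, "mad": mad,
            "loa": loa, "icc": icc}


# Toy data only; these are not values from the study.
observed_scores = [50, 55, 60, 45, 65]
equated_scores = [52, 54, 63, 44, 66]
print(agreement_stats(observed_scores, equated_scores))
```

The mean difference and dz capture systematic bias at the group level, whereas the mean of absolute differences and the limits of agreement describe how far an individual's equated score may fall from the observed score.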
Results
Statistically significant mean differences were identified in five of the eight equated scores, although the effect sizes were small (Cohen’s dz ≤ 0.25). The intraclass correlations were uniformly high (ICC ≥ 0.86). The mean of absolute differences between observed and equated scores for each metric and across scoring methods ranged from 4.23 to 5.33.
Conclusions
The results demonstrate the validity of generating PROMIS depression and anxiety scores from the PHQ-9, GAD-7 and K6 in an independent sample of Australians. The agreement between equated scores provides some assurance that researchers and clinicians can utilise the converted PHQ-9, GAD-7 and K6 scores on the PROMIS metrics without a substantial decrease in accuracy and precision at the group level.
Acknowledgements
Data for the current study were collected as part of NHMRC Project Grant 1043952. PB and AC are supported by NHMRC fellowships 1083311 and 1122544, respectively.
Ethics declarations
Conflict of interest
The authors report no conflict of interest.
Ethical Approval
The authors assert that all procedures contributing to this work comply with the ethical standards of the relevant national and institutional committees on human experimentation and with the Helsinki Declaration of 1975, as revised in 2008. Data collection and recruitment were approved by the Australian National University Human Research Ethics Committee (protocol #2013/509).
Cite this article
Sunderland, M., Batterham, P., Calear, A. et al. Validity of the PROMIS depression and anxiety common metrics in an online sample of Australian adults. Qual Life Res 27, 2453–2458 (2018). https://doi.org/10.1007/s11136-018-1905-5
DOI: https://doi.org/10.1007/s11136-018-1905-5