Difference in method of administration did not significantly impact item response: an IRT-based analysis from the Patient-Reported Outcomes Measurement Information System (PROMIS) initiative

Bjorner, Jakob B.; Rose, Matthias; Gandek, Barbara; Stone, Arthur A.; Junghaenel, Doerte U.; Ware, John E.

doi:10.1007/s11136-013-0451-4

Published: 23 July 2013

Quality of Life Research, Volume 23, pages 217–227 (2014)

Jakob B. Bjorner^1,2,3,
Matthias Rose^4,5,
Barbara Gandek^5,
Arthur A. Stone^6,
Doerte U. Junghaenel^6 &
John E. Ware Jr.^5,7

Abstract

Purpose

To test the impact of method of administration (MOA) on the measurement characteristics of items developed in the Patient-Reported Outcomes Measurement Information System (PROMIS).

Methods

Two non-overlapping parallel 8-item forms from each of three PROMIS domains (physical function, fatigue, and depression) were completed by 923 adults (age 18–89) with chronic obstructive pulmonary disease, depression, or rheumatoid arthritis. In a randomized cross-over design, subjects answered one form by interactive voice response (IVR) technology, paper questionnaire (PQ), personal digital assistant (PDA), or personal computer (PC) on the Internet, and a second form by PC, in the same administration. Structural invariance, equivalence of item responses, and measurement precision were evaluated using confirmatory factor analysis and item response theory methods.

Results

Multigroup confirmatory factor analysis supported equivalence of factor structure across MOA. Analyses by item response theory found no differences in item location parameters and strongly supported the equivalence of scores across MOA.

Conclusions

We found no statistically or clinically significant differences in score levels in IVR, PQ, or PDA administration as compared to PC. Availability of large item response theory-calibrated PROMIS item banks allowed for innovations in study design and analysis.

Abbreviations

CAT: Computerized adaptive testing
COPD: Chronic obstructive pulmonary disease
DEP: Depression
FAT: Fatigue
IRT: Item response theory
IVR: Interactive voice response
MOA: Method of administration
NLMIXED: SAS procedure for estimating mixed models
PC: Personal computer
PDA: Personal digital assistant
PF: Physical functioning
PQ: Paper questionnaire
PRO: Patient-reported outcomes
PROMIS: Patient-Reported Outcomes Measurement Information System
WLSMV: Weighted least squares with mean and variance adjustment

Acknowledgments

The Patient-Reported Outcomes Measurement Information System (PROMIS) is a National Institutes of Health (NIH) Roadmap initiative to develop a computerized system measuring patient-reported outcomes in respondents with a wide range of chronic diseases and demographic characteristics. PROMIS was funded by cooperative agreements to a Statistical Coordinating Center (Northwestern University, PI: David Cella, PhD, U01AR52177) and six Primary Research Sites (Duke University, PI: Kevin Weinfurt, PhD, U01AR52186; University of North Carolina, PI: Darren DeWalt, MD, MPH, U01AR52181; University of Pittsburgh, PI: Paul A. Pilkonis, PhD, U01AR52155; Stanford University, PI: James Fries, MD, U01AR52158; Stony Brook University, PI: Arthur Stone, PhD, U01AR52170; and University of Washington, PI: Dagmar Amtmann, PhD, U01AR52171). NIH Science Officers on this project are Deborah Ader, PhD, Susan Czajkowski, PhD, Lawrence Fine, MD, DrPH, Louis Quatrano, PhD, Bryce Reeve, PhD, William Riley, PhD, and Susana Serrate-Sztein, PhD. This manuscript was reviewed by the PROMIS Publications Subcommittee prior to external peer review. The authors would like to thank two anonymous PROMIS reviewers and two journal reviewers for comments on a previous version of this manuscript. See the web site at www.nihpromis.org for additional information on the PROMIS cooperative group.

Author information

Authors and Affiliations

QualityMetric, Lincoln, RI, USA
Jakob B. Bjorner
Department of Public Health, University of Copenhagen, Copenhagen, Denmark
Jakob B. Bjorner
National Research Centre for the Working Environment, Copenhagen, Denmark
Jakob B. Bjorner
Department of Psychosomatic Medicine and Psychotherapy, Medical Clinic, Charité, Universitätsmedizin, Berlin, Germany
Matthias Rose
Department of Quantitative Health Sciences, University of Massachusetts Medical School, Worcester, MA, USA
Matthias Rose, Barbara Gandek & John E. Ware Jr.
Department of Psychiatry and Behavioral Science, Stony Brook University, Stony Brook, NY, USA
Arthur A. Stone & Doerte U. Junghaenel
John Ware Research Group, Worcester, MA, USA
John E. Ware Jr.

Corresponding author

Correspondence to Jakob B. Bjorner.

Appendix

The standard graded response IRT model can be formulated:

$$ \log \left( \frac{P(x_{ji} \ge c)}{P(x_{ji} < c)} \right) = \alpha_{i} \left( \theta_{j} - (\lambda_{i} - \tau_{ic}) \right) $$

where x_ji is the response of person j to item i, θ_j is the latent health of person j (here: physical functioning, fatigue, or depression), α_i is the discrimination parameter for item i, λ_i is the location parameter for item i, and τ_ic is the category parameter for response category c of item i. An extended graded response model can be formulated in the following way:

$$ \log \left( \frac{P(x_{jiopqa} \ge c)}{P(x_{jiopqa} < c)} \right) = (\alpha_{i} + \alpha_{o} + \alpha_{p} + \alpha_{q}) \left( \theta_{j} - (\lambda_{i} + \lambda_{o} + \lambda_{p} + \lambda_{q} - \tau_{ic}) \right) $$

where α_o and λ_o represent the potential effect of item order (being administered in the second part of the form as opposed to the first) on the item discrimination and location parameters, α_p and λ_p represent the potential effect of IVR phone administration (as opposed to Internet administration), and α_q and λ_q represent the potential effect of paper-and-pencil questionnaire administration (as opposed to Internet administration).
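The two formulations can be illustrated with a small numerical sketch. The Python below is only an illustration of the equations as written here, not the authors' code. The function names, the 0-to-4 category coding, the 0/1 indicator coding of the order, IVR, and paper conditions, and all parameter values are assumptions made for the example.

```python
import math


def cumulative_prob(theta, alpha_i, lam_i, tau_ic, alpha_shift=0.0, lam_shift=0.0):
    """P(x >= c) from logit = alpha * (theta - (lambda - tau_ic)).

    alpha_shift and lam_shift hold the summed order/IVR/paper effects that
    apply to this response; with both at 0 this is the standard model.
    """
    alpha = alpha_i + alpha_shift
    lam = lam_i + lam_shift
    return 1.0 / (1.0 + math.exp(-alpha * (theta - (lam - tau_ic))))


def category_probs(theta, alpha_i, lam_i, taus, alpha_shift=0.0, lam_shift=0.0):
    """Probability of each category 0..len(taus) as differences of P(x >= c)."""
    cum = [1.0]
    cum += [cumulative_prob(theta, alpha_i, lam_i, t, alpha_shift, lam_shift) for t in taus]
    cum += [0.0]
    return [cum[k] - cum[k + 1] for k in range(len(cum) - 1)]


def moa_shift(second_half, ivr, paper, effect_o, effect_p, effect_q):
    """Sum only the effects whose condition holds (0/1 coding is an assumption)."""
    return second_half * effect_o + ivr * effect_p + paper * effect_q


# Hypothetical item: taus must decrease with c so that P(x >= c) decreases with c.
taus = [1.5, 0.5, -0.5, -1.5]

# Reference condition (Internet, first part of the form): no shifts.
pc_probs = category_probs(0.0, 2.0, 0.0, taus)

# Same item answered by IVR with a hypothetical location effect of 0.5.
lam_shift = moa_shift(second_half=0, ivr=1, paper=0,
                      effect_o=0.0, effect_p=0.5, effect_q=0.0)
ivr_probs = category_probs(0.0, 2.0, 0.0, taus, lam_shift=lam_shift)
```

In this sketch a positive location effect moves probability mass toward the lower response categories for a person with the same θ, which is the kind of shift the extended model is set up to detect.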

The model was estimated using SAS PROC NLMIXED. The item parameters α_i, λ_i, and τ_ic were initially treated as known constants and fixed to the values estimated in the PROMIS item bank development calibrations. In additional analyses, α_i, λ_i, and τ_ic were estimated for each item using the current sample. The mean and standard deviation of θ were estimated separately for each diagnostic group.
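To make the role of the fixed item parameters and the group-specific distribution of θ concrete, the following Python sketch computes the marginal log-likelihood of one respondent's answers. It is a minimal illustration and not a reproduction of the SAS estimation: it assumes θ is normally distributed within a group, a plain midpoint rule stands in for whatever numerical integration the procedure used, and the function names, category coding (0 to 4), and parameter values are hypothetical.

```python
import math


def category_prob(theta, alpha, lam, taus, x):
    """P(response == x) for categories 0..len(taus) under the graded response model."""
    def cum(c):
        # P(response >= c); category 0 is always reached, len(taus)+1 never.
        if c <= 0:
            return 1.0
        if c > len(taus):
            return 0.0
        return 1.0 / (1.0 + math.exp(-alpha * (theta - (lam - taus[c - 1]))))
    return cum(x) - cum(x + 1)


def marginal_loglik(responses, items, mu, sigma, n_points=201):
    """Log marginal likelihood of one respondent, assuming theta ~ Normal(mu, sigma).

    items is a list of (alpha, lam, taus) tuples treated as known constants;
    mu and sigma are the mean and SD of theta in the respondent's group.
    """
    lo = mu - 6.0 * sigma
    step = 12.0 * sigma / n_points
    total = 0.0
    for k in range(n_points):
        theta = lo + (k + 0.5) * step  # midpoint of the k-th interval
        density = math.exp(-0.5 * ((theta - mu) / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))
        lik = 1.0
        for x, (alpha, lam, taus) in zip(responses, items):
            lik *= category_prob(theta, alpha, lam, taus, x)
        total += density * lik * step
    return math.log(total)


# One hypothetical item with fixed parameters (alpha, lambda, category parameters).
items = [(2.0, 0.0, [1.5, 0.5, -0.5, -1.5])]

# Log-likelihood of answering in the top category for two hypothetical group means.
ll_high_group = marginal_loglik([4], items, mu=1.0, sigma=1.0)
ll_low_group = marginal_loglik([4], items, mu=-1.0, sigma=1.0)
```

Summing such log-likelihoods over respondents and maximizing over the group means, standard deviations, and administration effects would give estimates in the spirit of the analysis described above. The optimizer itself is left out of the sketch.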

About this article

Cite this article

Bjorner, J.B., Rose, M., Gandek, B. et al. Difference in method of administration did not significantly impact item response: an IRT-based analysis from the Patient-Reported Outcomes Measurement Information System (PROMIS) initiative. Qual Life Res 23, 217–227 (2014). https://doi.org/10.1007/s11136-013-0451-4

Accepted: 29 May 2013
Published: 23 July 2013
Issue Date: February 2014
DOI: https://doi.org/10.1007/s11136-013-0451-4
