Reliability and validity of three quality rating instruments for systematic reviews of observational studies

Jennifer M Hootman; Jeffrey B Driban; Michael R Sitler; Kyle P Harris; Nicole M Cattano

doi:10.1002/jrsm.41

Res Synth Methods. 2011 Jun;2(2):110-8. doi: 10.1002/jrsm.41. Epub 2011 Sep 15.

Authors

Jennifer M Hootman¹, Jeffrey B Driban²,³, Michael R Sitler², Kyle P Harris², Nicole M Cattano⁴

Affiliations

¹ Division of Adult and Community Health, National Center for Chronic Disease Prevention and Health Promotion, Centers for Disease Control and Prevention, Atlanta, GA, USA. jhootman@cdc.gov.
² Biokinetics Research Laboratory, Athletic Training Division, Department of Kinesiology, Temple University, Philadelphia, PA, USA.
³ Division of Rheumatology, Tufts Medical Center, Boston, MA, USA.
⁴ Department of Sports Medicine, West Chester University of Pennsylvania, West Chester, PA, USA.

PMID: 26061679
DOI: 10.1002/jrsm.41

Abstract

This study assessed the inter-rater reliability, criterion validity, and inter-instrument agreement of three quality rating instruments for observational studies: the Downs and Black (D&B), Newcastle-Ottawa (NOS), and Scottish Intercollegiate Guidelines Network (SIGN) scales. Each instrument was applied to a sample of 23 observational studies of musculoskeletal health outcomes. Inter-rater reliability was moderate to good for the D&B (intraclass correlation coefficient [ICC] = 0.73; CI = 0.47 to 0.88) and the NOS (ICC = 0.52; CI = 0.14 to 0.76) but poor for the SIGN (κ = 0.09; CI = -0.22 to 0.40). The NOS was not statistically valid (p = 0.35), whereas the SIGN was statistically valid (p < 0.05) with medium to large effect sizes (f² = 0.29 to 0.47). Inter-instrument agreement estimates were κ = 0.34, CI = 0.05 to 0.62 (D&B versus SIGN); κ = 0.26, CI = 0.00 to 0.52 (SIGN versus NOS); and κ = 0.43, CI = 0.09 to 0.78 (D&B versus NOS). Reliability and validity vary considerably across the quality rating scales used to assess observational studies in systematic reviews. Copyright © 2011 John Wiley & Sons, Ltd.

Keywords: instrument psychometrics; meta‐analysis; research methods; systematic review.
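
Note on the κ statistic: the agreement estimates in the abstract are reported as kappa (κ), which compares the observed proportion of agreement with the agreement expected by chance. The sketch below computes unweighted Cohen's kappa for two raters. It is illustrative only: the abstract does not say whether a weighted or unweighted variant was used, and the function name `cohens_kappa` and the ratings are hypothetical, not data from the study.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Unweighted Cohen's kappa for two equal-length lists of category labels."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(rater_a)

    # Observed agreement: share of items on which both raters chose the same category.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n

    # Chance agreement: for each category, the product of the two raters' marginal
    # proportions, summed over categories.
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)

    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return (observed - expected) / (1 - expected)


# Hypothetical quality ratings ("H" = high, "L" = low) for ten studies.
ratings_a = ["H", "H", "H", "H", "H", "L", "L", "L", "L", "L"]
ratings_b = ["H", "H", "H", "H", "L", "L", "L", "L", "L", "H"]

kappa = cohens_kappa(ratings_a, ratings_b)
print(round(kappa, 2))  # prints 0.6
```

In this made-up example the raters agree on 8 of 10 studies (observed agreement 0.80) and chance agreement is 0.50, so κ = (0.80 - 0.50) / (1 - 0.50) = 0.60. Values near 0, such as the κ = 0.09 reported for the SIGN, indicate agreement little better than chance.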