Measurement property | Rating | Criteria |
Structural validity | + | Classical Test Theory (CTT): Confirmatory Factor Analysis (CFA): Comparative Fit Index (CFI) or Tucker-Lewis Index (TLI) or comparable measure >0.95 OR Root Mean Square Error of Approximation (RMSEA) <0.06 OR Standardized Root Mean Square Residual (SRMR) <0.08. Item Response Theory (IRT)/Rasch: No violation of unidimensionality: CFA: CFI or TLI or comparable measure >0.95 OR RMSEA <0.06 OR SRMR <0.08 AND No violation of local independence: residual correlations among the items after controlling for the dominant factor <0.20 OR Q3’s <0.37 AND No violation of monotonicity: adequate-looking graphs OR item scalability >0.30 AND Adequate model fit: IRT: χ² >0.01; Rasch: infit and outfit mean squares ≥0.5 and ≤1.5 OR Z-standardised values >−2 and <2 |
? | CTT: Not all information for ‘+’ reported. IRT/Rasch: Model fit not reported | |
– | Criteria for ‘+’ not met. | |
Internal consistency | + | At least low evidence for sufficient structural validity AND Cronbach’s alpha(s) ≥0.70 for each unidimensional scale or subscale |
? | Criteria for ‘At least low evidence for sufficient structural validity’ not met | |
– | Criteria for ‘+’ not met | |
Reliability | + | Intraclass correlation coefficient (ICC) or weighted kappa ≥0.70 |
? | ICC or weighted kappa not reported | |
– | ICC or weighted kappa <0.70 | |
Measurement error | + | Smallest Detectable Change (SDC) or Limits of Agreement (LoA) <Minimal Important Change (MIC) |
? | MIC not defined | |
– | SDC or LoA >MIC | |
Hypotheses testing for construct validity | + | The result is in accordance with the hypothesis |
? | No hypothesis defined (by the review team) | |
– | The result is not in accordance with the hypothesis | |
Cross-cultural validity/measurement invariance | + | No important differences found between group factors in multiple group factor analysis OR no important differential item functioning (DIF) for group factors (McFadden’s R² <0.02) |
? | No multiple group factor analysis OR DIF analysis performed | |
– | Important differences found between group factors in multiple group factor analysis OR DIF was found | |
Criterion validity | + | Correlation with gold standard ≥0.70 OR area under the curve (AUC) ≥0.70 |
? | Not all information for ‘+’ reported | |
– | Correlation with gold standard <0.70 OR AUC <0.70 | |
Responsiveness | + | The result is in accordance with the hypothesis OR AUC ≥0.70 |
? | No hypothesis defined (by the review team) | |
– | The result is not in accordance with the hypothesis OR AUC <0.70 |
+ = sufficient; ? = indeterminate; – = insufficient.
COSMIN, COnsensus-based Standards for the selection of health Measurement INstruments.
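As an illustration of how the thresholds in the table might be applied, the following is a minimal Python sketch covering three of the rows (reliability, internal consistency, and measurement error). The function names and signatures are hypothetical and are not part of COSMIN. Where the table is silent, the sketch makes labelled assumptions: an SDC or LoA exactly equal to the MIC is rated insufficient, a missing SDC/LoA or an empty list of alphas is rated indeterminate, and the insufficient rating is written as an ASCII hyphen.

```python
from typing import Optional, Sequence

# Rating symbols as in the table legend (ASCII hyphen used for "insufficient").
SUFFICIENT, INDETERMINATE, INSUFFICIENT = "+", "?", "-"


def rate_reliability(icc_or_kappa: Optional[float]) -> str:
    """Reliability row: ICC or weighted kappa >= 0.70 is sufficient."""
    if icc_or_kappa is None:  # coefficient not reported
        return INDETERMINATE
    return SUFFICIENT if icc_or_kappa >= 0.70 else INSUFFICIENT


def rate_internal_consistency(
    structural_validity_ok: bool, alphas: Sequence[float]
) -> str:
    """Internal consistency row.

    structural_validity_ok: True when there is at least low evidence for
    sufficient structural validity.
    alphas: Cronbach's alpha for each unidimensional scale or subscale.
    """
    if not structural_validity_ok:
        return INDETERMINATE
    if not alphas:  # assumption: no alpha reported -> indeterminate
        return INDETERMINATE
    return SUFFICIENT if all(a >= 0.70 for a in alphas) else INSUFFICIENT


def rate_measurement_error(
    sdc_or_loa: Optional[float], mic: Optional[float]
) -> str:
    """Measurement error row: SDC or LoA must be smaller than the MIC."""
    if mic is None:  # MIC not defined
        return INDETERMINATE
    if sdc_or_loa is None:  # assumption: nothing to compare -> indeterminate
        return INDETERMINATE
    # Assumption: equality is not covered by the table; treated as insufficient.
    return SUFFICIENT if sdc_or_loa < mic else INSUFFICIENT


if __name__ == "__main__":
    print(rate_reliability(0.82))
    print(rate_internal_consistency(True, [0.81, 0.62]))
    print(rate_measurement_error(2.1, None))
```

The remaining rows (for example, structural validity under IRT/Rasch) combine several AND/OR conditions and partly rely on visual judgement of graphs, so they are left out of this sketch.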