Table 2

Psychometric tests and criteria*

Psychometric property | Definition/test | Criteria
Acceptability | Quality of data; assessed by completeness of data and score distributions.
  • Proportion of missing data for scales (<10%) [3]

  • Low floor/ceiling effects in the pre-revascularisation samples (percentage scoring lowest/highest possible scale scores)

Reliability: internal consistency | Extent to which items in a scale measure the same construct (such as homogeneity of the scale); assessed by Cronbach's α, item-total correlations, and value of α if an item is deleted from a scale.
  • Cronbach's α for scales >0.70 [13]

  • Item-total correlations >0.30 [13]

  • Value of α if an item is deleted from scale should not ‘substantially increase’ [25]

Tests of scaling assumptions | Evidence that an item belongs in its own scale and not another scale (item convergent and discriminant validity).
  • Scaling success/failure (item does/does not correlate significantly higher with own scale than with other scales) and probable scaling success/failure (item does/does not correlate more highly, but not significantly, with own scale than with other scales) [26]

Construct validity (within-scale analyses) | Evidence that each scale measures a single construct and that items can be combined to form scales; assessed on the basis of evidence of good internal consistency, factor analysis and correlations between scale scores.
  • Internal consistency (Cronbach's α >0.70)

  • Principal axis factor analysis (factor loadings ≥0.30) [3]

  • Moderate intercorrelations between scale scores and evidence of unique reliable variance (reliability coefficients with values greater than the intercorrelations between scales) [26]

Construct validity (analyses against external criteria): convergent and discriminant validity | Evidence that scales are correlated with other measures of the same or similar construct, and not correlated with measures of different constructs; assessed on the basis of correlations of CROQ scores with EQ-5D-3L scores, age and sex.
  • Magnitude and direction of correlations expected to vary according to the similarity of constructs being measured in each instrument

  • Low to moderate correlations expected between a disease-specific and generic tool

  • Very low correlations (<0.30) expected for age and sex

Construct validity (analyses against external criteria): hypothesis testing | Evidence that scales differentiate known groups; assessed by comparing CROQ scores between groups hypothesised to differ.
  • CROQ scores should be significantly (p<0.05) different for groups expected to differ

Responsiveness | Ability of scales to detect clinically important change over time between Q1 and Q2; assessed by effect sizes (mean change score between pre-revascularisation and post-revascularisation divided by the SD of scores at pre-revascularisation).
  • Effect sizes defined as small (0.20), medium (0.50) and large (≥0.80) [27]

*Adapted from Schroter and Lamping 2004 [3].

CROQ, Coronary Revascularisation Outcome Questionnaire.
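
Several of the criteria in the table reduce to short calculations. The following is a minimal Python sketch, not the authors' analysis code: the function names and the data layout (one list of item scores per respondent, complete cases only) are assumptions made for illustration, and sample variances and SDs (n - 1 denominator) are assumed throughout.

```python
from statistics import mean, stdev, variance


def floor_ceiling(scores, lowest, highest):
    """Percentage of respondents at the lowest and highest possible scale score."""
    n = len(scores)
    floor_pct = 100 * sum(s == lowest for s in scores) / n
    ceiling_pct = 100 * sum(s == highest for s in scores) / n
    return floor_pct, ceiling_pct


def cronbach_alpha(rows):
    """Cronbach's alpha; rows holds one list of item scores per respondent."""
    k = len(rows[0])
    item_vars = [variance([row[j] for row in rows]) for j in range(k)]
    total_var = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


def pearson(x, y):
    """Pearson correlation between two equal-length lists."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def corrected_item_total(rows, j):
    """Correlation of item j with the sum of the remaining items in the scale."""
    item = [row[j] for row in rows]
    rest = [sum(row) - row[j] for row in rows]
    return pearson(item, rest)


def effect_size(pre, post):
    """Mean change score divided by the SD of the pre-revascularisation scores."""
    changes = [b - a for a, b in zip(pre, post)]
    return mean(changes) / stdev(pre)
```

Under these assumptions, a scale would meet the tabulated criteria when `cronbach_alpha` exceeds 0.70 and each `corrected_item_total` exceeds 0.30, while `effect_size` values would be read against the 0.20, 0.50 and 0.80 benchmarks.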