Table 2

Operational definitions of psychometric properties [19, 20]

| Domain | Measurement property | Aspect of a measurement property | Definition |
| --- | --- | --- | --- |
| Reliability | | | The degree to which the measurement is free from measurement error |
| Reliability (extended definition) | | | The extent to which scores for patients who have not changed are the same for repeated measurement under several conditions: for example, using different sets of items from the same patient-reported outcome measure (PROM) (internal consistency); over time (test–retest); by different persons on the same occasion (inter-rater) or by the same persons (ie, raters or responders) on different occasions (intra-rater) |
| | Internal consistency | | The degree of the inter-relatedness among the items |
| | Reliability | | The proportion of the total variance in the measurements which is due to ‘true’* differences between patients |
| | Measurement error | | The systematic and random error of a patient’s score that is not attributed to true changes in the construct to be measured |
| Validity | | | The degree to which a PROM measures the construct(s) it purports to measure |
| | Content validity | | The degree to which the content of a PROM is an adequate reflection of the construct to be measured |
| | | Face validity | The degree to which (the items of) a PROM indeed looks as though they are an adequate reflection of the construct to be measured |
| | Construct validity | | The degree to which the scores of a PROM are consistent with hypotheses (for instance with regard to internal relationships, relationships to scores of other instruments or differences between relevant groups) based on the assumption that the PROM validly measures the construct to be measured |
| | | Structural validity | The degree to which the scores of a PROM are an adequate reflection of the dimensionality of the construct to be measured |
| | | Hypotheses testing | Idem construct validity |
| | | Cross-cultural validity | The degree to which the performance of the items on a translated or culturally adapted PROM is an adequate reflection of the performance of the items of the original version of the PROM |
| | Criterion validity | | The degree to which the scores of a PROM are an adequate reflection of a ‘gold standard’ |
| Responsiveness | | | The ability of a PROM to detect change over time in the construct to be measured |
| | Responsiveness | | Idem responsiveness |
| Interpretability† | | | Interpretability is the degree to which one can assign qualitative meaning (that is, clinical or commonly understood connotations) to a PROM’s quantitative scores or change in scores |
  • *The word ‘true’ must be seen in the context of classical test theory (CTT), which states that any observation is composed of two components: a true score and the error associated with the observation. The ‘true’ score is the average score that would be obtained if the scale were given an infinite number of times. It refers only to the consistency of the score, and not to its accuracy.

  • †Interpretability is not considered a measurement property, but an important characteristic of a measurement instrument.

  • CTT, classical test theory; PROM, patient-reported outcome measure.
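
The CTT footnote and the variance-based definition of reliability can be written out as a short worked formula. This is only a sketch under the usual CTT assumption that true score and error are uncorrelated; the symbols X (observed score), T (true score) and E (error) are notation introduced here for illustration and do not appear in the table.

```latex
% X, T, E and the sigma symbols are assumed notation, not taken from the source table.
\begin{align}
  X &= T + E, \qquad \operatorname{Cov}(T, E) = 0 \\
  \sigma_X^2 &= \sigma_T^2 + \sigma_E^2 \\
  \text{Reliability} &= \frac{\sigma_T^2}{\sigma_X^2}
                      = \frac{\sigma_T^2}{\sigma_T^2 + \sigma_E^2}
\end{align}
```

Read this way, reliability is 1 when the error variance is zero and approaches 0 as error variance dominates the differences between patients, which matches the statement that ‘true’ refers to consistency of the score rather than its accuracy.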