Original Article
The Equivalence of SF-36 Summary Health Scores Estimated Using Standard and Country-Specific Algorithms in 10 Countries: Results from the IQOLA Project
Introduction
Scoring and interpreting physical and mental summary measures from the SF-36 Health Survey has been shown to offer a number of advantages [1, 2, 3, 4]. Compared with any one of the eight SF-36 scales, the physical and mental health summary measures can be estimated with smaller confidence intervals, expand the range of health states measured, and greatly increase the number of levels distinguished [3]. While the summary measures do not reproduce all of the reliable variance in the eight-scale SF-36 profile, they have the advantage of reducing the number of statistical comparisons required when analyzing SF-36 data. Empirical tests suggest that they do so without a substantial loss of information [3, 5].
Construction of the summary measures in the United States was based on a number of findings. First, two factors, one physical and one mental, were shown to account for 80% to 85% of the reliable variance in the eight SF-36 scales in patient and general populations [3, 4]. As hypothesized, scales measuring physical functioning, role limitations due to physical health, bodily pain, and general health correlated highest with the physical component and lowest with the mental component, whereas mental health, role limitations due to emotional problems, social functioning, and vitality correlated highest with the mental component and lowest with the physical component. This pattern of correlations between scales and summary component scores was also quite robust, suggesting that each summary has a comparable interpretation across population subgroups [3, 4, 5]. The summary measures have also been shown to be valid in discriminating between physical and mental health status and outcomes in both cross-sectional and longitudinal tests [3, 5, 6, 7, 8].
The two-component SF-36 model of health was first described in the United States [3, 4, 5]. It has been replicated across large general population samples from nine Western European countries (Denmark, France, Germany, Italy, the Netherlands, Norway, Spain, Sweden, and the United Kingdom) [6, 9]. These replications suggest that it may be feasible to score and interpret physical and mental health summary measures in these countries. It is not clear, however, how such summary scores should be estimated. In this study, we compare country-specific versus standard (U.S.-derived) scoring algorithms for the SF-36 physical and mental health summary measures to evaluate their equivalence and explore the implications of using one scoring method or the other in international analyses.
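The difference between the two scoring approaches can be pictured with a minimal sketch. It assumes the commonly described form of a summary score: each of the eight scales is standardized against a reference population, the standardized scores are combined with factor score weights, and the result is rescaled to a mean of 50 and a standard deviation of 10. That form is an assumption here, since this excerpt does not spell out the algorithm. The function name `summary_score` and every number below (means, standard deviations, weights) are hypothetical placeholders, not published norms or coefficients.

```python
# Illustrative sketch only. All numeric values are hypothetical placeholders,
# not published SF-36 norms or factor score coefficients.

# Conventional abbreviations for the eight SF-36 scales.
SCALES = ("PF", "RP", "BP", "GH", "VT", "SF", "RE", "MH")


def summary_score(scale_scores, means, sds, weights):
    """Standardize each scale, form the weighted sum, rescale to mean 50, SD 10."""
    aggregate = sum(
        weights[s] * (scale_scores[s] - means[s]) / sds[s] for s in SCALES
    )
    return 50.0 + 10.0 * aggregate


# Hypothetical physical-component weights (placeholders).
PHYSICAL_WEIGHTS = {
    "PF": 0.40, "RP": 0.35, "BP": 0.30, "GH": 0.25,
    "VT": 0.00, "SF": 0.00, "RE": -0.20, "MH": -0.20,
}

# "Standard" scoring: one fixed set of reference means, SDs, and weights.
STANDARD = {
    "means": {s: 70.0 for s in SCALES},
    "sds": {s: 20.0 for s in SCALES},
    "weights": PHYSICAL_WEIGHTS,
}

# "Country-specific" scoring: same formula, but parameters estimated
# from that country's own general population (placeholders again).
COUNTRY = {
    "means": {s: 75.0 for s in SCALES},
    "sds": {s: 20.0 for s in SCALES},
    "weights": PHYSICAL_WEIGHTS,
}

# One hypothetical respondent scoring 70 on every scale.
respondent = {s: 70.0 for s in SCALES}
standard_pcs = summary_score(respondent, **STANDARD)
country_pcs = summary_score(respondent, **COUNTRY)
```

In this sketch the two algorithms share one formula and differ only in the parameters supplied, so the same respondent can receive different scores depending on which reference population is used.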
Data
Data come from 10 general population surveys, which have been described in detail elsewhere [10]. In brief, samples were selected to be nationally representative in nine countries (Denmark, France, Germany, Italy, the Netherlands, Norway, Spain, the United Kingdom, and the United States). Data from Sweden were collected through seven mail surveys conducted in various regions of Sweden [11]. Self-administration of the SF-36 was used in six countries; the exceptions were Italy (50% personal
Analyses
The correlation between each pair of SF-36 summary components scored using standard (U.S.) and country-specific algorithms was examined to test their equivalence in each country. We hypothesized that these correlations would be positive and very high and accepted correlations greater than 0.90 as satisfactory evidence of equivalence. In addition, we examined correlations between physical and mental summary components that were scored using the same methods (e.g., PCS-36/MCS-36); we hypothesized
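The correlational test of equivalence described in this section can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' analysis code: the helper names `pearson_r` and `meets_equivalence` are hypothetical, and the score lists are made-up values used only to exercise the functions. The 0.90 cut-off is the criterion stated in the text.

```python
import math

# Criterion from the text: correlations greater than 0.90 count as equivalent.
EQUIVALENCE_THRESHOLD = 0.90


def pearson_r(x, y):
    """Product-moment correlation between two equal-length score lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two lists of equal length with at least 2 values")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when a list has no variance")
    return sxy / math.sqrt(sxx * syy)


def meets_equivalence(standard_scores, country_scores,
                      threshold=EQUIVALENCE_THRESHOLD):
    """Return (r, passed) for one country's pair of scorings."""
    r = pearson_r(standard_scores, country_scores)
    return r, r > threshold


# Made-up scores for five respondents, scored both ways (illustration only).
standard_pcs = [38.0, 45.5, 50.0, 53.2, 60.1]
country_pcs = [36.9, 44.8, 49.1, 52.6, 59.0]
r, passed = meets_equivalence(standard_pcs, country_pcs)
```

A high correlation shows that the two scorings rank respondents in nearly the same order. It does not show that the two scorings produce the same score levels, because a constant shift between them leaves the correlation unchanged.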
Results
Correlations between the SF-36 summary measures scored using standard (U.S.) scoring algorithms and country-specific scoring algorithms were very high, ranging from 0.980 to 0.998 across countries for the PCS-36/CPCS-36 and 0.984 to 0.998 for the MCS-36/CMCS-36 (Table 2). Thus, the correlational standard of equivalence was satisfied for both physical and mental health summary measures in all countries. Correlations between SF-36 physical and mental summary measures scored using standard
Discussion
For both physical and mental health, we observed substantial relative agreement between SF-36 summary measures estimated using standard and country-specific scoring algorithms in all countries. Specifically, product-moment correlations between SF-36 summary measures scored using standard (U.S.) scoring and country-specific scoring ranged from 0.980 to 0.998 across countries. On the basis of the strength of these findings, we recommend use of standard scoring, using U.S.-derived scoring
References (15)
- et al. The factor structure of the SF-36 Health Survey in ten countries: Results from the IQOLA Project. J Clin Epidemiol (1998).
- et al. Methods for validating and norming translations of health status questionnaires: The IQOLA project approach. J Clin Epidemiol (1998).
- et al. Tests of data quality, scaling assumptions, and reliability of the SF-36 in eleven countries: Results from the IQOLA Project. J Clin Epidemiol (1998).
- et al. Psychometric and clinical tests of validity of the Japanese SF-36 Health Survey. J Clin Epidemiol (1998).
- et al. The MOS 36-Item Short-Form Health Survey (SF-36): I. Conceptual framework and item selection. Med Care (1992).
- et al. SF-36 Health Survey Manual and Interpretation Guide (1993).
- et al. SF-36 Physical and Mental Health Summary Scales: A User’s Manual (1994).