Original Article
Tests of Data Quality, Scaling Assumptions, and Reliability of the Danish SF-36
Introduction
A critical step in the development of health status measures is to evaluate their measurement properties. In the evaluation of such questionnaires, the health status measurement field has to some extent developed criteria that are unique to or especially important in the health field (e.g., response burden and responsiveness), but most criteria and techniques have been adopted from the area of psychological testing. In particular, statistical techniques from classical psychometrics [1] have been adopted to examine scaling assumptions, reliability, and validity. In principle, these techniques assume that data are continuous and normally distributed, but they often also work well with categorical rank-scaled data [1, 2], that is, the kind of data used in health status measurement. The use of methods from classical psychometrics in the evaluation of health status measures may, however, still be debatable, as health status data are often skewed.
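As a small illustration of the classical reliability techniques mentioned above, the following Python sketch computes Cronbach's alpha, one widely used internal-consistency coefficient. This excerpt does not say which coefficient the authors used, so the choice of alpha, the function name `cronbach_alpha`, and the response data are all illustrative assumptions.

```python
from statistics import pvariance


def cronbach_alpha(items):
    """Cronbach's alpha for a list of respondents, each a list of k item scores."""
    k = len(items[0])
    # Variance of each item across respondents
    item_vars = [pvariance([row[j] for row in items]) for j in range(k)]
    # Variance of the summed scale score across respondents
    total_var = pvariance([sum(row) for row in items])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Hypothetical data: 6 respondents answering 3 items scored 1 to 5
responses = [
    [5, 5, 4],
    [5, 4, 5],
    [4, 4, 4],
    [3, 3, 2],
    [5, 5, 5],
    [2, 3, 2],
]

alpha = cronbach_alpha(responses)
print(f"Cronbach's alpha: {alpha:.2f}")
```

Alpha rises as the items covary more strongly relative to their individual variances. When every item gives the same score for each respondent, the coefficient equals 1.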
The psychometric properties of health status measures are examined as part of the development process, but it is also recommended to do new psychometric analyses when a questionnaire is translated into another language [3]. Further, because the validity and reliability of a questionnaire are specific to the setting and the population [1], a check of psychometric properties may be appropriate if the questionnaire is to be used in new settings or with groups for whom it has not previously been tested.
This article concerns the psychometric properties of the Danish translation of the MOS SF-36 Health Survey. The translation and adaptation of the Danish SF-36 followed the procedures of the International Quality of Life Assessment (IQOLA) project [3, 4, 5], an international collaboration to translate and implement the SF-36 across languages and cultures. Initial studies have shown the adequacy of the Danish translation as judged from backward translations and from independent evaluation of the conceptual equivalence, use of common language, and clarity of the Danish translation [5]. Factor analytic studies have established that the factor structure is very similar to the U.S. version [6, 7]. Analyses by the Rasch model of the Physical Functioning scale and contingency table analyses of differential item functioning (DIF) in all scales have found that some items show DIF but that these cases of DIF have only a small impact on the total score for each scale in comparisons of general population data [8, 9].
We present data here on the psychometric properties of the Danish SF-36, focusing on data completeness, response consistency, tests of scaling assumptions, and the reliability of the Danish SF-36. As has been done in studies of the U.S. SF-36 [10], these properties are analyzed for the total population and subgroups defined by age, gender, education, and disease status. In addition, we investigate some more general issues concerning the choice of different statistical methods in testing scaling assumptions. Given that the Danish general population sample data used in this paper include many healthy people, our data are more skewed than the data originally used to validate the SF-36 in the United States [10, 11]. For this reason, we examined whether our results changed when using methods that are not dependent on the assumption of normal distribution of the observed variables.
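The comparison described above can be sketched in Python by computing an item-scale correlation twice: once with Pearson's r, which is tied to the usual normal-theory assumptions, and once with Spearman's rank correlation, which is not. This is a minimal sketch, not the authors' procedure. The excerpt does not name the distribution-free methods they used, so Spearman's coefficient, the correction for overlap (correlating an item with the sum of the other items in its scale), all function names, and the data are assumptions for illustration.

```python
from math import sqrt


def pearson(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def average_ranks(values):
    """Ranks starting at 1; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of tied values
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman rank correlation: Pearson's r applied to average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def corrected_item_scale(responses, item, method=pearson):
    """Correlate one item with the sum of the remaining items in the scale."""
    item_scores = [row[item] for row in responses]
    rest_scores = [sum(row) - row[item] for row in responses]
    return method(item_scores, rest_scores)


# Hypothetical skewed data: 8 respondents, 3 items scored 1 to 3,
# with most answers at the ceiling (3)
responses = [
    [3, 3, 3], [3, 3, 3], [3, 3, 2], [3, 2, 3],
    [3, 3, 3], [2, 2, 1], [1, 1, 1], [3, 3, 3],
]

r_pearson = corrected_item_scale(responses, 0, pearson)
r_spearman = corrected_item_scale(responses, 0, spearman)
print(f"Pearson: {r_pearson:.2f}  Spearman: {r_spearman:.2f}")
```

If the two coefficients lead to the same conclusion about whether an item belongs to its scale, the skewness of the data is unlikely to have distorted the result.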
Data Collection
Data for the Danish general population were collected from February to August 1994 as part of a population health survey. A representative sample of 5983 noninstitutionalized Danish citizens more than 15 years of age was drawn from the Civil Registration System, which registers addresses and other data for all Danes. The survey included a home visit with a 30-minute structured personal interview regarding social and demographic data, health behavior, health status, and diseases. After the
Background Data
The respondents’ ages ranged from 16 to 94 years. The mean age of the population was approximately the same for men and women (44 years), and the mean number of years of education did not differ between genders. Chronic diseases were slightly more frequent among women. The younger age groups had more years of education and fewer chronic diseases than the older age groups. In the age groups 25 to 66 years, chronic diseases were more frequent among people with short school education, but no
Discussion
Danes tend to have a somewhat more positive self-evaluated health status than Americans (see Ware et al. [23]) as indicated by more skewness and higher ceiling effects in Danish data. Even in this healthy Danish population sample, the SF-36 is able to distinguish levels of health. For some items, like the PF7, PF8, and MH4, skewness is especially pronounced compared with the U.S. general population data [24]. It is possible that the translation has led to a slight shift in meaning of these
Conclusion
Although the results on missing responses warrant further work on questionnaire layout for some subgroups, the rest of our psychometric results are satisfactory. The scaling properties, the reliability, and the discriminatory power of the Danish SF-36 seemed to be best in the group with whom the questionnaire can be expected to be most used: among people with chronic disease. Thus, the Danish SF-36 may be said to have found the balance between fulfilling requirements of shortness and low
Acknowledgements
The International Quality of Life Assessment (IQOLA) Project is sponsored by Glaxo Wellcome, Inc., Research Triangle Park, North Carolina, and Schering-Plough Corporation, Kenilworth, New Jersey. This study has been supported by grants from Glaxo Research Institute, from the Danish Medical Research Council, and from the Danish Health Insurance Fund. We thank Barbara Gandek, John E. Ware, Jr., and two anonymous reviewers for comments on a previous version of this article.
References (32)
- et al. Translating health status questionnaires and evaluating their quality: The IQOLA approach. J Clin Epidemiol (1998)
- et al. The Danish SF-36 health survey: Translation and preliminary validity studies. J Clin Epidemiol (1998)
- et al. Use of structural equation modeling to test the construct validity of the SF-36 Health Survey in ten countries. J Clin Epidemiol (1998)
- et al. The factor structure of the SF-36 Health Survey in ten countries: Results from the IQOLA Project. J Clin Epidemiol (1998)
- et al. Comparison of Rasch and summating rating scales constructed from SF-36 Physical Functioning items in seven countries: Results from the IQOLA Project. J Clin Epidemiol (1998)
- et al. Differential item functioning in the Danish translation of the SF-36. J Clin Epidemiol (1998)
- et al. The equivalence of SF-36 summary health scores estimated using standard and country-specific algorithms in 10 countries: Results from the IQOLA Project. J Clin Epidemiol (1998)
- et al. The Swedish SF-36 health survey—I. Evaluation of data quality, scaling assumptions, reliability and construct validity across general populations in Sweden. Soc Sci Med (1995)
- et al. Psychometric Theory (1994)
- et al. Pearson’s r and coarsely categorized measures. Am Sociol Rev (1981)
- The MOS 36-item Short-Form Health Survey (SF-36): III. Test of data quality, scaling assumptions, and reliability across diverse patient groups. Med Care
- The MOS 36-Item Short-Form Health Survey (SF-36): II. Psychometric and clinical tests of validity in measuring physical and mental health constructs. Med Care
- SF-36 Health Survey: Manual and Interpretation Guide
- Danish Manual for the SF-36 (in Danish)