Tests of Data Quality, Scaling Assumptions, and Reliability of the Danish SF-36

doi:10.1016/S0895-4356(98)00092-4

Journal of Clinical Epidemiology

Volume 51, Issue 11, November 1998, Pages 1001-1011

https://doi.org/10.1016/S0895-4356(98)00092-4

Abstract

We used general population data (n = 4084) to examine data completeness, response consistency, tests of scaling assumptions, and reliability of the Danish SF-36 Health Survey. We compared traditional multitrait scaling analyses to analyses using polychoric correlations and Spearman correlations. The frequency of missing values was low, except for elderly people and people with lower levels of education. Response consistency was high and compared well with results for the U.S. SF-36. For respondents with computable scales in all eight domains, scaling assumptions (item internal consistency, item discriminant validity, equal item–own scale correlations, and equal variances) were satisfactory in the total sample and in all subgroups. The SF-36 could discriminate between levels of health in all subgroups, but there were skewness, kurtosis, and ceiling effects in many subgroups (elderly people and people with chronic diseases excepted). Concerning correlation methods, we found interesting differences indicating advantages of using methods that do not assume a normal distribution of answers as an addition to traditional methods.

Introduction

A critical step in the development of health status measures is to evaluate their measurement properties. In the evaluation of such questionnaires, the health status measurement field has to some extent developed criteria that are unique to or especially important in the health field (e.g., response burden and responsiveness), but most criteria and techniques have been adopted from the area of psychological testing. In particular, statistical techniques from classical psychometrics [1] have been adopted to examine scaling assumptions, reliability, and validity. In principle, these techniques assume that data are continuous and normally distributed, but they often also work well with categorical, rank-scaled data [1, 2], that is, the kind of data used in health status measurement. The use of methods from classical psychometrics in the evaluation of health status measures may, however, still be debatable, as health status data are often skewed.
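The skewness concern raised above can be made concrete with two simple descriptive statistics. The following is a minimal sketch, not the authors' procedure: the scores are invented, the 0–100 scoring range is an assumption, and the helper names `skewness` and `ceiling_share` are hypothetical.

```python
import numpy as np

def skewness(x):
    """Skewness as the third standardized moment (population form)."""
    x = np.asarray(x, dtype=float)
    d = x - x.mean()
    return float((d ** 3).mean() / (d ** 2).mean() ** 1.5)

def ceiling_share(x, max_score=100.0):
    """Proportion of respondents scoring at the maximum possible value."""
    x = np.asarray(x, dtype=float)
    return float(np.mean(x == max_score))

# Invented scale scores for ten respondents (assumed 0-100 metric);
# most respondents sit at the top of the scale, as in a healthy sample.
scores = np.array([100, 100, 100, 95, 90, 100, 85, 100, 60, 100], dtype=float)

skew = skewness(scores)       # negative: long tail toward poor health
ceil = ceiling_share(scores)  # share of respondents at the maximum
print(f"skewness = {skew:.2f}, ceiling share = {ceil:.0%}")
```

A strongly negative skewness together with a large share of respondents at the maximum is the pattern under which normal-theory methods become questionable.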

The psychometric properties of health status measures are examined as part of the development process, but it is also recommended to do new psychometric analyses when a questionnaire is translated into another language [3]. Further, because the validity and reliability of a questionnaire are specific to the setting and the population [1], a check of psychometric properties may be appropriate if the questionnaire is to be used in new settings or with groups for whom it has not previously been tested.

This article concerns the psychometric properties of the Danish translation of the MOS SF-36 Health Survey. The translation and adaptation of the Danish SF-36 followed the procedures of the International Quality of Life Assessment (IQOLA) project [3, 4, 5], an international collaboration to translate and implement the SF-36 across languages and cultures. Initial studies have shown the adequacy of the Danish translation as judged from backward translations and from independent evaluation of the conceptual equivalence, use of common language, and clarity of the Danish translation [5]. Factor analytic studies have established that the factor structure is very similar to the U.S. version [6, 7]. Analyses by the Rasch model of the Physical Functioning scale and contingency table analyses of differential item functioning (DIF) in all scales have found that some items show DIF but that these cases of DIF have only a small impact on the total score for each scale in comparisons of general population data [8, 9].

We present data here on the psychometric properties of the Danish SF-36, focusing on data completeness, response consistency, tests of scaling assumptions, and the reliability of the Danish SF-36. As has been done in studies of the U.S. SF-36 [10], these properties are analyzed for the total population and subgroups defined by age, gender, education, and disease status. In addition, we investigate some more general issues concerning the choice of different statistical methods in testing scaling assumptions. Given that the Danish general population sample data used in this paper include many healthy people, our data are more skewed than the data originally used to validate the SF-36 in the United States [10, 11]. For this reason, we examined whether our results changed when using methods that are not dependent on the assumption of normal distribution of the observed variables.
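The comparison of correlation methods described above can be illustrated with an item–own scale correlation computed two ways. This is a minimal sketch under stated assumptions, not the analysis reported in the article: the response matrix is invented, the correction for overlap (correlating an item with the sum of the remaining items in its scale) is assumed to be the variant of interest, and the function names are hypothetical. Polychoric correlations are omitted because they require iterative estimation.

```python
import numpy as np

def average_ranks(x):
    """Rank values from 1..n, giving tied values their average rank."""
    x = np.asarray(x, dtype=float)
    order = np.argsort(x, kind="mergesort")
    sorted_x = x[order]
    ranks = np.empty(len(x), dtype=float)
    i = 0
    while i < len(x):
        j = i
        while j + 1 < len(x) and sorted_x[j + 1] == sorted_x[i]:
            j += 1
        # Positions i..j hold the same value; assign their mean rank.
        ranks[order[i:j + 1]] = (i + j) / 2.0 + 1.0
        i = j + 1
    return ranks

def pearson(a, b):
    """Pearson product-moment correlation."""
    return float(np.corrcoef(a, b)[0, 1])

def spearman(a, b):
    """Spearman correlation: Pearson correlation of the average ranks."""
    return pearson(average_ranks(a), average_ranks(b))

def corrected_item_scale(items, k, corr=pearson):
    """Correlate item k with the sum of the other items in its scale."""
    items = np.asarray(items, dtype=float)
    rest = np.delete(items, k, axis=1).sum(axis=1)
    return corr(items[:, k], rest)

# Invented responses: 8 respondents x 3 items scored 1-3, skewed
# toward the best category as in a mostly healthy sample.
items = np.array([
    [3, 3, 3], [3, 3, 3], [3, 3, 2], [3, 2, 3],
    [2, 3, 3], [2, 2, 2], [1, 2, 1], [1, 1, 1],
])

r_pearson = corrected_item_scale(items, 0, corr=pearson)
r_spearman = corrected_item_scale(items, 0, corr=spearman)
print(f"Pearson: {r_pearson:.2f}, Spearman: {r_spearman:.2f}")
```

Because the Spearman coefficient depends only on the rank order of the responses, it does not rely on the observed variables being normally distributed, which is the property the comparison is meant to exploit.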

Data Collection

Data for the Danish general population were collected from February to August 1994 as part of a population health survey. A representative sample of 5983 noninstitutionalized Danish citizens more than 15 years of age was drawn from the Civil Registration System, which registers addresses and other data for all Danes. The survey included a home visit with a 30-minute structured personal interview regarding social and demographic data, health behavior, health status, and diseases. After the

Background Data

The respondents’ ages ranged from 16 to 94 years. The mean age of the population was approximately the same for men and women (44 years), and the mean number of years of education did not differ between genders. Chronic diseases were slightly more frequent among women. The younger age groups had more years of education and fewer chronic diseases than the older age groups. In the age groups 25 to 66 years, chronic diseases were more frequent among people with short school education, but no

Discussion

Danes tend to have a somewhat more positive self-evaluated health status than Americans (see Ware et al. [23]) as indicated by more skewness and higher ceiling effects in Danish data. Even in this healthy Danish population sample, the SF-36 is able to distinguish levels of health. For some items, like the PF7, PF8, and MH4, skewness is especially pronounced compared with the U.S. general population data [24]. It is possible that the translation has led to a slight shift in meaning of these

Conclusion

Although the results on missing responses warrant further work on questionnaire layout for some subgroups, the rest of our psychometric results are satisfactory. The scaling properties, the reliability, and the discriminatory power of the Danish SF-36 seemed to be best in the group with whom the questionnaire can be expected to be most used: among people with chronic disease. Thus, the Danish SF-36 may be said to have found the balance between fulfilling requirements of shortness and low

Acknowledgements

The International Quality of Life Assessment (IQOLA) Project is sponsored by Glaxo Wellcome, Inc., Research Triangle Park, North Carolina, and Schering-Plough Corporation, Kenilworth, New Jersey. This study has been supported by grants from Glaxo Research Institute, from the Danish Medical Research Council, and from the Danish Health Insurance Fund. We thank Barbara Gandek, John E. Ware, Jr., and two anonymous reviewers for comments on a previous version of this article.

References (32)

M. Bullinger et al. Translating health status questionnaires and evaluating their quality: The IQOLA approach. J Clin Epidemiol (1998)

J.B. Bjorner et al. The Danish SF-36 health survey: Translation and preliminary validity studies. J Clin Epidemiol (1998)

S.D. Keller et al. Use of structural equation modeling to test the construct validity of the SF-36 Health Survey in ten countries. J Clin Epidemiol (1998)

J.E. Ware et al. The factor structure of the SF-36 Health Survey in ten countries: Results from the IQOLA Project. J Clin Epidemiol (1998)

A.E. Raczek et al. Comparison of Rasch and summating rating scales constructed from SF-36 Physical Functioning items in seven countries: Results from the IQOLA Project. J Clin Epidemiol (1998)

J.B. Bjorner et al. Differential item functioning in the Danish translation of the SF-36. J Clin Epidemiol (1998)

J.E. Ware et al. The equivalence of SF-36 summary health scores estimated using standard and country-specific algorithms in 10 countries: Results from the IQOLA Project. J Clin Epidemiol (1998)

M. Sullivan et al. The Swedish SF-36 health survey—I. Evaluation of data quality, scaling assumptions, reliability and construct validity across general populations in Sweden. Soc Sci Med (1995)

J.C. Nunnally et al. Psychometric Theory (1994)

K.A. Bollen et al. Pearson’s r and coarsely categorized measures. Am Sociol Rev (1981)

Ware JE, Keller SD, Gandek B, Brazier JE, Sullivan M, The IQOLA Project Group. Evaluating Translations of Health Status...

C.A. McHorney et al. The MOS 36-item Short-Form Health Survey (SF-36): III. Test of data quality, scaling assumptions, and reliability across diverse patient groups. Med Care (1994)

C.A. McHorney et al. The MOS 36-Item Short-Form Health Survey (SF-36): II. Psychometric and clinical tests of validity in measuring physical and mental health constructs. Med Care (1993)

[Classification of Diseases - Systematic Part]. Copenhagen: Danish National Board of Health;...

J.E. Ware et al. SF-36 Health Survey: Manual and Interpretation Guide (1993)

J.B. Bjorner et al. Danish Manual for the SF-36 (in Danish) (1997)

