Abstract
Background Information on the validity of self-reported cases of stroke and acute myocardial infarction (AMI) is varied. The aim of this study was to assess the validity and agreement of self-reported prevalent cases of stroke and AMI in the Spanish cohort of the European Prospective Investigation into Cancer and Nutrition (EPIC).
Methods At recruitment, 1992–1996, and in the follow-up (3 years after recruitment), each participant in the Spanish EPIC cohort (15 630 men and 25 808 women) was asked if a doctor had ever said that they had had a stroke or AMI, and the results were compared with information available in medical records. Validity of self-reported prevalent cases of stroke and AMI was examined by calculating sensitivity, specificity, positive predictive values and κ statistics.
Results The sensitivity of self-reported prevalent cases of stroke was 81.3% and that for AMI was 97.7%. The positive predictive value was 22.2% and 60.7% for stroke and AMI, respectively, whereas specificity was very high (>99%) for both diseases. The agreement between self-report questionnaire results and medical records was substantial (κ=0.75) for AMI but not for stroke (κ=0.35).
Conclusion Self-reported information on stroke and AMI included in the EPIC questionnaire is a valid instrument for the assessment of AMI disease but should be used with caution in stroke.
- Stroke
- acute myocardial infarction
- sensitivity
- specificity
- positive predictive value
- κ statistic
- EPIC
- cancer
- coronary heart disease
- ischaemic heart disease
- epidemiology
- vascular disease
- public health
- chronic DI
- diabetes
- ischaemic heart disease
- nutrition
- CHD/coronary heart
- communicable diseases
Introduction
Although cerebrovascular and coronary heart disease (CHD) represent a major cause of death in Europe, associated mortality rates have fallen in recent years.1 In Spain, epidemiological information about stroke is less comprehensive2 ,3 than the well-known epidemiology of CHD.4–6 Nevertheless, the incidence of both diseases is known to be lower than in other industrialised countries.2 ,5 In addition, risk and prognostic factors for ischaemic heart disease have been analysed in several Spanish studies, but there are few data on this issue in stroke.7 ,8
Self-report questionnaires are frequently used to obtain information about cardiovascular disease (CVD), such as stroke9 and acute myocardial infarction10 (AMI), in epidemiological research. They are considered a valuable, complete and cost-efficient method to assess prevalence and incidence of stroke and AMI in the absence of specific population registers for these diseases, though information about their validity is varied. In particular, sensitivity has been found to be quite high for both stroke (73%–95%) and AMI (80%–97%), whereas positive predictive values (PPVs) reported have been lower: 57%–79% for stroke and 43%–87% for AMI.9–13 Furthermore, agreement between self-reported health information and medical records has generally been found to be substantial (κ statistic >0.61) in the field of cardiovascular epidemiology.10 ,13 Nonetheless, there has been little research on this subject in Spanish populations.
The aim of this study was to assess the validity and agreement of self-reported prevalent stroke and AMI, at recruitment and at a 3-year follow-up, compared with medical records in the Spanish cohort of the European Prospective Investigation into Cancer and Nutrition (EPIC), by estimating the sensitivity, specificity, PPVs and κ statistics.
Methods
Study population
The EPIC study is a multicentre prospective study conducted on half a million adult volunteers recruited between 1992 and 2000 in 10 Western European countries.14 The Spanish EPIC cohort consists of 41 438 volunteers enrolled in five Spanish regions, three in the North (Asturias, San Sebastian and Navarra) and two in the South (Granada and Murcia), between 1992 and 1996. The methodological details of the EPIC project have been published previously.14–16
In the present study, data from Asturias, San Sebastian, Navarra and Murcia were included for validation of self-reported prevalent cases of stroke (13 832 men and 19 722 women, aged 29–70 years) at recruitment and at a 3-year follow-up, while all Spanish EPIC regions were included in the AMI analysis (15 630 men and 25 808 women, aged 29–69 years). All participants gave their informed consent concerning the use of patient identifiable information at recruitment.
Questionnaire data
At the time of recruitment, between 1992 and 1996, the dietary habits of each participant were collected using a questionnaire. Anthropometric measurements and a blood sample were also taken at recruitment using standardised procedures. In addition, participants were given a questionnaire about non-dietary variables, including their history of smoking, work and leisure-time physical activity, and level of education, as well as various self-reported conditions (stroke, AMI, high blood pressure, hyperlipidaemia and diabetes mellitus). Both lifestyle and medical information were obtained by means of a personal interview with the participant. The questions about stroke and AMI were formulated as follows: ‘Have you ever been told by a physician that you have or have had a stroke (cerebral thrombosis or haemorrhage)?’ and ‘Have you ever been told by a physician that you have or have had an acute myocardial infarction (heart attack)?’ In the event of a positive answer to either of these questions, participants were also asked to report their age at diagnosis of the disease.
A follow-up telephone interview was carried out 3 years after recruitment with 98% of the participants and included the following questions: ‘Have you ever been told by a physician that you have or have had a stroke (cerebral thrombosis or haemorrhage)?’ and ‘Have you ever been told by a physician that you have or have had an AMI (heart attack)?’ If the answer was ‘yes’ to either of these questions, participants were also asked to indicate the date of diagnosis and, in cases of hospitalisation, the city and hospital where they were admitted. In this follow-up, a participant could report a prevalent stroke or AMI event that had not been mentioned at recruitment, the same stroke or AMI event that was cited at recruitment or an incident stroke or AMI event. At this point, we only took into account those stroke and AMI events reported at the 3-year follow-up that had not been reported at recruitment.
Outcome assessment
Stroke cases were identified from self-reported questionnaires at recruitment and at the 3-year follow-up as well as by record linkage with two sources of information: hospital discharge databases and computerised primary care registers. The linkage between hospital discharges and the EPIC cohort was made by reviewing the International Classification of Diseases (ICD) codes ICD-9 430–438. For the linkage with the computerised primary care registers, codes K89, K90 and K92 from the International Classification of Primary Care were reviewed in Asturias, Navarra and Murcia, and the codes ICD-9 430–438 were used in San Sebastian.
AMI events were also identified from self-reported questionnaires at recruitment and at the 3-year follow-up as well as by record linkage with hospital discharge databases (though for Granada we had limited access to these records). The linkage with hospital discharges was made by reviewing the following ICD codes: ICD-9 410–414.
A validation process was carried out to confirm and classify the stroke and AMI events identified. This validation process was performed by a team of trained health professionals who reviewed patient hospital medical records and primary care information (only for stroke and when hospital data were not available). The stroke cases were classified on the basis of symptoms, presence of cerebrovascular risk factors and specific medical tests (CT, MRI, angiography, Doppler echography, lumbar puncture, etc) in accordance with a guide published by the Spanish Society of Neurology.17 Two neurologists helped with the classification of the most difficult stroke cases. The AMI events were classified on the basis of symptoms, enzyme test results, ECG and biomarker findings according to the Multinational Monitoring of Trends and Determinants in Cardiovascular Disease criteria.18 They were classified as either definite AMI or possible AMI (in those cases which did not meet all diagnostic criteria), and both were considered as cases. A cardiologist helped with the classification of the most difficult AMI cases.
The present analysis examined the accuracy of self-reported prevalent stroke and AMI at recruitment and at the 3-year follow-up. To validate self-reported prevalent stroke and AMI events, self-reported information was compared with medical records. Specifically, a stroke or AMI was considered prevalent if there was an indication of a recognised stroke or recognised AMI disease in the patient's records before the recruitment date.
Statistical analysis
The validity of self-reported questionnaire data compared with medical records was expressed in terms of sensitivity, specificity and PPV. Sensitivity was calculated as the number of true positives, that is, correctly identified cardiovascular cases, divided by the total number of cardiovascular cases registered. Specificity was estimated as the number of true negatives, that is, correctly identified non-cases, divided by the total number of non-cases. The PPV was calculated as the number of true positive cardiovascular cases divided by the total number of self-reported cardiovascular cases. We did not focus on the negative predictive value of stroke or AMI as it was close to 100% for both diseases.
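The three measures defined above reduce to simple ratios on a 2×2 table of self-report against medical records. The following is a minimal sketch of that arithmetic; the function name `validity_measures` and the counts in the usage example are hypothetical and are not taken from the study.

```python
def validity_measures(tp, fp, fn, tn):
    """Return (sensitivity, specificity, PPV) from 2x2 table counts.

    tp: self-reported cases confirmed in medical records
    fp: self-reported cases not confirmed in medical records
    fn: confirmed cases that were not self-reported
    tn: non-cases that were not self-reported
    """
    sensitivity = tp / (tp + fn)  # true positives / all confirmed cases
    specificity = tn / (tn + fp)  # true negatives / all non-cases
    ppv = tp / (tp + fp)          # true positives / all self-reported cases
    return sensitivity, specificity, ppv


# Hypothetical counts, chosen only to illustrate the arithmetic.
sens, spec, ppv = validity_measures(tp=40, fp=10, fn=10, tn=940)
print(f"sensitivity={sens:.3f} specificity={spec:.3f} PPV={ppv:.3f}")
```

Note that sensitivity and specificity are conditioned on the medical-record status, whereas PPV is conditioned on the self-report, which is why PPV can be low even when the other two measures are high.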
The κ statistic was calculated to determine the agreement between self-reported EPIC questionnaire data and medical records. A κ value of <0.40 was considered to indicate poor to fair agreement, 0.41–0.60 moderate agreement, 0.61–0.80 substantial agreement and 0.81–1.00 was considered almost perfect agreement.19
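For illustration, Cohen's κ for a 2×2 table can be computed from the observed agreement and the agreement expected by chance from the marginal totals. The sketch below assumes this standard two-rater formulation; the function names and the example counts are hypothetical, and the handling of values falling between the published bands (for example, between 0.40 and 0.41) is an assumption made here.

```python
def cohen_kappa(tp, fp, fn, tn):
    """Cohen's kappa for a 2x2 agreement table (self-report vs records)."""
    n = tp + fp + fn + tn
    observed = (tp + tn) / n
    # Agreement expected by chance, from the row and column totals.
    expected = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / n**2
    return (observed - expected) / (1 - expected)


def agreement_label(kappa):
    """Map kappa to the bands used in the text (boundary handling assumed)."""
    if kappa <= 0.40:
        return "poor to fair"
    if kappa <= 0.60:
        return "moderate"
    if kappa <= 0.80:
        return "substantial"
    return "almost perfect"


# Hypothetical counts, chosen only to illustrate the arithmetic.
kappa = cohen_kappa(tp=40, fp=10, fn=10, tn=940)
print(f"kappa={kappa:.2f} ({agreement_label(kappa)})")
```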
Statistical analysis of data was performed using SPSS V.17.0 software and EPIDAT 3.0.
Results
The sample consisted of 33 554 individuals for the stroke analysis and 41 438 for the AMI analysis. The main socio-demographic and self-reported health characteristics of the subjects in the Spanish EPIC cohort are described in table 1 as a function of whether they did or did not self-report prevalent stroke or AMI. The percentage of men, former smokers, individuals in the older age ranges, those with a low level of education, those with self-reported diabetes, high blood pressure and/or hyperlipidaemia and those who were physically inactive and/or obese was found to be higher among subjects who self-reported prevalent stroke or AMI.
The total numbers of self-reported prevalent CVD cases were 176 for stroke and 277 for AMI. A total of 48 prevalent stroke cases (by sex, 30 men and 18 women; by centre, five in Asturias, 17 in San Sebastian, nine in Navarra and 17 in Murcia) and 172 prevalent AMI cases (by sex, 131 men and 41 women; by centre, 19 in Asturias, 31 in San Sebastian, 23 in Navarra, 69 in Murcia and 30 in Granada) were confirmed in the review of medical records (data not shown).
The results of validity and agreement between self-report information and medical records for stroke and AMI are presented in table 2. For stroke, sensitivity was 81.3%, while the specificity and PPVs were 99.6% and 22.2%, respectively. The corresponding values for AMI were 97.7%, 99.7% and 60.7%. Agreement between self-reported questionnaires and medical records was poor to fair for stroke (κ=0.35) and substantial for AMI (κ=0.75). By sex, men showed higher PPV and κ statistic values and lower sensitivity results compared with women, in the cases of both stroke and AMI (table 3). On the other hand, the specificity values for stroke and AMI were similar in the two sexes.
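As a rough consistency check, the overall figures can be approximately reproduced from totals given in the text (sample sizes, numbers of self-reported cases and numbers of confirmed cases). The true-positive counts are not stated in the text; in the sketch below they are back-calculated from the reported sensitivities, so the reconstructed 2×2 tables are an assumption and should not be read as the contents of table 2. Function and variable names are hypothetical.

```python
def rebuild_table(n, self_reported, confirmed, reported_sensitivity):
    """Back-calculate a 2x2 table; tp is inferred, not taken from the paper."""
    tp = round(reported_sensitivity * confirmed)  # assumption: inferred count
    fp = self_reported - tp
    fn = confirmed - tp
    tn = n - self_reported - fn
    return tp, fp, fn, tn


def measures(tp, fp, fn, tn):
    """Sensitivity, specificity, PPV and Cohen's kappa from 2x2 counts."""
    n = tp + fp + fn + tn
    observed = (tp + tn) / n
    expected = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / n**2
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "kappa": (observed - expected) / (1 - expected),
    }


# Totals as reported in the text; sensitivities as reported in the Results.
reported = {
    "stroke": dict(n=33554, self_reported=176, confirmed=48,
                   reported_sensitivity=0.813),
    "AMI": dict(n=41438, self_reported=277, confirmed=172,
                reported_sensitivity=0.977),
}

results = {name: measures(*rebuild_table(**args))
           for name, args in reported.items()}
for name, m in results.items():
    print(name, {key: round(value, 3) for key, value in m.items()})
```

Under this assumption, the recomputed specificity, PPV and κ agree with the reported values to within rounding, which suggests the reported figures are mutually consistent.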
Discussion
Within the Spanish EPIC cohort study, we analysed validity and agreement between self-reported prevalent stroke and AMI and medical records. The study showed high sensitivity and specificity values for both stroke and AMI, whereas a relatively high PPV was observed for AMI but not for stroke. In addition, substantial agreement was found for AMI though not for stroke. With respect to validity, we found higher PPV and κ statistics and lower sensitivity values in men than in women.
Self-reported health questionnaires continue to be important tools to assess the prevalence and incidence of various diseases, especially in the absence of adequate population-based disease registries. However, when comparing results from different studies, it is important to bear in mind that the characteristics of the study population, gold standard used, information obtained from questionnaires or even the analytical approach vary widely between validation studies, and this fact may contribute to the variability observed in this field.
Despite this heterogeneous methodology, we have identified some common findings between this research and earlier studies. Specifically, in our data, specificity was very high (>99%) for both stroke and AMI, as has already been shown in other studies. However, this is attributable to the large number of non-cases registered. On the other hand, the corresponding sensitivity estimates for self-reported stroke and AMI were also elevated in the EPIC questionnaire compared with other results. O'Mahony et al 11 reported a sensitivity of 95% for stroke while the values reported in Baena-Díez et al 13 were 87% for cerebrovascular disease (stroke and transient ischaemic attack (TIA)) and 97% for AMI. By contrast, the PPVs found in our research were lower than those observed in most of the previous studies published, particularly for stroke (22%). For example, the Tromsø Study9 (Norway) estimated a PPV of 79% for self-reported stroke, a study from Minnesota10 showed a PPV of 67% for stroke (including TIA) and 73% for AMI and a PPV of 63% for self-reported stroke (including TIA) was seen in a study from the UK.11 On the other hand, a relatively low PPV was calculated in a study carried out in a large Japanese cohort, which reported a PPV of 57% and 43% for incident stroke and AMI cases, respectively.12 Furthermore, in a paper from New Zealand,20 a PPV of 27% was reported for self-reported strokes compared with primary diagnoses for hospitalisation.
The lower PPV values obtained in our work (especially for stroke) compared with other studies suggest the presence of a high rate of false-positive responses in the EPIC questionnaire. The question about stroke also mentioned cerebral thrombosis or haemorrhage, and the question about AMI also mentioned a heart attack. This could have induced participants with other cerebrovascular or cardiac diseases to answer these questions affirmatively. Therefore, it is possible that the questionnaire was not sufficiently specific for stroke and AMI. In addition, it has been suggested that patients tend to misunderstand the neurological symptoms of a stroke10 ,21 and also that it may be difficult for them to differentiate a TIA from a stroke. In our data, 6.3% of self-reported strokes turned out to be TIAs, while other studies have shown false-positive rates of 6.7%9 and 14.4%.11 Furthermore, these lower PPVs could be related to the low prevalence of the conditions in our study. However, to our knowledge, the poorer validity results for stroke are also in line with the findings of other studies.10 ,12 It seems necessary to develop more effective methods to improve the accuracy of self-reported surveys. One possible way could be to include in health questionnaires additional information concerning the specific symptoms of the disease of interest.
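The dependence of PPV on prevalence follows from Bayes' theorem: for fixed sensitivity and specificity, PPV falls as the condition becomes rarer. The sketch below illustrates this using the sensitivity and specificity reported for stroke; the first prevalence value is the proportion of confirmed prevalent strokes in the stroke sample (48 of 33 554), while the other two prevalence values and the function name are hypothetical.

```python
def ppv_from_prevalence(sensitivity, specificity, prevalence):
    """PPV implied by Bayes' theorem for a given prevalence."""
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    return true_pos / (true_pos + false_pos)


# Sensitivity and specificity as reported for stroke (rounded values).
SENSITIVITY, SPECIFICITY = 0.813, 0.996

# First value from the text (48 / 33 554); the other two are hypothetical.
prevalences = (48 / 33554, 0.01, 0.05)
ppvs = [ppv_from_prevalence(SENSITIVITY, SPECIFICITY, p) for p in prevalences]
for p, value in zip(prevalences, ppvs):
    print(f"prevalence={p:.4f} -> PPV={value:.3f}")
```

With these inputs, the PPV at the observed prevalence is close to the reported 22%, and it rises steeply at the higher hypothetical prevalences even though sensitivity and specificity are unchanged.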
The agreement between self-reported information gathered in the EPIC questionnaire and medical records was substantial for AMI (κ=0.75) but not for stroke (κ=0.35). Few studies have reported κ values for stroke, or for stroke and TIA combined, but those that did found higher values, 0.710 and 0.82,13 than the figure found in our study.
In the analysis stratified by sex, in contrast with results from other studies, lower sensitivity values were obtained in men than in women for stroke and AMI.12 ,13 However, the better PPV and agreement observed in men than in women for stroke and AMI coincide with the findings of previous studies.12 ,13 The lower PPV in women seems to be related to the low prevalence of stroke and AMI among women.
In Spain, there are few epidemiological data about the validity of self-reported CVDs.13 Therefore, one of the important characteristics of this study is that it estimates the accuracy of self-reported stroke and AMI in a large cohort, 33 554 and 41 438 individuals, respectively. However, despite the study being based on a large cohort, which allows for a long follow-up of a large sample of healthy individuals, it included only a small sample of stroke cases (n=48), and this may have influenced the results obtained for this illness. Another limitation of the study is that hospital discharge databases were not available for hospitalisations that occurred before 1995 (varying by centre). Accordingly, some of the prevalent cases may not have been identified, and this could have led to overestimation of the stroke and AMI sensitivity results. It is also important to consider that even though the κ statistic is generally accepted and used, it has been suggested that it is affected by differences in marginal totals and in the prevalence of the disease.22–24 Nevertheless, the estimation of κ statistics may provide additional information to support the interpretation of the accuracy of the EPIC questionnaire.
Although it has been suggested that the use of medical records as a reference standard is one of the best methods to evaluate the accuracy of self-reported medical information, there is a lack of consensus on this matter. For example, Baena-Díez et al 13 point out that when several objective criteria (among others, biomarkers of necrosis and CT) from the reference standard (medical records) are used in CVD, it is unlikely that the corresponding patients have not really had the CVD reported. Indeed, Okura et al,10 even though they consider it an imperfect method to determine the presence of a disease, concluded that the medical record is the best available reference standard for its study. On the other hand, Carter et al 20 suggested that hospitalisation data may underestimate the prevalence of stroke while self-reported information may overestimate it, concluding that a combination of methods may be a more appropriate way to estimate the prevalence of stroke in the general population.
Conclusions
The results of this study suggest that the validity and agreement of self-reported CVD compared with data in medical records were good for AMI but not for stroke. Therefore, such data should be used with caution in epidemiological studies, especially for stroke.
What is already known on this subject
- Information on accuracy of self-reported cases of stroke and AMI is varied and has also been scarce in Spain.
What this study adds
- In this study, self-reports of prevalent stroke and AMI among the Spanish EPIC cohort showed similar high sensitivity and specificity values but lower PPV and κ statistics values than other studies.
- The usage of health questionnaires together with other sources of information seems to be justified in the identification of prevalent stroke and AMI cases.
Acknowledgments
The authors thank the participants in the Spanish EPIC cohort for their contribution to this study as well as the team of trained nurses who participated in the recruitment. The authors also thank Dr Maite Martínez Zabaleta and Dr Fermín Moreno Izco from Donostia Hospital (Gipuzkoa, Spain) for helping with the classification of the most difficult stroke cases and Dr Jesús Berjón for helping with the classification of the most difficult AMI cases.
References
Footnotes
- Funding This work was supported by the CIBER in Epidemiology and Public Health (AC07_010). The EPIC study received financial support from the European Commission (Agreement SO 97 200302 05F02), the participating Regional Governments, the Red Temática de Investigación Cooperativa de Centros de Cáncer (RTICCC, C03/10) and the International Agency for Research on Cancer (Agreement AEP/93/02).
- Competing interests None.
- Ethics approval The study was approved by a local ethical review board.
- Provenance and peer review Not commissioned; externally peer reviewed.