Objective Updated knowledge on the validity of self-reported myocardial infarction (SMI) and self-reported stroke (SRS) is needed in Norway. Our objective was to compare questionnaire data and hospital discharge data from regions with Sami and Norwegian populations to assess the validity of these outcomes by ethnicity, sex, age and education.
Design Validation study using cross-sectional questionnaire data and hospital discharge data from all Norwegian somatic hospitals.
Participants and setting 16 865 men and women aged 30 and 36–79 years participated in the Population-based Study on Health and Living Conditions in Sami and Norwegian Populations (SAMINOR) 1 Survey in 2003–2004. Information on SMI and SRS was available from self-administered questionnaires for 15 005 and 15 088 of these participants, respectively. We compared this information with hospital discharge data from 1994 until SAMINOR 1 Survey attendance.
Primary and secondary outcomes Sensitivity, specificity, positive predictive value (PPV), negative predictive value and κ.
Results The sensitivity and PPV of SMI were 90.1% and 78.9%, respectively; the PPV increased to 93.1% when all ischaemic heart disease (IHD) diagnoses were included. The SMI prevalence estimate was 2.3% and hospital-based 2.0%. The sensitivity and PPV of SRS were 81.1% and 64.3%, respectively. The SRS prevalence estimate was 1.5% and hospitalisation-based 1.2%. Moderate to no variation was observed in validity according to ethnicity, sex, age and education.
Conclusions The sensitivity and PPV of SMI were high and moderate, respectively; for SRS, both of these measures were moderate. Our results show that SMI from the SAMINOR 1 Survey may be used in aetiological/analytical studies in this population due to a high IHD-specific PPV. The SAMINOR 1 questionnaire may also be used to estimate the prevalence of acute myocardial infarction and acute stroke.
This is an Open Access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited and the use is non-commercial. See: http://creativecommons.org/licenses/by-nc/4.0/
Strengths and limitations of this study
The overall results are considered representative for the general population aged 36–79 years in the SAMINOR 1 region.
Limited selection bias with regard to hospital discharge records.
Imperfect reference standard.
In the absence of registries, epidemiological studies often use questionnaires to obtain information on cardiovascular disease (CVD). Validation studies published over the past decade have used different instruments and reference standards, and have indicated moderate-to-high sensitivity for self-reported myocardial infarction (SMI) (78–98%)1–4 and self-reported stroke (SRS) (69–81%),1–5 whereas reported positive predictive values (PPVs) were lower (SMI: 43–74%;1–4,6,7 SRS: 22–79%1–8).
There have been a few validity and reliability studies of SMI and SRS in Norway, but recent studies are lacking.8–10 Moreover, no studies until now have assessed the validity of these questions in the indigenous Sami population of Norway. We used questionnaire data from the SAMINOR 1 Survey, the first survey of the Population-based Study on Health and Living Conditions in Sami and Norwegian Populations,11 and hospital discharge data from the Cardiovascular Disease in Norway (CVDNOR) project12 to assess the validity of SMI and SRS by ethnicity, sex, age and education.
In 2003–2004, in collaboration with the Norwegian Institute of Public Health, the Centre for Sami Health Research conducted the SAMINOR 1 cross-sectional survey.11 Men and women aged 30 (born 1973–1974) and 36–79 years (born 1925–1968) residing in selected regions11 were identified through the National Registry and invited to participate in the survey, regardless of their ethnic background (N=27 987). Of those invited, 16 865 (60.3%) agreed to participate. Apart from those recruited from the municipality of Alta (n=4741), the population was exclusively rural (ibid.).
Informed consent was obtained from all participants. The design of the SAMINOR 1 Survey changed during the data collection. In the original design, which was used in the predominantly Sami municipalities of Tana, Nesseby, Karasjok and Kautokeino in Finnmark County, a two-page initial questionnaire (Q1) was sent first. Those who replied and stated that they wanted to take part in the screening received an invitation to a clinical examination. The invitation included a three-page screening questionnaire (Q2), which they were asked to fill in and bring to the examination. The clinical examinations took place in two buses travelling between the municipalities. This two-stage design led to very low attendance at the clinical examination. Hence, after the first four municipalities, the two-stage sampling design was dropped. In the rest of the survey, the Q1 and Q2 were combined into a five-page questionnaire, which was sent together with the invitation to the clinical examination. In Finnmark and Troms, the buses returned after 2 or 3 months, and non-responders received a second invitation to the screening. In Nordland and Trøndelag, there was only one screening round and no reminder. Owing to the original design, not all participants provided both the Q1 and the Q2. The data collection is described in detail by Lund et al (ibid.). The questionnaires in Norwegian, Sami and English may be accessed at http://www.saminor.no.
Of the 16 865 participants, 815 did not complete the Q2—which includes the questions on CVD to be validated in this paper—and another 87 did not consent to having their information linked to national registries. Thus, a total of 15 963 participants were eligible for inclusion in this validation study.
Separate analyses were conducted for SMI and SRS; participants were included in both analyses if both events were reported. Excluded were those with missing information on SMI (n=676) and/or SRS (n=720). Participants who did not report their age at the time of myocardial infarction (n=54) and/or stroke (n=46) were also excluded. The CVDNOR project does not have hospital discharge data before 1994, and thus participants with SMI (n=228) and/or SRS (n=109) before this year were also excluded. The total numbers included in the final SMI and SRS analyses were 15 005 and 15 088, respectively.
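The participant flow described above can be checked with simple arithmetic. The sketch below uses only counts stated in the text and assumes that the exclusion groups were applied sequentially, so that no participant is counted in more than one group; the helper name `remaining` is illustrative.

```python
# Participant flow, using only the counts reported in the text.
# Assumption: exclusion groups do not overlap (applied one after another).

PARTICIPANTS = 16_865        # attended the SAMINOR 1 Survey
NO_Q2 = 815                  # did not complete the Q2
NO_LINKAGE_CONSENT = 87      # did not consent to registry linkage


def remaining(start, *exclusions):
    """Return the number left after subtracting every exclusion count."""
    return start - sum(exclusions)


eligible = remaining(PARTICIPANTS, NO_Q2, NO_LINKAGE_CONSENT)

# SMI analysis: missing SMI status, missing age at event, event before 1994.
smi_n = remaining(eligible, 676, 54, 228)

# SRS analysis: missing SRS status, missing age at event, event before 1994.
srs_n = remaining(eligible, 720, 46, 109)

print(eligible, smi_n, srs_n)
```

Under that assumption the counts reproduce the 15 963 eligible participants and the final analysis samples of 15 005 (SMI) and 15 088 (SRS).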
Self-administered questionnaires were used to collect information on disease, ethnicity and education. SMI and SRS were measured by the question: Do you have, or have you had: ‘a myocardial infarction’ and/or ‘a stroke/brain haemorrhage?’ Response options were ‘yes’ or ‘no’, and participants who answered yes were asked to report their age at the time of the first event.
Ethnicity was determined by the questions: what language(s) do/did you, your parents, and your grandparents use at home? And what is your, your father’s and your mother’s ethnic background? The response options for both questions were ‘Norwegian’, ‘Sami’, ‘Kven’ or ‘other’. The respondents also reported whether they considered themselves to be Norwegian, Sami, Kven or other (self-perceived ethnicity). On all the aforementioned questions, multiple answers were allowed, and participants were categorised as Sami if they answered ‘Sami’ to one or more of the questions above. Respondents who were born abroad and answered ‘other’ to questions on ethnicity (n=263) were considered missing values. The remaining participants were categorised as non-Sami.
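The categorisation rule can be sketched as a small function. This is an illustration only: the function name, the input layout (one set of ticked options per question) and the order in which the rules are applied (Sami first, then the born-abroad rule) are assumptions, not details given in the text.

```python
def classify_ethnicity(answers, born_abroad=False):
    """Categorise one respondent as 'Sami', 'non-Sami' or None (missing).

    answers: iterable of sets, one per ethnicity question, each holding the
             options ticked ('Norwegian', 'Sami', 'Kven', 'other').
    born_abroad: hypothetical flag for respondents born outside Norway.
    """
    selected = set()
    for ticked in answers:
        selected |= {option.lower() for option in ticked}

    # 'Sami' on one or more questions is enough to be categorised as Sami.
    if "sami" in selected:
        return "Sami"

    # Assumed reading: born abroad plus an 'other' answer means missing.
    if born_abroad and "other" in selected:
        return None

    return "non-Sami"


print(classify_ethnicity([{"Norwegian"}, {"Norwegian", "Sami"}]))
```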
Education was assessed by the question: how many years of schooling/education have you completed? (including all the years you attended school or have been studying.) Information on age and sex was collected from the National Registry.
Hospital discharge data
Hospital discharge data were taken from the CVDNOR project (http://www.cvdnor.no), which is a collaborative project between the University of Bergen and the Norwegian Knowledge Centre for the Health Services. The CVDNOR project retrieved information on all hospital stays with a primary or secondary diagnosis of CVD or diabetes mellitus (International Classification of Diseases, Ninth Revision (ICD-9) codes 250, 390–459, 745–747; and ICD-10 codes E10–E14, I00–I99, Q20–Q28), as well as all related procedures, from the Patient Administrative Systems (PAS) of all Norwegian somatic hospitals from 1994 (the year from which all hospitals adopted an electronic PAS) through 2009.13 The CVDNOR data contain information on patients’ date and time of hospital admission and discharge, and up to 21 discharge diagnoses.12
The validity of SMI and SRS was assessed against hospital discharge data by linking SAMINOR 1 and CVDNOR data using the unique 11-digit national identification number assigned to each Norwegian resident. We used main and secondary hospital discharge diagnosis codes as reference standards including ICD-9: 410, or ICD-10: I21 or I22 for acute myocardial infarction (AMI); and ICD-9: 430, 431, 434 or 436, or ICD-10: I60, I61, I63 or I64 for cerebral infarction, intracerebral haemorrhage and subarachnoid haemorrhage (acute stroke). SMI and SRS were compared with hospital discharge data from 1994 until SAMINOR 1 Survey attendance.
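As an illustration of how the reference standard could be applied to linked discharge records, the sketch below maps a diagnosis code to AMI or acute stroke using the code lists given above. The function names, the `(code, version)` record layout and the use of three-character prefix matching are assumptions; the CVDNOR data structure is not described at this level of detail in the text.

```python
# Code lists taken from the text; matching on the three-character category
# is an assumption about how subcodes such as I21.9 should be handled.
AMI_PREFIXES = {"icd9": ("410",), "icd10": ("I21", "I22")}
STROKE_PREFIXES = {
    "icd9": ("430", "431", "434", "436"),
    "icd10": ("I60", "I61", "I63", "I64"),
}


def classify_code(code, version):
    """Return 'AMI', 'acute stroke' or None for one discharge diagnosis."""
    normalised = code.strip().upper().replace(".", "")
    if normalised.startswith(AMI_PREFIXES[version]):
        return "AMI"
    if normalised.startswith(STROKE_PREFIXES[version]):
        return "acute stroke"
    return None


def has_event(diagnoses, event):
    """True if any main or secondary diagnosis matches the given event.

    diagnoses: iterable of (code, version) pairs for one participant.
    """
    return any(classify_code(code, version) == event
               for code, version in diagnoses)


stays = [("I20.0", "icd10"), ("I21.9", "icd10")]
print(has_event(stays, "AMI"), has_event(stays, "acute stroke"))
```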
SMI-specific and SRS-specific analyses were performed using STATA V.14.1 (StataCorp., College Station, Texas, USA). Sensitivity was the proportion of people hospitalised for CVD who self-reported CVD, a/(a+c). Specificity was the proportion of non-hospitalised people who did not report CVD, d/(d+b). PPV was the proportion of people who reported CVD who were hospitalised for CVD, a/(a+b). Negative predictive value (NPV) was the proportion of people who did not report CVD who were not hospitalised for CVD, d/(c+d) (figure 1).
The prevalence of self-reported CVD was calculated using this formula: (a+b)/(a+b+c+d). The prevalence of CVD based on hospital data was calculated by the following formula: (a+c)/(a+b+c+d).
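The formulas above can be written out as a short function over the four cells of the 2×2 table. In the example, a=273 and b=73 follow from the 346 SMIs of which 73 could not be verified; c=30 and d=14 629 are not stated in the text and are back-calculated here from the reported sensitivity (90.1%) and the sample size (15 005), so they should be read as an assumption rather than as the published table.

```python
def validity_measures(a, b, c, d):
    """Validity measures from a 2x2 table.

    a: self-reported and hospitalised      b: self-reported, not hospitalised
    c: not self-reported but hospitalised  d: neither
    """
    n = a + b + c + d
    return {
        "sensitivity": a / (a + c),
        "specificity": d / (d + b),
        "ppv": a / (a + b),
        "npv": d / (c + d),
        "prevalence_self_report": (a + b) / n,
        "prevalence_hospital": (a + c) / n,
    }


# SMI example; c and d are back-calculated assumptions (see lead-in).
smi = validity_measures(a=273, b=73, c=30, d=14_629)
for name, value in smi.items():
    print(f"{name}: {100 * value:.1f}%")
```

With these counts the sketch returns the sensitivity, PPV and both prevalence estimates reported for SMI in the total sample.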
Since the sensitivity, specificity, NPV and PPV are proportions, statistical methods based on the binomial distribution were used. The 95% CIs are exact binomial (Clopper-Pearson) intervals and were computed using the ‘diagt’ command in STATA. Owing to the large groups, Pearson's χ² test was used for comparisons.14
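For readers without access to STATA, an exact binomial (Clopper-Pearson) interval can be sketched in pure Python by inverting the binomial distribution with bisection. This is not the ‘diagt’ implementation; the function names are illustrative, and the example counts (273 of 303) are the back-calculated SMI sensitivity counts, which are an assumption.

```python
import math


def _log_pmf(k, n, p):
    """Log of the binomial probability P(X = k) for 0 < p < 1."""
    return (math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
            + k * math.log(p) + (n - k) * math.log1p(-p))


def _cdf(k, n, p):
    """Binomial P(X <= k); decreasing in p for fixed k < n."""
    if p <= 0.0:
        return 1.0
    if p >= 1.0:
        return 1.0 if k >= n else 0.0
    return min(1.0, sum(math.exp(_log_pmf(i, n, p)) for i in range(k + 1)))


def _solve_decreasing(func, target, iterations=100):
    """Bisection on [0, 1] for func(p) = target, with func decreasing in p."""
    lo, hi = 0.0, 1.0
    for _ in range(iterations):
        mid = (lo + hi) / 2
        if func(mid) > target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def clopper_pearson(k, n, alpha=0.05):
    """Exact two-sided (1 - alpha) interval for k successes out of n."""
    # Lower limit: P(X >= k | p) = alpha/2, i.e. P(X <= k-1 | p) = 1 - alpha/2.
    lower = 0.0 if k == 0 else _solve_decreasing(
        lambda p: _cdf(k - 1, n, p), 1 - alpha / 2)
    # Upper limit: P(X <= k | p) = alpha/2.
    upper = 1.0 if k == n else _solve_decreasing(
        lambda p: _cdf(k, n, p), alpha / 2)
    return lower, upper


low, high = clopper_pearson(273, 303)
print(f"sensitivity 95% CI: {100 * low:.1f}% to {100 * high:.1f}%")
```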
Cohen's κ was calculated to determine the agreement between self-reported information in the SAMINOR 1 questionnaire and hospital discharge data from CVDNOR. The ‘kapci’ command in STATA was used. Differences in κ statistics were tested using the following test statistic:15
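Cohen's κ for a 2×2 table, and one common way of comparing two independent κ estimates, can be sketched as follows. The z statistic shown is a standard large-sample form and is an assumption; it is not necessarily the statistic taken from reference 15. The cell counts are the same back-calculated values used for illustration elsewhere (a and b from the reported numbers of self-reports, c and d inferred from the reported sensitivities and sample sizes), and the standard errors in the last line are hypothetical.

```python
import math


def cohen_kappa(a, b, c, d):
    """Cohen's kappa for agreement between self-report and discharge data.

    a: both positive, b: self-report only, c: hospital only, d: both negative.
    """
    n = a + b + c + d
    p_observed = (a + d) / n
    p_expected = ((a + b) * (a + c) + (c + d) * (b + d)) / n ** 2
    return (p_observed - p_expected) / (1 - p_expected)


def kappa_difference_z(kappa1, se1, kappa2, se2):
    """Assumed z statistic for two independent kappas with standard errors."""
    return (kappa1 - kappa2) / math.sqrt(se1 ** 2 + se2 ** 2)


kappa_smi = cohen_kappa(273, 73, 30, 14_629)   # back-calculated counts
kappa_srs = cohen_kappa(142, 79, 33, 14_834)   # back-calculated counts
print(f"{kappa_smi:.2f} {kappa_srs:.2f}")

# Hypothetical standard errors, for illustration only.
print(kappa_difference_z(0.80, 0.03, 0.70, 0.04))
```

The two κ values agree, after rounding, with the 0.84 and 0.71 reported for the total sample.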
An alternative categorisation of ethnicity was assessed by comparing those who reported speaking the Sami language at home with those who did not. Potential selection bias was assessed by checking how many of those with SMI or SRS and no information on age at the time of the event were found in the hospital discharge data. We also checked how many of those with missing self-reported disease status were found with a relevant hospital discharge code.
Non-responders tended to be male, younger, non-married and from Nordland County (table 1).
Poorer response to Q2 was observed in Sami compared with non-Sami; this was due to initial design issues in predominantly Sami municipalities in Finnmark.11
In the total sample, the sensitivity and PPV for SMI were 90.1% and 78.9%, respectively; specificity and NPV were >99.4%. The prevalence of SMI was 2.3%, while the hospital discharge data showed a prevalence of 2.0%. The corresponding κ value was 0.84 (table 2). In sensitivity analyses, only 3 of the 676 participants excluded due to missing self-reported disease status were found to have had an AMI in the hospital discharge data, as were 16 of the 54 participants who failed to report their age at the time of the event (data not shown).
Owing to the small numbers of SMI and hospital discharge events, precision is lacking, and only very large differences in sensitivity and PPV with regard to ethnicity, sex, age and education would have been statistically significant in our study. Higher sensitivity (96.6% vs 87.4%) was observed in those aged ≤59 years compared with those aged ≥60 years (p<0.05). Small but statistically significant differences (p<0.05) were found for sex, age and education in terms of specificity and NPV, which were consistently ≥99.0%. No significant ethnic variation in the validity of SMI was observed (table 2).
In the total sample, the sensitivity and PPV for SRS were 81.1% and 64.3%, respectively; specificity and NPV were >99.4%. The prevalence of SRS was 1.5%, while the hospital discharge data gave a prevalence of 1.2%. The κ was 0.71 (table 3). Sensitivity analyses showed that only 4 of the 720 participants excluded due to missing SRS status had an acute stroke event according to the hospital discharge data (data not shown). Furthermore, only 10 of the 46 participants who failed to report their age at disease onset had a relevant discharge code (data not shown).
As with SMI, statistical power and precision are low when comparing the sensitivity, PPV and κ between categories of the selected demographic variables. A substantially higher PPV (81.8% vs 59.1%) was found among participants with ≥12 years compared with 0–11 years of education (p<0.05). Small but statistically significant (p<0.05) differences were observed for age and education in terms of specificity and NPV, as well as for sex in terms of NPV; all were >99.1%. Statistically significantly (p<0.05) stronger κ coefficients were observed in participants aged 30–59 years compared with those aged 60–79 years (0.76 vs 0.68), and for education ≥12 years compared with 0–11 years (0.83 vs 0.67) (table 3).
Of the 346 SMIs, 73 could not be verified in the hospital discharge data. However, 43 of these (12% of all SMIs) had a discharge code of angina pectoris (ICD-9: 411, 413; ICD-10: I20), and 6 (2% of all SMIs) had a discharge code of ischaemic heart disease (IHD) (ICD-9: 412, 414; ICD-10: I24–I25) other than AMI and angina pectoris (table 4). This gave an IHD-specific PPV of 93.1%.
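The step from the AMI-specific PPV to the IHD-specific PPV is simple arithmetic on the counts given above, sketched here with illustrative variable names.

```python
# Counts reported in the text for self-reported myocardial infarction (SMI).
SMI_TOTAL = 346      # participants reporting a myocardial infarction
UNVERIFIED = 73      # no AMI discharge code found
ANGINA = 43          # of the unverified: angina pectoris code
OTHER_IHD = 6        # of the unverified: other IHD code

ami_confirmed = SMI_TOTAL - UNVERIFIED

# PPV against AMI codes only.
ppv_ami = ami_confirmed / SMI_TOTAL

# PPV when any IHD discharge code counts as confirmation.
ppv_ihd = (ami_confirmed + ANGINA + OTHER_IHD) / SMI_TOTAL

print(f"AMI-specific PPV: {100 * ppv_ami:.1f}%")
print(f"IHD-specific PPV: {100 * ppv_ihd:.1f}%")
```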
Among the 221 participants with SRS, we were unable to find a relevant stroke discharge code for 79; among these, we found that 7 (3% of all SRSs) and 12 (5% of all SRSs) had discharge codes referring to a transient ischaemic attack (TIA) (ICD-9: 435.9; ICD-10: G45.9) or other cerebrovascular diseases (ICD-9: 432–433, 435, 437–438; ICD-10: I65–I69), respectively (table 4). None had a code referring to intracranial injury (850–854, S06) or amaurosis fugax (362.34, G45.3) (data not shown).
The alternative definition of ethnicity did not affect results.
The sensitivity (90.1%) and PPV (78.9%) of SMI were high and moderate, respectively; the PPV increased to 93.1% when all IHD diagnoses were considered. For SRS, both of these measures were moderate (81.1% and 64.3%). The self-reported prevalence estimates were almost identical to those based on hospital discharge data for both SMI and SRS. Moderate to no variation was observed in validity according to ethnicity, sex, age and education.
The validity of a questionnaire may vary across populations, time periods and social circumstances.16 Furthermore, when comparing results from validation studies, it is important to consider the reference standard used, and which information was obtained from questionnaires. Indeed, these variables vary widely between studies, which may contribute to the disparity that has been observed in this field of research.1 Nevertheless, we have identified some common findings between our study and recent studies based on population-based survey data that included both sexes and middle-aged and older participants.1–8
A typical AMI is easily recalled, as it usually involves severe pain and hospitalisation, which enhances the validity of SMI.3 Only a few studies have provided κ values for SMI, but the agreement they reported1,3,4 was lower than that found in our study (κ=0.84). However, κ measurements may be affected by marginal totals and disease prevalence.17,18 While the corresponding sensitivity estimate for SMI in our study was high (90.1%), which is in accordance with recent literature,1–4 our PPV of 78.9% was higher than what has been reported previously.1–4,6,7 The high specificities and NPVs in our total sample (>99.4%) are also in line with earlier studies.
Twelve per cent of all participants with SMI had a discharge code indicating angina pectoris, not AMI. Norwegian hospitals adopted the use of troponin during 1999–2001.19 The troponin method is more sensitive than earlier methods and has probably increased the rate of AMI diagnoses at the expense of angina pectoris. Some patients may have been told that an acute heart attack diagnosed as angina during 1994–2001 could be an AMI according to the new diagnostics, thus leading to a subsequent false-positive SMI. In an Australian study, Barr et al6 found that 18% of all SMIs were in fact unstable angina as per World Health Organization (WHO)/Multinational MONItoring of trends and determinants in CArdiovascular disease (MONICA) criteria.20 In most instances, IHDs share an underlying pathophysiological condition (atherosclerosis); hence, one may conclude that the high IHD-specific PPV of 93.1% in our study justifies the use of the SAMINOR 1 questionnaire in aetiological/analytical studies of AMI in this population. This study also showed that the SAMINOR 1 questionnaire is valid in terms of estimating the prevalence of AMI.
Similar to AMI, a stroke often has an abrupt onset and causes impairment leading to hospitalisation. The sensitivity (81.1%), PPV (64.3%) and κ (0.71) estimates for SRS in this study were in the higher range of what has been previously found.1–8 Three per cent and five per cent of those reporting a stroke had been hospitalised not for acute stroke but for TIA or other cerebrovascular diseases, respectively. Engstad et al8 and Machón et al1 found that about 6% of SRSs were in reality TIAs. We observed a large proportion of false-positive self-reports. Given this and the fact that ischaemic and haemorrhagic strokes have different risk factor patterns,21 one may conclude that the use of SRS is inappropriate in aetiological/analytical studies. However, owing to the low false-negative rate, the variable is valid in terms of estimating the prevalence of acute stroke in this population.
Overall, no significant ethnic differences were observed with regard to the validity of SMI or SRS. We found no papers addressing the validity of questionnaire-assessed CVD in indigenous peoples.
Machón et al1 found lower sensitivity and higher PPV in men compared with women with regard to both SMI and SRS. Lower PPV for SMI in women was observed by Yamagishi et al.2 Engstad et al8 found higher PPV for SRS in men than in women. We found no evidence of this. As mentioned earlier, however, owing to the small numbers of SMI, SRS and hospital discharge events, only very large differences would have been statistically significant in our study. Age ≤59 years was associated with increased sensitivity of SMI and of SRS in our study. Engstad et al8 found a borderline significantly higher PPV for SRS (83% vs 73%) in those aged >60 years. In the study by Yamagishi et al,2 age <65 years compared with age ≥65 years was associated with increased validity for SMI and SRS. Poorer validity in older age may be due to age-related memory loss. For SRS, stroke-induced cognitive impairment may also be a plausible explanation.
Older age in this population was related to fewer years of education, as education opportunities in rural Norway developed gradually after World War II. Increased schooling may be related to the ability to understand health-related information22 and may potentially influence the ability to correctly report the presence or absence of ill health. SRS rendered higher PPV among those with ≥12 years of education. Few studies have examined the influence of education on the validity of SMI and SRS. Engstad et al8 found that higher education did not influence the PPV of SRS in the Tromsø study. Similar educational attainment in Sami and non-Sami23 may be a possible explanation for some of the ethnic homogeneity observed in this study.
The relatively large study sample enabled an in-depth analysis of the validity of SMI and SRS. CVDNOR covers the whole population of Norway; thus, there is limited selection bias with regard to hospital discharge records in this study. The total results are considered representative for the general population aged 36–79 years in the SAMINOR 1 region (see below).
Overall, non-responders in SAMINOR 1 were more likely to be male, of younger age (especially 30 years),11 from Nordland County and single (table 1). This is in line with previous population-based studies in Norway that have shown poorer participation among the ill, men and younger age groups.24 The results specific to sex and age in this study are thus somewhat uncertain; we believe, however, that the differences in response rates are not large enough to affect the overall results.
Furthermore, owing to initial design issues, poorer response to the Q2 was observed in certain municipalities dominated by people of Sami ethnicity. However, the response rates did not differ greatly, so this has probably not affected the ethnicity-specific results or, hence, the overall results.
Some AMIs may have occurred without symptoms, leading to false-negative self-reporting and an absence of a registered AMI in the hospital discharge data. Furthermore, an AMI could be diagnosed without leading to hospitalisation, resulting in a true-positive SMI not being confirmed in hospital discharge data. However, we believe that this is not a major problem in this study, as hospitalisation among patients with AMI has been common for decades in Norway. In a forthcoming Norwegian study, a PPV of ∼95% was found for hospital admissions with a main or secondary diagnosis of AMI (ICD-10: I21 and I22) when compared with adjudication of medical records (Ragna Elise Støre Govatsmark, personal communication, 2016). This suggests that our reference standard is more or less correct.
A recent Norwegian study reported a PPV of 79.7% for hospital admissions with a main or secondary diagnosis of acute stroke (ICD-10: I61, I63 and I64) when compared with adjudication of medical records; 65% of the false-positives were defined as previous strokes, rehabilitations after stroke or old strokes not previously diagnosed.25 In the past, however, patients with ischaemic stroke were not consistently admitted to hospitals for exact diagnosis and specific treatment, although this practice has become more standard in the past decade.26 Thus, stroke may have been diagnosed in primary care alone, particularly during the first decade of the CVDNOR database (1994–2004). In a community study from Norway in the late 1990s, Ellekjær et al27 found that 41 of the 430 patients with incident stroke were not hospitalised. They also found that the sensitivity and PPV of hospital discharge ICD-9 codes 430, 431, 434 and 436 were 81% and 68%, respectively, when compared with a register also including non-hospitalised cases. Consequently, the validity of SRS in this study may be either overestimated or underestimated. Non-hospitalised strokes are likely to be milder than hospitalised cases. We may thus assume that by missing potentially milder stroke cases from the reference standard in our study, we could be overestimating the sensitivity of SRS; milder cases might be reported less accurately by participants, either because they are less likely to be recalled, or because they are more easily confused with TIA. If milder cases, however, are reported correctly, the PPV of SRS may be underestimated. This suggests that hospital discharge data are an imperfect reference standard with regard to SRS in our study.
Participants with SMI (n=54) and/or SRS (n=46) who did not specify their age at disease onset were excluded from the analyses, which may have introduced selection bias. However, if people failed to recall their age at disease onset, it is likely that in reality the event dates back years or even decades. Cases before 1994 could not be validated using CVDNOR data. Nonetheless, we found that 16 of those with SMI, and 10 of those with SRS who did not specify their age at disease onset, had been hospitalised with an AMI or an acute stroke during the study period. If they had been included in the analyses as true positives, the sensitivity and PPV would be even higher.
The sensitivity and PPV of SMI were high and moderate, respectively; for SRS, both of these measures were moderate. Our results show that SMI from the SAMINOR 1 Survey may be used in aetiological/analytical studies in this population due to a high IHD-specific PPV. The SAMINOR 1 questionnaire may also be used to estimate the prevalence of AMI and acute stroke.
The authors thank the participants in the SAMINOR 1 Survey for their contribution. They also thank Ellisiv Mathiesen and Ragna Elise S Govatsmark for their insightful comments on an earlier version of this paper. They would also like to express their gratitude to Tomislav Dimoski at the Norwegian Knowledge Centre for the Health Services, Oslo, Norway for his contribution by developing the software necessary for obtaining data from Norwegian hospitals, conducting the data collection and quality assurance of data in this project.
Contributors SG-I introduced the idea for the study. B-ME performed the statistical analyses and drafted the manuscript. MM performed the linkage of data sources and assisted with the statistical analyses. MM, SG-I, KBB, TB, ARB and GST helped draft the manuscript.
Funding The SAMINOR 1 Survey was financed by the Ministry of Health and Care services. CVDNOR received funding from Nasjonalforeningen for folkehelsen. The North Norwegian Regional Health Authority provides B-ME's postdoctoral research grant.
Competing interests None declared.
Ethics approval The SAMINOR 1 Survey was approved by the Regional Committee for Medical Research Ethics, region North (REC North) and the Norwegian Data Protection Authority. CVDNOR as a project was approved by the Regional Committee for Medical and Health Research Ethics, region west (REC west). This validation study was approved by REC North.
Provenance and peer review Not commissioned; externally peer reviewed.
Data sharing statement No additional data are available.