How large should a cause of death be in order to be included in mortality trend analysis? Deriving a cut-off point from retrospective trend analyses in 21 European countries

Objectives The International Classification of Diseases (ICD-10) distinguishes a large number of causes of death (CODs) that could each be studied individually when monitoring time-trends. We aimed to develop recommendations for using the size of CODs as a criterion for their inclusion in long-term trend analysis.
Design Retrospective trend analysis.
Setting 21 European countries of the WHO Mortality Database.
Participants Deaths from CODs (3-position ICD-10 codes) with ≥5 average annual deaths in a 15-year period between 2000 and 2016.
Primary and secondary outcome measures Fitting polynomial regression models, we examined for each COD in each country whether or not changes over time were statistically significant (with α=0.05) and we assessed correlates of this outcome. Applying receiver operating characteristic (ROC) curve diagnostics, we derived COD size thresholds for selecting CODs for trend analysis.
Results Across all countries, 64.0% of CODs had significant long-term trends. The odds of having a significant trend increased by 18% for every 10% increase of COD size. The independent effect of country was negligible. As compared to circulatory system diseases, the probability of a significant trend was lower for neoplasms and digestive system diseases, and higher for infectious diseases, mental diseases and signs-and-symptoms. We derived a general threshold of around 30 (range: 28–33) annual deaths for inclusion of a COD in trend analysis. The relevant threshold for neoplasms was around 65 (range: 61–70) and for infectious diseases was 20 (range: 19–20).
Conclusions The likelihood that long-term trends are detected with statistical significance is strongly related to COD size and varies between ICD-10 chapters, but has no independent relation to country. We recommend a general size criterion of 30 annual deaths to select CODs for long-term mortality-trends analysis in European countries.

Strengths and limitations of this study
► The first study to develop a criterion to select causes of death for monitoring purposes based on their annual number of deaths.
► The analysis of a large sample of causes of death covering most European countries, using the WHO Mortality Database.
► Criteria for selection of causes were derived for different types of causes of death.
► Other criteria were not applied, such as causes of death that involve high healthcare costs or that are potentially modifiable.


INTRODUCTION
Mortality data are essential for monitoring population-wide trends in a large number of diseases and injuries, as well as for evaluating health policies. A common source of these data is the statistics maintained by national statistical offices. 1 2 National statistics of causes of death (CODs) include many codes of the 10th revision of the International Classification of Diseases (ICD-10 codes). 3 Given the detail of this classification (there are 1752 three-position ICD-10 codes), part of it may not be instrumental for monitoring long-term time-trends because of the small number of deaths for specific codes. When using these statistics to monitor long-term trends in mortality, a main question is which of the many possible CODs to include. At the very least, the selection should include only CODs that are large enough to have a reasonable probability of detecting a long-term mortality trend. This probability may be influenced by several factors. One main factor is the COD size, defined as the mean annual number of deaths, which expresses the rarity of a disease or condition that is selected as underlying COD in a population. Incidence changes or effects of interventions are common factors discussed in mortality trends analyses. 4 5 In addition, this probability might depend on other factors, such as the type of COD or the country of interest. Certain types of CODs may be more likely to present a long-term trend. For example, neoplasms have been shown to be more gradual in their annual changes, 6 whereas infectious diseases 7 may have high year-to-year variation. As regards different populations, the likelihood of detecting a long-term trend for a COD may vary between countries because of differences in population size, COD coding practices that may also influence observed mortality trends, 8 trends in prevalence of risk factors, [9][10][11][12] implementation of new prevention strategies, 13 14 treatment protocols 5 or healthcare reforms. 15
Because the likelihood of detecting a long-term trend of a COD may depend on various factors, there is a need for an empirical assessment of this likelihood. Such an analysis may provide an empirical basis for identifying CODs for which long-term trends are likely to be detectable. More specifically, it may be used to define a criterion, or rule of thumb, that identifies eligible CODs in terms of a minimum COD size. When such a criterion allows for variation by COD type and country, it may be used in national and international trend analysis across a broad range of CODs.
The general objective of this study was to determine a COD size criterion for the study of long-term mortality trends in European countries. The specific objectives were: (1) to assess the association between the size and the type of a COD and the probability of detecting a long-term trend in European countries, (2) to assess how this association varies according to country and (3) to identify a minimum annual number of deaths recommended to monitor trends in cause-specific mortality.

METHODS
Data
We used annual mortality data for 21 European countries of the WHO Mortality Database (1 October 2017 update). 16 We included the 21 countries of the European Union (28 countries) or the European Free Trade Association (4 countries) that had been using ICD-10 (3 or 4 position) coding for at least 15 consecutive years. Iceland, Luxembourg and Malta were excluded because of their small populations. 17 The most recent 15-year period was selected, which was 2001-2015 for all countries with a few exceptions (Belgium, France and Switzerland: 2000-2014; Austria: 2002-2016). If the time series of a COD in a country was interrupted by a year without any data on that COD, we assumed that zero deaths occurred in that year.

Statistical analysis
For each year and COD in a country, we calculated an age-standardised count of deaths using the direct method. As reference population, we used the age-distribution of the European Standard Population 2013, scaled to the mid-period population of each country. This method was intended to compensate for annual changes in the age-distribution of the population, while keeping the age-standardised count close to the observed absolute numbers.
We restricted further analysis to CODs that had at least five age-standardised annual deaths on average, because most of the smaller CODs had predominantly zero or only zero annual deaths.
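The standardisation and the size filter described above can be sketched as follows. This is a minimal Python illustration rather than the authors' R code: the function names, the three age bands and the weights are hypothetical, and a real analysis would use the full European Standard Population 2013 age-distribution.

```python
def age_standardised_count(deaths, population, std_weights, midperiod_total):
    """Directly standardised number of deaths for one COD, country and year.

    deaths, population: observed deaths and person-years per age group.
    std_weights: reference age-distribution (any positive scale).
    midperiod_total: total mid-period population of the country, used to
    turn the standardised rate back into a count.
    """
    if not (len(deaths) == len(population) == len(std_weights)):
        raise ValueError("all inputs must cover the same age groups")
    total_weight = sum(std_weights)
    # Weighted average of the age-specific death rates.
    standardised_rate = sum(
        (w / total_weight) * (d / p)
        for d, p, w in zip(deaths, population, std_weights)
    )
    return standardised_rate * midperiod_total


def is_eligible(annual_counts, minimum_mean=5.0):
    """Keep a COD only if its mean annual standardised count reaches the minimum."""
    return sum(annual_counts) / len(annual_counts) >= minimum_mean


# Toy example with three hypothetical age bands.
count = age_standardised_count(
    deaths=[2, 10, 40],
    population=[500_000, 300_000, 200_000],
    std_weights=[0.5, 0.3, 0.2],
    midperiod_total=1_000_000,
)
```

In this toy example the reference weights equal the observed population shares, so the standardised count equals the observed 52 deaths. When the age structure of a country drifts away from the reference, the two numbers diverge.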
Long-term time-trends of the age-standardised count of deaths of each COD in each country were analysed using ordinary least squares regression (OLS) models. Trends were fitted by applying linear regression models with polynomial terms of year as continuous, independent covariates. 18 We used orthogonal polynomials in order to account for multicollinearity of the polynomial components. 19 We fitted four models: the constant, the linear, the quadratic and the cubic model (with zero, first, second and third degree polynomials, respectively). The four models were applied for all CODs in each country. We used the lowest corrected Akaike Information Criterion to select the best model for each COD in each country. 20 In a next step, the best model was compared with the constant model using the F-test, at the significance level of α=0.05. If the best model performed better than the constant model with statistical significance, it was kept as the final best model. Otherwise, the constant model was selected as the best model for this COD. In the rest of the paper, the constant model is referred to as the absence of a demonstrable trend.
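The model selection procedure above can be illustrated with a short Python sketch (not the authors' R code). Several details are assumptions: the orthogonal polynomials are built by Gram-Schmidt orthogonalisation, the corrected AIC uses the Gaussian least-squares form with the error variance counted as a parameter, and the F-test is carried out by comparing the statistic with tabulated 5% critical values, which restricts the sketch to 15-year series. The function names are hypothetical.

```python
import math

# Upper 5% critical values of F(d, 15 - d - 1) for d = 1, 2, 3, taken from
# standard F tables. They apply to 15-point series only (assumption of this sketch).
F_CRIT_N15 = {1: 4.667, 2: 3.885, 3: 3.587}


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def orthogonal_basis(years, max_degree=3):
    """Gram-Schmidt orthogonalisation of 1, t, t^2, t^3 over the observed years."""
    centre = sum(years) / len(years)
    t = [y - centre for y in years]
    basis = []
    for degree in range(max_degree + 1):
        column = [ti ** degree for ti in t]
        for b in basis:
            coef = dot(column, b) / dot(b, b)
            column = [c - coef * bi for c, bi in zip(column, b)]
        basis.append(column)
    return basis


def select_trend(years, counts):
    """Return 0 (no demonstrable trend) or the degree (1-3) of the selected model."""
    n = len(counts)
    if n != 15 or len(years) != 15:
        raise ValueError("this sketch tabulates F critical values for 15 years only")
    # Residual sum of squares after adding each orthogonal term in turn:
    # rss[0] is the constant model, rss[3] the cubic model.
    residual = [float(c) for c in counts]
    rss = []
    for b in orthogonal_basis(years):
        coef = dot(residual, b) / dot(b, b)
        residual = [r - coef * bi for r, bi in zip(residual, b)]
        rss.append(max(dot(residual, residual), 1e-12))

    def aicc(degree):
        k = degree + 2  # polynomial coefficients plus the error variance
        return n * math.log(rss[degree] / n) + 2 * k + 2 * k * (k + 1) / (n - k - 1)

    best = min(range(4), key=aicc)
    if best == 0:
        return 0
    # F-test of the best model against the constant model at alpha = 0.05.
    f_stat = ((rss[0] - rss[best]) / best) / (rss[best] / (n - best - 1))
    return best if f_stat > F_CRIT_N15[best] else 0
```

Because the basis is orthogonal, each term can be fitted by a simple projection of the current residuals, which avoids solving a collinear system. A return value of 0 corresponds to what the paper calls the absence of a demonstrable trend.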
Next, using a multilevel logistic regression model, we determined how the categorisation of a COD as having a statistically significant trend (ie, best model being the linear, quadratic or cubic model) was related to COD size and COD type. These variables were included in the model as fixed effects. The COD size was defined as the mean annual number of deaths and the COD type was defined as the ICD-10 chapter in which it is classified. The chapter of circulatory diseases was the reference category, as it had the largest number of deaths. As the distribution of the number of deaths across CODs was highly skewed, we used its natural logarithm as a measure of COD size. The model also included country as a random effect, in order to investigate the variation between European countries in the likelihood of detecting a long-term trend. We calculated the Intraclass Correlation Coefficient (ICC), which expresses the proportion of the variance in the outcome that is attributable to variation between countries. 21 The ICC was calculated both with and without controlling for the fixed effects of the size and type of the COD.
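The text does not state which ICC definition was used for the logistic model. One common choice for a random-intercept logistic model, assumed here purely for illustration, is the latent-variable definition, in which the level-1 residual variance is fixed at π²/3. A minimal Python sketch with hypothetical function names:

```python
import math

# Variance of the standard logistic distribution, used as the level-1
# residual variance under the latent-variable definition (an assumption).
LOGISTIC_RESIDUAL_VARIANCE = math.pi ** 2 / 3


def icc_latent(country_variance):
    """ICC from the variance of the country-level random intercept."""
    if country_variance < 0:
        raise ValueError("a variance cannot be negative")
    return country_variance / (country_variance + LOGISTIC_RESIDUAL_VARIANCE)


def country_variance_from_icc(icc):
    """Inverse relation: random-intercept variance implied by a given ICC."""
    if not 0 <= icc < 1:
        raise ValueError("ICC must lie in [0, 1)")
    return icc * LOGISTIC_RESIDUAL_VARIANCE / (1 - icc)
```

Under this definition, an ICC of 0.013 would correspond to a country-level variance of roughly 0.043 on the logit scale, which gives a sense of how little of the variation sits at the country level.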
Finally, we used receiver operating characteristic (ROC) curve diagnostics [22][23][24] to derive COD size thresholds for detecting a long-term time-trend. We calculated the Area Under the Curve (AUC) of the logistic model with COD size as the predictor and the binary categorisation of a COD as having a significant long-term time-trend as the outcome. We derived the COD size thresholds using three indices. First, we used the maximum Youden index [25][26][27] which represents the point of the ROC curve with the maximum sum of sensitivity (se) and specificity (sp). Second, we used the index measuring the minimum difference between sensitivity and specificity. 23 Third, we estimated the index that represents the point closest to the top-left part of the ROC curve. 22 26 All analyses were conducted using R statistical software V.3.5.1. 28
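The three threshold indices can be sketched in a few lines of Python (again, not the authors' R code). The sketch assumes the decision rule "select a COD if its size is at least the threshold" and evaluates every observed size as a candidate threshold. The data are made up and the function names are hypothetical.

```python
import math


def roc_points(sizes, has_trend):
    """Return (threshold, sensitivity, specificity) for each observed size."""
    positives = sum(1 for y in has_trend if y)
    negatives = len(has_trend) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("need CODs both with and without a significant trend")
    points = []
    for threshold in sorted(set(sizes)):
        true_pos = sum(1 for s, y in zip(sizes, has_trend) if y and s >= threshold)
        true_neg = sum(1 for s, y in zip(sizes, has_trend) if not y and s < threshold)
        points.append((threshold, true_pos / positives, true_neg / negatives))
    return points


def optimal_thresholds(sizes, has_trend):
    """COD size thresholds under the three indices described in the text."""
    points = roc_points(sizes, has_trend)
    # Maximum Youden index: largest sum of sensitivity and specificity.
    youden = max(points, key=lambda p: p[1] + p[2])
    # Minimum absolute difference between sensitivity and specificity.
    balanced = min(points, key=lambda p: abs(p[1] - p[2]))
    # Point closest to the top-left corner (se = 1, sp = 1) of the ROC plot.
    top_left = min(points, key=lambda p: math.hypot(1 - p[1], 1 - p[2]))
    return {
        "youden": youden[0],
        "min_difference": balanced[0],
        "closest_top_left": top_left[0],
    }


# Made-up example: mean annual deaths and whether a trend was detected.
toy_sizes = [6, 8, 12, 20, 25, 30, 40, 60, 90, 150]
toy_trend = [0, 0, 1, 0, 0, 1, 1, 1, 1, 1]
toy_result = optimal_thresholds(toy_sizes, toy_trend)
```

For this toy data set the Youden and closest-top-left indices both select 30, whereas the minimum-difference index selects 25. This shows how the three criteria can give nearby but not identical thresholds.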

Patient and public involvement
No patients were involved in this study.

RESULTS
The number of CODs with at least five annual deaths on average varied between 202 (Estonia) and 791 (Germany) (table 1). Of these CODs, 32.6%, 20.2% and 11.2% had a significant trend following a linear, quadratic or cubic model, respectively. The percentage of CODs with no significant trend (ie, constant model) varied from 27.5% to 43.9%, and was highest in the Nordic countries, Switzerland and Slovenia. More detailed information on the best model for each COD in each European country can be found in an additional file (online supplementary resource 1).
Both COD size and COD type were significantly associated with the likelihood of having a significant long-term trend (p<0.001) (table 2). For every 10% increase in the COD size, we observed an 18% increase (1.1^1.73−1≈0.18) in the odds of having a significant trend (OR=1.73, 95% CI=1.67 to 1.79). Regarding the COD type, neoplasms and digestive system diseases had a lower probability of a detectable trend than circulatory system diseases. On the other hand, this probability was higher for infectious diseases, mental diseases, and signs and symptoms. Figure 1 shows for each COD chapter in each country the estimated probability of having a significant long-term trend in relation to COD size. The variation between COD chapters was substantial, irrespective of COD size. Neoplasms (chapter C00.D48) as a group of CODs showed the lowest probability of having a detectable trend.
Table footnote: ‡'Other' consists of the causes of death classified in the ICD-10 chapters H00.H59: diseases of the eye and adnexa, H60.H95: diseases of the ear and mastoid process, L00.L99: diseases of the skin and subcutaneous tissue, O00.O99: pregnancy, childbirth and the puerperium and P00.P96: certain conditions originating in the perinatal period.
We found only small variation between countries in the likelihood of detecting a long-term trend, as the ICC for the country-level random effect was only 0.013 (without fixed effects for chapter and size) and 0.003 (with fixed effects) (table 2). Figure 2 illustrates the small differences between countries in the estimated probability of having a long-term trend.
Figure 3A describes the sensitivity (se) and specificity (sp) for detecting a significant long-term trend at different COD size thresholds, across all CODs. The AUC corresponding to these se and sp values was 0.706 (95% CI: 0.695 to 0.716). The maximised sum index (Youden index) was 32.7 annual deaths (se=61.4%, sp=70.3%). The minimum difference index was 27.5 annual deaths (se=65.5%, sp=65.5%). The closest top-left index was 29.4 annual deaths (se=64.0%, sp=67.5%) (figure 3A).

DISCUSSION
COD data are used in widely varying settings, ranging from detailed mortality profiles to macro estimates. Applications include studies in localised areas, 29 single countries 30 or worldwide 2 31 ; for a single-disease 32 33 or disease group 9 ; monitored for days or a long-term period 7 ; for specific age groups 33 34 or specific situations (eg, maternal mortality, 5 external causes [35][36][37] ). These settings all impose different requirements on the collected data. Here we focused on one particular application: national estimates of mortality time trends for a reasonably long period (15 years), for a considerable number of countries (21) that have quite comparable COD data collection and registration systems, 38 covering as many CODs as possible. Our aim was to investigate the effect of the size of a COD on the probability of detecting a significant trend, and how this is related to country and type of COD (ICD-10 chapter).
Figure 3 Sensitivity and specificity of the cause of death size for the detection of significant long-term time-trends, with thresholds for the optimal cause of death size for trend analysis.
Our results indicate that both the size and the type of a COD were associated with the probability of detecting a significant trend, while variations among European countries were negligible. Some types of CODs, particularly neoplasms and digestive system diseases, had a lower probability for detecting a significant trend in comparison to the circulatory system diseases, whereas infectious and mental diseases had a higher probability. The results suggest a general size criterion of 30 annual deaths for selecting CODs to include in long-term mortality trends analysis, and a more specific criterion of 65 deaths for neoplasms and 20 for infectious diseases.
Our study has several limitations. First, due to the exclusion of CODs with fewer than five annual deaths on average, smaller countries were represented in our analysis with fewer CODs. However, this is unlikely to have a strong influence on the results, as the suggested COD size threshold of about 30 deaths is much higher than the lower limit of 5 mean annual deaths. Second, although we proposed the COD size as a criterion to select CODs for long-term trend analysis, we acknowledge that other criteria could be used, such as giving greater preference to CODs that involve high healthcare costs or that are potentially modifiable by preventive or curative actions. Third, the likelihood of demonstrating a time trend with statistical significance depends on the statistical method that is used to describe these trends. Our results depend on the balance between avoiding type I and type II errors. As for type I errors, we chose a significance level of α=0.05. A more restrictive significance level would increase type II errors, that is, it would reduce the proportion of CODs for which a trend would be detected based on our method.
Moreover, our results should be seen as conditional on our use of OLS models with polynomial terms. The OLS approach may not be appropriate for small counts. However, the approximation of a Poisson by a normal error distribution is generally assumed to be adequate if the mean number of observations is about five or more. For larger counts, OLS has the benefit that a variance can be estimated, rather than postulated.
In addition, an alternative to the classic polynomial regression approach would have been to use Generalized Additive Models (GAMs). These models have the advantage of being able to pick up trends that are not polynomial. In a sensitivity analysis, we applied GAMs with a Gaussian process smoothing function to our data. We found that a long-term trend could be detected in 71.7% of the CODs, as compared to 64.0% in our original analysis. There were virtually no CODs for which a trend could be detected when using polynomial models but not when using GAMs. This implies that our results are approximately robust to the method used, although somewhat conservative.

Finally, including spatial correlation in our model may have altered the chance of detecting a significant trend for CODs with marked geographical patterns. We calculated Moran's I test for spatial correlation among countries regarding the proportion of CODs in each country with a detected long-term trend. The Moran's I test was found to be not statistically significant for all CODs collectively (p value=0.988). At the level of COD chapters, we found significant spatial correlation for the chapters C-D (p value=0.002), E (p value=0.025), and V-Y (p value=0.001), but not for other chapters.
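For readers unfamiliar with Moran's I, the statistic itself can be sketched in Python as below. The spatial weights used in the study are not described, so the binary neighbour matrix in the example is purely hypothetical, and the sketch computes only the statistic, not the p value of the associated test.

```python
def morans_i(values, weights):
    """Moran's I for one value per country and a square spatial weight matrix.

    values: e.g. the proportion of CODs with a detected trend in each country.
    weights: weights[i][j] > 0 if countries i and j are treated as neighbours
    (the choice of weights is an assumption of this sketch).
    """
    n = len(values)
    if any(len(row) != n for row in weights) or len(weights) != n:
        raise ValueError("weights must be an n x n matrix")
    mean = sum(values) / n
    deviations = [v - mean for v in values]
    total_weight = sum(sum(row) for row in weights)
    # Weighted cross-products of deviations between pairs of countries.
    cross = sum(
        weights[i][j] * deviations[i] * deviations[j]
        for i in range(n)
        for j in range(n)
    )
    sum_squares = sum(d * d for d in deviations)
    return (n / total_weight) * cross / sum_squares


# Four hypothetical countries arranged in a row, each adjacent to the next.
chain = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
clustered = morans_i([1, 2, 3, 4], chain)    # similar neighbours, positive I
alternating = morans_i([1, 4, 1, 4], chain)  # dissimilar neighbours, negative I
```

Values near zero indicate no spatial pattern, which is consistent with the non-significant result reported above for all CODs collectively.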
We found that mortality from neoplasms was less likely to have a significant trend, for a given size of COD. This may relate to the fact that the neoplasm mortality levels tend to change gradually over time, without short-term trend changes. 6 Additionally, cancers are usually coded reliably and consistently over time, [39][40][41] so that coding artefacts can rarely induce artificial changes. Conversely, the dynamic nature of infectious diseases may be responsible for their higher likelihood to change over time, and to have significant trends even with relatively small numbers of deaths. Similarly, the chapter of signs-and-symptoms is sensitive to changes in the coding rules and practices, thus creating significant changes even with small number of deaths.
Our study showed that European countries did not vary substantially in the probability of detecting a significant long-term trend in CODs of the same size and type. This finding is surprising given the heterogeneity of the countries in terms of demographic characteristics, disease epidemiology, healthcare systems and coding practices. We found that differences between countries in the proportions of CODs with a significant trend (table 1) can be related to differences in COD size, which in turn is strongly related to differences in population size. Consequently, our analysis provides support for establishing one common COD size threshold, applicable to all European countries and for use in international trend analyses.
Currently, there is no gold standard for the selection of CODs to analyse for long-term trends. In this study, we attempted to set such a standard, based on the criterion of the COD size which is easy to measure for each single COD. We calculated thresholds with three common methods which came close enough (eg, in the range of 28 to 33 deaths) to support one general recommendation for practical use. Of course, different thresholds may be preferred, depending on the user's preference to avoid either false positives (by selecting a higher threshold) or false negatives (lower threshold).
In our data, the number of CODs that surpassed our recommended threshold of 30 annual deaths on average was around 500 for the biggest countries, 200-250 for the middle-sized countries and around 100 for the smaller European countries (results not shown). In total, 52 CODs had over 30 annual deaths on average in each country included in our analysis. This implies that at least 52 CODs could be included in the international comparison of long-term trends, but up to 100 if one is to accept a greater risk of false positives in smaller countries.
From the public health practitioner's perspective, the findings of our study can be used in order to set realistic expectations about the number of CODs that are likely to have a significant long-term trend in populations. We recommend a size criterion of 30 annual deaths to be considered when planning for national or international monitoring and comparisons of cause-specific mortality.
Contributors AEK, JWPFK and MM designed the study. MM performed the calculations. MM, AEK and JWPFK analysed and interpreted the results and formulated the conclusions, and made major contributions to the manuscript. All authors read and approved the final manuscript.
Funding The authors have not declared a specific grant for this research from any funding agency in the public, commercial or not-for-profit sectors.
Competing interests None declared.
Patient consent for publication Not required.
Ethics approval No ethical approval was required for this study, as no living subjects were involved and only aggregated administrative data were used.
Provenance and peer review Not commissioned; externally peer reviewed.