## Abstract

Life expectancy (LE) is considered a straightforward summary measure of mortality that comes with an implicit age standardisation. Thus, it has become common to present differences in mortality across populations as differences in LE, instead of, say, relative risks. However, most of the time LE does not quite provide what the term promises. LE is based on a synthetic cohort and is therefore not the true LE of anyone. Also, the implicit age standardisation is construed in such a way that it can be questioned whether it standardises age at all. In this paper, we examine LE from the point of view of its applicability to epidemiological and public health research and provide examples of the relation between an LE difference and a relative risk. We argue that the age standardisation in estimations of LE is not straightforward, since LE is standardised against different age distributions, and that the translation of changes in age specific mortality into changes in remaining LE will depend on the level and the distribution of mortality in the population. We conclude that LE is not the measure of choice in aetiological research or in research that aims to identify risk factors of death, but that LE may be a compelling choice in public health contexts. One cannot escape the thought that the mathematical elegance of LE has contributed to its popularity.

- epidemiology
- public health
- statistics & research methods

This is an open access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited, appropriate credit is given, any changes made indicated, and the use is non-commercial. See: http://creativecommons.org/licenses/by-nc/4.0/.

## Introduction

Life expectancy (LE) originates from demography but is also an often-used summary measure of mortality in epidemiology and in public health in general. A difference in LE has a seemingly intuitive meaning, while it takes experience to interpret the magnitude of a relative risk, or any of the related measures. As a bonus, LE comes with integrated age standardisation. This may explain why the conversion from relative risk to gain, or loss, in LE is commonly done.

A paper in *Circulation* provides a recent example.1 The authors compared remaining LE at the age of 50 between those with 5 versus 0 healthy lifestyle factors and found a 12.2-year difference for men and a 14.0-year difference for women. The basis for this result was an HR estimated at 0.26, comparing those with 5 versus 0 healthy lifestyle factors, obtained from the Nurses’ Health Study and the Health Professionals Follow-up Study. By combining the HR with death rates for the general population, the results were transformed into differences in remaining LE.

In another example, based on a 35-year follow-up of 50-year-old men, the relative death rates were 0.68 when comparing physically active men to sedentary men, and 0.78 when comparing to moderately active men.2 This was converted to gains in LE and the physically active were estimated to live 3.8 years longer than the sedentary and 1.8 years longer than the moderately active.

However, most of the time the LE measure does not quite provide what the term promises. Furthermore, the implicit age standardisation is construed in such a way that it can be questioned whether it standardises age at all. In this paper, we examine the LE measure from the point of view of its applicability to epidemiological and public health research and provide examples of the relation between a difference in LE and a relative risk.

## Life expectancy

LE is calculated solely from age specific death rates and can, thus, be viewed as a summary of these that is independent of the age structure of the underlying population and, hence, age standardised. The calculation begins with estimating the survival function, *l*(*x*), that is, the probability of surviving from birth to age *x*:

$$l(x) = \exp\left(-\int_0^x \mu(t)\,dt\right) \tag{1}$$

where *μ*(*x*) is the hazard of death at the age of *x*. The hazard of death is the continuous counterpart to death rates and is used in the theoretical expressions in the current paper. The survival function is then used to calculate the average length of life, the LE, as follows:

$$\mathrm{LE} = \int_0^\infty l(x)\,dx \tag{2}$$

LE corresponds to the area under the survival curve in a graphical representation. Remaining LE at any given age, such as 50, is calculated in a similar way.
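The two steps described above (survival function from the hazard, then the area under the survival curve) can be sketched numerically. This is a minimal illustration, not the authors' calculation: the Gompertz hazard and its parameter values are assumptions chosen only to give a plausible adult mortality pattern, and the function names are hypothetical.

```python
import numpy as np

def survival_curve(ages, hazard):
    """Survival probability exp(-cumulative hazard), using the trapezoidal rule."""
    steps = np.diff(ages)
    cum_hazard = np.concatenate(
        ([0.0], np.cumsum(0.5 * (hazard[1:] + hazard[:-1]) * steps))
    )
    return np.exp(-cum_hazard)

def life_expectancy(ages, hazard, from_age=0.0):
    """Remaining LE at from_age: area under the survival curve,
    conditional on being alive at from_age."""
    keep = ages >= from_age
    a, h = ages[keep], hazard[keep]
    surv = survival_curve(a, h)  # equals 1 at from_age by construction
    return float(np.sum(0.5 * (surv[1:] + surv[:-1]) * np.diff(a)))

# Assumed (hypothetical) Gompertz hazard; not fitted to any real population.
ages = np.linspace(0.0, 120.0, 12001)
hazard = 5e-5 * np.exp(0.09 * ages)

le_at_birth = life_expectancy(ages, hazard)
le_at_50 = life_expectancy(ages, hazard, from_age=50.0)
print(f"LE at birth: {le_at_birth:.1f} years")
print(f"Remaining LE at 50: {le_at_50:.1f} years")
```

Remaining LE at 50 reuses the same calculation with the integration started at age 50, which is what "calculated in a similar way" amounts to in this sketch.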

Assessing LE does not require following a cohort until extinction. Instead, LE is often based on the cross-sectional mortality experience during an observed period, often a calendar year. It reflects the hypothetical survival curve that can be calculated from the age specific death rates during the observed period, which emanate from successive birth cohorts. This is conceptualised as a synthetic cohort for which the average length of survival can be assessed. Clearly, this period LE is not the expected length of life of anyone but is merely a way of summarising the mortality during the selected period.

## LE and age standardisation

To examine to what extent this summary measure is age standardised it is helpful to rewrite equation (2) in the following form:

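$$\mathrm{LE} = \frac{\int_0^\infty l(x)\,dx}{\int_0^\infty \mu(x)\,l(x)\,dx} \tag{3}$$

The denominator can be interpreted as the total number of deaths, and the numerator as the total number of person years lived. The ratio $\int_0^\infty \mu(x)\,l(x)\,dx \,/\, \int_0^\infty l(x)\,dx$ (the number of deaths over the number of person years) is the crude death rate in the population that has an age distribution that follows the survival curve, that is, the stationary population that corresponds to the age specific mortality.3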

The reason to rewrite LE like this is to show in a clear way that LE stems from a weighted average of the age specific death rates. But the weights are proportional to the survival function, which in turn is derived from the death rates—that is, equation (1). The implication is that as soon as two populations differ in their level of mortality, so will the weights that determine LE. A low-mortality population would have a survival function that is pushed to higher ages and as a consequence the death rates at high age would receive higher weight, relative to what they would receive in a high-mortality population. This offsets some of the effect that the low mortality otherwise would have on LE.
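The weighting argument can be checked numerically. The sketch below, under an assumed Gompertz hazard with hypothetical parameter values, computes the crude death rate of the stationary population (deaths over person years, with the survival curve as the age distribution) and confirms that LE is its reciprocal. It also compares the mean age of the stationary population at two mortality levels, to show how the weights move towards higher ages when mortality is lower.

```python
import numpy as np

def trapezoid(y, x):
    """Trapezoidal rule, written out to avoid depending on a NumPy version."""
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x)))

def survival_curve(ages, hazard):
    steps = np.diff(ages)
    cum_hazard = np.concatenate(
        ([0.0], np.cumsum(0.5 * (hazard[1:] + hazard[:-1]) * steps))
    )
    return np.exp(-cum_hazard)

def stationary_summary(ages, hazard):
    surv = survival_curve(ages, hazard)
    person_years = trapezoid(surv, ages)     # area under the survival curve = LE
    deaths = trapezoid(hazard * surv, ages)  # close to 1: everyone eventually dies
    mean_age = trapezoid(ages * surv, ages) / person_years
    return {
        "le": person_years,
        "crude_rate": deaths / person_years,
        "mean_age": mean_age,
    }

# Assumed (hypothetical) Gompertz hazards; the second has half the mortality.
ages = np.linspace(0.0, 130.0, 13001)
high = stationary_summary(ages, 5e-5 * np.exp(0.09 * ages))
low = stationary_summary(ages, 0.5 * 5e-5 * np.exp(0.09 * ages))

for label, s in (("higher mortality", high), ("lower mortality", low)):
    print(f"{label}: LE={s['le']:.1f}, 1/crude rate={1 / s['crude_rate']:.1f}, "
          f"mean age of stationary population={s['mean_age']:.1f}")
```

The mean age is used here only as a simple one-number summary of where the weights sit; any other summary of the survival-curve age distribution would serve the same purpose.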

The intriguing corollary is that the comparison of life expectancies encompasses juxtaposition of crude death rates, not standardised death rates. In a standard textbook of epidemiology this would be tantamount to confounding by age.

Cancer eradication provides an explanatory example. In high-income countries, cancer accounts for roughly 25% of all deaths (Eurostat). On the assumption that this proportion were homogeneous across all ages and that all causes of death were independent of each other, eradication of cancer would reduce all age specific death rates to 75% of what they were originally. According to equation (3), ceteris paribus (that is, with the weights held fixed), this would raise the LE by about a third (1/0.75 ≈ 1.33), which is between 25 and 30 years in many countries. However, as noted above, a reduction of death rates shifts the survival curve to higher ages, which counters a significant part of that effect.
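The gap between the naive "one third" gain and the actual gain can be illustrated with a small calculation. The sketch assumes a Gompertz hazard with hypothetical parameter values and multiplies every age specific rate by 0.75. It does not use the US data discussed in the text, so the numbers it prints are illustrative only.

```python
import numpy as np

def life_expectancy(ages, hazard):
    """LE at birth as the area under the survival curve (trapezoidal rule)."""
    steps = np.diff(ages)
    cum_hazard = np.concatenate(
        ([0.0], np.cumsum(0.5 * (hazard[1:] + hazard[:-1]) * steps))
    )
    surv = np.exp(-cum_hazard)
    return float(np.sum(0.5 * (surv[1:] + surv[:-1]) * steps))

# Assumed (hypothetical) Gompertz baseline; not fitted to any real population.
ages = np.linspace(0.0, 130.0, 13001)
baseline = 5e-5 * np.exp(0.09 * ages)

le_before = life_expectancy(ages, baseline)
le_after = life_expectancy(ages, 0.75 * baseline)  # all rates reduced to 75%

# Naive gain: divide LE by 0.75 as if the age weights stayed fixed.
naive_gain = le_before / 0.75 - le_before
# Actual gain: recompute LE, letting the survival curve shift as well.
actual_gain = le_after - le_before

print(f"LE before: {le_before:.1f} years")
print(f"Naive gain (weights held fixed): {naive_gain:.1f} years")
print(f"Actual gain (survival curve shifts): {actual_gain:.1f} years")
```

Under this assumed hazard the actual gain is only a few years, an order of magnitude below the naive figure, which is the offsetting effect the paragraph describes.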

In reality, the fraction of mortality that is made up of cancer is not homogeneous but varies with age. For example, among women in the USA in 2013, cancer deaths accounted for 22% of all mortality,4 and LE was 81.3 years.5 Without cancer mortality, the LE would instead have been 84.2 years, that is, 2.9 years higher. Figure 1A shows the age specific death rates on a logarithmic scale with and without cancer mortality. Figure 1B shows the corresponding survival curves. Note that the area below the survival curve represents the expectancy of life according to equation (2). Again, this example builds on the assumption that cancer mortality would be prevented in such a way that other causes of death were not also affected.

## LE and relative risk

There is no direct translation from a relative risk of mortality, or any of the similar measures, to a difference in life expectancies because the level of mortality, as well as its distribution over age, also plays a role.6 7 The lower the LE in a population, the greater the impact of a given relative risk (see figure 2). However, one can always calculate LE from the baseline death rates in the population, repeat the calculation with all death rates multiplied by the relative risk, and then subtract to obtain the difference in LE between the two populations. This is how figure 2 was constructed. It shows LE at birth (A) and remaining LE at the age of 50 (B). The figure shows the association between relative risk and difference in LE for selected levels of mortality (or, rather, of LE). Of course, for a relative risk of one, there is no LE difference, while a relative risk of 2 corresponds to about 7 years shorter LE at birth and about 5 years shorter remaining LE at the age of 50, depending on the level of mortality.

Relative risks below unity have a corresponding but opposite effect. In the introductory example, an HR of 0.26 corresponds to an increase in remaining LE at the age of 50 of about 12 and 14 years for men and women, respectively. This relation can be seen in figure 2 at 0.26 on the x-axis. Although a direct and general correspondence between relative risk and change in LE does not exist, figure 2 gives a reasonable account of the impact of a given relative risk as well as of the level of mortality in the observed population.
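The procedure described in this section (compute LE from baseline rates, recompute with all rates multiplied by the relative risk, subtract) can be sketched as follows. The two baseline populations are assumed Gompertz hazards with hypothetical parameters; they are not the mortality levels used for figure 2, and the function names are illustrative.

```python
import numpy as np

def life_expectancy(ages, hazard, from_age=0.0):
    """Remaining LE at from_age: area under the survival curve,
    conditional on being alive at from_age."""
    keep = ages >= from_age
    a, h = ages[keep], hazard[keep]
    steps = np.diff(a)
    cum_hazard = np.concatenate(
        ([0.0], np.cumsum(0.5 * (h[1:] + h[:-1]) * steps))
    )
    surv = np.exp(-cum_hazard)
    return float(np.sum(0.5 * (surv[1:] + surv[:-1]) * steps))

def le_difference(ages, baseline_hazard, relative_risk, from_age=0.0):
    """LE with all death rates multiplied by relative_risk, minus baseline LE."""
    exposed = life_expectancy(ages, relative_risk * baseline_hazard, from_age)
    return exposed - life_expectancy(ages, baseline_hazard, from_age)

# Two assumed (hypothetical) baseline mortality schedules.
ages = np.linspace(0.0, 130.0, 13001)
populations = {
    "population A": 5e-5 * np.exp(0.090 * ages),
    "population B": 2e-4 * np.exp(0.075 * ages),
}

for name, hazard in populations.items():
    for rr in (0.26, 0.5, 1.0, 2.0):
        at_birth = le_difference(ages, hazard, rr)
        at_50 = le_difference(ages, hazard, rr, from_age=50.0)
        print(f"{name}: RR={rr:4.2f}  LE difference at birth={at_birth:+6.1f}  "
              f"at age 50={at_50:+6.1f}")
```

The same relative risk yields a different number of years in the two assumed populations and at the two starting ages, which is why no single conversion factor from relative risk to LE difference exists.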

## Discussion and conclusions

It is conceivable that a difference in LE is easier to interpret than a relative risk, or some related epidemiological measure, because of a readily available yardstick. However, although both measures look at contrasts in mortality between populations, they do so in rather different ways.

First, in most instances LE is calculated based on the mortality during an observed period, rather than by following a cohort until extinction. It is assumed that a synthetic cohort during its course of life will be exposed to the age specific death rates from the observed period. Clearly, the elderly during the observed period have had a rather different experience from the one the young will have when they reach the same age. The assumption behind the synthetic cohort requires mortality to be unchanged for about a century, or half a century if remaining LE at age 50 is considered. This is unrealistic, and the period LE is not intended to assess anyone’s average length of life. Rather, it is an elegant mathematical construct and a convenient measure that summarises the mortality experience during a period. Even though the calculations are the same for period and cohort life tables, the interpretations are not the same, and the term LE in the context of period data is easily misunderstood.

Second, that the survival function plays a role for LE is quite obvious. Still, the exact role it has is not entirely transparent, but it helps to rewrite the formula as in equation (3). When someone trained in epidemiology looks at this expression, it may even seem like the changing survival function introduces age confounding because the death rates are weighted according to different sets of weights when populations with different mortality are compared. This would suggest that LE is not age standardised after all. However, given that the death rates alone determine the survival function, LE is independent of the age distribution in the observed population and, in this sense, age standardised. Hence, these two different perspectives lead to contradictory propositions. This implies that basic mortality rates should be presented alongside summary measures whenever possible to guide the readers.

The cancer eradication example is worth keeping in mind when faced with the choice of a summary mortality measure. While it is perfectly correct that LE would increase by a surprisingly modest 2.9 years (in line with what Manton and colleagues8 found as early as 1991) due to the 22% decrease in mortality, this is because the reduced mortality has a double effect: first on the death rates per se, and second on the survival curve. The latter counters some of the direct effect on the death rates, as the survival curve corresponds to the age structure of the resulting population. Which of the two measures best represents the results is not evident and will depend on the purpose. Again, presenting primary data will be helpful to the readers, preferably with both an absolute and a relative measure.9

It appears that in aetiological research and in research aiming at identifying risk factors of mortality the rather intricate calculations behind LE might be more confusing than helpful. As a summary public health measure, on the other hand, the average length of life is a compelling choice, although the reliance on period data and synthetic cohorts is a drawback as is the questionable age standardisation. One cannot escape the thought that the mathematical elegance of LE has contributed to its popularity.

## Footnotes

**Contributors** KM, AA and RR developed the research idea and design of the study. KM, AA and RR were involved in the acquisition of data. RR performed the statistical analysis. KM, AA and RR participated in interpretation of data. KM and AA drafted the manuscript. KM, AA and RR performed the critical revision of the manuscript and the preparation for the final version.

**Funding** This work was supported by the Swedish Research Council for Health, Working Life and Welfare (FORTE) [grant number 2016–00863].

**Competing interests** None declared.

**Patient consent for publication** Not required.

**Provenance and peer review** Not commissioned; externally peer reviewed.