Original research
Unexplained mortality during the US COVID-19 pandemic: retrospective analysis of death certificate data and critical assessment of excess death calculations
  1. Kathleen A Fairman1,2,
  2. Kellie J Goodlet1,
  3. James D Rucker2,
  4. Roy S Zawadzki3
  1. Department of Pharmacy Practice, Midwestern University College of Pharmacy, Glendale, Arizona, USA
  2. Kathleen Fairman LTD, Phoenix, Arizona, USA
  3. Department of Statistics, Donald Bren School of Information and Computer Sciences, University of California, Irvine, California, USA
Correspondence to Dr Kathleen A Fairman; kfairm{at}midwestern.edu

Abstract

Objectives Cause-of-death discrepancies are common in respiratory illness-related mortality. A standard epidemiological metric, excess all-cause death, is unaffected by these discrepancies but provides no actionable policy information when increased all-cause mortality is unexplained by reported specific causes. To assess the contribution of unexplained mortality to the excess death metric, we parsed excess deaths in the COVID-19 pandemic into changes in explained versus unexplained (unreported or unspecified) causes.

Design Retrospective repeated cross-sectional analysis, US death certificate data for six influenza seasons beginning October 2014, comparing population-adjusted historical benchmarks from the previous two, three and five seasons with 2019–2020.

Setting 48 of 50 states with complete data.

Participants 16.3 million deaths in 312 weeks, reported in categories—all causes, top eight natural causes and respiratory causes including COVID-19.

Outcome measures Change in population-adjusted counts of deaths from seasonal benchmarks to 2019–2020, from all causes (ie, total excess deaths) and from explained versus unexplained causes, reported for the season overall and for time periods defined a priori: pandemic awareness (19 January through 28 March); initial pandemic peak (29 March through 30 May) and pandemic post-peak (31 May through 26 September).

Results Depending on seasonal benchmark, 287 957–306 267 excess deaths occurred through September 2020: 179 903 (58.7%–62.5%) attributed to COVID-19; 44 022–49 311 (15.2%–16.1%) to other reported causes; 64 032–77 054 (22.2%–25.2%) unexplained (unspecified or unreported cause). Unexplained deaths constituted 65.2%–72.5% of excess deaths from 19 January to 28 March and 14.1%–16.1% from 29 March through 30 May.

Conclusions Unexplained mortality contributed substantially to US pandemic period excess deaths. Onset of unexplained mortality in February 2020 coincided with previously reported increases in psychotropic use, suggesting possible psychiatric or injurious causes. Because underlying causes of unexplained deaths may vary by group or region, results suggest excess death calculations provide limited actionable information, supporting previous calls for improved cause-of-death data to support evidence-based policy.

  • COVID-19
  • epidemiology
  • mental health
  • public health
  • statistics & research methods
  • health informatics

Data availability statement

Data are available upon reasonable request. Data are available in a public, open access repository.

This is an open access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited, appropriate credit is given, any changes made indicated, and the use is non-commercial. See: http://creativecommons.org/licenses/by-nc/4.0/.

Strengths and limitations of this study

  • This is the first study to assess the degree to which mortality from unreported or unspecified causes contributed to excess deaths in the US COVID-19 pandemic, raising important policy questions about the utility of the excess death metric.

  • We used population-adjusted US national data, accounting for deaths in every category reported by the National Center for Health Statistics (NCHS), including top natural causes and respiratory causes including COVID-19.

  • Our statistical analysis was descriptive, but our estimates of excess deaths are similar to those previously developed using more sophisticated statistical analyses of the same data files.

  • Study results generalise to the USA through 26 September 2020, but not to other countries, as cause-of-death attribution practices vary cross-nationally, or to other pandemic periods, including the COVID-19 case surge in late 2020.

  • Although we cite evidence suggesting that many of the deaths from unspecified causes may have been due to suicide, overdose or underlying psychiatric causes, we were unable to address this question using the available NCHS data files, particularly in population subgroups that may have been especially vulnerable to injurious causes of mortality during the pandemic.

Introduction

Errors in cause-of-death attribution (CODA) are common in infectious respiratory illness, compromising accurate tracking of disease impact and spread.1–3 CODA is especially challenging in COVID-19 because of competing causes of mortality, including cancer, lung disease, obesity-related conditions and superannuation.4 Compounding this problem, key factors contributing to COVID-19 CODA, such as level of training and expertise of mortality coders, whether laboratory testing is or is not required, financial incentives for reporting and public health delivery systems, vary considerably by country.5 6 These variations make cross-national comparisons of COVID-19-attributed mortality problematic, threatening the accuracy of the virus mortality statistics needed for public health decision-making.3

A standard metric generally understood to account for these discrepancies is ‘excess death’, defined as mortality from all causes exceeding that expected from historical experience.7 Recent investigators have calculated excess deaths to estimate US COVID-19 impact, using death certificate data made available by the US National Center for Health Statistics (NCHS).7–9 Most interpretations of excess death calculations reflect an underlying assumption that 100% of the change in mortality that took place during the pandemic was attributable, indirectly or directly, to COVID-19.7 8 10 The calculation has the advantages of no reliance on CODA, because it considers only all-cause deaths, and of accounting for deaths due to undetected COVID-19 or to use of scarce health system resources by infected patients.7 8 10

Despite these advantages, the utility and interpretation of excess death calculations may be compromised by unexplained deaths. Fully adjudicated, final US mortality files report specific causes or contributing factors, as well as demographic characteristics, for each individual decedent.11 12 In contrast, files made available by the NCHS beginning in May 2020 to facilitate pandemic mortality analysis, which have been used to calculate excess deaths as an indication of the ‘full COVID-19 burden’,8 report only aggregated (summed) weekly death counts, grouped into broad diagnostic categories.13 14 These categories, shown in online supplemental appendix 1, represent ranges of International Classification of Diseases, 10th Revision (ICD-10) diagnosis codes.13 14 For excess deaths unexplained by these causes, the true underlying causes are unknown, despite presumably representing appropriate policy targets. For example, markedly different interventions would be suggested by unexplained deaths due to undetected COVID-19, high-speed automobile accidents on empty highways15 or delayed care for life-threatening conditions when people fear using available emergency department capacity.16

Risk factors for potential causes of unexplained death vary cross-nationally. For example, US opioid supplies, opioid mortality, substance use disorder prevalence and suicide rates far exceed those of other high-income nations,17 suggesting greater psychological vulnerability to pandemic period disruptions. Yet, neither substance misuse nor mental illness is recorded as a cause of death in the currently available NCHS pandemic period mortality files.13 14 Risk factors for COVID-19 mortality, such as obesity, smoking and healthcare-associated infections, also demonstrate considerable cross-national variability.17

Quantifying unexplained excess deaths would provide information about the degree to which the utility and interpretation of excess death calculations are potentially compromised by unreported or unspecified causes of mortality. Moreover, assessing the timing of unexplained deaths at various pandemic phases would inform current discussions about societal factors that may contribute to pandemic period morbidity and mortality, such as fear of contagion or economic vulnerability.18–20 Accordingly, we used publicly available NCHS mortality data files for the past six influenza seasons to calculate the timing and extent of changes in all-cause deaths that were explained versus unexplained by changes in reported causes.

The primary research question was to what degree unexplained mortality contributed to excess mortality during the pandemic. Because the excess mortality calculation represents change in mortality compared with historical experience, we addressed this research question by assessing the contributions of changes in explained versus unexplained causes of death to change in total, all-cause death. A secondary research question, intended to provide exploratory information about possible reasons for unexplained mortality, was when unexplained deaths escalated in 2020. We formulated both research questions before beginning this project, after noticing large numbers of unexplained deaths in analyses for a different exploratory study on CODA.21

Methods

Design and data source

This study was a retrospective, repeated cross-sectional analysis of US mortality files made available by the NCHS beginning in 2020 for pandemic period analysis. The data files for 2020 represent provisional causes of death.13 The corresponding data files for 2014–2019 represent final adjudicated causes of death,14 reported in the same broad diagnostic categories as the 2020 data to facilitate analysis (online supplemental appendix 1).

The study measures were based on reported underlying cause of death (UCOD), defined as ‘the disease or injury which initiated the train of morbid events leading directly to death’.22 Only one UCOD is reported on each death certificate. The study files include weekly counts of deaths in total (all-cause) and by UCOD category, grouped by state.

In addition to categories representing the top eight US causes of natural death (heart disease, cancer, chronic lower respiratory disease, cerebrovascular disease, Alzheimer’s disease, diabetes, influenza and pneumonia, and kidney disease), which together accounted for 66% of all US deaths in 2017,23 the files include four additional categories: miscellaneous respiratory conditions (eg, nasopharyngitis, sinusitis, pneumothorax); septicaemia; COVID-19; and non-specific causes of death.13 14 Non-specific causes of death, described in ICD-10 nomenclature as ‘symptoms, signs, and abnormal clinical and laboratory findings not elsewhere classified’ (NEC; ICD-10 range R00–R99), include ‘ill-defined and unknown cause of mortality’ (R99),13 14 a code commonly used pending forensic investigation of injurious death.24 Neither the 2020 provisional data nor the corresponding grouped data for 2014–2019 include reporting categories for specific psychiatric causes, including substance use disorders (ICD-10 codes F00–F99 excluding developmental disorders), or for injurious deaths including intentional self-harm (ICD-10 codes X71–X83) and unintentional overdose (ICD-10 codes T36–T50, excluding codes for underdosing).25

Data analyses

Data from 5 October 2014 through 26 September 2020 were downloaded on 22 January 2021. Analyses were performed using open-source analytical tools.11 Specifically, Python coding was used with Pandas, a data organisation tool,26 27 to group data into six influenza seasons, 52 weeks each. Using the same software tools, data for the 2014–2015 to 2018–2019 seasons were population adjusted to July 2019 using US Census data.11 28 As in previous research, Connecticut and North Carolina were excluded because of incomplete reporting.8 Data were further grouped a priori into time periods roughly corresponding to US trends in COVID-19 mortality: pandemic awareness (19 January through 28 March); initial pandemic peak (29 March through 30 May) and pandemic post-peak (31 May through 26 September).11 29
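The grouping and population-adjustment steps described above can be sketched with Pandas as follows. This is a minimal illustration under stated assumptions, not the code used in the study: the column names (`week_ending`, `all_cause`), the function names and the 'pre-pandemic' label are hypothetical, weeks are assumed to be dated by the Saturday that ends them, and the adjustment is assumed to be a simple proportional rescaling to a reference population.

```python
import pandas as pd

# A priori periods by season week; boundaries follow the figure 1 footnote
# (weeks 1-16 precede the pandemic awareness period). The "pre-pandemic"
# label is a hypothetical name used only in this sketch.
PERIODS = [
    (1, 16, "pre-pandemic"),
    (17, 26, "pandemic awareness"),
    (27, 35, "initial pandemic peak"),
    (36, 52, "pandemic post-peak"),
]


def add_season_columns(df, first_week_ending="2014-10-11"):
    """Label each weekly row with a season index (0-5) and a season week (1-52).

    Assumes one row per week, dated by the Saturday that ends the week, so the
    first week (5-11 October 2014) is dated 2014-10-11.
    """
    out = df.copy()
    dates = pd.to_datetime(out["week_ending"])
    elapsed = (dates - pd.Timestamp(first_week_ending)).dt.days // 7
    out["season"] = elapsed // 52
    out["season_week"] = elapsed % 52 + 1
    return out


def period_for_week(season_week):
    """Return the a priori period name for a season week (1-52)."""
    for first, last, name in PERIODS:
        if first <= season_week <= last:
            return name
    raise ValueError(f"season week out of range: {season_week}")


def population_adjust(df, population_by_season, reference_population):
    """Rescale counts to the reference population (assumed proportional scaling)."""
    out = df.copy()
    factor = reference_population / out["season"].map(population_by_season)
    out["all_cause_adj"] = out["all_cause"] * factor
    return out
```

In this sketch the caller supplies the population for each season and the July 2019 reference population; a season whose population equals the reference is left unchanged.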

Analyses were descriptive to facilitate the parsing of excess death (ie, change in all-cause death) into explained and unexplained proportions. The decision was in accordance with the principle of parsimony in data presentation,30 as we found that descriptive results were similar to those produced using more sophisticated techniques.8 9 For each week, time period and diagnostic category, we calculated prior season averages for three historical benchmark periods: two seasons (2017–2018 to 2018–2019), three seasons (2016–2017 to 2018–2019) and five seasons (2014–2015 to 2018–2019). Averages were calculated as total death count for the indicated time period, divided by number of years. For example, the two-season average for week 1 was the sum of week 1 deaths reported in 2017–2018 and 2018–2019, divided by 2. Three benchmark time periods were used because it is common to compare current year mortality data with several historical benchmarks,31 consistent with the need to report sensitivity analyses of epidemiological data.32 Weekly prior season averages and 2019–2020 counts were graphed using Matplotlib.33
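The benchmark averaging described above (deaths summed across prior seasons, divided by the number of seasons) might look like the following in Pandas. This is a sketch only: the table layout, the column names (`season`, `season_week`, `deaths`) and the function name are assumptions, with seasons numbered as consecutive integers.

```python
import pandas as pd

# Number of most recent prior seasons included in each benchmark.
BENCHMARK_SPANS = {"two_season": 2, "three_season": 3, "five_season": 5}


def prior_season_benchmarks(weekly, current_season):
    """Mean deaths per season week over the 2, 3 and 5 seasons before `current_season`.

    `weekly` is assumed to hold one row per (season, season_week) with a
    population-adjusted `deaths` column.
    """
    table = weekly.pivot(index="season_week", columns="season", values="deaths")
    benchmarks = {}
    for name, span in BENCHMARK_SPANS.items():
        prior = list(range(current_season - span, current_season))
        # Sum across the prior seasons, then divide by the number of seasons.
        benchmarks[name] = table[prior].sum(axis=1) / span
    return pd.DataFrame(benchmarks)
```

The result has one row per season week and one column per benchmark, so the 2019–2020 weekly counts can be compared against each benchmark by simple subtraction.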

For each week and time period, excess deaths were defined a priori as increases in all-cause deaths over the population-adjusted prior season mean benchmarks (ie, 2019–2020 season values minus benchmark values). Unexplained deaths were defined as all-cause deaths either not reported in any diagnostic category (unreported) or reported in the NEC category (unspecified). Explained deaths were defined as all-cause deaths reported in any of the specific cause-of-death categories (ie, all-cause minus unexplained deaths). Changes in explained and unexplained causes were calculated using the same method as for excess deaths, by first calculating the population-adjusted prior season mean benchmarks, then subtracting the benchmark values from the 2019–2020 season values.
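The definitions above can be expressed as a short worked sketch. The class and function names are hypothetical, and the counts used with it are illustrative placeholders rather than study data; the logic simply restates the definitions of unreported, unspecified (NEC), unexplained and explained deaths and the subtraction of benchmark from 2019–2020 values.

```python
from dataclasses import dataclass


@dataclass
class DeathCounts:
    """Population-adjusted counts for one week or period (illustrative structure)."""

    all_cause: float
    specific_causes: float  # sum over all specific reported categories, including COVID-19
    nec: float  # 'not elsewhere classified' category (ICD-10 R00-R99)

    @property
    def unreported(self):
        # All-cause deaths that appear in no diagnostic category.
        return self.all_cause - self.specific_causes - self.nec

    @property
    def unexplained(self):
        # Unreported plus unspecified (NEC) deaths.
        return self.unreported + self.nec

    @property
    def explained(self):
        # All-cause deaths minus unexplained deaths.
        return self.all_cause - self.unexplained


def decompose_excess(current, benchmark):
    """Split excess all-cause deaths into explained and unexplained changes."""
    excess = current.all_cause - benchmark.all_cause
    unexplained_change = current.unexplained - benchmark.unexplained
    return {
        "excess": excess,
        "explained_change": current.explained - benchmark.explained,
        "unexplained_change": unexplained_change,
        "unexplained_share": unexplained_change / excess,
    }
```

Because explained and unexplained deaths partition all-cause deaths, the two changes always sum to the excess, which is what allows the excess to be parsed into proportions.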

Results

Of a total of 16.3 million all-cause deaths reported over six influenza seasons in 48 states, 2.98 million occurred during 2019–2020, representing a population-adjusted increase of 288 467–319 858 excess deaths over prior season averages for 2, 3 and 5 years (figure 1, top). Mean annual population-adjusted total all-cause death counts varied modestly across seasonal benchmarks, ranging from 2 659 228 to 2 690 619. An increase in NEC deaths began in approximately week 19 and escalated sharply beginning at week 36 (31 May in the 2019–2020 season) through end of observation. Increases in explained and unexplained deaths, respectively, began in approximately week 25 (15 March–21 March in 2020) and week 20 (9 February–15 February in 2020; figure 1, bottom).
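The correspondence between season weeks and calendar dates used in this section can be checked with a few lines of Python. The season start date of 29 September 2019 is an inference from week 52 ending on 26 September 2020, not a value stated in the data files, and the function name is hypothetical.

```python
from datetime import date, timedelta

# Assumed first day (Sunday) of week 1 of the 2019-2020 season, inferred from
# week 52 ending on Saturday 26 September 2020.
SEASON_START = date(2019, 9, 29)


def week_range(season_week):
    """Return (first day, last day) of a season week numbered 1-52."""
    if not 1 <= season_week <= 52:
        raise ValueError(f"season week out of range: {season_week}")
    start = SEASON_START + timedelta(weeks=season_week - 1)
    return start, start + timedelta(days=6)


# Weeks mentioned in the text: 17, 20, 25, 27, 36 and 52.
for week in (17, 20, 25, 27, 36, 52):
    first, last = week_range(week)
    print(week, first.isoformat(), last.isoformat())
```

Under this assumption, week 17 begins on 19 January, week 20 spans 9–15 February, week 25 spans 15–21 March, week 27 begins on 29 March and week 36 begins on 31 May 2020, matching the dates given for each period.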

Figure 1

Trends in population-adjusted death counts by week of influenza season, 2014–2015 through 2019–2020 seasons. For each influenza season, week 1 begins on approximately 1 October and week 52 ends on approximately 30 September. A total of 312 weeks (52 weeks for six seasons) were included in the analyses. Dividing lines represent the ends of weeks 16 (week prior to pandemic awareness period), 26 (week prior to initial pandemic peak) and 35 (week prior to pandemic post-peak period). Prior season benchmarks are means (deaths summed across seasons, divided by 2, 3 and 5, respectively, for two-season, three-season and five-season benchmarks). Benchmarks for two, three and five seasons, respectively, are indicated by long dashed lines, alternating short and long dashed lines, and dotted lines. Explained deaths include specific causes reported in the mortality files, including heart disease, cancer, chronic lower respiratory disease, cerebrovascular disease, Alzheimer’s disease, diabetes, influenza/pneumonia, other respiratory illness, kidney disease, septicaemia and COVID-19. Diagnosis codes for each category are in online supplemental appendix 1. Unexplained deaths are all-cause deaths with no reported underlying cause or with a not elsewhere classified (NEC) cause (all-cause deaths minus explained deaths). NEC deaths are described in ICD-10 nomenclature as ‘symptoms, signs, and abnormal clinical and laboratory findings NEC’ (ICD-10 range R00–R99) and include ‘ill-defined and unknown cause of mortality’ (R99), a code commonly used pending forensic investigation of injurious death.24 25 ICD-10, International Classification of Diseases, 10th Revision.

Contribution of unexplained mortality to excess deaths varied considerably by pandemic period (table 1; online supplemental appendix 2). Using the 5-year benchmark, of the total increase of 306 267 all-cause deaths reported from pandemic awareness through end of observation (seasonal weeks 17–52, 19 January through 26 September 2020), 179 903 (58.7%) were attributed to COVID-19; 49 311 (16.1%) to changes in reported causes other than COVID-19; 6909 (2.3%) to increased NEC deaths and 70 145 (22.9%) to increases in deaths with no reported cause. On a proportional basis, mortality change with unexplained (unreported or NEC) cause was much greater in the pandemic awareness period (19 January through 28 March, 65.2% of change in all-cause deaths) than in the initial pandemic peak period (29 March through 30 May, 16.1%) or the post-peak period (31 May through 26 September, 29.3%). In total, an increase of 77 054 unexplained deaths was responsible for 25.2% of change in all-cause mortality from 19 January through 26 September 2020.
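The percentages in the paragraph above follow directly from the reported counts, as the short check below shows. The variable names are illustrative. Note that the four components sum to 306 268 rather than the reported total of 306 267, which is presumably a rounding effect of population adjustment; the reported total is used as the denominator here.

```python
# Counts reported for seasonal weeks 17-52 using the five-season benchmark.
TOTAL_EXCESS = 306_267
components = {
    "covid_19": 179_903,
    "other_reported_causes": 49_311,
    "nec": 6_909,
    "no_reported_cause": 70_145,
}

# Share of total excess deaths for each component, as a percentage.
shares = {name: 100 * count / TOTAL_EXCESS for name, count in components.items()}

# Unexplained deaths combine NEC deaths and deaths with no reported cause.
unexplained = components["nec"] + components["no_reported_cause"]
unexplained_share = 100 * unexplained / TOTAL_EXCESS

for name, share in shares.items():
    print(f"{name}: {share:.1f}%")
print(f"unexplained: {unexplained} ({unexplained_share:.1f}%)")
```

The unexplained total of 77 054 is the sum of the NEC and no-reported-cause components, so its 25.2% share is the sum of their individual shares.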

Table 1

Overview of changes in reported deaths during pandemic periods, 2019–2020 vs prior five-season benchmark

Results using the 2-year and 3-year benchmarks were similar (online supplemental appendix 3). Using these benchmarks, increases in unexplained deaths accounted for 68.0%–72.5% of excess deaths during the pandemic awareness period; 14.1%–14.5% of excess deaths during initial peak and 26.6%–27.4% of excess deaths in the pandemic post-peak period. Measured from pandemic awareness through the end of observation, changes in unexplained deaths accounted for 22.2%–22.9% of excess deaths.

Discussion

This analysis of population-adjusted US death certificate data for six influenza seasons, the first to assess the extent and timing of unexplained pandemic deaths, indicated substantial impact of unexplained deaths on excess pandemic period mortality. The most important study limitation is that observation ended on 26 September 2020. Results may not apply to subsequent disease activity, including the surge late in 2020,29 or to other countries. Additionally, this analysis was descriptive, although it produced results for all-cause deaths similar to those using more sophisticated statistical methods.8 9 For example, the 299 028 excess deaths that the US Centers for Disease Control and Prevention estimated through 3 October 2020 using Poisson regression modelling9 are comparable with our estimate of 287 957–306 267 excess deaths through 26 September 2020.

Despite these limitations, the finding that a large proportion of 2020 excess mortality was unexplained by changes in top causes of natural death or respiratory disease suggests a need to extend thinking about pandemic mortality beyond COVID-19 or its physical sequelae. Although drug overdoses and suicides are not reported in the available NCHS data, several factors implicate these as potential causes of the unexplained US deaths. These include a reported rise in serious psychological distress among adults, from 3.9% in 2018 to 13.6% in April 2020,20 and a reported 18.2% increase in 12-month overdose death rates from June 2019 to May 2020.34 The increase in NEC deaths, which accelerated sharply beginning in approximately May 2020, is also consistent with this explanation because the R99 category included in the ICD-10 NEC group is commonly used pending forensic investigation of injurious death, introducing a lag period before cause-of-death determination.24 However, the NEC increases could also represent COVID-19 not yet diagnosed because of pending laboratory testing.

Arguing against suicide but in favour of overdose as a causal factor underlying the unexplained deaths, US mortality data reported through August 2020 suggested early pandemic period increases in overdoses, homicides and unintentional injuries, but decreases in suicides and motor vehicle accidents, relative to historical experience.35 An important caveat to these early findings is that they represent the USA as a whole, possibly masking outcomes in economically and socially vulnerable populations that were already at increased risk of behavioural health-related mortality prior to the pandemic.36–39 Among these are young adults, described in a US Substance Abuse and Mental Health Services Administration (SAMHSA) report as ‘a uniquely vulnerable population’ based on pandemic period data on anxiety, depression, traumatic stress, psychological distress, loneliness, substance misuse and suicidal ideation.37 Also at elevated risk were women, racial and ethnic minorities, healthcare workers and paediatric populations.36–39 These disparate behavioural health effects suggest that the underlying causes of excess deaths should be explored in US population subgroups, rather than only for the nation as a whole.

Also supporting the interpretation of possible behavioural health effects, the timing of onset of unexplained deaths in February 2020 suggests they did not result from COVID-19 or sequelae of COVID-19 deaths (eg, bereavement). Increases in unexplained deaths began about 4–6 weeks before >1000 COVID-19 cases had been reported nationwide,29 approximately coinciding with extensive media coverage of the disease40 and a nationwide increase in use of psychotropic medications for anxiety, depression and sleep disorders.41 Also supporting this interpretation are survey data from early in the pandemic, suggesting no significant association between psychological distress and personal acquaintance with someone who died of the disease,42 but strong associations with fear of COVID-19 contagion and of disruption to finances and employment.18 20

Moreover, a content analysis of media coverage of the pandemic, posted as a non-peer-reviewed working paper in November 2020, found that 91% of US major media stories, compared with 54% of non-US stories and 65% of scientific journal reports, were ‘negative in tone’.43 These preliminary findings suggest a possible bias unique to US media coverage of COVID-19. If confirmed with peer-reviewed research, the connection between this bias and psychological distress should be explored in additional studies. Neither the psychological effects of media coverage nor the specific causes of the unexplained deaths we observed could be assessed with available provisional mortality data. However, release of the full, final US cause-of-death file for 2020, which likely will occur by early 2022, will make analyses of psychiatric and injurious causes of death, overall and by demographic and regional subgroups, feasible.

Expanding pandemic period mortality research to include societal causes would help to evaluate a concern expressed by the SAMHSA about public health harms caused by focus ‘solely [on] virus containment’ rather than on ‘all aspects of health’.44 The addition of new UCOD categories for behavioural disorders, including psychiatric and substance use disorders, intentional self-harm and unintentional overdose, to the available files would facilitate this investigation. Together, these causes accounted for approximately 106 000 US deaths per year from 2010 to 2018,45 and their prevalence as UCODs has increased rapidly over time.46 47 The provisional files released by the NCHS in March of 2021, which included the diagnostic categories assessed in this research plus categories for accidents, intentional self-harm, homicide and drug overdoses,48 were aggregated monthly for the USA as a whole and therefore do not facilitate comparative policy analysis, such as by states with varying pandemic policies (eg, strict stay-at-home orders vs precautionary warnings).

Findings also suggest challenges in interpreting excess death reports because of between-group differences in predispositions to various causes of death.17 42 For example, the largest percentage increase in US pandemic period all-cause deaths occurred in adults aged 25–44 years,9 a group with low rates of COVID-19 mortality but elevated rates of anxiety and mood disorders,42 suggesting possible underlying psychiatric causes. Similarly, the USA had the fourth highest rate of alcohol dependence (8%) and the highest rate of opioid-related deaths (131 per million) in the world in 2016,17 implicating substance-related mortality as a likely contributor to unexplained deaths. In contrast, in groups with higher rates of risk factors for COVID-19 mortality, such as obesity or smoking,4 undetected COVID-19 may be a more likely cause of unexplained all-cause deaths. For example, rates of adult (aged >15 years) smoking in 2017 ranged from 11% or less in the USA and other countries (eg, Mexico, 8%) to >25% in France, Hungary, Turkey, Greece, Russia and Indonesia.17 Similarly, within the USA, statewide rates of obesity among adults in 2020 ranged from 24% to 40%.49

These large risk factor variations across groups and regions could represent markedly disparate true underlying causes for unexplained all-cause deaths. If so, the excess death calculation is uninterpretable when a large proportion of excess deaths is unexplained. This problem, which affected 22%–25% of the excess pandemic period deaths measured through September 2020, suggests that excess death calculations do not consistently provide actionable information and highlights previous calls for specific, standardised algorithms to certify mortality from respiratory illness and other causes on death certificates.1 3

Conclusion

Approximately 22%–25% of excess all-cause mortality during the US 2020 COVID-19 pandemic was unexplained by changes in the top eight causes of natural death, COVID-19, sepsis or other respiratory illness. The onset of unexplained deaths coincided with media coverage and previously reported nationwide increases in psychotropic use. Because unexplained excess deaths may represent disparate underlying causes in different demographic groups or regions, standard excess death calculations may lack utility for evidence-based policymaking. Findings highlight the need for improvements in death certification accuracy.

Ethics statements

Patient consent for publication

References

Supplementary materials

  • Supplementary Data

    This web only file has been produced by the BMJ Publishing Group from an electronic file supplied by the author(s) and has not been edited for content.

Footnotes

  • Contributors KAF and KJG performed concept and design, assisted by JDR and RSZ. Analyses were performed by KAF and JDR, assisted by RSZ. The manuscript was drafted by KAF and revised for important content by all authors. All authors read and approved the final manuscript. All authors agree to be accountable for all aspects of the work in ensuring that questions related to the accuracy or integrity of any part of the work are appropriately investigated and resolved. As manuscript guarantor, KAF accepts full responsibility for the finished work and the conduct of the study, had access to the data, and controlled the decision to publish.

  • Funding This manuscript was supported solely by Midwestern University and Kathleen Fairman LTD. The research received no specific grant from any funding agency in the public, commercial, or not-for-profit sectors.

  • Competing interests KAF is President and JDR is Research Intern with Kathleen Fairman LTD, a for-profit research consulting firm. Kathleen Fairman LTD provided analytical support and article processing charges but has no financial or non-financial interests related to the topic of the manuscript. KJG and RSZ have no competing interests to report.

  • Provenance and peer review Not commissioned; externally peer reviewed.

  • Supplemental material This content has been supplied by the author(s). It has not been vetted by BMJ Publishing Group Limited (BMJ) and may not have been peer-reviewed. Any opinions or recommendations discussed are solely those of the author(s) and are not endorsed by BMJ. BMJ disclaims all liability and responsibility arising from any reliance placed on the content. Where the content includes any translated material, BMJ does not warrant the accuracy and reliability of the translations (including but not limited to local regulations, clinical guidelines, terminology, drug names and drug dosages), and is not responsible for any error and/or omissions arising from translation and adaptation or otherwise.
