Abstract
Objective To investigate the influence of demographic and socioeconomic factors on the COVID-19 case-fatality rate (CFR) globally.
Design Publicly available register-based ecological study.
Setting Two hundred and nine countries/territories in the world.
Participants Aggregated data including 10 445 656 confirmed COVID-19 cases.
Primary and secondary outcome measures COVID-19 CFR and crude cause-specific death rate were calculated using country-level data from the Our World in Data website.
Results The average of country/territory-specific COVID-19 CFR is about 2%–3% worldwide and higher than previously reported at 0.7%–1.3%. A doubling in size of a population is associated with a 0.48% (95% CI 0.25% to 0.70%) increase in COVID-19 CFR, and a doubling in the proportion of female smokers is associated with a 0.55% (95% CI 0.09% to 1.02%) increase in COVID-19 CFR. The open testing policies are associated with a 2.23% (95% CI 0.21% to 4.25%) decrease in CFR. The strictness of anti-COVID-19 measures was not statistically significantly associated with CFR overall, but the higher Stringency Index was associated with higher CFR in higher-income countries with active testing policies (regression coefficient beta=0.14, 95% CI 0.01 to 0.27). Inverse associations were found between cardiovascular disease death rate and diabetes prevalence and CFR.
Conclusion The association between population size and COVID-19 CFR may imply the healthcare strain and lower treatment efficiency in countries with large populations. The observed association between smoking in women and COVID-19 CFR might be due to the finding that the proportion of female smokers reflected broadly the income level of a country. When testing is warranted and healthcare resources are sufficient, strict quarantine and/or lockdown measures might result in excess deaths in underprivileged populations. Spatial dependence and temporal trends in the data should be taken into account in global joint strategy and/or policy making against the COVID-19 pandemic.
- COVID-19
- epidemiology
- public health
This is an open access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited, appropriate credit is given, any changes made indicated, and the use is non-commercial. See: http://creativecommons.org/licenses/by-nc/4.0/.
Strengths and limitations of this study
This is the first study that investigated the relationship between COVID-19 case-fatality rate and demographic and socioeconomic factors globally.
Our study addressed the question from a geospatial perspective, which may inspire new reflections to fight against the COVID-19 pandemic globally.
Asymptomatic carriers and victims of COVID-19 were not taken into account in the analysis.
No detailed information on time from diagnosis to death and comorbidity of the COVID-19 cases is available in the current study, which might bias the association in an unknown direction.
Country-level analysis may conceal huge discrepancies between subnational entities in terms of both outcomes and predictors.
Introduction
The pandemic caused by the SARS-CoV-2 virus/COVID-19, which was initially reported in Wuhan, a city in the Hubei province in China, in December 2019, has become a major global health concern.1 Poor outcomes in those with COVID-19 infections correlate with clinical and laboratory features of cytokine storm syndrome, an exaggerated systemic inflammatory phenomenon due to overproduction of proinflammatory cytokines by the immune system that results in diffuse inflammatory lung disease and acute respiratory distress syndrome (ARDS).2 It may be complicated by sepsis, respiratory failure, ARDS and subsequent multiorgan failure.3 Although COVID-19-related deaths are not clearly defined in the international reports available so far, many governments are warning people to be particularly stringent in following the recommended prevention measures because COVID-19 may result in severe conditions that need critical care, including ventilation, or in death.4 By the end of June 2020, the pandemic had resulted in over 10 million confirmed cases and 500 000 deaths worldwide.5 COVID-19-related fatality rates are difficult to assess with certainty, but according to the estimates based on the data from China, the UK, Italy and the Diamond Princess cruise ship, the overall death rate from the confirmed COVID-19 cases is around 0.7%–1.3%, sharply rising from less than 0.002% in children aged 9 years or younger to 8% in people aged over 80, which is much greater than seasonal influenza at about 0.1%.4 6–10
During the COVID-19 pandemic, numerous studies on the global public health emergency have covered a significant range of disciplines, including medicine, mathematics and social sciences. The spatial spread is one of the most important properties of COVID-19, a characteristic which mainly depends on the epidemic mechanism, human mobility and control strategy.11 Spatial statistical methods are frequently used to uncover relationships between spatiotemporal patterns of infectious diseases and host or environmental characteristics,12 to generate detailed maps to visualise the distribution of the diseases’ morbidity or mortality13 14 and to identify hotspots, clusters and potential risk factors.15–18
Clinical risk factors for mortality of adult patients with COVID-19 have been investigated in numerous studies, and the identified factors include older age19; male sex20; higher sequential organ failure assessment score21; obesity22 23; pre-existing concurrent diabetes24; cardiovascular, cerebrovascular25 and kidney diseases26; and macroeconomic and environmental risk factors, such as socioeconomic deprivation,27 air pollution28 29 and diurnal temperature variation.30 However, there is a lack of published studies on the effects of country-level demographic and socioeconomic characteristics on COVID-19 case-fatality rates (CFRs). It is important for governments and regional or international non-governmental organisations to identify country characteristics that are associated with high CFR and to help develop prevention and intervention measures to fight against this global public health crisis.
Using the publicly available data from the non-governmental organisation Our World in Data,31 we aimed to investigate the relationship of key country-level demographic and socioeconomic indices and COVID-19 case-fatality, and to explore factors associated with COVID-19 CFR, which may indicate treatment efficiency and strain in healthcare resources, while controlling for the spatial dependence of the data collected at different locations.
Methods
Data on COVID-19 by Our World in Data
The COVID-19 dataset used in the study was downloaded on 2 July 2020 from the Our World in Data website; it is a collection of COVID-19 data maintained by the organisation Our World in Data and updated daily. The dataset includes country-level daily data on confirmed cases, deaths and testing, as well as other variables of potential interest.31 32 The data sources of the dataset include the European Centre for Disease Prevention and Control, the International Organisation for Standardisation (ISO), national government reports, the Department of Economic and Social Affairs of the United Nations (UN), UN Population Division, UN Statistics Division, Oxford COVID-19 Government Response Tracker, the World Bank, the Global Burden of Disease Collaborative Network, and Eurostat of the Organisation for Economic Cooperation and Development.31 In total, the dataset contained 34 indices from 209 countries and territories as of 2 July 2020. The dataset was linked to the global geospatial vector database using the ISO 3166-1 alpha-2 codes for the spatial modelling.33
Case-fatality rate
CFR of COVID-19 was calculated as the number of total deaths due to COVID-19 divided by the number of total confirmed COVID-19 cases by 2 July 2020 multiplied by 100. CFR was investigated in our study because it may reflect disease severity, as well as the efficiency of treatment and healthcare response and strain. CFR is not constant. It can vary between populations and over time, depending on the interplay between the causative agent of the disease, the host and the environment, as well as available treatments and quality of patient care. For example, it can increase if the healthcare system is overwhelmed by the sudden increase of cases.34
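The CFR definition above can be sketched in a few lines of Python. This is only an illustration: the function name is hypothetical, and the inputs are the pooled worldwide totals reported in the Results section. The pooled value it returns (roughly 4.9%) is not the same quantity as the mean or median of the country/territory-specific CFRs reported in table 1, because pooling weights each country by its case count.

```python
def case_fatality_rate(total_deaths: int, total_cases: int) -> float:
    """Return the CFR as a percentage: deaths / confirmed cases * 100."""
    if total_cases <= 0:
        raise ValueError("total_cases must be positive")
    return total_deaths / total_cases * 100


# Pooled worldwide totals reported in this study (by 2 July 2020).
pooled_cfr = case_fatality_rate(511_030, 10_445_656)
print(f"Pooled worldwide CFR: {pooled_cfr:.2f}%")
```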
We also calculated the crude cause-specific death rate (CDR) of COVID-19 in a sensitivity analysis and compared it with CFR. The CDR was calculated as the number of total deaths due to COVID-19 divided by the product of the population and the number of months of data collected, multiplied by 1 000 000.35
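As a minimal sketch of the CDR calculation, the following Python function divides deaths by person-months and rescales to a rate per 1 000 000 person-months. The function name and the example figures (1000 deaths, a population of 10 million, 6 months of data) are hypothetical and are not taken from the dataset.

```python
def crude_death_rate(total_deaths: float, population: float, months: float) -> float:
    """Return deaths per 1 000 000 person-months."""
    person_months = population * months
    if person_months <= 0:
        raise ValueError("population and months must be positive")
    return total_deaths / person_months * 1_000_000


# Hypothetical country: 1000 deaths, 10 million people, 6 months of data.
example_cdr = crude_death_rate(1_000, 10_000_000, 6)
print(f"CDR: {example_cdr:.2f} per 1 000 000 person-months")
```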
Statistical analysis
The numbers of confirmed cases, tests and tests per thousand people were not included in the analysis because they depended heavily on the population, detection policy, and quarantine and isolation policies of the countries and territories. Instead, we included the Stringency Index in the analysis, which is a composite measure based on nine response indicators, including school closures, workplace closures, testing policy and travel bans, rescaled to a value from 0 to 100 (100=strictest response).36 The Stringency Index data were obtained from the World Intellectual Property Organisation website on 1 July 2020.37 Because the variable ‘proportion of the population with basic handwashing facilities on premises’ has missing values in more than 50% of the countries/territories, it was also excluded from the analysis.
A subcomponent of the Stringency Index is the government policy on the access to COVID-19 test by four groups: 0, no policy; 1, only those who were both symptomatic and met specific criteria; 2, anyone symptomatic; and 3, open public testing, such as drive-through testing.36 The testing policy indicator was used for stratification of the analysis.
Collinearity and multiple collinearity between the variables were examined using Pearson’s correlation coefficient and multiple correlation coefficient, respectively.38 Spatial autocorrelation (or spatial dependence) is defined as the relationship between spatial proximity among some observational units and similarity among their values; positive spatial autocorrelation refers to situations in which the nearer the observational units, the more similar their values (and vice versa for its negative counterpart).39 This feature violates the assumption of independence among observations on which many regression analyses rely. Spatial autocorrelation among the fatality rates of the countries/territories was examined using a multivariate linear regression model and the Moran’s I test.40 The autocorrelation was visualised using the Matérn correlation coefficient.41
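To make the idea of spatial autocorrelation concrete, the sketch below computes the Moran's I statistic in plain Python. It is an illustration only: it returns the statistic but not the p value of the test, and the four-unit neighbourhood matrix and the values are hypothetical rather than taken from the study, in which the test was applied to regression residuals.

```python
def morans_i(values, weights):
    """Moran's I for a list of values and a square spatial weight matrix."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    s0 = sum(sum(row) for row in weights)  # sum of all spatial weights
    cross = sum(
        weights[i][j] * dev[i] * dev[j]
        for i in range(n)
        for j in range(n)
    )
    denom = sum(d * d for d in dev)
    return (n / s0) * (cross / denom)


# Hypothetical example: four units in a row, adjacent units are neighbours.
chain = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
clustered = morans_i([1.0, 2.0, 3.0, 4.0], chain)      # neighbours are similar
alternating = morans_i([1.0, -1.0, 1.0, -1.0], chain)  # neighbours are dissimilar
print(clustered, alternating)
```

A positive value indicates that neighbouring units tend to have similar values, and a negative value indicates the opposite.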
The Matérn correlation model, a commonly used model for spatial correlated data, was fitted for our data to investigate the relationship between COVID-19 case-fatality and the demographic and socioeconomic variables. The latitude and longitude of the centroid of the countries/territories were used as random effects in the Matérn correlation model.42
Variables with skewed distribution were log-transformed before entering the regression models. The multiple imputation method was used to handle the missing values in the data. The missing values were assumed to be missing at random. A total of 10 copies of the data were created, each of which had the missing values imputed by using switching regression, an iterative multivariable regression technique. Then, each complete dataset was analysed independently. Estimates of parameters of interest were then averaged across the 10 copies to give a single estimate using Rubin’s rule.43
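The pooling step can be sketched as follows, assuming the standard form of Rubin's rules: the pooled estimate is the mean of the per-imputation estimates, and the total variance is the within-imputation variance plus (1 + 1/m) times the between-imputation variance. The function name and the three example estimates are hypothetical; the study used 10 imputed copies.

```python
from statistics import mean, variance


def pool_rubin(estimates, variances):
    """Pool per-imputation estimates and their variances with Rubin's rules."""
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need at least two estimates with matching variances")
    pooled = mean(estimates)         # pooled point estimate
    within = mean(variances)         # average within-imputation variance
    between = variance(estimates)    # between-imputation variance (m - 1 denominator)
    total = within + (1 + 1 / m) * between
    return pooled, total


# Hypothetical coefficients from three imputed datasets, each with variance 0.01.
pooled_estimate, total_variance = pool_rubin([0.4, 0.5, 0.6], [0.01, 0.01, 0.01])
print(pooled_estimate, total_variance ** 0.5)  # estimate and its standard error
```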
Subgroup analysis was conducted by economic levels of the countries/territories according to the World Bank’s newest classification.44
The associations of the studied variables with COVID-19 CDR (per 1 000 000 person-months) of the countries/territories from 31 December 2019 to 1 July 2020 were also evaluated using the same methodology but using the Poisson regression model because of the rare event, and the results were presented in the supplemental materials.
All analyses were conducted in R V.4.0.2 (the R Foundation for Statistical Computing, Vienna, Austria) using the package spaMM45 and in Python V.3.6 (Python Software Foundation)46 using the packages geopandas and geoplot.47
In the study, estimates of health indicators at the global level were reported according to the Guidelines for Accurate and Transparent Health Estimates Reporting (GATHER) statement.48
Patient and public involvement
The study is a worldwide, publicly available register-based study; therefore, it was neither required nor possible to involve patients or the public in the design, conduct, reporting or dissemination plans of our research.
Results
Descriptive characteristics of the variables
In total, 10 445 656 confirmed COVID-19 cases and 511 030 deaths from 209 countries and territories and 17 variables (including latitude and longitude) were included in the study. The descriptive statistics of the variables are shown in table 1. The pairwise relationships of the variables are shown in figure 1. The CFR, CDR, cardiovascular disease (CVD) death rate and diabetes prevalence shown in table 1 and figure 1, and/or following tables and figures were not adjusted/standardised for age and sex; therefore, they are crude rates. High multicollinearity was found between the number of confirmed cases and population size (pairwise Pearson’s r=0.76, p<0.001; figure 1), and between gross domestic product (GDP) per capita and life expectancy, median age, the proportion of people aged 65 years or older, the proportion of extreme poverty and hospital beds per 1000 people (all pairwise Pearson’s r greater than or approximately equal to 0.70 and p<0.001; multiple correlation coefficient=0.92, p<0.001; figure 1). Therefore, median age, the proportion of people aged 65 or older and hospital beds per 1000 people were excluded from the later regression analysis. Although the proportion of extreme poverty is an interesting factor to be investigated, it is highly correlated with GDP per capita (r=−0.83), and the latter is available in more countries; therefore, the proportion of extreme poverty was also excluded from the analysis.
Worldwide COVID-19 CFR distribution
Distribution of COVID-19 CFR of the 209 countries/territories is shown in figure 2. The mean and median CFR worldwide are 3.31% and 2.19%, respectively (table 1), with the highest rates found in Yemen (27%), West and North Europe (14%–19%), and North America (9%–12%, figure 2).
Spatial autocorrelation of the COVID-19 CFR
Statistically significant spatial autocorrelation was found among the countries/territories’ COVID-19 CFR. The residuals from the common (non-spatial) multivariate linear regression models show apparent spatial dependence around the countries/territories with high fatality (figure 3). The p value from Moran’s I test for the spatial autocorrelation of the residuals is 2.32×10−5.
The estimated spatial autocorrelation coefficient of COVID-19 CFR between two locations against their distance is shown in figure 4, with a strength parameter ν of 2.48 and a decay parameter ρ of 0.12. This indicates that locations more than 20° (in longitude or latitude) away have an autocorrelation coefficient below 0.5 (figure 4).
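The reading of figure 4 can be checked approximately with a short sketch. It rests on two assumptions that are not stated in the text: that the correlation follows the Matérn form (ρd)^ν K_ν(ρd) / (2^(ν−1) Γ(ν)), with d the distance in degrees, and that ν=2.48 can be rounded to 2.5, for which the Matérn function has the closed form (1 + x + x²/3)·exp(−x) with x=ρd. The function name is hypothetical.

```python
import math


def matern_corr_nu_2_5(distance: float, rho: float) -> float:
    """Matérn correlation with smoothness nu = 5/2 (closed form), x = rho * distance."""
    x = rho * distance
    return (1 + x + x * x / 3) * math.exp(-x)


RHO = 0.12  # decay parameter reported in the text; nu is rounded from 2.48 to 2.5
for d in (0, 10, 20, 30):
    print(d, round(matern_corr_nu_2_5(d, RHO), 3))
```

Under these assumptions, the correlation is 1 at distance 0 and falls just below 0.5 at a separation of 20°, which agrees with the statement above.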
Associations of demographic and socioeconomic variables with COVID-19 CFR
Overall, after controlling for the spatial dependence, the statistically significant variables associated with COVID-19 CFR are population size and proportion of female smokers in a country/territory (table 2). The multivariate adjusted results indicate that, approximately, a doubling in size of population is associated with a 0.48% (95% CI 0.25% to 0.70%) increase in COVID-19 CFR, and a doubling in proportion of female smokers is associated with a 0.55% (95% CI 0.09% to 1.02%) increase in COVID-19 CFR. Open public testing policy is associated with decreased CFR (beta=−2.23, 95% CI −4.25 to −0.21) compared with no testing policy.
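The "doubling" interpretation follows from the log transformation of the predictor. As a hedged sketch, assuming the predictor entered the model on the natural-log scale (the text does not state the base), a coefficient beta corresponds to a change of beta × ln(2) in CFR when the predictor doubles. The function names are hypothetical, and the conversion below simply inverts the reported doubling effect.

```python
import math


def effect_of_doubling(beta_per_ln_unit: float) -> float:
    """Change in the outcome when a natural-log-transformed predictor doubles."""
    return beta_per_ln_unit * math.log(2)


def beta_from_doubling(doubling_effect: float) -> float:
    """Coefficient per natural-log unit implied by a reported doubling effect."""
    return doubling_effect / math.log(2)


# Reported effect of doubling population size: +0.48 percentage points of CFR.
implied_beta = beta_from_doubling(0.48)
print(round(implied_beta, 3), round(effect_of_doubling(implied_beta), 2))
```

If the variables were instead transformed with a base-2 logarithm, the regression coefficient would equal the doubling effect directly.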
Because associations might differ by the proportion of the population aged 65 years or older (65+), we produced stratified estimates by the proportion of people aged 65+ years (see online supplemental table S1). Briefly, population size and testing policy were found to be associated with CFR in the countries with a proportion of people aged 65+ years between 5% and 10%; and GDP per capita, population size, population density and the proportion of smokers were associated with CFR in the countries with a proportion of people aged 65+ years larger than 15%.
The estimated contour of COVID-19 CFRs worldwide is shown in figure 5. The areas with the higher risks are mainly around North America and West Europe.
The subgroup analysis by economic level indicates that population size, CVD death rate and diabetes prevalence are statistically significantly associated with COVID-19 CFR in the lower-income to middle-income countries; population size, diabetes prevalence, and the policies of testing only those who are symptomatic and meet specific criteria and of testing anyone symptomatic are statistically significantly associated with COVID-19 CFR in the upper-income to middle-income countries (table 2).
However, the subgroup analysis in upper-income to middle-income and high-income countries by testing policy indicates that, if testing was ensured (testing policy=2 or 3), an increment in the Stringency Index is associated with increased COVID-19 CFR (beta=0.14, 95% CI 0.01 to 0.27) (table 3). The finding might imply that an open society policy does not necessarily lead to a higher CFR when testing is ensured. Meanwhile, diabetes prevalence is inversely associated with CFR (beta=−3.30, 95% CI −5.85 to 0.74).
Discussion
Geospatial analysis of the COVID-19 pandemic
The COVID-19 pandemic is still full of unknowns, and many of them have a spatial dimension that leads to understanding the emergency geographically. Analysis of the COVID-19 data requires an interdisciplinary approach, including spatial statistics, that may provide important implications to policies addressing the spatial issues in the pandemic.49 A recently published review summarised studies by 1 May 2020 on geospatial and spatial–statistical analysis of the COVID-19 pandemic. In total, 63 studies were categorised into five subjects: spatiotemporal analysis, health and social geography, environmental variables, data mining and web-based mapping.50 Although 15 of the studies address the question globally, none of them investigated the association of COVID-19-related deaths with country-level demographic and/or socioeconomic factors. From a global health perspective, there is a knowledge gap in the research field. Our geospatial analysis fills this gap and shows the utility of the analysis to improve the understanding of the consequences of COVID-19 and their related factors from a global level, and contributes to the predictive modelling and decision-making to combat the pandemic.
Proportion of smokers and COVID-19 CFR
The proportion of female smokers was positively associated with COVID-19 CFR in the overall analysis, but the association diminished when the analysis was stratified by the economic level of the countries/territories. This is counterintuitive, given that more severe COVID-19 was associated with male sex, possibly due to immune system and hormone levels,51 and with smoking.52 53 The observed association between smoking in women and COVID-19 CFR might be due to the finding that the proportion of female smokers reflected broadly the income level of a country (figure 6A). In line with the theory of diffusion of innovation, the spread of the smoking habit has been shown to pass through stages of rise, levelling and decline, from rich to poor, men to women, and young to old.54–56 In the early phase, the prevalence of smoking increases in men, and women take up smoking a few decades later. Subsequently, male smoking starts to decline and female smoking follows later on. This pattern has been found to spread from rich to poor countries. In general, Asian and African countries tend to have low female smoking but high male smoking, while in European countries, prevalence is similar between men and women.57 58 Therefore, female smoking in the overall analysis is a marker of the development of a country, and its association with CFR diminishes when the analysis is stratified by economic level.
Population size and density and COVID-19 CFR
Our results indicate that a larger population size is most consistently associated with higher COVID-19 CFR (figure 6B), but population density is not associated with the outcome, after controlling for some demographic and socioeconomic variables and spatial dependence worldwide. The association of population size with CFR can be interpreted in at least two ways. It could be because larger countries have experienced a greater number of deaths and have conducted relatively fewer tests compared with countries with smaller populations. Alternatively, in larger countries, healthcare might have been strained, resulting in a relatively higher number of deaths among confirmed cases than in smaller countries.59 We are unable to disentangle the mechanisms of the association. Therefore, it is recommended that the analysis be replicated by a study with more detailed healthcare information on both individual and country levels. However, in national studies, higher density has been shown to be associated with higher COVID-19 prevalence in Japan,60 Italy61 and Iran.62 The lack of an association between COVID-19 CFR and population density globally might be due to confounding by testing strategies and economic levels. In countries like Germany and South Korea, which took a more active testing strategy than, for example, the UK, where PCR testing for COVID-19 was at the beginning performed only among those who had severe symptoms and were admitted to hospital,63 CFR naturally showed lower values. There are weak but statistically significant negative associations between population size and population density (r=−0.15, p=0.042) and GDP per capita (r=−0.20, p=0.006). To minimise the confounding, we conducted a stratified analysis by economic level (table 2) and testing policy (only within upper-income to middle-income and high-income countries, table 3).
Furthermore, we conducted a sensitivity analysis for CDR and the results were similar (see online supplemental tables S2, S3 and figures S1–S4). The results suggest that, globally, healthcare strain should be first relieved and treatment efficiency should be improved in countries with large populations.
Economic level and COVID-19 CFR
In our analysis, high COVID-19 CFR was found mainly around North America and West Europe (figure 1). One of the possible reasons might be that these countries counted COVID-19 deaths by including those who died with it, not only from it.8 64 Determination of COVID-19 deaths also differed by country. Some countries recorded any death as a COVID-19 death once the patient became a confirmed case, even if the death happened after 2 months and possibly from other causes (such as an accident), while in some other countries, a COVID-19 death was recorded as a death that occurred within a certain period (ranging from 2 to 8 weeks) after COVID-19 symptom onset.65 Furthermore, the extent to which the counting covered homes, institutions and hospitals in high-income countries is different from that in low-income countries.64
It has been reported that CFR was more favourable in low-income countries.64 66 There are three possible explanations for this unusual pattern: a younger population, poor data quality, or the fact that COVID-19 infection was still at an early stage (at the time of writing this paper).64 There is a tight relationship between the income level of a country and demographic structure. In 2015, many African countries were classified as low-income, and the median age of the population was less than 20 years. About 61% of the population was 24 years or younger, and merely 3% was equal to or older than 65 years.67 It has been shown that younger age is associated with a lower likelihood of severe COVID-19.4 68 69 However, the prevalence of risk factors such as lack of hygiene facilities, handwashing soap and water is greater,64 and higher viral load has been suggested to be linked to more severe disease.70 Healthcare resources are usually low in low-income countries.71 Therefore, our finding of favourable CFR for low-income countries may be in part due to the pandemic being at an early stage (figure 6C). Finally, even before the pandemic, developing countries had challenges in collecting, verifying and aggregating reliable data in a timely manner due to a lack of resources, communication and technological development.72 Moreover, the pandemic might have accentuated the pre-existing challenges.64 The extent of bias is difficult to know, including whether it is still the early stage of infection in developing countries. Further monitoring and investigation are necessary in the future.
Stringency of measures against the COVID-19 epidemic and COVID-19 CFR
According to the currently available data from the Oxford COVID-19 Government Response Tracker,36 South Americans and Asians took the strictest measures (table 4 and figure 6D), and they also had relatively lower COVID-19 CFR (table 4). However, in our multivariable analysis, which controlled for other variables and spatial dependence, we did not observe a statistically significant association between Stringency Index and COVID-19 CFR. In contrast, stricter measures were even found to be associated with higher CFR in the high-income countries with active testing policies (table 3), which seems to support the current argument that lockdown measures might result in excess deaths in underprivileged populations, and those in need are hit harder by the crisis.73 So far, evidence that a stricter response reduced healthcare strain or improved treatment efficiency, as reflected by COVID-19 CFR, is lacking. However, the findings need to be further examined by comparing the all-cause mortalities in previous years. Meanwhile, the reliability of the Stringency Index also needs to be further investigated. The relationship between socioeconomic measures against the pandemic and COVID-19 CFR is a complicated issue which needs deeper spatiotemporal analysis with more detailed and reliable information in the future.
Noticeably, we also observed negative associations between COVID-19 CFR and CVD death rate and diabetes prevalence in some analyses, which might be partially explained by the competing risk between the deaths and/or comorbidities,74 because most of COVID-19 deaths are among the elderly and have one or more comorbidities.75–77 Therefore, the COVID-19 CFR worldwide deserves deeper investigation with more detailed comorbidity information.
Strengths and limitations
To our knowledge, this is the first study that investigated the relationship between COVID-19 CFR and demographic and socioeconomic factors globally. Although numerous studies have investigated the aforementioned factors related to the COVID-19 CFR, they either investigated the question locally or did not approach this issue from a geospatial perspective. Our study may inspire new reflections from healthcare workers to work together against the COVID-19 pandemic geographically and globally. International comparison of CFR may be challenging when the ascertainment of COVID-19 cases differs by country. To tackle this, we performed a sensitivity analysis using CDR. Although some risk factors, such as CVD and diabetes, showed different patterns of association, population size showed consistent and positive associations with COVID-19 CDR (see online supplemental tables S1 and S2).
There are many limitations in our study. First, the case fatality analysed here was based on the reported COVID-19 cases and deaths by countries/territories. According to recent estimations, asymptomatic carriers of COVID-19 could account for as much as 10%–80% of a population.78–83 However, this fraction was not taken into account in our analysis. Therefore, the CFRs presented in the study might be significantly higher than the real ones. In addition, there is no single globally accepted definition of COVID-19-related death; therefore, the variation in the reported values of CFR could not be fully explained, and the bias derived from the difference in the definition of COVID-19-related death between the countries could not be excluded using the data available so far. Second, the age structure of the population influences both prevalence and mortality of COVID-19; although we adjusted our analysis using the proportion of people aged over 65 years in populations, residual confounding largely remains. Third, no individual-level data are available in the current study; thus, results should not be extrapolated to individual-level association. Fourth, because no diagnostic date was available in the Our World in Data dataset, the time between diagnosis and death was not known, which could lead to variation in patient follow-up time among countries and, therefore, potential differences in CDR (because the CDR is calculated using person-time). Fifth, country-level analysis may conceal huge discrepancies between subnational entities in terms of both outcomes and predictors. The case of Northern and Southern Italy is an epitome of this. In-depth geospatial studies conducted at subnational levels are expected to provide less biased and more actionable results. Furthermore, during an ongoing pandemic, delayed reporting occurs for both the number of cases and deaths, and strategies against the crisis also change over time. Although the analysis using data from two different time points, obtained on 17 June and on 2 July, produced the same results, suggesting that the bias due to delayed reporting might be negligible, the dynamics of the problem need to be addressed using temporal statistical methods.
Conclusion
The average of country/territory-specific COVID-19 CFR is about 2%–3% worldwide, which is higher than previously reported at 0.7%–1.3% and possibly due to the unreported asymptomatic cases. The COVID-19 CFR is statistically significantly associated with population size, especially in middle-income and high-income countries, which may indicate the healthcare strain and/or lower treatment efficiency in the countries with large populations, and secondary to higher transmission risk and generally poorer health. When testing is warranted and healthcare resources are sufficient, strict quarantine and/or lockdown measures might result in excess deaths in countries with high-income level. No statistically significant findings were found in low-income countries, which might be due to the challenges in data collection, communication and verification in the countries and need to be further investigated in follow-up studies. To make global joint strategy and/or policy against the COVID-19 pandemic, spatial dependence and temporal trends must be considered in data analysis and decision making.
Footnotes
Twitter @clin_epi
Funding The authors have not declared a specific grant for this research from any funding agency in the public, commercial or not-for-profit sectors.
Map disclaimer The depiction of boundaries on this map does not imply the expression of any opinion whatsoever on the part of BMJ (or any member of its group) concerning the legal status of any country, territory, jurisdiction or area or of its authorities. This map is provided without any warranty of any kind, either express or implied.
Competing interests None declared.
Patient consent for publication Not required.
Ethics approval The data used in the study are freely available in the open source for research. There are no individual data available in the dataset. The owner of the data permits the use, distribution and reproduction of the data in any medium. Therefore, no private and confidential information could be disclosed in the study, and ethical approval is not applicable.
Provenance and peer review Not commissioned; externally peer reviewed.
Data availability statement Data are available in a public, open access repository. All data used in this study are publicly available at https://ourworldindata.org/ and are referenced in the article.
Supplemental material This content has been supplied by the author(s). It has not been vetted by BMJ Publishing Group Limited (BMJ) and may not have been peer-reviewed. Any opinions or recommendations discussed are solely those of the author(s) and are not endorsed by BMJ. BMJ disclaims all liability and responsibility arising from any reliance placed on the content. Where the content includes any translated material, BMJ does not warrant the accuracy and reliability of the translations (including but not limited to local regulations, clinical guidelines, terminology, drug names and drug dosages), and is not responsible for any error and/or omissions arising from translation and adaptation or otherwise.