Original research
Are cause of death data for Shanghai fit for purpose? A retrospective study of medical records
  1. Lei Chen1,
  2. Tian Xia2,
  3. Zheng-An Yuan3,
  4. Rasika Rampatige4,
  5. Jun Chen5,
  6. Hang Li4,
  7. Timothy Adair4,
  8. Hui-Ting Yu6,
  9. Martin Bratschi7,
  10. Philip Setel7,
  11. Megha Rajasekhar4,
  12. H R Chowdhury4,
  13. Saman Hattotuwa Gamage4,
  14. Bo Fang6,
  15. Omair Azam7,
  16. Romain Santon7,
  17. Zhen Gu7,
  18. Ziwen Tan1,
  19. Chunfang Wang1,
  20. Alan D Lopez8,
  21. Fan Wu9
  1. 1Vital Statistics, Shanghai Municipal Center for Disease Control and Prevention, Shanghai, China
  2. 2Division of Public Health and Program Management, Shanghai Institute of Preventive Medicine, Shanghai, China
  3. 3Central Office, Shanghai Municipal Center for Disease Control and Prevention, Shanghai, China
  4. 4Melbourne School Of Population And Global Health, The University of Melbourne, Melbourne, Victoria, Australia
  5. 5Cancer Registration and Civil Statistic, Shanghai Putuo District Center for Disease Control and Prevention, Shanghai, China
  6. 6Vital Statistics, Shanghai Institute of Preventive Medicine, Shanghai, China
  7. 7Public Health Programs, Vital Strategies, New York, New York, USA
  8. 8IHME, University of Washington, Seattle, Washington, USA
  9. 9Shanghai Medical College, Fudan University, Shanghai, China
  1. Correspondence to Dr Fan Wu; clbjsahh{at}


Objectives To assess the quality of cause of death reporting in Shanghai for both hospital and home deaths.

Design and setting Medical records review (MRR) to independently establish a reference data set against which to compare original and adjusted diagnoses from a sample of three tertiary hospitals, one secondary level hospital and nine community health centres in Shanghai.

Participants 1757 medical records (61% males, 39% females) of deaths that occurred in these sample sites in 2017 were reviewed using established diagnostic standards.

Interventions None.

Primary outcome Original underlying cause of death (UCOD) from medical facilities.

Secondary outcome Routine UCOD assigned from the Shanghai Civil Registration and Vital Statistics (CRVS) system and MRR UCODs from MRR.

Results The original UCODs as assigned by doctors in the study facilities were of relatively low quality, with 31% of deaths assigned to garbage codes, reduced to 2.3% following the data quality and follow-back procedures routinely applied by the Shanghai CRVS system. The original UCOD had lower chance-corrected concordance and cause-specific mortality fraction accuracy of 0.57 (0.44, 0.70) and 0.66, respectively, compared with 0.75 (0.66, 0.85) and 0.96, respectively, after routine data checking procedures had been applied.

Conclusions Training in correct death certification for clinical doctors, especially tertiary hospital doctors, is essential to improve UCOD quality in Shanghai. A routine quality control system should be established to actively track diagnostic performance and provide feedback to individual doctors or facilities as needed.

  • public health
  • epidemiology
  • medical education & training
  • statistics & research methods

Data availability statement

Data may be obtained from a third party and are not publicly available. The data that support the findings of this study are available from Shanghai Municipal Center for Disease Control and Prevention (Shanghai CDC) but restrictions apply to the availability of these data, which were used under license for the current study, and so are not publicly available. Aggregated results are however available from the authors on reasonable request and with permission of Shanghai CDC.

This is an open access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited, appropriate credit is given, any changes made indicated, and the use is non-commercial. See:

Strengths and limitations of this study

  • Assessment of diagnostic accuracy at the individual level for over 1700 deaths.

  • Established ex-ante diagnostic criteria used to develop reference diagnosis for assessing data quality, thus reducing subjectivity in choice of reference diagnoses.

  • Medical history documentation for the top 25 causes of death was reviewed by one trained doctor, thus increasing the risk of potential diagnostic bias for reference diagnoses.


Accurate and complete cause of death (COD) information, particularly as reported in the civil registration and vital statistics (CRVS) system, is essential for health decision making.1 2 Despite its development, this is equally applicable for Shanghai, a city of 14 million registered permanent residents,3 which, although experiencing a relatively low mortality rate and high life expectancy (80.2 years),4 faces a number of health-related challenges stemming from an ageing population, including a significant non-communicable disease burden and unequal access to healthcare resources for different population subgroups, depending on income and residence.5–10 Accurate COD data are critical for reliably informing policies and programmes to address these challenges and better allocate available healthcare resources, yet no formal scientific evaluation of the completeness and quality of data generated by the existing vital registration system in Shanghai has been undertaken.

There are more than 300 hospitals in Shanghai, with over 70 000 doctors reporting causes of death. In addition, 246 community health centres (CHC) also report causes of death, particularly for those who die at home. Although CHC doctors have received training in diagnosing causes of death, the correct certification of home deaths remains challenging in cases where there is limited or no documentation of the previous healthcare experience of the deceased.

It is likely that the diagnostic accuracy of deaths in Shanghai is high given the standard quality assurance process that is routinely applied (see figure 1). First, death certificates with the original underlying causes of death (UCODs), that is, the last mentioned COD in Part 1 of the death certificate, are collected from the hospitals or CHCs. Next, trained physician-coders at the District Center for Diseases Control and Prevention (CDC) bureau apply the rules of the International Classification of Diseases, 10th Revision to code the certificates. In cases where the morbid sequence is unclear or improbable, further investigation is undertaken to collect more information on the deceased’s medical history or from family members. Physicians at the Shanghai Municipal CDC then review all UCODs and, in cases where the UCOD is inconsistent with the sequence specified on the death certificate, the district CDC doctor is contacted with suggested changes. Finally, every 6 months a quality control meeting is convened by Shanghai CDC where certifying physicians from each of the district CDC offices meet to review and discuss those death certificates with inconsistent UCODs. These certificates, following review as described above, comprise what is known as the ‘Routine UCOD’ in the Shanghai CRVS system (see online supplemental appendix S1 for a detailed description of the quality assurance process).

Figure 1

Routine CRVS procedure in Shanghai. CDC, Center for Diseases Control and Prevention; CHC, community health centres; CRVS, Civil Registration and Vital Statistics; UCOD, underlying cause of death.

In this study, we compare both the original UCOD and the routine UCOD to a reference UCOD derived from an independent medical records review (MRR) to assess the overall quality of reported CODs in Shanghai. Specifically, our objectives were to assess:

  • The quality of COD reporting in Shanghai for both hospital and home deaths.

  • Differences in the UCOD pattern in Shanghai suggested by the MRR reference diagnosis.

To our knowledge, this is the first ever empirical evaluation of the quality of COD data in Shanghai. We expect that the findings will be useful for those responsible for health policy formulation and evaluation since they provide a scientific basis for deciding how much confidence can be attributed to local mortality data that underlie health policy and programme decisions.


Data sources

Cases for the MRR were selected to be broadly representative of the distribution of deaths across facilities, socioeconomic level and location of the facility. Thirteen health facilities were selected, including three tertiary hospitals, one secondary level hospital and nine CHCs, chosen to be representative of these types of health facilities in Shanghai.

Data inclusion

To ensure that the reference diagnoses for the MRR were as accurate and comparable as possible, we applied the ‘gold standard’ (GS) criteria for the classification of COD developed by the Population Health Metrics Research Consortium (PHMRC).11 Specifying the classification criteria for causes of death ex-ante reduces the amount of subjectivity potentially introduced by the case reviewers. The degree of certainty for each of the MRR reference diagnoses was classified as follows:

  1. GS1: highest level of certainty—MRR diagnosis was supported by either an appropriate laboratory test or X-ray/imaging with positive findings and/or medically observed and documented appropriate illness sign(s) to a predetermined standard

  2. GS2A: high level of certainty—diagnosis supported by appropriate laboratory/X-ray with positive findings and/or medically observed and documented appropriate illness or signs to a predetermined standard

  3. GS2B: high level of certainty—presumed initial diagnosis of a particular condition with high certainty; this category was only used for cancer and HIV patients on long-term treatment where initial data had been lost.

  4. GS3: Reasonable level of certainty—medical diagnoses not supported by appropriate level of laboratory investigations but which meet established clinical criteria.

  5. GS4: Unsupported—medical diagnoses not supported by adequate observed and documented clinical evidence/criteria.

The MRR was conducted using the Shanghai adaption of the Medical Data Audit Form (MDAF) (online supplemental appendix S2), originally designed for other MRR studies,12 translated into Chinese for the data collection. All study physicians were trained on how to review a medical record using the MDAF, and how to apply the standard diagnostic criteria and GS levels. Data collection was conducted by the study physicians using the modified MDAF. Four clinicians representing four major clinical streams (medicine, surgery, paediatrics and obstetrics and gynaecology) were trained to provide a further level of quality assurance of the MRR carried out by the study physicians.

Death certificates from the selected facilities where the routine UCOD was among the leading 25 causes of death for Shanghai in 2017 were reviewed and classified into different GS levels based on the information obtained from the medical records. To ensure that the MRR UCOD was as accurate as possible, only GS1, GS2A and GS2B cases were included in the final evaluation; GS3 and GS4 deaths were discarded due to lack of adequate evidence from the medical records.

Sample size and selection

A total of 1192 GS1 and GS2 cases were collected in the four hospitals (from a total of 2378 deaths reviewed), along with 565 GS1 and GS2 cases from the CHCs (out of 751 deaths reviewed). Figure 2 presents a flow chart of the case selection: 1350 cases were discarded because their UCOD was not in the top 25 COD for Shanghai, and 22 (1.2%) of cases were discarded because they were classified as GS3 or GS4.
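The case counts in this paragraph can be checked with a few lines of arithmetic. The sketch below only restates the numbers reported above; the variable names are illustrative, and the denominator used for the 1.2% figure is an inference from the numbers rather than something stated in the text.

```python
# Counts reported in the sample-selection paragraph (hospitals and CHCs).
reviewed = {"hospitals": 2378, "chcs": 751}
included = {"hospitals": 1192, "chcs": 565}
excluded_not_top25 = 1350   # UCOD not among the top 25 causes
excluded_gs3_gs4 = 22       # classified as GS3 or GS4

total_reviewed = sum(reviewed.values())
total_included = sum(included.values())
total_excluded = excluded_not_top25 + excluded_gs3_gs4

# The flow chart balances only if reviewed = included + excluded.
balances = total_reviewed == total_included + total_excluded

# Assumption: the reported 1.2% uses the top-25 cases (included + GS3/GS4)
# as the denominator; this reproduces the figure but is not stated in the text.
gs3_gs4_pct = round(100 * excluded_gs3_gs4 / (total_included + excluded_gs3_gs4), 1)

print(total_reviewed, total_included, total_excluded, balances, gs3_gs4_pct)
```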

Figure 2

Flow chart for selection of cases. CHC, community health centres; GS, gold standard; MRR, medical records review; UCOD, underlying cause of death.

Data analysis

First, we reclassified the recorded UCODs to the Global Burden of Disease (GBD) cause list, comprising 290 causes, including a code for ‘garbage’ codes.13 14

Based on the GBD cause list, we then compared:

  1. The original UCOD with the MRR UCOD.

  2. The routine UCOD with the MRR UCOD.

To assess COD data quality, we calculated the percentage of deaths in the original and routine UCOD data sets that had been assigned a ‘garbage code’, namely, a cause that has limited public health utility.15 16 While this analysis is a useful and easily replicable first step in assessing data quality, it does not provide any insight into the potential misclassification of specific CODs, which was one of the key aims of our study. To do so requires an MRR study, where trained, independent physicians review the medical records of a deceased individual using pre-set clinical diagnostic criteria and assign a UCOD which is compared with the previously assigned UCOD. This provides a ‘GS’ against which the quality of the original and routine COD data can be measured.17–19 In this sense, our study forms part of a collective of several MRR studies that have been conducted using the PHMRC ‘GS’ diagnostic definitions as ex-ante criteria to identify true cases of a disease or injury.11 12 20–24
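The garbage-code percentage described above is a simple proportion. The following is a minimal sketch in Python (the study itself used R); the set of garbage codes is a hypothetical stand-in containing only the three ICD-10 codes named as examples elsewhere in this article, not the full GBD garbage-code list, and the toy records are invented for illustration.

```python
# Hypothetical, deliberately incomplete garbage-code set: only the three
# ICD-10 codes this article names as examples (R99, I50.9, J96.9).
GARBAGE_ICD10 = {"R99", "I50.9", "J96.9"}


def garbage_share(ucod_codes, garbage_codes=GARBAGE_ICD10):
    """Return the fraction of deaths whose UCOD is a garbage code."""
    codes = list(ucod_codes)
    if not codes:
        raise ValueError("no deaths supplied")
    n_garbage = sum(code in garbage_codes for code in codes)
    return n_garbage / len(codes)


# Toy data (not study data): 10 deaths, 3 of them garbage-coded.
toy_ucods = ["I63.9", "R99", "C34.9", "I50.9", "I25.1",
             "J44.9", "J96.9", "E11.2", "W19", "I61.9"]
toy_share = garbage_share(toy_ucods)

# The same proportion applied to the counts reported in this article.
original_pct = round(100 * 542 / 1757)      # original UCOD
routine_pct = round(100 * 40 / 1757, 1)     # routine UCOD
```

Applied to the reported counts, this reproduces the 31% (original) and 2.3% (routine) figures.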

A misclassification matrix was developed for each comparison in order to identify the pattern and extent of certification errors. Standard diagnostic validation metrics of sensitivity, positive predictive value (PPV), Cohen’s kappa, chance-corrected concordance (CCC), cause-specific mortality fraction (CSMF) accuracy and leading CSMFs were calculated to assess the concordance of the original and routine UCODs with the MRR UCOD.11 25 All analyses were done using R software (The R Foundation for statistical computing, V.3.6.1).
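As an illustration of how such a matrix and the per-cause metrics relate, the sketch below builds a misclassification matrix from paired reference and assigned causes and derives sensitivity, PPV and Cohen's kappa from it. It is written in Python rather than the R used in the study, the function names are hypothetical, and the ten toy deaths are invented for illustration.

```python
from collections import Counter


def misclassification_matrix(reference, assigned):
    """Count (reference cause, assigned cause) pairs."""
    if len(reference) != len(assigned):
        raise ValueError("reference and assigned must have equal length")
    return Counter(zip(reference, assigned))


def sensitivity(matrix, cause):
    """Share of deaths truly due to `cause` that were assigned `cause`."""
    true_total = sum(n for (ref, _), n in matrix.items() if ref == cause)
    return matrix[(cause, cause)] / true_total if true_total else float("nan")


def ppv(matrix, cause):
    """Share of deaths assigned `cause` that were truly due to `cause`."""
    assigned_total = sum(n for (_, asg), n in matrix.items() if asg == cause)
    return matrix[(cause, cause)] / assigned_total if assigned_total else float("nan")


def cohen_kappa(matrix):
    """Agreement between the two codings, corrected for chance agreement."""
    n = sum(matrix.values())
    observed = sum(c for (ref, asg), c in matrix.items() if ref == asg) / n
    ref_totals = Counter()
    asg_totals = Counter()
    for (ref, asg), c in matrix.items():
        ref_totals[ref] += c
        asg_totals[asg] += c
    expected = sum(ref_totals[k] * asg_totals[k] for k in ref_totals) / n ** 2
    return (observed - expected) / (1 - expected)


# Toy data (not study data): reference = MRR UCOD, assigned = original UCOD.
reference = ["stroke"] * 4 + ["IHD"] * 3 + ["falls"] * 2 + ["COPD"]
assigned = ["stroke", "stroke", "garbage", "garbage",
            "IHD", "IHD", "COPD", "garbage", "falls", "COPD"]

matrix = misclassification_matrix(reference, assigned)
stroke_sensitivity = sensitivity(matrix, "stroke")
copd_ppv = ppv(matrix, "COPD")
kappa = cohen_kappa(matrix)
```

In this toy example half of the stroke deaths are lost to garbage codes, so sensitivity for stroke is low even though every death assigned to stroke is correct (PPV of 1).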

For the misclassification matrices, only the 16 leading causes of death based on MRR UCODs have been included to facilitate interpretation of findings; all other diseases were merged into a residual group, labelled ‘others’.
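The grouping step can be sketched as follows. This is an illustrative Python helper, not the authors' code: the function name, the `always_keep` parameter (used here so that garbage codes stay a separate category, as they do in the matrices) and the toy data are all assumptions.

```python
from collections import Counter


def collapse_to_leading(causes, reference, n_leading=16,
                        residual="others", always_keep=("garbage",)):
    """Relabel causes outside the n_leading most frequent reference causes.

    The leading causes are ranked on the reference (MRR) diagnoses only, so
    the same category set is applied to every data set being compared.
    """
    leading = {cause for cause, _ in Counter(reference).most_common(n_leading)}
    keep = leading | set(always_keep)
    return [cause if cause in keep else residual for cause in causes]


# Toy data (not study data), using 2 leading causes instead of 16.
reference = ["stroke"] * 3 + ["IHD"] * 2 + ["falls", "asthma"]
assigned = ["stroke", "garbage", "stroke", "IHD", "COPD", "garbage", "asthma"]

reference_grouped = collapse_to_leading(reference, reference, n_leading=2)
assigned_grouped = collapse_to_leading(assigned, reference, n_leading=2)
```

Ranking on the reference diagnoses rather than on each data set separately keeps the rows and columns of the matrix aligned.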


Description of the study data

A total of 1757 deaths were included in the study, 61% male and 39% female (table 1). There were only two deaths at ages less than 15 years, while 70% of deaths were among people aged 70 years and above. The tertiary hospitals accounted for 43% of deaths, the secondary hospitals 25% and the CHCs 32%. Cases from the tertiary and secondary hospitals had a similar age and sex distribution, while cases from CHCs were older (see table 1).

Table 1

Deaths (number and %), by type of facility, sex and age

GS1 deaths comprised 62% of all deaths and GS2 38%. Tertiary hospitals had the highest percentage of cases that were GS1 (79%), followed by secondary hospitals (60%), with the lowest percentage in the CHCs (40%). This gradation in diagnostic capacity is as expected since tertiary hospitals would have had the most advanced and complete medical and diagnostic facilities, followed by secondary hospitals. GS1 cases were less common among the oldest age groups, compared with younger ages, as would be expected given that the elderly typically suffer from multiple comorbidities at or around the time of death, making diagnosis of the UCOD more difficult (see table 2).

Table 2

Gold standard levels by type of health facility and age (number and %)

Validation of the original and routine UCOD

We first compared the ranking of the leading CODs from the original UCOD, as well as the routine UCOD, to the reference UCOD determined from MRR. Interestingly, the leading COD as assessed from the original (ie, prequality checking) UCOD was the collective of ‘garbage’ codes, which were assigned to almost one-third (31%) of deaths. Over one-quarter (27.7%) of these garbage codes were coded to essential hypertension, followed by pneumonia, ill-defined deaths and unknown causes, unspecified heart failure and unspecified respiratory failure (see online supplemental table 1). Both ischaemic stroke (sixth to second) and intracerebral haemorrhage (ninth to sixth) rose in the rankings after MRR compared with the original UCOD, suggesting that these two conditions are being systematically underdiagnosed by healthcare facilities in Shanghai. For all other leading CODs, there were only minor changes in rankings suggested by MRR compared with the original UCOD, although there was a generalised increase in CSMFs due to reallocation of the garbage codes. Conversely, for the routine UCODs, garbage codes were only assigned to 2.3% of deaths, suggesting that the quality control process established in Shanghai for mortality data is working well. Indeed, most of the rankings for leading CODs in the routine data were identical to those from the MRR UCOD, except for an increase in the ranking of falls after MRR (13th to 11th), indicating a high consistency for the leading CODs between the routine UCOD and MRR UCOD (see online supplemental table 2).

We developed misclassification matrices to assess the accuracy of cause-specific diagnosis in both the original and routine data sets by comparing individual diagnoses of the original UCOD to the MRR UCOD (see table 3). Overall, 59.9% (1053/1757) of the original death certificates were assigned the correct UCOD by doctors. For the 96 cases where the original UCOD was classified to a cause belonging to the residual or ‘other’ category, 84 (87.5%) of them were reclassified to specific leading causes after MRR (see table 3). Almost one-third (31%, 542/1757) of the original UCODs were assigned to garbage codes. Moreover, of these, over one-third (34.5%) were categorised as having the most severe implications for policy, including such vague diagnoses as ill-defined and unknown cause of mortality (R99), unspecified heart failure (I50.9) and unspecified respiratory failure (J96.9) (see online supplemental table 3). Falls (71.1%), ischaemic stroke (50.9%), chronic kidney disease (CKD) due to diabetes mellitus type 2 (48.0%), intracerebral haemorrhage (44.2%) and other diseases (58.3%) were frequently misassigned to garbage codes such as essential hypertension and pneumonia in the original data. Aside from garbage codes, many other diseases were also frequently misdiagnosed. CKD due to diabetes mellitus type 2 was often misdiagnosed as diabetes mellitus type 2, while many deaths due to ischaemic heart disease (IHD) were misassigned to chronic obstructive pulmonary disease (COPD) and ischaemic stroke.

Table 3

Misclassification matrix between original UCOD and MRR UCOD

Misdiagnoses were much less frequent when comparing the routine UCOD to the MRR UCOD, as shown in table 4. In particular, garbage codes, on further investigation, were found to be primarily deaths due to IHD, ischaemic stroke, colorectal cancer, diabetes mellitus type 2 and COPD. The rigour of the data quality processes in Shanghai is clear from the fact that only 40/1757 deaths in the routine data set were assigned to garbage codes, compared with 542, or 13.5 times as many, in the original data. (After MRR, only 14/1757 deaths were assigned garbage codes. These were cases where the reviewers could not identify more specific UCODs even after going through all available medical records.)

Table 4

Misclassification matrix between routine UCOD and MRR UCOD

Overall, only 14.3% of causes in the routine data set were reallocated on MRR. In particular, CKD due to diabetes mellitus type 2, diabetes mellitus type 2 and falls were often misassigned to other diseases: CKD due to diabetes mellitus type 2 to diabetes mellitus type 2 and COPD, and diabetes mellitus type 2 to IHD. Overall, though, the resulting CSMFs emerging from the routine data closely approximate what the MRR suggests is the true distribution of UCOD in these facilities.

Table 5 provides the summary metrics that assess the concordance of the original UCOD as well as the routine UCOD with the MRR UCOD. CCC measures the probability that a given cause is correctly diagnosed. CSMF accuracy is the overall accuracy of the COD distribution in a population, ranging from 0 to 1, with a value of one implying perfect concordance.21 25
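For readers who want to see how these two summary metrics are computed, the sketch below uses the definitions commonly given in the verbal autopsy validation literature: for cause j among N causes, CCC_j = (sensitivity_j - 1/N) / (1 - 1/N), and CSMF accuracy = 1 - sum_j |CSMF_true_j - CSMF_pred_j| / (2 × (1 - min_j CSMF_true_j)). That the study used exactly these forms, and the choice of N as the number of reference causes, are assumptions; the code is Python rather than the R used by the authors, and the toy data are invented.

```python
from collections import Counter


def csmf(causes, cause_list):
    """Cause-specific mortality fractions over a fixed cause list."""
    counts = Counter(causes)
    return {cause: counts[cause] / len(causes) for cause in cause_list}


def csmf_accuracy(reference, assigned):
    """1 minus total absolute CSMF error, scaled by its worst possible value."""
    cause_list = sorted(set(reference) | set(assigned))
    true = csmf(reference, cause_list)
    pred = csmf(assigned, cause_list)
    abs_error = sum(abs(true[c] - pred[c]) for c in cause_list)
    worst_case = 2 * (1 - min(true.values()))
    return 1 - abs_error / worst_case


def ccc_by_cause(reference, assigned):
    """Chance-corrected concordance for each cause present in the reference."""
    causes = sorted(set(reference))
    if len(causes) < 2:
        raise ValueError("CCC needs at least two reference causes")
    chance = 1 / len(causes)  # assumption: N = number of reference causes
    result = {}
    for cause in causes:
        idx = [i for i, ref in enumerate(reference) if ref == cause]
        sens = sum(assigned[i] == cause for i in idx) / len(idx)
        result[cause] = (sens - chance) / (1 - chance)
    return result


# Toy data (not study data): reference = MRR UCOD, assigned = original UCOD.
reference = ["stroke"] * 4 + ["IHD"] * 3 + ["falls"] * 2 + ["COPD"]
assigned = ["stroke", "stroke", "garbage", "garbage",
            "IHD", "IHD", "COPD", "garbage", "falls", "COPD"]

ccc = ccc_by_cause(reference, assigned)
overall_ccc = sum(ccc.values()) / len(ccc)   # unweighted mean across causes
accuracy = csmf_accuracy(reference, assigned)
```

CCC is an individual-level measure, whereas CSMF accuracy is a population-level measure: compensating errors between causes can leave CSMF accuracy high even when CCC is modest, which is consistent with the routine data scoring 0.96 on CSMF accuracy but 0.75 on CCC.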

Table 5

Validation metrics comparing original UCOD or routine UCOD with MRR UCOD (top 16 specific UCOD)

Overall, the original UCOD had a CCC and CSMF accuracy of 0.57 (0.44, 0.70) and 0.66 (on a scale from 0 to 1), respectively, meaning that, on average, only 57% of all the UCODs reported by health facilities were correctly diagnosed, and that, overall, causes of death are only about two-thirds as accurate as they should be for guiding policy. The sensitivity and CCC were highest for garbage codes and cancers (breast, colon, lung, oesophagus and stomach) and lowest for falls, and diabetes mellitus type 2 (including CKD due to diabetes).

PPV was high for all causes, except for garbage codes and diabetes (including CKD due to diabetes), indicating that for cases when specific causes (instead of garbage codes) were reported as original UCODs, they were usually consistent with the MRR evaluation.

The validation metrics for the routine UCOD demonstrated a high level of concordance with the reference MRR diagnoses, with CCC and CSMF accuracy of 0.75 (0.66, 0.85) and 0.96, respectively (table 5 and online supplemental figure 1), confirming the impression based on the misclassification matrix. The only causes with relatively low concordance were falls and diabetes mellitus (including diabetes mellitus type 2 and CKD due to diabetes mellitus), which were commonly assigned to other causes such as garbage codes and cardiovascular diseases in the routine UCOD. In addition, CKD due to diabetes mellitus type 2 tended to be misclassified as diabetes type 2 (see tables 4 and 5).

The overall accuracy of mortality data in Shanghai, as measured by CSMF accuracy and CCC, was substantially higher for the routine data than the original death certificates for all three types of health facilities. Interestingly, COD diagnoses for home deaths were more accurate than hospital deaths, with tertiary hospitals assigning less accurate CODs than the other two types of facility (see online supplemental tables 4–6).


The quality of COD data reported by the Shanghai CRVS system varies greatly between the original and routine UCOD. The original UCOD data, based solely on medical certification by doctors and public health physicians, are of only moderate quality, with 31% of deaths being assigned to garbage codes (ranked first among all causes) and an overall CSMF accuracy of 0.66 when compared with a much more reliable reference data set (MRR UCOD). The routine UCOD data, following extensive quality control and rigorous review with follow-back as necessary, are, however, much more reliable and highly concordant with the MRR UCOD, with only 2.3% of deaths being assigned to garbage codes and overall CSMF accuracy of 0.96, exceeding that found in hospitals in the Philippines and Mexico.21 23 Introducing rigorous review procedures for COD data can therefore greatly improve the quality of COD data, as has also been demonstrated in Brazil.26

Our study has identified a substantial diagnostic deficit in the quality of the original UCODs, with a high proportion of CODs incorrectly classified as garbage codes. Potentially more concerning, however, is that cardiovascular diseases were often misclassified as COPD and diabetes. MRR studies from Mexico and the Philippines suggested that deaths due to falls, pneumonia and cirrhosis were often wrongly assigned to cardiovascular disease,21 23 whereas these UCODs were often assigned to garbage codes in our data. Further, CKD due to diabetes mellitus tended to be misclassified as diabetes, although this may be less grave from a public health perspective as they both highlight the importance of diabetes control. On further investigation, we found that among all cases where diabetes as the UCOD had been misclassified to CKD due to diabetes, there was clear evidence from the death sequence that diabetes was leading to chronic renal insufficiency, and even to uraemia. It is unlikely that this distinction would have significant public health implications, although the GBD classification separates the two causes of death, possibly to facilitate more in-depth epidemiological analyses. It is also worth mentioning that death due to ‘falls’, which was not in the top 15 CSMFs based on the original UCOD, increased in rank based on the MRR. In terms of the original UCOD, when there is a fall, the clinician filling in the death certificate tends to describe the consequences, such as fracture, haematoma or multiple organ failure, rather than the fall itself as the underlying COD.

The poor quality of the original UCOD reported by the certifying physicians can likely be attributed to insufficient training in correct death certification in the undergraduate medical curriculum or during their residency period, a lack of understanding of the public health utility of COD, and misunderstanding of the concept of underlying COD. These problems are, somewhat surprisingly, worse in tertiary hospitals compared with secondary hospitals and especially CHCs. Possible explanations include a lack of training resources and of sensitisation of doctors in large hospitals to the importance of correct COD certification. The workloads and responsibilities of doctors vary across different departments and different hospitals, undoubtedly affecting the quality of death certification. This situation is similar to many other countries.27 In addition, the public health sector (CDC) has no executive leadership capacity for secondary or tertiary hospitals in China, which makes the CDC requirements harder for hospitals to follow. Conversely, for home deaths reported through the CHCs, the terminal disease process is typically less complex, with likely greater compliance among doctors.

Even though the reporting of garbage codes is low in the routine Shanghai CRVS system, our study revealed that a certain degree of misclassification still exists. Deaths due to diabetes mellitus were misclassified to cardiovascular disease, undoubtedly reflecting the difficulties in deciding the clinical sequence and origin of the diseases from clinical judgement, or the fact that the determination of the UCOD in such cases is often no more than the certifier’s medical opinion, which may differ from one certifier to another, even when based on the same information. The misclassification of diabetes to CKD may have arisen either because CKD may be recognised as simply a signal of physical failure by physicians, or because the certifier is not familiar with the relevance of the distinction for public health. Few deaths were assigned to falls, which usually occur among older people, who often present with many non-communicable diseases such as IHD and diabetes mellitus type 2. The existence of possible directional misclassification might lead to unpredictable impacts on the true distribution of causes of death in Shanghai, potentially reducing their policy value. These more common misclassification errors should be specifically addressed in future efforts to train doctors in correct medical certification of COD.

There are some limitations of our study. Among the nearly 700 UCOD categories recorded in the database, the top 25 causes typically accounted for 75%–79% of all deaths. To reduce the workload of doctors and remove causes with too few cases expected in the selected facilities to provide reliable comparison with GS causes, as well as considerations of the timeline and representativeness of the sample, only the top 25 UCODs were included in our study. In addition, all the results and conclusions were deduced from records with adequate documentation, and so may not be applicable to uncommon UCODs or deaths with insufficient medical records.

Another limitation was that the medical histories of deaths included in the study were only reviewed by one doctor. It is likely that a parallel review by two doctors may have revealed further insights from the medical histories, potentially leading to a different ‘GS’ diagnosis than that applied in this study. While it is difficult to assess the impact of one reviewer on diagnostic accuracy, the requirement of adhering to standardised ex-ante diagnostic criteria applied by the PHMRC should, in principle, have reduced the effect of this risk.

In conclusion, our study has highlighted that the quality control procedures implemented by district CDCs and the Shanghai municipal CDC as part of the routine CRVS system, where all deaths reported as garbage codes and other implausible causes of death are investigated and corrected, substantially increase cause-specific diagnostic accuracy and greatly reduce the percentage of garbage codes in the data.

Our study also suggests that proper training for clinical doctors in death certification would be the most important strategy for improving UCOD data quality in Shanghai. Compulsory training for doctors is likely to be a cost-effective means for improving diagnostic accuracy compared with the more time-consuming and labour-intensive quality control process currently applied. Multiple forms of training including face-to-face or online methods are available and should be considered.28 Shanghai CDC has adapted the e-learning curriculums provided by the University of Melbourne with the intention of making the training on correct certification part of the regular curriculum for medical students. In addition, Shanghai CDC is currently in the process of compiling training materials from actual case examples in Shanghai with systematic and high frequency errors. However, knowledge gained through a training course does not always guarantee improvement in certification. Hence, a clear implication of our study is the need to improve the information exchange mechanism between the district CDC doctors responsible for correcting the medical certificates of UCOD and the hospitals, to ensure that feedback is effective and contributes to preventing diagnostic errors at source, as recognised elsewhere.29–31 Making it a requirement that physicians show competency in COD certification in order to complete their residency training would also enhance certification competency, as is now being piloted by Shanghai CDC.

Ethics statements

Patient consent for publication

Ethics approval

Minimal risk ethics approval for this research was granted by the Melbourne School of Population and Global Health Human Ethics Advisory Group (Ethics ID: 1647517.1.1).


The authors would like to acknowledge the contributions of the following who provided feedback and comments on various iterations of this paper: Ian Riley, Rohina Joshi, Deirdre McLaughlin, Adam Karpati, and Fatima Marinho. This paper is, in part, an output of the Bloomberg Philanthropies Data for Health Initiative (



  • LC, TX, Z-AY and RR are joint first authors.

  • CW, ADL and FW are joint senior authors.

  • Twitter @SantonRomain

  • Contributors RR, FW and AL conceived the study design, MC, SHG, HRC, Z-AY, TX, RS, ZG and ZT oversaw the research. LC and JC were members of the writing group. HL, MB, PS, TA, CW and AL provided feedback on data analysis, results and discussion. H-TY, BF, OA and MR revised the manuscript critically for important intellectual content. All authors contributed to the framework construction, results interpretation, manuscript revision and approved the final version of the manuscript. The corresponding authors attest that all listed authors meet authorship criteria and that no others meeting the criteria have been omitted. The corresponding author is responsible for the overall content as guarantor.

  • Funding The research was funded by the Bloomberg Philanthropies Data for Health Initiative ( Award/Grant number is not applicable. Also, Clinical Research Project of Health Industry of Shanghai Health Commission in 2020(Award number: 20204Y0205).

  • Disclaimer The funders had no role in the study design, data collection and analysis, decision to publish, or preparation of the manuscript.

  • Competing interests None declared.

  • Provenance and peer review Not commissioned; externally peer reviewed.

  • Supplemental material This content has been supplied by the author(s). It has not been vetted by BMJ Publishing Group Limited (BMJ) and may not have been peer-reviewed. Any opinions or recommendations discussed are solely those of the author(s) and are not endorsed by BMJ. BMJ disclaims all liability and responsibility arising from any reliance placed on the content. Where the content includes any translated material, BMJ does not warrant the accuracy and reliability of the translations (including but not limited to local regulations, clinical guidelines, terminology, drug names and drug dosages), and is not responsible for any error and/or omissions arising from translation and adaptation or otherwise.