Multilevel determinants of racial/ethnic disparities in severe maternal morbidity and mortality in the context of the COVID-19 pandemic in the USA: protocol for a concurrent triangulation, mixed-methods study
  1. Jihong Liu1,
  2. Peiyin Hung2,
  3. Chen Liang2,
  4. Jiajia Zhang1,
  5. Shan Qiao3,
  6. Berry A Campbell2,4,
  7. Bankole Olatosi2,
  8. Myriam E Torres1,
  9. Neset Hikmet5,
  10. Xiaoming Li3
  1. 1Department of Epidemiology & Biostatistics, University of South Carolina Arnold School of Public Health, Columbia, South Carolina, USA
  2. 2Department of Health Services Policy & Management, University of South Carolina Arnold School of Public Health, Columbia, South Carolina, USA
  3. 3Department of Health Promotion, Education, & Behavior, University of South Carolina Arnold School of Public Health, Columbia, South Carolina, USA
  4. 4Department of Obstetrics and Gynecology, University of South Carolina School of Medicine, Columbia, South Carolina, USA
  5. 5Department of Integrated Information Technology, University of South Carolina College of Engineering and Computing, Columbia, South Carolina, USA
  1. Correspondence to Dr Jihong Liu; jliu{at}


Introduction The COVID-19 pandemic has affected communities of colour the hardest. Non-Hispanic black and Hispanic pregnant women appear to have disproportionate SARS-CoV-2 infection and death rates.

Methods and analysis We will use the socioecological framework and employ a concurrent triangulation, mixed-methods study design to achieve three specific aims: (1) examine the impacts of the COVID-19 pandemic on racial/ethnic disparities in severe maternal morbidity and mortality (SMMM); (2) explore how social contexts (eg, racial/ethnic residential segregation) have contributed to the widening of racial/ethnic disparities in SMMM during the pandemic and identify distinct mediating pathways through maternity care and mental health; and (3) determine the role of social contextual factors on racial/ethnic disparities in pregnancy-related morbidities using machine learning algorithms. We will leverage an existing South Carolina COVID-19 Cohort by creating a pregnancy cohort that links COVID-19 testing data, electronic health records (EHRs), vital records data, healthcare utilisation data and billing data for all births in South Carolina (SC) between 2018 and 2021 (>200 000 births). We will also conduct similar analyses using EHR data from the National COVID-19 Cohort Collaborative including >270 000 women who had a childbirth between 2018 and 2021 in the USA. We will use a convergent parallel design which includes a quantitative analysis of data from the 2018–2021 SC Pregnancy Risk Assessment Monitoring System (unweighted n>2000) and in-depth interviews of 40 postpartum women and 10 maternal care providers to identify distinct mediating pathways.

Ethics and dissemination The study was approved by institutional review boards at the University of SC (Pro00115169) and the SC Department of Health and Environmental Control (DHEC IRB.21-030). Informed consent will be provided by the participants in the in-depth interviews. Study findings will be disseminated with key stakeholders including patients, presented at academic conferences and published in peer-reviewed journals.

  • COVID-19
  • Health informatics
  • Maternal medicine

This is an open access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited, appropriate credit is given, any changes made indicated, and the use is non-commercial. See:

Strengths and limitations of this study

  • This study investigates whether the COVID-19 pandemic, structural racism and racial discrimination have contributed to the racial/ethnic disparities in severe maternal morbidity and mortality in the USA.

  • This study employs a state-of-the-art design (ie, a convergent parallel design) and machine learning models to rigorously examine the questions of interest.

  • This study will use a large-scale population-based cohort study concurrently for both South Carolina and the USA, which will innovatively integrate COVID-19-related clinical, surveillance, electronic health record, and geospatial data at community, healthcare institutions and system/policy levels.

  • The effects of both county-level and ZIP-code-level social contexts will be calculated at the maternal residence location.

  • The static residential social context measures might not reflect women’s long-term exposure to neighbourhood structural racism.


Annually, nearly 60 000 women experience severe maternal morbidity (SMM) (ie, unexpected complications of labour and delivery) and mortality (SMMM).1 2 Between 1993 and 2014, SMMM rates in the US tripled from 49.5 to 146.6 per 10 000 childbirths.3 For every 70 US women who experienced an SMM, one died during or immediately after pregnancy.4 The SMM occurrences have also led to significant short-term or long-term clinical impacts on women’s health5 and added significant costs to women, their families, taxpayers and the healthcare system.6–8

Non-Hispanic black (hereafter, black) women experience a threefold to fourfold risk of pregnancy-related deaths compared with non-Hispanic white (hereafter, white) women.9 10 Black and Hispanic women were up to 110% more likely to experience SMMM,2 despite their younger maternal age (often a protective factor for SMMM) as compared with white women. Such racial/ethnic disparities in SMMM rates have persisted for over a decade, with increasing rates among all racial/ethnic groups.11 These SMMM rates are unevenly distributed socioeconomically and geographically, with the highest rate among low-income women who delivered at hospitals in the Deep South states.2 12–14

The unprecedented COVID-19 pandemic hit communities of colour the hardest.15–17 Pregnant Black and Hispanic women experienced disproportionate COVID-19 infection and death rates.18–20 The impacts of COVID-19 on SMMM remain unclear. During the pandemic, as unemployment, income instability and financial stress have affected many US families, Black and Hispanic families have faced even higher hardship rates.21 These disproportionate consequences reflect long-standing inequities, often stemming from structural racism and discrimination (eg, residential segregation, poverty, inadequate education, unemployment and lack of home ownership).22 23 These inequities can lead to uneven access to quality healthcare, psychosocial stress and unhealthy lifestyles among women of colour, which further increases SMMM risk.24 25 Yet, the aetiology of SMMM is complex, multifaceted and time-varying. Prior research efforts on racial/ethnic disparities in SMMM have mostly focused on maternal and healthcare factors,26 leaving questions regarding the dynamics and interactions of multilevel determinants, such as the broader social contexts of these risks, largely unanswered. Thus, there is an urgent need to examine how social contexts of all types play out in SMMM rates, especially during the COVID-19 pandemic.22 23

South Carolina (SC) ranked 11th in COVID-19 cases per capita as of 23 December 2021.27 Prior to the pandemic, SC ranked 42nd in the USA in overall health and 41st in maternal mortality.28 Births to Black women accounted for nearly 30% of all SC births.29 Black women living in SC experienced a twofold to threefold higher risk of SMMM than their white counterparts.30 The majority of counties in SC are designated medically underserved areas.31 Considering SC’s poor health ranking, striking racial disparities in SMMM, racially diverse population, and historical systemic Southern contexts, SC is an ideal environment in which to examine health disparities in SMMM occurring during the COVID-19 pandemic.

The overarching goal of this study is to investigate racial/ethnic disparities in SMMM, the contributing roles and mediating pathways of social contexts (eg, structural racism, racial discrimination), and the long-standing health consequences of the pandemic by studying the distributions of COVID-19 cases and multilevel determinants of maternal health in SC and the USA. Our study will: (1) examine the impacts of the COVID-19 pandemic on racial/ethnic disparities in SMMM; (2) examine and explore how the key features of social contexts (including structural racism and racial discrimination) have contributed to the widening racial/ethnic disparities in SMMM during the pandemic and identify distinct mediating pathways through maternity care and mental health; and (3) examine and identify the role of social contextual factors and protective factors on racial/ethnic disparities in pregnancy-related long-standing morbidities (eg, hypertension, pulmonary embolism, diabetes, cardiovascular disease), using machine learning algorithms.

Methods and analysis

Multilevel conceptual framework

The aetiology of racial/ethnic disparities in SMMM is complex and multifaceted (figure 1).24 At the microlevel, in addition to maternal race/ethnicity, other sociodemographics (eg, age, socioeconomic status (SES)), health behaviours (eg, prenatal care adequacy, smoking, diet, physical activity, gestational weight gain) and preexisting maternal conditions (eg, hypertension, prepregnancy body mass index (BMI), diabetes, HIV infection, obstetric comorbidity scores) potentially drive racial/ethnic disparities in SMMM.12 32 33 As compared with white women, black and Hispanic women usually have higher poverty rates, lower educational levels, and higher rates of pre-existing conditions or high-risk pregnancy.32 At the macrolevel, structural racism and discrimination—community and neighbourhood factors (eg, residential segregation, inadequate housing, lack of access to healthy food, no public transportation), healthcare institutional attributes (eg, access to risk appropriate perinatal care) and system-level factors (eg, COVID-19 pandemic, state public health emergency policies) may play a role in racial/ethnic disparities. These macrolevel factors interact with microlevel factors to further exacerbate racial/ethnic disparities in SMMM.

Figure 1

Multilevel conceptual framework to examine racial/ethnic disparities in severe maternal morbidity and mortality in the context of COVID-19 pandemic.

Study design

The above-mentioned multilevel conceptual framework guided our study design (figure 2). We will employ a concurrent triangulation, mixed-methods study design to rigorously examine racial/ethnic disparities in SMMM in SC and the USA. This convergent parallel design will allow us to better understand the underlying mechanisms linking social contexts and racial/ethnic disparities in SMMM via maternity care and mental health, using the data from the statewide pregnancy survey and qualitative interviews with pregnant and postpartum women and maternity care providers. Given the multilevel and multidomain nature of risk factors for pregnancy-related long-standing morbidities, we will use novel machine learning models to forecast the intertwining effects of social contexts and multilevel factors on maternal health during the COVID-19 pandemic. We will conduct the quantitative and qualitative data analyses concurrently. We will then compare and contrast findings from these two methods for similarities and incongruences and will interpret findings jointly.

Figure 2

Using concurrent triangulation mixed-methods design to investigate racial/ethnic disparities in severe maternal morbidity and mortality during the COVID-19 pandemic. SC, South Carolina.

Data sources

We will leverage our statewide South Carolina COVID-19 Cohort (S3C) database, which integrates COVID-19-related clinical, surveillance, electronic health records (EHRs) and geospatial and temporal data at community, healthcare institutional and system levels to comprehensively examine the roles of social contexts on racial/ethnic disparities in SMMM. To ensure the generalisability of our findings, we will also examine them using EHR data from the ongoing National COVID-19 Cohort Collaborative (N3C).34 Nationwide social context databases (eg, American Community Survey (ACS), American Hospital Association (AHA)) and time-varying COVID-19 infection and social distancing policies data will be added to both S3C and N3C. Postpartum women’s survey responses and in-depth interview data will be analysed to understand complex pathways and multilevel determinants of maternal morbidities (figure 2).

SC COVID-19 Cohort-Pregnancy Database

With support from the National Institute of Allergy and Infectious Diseases (R01A127203-4S1), our team established a statewide S3C database for COVID-19 research in June 2020 by integrating various state-level data sources including: (1) the COVID-19 testing data from the SC Department of Health and Environmental Control (SC DHEC); (2) hospital encounter data for inpatient hospitalisation, outpatient surgery, home health and emergency departments; (3) health utilisation data from large public and private health insurance plans (eg, Medicaid, State Health Plan, BlueCross BlueShield of SC); (4) EHR data from health systems (Prisma and MUSC) and (5) programme data from the SC Department of Mental Health. The database is updated every 6 months.

In this study, as part of the National Institutes of Health (NIH)’s ‘Implementing a Maternal health and PRegnancy Outcomes Vision for Everyone’ initiative supported by the Office of Director, NIH (3R01A127203-5S2), our team will create a population-based S3C pregnant women cohort (S3C-P), which includes all women who gave birth between 2018 and 2021 in SC (>200 000 births, 57.2% white, 31.1% black, 4.6% Hispanic) and will add vital record data (birth and death certificates) to complement existing linkages from the parent S3C cohort. The identification of pregnancy status and COVID-19 infection will be cross-verified using EHR, claims data, laboratory reports and ongoing SC DHEC medical chart reviews among >4387 pregnant women with confirmed COVID-19 infections in SC as of December 2021. The SC Office of Revenue and Fiscal Affairs will collate databases and provide our team with a deidentified linked database system.35

National COVID-19 Cohort Collaborative

The N3C is a novel data consortium that integrates EHR and medical claims data from 92 healthcare systems and institutes across 50 states. N3C enables data sharing, computable phenotypes and collaborative data mining by harmonising EHR data of diverse standards using the Observational Medical Outcomes Partnership (OMOP) Common Data Model. N3C was created to study potential risk factors and protective factors of COVID-19 and its long-term health consequences.34 As of 24 December 2021, N3C has aggregated 9.4 million patients (3.3 million COVID-19 patients) with their EHR dating back to January 2018, including 1.9 million women with COVID-19 (>61 k pregnant women) and 3.5 million women without COVID-19 (>209 k pregnant women). The participants in N3C represent diverse populations in the USA (eg, geographic, socioeconomic, racial/ethnic). Building on a secured cloud environment, N3C provides data harmonisation, privacy-preserving data linkage and high-performance data analytics. Our team has already gained access to the restricted N3C database including ZIP codes of patients and health systems and dates of services.

Nationwide social context databases

The 2015–2019 ACS and the 2018–2020 AHA Annual Survey will be used to calculate county-level residential segregation measures, racial discrimination in SES and ZIP-level accessibility to hospital-based obstetric units.

Time-varying local COVID-19 infection and social-distancing policy data

To better understand local pandemic settings, we will also add the Centers for Disease Control and Prevention (CDC) COVID-19 Case Surveillance restricted datasets for nationwide cases confirmed since 11 March 2020, together with the CDC’s data on state-level social distancing policies in the early pandemic (eg, emergency declaration, stay at home order) and on telehealth services expansion, each with its corresponding date of enactment.36

SC Pregnancy Risk Assessment Monitoring System

SC Pregnancy Risk Assessment Monitoring System (PRAMS), part of the national PRAMS, is an ongoing survey of SC mothers who have recently given birth.37 These mothers are sampled from state birth certificates. After statistical weighting, PRAMS data are representative of all mothers who gave birth in SC. SC PRAMS added 11 COVID-19-related questions to its survey for mothers who delivered in or after August 2020. SC PRAMS routinely collects detailed psychosocial and behavioural risk factors for each participant, which are not available in S3C and N3C. Residential ZIP codes will be used to add in social contextual variables and other ZIP-level or county-level characteristics. The unweighted sample size for SC 2018–2021 PRAMS will be at least 2000.

Key measures

Outcome measures

The main outcomes of interest will be SMMM.38 We will adapt a previously validated algorithm that uses the International Classification of Diseases, 10th Revision, Clinical Modification diagnosis and procedure codes to identify women with one or more of the 21 SMM indicators developed by the CDC and updated by the Alliance for Innovation on Maternal Health programme at the time of childbirth. Maternal mortality will be identified using statewide death certificate data from the childbirth date to up to 1-year post partum. A composite variable of SMMM will be created to reflect SMM or maternal mortality incidence. We will also study an MMM composite, which includes mortality and morbidities related to hypertensive disorders of pregnancy, postpartum haemorrhage and infections/sepsis that happen during pregnancy through 6 weeks post partum.39 Other outcomes to be studied include: (1) adverse maternal outcomes including intensive care unit (ICU) admission, invasive ventilation, receipt of extracorporeal membrane oxygenation, etc; (2) prolonged length of stay,40 41 and (3) hypertension, pulmonary embolism, type 2 diabetes, cardiovascular diseases (eg, heart attack, myocardial infarction, thrombus, stroke) diagnosed within 1 year after delivery.
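The composite logic described above can be sketched in a few lines. This is a minimal illustration only, not the validated algorithm: the record layout (`Delivery`, `smm_indicator_count`, `death_date`) is hypothetical, the SMM indicators are assumed to have been flagged already from the diagnosis and procedure codes, and "up to 1-year post partum" is taken here as 365 days.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class Delivery:
    """Hypothetical linked record for one childbirth."""
    delivery_date: date
    smm_indicator_count: int            # SMM indicators flagged at childbirth
    death_date: Optional[date] = None   # from the linked death certificate, if any


def died_within_one_year(d: Delivery, window_days: int = 365) -> bool:
    """True if death occurred between delivery and window_days afterwards (assumed cutoff)."""
    if d.death_date is None:
        return False
    days = (d.death_date - d.delivery_date).days
    return 0 <= days <= window_days


def smmm_composite(d: Delivery) -> bool:
    """SMMM = at least one SMM indicator OR maternal death within the window."""
    return d.smm_indicator_count >= 1 or died_within_one_year(d)
```

A delivery with no SMM indicator and no linked death returns `False`; either component alone is enough to set the composite.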

COVID-19 status and severity

Eligible COVID-19 cases are those with a positive test for SARS-CoV-2 since 11 March 2020, during pregnancy. Data on symptom status (symptomatic, asymptomatic, unknown) are available, while severity will be defined using the WHO’s Clinical Progression Scale.42

Social context measures

The five dimensions of county-level residential segregation, including evenness, exposure, concentration, centralisation and clustering,43 44 will be determined for each racial/ethnic group (eg, black, Hispanic) using the ACS Census tract data.45 46 Each index will be calculated across census tracts within residential counties. Higher values indicate higher levels of segregation. We will create a group indicator for more segregated versus less segregated counties using the cutoffs for each dimension index.46 An additional hypersegregation index, defined as segregation scores >0.6 on at least four of the aforementioned dimensions, will be created to reflect the highest levels of segregation.46 Within-county racial/ethnic discrimination in SES will be calculated using the ACS county data,45 including black-white and Hispanic-white ratios of poverty, unemployment and home ownership rates.47–51 These measures will be linked to databases via maternal residence counties.
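As an illustration of how a tract-level index rolls up to a county score, the sketch below computes the dissimilarity index, a widely used evenness measure, and applies the hypersegregation rule stated above. The protocol does not name the specific index used for each dimension, so the choice of the dissimilarity index is an assumption, and the function names and tract counts are hypothetical.

```python
from typing import Dict, List, Tuple


def dissimilarity_index(tracts: List[Tuple[float, float]]) -> float:
    """Evenness of two groups across census tracts in one county.

    tracts: (minority_count, white_count) per tract.
    Returns 0.0 for a perfectly even distribution and 1.0 for complete separation.
    """
    total_minority = sum(m for m, _ in tracts)
    total_white = sum(w for _, w in tracts)
    if total_minority == 0 or total_white == 0:
        raise ValueError("both groups must be present in the county")
    return 0.5 * sum(
        abs(m / total_minority - w / total_white) for m, w in tracts
    )


def is_hypersegregated(
    scores: Dict[str, float], cutoff: float = 0.6, min_dimensions: int = 4
) -> bool:
    """Rule from the text: scores above 0.6 on at least four dimensions."""
    return sum(1 for s in scores.values() if s > cutoff) >= min_dimensions
```

For example, a county with two tracts split 75/25 and 25/75 between the two groups has a dissimilarity index of 0.5.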

Healthcare institution

Using the AHA annual survey data on hospital location,52 we will identify loss of hospital-based obstetric units using our published validated algorithm.53–55 An indicator for whether a hospital or a hospital’s obstetric service was closed in each year will be created. In turn, women’s access to hospital-based obstetric care within a 30-mile distance for years 2018–2020 will be determined using the ArcGIS fastest route network and classified as: (1) had access; (2) had no access and (3) experienced the loss of all hospital-based obstetric units.

County-level COVID-19 infections and social distancing policies

The CDC’s COVID-19 Case Surveillance data will be used to compute monthly cumulative rates of in-county residents who had been confirmed COVID-19 positive, hospitalised, admitted to an ICU or placed on mechanical ventilation/intubation as a result of COVID-19 disease. For each delivery date, the number of months elapsed since the county came under each of the following policy orders will be calculated: emergency declaration; closures of bars, restaurants and/or other non-essential businesses; stay at home order; and telehealth services expansion.
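The months-elapsed exposure can be derived from two dates per delivery. The sketch below is one possible convention rather than the study's definition: it counts whole calendar months and returns `None` when the policy was not yet in force at delivery, and both choices are assumptions.

```python
from datetime import date
from typing import Optional


def months_elapsed(policy_date: Optional[date], delivery_date: date) -> Optional[int]:
    """Whole months between a policy's enactment and the delivery date.

    Returns None if the county had no such policy or it started after delivery
    (an assumed convention; such records could also be coded as 0).
    """
    if policy_date is None or delivery_date < policy_date:
        return None
    months = (delivery_date.year - policy_date.year) * 12 + (
        delivery_date.month - policy_date.month
    )
    if delivery_date.day < policy_date.day:
        months -= 1  # the final month is not yet complete
    return months
```

With an illustrative enactment date of 13 March 2020, a delivery on 12 June 2020 gives 2 months and a delivery on 13 June 2020 gives 3.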

Statistical analyses

Impacts of the COVID-19 pandemic on racial/ethnic disparities in SMMM

We will examine the overall impacts of the COVID-19 pandemic on SMMM using the data from the S3C-P and N3C. We hypothesise that: (1) compared with prepandemic periods, SMMM has increased during the pandemic and racial/ethnic disparities have widened during the pandemic and (2) compared with pregnant women without COVID-19 infection, women with COVID-19 infection experienced higher proportions of SMMM, and racial/ethnic disparities in SMMM have amplified among COVID-19 infected women.

First, we will examine the distributions for all measures and clean the database (eg, outliers, data entry errors etc) using appropriate statistical techniques. Second, we will conduct preliminary analyses and examine descriptive statistics for outcome measures. Unadjusted and adjusted associations of SMMM with key variables and covariates will be assessed using appropriate statistical procedures (eg, tests of proportions, χ2 tests, analysis of variance).

Women who gave birth between 1 January 2018 and 10 March 2020 will be categorised as before the pandemic, while women who gave birth between 11 March 2020 and 31 December 2021 will be considered as during the pandemic. The prepandemic versus pandemic impact on SMMM will be modelled via logistic regression. First, the crude model with the pandemic indicator only will address SMMM change before and during the pandemic. Second, to investigate whether racial/ethnic disparities in SMMM have widened during the pandemic, the crude model will be further adjusted for race/ethnicity, interactions between race/ethnicity and the pandemic indicator, and months elapsed from 11 March 2020 to the delivery date. Then additional variables will be added, including individual-level characteristics (eg, age, SES proxy (ie, Medicaid/uninsured status), parity, marital status, underlying health conditions). Variable selection and goodness of model fit will be evaluated using the AIC, BIC and likelihood ratio test.
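One way to write the adjusted model described above is shown below. The notation is ours and illustrative only: Y_i is the SMMM indicator for woman i, P_i the pandemic indicator (1 for deliveries on or after 11 March 2020), R_i a vector of race/ethnicity dummies (white assumed as the reference group), M_i the months elapsed from 11 March 2020 to delivery (assumed to be 0 for prepandemic deliveries) and X_i the individual-level covariates.

```latex
\operatorname{logit}\,\Pr(Y_i = 1)
  = \beta_0
  + \beta_1 P_i
  + \boldsymbol{\beta}_2^{\top} R_i
  + \boldsymbol{\beta}_3^{\top} (P_i \times R_i)
  + \beta_4 M_i
  + \boldsymbol{\gamma}^{\top} X_i
```

Under this parameterisation, the exponentiated elements of β_3 are ratios of odds ratios: a value above 1 for a given group would indicate that its disparity relative to the reference group was larger during the pandemic than before it.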

We will also conduct analyses in women who delivered during the COVID-19 pandemic by comparing pregnant women with and without COVID-19 infection. We will first create the COVID-19 infection indicator and then perform analyses similar to those for the prepandemic versus pandemic impact. We will further adjust for county-level COVID-19 infections per capita and social distancing policies at the appropriate time points using logistic regressions with random effects accounting for correlations within counties.

Social contexts, racial/ethnic disparities in SMMM and distinct mediating pathways

We will study these issues using different databases and methods. First, we will examine the association between social contexts and changes in racial/ethnic disparities in SMMM before and during the pandemic. We hypothesise that racial/ethnic disparities in SMMM may have widened disproportionately in communities with higher racial/ethnic economic disparities (measured by black-white and Hispanic-white ratios of economic disadvantage) and in more segregated versus less segregated black or Hispanic counties (measured by residential segregation). We will conduct a parallel analysis between social contexts and SMMM using the data from S3C-P and N3C (figure 2). Similarly, we will examine the contributing roles of social contexts to racial/ethnic disparities in SMMM in the overall sample (prepandemic vs pandemic) and between COVID-19 positive vs COVID-19 negative women. The multilevel variables that will be investigated include individual-level characteristics, community-level characteristics (ZIP code accessibility of hospital obstetric units), and county-level characteristics (social contexts: residential segregation and racial discrimination in SES; COVID-19 infection per capita and social distancing policies). For exploratory analysis, the OR of SMMM comparing black with white women and the county-level social context measures, prepandemic versus pandemic, will be visualised on a spatiotemporal map using GIS. The summary statistics for SMMM with respect to race/ethnicity will be calculated according to individual-level and county-level characteristics.

We will model SMMM via multilevel hierarchical logistic regression. Clustering of women who reside in the same community or county will be accounted for via random effects, which will be modelled with a multivariate normal distribution to account for within-community and within-county correlations. We plan to use an incremental modelling strategy: (1) crude model (race/ethnicity and social context factors); (2) adjusting for individual-level factors and (3) additionally adjusting for community-level and county-level characteristics. For N3C data, we will further adjust for state or Census region. To further examine whether social contexts moderate racial/ethnic disparities in SMMM and whether these disparities vary between prepandemic and pandemic periods, we will include two-way and three-way interaction terms in the model (eg, pandemic period*social context*race). To examine the added impact of COVID-19 infections on SMMM, we will repeat the model by restricting it to all women who gave birth during the pandemic period (delivered on or after 11 March 2020) and including maternal COVID-19 severity status in the model. Models will be compared using the AIC and BIC criteria. In the modelling procedure, outliers, missingness, multicollinearity and non-linearity will be addressed accordingly, and sensitivity analysis will be conducted comparing models with or without the treatment of outliers or missing data. The magnitude and direction of racial/ethnic disparities will be assessed through ORs and their 95% CIs.
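For orientation, the sketch below shows how an OR and its 95% CI are obtained in the simplest case, a crude 2×2 table with a Wald (Woolf) interval on the log-odds scale. It is not the multilevel model described above, which would yield adjusted ORs from the fitted coefficients; the function name and the counts in the usage note are hypothetical.

```python
import math
from typing import Tuple


def odds_ratio_ci(
    a: int, b: int, c: int, d: int, z: float = 1.959964
) -> Tuple[float, float, float]:
    """Crude odds ratio with a Wald (Woolf) confidence interval.

    a: exposed with outcome      b: exposed without outcome
    c: unexposed with outcome    d: unexposed without outcome
    z: normal quantile (1.959964 gives a 95% interval)
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cells must be positive")
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper
```

Calling `odds_ratio_ci(20, 80, 10, 90)` on made-up counts gives an OR of 2.25 with an interval that narrowly includes 1, which shows why the direction of a disparity should be read together with its CI.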

Second, we will use a convergent parallel design to evaluate the underlying pathways between social contexts and racial/ethnic disparities in MMM via pandemic stressors, maternity care (prenatal and postpartum) and mental health condition. We hypothesise that social contexts might hinder maternity care and worsen mental health conditions among black and Hispanic women during the pandemic and exacerbate racial/ethnic disparities in MMM.

The quantitative analysis will be conducted using SC PRAMS data, which have unique data elements that are not available in S3C-P and N3C. The 2018–2021 SC PRAMS data will provide a more refined understanding of COVID-19 stressors, psychosocial stress and healthcare utilisation through the questionnaire, including pandemic stressors (financial, job loss, childcare, etc), individual mitigation practices, changes in prenatal and postpartum care, psychosocial stress, barriers to health services, intimate partner violence, prenatal and postpartum care utilisation, smoking, alcohol use, gestational weight gain and mental health. SC PRAMS asks respondents to assess how often they experienced depressive symptoms after delivery. Descriptive statistics will be used to examine pandemic-related changes in MMM and psychosocial and behavioural changes between prepandemic (delivered before 11 March 2020) and pandemic periods (11 March 2020 and after). A weighted hierarchical regression model will be applied to examine the association between social context and MMM. As previously mentioned, community and county levels will also be modelled in the regression. In contrast to the prior analyses above: (1) individual characteristics will mainly come from PRAMS or birth certificates; (2) the weight based on the complex survey design will be modelled and (3) individual reports of healthcare utilisation and mental health condition will be included.

We will conduct in-depth interviews among 40 postpartum women of colour (~20 African American; ~20 Hispanic) stratified by COVID-19 infection status and 10 maternal care providers (MCPs) who serve pregnant and postpartum women in Black and Latino communities. The inclusion criteria of postpartum women include: (1) ≥18 years old; (2) either African American or Hispanic; (3) have given birth in 2021 and (4) living in SC. We will purposely recruit postpartum women and MCPs through local OBGYN clinics and community health organisations that serve a larger proportion of low-income black and Hispanic women. We will train female interviewers of the same race as the interviewees to obtain trust from the postpartum women participants, and interviews in Spanish will be conducted as needed. Guided by the conceptual framework, the main topics of the postpartum women interviews will include: (1) their perceptions toward their healthcare providers and institutions for perinatal care; (2) experience with prenatal and postpartum care; (3) stressors in the COVID-19 pandemic; (4) challenges in healthcare seeking (eg, appointments, clinic visits), especially from structural factors (racism and discrimination) and (5) their needs/recommendations for future healthcare. The main topics of the MCPs include: (1) stressors and challenges of their clients; (2) clients’ mental health conditions; (3) impacts of COVID-19 on their care provision and (4) their views on health disparities caused by structural factors. The interviews will last 50 min and will be recorded with each participant’s consent. Audio recordings will be transcribed and coded using NVivo V.11.0. We will employ thematic analyses.56 The findings will complement the quantitative data in providing a comprehensive picture on how COVID-19 affects psychosocial well-being of the postpartum women of colour; offer in-depth interpretation and explanation of quantitative results; and explore the mediating pathways in which structural factors amplify existing disparities in maternal health in the context of the pandemic.

Machine learning-based predictive models

We will develop and evaluate machine learning-based predictive models to identify risk factors of SMMM and forecast progression of hypertensive disorder, pulmonary conditions, type 2 diabetes mellitus and cardiovascular diseases among postpartum women. The predictive models will synthesise individuals’ demographics, EHR, social contextual factors and community and healthcare system level data to make predictions of individuals’ clinical outcomes at key time points. Because data sources suggestive of these factors are variable and high-dimensional, and these factors are inherently interconnected over time,57 machine learning is a superior approach to predicting clinical outcomes and proactively detecting the associated risk factors for early intervention and treatment. Constructed models will demonstrate critical factors predictive of clinical outcomes and how these factors interact over time.

Supervised machine learning algorithms will be adopted for the predictive models. Using N3C, the algorithms will learn from input variables and predict SMMM and long-standing morbidities (eg, hypertension, pulmonary embolism, diabetes, cardiovascular diseases) over time. Input variables will include maternal characteristics, for example, sociodemographics, sociobehavioural data, social context variables, diagnoses, procedures, laboratory tests and medications. The prediction of SMMM and long-standing morbidities will take place at critical time points: <3, 6, 12 weeks, 6 and 12 months postpartum.58 We will develop deep learning algorithms because of their ability to integrate complex clinical data and social contexts from multiple sources with superior predictive performance, including: (1) convolutional neural network (CNN) for its ability to capture dynamic patterns among multilevel input variables; (2) recurrent neural network (RNN) with long short-term memory (LSTM) architecture for its ability to capture temporal patterns of clinical events (eg, onset of preinfection conditions, viral infections and clinical events marked with gestational weeks, and the date of childbirths) and (3) Deep Boltzmann Machine (DBM) for its interpretable scoring mechanism for risk prediction.59

We will use 10-fold cross-validation. Specifically, S3C and N3C data will each be randomly partitioned into 10 splits. In each of the 10 iterations, 7 splits of data will be randomly selected for model training, 2 splits will be used for internal validation (fine-tuning hyperparameters) and the 10th split will be used for testing. We will use F measure, precision, recall and, if the data are unbalanced, the area under the receiver operating characteristic curve to measure the predictive performance of the models. We will use a support vector machine (SVM) as the baseline algorithm against which to compare the performance of CNN, RNN (LSTM) and DBM. The best-performing model will be identified based on F measure.
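The partitioning scheme and the headline metrics can be illustrated with a minimal Python sketch. It is not the study's implementation: the function names, the fixed random seed and the rule that each split serves as the test split exactly once are assumptions made for illustration, and the protocol does not specify how the two validation splits are chosen beyond random selection.

```python
import random


def ten_fold_rotations(n_records, n_splits=10, seed=0):
    """Yield (train, validation, test) index lists, one triple per iteration.

    Assumption: each of the 10 splits is used once as the test split; of the
    remaining 9 splits, 2 are drawn at random for validation and 7 for training.
    """
    indices = list(range(n_records))
    random.Random(seed).shuffle(indices)
    # Partition the shuffled indices into n_splits roughly equal splits.
    splits = [indices[i::n_splits] for i in range(n_splits)]
    for test_id in range(n_splits):
        rest = [s for s in range(n_splits) if s != test_id]
        val_ids = set(random.Random(seed + 1 + test_id).sample(rest, 2))
        train = [i for s in rest if s not in val_ids for i in splits[s]]
        val = [i for s in rest if s in val_ids for i in splits[s]]
        test = list(splits[test_id])
        yield train, val, test


def precision_recall_f1(y_true, y_pred):
    """Precision, recall and F measure (F1) for binary labels coded 0/1."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Toy usage: 100 hypothetical records give 70/20/10 records per iteration.
for train, val, test in ten_fold_rotations(100):
    print(len(train), len(val), len(test))
```

In this sketch the test splits of the 10 iterations are disjoint and together cover every record once, so each record contributes to exactly one held-out evaluation.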

We will rank input variables and/or clusters of input variables by calculating the importance scores57 (eg, mutual information, SVM-based recursive linear elimination). Two content coinvestigators will independently review the ranked results and identify clinical/social risk factors. Disagreements between the two reviewers will be resolved by panel discussion. The machine learning models developed for predicting long-standing morbidities will be used to identify important risk factors prenatally, which can inform early intervention, treatment and community-wide interventions.
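As an illustration of one of the importance scores named above, the following Python sketch ranks discrete input variables by their mutual information with a binary outcome. The variable names and toy values are hypothetical, and the sketch handles only categorical inputs; continuous variables would need to be binned or scored with a different estimator.

```python
import math
from collections import Counter


def mutual_information(xs, ys):
    """Mutual information (in bits) between two discrete sequences."""
    n = len(xs)
    count_x = Counter(xs)
    count_y = Counter(ys)
    count_xy = Counter(zip(xs, ys))
    mi = 0.0
    for (x, y), c in count_xy.items():
        # p(x, y) * log2( p(x, y) / (p(x) * p(y)) ), written with raw counts.
        mi += (c / n) * math.log2(c * n / (count_x[x] * count_y[y]))
    return mi


def rank_features(features, outcome):
    """Return (name, score) pairs sorted from most to least informative."""
    scores = {name: mutual_information(values, outcome)
              for name, values in features.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Toy data (hypothetical): four records, one binary outcome, two inputs.
outcome = [0, 0, 1, 1]
features = {
    "input_a": [0, 0, 1, 1],  # tracks the outcome exactly
    "input_b": [0, 1, 0, 1],  # unrelated to the outcome
}
for name, score in rank_features(features, outcome):
    print(name, round(score, 3))
```

A ranking of this kind only orders the variables; deciding which of them represent meaningful clinical or social risk factors remains the task of the reviewers described above.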

Power and sample size calculation

We estimate that there were 200 000 women who gave birth in S3C and 270 000 pregnant women in N3C during our study period. The primary outcome of interest is SMM and the main exposures of interest are race (white vs black) and period (pre-COVID-19 vs peri-COVID-19). We assume that there are 64 000 (32%) black and 114 000 (57%) white women in S3C60 and 38 070 (14.1%) black and 160 380 (59.4%) white women in N3C.61 We also assume that, with pre-COVID-19 and peri-COVID-19 periods of the same length, the prevalence of pregnancy will be similar in the two periods (50%). For all aims, we consider logistic regression with mixed effects. Assuming a 1.5%–2.5% incidence of SMM and a 20%–50% increase due to COVID-19 or racial disparities, we conducted a power analysis based on logistic regression using SAS and concluded that a sample size of 200 000 will be adequate to reach power of around 90%. Figure 3 illustrates the relationship between power and OR for n=200 000, a significance level of 0.05 and a variable of interest distributed in a 10:90 ratio. This will be the basic model considered, and it indicates strong power for the prediction model.
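To give a feel for how power depends on the OR in this setting, the sketch below uses a normal approximation for comparing two proportions. This is a simplification and not the SAS logistic regression procedure used for the study's calculation: it ignores mixed effects and covariates, and the function names and the choice of approximation are assumptions. The inputs (n=200 000, a 10:90 exposure ratio, a 1.5% baseline incidence, significance at 0.05) are taken from the text.

```python
from statistics import NormalDist


def risk_from_odds_ratio(p0, odds_ratio):
    """Convert a baseline risk and an odds ratio into the exposed-group risk."""
    odds1 = odds_ratio * p0 / (1 - p0)
    return odds1 / (1 + odds1)


def approx_power(n_total, exposed_fraction, p0, odds_ratio, alpha=0.05):
    """Approximate power of a two-sided test comparing two proportions.

    Assumption: a simple unadjusted comparison of exposed vs unexposed groups,
    used here as a rough stand-in for a logistic regression with one binary
    exposure. Only the tail in the direction of the effect is counted.
    """
    n1 = n_total * exposed_fraction
    n0 = n_total - n1
    p1 = risk_from_odds_ratio(p0, odds_ratio)
    p_bar = (n1 * p1 + n0 * p0) / n_total
    se_null = (p_bar * (1 - p_bar) * (1 / n1 + 1 / n0)) ** 0.5
    se_alt = (p1 * (1 - p1) / n1 + p0 * (1 - p0) / n0) ** 0.5
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    return NormalDist().cdf((abs(p1 - p0) - z_crit * se_null) / se_alt)


# Power across the 20%-50% increase range at the lower-bound incidence of 1.5%.
for odds_ratio in (1.2, 1.3, 1.4, 1.5):
    power = approx_power(200_000, 0.10, 0.015, odds_ratio)
    print(f"OR={odds_ratio}: approximate power={power:.3f}")
```

Because the approximation omits the model features of the actual analysis, its values should be read as indicative of the trend across ORs only and not as a reproduction of Figure 3.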

Figure 3

Estimated power according to prevalence of outcomes and ORs.

Current status and anticipated timeline

As of April 2022, we have received the linked core databases for the S3C cohort for the period of January 2018–June 2021 and full datasets will be available by the Fall of 2022. Our team is also actively constructing N3C analytical data of women with childbirths during January 2018–December 2021 for statistical analysis and machine learning. Furthermore, our team is conducting in-depth interviews with our targeted populations according to our protocols described here. We anticipate completing our main analyses in May 2023.

Patient and public involvement

No patients were involved in the design, conduct or reporting of our research. We will actively reach out to patients and the public in the dissemination of our findings.

Ethics and dissemination

The study was approved by institutional review boards at the University of South Carolina (Pro00115169) and the South Carolina Department of Health and Environmental Control (DHEC IRB.21-030). Informed consent will be obtained from participants enrolled in the in-depth interviews. Furthermore, the NIH’s N3C data access committee approved the data use request for this project (RP-2B9622). Study findings will be shared with key stakeholders including patients, presented at academic conferences and published in peer-reviewed journals.


The COVID-19 pandemic has led to unprecedented societal disruptions to individuals, communities, healthcare institutions and society. Empirical data on the scope of possible widening racial/ethnic disparities in SMMM during the COVID-19 pandemic and how historical structural racism and discrimination of all types have impacted women of colour disproportionately are sparse. This study will be among the first efforts to investigate whether the COVID-19 pandemic, structural racism and racial discrimination (exposures with broad scale and reach) have contributed to the racial/ethnic disparities in SMMM in the context of the COVID-19 pandemic. Second, this proposed study employs a state-of-the-art design (ie, a convergent parallel design) to comprehensively examine the impacts of structural racism and discrimination on maternal health and the complex pathways between multilevel determinants. This design has the advantage of allowing us to weigh both quantitative and qualitative methods equally and interpret the results together.62 Third, the proposed study will innovatively use machine learning models to predict SMMM and chronic morbidities up to 1 year after delivery in the context of the COVID-19 pandemic. Fourth, we propose a large-scale population-based cohort study concurrently for both SC and the USA, which will innovatively integrate COVID-19-related clinical, surveillance, EHR and geospatial data at community, healthcare institution and system/policy levels. These newly integrated data sources will allow us to examine multilevel determinants of maternal health during the pandemic and advance the investigation of racial/ethnic disparities in long-standing complications postpandemic. In brief, this research represents a significant and innovative contribution to the research on the unacceptable racial/ethnic disparities in SMMM during pregnancy and postpartum in the context of COVID-19.
By focusing on social contextual factors (eg, structural racism), we seek to identify ways in which the largest number of women may be impacted by targeted programmes and policies aiming to alter the context in which these morbidities and mortality occur.

This study also has some limitations. First, it is possible that the effects of county-level social contexts in our data may not be significant. If that happens, we will calculate ZIP-code level social contexts. By assessing racial segregation, the spatial distribution of economically disadvantaged communities within a residence county and within-community racial discrimination, this study will provide evidence on the associations between distinct social contexts and maternal health disparities. Second, considering that some women may move during pregnancy, the static residential social contexts might not reflect their long-term exposures to neighbourhood structural racism. Furthermore, in the case of inferior F measures (<0.7) for the machine learning predictive models, we will apply feature selection algorithms and association rules in model training to maximise performance.

In conclusion, the rising SMMM rate and persistent racial/ethnic disparities should trigger public health concerns, not only due to the immediate burden faced by vulnerable women, but also due to potentially lasting effects on women’s health over a life course or along family lines across generations.63 This study will investigate racial/ethnic disparities in SMMM, the contributing roles and mediating pathways of social contexts (eg, structural racism, racial discrimination) and the long-standing health consequences of the pandemic by studying the distributions of COVID-19 cases and multilevel determinants of maternal health in a racially, socioeconomically and geographically diverse population of US pregnant women. A rigorous examination of social contexts and racial/ethnic disparities in SMMM during the pandemic will contribute to the identification of factors with a broad scale and reach for programmatic and policy interventions to alter the context in which morbidity and mortality occur. Our findings will inform continuing efforts to reverse the rising trends of SMMM in the USA.

Ethics statements

Patient consent for publication


The authors thank the SC Department of Health and Environmental Control, the SC Revenue and Fiscal Affairs Office, and other SC agencies for contributing the data in South Carolina. The authors also want to thank the organisations ( and scientists who have contributed to the ongoing development of the N3C database (



  • Twitter @JihongLiu88, @peiyinhung

  • Contributors JL conceptualised and designed the study and wrote the first draft and PH, CL, JZ, SQ and XL participated in writing sections of the original proposal. All authors critically reviewed and edited the manuscript. JL, PH, CL and BO acquired the data and completed IRB approvals. JL, PH, CL, JZ, BAC, NH, BO and XL participated in quantitative data analysis and data interpretation. JL, SQ and MET participated in qualitative data collection, data analysis and data interpretation. JL and XL secured the funding.

  • Funding Research reported in this publication was supported by the National Institute of Allergy And Infectious Diseases and Office of the Director of the National Institutes of Health under Award Number 3R01AI127203-5S2 for Implementing a Maternal health and PRegnancy Outcomes Vision for Everyone (IMPROVE). XL and JL are the MPIs for this study.

  • Competing interests None declared.

  • Patient and public involvement Patients and/or the public were not involved in the design, or conduct, or reporting, or dissemination plans of this research.

  • Provenance and peer review Not commissioned; externally peer reviewed.