
This article has a correction. Please see:


Development and validation of a prediction model for gestational hypertension in a Ghanaian cohort
  1. Edward Antwi1,2,
  2. Rolf H H Groenwold1,
  3. Joyce L Browne1,
  4. Arie Franx3,
  5. Irene A Agyepong2,
  6. Kwadwo A Koram5,
  7. Kerstin Klipstein-Grobusch1,6,
  8. Diederick E Grobbee1
  1. 1Julius Global Health, Julius Center for Health Sciences and Primary Care, University Medical Center Utrecht, Utrecht, The Netherlands
  2. 2Ghana Health Service, Accra, Ghana
  3. 3Department of Obstetrics and Gynecology, University Medical Center Utrecht, Utrecht, The Netherlands
  4. 4Noguchi Memorial Institute for Medical Research, College of Health Sciences, University of Ghana, Legon, Accra, Ghana
  5. 5Division of Epidemiology & Biostatistics, School of Public Health, University of the Witwatersrand, Johannesburg, South Africa
  6. 6Division of Epidemiology & Biostatistics, School of Public Health, University of the Witwatersrand, Johannesburg, South Africa
  1. Correspondence to Dr Edward Antwi; ed_antwi{at}


Objective To develop and validate a prediction model for identifying women at increased risk of developing gestational hypertension (GH) in Ghana.

Design A prospective study. We used frequencies for descriptive analysis, χ2 test for associations and logistic regression to derive the prediction model. Discrimination was estimated by the c-statistic. Calibration was assessed by calibration plot of actual versus predicted probability.

Setting Primary care antenatal clinics in Ghana.

Participants 2529 pregnant women in the development cohort and 647 pregnant women in the validation cohort. Inclusion criterion was women without chronic hypertension.

Primary outcome Gestational hypertension.

Results Predictors of GH were diastolic blood pressure, family history of hypertension in parents, history of GH in a previous pregnancy, parity, height and weight. The c-statistic of the model was 0.70 (95% CI 0.67 to 0.74) in the development cohort and 0.68 (95% CI 0.60 to 0.77) in the validation cohort. Calibration was good in both cohorts. At the high-risk cut-off, the negative predictive value was 92.0% in the development cohort and 94.0% in the validation cohort.

Conclusions The prediction model showed adequate performance after validation in an independent cohort and can be used to classify women into high, moderate or low risk of developing GH. It contributes to efforts to provide clinical decision-making support to improve maternal health and birth outcomes.

  • predictors
  • prediction model
  • hypertensive disorders of pregnancy
  • risk scores
  • gestational hypertension

This is an Open Access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited and the use is non-commercial. See:


Strengths and limitations of this study

  • Use of prospectively collected data from antenatal period through to delivery.

  • Data were collected in a primary care setting and reflect routine practice.

  • The prediction model was validated in a different cohort of pregnant women.

  • Limitation of using only maternal clinical characteristics to predict gestational hypertension (GH).

  • The study had GH as the only outcome and did not include preeclampsia or eclampsia.


Hypertensive disorders of pregnancy (HDPs), which include gestational hypertension (GH), preeclampsia, eclampsia and the haemolysis, elevated liver enzymes and low platelets (HELLP) syndrome, are the third leading cause of maternal deaths globally,1 with most of these deaths occurring in low-income and middle-income countries (LMICs). The International Society for the Study of Hypertension in Pregnancy (ISSHP) classifies HDPs as chronic hypertension, gestational hypertension, preeclampsia (de novo or superimposed on chronic hypertension) and white coat hypertension.2 HDPs are the leading cause of maternal death in Latin America and the Caribbean, accounting for 25.7% of these deaths; in Africa they rank third (9.1%).3 In Ghana, 14% of all female deaths are pregnancy related, with HDPs being the third leading cause of maternal deaths (9%) after haemorrhage (22%) and induced abortion (11%).4

The underlying causes of HDPs are not fully known;5 however, accurately identifying women at increased risk of HDPs could lead to better antenatal care (ANC) and fewer complications from the condition.

Clinical prediction models estimate the probability of individuals having certain health conditions or obtaining defined health outcomes.6–9 They combine two or more items of patient data to predict a clinical outcome and should be externally validated before being applied in clinical practice.6–12 The main approaches to predicting the occurrence of GH include the use of maternal clinical characteristics, uterine artery Doppler and biomarkers.13–15 Although a number of prediction models for HDPs, mainly preeclampsia and eclampsia, have been developed in high-income countries, they may not be suitable for LMICs because of differences in the availability and cost of diagnostic tools.16

The aim of this study was to develop and externally validate a contextually appropriate and low-cost clinical prediction model for GH, based on maternal characteristics obtained at the first ANC visit, for use in primary care settings in Ghana and potentially in other LMICs.


Study design and population

Development cohort

The prediction model was developed in a prospective cohort of 2529 pregnant women attending ANC in a primary care setting at six hospitals in the Greater Accra region of Ghana between February and May 2010. Pregnant women without chronic hypertension were eligible. Women were excluded if they had a history of hypertension or had hypertension before 20 weeks of gestation according to blood pressure (BP) measurements. After potential participants had given written informed consent, they were enrolled and followed up at ANC visits until they delivered. Ethical approval for the study was granted by the Ethical Review Committee of the Ghana Health Service (Ethical Clearance ID number GHS-ERC 02/1/10).

The sample size estimation was based on the incidence of HDPs in the Ghanaian population and on the principle of 10 outcome events per variable.17 The 2007 Ghana Maternal Health Survey4 had estimated that 9% of all maternal deaths were due to HDPs. Using an estimated incidence of GH of 10% in the study population and 10 candidate predictors, we aimed to enrol 2500 women but actually enrolled 2529.
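The events-per-variable reasoning above can be sketched as a short calculation. This is an illustrative sketch only: the function name and signature are hypothetical, and it encodes just the rule of thumb as stated (10 events per predictor, 10 predictors, 10% incidence). On those inputs the rule gives a minimum of 1000 women, so the reported target of 2500 exceeds it; the text does not say how that margin was chosen.

```python
import math
from fractions import Fraction


def min_sample_size(n_predictors, events_per_variable, incidence):
    """Smallest cohort expected to yield enough outcome events.

    `incidence` is passed as a string or Fraction (e.g. "0.10") so that
    the division is exact rather than subject to floating-point error.
    """
    incidence = Fraction(incidence)
    if not 0 < incidence <= 1:
        raise ValueError("incidence must be in (0, 1]")
    required_events = n_predictors * events_per_variable
    return math.ceil(required_events / incidence)


# Inputs taken from the text: 10 predictors, 10 events each, 10% incidence.
minimum = min_sample_size(n_predictors=10, events_per_variable=10, incidence="0.10")
print(minimum)          # minimum cohort size under the rule of thumb
print(2529 >= minimum)  # the enrolled cohort comfortably meets it
```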

Data were obtained from the women's medical records, as measured by the midwives during routine ANC. The midwives had been given standardised training in data collection. Candidate predictors were selected based on a review of the literature on variables known to be associated with GH.18–22 Information on the following predictors was obtained during the first antenatal clinic visit: maternal age, diabetes mellitus (confirmed diagnosis of diabetes mellitus), family history of hypertension (confirmed diagnosis of hypertension in parents or siblings), family history of diabetes (confirmed diagnosis of diabetes in parents or siblings) and family history of multiple pregnancies. Blood pressure (measured with a mercury sphygmomanometer), height (measured in centimetres with a stadiometer), weight (measured in kilogrammes with a bathroom scale) and urine protein (defined as 2+ or more on urine dipstick) were obtained during the first and subsequent antenatal clinic visits. Pregnancy outcomes were obtained from the hospital's maternity register.

Validation cohort

For external validation of the derived prediction model, we used data from 647 adult pregnant women recruited as part of a prospective cohort study conducted between July 2012 and March 2014 at Ridge Regional Hospital and Maamobi General Hospital in Accra. These hospitals provide primary ANC similar to that received by the women in the derivation study. Women were included if they were <17 weeks pregnant, 18 years or older and had no pre-existing hypertension. Pregnant women were included in the study after they had given written informed consent and were interviewed by trained research assistants using a structured questionnaire on sociodemographic characteristics and obstetric history. Weight, height, BP and urine protein at the initial and subsequent ANC visits were obtained from the maternal health record books. Pregnancy outcomes were obtained from the hospital's maternity register. Data were entered by trained data clerks using EpiDataEntry (EpiData Association, Odense, Denmark, 2010), validated by double entry, cleaned and checked for missing data.


The outcome, GH, was defined as a systolic BP of 140 mm Hg or more and/or a diastolic BP of 90 mm Hg or more on at least two separate occasions, present for the first time after 20 weeks of pregnancy.23 In both cohorts BP measurements were taken by trained midwives using a mercury sphygmomanometer. The appropriate adult-sized cuff was placed on the bare left upper arm with the woman comfortably seated, her back supported and her legs uncrossed. The arm was at the level of the heart and neither the patient nor the observer talked during the measurement. Korotkoff phase V sounds were used.24 Two readings were taken at an interval of 5 min and the average was used to represent the woman's BP. The sphygmomanometers at the clinics are calibrated periodically to ensure accurate readings.

The gestational age at which GH was diagnosed is available for both cohorts.

Data analysis

The mean and SD of continuous predictors were calculated for women who developed GH and those who did not. Means were compared using the independent t-test; percentages for categorical data were compared using the χ2 test. Missing data were imputed by multiple imputation using the ‘Multivariate Imputation by Chained Equations’ (MICE) procedure in R.25 Missing values were imputed 10 times and Rubin's rules26 were applied to pool results over the 10 imputed data sets. Predictors that were related to GH at a predetermined p value of 0.20 or less were selected and entered into a multivariable logistic regression model. Stepwise backward selection using p<0.20 was used to derive the model, which was internally validated by bootstrapping. Parity was forced into the model, while systolic blood pressure dropped out because of collinearity with diastolic blood pressure. The shrinkage factor obtained from bootstrapping was used to adjust the regression coefficients, thus correcting for model overfitting.
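The pooling step can be illustrated with a minimal sketch of Rubin's rules for a single regression coefficient. The study itself used R; the Python function below is a hypothetical stand-in, and the input values are invented for illustration rather than taken from the study. The pooled estimate is the mean across imputations, and the pooled variance adds the average within-imputation variance to the between-imputation variance inflated by (1 + 1/m).

```python
import math
from statistics import mean, variance


def pool_rubin(estimates, std_errors):
    """Pool one coefficient across m imputed data sets (Rubin's rules).

    Returns (pooled_estimate, pooled_standard_error).
    """
    m = len(estimates)
    if m < 2 or m != len(std_errors):
        raise ValueError("need matching estimates and SEs from >= 2 imputations")
    pooled = mean(estimates)
    within = mean([se ** 2 for se in std_errors])  # average sampling variance
    between = variance(estimates)                  # sample variance, divisor m - 1
    total = within + (1 + 1 / m) * between
    return pooled, math.sqrt(total)


# Hypothetical coefficient estimates from three imputed data sets.
estimate, se = pool_rubin([0.040, 0.042, 0.041], [0.010, 0.010, 0.010])
print(round(estimate, 3), round(se, 4))
```

The pooled standard error is never smaller than the average within-imputation standard error, which is how the uncertainty due to missing data is carried into the final model.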

The performance of the model in the development and validation cohorts was assessed by discrimination and calibration. Discrimination is the ability of the model to distinguish between women who develop GH and those who do not and was assessed using the c-statistic. The c-statistic, or area under the receiver operating characteristic curve (AUC), ranges from 0.5 (no discrimination) to 1.0 (perfect discrimination).12 Calibration of the model was assessed using a calibration plot of actual versus predicted probabilities.
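To make the definition of discrimination concrete, the c-statistic can be computed as the proportion of all (case, non-case) pairs in which the woman who developed GH received the higher predicted risk, with ties counted as one half. The sketch below is a plain pairwise implementation for illustration; the function name and the toy data are assumptions, not the study's code or data.

```python
def c_statistic(outcomes, predicted):
    """Concordance statistic (equivalent to the AUC) by pairwise comparison.

    outcomes  : iterable of 0/1 values (1 = developed the outcome)
    predicted : iterable of predicted probabilities, same order
    """
    pairs = list(zip(outcomes, predicted))
    cases = [p for y, p in pairs if y == 1]
    non_cases = [p for y, p in pairs if y == 0]
    if not cases or not non_cases:
        raise ValueError("need at least one case and one non-case")
    concordant = 0.0
    for p_case in cases:
        for p_non in non_cases:
            if p_case > p_non:
                concordant += 1.0
            elif p_case == p_non:
                concordant += 0.5  # ties count as half-concordant
    return concordant / (len(cases) * len(non_cases))


# Toy data: three of the four case/non-case pairs are correctly ordered.
print(c_statistic([1, 0, 1, 0], [0.8, 0.2, 0.3, 0.4]))
```

The double loop is quadratic in cohort size, which is acceptable for a few thousand women; rank-based formulas give the same value more efficiently.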

For application of the model, a score chart was derived using the regression coefficients of the predictors. The total score of each woman was related to her risk of developing GH. Cut-off points based on a total score of <1, between 2 and 6 and ≥7 were used to classify women into low, moderate and high risk of GH, respectively. The sensitivity, specificity, negative and positive predictive values of the cut-off points were calculated.
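The accuracy measures named above all follow from a 2×2 table that cross-classifies a cut-off ("at or above" versus "below") against the observed outcome. The sketch below shows those definitions, together with the likelihood ratios reported later; the class and function names are hypothetical and the counts in the usage example are invented, not taken from the study's tables.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CutoffPerformance:
    sensitivity: float
    specificity: float
    ppv: float
    npv: float
    lr_positive: float
    lr_negative: float


def evaluate_cutoff(tp, fp, fn, tn):
    """Accuracy measures for one cut-off from 2x2 table counts.

    tp/fp: women at or above the cut-off with/without the outcome
    fn/tn: women below the cut-off with/without the outcome
    """
    if min(tp + fn, tn + fp, tp + fp, tn + fn) == 0 or fp == 0 or tn == 0:
        raise ValueError("a required table margin or cell is zero")
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return CutoffPerformance(
        sensitivity=sensitivity,
        specificity=specificity,
        ppv=tp / (tp + fp),
        npv=tn / (tn + fn),
        lr_positive=sensitivity / (1 - specificity),
        lr_negative=(1 - sensitivity) / specificity,
    )


# Hypothetical counts for illustration only.
result = evaluate_cutoff(tp=30, fp=70, fn=20, tn=380)
print(result)
```

Unlike sensitivity and specificity, the predictive values depend on how common GH is in the cohort, so they are not directly comparable between cohorts with different incidence.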

Reporting and analysis of study results was conducted according to the TRIPOD checklist.27 Statistical data analysis was performed by use of SPSS software (V.20.0, IBM SPSS Statistics, Chicago, Illinois, USA) and R statistical software (V.3.1.0 (2014–04–10)).


Table 1 describes the baseline characteristics of the development and validation cohorts at the first ANC visit.

Table 1

Characteristics of the development and validation cohort at first antenatal visit stratified by GH

Development cohort

Women with and without GH differed with respect to age (28.9 years (SD 5.9) vs 28.0 years (SD 5.8), p=0.01). There was no difference in mean height between women who developed GH and those without GH (159.9 cm (SD 6.7) vs 160.6 cm (SD 7.4), p=0.19). The mean weight differed between women with and without GH (73.3 kg (SD 19.0) vs 66.2 kg (SD 13.2), p<0.001). The mean diastolic BP also differed between women who developed GH and those who did not (71.9 mm Hg (SD 11.6) vs 66.2 mm Hg (SD 9.1), p<0.001).

About 27% of women with GH had a parent with hypertension compared to 17.2% of women without GH (p<0.001). Furthermore 15.3% of women with GH had a history of GH in a previous pregnancy compared to 1.0% of women without GH (p<0.001).

Validation cohort

The mean age of women who developed GH was higher than that of women who did not (29.8 years (SD 5.6) vs 28.2 years (SD 5.0), p=0.053). There was no difference in mean height between women with and without GH (161.4 cm (SD 9.5) vs 161.1 cm (SD 7.5), p=0.75). However, there was a difference in the mean weight of women with and without GH (74.0 kg (SD 14.8) vs 65.9 kg (SD 7.5), p<0.001). The mean diastolic BP differed between women who developed GH and those who did not (75.2 mm Hg (SD 12.6) vs 69.1 mm Hg (SD 10.5), p<0.001), as did mean systolic BP (115.6 mm Hg (SD 14.5) vs 111.6 mm Hg (SD 12.2), p=0.046).

Of the women who developed GH, 29.2% reported a family history of hypertension in parents compared to 3.6% of those who did not (p=0.02). The percentage of women with a previous history of GH did not materially differ between those who developed GH and those who did not.

Table 2 shows the adjusted ORs of predictors of GH in the development cohort.

Table 2

Adjusted OR of predictors of GH at the first antenatal care visit in a cohort of 2529 pregnant women

The predictors in the model are maternal height, weight, diastolic BP, history of hypertension in the parents, previous history of GH in the mother and parity. The c-statistic of the model was 0.70 (95% CI 0.67 to 0.74).

The final prediction model was:

Logit(GH) = −1.53 − 0.031×Height + 0.38×Hypertension in parents + 2.26×Previous GH + 0.024×Weight + 0.041×Diastolic BP − 0.10×Parity.
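As a reading aid, the equation can be turned into a predicted probability by applying the inverse logit, p = 1/(1 + e^(−logit)). The sketch below uses the coefficients exactly as printed above. Several points are assumptions rather than statements from the article: height is taken to be in centimetres, weight in kilogrammes and diastolic BP in mm Hg (the units used at data collection), the two history variables are coded 1 for yes and 0 for no, and parity is entered as a count. The function name and the example woman are hypothetical, and the sketch is not a clinical tool.

```python
import math


def predicted_gh_risk(height_cm, weight_kg, diastolic_bp, parity,
                      hypertension_in_parents, previous_gh):
    """Predicted probability of GH from the published logit equation.

    Units and variable coding are assumptions (see the accompanying text).
    """
    logit = (
        -1.53
        - 0.031 * height_cm
        + 0.38 * int(hypertension_in_parents)
        + 2.26 * int(previous_gh)
        + 0.024 * weight_kg
        + 0.041 * diastolic_bp
        - 0.10 * parity
    )
    return 1.0 / (1.0 + math.exp(-logit))  # inverse logit


# Hypothetical woman: 160 cm, 66 kg, diastolic BP 66 mm Hg, parity 1.
baseline = predicted_gh_risk(160, 66, 66, 1, False, False)
with_history = predicted_gh_risk(160, 66, 66, 1, False, True)
print(round(baseline, 3), round(with_history, 3))
```

For this hypothetical woman the predicted risk is roughly 9%, rising to about 49% if she had GH in a previous pregnancy, which illustrates how heavily that predictor weighs in the model.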

The c-statistic after external validation was 0.68 (95% CI 0.60 to 0.77).

Figure 1 shows the calibration plot for the development cohort.

Figure 1

Calibration plot in development cohort.

The dotted 45° line denotes perfect agreement between predicted risk (x-axis) and observed risk (y-axis). The smooth line approximates the agreement between the predicted and observed risks across subgroups of pregnant women ranked by increasing predicted risk.

The calibration plot shows a reasonable fit for probabilities between 0.1 and 0.16 where most of the events occur. Figure 2 shows the calibration plot in the validation cohort. Again the plot shows a good fit for probabilities between 0.04 and 0.16, where most of the events occur.

Figure 2

Calibration plot in validation cohort.

Table 3 presents the score chart for obtaining the total risk score of each woman.

Table 3

Score chart for the risk of developing GH in a cohort of pregnant women from Ghana

Table 4 shows the categorisation of the development cohort into low, moderate and high risk. Three hundred and one women were classified as being at high risk of developing GH and 82 of them eventually developed GH giving a positive predictive value (PPV) of 27.2% and a negative predictive value of 92.0%. The positive likelihood ratio was 1.22 for low risk and 3.24 for moderate risk while the negative likelihood ratio was 0.32 for low risk and 0.76 for moderate risk.

Table 4

Categorisation of development cohort into low, moderate and high risk

Table 5 presents information on the categorisation of the validation cohort into low, moderate and high risk of GH. Twelve women were classified as high risk and 4 of them eventually developed GH, giving a PPV of 33.3% and a negative predictive value of 94.0%. The positive likelihood ratio was 1.15 for low risk and 7.31 for moderate risk while the negative likelihood ratio was 0.50 for low risk and 0.92 for moderate risk. Table 6 shows the number of observations and missing values (with percentage missing) for the development and validation cohorts. Table 7 compares characteristics of women in the development and validation cohorts before and after imputation.

Table 5

Categorisation of the validation cohort into low, moderate and high risk

Table 6

Number of observations and missing values (with percentage missing) for the development and validation cohorts

Table 7

Comparison of characteristics of women in the development and validation cohorts before and after imputation


We developed and externally validated a simple prediction model for GH in two different cohorts of pregnant women attending ANC clinics in similar settings, in line with the general recommendation that prediction models should be externally validated before being applied in clinical practice.6–12 The c-statistic of the model in the original cohort (0.70 (95% CI 0.67 to 0.74)) was only slightly reduced (0.68 (95% CI 0.60 to 0.77)) after external validation, consistent with findings from other studies.10,28–30 Nijdam et al31 in the Netherlands derived a prediction model for identifying nulliparous women who developed hypertension before 36 weeks of gestation using systolic BP, diastolic BP and weight. The AUC of their original model, 0.78 (95% CI 0.75 to 0.82), fell to 0.75 (95% CI 0.68 to 0.81) after external validation. The small decrease in c-statistic in our study suggests that the model, which is based on data routinely collected as part of ANC, retains its discrimination in a new cohort and can be applied to pregnant women in the study setting.

Most prediction models for HDPs, such as the SCOPE model,16 have focused on preeclampsia and eclampsia, which are more severe forms of the disorder. However, milder forms such as GH are also associated with less favourable pregnancy outcomes. Given that GH can be managed to prevent progression to more severe forms, a model that identifies women at risk is useful.

A limitation of our study was that the prediction model used clinical characteristics only, excluding biomarkers and uterine artery Doppler. This is because these parameters are not routinely used in ANC in the Ghanaian setting. Both approaches are expensive and the equipment for analysing biomarkers is generally not available in many low-resource settings. However, future research could assess the added value of these measurements, as a recent systematic review of first trimester prediction of preeclampsia showed that a combination of uterine artery Doppler, maternal characteristics and two or more biomarkers yielded detection rates of 38–100%.14 The best rates were reported for the combination of Inhibin A, PLGF, PAPP-A, uterine artery Doppler and maternal characteristics.14 The difficulty of predicting GH using only maternal clinical characteristics has been pointed out;33 however, the feasibility of applying models that include biomarkers or Doppler in low-resource settings currently remains limited owing to constraints in the availability of diagnostic equipment and the high cost of the tests, which are beyond the means of most people who require them. Thus, despite the increased predictive value of adding biomarkers, it remains important to derive reasonably accurate prediction models that use variables that are routinely and easily obtained in low-resource settings.

In the development cohort, 301 (11.9%) women were classified as being at high risk of developing GH. Eighty-two of them eventually developed GH, giving a PPV of 27.2% and an NPV of 92%. In the validation cohort, 12 (1.9%) women were classified as being at high risk of GH and 4 of them developed the condition. The PPV was 33.3% and the NPV 94%. Classifying women into different risk categories allows for closer monitoring of pregnant women at high risk. This could include more frequent ANC visits or referral for specialist care.
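The percentages in this paragraph can be reproduced directly from the counts it reports. The short check below uses only those counts (301 of 2529 and 82 of 301 in the development cohort; 12 of 647 and 4 of 12 in the validation cohort); the helper name is hypothetical. The NPVs cannot be recomputed in the same way because the number of women below the cut-off who developed GH is not given in the text.

```python
def percentage(numerator, denominator):
    """Percentage rounded to one decimal place, as reported in the text."""
    return round(100 * numerator / denominator, 1)


# Development cohort: share classified high risk, and PPV among them.
dev_high_risk_share = percentage(301, 2529)
dev_ppv = percentage(82, 301)

# Validation cohort: same two quantities.
val_high_risk_share = percentage(12, 647)
val_ppv = percentage(4, 12)

print(dev_high_risk_share, dev_ppv)
print(val_high_risk_share, val_ppv)
```

With only 12 women in the high-risk group of the validation cohort, a change of one outcome event moves the PPV by more than 8 percentage points, so the validation PPV should be read as a rough estimate.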

Given that the addition of biomarkers in the screening of women could enhance the identification of those at high risk of GH, future research should explore the added value of biomarkers in the early identification of pregnant women at increased risk of HDPs in LMICs. Such studies should be accompanied by comparative cost-effectiveness analyses of models based on routine data only and models that combine routine data and biomarkers, to provide essential health technology assessment information for future decision-making. In the interim, however, although the modest PPVs in the development and validation cohorts show the limitation and difficulty of predicting GH using only demographic and clinical characteristics, the model has the potential to identify pregnant women at increased risk of GH for subsequent care and monitoring. Its further validation and use are worth serious consideration in low-resource settings.


We developed and validated a prediction model for GH at the first ANC visit using maternal data prospectively collected in an LMIC setting. Our results are easily converted into a simple, user-friendly clinical decision-making support tool for ANC clinics in low-resource settings, enabling frontline providers of maternal health services to use a score chart to quickly categorise women into different risk levels. The strength of this model is that it uses a few maternal clinical variables already obtained by caregivers during routine ANC. A simple predictive model that helps frontline providers of maternal care to estimate the probability of GH later in the pregnancy and take relevant precautions is potentially lifesaving. Obtaining the information does not involve expensive procedures such as uterine artery Doppler.33 The application of the model at the ANC should aid in the early detection of women at risk of GH and contribute to efforts to provide clinical decision-making support to improve maternal health outcomes. We recommend its validation in other low-income settings, as well as implementation research to inform implementation, monitoring and evaluation at scale in Ghana.


The authors acknowledge Ms Helenmary Bainson, Mrs Cecelia Opong-Peprah and Mrs Emma Antwi who supervised the data collection. The authors thank the midwives who collected the data and the data entry staff. EA gratefully acknowledges the UMC Utrecht Global Health Support program scholarship which enabled him to conduct this study and finalise this manuscript. Finally EA acknowledges funding from the Ghana Health Service for the initial data collection.




  • Contributors EA designed the study, collected data, carried out data analysis and wrote the initial draft of the manuscript. RHHG assisted with data analysis. DEG, RHHG, IA, KAK, KK-G, JLB and AF provided scientific guidance, were actively involved in the preparation and review of the manuscript and approved it.

  • Funding This research received funding from the UMC Utrecht Global Health Support program. The funders played no role in the study design, data collection, data analysis and interpretation as well as writing of the manuscript.

  • Competing interests None declared.

  • Ethics approval Ghana Health Service Ethical Review Committee (GHS-ERC 02/1/10, GHS-ERC 07/09/11).

  • Provenance and peer review Not commissioned; externally peer reviewed.

  • Data sharing statement No additional data are available.


Linked Articles

  • Correction
    British Medical Journal Publishing Group