Objectives To develop a dynamic prediction model for high blood pressure at the age of 9–10 years that could be applied at any age between birth and the age of 6 years in community-based child healthcare.
Design, setting and participants Data were used from 5359 children in a population-based prospective cohort study in Rotterdam, the Netherlands.
Outcome measure High blood pressure was defined as systolic and/or diastolic blood pressure ≥95th percentile for gender, age and height. Using multivariable pooled logistic regression, the predictive value of characteristics at birth, and of longitudinal information on the body mass index (BMI) of the child until the age of 6 years, was assessed. Internal validation was performed using bootstrapping.
Results 227 children (4.2%) had high blood pressure at the age of 9–10 years. Final predictors were maternal hypertensive disease during pregnancy, maternal educational level, maternal prepregnancy BMI, child ethnicity, birth weight SD score (SDS) and the most recent BMI SDS. After internal validation, the area under the receiver operating characteristic curve ranged from 0.65 (prediction at age 3 years) to 0.73 (prediction at age 5–6 years).
Conclusions This prediction model may help to monitor the risk of developing high blood pressure in childhood which may allow for early targeted primordial prevention of cardiovascular disease.
- high blood pressure
- birth cohort
- prediction model
- risk assessment
This is an open access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited, appropriate credit is given, any changes made indicated, and the use is non-commercial. See: http://creativecommons.org/licenses/by-nc/4.0/.
Strengths and limitations of this study
This is the first study in which a dynamic prediction model for childhood high blood pressure was developed, using longitudinal information on the child’s body mass index.
The prediction model is based on predictors that are usually recorded or are easy to obtain in community-based child care settings, and may in the future offer an opportunity for targeted primordial prevention of cardiovascular disease.
The outcome predicted in this study was childhood high blood pressure, and not childhood hypertension, which could only have been diagnosed if blood pressure had been measured on at least three different occasions.
Before considering implementation of this prediction model, external validation is needed, as well as careful evaluation of possible benefits and harms of targeted preventive strategies.
In recent decades, the prevalence of childhood high blood pressure has increased to 4%–5%, largely driven by the growing prevalence of childhood overweight and obesity.1 Childhood high blood pressure is usually asymptomatic until complications occur, but may have adverse consequences for later cardiovascular health,1 partly through early adverse vascular changes and partly through tracking of childhood blood pressure levels into adulthood.2 Childhood high blood pressure has been associated with atherosclerosis in autopsy studies,1 3 4 and with unfavourable changes in markers of subclinical atherosclerosis in adulthood, such as increased carotid intima-media thickness and coronary artery calcification, independently of adulthood blood pressure.5 6 Although blood pressure does not track as strongly from childhood into adulthood as other cardiovascular risk factors such as body fat and cholesterol,7 longitudinal studies have demonstrated that childhood high blood pressure is moderately predictive of adulthood hypertension,8 9 a well-known risk factor for cardiovascular disease (CVD).10
Several medical societies have recognised the importance of primordial prevention of CVD by preventing high blood pressure from early in the life course onwards,11 12 for example, by improving nutrition, increasing physical activity and decreasing sedentary behaviour.1 11 12 Such prevention efforts may be targeted to children at high risk of developing high blood pressure. This would require a tool (eg, based on a prediction model) that accurately identifies these children from the general population. In many developed countries, the general population of children is reached through preventive child healthcare services, well-child care services or other forms of community-based child healthcare, where there might be a possibility for targeted prevention based on predictive risk assessments.
To apply a prediction model in community-based child healthcare, ideally it should be based on information commonly recorded or easily obtainable in such a setting. Studies on prediction models for future high blood pressure applicable in childhood are scarce,8 9 13–15 and for most models information is needed that will likely be difficult to obtain in community-based child healthcare settings, such as repeated blood pressure measurements or genetic information.8 9 14 15 Also, the potential of some of these models for early-life targeted prevention is limited, as they are applicable only later in childhood.9 13 15 Importantly, generalisation of the performance of these prediction models is uncertain as none of them were validated.
Community-based child healthcare reaches many children from a very early age onwards and usually repeated consultations take place. For example, Preventive Child Health Care in the Netherlands reaches over 90% of all children, with frequent and free preventive health consultations from birth until the age of 18 years.16 This not only provides the opportunity to start with prevention very early in life, but the follow-up would also allow for updating of predictions based on new information such as more recent measurements of the child’s body mass index (BMI). This can be described as dynamic prediction.
In this study, as a part of the Prediction Of Child CardiOmetabolic Risk project,17 we aimed to develop a dynamic prediction model feasible to use in community-based child healthcare, from birth to the age of 6 years, to predict high blood pressure at the age of 9–10 years.
Data from Generation R, a population-based prospective birth cohort study, were used. Full details of the study design have been published elsewhere.18 Pregnant women with an expected delivery date between April 2002 and January 2006, living in Rotterdam, the Netherlands, at the time of birth, were eligible. Most women were enrolled in pregnancy, but enrolment was allowed until birth. During early life, growth data from Preventive Child Health Care centres in the area were retrieved by Generation R.
From the original cohort of 9749 live born children, 8548 children were invited for the follow-up visit at the age of 9–10 years; 5862 of these children attended the research centre, of whom 5488 had complete blood pressure measurements. Children aged 8 or 11 years at this visit (n=129) were excluded. As we aimed to develop a prediction model for all children from the general population, no other exclusion criteria were applied. In total, data of 5359 children could be analysed (figure 1).
Blood pressure was measured in the research centre by well-trained staff at a median age of 9.73 years (range 9.00–10.99 years). Systolic blood pressure (SBP) and diastolic blood pressure (DBP) were measured in supine position, at the right brachial artery, using the validated automatic sphygmomanometer Datascope Accutorr Plus.19 An appropriately sized cuff was used for each child, with a bladder width of 40% of the arm circumference and a bladder length of more than 80% of the arm circumference. Blood pressure was measured four times with 1 min intervals. For this analysis, mean SBP and DBP values based on the last three measurements were used. Gender-specific, age-specific and height-specific blood pressure percentiles derived from German reference values were used to define the outcome high blood pressure.20 These percentiles are based on the distribution of blood pressure in a non-overweight population of 12 199 children that was considered representative of the German population, and also included children with a migrant background (17.1% had a two-sided migrant background, most commonly Turkish or Russian). Height percentiles were comparable to the Dutch population, and blood pressure was measured with the same automated device as in Generation R.20 We chose to use reference values derived from a non-overweight population, as it has been recognised that the increasing prevalence of overweight shifts the blood pressure distribution of a population upwards, while this is not a normal or healthy situation.21 High blood pressure was defined as mean SBP and/or DBP at or above the 95th percentile.
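The outcome rule described above can be illustrated with a minimal Python sketch. The function names and the cut-off values in the example call are hypothetical placeholders; in practice the 95th percentile cut-offs would be looked up in the reference tables for the child's gender, age and height.

```python
from statistics import mean


def mean_of_last_three(readings):
    """Average the last three of four consecutive readings (the first is discarded)."""
    if len(readings) != 4:
        raise ValueError("expected exactly four readings")
    return mean(readings[1:])


def has_high_blood_pressure(sbp_readings, dbp_readings, sbp_p95, dbp_p95):
    """Return True if mean SBP and/or mean DBP is at or above its 95th percentile.

    sbp_p95 and dbp_p95 are the reference cut-offs for the child's gender,
    age and height; they are inputs here, not computed by this sketch.
    """
    return (mean_of_last_three(sbp_readings) >= sbp_p95
            or mean_of_last_three(dbp_readings) >= dbp_p95)


# Illustrative values only (not taken from the study or the reference tables).
print(has_high_blood_pressure([118, 112, 110, 111], [70, 66, 64, 65],
                              sbp_p95=115, dbp_p95=75))
```

Because the rule uses "and/or", a child is classified as having high blood pressure when either the systolic or the diastolic mean reaches its cut-off.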
Based on previous studies and expert consultations, variables were identified that have been associated with childhood blood pressure, and that are usually recorded or would otherwise be relatively easy to obtain (eg, through self-reports or extracted from medical reports) in community-based child healthcare settings. These are presented in table S1 of online supplementary material 1, with supporting literature. To prevent overfitting of the prediction model, a selection from these potential candidate predictors was made based on (1) expected predictive strength based on the literature, (2) correlations between variables (eg, between maternal and paternal educational level, and maternal smoking during and after pregnancy) and (3) feasibility in community-based child healthcare. This resulted in the following candidate predictors for analysis: maternal prepregnancy BMI, maternal hypertensive disease in pregnancy, maternal educational level, hypertension in biological parents, CVD in family of biological parents, parental smoking, child gender, child ethnicity, gestational age at birth, birth weight SD score (SDS) and repeatedly measured BMI SDS.
Maternal prepregnancy BMI was derived from a questionnaire during pregnancy. The presence of maternal hypertensive disease in pregnancy was based on delivery reports, cross-checked with hospital charts at admission and the Netherlands Perinatal Registry. Diseases that were considered as maternal hypertensive disease in pregnancy were pregnancy-induced hypertension, pre-eclampsia and haemolysis, elevated liver enzymes and low platelets (HELLP) syndrome. Pre-existing hypertension was not considered as maternal hypertensive disease in pregnancy, unless there was superimposed pre-eclampsia or HELLP syndrome. Maternal educational level (highest level finished) was based on a questionnaire at inclusion, and categorised as none or primary education; secondary education; and higher education. Hypertension in the biological parents was self-reported in questionnaires at inclusion, and categorised as no parental hypertension or at least one parent with hypertension. Parents who reported not knowing if they had hypertension (3%–5% of received answers) were classified as not having hypertension. Family history of CVD was also self-reported at inclusion, and considered positive if at least one parent had at least one relative (mother, father, sister, brother or child) with hypertension, myocardial infarction before the age of 65 or stroke. Parental smoking was assessed through questionnaires during pregnancy (asking the partner whether he smoked in the 2 months before pregnancy), and the first 6 months (asking the mother whether she smoked at that time point). Next, parental smoking was categorised as none of the parents smoke or at least one parent smokes.
Child ethnicity was based on questionnaires and determined in accordance with Statistics Netherlands according to country of birth of the child’s parents: if one parent was born outside the Netherlands, that country was used to determine the child’s ethnicity, and if both parents were born outside the Netherlands, the country of birth of the mother was used to determine the child’s ethnicity.22 Categories were created by considering (1) studies on the association between childhood high blood pressure and ethnicity in the Netherlands, and (2) large ethnic groups in the Netherlands.22–24 Child ethnicity was categorised as Western (including Dutch, other European, western American, western Asian, Oceanian), Turkish, Moroccan, Surinamese and Other non-western (including Cape Verdean, Dutch Antilles, African, non-western Asian, Indonesian, non-western American). Gestational age at birth in weeks and birth weight were based on delivery reports. Birth weight SDS, adjusted for gestational age, was determined according to Niklasson et al.25 BMI was based on protocolled measurements of height and weight at either a Preventive Child Health Care centre (0–4 years) or the Generation R research centre (5–6 years, median 6.0 years). BMI SDS values, adjusted for age and gender, were constructed based on Dutch reference growth curves from 2010.26
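The country-of-birth rule for child ethnicity can be written as a short decision function. This is a minimal sketch of the rule as stated above; the function name and the string used for the Netherlands are illustrative, and the subsequent mapping from country to the five ethnicity categories is not shown.

```python
NETHERLANDS = "Netherlands"


def country_for_ethnicity(mother_birth_country, father_birth_country):
    """Pick the country of birth used to classify the child's ethnicity."""
    mother_abroad = mother_birth_country != NETHERLANDS
    father_abroad = father_birth_country != NETHERLANDS
    if mother_abroad:
        # Covers "only the mother" and "both parents" born abroad:
        # the mother's country is used in either case.
        return mother_birth_country
    if father_abroad:
        # Only the father was born abroad.
        return father_birth_country
    return NETHERLANDS


print(country_for_ethnicity("Turkey", "Netherlands"))
print(country_for_ethnicity("Morocco", "Suriname"))
```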
The percentage of missing values for the candidate predictors ranged from 0% to 61%. To be able to use all of the observed information for each candidate predictor, missing values were imputed 30 times with multivariate imputation by chained equations.27 For three candidate predictors based on complete information from both parents (parental smoking, parental hypertension and CVD in family), the proportions of missing values were high (61%, 48% and 48%, respectively). Therefore, we used the separate variables from each parent in the imputation model; for these variables the proportions of missing values ranged from 14% to 45%. The final imputation model included all candidate predictors, the outcome and the following auxiliary variables: maternal smoking, paternal smoking, maternal hypertension, paternal hypertension, maternal family history of CVD, paternal family history of CVD, child BMI values at other ages between birth and 6 years, smoking in the home environment, family income, paternal educational level, paternal BMI and maternal BMI at child age 5–6 years.
First, we studied whether the BMI SDS trajectory of a child would predict high blood pressure better than the most recent BMI SDS only. We applied a two-step model to investigate the use of BMI SDS trajectories. In the first step, each child’s BMI SDS trajectory was modelled using a random effects model with restricted cubic splines, and in the second step the individual coefficients of each child’s trajectory were used as a predictor in a logistic regression model with high blood pressure as the outcome. We saw that, in our study, the trajectory was not of added predictive value when the most recent BMI SDS and birth weight SDS were already included. Therefore, in the subsequent analysis we used only the most recent BMI SDS and not the BMI SDS trajectory.
Next, logistic regression analyses were performed with high blood pressure at age 9–10 years as the outcome, predicted at different ages (6 months, 1 year, 2 years, 3 years, 4 years and 5–6 years). For each age, a backward stepwise selection procedure was performed, using the Akaike Information Criterion for predictor selection. For a variable with one parameter this corresponds to selection at a p value of 0.157.28 Restricted cubic splines were used to examine non-linearity in the association between birth weight SDS and high blood pressure, because not only low, but also high birth weight might be associated with high blood pressure.29 An interaction term was considered for birth weight SDS and the most recent BMI SDS value, as it was shown that a combination of high current BMI with low birth weight was predictive of higher SBP.30 Neither the splines nor the interaction term were of added predictive value in our study, and thus were not used in the final models.
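The correspondence between the Akaike Information Criterion and a p value of 0.157 can be checked numerically. Since AIC = −2 log L + 2k, removing a single parameter lowers the AIC unless the likelihood-ratio statistic exceeds 2; the implied p value is the upper tail of a chi-square distribution with 1 degree of freedom at 2. The helper name below is illustrative.

```python
import math


def chi2_sf_1df(x):
    """Upper-tail probability of a chi-square variable with 1 degree of freedom.

    For 1 df, P(Z**2 > x) = erfc(sqrt(x / 2)) with Z standard normal.
    """
    return math.erfc(math.sqrt(x / 2.0))


# A one-parameter predictor is retained by AIC when its likelihood-ratio
# statistic exceeds the penalty of 2, which implies this p value threshold:
aic_threshold_p = chi2_sf_1df(2.0)
print(round(aic_threshold_p, 3))  # 0.157
```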
As we aimed for a dynamic prediction model that could be applied at any age from birth until the age of 6 years, we first checked whether at each age the same baseline predictors remained in the model after backward selection. Furthermore, we checked whether the size of the baseline regression coefficients was similar at each age, and whether the coefficient for BMI SDS would increase with increasing age. As these conditions were satisfied, we developed a dynamic prediction model by including the selected baseline predictors and an interaction between BMI SDS and age. By doing so, the predictive value of BMI SDS was allowed to vary with the child’s age at measurement, while associations of the other predictors were kept constant. This approach is referred to in the literature as dynamic logistic regression or pooled logistic regression,31 and reduces the need for age-specific models. To take into account that each child contributes to the model with multiple measurements of BMI SDS (ranging from 0 to 14 measurements), robust standard errors were calculated by fitting the model using generalised estimating equations (GEE) with the independence correlation structure. GEE is usually used to deal with repeatedly measured outcomes, but can also be used to adjust standard errors for repeatedly measured predictors or exposures, such as BMI SDS and age in our study.31 32 For this purpose, an independent working correlation matrix must be specified.32 From the GEE analysis, we obtained the final model with coefficients, and the estimates for the apparent (ie, determined before internal validation) discrimination at different ages.
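The structure of such a model can be sketched as a logistic function whose linear predictor contains time-fixed baseline terms plus a BMI SDS term whose weight depends on age. The sketch below is not the fitted model: all coefficient values are hypothetical, and the inclusion of separate main effects for BMI SDS and age alongside their product is an assumption made for illustration.

```python
import math


def predicted_risk(baseline_lp, bmi_sds, age_years, coef):
    """Risk from a logistic model with a BMI SDS x age interaction.

    baseline_lp is the summed contribution of the time-fixed predictors
    (for example maternal and birth characteristics).
    """
    lp = (coef["intercept"] + baseline_lp
          + coef["bmi_sds"] * bmi_sds
          + coef["age"] * age_years
          + coef["bmi_sds_x_age"] * bmi_sds * age_years)
    return 1.0 / (1.0 + math.exp(-lp))


# Hypothetical coefficients, chosen only to show the mechanics.
COEF = {"intercept": -3.5, "bmi_sds": 0.10, "age": 0.0, "bmi_sds_x_age": 0.05}

# The same BMI SDS carries more weight when it is measured at an older age.
for age in (1, 3, 6):
    print(age, round(predicted_risk(0.2, bmi_sds=1.5, age_years=age, coef=COEF), 4))
```

With a positive interaction coefficient, the effective slope for BMI SDS is `bmi_sds + bmi_sds_x_age * age`, so one set of coefficients yields age-dependent predictions without fitting a separate model per age.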
As currently there is no R package available to internally validate prediction models based on GEE, we performed internal validation procedures on the logistic regression models at each age using bootstrapping. 250 random sets of data were generated, with the same size as the original dataset, drawn with replacement from the original data. These datasets were used to estimate, for each age, the optimism in the quality of the prediction model. Compared with other internal validation techniques such as cross-validation, bootstrapping is better able to capture model uncertainty caused by variable selection methods such as backward selection.33 34
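The general idea of bootstrap optimism correction can be sketched in Python using only the standard library. This is an illustration on simulated data, not the study's R analysis: the "model" is a deliberately simple stand-in that selects the single best predictor, and all names and data are hypothetical. The key point is that the selection step is repeated inside every bootstrap sample.

```python
import random


def auc(scores, outcomes):
    """Probability that a random case outscores a random non-case (ties count 0.5)."""
    cases = [s for s, y in zip(scores, outcomes) if y == 1]
    controls = [s for s, y in zip(scores, outcomes) if y == 0]
    wins = sum(1.0 if c > n else 0.5 if c == n else 0.0
               for c in cases for n in controls)
    return wins / (len(cases) * len(controls))


def fit(rows):
    """Toy model with variable selection: keep the single best predictor and its sign."""
    y = [r[-1] for r in rows]
    best = None
    for j in range(len(rows[0]) - 1):
        a = auc([r[j] for r in rows], y)
        sign, a = (1, a) if a >= 0.5 else (-1, 1 - a)
        if best is None or a > best[2]:
            best = (j, sign, a)
    return best[0], best[1]


def performance(model, rows):
    """AUC of a fitted toy model evaluated on the given rows."""
    j, sign = model
    return auc([sign * r[j] for r in rows], [r[-1] for r in rows])


def resample(rows, rng):
    """Draw a same-size sample with replacement that contains both outcome classes."""
    while True:
        sample = [rng.choice(rows) for _ in rows]
        if 0 < sum(r[-1] for r in sample) < len(sample):
            return sample


def optimism_corrected_auc(rows, n_boot=250, seed=1):
    rng = random.Random(seed)
    apparent = performance(fit(rows), rows)
    optimism = 0.0
    for _ in range(n_boot):
        boot = resample(rows, rng)
        model = fit(boot)  # selection is repeated in every bootstrap sample
        optimism += performance(model, boot) - performance(model, rows)
    optimism /= n_boot
    return apparent, optimism, apparent - optimism


# Simulated data: one weakly informative predictor and four pure-noise predictors.
rng = random.Random(0)
data = []
for _ in range(150):
    y = 1 if rng.random() < 0.3 else 0
    data.append([rng.gauss(0.5 * y, 1)] + [rng.gauss(0, 1) for _ in range(4)] + [y])

apparent, optimism, corrected = optimism_corrected_auc(data, n_boot=100)
print(round(apparent, 3), round(optimism, 3), round(corrected, 3))
```

The optimism is the average gap between performance in the bootstrap sample (where the model was developed) and performance of that same model in the original data; subtracting it from the apparent AUC gives the corrected estimate.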
As performance measures we assessed, at different ages, the apparent area under the receiver operating characteristic curve (AUC) and the AUC adjusted for the optimism calculated with the bootstrapping procedure. The AUC is a measure of discrimination of a prediction model, and reflects the ability of the model to correctly assign a higher risk to individuals who have the outcome compared with those who do not.34 Next, we assessed the calibration slopes calculated in the bootstrap procedure, which represent, at different ages, the ability of the model to estimate the level of risk accurately. The slope can range from 0 to 1, where 1 means that the model is perfectly calibrated.34 The mean of these calibration slopes was used to shrink the regression coefficients estimated with the GEE approach. As a final step, the intercept was adjusted to re-establish the calibration-in-the-large, so that the mean of the predicted risks was again in line with the mean of the observed risks.34 The resulting final model was applied in an Excel risk calculator. All analyses were performed in R, version 3.3.2.
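The last two steps, shrinking the coefficients and re-estimating the intercept, can be sketched as follows. This is a minimal illustration under stated assumptions: the data, coefficients and function names are hypothetical, and the intercept is found by simple bisection so that the mean predicted risk matches the observed outcome frequency.

```python
import math


def risk(lp):
    """Inverse logit."""
    return 1.0 / (1.0 + math.exp(-lp))


def shrink_and_recalibrate(intercept, coefs, X, y, shrinkage):
    """Shrink the slopes, then re-estimate the intercept so that the mean
    predicted risk equals the observed outcome frequency."""
    shrunk = [shrinkage * b for b in coefs]
    partial = [sum(b * x for b, x in zip(shrunk, row)) for row in X]
    target = sum(y) / len(y)

    lo, hi = -20.0, 20.0  # bisection bounds for the new intercept
    for _ in range(100):
        mid = (lo + hi) / 2.0
        mean_pred = sum(risk(mid + p) for p in partial) / len(partial)
        if mean_pred < target:
            lo = mid
        else:
            hi = mid
    new_intercept = (lo + hi) / 2.0
    return new_intercept, shrunk, new_intercept - intercept


# Hypothetical data and coefficients, for illustration only.
X = [[0.5, 1], [-1.0, 0], [1.5, 1], [0.0, 0], [2.0, 1]]
y = [0, 0, 1, 0, 1]
new_intercept, shrunk, adjustment = shrink_and_recalibrate(
    intercept=-1.0, coefs=[0.8, 0.4], X=X, y=y, shrinkage=0.92)

mean_pred = sum(risk(new_intercept + sum(b * x for b, x in zip(shrunk, row)))
                for row in X) / len(X)
print(round(mean_pred, 6), round(sum(y) / len(y), 6))
```

Shrinking the slopes alone pulls predictions towards the centre and generally shifts the mean predicted risk, which is why the intercept is re-estimated afterwards.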
Outcome definition and selection of candidate predictors were presented and discussed in a meeting with a group of stakeholders involved with our research project, including child healthcare professionals and a parent representative from a Dutch parent organisation.
Table 1 shows the characteristics of the 5359 included children, of whom 150 (2.8%) were twins and 323 (6.1%) were born preterm. In total, 227 children (4.2%) had high blood pressure at the age of 9–10 years. Predictors selected in the logistic regression models were: maternal hypertensive disease in pregnancy, maternal educational level, maternal prepregnancy BMI, child ethnicity, birth weight SDS and child BMI SDS at the specific age. These models are presented in table S2 of online supplementary material. The predictive value of the child’s BMI SDS increased with increasing age. Table 2 shows the ORs for the univariable and multivariable associations between the candidate predictors and high blood pressure based on GEE. The strongest predictors were child age in combination with BMI SDS, maternal educational level and maternal prepregnancy BMI. Apparent AUCs for the model ranged from 0.67 to 0.74 at different ages (table 3).
Internal validation and final model
After correcting for optimism, AUCs at different ages ranged from 0.65 to 0.73 (table 3). Calibration slopes ranged from 0.91 to 0.94; a mean shrinkage factor of 0.92 was applied to the coefficients of the GEE model. The intercept had to be adjusted by −0.23. Figure 2 shows the Excel risk calculator. The predicted probabilities in our dataset ranged from 0.2% to 53.4% (median 3.0%). Within children, predicted probabilities can vary at different ages given their BMI SDS development (figure 3 and table 4). Table 5 shows the prevalence of high blood pressure across four categories of predicted risk at the ages 1, 3 and 5–6 years. For example, in children with a predicted risk of more than 15% at the age of 5–6 years, the measured prevalence of high blood pressure at the age of 9–10 years was 24.8%.
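A tabulation of observed prevalence across categories of predicted risk can be sketched as below. Only the 15% cut-off is mentioned in the text; the other cut-offs, the data and the function name are hypothetical and serve only to show the computation.

```python
from bisect import bisect_left


def prevalence_by_risk_category(predicted, observed, cutoffs):
    """Observed outcome frequency within categories of predicted risk.

    cutoffs are ascending; a risk exactly equal to a cut-off falls in the
    lower category, so the top category means "more than" the last cut-off.
    Returns one (n, prevalence) pair per category; prevalence is None if n is 0.
    """
    counts = [[0, 0] for _ in range(len(cutoffs) + 1)]  # [n, n_with_outcome]
    for p, y in zip(predicted, observed):
        k = bisect_left(cutoffs, p)
        counts[k][0] += 1
        counts[k][1] += y
    return [(n, events / n if n else None) for n, events in counts]


# Hypothetical predicted risks and outcomes.
predicted = [0.02, 0.04, 0.07, 0.12, 0.20, 0.30]
observed = [0, 0, 0, 1, 0, 1]
table = prevalence_by_risk_category(predicted, observed, cutoffs=[0.05, 0.10, 0.15])
print(table)
```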
We developed a dynamic model to predict, for children from birth until the age of 6 years in the general population, their risk of high blood pressure at the age of 9–10 years, based on information that is relatively easy to obtain. The dynamic nature of the prediction model allows for incorporating new information on BMI SDS that becomes available as a child gets older, so that the predicted risk can be updated.
After internal validation, the discriminative ability of the prediction model was moderate, and highest at the age of 5–6 years (AUC 0.73) which can be explained by the higher predictive value of BMI SDS at an age closer to the age at outcome assessment. Although the overall discriminative performance was not excellent or good, the prediction model did allow for identification of a group of children at a considerably higher risk than the overall study population to have high blood pressure at the age of 9–10 years. The prediction model might therefore prove helpful to community-based child healthcare professionals, because it would allow them to objectively select children for targeted prevention. On the other hand, before considering implementation of this prediction model, first external validation studies are needed, in order to study the generalisability of the prediction model and to see what adaptations to specific populations might be necessary to improve the performance of the model.
In line with previous studies, we found the following predictors for childhood high blood pressure: maternal hypertensive disease in pregnancy,35 36 maternal educational level,37 38 maternal prepregnancy BMI,39 child ethnicity,23 24 birth weight,29 30 and BMI SDS.9 13–15 Candidate predictors that did not improve the model were gestational age at birth, child gender, hypertension in the biological parents, CVD in the family of the biological parents and parental smoking, even though in previous studies these were associated with blood pressure in the child.14 40–43 One reason these predictors proved unimportant in our study could be that they were correlated with other predictors we included, for example, parental hypertension with maternal BMI. For parental hypertension, another explanation might be that parents were still relatively young and therefore the prevalence was low. Further, the high proportions of missing values for parental smoking, parental hypertension and family history of CVD could have decreased the power to detect associations between these candidate predictors and childhood high blood pressure. A possible reason for the association between gender and high blood pressure reported in a previous study is that in that study the outcome was assessed in adolescence,41 where effects of puberty or other adolescence-related variables may play a role in differentiating the risk of high blood pressure.
As blood pressure was measured in Generation R participants at the age of 5–6 years, we considered adding SBP and/or DBP at this age to the model to improve its discriminative ability, but decided not to, because in most countries routine measurement of blood pressure has not been incorporated into community-based child healthcare.44 45 Even though American and European medical societies recommend routine measurement of blood pressure for children from the age of 3 years,21 46 the debate on its usefulness is still ongoing.46 47 If blood pressure measurement were to become a standard procedure in community-based child healthcare, updating the prediction model with information on current blood pressure should be considered.
This study has several strengths and limitations. First of all, we used data from a large prospective cohort study, allowing us to consider many possible candidate predictors. Based on literature and expert opinion, we believe that we included all the most relevant candidate predictors, while also taking into account that the model should be applicable in most community-based child healthcare settings. Second, we could study the change in the predictive value of childhood BMI SDS, and by using pooled logistic regression analysis we were able to incorporate this information into one dynamic prediction model that can be applied at different ages, increasing the ease of its use in practice.
A possible weakness of the cohort is the loss to follow-up: about 40% of the children in the original cohort did not attend the visit at the age of 9–10 years. In general, children remaining in the study more often had a Western ethnic background, and their mothers were older and better educated.18 Since all ethnicities and educational levels were still well-represented in the follow-up, we think it unlikely that this loss to follow-up appreciably biased the associations found. An important limitation of our study is that for some candidate predictors based on information from both parents (parental smoking, parental hypertension and CVD in the family), the proportions of missing values were very high. Even though we included all available information from each individual parent in the imputation model, the missing values could have reduced the power to detect these variables as predictors for high blood pressure in our model. Therefore, we cannot exclude that information about these variables in reality might be of added predictive value (and hence increase the performance of the model). This should be investigated in external validation studies with more complete data on these variables. Another point that should be noted is that we performed the internal validation of the GEE model indirectly, that is, through bootstrapping the logistic regression models at each age, as there is not yet a software package available that is able to perform this directly on the GEE model. The results were stable over the different ages and standard errors are correct in the analyses for one time point. Therefore, the estimated optimism may be considered as realistic, although external validation is again recommended.
It should be noted that we could only measure the outcome as high blood pressure and not as hypertension. To diagnose childhood hypertension, blood pressure needs to be measured on three separate occasions,21 which was not the case in this study. Most of the children with high blood pressure in our cohort would not be diagnosed with hypertension if they had been followed up on two more occasions. Based on previous studies, we estimate that about 1 in 4 to 5 children with high blood pressure in our study would be diagnosed as hypertensive.41 48 On the other hand, the high blood pressure prevalence in our study might be slightly underestimated, because blood pressure was measured in a supine position which tends to give lower blood pressure values than measurement in a sitting position, as in the study for the reference values.20 This might have filtered out some of the children without true hypertension. Setting aside these limitations, several studies have shown that high blood pressure in childhood measured on only one occasion is associated with an increased risk of hypertension in later life.8 9 Therefore, extra attention to these children could still be warranted, although we must be aware that this has not yet been studied for the more recent reference values for high blood pressure based on non-overweight populations.
If external validity can be confirmed, we would propose that, based on this prediction model, only minimally intensive (and not invasive or harmful) strategies should be offered to high-risk children, considering that (1) the discriminative performance is only moderate, (2) it concerns prediction of high blood pressure and not hypertension and (3) it has not yet been studied whether targeted interventions in this population would be effective. As mentioned previously, the prediction model might be helpful to guide community-based child healthcare professionals in better distributing their time and efforts, by identifying children that need relatively more attention to prevention of CVD, for example in the form of tailored lifestyle and nutritional advice,11 12 measurement of the child’s current blood pressure, and monitoring of blood pressure during follow-up. In overweight or obese children, a higher predicted risk could help to underline the importance of improving weight status. Depending on the strategies to be offered, higher or lower cut-offs to define the high risk group might be used, and the use of multiple cut-offs and differentiated strategies might also be considered. In very young children (eg, <4 years of age) with a high predicted risk, it might be a strategy to wait for the result of the next risk assessment before starting with targeted prevention. Before implementation, the possible benefits and harms of the preferred strategies should be discussed. It would also be important to investigate how parents and health professionals could experience the use of such a prediction model, including the acceptability and effectiveness of risk communication. Lastly, if the model would be implemented in the future, the effects of applying the model in combination with targeted prevention on the occurrence of high blood pressure should be investigated in a randomised or cluster randomised trial.
In summary, we developed a dynamic prediction model to predict the development of childhood high blood pressure based on information that is usually recorded or is easy to obtain in community-based child healthcare practice. This can be seen as a first step towards applying childhood prediction models for future high blood pressure in order to offer targeted primordial prevention of CVD.
We gratefully acknowledge the contribution of children and parents, child health care professionals, general practitioners, hospitals, and midwives in Rotterdam.
Contributors MLAdK and AHW conceived the project. MLAdK obtained funds for the ProCOR project, and is project coordinator. MH performed the analysis, supervised by YV, MWH and JWRT. MH drafted the first version of the manuscript, with help from MLAdK and YV. VWVJ contributed to the conception of the cohort. VWVJ and HR contributed to the design of the cohort. The final manuscript was critically revised and approved by all authors.
Funding This study is part of a larger project aiming to develop prediction and decision tools for childhood overweight and cardiometabolic risk factors, funded by The Netherlands Organization for Health Research and Development (ZonMw grant no. 200500006). Generation R is financially supported by the Erasmus MC University Medical Center, the Netherlands Organization for Health Research and Development, and the Ministry of Health, Welfare and Sport. Generation R also received funding from the European Union’s Horizon 2020 Research and Innovation Programme (grant no. 733206, LifeCycle). Vincent Jaddoe received an additional grant from the Netherlands Organization for Scientific Research (NWO-VIDI 016.136.361) and a Consolidator Grant from the European Research Council (ERC-2014-CoG-648916).
Competing interests None declared.
Patient consent Obtained.
Ethics approval Medical Ethics Committee of the Erasmus University Medical Center, Rotterdam.
Provenance and peer review Not commissioned; externally peer reviewed.
Data sharing statement The datasets generated and/or analysed during the current study are not publicly available due to privacy of the participants, but are available from the corresponding author on reasonable request.