

Comparative safety of antiepileptic drugs for neurological development in children exposed during pregnancy and breast feeding: a systematic review and network meta-analysis
  1. Areti Angeliki Veroniki1,
  2. Patricia Rios1,
  3. Elise Cogo1,
  4. Sharon E Straus1,2,
  5. Yaron Finkelstein3,4,5,
  6. Ryan Kealey1,
  7. Emily Reynen1,
  8. Charlene Soobiah1,6,
  9. Kednapa Thavorn7,8,9,
  10. Brian Hutton7,10,
  11. Brenda R Hemmelgarn11,
  12. Fatemeh Yazdi1,
  13. Jennifer D'Souza1,
  14. Heather MacDonald1,
  15. Andrea C Tricco1,12
  1. Li Ka Shing Knowledge Institute, St. Michael’s Hospital, Toronto, Canada
  2. Department of Medicine, University of Toronto, 27 King’s College Circle, Toronto, Canada
  3. The Hospital for Sick Children, 555 University Avenue, Toronto, Canada
  4. Department of Paediatrics, University of Toronto, Toronto, Canada
  5. Department of Pharmacology and Toxicology, University of Toronto, Medical Sciences Building, Toronto, Canada
  6. Institute for Health Policy Management & Evaluation, University of Toronto, Toronto, Canada
  7. School of Epidemiology, Public Health and Preventive Medicine, University of Ottawa, Ottawa, Canada
  8. Clinical Epidemiology Program, Ottawa Hospital Research Institute, The Ottawa Hospital, Ottawa, Canada
  9. Institute of Clinical and Evaluative Sciences (ICES uOttawa), Ottawa, Canada
  10. Ottawa Hospital Research Institute, Center for Practice Changing Research, Ottawa, Canada
  11. Departments of Medicine and Community Health Sciences, University of Calgary, Calgary, Canada
  12. Division of Epidemiology, Dalla Lana School of Public Health, University of Toronto, Toronto, Canada
  1. Correspondence to Dr Andrea C Tricco; triccoa{at}


Objectives Compare the safety of antiepileptic drugs (AEDs) on neurodevelopment of infants/children exposed in utero or during breast feeding.

Design and setting Systematic review and Bayesian random-effects network meta-analysis (NMA). MEDLINE, EMBASE and the Cochrane Central Register of Controlled Trials were searched until 27 April 2017. Screening, data abstraction and quality appraisal were completed in duplicate by independent reviewers.

Participants 29 cohort studies including 5100 infants/children.

Interventions Monotherapy and polytherapy AEDs including first-generation (carbamazepine, clobazam, clonazepam, ethosuximide, phenobarbital, phenytoin, primidone, valproate) and newer-generation (gabapentin, lamotrigine, levetiracetam, oxcarbazepine, topiramate, vigabatrin) AEDs. Epileptic women who did not receive AEDs during pregnancy or breast feeding served as the control group.

Primary and secondary outcome measures Cognitive developmental delay and autism/dyspraxia were primary outcomes. Attention-deficit hyperactivity disorder, language delay, neonatal seizures, psychomotor developmental delay and social impairment were secondary outcomes.

Results The NMA on cognitive developmental delay (11 cohort studies, 933 children, 18 treatments) suggested that among all AEDs only valproate was statistically significantly associated with more children experiencing cognitive developmental delay compared with control (OR=7.40, 95% credible interval (CrI) 3.00 to 18.46). The NMA on autism (5 cohort studies, 2551 children, 12 treatments) suggested that oxcarbazepine (OR 13.51, CrI 1.28 to 221.40), valproate (OR 17.29, 95% CrI 2.40 to 217.60), lamotrigine (OR 8.88, CrI 1.28 to 112.00) and lamotrigine+valproate (OR 132.70, CrI 7.41 to 3851.00) were associated with significantly greater odds of developing autism compared with control. The NMA on psychomotor developmental delay (11 cohort studies, 1145 children, 18 treatments) found that valproate (OR 4.16, CrI 2.04 to 8.75) and carbamazepine+phenobarbital+valproate (OR 19.12, CrI 1.49 to 337.50) were associated with significantly greater odds of psychomotor delay compared with control.

Conclusions Valproate alone or combined with another AED is associated with the greatest odds of adverse neurodevelopmental outcomes compared with control. Oxcarbazepine and lamotrigine were associated with increased occurrence of autism. Counselling is advised for women considering pregnancy to tailor the safest regimen.

Trial registration number PROSPERO database (CRD42014008925).

  • multiple treatment meta-analysis
  • knowledge synthesis
  • epilepsy
  • pregnancy
  • infants
  • developmental delay

This is an Open Access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited and the use is non-commercial. See:


Strengths and limitations of this study

  • Twenty-nine cohort studies involving 5100 children of women who took antiepileptic drugs (AEDs) were included in this systematic review. More evidence from long-term follow-up studies is required.

  • This study was the first that compared and ranked the safety of AEDs, including comparative safety of treatments that have not been directly compared.

  • Across all neurological outcomes and treatments compared with control, valproate alone or combined with another AED is associated with the greatest odds of adverse development.

  • Oxcarbazepine and lamotrigine were associated with increased occurrence of autism.


Antiepileptic drugs (AEDs) are used by pregnant women for various conditions, such as epilepsy, pain syndromes, psychiatric disorders and chronic migraine.1 AED use during pregnancy is associated with risks to the fetus as these drugs can cross the placenta or may be transferred to the infant through breast feeding and may be associated with adverse neurodevelopment outcomes.2–4 Two systematic reviews examined the association between AED exposure and neurodevelopment in utero and reported that exposure to valproate was linked to significantly lower IQ scores and poorer overall neurodevelopmental outcomes in the children of women who used these medications.5 6 No significant associations were found between neurodevelopment and exposure to other AEDs such as carbamazepine, lamotrigine or phenytoin.5–8 However, there is a lack of sufficiently powered studies to assess the impact of AEDs on neurodevelopment in children of women exposed to these agents, especially for newer-generation drugs, thus highlighting the need for a systematic review.9 10

The aim of this study was to compare the safety of AEDs and assess their impact on neurodevelopment in infants and children exposed in utero or during breast feeding, employing a systematic review and network meta-analysis (NMA).


The methods are briefly described here; details can be found in the published protocol (see online supplementary additional file 1).11 This study was registered with PROSPERO (CRD42014008925). We followed the International Society for Pharmacoeconomics and Outcomes Research (ISPOR)12 guidelines for our NMA and reported our findings using the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) extension for NMA (see online supplementary additional file 2).13

Eligibility criteria

All randomised clinical trials (RCTs), quasi-RCTs and observational studies were eligible. Included studies assessed infants or children ≤12 years of age whose mothers consumed AEDs during pregnancy and/or while breast feeding. Both monotherapy and polytherapy AEDs were eligible, including first-generation (ie, carbamazepine, clobazam, clonazepam, ethosuximide, phenobarbital, phenytoin, primidone, valproate) and newer-generation (ie, marketed >1990: gabapentin, lamotrigine, levetiracetam, oxcarbazepine, topiramate, vigabatrin), with no restrictions on AED dosage. Placebo, no AED, other AEDs alone or in combination were considered as comparators. Duplicate studies that used the same registry or population sample (ie, companion studies) were used for supplementary information only. No language or other restrictions were imposed.

The primary neurological outcomes were cognitive developmental delay and autism/dyspraxia, and the secondary outcomes included attention-deficit hyperactivity disorder (ADHD), language delay, neonatal seizures, psychomotor developmental delay and social impairment. Table 1 shows the outcome measures and diagnostic scales used. We initially intended to evaluate all safety outcomes in infants/children exposed to AEDs in utero or during breast feeding in one publication, but given the breadth of evidence identified, we report results related to risk of major congenital malformations, birth and prenatal outcomes in a companion paper.14

Table 1

Outcome measures and diagnostic scales used in analysis

Information sources

An experienced librarian executed search strategies for MEDLINE, EMBASE and the Cochrane Central Register of Controlled Trials up to 18 March 2014 and then updated the search on 27 April 2017. The search strategy for MEDLINE was peer-reviewed by another librarian using the Peer Review of Electronic Search Strategies (PRESS) checklist15 and is available in the protocol.11 Additional studies were identified by scanning references and contacting authors. Unpublished studies were sought by searching clinical trial registries and conference abstracts.

Study selection and data collection

After a calibration exercise, titles/abstracts (level 1) and full-text papers (level 2) were screened by two reviewers independently. On completion of level 1, 6% of citations were discrepant between reviewer pairs, whereas at the conclusion of level 2, 16% of articles were discrepant. Conflicts were resolved through discussion or by a third reviewer. The same approach was used for data abstraction and appraisal of methodological quality. Three rounds of pilot testing were conducted prior to data abstraction to train reviewers and refine the data abstraction form. For studies published in the last 10 years, authors were contacted to request clarification or additional data.

Appraisal of methodological quality

Only observational studies were identified and included for analysis, and their methodological quality was appraised with the Newcastle-Ottawa Scale (NOS) (see online supplementary appendix A).16 For each outcome with ≥10 studies, the comparison-adjusted funnel plot was used to assess small-study effects,17 where the overall treatment effect for each comparison was estimated under the fixed-effect meta-analysis model. All eligible medications were ordered from oldest to newest using their international market approval dates. Hence, the comparison-adjusted funnel plot additionally assesses the hypothesis that newer AEDs are favoured over older ones. To overcome some of the correlations induced by multiarm studies, which may cause overestimation and mask funnel plot asymmetry, we plotted data points corresponding to the study-specific basic parameters (treatment comparisons with common comparator). In each study, we used the control group as the common comparator or, if this was missing, we used the oldest treatment comparator against the remaining AEDs.

Synthesis of included studies

We used the OR for each dichotomous outcome, and outcome data were pooled using hierarchical meta-analysis and NMA models and the Markov Chain Monte Carlo sampling method in a Bayesian framework. To account for anticipated methodological and clinical heterogeneity across studies, and to achieve the highest generalisability in the meta-analytical treatment effects, we applied a random-effects model.18
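To make the pooling step concrete, the sketch below computes a log OR and its variance from a 2×2 table and combines several studies under a random-effects model. It deliberately swaps in the frequentist DerSimonian-Laird estimator as a simple stand-in; the review itself fitted Bayesian hierarchical models by MCMC, which this code does not reproduce. Function names, the 0.5 continuity correction and the counts are assumptions made for illustration.

```python
import math


def log_or_with_variance(events_a, n_a, events_b, n_b, correction=0.5):
    """Log odds ratio (group A vs group B) and its approximate variance."""
    a, b = events_a, n_a - events_a
    c, d = events_b, n_b - events_b
    if 0 in (a, b, c, d):
        # Assumed continuity correction when any cell is empty.
        a, b, c, d = (x + correction for x in (a, b, c, d))
    log_or = math.log((a * d) / (b * c))
    variance = 1 / a + 1 / b + 1 / c + 1 / d
    return log_or, variance


def dersimonian_laird(effects):
    """Random-effects pooled estimate from (log_or, variance) pairs.

    Returns (pooled log OR, standard error, between-study variance).
    """
    ys = [y for y, _ in effects]
    vs = [v for _, v in effects]
    k = len(ys)
    w = [1 / v for v in vs]
    sum_w = sum(w)
    y_fixed = sum(wi * yi for wi, yi in zip(w, ys)) / sum_w
    q = sum(wi * (yi - y_fixed) ** 2 for wi, yi in zip(w, ys))
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - (k - 1)) / c) if k > 1 and c > 0 else 0.0
    w_star = [1 / (v + tau2) for v in vs]
    pooled = sum(wi * yi for wi, yi in zip(w_star, ys)) / sum(w_star)
    se = math.sqrt(1 / sum(w_star))
    return pooled, se, tau2


# Hypothetical counts: 10/50 delayed among exposed, 5/50 among controls.
study = log_or_with_variance(10, 50, 5, 50)
pooled, se, tau2 = dersimonian_laird([study, study, study])
pooled_or = math.exp(pooled)
```

Working on the log scale keeps the sampling distribution roughly symmetric; the pooled estimate is exponentiated only for reporting.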

An NMA was applied for connected evidence networks and prespecified treatment nodes.19 We assessed the transitivity assumption for each outcome a priori using the effect modifiers: age, baseline risk, treatment indication, timing and methodological quality. The mean of each continuous effect modifier and the mode of each categorical effect modifier for each pairwise comparison were presented in tables for each outcome.20 The consistency assumption was evaluated for the entire network of each outcome using the random-effects design-by-treatment interaction model when multiple studies were available in each network design or the fixed-effect design-by-treatment interaction model when a single study informed each network design.21 If inconsistency was identified, further examination for local inconsistency in parts of the network was completed using the loop-specific method.22 23 Common within-network between-study variance (τ2) across treatment comparisons was assumed in the meta-analysis, NMA and design-by-treatment interaction model so that treatment comparisons including a single study can borrow strength from the remaining network. This assumption was clinically reasonable as the treatments included were of the same nature. In the loop-specific approach, common within-loop τ2 was assumed.
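The idea behind a loop-specific check can be illustrated with a single closed loop of three treatments A, B and C. The sketch below is a simplified frequentist illustration, not the authors' implementation: it derives the indirect B versus C estimate from the A versus B and A versus C estimates, then measures how far it sits from the direct B versus C estimate. All names and numbers are hypothetical.

```python
import math


def inconsistency_factor(direct_bc, ab, ac):
    """Compare direct and indirect evidence for B vs C in an A-B-C loop.

    Each argument is a (log_or, variance) pair. Returns the absolute
    inconsistency factor, its standard error and the z statistic.
    """
    d_bc, var_bc = direct_bc
    d_ab, var_ab = ab
    d_ac, var_ac = ac

    # Indirect estimate via the common comparator A; variances add.
    indirect = d_ac - d_ab
    var_indirect = var_ab + var_ac

    # Direct and indirect sources are treated as independent.
    factor = abs(d_bc - indirect)
    se = math.sqrt(var_bc + var_indirect)
    return factor, se, factor / se


# Hypothetical log ORs and variances.
factor, se, z = inconsistency_factor(
    direct_bc=(0.9, 0.05), ab=(0.5, 0.04), ac=(1.2, 0.09)
)
```

A z value well below about 1.96 would suggest no clear disagreement between direct and indirect evidence in that loop.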

For cognitive developmental delay and autism/dyspraxia outcomes, network meta-regression analyses for maternal age and baseline risk (ie, using the control group) were conducted, when ≥10 studies provided relevant information, assuming a common fixed coefficient across treatment comparisons for AEDs versus control. Sensitivity analyses for cognitive developmental delay and autism/dyspraxia outcomes were performed for treatment indication of epilepsy, large study size (ie, >300), maternal alcohol intake, maternal tobacco use, only first-generation AEDs and methodological quality. The sensitivity analysis for methodological quality was restricted to studies with low risk of bias for the two items on the NOS where the greatest proportion of studies received a low-quality score: adequacy of follow-up of cohorts and comparability of cohorts. For autism/dyspraxia, a sensitivity analysis on maternal IQ/psychiatric history was additionally conducted. We measured the goodness of fit using the posterior mean of the residual deviance, the degree of τ2 and the deviance information criterion (DIC). In a well-fitting model, the posterior mean residual deviance should be close to the number of data points.24 25 A difference of three units in the DIC between an NMA and a network meta-regression model was considered important and the lowest value of the DIC corresponded to the model with the best fit.24 25
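The two model-fit rules in this paragraph can be written as small helper functions. This is a sketch only: the function names are invented, and the 10% tolerance used to decide whether the residual deviance is "close to" the number of data points is an assumption, since the text gives no numeric cut-off. The three-unit DIC threshold is the one stated above.

```python
def residual_deviance_ok(residual_deviance, n_data_points, tolerance=0.10):
    """Heuristic check that the posterior mean residual deviance is close
    to the number of data points. The 10% tolerance is an assumption."""
    return abs(residual_deviance - n_data_points) <= tolerance * n_data_points


def preferred_model(dic_nma, dic_regression, threshold=3.0):
    """Compare DIC values; a difference of at least `threshold` units is
    treated as important and the lower DIC indicates the better fit."""
    difference = dic_nma - dic_regression
    if abs(difference) < threshold:
        return "no important difference"
    return "meta-regression" if difference > 0 else "NMA"


# The DIC values below are hypothetical and only exercise the rule.
close_fit = residual_deviance_ok(45.27, 47)
small_gap = preferred_model(dic_nma=82.00, dic_regression=80.17)
large_gap = preferred_model(dic_nma=85.00, dic_regression=80.17)
```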

All analyses were conducted in OpenBUGS26 assuming non-informative priors for all model parameters and a half-normal prior for the between-study SD, τ~N(0,1), τ>0. The first 10 000 iterations were discarded as burn-in, and 100 000 further simulations were run with a thinning interval of 10. Convergence was checked by visually inspecting the mixing of two chains. The median and 95% credible intervals (CrIs) were calculated for each parameter value. The network command27 was used to apply the design-by-treatment interaction model.
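The sampler settings and posterior summaries described here can be mimicked in plain Python. The sketch below is not OpenBUGS code and does not fit any model. It assumes that thinning applies to the 100 000 post-burn-in iterations, draws from a normal prior truncated at zero for τ, and summarises a vector of draws by its median and an equal-tailed 95% interval. All function names are hypothetical.

```python
import random
import statistics

BURN_IN = 10_000       # iterations discarded
ITERATIONS = 100_000   # iterations run after burn-in
THIN = 10              # keep every 10th draw (assumed interpretation)


def retained_draws_per_chain(iterations=ITERATIONS, thin=THIN):
    """Number of draws kept per chain after thinning."""
    return iterations // thin


def draw_tau_prior(rng):
    """One draw from N(0, 1) truncated to positive values (half-normal)."""
    return abs(rng.gauss(0.0, 1.0))


def posterior_summary(samples):
    """Median and equal-tailed 95% interval of a list of draws."""
    cut_points = statistics.quantiles(samples, n=40)  # steps of 2.5%
    return statistics.median(samples), cut_points[0], cut_points[-1]


rng = random.Random(2017)
tau_draws = [draw_tau_prior(rng) for _ in range(retained_draws_per_chain())]
tau_median, tau_low, tau_high = posterior_summary(tau_draws)
```

For effects estimated on the log OR scale, the median and interval limits would be exponentiated before reporting.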

For NMA estimates, a 95% predictive interval (PrI) is also reported to capture the magnitude of τ2 and present the interval within which the treatment effect of a future study is expected to lie.28 29 The estimated safety of the included AEDs was ranked using the surface under the cumulative ranking (SUCRA) curve.30 The larger the SUCRA for a treatment, the higher its safety rank among all the available treatment options. SUCRA values are presented along with 95% CrIs to capture the uncertainty in the parameter values.31
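A SUCRA value can be computed directly from posterior draws of the treatment effects. The sketch below is a minimal illustration under stated assumptions, not the authors' code: each draw holds one log OR of harm per treatment, a lower value means a safer treatment, and ties are not handled specially. For each treatment it tallies the probability of every rank, accumulates those probabilities, and averages the cumulative probabilities over the first T−1 ranks.

```python
def sucra_from_draws(draws):
    """SUCRA per treatment from posterior draws.

    `draws` is a list of lists; each inner list has one log OR of harm
    per treatment (lower = safer, so rank 1 is the safest).
    """
    n_draws = len(draws)
    n_treatments = len(draws[0])

    # rank_counts[t][r] = number of draws in which treatment t had rank r+1.
    rank_counts = [[0] * n_treatments for _ in range(n_treatments)]
    for draw in draws:
        order = sorted(range(n_treatments), key=lambda t: draw[t])
        for rank, treatment in enumerate(order):
            rank_counts[treatment][rank] += 1

    sucra = []
    for treatment in range(n_treatments):
        cumulative = 0.0
        total = 0.0
        for rank in range(n_treatments - 1):
            cumulative += rank_counts[treatment][rank] / n_draws
            total += cumulative
        sucra.append(total / (n_treatments - 1))
    return sucra


# Two toy draws for three treatments (hypothetical values).
toy_draws = [[0.0, 1.0, 2.0], [0.0, 2.0, 1.0]]
sucra_values = sucra_from_draws(toy_draws)
```

In the toy data the first treatment is always safest and gets a SUCRA of 1, while the other two swap second and third place and share a value of 0.25. With only a few studies per treatment, ranks change a lot between draws, which is why the SUCRA values are reported with their own credible intervals.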


Literature search and included studies

Our literature search identified 5707 titles and abstracts, which after the screening process yielded 681 articles potentially relevant for inclusion (figure 1). After full-text review, 95 studies fulfilled eligibility criteria along with 17 studies identified through supplemental methods. Of the 112 total eligible studies in the complete review,14 29 articles with seven companion reports and two potentially overlapping registry studies included one or more relevant neurological outcomes (see online supplementary appendix B). Four of the studies included in this analysis were conference abstracts with usable data,32–35 and four studies,36–39 not captured in the original literature search, were identified through reference scanning. A table with the key excluded studies and a rationale for their exclusion is presented in the online supplementary appendix C.

Study and patient characteristics

We included 29 cohort studies (5100 patients) published between 1989 and 2016 (table 2; see online supplementary appendix D,E). The number of patients included in each study ranged from 23 to 2011 (median 74.5). Most studies (76%) were published after 2000, 62% of the studies included <100 patients and 52% of the studies included a control group of pregnant/breastfeeding women with epilepsy who did not receive AEDs. The mean maternal age ranged from 24 to 34 years. About half of the studies (52%) were funded through government/public research funding.

Table 2

Summary characteristics of included studies

Methodological quality results

Twenty-nine observational studies were appraised using the NOS (see online supplementary appendix F). Overall, the studies were of good methodological quality and were rated as high quality across most items: 28 studies (97%) selected the non-exposed cohort from the same community as the exposed cohort, 26 (90%) included a representative or somewhat representative sample, 27 (93%) assessed outcomes independently, with blinding, or via a record linkage (eg, identified through database records) and 23 (79%) ascertained exposure via secured records (eg, database records) or structured interviews. The comparability of cohorts and adequacy of follow-up were the lowest scoring items across the studies with only 12 (41%) and 10 (34%) studies rated as high quality on these items. No evidence for small-study effects was identified by the visual inspection of the comparison-adjusted funnel plots (see online supplementary appendix G).

Statistical analysis results

No important concerns were raised regarding the violation of the transitivity assumption when maternal age, baseline risk, treatment indication and timing were assessed (see online supplementary appendix H). However, the average methodological quality appraisal varied across treatment comparisons. The evaluation of the consistency assumption using the design-by-treatment interaction model suggested that there was no evidence of significant inconsistency across all outcomes (see online supplementary appendix H).

In the following sections, we present the significant NMA results by outcome for AEDs compared with control (ie, no exposure to AEDs), while the SUCRA values from all outcomes are presented in figure 2 and depicted in a rank-heat plot in the online supplementary appendix I.

Figure 2

Forest plots for cognitive developmental delay, autism/dyspraxia, psychomotor developmental delay, language delay and attention-deficit hyperactivity disorder outcome. carbam, carbamazepine; ethos, ethosuximide; gabap, gabapentin; lamot, lamotrigine; levet, levetiracetam; pheno, phenobarbital; pheny, phenytoin; PrI, predictive interval; primid, primidone; SUCRA, surface under the cumulative ranking; topir, topiramate; valpro, valproate.

Cognitive developmental delay

The NMA for cognitive developmental delay (definitions in table 1) included 11 cohort studies, 933 children and examined 18 treatments (figure 3A; see online supplementary appendix J; τ2=0.12, 95% CrI 0.00 to 1.15). One study included children exposed to AEDs both in utero and through breast feeding, and 10 included children exposed to AEDs in utero. Across all AEDs, only valproate was associated with significantly increased odds of cognitive developmental delay compared with control (OR 7.40, 95% CrI 3.00 to 18.46; figure 2A; see online supplementary appendix H).

Figure 3

Network diagrams for cognitive developmental delay, autism/dyspraxia, psychomotor developmental delay, language delay and attention-deficit hyperactivity disorder outcomes. Each treatment node is weighted according to the number of patients that have received the particular treatment, and each edge is weighted according to the number of studies comparing the treatments it connects. carbam, carbamazepine; clobaz, clobazam; clonaz, clonazepam; ethos, ethosuximide; gabap, gabapentin; lamot, lamotrigine; levet, levetiracetam; oxcar, oxcarbazepine; pheno, phenobarbital; pheny, phenytoin; primid, primidone; topir, topiramate; valpro, valproate; vigab, vigabatrin.

The same results were observed in a network meta-regression of baseline risk for offspring of women with epilepsy who were not exposed to AEDs (estimated regression coefficient on OR scale: 1.01, 95% CrI 0.76 to 1.56; τ2=0.16, 95% CrI 0.00 to 1.24; residual deviance=45.27, data points=47, DIC=80.17). Similarly, the sensitivity analyses restricted to (1) studies that only included women receiving AEDs to treat epilepsy (10 studies, 910 children, 17 treatments; τ2=0.16, 95% CrI 0.00 to 1.36), (2) studies comparing only first-generation AEDs (6 studies, 480 children, 13 treatments; τ2=0.28, 95% CrI 0.00 to 2.97), (3) studies that reported maternal alcohol or tobacco use (3 studies, 504 children, 7 treatments; τ2=0.27, 95% CrI 0.00 to 3.29) and (4) studies with high methodological quality on NOS item ‘comparability of cohorts’ (3 studies, 366 children, 7 treatments; τ2=0.38, 95% CrI 0.00 to 4.14) were consistent with the NMA results (see online supplementary appendix K). The sensitivity analysis with studies of high methodological quality on the NOS item ‘adequacy of follow-up’ found no statistically significant results (4 studies, 283 patients, 12 treatments; τ2=1.01, 95% CrI 0.01 to 5.85; see online supplementary appendix K).


Autism/dyspraxia

The NMA on autism/dyspraxia (definitions in table 1) included 5 cohort studies, 2551 children exposed in utero and examined 12 treatments (τ2=0.16, 95% CrI 0.00 to 1.95; figure 3B; see online supplementary appendix H). Compared with control, only valproate (OR 17.29, 95% CrI 2.40 to 217.60), oxcarbazepine (OR 13.51, 95% CrI 1.28 to 221.40), lamotrigine (OR 8.88, 95% CrI 1.28 to 112.00) and lamotrigine+valproate (OR 132.70, 95% CrI 7.41 to 3851.00) were significantly associated with increased occurrence of autism/dyspraxia (figure 2B).

Restricting the NMA to studies including only women with epilepsy as their treatment indication produced results that were generally in agreement with the NMA results, except that oxcarbazepine was no longer in the network (4 cohort studies, 540 children, 10 treatments; τ2=0.31, 95% CrI 0.00 to 304). Two cohort studies of 404 offspring of women with a history of tobacco use compared four treatments and found similar results, except that oxcarbazepine and lamotrigine+valproate were no longer in the network (τ2=0.39, 95% CrI 0.00 to 4.47). The results were in agreement in sensitivity analyses including only studies of higher methodological quality on the NOS items ‘comparability of cohorts’ (4 studies, 2395 children, 12 treatments; τ2=0.19, 95% CrI 0.00 to 2.43) and ‘adequacy of follow-up of cohorts’ (3 studies, 2244 children, 10 treatments; τ2=0.23, 95% CrI 0.00 to 2.88), except that lamotrigine was no longer statistically significantly different from control for the latter (see online supplementary appendix K).

Neonatal seizure

One cohort study of 72 children who were exposed to AEDs in utero as well as through breast feeding reported on the incidence of neonatal seizures. The study compared valproate against lamotrigine and found no significant difference in neonatal seizures between the two drugs (OR 0.18, 95% CI 0.01 to 3.70).

Psychomotor developmental delay

The NMA on psychomotor developmental delay (definitions in table 1) included 11 cohort studies, 1145 children exposed in utero and examined 18 treatments (τ2=0.06, 95% CrI 0.00 to 0.63; figure 3C; see online supplementary appendices H,J). Valproate (OR 4.16, 95% CrI 2.04 to 8.75) and carbamazepine+phenobarbital+valproate (OR 19.12, 95% CrI 1.49 to 337.50) were significantly more harmful than control (figure 2C).

Language delay

The NMA on language delay (definitions in table 1) included 5 cohort studies, 509 children and examined 5 treatments (τ2=0.16, 95% CrI 0.00 to 2.15; figure 3D; see online supplementary appendices H,J). One study included children exposed to AEDs in utero and through breast feeding, and four included children exposed to AEDs in utero. Compared with control, valproate was the only treatment significantly associated with increased odds of language delay (OR 7.95, 95% CrI 1.50 to 49.13; figure 2D).

Attention-deficit hyperactivity disorder (ADHD)

The NMA on ADHD (definitions in table 1) included 5 cohort studies, 816 children and examined 7 treatments (τ2=0.11, 95% CrI 0.00 to 1.29). One study included children exposed to AEDs in utero and through breast feeding, while four studies included children exposed to AEDs in utero. None of the treatment comparisons reached statistical significance (figures 3E and 2E; supplementary appendices H,J).

Social impairment

One cohort study included 422 children exposed to AEDs in utero as well as through breast feeding. The children were exposed to carbamazepine (n=48), lamotrigine (n=71), valproate (n=27) and control (n=278). No significant differences in social impairment were identified.41


Our results suggest that AEDs generally pose a risk for infants and children exposed in utero or during breast feeding. Valproate was significantly associated with more children experiencing autism/dyspraxia, language, cognitive and psychomotor developmental delays versus children who were not exposed to AEDs. Oxcarbazepine, lamotrigine and lamotrigine+valproate were associated with increased occurrence of autism/dyspraxia, whereas for the cognitive developmental delay and psychomotor developmental delay outcomes, children exposed to the combination of carbamazepine, phenobarbital and valproate were at greater odds of harm than those who were not exposed to AEDs. However, these results should be interpreted with caution, as a number of factors (eg, anticonvulsant dosing, severity of epilepsy, duration of exposure, serum concentrations of exposure, mother’s IQ/education) that may all influence outcomes were not identified in these studies. Also, our subsequent analyses may be underpowered due to missing data (eg, 17 of the 27 studies did not report maternal age, 23 of 27 studies did not report alcohol use, 22 of 27 studies did not report tobacco use and 14 of 27 studies did not include control group).

NMA is a particularly useful tool for decision-makers because it allows the ranking of treatments for each outcome. However, the results of our SUCRA curves should be interpreted with caution, especially due to the small number of studies and children included in each NMA, which is also reflected in the high uncertainty around the SUCRA values (figure 2).31

Our results are consistent with a longitudinal study of 311 children that found exposure to lamotrigine was associated with significantly higher IQ scores and verbal function at 6 years of age compared with children exposed to valproate (see online supplementary appendix C).7 As indicated in the online supplementary appendix C, we were unable to include this study because the outcome was reported as a continuous measure, whereas we focused on dichotomous outcomes to facilitate interpretation. Our results are also supported by findings from a cohort study, which found that children exposed to levetiracetam were not at increased risk for delayed development compared with unexposed children (see online supplementary appendix C).42 As indicated in the online supplementary appendix C, we were unable to include this study for the same reason. An NMA of 195 RCTs (including 28 013 male and female patients) found that gabapentin and levetiracetam had the best tolerability profiles compared with other AEDs, whereas oxcarbazepine and topiramate had higher withdrawal rates, and lamotrigine an intermediate withdrawal rate.43

Across all outcomes, valproate alone or combined with another AED (even with a newer-generation agent, eg, lamotrigine) was associated with the greatest odds. Similarly, two previous systematic reviews that did not conduct an NMA found valproate was associated with significantly lower IQ scores and poorer overall neurodevelopmental outcomes compared with an unexposed control group.5 6 Also consistent with our results, a 2014 Cochrane review including 28 studies (10 of these studies were included in the meta-analyses; with a maximum number of 5 studies per meta-analysis) concluded that AED polytherapy led to poorer developmental outcomes and IQ compared with healthy controls, epileptic controls and unspecified monotherapy.5 This Cochrane review also concluded that insufficient data exist for newer AEDs. However, unlike our review, it included and analysed fewer studies, and did not differentiate between specific polytherapy regimens, and thus did not compare these regimens versus each other or specific monotherapy AEDs. These risks must be balanced with the need to control seizure activity in pregnancy and thus informed decision-making by patients and clinicians is critical.

Strengths of our study include a comprehensive systematic review methodology that followed the Cochrane Handbook44 and ISPOR12 guidelines, and reported using the PRISMA extension for NMA.13 To the best of our knowledge, our study was the first that compared and ranked the safety of AEDs. We evaluated the comparative safety of treatments that have not been directly compared head-to-head before. In addition, we calculated predictive intervals, which account for between-study variation and provide a predicted range for the treatment effect estimate, should a future study be conducted. On average, the predictive intervals suggested that our results are robust.

Our systematic review has a few limitations worth noting. First, due to the complexity of the data and the studies’ under-reporting, differences in drug dosages could not be accounted for, and it was assumed that different dosages of the same AED were equally effective. When a study reported multiple dosages for the same treatment, we combined the data for this treatment. This is common for cohort studies, which report on a number of different types of exposures among patients. Second, several polytherapies had high SUCRA estimates but very wide CrIs, which is due to the small number of studies included for each drug combination with underpowered sample sizes. Evidence suggests that ranking probabilities for a treatment of being the best may be biased towards the treatments with the smallest number of studies, which may have influenced our SUCRA results.31 45 As such, the effect sizes need to be taken into account when considering the SUCRA values. Third, due to the absence of evidence from RCTs, our conclusions were based on evidence from observational studies only, and inherent biases because of confounding and shortcomings of these studies may have impacted our findings. For example, the included studies often failed to report important treatment effect modifiers,46 such as family history of autism or ADHD, maternal IQ and severity of epilepsy, making it impossible for us to explore their impact through subgroup analysis and meta-regression. Recent research has explored methods to incorporate non-randomised with randomised evidence in an NMA and has highlighted the need to carefully explore the level of confidence in the non-randomised evidence.47 48 The use of observational studies allows the assessment of the safety profile of AED treatments and offers the opportunity to evaluate effects in pregnancy.49 Future large-scale observational studies are needed to allow the evaluation of rare adverse events that otherwise cannot be adequately evaluated in RCTs, especially during pregnancy. Fourth, although no intransitivity was evident for most effect modifiers assessed, there was an imbalance in the methodological study quality appraisal across treatment comparisons and most outcomes, which may impact our results. Unknown factors or factors that could not be assessed due to a dearth of data may pose the risk of residual confounding bias, and hence risk the validity of the transitivity assumption. However, the assessment of consistency suggested no disagreement between the different sources of evidence in the network. Fifth, although the tendency towards small-study effects is greater with observational studies than with randomised trials,50 the assessment of small-study effects using adjusted funnel plots suggested no evidence for their prevalence. Also, the majority of the included studies in this review compared multiple treatments, inducing correlations in each funnel plot, which may mask asymmetry. Although we plotted data points corresponding to the study-specific basic parameters to reduce correlations, this issue may still exist. Sixth, we were unable to conduct subgroup analysis by type of exposure (breast feeding vs in utero) due to the small number of studies included in the NMA and due to the poor reporting; 22 studies did not report whether exposure also occurred through breast feeding (in addition to in utero). Hence, we included all studies in the analysis irrespective of the type of exposure.

More evidence from long-term follow-up studies is required to further delineate neurodevelopmental risks in children. Future studies should assess the genetic contribution from the biological father, maternal seizures during pregnancy, exposure through breast feeding only, types of epilepsy and maternal family history. Registries should aim to include a suitable control group and collect information on potential confounders, such as alcohol and tobacco use, allowing researchers to identify the safest agents for different patient-level covariates and enhance decision-making for healthcare providers and patients. A critical evaluation of the validity of the control group is also necessary in order to examine potential differences between the treated and the non-treated populations. An individual patient data NMA would likely provide further clarity to the field, because it allows management to be tailored to specific patient characteristics.51


Across all outcomes and treatments compared with control, valproate alone or combined with another AED was associated with the greatest odds of adverse neurodevelopmental outcomes, whereas oxcarbazepine and lamotrigine were associated with increased occurrence of autism. Counselling is advised for women considering pregnancy to tailor the safest regimen.


The authors thank Dr David Moher for providing his feedback on our protocol. They thank Dr Laure Perrier for conducting the literature searches, Becky Skidmore for peer-reviewing the MEDLINE search and Alissa Epworth for obtaining the full-text articles. They also thank Alistair Scott, Wing Hui and Geetha Sanmugalingham for screening some of the citations and/or abstracting some of the data for a few of the included studies, Misty Pratt and Mona Ghannad for helping scan reference lists, and Ana Guzman, Susan Le and Inthuja Selvaratnam for contacting authors and formatting the manuscript.




  • Contributors AAV analysed the data, interpreted the results and drafted the manuscript. ACT and SES conceived and designed the study, helped obtain funding, interpreted the results and helped write sections of the manuscript. PR and EC coordinated the review, screened citations and full-text articles, abstracted data, appraised quality, resolved discrepancies, contacted authors and edited the manuscript. CS provided methodological support, screened citations and full-text articles and edited the manuscript. RK, ER, FY, JDS, KT and HM screened citations and full-text articles, abstracted data and/or appraised quality. BH, BRH and YF helped conceive the study and edited the manuscript. All authors read and approved the final manuscript.

  • Funding This systematic review was funded by the Canadian Institutes for Health Research/Drug Safety and Effectiveness Network (CIHR/DSEN).

  • Disclaimer The funder had no role in the design and conduct of the study; collection, management, analysis and interpretation of the data; preparation, review or approval of the manuscript; or decision to submit the manuscript for publication.

  • Competing interests AAV is funded by the Banting Postdoctoral Fellowship Program from the CIHR. SES is funded by a Tier 1 Canada Research Chair in Knowledge Translation. BH is funded by a CIHR/DSEN New Investigator Award in Knowledge Synthesis. BRH receives funding from the Alberta Heritage Foundation for Medical Research. ACT is funded by a Tier 2 Canada Research Chair in Knowledge Synthesis.

  • Provenance and peer review Not commissioned; externally peer reviewed.

  • Data sharing statement All data sets generated and/or analysed during the current study are available from the corresponding author on reasonable request.
