Meta-analysis of self-reported health symptoms in 1990–1991 Gulf War and Gulf War-era veterans
  1. Alexis L Maule1,2,
  2. Patricia A Janulewicz1,
  3. Kimberly A Sullivan1,
  4. Maxine H Krengel1,3,
  5. Megan K Yee3,
  6. Michael McClean1,
  7. Roberta F White1,4
  1. 1 Department of Environmental Health, Boston University School of Public Health, Boston, Massachusetts, USA
  2. 2 Military Performance Division, US Army Research Institute of Environmental Medicine, Natick, Massachusetts, USA
  3. 3 Research Service, VA Boston Health Care System, Boston, Massachusetts, USA
  4. 4 Department of Neurology, Boston University School of Medicine, Boston, Massachusetts, USA
  1. Correspondence to Dr Alexis L Maule; lex0210{at}


Objectives Across diverse groups of Gulf War (GW) veterans, reports of musculoskeletal pain, cognitive dysfunction, unexplained fatigue, chronic diarrhoea, rashes and respiratory problems are common. GW illness is a condition resulting from GW service in veterans who report a combination of these symptoms. This study integrated the GW literature using meta-analytical methods to characterise the most frequently reported symptoms occurring among veterans who deployed to the 1990–1991 GW and to better understand the magnitude of ill health among GW-deployed veterans compared with non-deployed GW-era veterans.

Design Meta-analysis.

Methods Literature databases were searched for peer-reviewed studies published from January 1990 to May 2017 reporting health symptom frequencies in GW-deployed veterans and GW-era control veterans. Self-reported health symptom data were extracted from 21 published studies. A binomial-normal meta-analytical model was used to determine pooled prevalence of individual symptoms in GW-deployed veterans and GW-era control veterans and to calculate combined ORs of health symptoms comparing GW-deployed veterans and GW-era control veterans.

Results GW-deployed veterans had higher odds of reporting all 56 analysed symptoms compared with GW-era controls. Odds of reporting irritability (OR 3.21, 95% CI 2.28 to 4.52), feeling detached (OR 3.59, 95% CI 1.83 to 7.03), muscle weakness (OR 3.19, 95% CI 2.73 to 3.74), diarrhoea (OR 3.24, 95% CI 2.51 to 4.17) and rash (OR 3.18, 95% CI 2.47 to 4.09) were more than three times higher among GW-deployed veterans compared with GW-era controls.

Conclusions The higher odds of reporting mood-cognition, fatigue, musculoskeletal, gastrointestinal and dermatological symptoms among GW-deployed veterans compared with GW-era controls indicate that these symptoms are important when assessing GW veteran health status.

  • meta-analysis
  • Gulf War veterans
  • health symptoms
  • deployment health
  • Gulf War illness

This is an Open Access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited and the use is non-commercial. See:

Strengths and limitations of this study

  • This meta-analysis pulls together the largest combined data set published to date from 21 studies representing self-reported health symptom data from over 129 000 Gulf War (GW)-deployed veterans and GW-era control veterans from four different countries, all branches of the military, and both Active Duty and Reserve components of the military.

  • The binomial-normal model used to calculate the summary estimates is specific to binary outcomes (eg, proportions and ORs).

  • Meta-analyses lack individual-level data; therefore, some covariates relevant to symptom reporting could not be assessed.

  • Non-reporting bias limits the comparison of primary studies with null or negative findings.


From 1990 to early 1991, approximately 700 000 troops from the USA, along with military personnel from over 30 coalition countries, were deployed to the Persian Gulf in support of Operation Desert Shield and Operation Desert Storm, collectively known as the Gulf War (GW).1 After returning from the Persian Gulf, US GW veterans reported greater deployment-related health problems when compared with veterans of the same era who did not deploy to the Gulf or who were deployed elsewhere (eg, Bosnia, Germany).2–14 Similar reports of increased ill health were seen in GW veterans from other countries, including the UK,15–19 Australia,20 Denmark,21 Canada22 and France.23 Research indicates that US GW veterans developed certain chronic conditions at higher rates than their non-deployed counterparts (repeated seizures, neuralgia or neuritis, migraine headaches, and stroke) when assessed by self-report and where a representative sampling was verified by medical record reviews.5 8 13 18 24 25 Higher rates of chronic diseases were also shown in longitudinal assessments of Australian GW veterans who had reported higher health symptoms on initial assessment when compared with their low symptom reporting counterparts.26 In addition, about 25%–32% of deployed GW veterans reported symptoms from a common set of health complaints that have been used to construct criteria for a syndrome known as Gulf War illness (GWI).1 4 13 27 28 These symptoms have remained chronic with no improvement over time.10 29–33 GWI remains the primary chronic health disorder of GW deployment affecting over 200 000 deployed veterans.1 27 28

GWI is characterised in individual veterans by combinations of the following symptoms: chronic pain, fatigue, cognitive dysfunction, gastrointestinal complaints, respiratory symptoms and skin rashes, depending on the case definition employed and corresponding exclusion criteria.1 4 13 27 34 Although there has been some controversy over whether GWI is a unique syndrome because of overlap of symptoms in affected and non-affected veterans,3 20 35 this central group of symptoms has consistently been observed in investigations of USA, UK and Australian GW-deployed populations and has been used to determine case criteria for the illness.1 27 28 30

Two case definitions for GWI have been endorsed by the Institute of Medicine (IOM) for use in clinical diagnosis and research:34 the Centers for Disease Control and Prevention (CDC) chronic multisymptom illness (CMI) definition4 and the Kansas GWI definition.13 According to the CDC CMI case definition, a veteran is diagnosed with GWI if she/he reports one or more symptoms that last for at least 6 months in at least two of three categories: fatigue, musculoskeletal pain and mood/cognition.4 The Kansas definition requires moderate levels of self-reported symptoms in at least three out of six symptom categories: fatigue/sleep, pain, neurological/cognitive/mood, respiratory, gastrointestinal and skin.13 A third set of symptoms used to define GWI is the Haley criteria.36 These include three syndromes characterised by different symptom clusters. Syndrome 1 (impaired cognition) requires reported attention, memory, sleep and depression symptoms. Syndrome 2 (confusion/ataxia) requires reported thinking and balance symptoms. Syndrome 3 (neuropathic pain) requires self-reported joint and muscle pain.36
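The counting rules in the CDC CMI and Kansas definitions can be illustrated with a minimal sketch. It follows only the wording in the paragraph above; the function names and input layout are hypothetical, and the exclusionary conditions and severity scoring of the published definitions are not modelled.

```python
from typing import Dict, Set

# Category labels are taken from the text; the data layout is an assumption.
CDC_CATEGORIES = {"fatigue", "musculoskeletal pain", "mood/cognition"}
KANSAS_CATEGORIES = {
    "fatigue/sleep", "pain", "neurological/cognitive/mood",
    "respiratory", "gastrointestinal", "skin",
}


def meets_cdc_cmi(chronic_symptom_counts: Dict[str, int]) -> bool:
    """Return True if one or more symptoms lasting at least 6 months are
    reported in at least two of the three CDC categories.

    chronic_symptom_counts maps a category name to the number of
    symptoms in that category that have lasted 6 months or longer.
    """
    positive = [c for c in CDC_CATEGORIES
                if chronic_symptom_counts.get(c, 0) >= 1]
    return len(positive) >= 2


def meets_kansas(moderate_categories: Set[str]) -> bool:
    """Return True if at least moderate symptoms are reported in at least
    three of the six Kansas categories (exclusions are ignored here)."""
    return len(moderate_categories & KANSAS_CATEGORIES) >= 3


# Hypothetical respondents, for illustration only.
print(meets_cdc_cmi({"fatigue": 1, "mood/cognition": 2}))  # True
print(meets_kansas({"pain", "skin"}))                      # False
```

Because the two definitions count different categories and apply different thresholds, the same respondent can be a case under one and not the other, which is one reason the prevalence estimates differ between definitions.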

Uncertainty remains about the prevalence of GWI across GW-deployed populations because of differences in the study populations used to derive the case definitions and the methods used to ask about health symptoms. The Kansas definition is associated with a more consistent rate of GWI across multiple GW-deployed populations (34% prevalence in GW-deployed veterans) but excludes veterans with certain concomitant medical or psychiatric conditions who may also have GWI.13 28 34 37 In contrast, depending on the population studied, the CDC case definition includes between 29% and 60% of GW veterans and is considered the most inclusive but the least specific of the case criteria.34 Finally, the Haley criteria provide a more restrictive characterisation of GWI.38 The syndromes were originally devised by assessing a specific military unit of US Navy Seabees who showed a 20% rate of GWI.36 More current estimates in a larger population-based cohort showed that the combined Haley syndromes include about 14% of GW veterans.6

The epidemiological literature on health symptoms among GW veterans has identified environmental exposures unique to deployment to the Persian Gulf as aetiological agents in the development of specific health outcomes including brain cancer mortality and the occurrence of GWI.1 27 28 39 40 Troops were often exposed to a complex mixture of chemical and physical hazards making it difficult to establish causal links between individual exposures and health outcomes. However, exposures that have been linked to health effects (eg, cognitive dysfunction, mood complaints and respiratory problems) in this veteran population include oil well fires,33 pesticides,10 pyridostigmine bromide pills20 36 37 and chemical nerve gas agents,10 11 18 with pesticides and pyridostigmine bromide exposures most consistently linked to GWI.1 However, deployment experiences and exposures were not uniform across all troops deployed to the GW.41 42 Some studies have used unit-level characteristics as surrogates of deployment exposures and found that illness rates in GW-deployed veterans were associated with deployment location5 7 13 37 41 43 and time frame of deployment (ie, Operation Desert Shield, Operation Desert Storm).5 13 44

Collectively, prior studies have used several analytical techniques in separate cohorts to identify symptom prevalence rates and a common complex of symptoms in the GW-deployed population evaluated by each investigation. These analytical methods have included cluster analysis,45 correlation analysis13 and factor analysis.3 4 9 11 15 36 46–49 Several studies have used meta-analytical models to examine pain, depression, psychiatric disorders, alcohol and substance use, multisymptom illness and neuropsychological performance pooled across different GW veteran populations and their controls.50–55

The present study is the first, to our knowledge, to pool published self-reported health symptom data from different populations of GW-deployed veterans and their controls. It uses meta-analytical statistical methods to: (1) identify symptoms reported with the highest frequency in GW-deployed veterans and their controls, (2) determine summary ORs of symptom reporting in GW-deployed veterans compared with their controls and (3) examine the differences in symptom reporting between population-based GW cohort studies and GW cohorts recruited from specific military units.


This meta-analysis was designed and conducted in accordance with the Preferred Reporting Items for Systematic Reviews and Meta-analyses guidelines.56

Data search

Two members of the research team (ALM, MKY) used the literature search strategy in figure 1 to identify studies examining self-reported health symptoms in deployed GW veterans and a relevant veteran comparison group. Medline and Google Scholar searches were limited to ‘Human Subjects’ and ‘English’ language studies published between January 1990 and May 2017 (Medline search detailed in the online supplementary file). References from comprehensive reports on GW literature (eg, Research Advisory Committee on Gulf War Veterans’ Illnesses1 27 and IOM34) were also reviewed. This process was performed in duplicate to ensure that all relevant, peer-reviewed GW health symptom studies were identified and reviewed for possible inclusion.

Figure 1

Meta-analysis literature search strategy (* syntax indicates that all variations of the word were searched by the databases, eg, symptom* searched for symptoms, symptomatology, etc). OEF/OIF, Operation Enduring Freedom/Operation Iraqi Freedom.

Studies found during the literature search were included in the final analysis if the frequency of self-reported health symptoms was reported for both GW-deployed veterans and a relevant veteran control group. ‘GW-deployed’ veterans were defined as veterans who deployed to the Gulf area in support of the 1990–1991 GW. A relevant comparison group was defined as non-deployed veterans or veterans serving in the military during the 1990–1991 GW period who deployed to areas other than the Gulf (eg, Germany, Bosnia). This group will be referred to as ‘GW-era controls’ throughout the rest of the manuscript. Studies were excluded if the study was conducted in theatre or during a conflict other than the GW, did not include a relevant comparison group and/or did not include self-reported health data.

The strategy used to determine the inclusion and exclusion of studies is further outlined in the four steps below (figure 1):

Step 1: Following the literature search, study titles, abstracts and full manuscripts were reviewed against the eligibility criteria. Studies were excluded at this step if: (1) the study population included veterans of other wars or civilians in conflict zones; (2) the study data were collected in theatre or (3) the title was a duplicate or the paper was an editorial commentary.

Step 2: Studies were eliminated if the study’s outcome of interest was not health symptoms or health status. From step 2 forward, if it was unclear whether the study met the inclusion/exclusion criteria by reviewing the study title and abstract, full articles were reviewed.

Step 3: Studies were removed if the investigation: (1) had no relevant veteran comparison group; (2) was a follow-up survey to an original cohort or (3) did not include self-reported health symptoms/conditions.

Step 4: Studies were eliminated for (1) overlapping GW veteran populations or (2) no usable data.

Data extraction

When published papers were found to have used survey results from the same veteran population (eg, survey data completed by four Air Force units were published by both CDC Morbidity and Mortality Weekly Report 2 and Fukuda et al 4), the prevalence data were extracted from the paper that presented results for the greatest number of self-reported health symptoms. If unique symptoms were reported in the second paper from the same veteran population, those specific symptoms were extracted from the second paper. One of the eligibility criteria for step 4 was the availability of usable data. If manuscripts published descriptive statistics other than symptom frequency (eg, mean symptom severity score, factor loading score), the corresponding author was contacted with a request for frequency data. If a follow-up request went unanswered, the study was eliminated.

The symptom checklists and wording of specific symptoms differed between studies. To determine which health symptoms matched across studies, members of the study team (ALM, PAJ, KAS, MHK) completed a qualitative comparison. For example, the Knoke et al 9 health symptom checklist includes ‘chest pains’ while Simmons et al 17 used ‘chest pains and tightness’, and a consensus was reached that these symptoms were comparable and both were included in the analysis of ‘chest pain.’ Once the final list of health symptoms that matched across studies was determined, the quantitative data were extracted if the symptom was reported in three or more studies. A member of the study team (ALM) extracted total n, symptomatic n, frequency, SE and unadjusted OR for both the GW-deployed veteran and GW-era veteran groups. This process was done in duplicate to assure the accuracy of the data extracted from tables in published studies. If any of the statistics listed above were not included, they were calculated using data that could be extracted.
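When a study reports only counts, the remaining statistics can be derived from them. The sketch below assumes the usual binomial SE for a proportion and the Woolf (log-scale) SE for the OR; the source does not say which formulas were applied, so both are assumptions, and the counts in the example are invented.

```python
import math


def proportion_and_se(symptomatic: int, total: int):
    """Symptom frequency and its binomial standard error."""
    p = symptomatic / total
    se = math.sqrt(p * (1.0 - p) / total)
    return p, se


def unadjusted_or(sym_deployed: int, n_deployed: int,
                  sym_control: int, n_control: int):
    """Unadjusted OR with a 95% CI computed on the log scale.

    Raises ZeroDivisionError if any cell of the 2x2 table is zero;
    no continuity correction is applied in this sketch.
    """
    a = sym_deployed                 # deployed, symptomatic
    b = n_deployed - sym_deployed    # deployed, asymptomatic
    c = sym_control                  # control, symptomatic
    d = n_control - sym_control      # control, asymptomatic
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - 1.96 * se_log_or)
    upper = math.exp(math.log(odds_ratio) + 1.96 * se_log_or)
    return odds_ratio, se_log_or, (lower, upper)


# Invented counts: 400 of 1000 deployed and 200 of 1000 controls report a symptom.
p, se = proportion_and_se(400, 1000)
odds_ratio, se_log_or, ci = unadjusted_or(400, 1000, 200, 1000)
print(round(p, 3), round(odds_ratio, 2))
```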

Data analysis

To organise the output of the data analysis, the health symptoms were categorised into symptom domains by body system (neurological, mood cognition, sleep/fatigue, musculoskeletal, gastrointestinal, dermatological, cardiac, genitourinary, pulmonary and miscellaneous). Forest plots of one symptom from each of these categories are provided in the online supplementary figures.

For the meta-analysis of symptom prevalence in GW-deployed veterans, a random-effects binomial-normal model was used to estimate the combined log odds of a symptom and to calculate the pooled prevalence rate of individual symptoms.57 58 The random-effects binomial-normal model accounts for the heterogeneity between studies, the binomial distribution of proportions and the normal distribution of the study-specific odds around the summary estimates (µ) with a variance term, τ².57 The starting values for µ and τ² were set using the summary log odds estimate from the fixed-effects model and the variance term from the maximum likelihood random-effects model, respectively.57 This process was repeated separately for the GW-era controls to yield a combined symptom prevalence in GW-era controls across studies. Next, for the meta-analysis of ORs comparing symptom reporting in GW-deployed veterans to GW-era controls, a random-effects binomial-normal model was used to estimate combined log ORs.58 In the model estimating log ORs, an offset term, log(n_GW-deployed/n_GW-era controls), was included to take into account different sample sizes of GW-deployed veterans and GW-era controls within a study.58 The I² value was calculated to assess heterogeneity between the studies used in the meta-analysis.
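The heterogeneity statistic can be illustrated with a short sketch. The source does not state how I² was computed; the code below assumes the common formulation based on Cochran's Q with inverse-variance weights, applied to study-level log ORs and their SEs. The function name and example values are hypothetical.

```python
from typing import Sequence


def i_squared(estimates: Sequence[float],
              standard_errors: Sequence[float]) -> float:
    """I^2 (as a percentage) from Cochran's Q.

    estimates are study-level effects on an additive scale
    (eg, log ORs); standard_errors are their SEs.
    """
    weights = [1.0 / se ** 2 for se in standard_errors]
    pooled = (sum(w * y for w, y in zip(weights, estimates))
              / sum(weights))
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, estimates))
    df = len(estimates) - 1
    if q <= 0.0:
        return 0.0  # no observed dispersion at all
    return max(0.0, (q - df) / q) * 100.0


# Invented log ORs with equal SEs; widely spread estimates give a high I^2.
print(i_squared([0.0, 1.0, 2.0], [0.1, 0.1, 0.1]))
```

I² is truncated at zero, so studies whose estimates agree more closely than chance alone would predict are reported as 0% rather than a negative value.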

Confounding and bias assessment

In a meta-analysis, study characteristics are explored as potential confounders since individual-level data are not available. In previous studies, unit-level characteristics (eg, deployment location and deployment time frame) have been used as surrogates of deployment exposures since in theatre experiences were not the same across troops deployed to the GW. Studies included in our analysis either had participants who were recruited from specific military unit cohorts or participants who were sampled from a population-based cohort of GW-deployed and GW-era controls. Using participant cohort sampling strategy as a surrogate for deployment exposures, we performed a stratified analysis to explore the effect of confounding by participant cohort sampling strategy on the symptom ORs. If the symptom was reported by three or more studies in each stratum, the summary OR was estimated for each stratum using the binomial-normal model described above.
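The eligibility rule for the stratified analysis (a symptom must be reported by three or more studies in each stratum) can be sketched as a simple filter. The record layout, stratum labels and study identifiers below are hypothetical.

```python
from collections import defaultdict
from typing import Iterable, List, Tuple

MIN_STUDIES = 3  # threshold stated in the text


def symptoms_eligible_for_stratification(
        records: Iterable[Tuple[str, str, str]]) -> List[str]:
    """Return symptoms reported by at least MIN_STUDIES distinct
    studies in every stratum.

    Each record is (study_id, stratum, symptom).
    """
    studies = defaultdict(lambda: defaultdict(set))
    strata = set()
    for study_id, stratum, symptom in records:
        studies[symptom][stratum].add(study_id)
        strata.add(stratum)
    return sorted(
        symptom for symptom, by_stratum in studies.items()
        if all(len(by_stratum.get(s, ())) >= MIN_STUDIES for s in strata)
    )


# Invented records: fatigue qualifies in both strata, rash in only one.
records = (
    [(s, "military-unit", "fatigue") for s in ("A", "B", "C")]
    + [(s, "population-based", "fatigue") for s in ("D", "E", "F")]
    + [(s, "military-unit", "rash") for s in ("A", "B", "C")]
    + [("D", "population-based", "rash")]
)
print(symptoms_eligible_for_stratification(records))
```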

We performed a qualitative and quantitative bias assessment. First, the studies included in the analysis were qualitatively evaluated using a hypothesis validity checklist of the ‘threats to validity’ outlined by Wampold et al 59 (online supplementary table 1). The validity testing methodology was developed to critique clinical research and allows readers to evaluate: (1) hypothesis validity (eg, ambiguous or non-directional hypotheses); (2) statistical conclusion validity (eg, low statistical power or error rate problems (type I/type II error)); (3) internal validity (eg, selection bias or loss to follow-up); (4) construct validity (eg, mono-operation and mono-method bias) and (5) external validity (eg, generalisability). Each study was read and critiqued independently by two of four evaluators (ALM, MKY, AF, RG) to determine the presence or absence of specific ‘threats to validity’ using a multi-item checklist. Any discrepancies in the ratings were discussed before reaching a consensus final validity rating for each study. Studies were not eliminated based on the validity assessment.

Next, to assess publication and non-reporting bias on the summary ORs, we used a method described in Levy et al. 60 52 For the studies that did not report an OR for a health symptom, the OR for that health symptom was assigned the null (OR=1.0) and the SE was assumed to be the same as the minimum SE among the reported studies. The summary OR was estimated using a maximum likelihood random-effects model. We could not use the binomial-normal model for the bias assessment because it relies on counts rather than ORs for the binomial level of the model and for the offset term. Although these models yield slightly different results for summary OR estimates, they provide comparable estimates of the SEs.
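The imputation step can be sketched as follows. Studies that did not report a symptom are given a log OR of 0 (OR=1.0) and the smallest SE among the reporting studies, and the summary estimate is then recomputed. The text describes a maximum likelihood random-effects model for this step; the sketch substitutes the simpler DerSimonian-Laird moment estimator, so its numbers would not match the published ones. All names and values are hypothetical.

```python
import math
from typing import List, Tuple

Study = Tuple[float, float]  # (log OR, SE of log OR)


def impute_null_studies(reported: List[Study], n_missing: int) -> List[Study]:
    """Append n_missing studies with OR = 1 and the minimum reported SE."""
    min_se = min(se for _, se in reported)
    return list(reported) + [(0.0, min_se)] * n_missing


def dersimonian_laird(studies: List[Study]):
    """Random-effects pooled log OR, its SE and tau^2 (moment estimator)."""
    y = [est for est, _ in studies]
    v = [se ** 2 for _, se in studies]
    w = [1.0 / vi for vi in v]
    sum_w = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, y)) / sum_w
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, y))
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - (len(y) - 1)) / c) if c > 0 else 0.0
    w_re = [1.0 / (vi + tau2) for vi in v]
    pooled = sum(wi * yi for wi, yi in zip(w_re, y)) / sum(w_re)
    return pooled, math.sqrt(1.0 / sum(w_re)), tau2


# Invented example: three reporting studies, two that omitted the symptom.
reported = [(math.log(3.0), 0.20), (math.log(2.5), 0.30), (math.log(3.5), 0.25)]
augmented = impute_null_studies(reported, n_missing=2)
before, _, _ = dersimonian_laird(reported)
after, se_after, _ = dersimonian_laird(augmented)
lower = math.exp(after - 1.96 * se_after)
print(math.exp(before), math.exp(after), lower > 1.0)
```

The imputed studies pull the summary OR towards the null; the association survives the assessment only if the lower confidence limit stays above 1.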


The literature search identified 38 peer-reviewed studies examining self-reported health symptoms in GW-deployed veterans and GW-era control veterans. Sixteen of these studies were excluded because their study populations overlapped with another identified study and reported no unique health symptoms. Of the remaining papers, we extracted primary data directly from 19 of the studies. We contacted the authors of three additional papers to obtain primary data and received data for two of these studies.6 15 We did not receive primary data for the third study, which was not included in the analysis.61 Table 1 gives an overview of the final studies used in the meta-analysis, which included data from over 129 000 GW-deployed veterans and GW-era controls from four different countries, all branches of the military, and both Active Duty and Reserve components of the military. The GW-era controls in Proctor et al 10 were deployed to Germany; all other GW-era controls were non-deployed during the period of the 1990–1991 GW. Eleven of the studies sampled participants from specific military units (eg, US Navy Seabees) and 10 were population-based studies (table 1).

Table 1

Overview of 21 peer-reviewed studies used in health symptom meta-analysis

A total of 56 distinct health symptoms were reported in three or more studies and included in the meta-analysis. Table 2 shows the pooled prevalence of each symptom in GW-deployed veterans and GW-era controls. In GW-deployed veterans, lacking energy (combined prevalence (CP): 47.0%), flatulence/burping (CP=44.6%), unrefreshing sleep (CP=42.9%), headaches (CP=42.6%) and fatigue (CP=40.8%) had the highest CP (table 2).

Table 2

Combined prevalence of self-reported symptoms in Gulf War deployed and Gulf War-era control populations

Table 3 presents the summary ORs of reporting symptoms in GW-deployed veterans compared with GW-era controls. GW-deployed veterans had significantly higher odds of reporting all of the analysed symptoms. The odds of reporting mood-cognition (feeling detached: OR=3.59; irritability: OR=3.21), musculoskeletal (muscle weakness/loss of strength: OR=3.19), gastrointestinal (diarrhoea/loose stools: OR=3.24) and dermatological (rash: OR=3.18) symptoms were over three times higher in GW-deployed veterans compared with GW-era controls.

Table 3

Meta-analysis of self-reported health symptoms in Gulf War-deployed veterans compared with Gulf War-era control groups

The bias assessment demonstrated that GW-deployed veterans continued to have higher odds of reporting all the analysed symptoms compared with GW-era controls, and the majority of the ORs shown in table 3 remained significant after assigning the missing studies an OR=1 and the minimum SE of the reported studies. However, the summary measures of effect for loss of balance/coordination, feeling detached, lacking energy, joint swelling, flatulence or burping, vomiting, itching, sweating, pain during intercourse, asthma, bleeding gums, lump in throat, swollen glands and weight gain were no longer significant (ie, 95% CI for the OR included null) when accounting for possible publication and non-reporting bias (table 3).

Using the ‘threats to validity’ checklist, the studies conducted by CDC,2 Cherry et al,15 Doebbeling et al,3 Gray et al,5 Simmons et al 17 and Steele13 had the greatest risk for bias among the studies included in the meta-analysis. The most common ‘threats to validity’ among these studies were external validity (eg, generalisability) and construct validity (eg, lack of specificity for exposure and outcome measures).

In the meta-analysis stratified by sampling strategy (military-unit vs population-based studies), a total of 19 distinct health symptoms were reported in three or more studies in both strata and were included in the analysis. The results of the meta-analysis stratified by sampling strategy showed that ORs moved further from the null compared with the unadjusted meta-analysis (unadjusted results shown in table 3) in studies with participants recruited from specific military units, for all but 2 of the 19 analysed symptoms (table 4). For self-reported dizziness, irritability, fatigue and several musculoskeletal, gastrointestinal and dermatological symptoms, the adjusted OR was more than a 10% change away from the null compared with the unadjusted symptom OR (table 4).
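The 10% comparison can be read as a change-in-estimate check. The source does not give the formula, so the sketch below assumes a relative change computed against the unadjusted OR, combined with a test that the stratum-specific OR lies farther from 1 on the log scale. The function names and ORs are invented for illustration.

```python
import math


def percent_change(or_unadjusted: float, or_stratum: float) -> float:
    """Relative change of the stratum-specific OR versus the unadjusted OR."""
    return (or_stratum - or_unadjusted) / or_unadjusted * 100.0


def moved_away_from_null(or_unadjusted: float, or_stratum: float,
                         threshold_pct: float = 10.0) -> bool:
    """True if the stratum OR is farther from 1 than the unadjusted OR
    and differs from it by more than threshold_pct per cent."""
    farther = abs(math.log(or_stratum)) > abs(math.log(or_unadjusted))
    changed = abs(percent_change(or_unadjusted, or_stratum)) > threshold_pct
    return farther and changed


# Invented ORs: 2.0 unadjusted versus 2.5 in the military-unit stratum.
print(percent_change(2.0, 2.5), moved_away_from_null(2.0, 2.5))
```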

Table 4

Meta-analysis of self-reported health symptoms in Gulf War-deployed veterans compared with Gulf War-era controls stratified by cohort sampling strategy


Using meta-analytical models, we combined data from 21 studies reporting on health symptoms endorsed by over 129 000 GW-deployed veterans and GW-era controls. These 21 studies represented 1990–1991 GW veterans from 18 unique veteran populations, four different countries and all branches of the military. Results of the meta-analysis showed GW-deployed veterans had increased odds of reporting all of the analysed symptoms compared with GW-era controls, indicating that the health problems associated with GW deployment include widespread, multiple body symptoms. Furthermore, the odds of GW-deployed veterans’ reporting feeling detached, irritability, muscle weakness, diarrhoea and rash were more than three times higher than GW-era controls.

Additionally, in the unadjusted meta-analysis, the group of symptoms with the highest CP among GW-deployed veterans (fatigue, lacking energy, headaches and unrefreshing sleep) and the largest summary ORs comparing GW-deployed veterans to GW-era controls (irritability, feeling detached, muscle weakness, diarrhoea and rash) have been associated with objective brain imaging outcomes in recent studies,62–65 are consistent with some of the symptoms included in GWI case definitions (CDC multisymptom illness, Kansas case definition and Haley criteria), and have been reported by GW veterans diagnosed with GWI.4 13 36 Smith et al 66 recently reported that nearly half of all respondents in their population-weighted sample endorsed symptoms in all three CDC criteria categories (fatigue, mood cognition, musculoskeletal), with 96% of GWI cases reporting mood-cognition symptoms. However, we could not statistically analyse the overlap of our results with symptoms identified in current GWI case definitions because of a lack of individual symptom data.

We also characterised studies based on their cohort sampling strategy and performed a stratified meta-analysis comparing population-based studies to military unit-based studies, using military unit as a surrogate for deployment exposures. The stratified analysis showed evidence of confounding by sampling strategy. In studies where participants were sampled from specific military units, the adjusted summary ORs were higher compared with the unadjusted summary ORs. These results agree with previous studies that found GW veteran health problems were associated with deployment/operational time frame5 44 and location5 13 and may be reflective of specific deployment exposures experienced by different military units in the GW theatre.36–42

In our stratified analysis, several of the symptoms with higher adjusted ORs in the military-unit cohort studies have been associated with GW exposures in previous research. For example, in the Fort Devens cohort, Proctor et al 10 found that musculoskeletal symptom reporting was associated with pesticide and chemical warfare agent exposure, while neurological and psychological symptoms were linked to self-reported exposure to debris from SCUDS and chemical warfare agents. Similarly, McCauley et al 67 found that self-reported exposure to chemical warfare agents was associated with fatigue and gastrointestinal symptoms, and Cherry et al 68 found that self-reported exposure to pesticides was related to neurological, dermatological and musculoskeletal symptoms. Using the Kansas case definition of GWI, GW veterans with self-reported exposure to pesticides and pyridostigmine bromide pills were more likely to meet GWI case criteria compared with unexposed GW veterans.37 In addition, Iannacchione et al 6 reported that GW veterans who served as air flight crew or Army Special Forces during the war were 10 times more likely to meet criteria for GWI, as defined by the Haley syndrome criteria.

A major strength of this meta-analysis is the method used to estimate the summary measures of effect. The binomial-normal model is recommended for rare events, which made the analysis of some of the lesser reported health symptoms more robust. Moreover, the binomial-normal model is designed to analyse binary outcomes and take into account the non-normal (eg, binomial) distribution of the prevalence and OR effect estimate, in contrast to the fixed-effects or maximum likelihood random-effects models, which assume normal distributions of effect estimates and are the traditional meta-analytical approach.
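The contrast between the two approaches can be written out for the prevalence model. The notation is an assumption introduced here, apart from µ and τ², which appear in the Methods: y_i is the number of symptomatic veterans among the n_i respondents in study i, θ_i is the study-specific log odds, and s_i is the within-study SE of the estimated log odds.

```latex
% Binomial-normal model: exact binomial likelihood within each study
y_i \mid \theta_i \sim \mathrm{Binomial}\!\left(n_i,\ \frac{e^{\theta_i}}{1 + e^{\theta_i}}\right),
\qquad \theta_i \sim \mathcal{N}(\mu, \tau^2)

% Traditional normal-normal model: normal approximation within each study
\hat{\theta}_i \mid \theta_i \sim \mathcal{N}(\theta_i, s_i^2),
\qquad \theta_i \sim \mathcal{N}(\mu, \tau^2)

% Pooled prevalence recovered from the summary log odds
\hat{p} = \frac{e^{\hat{\mu}}}{1 + e^{\hat{\mu}}}
```

The normal approximation in the second model treats s_i as known and becomes unreliable when y_i is small, which is why the exact binomial form is preferred for rarely reported symptoms.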

As mentioned previously, a limitation of a meta-analysis is the lack of individual-level data. Consequently, we were not able to assess the effect of some covariates relevant to health symptom reporting (eg, post-traumatic stress disorder and specific deployment exposures). While some of the primary studies published adjusted ORs,5 7 10 13 16–20 these effect measures were not adjusted for the same covariates across all studies. This limits comparability of the combined study data and increases the heterogeneity across studies; therefore, we extracted or calculated unadjusted ORs in this meta-analysis. In a meta-analysis, analysing study-level covariates controls for some heterogeneity across studies caused by differences in study methods and increases the confidence in the summary measure of effect.

The stratified analysis, examining sampling strategy (population-based cohorts vs military-unit cohorts) as a surrogate for differences in deployment experiences in specific military units compared with the entire GW-deployed population, also had its limitations. We recognise that within each stratum deployment experiences could still vary; however, we could not examine any associations with specific exposures because of a lack of individual or unit-level exposure data. Furthermore, increased symptom reporting in military-unit studies compared with population-based studies could be due to differences other than deployment exposures (eg, more selective population, different predeployment training) between the study populations. Other study characteristics that could affect symptom reporting, such as country of origin or data collection time frame (ie, number of years postdeployment), were unaccounted for in this meta-analysis.

Another limitation of the meta-analytical approach is the effect of publication and non-reporting bias on results. Publication bias occurs when studies with positive findings are more likely to be published than studies with null and/or negative findings. Non-reporting bias occurs when studies fail to publish non-significant results. In this analysis, we were limited to peer-reviewed, published literature on GWI and then further limited by the number of health symptoms included in study questionnaires and reported by each study. To address the latter issue, we performed a bias analysis where individual study ORs were assigned the null value for a symptom that was unreported. The meta-analysis was rerun with the null ORs, and 42 out of the 56 summary ORs remained significant (table 3), demonstrating that the significant associations between GW veteran status and self-reported health symptoms cannot be attributed solely to publication bias.


Results of this meta-analysis of 21 health symptom studies provide the first comprehensive reference of pooled health symptom data from over 129 000 deployed GW and GW-era control veterans representing four different countries and all branches of the military. The increased odds of symptom reporting among GW-deployed veterans compared with GW-era controls for symptoms in the mood-cognition, fatigue, musculoskeletal, gastrointestinal and dermatological categories suggest that these symptoms should continue to be used in symptom surveys when assessing GW veterans for health status, illness biomarkers or treatment trial efficacy.1 27 34 The stratified analysis demonstrates important differences by study sampling strategy, with higher symptom ORs in studies of specific military-unit cohorts, potentially reflecting symptoms that are associated with specific deployment-related exposures that warrant further study.


We thank Dr Nicola Cherry and Dr Robert Haley for providing additional primary data to allow their studies to be included in this meta-analysis. We also thank Dr Michael LaValley for his assistance with statistical analysis during the revision process, Rachel Grashow and Ariella Fineman for their help with the validity assessment and Emily Sisson for creating the forest plots.




  • Contributors ALM, PAJ, KAS, MHK and MKY made substantial contributions to the design of the work, in addition to the acquisition and analysis of the data. ALM, PAJ, KAS, MHK, MKY, MM and RFW contributed significantly to the interpretation of data and the critical review of the work. All authors have given final approval of the submitted manuscript.

  • Funding This work was partially supported by a GWI Consortium award (W81XWH-13-2-00072) from the US DoD Congressionally Directed Medical Research Program (CDMRP) to KAS.

  • Competing interests None declared.

  • Patient consent Not required.

  • Provenance and peer review Not commissioned; externally peer reviewed.

  • Data sharing statement All of the data, with the exception of Iannacchione et al and Cherry et al, were extracted from published, peer-reviewed journal articles. Corresponding authors from Iannacchione et al and Cherry et al were contacted for the primary data relevant to this meta-analysis.