Abstract
Objectives There is a growing body of literature on malaria forecasting methods and the objective of our review is to identify and assess methods, including predictors, used to forecast malaria.
Design Scoping review. Two independent reviewers searched information sources, assessed studies for inclusion and extracted data from each study.
Information sources Search strategies were developed and the following databases were searched: CAB Abstracts, EMBASE, Global Health, MEDLINE, ProQuest Dissertations & Theses and Web of Science. Key journals and websites were also manually searched.
Eligibility criteria for included studies We included studies that forecasted incidence, prevalence or epidemics of malaria over time. A description of the forecasting model and an assessment of the forecast accuracy of the model were requirements for inclusion. Studies were restricted to human populations and to autochthonous transmission settings.
Results We identified 29 different studies that met our inclusion criteria for this review. The forecasting approaches included statistical modelling, mathematical modelling and machine learning methods. Climate-related predictors were used consistently in forecasting models, with the most common predictors being rainfall, relative humidity, temperature and the normalised difference vegetation index. Model evaluation was typically based on a reserved portion of data and accuracy was measured in a variety of ways including mean-squared error and correlation coefficients. We could not compare the forecast accuracy of models from the different studies as the evaluation measures differed across the studies.
Conclusions Applying different forecasting methods to the same data, exploring the predictive ability of non-environmental variables, including transmission-reducing interventions, and using common forecast accuracy measures will allow malaria researchers to compare and improve models and methods, which should improve the quality of malaria forecasting.
 Infectious Diseases
This is an open-access article distributed under the terms of the Creative Commons Attribution Non-commercial License, which permits use, distribution, and reproduction in any medium, provided the original work is properly cited, the use is non-commercial and is otherwise in compliance with the license. See: http://creativecommons.org/licenses/by-nc/2.0/ and http://creativecommons.org/licenses/by-nc/2.0/legalcode.
Article summary
Article focus

Accurate predictions of malaria can provide public health and clinical health services with the information needed to strategically implement prevention and control measures.

The diversity in forecasting accuracy measures and the use of scale-dependent measures limit the comparability of forecasting results, making it difficult to identify the optimal predictors and methods for malaria forecasting.

The objective was to identify and assess methods, including predictors, used to forecast malaria.
Key messages

When performing forecasting, it is important to understand the assumptions of each method as well as the associated advantages and disadvantages.

Common accuracy measures are essential as they will facilitate the comparison of findings between studies and methods.

Applying different forecasting methods to the same data and exploring the predictive ability of non-environmental variables, including transmission-reducing interventions, are necessary next steps as they will help determine the optimal approach and predictors for malaria forecasting.
Strengths and limitations of this study

The strength of this review is that it is the first review to systematically assess malaria forecasting methods and predictors, and the recommendations in the review, if followed, should lead to improvement in the quality of malaria forecasting.

A limitation of a literature review is that unpublished methods, if any, are omitted from this review.
Introduction
In 1911, Christophers1 developed an early-warning system for malaria epidemics in Punjab based on rainfall, fever-related deaths and wheat prices. Since that initial system, researchers and practitioners have continued to search for determinants of spatial and temporal variability of malaria to improve systems for forecasting disease burden. Malaria forecasting is now conducted in many countries and typically uses data on environmental risk factors, such as climatic conditions, to forecast incidence for a specific geographic area over a certain period of time.
Malaria can be forecasted using an assortment of methods and significant malaria predictors have been identified in a variety of settings. Our objective was to identify and assess methods, including predictors, used to forecast malaria. This review is intended to serve as a resource for malaria researchers and practitioners to inform future forecasting studies.
Methods
We included in our scoping review studies that forecasted incidence, prevalence or epidemics of malaria over time. Whereas a systematic review is guided by a highly focused research question, a scoping review covers a subject area comprehensively by examining the extent, range and nature of research activity on a topic.2 The studies had to use models that included prior malaria incidence, prevalence or epidemics as a predictor. A description of the forecasting model and an assessment of the forecast accuracy were requirements for inclusion. Studies were restricted to human populations and to autochthonous transmission settings. We excluded studies that provided only spatial predictions, exploratory analysis (eg, assessing temporal correlations), mortality predictions and/or individual-level transmission modelling. Commentaries, descriptive reports or studies that did not include original research were also excluded. In addition, for studies that were related (eg, the same setting and the same methods with different time periods), the study with the most comprehensive data was included in the review.
A review protocol was developed and electronic search strategies were guided by a librarian experienced in systematic and scoping reviews. Papers were identified using medical subject headings and key word combinations and truncations: (‘forecast*’ or ‘predictive model*’ or ‘prediction model*’ or ‘time serie*’ or ‘timeserie*’; AND ‘malaria*’). The searches were not restricted by year or language although our searches were restricted by the historical time periods of the databases. The citation searches began on 18 April 2011 and the final citation search was conducted on 29 May 2012. We searched the following databases: CAB Abstracts (1910–2012 Week 20), EMBASE (1947–2012 28 May), Global Health (1910–April 2012), MEDLINE (1948−May Week 3 2012), ProQuest Dissertations & Theses (1861–29 May 2012) and Web of Science (1899–28 May 2012). We performed manual searches of the Malaria Journal (2000–29 May 2012) and the American Journal of Tropical Medicine and Hygiene (1921–May 2012). Grey literature was also searched using Google Scholar, based upon the same key words used to search the databases. In addition, the websites of the WHO and the US Agency for International Development were also examined for any relevant literature. To ensure that all appropriate references were identified, hand searching of reference lists of all included studies was conducted and any potentially relevant references were incorporated into the review process.
The citations were imported into EndNote X5 (Thomson Reuters) for management. Two main reviewers (KZ and AV) examined all citations in the study selection process with the exception of articles in Chinese, which were reviewed by a third reviewer (ZS). The first stage of review involved each reviewer independently identifying potentially relevant studies based upon information provided in the title and abstract. If it was uncertain whether to include or exclude a study during the first stage of review, the citation was kept and included in the full article review.
The second stage of review involved each reviewer independently identifying potentially relevant studies based upon full article review; data abstraction occurred for those articles that met the inclusion criteria. From each study, we abstracted the following: setting, outcome, covariates, data source(s), timeframe of observed data, forecasting and model evaluation methodologies, final models and associated measures of prediction accuracy. Quality of the included studies was not assessed as the objective was to conduct a scoping review and not a systematic review. Any discordance among the reviewers regarding inclusion or exclusion of studies or with respect to the information abstracted from the included studies was resolved by consultation with another author (DB).
Results
Our search identified 613 potentially relevant articles for the scoping review after duplicate citations were removed (figure 1). We identified 29 different studies that met our inclusion criteria for this review; they are described briefly in table 1. Malaria forecasting has been conducted in 13 different countries with China as the most frequent site of malaria forecasting. The size of the geographic region of study ranged from the municipal level to larger administrative divisions such as country and provinces or districts. Almost all of the studies (97%) used health clinic records of malaria infections from the general population as their data source for malaria infections, with one study using cohort data. Eleven (38%) of the 29 studies used laboratory confirmation of malaria cases (microscopy and/or rapid diagnostic tests), seven (24%) used clinical confirmation and two (7%) used a mixture of clinical and microscopic confirmation. Nine studies did not state whether they used clinical or microscopic confirmation of malaria.
Forecasting studies
The forecasting approaches included statistical modelling, mathematical modelling and machine-learning methods (table 2). The statistical methods included generalised linear models, AutoRegressive Integrated Moving Average (ARIMA) models32 and Holt-Winters models.33 The mathematical models were based upon extensions of the Ross-Macdonald susceptible-infected-recovered (SIR) malaria transmission model.34 Other authors predicted malaria incidence using neural networks, a machine-learning technique.35
Twelve studies (41%) included in the review used generalised linear models to forecast malaria counts, rates or proportions through linear, Poisson or logistic regression. All but one of the regression models included climate-related covariates such as rainfall, temperature, vegetation and/or relative humidity.12 Typically, the weather covariates were lagged, to account for the delayed effects of weather on malaria infections. Two studies4 ,8 explored the effects of including covariates as higher-order polynomials. Several of the studies used a generalised linear model approach to time series analysis by including previous (lagged) malaria incidence as an autoregressive covariate in the model. Some models included terms for season or year to account for seasonal and annual variations.
Seven studies (24%) used forecasting approaches based on ARIMA modelling with some including a seasonal component (SARIMA). While not explicitly stated, many studies used a transfer function model, also known as ARIMAX. Typically, these ARIMA-based models incorporated various meteorological series as covariates although one study also included data on the malaria burden in neighbouring districts.14
Four studies (14%) from China used the Grey method for malaria forecasting, none of which incorporated predictors other than malaria incidence.26–28 ,31 There were two studies (7%) that used mathematical models.21 ,22 Gaudart et al21 included a vector component in a SIR-type model and used data from a cohort of children, remote sensing data, literature and expert opinions of entomologists and parasitologists. The study by Laneri et al22 used a vector-susceptible-exposed-infected-recovered-susceptible (VSEIRS) model although they incorporated two different pathways from recovery to susceptibility that were based upon different timescales (seasonal and interannual), mimicking different transmission intensities. They found that rainfall had a significant effect on the interannual variability of epidemic malaria and including rainfall as a predictor improved forecast accuracy. The parameters in their models were based on literature as well as laboratory findings.
We identified three studies (10%) that used neural networks in their analyses, and each study used different input data and a unique network structure.23–25 Two of the studies used weather variables to predict malaria incidence.24 ,25 Gao et al24 also included evaporation and sunshine hours to predict malaria incidence; two variables that were not included in any other study.
As shown in table 3, climate-related predictors were used consistently in forecasting models, with the most common predictors being rainfall, relative humidity, temperature and normalised difference vegetation index. One study accounted for the effect of malaria incidence in neighbouring districts, but it was not a significant predictor and was excluded from the final model.14 The mathematical models included non-time-varying parameters such as the reporting fraction of cases (proportion of malaria cases in a population that is reported to public health), average life expectancy and several vector characteristics, which are listed in table 4.
Evaluation methods
Authors used different approaches to evaluate the accuracy of forecasting models. A typical approach was to segment the data into a model building or training portion with the other portion (the ‘hold-out’ sample) used for model validation or assessing forecast accuracy. The cross-validation approach used by Rahman et al7 and Teklehaimanot et al9 excluded 1 year of data at a time: the model was fit to the remaining data, forecast errors (prediction residuals) were computed using data from the excluded year, and this process was then repeated for subsequent years. The accuracy of the predictions was then estimated from the prediction residuals. Some of the studies used all the available data to fit a model and did not reserve data for assessing forecast accuracy.21 ,22
Studies compared the forecasts to observed values using various measures: mean-squared error, mean relative error, mean percentage error, correlation coefficients, paired t tests (between predicted and observed values), 95% CIs (of predicted values, checking whether observed values fell within the interval) and visualisations (eg, graphical representations of observed and predicted values).
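The leave-one-year-out procedure described above can be sketched in a few lines. This is a minimal illustration and not the code used by any of the reviewed studies: the function name `leave_one_year_out`, the input layout (a dictionary mapping each year to its list of period counts) and the stand-in model (a plain seasonal average over the remaining years) are all assumptions made for the example.

```python
def leave_one_year_out(cases_by_year):
    """Return prediction residuals for each held-out year.

    cases_by_year maps a year to a list of counts, one per calendar
    period (eg, 12 monthly values). The stand-in model is a seasonal
    average over the remaining years; any other model could be fitted
    at the marked step instead.
    """
    residuals = {}
    for held_out in cases_by_year:
        training_years = [y for y in cases_by_year if y != held_out]
        n_periods = len(cases_by_year[held_out])
        # Fit step: average each calendar period over the remaining years.
        fitted = [
            sum(cases_by_year[y][m] for y in training_years) / len(training_years)
            for m in range(n_periods)
        ]
        # Prediction residuals: observed minus predicted for the excluded year.
        residuals[held_out] = [
            obs - pred for obs, pred in zip(cases_by_year[held_out], fitted)
        ]
    return residuals


# Hypothetical two-period series for three years.
example = {2001: [10, 20], 2002: [14, 24], 2003: [12, 22]}
print(leave_one_year_out(example)[2001])  # [-3.0, -3.0]
```

Note that this scheme lets later years inform predictions for earlier ones, so it measures out-of-sample fit rather than strictly forward-looking forecast accuracy.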
Comparison of forecasting methods
We could not compare the forecast accuracy of models from different studies due to the lack of common measures and the lack of scale-independent measures. However, we briefly discuss the findings from studies that compared different methods within a single study.
Abeku et al13 found that their ARIMA models provided the least accurate forecasts when compared with variations of seasonal averages, and the most accurate forecasts were produced by the seasonal average that incorporated deviations from the last three observations (SA3). In contrast, Briet et al14 found that the most accurate model varied by district and forecasting horizon, but the SARIMA approach tended to provide the most accurate forecasts, followed by an ARIMA model with seasonality modelled using a sine term, then Holt-Winters, with the SA3 providing the least accurate forecasts. They also considered independent time series, such as rainfall and malaria cases in neighbouring districts, in the models. Medina et al30 determined that their Holt-Winters method provided more accurate forecasts and the accuracy did not deteriorate as rapidly as with the SA3 method. Cunha et al23 found that their neural network provided more accurate predictions across all three forecast horizons (3, 6 and 12 months) when compared with a logistic regression model.
Discussion
Malaria forecasting can be an invaluable tool for malaria control and elimination efforts. A public health practitioner developed a simple forecasting method, which led to the first early-warning system of malaria.1 Forecasting methods for malaria have advanced since that early work, but the utility of more sophisticated models for clinical and public health decision making is not always evident. The accuracy of forecasts is a critical factor in determining the practical value of a forecasting system. The variability in methods is the strength of malaria forecasting, as it allows for tailored approaches to specific settings and contexts. There should also be continued effort to develop new methods although common forecasting accuracy measures are essential as they will help determine the optimal approach with existing and future methods.
When performing forecasting, it is important to understand the assumptions of forecast models and to understand the advantages and disadvantages of each. Forecast accuracy should always be measured on reserved data and common forecasting measures should be used to facilitate comparison between studies. One should explore non-climate predictors, including transmission-reducing interventions, as well as different forecasting approaches based upon the same data.
Differences between forecasting methods
The regression approach to time series prediction attempts to model the serial autocorrelation in the data through the inclusion of autoregressive terms and/or sine and cosine functions for seasonality. Generalised linear regression models are used commonly and their main advantages are their flexibility and the intuitive nature of this approach for many people relative to ARIMA models. For example, the temporal dynamics observed in time series plots can be feasibly managed in generalised linear models by including several cyclic factors, interaction terms and numerous predictors.36 The main disadvantages are that generalised linear models do not naturally account for correlation in the errors37 and the models may need to be complex to capture all the dynamics of the relationship within a series and between two or more series.38 Failure to accurately model serial autocorrelation may bias the estimation of the effect of predictors as well as underestimate the standard errors. Crucially, regression model residuals must be examined for autocorrelation and it was not always evident that this occurred in the studies we identified using this method. In addition, it was not apparent if any remedial measures were used to account for the effect of autocorrelation on estimates of variance, for example, re-estimating standard errors using heteroskedasticity and autocorrelation consistent (HAC) estimators.39
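As a rough illustration of what examining residuals for autocorrelation involves, the sketch below computes the sample autocorrelation of a residual series at a given lag and flags lags that fall outside the usual approximate 95% white-noise band of ±1.96/√n. The function names and the example residuals are hypothetical, and the band is only a large-sample approximation; a formal test (eg, a portmanteau test) would normally accompany it.

```python
import math


def autocorrelation(residuals, lag):
    """Sample autocorrelation of a residual series at a positive lag."""
    n = len(residuals)
    mean = sum(residuals) / n
    denom = sum((e - mean) ** 2 for e in residuals)
    num = sum(
        (residuals[t] - mean) * (residuals[t - lag] - mean)
        for t in range(lag, n)
    )
    return num / denom


def flag_autocorrelated_lags(residuals, max_lag):
    """Lags whose autocorrelation lies outside the approximate 95% band."""
    bound = 1.96 / math.sqrt(len(residuals))
    return [
        lag for lag in range(1, max_lag + 1)
        if abs(autocorrelation(residuals, lag)) > bound
    ]


# Hypothetical residuals that alternate in sign, ie strong negative
# lag-1 autocorrelation that a regression model has failed to capture.
resid = [1, -1, 1, -1, 1, -1]
print(round(autocorrelation(resid, 1), 3))  # -0.833
```

If any lags are flagged, the model is missing temporal structure and its standard errors should not be taken at face value.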
ARIMA models are designed to account for serial autocorrelation in time series; current values of a series can be explained as a function of past values and past shocks.38 With ARIMA models, once the series have been detrended through differencing, any remaining seasonality can be modelled as part of additional autoregressive or moving average parameters of a SARIMA model. A rule of thumb is that 50 observations are a minimal requirement for ARIMA models,37 whereas SARIMA models require longer time series. The transfer function model, ARIMAX, extends ARIMA by also including as predictors current and/or past values of an independent variable. An advantage of ARIMA models versus GLMs is that ARIMA models naturally represent features of temporal patterns, such as seasonality and autocorrelation. As with generalised linear regression models, the residuals of ARIMA models need to be examined for residual correlation. Also, when incorporating an input series into the model, prewhitening should occur prior to the cross-correlation assessment for the transfer function models. In prewhitening, an ARIMA model is fitted to the input series so that its residuals are reduced to ‘white noise’, and the same ARIMA model is then applied to the output series.37 The relationship between the two resulting residual series is then estimated by the cross-correlation function. Without prewhitening, the estimated cross-correlation function may be distorted and misleading. The authors did not always report that they prewhitened the series prior to assessing cross-correlations.
Four studies from China used the Grey method for malaria forecasting.26–28 ,31 This forecasting method is essentially a curve-fitting technique based on a smoothed version of the observed data.40 ,41 The Grey model appears most useful in predicting malaria when using a very short time series and when there is a strong linear trend in the data. This is due to the nature of the GM(1,1) model which will always generate either exponentially increasing or decreasing series.42 Its value in malaria prediction beyond that of the simpler statistical modelling approaches is yet to be determined.
The approach to prediction differs between mathematical models and other approaches such as generalised linear models, ARIMA and Grey models. The Ross-Macdonald mathematical model divides the population under study into different compartments such as SIR, and uses differential equations to model the transition over time of individuals from one group to another. By using differential equations, these models can represent explicitly the dynamics of malaria infection, mosquito populations and human susceptibility. The disadvantages of mathematical models include the difficulty in finding appropriate, setting-specific data for the parameters. Also, the computational complexity of these models increases with the number of parameters, resulting in the omission of relevant features of malaria dynamics for the model to be manageable.43
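The prewhitening steps can be sketched as follows. To keep the example short and dependency-free, a first-order autoregressive, AR(1), filter stands in for the full ARIMA model that would be identified in practice; that substitution, the function names and the simulated series are all assumptions of this illustration.

```python
import random


def fit_ar1(series):
    """Least-squares AR(1) coefficient of the mean-centred series."""
    mean = sum(series) / len(series)
    c = [v - mean for v in series]
    return sum(c[t] * c[t - 1] for t in range(1, len(c))) / sum(
        v * v for v in c[:-1]
    )


def ar1_filter(series, phi):
    """Residuals left after removing AR(1) structure with coefficient phi."""
    mean = sum(series) / len(series)
    c = [v - mean for v in series]
    return [c[t] - phi * c[t - 1] for t in range(1, len(c))]


def cross_correlation(x, y, lag):
    """Correlation between x[t - lag] and y[t] for lag >= 0."""
    pairs = list(zip(x[: len(x) - lag], y[lag:]))
    mx = sum(a for a, _ in pairs) / len(pairs)
    my = sum(b for _, b in pairs) / len(pairs)
    cov = sum((a - mx) * (b - my) for a, b in pairs)
    vx = sum((a - mx) ** 2 for a, _ in pairs)
    vy = sum((b - my) ** 2 for _, b in pairs)
    return cov / (vx * vy) ** 0.5


def prewhitened_ccf(input_series, output_series, max_lag):
    """Fit the filter on the input, apply it to both series, then correlate."""
    phi = fit_ar1(input_series)
    filtered_input = ar1_filter(input_series, phi)
    filtered_output = ar1_filter(output_series, phi)  # same model, not refitted
    return {
        lag: cross_correlation(filtered_input, filtered_output, lag)
        for lag in range(max_lag + 1)
    }


# Simulated input (eg, rainfall) with strong autocorrelation, and an
# output that simply follows the input two periods later.
rng = random.Random(0)
rain = [0.0]
for _ in range(299):
    rain.append(0.8 * rain[-1] + rng.gauss(0, 1))
cases = [0.0, 0.0] + rain[:-2]
ccf = prewhitened_ccf(rain, cases, max_lag=5)
print(max(ccf, key=lambda lag: abs(ccf[lag])))  # 2
```

In this toy case the prewhitened cross-correlation peaks at the true delay of two periods; computed on the raw, autocorrelated series, the correlations at neighbouring lags would also be inflated, which is the distortion the text warns about.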
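To make the behaviour of GM(1,1) concrete, the sketch below follows the textbook formulation of the model: accumulate the series, estimate the two parameters by least squares on the smoothed accumulated values, and difference the fitted exponential curve to obtain forecasts. It is not taken from any of the reviewed studies; the function name and the example counts are hypothetical, and the degenerate case of a zero development coefficient is not handled.

```python
import math


def gm11_forecast(x0, steps):
    """Forecast `steps` values ahead with a basic GM(1,1) Grey model."""
    n = len(x0)
    # Accumulated generating operation: running total of the observations.
    x1, total = [], 0.0
    for value in x0:
        total += value
        x1.append(total)
    # Background values: means of consecutive accumulated values.
    z = [0.5 * (x1[k] + x1[k - 1]) for k in range(1, n)]
    y = x0[1:]
    # Least squares for x0(k) = -a * z(k) + b.
    z_mean = sum(z) / len(z)
    y_mean = sum(y) / len(y)
    slope = sum((zi - z_mean) * (yi - y_mean) for zi, yi in zip(z, y)) / sum(
        (zi - z_mean) ** 2 for zi in z
    )
    a = -slope
    b = y_mean - slope * z_mean

    def x1_hat(k):
        # Fitted accumulated series; k = 0 is the first observation.
        return (x0[0] - b / a) * math.exp(-a * k) + b / a

    # Differencing the accumulated curve recovers the original scale.
    return [x1_hat(k) - x1_hat(k - 1) for k in range(n, n + steps)]


# Hypothetical annual case counts with steady growth.
forecasts = gm11_forecast([100, 110, 121, 133.1], steps=3)
ratios = [forecasts[i + 1] / forecasts[i] for i in range(2)]
print(abs(ratios[0] - ratios[1]) < 1e-9)  # True
```

Successive forecasts differ by the constant factor exp(-a), so the projected series can only grow or decay exponentially, which is why the method suits short series with a strong monotone trend and little else.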
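The compartmental idea can be illustrated with the simplest SIR system, integrated with a fixed-step Euler scheme. This is a generic sketch and not the Ross-Macdonald model itself, which adds mosquito compartments and malaria-specific parameters; the function name, the parameter values and the step size are assumptions chosen only for illustration.

```python
def simulate_sir(s0, i0, r0, beta, gamma, days, dt=0.1):
    """Euler integration of a basic SIR model.

    dS/dt = -beta * S * I / N
    dI/dt =  beta * S * I / N - gamma * I
    dR/dt =  gamma * I
    Returns one (S, I, R) tuple per day, including day 0.
    """
    s, i, r = float(s0), float(i0), float(r0)
    n = s + i + r
    trajectory = [(s, i, r)]
    steps_per_day = round(1 / dt)
    for _day in range(days):
        for _step in range(steps_per_day):
            new_infections = beta * s * i / n * dt
            new_recoveries = gamma * i * dt
            s -= new_infections
            i += new_infections - new_recoveries
            r += new_recoveries
        trajectory.append((s, i, r))
    return trajectory


# Hypothetical population of 1000 with 10 initial infections.
path = simulate_sir(990, 10, 0, beta=0.3, gamma=0.1, days=60)
peak_day = max(range(len(path)), key=lambda d: path[d][1])
print(peak_day)
```

Each parameter has a direct epidemiological meaning (a transmission rate and a recovery rate here), which is the appeal of this approach and also the source of its data requirements.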
A neural network is a machine-learning method that connects a set of inputs (eg, weather covariates) to outputs (eg, malaria counts).44 The connections between inputs and outputs are made via ‘neurons’ and the number of links and corresponding weights are chosen to give the best possible fit to the training data. Neural networks have proven useful in their capacity to handle nonlinear relationships as well as a large number of parameters, and also in their ability to detect all possible interactions between predictor variables.45 Mathematical models and neural networks are able to capture thresholds or limits on malaria transmission, which cannot be readily captured by statistical approaches. For example, in generalised linear models, a small decrease in the temperature leads to a small decrease in malaria incidence, whereas neural networks and mathematical models can express explicitly that there will be no malaria transmission below a certain temperature. The disadvantages of neural networks include difficulties in determining how the network is making its decision and its greater computational burden,46 both of which depend upon the number of input parameters included in the model. In addition, neural networks have a greater susceptibility to overfitting45 and several thousand observations are typically required to fit a neural network with confidence.46 Malaria time series are unlikely to contain several thousands of observations, perhaps unless the observations are aggregated over time (eg, monthly) and location (eg, national level).
Researchers have examined many forecasting methods, but published articles tend to describe the application of a single method to a unique dataset. Direct comparison of methods would be easier if multiple malaria forecasting methods were applied to the same data. This approach would allow the identification of methods that provide the most accurate short-term, intermediate-term and long-term forecasts, for a given setting and a set of predictors. It would also allow the exploration of gains in forecast accuracy by using a weighted combination of forecasts from several models and/or methods.47
Malaria predictors
It has been suggested that climate and meteorological predictors have greater predictive power when modelling malaria incidence in areas with unstable transmission compared to areas with stable endemicity.48 It is interesting to note that nearly all of the models focused narrowly on a small number of environmental predictors despite the importance of other predictors of malaria incidence, such as land use, bednets, indoor residual spraying and antimalarial resistance. Forecast accuracy may be weakened if transmission-reducing interventions are not considered in the models.
Forecast evaluation
Model-fitting criteria, such as Akaike's information criterion, Bayesian information criterion or the coefficient of determination, are standard measures considered when choosing a regression model. Using such measures to guide forecast model selection may result in selecting models with a greater number of parameters and ‘overfitting’, which tends to result in inaccurate forecasts.49 For the purposes of forecasting, visualisations of forecasts compared to observations and forecast accuracy measures, such as the mean absolute forecast error, provide more direct and intuitive model selection criteria.
When choosing how much of the series to reserve for testing the model, it is recommended to reserve at least as much as the maximum forecast horizon.50 Cross-validation is a more efficient use of data than partitioning a data set into training and test segments, although it is more computationally intensive. It is recommended in cross-validation that only prior observations be used for testing a future value.50
Various direct measures were used to estimate forecasting error. Absolute measures, such as the mean absolute error (MAE), are relevant for measuring accuracy within a particular series but not across series because the magnitude of the MAE depends on the scale of the data.51 Percentage errors, such as the mean absolute percentage error (MAPE), are scale-independent but are not recommended when the data involve 0 counts, as MAPE cannot be calculated with 0 values. Also, the MAPE places a heavier penalty on forecasts that exceed the observed values than on those that fall below them.52 In economics, a measure called mean absolute scaled error (MASE) has been recommended as an accuracy measure for forecasting.51 We recommend incorporating MASE into malaria forecast evaluation as this evaluation measure will facilitate comparison between studies. We also recommend reporting MAE as it allows an intuitive interpretation of the errors. In addition, MAPE should be reported and a constant such as 1 could replace the 0 values in the series, allowing the calculation of MAPE. An advantage of MAPE is that it expresses the error relative to the size of the observed value. For example, if we observed 70 counts of malaria but predicted 60, MAPE would be 14.3%, MAE 10 and MASE 0.7. If we observed 15 counts of malaria but predicted 5, MAPE would be 66.7%, MAE 10 and MASE 0.7. MAPE and MASE could be used to compare findings across series and studies, and also compared to one another to understand if and how they differ in their ranking of forecast accuracy. The MAE, MAPE and MASE should be provided as site-specific measures for each forecasting horizon, as summary measures for each site, and finally as summary measures for each forecasting horizon across all sites (within a study).
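One way to follow the recommendation that only prior observations be used to test a future value is rolling-origin evaluation, sketched below. The helper names (`rolling_origin_errors`, `naive_forecaster`) and the toy series are hypothetical; the naive forecaster is only a placeholder for whichever model is being assessed.

```python
def rolling_origin_errors(series, min_train, horizon, forecaster):
    """Absolute forecast errors by horizon, using only past data each time.

    forecaster(history, horizon) must return a list of `horizon` forecasts.
    """
    errors = {h: [] for h in range(1, horizon + 1)}
    for origin in range(min_train, len(series) - horizon + 1):
        history = series[:origin]  # observations strictly before the origin
        forecasts = forecaster(history, horizon)
        for h in range(1, horizon + 1):
            actual = series[origin + h - 1]
            errors[h].append(abs(actual - forecasts[h - 1]))
    return errors


def naive_forecaster(history, horizon):
    """Placeholder model: repeat the last observed value."""
    return [history[-1]] * horizon


# Hypothetical series; the last `horizon` points are always reserved.
errors = rolling_origin_errors([1, 2, 3, 4, 5, 6], 3, 2, naive_forecaster)
print(errors)  # {1: [1, 1], 2: [2, 2]}
```

Because errors are stored separately for each horizon, the same run shows how quickly accuracy deteriorates as the forecast reaches further ahead.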
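The three measures can be computed as in the sketch below. The MASE follows the usual definition, in which the forecast MAE is divided by the in-sample MAE of a one-step naive forecast; a seasonal naive forecast may be a more suitable scaling for strongly seasonal malaria series. The training series is hypothetical and was chosen so that its naive errors average 100/7, which reproduces the MASE of 0.7 quoted in the text; the text itself does not state the scaling used.

```python
def mae(observed, predicted):
    """Mean absolute error, in the units of the series."""
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / len(observed)


def mape(observed, predicted):
    """Mean absolute percentage error; undefined if any observed value is 0."""
    if any(o == 0 for o in observed):
        raise ValueError("MAPE is undefined when an observed value is 0")
    total = sum(abs(o - p) / abs(o) for o, p in zip(observed, predicted))
    return 100 * total / len(observed)


def mase(observed, predicted, training):
    """MAE scaled by the in-sample MAE of the one-step naive forecast."""
    naive_errors = [
        abs(training[t] - training[t - 1]) for t in range(1, len(training))
    ]
    scale = sum(naive_errors) / len(naive_errors)
    return mae(observed, predicted) / scale


# Hypothetical training series: naive one-step errors sum to 100 over 7 steps.
training = [30, 44, 30, 44, 30, 44, 29, 44]

print(round(mape([70], [60]), 1), mae([70], [60]),
      round(mase([70], [60], training), 2))   # 14.3 10.0 0.7
print(round(mape([15], [5]), 1), mae([15], [5]),
      round(mase([15], [5], training), 2))    # 66.7 10.0 0.7
```

The two cases have the same MAE and MASE but very different MAPE, which shows why reporting all three gives a fuller picture than any one alone.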
Conclusion
Accurate disease predictions and early-warning signals of increased disease burden can provide public health and clinical health services with the information needed to strategically implement prevention and control measures. Potential barriers to their usefulness in public health settings include the spatial and temporal resolution of models and accuracy of prediction. Models that produce coarse forecasts may not provide the precision necessary to guide targeted intervention efforts. Additionally, the technical skill required and the lack of readily available data may reduce the feasibility of using models in practice, which should be considered in developing malaria forecasting models if the intent is to use these models in clinical or public health settings. Applying different forecasting methods to the same data, exploring the predictive ability of non-environmental variables, including transmission-reducing interventions, and using common forecast accuracy measures will allow malaria researchers to compare and improve models and methods, and lead to improvement in the quality of malaria forecasting.
Acknowledgments
We would like to thank various authors for responding to our questions and also to gratefully acknowledge LK for her assistance in our literature search strategies. We would especially like to thank the reviewers for critically reading the manuscript and providing insightful suggestions.
References
Footnotes

Contributors KZ, AV, KC, TB and DB contributed to the study concept and design. KZ, AV and ZS contributed to the article review and data abstraction. KZ, AV, KC, TB, JB and DB contributed to the interpretation of the data, and drafted the manuscript. All authors critically revised the manuscript for important intellectual content and approved the final version submitted for publication.

Funding This work was supported by the Canadian Institutes of Health Research Interdisciplinary Capacity Enhanced Team grant no HOA80072.

Competing interests None.

Provenance and peer review Not commissioned; externally peer reviewed.

Data sharing statement There are no additional data.