Original research
Novel microsimulation model of tobacco use behaviours and outcomes: calibration and validation in a US population
  1. Krishna P Reddy1,2,3,4,
  2. Alexander J B Bulteel1,
  3. Douglas E Levy2,4,5,
  4. Pamela Torola1,
  5. Emily P Hyle1,4,6,
  6. Taige Hou1,
  7. Benjamin Osher1,
  8. Liyang Yu1,
  9. Fatma M Shebl1,4,
  10. A David Paltiel7,
  11. Kenneth A Freedberg1,4,6,8,9,
  12. Milton C Weinstein4,9,
  13. Nancy A Rigotti2,4,8,
  14. Rochelle P Walensky1,4,6,8
  1. 1Medical Practice Evaluation Center, Massachusetts General Hospital, Boston, Massachusetts, USA
  2. 2Tobacco Research and Treatment Center, Massachusetts General Hospital, Boston, Massachusetts, USA
  3. 3Division of Pulmonary and Critical Care Medicine, Massachusetts General Hospital, Boston, Massachusetts, USA
  4. 4Harvard Medical School, Boston, Massachusetts, USA
  5. 5Mongan Institute Health Policy Research Center, Massachusetts General Hospital, Boston, Massachusetts, USA
  6. 6Division of Infectious Diseases, Massachusetts General Hospital, Boston, Massachusetts, USA
  7. 7Yale School of Public Health, New Haven, Connecticut, USA
  8. 8Division of General Internal Medicine, Massachusetts General Hospital, Boston, Massachusetts, USA
  9. 9Department of Health Policy and Management, Harvard T.H. Chan School of Public Health, Boston, Massachusetts, USA
  1. Correspondence to Dr Krishna P Reddy; kpreddy{at}


Background and objective Simulation models can project effects of tobacco use and cessation and inform tobacco control policies. Most existing tobacco models do not explicitly include relapse, a key component of the natural history of tobacco use. Our objective was to develop, calibrate and validate a novel individual-level microsimulation model that would explicitly include smoking relapse and project cigarette smoking behaviours and associated mortality risks.

Methods We developed the Simulation of Tobacco and Nicotine Outcomes and Policy (STOP) model, in which individuals transition monthly between tobacco use states (current/former/never) depending on rates of initiation, cessation and relapse. Simulated individuals face tobacco use-stratified mortality risks. For US women and men, we conducted cross-validation with a Cancer Intervention and Surveillance Modeling Network (CISNET) model. We then incorporated smoking relapse and calibrated cessation rates to reflect the difference between a transient quit attempt and sustained abstinence. We performed external validation with the National Health Interview Survey (NHIS) and the linked National Death Index. Comparisons were based on root-mean-square error (RMSE).

Results In cross-validation, STOP-generated projections of current/former/never smoking prevalence fit CISNET-projected data well (coefficient of variation (CV)-RMSE≤15%). After incorporating smoking relapse, multiplying the CISNET-reported cessation rates for women/men by 7.75/7.25, to reflect the ratio of quit attempts to sustained abstinence, resulted in the best approximation to CISNET-reported smoking prevalence (CV-RMSE 2%/3%). In external validation using these new multipliers, STOP-generated cumulative mortality curves for 20-year-old current smokers and never smokers each had CV-RMSE ≤1% compared with NHIS. In simulating those surveyed by NHIS in 1997, the STOP-projected prevalence of current/former/never smokers annually (1998–2009) was similar to that reported by NHIS (CV-RMSE 12%).

Conclusions The STOP model, with relapse included, performed well when validated to US smoking prevalence and mortality. STOP provides a flexible framework for policy-relevant analysis of tobacco and nicotine product use.

  • tobacco
  • nicotine
  • model
  • validation
  • calibration
  • relapse

This is an open access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited, appropriate credit is given, any changes made indicated, and the use is non-commercial. See:

Strengths and limitations of this study

  • The Simulation of Tobacco and Nicotine Outcomes and Policy (STOP) microsimulation model and our calibration and validation methods capture monthly individual-level tobacco use behaviours and outcomes, including relapse, a key factor in nicotine addiction.

  • We validated STOP model results with those of another model and, in a partially dependent manner, with empirical data.

  • We validated with multiple outcomes, including smoking prevalence and mortality.

  • This analysis did not account for some aspects of heterogeneity in tobacco use behaviours.

Introduction

In the USA, tobacco smoking reduces life expectancy by over a decade and accounts for over US$200 billion in healthcare costs annually, approximately 9% of all healthcare costs in the country.1 2 Though the prevalence of cigarette smoking among adults has decreased in the USA, from 42% in 1965 to 14% in 2018, the decline has not been seen in all segments of society.3 4 Meanwhile, tobacco treatment interventions, including behavioural therapy and pharmacotherapy, remain underutilised.5 Novel tobacco and nicotine products raise many new clinical and policy questions.6 7 Trial-based and cohort-based data to fully inform these questions will not be available for many years. In the meantime, a timely way to address them is via modelling.

Simulation models provide a critical complement to more traditional research approaches.8–14 Indeed, the Food and Drug Administration and the National Academies of Sciences, Engineering, and Medicine recently called for modelling studies to project the long-term effects, including both potential harms and benefits, of novel tobacco and nicotine products and regulatory policies to address them.15 16 While multiple model-based studies of tobacco and nicotine products have been published,17–25 most report aggregate trends, focus on the population rather than the individual level and do not explicitly account for smoking relapse, a key component of the natural history and resource utilisation of smoking cessation attempts. A current challenge of projecting longer-term clinical and economic outcomes of short-term tobacco cessation studies lies in capturing the many smoking quit attempts and relapses.26 27 A new model that intentionally examines relapse would extend trial results by projecting outcomes beyond the time horizon of trials, when many relapses occur. Our objective was to develop, calibrate and validate a novel, individual-level microsimulation model that directly addresses the mechanics of smoking initiation, cessation and relapse, and the associated clinical outcomes. The intended applications of the model include projecting the downstream impact of clinical and public health policy decisions and informing the design of tobacco treatment trials.

Methods

Analytical overview

We developed a microsimulation model of tobacco-related and nicotine-related behaviours, clinical outcomes and treatments: the Simulation of Tobacco and Nicotine Outcomes and Policy (STOP) model. In this analysis, we focused on cigarette smoking among US women and men to demonstrate that the STOP model, in simulating individuals’ month-by-month smoking behaviours, can match historical smoking prevalence and mortality data. Our methods included: (1) performing internal validation to ensure the accuracy of the mathematical calculations; (2) conducting cross-validation with another model; (3) incorporating smoking relapse and then calibrating smoking cessation probabilities to reflect the difference between a quit attempt and sustained abstinence; and (4) using our new relapse parameters, performing external validation to compare model outputs for mortality and for prevalence of current, former and never smokers over time to empirical data from the National Health Interview Survey (NHIS) (table 1).

Table 1

Characteristics of cross-validation and external validation analyses for a new microsimulation model of smoking behaviours and outcomes

Because there is no consensus criterion by which to compare model-generated results to surveillance data, expert guidance suggests choosing a criterion appropriate for the model structure and data sources.28 Similar to methods used in validating other models, we chose root-mean-square error (RMSE, for cumulative risks and time-varying prevalence estimates) and mean absolute percentage error (MAPE, for mortality rates) to evaluate the goodness of fit between STOP model results and data sources.28–34 We applied the coefficient of variation of RMSE (CV-RMSE) as a relative measure of error.

Smoking definitions

Similar to NHIS and the Cancer Intervention and Surveillance Modeling Network (CISNET, which used NHIS data), we defined never smokers as those who had smoked <100 cigarettes in their lifetime.35–37 Among others (ever smokers), NHIS defined current smokers as those who reported currently smoking every day or some days. NHIS considered ever smokers who reported no smoking at the time of interview to be former smokers, regardless of the duration of abstinence. CISNET considered former smokers to be those who had quit smoking at least 2 years prior to interview; those with a shorter period of abstinence were still considered current smokers.

To better distinguish relapse and mortality risks among those with short-term or long-term abstinence, the STOP model includes three states for those who have ever smoked: (1) current smoker; (2) recent quitter (short-term abstinence); and (3) former smoker (long-term abstinence) (figure 1). This enables a differentiation between: (1) transient quit attempts: transition from the current smoker state to the recent quitter state, with a relatively high rate of early relapse back to the current smoker state; and (2) sustained abstinence: transition from the recent quitter state to the former smoker state, with a lower rate of later relapse back to the current smoker state.

Figure 1

Overview of tobacco use states and transitions in Simulation of Tobacco and Nicotine Outcomes and Policy microsimulation model. This is a simplified, stylised depiction of smoking states and transitions—for example, dimensions such as age and sex are not represented in the figure. The ovals represent possible cigarette smoking states or the deceased state. The arrows represent monthly transitions by which an individual can switch to a different state. The ‘abstinence, sustained’ transition is depicted by a dashed line because there is not a monthly probability of transition—instead, the transition occurs after an individual has spent a user-defined duration (eg, 1 year) in the ‘recent quitter’ state. Numerical examples of the transition probabilities are in online supplementary table 1.

STOP model structure

STOP is an individual-level Monte Carlo microsimulation.38 39 An individual enters the model with age and smoking status defined by random realisations from specified probability distributions. STOP follows a state-transition framework: individuals transition monthly through various cigarette smoking states (figure 1). Transitions between these states depend on age-stratified and sex-stratified monthly smoking initiation and cessation probabilities. Ex-smokers have monthly relapse probabilities (figure 1). Monthly mortality probabilities depend on age, sex and smoking status. Those who quit smoking retain the all-cause mortality probabilities of current smokers until maintaining abstinence for a defined period of time, after which the mortality probabilities decline.1 8 40

Individuals are simulated in series: for each simulated person, the model tracks smoking behavioural events (smoking initiation, quit attempt, relapse) and the duration spent in each smoking state. On an individual’s death, the next simulated person enters the model. Once a cohort large enough to attain stable estimates has been simulated, summary statistics are calculated, including mean number of quit attempts, life expectancy and the monthly prevalence of never, current and former smokers. For the purpose of model output displays, those in the recent quitter state are considered ‘former smokers’. We use a constant simulated population size of 1 million to obtain stable estimates of these ‘average’ outcomes of interest.
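The state-transition structure described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the STOP implementation: the function name `simulate_person`, the parameter names and every probability value are hypothetical, and the age and sex stratification used by STOP is omitted for brevity.

```python
import random

# Hypothetical monthly probabilities for illustration only; these are not
# STOP input values, and age/sex stratification is omitted for brevity.
DEFAULT_PARAMS = {
    "p_initiate": 0.002,        # never smoker -> current smoker
    "p_quit_attempt": 0.02,     # current smoker -> recent quitter
    "p_relapse_recent": 0.10,   # recent quitter -> current smoker
    "p_relapse_former": 0.002,  # former smoker -> current smoker
    "months_to_sustained": 12,  # months as recent quitter before 'former'
    # Recent quitters keep the mortality probability of current smokers.
    "p_death": {"never": 0.0005, "current": 0.0012,
                "recent": 0.0012, "former": 0.0008},
}


def simulate_person(rng, start_state="never", max_months=1200,
                    params=DEFAULT_PARAMS):
    """Follow one simulated person month by month until death or max_months."""
    state, months_abstinent, quit_attempts = start_state, 0, 0
    for month in range(max_months):
        # Monthly mortality depends on the current smoking state.
        if rng.random() < params["p_death"][state]:
            return {"months_lived": month, "quit_attempts": quit_attempts,
                    "final_state": state}
        if state == "never":
            if rng.random() < params["p_initiate"]:
                state = "current"
        elif state == "current":
            if rng.random() < params["p_quit_attempt"]:
                state, months_abstinent = "recent", 0
                quit_attempts += 1
        elif state == "recent":
            if rng.random() < params["p_relapse_recent"]:
                state = "current"
            else:
                months_abstinent += 1
                # Duration-based (not probabilistic) move to sustained abstinence.
                if months_abstinent >= params["months_to_sustained"]:
                    state = "former"
        elif state == "former":
            if rng.random() < params["p_relapse_former"]:
                state = "current"
    return {"months_lived": max_months, "quit_attempts": quit_attempts,
            "final_state": state}


# Simulate people in series, then summarise the cohort.
rng = random.Random(0)
cohort = [simulate_person(rng) for _ in range(1000)]
mean_quit_attempts = sum(p["quit_attempts"] for p in cohort) / len(cohort)
```

The sketch uses a cohort of 1000 only to keep the run short; the analysis itself uses 1 million simulated individuals to obtain stable estimates.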

Internal validation

We conducted internal model validation by comparing model outputs to expected results and by conducting sensitivity analysis.

Cross-validation

Overview and outcome comparisons

We conducted cross-validation by simulating the US population born in 1950, following them monthly until 2020, and then comparing STOP results to those from CISNET modelling studies (table 1 and online supplementary text).28 We selected the 1950 birth cohort because smoking prevalence in the USA peaked in the 1960s, which was the smoking initiation period (adolescence) for these individuals, and data collection frequency increased concurrently. We compared STOP-generated results to CISNET-reported results for the prevalence of female and male current, former and never smokers over time.40

We used CV-RMSE to assess the goodness of fit of the six sets of prevalence curves.29 32 First, RMSE was calculated as the square root of the average of the squared difference between STOP-projected prevalence and CISNET-projected prevalence at each year of age. Then, we calculated CV-RMSE by dividing RMSE by mean modelled prevalence, representing the relative error.
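The two fit statistics described above can be written as a short Python sketch. The function names are illustrative, and one assumption is made explicit: the text divides RMSE by "mean modelled prevalence", which this sketch takes to be the mean of the first (STOP-projected) series.

```python
import math


def rmse(model, reference):
    """Root-mean-square error between two equal-length series."""
    if len(model) != len(reference) or not model:
        raise ValueError("series must be non-empty and of equal length")
    squared = [(m - r) ** 2 for m, r in zip(model, reference)]
    return math.sqrt(sum(squared) / len(squared))


def cv_rmse(model, reference):
    """RMSE divided by the mean of the model series (assumed denominator)."""
    mean_model = sum(model) / len(model)
    return rmse(model, reference) / mean_model


# Made-up prevalence values at two ages, for illustration only.
stop_prev = [0.40, 0.50]
cisnet_prev = [0.38, 0.52]
error = rmse(stop_prev, cisnet_prev)              # about 0.02
relative_error = cv_rmse(stop_prev, cisnet_prev)  # about 0.044, ie, 4.4%
```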

CISNET model

CISNET is a collaboration of National Cancer Institute-supported investigators modelling the impact of interventions on population incidence and mortality of various types of cancer, including lung cancer. The Yale CISNET-Lung models, for subsequent analyses of cancer care interventions, used data from NHIS to generate detailed smoking initiation and cessation rates, stratified by birth year, age and sex, and mortality rates, stratified by birth year, age, sex and smoking status.37 41

Input parameters for initial cross-validation

For the initial cross-validation exercise, we used data from CISNET modelling studies, which were derived from NHIS through 2009 and were stratified by birth cohort (table 2).37 41 Specifically, we used CISNET age-stratified and sex-stratified smoking initiation and cessation rates and smoking-stratified mortality rates among US women and men born in 1950, converting them to monthly probabilities. The CISNET smoking cessation rates reflected a direct transition from current smoker to former smoker after at least 2 years of sustained abstinence.41 This initial exercise used the same input parameters as CISNET and did not include smoking relapse.
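The text does not state how annual rates were converted to monthly probabilities. One common approach, shown here purely as an assumption, treats the rate as a constant hazard over the year, so that the monthly probability is 1 − exp(−rate/12). The function name is hypothetical.

```python
import math


def annual_rate_to_monthly_prob(annual_rate):
    """Convert an annual event rate to a monthly probability.

    Assumes a constant hazard within the year (an assumption of this
    sketch, not a method stated in the article).
    """
    if annual_rate < 0:
        raise ValueError("rate must be non-negative")
    return 1.0 - math.exp(-annual_rate / 12.0)


# Example with a made-up annual cessation rate of 0.12 per person-year.
p_month = annual_rate_to_monthly_prob(0.12)
# Compounding the monthly probability over 12 months recovers the
# annual probability implied by the same rate.
p_year = 1.0 - (1.0 - p_month) ** 12
```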

Table 2

Simulation of Tobacco and Nicotine Outcomes and Policy model input parameters applied in each validation exercise

Incorporating smoking relapse and calibrating cessation probabilities

The STOP model specifically includes smoking relapse, critical to projecting both short-term and long-term impacts of smoking cessation interventions and novel tobacco and nicotine products.

Following the initial cross-validation of the STOP model (without relapse), we added smoking relapse probabilities and then recalibrated the model by adjusting the previously applied smoking cessation probabilities. First, we modelled relapse as an exponential decay function of time since quit, such that the highest risk of relapse was in the first month after a quit attempt. The coefficient and time constant are based on relapse probabilities in smoking cessation trials (table 2).42–45 Second, we calibrated the previously applied cessation probabilities (derived from CISNET cessation data) by a multiplier to reflect: (1) a quit attempt rather than sustained abstinence; and (2) the higher likelihood of making a quit attempt rather than attaining sustained abstinence in a given month. This multiplier represents the average number of quit attempts, lasting at least 1 month, prior to attaining sustained abstinence. We compared our multipliers to published data on the average number of quit attempts required to attain sustained abstinence.27 Our overall aim for this calibration step was to identify a STOP-generated current smoker prevalence curve with an RMSE <0.01 compared with the CISNET model-generated current smoker prevalence curve, similar to previously described methods.32
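The two ingredients of this step can be sketched as follows. The functional form (a coefficient multiplied by an exponential in time since quit divided by a time constant) is an assumed reading of "exponential decay function", the default coefficient and time constant are hypothetical rather than the values in table 2, and capping the scaled probability at 1 is a safeguard added for this sketch.

```python
import math


def monthly_relapse_prob(months_since_quit, coefficient=0.20, time_constant=6.0):
    """Relapse probability that decays exponentially with time since quit.

    The default coefficient and time constant are placeholders, not the
    calibrated STOP inputs.
    """
    return coefficient * math.exp(-months_since_quit / time_constant)


def quit_attempt_prob(sustained_cessation_prob, multiplier):
    """Scale a sustained-cessation probability up to a quit-attempt probability."""
    return min(1.0, sustained_cessation_prob * multiplier)


# Relapse risk is highest immediately after the quit attempt and then falls.
early = monthly_relapse_prob(0)
later = monthly_relapse_prob(12)

# Made-up monthly cessation probability of 0.001 with a multiplier of 7.75.
p_attempt = quit_attempt_prob(0.001, 7.75)
```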

External validation

Overview and outcome comparisons

For external validation, we compared STOP model results to NHIS data.28 36 We accounted for smoking initiation, smoking cessation and mortality. Because NHIS data do not explicitly report relapse, we incorporated smoking relapse and the best-fitting cessation multipliers found in the cross-validation calibration step. We compared two outcomes: mortality and smoking prevalence (table 1 and online supplementary text).

First, to project and validate mortality outcomes, we simulated the population surveyed by NHIS from 1997 through 2009 (online supplementary text). To compare STOP-generated mortality rates to those derived from NHIS—stratified by age, sex and smoking status—we used MAPE, the mean absolute value of the percent difference between STOP and NHIS values. We also produced curves of cumulative mortality from STOP-generated results and from NHIS data, stratified by sex and by current/never smoking status. These curves reflect 20-year-old current smokers who continue to smoke until death or 20-year-old never smokers who never start smoking. We compared the four sets of cumulative mortality curves by RMSE and CV-RMSE (STOP vs NHIS) from age 20 years until age 84 years (goal RMSE <0.01). We did not generate mortality curves for 20-year-old former smokers because mortality risks for those who stop smoking prior to age 20 are similar to those of never smokers.1 Also, mortality risks depend on age at cessation, and at older ages this heterogeneous group would include people who quit smoking at a variety of ages.1 40
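MAPE as defined above can be computed with a few lines of Python. The function name and the example rates are illustrative only; the reference (NHIS-derived) values are used as the denominator, which is the usual convention and is assumed here.

```python
def mape(model, reference):
    """Mean absolute percentage error of model values against reference values."""
    if len(model) != len(reference) or not model:
        raise ValueError("series must be non-empty and of equal length")
    pct_errors = [abs(m - r) / abs(r) * 100.0 for m, r in zip(model, reference)]
    return sum(pct_errors) / len(pct_errors)


# Made-up mortality rates per 1000 person-years in two strata.
stop_rates = [1.1, 0.9]
nhis_rates = [1.0, 1.0]
fit = mape(stop_rates, nhis_rates)  # about 10 (per cent)
```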

Second, with those surveyed by NHIS in 1997 as the input cohort, we used the STOP model to project the prevalence of current, former and never smokers each year from 1998 to 2009. In a two-way sensitivity analysis, we recalibrated cessation multipliers and initiation multipliers with the goal of identifying multipliers that would minimise the CV-RMSE of STOP-reported current smoker prevalence compared with NHIS current smoker prevalence. The initiation multipliers were applied to smoking initiation rates for never smokers. We then compared the cessation multipliers from this step with those from the cross-validation calibration step.

Input parameters

Initial distributions of age, sex and smoking status for the population simulated in the external validation exercises came from two sources: aggregated 1997–2009 NHIS data for the mortality external validation, and 1997 NHIS data for the smoking prevalence external validation (table 2 and online supplementary figure 1). We obtained NHIS data in aggregate for years 1997–2009 from the Integrated Public Use Microdata Series.35 We used NHIS data through 2009 because those were the data used in the CISNET studies, which were our comparator in cross-validation exercises.37 41 These data provided initial distributions of smoking status and years of abstinence for former smokers (to inform relapse risks). From these 1997–2009 NHIS data, we derived age-specific and sex-specific smoking initiation and cessation rates using self-reported age at initiation and age at cessation variables (online supplementary table 1). As in our cross-validation exercises, we converted the cessation rates to quit attempt rates by incorporating relapse rates and cessation multipliers.

The NHIS data included linked National Death Index (NDI) mortality outcomes through 2011 for respondents for whom mortality data were available. We calculated mortality rates by age, sex and smoking status of the same NHIS respondents (online supplementary table 1).

Patient and public involvement

We did not involve patients or the public in our work.


Results

Initial cross-validation, without relapse

The STOP-projected prevalence of current, former and never smokers over time fit CISNET-projected data well for the 1950 birth cohort in the USA (figure 2, blue line vs red dotted line, RMSE <0.03, CV-RMSE 15%/7% for women/men). The STOP-estimated prevalence of current smokers at age 25 years, approaching peak prevalence for the 1950 birth cohort, was 40% for women and 54% for men, compared with CISNET estimates of 38% and 52%.

Figure 2

Cross-validation and calibration exercise: Simulation of Tobacco and Nicotine Outcomes and Policy (STOP)-generated results and Cancer Intervention and Surveillance Modeling Network (CISNET)-generated results for current smoking prevalence over time for US people born in 1950. (A) Women. (B) Men. The red-dotted line shows results from the CISNET model. The other three lines show STOP-generated results after each step of our parameterisation and calibration process. The blue line includes parameterisation of smoking initiation and cessation, but not relapse. The pink-dashed line includes smoking relapse as based on published studies. The black line includes calibration of smoking cessation probabilities to reflect quit attempts and relapse before sustained abstinence.

Incorporating smoking relapse and calibrating cessation probabilities

After incorporating smoking relapse, the prevalence of current smokers far exceeded that reported by the CISNET model, as expected since many of those who would have become former smokers reverted to being current smokers (figure 2, pink-dashed lines). We then aimed to reflect all quit attempts rather than only transitions to sustained abstinence. In rough calibrations, we found that the optimal multiplier would be between 5 and 10 when applied to cessation rates from the previous step. In finer calibrations, we varied the multiplier across the range of 5–10 in increments of 0.25. We found that multiplying the CISNET-reported cessation rates by 7.75 for women and by 7.25 for men best approximated the CISNET-projected prevalence of current smokers, with RMSE 0.004/0.008 and CV-RMSE 2%/3% for women/men (figure 2, black lines).
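The fine calibration described above is a one-dimensional grid search. The sketch below shows the search loop only: `toy_prevalence` is a hypothetical stand-in for a full simulation run, and its parameters are invented so that the example is self-contained.

```python
import math


def rmse(a, b):
    """Root-mean-square error between two equal-length series."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)) / len(a))


def toy_prevalence(multiplier, years=40, base_quit=0.004, start=0.40):
    """Hypothetical stand-in for a model run: yearly current smoker prevalence."""
    prevalence, curve = start, []
    for _ in range(years):
        prevalence *= 1.0 - multiplier * base_quit
        curve.append(prevalence)
    return curve


def calibrate(target, run_model, lo=5.0, hi=10.0, step=0.25):
    """Return the multiplier on the grid [lo, hi] with the lowest RMSE."""
    n_steps = round((hi - lo) / step)
    grid = [lo + i * step for i in range(n_steps + 1)]
    scored = [(rmse(run_model(m), target), m) for m in grid]
    best_error, best_multiplier = min(scored)
    return best_multiplier, best_error


# Generate a target curve with a known multiplier, then recover it.
target = toy_prevalence(7.75)
best_m, best_err = calibrate(target, toy_prevalence)
```

In this toy setting the search recovers the multiplier used to generate the target exactly; with a stochastic simulation the minimum RMSE would be small but non-zero.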

External validation

Mortality

In simulating the 1997–2009 NHIS population along with smoking relapse, we found that the age-stratified, sex-stratified and smoking-stratified mortality rates generated by the STOP model were a good fit to those derived from NHIS (MAPE 7%, examples in online supplementary table 2). Cumulative mortality curves for 20-year-old female and male current smokers and never smokers were similar between STOP projections and NHIS-derived data, with RMSE <0.01 and CV-RMSE ≤1% (figure 3). For 20-year-olds who continued to smoke until death, the STOP model predicted median life expectancy (counting years from birth) of 77.5 years for women and 72.5 years for men.
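A cumulative mortality curve and the corresponding median age at death can be derived from a sequence of monthly mortality probabilities, as in the sketch below. The function names are hypothetical, and the constant monthly probability in the example is invented; real inputs would vary by age, sex and smoking status.

```python
def cumulative_mortality(monthly_death_probs):
    """Cumulative probability of death at the end of each month."""
    alive, curve = 1.0, []
    for p in monthly_death_probs:
        alive *= 1.0 - p
        curve.append(1.0 - alive)
    return curve


def median_age_at_death(start_age_years, monthly_death_probs):
    """Age at which cumulative mortality first reaches 50%, or None if never."""
    curve = cumulative_mortality(monthly_death_probs)
    for i, cumulative in enumerate(curve):
        if cumulative >= 0.5:
            return start_age_years + (i + 1) / 12.0
    return None


# Unrealistically high constant monthly probability, to keep the example short.
probs = [0.01] * 120
median_age = median_age_at_death(20, probs)
```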

Figure 3

External validation: Simulation of Tobacco and Nicotine Outcomes and Policy (STOP) model results and National Health Interview Survey (NHIS)/National Death Index (NDI) results for cumulative mortality of current smokers and never smokers from age 20. (A) Women. (B) Men. Within each panel, the STOP results and the NHIS data are not easily distinguishable because they are essentially overlapping. Current smokers are those who continue to smoke until death. NHIS was linked to NDI for mortality data. CV-RMSE, coefficient of variation of root-mean-square error.

Smoking prevalence

Using those surveyed by NHIS in 1997 as the input cohort, the STOP-projected prevalence of current, former and never smokers each year from 1998 to 2009 was similar to that reported by NHIS, with overall RMSE 0.04 and CV-RMSE 12% for both women and men (ages 30–84 years combined; online supplementary figure 2 shows results specifically for ages 40–44 years). Compared with NHIS, the STOP model slightly underpredicted never smoker prevalence and slightly overpredicted former smoker prevalence in later years. In the two-way sensitivity analysis, we found that cessation multipliers of 7.5/7.0 for women/men and initiation multipliers of 0.9/1.0 provided the best overall fit (lowest RMSE) of STOP-projected current smoker prevalence compared with that reported by NHIS (online supplementary figure 3).

Discussion

We developed, calibrated and validated STOP, a novel microsimulation model of individual-level tobacco use behaviours and outcomes. Our initial model input parameters included smoking initiation and cessation and smoking-stratified mortality, and we demonstrated cross-validity compared with the CISNET model. After incorporating relapse, we calibrated smoking cessation probabilities to reflect quit attempts rather than sustained abstinence. We then validated STOP model output with: (1) smoking prevalence over time reported by the CISNET model for US women and men born in 1950; (2) age-stratified, sex-stratified and smoking-stratified mortality rates and cumulative mortality reported by the NHIS-NDI linked database for the years 1997–2009; and (3) prevalence of current, former and never smokers by sex from 1998 to 2009 reported by NHIS, using 1997 NHIS-reported population characteristics as inputs.

Most existing tobacco models simulate at the population level or lack the capacity to consider smoking initiation and (non-sustained) quit attempts throughout a lifetime.18 19 23–25 37 46 The individual-level details of STOP can be employed to simulate and compare behaviours and interventions. While this calibration and validation analysis focused on cigarette smoking because of the availability of historical data for comparisons, we intend to broaden the use of STOP to include electronic cigarettes (e-cigs). Longitudinal cohort studies and clinical trials are examining the effects of e-cig use on tobacco smoking behaviours and clinical outcomes over long time horizons, but data are needed now to inform guidelines and policy around these novel products.15 44 47 48 Results from multiple distinct, validated models can help motivate policy decisions, and consistency of policy recommendations across unique, independent models reinforces confidence in their recommendations.49–53

A novel aspect of the STOP model is the incorporation of smoking relapse on a monthly basis, reflecting the understanding of nicotine addiction as a chronic relapsing condition with rapid cycles between use and cessation.26 45 54–57 This key feature enables an important distinction between a quit attempt and sustained abstinence. This distinction is missing from most tobacco models and indeed from many epidemiological studies of smoking and smoking cessation, which consider the transition from ‘current’ to ‘former’ smoker to be an abrupt one that results in sustained abstinence. Incorporating relapse required calibrating cessation rates by applying multipliers. The cessation multipliers that provided the best fits to empirical data are in line with published data regarding the number of quit attempts required before sustained abstinence is achieved.27 54 58 The slightly higher multiplier needed for women compared with men is consistent with NHIS data showing that among ever smokers (current smokers plus former smokers) aged 60 years and above, a greater proportion of women compared with men are former smokers.35 Calibration of cessation rates may compensate for other inaccuracies in model inputs or structure, though the precalibration (without relapse) STOP-generated results fit well with those of CISNET.

Many trials of smoking cessation interventions follow patients for a few months or up to 1 year, but they do not report subsequent relapse. By including relapse, the STOP model can combine data from short-term trials of smoking cessation interventions with data from natural history studies of smoking and smoking cessation to project longer-term outcomes including sustained abstinence. The flexibility to integrate data from a variety of sources is a strength of modelling analyses.

Going forward, we plan to use the STOP model to study contemporary rather than historical populations and to predict future tobacco use, while using deterministic and probabilistic sensitivity analyses to account for uncertainty in future behavioural transition probabilities and mortality probabilities.59 As no empirical data exist with which to validate model output of future tobacco use, we validated STOP model output against historical populations. Most US historical data on smoking prevalence and smoking-associated mortality are based at least in part on NHIS, the oldest ongoing survey of smoking prevalence in the USA.1 60–62 We compared STOP model output to CISNET model output, to NHIS itself, and to results from a study by Jha et al,1 all of which used NHIS data. We demonstrated cross-validity of STOP compared with CISNET model results when using CISNET input parameters and then added relapse probabilities and cessation multipliers. We demonstrated external validity, in a partially dependent manner, of STOP compared with NHIS data when using some NHIS-derived input parameters plus the external relapse probabilities and cessation multipliers from our cross-validation. Though independent external validation sources are ideal, dependent sources can still be useful, especially in this scenario where most of the available US historical smoking prevalence, behaviour and mortality data are derived from NHIS.28 Of note, in a two-way sensitivity analysis in which we simultaneously varied the smoking initiation and cessation multipliers to achieve a close fit to NHIS smoking prevalence data, the optimal cessation multipliers were very similar to those we found in our cross-validation calibration step, demonstrating the robustness of these multipliers across different sets of assumptions.

In an external validation exercise, the STOP model projection for never smoker prevalence from 1998 to 2009 was slightly lower than that reported by NHIS, and the STOP model projection for former smoker prevalence was slightly higher than NHIS data. In NHIS, former smokers were self-defined but on average had been abstinent for over a decade. NHIS considered those who smoked ‘some days’ to be current smokers, though some of them may have been in the midst of a short-duration quit attempt. STOP model output formally labels these people, who may be in the recent quitter state, as former smokers but assigns them the mortality risks of current smokers (until a defined period of abstinence). STOP reflects monthly quitting and relapsing behaviours whereas NHIS is an annual cross-sectional survey. Thus, one would expect the STOP model to report a higher prevalence of former smokers than NHIS, as seen in our results. Immigration could also account for some of the difference between NHIS data and STOP model-generated results: immigrants were surveyed in NHIS but our model analysis does not account for them. Smoking prevalence differs between the immigrant and non-immigrant populations.63 64 On the other hand, STOP model-generated life expectancies were similar to the median life expectancies for 30-year-old smokers reported by Jha et al (also derived from NHIS data): 77 years for women and 72 years for men.1

The STOP model has features, and will have applications, not described in this analysis. We developed the model to incorporate resource utilisation. The STOP model can capture the healthcare costs associated with being a current, former or never smoker, as well as the costs of tobacco cessation interventions. By incorporating the chronic relapsing nature of nicotine addiction, the STOP model can account for the resources required for recurrent cessation interventions (eg, restarting the same or a different intervention after smoking relapse), an important consideration in cost effectiveness and policy analyses. Ultimately, we will use the STOP model to evaluate behavioural and clinical outcomes, costs of care and cost effectiveness of tobacco cessation interventions, programmes and policies. An overarching goal is to provide information that can inform decision makers—including clinicians, public health officials and policy-makers—on cost-effective interventions that reduce the clinical and economic burden of tobacco use. The model can eventually assess the impact of different financing options for tobacco cessation interventions—for example, annual versus lifetime insurance coverage limits. The STOP model’s flexibility will allow for analyses beyond US populations, including settings where smoking-related behaviours and clinical outcomes may be different from those in the USA.65

The STOP model has limitations. Its projections are limited by assumptions and by the specificity of available data (for example, age, sex and birth year stratifications of smoking behavioural transitions). While we have aimed to calibrate and validate the model with the best available historical data, any use of the model to project future outcomes should be approached with prudence. Calibration to historical data is no panacea: calibration drift is a concern, and relapse rates could change over time due to changes in population-level nicotine dependence.66 Nonetheless, input parameter values can be varied in sensitivity analysis. STOP does not include dynamics such as the effects of one person’s smoking on another person’s smoking behaviours. Smoking is associated with other factors not directly captured by STOP, including race, socioeconomic status, mental illness and other substance use, but different populations can be simulated separately in the model, each with its own population-specific input parameters. There is heterogeneity in smoking behaviours, including cigarettes consumed per day and daily versus non-daily smoking. STOP enables stratification by intensity of smoking, which can be used to represent amount or frequency of smoking.

In conclusion, STOP is a novel, individual-level microsimulation model that captures tobacco-related behaviours, importantly including relapse, and outcomes, with the goal of informing decision-making around tobacco cessation interventions and tobacco policy. We have demonstrated that the model is well calibrated and has been validated against another model and against historical cohorts. We plan to use the model for policy-relevant analysis of contemporary patient-level and population-level care while reflecting real-life tobacco use and cessation behaviours.




  • Contributors KPR: study conception, study design, data analysis, data interpretation, and drafted the first version of the manuscript. AJBB and FMS: study design, data analysis and data interpretation. DEL: study conception, study design and data analysis. BO: study design. PT and LY: data analysis and data interpretation. EPH, MCW, ADP and KAF: study design and data interpretation. NAR, RPW and TH: study conception, study design and data interpretation. All authors critically reviewed the manuscript for important intellectual content and approved the final submitted version.

  • Funding This work was supported by awards from the National Institute on Drug Abuse (K01 DA042687) and the National Heart, Lung, and Blood Institute (K01 HL123349) of the National Institutes of Health, and the Steve and Deborah Gorlin MGH Research Scholars Award.

  • Disclaimer The funding sources had no role in the design, analysis or interpretation of the study, the writing of the manuscript, or in the decision to submit the manuscript for publication. The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health or Massachusetts General Hospital.

  • Competing interests NAR receives royalties from UpToDate, is a consultant for Achieve Life Sciences, and has been an unpaid consultant for Pfizer. All other authors report no competing interests.

  • Patient consent for publication Not required.

  • Ethics approval This study was approved by the Partners Human Research Committee (2019P001772).

  • Provenance and peer review Not commissioned; externally peer reviewed.

  • Data availability statement Data used to parameterise the model are publicly available at and at
