
The use of individual patient-level data (IPD) to quantify the impact of pretreatment predictors of response to treatment in chronic hepatitis B patients
  1. Shehzad Ali1,
  2. Stuart Mealing1,
  3. Neil Hawkins1,2,
  4. Benedicte Lescrauwaet3,
  5. Stefan Bjork4,
  6. Lorenzo Mantovani5,
  7. Pietro Lampertico6
  1. 1Oxford Outcomes Ltd, Oxford, UK
  2. 2Centre for Health Economics, University of York, York, UK
  3. 3Xintera Consulting Bvba, Leuven, Belgium
  4. 4Bristol-Myers Squibb Sarl, Rueil Malmaison, Paris, France
  5. 5University of Naples, Naples, Italy
  6. 6First Division of Gastroenterology, Fondazione IRCCS Cà Granda Ospedale Maggiore Policlinico, Università degli Studi di Milano, Milano, Italy
  1. *Correspondence to
    Stuart Mealing; Stuart.mealing{at}


Objectives Evidence synthesis is an integral part of decision-making by reimbursement agencies. When direct evidence is not available, network meta-analysis (NMA) techniques are commonly used. This approach assumes that the trials are sufficiently similar in terms of treatment-effect modifiers. When imbalances in potential treatment-effect modifiers exist, the NMA approach may not produce fair comparisons. The objective of this study was to identify and quantify the interaction between treatment effect and potential treatment-effect modifiers, including time-of-response measurement and baseline viral load, in chronic hepatitis B (CHB) patients.

Design Retrospective patient-level data econometric analysis.

Participants 1353 individuals from two randomised controlled trials of nucleoside-naïve CHB taking 0.5 mg entecavir (n=679) or 100 mg lamivudine (n=668) daily for 48 weeks.

Interventions Hepatitis B virus (HBV) DNA levels for both drugs were measured at baseline and weeks 24, 36 and 48. Generalised estimating equation for repeated binary responses was used to identify treatment-effect modifiers for response defined at ≤400 or ≤300 copies/ml.

Primary outcome measures OR at 48 weeks.

Results The OR for the time-of-response measurement and treatment-effect interaction term was 1.039 (p=0.00) and 1.035 (p=0.00) when response was defined at ≤400 or ≤300 copies/ml, respectively. The baseline HBV DNA and treatment-effect interaction OR was 0.94 (p=0.047) and 0.95 (p=0.096), respectively, for the two response definitions suggesting evidence of interaction between baseline disease activity and treatment effect. The interaction between HBeAg status and treatment effect was not statistically significant.

Conclusions The measurement time point seems to modify the relative treatment effect of entecavir compared to lamivudine, measured on the OR scale. Evidence also suggested that differences in baseline viral load may alter the relative treatment effect. Meta-analyses should account for such modifiers when generating relative efficacy estimates.

Article summary

Article focus

  • In patients with chronic hepatitis B, are there any baseline patient characteristics that impact on the probability of achieving undetectable viral load after 1 year of therapy?

  • If so, what is the magnitude of the effect?

Key messages

  • Time on treatment and baseline viral load were independent treatment-effect modifiers.

  • The treatment–effect interaction OR for baseline viral load was 0.94 (p=0.047).

  • Future meta-analyses should account for this interaction term in order to generate ‘like-for-like’ comparisons.

Strengths and limitations of this study

  • Analyses based on patient-level data from two high-quality randomised controlled trials (patient count=1353).

  • The statistical models took account of the longitudinal nature of the data and the correlation between repeated measures.

  • HBV DNA levels were observed at a relatively small number of discrete time points, that is, 24, 36 and 48 weeks.

  • Only two treatments included in the trials (entecavir and lamivudine).


Chronic hepatitis B (CHB) is an infectious condition caused by the hepatitis B virus (HBV). Long-term consequences of infection include cirrhosis, hepatocellular carcinoma and ultimately death.1 For patients in the UK, the licensed treatments for CHB are entecavir (ETV), lamivudine (LAM), telbivudine, adefovir dipivoxil, tenofovir disoproxil fumarate (TDF), peginterferon α-2a, interferon α-2a and interferon α-2b.2 These treatments have been evaluated in randomised controlled trials (RCTs); however, head-to-head trials are not available for all possible treatment comparisons. From the perspective of a reimbursement agency such as the National Institute for Health and Clinical Excellence (NICE), this poses a challenge since relative estimates of treatment efficacy may not be available for each intervention to inform reimbursement decisions.

One strategy for overcoming this challenge is to use network meta-analysis (NMA) to generate relative efficacy estimates between competing treatments.3–6 While a traditional meta-analysis includes studies that compare two interventions with each other, an NMA synthesises all available evidence from a network of RCTs involving multiple treatments compared directly or indirectly or both. NMA extends the concept of a traditional meta-analysis by including multiple pairwise comparisons across a range of interventions, and it provides estimates of the relative treatment effect for multiple treatment comparisons by using a chain of evidence that links the treatments of interest. It represents a comprehensive analysis framework for combining both direct and indirect evidence in such a manner as to preserve randomisation within trials.

The NMA approach allows consideration of all relevant evidence and addresses research questions in the absence of direct comparative evidence, thus improving the precision of estimates by combining direct and indirect evidence. One of the key assumptions underpinning this method is that the studies included in the analysis are homogeneous, that is, the trials are sufficiently similar on study and patient characteristics (‘covariates’). If these covariates act as modifiers of the relative treatment effect (ie, the difference in response on a given scale, typically log-odds or odds) and their distribution is not balanced across the studies that are compared directly or indirectly, the similarity assumption is violated and the NMA is affected by confounding bias. Effect modifiers may include patient characteristics, study setting, length of follow-up, outcome definition and measurement and study methodology (eg, protocol requirements and study time frame).7 This is particularly important in cases where response to treatment is defined in terms of post-treatment level of a measure (eg, a biological marker such as response viral load) when the baseline level of this measure is known to vary across studies. As a result, if one study happens to recruit patients with a more advanced disease (ie, worse levels of a given clinical variable), and this variable is known to modify the impact of treatment, then the level of response achieved in this study is likely to be smaller compared to another study which primarily includes patients with less-advanced disease (ie, with better baseline variable levels), all things being equal. To identify treatment-effect modifiers and to provide an estimate of the magnitude of treatment–covariate interaction, regression analysis of individual patient-level data can be used. This, in turn, can form the basis for evaluating baseline imbalances across trials included in the NMA. It should, however, be noted that such within-trial interaction analysis would provide an estimate for the magnitude of potential treatment modifier effect rather than proving beyond doubt that these interactions exist at the study level.

To date, two network meta-analyses of treatments for CHB have been published in peer-reviewed journals.8,9 Both studies defined response to treatment in terms of attainment of undetectable levels of HBV DNA below a predefined threshold at a given time point. However, in both cases, the authors did not attempt to fully account for the different sources of heterogeneity, including differences in baseline patient characteristics across trials. Moreover, the effect of using different definitions of virological response to treatment was not explored during evidence synthesis.

The objective of this study was to explore and quantify the relationship between treatment effect and patient characteristics, in particular, baseline disease severity and time-of-response measurement, in predicting response to CHB treatment. Baseline HBV DNA level as measured using the PCR assay is known to be a predictor of treatment response but the extent of this relationship, and whether or not it is also a treatment-effect modifier (ie, whether or not it impacts on the efficacy of a particular treatment option) is not currently known. This work is of interest to the clinical community since it would provide evidence on whether and how much treatment effects vary based on these variables. It will also be of value in future network meta-analyses since it explores the methodological implications of treatment interactions that can impact on the observed level of response. Extending network meta-analysis models with treatment-by-covariate interactions may explain heterogeneity in relative treatment effects.

We used individual patient level data (IPD) from two clinical trials of CHB patients, comparing ETV and LAM, to quantify the relationship between treatment effect and baseline covariates and time-of-response measurement to identify any treatment-effect modifiers. Treatment response was defined in terms of attainment of undetectable levels of HBV DNA below predefined threshold values of ≤400 and ≤300 copies/ml.


Patient-level data from 1353 treatment-naïve individuals recruited into two multinational, double-blind, double-dummy studies of ETV 0.5 mg once daily (n=679) and LAM 100 mg once daily (n=668) were made available by Bristol-Myers Squibb for the purpose of this analysis (study names: 022 and 027).10,11 In both clinical trials, all individuals were nucleoside-naïve, and the common composite primary efficacy endpoint was histological improvement (≥two point decrease in the Knodell necroinflammatory score) with no worsening of fibrosis (worsening being a ≥one point increase in the Knodell fibrosis score) at week 48 compared to baseline. Individuals were HBeAg-positive in Study 022 and HBeAg-negative in Study 027.

The primary rationale for the study was to explore the effect of interaction between baseline variables and treatment effect in predicting response. Here, response is defined as a binary (yes/no) variable in terms of achieving a threshold of undetectable levels of HBV DNA. Two commonly used response thresholds were evaluated in this study, that is, ≤400 or ≤300 copies/ml measured by PCR assay. Response was evaluated for each individual patient at weeks 24, 36 and 48. The following covariates were used in the statistical analysis: natural log of baseline HBV DNA measured by PCR assay (variable: ‘LPCR_0’), treatment received (‘ETV’ or ‘LAM’, with LAM used as the reference group), HBeAg antigen status of the patient (‘HBeAg’) and a time variable (‘TIME’) expressed in weeks to evaluate the impact of duration on treatment response. The following interaction terms were included: treatment with ETV and baseline viral load (‘ETV*PCR’), treatment with ETV and time (‘ETV*TIME’) and treatment with ETV and e-antigen status (ETV*HBeAg). Study-specific intercepts were also included in the analysis. Covariates were centred at their mean values, except for the time variable which was centred at 48 weeks.
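The covariate construction described above can be sketched in a few lines of Python. This is an illustration only, not the authors' code: the function name `design_row` and the centring value `MEAN_LPCR_0` are hypothetical (the sample mean is not reported in the text), only the continuous covariates are centred here, and the study-specific intercepts are left out for brevity.

```python
# Hypothetical centring constant: the sample mean of ln(baseline HBV DNA)
# is not reported in the text, so this value is a placeholder.
MEAN_LPCR_0 = 18.0
TIME_CENTRE_WEEKS = 48.0  # stated in the text: TIME is centred at week 48


def design_row(lpcr_0, etv, hbeag, week, mean_lpcr_0=MEAN_LPCR_0):
    """Build one observation's covariates, including the interaction terms.

    lpcr_0 : natural log of baseline HBV DNA (copies/ml)
    etv    : 1 if the patient received ETV, 0 if LAM (reference group)
    hbeag  : 1 if HBeAg-positive, 0 if HBeAg-negative
    week   : time of response measurement (24, 36 or 48)
    """
    lpcr_c = lpcr_0 - mean_lpcr_0          # centred baseline viral load
    time_c = week - TIME_CENTRE_WEEKS      # 0 at week 48, negative before
    return {
        "LPCR_0": lpcr_c,
        "ETV": etv,
        "HBeAg": hbeag,
        "TIME": time_c,
        "ETV*PCR": etv * lpcr_c,      # treatment by baseline viral load
        "ETV*TIME": etv * time_c,     # treatment by time
        "ETV*HBeAg": etv * hbeag,     # treatment by e-antigen status
    }


# One ETV patient measured at week 24, two ln-units above the assumed mean.
print(design_row(lpcr_0=20.0, etv=1, hbeag=1, week=24))
```

Because TIME is centred at week 48, the coefficient on ETV in such a model is interpreted as the treatment effect at week 48 for a patient at the centring values of the other covariates. For a LAM patient all three interaction columns are zero.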

Statistical analysis was performed using the generalised estimating equations (GEE) with logit link function and autoregressive correlation structure. The use of GEE is preferred when outcomes are correlated and the focus of analysis is on estimating average effect in the population. The GEE takes an account of the correlation between repeated observations from the same individual at multiple time points. Moreover, since the sample size needed to achieve adequate statistical power to detect interaction effects is larger than the sample size needed to detect main effects,12 the use of GEE is preferred over cross-sectional logistic regression as it allows the use of multiple observations per individual. The analysis was conducted in Stata V.11 (StataCorp. 2009. Stata Statistical Software: Release 11. College Station, Texas: StataCorp LP).
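To make the autoregressive correlation structure concrete, the sketch below builds a first-order autoregressive working correlation matrix for the three response visits. The text does not state the order of the structure or the estimated correlation, so both the first-order form and the value `rho = 0.5` are assumptions chosen for illustration; the function name is hypothetical.

```python
def ar1_working_correlation(n_visits, rho):
    """First-order autoregressive working correlation.

    The correlation between visits i and j is rho ** |i - j|, so
    observations from the same patient become less correlated the
    further apart the visits are.
    """
    return [
        [rho ** abs(i - j) for j in range(n_visits)]
        for i in range(n_visits)
    ]


# Three visits (weeks 24, 36 and 48) with an assumed rho of 0.5.
for matrix_row in ar1_working_correlation(n_visits=3, rho=0.5):
    print(matrix_row)
```

In a GEE the working correlation only affects efficiency; with robust standard errors, the coefficient estimates remain consistent even if the assumed structure is not exactly right.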


To contextualise the results of the statistical analysis conducted in this study, we present the summary statistics from a systematic review of CHB interventions conducted separately by the authors and presented at the International Society for Pharmacoeconomics and Outcomes Research conference in 2011. Table 1 presents baseline HBV DNA levels in the RCTs of CHB interventions. The table shows that the baseline HBV DNA values in all trials ranged from 5.6 to 10.3 log10 copies/ml.

Table 1

Baseline HBV DNA levels by study

For HBeAg-positive patients, the range was between 5.7 and 10.3 log10 copies/ml, whereas for HBeAg-negative patients, the range was 5.6–7.8 log10 copies/ml. Of note, however, is that when the focus is restricted to the key HBeAg-positive regulatory trials for ETV and TDF (AI463-022 and Marcellin2008), there is an approximate difference of 1 log10 copies/ml. The corresponding difference between the ETV and TDF HBeAg-negative regulatory studies (AI463-023 and Marcellin2008) was approximately 0.5 log10 copies/ml. Hence, the studies appear dissimilar in terms of baseline viral load levels.

GEE logit regression models evaluated the odds of achieving treatment response at two threshold values (ie, HBV DNA ≤400 or ≤300 copies/ml) after adjusting for baseline characteristics. Both main effect and interaction terms were included in the analysis. The results are presented as both log-ORs (tables 2 and 3) and ORs (tables 4 and 5) with CIs. An OR of <1 suggests a decrease in odds of achieving treatment response when the level of baseline variable increases by one unit, and vice versa.
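The relationship between the log-OR and OR presentations is simple exponentiation, applied to the point estimate and to each confidence limit. The sketch below shows the conversion; the standard error of 0.03 is a made-up value for illustration (the tables are not reproduced here), and the function name is hypothetical.

```python
import math


def log_or_to_or(log_or, se, z=1.96):
    """Convert a log-OR and its standard error to an OR with a 95% CI.

    The interval is computed on the log scale and then exponentiated,
    so it is asymmetric around the OR.
    """
    point = math.exp(log_or)
    lower = math.exp(log_or - z * se)
    upper = math.exp(log_or + z * se)
    return point, lower, upper


# Illustrative input: a log-OR equal to ln(0.94) with an assumed SE of 0.03.
point, lower, upper = log_or_to_or(math.log(0.94), se=0.03)
print(f"OR {point:.3f} (95% CI {lower:.3f} to {upper:.3f})")
```

A negative log-OR therefore corresponds to an OR below 1, and a CI on the log scale that excludes 0 corresponds to a CI on the OR scale that excludes 1.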

Table 2

Log-odds of response based on generalised estimating equation for treatment response at ≤400 copies/ml

Table 3

Log-odds of response based on generalised estimating equation for treatment response at ≤300 copies/ml

Table 4

Odds of response based on generalised estimating equation for treatment response at ≤400 copies/ml

Table 5

Odds of response based on generalised estimating equation for treatment response at ≤300 copies/ml

The results show that the coefficient on the interaction term for treatment and baseline log-PCR (ETV*PCR) is negative for both threshold values (tables 2 and 3), with different levels of statistical significance. The OR for the interaction term for response at ≤400 copies/ml was 0.94, which represents the multiplicative factor by which the ratio of the predicted odds of response for ETV to the predicted odds for LAM changes when the baseline log-PCR increases by one unit. In other words, when the baseline log-PCR increases by one unit from its mean value, the OR of response for ETV compared to LAM decreases by 5.6% (ie, (1–OR)*100, where the OR is 0.94 after rounding). The analysis at ≤300 copies/ml found a similar OR, that is, 0.95. However, it should be noted that the interaction is only marginally significant at ≤400 copies/ml (p=0.047) and does not reach the conventional 5% significance level at ≤300 copies/ml (p=0.096). These results show that there is evidence (albeit relatively weak) to suggest that the treatment effect may be moderated by the level of baseline log-PCR.
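The multiplicative interpretation of the interaction OR can be checked with a short calculation. The sketch below uses the rounded OR of 0.94 reported in the text, which gives a decrease of about 6% per unit; the 5.6% quoted above presumably comes from the unrounded coefficient, which is not available here. The function names are hypothetical.

```python
def or_multiplier(interaction_or, delta):
    """Factor applied to the ETV vs LAM OR when the covariate rises by delta units."""
    return interaction_or ** delta


def percent_change(interaction_or, delta=1.0):
    """Percentage change in the ETV vs LAM OR for a delta-unit increase."""
    return (or_multiplier(interaction_or, delta) - 1.0) * 100.0


# Rounded interaction OR at the <=400 copies/ml threshold.
for delta in (1, 2, 3):
    print(delta, round(or_multiplier(0.94, delta), 4),
          round(percent_change(0.94, delta), 2))
```

Because the effect is multiplicative on the OR scale, a two-unit increase in baseline log-PCR scales the OR by 0.94 squared rather than by twice the one-unit percentage.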

The interaction between treatment and TIME in the GEE model is also statistically significant (p=0.00) for both response definitions. This indicates that the treatment effect is moderated by time-of-response measurement. The positive coefficient on the interaction variable suggests that when time-of-response measurement increases by 1 week, the OR of response for ETV compared to LAM increases by 3.9% at the ≤400 copies/ml threshold (3.5% at ≤300 copies/ml). This indicates that, while the odds of achieving undetectable levels of HBV DNA increase with time in both treatment groups, the rate of change in odds is higher for ETV than for LAM. We also evaluated the interaction between HBeAg status and treatment effect, which was found to be statistically non-significant in both analyses. This result shows that there is no evidence to suggest that the OR of response for ETV compared to LAM is moderated by HBeAg status.
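Since the time variable is centred at week 48, the weekly interaction OR compounds backwards from the week-48 treatment effect. The sketch below illustrates this with the reported interaction OR of 1.039 (≤400 copies/ml); the week-48 OR of 4.0 is a hypothetical placeholder, because the main-effect estimate is reported only in the tables, and the function name is an assumption.

```python
def etv_vs_lam_or(week, or_at_week_48, interaction_or_per_week=1.039):
    """ETV vs LAM OR at a given week, for a patient at the mean baseline values.

    TIME is centred at 48 weeks, so the exponent is (week - 48):
    zero at week 48 and negative at earlier visits.
    """
    return or_at_week_48 * interaction_or_per_week ** (week - 48)


# Hypothetical week-48 OR of 4.0, evaluated at the three observed visits.
for week in (24, 36, 48):
    print(week, round(etv_vs_lam_or(week, or_at_week_48=4.0), 3))
```

Under these assumptions the relative effect at week 48 is roughly 2.5 times the relative effect at week 24 (1.039 raised to the power 24), which is why trials reporting response at different time points may not be directly comparable on the OR scale.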


This study evaluated the effect of interaction between treatment received and the baseline characteristics and time-of-response measurement in predicting the odds of treatment response (defined as undetectable HBV DNA levels at threshold values of 400 and 300 copies/ml). The variables included the log of baseline HBV DNA level, the time-of-response measurement (in weeks) and HBeAg status. While the qualitative relationship between time, baseline viral load and response has been documented in the literature,13 the key aim of the analysis was to identify and quantify the interaction effects that may act as treatment-effect modifiers which, in turn, may be useful to adjust for baseline differences in future meta-analyses.

The analyses presented in this paper showed that there is strong evidence to suggest that time-of-response measurement may act as a treatment-effect modifier. This suggests that time of measurement should be taken into account during NMA when clinical trials with different periods of patient follow-up are included in the analysis. Our analysis also found weak evidence to suggest that the baseline HBV DNA level may interact with treatment effect. This is a potentially important finding, suggesting that future network meta-analyses should evaluate the need to account for differences in baseline disease activity in CHB patients. We also evaluated the main (non-interactive) impact of baseline HBV DNA level on odds of response in the models. The coefficient suggests that patients with more severe or advanced levels of disease activity before treatment initiation (as measured by baseline viral load) are less likely to achieve response (as measured by response viral load) at any given time point, that is, higher baseline disease activity predicts worse response. This is in line with earlier findings.

We did not find any significant interaction between treatment-effect and HBeAg status. However, HBeAg status was found to be a significant predictor as a main effect suggesting that patients with positive HBeAg status are less likely to achieve response after controlling for baseline HBV DNA and time-of-response measurement.

Reimbursement agencies, such as NICE, typically make decisions based on evidence regarding relative treatment effects based on clinical efficacy data on endpoints, such as the level of response achieved after treatment for a certain period of time, as a surrogate for clinical efficacy.14 From the perspective of a reimbursement agency, the key finding of this analysis is that the time-of-response measurement and baseline disease activity may act as treatment-effect modifiers, suggesting that when these variables are not distributed in a balanced way across trials, there is potential for confounding bias in the resulting meta-analysis estimates. Hence, analysts should evaluate differences in patient-level and study-level characteristics that may act as treatment-effect modifiers while undertaking direct, indirect and mixed treatment comparisons to allow fair ‘like-for-like’ comparisons to be made. However, it should be noted that the within-trial interaction analysis provides an estimate of the magnitude of potential treatment-effect modifier rather than proving beyond doubt that these interactions exist at the study level.

None of the mixed treatment comparisons of CHB treatments published to date adjusted for patient baseline characteristics. If heterogeneity or treatment modifiers across clinical trials are not accounted for, then meta-analysis may produce biased estimates in favour of treatment(s) that had patients with relatively longer time-of-response measurement and less severe disease activity in the trials (ie, lower baseline risk). Furthermore, when such evidence is incorporated in an economic model, such as those developed by Dakin et al15 or Veenstra et al,16,17 the cost-effectiveness estimates may be biased against interventions that were studied in patients with more severe disease activity.

Strengths and weaknesses of the analyses

The study used individual patient level data from two longitudinal randomised trials over a period of 48 weeks. The statistical models took account of the longitudinal nature of the data and the correlation between repeated measures. Interaction effects that may act as treatment-effect modifiers were evaluated and quantified in this study.

One potential limitation of the study is that response HBV DNA levels were observed at a relatively small number of discrete time points, that is, 24, 36 and 48 weeks. This may have implications for the statistical power of the study to detect significant effects. Another limitation is that the relationship between treatment effect and patient-level baseline characteristics was explored only in the trials of ETV versus LAM in this study, and as such, we were not able to explore treatment-effect modifiers in other interventions such as adefovir and tenofovir. It is difficult to anticipate whether or not such interaction effects will be observed in other CHB studies. However, we do not know of any reasons why the observed relationship may not hold for CHB patients receiving other treatments. Nevertheless, we would recommend that such analyses are repeated in other randomised trials to ascertain the validity of these findings. Finally, the analysis assumed that the unobserved patient characteristics in this study did not directly influence the observed relationship between treatment effect and baseline characteristics.

Implications for evidence synthesis

This study found that the time-of-response measurement and the level of baseline disease activity may act as treatment-effect modifiers in CHB patients. From the perspective of evidence synthesis, this study identifies an important issue, that is, treatment-effect modifiers may impact on the comparisons made across several studies in meta-analyses. It highlights the need to explore the impact of baseline characteristics and time-period imbalances between studies included in meta-analyses. When effect-modifier relationships are significant, comparisons across treatments may not be fair and could potentially bias the comparative treatment-effect estimates.
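As a rough illustration of what a between-trial imbalance could mean, the sketch below converts a difference in mean baseline viral load expressed in log10 copies/ml into natural-log units (the scale of the baseline covariate in the model) and applies the interaction OR. It assumes, purely for illustration, that the within-trial interaction carries over to between-trial differences, which this analysis cannot prove; the function name is hypothetical.

```python
import math

LN10 = math.log(10.0)  # one log10 unit equals about 2.303 natural-log units


def modifier_factor(interaction_or_per_ln_unit, diff_log10):
    """Factor by which the ETV vs LAM OR would differ between two populations
    whose mean baseline viral loads differ by diff_log10 (log10 copies/ml)."""
    diff_ln = diff_log10 * LN10
    return interaction_or_per_ln_unit ** diff_ln


# Interaction OR of 0.94 per natural-log unit; imbalances of 0.5 and 1 log10.
for diff in (0.5, 1.0):
    print(diff, round(modifier_factor(0.94, diff), 3))
```

Under these assumptions, a 1 log10 copies/ml difference in mean baseline viral load corresponds to the relative treatment effect being scaled by a factor of roughly 0.87, which indicates the order of magnitude of adjustment that a covariate-adjusted meta-analysis might need to consider.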



  • Contributors SA performed all analyses and contributed to the preparation of the manuscript. SM and NH assisted SA in performing the analysis and assisted in the preparation of the manuscript. BL, SB, LM and PL assisted in identifying key parameters, validated all results and assisted in the preparation of the manuscript.

  • Funding This study was funded by Bristol-Myers Squibb.

  • Competing interests SA, SM and NH worked for Oxford Outcomes Ltd, which received funding from Bristol-Myers Squibb to conduct this analysis. BL and SB were Bristol-Myers Squibb employees during the conduct of the study, and received funding for consulting services on specific HEOR projects. PL received honoraria for advisory boards and speaking bureau for Bristol-Myers Squibb. LM received honoraria from Bristol-Myers Squibb for advisory boards, but not in the field of hepatology or infectious diseases.

  • Provenance and peer review Not commissioned; externally peer reviewed.

  • Data sharing statement No additional data are available.
