Abstract
Objectives Randomised controlled trials report group-level treatment effects. However, an individual patient confronting a treatment decision needs to know whether that person's expected treatment benefit will exceed the expected treatment harm. We describe a flexible model for individualising a treatment decision. It individualises group-level results from randomised trials using clinical prediction guides.
Methods We constructed models that estimate the size of individualised absolute risk reduction (ARR) for the target outcome that is required to offset individualised absolute risk increase (ARI) for the treatment harm. Inputs to the model include estimates for the individualised predicted absolute treatment benefit and harm, and the relative value assigned by the patient to harm/benefit. A decision rule recommends treatment when the predicted benefit exceeds the predicted harm, value-adjusted. We also derived expressions for the maximum treatment harm, or the maximum relative value for harm/benefit, above which treatment would not be recommended.
Results For the simpler model, including one kind of benefit and one kind of harm, the individualised ARR required to justify treatment was expressed as required ARR_{target(i)}=ARI_{harm(i)} × RV_{harm/target(i)}. A complex model was also developed, applicable to treatments causing multiple kinds of benefits and/or harms. We demonstrated the applicability of the models to treatments tested in superiority trials (either placebo or active control, either fixed harm or variable harm) and non-inferiority trials.
Conclusions Individualised treatment recommendations can be derived using a model that applies clinical prediction guides to the results of randomised trials in order to identify which individual patients are likely to derive a clinically important benefit from the treatment. The resulting individualised prediction-based recommendations require validation by comparison with strategies of treat all or treat none.
 STATISTICS & RESEARCH METHODS
This is an Open Access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 3.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited and the use is non-commercial. See: http://creativecommons.org/licenses/by-nc/3.0/
Article summary
Article focus

Randomised controlled trials provide relative group-level estimates of the beneficial and harmful effects of a treatment. However, the absolute size of those effects may vary across individuals according to their baseline risk.

Models have been described previously to individualise results of superiority placebo-control trials in the case of a variable benefit/fixed harm scenario.
Key messages

We provide a generalised model to individualise treatment recommendations. We start from the definition of the Clinically Important Difference: the size of treatment benefit that offsets the treatment harm, after adjusting for the patient's values.

The model applies to a variable benefit and a fixed or variable harm, and to superiority (placebo and active control) and non-inferiority trials. It can accommodate more than one kind of benefit and/or harm.

Clinical Prediction Guides are used to individualise the predicted risk of the target event and of the harm at trial entry.
Strengths and limitations of this study

Strengths: the model adopts an individual perspective and is flexible and timely. It allows the calculation of an individual's maximum predicted absolute risk increase for the treatment harm, or the maximum relative value for harm/target, that would overturn the treatment decision.

Limitations: economic costs are not modelled; uncertainty will exist for some quantities entering the model; the model awaits empirical validation.
Introduction
For questions of treatment and prevention, randomised controlled trials (RCTs) provide the most valid evidence concerning the benefits and, often, the harms of the intervention. However, RCTs typically report only group-level results, whereas treatment effects may depend importantly on characteristics of individual patients.
A clinical prediction guide (CPG)1–3 uses patient-specific risk data to predict the level of risk for a clinical outcome of interest for an individual patient. CPGs applied to participants in clinical trials can predict the individual patient's level of risk at trial entry (baseline risk (BLR)) for the target outcome at which the treatment is directed, and also for harm caused by the treatment. If the relative risk reduction for the target outcome (or relative risk increase for the harm) is constant across the range of BLR, then the absolute treatment effects can be predicted in individual patients: absolute risk reduction (ARR) for the target outcome (the treatment benefit) and absolute risk increase (ARI) for the treatment harm.
But what size of ARR for the target benefit is sufficiently large to justify acceptance of a treatment that carries with it the potential for benefit and harm? That depends on the frequency of the harm caused by treatment, and the relative importance of the harm compared to the benefit. The size of treatment benefit that is large enough to offset the treatment harm is the patient's clinically important difference (CID).
The concept of CID has been incorporated in several prior formulations: the threshold ARR (inverted, the threshold number needed to treat4), the threshold for agreeing with treatment,5 the decision threshold (inverted, the number willing to treat (NWT)).6 Each of these constructs embodies the concept that for treatment to be justified, the predicted treatment benefit must exceed the predicted harm for that individual.
Absolute treatment benefits vary directly with BLR for the target benefit: for an effective treatment, the higher the BLR, the greater the predicted absolute benefit. When modelling absolute treatment effects across individuals, the assumed model usually has incorporated a variable benefit, but a fixed harm.4–7 However, the absolute size of treatment harms may also vary across individuals, in which case a variable benefit/variable harm model would apply. The two models are illustrated in figure 1.
The objectives of this report were: (1) to derive an expression for CID that is flexible and applicable to either fixed or variable treatment harm and (2) to describe a generalised model for deriving a treatment recommendation based on CID, using group-level estimates of treatment effects provided by RCTs and CPGs for prediction of individualised absolute treatment benefit and harm.
Methods
We define CID as the size of benefit from the treatment that offsets the harm of the treatment. We define a benefit as the reduction of the occurrence of the target outcome, expressed as the negative outcome, for example, death, rather than the positive outcome, for example, survival. When the benefit is defined categorically, CID is the required ARR for the target outcome (ARR_{target}) obtained with the treatment compared with the control. The control can be no treatment (or placebo) or an active control. The model contains parameters for the predicted individualised treatment benefit, the predicted individualised treatment harm and the patient's values. The model accommodates more than one kind of benefit and more than one kind of harm. No economic cost, either direct or indirect, is included in the model.
When applied to individualise a treatment recommendation, the model provides an individualised required ARR_{target}. A decision rule recommends the treatment when the patient’s predicted ARR_{target} is greater than her required ARR_{target}.
Data requirements
Table 1 summarises the required quantities for entry into the model, distinguishing between group-level measures and individual-level predictions.
Group-level quantities
Most of the group-level quantities shown are used to generate individualised estimates. For treatment benefits, the required group-level measure is the relative risk reduction. The data source can be a meta-analysis of large RCTs or a single large RCT. The required group-level quantities for the harms depend on the type of harm, fixed or variable. For fixed harms, we use a group-level absolute quantity, the ARI (ARI_{harm}). For variable harms, we use a group-level relative quantity, the relative risk increase (RRI_{harm}). Whether fixed or variable, the estimate of the treatment effect for harms comes from a meta-analysis of large RCTs, a single large RCT or best available observational evidence. Values are entered as the relative value (RV) of the harm compared with the target benefit. A group-level RV_{harm/target} can be derived from formal utility-based analyses, patient groups or expert opinion.
Individual-level quantities
The individualised treatment benefit is expressed as ARR (ARR_{target(i)}). The individualised treatment harm is expressed as ARI (ARI_{harm(i)}). For BLR, we mean the risk in the reference group (the control arm in the trial), whether it is represented by patients on no treatment or placebo or by patients on an existing treatment. Table 2 summarises the role of CPGs in individualising predicted treatment benefits and harms.

Benefits modelling. The model allows the predicted ARR_{target(i)} to increase for increasing BLRs for the target outcome (BLR_{target(i)}), according to the equation: predicted ARR_{target(i)}=RRR_{target}×BLR_{target(i)}. The group-level RRR_{target} is assumed to be constant across different BLRs. The BLR_{target(i)} for the target benefit is estimable using a validated CPG.

Harms modelling. In the case of a fixed harm, the group-level estimate (ARI_{harm(trial)}) is used for the predicted ARI_{harm(i)}. No CPG is needed to predict an individualised harm. When the receipt of the treatment per se is modelled as a fixed harm (as with risk/discomfort), that harm is experienced by every treated patient, so the ARI_{harm(trial)} for the harm is constantly equal to 1.0 (100%). In the case of a variable harm across patients, the predicted ARI_{harm(i)} is calculated by multiplying the group-level RRI_{harm} by the individualised BLR for that harm (BLR_{harm(i)}). The group-level RRI_{harm} is assumed to be constant across different BLRs. The BLR_{harm(i)} can be estimated using a validated CPG.

Values modelling. An individual RV (RV_{harm/target(i)}) assigned by the patient enters the model. We recognise that an RV_{harm/target(i)} may not be ascertained reliably. Therefore, we modelled a range of RVs centred on a group-level RV.
When more than one benefit and/or more than one harm is included, for each benefit and harm the specific ARR_{i}/ARI_{i}/RV_{i} is separately calculated or assigned as above.
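The individual-level calculations described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function names and all numeric inputs are hypothetical, and risks are expressed as proportions.

```python
def predicted_arr(rrr_target: float, blr_target: float) -> float:
    """Predicted ARR_target(i) = RRR_target x BLR_target(i)."""
    return rrr_target * blr_target


def predicted_ari(blr_harm=None, rri_harm=None, ari_trial=None):
    """Predicted ARI_harm(i).

    Fixed harm: pass ari_trial; the group-level ARI applies to every patient.
    Variable harm: pass rri_harm and blr_harm; ARI = RRI_harm x BLR_harm(i).
    """
    if ari_trial is not None:
        return ari_trial
    if rri_harm is None or blr_harm is None:
        raise ValueError("a variable harm needs both rri_harm and blr_harm")
    return rri_harm * blr_harm


# Hypothetical inputs, for illustration only (not taken from any trial).
arr_i = predicted_arr(rrr_target=0.25, blr_target=0.20)
ari_variable = predicted_ari(blr_harm=0.02, rri_harm=0.5)
# Receipt of the treatment itself modelled as the harm: ARI is 1.0 for everyone.
ari_fixed = predicted_ari(ari_trial=1.0)
```

In practice, the baseline risks passed to these functions would come from a validated CPG rather than being entered by hand.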
Construction of models for individualising a treatment recommendation
We constructed two models: a simple model where there is one kind of treatment benefit and one kind of treatment harm; and a complex model where there is more than one kind of benefit and/or more than one kind of harm. In both cases, the model equation is solved for the required ARR_{target(i)} to offset the treatment harm(s), given the predicted ARI_{harm(i)} and RV_{harm/target(i)}. The same basic equation can then be used to calculate:

The maximum ARI_{harm(i)} above which treatment would not be justified, given the predicted ARR_{target(i)} and RV_{harm/target(i)}.

The maximum RV_{harm/target(i)} above which treatment would not be justified, given the predicted ARR_{target(i)} and ARI_{harm(i)}.
Results
Algebraic solution to the model
We derived the following equations to describe the model (see Appendix for algebraic details).
Simple model: one kind of treatment benefit, one kind of treatment harm
Required ARR_{target(i)}
The required size of the ARR_{target} that offsets the treatment harm, value-adjusted, for the patient i can be calculated as (Appendix section 1, equations (1) and (2)):
required ARR_{target(i)} = ARI_{harm(i)} × RV_{harm/target(i)} (m1)
The equation includes the particular condition of a fixed harm, when the ARI_{trial} can substitute for the ARI_{harm(i)}. When the treatment receipt is considered the harm, the ARI_{trial} is 1.0 and so the required ARR_{target(i)} is directly predictable from the RV_{harm/target(i)} as:
required ARR_{target(i)} = RV_{harm/target(i)} (m2)
Decision rule: In case of fixed harm and variable harm, the treatment would be justified for the patient i when:
predicted ARR_{target(i)} > required ARR_{target(i)} (d1)
Maximum ARI_{harm(i)} and maximum RV_{harm/target(i)}
The maximum ARI_{harm(i)} above which the treatment would not be justified for the patient i can be calculated as (Appendix equations (1) and (3)):
maximum ARI_{harm(i)} = predicted ARR_{target(i)} / RV_{harm/target(i)} (m3)
Decision rule: The treatment would not be justified for the patient i when:
predicted ARI_{harm(i)} > maximum ARI_{harm(i)} (d2)
where the predicted ARI_{harm(i)} can be fixed (=ARI_{trial}) or variable.
Similarly, the maximum RV_{harm/target(i)} above which the treatment would not be justified for the patient i can be calculated as (Appendix equations (1) and (4)):
maximum RV_{harm/target(i)} = predicted ARR_{target(i)} / predicted ARI_{harm(i)} (m4)
Decision rule: The treatment would not be justified for the patient i when:
RV_{harm/target(i)} > maximum RV_{harm/target(i)} (d3)
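As a worked illustration of the simple model, the sketch below takes the basic relation required ARR_{target(i)} = ARI_{harm(i)} × RV_{harm/target(i)} and rearranges it for the maximum ARI and maximum RV. The function names and the numeric inputs are hypothetical, chosen only to show the arithmetic.

```python
def required_arr(ari_harm: float, rv_harm_target: float) -> float:
    """Required ARR_target(i): the benefit needed to offset the value-adjusted harm."""
    return ari_harm * rv_harm_target


def max_ari(pred_arr: float, rv_harm_target: float) -> float:
    """Largest ARI_harm(i) at which the predicted benefit still offsets the harm."""
    return pred_arr / rv_harm_target


def max_rv(pred_arr: float, ari_harm: float) -> float:
    """Largest RV_harm/target(i) at which the predicted benefit still offsets the harm."""
    return pred_arr / ari_harm


def recommend_treatment(pred_arr: float, ari_harm: float, rv_harm_target: float) -> bool:
    """Decision rule: treat when predicted ARR exceeds required ARR."""
    return pred_arr > required_arr(ari_harm, rv_harm_target)


# Hypothetical patient: predicted ARR 4%, predicted ARI 5%, harm valued at 0.6 of the target event.
needed = required_arr(ari_harm=0.05, rv_harm_target=0.6)
treat = recommend_treatment(pred_arr=0.04, ari_harm=0.05, rv_harm_target=0.6)
ari_ceiling = max_ari(pred_arr=0.04, rv_harm_target=0.6)
rv_ceiling = max_rv(pred_arr=0.04, ari_harm=0.05)
```

For this hypothetical patient the required ARR is 3%, so the 4% predicted benefit justifies treatment. The decision would be overturned if the predicted ARI rose above about 6.7% or the patient's RV rose above 0.8.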
Complex model: multiple treatment benefits, multiple treatment harms
The model can be generalised to incorporate additional treatment benefits other than the reduction of the target outcome, and multiple harms, whether fixed or variable (Appendix section 2). A harm may have a fixed as well as a variable component. In that case, the fixed and variable components would be entered as separate harms, along with their separate RVs. The size of the ARR_{target}, which is required to offset the value-adjusted treatment harms and which accounts for other treatment benefits, is calculated for the patient i as (Appendix equations (5), (6) and (7)):
required ARR_{target(i)} = Σ_{j=1}^{k} (ARI_{harm(j)(i)} × RV_{harm(j)/target(i)}) − Σ_{j=2}^{m} (ARR_{benefit(j)(i)} × RV_{benefit(j)/target(i)}) (m5)
where k is the number of treatment harms, m is the total number of treatment benefits, and the benefit(2) to benefit(m) are the benefits other than the target benefit. Every RV_{(i)} is expressed as the value assigned to each outcome, prevented or caused by the treatment, compared with the value of the target benefit.
Decision rule: Similar to the case of only one benefit and one harm, the treatment would be justified for the patient i when predicted ARR_{target(i)} > required ARR_{target(i)} (d1)
The complex model can be used to predict the individualised maximum allowed ARI for a target harm and the maximum RV for the target benefit compared with a target harm, above which the treatment is not justified.
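The complex model can be sketched as follows. This reads equation m5 as the sum of the value-adjusted harms minus the sum of the value-adjusted additional benefits, which is our interpretation of the description above and not a transcription of the Appendix. All names and numbers are hypothetical.

```python
from typing import Iterable, Tuple

# Each effect is a pair: (absolute effect size, RV relative to the target benefit).
Effect = Tuple[float, float]


def required_arr_complex(harms: Iterable[Effect],
                         other_benefits: Iterable[Effect] = ()) -> float:
    """Required ARR_target(i) with k harms and m-1 benefits beyond the target."""
    harm_total = sum(ari * rv for ari, rv in harms)
    benefit_total = sum(arr * rv for arr, rv in other_benefits)
    return harm_total - benefit_total


# Hypothetical patient:
#   harm 1: variable harm, ARI 3%, valued at 0.6 of the target event
#   harm 2: receipt of treatment itself, ARI 1.0, valued at 0.005 of the target event
#   benefit 2: a secondary benefit, ARR 1%, valued at 0.3 of the target event
needed = required_arr_complex(
    harms=[(0.03, 0.6), (1.0, 0.005)],
    other_benefits=[(0.01, 0.3)],
)
treat = 0.04 > needed  # decision rule with a predicted ARR_target(i) of 4%
```

Entering a fixed and a variable component of the same harm as two separate tuples, each with its own RV, mirrors the handling described in the text.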
Applicability of the model
Theoretically, the model is applicable to every situation tackling the choice between two treatment strategies. Three examples are proposed to show the applicability of the model to individualised treatment recommendations: a superiority trial with a variable benefit/fixed harm scenario; a superiority trial with a variable benefit/variable harm scenario and the case of noninferiority trials.
Superiority trial: variable benefit, fixed harm. Rosuvastatin for primary prevention of cardiovascular events
The Justification for the Use of Statins in Prevention (JUPITER) trial8 evaluated the effect of rosuvastatin versus placebo for reduction of cardiovascular events in apparently healthy men and women with low-density lipoprotein cholesterol levels <3.4 mmol/L and elevated high-sensitivity C-reactive protein. The primary outcome was a composite of myocardial infarction, stroke, arterial revascularisation, hospitalisation for unstable angina or cardiovascular death. The group-level result showed a substantial relative benefit of rosuvastatin (HR 0.56, 95% CI 0.46 to 0.69). This is equivalent to an RRR_{target} of 0.44 (95% CI 0.31 to 0.54). Nevertheless, the individual's absolute benefit with rosuvastatin will vary according to her BLR (BLR_{i}). Validated CPGs exist to predict the BLR for cardiovascular events. The Framingham risk score,9 for example, predicts the risk of cardiovascular events at 10 years combining risk factors such as age, gender, smoking, total and high-density lipoprotein cholesterol levels, systolic blood pressure and hypertension. Dorresteijn et al6 used the group-level quantities provided by the JUPITER study and CPGs, either existing9,10 or newly developed,6 to individualise the predicted BLR_{i} and absolute effect of rosuvastatin at 10 years (ARR_{target(i)}) among JUPITER's participants. They found an approximate 20-fold variation in the predicted BLR_{target(i)}. Thus, the predicted ARR_{target(i)} varied from about 1% to 20% at 10 years, with a slightly different patient stratification depending on the CPG used. Dorresteijn and colleagues then evaluated the application of these individualised predictions to recommend the treatment. They defined the ‘Number Willing to Treat (NWT)’ as the number of patients one is willing to treat in exchange for the prevention of one target outcome event. Its inverse ratio (1/NWT) was defined as the ‘decision threshold’ and is equivalent to the required ARR_{target(i)} defined for our model.
They considered that the treatment receipt per se constituted the harm (fixed harm). Thus, the required ARR_{target(i)} (ie, 1/NWT) equalled the RV_{harm/target(i)} (m_{2}). They examined how the treatment recommendations varied across a range of hypothetical values for NWT.
Superiority trial: variable benefit, variable harm. Warfarin to prevent cardioembolic events in patients with atrial fibrillation
Six RCTs compared warfarin versus placebo/no treatment in patients with non-valvular atrial fibrillation to reduce the occurrence of stroke and systemic cardioembolism. Hart et al11 meta-analysed those RCTs and found a pooled RRR for cardioembolic events (RRR_{stroke}) of 0.64, or 64% (95% CI 0.49 to 0.74). On the other hand, warfarin was associated with a pooled RRI for major extracranial bleeding (RRI_{bleed}) of 1.3, or 130% (95% CI 0.08 to 3.89; note: Hart et al11 included the intracranial haemorrhages among the strokes in the efficacy analyses).
Several CPGs to predict the risk of stroke and bleeding have been developed and validated in patients with atrial fibrillation. Using the individual predictions for the BLR for stroke (BLR_{stroke(i)}) and for bleeding (BLR_{bleed(i)}), the absolute beneficial effect and also the absolute adverse effect with warfarin can be individualised as ARR_{stroke(i)}=RRR_{stroke}×BLR_{stroke(i)} and ARI_{bleed(i)}=RRI_{bleed}×BLR_{bleed(i)}, respectively. As an example, for the prediction of the BLR_{stroke(i)}, we adopted the CHADS_{2} score developed on patients off anticoagulation.12 For the prediction of the BLR_{bleed(i)}, we adopted the HEMORR_{2}HAGES score.13 Since the HEMORR_{2}HAGES score was developed on patients on warfarin,13 the corresponding BLR_{bleed(i)} off warfarin was calculated by dividing the predicted risk on warfarin by 2.3, which is the reported relative risk for major bleeding for warfarin compared with placebo.11 The results are shown in table 3. The predicted ARR_{stroke(i)} varied from 1.22% to 11.65%/year and the predicted ARI_{bleed(i)} varied from 1.07% to 6.95%/year. Comparing the individualised predictions for the benefit and the harm, value-adjusted, we then obtained individualised treatment recommendations for warfarin. A range of plausible values of the RV_{bleed/stroke} was examined.
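The arithmetic behind these ranges can be sketched as below. The RRR, RRI and the divisor 2.3 come from the text. The baseline risks (in %/year) are assumptions introduced here for illustration; they were chosen because they reproduce the quoted extremes, and should be checked against the original CPG publications before any reuse.

```python
RRR_STROKE = 0.64            # pooled relative risk reduction for stroke (from the text)
RRI_BLEED = 1.3              # pooled relative risk increase for major bleeding (from the text)
RR_BLEED_ON_WARFARIN = 2.3   # relative risk used to convert on-warfarin to off-warfarin risk


def arr_stroke(blr_stroke_pct: float) -> float:
    """Predicted ARR_stroke(i) in %/year."""
    return RRR_STROKE * blr_stroke_pct


def ari_bleed(bleed_risk_on_warfarin_pct: float) -> float:
    """Predicted ARI_bleed(i) in %/year, from a bleeding risk predicted on warfarin."""
    blr_bleed_off_warfarin = bleed_risk_on_warfarin_pct / RR_BLEED_ON_WARFARIN
    return RRI_BLEED * blr_bleed_off_warfarin


# Assumed lowest and highest baseline risks (%/year); illustrative, not quoted in the text.
lowest_arr, highest_arr = arr_stroke(1.9), arr_stroke(18.2)
lowest_ari, highest_ari = ari_bleed(1.9), ari_bleed(12.3)
```

Under these assumed inputs the predictions round to the ranges reported above for ARR_{stroke(i)} and ARI_{bleed(i)}.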
Required ARR_{stroke(i)} to justify warfarin
To justify warfarin, the predicted ARR_{stroke(i)} should be greater than the required ARR_{stroke(i)} (d_{1}), that is, greater than ARI_{bleed(i)}×RV_{bleed/stroke(i)} (m_{1}). Table 3 summarises the results of the application of the model to individualise warfarin recommendation in a hypothetical patient population, according to the co-classification of patients based on the CHADS_{2} and HEMORR_{2}HAGES scores. We arbitrarily chose a group-level RV for bleed/stroke of 0.6, an RV calculated from a lost-utility analysis over a 10-year time frame.4 Table 3 shows the resulting treatment decisions for each of the 42 cells formed according to the CHADS_{2} and HEMORR_{2}HAGES scores.
As a base case, the table was obtained using for RRR_{stroke} the point estimate (0.64).11 Since a treatment is accepted as superior compared with placebo/no treatment only when the upper bound of the 95% CI for the relative risk for the target outcome is below 1, we repeated the example using for RRR_{stroke} a value of 0.49 (corresponding to the upper bound for RR_{stroke} 0.51). In that case, the predicted ARR_{stroke(i)} is reduced and slightly fewer patients would be recommended for treatment. For example, a CHADS_{2} 3 and HEMORR_{2}HAGES 4 patient would now not be recommended for warfarin treatment (results not shown). We also repeated the example, using for RRI_{bleed} a value of 3.89, corresponding to the upper bound of the 95% CI for RRI_{bleed}. Now, considerably fewer patients would be recommended for treatment (data not shown). The major differences in who would be recommended for treatment arise primarily from the great uncertainty in the estimate for ARI_{bleed} in this example. We caution that table 3 is presented only as a framework for presenting particularised treatment recommendations in a variable benefit/variable harm scenario. The recommendations shown there are based only on point estimates, and should not be accepted without taking into account the uncertainties in the estimates for ARR_{stroke} and ARI_{bleed} in deriving the treatment recommendations.
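The sensitivity analysis described above can be sketched for a single patient. The baseline risks used here (5.9%/year for stroke and 10.4%/year for bleeding on warfarin) are assumed illustrative values for a patient with intermediate scores; they are not quoted in the text. When the upper bound for RRI_{bleed} is applied, the off-warfarin baseline bleeding risk is held unchanged, which is a further simplifying assumption.

```python
RV_BLEED_STROKE = 0.6  # group-level relative value used in the text


def recommend(rrr_stroke, rri_bleed, blr_stroke, blr_bleed_off, rv=RV_BLEED_STROKE):
    """Return (treat?, predicted ARR, required ARR), all risks in %/year."""
    predicted = rrr_stroke * blr_stroke
    required = rri_bleed * blr_bleed_off * rv
    return predicted > required, predicted, required


# Assumed baseline risks for one illustrative patient (not taken from the text).
BLR_STROKE = 5.9
BLR_BLEED_OFF = 10.4 / 2.3  # on-warfarin risk converted to off-warfarin risk

base_case = recommend(0.64, 1.3, BLR_STROKE, BLR_BLEED_OFF)   # point estimates
low_benefit = recommend(0.49, 1.3, BLR_STROKE, BLR_BLEED_OFF)  # conservative bound for RRR_stroke
high_harm = recommend(0.64, 3.89, BLR_STROKE, BLR_BLEED_OFF)   # upper bound for RRI_bleed

for label, (treat, predicted, required) in [
    ("base case", base_case),
    ("RRR_stroke = 0.49", low_benefit),
    ("RRI_bleed = 3.89", high_harm),
]:
    print(f"{label}: predicted ARR {predicted:.2f}, required ARR {required:.2f}, treat={treat}")
```

With these assumed inputs, the recommendation is to treat at the point estimates and not to treat under either bound, which illustrates how sensitive a borderline recommendation is to imprecision in the group-level estimates.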
Maximum ARI_{bleed(i)} above which warfarin would not be justified
Figure 2 shows how the maximum ARI_{bleed(i)} (m_{3}) varies according to the different CHADS_{2} scores and different values of RV_{bleed/stroke(i)} centred on a group-level RV_{bleed/stroke} of 0.6.
Maximum RV_{bleed/stroke(i)} above which warfarin would not be justified
Similarly, given the CHADS_{2} and the HEMORR_{2}HAGES scores of the patient, the model can calculate the maximum RV_{bleed/stroke(i)} (m_{4}): if the patient assigns an RV_{bleed/stroke} higher than this maximum, warfarin would not be justified. The variation of the maximum RV_{bleed/stroke(i)} according to the different CHADS_{2} and HEMORR_{2}HAGES scores is depicted in figure 3.
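A minimal sketch of the two threshold calculations follows, obtained by rearranging required ARR = ARI × RV. The baseline risks and the range of RVs around 0.6 are assumptions for illustration and do not reproduce figures 2 and 3.

```python
RRR_STROKE = 0.64
RRI_BLEED = 1.3

# Assumed baseline risks in %/year for one illustrative patient (not from the text).
blr_stroke = 5.9
blr_bleed_off = 10.4 / 2.3

predicted_arr_stroke = RRR_STROKE * blr_stroke
predicted_ari_bleed = RRI_BLEED * blr_bleed_off

# Maximum ARI_bleed(i) for a hypothetical range of RVs centred on 0.6.
rv_range = [0.3, 0.6, 0.9]
max_ari_by_rv = {rv: predicted_arr_stroke / rv for rv in rv_range}

# Maximum RV_bleed/stroke(i), given the predicted benefit and predicted harm.
max_rv_value = predicted_arr_stroke / predicted_ari_bleed

for rv, ceiling in max_ari_by_rv.items():
    print(f"RV {rv}: warfarin not justified if ARI_bleed exceeds {ceiling:.2f} %/year")
print(f"warfarin not justified if the patient's RV exceeds {max_rv_value:.2f}")
```

The more heavily the patient weighs a bleed relative to a stroke, the lower the bleeding risk that can be tolerated, so the maximum ARI falls as the RV rises.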
Individualising recommendations for a non-inferior treatment
Application of model to non-inferiority trials
The objective of a non-inferiority trial is to show that the effect of a new treatment on a target outcome is not worse, compared with an established effective treatment (EET), by more than a prespecified margin. This ‘non-inferiority margin’ is the maximum loss of efficacy that is considered acceptable in exchange for a hypothesised reduction in harm, value-adjusted. At the design phase, the non-inferiority margin is expressed as either an absolute or relative increase in the target event rate. A group-level RV_{harm/benefit} is at least implicit when setting the specified margin. When interpreting the results of a non-inferiority trial at the group level, the CI for the observed treatment effect on the target outcome is compared with the non-inferiority margin. If the bound of the CI that reflects the maximal estimate for inferiority is less than the margin (does not ‘cover’ the margin), then it is concluded that the new treatment is non-inferior to EET.
In non-inferiority trials, the CID for a patient can be expressed as the required reduction of the harm which exactly compensates for the allowed increase of the target outcome, value-adjusted. Thus, for application to non-inferiority trials, the equation m_{1} can be rewritten as: required ARR_{harm(i)} = ARI_{target(i)} / RV_{harm/target(i)}
Individualisation of the results of a trial demonstrating group-level non-inferiority
We individualise group-level results by using CPGs, as applicable (table 2), to predict BLR_{(i)} and thereby absolute treatment effects on the target outcome (ARI_{target(i)}) and the treatment harm (ARR_{harm(i)}). ARI_{target(i)} is derived as BLR_{target(i)} × RRI_{trial}. ARR_{harm(i)} is derived as BLR_{harm(i)} × RRR_{trial}. We value-adjust the treatment harm for the RV_{harm/target(i)}. We then compare the individualised predictions of treatment effects on the target outcome and on the harm to derive individualised treatment recommendations. A recommendation to treat with the non-inferior therapy would result when the predicted reduction in harm, value-adjusted, exceeds the predicted loss of efficacy, that is, when ARR_{harm(i)} × RV_{harm/target(i)} > ARI_{target(i)} (or, holding the same terminology as for superiority trials, when predicted ARR_{harm(i)} > required ARR_{harm(i)}).
To examine the worst case, we then repeat the comparison of reduction in harm and loss of efficacy by calculating ARI_{target(i)} using not the point estimate for RRI_{trial} but the bound of its CI that reflects the maximal inferiority of the new treatment.
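The non-inferiority decision rule, including the worst-case check, can be sketched as below. The rule ARR_{harm(i)} × RV_{harm/target(i)} > ARI_{target(i)} is taken from the text; the function name and every numeric input are hypothetical.

```python
def recommend_noninferior(blr_target, rri_target, blr_harm, rrr_harm, rv_harm_target):
    """Return (treat?, predicted ARR_harm, required ARR_harm) for the new treatment.

    ARI_target(i) = BLR_target(i) x RRI_trial  (predicted loss of efficacy)
    ARR_harm(i)   = BLR_harm(i)   x RRR_trial  (predicted reduction in harm)
    """
    ari_target = blr_target * rri_target
    arr_harm = blr_harm * rrr_harm
    required_arr_harm = ari_target / rv_harm_target
    return arr_harm > required_arr_harm, arr_harm, required_arr_harm


# Hypothetical patient and hypothetical trial results (proportions, not percentages).
patient = dict(blr_target=0.05, blr_harm=0.04, rv_harm_target=0.5)

# Point estimate: the new treatment increases the target event by a relative 5%.
point_estimate = recommend_noninferior(rri_target=0.05, rrr_harm=0.5, **patient)

# Worst case: the CI bound reflecting maximal inferiority, a relative increase of 25%.
worst_case = recommend_noninferior(rri_target=0.25, rrr_harm=0.5, **patient)
```

For this hypothetical patient, the new treatment is recommended at the point estimate but not under the worst-case bound, so the recommendation would hinge on how much weight is given to the imprecision of RRI_{trial}.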
Discussion
We presented an extension of the previously described models to individualise treatment recommendations, based on the use of CPGs to predict individual-level treatment effects, adjusted for the relative importance assigned by the patient to different outcomes.
Strengths
The adoption of an individual-level perspective represents the fundamental feature of the model. The individualising process requires the conversion of group-level into individual-level treatment effects and the use of the patient's values.14 The model presented here is more flexible than models for individualising treatment recommendations described previously.4,5 Either a fixed or a variable harm is accommodated in our model. LaHaye et al15 developed a decision aid specifically designed to individualise antithrombotic therapy in patients with atrial fibrillation that included a variable benefit/variable harm scenario and also the patient's RV_{bleed/stroke}. However, they did not explicitly conceptualise and generalise the underlying model. We showed the adaptability of our model to treatments causing multiple kinds of benefits and harms, as well as to non-inferiority trials. The concepts of the maximum ARI_{harm} and maximum RV_{harm/benefit} that would overturn the clinical decision had not been developed previously. The model is timely, given the increasing number of very large RCTs providing precise group-level estimates of treatment harms as well as treatment benefits, and the recent rapid rise in validated CPGs, catalogued and searchable in EvidenceUpdates,3 which makes the individualisation of those group-level quantities more feasible.
Limitations
In our model, we did not include economic costs, either direct or indirect. Like clinical benefits and harms, economic costs can be fixed or variable across patients. This raises the question of whether a group-level cost-effectiveness analysis of a treatment can be individualised.16 A step in that direction is to apply prognostic models to particularise group-level information on cost-effectiveness according to the predicted risk and patient subgroup.17 Our model provides a method for individualising the consequences of treatment. However, analyses of incremental cost-effectiveness or cost-utility at the individual level are constrained at present by the lack of reliable individualised data on the incremental direct and indirect costs of treatment.
Use and appropriateness of CPGs for individualising recommendations
We generically explained why, how and when model building requires the use of CPGs. CPGs are developed for different purposes. A particular application of a CPG is to individualise risk predictions in the control group of an RCT. There are some desirable features of a CPG for this specific application. In box 1, we provide an aid to guide the user in the search for and the evaluation of an appropriate CPG for individualising the group-level results of the RCT of interest.
Box 1 How to use a clinical prediction guide (CPG) on risk prediction to individualise the results of an RCT
Relevance
Will the CPG help me in making individualised risk predictions for patients in the control group of the randomised controlled trial (RCT) of interest?

Were the patients on whom the CPG was developed or validated similar to the RCT's control group in regard to their clinical characteristics?

Does the treatment status of the patients on whom the CPG was developed match that of the RCT's control group, that is, each on no treatment or placebo; each on established effective therapy?

Does the CPG provide the absolute risk (or is it at least derivable) for the outcome of interest (target event or harm), in a specified period of time, according to risk factors/risk score?
Validity
Are the predictions made by the CPG valid?

How was the CPG developed?
Was the CPG developed on a well-defined and representative sample of patients prospectively followed up?

How well did the CPG perform in the population of derivation?
Was the CPG's calibration tested? How accurate were the predictions of the absolute risk, that is, how good was the agreement between predictions and observed outcome?
Were the CPG's discrimination (c-statistic) and reclassification tested? How good were they?

Did the CPG undergo internal validation to quantify and eventually adjust for overfitting/optimism?

Did the CPG undergo external validation?

Was the CPG's performance tested in patients different from those on whom it was developed? How good was it?

Precision
How precise were the predictions of the absolute risk, that is, how wide was the uncertainty around the provided estimates?
In the case of a variable benefit/variable harm, we look for two different CPGs to classify the patients according to the ‘baseline’ risks for the target event and for the harm. In this case, the predictions resulting from this co-classification might be constrained by a possible within-patient correlation between the two variable risks, since the target event and the harm may share some risk factors or may not be independent outcomes.
Uncertainty in group-level estimates and patient values
The results of an RCT are usually provided as point estimates accompanied by a measure of variability (CI). Often, as shown in the example in table 3, the within-trial estimates for the harm have been characterised by high imprecision. However, this situation may be improving with the increasing reports of very large active-control RCTs.18
Probably the major source of uncertainty is the patient's RV_{harm/benefit} and its elicitation. The scenario presented to the patient should uniformly include the major clinical outcomes of the treatment decision, including death if relevant, and the time frame of the consequences of the decision. Decision aids, which are tools specifically designed to prepare the patient to participate in the decision process, have been shown to improve patient knowledge and involvement, especially when they target explicit values clarification.19
One may embed in the calculation of the individual quantities a measure of the variance (eg, SE) of the group-level measures entering the model.20 Additionally, one may estimate how much that uncertainty can affect the individual predictions in the most pessimistic direction, that is, using the CI bounds for the group-level estimate of the treatment effect on target corresponding to the worst scenario. We proposed an alternative approach to deal with the uncertainty around the quantities entering the model. We provided formulas for estimating the individualised maximum ARI_{harm} and RV_{harm/benefit} above which the decision to treat would be overturned.
Future research objectives

Resolution of uncertainty. In applying our model, methods are needed to resolve uncertainty arising from imprecision in the estimates of treatment benefit and treatment harm derived from grouplevel results from RCTs. In this paper, we addressed uncertainty by resorting to sensitivity analyses utilising bounds of CIs on treatment effects. However, in the field of costeffectiveness analysis, investigators progressed to approaches dealing simultaneously with the stochastic uncertainty of all the quantities entering the model. These approaches include nonparametric bootstrapping, Fieller's theorem and Bayesian methods.21 We suggest as a future goal that such methods be explored for their applicability to resolution of uncertainty in clinical harm/clinical benefit analyses.

Net benefit and model validation. Vickers et al5 conceived a method to test empirically whether individualised recommendations based on CPG-based predictions of absolute treatment effects, value-adjusted, would actually result in a greater net benefit in real life than a policy of treating all patients or treating none. The method uses the distribution of predicted individualised treatment effects in the randomly allocated treatment and control groups of a large RCT. One combines the patients randomised to the treatment group whose predicted ARR_{target(i)} exceeded their required ARR_{target(i)} with the patients randomised to the control group whose predicted ARR_{target(i)} did not exceed their required ARR_{target(i)}. These are the patients who would and would not, respectively, be recommended for treatment, and whose randomly allocated treatment coincided with the prediction-based recommendation. One then compares the observed outcomes of that combined group with the outcomes for the treatment arm of the RCT (treat all) and for the control arm (treat none). The superiority of the prediction-based policy is validated if its net benefit is greater than the net benefit of both treat all and treat none. The empirical result, in the examples of Vickers et al5 and later Dorresteijn et al,6 was that a prediction-based policy was superior, but only within a limited range of the required ARR_{target(i)}. If the required ARR_{target(i)} was extreme in either the low or high direction, a policy of treat all or treat none, respectively, was preferred.
Vickers et al5 and Dorresteijn et al6 used this approach to validate individualised recommendations in a fixed harm scenario, where the harm was receipt of the treatment per se. Nevertheless, the same approach can be used to validate individualised recommendations in variable harm scenarios, and for treatments tested in non-inferiority as well as superiority trials. As with Vickers' method in general, individual-patient trial data must be available to identify the patients whose predicted ARR_{target(i)} did or did not exceed their required ARR_{target(i)}.
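The logic of this validation can be illustrated with a simplified Python sketch. It is not the published computation of Vickers et al: it assumes a fixed harm scenario with a single required ARR for all patients, 1:1 randomisation (so that the event rate in the combined group estimates the event rate under the prediction-based policy), and net benefit defined as the reduction in event rate relative to treat none minus the proportion treated multiplied by the required ARR. The `Patient` record, the function names and the toy data are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Patient:
    treated: bool         # randomly allocated arm (True = treatment)
    predicted_arr: float  # individualised predicted ARR for the target outcome
    event: bool           # target outcome observed during follow-up


def event_rate(patients: List[Patient]) -> float:
    if not patients:
        raise ValueError("cannot compute an event rate for an empty group")
    return sum(p.event for p in patients) / len(patients)


def net_benefits(patients: List[Patient], required_arr: float) -> Dict[str, float]:
    """Net benefit of treat none, treat all and the prediction-based policy."""
    control = [p for p in patients if not p.treated]
    treatment = [p for p in patients if p.treated]
    # Combined group: allocated arm coincides with the recommendation.
    combined = [p for p in patients if p.treated == (p.predicted_arr > required_arr)]
    frac_recommended = sum(p.predicted_arr > required_arr for p in patients) / len(patients)
    rate_none = event_rate(control)
    return {
        "treat_none": 0.0,
        "treat_all": (rate_none - event_rate(treatment)) - required_arr,
        "prediction_based": (rate_none - event_rate(combined))
        - frac_recommended * required_arr,
    }


# Toy data, invented for illustration only.
trial = [
    Patient(True, 0.10, False), Patient(True, 0.10, False),
    Patient(False, 0.10, True), Patient(False, 0.10, True),
    Patient(True, 0.01, False), Patient(True, 0.01, True),
    Patient(False, 0.01, False), Patient(False, 0.01, True),
]
result = net_benefits(trial, required_arr=0.05)
```

With these invented data the prediction-based policy has the largest net benefit. Repeating the calculation over a range of `required_arr` values would show where treat all or treat none becomes preferable.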
Acknowledgments
We thank Dr Brian Haynes, Nancy Wilczynski and Dr Alfonso Iorio at the Health Information Research Unit, McMaster University Health Sciences, for stimulating discussions and support.
Appendix: Algebraic derivation of the models
Legend:
Target = target outcome that the treatment can prevent
Harm = any increase of an adverse outcome due to the treatment
CID = clinically important difference
ARR = absolute risk reduction
ARI = absolute risk increase
V = value
RV = relative value
1. Derivation of the simple model (one benefit, one harm)
The CID corresponds to the ARR for the target benefit sufficiently large to exactly offset the treatment harm. Allowing for a different value assigned to the target outcome prevented by the treatment and to the harm caused by the treatment (V_{target} and V_{harm}, respectively), the condition at the CID can be expressed algebraically as:
ARR_{target} × V_{target} = ARI_{harm} × V_{harm} (1)
1.1. Algebraic solution for the required ARR_{benefit} to offset the treatment harm
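Dividing each side of equation (1) by V_{target}:
required ARR_{target} = ARI_{harm} × V_{harm}/V_{target} = ARI_{harm} × RV_{harm/target} (2)
where RV_{harm/target} = V_{harm}/V_{target}.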
1.2. Algebraic solution for the maximum ARI_{harm} above which treatment would not be justified
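Dividing each side of equation (1) by V_{harm}:
maximum ARI_{harm} = ARR_{target} × V_{target}/V_{harm} (3)
or, expressed in terms of RV_{harm/target} (that is, V_{harm}/V_{target}):
maximum ARI_{harm} = ARR_{target}/RV_{harm/target}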
1.3. Algebraic solution for the maximum RV_{harm/target} above which treatment would not be justified
Dividing each side of equation (2) by ARI_{harm}:
maximum RV_{harm/target} = ARR_{target}/ARI_{harm} (4)
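The three solutions of the simple model can be sketched in Python, starting from the relation reported in the Results (required ARR_{target} = ARI_{harm} × RV_{harm/target}). The function names and the example values are hypothetical and serve only to show how the quantities relate.

```python
def required_arr_target(ari_harm: float, rv_harm_target: float) -> float:
    """ARR for the target outcome needed to offset the value-adjusted harm."""
    return ari_harm * rv_harm_target


def max_ari_harm(arr_target: float, rv_harm_target: float) -> float:
    """Largest ARI for harm at which the predicted benefit still offsets the harm."""
    return arr_target / rv_harm_target


def max_rv_harm_target(arr_target: float, ari_harm: float) -> float:
    """Largest relative value of harm versus target at which treatment is still justified."""
    return arr_target / ari_harm


# Hypothetical patient: predicted ARR 6%, predicted ARI 2%,
# harm valued 1.5 times the target outcome.
arr, ari, rv = 0.06, 0.02, 1.5

needed = required_arr_target(ari, rv)   # about 0.03
ari_limit = max_ari_harm(arr, rv)       # about 0.04
rv_limit = max_rv_harm_target(arr, ari) # about 3

# All three comparisons express the same decision to treat.
treat = arr > needed
```

For this invented patient the predicted ARR exceeds the required ARR, the predicted ARI lies below its maximum, and the relative value lies below its maximum, so each view leads to the same recommendation.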
2. Derivation of the complex model (multiple benefits, multiple harms)
Legend:
Benefit = any reduction of an adverse outcome additional to the target outcome.
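At the CID, the sum of treatment benefits offsets the sum of treatment harms. Allowing for different values for every outcome prevented or caused by treatment, this can be expressed algebraically as:
ARR_{target} × V_{target} + ARR_{benefit(2)} × V_{benefit(2)} + … + ARR_{benefit(m)} × V_{benefit(m)} = ARI_{harm(1)} × V_{harm(1)} + … + ARI_{harm(k)} × V_{harm(k)} (5)
where m is the total number of treatment benefits, benefit(2) to benefit(m) are the benefits other than the target one, and k is the number of treatment harms. Or, likewise:
ARR_{target} × V_{target} + Σ_{j=2}^{m} ARR_{benefit(j)} × V_{benefit(j)} = Σ_{l=1}^{k} ARI_{harm(l)} × V_{harm(l)} (6)
Subtracting the sum of the non-target benefits from both sides and dividing both sides by V_{target}, we obtain the required ARR_{target} such that the total treatment benefits offset the total treatment harms:
required ARR_{target} = Σ_{l=1}^{k} ARI_{harm(l)} × RV_{harm(l)/target} − Σ_{j=2}^{m} ARR_{benefit(j)} × RV_{benefit(j)/target} (7)
where every RV is expressed as the value of that outcome, prevented or caused by the treatment, compared with the value assigned to the target outcome.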
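A minimal Python sketch of the complex model follows. It assumes, in line with the verbal description of equation (7), that the required ARR_{target} equals the value-weighted sum of the harms minus the value-weighted sum of the benefits other than the target, with every RV taken relative to the target outcome. The function name and the example numbers are hypothetical.

```python
from typing import Sequence, Tuple

# Each outcome is (absolute risk change, RV relative to the target outcome).
Outcome = Tuple[float, float]


def required_arr_target_complex(
    harms: Sequence[Outcome],
    other_benefits: Sequence[Outcome] = (),
) -> float:
    """Required ARR for the target outcome with several harms and extra benefits."""
    total_harm = sum(ari * rv for ari, rv in harms)
    total_other_benefit = sum(arr * rv for arr, rv in other_benefits)
    return total_harm - total_other_benefit


# Hypothetical treatment with two harms and one additional benefit.
harms = [(0.02, 0.5), (0.01, 1.0)]  # ARI and RV for each harm
other_benefits = [(0.01, 0.5)]      # ARR and RV for the non-target benefit

needed_complex = required_arr_target_complex(harms, other_benefits)  # about 0.015
```

With a single harm and no additional benefits the function reduces to the simple model, ARI_{harm} × RV_{harm/target}.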
References
Supplementary materials
Data supplement 1: Online supplement
Footnotes

Contributors MM designed and carried out the data analyses, interpreted the results, and drafted and revised the manuscript. JCS designed and carried out the data analyses, interpreted the results, and drafted and revised the manuscript. Each author had full access to all the data (including statistical reports and tables) in the study and can take responsibility for the integrity of the data and the accuracy of the data analysis. All authors have read and approved the final manuscript.

Funding This research received no specific grant from any funding agency in the public, commercial or not-for-profit sectors.

Competing interests MM and JCS do not have support from and have no relationships with any company that might have an interest in the submitted work in the previous 3 years; MM and JCS have no non-financial interests that may be relevant to the submitted work.

Provenance and peer review Not commissioned; externally peer reviewed.

Data sharing statement No additional data are available.