Objective This study aimed to conduct a systematic review of preclinical and clinical evidence to chart the successful trajectory of talimogene laherparepvec (T-VEC) from the bench to the clinic.
Design This study was a systematic review. The primary outcome of interest was the efficacy of treatment, determined by complete response. Abstract and full-text selection as well as data extraction were done by two independent reviewers. The Cochrane risk of bias tool was used to assess the risk of bias in studies.
Setting Embase, Embase Classic and OvidMedline were searched from inception until May 2016 to assess T-VEC’s development trajectory to approval in 2015.
Participants Preclinical and clinical controlled comparison studies, as well as observational studies.
Interventions T-VEC for the treatment of any malignancy.
Results 8852 records were screened and five preclinical (n=150 animals) and seven clinical studies (n=589 patients) were included. We saw large decreases in T-VEC’s efficacy as studies moved from the laboratory to patients, and as studies became more methodologically rigorous. Preclinical studies reported complete regression rates up to 100% for injected tumours and 80% for contralateral tumours, while the highest degree of efficacy seen in the clinical setting was a 24% complete response rate, with one study experiencing a complete response rate of 0%. We were unable to reliably assess safety due to the lack of reporting, as well as the heterogeneity seen in adverse event definitions. All preclinical studies had high or unclear risk of bias, and all clinical studies were at a high risk of bias in at least one domain.
Conclusions Our findings illustrate that even successful biotherapeutics may not demonstrate a clear translational road map. This emphasises the need to consider increasing rigour and transparency along the translational pathway.
PROSPERO registration number CRD42016043541.
- oncolytic virus
This is an open access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited, appropriate credit is given, any changes made indicated, and the use is non-commercial. See: http://creativecommons.org/licenses/by-nc/4.0/.
Strengths and limitations of this study
Comprehensive analysis of the translational pathway of talimogene laherparepvec.
Threats to both internal validity and construct validity were assessed.
Reporting of methods and findings was incomplete in most of the studies included, which limited our analysis.
Preclinical research receives approximately half of the world’s biomedical research funding, yet very few of its findings translate clinically. This represents an enormous waste of resources: an estimated US$28 billion per year is spent in the USA alone on biomedical research that is not reproducible and therefore not translatable.1 One study found that only 5% of highly efficacious preclinical therapeutics were clinically translated.2 Even these successes often take almost 20 years to be translated across the research spectrum.2 3
Although the process of clinical translation is complicated, the transition from bench-to-bedside often starts with preclinical research. These investigations (usually on animals or cells) are aimed at studying efficacy, pharmacokinetics and pharmacodynamics, as well as detailing safety.4 Next, a drug is tested in a phase I clinical trial, which usually contains a small number of participants and is aimed at studying the safety of the drug. If a drug is safe, it may proceed to phase II studies, which are larger than phase I studies and are designed to test safety, pharmacokinetics, pharmacodynamics and optimal dosing regimens. They may also offer preliminary evidence of drug efficacy. Finally, a methodologically rigorous phase III study is performed. These studies are designed and powered to test efficacy in the patient population of interest (usually against a comparator such as placebo), as well as to identify rarer adverse events which may have gone unnoticed in a smaller phase I or II study.5
Given the high failure rate in translating therapies across this spectrum, as well as the significant time-lags associated with translation, it is important that we examine the few agents that have successfully crossed the preclinical to clinical bridge in order to learn from and replicate their success. Thus, we conducted a comprehensive evaluation of available evidence supporting the successful translation of talimogene laherparepvec (T-VEC). T-VEC is a modified herpes simplex virus type 1 (HSV-1) produced by Amgen and it is the first, and only, Food and Drug Administration (FDA) approved oncolytic virus therapy; it is currently approved to treat advanced melanoma.6 Oncolytic viruses are an emerging cancer therapy that work by preferentially targeting and infecting cancer cells.6 On infection, oncolytic viruses can induce an antitumour immune response that reduces tumour burden. T-VEC was chosen as a model because it is the only approved oncolytic virus therapy to date, despite the multitude of agents under investigation.7
Through a careful evaluation of T-VEC development, we hoped to identify factors that may contribute to bench-to-bedside success. This may serve as an exemplar for other therapies as they move along the translational continuum. Thus, the purpose of this systematic review was to map the successful preclinical to clinical trajectory of T-VEC to inform the development paths of future biotherapeutics.
Our review was registered in full on PROSPERO, the international prospective register of systematic reviews (no. CRD42016043541). The review is reported in accordance with the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines.8
We included all clinical and preclinical in vivo controlled comparison studies of T-VEC for treatment of any malignancy (randomised, pseudo-randomised and non-randomised studies), as well as observational studies such as case-control, case-series and case reports. Studies reporting only ex vivo or in vitro experiments were excluded. For both preclinical and clinical studies, we included studies that administered T-VEC as a monotherapy or in combination with other therapies for treatment of malignancy. We had no exclusions on comparison treatments, which include standard line therapy or no treatment.
The primary outcome of interest was the efficacy of treatment. Our primary indicator of efficacy was complete response. Other measures of efficacy such as survival, response rates (durable, partial, objective), time to treatment failure and disease stability were also collected. Such measures were based on the Response Evaluation Criteria in Solid Tumours Guidelines.9 In preclinical studies, additional measures of efficacy such as changes in mean tumour volume and number of lesions were collected. The primary indicator of efficacy, complete response, was used as the primary outcome regardless of reporting within the individual study, in order to assess the continuity of evidence along the research spectrum. The secondary outcome of interest was safety, for which we collected data on all adverse events in preclinical and clinical studies.
In collaboration with a medical information specialist (Risa Shorr, Learning Services, The Ottawa Hospital), a search strategy was designed to identify all relevant preclinical and clinical studies. Searches were conducted in the following databases: Embase, Embase Classic and OvidMedline from inception until May 2016. This time frame was chosen to ensure that all published studies which contributed to T-VEC’s FDA approval in 2015 were included. Search terms included Talimogen laherparepvec, Tvec, OncoVEX and Imlygic. Additional terms pertaining to preclinical studies (eg, animal experiment/model) and oncology (eg, cancer, neoplasm, oncolytic virus) were also included. Studies were also screened for inclusion based on reference tracking, by scanning the bibliography of included primary studies and relevant review articles. We did not impose any restrictions on language or publication type. A grey literature search was not performed. The finalised search strategy can be found in online supplementary file 1.
Study selection process
Studies identified by our literature search were collated and duplicates were removed. Titles and abstracts were independently screened for inclusion by two reviewers using DistillerSR (Evidence Partners, Ottawa, Ontario, Canada). Those deemed potentially relevant were recorded, and full-text articles were obtained. The same reviewers screened full articles for final eligibility. Disagreements at any stage were resolved by discussion or consultation with a senior team member when necessary. The study selection process was documented using a PRISMA flow diagram.
All data extraction was completed independently and in duplicate, using a standardised and piloted data extraction form, with disagreements resolved as mentioned above. Data pertaining to general and intervention characteristics of the included studies were extracted (eg, study design, country, type of malignancy, dosing of intervention and comparator treatments). For clinical studies, data were collected on patient characteristics (eg, age, sex, cancer staging, HSV status). For preclinical studies, characteristics of the animal model were extracted (eg, type of species, cell line used, disease induction method, age, sex, weight).
Risk of bias: assessment of threats to internal validity
Clinical studies that met inclusion criteria were assessed for risk of bias in duplicate, according to the recommended methodology of the Cochrane Collaboration.10 Five types of biases (selection, performance, detection, attrition, reporting biases) were assessed using six domains: randomisation, allocation concealment, blinding of participants/personnel, outcome assessment blinding, incomplete outcome reporting and selective outcome reporting. Additional domains assessed for risk of bias were (1) reported conflicts of interest, (2) sample size calculation and (3) funding. Each domain was given a score of ‘high’, ‘unclear’ or ‘low’ risk of bias for each study. Risk of bias for preclinical studies was assessed using a modified Cochrane Risk of Bias tool, covering the same domains as indicated for clinical studies.11
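As an illustration only, per-study judgements across the nine domains listed above could be tallied as in the following sketch. The function name, the data layout and the example ratings are hypothetical and are not taken from the review or from the Cochrane tool itself.

```python
from collections import Counter

# The nine domains named in the text (six Cochrane domains plus three additional ones).
DOMAINS = (
    "randomisation",
    "allocation concealment",
    "blinding of participants/personnel",
    "outcome assessment blinding",
    "incomplete outcome reporting",
    "selective outcome reporting",
    "conflicts of interest",
    "sample size calculation",
    "funding",
)
RATINGS = ("low", "unclear", "high")


def summarise_risk_of_bias(judgements):
    """Count how many of the nine domains received each rating for one study."""
    missing = [d for d in DOMAINS if d not in judgements]
    if missing:
        raise ValueError(f"missing domains: {missing}")
    invalid = {d: r for d, r in judgements.items() if r not in RATINGS}
    if invalid:
        raise ValueError(f"invalid ratings: {invalid}")
    counts = Counter(judgements[d] for d in DOMAINS)
    return {rating: counts.get(rating, 0) for rating in RATINGS}


# Hypothetical study (not one of the included studies): mostly unclear reporting.
example_study = dict.fromkeys(DOMAINS, "unclear")
example_study["randomisation"] = "low"
example_study["funding"] = "high"

summary = summarise_risk_of_bias(example_study)
print(summary)
```

A tally of this kind makes statements such as "high or unclear risk of bias in at least six of nine domains" straightforward to check for each study.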
Assessment of threats to construct validity
Construct validity refers to how well a preclinical experiment (ie, an animal study) corresponds to the clinical entity it is intended to model. Various threats to construct validity can be introduced by the preclinical study design. The items evaluated in duplicate for each preclinical study were (1) use of adult animals, (2) use of animals with advanced stage disease (defined as the presence of multiple visceral lesions and/or clinical/histological signs of malignant progression), (3) immune status of animals to HSV, (4) whether a xenograft model was used and (5) the use of a humanised immune system model. Each of these items was given a score of ‘yes’, ‘no’ or ‘unclear’ for every preclinical study.
Efficacy was expressed as proportions with accompanying 95% CIs. If CIs were not present within the individual study, they were calculated via standard methods.12 To assess the continuity between preclinical and clinical studies, the efficacy of studies was plotted as percentage response.
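The text does not state which standard method was used to calculate CIs. As a minimal sketch, assuming the Wilson score interval (one common choice for proportions from small samples), a 95% CI for a complete response rate could be computed as follows. The function name and the example counts are hypothetical and are not data from the included studies.

```python
import math


def wilson_interval(responders, n, z=1.959964):
    """Wilson score interval for a proportion (z = 1.959964 gives about 95% coverage)."""
    if n <= 0:
        raise ValueError("n must be positive")
    if not 0 <= responders <= n:
        raise ValueError("responders must be between 0 and n")
    p = responders / n
    denom = 1 + z * z / n
    centre = (p + z * z / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    # Clamp to [0, 1] to absorb floating-point rounding at the boundaries.
    return max(0.0, centre - half_width), min(1.0, centre + half_width)


# Hypothetical example: 5 complete responses among 20 treated subjects.
lower, upper = wilson_interval(5, 20)
print(f"complete response 25.0% (95% CI {lower:.1%} to {upper:.1%})")
```

Unlike the simple normal approximation, the Wilson interval stays within 0% to 100% and remains informative when the observed rate is 0% or 100%, which matters here because both extremes were reported.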
Deviations from protocol
We were unable to assess safety as we could not acquire patient-level safety data. Furthermore, the primary efficacy outcome stated in our protocol was durable response rate. However, this was changed to complete response as most clinical studies did not report durable response, and we needed to track T-VEC’s trajectory over several studies. We acknowledge the limitation of this approach, given that the FDA approved T-VEC based on the OPTIM trial,13 the primary endpoint of which was durable response rate. Subgroup analyses, meta-analyses, Egger’s test and pooling of data could not be conducted due to the limited available data.
Patient and public involvement
Patients and the public were not involved in this research.
On removal of duplicates, a total of 8852 references were identified by the electronic search. During the review of titles and abstracts, 7890 references were excluded. Following full-text screening, another 938 articles were excluded for reasons such as wrong study design (ie, review article) or wrong study intervention (ie, a different cancer therapeutic). A total of seven clinical studies13–19 and five preclinical studies20–24 were included in our review (figure 1).
Characteristics of included trials
Characteristics of included preclinical studies are shown in table 1. Preclinical studies were published between 2003 and 2016 and sample sizes ranged from 20 to 90. Of the five preclinical studies, three used a lymphoma model, one used a colorectal model and one used a melanoma model. All studies were performed in mice. The duration of follow-up was reported by two studies and ranged from 10 to 35 days. The dose of T-VEC used ranged from 3×10⁴ plaque forming units (PFU) to 5×10⁶ PFU. Frequency of T-VEC administration varied: every 3 days for 1 week, every 3 days for 9 days, a single dose and every other day for 5 days. Specific details of study and intervention characteristics for each preclinical study can be found in online supplementary files 2 and 3.
Characteristics of included clinical studies are shown in table 2. Studies were published between 2006 and 2016 and took place in seven countries. Sample sizes ranged from 17 to 295. Of the seven clinical studies, four were in melanoma patients, one was in pancreatic cancer patients, one in head and neck cancer patients and one studied a mix of breast, colorectal, melanoma and head and neck cancer patients. Six were either phase I or II, and one trial was a phase III evaluation. The primary outcome was efficacy in two studies, safety in three studies and a combination of efficacy and safety in the other two studies. The duration of follow-up ranged from 6 weeks to 44 months.
T-VEC was administered alone in four studies and immediately following systemic therapy in three studies. The dose of T-VEC administered ranged from 10⁴ to 10⁸ PFU/mL. In the large phase III study, T-VEC was administered at ≤4 mL of 10⁶ PFU/mL once, and then, 3 weeks later, ≤4 mL of 10⁸ PFU/mL was administered every 2 weeks for a median of 23 weeks. A similar dosing regimen was used in three other trials. The remaining three trials were dose-finding in nature and had multiple trial arms receiving increasing doses of T-VEC. In-depth study details, as well as participant and intervention details for each study, can be found in online supplementary files 4-6.
Efficacy of treatment
Treatment efficacy for each study is summarised in table 1, table 2 and figure 2. Preclinical studies reported complete regression rates up to 100% for injected tumours and 80% for contralateral tumours (see also online supplementary file 7). In comparison, the first published phase I T-VEC clinical trial reported a complete response of 0% for cutaneous lesions caused by malignancies of head and neck, breast, colorectal and melanoma. Of the multiple malignancies treated, melanoma had the best response in this trial. Subsequent phase I/II melanoma trials were then conducted and demonstrated complete response rates of 20%–22%. This was followed by the phase III OPTIM melanoma trial, which had a complete response rate of 10.8%. Studies involving non-melanoma cancers varied with efficacies between 0% and 24%.
Safety of treatment
We attempted to assess safety across clinical studies; however, we were unable to obtain patient-level data from any of the studies. The definitions of adverse events, and the manner in which they were classified, were found to be highly heterogeneous across studies. Studies did not specify what percentage of adverse events were repeated events in the same patient(s), used different criteria for recording and reporting adverse events, and categorised them differently. Therefore, we were unable to pool adverse events or interpret findings reliably.
Construct validity, the concept of how well an animal model represents the clinical entity it is intended to mimic, was first assessed through the following domains: the use of appropriately aged mice, advanced stage of disease, HSV-immunity and types of mouse models. None of the preclinical studies fully reported or used methodologies to reduce threats to construct validity domains (table 3). No studies declared using adult animal models, no studies used animals with late stage disease, only one study used animals immune to HSV, no studies used a xenograft model and no studies reported using an animal model with a humanised immune system.
We also assessed internal validity (ie, risk of bias) and found that all preclinical studies had high or unclear risk of bias across the assessed domains: randomisation sequence, allocation concealment, blinding, incomplete reporting, sample size calculation and funding source (table 4). For clinical studies, early phase trials had high or unclear risk of bias across at least six of nine domains, whereas the more robust phase III OPTIM trial had the lowest risk of bias and also the lowest efficacy of any of the published melanoma clinical trials (table 5). Reporting of key methodological elements was lacking.
We hoped to synthesise the evidence to produce a clear road map of T-VEC’s translation in the published literature. This would allow us to follow the journey of a successful biotherapeutic, and potentially use this as a blueprint for similar efforts in the future. Yet, we were unable to paint a clear picture of how the evidence was used in proceeding to melanoma clinical trials. Rather, our assessment uncovered a disconnect between in vivo preclinical and clinical findings. Furthermore, the road map was plagued with poor reporting, high risk of bias and insufficient data along the translational path. Overall, we were surprised by the pace and magnitude of diminishing efficacy as T-VEC moved from bench-to-bedside and then towards later phase clinical trials (ie, phase I to III). Although T-VEC was successful in terms of gaining regulatory approval, its translational path is complicated and the pieces of the evidence puzzle do not easily fit together. While we appreciate that translation is not a predictable linear process, it is difficult to learn from the example of T-VEC given the available and reported preclinical and clinical evidence.
While many novel therapeutics are under intellectual property rights, details of study design and results should be transparently reported for scientists, clinicians and patients to evaluate findings. The fact that the only FDA approved oncolytic virus therapy is not clearly reported illustrates the issues plaguing the success of cancer therapeutics. Nonetheless, T-VEC has shown some efficacy in treating refractory melanoma25 and numerous clinical trials are underway to assess its use in combination with other cancer regimens and in treating other malignancies. It is also the recommended treatment by the National Comprehensive Cancer Network for patients with in-transit melanoma.26
Perhaps the largest discrepancy noted was that only a single preclinical study used a melanoma model, whereas all but two clinical studies administered T-VEC to melanoma patients. Conversely, lymphoma, which was used in three preclinical studies, was not assessed in clinical studies. Interestingly, our subsequent searches found that Amgen’s FDA filing (STN# 125518.000)27 for T-VEC did not appear to report on any in vivo melanoma models, whereas the European Medicines Agency (EMA) report did (EMA/734400/2015).28 Thus, the majority of animal models were off-target from the malignancies studied in clinical trials and may have poorly represented melanoma in the clinical setting. Coupled with these findings was the fact that the majority of included studies were found to be at a high or unclear risk of bias for most domains. Such threats to internal validity can bias results and may help explain T-VEC’s superior preclinical efficacy compared with later phase clinical trials. A lack of randomisation and blinding in preclinical studies has been associated with inflated effect sizes,29 30 thus this may partially explain the preclinical to clinical discrepancy of T-VEC.
Reporting of methods and findings was incomplete in most of the studies included. Only one full preclinical article on T-VEC was published, and solely aggregate patient data for later phase trials was available. Poor reporting and study design are major contributors to the ongoing reproducibility crisis in preclinical research.31 Thus, in hopes of presenting a clearer picture of T-VEC’s successful translation, we contacted Amgen to obtain preclinical in vivo melanoma data, patient-level safety data and any additional efficacy data. Patient-level data would afford the ability to combine data across T-VEC’s clinical development and also provide clarification into the categorisation of adverse events. Recently, release of individual patient data to third parties has been advocated by the Institute of Medicine, journal editors and others as it enhances transparency, enables reanalyses of data and helps address reproducibility.32 The reporting of harms in clinical trials remains an issue in the scientific community,33–35 and represents a roadblock to translational success. Some basic steps required to improve the reporting of safety in translational research include the development of standardised scales and instruments, instituting active rather than passive surveillance for toxicity, including detailed information on participant withdrawals due to toxicity, reporting the timing, frequency and duration of clinically relevant events and the publication of raw data.36 37 Amgen, however, was unwilling to enter a data sharing agreement, as they stated that there was little value to compel a transparent data release for our proposed analyses. 
This lack of transparency and incomplete reporting is disappointing, especially considering that it was Amgen that previously highlighted poor reporting as contributing to its own failure to reproduce 47 of 53 high-impact preclinical cancer studies.38 Their findings fuelled a call by the National Institutes of Health (NIH) and other stakeholders to enhance the reproducibility and transparency of preclinical research.39
As stated, we recognise that translation is not a linear process, but we should observe consistent and coherent patterns. Moving forward, we suggest that preclinical and clinical studies for emerging therapies should be fully reported and that attention should be given to internal and construct validity in order to develop more precise estimates of effect early in development. Investigators should carefully match their preclinical model to the intended clinical population; when possible, both disease states and outcomes measured should have high construct validity. Following successful exploratory preclinical studies, investigators should consider preclinical systematic reviews40 and designing methodologically rigorous confirmatory and/or multicentre preclinical studies.41 These steps may allow preclinical testing to more accurately forecast downstream clinical results in human patients.30 Within the trajectory of clinical development (ie, once clinical trials have been initiated), methods to reduce bias should also be carefully considered (although this may not be possible for the earliest phase trials). We believe these steps will provide unbiased and valuable information that will ultimately provide patients with cancer therapies that match their preclinical and early clinical promise.
The findings from our systematic review demonstrate that even successful biotherapeutics may not demonstrate a clear translational road map. The magnitude of efficacy of T-VEC demonstrated in preclinical studies was considerably larger than subsequent clinical trials; the most methodologically rigorous trial included in our review demonstrated the smallest degree of efficacy. Methodologically rigorous studies should be performed earlier on in the translational pathway, which may help provide a realistic estimate of treatment efficacy prior to clinical translation.
The authors thank Risa Shorr (Ottawa Hospital Research Institute librarian) for systematic search assistance. ML is supported by The Ottawa Hospital Anesthesia Alternate Funds Association and the Scholarship Protected Time Program, Department of Anesthesiology and Pain Medicine, Ottawa.
ML and GJL contributed equally.
Contributors ML and DAF conceptualised the study. ML, RCA, DAF and GJL contributed to the study design. GJL, YYD, JM and CB conducted data extraction. All authors analysed and interpreted the data. ML and GJL were responsible for drafting the manuscript. All authors critically reviewed the manuscript and provided intellectual content. All authors approve the final version of the manuscript.
Funding Biotherapeutics for Cancer Treatment (BioCanRx) supported the conduct of this study by a Catalyst grant. BioCanRx is a Government of Canada funded Networks of Centres of Excellence and was not involved in any other aspect of the project, such as the design of the project’s protocol and analysis plan, the collection of data and analyses. CB was also supported by a BioCanRx studentship. ML is supported by The Ottawa Hospital Anesthesia Alternate Funds Association and the Scholarship Protected Time Program, Department of Anesthesiology and Pain Medicine, uOttawa.
Competing interests None declared.
Patient consent for publication Not required.
Provenance and peer review Not commissioned; externally peer reviewed.
Data availability statement All data relevant to the study are included in the article or uploaded as supplementary information.